haoranzz opened a new issue, #8567:
URL: https://github.com/apache/hudi/issues/8567

   **Describe the problem you faced**
   We run an AWS Glue job that performs a daily compact+clean for our data set (3 tables, processed serially).
   The Glue job timed out one day, and since then we have observed the following:
   1. Compaction stopped: log files keep growing for the affected tables.
   2. Archived files are no longer generated in the archived folder.
   3. The compaction on the timeline never finished for that run (only the .requested and .inflight files are present for that instant).
   For example:
   ```
   2023-04-02 02:53:59          0 20230402094244750.compaction.inflight
   2023-04-02 02:53:57     961221 20230402094244750.compaction.requested
   ```
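
A minimal sketch of how such stuck compactions might be spotted from a listing of the `.hoodie` timeline folder. It assumes that a completed compaction is written to the timeline as `<instant>.commit` (our understanding of the 0.x layout, not verified against 0.13.0 here). The function name `pending_compactions` and every sample entry except the `20230402094244750` instant are hypothetical.

```python
import re
from collections import defaultdict

# Matches timeline file names such as "20230402094244750.compaction.inflight"
# or "20230401090000000.commit"; other files (e.g. hoodie.properties) are ignored.
TIMELINE_FILE = re.compile(r"^(?P<instant>\d+)\.(?P<state>[a-z.]+)$")


def pending_compactions(file_names):
    """Return instants that have a compaction plan but no completed commit."""
    states = defaultdict(set)
    for name in file_names:
        match = TIMELINE_FILE.match(name)
        if match:
            states[match.group("instant")].add(match.group("state"))

    pending = []
    for instant, seen in sorted(states.items()):
        has_plan = "compaction.requested" in seen
        # Assumption: a finished compaction shows up as <instant>.commit.
        completed = "commit" in seen
        if has_plan and not completed:
            pending.append(instant)
    return pending


# Sample listing: the first two entries come from the issue, the rest are made up.
listing = [
    "20230402094244750.compaction.inflight",
    "20230402094244750.compaction.requested",
    "20230401090000000.compaction.requested",
    "20230401090000000.compaction.inflight",
    "20230401090000000.commit",
    "20230402100000000.deltacommit",
]
print(pending_compactions(listing))
```

With this sample listing, only the instant from the timed-out run is reported as pending.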
        
   *The large number of files is eating up our S3 connections and slowing down 
our job dramatically.*
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   1. Start a Glue job that runs `hoodieCompactor.compact(...)` on a Hudi table.
   2. The Glue job times out before the compaction finishes.
   
   
   **Expected behavior**
   The failed compaction should be rolled back automatically, and the next compaction run should be able to compact the table and reduce the log file count.
   
   **Environment Description**
   * Glue version: Glue 4.0
   
   * Hudi version :  0.13.0
   
   * Spark version : 3.3
   
   * Storage (HDFS/S3/GCS..) : S3
   
   * Running on Docker? (yes/no) : no
   
   
   **Additional context**
   
   Compactor and Cleaner configs
   ```
   compactorCfg.sparkMemory = 10G
   compactorCfg.runningMode = HoodieCompactor.SCHEDULE_AND_EXECUTE
   compactorCfg.retry = 1
   ```
   
   Cleaner config: 
   ```
   cleanerCfg.configs.add(s"hoodie.cleaner.commits.retained=10")
   ```
   
   **Stacktrace**
   We did not see any exception printed.
   Below is a screenshot of the Spark history server timeline.
   
![screencapture-ip-10-160-55-183-cl-local-18080-history-spark-application-1680428560504-jobs-2023-04-24-16_38_10](https://user-images.githubusercontent.com/32605720/234138382-4bc88351-0e77-4fd3-9335-d32bb16ff642.png)
   
   
   

