awpengfei opened a new issue #4863:
URL: https://github.com/apache/hudi/issues/4863


   **Describe the problem you faced**
   
   * At instant time `20220221085407453`, Flink sent a compaction request to 
merge the delta log files into the base parquet files.
   ```
   2022-02-21 08:58:50,410 INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline   [] - Create new file for toInstant ?hdfs://da-hdfs/user/hive/warehouse/default.db/hudi_test/.hoodie/20220221085407453.compaction.inflight
   2022-02-21 08:58:50,583 INFO  org.apache.flink.streaming.api.operators.AbstractStreamOperator [] - Execute compaction plan for instant 20220221085407453 as 3 file groups
   ```
   * At time `2022-02-21 09:00:07,398`, an exception occurred in task 
`hoodie_stream_write` that caused the job to restart.
   ```
   ……
   Caused by: org.apache.hudi.exception.HoodieAppendException: Failed while appending records to hdfs://da-hdfs/user/hive/warehouse/default.db/hudi_test/.5ce039d0-5080-41c2-a2b4-aaae3b92ea36_20220221085407453.log.1_0-1-118
   Caused by: java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try.
   ……
   ```
   * After the job restarted, Flink sent a rollback request, and then 
the compaction at instant time `20220221085407453` finished.
   ```
   2022-02-21 09:00:08,879 INFO  org.apache.hudi.table.action.rollback.BaseRollbackPlanActionExecutor [] - Requesting Rollback with instant time [==>20220221090008627__rollback__REQUESTED]
   2022-02-21 09:00:08,947 INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline   [] - Create new file for toInstant ?hdfs://da-hdfs/user/hive/warehouse/default.db/hudi_test/.hoodie/20220221085407453.commit
   2022-02-21 09:00:08,947 INFO  org.apache.hudi.client.HoodieFlinkWriteClient                [] - Compacted successfully on commit 20220221085407453
   ```
   * Then the rollback at instant time `20220221090008627` began to roll 
back the compaction commit at instant time `20220221085407453`. It deleted 
the base parquet files with instant time `20220221085407453`.
   ```
   2022-02-21 09:00:09,155 INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline   [] - Create new file for toInstant ?hdfs://da-hdfs/user/hive/warehouse/default.db/hudi_test/.hoodie/20220221090008627.rollback.inflight
   2022-02-21 09:00:09,156 INFO  org.apache.hudi.table.action.rollback.MergeOnReadRollbackActionExecutor [] - Rolling back instant [==>20220221085407453__compaction__INFLIGHT]
   2022-02-21 09:00:09,156 INFO  org.apache.hudi.table.action.rollback.MergeOnReadRollbackActionExecutor [] - Unpublished [==>20220221085407453__compaction__INFLIGHT]
   2022-02-21 09:00:09,205 WARN  org.apache.hudi.table.action.rollback.BaseRollbackActionExecutor [] - Rollback finished without deleting inflight instant file. Instant=[==>20220221085407453__compaction__INFLIGHT]
   2022-02-21 09:00:09,205 INFO  org.apache.hudi.table.action.rollback.MergeOnReadRollbackActionExecutor [] - Time(in ms) taken to finish rollback 49
   2022-02-21 09:00:09,205 INFO  org.apache.hudi.table.action.rollback.BaseRollbackActionExecutor [] - Rolled back inflight instant 20220221085407453
   2022-02-21 09:00:09,206 INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline   [] - Checking for file exists ?hdfs://da-hdfs/user/hive/warehouse/mysql.db/user_auth_hudi/.hoodie/20220221090008627.rollback.inflight
   2022-02-21 09:00:09,313 INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline   [] - Create new file for toInstant ?hdfs://da-hdfs/user/hive/warehouse/mysql.db/user_auth_hudi/.hoodie/20220221090008627.rollback
   2022-02-21 09:00:09,313 INFO  org.apache.hudi.table.action.rollback.BaseRollbackActionExecutor [] - Rollback of Commits [20220221085407453] is complete
   2022-02-21 09:00:09,326 INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline   [] - Loaded instants upto : Option{val=[20220221090008627__rollback__COMPLETED]}
   ```
   ```
   compaction show --instant 20220221085407453
   ╔════════════════╤══════════════════════════════════════╤═══════════════════╤════════════════════════════════════════════════════════════════════════╤═══════════════════╤════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╗
   ║ Partition Path │ FileId                               │ Base-Instant      │ Data File Path                                                         │ Total Delta Files │ getMetrics                                                                                                             ║
   ╠════════════════╪══════════════════════════════════════╪═══════════════════╪════════════════════════════════════════════════════════════════════════╪═══════════════════╪════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╣
   ║                │ 4ec1bad9-e941-4748-a82d-04461975b3dc │ 20220221084228229 │ 4ec1bad9-e941-4748-a82d-04461975b3dc_0-1-117_20220221084228229.parquet │ 1                 │ {TOTAL_LOG_FILES=1.0, TOTAL_IO_READ_MB=46.0, TOTAL_LOG_FILES_SIZE=7709317.0, TOTAL_IO_WRITE_MB=39.0, TOTAL_IO_MB=85.0} ║
   ╟────────────────┼──────────────────────────────────────┼───────────────────┼────────────────────────────────────────────────────────────────────────┼───────────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╢
   ║                │ 5ce039d0-5080-41c2-a2b4-aaae3b92ea36 │ 20220221084228229 │ 5ce039d0-5080-41c2-a2b4-aaae3b92ea36_0-1-117_20220221084228229.parquet │ 1                 │ {TOTAL_LOG_FILES=1.0, TOTAL_IO_READ_MB=46.0, TOTAL_LOG_FILES_SIZE=7688548.0, TOTAL_IO_WRITE_MB=39.0, TOTAL_IO_MB=85.0} ║
   ╟────────────────┼──────────────────────────────────────┼───────────────────┼────────────────────────────────────────────────────────────────────────┼───────────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╢
   ║                │ 88609801-c541-4dd3-8996-d5588b85fd03 │ 20220221084228229 │ 88609801-c541-4dd3-8996-d5588b85fd03_0-1-117_20220221084228229.parquet │ 2                 │ {TOTAL_LOG_FILES=2.0, TOTAL_IO_READ_MB=44.0, TOTAL_LOG_FILES_SIZE=6908476.0, TOTAL_IO_WRITE_MB=38.0, TOTAL_IO_MB=82.0} ║
   ╚════════════════╧══════════════════════════════════════╧═══════════════════╧════════════════════════════════════════════════════════════════════════╧═══════════════════╧════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╝
   ```
   ```
   show rollback --instant 20220221090008627
   ╔═══════════════════╤═════════════════════╤═══════════╤═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╤═══════════╗
   ║ Instant           │ Rolledback Instant  │ Partition │ Deleted File                                                                                                                      │ Succeeded ║
   ╠═══════════════════╪═════════════════════╪═══════════╪═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╪═══════════╣
   ║ 20220221090008627 │ [20220221085407453] │           │ hdfs://da-hdfs/user/hive/warehouse/default.db/hudi_test/4ec1bad9-e941-4748-a82d-04461975b3dc_0-1-118_20220221085407453.parquet    │ true      ║
   ╟───────────────────┼─────────────────────┼───────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┼───────────╢
   ║ 20220221090008627 │ [20220221085407453] │           │ hdfs://da-hdfs/user/hive/warehouse/default.db/hudi_test/88609801-c541-4dd3-8996-d5588b85fd03_0-1-118_20220221085407453.parquet    │ true      ║
   ╟───────────────────┼─────────────────────┼───────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┼───────────╢
   ║ 20220221090008627 │ [20220221085407453] │           │ hdfs://da-hdfs/user/hive/warehouse/default.db/hudi_test/5ce039d0-5080-41c2-a2b4-aaae3b92ea36_0-1-118_20220221085407453.parquet    │ true      ║
   ╚═══════════════════╧═════════════════════╧═══════════╧═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╧═══════════╝
   ```
   * Until the next compaction at instant time `20220221090858111`, the base 
parquet files at instant time `20220221085407453` had still not been generated. 
As a result, the compaction at instant time `20220221090858111` did not contain 
the data from before the compaction at instant time `20220221085407453`.
   ```
   compaction show --instant 20220221090858111
   ╔════════════════╤══════════════════════════════════════╤═══════════════════╤════════════════╤═══════════════════╤═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╗
   ║ Partition Path │ FileId                               │ Base-Instant      │ Data File Path │ Total Delta Files │ getMetrics                                                                                                              ║
   ╠════════════════╪══════════════════════════════════════╪═══════════════════╪════════════════╪═══════════════════╪═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╣
   ║                │ 5ce039d0-5080-41c2-a2b4-aaae3b92ea36 │ 20220221085407453 │ null           │ 2                 │ {TOTAL_LOG_FILES=2.0, TOTAL_IO_READ_MB=9.0, TOTAL_LOG_FILES_SIZE=9751919.0, TOTAL_IO_WRITE_MB=120.0, TOTAL_IO_MB=129.0} ║
   ╟────────────────┼──────────────────────────────────────┼───────────────────┼────────────────┼───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╢
   ║                │ 4ec1bad9-e941-4748-a82d-04461975b3dc │ 20220221085407453 │ null           │ 1                 │ {TOTAL_LOG_FILES=1.0, TOTAL_IO_READ_MB=9.0, TOTAL_LOG_FILES_SIZE=9673812.0, TOTAL_IO_WRITE_MB=120.0, TOTAL_IO_MB=129.0} ║
   ╟────────────────┼──────────────────────────────────────┼───────────────────┼────────────────┼───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╢
   ║                │ 88609801-c541-4dd3-8996-d5588b85fd03 │ 20220221085407453 │ null           │ 1                 │ {TOTAL_LOG_FILES=1.0, TOTAL_IO_READ_MB=8.0, TOTAL_LOG_FILES_SIZE=9316325.0, TOTAL_IO_WRITE_MB=120.0, TOTAL_IO_MB=128.0} ║
   ╚════════════════╧══════════════════════════════════════╧═══════════════════╧════════════════╧═══════════════════╧═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╝
   ```
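
The ordering problem in the steps above can be replayed with a small script. This is only a rough sketch for illustration, not Hudi code: the event list is copied by hand from the log excerpts, the action labels (`compaction_committed`, `rollback_executing`, and so on) are made-up names, and the base file naming `<fileId>_<writeToken>_<instant>.parquet` is an assumption inferred from the file names shown in this report. It flags any instant that was already committed when a rollback of it started executing, and checks which instant a deleted base file belongs to.

```python
from datetime import datetime

# Events transcribed by hand from the logs in this report.
# Tuple layout: (log timestamp, hypothetical action label, instant concerned).
EVENTS = [
    ("2022-02-21 08:58:50,410", "compaction_inflight", "20220221085407453"),
    ("2022-02-21 09:00:08,879", "rollback_requested", "20220221090008627"),
    ("2022-02-21 09:00:08,947", "compaction_committed", "20220221085407453"),
    ("2022-02-21 09:00:09,156", "rollback_executing", "20220221085407453"),
]


def parse_ts(ts):
    """Parse a log timestamp such as '2022-02-21 09:00:08,947'."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S,%f")


def find_conflicts(events):
    """Return (instant, committed_at, rollback_at) for every instant that was
    already committed when a rollback of it started executing."""
    committed_at = {}
    conflicts = []
    for ts, action, instant in sorted(events, key=lambda e: parse_ts(e[0])):
        if action == "compaction_committed":
            committed_at[instant] = ts
        elif action == "rollback_executing" and instant in committed_at:
            conflicts.append((instant, committed_at[instant], ts))
    return conflicts


def base_file_instant(path):
    """Split a base file path into (file_id, write_token, instant).

    Assumes the naming <fileId>_<writeToken>_<instant>.parquet, as seen in
    the file names of this report.
    """
    name = path.rsplit("/", 1)[-1]
    stem = name[: -len(".parquet")]
    file_id, write_token, instant = stem.rsplit("_", 2)
    return file_id, write_token, instant


if __name__ == "__main__":
    for instant, committed, rolled_back in find_conflicts(EVENTS):
        print(f"instant {instant} committed at {committed} "
              f"but rollback started at {rolled_back}")
    deleted = ("hdfs://da-hdfs/user/hive/warehouse/default.db/hudi_test/"
               "4ec1bad9-e941-4748-a82d-04461975b3dc_0-1-118_20220221085407453.parquet")
    print(base_file_instant(deleted))
```

With the events from this report, the script reports instant `20220221085407453` as a conflict, and the deleted file parses to that same instant, which matches the `null` Data File Path seen in the later compaction plan.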
   
   **Environment Description**
   
   * Hudi version : 0.10.1
   
   * Hadoop version : 3.3.1
   
   * Flink version : 1.13.5
   
   
   


