awpengfei opened a new issue #4863:
URL: https://github.com/apache/hudi/issues/4863
**Describe the problem you faced**
* At instant time `20220221085407453`, Flink sent a compaction request to
merge the delta log files into the base parquet files.
```
2022-02-21 08:58:50,410 INFO org.apache.hudi.common.table.timeline.HoodieActiveTimeline [] - Create new file for toInstant ?hdfs://da-hdfs/user/hive/warehouse/default.db/hudi_test/.hoodie/20220221085407453.compaction.inflight
2022-02-21 08:58:50,583 INFO org.apache.flink.streaming.api.operators.AbstractStreamOperator [] - Execute compaction plan for instant 20220221085407453 as 3 file groups
```
* At time `2022-02-21 09:00:07,398`, an exception occurred in task
`hoodie_stream_write` that caused the job to restart.
```
……
Caused by: org.apache.hudi.exception.HoodieAppendException: Failed while appending records to hdfs://da-hdfs/user/hive/warehouse/default.db/hudi_test/.5ce039d0-5080-41c2-a2b4-aaae3b92ea36_20220221085407453.log.1_0-1-118
Caused by: java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try.
……
```
* After the job finished restarting, Flink sent a rollback request, and then
the compaction at instant time `20220221085407453` finished.
```
2022-02-21 09:00:08,879 INFO org.apache.hudi.table.action.rollback.BaseRollbackPlanActionExecutor [] - Requesting Rollback with instant time [==>20220221090008627__rollback__REQUESTED]
2022-02-21 09:00:08,947 INFO org.apache.hudi.common.table.timeline.HoodieActiveTimeline [] - Create new file for toInstant ?hdfs://da-hdfs/user/hive/warehouse/default.db/hudi_test/.hoodie/20220221085407453.commit
2022-02-21 09:00:08,947 INFO org.apache.hudi.client.HoodieFlinkWriteClient [] - Compacted successfully on commit 20220221085407453
```
* Then the rollback at instant time `20220221090008627` began to roll back
the compaction commit at instant time `20220221085407453`. It deleted
the base parquet files written at instant time `20220221085407453`.
```
2022-02-21 09:00:09,155 INFO org.apache.hudi.common.table.timeline.HoodieActiveTimeline [] - Create new file for toInstant ?hdfs://da-hdfs/user/hive/warehouse/default.db/hudi_test/.hoodie/20220221090008627.rollback.inflight
2022-02-21 09:00:09,156 INFO org.apache.hudi.table.action.rollback.MergeOnReadRollbackActionExecutor [] - Rolling back instant [==>20220221085407453__compaction__INFLIGHT]
2022-02-21 09:00:09,156 INFO org.apache.hudi.table.action.rollback.MergeOnReadRollbackActionExecutor [] - Unpublished [==>20220221085407453__compaction__INFLIGHT]
2022-02-21 09:00:09,205 WARN org.apache.hudi.table.action.rollback.BaseRollbackActionExecutor [] - Rollback finished without deleting inflight instant file. Instant=[==>20220221085407453__compaction__INFLIGHT]
2022-02-21 09:00:09,205 INFO org.apache.hudi.table.action.rollback.MergeOnReadRollbackActionExecutor [] - Time(in ms) taken to finish rollback 49
2022-02-21 09:00:09,205 INFO org.apache.hudi.table.action.rollback.BaseRollbackActionExecutor [] - Rolled back inflight instant 20220221085407453
2022-02-21 09:00:09,206 INFO org.apache.hudi.common.table.timeline.HoodieActiveTimeline [] - Checking for file exists ?hdfs://da-hdfs/user/hive/warehouse/mysql.db/user_auth_hudi/.hoodie/20220221090008627.rollback.inflight
2022-02-21 09:00:09,313 INFO org.apache.hudi.common.table.timeline.HoodieActiveTimeline [] - Create new file for toInstant ?hdfs://da-hdfs/user/hive/warehouse/mysql.db/user_auth_hudi/.hoodie/20220221090008627.rollback
2022-02-21 09:00:09,313 INFO org.apache.hudi.table.action.rollback.BaseRollbackActionExecutor [] - Rollback of Commits [20220221085407453] is complete
2022-02-21 09:00:09,326 INFO org.apache.hudi.common.table.timeline.HoodieActiveTimeline [] - Loaded instants upto : Option{val=[20220221090008627__rollback__COMPLETED]}
```
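The ordering in the logs above can be checked mechanically. The following is only a rough sketch, not Hudi code: it takes the timeline file paths and the rollback target exactly as they appear in the log excerpts and flags a rollback whose target instant already had a completed `.commit` file. The function names and the `(log_time, value)` tuple layout are made up for illustration.

```python
import re

# Timeline file names as seen in the log excerpts:
#   <instant>.<action>[.<state>], e.g. 20220221085407453.compaction.inflight
# A file with no state suffix (e.g. 20220221085407453.commit) is treated as completed.
TIMELINE_FILE = re.compile(r"/\.hoodie/(\d{17})\.(\w+)(?:\.(\w+))?$")


def parse_timeline_file(path):
    """Return (instant, action, state) for a path under .hoodie/."""
    match = TIMELINE_FILE.search(path)
    if match is None:
        raise ValueError(f"not a timeline file: {path}")
    instant, action, state = match.groups()
    return instant, action, state or "completed"


def rollbacks_of_completed_instants(created_files, rollbacks):
    """Find rollbacks that started after their target instant was completed.

    created_files: list of (log_time, timeline_file_path)
    rollbacks:     list of (log_time, target_instant)
    Log times are compared as strings, which works because they all share
    the same 'yyyy-MM-dd HH:mm:ss,SSS' layout.
    """
    completed_at = {}
    for log_time, path in created_files:
        instant, action, state = parse_timeline_file(path)
        if state == "completed" and action != "rollback":
            completed_at.setdefault(instant, log_time)
    return [
        (rb_time, target)
        for rb_time, target in rollbacks
        if target in completed_at and completed_at[target] <= rb_time
    ]


base = "hdfs://da-hdfs/user/hive/warehouse/default.db/hudi_test/.hoodie/"
created_files = [
    ("2022-02-21 08:58:50,410", base + "20220221085407453.compaction.inflight"),
    ("2022-02-21 09:00:08,947", base + "20220221085407453.commit"),
    ("2022-02-21 09:00:09,155", base + "20220221090008627.rollback.inflight"),
]
rollbacks = [("2022-02-21 09:00:09,156", "20220221085407453")]

suspicious = rollbacks_of_completed_instants(created_files, rollbacks)
print(suspicious)
```

With the values from the logs, the rollback that starts at `09:00:09,156` is flagged, because the `.commit` file for `20220221085407453` was created at `09:00:08,947`.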
```
compaction show --instant 20220221085407453
║ Partition Path │ FileId │ Base-Instant │ Data File Path │ Total Delta Files │ getMetrics ║
║ │ 4ec1bad9-e941-4748-a82d-04461975b3dc │ 20220221084228229 │ 4ec1bad9-e941-4748-a82d-04461975b3dc_0-1-117_20220221084228229.parquet │ 1 │ {TOTAL_LOG_FILES=1.0, TOTAL_IO_READ_MB=46.0, TOTAL_LOG_FILES_SIZE=7709317.0, TOTAL_IO_WRITE_MB=39.0, TOTAL_IO_MB=85.0} ║
║ │ 5ce039d0-5080-41c2-a2b4-aaae3b92ea36 │ 20220221084228229 │ 5ce039d0-5080-41c2-a2b4-aaae3b92ea36_0-1-117_20220221084228229.parquet │ 1 │ {TOTAL_LOG_FILES=1.0, TOTAL_IO_READ_MB=46.0, TOTAL_LOG_FILES_SIZE=7688548.0, TOTAL_IO_WRITE_MB=39.0, TOTAL_IO_MB=85.0} ║
║ │ 88609801-c541-4dd3-8996-d5588b85fd03 │ 20220221084228229 │ 88609801-c541-4dd3-8996-d5588b85fd03_0-1-117_20220221084228229.parquet │ 2 │ {TOTAL_LOG_FILES=2.0, TOTAL_IO_READ_MB=44.0, TOTAL_LOG_FILES_SIZE=6908476.0, TOTAL_IO_WRITE_MB=38.0, TOTAL_IO_MB=82.0} ║
```
```
show rollback --instant 20220221090008627
║ Instant │ Rolledback Instant │ Partition │ Deleted File │ Succeeded ║
║ 20220221090008627 │ [20220221085407453] │ │ hdfs://da-hdfs/user/hive/warehouse/default.db/hudi_test/4ec1bad9-e941-4748-a82d-04461975b3dc_0-1-118_20220221085407453.parquet │ true ║
║ 20220221090008627 │ [20220221085407453] │ │ hdfs://da-hdfs/user/hive/warehouse/default.db/hudi_test/88609801-c541-4dd3-8996-d5588b85fd03_0-1-118_20220221085407453.parquet │ true ║
║ 20220221090008627 │ [20220221085407453] │ │ hdfs://da-hdfs/user/hive/warehouse/default.db/hudi_test/5ce039d0-5080-41c2-a2b4-aaae3b92ea36_0-1-118_20220221085407453.parquet │ true ║
```
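To confirm that the deleted files are the compaction output and not older base files, the file names can be split into their parts. This is a small sketch that assumes the `<fileId>_<writeToken>_<instantTime>.parquet` naming visible in the paths above; `parse_base_file` is a hypothetical helper, not a Hudi API.

```python
def parse_base_file(path):
    """Split a base file path into file id, write token, and instant time.

    Assumes the <fileId>_<writeToken>_<instantTime>.<ext> layout seen in
    the rollback output; file ids and write tokens contain no underscores.
    """
    name = path.rsplit("/", 1)[-1]
    stem, ext = name.rsplit(".", 1)
    file_id, write_token, instant = stem.split("_")
    return {
        "file_id": file_id,
        "write_token": write_token,
        "instant": instant,
        "ext": ext,
    }


table_path = "hdfs://da-hdfs/user/hive/warehouse/default.db/hudi_test/"
deleted_files = [
    table_path + "4ec1bad9-e941-4748-a82d-04461975b3dc_0-1-118_20220221085407453.parquet",
    table_path + "88609801-c541-4dd3-8996-d5588b85fd03_0-1-118_20220221085407453.parquet",
    table_path + "5ce039d0-5080-41c2-a2b4-aaae3b92ea36_0-1-118_20220221085407453.parquet",
]
rolled_back_instant = "20220221085407453"

parsed = [parse_base_file(p) for p in deleted_files]

# Every deleted file carries the instant time of the compaction that had
# just been committed, so the rollback removed the compaction's output.
all_from_compaction = all(f["instant"] == rolled_back_instant for f in parsed)
print(all_from_compaction, sorted(f["file_id"] for f in parsed))
```

All three deleted files carry instant time `20220221085407453`, one per file group in the compaction plan.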
* By the time of the next compaction at instant time `20220221090858111`, the
base parquet files for instant time `20220221085407453` had not been
regenerated. As a result, the compaction at instant time `20220221090858111`
does not contain the data written before the compaction at instant time
`20220221085407453`.
```
compaction show --instant 20220221090858111
║ Partition Path │ FileId │ Base-Instant │ Data File Path │ Total Delta Files │ getMetrics ║
║ │ 5ce039d0-5080-41c2-a2b4-aaae3b92ea36 │ 20220221085407453 │ null │ 2 │ {TOTAL_LOG_FILES=2.0, TOTAL_IO_READ_MB=9.0, TOTAL_LOG_FILES_SIZE=9751919.0, TOTAL_IO_WRITE_MB=120.0, TOTAL_IO_MB=129.0} ║
║ │ 4ec1bad9-e941-4748-a82d-04461975b3dc │ 20220221085407453 │ null │ 1 │ {TOTAL_LOG_FILES=1.0, TOTAL_IO_READ_MB=9.0, TOTAL_LOG_FILES_SIZE=9673812.0, TOTAL_IO_WRITE_MB=120.0, TOTAL_IO_MB=129.0} ║
║ │ 88609801-c541-4dd3-8996-d5588b85fd03 │ 20220221085407453 │ null │ 1 │ {TOTAL_LOG_FILES=1.0, TOTAL_IO_READ_MB=8.0, TOTAL_LOG_FILES_SIZE=9316325.0, TOTAL_IO_WRITE_MB=120.0, TOTAL_IO_MB=128.0} ║
```
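The symptom in this plan is that each operation names `20220221085407453` as its base instant but has no data file. As an illustration only, the rows above can be transcribed by hand and checked against the rolled-back instant; the dictionary keys and the function name below are invented for this sketch and do not correspond to a Hudi API.

```python
def operations_missing_base_file(plan_rows, rolled_back_instants):
    """Return file ids whose base instant was rolled back and whose
    compaction operation has no base data file (shown as null in the CLI)."""
    return [
        row["file_id"]
        for row in plan_rows
        if row["data_file"] is None and row["base_instant"] in rolled_back_instants
    ]


# Rows transcribed from `compaction show --instant 20220221090858111`.
plan_rows = [
    {"file_id": "5ce039d0-5080-41c2-a2b4-aaae3b92ea36",
     "base_instant": "20220221085407453", "data_file": None, "delta_files": 2},
    {"file_id": "4ec1bad9-e941-4748-a82d-04461975b3dc",
     "base_instant": "20220221085407453", "data_file": None, "delta_files": 1},
    {"file_id": "88609801-c541-4dd3-8996-d5588b85fd03",
     "base_instant": "20220221085407453", "data_file": None, "delta_files": 1},
]

affected = operations_missing_base_file(plan_rows, {"20220221085407453"})
print(len(affected), "of", len(plan_rows), "file groups have no base file")
```

All three file groups are affected, so this compaction merges only the log files and has no base file to read the earlier records from.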
**Environment Description**
* Hudi version : 0.10.1
* Hadoop version : 3.3.1
* Flink version : 1.13.5