beyond1920 opened a new issue, #9139: URL: https://github.com/apache/hudi/issues/9139
The Hudi compaction job completes a compaction commit on the timeline even when write errors exist.

A snapshot query failed because of a broken parquet file.

<img width="643" alt="image" src="https://github.com/apache/hudi/assets/1525333/b52d770b-1305-43c7-af9e-ffede3cb14fc">

The parquet file was generated by a compaction job.

<img width="1238" alt="image" src="https://github.com/apache/hudi/assets/1525333/a87f0427-a949-49cd-a2eb-2a41bc8b9301">

The compaction was committed successfully, but it has write errors.

<img width="1026" alt="image" src="https://github.com/apache/hudi/assets/1525333/ba49f848-b84f-4049-97e9-d5a0716ae7cb">

I use `HoodieCompactor` in the hudi-utilities module to execute the compaction job. The compaction job itself failed, but the compaction commit was still completed on the Hudi timeline.

Currently, the steps of `HoodieCompactor` are:

1. Commit the compaction without checking whether there are any write status errors.

   <img width="1257" alt="image" src="https://github.com/apache/hudi/assets/1525333/234adb38-e029-46b9-b2d8-05004902ce8a">

2. Check for errors only at the end, after the compaction has already been committed.

   <img width="1139" alt="image" src="https://github.com/apache/hudi/assets/1525333/3c1186da-c979-4280-ac20-3ef161512c19">

I checked the commit step of the clustering service and of the write operation. Both check for write status errors before committing. Only `completeCompaction` and `completeLogCompaction` do not check for errors before committing.

I'm not sure whether this is a bug.
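To make the ordering I would expect concrete, here is a minimal sketch of "check write errors first, commit only if clean". It is not Hudi code: `FileWriteResult`, `countErrors`, and `commitIfClean` are hypothetical names made up for illustration, and the commit itself is passed in as a plain `Runnable` rather than a real call to `completeCompaction`.

```java
import java.util.Arrays;
import java.util.List;

public class CompactionCommitGuard {

    // Hypothetical stand-in for a per-file write result; NOT Hudi's own class.
    static final class FileWriteResult {
        final String fileId;
        final long totalErrorRecords;

        FileWriteResult(String fileId, long totalErrorRecords) {
            this.fileId = fileId;
            this.totalErrorRecords = totalErrorRecords;
        }
    }

    // Sums the error records reported across all written files.
    static long countErrors(List<FileWriteResult> results) {
        long errors = 0;
        for (FileWriteResult r : results) {
            errors += r.totalErrorRecords;
        }
        return errors;
    }

    // Runs the commit action only when no write errors were reported.
    // Returns true if the commit action ran, false if it was skipped.
    static boolean commitIfClean(List<FileWriteResult> results, Runnable commitAction) {
        if (countErrors(results) > 0) {
            // Skip the commit so the broken output never becomes visible to readers.
            return false;
        }
        commitAction.run();
        return true;
    }

    public static void main(String[] args) {
        List<FileWriteResult> results = Arrays.asList(
                new FileWriteResult("file-1", 0),
                new FileWriteResult("file-2", 3));
        boolean committed = commitIfClean(results,
                () -> System.out.println("compaction committed"));
        System.out.println("committed = " + committed);
    }
}
```

With this ordering, a compaction whose writers reported errors would stay uncommitted, so it could be rolled back or retried instead of exposing a broken parquet file to snapshot queries.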
