geserdugarov commented on PR #12796: URL: https://github.com/apache/hudi/pull/12796#issuecomment-2651347353
@danny0405, @xiarixiaoyao, @yuzhaojing, @wombatu-kun, hi! If you don't mind and have time, could you please review this PR? It is related to the corresponding RFC https://github.com/apache/hudi/pull/12697.

The main part of the proposed changes is already done in this PR. The only missing parts for now are consistent hashing support (_in progress_) and bounded context (_will check it next_).

I've also finished testing:

- Ran a benchmark on the `lineitem` table from TPC-H for the supported write scenarios and got about a 30% performance increase.
- Manually checked that restoring from a Flink checkpoint is successful.
- Enabled `write.fast.mode` by default to run all tests in https://github.com/apache/hudi/pull/12817. There are only 2 errors in `test-flink (flink1.20, 1.11.3)`: https://github.com/apache/hudi/actions/runs/13261652696/job/37019456652?pr=12817

```text
Error: testScheduleSplitPlan  Time elapsed: 0.034 s  <<< ERROR!
org.apache.hudi.exception.HoodieNotSupportedException: Currently, consistent hashing is not supported with enabled 'write.fast.mode'
	at org.apache.hudi.sink.cluster.ITTestFlinkConsistentHashingClustering.prepareData(ITTestFlinkConsistentHashingClustering.java:126)
	at org.apache.hudi.sink.cluster.ITTestFlinkConsistentHashingClustering.testScheduleSplitPlan(ITTestFlinkConsistentHashingClustering.java:79)
Error: testScheduleMergePlan  Time elapsed: 0.027 s  <<< ERROR!
org.apache.hudi.exception.HoodieNotSupportedException: Currently, consistent hashing is not supported with enabled 'write.fast.mode'
	at org.apache.hudi.sink.cluster.ITTestFlinkConsistentHashingClustering.prepareData(ITTestFlinkConsistentHashingClustering.java:126)
	at org.apache.hudi.sink.cluster.ITTestFlinkConsistentHashingClustering.testScheduleMergePlan(ITTestFlinkConsistentHashingClustering.java:104)
```

Both errors occur because consistent hashing is not supported with `write.fast.mode` yet. All other test cases passed.

--
This is an automated message from the Apache Git Service.
