eisig opened a new issue #789: HoodieMergeOnReadTable rollback hangs
URL: https://github.com/apache/incubator-hudi/issues/789

There seem to be two bugs on the master branch (commit: ae3c02fb3).

My steps:

1. Use HDFSParquetImporter to import from Hive to Hudi.
2. Use HoodieDeltaStreamer to import new data from Kafka. (I added an option to allow a missing checkpointStr.) The config is the same as in #779, with --disable-compaction. After that, ` select distinct _hoodie_commit_time from rt_table/ro_table` returns only the first commit time (I also used max() to make sure no newer commits are returned), but there are newer .deltacommit files in the .hoodie folder.
3. Restart the Spark job. In the Spark UI, the job hangs at `collect at HoodieMergeOnReadTable.java:318` (it hangs every time):

```
org.apache.spark.api.java.AbstractJavaRDDLike.collect(JavaRDDLike.scala:45)
com.uber.hoodie.table.HoodieMergeOnReadTable.rollback(HoodieMergeOnReadTable.java:318)
com.uber.hoodie.HoodieWriteClient.doRollbackAndGetStats(HoodieWriteClient.java:884)
com.uber.hoodie.HoodieWriteClient.rollbackInternal(HoodieWriteClient.java:962)
com.uber.hoodie.HoodieWriteClient.rollback(HoodieWriteClient.java:773)
com.uber.hoodie.HoodieWriteClient.rollbackInflightCommits(HoodieWriteClient.java:1182)
com.uber.hoodie.HoodieWriteClient.startCommitWithTime(HoodieWriteClient.java:1050)
com.uber.hoodie.HoodieWriteClient.startCommit(HoodieWriteClient.java:1043)
com.uber.hoodie.utilities.deltastreamer.DeltaSync.startCommit(DeltaSync.java:406)
com.uber.hoodie.utilities.deltastreamer.DeltaSync.writeToSink(DeltaSync.java:332)
com.uber.hoodie.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:227)
com.uber.hoodie.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:382)
java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
```
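For anyone trying to reproduce the mismatch in step 2, here is a rough sketch of how the .hoodie listing could be compared with the query result. It is not part of Hudi. The function name `newer_deltacommits`, the sample file names, and the sample commit times are all made up for illustration. It assumes completed delta commits show up as `<commit_time>.deltacommit` and that commit times are fixed-width digit strings, so plain string comparison orders them correctly.

```python
import re

# Assumption: a completed delta commit is a file named "<commit_time>.deltacommit".
# Files with any other suffix are ignored by this check.
DELTACOMMIT = re.compile(r"^(\d+)\.deltacommit$")


def newer_deltacommits(hoodie_files, max_queried_commit_time):
    """Return delta commit times in the listing that are newer than the
    max _hoodie_commit_time returned by the query.

    Assumes commit times are fixed-width digit strings, so lexicographic
    comparison matches chronological order.
    """
    times = []
    for name in hoodie_files:
        m = DELTACOMMIT.match(name)
        if m and m.group(1) > max_queried_commit_time:
            times.append(m.group(1))
    return sorted(times)


# Hypothetical listing of the .hoodie folder and a hypothetical query result.
files = [
    "20190715101500.commit",
    "20190715113000.deltacommit",
    "20190715120000.deltacommit",
    "20190715123000.deltacommit.inflight",
    "hoodie.properties",
]
print(newer_deltacommits(files, "20190715101500"))
```

A non-empty result means the timeline has completed delta commits that the query does not surface, which is the symptom described in step 2.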
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services
