n3nash commented on a change in pull request #2359:
URL: https://github.com/apache/hudi/pull/2359#discussion_r553198185



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractHoodieWriteClient.java
##########
@@ -707,24 +739,51 @@ public void rollbackInflightCompaction(HoodieInstant inflightInstant, HoodieTabl
   }
 
   /**
-   * Cleanup all pending commits.
+   * Rollback all failed commits.
    */
-  private void rollbackPendingCommits() {
+  public void rollbackFailedCommits() {
     HoodieTable<T, I, K, O> table = createTable(config, hadoopConf);
-    HoodieTimeline inflightTimeline = table.getMetaClient().getCommitsTimeline().filterPendingExcludingCompaction();
-    List<String> commits = inflightTimeline.getReverseOrderedInstants().map(HoodieInstant::getTimestamp)
-        .collect(Collectors.toList());
-    for (String commit : commits) {
-      if (HoodieTimeline.compareTimestamps(commit, HoodieTimeline.LESSER_THAN_OR_EQUALS,
+    List<String> instantsToRollback = getInstantsToRollback(table);

Review comment:
       For part 1 of the question, what you mentioned is true. In addition, I have also added a check to the merging of log blocks: if a block's instant is not on the active timeline, that block is ignored, so for this specific case rollback blocks are not needed. As for the case where the instant has been archived, that cannot happen until the failed commit has been rolled back by the cleaner (changes to the archival logic ensure that).
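   The active-timeline check described above can be sketched roughly like this (a minimal illustration only, not the actual Hudi merge code; `LogBlockFilter` and `filterCommittedBlocks` are hypothetical names):

   ```java
   import java.util.List;
   import java.util.Set;
   import java.util.stream.Collectors;

   // Sketch: while merging log blocks, any block whose commit instant is no
   // longer on the active timeline (e.g. it belonged to a failed commit) is
   // simply ignored, so no explicit rollback block is needed for it.
   public class LogBlockFilter {

     // blockInstants: the instant (commit time) each log block was written under.
     // activeInstants: instants currently on the active timeline.
     static List<String> filterCommittedBlocks(List<String> blockInstants,
                                               Set<String> activeInstants) {
       return blockInstants.stream()
           .filter(activeInstants::contains) // drop blocks from failed/rolled-back commits
           .collect(Collectors.toList());
     }
   }
   ```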
   
   For part 2, it's slightly complicated. 
   
   For parallel writing, the assumption is the following:
   1) Writer 1 starts writing to file groups f1_c1, f2_c1
   2) Writer 2 starts writing to file groups f3_c2, f4_c2
   3) Any scheduling operation (schedule compaction, schedule clustering, clean) can then not run on f1, f2, f3, f4, because updates may already have started on f1.log_c1, and scheduleCompaction could create a new phantom file slice f1_c3, in which case the f1.log_c1 updates would be lost.
   So, for all operations to run concurrently, they truly must not overlap in file slices.
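   In other words, the safety condition amounts to a disjointness check on the file groups each operation touches. A trivial sketch of that check (a hypothetical helper, not Hudi's actual conflict-resolution code):

   ```java
   import java.util.Collections;
   import java.util.Set;

   // Sketch: two concurrent operations are safe only if the file groups they
   // touch are disjoint (writer 1 on {f1, f2} and writer 2 on {f3, f4} is
   // fine; sharing f2 is not).
   public class FileGroupOverlap {
     static boolean canRunConcurrently(Set<String> opA, Set<String> opB) {
       return Collections.disjoint(opA, opB);
     }
   }
   ```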
   
   For concurrent writing, we may need some more changes:
   1) Writer 1 starts writing to file groups f1_c1, f2_c1
   2) Writer 2 starts writing to file groups **f2_c2**, f3_c2
   3) scheduleCompaction for f2_c1 & f1_c1 -> f2_c3, f1_c3
   
   Now the other PR needs some more enhancements: it has to check not only how many writes have happened since the time it started, but also whether any other "schedule" operations are on the timeline. I think we will have to implement a ConcurrentRequestStrategyWithPriority, where a schedule operation is allowed to fail if it sees a writer running, etc.
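   Purely as a sketch of that idea (only the class name comes from this discussion; everything else here is made up), the priority rule could look like:

   ```java
   import java.util.List;

   // Sketch: "schedule" operations (compaction/clustering/clean scheduling)
   // get lower priority and fail fast when a writer is in flight, while
   // writers are allowed to proceed (their conflicts are resolved at commit).
   public class ConcurrentRequestStrategyWithPriority {

     enum OperationType { WRITE, SCHEDULE }

     // Decide whether a requested operation may proceed, given the operations
     // currently in flight on the timeline.
     static boolean mayProceed(OperationType requested, List<OperationType> inFlight) {
       if (requested == OperationType.SCHEDULE) {
         return !inFlight.contains(OperationType.WRITE); // schedule yields to writers
       }
       return true; // writers always proceed
     }
   }
   ```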



