alexeykudinkin commented on a change in pull request #4716:
URL: https://github.com/apache/hudi/pull/4716#discussion_r796088457



##########
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/MarkerBasedRollbackStrategy.java
##########
@@ -90,42 +86,41 @@ public MarkerBasedRollbackStrategy(HoodieTable<?, ?, ?, ?> 
table, HoodieEngineCo
                 Collections.singletonList(fullDeletePath.toString()),
                 Collections.emptyMap());
           case APPEND:
+            // NOTE: This marker file-path does NOT correspond to a log-file, 
but rather is a phony
+            //       path serving as a "container" for the following 
components:
+            //          - Base file's file-id
+            //          - Base file's commit instant
+            //          - Partition path
             return 
getRollbackRequestForAppend(WriteMarkers.stripMarkerSuffix(markerFilePath));
           default:
             throw new HoodieRollbackException("Unknown marker type, during 
rollback of " + instantToRollback);
         }
-      }, parallelism).stream().collect(Collectors.toList());
+      }, parallelism);
     } catch (Exception e) {
       throw new HoodieRollbackException("Error rolling back using marker files 
written for " + instantToRollback, e);
     }
   }
 
-  protected HoodieRollbackRequest getRollbackRequestForAppend(String 
appendBaseFilePath) throws IOException {
-    Path baseFilePathForAppend = new Path(basePath, appendBaseFilePath);
+  protected HoodieRollbackRequest getRollbackRequestForAppend(String 
markerFilePath) throws IOException {
+    Path baseFilePathForAppend = new Path(basePath, markerFilePath);
     String fileId = FSUtils.getFileIdFromFilePath(baseFilePathForAppend);
     String baseCommitTime = 
FSUtils.getCommitTime(baseFilePathForAppend.getName());
-    String partitionPath = FSUtils.getRelativePartitionPath(new 
Path(basePath), new Path(basePath, appendBaseFilePath).getParent());
-    Map<FileStatus, Long> writtenLogFileSizeMap = 
getWrittenLogFileSizeMap(partitionPath, baseCommitTime, fileId);
-    Map<String, Long> writtenLogFileStrSizeMap = new HashMap<>();
-    for (Map.Entry<FileStatus, Long> entry : writtenLogFileSizeMap.entrySet()) 
{
-      writtenLogFileStrSizeMap.put(entry.getKey().getPath().toString(), 
entry.getValue());
-    }
-    return new HoodieRollbackRequest(partitionPath, fileId, baseCommitTime, 
Collections.emptyList(), writtenLogFileStrSizeMap);
+    String relativePartitionPath = FSUtils.getRelativePartitionPath(new 
Path(basePath), baseFilePathForAppend.getParent());
+    Path partitionPath = FSUtils.getPartitionPath(config.getBasePath(), 
relativePartitionPath);
+
+    // NOTE: Since we're rolling back incomplete Delta Commit, it only could 
have appended its
+    //       block to the latest log-file
+    // TODO(HUDI-1517) use provided marker-file's path instead
+    HoodieLogFile latestLogFile = 
FSUtils.getLatestLogFile(table.getMetaClient().getFs(), partitionPath, fileId,
+        HoodieFileFormat.HOODIE_LOG.getFileExtension(), baseCommitTime).get();
+
+    // NOTE: Marker's don't carry information about the cumulative size of the 
blocks that have been appended,
+    //       therefore we simply stub this value.
+    Map<String, Long> logFilesWithBlocsToRollback =

Review comment:
       There's no issues w/ `HoodieMetadataFileInfo` -- it always stores 
file-size. The problem that i was bringing up earlier is that we manicure 
`FileStatus` w/ delta file-size during append.
   
   But since deltas are only used for log-file appends, and it isn't actually 
faulty here, i don't think there's need for us to drop everything and try to 
fix it right away, since its not an issue at the moment (and unlikely to 
become) and its scope is actually very limited.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to