scxwhite commented on a change in pull request #4400:
URL: https://github.com/apache/hudi/pull/4400#discussion_r772868883



##########
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/compact/HoodieCompactor.java
##########
@@ -264,8 +264,11 @@ HoodieCompactionPlan generateCompactionPlan(
         .getLatestFileSlices(partitionPath)
         .filter(slice -> 
!fgIdsInPendingCompactionAndClustering.contains(slice.getFileGroupId()))
         .map(s -> {
+          // We can think that the latest data is in the latest delta log 
file, so we sort it from large

Review comment:
       You're right, but in most cases, the new data is often in the latest 
delta log, so we sort it from large to small according to the instance time. 
The program will avoid updating the data in the externalspillablemap to save 
compact time. What do you think




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to