scxwhite commented on a change in pull request #4400:
URL: https://github.com/apache/hudi/pull/4400#discussion_r785259224
##########
File path:
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/compact/HoodieCompactor.java
##########
@@ -264,8 +264,11 @@ HoodieCompactionPlan generateCompactionPlan(
.getLatestFileSlices(partitionPath)
.filter(slice ->
!fgIdsInPendingCompactionAndClustering.contains(slice.getFileGroupId()))
.map(s -> {
+ // We can think that the latest data is in the latest delta log
file, so we sort it from large
Review comment:
> I think you are assuming the later writes in the log always overwrites
> the earlier ones? this is not true always.
In the compaction plan generation phase, I only changed the order in which
the delta log files are read. I have been running this change in our internal
production environment for a month, with clustering, cleaning, and compaction
all inline, and no data anomalies have occurred. However, I am not sure how I
should test this. Can you give me some suggestions?
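
One possible starting point for such a test is to make the concern in the
quoted comment concrete: replay the same log files in both orders and check
whether the merged result changes. The sketch below is plain Java and does not
use any Hudi APIs. `Rec`, `replay`, and the two merge rules are hypothetical
stand-ins for a log record, the log scan, and a payload's merge logic. It only
illustrates that read order is harmless when the merge rule compares an
ordering value, and not harmless when the rule is "whatever is read later
wins".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

public class LogOrderSketch {

  /** Hypothetical stand-in for a record in a delta log file; not a Hudi class. */
  static final class Rec {
    final String key;
    final long orderingVal;
    final String value;

    Rec(String key, long orderingVal, String value) {
      this.key = key;
      this.orderingVal = orderingVal;
      this.value = value;
    }
  }

  /** The record read later always replaces the one read earlier. */
  static final BinaryOperator<Rec> LAST_READ_WINS = (merged, incoming) -> incoming;

  /** The record with the larger ordering value wins; ties keep the one read first. */
  static final BinaryOperator<Rec> HIGHEST_ORDERING_WINS =
      (merged, incoming) -> incoming.orderingVal > merged.orderingVal ? incoming : merged;

  /** Reads the log files in the given order and merges records by key. */
  static Map<String, String> replay(List<List<Rec>> logFiles, BinaryOperator<Rec> mergeRule) {
    Map<String, Rec> merged = new HashMap<>();
    for (List<Rec> file : logFiles) {
      for (Rec rec : file) {
        merged.merge(rec.key, rec, mergeRule);
      }
    }
    Map<String, String> result = new HashMap<>();
    for (Rec rec : merged.values()) {
      result.put(rec.key, rec.value);
    }
    return result;
  }

  /** Two log files where the later file carries a lower ordering value (late-arriving data). */
  static List<List<Rec>> sampleLogs() {
    return Arrays.asList(
        Arrays.asList(new Rec("k1", 2L, "v2")),
        Arrays.asList(new Rec("k1", 1L, "v1")));
  }

  /** Returns a copy of the log file list in reverse read order. */
  static List<List<Rec>> reversed(List<List<Rec>> logFiles) {
    List<List<Rec>> copy = new ArrayList<>(logFiles);
    Collections.reverse(copy);
    return copy;
  }

  public static void main(String[] args) {
    List<List<Rec>> logs = sampleLogs();
    // Read-order dependent rule: the two orders disagree.
    System.out.println(replay(logs, LAST_READ_WINS));            // {k1=v1}
    System.out.println(replay(reversed(logs), LAST_READ_WINS));  // {k1=v2}
    // Ordering-value rule: both orders agree.
    System.out.println(replay(logs, HIGHEST_ORDERING_WINS));           // {k1=v2}
    System.out.println(replay(reversed(logs), HIGHEST_ORDERING_WINS)); // {k1=v2}
  }
}
```

A real test would need to do the same comparison through the actual log
scanner and the payload classes used by the table, including the tie case
where two records share the same ordering value, since that is where the
result still depends on read order in this sketch.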