kbuci commented on code in PR #10965:
URL: https://github.com/apache/hudi/pull/10965#discussion_r1568070078
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java:
##########
@@ -1135,8 +1138,36 @@ protected void
completeLogCompaction(HoodieCommitMetadata metadata, HoodieTable
*/
protected HoodieWriteMetadata<O> compact(String compactionInstantTime,
boolean shouldComplete) {
HoodieTable table = createTable(config, context.getHadoopConf().get());
+ Option<HoodieInstant> instantToCompactOption =
Option.fromJavaOptional(table.getActiveTimeline()
+ .filterCompletedAndCompactionInstants()
+ .getInstants()
+ .stream()
+ .filter(instant ->
HoodieActiveTimeline.EQUALS.test(instant.getTimestamp(), compactionInstantTime))
+ .findFirst());
+ try {
+ // Transaction serves to ensure only one compact job for this instant
will start heartbeat, and any other concurrent
+ // compact job will abort if they attempt to execute compact before
heartbeat expires
+ // Note that as long as all jobs for this table use this API for
compact, then this alone should prevent
+ // compact rollbacks from running concurrently to compact commits.
+ txnManager.beginTransaction(instantToCompactOption,
txnManager.getLastCompletedTransactionOwner());
Review Comment:
> Even on REQUESTED: since we are taking a lock and checking for heart beat
client, wouldn't that ensure only one writer can proceed and the other writer
will fail.
> why would (3) not see heartbeat? we are taking a block and start emitting
heart beats right.
Oh in step (3) the job (B) will see a heartbeat. Just to clarify the
scenario here only applies if we were to have jobs ignore heartbeat check if
the compact plan is not inflight yet (in case that might have been the source
of confusion here).
If you meant why Job (C) in (7) doesn't see heartbeat, then the reason for
that is that Job (A) created a heartbeat but Job (B) did not, so once (A) stops
then the heartbeat is expired and (C) has no way to know that there is a job
(B) working on this plan. At first glance it might seem like having Job (B)
also emit a heartbeat (during Step (2)) would avoid this issue, but that would
cause another issue: when Job (A) terminates it would clean up the heartbeat
file. So since there is no semaphore-like or shared-exclusive lock-like
functionality in heartbeating I'm not sure how or if we can have concurrent
compact jobs share or ignore a heartbeat.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]