hudi-agent commented on code in PR #19023:
URL: https://github.com/apache/hudi/pull/19023#discussion_r3434936243
##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteOperatorCoordinator.java:
##########
@@ -652,6 +652,8 @@ private void doCommit(long checkpointId, String instant, List<WriteStatus> dataW
FlinkValidatorUtils.runValidators(conf, instant, allWriteStatus, checkpointCommitMetadata, () -> StreamerUtil.getPreviousCommitMetadata(this.metaClient));
+ // refresh the last txn metadata for OCC
+ this.writeClient.preTxn(tableState.operationType, this.metaClient, instant);
Review Comment:
🤖 Following up on @danny0405's earlier point that `preTxn` should be invoked
before each instant starts: moving it from `startInstant` into `doCommit` goes
the other way and looks like it could regress normal multi-writer OCC. With the
baseline captured only microseconds before commit,
`SimpleConcurrentFileWritesConflictResolutionStrategy.getCandidateInstantsV8AndAbove`
calls `findInstantsAfter(lastSuccessful.requestedTime())`, which uses a strict
`>` and therefore excludes the most recently completed instant itself. As a
result, an external writer that started and completed during our write window
becomes the new baseline and is silently dropped from the candidate set. It is
also not in `pendingInflightAndRequestedInstants`, so
`getCompletedInstantsDuringCurrentWriteOperation` won't catch it either.
Spark's `BaseHoodieWriteClient.preWrite` sets `lastCompletedTxnAndMetadata` at
write start for exactly this reason. Could the previous shape (`preTxn` at
`startInstant`, plus the explicit between-iteration refresh in
`commitInstants`) be restored to preserve the detection window?
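
To make the concern concrete, here is a minimal, self-contained sketch of the
filtering behaviour described above. It does not use Hudi's classes: `Instant`,
`lastCompletedAt`, `candidatesAfter`, and `candidates` are hypothetical
stand-ins, timestamps are simplified to zero-padded strings, and the "last
completed" instant is approximated as the completed instant with the greatest
requested time. It only illustrates how a strict `>` filter interacts with the
moment the baseline is captured.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class OccBaselineSketch {

    /** Hypothetical stand-in for a completed instant; not Hudi's HoodieInstant. */
    static final class Instant {
        final String requestedTime;
        final String completionTime;

        Instant(String requestedTime, String completionTime) {
            this.requestedTime = requestedTime;
            this.completionTime = completionTime;
        }
    }

    /** Requested time of the latest instant already completed at time `now`, if any. */
    static Optional<String> lastCompletedAt(List<Instant> timeline, String now) {
        return timeline.stream()
            .filter(i -> i.completionTime.compareTo(now) <= 0)
            .map(i -> i.requestedTime)
            .max(String::compareTo);
    }

    /** Strict "after" filter: an instant equal to the baseline is excluded. */
    static List<String> candidatesAfter(List<Instant> timeline, String baseline) {
        List<String> out = new ArrayList<>();
        for (Instant i : timeline) {
            if (i.requestedTime.compareTo(baseline) > 0) {
                out.add(i.requestedTime);
            }
        }
        return out;
    }

    /** Candidate set seen at commit when the baseline was captured at `baselineCapturedAt`. */
    static List<String> candidates(List<Instant> timeline, String baselineCapturedAt) {
        String baseline = lastCompletedAt(timeline, baselineCapturedAt).orElse("");
        return candidatesAfter(timeline, baseline);
    }

    public static void main(String[] args) {
        // Completed instants visible at commit time; our own write began at "003".
        List<Instant> timeline = List.of(
            new Instant("001", "002"),   // finished before our write began
            new Instant("004", "005"));  // external writer, ran inside our write window

        System.out.println(candidates(timeline, "003")); // baseline taken at write start
        System.out.println(candidates(timeline, "006")); // baseline taken just before commit
    }
}
```

In this toy model the first call prints `[004]` (the external writer is a
conflict candidate), while the second prints `[]` because `004` has itself
become the baseline and the strict comparison filters it out.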
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>