kbuci commented on PR #18160:
URL: https://github.com/apache/hudi/pull/18160#issuecomment-3881836875

   > Not sure I understand the exact sequence here. can you help me understand.
   > 
   > Any write to mdt is guarded by data table lock and hence its fair to say, mdt is a single writer table.
   > 
   > whenever we wanted to apply any commits or clean or rollback from data table to mdt, we instantiate a new mdt writer, apply the commit/clean/rollback and close it out.
   > 
   > Which means, every new write to mdt, will always have updated mdt timeline. So, how come writer 2 could see a stale timeline in mdt, if writer 1 just happened to update mdt timeline.
   
   @nsivabalan Sure, let me clarify. This isn't actually a multi-writer scenario. The sequence is:
   
   - The `processAndCommit` call above (https://github.com/apache/hudi/pull/18160/changes#diff-65dd70e3cb912c49b3972598c897e47c8ef08f687789f86cb33f567006ef50e9R215) runs.
   - It indirectly calls `commitInternal`.
   - `commitInternal` then calls `metadataMetaClient = rollbackFailedWrites(dataWriteConfig, writeClient, metadataMetaClient);`, which rolls back the inflight deltacommits in the MDT timeline, including the target `deltaCommitInstant` of the rollback.
   
   Although this rollback call reloads the MDT metaclient/timeline, we don't see the refreshed timeline by the time we get to https://github.com/apache/hudi/pull/18160/changes#diff-65dd70e3cb912c49b3972598c897e47c8ef08f687789f86cb33f567006ef50e9L218. As a result, we attempt to roll back a deltacommit in the MDT timeline that has already been rolled back.
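
To make the sequence above concrete, here is a minimal toy model in Java. It is a sketch under assumptions, not Hudi code: `ToyStorage`, `ToyMetaClient`, and the simplified `rollbackFailedWrites` and `rollback` methods are hypothetical stand-ins, and the snapshot-at-load behavior is assumed only for illustration. It shows how a caller that keeps using a view loaded before the rollback can try to roll back an instant that is already gone.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: every class and method here is hypothetical, not a Hudi API.
class StaleTimelineSketch {

    /** Stands in for the MDT timeline on storage: the set of inflight instants. */
    static final class ToyStorage {
        final Set<String> inflight = new TreeSet<>();
    }

    /** Stands in for a meta client: it snapshots the timeline when it is created. */
    static final class ToyMetaClient {
        final ToyStorage storage;
        final Set<String> cachedInflight;

        ToyMetaClient(ToyStorage storage) {
            this.storage = storage;
            this.cachedInflight =
                Collections.unmodifiableSet(new TreeSet<>(storage.inflight));
        }
    }

    /** Rolls back all inflight instants and returns a freshly loaded client. */
    static ToyMetaClient rollbackFailedWrites(ToyMetaClient client) {
        client.storage.inflight.clear();
        return new ToyMetaClient(client.storage);
    }

    /**
     * Rolls back one instant based on the client's cached view.
     * Returns false if the cached view says there is nothing to roll back.
     */
    static boolean rollback(ToyMetaClient client, String instant) {
        if (!client.cachedInflight.contains(instant)) {
            return false;
        }
        if (!client.storage.inflight.remove(instant)) {
            throw new IllegalStateException(
                "Instant " + instant + " was already rolled back");
        }
        return true;
    }

    public static void main(String[] args) {
        ToyStorage storage = new ToyStorage();
        storage.inflight.add("001");

        ToyMetaClient stale = new ToyMetaClient(storage);        // loaded before the rollback
        ToyMetaClient refreshed = rollbackFailedWrites(stale);   // sees the empty timeline

        System.out.println("refreshed view rolls back: " + rollback(refreshed, "001"));
        try {
            rollback(stale, "001");                               // still believes 001 is inflight
        } catch (IllegalStateException e) {
            System.out.println("stale view fails: " + e.getMessage());
        }
    }
}
```

In this model the refreshed client skips the instant, while the stale client attempts a second rollback and fails, which mirrors the double rollback described above.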
   
   


