hgudladona commented on issue #12298:
URL: https://github.com/apache/hudi/issues/12298#issuecomment-2494512842

   Let me clarify.
   
   This is not a multi-writer setup. We have only one writer job, with cleans running asynchronously.
   
   The scenario can be described like this:
   Active Timeline Before: C1,C2,C3 ; Commit C4 writes some N records to partition tenant=12345/date=20241120 -- Active Timeline After: C1,C2,C3,C4
   Active Timeline Before: C1,C2,C3,C4 ; Commit C5 writes some N records to partition tenant=12345/date=20241121 -- Active Timeline After: C1,C2,C3,C4,C5
   -- Archival moves commits C1,C2,C3,C4 to the archived dir --
   Active Timeline Before: C5 ; Commit C6 writes some N records to partition tenant=12345/date=20241120 -- Intermittently, this commit fails with the exception during the write phase.

   Partition tenant=12345/date=20241120 is no longer tracked in the active timeline. When new writes target this partition, the writer identifies a small file written during commit C4, but that file ID cannot be found in the write phase, although the file exists on the file system. The write fails intermittently when the timeline server is ON and consistently succeeds when it is OFF.
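
   The sequence above can be sketched as a toy model. This is only an illustration in plain Python, not Hudi code: the `Timeline` class and its methods are hypothetical names, the partitions written by C1 to C3 are not stated above and are left as None, and the sketch reproduces only the timeline bookkeeping, not the failure itself.

```python
# Toy model of the commit/archive sequence. All names are hypothetical;
# nothing here is a Hudi API.
from dataclasses import dataclass, field


@dataclass
class Timeline:
    active: list = field(default_factory=list)    # (instant, partition) pairs
    archived: list = field(default_factory=list)  # instants moved out of active

    def commit(self, instant, partition):
        self.active.append((instant, partition))

    def archive(self, keep_latest):
        # Move everything except the newest `keep_latest` instants to the archive.
        cut = max(0, len(self.active) - keep_latest)
        self.archived.extend(self.active[:cut])
        self.active = self.active[cut:]

    def active_partitions(self):
        return {partition for _, partition in self.active}


P_20 = "tenant=12345/date=20241120"
P_21 = "tenant=12345/date=20241121"

timeline = Timeline()
for instant in ("C1", "C2", "C3"):
    timeline.commit(instant, None)  # partitions for C1..C3 are not given

timeline.commit("C4", P_20)      # active: C1,C2,C3,C4
timeline.commit("C5", P_21)      # active: C1,C2,C3,C4,C5
timeline.archive(keep_latest=1)  # C1..C4 move to the archive, active: C5

# C6 targets the partition last written by C4, which is now archived.
tracked = P_20 in timeline.active_partitions()
print("active instants:", [instant for instant, _ in timeline.active])
print("target partition tracked in active timeline:", tracked)
```

   At the point where C6 starts, the only active instant is C5, and the partition C6 targets was last touched by an archived commit. That is the state in which the small file from C4 is picked up but its file ID is not found.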
    
   Hope this helps. 


