hgudladona commented on issue #12298:
URL: https://github.com/apache/hudi/issues/12298#issuecomment-2494512842
Let me clarify.
This is not a multi-writer setup. We have only one writer job, with cleaning
running asynchronously.
The scenario can be described like this:

Active Timeline Before: C1,C2,C3 ; Commit C4 writes N records to partition
tenant=12345/date=20241120 -- Active Timeline After: C1,C2,C3,C4

Active Timeline Before: C1,C2,C3,C4 ; Commit C5 writes N records to partition
tenant=12345/date=20241121 -- Active Timeline After: C1,C2,C3,C4,C5

-- Archival moves commits C1,C2,C3,C4 to the archived dir --

Active Timeline Before: C5 ; Commit C6 writes N records to partition
tenant=12345/date=20241120 -- Intermittently, this commit fails with the
exception during the write phase.

Partition tenant=12345/date=20241120 is no longer tracked in the active
timeline. When new writes target this partition, the writer identifies a small
file written during commit C4, but that file ID cannot be found in the write
phase, although the file exists on the file system. The commit fails
intermittently when the timeline server is ON and consistently succeeds when
it is OFF.
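
To make the sequence easier to follow, here is a minimal sketch that replays it with a toy in-memory model. It is not Hudi's API: `ToyTimeline` and `replay_scenario` are hypothetical names used only for illustration, and the partitions written by C1..C3 are left as `None` because they are not stated above. It only shows that, after archival, no commit in the active timeline references tenant=12345/date=20241120. It does not reproduce the failure itself.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTimeline:
    """Toy stand-in for a table timeline (hypothetical, not Hudi's API)."""
    active: list = field(default_factory=list)    # (instant, partition) pairs
    archived: list = field(default_factory=list)  # same shape, after archival

    def commit(self, instant, partition):
        # Record a completed commit and the partition it wrote to.
        self.active.append((instant, partition))

    def archive(self, instants):
        # Move the given instants from the active list to the archived list.
        moved = [c for c in self.active if c[0] in instants]
        self.active = [c for c in self.active if c[0] not in instants]
        self.archived.extend(moved)

    def active_instants(self):
        return [c[0] for c in self.active]

    def partitions_in_active(self):
        # Partitions referenced by commits still in the active timeline.
        return {c[1] for c in self.active if c[1] is not None}


def replay_scenario():
    t = ToyTimeline()
    for instant in ("C1", "C2", "C3"):
        t.commit(instant, None)  # partitions for C1..C3 are not given
    t.commit("C4", "tenant=12345/date=20241120")
    t.commit("C5", "tenant=12345/date=20241121")
    t.archive({"C1", "C2", "C3", "C4"})
    return t


timeline = replay_scenario()
target = "tenant=12345/date=20241120"  # partition that C6 writes to
print(timeline.active_instants())
print(target in timeline.partitions_in_active())
```

In this model the file written by C4 is known only through an archived commit at the point where C6 starts, which matches the state described above.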
Hope this helps.