danny0405 commented on PR #9182:
URL: https://github.com/apache/hudi/pull/9182#issuecomment-1654867431

   > How should these log files be cleaned up? Duplicate bucket id files cause 
tasks to fail to start all the time.
   
   The log files are expected to be cleaned up when the instant is committed (we 
have a marker mechanism to ensure that files left by retried attempts get 
cleaned up). The issue here is why these partial files are visible to the 
`BucketStreamWriter`; that's the direction we should dig into.
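
To make the marker idea concrete, here is a toy model of commit-time reconciliation. It is only a sketch under assumptions, not Hudi's actual code: the function `reconcile_on_commit` and the file names are hypothetical. It assumes a marker is recorded for every file an attempt starts writing, and that the commit lists only the files from successful attempts.

```python
def reconcile_on_commit(marked_files, committed_files):
    """Return files that have a marker but are not part of the commit.

    Toy model only: `marked_files` stands for every file some write attempt
    started, `committed_files` for the files the successful attempts reported.
    """
    return sorted(set(marked_files) - set(committed_files))


# Hypothetical names: attempt 0 for bucket 00000001 failed and was retried.
marked = [
    "00000001-attempt0.log",
    "00000001-attempt1.log",
    "00000002-attempt0.log",
]
committed = ["00000001-attempt1.log", "00000002-attempt0.log"]

# Files to delete when the instant is committed.
leftovers = reconcile_on_commit(marked, committed)
print(leftovers)
```

In this model the leftover file from the failed attempt stays on storage until the commit-time step runs, so a reader that lists files before then can still see it.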


