whocanhu commented on PR #10898:
URL: https://github.com/apache/hudi/pull/10898#issuecomment-2081430619

   We also hit this issue in our tests. We use the simple bucket index on a 
non-partitioned MOR table. After we restart the job, the first write succeeds, 
but then the bucket IDs conflict:
   00000074-3413-4e9e-b4dd-4676e4eeccb4-0_74-8-3151_20240426173957249.parquet 
(generated by bulk insert)
   00000074-50bc-4b34-82c5-08c210d82d33-0_74-26-37004_20240428145515030.parquet 
(generated by a deltacommit after the restart)
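
   To check a table directory for this kind of conflict, something like the 
following could help. It is only a rough sketch, not Hudi code: the function 
name `find_bucket_conflicts` is made up, and it assumes base file names have 
the form <fileId>_<writeToken>_<instantTime>.<ext>, with the first 8 
characters of the file ID being the zero-padded bucket ID, as in the two 
names above. Log files are not handled.

```python
from collections import defaultdict


def find_bucket_conflicts(file_names):
    """Return {bucket_id: [file_ids]} for buckets that have more than one file ID.

    Assumption: each name looks like <fileId>_<writeToken>_<instantTime>.<ext>
    and the first 8 characters of <fileId> are the zero-padded bucket ID.
    """
    file_ids_by_bucket = defaultdict(set)
    for name in file_names:
        # Everything before the first underscore is treated as the file ID.
        file_id = name.split("_", 1)[0]
        bucket_id = int(file_id[:8])
        file_ids_by_bucket[bucket_id].add(file_id)
    # A bucket is in conflict when two different file IDs claim it.
    return {
        bucket: sorted(ids)
        for bucket, ids in file_ids_by_bucket.items()
        if len(ids) > 1
    }


names = [
    "00000074-3413-4e9e-b4dd-4676e4eeccb4-0_74-8-3151_20240426173957249.parquet",
    "00000074-50bc-4b34-82c5-08c210d82d33-0_74-26-37004_20240428145515030.parquet",
]
conflicts = find_bucket_conflicts(names)
for bucket, ids in conflicts.items():
    print(bucket, ids)
```

   With the two file names above, bucket 74 is reported with both file IDs.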
   According to the driver log, the restarted job read a rollback commit, and 
the timeline does not seem to have loaded all the buckets:
   24/04/28 14:57:32 INFO HoodieBucketIndex: Get BucketIndexLocationMapper for 
partitions: []
   24/04/28 14:57:32 INFO HoodieActiveTimeline: Loaded instants upto : 
Option{val=[20240428145515193__rollback__COMPLETED__20240428145515760]}

