xuzifu666 commented on PR #10898:
URL: https://github.com/apache/hudi/pull/10898#issuecomment-2081431584

   > We also hit this issue in our tests. We use the simple bucket index with a
   > non-partitioned MOR table. After the job restarts, the first write
   > succeeds, but after that the bucket ids conflict:
   >
   > 00000074-3413-4e9e-b4dd-4676e4eeccb4-0_74-8-3151_20240426173957249.parquet
   > (generated by bulk insert)
   > 00000074-50bc-4b34-82c5-08c210d82d33-0_74-26-37004_20240428145515030.parquet
   > (generated by a deltacommit after the restart)
   >
   > According to the driver log, the restarted job read a rollback commit, and
   > it seems the timeline did not load all the buckets completely:
   >
   > 24/04/28 14:57:32 INFO HoodieBucketIndex: Get BucketIndexLocationMapper for partitions: []
   > 24/04/28 14:57:32 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240428145515193__rollback__COMPLETED__20240428145515760]}
   
   Yes, it is a serious problem that would block users' business. I would raise
it so that we can first find the erroneous instant, which would help users
continue their business. @danny0405
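
   For anyone checking whether a table is affected, here is a minimal sketch
(not part of this PR) that flags the conflict described above from a list of
base file names. It assumes the file name layout
`<fileId>_<writeToken>_<instantTime>.parquet` and that the first 8 characters
of the fileId are the zero-padded bucket id, as in the two files quoted above.
The helper name `find_bucket_conflicts` is hypothetical.

```python
import re
from collections import defaultdict

# Assumed base-file name layout: <fileId>_<writeToken>_<instantTime>.parquet
FILE_NAME = re.compile(
    r"^(?P<file_id>[^_]+)_(?P<write_token>[^_]+)_(?P<instant>\d+)\.parquet$"
)


def find_bucket_conflicts(file_names):
    """Return {bucket_id: sorted fileIds} for buckets claimed by more than one fileId."""
    file_ids_by_bucket = defaultdict(set)
    for name in file_names:
        match = FILE_NAME.match(name)
        if match is None:
            continue  # skip names that do not follow the assumed layout
        file_id = match.group("file_id")
        # Assumption: the leading 8 digits of the fileId encode the bucket id.
        bucket_id = int(file_id[:8])
        file_ids_by_bucket[bucket_id].add(file_id)
    # A bucket is in conflict when two different file groups map to it.
    return {
        bucket: sorted(ids)
        for bucket, ids in file_ids_by_bucket.items()
        if len(ids) > 1
    }


# The two file names reported in the quoted comment.
reported = [
    "00000074-3413-4e9e-b4dd-4676e4eeccb4-0_74-8-3151_20240426173957249.parquet",
    "00000074-50bc-4b34-82c5-08c210d82d33-0_74-26-37004_20240428145515030.parquet",
]
conflicts = find_bucket_conflicts(reported)
```

With the two reported files, `conflicts` has a single entry for bucket 74 that
lists both fileIds, which is the duplicate file group for one bucket that the
restart produced.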


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
