whocanhu commented on PR #10898:
URL: https://github.com/apache/hudi/pull/10898#issuecomment-2081430619
We also hit this issue in our tests. We use the simple bucket index with a
MOR table without partitions. After we restart the job, the first write
succeeds, but after that the bucket IDs conflict:
00000074-3413-4e9e-b4dd-4676e4eeccb4-0_74-8-3151_20240426173957249.parquet
(generated by bulk insert)
00000074-50bc-4b34-82c5-08c210d82d33-0_74-26-37004_20240428145515030.parquet
(generated by a deltacommit after a restart)
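
To make the conflict explicit, here is a minimal sketch that groups base files by bucket ID and reports any bucket backed by more than one file ID. It assumes the file name layout `<fileId>_<writeToken>_<instantTime>.parquet` and that the first 8 characters of the file ID encode the bucket ID, which is how the simple bucket index appears to name files. The helper names `parse_base_file` and `find_bucket_conflicts` are hypothetical and not part of Hudi.

```python
from collections import defaultdict


def parse_base_file(name):
    """Split a base file name into (bucket_id, file_id, instant_time).

    Assumes the layout <fileId>_<writeToken>_<instantTime>.<ext> and that
    the first 8 characters of the file ID are the zero-padded bucket ID.
    """
    stem = name.rsplit(".", 1)[0]
    file_id, _write_token, instant_time = stem.split("_")
    return int(file_id[:8]), file_id, instant_time


def find_bucket_conflicts(names):
    """Return {bucket_id: sorted file IDs} for buckets with more than one file ID."""
    groups = defaultdict(set)
    for name in names:
        bucket_id, file_id, _ = parse_base_file(name)
        groups[bucket_id].add(file_id)
    return {b: sorted(ids) for b, ids in groups.items() if len(ids) > 1}


# The two files from this report: same bucket, different file IDs.
files = [
    "00000074-3413-4e9e-b4dd-4676e4eeccb4-0_74-8-3151_20240426173957249.parquet",
    "00000074-50bc-4b34-82c5-08c210d82d33-0_74-26-37004_20240428145515030.parquet",
]
conflicts = find_bucket_conflicts(files)
for bucket_id, file_ids in conflicts.items():
    print(f"bucket {bucket_id} has {len(file_ids)} file IDs: {file_ids}")
```

Running a check like this over a listing of the table's base files should show whether other buckets are affected besides bucket 74.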
According to the driver log, the restarted job read a rollback instant, and
it seems the timeline did not load all of the buckets completely:
24/04/28 14:57:32 INFO HoodieBucketIndex: Get BucketIndexLocationMapper for partitions: []
24/04/28 14:57:32 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240428145515193__rollback__COMPLETED__20240428145515760]}