beyond1920 commented on issue #10407:
URL: https://github.com/apache/hudi/issues/10407#issuecomment-1870862338

   ![image](https://github.com/apache/hudi/assets/1525333/9083eb26-71fd-4656-9c25-c0374fc7ccf2)
   @zyclove The duplicate data is caused by records with the same primary key 
value being written into different file groups.
   It seems the first commit used the simple bucket index, because its file 
group IDs have a bucket ID as a prefix. In the second commit, however, the file 
group IDs do not have a bucket ID prefix, so it seems the simple bucket index 
did not take effect during that write job.
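
To make the check above concrete, here is a minimal Python sketch that separates file group IDs with a bucket ID prefix from those without one. It assumes the prefix is an 8-digit, zero-padded bucket number followed by a hyphen; that convention, the helper names, and the sample IDs are illustrative assumptions, not values taken from this issue.

```python
import re

# Assumption: a bucket-index file group ID starts with an 8-digit,
# zero-padded bucket number followed by "-". This is a heuristic: a random
# UUID can occasionally start with eight decimal digits as well.
BUCKET_PREFIX = re.compile(r"^(\d{8})-")


def bucket_id_of(file_group_id):
    """Return the bucket number encoded in the ID, or None if no prefix is found."""
    match = BUCKET_PREFIX.match(file_group_id)
    return int(match.group(1)) if match else None


def split_by_bucket_prefix(file_group_ids):
    """Split IDs into (with_prefix, without_prefix) lists, preserving order."""
    with_prefix, without_prefix = [], []
    for file_group_id in file_group_ids:
        if bucket_id_of(file_group_id) is not None:
            with_prefix.append(file_group_id)
        else:
            without_prefix.append(file_group_id)
    return with_prefix, without_prefix


# Hypothetical file group IDs, for illustration only.
sample_ids = [
    "00000003-7a1c-4d2e-9b5f-0c1d2e3f4a5b-0",  # looks bucket-assigned
    "9f8e7d6c-5b4a-4c3d-8e2f-1a0b9c8d7e6f-0",  # plain UUID-style ID
]

if __name__ == "__main__":
    bucketed, unbucketed = split_by_bucket_prefix(sample_ids)
    print("with bucket prefix:", bucketed)
    print("without bucket prefix:", unbucketed)
```

If IDs from a later commit land in the second list, that write job probably did not use the bucket index, and it is worth comparing its index settings (for example `hoodie.index.type`) with those of the first job.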

