nsivabalan commented on issue #3478:
URL: https://github.com/apache/hudi/issues/3478#issuecomment-1018642537


   @affei: not sure if this matters, but for a partitioned dataset, uniqueness 
in a Hudi table is enforced on the pair (partition path, record key), not on 
the record key alone. So the same record key can legitimately appear in the 
output under different partitions. When you say you are seeing duplicates, can 
you confirm that you mean records with the same value for both partition path 
and record key?
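
   To make the distinction concrete, here is a minimal sketch in plain Python 
(no Spark needed) that separates the two cases. It assumes the rows carry the 
Hudi meta columns `_hoodie_partition_path` and `_hoodie_record_key`; the 
function name `find_duplicates` and the sample rows are made up for 
illustration.

```python
from collections import Counter

# Hudi meta columns that identify a record within a table.
PARTITION_COL = "_hoodie_partition_path"
KEY_COL = "_hoodie_record_key"


def find_duplicates(rows):
    """Split apparent duplicates into two groups.

    Returns (true_dups, cross_partition):
      true_dups       -- {(partition, key): count} where the same pair
                         occurs more than once (unexpected for one table)
      cross_partition -- {key: [partitions]} where one key appears in
                         several partitions (allowed without a global index)
    """
    pair_counts = Counter((r[PARTITION_COL], r[KEY_COL]) for r in rows)

    partitions_per_key = {}
    for partition, key in pair_counts:
        partitions_per_key.setdefault(key, set()).add(partition)

    true_dups = {pair: n for pair, n in pair_counts.items() if n > 1}
    cross_partition = {
        key: sorted(parts)
        for key, parts in partitions_per_key.items()
        if len(parts) > 1
    }
    return true_dups, cross_partition


# Hypothetical sample data, not taken from the issue.
sample_rows = [
    {PARTITION_COL: "2021/01/01", KEY_COL: "id-1"},
    {PARTITION_COL: "2021/01/02", KEY_COL: "id-1"},  # same key, other partition
    {PARTITION_COL: "2021/01/01", KEY_COL: "id-2"},
    {PARTITION_COL: "2021/01/01", KEY_COL: "id-2"},  # same partition and key
]

true_dups, cross_partition = find_duplicates(sample_rows)
```

   With this sample, `id-2` lands in `true_dups` and `id-1` lands in 
`cross_partition`. Only the first group would point to a real problem.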
   
   If you need record keys to be unique across the whole table, you will have 
to choose one of the GLOBAL index types (for example GLOBAL_BLOOM or 
GLOBAL_SIMPLE) when setting the index type.
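
   As a rough sketch of what that could look like, the helper below builds 
writer options with a global index selected through `hoodie.index.type`. The 
helper name `global_index_options` and the table and column names are 
hypothetical; please check the option names against the configuration 
reference for your Hudi version.

```python
# Global index types discussed here; Hudi may offer others.
GLOBAL_INDEX_TYPES = ("GLOBAL_BLOOM", "GLOBAL_SIMPLE")


def global_index_options(table_name, record_key_field, partition_field,
                         index_type="GLOBAL_BLOOM"):
    """Return a dict of Hudi write options that selects a global index.

    With a global index, the record key is looked up across all
    partitions instead of only within the incoming record's partition.
    """
    if index_type not in GLOBAL_INDEX_TYPES:
        raise ValueError(f"not a global index type: {index_type!r}")
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key_field,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.index.type": index_type,
    }


# Hypothetical table and column names, for illustration only.
opts = global_index_options("my_table", "id", "event_date")

# Assumed usage with a PySpark DataFrame `df` (not executed here):
#   df.write.format("hudi").options(**opts).mode("append").save(base_path)
```

   Keep in mind that a global index has to look keys up across every 
partition, so it is usually more expensive than the non-global variants.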
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

