kazdy commented on issue #6869:
URL: https://github.com/apache/hudi/issues/6869#issuecomment-1270599382

   Hi, 
   
   The record key works like a primary key on a table (a unique, non-nullable
   field). In your case, you end up with two records with different record
   keys, and that's expected.
   Precombine and upserts are there to maintain the uniqueness of the record key.
   
   To make it easier, assume you use only the delivery field as the record key.
   If an incoming record has delivery:3000 and no record with that record key
   exists in the table, Hudi does an insert. If an incoming record has
   delivery:2000 and a record with that key already exists in the table, Hudi
   does an update.
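
   As a rough illustration of that insert-or-update decision, here is a minimal
   sketch in plain Python. It is not Hudi's API: the dict-based `table` and the
   `upsert` helper are hypothetical, `delivery` is assumed to be the only record
   key field, and any payload-level merge logic is ignored.

```python
def upsert(table, record, key_field="delivery"):
    """Insert the record if its key is new, otherwise overwrite the existing row.

    `table` is a plain dict keyed by record key. It only stands in for a Hudi
    table and is an assumption made for illustration.
    """
    key = record[key_field]
    action = "update" if key in table else "insert"
    table[key] = record  # one row per record key, so uniqueness is preserved
    return action


# The table already holds a record with delivery:2000.
table = {2000: {"delivery": 2000, "status": "old"}}

first = upsert(table, {"delivery": 3000, "status": "new"})   # key not in table yet
second = upsert(table, {"delivery": 2000, "status": "new"})  # key already in table
```

   Here `first` is an insert and `second` is an update, and the table still
   holds exactly one row per record key.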
   
   Precombine runs before the write: the incoming batch of data is deduplicated
   based on the record key and the precombine field.
   So if the incoming batch has two records with the same record key, the one
   with the greater precombine field value is passed to the write operation.
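
   The deduplication step can be sketched the same way. Again this is plain
   Python, not Hudi code: the `precombine` helper and the `ts` precombine field
   name are hypothetical, and how ties are broken here (first record seen wins)
   is an arbitrary choice for the sketch, not a statement about Hudi.

```python
def precombine(batch, key_field="delivery", precombine_field="ts"):
    """Keep one record per record key: the one with the greatest precombine value."""
    winners = {}
    for record in batch:
        key = record[key_field]
        current = winners.get(key)
        # Replace the current winner only if this record has a strictly greater value.
        if current is None or record[precombine_field] > current[precombine_field]:
            winners[key] = record
    return list(winners.values())


# Hypothetical incoming batch: two records share the record key 2000.
batch = [
    {"delivery": 2000, "ts": 1, "status": "created"},
    {"delivery": 2000, "ts": 5, "status": "shipped"},
    {"delivery": 3000, "ts": 2, "status": "created"},
]

deduped = precombine(batch)
by_key = {r["delivery"]: r for r in deduped}
```

   After this step the batch contains one record per key, and for key 2000 it
   is the one with `ts` 5. Only that deduplicated batch goes on to the upsert.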


