ccchenhe commented on issue #6034:
URL: https://github.com/apache/hudi/issues/6034#issuecomment-1174746611

   > You can set up partitions. Can you show some characteristics of the duplicated data? It would help to dig into the reason.

   OK.
   
   Currently the same id has 2 records: one is an insert, the other is an update.
   ```json
   // insert
   
{"database":"database_00000001","table":"table_00000001","type":"insert","ts":1656643663,"maxwell_ts":1656643663889000,"xid":8498,"xoffset":1,"primary_key":[10401000107002207010151820438060],"primary_key_columns":["id"],"data":{"id":"10401000107002207010151820438060","version":1,"create_time":1656643663,"update_time":1656643663},"old":{}}
   // update
   
{"database":"database_00000001","table":"table_00000001","type":"update","ts":1656660063,"maxwell_ts":1656660063062000,"xid":7210,"xoffset":1,"primary_key":[10401000107002207010151820438060],"primary_key_columns":["id"],"data":{"id":"10401000107002207010151820438060","version":2,"create_time":1656643663,"update_time":1656660063},"old":{"version":1,"update_time":1656643663}}
   
   ```
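
For reference, here is a minimal sketch of the result I would expect for these two events. It assumes the record key is `id` and the ordering (precombine) field is `ts`; neither setting is stated above. The helper `upsert_by_key` is hypothetical and only illustrates upsert semantics in plain Python; it is not Hudi code.

```python
def upsert_by_key(records, key_field="id", order_field="ts"):
    """Keep only the latest event per key (assumed upsert semantics)."""
    latest = {}
    for rec in records:
        key = rec["data"][key_field]
        current = latest.get(key)
        # A later (or equal) ordering value replaces the stored event.
        if current is None or rec[order_field] >= current[order_field]:
            latest[key] = rec
    return list(latest.values())


# Trimmed copies of the two Maxwell events shown above.
insert_event = {
    "type": "insert",
    "ts": 1656643663,
    "data": {"id": "10401000107002207010151820438060", "version": 1},
}
update_event = {
    "type": "update",
    "ts": 1656660063,
    "data": {"id": "10401000107002207010151820438060", "version": 2},
}

result = upsert_by_key([insert_event, update_event])
for rec in result:
    print(rec["data"]["id"], rec["data"]["version"])
```

Under those assumptions only one row should remain for this id, carrying the values from the update event.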
   The application using Flink bloom state consumed these 2 records from Kafka, and we got:
   
![image](https://user-images.githubusercontent.com/20533543/177280265-1f3dc430-ec16-453c-9484-525e031e83ef.png)
   
   The application using Flink bucket consumed these 2 records from Kafka, and we got:
   <img width="1792" alt="image" src="https://user-images.githubusercontent.com/20533543/177280462-66a9e285-73ae-484c-ac91-c80acc61a3ac.png">
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.