prasannarajaperumal commented on PR #5436:
URL: https://github.com/apache/hudi/pull/5436#issuecomment-1206165913

   We need to explicitly call out how CDC behaves in the following scenarios within the commit range being operated on:
   
   1. Insert and Delete of the same key (multiple times?)
       - Option 1: Show nothing on the CDC stream
       - Option 2: I,D on the row_key (multiple times if needed)
   2. Delete and Insert of the same key
       - Option 1: Show nothing on the CDC stream
       - Option 2: D,I on the row_key
   
   I know CDC implementations on traditional databases have chosen one or the other. I am in favour of Option 2.
   Depending on what we decide here, tracking changes in cdc_log_block becomes mandatory, and we cannot just compare the 2 versions to construct the CDC stream.
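
To make the difference concrete, here is a minimal Python sketch (not Hudi code). The `(op, key)` event tuples and the function names `full_history_stream` and `snapshot_diff_stream` are hypothetical, and the sketch tracks only key presence, ignoring record values.

```python
from typing import List, Set, Tuple

# Hypothetical event model: (op, key), where op is "I" (insert) or "D" (delete).
Event = Tuple[str, str]


def full_history_stream(events: List[Event]) -> List[Event]:
    """Option 2: emit every logged change in order."""
    return list(events)


def snapshot_diff_stream(before: Set[str], events: List[Event]) -> List[Event]:
    """Build a stream from only the key sets before and after the commit range."""
    after = set(before)
    for op, key in events:
        if op == "I":
            after.add(key)
        elif op == "D":
            after.discard(key)
    # Only the net difference between the two versions is visible here.
    deleted = [("D", k) for k in sorted(before - after)]
    inserted = [("I", k) for k in sorted(after - before)]
    return deleted + inserted


# Scenario 1: key absent before the range, then inserted and deleted.
scenario_1 = [("I", "k1"), ("D", "k1")]
# Scenario 2: key present before the range, then deleted and re-inserted.
scenario_2 = [("D", "k1"), ("I", "k1")]

print(snapshot_diff_stream(set(), scenario_1))   # net effect of scenario 1
print(full_history_stream(scenario_1))           # every change in scenario 1
print(snapshot_diff_stream({"k1"}, scenario_2))  # net effect of scenario 2
print(full_history_stream(scenario_2))           # every change in scenario 2
```

In both scenarios the snapshot diff is empty, while the full history keeps the I,D or D,I pair. Under these assumptions, Option 2 can only be produced if the individual changes are recorded as they happen.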
   cc @YannByron @danny0405 @xushiyan @vinothchandar 