sleapfish commented on issue #2284:
URL: https://github.com/apache/hudi/issues/2284#issuecomment-773903253
@nsivabalan You are right.
I just want to add a couple of things to this:
- Ideally this should support specifying which columns you want to track as SCD columns
- For example: a data set has row_key, col1 and col2, and you want to track
changes for col2 only. If the incoming source data includes an existing
row_key and only col1 has changed, then do a simple UPSERT (no history required).
But if col2 has changed, then do an SCD UPSERT.
- The SCD UPSERT shouldn't be triggered if none of the columns have changed
- The hudi_commit_time of the ended (historical) record should probably be t5 in
your case as well (since the record got updated)
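
To make the column-level rule above concrete, here is a minimal sketch in plain Python. It is not Hudi code and does not use any Hudi API: the function name `classify_change`, the action labels, and the dict-based record shape are all hypothetical, chosen only to illustrate the decision. Treating an unchanged record as a complete no-op is also an assumption on my part.

```python
from typing import Any, Dict, Iterable


def classify_change(
    existing: Dict[str, Any],
    incoming: Dict[str, Any],
    tracked_columns: Iterable[str],
    key: str = "row_key",
) -> str:
    """Decide what to do with an incoming record that matches an existing key.

    Hypothetical helper, not part of Hudi. Returns one of:
      "NOOP"       - no column changed (assumption: nothing is written)
      "UPSERT"     - only untracked columns changed, no history needed
      "SCD_UPSERT" - at least one tracked column changed, keep history
    """
    if existing[key] != incoming[key]:
        raise ValueError("records must share the same row key")

    # Collect every non-key column whose value differs from the stored one.
    changed = {
        col
        for col, value in incoming.items()
        if col != key and existing.get(col) != value
    }

    if not changed:
        return "NOOP"
    if changed & set(tracked_columns):
        return "SCD_UPSERT"
    return "UPSERT"


# Example from the comment: only col2 is tracked.
current = {"row_key": 1, "col1": "a", "col2": "x"}
print(classify_change(current, {"row_key": 1, "col1": "b", "col2": "x"}, ["col2"]))
print(classify_change(current, {"row_key": 1, "col1": "a", "col2": "y"}, ["col2"]))
print(classify_change(current, {"row_key": 1, "col1": "a", "col2": "x"}, ["col2"]))
```

The three calls cover the three cases listed: a change to col1 only, a change to col2, and no change at all.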
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]