YannByron commented on code in PR #6256:
URL: https://github.com/apache/hudi/pull/6256#discussion_r947417129
##########
rfc/rfc-51/rfc-51.md
##########

```diff
@@ -148,20 +152,27 @@ hudi_cdc_table/

 Under a partition directory, the `.log` file with `CDCBlock` above will keep the changing data we have to materialize.

-There is an option to control what data is written to `CDCBlock`, that is `hoodie.table.cdc.supplemental.logging`. See the description of this config above.
+#### Write-on-indexing vs Write-on-compaction
```

Review Comment:
   The valid range of a CDC query is the same as `timetravel`'s: only instants in the active timeline can be queried. Two possible behaviors:
   1. Only c3 can be queried. If c3 needs to load a previous file slice that has already been cleaned, throw an exception, because part of the CDC data for c3 is lost.
   2. Alternatively, even when the file slices needed by instants in the current active timeline have already been cleaned, no exception is thrown. In that case some CDC data from the cleaned instants is silently lost. This behavior is like syncing the binlog from MySQL to Kafka: if a binlog segment is archived before it is synced, its changes can never be seen in Kafka.
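The fail-fast behavior in option 1 above can be sketched as a pre-read validation step. This is a minimal illustrative sketch, not actual Hudi code: the class, method, and slice names (`CdcRangeCheck`, `validateCdcQuery`, `fs1`...) are all hypothetical, and it only models the decision "throw when a required file slice has been cleaned" rather than real timeline or file-system access.

```java
import java.util.List;
import java.util.Set;

public class CdcRangeCheck {

    /** Hypothetical exception: raised when required file slices were removed by the cleaner. */
    static class CleanedDataException extends RuntimeException {
        CleanedDataException(String msg) { super(msg); }
    }

    /**
     * Option 1 from the review comment: before serving a CDC query for an
     * instant (e.g. c3), verify every file slice it must load still exists.
     * If any slice was cleaned, part of the change data is unrecoverable,
     * so fail fast instead of returning an incomplete result.
     *
     * @param requiredSlices file slices the queried instant must read
     * @param existingSlices file slices that survived cleaning
     */
    static void validateCdcQuery(List<String> requiredSlices, Set<String> existingSlices) {
        for (String slice : requiredSlices) {
            if (!existingSlices.contains(slice)) {
                throw new CleanedDataException(
                    "CDC data incomplete: required file slice " + slice + " was cleaned");
            }
        }
    }

    public static void main(String[] args) {
        Set<String> existing = Set.of("fs2", "fs3");

        // All required slices survived cleaning: the query may proceed.
        validateCdcQuery(List.of("fs3"), existing);
        System.out.println("query served");

        // fs1 was cleaned: option 1 rejects the query rather than
        // silently dropping change data (which would be option 2).
        try {
            validateCdcQuery(List.of("fs1", "fs3"), existing);
        } catch (CleanedDataException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Under option 2, the same check would instead skip the missing slices and serve whatever remains, which is the binlog-to-Kafka analogy in the comment: already-archived changes simply never appear downstream.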
