mehtaashish23 opened a new issue #1949: URL: https://github.com/apache/iceberg/issues/1949
Currently, `IncrementalDataTableScan` does not support incremental reads over overwrite snapshots [here]. As a client, however, I should be able to read just the appended data or just the deleted data explicitly, and construct an incremental reader at the application level.

For instance, a client doing updates based on a primary key could reconstruct CDC by reading the appended data and the deleted data into separate DataFrames and then joining them on the client side.

The reader should accept options:

1. To read overwrite snapshots (so that data appended through them is returned)
2. To read only deleted data files during an incremental scan

NOTE: This is to allow clients who prefer a "copy on write" implementation with Iceberg for executing SQL such as DELETE or MERGE_INTO (in the future).

[here]: https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/IncrementalDataTableScan.java#L122
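
To illustrate the client-side join described above, here is a minimal sketch in plain Python. It assumes the two requested reads already exist and return rows as dictionaries; `reconstruct_cdc`, the row shape, and the `id` key are hypothetical names used for illustration only, not Iceberg API. It also assumes the primary key is unique on each side.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, object]
Change = Tuple[str, Optional[Row], Optional[Row]]  # (op, before, after)


def reconstruct_cdc(appended: List[Row], deleted: List[Row], key: str) -> List[Change]:
    """Full outer join of deleted and appended rows on `key` (hypothetical helper)."""
    before = {row[key]: row for row in deleted}
    after = {row[key]: row for row in appended}
    changes: List[Change] = []
    for k in sorted(before.keys() | after.keys()):
        old, new = before.get(k), after.get(k)
        if old is None:
            changes.append(("insert", None, new))
        elif new is None:
            changes.append(("delete", old, None))
        elif old != new:
            changes.append(("update", old, new))
        # old == new: the row was only carried over by a copy-on-write file
        # rewrite, so it is not a logical change and is skipped.
    return changes


# Rows that would come from the appended data files of an overwrite snapshot.
appended_rows = [
    {"id": 1, "name": "alice"},   # unchanged, rewritten along with its file
    {"id": 2, "name": "bob-v2"},  # updated
    {"id": 4, "name": "dave"},    # newly inserted
]
# Rows that would come from the deleted data files of the same snapshot.
deleted_rows = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "bob"},
    {"id": 3, "name": "carol"},   # removed
]

changes = reconstruct_cdc(appended_rows, deleted_rows, key="id")
for change in changes:
    print(change)
```

With copy on write, a rewritten file contains unchanged rows as well as changed ones, so the same row can appear in both the deleted and the appended data. The equality check drops those pairs; a real implementation would do the same join on DataFrames rather than in-memory dictionaries.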
