karim-ramadan opened a new pull request, #7295: URL: https://github.com/apache/iceberg/pull/7295
### Context

As brought up in issue #2788, when a Spark streaming read of an Iceberg table encounters an overwrite snapshot, the only two possible actions are to skip the snapshot or to fail. A third option would be to consider only the added files and ignore the deleted files.

### Proposal

In this PR I propose a new Spark read option, `streaming-overwrite-snapshots-read-mode`, with three possible values: SKIP, BREAK, and ADDED_FILES_ONLY. It replaces the existing `streaming-skip-overwrite-snapshots` (true|false). The new ADDED_FILES_ONLY value would consider only the files added by the snapshot.

### Notes

- The old conf `streaming-skip-overwrite-snapshots` has been kept and is integrated with the new one (the new one has higher precedence).
- Some fixes to unit tests have been applied to make them work on Windows. I could revert those changes and address them in another PR if needed.
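
The precedence rule in the notes above (the new option wins over the old one) can be sketched as follows. This is only an illustration, not the code in the PR: the helper name `resolve_overwrite_mode` is hypothetical, and the mapping of the old boolean (`true` to SKIP, `false` or unset to BREAK) is an assumption based on the option names.

```python
from typing import Mapping

NEW_OPTION = "streaming-overwrite-snapshots-read-mode"
OLD_OPTION = "streaming-skip-overwrite-snapshots"
VALID_MODES = {"SKIP", "BREAK", "ADDED_FILES_ONLY"}


def resolve_overwrite_mode(options: Mapping[str, str]) -> str:
    """Hypothetical helper: choose the overwrite-snapshot read mode from reader options."""
    # The new option has higher precedence, as stated in the PR notes.
    if NEW_OPTION in options:
        mode = options[NEW_OPTION].strip().upper()
        if mode not in VALID_MODES:
            raise ValueError(f"Unsupported value for {NEW_OPTION}: {options[NEW_OPTION]!r}")
        return mode
    # Assumption: the old boolean maps true -> SKIP and false/unset -> BREAK.
    if options.get(OLD_OPTION, "false").strip().lower() == "true":
        return "SKIP"
    return "BREAK"


if __name__ == "__main__":
    # Both options set: the new one decides the outcome.
    print(resolve_overwrite_mode({OLD_OPTION: "true", NEW_OPTION: "ADDED_FILES_ONLY"}))
```

On the reader side, the proposed option would presumably be passed like any other Spark read option, for example `.option("streaming-overwrite-snapshots-read-mode", "ADDED_FILES_ONLY")` on a `readStream` with the `iceberg` format.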
