t3hw commented on PR #15710: URL: https://github.com/apache/iceberg/pull/15710#issuecomment-4669899149
> @t3hw Can multiple sink tasks cause data duplication?

@jerryzhujing Running multiple tasks should be safe:
1. Each task is mapped to a unique set of partitions by Kafka Connect.
2. Source topic offsets and the worker DataWritten events are committed to the control topic within a transaction, so the partition offsets only advance when the data files are successfully written.

If tasks > partitions, the extra tasks should sit idle and consume nothing.

Regardless, the code that handles tasks/partitions is not in this repo; it's in the main Kafka Connect codebase (apache/kafka). This repo only handles the Iceberg writer tasks.
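To make the tasks > partitions case concrete, here is a toy sketch of how partitions could be spread over tasks. It is not the real assignment logic, which lives in apache/kafka and depends on the configured consumer partition assignor. The class `TaskAssignmentSketch`, the method `assign`, and the round-robin rule are all hypothetical and only illustrate that each partition has exactly one owner and that surplus tasks end up with nothing to consume.

```java
import java.util.ArrayList;
import java.util.List;

public class TaskAssignmentSketch {

    // Hypothetical toy rule: partition p is owned by task (p % taskCount).
    // The real assignor in Kafka may distribute partitions differently.
    static List<List<Integer>> assign(int partitionCount, int taskCount) {
        List<List<Integer>> assignment = new ArrayList<>();
        for (int t = 0; t < taskCount; t++) {
            assignment.add(new ArrayList<>());
        }
        for (int p = 0; p < partitionCount; p++) {
            // Each partition is added to exactly one task's list.
            assignment.get(p % taskCount).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // 3 partitions, 5 tasks: two tasks have no partitions to read.
        List<List<Integer>> assignment = assign(3, 5);
        for (int t = 0; t < assignment.size(); t++) {
            List<Integer> owned = assignment.get(t);
            String label = owned.isEmpty() ? "idle" : owned.toString();
            System.out.println("task-" + t + " -> " + label);
        }
    }
}
```

Because no partition appears in two lists, no two tasks in this model ever read the same records, which is the property the first point relies on.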
