[ https://issues.apache.org/jira/browse/KAFKA-13601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17508566#comment-17508566 ]
Anil Dasari edited comment on KAFKA-13601 at 3/18/22, 4:52 AM: --------------------------------------------------------------- Filed partitioner uses starting offset of the batch in the file name and is the only varying parameter. There will be only one parquet file per worker (consumer/partition) (that is out of sync with offsets) present in destination (s3 in my case) if worker dies before committing an offset. So new or restarted worker would override that parquet file as start offset of the partition remans same. Please let me know if you have any questions. was (Author: JIRAUSER283879): File name in the filed partitioner uses starting offset of the batch and there will be only one parquet file per worker (consumer/partition) present in destination (s3 in my case) if worker dies before committing an offset. So new or restarted worker would override that parquet file as start offset of the partition remans same. Please let me know if you have any questions. > Add option to support sync offset commit in Kafka Connect Sink > -------------------------------------------------------------- > > Key: KAFKA-13601 > URL: https://issues.apache.org/jira/browse/KAFKA-13601 > Project: Kafka > Issue Type: New Feature > Components: KafkaConnect > Reporter: Anil Dasari > Priority: Major > > Exactly once in s3 connector with scheduled rotation and field partitioner > can be achieved with consumer offset sync' commit after message batch flushed > to sink successfully > Currently, WorkerSinkTask committing the consumer offsets asynchronously and > at regular intervals of WorkerConfig.OFFSET_COMMIT_INTERVAL_MS_CONFIG > [https://github.com/apache/kafka/blob/371f14c3c12d2e341ac96bd52393b43a10acfa84/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L203] > [https://github.com/apache/kafka/blob/371f14c3c12d2e341ac96bd52393b43a10acfa84/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L196] > [https://github.com/apache/kafka/blob/371f14c3c12d2e341ac96bd52393b43a10acfa84/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L354] > > Add config to allow user to select synchronous commit over > WorkerConfig.OFFSET_COMMIT_INTERVAL_MS_CONFIG -- This message was sent by Atlassian Jira (v8.20.1#820001)