I searched through JIRA and the mailing list for prior discussion of this and could not find any. Forgive me if I missed it, and if so please send a link!

An astute reader pointed out in the kafka-python issue list that the KafkaConsumer autocommit semantics can be accidentally broken by consumer methods that themselves call poll(), inadvertently triggering background tasks like AutoCommitTask. Normally, the autocommit semantics say that message offsets will not be committed (acked) until after the consumer has processed them. A common pattern in pseudocode would be:

```
while True:
    batch = consumer.poll()
    for message in batch:
        process(message)  # failure here should block acks for all messages since last poll()
```

This is a good at-least-once delivery model. The problem raised is that if, during message processing, the user calls a consumer method that makes network requests via poll(), then the AutoCommitTask can run prematurely, and messages returned in the last batch can be committed/acked before processing completes. Such methods appear to include: consumer.listTopics, consumer.position, consumer.partitionsFor.

If there is then a failure after one of these methods is called but before message processing completes, those messages will already have been auto-committed and will not be reprocessed.

Has this issue been discussed before? Any thoughts on how to address it?

-Dana
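
P.S. To make the failure mode concrete, here is a minimal, self-contained sketch. It does not use the real client: `ToyConsumer`, `_network_poll`, `partitions_for` and `run` are hypothetical names invented for illustration, and the toy assumes the auto-commit task is due on every internal poll (the real task runs on an interval). It only models the ordering problem described above.

```python
class ToyConsumer:
    """Hypothetical stand-in: auto-commit runs inside every internal poll."""

    def __init__(self, messages, batch_size=3):
        self._messages = list(messages)
        self._batch_size = batch_size
        self.position = 0    # offset of the next message to hand out
        self.committed = 0   # last committed (acked) offset

    def _network_poll(self):
        # Assumption: the background commit task is always due, so any
        # internal poll commits everything handed out so far.
        self.committed = self.position

    def poll(self):
        # Intended behaviour: commit the previous batch, then fetch a new one.
        self._network_poll()
        end = self.position + self._batch_size
        batch = self._messages[self.position:end]
        self.position += len(batch)
        return batch

    def partitions_for(self, topic):
        # A metadata lookup that also drives the internal poll loop.
        self._network_poll()
        return [0]


def run(call_metadata_mid_batch):
    """Process messages 0..5, failing on message 1."""
    consumer = ToyConsumer(range(6))
    processed = []
    try:
        while True:
            batch = consumer.poll()
            if not batch:
                break
            for message in batch:
                if call_metadata_mid_batch:
                    consumer.partitions_for("some-topic")
                if message == 1:
                    raise RuntimeError("processing failed")
                processed.append(message)
    except RuntimeError:
        pass
    return consumer.committed, processed


print(run(False))  # (0, [0]): nothing committed, the batch is redelivered
print(run(True))   # (3, [0]): offsets 0-2 committed, messages 1 and 2 are lost
```

In the second run the committed offset has already moved past the whole batch when processing fails, so a restart would resume at offset 3 and never reprocess messages 1 and 2.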