Sumant Tambe created KAFKA-4089:
-----------------------------------
Summary: KafkaProducer raises Batch Expired exception
Key: KAFKA-4089
URL: https://issues.apache.org/jira/browse/KAFKA-4089
Project: Kafka
Issue Type: Bug
Components: clients
Affects Versions: 0.10.0.1
Reporter: Sumant Tambe
The batch expiration logic ({{RecordAccumulator.abortExpiredBatches}}) ejects
batches when the cluster metadata needs an update
({{Metadata.timeToNextUpdate==0}}). In this case, no nodes are "ready" to send
data to ({{result.readyNodes}} is empty). As a consequence, {{Sender.drain}}
does not drain any batch at all and therefore no new topic-partitions are
muted.
The batch expiration logic ({{RecordAccumulator.abortExpiredBatches}})
bypasses only muted partitions. As there are no new muted partitions, all
batches, regardless of topic-partition, are subject to expiration. As a result,
a whole group of batches expires if the batches linger in the queue for longer
than {{requestTimeout}}.
Expiring batches unconditionally is a bug. It's too greedy.
The current condition in {{abortExpiredBatches}} that bypasses muted partitions
is necessary but not sufficient. It should additionally bypass partitions for
which leader information is known and fresh.
Conversely, it should expire batches only when both of the following are true:
# !muted AND
# metadata is fresh but the leader is not available
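
The difference between the current and the proposed condition can be sketched as two predicates. This is only an illustration of the logic described above, not the actual {{RecordAccumulator}} code: the class name, method names, and parameters ({{metadataFresh}}, {{leaderKnown}}, {{waitedMs}}) are hypothetical, and the timeout comparison is assumed to be a simple "waited longer than {{requestTimeout}}" check.

```java
// Hypothetical sketch; none of these names exist in the Kafka client.
final class BatchExpirySketch {

    // Current behavior as described in this issue: only muted partitions
    // are bypassed, so any other batch that waited too long is expired.
    static boolean currentShouldExpire(boolean muted, long waitedMs, long requestTimeoutMs) {
        return !muted && waitedMs > requestTimeoutMs;
    }

    // Proposed behavior: expire only when the partition is not muted AND
    // metadata is fresh but the leader is still not available.
    static boolean proposedShouldExpire(boolean muted, boolean metadataFresh,
                                        boolean leaderKnown, long waitedMs,
                                        long requestTimeoutMs) {
        return !muted && metadataFresh && !leaderKnown && waitedMs > requestTimeoutMs;
    }

    public static void main(String[] args) {
        long requestTimeoutMs = 30_000L; // assumed value for illustration
        long waitedMs = 40_000L;         // batch lingered past the timeout

        // Scenario from this issue: metadata needs an update (not fresh),
        // nothing was drained, so the partition is not muted.
        System.out.println("current:  " + currentShouldExpire(false, waitedMs, requestTimeoutMs));
        System.out.println("proposed: " + proposedShouldExpire(false, false, true, waitedMs, requestTimeoutMs));
    }
}
```

In the stale-metadata scenario the first predicate expires the batch while the second keeps it, which is the change in behavior this issue asks for.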