[ https://issues.apache.org/jira/browse/KAFKA-1303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932125#comment-13932125 ]
Jay Kreps commented on KAFKA-1303:
----------------------------------
I don't think this is actually a problem. A metadata request could be slow for
any number of reasons, in which case this will happen. There is no guarantee
that failover + metadata refresh completes before the retries are exhausted.
I would be opposed to having special connections for metadata requests.
However, if we want to improve this, we could include a smarter heuristic in
Sender.selectMetadataDestination. Currently we prefer a node that we already
have a connection to and which has no requests currently being sent; however,
many requests could have been sent but not yet processed. We could instead
prefer the connected node with the fewest in-flight requests.
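
A minimal sketch of the "fewest in-flight requests" heuristic, for illustration
only. This is not the actual Sender code: the class name, the method name
leastLoadedNode, and the map from node id to in-flight request count are all
assumptions made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LeastLoadedNodeSketch {

    /**
     * Hypothetical helper (not part of the Kafka client): given the number of
     * in-flight requests for each node we already have a connection to,
     * return the id of the node with the fewest in-flight requests.
     * Returns -1 if there are no connected nodes. Ties go to the node that
     * comes first in the map's iteration order.
     */
    public static int leastLoadedNode(Map<Integer, Integer> inFlightByConnectedNode) {
        int bestNode = -1;
        int bestCount = Integer.MAX_VALUE;
        for (Map.Entry<Integer, Integer> entry : inFlightByConnectedNode.entrySet()) {
            if (entry.getValue() < bestCount) {
                bestCount = entry.getValue();
                bestNode = entry.getKey();
            }
        }
        return bestNode;
    }

    public static void main(String[] args) {
        // Assumed example state: three connected nodes with queued requests.
        Map<Integer, Integer> inFlight = new LinkedHashMap<>();
        inFlight.put(0, 5);
        inFlight.put(1, 2);
        inFlight.put(2, 7);
        System.out.println("send metadata request to node " + leastLoadedNode(inFlight));
    }
}
```

With this choice the metadata request still queues behind whatever is already
outstanding on the selected node, so it shortens the expected delay but does
not guarantee the refresh completes before retries are exhausted.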
> metadata request in the new producer can be delayed
> ---------------------------------------------------
>
> Key: KAFKA-1303
> URL: https://issues.apache.org/jira/browse/KAFKA-1303
> Project: Kafka
> Issue Type: Bug
> Components: core
> Affects Versions: 0.8.2
> Reporter: Jun Rao
>
> While debugging a system test, I observed the following.
> 1. A broker side configuration
> (replica.fetch.wait.max.ms=500,replica.fetch.min.bytes=4096) made the time to
> complete a produce request long (each taking about 500ms with ack=-1).
> 2. The producer client has a bunch of outstanding produce requests queued up
> on the brokers.
> 3. One of the brokers fails and we force updating the metadata.
> 4. The metadata request is queued up behind those outstanding produce
> requests.
> 5. By the time the metadata response comes back, some messages have failed
> all retries because of stale metadata.