[
https://issues.apache.org/jira/browse/KAFKA-6404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yu Gan updated KAFKA-6404:
--------------------------
Description: 
Kafka broker version: 0.10.0.1
Cluster size: 240 nodes
Situation: someone runs a newer release (such as 0.11.x) of
bin/kafka-console-consumer.sh with the "--zookeeper" parameter to continuously
consume a topic whose partitions are spread across all the brokers.
Phenomenon:
1. Broker server log errors such as:
1) Connection to 2 was disconnected before the response was read
2) Shrinking ISR for partition [abc,21] from 33,13,14 to 33
3) ERROR Processor got uncaught exception. (kafka.network.Processor)
java.nio.BufferUnderflowException
2. Ordinary consumers stuck in a rebalance loop, with errors such as:
1) c.p.b.f.l.c.FiberTopoWorkerThread : got uncaught exception
org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be
completed since the group has already rebalanced and assigned the partitions to
another member. This means that the time between subsequent calls to poll() was
longer than the configured session.timeout.ms, which typically implies that the
poll loop is spending too much time message processing. You can address this
either by increasing the session timeout or by reducing the maximum size of
batches returned in poll() with max.poll.records.
2) java.lang.IllegalStateException: Correlation id for response (1246203) does
not match request (1246122)
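The remedy suggested by the CommitFailedException message (raise the session timeout, lower the batch size per poll) can be sketched as consumer configuration. The values and the broker address below are illustrative only, and this tunes the symptom quoted in the log, not the fetch-version root cause described later in this report:

```java
import java.util.Properties;

public class RebalanceTuning {
    // Illustrative values only; "broker1:9092" and the group id are hypothetical.
    public static Properties tunedConsumerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");
        props.put("group.id", "demo-group");
        // More headroom between poll() calls before the group coordinator
        // considers the member dead and triggers a rebalance...
        props.put("session.timeout.ms", "30000");
        // ...and less work per poll() so processing finishes within that window.
        props.put("max.poll.records", "100");
        return props;
    }
}
```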
Bad result: the Kafka cluster is left in an unhealthy state.
Root cause:
1) The old consumer in releases after 0.10.1 pins its fetch requests to
version 3 in ConsumerFetcherThread.scala, which a 0.10.0.1 broker cannot
parse (0.10.0.x brokers support FetchRequest versions up to 2 only):
{code:java}
private val fetchRequestBuilder = new FetchRequestBuilder().
  clientId(clientId).
  replicaId(Request.OrdinaryConsumerId).
  maxWait(config.fetchWaitMaxMs).
  minBytes(config.fetchMinBytes).
  requestVersion(3) // for now, the old consumer is pinned to the old message format through the fetch request
{code}
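As a miniature illustration of this failure mode, the toy sketch below (not Kafka code; all names and layouts are hypothetical) shows how a decoder that assumes a different wire layout than the one actually sent can read past the end of the buffer and throw the same java.nio.BufferUnderflowException seen in the broker's Processor log. In the real issue the client's request version is newer than what the broker supports; the sketch only models the general layout disagreement:

```java
import java.nio.ByteBuffer;
import java.nio.BufferUnderflowException;

// Toy model: a "v2" payload carries three int fields, while the decoder
// assumes a "v3" layout with a fourth int appended at the end.
public class VersionMismatchDemo {

    // Encode a request in the older, three-field layout (12 bytes).
    static ByteBuffer encodeV2(int replicaId, int maxWait, int minBytes) {
        ByteBuffer buf = ByteBuffer.allocate(12);
        buf.putInt(replicaId).putInt(maxWait).putInt(minBytes);
        buf.flip();
        return buf;
    }

    // Decode assuming the newer, four-field layout: the fourth getInt()
    // reads past the buffer's limit and throws BufferUnderflowException.
    static int[] decodeAsV3(ByteBuffer buf) {
        return new int[] { buf.getInt(), buf.getInt(), buf.getInt(), buf.getInt() };
    }

    public static void main(String[] args) {
        try {
            decodeAsV3(encodeV2(-1, 100, 1));
        } catch (BufferUnderflowException e) {
            System.out.println("layout/version mismatch -> " + e);
        }
    }
}
```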
> OldConsumer FetchRequest apiVersion mismatch results in broker
> RequestHandler socket leak
> --------------------------------------------------------------------------------------------
>
> Key: KAFKA-6404
> URL: https://issues.apache.org/jira/browse/KAFKA-6404
> Project: Kafka
> Issue Type: Bug
> Components: core
> Affects Versions: 0.10.0.1
> Reporter: Yu Gan
> Priority: Critical
>
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)