Hi all,

We have update Kafka-storm client for Kafka Spout from 1.0.3 to 1.1
recently. Before this upgrade, we were able to use our application without
any issue, but after the upgrade, our performance
dropped significantly wherever we are using more than a single partition
for our Kafka topic! If I use more than a single partition, I can see the
following error in storm log which affects throughput significantly.

org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be
completed since the group has already rebalanced and assigned the
partitions to another member. This means that the time between subsequent
calls to poll() was longer than the configured session.timeout.ms, which
typically implies that the poll loop is spending too much time message
processing. You can address this either by increasing the session timeout
or by reducing the maximum size of batches returned in poll() with
max.poll.records. at org.apache.kafka.clients.consumer.internals.
ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:600)
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$
OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:541) at
org.apache.kafka.clients.consumer.internals.AbstractCoordinator$
CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:679) at
org.apache.kafka.clients.consumer.internals.AbstractCoordinator$
CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:658) at
org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:167)
at 
org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133)
at 
org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$
RequestFutureCompletionHandler.onComplete(ConsumerNetworkClient.java:426)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:278) at
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.
clientPoll(ConsumerNetworkClient.java:360) at org.apache.kafka.clients.
consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(
ConsumerNetworkClient.java:192) at org.apache.kafka.clients.
consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:163)
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.
commitOffsetsSync(ConsumerCoordinator.java:426) at org.apache.kafka.clients.
consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1059) at
org.apache.storm.kafka.spout.KafkaSpout.commitOffsetsForAckedTuples(KafkaSpout.java:302)
at org.apache.storm.kafka.spout.KafkaSpout.nextTuple(KafkaSpout.java:204)
at 
org.apache.storm.daemon.executor$fn__6505$fn__6520$fn__6551.invoke(executor.clj:651)
at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484) at
clojure.lang.AFn.run(AFn.java:22) at java.lang.Thread.run(Thread.java:748)

We have changed the value of max.poll.records or session.timeout.ms but it
wasn't helpful at all. I was wondering what has changed that caused this
performance issue.

P.S: In addition to the upgrade, we are using a dedicated Kafka Brokers
right now, so I am not entirely sure that the issue is related to the
client upgrade. Previously Kafka and Storm supervisor were collocated on
the same host.

Regards,
Ali

Reply via email to