Alex Dunayevsky created KAFKA-6743:
--------------------------------------
Summary: ConsumerPerformance fails to consume all messages on
topics with large number of partitions
Key: KAFKA-6743
URL: https://issues.apache.org/jira/browse/KAFKA-6743
Project: Kafka
Issue Type: Bug
Components: core, tools
Affects Versions: 0.11.0.2
Reporter: Alex Dunayevsky
ConsumerPerformance fails to consume all messages on topics with a large number
of partitions because its polling loop timeout is relatively short (1000 ms by
default) and hardcoded, so the end user can neither see nor change it.
Demo: Create a topic with 10 000 partitions, send 50 000 000 records of 100
bytes each using kafka-producer-perf-test, and consume them using
kafka-consumer-perf-test (ConsumerPerformance). You will likely notice that the
number of records reported by kafka-consumer-perf-test is many times lower
than the expected 50 000 000. This is caused by the way ConsumerPerformance is
implemented: in some heavy cases, processing/iterating through the records
polled in a batch can take long enough to exceed the default hardcoded polling
loop timeout, so the tool stops consuming early. This is probably not what we
want from this utility.
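To illustrate the failure mode described above, here is a minimal, simplified
model of a polling loop guarded by a timeout. It is NOT the actual
ConsumerPerformance source: the class name, the method name and the simulated
poll data are all assumptions made for this sketch, and a simulated clock
replaces real calls to the consumer.

```java
public class PollLoopSketch {

    /**
     * Simulates a timeout-guarded polling loop (hypothetical model).
     * Each entry of polls is {elapsedMs, recordCount}: the time one loop
     * iteration took (polling plus processing) and how many records it yielded.
     */
    static long consume(long[][] polls, long expectedRecords, long timeoutMs) {
        long now = 0;          // simulated clock, in ms
        long lastConsumed = 0; // time at which records were last received
        long read = 0;
        int i = 0;
        // The loop gives up once the gap since the last records exceeds the timeout.
        while (read < expectedRecords
                && now - lastConsumed <= timeoutMs
                && i < polls.length) {
            long[] poll = polls[i++];
            now += poll[0];
            if (poll[1] > 0) {
                lastConsumed = now;
            }
            read += poll[1];
        }
        return read;
    }

    public static void main(String[] args) {
        // Second iteration is slow (1500 ms) and yields nothing.
        long[][] polls = { {200, 500}, {1500, 0}, {200, 500} };
        System.out.println(consume(polls, 1000, 1000));  // prints 500
        System.out.println(consume(polls, 1000, 10000)); // prints 1000
    }
}
```

With the 1000 ms timeout the loop exits after the slow iteration and reports
only half of the records; with a larger timeout it reads all of them.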
We have two options:
1) Increase the polling loop timeout in the ConsumerPerformance implementation.
It defaults to 1000 ms and is hardcoded, so it cannot be changed, but we could
expose it as an OPTIONAL kafka-consumer-perf-test parameter so that it is
configurable at the script level and available to the end user.
2) Decrease max.poll.records at the Consumer config level. This is not a good
option, though, since we do not want to touch the default settings.
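A rough sketch of what the two options could look like in code. The flag name
"--timeout", the class name and both method names are hypothetical and chosen
only for illustration; max.poll.records is the existing consumer setting
mentioned above, and 1000 ms is the current hardcoded default.

```java
import java.util.Properties;

public class PerfTestOptionsSketch {

    // Current hardcoded value, kept as the default so existing runs behave the same.
    static final long DEFAULT_POLL_LOOP_TIMEOUT_MS = 1000L;

    /** Option 1: read an optional timeout flag (hypothetical name), else use the default. */
    static long pollLoopTimeoutMs(String[] args) {
        for (int i = 0; i + 1 < args.length; i++) {
            if ("--timeout".equals(args[i])) {
                return Long.parseLong(args[i + 1]);
            }
        }
        return DEFAULT_POLL_LOOP_TIMEOUT_MS;
    }

    /** Option 2: override max.poll.records so each polled batch is smaller. */
    static Properties smallerBatches(int maxPollRecords) {
        Properties props = new Properties();
        props.setProperty("max.poll.records", Integer.toString(maxPollRecords));
        return props;
    }

    public static void main(String[] args) {
        System.out.println(pollLoopTimeoutMs(new String[] {}));                     // prints 1000
        System.out.println(pollLoopTimeoutMs(new String[] {"--timeout", "60000"})); // prints 60000
        System.out.println(smallerBatches(100).getProperty("max.poll.records"));    // prints 100
    }
}
```

Option 1 leaves the consumer defaults untouched and only changes behavior when
the user passes the parameter, which is why it looks preferable.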