[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request
[ https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15743437#comment-15743437 ] Ismael Juma commented on KAFKA-4405: Thanks for the PR [~enothereska]. Can you please update the JIRA title and description to match what we actually merged? > Kafka consumer improperly send prefetch request > --- > > Key: KAFKA-4405 > URL: https://issues.apache.org/jira/browse/KAFKA-4405 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.10.0.1 >Reporter: ysysberserk >Assignee: Eno Thereska > Fix For: 0.10.2.0 > > > Now kafka consumer has added max.poll.records to limit the count of messages > return by poll(). > According to KIP-41, to implement max.poll.records, the prefetch request > should only be sent when the total number of retained records is less than > max.poll.records. > But in the code of 0.10.0.1 , the consumer will send a prefetch request if it > retained any records and never check if total number of retained records is > less than max.poll.records.. > If max.poll.records is set to a count much less than the count of message > fetched , the poll() loop will send a lot of requests than expected and will > have more and more records fetched and stored in memory before they can be > consumed. > So before sending a prefetch request , the consumer must check if total > number of retained records is less than max.poll.records. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request
[ https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15742591#comment-15742591 ] ASF GitHub Bot commented on KAFKA-4405: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/2193 > Kafka consumer improperly send prefetch request > --- > > Key: KAFKA-4405 > URL: https://issues.apache.org/jira/browse/KAFKA-4405 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.10.0.1 >Reporter: ysysberserk >Assignee: Eno Thereska > Fix For: 0.10.2.0 > > > Now kafka consumer has added max.poll.records to limit the count of messages > return by poll(). > According to KIP-41, to implement max.poll.records, the prefetch request > should only be sent when the total number of retained records is less than > max.poll.records. > But in the code of 0.10.0.1 , the consumer will send a prefetch request if it > retained any records and never check if total number of retained records is > less than max.poll.records.. > If max.poll.records is set to a count much less than the count of message > fetched , the poll() loop will send a lot of requests than expected and will > have more and more records fetched and stored in memory before they can be > consumed. > So before sending a prefetch request , the consumer must check if total > number of retained records is less than max.poll.records. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request
[ https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15711481#comment-15711481 ] Eno Thereska commented on KAFKA-4405: - [~guozhang] Ignore previous comment, I was testing by mistake with max.poll.records=1000 today. The PR is still showing benefits for max.poll.records=1 > Kafka consumer improperly send prefetch request > --- > > Key: KAFKA-4405 > URL: https://issues.apache.org/jira/browse/KAFKA-4405 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.10.0.1 >Reporter: ysysberserk > > Now kafka consumer has added max.poll.records to limit the count of messages > return by poll(). > According to KIP-41, to implement max.poll.records, the prefetch request > should only be sent when the total number of retained records is less than > max.poll.records. > But in the code of 0.10.0.1 , the consumer will send a prefetch request if it > retained any records and never check if total number of retained records is > less than max.poll.records.. > If max.poll.records is set to a count much less than the count of message > fetched , the poll() loop will send a lot of requests than expected and will > have more and more records fetched and stored in memory before they can be > consumed. > So before sending a prefetch request , the consumer must check if total > number of retained records is less than max.poll.records. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request
[ https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15711475#comment-15711475 ] ASF GitHub Bot commented on KAFKA-4405: --- GitHub user enothereska reopened a pull request: https://github.com/apache/kafka/pull/2193 KAFKA-4405: Check max.poll.records before prefetching You can merge this pull request into a Git repository by running: $ git pull https://github.com/enothereska/kafka KAFKA-4405-prefetch Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/2193.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2193 commit 592482e0c41b14f66bd78f2f4f6247a686743280 Author: Eno ThereskaDate: 2016-11-30T09:29:55Z Check max.poll.records before prefetching > Kafka consumer improperly send prefetch request > --- > > Key: KAFKA-4405 > URL: https://issues.apache.org/jira/browse/KAFKA-4405 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.10.0.1 >Reporter: ysysberserk > > Now kafka consumer has added max.poll.records to limit the count of messages > return by poll(). > According to KIP-41, to implement max.poll.records, the prefetch request > should only be sent when the total number of retained records is less than > max.poll.records. > But in the code of 0.10.0.1 , the consumer will send a prefetch request if it > retained any records and never check if total number of retained records is > less than max.poll.records.. > If max.poll.records is set to a count much less than the count of message > fetched , the poll() loop will send a lot of requests than expected and will > have more and more records fetched and stored in memory before they can be > consumed. > So before sending a prefetch request , the consumer must check if total > number of retained records is less than max.poll.records. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request
[ https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15711461#comment-15711461 ] Eno Thereska commented on KAFKA-4405: - [~guozhang]I tested with the new trunk containing KAFKA-4469 and I don't see the performance discrepancy anymore. So I'll close this PR. Thanks. > Kafka consumer improperly send prefetch request > --- > > Key: KAFKA-4405 > URL: https://issues.apache.org/jira/browse/KAFKA-4405 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.10.0.1 >Reporter: ysysberserk > > Now kafka consumer has added max.poll.records to limit the count of messages > return by poll(). > According to KIP-41, to implement max.poll.records, the prefetch request > should only be sent when the total number of retained records is less than > max.poll.records. > But in the code of 0.10.0.1 , the consumer will send a prefetch request if it > retained any records and never check if total number of retained records is > less than max.poll.records.. > If max.poll.records is set to a count much less than the count of message > fetched , the poll() loop will send a lot of requests than expected and will > have more and more records fetched and stored in memory before they can be > consumed. > So before sending a prefetch request , the consumer must check if total > number of retained records is less than max.poll.records. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request
[ https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15711450#comment-15711450 ] ASF GitHub Bot commented on KAFKA-4405: --- Github user enothereska closed the pull request at: https://github.com/apache/kafka/pull/2193 > Kafka consumer improperly send prefetch request > --- > > Key: KAFKA-4405 > URL: https://issues.apache.org/jira/browse/KAFKA-4405 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.10.0.1 >Reporter: ysysberserk > > Now kafka consumer has added max.poll.records to limit the count of messages > return by poll(). > According to KIP-41, to implement max.poll.records, the prefetch request > should only be sent when the total number of retained records is less than > max.poll.records. > But in the code of 0.10.0.1 , the consumer will send a prefetch request if it > retained any records and never check if total number of retained records is > less than max.poll.records.. > If max.poll.records is set to a count much less than the count of message > fetched , the poll() loop will send a lot of requests than expected and will > have more and more records fetched and stored in memory before they can be > consumed. > So before sending a prefetch request , the consumer must check if total > number of retained records is less than max.poll.records. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request
[ https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15708052#comment-15708052 ] Eno Thereska commented on KAFKA-4405: - I opened a PR since the performance difference is still significant. > Kafka consumer improperly send prefetch request > --- > > Key: KAFKA-4405 > URL: https://issues.apache.org/jira/browse/KAFKA-4405 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.10.0.1 >Reporter: ysysberserk > > Now kafka consumer has added max.poll.records to limit the count of messages > return by poll(). > According to KIP-41, to implement max.poll.records, the prefetch request > should only be sent when the total number of retained records is less than > max.poll.records. > But in the code of 0.10.0.1 , the consumer will send a prefetch request if it > retained any records and never check if total number of retained records is > less than max.poll.records.. > If max.poll.records is set to a count much less than the count of message > fetched , the poll() loop will send a lot of requests than expected and will > have more and more records fetched and stored in memory before they can be > consumed. > So before sending a prefetch request , the consumer must check if total > number of retained records is less than max.poll.records. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request
[ https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15708050#comment-15708050 ] ASF GitHub Bot commented on KAFKA-4405: --- GitHub user enothereska opened a pull request: https://github.com/apache/kafka/pull/2193 KAFKA-4405: Check max.poll.records before prefetching You can merge this pull request into a Git repository by running: $ git pull https://github.com/enothereska/kafka KAFKA-4405-prefetch Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/2193.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2193 commit 592482e0c41b14f66bd78f2f4f6247a686743280 Author: Eno ThereskaDate: 2016-11-30T09:29:55Z Check max.poll.records before prefetching > Kafka consumer improperly send prefetch request > --- > > Key: KAFKA-4405 > URL: https://issues.apache.org/jira/browse/KAFKA-4405 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.10.0.1 >Reporter: ysysberserk > > Now kafka consumer has added max.poll.records to limit the count of messages > return by poll(). > According to KIP-41, to implement max.poll.records, the prefetch request > should only be sent when the total number of retained records is less than > max.poll.records. > But in the code of 0.10.0.1 , the consumer will send a prefetch request if it > retained any records and never check if total number of retained records is > less than max.poll.records.. > If max.poll.records is set to a count much less than the count of message > fetched , the poll() loop will send a lot of requests than expected and will > have more and more records fetched and stored in memory before they can be > consumed. > So before sending a prefetch request , the consumer must check if total > number of retained records is less than max.poll.records. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request
[ https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15707215#comment-15707215 ] ysysberserk commented on KAFKA-4405: Oh, I am sorry that I only read the code and did not notice that in fetchablePartitions() the partition which has records is removed before sent a prefetch. So there is no such a big problem and current work flow is ok. But we can still add check if total number of retained records is less than max.poll.records before we call fetcher.sendFetches().It will prevent useless call and save a lot of time. > Kafka consumer improperly send prefetch request > --- > > Key: KAFKA-4405 > URL: https://issues.apache.org/jira/browse/KAFKA-4405 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.10.0.1 >Reporter: ysysberserk > > Now kafka consumer has added max.poll.records to limit the count of messages > return by poll(). > According to KIP-41, to implement max.poll.records, the prefetch request > should only be sent when the total number of retained records is less than > max.poll.records. > But in the code of 0.10.0.1 , the consumer will send a prefetch request if it > retained any records and never check if total number of retained records is > less than max.poll.records.. > If max.poll.records is set to a count much less than the count of message > fetched , the poll() loop will send a lot of requests than expected and will > have more and more records fetched and stored in memory before they can be > consumed. > So before sending a prefetch request , the consumer must check if total > number of retained records is less than max.poll.records. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request
[ https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15706166#comment-15706166 ] Guozhang Wang commented on KAFKA-4405: -- Thanks [~enothereska]. > Kafka consumer improperly send prefetch request > --- > > Key: KAFKA-4405 > URL: https://issues.apache.org/jira/browse/KAFKA-4405 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.10.0.1 >Reporter: ysysberserk > > Now kafka consumer has added max.poll.records to limit the count of messages > return by poll(). > According to KIP-41, to implement max.poll.records, the prefetch request > should only be sent when the total number of retained records is less than > max.poll.records. > But in the code of 0.10.0.1 , the consumer will send a prefetch request if it > retained any records and never check if total number of retained records is > less than max.poll.records.. > If max.poll.records is set to a count much less than the count of message > fetched , the poll() loop will send a lot of requests than expected and will > have more and more records fetched and stored in memory before they can be > consumed. > So before sending a prefetch request , the consumer must check if total > number of retained records is less than max.poll.records. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request
[ https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15705691#comment-15705691 ] Ismael Juma commented on KAFKA-4405: [~ysysberserk], are you really seeing the behaviour you have described here? Fetcher should only try to prefetch if we don't have records for that partition, see the code below (`fetchablePartitions` in particular): {code} private List fetchablePartitions() { List fetchable = subscriptions.fetchablePartitions(); if (nextInLineRecords != null && !nextInLineRecords.isEmpty()) fetchable.remove(nextInLineRecords.partition); for (CompletedFetch completedFetch : completedFetches) fetchable.remove(completedFetch.partition); return fetchable; } /** * Create fetch requests for all nodes for which we have assigned partitions * that have no existing requests in flight. */ private MapcreateFetchRequests() { // create the fetch info Cluster cluster = metadata.fetch(); Map > fetchable = new LinkedHashMap<>(); for (TopicPartition partition : fetchablePartitions()) { Node node = cluster.leaderFor(partition); if (node == null) { metadata.requestUpdate(); } else if (this.client.pendingRequestCount(node) == 0) { // if there is a leader and no in-flight requests, issue a new fetch LinkedHashMap fetch = fetchable.get(node); if (fetch == null) { fetch = new LinkedHashMap<>(); fetchable.put(node, fetch); } long position = this.subscriptions.position(partition); fetch.put(partition, new FetchRequest.PartitionData(position, this.fetchSize)); log.trace("Added fetch request for partition {} at offset {}", partition, position); } else { log.trace("Skipping fetch for partition {} because there is an in-flight request to {}", partition, node); } } // create the fetches Map requests = new HashMap<>(); for (Map.Entry > entry : fetchable.entrySet()) { Node node = entry.getKey(); FetchRequest fetch = new FetchRequest(this.maxWaitMs, this.minBytes, this.maxBytes, entry.getValue()); requests.put(node, fetch); } return requests; } {code} > Kafka consumer improperly send prefetch request > --- > > Key: KAFKA-4405 > URL: https://issues.apache.org/jira/browse/KAFKA-4405 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.10.0.1 >Reporter: ysysberserk > > Now kafka consumer has added max.poll.records to limit the count of messages > return by poll(). > According to KIP-41, to implement max.poll.records, the prefetch request > should only be sent when the total number of retained records is less than > max.poll.records. > But in the code of 0.10.0.1 , the consumer will send a prefetch request if it > retained any records and never check if total number of retained records is > less than max.poll.records.. > If max.poll.records is set to a count much less than the count of message > fetched , the poll() loop will send a lot of requests than expected and will > have more and more records fetched and stored in memory before they can be > consumed. > So before sending a prefetch request , the consumer must check if total > number of retained records is less than max.poll.records. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request
[ https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15705655#comment-15705655 ] Eno Thereska commented on KAFKA-4405: - [~ijuma] yes. > Kafka consumer improperly send prefetch request > --- > > Key: KAFKA-4405 > URL: https://issues.apache.org/jira/browse/KAFKA-4405 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.10.0.1 >Reporter: ysysberserk > > Now kafka consumer has added max.poll.records to limit the count of messages > return by poll(). > According to KIP-41, to implement max.poll.records, the prefetch request > should only be sent when the total number of retained records is less than > max.poll.records. > But in the code of 0.10.0.1 , the consumer will send a prefetch request if it > retained any records and never check if total number of retained records is > less than max.poll.records.. > If max.poll.records is set to a count much less than the count of message > fetched , the poll() loop will send a lot of requests than expected and will > have more and more records fetched and stored in memory before they can be > consumed. > So before sending a prefetch request , the consumer must check if total > number of retained records is less than max.poll.records. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request
[ https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15705594#comment-15705594 ] Ismael Juma commented on KAFKA-4405: [~enothereska], do you mean the following? {code} if (!records.isEmpty()) { // before returning the fetched records, we can send off the next round of fetches // and avoid block waiting for their responses to enable pipelining while the user // is handling the fetched records. // // NOTE: since the consumed position has already been updated, we must not allow // wakeups or any other errors to be triggered prior to returning the fetched records. if (records.size() < fetcher.maxPollRecords) { fetcher.sendFetches(); client.pollNoWakeup(); } if (this.interceptors == null) return new ConsumerRecords<>(records); else return this.interceptors.onConsume(new ConsumerRecords<>(records)); } {code} > Kafka consumer improperly send prefetch request > --- > > Key: KAFKA-4405 > URL: https://issues.apache.org/jira/browse/KAFKA-4405 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.10.0.1 >Reporter: ysysberserk > > Now kafka consumer has added max.poll.records to limit the count of messages > return by poll(). > According to KIP-41, to implement max.poll.records, the prefetch request > should only be sent when the total number of retained records is less than > max.poll.records. > But in the code of 0.10.0.1 , the consumer will send a prefetch request if it > retained any records and never check if total number of retained records is > less than max.poll.records.. > If max.poll.records is set to a count much less than the count of message > fetched , the poll() loop will send a lot of requests than expected and will > have more and more records fetched and stored in memory before they can be > consumed. > So before sending a prefetch request , the consumer must check if total > number of retained records is less than max.poll.records. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request
[ https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15705475#comment-15705475 ] Eno Thereska commented on KAFKA-4405: - [~guozhang], [~hachikuji] Indeed I can verify that I saw a 50% increase in performance with max.poll.records set to 1 with [~ysysberserk]'s suggestion in KafkaConsumer::poll() // 1 line change. For testing, I made fetcher.maxPollRecords public. if (records.size() < fetcher.maxPollRecords){ fetcher.sendFetches(); client.pollNoWakeup(); } > Kafka consumer improperly send prefetch request > --- > > Key: KAFKA-4405 > URL: https://issues.apache.org/jira/browse/KAFKA-4405 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.10.0.1 >Reporter: ysysberserk > > Now kafka consumer has added max.poll.records to limit the count of messages > return by poll(). > According to KIP-41, to implement max.poll.records, the prefetch request > should only be sent when the total number of retained records is less than > max.poll.records. > But in the code of 0.10.0.1 , the consumer will send a prefetch request if it > retained any records and never check if total number of retained records is > less than max.poll.records.. > If max.poll.records is set to a count much less than the count of message > fetched , the poll() loop will send a lot of requests than expected and will > have more and more records fetched and stored in memory before they can be > consumed. > So before sending a prefetch request , the consumer must check if total > number of retained records is less than max.poll.records. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request
[ https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15701241#comment-15701241 ] ysysberserk commented on KAFKA-4405: It seems to me that it is not only a performance problem but will cause big problem . For example , assume we set max.poll.records to 1 and each prefetch request can fetch 1000 records and there are plenty unprocessed records to be read. Then to consume all 1000 records in the first prefetch request, we need call poll() 1000 times and 1000 prefetch request will be sent and 1000 * 1000 records are returned and stored in memory. And when it goes on , more and more records will be fetched and need to wait for a long time to be consumed. When there are too many records, it may run out of memory. So the count of prefetch must be strictly limited and be sent only when necessary. > Kafka consumer improperly send prefetch request > --- > > Key: KAFKA-4405 > URL: https://issues.apache.org/jira/browse/KAFKA-4405 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.10.0.1 >Reporter: ysysberserk > > Now kafka consumer has added max.poll.records to limit the count of messages > return by poll(). > According to KIP-41, to implement max.poll.records, the prefetch request > should only be sent when the total number of retained records is less than > max.poll.records. > But in the code of 0.10.0.1 , the consumer will send a prefetch request if it > retained any records and never check if total number of retained records is > less than max.poll.records.. > If max.poll.records is set to a count much less than the count of message > fetched , the poll() loop will send a lot of requests than expected and will > have more and more records fetched and stored in memory before they can be > consumed. > So before sending a prefetch request , the consumer must check if total > number of retained records is less than max.poll.records. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request
[ https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15701118#comment-15701118 ] ysysberserk commented on KAFKA-4405: It sounds better, I will check it. But comparing the total number of retained records with max.poll.records will be much simpler and can be done with only a few lines. > Kafka consumer improperly send prefetch request > --- > > Key: KAFKA-4405 > URL: https://issues.apache.org/jira/browse/KAFKA-4405 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.10.0.1 >Reporter: ysysberserk > > Now kafka consumer has added max.poll.records to limit the count of messages > return by poll(). > According to KIP-41, to implement max.poll.records, the prefetch request > should only be sent when the total number of retained records is less than > max.poll.records. > But in the code of 0.10.0.1 , the consumer will send a prefetch request if it > retained any records and never check if total number of retained records is > less than max.poll.records.. > If max.poll.records is set to a count much less than the count of message > fetched , the poll() loop will send a lot of requests than expected and will > have more and more records fetched and stored in memory before they can be > consumed. > So before sending a prefetch request , the consumer must check if total > number of retained records is less than max.poll.records. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request
[ https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15701064#comment-15701064 ] Guozhang Wang commented on KAFKA-4405: -- [~enothereska] Maybe this can result in performance differences in consumer. > Kafka consumer improperly send prefetch request > --- > > Key: KAFKA-4405 > URL: https://issues.apache.org/jira/browse/KAFKA-4405 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.10.0.1 >Reporter: ysysberserk > > Now kafka consumer has added max.poll.records to limit the count of messages > return by poll(). > According to KIP-41, to implement max.poll.records, the prefetch request > should only be sent when the total number of retained records is less than > max.poll.records. > But in the code of 0.10.0.1 , the consumer will send a prefetch request if it > retained any records and never check if total number of retained records is > less than max.poll.records.. > If max.poll.records is set to a count much less than the count of message > fetched , the poll() loop will send a lot of requests than expected and will > have more and more records fetched and stored in memory before they can be > consumed. > So before sending a prefetch request , the consumer must check if total > number of retained records is less than max.poll.records. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request
[ https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15700704#comment-15700704 ] Jason Gustafson commented on KAFKA-4405: I think a better approach is KAFKA-4007. The idea is to always send the next prefetch (regardless whether we still have data queued), but avoid reading it from the socket until we need to. This is also related to KIP-81: https://cwiki.apache.org/confluence/display/KAFKA/KIP-81%3A+Bound+Fetch+memory+usage+in+the+consumer. > Kafka consumer improperly send prefetch request > --- > > Key: KAFKA-4405 > URL: https://issues.apache.org/jira/browse/KAFKA-4405 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.10.0.1 >Reporter: ysysberserk > > Now kafka consumer has added max.poll.records to limit the count of messages > return by poll(). > According to KIP-41, to implement max.poll.records, the prefetch request > should only be sent when the total number of retained records is less than > max.poll.records. > But in the code of 0.10.0.1 , the consumer will send a prefetch request if it > retained any records and never check if total number of retained records is > less than max.poll.records.. > If max.poll.records is set to a count much less than the count of message > fetched , the poll() loop will send a lot of requests than expected and will > have more and more records fetched and stored in memory before they can be > consumed. > So before sending a prefetch request , the consumer must check if total > number of retained records is less than max.poll.records. -- This message was sent by Atlassian JIRA (v6.3.4#6332)