[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request

2016-12-12 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15743437#comment-15743437
 ] 

Ismael Juma commented on KAFKA-4405:


Thanks for the PR [~enothereska]. Can you please update the JIRA title and 
description to match what we actually merged?

> Kafka consumer improperly send prefetch request
> ---
>
> Key: KAFKA-4405
> URL: https://issues.apache.org/jira/browse/KAFKA-4405
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.1
>Reporter: ysysberserk
>Assignee: Eno Thereska
> Fix For: 0.10.2.0
>
>
> Now kafka consumer has added max.poll.records to limit the count of messages 
> return by poll().
> According to KIP-41, to implement  max.poll.records, the prefetch request 
> should only be sent when the total number of retained records is less than 
> max.poll.records.
> But in the code of 0.10.0.1 , the consumer will send a prefetch request if it 
> retained any records and never check if total number of retained records is 
> less than max.poll.records..
> If max.poll.records is set to a count much less than the count of message 
> fetched , the poll() loop will send a lot of requests than expected and will 
> have more and more records fetched and stored in memory before they can be 
> consumed.
> So before sending a  prefetch request , the consumer must check if total 
> number of retained records is less than max.poll.records.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request

2016-12-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15742591#comment-15742591
 ] 

ASF GitHub Bot commented on KAFKA-4405:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2193


> Kafka consumer improperly send prefetch request
> ---
>
> Key: KAFKA-4405
> URL: https://issues.apache.org/jira/browse/KAFKA-4405
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.1
>Reporter: ysysberserk
>Assignee: Eno Thereska
> Fix For: 0.10.2.0
>
>
> Now kafka consumer has added max.poll.records to limit the count of messages 
> return by poll().
> According to KIP-41, to implement  max.poll.records, the prefetch request 
> should only be sent when the total number of retained records is less than 
> max.poll.records.
> But in the code of 0.10.0.1 , the consumer will send a prefetch request if it 
> retained any records and never check if total number of retained records is 
> less than max.poll.records..
> If max.poll.records is set to a count much less than the count of message 
> fetched , the poll() loop will send a lot of requests than expected and will 
> have more and more records fetched and stored in memory before they can be 
> consumed.
> So before sending a  prefetch request , the consumer must check if total 
> number of retained records is less than max.poll.records.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request

2016-12-01 Thread Eno Thereska (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15711481#comment-15711481
 ] 

Eno Thereska commented on KAFKA-4405:
-

[~guozhang] Ignore previous comment, I was testing by mistake with 
max.poll.records=1000 today. The PR is still showing benefits for 
max.poll.records=1

> Kafka consumer improperly send prefetch request
> ---
>
> Key: KAFKA-4405
> URL: https://issues.apache.org/jira/browse/KAFKA-4405
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.1
>Reporter: ysysberserk
>
> Now kafka consumer has added max.poll.records to limit the count of messages 
> return by poll().
> According to KIP-41, to implement  max.poll.records, the prefetch request 
> should only be sent when the total number of retained records is less than 
> max.poll.records.
> But in the code of 0.10.0.1 , the consumer will send a prefetch request if it 
> retained any records and never check if total number of retained records is 
> less than max.poll.records..
> If max.poll.records is set to a count much less than the count of message 
> fetched , the poll() loop will send a lot of requests than expected and will 
> have more and more records fetched and stored in memory before they can be 
> consumed.
> So before sending a  prefetch request , the consumer must check if total 
> number of retained records is less than max.poll.records.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request

2016-12-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15711475#comment-15711475
 ] 

ASF GitHub Bot commented on KAFKA-4405:
---

GitHub user enothereska reopened a pull request:

https://github.com/apache/kafka/pull/2193

KAFKA-4405: Check max.poll.records before prefetching



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/enothereska/kafka KAFKA-4405-prefetch

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2193.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2193


commit 592482e0c41b14f66bd78f2f4f6247a686743280
Author: Eno Thereska 
Date:   2016-11-30T09:29:55Z

Check max.poll.records before prefetching




> Kafka consumer improperly send prefetch request
> ---
>
> Key: KAFKA-4405
> URL: https://issues.apache.org/jira/browse/KAFKA-4405
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.1
>Reporter: ysysberserk
>
> Now kafka consumer has added max.poll.records to limit the count of messages 
> return by poll().
> According to KIP-41, to implement  max.poll.records, the prefetch request 
> should only be sent when the total number of retained records is less than 
> max.poll.records.
> But in the code of 0.10.0.1 , the consumer will send a prefetch request if it 
> retained any records and never check if total number of retained records is 
> less than max.poll.records..
> If max.poll.records is set to a count much less than the count of message 
> fetched , the poll() loop will send a lot of requests than expected and will 
> have more and more records fetched and stored in memory before they can be 
> consumed.
> So before sending a  prefetch request , the consumer must check if total 
> number of retained records is less than max.poll.records.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request

2016-12-01 Thread Eno Thereska (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15711461#comment-15711461
 ] 

Eno Thereska commented on KAFKA-4405:
-

[~guozhang]I tested with the new trunk containing KAFKA-4469 and I don't see 
the performance discrepancy anymore. So I'll close this PR. Thanks.

> Kafka consumer improperly send prefetch request
> ---
>
> Key: KAFKA-4405
> URL: https://issues.apache.org/jira/browse/KAFKA-4405
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.1
>Reporter: ysysberserk
>
> Now kafka consumer has added max.poll.records to limit the count of messages 
> return by poll().
> According to KIP-41, to implement  max.poll.records, the prefetch request 
> should only be sent when the total number of retained records is less than 
> max.poll.records.
> But in the code of 0.10.0.1 , the consumer will send a prefetch request if it 
> retained any records and never check if total number of retained records is 
> less than max.poll.records..
> If max.poll.records is set to a count much less than the count of message 
> fetched , the poll() loop will send a lot of requests than expected and will 
> have more and more records fetched and stored in memory before they can be 
> consumed.
> So before sending a  prefetch request , the consumer must check if total 
> number of retained records is less than max.poll.records.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request

2016-12-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15711450#comment-15711450
 ] 

ASF GitHub Bot commented on KAFKA-4405:
---

Github user enothereska closed the pull request at:

https://github.com/apache/kafka/pull/2193


> Kafka consumer improperly send prefetch request
> ---
>
> Key: KAFKA-4405
> URL: https://issues.apache.org/jira/browse/KAFKA-4405
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.1
>Reporter: ysysberserk
>
> Now kafka consumer has added max.poll.records to limit the count of messages 
> return by poll().
> According to KIP-41, to implement  max.poll.records, the prefetch request 
> should only be sent when the total number of retained records is less than 
> max.poll.records.
> But in the code of 0.10.0.1 , the consumer will send a prefetch request if it 
> retained any records and never check if total number of retained records is 
> less than max.poll.records..
> If max.poll.records is set to a count much less than the count of message 
> fetched , the poll() loop will send a lot of requests than expected and will 
> have more and more records fetched and stored in memory before they can be 
> consumed.
> So before sending a  prefetch request , the consumer must check if total 
> number of retained records is less than max.poll.records.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request

2016-11-30 Thread Eno Thereska (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15708052#comment-15708052
 ] 

Eno Thereska commented on KAFKA-4405:
-

I opened a PR since the performance difference is still significant.

> Kafka consumer improperly send prefetch request
> ---
>
> Key: KAFKA-4405
> URL: https://issues.apache.org/jira/browse/KAFKA-4405
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.1
>Reporter: ysysberserk
>
> Now kafka consumer has added max.poll.records to limit the count of messages 
> return by poll().
> According to KIP-41, to implement  max.poll.records, the prefetch request 
> should only be sent when the total number of retained records is less than 
> max.poll.records.
> But in the code of 0.10.0.1 , the consumer will send a prefetch request if it 
> retained any records and never check if total number of retained records is 
> less than max.poll.records..
> If max.poll.records is set to a count much less than the count of message 
> fetched , the poll() loop will send a lot of requests than expected and will 
> have more and more records fetched and stored in memory before they can be 
> consumed.
> So before sending a  prefetch request , the consumer must check if total 
> number of retained records is less than max.poll.records.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request

2016-11-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15708050#comment-15708050
 ] 

ASF GitHub Bot commented on KAFKA-4405:
---

GitHub user enothereska opened a pull request:

https://github.com/apache/kafka/pull/2193

KAFKA-4405: Check max.poll.records before prefetching



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/enothereska/kafka KAFKA-4405-prefetch

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2193.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2193


commit 592482e0c41b14f66bd78f2f4f6247a686743280
Author: Eno Thereska 
Date:   2016-11-30T09:29:55Z

Check max.poll.records before prefetching




> Kafka consumer improperly send prefetch request
> ---
>
> Key: KAFKA-4405
> URL: https://issues.apache.org/jira/browse/KAFKA-4405
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.1
>Reporter: ysysberserk
>
> Now kafka consumer has added max.poll.records to limit the count of messages 
> return by poll().
> According to KIP-41, to implement  max.poll.records, the prefetch request 
> should only be sent when the total number of retained records is less than 
> max.poll.records.
> But in the code of 0.10.0.1 , the consumer will send a prefetch request if it 
> retained any records and never check if total number of retained records is 
> less than max.poll.records..
> If max.poll.records is set to a count much less than the count of message 
> fetched , the poll() loop will send a lot of requests than expected and will 
> have more and more records fetched and stored in memory before they can be 
> consumed.
> So before sending a  prefetch request , the consumer must check if total 
> number of retained records is less than max.poll.records.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request

2016-11-29 Thread ysysberserk (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15707215#comment-15707215
 ] 

ysysberserk commented on KAFKA-4405:


Oh, I am sorry that I only read the code and did not notice that in 
fetchablePartitions() the partition which has records is removed before sent a 
prefetch.

So there is no such a big problem and current work flow is ok.

But we can still add check if total number of retained records is less than 
max.poll.records  before we call 

fetcher.sendFetches().It will prevent useless call and save a lot of time.

> Kafka consumer improperly send prefetch request
> ---
>
> Key: KAFKA-4405
> URL: https://issues.apache.org/jira/browse/KAFKA-4405
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.1
>Reporter: ysysberserk
>
> Now kafka consumer has added max.poll.records to limit the count of messages 
> return by poll().
> According to KIP-41, to implement  max.poll.records, the prefetch request 
> should only be sent when the total number of retained records is less than 
> max.poll.records.
> But in the code of 0.10.0.1 , the consumer will send a prefetch request if it 
> retained any records and never check if total number of retained records is 
> less than max.poll.records..
> If max.poll.records is set to a count much less than the count of message 
> fetched , the poll() loop will send a lot of requests than expected and will 
> have more and more records fetched and stored in memory before they can be 
> consumed.
> So before sending a  prefetch request , the consumer must check if total 
> number of retained records is less than max.poll.records.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request

2016-11-29 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15706166#comment-15706166
 ] 

Guozhang Wang commented on KAFKA-4405:
--

Thanks [~enothereska].

> Kafka consumer improperly send prefetch request
> ---
>
> Key: KAFKA-4405
> URL: https://issues.apache.org/jira/browse/KAFKA-4405
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.1
>Reporter: ysysberserk
>
> Now kafka consumer has added max.poll.records to limit the count of messages 
> return by poll().
> According to KIP-41, to implement  max.poll.records, the prefetch request 
> should only be sent when the total number of retained records is less than 
> max.poll.records.
> But in the code of 0.10.0.1 , the consumer will send a prefetch request if it 
> retained any records and never check if total number of retained records is 
> less than max.poll.records..
> If max.poll.records is set to a count much less than the count of message 
> fetched , the poll() loop will send a lot of requests than expected and will 
> have more and more records fetched and stored in memory before they can be 
> consumed.
> So before sending a  prefetch request , the consumer must check if total 
> number of retained records is less than max.poll.records.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request

2016-11-29 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15705691#comment-15705691
 ] 

Ismael Juma commented on KAFKA-4405:


[~ysysberserk], are you really seeing the behaviour you have described here? 
Fetcher should only try to prefetch if we don't have records for that 
partition, see the code below (`fetchablePartitions` in particular):

{code}
private List fetchablePartitions() {
List fetchable = subscriptions.fetchablePartitions();
if (nextInLineRecords != null && !nextInLineRecords.isEmpty())
fetchable.remove(nextInLineRecords.partition);
for (CompletedFetch completedFetch : completedFetches)
fetchable.remove(completedFetch.partition);
return fetchable;
}

/**
 * Create fetch requests for all nodes for which we have assigned partitions
 * that have no existing requests in flight.
 */
private Map createFetchRequests() {
// create the fetch info
Cluster cluster = metadata.fetch();
Map> 
fetchable = new LinkedHashMap<>();
for (TopicPartition partition : fetchablePartitions()) {
Node node = cluster.leaderFor(partition);
if (node == null) {
metadata.requestUpdate();
} else if (this.client.pendingRequestCount(node) == 0) {
// if there is a leader and no in-flight requests, issue a new 
fetch
LinkedHashMap fetch 
= fetchable.get(node);
if (fetch == null) {
fetch = new LinkedHashMap<>();
fetchable.put(node, fetch);
}

long position = this.subscriptions.position(partition);
fetch.put(partition, new FetchRequest.PartitionData(position, 
this.fetchSize));
log.trace("Added fetch request for partition {} at offset {}", 
partition, position);
} else {
log.trace("Skipping fetch for partition {} because there is an 
in-flight request to {}", partition, node);
}
}

// create the fetches
Map requests = new HashMap<>();
for (Map.Entry> entry : fetchable.entrySet()) {
Node node = entry.getKey();
FetchRequest fetch = new FetchRequest(this.maxWaitMs, 
this.minBytes, this.maxBytes, entry.getValue());
requests.put(node, fetch);
}
return requests;
}
{code}

> Kafka consumer improperly send prefetch request
> ---
>
> Key: KAFKA-4405
> URL: https://issues.apache.org/jira/browse/KAFKA-4405
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.1
>Reporter: ysysberserk
>
> Now kafka consumer has added max.poll.records to limit the count of messages 
> return by poll().
> According to KIP-41, to implement  max.poll.records, the prefetch request 
> should only be sent when the total number of retained records is less than 
> max.poll.records.
> But in the code of 0.10.0.1 , the consumer will send a prefetch request if it 
> retained any records and never check if total number of retained records is 
> less than max.poll.records..
> If max.poll.records is set to a count much less than the count of message 
> fetched , the poll() loop will send a lot of requests than expected and will 
> have more and more records fetched and stored in memory before they can be 
> consumed.
> So before sending a  prefetch request , the consumer must check if total 
> number of retained records is less than max.poll.records.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request

2016-11-29 Thread Eno Thereska (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15705655#comment-15705655
 ] 

Eno Thereska commented on KAFKA-4405:
-

[~ijuma] yes.

> Kafka consumer improperly send prefetch request
> ---
>
> Key: KAFKA-4405
> URL: https://issues.apache.org/jira/browse/KAFKA-4405
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.1
>Reporter: ysysberserk
>
> Now kafka consumer has added max.poll.records to limit the count of messages 
> return by poll().
> According to KIP-41, to implement  max.poll.records, the prefetch request 
> should only be sent when the total number of retained records is less than 
> max.poll.records.
> But in the code of 0.10.0.1 , the consumer will send a prefetch request if it 
> retained any records and never check if total number of retained records is 
> less than max.poll.records..
> If max.poll.records is set to a count much less than the count of message 
> fetched , the poll() loop will send a lot of requests than expected and will 
> have more and more records fetched and stored in memory before they can be 
> consumed.
> So before sending a  prefetch request , the consumer must check if total 
> number of retained records is less than max.poll.records.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request

2016-11-29 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15705594#comment-15705594
 ] 

Ismael Juma commented on KAFKA-4405:


[~enothereska], do you mean the following?

{code}
if (!records.isEmpty()) {
// before returning the fetched records, we can send off 
the next round of fetches
// and avoid block waiting for their responses to enable 
pipelining while the user
// is handling the fetched records.
//
// NOTE: since the consumed position has already been 
updated, we must not allow
// wakeups or any other errors to be triggered prior to 
returning the fetched records.
   if (records.size() < fetcher.maxPollRecords) {
   fetcher.sendFetches();
   client.pollNoWakeup();
   }

if (this.interceptors == null)
return new ConsumerRecords<>(records);
else
return this.interceptors.onConsume(new 
ConsumerRecords<>(records));
}
{code}

> Kafka consumer improperly send prefetch request
> ---
>
> Key: KAFKA-4405
> URL: https://issues.apache.org/jira/browse/KAFKA-4405
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.1
>Reporter: ysysberserk
>
> Now kafka consumer has added max.poll.records to limit the count of messages 
> return by poll().
> According to KIP-41, to implement  max.poll.records, the prefetch request 
> should only be sent when the total number of retained records is less than 
> max.poll.records.
> But in the code of 0.10.0.1 , the consumer will send a prefetch request if it 
> retained any records and never check if total number of retained records is 
> less than max.poll.records..
> If max.poll.records is set to a count much less than the count of message 
> fetched , the poll() loop will send a lot of requests than expected and will 
> have more and more records fetched and stored in memory before they can be 
> consumed.
> So before sending a  prefetch request , the consumer must check if total 
> number of retained records is less than max.poll.records.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request

2016-11-29 Thread Eno Thereska (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15705475#comment-15705475
 ] 

Eno Thereska commented on KAFKA-4405:
-

[~guozhang], [~hachikuji]
Indeed I can verify that I saw a 50% increase in performance with 
max.poll.records set to 1 with [~ysysberserk]'s suggestion in 
KafkaConsumer::poll()
   // 1 line change. For testing, I made fetcher.maxPollRecords 
public.
if (records.size() < fetcher.maxPollRecords){
fetcher.sendFetches();
client.pollNoWakeup();
}


> Kafka consumer improperly send prefetch request
> ---
>
> Key: KAFKA-4405
> URL: https://issues.apache.org/jira/browse/KAFKA-4405
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.1
>Reporter: ysysberserk
>
> Now kafka consumer has added max.poll.records to limit the count of messages 
> return by poll().
> According to KIP-41, to implement  max.poll.records, the prefetch request 
> should only be sent when the total number of retained records is less than 
> max.poll.records.
> But in the code of 0.10.0.1 , the consumer will send a prefetch request if it 
> retained any records and never check if total number of retained records is 
> less than max.poll.records..
> If max.poll.records is set to a count much less than the count of message 
> fetched , the poll() loop will send a lot of requests than expected and will 
> have more and more records fetched and stored in memory before they can be 
> consumed.
> So before sending a  prefetch request , the consumer must check if total 
> number of retained records is less than max.poll.records.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request

2016-11-27 Thread ysysberserk (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15701241#comment-15701241
 ] 

ysysberserk commented on KAFKA-4405:


It seems to me that it is not only a performance problem but will cause big 
problem .

For example , assume we set max.poll.records to 1 and each prefetch request can 
fetch 1000 records and there are plenty unprocessed records to be read.

Then to consume all 1000 records in the first prefetch request, we need call 
poll() 1000 times and 1000 prefetch request will be sent and 1000 * 1000 
records are returned and stored in memory.

And when it goes on , more and more records will be fetched and need to wait 
for a long time to be consumed.

When there are too many records, it may run out of memory.

So the count of prefetch must be strictly limited and be sent only when 
necessary.

> Kafka consumer improperly send prefetch request
> ---
>
> Key: KAFKA-4405
> URL: https://issues.apache.org/jira/browse/KAFKA-4405
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.1
>Reporter: ysysberserk
>
> Now kafka consumer has added max.poll.records to limit the count of messages 
> return by poll().
> According to KIP-41, to implement  max.poll.records, the prefetch request 
> should only be sent when the total number of retained records is less than 
> max.poll.records.
> But in the code of 0.10.0.1 , the consumer will send a prefetch request if it 
> retained any records and never check if total number of retained records is 
> less than max.poll.records..
> If max.poll.records is set to a count much less than the count of message 
> fetched , the poll() loop will send a lot of requests than expected and will 
> have more and more records fetched and stored in memory before they can be 
> consumed.
> So before sending a  prefetch request , the consumer must check if total 
> number of retained records is less than max.poll.records.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request

2016-11-27 Thread ysysberserk (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15701118#comment-15701118
 ] 

ysysberserk commented on KAFKA-4405:


It sounds better, I will check it.

But comparing the total number of retained records with  max.poll.records will 
be much simpler and can be done with only a few lines.

> Kafka consumer improperly send prefetch request
> ---
>
> Key: KAFKA-4405
> URL: https://issues.apache.org/jira/browse/KAFKA-4405
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.1
>Reporter: ysysberserk
>
> Now kafka consumer has added max.poll.records to limit the count of messages 
> return by poll().
> According to KIP-41, to implement  max.poll.records, the prefetch request 
> should only be sent when the total number of retained records is less than 
> max.poll.records.
> But in the code of 0.10.0.1 , the consumer will send a prefetch request if it 
> retained any records and never check if total number of retained records is 
> less than max.poll.records..
> If max.poll.records is set to a count much less than the count of message 
> fetched , the poll() loop will send a lot of requests than expected and will 
> have more and more records fetched and stored in memory before they can be 
> consumed.
> So before sending a  prefetch request , the consumer must check if total 
> number of retained records is less than max.poll.records.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request

2016-11-27 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15701064#comment-15701064
 ] 

Guozhang Wang commented on KAFKA-4405:
--

[~enothereska] Maybe this can result in performance differences in consumer.

> Kafka consumer improperly send prefetch request
> ---
>
> Key: KAFKA-4405
> URL: https://issues.apache.org/jira/browse/KAFKA-4405
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.1
>Reporter: ysysberserk
>
> Now kafka consumer has added max.poll.records to limit the count of messages 
> return by poll().
> According to KIP-41, to implement  max.poll.records, the prefetch request 
> should only be sent when the total number of retained records is less than 
> max.poll.records.
> But in the code of 0.10.0.1 , the consumer will send a prefetch request if it 
> retained any records and never check if total number of retained records is 
> less than max.poll.records..
> If max.poll.records is set to a count much less than the count of message 
> fetched , the poll() loop will send a lot of requests than expected and will 
> have more and more records fetched and stored in memory before they can be 
> consumed.
> So before sending a  prefetch request , the consumer must check if total 
> number of retained records is less than max.poll.records.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4405) Kafka consumer improperly send prefetch request

2016-11-27 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15700704#comment-15700704
 ] 

Jason Gustafson commented on KAFKA-4405:


I think a better approach is KAFKA-4007. The idea is to always send the next 
prefetch (regardless whether we still have data queued), but avoid reading it 
from the socket until we need to. This is also related to KIP-81: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-81%3A+Bound+Fetch+memory+usage+in+the+consumer.

> Kafka consumer improperly send prefetch request
> ---
>
> Key: KAFKA-4405
> URL: https://issues.apache.org/jira/browse/KAFKA-4405
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.1
>Reporter: ysysberserk
>
> Now kafka consumer has added max.poll.records to limit the count of messages 
> return by poll().
> According to KIP-41, to implement  max.poll.records, the prefetch request 
> should only be sent when the total number of retained records is less than 
> max.poll.records.
> But in the code of 0.10.0.1 , the consumer will send a prefetch request if it 
> retained any records and never check if total number of retained records is 
> less than max.poll.records..
> If max.poll.records is set to a count much less than the count of message 
> fetched , the poll() loop will send a lot of requests than expected and will 
> have more and more records fetched and stored in memory before they can be 
> consumed.
> So before sending a  prefetch request , the consumer must check if total 
> number of retained records is less than max.poll.records.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)