[ 
https://issues.apache.org/jira/browse/KAFKA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14378476#comment-14378476
 ] 

Jay Kreps commented on KAFKA-2045:
----------------------------------

There are really two issues:
1. Bounding fetch size while still guaranteeing that you eventually get data 
from each partition
2. Pooling and reusing byte buffers

I actually think (1) is really pressing, but (2) is just an optimization that 
may or may not have high payoff.

(1) is what leads to the huge memory allocations and sudden OOM when a consumer 
falls behind and then suddenly has lots of data or when partition assignment 
changes.

For (1) I think we need to figure out whether this is (a) some heuristic in the 
consumer which decides to only do fetches for a subset of topic/partitions or 
(b) a new parameter in the fetch request that gives a total bound on the 
request size. I think we discussed this a while back and agreed on (b), but I 
can't remember now. The argument, if I recall, was that (b) is the only way for 
the server to monitor all the subscribed topics and avoid blocking on an empty 
topic while non-empty partitions have data.
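
To make option (b) concrete, here is a minimal sketch of how a server might 
divide one total byte budget across the requested partitions. Everything in it 
is hypothetical: the class BoundedFetchSketch, the method planFetch, and the 
parameters maxTotalBytes and startIndex are not part of the Kafka protocol or 
code base. It also ignores message boundaries, which a real implementation 
would have to respect.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not Kafka's actual fetch handling.
class BoundedFetchSketch {

    /**
     * Splits maxTotalBytes across partitions, visiting them in order starting
     * at startIndex. Empty partitions are skipped rather than waited on.
     * Returns the bytes granted per partition, in visiting order.
     */
    static Map<String, Integer> planFetch(List<String> partitions,
                                          Map<String, Integer> availableBytes,
                                          int maxTotalBytes,
                                          int startIndex) {
        Map<String, Integer> plan = new LinkedHashMap<>();
        int remaining = maxTotalBytes;
        int n = partitions.size();
        for (int i = 0; i < n && remaining > 0; i++) {
            String partition = partitions.get(Math.floorMod(startIndex + i, n));
            int available = availableBytes.getOrDefault(partition, 0);
            if (available <= 0) {
                continue; // nothing to send for this partition right now
            }
            int granted = Math.min(available, remaining);
            plan.put(partition, granted);
            remaining -= granted;
        }
        return plan;
    }
}
```

If the caller advances startIndex on every request, a partition that was cut 
off by the budget in one response is visited earlier in a later one, which is 
one way to get the "eventually get data from each partition" guarantee.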

Bounding the allocations should help performance a lot too.

If we do this bounding then I think reuse will be a lot easier too, since each 
response will use at most that many bytes and you could potentially even just 
statically allocate the byte buffer for each partition and reuse it.
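
The per-partition reuse idea can be sketched roughly as follows, assuming a 
fixed per-partition bound is known up front. The class PartitionBufferSketch 
and its method bufferFor are hypothetical names for illustration and do not 
exist in the Kafka consumer; only java.nio.ByteBuffer is a real API here.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the Kafka consumer's memory management.
class PartitionBufferSketch {
    private final int maxBytesPerPartition;
    private final Map<String, ByteBuffer> buffers = new HashMap<>();

    PartitionBufferSketch(int maxBytesPerPartition) {
        this.maxBytesPerPartition = maxBytesPerPartition;
    }

    /**
     * Returns the buffer dedicated to this partition, allocating it on first
     * use and resetting it (position 0, limit = capacity) on every call.
     */
    ByteBuffer bufferFor(String partition) {
        ByteBuffer buffer = buffers.computeIfAbsent(
                partition, p -> ByteBuffer.allocate(maxBytesPerPartition));
        buffer.clear(); // resets position and limit; old bytes get overwritten
        return buffer;
    }
}
```

Because the same buffer is handed out again, this only works if the previous 
response for that partition has been fully consumed before the next call.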

> Memory Management on the consumer
> ---------------------------------
>
>                 Key: KAFKA-2045
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2045
>             Project: Kafka
>          Issue Type: Sub-task
>            Reporter: Guozhang Wang
>
> We need to add the memory management on the new consumer like we did in the 
> new producer. This would probably include:
> 1. byte buffer re-usage for fetch response partition data.
> 2. byte buffer re-usage for on-the-fly de-compression.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)