Re: [DISCUSS] KIP-81: Max in-flight fetches

2017-03-06 Thread Mickael Maison
> Essentially this would limit reading fetch responses but allow for other responses to be processed.
>
> This is a sample of sizes for responses I collecte

Re: [DISCUSS] KIP-81: Max in-flight fetches

2017-01-19 Thread Jason Gustafson
> ther responses to be processed.
>
> This is a sample of sizes for responses I collected:
>
> * size=108 APIKEY=3 METADATA

Re: [DISCUSS] KIP-81: Max in-flight fetches

2017-01-18 Thread Mickael Maison
> * size=193 APIKEY=11 JOIN_GROUP
> * size=39 APIKEY=14 SYNC_GROUP
> * size=39 APIKEY=9 OFFSET_FETCH
> * size=45 APIKEY=2 LIST_OFFSETS
> * size=88926 APIKEY=1 FETCH

Re: [DISCUSS] KIP-81: Max in-flight fetches

2017-01-17 Thread radai
> * size=39 APIKEY=9 OFFSET_FETCH
> * size=45 APIKEY=2 LIST_OFFSETS
> * size=88926 APIKEY=1 FETCH
> * size=45 APIKEY=1 FETCH
> * size=6 APIKEY=12 HEARTBEAT

Re: [DISCUSS] KIP-81: Max in-flight fetches

2017-01-11 Thread Mickael Maison
> 2 HEARTBEAT
> * size=45 APIKEY=1 FETCH
> * size=45 APIKEY=1 FETCH
> * size=45 APIKEY=1 FETCH
> * size=6 APIKEY=12 HEARTBEAT
> * size=45 APIKEY=1 FETCH
> * size=45 APIKEY=1 FETCH

Re: [DISCUSS] KIP-81: Max in-flight fetches

2017-01-11 Thread Rajini Sivaram
> size=45 APIKEY=1 FETCH
>
> What do you think?
> ----------------------
> Edoardo Comar
> IBM MessageHub
> eco...@uk.ibm.com
> IBM UK Ltd, Hursley Park, SO21 2JN

Re: [DISCUSS] KIP-81: Max in-flight fetches

2017-01-11 Thread Mickael Maison
> ssageHub
> eco...@uk.ibm.com
> IBM UK Ltd, Hursley Park, SO21 2JN
>
> IBM United Kingdom Limited Registered in England and Wales with number 741598 Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU
>
> Fr

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-12-14 Thread Rajini Sivaram
> b
> eco...@uk.ibm.com
> IBM UK Ltd, Hursley Park, SO21 2JN
>
> IBM United Kingdom Limited Registered in England and Wales with number 741598 Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU
>
> From: Rajini Sivaram <rajin

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-12-14 Thread Edoardo Comar
red in England and Wales with number 741598 Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU

From: Rajini Sivaram <rajinisiva...@googlemail.com>
To: dev@kafka.apache.org
Date: 13/12/2016 17:27
Subject: Re: [DISCUSS] KIP-81: Max in-flight fetches
Co

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-12-13 Thread Rajini Sivaram
Coordinator starvation: For an implementation based on KIP-72, there will be coordinator starvation without KAFKA-4137 since you would stop reading from sockets when the memory pool is full (the fact that coordinator messages are small doesn't help). I imagine you can work around this by treating
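To make the starvation concern concrete, here is a minimal sketch. It assumes a hypothetical bounded pool (`StarvationSketch`, `tryReserve`, and `release` are illustrative names, not the MemoryPool API from KIP-72) and borrows two response sizes from the sample posted in this thread: 88926 bytes for a FETCH and 6 bytes for a HEARTBEAT. Once a fetch response has drained the pool, even a 6-byte coordinator response cannot be read until memory is released.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical bounded pool; a stand-in for illustration, not the MemoryPool from KIP-72.
class StarvationSketch {
    private final AtomicLong available;

    StarvationSketch(long capacityBytes) {
        this.available = new AtomicLong(capacityBytes);
    }

    // Reserves `size` bytes, or returns false when the pool cannot cover them.
    boolean tryReserve(long size) {
        while (true) {
            long current = available.get();
            if (current < size) {
                return false;
            }
            if (available.compareAndSet(current, current - size)) {
                return true;
            }
        }
    }

    // Returns previously reserved bytes to the pool.
    void release(long size) {
        available.addAndGet(size);
    }

    public static void main(String[] args) {
        // Pool sized to exactly one large FETCH response (assumed for the demo).
        StarvationSketch pool = new StarvationSketch(88926);

        boolean fetchRead = pool.tryReserve(88926); // FETCH response fills the pool
        boolean heartbeatRead = pool.tryReserve(6); // HEARTBEAT response is refused
        System.out.println("fetch=" + fetchRead + " heartbeat=" + heartbeatRead);

        pool.release(88926); // the application has consumed the fetched data
        System.out.println("heartbeat after release=" + pool.tryReserve(6));
    }
}
```

The sketch only shows the problem: the small size of coordinator messages does not help when the pool has nothing left. It takes no position on which workaround the KIP should adopt.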

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-12-13 Thread Mickael Maison
Thanks for all the feedback. I've updated the KIP with all the details. Below are a few of the main points: - Overall memory usage of the consumer: I made it clear the memory pool is only used to store the raw bytes from the network and that the decompressed/deserialized messages are not stored

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-12-13 Thread Ismael Juma
Makes sense Jay. Mickael, in addition to how we can compute defaults of the other settings from `buffer.memory`, it would be good to specify what is allowed and how we handle the different cases (e.g. what do we do if `max.partition.fetch.bytes` is greater than `buffer.memory`, is that simply not
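One way to picture the question is a startup check over the three settings. The sketch below is an assumption, not something the KIP specifies: the class and method names are hypothetical, `buffer.memory` is the proposed consumer setting rather than an existing one, and treating an oversized value as a conflict is just one of the policies that could be chosen (clamping or warning would be others). The fallback values are the consumer defaults for `fetch.max.bytes` (50 MB) and `max.partition.fetch.bytes` (1 MB).

```java
import java.util.Properties;

// Hypothetical consistency check; the policy shown is an assumption, not the KIP's decision.
class FetchConfigCheckSketch {

    // Returns null when the settings are consistent, otherwise a description of the conflict.
    // Throws NumberFormatException if buffer.memory is missing or not a number.
    static String check(Properties props) {
        long bufferMemory = Long.parseLong(props.getProperty("buffer.memory"));
        long fetchMax = Long.parseLong(props.getProperty("fetch.max.bytes", "52428800"));
        long partitionMax = Long.parseLong(props.getProperty("max.partition.fetch.bytes", "1048576"));

        if (partitionMax > bufferMemory) {
            return "max.partition.fetch.bytes (" + partitionMax
                    + ") is greater than buffer.memory (" + bufferMemory + ")";
        }
        if (fetchMax > bufferMemory) {
            return "fetch.max.bytes (" + fetchMax
                    + ") is greater than buffer.memory (" + bufferMemory + ")";
        }
        return null;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("buffer.memory", "524288");              // 512 KB, deliberately small
        props.setProperty("max.partition.fetch.bytes", "1048576"); // 1 MB
        System.out.println(check(props));
    }
}
```

Whether such a combination should be rejected, silently adjusted, or allowed is exactly the open point raised here.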

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-12-13 Thread Jay Kreps
Hey Ismael, Yeah I think we are both saying the same thing---removing only works if you have a truly optimal strategy. Actually even dynamically computing a reasonable default isn't totally obvious (do you set fetch.max.bytes to equal buffer.memory to try to queue up as much data in the network

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-12-12 Thread Ismael Juma
Hi Jay, About `max.partition.fetch.bytes`, yes it was an oversight not to lower its priority as part of KIP-74 given the existence of `fetch.max.bytes` and the fact that we can now make progress in the presence of oversized messages independently of either of those settings. I agree that we

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-12-12 Thread Jay Kreps
I think the question is whether we have a truly optimal strategy for deriving the partition- and fetch-level configs from the global setting. If we do then we should just get rid of them. If not, then if we can at least derive usually good and never terrible settings from the global limit at

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-12-12 Thread Jason Gustafson
Yeah, that's a good point. Perhaps in retrospect, it would have been better to define `buffer.memory` first and let `fetch.max.bytes` be based off of it. I like `buffer.memory` since it gives the consumer nice symmetry with the producer and its generic naming gives us some flexibility internally

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-12-10 Thread Jay Kreps
Jason, it's not just decompression but also the conversion from packed bytes to java objects, right? That can be even larger than the decompression blow up. I think this may be okay, the problem may just be that the naming is a bit misleading. In the producer you are literally allocating a buffer

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-12-09 Thread Jason Gustafson
Hi Mickael, I think the approach looks good, just a few minor questions: 1. The KIP doesn't say what the default value of `buffer.memory` will be. Looks like we use 50MB as the default for `fetch.max.bytes`, so perhaps it makes sense to set the default based on that. Might also be worth

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-12-02 Thread Mickael Maison
It's been a few days since the last comments. KIP-72 vote seems to have passed so if I don't get any new comments I'll start the vote on Monday. Thanks On Mon, Nov 14, 2016 at 6:25 PM, radai wrote: > +1 - there is a need for an effective way to control kafka memory

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-11-14 Thread radai
+1 - there is a need for an effective way to control kafka memory consumption - both on the broker and on clients. i think we could even reuse the exact same param name - *queued.max.bytes* - as it would serve the exact same purpose. also (and again it's the same across the broker and clients)

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-11-14 Thread Mickael Maison
Thanks for all the replies. I've updated the KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-81%3A+Bound+Fetch+memory+usage+in+the+consumer The main point is to selectively read from sockets instead of throttling FetchRequests sends. I also mentioned it will be reusing the MemoryPool

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-11-09 Thread radai
selectively reading from sockets achieves memory control (up to and not including talk of (de)compression) this is exactly what i (also, even mostly) did for kip-72 - which i hope in itself should be a reason to think about both KIPs at the same time because the changes will be similar (at least

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-11-02 Thread Mickael Maison
Thanks for all the feedback. I agree, throttling the requests sent will most likely result in a loss of throughput -> BAD ! As suggested, selectively reading from the socket should make it possible to control memory usage without impacting performance. I've had a look at that today and I can see how that

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-11-02 Thread Jay Kreps
Hey Radai, I think there are a couple discussions here. The first is about what is the interface to the user. The other is about what is exposed in the protocol, and implementation details of reading requests. I strongly agree with giving the user a simple "use X MB of memory" config and we

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-11-02 Thread Gwen Shapira
+1 On Wed, Nov 2, 2016 at 10:34 AM, radai wrote: > In my opinion a lot of kafka configuration options were added using the > "minimal diff" approach, which results in very nuanced and complicated > configs required to indirectly achieve some goal. case in point -

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-11-02 Thread radai
In my opinion a lot of kafka configuration options were added using the "minimal diff" approach, which results in very nuanced and complicated configs required to indirectly achieve some goal. case in point - timeouts. The goal here is to control the memory requirement. the 1st config was max

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-10-31 Thread Joel Koshy
Agreed with this approach. One detail to be wary of is that since we multiplex various other requests (e.g., heartbeats, offset commits, metadata, etc.) over the client that connects to the coordinator this could delay some of these critical requests. Realistically I don't think it will be an

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-10-30 Thread Jun Rao
Hi, Mickael, I agree with others that it's better to be able to control the bytes the consumer can read from sockets, instead of limiting the fetch requests. KIP-72 has a proposal to bound the memory size at the socket selector level. Perhaps that can be leveraged in this KIP too.

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-10-27 Thread Jay Kreps
This is a good observation on limiting total memory usage. If I understand the proposal I think it is that the consumer client would stop sending fetch requests once a certain number of in-flight fetch requests is met. I think a better approach would be to always issue one fetch request to each

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-10-26 Thread Jason Gustafson
Hey Mickael, Thanks for picking this up and sorry for the late comment. In the proposed changes section, you have the following: > Update Fetcher.java to check the number of existing in-flight fetches (this is already tracked by numInFlightFetches) before initiating new fetch requests in

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-10-13 Thread Mickael Maison
I've now updated the KIP. New link as I've updated the title: https://cwiki.apache.org/confluence/display/KAFKA/KIP-81%3A+Bound+Fetch+memory+usage+in+the+consumer Any further feedback welcome ! On Tue, Oct 11, 2016 at 6:00 PM, Mickael Maison wrote: > Thanks for the

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-10-11 Thread Mickael Maison
Thanks for the feedback. Regarding the config name, I agree it's probably best to reuse the same name as the producer (buffer.memory) whichever implementation we decide to use. At first, I opted for limiting the max number of concurrent fetches as it felt more natural in the Fetcher code.

Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-10-10 Thread Ismael Juma
Hi Mickael, Thanks for the KIP. A quick comment on the rejected alternative of using a bounded memory pool: "While this might be the best way to handle that on the server side it's unclear if this would suit the client well. Usually the client has a rough idea about how many partitions it will