Essentially this would limit reading fetch responses but allow for other
responses to be processed.

This is a sample of sizes for responses I collected:

* size=108 APIKEY=3 METADATA
* size=193 APIKEY=11 JOIN_GROUP
* size=39 APIKEY=14 SYNC_GROUP
* size=39 APIKEY=9 OFFSET_FETCH
* size=45 APIKEY=2 LIST_OFFSETS
* size=88926 APIKEY=1 FETCH
* size=45 APIKEY=1 FETCH
* size=6 APIKEY=12 HEARTBEAT
* size=45 APIKEY=1 FETCH
* size=45 APIKEY=1 FETCH
* size=45 APIKEY=1 FETCH
* size=6 APIKEY=12 HEARTBEAT
* size=45 APIKEY=1 FETCH
* size=45 APIKEY=1 FETCH
* size=45 APIKEY=1 FETCH
What do you think?
----------------------
Edoardo Comar
IBM MessageHub
eco...@uk.ibm.com
IBM UK Ltd, Hursley Park, SO21 2JN
IBM United Kingdom Limited Registered in England and Wales with number
741598 Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6
3AU
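The sample above shows a single FETCH response (88926 bytes) dwarfing all the coordinator traffic (6-193 bytes). A minimal sketch of the selective-reading idea, with invented names and a deliberately simplified byte accounting (this is not the actual Fetcher/Selector code):

```java
// Sketch only: when the bounded pool cannot hold a large FETCH response,
// stop reading that data (the channel would be muted), but keep draining
// small control responses (HEARTBEAT, SYNC_GROUP, ...) so the consumer
// stays in its group. Non-FETCH responses are assumed small enough to
// always accept, which is a simplification.
class SelectiveReader {
    private long available; // bytes left in the bounded memory pool

    SelectiveReader(long poolBytes) {
        this.available = poolBytes;
    }

    /** Returns true if the response was read, false if reading was deferred. */
    boolean tryRead(String apiName, int size) {
        if (apiName.equals("FETCH") && size > available) {
            return false; // defer the large fetch response; mute this channel
        }
        available -= size; // control responses are always processed here
        return true;
    }
}
```

With the sampled sizes, a 50 KB pool would defer the 88926-byte FETCH response while heartbeats and group-protocol responses continue to be processed.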
From: Rajini Sivaram <rajinisiva...@googlemail.com>
To: dev@kafka.apache.org
Date: 13/12/2016 17:27
Subject: Re: [DISCUSS] KIP-81: Max in-flight fetches
Coordinator starvation: For an implementation based on KIP-72, there will
be coordinator starvation without KAFKA-4137 since you would stop reading
from sockets when the memory pool is full (the fact that coordinator
messages are small doesn't help). I imagine you can work around this by
treating …
Thanks for all the feedback.
I've updated the KIP with all the details.
Below are a few of the main points:
- Overall memory usage of the consumer:
I made it clear the memory pool is only used to store the raw bytes
from the network and that the decompressed/deserialized messages are
not stored
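The accounting described above can be sketched as follows. `BoundedPool` is a stand-in loosely modelled on the KIP-72 pool, not the real Kafka class: only the raw network bytes of a response are charged against the pool, and the buffer is released once the records have been decompressed/deserialized onto the regular heap.

```java
// Illustrative bounded pool: raw response bytes are drawn from a fixed
// budget; decompressed/deserialized messages are NOT charged against it.
import java.nio.ByteBuffer;

class BoundedPool {
    private long available;

    BoundedPool(long sizeBytes) {
        this.available = sizeBytes;
    }

    /** Returns null when the pool is exhausted; the caller stops reading. */
    ByteBuffer tryAllocate(int bytes) {
        if (bytes > available) return null;
        available -= bytes;
        return ByteBuffer.allocate(bytes);
    }

    /** Called after the raw bytes have been parsed into records. */
    void release(ByteBuffer buffer) {
        available += buffer.capacity();
    }

    long available() {
        return available;
    }
}
```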
Makes sense Jay.
Mickael, in addition to how we can compute defaults of the other settings
from `buffer.memory`, it would be good to specify what is allowed and how
we handle the different cases (e.g. what do we do if
`max.partition.fetch.bytes`
is greater than `buffer.memory`, is that simply not …
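One way the interplay questioned above could be handled, purely as an illustration (the property names are real consumer configs, but this clamping policy is just one option, not what the KIP decides): if `max.partition.fetch.bytes` exceeds `buffer.memory`, a single partition's response could never fit in the pool, so either reject the configuration or clamp it.

```java
// Illustrative policy: clamp max.partition.fetch.bytes to buffer.memory
// (throwing a config exception would be the stricter alternative).
class FetchConfigCheck {
    static long effectiveMaxPartitionFetchBytes(long bufferMemory,
                                                long maxPartitionFetchBytes) {
        return Math.min(maxPartitionFetchBytes, bufferMemory);
    }
}
```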
Hey Ismael,
Yeah I think we are both saying the same thing---removing only works if you
have a truly optimal strategy. Actually even dynamically computing a
reasonable default isn't totally obvious (do you set fetch.max.bytes to
equal buffer.memory to try to queue up as much data in the network …
Hi Jay,
About `max.partition.fetch.bytes`, yes it was an oversight not to lower its
priority as part of KIP-74 given the existence of `fetch.max.bytes` and the
fact that we can now make progress in the presence of oversized messages
independently of either of those settings.
I agree that we …
I think the question is whether we have a truly optimal strategy for
deriving the partition- and fetch-level configs from the global setting. If
we do then we should just get rid of them. If not, then if we can at least
derive usually good and never terrible settings from the global limit at …
Yeah, that's a good point. Perhaps in retrospect, it would have been better
to define `buffer.memory` first and let `fetch.max.bytes` be based off of
it. I like `buffer.memory` since it gives the consumer nice symmetry with
the producer and its generic naming gives us some flexibility internally
Jason, it's not just decompression but also the conversion from packed
bytes to java objects, right? That can be even larger than the
decompression blow up. I think this may be okay, the problem may just be
that the naming is a bit misleading. In the producer you are literally
allocating a buffer
Hi Mickael,
I think the approach looks good, just a few minor questions:
1. The KIP doesn't say what the default value of `buffer.memory` will be.
Looks like we use 50MB as the default for `fetch.max.bytes`, so perhaps it
makes sense to set the default based on that. Might also be worth …
It's been a few days since the last comments. KIP-72 vote seems to
have passed so if I don't get any new comments I'll start the vote on
Monday.
Thanks
On Mon, Nov 14, 2016 at 6:25 PM, radai wrote:
+1 - there is a need for an effective way to control kafka memory
consumption - both on the broker and on clients.
i think we could even reuse the exact same param name - *queued.max.bytes* -
as it would serve the exact same purpose.
also (and again it's the same across the broker and clients) …
Thanks for all the replies.
I've updated the KIP:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-81%3A+Bound+Fetch+memory+usage+in+the+consumer
The main point is to selectively read from sockets instead of
throttling FetchRequests sends. I also mentioned it will be reusing
the MemoryPool
selectively reading from sockets achieves memory control (up to and not
including talk of (de)compression)
this is exactly what i (also, even mostly) did for kip-72 - which i hope in
itself should be a reason to think about both KIPs at the same time because
the changes will be similar (at least …
Thanks for all the feedback.
I agree, throttling the requests sent will most likely result in a
loss of throughput -> BAD !
As suggested, selectively reading from the socket should enable to
control the memory usage without impacting performance. I've had a look
at that today and I can see how that …
Hey Radai,
I think there are a couple discussions here. The first is about what is the
interface to the user. The other is about what is exposed in the protocol,
and implementation details of reading requests. I strongly agree with
giving the user a simple "use X MB of memory" config and we …
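A "use X MB of memory" config might be used as sketched below. Note the `buffer.memory` key on the consumer is the KIP's proposal, not a released consumer config; the broker address and group id are placeholder example values.

```java
// Hypothetical usage if the consumer gains a `buffer.memory` setting
// mirroring the producer's (proposed in KIP-81, not a released config):
import java.util.Properties;

class ConsumerMemoryConfig {
    static Properties build() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // example endpoint
        props.put("group.id", "example-group");           // example group
        props.put("buffer.memory", "33554432"); // cap raw fetched bytes at 32 MB
        return props;
    }
}
```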
+1
On Wed, Nov 2, 2016 at 10:34 AM, radai wrote:
In my opinion a lot of kafka configuration options were added using the
"minimal diff" approach, which results in very nuanced and complicated
configs required to indirectly achieve some goal. case in point - timeouts.
The goal here is to control the memory requirement. the 1st config was max …
Agreed with this approach.
One detail to be wary of is that since we multiplex various other requests
(e.g., heartbeats, offset commits, metadata, etc.) over the client that
connects to the coordinator this could delay some of these critical
requests. Realistically I don't think it will be an …
Hi, Mickael,
I agree with others that it's better to be able to control the bytes the
consumer can read from sockets, instead of limiting the fetch requests.
KIP-72 has a proposal to bound the memory size at the socket selector
level. Perhaps that can be leveraged in this KIP too.
This is a good observation on limiting total memory usage. If I understand
the proposal I think it is that the consumer client would stop sending
fetch requests once a certain number of in-flight fetch requests is met. I
think a better approach would be to always issue one fetch request to each …
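The alternative floated above can be sketched as a per-broker gate (the names here are invented for illustration and are not the Fetcher's actual API): rather than capping a global count of in-flight fetches, keep at most one fetch in flight per broker so every broker with assigned partitions keeps making progress.

```java
// Illustrative gate: at most one in-flight fetch per broker.
import java.util.HashSet;
import java.util.Set;

class PerBrokerFetchGate {
    private final Set<Integer> inFlight = new HashSet<>();

    /** Returns true if a fetch to this broker may be sent now. */
    boolean trySend(int brokerId) {
        return inFlight.add(brokerId); // false if one is already in flight
    }

    /** Called once the broker's fetch response has been handled. */
    void onResponse(int brokerId) {
        inFlight.remove(brokerId);
    }
}
```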
Hey Mickael,
Thanks for picking this up and sorry for the late comment. In the proposed
changes section, you have the following:
Update Fetcher.java to check the number of existing in-flight fetches (this
> is already tracked by numInFlightFetches) before initiating new fetch
> requests in …
I've now updated the KIP.
New link as I've updated the title:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-81%3A+Bound+Fetch+memory+usage+in+the+consumer
Any further feedback welcome!
On Tue, Oct 11, 2016 at 6:00 PM, Mickael Maison wrote:
Thanks for the feedback.
Regarding the config name, I agree it's probably best to reuse the
same name as the producer (buffer.memory) whichever implementation we
decide to use.
At first, I opted for limiting the max number of concurrent fetches as
it felt more natural in the Fetcher code.
Hi Mickael,
Thanks for the KIP. A quick comment on the rejected alternative of using a
bounded memory pool:
"While this might be the best way to handle that on the server side it's
unclear if this would suit the client well. Usually the client has a rough
idea about how many partitions it will …