In your case, if you have four JVMs (i.e., consumer processes), six
threads per consumer process and 12 partitions, then each assigned thread
would get only one partition, but the first two processes would get all the
partitions and the last two processes would be idle. We could tweak
the assignment strate
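
The arithmetic above can be checked with a small simulation. This is only a sketch of a range-style assignment, assuming threads are sorted by id and each gets a contiguous block of partitions; the thread ids (`jvm1-thread0`, ...) and the function name `range_assign` are hypothetical and not taken from Kafka itself.

```python
from collections import Counter


def range_assign(thread_ids, num_partitions):
    """Give each thread a contiguous block of partitions, in sorted thread order.

    Simplified model (an assumption, not Kafka's actual code): the first
    `extra` threads receive one partition more than the rest.
    """
    threads = sorted(thread_ids)
    base, extra = divmod(num_partitions, len(threads))
    assignment = {}
    start = 0
    for pos, thread in enumerate(threads):
        count = base + (1 if pos < extra else 0)
        assignment[thread] = list(range(start, start + count))
        start += count
    return assignment


# Four JVMs with six threads each (hypothetical ids), one 12-partition topic.
threads = [f"jvm{j}-thread{t}" for j in range(1, 5) for t in range(6)]
assignment = range_assign(threads, 12)

# Count how many partitions each process ends up owning.
per_process = Counter()
for thread, parts in assignment.items():
    per_process[thread.split("-")[0]] += len(parts)

print(dict(per_process))  # {'jvm1': 6, 'jvm2': 6, 'jvm3': 0, 'jvm4': 0}
```

Under these assumptions, the twelve threads that sort first all belong to the first two processes, which is why the last two stay idle.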
Hi Joel,
Yes, I am on the Kafka trunk branch. In my scenario, do back-up
threads affect the allocation? If I have 24 threads (6 threads
per JVM, 4 JVMs in total) in the above example, does the partition allocation
get evenly distributed (3 on each JVM)? Is this a supported use case?
BTW, "roundrobin" was a recent addition so you would need to be on
trunk to use that. The partition assignor will lay out all the
available consumer threads and all the available partitions in a
deterministic order (based on a hashcode); it then uses a circular
iterator over the consumers and the
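
The circular-iterator idea can be sketched as follows. This is an illustration only: the caller supplies the orderings, and the two orderings used here (plain sorted ids, and ids interleaved across processes) are assumptions chosen to show how much the layout order matters. The sketch does not reproduce the hashcode-based order described above, and the thread ids and function names are hypothetical.

```python
from itertools import cycle


def round_robin_assign(ordered_threads, ordered_partitions):
    """Hand out partitions one at a time, cycling over the threads in order."""
    if not ordered_threads:
        return {}
    assignment = {thread: [] for thread in ordered_threads}
    thread_iter = cycle(ordered_threads)  # circular iterator over the threads
    for partition in ordered_partitions:
        assignment[next(thread_iter)].append(partition)
    return assignment


def partitions_per_process(assignment):
    """Sum partition counts per process, taking the id prefix as the process."""
    counts = {}
    for thread, parts in assignment.items():
        process = thread.split("-")[0]
        counts[process] = counts.get(process, 0) + len(parts)
    return counts


# Four JVMs with six threads each (hypothetical ids), 12 partitions.
threads = [f"jvm{j}-thread{t}" for j in range(1, 5) for t in range(6)]
partitions = list(range(12))

# Ordering 1: sorted ids, so all threads of one process sit next to each other.
grouped = round_robin_assign(sorted(threads), partitions)
print(partitions_per_process(grouped))
# {'jvm1': 6, 'jvm2': 6, 'jvm3': 0, 'jvm4': 0}

# Ordering 2: interleave processes (thread0 of every JVM, then thread1, ...).
interleaved_order = sorted(threads, key=lambda t: (t.split("-")[1], t.split("-")[0]))
interleaved = round_robin_assign(interleaved_order, partitions)
print(partitions_per_process(interleaved))
# {'jvm1': 3, 'jvm2': 3, 'jvm3': 3, 'jvm4': 3}
```

When there are more threads than partitions, the iterator never wraps around, so which processes receive partitions depends entirely on the order in which the threads are laid out.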
Hi Joel,
Correction to my previous question: What is the expected behavior of the
*roundrobin* policy in the above scenario?
Thanks,
Bhavesh
On Thu, Oct 30, 2014 at 1:39 PM, Bhavesh Mistry
wrote:
> Hi Joel,
>
> I have similar issue. I have tried *partition.assignment.strategy=*
> *"roundrobin"*, but ho
Hi Joel,
I have a similar issue. I have tried *partition.assignment.strategy=*
*"roundrobin"*, but how do you expect this to work?
We have a topic with 32 partitions and 4 JVMs with 10 threads each (8 are
backups in case one of the JVMs goes down). The roundrobin does not select all the
JVM only 3 J
> example: launching 4 processes on 4 different machines with 4 threads per
> process on a 12-partition topic will leave each machine with 3 assigned
> threads and one doing nothing. Moreover, no matter what number of threads
> each process has, as long as it is bigger than 3, the end result
>
Jun, Joel,
The issue here is exactly which threads are left out, and which threads are
assigned partitions.
Maybe I am missing something but what I want is to balance consuming
threads across machines/processes, regardless of the number of threads the
machine launches (side effect: this way if you
Shlomi,
If you are on trunk, and your consumer subscriptions are identical
then you can try a slightly different partition assignment strategy.
Try setting partition.assignment.strategy="roundrobin" in your
consumer config.
Thanks,
Joel
On Wed, Oct 29, 2014 at 06:29:30PM -0700, Jun Rao wrote:
>
By consumer, I actually mean consumer threads (the thread # you used when
creating consumer streams). So, if you have 4 consumers, each with 4
threads, 4 of the threads will not get any data with 12 partitions. It
sounds like that's not what you get? What's the output of the
ConsumerOffsetChecker
Jun,
I hear you say "partitions are evenly distributed among all consumers in
the same group", yet I did bump into a case where launching a process with
X high-level consumer API threads took over all partitions, leaving
existing consumers unemployed.
According to the claim above, and if I
You can take a look at the "consumer rebalancing algorithm" part in
http://kafka.apache.org/documentation.html. Basically, partitions are
evenly distributed among all consumers in the same group. If there are more
consumers in a group than partitions, some consumers will never get any
data.
Thanks
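
As a rough illustration of "evenly distributed" and of what happens when consumers outnumber partitions, the share sizes can be computed directly. This is a minimal sketch, assuming every consumer thread counts as one consumer and that any remainder goes to the threads that come first; `share_sizes` is a hypothetical helper, not part of Kafka.

```python
def share_sizes(num_partitions, num_threads):
    """Return how many partitions each consumer thread gets, in thread order."""
    base, extra = divmod(num_partitions, num_threads)
    # The first `extra` threads take one more partition than the others.
    return [base + 1 if i < extra else base for i in range(num_threads)]


# 12 partitions over 4 consumer threads: an even split.
print(share_sizes(12, 4))  # [3, 3, 3, 3]

# 12 partitions over 16 consumer threads: some threads never get data.
sizes = share_sizes(12, 16)
print(sizes.count(0))  # 4
```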
Hi All,
Using Kafka's high-level consumer API I have bumped into a situation where
launching a consumer process P with X consuming threads on a topic with X
partitions kicks out all other existing consumer threads that consumed prior
to launching the process P.
That is, consumer process P is stealing al