From the FAQ:

"To reduce # of open sockets, in 0.8.0 (
https://issues.apache.org/jira/browse/KAFKA-1017), when the partitioning
key is not specified or null, a producer will pick a random partition and
stick to it for some time (default is 10 mins) before switching to another
one. So, if there are fewer producers than partitions, at a given point of
time, some partitions may not receive any data. To alleviate this problem,
one can either reduce the metadata refresh interval or specify a message
key and a customized random partitioner. For more detail see this thread
http://mail-archives.apache.org/mod_mbox/kafka-dev/201310.mbox/%3CCAFbh0Q0aVh%2Bvqxfy7H-%2BMnRFBt6BnyoZk1LWBoMspwSmTqUKMg%40mail.gmail.com%3E
"


On Wed, Jul 15, 2015 at 4:13 PM, Stefan Miklosovic <mikloso...@gmail.com>
wrote:

> Maybe there is some reason why the producer sticks with a partition for
> some period of time - mostly performance related. I can imagine that
> constantly switching between partitions could be slow, in the sense
> that the producer has to "refocus" on another partition to send a
> message to; this switching may cost something, so it happens only
> sporadically.
>
> On the other hand, I would never have expected the behaviour I
> encountered. If it is advertised as "random", I expect it to be really
> random, not "random but ... not random every time". It is hard to find
> this information; the only way seems to be to try all the other
> solutions, and the most awkward one, the one you would never expect to
> work, turns out to be the proper one ...
>
> On Thu, Jul 16, 2015 at 12:53 AM, JIEFU GONG <jg...@berkeley.edu> wrote:
> > This is a total shot in the dark here, so please ignore it if it
> > fails to make sense, but I remember that in a previous implementation
> > of the producer, before round-robin was enabled, producers would send
> > messages to only one of the partitions for a set period of time
> > (configurable, I believe) before moving on to the next one. This
> > caused me a similar grievance, as I would notice that only a few of
> > my consumers would get data while the others were completely idle.
> >
> > Sounds similar, so check if that's a possibility at all?
> >
> > On Wed, Jul 15, 2015 at 3:04 PM, Jagbir Hooda <jho...@gmail.com> wrote:
> >
> >> Hi Stefan,
> >>
> >> Have you looked at the output of the following command, which shows
> >> the message distribution across the topic-partitions and which
> >> topic-partition is consumed by which consumer thread?
> >>
> >> kafka-server/bin>./kafka-run-class.sh
> >> kafka.tools.ConsumerOffsetChecker --zkconnect localhost:2181 --group
> >> <consumer_group_name>
> >>
> >> Jagbir
> >>
> >> On Wed, Jul 15, 2015 at 12:50 PM, Stefan Miklosovic
> >> <mikloso...@gmail.com> wrote:
> >> > I have the following problem; I have tried almost everything I
> >> > could, but without any luck.
> >> >
> >> > All I want to do is have 1 producer, 1 topic, 10 partitions, and
> >> > 10 consumers.
> >> >
> >> > All I want is to send 1M messages via the producer to these 10
> >> > consumers.
> >> >
> >> > I am using Kafka 0.8.3 built from current upstream, so I have
> >> > bleeding-edge stuff. It does not work on the 0.8.1.1 or 0.8.2
> >> > streams either.
> >> >
> >> > The problem is that I expect that when I send 1 million messages
> >> > via that producer, all consumers will be busy. In other words, if
> >> > each message sent via the producer goes to a random partition
> >> > (round-robin / range), I expect all 10 consumers to process about
> >> > 100k messages each, because the producer sends each message to a
> >> > random one of these 10 partitions.
> >> >
> >> > But I have never achieved such an outcome.
> >> >
> >> > I was trying these combinations:
> >> >
> >> > 1) old scala producer vs old scala consumer
> >> >
> >> > The consumer was created via Consumers.createJavaConsumer() ten
> >> > times. Every consumer runs in a separate thread, roughly as
> >> > sketched below.
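> >> >
> >> > (A minimal sketch of that setup, assuming the standard 0.8
> >> > high-level consumer API; the ZooKeeper address, group id, topic
> >> > name and thread pool handling are illustrative, not my exact code:)
> >> >
> >> > import java.util.Collections;
> >> > import java.util.List;
> >> > import java.util.Map;
> >> > import java.util.Properties;
> >> > import java.util.concurrent.ExecutorService;
> >> > import java.util.concurrent.Executors;
> >> >
> >> > import kafka.consumer.ConsumerConfig;
> >> > import kafka.consumer.KafkaStream;
> >> > import kafka.javaapi.consumer.ConsumerConnector;
> >> > import kafka.message.MessageAndMetadata;
> >> >
> >> > public class TenStreamConsumerSketch {
> >> >     public static void main(String[] args) {
> >> >         Properties props = new Properties();
> >> >         props.put("zookeeper.connect", "localhost:2181");
> >> >         props.put("group.id", "test-group");
> >> >
> >> >         ConsumerConnector connector = kafka.consumer.Consumer
> >> >             .createJavaConsumerConnector(new ConsumerConfig(props));
> >> >
> >> >         // Ask for 10 streams of the topic; the high-level consumer
> >> >         // spreads the 10 partitions across these streams.
> >> >         Map<String, List<KafkaStream<byte[], byte[]>>> streams =
> >> >             connector.createMessageStreams(
> >> >                 Collections.singletonMap("my-topic", 10));
> >> >
> >> >         ExecutorService pool = Executors.newFixedThreadPool(10);
> >> >         for (final KafkaStream<byte[], byte[]> stream :
> >> >                 streams.get("my-topic")) {
> >> >             pool.submit(() -> {
> >> >                 // Each thread drains one stream and logs which
> >> >                 // partition its messages come from.
> >> >                 for (MessageAndMetadata<byte[], byte[]> m : stream) {
> >> >                     System.out.println(Thread.currentThread().getName()
> >> >                         + " got a message from partition " + m.partition());
> >> >                 }
> >> >             });
> >> >         }
> >> >     }
> >> > }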
> >> >
> >> > 2) old scala producer vs new java consumer
> >> >
> >> > The new consumer was used such that I have 10 consumers listening
> >> > on the topic, each subscribed to exactly 1 partition (consumer 1 -
> >> > partition 1, consumer 2 - partition 2, and so on).
> >> >
> >> > 3) old scala producer with custom partitioner
> >> >
> >> > I even tried to use my own partitioner: I just generated a random
> >> > number from 0 to 9, so I expected the messages to be sent randomly
> >> > to the partition with that number.
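> >> >
> >> > (A sketch of what such a partitioner looks like against the 0.8
> >> > kafka.producer.Partitioner interface; the class name is
> >> > illustrative, not my exact code:)
> >> >
> >> > import java.util.Random;
> >> >
> >> > import kafka.producer.Partitioner;
> >> > import kafka.utils.VerifiableProperties;
> >> >
> >> > public class RandomPartitioner implements Partitioner {
> >> >     private final Random random = new Random();
> >> >
> >> >     // The producer instantiates partitioner.class reflectively and
> >> >     // passes the producer properties to this constructor.
> >> >     public RandomPartitioner(VerifiableProperties props) {
> >> >     }
> >> >
> >> >     @Override
> >> >     public int partition(Object key, int numPartitions) {
> >> >         // Ignore the key and pick a uniformly random partition. Note
> >> >         // that the 0.8 producer only calls the partitioner for
> >> >         // messages with a non-null key; with a null key it falls
> >> >         // back to the "sticky" random partition of KAFKA-1017.
> >> >         return random.nextInt(numPartitions);
> >> >     }
> >> > }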
> >> >
> >> > All I see is that only a couple of these 10 consumers are
> >> > utilized, even though I am sending 1M messages; all I get from the
> >> > debugging output is some preselected set of consumers which
> >> > appears to have been chosen randomly.
> >> >
> >> > Do you have ANY hint as to why all consumers are not utilized,
> >> > even though partitions are selected randomly?
> >> >
> >> > My initial suspicion was that rebalancing was done badly. The
> >> > thing is that I was creating the old consumers in a loop, quickly
> >> > one after another, and I can imagine that the rebalancing
> >> > algorithm got confused.
> >> >
> >> > So I abandoned that solution and thought: let's just subscribe
> >> > these consumers one by one to specific partitions, so that I have
> >> > 1 consumer subscribed to exactly 1 partition and there is no
> >> > rebalancing at all.
> >> >
> >> > Oh my, how wrong I was ... nothing changed.
> >> >
> >> > So I was thinking that, with 10 consumers each subscribed to 1
> >> > partition, maybe the producer is just sending messages to some
> >> > subset of the partitions and that's it. I was not sure how this
> >> > could be possible, so to be absolutely sure that messages are
> >> > spread evenly across the partitions, I used a custom partitioner
> >> > class in the old producer, so that the partition each message is
> >> > sent to is truly random.
> >> >
> >> > But that does not seem to work either.
> >> >
> >> > Please people, help me.
> >> >
> >> > --
> >> > Stefan Miklosovic
> >>
> >
> >
> >
> > --
> >
> > Jiefu Gong
> > University of California, Berkeley | Class of 2017
> > B.A Computer Science | College of Letters and Sciences
> >
> > jg...@berkeley.edu <elise...@berkeley.edu> | (925) 400-3427
>
>
>
> --
> Stefan Miklosovic
>
