Hello Joe,

The reasons we make the producers produce to a fixed partition within each metadata-refresh interval are the following:

https://issues.apache.org/jira/browse/KAFKA-1017
https://issues.apache.org/jira/browse/KAFKA-959

So in a word, the randomness is still preserved, but within one metadata-refresh interval the assignment is fixed. I agree that the document should be updated accordingly.
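If you want the partition to be re-picked more often, you can lower that interval in the producer config. A rough sketch (the broker list and topic name are placeholders):

import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class StickyPartitionExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "localhost:9092");             // placeholder broker
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        // Re-pick a random partition every minute instead of the default 10 minutes.
        props.put("topic.metadata.refresh.interval.ms", "60000");

        Producer<String, String> producer =
            new Producer<String, String>(new ProducerConfig(props));
        // With no key, the producer sends to one randomly chosen partition and
        // sticks to it until the next metadata refresh.
        producer.send(new KeyedMessage<String, String>("perfpayload1", "hello"));
        producer.close();
    }
}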
Guozhang

On Fri, Sep 13, 2013 at 1:48 PM, Joe Stein <crypt...@gmail.com> wrote:

> Isn't this a bug?
>
> I don't see why we would want users to have to code and generate random
> partition keys to randomly distribute the data to partitions; that is
> Kafka's job, isn't it?
>
> Or, if supplying a null value is not supported, tell the user (throw an
> exception) in KeyedMessage like we do for topic, and not treat null as a
> key to hash?
>
> My preference is to put those three lines back in, let the key be null,
> and give folks randomness, unless it's not a bug and there is a good
> reason for it.
>
> Is there something about https://issues.apache.org/jira/browse/KAFKA-691
> that requires the lines taken out? I haven't had a chance to look through
> it yet.
>
> My thought is that a new person coming in would expect to see the
> partitions filling up in a round-robin fashion as our docs say, unless we
> force them in the API to know they have to do this, or give them the
> ability for this to happen when passing nothing in.
>
> /*******************************************
>  Joe Stein
>  Founder, Principal Consultant
>  Big Data Open Source Security LLC
>  http://www.stealth.ly
>  Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
> ********************************************/
>
>
> On Fri, Sep 13, 2013 at 4:17 PM, Drew Goya <d...@gradientx.com> wrote:
>
> > I ran into this problem as well, Prashant. The default partition key was
> > recently changed:
> >
> > https://github.com/apache/kafka/commit/b71e6dc352770f22daec0c9a3682138666f032be
> >
> > It no longer assigns a random partition to data with a null partition
> > key. I had to change my code to generate random partition keys to get
> > the randomly distributed behavior the producer used to have.
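The workaround Drew describes amounts to something like the following (a sketch only, not his actual code, reusing the same producer setup as in the snippet near the top of this message; the point is just that the key varies per message so the default partitioner's hash spreads the sends):

Random rnd = new Random();   // java.util.Random
// An explicit random key per message restores the spread across partitions,
// because the default partitioner hashes the key to pick a partition.
String key = Integer.toString(rnd.nextInt());
producer.send(new KeyedMessage<String, String>("perfpayload1", key, "hello"));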
> > On Fri, Sep 13, 2013 at 11:42 AM, prashant amar <amasin...@gmail.com> wrote:
> >
> > > Thanks Neha
> > >
> > > I will try applying this property and circle back.
> > >
> > > Also, I have been attempting to execute kafka-producer-perf-test.sh and
> > > I receive the following error:
> > >
> > > Error: Could not find or load main class kafka.perf.ProducerPerformance
> > >
> > > I am running against 0.8.0-beta1.
> > >
> > > It seems like perf is a separate project in the workspace.
> > >
> > > Does sbt package-assembly bundle the perf jar as well?
> > >
> > > Neither producer-perf-test nor consumer-test is working with this build.
> > >
> > >
> > > On Fri, Sep 13, 2013 at 9:56 AM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
> > >
> > > > As Jun suggested, one reason could be that
> > > > topic.metadata.refresh.interval.ms is too high. Did you observe whether
> > > > the distribution improves after topic.metadata.refresh.interval.ms has
> > > > passed?
> > > >
> > > > Thanks,
> > > > Neha
> > > >
> > > > On Fri, Sep 13, 2013 at 4:47 AM, prashant amar <amasin...@gmail.com> wrote:
> > > >
> > > > > I am using the kafka 0.8 version ...
> > > > >
> > > > > On Thu, Sep 12, 2013 at 8:44 PM, Jun Rao <jun...@gmail.com> wrote:
> > > > >
> > > > > > Which revision of 0.8 are you using? In a recent change, a producer
> > > > > > will stick to a partition for topic.metadata.refresh.interval.ms
> > > > > > (defaults to 10 mins) before picking another partition at random.
> > > > > >
> > > > > > Thanks,
> > > > > > Jun
> > > > > >
> > > > > > On Thu, Sep 12, 2013 at 1:56 PM, prashant amar <amasin...@gmail.com> wrote:
> > > > > >
> > > > > > > I created a topic with 4 partitions, and for some reason the
> > > > > > > producer is pushing only to one partition.
> > > > > > >
> > > > > > > This is consistently happening across all topics that I created ...
> > > > > > >
> > > > > > > Is there a specific configuration that I need to apply to ensure
> > > > > > > that load is evenly distributed across all partitions?
> > > > > > >
> > > > > > > Group       Topic         Pid  Offset  logSize  Lag  Owner
> > > > > > > perfgroup1  perfpayload1  0    10965   11220    255  perfgroup1_XXXX-0
> > > > > > > perfgroup1  perfpayload1  1    0       0        0    perfgroup1_XXXX-1
> > > > > > > perfgroup1  perfpayload1  2    0       0        0    perfgroup1_XXXXX-2
> > > > > > > perfgroup1  perfpayload1  3    0       0        0    perfgroup1_XXXXX-3

--
-- Guozhang
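For reference, a per-partition offset/lag table like the one above comes from the consumer offset checker tool that ships with 0.8; assuming a local ZooKeeper (and going from memory on the exact flag names), the invocation looks roughly like:

bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --zkconnect localhost:2181 --group perfgroup1 --topic perfpayload1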