Re: Is Kafka documentation regarding null key misleading?

2014-12-11 Thread Steven Wu
Guozhang, can you point me to the code that implements the periodic/sticky random partitioner? I'd actually like to try it out in our env, even though I assume it is NOT ported to the 0.8.2 Java producer. Thanks, Steven On Mon, Dec 8, 2014 at 1:43 PM, Guozhang Wang wangg...@gmail.com wrote: Hi Yury,

Re: Is Kafka documentation regarding null key misleading?

2014-12-11 Thread Guozhang Wang
Steven, You can take a look at kafka.producer.async.DefaultEventHandler, in the getPartition function. Guozhang On Thu, Dec 11, 2014 at 9:58 AM, Steven Wu stevenz...@gmail.com wrote: Guozhang, can you point me to the code that implements periodic/sticky random partitioner? I actually like to

Re: Is Kafka documentation regarding null key misleading?

2014-12-08 Thread Guozhang Wang
Hi Yury, Originally the producer behavior under a null key was fully random, but it was later changed to this periodic random to reduce the number of sockets on the server side: imagine you have n brokers and m producers where m >> n; with fully random distribution each server will need to maintain a
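
The socket argument above can be made concrete with a small simulation. This is only a sketch under simplifying assumptions that are not from the thread: one partition per broker, one connection per (producer, broker) pair, and a hypothetical helper name `distinct_connections`. It counts how many such pairs are touched within a single refresh window when each message picks a broker at random, versus when each producer sticks to one randomly chosen broker.

```python
import random


def distinct_connections(num_producers, num_brokers, messages_per_producer,
                         sticky, seed=0):
    """Count distinct (producer, broker) pairs used within one window.

    Assumption: a pair that is used at least once needs one open socket.
    """
    rng = random.Random(seed)
    pairs = set()
    for producer in range(num_producers):
        # Sticky mode: choose one broker up front and reuse it for the window.
        stuck_broker = rng.randrange(num_brokers)
        for _ in range(messages_per_producer):
            broker = stuck_broker if sticky else rng.randrange(num_brokers)
            pairs.add((producer, broker))
    return len(pairs)


# m = 100 producers, n = 5 brokers, 50 null-key messages each.
per_message_random = distinct_connections(100, 5, 50, sticky=False)
periodic_random = distinct_connections(100, 5, 50, sticky=True)
print(per_message_random, periodic_random)
```

In the sticky case the count is exactly m (one broker per producer per window), while per-message randomness pushes it toward m * n.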

Is Kafka documentation regarding null key misleading?

2014-12-05 Thread Yury Ruchin
Hello, I've come across a (seemingly) strange situation where my Kafka producer gave a very uneven distribution across partitions. I found that I had used a null key to produce messages, guided by the following clause in the documentation: "If the key is null, then a random broker partition is picked."

Re: Is Kafka documentation regarding null key misleading?

2014-12-05 Thread Michal Michalski
Yes, it is *very* misleading in my opinion - I've seen so many people surprised by that behaviour... Technically it's 100% correct of course: "If the key is null, then the Producer will assign the message to a random Partition." - that's what actually happens, because assignment is random.

Re: Is Kafka documentation regarding null key misleading?

2014-12-05 Thread Andrew Jorgensen
If you look under Producer configs you see the following key ‘topic.metadata.refresh.interval.ms’ with a default of 600 * 1000 (10 minutes). It is not entirely clear, but this controls how often a producer with a null-key partitioner will switch the partition that it is writing to. In my production
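
The behaviour described in this thread can be sketched roughly as follows. This is not the code from kafka.producer.async.DefaultEventHandler; the class and method names are hypothetical, and the real producer ties the switch to its metadata refresh rather than to a clock passed in by the caller. The sketch only illustrates the idea: pick a random partition for null-key messages, then keep reusing it until the refresh interval has elapsed.

```python
import random


class StickyRandomPartitioner:
    """Hypothetical sketch of a periodic ("sticky") random partitioner."""

    def __init__(self, num_partitions, refresh_interval_ms=600 * 1000, rng=None):
        self.num_partitions = num_partitions
        # Mirrors the 10-minute default of topic.metadata.refresh.interval.ms.
        self.refresh_interval_ms = refresh_interval_ms
        self.rng = rng or random.Random()
        self._cached_partition = None
        self._chosen_at_ms = None

    def partition_for_null_key(self, now_ms):
        """Return the cached partition, re-picking once the interval expires."""
        expired = (
            self._chosen_at_ms is None
            or now_ms - self._chosen_at_ms >= self.refresh_interval_ms
        )
        if expired:
            self._cached_partition = self.rng.randrange(self.num_partitions)
            self._chosen_at_ms = now_ms
        return self._cached_partition


partitioner = StickyRandomPartitioner(num_partitions=8)
# Every null-key message sent within the same 10-minute window
# lands on the same partition.
same_window = {partitioner.partition_for_null_key(t) for t in range(0, 600_000, 60_000)}
print(len(same_window))  # 1
```

With a single producer this explains the uneven distribution reported above: over any window shorter than the refresh interval, one partition receives all of that producer's null-key messages.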