Re: Kafka partitioning is pretty much broken

2015-07-15 Thread Ewen Cheslack-Postava
Also worth mentioning is that the new producer doesn't have this behavior -- it will round-robin over available partitions for records without keys. "Available" means the partition currently has a leader -- in normal cases this means it distributes evenly across all partitions, but if a partition is down
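To make the round-robin behavior described above concrete, here is a minimal, self-contained Java sketch. It does not use the Kafka client library: the class `RoundRobinSketch`, its methods, and the `hasLeader` array are hypothetical names invented for illustration, and the fallback for the case where no partition has a leader is an assumption rather than something stated in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: mimics "round robin over available partitions"
// for records without keys. Not the Kafka client's actual code.
class RoundRobinSketch {
    private final AtomicInteger counter = new AtomicInteger(0);

    // hasLeader[i] is true when partition i currently has a leader.
    int choose(boolean[] hasLeader) {
        if (hasLeader.length == 0) {
            throw new IllegalArgumentException("topic has no partitions");
        }
        List<Integer> available = new ArrayList<>();
        for (int p = 0; p < hasLeader.length; p++) {
            if (hasLeader[p]) {
                available.add(p);
            }
        }
        // Mask the sign bit so the index stays non-negative after overflow.
        int n = counter.getAndIncrement() & Integer.MAX_VALUE;
        if (available.isEmpty()) {
            // Assumption: with no leader anywhere, cycle over all partitions.
            return n % hasLeader.length;
        }
        return available.get(n % available.size());
    }

    // Counts how many of `records` keyless records land on each partition.
    static int[] distribute(boolean[] hasLeader, int records) {
        RoundRobinSketch chooser = new RoundRobinSketch();
        int[] counts = new int[hasLeader.length];
        for (int i = 0; i < records; i++) {
            counts[chooser.choose(hasLeader)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // All four partitions have a leader: records spread evenly.
        System.out.println(Arrays.toString(
                distribute(new boolean[]{true, true, true, true}, 8)));
        // Partition 2 is down: it receives nothing, the rest share the load.
        System.out.println(Arrays.toString(
                distribute(new boolean[]{true, true, false, true}, 9)));
    }
}
```

With every partition available the counts come out equal; marking one partition as leaderless moves its share to the remaining partitions.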

Kafka partitioning is pretty much broken

2015-07-15 Thread Stefan Miklosovic
I have the following problem. I tried almost everything I could, but without any luck. All I want to do is to have 1 producer, 1 topic, 10 partitions and 10 consumers. All I want is to send 1M messages via the producer to these 10 consumers. I am using Kafka 0.8.3 built from current upstream so I have

Re: Kafka partitioning is pretty much broken

2015-07-15 Thread JIEFU GONG
This is a total shot in the dark here so please ignore this if it fails to make sense, but I remember that on some previous implementation of the producer prior to when round-robin was enabled, producers would send messages to only one of the partitions for a set period of time (configurable, I

Re: Kafka partitioning is pretty much broken

2015-07-15 Thread Stefan Miklosovic
I think I figured it out. I had to use a custom partitioner which does basically nothing. Even though I used it before, it was not taken into consideration because I was sending a KeyedMessage without any key, just partition and payload. Now I am doing it like this: producer.send(new KeyedMessage<String,

Re: Kafka partitioning is pretty much broken

2015-07-15 Thread Stefan Miklosovic
Nice one! That might be it as well. Do you have any idea what that configuration parameter is called? On Thu, Jul 16, 2015 at 12:53 AM, JIEFU GONG jg...@berkeley.edu wrote: This is a total shot in the dark here so please ignore this if it fails to make sense, but I remember that on some previous

Re: Kafka partitioning is pretty much broken

2015-07-15 Thread Stefan Miklosovic
Maybe there is some reason why the producer sticks with a partition for some period of time, mostly performance related. I can imagine that constantly switching between partitions can be kind of slow, in the sense that the producer has to refocus on another partition to send a message to and this switching

Re: Kafka partitioning is pretty much broken

2015-07-15 Thread Lance Laursen
From the FAQ: To reduce # of open sockets, in 0.8.0 (https://issues.apache.org/jira/browse/KAFKA-1017), when the partitioning key is not specified or null, a producer will pick a random partition and stick to it for some time (default is 10 mins) before switching to another one. So, if there are
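The sticky behavior quoted from the FAQ can be sketched in a few lines of self-contained Java. This is a simulation, not the old producer's code: `StickyPartitionSketch`, `partitionFor`, and the injected clock value are hypothetical names chosen for illustration. As far as I know, the interval in the old producer is tied to the `topic.metadata.refresh.interval.ms` setting (600000 ms by default, which matches the 10 minutes in the FAQ), but treat that mapping as an assumption and check the producer configuration documentation for your version.

```java
import java.util.Random;

// Illustrative only: a keyless producer that picks one random partition
// and keeps using it until the refresh interval has elapsed.
class StickyPartitionSketch {
    private final int numPartitions;
    private final long refreshIntervalMs;
    private final Random random;
    private int current = -1;   // -1 means no partition chosen yet
    private long chosenAtMs;

    StickyPartitionSketch(int numPartitions, long refreshIntervalMs, long seed) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        this.numPartitions = numPartitions;
        this.refreshIntervalMs = refreshIntervalMs;
        this.random = new Random(seed);
    }

    // nowMs is passed in so the behavior can be exercised without waiting.
    int partitionFor(long nowMs) {
        if (current < 0 || nowMs - chosenAtMs >= refreshIntervalMs) {
            current = random.nextInt(numPartitions);
            chosenAtMs = nowMs;
        }
        return current;
    }

    public static void main(String[] args) {
        StickyPartitionSketch sketch = new StickyPartitionSketch(10, 600_000L, 42L);
        int[] counts = new int[10];
        // One message per millisecond for five simulated minutes.
        for (long t = 0; t < 300_000L; t++) {
            counts[sketch.partitionFor(t)]++;
        }
        for (int p = 0; p < counts.length; p++) {
            System.out.println("partition " + p + ": " + counts[p]);
        }
    }
}
```

A burst of keyless messages that completes inside one interval lands entirely on a single partition, so with 10 consumers only one of them sees any traffic. That is consistent with the symptom reported at the start of this thread.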

Re: Kafka partitioning is pretty much broken

2015-07-15 Thread Jagbir Hooda
Hi Stefan, Have you looked at the following output for message distribution across the topic-partitions and which topic-partition is consumed by which consumer thread? kafka-server/bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --zkconnect localhost:2181 --group consumer_group_name