Are you using new producer or old producer? The old producer has 10 min sticky partition behavior while the new producer does not.
Thanks, Jiangjie (Becket) Qin On 6/2/15, 11:58 PM, "Sebastien Falquier" <sebastien.falqu...@teads.tv> wrote: >Hi Jason, > >The default partitioner does not make the job since my producers haven't a >smooth traffic. What I mean is that they can deliver lots of messages >during 10 minutes and less during the next 10 minutes, that is too say the >first partition will have stacked most of the messages of the last 20 >minutes. > >By the way, I don't understand your point about breaking batch into 2 >separate partitions. With that code, I jump to a new partition on message >201, 401, 601, ... with batch size = 200, where is my mistake? > >Thanks for your help, >Sébastien > >2015-06-02 16:55 GMT+02:00 Jason Rosenberg <j...@squareup.com>: > >> Hi Sebastien, >> >> You might just try using the default partitioner (which is random). It >> works by choosing a random partition each time it re-polls the meta-data >> for the topic. By default, this happens every 10 minutes for each topic >> you produce to (so it evenly distributes load at a granularity of 10 >> minutes). This is based on 'topic.metadata.refresh.interval.ms'. >> >> I suspect your code is causing double requests for each batch, if your >> partitioning is actually breaking up your batches into 2 separate >> partitions. Could be an off by 1 error, with your modulo calculation? >> Perhaps you need to use '% 0' instead of '% 1' there? >> >> Jason >> >> >> >> On Tue, Jun 2, 2015 at 3:35 AM, Sebastien Falquier < >> sebastien.falqu...@teads.tv> wrote: >> >> > Hi guys, >> > >> > I am new to Kafka and I am facing a problem I am not able to sort out. >> > >> > To smooth traffic over all my brokers' partitions, I have coded a >>custom >> > Paritioner for my producers, using a simple round robin algorithm that >> > jumps from a partition to another on every batch of messages >> (corresponding >> > to batch.num.messages value). It looks like that : >> > https://gist.github.com/sfalquier/4c0c7f36dd96d642b416 >> > >> > With that fix, every partitions are used equally, but the amount of >> > requests from the producers to the brokers have been multiplied by 2. >>I >> do >> > not understand since all producers are async with >>batch.num.messages=200 >> > and the amount of messages processed is still the same as before. Why >>do >> > producers need more requests to do the job? As internal traffic is a >>bit >> > critical on our platform, I would really like to reduce producers' >> requests >> > volume if possible. >> > >> > Any idea? Any suggestion? >> > >> > Regards, >> > Sébastien >> > >>