All, need some clarity and confirmation on the following behavior.
Use-Case
------------
1. I have a topic T spread across two brokers (B1, B2) running on different machines, each with 2 partitions configured for T, for a total of 4 partitions (1-0, 1-1, 2-0, 2-1).
2. Consumer C1 is part of group g1 and is consuming T from B1 and B2.
3. Add a new consumer C2, also part of g1.

This triggers a rebalance across C1 and C2, and eventually C1 gets 1-0, 1-1 and C2 gets 2-0, 2-1.

P.S. - B1 and C1 share the same machine; the same is the case with B2 and C2.

Behavior
---------
Both consumers are getting the partitions that are hosted on their own boxes. Is this a coincidence, or is it an optimization for data locality that will always be applied?

More questions
-----------------
1. When would you want multiple partitions of the same topic hosted on the same broker? Is it that you could have 2 partitions of T on B1 and 10 on B2, and on rebalance C1 and C2 would get 6 each?
2. As in the above use-case, C1 has partitions 1-0 and 1-1 of T, and adding messages to B1 results in the messages being spread across both partitions. Is this behavior round robin, or is it based on segment file size or other parameters?
3. Is it possible to configure the number of partitions per topic? If so, how?

--
Inder