Also, to answer your question, it is intentional. This way each consumer connects to and interacts with a fixed number of brokers irrespective of the total size of the cluster.
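To make that concrete, here is a rough sketch of the range-assignment scheme Jun describes below: sort partitions and consumers by name, then hand each consumer an even, contiguous range. This is a simplification for illustration, not the actual Kafka rebalance code:

    import java.util.*;

    public class RangeAssignSketch {
        // Sort partitions and consumers by name, then give each consumer an
        // even, contiguous range. Partition names like "1-0" (broker 1,
        // partition 0) keep a broker's partitions adjacent after sorting.
        static Map<String, List<String>> assign(List<String> partitions,
                                                List<String> consumers) {
            List<String> parts = new ArrayList<>(partitions);
            List<String> cons = new ArrayList<>(consumers);
            Collections.sort(parts);
            Collections.sort(cons);
            int per = parts.size() / cons.size();   // base share per consumer
            int extra = parts.size() % cons.size(); // leftovers go to the first few
            Map<String, List<String>> owned = new LinkedHashMap<>();
            int start = 0;
            for (int i = 0; i < cons.size(); i++) {
                int n = per + (i < extra ? 1 : 0);
                owned.put(cons.get(i),
                          new ArrayList<>(parts.subList(start, start + n)));
                start += n;
            }
            return owned;
        }

        public static void main(String[] args) {
            // The exact use-case from this thread: 4 partitions, 2 consumers.
            System.out.println(assign(
                Arrays.asList("1-0", "1-1", "2-0", "2-1"),
                Arrays.asList("C1", "C2")));
            // Prints {C1=[1-0, 1-1], C2=[2-0, 2-1]}: each consumer talks to
            // exactly one broker, no matter how large the cluster grows.
        }
    }

Feeding the same method 12 partitions (2 on B1, 10 on B2) and two consumers gives each consumer 6 partitions, which also answers question 1 below.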
-Jay

On Mon, Oct 24, 2011 at 7:31 AM, Jun Rao <jun...@gmail.com> wrote:

> During rebalance, we simply sort all partitions and consumers by name and
> give each consumer an even range of partitions. Since partitions on the
> same broker sort together, they tend to be given out to the same consumer,
> as in this case.
>
> Since the partition is the unit of rebalance, you want to have at least as
> many partitions as consumers. This is the main reason to have more than 1
> partition per broker.
>
> The number of partitions is controlled by 2 config parameters:
> num.partitions and topic.partition.count.map. The former is the default
> for all topics and the latter is for specific topics.
>
> Jun
>
> On Mon, Oct 24, 2011 at 1:28 AM, Inder Pall <inder.p...@gmail.com> wrote:
>
> > All,
> >
> > I need some clarity and confirmation on the following behavior.
> >
> > Use-Case
> > ------------
> > 1. I have a topic T spread across two brokers (B1, B2) running on
> > different machines, each having 2 partitions configured for T: 4
> > partitions in total (1-0, 1-1, 2-0, 2-1).
> > 2. Consumer C1 is part of group g1 and is consuming from B1 and B2 for T.
> > 3. Add a new consumer C2, also part of g1.
> >
> > This triggers a rebalance across C1 & C2, and eventually C1 gets 1-0 and
> > 1-1 while C2 gets 2-0 and 2-1.
> > P.S. - B1 and C1 share the same machine; the same is the case with B2
> > and C2.
> >
> > Behavior
> > ---------
> > Both consumers are getting partitions hosted on the same boxes. Is this
> > a coincidence or an optimization w.r.t. locality of data, and will it
> > always be applied?
> >
> > More questions
> > -----------------
> > 1. When would you want to have multiple partitions of the same topic
> > hosted on the same broker? E.g., if you have 2 partitions of T on B1 and
> > 10 on B2, would C1 & C2 each get 6 on rebalance?
> > 2. As in the above use-case, C1 has the 1-0 & 1-1 partitions of T, and
> > adding messages to B1 results in the messages being spread across both
> > partitions. Is this behavior round robin, or based on segment file
> > size/other parameters?
> > 3. Is it possible to configure the number of partitions per topic? If
> > so, how?
> >
> > -- Inder
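P.S. To make Jun's config answer concrete (question 3 above), here is a hypothetical server.properties fragment. The exact "topic:count" syntax of the map value is my assumption and may differ across versions, so check the docs for the release you run:

    # Default number of partitions for any topic
    num.partitions=2
    # Per-topic overrides (assumed "topic:count" pairs; "clicks" is a
    # made-up topic name for illustration)
    topic.partition.count.map=T:4,clicks:8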