I am +1 on Todd's suggestion. The default reassignment scheme is used only when a reassignment command is issued with no scheme specified. Changing the default should not automatically trigger a reassignment of all existing topics; it only takes effect the next time a reassignment command is issued without a specific scheme.
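To make those semantics concrete, here is a rough sketch of how a pluggable scheme with a cluster default might behave. Everything in it is hypothetical and for illustration only: `ReassignmentTool`, `by_count`, `make_rack_aware`, and the scheme names are not Kafka APIs, and the placement logic is a simple round-robin rather than anything proposed in the KIP.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical types: a scheme maps (partitions, brokers, replication factor)
# to a replica list per partition.
Assignment = Dict[str, List[int]]
Scheme = Callable[[List[str], List[int], int], Assignment]


def _rotate(brokers: List[int], i: int) -> List[int]:
    """Rotate the broker list so successive partitions start on different brokers."""
    k = i % len(brokers)
    return brokers[k:] + brokers[:k]


def by_count(partitions: List[str], brokers: List[int], rf: int) -> Assignment:
    """Plain round-robin placement (a stand-in for a count-based scheme)."""
    if rf > len(brokers):
        raise ValueError("replication factor exceeds broker count")
    return {p: _rotate(brokers, i)[:rf] for i, p in enumerate(partitions)}


def make_rack_aware(racks: Dict[int, str]) -> Scheme:
    """Build a scheme that prefers brokers in racks not yet used by a partition."""
    def scheme(partitions: List[str], brokers: List[int], rf: int) -> Assignment:
        if rf > len(brokers):
            raise ValueError("replication factor exceeds broker count")
        out: Assignment = {}
        for i, p in enumerate(partitions):
            rotated = _rotate(brokers, i)
            replicas: List[int] = []
            used_racks = set()
            # First pass: take at most one broker from each rack.
            for b in rotated:
                if len(replicas) < rf and racks[b] not in used_racks:
                    replicas.append(b)
                    used_racks.add(racks[b])
            # Second pass: fill any remaining slots from brokers not yet chosen.
            for b in rotated:
                if len(replicas) < rf and b not in replicas:
                    replicas.append(b)
            out[p] = replicas
        return out
    return scheme


class ReassignmentTool:
    """Hypothetical tool holding named schemes and a cluster-wide default."""

    def __init__(self, schemes: Dict[str, Scheme], default: str) -> None:
        self.schemes = schemes
        self.default = default
        self.current: Assignment = {}

    def set_default(self, name: str) -> None:
        # Only records the new default; no existing partition is moved.
        if name not in self.schemes:
            raise KeyError(name)
        self.default = name

    def reassign(self, partitions: List[str], brokers: List[int], rf: int,
                 scheme: Optional[str] = None) -> str:
        # The default is consulted only when the command names no scheme.
        chosen = scheme if scheme is not None else self.default
        self.current.update(self.schemes[chosen](partitions, brokers, rf))
        return chosen
```

In this sketch, calling `set_default("rack")` leaves `current` untouched; the rack-aware placement is applied only to the partitions named in the next `reassign(...)` call that omits `scheme`, which keeps the disk- and network-intensive move under manual control.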

On Thu, Mar 5, 2015 at 10:16 AM, Todd Palino <tpal...@gmail.com> wrote:

> I would not think that partitions moving would cause any orphaned messages
> like that. I would be more concerned about what happens when you change the
> default on a running cluster from one scheme to another. Would we want to
> support some kind of automated reassignment of existing partitions
> (personally - no. I want to trigger that manually because it is a very disk
> and network intensive process)?
>
> -Todd
>
> On Wed, Mar 4, 2015 at 7:33 PM, Tong Li <liton...@us.ibm.com> wrote:
>
> > Todd,
> > I think pluggable design is good with solid default. The only issue I
> > feel is when you use one and switch to another, will we end up with some
> > unread messages hanging around and no one thinks or knows it is their
> > responsibility to take care of them?
> >
> > Thanks.
> >
> > Tong
> >
> > Sent from my iPhone
> >
> > > On Mar 5, 2015, at 10:46 AM, Todd Palino <tpal...@gmail.com> wrote:
> > >
> > > Apologize for the late comment on this...
> > >
> > > So fair assignment by count (taking into account the current partition
> > > count of each broker) is very good. However, it's worth noting that all
> > > partitions are not created equal. We have actually been performing more
> > > rebalance work based on the partition size on disk, as given equal
> > > retention of all topics, the size on disk is a better indicator of the
> > > amount of traffic a partition gets, both in terms of storage and network
> > > traffic. Overall, this seems to be a better balance.
> > >
> > > In addition to this, I think there is very much a need to have Kafka be
> > > rack-aware. That is, to be able to assure that for a given cluster, you
> > > never assign all replicas for a given partition in the same rack. This
> > > would allow us to guard against maintenances or power failures that
> > > affect a full rack of systems (or a given switch).
> > >
> > > I think it would make sense to implement the reassignment logic as a
> > > pluggable component. That way it would be easy to select a scheme when
> > > performing a reassignment (count, size, rack aware). Configuring a
> > > default scheme for a cluster would allow for the brokers to create new
> > > topics and partitions in compliance with the requested policy.
> > >
> > > -Todd
> > >
> > > On Thu, Jan 22, 2015 at 10:13 PM, Joe Stein <joe.st...@stealth.ly> wrote:
> > >
> > > > I will go back through the ticket and code and write more up. Should be
> > > > able to do that sometime next week. The intention was to not replace
> > > > existing functionality by issue a WARN on use. The following version it
> > > > is released we could then deprecate it... I will fix the KIP for that
> > > > too.
> > > >
> > > > On Fri, Jan 23, 2015 at 12:34 AM, Neha Narkhede <n...@confluent.io> wrote:
> > > >
> > > > > Hey Joe,
> > > > >
> > > > > 1. Could you add details to the Public Interface section of the KIP?
> > > > > This should include the proposed changes to the partition
> > > > > reassignment tool. Also, maybe the new option can be named
> > > > > --rebalance instead of --re-balance?
> > > > > 2. It makes sense to list --decommission-broker as part of this KIP.
> > > > > Similarly, shouldn't we also have an --add-broker option? The way I
> > > > > see this is that there are several events when a partition
> > > > > reassignment is required. Before this functionality is automated on
> > > > > the broker, the tool will generate an ideal replica placement for
> > > > > each such event. The users should merely have to specify the nature
> > > > > of the event e.g. adding a broker or decommissioning an existing
> > > > > broker or merely rebalancing.
> > > > > 3. If I understand the KIP correctly, the upgrade plan for this
> > > > > feature includes removing the existing --generate option on the
> > > > > partition reassignment tool in 0.8.3 while adding all the new
> > > > > options in the same release. Is that correct?
> > > > >
> > > > > Thanks,
> > > > > Neha
> > > > >
> > > > > On Thu, Jan 22, 2015 at 9:23 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> > > > >
> > > > > > Ditto on this one. Can you give the algorithm we want to implement?
> > > > > >
> > > > > > Also I think in terms of scope this is just proposing to change the
> > > > > > logic in ReassignPartitionsCommand? I think we've had the discussion
> > > > > > various times on the mailing list that what people really want is
> > > > > > just for Kafka to do its best to balance data in an online fashion
> > > > > > (for some definition of balance). i.e. if you add a new node
> > > > > > partitions would slowly migrate to it, and if a node dies,
> > > > > > partitions slowly migrate off it. This could potentially be more
> > > > > > work, but I'm not sure how much more. Has anyone thought about how
> > > > > > to do it?
> > > > > >
> > > > > > -Jay
> > > > > >
> > > > > > On Wed, Jan 21, 2015 at 10:11 PM, Joe Stein <joe.st...@stealth.ly> wrote:
> > > > > >
> > > > > > > Posted a KIP for --re-balance for partition assignment in
> > > > > > > reassignment tool.
> > > > > > >
> > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-6+-+New+reassignment+partition+logic+for+re-balancing
> > > > > > >
> > > > > > > JIRA https://issues.apache.org/jira/browse/KAFKA-1792
> > > > > > >
> > > > > > > While going through the KIP I thought of one thing from the JIRA
> > > > > > > that we should change. We should preserve --generate to be
> > > > > > > existing functionality for the next release it is in. If folks
> > > > > > > want to use --re-balance then great, it just won't break any
> > > > > > > upgrade paths, yet.
> > > > > > >
> > > > > > > /*******************************************
> > > > > > > Joe Stein
> > > > > > > Founder, Principal Consultant
> > > > > > > Big Data Open Source Security LLC
> > > > > > > http://www.stealth.ly
> > > > > > > Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
> > > > > > > ********************************************/
> > > > >
> > > > > --
> > > > > Thanks,
> > > > > Neha

--
-- Guozhang