[ 
https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291417#comment-14291417
 ] 

Neha Narkhede commented on KAFKA-1792:
--------------------------------------

[~Dmitry Pekar] Hope you enjoyed your break :)

Thanks for attaching the rebalance use cases. A few comments (probably best 
discussed on the KIP, but I'll post them here as well):

1. IIUC, the purpose of --rebalance is to figure out the optimal partition 
reassignment given the current topics and brokers in the cluster. If so, the 
user shouldn't have to specify either the topics or the brokers; the point is 
for the new rebalance algorithm to figure that out. In any case, --broker-list 
is unnecessary, since the current list of brokers in the cluster can be read 
from ZooKeeper.
2. Given #1, decommissioning a broker also shouldn't require the user to 
specify anything when using the --rebalance option. Again, the tool will give 
the user the optimal assignment given the current list of active brokers 
in the cluster (which doesn't include the decommissioned broker, since it has 
presumably already been shut down).
3. When --rebalance is used, we should probably print a description alongside 
the ideal reassignment that explicitly calls out whether a new broker has been 
detected or an old broker has been shut down (or decommissioned). That way, 
admins can clearly understand the nature of the reassignment before executing 
it through the --execute option.
4. It seems the only new option we really need is --rebalance, which replaces 
--generate.
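
As a rough illustration of point 3, the description could come from comparing 
the broker ids that appear in the current assignment with the ids of the 
brokers that are currently alive. The sketch below is hypothetical: the 
function name and message wording are made up for illustration, and the two 
id lists are passed in directly rather than read from ZooKeeper.

```python
def describe_broker_changes(assigned_brokers, live_brokers):
    """Return human-readable notes about how the broker set has changed.

    assigned_brokers: broker ids referenced by the current replica assignment
    live_brokers:     broker ids currently registered as alive
    Both inputs are assumptions of this sketch, not the tool's real inputs.
    """
    assigned = set(assigned_brokers)
    live = set(live_brokers)

    added = sorted(live - assigned)      # alive but holding no replicas yet
    removed = sorted(assigned - live)    # hold replicas but are no longer alive

    notes = []
    if added:
        notes.append("new brokers detected: " + ", ".join(map(str, added)))
    if removed:
        notes.append("brokers no longer active: " + ", ".join(map(str, removed)))
    if not notes:
        notes.append("broker set unchanged")
    return notes


notes = describe_broker_changes([0, 1, 2], [0, 1, 3])
print(notes)
```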

> change behavior of --generate to produce assignment config with fair replica 
> distribution and minimal number of reassignments
> -----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-1792
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1792
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: tools
>            Reporter: Dmitry Pekar
>            Assignee: Dmitry Pekar
>             Fix For: 0.8.3
>
>         Attachments: KAFKA-1792.patch, KAFKA-1792_2014-12-03_19:24:56.patch, 
> KAFKA-1792_2014-12-08_13:42:43.patch, KAFKA-1792_2014-12-19_16:48:12.patch, 
> KAFKA-1792_2015-01-14_12:54:52.patch, generate_alg_tests.txt, 
> rebalance_use_cases.txt
>
>
> The current implementation produces a fair replica distribution across the 
> specified list of brokers. Unfortunately, it doesn't take the current 
> replica assignment into account.
> So if we have, for instance, 3 brokers id=[0..2] and are going to add a fourth 
> broker id=3, 
> generate will create an assignment config that redistributes replicas 
> fairly across brokers [0..3] 
> as if those partitions were being created from scratch. It will not 
> take the current replica assignment into consideration 
> and accordingly will not try to minimize the number of replica moves 
> between brokers.
> As proposed by [~charmalloc], this should be improved. The output of the 
> improved --generate algorithm should meet the following requirements:
> - fairness of replica distribution - every broker will have R or R+1 replicas 
> assigned;
> - minimum of reassignments - number of replica moves between brokers will be 
> minimal;
> Example.
> Consider the following replica distribution across brokers [0..3] (we just 
> added brokers 2 and 3):
> - broker - 0, 1, 2, 3 
> - replicas - 7, 6, 0, 0
> The new algorithm will produce the following assignment:
> - broker - 0, 1, 2, 3 
> - replicas - 4, 3, 3, 3
> - moves - -3, -3, +3, +3
> It will be fair, and the number of moves will be 6, which is minimal for the 
> specified initial distribution.
> The scope of this issue is:
> - design an algorithm matching the above requirements;
> - implement this algorithm and unit tests;
> - test it manually using different initial assignments;
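
The example in the issue description can be checked at the level of 
per-broker replica counts. The sketch below is not the algorithm from the 
attached patches; it is a minimal illustration that assumes only counts 
matter and ignores the constraint that two replicas of one partition cannot 
share a broker. The function names are hypothetical. It gives every broker R 
replicas, hands the leftover R+1 slots to the brokers that already hold the 
most replicas, and counts each replica that has to leave an over-full broker 
as one move.

```python
def fair_targets(current):
    """Map broker id -> target replica count, each equal to R or R+1.

    current: dict of broker id -> number of replicas held now.
    """
    total = sum(current.values())
    base, extra = divmod(total, len(current))   # base is R; `extra` brokers get R+1

    # Giving the R+1 slots to the fullest brokers avoids moving replicas
    # that could simply stay where they are. Ties are broken by broker id.
    ranked = sorted(current, key=lambda b: (-current[b], b))
    gets_extra = set(ranked[:extra])

    return {b: base + (1 if b in gets_extra else 0) for b in sorted(current)}


def plan_moves(current):
    """Return (targets, per-broker deltas, total number of replica moves)."""
    targets = fair_targets(current)
    deltas = {b: targets[b] - current[b] for b in sorted(current)}
    moves = sum(d for d in deltas.values() if d > 0)   # replicas received
    return targets, deltas, moves


current = {0: 7, 1: 6, 2: 0, 3: 0}
targets, deltas, moves = plan_moves(current)
print(targets)  # {0: 4, 1: 3, 2: 3, 3: 3}
print(deltas)   # {0: -3, 1: -3, 2: 3, 3: 3}
print(moves)    # 6
```

For the distribution 7, 6, 0, 0 this reproduces the targets 4, 3, 3, 3 and 
the 6 moves quoted in the description.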



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
