[jira] [Commented] (KAFKA-6794) Support for incremental replica reassignment

2019-02-25 Thread GEORGE LI (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16777257#comment-16777257
 ] 

GEORGE LI commented on KAFKA-6794:
--

Hi [~viktorsomogyi],

I think this "Incremental Reassignment" is different from the KIP-236 "Planned 
Future Change" section. That one is basically trying to overcome the current 
limitation that only one batch of reassignments can be run in 
/admin/reassign_partitions.

e.g. 50 reassignments are submitted in a batch and 49 complete, but one 
long-running reassignment is still pending in /admin/reassign_partitions. 
Currently, no new batch can be submitted until everything in 
/admin/reassign_partitions is completed and the node is removed from ZK. If the 
cluster is pretty much idle, this wastes resources, since no new reassignments 
can be submitted.

The proposal is to enable submitting a new batch to a queue (a ZK node) and 
merging the new assignments into /admin/reassign_partitions. If the same 
topic/partition appears in both the new queue and the current 
/admin/reassign_partitions, the Cancel Reassignment mechanism would be used to 
resolve the conflict.
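
To make that concrete, a rough Python sketch of how an external tool might 
merge a queued batch into /admin/reassign_partitions follows. The 
/admin/reassign_queue znode, the kazoo-based client and the helper names are 
assumptions for illustration only, not an existing Kafka mechanism:

{code:python}
# Sketch only: assumes a hypothetical /admin/reassign_queue znode that holds a
# JSON batch in the same format as /admin/reassign_partitions.
import json
from kazoo.client import KazooClient

REASSIGN_PATH = "/admin/reassign_partitions"
QUEUE_PATH = "/admin/reassign_queue"  # hypothetical queue znode

def merge_queued_batch(zk):
    queued = json.loads(zk.get(QUEUE_PATH)[0])["partitions"]
    current = []
    if zk.exists(REASSIGN_PATH):
        current = json.loads(zk.get(REASSIGN_PATH)[0])["partitions"]

    in_flight = {(p["topic"], p["partition"]) for p in current}
    merged, conflicts = list(current), []
    for p in queued:
        if (p["topic"], p["partition"]) in in_flight:
            # Same topic/partition is already being reassigned: this is where
            # the in-flight move would have to be cancelled first.
            conflicts.append(p)
        else:
            merged.append(p)

    data = json.dumps({"version": 1, "partitions": merged}).encode("utf-8")
    if zk.exists(REASSIGN_PATH):
        zk.set(REASSIGN_PATH, data)
    else:
        zk.create(REASSIGN_PATH, data)
    return conflicts

zk = KazooClient(hosts="localhost:2181")
zk.start()
print("conflicting moves:", merge_queued_batch(zk))
{code}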

 

> Support for incremental replica reassignment
> 
>
> Key: KAFKA-6794
> URL: https://issues.apache.org/jira/browse/KAFKA-6794
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Assignee: Viktor Somogyi-Vass
>Priority: Major
>
> Say you have a replication factor of 4 and you trigger a reassignment which 
> moves all replicas to new brokers. Now 8 replicas are fetching at the same 
> time which means you need to account for 8 times the current producer load 
> plus the catch-up replication. To make matters worse, the replicas won't all 
> become in-sync at the same time; in the worst case, you could have 7 replicas 
> in-sync while one is still catching up. Currently, the old replicas won't be 
> disabled until all new replicas are in-sync. This makes configuring the 
> throttle tricky since ISR traffic is not subject to it.
> Rather than trying to bring all 4 new replicas online at the same time, a 
> friendlier approach would be to do it incrementally: bring one replica 
> online, bring it in-sync, then remove one of the old replicas. Repeat until 
> all replicas have been changed. This would reduce the impact of a 
> reassignment and make configuring the throttle easier at the cost of a slower 
> overall reassignment.





[jira] [Commented] (KAFKA-6794) Support for incremental replica reassignment

2019-02-25 Thread GEORGE LI (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16777149#comment-16777149
 ] 

GEORGE LI commented on KAFKA-6794:
--

Hi, [~viktorsomogyi], 

When I rebalance the whole cluster, I generate the reassignment plan JSON with 
a list of topic/partitions with their new_replicas/original_replicas, and sort 
them by size, trying to group them into batches of similar sizes for 
execution, so that the batches are expected to complete reassignment in about 
the same amount of time.

Say there are 1000 reassignments, 50 per batch. That is at least 20 
batches/buckets to put in for execution. Comparing the new_replicas vs. 
original_replicas sets, the algorithm can detect whether there is more than 
one new replica in new_replicas; if yes, it breaks the move up and puts the 
pieces into different batches/buckets. There are other considerations for the 
reassignments in the same batch: e.g. for different topic/partitions, try to 
spread the load and not overwhelm a leader; e.g. the leadership bytes within 
the same batch should be balanced across all brokers/leaders in the cluster 
as much as possible. I think this (optimal execution of reassignment plans in 
batches) can only be achieved outside of Kafka.
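
A rough sketch of the splitting part (the field names and the bucketing are my 
own simplified assumptions, not the actual plan format or the full 
load-balancing logic described above):

{code:python}
# Sketch: split any move that adds more than one new replica into a chain of
# single-new-replica steps, and put step i of every partition into batch i so
# the steps of one partition run in consecutive batches, in order.

def split_into_single_replica_steps(entry):
    current = list(entry["original_replicas"])
    target = entry["new_replicas"]
    steps = []
    for added in [r for r in target if r not in current]:
        # Drop one replica that is not in the target (from the end here).
        dropped = next((r for r in reversed(current) if r not in target), None)
        if dropped is not None:
            current = [r for r in current if r != dropped]
        current = current + [added]
        steps.append({"topic": entry["topic"],
                      "partition": entry["partition"],
                      "replicas": list(current)})
    if steps:
        steps[-1]["replicas"] = list(target)  # last step matches target order
    return steps

def make_batches(plan):
    batches = []
    # Sort by size so each batch holds moves of roughly similar size.
    for entry in sorted(plan, key=lambda e: e["size_bytes"]):
        for i, step in enumerate(split_into_single_replica_steps(entry)):
            while len(batches) <= i:
                batches.append([])
            batches[i].append(step)
    return batches

plan = [{"topic": "t", "partition": 0, "size_bytes": 10 ** 9,
         "original_replicas": [1, 2, 3, 4], "new_replicas": [5, 6, 7, 8]}]
for i, batch in enumerate(make_batches(plan)):
    print(i, batch)  # four single-new-replica steps, one per batch
{code}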





[jira] [Commented] (KAFKA-6794) Support for incremental replica reassignment

2019-02-25 Thread Viktor Somogyi-Vass (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776694#comment-16776694
 ] 

Viktor Somogyi-Vass commented on KAFKA-6794:


hey [~sql_consulting], thanks for sharing this. I think it is also a good 
approach, and frankly, right now it is the only way one could incrementalize 
reassignments manually. How would you build this queue? Do you have an 
algorithm for choosing the next replica to drop and to add?



[jira] [Commented] (KAFKA-6794) Support for incremental replica reassignment

2019-02-22 Thread GEORGE LI (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775598#comment-16775598
 ] 

GEORGE LI commented on KAFKA-6794:
--

I also have seen this issue. When more than one broker is in the new replicas 
of a reassignment and the topic is big, even with a throttle the leader works 
hard to sync to all the extra followers, which can cause a latency jump.

One solution is to execute the reassignment plans in an "optimal" way: submit 
the reassignment plans in batches, making sure that within each batch a 
topic/partition has only one extra new broker in its new replicas; wait until 
that reassignment completes, then submit the next one. E.g. if the 
reassignment is (1,2,3,4) => (5,6,7,8), split it into 4 batches (buckets), 
each batch with only 1 new replica:

{{Batch 1:  (1,2,3,5)}}

{{Batch 2:  (1,2,5,6)}}

{{Batch 3:  (1,5,6,7)}}

{{Batch 4:  (5,6,7,8)}}

Between each batch, check whether the ZK node /admin/reassign_partitions 
exists; if yes, sleep and check again; if not, submit the next batch.
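
A rough sketch of that loop follows, using the kazoo ZooKeeper client. The 
znode path and JSON layout match what kafka-reassign-partitions.sh writes; 
the helper itself is only an illustration:

{code:python}
# Sketch: submit batches one at a time, waiting for the previous batch's
# /admin/reassign_partitions znode to disappear before submitting the next.
import json
import time
from kazoo.client import KazooClient

REASSIGN_PATH = "/admin/reassign_partitions"

def run_batches(zk, batches, poll_secs=30):
    for batch in batches:
        while zk.exists(REASSIGN_PATH):  # previous batch still in flight
            time.sleep(poll_secs)
        data = json.dumps({"version": 1, "partitions": batch}).encode("utf-8")
        zk.create(REASSIGN_PATH, data)

zk = KazooClient(hosts="localhost:2181")
zk.start()
# The (1,2,3,4) => (5,6,7,8) example above, for partition 0 of topic "t":
run_batches(zk, [
    [{"topic": "t", "partition": 0, "replicas": [1, 2, 3, 5]}],
    [{"topic": "t", "partition": 0, "replicas": [1, 2, 5, 6]}],
    [{"topic": "t", "partition": 0, "replicas": [1, 5, 6, 7]}],
    [{"topic": "t", "partition": 0, "replicas": [5, 6, 7, 8]}],
])
{code}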

 

 



[jira] [Commented] (KAFKA-6794) Support for incremental replica reassignment

2018-12-07 Thread Viktor Somogyi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16713043#comment-16713043
 ] 

Viktor Somogyi commented on KAFKA-6794:
---

I've also created an early pull request to check whether the algorithm we 
figured out is in line with the goals of this jira.



[jira] [Commented] (KAFKA-6794) Support for incremental replica reassignment

2018-12-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16712987#comment-16712987
 ] 

ASF GitHub Bot commented on KAFKA-6794:
---

viktorsomogyi opened a new pull request #6011: [WIP] KAFKA-6794: Incremental 
partition reassignment
URL: https://github.com/apache/kafka/pull/6011
 
 
   This pull request replaces the current partition reassignment strategy with 
an incremental one.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (KAFKA-6794) Support for incremental replica reassignment

2018-12-07 Thread Viktor Somogyi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16712629#comment-16712629
 ] 

Viktor Somogyi commented on KAFKA-6794:
---

I think this change doesn't need a KIP for now, so I've collected the algorithm 
and some examples here; [~hachikuji] please have a look at it.
h2. Calculating A Reassignment Step

For calculating a reassignment step, the final target replica (FTR) set and 
the current replica (CR) set are always used.
 # Calculate the replicas to be dropped (DR):
 ## Calculate n = size(CR) - size(FTR)
 ## Filter those replicas from CR which are not in FTR; this is the excess 
replica (ER) set
 ## Sort the ER set in an order where the leader is the last (this ensures 
that the leader is selected only when needed).
 ## Take the first n replicas of ER; that will be the set of dropped replicas.
 # Calculate the new replica (NR) to be added by selecting the first replica 
from FTR that is not in CR.
 # Create the target replica (TR) set: CR + NR - DR
 # If this is the last step, then order the replicas as specified by FTR. This 
means that the last step is always equal to FTR.

h2. Performing A Reassignment Step
 # Wait until CR is entirely in ISR. This makes sure that we're starting off 
with a solid base for the reassignment.
 # Calculate the next reassignment step, as described above, based on the 
reassignment context.
 # Wait until all brokers in the target replicas (TR) of the reassignment step 
are alive. This makes sure that the reassignment starts only when the target 
brokers can perform the actual reassignment step.
 # If we have new replicas in ISR from the previous step, change their state 
to OnlineReplica.
 # Update CR in Zookeeper with TR: with this, the DR set will be dropped and 
the NR set will be added.
 # Send a LeaderAndIsr request to all replicas in CR + NR so they are notified 
of the Zookeeper events.
 # Start the new replicas in NR by moving them to the NewReplica state.
 # Set CR to TR in memory.
 # Send a LeaderAndIsr request with a potential new leader (if the current 
leader is not in TR), the new CR (using TR) and the same ISR to every broker 
in TR.
 # Replicas in DR -> Offline (force those replicas out of ISR).
 # Replicas in DR -> NonExistentReplica (force those replicas to be deleted).
 # Update the /admin/reassign_partitions path in ZK to remove this partition.
 # After electing a leader the replica and ISR information changes, so resend 
the update metadata request to every broker.

h2. Example

The following block shows how a transition happens from (0, 1, 2) into 
(3, 4, 5) where the initial leader is 0.
{noformat}
  (0, 1, 2)    // starting assignment
      |
 (0, 1, 2, 3)  // +3
      |
 (0, 2, 3, 4)  // -1 +4
      |
 (0, 3, 4, 5)  // -2 +5
      |
  (3, 4, 5)    // -0, new leader (3) is elected, requested order is matched,
               // reassignment finished
{noformat}
Let's take a closer look at the third step above:
{noformat}
FTR = (3, 4, 5)
CR  = (0, 1, 2, 3)
 
n  = size(CR) - size(FTR) // 1
ER = CR - FTR             // (0, 1, 2)
ER = order(ER)            // (1, 2, 0)
DR = takeFirst(ER, n)     // (1)
 
NR = first(FTR - CR)      // 4
TR = CR + NR - DR         // (0, 2, 3, 4)
{noformat}
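A compact Python rendering of the step calculation (my own sketch; the 
function name and the explicit leader parameter are illustrative):
{code:python}
# Sketch of "Calculating A Reassignment Step": one call produces the next
# intermediate assignment on the way from CR to FTR.

def next_step(cr, ftr, leader):
    n = max(0, len(cr) - len(ftr))        # how many replicas to drop
    er = [r for r in cr if r not in ftr]  # excess replicas (ER)
    er.sort(key=lambda r: r == leader)    # leader sorts last
    dr = set(er[:n])                      # dropped replicas (DR)
    nr = next((r for r in ftr if r not in cr), None)  # one new replica (NR)
    tr = [r for r in cr if r not in dr] + ([nr] if nr is not None else [])
    # Last step: use FTR's order exactly.
    return list(ftr) if set(tr) == set(ftr) else tr

# Reproduces the (0, 1, 2) -> (3, 4, 5) transition chain above:
cr, ftr, leader = [0, 1, 2], [3, 4, 5], 0
while cr != ftr:
    cr = next_step(cr, ftr, leader)
    print(cr)  # [0,1,2,3], [0,2,3,4], [0,3,4,5], [3,4,5]
{code}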



[jira] [Commented] (KAFKA-6794) Support for incremental replica reassignment

2018-11-29 Thread Viktor Somogyi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16703314#comment-16703314
 ] 

Viktor Somogyi commented on KAFKA-6794:
---

[~hachikuji] it seems we finally have a solution which passes all the unit 
tests. I'm going to clean it up, write a doc about it and try to demo it to 
you guys in some way (it could be a long KIP, a recording, or we could even 
manage to do a live demo).



[jira] [Commented] (KAFKA-6794) Support for incremental replica reassignment

2018-09-24 Thread Sandor Murakozi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625945#comment-16625945
 ] 

Sandor Murakozi commented on KAFKA-6794:


[~viktorsomogyi] you can build on my half-ready solution available in 
[https://github.com/smurakozi/kafka/tree/KAFKA-6794]



[jira] [Commented] (KAFKA-6794) Support for incremental replica reassignment

2018-09-24 Thread Sandor Murakozi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625935#comment-16625935
 ] 

Sandor Murakozi commented on KAFKA-6794:


[~viktorsomogyi] I think I won't be able to work on it for a while. Please feel 
free to reassign.



[jira] [Commented] (KAFKA-6794) Support for incremental replica reassignment

2018-09-19 Thread Viktor Somogyi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620507#comment-16620507
 ] 

Viktor Somogyi commented on KAFKA-6794:
---

Hi [~smurakozi], are you working on this? If you aren't, I'd reassign it to 
myself, if you don't mind.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)