[ 
https://issues.apache.org/jira/browse/KAFKA-6145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064856#comment-17064856
 ] 

ASF GitHub Bot commented on KAFKA-6145:
---------------------------------------

cadonna commented on pull request #8334: KAFKA-6145: Add balanced assignment 
algorithm
URL: https://github.com/apache/kafka/pull/8334
 
 
   This algorithm assigns tasks to clients and tries to
   - balance the distribution of the partitions of the
     same input topic over stream threads and clients,
     i.e., data parallel workload balance
   - balance the distribution of work over stream threads.
   The algorithm does not take into account any state that may already
   exist on the clients.
   
   The assignment is considered balanced when the difference in
   assigned tasks between the stream thread with the most tasks and
   the stream thread with the least tasks does not exceed a given
   balance factor.
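   
   As a minimal sketch of this criterion (not the code from the pull
   request), the check could look roughly like the following. The class
   name `BalanceCheck`, the method `isBalanced`, and the representation
   of an assignment as a list of per-thread task counts are all
   assumptions made for illustration.
   
   ```java
   import java.util.Collections;
   import java.util.List;
   
   // Hypothetical helper; names and input shape are illustrative assumptions.
   public final class BalanceCheck {
       private BalanceCheck() {}
   
       // tasksPerThread: number of tasks assigned to each stream thread.
       // balanceFactor: largest allowed difference between the thread with
       // the most tasks and the thread with the least tasks.
       public static boolean isBalanced(final List<Integer> tasksPerThread,
                                        final int balanceFactor) {
           if (tasksPerThread.isEmpty()) {
               // No threads: treated as balanced (an assumption of this sketch).
               return true;
           }
           final int max = Collections.max(tasksPerThread);
           final int min = Collections.min(tasksPerThread);
           return max - min <= balanceFactor;
       }
   }
   ```
   
   With a balance factor of 1, per-thread task counts of 3, 2 and 3
   would count as balanced, while counts of 4 and 1 would not.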
   
   The algorithm prioritizes balance across stream threads
   over balance across clients.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Warm up new KS instances before migrating tasks - potentially a two phase 
> rebalance
> -----------------------------------------------------------------------------------
>
>                 Key: KAFKA-6145
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6145
>             Project: Kafka
>          Issue Type: New Feature
>          Components: streams
>            Reporter: Antony Stubbs
>            Assignee: Sophie Blee-Goldman
>            Priority: Major
>              Labels: needs-kip
>
> Currently, when expanding the KS cluster, the new node's partitions will be 
> unavailable during the rebalance. For large state stores this can take a very 
> long time, and even for small state stores a pause of more than a few ms can 
> be a deal breaker for microservice use cases.
> One workaround would be to execute the rebalance in two phases:
> 1) start building the state stores on the new node
> 2) only once the state stores are fully populated on the new node, 
> rebalance the tasks - there will still be a rebalance pause, but it would be 
> greatly reduced
> Relates to: KAFKA-6144 - Allow state stores to serve stale reads during 
> rebalance



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
