[ 
https://issues.apache.org/jira/browse/HELIX-654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16014966#comment-16014966
 ] 

ASF GitHub Bot commented on HELIX-654:
--------------------------------------

Github user jiajunwang commented on a diff in the pull request:

    https://github.com/apache/helix/pull/88#discussion_r117135837
  
    --- Diff: helix-core/src/main/java/org/apache/helix/task/JobRebalancer.java ---
    @@ -420,6 +411,14 @@ private ResourceAssignment computeResourceMapping(String jobResource,
                   workflowConfig, workflowCtx, allPartitions, cache.getIdealStates());
           for (Map.Entry<String, SortedSet<Integer>> entry : taskAssignments.entrySet()) {
             String instance = entry.getKey();
    +
    +        if (!isGenericTaskJob(jobCfg) || jobCfg.isRebalanceRunningTask()) {
    --- End diff --
    
    Why is this logic in the for loop? Do we need to execute it for each <instance, partitions> entry?
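
    To illustrate the question: the condition in the diff reads only jobCfg, not the loop variable, so it could in principle be evaluated once before the loop. The sketch below is not the Helix code. The class, the method names, and the placeholder loop body are hypothetical, and the two booleans stand in for isGenericTaskJob(jobCfg) and jobCfg.isRebalanceRunningTask().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;

public class LoopInvariantSketch {

    // Hypothetical stand-in for the check in the diff:
    // !isGenericTaskJob(jobCfg) || jobCfg.isRebalanceRunningTask()
    static boolean mayMoveRunningTasks(boolean genericTaskJob, boolean rebalanceRunningTask) {
        return !genericTaskJob || rebalanceRunningTask;
    }

    // Returns the instances for which the guarded block would run.
    // The real body of the guarded block is not shown in the diff, so
    // collecting the instance name is only a placeholder.
    static List<String> instancesToProcess(Map<String, SortedSet<Integer>> taskAssignments,
                                           boolean genericTaskJob,
                                           boolean rebalanceRunningTask) {
        // Evaluated once: neither input depends on the current entry.
        boolean mayMove = mayMoveRunningTasks(genericTaskJob, rebalanceRunningTask);

        List<String> processed = new ArrayList<>();
        for (Map.Entry<String, SortedSet<Integer>> entry : taskAssignments.entrySet()) {
            String instance = entry.getKey();
            if (mayMove) {
                processed.add(instance); // placeholder for the per-instance work
            }
        }
        return processed;
    }
}
```

    If the guarded block needs nothing from the entry at all, it could move out of the loop entirely. If it does use `instance`, only the condition can be hoisted.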


> Rebalance running task
> ----------------------
>
>                 Key: HELIX-654
>                 URL: https://issues.apache.org/jira/browse/HELIX-654
>             Project: Apache Helix
>          Issue Type: New Feature
>          Components: helix-core
>            Reporter: Weihan Kong
>
> h3. Feature summary
> Helix Task Framework lets users run tasks on instances managed by Helix. 
> There are two types of tasks: generic tasks and fixed-target tasks. A 
> fixed-target task always follows its target partition and is rebalanced 
> whenever that partition is rebalanced. For a generic task, Helix lets the 
> user choose whether running tasks are rebalanced when the cluster topology 
> changes.
> For most users it is better to leave this feature disabled (the default), 
> since there is no need to re-run a task every time a new node is added. For 
> users with long-running tasks, enabling it can be useful: when a new node 
> is added, the task load is balanced better across the cluster.
> h3. Defined system behavior
> h4. When a node fails,
> h6. Feature disabled:
> * Running tasks on the failed node will be rebalanced to a live node, since 
> they failed along with the node and no longer exist.
> h6. Feature enabled:
> * Same.
> h4. When a new node is added,
> h6. Feature disabled:
> * Running tasks will continue to run on the current instance.
> * If a running task fails after a while, it might be rebalanced and run on 
> other instances, according to the new rebalance assignment under the new 
> cluster topology.
> h6. Feature enabled:
> * A running task might be cancelled and rebalanced immediately, according 
> to the new rebalance assignment under the new cluster topology.
> h3. Configuration
> A job-level config field (RebalanceRunningTask) in JobConfig enables or 
> disables this feature. It defaults to false.
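
The behavior defined in the issue for generic tasks can be summarized as a small decision function. This is a reading of the description above, not code from Helix. The class, enum, and method names are hypothetical, and the boolean parameter stands in for the RebalanceRunningTask field.

```java
public class RebalanceRunningTaskSketch {

    // Hypothetical names for the two topology changes the issue describes.
    enum TopologyChange { NODE_FAILED, NODE_ADDED }

    // Returns true if a running generic task may be reassigned right away.
    // For NODE_FAILED this refers to tasks that were running on the failed node.
    static boolean mayReassignImmediately(TopologyChange change, boolean rebalanceRunningTask) {
        switch (change) {
            case NODE_FAILED:
                // Same with the feature on or off: the task died with its node.
                return true;
            case NODE_ADDED:
                // Only when the feature is enabled may a running task be
                // cancelled and moved. When disabled, it keeps running where it is.
                return rebalanceRunningTask;
            default:
                throw new IllegalArgumentException("Unknown change: " + change);
        }
    }
}
```

With the default of false, a new node affects only tasks that are assigned later, including tasks that fail and are then reassigned under the new topology.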



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
