[ 
https://issues.apache.org/jira/browse/HELIX-682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16408772#comment-16408772
 ] 

ASF GitHub Bot commented on HELIX-682:
--------------------------------------

Github user dasahcc commented on a diff in the pull request:

    https://github.com/apache/helix/pull/156#discussion_r176271788
  
    --- Diff: helix-core/src/main/java/org/apache/helix/controller/stages/MessageGenerationPhase.java ---
    @@ -121,6 +131,18 @@ public void process(ClusterEvent event) throws Exception {
     
               Message message = null;
     
    +          if (shouldCleanUpPendingMessage(pendingMessage, currentState,
    +              currentStateOutput.getEndTime(resourceName, partition, instanceName))) {
    +            logger.info(
    +                "Adding pending message {} on instance {} to GC. Msg: {}->{}, current state of resource {}:{} is {}",
    --- End diff --
    
    Let's not use the name "GC" for it.


> Stale message should not prevent controller from rebalancing resource
> ---------------------------------------------------------------------
>
>                 Key: HELIX-682
>                 URL: https://issues.apache.org/jira/browse/HELIX-682
>             Project: Apache Helix
>          Issue Type: Bug
>            Reporter: Hao Zhang
>            Priority: Major
>
> Currently, during MessageGenerationPhase, we skip rebalancing when there is a 
> pending message. Although we assume that participants delete messages when 
> they finish the task, there are cases where ZK is not stable and a 
> participant fails to do so, which leaves the message undeleted and thus blocks 
> rebalancing.
> Ideally, the controller should try to delete the message as well: if the 
> partition's current state is the same as the message's toState, or a totally 
> invalid message remains, the controller should try to delete the message to 
> unblock rebalancing.
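
The cleanup condition described in the issue can be sketched roughly as follows. This is only an illustration, not the code from the pull request: the class PendingMessageCleanupSketch, the PendingMessage stand-in type, and the rule that a message without a toState counts as "invalid" are all assumptions made for the example, and the end-time argument visible in the diff is left out.

```java
// Hypothetical sketch of the controller-side check; not the Helix implementation.
final class PendingMessageCleanupSketch {

  /** Minimal stand-in for a state-transition message (not org.apache.helix.model.Message). */
  static final class PendingMessage {
    final String fromState;
    final String toState;

    PendingMessage(String fromState, String toState) {
      this.fromState = fromState;
      this.toState = toState;
    }
  }

  /** Assumption for this sketch: a message with no target state is treated as invalid. */
  static boolean isInvalid(PendingMessage message) {
    return message.toState == null || message.toState.isEmpty();
  }

  /**
   * Returns true when the controller could delete the pending message itself:
   * either the message is invalid, or the partition has already reached the
   * message's toState, so the participant should have deleted it.
   */
  static boolean shouldCleanUp(PendingMessage pending, String currentState) {
    if (pending == null) {
      return false; // nothing is pending, so nothing blocks rebalancing
    }
    if (isInvalid(pending)) {
      return true;
    }
    return pending.toState.equals(currentState);
  }

  private PendingMessageCleanupSketch() {
    // static helpers only
  }
}
```

With this check, a message for OFFLINE->SLAVE on a partition that is already SLAVE would be cleaned up, while the same message on a partition that is still OFFLINE would be left alone so the transition can finish.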



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
