[ https://issues.apache.org/jira/browse/HELIX-682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451348#comment-16451348 ]
ASF GitHub Bot commented on HELIX-682:
--------------------------------------

GitHub user zhan849 opened a pull request:

    https://github.com/apache/helix/pull/195

    [HELIX-682] delete duplicated message and log error in HelixTaskExecutor on participant

    This PR is the second part of message dedup on the participant side.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zhan849/helix harry/participant-msg-dedup

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/helix/pull/195.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #195

----
commit 8aba9bea0734da11722fbc8cceb74f34dd6a37c6
Author: Harry Zhang <zhan849@...>
Date:   2018-04-24T22:34:08Z

    [HELIX-682] delete duplicated message and log error in HelixTaskExecutor on participant

----

> Stale message should not prevent controller from rebalancing resource
> ---------------------------------------------------------------------
>
>                 Key: HELIX-682
>                 URL: https://issues.apache.org/jira/browse/HELIX-682
>             Project: Apache Helix
>          Issue Type: Bug
>            Reporter: Hao Zhang
>            Priority: Major
>
> Currently, during MessageGenerationPhase, we skip rebalancing when there is
> a pending message. Although we assume that participants delete messages when
> they finish the task, there are cases where ZK is unstable and a participant
> fails to do so, which leaves the message undeleted and thus blocks
> rebalancing.
> Ideally, the controller side should try to delete messages as well: if a
> partition's current state is the same as the message's toState, or a totally
> invalid message remains, the controller should try to delete the message to
> unblock rebalancing.
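
The cleanup rule proposed in the issue description can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the pull request: `StaleMessageSketch`, `PendingMessage`, and `shouldDelete` are hypothetical names rather than Helix APIs, "totally invalid" is interpreted here as a message missing its partition or target state, and the state names are example values.

```java
/**
 * Hypothetical sketch of the stale-message rule described in HELIX-682.
 * None of these types or methods are Helix APIs.
 */
class StaleMessageSketch {

    /** Minimal stand-in for a pending state-transition message. */
    static final class PendingMessage {
        final String partition;
        final String fromState;
        final String toState;

        PendingMessage(String partition, String fromState, String toState) {
            this.partition = partition;
            this.fromState = fromState;
            this.toState = toState;
        }
    }

    /**
     * Returns true when the controller could delete the message instead of
     * letting it block rebalancing.
     */
    static boolean shouldDelete(PendingMessage msg, String currentState) {
        // Assumption: a message without a partition or a target state is
        // treated as "totally invalid" and can never be acted upon.
        if (msg.partition == null || msg.toState == null) {
            return true;
        }
        // The partition already reached the target state, so the participant
        // finished the transition but failed to delete the message.
        return msg.toState.equals(currentState);
    }

    public static void main(String[] args) {
        PendingMessage msg = new PendingMessage("db_0", "SLAVE", "MASTER");
        System.out.println(shouldDelete(msg, "MASTER")); // transition already applied
        System.out.println(shouldDelete(msg, "SLAVE"));  // transition still pending
    }
}
```

A message whose transition is still pending is kept, so this check alone would not interfere with in-flight state transitions.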