[ https://issues.apache.org/jira/browse/HELIX-681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16408337#comment-16408337 ]
ASF GitHub Bot commented on HELIX-681:
--------------------------------------

Github user zhan849 commented on a diff in the pull request:

    https://github.com/apache/helix/pull/152#discussion_r176182830

    --- Diff: helix-core/src/main/java/org/apache/helix/messaging/handling/HelixTask.java ---
    @@ -168,7 +169,14 @@ public HelixTaskResult call() {
           // forward relay messages attached to this message to other participants
           if (taskResult.isSuccess()) {
    -        forwardRelayMessages(accessor, _message, taskResult.getCompleteTime());
    +        try {
    +          forwardRelayMessages(accessor, _message, taskResult.getCompleteTime());
    +        } catch (Exception e) {
    +          // Fail to send relay message should not result in a task execution failure
    +          // Currently we don't log error to ZK to reduce writes as when accessor throws
    +          // exception, ZK might not be in good condition.
    +          logger.error("Failed to send relay messages.", e);
    --- End diff --

    will change

> Participant should not fail state transition on fail to delete / relay message
> ------------------------------------------------------------------------------
>
>                 Key: HELIX-681
>                 URL: https://issues.apache.org/jira/browse/HELIX-681
>             Project: Apache Helix
>          Issue Type: Bug
>            Reporter: Hao Zhang
>            Priority: Major
>
> Currently we have a general try-catch block in HelixTask and
> HelixTaskExecutor which fails the state transition on any exception thrown
> from the state transition routine. However, there are at least the following
> cases in which the state transition should be considered successful:
> * When we fail to delete the message after successfully handling it and
> updating the current state. We have already completed the state transition,
> and the current state is consistent between the participant and ZK.
> * When we fail to send out a relay message. Relay messages provide only
> best-effort delivery, which has nothing to do with the state
> transition's result. If relaying fails, the controller will resend the
> message, which ensures correctness.
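
The behaviour requested in the issue can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not Helix code: `RelaySketch`, `Step`, `Result`, `execute`, and `bestEffort` are hypothetical names, and the order of the two follow-up steps is assumed. Only a failure of the transition itself marks the result as failed; a failure to delete the message or to forward relay messages is recorded and otherwise ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these types are part of Apache Helix.
final class RelaySketch {

    /** A unit of work that may throw, standing in for a real handler step. */
    interface Step {
        void run() throws Exception;
    }

    /** Simplified stand-in for a task result. */
    static final class Result {
        final boolean success;
        final List<String> warnings = new ArrayList<>();

        Result(boolean success) {
            this.success = success;
        }
    }

    /**
     * Runs the state transition, then the follow-up steps.
     * Only an exception from the transition itself yields a failed result.
     */
    static Result execute(Step transition, Step deleteMessage, Step forwardRelay) {
        try {
            transition.run();
        } catch (Exception e) {
            return new Result(false);
        }
        Result result = new Result(true);
        // Assumed order: delete the handled message, then forward relay messages.
        bestEffort("delete message", deleteMessage, result);
        bestEffort("forward relay messages", forwardRelay, result);
        return result;
    }

    /** Runs a step and records, rather than propagates, any exception. */
    private static void bestEffort(String name, Step step, Result result) {
        try {
            step.run();
        } catch (Exception e) {
            result.warnings.add(name + " failed: " + e.getMessage());
        }
    }

    public static void main(String[] args) {
        // The transition succeeds, but forwarding relay messages throws.
        Result r = execute(
            () -> {},
            () -> {},
            () -> { throw new IllegalStateException("ZK unavailable"); });
        System.out.println(r.success + " " + r.warnings);
    }
}
```

Collecting the failures in a list, instead of only logging them, is a choice made for this sketch so the outcome can be inspected; the diff above logs the error locally and deliberately avoids writing it to ZK.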