[jira] [Commented] (IGNITE-12930) DistributedProcess fails node if unable to send single message to coordinator
[ https://issues.apache.org/jira/browse/IGNITE-12930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164205#comment-17164205 ]

Aleksey Plekhanov commented on IGNITE-12930:
--------------------------------------------

Cherry-picked to 2.9

> DistributedProcess fails node if unable to send single message to coordinator
> -----------------------------------------------------------------------------
>
>                 Key: IGNITE-12930
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12930
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Maxim Muzafarov
>            Assignee: Amelchev Nikita
>            Priority: Critical
>             Fix For: 2.9
>
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> The [DistributedProcess|https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/util/distributed/DistributedProcess.java]
> fails the local node ({{FailureHandler}} CRITICAL_ERROR thrown) if it is unable to
> send a message to the coordinator (e.g. the coordinator fails right before
> the single message is sent).
> {code:java}
> try {
>     ctx.io().sendToGridTopic(p.crdId, GridTopic.TOPIC_DISTRIBUTED_PROCESS, singleMsg, SYSTEM_POOL);
> }
> catch (IgniteCheckedException e) {
>     log.error("Unable to send message to coordinator.", e);
>     ctx.failure().process(new FailureContext(CRITICAL_ERROR,
>         new Exception("Unable to send message to coordinator.", e)));
> }
> {code}
> h4. Expected behaviour
> If a {{ClusterTopologyCheckedException}} occurs, the node needs to wait for the
> NODE_LEFT event of the coordinator node and re-init the distributed process
> future.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
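The expected behaviour quoted in the issue (wait for the coordinator's NODE_LEFT event instead of failing the local node) can be illustrated with a minimal, self-contained Java sketch. It is not the patch from pull/7714 and uses no Ignite classes: `SingleMessageSendSketch`, `Transport`, `TopologyException`, and the `Outcome` values are hypothetical names introduced only for this illustration, and the way a new coordinator is chosen is assumed to happen elsewhere.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

/** Sketch only: hypothetical names, not Ignite's DistributedProcess. */
class SingleMessageSendSketch {
    enum Outcome { SENT, WAITING_FOR_NODE_LEFT, NODE_FAILED, NOTHING_TO_DO }

    private final Transport transport;
    private UUID crdId;        // Current coordinator.
    private String pendingMsg; // Message to resend once a new coordinator is known.

    SingleMessageSendSketch(Transport transport, UUID crdId) {
        this.transport = transport;
        this.crdId = crdId;
    }

    /** Tries to send the single message to the current coordinator. */
    Outcome sendSingleMessage(String msg) {
        try {
            transport.send(crdId, msg);
            pendingMsg = null;
            return Outcome.SENT;
        }
        catch (TopologyException e) {
            // Coordinator is gone: keep the message and wait for NODE_LEFT
            // instead of escalating to a critical failure.
            pendingMsg = msg;
            return Outcome.WAITING_FOR_NODE_LEFT;
        }
        catch (Exception e) {
            // Any other send failure is still treated as critical in this sketch.
            return Outcome.NODE_FAILED;
        }
    }

    /** Handles a NODE_LEFT event; newCrdId is assumed to be elected elsewhere. */
    Outcome onNodeLeft(UUID leftNodeId, UUID newCrdId) {
        if (!leftNodeId.equals(crdId))
            return Outcome.NOTHING_TO_DO;

        crdId = newCrdId;

        return pendingMsg == null ? Outcome.NOTHING_TO_DO : sendSingleMessage(pendingMsg);
    }

    /** Scenario: the old coordinator has already left when the message is sent. */
    static Outcome[] runScenario() {
        UUID oldCrd = UUID.randomUUID();
        UUID newCrd = UUID.randomUUID();

        Set<UUID> alive = new HashSet<>();
        alive.add(newCrd);

        Transport transport = (nodeId, msg) -> {
            if (!alive.contains(nodeId))
                throw new TopologyException("Node left: " + nodeId);
        };

        SingleMessageSendSketch proc = new SingleMessageSendSketch(transport, oldCrd);

        Outcome first = proc.sendSingleMessage("single-message");
        Outcome second = proc.onNodeLeft(oldCrd, newCrd);

        return new Outcome[] {first, second};
    }

    public static void main(String[] args) {
        Outcome[] res = runScenario();

        System.out.println(res[0] + " -> " + res[1]); // WAITING_FOR_NODE_LEFT -> SENT
    }
}

/** Hypothetical stand-in for a "target node left the topology" failure. */
class TopologyException extends Exception {
    TopologyException(String msg) {
        super(msg);
    }
}

/** Hypothetical transport; a real system would send over the network. */
interface Transport {
    void send(UUID nodeId, String msg) throws Exception;
}
```

In this sketch only the topology-related failure is deferred; every other send error still maps to `NODE_FAILED`, which mirrors the distinction the issue draws between a departed coordinator and a genuine critical error.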
[jira] [Commented] (IGNITE-12930) DistributedProcess fails node if unable to send single message to coordinator
[ https://issues.apache.org/jira/browse/IGNITE-12930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163767#comment-17163767 ]

Maxim Muzafarov commented on IGNITE-12930:
------------------------------------------

[~alex_pl], can you please cherry-pick the changes to the 2.9 branch?
[jira] [Commented] (IGNITE-12930) DistributedProcess fails node if unable to send single message to coordinator
[ https://issues.apache.org/jira/browse/IGNITE-12930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163715#comment-17163715 ]

Maxim Muzafarov commented on IGNITE-12930:
------------------------------------------

LGTM, merged to the master branch.
[jira] [Commented] (IGNITE-12930) DistributedProcess fails node if unable to send single message to coordinator
[ https://issues.apache.org/jira/browse/IGNITE-12930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163027#comment-17163027 ]

Ignite TC Bot commented on IGNITE-12930:
----------------------------------------

{panel:title=Branch: [pull/7714/head] Base: [master] : Possible Blockers (1)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}PDS (Indexing){color} [[tests 0 Exit Code |https://ci.ignite.apache.org/viewLog.html?buildId=5483734]]

{panel}
{panel:title=Branch: [pull/7714/head] Base: [master] : New Tests (9)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}
{color:#8b}Basic 1{color} [[tests 1|https://ci.ignite.apache.org/viewLog.html?buildId=5476398]]
* {color:#013220}IgniteBasicTestSuite: DistributedProcessCoordinatorLeftTest.testCoordinatorFailed - PASSED{color}

{color:#8b}Service Grid (legacy mode){color} [[tests 4|https://ci.ignite.apache.org/viewLog.html?buildId=5477629]]
* {color:#013220}IgniteServiceGridTestSuite: ServiceDeploymentProcessIdSelfTest.topologyVersion[Test event=IgniteBiTuple [val1=DiscoveryEvent [evtNode=d39c1354-630f-4473-baf5-a1a7f6d98db1, topVer=0, msgTemplate=null, span=null, nodeId8=834e6b1d, msg=, type=NODE_JOINED, tstamp=1595272932280], val2=AffinityTopologyVersion [topVer=2506475986375759344, minorTopVer=0]]] - PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: ServiceDeploymentProcessIdSelfTest.requestId[Test event=IgniteBiTuple [val1=DiscoveryEvent [evtNode=d39c1354-630f-4473-baf5-a1a7f6d98db1, topVer=0, msgTemplate=null, span=null, nodeId8=834e6b1d, msg=, type=NODE_JOINED, tstamp=1595272932280], val2=AffinityTopologyVersion [topVer=2506475986375759344, minorTopVer=0]]] - PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: ServiceDeploymentProcessIdSelfTest.topologyVersion[Test event=IgniteBiTuple [val1=DiscoveryCustomEvent [customMsg=ServiceChangeBatchRequest [id=81787bd6371-eb1f7fb7-6b13-4baa-8289-717b85e530b2, reqs=SingletonList [ServiceUndeploymentRequest []]], affTopVer=null, super=DiscoveryEvent [evtNode=abf8efb3-8dee-41e0-a013-6a0b2749be78, topVer=0, msgTemplate=null, span=null, nodeId8=abf8efb3, msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1595272932280]], val2=AffinityTopologyVersion [topVer=-331686843132431608, minorTopVer=0]]] - PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: ServiceDeploymentProcessIdSelfTest.requestId[Test event=IgniteBiTuple [val1=DiscoveryCustomEvent [customMsg=ServiceChangeBatchRequest [id=81787bd6371-eb1f7fb7-6b13-4baa-8289-717b85e530b2, reqs=SingletonList [ServiceUndeploymentRequest []]], affTopVer=null, super=DiscoveryEvent [evtNode=abf8efb3-8dee-41e0-a013-6a0b2749be78, topVer=0, msgTemplate=null, span=null, nodeId8=abf8efb3, msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1595272932280]], val2=AffinityTopologyVersion [topVer=-331686843132431608, minorTopVer=0]]] - PASSED{color}

{color:#8b}Service Grid{color} [[tests 4|https://ci.ignite.apache.org/viewLog.html?buildId=5476451]]
* {color:#013220}IgniteServiceGridTestSuite: ServiceDeploymentProcessIdSelfTest.topologyVersion[Test event=IgniteBiTuple [val1=DiscoveryEvent [evtNode=2dd53edc-dc06-462b-bf30-e545e6c14dfe, topVer=0, msgTemplate=null, span=null, nodeId8=aafd3c7e, msg=, type=NODE_JOINED, tstamp=1595260520048], val2=AffinityTopologyVersion [topVer=-4844912795698995190, minorTopVer=0]]] - PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: ServiceDeploymentProcessIdSelfTest.requestId[Test event=IgniteBiTuple [val1=DiscoveryEvent [evtNode=2dd53edc-dc06-462b-bf30-e545e6c14dfe, topVer=0, msgTemplate=null, span=null, nodeId8=aafd3c7e, msg=, type=NODE_JOINED, tstamp=1595260520048], val2=AffinityTopologyVersion [topVer=-4844912795698995190, minorTopVer=0]]] - PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: ServiceDeploymentProcessIdSelfTest.topologyVersion[Test event=IgniteBiTuple [val1=DiscoveryCustomEvent [customMsg=ServiceChangeBatchRequest [id=162a9fc6371-764d10c4-ba93-4e98-9c23-7213e8c6d27f, reqs=SingletonList [ServiceUndeploymentRequest []]], affTopVer=null, super=DiscoveryEvent [evtNode=76541175-c259-4c83-a5c0-97e6eaf3eec5, topVer=0, msgTemplate=null, span=null, nodeId8=76541175, msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1595260520048]], val2=AffinityTopologyVersion [topVer=4650137978769321553, minorTopVer=0]]] - PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: ServiceDeploymentProcessIdSelfTest.requestId[Test event=IgniteBiTuple [val1=DiscoveryCustomEvent [customMsg=ServiceChangeBatchRequest [id=162a9fc6371-764d10c4-ba93-4e98-9c23-7213e8c6d27f, reqs=SingletonList [ServiceUndeploymentRequest []]], affTopVer=null, super=DiscoveryEvent [evtNode=76541175-c259-4c83-a5c0-97e6eaf3eec5, topVer=0, msgTemplate=null, span=null, nodeId8=76541175, msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1595260520048]], val2=AffinityTopologyVersion [topVer=4650137978769321553, minorTopVer=0]]] - PASSED{color}

{panel}
[jira] [Commented] (IGNITE-12930) DistributedProcess fails node if unable to send single message to coordinator
[ https://issues.apache.org/jira/browse/IGNITE-12930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17161050#comment-17161050 ]

Aleksey Plekhanov commented on IGNITE-12930:
--------------------------------------------

[~mmuzaf], any updates here?

> DistributedProcess fails node if unable to send single message to coordinator
> -----------------------------------------------------------------------------
>
>                 Key: IGNITE-12930
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12930
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Maxim Muzafarov
>            Assignee: Amelchev Nikita
>            Priority: Major
>             Fix For: 2.9
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
[jira] [Commented] (IGNITE-12930) DistributedProcess fails node if unable to send single message to coordinator
[ https://issues.apache.org/jira/browse/IGNITE-12930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155212#comment-17155212 ]

Maxim Muzafarov commented on IGNITE-12930:
------------------------------------------

[~alex_pl] Yes, I'll finish the review. I think we should include this issue in the 2.9 release, since it fails the node in rare cases when a TDE or snapshot operation starts.
[jira] [Commented] (IGNITE-12930) DistributedProcess fails node if unable to send single message to coordinator
[ https://issues.apache.org/jira/browse/IGNITE-12930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155199#comment-17155199 ]

Aleksey Plekhanov commented on IGNITE-12930:
--------------------------------------------

It looks like the review was started but not finished. [~mmuzaf], do you plan to finish the review so that the ticket can be resolved in the 2.9 release?
[jira] [Commented] (IGNITE-12930) DistributedProcess fails node if unable to send single message to coordinator
[ https://issues.apache.org/jira/browse/IGNITE-12930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090424#comment-17090424 ]

Ignite TC Bot commented on IGNITE-12930:
----------------------------------------

{panel:title=Branch: [pull/7714/head] Base: [master] : No blockers found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
[TeamCity *-- Run :: All* Results|https://ci.ignite.apache.org/viewLog.html?buildId=5248855&buildTypeId=IgniteTests24Java8_RunAll]

> DistributedProcess fails node if unable to send single message to coordinator
> -----------------------------------------------------------------------------
>
>                 Key: IGNITE-12930
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12930
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Maxim Muzafarov
>            Assignee: Amelchev Nikita
>            Priority: Major
>             Fix For: 2.9
>
>          Time Spent: 10m
>  Remaining Estimate: 0h