[jira] [Commented] (IGNITE-12930) DistributedProcess fails node if unable to send single message to coordinator

2020-07-24 Thread Aleksey Plekhanov (Jira)


[ https://issues.apache.org/jira/browse/IGNITE-12930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17164205#comment-17164205 ]

Aleksey Plekhanov commented on IGNITE-12930:


Cherry-picked to 2.9

> DistributedProcess fails node if unable to send single message to coordinator
> -----------------------------------------------------------------------------
>
>                 Key: IGNITE-12930
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12930
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Maxim Muzafarov
>            Assignee: Amelchev Nikita
>            Priority: Critical
>             Fix For: 2.9
>
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> The
> [DistributedProcess|https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/util/distributed/DistributedProcess.java]
> fails the local node (the {{FailureHandler}} is invoked with a CRITICAL_ERROR
> failure) if it is unable to send the single message to the coordinator (e.g.
> the coordinator fails right before the single message is sent).
> {code:java}
> try {
>     ctx.io().sendToGridTopic(p.crdId, GridTopic.TOPIC_DISTRIBUTED_PROCESS, singleMsg, SYSTEM_POOL);
> }
> catch (IgniteCheckedException e) {
>     log.error("Unable to send message to coordinator.", e);
>
>     ctx.failure().process(new FailureContext(CRITICAL_ERROR,
>         new Exception("Unable to send message to coordinator.", e)));
> }
> {code}
> h4. Expected behaviour
> If a {{ClusterTopologyCheckedException}} occurs, the node should wait for the
> NODE_LEFT event of the coordinator node and re-init the distributed process
> future.
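
The expected behaviour quoted above can be illustrated with a minimal, self-contained sketch. It is not the patch from the pull request and does not use the Ignite API. All names in it (DistributedProcessSketch, Transport, TopologyException, onNodeLeft, demo) are hypothetical: TopologyException stands in for ClusterTopologyCheckedException, and re-sending the parked message to the new coordinator stands in for re-initializing the distributed process future.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.UUID;

/** Hypothetical sketch: tolerate a coordinator failure instead of failing the local node. */
final class DistributedProcessSketch {
    /** Stand-in for ClusterTopologyCheckedException (assumption, not the Ignite class). */
    static class TopologyException extends Exception {
        TopologyException(String msg) {
            super(msg);
        }
    }

    /** Hypothetical transport; send() throws TopologyException if the target node has left. */
    interface Transport {
        void send(UUID nodeId, String msg) throws Exception;
    }

    private final Transport transport;
    private UUID crdId;          // current coordinator
    private String pendingMsg;   // single message parked until the coordinator's NODE_LEFT event
    private boolean nodeFailed;  // stand-in for invoking the failure handler

    DistributedProcessSketch(Transport transport, UUID crdId) {
        this.transport = transport;
        this.crdId = crdId;
    }

    void sendSingleMessage(String msg) {
        try {
            transport.send(crdId, msg);
        }
        catch (TopologyException e) {
            // Coordinator left the topology: not critical, wait for its NODE_LEFT event.
            pendingMsg = msg;
        }
        catch (Exception e) {
            // Any other send failure is still treated as critical in this sketch.
            nodeFailed = true;
        }
    }

    /** Called on NODE_LEFT; re-sends the parked message if the coordinator was the node that left. */
    void onNodeLeft(UUID leftId, UUID newCrdId) {
        if (!leftId.equals(crdId))
            return;

        crdId = newCrdId;

        if (pendingMsg != null) {
            String msg = pendingMsg;
            pendingMsg = null;
            sendSingleMessage(msg);
        }
    }

    boolean nodeFailed() {
        return nodeFailed;
    }

    /** Simulates a coordinator that fails right before the single message is sent. */
    static boolean demo() {
        UUID oldCrd = UUID.randomUUID();
        UUID newCrd = UUID.randomUUID();
        List<UUID> delivered = new ArrayList<>();

        Transport transport = (nodeId, msg) -> {
            if (nodeId.equals(oldCrd))
                throw new TopologyException("Coordinator left: " + nodeId);

            delivered.add(nodeId);
        };

        DistributedProcessSketch proc = new DistributedProcessSketch(transport, oldCrd);

        proc.sendSingleMessage("single message");
        boolean parked = !proc.nodeFailed() && delivered.isEmpty();

        proc.onNodeLeft(oldCrd, newCrd);

        return parked && !proc.nodeFailed() && delivered.equals(Collections.singletonList(newCrd));
    }
}
```

In this sketch only the topology exception is tolerated; every other send failure still marks the node as failed, which mirrors the narrow scope of the expected behaviour in the description.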



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IGNITE-12930) DistributedProcess fails node if unable to send single message to coordinator

2020-07-23 Thread Maxim Muzafarov (Jira)


[ https://issues.apache.org/jira/browse/IGNITE-12930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17163767#comment-17163767 ]

Maxim Muzafarov commented on IGNITE-12930:
--

[~alex_pl],

Can you please cherry-pick the changes to the 2.9 branch?



[jira] [Commented] (IGNITE-12930) DistributedProcess fails node if unable to send single message to coordinator

2020-07-23 Thread Maxim Muzafarov (Jira)


[ https://issues.apache.org/jira/browse/IGNITE-12930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17163715#comment-17163715 ]

Maxim Muzafarov commented on IGNITE-12930:
--

LGTM, merged to the master branch.



[jira] [Commented] (IGNITE-12930) DistributedProcess fails node if unable to send single message to coordinator

2020-07-22 Thread Ignite TC Bot (Jira)


[ https://issues.apache.org/jira/browse/IGNITE-12930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17163027#comment-17163027 ]

Ignite TC Bot commented on IGNITE-12930:


{panel:title=Branch: [pull/7714/head] Base: [master] : Possible Blockers 
(1)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}PDS (Indexing){color} [[tests 0 Exit Code 
|https://ci.ignite.apache.org/viewLog.html?buildId=5483734]]

{panel}
{panel:title=Branch: [pull/7714/head] Base: [master] : New Tests 
(9)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}
{color:#8b}Basic 1{color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=5476398]]
* {color:#013220}IgniteBasicTestSuite: 
DistributedProcessCoordinatorLeftTest.testCoordinatorFailed - PASSED{color}

{color:#8b}Service Grid (legacy mode){color} [[tests 
4|https://ci.ignite.apache.org/viewLog.html?buildId=5477629]]
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.topologyVersion[Test event=IgniteBiTuple 
[val1=DiscoveryEvent [evtNode=d39c1354-630f-4473-baf5-a1a7f6d98db1, topVer=0, 
msgTemplate=null, span=null, nodeId8=834e6b1d, msg=, type=NODE_JOINED, 
tstamp=1595272932280], val2=AffinityTopologyVersion 
[topVer=2506475986375759344, minorTopVer=0]]] - PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.requestId[Test event=IgniteBiTuple 
[val1=DiscoveryEvent [evtNode=d39c1354-630f-4473-baf5-a1a7f6d98db1, topVer=0, 
msgTemplate=null, span=null, nodeId8=834e6b1d, msg=, type=NODE_JOINED, 
tstamp=1595272932280], val2=AffinityTopologyVersion 
[topVer=2506475986375759344, minorTopVer=0]]] - PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.topologyVersion[Test event=IgniteBiTuple 
[val1=DiscoveryCustomEvent [customMsg=ServiceChangeBatchRequest 
[id=81787bd6371-eb1f7fb7-6b13-4baa-8289-717b85e530b2, reqs=SingletonList 
[ServiceUndeploymentRequest []]], affTopVer=null, super=DiscoveryEvent 
[evtNode=abf8efb3-8dee-41e0-a013-6a0b2749be78, topVer=0, msgTemplate=null, 
span=null, nodeId8=abf8efb3, msg=null, type=DISCOVERY_CUSTOM_EVT, 
tstamp=1595272932280]], val2=AffinityTopologyVersion 
[topVer=-331686843132431608, minorTopVer=0]]] - PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.requestId[Test event=IgniteBiTuple 
[val1=DiscoveryCustomEvent [customMsg=ServiceChangeBatchRequest 
[id=81787bd6371-eb1f7fb7-6b13-4baa-8289-717b85e530b2, reqs=SingletonList 
[ServiceUndeploymentRequest []]], affTopVer=null, super=DiscoveryEvent 
[evtNode=abf8efb3-8dee-41e0-a013-6a0b2749be78, topVer=0, msgTemplate=null, 
span=null, nodeId8=abf8efb3, msg=null, type=DISCOVERY_CUSTOM_EVT, 
tstamp=1595272932280]], val2=AffinityTopologyVersion 
[topVer=-331686843132431608, minorTopVer=0]]] - PASSED{color}

{color:#8b}Service Grid{color} [[tests 
4|https://ci.ignite.apache.org/viewLog.html?buildId=5476451]]
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.topologyVersion[Test event=IgniteBiTuple 
[val1=DiscoveryEvent [evtNode=2dd53edc-dc06-462b-bf30-e545e6c14dfe, topVer=0, 
msgTemplate=null, span=null, nodeId8=aafd3c7e, msg=, type=NODE_JOINED, 
tstamp=1595260520048], val2=AffinityTopologyVersion 
[topVer=-4844912795698995190, minorTopVer=0]]] - PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.requestId[Test event=IgniteBiTuple 
[val1=DiscoveryEvent [evtNode=2dd53edc-dc06-462b-bf30-e545e6c14dfe, topVer=0, 
msgTemplate=null, span=null, nodeId8=aafd3c7e, msg=, type=NODE_JOINED, 
tstamp=1595260520048], val2=AffinityTopologyVersion 
[topVer=-4844912795698995190, minorTopVer=0]]] - PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.topologyVersion[Test event=IgniteBiTuple 
[val1=DiscoveryCustomEvent [customMsg=ServiceChangeBatchRequest 
[id=162a9fc6371-764d10c4-ba93-4e98-9c23-7213e8c6d27f, reqs=SingletonList 
[ServiceUndeploymentRequest []]], affTopVer=null, super=DiscoveryEvent 
[evtNode=76541175-c259-4c83-a5c0-97e6eaf3eec5, topVer=0, msgTemplate=null, 
span=null, nodeId8=76541175, msg=null, type=DISCOVERY_CUSTOM_EVT, 
tstamp=1595260520048]], val2=AffinityTopologyVersion 
[topVer=4650137978769321553, minorTopVer=0]]] - PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.requestId[Test event=IgniteBiTuple 
[val1=DiscoveryCustomEvent [customMsg=ServiceChangeBatchRequest 
[id=162a9fc6371-764d10c4-ba93-4e98-9c23-7213e8c6d27f, reqs=SingletonList 
[ServiceUndeploymentRequest []]], affTopVer=null, super=DiscoveryEvent 
[evtNode=76541175-c259-4c83-a5c0-97e6eaf3eec5, topVer=0, msgTemplate=null, 
span=null, nodeId8=76541175, msg=null, type=DISCOVERY_CUSTOM_EVT, 
tstamp=1595260520048]], val2=AffinityTopologyVersion 
[topVer=4650137978769321553, minorTopVer=0]]] - PASSED{color}

{panel}

[jira] [Commented] (IGNITE-12930) DistributedProcess fails node if unable to send single message to coordinator

2020-07-20 Thread Aleksey Plekhanov (Jira)


[ https://issues.apache.org/jira/browse/IGNITE-12930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17161050#comment-17161050 ]

Aleksey Plekhanov commented on IGNITE-12930:


[~mmuzaf], any updates here?

> DistributedProcess fails node if unable to send single message to coordinator
> -----------------------------------------------------------------------------
>
>                 Key: IGNITE-12930
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12930
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Maxim Muzafarov
>            Assignee: Amelchev Nikita
>            Priority: Major
>             Fix For: 2.9
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h


[jira] [Commented] (IGNITE-12930) DistributedProcess fails node if unable to send single message to coordinator

2020-07-10 Thread Maxim Muzafarov (Jira)


[ https://issues.apache.org/jira/browse/IGNITE-12930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17155212#comment-17155212 ]

Maxim Muzafarov commented on IGNITE-12930:
--

[~alex_pl] 

Yes, I'll finish the review.
I think we should include this issue in the 2.9 release, since it fails the
node in rare cases when a TDE or snapshot operation starts.

> DistributedProcess fails node if unable to send single message to coordinator
> -----------------------------------------------------------------------------
>
>                 Key: IGNITE-12930
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12930
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Maxim Muzafarov
>            Assignee: Amelchev Nikita
>            Priority: Major
>             Fix For: 2.9
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h


[jira] [Commented] (IGNITE-12930) DistributedProcess fails node if unable to send single message to coordinator

2020-07-10 Thread Aleksey Plekhanov (Jira)


[ https://issues.apache.org/jira/browse/IGNITE-12930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17155199#comment-17155199 ]

Aleksey Plekhanov commented on IGNITE-12930:


It looks like the review was started but not finished.

[~mmuzaf], do you plan to finish the review so that the ticket can be resolved
in the 2.9 release?

> DistributedProcess fails node if unable to send single message to coordinator
> -----------------------------------------------------------------------------
>
>                 Key: IGNITE-12930
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12930
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Maxim Muzafarov
>            Assignee: Amelchev Nikita
>            Priority: Major
>             Fix For: 2.9
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h


[jira] [Commented] (IGNITE-12930) DistributedProcess fails node if unable to send single message to coordinator

2020-04-23 Thread Ignite TC Bot (Jira)


[ https://issues.apache.org/jira/browse/IGNITE-12930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090424#comment-17090424 ]

Ignite TC Bot commented on IGNITE-12930:


{panel:title=Branch: [pull/7714/head] Base: [master] : No blockers 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
[TeamCity *--> Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=5248855&buildTypeId=IgniteTests24Java8_RunAll]

> DistributedProcess fails node if unable to send single message to coordinator
> -----------------------------------------------------------------------------
>
>                 Key: IGNITE-12930
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12930
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Maxim Muzafarov
>            Assignee: Amelchev Nikita
>            Priority: Major
>             Fix For: 2.9
>
>          Time Spent: 10m
>  Remaining Estimate: 0h