[jira] [Commented] (IGNITE-4499) TcpDiscoverySpi is not reliable in some network split scenarios.

2017-01-18 Thread Andrey Gura (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-4499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828296#comment-15828296
 ] 

Andrey Gura commented on IGNITE-4499:
-

Fixed. Current solution: the node is kicked out of the topology (forcibly 
failed). At the moment this works only for TCP connections, not shmem.

{{TcpCommunicationSpi}} fails a node if it cannot connect to the remote node 
(server or client) and all retries have failed. A server node can fail either a 
server or a client node. Client nodes can fail only other client nodes. This is 
implemented in the {{createTcpClient()}} method.
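
The rule about which node may forcibly fail which can be sketched as a small predicate. This is only an illustration of the rule as described in this comment, not the actual code in {{TcpCommunicationSpi}}; the class {{ForcedFailRule}} and the method {{mayFail}} are hypothetical names.

```java
/** Hypothetical helper illustrating the rule from this comment; not part of Ignite. */
final class ForcedFailRule {
    private ForcedFailRule() {
        // Static helper, no instances.
    }

    /**
     * @param localIsClient  whether the node that exhausted its connection retries is a client
     * @param remoteIsClient whether the unreachable node is a client
     * @return true if the local node may ask for the remote node to be forcibly failed
     */
    static boolean mayFail(boolean localIsClient, boolean remoteIsClient) {
        // A server may fail servers and clients; a client may fail only other clients.
        return !localIsClient || remoteIsClient;
    }
}
```

For example, {{ForcedFailRule.mayFail(true, false)}} is false: a client that cannot reach a server does not get that server removed from the topology.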

{{TcpDiscoveryNodeFailedMessage}} is handled by {{TcpDiscoverySpi}} in a 
special manner when a node is forcibly failed. Nodes do not handle this 
message unless it has been verified by the coordinator. This avoids topology 
crashes in cases where, for example, two nodes try to kick each other out 
(see the changes in the {{ServerImpl}} class).
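
A minimal sketch of why coordinator verification helps when two nodes try to kick each other out. It is not the {{ServerImpl}} code: the classes below are hypothetical stand-ins, and the coordinator's rule (reject a request whose sender or target has already been failed) is an assumption made for illustration, since this comment does not describe how the coordinator decides.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of coordinator-verified forced failure; not Ignite code. */
final class ForcedFailSketch {
    private ForcedFailSketch() {
        // Container for the nested sketch classes.
    }

    /** Simplified stand-in for a node-failed discovery message. */
    static final class NodeFailedMsg {
        final String senderId;
        final String failedId;
        boolean verified; // Set only by the coordinator.

        NodeFailedMsg(String senderId, String failedId) {
            this.senderId = senderId;
            this.failedId = failedId;
        }
    }

    /** Coordinator side: the single place where failure requests are ordered. */
    static final class Coordinator {
        private final Set<String> failed = new HashSet<>();

        /** Assumed rule: ignore requests that involve an already failed node. */
        boolean verify(NodeFailedMsg msg) {
            if (failed.contains(msg.senderId) || failed.contains(msg.failedId))
                return false;

            failed.add(msg.failedId);
            msg.verified = true;

            return true;
        }
    }

    /** Regular node side: unverified messages are not applied to the topology. */
    static boolean shouldHandle(NodeFailedMsg msg) {
        return msg.verified;
    }
}
```

If A asks to fail B and B asks to fail A, the coordinator in this sketch verifies only the request it sees first, so the other nodes remove exactly one of the two.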

A client node can now receive {{TcpDiscoveryNodeFailedMessage}} in case of a 
forcible failure. In this case the client reconnects after a delay.


> TcpDiscoverySpi is not reliable in some network split scenarios.
> 
>
> Key: IGNITE-4499
> URL: https://issues.apache.org/jira/browse/IGNITE-4499
> Project: Ignite
> Issue Type: Bug
> Components: general
> Affects Versions: 1.6
> Reporter: Alexei Scherbakov
> Assignee: Andrey Gura
> Fix For: 2.0
>
>
> There is a possible caveat in the current discovery implementation, which 
> uses a ring of nodes.
> Imagine a grid consisting of nodes A, B, C and D.
> Let them form the ring:
> A-B-C-D-A
> If network connectivity issues arise between nodes A-C and B-D, the
> discovery SPI will never know about it and will continue to assume the 
> topology is valid. 
> On the other side, TcpCommunicationSpi will try to run transactions on this 
> topology and will never succeed.
> We must drop nodes from the topology on communication SPI errors.
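
The scenario in the description can be checked with a small sketch. It models the ring A-B-C-D-A and the two broken links A-C and B-D, and shows that a check along ring neighbours passes while a check over all node pairs does not. The class {{RingSplitSketch}} and its methods are hypothetical and do not correspond to any Ignite API.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical model of the ring scenario from the issue description; not Ignite code. */
final class RingSplitSketch {
    private RingSplitSketch() {
        // Static helper, no instances.
    }

    /** Order-independent key for an undirected link, e.g. "A-C". */
    static String link(String a, String b) {
        return a.compareTo(b) <= 0 ? a + "-" + b : b + "-" + a;
    }

    /** What ring-based discovery sees: each node checks only its next neighbour. */
    static boolean ringHealthy(List<String> ring, Set<String> broken) {
        for (int i = 0; i < ring.size(); i++) {
            String cur = ring.get(i);
            String next = ring.get((i + 1) % ring.size());

            if (broken.contains(link(cur, next)))
                return false;
        }

        return true;
    }

    /** What communication needs: any node may have to talk to any other node. */
    static boolean fullyConnected(List<String> nodes, Set<String> broken) {
        for (int i = 0; i < nodes.size(); i++)
            for (int j = i + 1; j < nodes.size(); j++)
                if (broken.contains(link(nodes.get(i), nodes.get(j))))
                    return false;

        return true;
    }
}
```

With the ring A, B, C, D and the broken links A-C and B-D, {{ringHealthy}} returns true because only A-B, B-C, C-D and D-A are inspected, while {{fullyConnected}} returns false.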



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (IGNITE-4499) TcpDiscoverySpi is not reliable in some network split scenarios.

2016-12-27 Thread Vyacheslav Daradur (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-4499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15780538#comment-15780538
 ] 

Vyacheslav Daradur commented on IGNITE-4499:


I think this will be resolved in IGNITE-4501.






[jira] [Commented] (IGNITE-4499) TcpDiscoverySpi is not reliable in some network split scenarios.

2016-12-27 Thread Alexei Scherbakov (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-4499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15780169#comment-15780169
 ] 

Alexei Scherbakov commented on IGNITE-4499:
---

Sadly I have no reproducer at the moment.



