[jira] [Commented] (IGNITE-10354) Failing client node due to not receiving metrics updates

ASF GitHub Bot (JIRA) Fri, 23 Nov 2018 00:28:28 -0800


    [ https://issues.apache.org/jira/browse/IGNITE-10354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16696501#comment-16696501 ]


ASF GitHub Bot commented on IGNITE-10354:
-----------------------------------------

GitHub user gromtech opened a pull request:

    https://github.com/apache/ignite/pull/5485

    IGNITE-10354 Failing client node due to not receiving metrics updates

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gridgain/apache-ignite ignite-10354

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/ignite/pull/5485.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5485
    
----
commit 5032b5a546c38f9a1d34325bb7a7ea39c66e46e1
Author: Roman Guseinov <gromcase@...>
Date:   2018-11-23T08:26:06Z

    IGNITE-10354 Failing client node due to not receiving metrics updates

----


> Failing client node due to not receiving metrics updates
> --------------------------------------------------------
>
>                 Key: IGNITE-10354
>                 URL: https://issues.apache.org/jira/browse/IGNITE-10354
>             Project: Ignite
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 2.6
>            Reporter: Roman Guseinov
>            Assignee: Roman Guseinov
>            Priority: Major
>         Attachments: ClientDisconnectedTest.java
>
>
> In some cases, after a coordinator change, a server can fail the client node 
> before the client establishes a connection to another server in the cluster.
> {code:java}
> [2018-11-21 12:21:45,769][WARN 
> ][tcp-disco-msg-worker-#15%server-b%][TestTcpDiscoverySpi] Failing client 
> node due to not receiving metrics updates from client node within 
> 'IgniteConfiguration.clientFailureDetectionTimeout' (consider increasing 
> configuration property) [timeout=10000, node=TcpDiscoveryNode 
> [id=dc739711-f685-45e8-9017-1f91b1d86c8c, addrs=[0:0:0:0:0:0:0:1, 10.0.75.1, 
> 127.0.0.1, 192.168.1.51, 192.168.192.1], sockAddrs=[/0:0:0:0:0:0:0:1:0, 
> LAPTOP-6FN8RAOS/10.0.75.1:0, /127.0.0.1:0, /192.168.192.1:0, 
> /192.168.1.51:0], discPort=0, order=2, intOrder=2, 
> lastExchangeTime=1542774105666, loc=false, ver=2.4.0#20180830-sha1:345c0a7c, 
> isClient=true]]
> [2018-11-21 12:21:45,791][INFO 
> ][tcp-client-disco-msg-worker-#10%client%][TestTcpDiscoverySpi] Client node 
> disconnected from cluster, will try to reconnect with new id 
> [newId=46812956-2fc4-4b74-9909-d523a547ba0e, 
> prevId=dc739711-f685-45e8-9017-1f91b1d86c8c, locNode=TcpDiscoveryNode 
> [id=dc739711-f685-45e8-9017-1f91b1d86c8c, addrs=[0:0:0:0:0:0:0:1, 10.0.75.1, 
> 127.0.0.1, 192.168.1.51, 192.168.192.1], sockAddrs=[/0:0:0:0:0:0:0:1:0, 
> LAPTOP-6FN8RAOS/10.0.75.1:0, /127.0.0.1:0, /192.168.192.1:0, 
> /192.168.1.51:0], discPort=0, order=2, intOrder=0, 
> lastExchangeTime=1542774104031, loc=true, ver=2.4.0#20180830-sha1:345c0a7c, 
> isClient=true]]
> {code}
> It looks like a race condition.
> Steps to reproduce:
> 1. Start server A.
> 2. Start client.
> 3. Start server B.
> 4. Stop server A.
> If Thread.sleep(10000) is added between steps 3 and 4, the client node is not 
> disconnected from the cluster.
> A reproducer is attached: [^ClientDisconnectedTest.java].
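
The warning in the log above says the server failed the client because no
metrics update arrived within 'clientFailureDetectionTimeout' (timeout=10000).
As a rough illustration of that kind of check, here is a minimal sketch in
plain Java. It is not Ignite code: the class ClientLivenessTracker and its
methods are hypothetical names, and the rule "fail the client when the time
since its last metrics update exceeds the timeout" is an assumption read off
the log message, not taken from the Ignite sources.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified model of a server-side client liveness check (not Ignite code). */
class ClientLivenessTracker {
    /** Plays the role of clientFailureDetectionTimeout, in milliseconds. */
    private final long timeoutMillis;

    /** Time of the last metrics update seen from each client, keyed by client id. */
    private final Map<String, Long> lastUpdateMillis = new HashMap<>();

    ClientLivenessTracker(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Records that a metrics update from the given client arrived at nowMillis. */
    void recordMetricsUpdate(String clientId, long nowMillis) {
        lastUpdateMillis.put(clientId, nowMillis);
    }

    /** Assumed rule: the client is timed out once the gap since its last update exceeds the timeout. */
    boolean isTimedOut(String clientId, long nowMillis) {
        Long last = lastUpdateMillis.get(clientId);
        if (last == null)
            throw new IllegalArgumentException("Unknown client: " + clientId);
        return nowMillis - last > timeoutMillis;
    }

    public static void main(String[] args) {
        // 10000 ms matches the timeout value printed in the log above.
        ClientLivenessTracker tracker = new ClientLivenessTracker(10_000L);
        tracker.recordMetricsUpdate("client", 0L);

        System.out.println(tracker.isTimedOut("client", 9_000L));  // false
        System.out.println(tracker.isTimedOut("client", 10_001L)); // true
    }
}
```

Timestamps are passed in explicitly rather than read from the system clock so
that the check can be exercised deterministically. The sketch says nothing
about where the race in this issue actually occurs.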



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
