[jira] [Comment Edited] (IGNITE-11425) Log information about inaccessible nodes through Communication

Vladislav Pyatkov (JIRA) Tue, 30 Apr 2019 03:17:22 -0700


    [ https://issues.apache.org/jira/browse/IGNITE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830155#comment-16830155 ]


Vladislav Pyatkov edited comment on IGNITE-11425 at 4/30/19 10:16 AM:
----------------------------------------------------------------------

[~Denis Chudov] Looks good to me.
Please, move by process.


was (Author: v.pyatkov):
[~Denis Chudov] Looks good to me.
Please, move it by process.

> Log information about inaccessible nodes through Communication
> --------------------------------------------------------------
>
>                 Key: IGNITE-11425
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11425
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Vladislav Pyatkov
>            Assignee: Denis Chudov
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> If getting a communication TCP client takes long (longer than 
> CONNECTION_ESTABLISH_THRESHOLD_MS = 100), the following message is printed:
> {noformat}
> [sys-#20167%dht.CacheGetReadFromBackupFailoverTest0%][TcpCommunicationSpi] 
> TCP client created [client=GridTcpNioCommunicationClient 
> [ses=GridSelectorNioSessionImpl [worker=DirectNioClientWorker 
> [super=AbstractNioClientWorker [idx=3, bytesRcvd=0, bytesSent=0, 
> bytesRcvd0=0, bytesSent0=0, select=true, super=GridWorker 
> [name=grid-nio-worker-tcp-comm-3, 
> igniteInstanceName=dht.CacheGetReadFromBackupFailoverTest0, finished=false, 
> heartbeatTs=1550512236151, hashCode=140561231, interrupted=false, 
> runner=grid-nio-worker-tcp-comm-3-#20147%dht.CacheGetReadFromBackupFailoverTest0%]]],
>  writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], 
> readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], 
> inRecovery=GridNioRecoveryDescriptor [acked=0, resendCnt=0, rcvCnt=0, 
> sentCnt=0, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode 
> [id=8a660330-6ddb-4031-b955-4cb4f4b00002, addrs=ArrayList [127.0.0.1], 
> sockAddrs=HashSet [/127.0.0.1:47502], discPort=47502, order=5, intOrder=4, 
> lastExchangeTime=1550512235890, loc=false, ver=2.8.0#20190218-sha1:29232e37, 
> isClient=false], connected=false, connectCnt=2, queueLimit=4096, 
> reserveCnt=2, pairedConnections=false], outRecovery=GridNioRecoveryDescriptor 
> [acked=0, resendCnt=0, rcvCnt=0, sentCnt=0, reserved=true, lastAck=0, 
> nodeLeft=false, node=TcpDiscoveryNode 
> [id=8a660330-6ddb-4031-b955-4cb4f4b00002, addrs=ArrayList [127.0.0.1], 
> sockAddrs=HashSet [/127.0.0.1:47502], discPort=47502, order=5, intOrder=4, 
> lastExchangeTime=1550512235890, loc=false, ver=2.8.0#20190218-sha1:29232e37, 
> isClient=false], connected=false, connectCnt=2, queueLimit=4096, 
> reserveCnt=2, pairedConnections=false], super=GridNioSessionImpl 
> [locAddr=/127.0.0.1:38770, rmtAddr=/127.0.0.1:45212, 
> createTime=1550512236151, closeTime=0, bytesSent=0, bytesRcvd=0, 
> bytesSent0=0, bytesRcvd0=0, sndSchedTime=1550512236151, 
> lastSndTime=1550512236151, lastRcvTime=1550512236151, readsPaused=false, 
> filterChain=FilterChain[filters=[GridNioCodecFilter 
> [parser=org.apache.ignite.internal.util.nio.GridDirectParser@d240a48, 
> directMode=true], GridConnectionBytesVerifyFilter], accepted=false, 
> markedForClose=false]], super=GridAbstractCommunicationClient 
> [lastUsed=1550512236151, closed=false, connIdx=0]], duration=211ms]
> {noformat}
> But in some cases the client cannot be obtained before the timeout expires, 
> and the message is reduced to:
> {noformat}
> TCP client created [client=null, duration=60004 ms]
> {noformat}
> This message does not show which nodes were inaccessible.
> Moreover, we want to see the connection problem earlier than 10 minutes 
> later.
> We should log the IP/host so that it is clear which node was involved, and 
> log a WARN message each time the timeout needs to be increased:
> {code}
> if (lastWaitingTimeout < 60000)
>   lastWaitingTimeout *= 2;
> {code}
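
For illustration only, the proposal above could look roughly like the sketch below. It is not the actual Ignite change: the class name, the `nextTimeout` helper, its parameters and the message wording are all hypothetical, and the logger is replaced by a plain `Consumer<String>` so the example runs without Ignite on the classpath. Only the `lastWaitingTimeout < 60000` condition and the doubling come from the issue description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper, not part of Ignite: doubles the wait timeout and names the unreachable node. */
class ConnectBackoffSketch {
    /** Limit taken from the snippet in the issue description. */
    static final long MAX_WAIT_TIMEOUT_MS = 60_000;

    /**
     * Returns the timeout to use for the next wait. When the timeout is increased,
     * a WARN-style message with the node ID and its addresses is passed to {@code warn}.
     * The node ID and address list are assumed to be available to the caller.
     */
    static long nextTimeout(long lastWaitingTimeout, String nodeId, List<String> addrs, Consumer<String> warn) {
        // Same condition as in the issue: keep doubling while below the limit.
        if (lastWaitingTimeout < MAX_WAIT_TIMEOUT_MS) {
            long increased = lastWaitingTimeout * 2;

            warn.accept("Failed to get communication client within timeout, timeout will be increased "
                + "[nodeId=" + nodeId + ", addrs=" + addrs
                + ", timeout=" + lastWaitingTimeout + "ms, newTimeout=" + increased + "ms]");

            return increased;
        }

        // At or above the limit: no increase, no extra warning.
        return lastWaitingTimeout;
    }

    public static void main(String[] args) {
        List<String> warnings = new ArrayList<>();
        long timeout = 5_000; // Hypothetical starting value.

        // Simulate four failed waits for the node from the log above.
        for (int i = 0; i < 4; i++)
            timeout = nextTimeout(timeout, "8a660330-6ddb-4031-b955-4cb4f4b00002",
                Arrays.asList("127.0.0.1:47502"), warnings::add);

        warnings.forEach(System.err::println);
    }
}
```

With such a message the node ID and address appear on every timeout increase, so an unreachable node would be visible in the log well before the final `client=null` line.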



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
