Roman Puchkovskiy created IGNITE-17152:
------------------------------------------
Summary: Improve logging levels for situations when dealing with a
client node
Key: IGNITE-17152
URL: https://issues.apache.org/jira/browse/IGNITE-17152
Project: Ignite
Issue Type: Improvement
Components: networking
Reporter: Roman Puchkovskiy
Assignee: Roman Puchkovskiy
Fix For: 2.14
An example follows:
[2022-04-27T23:01:17,872][ERROR][query-#17069%nebula-node%][TcpCommunicationSpi]
Failed to send message to remote node [node=TcpDiscoveryNode
[id=67cf0e5e-974c-463a-a1f2-915fe3cdd3e7, consistentId=67cf0e5e-974c-
463a-a1f2-915fe3cdd3e7, addrs=ArrayList [0:0:0:0:0:0:0:1%lo0, 127.0.0.1,
127.94.0.1, 192.168.1.35], sockAddrs=HashSet [/127.0.0.1:0,
0:0:0:0:0:0:0:1%lo0:0, /192.168.1.35:0, /127.94.0.1:0], discPort=0, order=25,
intOrder=15, lastExchangeTime=1651100317979, loc=false,
ver=8.8.14#20220124-sha1:53de42db, isClient=true], msg=GridIoMessage [plc=10,
topic=TOPIC_QUERY, topicOrd=19, ordered=false, timeout=0, skipOnTimeout=false,
msg=GridQueryFailResponse [qryReqId=1, errMsg=Failed to wait for
establishing inverse connection (node left topology):
67cf0e5e-974c-463a-a1f2-915fe3cdd3e7, failCode=0, sqlErrCode=0]]]
org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Failed to
wait for establishing inverse connection (node left topology):
67cf0e5e-974c-463a-a1f2-915fe3cdd3e7
Here, a client has left the topology, so we were not able to send it a
message. This is not an internal server problem; it is just a consequence of
a client leaving, which is normal. In this case the failure should not be
logged as an ERROR, to avoid unnecessary noise in the log.
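A minimal sketch of the intended decision, assuming a send failure can be
classified by whether the target node is a client and whether the cause is a
"node left topology" exception. NodeLeftException and SendFailureLogLevel are
hypothetical names used for illustration only, not Ignite classes, and the
choice of WARNING as the lower level is also an assumption (the issue only
states that ERROR is too high):

```java
import java.util.logging.Level;

/** Hypothetical stand-in for a "node left topology" failure; not an Ignite class. */
class NodeLeftException extends Exception {
    NodeLeftException(String msg) {
        super(msg);
    }
}

/** Hypothetical helper that picks a log level for a failed send. */
class SendFailureLogLevel {
    /**
     * Returns a lower level when the failure is only a consequence of a
     * client node leaving, and SEVERE (the ERROR equivalent) otherwise.
     */
    static Level levelFor(boolean targetIsClient, Throwable err) {
        boolean nodeLeft = err instanceof NodeLeftException;

        // Assumption: WARNING is used here; DEBUG/FINE would also match the issue text.
        if (targetIsClient && nodeLeft)
            return Level.WARNING;

        return Level.SEVERE;
    }
}
```

With this shape, a failure to reach a server node, or any other kind of
exception, would still be reported at the ERROR-equivalent level.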
Another similar log is:
[2022-04-27T23:01:17,872][ERROR][query-#17069%xxx-node%][GridMapQueryExecutor]
Failed to send error message.
org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Failed to
wait for establishing inverse connection (node left topology):
67cf0e5e-974c-463a-a1f2-915fe3cdd3e7
Here, the server tried to send an error message to a client that had already
left. By the same reasoning, we should not log it as ERROR.
One more situation is demonstrated by the following log:
[2022-05-16T16:43:51,301][ERROR][sys-#51%xxx-node%][TcpCommunicationSpi] Failed
to send message to remote node [node=TcpDiscoveryNode
[id=68e268f7-abf2-41a1-a4fa-520169d2dac5,
consistentId=68e268f7-abf2-41a1-a4fa-520169d2dac5, addrs=ArrayList
[0:0:0:0:0:0:0:1%lo0, 127.0.0.1, 127.94.0.1, 192.168.1.170], sockAddrs=HashSet
[/127.0.0.1:0, 0:0:0:0:0:0:0:1%lo0:0, /192.168.1.170:0, /127.94.0.1:0],
discPort=0, order=79, intOrder=44, lastExchangeTime=1652719430974, loc=false,
ver=8.8.14#20220124-sha1:53de42db, isClient=true], msg=GridIoMessage [plc=0,
topic=TOPIC_COMM_USER, topicOrd=9, ordered=true, timeout=5000,
skipOnTimeout=true, msg=GridIoUserMessage [clsLdrId=null, depMode=null,
depClsName=null, userVer=null, ldrParties=null, dep=null]]]
org.apache.ignite.IgniteCheckedException: Failed to connect to node
68e268f7-abf2-41a1-a4fa-520169d2dac5 because it is started in
'forceClientToServerConnections' mode; inverse connection will be requested.
Here, the exception is not a problem at all; it is only used for flow control,
so it should not be logged at ERROR either.
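The flow-control case can be sketched as follows, assuming the direct
connection attempt signals "inverse connection required" with a dedicated
exception type that the caller catches and logs at a low level before falling
back. InverseConnectionRequired and ConnectSketch are hypothetical names for
illustration, not Ignite classes, and the in-memory log list stands in for a
real logger:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;

/** Hypothetical flow-control signal; not an Ignite class. */
class InverseConnectionRequired extends Exception {
    InverseConnectionRequired(String msg) {
        super(msg);
    }
}

/** Hypothetical sender that falls back to an inverse connection. */
class ConnectSketch {
    /** Collected log lines, standing in for a real logger. */
    final List<String> log = new ArrayList<>();

    /** Simulates a direct connection attempt to the remote node. */
    void connectDirectly(boolean forceClientToServer) throws InverseConnectionRequired {
        if (forceClientToServer)
            throw new InverseConnectionRequired("inverse connection will be requested");
    }

    /** Returns which connection path was taken: "direct" or "inverse". */
    String send(boolean forceClientToServer) {
        try {
            connectDirectly(forceClientToServer);

            return "direct";
        }
        catch (InverseConnectionRequired e) {
            // Expected signal, not a failure: record it at FINE (debug), not SEVERE.
            log.add(Level.FINE + ": " + e.getMessage());

            return "inverse";
        }
    }
}
```

Handling the signal in its own catch block keeps it out of the generic
error-logging path, so genuine connection failures remain visible at ERROR.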
--
This message was sent by Atlassian Jira
(v8.20.7#820007)