[ https://issues.apache.org/jira/browse/IGNITE-11707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Maxim Muzafarov updated IGNITE-11707:
-------------------------------------
    Fix Version/s:     (was: 2.8)

> Tcp Discovery should drop pending metrics update message when new message is received
> -------------------------------------------------------------------------------------
>
>                 Key: IGNITE-11707
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11707
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Alexey Goncharuk
>            Assignee: Alexey Goncharuk
>            Priority: Major
>
> I've stumbled across the following behavior on a large cluster with a large
> number of caches:
> When several new nodes are being added to the cluster, a client node may hang
> indefinitely on join. On server nodes one can observe the tcp discovery message
> worker continuously processing metrics update messages and writing metrics to
> the socket. From the logs it was clear that the cluster generated a lot of
> metrics update messages and a node could not keep up with them.
> Even when the metrics update message is generated on the coordinator, this
> scenario is possible when the message round-trip/processing time is comparable
> to the metrics update frequency.
> To mitigate the issue, we should drop a not-yet-processed metrics update
> message when a new metrics update message is received.
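
The proposed mitigation can be illustrated with a minimal sketch. This is not the Ignite discovery code: the class `CoalescingMessageQueue`, the nested `Message` type, and its `metricsUpdate` flag are hypothetical names introduced only for illustration. The sketch assumes a single worker queue in which any metrics update still waiting to be processed can be safely discarded once a newer one arrives.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical worker queue; an illustration only, not Ignite's implementation. */
final class CoalescingMessageQueue {
    /** Hypothetical message model: a type flag plus an opaque payload. */
    static final class Message {
        public final boolean metricsUpdate;
        public final String payload;

        Message(boolean metricsUpdate, String payload) {
            this.metricsUpdate = metricsUpdate;
            this.payload = payload;
        }
    }

    /** Messages waiting to be processed, oldest first. */
    private final Deque<Message> queue = new ArrayDeque<>();

    /**
     * Enqueues a message. If it is a metrics update, every metrics update
     * that is still pending is dropped first, so at most one remains queued.
     */
    synchronized void add(Message msg) {
        if (msg.metricsUpdate)
            queue.removeIf(m -> m.metricsUpdate);

        queue.addLast(msg);
    }

    /** Returns the oldest pending message, or null if the queue is empty. */
    synchronized Message poll() {
        return queue.pollFirst();
    }

    /** Number of messages currently pending. */
    synchronized int size() {
        return queue.size();
    }
}
```

With this policy the number of pending metrics updates never exceeds one, so other messages, such as those involved in a node join, are not starved by a backlog of metrics updates. One design choice to note: the newer update is appended at the tail instead of taking the dropped message's position, which preserves arrival order for all remaining messages.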