Hello! You seem to have an awful lot of errors related to connectivity problems between nodes, such as:
Suppressed: class org.apache.ignite.IgniteCheckedException: Failed to connect to address [addr=ult-s2-svr1.dataprocessors.com.au/10.16.1.47:47106, err=Connection refused]
Suppressed: class org.apache.ignite.IgniteCheckedException: Failed to connect to address [addr=ult-s2-svr3/10.16.1.43:47102, err=Remote node ID is not as expected [expected=d97b5e5d-fb46-4b5b-91ad-79a69fce738f, rcvd=1dc23ebb-0997-4858-9433-d5d30c9b643e]]

I recommend figuring those errors out: it's possible that you have nodes in your cluster which are present in discovery but cannot be reached over communication by the server node(s). Such nodes will cause all kinds of problems in the cluster.

Regards,
--
Ilya Kasnacheev

Fri, 8 Nov 2019 at 17:12, mvkarp <liquid_ninj...@hotmail.com>:

> Ok, there are no exceptions in the ignite logs for the client JVMs but I've
> attached the log for one of the problem servers. Looks like a few errors but
> I am unable to determine the root cause.
> ignite-46073e05.zip
> <http://apache-ignite-users.70518.x6.nabble.com/file/t2658/ignite-46073e05.zip>
>
> ilya.kasnacheev wrote
> > Hello!
> >
> > This is very strange, since we expect this collection to be cleared on
> > exchange.
> >
> > Please make sure you don't have any stray exceptions during exchange in
> > your logs.
> >
> > Regards,
> > --
> > Ilya Kasnacheev
> >
> > Fri, 8 Nov 2019 at 12:49, mvkarp <liquid_ninja2k@>:
> >
> >> Hi,
> >>
> >> This is not the case. Always only a maximum total of two server nodes. One
> >> JVM server on each. However there are many client JVMs that start and stop
> >> caches with setClientMode=true.
> >> It looks like one of the server instances is immune to the issue, whilst
> >> the most newly created one gets the leak, with a lot of partition
> >> exchanges happening for EVT_NODE_JOINED and EVT_NODE_LEFT (one of the
> >> nodes doesn't get any of these partition exchanges, however the exact
> >> server node that gets this can alternate so it's not linked to one node
> >> in particular but seems to be linked to the most newly launched server).
> >>
> >> ilya.kasnacheev wrote
> >> > Hello!
> >> >
> >> > How many nodes do you have in your cluster?
> >> >
> >> > From the dump it seems that the number of server nodes is in the
> >> > thousands. Is this the case?
> >> >
> >> > Regards,
> >> > --
> >> > Ilya Kasnacheev
> >> >
> >> > Fri, 8 Nov 2019 at 10:26, mvkarp <liquid_ninja2k@>:
> >> >
> >> >> Let me know if these help or if you need anything more specific.
> >> >> recoveryBallotBoxes.zip
> >> >> <http://apache-ignite-users.70518.x6.nabble.com/file/t2658/recoveryBallotBoxes.zip>
> >> >>
> >> >> ilya.kasnacheev wrote
> >> >> > Hello!
> >> >> >
> >> >> > Can you please check whether there are any especially large objects
> >> >> > inside the recoveryBallotBoxes object graph? Sorting by retained heap
> >> >> > may help in determining this. It would be nice to know what is the
> >> >> > type histogram of what's inside recoveryBallotBoxes and where the
> >> >> > bulk of heap usage resides.
> >> >> >
> >> >> > Regards,
> >> >> > --
> >> >> > Ilya Kasnacheev
> >> >> >
> >> >> > Thu, 7 Nov 2019 at 06:23, mvkarp <liquid_ninja2k@>:
> >> >> >
> >> >> >> I've attached another set of screenshots, might be more clear.
> >> >> >> heap.zip
> >> >> >> <http://apache-ignite-users.70518.x6.nabble.com/file/t2658/heap.zip>
> >> >> >>
> >> >> >> mvkarp wrote
> >> >> >> > I've attached some extra screenshots showing what is inside these
> >> >> >> > records and path to GC roots. heap.zip
> >> >> >> > <http://apache-ignite-users.70518.x6.nabble.com/file/t2658/heap.zip>
> >> >> >> >
> >> >> >> > --
> >> >> >> > Sent from: http://apache-ignite-users.70518.x6.nabble.com/
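
P.S. One rough way to narrow down the connectivity errors mentioned at the top of this message is to try a plain TCP connection to each address from the "Failed to connect to address" lines, running the check on every server node in turn. Below is a minimal sketch, assuming Python is available on those hosts. It is not part of Ignite, and the names parse_addresses and can_connect are made up for illustration. It only tells you whether the port accepts connections; it cannot tell you which node is listening there, so it will not by itself explain a "Remote node ID is not as expected" error.

```python
import re
import socket

# Matches the "addr=<hostname>/<ip>:<port>" fragment of the log lines above.
ADDR_RE = re.compile(r"addr=(?P<host>[^/\s,\]]*)/(?P<ip>[\d.]+):(?P<port>\d+)")


def parse_addresses(log_text):
    """Return (ip, port) pairs found in 'Failed to connect to address' lines."""
    return [(m.group("ip"), int(m.group("port")))
            for m in ADDR_RE.finditer(log_text)]


def can_connect(ip, port, timeout=3.0):
    """Return True if a plain TCP connection to ip:port succeeds in time."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts alike.
        return False
```

For example, feeding the server log text to parse_addresses and calling can_connect on each returned pair, once from each server host, should show which communication ports are refused from which side.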