Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-11-18 Thread Ivan Pavlukhin
Hi, Thank you for digging deeper! I have no good ideas about possible problems in curCrd.local(). Have you tried to reproduce the leak by starting and stopping a large number of client nodes? Sun, 17 Nov 2019 at 13:16, mvkarp wrote: > > Only other thing I can think of if it's through onDiscovery() is that >

Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-11-17 Thread mvkarp
Only other thing I can think of if it's through onDiscovery() is that curCrd.local() somehow is returning true. However I am unable to find exactly how local() is determined since there appears to be a big chain. I know that the node uuid on the leaking server is on a different physical node as

Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-11-16 Thread Ivan Pavlukhin
> But the population might be through processRecoveryFinishedMessage() - which > does not do any check for isLocal() and goes straight to processing message > since waitForCoordinatorInit() on a MvccRecoveryFinishedMessage always > returns false? All nodes send messages intended for handling

Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-11-16 Thread mvkarp
1 & 2. Actually, looking at both the 2.7.5 release and the current Master version, it is the 'pickMvccCoordinator' function that returns the coordinator (this is the same function that selects a node that is not a Client and has Ignite version >= 2.7). curCrd is then assigned the return variable

Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-11-15 Thread Ivan Pavlukhin
1. The MVCC coordinator should not do anything when there are no MVCC caches; actually, it should not be active in such a case. Basically, the MVCC coordinator is needed to have a consistent order between transactions. 2. In 2.7.5 an "assigned coordinator" is always selected but it does not mean that it is

Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-11-14 Thread mvkarp
Hi, after investigating I have a few questions regarding this issue. 1. Since I lack knowledge of what the MVCC coordinator is used for, are you able to shed some light on its role and purpose? What does the MVCC coordinator do, and why is one selected? Should an MVCC coordinator

Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-11-11 Thread Ivan Pavlukhin
Hi, My first thought is to deploy a service [1] (either remotely and dynamically with Ignite.services().deploy(), or statically with IgniteConfiguration.setServiceConfiguration()) that clears the problematic map periodically. [1] https://apacheignite.readme.io/docs/service-grid Mon, 11 Nov 2019 at 13:20, mvkarp wrote: > >
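A minimal sketch of the "clear the map periodically" idea, using only the JDK so it runs on its own. It is not the workaround as deployed by anyone in this thread: the class name PeriodicMapCleaner is hypothetical, the map here is a stand-in for recoveryBallotBoxes, and obtaining a reference to the real map inside MvccProcessorImpl (for example from within an Ignite service) is not shown and is left as an assumption.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: clears a target map at a fixed rate on a daemon thread. */
public class PeriodicMapCleaner implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "map-cleaner");
            t.setDaemon(true); // do not keep the JVM alive just for cleaning
            return t;
        });

    /** Schedules target.clear() every 'period' units, starting after one period. */
    public PeriodicMapCleaner(Map<?, ?> target, long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(target::clear, period, period, unit);
    }

    /** Stops the background cleaning task. */
    @Override public void close() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for the leaking map; the real one is internal to Ignite.
        Map<String, Object> leaky = new ConcurrentHashMap<>();
        leaky.put("a", new Object());

        try (PeriodicMapCleaner cleaner = new PeriodicMapCleaner(leaky, 50, TimeUnit.MILLISECONDS)) {
            Thread.sleep(300); // give the cleaner time to run a few times
        }
        System.out.println("size after cleaning: " + leaky.size());
    }
}
```

Blindly clearing a map that the owning component still uses can discard entries it expects to find, so a workaround along these lines only makes sense as a stopgap until the underlying bug is fixed.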

Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-11-11 Thread mvkarp
Hi, Would you have any suggestion on how to implement a last-chance workaround for this issue for the server JVM? Ivan Pavlukhin wrote > Hi, > > I suspect the following here. Some node treats itself as an MVCC > coordinator and creates a new RecoveryBallotBox each time a client node > leaves. Some

Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-11-10 Thread Ivan Pavlukhin
Hi, I suspect the following here. Some node treats itself as an MVCC coordinator and creates a new RecoveryBallotBox each time a client node leaves. Some (maybe all) other nodes think that MVCC is disabled and do not send a vote (intended for the aforementioned ballot box) to the MVCC coordinator.
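The suspected mechanism can be illustrated with a toy model. This is a sketch under stated assumptions, not Ignite's implementation: BallotBoxLeakModel, BallotBox, onNodeLeft, onVote, and simulate are hypothetical names, and the only behavior modeled is the one described above (a box is created per departed node and removed only once every expected voter has voted).

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model of a coordinator that opens one ballot box per departed node. */
public class BallotBoxLeakModel {
    /** Hypothetical ballot box: tracks which voters have not voted yet. */
    static final class BallotBox {
        private final Set<UUID> pending = ConcurrentHashMap.newKeySet();

        BallotBox(Set<UUID> voters) {
            pending.addAll(voters);
        }

        /** Records a vote; returns true once all expected votes have arrived. */
        boolean vote(UUID voter) {
            pending.remove(voter);
            return pending.isEmpty();
        }
    }

    private final Map<UUID, BallotBox> boxes = new ConcurrentHashMap<>();

    /** Coordinator side: a node left, so open a box and wait for votes. */
    void onNodeLeft(UUID leftNode, Set<UUID> expectedVoters) {
        boxes.put(leftNode, new BallotBox(expectedVoters));
    }

    /** A vote arrived; the box is removed only when it is complete. */
    void onVote(UUID leftNode, UUID voter) {
        BallotBox box = boxes.get(leftNode);
        if (box != null && box.vote(voter))
            boxes.remove(leftNode);
    }

    int openBoxes() {
        return boxes.size();
    }

    /** Simulates 'departures' client exits with a single expected voter. */
    static int simulate(int departures, boolean votersRespond) {
        BallotBoxLeakModel coordinator = new BallotBoxLeakModel();
        UUID voter = UUID.randomUUID();
        for (int i = 0; i < departures; i++) {
            UUID left = UUID.randomUUID();
            coordinator.onNodeLeft(left, Collections.singleton(voter));
            if (votersRespond)
                coordinator.onVote(left, voter);
        }
        return coordinator.openBoxes();
    }

    public static void main(String[] args) {
        System.out.println(simulate(1000, true));  // prints 0: every box completes
        System.out.println(simulate(1000, false)); // prints 1000: no votes, boxes pile up
    }
}
```

In this model the number of retained boxes grows linearly with the number of client departures whenever the other nodes stay silent, which would match a map that grows with topology version.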

Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-11-10 Thread mvkarp
We frequently stop and start clients in short-lived client JVM processes, as required for our purposes; this seems to lead to a huge number of PMEs (but no rebalancing) and topology changes (topVer=300,000+). Still cannot figure out why this map won't clear (there are no exceptions or

Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-11-10 Thread mvkarp
Hi, There are no more exceptions or errors in the logs, only hundreds of thousands of these log lines - heap usage is still increasing steeply and the leak is still present. [13:46:17,632][INFO][disco-event-worker-#102][GridDiscoveryManager] Topology snapshot [ver=366003, locNode=6a9db3c2, servers=2,

Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-11-08 Thread Ilya Kasnacheev
Hello! You seem to have an awful lot of errors related to connectivity problems between nodes, such as: Suppressed: class org.apache.ignite.IgniteCheckedException: Failed to connect to address [addr=ult-s2-svr1.dataprocessors.com.au/10.16.1.47:47106, err=Connection refused] Suppressed: class

Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-11-08 Thread mvkarp
Ok, there are no exceptions in the ignite logs for the client JVMs but I've attached the log for one of the problem servers. Looks like a few errors but I am unable to determine the root cause. ignite-46073e05.zip

Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-11-08 Thread Ilya Kasnacheev
Hello! This is very strange, since we expect this collection to be cleared on exchange. Please make sure you don't have any stray exceptions during exchange in your logs. Regards, -- Ilya Kasnacheev Fri, 8 Nov 2019 at 12:49, mvkarp wrote: > Hi, > > This is not the case. Always only a maximum

Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-11-08 Thread mvkarp
Hi, This is not the case. Always only a maximum total of two server nodes. One JVM server on each. However there are many client JVMs that start and stop caches with setClientMode=true. It looks like one of the server instances is immune to the issue, whilst the most newly created one gets the

Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-11-07 Thread mvkarp
Let me know if these help or if you need anything more specific. recoveryBallotBoxes.zip ilya.kasnacheev wrote > Hello! > > Can you please check whether there are any especially large objects inside >

Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-11-07 Thread Ilya Kasnacheev
Hello! Can you please check whether there are any especially large objects inside recoveryBallotBoxes object graph? Sorting by retained heap may help in determining this. It would be nice to know what is the type histogram of what's inside recoveryBallotBoxes and where the bulk of heap usage

Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-11-06 Thread mvkarp
I've attached another set of screenshots, might be more clear. heap.zip mvkarp wrote > I've attached some extra screenshots showing what is inside these records > and path to GC roots. heap.zip >

Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-11-06 Thread mvkarp
I've created a ticket, though I'm not too sure how to go about creating a reproducer for this - https://issues.apache.org/jira/browse/IGNITE-12350 I've attached some extra screenshots showing what is inside these records and the path to GC roots. heap.zip

Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-11-06 Thread Ilya Kasnacheev
Hello! Can you please show the contents of some of these records, as well as their referential path to MvccProcessorImpl? Regards, -- Ilya Kasnacheev Fri, 1 Nov 2019 at 03:25, mvkarp wrote: > <http://apache-ignite-users.70518.x6.nabble.com/file/t2658/heapanalysisMAT.jpg> > > I've attached an

Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-11-01 Thread Ivan Pavlukhin
Hi, Sounds like a bug. It would be great to have a ticket with a reproducer. Fri, 1 Nov 2019 at 03:25, mvkarp wrote: > > I've attached an Eclipse MAT heap analysis. As you can see MVCC is disabled > (there are no

Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-10-31 Thread mvkarp
I've attached an Eclipse MAT heap analysis. As you can see MVCC is disabled (there are no TRANSACTIONAL_SNAPSHOT caches in the cluster)

Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-10-30 Thread Denis Magda
Please try to capture heap dumps that will show where the leak is. Share the dumps with us if the leak is not caused by the application code. - Denis On Tue, Oct 29, 2019 at 11:59 PM mvkarp wrote: > Hi team, any update/clarification on this? It is quite a critical bug in > production

Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-10-29 Thread mvkarp
Hi team, any update/clarification on this? It is quite a critical bug in our production environment, as it is taking 100% CPU usage and leads to OOM / crashes. As further information, this is also affecting the Ignite server JVMs, causing them to crash, and it seems to be assigning a mvcc coordinator node

recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-10-25 Thread mvkarp
Hi, I am on Ignite 2.7.5 with MVCC disabled for all caches (all CacheAtomicityMode is ATOMIC). After analysing a few heaps on the CLIENT node JVM using Eclipse MAT, there is a 'recoveryBallotBoxes' ConcurrentHashMap in the MvccProcessorImpl that is growing without bound on the heap at a