Hi, after investigating I have a few questions regarding this issue.

1. I don't know much about what the MVCC coordinator is used for — could you
shed some light on its role and purpose? What does the MVCC coordinator do,
and why is one selected? Should an MVCC coordinator be selected even when
MVCC is disabled? (i.e. is it used for any other base features, and is this
just the way Ignite is meant to work?)

2. Following on from this, after looking at the MvccProcessorImpl.java
class in the Ignite 2.7.5 GitHub repository, it appears that an MVCC
coordinator is ALWAYS selected: one of the server nodes is assigned as the
MVCC coordinator regardless of whether a TRANSACTIONAL_SNAPSHOT cache
exists (mvccEnabled can be false, but an MVCC coordinator is still
selected).

https://github.com/apache/ignite/blob/ignite-2.7.5/modules/core/src/main/java/org/apache/ignite/internal/processors/cache/mvcc/MvccProcessorImpl.java

On line 861, in the assignMvccCoordinator method, it loops through all
nodes in the cluster with only these two conditions:

*if (!node.isClient() && supportsMvcc(node))*

It only checks that the node is not a client and that it supportsMvcc
(which is true for all versions > 2.7). It does not check mvccEnabled at all.


Can you confirm the above is intentional/expected or if there is another
piece of code I am missing?
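To make sure I am reading the selection logic correctly, here is a minimal standalone model of what I understand it to do. Note that `Node`, `supportsMvcc`, and `assignMvccCoordinator` here are simplified stand-ins I wrote for illustration, not the real Ignite API; the point is only that the filter never consults mvccEnabled:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for a cluster node; NOT the real
// org.apache.ignite.cluster.ClusterNode.
class Node {
    final String id;
    final boolean client;
    final String version; // e.g. "2.7.5"

    Node(String id, boolean client, String version) {
        this.id = id;
        this.client = client;
        this.version = version;
    }
}

public class CoordinatorSelectionModel {
    // Stand-in for supportsMvcc: true for 2.7+ protocol versions.
    static boolean supportsMvcc(Node node) {
        return node.version.compareTo("2.7") >= 0;
    }

    // Picks the first eligible server node. mvccEnabled is never consulted,
    // which is what my question 2 above is about.
    static Node assignMvccCoordinator(List<Node> topology) {
        for (Node node : topology) {
            if (!node.client && supportsMvcc(node))
                return node;
        }
        return null;
    }

    public static void main(String[] args) {
        List<Node> topology = new ArrayList<>();
        topology.add(new Node("client-1", true, "2.7.5"));
        topology.add(new Node("server-1", false, "2.7.5"));
        topology.add(new Node("server-2", false, "2.7.5"));

        // A coordinator is selected even with no TRANSACTIONAL_SNAPSHOT cache.
        System.out.println(assignMvccCoordinator(topology).id); // server-1
    }
}
```

If this model is accurate, a coordinator exists in every 2.7+ cluster whether or not MVCC caches are configured.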


3. As extra information, the node that happens to be selected as the MVCC
coordinator does not exhibit the leak, but every other client/server node
does.
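Regarding the reflection-based workaround suggested below (periodically clearing the recoveryBallotBoxes map), a rough sketch of the mechanism is shown here. The MvccProcessorStandIn class is a stand-in I wrote with the same field shape, not the real MvccProcessorImpl; on a real cluster you would obtain the processor instance from Ignite internals and run the cleanup on a timer:

```java
import java.lang.reflect.Field;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Stand-in mimicking the shape of MvccProcessorImpl's private
// recoveryBallotBoxes map; NOT the real Ignite class.
class MvccProcessorStandIn {
    private final Map<Long, Object> recoveryBallotBoxes = new ConcurrentHashMap<>();

    // Simulates an entry left behind when a client leaves and no votes arrive.
    void leak(long topVer) {
        recoveryBallotBoxes.put(topVer, new Object());
    }

    int size() {
        return recoveryBallotBoxes.size();
    }
}

public class BallotBoxCleaner {
    // Reflectively reaches into the target object and clears the named
    // private Map field, as the suggested workaround describes.
    static void clearPrivateMap(Object target, String fieldName) throws Exception {
        Field f = target.getClass().getDeclaredField(fieldName);
        f.setAccessible(true);
        ((Map<?, ?>) f.get(target)).clear();
    }

    public static void main(String[] args) throws Exception {
        MvccProcessorStandIn proc = new MvccProcessorStandIn();
        proc.leak(1L);
        proc.leak(2L);

        clearPrivateMap(proc, "recoveryBallotBoxes");
        System.out.println(proc.size()); // 0 after cleanup
    }
}
```

Obviously this is fragile (it depends on a private field name of internal Ignite code), so I would prefer a proper fix over this kind of periodic cleanup.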



Ivan Pavlukhin wrote
> Hi,
> 
> I suspect the following here. Some node treats itself as the MVCC
> coordinator and creates a new RecoveryBallotBox each time a client node
> leaves. Some (maybe all) other nodes think that MVCC is disabled and
> do not send a vote (expected for the aforementioned ballot box) to the
> MVCC coordinator. Consequently, there is a memory leak.
> 
> The following could be done:
> 1. Figure out why some node treats itself as the MVCC coordinator while
> others think that MVCC is disabled.
> 2. Try to introduce some defensive measures in the Ignite code to protect
> against the leak in a long-running cluster.
> 
> As a last-chance workaround, I can suggest writing custom code that
> cleans the recoveryBallotBoxes map from time to time (most likely using
> reflection).
> 
> Mon, 11 Nov 2019 at 08:53, mvkarp <

> liquid_ninja2k@

> >:
>>
>> We frequently stop and start clients in short-lived client JVM
>> processes as required for our purposes; this seems to lead to a huge
>> number of PMEs (but no rebalancing) and topology changes
>> (topVer=300,000+).
>>
>> Still cannot figure out why this map won't clear (there are no
>> exceptions or errors at all in the entire log).
>>
>>
>>
>> --
>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
> 
> 
> 
> -- 
> Best regards,
> Ivan Pavlukhin



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/