[ https://issues.apache.org/jira/browse/IGNITE-11460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790472#comment-16790472 ]
Amelchev Nikita commented on IGNITE-11460:
------------------------------------------

[~amashenkov],

In that case we can end up in a situation where the client has already joined the topology (and will receive MVCC requests) but the current coordinator stays in the disconnected state until the discovery event is processed. The opposite situation is also possible: the client has already disconnected, but the coordinator is not marked as disconnected until the client disconnect discovery event is processed. Also, it seems there is no event for the local node join in the discovery event thread.

With my changes the coordinator is set correctly on local join and on disconnect, and unnecessary events that arrive (or are processed) after the client leaves the topology are not processed. When the client joins the topology again, the correct coordinator is set, and events are processed only after the client reconnect event is received.

> MVCC: Possible race on coordinator changing on client reconnection.
> -------------------------------------------------------------------
>
>                 Key: IGNITE-11460
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11460
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Amelchev Nikita
>            Assignee: Amelchev Nikita
>            Priority: Major
>              Labels: MakeTeamcityGreenAgain
>             Fix For: 2.8
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> I found that the wrong coordinator can be set when a client reconnects:
> {noformat}
> assert newCrd.topologyVersion().compareTo(curCrd.topologyVersion()) > 0;
> java.lang.AssertionError
> at org.apache.ignite.internal.processors.cache.mvcc.MvccProcessorImpl.onCoordinatorChanged(MvccProcessorImpl.java:541)
> at org.apache.ignite.internal.processors.cache.mvcc.MvccProcessorImpl.onLocalJoin(MvccProcessorImpl.java:416)
> at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:851)
> at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.lambda$onDiscovery$0(GridDiscoveryManager.java:601)
> at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body0(GridDiscoveryManager.java:2681)
> at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body(GridDiscoveryManager.java:2719)
> at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> I have attached a reproducer in the PR.
> The main reason is that the coordinator can be changed from the discovery
> event thread when the client has already disconnected (the disconnection is
> processed in the notifier thread, which changes the coordinator in the
> onDisconnected method).
> The coordinator can be changed in two cases:
> 1. notifier disco thread: onDisconnected method
> 2. event disco thread: onDiscovery listener.
> Events can be processed with some delay and override the coordinator that was
> set in the notifier thread.
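
For illustration only, the race and the proposed guard can be sketched as below. This is a minimal sketch under stated assumptions, not the code from the PR: the classes Crd, CoordinatorGuard and CoordinatorGuardDemo, the acceptEvts flag and the node names are all hypothetical, and the real MvccProcessorImpl may well be structured differently. The idea shown is that the coordinator is set directly on disconnect and on local join, while coordinator changes coming from the event thread are dropped until the client reconnect event has been seen.

```java
/** Hypothetical stand-in for a coordinator descriptor; not the Ignite class. */
final class Crd {
    final String nodeId;
    final long topVer;

    Crd(String nodeId, long topVer) {
        this.nodeId = nodeId;
        this.topVer = topVer;
    }
}

/** Hypothetical guard around the current coordinator; all names are illustrative. */
final class CoordinatorGuard {
    /** Marker meaning "no coordinator, the client is disconnected". */
    static final Crd DISCONNECTED = new Crd("disconnected", -1L);

    private Crd cur;

    /** Whether coordinator changes from the event thread are applied. */
    private boolean acceptEvts = true;

    CoordinatorGuard(Crd initial) {
        cur = initial;
    }

    /** Notifier thread: the client left the topology. */
    synchronized void onDisconnected() {
        cur = DISCONNECTED;
        acceptEvts = false; // Anything still queued in the event thread is stale.
    }

    /** Notifier thread: the client joined again, so set the coordinator directly. */
    synchronized void onLocalJoin(Crd crd) {
        cur = crd;
    }

    /** Event thread: the client reconnect event has been reached in the queue. */
    synchronized void onReconnectEvent() {
        acceptEvts = true;
    }

    /** Event thread: a coordinator change; returns true if it was applied. */
    synchronized boolean onCoordinatorEvent(Crd crd) {
        if (!acceptEvts || crd.topVer <= cur.topVer)
            return false;

        cur = crd;

        return true;
    }

    synchronized Crd current() {
        return cur;
    }
}

final class CoordinatorGuardDemo {
    public static void main(String[] args) {
        CoordinatorGuard guard = new CoordinatorGuard(new Crd("srv-1", 1));

        guard.onDisconnected();

        // A delayed event from before the disconnect must not override the state.
        System.out.println(guard.onCoordinatorEvent(new Crd("srv-2", 2)));

        guard.onLocalJoin(new Crd("srv-3", 5));

        // Still before the reconnect event, so the stale event is dropped again.
        System.out.println(guard.onCoordinatorEvent(new Crd("srv-2", 2)));

        guard.onReconnectEvent();

        // After the reconnect event, newer coordinator changes are applied.
        System.out.println(guard.onCoordinatorEvent(new Crd("srv-4", 6)));
        System.out.println(guard.current().nodeId);
    }
}
```

Without the acceptEvts check, the delayed "srv-2" event in this sketch would replace the disconnected marker after onDisconnected, and the later onLocalJoin would then see an unexpected current coordinator, which is the kind of ordering problem the assertion in the stack trace reports.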