[ 
https://issues.apache.org/jira/browse/IGNITE-7753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stanislav Lukyanov reassigned IGNITE-7753:
------------------------------------------

    Assignee:     (was: Stanislav Lukyanov)

> Processors are incorrectly initialized if a node joins during cluster 
> activation
> --------------------------------------------------------------------------------
>
>                 Key: IGNITE-7753
>                 URL: https://issues.apache.org/jira/browse/IGNITE-7753
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.3, 2.4, 2.5
>            Reporter: Stanislav Lukyanov
>            Priority: Major
>
> If a node joins during cluster activation (while the related exchange 
> operation is in progress), some of that node's GridProcessor instances 
> are initialized incorrectly. GridClusterStateProcessor correctly reports 
> the active cluster state, but other processors that are sensitive to the 
> cluster state, e.g. GridServiceProcessor, are not initialized.
> A reproducer is below. 
> =======================
> Ignite server = IgnitionEx.start("examples/config/persistentstore/example-persistent-store.xml", "server");
> CyclicBarrier barrier = new CyclicBarrier(2);
> // Activate the cluster from a separate thread, released together with the client start.
> Thread activationThread = new Thread(() -> {
>     try {
>         barrier.await();
>         server.active(true);
>     }
>     catch (Exception e) {
>         e.printStackTrace();
>     }
> });
> activationThread.start();
> barrier.await();
> // Start a client node while the activation is in progress.
> IgnitionEx.setClientMode(true);
> Ignite client = IgnitionEx.start("examples/config/persistentstore/example-persistent-store.xml", "client");
> activationThread.join();
> // Hangs if the client joined in the middle of the activation exchange.
> client.services().deployClusterSingleton("myClusterSingleton", new SimpleMapServiceImpl<>());
> =======================
> Here a single server node is started; then a client node is started and the 
> cluster is activated concurrently; finally the client attempts to deploy a 
> service. As a result, the thread calling the deploy method hangs forever 
> with a stack trace like this:
> =======================
> "main@1" prio=5 tid=0x1 nid=NA waiting
>   java.lang.Thread.State: WAITING
>         at sun.misc.Unsafe.park(Unsafe.java:-1)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
>         at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
>         at org.apache.ignite.internal.util.IgniteUtils.awaitQuiet(IgniteUtils.java:7505)
>         at org.apache.ignite.internal.processors.service.GridServiceProcessor.serviceCache(GridServiceProcessor.java:290)
>         at org.apache.ignite.internal.processors.service.GridServiceProcessor.writeServiceToCache(GridServiceProcessor.java:728)
>         at org.apache.ignite.internal.processors.service.GridServiceProcessor.deployAll(GridServiceProcessor.java:634)
>         at org.apache.ignite.internal.processors.service.GridServiceProcessor.deployAll(GridServiceProcessor.java:600)
>         at org.apache.ignite.internal.processors.service.GridServiceProcessor.deployMultiple(GridServiceProcessor.java:488)
>         at org.apache.ignite.internal.processors.service.GridServiceProcessor.deployClusterSingleton(GridServiceProcessor.java:469)
>         at org.apache.ignite.internal.IgniteServicesImpl.deployClusterSingleton(IgniteServicesImpl.java:120)
> =======================
> The behavior depends on timing: the client has to join in the middle of 
> the activation's exchange process. Putting Thread.sleep(4000) into 
> GridDhtPartitionsExchangeFuture.onClusterStateChangeRequest seems to be 
> enough to reproduce the hang on a development laptop.
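
The stack trace above shows the caller parked in CountDownLatch.await (via IgniteUtils.awaitQuiet) inside GridServiceProcessor.serviceCache. The following is a minimal sketch of that failure pattern, not Ignite code: it assumes the latch is released only by an activation callback, which is a guess about the mechanism and is not confirmed by the report. The names LatchHangSketch, StateSensitiveProcessor, onActivate, awaitStarted and tryUse are hypothetical. A bounded wait is used instead of an unbounded one so that the missed callback shows up as a timeout rather than a hang.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a cluster-state-sensitive processor; not Ignite code. */
class LatchHangSketch {
    /** Models a processor that becomes usable only after an activation callback. */
    static final class StateSensitiveProcessor {
        /** Released once, when the activation callback is delivered. */
        private final CountDownLatch startLatch = new CountDownLatch(1);

        /** Assumed to be invoked when the node observes cluster activation. */
        void onActivate() {
            startLatch.countDown();
        }

        /** Bounded wait: returns false if the callback did not arrive in time. */
        boolean awaitStarted(long timeoutMs) {
            try {
                return startLatch.await(timeoutMs, TimeUnit.MILLISECONDS);
            }
            catch (InterruptedException e) {
                // Restore the interrupt flag and report "not started".
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    /**
     * @param activationDelivered Whether the joining node received the activation callback.
     * @return True if the processor finished initialization within the timeout.
     */
    static boolean tryUse(boolean activationDelivered) {
        StateSensitiveProcessor proc = new StateSensitiveProcessor();

        if (activationDelivered)
            proc.onActivate();

        // With an unbounded await() the "missed callback" case would block forever.
        return proc.awaitStarted(200);
    }

    public static void main(String[] args) {
        System.out.println("callback delivered: " + tryUse(true));
        System.out.println("callback missed:    " + tryUse(false));
    }
}
```

In this model, a node that misses the callback never releases the latch, so an unbounded await would block exactly like the deployClusterSingleton call in the stack trace. The timeout only makes the condition observable; it does not address the underlying initialization problem.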



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
