[ 
https://issues.apache.org/jira/browse/IGNITE-11238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Plekhanov updated IGNITE-11238:
---------------------------------------
    Fix Version/s:     (was: 2.9)
                   2.10

> Possible hang on exchange
> -------------------------
>
>                 Key: IGNITE-11238
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11238
>             Project: Ignite
>          Issue Type: Bug
>          Components: general
>            Reporter: Igor Seliverstov
>            Priority: Critical
>             Fix For: 2.10
>
>
> Currently an exchange may hang for a while (two network timeouts) waiting 
> for a latch to be released (see 
> {{GridDhtPartitionsExchangeFuture#waitPartitionRelease releaseLatch}}) if 
> the topology version being processed has not yet been added to the discovery 
> history but the coordinator has already received the client's acknowledgement:
> {code:java}
> [2019-02-06 
> 17:43:17,009][ERROR][sys-#43%mvcc.CacheMvccPartitionedSqlCoordinatorFailoverTest0%][ExchangeLatchManager]
>  Topology AffinityTopologyVersion [topVer=24, minorTopVer=0] not found in 
> discovery history ; consider increasing IGNITE_DISCOVERY_HISTORY_SIZE 
> property. Current value is -1
> class org.apache.ignite.IgniteException: Topology AffinityTopologyVersion 
> [topVer=24, minorTopVer=0] not found in discovery history ; consider 
> increasing IGNITE_DISCOVERY_HISTORY_SIZE property. Current value is -1
>       at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.latch.ExchangeLatchManager.aliveNodesForTopologyVer(ExchangeLatchManager.java:260)
>       at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.latch.ExchangeLatchManager.getLatchCoordinator(ExchangeLatchManager.java:302)
>       at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.latch.ExchangeLatchManager.processAck(ExchangeLatchManager.java:351)
>       at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.latch.ExchangeLatchManager.lambda$new$0(ExchangeLatchManager.java:121)
>       at 
> org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1561)
>       at 
> org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1189)
>       at 
> org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:127)
>       at 
> org.apache.ignite.internal.managers.communication.GridIoManager$8.run(GridIoManager.java:1086)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)
> {code}
> As a result, the received ack is not processed, so we have to wait for the 
> retry:
> {code:java}
>                     // Try to resend ack.
>                     releaseLatch.countDown();
> {code}
> To solve the issue, we need to check whether the version is present in the 
> discovery history and, if it is not, put the ack into a pending map (see 
> {{ExchangeLatchManager#pendingAcks}}).
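
The proposed fix can be sketched as a small, self-contained model. This is only an illustration of the "defer the ack until the version is known" idea under simplifying assumptions; it is not the Ignite implementation. The class `PendingAckSketch`, its methods, and the use of a plain `long` for the topology version are all hypothetical. Only the name `pendingAcks` is taken from the issue text.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

/** Hypothetical, simplified model of the idea; NOT Ignite's ExchangeLatchManager. */
final class PendingAckSketch {
    /** Topology versions already recorded in the (modelled) discovery history. */
    private final Set<Long> discoveryHistory = new HashSet<>();

    /** Acks that arrived before their topology version was recorded. */
    private final Map<Long, Set<UUID>> pendingAcks = new HashMap<>();

    /** Acks that have been applied, stored as "topVer:nodeId". */
    private final List<String> processed = new ArrayList<>();

    /**
     * Handles an ack from a node. If the version is not in the history yet,
     * the ack is stored instead of failing, so the sender need not retry.
     *
     * @return true if the ack was processed immediately, false if it was deferred.
     */
    synchronized boolean processAck(long topVer, UUID nodeId) {
        if (!discoveryHistory.contains(topVer)) {
            pendingAcks.computeIfAbsent(topVer, v -> new HashSet<>()).add(nodeId);
            return false;
        }
        processed.add(topVer + ":" + nodeId);
        return true;
    }

    /**
     * Records a topology version and replays any acks deferred for it.
     *
     * @return the number of deferred acks that were replayed.
     */
    synchronized int onTopologyVersionRecorded(long topVer) {
        discoveryHistory.add(topVer);
        Set<UUID> deferred = pendingAcks.remove(topVer);
        if (deferred == null)
            return 0;
        for (UUID nodeId : deferred)
            processed.add(topVer + ":" + nodeId);
        return deferred.size();
    }

    /** Number of acks applied so far. */
    synchronized int processedCount() {
        return processed.size();
    }

    /** Number of acks currently deferred for the given version. */
    synchronized int pendingCount(long topVer) {
        Set<UUID> deferred = pendingAcks.get(topVer);
        return deferred == null ? 0 : deferred.size();
    }
}
```

In this model an early ack for version 24 is parked in `pendingAcks` and applied as soon as `onTopologyVersionRecorded(24)` is called, rather than being dropped with an exception and recovered only after the sender's retry.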



--
This message was sent by Atlassian Jira
(v8.3.4#803005)