Yakov,

When a node is stopped, all cache futures are completed with an error. Where did you see the hang?
On Sat, Nov 28, 2015 at 3:37 PM, Yakov Zhdanov <[email protected]> wrote:

> Guys,
>
> I see the following code
> (org/apache/ignite/internal/processors/cache/distributed/dht/GridDhtTxPrepareFuture.java:1129):
>
>     try {
>         cctx.io().send(n, req, tx.ioPolicy());
>     }
>     catch (ClusterTopologyCheckedException e) {
>         fut.onNodeLeft(e);
>     }
>     catch (IgniteCheckedException e) {
>         if (!cctx.kernalContext().isStopping())
>             fut.onResult(e);
>     }
>
> Which means that in case if node has just started stop procedure, all cache
> operations may potentially hang. If cache.put() is called from job and node
> is stopping gracefully, stop process hangs with 100% probability.
>
> This issue does not threaten failure detection and nodes crash cases since
> this is handled by separate logic.
>
> I fixed Communication SPI to use its internal stopping flag instead of the
> system wide one and this seems to fix the issue with graceful stop.
>
> Semyon, can you please see if this may cause any other issue of the kind?
>
> My changes are here - https://github.com/apache/ignite/pull/278
>
> --Yakov
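
For reference, here is a minimal standalone sketch of the pattern in the quoted catch block. It uses plain java.util.concurrent.CompletableFuture rather than Ignite's internal futures, and every name in it (StopHangSketch, send, sendSwallowing, sendAlwaysCompleting) is hypothetical. It only illustrates why dropping the exception while the stopping flag is set can leave a future that nobody completes. The second method shows one possible alternative (completing the future regardless of the flag); it is not the change made in the pull request above, which touches the Communication SPI stopping flag instead.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for the discussion above; none of these names are Ignite APIs.
class StopHangSketch {
    /** Simulates a send that always fails, as it might once the node has begun stopping. */
    static void send() throws Exception {
        throw new Exception("Failed to send message (simulated)");
    }

    /** Mirrors the quoted catch block: the error is dropped when the stopping flag is set. */
    static CompletableFuture<Void> sendSwallowing(boolean stopping) {
        CompletableFuture<Void> fut = new CompletableFuture<>();

        try {
            send();

            fut.complete(null);
        }
        catch (Exception e) {
            if (!stopping)
                fut.completeExceptionally(e);
            // Otherwise nothing in this method completes fut. Unless some other
            // logic completes it later, a caller blocked in fut.get() keeps waiting.
        }

        return fut;
    }

    /** Assumed alternative for illustration: complete the future whatever the flag says. */
    static CompletableFuture<Void> sendAlwaysCompleting(boolean stopping) {
        CompletableFuture<Void> fut = new CompletableFuture<>();

        try {
            send();

            fut.complete(null);
        }
        catch (Exception e) {
            // The stopping flag is deliberately ignored here.
            fut.completeExceptionally(e);
        }

        return fut;
    }

    public static void main(String[] args) {
        System.out.println("swallowing, stopping=true, done: "
            + sendSwallowing(true).isDone());
        System.out.println("always completing, stopping=true, done: "
            + sendAlwaysCompleting(true).isDone());
    }
}
```

Whether the real future stays incomplete depends on whether node stop completes it through another path, which is exactly the question raised in the reply at the top of this message.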
