Hi,

Could you provide the full logs or a reproducer?

In any case, the server thread dump shows the exchange waiting for something,
and you should see the reason in the logs after the phrase
"Failed to wait for partition release future".
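
If the logs are large, something like the sketch below could pull out the relevant part before you send it. It only searches for the phrase quoted above and prints the lines that follow it. The class name, the 20-line context window, and taking the log file path as the first argument are my own assumptions for illustration, not anything Ignite provides.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Ignite: extracts the log lines that follow
// each "Failed to wait for partition release future" message.
class ExchangeLogScan {
    static final String MARKER = "Failed to wait for partition release future";

    /** Returns each marker line plus up to {@code context} lines after it, without duplicates. */
    static List<String> linesAfterMarker(List<String> lines, int context) {
        List<String> out = new ArrayList<>();
        int printedUpTo = 0; // index of the first line not yet copied
        for (int i = 0; i < lines.size(); i++) {
            if (lines.get(i).contains(MARKER)) {
                int end = Math.min(lines.size(), i + 1 + context);
                int start = Math.max(i, printedUpTo);
                if (start < end) {
                    out.addAll(lines.subList(start, end));
                }
                printedUpTo = Math.max(printedUpTo, end);
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ExchangeLogScan <path-to-server-log>");
            return;
        }
        // The 20-line window is an arbitrary choice; widen it if the dump is cut off.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        linesAfterMarker(lines, 20).forEach(System.out::println);
    }
}
```

Run it against each server node's log; the output after the marker is what I would need to see.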

On Wed, May 24, 2017 at 7:31 AM, bintisepaha <[email protected]> wrote:

> Hi Igniters,
>
> We have been testing with Ignite 1.9.0 and have a client that runs a
> simple (no-join) SQL query on a single distributed cache. If we kill the
> server node for testing while the client is running this query, the
> whole cluster stalls.
>
> All we have to do for the grid to resume functioning is restart the client.
> This may have something to do with data rebalancing when a server node
> dies. Would setting a rebalanceDelay help? We are using the default of 0 now.
>
> How can a client affect the whole cluster like this, and why does restarting
> it fix the stall? The server nodes' exchange worker threads are stuck on
> partitioning data.
>
> Client thread stuck below (thread dump)
>
> Name: main
> State: TIMED_WAITING
> Total blocked: 40  Total waited: 102,828
>
> Stack trace:
> java.lang.Thread.sleep(Native Method)
> org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor.query(GridReduceQueryExecutor.java:494)
> org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing$7.iterator(IgniteH2Indexing.java:1315)
> org.apache.ignite.internal.processors.cache.QueryCursorImpl.iterator(QueryCursorImpl.java:94)
> org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing$8.iterator(IgniteH2Indexing.java:1355)
> org.apache.ignite.internal.processors.cache.QueryCursorImpl.iterator(QueryCursorImpl.java:94)
> com.tudor.server.grid.matching.GridMatcher.getTradeOrdersForPSGroup(GridMatcher.java:322)
> com.tudor.server.grid.matching.MatcherDelegate.unmatchRematch(MatcherDelegate.java:101)
> com.tudor.server.grid.matching.GridMatcher.processPendingOrder(GridMatcher.java:275)
> com.tudor.server.grid.matching.GridMatcher.run(GridMatcher.java:201)
> com.tudor.server.grid.matching.GridMatcher.main(GridMatcher.java:99)
>
>
> server node exchange worker thread dump
>
>
> "exchange-worker-#34%DataGridServer-Development%" Id=68 in TIMED_WAITING on lock=org.apache.ignite.internal.util.future.GridCompoundFuture@7e9c149b
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>   at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:189)
>   at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:139)
>   at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.waitPartitionRelease(GridDhtPartitionsExchangeFuture.java:779)
>   at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.distributedExchange(GridDhtPartitionsExchangeFuture.java:732)
>   at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:489)
>   at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:1674)
>   at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
>   at java.lang.Thread.run(Thread.java:745)
>
> Any help is appreciated.
>
> Thanks,
> Binti
>
>
>
> --
> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/SQL-query-on-client-stalling-the-grid-when-server-node-dies-tp13107.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>
