[
https://issues.apache.org/jira/browse/IGNITE-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196248#comment-15196248
]
Andrey Gura commented on IGNITE-2797:
-------------------------------------
While debugging the hanging tests I found a couple of suspicious places:
1. The {{GridDhtTransactionalCacheAdapter.lockAllAsyncInternal()}} method can
create a {{GridDhtLockFuture}} instance with a zero timeout. In that case the
timeout object is not registered, and the future can remain in a not-completed state.
2. Sometimes the test throws an {{AssertionError}} from the
{{GridDhtTransactionalCacheAdapter}} class (see the stack trace below). As a result,
the future listener that creates {{GridNearLockResponse}} instances fails.
The {{txState.empty()}} method can return {{true}} when an exception has occurred. This
happens when the {{GridDhtTxLocalAdapter.lockAllAsync()}} method returns a
{{GridFinishedFuture}} instance after the {{checkValid}} method threw an exception
because the timeout was exceeded. So this is a valid case, and the assertion can be
replaced by something like {{assert !t.empty() || e != null}}.
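
Point 1 can be illustrated with a minimal, self-contained sketch. It does not use Ignite classes: {{newLockFuture}}, {{completesWithin}} and the class name are hypothetical, and the {{timeoutMs > 0}} guard is only an assumption about how a zero timeout ends up skipping timeout registration. The sketch shows that a future created this way stays pending unless something else completes it.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class LockFutureTimeoutSketch {
    /** Hypothetical stand-in for a lock future; this is not Ignite's GridDhtLockFuture. */
    static CompletableFuture<Boolean> newLockFuture(ScheduledExecutorService timer, long timeoutMs) {
        CompletableFuture<Boolean> fut = new CompletableFuture<>();

        // Assumption: the timeout task is registered only for a positive timeout,
        // so a zero timeout leaves nothing that could ever fail the future.
        if (timeoutMs > 0) {
            timer.schedule(() -> {
                fut.completeExceptionally(new TimeoutException("lock timeout"));
            }, timeoutMs, TimeUnit.MILLISECONDS);
        }

        return fut;
    }

    /** Returns true if the future completes (normally or exceptionally) within waitMs. */
    static boolean completesWithin(CompletableFuture<?> fut, long waitMs) {
        try {
            fut.get(waitMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (ExecutionException e) {
            return true;  // completed exceptionally, e.g. by the timeout task
        } catch (TimeoutException e) {
            return false; // the wait elapsed and the future is still pending
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        try {
            System.out.println("timeout=50 completes: "
                + completesWithin(newLockFuture(timer, 50), 2000));
            System.out.println("timeout=0 completes: "
                + completesWithin(newLockFuture(timer, 0), 200));
        } finally {
            timer.shutdownNow();
        }
    }
}
```

In this sketch, nothing ever completes the zero-timeout future, which is the hang pattern described in point 1.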
{noformat}
Exception in thread "sys-#31%distributed.CacheTxLockTimeoutTest0%" java.lang.AssertionError
    at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTransactionalCacheAdapter$10.apply(GridDhtTransactionalCacheAdapter.java:971)
    at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTransactionalCacheAdapter$10.apply(GridDhtTransactionalCacheAdapter.java:957)
    at org.apache.ignite.internal.util.future.GridEmbeddedFuture$2.applyx(GridEmbeddedFuture.java:125)
    at org.apache.ignite.internal.util.future.GridEmbeddedFuture$AsyncListener1.apply(GridEmbeddedFuture.java:307)
    at org.apache.ignite.internal.util.future.GridEmbeddedFuture$AsyncListener1.apply(GridEmbeddedFuture.java:300)
    at org.apache.ignite.internal.util.future.GridFinishedFuture.listen(GridFinishedFuture.java:138)
    at org.apache.ignite.internal.util.future.GridEmbeddedFuture.<init>(GridEmbeddedFuture.java:89)
    at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtEmbeddedFuture.<init>(GridDhtEmbeddedFuture.java:60)
    at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTransactionalCacheAdapter.lockAllAsync(GridDhtTransactionalCacheAdapter.java:955)
    at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTransactionalCacheAdapter.processNearLockRequest(GridDhtTransactionalCacheAdapter.java:562)
    at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTransactionalCacheAdapter.access$000(GridDhtTransactionalCacheAdapter.java:88)
    at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTransactionalCacheAdapter$3.apply(GridDhtTransactionalCacheAdapter.java:138)
    at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTransactionalCacheAdapter$3.apply(GridDhtTransactionalCacheAdapter.java:136)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:582)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:280)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:204)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$000(GridCacheIoManager.java:80)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:163)
    at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:822)
    at org.apache.ignite.internal.managers.communication.GridIoManager.access$1600(GridIoManager.java:103)
    at org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:785)
{noformat}
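
The relaxed assertion proposed in point 2 can be sketched in isolation as follows. {{TxState}}, {{strictCheck}} and {{relaxedCheck}} are hypothetical names made up for this illustration, and the strict form {{!t.empty()}} is an assumption about what the current assertion checks; only the relaxed condition is taken from the comment above.

```java
import java.util.ArrayList;
import java.util.List;

public class RelaxedAssertionSketch {
    /** Hypothetical stand-in for transaction state; it models only empty(). */
    static final class TxState {
        private final List<String> entries = new ArrayList<>();

        void add(String key) {
            entries.add(key);
        }

        boolean empty() {
            return entries.isEmpty();
        }
    }

    /** Assumed current condition: the state must never be empty in the listener. */
    static boolean strictCheck(TxState t, Throwable e) {
        return !t.empty();
    }

    /** Proposed condition: an empty state is acceptable when the lock attempt failed. */
    static boolean relaxedCheck(TxState t, Throwable e) {
        return !t.empty() || e != null;
    }

    public static void main(String[] args) {
        TxState empty = new TxState();
        Throwable timeout = new RuntimeException("transaction timed out");

        // Timeout path: no entries were enlisted, but an error is present.
        System.out.println("strict passes:  " + strictCheck(empty, timeout));
        System.out.println("relaxed passes: " + relaxedCheck(empty, timeout));

        // An empty state without an error is still rejected by the relaxed check.
        System.out.println("relaxed passes without error: " + relaxedCheck(empty, null));
    }
}
```

The relaxed condition still catches the genuinely unexpected case (empty state and no error), so it narrows the assertion rather than removing it.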
> Prepare and finish future never time out
> ----------------------------------------
>
> Key: IGNITE-2797
> URL: https://issues.apache.org/jira/browse/IGNITE-2797
> Project: Ignite
> Issue Type: Bug
> Components: cache
> Affects Versions: 1.5.0.final
> Reporter: Valentin Kulichenko
> Priority: Blocker
> Labels: community, customer, important
> Fix For: 1.6
>
>
> Even if a transaction timeout is configured, a transaction will not time out if
> it is already in the prepare state. It will be shown in the log as a pending
> transaction and can cause the whole cluster to hang.
> We need to add a mechanism that will properly time out prepare and (if
> possible) finish futures.
> Also, we can create an event that is fired when a transaction has been
> pending for a long time, showing which nodes we are waiting for responses from.
> This will allow the user to recover by stopping only those nodes instead of
> restarting the whole cluster.