[ 
https://issues.apache.org/jira/browse/IGNITE-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196248#comment-15196248
 ] 

Andrey Gura commented on IGNITE-2797:
-------------------------------------

During debugging of hanging tests I found a couple of suspicious places:

1. The {{GridDhtTransactionalCacheAdapter.lockAllAsyncInternal()}} method can 
create a {{GridDhtLockFuture}} instance with a zero timeout. In that case no 
timeout object is registered, so the future can stay in an uncompleted state.

2. Sometimes the test throws an {{AssertionError}} from the 
{{GridDhtTransactionalCacheAdapter}} class (see the stack trace below). As a 
result, the future listener that creates {{GridNearLockResponse}} instances 
fails.

The {{txState.empty()}} method can return {{true}} when an exception has 
occurred. This happens when {{GridDhtTxLocalAdapter.lockAllAsync()}} returns a 
{{GridFinishedFuture}} instance after the {{checkValid}} method has thrown an 
exception because the timeout was exceeded. So this is a valid case, and the 
assertion can be replaced by something like {{assert !t.empty() || e != null}}.
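
For illustration, the relaxed check could look like the minimal sketch below. 
{{TxState}} and {{onLockDone}} are hypothetical stand-ins, not the actual 
Ignite classes or the code at the failing line. The sketch only shows the 
intended rule: an empty transaction state is acceptable as long as an error is 
present. It uses an explicit check rather than the {{assert}} keyword so that 
the behavior does not depend on the {{-ea}} JVM flag.

```java
import java.util.ArrayList;
import java.util.List;

public class RelaxedAssertionSketch {
    /** Hypothetical stand-in for the transaction state; not the Ignite class. */
    static final class TxState {
        private final List<String> entries = new ArrayList<>();

        void addEntry(String key) {
            entries.add(key);
        }

        boolean empty() {
            return entries.isEmpty();
        }
    }

    /**
     * Hypothetical listener body that runs when the lock future is done.
     *
     * @param t Transaction state (may be empty if locking failed early).
     * @param e Error the future finished with, or null on success.
     * @return Description of the response that would be sent.
     */
    static String onLockDone(TxState t, Throwable e) {
        // Equivalent of: assert !t.empty() || e != null
        // An empty state without an error is still treated as a bug.
        if (t.empty() && e == null)
            throw new AssertionError("Empty tx state without an error");

        // Error case (for example, a timeout): the state is allowed to be empty.
        if (e != null)
            return "response with error: " + e.getMessage();

        return "response, locked entries: " + t.entries.size();
    }

    public static void main(String[] args) {
        TxState locked = new TxState();
        locked.addEntry("key1");
        System.out.println(onLockDone(locked, null));

        // Empty state plus an error no longer trips the check.
        TxState timedOut = new TxState();
        System.out.println(onLockDone(timedOut, new RuntimeException("timeout")));
    }
}
```

With this rule the listener can still build an error response when the state 
is empty because of a timeout, while an empty state on the success path keeps 
failing fast.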

{noformat}
Exception in thread "sys-#31%distributed.CacheTxLockTimeoutTest0%" java.lang.AssertionError
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTransactionalCacheAdapter$10.apply(GridDhtTransactionalCacheAdapter.java:971)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTransactionalCacheAdapter$10.apply(GridDhtTransactionalCacheAdapter.java:957)
        at org.apache.ignite.internal.util.future.GridEmbeddedFuture$2.applyx(GridEmbeddedFuture.java:125)
        at org.apache.ignite.internal.util.future.GridEmbeddedFuture$AsyncListener1.apply(GridEmbeddedFuture.java:307)
        at org.apache.ignite.internal.util.future.GridEmbeddedFuture$AsyncListener1.apply(GridEmbeddedFuture.java:300)
        at org.apache.ignite.internal.util.future.GridFinishedFuture.listen(GridFinishedFuture.java:138)
        at org.apache.ignite.internal.util.future.GridEmbeddedFuture.<init>(GridEmbeddedFuture.java:89)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtEmbeddedFuture.<init>(GridDhtEmbeddedFuture.java:60)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTransactionalCacheAdapter.lockAllAsync(GridDhtTransactionalCacheAdapter.java:955)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTransactionalCacheAdapter.processNearLockRequest(GridDhtTransactionalCacheAdapter.java:562)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTransactionalCacheAdapter.access$000(GridDhtTransactionalCacheAdapter.java:88)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTransactionalCacheAdapter$3.apply(GridDhtTransactionalCacheAdapter.java:138)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTransactionalCacheAdapter$3.apply(GridDhtTransactionalCacheAdapter.java:136)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:582)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:280)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:204)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$000(GridCacheIoManager.java:80)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:163)
        at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:822)
        at org.apache.ignite.internal.managers.communication.GridIoManager.access$1600(GridIoManager.java:103)
        at org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:785)
{noformat} 

> Prepare and finish future never time out
> ----------------------------------------
>
>                 Key: IGNITE-2797
>                 URL: https://issues.apache.org/jira/browse/IGNITE-2797
>             Project: Ignite
>          Issue Type: Bug
>          Components: cache
>    Affects Versions: 1.5.0.final
>            Reporter: Valentin Kulichenko
>            Priority: Blocker
>              Labels: community, customer, important
>             Fix For: 1.6
>
>
> Even if a transaction timeout is configured, a transaction will not time out 
> if it is already in the prepare state. It will be shown in the log as a 
> pending transaction and can cause the whole cluster to hang.
> We need to add a mechanism that will properly time out prepare and (if 
> possible) finish futures.
> Also, we can create an event that will be fired if a transaction has been 
> pending for a long time, showing which nodes we are waiting for responses 
> from. This will allow the user to recover by stopping only these nodes 
> instead of restarting the whole cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
