[ 
https://issues.apache.org/jira/browse/IGNITE-10518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16743399#comment-16743399
 ] 

Oleg Ignatenko edited comment on IGNITE-10518 at 1/15/19 9:48 PM:
------------------------------------------------------------------

(x) TeamCity history for the reproducer 
([IgniteTxCachePrimarySyncTest.testSingleKeyCommitFromPrimary|https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8&testNameId=4989034880085631279&tab=testDetails])
 suggests that the problem has not been fixed: I checked the last 100 execution 
results, covering about 30 days since Dec 16 2018, and every one of them shows 
the same "muted failure" result:
{noformat}
Test status     Duration  Build Info                                                                          Changes              Agent
Muted failure   18ms      … MVCC Cache 9  pull/5823/head     #1023  Tests passed: 10, muted: 9              andrey.mashenk… (2)  14 Jan 19 17:34  publicagent17_9096
Muted failure   12ms      … MVCC Cache 9  refs/heads/master  #1020  Tests passed: 10, muted: 9              No changes           14 Jan 19 14:10  publicagent07_9092
Muted failure   24ms      … MVCC Cache 9  refs/heads/master  #1019  Tests passed: 10, muted: 9              No changes           14 Jan 19 13:06  publicagent13_9096
Muted failure   17ms      … MVCC Cache 9  refs/heads/master  #1018  Tests passed: 10, muted: 9              Changes (2)          14 Jan 19 12:17  publicagent10_9092
Muted failure   18ms      … MVCC Cache 9  refs/heads/master  #1017  Tests passed: 10, muted: 9              Changes (2)          14 Jan 19 11:16  publicagent14_9096
Muted failure   14ms      … MVCC Cache 9  refs/heads/master  #1016  Tests passed: 10, muted: 9              No changes           14 Jan 19 10:06  publicagent11_9092
Muted failure   15ms      … MVCC Cache 9  refs/heads/master  #1015  Tests passed: 10, muted: 9              No changes           14 Jan 19 09:17  publicagent10_9096
Muted failure   12ms      … MVCC Cache 9  refs/heads/master  #1014  Tests passed: 10, muted: 9              No changes           14 Jan 19 08:28  publicagent11_9092
Muted failure   25ms      … MVCC Cache 9  refs/heads/master  #1013  Tests passed: 10, muted: 9              No changes           14 Jan 19 07:36  publicagent17_9091
Muted failure   16ms      … MVCC Cache 9  refs/heads/master  #1012  Tests passed: 10, muted: 9              No changes           14 Jan 19 06:46  publicagent11_9096
Muted failure   26ms      … MVCC Cache 9  refs/heads/master  #1011  Tests passed: 10, muted: 9              No changes           14 Jan 19 05:56  publicagent09_9094
Muted failure   8ms       … MVCC Cache 9  refs/heads/master  #1010  Tests passed: 10, muted: 9              No changes           14 Jan 19 05:07  publicagent17_9092
Muted failure   18ms      … MVCC Cache 9  refs/heads/master  #1009  Tests passed: 10, muted: 9              No changes           14 Jan 19 04:16  publicagent15_9094
Muted failure   18ms      … MVCC Cache 9  refs/heads/master  #1008  Tests passed: 10, muted: 9              No changes           14 Jan 19 03:26  publicagent16_9096
Muted failure   25ms      … MVCC Cache 9  refs/heads/master  #1007  Tests passed: 10, muted: 9              No changes           14 Jan 19 01:56  publicagent14_9094
Muted failure   10ms      … MVCC Cache 9  refs/heads/master  #1006  Tests passed: 10, muted: 9              No changes           14 Jan 19 01:06  publicagent16_9093
Muted failure   20ms      … MVCC Cache 9  pull/5814/head     #1005  Tests passed: 9, ignored: 1, muted: 9  Oleg Ignatenko (79)  14 Jan 19 00:16  publicagent06_9092
Muted failure   13ms      … MVCC Cache 9  refs/heads/master  #1004  Tests passed: 10, muted: 9              No changes           13 Jan 19 23:47  publicagent16_9092
... etc{noformat}
----
I found this while re-running the TC bot to get a visa for IGNITE-10796, 
because my branch picked up the unmuted test from master. I re-ran the MVCC 9 
suite several times; every run failed with an execution timeout, and the suite 
passed only after I suppressed execution of the reproducer again.

A typical thread dump I observed from the timed-out test:
{noformat}
"sys-stripe-0-#557%distributed.IgniteTxCachePrimarySyncTest0%" #631 prio=5 os_prio=0 tid=0x00007f7861d06000 nid=0x73583 waiting on condition [0x00007f7817af9000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
        at org.apache.ignite.internal.util.StripedExecutor$StripeConcurrentQueue.take(StripedExecutor.java:672)
        at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:494)
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
        at java.lang.Thread.run(Thread.java:748){noformat}
(i) Reopening the ticket because of the above. [~amashenkov], [~gvvinblade], 
if I am mistaken and you can provide successful TeamCity execution results for 
this test case (or, better yet, a TC bot visa for this PR), please feel free 
to close it again.


> MVCC: Update operation may hangs on backup on unstable topology. 
> -----------------------------------------------------------------
>
>                 Key: IGNITE-10518
>                 URL: https://issues.apache.org/jira/browse/IGNITE-10518
>             Project: Ignite
>          Issue Type: Bug
>          Components: mvcc
>            Reporter: Andrew Mashenkov
>            Assignee: Andrew Mashenkov
>            Priority: Critical
>              Labels: Hanging, failover, mvcc_stabilization_stage_1
>             Fix For: 2.8
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Update operation may hang on backup while awaiting the next topology version.
> Symptoms: 
>  # Exchange for topology version 6.1 has finished.
>  # Exchange for topology version 6.2 is waiting for partition release.
>  # DhtTxRemote is waiting for the exchange.
> It seems the tx is mapped to an outdated topology version.
> Reproducer: IgniteTxCachePrimarySyncTest.testSingleKeyCommit() in MVCC mode.
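
One possible reading of the symptoms above is a circular wait: the exchange cannot finish until partitions are released, and the remote tx that would release them is itself waiting for that exchange. The sketch below only models this dependency with plain CompletableFuture objects. It is not Ignite code, and every name in it (CircularWaitSketch, exchange62Done, partitionsReleased, remoteTx) is hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CircularWaitSketch {
    /** Returns true if the modelled remote tx is still waiting after timeoutMs. */
    static boolean updateHangs(long timeoutMs) {
        // Hypothetical stand-ins, not Ignite classes.
        CompletableFuture<Void> exchange62Done = new CompletableFuture<>();
        CompletableFuture<Void> partitionsReleased = new CompletableFuture<>();

        // The exchange finishes only after partitions are released.
        partitionsReleased.thenRun(() -> exchange62Done.complete(null));

        // The remote tx proceeds only after the exchange finishes,
        // and partitions are released only once the tx has proceeded.
        CompletableFuture<Void> remoteTx =
            exchange62Done.thenRun(() -> partitionsReleased.complete(null));

        try {
            remoteTx.get(timeoutMs, TimeUnit.MILLISECONDS);
            return false; // someone broke the cycle
        } catch (TimeoutException e) {
            return true;  // nobody completes either future: the wait is circular
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("update hangs: " + updateHangs(200));
    }
}
```

In this model the only way out is for something external to complete one of the two futures, which matches the observation that the suite finishes only by hitting the execution timeout.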



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
