[ 
https://issues.apache.org/jira/browse/CASSANDRA-15685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17109647#comment-17109647
 ] 

Ekaterina Dimitrova edited comment on CASSANDRA-15685 at 5/17/20, 9:20 PM:
---------------------------------------------------------------------------

After a couple of hundred more runs of this test (my gut feeling told me that I 
was missing something), it was confirmed that the lossy notifications are not 
the primary issue with this test.
In some cases, even if we catch the notifications for success/error and the 
flags "success" and "wasConsistent" are properly set, the PreviewRepair still 
sees the Incremental Repair as running.

{code:java}
[junit-timeout] java.lang.RuntimeException: Repair session 
82ff3420-9737-11ea-b32d-7fa12d874715 for range [(-1,9223372036854775805], 
(9223372036854775805,-1]] failed with error An incremental repair with session 
id 82eff1e0-9737-11ea-b32d-7fa12d874715 finished during this preview repair 
runtime
{code}

It turns out that getting the notification doesn't always mean that the rest of 
the nodes have already been informed about the completion. I can easily 
increase the wait time before the preview repair starts.
But [~dcapwell] and I were considering opening a ticket, as there might be 
other parts of the code or tools relying only on the notifications for 
completion. That is worth checking.
Also, tomorrow I am going to check in detail how we can improve this test so 
that it relies not on timing but probably on some metadata.
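
To make the "not relying on timing" idea concrete, here is a minimal sketch of 
a polling helper, assuming the test can express "all nodes know the incremental 
repair has completed" as a boolean check. The class name RepairAwait and the 
method await are hypothetical and not part of Cassandra; the real check against 
node metadata is still to be worked out, so it is passed in as a plain 
BooleanSupplier.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not an existing Cassandra class.
final class RepairAwait {
    private RepairAwait() {}

    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns true if the condition was met, false on timeout or interrupt.
     */
    static boolean await(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            // Assumption: the caller supplies a check such as
            // "every node reports the incremental repair session as completed".
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

The test would then start the preview repair only after await(...) returns 
true, instead of sleeping for a fixed amount of time after the notification 
arrives.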


was (Author: e.dimitrova):
After a couple of hundred more runs of this test (my gut feeling told me that I 
was missing something), it was confirmed that the lossy notifications are not 
the primary issue with this test.
The thing is that even if we catch the notifications for success/error and the 
flags "success" and "wasConsistent" are properly set, the PreviewRepair still 
sees the Incremental Repair as running.

{code:java}
[junit-timeout] java.lang.RuntimeException: Repair session 
82ff3420-9737-11ea-b32d-7fa12d874715 for range [(-1,9223372036854775805], 
(9223372036854775805,-1]] failed with error An incremental repair with session 
id 82eff1e0-9737-11ea-b32d-7fa12d874715 finished during this preview repair 
runtime
{code}

It turns out that getting the notification doesn't always mean that the rest of 
the nodes have already been informed about the completion. I can easily 
increase the wait time before the preview repair starts.
But [~dcapwell] and I were considering opening a ticket, as there might be 
other parts of the code or tools relying only on the notifications for 
completion. That is worth checking.
Also, tomorrow I am going to check in detail how we can improve this test so 
that it relies not on timing but probably on some metadata.

> flaky testWithMismatchingPending - 
> org.apache.cassandra.distributed.test.PreviewRepairTest
> ------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15685
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15685
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/dtest
>            Reporter: Kevin Gallardo
>            Assignee: Ekaterina Dimitrova
>            Priority: Normal
>              Labels: pull-request-available
>             Fix For: 4.0-alpha
>
>         Attachments: log-CASSANDRA-15685.txt, output
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Observed in: 
> https://app.circleci.com/pipelines/github/newkek/cassandra/34/workflows/1c6b157d-13c3-48a9-85fb-9fe8c153256b/jobs/191/tests
> Failure:
> {noformat}
> testWithMismatchingPending - 
> org.apache.cassandra.distributed.test.PreviewRepairTest
> junit.framework.AssertionFailedError
>       at 
> org.apache.cassandra.distributed.test.PreviewRepairTest.testWithMismatchingPending(PreviewRepairTest.java:97)
> {noformat}
> [Circle 
> CI|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15685]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
