[jira] [Commented] (CASSANDRA-17804) AutoSnapshotTtlTest#testAutoSnapshotTTlOnDropAfterRestart fails sporadically on missing stdout contents

2022-08-31 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598340#comment-17598340
 ] 

Andres de la Peña commented on CASSANDRA-17804:
---

Committed to 4.1 as 
[1ee5df02b1f98cf38f126d47a7f3fb153f790d52|https://github.com/apache/cassandra/commit/1ee5df02b1f98cf38f126d47a7f3fb153f790d52]
 and merged up to 
[trunk|https://github.com/apache/cassandra/commit/b39596b65c6bdbed5f2be35008291c69b41f1675].

> AutoSnapshotTtlTest#testAutoSnapshotTTlOnDropAfterRestart fails sporadically 
> on missing stdout contents
> ---
>
> Key: CASSANDRA-17804
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17804
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Snapshots
>Reporter: Caleb Rackliffe
>Assignee: Paulo Motta
>Priority: Normal
> Fix For: 4.1-beta, 4.x
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> See 
> [https://app.circleci.com/pipelines/github/maedhroz/cassandra/487/workflows/0ad42210-2979-4c5d-a4e8-d8cedf9285a7/jobs/4686/tests#failed-test-0]
>  
> My guess is that in some resource constrained environment, even the first 
> {{nodeool listsnapshots}} invocation doesn’t have “dropped” in the stdout. In 
> other words, we skip to the state of the world the last assertion in the test 
> is checking.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17804) AutoSnapshotTtlTest#testAutoSnapshotTTlOnDropAfterRestart fails sporadically on missing stdout contents

2022-08-31 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598301#comment-17598301
 ] 

Andres de la Peña commented on CASSANDRA-17804:
---

Sure, starting commit.

> AutoSnapshotTtlTest#testAutoSnapshotTTlOnDropAfterRestart fails sporadically 
> on missing stdout contents
> ---
>
> Key: CASSANDRA-17804
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17804
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Snapshots
>Reporter: Caleb Rackliffe
>Assignee: Paulo Motta
>Priority: Normal
> Fix For: 4.1-beta, 4.x
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> See 
> [https://app.circleci.com/pipelines/github/maedhroz/cassandra/487/workflows/0ad42210-2979-4c5d-a4e8-d8cedf9285a7/jobs/4686/tests#failed-test-0]
>  
> My guess is that in some resource constrained environment, even the first 
> {{nodeool listsnapshots}} invocation doesn’t have “dropped” in the stdout. In 
> other words, we skip to the state of the world the last assertion in the test 
> is checking.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17804) AutoSnapshotTtlTest#testAutoSnapshotTTlOnDropAfterRestart fails sporadically on missing stdout contents

2022-08-30 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17597984#comment-17597984
 ] 

Paulo Motta commented on CASSANDRA-17804:
-

Thanks for the test runs [~adelapena]. Would you mind committing this? 
Unfortunately I'm not able to commit this before a few days.

> AutoSnapshotTtlTest#testAutoSnapshotTTlOnDropAfterRestart fails sporadically 
> on missing stdout contents
> ---
>
> Key: CASSANDRA-17804
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17804
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Snapshots
>Reporter: Caleb Rackliffe
>Assignee: Paulo Motta
>Priority: Normal
> Fix For: 4.1-beta, 4.x
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> See 
> [https://app.circleci.com/pipelines/github/maedhroz/cassandra/487/workflows/0ad42210-2979-4c5d-a4e8-d8cedf9285a7/jobs/4686/tests#failed-test-0]
>  
> My guess is that in some resource constrained environment, even the first 
> {{nodeool listsnapshots}} invocation doesn’t have “dropped” in the stdout. In 
> other words, we skip to the state of the world the last assertion in the test 
> is checking.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17804) AutoSnapshotTtlTest#testAutoSnapshotTTlOnDropAfterRestart fails sporadically on missing stdout contents

2022-08-30 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17597983#comment-17597983
 ] 

Andres de la Peña commented on CASSANDRA-17804:
---

It seems that all the runs above have succeeded :)

> AutoSnapshotTtlTest#testAutoSnapshotTTlOnDropAfterRestart fails sporadically 
> on missing stdout contents
> ---
>
> Key: CASSANDRA-17804
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17804
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Snapshots
>Reporter: Caleb Rackliffe
>Assignee: Paulo Motta
>Priority: Normal
> Fix For: 4.1-beta, 4.x
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> See 
> [https://app.circleci.com/pipelines/github/maedhroz/cassandra/487/workflows/0ad42210-2979-4c5d-a4e8-d8cedf9285a7/jobs/4686/tests#failed-test-0]
>  
> My guess is that in some resource constrained environment, even the first 
> {{nodeool listsnapshots}} invocation doesn’t have “dropped” in the stdout. In 
> other words, we skip to the state of the world the last assertion in the test 
> is checking.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17804) AutoSnapshotTtlTest#testAutoSnapshotTTlOnDropAfterRestart fails sporadically on missing stdout contents

2022-08-30 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17597969#comment-17597969
 ] 

Andres de la Peña commented on CASSANDRA-17804:
---

Here are some repeated runs of the fixed flaky test, still running:
||branch||vnodes||CI||
|4.1|false|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2023/workflows/942564a7-5d9f-4c12-b2f0-4ca5cc490749]
 
[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/2023/workflows/978ca93d-b464-478c-a3df-6628f0f58545]|
|4.1|true|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2026/workflows/32cd730b-e299-4889-8835-770156956ac7]
 
[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/2026/workflows/9702e9e6-fc1f-4981-8d83-4ce7d3ef65f2]|
|trunk|false|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2024/workflows/8e78ad06-c453-4bb4-ad32-4ebce1ddc2be]
 
[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/2024/workflows/bdd5cf85-5766-4352-9a20-05b01feb0aa2]|
|trunk|true|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2024/workflows/8e78ad06-c453-4bb4-ad32-4ebce1ddc2be]
 
[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/2025/workflows/f3ef33b3-37ee-44fb-b5c3-274dc24a338b]|

The CircleCI config file for these runs has been generated with:
{code:java}
.circleci/generate.sh -m \
  -e REPEATED_UTEST_TARGET=test-jvm-dtest-some \
  -e REPEATED_UTEST_COUNT=500 \
  -e 
REPEATED_UTEST_CLASS=org.apache.cassandra.distributed.test.AutoSnapshotTtlTest \
  -e REPEATED_UTEST_METHODS=testAutoSnapshotTTlOnDropAfterRestart \
  -e REPEATED_UTEST_VNODES=true
{code}

> AutoSnapshotTtlTest#testAutoSnapshotTTlOnDropAfterRestart fails sporadically 
> on missing stdout contents
> ---
>
> Key: CASSANDRA-17804
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17804
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Snapshots
>Reporter: Caleb Rackliffe
>Assignee: Paulo Motta
>Priority: Normal
> Fix For: 4.1-beta, 4.x
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> See 
> [https://app.circleci.com/pipelines/github/maedhroz/cassandra/487/workflows/0ad42210-2979-4c5d-a4e8-d8cedf9285a7/jobs/4686/tests#failed-test-0]
>  
> My guess is that in some resource constrained environment, even the first 
> {{nodeool listsnapshots}} invocation doesn’t have “dropped” in the stdout. In 
> other words, we skip to the state of the world the last assertion in the test 
> is checking.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17804) AutoSnapshotTtlTest#testAutoSnapshotTTlOnDropAfterRestart fails sporadically on missing stdout contents

2022-08-29 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17597287#comment-17597287
 ] 

Caleb Rackliffe commented on CASSANDRA-17804:
-

+1

> AutoSnapshotTtlTest#testAutoSnapshotTTlOnDropAfterRestart fails sporadically 
> on missing stdout contents
> ---
>
> Key: CASSANDRA-17804
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17804
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Snapshots
>Reporter: Caleb Rackliffe
>Assignee: Paulo Motta
>Priority: Normal
> Fix For: 4.1-beta, 4.x
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> See 
> [https://app.circleci.com/pipelines/github/maedhroz/cassandra/487/workflows/0ad42210-2979-4c5d-a4e8-d8cedf9285a7/jobs/4686/tests#failed-test-0]
>  
> My guess is that in some resource constrained environment, even the first 
> {{nodeool listsnapshots}} invocation doesn’t have “dropped” in the stdout. In 
> other words, we skip to the state of the world the last assertion in the test 
> is checking.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17804) AutoSnapshotTtlTest#testAutoSnapshotTTlOnDropAfterRestart fails sporadically on missing stdout contents

2022-08-26 Thread Benjamin Lerer (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17585331#comment-17585331
 ] 

Benjamin Lerer commented on CASSANDRA-17804:


[~maedhroz] are you happy with the latest changes?

> AutoSnapshotTtlTest#testAutoSnapshotTTlOnDropAfterRestart fails sporadically 
> on missing stdout contents
> ---
>
> Key: CASSANDRA-17804
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17804
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Snapshots
>Reporter: Caleb Rackliffe
>Assignee: Paulo Motta
>Priority: Normal
> Fix For: 4.1-beta, 4.x
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> See 
> [https://app.circleci.com/pipelines/github/maedhroz/cassandra/487/workflows/0ad42210-2979-4c5d-a4e8-d8cedf9285a7/jobs/4686/tests#failed-test-0]
>  
> My guess is that in some resource constrained environment, even the first 
> {{nodeool listsnapshots}} invocation doesn’t have “dropped” in the stdout. In 
> other words, we skip to the state of the world the last assertion in the test 
> is checking.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17804) AutoSnapshotTtlTest#testAutoSnapshotTTlOnDropAfterRestart fails sporadically on missing stdout contents

2022-08-17 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581029#comment-17581029
 ] 

Paulo Motta commented on CASSANDRA-17804:
-

Thanks for the feedback. Addressed nit comments and resubmitted CI for 
4.1/trunk:

|[trunk|https://github.com/apache/cassandra/pull/1596]|[CI|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/1651/]|
|[4.1|https://github.com/pauloricardomg/cassandra/tree/CASSANDRA-17804-4.1]|[CI|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/1884/]|

> AutoSnapshotTtlTest#testAutoSnapshotTTlOnDropAfterRestart fails sporadically 
> on missing stdout contents
> ---
>
> Key: CASSANDRA-17804
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17804
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Snapshots
>Reporter: Caleb Rackliffe
>Assignee: Paulo Motta
>Priority: Normal
> Fix For: 4.1-beta, 4.x
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> See 
> [https://app.circleci.com/pipelines/github/maedhroz/cassandra/487/workflows/0ad42210-2979-4c5d-a4e8-d8cedf9285a7/jobs/4686/tests#failed-test-0]
>  
> My guess is that in some resource constrained environment, even the first 
> {{nodeool listsnapshots}} invocation doesn’t have “dropped” in the stdout. In 
> other words, we skip to the state of the world the last assertion in the test 
> is checking.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17804) AutoSnapshotTtlTest#testAutoSnapshotTTlOnDropAfterRestart fails sporadically on missing stdout contents

2022-08-15 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17579888#comment-17579888
 ] 

Caleb Rackliffe commented on CASSANDRA-17804:
-

I dropped a couple nits in the PR, but LGTM overall. I guess the difficulty 
here is that you can't strictly guess how long the restart will take, and 
therefore can't know when the first check on {{listsnapshots}} is irrelevant. 
The only alternative I could see is inspecting the stdout, pulling a timestamp 
of some kind from it, and ignoring the check on the drop prefix in the first 
assert if too much time had passed.

> AutoSnapshotTtlTest#testAutoSnapshotTTlOnDropAfterRestart fails sporadically 
> on missing stdout contents
> ---
>
> Key: CASSANDRA-17804
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17804
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Snapshots
>Reporter: Caleb Rackliffe
>Assignee: Paulo Motta
>Priority: Normal
> Fix For: 4.1-beta, 4.x
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> See 
> [https://app.circleci.com/pipelines/github/maedhroz/cassandra/487/workflows/0ad42210-2979-4c5d-a4e8-d8cedf9285a7/jobs/4686/tests#failed-test-0]
>  
> My guess is that in some resource constrained environment, even the first 
> {{nodeool listsnapshots}} invocation doesn’t have “dropped” in the stdout. In 
> other words, we skip to the state of the world the last assertion in the test 
> is checking.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17804) AutoSnapshotTtlTest#testAutoSnapshotTTlOnDropAfterRestart fails sporadically on missing stdout contents

2022-08-14 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17579469#comment-17579469
 ] 

Paulo Motta commented on CASSANDRA-17804:
-

This test makes the following assertions to ensure snapshots of dropped tables 
are properly expired after a node restart when auto_snapshot_ttl=20s:
a) that a snapshot of a dropped table exists after a node restart
b) that the snapshot is gone after expiration (20s)

This test assumes that the restart will take less than auto_snapshot_ttl=20s. 
If a restart takes longer than that then the snapshot is cleared as soon as the 
node starts and the first assertion fails.

The trivial fix is to increase auto_snapshot_ttl=60s, in the hope that a 
restart will never take longer than a minute and the first assertion will never 
fail. Unfortunately this will make the test longer but a bit more deterministic.

* PR: https://github.com/apache/cassandra/pull/1789
* CI: https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/1871/

> AutoSnapshotTtlTest#testAutoSnapshotTTlOnDropAfterRestart fails sporadically 
> on missing stdout contents
> ---
>
> Key: CASSANDRA-17804
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17804
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Snapshots
>Reporter: Caleb Rackliffe
>Assignee: Paulo Motta
>Priority: Normal
> Fix For: 4.1-beta, 4.x
>
>
> See 
> [https://app.circleci.com/pipelines/github/maedhroz/cassandra/487/workflows/0ad42210-2979-4c5d-a4e8-d8cedf9285a7/jobs/4686/tests#failed-test-0]
>  
> My guess is that in some resource constrained environment, even the first 
> {{nodeool listsnapshots}} invocation doesn’t have “dropped” in the stdout. In 
> other words, we skip to the state of the world the last assertion in the test 
> is checking.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17804) AutoSnapshotTtlTest#testAutoSnapshotTTlOnDropAfterRestart fails sporadically on missing stdout contents

2022-08-09 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577541#comment-17577541
 ] 

Paulo Motta commented on CASSANDRA-17804:
-

Thanks for checking [~maedhroz], I'll take a look.

> AutoSnapshotTtlTest#testAutoSnapshotTTlOnDropAfterRestart fails sporadically 
> on missing stdout contents
> ---
>
> Key: CASSANDRA-17804
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17804
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Snapshots
>Reporter: Caleb Rackliffe
>Assignee: Paulo Motta
>Priority: Normal
> Fix For: 4.1-beta, 4.x
>
>
> See 
> [https://app.circleci.com/pipelines/github/maedhroz/cassandra/487/workflows/0ad42210-2979-4c5d-a4e8-d8cedf9285a7/jobs/4686/tests#failed-test-0]
>  
> My guess is that in some resource constrained environment, even the first 
> {{nodeool listsnapshots}} invocation doesn’t have “dropped” in the stdout. In 
> other words, we skip to the state of the world the last assertion in the test 
> is checking.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org