date:20220819

This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


 discard 2f928df9 generate docs for 11349c1a
 new f1618fae generate docs for 11349c1a

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (2f928df9)
            \
             N -- N -- N   refs/heads/asf-staging (f1618fae)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 1 revision listed above as "new" is entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 site-ui/build/ui-bundle.zip | Bin 4740078 -> 4740078 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[cassandra-website] branch asf-staging updated (46ec2cc8 -> 2f928df9)

git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


 discard 46ec2cc8 generate docs for 11349c1a
 new 2f928df9 generate docs for 11349c1a

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (46ec2cc8)
            \
             N -- N -- N   refs/heads/asf-staging (2f928df9)


Summary of changes:
 site-ui/build/ui-bundle.zip | Bin 4740078 -> 4740078 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)



[jira] [Updated] (CASSANDRA-17828) Read/Write/Truncate throw RequestFailure in a race condition with callback timeouts, should return Timeout instead



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-17828:
--
  Fix Version/s: 4.2
 (was: 4.x)
  Since Version: 3.0.0
Source Control Link: 
https://github.com/apache/cassandra/commit/7dfcba2a923dea75574eaeaf4aa37e2fe0abbbdc
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Read/Write/Truncate throw RequestFailure in a race condition with callback 
> timeouts, should return Timeout instead
> --
>
> Key: CASSANDRA-17828
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17828
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.2
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> There is an edge case with write timeout where the condition gets signaled on 
> the timeouts and this happens before await times out, this triggers us to 
> send a Failure rather than a Timeout
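
The race described above can be reproduced in miniature. The following is a minimal sketch under stated assumptions, not the Cassandra implementation: `AwaitRaceSketch`, `Outcome`, `naive` and `raceAware` are hypothetical names, and a plain `java.util.concurrent.CountDownLatch` stands in for the handler's condition. It shows that when the expiry path signals the condition before `await` times out, `await` reports "signaled", so a handler that only looks at the ack count reports a failure even though every replica simply timed out.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class AwaitRaceSketch
{
    enum Outcome { SUCCESS, TIMEOUT, FAILURE }

    // Hypothetical "before" logic: a signal without enough acks is always a failure.
    static Outcome naive(boolean signaled, int acks, int blockFor)
    {
        if (!signaled)
            return Outcome.TIMEOUT; // await itself timed out
        return acks >= blockFor ? Outcome.SUCCESS : Outcome.FAILURE;
    }

    // Hypothetical "after" logic: a signal caused only by expired messages is a timeout.
    static Outcome raceAware(boolean signaled, int acks, int blockFor, boolean allFailuresAreTimeouts)
    {
        if (!signaled)
            return Outcome.TIMEOUT;
        if (acks >= blockFor)
            return Outcome.SUCCESS;
        return allFailuresAreTimeouts ? Outcome.TIMEOUT : Outcome.FAILURE;
    }

    public static void main(String[] args) throws InterruptedException
    {
        CountDownLatch condition = new CountDownLatch(1);

        // The expiry path runs first and signals the condition...
        condition.countDown();

        // ...so await returns true immediately instead of timing out.
        boolean signaled = condition.await(1, TimeUnit.SECONDS);

        System.out.println(naive(signaled, 0, 2));           // FAILURE
        System.out.println(raceAware(signaled, 0, 2, true)); // TIMEOUT
    }
}
```

The only difference between the two methods is the extra check on why the condition was signaled, which is the distinction the ticket asks for.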



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[cassandra] branch trunk updated: Read/Write/Truncate throw RequestFailure in a race condition with callback timeouts, should return Timeout instead

2022-08-19 Thread dcapwell

dcapwell pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/trunk by this push:
 new c4b1c0614e Read/Write/Truncate throw RequestFailure in a race 
condition with callback timeouts, should return Timeout instead
c4b1c0614e is described below

commit c4b1c0614e42b4ea2064822d31c28aa5d4f1450a
Author: David Capwell 
AuthorDate: Fri Aug 19 16:42:56 2022 -0700

Read/Write/Truncate throw RequestFailure in a race condition with callback 
timeouts, should return Timeout instead

patch by David Capwell; reviewed by Caleb Rackliffe for CASSANDRA-17828
---
 CHANGES.txt                                        |   1 +
 .../org/apache/cassandra/net/RequestCallback.java  |  17 ++
 .../service/AbstractWriteResponseHandler.java      |  40 ++--
 .../cassandra/service/TruncateResponseHandler.java |  29 ++-
 .../cassandra/service/reads/ReadCallback.java      |  13 +-
 .../test/metrics/RequestTimeoutTest.java           | 241 +
 .../org/apache/cassandra/utils/AssertionUtils.java | 124 +++
 .../apache/cassandra/utils/AssertionUtilsTest.java |  45
 8 files changed, 483 insertions(+), 27 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index 36beb3c27f..3fd1a8c747 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.2
+ * Read/Write/Truncate throw RequestFailure in a race condition with callback timeouts, should return Timeout instead (CASSANDRA-17828)
  * Add ability to log load profiles at fixed intervals (CASSANDRA-17821)
  * Protect against Gossip backing up due to a quarantined endpoint without version information (CASSANDRA-17830)
  * NPE in org.apache.cassandra.cql3.Attributes.getTimeToLive (CASSANDRA-17822)
diff --git a/src/java/org/apache/cassandra/net/RequestCallback.java b/src/java/org/apache/cassandra/net/RequestCallback.java
index bd14cae1d0..14e0169b85 100644
--- a/src/java/org/apache/cassandra/net/RequestCallback.java
+++ b/src/java/org/apache/cassandra/net/RequestCallback.java
@@ -17,6 +17,8 @@
  */
 package org.apache.cassandra.net;
 
+import java.util.Map;
+
 import org.apache.cassandra.exceptions.RequestFailureReason;
 import org.apache.cassandra.locator.InetAddressAndPort;
 
@@ -63,4 +65,19 @@ public interface RequestCallback
         return false;
     }
 
+    static boolean isTimeout(Map failureReasonByEndpoint)
+    {
+        // The reason that all must be timeout to be called a timeout is as follows
+        // Assume RF=6, QUORUM, and failureReasonByEndpoint.size() == 3
+        // R1 -> TIMEOUT
+        // R2 -> TIMEOUT
+        // R3 -> READ_TOO_MANY_TOMBSTONES
+        // Since we got a reply back, and that was a failure, we should return a failure letting the user know.
+        // When all failures are a timeout, then this is a race condition with
+        // org.apache.cassandra.utils.concurrent.Awaitable.await(long, java.util.concurrent.TimeUnit)
+        // The race is that the message expire path runs and expires all messages, this then casues the condition
+        // to signal telling the caller "got all replies!".
+        return failureReasonByEndpoint.values().stream().allMatch(RequestFailureReason.TIMEOUT::equals);
+    }
+
 }
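
For readers who want to run the all-must-be-timeouts rule from the hunk above, here is a self-contained sketch. The enum `FailureReason` and the `String` endpoint keys are stand-ins introduced for illustration; the patch itself uses Cassandra's `RequestFailureReason`, and the map's type parameters are not visible in the diff as archived.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class IsTimeoutSketch
{
    // Stand-in for RequestFailureReason; only the two values named in the comment are modelled.
    enum FailureReason { TIMEOUT, READ_TOO_MANY_TOMBSTONES }

    // True only when every recorded failure is a TIMEOUT.
    static boolean isTimeout(Map<String, FailureReason> failureReasonByEndpoint)
    {
        return failureReasonByEndpoint.values().stream().allMatch(FailureReason.TIMEOUT::equals);
    }

    public static void main(String[] args)
    {
        Map<String, FailureReason> mixed = new LinkedHashMap<>();
        mixed.put("R1", FailureReason.TIMEOUT);
        mixed.put("R2", FailureReason.TIMEOUT);
        mixed.put("R3", FailureReason.READ_TOO_MANY_TOMBSTONES);

        Map<String, FailureReason> onlyTimeouts = new LinkedHashMap<>();
        onlyTimeouts.put("R1", FailureReason.TIMEOUT);
        onlyTimeouts.put("R2", FailureReason.TIMEOUT);

        System.out.println(isTimeout(mixed));        // false: one replica replied with a real failure
        System.out.println(isTimeout(onlyTimeouts)); // true: every entry came from the expiry path
    }
}
```

Note that `Stream.allMatch` returns `true` for an empty stream, so an empty map is also classified as a timeout by this expression.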
diff --git a/src/java/org/apache/cassandra/service/AbstractWriteResponseHandler.java b/src/java/org/apache/cassandra/service/AbstractWriteResponseHandler.java
index 4d75f19bca..76ad4c2ff8 100644
--- a/src/java/org/apache/cassandra/service/AbstractWriteResponseHandler.java
+++ b/src/java/org/apache/cassandra/service/AbstractWriteResponseHandler.java
@@ -17,12 +17,16 @@
  */
 package org.apache.cassandra.service;
 
+import java.util.HashMap;
+import java.util.HashSet;
 import java.util.List;
 import java.util.Map;
 import java.util.concurrent.ConcurrentHashMap;
 import java.util.concurrent.atomic.AtomicInteger;
 import java.util.concurrent.atomic.AtomicIntegerFieldUpdater;
+import java.util.function.Function;
 import java.util.function.Supplier;
+import java.util.stream.Collectors;
 
 import javax.annotation.Nullable;
 
@@ -113,34 +117,42 @@ public abstract class AbstractWriteResponseHandler implements RequestCallback
     {
         long timeoutNanos = currentTimeoutNanos();
 
-        boolean success;
+        boolean signaled;
         try
         {
-            success = condition.await(timeoutNanos, NANOSECONDS);
+            signaled = condition.await(timeoutNanos, NANOSECONDS);
         }
         catch (InterruptedException e)
         {
             throw new UncheckedInterruptedException(e);
         }
 
-        if (!success)
-        {
-            int blockedFor = blockFor();
-            int acks = ackCount();
-            // It's pretty unlikely, but we can race between exiting await above and here, so
-            // that we could now have enough acks. In that case, we "lie" on the acks count to
-            // avoid

[jira] [Commented] (CASSANDRA-17457) User password strength

2022-08-19 Thread Jackson Fleming (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-17457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17582095#comment-17582095
 ] 

Jackson Fleming commented on CASSANDRA-17457:
-

Hi, I'm new to this community, but I'd be keen to pick this one up. Seems like 
a great first improvement to contribute to the project.

 

Is there someone in the community that would be good to talk to about defining 
minimum password complexity? Would this be a discussion for the dev mailing 
list? (Sorry for the newbie questions!).

> User password strength
> --
>
> Key: CASSANDRA-17457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17457
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Authorization
>Reporter: Berenguer Blasi
>Priority: Normal
>  Labels: low-hanging-fruit
>
> Currently we can create a user with a very insecure password such as 'A'.
> _CREATE ROLE coach WITH PASSWORD = 'A' AND LOGIN = true;_
>  
> As we can see there are no restrictions on length, characters, etc. We should 
> discuss and adopt some best practices in this area. A warning would be the 
> preference instead of erroring out. Historically this has been left to be 
> dealt with by LDAP or other auth systems so we can't error out.
> Newcomers:
> - We should add warnings when a weak password is provided on DCL CQL. The 
> {{validate}} method looks like a good place at face value. Feel free to 
> analyze and suggest otherwise. See {{ClientWarn}} usages for examples.
> - We should add junit methods for the newly created warnings
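
The warn-instead-of-reject approach in the description can be sketched as below. Everything here is an assumption for illustration: `PasswordWarningSketch`, `weakPasswordWarnings`, the `MIN_LENGTH` of 8 and the specific rules are hypothetical, no complexity policy has been agreed on the ticket, and the sketch does not touch Cassandra's `validate` method or `ClientWarn`. It only shows the shape of a check that collects warnings and never throws.

```java
import java.util.ArrayList;
import java.util.List;

public class PasswordWarningSketch
{
    // Hypothetical threshold; the ticket leaves the actual policy open.
    static final int MIN_LENGTH = 8;

    // Returns human-readable warnings; an empty list means no rule was tripped.
    static List<String> weakPasswordWarnings(String password)
    {
        List<String> warnings = new ArrayList<>();
        if (password.length() < MIN_LENGTH)
            warnings.add("password is shorter than " + MIN_LENGTH + " characters");
        if (password.chars().noneMatch(Character::isDigit))
            warnings.add("password contains no digit");
        if (password.chars().noneMatch(Character::isLowerCase)
            || password.chars().noneMatch(Character::isUpperCase))
            warnings.add("password does not mix upper and lower case");
        return warnings;
    }

    public static void main(String[] args)
    {
        // The example from the ticket: CREATE ROLE coach WITH PASSWORD = 'A'
        for (String warning : weakPasswordWarnings("A"))
            System.out.println("WARN: " + warning);
    }
}
```

Returning a list instead of throwing keeps role creation working for deployments that delegate password policy to LDAP or another auth system, which is the constraint the description calls out.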




[cassandra-website] branch asf-staging updated (3c92867b -> 46ec2cc8)

git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


 discard 3c92867b generate docs for 11349c1a
 new 46ec2cc8 generate docs for 11349c1a

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (3c92867b)
            \
             N -- N -- N   refs/heads/asf-staging (46ec2cc8)


Summary of changes:
 content/search-index.js |   2 +-
 site-ui/build/ui-bundle.zip | Bin 4740078 -> 4740078 bytes
 2 files changed, 1 insertion(+), 1 deletion(-)



[cassandra-website] branch asf-staging updated (cadc0f07 -> 3c92867b)

git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


 discard cadc0f07 generate docs for 11349c1a
 new 3c92867b generate docs for 11349c1a

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (cadc0f07)
            \
             N -- N -- N   refs/heads/asf-staging (3c92867b)


Summary of changes:
 .../doc/4.1/cassandra/operating/bulk_loading.html  |  18 ---
 .../4.1/cassandra/tools/sstable/sstableloader.html |  25 +++--
 .../doc/4.2/cassandra/operating/bulk_loading.html  |  18 ---
 .../4.2/cassandra/tools/nodetool/profileload.html  |  18 +--
 .../cassandra/tools/nodetool/toppartitions.html    |  18 +--
 .../4.2/cassandra/tools/sstable/sstableloader.html |  25 +++--
 .../latest/cassandra/operating/bulk_loading.html   |  18 ---
 .../cassandra/tools/sstable/sstableloader.html     |  25 +++--
 .../trunk/cassandra/operating/bulk_loading.html    |  18 ---
 .../cassandra/tools/nodetool/profileload.html      |  18 +--
 .../cassandra/tools/nodetool/toppartitions.html    |  18 +--
 .../cassandra/tools/sstable/sstableloader.html     |  25 +++--
 content/search-index.js                            |   2 +-
 site-ui/build/ui-bundle.zip                        | Bin 4740078 -> 4740078 bytes
 14 files changed, 217 insertions(+), 29 deletions(-)



[jira] [Commented] (CASSANDRA-17750) Remove dependency on Maven Ant Tasks



[ 
https://issues.apache.org/jira/browse/CASSANDRA-17750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17582085#comment-17582085
 ] 

David Capwell commented on CASSANDRA-17750:
---

+1

> Remove dependency on Maven Ant Tasks
> 
>
> Key: CASSANDRA-17750
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17750
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Build, Dependencies, Packaging
>Reporter: Abe Ratnofsky
>Assignee: Abe Ratnofsky
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 10h
>  Remaining Estimate: 0h
>
> Apache Cassandra depends on Maven Ant Tasks (MAT) during build, for declaring 
> dependencies and generating POM files from within build.xml. MAT has long 
> been retired (no commits since maintenance in 2015), has registered CVEs in 
> dependencies (CVE-2017-1000487), and encourages migration to its successor, 
> Maven Artifact Resolver Ant Tasks (MARAT).
> As part of CASSANDRA-16391, mck migrated 
> dependency resolution to MARAT, but MAT is still included in our build for 
> generating POMs since MARAT does not have an alternative to the writepom task 
> provided by MAT. I have a patch ready that removes MAT completely, with a 
> workaround for POM generation.
> I am not advocating for any kind of migration away from Ant to an alternative 
> like Gradle or Maven, just to be extra clear.




[jira] [Commented] (CASSANDRA-17828) Read/Write/Truncate throw RequestFailure in a race condition with callback timeouts, should return Timeout instead



[ 
https://issues.apache.org/jira/browse/CASSANDRA-17828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17582082#comment-17582082
 ] 

David Capwell commented on CASSANDRA-17828:
---

Starting commit

CI Results (pending):
||Branch||Source||Circle CI||Jenkins||
|trunk|[branch|https://github.com/dcapwell/cassandra/tree/commit_remote_branch/CASSANDRA-17828-trunk-BDF33D62-DC26-4EA5-AF26-4280D949FF0E]|[build|https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-17828-trunk-BDF33D62-DC26-4EA5-AF26-4280D949FF0E]|[build|unknown]|


> Read/Write/Truncate throw RequestFailure in a race condition with callback 
> timeouts, should return Timeout instead
> --
>
> Key: CASSANDRA-17828
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17828
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> There is an edge case with write timeout where the condition gets signaled on 
> the timeouts and this happens before await times out, this triggers us to 
> send a Failure rather than a Timeout




[jira] [Commented] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to



[ 
https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17582059#comment-17582059
 ] 

Brandon Williams commented on CASSANDRA-13010:
--

+1

> nodetool compactionstats should say which disk a compaction is writing to
> -
>
> Key: CASSANDRA-13010
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13010
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Local/Compaction, Tool/nodetool
>Reporter: Jon Haddad
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Labels: 4.0-feature-freeze-review-requested
> Fix For: 4.x
>
> Attachments: cleanup.png, multiple operations.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>





[jira] [Updated] (CASSANDRA-17828) Read/Write/Truncate throw RequestFailure in a race condition with callback timeouts, should return Timeout instead



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-17828:
--
Status: Ready to Commit  (was: Review In Progress)





[jira] [Updated] (CASSANDRA-17843) Fix test/distributed/org/apache/cassandra/distributed/test/IncRepairCoordinatorErrorTest



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-17843:

Description: 
test/distributed/org/apache/cassandra/distributed/test/IncRepairCoordinatorErrorTest

needs to be fixed on 4.1 and trunk

[https://app.circleci.com/pipelines/github/yifan-c/cassandra/394/workflows/0079ce49-a851-4bf9-b6cd-3f9b76e9667c/jobs/3786/tests#failed-test-0]
{code:java}
java.lang.ClassCastException: class org.apache.cassandra.utils.TimeUUID cannot be cast to class java.util.UUID (org.apache.cassandra.utils.TimeUUID is in unnamed module of loader 'app'; java.util.UUID is in module java.base of loader 'bootstrap')
 at org.apache.cassandra.distributed.test.IncRepairCoordinatorErrorTest.errorTest(IncRepairCoordinatorErrorTest.java:50)
 at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
{code}

  was:
test/distributed/org/apache/cassandra/distributed/test/IncRepairCoordinatorErrorTest

needs to be fixed on 4.1 and trunk

https://app.circleci.com/pipelines/github/yifan-c/cassandra/394/workflows/0079ce49-a851-4bf9-b6cd-3f9b76e9667c/jobs/3786/tests#failed-test-0


> Fix 
> test/distributed/org/apache/cassandra/distributed/test/IncRepairCoordinatorErrorTest
> 
>
> Key: CASSANDRA-17843
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17843
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI
>Reporter: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.1-beta, 4.1.x, 4.x
>
>
> test/distributed/org/apache/cassandra/distributed/test/IncRepairCoordinatorErrorTest
> needs to be fixed on 4.1 and trunk
> [https://app.circleci.com/pipelines/github/yifan-c/cassandra/394/workflows/0079ce49-a851-4bf9-b6cd-3f9b76e9667c/jobs/3786/tests#failed-test-0]
> {code:java}
> java.lang.ClassCastException: class org.apache.cassandra.utils.TimeUUID cannot be cast to class java.util.UUID (org.apache.cassandra.utils.TimeUUID is in unnamed module of loader 'app'; java.util.UUID is in module java.base of loader 'bootstrap')
>  at org.apache.cassandra.distributed.test.IncRepairCoordinatorErrorTest.errorTest(IncRepairCoordinatorErrorTest.java:50)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {code}
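
The exception above is the generic failure mode of casting an object to a type it neither extends nor implements. The sketch below reproduces that mode with a hypothetical wrapper class; `CastSketch`, `TimeId`, `unsafe` and `safe` are illustrative names, and the code at line 50 of the failing test is not shown in this thread, so this is not a proposed patch.

```java
import java.util.UUID;

public class CastSketch
{
    // Hypothetical id type that wraps a java.util.UUID (UUID is final, so it cannot be extended).
    static final class TimeId
    {
        private final UUID value;

        TimeId(UUID value) { this.value = value; }

        UUID asUUID() { return value; }
    }

    // Compiles because the static type is Object, but throws ClassCastException for a TimeId.
    static UUID unsafe(Object id)
    {
        return (UUID) id;
    }

    // Converts explicitly instead of casting across unrelated types.
    static UUID safe(Object id)
    {
        if (id instanceof UUID)
            return (UUID) id;
        if (id instanceof TimeId)
            return ((TimeId) id).asUUID();
        throw new IllegalArgumentException("unsupported id type: " + id.getClass().getName());
    }

    public static void main(String[] args)
    {
        Object id = new TimeId(UUID.randomUUID());
        System.out.println(safe(id));
        try
        {
            unsafe(id);
        }
        catch (ClassCastException e)
        {
            System.out.println("cast failed: " + e.getMessage());
        }
    }
}
```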




[jira] [Commented] (CASSANDRA-17677) Fix BulkLoader to load entireSSTableThrottle and entireSSTableInterDcThrottle



[ 
https://issues.apache.org/jira/browse/CASSANDRA-17677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17582052#comment-17582052
 ] 

Yifan Cai commented on CASSANDRA-17677:
---

Thank you for filing the ticket!  

> Fix BulkLoader to load  entireSSTableThrottle and entireSSTableInterDcThrottle
> --
>
> Key: CASSANDRA-17677
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17677
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/bulk load
>Reporter: Ekaterina Dimitrova
>Assignee: Francisco Guerrero
>Priority: Normal
> Fix For: 4.1-alpha, 4.2
>
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> {{entire_sstable_stream_throughput_outbound}} and 
> {{entire_sstable_inter_dc_stream_throughput_outbound}} were introduced in 
> CASSANDRA-17065. They were added to the LoaderOptions class but they are not 
> loaded in BulkLoader as {{throttle}} and {{interDcThrottle}} are. As part 
> of this ticket we need to fix the BulkLoader; also, those properties should be 
> advertised as MiB/s, not megabits/s. This was not changed in CASSANDRA-15234 
> for the bulk loader because those are not loaded and those variables in 
> LoaderOptions are disconnected from the Cassandra config parameters and 
> unused at the moment.
> It would also be good to update the doc here - 
> https://cassandra.apache.org/doc/latest/cassandra/operating/bulk_loading.html 
> and add a test that those are loaded properly when used with the 
> BulkLoader.
> CC [~frankgh]
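
The MiB/s versus megabits/s distinction matters because the two units differ by more than a factor of eight. A small sketch of the arithmetic, assuming the usual definitions (1 megabit = 1,000,000 bits, 1 MiB = 1,048,576 bytes); `ThroughputUnitsSketch` and its methods are hypothetical helpers, not part of BulkLoader or LoaderOptions:

```java
public class ThroughputUnitsSketch
{
    static final double BITS_PER_MEGABIT = 1_000_000.0;
    static final double BITS_PER_MEBIBYTE = 8.0 * 1024 * 1024;

    // Converts a rate in megabits per second to mebibytes per second.
    static double megabitsToMebibytesPerSecond(double megabitsPerSecond)
    {
        return megabitsPerSecond * BITS_PER_MEGABIT / BITS_PER_MEBIBYTE;
    }

    // Converts a rate in mebibytes per second to megabits per second.
    static double mebibytesToMegabitsPerSecond(double mebibytesPerSecond)
    {
        return mebibytesPerSecond * BITS_PER_MEBIBYTE / BITS_PER_MEGABIT;
    }

    public static void main(String[] args)
    {
        // A throttle of 200 read as megabits/s is only about 23.8 MiB/s.
        System.out.printf("%.2f MiB/s%n", megabitsToMebibytesPerSecond(200));
        // The same number read as MiB/s corresponds to about 1677.7 megabits/s.
        System.out.printf("%.2f megabits/s%n", mebibytesToMegabitsPerSecond(200));
    }
}
```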




[jira] [Commented] (CASSANDRA-17677) Fix BulkLoader to load entireSSTableThrottle and entireSSTableInterDcThrottle



[ 
https://issues.apache.org/jira/browse/CASSANDRA-17677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17582051#comment-17582051
 ] 

Ekaterina Dimitrova commented on CASSANDRA-17677:
-

Oh there is also testConnectionsAreRejectedWithInvalidConfig failing on trunk 
but there is already a ticket for that one - CASSANDRA-17618





[jira] [Commented] (CASSANDRA-17677) Fix BulkLoader to load entireSSTableThrottle and entireSSTableInterDcThrottle



[ 
https://issues.apache.org/jira/browse/CASSANDRA-17677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17582050#comment-17582050
 ] 

Ekaterina Dimitrova commented on CASSANDRA-17677:
-

I just opened a ticket  CASSANDRA-17843 for the new test failure





[jira] [Updated] (CASSANDRA-17843) Fix test/distributed/org/apache/cassandra/distributed/test/IncRepairCoordinatorErrorTest



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-17843:

 Bug Category: Parent values: Correctness(12982)
   Complexity: Normal
  Component/s: CI
Discovered By: User Report
Fix Version/s: 4.1-beta
   4.1.x
   4.x
 Severity: Normal
   Status: Open  (was: Triage Needed)

> Fix 
> test/distributed/org/apache/cassandra/distributed/test/IncRepairCoordinatorErrorTest
> 
>
> Key: CASSANDRA-17843
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17843
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI
>Reporter: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.1-beta, 4.1.x, 4.x
>
>
> test/distributed/org/apache/cassandra/distributed/test/IncRepairCoordinatorErrorTest
> needs to be fixed on 4.1 and trunk
> https://app.circleci.com/pipelines/github/yifan-c/cassandra/394/workflows/0079ce49-a851-4bf9-b6cd-3f9b76e9667c/jobs/3786/tests#failed-test-0




[jira] [Commented] (CASSANDRA-17834) Repair coordinator can get stuck in an infinite loop if it gets a FailSession after it's marked a session FINALIZED



[ 
https://issues.apache.org/jira/browse/CASSANDRA-17834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17582048#comment-17582048
 ] 

Ekaterina Dimitrova commented on CASSANDRA-17834:
-

Ticket opened - CASSANDRA-17843

> Repair coordinator can get stuck in an infinite loop if it gets a FailSession 
> after it's marked a session FINALIZED
> ---
>
> Key: CASSANDRA-17834
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17834
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Josh McKenzie
>Assignee: Josh McKenzie
>Priority: Normal
> Fix For: 4.0.6, 4.1-beta, 4.2
>
>
> If the repair coordinator gets a FailSession message after it has marked the 
> session as FINALIZED it will start an infinite loop of FailSession messages. 
> This is unique to 4.0+ in {{CoordinatorSession.java}}
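
One common way to avoid this kind of loop is to treat terminal states as terminal, so that a late failure message is ignored and nothing is re-broadcast. The sketch below illustrates that idea only; `SessionStateSketch`, `Session`, `markFinalized` and `onFailSession` are hypothetical names, and nothing in this thread shows how `CoordinatorSession.java` was actually changed.

```java
public class SessionStateSketch
{
    enum State { RUNNING, FINALIZED, FAILED }

    static final class Session
    {
        private State state = State.RUNNING;
        private int failMessagesSent = 0;

        synchronized void markFinalized()
        {
            if (state == State.RUNNING)
                state = State.FINALIZED;
        }

        // Returns true only when this call moved the session to FAILED.
        synchronized boolean onFailSession()
        {
            if (state != State.RUNNING)
                return false; // already terminal: do not fail again or notify anyone
            state = State.FAILED;
            failMessagesSent++; // stand-in for notifying the participants
            return true;
        }

        synchronized State state() { return state; }

        synchronized int failMessagesSent() { return failMessagesSent; }
    }

    public static void main(String[] args)
    {
        Session session = new Session();
        session.markFinalized();
        System.out.println(session.onFailSession());    // false
        System.out.println(session.state());            // FINALIZED
        System.out.println(session.failMessagesSent()); // 0
    }
}
```

Because a FailSession that arrives after FINALIZED changes nothing and sends nothing, there is no message for participants to echo back, which is what would otherwise keep the loop alive.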




[jira] [Created] (CASSANDRA-17843) Fix test/distributed/org/apache/cassandra/distributed/test/IncRepairCoordinatorErrorTest

Ekaterina Dimitrova created CASSANDRA-17843:
---

 Summary: Fix 
test/distributed/org/apache/cassandra/distributed/test/IncRepairCoordinatorErrorTest
 Key: CASSANDRA-17843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-17843
 Project: Cassandra
  Issue Type: Bug
Reporter: Ekaterina Dimitrova


test/distributed/org/apache/cassandra/distributed/test/IncRepairCoordinatorErrorTest

needs to be fixed on 4.1 and trunk

https://app.circleci.com/pipelines/github/yifan-c/cassandra/394/workflows/0079ce49-a851-4bf9-b6cd-3f9b76e9667c/jobs/3786/tests#failed-test-0




[jira] [Commented] (CASSANDRA-17834) Repair coordinator can get stuck in an infinite loop if it gets a FailSession after it's marked a session FINALIZED



[ 
https://issues.apache.org/jira/browse/CASSANDRA-17834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17582047#comment-17582047
 ] 

Ekaterina Dimitrova commented on CASSANDRA-17834:
-

Unfortunately, 4.1 and trunk have diverged considerably from 4.0 over the past 
year, and the new test is failing on both branches. 

[~jmckenzie], [~dcapwell], could you please take a look?


[jira] [Commented] (CASSANDRA-17677) Fix BulkLoader to load entireSSTableThrottle and entireSSTableInterDcThrottle



[ 
https://issues.apache.org/jira/browse/CASSANDRA-17677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17582046#comment-17582046
 ] 

Ekaterina Dimitrova commented on CASSANDRA-17677:
-

Thanks [~yifanc]. This looks like a new test that was just added in 
CASSANDRA-17834. I will ping [~jmckenzie].

 

> Fix BulkLoader to load  entireSSTableThrottle and entireSSTableInterDcThrottle
> --
>
> Key: CASSANDRA-17677
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17677
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/bulk load
>Reporter: Ekaterina Dimitrova
>Assignee: Francisco Guerrero
>Priority: Normal
> Fix For: 4.1-alpha, 4.2
>
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> {{entire_sstable_stream_throughput_outbound}} and 
> {{entire_sstable_inter_dc_stream_throughput_outbound}} were introduced in 
> CASSANDRA-17065. They were added to the LoaderOptions class, but they are not 
> loaded in BulkLoader as {{throttle}} and {{interDcThrottle}} are. As part 
> of this ticket we need to fix the BulkLoader. Those properties should also be 
> advertised as MiB/s, not megabits/s. This was not changed in CASSANDRA-15234 
> for the bulk loader because those properties are not loaded, and the 
> corresponding variables in LoaderOptions are disconnected from the Cassandra 
> config parameters and unused at the moment.
> It would also be good to update the doc here: 
> https://cassandra.apache.org/doc/latest/cassandra/operating/bulk_loading.html
> and to add a test that those are loaded properly when used with the 
> BulkLoader.
> CC [~frankgh]
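
The unit distinction in the description (MiB/s versus megabits/s) is easy to get wrong, so here is a small sketch of the arithmetic. The class ThrottleUnits and its methods are hypothetical helpers, not part of BulkLoader or LoaderOptions. It assumes the usual definitions: 1 MiB = 2^20 bytes and 1 megabit = 10^6 bits.

```java
// Hypothetical helper, not part of Cassandra: it only illustrates the unit difference.
class ThrottleUnits {
    static final long BYTES_PER_MIB = 1024L * 1024L;   // 2^20 bytes
    static final long BITS_PER_MEGABIT = 1_000_000L;   // 10^6 bits

    // Converts a rate expressed in MiB/s to megabits/s.
    static double mibPerSecToMegabitsPerSec(double mibPerSec) {
        return mibPerSec * BYTES_PER_MIB * 8 / BITS_PER_MEGABIT;
    }

    // Converts a rate expressed in megabits/s to MiB/s.
    static double megabitsPerSecToMibPerSec(double megabitsPerSec) {
        return megabitsPerSec * BITS_PER_MEGABIT / 8 / BYTES_PER_MIB;
    }

    public static void main(String[] args) {
        System.out.println("24 MiB/s in megabits/s: " + mibPerSecToMegabitsPerSec(24));
        System.out.println("200 megabits/s in MiB/s: " + megabitsPerSecToMibPerSec(200));
    }
}
```

Under these definitions one MiB/s is about 8.39 megabits/s, so a value advertised in the wrong unit is off by roughly that factor.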




[jira] [Updated] (CASSANDRA-17677) Fix BulkLoader to load entireSSTableThrottle and entireSSTableInterDcThrottle



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANDRA-17677:
--
      Fix Version/s: 4.1-alpha
                     4.2
                     (was: 4.x)
                     (was: 4.1.x)
                     (was: 4.1-beta)
      Since Version: 4.1-alpha1
Source Control Link: https://github.com/apache/cassandra/commit/83c169ec9e36324f27bf562951362f4a03c3c688
         Resolution: Fixed
             Status: Resolved  (was: Ready to Commit)

Committed into cassandra 4.1 as 
[83c169ec9|https://github.com/apache/cassandra/commit/83c169ec9e36324f27bf562951362f4a03c3c688]
 and merged up to trunk.


[cassandra] branch trunk updated (4526b3fcbd -> 4aa3bbda79)

2022-08-19 Thread ycai

This is an automated email from the ASF dual-hosted git repository.

ycai pushed a change to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


from 4526b3fcbd Add ability to log load profiles at fixed intervals
 add 83c169ec9e Fix BulkLoader to load entireSSTableThrottle and 
entireSSTableInterDcThrottle
 add 4aa3bbda79 Merge branch 'cassandra-4.1' into trunk

No new revisions were added by this update.

Summary of changes:
 CHANGES.txt                                        |   1 +
 .../cassandra/pages/operating/bulk_loading.adoc    |  18 ++-
 .../pages/tools/sstable/sstableloader.adoc         |  19 ++-
 .../cassandra/config/DatabaseDescriptor.java       |  10 ++
 .../org/apache/cassandra/tools/BulkLoader.java     |  30 ++--
 .../org/apache/cassandra/tools/LoaderOptions.java  | 161 -
 .../org/apache/cassandra/tools/BulkLoaderTest.java |  47 ++
 .../apache/cassandra/tools/LoaderOptionsTest.java  | 142 +++---
 8 files changed, 350 insertions(+), 78 deletions(-)



[cassandra] branch cassandra-4.1 updated (b95a6931f5 -> 83c169ec9e)

2022-08-19 Thread ycai

This is an automated email from the ASF dual-hosted git repository.

ycai pushed a change to branch cassandra-4.1
in repository https://gitbox.apache.org/repos/asf/cassandra.git


from b95a6931f5 Merge branch 'cassandra-4.0' into cassandra-4.1
 add 83c169ec9e Fix BulkLoader to load entireSSTableThrottle and 
entireSSTableInterDcThrottle

No new revisions were added by this update.

Summary of changes:
 CHANGES.txt                                        |   1 +
 .../cassandra/pages/operating/bulk_loading.adoc    |  18 ++-
 .../pages/tools/sstable/sstableloader.adoc         |  19 ++-
 .../cassandra/config/DatabaseDescriptor.java       |  10 ++
 .../org/apache/cassandra/tools/BulkLoader.java     |  30 ++--
 .../org/apache/cassandra/tools/LoaderOptions.java  | 161 -
 .../org/apache/cassandra/tools/BulkLoaderTest.java |  47 ++
 .../apache/cassandra/tools/LoaderOptionsTest.java  | 142 +++---
 8 files changed, 350 insertions(+), 78 deletions(-)



[jira] [Comment Edited] (CASSANDRA-17677) Fix BulkLoader to load entireSSTableThrottle and entireSSTableInterDcThrottle



[ 
https://issues.apache.org/jira/browse/CASSANDRA-17677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581971#comment-17581971
 ] 

Yifan Cai edited comment on CASSANDRA-17677 at 8/19/22 9:19 PM:


Starting commit

CI Results:
||Branch||Source||Circle CI||
|cassandra-4.1|[branch|https://github.com/yifan-c/cassandra/tree/commit_remote_branch/CASSANDRA-17677-cassandra-4.1-49D558C4-750C-4AE9-81D0-C89424123139]|[build|https://app.circleci.com/pipelines/github/yifan-c/cassandra?branch=commit_remote_branch%2FCASSANDRA-17677-cassandra-4.1-49D558C4-750C-4AE9-81D0-C89424123139]|
|trunk|[branch|https://github.com/yifan-c/cassandra/tree/commit_remote_branch/CASSANDRA-17677-trunk-49D558C4-750C-4AE9-81D0-C89424123139]|[build|https://app.circleci.com/pipelines/github/yifan-c/cassandra?branch=commit_remote_branch%2FCASSANDRA-17677-trunk-49D558C4-750C-4AE9-81D0-C89424123139]|

The test results look green apart from one failure, 
IncRepairCoordinatorErrorTest#errorTest. It also fails for me locally when 
switching to trunk/HEAD, so I believe it is not related to the patch. 


was (Author: yifanc):
Starting commit

CI Results (pending):
||Branch||Source||Circle CI||
|cassandra-4.1|[branch|https://github.com/yifan-c/cassandra/tree/commit_remote_branch/CASSANDRA-17677-cassandra-4.1-49D558C4-750C-4AE9-81D0-C89424123139]|[build|https://app.circleci.com/pipelines/github/yifan-c/cassandra?branch=commit_remote_branch%2FCASSANDRA-17677-cassandra-4.1-49D558C4-750C-4AE9-81D0-C89424123139]|
|trunk|[branch|https://github.com/yifan-c/cassandra/tree/commit_remote_branch/CASSANDRA-17677-trunk-49D558C4-750C-4AE9-81D0-C89424123139]|[build|https://app.circleci.com/pipelines/github/yifan-c/cassandra?branch=commit_remote_branch%2FCASSANDRA-17677-trunk-49D558C4-750C-4AE9-81D0-C89424123139]|


> Fix BulkLoader to load  entireSSTableThrottle and entireSSTableInterDcThrottle
> --
>
> Key: CASSANDRA-17677
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17677
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/bulk load
>Reporter: Ekaterina Dimitrova
>Assignee: Francisco Guerrero
>Priority: Normal
> Fix For: 4.1-beta, 4.1.x, 4.x
>
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> {{entire_sstable_stream_throughput_outbound}} and 
> {{entire_sstable_inter_dc_stream_throughput_outbound}} were introduced in 
> CASSANDRA-17065. They were added to the LoaderOptions class, but they are not 
> loaded in BulkLoader as {{throttle}} and {{interDcThrottle}} are. As part 
> of this ticket we need to fix the BulkLoader. Those properties should also be 
> advertised as MiB/s, not megabits/s. This was not changed in CASSANDRA-15234 
> for the bulk loader because those properties are not loaded, and the 
> corresponding variables in LoaderOptions are disconnected from the Cassandra 
> config parameters and unused at the moment.
> It would also be good to update the doc here: 
> https://cassandra.apache.org/doc/latest/cassandra/operating/bulk_loading.html
> and to add a test that those are loaded properly when used with the 
> BulkLoader.
> CC [~frankgh]




[jira] [Comment Edited] (CASSANDRA-17819) Test failure: org.apache.cassandra.distributed.test.SchemaTest.schemaReset



[ 
https://issues.apache.org/jira/browse/CASSANDRA-17819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17582043#comment-17582043
 ] 

Ekaterina Dimitrova edited comment on CASSANDRA-17819 at 8/19/22 9:16 PM:
--

CI-wise the test seems fine but you broke 
org.apache.cassandra.distributed.test.jmx.JMXGetterCheckTest

I skimmed it quickly, but it will take me some time to dig into the details. I 
left a few immediate small comments on the commit; more on Monday.

We also need a 4.0 patch, right? 

Maybe [~adelapena] will also want to take a look at the changes? 


was (Author: e.dimitrova):
The test seems fine but you broke 
org.apache.cassandra.distributed.test.jmx.JMXGetterCheckTest

I made a quick skim but it will take me some time to dig into the details. I 
left a few immediate small comments on the commit, more on Monday.

We need also 4.0 patch right? 

Maybe [~adelapena] will also want to take a look at the changes? 

> Test failure: org.apache.cassandra.distributed.test.SchemaTest.schemaReset
> --
>
> Key: CASSANDRA-17819
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17819
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/java
>Reporter: Andres de la Peña
>Assignee: Jacek Lewandowski
>Priority: Normal
> Fix For: 4.1-beta, 4.x
>
>
> The test {{org.apache.cassandra.distributed.test.SchemaTest.schemaReset}}, 
> recently introduced by CASSANDRA-17658, is flaky on 4.1 and trunk:
>  * 4.1: https://ci-cassandra.apache.org/job/Cassandra-4.1/134/testReport/org.apache.cassandra.distributed.test/SchemaTest/schemaReset_2/
>  * trunk: https://ci-cassandra.apache.org/job/Cassandra-trunk/1265/testReport/org.apache.cassandra.distributed.test/SchemaTest/schemaReset_2/
> {code:java}
> Error Message
> Condition with lambda expression in org.apache.cassandra.distributed.test.SchemaTest that uses org.apache.cassandra.distributed.Cluster was not fulfilled within 1 minutes.
> Stacktrace
> org.awaitility.core.ConditionTimeoutException: Condition with lambda expression in org.apache.cassandra.distributed.test.SchemaTest that uses org.apache.cassandra.distributed.Cluster was not fulfilled within 1 minutes.
>   at org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:165)
>   at org.awaitility.core.CallableCondition.await(CallableCondition.java:78)
>   at org.awaitility.core.CallableCondition.await(CallableCondition.java:26)
>   at org.awaitility.core.ConditionFactory.until(ConditionFactory.java:895)
>   at org.awaitility.core.ConditionFactory.until(ConditionFactory.java:864)
>   at org.apache.cassandra.distributed.test.SchemaTest.schemaReset(SchemaTest.java:115)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> Standard Output
> INFO  [main]  2022-08-15 15:02:14,783 Reflections.java:219 - Reflections took 1873 ms to scan 8 urls, producing 1754 keys and 6912 values
> INFO  [main]  2022-08-15 15:02:16,407 Reflections.java:219 - Reflections took 1561 ms to scan 8 urls, producing 1754 keys and 6912 values
> Node id topology:
> node 1: dc = datacenter0, rack = rack0
> node 2: dc = datacenter0, rack = rack0
> Configured node count: 2, nodeIdTopology size: 2
> DEBUG [main] node1 2022-08-15 15:02:17,554 InternalLoggerFactory.ja
> ...[truncated 1761288 chars]...
> cutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.lang.Thread.run(Thread.java:748)
> INFO  [node2_isolatedExecutor:3] node2 2022-08-15 15:03:52,096 MessagingService.java:519 - Waiting for messaging service to quiesce
> {code}




[jira] [Comment Edited] (CASSANDRA-17819) Test failure: org.apache.cassandra.distributed.test.SchemaTest.schemaReset



[ 
https://issues.apache.org/jira/browse/CASSANDRA-17819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17582043#comment-17582043
 ] 

Ekaterina Dimitrova edited comment on CASSANDRA-17819 at 8/19/22 9:16 PM:
--

CI-wise the test seems fine but the patch broke 
org.apache.cassandra.distributed.test.jmx.JMXGetterCheckTest

I skimmed it quickly, but it will take me some time to dig into the details. I 
left a few immediate small comments on the commit; more on Monday.

We also need a 4.0 patch, right? 

Maybe [~adelapena] will also want to take a look at the changes? 


was (Author: e.dimitrova):
CI-wise the test seems fine but you broke 
org.apache.cassandra.distributed.test.jmx.JMXGetterCheckTest

I made a quick skim but it will take me some time to dig into the details. I 
left a few immediate small comments on the commit, more on Monday.

We need also 4.0 patch right? 

Maybe [~adelapena] will also want to take a look at the changes? 


[jira] [Commented] (CASSANDRA-17819) Test failure: org.apache.cassandra.distributed.test.SchemaTest.schemaReset



[ 
https://issues.apache.org/jira/browse/CASSANDRA-17819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17582043#comment-17582043
 ] 

Ekaterina Dimitrova commented on CASSANDRA-17819:
-

The test seems fine but you broke 
org.apache.cassandra.distributed.test.jmx.JMXGetterCheckTest

I skimmed it quickly, but it will take me some time to dig into the details. I 
left a few immediate small comments on the commit; more on Monday.

We also need a 4.0 patch, right? 

Maybe [~adelapena] will also want to take a look at the changes? 


[jira] [Commented] (CASSANDRA-17828) Read/Write/Truncate throw RequestFailure in a race condition with callback timeouts, should return Timeout instead

2022-08-19 Thread Caleb Rackliffe (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-17828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17582037#comment-17582037
 ] 

Caleb Rackliffe commented on CASSANDRA-17828:
-

+1 (assuming we go the route of making 100% TIMEOUT failure reasons correspond 
to an overall timeout and expose the failure map otherwise)

> Read/Write/Truncate throw RequestFailure in a race condition with callback 
> timeouts, should return Timeout instead
> --
>
> Key: CASSANDRA-17828
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17828
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> There is an edge case with write timeouts where the condition gets signaled 
> on the timeouts before await itself times out. This causes us to send a 
> Failure rather than a Timeout.
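
The approach mentioned in the comment above (treat a response as an overall timeout when 100% of the failure reasons are TIMEOUT, and expose the failure map otherwise) can be sketched as follows. This is an illustrative sketch only: FailureClassifier, Reason, Outcome, and classify are hypothetical names, not Cassandra's actual types. Treating an empty failure map as a timeout is also an assumption made here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names are taken from the Cassandra code base.
class FailureClassifier {
    enum Reason { TIMEOUT, UNKNOWN }
    enum Outcome { TIMEOUT, FAILURE }

    // failuresByReplica maps a replica identifier to the reason it reported.
    static Outcome classify(Map<String, Reason> failuresByReplica) {
        // allMatch returns true for an empty map, so "no recorded failures"
        // is reported as a timeout as well (an assumption of this sketch).
        boolean onlyTimeouts = failuresByReplica.values().stream()
                                                .allMatch(reason -> reason == Reason.TIMEOUT);
        return onlyTimeouts ? Outcome.TIMEOUT : Outcome.FAILURE;
    }

    public static void main(String[] args) {
        Map<String, Reason> failures = new HashMap<>();
        failures.put("replica1", Reason.TIMEOUT);
        System.out.println(classify(failures));
        failures.put("replica2", Reason.UNKNOWN);
        System.out.println(classify(failures));
    }
}
```

With this classification the result does not depend on whether the callback timeouts or await fires first, because only the recorded reasons are inspected.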




[jira] [Commented] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to



[ 
https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581990#comment-17581990
 ] 

Stefan Miklosovic commented on CASSANDRA-13010:
---

8 precommit 
https://app.circleci.com/pipelines/github/instaclustr/cassandra/1216/workflows/19d0beb7-310b-4170-8863-608bff7adb40
11 precommit 
https://app.circleci.com/pipelines/github/instaclustr/cassandra/1216/workflows/5ed6064f-f72d-4b44-bd0f-a47885e10de1

simplified solution we just talked about is here: 
https://github.com/apache/cassandra/pull/1801

one test is repeatedly failing (not related to this PR)

> nodetool compactionstats should say which disk a compaction is writing to
> -
>
> Key: CASSANDRA-13010
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13010
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Local/Compaction, Tool/nodetool
>Reporter: Jon Haddad
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Labels: 4.0-feature-freeze-review-requested
> Fix For: 4.x
>
> Attachments: cleanup.png, multiple operations.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>





[jira] [Updated] (CASSANDRA-17677) Fix BulkLoader to load entireSSTableThrottle and entireSSTableInterDcThrottle



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANDRA-17677:
--
Status: Ready to Commit  (was: Review In Progress)

Starting commit

CI Results (pending):
||Branch||Source||Circle CI||
|cassandra-4.1|[branch|https://github.com/yifan-c/cassandra/tree/commit_remote_branch/CASSANDRA-17677-cassandra-4.1-49D558C4-750C-4AE9-81D0-C89424123139]|[build|https://app.circleci.com/pipelines/github/yifan-c/cassandra?branch=commit_remote_branch%2FCASSANDRA-17677-cassandra-4.1-49D558C4-750C-4AE9-81D0-C89424123139]|
|trunk|[branch|https://github.com/yifan-c/cassandra/tree/commit_remote_branch/CASSANDRA-17677-trunk-49D558C4-750C-4AE9-81D0-C89424123139]|[build|https://app.circleci.com/pipelines/github/yifan-c/cassandra?branch=commit_remote_branch%2FCASSANDRA-17677-trunk-49D558C4-750C-4AE9-81D0-C89424123139]|



[jira] [Updated] (CASSANDRA-17842) Add the ability for operators to allow intentional loosening of definition of "empty" in Gossip for specific edge case failure scenarios



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-17842:
--
Change Category: Operability
     Complexity: Normal
  Fix Version/s: 4.x
         Status: Open  (was: Triage Needed)

> Add the ability for operators to allow intentional loosening of definition of 
> "empty" in Gossip for specific edge case failure scenarios
> 
>
> Key: CASSANDRA-17842
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17842
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Cluster/Gossip
>Reporter: Josh McKenzie
>Assignee: Josh McKenzie
>Priority: Normal
> Fix For: 4.x
>
>
> Right now {{empty}} is very specific to a single edge case (i.e. in 
> {{isEmptyWithoutStatus()}} our usage of hbState() + applicationState), but 
> there are other failure cases which block host replacements and require 
> intrusive workarounds and human intervention to recover from when you have 
> something in hbState() you don't expect.
> If we allow opt-in to a more risky (i.e. we don’t know how we got there) 
> definition of empty, then host replacements can make progress even when 
> Gossip's gotten into a bad state. Which it does. All too often.
> This parameter will obviously need some NEWS.txt and other documentation 
> around it to explain the context for end users.
> Now that I think of it, general "how to troubleshoot Gossip problems" might 
> be worth writing up and including this as part of it for operators and users, 
> specifically on our 
> [Troubleshooting|https://cassandra.apache.org/doc/latest/cassandra/troubleshooting/index.html]
>  page. Probably create that as another ticket and defer that update to there 
> and rely on news.txt and the param documentation for this one just to get the 
> functionality into the system for operators who need it.
> A touch more context:
> {code}
> // In the very specific case where hbState.isEmpty and STATUS is missing, this is known to be safe to "fake"
> // the data, as this happens when the gossip state isn't coming from the node but instead from a peer who
> // restarted and is missing the node's state
> //
> // When hbState is *not* empty, then the node gossiped an empty STATUS, this happens during bootstrap and it's not
> // possible to tell if this is ok or not (we can't really tell if the node is dead or having networking issues);
> // for these cases we need to allow an external actor to verify and inform Cassandra that it is safe; this is done by
> // updating the LOOSE_DEF_OF_EMPTY_ENABLED field.
> {code}
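
The distinction drawn in the quoted comment can be sketched as a small predicate. This is only an illustration under stated assumptions: the class `EmptyStateSketch`, the method `treatAsEmpty`, and its three boolean parameters are hypothetical stand-ins and not Cassandra's Gossiper API. Only the idea of an operator-controlled opt-in (called LOOSE_DEF_OF_EMPTY_ENABLED in the ticket) comes from the description.

```java
// Hypothetical sketch of the decision described in the ticket; not the Gossiper implementation.
final class EmptyStateSketch
{
    private EmptyStateSketch() {}

    /**
     * @param heartbeatEmpty true when the heartbeat state carries no information
     * @param hasStatus      true when a STATUS application state is present
     * @param looseEnabled   operator opt-in to the riskier definition of "empty"
     */
    static boolean treatAsEmpty(boolean heartbeatEmpty, boolean hasStatus, boolean looseEnabled)
    {
        if (hasStatus)
            return false;        // a STATUS is present, so the state is not "empty"
        if (heartbeatEmpty)
            return true;         // the known-safe case: state relayed by a restarted peer
        return looseEnabled;     // ambiguous case: empty only if the operator vouched for it
    }
}
```

With the flag off, only the known-safe case counts as empty; turning it on additionally accepts the ambiguous case, which is why the ticket frames it as an explicit operator decision.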



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CASSANDRA-17842) Add the ability for operators to allow intentional loosening of definition of "empty" in Gossip for specific edge case failure scenarios



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-17842:
--
Description: 
Right now {{empty}} is very specific to a single edge case (i.e. in 
{{isEmptyWithoutStatus()}} our usage of hbState() + applicationState), but 
there are other failure cases which block host replacements and require 
intrusive workarounds and human intervention to recover from when you have 
something in hbState() you don't expect.

If we allow opt-in to a more risky (i.e. we don’t know how we got there) 
definition of empty, then host replacements can make progress even when 
Gossip's gotten into a bad state. Which it does. All too often.

This parameter will obviously need some NEWS.txt and other documentation around 
it to explain the context for end users.

Now that I think of it, general "how to troubleshoot Gossip problems" might be 
worth writing up and including this as part of it for operators and users, 
specifically on our 
[Troubleshooting|https://cassandra.apache.org/doc/latest/cassandra/troubleshooting/index.html]
 page. Probably create that as another ticket and defer that update to there 
and rely on news.txt and the param documentation for this one just to get the 
functionality into the system for operators who need it.

A touch more context:
{code}
// In the very specific case where hbState.isEmpty and STATUS is missing, this is known to be safe to "fake"
// the data, as this happens when the gossip state isn't coming from the node but instead from a peer who
// restarted and is missing the node's state
//
// When hbState is *not* empty, then the node gossiped an empty STATUS, this happens during bootstrap and it's not
// possible to tell if this is ok or not (we can't really tell if the node is dead or having networking issues);
// for these cases we need to allow an external actor to verify and inform Cassandra that it is safe; this is done by
// updating the LOOSE_DEF_OF_EMPTY_ENABLED field.
{code}

  was:
Right now {{empty}} is very specific to a single edge case (i.e. in 
{{isEmptyWithoutStatus()}} our usage of hbState() + applicationState), but 
there are other failure cases which block host replacements and require 
intrusive workarounds and human intervention to recover from when you have 
something in hbState() you don't expect.

If we allow opt-in to a more risky (i.e. we don’t know how we got there) 
definition of empty, then host replacements can make progress even when 
Gossip's gotten into a bad state. Which it does. All too often.

This parameter will obviously need some NEWS.txt and other documentation around 
it to explain the context for end users.

Now that I think of it, general "how to troubleshoot Gossip problems" might be 
worth writing up and including this as part of it for operators and users, 
specifically on our 
[Troubleshooting|https://cassandra.apache.org/doc/latest/cassandra/troubleshooting/index.html]
 page. Probably create that as another ticket and defer that update to there 
and rely on news.txt and the param documentation for this one just to get the 
functionality into the system for operators who need it.


> Add the ability for operators to allow intentional loosening of definition of 
> "empty" in Gossip for specific edge case failure scenarios
> 
>
> Key: CASSANDRA-17842
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17842
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Cluster/Gossip
>Reporter: Josh McKenzie
>Assignee: Josh McKenzie
>Priority: Normal
>
> Right now {{empty}} is very specific to a single edge case (i.e. in 
> {{isEmptyWithoutStatus()}} our usage of hbState() + applicationState), but 
> there are other failure cases which block host replacements and require 
> intrusive workarounds and human intervention to recover from when you have 
> something in hbState() you don't expect.
> If we allow opt-in to a more risky (i.e. we don’t know how we got there) 
> definition of empty, then host replacements can make progress even when 
> Gossip's gotten into a bad state. Which it does. All too often.
> This parameter will obviously need some NEWS.txt and other documentation 
> around it to explain the context for end users.
> Now that I think of it, general "how to troubleshoot Gossip problems" might 
> be worth writing up and including this as part of it for operators and users, 
> specifically on our 
> [Troubleshooting|https://cassandra.apache.org/doc/latest/cassandra/troubleshooting/index.html]
>  page. Probably create that as another ticket and defer that update to there 
> and rely on news.txt and the param documentation for this one just to get the 
>

[jira] [Created] (CASSANDRA-17842) Add the ability for operators to allow intentional loosening of definition of "empty" in Gossip for specific edge case failure scenarios

Josh McKenzie created CASSANDRA-17842:
-

Summary: Add the ability for operators to allow intentional
loosening of definition of "empty" in Gossip for specific edge case failure
scenarios
Key: CASSANDRA-17842
URL: https://issues.apache.org/jira/browse/CASSANDRA-17842
Project: Cassandra
Issue Type: Improvement
Components: Cluster/Gossip
Reporter: Josh McKenzie
Assignee: Josh McKenzie

Right now {{empty}} is very specific to a single edge case (i.e. in
{{isEmptyWithoutStatus()}} our usage of hbState() + applicationState), but
there are other failure cases which block host replacements and require
intrusive workarounds and human intervention to recover from when you have
something in hbState() you don't expect.

If we allow opt-in to a more risky (i.e. we don’t know how we got there)
definition of empty, then host replacements can make progress even when
Gossip's gotten into a bad state. Which it does. All too often.

This parameter will obviously need some NEWS.txt and other documentation around
it to explain the context for end users.

Now that I think of it, general "how to troubleshoot Gossip problems" might be
worth writing up and including this as part of it for operators and users,
specifically on our
[Troubleshooting|https://cassandra.apache.org/doc/latest/cassandra/troubleshooting/index.html]
page. Probably create that as another ticket and defer that update to there
and rely on news.txt and the param documentation for this one just to get the
functionality into the system for operators who need it.


[jira] [Created] (CASSANDRA-17841) Log anticompaction cancellation at INFO level

Josh McKenzie created CASSANDRA-17841:
-

 Summary: Log anticompaction cancellation at INFO level
 Key: CASSANDRA-17841
 URL: https://issues.apache.org/jira/browse/CASSANDRA-17841
 Project: Cassandra
  Issue Type: Improvement
  Components: Consistency/Repair
Reporter: Josh McKenzie
Assignee: Josh McKenzie


If anticompaction can be interrupted after repair as a normal part of 
operations we should log at INFO level rather than error.




[jira] [Updated] (CASSANDRA-17841) Log anticompaction cancellation at INFO level



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-17841:
--
Change Category: Operability
 Complexity: Low Hanging Fruit
  Fix Version/s: 4.x
 Status: Open  (was: Triage Needed)

> Log anticompaction cancellation at INFO level
> -
>
> Key: CASSANDRA-17841
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17841
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Repair
>Reporter: Josh McKenzie
>Assignee: Josh McKenzie
>Priority: Normal
> Fix For: 4.x
>
>
> If anticompaction can be interrupted after repair as a normal part of 
> operations we should log at INFO level rather than error.




[jira] [Updated] (CASSANDRA-17840) IndexOutOfBoundsException in Paging State Version Inference (V3 State Received on V4 Connection)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-17840:
--
 Bug Category: Parent values: Degradation(12984), Level 1 values: Other Exception(12998)
   Complexity: Normal
Discovered By: User Report
Fix Version/s: 3.11.x
   4.0.x
   4.1.x
   4.x
 Severity: Low
   Status: Open  (was: Triage Needed)

> IndexOutOfBoundsException in Paging State Version Inference (V3 State 
> Received on V4 Connection)
> 
>
> Key: CASSANDRA-17840
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17840
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Client
>Reporter: Josh McKenzie
>Assignee: Josh McKenzie
>Priority: Normal
> Fix For: 3.11.x, 4.0.x, 4.1.x, 4.x
>
>
> In {{PagingState.java}}, {{index}} is an integer field, and we add long 
> values to it without a {{Math.toIntExact}} check. While we check for 
> negative values returned by {{getUnsignedVInt}}, the returned value may be 
> so large that the addition overflows the integer, or the value itself may 
> exceed the integer range.




[jira] [Created] (CASSANDRA-17840) IndexOutOfBoundsException in Paging State Version Inference (V3 State Received on V4 Connection)

Josh McKenzie created CASSANDRA-17840:
-

 Summary: IndexOutOfBoundsException in Paging State Version 
Inference (V3 State Received on V4 Connection)
 Key: CASSANDRA-17840
 URL: https://issues.apache.org/jira/browse/CASSANDRA-17840
 Project: Cassandra
  Issue Type: Bug
  Components: Messaging/Client
Reporter: Josh McKenzie
Assignee: Josh McKenzie


In {{PagingState.java}}, {{index}} is an integer field, and we add long values 
to it without a {{Math.toIntExact}} check. While we check for negative values 
returned by {{getUnsignedVInt}}, the returned value may be so large that the 
addition overflows the integer, or the value itself may exceed the integer 
range.
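
To make the overflow concrete, here is a minimal sketch of an overflow-checked index advance. It assumes nothing about the real {{PagingState}} code beyond what the description says: `SafeIndex` and `advance` are hypothetical names, and the sketch only shows how `Math.addExact` and `Math.toIntExact` reject values that a plain `(int)` cast would silently wrap.

```java
// Hypothetical helper, not the actual PagingState code: advance an int index
// by a long length (for example one decoded from an unsigned vint) without
// silently wrapping around.
final class SafeIndex
{
    private SafeIndex() {}

    /**
     * Returns index + length as an int, or throws ArithmeticException if the
     * length is negative or the sum does not fit in an int.
     */
    static int advance(int index, long length)
    {
        if (length < 0)
            throw new ArithmeticException("negative length: " + length);
        // Math.addExact guards the long addition itself;
        // Math.toIntExact guards the narrowing back to int.
        return Math.toIntExact(Math.addExact((long) index, length));
    }

    public static void main(String[] args)
    {
        System.out.println(advance(10, 5L));   // prints 15
        try
        {
            advance(Integer.MAX_VALUE, 1L);    // a plain (int) cast would wrap to a negative index
        }
        catch (ArithmeticException e)
        {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing fast with an ArithmeticException turns a corrupt or mismatched paging state into a clear error at the point of decoding rather than an IndexOutOfBoundsException later on.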




[jira] [Updated] (CASSANDRA-17839) Out of range exception on column index downsampling



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-17839:
--
 Bug Category: Parent values: Degradation(12984), Level 1 values: Other Exception(12998)
   Complexity: Low Hanging Fruit
Discovered By: User Report
 Severity: Low
   Status: Open  (was: Triage Needed)

> Out of range exception on column index downsampling
> ---
>
> Key: CASSANDRA-17839
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17839
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Josh McKenzie
>Assignee: Josh McKenzie
>Priority: Normal
>
> We've seen a histogram overflow exception in the wild: an 
> IllegalArgumentException with {{Out of range}} from {{Ints.checkedCast}}. 
> It looks like we need to tune the {{defaultPartitionSizeHistogram}} a bit and 
> handle this case and degrade more gracefully in {{SegmentedFile}}.




[jira] [Created] (CASSANDRA-17839) Out of range exception on column index downsampling

Josh McKenzie created CASSANDRA-17839:
-

 Summary: Out of range exception on column index downsampling
 Key: CASSANDRA-17839
 URL: https://issues.apache.org/jira/browse/CASSANDRA-17839
 Project: Cassandra
  Issue Type: Bug
  Components: Observability/Metrics
Reporter: Josh McKenzie
Assignee: Josh McKenzie


We've seen a histogram overflow exception in the wild: an 
IllegalArgumentException with {{Out of range}} from {{Ints.checkedCast}}. 
It looks like we need to tune the {{defaultPartitionSizeHistogram}} a bit and 
handle this case and degrade more gracefully in {{SegmentedFile}}.
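
As an illustration of the two behaviours in play, the sketch below contrasts a checked narrowing cast, which throws the way Guava's `Ints.checkedCast` does, with a saturating one that clamps instead. The class `CastSketch` and both methods are hand-written stand-ins using only the JDK. Whether clamping is the right way to degrade in {{SegmentedFile}} is an assumption made for the example, not something the ticket decides.

```java
// Hypothetical stand-ins written against the JDK only; they mimic the
// semantics of a checked and a saturating long-to-int cast.
final class CastSketch
{
    private CastSketch() {}

    /** Rejects values outside the int range, mirroring the reported failure mode. */
    static int checkedCast(long value)
    {
        int result = (int) value;
        if (result != value)
            throw new IllegalArgumentException("Out of range: " + value);
        return result;
    }

    /** Degrades gracefully: clamps to the nearest representable int instead of throwing. */
    static int saturatedCast(long value)
    {
        if (value > Integer.MAX_VALUE)
            return Integer.MAX_VALUE;
        if (value < Integer.MIN_VALUE)
            return Integer.MIN_VALUE;
        return (int) value;
    }
}
```

A saturating cast trades precision for availability: an oversized histogram value yields a capped estimate instead of aborting the operation.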




[jira] [Commented] (CASSANDRA-17821) Add ability to log load profiles at fixed intervals



[ 
https://issues.apache.org/jira/browse/CASSANDRA-17821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581959#comment-17581959
 ] 

Yifan Cai commented on CASSANDRA-17821:
---

Yes. +1 on the patch.

> Add ability to log load profiles at fixed intervals
> ---
>
> Key: CASSANDRA-17821
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17821
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tool/nodetool
>Reporter: Josh McKenzie
>Assignee: Josh McKenzie
>Priority: Normal
> Fix For: 4.2
>
>
> A jmx operation to run profileload and log results every X would be helpful 
> so operators can hot prop it on troubled nodes/clusters to identify 
> troublesome queries.




[jira] [Updated] (CASSANDRA-17679) Make resumable bootstrap feature optional



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-17679:
--
Status: Open  (was: Patch Available)

> Make resumable bootstrap feature optional
> -
>
> Key: CASSANDRA-17679
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17679
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Consistency/Streaming
>Reporter: Josh McKenzie
>Assignee: Josh McKenzie
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> From the patch I'm working on:
> {code}
> # In certain environments, operators may want to disable resumable bootstrap in order to avoid potential correctness
> # violations or data loss scenarios. Largely this centers around nodes going down during bootstrap, tombstones being
> # written, and potential races with repair. By default we leave this on as it's been enabled for quite some time,
> # however the option to disable it is more palatable now that we have zero copy streaming as that greatly accelerates
> # bootstraps. This defaults to true.
> # resumable_bootstrap_enabled: true
> {code}
> Not really a great fit for guardrails as it's less a "feature to be toggled 
> on and off" and more a subset of a specific feature that in certain 
> circumstances can lead to issues.




[jira] [Updated] (CASSANDRA-17821) Add ability to log load profiles at fixed intervals



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-17821:
--
  Fix Version/s: 4.2
Source Control Link: 
https://gitbox.apache.org/repos/asf?p=cassandra.git;a=commit;h=4526b3fcbde22d09065820286dd434d93ecc89ba
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Add ability to log load profiles at fixed intervals
> ---
>
> Key: CASSANDRA-17821
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17821
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tool/nodetool
>Reporter: Josh McKenzie
>Assignee: Josh McKenzie
>Priority: Normal
> Fix For: 4.2
>
>
> A jmx operation to run profileload and log results every X would be helpful 
> so operators can hot prop it on troubled nodes/clusters to identify 
> troublesome queries.
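
The general shape of "run a sampler every X and record the result" can be sketched with a `ScheduledExecutorService`. This is not the code merged for this ticket: `PeriodicProfileSketch`, `collect`, and the `Supplier<String>` standing in for a profileload run are all hypothetical, and a real implementation would write each result to the log rather than return a list.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch of fixed-interval sampling; not the merged implementation.
final class PeriodicProfileSketch
{
    private PeriodicProfileSketch() {}

    /**
     * Runs the sampler every intervalMillis (must be greater than zero) until
     * `rounds` results are gathered, then stops the schedule and returns them.
     * Assumes the sampler does not throw.
     */
    static List<String> collect(Supplier<String> sampler, int rounds, long intervalMillis)
    {
        List<String> results = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(rounds);
        // A single thread keeps the runs sequential, so the count check below cannot race.
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        try
        {
            executor.scheduleAtFixedRate(() -> {
                if (done.getCount() > 0)
                {
                    results.add(sampler.get());   // a real implementation would log this
                    done.countDown();
                }
            }, 0, intervalMillis, TimeUnit.MILLISECONDS);
            done.await();
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
        }
        finally
        {
            executor.shutdownNow();               // always stop the recurring task
        }
        return results;
    }
}
```

For example, `PeriodicProfileSketch.collect(() -> "sample", 3, 10L)` gathers three results roughly 10 ms apart and then shuts the scheduler down.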




[cassandra] branch trunk updated: Add ability to log load profiles at fixed intervals

This is an automated email from the ASF dual-hosted git repository.

jmckenzie pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/trunk by this push:
 new 4526b3fcbd Add ability to log load profiles at fixed intervals
4526b3fcbd is described below

commit 4526b3fcbde22d09065820286dd434d93ecc89ba
Author: Josh McKenzie 
AuthorDate: Tue Aug 16 14:19:46 2022 -0400

Add ability to log load profiles at fixed intervals

Patch by Yifan Cai; reviewed by Josh McKenzie, Dinesh Joshi, and Chris Lohfink for CASSANDRA-17821

Co-authored-by: Yifan Cai 
Co-authored-by: Josh McKenzie 
---
 CHANGES.txt|   1 +
 NEWS.txt   |   4 +-
 .../apache/cassandra/metrics/FrequencySampler.java |  42 +--
 .../org/apache/cassandra/metrics/MaxSampler.java   |  41 ++-
 src/java/org/apache/cassandra/metrics/Sampler.java |  88 -
 .../apache/cassandra/metrics/SamplingManager.java  | 391 +
 .../apache/cassandra/service/StorageService.java   |  57 ++-
 .../cassandra/service/StorageServiceMBean.java |  30 ++
 src/java/org/apache/cassandra/tools/NodeProbe.java |  24 +-
 .../cassandra/tools/nodetool/ProfileLoad.java  | 210 ++-
 .../distributed/test/ProfileLoadTest.java  | 151 
 .../org/apache/cassandra/metrics/SamplerTest.java  |   7 +-
 .../apache/cassandra/tools/TopPartitionsTest.java  |  38 +-
 13 files changed, 917 insertions(+), 167 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index e2e4d7ddde..255b46d7aa 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.2
+ * Add ability to log load profiles at fixed intervals (CASSANDRA-17821)
  * Protect against Gossip backing up due to a quarantined endpoint without version information (CASSANDRA-17830)
  * NPE in org.apache.cassandra.cql3.Attributes.getTimeToLive (CASSANDRA-17822)
  * Add guardrail for column size (CASSANDRA-17151)
diff --git a/NEWS.txt b/NEWS.txt
index 780cd7b88f..d8e9318a41 100644
--- a/NEWS.txt
+++ b/NEWS.txt
@@ -75,10 +75,12 @@ New features
   - Whether DROP KEYSPACE commands are allowed.
   - Column value size
 - It is possible to list ephemeral snapshots by nodetool listsnaphots command when flag "-e" is specified.
+- Added a new flag to `nodetool profileload` and JMX endpoint to set up recurring profile load generation on specified
+  intervals (see CASSANDRA-17821)
 
 Upgrading
 -
-- Emphemeral marker files for snapshots done by repairs are not created anymore, 
+- Ephemeral marker files for snapshots done by repairs are not created anymore,
   there is a dedicated flag in snapshot manifest instead. On upgrade of a node to version 4.2, on node's start, in case there 
   are such ephemeral snapshots on disk, they will be deleted (same behaviour as before) and any new ephemeral snapshots 
   will stop to create ephemeral marker files as flag in a snapshot manifest was introduced instead.
diff --git a/src/java/org/apache/cassandra/metrics/FrequencySampler.java b/src/java/org/apache/cassandra/metrics/FrequencySampler.java
index 8a8918b9fa..d4dfe860a4 100644
--- a/src/java/org/apache/cassandra/metrics/FrequencySampler.java
+++ b/src/java/org/apache/cassandra/metrics/FrequencySampler.java
@@ -33,33 +33,31 @@ import static java.util.concurrent.TimeUnit.MILLISECONDS;
  * add("x", 10); and add("x", 20); will result in "x" = 30 This uses StreamSummary to only store the
  * approximate cardinality (capacity) of keys. If the number of distinct keys exceed the capacity, the error of the
  * sample may increase depending on distribution of keys among the total set.
+ *
+ * Note: {@link Sampler#samplerExecutor} is single threaded but we still need to synchronize as we have access
+ * from both internal and the external JMX context that can cause races.
  * 
  * @param <T>
  */
 public abstract class FrequencySampler<T> extends Sampler<T>
 {
 private static final Logger logger = LoggerFactory.getLogger(FrequencySampler.class);
-private long endTimeNanos = -1;
 
 private StreamSummary<T> summary;
 
 /**
  * Start to record samples
  *
- * @param capacity
- *Number of sample items to keep in memory, the lower this is
- *the less accurate results are. For best results use value
- *close to cardinality, but understand the memory trade offs.
+ * @param capacity Number of sample items to keep in memory, the lower this is
+ * the less accurate results are. For best results use value
+ * close to cardinality, but understand the memory trade offs.
  */
-public synchronized void beginSampling(int capacity, int durationMillis)
+public synchronized void beginSampling(int capacity, long durationMillis)
 {
-if (endTimeNanos == -1 ||

[jira] [Commented] (CASSANDRA-17821) Add ability to log load profiles at fixed intervals



[ 
https://issues.apache.org/jira/browse/CASSANDRA-17821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581956#comment-17581956
 ] 

Josh McKenzie commented on CASSANDRA-17821:
---

Got the +1 on the PR from [~yifanc] with a couple nits; addressing, rebasing, 
and merging.

> Add ability to log load profiles at fixed intervals
> ---
>
> Key: CASSANDRA-17821
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17821
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tool/nodetool
>Reporter: Josh McKenzie
>Assignee: Josh McKenzie
>Priority: Normal
>
> A jmx operation to run profileload and log results every X would be helpful 
> so operators can hot prop it on troubled nodes/clusters to identify 
> troublesome queries.




[jira] [Updated] (CASSANDRA-17821) Add ability to log load profiles at fixed intervals



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-17821:
--
Status: Ready to Commit  (was: Review In Progress)

> Add ability to log load profiles at fixed intervals
> ---
>
> Key: CASSANDRA-17821
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17821
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tool/nodetool
>Reporter: Josh McKenzie
>Assignee: Josh McKenzie
>Priority: Normal
>
> A jmx operation to run profileload and log results every X would be helpful 
> so operators can hot prop it on troubled nodes/clusters to identify 
> troublesome queries.




[jira] [Updated] (CASSANDRA-17834) Repair coordinator can get stuck in an infinite loop if it gets a FailSession after it's marked a session FINALIZED



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-17834:
--
  Fix Version/s: 4.0.6
 4.1-beta
 4.2
 (was: 4.x)
 (was: 4.0.x)
 (was: 4.1.x)
  Since Version: 4.0
Source Control Link: 
https://gitbox.apache.org/repos/asf?p=cassandra.git;a=commit;h=0353df7542dbdbb1140a72899666e4587e87a083
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Repair coordinator can get stuck in an infinite loop if it gets a FailSession 
> after it's marked a session FINALIZED
> ---
>
> Key: CASSANDRA-17834
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17834
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Josh McKenzie
>Assignee: Josh McKenzie
>Priority: Normal
> Fix For: 4.0.6, 4.1-beta, 4.2
>
>
> If the repair coordinator gets a FailSession message after it has marked the 
> session as FINALIZED it will start an infinite loop of FailSession messages. 
> This is unique to 4.0+ in {{CoordinatorSession.java}}
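
The guard that prevents this loop can be illustrated with a simplified state machine. The sketch below is an assumption-laden model, not the patched class: `SessionSketch`, the `RUNNING` state, and the message counter are hypothetical, and a single state stands in for the per-participant states a coordinator tracks. It shows only the core idea that a session in a terminal state ignores a late failure instead of broadcasting another one.

```java
// Hypothetical, simplified model of a repair session; not CoordinatorSession itself.
final class SessionSketch
{
    enum State { RUNNING, FINALIZED, FAILED }

    private State state = State.RUNNING;
    private int failureMessagesSent = 0;

    synchronized void finalizeSession()
    {
        if (state == State.RUNNING)
            state = State.FINALIZED;
    }

    /** Returns true only if the session actually moved to FAILED. */
    synchronized boolean fail()
    {
        if (state != State.RUNNING)
            return false;          // terminal state: do not send another failure message
        failureMessagesSent++;     // stands in for notifying the participants
        state = State.FAILED;
        return true;
    }

    synchronized State state() { return state; }

    synchronized int failureMessagesSent() { return failureMessagesSent; }
}
```

Without the early return, a failure arriving after finalization would trigger a new round of failure messages, and each reply could trigger another.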




[jira] [Updated] (CASSANDRA-17834) Repair coordinator can get stuck in an infinite loop if it gets a FailSession after it's marked a session FINALIZED



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-17834:
--
Status: Ready to Commit  (was: Review In Progress)

> Repair coordinator can get stuck in an infinite loop if it gets a FailSession 
> after it's marked a session FINALIZED
> ---
>
> Key: CASSANDRA-17834
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17834
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Josh McKenzie
>Assignee: Josh McKenzie
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 4.x
>
>
> If the repair coordinator gets a FailSession message after it has marked the 
> session as FINALIZED it will start an infinite loop of FailSession messages. 
> This is unique to 4.0+ in {{CoordinatorSession.java}}




[cassandra] branch cassandra-4.1 updated (958889cfeb -> b95a6931f5)

This is an automated email from the ASF dual-hosted git repository.

jmckenzie pushed a change to branch cassandra-4.1
in repository https://gitbox.apache.org/repos/asf/cassandra.git


from 958889cfeb Merge branch 'cassandra-4.0' into cassandra-4.1
 new 0353df7542 Prevent infinite loop in repair coordinator on FailSession
 new b95a6931f5 Merge branch 'cassandra-4.0' into cassandra-4.1

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGES.txt|  2 +
 .../repair/consistent/CoordinatorSession.java  | 11 +
 .../cassandra/repair/consistent/LocalSessions.java |  7 ++-
 .../test/IncRepairCoordinatorErrorTest.java| 57 ++
 4 files changed, 76 insertions(+), 1 deletion(-)
 create mode 100644 test/distributed/org/apache/cassandra/distributed/test/IncRepairCoordinatorErrorTest.java



[cassandra] 01/01: Merge branch 'cassandra-4.1' into trunk

This is an automated email from the ASF dual-hosted git repository.

jmckenzie pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit fd7b1bf81e17681567ba54b0577c788842c05bec
Merge: 9f8646ed49 b95a6931f5
Author: Josh McKenzie 
AuthorDate: Fri Aug 19 12:36:08 2022 -0400

Merge branch 'cassandra-4.1' into trunk

 CHANGES.txt|  2 +
 .../repair/consistent/CoordinatorSession.java  | 11 +
 .../cassandra/repair/consistent/LocalSessions.java |  7 ++-
 .../test/IncRepairCoordinatorErrorTest.java| 57 ++
 4 files changed, 76 insertions(+), 1 deletion(-)




[cassandra] branch cassandra-4.0 updated: Prevent infinite loop in repair coordinator on FailSession

This is an automated email from the ASF dual-hosted git repository.

jmckenzie pushed a commit to branch cassandra-4.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/cassandra-4.0 by this push:
 new 0353df7542 Prevent infinite loop in repair coordinator on FailSession
0353df7542 is described below

commit 0353df7542dbdbb1140a72899666e4587e87a083
Author: Josh McKenzie 
AuthorDate: Thu Aug 18 13:00:45 2022 -0400

Prevent infinite loop in repair coordinator on FailSession

Patch by Marcus Eriksson; reviewed by David Capwell, Blake Eggleston, and Josh McKenzie for CASSANDRA-17834

Co-authored-by: Marcus Eriksson 
Co-authored-by: Josh McKenzie 
---
 CHANGES.txt|  1 +
 .../repair/consistent/CoordinatorSession.java  | 11 +
 .../cassandra/repair/consistent/LocalSessions.java |  7 ++-
 .../test/IncRepairCoordinatorErrorTest.java| 57 ++
 4 files changed, 75 insertions(+), 1 deletion(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index b09b00e498..d1fb5626dd 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0.6
+ * Prevent infinite loop in repair coordinator on FailSession (CASSANDRA-17834)
  * Fix race condition on updating cdc size and advancing to next segment (CASSANDRA-17792)
  * Add 'noboolean' rpm build for older distros like CentOS7 (CASSANDRA-17765)
  * Fix default value for compaction_throughput_mb_per_sec in Config class to match  the one in cassandra.yaml (CASSANDRA-17790)
diff --git a/src/java/org/apache/cassandra/repair/consistent/CoordinatorSession.java b/src/java/org/apache/cassandra/repair/consistent/CoordinatorSession.java
index d60541e17e..5ddac3f745 100644
--- a/src/java/org/apache/cassandra/repair/consistent/CoordinatorSession.java
+++ b/src/java/org/apache/cassandra/repair/consistent/CoordinatorSession.java
@@ -21,8 +21,10 @@ package org.apache.cassandra.repair.consistent;
 import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
+import java.util.Set;
 import java.util.concurrent.atomic.AtomicBoolean;
 import java.util.function.Supplier;
+import java.util.stream.Collectors;
 
 import javax.annotation.Nullable;
 
@@ -263,6 +265,15 @@ public class CoordinatorSession extends ConsistentSession
 
 public synchronized void fail()
 {
+Set<Map.Entry<InetAddressAndPort, State>> cantFail = participantStates.entrySet()
+  .stream()
+  .filter(entry -> !entry.getValue().canTransitionTo(State.FAILED))
+  .collect(Collectors.toSet());
+if (!cantFail.isEmpty())
+{
+logger.error("Can't transition endpoints {} to FAILED", cantFail, new RuntimeException());
+return;
+}
 logger.info("Incremental repair session {} failed", sessionID);
 sendFailureMessageToParticipants();
 setAll(State.FAILED);
diff --git a/src/java/org/apache/cassandra/repair/consistent/LocalSessions.java b/src/java/org/apache/cassandra/repair/consistent/LocalSessions.java
index 6f7b93e930..cfb90ef04d 100644
--- a/src/java/org/apache/cassandra/repair/consistent/LocalSessions.java
+++ b/src/java/org/apache/cassandra/repair/consistent/LocalSessions.java
@@ -727,7 +727,12 @@ public class LocalSessions
 {
 synchronized (session)
 {
-if (session.getState() != FAILED)
+if (session.getState() == FINALIZED)
+{
+logger.error("Can't change the state of session {} from FINALIZED to FAILED", session.sessionID, new RuntimeException());
+return;
+}
+else if (session.getState() != FAILED)
 {
 logger.info("Failing local repair session {}", session.sessionID);
 setStateAndSave(session, FAILED);
diff --git a/test/distributed/org/apache/cassandra/distributed/test/IncRepairCoordinatorErrorTest.java b/test/distributed/org/apache/cassandra/distributed/test/IncRepairCoordinatorErrorTest.java
new file mode 100644
index 0000000000..c06e848399
--- /dev/null
+++ b/test/distributed/org/apache/cassandra/distributed/test/IncRepairCoordinatorErrorTest.java
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable
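
As a rough illustration of the guard this patch adds to CoordinatorSession.fail(), the sketch below refuses to fail a session when any participant is in a state that cannot move to FAILED. It is a minimal sketch, not the Cassandra code: the class name FailGuardSketch, the reduced State enum and its transition table, and the use of String endpoint names are all assumptions made for illustration. The patch logs an error and returns; the sketch returns a boolean instead so the outcome is easy to check.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class FailGuardSketch {
    // Hypothetical, reduced state machine; the real session states are richer.
    enum State {
        PREPARING, REPAIRING, FINALIZED, FAILED;

        boolean canTransitionTo(State next) {
            if (next == this) return true; // assumption: staying in place is allowed
            switch (this) {
                case PREPARING: return next == REPAIRING || next == FAILED;
                case REPAIRING: return next == FINALIZED || next == FAILED;
                default:        return false; // FINALIZED and FAILED are terminal here
            }
        }
    }

    // Endpoint name -> current state (String keys are a simplification).
    final Map<String, State> participantStates = new HashMap<>();

    /** Returns true if every participant was moved to FAILED, false if the request was refused. */
    synchronized boolean fail() {
        Set<String> cantFail = participantStates.entrySet().stream()
            .filter(entry -> !entry.getValue().canTransitionTo(State.FAILED))
            .map(Map.Entry::getKey)
            .collect(Collectors.toSet());
        if (!cantFail.isEmpty())
            return false; // refuse instead of attempting an illegal transition
        participantStates.replaceAll((endpoint, state) -> State.FAILED);
        return true;
    }
}
```

Checking every participant before changing any of them means a refused request leaves all states untouched.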

[cassandra] 01/01: Merge branch 'cassandra-4.0' into cassandra-4.1

This is an automated email from the ASF dual-hosted git repository.

jmckenzie pushed a commit to branch cassandra-4.1
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit b95a6931f5b1a4e25cce81f87dd97063c0e9dd87
Merge: 958889cfeb 0353df7542
Author: Josh McKenzie 
AuthorDate: Fri Aug 19 12:34:33 2022 -0400

Merge branch 'cassandra-4.0' into cassandra-4.1

# Conflicts:
#   CHANGES.txt
#   src/java/org/apache/cassandra/repair/consistent/CoordinatorSession.java

 CHANGES.txt                                        |  2 +
 .../repair/consistent/CoordinatorSession.java      | 11 +
 .../cassandra/repair/consistent/LocalSessions.java |  7 ++-
 .../test/IncRepairCoordinatorErrorTest.java        | 57 ++
 4 files changed, 76 insertions(+), 1 deletion(-)

diff --cc CHANGES.txt
index 9b9b622c18,d1fb5626dd..b18925d9ed
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,44 -1,5 +1,46 @@@
 +4.1-alpha2
 + * Fix a race condition where a keyspace can be opened while it is being removed (CASSANDRA-17658)
 + * DatabaseDescriptor will set the default failure detector during client 
initialization (CASSANDRA-17782)
 + * Avoid initializing schema via SystemKeyspace.getPreferredIP() with the 
BulkLoader tool (CASSANDRA-17740)
 + * Uncomment prepared_statements_cache_size, key_cache_size, 
counter_cache_size, index_summary_capacity which were
 +   commented out by mistake in a previous patch
 +   Fix breaking change with cache_load_timeout; cache_load_timeout_seconds 
<=0 and cache_load_timeout=0 are equivalent
 +   and they both mean disabled
 +   Deprecate public method setRate(final double throughputMbPerSec) in 
Compaction Manager in favor of
 +   setRateInBytes(final double throughputBytesPerSec)
 +   Revert breaking change removal of 
StressCQLSSTableWriter.Builder.withBufferSizeInMB(int size). Deprecate it in 
favor
 +   of StressCQLSSTableWriter.Builder.withBufferSizeInMiB(int size)
 +   Fix precision issues, add new -m flag (for nodetool/setstreamthroughput, nodetool/setinterdcstreamthroughput,
 +   nodetool/getstreamthroughput and nodetool/getinterdcstreamthroughput), add new -d flags (nodetool/getstreamthroughput, nodetool/getinterdcstreamthroughput, nodetool/getcompactionthroughput)
 +   Fix a bug with precision in nodetool/compactionstats
 +   Deprecate StorageService methods and add new ones for 
stream_throughput_outbound, inter_dc_stream_throughput_outbound,
 +   compaction_throughput_outbound in the JMX MBean 
`org.apache.cassandra.db:type=StorageService`
 +   Removed getEntireSSTableStreamThroughputMebibytesPerSec in favor of new 
getEntireSSTableStreamThroughputMebibytesPerSecAsDouble
 +   in the JMX MBean `org.apache.cassandra.db:type=StorageService`
 +   Removed getEntireSSTableInterDCStreamThroughputMebibytesPerSec in favor of 
getEntireSSTableInterDCStreamThroughputMebibytesPerSecAsDouble
 +   in the JMX MBean `org.apache.cassandra.db:type=StorageService` 
(CASSANDRA-17725)
 + * Fix sstable_preemptive_open_interval disabled value. 
sstable_preemptive_open_interval = null backward compatible with
 +   sstable_preemptive_open_interval_in_mb = -1 (CASSANDRA-17737)
 + * Remove usages of Path#toFile() in the snapshot apparatus (CASSANDRA-17769)
 + * Fix Settings Virtual Table to update paxos_variant after startup and 
rename enable_uuid_sstable_identifiers to
 +   uuid_sstable_identifiers_enabled as per our config naming conventions 
(CASSANDRA-17738)
 + * index_summary_resize_interval_in_minutes = -1 is equivalent to 
index_summary_resize_interval being set to null or
 +   disabled. JMX MBean IndexSummaryManager, setResizeIntervalInMinutes method 
still takes resizeIntervalInMinutes = -1 for disabled (CASSANDRA-17735)
 + * min_tracked_partition_size_bytes parameter from 4.1 alpha1 was renamed to 
min_tracked_partition_size (CASSANDRA-17733)
 + * Remove commons-lang dependency during build runtime (CASSANDRA-17724)
 + * Relax synchronization on StreamSession#onError() to avoid deadlock 
(CASSANDRA-17706)
 + * Fix AbstractCell#toString throws MarshalException for cell in collection 
(CASSANDRA-17695)
 + * Add new vtable output option to compactionstats (CASSANDRA-17683)
 + * Fix commitLogUpperBound initialization in AbstractMemtableWithCommitlog 
(CASSANDRA-17587)
 + * Fix widening to long in getBatchSizeFailThreshold (CASSANDRA-17650)
 + * Fix widening from mebibytes to bytes in IntMebibytesBound (CASSANDRA-17716)
 + * Revert breaking change in nodetool clientstats and expose client options through nodetool clientstats --client-options. (CASSANDRA-17715)
 + * Fix missed nowInSec values in QueryProcessor (CASSANDRA-17458)
 + * Revert removal of withBufferSizeInMB(int size) in CQLSSTableWriter.Builder 
class and deprecate it in favor of withBufferSizeInMiB(int size) 
(CASSANDRA-17675)
 + * Remove expired snapshots of dropped tables after restart (CASSANDRA-17619)
 +Merged from 4.0:
+ 4.0.6
+  * Prevent infinite loop in repair coordinator on

[cassandra] branch trunk updated (9f8646ed49 -> fd7b1bf81e)

This is an automated email from the ASF dual-hosted git repository.

jmckenzie pushed a change to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


from 9f8646ed49 Merge branch 'cassandra-4.1' into trunk
 new 0353df7542 Prevent infinite loop in repair coordinator on FailSession
 new b95a6931f5 Merge branch 'cassandra-4.0' into cassandra-4.1
 new fd7b1bf81e Merge branch 'cassandra-4.1' into trunk

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGES.txt                                        |  2 +
 .../repair/consistent/CoordinatorSession.java      | 11 +
 .../cassandra/repair/consistent/LocalSessions.java |  7 ++-
 .../test/IncRepairCoordinatorErrorTest.java        | 57 ++
 4 files changed, 76 insertions(+), 1 deletion(-)
 create mode 100644 test/distributed/org/apache/cassandra/distributed/test/IncRepairCoordinatorErrorTest.java


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-17828) Read/Write/Truncate throw RequestFailure in a race condition with callback timeouts, should return Timeout instead

2022-08-19 Thread Caleb Rackliffe (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-17828:

Reviewers: Caleb Rackliffe, Caleb Rackliffe
   Caleb Rackliffe, Caleb Rackliffe  (was: Caleb Rackliffe)
   Status: Review In Progress  (was: Patch Available)

> Read/Write/Truncate throw RequestFailure in a race condition with callback 
> timeouts, should return Timeout instead
> --
>
> Key: CASSANDRA-17828
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17828
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> There is an edge case with write timeout where the condition is signaled by 
> the callback timeouts before await itself times out; this causes us to 
> send a Failure rather than a Timeout.
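
The race described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Cassandra implementation and not the fix attached to this ticket: the class TimeoutRaceSketch, the Handler type, and its method names are hypothetical. A single latch is released by a success, by a replica failure, or by the callback expiring. A waiter that only checks "was I signaled?" then reports a failure when the signal actually came from an expired callback.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class TimeoutRaceSketch {
    enum Outcome { SUCCESS, TIMEOUT, FAILURE }

    static final class Handler {
        private final CountDownLatch signaled = new CountDownLatch(1);
        private volatile boolean succeeded;
        private volatile boolean expired; // set only when the callback itself timed out

        void onResponse()        { succeeded = true; signaled.countDown(); }
        void onReplicaFailure()  { signaled.countDown(); }
        void onCallbackExpired() { expired = true; signaled.countDown(); }

        // Returns true if the latch was released within the timeout.
        // An interrupt is treated like "not signaled" to keep the sketch small.
        private boolean awaitSignal(long timeoutMillis) {
            try {
                return signaled.await(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }

        // Naive classification: any signal without success is reported as a failure.
        Outcome awaitNaive(long timeoutMillis) {
            if (!awaitSignal(timeoutMillis)) return Outcome.TIMEOUT;
            return succeeded ? Outcome.SUCCESS : Outcome.FAILURE;
        }

        // Classification that distinguishes an expired callback from a real failure.
        Outcome awaitChecked(long timeoutMillis) {
            if (!awaitSignal(timeoutMillis)) return Outcome.TIMEOUT;
            if (succeeded) return Outcome.SUCCESS;
            return expired ? Outcome.TIMEOUT : Outcome.FAILURE;
        }
    }
}
```

If onCallbackExpired() fires just before the waiter's own deadline, awaitNaive reports FAILURE while awaitChecked reports TIMEOUT, which is the distinction the ticket asks for.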



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (CASSANDRA-17461) Test Failure: org.apache.cassandra.distributed.test.CASTest.testConflictingWritesWithStaleRingInformation

2022-08-19 Thread Benedict Elliott Smith (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-17461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581928#comment-17581928
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-17461 at 8/19/22 3:46 PM:
-

bq. in node1's view, node4 has been decommissioned/assassinated and is in the 
LEFT state

No, node1's view has node4 as never having joined the ring, as we invoke 
{{unsafeAnnulEndpoint}} after updating the token information via the LEFT 
state. This is just a convenient way to simulate node4 joining without node1 
receiving the gossip state.

bq. any gossip exchange between node1 and node4 will cause the state change 
back to the correct state of NORMAL ... and in the real world, may not get so 
lucky in timing to be from one

You have it backwards. Ordinary operations _depend_ on gossip disseminating 
this promptly for correctness. We are simulating this having not happened yet, 
leading to an inconsistent view of the token ring and permitting operations to 
reach consensus without overlapping quorums, due to token ownership 
disagreements. Paxos operations (which must be linearizable) are now able to 
detect this inconsistency themselves and enforce it, by updating the gossip 
state directly.
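
To make the "consensus without overlapping quorums" point concrete, here is a small sketch. The node numbers and replica sets are assumptions chosen for illustration and are not taken from the test: with a replication factor of 3, a stale view might place a key on nodes {1, 2, 3} while the up-to-date view places it on {2, 3, 4}. Two quorums drawn from the same view must share a node, but quorums drawn from different views need not.

```java
import java.util.HashSet;
import java.util.Set;

class QuorumOverlapSketch {
    // Majority quorum for a given replication factor.
    static int quorumSize(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // Convenience helper to build a set of node ids.
    static Set<Integer> nodes(int... ids) {
        Set<Integer> set = new HashSet<>();
        for (int id : ids) set.add(id);
        return set;
    }

    // True if the two quorums share at least one node.
    static boolean overlaps(Set<Integer> a, Set<Integer> b) {
        Set<Integer> common = new HashSet<>(a);
        common.retainAll(b);
        return !common.isEmpty();
    }

    public static void main(String[] args) {
        // Quorum picked under the stale view {1, 2, 3} and under the current view {2, 3, 4}.
        Set<Integer> staleQuorum = nodes(1, 2);
        Set<Integer> currentQuorum = nodes(3, 4);
        System.out.println("quorum size for RF=3: " + quorumSize(3));
        System.out.println("quorums overlap: " + overlaps(staleQuorum, currentQuorum));
    }
}
```

Because the two quorums are disjoint, neither operation is guaranteed to observe the other, which is why linearizable operations need the ring views to agree.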

Note that the {{Timeout}} is also a _valid outcome_ for this test, just a bad 
one for validating everything else is working correctly. The reason the 
{{Timeout}} occurs is that we explicitly drop messages to one of the _correct_ 
owners to ensure we can only succeed by contacting the _new_ owner, and if this 
race occurs we drop this message too, leaving only one correct owner that can 
respond.

bq. it looks like the other 'stale ring' tests are explicitly adding node4 
back, only this one does not.

Yes, they are testing other things.




was (Author: benedict):
bq. in node1's view, node4 has been decommissioned/assassinated and is in the 
LEFT state

No, in node1's view node4 has never joined the ring, as we invoke 
{{unsafeAnnulEndpoint}} after updating the token information via the LEFT 
state. This is just a convenient way to simulate node4 joining without node1 
receiving the gossip state.

bq. any gossip exchange between node1 and node4 will cause the state change 
back to the correct state of NORMAL ... and in the real world, may not get so 
lucky in timing to be from one

You have it backwards. Ordinary operations _depend_ on gossip disseminating 
this promptly for correctness. We are simulating this having not happened yet, 
leading to an inconsistent view of the token ring and permitting operations to 
reach consensus without overlapping quorums, due to token ownership 
disagreements. Paxos operations (which must be linearizable) are now able to 
detect this inconsistency themselves and enforce it, by updating the gossip 
state directly.

Note that the {{Timeout}} is also a _valid outcome_ for this test, just a bad 
one for validating everything else is working correctly. The reason the 
{{Timeout}} occurs is that we explicitly drop messages to one of the _correct_ 
owners to ensure we can only succeed by contacting the _new_ owner, and if this 
race occurs we drop this message too, leaving only one correct owner that can 
respond.

bq. it looks like the other 'stale ring' tests are explicitly adding node4 
back, only this one does not.

Yes, they are testing other things.



> Test Failure: 
> org.apache.cassandra.distributed.test.CASTest.testConflictingWritesWithStaleRingInformation
> -
>
> Key: CASSANDRA-17461
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17461
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/java
>Reporter: Andres de la Peña
>Priority: Normal
> Fix For: 4.1-beta, 4.x
>
>
> Intermittent failures on {{org.apache.cassandra.distributed.test.CASTest}} 
> for trunk:
> * 
> [testConflictingWritesWithStaleRingInformation|https://ci-cassandra.apache.org/job/Cassandra-trunk/1024/testReport/org.apache.cassandra.distributed.test/CASTest/testConflictingWritesWithStaleRingInformation_3/]
> * 
> [testSuccessfulWriteBeforeRangeMovement|https://ci-cassandra.apache.org/job/Cassandra-trunk/1025/testReport/org.apache.cassandra.distributed.test/CASTest/testSuccessfulWriteBeforeRangeMovement/]
> * 
> [testSuccessfulWriteDuringRangeMovementFollowedByConflicting|https://ci-cassandra.apache.org/job/Cassandra-trunk/1020/testReport/org.apache.cassandra.distributed.test/CASTest/testSuccessfulWriteDuringRangeMovementFollowedByConflicting/]
> * 
> [testSucccessfulWriteDuringRangeMovementFollowedByRead|https://ci-cassandra.apache.org/job/Cassandra-trunk/1020/testReport/org.apache.cassandra.distributed.test/CASTest/testSucccessfulWriteDuringRangeMovementFollowedByRead/]
> All four seem to have the same aspect:
> {code}
> Failed 2 times in the last 5 runs. Flakiness: 50%, Stability: 60%
> Error Message
> CAS operation timed out: received 1 of 2 required responses after 0 
> contention retries
> Stacktrace
> org.apache.cassandra.exceptions.CasWriteTimeoutException: CAS operation timed 
> out: received 1 of 2 required responses after 0 contention retries
>   at 
> org.apache.cassandra.service.paxos.Paxos$MaybeFailure.markAndThrowAsTimeoutOrFailure(Paxos.java:547)
>   at org.apache.cassandra.service.paxos.Paxos.begin(Paxos.java:1048)
>   at org.apache.cassandra.service.paxos.Paxos.cas(Paxos.java:659)
>   at org.apache.cassandra.service.paxos.Paxos.cas(Paxos.java:618)
>   at org.apache.cassandra.service.StorageProxy.cas(StorageProxy.java:307)
>   at 
> org.apache.cassandra.cql3.statements.ModificationStatement.executeWithCondition(ModificationStatement.java:500)
>   at 
> org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:467)
>   at 
> org.apache.cassandra.distributed.impl.Coordinator.unsafeExecuteInternal(Coordinator.java:122)
>   at 
> org.apache.cassandra.distributed.impl.Coordinator.unsafeExecuteInternal(Coordinator.java:103)
>   at 
> org.apache.cassandra.distributed.impl.Coordinator.lambda$executeWithResult$0(Coordinator.java:66)
>   at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:47)
>   at org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:57)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>

[jira] [Commented] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to



[ 
https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581912#comment-17581912
 ] 

Brandon Williams commented on CASSANDRA-13010:
--

Very recently we added CASSANDRA-16844 which started the whole compatibility 
flag situation we are in now, so I'm sure there will be others.

> nodetool compactionstats should say which disk a compaction is writing to
> -
>
> Key: CASSANDRA-13010
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13010
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Local/Compaction, Tool/nodetool
>Reporter: Jon Haddad
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Labels: 4.0-feature-freeze-review-requested
> Fix For: 4.x
>
> Attachments: cleanup.png, multiple operations.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>





[jira] [Commented] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to



[ 
https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581909#comment-17581909
 ] 

Stefan Miklosovic commented on CASSANDRA-13010:
---

One point to add: I do not think that, besides this "group by dir", there 
will ever be any other flag like that. What flag might that be?


[jira] [Commented] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to



[ 
https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581908#comment-17581908
 ] 

Stefan Miklosovic commented on CASSANDRA-13010:
---

But I see the problem with backward compatibility: we want to keep the 
non-vtable output the same as it was. Let's go with your idea. I have already 
rewritten (actually simplified) the patch and I am running the build.


[jira] [Commented] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to



[ 
https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581907#comment-17581907
 ] 

Brandon Williams commented on CASSANDRA-13010:
--

bq. Having said that, I have to admit that "verbose" is rather unfortunate name 
for that flag.

I thought this too, until you confirmed the future use of "verbose" and then it 
made sense.

bq.  Maybe "--group-by-target-dir" would be more appropriate?

This is exactly what I'm trying to avoid, because if we do that, then every 
time we want to change the output we'll have a new flag to append, and it will 
become a mess to get actual verbose output from compactionstats.

bq. If we go with vtable only, to achieve this "grouping by disk"

The nice thing about vtables is if you want different output, switch to cql and 
select it however you like.


[jira] [Commented] (CASSANDRA-17461) Test Failure: org.apache.cassandra.distributed.test.CASTest.testConflictingWritesWithStaleRingInformation



[ 
https://issues.apache.org/jira/browse/CASSANDRA-17461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581902#comment-17581902
 ] 

Brandon Williams commented on CASSANDRA-17461:
--

bq. We sort of need the ownership discrepancy to be discovered by the Paxos 
operation

Can you help me understand why this is the case?  At this point in time, in 
node1's view, node4 has been decommissioned/assassinated and is in the LEFT 
state, but node4's actual state is unchanged from NORMAL so any gossip exchange 
between node1 and node4 will cause the state change back to the correct state 
of NORMAL.  This doesn't have to be from a Paxos operation though (and in the 
real world, may not get so lucky in timing to be from one) and it looks like 
the other 'stale ring' tests are explicitly adding node4 back, only this one 
does not.

> Test Failure: 
> org.apache.cassandra.distributed.test.CASTest.testConflictingWritesWithStaleRingInformation
> -
>
> Key: CASSANDRA-17461
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17461
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/java
>Reporter: Andres de la Peña
>Priority: Normal
> Fix For: 4.1-beta, 4.x
>
>
> Intermittent failures on {{org.apache.cassandra.distributed.test.CASTest}} 
> for trunk:
> * 
> [testConflictingWritesWithStaleRingInformation|https://ci-cassandra.apache.org/job/Cassandra-trunk/1024/testReport/org.apache.cassandra.distributed.test/CASTest/testConflictingWritesWithStaleRingInformation_3/]
> * 
> [testSuccessfulWriteBeforeRangeMovement|https://ci-cassandra.apache.org/job/Cassandra-trunk/1025/testReport/org.apache.cassandra.distributed.test/CASTest/testSuccessfulWriteBeforeRangeMovement/]
> * 
> [testSuccessfulWriteDuringRangeMovementFollowedByConflicting|https://ci-cassandra.apache.org/job/Cassandra-trunk/1020/testReport/org.apache.cassandra.distributed.test/CASTest/testSuccessfulWriteDuringRangeMovementFollowedByConflicting/]
> * 
> [testSucccessfulWriteDuringRangeMovementFollowedByRead|https://ci-cassandra.apache.org/job/Cassandra-trunk/1020/testReport/org.apache.cassandra.distributed.test/CASTest/testSucccessfulWriteDuringRangeMovementFollowedByRead/]
> All four seem to have the same aspect:
> {code}
> Failed 2 times in the last 5 runs. Flakiness: 50%, Stability: 60%
> Error Message
> CAS operation timed out: received 1 of 2 required responses after 0 
> contention retries
> Stacktrace
> org.apache.cassandra.exceptions.CasWriteTimeoutException: CAS operation timed 
> out: received 1 of 2 required responses after 0 contention retries
>   at 
> org.apache.cassandra.service.paxos.Paxos$MaybeFailure.markAndThrowAsTimeoutOrFailure(Paxos.java:547)
>   at org.apache.cassandra.service.paxos.Paxos.begin(Paxos.java:1048)
>   at org.apache.cassandra.service.paxos.Paxos.cas(Paxos.java:659)
>   at org.apache.cassandra.service.paxos.Paxos.cas(Paxos.java:618)
>   at org.apache.cassandra.service.StorageProxy.cas(StorageProxy.java:307)
>   at 
> org.apache.cassandra.cql3.statements.ModificationStatement.executeWithCondition(ModificationStatement.java:500)
>   at 
> org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:467)
>   at 
> org.apache.cassandra.distributed.impl.Coordinator.unsafeExecuteInternal(Coordinator.java:122)
>   at 
> org.apache.cassandra.distributed.impl.Coordinator.unsafeExecuteInternal(Coordinator.java:103)
>   at 
> org.apache.cassandra.distributed.impl.Coordinator.lambda$executeWithResult$0(Coordinator.java:66)
>   at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:47)
>   at org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:57)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.base/java.lang.Thread.run(Thread.java:829)
> Standard Output
> DEBUG [main] 2022-03-19 16:20:42,868 Reflections.java:198 - going to scan 
> these urls: 
> [jar:file:/home/cassandra/cassandra/build/apache-cassandra-4.1-SNAPSHOT.jar!/,
>  
> jar:file:/home/cassandra/cassandra/build/test/lib/jars/simulator-bootstrap.jar!/,
>  
> jar:file:/home/cassandra/cassandra/build/test/lib/jars/dtest-api-0.0.12.jar!/,
>  file:/home/cassandra/cassandra/build/classes/fqltool/, 
> file:/home/cassandra/cassandra/build/test/classes/, 
> file:/home/cassandra/cassandra/build/classes/main/, file:/home/cass
> ...[truncated 4929659 chars]...
> gService.java:519 - Waiting for messaging service to quiesce
> INFO

[jira] [Comment Edited] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to



[ 
https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581900#comment-17581900
 ] 

Stefan Miklosovic edited comment on CASSANDRA-13010 at 8/19/22 3:00 PM:


If there is ever some other field added into CompactionInfo and we want to show 
it to a user, we would add it everywhere - to whatever output. So to answer 
your question, yes, we would add it under "verbose" as well.

Having said that, I have to admit that "verbose" is a rather unfortunate name for 
that flag. What I am trying to capture is having the output grouped per disk, 
as Jon suggested. Maybe "--group-by-target-dir" would be more appropriate?

If we go with vtable only, to achieve this "grouping by disk", we could at 
least order the output somehow but that is tricky, we would probably order the 
cql output too and that is just ... strange.
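
For readers following the "grouped per disk" idea, the grouping can be sketched as below. This is only an illustration: the Task class, its fields, and the directory paths are hypothetical and are not the CompactionInfo type or the nodetool code under review.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class GroupByTargetDirSketch {
    // Hypothetical stand-in for one running compaction.
    static final class Task {
        final String table;
        final String targetDirectory;

        Task(String table, String targetDirectory) {
            this.table = table;
            this.targetDirectory = targetDirectory;
        }
    }

    // Groups table names by the directory each compaction writes to.
    // A TreeMap keeps the directories in sorted order for stable output.
    static Map<String, List<String>> tablesByDirectory(List<Task> tasks) {
        return tasks.stream()
            .collect(Collectors.groupingBy(task -> task.targetDirectory,
                                           TreeMap::new,
                                           Collectors.mapping(task -> task.table, Collectors.toList())));
    }
}
```

A caller could then print one section per directory and list the tasks beneath it, leaving the ungrouped output untouched.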


was (Author: smiklosovic):
If there is ever some other field added into CompactionInfo and we want to show 
it to a user, we would add it everywhere - to whatever output. So to answer 
your question, yes, we would add it under "verbose" as well.

Having said that, I have to admit that "verbose" is rather unfortunate name for 
that flag. What I am trying to capture is to have the output grouped per disk 
as Jon suggested. Maybe "--group-by-targer-dir" would be more appropriate?

If we go with vtable only, to achieve this "grouping by disk", we could at 
least order the output somehow but that is tricky, we would probably order the 
cql output too and that is just ... strange.


[jira] [Commented] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to



[ 
https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581900#comment-17581900
 ] 

Stefan Miklosovic commented on CASSANDRA-13010:
---

If there is ever some other field added into CompactionInfo and we want to show 
it to a user, we would add it everywhere - to whatever output. So to answer 
your question, yes, we would add it under "verbose" as well.

Having said that, I have to admit that "verbose" is a rather unfortunate name 
for that flag. What I am trying to capture is to have the output grouped per 
disk, as Jon suggested. Maybe "--group-by-target-dir" would be more appropriate?

If we go with vtable only, we could at least order the output somehow to 
achieve this "grouping by disk", but that is tricky: we would probably have to 
order the CQL output too, and that is just ... strange.
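
The per-disk grouping discussed in this comment could be sketched roughly as 
follows. This is only an illustration in Python, not the patch's 
implementation: the CompactionEntry type, its fields, and the sample paths are 
hypothetical, and Cassandra's real CompactionInfo is a Java class with a 
different shape.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CompactionEntry:
    # Hypothetical stand-in for one row of compaction output.
    keyspace: str
    table: str
    target_directory: str


def group_by_target_directory(entries):
    """Return {directory: [entries]} with directories in sorted order."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry.target_directory].append(entry)
    return {directory: grouped[directory] for directory in sorted(grouped)}


def render_grouped(entries):
    """Render one header line per directory, followed by its compactions."""
    lines = []
    for directory, group in group_by_target_directory(entries).items():
        lines.append(directory)
        for entry in group:
            lines.append(f"  {entry.keyspace}.{entry.table}")
    return "\n".join(lines)


# Sample data, invented for illustration only.
entries = [
    CompactionEntry("ks1", "t1", "/data/disk2"),
    CompactionEntry("ks1", "t2", "/data/disk1"),
    CompactionEntry("ks2", "t3", "/data/disk2"),
]
print(render_grouped(entries))
```

A flat, vtable-style listing would instead carry the directory as one more 
column on every row, leaving any grouping to row ordering or to the reader.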


[jira] [Commented] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to



[ 
https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581893#comment-17581893
 ] 

Brandon Williams commented on CASSANDRA-13010:
--

The scenario I'd like to avoid is adding a flag each time we want to vary the 
output, and appending the vtable output is a good way of doing that.  The next 
time we want to add something here, are you suggesting we would also lump it 
under the verbose flag?


[jira] [Commented] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to



[ 
https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581873#comment-17581873
 ] 

Stefan Miklosovic commented on CASSANDRA-13010:
---

I am confused. What I have been trying to do all along is follow Jon's idea and 
write the output so that all compactions are grouped under the disk they write 
to. If we want this only for the vtable, do I understand correctly that we will 
no longer have such output? Each vtable entry for a compaction will contain 
just its respective directory, that is all. Is this OK for people? Doing things 
the way Jon suggested makes sense to me.


[jira] [Commented] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to



[ 
https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581858#comment-17581858
 ] 

Brandon Williams commented on CASSANDRA-13010:
--

bq. it is not possible to mix "verbose" with "vtable" output in nodetool

To be clear, I think we should drop the 'verbose' flag and leave this accessed 
via the vtable flag.  There's really no utility in having it available via two 
flags, and it's good to nudge people toward vtables when we can.


[jira] [Updated] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-13010:
-
Reviewers: Brandon Williams
   Status: Review In Progress  (was: Patch Available)


[jira] [Commented] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to



[ 
https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581842#comment-17581842
 ] 

Stefan Miklosovic commented on CASSANDRA-13010:
---

j11 precommit build 
https://app.circleci.com/pipelines/github/instaclustr/cassandra/1213/workflows/2ec59cb3-59fe-470f-bc33-3e7d09360501


[jira] [Commented] (CASSANDRA-17819) Test failure: org.apache.cassandra.distributed.test.SchemaTest.schemaReset

2022-08-19 Thread Jacek Lewandowski (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-17819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581809#comment-17581809
 ] 

Jacek Lewandowski commented on CASSANDRA-17819:
---

https://app.circleci.com/pipelines/github/jacek-lewandowski/cassandra/257/workflows/0b864a1b-efd9-4c08-8294-a1a891933820/jobs/1786

We will see what the rest of the tests look like.

> Test failure: org.apache.cassandra.distributed.test.SchemaTest.schemaReset
> --
>
> Key: CASSANDRA-17819
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17819
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/java
>Reporter: Andres de la Peña
>Assignee: Jacek Lewandowski
>Priority: Normal
> Fix For: 4.1-beta, 4.x
>
>
> The test 
> {{org.apache.cassandra.distributed.test.SchemaTest.schemaReset}}, 
> recently introduced by CASSANDRA-17658, is flaky on 4.1 and trunk:
>  * 4.1: 
> [https://ci-cassandra.apache.org/job/Cassandra-4.1/134/testReport/org.apache.cassandra.distributed.test/SchemaTest/schemaReset_2/]
>  * trunk: 
> [https://ci-cassandra.apache.org/job/Cassandra-trunk/1265/testReport/org.apache.cassandra.distributed.test/SchemaTest/schemaReset_2/]
> {code:java}
> Error Message
> Condition with lambda expression in 
> org.apache.cassandra.distributed.test.SchemaTest that uses 
> org.apache.cassandra.distributed.Cluster was not fulfilled within 1 minutes.
> Stacktrace
> org.awaitility.core.ConditionTimeoutException: Condition with lambda 
> expression in org.apache.cassandra.distributed.test.SchemaTest that uses 
> org.apache.cassandra.distributed.Cluster was not fulfilled within 1 minutes.
>   at org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:165)
>   at 
> org.awaitility.core.CallableCondition.await(CallableCondition.java:78)
>   at 
> org.awaitility.core.CallableCondition.await(CallableCondition.java:26)
>   at org.awaitility.core.ConditionFactory.until(ConditionFactory.java:895)
>   at org.awaitility.core.ConditionFactory.until(ConditionFactory.java:864)
>   at 
> org.apache.cassandra.distributed.test.SchemaTest.schemaReset(SchemaTest.java:115)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> Standard Output
> INFO  [main]  2022-08-15 15:02:14,783 Reflections.java:219 - 
> Reflections took 1873 ms to scan 8 urls, producing 1754 keys and 6912 values
> INFO  [main]  2022-08-15 15:02:16,407 Reflections.java:219 - 
> Reflections took 1561 ms to scan 8 urls, producing 1754 keys and 6912 values
> Node id topology:
> node 1: dc = datacenter0, rack = rack0
> node 2: dc = datacenter0, rack = rack0
> Configured node count: 2, nodeIdTopology size: 2
> DEBUG [main] node1 2022-08-15 15:02:17,554 InternalLoggerFactory.ja
> ...[truncated 1761288 chars]...
> cutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.lang.Thread.run(Thread.java:748)
> INFO  [node2_isolatedExecutor:3] node2 2022-08-15 15:03:52,096 
> MessagingService.java:519 - Waiting for messaging service to quiesce
> {code}




[jira] [Commented] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to



[ 
https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581775#comment-17581775
 ] 

Brandon Williams commented on CASSANDRA-13010:
--

We also need to run the j11 precommit pipeline.


[jira] [Commented] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to



[ 
https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581725#comment-17581725
 ] 

Stefan Miklosovic commented on CASSANDRA-13010:
---

The build with the latest changes is here: 
https://app.circleci.com/pipelines/github/instaclustr/cassandra/1213/workflows/b8445da5-3f64-46c5-8331-f72f90523835
The PR is the same: https://github.com/apache/cassandra/pull/1791


[jira] [Updated] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-13010:
--
Test and Documentation Plan: 
https://app.circleci.com/pipelines/github/instaclustr/cassandra/1213/workflows/b8445da5-3f64-46c5-8331-f72f90523835
  (was: 
https://app.circleci.com/pipelines/github/instaclustr/cassandra/1207/workflows/19676308-1a92-4b2d-a16a-ea8dad9054d5)


[jira] [Created] (CASSANDRA-17838) Cassandra COPY command ignore QUOTE option

2022-08-19 Thread Tarik Haddad (Jira)

Tarik Haddad created CASSANDRA-17838:


 Summary: Cassandra COPY command ignore QUOTE option
 Key: CASSANDRA-17838
 URL: https://issues.apache.org/jira/browse/CASSANDRA-17838
 Project: Cassandra
  Issue Type: Bug
Reporter: Tarik Haddad


I am not able to copy data into a CSV file and have the values wrapped in 
quote characters:

e.g.

*Table*: myks.mytable
||column1||column2||
|value1|value2|

command used: {{copy myks.mytable (column1,column2) to 'myfile.csv' WITH 
HEADER=FALSE AND DELIMITER=';' AND QUOTE='"';}}

*expected result:*

The CSV file to have one row: {{"value1";"value2"}}

*actual result:*

The CSV file has one row: {{value1:value2}}
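
The report does not say why the quotes are missing. One possible explanation, 
offered purely as an assumption and not confirmed anywhere in this report, is 
that the export goes through a CSV writer using "minimal" quoting, where the 
quote character is applied only to fields that need it (for example, fields 
containing the delimiter). The sketch below uses only Python's standard csv 
module to show the difference between minimal quoting and quoting every field; 
the render helper is hypothetical and is not cqlsh code.

```python
import csv
import io


def render(rows, quoting):
    """Write rows with ';' as delimiter and '"' as quote character."""
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        delimiter=";",
        quotechar='"',
        quoting=quoting,
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buf.getvalue()


rows = [["value1", "value2"]]

# Minimal quoting: plain values are written without quotes.
minimal = render(rows, csv.QUOTE_MINIMAL)

# Quote-all: every field is wrapped in the quote character.
quote_all = render(rows, csv.QUOTE_ALL)

# Minimal quoting still quotes a field that contains the delimiter.
needs_quotes = render([["a;b", "c"]], csv.QUOTE_MINIMAL)

print(minimal, quote_all, needs_quotes, sep="")
```

Under that assumption, the QUOTE option would select which character is used 
when quoting is needed, rather than forcing quotes around every value.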




[cassandra-website] branch asf-staging updated (71d8be370 -> cadc0f07d)