[jira] [Commented] (CASSANDRA-16456) Add Plugin Support for CQLSH

2022-04-22 Thread Brian Houser (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526772#comment-17526772
 ] 

Brian Houser commented on CASSANDRA-16456:
--

OK, cool. I will implement what I described, with the points you added. It will 
be done by the weekend, Pacific time.

> Add Plugin Support for CQLSH
> 
>
> Key: CASSANDRA-16456
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16456
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tool/cqlsh
>Reporter: Brian Houser
>Assignee: Brian Houser
>Priority: Normal
>  Labels: gsoc2021, mentor
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Currently the Cassandra drivers offer a plugin authenticator architecture for 
> the support of different authentication methods. This has been leveraged to 
> provide support for LDAP, Kerberos, and Sigv4 authentication. Unfortunately, 
> cqlsh, the included CLI tool, does not offer such support. Switching to a new 
> enhanced authentication scheme thus means being cut off from using cqlsh in 
> normal operation.
> We should have a means of using the same plugins and authentication providers 
> as the Python Cassandra driver.
> Here's a link to an initial draft of 
> [CEP|https://docs.google.com/document/d/1_G-OZCAEmDyuQuAN2wQUYUtZBEJpMkHWnkYELLhqvKc/edit?usp=sharing].



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[cassandra-website] branch asf-staging updated (6cf9cac7 -> 815fe0cb)

2022-04-22 Thread git-site-role
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


 discard 6cf9cac7 generate docs for 8fd077a6
 new 815fe0cb generate docs for 8fd077a6

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (6cf9cac7)
            \
             N -- N -- N   refs/heads/asf-staging (815fe0cb)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 site-ui/build/ui-bundle.zip | Bin 4740078 -> 4740078 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)





[jira] [Updated] (CASSANDRA-17166) Enhance SnakeYAML properties to be reusable outside of YAML parsing, support camel case conversion to snake case, and add support to ignore properties

2022-04-22 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-17166:
--
  Fix Version/s: 4.1
 (was: 4.x)
Source Control Link: 
https://github.com/apache/cassandra/commit/9b7e50b29bd029fc2151789306dc28864e1fc689
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Enhance SnakeYAML properties to be reusable outside of YAML parsing, support 
> camel case conversion to snake case, and add support to ignore properties
> --
>
> Key: CASSANDRA-17166
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17166
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.1
>
>  Time Spent: 15h 10m
>  Remaining Estimate: 0h
>
> SnakeYaml is rather limited in the “object mapping” layer, which forces our 
> internal code to match specific patterns (all fields public and camel case); 
> we can remove this restriction by leveraging Jackson for property lookup, and 
> leaving the YAML handling to SnakeYAML
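
The camel case to snake case conversion mentioned in the description can be illustrated with a small standalone sketch. This is not the Jackson-based code from the patch: the class `SnakeCaseNames` and the method `toSnakeCase` are hypothetical names chosen for illustration, and the sketch assumes plain ASCII identifiers without runs of capitals (acronyms).

```java
final class SnakeCaseNames {
    private SnakeCaseNames() {}

    /** Converts a camelCase identifier to snake_case (hypothetical helper, illustration only). */
    static String toSnakeCase(String camel) {
        StringBuilder out = new StringBuilder(camel.length() + 4);
        for (int i = 0; i < camel.length(); i++) {
            char c = camel.charAt(i);
            if (Character.isUpperCase(c)) {
                // Start a new word, unless we are at the beginning or just wrote an underscore.
                if (i > 0 && out.charAt(out.length() - 1) != '_') {
                    out.append('_');
                }
                out.append(Character.toLowerCase(c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toSnakeCase("maxValueSize")); // prints "max_value_size"
    }
}
```

Names that are already snake case pass through unchanged, so such a helper could be applied uniformly to both naming styles.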






[jira] [Commented] (CASSANDRA-17166) Enhance SnakeYAML properties to be reusable outside of YAML parsing, support camel case conversion to snake case, and add support to ignore properties

2022-04-22 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526749#comment-17526749
 ] 

David Capwell commented on CASSANDRA-17166:
---

Starting commit

CI Results (pending):
||Branch||Source||Circle CI||Jenkins||
|trunk|[branch|https://github.com/dcapwell/cassandra/tree/commit_remote_branch/CASSANDRA-17166-trunk-B0041C5D-C3FD-41B0-8F73-BC6B8C01DCA0]|[build|https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-17166-trunk-B0041C5D-C3FD-41B0-8F73-BC6B8C01DCA0]|[build|https://ci-cassandra.apache.org/job/Cassandra-devbranch/1627/]|


> Enhance SnakeYAML properties to be reusable outside of YAML parsing, support 
> camel case conversion to snake case, and add support to ignore properties
> --
>
> Key: CASSANDRA-17166
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17166
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 15h 10m
>  Remaining Estimate: 0h
>
> SnakeYaml is rather limited in the “object mapping” layer, which forces our 
> internal code to match specific patterns (all fields public and camel case); 
> we can remove this restriction by leveraging Jackson for property lookup, and 
> leaving the YAML handling to SnakeYAML






[jira] [Commented] (CASSANDRA-17425) Add new CQL function maxWritetime

2022-04-22 Thread Yifan Cai (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526743#comment-17526743
 ] 

Yifan Cai commented on CASSANDRA-17425:
---

I rebased my patch on trunk and created a new PR: 
[https://github.com/apache/cassandra/pull/1584]

CI: 
[https://app.circleci.com/pipelines/github/yifan-c/cassandra?branch=CASSANDRA-17425%2Ftrunk-new&filter=all]

There are 2 commits. The first commit is the rebased original implementation. 
The second and optional commit implements the function based on WritetimeOrTTL. 
[~adelapena], can you review next week? 

> Add new CQL function maxWritetime
> -
>
> Key: CASSANDRA-17425
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17425
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL/Syntax
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The function "writetime" does not support multi-cell types, e.g. collections 
> and UDT. It would be useful to enable querying the latest modified timestamp 
> of a column value. 
> I'd like to propose to add a new function named "maxWritetime", which returns 
> the largest timestamp amongst the cells. When being applied to the single 
> cell types, it returns the same result as "writetime".
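
The proposed semantics can be sketched in a few lines. This is only an illustration, not the patch itself: the class `MaxWritetimeSketch` is hypothetical, cell timestamps are modelled as a plain `long[]`, and returning `null` for a value with no cells is an assumption.

```java
final class MaxWritetimeSketch {
    private MaxWritetimeSketch() {}

    /** Returns the largest cell timestamp, or null when there are no cells (assumed behaviour). */
    static Long maxWritetime(long[] cellTimestamps) {
        if (cellTimestamps.length == 0) {
            return null;
        }
        long max = cellTimestamps[0];
        for (long ts : cellTimestamps) {
            max = Math.max(max, ts);
        }
        return max;
    }

    public static void main(String[] args) {
        // A multi-cell value (e.g. a collection) has one timestamp per cell.
        System.out.println(maxWritetime(new long[] {1650000000000000L, 1650000005000000L}));
        // A single-cell value yields the same result "writetime" would.
        System.out.println(maxWritetime(new long[] {1650000000000000L}));
    }
}
```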






[cassandra-website] branch asf-staging updated (c636267c -> 6cf9cac7)

2022-04-22 Thread git-site-role
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


 discard c636267c generate docs for 8fd077a6
 new 6cf9cac7 generate docs for 8fd077a6


Summary of changes:
 .../doc/4.1/cassandra/tools/nodetool/compact.html  |   6 +-
 .../latest/cassandra/tools/nodetool/compact.html   |   6 +-
 .../trunk/cassandra/tools/nodetool/compact.html    |   6 +-
 content/search-index.js                            |   2 +-
 site-ui/build/ui-bundle.zip                        | Bin 4740078 -> 4740078 bytes
 5 files changed, 16 insertions(+), 4 deletions(-)





[jira] [Updated] (CASSANDRA-17166) Enhance SnakeYAML properties to be reusable outside of YAML parsing, support camel case conversion to snake case, and add support to ignore properties

2022-04-22 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-17166:
--
Status: Ready to Commit  (was: Review In Progress)

2 +1s

> Enhance SnakeYAML properties to be reusable outside of YAML parsing, support 
> camel case conversion to snake case, and add support to ignore properties
> --
>
> Key: CASSANDRA-17166
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17166
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 15h 10m
>  Remaining Estimate: 0h
>
> SnakeYaml is rather limited in the “object mapping” layer, which forces our 
> internal code to match specific patterns (all fields public and camel case); 
> we can remove this restriction by leveraging Jackson for property lookup, and 
> leaving the YAML handling to SnakeYAML






[jira] [Updated] (CASSANDRA-17537) nodetool compact should support using a key string to find the range to avoid operators having to manually do this

2022-04-22 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-17537:
--
  Fix Version/s: 4.1
Source Control Link: 
https://github.com/apache/cassandra/commit/2b90ac1a1671b4071d9aa6f18e852021bc66702d
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> nodetool compact should support using a key string to find the range to avoid 
> operators having to manually do this
> --
>
> Key: CASSANDRA-17537
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17537
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Local/Compaction, Tool/nodetool
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.1
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> It's common that a single key needs to be compacted, and operators need to do 
> the following:
> 1) go from key -> token
> 2) generate range
> 3) call nodetool compact with this range
> We can simplify this workflow by adding this to compact:
> nodetool compact ks.tbl -k "key1"
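
Step 2 of the manual workflow above can be sketched as follows. This is an assumption-laden illustration, not code from the patch: tokens are modelled as signed 64-bit values, the class `SingleTokenRange` is hypothetical, and step 1 (hashing the key with the partitioner) is not reproduced here.

```java
final class SingleTokenRange {
    final long startExclusive;
    final long endInclusive;

    private SingleTokenRange(long startExclusive, long endInclusive) {
        this.startExclusive = startExclusive;
        this.endInclusive = endInclusive;
    }

    /** Builds the smallest range (token - 1, token] that contains exactly the given token. */
    static SingleTokenRange forToken(long token) {
        if (token == Long.MIN_VALUE) {
            // token - 1 would overflow; this sketch does not handle that edge case.
            throw new IllegalArgumentException("no predecessor for the minimum token");
        }
        return new SingleTokenRange(token - 1, token);
    }

    /** True when the token lies in (startExclusive, endInclusive]. */
    boolean contains(long token) {
        return token > startExclusive && token <= endInclusive;
    }

    public static void main(String[] args) {
        SingleTokenRange range = forToken(100L);
        // The two bounds are what an operator would pass as the start and end of the range.
        System.out.println("(" + range.startExclusive + ", " + range.endInclusive + "]");
    }
}
```

With the proposed `-k` option the operator would no longer need to compute these bounds by hand.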






[cassandra] branch trunk updated: nodetool compact should support using a key string to find the range to avoid operators having to manually do this

2022-04-22 Thread dcapwell
This is an automated email from the ASF dual-hosted git repository.

dcapwell pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/trunk by this push:
 new 2b90ac1a16 nodetool compact should support using a key string to find 
the range to avoid operators having to manually do this
2b90ac1a16 is described below

commit 2b90ac1a1671b4071d9aa6f18e852021bc66702d
Author: David Capwell 
AuthorDate: Thu Apr 21 14:37:59 2022 -0700

nodetool compact should support using a key string to find the range to 
avoid operators having to manually do this

patch by David Capwell; reviewed by Marcus Eriksson for CASSANDRA-17537
---
 CHANGES.txt|   1 +
 .../org/apache/cassandra/db/ColumnFamilyStore.java |   5 +
 .../db/compaction/CompactionController.java|   6 +-
 .../cassandra/db/compaction/CompactionManager.java |  30 +-
 .../apache/cassandra/dht/Murmur3Partitioner.java   |   5 +
 .../cassandra/io/sstable/format/SSTableReader.java |   7 ++
 .../apache/cassandra/service/StorageService.java   |  49 +-
 .../cassandra/service/StorageServiceMBean.java |   7 ++
 src/java/org/apache/cassandra/tools/NodeProbe.java |   5 +
 .../apache/cassandra/tools/nodetool/Compact.java   |  13 ++-
 .../org/apache/cassandra/tools/ToolRunner.java |  53 ++
 .../cassandra/tools/nodetool/CompactTest.java  | 107 +
 12 files changed, 274 insertions(+), 14 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index 9e9e1ee2f1..972f760442 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.1
+ * nodetool compact should support using a key string to find the range to avoid operators having to manually do this (CASSANDRA-17537)
  * Add guardrail for data disk usage (CASSANDRA-17150)
  * Tool to list data paths of existing tables (CASSANDRA-17568)
  * Migrate track_warnings to more standard naming conventions and use latest configuration types rather than long (CASSANDRA-17560)
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index 35ca94214d..47dd66d7ae 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -2365,6 +2365,11 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
         return tokenRanges;
     }
 
+    public void forceCompactionForKey(DecoratedKey key)
+    {
+        CompactionManager.instance.forceCompactionForKey(this, key);
+    }
+
     public static Iterable<ColumnFamilyStore> all()
     {
         List<Iterable<ColumnFamilyStore>> stores = new ArrayList<>(Schema.instance.getKeyspaces().size());
diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionController.java b/src/java/org/apache/cassandra/db/compaction/CompactionController.java
index e1b0f32583..814292f207 100644
--- a/src/java/org/apache/cassandra/db/compaction/CompactionController.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionController.java
@@ -34,7 +34,6 @@ import org.apache.cassandra.io.sstable.format.SSTableReader;
 import org.apache.cassandra.io.util.FileDataInput;
 import org.apache.cassandra.io.util.FileUtils;
 import org.apache.cassandra.schema.CompactionParams.TombstoneOption;
-import org.apache.cassandra.utils.AlwaysPresentFilter;
 import org.apache.cassandra.utils.OverlapIterator;
 import org.apache.cassandra.utils.concurrent.Refs;
 
@@ -255,10 +254,7 @@ public class CompactionController extends AbstractCompactionController
 
         for (SSTableReader sstable: filteredSSTables)
         {
-            // if we don't have bloom filter(bf_fp_chance=1.0 or filter file is missing),
-            // we check index file instead.
-            if (sstable.getBloomFilter() instanceof AlwaysPresentFilter && sstable.getPosition(key, SSTableReader.Operator.EQ, false) != null
-                || sstable.getBloomFilter().isPresent(key))
+            if (sstable.maybePresent(key))
             {
                 minTimestampSeen = Math.min(minTimestampSeen, sstable.getMinTimestamp());
                 hasTimestamp = true;
diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index 47ed3d5e11..165e1e02f3 100644
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@ -27,6 +27,7 @@ import java.util.concurrent.TimeUnit;
 import java.util.concurrent.atomic.AtomicInteger;
 import java.util.function.BooleanSupplier;
 import java.util.function.Predicate;
+import java.util.function.Supplier;
 import java.util.stream.Collectors;
 import javax.management.openmbean.OpenDataException;
 import javax.management.openmbean.TabularData;
@@ -928,10 +929,10 @@ public class CompactionManager implements CompactionManagerMBean
  

[jira] [Commented] (CASSANDRA-17537) nodetool compact should support using a key string to find the range to avoid operators having to manually do this

2022-04-22 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526702#comment-17526702
 ] 

Brandon Williams commented on CASSANDRA-17537:
--

That is an old, known flaky test: CASSANDRA-16677

> nodetool compact should support using a key string to find the range to avoid 
> operators having to manually do this
> --
>
> Key: CASSANDRA-17537
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17537
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Local/Compaction, Tool/nodetool
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> It's common that a single key needs to be compacted, and operators need to do 
> the following:
> 1) go from key -> token
> 2) generate range
> 3) call nodetool compact with this range
> We can simplify this workflow by adding this to compact:
> nodetool compact ks.tbl -k "key1"






[jira] [Commented] (CASSANDRA-17537) nodetool compact should support using a key string to find the range to avoid operators having to manually do this

2022-04-22 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526701#comment-17526701
 ] 

David Capwell commented on CASSANDRA-17537:
---

CI was clean other than org.apache.cassandra.net.ConnectionTest; rerunning it 
locally as I think it's just flaky. If it passes, I will merge.

> nodetool compact should support using a key string to find the range to avoid 
> operators having to manually do this
> --
>
> Key: CASSANDRA-17537
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17537
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Local/Compaction, Tool/nodetool
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> It's common that a single key needs to be compacted, and operators need to do 
> the following:
> 1) go from key -> token
> 2) generate range
> 3) call nodetool compact with this range
> We can simplify this workflow by adding this to compact:
> nodetool compact ks.tbl -k "key1"






[jira] [Updated] (CASSANDRA-16555) Add out-of-the-box snitch for Ec2 IMDSv2

2022-04-22 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-16555:
-
Reviewers: Brandon Williams

> Add out-of-the-box snitch for Ec2 IMDSv2
> 
>
> Key: CASSANDRA-16555
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16555
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Consistency/Coordination
>Reporter: Paul Rütter (BlueConic)
>Assignee: fulco taen
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In order to patch a vulnerability, Amazon came up with a new version of their 
> metadata service.
> It's no longer unrestricted but now requires a token (in a header), in order 
> to access the metadata service.
> See 
> [https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html]
>  for more information.
> Cassandra currently doesn't offer an out-of-the-box snitch class to support 
> this.
> See 
> [https://cassandra.apache.org/doc/latest/operating/snitch.html#snitch-classes]
> This issue asks to add support for this as a separate snitch class.
> We'll probably do a PR for this, as we are in the process of developing one.






[cassandra-website] branch asf-staging updated (5ec74b34 -> c636267c)

2022-04-22 Thread git-site-role
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


 discard 5ec74b34 generate docs for 8fd077a6
 new c636267c generate docs for 8fd077a6


Summary of changes:
 content/search-index.js |   2 +-
 site-ui/build/ui-bundle.zip | Bin 4740078 -> 4740078 bytes
 2 files changed, 1 insertion(+), 1 deletion(-)





[jira] [Updated] (CASSANDRA-17576) Make GuardrailDiskUsageTest deterministic

2022-04-22 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-17576:

Fix Version/s: 4.1

> Make GuardrailDiskUsageTest deterministic
> -
>
> Key: CASSANDRA-17576
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17576
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Guardrails
>Reporter: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.1
>
>
> Jenkins is low on space so we should mock the amount of available disk space 
> when testing the disk usage guardrails as otherwise the tests fail.
> The issue was not seen before commit as CircleCI doesn't have storage 
> problems.
> [https://ci-cassandra.apache.org/job/Cassandra-trunk/1095/]
> Cc [~adelapena] 






[jira] [Updated] (CASSANDRA-17576) Make GuardrailDiskUsageTest deterministic

2022-04-22 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-17576:

 Bug Category: Parent values: Code(13163)
   Complexity: Normal
  Component/s: Feature/Guardrails
Discovered By: Unit Test
 Severity: Low
   Status: Open  (was: Triage Needed)

> Make GuardrailDiskUsageTest deterministic
> -
>
> Key: CASSANDRA-17576
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17576
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Guardrails
>Reporter: Ekaterina Dimitrova
>Priority: Normal
>
> Jenkins is low on space so we should mock the amount of available disk space 
> when testing the disk usage guardrails as otherwise the tests fail.
> The issue was not seen before commit as CircleCI doesn't have storage 
> problems.
> [https://ci-cassandra.apache.org/job/Cassandra-trunk/1095/]
> Cc [~adelapena] 






[jira] [Created] (CASSANDRA-17576) Make GuardrailDiskUsageTest deterministic

2022-04-22 Thread Ekaterina Dimitrova (Jira)
Ekaterina Dimitrova created CASSANDRA-17576:
---

 Summary: Make GuardrailDiskUsageTest deterministic
 Key: CASSANDRA-17576
 URL: https://issues.apache.org/jira/browse/CASSANDRA-17576
 Project: Cassandra
  Issue Type: Bug
Reporter: Ekaterina Dimitrova


Jenkins is low on space so we should mock the amount of available disk space 
when testing the disk usage guardrails as otherwise the tests fail.

The issue was not seen before commit as CircleCI doesn't have storage problems.
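
One way to make such a test independent of the machine it runs on is to inject the source of the "usable bytes" figure. The following is a minimal sketch under that assumption; `DiskUsageSketch` and its methods are hypothetical and do not reflect how the guardrail or its test is actually implemented.

```java
import java.util.function.LongSupplier;

final class DiskUsageSketch {
    private final long totalBytes;
    private final LongSupplier usableBytes;

    DiskUsageSketch(long totalBytes, LongSupplier usableBytes) {
        if (totalBytes <= 0) {
            throw new IllegalArgumentException("totalBytes must be positive");
        }
        this.totalBytes = totalBytes;
        this.usableBytes = usableBytes;
    }

    /** Percentage of the disk in use, rounded down. */
    int usedPercent() {
        long used = totalBytes - usableBytes.getAsLong();
        return (int) (used * 100 / totalBytes);
    }

    /** True when usage is strictly above the given threshold. */
    boolean exceeds(int maxPercent) {
        return usedPercent() > maxPercent;
    }

    public static void main(String[] args) {
        // Production code could pass () -> new java.io.File("/").getUsableSpace();
        // a test passes a fixed value so the result does not depend on the CI host.
        DiskUsageSketch nearlyFull = new DiskUsageSketch(1000L, () -> 50L);
        System.out.println(nearlyFull.usedPercent()); // prints 95
    }
}
```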

[https://ci-cassandra.apache.org/job/Cassandra-trunk/1095/]

Cc [~adelapena] 






[jira] [Updated] (CASSANDRA-17575) forceCompactionForTokenRange when using a wrapped range may include sstables not within that range

2022-04-22 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-17575:
--
 Bug Category: Parent values: Correctness(12982) Level 1 values: API / Semantic Implementation(12988)
   Complexity: Normal
Discovered By: Unit Test
Fix Version/s: 4.x
 Severity: Normal
   Status: Open  (was: Triage Needed)

> forceCompactionForTokenRange when using a wrapped range may include sstables 
> not within that range
> --
>
> Key: CASSANDRA-17575
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17575
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: David Capwell
>Priority: Normal
> Fix For: 4.x
>
>
> This was found in CASSANDRA-17537
> When you compact the range (32, 31], this should include everything BUT 32, 
> but the test 
> org.apache.cassandra.db.compaction.LeveledCompactionStrategyTest#testTokenRangeCompaction
> found that SSTables with the bounds (32, 32) were being included in the 
> set of SSTables to compact.






[jira] [Created] (CASSANDRA-17575) forceCompactionForTokenRange when using a wrapped range may include sstables not within that range

2022-04-22 Thread David Capwell (Jira)
David Capwell created CASSANDRA-17575:
-

 Summary: forceCompactionForTokenRange when using a wrapped range 
may include sstables not within that range
 Key: CASSANDRA-17575
 URL: https://issues.apache.org/jira/browse/CASSANDRA-17575
 Project: Cassandra
  Issue Type: Bug
  Components: Local/Compaction
Reporter: David Capwell


This was found in CASSANDRA-17537

When you compact the range (32, 31], this should include everything BUT 32, but 
the test 
org.apache.cassandra.db.compaction.LeveledCompactionStrategyTest#testTokenRangeCompaction
found that SSTables with the bounds (32, 32) were being included in the 
set of SSTables to compact.
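
The expected semantics of a wrapped range can be sketched as follows. This is an illustration only, with tokens modelled as `long` values and an SSTable modelled by its first and last token (inclusive); `WrappedRangeSketch` is a hypothetical class and is not the `sstablesInBounds` code under discussion.

```java
final class WrappedRangeSketch {
    private WrappedRangeSketch() {}

    /** True when token lies in (left, right], where left >= right means the range wraps around. */
    static boolean contains(long left, long right, long token) {
        if (left < right) {
            return token > left && token <= right;
        }
        // Wrapped: everything after left, plus everything up to and including right.
        return token > left || token <= right;
    }

    /** True when an SSTable covering [first, last] (first <= last) holds any token of (left, right]. */
    static boolean overlaps(long left, long right, long first, long last) {
        if (left < right) {
            return last > left && first <= right;
        }
        return last > left || first <= right;
    }

    public static void main(String[] args) {
        System.out.println(contains(32, 31, 32));     // prints false: 32 is the one excluded token
        System.out.println(contains(32, 31, 33));     // prints true
        System.out.println(overlaps(32, 31, 32, 32)); // prints false: this SSTable holds only token 32
    }
}
```

Under these semantics an SSTable whose only token is 32 should not be selected for the range (32, 31], which is the behaviour the ticket describes as expected.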






[jira] [Commented] (CASSANDRA-17537) nodetool compact should support using a key string to find the range to avoid operators having to manually do this

2022-04-22 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526671#comment-17526671
 ] 

David Capwell commented on CASSANDRA-17537:
---

Spoke with [~marcuse], and it looks like sstablesInBounds is returning SSTables 
not within the range, which causes the above to fail. I am filing a separate 
ticket for this and have removed the assert from this block.

> nodetool compact should support using a key string to find the range to avoid 
> operators having to manually do this
> --
>
> Key: CASSANDRA-17537
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17537
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Local/Compaction, Tool/nodetool
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> It's common that a single key needs to be compacted, and operators need to do 
> the following:
> 1) go from key -> token
> 2) generate range
> 3) call nodetool compact with this range
> We can simplify this workflow by adding this to compact:
> nodetool compact ks.tbl -k "key1"






[cassandra-website] branch asf-staging updated (3cb0927f -> 5ec74b34)

2022-04-22 Thread git-site-role
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


 discard 3cb0927f generate docs for 8fd077a6
 new 5ec74b34 generate docs for 8fd077a6


Summary of changes:
 site-ui/build/ui-bundle.zip | Bin 4740078 -> 4740078 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)





[jira] [Updated] (CASSANDRA-15510) BTree: Improve Building, Inserting and Transforming

2022-04-22 Thread Benjamin Lerer (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-15510:
---
  Fix Version/s: 4.0.5
 4.1
 (was: 4.x)
 (was: 4.0.x)
Source Control Link: 
https://github.com/apache/cassandra/commit/018c8e0d5e8bc55fc51d3361fcb27c3c1fd189f6
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

Committed into cassandra-4.0 at 018c8e0d5e8bc55fc51d3361fcb27c3c1fd189f6 and 
merged into trunk

> BTree: Improve Building, Inserting and Transforming
> ---
>
> Key: CASSANDRA-15510
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15510
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Other
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.0.5, 4.1
>
>  Time Spent: 10h
>  Remaining Estimate: 0h
>
> This work was originally undertaken as a follow-up to CASSANDRA-15367 to 
> ensure performance is strictly improved, but it may no longer be needed for 
> that purpose.  It’s still hugely impactful, however.  It remains to be 
> decided where this should land.
> The current {{BTree}} implementation is suboptimal in a number of ways, with 
> very little focus having been given to its performance besides its 
> memory-occupancy.  This patch aims to address that, specifically improving 
> the performance and allocations involved in: building, transforming and 
> inserting into a tree.
> To facilitate this work, the {{BTree}} definition is modified slightly, so 
> that we can perform some simple arithmetic on tree sizes.  Specifically, 
> trees of depth n are defined to have a maximum capacity of {{branchFactor^n - 
> 1}}, which translates into capping the number of leaf children at 
> {{branchFactor-1}}, as opposed to {{branchFactor}}.  Since {{branchFactor}} 
> is a power of 2, this permits fast tree size arithmetic, enabling some of 
> these changes.
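
The size arithmetic described in the paragraph above can be illustrated with a short sketch. It is not the patch's code: the class and method names are hypothetical, and a branch shift of 5 (branchFactor = 32) is assumed purely for the example. Because branchFactor is a power of 2, both the capacity of a depth and the minimum depth for a size reduce to shifts and a leading-zero count.

```java
// Illustrative only: BRANCH_SHIFT = 5 is an assumed value, not taken from the patch.
final class BTreeSizeArithmetic
{
    static final int BRANCH_SHIFT = 5;                   // assumed; branchFactor = 2^5
    static final int BRANCH_FACTOR = 1 << BRANCH_SHIFT;  // 32

    /** Maximum number of keys in a tree of the given depth: branchFactor^depth - 1. */
    static long maxCapacity(int depth)
    {
        if (depth < 1 || BRANCH_SHIFT * depth > 62)
            throw new IllegalArgumentException("depth out of range for 64-bit arithmetic");
        return (1L << (BRANCH_SHIFT * depth)) - 1;
    }

    /** Smallest depth whose maximum capacity is at least the given size (size >= 1). */
    static int minDepth(long size)
    {
        if (size < 1)
            throw new IllegalArgumentException("size must be positive");
        int bits = 64 - Long.numberOfLeadingZeros(size);  // bits needed to represent size
        return (bits + BRANCH_SHIFT - 1) / BRANCH_SHIFT;  // ceil(bits / BRANCH_SHIFT)
    }
}
```

Under these assumptions a leaf (depth 1) holds at most 31 keys, maxCapacity(2) is 1023, and a tree of 1024 elements needs depth 3.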
> h2. Building
> The static build method has been modified to utilise dedicated 
> {{buildPerfect}} methods that build either perfectly dense or perfectly 
> sparse sub-trees.  These perfect trees all share their {{sizeMap}} with each 
> other, and can be built more efficiently than trees of arbitrary size.  The 
> specifics are described in detail in the comments, but this building block 
> can be used to construct trees of any size, using at most one child at each 
> level that is not either perfectly sparse or perfectly dense.  Bulk methods 
> are used where possible.
> For large trees this can produce up to 30x throughput improvement and 30% 
> allocation reduction vs 3.0 (TBC, and to be tested vs 4.0).
> {{FastBuilder}} is introduced for building a tree in-order (or in reverse) 
> without duplicate elements to resolve, without necessarily knowing the size 
> upfront.  This meets the needs of most use cases.  Data is built directly 
> into nodes, with up to one already-constructed node, and one partially 
> constructed node, on each level, being mutated to share their contents in the 
> event of insufficient data to populate the tree.  These builders are 
> thread-locally shared.  This leads to minimal copying, the same sharing of 
> {{sizeMap}} as above, zero wasted allocations, and results in minimal 
> difference in performance between utilising the less-ergonomic static build 
> and builder approach.
> For large trees this leads to ~4.5x throughput improvement, and 70% reduction 
> in allocations vs a normal Builder.  For small trees performance is 
> comparable, but allocations similarly reduced.
> h2. Inserting
> It turns out that we only ever insert another tree into a tree, so we exploit 
> this to implement an efficient union of two trees, operating on them directly 
> via stacks in the transformer, instead of via a collection interface.  A 
> builder-like object is introduced that shares functionality with 
> {{FastBuilder}}, and permits us to build the result of the union directly 
> into the final nodes, reusing as much of the original trees as possible.  
> Bulk methods are used where possible.
> The result is not _uniformly_ faster, but is _significantly_ faster on 
> average: median _improvement_ of 1.4x (that is, 2.4x total throughput), mean 
> improvement of 10x.  Worst reduction is 30%, and it may be that we can 
> isolate and alleviate that.  Allocations are also reduced significantly, with 
> a median of 30% and mean of 42% for the tested workloads.  As the trees get 
> larger the improvement drops, but allocations remain uniformly lower.
> h2. Transforming
> Transformation garbage overhead is minimal, i.e. the main allocations are 
> those necessary to represent the new 

[cassandra] 01/01: Merge branch cassandra-4.0 into trunk

2022-04-22 Thread blerer
This is an automated email from the ASF dual-hosted git repository.

blerer pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit 003a96b6a6f649f99138b94c52d28b73c2c3547a
Merge: 2723c91878 018c8e0d5e
Author: Benjamin Lerer 
AuthorDate: Fri Apr 22 19:28:32 2022 +0200

Merge branch cassandra-4.0 into trunk






[cassandra] branch trunk updated (2723c91878 -> 003a96b6a6)

2022-04-22 Thread blerer
This is an automated email from the ASF dual-hosted git repository.

blerer pushed a change to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


from 2723c91878 Merge branch 'cassandra-4.0' into trunk
 add 018c8e0d5e Optimise BTree build, update and transform operations
 new 003a96b6a6 Merge branch cassandra-4.0 into trunk

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:





[cassandra] branch cassandra-4.0 updated (2873c91269 -> 018c8e0d5e)

2022-04-22 Thread blerer
This is an automated email from the ASF dual-hosted git repository.

blerer pushed a change to branch cassandra-4.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git


from 2873c91269 Split ReadRepairQueryTypesTest to avoid JUnit timeouts
 add 018c8e0d5e Optimise BTree build, update and transform operations

No new revisions were added by this update.

Summary of changes:
 CHANGES.txt|3 +
 build.xml  |4 +-
 src/java/org/apache/cassandra/db/Columns.java  |4 +-
 .../db/partitions/AtomicBTreePartition.java|   14 +-
 .../cassandra/db/partitions/PartitionUpdate.java   |6 +-
 .../org/apache/cassandra/db/rows/BTreeRow.java |2 +-
 .../cassandra/db/rows/ComplexColumnData.java   |   27 +-
 src/java/org/apache/cassandra/db/rows/Row.java |2 +-
 .../org/apache/cassandra/utils/BulkIterator.java   |  112 +
 .../org/apache/cassandra/utils/btree/BTree.java| 3485 +---
 .../apache/cassandra/utils/btree/BTreeRemoval.java |   12 +-
 .../org/apache/cassandra/utils/btree/BTreeSet.java |   46 +-
 .../apache/cassandra/utils/btree/NodeBuilder.java  |  441 ---
 .../apache/cassandra/utils/btree/TreeBuilder.java  |  121 -
 .../cassandra/utils/btree/UpdateFunction.java  |   32 +-
 .../utils/caching/TinyThreadLocalPool.java |   85 +
 .../org/apache/cassandra/utils/LongBTreeTest.java  |  587 ++--
 .../BTreeBench.java}   |   75 +-
 .../test/microbench/btree/BTreeBuildBench.java |  127 +
 .../test/microbench/btree/BTreeTransformBench.java |  194 ++
 .../test/microbench/btree/BTreeUpdateBench.java|  324 ++
 .../test/microbench/btree/IntVisitor.java  |   85 +
 .../test/microbench/btree/Megamorphism.java|  169 +
 .../cassandra/utils/btree/BTreeRemovalTest.java|   17 +-
 .../utils/btree/BTreeSearchIteratorTest.java   |6 +-
 .../apache/cassandra/utils/btree/BTreeTest.java|  239 +-
 26 files changed, 4712 insertions(+), 1507 deletions(-)
 create mode 100644 src/java/org/apache/cassandra/utils/BulkIterator.java
 delete mode 100644 src/java/org/apache/cassandra/utils/btree/NodeBuilder.java
 delete mode 100644 src/java/org/apache/cassandra/utils/btree/TreeBuilder.java
 create mode 100644 
src/java/org/apache/cassandra/utils/caching/TinyThreadLocalPool.java
 copy 
test/microbench/org/apache/cassandra/test/microbench/{BTreeBuildBench.java => 
btree/BTreeBench.java} (54%)
 create mode 100644 
test/microbench/org/apache/cassandra/test/microbench/btree/BTreeBuildBench.java
 create mode 100644 
test/microbench/org/apache/cassandra/test/microbench/btree/BTreeTransformBench.java
 create mode 100644 
test/microbench/org/apache/cassandra/test/microbench/btree/BTreeUpdateBench.java
 create mode 100644 
test/microbench/org/apache/cassandra/test/microbench/btree/IntVisitor.java
 create mode 100644 
test/microbench/org/apache/cassandra/test/microbench/btree/Megamorphism.java





[jira] [Updated] (CASSANDRA-17543) ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: strategy=NONE coordinator=1 flush=false paging=false] times out sporadically

2022-04-22 Thread Jira


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andres de la Peña updated CASSANDRA-17543:
--
  Fix Version/s: 4.1
 4.0.4
 (was: 4.x)
  Since Version: 4.0.0
Source Control Link: 
https://github.com/apache/cassandra/commit/2873c9126979e21a8089e9a18d96af802745dbc2
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: strategy=NONE 
> coordinator=1 flush=false paging=false] times out sporadically
> ---
>
> Key: CASSANDRA-17543
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17543
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/java
>Reporter: Caleb Rackliffe
>Assignee: Andres de la Peña
>Priority: Normal
> Fix For: 4.1, 4.0.4
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> org.apache.cassandra.distributed.test.ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8:
>  strategy=NONE coordinator=1 flush=false paging=false]
> {noformat}
> Error Message
> Timeout occurred. Please note the time in the report does not reflect the 
> time until the timeout.
> Stacktrace
> junit.framework.AssertionFailedError: Timeout occurred. Please note the time 
> in the report does not reflect the time until the timeout.
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.util.Vector.forEach(Vector.java:1388)
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.util.Vector.forEach(Vector.java:1388)
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.util.Vector.forEach(Vector.java:1388)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at org.apache.cassandra.anttasks.TestHelper.execute(TestHelper.java:53)
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.util.Vector.forEach(Vector.java:1388)
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {noformat}
> See 
> https://ci-cassandra.apache.org/job/Cassandra-trunk/1075/testReport/org.apache.cassandra.distributed.test/ReadRepairQueryTypesTest/testUnrestrictedQueryOnSkinnyTable_8__strategy_NONE_coordinator_1_flush_false_paging_false_/






[cassandra-website] branch asf-staging updated (eb4d1ab0 -> 3cb0927f)

2022-04-22 Thread git-site-role
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


 discard eb4d1ab0 generate docs for 8fd077a6
 new 3cb0927f generate docs for 8fd077a6

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (eb4d1ab0)
\
 N -- N -- N   refs/heads/asf-staging (3cb0927f)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../cassandra/configuration/cass_yaml_file.html|  53 --
 .../4.1/cassandra/tools/nodetool/bootstrap.html|   8 +--
 .../nodetool/{refresh.html => datapaths.html}  |  21 +++---
 .../doc/4.1/cassandra/tools/nodetool/nodetool.html |  12 ++--
 .../4.1/cassandra/tools/nodetool/repair_admin.html |  80 ++---
 .../cassandra/troubleshooting/use_nodetool.html|  46 
 .../cassandra/configuration/cass_yaml_file.html|  53 --
 .../latest/cassandra/tools/nodetool/bootstrap.html |   8 +--
 .../cassandra/tools/nodetool/datapaths.html}   |  21 +++---
 .../latest/cassandra/tools/nodetool/nodetool.html  |  12 ++--
 .../cassandra/tools/nodetool/repair_admin.html |  80 ++---
 .../cassandra/troubleshooting/use_nodetool.html|  46 
 .../cassandra/configuration/cass_yaml_file.html|  53 --
 .../trunk/cassandra/tools/nodetool/bootstrap.html  |   8 +--
 .../cassandra/tools/nodetool/datapaths.html}   |  21 +++---
 .../trunk/cassandra/tools/nodetool/nodetool.html   |  12 ++--
 .../cassandra/tools/nodetool/repair_admin.html |  80 ++---
 .../cassandra/troubleshooting/use_nodetool.html|  46 
 content/search-index.js|   2 +-
 site-ui/build/ui-bundle.zip| Bin 4740078 -> 4740078 
bytes
 20 files changed, 469 insertions(+), 193 deletions(-)
 copy content/doc/4.1/cassandra/tools/nodetool/{refresh.html => datapaths.html} 
(98%)
 copy content/doc/{4.1/cassandra/tools/nodetool/refresh.html => 
latest/cassandra/tools/nodetool/datapaths.html} (98%)
 copy content/doc/{4.1/cassandra/tools/nodetool/refresh.html => 
trunk/cassandra/tools/nodetool/datapaths.html} (98%)





[jira] [Updated] (CASSANDRA-17543) ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: strategy=NONE coordinator=1 flush=false paging=false] times out sporadically

2022-04-22 Thread Jira


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andres de la Peña updated CASSANDRA-17543:
--
Status: Ready to Commit  (was: Review In Progress)

> ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: strategy=NONE 
> coordinator=1 flush=false paging=false] times out sporadically
> ---
>
> Key: CASSANDRA-17543
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17543
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/java
>Reporter: Caleb Rackliffe
>Assignee: Andres de la Peña
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>






[jira] [Commented] (CASSANDRA-17543) ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: strategy=NONE coordinator=1 flush=false paging=false] times out sporadically

2022-04-22 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526547#comment-17526547
 ] 

Andres de la Peña commented on CASSANDRA-17543:
---

Thanks, committed to {{cassandra-4.0}} as 
[2873c9126979e21a8089e9a18d96af802745dbc2|https://github.com/apache/cassandra/commit/2873c9126979e21a8089e9a18d96af802745dbc2]
 and [merged into 
{{trunk}}|https://github.com/apache/cassandra/commit/2723c91878cfd7005a53f6118015c484dacc0f32]

> ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: strategy=NONE 
> coordinator=1 flush=false paging=false] times out sporadically
> ---
>
> Key: CASSANDRA-17543
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17543
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/java
>Reporter: Caleb Rackliffe
>Assignee: Andres de la Peña
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>






[cassandra] branch cassandra-4.0 updated: Split ReadRepairQueryTypesTest to avoid JUnit timeouts

2022-04-22 Thread adelapena
This is an automated email from the ASF dual-hosted git repository.

adelapena pushed a commit to branch cassandra-4.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/cassandra-4.0 by this push:
 new 2873c91269 Split ReadRepairQueryTypesTest to avoid JUnit timeouts
2873c91269 is described below

commit 2873c9126979e21a8089e9a18d96af802745dbc2
Author: Andrés de la Peña 
AuthorDate: Wed Apr 13 12:09:17 2022 +0100

Split ReadRepairQueryTypesTest to avoid JUnit timeouts

patch by Andrés de la Peña; reviewed by Caleb Rackliffe for CASSANDRA-17543
---
 .../test/ReadRepairCollectionQueriesTest.java  |  236 
 .../distributed/test/ReadRepairInQueriesTest.java  |  247 
 .../test/ReadRepairPointQueriesTest.java   |   79 ++
 .../distributed/test/ReadRepairQueryTester.java|  280 +
 .../distributed/test/ReadRepairQueryTypesTest.java | 1192 
 .../test/ReadRepairRangeQueriesTest.java   |  261 +
 .../test/ReadRepairSliceQueriesTest.java   |  145 +++
 .../test/ReadRepairUnrestrictedQueriesTest.java|  116 ++
 8 files changed, 1364 insertions(+), 1192 deletions(-)

diff --git 
a/test/distributed/org/apache/cassandra/distributed/test/ReadRepairCollectionQueriesTest.java
 
b/test/distributed/org/apache/cassandra/distributed/test/ReadRepairCollectionQueriesTest.java
new file mode 100644
index 00..6149ffc93f
--- /dev/null
+++ 
b/test/distributed/org/apache/cassandra/distributed/test/ReadRepairCollectionQueriesTest.java
@@ -0,0 +1,236 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.distributed.test;
+
+import org.junit.Test;
+
+import static org.apache.cassandra.distributed.shared.AssertUtils.row;
+
+/**
+ * {@link ReadRepairQueryTester} for queries on collections.
+ */
+public class ReadRepairCollectionQueriesTest extends ReadRepairQueryTester
+{
+/**
+ * Test unrestricted queries with frozen tuples.
+ */
+@Test
+public void testTuple()
+{
+tester("")
+.createTable("CREATE TABLE %s (k int PRIMARY KEY, a tuple<int, int>, b tuple<int, int>)")
+.mutate("INSERT INTO %s (k, a, b) VALUES (0, (1, 2), (3, 4))")
+.queryColumns("a", 1, 1,
+  rows(row(tuple(1, 2))),
+  rows(row(0, tuple(1, 2), tuple(3, 4))),
+  rows(row(0, tuple(1, 2), null)))
+.deleteColumn("DELETE a FROM %s WHERE k=0", "b", 0, 1,
+  rows(row(tuple(3, 4))),
+  rows(row(0, null, tuple(3, 4))),
+  rows(row(0, tuple(1, 2), tuple(3, 4))))
+.deleteRows("DELETE FROM %s WHERE k=0", 1,
+rows(),
+rows(row(0, null, tuple(3, 4))))
+.tearDown();
+}
+
+/**
+ * Test unrestricted queries with frozen sets.
+ */
+@Test
+public void testFrozenSet()
+{
+tester("")
+.createTable("CREATE TABLE %s (k int PRIMARY KEY, a frozen<set<int>>, b frozen<set<int>>)")
+.mutate("INSERT INTO %s (k, a, b) VALUES (0, {1, 2}, {3, 4})")
+.queryColumns("a[1]", 1, 1,
+  rows(row(1)),
+  rows(row(0, set(1, 2), set(3, 4))),
+  rows(row(0, set(1, 2), null)))
+.deleteColumn("DELETE a FROM %s WHERE k=0", "b[4]", 0, 1,
+  rows(row(4)),
+  rows(row(0, null, set(3, 4))),
+  rows(row(0, set(1, 2), set(3, 4))))
+.deleteRows("DELETE FROM %s WHERE k=0", 1,
+rows(),
+rows(row(0, null, set(3, 4))))
+.tearDown();
+}
+
+/**
+ * Test unrestricted queries with frozen lists.
+ */
+@Test
+public void testFrozenList()
+{
+tester("")
+.createTable("CREATE TABLE %s (k int PRIMARY KEY, a frozen<list<int>>, b frozen<list<int>>)")
+.mutate("INSERT INTO %s (k, a, b) VALUES (0, [1, 2], [3, 4])")
+.queryColumns("a", 1, 1,
+  rows(row(list(1, 2))),
+  rows(row(0, list(1, 2), list(3, 4))),
+  rows(row(0, list(1, 2), null)))
+

[cassandra] branch trunk updated (b3842de5cf -> 2723c91878)

2022-04-22 Thread adelapena
This is an automated email from the ASF dual-hosted git repository.

adelapena pushed a change to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


from b3842de5cf Add guardrail for data disk usage
 new 2873c91269 Split ReadRepairQueryTypesTest to avoid JUnit timeouts
 new 2723c91878 Merge branch 'cassandra-4.0' into trunk

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../test/ReadRepairCollectionQueriesTest.java  |  236 
 .../distributed/test/ReadRepairInQueriesTest.java  |  247 
 .../test/ReadRepairPointQueriesTest.java   |   79 ++
 .../distributed/test/ReadRepairQueryTester.java|  279 +
 .../distributed/test/ReadRepairQueryTypesTest.java | 1191 
 .../test/ReadRepairRangeQueriesTest.java   |  261 +
 .../test/ReadRepairSliceQueriesTest.java   |  145 +++
 .../test/ReadRepairUnrestrictedQueriesTest.java|  116 ++
 8 files changed, 1363 insertions(+), 1191 deletions(-)
 create mode 100644 
test/distributed/org/apache/cassandra/distributed/test/ReadRepairCollectionQueriesTest.java
 create mode 100644 
test/distributed/org/apache/cassandra/distributed/test/ReadRepairInQueriesTest.java
 create mode 100644 
test/distributed/org/apache/cassandra/distributed/test/ReadRepairPointQueriesTest.java
 create mode 100644 
test/distributed/org/apache/cassandra/distributed/test/ReadRepairQueryTester.java
 delete mode 100644 
test/distributed/org/apache/cassandra/distributed/test/ReadRepairQueryTypesTest.java
 create mode 100644 
test/distributed/org/apache/cassandra/distributed/test/ReadRepairRangeQueriesTest.java
 create mode 100644 
test/distributed/org/apache/cassandra/distributed/test/ReadRepairSliceQueriesTest.java
 create mode 100644 
test/distributed/org/apache/cassandra/distributed/test/ReadRepairUnrestrictedQueriesTest.java





[cassandra] 01/01: Merge branch 'cassandra-4.0' into trunk

2022-04-22 Thread adelapena
This is an automated email from the ASF dual-hosted git repository.

adelapena pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit 2723c91878cfd7005a53f6118015c484dacc0f32
Merge: b3842de5cf 2873c91269
Author: Andrés de la Peña 
AuthorDate: Fri Apr 22 17:30:22 2022 +0100

Merge branch 'cassandra-4.0' into trunk

 .../test/ReadRepairCollectionQueriesTest.java  |  236 
 .../distributed/test/ReadRepairInQueriesTest.java  |  247 
 .../test/ReadRepairPointQueriesTest.java   |   79 ++
 .../distributed/test/ReadRepairQueryTester.java|  279 +
 .../distributed/test/ReadRepairQueryTypesTest.java | 1191 
 .../test/ReadRepairRangeQueriesTest.java   |  261 +
 .../test/ReadRepairSliceQueriesTest.java   |  145 +++
 .../test/ReadRepairUnrestrictedQueriesTest.java|  116 ++
 8 files changed, 1363 insertions(+), 1191 deletions(-)

diff --cc test/distributed/org/apache/cassandra/distributed/test/ReadRepairQueryTester.java
index 00,10bf05021b..26516104fb
mode 00,100644..100644
--- a/test/distributed/org/apache/cassandra/distributed/test/ReadRepairQueryTester.java
+++ b/test/distributed/org/apache/cassandra/distributed/test/ReadRepairQueryTester.java
@@@ -1,0 -1,280 +1,279 @@@
+ /*
+  * Licensed to the Apache Software Foundation (ASF) under one
+  * or more contributor license agreements.  See the NOTICE file
+  * distributed with this work for additional information
+  * regarding copyright ownership.  The ASF licenses this file
+  * to you under the Apache License, Version 2.0 (the
+  * "License"); you may not use this file except in compliance
+  * with the License.  You may obtain a copy of the License at
+  *
+  * http://www.apache.org/licenses/LICENSE-2.0
+  *
+  * Unless required by applicable law or agreed to in writing, software
+  * distributed under the License is distributed on an "AS IS" BASIS,
+  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  * See the License for the specific language governing permissions and
+  * limitations under the License.
+  */
+ 
+ package org.apache.cassandra.distributed.test;
+ 
+ import java.io.IOException;
+ import java.util.ArrayList;
+ import java.util.Collection;
+ import java.util.List;
+ 
+ import org.junit.AfterClass;
+ import org.junit.BeforeClass;
+ import org.junit.runner.RunWith;
+ import org.junit.runners.Parameterized;
+ 
+ import org.apache.cassandra.distributed.Cluster;
+ import org.apache.cassandra.service.reads.repair.ReadRepairStrategy;
+ 
 -import static java.util.concurrent.TimeUnit.MINUTES;
+ import static org.apache.cassandra.distributed.shared.AssertUtils.assertEquals;
+ import static org.apache.cassandra.distributed.shared.AssertUtils.assertRows;
+ import static org.apache.cassandra.service.reads.repair.ReadRepairStrategy.NONE;
+ 
+ /**
+  * Base class for tests around read repair functionality with different query types and schemas.
+  * 
+  * Each test verifies that its tested query triggers read repair propagating mismatched rows/columns and row/column
+  * deletions. They also verify that the selected rows and columns are propagated through read repair on mismatch,
+  * and that unselected rows/columns are not repaired.
+  * 
+  * The tests are parameterized for:
+  * 
+  * 
+  * Data to be repaired residing on the query coordinator or a replica
+  * Data to be repaired residing on memtables or flushed to sstables
+  * 
+  * 
+  * All derived tests follow a similar pattern:
+  * 
+  * Create a keyspace with RF=2 and a table
+  * Insert some data in only one of the nodes
+  * Run the tested read query selecting a subset of the inserted columns with CL=ALL
+  * Verify that the previous read has triggered read repair propagating only the queried columns
+  * Run the tested read query again but this time selecting all the columns
+  * Verify that the previous read has triggered read repair propagating the rest of the queried rows
+  * Delete one of the involved columns in just one node
+  * Run the tested read query again but this time selecting a column different to the deleted one
+  * Verify that the previous read hasn't propagated the column deletion
+  * Run the tested read query again selecting all the columns
+  * Verify that the previous read has triggered read repair propagating the column deletion
+  * Delete one of the involved rows in just one node
+  * Run the tested read query again selecting all the columns
+  * Verify that the previous read has triggered read repair propagating the row deletions
+  * Verify the final status of each node and drop the table
+  * 
+  */
+ @RunWith(Parameterized.class)
+ public abstract class ReadRepairQueryTester extends TestBaseImpl
+ {
+ private static final int NUM_NODES = 2;
+ 
+ /**
+  * The read repair strategy to be used
+  */
+ 

[jira] [Commented] (CASSANDRA-15510) BTree: Improve Building, Inserting and Transforming

2022-04-22 Thread Benjamin Lerer (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526543#comment-17526543
 ] 

Benjamin Lerer commented on CASSANDRA-15510:


CI runs for 
[4.0|https://app.circleci.com/pipelines/github/blerer/cassandra/284/workflows/76db84a8-a6a1-4364-85ce-72ed5f12081f]
 and 
[trunk|https://app.circleci.com/pipelines/github/blerer/cassandra/287/workflows/f9ad1572-460f-4b99-bfbd-ac3edaac61da]

> BTree: Improve Building, Inserting and Transforming
> ---
>
> Key: CASSANDRA-15510
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15510
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Other
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.0.x, 4.x
>
>  Time Spent: 10h
>  Remaining Estimate: 0h
>
> This work was originally undertaken as a follow-up to CASSANDRA-15367 to 
> ensure performance is strictly improved, but it may no longer be needed for 
> that purpose.  It’s still hugely impactful, however.  It remains to be 
> decided where this should land.
> The current {{BTree}} implementation is suboptimal in a number of ways, with 
> very little focus having been given to its performance besides its 
> memory-occupancy.  This patch aims to address that, specifically improving 
> the performance and allocations involved in: building, transforming and 
> inserting into a tree.
> To facilitate this work, the {{BTree}} definition is modified slightly, so 
> that we can perform some simple arithmetic on tree sizes.  Specifically, 
> trees of depth n are defined to have a maximum capacity of {{branchFactor^n - 
> 1}}, which translates into capping the number of leaf children at 
> {{branchFactor-1}}, as opposed to {{branchFactor}}.  Since {{branchFactor}} 
> is a power of 2, this permits fast tree size arithmetic, enabling some of 
> these changes.
> h2. Building
> The static build method has been modified to utilise dedicated 
> {{buildPerfect}} methods that build either perfectly dense or perfectly 
> sparse sub-trees.  These perfect trees all share their {{sizeMap}} with each 
> other, and can be built more efficiently than trees of arbitrary size.  The 
> specifics are described in detail in the comments, but this building block 
> can be used to construct trees of any size, using at most one child at each 
> level that is not either perfectly sparse or perfectly dense.  Bulk methods 
> are used where possible.
> For large trees this can produce up to 30x throughput improvement and 30% 
> allocation reduction vs 3.0 (TBC, and to be tested vs 4.0).
> {{FastBuilder}} is introduced for building a tree in-order (or in reverse) 
> without duplicate elements to resolve, without necessarily knowing the size 
> upfront.  This meets the needs of most use cases.  Data is built directly 
> into nodes, with up to one already-constructed node, and one partially 
> constructed node, on each level, being mutated to share their contents in the 
> event of insufficient data to populate the tree.  These builders are 
> thread-locally shared.  This leads to minimal copying, the same sharing of 
> {{sizeMap}} as above, zero wasted allocations, and results in minimal 
> difference in performance between utilising the less-ergonomic static build 
> and builder approach.
> For large trees this leads to ~4.5x throughput improvement, and 70% reduction 
> in allocations vs a normal Builder.  For small trees performance is 
> comparable, but allocations similarly reduced.
> h2. Inserting
> It turns out that we only ever insert another tree into a tree, so we exploit 
> this to implement an efficient union of two trees, operating on them directly 
> via stacks in the transformer, instead of via a collection interface.  A 
> builder-like object is introduced that shares functionality with 
> {{FastBuilder}}, and permits us to build the result of the union directly 
> into the final nodes, reusing as much of the original trees as possible.  
> Bulk methods are used where possible.
> The result is not _uniformly_ faster, but is _significantly_ faster on 
> average: median _improvement_ of 1.4x (that is, 2.4x total throughput), mean 
> improvement of 10x.  Worst reduction is 30%, and it may be that we can 
> isolate and alleviate that.  Allocations are also reduced significantly, with 
> a median of 30% and mean of 42% for the tested workloads.  As the trees get 
> larger the improvement drops, but remains uniformly lower.
> h2. Transforming
> The garbage overhead of transformations is minimal, i.e. the main allocations are 
> those necessary to represent the new tree.  It is significantly faster and 
> particularly more efficient when removing elements, utilising the shared 
> functionality of the 
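
The capacity rule quoted in the description above (a tree of depth n holds at most branchFactor^n - 1 elements, with branchFactor a power of 2) can be illustrated with a little shift arithmetic. The following is only a sketch of that arithmetic, not code from the patch: the class name, the method names and the branch factor of 32 used in main are assumptions made for illustration.

```java
// Hypothetical helper illustrating the tree size arithmetic; not part of Cassandra's BTree class.
final class BTreeCapacitySketch
{
    /** Maximum number of elements in a tree of the given depth: branchFactor^depth - 1. */
    static long maxCapacity(int branchShift, int depth)
    {
        if (branchShift <= 0 || depth <= 0 || (long) branchShift * depth >= 63)
            throw new IllegalArgumentException("unsupported shift/depth: " + branchShift + '/' + depth);
        // branchFactor == 1 << branchShift, so branchFactor^depth == 1 << (branchShift * depth)
        return (1L << (branchShift * depth)) - 1;
    }

    /** Smallest depth whose maximum capacity can hold the given number of elements. */
    static int minDepth(int branchShift, long size)
    {
        if (size < 0)
            throw new IllegalArgumentException("negative size: " + size);
        int depth = 1;
        while (maxCapacity(branchShift, depth) < size)
            depth++;
        return depth;
    }

    public static void main(String[] args)
    {
        int shift = 5; // assumed branch factor of 32, chosen only for this example
        for (int depth = 1; depth <= 3; depth++)
            System.out.println("depth " + depth + " holds at most " + maxCapacity(shift, depth));
    }
}
```

Because the branch factor is a power of 2, the exponentiation reduces to a single shift, which is presumably the "fast tree size arithmetic" the description refers to. With leaves capped at branchFactor - 1 elements, a depth-2 tree holds branchFactor * (branchFactor - 1) + (branchFactor - 1) = branchFactor^2 - 1 elements, matching the formula.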

[jira] [Commented] (CASSANDRA-17370) Add flag enabling operators to restrict use of ALLOW FILTERING in queries

2022-04-22 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526541#comment-17526541
 ] 

Josh McKenzie commented on CASSANDRA-17370:
---

+1 here

> Add flag enabling operators to restrict use of ALLOW FILTERING in queries
> -
>
> Key: CASSANDRA-17370
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17370
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL/Semantics, Feature/Guardrails
>Reporter: Savni Nagarkar
>Assignee: Savni Nagarkar
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> This ticket adds the ability for operators to disallow use of ALLOW FILTERING 
> predicates in CQL SELECT statements. As queries that ALLOW FILTERING can 
> place additional load on the database, the flag enables operators to provide 
> tighter bounds on performance guarantees. The patch includes a new yaml 
> property, as well as a hot property enabling the value to be modified via JMX 
> at runtime.
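
As a rough illustration of the mechanism described in the ticket (a flag loaded from configuration that can also be flipped at runtime), the sketch below uses a volatile boolean that request threads consult before accepting a query. It is not the patch itself: the class, method and message names are hypothetical, and the JMX wiring is left out.

```java
// Hypothetical sketch of a runtime-switchable ALLOW FILTERING restriction.
final class AllowFilteringGuardSketch
{
    // volatile so that an update made by a management thread (for example through
    // a JMX setter) becomes visible to the threads validating queries
    private volatile boolean allowFilteringEnabled;

    AllowFilteringGuardSketch(boolean initialValue)
    {
        // the initial value would come from the yaml property
        this.allowFilteringEnabled = initialValue;
    }

    boolean getAllowFilteringEnabled()
    {
        return allowFilteringEnabled;
    }

    void setAllowFilteringEnabled(boolean enabled)
    {
        this.allowFilteringEnabled = enabled;
    }

    /** Rejects the query if it uses ALLOW FILTERING while the operator has disabled it. */
    void validate(boolean queryUsesAllowFiltering)
    {
        if (queryUsesAllowFiltering && !allowFilteringEnabled)
            throw new IllegalStateException("ALLOW FILTERING has been disabled by the operator");
    }

    public static void main(String[] args)
    {
        AllowFilteringGuardSketch guard = new AllowFilteringGuardSketch(true);
        guard.validate(true);                  // accepted while the flag is on
        guard.setAllowFilteringEnabled(false); // simulates a runtime change
        try
        {
            guard.validate(true);
        }
        catch (IllegalStateException e)
        {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Queries that do not use ALLOW FILTERING are unaffected by the flag in this sketch, so operators can tighten the restriction without touching well-behaved workloads.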






[jira] [Commented] (CASSANDRA-17543) ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: strategy=NONE coordinator=1 flush=false paging=false] times out sporadically

2022-04-22 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526524#comment-17526524
 ] 

Caleb Rackliffe commented on CASSANDRA-17543:
-

Go for it. The only failure I see is {{test_oversized_mutation}}, so it's 
"clean" :)

> ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: strategy=NONE 
> coordinator=1 flush=false paging=false] times out sporadically
> ---
>
> Key: CASSANDRA-17543
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17543
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/java
>Reporter: Caleb Rackliffe
>Assignee: Andres de la Peña
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> org.apache.cassandra.distributed.test.ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8:
>  strategy=NONE coordinator=1 flush=false paging=false]
> {noformat}
> Error Message
> Timeout occurred. Please note the time in the report does not reflect the 
> time until the timeout.
> Stacktrace
> junit.framework.AssertionFailedError: Timeout occurred. Please note the time 
> in the report does not reflect the time until the timeout.
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.util.Vector.forEach(Vector.java:1388)
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.util.Vector.forEach(Vector.java:1388)
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.util.Vector.forEach(Vector.java:1388)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at org.apache.cassandra.anttasks.TestHelper.execute(TestHelper.java:53)
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.util.Vector.forEach(Vector.java:1388)
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {noformat}
> See 
> https://ci-cassandra.apache.org/job/Cassandra-trunk/1075/testReport/org.apache.cassandra.distributed.test/ReadRepairQueryTypesTest/testUnrestrictedQueryOnSkinnyTable_8__strategy_NONE_coordinator_1_flush_false_paging_false_/






[jira] [Updated] (CASSANDRA-16555) Add out-of-the-box snitch for Ec2 IMDSv2

2022-04-22 Thread Jeremiah Jordan (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremiah Jordan updated CASSANDRA-16555:

Test and Documentation Plan: New snitch should be added to the docs.
 Status: Patch Available  (was: Open)

Just noticed this PR today.  Moving this to Patch Available; it looks like it was 
never transitioned and slipped through.

> Add out-of-the-box snitch for Ec2 IMDSv2
> 
>
> Key: CASSANDRA-16555
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16555
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Consistency/Coordination
>Reporter: Paul Rütter (BlueConic)
>Assignee: fulco taen
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In order to patch a vulnerability, Amazon came up with a new version of their 
> metadata service.
> It's no longer unrestricted but now requires a token (in a header), in order 
> to access the metadata service.
> See 
> [https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html]
>  for more information.
> Cassandra currently doesn't offer an out-of-the-box snitch class to support 
> this.
> See 
> [https://cassandra.apache.org/doc/latest/operating/snitch.html#snitch-classes]
> This issue asks to add support for this as a separate snitch class.
> We'll probably do a PR for this, as we are in the process of developing one.
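
For readers unfamiliar with the token requirement mentioned in the description, the sketch below builds the two HTTP requests that IMDSv2 expects: a PUT that obtains a session token, then a GET that presents the token in a header. The endpoint and header names are the ones documented by AWS; the class and method names are hypothetical, the requests are only constructed (not sent), and this is not the snitch implementation proposed in the ticket.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical sketch of the IMDSv2 request flow; requires Java 11+ for java.net.http.
final class Imdsv2RequestSketch
{
    static final String BASE = "http://169.254.169.254";
    static final String TOKEN_TTL_HEADER = "X-aws-ec2-metadata-token-ttl-seconds";
    static final String TOKEN_HEADER = "X-aws-ec2-metadata-token";

    /** Step 1: PUT request asking the metadata service for a session token. */
    static HttpRequest tokenRequest(int ttlSeconds)
    {
        return HttpRequest.newBuilder(URI.create(BASE + "/latest/api/token"))
                          .timeout(Duration.ofSeconds(2))
                          .header(TOKEN_TTL_HEADER, Integer.toString(ttlSeconds))
                          .PUT(HttpRequest.BodyPublishers.noBody())
                          .build();
    }

    /** Step 2: GET request for a metadata path, carrying the token from step 1. */
    static HttpRequest metadataRequest(String path, String token)
    {
        return HttpRequest.newBuilder(URI.create(BASE + "/latest/meta-data/" + path))
                          .timeout(Duration.ofSeconds(2))
                          .header(TOKEN_HEADER, token)
                          .GET()
                          .build();
    }

    public static void main(String[] args)
    {
        HttpRequest token = tokenRequest(21600);
        System.out.println(token.method() + " " + token.uri());

        // In a real client the token would be the response body of the request above.
        HttpRequest zone = metadataRequest("placement/availability-zone", "token-from-step-1");
        System.out.println(zone.method() + " " + zone.uri());
    }
}
```

A snitch built on this flow would send the first request with an HttpClient, cache the returned token until its TTL expires, and use the availability zone from the second request to derive the datacenter and rack.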






[jira] [Assigned] (CASSANDRA-16555) Add out-of-the-box snitch for Ec2 IMDSv2

2022-04-22 Thread Jeremiah Jordan (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremiah Jordan reassigned CASSANDRA-16555:
---

Assignee: fulco taen

> Add out-of-the-box snitch for Ec2 IMDSv2
> 
>
> Key: CASSANDRA-16555
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16555
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Consistency/Coordination
>Reporter: Paul Rütter (BlueConic)
>Assignee: fulco taen
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In order to patch a vulnerability, Amazon came up with a new version of their 
> metadata service.
> It's no longer unrestricted but now requires a token (in a header), in order 
> to access the metadata service.
> See 
> [https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html]
>  for more information.
> Cassandra currently doesn't offer an out-of-the-box snitch class to support 
> this.
> See 
> [https://cassandra.apache.org/doc/latest/operating/snitch.html#snitch-classes]
> This issue asks to add support for this as a separate snitch class.
> We'll probably do a PR for this, as we are in the process of developing one.






[jira] [Updated] (CASSANDRA-17150) Guardrails for disk usage

2022-04-22 Thread Jira


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andres de la Peña updated CASSANDRA-17150:
--
  Fix Version/s: 4.1
 (was: 4.x)
Source Control Link: 
https://github.com/apache/cassandra/commit/b3842de5cf1fa1b81872effb4585fbc7e1873d59
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Guardrails for disk usage
> -
>
> Key: CASSANDRA-17150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17150
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/Guardrails
>Reporter: Andres de la Peña
>Assignee: Andres de la Peña
>Priority: Normal
> Fix For: 4.1
>
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>
> Add guardrails for disk usage establishing soft/hard limits on the percentage 
> of used disk space. For example:
> {code}
> # Warning threshold to warn when local disk usage exceeds threshold. Valid 
> values: (1, 100]
> # Defaults to -1 to disable.
> # disk_usage_percentage_warn_threshold: -1
> # Failure threshold to reject write requests if replica disk usage exceeds 
> threshold. Valid values: (1, 100]
> # Defaults to -1 to disable.
> # disk_usage_percentage_failure_threshold: -1
> {code}
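
The soft/hard limit semantics in the description above can be sketched as a small classification function. This is only an illustration under stated assumptions: the class, enum and method names are hypothetical, -1 is treated as "disabled" as in the quoted yaml comments, and "exceeds" is read as strictly greater than the threshold. It does not reproduce the committed DiskUsageMonitor logic.

```java
// Hypothetical sketch of warn/fail classification for a disk usage percentage.
final class DiskUsageThresholdSketch
{
    enum Level { OK, WARN, FAIL }

    static final int DISABLED = -1;

    /** Percentage of used space, given total and still usable bytes. */
    static double usedPercent(long totalBytes, long usableBytes)
    {
        if (totalBytes <= 0 || usableBytes < 0 || usableBytes > totalBytes)
            throw new IllegalArgumentException("invalid sizes: " + totalBytes + '/' + usableBytes);
        return 100.0 * (totalBytes - usableBytes) / totalBytes;
    }

    /** The fail threshold wins over the warn threshold; a disabled threshold never triggers. */
    static Level classify(double usedPercent, int warnThreshold, int failThreshold)
    {
        if (failThreshold != DISABLED && usedPercent > failThreshold)
            return Level.FAIL;
        if (warnThreshold != DISABLED && usedPercent > warnThreshold)
            return Level.WARN;
        return Level.OK;
    }

    public static void main(String[] args)
    {
        double used = usedPercent(1000L, 150L); // 85% used
        System.out.println(used + "% used -> " + classify(used, 80, 90));
    }
}
```

Checking the fail threshold first keeps the result well defined even if an operator configures a warn threshold above the fail threshold.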






[jira] [Commented] (CASSANDRA-17150) Guardrails for disk usage

2022-04-22 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526506#comment-17526506
 ] 

Andres de la Peña commented on CASSANDRA-17150:
---

Thanks, committed to {{trunk}} as 
[b3842de5cf1fa1b81872effb4585fbc7e1873d59|https://github.com/apache/cassandra/commit/b3842de5cf1fa1b81872effb4585fbc7e1873d59].

> Guardrails for disk usage
> -
>
> Key: CASSANDRA-17150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17150
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/Guardrails
>Reporter: Andres de la Peña
>Assignee: Andres de la Peña
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>
> Add guardrails for disk usage establishing soft/hard limits on the percentage 
> of used disk space. For example:
> {code}
> # Warning threshold to warn when local disk usage exceeds threshold. Valid 
> values: (1, 100]
> # Defaults to -1 to disable.
> # disk_usage_percentage_warn_threshold: -1
> # Failure threshold to reject write requests if replica disk usage exceeds 
> threshold. Valid values: (1, 100]
> # Defaults to -1 to disable.
> # disk_usage_percentage_failure_threshold: -1
> {code}






[cassandra] branch trunk updated: Add guardrail for data disk usage

2022-04-22 Thread adelapena
This is an automated email from the ASF dual-hosted git repository.

adelapena pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/trunk by this push:
 new b3842de5cf Add guardrail for data disk usage
b3842de5cf is described below

commit b3842de5cf1fa1b81872effb4585fbc7e1873d59
Author: Andrés de la Peña 
AuthorDate: Fri Apr 22 16:36:07 2022 +0100

Add guardrail for data disk usage

patch by Andrés de la Peña; reviewed by Ekaterina Dimitrova and Stefan 
Miklosovic for CASSANDRA-17150

Co-authored-by: Andrés de la Peña 
Co-authored-by: Zhao Yang 
Co-authored-by: Eduard Tudenhoefner 
---
 CHANGES.txt|   1 +
 NEWS.txt   |  28 +
 conf/cassandra.yaml|  24 +-
 .../config/CassandraRelevantProperties.java|   8 +
 src/java/org/apache/cassandra/config/Config.java   |  49 +-
 .../apache/cassandra/config/DataStorageSpec.java   |  13 +-
 .../apache/cassandra/config/GuardrailsOptions.java | 121 +++-
 .../org/apache/cassandra/cql3/QueryOptions.java|   9 +-
 .../cassandra/cql3/selection/ResultSetBuilder.java |   5 +-
 .../cassandra/cql3/statements/BatchStatement.java  |   8 +-
 .../cql3/statements/ModificationStatement.java |  25 +
 src/java/org/apache/cassandra/db/Directories.java  |   5 +
 src/java/org/apache/cassandra/db/ReadCommand.java  |   8 +-
 .../apache/cassandra/db/guardrails/Guardrail.java  |  92 ++-
 .../apache/cassandra/db/guardrails/Guardrails.java | 112 +++-
 .../cassandra/db/guardrails/GuardrailsConfig.java  |  25 +-
 .../cassandra/db/guardrails/GuardrailsMBean.java   |  61 +-
 .../db/guardrails/PercentageThreshold.java |  56 ++
 .../apache/cassandra/db/guardrails/Predicates.java |  93 
 .../apache/cassandra/db/guardrails/Threshold.java  |  20 +-
 .../org/apache/cassandra/gms/ApplicationState.java |   1 +
 .../org/apache/cassandra/gms/VersionedValue.java   |   5 +
 .../cassandra/io/sstable/format/SSTableWriter.java |   2 +-
 .../apache/cassandra/service/StorageService.java   |   2 +
 .../service/disk/usage/DiskUsageBroadcaster.java   | 181 ++
 .../service/disk/usage/DiskUsageMonitor.java   | 233 
 .../service/disk/usage/DiskUsageState.java |  70 +++
 .../test/guardrails/GuardrailDiskUsageTest.java| 225 
 .../cassandra/config/DataStorageSpecTest.java  |  29 +-
 .../db/guardrails/GuardrailCollectionSizeTest.java |  10 +-
 .../db/guardrails/GuardrailDiskUsageTest.java  | 617 +
 .../cassandra/db/guardrails/GuardrailTester.java   |  10 +
 .../cassandra/db/guardrails/GuardrailsTest.java|  46 ++
 .../cassandra/db/guardrails/ThresholdTester.java   |  28 +-
 .../cassandra/db/virtual/GossipInfoTableTest.java  |   3 +-
 35 files changed, 2125 insertions(+), 100 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index a1213090e2..9e9e1ee2f1 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.1
+ * Add guardrail for data disk usage (CASSANDRA-17150)
  * Tool to list data paths of existing tables (CASSANDRA-17568)
  * Migrate track_warnings to more standard naming conventions and use latest configuration types rather than long (CASSANDRA-17560)
  * Add support for CONTAINS and CONTAINS KEY in conditional UPDATE and DELETE statement (CASSANDRA-10537)
diff --git a/NEWS.txt b/NEWS.txt
index a891eb3a9a..fd31e06c93 100644
--- a/NEWS.txt
+++ b/NEWS.txt
@@ -56,6 +56,34 @@ using the provided 'sstableupgrade' tool.
 
 New features
 
+- Added a new guardrails framework allowing to define soft/hard limits for different user actions, such as limiting
+  the number of tables, columns per table or the size of collections. These guardrails are only applied to regular
+  user queries, and superusers and internal queries are excluded. Reaching the soft limit raises a client warning,
+  whereas reaching the hard limit aborts the query. In both cases a log message and a diagnostic event are emitted.
+  Additionally, some guardrails are not linked to specific user queries due to technical limitations, such as
+  detecting the size of large collections during compaction or periodically monitoring the disk usage. These
+  guardrails would only emit the proper logs and diagnostic events when triggered, without aborting any processes.
+  Guardrails config is defined through cassandra.yaml properties, and they can be dynamically updated through the
+  JMX MBean `org.apache.cassandra.db:type=Guardrails`. There are guardrails for:
+- Number of user keyspaces.
+- Number of user tables.
+- Number of columns per table.
+- Number of secondary indexes per table.
+- Number of materialized views per table.
+- Number of fields per user-defined type.
+- Number of items in a collection.
+- 

[jira] [Updated] (CASSANDRA-17557) Fix a few config parameters after the Paxos improvements commit

2022-04-22 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-17557:

Reviewers: Benedict Elliott Smith, Ekaterina Dimitrova  (was: Benedict 
Elliott Smith)
   Status: Review In Progress  (was: Patch Available)

> Fix a few config parameters after the Paxos improvements commit
> ---
>
> Key: CASSANDRA-17557
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17557
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Config
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.x
>
>
> After committing the Paxos improvements, it was identified that the following 
> configuration parameters need additional work:
>  * repair_request_timeout_in_ms - can be removed
>  * paxos_auto_repair_threshold_mb - I think it can be also removed; to be 
> confirmed with the author
> Discussed a bit in Slack and on this PR - 
> https://github.com/apache/cassandra/commit/d2923275e360a1ee9db498e748c269f701bb3a8b






[jira] [Updated] (CASSANDRA-17557) Fix a few config parameters after the Paxos improvements commit

2022-04-22 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-17557:

Reviewers: Benedict Elliott Smith  (was: Benedict Elliott Smith, Ekaterina 
Dimitrova)

> Fix a few config parameters after the Paxos improvements commit
> ---
>
> Key: CASSANDRA-17557
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17557
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Config
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.x
>
>
> After committing the Paxos improvements, it was identified that the following 
> configuration parameters need additional work:
>  * repair_request_timeout_in_ms - can be removed
>  * paxos_auto_repair_threshold_mb - I think it can be also removed; to be 
> confirmed with the author
> Discussed a bit in Slack and on this PR - 
> https://github.com/apache/cassandra/commit/d2923275e360a1ee9db498e748c269f701bb3a8b






[jira] [Commented] (CASSANDRA-17557) Fix a few config parameters after the Paxos improvements commit

2022-04-22 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526504#comment-17526504
 ] 

Ekaterina Dimitrova commented on CASSANDRA-17557:
-

Thanks, I rebased your branch and pushed a new CI run: 
[J8|https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/1566/workflows/393c8917-1d44-41be-afec-8ab6a97b7ede],
  
[J11|https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/1566/workflows/23ff50bf-a946-4097-b6d6-d624983c932c]
 - pending results, just started it. 

From a config perspective it looks good. I guess for the Verb class change someone 
more familiar with your latest work than me should say. [~barnie] or 
[~ifesdjeen] maybe? 

> Fix a few config parameters after the Paxos improvements commit
> ---
>
> Key: CASSANDRA-17557
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17557
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Config
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.x
>
>
> After committing the Paxos improvements, it was identified that the following 
> configuration parameters need additional work:
>  * repair_request_timeout_in_ms - can be removed
>  * paxos_auto_repair_threshold_mb - I think it can be also removed; to be 
> confirmed with the author
> Discussed a bit in Slack and on this PR - 
> https://github.com/apache/cassandra/commit/d2923275e360a1ee9db498e748c269f701bb3a8b






[jira] [Commented] (CASSANDRA-17150) Guardrails for disk usage

2022-04-22 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526485#comment-17526485
 ] 

Ekaterina Dimitrova commented on CASSANDRA-17150:
-

Thanks [~adelapena], I see in CI only known old failures that have respective 
tickets. I think we are ready to commit it. :)  

> Guardrails for disk usage
> -
>
> Key: CASSANDRA-17150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17150
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/Guardrails
>Reporter: Andres de la Peña
>Assignee: Andres de la Peña
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>
> Add guardrails for disk usage establishing soft/hard limits on the percentage 
> of used disk space. For example:
> {code}
> # Warning threshold to warn when local disk usage exceeds threshold. Valid 
> values: (1, 100]
> # Defaults to -1 to disable.
> # disk_usage_percentage_warn_threshold: -1
> # Failure threshold to reject write requests if replica disk usage exceeds 
> threshold. Valid values: (1, 100]
> # Defaults to -1 to disable.
> # disk_usage_percentage_failure_threshold: -1
> {code}






[jira] [Updated] (CASSANDRA-17150) Guardrails for disk usage

2022-04-22 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-17150:

Status: Ready to Commit  (was: Review In Progress)

> Guardrails for disk usage
> -
>
> Key: CASSANDRA-17150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17150
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/Guardrails
>Reporter: Andres de la Peña
>Assignee: Andres de la Peña
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>
> Add guardrails for disk usage establishing soft/hard limits on the percentage 
> of used disk space. For example:
> {code}
> # Warning threshold to warn when local disk usage exceeds threshold. Valid values: (1, 100]
> # Defaults to -1 to disable.
> # disk_usage_percentage_warn_threshold: -1
> # Failure threshold to reject write requests if replica disk usage exceeds threshold. Valid values: (1, 100]
> # Defaults to -1 to disable.
> # disk_usage_percentage_failure_threshold: -1
> {code}






[jira] [Created] (CASSANDRA-17574) Throw exception on wrong config boundaries

2022-04-22 Thread Ekaterina Dimitrova (Jira)
Ekaterina Dimitrova created CASSANDRA-17574:
---

 Summary: Throw exception on wrong config boundaries
 Key: CASSANDRA-17574
 URL: https://issues.apache.org/jira/browse/CASSANDRA-17574
 Project: Cassandra
  Issue Type: Bug
Reporter: Ekaterina Dimitrova


While working on CASSANDRA-15234 we noticed negative values being used where 
they are not supposed to be. We fixed that for the parameters in scope (types 
duration, data storage and data rate), but as [~brandon.williams] pointed out, 
other parameters in the rest of the config should be fixed too.

This ticket should:
- check the rest of the parameters and reject negative values wherever they 
are not valid

- ensure that whatever validations we apply to parameters during startup (see 
DatabaseDescriptor) are also applied in the respective setters for those 
parameters.
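
One way to keep startup validation and setter validation from drifting apart is to route both through a single check. The Python sketch below only illustrates that idea; the `Config` class and the parameter name used in the note are hypothetical and are not Cassandra's DatabaseDescriptor.

```python
class Config:
    """Hypothetical config holder; startup and runtime share one validator."""

    def __init__(self, **raw_values):
        self._values = {}
        for name, value in raw_values.items():
            # Startup loading goes through the same setter as runtime changes.
            self.set(name, value)

    @staticmethod
    def _validate(name, value):
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{name} must be an integer, got {value!r}")
        if value < 0:
            raise ValueError(f"{name} must not be negative, got {value}")
        return value

    def set(self, name, value):
        # A rejected value raises before the stored value is touched.
        self._values[name] = self._validate(name, value)

    def get(self, name):
        return self._values[name]
```

Here `Config(example_timeout_ms=-1)` fails at construction time, and `cfg.set("example_timeout_ms", -2)` fails at runtime while leaving the previously stored value in place.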






[jira] [Updated] (CASSANDRA-17574) Throw exception on wrong config boundaries

2022-04-22 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-17574:

 Bug Category: Parent values: Correctness(12982)
   Complexity: Low Hanging Fruit
  Component/s: Local/Config
Discovered By: User Report
 Severity: Normal
   Status: Open  (was: Triage Needed)

> Throw exception on wrong config boundaries
> --
>
> Key: CASSANDRA-17574
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17574
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Config
>Reporter: Ekaterina Dimitrova
>Priority: Normal
>
> While working on CASSANDRA-15234 we noticed negative values being used where 
> they are not supposed to be. We fixed that for the parameters in scope (types 
> duration, data storage and data rate), but as [~brandon.williams] pointed out, 
> other parameters in the rest of the config should be fixed too.
> This ticket should:
> - check the rest of the parameters and reject negative values wherever they 
> are not valid
> - ensure that whatever validations we apply to parameters during startup 
> (see DatabaseDescriptor) are also applied in the respective setters for 
> those parameters.






[jira] [Updated] (CASSANDRA-17574) Throw exception on wrong config boundaries

2022-04-22 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-17574:

Fix Version/s: 4.x

> Throw exception on wrong config boundaries
> --
>
> Key: CASSANDRA-17574
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17574
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Config
>Reporter: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.x
>
>
> While working on CASSANDRA-15234 we noticed negative values being used where 
> they are not supposed to be. We fixed that for the parameters in scope (types 
> duration, data storage and data rate), but as [~brandon.williams] pointed out, 
> other parameters in the rest of the config should be fixed too.
> This ticket should:
> - check the rest of the parameters and reject negative values wherever they 
> are not valid
> - ensure that whatever validations we apply to parameters during startup 
> (see DatabaseDescriptor) are also applied in the respective setters for 
> those parameters.






[jira] [Updated] (CASSANDRA-17329) Fix failing test - dtest-upgrade.upgrade_internal_auth_test.TestAuthUpgrade.test_upgrade_legacy_tabl

2022-04-22 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-17329:
-
Fix Version/s: 3.11.x

> Fix failing test - 
> dtest-upgrade.upgrade_internal_auth_test.TestAuthUpgrade.test_upgrade_legacy_tabl
> 
>
> Key: CASSANDRA-17329
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17329
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Authorization
>Reporter: Brandon Williams
>Assignee: Brandon Williams
>Priority: Normal
> Fix For: 3.0.x, 3.11.x
>
>
> Failed 6 times in the last 16 runs. Flakiness: 60%, Stability: 62%
> Error Message
> ccmlib.node.TimeoutError: 26 Jan 2022 22:48:51 [node1] after 120.17/120 
> seconds Missing: ['Listening for thrift clients...'] not found in system.log: 
>  Head: INFO  [main] 2022-01-26 22:46:51,840 YamlConfigura  Tail: ...ing 
> legacy permissions data INFO  [OptionalTasks:1] 2022-01-26 22:47:08,405 
> CassandraAuthorizer.java:444 - Completed conversion of legacy permissions






[jira] [Commented] (CASSANDRA-17180) Implement startup check to prevent Cassandra start to spread zombie data

2022-04-22 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526470#comment-17526470
 ] 

Brandon Williams commented on CASSANDRA-17180:
--

Data dir in the system keyspace makes the most sense to me.  The system ks is 
generally not backed up/restored since that's a bad idea, so if the file is 
there it won't accidentally be restored and cause a problem.

> Implement startup check to prevent Cassandra start to spread zombie data
> 
>
> Key: CASSANDRA-17180
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17180
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Observability
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Time Spent: 9.5h
>  Remaining Estimate: 0h
>
> As already discussed on the ML, it would be nice to have a service that 
> periodically writes a timestamp to a file, signalling that the node is up and running.
> Then, on startup, we would read this file and determine whether any table's 
> gc grace period has already elapsed since that timestamp; if so, we would fail 
> the start to prevent zombie data from being spread around the cluster.
> https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw
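
The check described in the ticket can be sketched roughly as follows. This is a Python illustration under assumptions, not the patch under review: the function names are hypothetical, timestamps are plain epoch seconds, and the per-table gc grace values are passed in as a dict rather than read from schema.

```python
import time


def tables_past_gc_grace(last_heartbeat_s, gc_grace_by_table, now_s=None):
    """Return the tables whose gc_grace_seconds is shorter than the downtime."""
    now_s = time.time() if now_s is None else now_s
    downtime_s = now_s - last_heartbeat_s
    return sorted(t for t, gc in gc_grace_by_table.items() if downtime_s > gc)


def startup_check(last_heartbeat_s, gc_grace_by_table, now_s=None):
    """Raise if starting now could spread deleted (zombie) data."""
    if last_heartbeat_s is None:
        return  # no heartbeat recorded yet (e.g. first start): nothing to compare
    offenders = tables_past_gc_grace(last_heartbeat_s, gc_grace_by_table, now_s)
    if offenders:
        raise RuntimeError(
            "node was down longer than gc_grace_seconds for: " + ", ".join(offenders)
        )
```

The first-start case is treated as a pass here; whether a missing heartbeat file should pass or fail is a design decision the sketch does not settle.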






[jira] [Assigned] (CASSANDRA-17566) Fix flaky test - org.apache.cassandra.distributed.test.repair.ForceRepairTest.force

2022-04-22 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams reassigned CASSANDRA-17566:


Assignee: Brandon Williams

> Fix flaky test - 
> org.apache.cassandra.distributed.test.repair.ForceRepairTest.force
> ---
>
> Key: CASSANDRA-17566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17566
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/java
>Reporter: Brandon Williams
>Assignee: Brandon Williams
>Priority: Normal
> Fix For: 4.x
>
>
> Seen on jenkins here: 
> [https://ci-cassandra.apache.org/job/Cassandra-trunk/1083/testReport/org.apache.cassandra.distributed.test.repair/ForceRepairTest/force_2/]
>  
> and circle here:
> https://app.circleci.com/pipelines/github/driftx/cassandra/440/workflows/42f936c7-2ede-4fbf-957c-5fb4e461dd90/jobs/5160/tests#failed-test-1
> {noformat}
> junit.framework.AssertionFailedError: nodetool command [repair, distributed_test_keyspace, --force, --full] was not successful
> stdout:
> [2022-04-20 15:11:01,402] Starting repair command #2 (1701a090-c0bc-11ec-9898-07c796ce6a49), repairing keyspace distributed_test_keyspace with repair options (parallelism: parallel, primary range: false, incremental: false, job threads: 1, ColumnFamilies: [], dataCenters: [], hosts: [], previewKind: NONE, # of ranges: 3, pull repair: false, force repair: true, optimise streams: false, ignore unreplicated keyspaces: false, repairPaxos: true, paxosOnly: false)
> [2022-04-20 15:11:11,406] Repair command #2 failed with error Did not get replies from all endpoints.
> [2022-04-20 15:11:11,408] Repair command #2 finished with error
> stderr:
> error: Repair job has failed with the error message: Repair command #2 failed with error Did not get replies from all endpoints.. Check the logs on the repair participants for further details
> -- StackTrace --
> java.lang.RuntimeException: Repair job has failed with the error message: Repair command #2 failed with error Did not get replies from all endpoints.. Check the logs on the repair participants for further details
>   at org.apache.cassandra.tools.RepairRunner.progress(RepairRunner.java:137)
>   at org.apache.cassandra.utils.progress.jmx.JMXNotificationProgressListener.handleNotification(JMXNotificationProgressListener.java:77)
>   at javax.management.NotificationBroadcasterSupport.handleNotification(NotificationBroadcasterSupport.java:275)
>   at javax.management.NotificationBroadcasterSupport$SendNotifJob.run(NotificationBroadcasterSupport.java:352)
>   at org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:124)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.lang.Thread.run(Thread.java:748)
> {noformat}






[jira] [Commented] (CASSANDRA-17180) Implement startup check to prevent Cassandra start to spread zombie data

2022-04-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526467#comment-17526467
 ] 

Stefan Miklosovic commented on CASSANDRA-17180:
---

I think we really need to place it in a data dir, because /tmp is not durable 
enough and other Cassandra dirs might not be writable. The data dir is the only 
location that is both durable and writable.

> Implement startup check to prevent Cassandra start to spread zombie data
> 
>
> Key: CASSANDRA-17180
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17180
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Observability
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Time Spent: 9.5h
>  Remaining Estimate: 0h
>
> As already discussed on the ML, it would be nice to have a service that 
> periodically writes a timestamp to a file, signalling that the node is up and running.
> Then, on startup, we would read this file and determine whether any table's 
> gc grace period has already elapsed since that timestamp; if so, we would fail 
> the start to prevent zombie data from being spread around the cluster.
> https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw






[jira] [Commented] (CASSANDRA-17551) Allow 0 to be used in collection_size guardrails in order to prohibit collections

2022-04-22 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526465#comment-17526465
 ] 

Ekaterina Dimitrova commented on CASSANDRA-17551:
-

We had a further discussion with [~adelapena], and he makes a good point: we 
already have feature flags like "materialized_views_enabled", so disabling 
collections with 0 would diverge from our current approach.

This will require broader discussion and consideration, so we are not doing 
this patch at the moment. With only a week until the freeze it would be a rush; 
we can start the discussion and further work for the next release. Moving the 
ticket back to open and marking it 5.x.

> Allow 0 to be used in collection_size guardrails in order to prohibit 
> collections
> -
>
> Key: CASSANDRA-17551
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17551
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.x
>
>
> Allow 0 to be used in collection_size guardrails in order to prohibit 
> collections






[jira] [Updated] (CASSANDRA-17551) Allow 0 to be used in collection_size guardrails in order to prohibit collections

2022-04-22 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-17551:

Fix Version/s: 5.x
   (was: 4.x)

> Allow 0 to be used in collection_size guardrails in order to prohibit 
> collections
> -
>
> Key: CASSANDRA-17551
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17551
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 5.x
>
>
> Allow 0 to be used in collection_size guardrails in order to prohibit 
> collections






[jira] [Updated] (CASSANDRA-17456) Test Failures: write_failures_test.TestMultiDCWriteFailures.test_oversized_mutation

2022-04-22 Thread Aleksandr Sorokoumov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Sorokoumov updated CASSANDRA-17456:
-
Test and Documentation Plan: I made the existing dtest applicable to C* 
versions until 4.0.x and added an in-jvm dtest to cover rejection of oversized 
mutations on insert.
 Status: Patch Available  (was: In Progress)

As Benedict suggested, I moved the mutation size check from CommitLog to the 
client and internode connections.

Patches:
 * 
[17456-trunk|https://github.com/apache/cassandra/compare/trunk...Ge:17456-trunk?expand=1]
 * [dtest|https://github.com/apache/cassandra-dtest/pull/186]

[Jenkins CI 
run|https://ci-cassandra.apache.org/job/Cassandra-devbranch/1626/#showFailuresLink]
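
As a rough illustration of what rejecting an oversized mutation before it reaches the commit log could look like, here is a small Python sketch. It is not the code from the linked patches: the exception and function names are hypothetical, and only the 128 KB limit and the 256 KiB payload are borrowed from the dtest quoted in this ticket.

```python
MAX_MUTATION_SIZE_KB = 128  # the quoted dtest sets max_mutation_size_in_kb to 128


class MutationTooLargeError(Exception):
    """Hypothetical error raised when a serialized mutation exceeds the limit."""


def check_mutation_size(serialized, max_kb=MAX_MUTATION_SIZE_KB):
    """Return the size in bytes, or raise before the mutation is applied."""
    limit_bytes = max_kb * 1024
    size = len(serialized)
    if size > limit_bytes:
        raise MutationTooLargeError(
            f"mutation of {size} bytes exceeds limit of {limit_bytes} bytes"
        )
    return size
```

A payload like the dtest's `'1' * 1024 * 256` encodes to 262144 bytes, which is over the 131072-byte limit and would be rejected up front, so the failure surfaces as a write error rather than a timeout.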

> Test Failures: 
> write_failures_test.TestMultiDCWriteFailures.test_oversized_mutation
> ---
>
> Key: CASSANDRA-17456
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17456
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI
>Reporter: Ekaterina Dimitrova
>Assignee: Aleksandr Sorokoumov
>Priority: Normal
> Fix For: 4.x
>
>
> https://ci-cassandra.apache.org/job/Cassandra-trunk/1002/testReport/dtest-offheap.write_failures_test/TestMultiDCWriteFailures/test_oversized_mutation/
> {code:java}
> Error Message
> AssertionError: assert 0 == 8  +  where 8 =  JolokiaAgent.read_attribute of  0x7f1fca78dac0>>('org.apache.cassandra.metrics:type=Storage,name=TotalHints', 
> 'Count')  +where  > = 
> .read_attribute  +
> and   'org.apache.cassandra.metrics:type=Storage,name=TotalHints' = 
> make_mbean('metrics', type='Storage', name='TotalHints')
> Stacktrace
> self = 
> def test_oversized_mutation(self):
>     """
>     Test that multi-DC write failures return operation failed rather than a timeout.
>     @jira_ticket CASSANDRA-16334.
>     """
> 
>     cluster = self.cluster
>     cluster.populate([2, 2])
>     cluster.set_configuration_options(values={'max_mutation_size_in_kb': 128})
>     cluster.start()
> 
>     node1 = cluster.nodelist()[0]
>     session = self.patient_exclusive_cql_connection(node1)
> 
>     session.execute("CREATE KEYSPACE k WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 2, 'dc2': 2}")
>     session.execute("CREATE TABLE k.t (key int PRIMARY KEY, val blob)")
> 
>     payload = '1' * 1024 * 256
>     query = "INSERT INTO k.t (key, val) VALUES (1, textAsBlob('{}'))".format(payload)
> 
>     assert_write_failure(session, query, ConsistencyLevel.LOCAL_ONE)
>     assert_write_failure(session, query, ConsistencyLevel.ONE)
> 
>     # verify that no hints are created
>     with JolokiaAgent(node1) as jmx:
> >       assert 0 == jmx.read_attribute(make_mbean('metrics', type='Storage', name='TotalHints'), 'Count')
> E   AssertionError: assert 0 == 8
> E+  where 8 =   0x7f1fca78dac0>>('org.apache.cassandra.metrics:type=Storage,name=TotalHints', 
> 'Count')
> E+where  > = 
> .read_attribute
> E+and   
> 'org.apache.cassandra.metrics:type=Storage,name=TotalHints' = 
> make_mbean('metrics', type='Storage', name='TotalHints')
> write_failures_test.py:277: AssertionError
> {code}






[jira] [Commented] (CASSANDRA-17180) Implement startup check to prevent Cassandra start to spread zombie data

2022-04-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526463#comment-17526463
 ] 

Stefan Miklosovic commented on CASSANDRA-17180:
---

Yes, I realised that /tmp/ problem just now ... ahh.

> Can't we parse it like any other JSON file?

Ah right, I know what you mean. 

So the two outstanding questions are the default location of this file and 
whether it should be enabled by default.

> Implement startup check to prevent Cassandra start to spread zombie data
> 
>
> Key: CASSANDRA-17180
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17180
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Observability
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Time Spent: 9.5h
>  Remaining Estimate: 0h
>
> As already discussed on the ML, it would be nice to have a service that 
> periodically writes a timestamp to a file, signalling that the node is up and running.
> Then, on startup, we would read this file and determine whether any table's 
> gc grace period has already elapsed since that timestamp; if so, we would fail 
> the start to prevent zombie data from being spread around the cluster.
> https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw






[jira] [Comment Edited] (CASSANDRA-17180) Implement startup check to prevent Cassandra start to spread zombie data

2022-04-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526452#comment-17526452
 ] 

Stefan Miklosovic edited comment on CASSANDRA-17180 at 4/22/22 2:18 PM:


[~paulo] thanks for finally looking into it. I'll deal with it over the weekend 
to finally move this over the line.

I had implemented something similar to your postActions idea, but Brandon's 
opinion was that we were just inventing something else here. But I see you moved 
that "execute post actions loop" to after all checks are verified in 
CassandraDaemon instead of having it in StartupChecks.verify directly. I am 
fine with your take on that; is Brandon too?

Good to know this is going to check system_distributed and system_auth too.

As for the default location of the heartbeat file, that's a good point. Maybe we 
should go a little bit wild here and save it to /tmp/? I think that 
has the best chance of being writable. I do not like the fact that there is 
suddenly some file in the area for sstables / tables. Other existing software might 
have a problem with this. For example, when you are taking backups, would you 
need to exclude or include that file? It depends on how people look at these 
backups etc. For that reason I would place it somewhere else. But if we 
place it in /tmp and you have more than one node running on the same machine, 
there will be a clash, as two nodes would write to the same file {_}by 
default{_}. In that case we would have to make the file name unique, e.g. by 
including the node's id. What is your take on this?

Yes, we can rename that class.

I do not mind starting to write JSON into that file, but ... how do you want to 
parse that file? I still need to read it / check it and so on. What would 
you like to replace all that logic with?

EDIT: I will think more about the consequences of making this enabled by 
default. That is a simple thing to change at the end of this work anyway; it 
can be done whenever we want.

EDIT 2: writing to /tmp/ is quite a bad idea because it tends to be wiped 
on restarts.


> Implement startup check to prevent Cassandra start to spread zombie data
> 
>
> Key: CASSANDRA-17180
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17180
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Observability
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Time Spent: 9.5h
>  Remaining Estimate: 0h
>
> As already discussed on the ML, it would be nice to have a service that 
> periodically writes a timestamp to a file, signalling that the node is up and running.
> Then, on startup, we would read this file and determine whether any table's 
> gc grace period has already elapsed since that timestamp; if so, we would fail 
> the start to prevent zombie data from being spread around the cluster.
> 

[jira] [Commented] (CASSANDRA-17180) Implement startup check to prevent Cassandra start to spread zombie data

2022-04-22 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526459#comment-17526459
 ] 

Brandon Williams commented on CASSANDRA-17180:
--

bq. I am fine with your take on that, is Brandon too?

I'm fine with the majority here.

bq. Maybe we should go a little bit wild here and we might save it to /tmp/ ?

That sounds unworkable to me.  There's no guarantee of durability until the 
next startup, people override tmpdir, etc.

bq. I do not mind to start to write JSON into that file, but ... how do you 
want to parse that file?

Can't we parse it like any other JSON file?
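
For what "parse it like any other JSON file" might look like in practice, here is a minimal Python sketch of writing and reading such a heartbeat file. It is only an illustration: the file layout and the `last_heartbeat` key are assumptions, and the write-to-temp-then-rename step is a common durability technique added here, not something taken from the patch.

```python
import json
import os
import tempfile
import time


def write_heartbeat(path, now_ms=None):
    """Write the heartbeat as JSON via a temp file and an atomic rename."""
    now_ms = int(time.time() * 1000) if now_ms is None else now_ms
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump({"last_heartbeat": now_ms}, f)
        f.flush()
        os.fsync(f.fileno())  # push the bytes to disk before renaming
    os.replace(tmp_path, path)  # readers see either the old or the new file


def read_heartbeat(path):
    """Return the stored timestamp in ms, or None if missing or unreadable."""
    try:
        with open(path) as f:
            return int(json.load(f)["last_heartbeat"])
    except (FileNotFoundError, ValueError, KeyError, TypeError):
        return None
```

Returning None for a missing or corrupt file leaves the decision about how strict the startup check should be to the caller.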


> Implement startup check to prevent Cassandra start to spread zombie data
> 
>
> Key: CASSANDRA-17180
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17180
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Observability
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Time Spent: 9.5h
>  Remaining Estimate: 0h
>
> As already discussed on the ML, it would be nice to have a service that 
> periodically writes a timestamp to a file, signalling that the node is up and running.
> Then, on startup, we would read this file and determine whether any table's 
> gc grace period has already elapsed since that timestamp; if so, we would fail 
> the start to prevent zombie data from being spread around the cluster.
> https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw






[jira] [Comment Edited] (CASSANDRA-17180) Implement startup check to prevent Cassandra start to spread zombie data

2022-04-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526452#comment-17526452
 ] 

Stefan Miklosovic edited comment on CASSANDRA-17180 at 4/22/22 2:11 PM:


[~paulo] thanks for finally looking into it. I'll deal with it over the weekend 
to finally move this over the line.

I had implemented something similar to your postActions idea, but Brandon's 
opinion was that we were just inventing something else here. But I see you moved 
that "execute post actions loop" to after all checks are verified in 
CassandraDaemon instead of having it in StartupChecks.verify directly. I am 
fine with your take on that; is Brandon too?

Good to know this is going to check system_distributed and system_auth too.

As for the default location of the heartbeat file, that's a good point. Maybe we 
should go a little bit wild here and save it to /tmp/? I think that 
has the best chance of being writable. I do not like the fact that there is 
suddenly some file in the area for sstables / tables. Other existing software might 
have a problem with this. For example, when you are taking backups, would you 
need to exclude or include that file? It depends on how people look at these 
backups etc. For that reason I would place it somewhere else. But if we 
place it in /tmp and you have more than one node running on the same machine, 
there will be a clash, as two nodes would write to the same file {_}by 
default{_}. In that case we would have to make the file name unique, e.g. by 
including the node's id. What is your take on this?

Yes, we can rename that class.

I do not mind starting to write JSON into that file, but ... how do you want to 
parse that file? I still need to read it / check it and so on. What would 
you like to replace all that logic with?

EDIT: I will think more about the consequences of making this enabled by 
default. That is a simple thing to change at the end of this work anyway; it 
can be done whenever we want.


> Implement startup check to prevent Cassandra start to spread zombie data
> 
>
> Key: CASSANDRA-17180
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17180
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Observability
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Time Spent: 9.5h
>  Remaining Estimate: 0h
>
> As already discussed on the ML, it would be nice to have a service that 
> periodically writes a timestamp to a file, signalling that the node is up and running.
> Then, on startup, we would read this file and determine whether any table's 
> gc grace period has elapsed since that time; if so, we would fail the start 
> to prevent zombie data from being spread around the cluster.
> https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw





[jira] [Commented] (CASSANDRA-17180) Implement startup check to prevent Cassandra start to spread zombie data

2022-04-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526452#comment-17526452
 ] 

Stefan Miklosovic commented on CASSANDRA-17180:
---

[~paulo] thanks for finally looking into it. I'll deal with it over the weekend 
to finally move this over the line.

I had implemented something similar to your postActions idea, but Brandon's 
opinion was that we were just inventing something else here. But I see you moved 
that "execute post actions loop" to run after all checks are verified in 
CassandraDaemon instead of having it in StartupChecks.verify directly. I am 
fine with your take on that; is Brandon too?

Good to know this is going to check system_distributed and system_auth too.

As for the default location of the heartbeat file, that's a good point. Maybe we 
should go a little bit wild here and save it to /tmp/? I think that 
is the most likely to be writable. I do not like the fact that there is 
suddenly some file in the area for sstables / tables. Other existing software might 
have a problem with this. For example, when you are backing up, would you need to 
exclude or include that file? It depends on how people look at these 
backups etc. For that reason I would place it somewhere else. But if we 
place it in /tmp and you have more than one node running on the same machine, 
there will be a clash, as two nodes would write to the same file {_}by 
default{_}. In that case we would have to make that file name unique, e.g. by 
including the node's id. What is your take on this?

Yes, we can rename that class.

I do not mind starting to write JSON into that file, but ... how do you want to 
parse that file? I still need to read it / check it and so on. What would you 
like to replace all that logic with?

 

> Implement startup check to prevent Cassandra start to spread zombie data
> 
>
> Key: CASSANDRA-17180
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17180
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Observability
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Time Spent: 9.5h
>  Remaining Estimate: 0h
>
> As already discussed on the ML, it would be nice to have a service that 
> periodically writes a timestamp to a file, signalling that the node is up and running.
> Then, on startup, we would read this file and determine whether any table's 
> gc grace period has elapsed since that time; if so, we would fail the start 
> to prevent zombie data from being spread around the cluster.
> https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw
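
The check in the ticket description can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the class `GcGraceStartupCheckSketch`, the method `tablesPastGcGrace`, and the map from table name to gc grace seconds are hypothetical and are not taken from the actual patch.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; none of these names come from the Cassandra code base.
final class GcGraceStartupCheckSketch
{
    /**
     * Returns the tables whose gc grace period is shorter than the time that
     * passed between the last recorded heartbeat and now.
     */
    static List<String> tablesPastGcGrace(Instant lastHeartbeat,
                                          Instant now,
                                          Map<String, Integer> gcGraceSecondsByTable)
    {
        Duration downtime = Duration.between(lastHeartbeat, now);
        List<String> offenders = new ArrayList<>();
        // A TreeMap copy gives a stable, sorted order for reporting.
        for (Map.Entry<String, Integer> entry : new TreeMap<>(gcGraceSecondsByTable).entrySet())
        {
            if (downtime.compareTo(Duration.ofSeconds(entry.getValue())) > 0)
                offenders.add(entry.getKey());
        }
        return offenders;
    }
}
```

A startup check built on this idea would presumably refuse to start when the returned list is not empty, and report the offending tables to the operator.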






[jira] [Comment Edited] (CASSANDRA-17150) Guardrails for disk usage

2022-04-22 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526317#comment-17526317
 ] 

Andres de la Peña edited comment on CASSANDRA-17150 at 4/22/22 2:05 PM:


[~e.dimitrova] thanks for the review. I think I have addressed the last bits. 
I'm running CI after rebase+squash:
||PR||CI||
|[trunk|https://github.com/apache/cassandra/pull/1546]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1494/workflows/d032178d-f8a9-4124-b36f-5bf6f47b3116]
 
[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1494/workflows/bc844580-6f3a-4bc3-a4d0-d85f082330f8]|

Please note that during the rebase I have replaced a few references to the 
removed {{Config.DISABLED_GUARDRAIL}} constant with {{-1}}. Those references were 
recently added to track warnings during CASSANDRA-17560. As mentioned 
[here|https://github.com/apache/cassandra/pull/1572#discussion_r854251196], 
using {{-1}} as the disabled value is a global config convention and not a 
guardrails thing, so we should either use it directly or define a new constant 
with a more generic name. If we decide to do the latter, I'd prefer to do it in 
a separate ticket, so we can focus on locating all the usages.


was (Author: adelapena):
[~e.dimitrova] thanks for the review. I think I have addressed the last bits. 
I'm running CI after rebase+squash:
||PR||CI||
|[trunk|https://github.com/apache/cassandra/pull/1546]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1494/workflows/d032178d-f8a9-4124-b36f-5bf6f47b3116]
 
[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1494/workflows/bc844580-6f3a-4bc3-a4d0-d85f082330f8]|

Please note that during the rebase I have replaced a few references to the 
removed `Config.DISABLED_GUARDRAIL` constant with {{{}-1{}}}. Those references 
were recently added to track warnings during CASSANDRA-17560. As mentioned 
[here|https://github.com/apache/cassandra/pull/1572#discussion_r854251196], 
using {{-1}} as the disabled value is a global config convention and not a 
guardrails thing, so we should either use it directly or define a new constant 
with a more generic name. If we decide to do the latter, I'd prefer to do it in 
a separate ticket, so we can focus on locating all the usages.

> Guardrails for disk usage
> -
>
> Key: CASSANDRA-17150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17150
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/Guardrails
>Reporter: Andres de la Peña
>Assignee: Andres de la Peña
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>
> Add guardrails for disk usage establishing soft/hard limits on the percentage 
> of used disk space. For example:
> {code}
> # Warning threshold to warn when local disk usage exceeds threshold. Valid values: (1, 100]
> # Defaults to -1 to disable.
> # disk_usage_percentage_warn_threshold: -1
> # Failure threshold to reject write requests if replica disk usage exceeds threshold. Valid values: (1, 100]
> # Defaults to -1 to disable.
> # disk_usage_percentage_failure_threshold: -1
> {code}
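
How the two thresholds might be evaluated can be sketched as below. This is a minimal illustration, assuming a usage percentage is already known; `DiskUsageGuardrailSketch`, `Result`, and `check` are hypothetical names and do not reflect the real guardrail classes in the patch.

```java
// Hypothetical sketch of a soft/hard disk usage guardrail; not the actual Cassandra implementation.
final class DiskUsageGuardrailSketch
{
    enum Result { OK, WARN, FAIL }

    // -1 is the "disabled" convention used by both thresholds in the description.
    static final int DISABLED = -1;

    /** Compares a disk usage percentage against the warn (soft) and failure (hard) thresholds. */
    static Result check(int usedPercentage, int warnThreshold, int failThreshold)
    {
        // The hard limit is checked first so that it wins when both limits are exceeded.
        if (failThreshold != DISABLED && usedPercentage > failThreshold)
            return Result.FAIL;
        if (warnThreshold != DISABLED && usedPercentage > warnThreshold)
            return Result.WARN;
        return Result.OK;
    }
}
```

With both thresholds left at `-1`, the sketch never warns or fails, which matches the disabled defaults shown in the configuration excerpt.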






[jira] [Commented] (CASSANDRA-17180) Implement startup check to prevent Cassandra start to spread zombie data

2022-04-22 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526440#comment-17526440
 ] 

Brandon Williams commented on CASSANDRA-17180:
--

That sounds like a great idea to me.

> Implement startup check to prevent Cassandra start to spread zombie data
> 
>
> Key: CASSANDRA-17180
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17180
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Observability
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Time Spent: 9.5h
>  Remaining Estimate: 0h
>
> As already discussed on the ML, it would be nice to have a service that 
> periodically writes a timestamp to a file, signalling that the node is up and running.
> Then, on startup, we would read this file and determine whether any table's 
> gc grace period has elapsed since that time; if so, we would fail the start 
> to prevent zombie data from being spread around the cluster.
> https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw






[jira] [Commented] (CASSANDRA-17370) Add flag enabling operators to restrict use of ALLOW FILTERING in queries

2022-04-22 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526435#comment-17526435
 ] 

Andres de la Peña commented on CASSANDRA-17370:
---

[~jmckenzie] [~dcapwell] are we ready to commit this before the freeze?

> Add flag enabling operators to restrict use of ALLOW FILTERING in queries
> -
>
> Key: CASSANDRA-17370
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17370
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL/Semantics, Feature/Guardrails
>Reporter: Savni Nagarkar
>Assignee: Savni Nagarkar
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> This ticket adds the ability for operators to disallow use of ALLOW FILTERING 
> predicates in CQL SELECT statements. As queries that ALLOW FILTERING can 
> place additional load on the database, the flag enables operators to provide 
> tighter bounds on performance guarantees. The patch includes a new yaml 
> property, as well as a hot property enabling the value to be modified via JMX 
> at runtime.
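
The idea of a flag that can be flipped at runtime can be illustrated with a small Java sketch. It assumes nothing about the real patch: `AllowFilteringGuardSketch` and its methods are hypothetical, and the JMX wiring mentioned in the description is deliberately left out.

```java
// Hypothetical sketch; class, method, and message names are illustrative only.
final class AllowFilteringGuardSketch
{
    // volatile so that a change made at runtime (for example through a management
    // interface) is visible to request threads without a restart.
    private volatile boolean allowFilteringEnabled = true;

    void setAllowFilteringEnabled(boolean enabled)
    {
        allowFilteringEnabled = enabled;
    }

    boolean isAllowFilteringEnabled()
    {
        return allowFilteringEnabled;
    }

    /** Rejects a SELECT that uses ALLOW FILTERING while the flag is switched off. */
    void validate(boolean statementUsesAllowFiltering)
    {
        if (statementUsesAllowFiltering && !allowFilteringEnabled)
            throw new IllegalStateException("ALLOW FILTERING is disabled by the operator");
    }
}
```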






[jira] [Commented] (CASSANDRA-17180) Implement startup check to prevent Cassandra start to spread zombie data

2022-04-22 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526431#comment-17526431
 ] 

Paulo Motta commented on CASSANDRA-17180:
-

{quote}Can we just use File.setLastModified and File.lastModified to read/write 
the heartbeat instead?
{quote}
Alternatively, we can just write a JSON file similar to the snapshot manifest, since 
we can use existing JSON utilities to read/write the heartbeat file without 
needing to implement a custom parser. Something like this:
{noformat}
{"last_heartbeat": "2022-04-22T13:33:41Z"}
{noformat}
We could later augment this JSON with more info if the need arises.

WDYT?
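
For comparison, both options can be sketched in Java. This is a rough illustration, not the proposed implementation: `HeartbeatFileSketch` and its methods are hypothetical, and the regular expression in `readJson` is only a stand-in that keeps the sketch self-contained. The suggestion above is precisely to use existing JSON utilities instead of custom parsing.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.time.Instant;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; none of these names come from the Cassandra code base.
final class HeartbeatFileSketch
{
    private static final Pattern LAST_HEARTBEAT =
        Pattern.compile("\"last_heartbeat\"\\s*:\\s*\"([^\"]+)\"");

    // Option 1: keep the heartbeat in the file's modification time.
    static void touch(File file, Instant heartbeat) throws IOException
    {
        if (!file.exists() && !file.createNewFile())
            throw new IOException("Cannot create " + file);
        if (!file.setLastModified(heartbeat.toEpochMilli()))
            throw new IOException("Cannot update modification time of " + file);
    }

    static Instant readFromModificationTime(File file)
    {
        return Instant.ofEpochMilli(file.lastModified());
    }

    // Option 2: keep the heartbeat in a one-field JSON document.
    static void writeJson(File file, Instant heartbeat) throws IOException
    {
        // Instant.toString() produces ISO-8601 text such as 2022-04-22T13:33:41Z.
        String json = "{\"last_heartbeat\": \"" + heartbeat + "\"}";
        Files.write(file.toPath(), json.getBytes(StandardCharsets.UTF_8));
    }

    // Stand-in for a real JSON library; it only extracts the single known field.
    static Instant readJson(File file) throws IOException
    {
        String json = new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8);
        Matcher matcher = LAST_HEARTBEAT.matcher(json);
        if (!matcher.find())
            throw new IOException("No last_heartbeat field in " + file);
        return Instant.parse(matcher.group(1));
    }
}
```

The modification-time variant needs no file format at all, while the JSON variant leaves room for additional fields later, which is the trade-off raised in the comment.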

> Implement startup check to prevent Cassandra start to spread zombie data
> 
>
> Key: CASSANDRA-17180
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17180
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Observability
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Time Spent: 9.5h
>  Remaining Estimate: 0h
>
> As already discussed on the ML, it would be nice to have a service that 
> periodically writes a timestamp to a file, signalling that the node is up and running.
> Then, on startup, we would read this file and determine whether any table's 
> gc grace period has elapsed since that time; if so, we would fail the start 
> to prevent zombie data from being spread around the cluster.
> https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw






[jira] [Commented] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables

2022-04-22 Thread Tibor Repasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526413#comment-17526413
 ] 

Tibor Repasi commented on CASSANDRA-17568:
--

Thank you for the commitment, the reviews and the productive feedback. Glad to 
see that coming in 4.1.

> Implement nodetool command to list data directories of existing tables
> --
>
> Key: CASSANDRA-17568
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17568
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tool/nodetool
>Reporter: Tibor Repasi
>Assignee: Tibor Repasi
>Priority: Normal
> Fix For: 4.1
>
>  Time Spent: 10h
>  Remaining Estimate: 0h
>
> When a table is created, dropped and re-created with the same name, 
> directories remain within data paths. Operators may struggle to find out 
> which directories belong to existing tables and which may be subject to 
> removal. Although the information is available in CQL as well as in MBeans 
> via JMX, convenient access to this information is still missing.
> My proposal is a new nodetool subcommand that lists the data paths of all 
> existing tables.
> {code}
> % bin/nodetool datapaths -- example
> Keyspace : example
>   Table : test
>   Paths :
>   
> /var/lib/cassandra/data/example/test-02f5b8d0c0e311ecb327ff24df5ab301
> 
> {code}
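
The problem the description refers to can be illustrated with a short Java sketch that groups directory names by table. It relies only on the `<table>-<id>` naming visible in the example output, where the id is 32 hexadecimal characters; `TableDirectoriesSketch` and `groupByTable` are hypothetical names, not part of the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not how the nodetool subcommand is implemented.
final class TableDirectoriesSketch
{
    // Matches <table>-<id>, where the id is 32 lowercase hexadecimal characters.
    private static final Pattern TABLE_DIR = Pattern.compile("(.+)-([0-9a-f]{32})");

    /** Groups directory names by table name; names that do not match the pattern are ignored. */
    static Map<String, List<String>> groupByTable(List<String> directoryNames)
    {
        Map<String, List<String>> byTable = new TreeMap<>();
        for (String name : directoryNames)
        {
            Matcher matcher = TABLE_DIR.matcher(name);
            if (matcher.matches())
                byTable.computeIfAbsent(matcher.group(1), k -> new ArrayList<>()).add(name);
        }
        return byTable;
    }
}
```

A table name that maps to more than one directory indicates a drop and re-create, but the grouping alone cannot tell which directory is live. That still requires the current table id from the node, which is the information the proposed subcommand exposes.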






[jira] [Commented] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables

2022-04-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526410#comment-17526410
 ] 

Stefan Miklosovic commented on CASSANDRA-17568:
---

Thanks [~rtib] for the effort; it was a very smooth cooperation on GitHub. 
Definitely keep this stuff coming if you have any other ideas to implement.

> Implement nodetool command to list data directories of existing tables
> --
>
> Key: CASSANDRA-17568
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17568
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tool/nodetool
>Reporter: Tibor Repasi
>Assignee: Tibor Repasi
>Priority: Normal
> Fix For: 4.1
>
>  Time Spent: 10h
>  Remaining Estimate: 0h
>
> When a table is created, dropped and re-created with the same name, 
> directories remain within data paths. Operators may struggle to find out 
> which directories belong to existing tables and which may be subject to 
> removal. Although the information is available in CQL as well as in MBeans 
> via JMX, convenient access to this information is still missing.
> My proposal is a new nodetool subcommand that lists the data paths of all 
> existing tables.
> {code}
> % bin/nodetool datapaths -- example
> Keyspace : example
>   Table : test
>   Paths :
>   
> /var/lib/cassandra/data/example/test-02f5b8d0c0e311ecb327ff24df5ab301
> 
> {code}






[jira] [Updated] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables

2022-04-22 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-17568:
--
  Fix Version/s: 4.1
 (was: 4.x)
Source Control Link: 
https://github.com/apache/cassandra/commit/c26dc06a28b0e150384474001ac23026ae76e6d5
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Implement nodetool command to list data directories of existing tables
> --
>
> Key: CASSANDRA-17568
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17568
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tool/nodetool
>Reporter: Tibor Repasi
>Assignee: Tibor Repasi
>Priority: Normal
> Fix For: 4.1
>
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> When a table is created, dropped and re-created with the same name, 
> directories remain within data paths. Operators may struggle to find out 
> which directories belong to existing tables and which may be subject to 
> removal. Although the information is available in CQL as well as in MBeans 
> via JMX, convenient access to this information is still missing.
> My proposal is a new nodetool subcommand that lists the data paths of all 
> existing tables.
> {code}
> % bin/nodetool datapaths -- example
> Keyspace : example
>   Table : test
>   Paths :
>   
> /var/lib/cassandra/data/example/test-02f5b8d0c0e311ecb327ff24df5ab301
> 
> {code}






[jira] [Commented] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables

2022-04-22 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526404#comment-17526404
 ] 

Brandon Williams commented on CASSANDRA-17568:
--

+1

> Implement nodetool command to list data directories of existing tables
> --
>
> Key: CASSANDRA-17568
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17568
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tool/nodetool
>Reporter: Tibor Repasi
>Assignee: Tibor Repasi
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> When a table is created, dropped and re-created with the same name, 
> directories remain within data paths. Operators may struggle to find out 
> which directories belong to existing tables and which may be subject to 
> removal. Although the information is available in CQL as well as in MBeans 
> via JMX, convenient access to this information is still missing.
> My proposal is a new nodetool subcommand that lists the data paths of all 
> existing tables.
> {code}
> % bin/nodetool datapaths -- example
> Keyspace : example
>   Table : test
>   Paths :
>   
> /var/lib/cassandra/data/example/test-02f5b8d0c0e311ecb327ff24df5ab301
> 
> {code}






[cassandra] branch trunk updated: add datapaths subcommand to nodetool

2022-04-22 Thread smiklosovic
This is an automated email from the ASF dual-hosted git repository.

smiklosovic pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/trunk by this push:
 new c26dc06a28 add datapaths subcommand to nodetool
c26dc06a28 is described below

commit c26dc06a28b0e150384474001ac23026ae76e6d5
Author: Tibor Répási 
AuthorDate: Wed Apr 20 22:10:13 2022 +0200

add datapaths subcommand to nodetool

patch by Tibor Repasi; reviewed by Stefan Miklosovic and Brandon Williams 
for CASSANDRA-17568
---
 CHANGES.txt|   1 +
 .../pages/troubleshooting/use_nodetool.adoc|  43 ++
 src/java/org/apache/cassandra/tools/NodeTool.java  |   1 +
 .../apache/cassandra/tools/nodetool/DataPaths.java |  53 +++
 .../tools/nodetool/stats/DataPathsHolder.java  |  84 ++
 .../tools/nodetool/stats/DataPathsPrinter.java |  63 
 .../cassandra/tools/nodetool/DataPathsTest.java| 170 +
 7 files changed, 415 insertions(+)

diff --git a/CHANGES.txt b/CHANGES.txt
index 5ab33a229b..a1213090e2 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.1
+ * Tool to list data paths of existing tables (CASSANDRA-17568)
  * Migrate track_warnings to more standard naming conventions and use latest 
configuration types rather than long (CASSANDRA-17560)
  * Add support for CONTAINS and CONTAINS KEY in conditional UPDATE and DELETE 
statement (CASSANDRA-10537)
  * Migrate advanced config parameters to the new Config types (CASSANDRA-17431)
diff --git a/doc/modules/cassandra/pages/troubleshooting/use_nodetool.adoc 
b/doc/modules/cassandra/pages/troubleshooting/use_nodetool.adoc
index f80d039695..a313432cbb 100644
--- a/doc/modules/cassandra/pages/troubleshooting/use_nodetool.adoc
+++ b/doc/modules/cassandra/pages/troubleshooting/use_nodetool.adoc
@@ -240,3 +240,46 @@ concurrent compactions such that compactions complete 
quickly but don't
 take too many resources away from query threads is very important for
 performance. If you notice compaction unable to keep up, try tuning
 Cassandra's `concurrent_compactors` or `compaction_throughput` options.
+
+[[nodetool-datapaths]]
+== Paths used for data files
+
+Cassandra is persisting data on disk within the configured directories. Data
+files are distributed among the directories configured with 
`data_file_directories`.
+Resembling the structure of keyspaces and tables, Cassandra is creating
+subdirectories within `data_file_directories`. However, directories aren't 
removed
+even if the tables and keyspaces are dropped. While these directories are kept 
with
+the reason of holding snapshots, they are subject to removal. This is where 
operators
+need to know which directories are still in use. Running the `nodetool 
datapaths`
+command is an easy way to list in which directories Cassandra is actually 
storing
+sstable data on disk.
+
+[source, bash]
+
+% nodetool datapaths -- system_auth
+Keyspace: system_auth
+   Table: role_permissions
+   Paths:
+   
/var/lib/cassandra/data/system_auth/role_permissions-3afbe79f219431a7add7f5ab90d8ec9c
+
+   Table: network_permissions
+   Paths:
+   
/var/lib/cassandra/data/system_auth/network_permissions-d46780c22f1c3db9b4c1b8d9fbc0cc23
+
+   Table: resource_role_permissons_index
+   Paths:
+   
/var/lib/cassandra/data/system_auth/resource_role_permissons_index-5f2fbdad91f13946bd25d5da3a5c35ec
+
+   Table: roles
+   Paths:
+   
/var/lib/cassandra/data/system_auth/roles-5bc52802de2535edaeab188eecebb090
+
+   Table: role_members
+   Paths:
+   
/var/lib/cassandra/data/system_auth/role_members-0ecdaa87f8fb3e6088d174fb36fe5c0d
+
+
+
+By default all keyspaces and tables are listed, however, a list of `keyspace` 
and
+`keyspace.table` arguments can be given to query specific data paths. Using 
the `--format`
+option the output can be formatted as YAML or JSON.
diff --git a/src/java/org/apache/cassandra/tools/NodeTool.java 
b/src/java/org/apache/cassandra/tools/NodeTool.java
index f9422bdbba..476353fee0 100644
--- a/src/java/org/apache/cassandra/tools/NodeTool.java
+++ b/src/java/org/apache/cassandra/tools/NodeTool.java
@@ -102,6 +102,7 @@ public class NodeTool
 Compact.class,
 CompactionHistory.class,
 CompactionStats.class,
+DataPaths.class,
 Decommission.class,
 DescribeCluster.class,
 DescribeRing.class,
diff --git a/src/java/org/apache/cassandra/tools/nodetool/DataPaths.java 
b/src/java/org/apache/cassandra/tools/nodetool/DataPaths.java
new file mode 100644
index 00..10ae01e8da
--- /dev/null
+++ b/src/java/org/apache/cassandra/tools/nodetool/DataPaths.java
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or 

[jira] [Updated] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables

2022-04-22 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-17568:
--
Status: Ready to Commit  (was: Review In Progress)

> Implement nodetool command to list data directories of existing tables
> --
>
> Key: CASSANDRA-17568
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17568
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tool/nodetool
>Reporter: Tibor Repasi
>Assignee: Tibor Repasi
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> When a table is created, dropped and re-created with the same name, 
> directories remain within data paths. Operators may struggle to find out 
> which directories belong to existing tables and which may be subject to 
> removal. Although the information is available in CQL as well as in MBeans 
> via JMX, convenient access to this information is still missing.
> My proposal is a new nodetool subcommand that lists the data paths of all 
> existing tables.
> {code}
> % bin/nodetool datapaths -- example
> Keyspace : example
>   Table : test
>   Paths :
>   
> /var/lib/cassandra/data/example/test-02f5b8d0c0e311ecb327ff24df5ab301
> 
> {code}






[jira] [Comment Edited] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables

2022-04-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526401#comment-17526401
 ] 

Stefan Miklosovic edited comment on CASSANDRA-17568 at 4/22/22 12:44 PM:
-

https://app.circleci.com/pipelines/github/instaclustr/cassandra/935/workflows/5894b8ae-571c-4d95-8379-fcb894da34e9

One JVM dtest fails; it is not related at all and is a known flaky test.


was (Author: smiklosovic):
https://app.circleci.com/pipelines/github/instaclustr/cassandra/935/workflows/5894b8ae-571c-4d95-8379-fcb894da34e9

> Implement nodetool command to list data directories of existing tables
> --
>
> Key: CASSANDRA-17568
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17568
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tool/nodetool
>Reporter: Tibor Repasi
>Assignee: Tibor Repasi
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> When a table is created, dropped and re-created with the same name, 
> directories remain within data paths. Operators may struggle to find out 
> which directories belong to existing tables and which may be subject to 
> removal. Although the information is available in CQL as well as in MBeans 
> via JMX, convenient access to this information is still missing.
> My proposal is a new nodetool subcommand that lists the data paths of all 
> existing tables.
> {code}
> % bin/nodetool datapaths -- example
> Keyspace : example
>   Table : test
>   Paths :
>   
> /var/lib/cassandra/data/example/test-02f5b8d0c0e311ecb327ff24df5ab301
> 
> {code}






[jira] [Commented] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables

2022-04-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526401#comment-17526401
 ] 

Stefan Miklosovic commented on CASSANDRA-17568:
---

https://app.circleci.com/pipelines/github/instaclustr/cassandra/935/workflows/5894b8ae-571c-4d95-8379-fcb894da34e9

> Implement nodetool command to list data directories of existing tables
> --
>
> Key: CASSANDRA-17568
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17568
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tool/nodetool
>Reporter: Tibor Repasi
>Assignee: Tibor Repasi
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> When a table is created, dropped and re-created with the same name, 
> directories remain within data paths. Operators may struggle to find out 
> which directories belong to existing tables and which may be subject to 
> removal. Although the information is available in CQL as well as in MBeans 
> via JMX, convenient access to this information is still missing.
> My proposal is a new nodetool subcommand that lists the data paths of all 
> existing tables.
> {code}
> % bin/nodetool datapaths -- example
> Keyspace : example
>   Table : test
>   Paths :
>   
> /var/lib/cassandra/data/example/test-02f5b8d0c0e311ecb327ff24df5ab301
> 
> {code}






[cassandra-website] branch asf-staging updated (4cb38fd9 -> eb4d1ab0)

2022-04-22 Thread git-site-role
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


 discard 4cb38fd9 generate docs for 8fd077a6
 new eb4d1ab0 generate docs for 8fd077a6

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (4cb38fd9)
\
 N -- N -- N   refs/heads/asf-staging (eb4d1ab0)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 site-ui/build/ui-bundle.zip | Bin 4740078 -> 4740078 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)





[jira] [Updated] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables

2022-04-22 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-17568:
-
Status: Review In Progress  (was: Needs Committer)

> Implement nodetool command to list data directories of existing tables
> --
>
> Key: CASSANDRA-17568
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17568
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tool/nodetool
>Reporter: Tibor Repasi
>Assignee: Tibor Repasi
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> When a table is created, dropped and re-created with the same name, 
> directories remain within data paths. Operators may struggle to find out 
> which directories belong to existing tables and which may be subject to 
> removal. Although the information is available in CQL as well as in MBeans 
> via JMX, convenient access to this information is still missing.
> My proposal is a new nodetool subcommand that lists the data paths of all 
> existing tables.
> {code}
> % bin/nodetool datapaths -- example
> Keyspace : example
>   Table : test
>   Paths :
>   
> /var/lib/cassandra/data/example/test-02f5b8d0c0e311ecb327ff24df5ab301
> 
> {code}






[jira] [Updated] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables

2022-04-22 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-17568:
-
Reviewers: Brandon Williams, Stefan Miklosovic  (was: Stefan Miklosovic)

> Implement nodetool command to list data directories of existing tables
> --
>
> Key: CASSANDRA-17568
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17568
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tool/nodetool
>Reporter: Tibor Repasi
>Assignee: Tibor Repasi
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> When a table is created, dropped and re-created with the same name, the
> directories of the dropped table remain within the data paths. Operators may
> find it hard to tell which directories belong to existing tables and which
> may be subject to removal. Although the information is available in CQL as
> well as in MBeans via JMX, convenient access to it is still missing.
> My proposal is a new nodetool subcommand that lists the data paths of all
> existing tables.
> {code}
> % bin/nodetool datapaths -- example
> Keyspace : example
>   Table : test
>   Paths :
>     /var/lib/cassandra/data/example/test-02f5b8d0c0e311ecb327ff24df5ab301
> {code}






[jira] [Commented] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables

2022-04-22 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526341#comment-17526341
 ] 

Brandon Williams commented on CASSANDRA-17568:
--

Please add a J11 CI run, and if that is clean I am +1.

> Implement nodetool command to list data directories of existing tables
> --
>
> Key: CASSANDRA-17568
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17568
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tool/nodetool
>Reporter: Tibor Repasi
>Assignee: Tibor Repasi
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> When a table is created, dropped and re-created with the same name, the
> directories of the dropped table remain within the data paths. Operators may
> find it hard to tell which directories belong to existing tables and which
> may be subject to removal. Although the information is available in CQL as
> well as in MBeans via JMX, convenient access to it is still missing.
> My proposal is a new nodetool subcommand that lists the data paths of all
> existing tables.
> {code}
> % bin/nodetool datapaths -- example
> Keyspace : example
>   Table : test
>   Paths :
>     /var/lib/cassandra/data/example/test-02f5b8d0c0e311ecb327ff24df5ab301
> {code}






[jira] [Commented] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables

2022-04-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526338#comment-17526338
 ] 

Stefan Miklosovic commented on CASSANDRA-17568:
---

Squashed commits branch: 
https://github.com/instaclustr/cassandra/commits/CASSANDRA-17568-squashed

build: 
https://app.circleci.com/pipelines/github/instaclustr/cassandra/935/workflows/68fe55ef-851d-4c54-80af-668068e99abd

I am +1 on this. Waiting for the second reviewer.

> Implement nodetool command to list data directories of existing tables
> --
>
> Key: CASSANDRA-17568
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17568
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tool/nodetool
>Reporter: Tibor Repasi
>Assignee: Tibor Repasi
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> When a table is created, dropped and re-created with the same name, the
> directories of the dropped table remain within the data paths. Operators may
> find it hard to tell which directories belong to existing tables and which
> may be subject to removal. Although the information is available in CQL as
> well as in MBeans via JMX, convenient access to it is still missing.
> My proposal is a new nodetool subcommand that lists the data paths of all
> existing tables.
> {code}
> % bin/nodetool datapaths -- example
> Keyspace : example
>   Table : test
>   Paths :
>     /var/lib/cassandra/data/example/test-02f5b8d0c0e311ecb327ff24df5ab301
> {code}






[jira] [Updated] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables

2022-04-22 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-17568:
--
Status: Needs Committer  (was: Review In Progress)

> Implement nodetool command to list data directories of existing tables
> --
>
> Key: CASSANDRA-17568
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17568
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tool/nodetool
>Reporter: Tibor Repasi
>Assignee: Tibor Repasi
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> When a table is created, dropped and re-created with the same name, the
> directories of the dropped table remain within the data paths. Operators may
> find it hard to tell which directories belong to existing tables and which
> may be subject to removal. Although the information is available in CQL as
> well as in MBeans via JMX, convenient access to it is still missing.
> My proposal is a new nodetool subcommand that lists the data paths of all
> existing tables.
> {code}
> % bin/nodetool datapaths -- example
> Keyspace : example
>   Table : test
>   Paths :
>     /var/lib/cassandra/data/example/test-02f5b8d0c0e311ecb327ff24df5ab301
> {code}






[jira] [Updated] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables

2022-04-22 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-17568:
--
Status: Review In Progress  (was: Changes Suggested)

> Implement nodetool command to list data directories of existing tables
> --
>
> Key: CASSANDRA-17568
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17568
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tool/nodetool
>Reporter: Tibor Repasi
>Assignee: Tibor Repasi
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> When a table is created, dropped and re-created with the same name, the
> directories of the dropped table remain within the data paths. Operators may
> find it hard to tell which directories belong to existing tables and which
> may be subject to removal. Although the information is available in CQL as
> well as in MBeans via JMX, convenient access to it is still missing.
> My proposal is a new nodetool subcommand that lists the data paths of all
> existing tables.
> {code}
> % bin/nodetool datapaths -- example
> Keyspace : example
>   Table : test
>   Paths :
>     /var/lib/cassandra/data/example/test-02f5b8d0c0e311ecb327ff24df5ab301
> {code}






[jira] [Comment Edited] (CASSANDRA-17543) ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: strategy=NONE coordinator=1 flush=false paging=false] times out sporadically

2022-04-22 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526329#comment-17526329
 ] 

Andres de la Peña edited comment on CASSANDRA-17543 at 4/22/22 11:03 AM:
-

[~maedhroz] are we ready to commit this? I have just rebased without conflicts 
and I'm running CI one last time:
||PR||CI||
|[4.0|https://github.com/apache/cassandra/pull/1568]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1495/workflows/d8256249-6af4-425b-80c0-3b5109204530] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1495/workflows/264718b3-376c-4d85-b700-767dee99e3bd]|
|[trunk|https://github.com/apache/cassandra/pull/1569]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1496/workflows/1699556f-c389-472d-b217-fd17e1007a41] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1496/workflows/fc449127-5ea4-4438-9aaa-163c78c62884]|


was (Author: adelapena):
[~maedhroz] are ready to commit this? I have just rebased without conflicts and 
I'm running CI one last time:

||PR||CI||
|[4.0|https://github.com/apache/cassandra/pull/1568]  
|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1495/workflows/d8256249-6af4-425b-80c0-3b5109204530]
 
[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1495/workflows/264718b3-376c-4d85-b700-767dee99e3bd]|
|[trunk|https://github.com/apache/cassandra/pull/1569]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1496/workflows/1699556f-c389-472d-b217-fd17e1007a41]
 
[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1496/workflows/fc449127-5ea4-4438-9aaa-163c78c62884]|

> ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: strategy=NONE 
> coordinator=1 flush=false paging=false] times out sporadically
> ---
>
> Key: CASSANDRA-17543
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17543
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/java
>Reporter: Caleb Rackliffe
>Assignee: Andres de la Peña
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> org.apache.cassandra.distributed.test.ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8:
>  strategy=NONE coordinator=1 flush=false paging=false]
> {noformat}
> Error Message
> Timeout occurred. Please note the time in the report does not reflect the 
> time until the timeout.
> Stacktrace
> junit.framework.AssertionFailedError: Timeout occurred. Please note the time 
> in the report does not reflect the time until the timeout.
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.util.Vector.forEach(Vector.java:1388)
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.util.Vector.forEach(Vector.java:1388)
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.util.Vector.forEach(Vector.java:1388)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at org.apache.cassandra.anttasks.TestHelper.execute(TestHelper.java:53)
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.util.Vector.forEach(Vector.java:1388)
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at 
> 

[jira] [Commented] (CASSANDRA-17543) ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: strategy=NONE coordinator=1 flush=false paging=false] times out sporadically

2022-04-22 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526329#comment-17526329
 ] 

Andres de la Peña commented on CASSANDRA-17543:
---

[~maedhroz] are we ready to commit this? I have just rebased without conflicts
and I'm running CI one last time:

||PR||CI||
|[4.0|https://github.com/apache/cassandra/pull/1568]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1495/workflows/d8256249-6af4-425b-80c0-3b5109204530] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1495/workflows/264718b3-376c-4d85-b700-767dee99e3bd]|
|[trunk|https://github.com/apache/cassandra/pull/1569]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1496/workflows/1699556f-c389-472d-b217-fd17e1007a41] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1496/workflows/fc449127-5ea4-4438-9aaa-163c78c62884]|

> ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: strategy=NONE 
> coordinator=1 flush=false paging=false] times out sporadically
> ---
>
> Key: CASSANDRA-17543
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17543
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/java
>Reporter: Caleb Rackliffe
>Assignee: Andres de la Peña
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> org.apache.cassandra.distributed.test.ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8:
>  strategy=NONE coordinator=1 flush=false paging=false]
> {noformat}
> Error Message
> Timeout occurred. Please note the time in the report does not reflect the 
> time until the timeout.
> Stacktrace
> junit.framework.AssertionFailedError: Timeout occurred. Please note the time 
> in the report does not reflect the time until the timeout.
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.util.Vector.forEach(Vector.java:1388)
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.util.Vector.forEach(Vector.java:1388)
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.util.Vector.forEach(Vector.java:1388)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at org.apache.cassandra.anttasks.TestHelper.execute(TestHelper.java:53)
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.util.Vector.forEach(Vector.java:1388)
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {noformat}
> See 
> https://ci-cassandra.apache.org/job/Cassandra-trunk/1075/testReport/org.apache.cassandra.distributed.test/ReadRepairQueryTypesTest/testUnrestrictedQueryOnSkinnyTable_8__strategy_NONE_coordinator_1_flush_false_paging_false_/




[jira] [Commented] (CASSANDRA-17150) Guardrails for disk usage

2022-04-22 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526317#comment-17526317
 ] 

Andres de la Peña commented on CASSANDRA-17150:
---

[~e.dimitrova] thanks for the review. I think I have addressed the last bits. 
I'm running CI after rebase+squash:
||PR||CI||
|[trunk|https://github.com/apache/cassandra/pull/1546]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1494/workflows/d032178d-f8a9-4124-b36f-5bf6f47b3116] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1494/workflows/bc844580-6f3a-4bc3-a4d0-d85f082330f8]|

Please note that during the rebase I have replaced a few references to the
removed `Config.DISABLED_GUARDRAIL` constant with {{-1}}. Those references
were recently added to track warnings during CASSANDRA-17560. As mentioned
[here|https://github.com/apache/cassandra/pull/1572#discussion_r854251196],
using {{-1}} as the disabled value is a global config convention and not a
guardrails thing, so we should either use it directly or define a new constant
with a more generic name. If we decide to do the latter, I'd prefer to do it in
a separate ticket, so we can focus on locating all the usages.

> Guardrails for disk usage
> -
>
> Key: CASSANDRA-17150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17150
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/Guardrails
>Reporter: Andres de la Peña
>Assignee: Andres de la Peña
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>
> Add guardrails for disk usage establishing soft/hard limits on the percentage 
> of used disk space. For example:
> {code}
> # Warning threshold to warn when local disk usage exceeds threshold.
> # Valid values: (1, 100]
> # Defaults to -1 to disable.
> # disk_usage_percentage_warn_threshold: -1
> # Failure threshold to reject write requests if replica disk usage exceeds
> # threshold. Valid values: (1, 100]
> # Defaults to -1 to disable.
> # disk_usage_percentage_failure_threshold: -1
> {code}
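
How a soft/hard pair of thresholds with a {{-1}} "disabled" value could be evaluated is sketched below. This is a minimal illustration under stated assumptions, not the code from the pull request: the function name `check_disk_usage`, the returned strings and the strict "greater than" comparison (read from the word "exceeds" in the config comments) are all choices made for the example.

```python
DISABLED = -1  # the ticket's config uses -1 to mean "threshold disabled"


def check_disk_usage(used_percent, warn_threshold=DISABLED, fail_threshold=DISABLED):
    """Return "fail", "warn" or "ok" for a disk usage percentage."""
    # The hard limit is checked first so that it wins when both are exceeded.
    if fail_threshold != DISABLED and used_percent > fail_threshold:
        return "fail"  # hard limit: write requests would be rejected
    if warn_threshold != DISABLED and used_percent > warn_threshold:
        return "warn"  # soft limit: a warning would be emitted
    return "ok"


statuses = [
    check_disk_usage(95),          # both thresholds disabled by default
    check_disk_usage(85, 80, 90),  # above the warn threshold only
    check_disk_usage(95, 80, 90),  # above the failure threshold
]
```

With both thresholds left at their defaults the check never triggers, which mirrors the commented-out defaults in the example configuration.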






[jira] [Updated] (CASSANDRA-17212) Migrate threshold for minimum keyspace replication factor to guardrails

2022-04-22 Thread Jira


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andres de la Peña updated CASSANDRA-17212:
--
Reviewers: Andres de la Peña

> Migrate threshold for minimum keyspace replication factor to guardrails
> ---
>
> Key: CASSANDRA-17212
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17212
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/Guardrails
>Reporter: Andres de la Peña
>Assignee: Savni Nagarkar
>Priority: Normal
> Fix For: 4.x
>
>
> The config property 
> [{{minimum_keyspace_rf}}|https://github.com/apache/cassandra/blob/5fdadb25f95099b8945d9d9ee11d3e380d3867f4/conf/cassandra.yaml]
>  that was added by CASSANDRA-14557 can be migrated to guardrails, for example:
> {code}
> guardrails:
>     ...
>     replication_factor:
>         warn_threshold: 2
>         abort_threshold: 3
> {code}






[jira] [Updated] (CASSANDRA-11871) Allow to aggregate by time intervals

2022-04-22 Thread Benjamin Lerer (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-11871:
---
Description: 
For time series data it can be useful to aggregate by time intervals.

The idea would be to add support for one or several functions in the {{GROUP 
BY}} clause.

Regarding the implementation, even if in general I also prefer to follow the 
SQL syntax, I do not believe it will be a good fit for Cassandra.

If we have a table like:
{code}
CREATE TABLE trades
(
    symbol text,
    date date,
    time time,
    priceMantissa int,
    priceExponent tinyint,
    volume int,
    PRIMARY KEY ((symbol, date), time)
);
{code}
The trades will be inserted with an increasing time and sorted in the same 
order. As we may have to process a large amount of data, we want to try to 
limit ourselves to the cases where we can build the groups on the fly (which 
is not a requirement in the SQL world).

If we want to get the number of trades per minute with the SQL syntax we will 
have to write:

{{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' 
AND date = '2016-01-11' GROUP BY hour(time), minute(time);}}
which is fine. The problem is that if the user mistakenly inverts the functions 
like this:
{{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' 
AND date = '2016-01-11' GROUP BY minute(time), hour(time);}}
the query will return weird results.
The only way to prevent that would be to check the function order and make sure 
that we do not allow functions to be skipped (e.g. {{GROUP BY hour(time), 
second(time)}}).

In my opinion a function like {{floor(, )}} will be 
much better as it does not allow for this type of mistake and is much more 
flexible (you can create 5-minute buckets if you want to).
{code}SELECT floor(time, m), count() FROM Trades 
 WHERE symbol = 'AAPL' AND date = '2016-01-11'
 GROUP BY floor(time, m);{code}

An important aspect to keep in mind with a function like {{floor}} is the 
starting point. For a query like:  {{SELECT floor(time, 2h), count() FROM Trades 
WHERE symbol = 'AAPL' AND date = '2016-01-11' AND time >= '01:30:00' AND time 
<= '07:30:00' GROUP BY floor(time, 2h);}}, I think that ideally the result 
should return 3 groups: {{01:30:00}}, {{03:30:00}} and {{05:30:00}}.  
 

  was:
For time series data it can be useful to aggregate by time intervals.

The idea would be to add support for one or several functions in the {{GROUP 
BY}} clause.

Regarding the implementation, even if in general I also prefer to follow the 
SQL syntax, I do not believe it will be a good fit for Cassandra.

If we have a table like:
{code}
CREATE TABLE trades
(
    symbol text,
    date date,
    time time,
    priceMantissa int,
    priceExponent tinyint,
    volume int,
    PRIMARY KEY ((symbol, date), time)
);
{code}
The trades will be inserted with an increasing time and sorted in the same 
order. As we may have to process a large amount of data, we want to try to 
limit ourselves to the cases where we can build the groups on the fly (which 
is not a requirement in the SQL world).

If we want to get the number of trades per minute with the SQL syntax we will 
have to write:

{{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' 
AND date = '2016-01-11' GROUP BY hour(time), minute(time);}}
which is fine. The problem is that if the user mistakenly inverts the functions 
like this:
{{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' 
AND date = '2016-01-11' GROUP BY minute(time), hour(time);}}
the query will return weird results.
The only way to prevent that would be to check the function order and make sure 
that we do not allow functions to be skipped (e.g. {{GROUP BY hour(time), 
second(time)}}).

In my opinion a function like {{floor(, )}} will be 
much better as it does not allow for this type of mistake and is much more 
flexible (you can create 5-minute buckets if you want to).
{code}SELECT floor(time, m), count() FROM Trades WHERE symbol = 'AAPL' AND date 
= '2016-01-11' GROUP BY floor(time, m);{code}

An important aspect to keep in mind with a function like {{floor}} is the 
starting point. For a query like:  {{SELECT floor(time, 2h), count() FROM Trades 
WHERE symbol = 'AAPL' AND date = '2016-01-11' AND time >= '01:30:00' AND time 
<= '07:30:00' GROUP BY floor(time, 2h);}}, I think that ideally the result 
should return 3 groups: {{01:30:00}}, {{03:30:00}} and {{05:30:00}}.  
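
The starting-point behaviour described for {{floor}} can be illustrated with a small sketch. It is not the committed implementation: the helper names (`floor_time`, `count_per_bucket`, `hms`), the use of `timedelta` as a time of day and the sample trade times are all assumptions made for the example. It also shows why sorted input matters, since the groups are built in a single pass over consecutive rows.

```python
from datetime import timedelta
from itertools import groupby


def floor_time(t, bucket, start=timedelta(0)):
    """Round t down to the beginning of its bucket, counting buckets from start."""
    return start + ((t - start) // bucket) * bucket


def count_per_bucket(sorted_times, bucket, start=timedelta(0)):
    """Count rows per bucket in one pass; relies on the input being sorted by time."""
    keyed = groupby(sorted_times, key=lambda t: floor_time(t, bucket, start))
    return [(key, sum(1 for _ in rows)) for key, rows in keyed]


def hms(h, m, s=0):
    """Time of day expressed as an offset from midnight."""
    return timedelta(hours=h, minutes=m, seconds=s)


# Hypothetical trade times between 01:30:00 and 07:30:00, already sorted.
trades = [hms(1, 30), hms(2, 45), hms(3, 30), hms(5, 29, 59), hms(7, 0)]
groups = count_per_bucket(trades, bucket=timedelta(hours=2), start=hms(1, 30))
```

With a start of 01:30:00 and 2-hour buckets, these rows fall into the three groups named in the description (01:30:00, 03:30:00 and 05:30:00). With the default start of midnight the same rows would instead be bucketed at 00:00:00, 02:00:00, 04:00:00 and 06:00:00.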
 


> Allow to aggregate by time intervals
> 
>
> Key: CASSANDRA-11871
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11871
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/CQL
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>Priority: 

[jira] [Updated] (CASSANDRA-11871) Allow to aggregate by time intervals

2022-04-22 Thread Benjamin Lerer (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-11871:
---
Description: 
For time series data it can be useful to aggregate by time intervals.

The idea would be to add support for one or several functions in the {{GROUP 
BY}} clause.

Regarding the implementation, even if in general I also prefer to follow the 
SQL syntax, I do not believe it will be a good fit for Cassandra.

If we have a table like:
{code}
CREATE TABLE trades
(
    symbol text,
    date date,
    time time,
    priceMantissa int,
    priceExponent tinyint,
    volume int,
    PRIMARY KEY ((symbol, date), time)
);
{code}
The trades will be inserted with an increasing time and sorted in the same 
order. As we may have to process a large amount of data, we want to try to 
limit ourselves to the cases where we can build the groups on the fly (which 
is not a requirement in the SQL world).

If we want to get the number of trades per minute with the SQL syntax we will 
have to write:

{{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' 
AND date = '2016-01-11' GROUP BY hour(time), minute(time);}}
which is fine. The problem is that if the user mistakenly inverts the functions 
like this:
{{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' 
AND date = '2016-01-11' GROUP BY minute(time), hour(time);}}
the query will return weird results.
The only way to prevent that would be to check the function order and make sure 
that we do not allow functions to be skipped (e.g. {{GROUP BY hour(time), 
second(time)}}).

In my opinion a function like {{floor(, )}} will be 
much better as it does not allow for this type of mistake and is much more 
flexible (you can create 5-minute buckets if you want to).
{code}
SELECT floor(time, m), count() FROM Trades 
WHERE symbol = 'AAPL' AND date = '2016-01-11'
GROUP BY floor(time, m);
{code}
An important aspect to keep in mind with a function like {{floor}} is the 
starting point. For a query like:  {{SELECT floor(time, 2h), count() FROM Trades 
WHERE symbol = 'AAPL' AND date = '2016-01-11' AND time >= '01:30:00' AND time 
<= '07:30:00' GROUP BY floor(time, 2h);}}, I think that ideally the result 
should return 3 groups: {{01:30:00}}, {{03:30:00}} and {{05:30:00}}.  
 

  was:
For time series data it can be useful to aggregate by time intervals.

The idea would be to add support for one or several functions in the {{GROUP 
BY}} clause.

Regarding the implementation, even if in general I also prefer to follow the 
SQL syntax, I do not believe it will be a good fit for Cassandra.

If we have a table like:
{code}
CREATE TABLE trades
(
    symbol text,
    date date,
    time time,
    priceMantissa int,
    priceExponent tinyint,
    volume int,
    PRIMARY KEY ((symbol, date), time)
);
{code}
The trades will be inserted with an increasing time and sorted in the same 
order. As we may have to process a large amount of data, we want to try to 
limit ourselves to the cases where we can build the groups on the fly (which 
is not a requirement in the SQL world).

If we want to get the number of trades per minute with the SQL syntax we will 
have to write:

{{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' 
AND date = '2016-01-11' GROUP BY hour(time), minute(time);}}
which is fine. The problem is that if the user mistakenly inverts the functions 
like this:
{{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' 
AND date = '2016-01-11' GROUP BY minute(time), hour(time);}}
the query will return weird results.
The only way to prevent that would be to check the function order and make sure 
that we do not allow functions to be skipped (e.g. {{GROUP BY hour(time), 
second(time)}}).

In my opinion a function like {{floor(, )}} will be 
much better as it does not allow for this type of mistake and is much more 
flexible (you can create 5-minute buckets if you want to).
{code}SELECT floor(time, m), count() FROM Trades 
 WHERE symbol = 'AAPL' AND date = '2016-01-11'
 GROUP BY floor(time, m);{code}

An important aspect to keep in mind with a function like {{floor}} is the 
starting point. For a query like:  {{SELECT floor(time, 2h), count() FROM Trades 
WHERE symbol = 'AAPL' AND date = '2016-01-11' AND time >= '01:30:00' AND time 
<= '07:30:00' GROUP BY floor(time, 2h);}}, I think that ideally the result 
should return 3 groups: {{01:30:00}}, {{03:30:00}} and {{05:30:00}}.  
 


> Allow to aggregate by time intervals
> 
>
> Key: CASSANDRA-11871
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11871
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/CQL
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>Priority: 

[jira] [Updated] (CASSANDRA-11871) Allow to aggregate by time intervals

2022-04-22 Thread Benjamin Lerer (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-11871:
---
  Fix Version/s: 4.1
 (was: 4.x)
Source Control Link: 
https://github.com/apache/cassandra/commit/1ad8bf67a9c82cbb5ff38e5cf785f9fe2516d009
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

Patch committed into trunk at 1ad8bf67a9c82cbb5ff38e5cf785f9fe2516d009

> Allow to aggregate by time intervals
> 
>
> Key: CASSANDRA-11871
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11871
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/CQL
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>Priority: Normal
> Fix For: 4.1
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> For time series data it can be useful to aggregate by time intervals.
> The idea would be to add support for one or several functions in the {{GROUP 
> BY}} clause.
> Regarding the implementation, even if in general I also prefer to follow the 
> SQL syntax, I do not believe it will be a good fit for Cassandra.
> If we have a table like:
> {code}
> CREATE TABLE trades
> (
>     symbol text,
>     date date,
>     time time,
>     priceMantissa int,
>     priceExponent tinyint,
>     volume int,
>     PRIMARY KEY ((symbol, date), time)
> );
> {code}
> The trades will be inserted with an increasing time and sorted in the same 
> order. As we may have to process a large amount of data, we want to try to 
> limit ourselves to the cases where we can build the groups on the fly (which 
> is not a requirement in the SQL world).
> If we want to get the number of trades per minute with the SQL syntax we 
> will have to write:
> {{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' 
> AND date = '2016-01-11' GROUP BY hour(time), minute(time);}}
> which is fine. The problem is that if the user inverts the functions by 
> mistake, like this:
> {{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' 
> AND date = '2016-01-11' GROUP BY minute(time), hour(time);}}
> the query will return weird results.
> The only way to prevent that would be to check the function order and make 
> sure that we do not allow functions to be skipped (e.g. {{GROUP BY hour(time), 
> second(time)}}).
> In my opinion a function like {{floor(, )}} will be 
> much better as it does not allow for this type of mistake and is much more 
> flexible (you can create 5-minute buckets if you want to).
> {code}SELECT floor(time, m), count() FROM Trades WHERE symbol = 'AAPL' AND 
> date = '2016-01-11' GROUP BY floor(time, m);{code}
> An important aspect to keep in mind with a function like {{floor}} is the 
> starting point. For a query like:  {{SELECT floor(time, 2h), count() FROM 
> Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' AND time >= '01:30:00' 
> AND time <= '07:30:00' GROUP BY floor(time, 2h);}}, I think that ideally the 
> result should return 3 groups: {{01:30:00}}, {{03:30:00}} and {{05:30:00}}.
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-11871) Allow to aggregate by time intervals

2022-04-22 Thread Benjamin Lerer (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-11871:
---
Status: Ready to Commit  (was: Review In Progress)

> Allow to aggregate by time intervals
> 
>
> Key: CASSANDRA-11871
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11871
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/CQL
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> For time series data it can be useful to aggregate by time intervals.
> The idea would be to add support for one or several functions in the {{GROUP 
> BY}} clause.
> Regarding the implementation, even if in general I also prefer to follow the 
> SQL syntax, I do not believe it will be a good fit for Cassandra.
> If we have a table like:
> {code}
> CREATE TABLE trades
> (
>     symbol text,
>     date date,
>     time time,
>     priceMantissa int,
>     priceExponent tinyint,
>     volume int,
>     PRIMARY KEY ((symbol, date), time)
> );
> {code}
> The trades will be inserted with an increasing time and sorted in the same 
> order. As we may have to process a large amount of data, we want to try to 
> limit ourselves to the cases where we can build the groups on the fly (which 
> is not a requirement in the SQL world).
> If we want to get the number of trades per minute with the SQL syntax we 
> will have to write:
> {{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' 
> AND date = '2016-01-11' GROUP BY hour(time), minute(time);}}
> which is fine. The problem is that if the user inverts the functions by 
> mistake, like this:
> {{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' 
> AND date = '2016-01-11' GROUP BY minute(time), hour(time);}}
> the query will return weird results.
> The only way to prevent that would be to check the function order and make 
> sure that we do not allow functions to be skipped (e.g. {{GROUP BY hour(time), 
> second(time)}}).
> In my opinion a function like {{floor(, )}} will be 
> much better as it does not allow for this type of mistake and is much more 
> flexible (you can create 5-minute buckets if you want to).
> {code}SELECT floor(time, m), count() FROM Trades WHERE symbol = 'AAPL' AND 
> date = '2016-01-11' GROUP BY floor(time, m);{code}
> An important aspect to keep in mind with a function like {{floor}} is the 
> starting point. For a query like:  {{SELECT floor(time, 2h), count() FROM 
> Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' AND time >= '01:30:00' 
> AND time <= '07:30:00' GROUP BY floor(time, 2h);}}, I think that ideally the 
> result should return 3 groups: {{01:30:00}}, {{03:30:00}} and {{05:30:00}}.
>  






[jira] [Updated] (CASSANDRA-11871) Allow to aggregate by time intervals

2022-04-22 Thread Benjamin Lerer (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-11871:
---
Description: 
For time series data it can be useful to aggregate by time intervals.

The idea would be to add support for one or several functions in the {{GROUP 
BY}} clause.

Regarding the implementation, even if in general I also prefer to follow the 
SQL syntax, I do not believe it will be a good fit for Cassandra.

If we have a table like:
{code}
CREATE TABLE trades
(
    symbol text,
    date date,
    time time,
    priceMantissa int,
    priceExponent tinyint,
    volume int,
    PRIMARY KEY ((symbol, date), time)
);
{code}
The trades will be inserted with an increasing time and sorted in the same 
order. As we may have to process a large amount of data, we want to try to 
limit ourselves to the cases where we can build the groups on the fly (which 
is not a requirement in the SQL world).

If we want to get the number of trades per minute with the SQL syntax we will 
have to write:

{{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' 
AND date = '2016-01-11' GROUP BY hour(time), minute(time);}}
which is fine. The problem is that if the user inverts the functions by 
mistake, like this:
{{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' 
AND date = '2016-01-11' GROUP BY minute(time), hour(time);}}
the query will return weird results.
The only way to prevent that would be to check the function order and make sure 
that we do not allow functions to be skipped (e.g. {{GROUP BY hour(time), 
second(time)}}).
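
As a rough illustration of why a skipped function is a problem when groups are built on the fly, the following Python sketch uses {{itertools.groupby}} as a stand-in for single-pass grouping over rows that are already sorted by time. This is an assumption about the mechanics, not Cassandra's implementation, and the sample times are invented. When the hour is skipped, minute 0 shows up as two separate groups, whereas a hash-based SQL engine would merge them.

```python
from collections import Counter
from datetime import time
from itertools import groupby

# Invented rows of one partition, already sorted by the clustering column.
times = [time(9, 0, 5), time(9, 0, 40), time(9, 1, 10),
         time(10, 0, 2), time(10, 0, 30), time(10, 1, 0)]


def streaming_group_count(rows, key):
    """Single-pass grouping: a new group starts whenever the key changes."""
    return [(k, sum(1 for _ in group)) for k, group in groupby(rows, key=key)]


# GROUP BY hour(time), minute(time): the key follows the sort order.
by_hour_minute = streaming_group_count(times, key=lambda t: (t.hour, t.minute))
# GROUP BY minute(time) with the hour skipped: minute 0 comes back after minute 1.
by_minute_only = streaming_group_count(times, key=lambda t: t.minute)
# What a hash-based engine would return for GROUP BY minute(time).
hashed_by_minute = dict(Counter(t.minute for t in times))

print(by_hour_minute)
print(by_minute_only)
print(hashed_by_minute)
```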

In my opinion a function like {{floor(, )}} will be 
much better as it does not allow for this type of mistake and is much more 
flexible (you can create 5-minute buckets if you want to).
{code}SELECT floor(time, m), count() FROM Trades WHERE symbol = 'AAPL' AND date 
= '2016-01-11' GROUP BY floor(time, m);{code}

An important aspect to keep in mind with a function like {{floor}} is the 
starting point. For a query like:  {{SELECT floor(time, 2h), count() FROM Trades 
WHERE symbol = 'AAPL' AND date = '2016-01-11' AND time >= '01:30:00' AND time 
<= '07:30:00' GROUP BY floor(time, 2h);}}, I think that ideally the result 
should return 3 groups: {{01:30:00}}, {{03:30:00}} and {{05:30:00}}.
 

  was:
For time series data it can be useful to aggregate by time intervals.

The idea would be to add support for one or several functions in the {{GROUP 
BY}} clause.

Regarding the implementation, even if in general I also prefer to follow the 
SQL syntax, I do not believe it will be a good fit for Cassandra.

If we have a table like:
{code}
CREATE TABLE trades
(
    symbol text,
    date date,
    time time,
    priceMantissa int,
    priceExponent tinyint,
    volume int,
    PRIMARY KEY ((symbol, date), time)
);
{code}
The trades will be inserted with an increasing time and sorted in the same 
order. As we may have to process a large amount of data, we want to try to 
limit ourselves to the cases where we can build the groups on the fly (which 
is not a requirement in the SQL world).

If we want to get the number of trades per minute with the SQL syntax we will 
have to write:

{{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' 
AND date = '2016-01-11' GROUP BY hour(time), minute(time);}}
which is fine. The problem is that if the user inverts the functions by 
mistake, like this:
{{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' 
AND date = '2016-01-11' GROUP BY minute(time), hour(time);}}
the query will return weird results.
The only way to prevent that would be to check the function order and make sure 
that we do not allow functions to be skipped (e.g. {{GROUP BY hour(time), 
second(time)}}).

In my opinion a function like {{floor(, )}} will be 
much better as it does not allow for this type of mistake and is much more 
flexible (you can create 5-minute buckets if you want to).
{{SELECT floor(time, m), count() FROM Trades WHERE symbol = 'AAPL' AND date = 
'2016-01-11' GROUP BY floor(time, m);}}

An important aspect to keep in mind with a function like {{floor}} is the 
starting point. For a query like:  {{SELECT floor(time, 2h), count() FROM Trades 
WHERE symbol = 'AAPL' AND date = '2016-01-11' AND time >= '01:30:00' AND time 
<= '07:30:00' GROUP BY floor(time, 2h);}}, I think that ideally the result 
should return 3 groups: {{01:30:00}}, {{03:30:00}} and {{05:30:00}}.
 


> Allow to aggregate by time intervals
> 
>
> Key: CASSANDRA-11871
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11871
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/CQL
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>Priority: Normal
> Fix For: 

[jira] [Commented] (CASSANDRA-16456) Add Plugin Support for CQLSH

2022-04-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526259#comment-17526259
 ] 

Stefan Miklosovic commented on CASSANDRA-16456:
---

point 3) with the addition that it should warn you that the username and 
password should be in credentials instead of cqlshrc. We do not have any 
control over any other credentials that may be located in cqlshrc, but 
username and password are the two best-known ones.

point 4) same, we should emit a warning, as is done now, that this stuff should 
be located in credentials

The reason for the warning is that we will then remove support for the 
authentication section in cqlshrc in the next release and everything will go to 
credentials only (or as flags on the command line).

point 5) if you meant override as in "applied on top of them" then yes, you are 
basically adding one set (as a mathematical construct) to the other one, with 
the detail that a value in cqlshrc is replaced by the value for the same key 
in the credentials file

point 6) yes, if the username flag is given on the console, then you ask for 
the password. Out of the box you can log in without anything and it will assume 
you are logging in anonymously.
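
The override described in point 5, together with the warning from points 3 and 4, could be sketched roughly as follows. This is only an illustration under stated assumptions: {{merge_auth_settings}} is a hypothetical helper, the settings are modelled as plain dicts rather than parsed files, the {{region}} key and all values are invented, and none of this is taken from the actual cqlsh code.

```python
import warnings


def merge_auth_settings(cqlshrc_auth, credentials):
    """Hypothetical helper: overlay credentials-file values on the cqlshrc ones."""
    legacy = sorted(k for k in ("username", "password") if k in cqlshrc_auth)
    if legacy:
        # Warn, but still honour the values for now.
        warnings.warn(
            f"{', '.join(legacy)} found in cqlshrc; move to the credentials file"
        )
    merged = dict(cqlshrc_auth)  # start from the cqlshrc values
    merged.update(credentials)   # same key: the credentials value wins
    return merged


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    merged = merge_auth_settings(
        {"username": "alice", "password": "old-secret", "region": "us-east-1"},
        {"password": "new-secret"},
    )

print(merged)
print([str(w.message) for w in caught])
```

Keys that exist only in cqlshrc survive the merge; only keys present in both places take the credentials value.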

> Add Plugin Support for CQLSH
> 
>
> Key: CASSANDRA-16456
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16456
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tool/cqlsh
>Reporter: Brian Houser
>Assignee: Brian Houser
>Priority: Normal
>  Labels: gsoc2021, mentor
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Currently the Cassandra drivers offer a plugin authenticator architecture for 
> the support of different authentication methods. This has been leveraged to 
> provide support for LDAP, Kerberos, and Sigv4 authentication. Unfortunately, 
> cqlsh, the included CLI tool, does not offer such support. Switching to a new 
> enhanced authentication scheme thus means being cut off from using cqlsh in 
> normal operation.
> We should have a means of using the same plugins and authentication providers 
> as the Python Cassandra driver.
> Here's a link to an initial draft of 
> [CEP|https://docs.google.com/document/d/1_G-OZCAEmDyuQuAN2wQUYUtZBEJpMkHWnkYELLhqvKc/edit?usp=sharing].


