[jira] [Comment Edited] (CASSANDRA-17401) Race condition in QueryProcessor causes just prepared statement not to be in the prepared statements cache

2024-01-22 Thread Long Pan (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809755#comment-17809755
 ] 

Long Pan edited comment on CASSANDRA-17401 at 1/23/24 6:39 AM:
---

[~chovatia.jayd...@gmail.com] and I managed to reproduce the issue. Basically a 
large number of QPS and client connections are necessary to reproduce it. Here 
are the steps to reproduce:

*Server Setup (Cassandra 4.0.6):*
A 3-node Cassandra cluster. Each node with 64GB mem (16GB heap), 7 CPU cores. 
({{{}native_transport_max_threads = 1024).{}}}

*Keyspace/Table:*
CREATE KEYSPACE test_ks WITH REPLICATION = \{ ‘class’ : 
‘NetworkTopologyStrategy’, ‘datacenter1’ : 3 } ;
CREATE TABLE test_ks.table1 ( p_id text, c_id text, v text);

*Client Setup:*
30 hosts (12 CPU cores per host). Each host run the following pseudo-code, 
using *GoCql* client:
{code:java}
cluster.CQLVersion = "3.4.0"
cluster.ProtoVersion = 4
cluster.Timeout = 5s
cluster.ConnectTimeout = 10s
cluster.NumConns = 3
cluster.Consistency = LocalQuorum
cluster.RetryPolicy = SimpleRetryPolicy{NumRetries: 1}
cluster.SocketKeepalive = 20s
cluster.HostSelectionPolicy = RoundRobinHostPolicy

sessionCount = 30
qpsPerSession = 30
cqlQuery = "SELECT p_id,c_id,v FROM test_ks.table1 WHERE p_id = ? AND c_id 
= ?"
for (i = 0; i < sessionCount; i++) {
   session = cluster.createSession
   rateLimiter = NewRateLimiter(qpsPerSession)
   newGoRoutine.run( sendReads(session, rateLimiter) )
}

/ *
  sendReads(session, rateLimiter) {
 for {
newGoRoutine.run (
   if (rateLimiter.allow) {
  session.execute(cqlQuery, randomString, randomString)
   }
)
 }
  }
*/ {code}
Traffic generated this way will result in ~10K coordiator QPS and ~3k client 
connections per Cassandra node.

*Trigger Point:*
Manually issue a CQL query to add a column in the table: “ALTER TABLE 
test_ks.table1 ADD new_col text;”

{*}Symmpton{*}:
Seconds after the trigger point, one or more Cassandra nodes will show number 
of native_transport threads reaching {{{}native_transport_max_threads{}}}, and 
pending native transport tasks grow endlessly.


was (Author: JIRAUSER303782):
[~chovatia.jayd...@gmail.com] and I managed to reproduce the issue. Basically a 
large number of QPS and client connections are necessary to reproduce it. Here 
are the steps to reproduce:

*Server Setup (Cassandra 4.0.6):*
A 3-node Cassandra cluster. Each node with 64GB mem (16GB heap), 7 CPU cores. 
({{{}native_transport_max_threads = 1024).{}}}

*Keyspace/Table:*
CREATE KEYSPACE test_ks WITH REPLICATION = \{ ‘class’ : 
‘NetworkTopologyStrategy’, ‘datacenter1’ : 3 } ;
CREATE TABLE test_ks.table1 ( p_id text, c_id text, v text);

*Client Setup:*
30 hosts. Each host run the following pseudo-code, using *GoCql* client:
{code:java}
cluster.CQLVersion = "3.4.0"
cluster.ProtoVersion = 4
cluster.Timeout = 5s
cluster.ConnectTimeout = 10s
cluster.NumConns = 3
cluster.Consistency = LocalQuorum
cluster.RetryPolicy = SimpleRetryPolicy{NumRetries: 1}
cluster.SocketKeepalive = 20s
cluster.HostSelectionPolicy = RoundRobinHostPolicy

sessionCount = 30
qpsPerSession = 30
cqlQuery = "SELECT p_id,c_id,v FROM test_ks.table1 WHERE p_id = ? AND c_id 
= ?"
for (i = 0; i < sessionCount; i++) {
   session = cluster.createSession
   rateLimiter = NewRateLimiter(qpsPerSession)
   newGoRoutine.run( sendReads(session, rateLimiter) )
}

/ *
  sendReads(session, rateLimiter) {
 for {
newGoRoutine.run (
   if (rateLimiter.allow) {
  session.execute(cqlQuery, randomString, randomString)
   }
)
 }
  }
*/ {code}
Traffic generated this way will result in ~10K coordiator QPS and ~3k client 
connections per Cassandra node.

*Trigger Point:*
Manually issue a CQL query to add a column in the table: “ALTER TABLE 
test_ks.table1 ADD new_col text;”

{*}Symmpton{*}:
Seconds after the trigger point, one or more Cassandra nodes will show number 
of native_transport threads reaching {{{}native_transport_max_threads{}}}, and 
pending native transport tasks grow endlessly.

> Race condition in QueryProcessor causes just prepared statement not to be in 
> the prepared statements cache
> --
>
> Key: CASSANDRA-17401
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17401
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Ivan Senic
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The changes in the 
> 

[jira] [Comment Edited] (CASSANDRA-17401) Race condition in QueryProcessor causes just prepared statement not to be in the prepared statements cache

2024-01-22 Thread Long Pan (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809755#comment-17809755
 ] 

Long Pan edited comment on CASSANDRA-17401 at 1/23/24 6:36 AM:
---

[~chovatia.jayd...@gmail.com] and I managed to reproduce the issue. Basically a 
large number of QPS and client connections are necessary to reproduce it. Here 
are the steps to reproduce:

*Server Setup (Cassandra 4.0.6):*
A 3-node Cassandra cluster. Each node with 64GB mem (16GB heap), 7 CPU cores. 
({{{}native_transport_max_threads = 1024).{}}}

*Keyspace/Table:*
CREATE KEYSPACE test_ks WITH REPLICATION = \{ ‘class’ : 
‘NetworkTopologyStrategy’, ‘datacenter1’ : 3 } ;
CREATE TABLE test_ks.table1 ( p_id text, c_id text, v text);

*Client Setup:*
30 hosts. Each host run the following pseudo-code, using *GoCql* client:
{code:java}
cluster.CQLVersion = "3.4.0"
cluster.ProtoVersion = 4
cluster.Timeout = 5s
cluster.ConnectTimeout = 10s
cluster.NumConns = 3
cluster.Consistency = LocalQuorum
cluster.RetryPolicy = SimpleRetryPolicy{NumRetries: 1}
cluster.SocketKeepalive = 20s
cluster.HostSelectionPolicy = RoundRobinHostPolicy

sessionCount = 30
qpsPerSession = 30
cqlQuery = "SELECT p_id,c_id,v FROM test_ks.table1 WHERE p_id = ? AND c_id 
= ?"
for (i = 0; i < sessionCount; i++) {
   session = cluster.createSession
   rateLimiter = NewRateLimiter(qpsPerSession)
   newGoRoutine.run( sendReads(session, rateLimiter) )
}

/ *
  sendReads(session, rateLimiter) {
 for {
newGoRoutine.run (
   if (rateLimiter.allow) {
  session.execute(cqlQuery, randomString, randomString)
   }
)
 }
  }
*/ {code}
Traffic generated this way will result in ~10K coordiator QPS and ~3k client 
connections per Cassandra node.

*Trigger Point:*
Manually issue a CQL query to add a column in the table: “ALTER TABLE 
test_ks.table1 ADD new_col text;”

{*}Symmpton{*}:
Seconds after the trigger point, one or more Cassandra nodes will show number 
of native_transport threads reaching {{{}native_transport_max_threads{}}}, and 
pending native transport tasks grow endlessly.


was (Author: JIRAUSER303782):
[~chovatia.jayd...@gmail.com] and I managed to reproduce the issue. Basically a 
large number of QPS and client connections are necessary to reproduce it. Here 
are the steps to reproduce:

*Server Setup:*
A 3-node Cassandra cluster. Each node with 64GB mem (16GB heap), 7 CPU cores. 
({{{}native_transport_max_threads = 1024){}}}

*Keyspace/Table:*
CREATE KEYSPACE test_ks WITH REPLICATION = \{ ‘class’ : 
‘NetworkTopologyStrategy’, ‘datacenter1’ : 3 } ;
CREATE TABLE test_ks.table1 ( p_id text, c_id text, v text);

*Client Setup:*
30 hosts. Each host run the following pseudo-code, using *GoCql* client:
{code:java}
cluster.CQLVersion = "3.4.0"
cluster.ProtoVersion = 4
cluster.Timeout = 5s
cluster.ConnectTimeout = 10s
cluster.NumConns = 3
cluster.Consistency = LocalQuorum
cluster.RetryPolicy = SimpleRetryPolicy{NumRetries: 1}
cluster.SocketKeepalive = 20s
cluster.HostSelectionPolicy = RoundRobinHostPolicy

sessionCount = 30
qpsPerSession = 30
cqlQuery = "SELECT p_id,c_id,v FROM test_ks.table1 WHERE p_id = ? AND c_id 
= ?"
for (i = 0; i < sessionCount; i++) {
   session = cluster.createSession
   rateLimiter = NewRateLimiter(qpsPerSession)
   newGoRoutine.run( sendReads(session, rateLimiter) )
}

/ *
  sendReads(session, rateLimiter) {
 for {
newGoRoutine.run (
   if (rateLimiter.allow) {
  session.execute(cqlQuery, randomString, randomString)
   }
)
 }
  }
*/ {code}

Traffic generated this way will result in ~10K coordiator QPS and ~3k client 
connections per Cassandra node.

*Trigger Point:*
Manually issue a CQL query to add a column in the table: “ALTER TABLE 
test_ks.table1 ADD new_col text;”

{*}Symmpton{*}:
Seconds after the trigger point, one or more Cassandra nodes will show number 
of native_transport threads reaching {{{}native_transport_max_threads{}}}, and 
pending native transport tasks grow endlessly.

> Race condition in QueryProcessor causes just prepared statement not to be in 
> the prepared statements cache
> --
>
> Key: CASSANDRA-17401
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17401
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Ivan Senic
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The changes in the 
> 

[jira] [Comment Edited] (CASSANDRA-17401) Race condition in QueryProcessor causes just prepared statement not to be in the prepared statements cache

2024-01-22 Thread Long Pan (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809755#comment-17809755
 ] 

Long Pan edited comment on CASSANDRA-17401 at 1/23/24 6:35 AM:
---

[~chovatia.jayd...@gmail.com] and I managed to reproduce the issue. Basically a 
large number of QPS and client connections are necessary to reproduce it. Here 
are the steps to reproduce:

*Server Setup:*
A 3-node Cassandra cluster. Each node with 64GB mem (16GB heap), 7 CPU cores. 
({{{}native_transport_max_threads = 1024){}}}

*Keyspace/Table:*
CREATE KEYSPACE test_ks WITH REPLICATION = \{ ‘class’ : 
‘NetworkTopologyStrategy’, ‘datacenter1’ : 3 } ;
CREATE TABLE test_ks.table1 ( p_id text, c_id text, v text);

*Client Setup:*
30 hosts. Each host run the following pseudo-code, using *GoCql* client:
{code:java}
cluster.CQLVersion = "3.4.0"
cluster.ProtoVersion = 4
cluster.Timeout = 5s
cluster.ConnectTimeout = 10s
cluster.NumConns = 3
cluster.Consistency = LocalQuorum
cluster.RetryPolicy = SimpleRetryPolicy{NumRetries: 1}
cluster.SocketKeepalive = 20s
cluster.HostSelectionPolicy = RoundRobinHostPolicy

sessionCount = 30
qpsPerSession = 30
cqlQuery = "SELECT p_id,c_id,v FROM test_ks.table1 WHERE p_id = ? AND c_id 
= ?"
for (i = 0; i < sessionCount; i++) {
   session = cluster.createSession
   rateLimiter = NewRateLimiter(qpsPerSession)
   newGoRoutine.run( sendReads(session, rateLimiter) )
}

/ *
  sendReads(session, rateLimiter) {
 for {
newGoRoutine.run (
   if (rateLimiter.allow) {
  session.execute(cqlQuery, randomString, randomString)
   }
)
 }
  }
*/ {code}

Traffic generated this way will result in ~10K coordiator QPS and ~3k client 
connections per Cassandra node.

*Trigger Point:*
Manually issue a CQL query to add a column in the table: “ALTER TABLE 
test_ks.table1 ADD new_col text;”

{*}Symmpton{*}:
Seconds after the trigger point, one or more Cassandra nodes will show number 
of native_transport threads reaching {{{}native_transport_max_threads{}}}, and 
pending native transport tasks grow endlessly.


was (Author: JIRAUSER303782):
[~chovatia.jayd...@gmail.com] and I managed to reproduce the issue. Basically a 
large number of QPS and client connections are necessary to reproduce it. Here 
are the steps to reproduce:


*Server Setup:*
A 3-node Cassandra cluster. Each node with 64GB mem (16GB heap), 7 CPU cores. 
({{{}native_transport_max_threads = 1024){}}}

*Keyspace/Table:*
CREATE KEYSPACE test_ks WITH REPLICATION = \{ ‘class’ : 
‘NetworkTopologyStrategy’, ‘datacenter1’ : 3 } ;
CREATE TABLE test_ks.table1 ( p_id text, c_id text, v text);

*Client Setup:*
30 hosts. Each host run the following pseudo-code, using *GoCql* client:
cluster.CQLVersion = "3.4.0"
cluster.ProtoVersion = 4
cluster.Timeout = 5s
cluster.ConnectTimeout = 10s
cluster.NumConns = 3
cluster.Consistency = LocalQuorum
cluster.RetryPolicy = SimpleRetryPolicy\{NumRetries: 1}
cluster.SocketKeepalive = 20s
cluster.HostSelectionPolicy = RoundRobinHostPolicy

sessionCount = 30
qpsPerSession = 30
cqlQuery = "SELECT p_id,c_id,v FROM test_ks.table1 WHERE p_id = ? AND c_id 
= ?"
for (i = 0; i < sessionCount; i++) \{
   session = cluster.createSession
   rateLimiter = NewRateLimiter(qpsPerSession)
   newGoRoutine.run( sendReads(session, rateLimiter) )
}

/ *
  sendReads(session, rateLimiter) \{
 for {
newGoRoutine.run (
   if (rateLimiter.allow) {
  session.execute(cqlQuery, randomString, randomString)
   }
)
 }
  }
*/
Traffic generated this way will result in ~10K coordiator QPS and ~3k client 
connections per Cassandra node.

*Trigger Point:*
Manually issue a CQL query to add a column in the table: “ALTER TABLE 
test_ks.table1 ADD new_col text;”

{*}Symmpton{*}:
Seconds after the trigger point, one or more Cassandra nodes will show number 
of native_transport threads reaching {{{}native_transport_max_threads{}}}, and 
pending native transport tasks grow endlessly.

> Race condition in QueryProcessor causes just prepared statement not to be in 
> the prepared statements cache
> --
>
> Key: CASSANDRA-17401
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17401
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Ivan Senic
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The changes in the 
> 

[jira] [Commented] (CASSANDRA-17401) Race condition in QueryProcessor causes just prepared statement not to be in the prepared statements cache

2024-01-22 Thread Long Pan (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809755#comment-17809755
 ] 

Long Pan commented on CASSANDRA-17401:
--

[~chovatia.jayd...@gmail.com] and I managed to reproduce the issue. Basically a 
large number of QPS and client connections are necessary to reproduce it. Here 
are the steps to reproduce:


*Server Setup:*
A 3-node Cassandra cluster. Each node with 64GB mem (16GB heap), 7 CPU cores. 
({{{}native_transport_max_threads = 1024){}}}

*Keyspace/Table:*
CREATE KEYSPACE test_ks WITH REPLICATION = \{ ‘class’ : 
‘NetworkTopologyStrategy’, ‘datacenter1’ : 3 } ;
CREATE TABLE test_ks.table1 ( p_id text, c_id text, v text);

*Client Setup:*
30 hosts. Each host run the following pseudo-code, using *GoCql* client:
cluster.CQLVersion = "3.4.0"
cluster.ProtoVersion = 4
cluster.Timeout = 5s
cluster.ConnectTimeout = 10s
cluster.NumConns = 3
cluster.Consistency = LocalQuorum
cluster.RetryPolicy = SimpleRetryPolicy\{NumRetries: 1}
cluster.SocketKeepalive = 20s
cluster.HostSelectionPolicy = RoundRobinHostPolicy

sessionCount = 30
qpsPerSession = 30
cqlQuery = "SELECT p_id,c_id,v FROM test_ks.table1 WHERE p_id = ? AND c_id 
= ?"
for (i = 0; i < sessionCount; i++) \{
   session = cluster.createSession
   rateLimiter = NewRateLimiter(qpsPerSession)
   newGoRoutine.run( sendReads(session, rateLimiter) )
}

/ *
  sendReads(session, rateLimiter) \{
 for {
newGoRoutine.run (
   if (rateLimiter.allow) {
  session.execute(cqlQuery, randomString, randomString)
   }
)
 }
  }
*/
Traffic generated this way will result in ~10K coordiator QPS and ~3k client 
connections per Cassandra node.

*Trigger Point:*
Manually issue a CQL query to add a column in the table: “ALTER TABLE 
test_ks.table1 ADD new_col text;”

{*}Symmpton{*}:
Seconds after the trigger point, one or more Cassandra nodes will show number 
of native_transport threads reaching {{{}native_transport_max_threads{}}}, and 
pending native transport tasks grow endlessly.

> Race condition in QueryProcessor causes just prepared statement not to be in 
> the prepared statements cache
> --
>
> Key: CASSANDRA-17401
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17401
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Ivan Senic
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The changes in the 
> [QueryProcessor#prepare|https://github.com/apache/cassandra/blame/cassandra-4.0.2/src/java/org/apache/cassandra/cql3/QueryProcessor.java#L575-L638]
>  method that were introduced in versions *4.0.2* and *3.11.12* can cause a 
> race condition between two threads trying to concurrently prepare the same 
> statement. This race condition can cause removing of a prepared statement 
> from the cache, after one of the threads has received the result of the 
> prepare and eventually uses MD5Digest to call 
> [QueryProcessor#getPrepared|https://github.com/apache/cassandra/blame/cassandra-4.0.2/src/java/org/apache/cassandra/cql3/QueryProcessor.java#L212-L215].
> The race condition looks like this:
>  * Thread1 enters _prepare_ method and resolves _safeToReturnCached_ as false
>  * Thread1 executes eviction of hashes
>  * Thread2 enters _prepare_ method and resolves _safeToReturnCached_ as false
>  * Thread1 prepares the statement and caches it
>  * Thread1 returns the result of the prepare
>  * Thread2 executes eviction of hashes
>  * Thread1 tries to execute the prepared statement with the received 
> MD5Digest, but statement is not in the cache as it was evicted by Thread2
> I tried to reproduce this by using a Java driver, but hitting this case from 
> a client side is highly unlikely and I can not simulate the needed race 
> condition. However, we can easily reproduce this in Stargate (details 
> [here|https://github.com/stargate/stargate/pull/1647]), as it's closer to 
> QueryProcessor.
> Reproducing this in a unit test is fairly easy. I am happy to showcase this 
> if needed.
> Note that the issue can occur only when  safeToReturnCached is resolved as 
> false.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]

2024-01-22 Thread via GitHub


sarankk commented on code in PR #31:
URL: 
https://github.com/apache/cassandra-analytics/pull/31#discussion_r1462775859


##
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java:
##
@@ -109,6 +134,50 @@ private ClientConfig(Map options)
 this.quoteIdentifiers = MapUtils.getBoolean(options, 
QUOTE_IDENTIFIERS, false);
 }
 
+private ClearSnapshotStrategy parseClearSnapshotStrategy(boolean 
hasDeprecatedOption,
+ boolean 
clearSnapshot,
+ String 
clearSnapshotStrategyOption)
+{
+if (hasDeprecatedOption)
+{
+LOGGER.warn("The deprecated option 'clearSnapshot' is set. Please 
set 'clearSnapshotStrategy' instead.");
+if (clearSnapshotStrategyOption == null)
+{
+return clearSnapshot ? ClearSnapshotStrategy.defaultStrategy() 
: new ClearSnapshotStrategy.NoOp();
+}
+}
+if (clearSnapshotStrategyOption == null)
+{
+LOGGER.debug("No clearSnapshotStrategy is set. Using the default 
strategy");
+return ClearSnapshotStrategy.defaultStrategy();
+}
+String[] strategyParts = clearSnapshotStrategyOption.split(" ");
+String strategyName;
+String snapshotTTL = null;
+if (strategyParts.length == 1)
+{
+strategyName = strategyParts[0].trim();
+}
+else if (strategyParts.length == 2)
+{
+strategyName = strategyParts[0].trim();
+snapshotTTL = strategyParts[1].trim();
+if (!Pattern.matches(SNAPSHOT_TTL_PATTERN, snapshotTTL))
+{
+String msg = "Incorrect value set for clearSnapshotStrategy, 
expected format is " +
+ "{strategy [snapshotTTLvalue]}. TTL value 
specified must contain unit along. " +
+ "For e.g. 2d represents a TTL for 2 days";
+throw new IllegalArgumentException(msg);
+}
+}
+else
+{
+LOGGER.error("Invalid value for ClearSnapshotStrategy: '{}'", 
clearSnapshotStrategyOption);
+throw new IllegalArgumentException("Invalid value: " + 
clearSnapshotStrategyOption);

Review Comment:
   Should we fail fast there too?. If users are setting `clearSnapshotStrategy` 
then they might want a particular behaviour, if that behaviour can't be met, 
better to alert earlier. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]

2024-01-22 Thread via GitHub


sarankk commented on code in PR #31:
URL: 
https://github.com/apache/cassandra-analytics/pull/31#discussion_r1462774050


##
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java:
##
@@ -24,25 +24,45 @@
 import java.util.Map;
 import java.util.Optional;
 import java.util.UUID;
+import java.util.regex.Pattern;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 import org.apache.cassandra.bridge.BigNumberConfigImpl;
 import org.apache.cassandra.spark.config.SchemaFeature;
 import org.apache.cassandra.spark.config.SchemaFeatureSet;
 import org.apache.cassandra.spark.data.partitioner.ConsistencyLevel;
 import org.apache.cassandra.spark.utils.MapUtils;
+import org.jetbrains.annotations.NotNull;
 import org.jetbrains.annotations.Nullable;
 
 import static 
org.apache.cassandra.spark.data.CassandraDataLayer.aliasLastModifiedTimestamp;
 
 public final class ClientConfig
 {
+private static final Logger LOGGER = 
LoggerFactory.getLogger(ClientConfig.class);
+
 public static final String SIDECAR_INSTANCES = "sidecar_instances";
 public static final String KEYSPACE_KEY = "keyspace";
 public static final String TABLE_KEY = "table";
 public static final String SNAPSHOT_NAME_KEY = "snapshotName";
 public static final String DC_KEY = "dc";
 public static final String CREATE_SNAPSHOT_KEY = "createSnapshot";
 public static final String CLEAR_SNAPSHOT_KEY = "clearSnapshot";
+/**
+ * Format of clearSnapshotStrategy is {strategy [snapshotTTLvalue]}, 
clearSnapshotStrategy holds both the strategy
+ * and in case of TTL based strategy, TTL value. For e.g. 
onCompletionOrTTL 2d, TTL 2d, noOp, onCompletion. For
+ * clear snapshot strategies allowed check {@link ClearSnapshotStrategy}
+ */
+public static final String CLEAR_SNAPSHOT_STRATEGY_KEY = 
"clearSnapshotStrategy";
+/**
+ * TTL value is time to live option for the snapshot (available since 
Cassandra 4.1+). TTL value specified must
+ * contain unit along. For e.g. 2d represents a TTL for 2 days; 1h 
represents a TTL of 1 hour, etc.
+ * Valid units are {@code d}, {@code h}, {@code s}, {@code ms}, {@code 
us}, {@code µs}, {@code ns}, and {@code m}.
+ */
+public static final String DEFAULT_SNAPSHOT_TTL_VALUE = "2d";
+public static final String SNAPSHOT_TTL_PATTERN = 
"\\d+(d|h|m|s|ms)|(\\d+|\\d+\\.\\d+|\\.\\d+)[eE][+-](\\d+|\\d+\\.\\d+|\\.\\d+)(us|µs|ns)";

Review Comment:
   Thanks updated pattern accordingly 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19097) Test Failure: bootstrap_test.TestBootstrap.*

2024-01-22 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809752#comment-17809752
 ] 

Berenguer Blasi commented on CASSANDRA-19097:
-

What I have seen so far perusing the 4 nodes logs is that all writes got to 1. 
No data reaches nodes 2 or 3. Then node 4 comes up and streams only from node 
1. Then node4 is queried and it fails. The amount of token/ring recalculation 
traces are impossible to follow. I need to check now against a working example 
set of logs, let's see if I can follow 8 log files lol. Not reverting yet 
#justfyi

> Test Failure: bootstrap_test.TestBootstrap.*
> 
>
> Key: CASSANDRA-19097
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19097
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI
>Reporter: Michael Semb Wever
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> test_killed_wiped_node_cannot_join
> test_read_from_bootstrapped_node
> test_shutdown_wiped_node_cannot_join
> Seen in dtests_offheap in CASSANDRA-19034
> https://app.circleci.com/pipelines/github/michaelsembwever/cassandra/258/workflows/cea7d697-ca33-40bb-8914-fb9fc662820a/jobs/21162/parallel-runs/38
> {noformat}
> self = 
> def test_killed_wiped_node_cannot_join(self):
> >   self._wiped_node_cannot_join_test(gently=False)
> bootstrap_test.py:608: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> self = , gently = False
> def _wiped_node_cannot_join_test(self, gently):
> """
> @jira_ticket CASSANDRA-9765
> Test that if we stop a node and wipe its data then the node cannot 
> join
> when it is not a seed. Test both a nice shutdown or a forced 
> shutdown, via
> the gently parameter.
> """
> cluster = self.cluster
> 
> cluster.set_environment_variable('CASSANDRA_TOKEN_PREGENERATION_DISABLED', 
> 'True')
> cluster.populate(3)
> cluster.start()
> 
> stress_table = 'keyspace1.standard1'
> 
> # write some data
> node1 = cluster.nodelist()[0]
> node1.stress(['write', 'n=10K', 'no-warmup', '-rate', 'threads=8'])
> 
> session = self.patient_cql_connection(node1)
> original_rows = list(session.execute("SELECT * FROM 
> {}".format(stress_table,)))
> 
> # Add a new node, bootstrap=True ensures that it is not a seed
> node4 = new_node(cluster, bootstrap=True)
> node4.start(wait_for_binary_proto=True)
> 
> session = self.patient_cql_connection(node4)
> >   assert original_rows == list(session.execute("SELECT * FROM 
> > {}".format(stress_table,)))
> E   assert [Row(key=b'PP...e9\xbb'), ...] == [Row(key=b'PP...e9\xbb'), 
> ...]
> E At index 587 diff: Row(key=b'OP2656L630', 
> C0=b"E02\xd2\x8clBv\tr\n\xe3\x01\xdd\xf2\x8a\x91\x7f-\x9dm'\xa5\xe7PH\xef\xc1xlO\xab+d",
>  
> C1=b"\xb2\xc0j\xff\xcb'\xe3\xcc\x0b\x93?\x18@\xc4\xc7tV\xb7q\xeeF\x82\xa4\xd3\xdcFl\xd9\x87
>  \x9a\xde\xdc\xa3", 
> C2=b'\xed\xf8\x8d%\xa4\xa6LPs;\x98f\xdb\xca\x913\xba{M\x8d6XW\x01\xea-\xb5  
> C3=b'\x9ec\xcf\xc7\xec\xa5\x85Z]\xa6\x19\xeb\xc4W\x1d%lyZj\xb9\x94I\x90\xebZ\xdba\xdd\xdc\x9e\x82\x95\x1c',
>  
> C4=b'\xab\x9e\x13\x8b\xc6\x15D\x9b\xccl\xdcX\xb23\xd0\x8b\xa3\xba7\xc1c\xf7F\x1d\xf8e\xbd\x89\xcb\xd8\xd1)f\xdd')
>  != Row(key=b'4LN78NONP0', 
> C0=b"\xdf\x90\xb3/u\xc9/C\xcdOYG3\x070@#\xc3k\xaa$M'\x19\xfb\xab\xc0\x10]\xa6\xac\x1d\x81\xad",
>  
> C1=b'\x8a\xb7j\x95\xf9\xbd?&\x11\xaaH\xcd\x87\xaa\xd2\x85\x08X\xea9\x94\xae8U\x92\xad\xb0\x1b9\xff\x87Z\xe81',
>  
> C2=b'6\x1d\xa1-\xf77\xc7\xde+`\xb7\x89\xaa\xcd\xb5_\xe5\xb3\x04\xc7\xb1\x95e\x81s\t1\x8b\x16sc\x0eMm',
>  
> C3=b'\xfbi\x08;\xc9\x94\x15}r\xfe\x1b\xae5\xf6v\x83\xae\xff\x82\x9b`J\xc2D\xa6k+\xf3\xd3\xff{C\xd0;',
>  
> C4=b'\x8f\x87\x18\x0f\xfa\xadK"\x9e\x96\x87:tiu\xa5\x99\xe1_Ax\xa3\x12\xb4Z\xc9v\xa5\xad\xb8{\xc0\xa3\x93')
> E Left contains 2830 more items, first extra item: 
> Row(key=b'5N7N172K30', 
> C0=b'Y\x81\xa6\x02\x89\xa0hyp\x00O\xe9kFp$\x86u\xea\n\x7fK\x99\xe1\xf6G\xf77\xf7\xd7\xe1\xc7L\x...0\x87a\x03\xee',
>  
> C4=b'\xe8\xd8\x17\xf3\x14\x16Q\x9d\\jb\xde=\x81\xc1B\x9c;T\xb1\xa2O-\x87zF=\x04`\x04\xbd\xc9\x95\xad')
> E Full diff:
> E   [
> …
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRASC-94) Reduce filesystem calls while streaming SSTables

2024-01-22 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRASC-94?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809735#comment-17809735
 ] 

Paulo Motta commented on CASSANDRASC-94:


Cool, thanks for clarifying! I can create a follow-up sidecar ticket if there's 
movement on CASSANDRA-18111.

> Reduce filesystem calls while streaming SSTables
> 
>
> Key: CASSANDRASC-94
> URL: https://issues.apache.org/jira/browse/CASSANDRASC-94
> Project: Sidecar for Apache Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Francisco Guerrero
>Assignee: Francisco Guerrero
>Priority: Normal
>  Labels: pull-request-available
>
> When streaming snapshotted SSTables from Cassandra Sidecar, Sidecar will 
> perform multiple filesystem calls:
> - Traverse the data directories to determine the keyspace / table path
> - Once found determine if the SSTable file exists under the snapshots 
> directory
> - Read the filesystem to obtain the file type and file size
> - Read the requested range of the file and stream it
> The amount of filesystem calls is manageable for streaming a single SSTable, 
> but when a client(s) read multiple SSTables, for example in the case of 
> Cassandra Analytics bulk reads, hundred to thousand of requests are performed 
> requiring every request to perform the above system calls.
> In this improvement, it is proposed introducing several caches to reduce the 
> amount of system calls while streaming SSTables.
> - *snapshot list cache*: to maintain a cache of recently listed snapshot 
> files under a snapshot directory. This cache avoids having to access the 
> filesystem every time a bulk read client list the snapshot directory.
> - *table dir cache*: to maintain a cache of recently streamed table directory 
> paths. This cache helps avoiding having to traverse the filesystem searching 
> for the table directory while running bulk reads for example. Since bulk 
> reads can stream tens to hundreds of SSTable components from a snapshot 
> directory, this cache helps avoid having to resolve the table directory each 
> time.
> - *snapshot path cache*: to maintain a cache of recently streamed snapshot 
> SSTable components. This cache avoids having to resolve the snapshot SSTable 
> component path during bulk reads. Since bulk reads streams sub-ranges of an 
> SSTable component, the resolution can happen multiple times during bulk reads 
> for a single SSTable component.
> - *file props cache*: to maintain a cache of FileProps of recently streamed 
> files. This cache avoids having to validate file properties during bulk reads 
> for example where sub-ranges of an SSTable component are streamed, therefore 
> reading the file properties can occur multiple times during bulk reads of the 
> same file.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra-website) branch asf-staging updated (b68569f6d -> 44f68f021)

2024-01-22 Thread git-site-role
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


 discard b68569f6d generate docs for 63a6db8b
 new 44f68f021 generate docs for 63a6db8b

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (b68569f6d)
\
 N -- N -- N   refs/heads/asf-staging (44f68f021)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../cassandra/managing/tools/nodetool/import.html  |  13 ++---
 .../cassandra/managing/tools/nodetool/import.html  |  13 ++---
 .../cassandra/managing/tools/nodetool/import.html  |  13 ++---
 .../cassandra/managing/tools/nodetool/import.html  |  13 ++---
 site-ui/build/ui-bundle.zip| Bin 4883755 -> 4883755 
bytes
 5 files changed, 40 insertions(+), 12 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]

2024-01-22 Thread via GitHub


frankgh commented on code in PR #31:
URL: 
https://github.com/apache/cassandra-analytics/pull/31#discussion_r1462559574


##
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java:
##
@@ -109,6 +134,50 @@ private ClientConfig(Map options)
 this.quoteIdentifiers = MapUtils.getBoolean(options, 
QUOTE_IDENTIFIERS, false);
 }
 
+private ClearSnapshotStrategy parseClearSnapshotStrategy(boolean 
hasDeprecatedOption,

Review Comment:
   maybe make it protected for extensibility?



##
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java:
##
@@ -109,6 +134,50 @@ private ClientConfig(Map options)
 this.quoteIdentifiers = MapUtils.getBoolean(options, 
QUOTE_IDENTIFIERS, false);
 }
 
+private ClearSnapshotStrategy parseClearSnapshotStrategy(boolean 
hasDeprecatedOption,
+ boolean 
clearSnapshot,
+ String 
clearSnapshotStrategyOption)
+{
+if (hasDeprecatedOption)
+{
+LOGGER.warn("The deprecated option 'clearSnapshot' is set. Please 
set 'clearSnapshotStrategy' instead.");
+if (clearSnapshotStrategyOption == null)
+{
+return clearSnapshot ? ClearSnapshotStrategy.defaultStrategy() 
: new ClearSnapshotStrategy.NoOp();
+}
+}
+if (clearSnapshotStrategyOption == null)
+{
+LOGGER.debug("No clearSnapshotStrategy is set. Using the default 
strategy");
+return ClearSnapshotStrategy.defaultStrategy();
+}
+String[] strategyParts = clearSnapshotStrategyOption.split(" ");
+String strategyName;
+String snapshotTTL = null;
+if (strategyParts.length == 1)
+{
+strategyName = strategyParts[0].trim();
+}
+else if (strategyParts.length == 2)
+{
+strategyName = strategyParts[0].trim();
+snapshotTTL = strategyParts[1].trim();
+if (!Pattern.matches(SNAPSHOT_TTL_PATTERN, snapshotTTL))
+{
+String msg = "Incorrect value set for clearSnapshotStrategy, 
expected format is " +
+ "{strategy [snapshotTTLvalue]}. TTL value 
specified must contain unit along. " +
+ "For e.g. 2d represents a TTL for 2 days";
+throw new IllegalArgumentException(msg);
+}
+}
+else
+{
+LOGGER.error("Invalid value for ClearSnapshotStrategy: '{}'", 
clearSnapshotStrategyOption);
+throw new IllegalArgumentException("Invalid value: " + 
clearSnapshotStrategyOption);

Review Comment:
   In ClearSnapshotStrategy.create the fallback behavior is to return the 
default strategy. Should we consider that fallback strategy here as well? 



##
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java:
##
@@ -24,25 +24,45 @@
 import java.util.Map;
 import java.util.Optional;
 import java.util.UUID;
+import java.util.regex.Pattern;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 import org.apache.cassandra.bridge.BigNumberConfigImpl;
 import org.apache.cassandra.spark.config.SchemaFeature;
 import org.apache.cassandra.spark.config.SchemaFeatureSet;
 import org.apache.cassandra.spark.data.partitioner.ConsistencyLevel;
 import org.apache.cassandra.spark.utils.MapUtils;
+import org.jetbrains.annotations.NotNull;
 import org.jetbrains.annotations.Nullable;
 
 import static 
org.apache.cassandra.spark.data.CassandraDataLayer.aliasLastModifiedTimestamp;
 
 public final class ClientConfig
 {
+private static final Logger LOGGER = 
LoggerFactory.getLogger(ClientConfig.class);
+
 public static final String SIDECAR_INSTANCES = "sidecar_instances";
 public static final String KEYSPACE_KEY = "keyspace";
 public static final String TABLE_KEY = "table";
 public static final String SNAPSHOT_NAME_KEY = "snapshotName";
 public static final String DC_KEY = "dc";
 public static final String CREATE_SNAPSHOT_KEY = "createSnapshot";
 public static final String CLEAR_SNAPSHOT_KEY = "clearSnapshot";
+/**
+ * Format of clearSnapshotStrategy is {strategy [snapshotTTLvalue]}, 
clearSnapshotStrategy holds both the strategy
+ * and in case of TTL based strategy, TTL value. For e.g. 
onCompletionOrTTL 2d, TTL 2d, noOp, onCompletion. For
+ * clear snapshot strategies allowed check {@link ClearSnapshotStrategy}
+ */
+public static final String CLEAR_SNAPSHOT_STRATEGY_KEY = 
"clearSnapshotStrategy";
+/**
+ * TTL value is time to live option for the snapshot (available since 
Cassandra 4.1+). TTL value specified must
+ * contain unit along. For e.g. 2d represents a TTL for 2 days; 1h 
represents a TTL of 1 hour, etc.
+ * Valid units 

[jira] [Commented] (CASSANDRA-19215) "Query start time" in native transport request threads should be the task enqueue time

2024-01-22 Thread Runtian Liu (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809674#comment-17809674
 ] 

Runtian Liu commented on CASSANDRA-19215:
-

[~ifesdjeen] any update on this one? Happy to help review if you have the patch 
ready. Thanks.

> "Query start time" in native transport request threads should be the task 
> enqueue time
> --
>
> Key: CASSANDRA-19215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19215
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Client
>Reporter: Runtian Liu
>Assignee: Alex Petrov
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> Recently, our Cassandra 4.0.6 cluster experienced an outage due to a surge in 
> expensive traffic from the application side. This surge involved a large 
> volume of costly read queries, which took a considerable amount of time to 
> process on the server side. The client had timeout settings; if a request 
> timed out, it might trigger the sending of new requests. Since the server 
> nodes were overloaded, numerous nodes had hundreds of thousands of tasks 
> queued in the Native-Transport-Request pending queue. I expected that once 
> the application ceased sending requests, the server node would quickly return 
> to normal, as most requests in the queue were over half an hour old and 
> should have timed out rapidly, clearing the queue. However, it actually took 
> an hour to clear the native transport's pending queue, even with native 
> transport disabled. Upon examining the code, I noticed that for read/write 
> requests, the 
> [queryStartNanoTime|https://github.com/apache/cassandra/blob/cassandra-4.0/src/java/org/apache/cassandra/transport/Dispatcher.java#L78],
>  which determines if a request has timed out, only begins when the task 
> starts processing. This means that no matter how long a request has been 
> pending, it doesn't contribute to the timeout. I believe this is incorrect. 
> The timer should start when the Cassandra server receives the request or when 
> it enqueues the task, not when the request/task begins processing. This way, 
> an overloaded node with many pending tasks can quickly discard timed-out 
> requests and recover from an outage once new requests stop.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19273) Allow setting TTL for snapshots created by Analytics bulk reader

2024-01-22 Thread Saranya Krishnakumar (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809669#comment-17809669
 ] 

Saranya Krishnakumar commented on CASSANDRA-19273:
--

Latest CircleCl green: 
[https://app.circleci.com/pipelines/github/sarankk/cassandra-analytics/38/workflows/e9ce6765-c035-4620-afc4-8545957633a9]

> Allow setting TTL for snapshots created by Analytics bulk reader
> 
>
> Key: CASSANDRA-19273
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19273
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Analytics Library
>Reporter: Saranya Krishnakumar
>Assignee: Saranya Krishnakumar
>Priority: Normal
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Analytics user can add an existing snapshot's name or create a new snapshot 
> through reader options, from which data is bulk read from. Incase of creating 
> new snapshot, we want to allow users to set TTL option and have a default 
> value for the TTL. This is to make sure, in case of job failures, the 
> snapshots are cleared, to release space. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]

2024-01-22 Thread via GitHub


sarankk commented on code in PR #31:
URL: 
https://github.com/apache/cassandra-analytics/pull/31#discussion_r1462469394


##
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java:
##
@@ -138,7 +138,7 @@ private ClearSnapshotStrategy 
parseClearSnapshotStrategy(boolean hasDeprecatedOp
  boolean 
clearSnapshot,
  String 
clearSnapshotStrategyOption)
 {
-if (hasDeprecatedOption)
+if (hasDeprecatedOption && clearSnapshotStrategyOption == null)

Review Comment:
   Makes sense, will update it



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]

2024-01-22 Thread via GitHub


sarankk commented on code in PR #31:
URL: 
https://github.com/apache/cassandra-analytics/pull/31#discussion_r1462466619


##
cassandra-analytics-integration-tests/src/test/java/org/apache/cassandra/analytics/SharedClusterSparkIntegrationTestBase.java:
##
@@ -84,7 +84,7 @@ protected DataFrameReader bulkReaderDataFrame(QualifiedName 
tableName)
   // Shutdown hooks are called after the job ends, and in the 
case of integration tests
   // the sidecar is already shut down before this. Since the 
cluster will be torn
   // down anyway, the integration job skips clearing snapshots.
-  .option("clearSnapshot", "false")
+  .option("clearSnapshotStrategy", "OnCompletion")

Review Comment:
   since `onCompletion` behaviour was similar to test setup (clear after test 
method completes), kept that. But for consistency, will change it to `noop`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]

2024-01-22 Thread via GitHub


yifan-c commented on code in PR #31:
URL: 
https://github.com/apache/cassandra-analytics/pull/31#discussion_r1462464312


##
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java:
##
@@ -138,7 +138,7 @@ private ClearSnapshotStrategy 
parseClearSnapshotStrategy(boolean hasDeprecatedOp
  boolean 
clearSnapshot,
  String 
clearSnapshotStrategyOption)
 {
-if (hasDeprecatedOption)
+if (hasDeprecatedOption && clearSnapshotStrategyOption == null)

Review Comment:
   If the deprecated option is set, the warning should always be logged, 
regardless whether the new option is present or not. 
   
   ```java
   if (hasDeprecatedOption)
   {
 log warning
 if (clearSnapshotStrategyOption == null)
 {
   return the default based on the value
 }
   }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]

2024-01-22 Thread via GitHub


yifan-c commented on code in PR #31:
URL: 
https://github.com/apache/cassandra-analytics/pull/31#discussion_r1462458671


##
cassandra-analytics-integration-tests/src/test/java/org/apache/cassandra/analytics/SharedClusterSparkIntegrationTestBase.java:
##
@@ -84,7 +84,7 @@ protected DataFrameReader bulkReaderDataFrame(QualifiedName 
tableName)
   // Shutdown hooks are called after the job ends, and in the 
case of integration tests
   // the sidecar is already shut down before this. Since the 
cluster will be torn
   // down anyway, the integration job skips clearing snapshots.
-  .option("clearSnapshot", "false")
+  .option("clearSnapshotStrategy", "OnCompletion")

Review Comment:
   before the patch, the test does not clear snapshot. Can you comment why it 
is changed?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-14572) Expose all table metrics in virtual table

2024-01-22 Thread Maxim Muzafarov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809602#comment-17809602
 ] 

Maxim Muzafarov edited comment on CASSANDRA-14572 at 1/22/24 8:11 PM:
--

I think we can add a rate limiter or guardrail in a follow-up issue to prevent 
unwise use of metric polling, but currently, I think it is beyond the issue's 
scope as there should be no difference in a pattern usage for JMX and/or CQL 
polling. Accessing a metric by metric name is super efficient while selecting a 
large range of metrics from a large metrics set is not efficient since it 
requires the MetrciRegisty collection to be filtered each time for the required 
subset of metrics, but it doesn't cause any problems or OOMs.

As I mentioned earlier, I've created 1000 keyspaces with one table each, 
resulting in 77k metrics, you can check [the 
latest|https://issues.apache.org/jira/browse/CASSANDRA-14572?focusedCommentId=17800388=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17800388]
 benchmark for polling a metric.

The JFR for that run is here:
 [^flight_recording_1270017199_13.jfr] 

The 

{code:java}
cqlsh> select count(*) from system_metrics.keyspace_group ;

 count
---
 77462

(1 rows)
{code}

The query that I used selects a metric by partition:
{code:java}
select * from system_metrics.keyspace_group where name = ?
{code}



was (Author: mmuzaf):
I think we can add a rate limiter or guardrail in a follow-up issue to prevent 
unwise use of metric polling, but currently, I think it is beyond the issue's 
scope as there should be no difference in a pattern usage for JMX and/or CQL 
polling. Accessing a metric by metric name is super efficient while selecting a 
large range of metrics from a large metrics set is not efficient since it 
requires the MetrciRegisty collection to be filtered each time for the required 
subset of metrics, but it doesn't cause any problems or OOMs.

As I mentioned earlier, I've created 1000 keyspaces with one table, which 
resulted in 77k metrics you can check the latest benchmark for polling a 
metric. 

The JFR for that run is here:
 [^flight_recording_1270017199_13.jfr] 

The 

{code:java}
cqlsh> select count(*) from system_metrics.keyspace_group ;

 count
---
 77462

(1 rows)
{code}

The query that I used selects a metric by partition:
{code:java}
select * from system_metrics.keyspace_group where name = ?
{code}


> Expose all table metrics in virtual table
> -
>
> Key: CASSANDRA-14572
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14572
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Observability, Observability/Metrics
>Reporter: Chris Lohfink
>Assignee: Maxim Muzafarov
>Priority: Low
>  Labels: virtual-tables
> Fix For: 5.x
>
> Attachments: flight_recording_1270017199_13.jfr, keyspayces_group 
> responses times.png, keyspayces_group summary.png, select keyspaces_group by 
> string prefix.png, select keyspaces_group compare with wo.png, select 
> keyspaces_group without value.png, systemv_views.metrics_dropped_message.png, 
> thread_pools benchmark.png
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> While we want a number of virtual tables to display data in a way thats great 
> and intuitive like in nodetool. There is also much for being able to expose 
> the metrics we have for tooling via CQL instead of JMX. This is more for the 
> tooling and adhoc advanced users who know exactly what they are looking for.
> *Schema:*
> Initial idea is to expose data via {{((keyspace, table), metric)}} with a 
> column for each metric value. Could also use a Map or UDT instead of the 
> column based that can be a bit more specific to each metric type. To that end 
> there can be a {{metric_type}} column and then a UDT for each metric type 
> filled in, or a single value with more of a Map style. I am 
> purposing the column type though as with {{ALLOW FILTERING}} it does allow 
> more extensive query capabilities.
> *Implementations:*
> * Use reflection to grab all the metrics from TableMetrics (see: 
> CASSANDRA-7622 impl). This is easiest and least abrasive towards new metric 
> implementors... but its reflection and a kinda a bad idea.
> * Add a hook in TableMetrics to register with this virtual table when 
> registering
> * Pull from the CassandraMetrics registery (either reporter or iterate 
> through metrics query on read of virtual table)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19275) Flaky Host replacement tests and shrink tests

2024-01-22 Thread Francisco Guerrero (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francisco Guerrero updated CASSANDRA-19275:
---
  Fix Version/s: NA
  Since Version: NA
Source Control Link: 
https://github.com/apache/cassandra-analytics/commit/d949d8c2b9813c3e8429ece34c364a356bd7d6eb
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Flaky Host replacement tests and shrink tests
> -
>
> Key: CASSANDRA-19275
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19275
> Project: Cassandra
>  Issue Type: Bug
>  Components: Analytics Library
>Reporter: Saranya Krishnakumar
>Assignee: Francisco Guerrero
>Priority: Normal
> Fix For: NA
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> During Circle CI runs there are some flaky integration tests, some noticed are
>  * HostReplacementMultiDCTest
>  * HostReplacementMultiDCFailureTest
>  * HostReplacementFailureTest
>  * LeavingSingleFailureTest
> Some of the error message I see in these tests are e.g.
> `java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
> java.lang.IllegalStateException: Failed to bind port 42611 on 127.0.0.2.`
> On repeated run, these tests pass.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19275 Fix flaxy host replacement tests and shrink tests [cassandra-analytics]

2024-01-22 Thread via GitHub


frankgh commented on PR #33:
URL: 
https://github.com/apache/cassandra-analytics/pull/33#issuecomment-1904727454

   Closed via 
https://github.com/apache/cassandra-analytics/commit/d949d8c2b9813c3e8429ece34c364a356bd7d6eb


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19275 Fix flaxy host replacement tests and shrink tests [cassandra-analytics]

2024-01-22 Thread via GitHub


frankgh closed pull request #33: CASSANDRA-19275 Fix flaxy host replacement 
tests and shrink tests
URL: https://github.com/apache/cassandra-analytics/pull/33


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra-analytics) branch trunk updated: CASSANDRA-19275 Fix flaxy host replacement tests and shrink tests

2024-01-22 Thread frankgh
This is an automated email from the ASF dual-hosted git repository.

frankgh pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra-analytics.git


The following commit(s) were added to refs/heads/trunk by this push:
 new d949d8c  CASSANDRA-19275 Fix flaxy host replacement tests and shrink 
tests
d949d8c is described below

commit d949d8c2b9813c3e8429ece34c364a356bd7d6eb
Author: Francisco Guerrero 
AuthorDate: Mon Jan 22 09:00:52 2024 -0800

CASSANDRA-19275 Fix flaxy host replacement tests and shrink tests

This patch fixes flaky tests when a `BindException` occurs during cluster 
provisioning.
When a `BindException` is encountered, cluster provisioning is retried for 
up-to
`MAX_CLUSTER_PROVISION_RETRIES`.

Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRA-19275
---
 .../testing/SharedClusterIntegrationTestBase.java  | 32 +-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git 
a/cassandra-analytics-integration-framework/src/main/java/org/apache/cassandra/sidecar/testing/SharedClusterIntegrationTestBase.java
 
b/cassandra-analytics-integration-framework/src/main/java/org/apache/cassandra/sidecar/testing/SharedClusterIntegrationTestBase.java
index d306d65..cdc0708 100644
--- 
a/cassandra-analytics-integration-framework/src/main/java/org/apache/cassandra/sidecar/testing/SharedClusterIntegrationTestBase.java
+++ 
b/cassandra-analytics-integration-framework/src/main/java/org/apache/cassandra/sidecar/testing/SharedClusterIntegrationTestBase.java
@@ -20,6 +20,7 @@
 package org.apache.cassandra.sidecar.testing;
 
 import java.io.IOException;
+import java.net.BindException;
 import java.net.InetSocketAddress;
 import java.net.UnknownHostException;
 import java.nio.file.Path;
@@ -34,6 +35,7 @@ import java.util.concurrent.TimeUnit;
 import java.util.stream.Collectors;
 import java.util.stream.IntStream;
 
+import org.apache.commons.lang3.StringUtils;
 import org.junit.jupiter.api.AfterAll;
 import org.junit.jupiter.api.BeforeAll;
 import org.junit.jupiter.api.TestInstance;
@@ -77,6 +79,7 @@ import 
org.apache.cassandra.sidecar.utils.CassandraVersionProvider;
 import org.apache.cassandra.testing.TestUtils;
 import org.apache.cassandra.testing.TestVersion;
 import org.apache.cassandra.testing.TestVersionSupplier;
+import org.apache.cassandra.utils.Throwables;
 
 import static 
org.apache.cassandra.sidecar.testing.CassandraSidecarTestContext.tryGetIntConfig;
 import static org.assertj.core.api.Assertions.assertThat;
@@ -119,6 +122,7 @@ import static org.assertj.core.api.Assertions.assertThat;
 public abstract class SharedClusterIntegrationTestBase
 {
 protected final Logger logger = 
LoggerFactory.getLogger(SharedClusterIntegrationTestBase.class);
+private static final int MAX_CLUSTER_PROVISION_RETRIES = 5;
 
 protected Vertx vertx;
 protected DnsResolver dnsResolver;
@@ -138,13 +142,39 @@ public abstract class SharedClusterIntegrationTestBase
 Optional testVersion = 
TestVersionSupplier.testVersions().findFirst();
 assertThat(testVersion).isPresent();
 logger.info("Testing with version={}", testVersion);
-cluster = provisionCluster(testVersion.get());
+cluster = provisionClusterWithRetries(testVersion.get());
 assertThat(cluster).isNotNull();
 initializeSchemaForTest();
 startSidecar(cluster);
 beforeTestStart();
 }
 
+protected AbstractCluster 
provisionClusterWithRetries(TestVersion testVersion) throws IOException
+{
+for (int retry = 0; retry < MAX_CLUSTER_PROVISION_RETRIES; retry++)
+{
+try
+{
+return provisionCluster(testVersion);
+}
+catch (RuntimeException runtimeException)
+{
+boolean addressAlreadyInUse = 
Throwables.anyCauseMatches(runtimeException,
+ ex -> 
ex instanceof BindException &&
+   
StringUtils.contains(ex.getMessage(), "Address already in use"));
+if (addressAlreadyInUse)
+{
+logger.warn("Failed to provision cluster after {} 
retries", retry, runtimeException);
+}
+else
+{
+throw runtimeException;
+}
+}
+}
+throw new RuntimeException("Unable to provision cluster");
+}
+
 @AfterAll
 protected void tearDown() throws InterruptedException
 {


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19275) Flaky Host replacement tests and shrink tests

2024-01-22 Thread Francisco Guerrero (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francisco Guerrero updated CASSANDRA-19275:
---
Status: Ready to Commit  (was: Review In Progress)

> Flaky Host replacement tests and shrink tests
> -
>
> Key: CASSANDRA-19275
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19275
> Project: Cassandra
>  Issue Type: Bug
>  Components: Analytics Library
>Reporter: Saranya Krishnakumar
>Assignee: Francisco Guerrero
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> During Circle CI runs there are some flaky integration tests, some noticed are
>  * HostReplacementMultiDCTest
>  * HostReplacementMultiDCFailureTest
>  * HostReplacementFailureTest
>  * LeavingSingleFailureTest
> Some of the error message I see in these tests are e.g.
> `java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
> java.lang.IllegalStateException: Failed to bind port 42611 on 127.0.0.2.`
> On repeated run, these tests pass.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19275) Flaky Host replacement tests and shrink tests

2024-01-22 Thread Francisco Guerrero (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809626#comment-17809626
 ] 

Francisco Guerrero commented on CASSANDRA-19275:


Created a new JIRA to address the other class of flakiness: 
https://issues.apache.org/jira/browse/CASSANDRA-19285

> Flaky Host replacement tests and shrink tests
> -
>
> Key: CASSANDRA-19275
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19275
> Project: Cassandra
>  Issue Type: Bug
>  Components: Analytics Library
>Reporter: Saranya Krishnakumar
>Assignee: Francisco Guerrero
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> During Circle CI runs there are some flaky integration tests, some noticed are
>  * HostReplacementMultiDCTest
>  * HostReplacementMultiDCFailureTest
>  * HostReplacementFailureTest
>  * LeavingSingleFailureTest
> Some of the error message I see in these tests are e.g.
> `java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
> java.lang.IllegalStateException: Failed to bind port 42611 on 127.0.0.2.`
> On repeated run, these tests pass.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19285) Flaky Host replacement tests and shrink tests (Instance class loader is already closed)

2024-01-22 Thread Francisco Guerrero (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francisco Guerrero updated CASSANDRA-19285:
---
Description: 
During Circle CI runs there are some flaky integration tests, some noticed are
 * HostReplacementMultiDCTest
 * HostReplacementMultiDCFailureTest
 * HostReplacementFailureTest
 * LeavingSingleFailureTest

Some of the error message I see in these tests are e.g.

{code:java}
java.lang.RuntimeException: java.lang.IllegalStateException: Can't load 
org.apache.cassandra.utils.concurrent.Ref$OnLeak. Instance class loader is 
already closed.
{code}


On repeated run, these tests pass.

  was:
During Circle CI runs there are some flaky integration tests, some noticed are
 * HostReplacementMultiDCTest
 * HostReplacementMultiDCFailureTest
 * HostReplacementFailureTest
 * LeavingSingleFailureTest

Some of the error message I see in these tests are e.g.

`java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
java.lang.IllegalStateException: Failed to bind port 42611 on 127.0.0.2.`

On repeated run, these tests pass.


> Flaky Host replacement tests and shrink tests (Instance class loader is 
> already closed)
> ---
>
> Key: CASSANDRA-19285
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19285
> Project: Cassandra
>  Issue Type: Bug
>  Components: Analytics Library
>Reporter: Francisco Guerrero
>Assignee: Francisco Guerrero
>Priority: Normal
>
> During Circle CI runs there are some flaky integration tests, some noticed are
>  * HostReplacementMultiDCTest
>  * HostReplacementMultiDCFailureTest
>  * HostReplacementFailureTest
>  * LeavingSingleFailureTest
> Some of the error message I see in these tests are e.g.
> {code:java}
> java.lang.RuntimeException: java.lang.IllegalStateException: Can't load 
> org.apache.cassandra.utils.concurrent.Ref$OnLeak. Instance class loader is 
> already closed.
> {code}
> On repeated run, these tests pass.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-19285) Flaky Host replacement tests and shrink tests (Instance class loader is already closed)

2024-01-22 Thread Francisco Guerrero (Jira)
Francisco Guerrero created CASSANDRA-19285:
--

 Summary: Flaky Host replacement tests and shrink tests (Instance 
class loader is already closed)
 Key: CASSANDRA-19285
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19285
 Project: Cassandra
  Issue Type: Bug
  Components: Analytics Library
Reporter: Francisco Guerrero
Assignee: Francisco Guerrero


During Circle CI runs there are some flaky integration tests, some noticed are
 * HostReplacementMultiDCTest
 * HostReplacementMultiDCFailureTest
 * HostReplacementFailureTest
 * LeavingSingleFailureTest

Some of the error message I see in these tests are e.g.

`java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
java.lang.IllegalStateException: Failed to bind port 42611 on 127.0.0.2.`

On repeated run, these tests pass.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-19285) Flaky Host replacement tests and shrink tests (Instance class loader is already closed)

2024-01-22 Thread Francisco Guerrero (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francisco Guerrero reassigned CASSANDRA-19285:
--

Assignee: (was: Francisco Guerrero)

> Flaky Host replacement tests and shrink tests (Instance class loader is 
> already closed)
> ---
>
> Key: CASSANDRA-19285
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19285
> Project: Cassandra
>  Issue Type: Bug
>  Components: Analytics Library
>Reporter: Francisco Guerrero
>Priority: Normal
>
> During Circle CI runs there are some flaky integration tests, some noticed are
>  * HostReplacementMultiDCTest
>  * HostReplacementMultiDCFailureTest
>  * HostReplacementFailureTest
>  * LeavingSingleFailureTest
> Some of the error message I see in these tests are e.g.
> {code:java}
> java.lang.RuntimeException: java.lang.IllegalStateException: Can't load 
> org.apache.cassandra.utils.concurrent.Ref$OnLeak. Instance class loader is 
> already closed.
> {code}
> On repeated run, these tests pass.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19275) Flaky Host replacement tests and shrink tests

2024-01-22 Thread Francisco Guerrero (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francisco Guerrero updated CASSANDRA-19275:
---
Description: 
During Circle CI runs there are some flaky integration tests, some noticed are
 * HostReplacementMultiDCTest
 * HostReplacementMultiDCFailureTest
 * HostReplacementFailureTest
 * LeavingSingleFailureTest

Some of the error message I see in these tests are e.g.

`java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
java.lang.IllegalStateException: Failed to bind port 42611 on 127.0.0.2.`

On repeated run, these tests pass.

  was:
During Circle CI runs there are some flaky integration tests, some noticed are
 * HostReplacementMultiDCTest
 * HostReplacementMultiDCFailureTest
 * HostReplacementFailureTest
 * LeavingSingleFailureTest

Some of the error message I see in these tests are e.g.

`java.lang.RuntimeException: java.lang.IllegalStateException: Can't load 
org.apache.cassandra.utils.concurrent.Ref$OnLeak. Instance class loader is 
already closed.`

`java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
java.lang.IllegalStateException: Failed to bind port 42611 on 127.0.0.2.`

On repeated run, these tests pass.


> Flaky Host replacement tests and shrink tests
> -
>
> Key: CASSANDRA-19275
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19275
> Project: Cassandra
>  Issue Type: Bug
>  Components: Analytics Library
>Reporter: Saranya Krishnakumar
>Assignee: Francisco Guerrero
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> During Circle CI runs there are some flaky integration tests, some noticed are
>  * HostReplacementMultiDCTest
>  * HostReplacementMultiDCFailureTest
>  * HostReplacementFailureTest
>  * LeavingSingleFailureTest
> Some of the error message I see in these tests are e.g.
> `java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
> java.lang.IllegalStateException: Failed to bind port 42611 on 127.0.0.2.`
> On repeated run, these tests pass.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRASC-94) Reduce filesystem calls while streaming SSTables

2024-01-22 Thread Francisco Guerrero (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRASC-94?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809611#comment-17809611
 ] 

Francisco Guerrero commented on CASSANDRASC-94:
---

> Do you think caching snapshots in the sidecar will be relevant with that in 
> place?

I think it is still relevant until we are able to leverage CASSANDRA-18111. For 
5.x, we can directly query snapshots from the snapshot virtual table 
{{org.apache.cassandra.db.virtual.SnapshotsTable}}.

One of the goals for Sidecar is to bridge gaps between different versions of 
Cassandra. We should probably have a follow up Sidecar patch once 
https://issues.apache.org/jira/browse/CASSANDRA-18111 is merged.

Let me know what are your thoughts on that.

> Reduce filesystem calls while streaming SSTables
> 
>
> Key: CASSANDRASC-94
> URL: https://issues.apache.org/jira/browse/CASSANDRASC-94
> Project: Sidecar for Apache Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Francisco Guerrero
>Assignee: Francisco Guerrero
>Priority: Normal
>  Labels: pull-request-available
>
> When streaming snapshotted SSTables from Cassandra Sidecar, Sidecar will 
> perform multiple filesystem calls:
> - Traverse the data directories to determine the keyspace / table path
> - Once found determine if the SSTable file exists under the snapshots 
> directory
> - Read the filesystem to obtain the file type and file size
> - Read the requested range of the file and stream it
> The amount of filesystem calls is manageable for streaming a single SSTable, 
> but when a client(s) read multiple SSTables, for example in the case of 
> Cassandra Analytics bulk reads, hundred to thousand of requests are performed 
> requiring every request to perform the above system calls.
> In this improvement, it is proposed introducing several caches to reduce the 
> amount of system calls while streaming SSTables.
> - *snapshot list cache*: to maintain a cache of recently listed snapshot 
> files under a snapshot directory. This cache avoids having to access the 
> filesystem every time a bulk read client list the snapshot directory.
> - *table dir cache*: to maintain a cache of recently streamed table directory 
> paths. This cache helps avoiding having to traverse the filesystem searching 
> for the table directory while running bulk reads for example. Since bulk 
> reads can stream tens to hundreds of SSTable components from a snapshot 
> directory, this cache helps avoid having to resolve the table directory each 
> time.
> - *snapshot path cache*: to maintain a cache of recently streamed snapshot 
> SSTable components. This cache avoids having to resolve the snapshot SSTable 
> component path during bulk reads. Since bulk reads streams sub-ranges of an 
> SSTable component, the resolution can happen multiple times during bulk reads 
> for a single SSTable component.
> - *file props cache*: to maintain a cache of FileProps of recently streamed 
> files. This cache avoids having to validate file properties during bulk reads 
> for example where sub-ranges of an SSTable component are streamed, therefore 
> reading the file properties can occur multiple times during bulk reads of the 
> same file.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRASC-94) Reduce filesystem calls while streaming SSTables

2024-01-22 Thread Francisco Guerrero (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRASC-94?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809610#comment-17809610
 ] 

Francisco Guerrero commented on CASSANDRASC-94:
---

Sidecar should leverage cached snapshots from Cassandra

> Reduce filesystem calls while streaming SSTables
> 
>
> Key: CASSANDRASC-94
> URL: https://issues.apache.org/jira/browse/CASSANDRASC-94
> Project: Sidecar for Apache Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Francisco Guerrero
>Assignee: Francisco Guerrero
>Priority: Normal
>  Labels: pull-request-available
>
> When streaming snapshotted SSTables from Cassandra Sidecar, Sidecar will 
> perform multiple filesystem calls:
> - Traverse the data directories to determine the keyspace / table path
> - Once found determine if the SSTable file exists under the snapshots 
> directory
> - Read the filesystem to obtain the file type and file size
> - Read the requested range of the file and stream it
> The amount of filesystem calls is manageable for streaming a single SSTable, 
> but when a client(s) read multiple SSTables, for example in the case of 
> Cassandra Analytics bulk reads, hundred to thousand of requests are performed 
> requiring every request to perform the above system calls.
> In this improvement, it is proposed introducing several caches to reduce the 
> amount of system calls while streaming SSTables.
> - *snapshot list cache*: to maintain a cache of recently listed snapshot 
> files under a snapshot directory. This cache avoids having to access the 
> filesystem every time a bulk read client list the snapshot directory.
> - *table dir cache*: to maintain a cache of recently streamed table directory 
> paths. This cache helps avoiding having to traverse the filesystem searching 
> for the table directory while running bulk reads for example. Since bulk 
> reads can stream tens to hundreds of SSTable components from a snapshot 
> directory, this cache helps avoid having to resolve the table directory each 
> time.
> - *snapshot path cache*: to maintain a cache of recently streamed snapshot 
> SSTable components. This cache avoids having to resolve the snapshot SSTable 
> component path during bulk reads. Since bulk reads streams sub-ranges of an 
> SSTable component, the resolution can happen multiple times during bulk reads 
> for a single SSTable component.
> - *file props cache*: to maintain a cache of FileProps of recently streamed 
> files. This cache avoids having to validate file properties during bulk reads 
> for example where sub-ranges of an SSTable component are streamed, therefore 
> reading the file properties can occur multiple times during bulk reads of the 
> same file.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19275) Flaky Host replacement tests and shrink tests

2024-01-22 Thread Yifan Cai (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANDRA-19275:
--
Reviewers: Yifan Cai, Yifan Cai
   Status: Review In Progress  (was: Patch Available)

> Flaky Host replacement tests and shrink tests
> -
>
> Key: CASSANDRA-19275
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19275
> Project: Cassandra
>  Issue Type: Bug
>  Components: Analytics Library
>Reporter: Saranya Krishnakumar
>Assignee: Francisco Guerrero
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> During Circle CI runs there are some flaky integration tests, some noticed are
>  * HostReplacementMultiDCTest
>  * HostReplacementMultiDCFailureTest
>  * HostReplacementFailureTest
>  * LeavingSingleFailureTest
> Some of the error message I see in these tests are e.g.
> `java.lang.RuntimeException: java.lang.IllegalStateException: Can't load 
> org.apache.cassandra.utils.concurrent.Ref$OnLeak. Instance class loader is 
> already closed.`
> `java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
> java.lang.IllegalStateException: Failed to bind port 42611 on 127.0.0.2.`
> On repeated run, these tests pass.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19275) Flaky Host replacement tests and shrink tests

2024-01-22 Thread Yifan Cai (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANDRA-19275:
--
Reviewers: Yifan Cai  (was: Yifan Cai, Yifan Cai)

> Flaky Host replacement tests and shrink tests
> -
>
> Key: CASSANDRA-19275
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19275
> Project: Cassandra
>  Issue Type: Bug
>  Components: Analytics Library
>Reporter: Saranya Krishnakumar
>Assignee: Francisco Guerrero
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> During Circle CI runs there are some flaky integration tests, some noticed are
>  * HostReplacementMultiDCTest
>  * HostReplacementMultiDCFailureTest
>  * HostReplacementFailureTest
>  * LeavingSingleFailureTest
> Some of the error message I see in these tests are e.g.
> `java.lang.RuntimeException: java.lang.IllegalStateException: Can't load 
> org.apache.cassandra.utils.concurrent.Ref$OnLeak. Instance class loader is 
> already closed.`
> `java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
> java.lang.IllegalStateException: Failed to bind port 42611 on 127.0.0.2.`
> On repeated run, these tests pass.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19275) Flaky Host replacement tests and shrink tests

2024-01-22 Thread Yifan Cai (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809604#comment-17809604
 ] 

Yifan Cai commented on CASSANDRA-19275:
---

+1 on the patch that fixes the binding errors.

> Flaky Host replacement tests and shrink tests
> -
>
> Key: CASSANDRA-19275
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19275
> Project: Cassandra
>  Issue Type: Bug
>  Components: Analytics Library
>Reporter: Saranya Krishnakumar
>Assignee: Francisco Guerrero
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> During Circle CI runs there are some flaky integration tests, some noticed are
>  * HostReplacementMultiDCTest
>  * HostReplacementMultiDCFailureTest
>  * HostReplacementFailureTest
>  * LeavingSingleFailureTest
> Some of the error message I see in these tests are e.g.
> `java.lang.RuntimeException: java.lang.IllegalStateException: Can't load 
> org.apache.cassandra.utils.concurrent.Ref$OnLeak. Instance class loader is 
> already closed.`
> `java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
> java.lang.IllegalStateException: Failed to bind port 42611 on 127.0.0.2.`
> On repeated run, these tests pass.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14572) Expose all table metrics in virtual table

2024-01-22 Thread Maxim Muzafarov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809602#comment-17809602
 ] 

Maxim Muzafarov commented on CASSANDRA-14572:
-

I think we can add a rate limiter or guardrail in a follow-up issue to prevent 
unwise use of metric polling, but currently, I think it is beyond the issue's 
scope as there should be no difference in a pattern usage for JMX and/or CQL 
polling. Accessing a metric by metric name is super efficient while selecting a 
large range of metrics from a large metrics set is not efficient since it 
requires the MetrciRegisty collection to be filtered each time for the required 
subset of metrics, but it doesn't cause any problems or OOMs.

As I mentioned earlier, I've created 1000 keyspaces with one table, which 
resulted in 77k metrics you can check the latest benchmark for polling a 
metric. 

The JFR for that run is here:
 [^flight_recording_1270017199_13.jfr] 

The 

{code:java}
cqlsh> select count(*) from system_metrics.keyspace_group ;

 count
---
 77462

(1 rows)
{code}

The query that I used selects a metric by partition:
{code:java}
select * from system_metrics.keyspace_group where name = ?
{code}


> Expose all table metrics in virtual table
> -
>
> Key: CASSANDRA-14572
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14572
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Observability, Observability/Metrics
>Reporter: Chris Lohfink
>Assignee: Maxim Muzafarov
>Priority: Low
>  Labels: virtual-tables
> Fix For: 5.x
>
> Attachments: flight_recording_1270017199_13.jfr, keyspayces_group 
> responses times.png, keyspayces_group summary.png, select keyspaces_group by 
> string prefix.png, select keyspaces_group compare with wo.png, select 
> keyspaces_group without value.png, systemv_views.metrics_dropped_message.png, 
> thread_pools benchmark.png
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> While we want a number of virtual tables to display data in a way thats great 
> and intuitive like in nodetool. There is also much for being able to expose 
> the metrics we have for tooling via CQL instead of JMX. This is more for the 
> tooling and adhoc advanced users who know exactly what they are looking for.
> *Schema:*
> Initial idea is to expose data via {{((keyspace, table), metric)}} with a 
> column for each metric value. Could also use a Map or UDT instead of the 
> column based that can be a bit more specific to each metric type. To that end 
> there can be a {{metric_type}} column and then a UDT for each metric type 
> filled in, or a single value with more of a Map style. I am 
> purposing the column type though as with {{ALLOW FILTERING}} it does allow 
> more extensive query capabilities.
> *Implementations:*
> * Use reflection to grab all the metrics from TableMetrics (see: 
> CASSANDRA-7622 impl). This is easiest and least abrasive towards new metric 
> implementors... but its reflection and a kinda a bad idea.
> * Add a hook in TableMetrics to register with this virtual table when 
> registering
> * Pull from the CassandraMetrics registery (either reporter or iterate 
> through metrics query on read of virtual table)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRASC-94) Reduce filesystem calls while streaming SSTables

2024-01-22 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRASC-94?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809601#comment-17809601
 ] 

Paulo Motta commented on CASSANDRASC-94:


I am planning to add support to caching snapshots in memory in the server as 
part of CASSANDRA-18111 (I have an draft patch but need to cleanup/rebase/test, 
should take a couple of weeks to wrap up). Do you think caching snapshots in 
the sidecar will be relevant with that in place?

One issue I see is that that functionality will probably land in 5.x, so it's 
still probably useful to have sidecar caching for 4.x.

> Reduce filesystem calls while streaming SSTables
> 
>
> Key: CASSANDRASC-94
> URL: https://issues.apache.org/jira/browse/CASSANDRASC-94
> Project: Sidecar for Apache Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Francisco Guerrero
>Assignee: Francisco Guerrero
>Priority: Normal
>  Labels: pull-request-available
>
> When streaming snapshotted SSTables from Cassandra Sidecar, Sidecar will 
> perform multiple filesystem calls:
> - Traverse the data directories to determine the keyspace / table path
> - Once found determine if the SSTable file exists under the snapshots 
> directory
> - Read the filesystem to obtain the file type and file size
> - Read the requested range of the file and stream it
> The amount of filesystem calls is manageable for streaming a single SSTable, 
> but when a client(s) read multiple SSTables, for example in the case of 
> Cassandra Analytics bulk reads, hundred to thousand of requests are performed 
> requiring every request to perform the above system calls.
> In this improvement, it is proposed introducing several caches to reduce the 
> amount of system calls while streaming SSTables.
> - *snapshot list cache*: to maintain a cache of recently listed snapshot 
> files under a snapshot directory. This cache avoids having to access the 
> filesystem every time a bulk read client list the snapshot directory.
> - *table dir cache*: to maintain a cache of recently streamed table directory 
> paths. This cache helps avoiding having to traverse the filesystem searching 
> for the table directory while running bulk reads for example. Since bulk 
> reads can stream tens to hundreds of SSTable components from a snapshot 
> directory, this cache helps avoid having to resolve the table directory each 
> time.
> - *snapshot path cache*: to maintain a cache of recently streamed snapshot 
> SSTable components. This cache avoids having to resolve the snapshot SSTable 
> component path during bulk reads. Since bulk reads streams sub-ranges of an 
> SSTable component, the resolution can happen multiple times during bulk reads 
> for a single SSTable component.
> - *file props cache*: to maintain a cache of FileProps of recently streamed 
> files. This cache avoids having to validate file properties during bulk reads 
> for example where sub-ranges of an SSTable component are streamed, therefore 
> reading the file properties can occur multiple times during bulk reads of the 
> same file.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14572) Expose all table metrics in virtual table

2024-01-22 Thread Maxim Muzafarov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxim Muzafarov updated CASSANDRA-14572:

Attachment: flight_recording_1270017199_13.jfr

> Expose all table metrics in virtual table
> -
>
> Key: CASSANDRA-14572
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14572
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Observability, Observability/Metrics
>Reporter: Chris Lohfink
>Assignee: Maxim Muzafarov
>Priority: Low
>  Labels: virtual-tables
> Fix For: 5.x
>
> Attachments: flight_recording_1270017199_13.jfr, keyspayces_group 
> responses times.png, keyspayces_group summary.png, select keyspaces_group by 
> string prefix.png, select keyspaces_group compare with wo.png, select 
> keyspaces_group without value.png, systemv_views.metrics_dropped_message.png, 
> thread_pools benchmark.png
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> While we want a number of virtual tables to display data in a way thats great 
> and intuitive like in nodetool. There is also much for being able to expose 
> the metrics we have for tooling via CQL instead of JMX. This is more for the 
> tooling and adhoc advanced users who know exactly what they are looking for.
> *Schema:*
> Initial idea is to expose data via {{((keyspace, table), metric)}} with a 
> column for each metric value. Could also use a Map or UDT instead of the 
> column based that can be a bit more specific to each metric type. To that end 
> there can be a {{metric_type}} column and then a UDT for each metric type 
> filled in, or a single value with more of a Map style. I am 
> purposing the column type though as with {{ALLOW FILTERING}} it does allow 
> more extensive query capabilities.
> *Implementations:*
> * Use reflection to grab all the metrics from TableMetrics (see: 
> CASSANDRA-7622 impl). This is easiest and least abrasive towards new metric 
> implementors... but its reflection and a kinda a bad idea.
> * Add a hook in TableMetrics to register with this virtual table when 
> registering
> * Pull from the CassandraMetrics registery (either reporter or iterate 
> through metrics query on read of virtual table)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19275) Flaky Host replacement tests and shrink tests

2024-01-22 Thread Francisco Guerrero (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809590#comment-17809590
 ] 

Francisco Guerrero edited comment on CASSANDRA-19275 at 1/22/24 6:21 PM:
-

The PR above only addresses the {{Failed to Bind}} errors, but it does not 
address the {{Instance class loader is already closed}} errors.


was (Author: frankgh):
The PR above only addresses the {{Failed to Bind}} errors, but it does not 
addresses the {{Instance class loader is already closed}} errors.

> Flaky Host replacement tests and shrink tests
> -
>
> Key: CASSANDRA-19275
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19275
> Project: Cassandra
>  Issue Type: Bug
>  Components: Analytics Library
>Reporter: Saranya Krishnakumar
>Assignee: Francisco Guerrero
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> During Circle CI runs there are some flaky integration tests, some noticed are
>  * HostReplacementMultiDCTest
>  * HostReplacementMultiDCFailureTest
>  * HostReplacementFailureTest
>  * LeavingSingleFailureTest
> Some of the error message I see in these tests are e.g.
> `java.lang.RuntimeException: java.lang.IllegalStateException: Can't load 
> org.apache.cassandra.utils.concurrent.Ref$OnLeak. Instance class loader is 
> already closed.`
> `java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
> java.lang.IllegalStateException: Failed to bind port 42611 on 127.0.0.2.`
> On repeated run, these tests pass.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14572) Expose all table metrics in virtual table

2024-01-22 Thread Chris Lohfink (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809595#comment-17809595
 ] 

Chris Lohfink commented on CASSANDRA-14572:
---

Wouldn't this still be significantly less GC thrashing than the currently jmx 
polling? Just from people querying queryMBeans(null, null) alone. Currently the 
last couple places I've seen are making these horrible >100-500mb object and 
sending it over in a single message before it even starts pulling metrics (when 
there's a lot of tables/keyspaces).

This is also something that can still be more improved upon as well much more 
than JMX which we don't have much control over.

Maybe in your benchmarking include allocation rates? Compare it to polling 
everything over jmx (although different companies are doing it in different 
ways so hard to be perfect about). Should also test it out if there's 1000 
tables or 1000 keyspaces with 1 table and make sure don't OOM somewhere.

> Expose all table metrics in virtual table
> -
>
> Key: CASSANDRA-14572
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14572
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Observability, Observability/Metrics
>Reporter: Chris Lohfink
>Assignee: Maxim Muzafarov
>Priority: Low
>  Labels: virtual-tables
> Fix For: 5.x
>
> Attachments: keyspayces_group responses times.png, keyspayces_group 
> summary.png, select keyspaces_group by string prefix.png, select 
> keyspaces_group compare with wo.png, select keyspaces_group without 
> value.png, systemv_views.metrics_dropped_message.png, thread_pools 
> benchmark.png
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> While we want a number of virtual tables to display data in a way thats great 
> and intuitive like in nodetool. There is also much for being able to expose 
> the metrics we have for tooling via CQL instead of JMX. This is more for the 
> tooling and adhoc advanced users who know exactly what they are looking for.
> *Schema:*
> Initial idea is to expose data via {{((keyspace, table), metric)}} with a 
> column for each metric value. Could also use a Map or UDT instead of the 
> column based that can be a bit more specific to each metric type. To that end 
> there can be a {{metric_type}} column and then a UDT for each metric type 
> filled in, or a single value with more of a Map style. I am 
> purposing the column type though as with {{ALLOW FILTERING}} it does allow 
> more extensive query capabilities.
> *Implementations:*
> * Use reflection to grab all the metrics from TableMetrics (see: 
> CASSANDRA-7622 impl). This is easiest and least abrasive towards new metric 
> implementors... but its reflection and a kinda a bad idea.
> * Add a hook in TableMetrics to register with this virtual table when 
> registering
> * Pull from the CassandraMetrics registery (either reporter or iterate 
> through metrics query on read of virtual table)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-19275) Flaky Host replacement tests and shrink tests

2024-01-22 Thread Francisco Guerrero (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francisco Guerrero reassigned CASSANDRA-19275:
--

Assignee: Francisco Guerrero

> Flaky Host replacement tests and shrink tests
> -
>
> Key: CASSANDRA-19275
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19275
> Project: Cassandra
>  Issue Type: Bug
>  Components: Analytics Library
>Reporter: Saranya Krishnakumar
>Assignee: Francisco Guerrero
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> During Circle CI runs there are some flaky integration tests, some noticed are
>  * HostReplacementMultiDCTest
>  * HostReplacementMultiDCFailureTest
>  * HostReplacementFailureTest
>  * LeavingSingleFailureTest
> Some of the error message I see in these tests are e.g.
> `java.lang.RuntimeException: java.lang.IllegalStateException: Can't load 
> org.apache.cassandra.utils.concurrent.Ref$OnLeak. Instance class loader is 
> already closed.`
> `java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
> java.lang.IllegalStateException: Failed to bind port 42611 on 127.0.0.2.`
> On repeated run, these tests pass.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19275) Flaky Host replacement tests and shrink tests

2024-01-22 Thread Francisco Guerrero (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809590#comment-17809590
 ] 

Francisco Guerrero commented on CASSANDRA-19275:


The PR above only addresses the {{Failed to Bind}} errors, but it does not 
addresses the {{Instance class loader is already closed}} errors.

> Flaky Host replacement tests and shrink tests
> -
>
> Key: CASSANDRA-19275
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19275
> Project: Cassandra
>  Issue Type: Bug
>  Components: Analytics Library
>Reporter: Saranya Krishnakumar
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> During Circle CI runs there are some flaky integration tests, some noticed are
>  * HostReplacementMultiDCTest
>  * HostReplacementMultiDCFailureTest
>  * HostReplacementFailureTest
>  * LeavingSingleFailureTest
> Some of the error message I see in these tests are e.g.
> `java.lang.RuntimeException: java.lang.IllegalStateException: Can't load 
> org.apache.cassandra.utils.concurrent.Ref$OnLeak. Instance class loader is 
> already closed.`
> `java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
> java.lang.IllegalStateException: Failed to bind port 42611 on 127.0.0.2.`
> On repeated run, these tests pass.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19275) Flaky Host replacement tests and shrink tests

2024-01-22 Thread Francisco Guerrero (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francisco Guerrero updated CASSANDRA-19275:
---
Authors: Francisco Guerrero
Test and Documentation Plan: The patch retries on {{BindException}} while 
provisioning a cluster for shared cluster integration tests.
 Status: Patch Available  (was: Open)

PR: https://github.com/apache/cassandra-analytics/pull/33/files
CI: https://app.circleci.com/pipelines/github/frankgh/cassandra-analytics/106

> Flaky Host replacement tests and shrink tests
> -
>
> Key: CASSANDRA-19275
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19275
> Project: Cassandra
>  Issue Type: Bug
>  Components: Analytics Library
>Reporter: Saranya Krishnakumar
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> During Circle CI runs there are some flaky integration tests, some noticed are
>  * HostReplacementMultiDCTest
>  * HostReplacementMultiDCFailureTest
>  * HostReplacementFailureTest
>  * LeavingSingleFailureTest
> Some of the error message I see in these tests are e.g.
> `java.lang.RuntimeException: java.lang.IllegalStateException: Can't load 
> org.apache.cassandra.utils.concurrent.Ref$OnLeak. Instance class loader is 
> already closed.`
> `java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
> java.lang.IllegalStateException: Failed to bind port 42611 on 127.0.0.2.`
> On repeated run, these tests pass.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19275) Flaky Host replacement tests and shrink tests

2024-01-22 Thread Francisco Guerrero (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francisco Guerrero updated CASSANDRA-19275:
---
 Bug Category: Parent values: Code(13163)
   Complexity: Low Hanging Fruit
  Component/s: Analytics Library
Discovered By: DTest
 Severity: Low
   Status: Open  (was: Triage Needed)

> Flaky Host replacement tests and shrink tests
> -
>
> Key: CASSANDRA-19275
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19275
> Project: Cassandra
>  Issue Type: Bug
>  Components: Analytics Library
>Reporter: Saranya Krishnakumar
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> During Circle CI runs there are some flaky integration tests, some noticed are
>  * HostReplacementMultiDCTest
>  * HostReplacementMultiDCFailureTest
>  * HostReplacementFailureTest
>  * LeavingSingleFailureTest
> Some of the error message I see in these tests are e.g.
> `java.lang.RuntimeException: java.lang.IllegalStateException: Can't load 
> org.apache.cassandra.utils.concurrent.Ref$OnLeak. Instance class loader is 
> already closed.`
> `java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
> java.lang.IllegalStateException: Failed to bind port 42611 on 127.0.0.2.`
> On repeated run, these tests pass.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14572) Expose all table metrics in virtual table

2024-01-22 Thread Michael Semb Wever (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809574#comment-17809574
 ] 

Michael Semb Wever commented on CASSANDRA-14572:


What about selecting all metrics ?  Expect abuse.

> Expose all table metrics in virtual table
> -
>
> Key: CASSANDRA-14572
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14572
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Observability, Observability/Metrics
>Reporter: Chris Lohfink
>Assignee: Maxim Muzafarov
>Priority: Low
>  Labels: virtual-tables
> Fix For: 5.x
>
> Attachments: keyspayces_group responses times.png, keyspayces_group 
> summary.png, select keyspaces_group by string prefix.png, select 
> keyspaces_group compare with wo.png, select keyspaces_group without 
> value.png, systemv_views.metrics_dropped_message.png, thread_pools 
> benchmark.png
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> While we want a number of virtual tables to display data in a way thats great 
> and intuitive like in nodetool. There is also much for being able to expose 
> the metrics we have for tooling via CQL instead of JMX. This is more for the 
> tooling and adhoc advanced users who know exactly what they are looking for.
> *Schema:*
> Initial idea is to expose data via {{((keyspace, table), metric)}} with a 
> column for each metric value. Could also use a Map or UDT instead of the 
> column based that can be a bit more specific to each metric type. To that end 
> there can be a {{metric_type}} column and then a UDT for each metric type 
> filled in, or a single value with more of a Map style. I am 
> purposing the column type though as with {{ALLOW FILTERING}} it does allow 
> more extensive query capabilities.
> *Implementations:*
> * Use reflection to grab all the metrics from TableMetrics (see: 
> CASSANDRA-7622 impl). This is easiest and least abrasive towards new metric 
> implementors... but its reflection and a kinda a bad idea.
> * Add a hook in TableMetrics to register with this virtual table when 
> registering
> * Pull from the CassandraMetrics registery (either reporter or iterate 
> through metrics query on read of virtual table)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[PR] CASSANDRA-19275 Fix flaxy host replacement tests and shrink tests [cassandra-analytics]

2024-01-22 Thread via GitHub


frankgh opened a new pull request, #33:
URL: https://github.com/apache/cassandra-analytics/pull/33

   This patch fixes flaky tests when a `BindException` occurs during cluster 
provisioning. When a `BindException` is encountered, cluster provisioning is 
retried for up-to `MAX_CLUSTER_PROVISION_RETRIES`.
   
   Patch by Francisco Guerrero; Reviewed by TBD for CASSANDRA-19275


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-19284) Harry overrides model

2024-01-22 Thread Alex Petrov (Jira)
Alex Petrov created CASSANDRA-19284:
---

 Summary: Harry overrides model
 Key: CASSANDRA-19284
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19284
 Project: Cassandra
  Issue Type: New Feature
Reporter: Alex Petrov


Harry model to allow providing specific values for the test. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18813) Simplify the bind marker and Term logic

2024-01-22 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809565#comment-17809565
 ] 

Andres de la Peña commented on CASSANDRA-18813:
---

The last changes look good to me, +1.

Regarding the failure of {{{}SnapshotsTest{}}}, I haven't been able to 
reproduce it with the multiplexer, neither [in the base 
branch|https://app.circleci.com/pipelines/github/adelapena/cassandra/3398/workflows/cd28d535-15c5-4ad6-9012-782d5fa9583f]
 nor [in the patched 
branch|https://app.circleci.com/pipelines/github/adelapena/cassandra/3399/workflows/f3dc74ed-9796-4cc3-95bc-d2edc026af52].
 I cannot see how it could be related to the changes, so I'd say we merge this 
and keep an eye on it.

> Simplify the bind marker and Term logic
> ---
>
> Key: CASSANDRA-18813
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18813
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL/Interpreter
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>Priority: Normal
> Fix For: 5.x
>
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>
> The current logic around {{Term}} and {{Terms}} classes is confusing 
> specially with {{MultiItemTerminal}} and {{MultiColumnRaw}} that are used to 
> handle different use cases that could be handled simply with the {{Term}} 
> interface.
> On top of that IN marker add to the confusion because the are represented as 
> single Term where in practice they are a set of terms. Representing them as a 
> {{Terms}} could simplify  the way we handle IN restrictions.
> The goal of this ticket is:
> *  to refactor the {{Term}} and {{Terms}} interfaces to simplify the logic
> * Represents IN bind marker as {{Terms}} instead of having 2 different 
> representations (a list of terms and a single {{MultiItemTerminal}}.
> * Simplify the {{AbstractMarker}} hierachy 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-19283) Update rpm and debian packaging

2024-01-22 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams reassigned CASSANDRA-19283:


Assignee: Brandon Williams

> Update rpm and debian packaging
> ---
>
> Key: CASSANDRA-19283
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19283
> Project: Cassandra
>  Issue Type: Bug
>  Components: Packaging
>Reporter: Ekaterina Dimitrova
>Assignee: Brandon Williams
>Priority: Normal
> Fix For: 5.0.x, 5.x
>
>
> While working on CASSANDRA-19001, it was identified that there are 
> differences between bin/cassandra.in.sh and redhat/cassandra.in.sh, and it 
> seems the debian diff on 5.0 was updated once in 2020 since it was created in 
> 2019.
> CC [~brandon.williams]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19001) Check whether the startup warnings for unknown modules represent a legit problem or cosmetic issue

2024-01-22 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809556#comment-17809556
 ] 

Ekaterina Dimitrova commented on CASSANDRA-19001:
-

CASSANDRA-19283 was opened to facilitate the effort. 

> Check whether the startup warnings for unknown modules represent a legit 
> problem or cosmetic issue
> --
>
> Key: CASSANDRA-19001
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19001
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 5.0-beta2, 5.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> During the 5.0 alpha 2 release 
> [vote|https://lists.apache.org/thread/lt3x0obr5cpbcydf5490pj6b2q0mz5zr], 
> [~paulo] raised the following concerns:
> {code:java}
> Launched a tarball-based 5.0-alpha2 container on top of
> "eclipse-temurin:17-jre-focal" and the server starts up fine, can run
> nodetool and cqlsh.
> I got these seemingly harmless JDK17 warnings during startup and when
> running nodetool (no warnings on JDK11):
> WARNING: Unknown module: jdk.attach specified to --add-exports
> WARNING: Unknown module: jdk.compiler specified to --add-exports
> WARNING: Unknown module: jdk.compiler specified to --add-opens
> WARNING: A terminally deprecated method in java.lang.System has been called
> WARNING: System::setSecurityManager has been called by
> org.apache.cassandra.security.ThreadAwareSecurityManager
> (file:/opt/cassandra/lib/apache-cassandra-5.0-alpha2-SNAPSHOT.jar)
> WARNING: Please consider reporting this to the maintainers of
> org.apache.cassandra.security.ThreadAwareSecurityManager
> WARNING: System::setSecurityManager will be removed in a future release
> Anybody knows if these warnings are legit/expected ? We can create
> follow-up tickets if needed.
> $ java --version
> openjdk 17.0.9 2023-10-17
> OpenJDK Runtime Environment Temurin-17.0.9+9 (build 17.0.9+9)
> OpenJDK 64-Bit Server VM Temurin-17.0.9+9 (build 17.0.9+9, mixed mode,
> sharing)
> {code}
> {code:java}
> Clarification: - When running nodetool only the "Unknown module" warnings 
> show up. All warnings show up during startup.{code}
> We need to verify whether this presents a real problem in the features where 
> those modules are expected to be used, or if it is a false alarm. 
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19283) Update rpm and debian packaging

2024-01-22 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-19283:

 Bug Category: Parent values: Packaging(13660)Level 1 values: Package 
Distribution(13662)
   Complexity: Normal
  Component/s: Packaging
Discovered By: User Report
Fix Version/s: 5.0.x
   5.x
 Severity: Low
 Assignee: (was: Brandon Williams)
   Status: Open  (was: Triage Needed)

> Update rpm and debian packaging
> ---
>
> Key: CASSANDRA-19283
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19283
> Project: Cassandra
>  Issue Type: Bug
>  Components: Packaging
>Reporter: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 5.0.x, 5.x
>
>
> While working on CASSANDRA-19001, it was identified that there are 
> differences between bin/cassandra.in.sh and redhat/cassandra.in.sh, and it 
> seems the debian diff on 5.0 was updated once in 2020 since it was created in 
> 2019.
> CC [~brandon.williams]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-19283) Update rpm and debian packaging

2024-01-22 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams reassigned CASSANDRA-19283:


Assignee: Brandon Williams

> Update rpm and debian packaging
> ---
>
> Key: CASSANDRA-19283
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19283
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Ekaterina Dimitrova
>Assignee: Brandon Williams
>Priority: Normal
>
> While working on CASSANDRA-19001, it was identified that there are 
> differences between bin/cassandra.in.sh and redhat/cassandra.in.sh, and it 
> seems the debian diff on 5.0 was updated once in 2020 since it was created in 
> 2019.
> CC [~brandon.williams]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-19283) Update rpm and debian packaging

2024-01-22 Thread Ekaterina Dimitrova (Jira)
Ekaterina Dimitrova created CASSANDRA-19283:
---

 Summary: Update rpm and debian packaging
 Key: CASSANDRA-19283
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19283
 Project: Cassandra
  Issue Type: Bug
Reporter: Ekaterina Dimitrova


While working on CASSANDRA-19001, it was identified that there are differences 
between bin/cassandra.in.sh and redhat/cassandra.in.sh, and it seems the debian 
diff on 5.0 was updated once in 2020 since it was created in 2019.

CC [~brandon.williams]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19001) Check whether the startup warnings for unknown modules represent a legit problem or cosmetic issue

2024-01-22 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-19001:

  Fix Version/s: 5.0-beta2
 5.1
 (was: 5.x)
 (was: 5.0.x)
 (was: 5.0-rc)
  Since Version: 5.0-alpha1
Source Control Link: To https://github.com/apache/cassandra.git 
8fd44ca8fc..9f5e45e5a2 cassandra-5.0 -> cassandra-5.0 03f0d37cb0..aa644c9dfa 
trunk -> trunk
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Check whether the startup warnings for unknown modules represent a legit 
> problem or cosmetic issue
> --
>
> Key: CASSANDRA-19001
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19001
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 5.0-beta2, 5.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> During the 5.0 alpha 2 release 
> [vote|https://lists.apache.org/thread/lt3x0obr5cpbcydf5490pj6b2q0mz5zr], 
> [~paulo] raised the following concerns:
> {code:java}
> Launched a tarball-based 5.0-alpha2 container on top of
> "eclipse-temurin:17-jre-focal" and the server starts up fine, can run
> nodetool and cqlsh.
> I got these seemingly harmless JDK17 warnings during startup and when
> running nodetool (no warnings on JDK11):
> WARNING: Unknown module: jdk.attach specified to --add-exports
> WARNING: Unknown module: jdk.compiler specified to --add-exports
> WARNING: Unknown module: jdk.compiler specified to --add-opens
> WARNING: A terminally deprecated method in java.lang.System has been called
> WARNING: System::setSecurityManager has been called by
> org.apache.cassandra.security.ThreadAwareSecurityManager
> (file:/opt/cassandra/lib/apache-cassandra-5.0-alpha2-SNAPSHOT.jar)
> WARNING: Please consider reporting this to the maintainers of
> org.apache.cassandra.security.ThreadAwareSecurityManager
> WARNING: System::setSecurityManager will be removed in a future release
> Anybody knows if these warnings are legit/expected ? We can create
> follow-up tickets if needed.
> $ java --version
> openjdk 17.0.9 2023-10-17
> OpenJDK Runtime Environment Temurin-17.0.9+9 (build 17.0.9+9)
> OpenJDK 64-Bit Server VM Temurin-17.0.9+9 (build 17.0.9+9, mixed mode,
> sharing)
> {code}
> {code:java}
> Clarification: - When running nodetool only the "Unknown module" warnings 
> show up. All warnings show up during startup.{code}
> We need to verify whether this presents a real problem in the features where 
> those modules are expected to be used, or if it is a false alarm. 
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19001) Check whether the startup warnings for unknown modules represent a legit problem or cosmetic issue

2024-01-22 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809551#comment-17809551
 ] 

Brandon Williams commented on CASSANDRA-19001:
--

bq. We detect whether we use JDK or JRE and add necessary options in case it is 
JDK. 

We only install the JRE in packaging, but there is nothing to preclude someone 
from installing the JDK too, so now is good time to update the scripts in the 
packaging.

bq. Opening a ticket to check those files on 4.0+ and clear all inconsistencies 
is good. WDYT?

Sure, and I'm happy to take that one.

> Check whether the startup warnings for unknown modules represent a legit 
> problem or cosmetic issue
> --
>
> Key: CASSANDRA-19001
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19001
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 5.0-rc, 5.0.x, 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> During the 5.0 alpha 2 release 
> [vote|https://lists.apache.org/thread/lt3x0obr5cpbcydf5490pj6b2q0mz5zr], 
> [~paulo] raised the following concerns:
> {code:java}
> Launched a tarball-based 5.0-alpha2 container on top of
> "eclipse-temurin:17-jre-focal" and the server starts up fine, can run
> nodetool and cqlsh.
> I got these seemingly harmless JDK17 warnings during startup and when
> running nodetool (no warnings on JDK11):
> WARNING: Unknown module: jdk.attach specified to --add-exports
> WARNING: Unknown module: jdk.compiler specified to --add-exports
> WARNING: Unknown module: jdk.compiler specified to --add-opens
> WARNING: A terminally deprecated method in java.lang.System has been called
> WARNING: System::setSecurityManager has been called by
> org.apache.cassandra.security.ThreadAwareSecurityManager
> (file:/opt/cassandra/lib/apache-cassandra-5.0-alpha2-SNAPSHOT.jar)
> WARNING: Please consider reporting this to the maintainers of
> org.apache.cassandra.security.ThreadAwareSecurityManager
> WARNING: System::setSecurityManager will be removed in a future release
> Anybody knows if these warnings are legit/expected ? We can create
> follow-up tickets if needed.
> $ java --version
> openjdk 17.0.9 2023-10-17
> OpenJDK Runtime Environment Temurin-17.0.9+9 (build 17.0.9+9)
> OpenJDK 64-Bit Server VM Temurin-17.0.9+9 (build 17.0.9+9, mixed mode,
> sharing)
> {code}
> {code:java}
> Clarification: - When running nodetool only the "Unknown module" warnings 
> show up. All warnings show up during startup.{code}
> We need to verify whether this presents a real problem in the features where 
> those modules are expected to be used, or if it is a false alarm. 
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19097) Test Failure: bootstrap_test.TestBootstrap.*

2024-01-22 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809547#comment-17809547
 ] 

Brandon Williams commented on CASSANDRA-19097:
--

That really complicates things.  This means that even if we have some bug in 
streaming causing the corruption, we also have a second bug where the repair 
mechanism of ALL does not work.  I want to believe this is something silly yet 
fundamental, but all the evidence just keeps mounting.

I don't know that it will help (and I'm not sure what will) but maybe if we 
modify the test to tell us _how wrong_ the results are that will be useful.  
Currently we know something is wrong, but it's a sea of text.

> Test Failure: bootstrap_test.TestBootstrap.*
> 
>
> Key: CASSANDRA-19097
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19097
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI
>Reporter: Michael Semb Wever
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> test_killed_wiped_node_cannot_join
> test_read_from_bootstrapped_node
> test_shutdown_wiped_node_cannot_join
> Seen in dtests_offheap in CASSANDRA-19034
> https://app.circleci.com/pipelines/github/michaelsembwever/cassandra/258/workflows/cea7d697-ca33-40bb-8914-fb9fc662820a/jobs/21162/parallel-runs/38
> {noformat}
> self = 
> def test_killed_wiped_node_cannot_join(self):
> >   self._wiped_node_cannot_join_test(gently=False)
> bootstrap_test.py:608: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> self = , gently = False
> def _wiped_node_cannot_join_test(self, gently):
> """
> @jira_ticket CASSANDRA-9765
> Test that if we stop a node and wipe its data then the node cannot 
> join
> when it is not a seed. Test both a nice shutdown or a forced 
> shutdown, via
> the gently parameter.
> """
> cluster = self.cluster
> 
> cluster.set_environment_variable('CASSANDRA_TOKEN_PREGENERATION_DISABLED', 
> 'True')
> cluster.populate(3)
> cluster.start()
> 
> stress_table = 'keyspace1.standard1'
> 
> # write some data
> node1 = cluster.nodelist()[0]
> node1.stress(['write', 'n=10K', 'no-warmup', '-rate', 'threads=8'])
> 
> session = self.patient_cql_connection(node1)
> original_rows = list(session.execute("SELECT * FROM 
> {}".format(stress_table,)))
> 
> # Add a new node, bootstrap=True ensures that it is not a seed
> node4 = new_node(cluster, bootstrap=True)
> node4.start(wait_for_binary_proto=True)
> 
> session = self.patient_cql_connection(node4)
> >   assert original_rows == list(session.execute("SELECT * FROM 
> > {}".format(stress_table,)))
> E   assert [Row(key=b'PP...e9\xbb'), ...] == [Row(key=b'PP...e9\xbb'), 
> ...]
> E At index 587 diff: Row(key=b'OP2656L630', 
> C0=b"E02\xd2\x8clBv\tr\n\xe3\x01\xdd\xf2\x8a\x91\x7f-\x9dm'\xa5\xe7PH\xef\xc1xlO\xab+d",
>  
> C1=b"\xb2\xc0j\xff\xcb'\xe3\xcc\x0b\x93?\x18@\xc4\xc7tV\xb7q\xeeF\x82\xa4\xd3\xdcFl\xd9\x87
>  \x9a\xde\xdc\xa3", 
> C2=b'\xed\xf8\x8d%\xa4\xa6LPs;\x98f\xdb\xca\x913\xba{M\x8d6XW\x01\xea-\xb5  
> C3=b'\x9ec\xcf\xc7\xec\xa5\x85Z]\xa6\x19\xeb\xc4W\x1d%lyZj\xb9\x94I\x90\xebZ\xdba\xdd\xdc\x9e\x82\x95\x1c',
>  
> C4=b'\xab\x9e\x13\x8b\xc6\x15D\x9b\xccl\xdcX\xb23\xd0\x8b\xa3\xba7\xc1c\xf7F\x1d\xf8e\xbd\x89\xcb\xd8\xd1)f\xdd')
>  != Row(key=b'4LN78NONP0', 
> C0=b"\xdf\x90\xb3/u\xc9/C\xcdOYG3\x070@#\xc3k\xaa$M'\x19\xfb\xab\xc0\x10]\xa6\xac\x1d\x81\xad",
>  
> C1=b'\x8a\xb7j\x95\xf9\xbd?&\x11\xaaH\xcd\x87\xaa\xd2\x85\x08X\xea9\x94\xae8U\x92\xad\xb0\x1b9\xff\x87Z\xe81',
>  
> C2=b'6\x1d\xa1-\xf77\xc7\xde+`\xb7\x89\xaa\xcd\xb5_\xe5\xb3\x04\xc7\xb1\x95e\x81s\t1\x8b\x16sc\x0eMm',
>  
> C3=b'\xfbi\x08;\xc9\x94\x15}r\xfe\x1b\xae5\xf6v\x83\xae\xff\x82\x9b`J\xc2D\xa6k+\xf3\xd3\xff{C\xd0;',
>  
> C4=b'\x8f\x87\x18\x0f\xfa\xadK"\x9e\x96\x87:tiu\xa5\x99\xe1_Ax\xa3\x12\xb4Z\xc9v\xa5\xad\xb8{\xc0\xa3\x93')
> E Left contains 2830 more items, first extra item: 
> Row(key=b'5N7N172K30', 
> C0=b'Y\x81\xa6\x02\x89\xa0hyp\x00O\xe9kFp$\x86u\xea\n\x7fK\x99\xe1\xf6G\xf77\xf7\xd7\xe1\xc7L\x...0\x87a\x03\xee',
>  
> C4=b'\xe8\xd8\x17\xf3\x14\x16Q\x9d\\jb\xde=\x81\xc1B\x9c;T\xb1\xa2O-\x87zF=\x04`\x04\xbd\xc9\x95\xad')
> E Full diff:
> E   [
> …
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19097) Test Failure: bootstrap_test.TestBootstrap.*

2024-01-22 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809255#comment-17809255
 ] 

Berenguer Blasi edited comment on CASSANDRA-19097 at 1/22/24 3:20 PM:
--

Another one that failed with the ALL read. Confirmed as the failure reports 
prints the new code 
https://ci-cassandra.apache.org/job/Cassandra-5.0-dtest-offheap/152/jdk=jdk_11_latest,label=cassandra-dtest,split=13/consoleFull


was (Author: bereng):
Another one that failed with the ALL read according to timestamps 
https://ci-cassandra.apache.org/job/Cassandra-5.0-dtest-offheap/152/jdk=jdk_11_latest,label=cassandra-dtest,split=13/consoleFull

> Test Failure: bootstrap_test.TestBootstrap.*
> 
>
> Key: CASSANDRA-19097
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19097
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI
>Reporter: Michael Semb Wever
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> test_killed_wiped_node_cannot_join
> test_read_from_bootstrapped_node
> test_shutdown_wiped_node_cannot_join
> Seen in dtests_offheap in CASSANDRA-19034
> https://app.circleci.com/pipelines/github/michaelsembwever/cassandra/258/workflows/cea7d697-ca33-40bb-8914-fb9fc662820a/jobs/21162/parallel-runs/38
> {noformat}
> self = 
> def test_killed_wiped_node_cannot_join(self):
> >   self._wiped_node_cannot_join_test(gently=False)
> bootstrap_test.py:608: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> self = , gently = False
> def _wiped_node_cannot_join_test(self, gently):
> """
> @jira_ticket CASSANDRA-9765
> Test that if we stop a node and wipe its data then the node cannot 
> join
> when it is not a seed. Test both a nice shutdown or a forced 
> shutdown, via
> the gently parameter.
> """
> cluster = self.cluster
> 
> cluster.set_environment_variable('CASSANDRA_TOKEN_PREGENERATION_DISABLED', 
> 'True')
> cluster.populate(3)
> cluster.start()
> 
> stress_table = 'keyspace1.standard1'
> 
> # write some data
> node1 = cluster.nodelist()[0]
> node1.stress(['write', 'n=10K', 'no-warmup', '-rate', 'threads=8'])
> 
> session = self.patient_cql_connection(node1)
> original_rows = list(session.execute("SELECT * FROM 
> {}".format(stress_table,)))
> 
> # Add a new node, bootstrap=True ensures that it is not a seed
> node4 = new_node(cluster, bootstrap=True)
> node4.start(wait_for_binary_proto=True)
> 
> session = self.patient_cql_connection(node4)
> >   assert original_rows == list(session.execute("SELECT * FROM 
> > {}".format(stress_table,)))
> E   assert [Row(key=b'PP...e9\xbb'), ...] == [Row(key=b'PP...e9\xbb'), 
> ...]
> E At index 587 diff: Row(key=b'OP2656L630', 
> C0=b"E02\xd2\x8clBv\tr\n\xe3\x01\xdd\xf2\x8a\x91\x7f-\x9dm'\xa5\xe7PH\xef\xc1xlO\xab+d",
>  
> C1=b"\xb2\xc0j\xff\xcb'\xe3\xcc\x0b\x93?\x18@\xc4\xc7tV\xb7q\xeeF\x82\xa4\xd3\xdcFl\xd9\x87
>  \x9a\xde\xdc\xa3", 
> C2=b'\xed\xf8\x8d%\xa4\xa6LPs;\x98f\xdb\xca\x913\xba{M\x8d6XW\x01\xea-\xb5  
> C3=b'\x9ec\xcf\xc7\xec\xa5\x85Z]\xa6\x19\xeb\xc4W\x1d%lyZj\xb9\x94I\x90\xebZ\xdba\xdd\xdc\x9e\x82\x95\x1c',
>  
> C4=b'\xab\x9e\x13\x8b\xc6\x15D\x9b\xccl\xdcX\xb23\xd0\x8b\xa3\xba7\xc1c\xf7F\x1d\xf8e\xbd\x89\xcb\xd8\xd1)f\xdd')
>  != Row(key=b'4LN78NONP0', 
> C0=b"\xdf\x90\xb3/u\xc9/C\xcdOYG3\x070@#\xc3k\xaa$M'\x19\xfb\xab\xc0\x10]\xa6\xac\x1d\x81\xad",
>  
> C1=b'\x8a\xb7j\x95\xf9\xbd?&\x11\xaaH\xcd\x87\xaa\xd2\x85\x08X\xea9\x94\xae8U\x92\xad\xb0\x1b9\xff\x87Z\xe81',
>  
> C2=b'6\x1d\xa1-\xf77\xc7\xde+`\xb7\x89\xaa\xcd\xb5_\xe5\xb3\x04\xc7\xb1\x95e\x81s\t1\x8b\x16sc\x0eMm',
>  
> C3=b'\xfbi\x08;\xc9\x94\x15}r\xfe\x1b\xae5\xf6v\x83\xae\xff\x82\x9b`J\xc2D\xa6k+\xf3\xd3\xff{C\xd0;',
>  
> C4=b'\x8f\x87\x18\x0f\xfa\xadK"\x9e\x96\x87:tiu\xa5\x99\xe1_Ax\xa3\x12\xb4Z\xc9v\xa5\xad\xb8{\xc0\xa3\x93')
> E Left contains 2830 more items, first extra item: 
> Row(key=b'5N7N172K30', 
> C0=b'Y\x81\xa6\x02\x89\xa0hyp\x00O\xe9kFp$\x86u\xea\n\x7fK\x99\xe1\xf6G\xf77\xf7\xd7\xe1\xc7L\x...0\x87a\x03\xee',
>  
> C4=b'\xe8\xd8\x17\xf3\x14\x16Q\x9d\\jb\xde=\x81\xc1B\x9c;T\xb1\xa2O-\x87zF=\x04`\x04\xbd\xc9\x95\xad')
> E Full diff:
> E   [
> …
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-16565) Remove dependency on sigar

2024-01-22 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-16565:
--
Complexity: Normal  (was: Low Hanging Fruit)

> Remove dependency on sigar
> --
>
> Key: CASSANDRA-16565
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16565
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Build
>Reporter: David Capwell
>Assignee: Claude Warren
>Priority: Normal
> Fix For: 5.x
>
>
> sigar is used to check if the environment has good settings for running C*, 
> but requires we bundle a lot of native libraries to perform this check (which 
> can also be done elsewhere).  This project also appears to be dead as the 
> last commit was around 6 years ago.
> With the move to resolve artifacts rather than commit them, removing this 
> dependency would remove majority of the artifacts fetched from GitHub.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19118) Add support of vector type to COPY command

2024-01-22 Thread Jira


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andres de la Peña updated CASSANDRA-19118:
--
  Fix Version/s: 5.0-beta2
 5.1
 (was: 5.x)
 (was: 5.0-rc)
  Since Version: 5.0-alpha1
Source Control Link: 
https://github.com/apache/cassandra/commit/c76b32492f08c4af56846518488ae0b191e077e8
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Add support of vector type to COPY command
> --
>
> Key: CASSANDRA-19118
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19118
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/cqlsh
>Reporter: Szymon Miezal
>Assignee: Szymon Miezal
>Priority: Normal
> Fix For: 5.0-beta2, 5.1
>
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> Currently it's not possible to import rows with vector literals via {{COPY}} 
> command.
> STR:
>  * Create a table
> {code:sql}
> CREATE TABLE testcopyfrom (id text PRIMARY KEY, embedding_vector 
> VECTOR
> {code}
>  * Prepare csv file with sample data, for instance:
> {code:sql}
> 1,"[0.1, 0.2, 0.3, 0.4, 0.5, 0.6]"
> 2,"[-0.1, -0.2, -0.3, -0.4, -0.5, -0.6]" {code}
>  * in cqlsh run
> {code:sql}
> COPY ks.testcopyfrom FROM data.csv
> {code}
> It will result in getting:
> {code:sql}
> TypeError: Received an argument of invalid type for column 
> "embedding_vector". Expected: , 
> Got: ; (required argument is not a float){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19118) Add support of vector type to COPY command

2024-01-22 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809480#comment-17809480
 ] 

Andres de la Peña commented on CASSANDRA-19118:
---

Committed to {{cassandra-5.0}} as 
[c76b32492f08c4af56846518488ae0b191e077e8|https://github.com/apache/cassandra/commit/c76b32492f08c4af56846518488ae0b191e077e8]
 and merged to 
[{{trunk}}|https://github.com/apache/cassandra/commit/1e44a0850b8589e5dce3d9750f2c11add713d3f8].

Thanks for the patch and reviews.

> Add support of vector type to COPY command
> --
>
> Key: CASSANDRA-19118
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19118
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/cqlsh
>Reporter: Szymon Miezal
>Assignee: Szymon Miezal
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> Currently it's not possible to import rows with vector literals via {{COPY}} 
> command.
> STR:
>  * Create a table
> {code:sql}
> CREATE TABLE testcopyfrom (id text PRIMARY KEY, embedding_vector 
> VECTOR
> {code}
>  * Prepare csv file with sample data, for instance:
> {code:sql}
> 1,"[0.1, 0.2, 0.3, 0.4, 0.5, 0.6]"
> 2,"[-0.1, -0.2, -0.3, -0.4, -0.5, -0.6]" {code}
>  * in cqlsh run
> {code:sql}
> COPY ks.testcopyfrom FROM data.csv
> {code}
> It will result in getting:
> {code:sql}
> TypeError: Received an argument of invalid type for column 
> "embedding_vector". Expected: , 
> Got: ; (required argument is not a float){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra) branch trunk updated (46b90364da -> 1e44a0850b)

2024-01-22 Thread adelapena
This is an automated email from the ASF dual-hosted git repository.

adelapena pushed a change to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


from 46b90364da Change IP address of the CMS node during transition
 new c76b32492f Add support of vector type to cqlsh COPY command
 new 1e44a0850b Merge branch 'cassandra-5.0' into trunk

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGES.txt|   1 +
 pylib/cqlshlib/copyutil.py |   9 +-
 .../apache/cassandra/tools/cqlsh/CqlshTest.java| 126 -
 3 files changed, 132 insertions(+), 4 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra) 01/01: Merge branch 'cassandra-5.0' into trunk

2024-01-22 Thread adelapena
This is an automated email from the ASF dual-hosted git repository.

adelapena pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit 1e44a0850b8589e5dce3d9750f2c11add713d3f8
Merge: 46b90364da c76b32492f
Author: Andrés de la Peña 
AuthorDate: Mon Jan 22 14:37:38 2024 +

Merge branch 'cassandra-5.0' into trunk

 CHANGES.txt|   1 +
 pylib/cqlshlib/copyutil.py |   9 +-
 .../apache/cassandra/tools/cqlsh/CqlshTest.java| 126 -
 3 files changed, 132 insertions(+), 4 deletions(-)

diff --cc CHANGES.txt
index 290185e085,1d71cb52c3..eefc85784f
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,13 -1,5 +1,14 @@@
 -5.0-beta2
 +5.1
 + * Limit cassandra startup to supported JDKs, allow higher JDKs by setting 
CASSANDRA_JDK_UNSUPPORTED (CASSANDRA-18688)
 + * Standardize nodetool tablestats formatting of data units (CASSANDRA-19104)
 + * Make nodetool tablestats use number of significant digits for time and 
average values consistently (CASSANDRA-19015)
 + * Upgrade jackson to 2.15.3 and snakeyaml to 2.1 (CASSANDRA-18875)
 + * Transactional Cluster Metadata [CEP-21] (CASSANDRA-18330)
 + * Add ELAPSED command to cqlsh (CASSANDRA-18861)
 + * Add the ability to disable bulk loading of SSTables (CASSANDRA-18781)
 + * Clean up obsolete functions and simplify cql_version handling in cqlsh 
(CASSANDRA-18787)
 +Merged from 5.0:
+  * Add support of vector type to cqlsh COPY command (CASSANDRA-19118)
   * Make CQLSSTableWriter to support building of SAI indexes (CASSANDRA-18714)
   * Append additional JVM options when using JDK17+ (CASSANDRA-19001)
   * Upgrade Python driver to 3.29.0 (CASSANDRA-19245)


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra) branch cassandra-5.0 updated: Add support of vector type to cqlsh COPY command

2024-01-22 Thread adelapena
This is an automated email from the ASF dual-hosted git repository.

adelapena pushed a commit to branch cassandra-5.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/cassandra-5.0 by this push:
 new c76b32492f Add support of vector type to cqlsh COPY command
c76b32492f is described below

commit c76b32492f08c4af56846518488ae0b191e077e8
Author: Szymon Miężał 
AuthorDate: Thu Nov 30 17:56:48 2023 +0100

Add support of vector type to cqlsh COPY command

This patch adds a converter that allows parsing vector literals
passed via csv files to the COPY command.

patch by Szymon Miezal; reviewed by Andrés de la Peña, Stefan Miklosovic 
and Maxwell Guo for CASSANDRA-19118
---
 CHANGES.txt|   1 +
 pylib/cqlshlib/copyutil.py |   9 +-
 .../apache/cassandra/tools/cqlsh/CqlshTest.java| 126 -
 3 files changed, 132 insertions(+), 4 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index a7859ee9ec..1d71cb52c3 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 5.0-beta2
+ * Add support of vector type to cqlsh COPY command (CASSANDRA-19118)
  * Make CQLSSTableWriter to support building of SAI indexes (CASSANDRA-18714)
  * Append additional JVM options when using JDK17+ (CASSANDRA-19001)
  * Upgrade Python driver to 3.29.0 (CASSANDRA-19245)
diff --git a/pylib/cqlshlib/copyutil.py b/pylib/cqlshlib/copyutil.py
index 2a8a11d1bf..af35731005 100644
--- a/pylib/cqlshlib/copyutil.py
+++ b/pylib/cqlshlib/copyutil.py
@@ -46,7 +46,7 @@ from queue import Queue
 
 from cassandra import OperationTimedOut
 from cassandra.cluster import Cluster, DefaultConnection
-from cassandra.cqltypes import ReversedType, UserType, VarcharType
+from cassandra.cqltypes import ReversedType, UserType, VarcharType, VectorType
 from cassandra.metadata import protect_name, protect_names, protect_value
 from cassandra.policies import RetryPolicy, WhiteListRoundRobinPolicy, 
DCAwareRoundRobinPolicy, FallthroughRetryPolicy
 from cassandra.query import BatchStatement, BatchType, SimpleStatement, 
tuple_factory
@@ -2074,6 +2074,12 @@ class ImportConversion(object):
 return ImmutableDict(frozenset((convert_mandatory(ct.subtypes[0], 
v[0]), convert(ct.subtypes[1], v[1]))
  for v in [split(split_format_str % vv, 
sep=sep) for vv in split(val)]))
 
+def convert_vector(val, ct=cql_type):
+string_coordinates = split(val)
+if len(string_coordinates) != ct.vector_size:
+raise ParseError("The length of given vector value '%d' is not 
equal to the vector size from the type definition '%d'" % 
(len(string_coordinates), ct.vector_size))
+return [convert_mandatory(ct.subtype, v) for v in 
string_coordinates]
+
 def convert_user_type(val, ct=cql_type):
 """
 A user type is a dictionary except that we must convert each key 
into
@@ -2130,6 +2136,7 @@ class ImportConversion(object):
 'map': convert_map,
 'tuple': convert_tuple,
 'frozen': convert_single_subtype,
+VectorType.typename: convert_vector,
 }
 
 return converters.get(cql_type.typename, convert_unknown)
diff --git a/test/unit/org/apache/cassandra/tools/cqlsh/CqlshTest.java 
b/test/unit/org/apache/cassandra/tools/cqlsh/CqlshTest.java
index 4e6dd2088b..356769b840 100644
--- a/test/unit/org/apache/cassandra/tools/cqlsh/CqlshTest.java
+++ b/test/unit/org/apache/cassandra/tools/cqlsh/CqlshTest.java
@@ -18,16 +18,24 @@
 
 package org.apache.cassandra.tools.cqlsh;
 
+import java.io.IOException;
+import java.io.Writer;
+import java.nio.charset.StandardCharsets;
+import java.nio.file.Files;
+import java.nio.file.Path;
+
 import org.junit.BeforeClass;
 import org.junit.Test;
 
 import org.apache.cassandra.cql3.CQLTester;
+import org.apache.cassandra.cql3.UntypedResultSet;
 import org.apache.cassandra.tools.ToolRunner;
 import org.apache.cassandra.tools.ToolRunner.ToolResult;
-import org.hamcrest.CoreMatchers;
 
+import static java.lang.String.format;
+import static org.assertj.core.api.Assertions.assertThat;
 import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertThat;
+import static org.junit.Assert.assertTrue;
 
 public class CqlshTest extends CQLTester
 {
@@ -41,7 +49,119 @@ public class CqlshTest extends CQLTester
 public void testKeyspaceRequired()
 {
 ToolResult tool = ToolRunner.invokeCqlsh("SELECT * FROM test");
-assertThat(tool.getCleanedStderr(), 
CoreMatchers.containsStringIgnoringCase("No keyspace has been specified"));
+tool.asserts().errorContains("No keyspace has been specified");
 assertEquals(2, tool.getExitCode());
 }
+
+@Test
+public void testCopyFloatVector() throws IOException
+{
+assertCopyOfVectorTypeSucceeds("float", 6, new 

[jira] [Comment Edited] (CASSANDRA-16565) Remove dependency on sigar

2024-01-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809474#comment-17809474
 ] 

Stefan Miklosovic edited comment on CASSANDRA-16565 at 1/22/24 2:30 PM:


OK so the reason this is happening is that the patch also removes these lines:

{noformat}
-
-
{noformat}

That first mkdir have created "${local.repository}/org/apache/cassandra/deps/" 
as side-effect upon creating dir for sigar-bin into which all files were 
downloaded in "_resolver-dist-lib_get_files". As we removed that line, there is 
no dir created so following download will fail.

I think the fix of adding mkdir into "_resolver-dist-lib_get_files" is the 
correct one.

[~jlewandowski] already +1ed on the PR. I am going to build this one more time 
and then I approach the merge. cc [~mck]


was (Author: smiklosovic):
OK so the reason this is happening is that the patch also removes these lines:

{noformat}
-
-
{noformat}

That first mkdir have created "${local.repository}/org/apache/cassandra/deps/" 
into which all files were downloaded in "_resolver-dist-lib_get_files". As we 
removed that line, there is no dir created so following download will fail.

I think the fix of adding mkdir into "_resolver-dist-lib_get_files" is the 
correct one.

[~jlewandowski] already +1ed on the PR. I am going to build this one more time 
and then I approach the merge. cc [~mck]

> Remove dependency on sigar
> --
>
> Key: CASSANDRA-16565
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16565
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Build
>Reporter: David Capwell
>Assignee: Claude Warren
>Priority: Normal
> Fix For: 5.x
>
>
> sigar is used to check if the environment has good settings for running C*, 
> but requires we bundle a lot of native libraries to perform this check (which 
> can also be done elsewhere).  This project also appears to be dead as the 
> last commit was around 6 years ago.
> With the move to resolve artifacts rather than commit them, removing this 
> dependency would remove majority of the artifacts fetched from GitHub.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16565) Remove dependency on sigar

2024-01-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809474#comment-17809474
 ] 

Stefan Miklosovic commented on CASSANDRA-16565:
---

OK so the reason this is happening is that the patch also removes these lines:

{noformat}
-
-
{noformat}

That first mkdir have created "${local.repository}/org/apache/cassandra/deps/" 
into which all files were downloaded in "_resolver-dist-lib_get_files". As we 
removed that line, there is no dir created so following download will fail.

I think the fix of adding mkdir into "_resolver-dist-lib_get_files" is the 
correct one.

[~jlewandowski] already +1ed on the PR. I am going to build this one more time 
and then I approach the merge. cc [~mck]

> Remove dependency on sigar
> --
>
> Key: CASSANDRA-16565
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16565
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Build
>Reporter: David Capwell
>Assignee: Claude Warren
>Priority: Normal
> Fix For: 5.x
>
>
> sigar is used to check if the environment has good settings for running C*, 
> but requires we bundle a lot of native libraries to perform this check (which 
> can also be done elsewhere).  This project also appears to be dead as the 
> last commit was around 6 years ago.
> With the move to resolve artifacts rather than commit them, removing this 
> dependency would remove majority of the artifacts fetched from GitHub.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra) branch trunk updated: Change IP address of the CMS node during transition

2024-01-22 Thread ifesdjeen
This is an automated email from the ASF dual-hosted git repository.

ifesdjeen pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/trunk by this push:
 new 46b90364da Change IP address of the CMS node during transition
46b90364da is described below

commit 46b90364daecf1880db5eda9899d7353ad81f445
Author: Alex Petrov 
AuthorDate: Thu Dec 21 13:47:22 2023 +0100

Change IP address of the CMS node during transition

Patch by Alex Petrov; reviewed by Sam Tunnicliffe and Marcus Eriksson for 
CASSANDRA-19219
---
 .../cassandra/locator/CMSPlacementStrategy.java|  4 --
 .../cassandra/tcm/transformations/Startup.java | 20 +++
 .../distributed/test/cms/CMSAddressChangeTest.java | 67 ++
 .../test/log/ClusterMetadataTestHelper.java| 13 +
 4 files changed, 100 insertions(+), 4 deletions(-)

diff --git a/src/java/org/apache/cassandra/locator/CMSPlacementStrategy.java 
b/src/java/org/apache/cassandra/locator/CMSPlacementStrategy.java
index ea4f0cbb9a..754687a199 100644
--- a/src/java/org/apache/cassandra/locator/CMSPlacementStrategy.java
+++ b/src/java/org/apache/cassandra/locator/CMSPlacementStrategy.java
@@ -28,7 +28,6 @@ import java.util.stream.Collectors;
 
 import com.google.common.annotations.VisibleForTesting;
 
-import org.apache.cassandra.gms.FailureDetector;
 import org.apache.cassandra.schema.ReplicationParams;
 import org.apache.cassandra.tcm.ClusterMetadata;
 import org.apache.cassandra.tcm.Transformation;
@@ -137,9 +136,6 @@ public interface CMSPlacementStrategy
 
 public Boolean apply(ClusterMetadata metadata, NodeId nodeId)
 {
-if 
(!FailureDetector.instance.isAlive(metadata.directory.endpoint(nodeId)))
-return false;
-
 if (metadata.directory.peerState(nodeId) != NodeState.JOINED)
 return false;
 
diff --git a/src/java/org/apache/cassandra/tcm/transformations/Startup.java 
b/src/java/org/apache/cassandra/tcm/transformations/Startup.java
index b26cc3655c..b4d4007e43 100644
--- a/src/java/org/apache/cassandra/tcm/transformations/Startup.java
+++ b/src/java/org/apache/cassandra/tcm/transformations/Startup.java
@@ -24,7 +24,10 @@ import java.util.Objects;
 
 import org.apache.cassandra.io.util.DataInputPlus;
 import org.apache.cassandra.io.util.DataOutputPlus;
+import org.apache.cassandra.locator.InetAddressAndPort;
+import org.apache.cassandra.locator.Replica;
 import org.apache.cassandra.schema.Keyspaces;
+import org.apache.cassandra.schema.ReplicationParams;
 import org.apache.cassandra.tcm.ClusterMetadata;
 import org.apache.cassandra.tcm.ClusterMetadataService;
 import org.apache.cassandra.tcm.Transformation;
@@ -32,12 +35,14 @@ import org.apache.cassandra.tcm.membership.Directory;
 import org.apache.cassandra.tcm.membership.NodeAddresses;
 import org.apache.cassandra.tcm.membership.NodeId;
 import org.apache.cassandra.tcm.membership.NodeVersion;
+import org.apache.cassandra.tcm.ownership.DataPlacement;
 import org.apache.cassandra.tcm.ownership.DataPlacements;
 import org.apache.cassandra.tcm.sequences.LockedRanges;
 import org.apache.cassandra.tcm.serialization.MetadataSerializer;
 import org.apache.cassandra.tcm.serialization.Version;
 
 import static org.apache.cassandra.exceptions.ExceptionCode.INVALID;
+import static org.apache.cassandra.tcm.ownership.EntireRange.entireRange;
 
 public class Startup implements Transformation
 {
@@ -87,6 +92,21 @@ public class Startup implements Transformation

  next.build().metadata,

  allKeyspaces);
 
+if (prev.isCMSMember(prev.directory.endpoint(nodeId)))
+{
+ReplicationParams metaParams = ReplicationParams.meta(prev);
+InetAddressAndPort endpoint = prev.directory.endpoint(nodeId);
+Replica leavingReplica = new Replica(endpoint, entireRange, 
true);
+Replica joiningReplica = new 
Replica(addresses.broadcastAddress, entireRange, true);
+
+DataPlacement.Builder builder = 
prev.placements.get(metaParams).unbuild();
+builder.reads.withoutReplica(prev.nextEpoch(), leavingReplica);
+builder.writes.withoutReplica(prev.nextEpoch(), 
leavingReplica);
+builder.reads.withReplica(prev.nextEpoch(), joiningReplica);
+builder.writes.withReplica(prev.nextEpoch(), joiningReplica);
+newPlacement = newPlacement.unbuild().with(metaParams, 
builder.build()).build();
+}
+
 next = next.with(newPlacement);
 }
 
diff --git 
a/test/distributed/org/apache/cassandra/distributed/test/cms/CMSAddressChangeTest.java
 

[jira] [Updated] (CASSANDRA-19219) CMS: restarting a CMS node with different ip address

2024-01-22 Thread Alex Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-19219:

  Since Version: 5.1
Source Control Link: 
https://github.com/apache/cassandra/commit/46b90364daecf1880db5eda9899d7353ad81f445
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> CMS: restarting a CMS node with different ip address
> 
>
> Key: CASSANDRA-19219
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19219
> Project: Cassandra
>  Issue Type: Bug
>  Components: Transactional Cluster Metadata
>Reporter: Paul Chandler
>Assignee: Alex Petrov
>Priority: Normal
> Fix For: 5.1-alpha1
>
> Attachments: ci_summary.html, result_details.tar.gz
>
>
> I am simulating running a cluster in Kubernetes and testing what happens when 
> a pod goes down and is re created with a new ip address, the data is all 
> stored on a detached volume so when the new pod is created all the old data 
> for the node is reattached. In 4.0 this is handled correctly the node will 
> come back up with the same hostid, tokens etc, just a new ip address and the 
> cluster is healthy throughout.
>  
> To simulate this I create a 3 node cluster on a local machine using 3 
> loopback addresses
> 127.0.0.1
> 127.0.0.2
> 127.0.0.3
> I then run nodetool -p 7199 reconfigurecms datacenter1:3 --sync to create 3 
> CMS nodes
> I then bring down 127.0.0.1 and replace the rpc_address and listen_address 
> with 127.0.0.4 and re start the node. The node then hangs with this as the 
> last error message:
> (8821185654333640868,9200867415893016118]=ForRange\{lastModified=Epoch{epoch=12},
>  
> endpointsForRange=[Full(/127.0.0.1:7000,(8821185654333640868,9200867415893016118]),
>  Full(/127.0.0.2:7000,(8821185654333640868,9200867415893016118]), 
> Full(/127.0.0.3:7000,(8821185654333640868,9200867415893016118])]},
> }}}, lockedRanges=LockedRanges\{lastModified=Epoch{epoch=14}, locked={}}}. 
> This can mean that this node is configured differently from CMS.
> java.lang.AssertionError: not aware of any cluster members
>         at 
> org.apache.cassandra.locator.NetworkTopologyStrategy.calculateNaturalReplicas(NetworkTopologyStrategy.java:233)
>         at 
> org.apache.cassandra.locator.CMSPlacementStrategy$DatacenterAware.reconfigure(CMSPlacementStrategy.java:119)
>         at 
> org.apache.cassandra.tcm.transformations.cms.PrepareCMSReconfiguration$Complex.execute(PrepareCMSReconfiguration.java:164)
>         at 
> org.apache.cassandra.tcm.log.LocalLog.processPendingInternal(LocalLog.java:429)
>         at 
> org.apache.cassandra.tcm.log.LocalLog$Async$AsyncRunnable.run(LocalLog.java:682)
>         at 
> org.apache.cassandra.concurrent.InfiniteLoopExecutor.loop(InfiniteLoopExecutor.java:121)
>         at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>         at java.base/java.lang.Thread.run(Thread.java:829)
> WARN  [GlobalLogFollower] 2023-12-21 11:11:34,408 LocalLog.java:693 - 
> Stopping log processing on the node... All subsequent epochs will be ignored.
> org.apache.cassandra.tcm.log.LocalLog$StopProcessingException: 
> java.lang.AssertionError: not aware of any cluster members
>         at 
> org.apache.cassandra.tcm.log.LocalLog.processPendingInternal(LocalLog.java:434)
>         at 
> org.apache.cassandra.tcm.log.LocalLog$Async$AsyncRunnable.run(LocalLog.java:682)
>         at 
> org.apache.cassandra.concurrent.InfiniteLoopExecutor.loop(InfiniteLoopExecutor.java:121)
>         at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>         at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: java.lang.AssertionError: not aware of any cluster members
>         at 
> org.apache.cassandra.locator.NetworkTopologyStrategy.calculateNaturalReplicas(NetworkTopologyStrategy.java:233)
>         at 
> org.apache.cassandra.locator.CMSPlacementStrategy$DatacenterAware.reconfigure(CMSPlacementStrategy.java:119)
>         at 
> org.apache.cassandra.tcm.transformations.cms.PrepareCMSReconfiguration$Complex.execute(PrepareCMSReconfiguration.java:164)
>         at 
> org.apache.cassandra.tcm.log.LocalLog.processPendingInternal(LocalLog.java:429)
>         ... 4 common frames omitted



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19281) NativeTransportEncryptionOptionsTest fails often in CI

2024-01-22 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809465#comment-17809465
 ] 

Ekaterina Dimitrova commented on CASSANDRA-19281:
-

[~smiklosovic], I believe this issue will be solved in CASSANDRA-19239.

> NativeTransportEncryptionOptionsTest fails often in CI
> --
>
> Key: CASSANDRA-19281
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19281
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Stefan Miklosovic
>Priority: Normal
>
> NativeTransportEncryptionOptionsTest seems to be quite flaky. I happen to see 
> a lot of these in CI recently for trunk.
> org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest
> testEndpointVerificationEnabledIpNotInSAN
> testOptionalMtlsModeDoNotAllowNonSSLConnections
> unencryptedNativeConnectionNotlisteningOnTlsPortTest
> testEndpointVerificationEnabledIpNotInSAN
> This is the error basically for all of them
> {noformat}
> junit.framework.AssertionFailedError: Forked Java VM exited 
> abnormally. Please note the time in the report does not reflect the time 
> until the VM exit.
> at 
> jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> {noformat}
> They do pass locally just fine.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19001) Check whether the startup warnings for unknown modules represent a legit problem or cosmetic issue

2024-01-22 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809464#comment-17809464
 ] 

Ekaterina Dimitrova commented on CASSANDRA-19001:
-

{quote}I don't think packaging should need this, since it controls the jre/jdk 
versions used already. 
{quote}
[~brandon.williams], I am not sure I understand you correctly. We do not 
control jdk/jre as part of the committed patch. We detect whether we use JDK or 
JRE and add necessary options in case it is JDK. 
{quote}That said, we don't regularly update them unless it's necessary, so we 
could take this opportunity to do so.
{quote}
Opening a ticket to check those files on 4.0+ and clear all inconsistencies is 
good. WDYT?

bq.  [~e.dimitrova] shouldn't be this ticket closed?
There is a discussion about whether we need to patch also the rpm and debian 
packages, too. Thus, I left it open until we figure it out. 

 

> Check whether the startup warnings for unknown modules represent a legit 
> problem or cosmetic issue
> --
>
> Key: CASSANDRA-19001
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19001
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 5.0-rc, 5.0.x, 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> During the 5.0 alpha 2 release 
> [vote|https://lists.apache.org/thread/lt3x0obr5cpbcydf5490pj6b2q0mz5zr], 
> [~paulo] raised the following concerns:
> {code:java}
> Launched a tarball-based 5.0-alpha2 container on top of
> "eclipse-temurin:17-jre-focal" and the server starts up fine, can run
> nodetool and cqlsh.
> I got these seemingly harmless JDK17 warnings during startup and when
> running nodetool (no warnings on JDK11):
> WARNING: Unknown module: jdk.attach specified to --add-exports
> WARNING: Unknown module: jdk.compiler specified to --add-exports
> WARNING: Unknown module: jdk.compiler specified to --add-opens
> WARNING: A terminally deprecated method in java.lang.System has been called
> WARNING: System::setSecurityManager has been called by
> org.apache.cassandra.security.ThreadAwareSecurityManager
> (file:/opt/cassandra/lib/apache-cassandra-5.0-alpha2-SNAPSHOT.jar)
> WARNING: Please consider reporting this to the maintainers of
> org.apache.cassandra.security.ThreadAwareSecurityManager
> WARNING: System::setSecurityManager will be removed in a future release
> Anybody knows if these warnings are legit/expected ? We can create
> follow-up tickets if needed.
> $ java --version
> openjdk 17.0.9 2023-10-17
> OpenJDK Runtime Environment Temurin-17.0.9+9 (build 17.0.9+9)
> OpenJDK 64-Bit Server VM Temurin-17.0.9+9 (build 17.0.9+9, mixed mode,
> sharing)
> {code}
> {code:java}
> Clarification: - When running nodetool only the "Unknown module" warnings 
> show up. All warnings show up during startup.{code}
> We need to verify whether this presents a real problem in the features where 
> those modules are expected to be used, or if it is a false alarm. 
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19208) Fix ClusterMetadataUpgradeHarryTest

2024-01-22 Thread Alex Petrov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809462#comment-17809462
 ] 

Alex Petrov commented on CASSANDRA-19208:
-

Committed to trunk with [b10e2693443bb5eb5c9b3d561f8d5e47ac092a8c 
|https://github.com/apache/cassandra/commit/b10e2693443bb5eb5c9b3d561f8d5e47ac092a8c]

> Fix ClusterMetadataUpgradeHarryTest
> ---
>
> Key: CASSANDRA-19208
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19208
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>Priority: Normal
> Fix For: 5.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Unfortunately I do not have a stack trace handy, but the test was essentially 
> timing out after CMS was falling into an infinite loop during initialisation. 
> A problem was caused by the fact that if there is a paxos timeout during 
> propose phase, CMS would get stuck in an infinite loop trying to catch up, 
> and not finding anything to catch up from.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19208) Fix ClusterMetadataUpgradeHarryTest

2024-01-22 Thread Alex Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-19208:

  Fix Version/s: 5.1
  Since Version: 5.1
Source Control Link: 
https://github.com/apache/cassandra/commit/b10e2693443bb5eb5c9b3d561f8d5e47ac092a8c
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Fix ClusterMetadataUpgradeHarryTest
> ---
>
> Key: CASSANDRA-19208
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19208
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>Priority: Normal
> Fix For: 5.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Unfortunately I do not have a stack trace handy, but the test was essentially 
> timing out after CMS was falling into an infinite loop during initialisation. 
> A problem was caused by the fact that if there is a paxos timeout during 
> propose phase, CMS would get stuck in an infinite loop trying to catch up, 
> and not finding anything to catch up from.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra) branch trunk updated: Fix Harry Upgrade Test - primodal epoch initialization

2024-01-22 Thread ifesdjeen
This is an automated email from the ASF dual-hosted git repository.

ifesdjeen pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/trunk by this push:
 new b10e269344 Fix Harry Upgrade Test - primodal epoch initialization
b10e269344 is described below

commit b10e2693443bb5eb5c9b3d561f8d5e47ac092a8c
Author: Alex Petrov 
AuthorDate: Sat Dec 23 13:30:22 2023 +0100

Fix Harry Upgrade Test - primodal epoch initialization

 Patch by Alex Petrov, reviewed by Sam Tunnicliffe for CASSANDRA-19208.
---
 .../cassandra/schema/DistributedMetadataLogKeyspace.java| 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git 
a/src/java/org/apache/cassandra/schema/DistributedMetadataLogKeyspace.java 
b/src/java/org/apache/cassandra/schema/DistributedMetadataLogKeyspace.java
index 9d16360c18..84fd433a70 100644
--- a/src/java/org/apache/cassandra/schema/DistributedMetadataLogKeyspace.java
+++ b/src/java/org/apache/cassandra/schema/DistributedMetadataLogKeyspace.java
@@ -93,7 +93,18 @@ public final class DistributedMetadataLogKeyspace
  Period.FIRST, 
FIRST.getEpoch(), FIRST.getEpoch(),
  
Transformation.Kind.PRE_INITIALIZE_CMS.toVersionedBytes(PreInitialize.blank()), 
Transformation.Kind.PRE_INITIALIZE_CMS.toString(), Entry.Id.NONE.entryId);
 
-return result.one().getBoolean("[applied]");
+UntypedResultSet.Row row = result.one();
+if (row.getBoolean("[applied]"))
+return true;
+
+if (row.getLong("epoch") == FIRST.getEpoch() &&
+row.getLong("period") == Period.FIRST &&
+row.getLong("current_epoch") == FIRST.getEpoch() &&
+row.getLong("entry_id") == Entry.Id.NONE.entryId &&
+
Transformation.Kind.PRE_INITIALIZE_CMS.toString().equals(row.getString("kind")))
+return true;
+
+throw new IllegalStateException("Could not initialize log.");
 }
 catch (CasWriteTimeoutException t)
 {


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19001) Check whether the startup warnings for unknown modules represent a legit problem or cosmetic issue

2024-01-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809421#comment-17809421
 ] 

Stefan Miklosovic commented on CASSANDRA-19001:
---

[~e.dimitrova] shouldn't be this ticket closed?

> Check whether the startup warnings for unknown modules represent a legit 
> problem or cosmetic issue
> --
>
> Key: CASSANDRA-19001
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19001
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 5.0-rc, 5.0.x, 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> During the 5.0 alpha 2 release 
> [vote|https://lists.apache.org/thread/lt3x0obr5cpbcydf5490pj6b2q0mz5zr], 
> [~paulo] raised the following concerns:
> {code:java}
> Launched a tarball-based 5.0-alpha2 container on top of
> "eclipse-temurin:17-jre-focal" and the server starts up fine, can run
> nodetool and cqlsh.
> I got these seemingly harmless JDK17 warnings during startup and when
> running nodetool (no warnings on JDK11):
> WARNING: Unknown module: jdk.attach specified to --add-exports
> WARNING: Unknown module: jdk.compiler specified to --add-exports
> WARNING: Unknown module: jdk.compiler specified to --add-opens
> WARNING: A terminally deprecated method in java.lang.System has been called
> WARNING: System::setSecurityManager has been called by
> org.apache.cassandra.security.ThreadAwareSecurityManager
> (file:/opt/cassandra/lib/apache-cassandra-5.0-alpha2-SNAPSHOT.jar)
> WARNING: Please consider reporting this to the maintainers of
> org.apache.cassandra.security.ThreadAwareSecurityManager
> WARNING: System::setSecurityManager will be removed in a future release
> Anybody knows if these warnings are legit/expected ? We can create
> follow-up tickets if needed.
> $ java --version
> openjdk 17.0.9 2023-10-17
> OpenJDK Runtime Environment Temurin-17.0.9+9 (build 17.0.9+9)
> OpenJDK 64-Bit Server VM Temurin-17.0.9+9 (build 17.0.9+9, mixed mode,
> sharing)
> {code}
> {code:java}
> Clarification: - When running nodetool only the "Unknown module" warnings 
> show up. All warnings show up during startup.{code}
> We need to verify whether this presents a real problem in the features where 
> those modules are expected to be used, or if it is a false alarm. 
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19282) Investigate a different way how to verify CQLSSTableWriterTest data after loading

2024-01-22 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-19282:
--
Summary: Investigate a different way how to verify CQLSSTableWriterTest 
data after loading  (was: Investigate different way how to verify 
CQLSSTableWriterTest data after loading)

> Investigate a different way how to verify CQLSSTableWriterTest data after 
> loading
> -
>
> Key: CASSANDRA-19282
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19282
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Stefan Miklosovic
>Assignee: Doug Rohrer
>Priority: Normal
> Fix For: 5.0.x, 5.x
>
>
> In CASSANDRA-18714, we realized that client tests 
> (CQLSSTableWriterClientTest) are not executing the same set of tests as 
> CQLSSTableWriterTest does. This unfortunate discrepancy was addressed in 
> CASSANDRA-18714 were both client and daemon tests extend abstract test class 
> where all tests are hence client and daemon run same tests, just from 
> different angle.
> This clear improvement in the consolidation of tests is not 100% pure in 
> trunk. Due to nature of the implementation in trunk, the only thing which is 
> skipped in trunk is verification of data after generated SSTables are loaded 
> in case of client tests. This was not done before CASSANDRA-18714 either 
> (because client tests were not testing what daemon ones would in the first 
> place). 
> This ticket is for the investigation how to unify tests for both modes in 
> trunk, we might probably load SSTables on the disk by means of 
> SSTableScanner, for example.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-19282) Investigate different way how to verify CQLSSTableWriterTest data after loading

2024-01-22 Thread Stefan Miklosovic (Jira)
Stefan Miklosovic created CASSANDRA-19282:
-

 Summary: Investigate different way how to verify 
CQLSSTableWriterTest data after loading
 Key: CASSANDRA-19282
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19282
 Project: Cassandra
  Issue Type: Improvement
Reporter: Stefan Miklosovic
Assignee: Doug Rohrer


In CASSANDRA-18714, we realized that client tests (CQLSSTableWriterClientTest) 
are not executing the same set of tests as CQLSSTableWriterTest does. This 
unfortunate discrepancy was addressed in CASSANDRA-18714 were both client and 
daemon tests extend abstract test class where all tests are hence client and 
daemon run same tests, just from different angle.

This clear improvement in the consolidation of tests is not 100% pure in trunk. 
Due to nature of the implementation in trunk, the only thing which is skipped 
in trunk is verification of data after generated SSTables are loaded in case of 
client tests. This was not done before CASSANDRA-18714 either (because client 
tests were not testing what daemon ones would in the first place). 

This ticket is for the investigation how to unify tests for both modes in 
trunk, we might probably load SSTables on the disk by means of SSTableScanner, 
for example.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19282) Investigate different way how to verify CQLSSTableWriterTest data after loading

2024-01-22 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-19282:
--
Fix Version/s: 5.0.x
   5.x

> Investigate different way how to verify CQLSSTableWriterTest data after 
> loading
> ---
>
> Key: CASSANDRA-19282
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19282
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Stefan Miklosovic
>Assignee: Doug Rohrer
>Priority: Normal
> Fix For: 5.0.x, 5.x
>
>
> In CASSANDRA-18714, we realized that client tests 
> (CQLSSTableWriterClientTest) are not executing the same set of tests as 
> CQLSSTableWriterTest does. This unfortunate discrepancy was addressed in 
> CASSANDRA-18714 were both client and daemon tests extend abstract test class 
> where all tests are hence client and daemon run same tests, just from 
> different angle.
> This clear improvement in the consolidation of tests is not 100% pure in 
> trunk. Due to nature of the implementation in trunk, the only thing which is 
> skipped in trunk is verification of data after generated SSTables are loaded 
> in case of client tests. This was not done before CASSANDRA-18714 either 
> (because client tests were not testing what daemon ones would in the first 
> place). 
> This ticket is for the investigation how to unify tests for both modes in 
> trunk, we might probably load SSTables on the disk by means of 
> SSTableScanner, for example.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14572) Expose all table metrics in virtual table

2024-01-22 Thread Maxim Muzafarov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809407#comment-17809407
 ] 

Maxim Muzafarov commented on CASSANDRA-14572:
-

It is implemented so that continuous polling of a metric does not cause 
problems, as shown in the benchmarks above :) I did a lot of testing for that 
to verify the final result. 

> Expose all table metrics in virtual table
> -
>
> Key: CASSANDRA-14572
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14572
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Observability, Observability/Metrics
>Reporter: Chris Lohfink
>Assignee: Maxim Muzafarov
>Priority: Low
>  Labels: virtual-tables
> Fix For: 5.x
>
> Attachments: keyspayces_group responses times.png, keyspayces_group 
> summary.png, select keyspaces_group by string prefix.png, select 
> keyspaces_group compare with wo.png, select keyspaces_group without 
> value.png, systemv_views.metrics_dropped_message.png, thread_pools 
> benchmark.png
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> While we want a number of virtual tables to display data in a way thats great 
> and intuitive like in nodetool. There is also much for being able to expose 
> the metrics we have for tooling via CQL instead of JMX. This is more for the 
> tooling and adhoc advanced users who know exactly what they are looking for.
> *Schema:*
> Initial idea is to expose data via {{((keyspace, table), metric)}} with a 
> column for each metric value. Could also use a Map or UDT instead of the 
> column based that can be a bit more specific to each metric type. To that end 
> there can be a {{metric_type}} column and then a UDT for each metric type 
> filled in, or a single value with more of a Map style. I am 
> purposing the column type though as with {{ALLOW FILTERING}} it does allow 
> more extensive query capabilities.
> *Implementations:*
> * Use reflection to grab all the metrics from TableMetrics (see: 
> CASSANDRA-7622 impl). This is easiest and least abrasive towards new metric 
> implementors... but its reflection and a kinda a bad idea.
> * Add a hook in TableMetrics to register with this virtual table when 
> registering
> * Pull from the CassandraMetrics registery (either reporter or iterate 
> through metrics query on read of virtual table)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-18714) Expand CQLSSTableWriter to write SSTable-attached secondary indexes

2024-01-22 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-18714:
--
Source Control Link: 
https://github.com/apache/cassandra/commit/016dd6ca376ac1080bba9a1e2a6fe1d4b037e751
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Expand CQLSSTableWriter to write SSTable-attached secondary indexes
> ---
>
> Key: CASSANDRA-18714
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18714
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/SAI, Tool/bulk load
>Reporter: Caleb Rackliffe
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
> Attachments: client-mode-cqlsstablewriter-tests.patch
>
>  Time Spent: 13h 40m
>  Remaining Estimate: 0h
>
> {{CQLSSTableWriter}} currently has no way of writing any secondary indexes 
> inline as it writes the core SSTable components. With SAI, this has become 
> tractable problem, and we should be able to enhance both it and 
> {{SSTableImporter}} to handle cases where we might want to write SSTables 
> somewhere in bulk (and in parallel) and then import them without waiting for 
> index building on import. It would require the following changes:
> 1.) {{CQLSSTableWriter}} must accept 2i definitions on top of its current 
> table schema definition. Once added to the schema, any {{ColumnFamilyStore}} 
> instances opened will have those 2i defined in their index managers.
> 2.) All {{AbstractSSTableSimpleWriter}} instances must register index groups, 
> allowing the proper {{SSTableFlushObservers}} to be attached to 
> {{SSTableWriter}}. Once this is done, SAI (and any other SSTable-attached 
> indexes) components will be built incrementally along w/ the SSTable data 
> file, and will be finalized when the newly written SSTable is finalized.
> 3.) Provide an example (in a unit test?) of how a third-party tool might, 
> assuming access to the right C* JAR, validate/checksum SAI components outside 
> C* proper.
> 4.) {{SSTableImporter}} should have two new options:
> a.) an option that fails import if any SSTable-attached 2i must be built 
> (i.e. has not already been built and brought along w/ the other new SSTable 
> components)
> b.) an option that allows us to bypass full checksum validation on 
> imported/already-built SSTable-attached indexes (assuming they have just been 
> written by {{CQLSSTableWriter}})



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra) branch cassandra-5.0 updated (9f5e45e5a2 -> 016dd6ca37)

2024-01-22 Thread smiklosovic
This is an automated email from the ASF dual-hosted git repository.

smiklosovic pushed a change to branch cassandra-5.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git


from 9f5e45e5a2 Append additional JVM options when using JDK17+
 add 016dd6ca37 Make CQLSSTableWriter to support building of SAI indexes

No new revisions were added by this update.

Summary of changes:
 CHANGES.txt|   1 +
 src/java/org/apache/cassandra/config/Config.java   |  18 +-
 .../org/apache/cassandra/db/ColumnFamilyStore.java |  89 +++--
 .../cassandra/db/ColumnFamilyStoreMBean.java   |  41 ++-
 .../org/apache/cassandra/db/SSTableImporter.java   |  72 +++-
 .../apache/cassandra/db/memtable/TrieMemtable.java |  19 +-
 .../db/streaming/CassandraStreamReceiver.java  |   2 +-
 src/java/org/apache/cassandra/index/Index.java |  13 +-
 .../cassandra/index/SecondaryIndexManager.java |   7 +-
 .../cassandra/index/sai/SSTableContextManager.java |   2 +-
 .../index/sai/StorageAttachedIndexBuilder.java |   2 +-
 .../index/sai/StorageAttachedIndexGroup.java   |  12 +-
 .../index/sai/disk/format/IndexDescriptor.java |  35 +-
 .../sai/disk/v1/segment/SegmentTrieBuffer.java |   4 +-
 .../cassandra/index/sai/view/IndexViewManager.java |   2 +-
 .../io/sstable/AbstractSSTableSimpleWriter.java|  13 +-
 .../cassandra/io/sstable/CQLSSTableWriter.java | 149 ++--
 .../locator/AbstractReplicationStrategy.java   |   3 +-
 src/java/org/apache/cassandra/tools/NodeProbe.java |  12 +-
 .../apache/cassandra/tools/nodetool/Import.java|  15 +-
 test/unit/org/apache/cassandra/db/ImportTest.java  | 229 +++-
 .../org/apache/cassandra/index/sai/SAITester.java  |   4 +-
 .../io/sstable/CQLSSTableWriterClientTest.java |  69 +---
 .../sstable/CQLSSTableWriterConcurrencyTest.java   |   2 +-
 .../io/sstable/CQLSSTableWriterDaemonTest.java |  29 +-
 .../cassandra/io/sstable/CQLSSTableWriterTest.java | 382 ++---
 26 files changed, 892 insertions(+), 334 deletions(-)
 copy src/java/org/apache/cassandra/io/sstable/GaugeProvider.java => 
test/unit/org/apache/cassandra/io/sstable/CQLSSTableWriterDaemonTest.java (61%)


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra) branch trunk updated (aa644c9dfa -> e81b2f54b4)

2024-01-22 Thread smiklosovic
This is an automated email from the ASF dual-hosted git repository.

smiklosovic pushed a change to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


from aa644c9dfa Merge branch 'cassandra-5.0' into trunk
 add 016dd6ca37 Make CQLSSTableWriter to support building of SAI indexes
 new e81b2f54b4 Merge branch 'cassandra-5.0' into trunk

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGES.txt|1 +
 src/java/org/apache/cassandra/config/Config.java   |   18 +-
 .../org/apache/cassandra/db/ColumnFamilyStore.java |   88 +-
 .../cassandra/db/ColumnFamilyStoreMBean.java   |   41 +-
 .../org/apache/cassandra/db/SSTableImporter.java   |   72 +-
 .../apache/cassandra/db/memtable/TrieMemtable.java |   19 +-
 .../db/streaming/CassandraStreamReceiver.java  |2 +-
 src/java/org/apache/cassandra/index/Index.java |   13 +-
 .../cassandra/index/SecondaryIndexManager.java |7 +-
 .../cassandra/index/sai/SSTableContextManager.java |2 +-
 .../index/sai/StorageAttachedIndexBuilder.java |2 +-
 .../index/sai/StorageAttachedIndexGroup.java   |   12 +-
 .../index/sai/disk/format/IndexDescriptor.java |   35 +-
 .../sai/disk/v1/segment/SegmentTrieBuffer.java |4 +-
 .../cassandra/index/sai/view/IndexViewManager.java |2 +-
 .../io/sstable/AbstractSSTableSimpleWriter.java|   13 +-
 .../cassandra/io/sstable/CQLSSTableWriter.java |  171 ++-
 src/java/org/apache/cassandra/tools/NodeProbe.java |   12 +-
 .../apache/cassandra/tools/nodetool/Import.java|   15 +-
 test/unit/org/apache/cassandra/db/ImportTest.java  |  222 +++-
 .../org/apache/cassandra/index/sai/SAITester.java  |4 +-
 .../io/sstable/CQLSSTableWriterClientTest.java |   78 +-
 .../sstable/CQLSSTableWriterConcurrencyTest.java   |2 +-
 .../sstable/CQLSSTableWriterDaemonTest.java}   |   29 +-
 .../cassandra/io/sstable/CQLSSTableWriterTest.java | 1209 
 25 files changed, 1377 insertions(+), 696 deletions(-)
 copy test/unit/org/apache/cassandra/{db/commitlog/GroupCommitLogTest.java => 
io/sstable/CQLSSTableWriterDaemonTest.java} (57%)


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra) 01/01: Merge branch 'cassandra-5.0' into trunk

2024-01-22 Thread smiklosovic
This is an automated email from the ASF dual-hosted git repository.

smiklosovic pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit e81b2f54b4cb05ea25720a8a481ec951a20b809a
Merge: aa644c9dfa 016dd6ca37
Author: Stefan Miklosovic 
AuthorDate: Mon Jan 22 12:44:59 2024 +0100

Merge branch 'cassandra-5.0' into trunk

 CHANGES.txt|1 +
 src/java/org/apache/cassandra/config/Config.java   |   18 +-
 .../org/apache/cassandra/db/ColumnFamilyStore.java |   88 +-
 .../cassandra/db/ColumnFamilyStoreMBean.java   |   41 +-
 .../org/apache/cassandra/db/SSTableImporter.java   |   72 +-
 .../apache/cassandra/db/memtable/TrieMemtable.java |   19 +-
 .../db/streaming/CassandraStreamReceiver.java  |2 +-
 src/java/org/apache/cassandra/index/Index.java |   13 +-
 .../cassandra/index/SecondaryIndexManager.java |7 +-
 .../cassandra/index/sai/SSTableContextManager.java |2 +-
 .../index/sai/StorageAttachedIndexBuilder.java |2 +-
 .../index/sai/StorageAttachedIndexGroup.java   |   12 +-
 .../index/sai/disk/format/IndexDescriptor.java |   35 +-
 .../sai/disk/v1/segment/SegmentTrieBuffer.java |4 +-
 .../cassandra/index/sai/view/IndexViewManager.java |2 +-
 .../io/sstable/AbstractSSTableSimpleWriter.java|   13 +-
 .../cassandra/io/sstable/CQLSSTableWriter.java |  171 ++-
 src/java/org/apache/cassandra/tools/NodeProbe.java |   12 +-
 .../apache/cassandra/tools/nodetool/Import.java|   15 +-
 test/unit/org/apache/cassandra/db/ImportTest.java  |  222 +++-
 .../org/apache/cassandra/index/sai/SAITester.java  |4 +-
 .../io/sstable/CQLSSTableWriterClientTest.java |   78 +-
 .../sstable/CQLSSTableWriterConcurrencyTest.java   |2 +-
 .../io/sstable/CQLSSTableWriterDaemonTest.java |   44 +
 .../cassandra/io/sstable/CQLSSTableWriterTest.java | 1209 
 25 files changed, 1406 insertions(+), 682 deletions(-)

diff --cc CHANGES.txt
index ebf48c2315,a7859ee9ec..290185e085
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,13 -1,5 +1,14 @@@
 -5.0-beta2
 +5.1
 + * Limit cassandra startup to supported JDKs, allow higher JDKs by setting 
CASSANDRA_JDK_UNSUPPORTED (CASSANDRA-18688)
 + * Standardize nodetool tablestats formatting of data units (CASSANDRA-19104)
 + * Make nodetool tablestats use number of significant digits for time and 
average values consistently (CASSANDRA-19015)
 + * Upgrade jackson to 2.15.3 and snakeyaml to 2.1 (CASSANDRA-18875)
 + * Transactional Cluster Metadata [CEP-21] (CASSANDRA-18330)
 + * Add ELAPSED command to cqlsh (CASSANDRA-18861)
 + * Add the ability to disable bulk loading of SSTables (CASSANDRA-18781)
 + * Clean up obsolete functions and simplify cql_version handling in cqlsh 
(CASSANDRA-18787)
 +Merged from 5.0:
+  * Make CQLSSTableWriter to support building of SAI indexes (CASSANDRA-18714)
   * Append additional JVM options when using JDK17+ (CASSANDRA-19001)
   * Upgrade Python driver to 3.29.0 (CASSANDRA-19245)
   * Creating a SASI index after creating an SAI index does not break secondary 
index queries (CASSANDRA-18939)
diff --cc src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index 1dc2687c40,bcf4dc7073..be6136dd4a
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@@ -389,20 -383,22 +391,21 @@@ public class ColumnFamilyStore implemen
  // only update these runtime-modifiable settings if they have not 
been modified.
  if (!minCompactionThreshold.isModified())
  for (ColumnFamilyStore cfs : concatWithIndexes())
 -cfs.minCompactionThreshold = new 
DefaultValue(metadata().params.compaction.minCompactionThreshold());
 +cfs.minCompactionThreshold = new 
DefaultValue<>(tableMetadata.params.compaction.minCompactionThreshold());
  if (!maxCompactionThreshold.isModified())
  for (ColumnFamilyStore cfs : concatWithIndexes())
 -cfs.maxCompactionThreshold = new 
DefaultValue(metadata().params.compaction.maxCompactionThreshold());
 +cfs.maxCompactionThreshold = new 
DefaultValue<>(tableMetadata.params.compaction.maxCompactionThreshold());
  if (!crcCheckChance.isModified())
  for (ColumnFamilyStore cfs : concatWithIndexes())
 -cfs.crcCheckChance = new 
DefaultValue(metadata().params.crcCheckChance);
 -
 -
compactionStrategyManager.maybeReloadParamsFromSchema(metadata().params.compaction);
 +cfs.crcCheckChance = new 
DefaultValue<>(tableMetadata.params.crcCheckChance);
  
 -indexManager.reload();
 +
compactionStrategyManager.maybeReloadParamsFromSchema(tableMetadata.params.compaction);
  
 -memtableFactory = metadata().params.memtable.factory();
 +indexManager.reload(tableMetadata);
  
 +memtableFactory = tableMetadata.params.memtable.factory();
-

[jira] [Commented] (CASSANDRA-19118) Add support of vector type to COPY command

2024-01-22 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809397#comment-17809397
 ] 

Andres de la Peña commented on CASSANDRA-19118:
---

CI results look good to me:

* {{AlterTest.testAlterStatementWithAdd}} has been recently hit by 
CASSANDRA-19245
* {{AlterTest.testDropListAndAddListWithSameName}} is CASSANDRA-18360
* {{coordinatorIsBehindTest}} is a timeout
* {{testOptionalMtlsModeDoNotAllowNonSSLConnections}} is a timeout
* {{test_move_single_node_localhost}} is on Butler
* {{HarrySimulatorTest}} is on Butler

[~smiklosovic] [~maxwellguo] if you don't have anything else to add I'll commit 
the changes.

> Add support of vector type to COPY command
> --
>
> Key: CASSANDRA-19118
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19118
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/cqlsh
>Reporter: Szymon Miezal
>Assignee: Szymon Miezal
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> Currently it's not possible to import rows with vector literals via {{COPY}} 
> command.
> STR:
>  * Create a table
> {code:sql}
> CREATE TABLE testcopyfrom (id text PRIMARY KEY, embedding_vector 
> VECTOR
> {code}
>  * Prepare csv file with sample data, for instance:
> {code:sql}
> 1,"[0.1, 0.2, 0.3, 0.4, 0.5, 0.6]"
> 2,"[-0.1, -0.2, -0.3, -0.4, -0.5, -0.6]" {code}
>  * in cqlsh run
> {code:sql}
> COPY ks.testcopyfrom FROM data.csv
> {code}
> It will result in getting:
> {code:sql}
> TypeError: Received an argument of invalid type for column 
> "embedding_vector". Expected: , 
> Got: ; (required argument is not a float){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19118) Add support of vector type to COPY command

2024-01-22 Thread Jira


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andres de la Peña updated CASSANDRA-19118:
--
Status: Ready to Commit  (was: Review In Progress)

> Add support of vector type to COPY command
> --
>
> Key: CASSANDRA-19118
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19118
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/cqlsh
>Reporter: Szymon Miezal
>Assignee: Szymon Miezal
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> Currently it's not possible to import rows with vector literals via {{COPY}} 
> command.
> STR:
>  * Create a table
> {code:sql}
> CREATE TABLE testcopyfrom (id text PRIMARY KEY, embedding_vector 
> VECTOR
> {code}
>  * Prepare csv file with sample data, for instance:
> {code:sql}
> 1,"[0.1, 0.2, 0.3, 0.4, 0.5, 0.6]"
> 2,"[-0.1, -0.2, -0.3, -0.4, -0.5, -0.6]" {code}
>  * in cqlsh run
> {code:sql}
> COPY ks.testcopyfrom FROM data.csv
> {code}
> It will result in getting:
> {code:sql}
> TypeError: Received an argument of invalid type for column 
> "embedding_vector". Expected: , 
> Got: ; (required argument is not a float){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra-builds) branch trunk updated: ninja-fix make removing of dangling docker volumes a failsafe attempt

2024-01-22 Thread mck
This is an automated email from the ASF dual-hosted git repository.

mck pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra-builds.git


The following commit(s) were added to refs/heads/trunk by this push:
 new 1664da7  ninja-fix make removing of dangling docker volumes a failsafe 
attempt
1664da7 is described below

commit 1664da759bb183759d89e8a70200ca804f86d6cd
Author: Mick Semb Wever 
AuthorDate: Mon Jan 22 11:57:19 2024 +0100

ninja-fix make removing of dangling docker volumes a failsafe attempt
---
 contribulyze/contribulyze.aliases | 14 +-
 contribulyze/contribulyze.py  | 14 +++---
 contribulyze/contribulyze.sh  | 24 ++--
 jenkins-dsl/cassandra_job_dsl_seed.groovy | 26 +-
 4 files changed, 51 insertions(+), 27 deletions(-)

diff --git a/contribulyze/contribulyze.aliases 
b/contribulyze/contribulyze.aliases
index 044dfe5..46f5628 100644
--- a/contribulyze/contribulyze.aliases
+++ b/contribulyze/contribulyze.aliases
@@ -1,10 +1,10 @@
 Aaron Morton,amorton
 Adam Holmberg,aholmber,Adam Holberg
 Adrian Cole,Adrain Cole
-Aleksandr Sorokoumov,Aleksandr Soromoukov
+Aleksandr Sorokoumov,Aleksandr Soromoukov,Aleksandr Sorokoumov
 Aleksey Yeschenko,iamaleksey,Aleksey,ayeschenko,Aleksey Yeshchenko,iamaleskey
 Aleksei Zotov,Alexey Zotov,Aleksei Zoto,azotcsit,Alexander Zotov
-Andrés de la Peña,Andres de la Peña,adelapena,Andres de la Pena,adelpena
+Andrés de la Peña,Andres de la Peña,adelapena,Andres de la 
Pena,adelpena,Andrés de la Peña
 Ariel Weisberg,aweisberg,Ariel,awiesberg
 Ben Coverston,bcoverston
 Benedict Elliott Smith,Benedict,bes,Benedict Elliot 
Smith,belliottsmith,belliotsmith,Benedict Elliott-Smith
@@ -21,10 +21,12 @@ David Capwell,dcapwell
 Deepak Vohra,dvohra
 Dinesh Joshi,Dinesh A. Joshi
 Eduard Tudenhöfner,Eduard Tudenhoefner
-Ekaterina Dimitrova,Ekaterina Dimotrova,edimitrova,Ekaterina Dimitrov
+Ekaterina Dimitrova,Ekaterina Dimotrova,edimitrova,Ekaterina 
Dimitrov,Ekaterina Dimimtrova
 Eric Evans,eevans
 Eric Ramirez,Erick Ramirez,Erick Ramirex,erickramirezau
 Gary Dusbabek,gdusbabek,gdusbabke,gdusbabe
+Henry Hughes,hhughes
+Jacek Lewandowski,jacek-lewandowski,jlewandowski,Jacek Lewandowski,Jacek 
Lewandowki
 Jake Luciani,T Jake Luciani,tjake,jake
 Jaroslaw Grabowski,jtgrabowski
 Jason Brown,jasobrown,jasobraown
@@ -35,15 +37,17 @@ Jonathan Ellis,jbellis,jebllis
 Joel Knighton,jknighton
 Jordan West,jrwest
 Joey Lynch,Joseph Lynch
-Josh McKenzie,jmckenzie,josh,JoshuaMcKenzie,Joshua McKenzie
+Josh McKenzie,jmckenzie,josh,JoshuaMcKenzie,Joshua McKenzie,josh-mckenzie
 Jun Rao,junrao
 Lorina Poland,polandll
 Marcus Eriksson,Marcuse,Marcus,Krummas,Marcus Erikkson
 Matthew Dennis,mdennis
 Maxwell Guo,Maxwell-Guo,maxwellguo,xuanling.gc
+Maxim Muzafarov,mmuzaf
 Michael Kjellman,mkjellman
-Mick Semb Wever,Michael Semb Wever,Mck SembWever,Michael SembWever,Mck Semb 
Wever
+Mick Semb Wever,Michael Semb Wever,Mck SembWever,Michael SembWever,Mck Semb 
Wever,Mchael Semb Wever,Michaem Semb Wever
 Nate McCall,zznate,Nat McCall
+Ningzi Zhan,ningzi.zhan
 Paul Cannon,paul cannon,pcannon
 Paulo Motta,paulo,pauloricardomg,pmotta,Paul Motta
 Pavel Yaskevich,Pavel YAskevich,pyaskevich
diff --git a/contribulyze/contribulyze.py b/contribulyze/contribulyze.py
index f7489b3..17527ed 100755
--- a/contribulyze/contribulyze.py
+++ b/contribulyze/contribulyze.py
@@ -335,7 +335,7 @@ class Contributor(object):
   out.write('\n')
   out.write('\n' % (log.sha, log.sha))
   out.write('\n')
-  sha = 'https://github.com/apache/cassandra/commit/%s;>%s' % 
(log.sha, log.sha)
+  sha = 'https://github.com/search?q=org:apache+%s+repo:apache/cassandra*=commits=advsearch;>%s'
 % (log.sha, log.sha)
   out.write('%s | %s | %s\n\n' % (sha, escape_html(log.author), 
log.date))
   out.write(spam_guard_in_html_block(re.sub(r'for CASSANDRA-([0-9]+)', 
r'for https://issues.apache.org/jira/browse/CASSANDRA-\1;>CASSANDRA-\1', 
escape_html(log.message
   out.write('\n')
@@ -453,6 +453,7 @@ log_header_re = re.compile('^commit ([0-9a-z]+)$')
 patch_by_re = re.compile('(?:.*\n)*.*patch by ([^;]+)(;|,)', 
flags=re.IGNORECASE | re.MULTILINE)
 reviewed_by_re = re.compile('(?:.*\n)*.*[;, ](?:review|test)(?:ed)? by 
((?:.|\n)+?)(?=(?: |\n)+for(?: |\n)+(?:cassandra-|#[0-9]+))', 
flags=re.IGNORECASE | re.MULTILINE)
 coauthored_by_re = re.compile(' *co-authored-by: ([^<]+)', re.IGNORECASE)
+author_re = re.compile('^Author: ([^<]+)')
 
 def graze(input):
   line = input.readline()
@@ -471,12 +472,19 @@ def graze(input):
   if LogMessage.latest is None: LogMessage.latest = log.date
   patch_field = Field("Patch")
   review_field = Field("Review")
+  m = author_re.match(log.author)
+  if m:
+  c = Contributor.get(" ".join(m.group(1).strip().split()), None)
+  patch_field.add_contributor(c)
+  log.add_field(patch_field)
+  

[jira] [Commented] (CASSANDRA-14572) Expose all table metrics in virtual table

2024-01-22 Thread Aleksey Yeschenko (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809361#comment-17809361
 ] 

Aleksey Yeschenko commented on CASSANDRA-14572:
---

It should be implemented in a way that regular polling is not a problem - no 
more than periodic polling of an equivalent regular table read.

> Expose all table metrics in virtual table
> -
>
> Key: CASSANDRA-14572
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14572
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Observability, Observability/Metrics
>Reporter: Chris Lohfink
>Assignee: Maxim Muzafarov
>Priority: Low
>  Labels: virtual-tables
> Fix For: 5.x
>
> Attachments: keyspayces_group responses times.png, keyspayces_group 
> summary.png, select keyspaces_group by string prefix.png, select 
> keyspaces_group compare with wo.png, select keyspaces_group without 
> value.png, systemv_views.metrics_dropped_message.png, thread_pools 
> benchmark.png
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> While we want a number of virtual tables to display data in a way thats great 
> and intuitive like in nodetool. There is also much for being able to expose 
> the metrics we have for tooling via CQL instead of JMX. This is more for the 
> tooling and adhoc advanced users who know exactly what they are looking for.
> *Schema:*
> Initial idea is to expose data via {{((keyspace, table), metric)}} with a 
> column for each metric value. Could also use a Map or UDT instead of the 
> column based that can be a bit more specific to each metric type. To that end 
> there can be a {{metric_type}} column and then a UDT for each metric type 
> filled in, or a single value with more of a Map style. I am 
> purposing the column type though as with {{ALLOW FILTERING}} it does allow 
> more extensive query capabilities.
> *Implementations:*
> * Use reflection to grab all the metrics from TableMetrics (see: 
> CASSANDRA-7622 impl). This is easiest and least abrasive towards new metric 
> implementors... but its reflection and a kinda a bad idea.
> * Add a hook in TableMetrics to register with this virtual table when 
> registering
> * Pull from the CassandraMetrics registery (either reporter or iterate 
> through metrics query on read of virtual table)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19239) jvm-dtests crash on java 17

2024-01-22 Thread Jacek Lewandowski (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809317#comment-17809317
 ] 

Jacek Lewandowski commented on CASSANDRA-19239:
---

Thanks [~e.dimitrova], I agree, if there is no 5.0 failure, it does not make 
sense to make it a blocker for 5.0-rc.


> jvm-dtests crash on java 17
> ---
>
> Key: CASSANDRA-19239
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19239
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/java
>Reporter: Jacek Lewandowski
>Priority: Normal
> Fix For: 5.x
>
>
> This is a similar problem to the one mentioned in 
> https://issues.apache.org/jira/browse/CASSANDRA-15981
> I'm filling it because I've noticed the same problem on JDK17, perhaps we 
> should also disable unloading classes with CMS for JDK17. 
> However, I'm in favour of moving tests to G1 instead.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19281) NativeTransportEncryptionOptionsTest fails often in CI

2024-01-22 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-19281:
--
Description: 
NativeTransportEncryptionOptionsTest seems to be quite flaky. I happen to see a 
lot of these in CI recently for trunk.

org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest

testEndpointVerificationEnabledIpNotInSAN
testOptionalMtlsModeDoNotAllowNonSSLConnections
unencryptedNativeConnectionNotlisteningOnTlsPortTest
testEndpointVerificationEnabledIpNotInSAN

This is the error basically for all of them
{noformat}
junit.framework.AssertionFailedError: Forked Java VM exited abnormally. 
Please note the time in the report does not reflect the time until the VM exit.
at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown 
Source)

{noformat}

They do pass locally just fine.

  was:
NativeTransportEncryptionOptionsTest seems to be quite flaky. I happen to see a 
lot of these in CI recently for trunk.

org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest

testEndpointVerificationEnabledIpNotInSAN
testOptionalMtlsModeDoNotAllowNonSSLConnections
unencryptedNativeConnectionNotlisteningOnTlsPortTest
testEndpointVerificationEnabledIpNotInSAN

This is the error basically for all of them
{noformat}
junit.framework.AssertionFailedError: Forked Java VM exited abnormally. 
Please note the time in the report does not reflect the time until the VM exit.
at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown 
Source)

{noformat}


> NativeTransportEncryptionOptionsTest fails often in CI
> --
>
> Key: CASSANDRA-19281
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19281
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Stefan Miklosovic
>Priority: Normal
>
> NativeTransportEncryptionOptionsTest seems to be quite flaky. I happen to see 
> a lot of these in CI recently for trunk.
> org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest
> testEndpointVerificationEnabledIpNotInSAN
> testOptionalMtlsModeDoNotAllowNonSSLConnections
> unencryptedNativeConnectionNotlisteningOnTlsPortTest
> testEndpointVerificationEnabledIpNotInSAN
> This is the error basically for all of them
> {noformat}
> junit.framework.AssertionFailedError: Forked Java VM exited 
> abnormally. Please note the time in the report does not reflect the time 
> until the VM exit.
> at 
> jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> {noformat}
> They do pass locally just fine.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-18714) Expand CQLSSTableWriter to write SSTable-attached secondary indexes

2024-01-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809300#comment-17809300
 ] 

Stefan Miklosovic edited comment on CASSANDRA-18714 at 1/22/24 8:52 AM:


*CASSANDRA-18714-trunk*
{noformat}
java17_pre-commit_tests
  j17_dtests
transient_replication_ring_test.TestTransientReplicationRing 
test_move_forwards_and_cleanup
  j17_jvm_dtests
org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest 
testEndpointVerificationEnabledIpNotInSAN
  j17_jvm_dtests_vnode
org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest 
testOptionalMtlsModeDoNotAllowNonSSLConnections
  j17_utests_oa
org.apache.cassandra.audit.AuditLoggerAuthTest 
testUNAUTHORIZED_ATTEMPTAuditing
org.apache.cassandra.audit.AuditLoggerAuthTest testCqlLoginAuditing
java11_pre-commit_tests
  j11_simulator_dtests
org.apache.cassandra.simulator.test.HarrySimulatorTest test
  j17_jvm_dtests
org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest 
unencryptedNativeConnectionNotlisteningOnTlsPortTest
  j17_jvm_dtests_vnode
org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest 
testEndpointVerificationEnabledIpNotInSAN
{noformat}

[java17_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/3791/workflows/4d8c81d6-3ef6-479f-bc50-b5515905f617]
[java11_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/3791/workflows/415b5e27-012d-4925-bb7d-cf4ccc6375e4]

I do not see anything out of ordinary, AuditLoggerAuthTest passes locally, 
NativeTransportEncryptionOptionsTest is just flaky. 

I created CASSANDRA-19281 though.


was (Author: smiklosovic):
*CASSANDRA-18714-trunk*
{noformat}
java17_pre-commit_tests
  j17_dtests
transient_replication_ring_test.TestTransientReplicationRing 
test_move_forwards_and_cleanup
  j17_jvm_dtests
org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest 
testEndpointVerificationEnabledIpNotInSAN
  j17_jvm_dtests_vnode
org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest 
testOptionalMtlsModeDoNotAllowNonSSLConnections
  j17_utests_oa
org.apache.cassandra.audit.AuditLoggerAuthTest 
testUNAUTHORIZED_ATTEMPTAuditing
org.apache.cassandra.audit.AuditLoggerAuthTest testCqlLoginAuditing
java11_pre-commit_tests
  j11_simulator_dtests
org.apache.cassandra.simulator.test.HarrySimulatorTest test
  j17_jvm_dtests
org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest 
unencryptedNativeConnectionNotlisteningOnTlsPortTest
  j17_jvm_dtests_vnode
org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest 
testEndpointVerificationEnabledIpNotInSAN
{noformat}

[java17_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/3791/workflows/4d8c81d6-3ef6-479f-bc50-b5515905f617]
[java11_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/3791/workflows/415b5e27-012d-4925-bb7d-cf4ccc6375e4]

I do not see anything out of ordinary, AuditLoggerAuthTest passes locally, 
NativeTransportEncryptionOptionsTest is just flaky. 

> Expand CQLSSTableWriter to write SSTable-attached secondary indexes
> ---
>
> Key: CASSANDRA-18714
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18714
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/SAI, Tool/bulk load
>Reporter: Caleb Rackliffe
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
> Attachments: client-mode-cqlsstablewriter-tests.patch
>
>  Time Spent: 13.5h
>  Remaining Estimate: 0h
>
> {{CQLSSTableWriter}} currently has no way of writing any secondary indexes 
> inline as it writes the core SSTable components. With SAI, this has become 
> tractable problem, and we should be able to enhance both it and 
> {{SSTableImporter}} to handle cases where we might want to write SSTables 
> somewhere in bulk (and in parallel) and then import them without waiting for 
> index building on import. It would require the following changes:
> 1.) {{CQLSSTableWriter}} must accept 2i definitions on top of its current 
> table schema definition. Once added to the schema, any {{ColumnFamilyStore}} 
> instances opened will have those 2i defined in their index managers.
> 2.) All {{AbstractSSTableSimpleWriter}} instances must register index groups, 
> allowing the proper {{SSTableFlushObservers}} to be attached to 
> {{SSTableWriter}}. Once this is done, SAI (and any other SSTable-attached 
> indexes) components will be built incrementally along w/ the SSTable data 
> file, and will be finalized when the newly written SSTable is 

[jira] [Created] (CASSANDRA-19281) NativeTransportEncryptionOptionsTest fails often in CI

2024-01-22 Thread Stefan Miklosovic (Jira)
Stefan Miklosovic created CASSANDRA-19281:
-

 Summary: NativeTransportEncryptionOptionsTest fails often in CI
 Key: CASSANDRA-19281
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19281
 Project: Cassandra
  Issue Type: Improvement
Reporter: Stefan Miklosovic


NativeTransportEncryptionOptionsTest seems to be quite flaky. I happen to see a 
lot of these in CI recently for trunk.

org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest

testEndpointVerificationEnabledIpNotInSAN
testOptionalMtlsModeDoNotAllowNonSSLConnections
unencryptedNativeConnectionNotlisteningOnTlsPortTest
testEndpointVerificationEnabledIpNotInSAN

This is the error basically for all of them
{noformat}
junit.framework.AssertionFailedError: Forked Java VM exited abnormally. 
Please note the time in the report does not reflect the time until the VM exit.
at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown 
Source)

{noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-18714) Expand CQLSSTableWriter to write SSTable-attached secondary indexes

2024-01-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809300#comment-17809300
 ] 

Stefan Miklosovic edited comment on CASSANDRA-18714 at 1/22/24 8:47 AM:


*CASSANDRA-18714-trunk*
{noformat}
java17_pre-commit_tests
  j17_dtests
transient_replication_ring_test.TestTransientReplicationRing 
test_move_forwards_and_cleanup
  j17_jvm_dtests
org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest 
testEndpointVerificationEnabledIpNotInSAN
  j17_jvm_dtests_vnode
org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest 
testOptionalMtlsModeDoNotAllowNonSSLConnections
  j17_utests_oa
org.apache.cassandra.audit.AuditLoggerAuthTest 
testUNAUTHORIZED_ATTEMPTAuditing
org.apache.cassandra.audit.AuditLoggerAuthTest testCqlLoginAuditing
java11_pre-commit_tests
  j11_simulator_dtests
org.apache.cassandra.simulator.test.HarrySimulatorTest test
  j17_jvm_dtests
org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest 
unencryptedNativeConnectionNotlisteningOnTlsPortTest
  j17_jvm_dtests_vnode
org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest 
testEndpointVerificationEnabledIpNotInSAN
{noformat}

[java17_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/3791/workflows/4d8c81d6-3ef6-479f-bc50-b5515905f617]
[java11_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/3791/workflows/415b5e27-012d-4925-bb7d-cf4ccc6375e4]

I do not see anything out of ordinary, AuditLoggerAuthTest passes locally, 
NativeTransportEncryptionOptionsTest is just flaky. 


was (Author: smiklosovic):
*CASSANDRA-18714-trunk*
{noformat}
java17_pre-commit_tests
  j17_dtests
transient_replication_ring_test.TestTransientReplicationRing 
test_move_forwards_and_cleanup
  j17_jvm_dtests
org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest 
testEndpointVerificationEnabledIpNotInSAN
  j17_jvm_dtests_vnode
org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest 
testOptionalMtlsModeDoNotAllowNonSSLConnections
  j17_utests_oa
org.apache.cassandra.audit.AuditLoggerAuthTest 
testUNAUTHORIZED_ATTEMPTAuditing
org.apache.cassandra.audit.AuditLoggerAuthTest testCqlLoginAuditing
java11_pre-commit_tests
  j11_simulator_dtests
org.apache.cassandra.simulator.test.HarrySimulatorTest test
  j17_jvm_dtests
org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest 
unencryptedNativeConnectionNotlisteningOnTlsPortTest
  j17_jvm_dtests_vnode
org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest 
testEndpointVerificationEnabledIpNotInSAN
{noformat}

[java17_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/3791/workflows/4d8c81d6-3ef6-479f-bc50-b5515905f617]
[java17_separate_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/3791/workflows/0e0da2f6-6c97-410a-ae87-3c30c31dbd29]
[java11_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/3791/workflows/415b5e27-012d-4925-bb7d-cf4ccc6375e4]
[java11_separate_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/3791/workflows/e195b29a-e4de-4bfa-ad94-2ee2006001a5]


> Expand CQLSSTableWriter to write SSTable-attached secondary indexes
> ---
>
> Key: CASSANDRA-18714
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18714
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/SAI, Tool/bulk load
>Reporter: Caleb Rackliffe
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
> Attachments: client-mode-cqlsstablewriter-tests.patch
>
>  Time Spent: 13.5h
>  Remaining Estimate: 0h
>
> {{CQLSSTableWriter}} currently has no way of writing any secondary indexes 
> inline as it writes the core SSTable components. With SAI, this has become 
> tractable problem, and we should be able to enhance both it and 
> {{SSTableImporter}} to handle cases where we might want to write SSTables 
> somewhere in bulk (and in parallel) and then import them without waiting for 
> index building on import. It would require the following changes:
> 1.) {{CQLSSTableWriter}} must accept 2i definitions on top of its current 
> table schema definition. Once added to the schema, any {{ColumnFamilyStore}} 
> instances opened will have those 2i defined in their index managers.
> 2.) All {{AbstractSSTableSimpleWriter}} instances must register index groups, 
> allowing the proper {{SSTableFlushObservers}} to be attached to 
> {{SSTableWriter}}. Once this is done, SAI (and any other SSTable-attached 
> indexes) components will be 

[jira] [Updated] (CASSANDRA-18714) Expand CQLSSTableWriter to write SSTable-attached secondary indexes

2024-01-22 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-18714:
--
Status: Ready to Commit  (was: Review In Progress)

> Expand CQLSSTableWriter to write SSTable-attached secondary indexes
> ---
>
> Key: CASSANDRA-18714
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18714
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/SAI, Tool/bulk load
>Reporter: Caleb Rackliffe
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
> Attachments: client-mode-cqlsstablewriter-tests.patch
>
>  Time Spent: 13.5h
>  Remaining Estimate: 0h
>
> {{CQLSSTableWriter}} currently has no way of writing any secondary indexes 
> inline as it writes the core SSTable components. With SAI, this has become 
> tractable problem, and we should be able to enhance both it and 
> {{SSTableImporter}} to handle cases where we might want to write SSTables 
> somewhere in bulk (and in parallel) and then import them without waiting for 
> index building on import. It would require the following changes:
> 1.) {{CQLSSTableWriter}} must accept 2i definitions on top of its current 
> table schema definition. Once added to the schema, any {{ColumnFamilyStore}} 
> instances opened will have those 2i defined in their index managers.
> 2.) All {{AbstractSSTableSimpleWriter}} instances must register index groups, 
> allowing the proper {{SSTableFlushObservers}} to be attached to 
> {{SSTableWriter}}. Once this is done, SAI (and any other SSTable-attached 
> indexes) components will be built incrementally along w/ the SSTable data 
> file, and will be finalized when the newly written SSTable is finalized.
> 3.) Provide an example (in a unit test?) of how a third-party tool might, 
> assuming access to the right C* JAR, validate/checksum SAI components outside 
> C* proper.
> 4.) {{SSTableImporter}} should have two new options:
> a.) an option that fails import if any SSTable-attached 2i must be built 
> (i.e. has not already been built and brought along w/ the other new SSTable 
> components)
> b.) an option that allows us to bypass full checksum validation on 
> imported/already-built SSTable-attached indexes (assuming they have just been 
> written by {{CQLSSTableWriter}})



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18714) Expand CQLSSTableWriter to write SSTable-attached secondary indexes

2024-01-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809300#comment-17809300
 ] 

Stefan Miklosovic commented on CASSANDRA-18714:
---

*CASSANDRA-18714-trunk*
{noformat}
java17_pre-commit_tests
  j17_dtests
transient_replication_ring_test.TestTransientReplicationRing 
test_move_forwards_and_cleanup
  j17_jvm_dtests
org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest 
testEndpointVerificationEnabledIpNotInSAN
  j17_jvm_dtests_vnode
org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest 
testOptionalMtlsModeDoNotAllowNonSSLConnections
  j17_utests_oa
org.apache.cassandra.audit.AuditLoggerAuthTest 
testUNAUTHORIZED_ATTEMPTAuditing
org.apache.cassandra.audit.AuditLoggerAuthTest testCqlLoginAuditing
java11_pre-commit_tests
  j11_simulator_dtests
org.apache.cassandra.simulator.test.HarrySimulatorTest test
  j17_jvm_dtests
org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest 
unencryptedNativeConnectionNotlisteningOnTlsPortTest
  j17_jvm_dtests_vnode
org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest 
testEndpointVerificationEnabledIpNotInSAN
{noformat}

[java17_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/3791/workflows/4d8c81d6-3ef6-479f-bc50-b5515905f617]
[java17_separate_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/3791/workflows/0e0da2f6-6c97-410a-ae87-3c30c31dbd29]
[java11_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/3791/workflows/415b5e27-012d-4925-bb7d-cf4ccc6375e4]
[java11_separate_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/3791/workflows/e195b29a-e4de-4bfa-ad94-2ee2006001a5]


> Expand CQLSSTableWriter to write SSTable-attached secondary indexes
> ---
>
> Key: CASSANDRA-18714
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18714
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/SAI, Tool/bulk load
>Reporter: Caleb Rackliffe
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
> Attachments: client-mode-cqlsstablewriter-tests.patch
>
>  Time Spent: 13.5h
>  Remaining Estimate: 0h
>
> {{CQLSSTableWriter}} currently has no way of writing any secondary indexes 
> inline as it writes the core SSTable components. With SAI, this has become 
> tractable problem, and we should be able to enhance both it and 
> {{SSTableImporter}} to handle cases where we might want to write SSTables 
> somewhere in bulk (and in parallel) and then import them without waiting for 
> index building on import. It would require the following changes:
> 1.) {{CQLSSTableWriter}} must accept 2i definitions on top of its current 
> table schema definition. Once added to the schema, any {{ColumnFamilyStore}} 
> instances opened will have those 2i defined in their index managers.
> 2.) All {{AbstractSSTableSimpleWriter}} instances must register index groups, 
> allowing the proper {{SSTableFlushObservers}} to be attached to 
> {{SSTableWriter}}. Once this is done, SAI (and any other SSTable-attached 
> indexes) components will be built incrementally along w/ the SSTable data 
> file, and will be finalized when the newly written SSTable is finalized.
> 3.) Provide an example (in a unit test?) of how a third-party tool might, 
> assuming access to the right C* JAR, validate/checksum SAI components outside 
> C* proper.
> 4.) {{SSTableImporter}} should have two new options:
> a.) an option that fails import if any SSTable-attached 2i must be built 
> (i.e. has not already been built and brought along w/ the other new SSTable 
> components)
> b.) an option that allows us to bypass full checksum validation on 
> imported/already-built SSTable-attached indexes (assuming they have just been 
> written by {{CQLSSTableWriter}})



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org