[jira] [Commented] (CASSANDRA-17047) Dropping a column can break queries until the schema is fully propagated (TAKE 2)

2023-05-23 Thread Jacek Lewandowski (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725644#comment-17725644
 ] 

Jacek Lewandowski commented on CASSANDRA-17047:
---

Related/duplicates:
- https://issues.apache.org/jira/browse/CASSANDRA-14673
- https://issues.apache.org/jira/browse/CASSANDRA-18532



> Dropping a column can break queries until the schema is fully propagated 
> (TAKE 2)
> -
>
> Key: CASSANDRA-17047
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17047
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Schema
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> With a table like:
> {code}
> CREATE TABLE ks.tbl (id int primary key, v1 int, v2 int, v3 int)
> {code}
> and we drop {{v2}}, we get this exception on the replicas that haven't seen 
> the schema change:
> {code}
> ERROR [SharedPool-Worker-1] node2 2020-06-24 09:49:08,107 
> AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[SharedPool-Worker-1,5,node2]
> java.lang.IllegalStateException: [ColumnDefinition{name=v1, 
> type=org.apache.cassandra.db.marshal.Int32Type, kind=REGULAR, position=-1}, 
> ColumnDefinition{name=v2, type=org.apache.cassandra.db.marshal.Int32Type, 
> kind=REGULAR, position=-1}, ColumnDefinition{name=v3, 
> type=org.apache.cassandra.db.marshal.Int32Type, kind=REGULAR, position=-1}] 
> is not a subset of [v1 v3]
>   at 
> org.apache.cassandra.db.Columns$Serializer.encodeBitmap(Columns.java:546) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.Columns$Serializer.serializeSubset(Columns.java:478) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:184)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:114)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:102)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:132)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:87)
>  ~[main/:na]
> ...
> {code}
> Note that it doesn't matter if we {{SELECT *}} or {{SELECT id, v1}}.
> CASSANDRA-15899 tried to fix the problem both when columns are dropped and 
> when columns are added. Unfortunately, the fix introduced an issue and had to 
> be reverted in CASSANDRA-16735. 
> While the scenario for ADDED columns is tricky, the original scenario for 
> DROPPED columns can be solved in a safe way at the {{ColumnFilter}} level. 
> Consequently, I think that we should at least solve that scenario (a sketch 
> of the idea follows below).
> [~bdeggleston], [~samt], [~ifesdjeen] does my proposal make sense to you?
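For illustration only, a minimal sketch of the {{ColumnFilter}}-level idea 
(hypothetical names and plain string column sets; the actual Cassandra types 
and the eventual patch differ): before encoding, restrict the row's columns to 
the set the serialization header actually knows about, so a column dropped 
elsewhere in the cluster can never surface as an "extra" column and trip the 
subset check above.
{code:java}
import java.util.Set;
import java.util.TreeSet;

final class ColumnFilterSketch
{
    // Keep only the columns the header can encode; anything else (e.g. a
    // concurrently dropped column) is excluded instead of failing.
    static Set<String> restrict(Set<String> rowColumns, Set<String> headerColumns)
    {
        Set<String> encodable = new TreeSet<>(rowColumns);
        encodable.retainAll(headerColumns);
        return encodable;
    }

    public static void main(String[] args)
    {
        Set<String> row = Set.of("v1", "v2", "v3"); // replica still has v2
        Set<String> header = Set.of("v1", "v3");    // v2 was dropped elsewhere
        // Encoding row directly against header fails the subset check;
        // restricting first keeps serialization safe. Prints: [v1, v3]
        System.out.println(restrict(row, header));
    }
}
{code}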






[jira] [Commented] (CASSANDRA-17047) Dropping a column can break queries until the schema is fully propagated (TAKE 2)

2023-05-23 Thread Jacek Lewandowski (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725643#comment-17725643
 ] 

Jacek Lewandowski commented on CASSANDRA-17047:
---

[~blerer] can we complete this?

> Dropping a column can break queries until the schema is fully propagated 
> (TAKE 2)
> -
>
> Key: CASSANDRA-17047
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17047
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Schema
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> With a table like:
> {code}
> CREATE TABLE ks.tbl (id int primary key, v1 int, v2 int, v3 int)
> {code}
> and we drop {{v2}}, we get this exception on the replicas that haven't seen 
> the schema change:
> {code}
> ERROR [SharedPool-Worker-1] node2 2020-06-24 09:49:08,107 
> AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[SharedPool-Worker-1,5,node2]
> java.lang.IllegalStateException: [ColumnDefinition{name=v1, 
> type=org.apache.cassandra.db.marshal.Int32Type, kind=REGULAR, position=-1}, 
> ColumnDefinition{name=v2, type=org.apache.cassandra.db.marshal.Int32Type, 
> kind=REGULAR, position=-1}, ColumnDefinition{name=v3, 
> type=org.apache.cassandra.db.marshal.Int32Type, kind=REGULAR, position=-1}] 
> is not a subset of [v1 v3]
>   at 
> org.apache.cassandra.db.Columns$Serializer.encodeBitmap(Columns.java:546) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.Columns$Serializer.serializeSubset(Columns.java:478) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:184)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:114)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:102)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:132)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:87)
>  ~[main/:na]
> ...
> {code}
> Note that it doesn't matter if we {{SELECT *}} or {{SELECT id, v1}}.
> CASSANDRA-15899 tried to fix the problem both when columns are dropped and 
> when columns are added. Unfortunately, the fix introduced an issue and had to 
> be reverted in CASSANDRA-16735. 
> While the scenario for ADDED columns is tricky, the original scenario for 
> DROPPED columns can be solved in a safe way at the {{ColumnFilter}} level. 
> Consequently, I think that we should at least solve that scenario.
> [~bdeggleston], [~samt], [~ifesdjeen] does my proposal make sense to you?






[jira] [Assigned] (CASSANDRA-18547) Refactor cqlsh On/Off switch implementation

2023-05-23 Thread Brad Schoening (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brad Schoening reassigned CASSANDRA-18547:
--

Assignee: Brad Schoening

> Refactor cqlsh On/Off switch implementation
> ---
>
> Key: CASSANDRA-18547
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18547
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Brad Schoening
>Assignee: Brad Schoening
>Priority: Normal
>
> This change proposes to refactor the On/Off switch implemented in the class 
> SwitchCommand and its subclass SwitchCommandWithValue in cqlshmain.py to use 
> static methods instead of new classes.
> The existing code is hard to read, including at the call site, which 
> instantiates a SwitchCommand object in order to invoke its execute method:
>  
> {code:java}
> self.tracing_enabled = SwitchCommand("TRACING", 
> "Tracing").execute(self.tracing_enabled, parsed, self.printerr){code}
> This can be replaced by a more familiar direct function call:
>  
>  
> {code:java}
> self.tracing_enabled = self.on_off_toggle("TRACING", "Tracing", 
> self.tracing_enabled, parsed.get_binding('switch')){code}
>  
>  
> The refactoring would rework the command output for consistency. Instead of 
> the current:
>  
> {code:java}
> > tracing on
> Now Tracing is enabled
> > paging on
> Query paging is already enabled. Use PAGING OFF to disable.
> > expand on
> Now Expanded output is enabled
> {code}
>  
> replace it with output that is more succinct and consistent, using 'ON/OFF' 
> instead of enabled/disabled and removing the redundant 'Now':
>  
> {code:java}
> > tracing on
> TRACING set to ON
> > paging on
> PAGING is already ON
> > expand on
> EXPAND set to ON
> {code}
>  
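For illustration only, a hypothetical sketch of what such a static helper 
could look like (the name, signature, and parsing of the switch token are 
assumptions following the call shown above, not the actual patch; the message 
strings follow the outputs proposed above):
{code:python}
def on_off_toggle(name, description, current, switch_value):
    """Toggle an on/off cqlsh setting and report it in the proposed style.

    name         -- uppercase command name, e.g. "TRACING"
    description  -- human-readable name, kept from the existing call sites
                    (the proposed messages only use the uppercase name)
    current      -- the present boolean state of the setting
    switch_value -- the parsed 'ON'/'OFF' token, or None to just report state
    Returns the new boolean state.
    """
    if switch_value is None:
        print("%s is %s" % (name, "ON" if current else "OFF"))
        return current
    requested = switch_value.upper() == "ON"
    if requested == current:
        print("%s is already %s" % (name, "ON" if current else "OFF"))
        return current
    print("%s set to %s" % (name, "ON" if requested else "OFF"))
    return requested
{code}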






[jira] [Updated] (CASSANDRA-18547) Refactor cqlsh On/Off switch implementation

2023-05-23 Thread Brad Schoening (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brad Schoening updated CASSANDRA-18547:
---
Description: 
This change proposes to refactor the On/Off switch implemented in the class 
SwitchCommand and its subclass SwitchCommandWithValue in cqlshmain.py to use 
static methods instead of new classes.

The existing code is hard to read, including at the call site, which 
instantiates a SwitchCommand object in order to invoke its execute method:

 
{code:java}
self.tracing_enabled = SwitchCommand("TRACING", 
"Tracing").execute(self.tracing_enabled, parsed, self.printerr){code}
This can be replaced by a more familiar direct function call:
{code:java}
self.tracing_enabled = self.on_off_toggle("TRACING", "Tracing", 
self.tracing_enabled, parsed.get_binding('switch')){code}
 

The refactoring would rework the command output for consistency. Instead of the 
current:
{code:java}
> tracing on
Now Tracing is enabled
> paging on
Query paging is already enabled. Use PAGING OFF to disable.
> expand on
Now Expanded output is enabled
{code}
replace it with output that is more succinct and consistent, using 'ON/OFF' 
instead of enabled/disabled and removing the redundant 'Now':
{code:java}
> tracing on
TRACING set to ON
> paging on
PAGING is already ON
> expand on
EXPAND set to ON
{code}
 

  was:
This change proposes to refactor the On/Off switch implemented in the class 
SwitchCommand and its subclass SwitchCommandWithValue in cqlshmain.py to use 
static methods instead of new classes.

The existing code is hard to read, including at the call site, which 
instantiates a SwitchCommand object in order to invoke its execute method:

 
{code:java}
self.tracing_enabled = SwitchCommand("TRACING", 
"Tracing").execute(self.tracing_enabled, parsed, self.printerr){code}
This can be replaced by a more familiar direct function call:

 

 
{code:java}
self.tracing_enabled = self.on_off_toggle("TRACING", "Tracing", 
self.tracing_enabled, parsed.get_binding('switch')){code}
 

 


The refactoring would rework the command output for consistency. Instead of the 
current:

 
{code:java}
> tracing on
Now Tracing is enabled
> paging on
Query paging is already enabled. Use PAGING OFF to disable.
> expand on
Now Expanded output is enabled
{code}
 

replace it with output that is more succinct and consistent, using 'ON/OFF' 
instead of enabled/disabled and removing the redundant 'Now':

 
{code:java}
> tracing on
TRACING set to ON
> paging on
PAGING is already ON
> expand on
EXPAND set to ON
{code}
 


> Refactor cqlsh On/Off switch implementation
> ---
>
> Key: CASSANDRA-18547
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18547
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Brad Schoening
>Assignee: Brad Schoening
>Priority: Normal
>
> This change proposes to refactor the On/Off switch implemented in the class 
> SwitchCommand and its subclass SwitchCommandWithValue in cqlshmain.py to use 
> static methods instead of new classes.
> The existing code is hard to read, including at the call site, which 
> instantiates a SwitchCommand object in order to invoke its execute method:
>  
> {code:java}
> self.tracing_enabled = SwitchCommand("TRACING", 
> "Tracing").execute(self.tracing_enabled, parsed, self.printerr){code}
> This can be replaced by a more familiar direct function call:
> {code:java}
> self.tracing_enabled = self.on_off_toggle("TRACING", "Tracing", 
> self.tracing_enabled, parsed.get_binding('switch')){code}
>  
> The refactoring would rework the command output for consistency. Instead of 
> the current:
> {code:java}
> > tracing on
> Now Tracing is enabled
> > paging on
> Query paging is already enabled. Use PAGING OFF to disable.
> > expand on
> Now Expanded output is enabled
> {code}
> replace it with output that is more succinct and consistent, using 'ON/OFF' 
> instead of enabled/disabled and removing the redundant 'Now':
> {code:java}
> > tracing on
> TRACING set to ON
> > paging on
> PAGING is already ON
> > expand on
> EXPAND set to ON
> {code}
>  






[jira] [Created] (CASSANDRA-18547) Refactor cqlsh On/Off switch implementation

2023-05-23 Thread Brad Schoening (Jira)
Brad Schoening created CASSANDRA-18547:
--

 Summary: Refactor cqlsh On/Off switch implementation
 Key: CASSANDRA-18547
 URL: https://issues.apache.org/jira/browse/CASSANDRA-18547
 Project: Cassandra
  Issue Type: Improvement
Reporter: Brad Schoening


This change proposes to refactor the On/Off switch implemented in the class 
SwitchCommand and its subclass SwitchCommandWithValue in cqlshmain.py to use 
static methods instead of new classes.

The existing code is hard to read, including at the call site, which 
instantiates a SwitchCommand object in order to invoke its execute method:

 
{code:java}
self.tracing_enabled = SwitchCommand("TRACING", 
"Tracing").execute(self.tracing_enabled, parsed, self.printerr){code}
This can be replaced by a more familiar direct function call:

 

 
{code:java}
self.tracing_enabled = self.on_off_toggle("TRACING", "Tracing", 
self.tracing_enabled, parsed.get_binding('switch')){code}
 

 


The refactoring would rework the command output for consistency. Instead of the 
current:

 
{code:java}
> tracing on
Now Tracing is enabled
> paging on
Query paging is already enabled. Use PAGING OFF to disable.
> expand on
Now Expanded output is enabled
{code}
 

replace it with output that is more succinct and consistent, using 'ON/OFF' 
instead of enabled/disabled and removing the redundant 'Now':

 
{code:java}
> tracing on
TRACING set to ON
> paging on
PAGING is already ON
> expand on
EXPAND set to ON
{code}
 






[jira] [Updated] (CASSANDRA-18546) Remove Gossiper#makeRandomGossipDigest

2023-05-23 Thread Cameron Zemek (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cameron Zemek updated CASSANDRA-18546:
--
Description: 
In going through the Gossiper code trying to understand it, I came across:
{code:java}
    /**
     * The gossip digest is built based on randomization
     * rather than just looping through the collection of live endpoints.
     *
     * @param gDigests list of Gossip Digests.
     */
    private void makeRandomGossipDigest(List<GossipDigest> gDigests) {code}
But I couldn't see what purpose the randomization had. In fact, in 3.11 the 
receiving end will call:
{code:java}
 doSort(gDigestList); {code}
negating the purpose of the randomization.

 

In a discussion with [~stefan.miklosovic], he found ticket CASSANDRA-14174.

So it seems to me this randomization may have been intended to allow for 
limiting the size of SYN messages. But that feature doesn't exist, and as such 
the randomization is:
 * creating more garbage
 * using more CPU (sure, it's mostly trivial; see the next point)
 * spending more time on unnecessary functionality on the *single-threaded* 
gossip stage
 * complicating the code and making it more difficult to understand

In fact, there is a bug in the implementation:
{code:java}
int generation = 0;
int maxVersion = 0;
// local epstate will be part of endpointStateMap
List<InetAddress> endpoints = new ArrayList<>(endpointStateMap.keySet());
Collections.shuffle(endpoints, random);
for (InetAddress endpoint : endpoints)
{
    epState = endpointStateMap.get(endpoint);
    if (epState != null)
    {
        generation = epState.getHeartBeatState().getGeneration();
        maxVersion = getMaxEndpointStateVersion(epState);
    }
    gDigests.add(new GossipDigest(endpoint, generation, maxVersion));
} {code}
If epState is null and we already had a non-null epState, then the next digest 
will use the generation and maxVersion of the previously iterated epState.
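One way to fix that leak (a minimal sketch, assuming the surrounding 
makeRandomGossipDigest context; the linked change may differ in detail) is to 
declare generation and maxVersion inside the loop, so a null epState yields 
zeros instead of the previous endpoint's values:
{code:java}
for (InetAddress endpoint : endpoints)
{
    // Reset per endpoint: a null epState can no longer inherit the
    // previous iteration's generation/maxVersion.
    int generation = 0;
    int maxVersion = 0;
    EndpointState epState = endpointStateMap.get(endpoint);
    if (epState != null)
    {
        generation = epState.getHeartBeatState().getGeneration();
        maxVersion = getMaxEndpointStateVersion(epState);
    }
    gDigests.add(new GossipDigest(endpoint, generation, maxVersion));
}
{code}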

 

Here is a change that removes this randomization and fixes the above bug: 
[https://github.com/apache/cassandra/pull/2357/commits/1ba422ab5de35f7057c7621ec3607dcbca19768c]

  was:
In going through the Gossiper code trying to understand it, I came across:
{code:java}
    /**
     * The gossip digest is built based on randomization
     * rather than just looping through the collection of live endpoints.
     *
     * @param gDigests list of Gossip Digests.
     */
    private void makeRandomGossipDigest(List<GossipDigest> gDigests) {code}
But I couldn't see what purpose the randomization had. In fact, in 3.11 the 
receiving end will call:
{code:java}
 doSort(gDigestList); {code}
negating the purpose of the randomization.

 

In a discussion with [~stefan.miklosovic], he found ticket CASSANDRA-14174.

So it seems to me this randomization may have been intended to allow for 
limiting the size of SYN messages. But that feature doesn't exist, and as such 
the randomization is:
 * creating more garbage
 * using more CPU (sure, it's mostly trivial; see the next point)
 * spending more time on unnecessary functionality on the *single-threaded* 
gossip stage
 * complicating the code and making it more difficult to understand

In fact, there is a bug in the implementation:
{code:java}
int generation = 0;
int maxVersion = 0;
// local epstate will be part of endpointStateMap
List<InetAddress> endpoints = new ArrayList<>(endpointStateMap.keySet());
Collections.shuffle(endpoints, random);
for (InetAddress endpoint : endpoints)
{
    epState = endpointStateMap.get(endpoint);
    if (epState != null)
    {
        generation = epState.getHeartBeatState().getGeneration();
        maxVersion = getMaxEndpointStateVersion(epState);
    }
    gDigests.add(new GossipDigest(endpoint, generation, maxVersion));
} {code}
If epState is null and we already had a non-null epState, then the next digest 
will use the generation and maxVersion of the previously iterated epState.

 

Here is a change that removes this randomization and fixes the above bug: 
[https://github.com/apache/cassandra/pull/2357/files#diff-99267a2170b04fd7dd24d6c6bf2ba1fc26d6dc896cd74f8c5bd56c476e2540e4]


> Remove Gossiper#makeRandomGossipDigest
> --
>
> Key: CASSANDRA-18546
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18546
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Cameron Zemek
>Priority: Normal
>
> In going through the Gossiper code trying to understand it, I came across:
> {code:java}
>     /**
>      * The gossip digest is built based on randomization
>      * rather than just looping through the collection of live endpoints.
>      *
>      * @param gDigests list of Gossip Digests.
>      */
>     private void makeRandomGossipDigest(List<GossipDigest> gDigests) {code}

[jira] [Created] (CASSANDRA-18546) Remove Gossiper#makeRandomGossipDigest

2023-05-23 Thread Cameron Zemek (Jira)
Cameron Zemek created CASSANDRA-18546:
-

 Summary: Remove Gossiper#makeRandomGossipDigest
 Key: CASSANDRA-18546
 URL: https://issues.apache.org/jira/browse/CASSANDRA-18546
 Project: Cassandra
  Issue Type: Improvement
Reporter: Cameron Zemek


 

In going through the Gossiper code trying to understand it, I came across:
{code:java}
    /**
     * The gossip digest is built based on randomization
     * rather than just looping through the collection of live endpoints.
     *
     * @param gDigests list of Gossip Digests.
     */
    private void makeRandomGossipDigest(List<GossipDigest> gDigests) {code}
But I couldn't see what purpose the randomization had. In fact, in 3.11 the 
receiving end will call:
{code:java}
 doSort(gDigestList); {code}
negating the purpose of the randomization.

 

In a discussion with [~stefan.miklosovic], he found ticket CASSANDRA-14174.

So it seems to me this randomization may have been intended to allow for 
limiting the size of SYN messages. But that feature doesn't exist, and as such 
the randomization is:
 * creating more garbage
 * using more CPU (sure, it's mostly trivial; see the next point)
 * spending more time on unnecessary functionality on the *single-threaded* 
gossip stage
 * complicating the code and making it more difficult to understand

In fact, there is a bug in the implementation:
{code:java}
int generation = 0;
int maxVersion = 0;
// local epstate will be part of endpointStateMap
List<InetAddress> endpoints = new ArrayList<>(endpointStateMap.keySet());
Collections.shuffle(endpoints, random);
for (InetAddress endpoint : endpoints)
{
    epState = endpointStateMap.get(endpoint);
    if (epState != null)
    {
        generation = epState.getHeartBeatState().getGeneration();
        maxVersion = getMaxEndpointStateVersion(epState);
    }
    gDigests.add(new GossipDigest(endpoint, generation, maxVersion));
} {code}
If epState is null and we already had a non-null epState, then the next digest 
will use the generation and maxVersion of the previously iterated epState.

 

Here is a change that removes this randomization and fixes the above bug: 
https://github.com/apache/cassandra/pull/2357/files#diff-99267a2170b04fd7dd24d6c6bf2ba1fc26d6dc896cd74f8c5bd56c476e2540e4






[jira] [Updated] (CASSANDRA-18546) Remove Gossiper#makeRandomGossipDigest

2023-05-23 Thread Cameron Zemek (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cameron Zemek updated CASSANDRA-18546:
--
Description: 
In going through the Gossiper code trying to understand it, I came across:
{code:java}
    /**
     * The gossip digest is built based on randomization
     * rather than just looping through the collection of live endpoints.
     *
     * @param gDigests list of Gossip Digests.
     */
    private void makeRandomGossipDigest(List<GossipDigest> gDigests) {code}
But I couldn't see what purpose the randomization had. In fact, in 3.11 the 
receiving end will call:
{code:java}
 doSort(gDigestList); {code}
negating the purpose of the randomization.

 

In a discussion with [~stefan.miklosovic], he found ticket CASSANDRA-14174.

So it seems to me this randomization may have been intended to allow for 
limiting the size of SYN messages. But that feature doesn't exist, and as such 
the randomization is:
 * creating more garbage
 * using more CPU (sure, it's mostly trivial; see the next point)
 * spending more time on unnecessary functionality on the *single-threaded* 
gossip stage
 * complicating the code and making it more difficult to understand

In fact, there is a bug in the implementation:
{code:java}
int generation = 0;
int maxVersion = 0;
// local epstate will be part of endpointStateMap
List<InetAddress> endpoints = new ArrayList<>(endpointStateMap.keySet());
Collections.shuffle(endpoints, random);
for (InetAddress endpoint : endpoints)
{
    epState = endpointStateMap.get(endpoint);
    if (epState != null)
    {
        generation = epState.getHeartBeatState().getGeneration();
        maxVersion = getMaxEndpointStateVersion(epState);
    }
    gDigests.add(new GossipDigest(endpoint, generation, maxVersion));
} {code}
If epState is null and we already had a non-null epState, then the next digest 
will use the generation and maxVersion of the previously iterated epState.

 

Here is a change that removes this randomization and fixes the above bug: 
[https://github.com/apache/cassandra/pull/2357/files#diff-99267a2170b04fd7dd24d6c6bf2ba1fc26d6dc896cd74f8c5bd56c476e2540e4]

  was:
 

In going through the Gossiper code trying to understand it, I came across:
{code:java}
    /**
     * The gossip digest is built based on randomization
     * rather than just looping through the collection of live endpoints.
     *
     * @param gDigests list of Gossip Digests.
     */
    private void makeRandomGossipDigest(List<GossipDigest> gDigests) {code}
But I couldn't see what purpose the randomization had. In fact, in 3.11 the 
receiving end will call:
{code:java}
 doSort(gDigestList); {code}
negating the purpose of the randomization.

 

In a discussion with [~stefan.miklosovic], he found ticket CASSANDRA-14174.

So it seems to me this randomization may have been intended to allow for 
limiting the size of SYN messages. But that feature doesn't exist, and as such 
the randomization is:
 * creating more garbage
 * using more CPU (sure, it's mostly trivial; see the next point)
 * spending more time on unnecessary functionality on the *single-threaded* 
gossip stage
 * complicating the code and making it more difficult to understand

In fact, there is a bug in the implementation:
{code:java}
int generation = 0;
int maxVersion = 0;
// local epstate will be part of endpointStateMap
List<InetAddress> endpoints = new ArrayList<>(endpointStateMap.keySet());
Collections.shuffle(endpoints, random);
for (InetAddress endpoint : endpoints)
{
    epState = endpointStateMap.get(endpoint);
    if (epState != null)
    {
        generation = epState.getHeartBeatState().getGeneration();
        maxVersion = getMaxEndpointStateVersion(epState);
    }
    gDigests.add(new GossipDigest(endpoint, generation, maxVersion));
} {code}
If epState is null and we already had a non-null epState, then the next digest 
will use the generation and maxVersion of the previously iterated epState.

 

Here is a change that removes this randomization and fixes the above bug: 
https://github.com/apache/cassandra/pull/2357/files#diff-99267a2170b04fd7dd24d6c6bf2ba1fc26d6dc896cd74f8c5bd56c476e2540e4


> Remove Gossiper#makeRandomGossipDigest
> --
>
> Key: CASSANDRA-18546
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18546
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Cameron Zemek
>Priority: Normal
>
> In going through the Gossiper code trying to understand it, I came across:
> {code:java}
>     /**
>      * The gossip digest is built based on randomization
>      * rather than just looping through the collection of live endpoints.
>      *
>      * @param gDigests list of Gossip Digests.
>      */
>     private void makeRandomGossipDigest(List<GossipDigest> gDigests) {code}

[jira] [Updated] (CASSANDRA-17262) sstableloader ignores the native port option

2023-05-23 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-17262:
-
Resolution: Duplicate
Status: Resolved  (was: Open)

> sstableloader ignores the native port option
> 
>
> Key: CASSANDRA-17262
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17262
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Tools
>Reporter: Oleg Zhovtanyuk
>Priority: Normal
>
> sstableloader silently ignores the native port option and connects to the 
> default port.
> E.g.
> {code:java}
> sstableloader -v -d 192.168.1.24 -p 32181 -sp 32182 -u USER -pw PASSWORD 
> backups/test-workspace/test-table-9150b730742911ec9d011fb0ef4f5206 {code}
> fails with
> {code:java}
> All host(s) tried for query failed (tried: /192.168.1.24:9042 
> (com.datastax.driver.core.exceptions.TransportException: [/192.168.1.24:9042] 
> Cannot connect))
> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) 
> tried for query failed (tried: /192.168.1.24:9042 
> (com.datastax.driver.core.exceptions.TransportException: [/192.168.1.24:9042] 
> Cannot connect))
> at 
> com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:270)
> at 
> com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:109)
> at 
> com.datastax.driver.core.Cluster$Manager.negotiateProtocolVersionAndConnect(Cluster.java:1813)
> at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1726)
> at com.datastax.driver.core.Cluster.init(Cluster.java:214)
> at com.datastax.driver.core.Cluster.connectAsync(Cluster.java:387)
> at com.datastax.driver.core.Cluster.connectAsync(Cluster.java:366)
> at com.datastax.driver.core.Cluster.connect(Cluster.java:311)
> at 
> org.apache.cassandra.utils.NativeSSTableLoaderClient.init(NativeSSTableLoaderClient.java:75)
> at 
> org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:183)
> at org.apache.cassandra.tools.BulkLoader.load(BulkLoader.java:83)
> at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:51)
> Exception in thread "main" org.apache.cassandra.tools.BulkLoadException: 
> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) 
> tried for query failed (tried: /192.168.1.24:9042 
> (com.datastax.driver.core.exceptions.TransportException: [/192.168.1.24:9042] 
> Cannot connect))
> at org.apache.cassandra.tools.BulkLoader.load(BulkLoader.java:96)
> at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:51)
> Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All 
> host(s) tried for query failed (tried: /192.168.1.24:9042 
> (com.datastax.driver.core.exceptions.TransportException: [/192.168.1.24:9042] 
> Cannot connect))
> at 
> com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:270)
> at 
> com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:109)
> at 
> com.datastax.driver.core.Cluster$Manager.negotiateProtocolVersionAndConnect(Cluster.java:1813)
> at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1726)
> at com.datastax.driver.core.Cluster.init(Cluster.java:214)
> at com.datastax.driver.core.Cluster.connectAsync(Cluster.java:387)
> at com.datastax.driver.core.Cluster.connectAsync(Cluster.java:366)
> at com.datastax.driver.core.Cluster.connect(Cluster.java:311)
> at 
> org.apache.cassandra.utils.NativeSSTableLoaderClient.init(NativeSSTableLoaderClient.java:75)
> at 
> org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:183)
> at org.apache.cassandra.tools.BulkLoader.load(BulkLoader.java:83)
> ... 1 more {code}
> The solution is to pass the endpoint as '-d HOST:PORT' (e.g. '-d 
> 192.168.1.24:32181' for the command above), but finding this requires 
> digging into the source code, and it is not documented.






[jira] [Commented] (CASSANDRA-17262) sstableloader ignores the native port option

2023-05-23 Thread dan jatnieks (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725586#comment-17725586
 ] 

dan jatnieks commented on CASSANDRA-17262:
--

Seems like a duplicate of CASSANDRA-17210?

 

> sstableloader ignores the native port option
> 
>
> Key: CASSANDRA-17262
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17262
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Tools
>Reporter: Oleg Zhovtanyuk
>Priority: Normal
>
> sstableloader silently ignores the native port option and connects to the 
> default port.
> E.g.
> {code:java}
> sstableloader -v -d 192.168.1.24 -p 32181 -sp 32182 -u USER -pw PASSWORD 
> backups/test-workspace/test-table-9150b730742911ec9d011fb0ef4f5206 {code}
> fails with
> {code:java}
> All host(s) tried for query failed (tried: /192.168.1.24:9042 
> (com.datastax.driver.core.exceptions.TransportException: [/192.168.1.24:9042] 
> Cannot connect))
> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) 
> tried for query failed (tried: /192.168.1.24:9042 
> (com.datastax.driver.core.exceptions.TransportException: [/192.168.1.24:9042] 
> Cannot connect))
> at 
> com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:270)
> at 
> com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:109)
> at 
> com.datastax.driver.core.Cluster$Manager.negotiateProtocolVersionAndConnect(Cluster.java:1813)
> at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1726)
> at com.datastax.driver.core.Cluster.init(Cluster.java:214)
> at com.datastax.driver.core.Cluster.connectAsync(Cluster.java:387)
> at com.datastax.driver.core.Cluster.connectAsync(Cluster.java:366)
> at com.datastax.driver.core.Cluster.connect(Cluster.java:311)
> at 
> org.apache.cassandra.utils.NativeSSTableLoaderClient.init(NativeSSTableLoaderClient.java:75)
> at 
> org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:183)
> at org.apache.cassandra.tools.BulkLoader.load(BulkLoader.java:83)
> at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:51)
> Exception in thread "main" org.apache.cassandra.tools.BulkLoadException: 
> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) 
> tried for query failed (tried: /192.168.1.24:9042 
> (com.datastax.driver.core.exceptions.TransportException: [/192.168.1.24:9042] 
> Cannot connect))
> at org.apache.cassandra.tools.BulkLoader.load(BulkLoader.java:96)
> at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:51)
> Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All 
> host(s) tried for query failed (tried: /192.168.1.24:9042 
> (com.datastax.driver.core.exceptions.TransportException: [/192.168.1.24:9042] 
> Cannot connect))
> at 
> com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:270)
> at 
> com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:109)
> at 
> com.datastax.driver.core.Cluster$Manager.negotiateProtocolVersionAndConnect(Cluster.java:1813)
> at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1726)
> at com.datastax.driver.core.Cluster.init(Cluster.java:214)
> at com.datastax.driver.core.Cluster.connectAsync(Cluster.java:387)
> at com.datastax.driver.core.Cluster.connectAsync(Cluster.java:366)
> at com.datastax.driver.core.Cluster.connect(Cluster.java:311)
> at 
> org.apache.cassandra.utils.NativeSSTableLoaderClient.init(NativeSSTableLoaderClient.java:75)
> at 
> org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:183)
> at org.apache.cassandra.tools.BulkLoader.load(BulkLoader.java:83)
> ... 1 more {code}
> The solution is to pass the endpoint as '-d HOST:PORT' (e.g. '-d 
> 192.168.1.24:32181' for the command above), but finding this requires 
> digging into the source code, and it is not documented.






[jira] [Updated] (CASSANDRA-18540) negotiatedProtocolMustBeAcceptedProtocolTest tests fail with "TLSv1.1 failed to negotiate" on JDK17

2023-05-23 Thread dan jatnieks (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dan jatnieks updated CASSANDRA-18540:
-
Test and Documentation Plan: 
Patch: [https://github.com/djatnieks/cassandra/tree/CASSANDRA-18540-trunk]

PR: https://github.com/apache/cassandra/pull/2359

CI: TBD because this has a dependency on CASSANDRA-18180

 
 Status: Patch Available  (was: In Progress)

> negotiatedProtocolMustBeAcceptedProtocolTest tests fail with "TLSv1.1 failed 
> to negotiate" on JDK17
> ---
>
> Key: CASSANDRA-18540
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18540
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI
>Reporter: dan jatnieks
>Assignee: dan jatnieks
>Priority: Normal
> Fix For: 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Note: This depends on having a fix for CASSANDRA-18180, otherwise most/all 
> tests in {{NativeTransportEncryptionOptionsTest}} and 
> {{InternodeEncryptionOptionsTest}} are failing due to that issue.
> Using the patch for CASSANDRA-18180, the 
> {{negotiatedProtocolMustBeAcceptedProtocolTest}} test in both 
> {{NativeTransportEncryptionOptionsTest}} and 
> {{InternodeEncryptionOptionsTest}} fails with "TLSv1.1 failed to negotiate" 
> on JDK17.
> From what I can see, the {{negotiatedProtocolMustBeAcceptedProtocolTest}} is 
> failing because in JDK11 and JDK17 the "TLSv1.1" protocol is disabled.
> Since TLSv1.1 is disabled in JDK11 and 17, one possibility is to change the 
> test to use TLSv1.2 instead of TLSv1.1. That should work directly with JDK11 
> and 17, since TLSv1.2 is one of the defaults, and it won't be an issue for 
> JDK8 as that will be dropped.
> Also, I think the point of the 
> {{negotiatedProtocolMustBeAcceptedProtocolTest}} is to test that the 
> {{accepted_protocols}} option is working correctly rather than the choice of 
> _which_ protocol is used. Meaning, I don’t think the intent was to test 
> TLSv1.1 specifically, rather that the mechanism of accepted protocols works 
> and choosing TLSv1.1 was at the time convenient - but I could be wrong.
> It also seems to me like a bit of a coincidence that these tests are 
> currently working on JDK11, at least on CI. Indeed, running locally with 
> JDK11, these fail for me:
> {noformat}
> $ pwd
> /Users/dan.jatnieks/apache/cassandra-4.0
> $ java -version
> openjdk version "11.0.11" 2021-04-20
> OpenJDK Runtime Environment AdoptOpenJDK-11.0.11+9 (build 11.0.11+9)
> OpenJDK 64-Bit Server VM AdoptOpenJDK-11.0.11+9 (build 11.0.11+9, mixed mode)
> $ ant test-jvm-dtest-some 
> -Dtest.name=org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest
>  -Duse.jdk11=true
> ...
> [junit-timeout] Testcase: 
> negotiatedProtocolMustBeAcceptedProtocolTest(org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest):
>FAILED
> [junit-timeout] Should be possible to establish a TLSv1.1 connection 
> expected:<true> but was:<false>
> [junit-timeout] junit.framework.AssertionFailedError: Should be possible to 
> establish a TLSv1.1 connection expected:<true> but 
> was:<false>
> [junit-timeout]   at 
> org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest.negotiatedProtocolMustBeAcceptedProtocolTest(NativeTransportEncryptionOptionsTest.java:160)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {noformat}
> I believe these work on CI because of CASSANDRA-16848 - in that ticket, after 
> 2021-Apr JDK8 dropped TLSv1.1, which led to a fix in the 
> [cassandra-builds|https://github.com/apache/cassandra-builds/commit/d1a3a0c59b3c5c17697d6a6656cd5d4f3a1cdbe9]
>  docker code to make sure TLSv1.1 is accepted. 
> I say coincidence because this change also makes it work for JDK11 and JDK17, 
> and I've been able to verify that by making the change locally to the JDK 
> {{java.security}} file. I'm not sure that, at the time of CASSANDRA-16848, it 
> was intended for any JDK versions beyond 8.
> The point of mentioning this is that if 
> {{negotiatedProtocolMustBeAcceptedProtocolTest}} is changed to use TLSv1.2, 
> and support for JDK8 is dropped, then the changes made in CASSANDRA-16848 
> could also be reverted.
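For context, the "change locally to the JDK {{java.security}} file" mentioned 
above presumably means editing the {{jdk.tls.disabledAlgorithms}} security 
property, which is what disables TLSv1.1 on recent JDKs. An illustrative 
sketch only (the exact algorithm list varies by JDK build and is abbreviated 
here):
{code}
# conf/security/java.security -- default on a recent JDK (abbreviated):
jdk.tls.disabledAlgorithms=SSLv3, TLSv1, TLSv1.1, RC4, DES, MD5withRSA, ...
# removing TLSv1.1 from the list re-enables it, which lets the test pass:
jdk.tls.disabledAlgorithms=SSLv3, TLSv1, RC4, DES, MD5withRSA, ...
{code}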




[jira] [Updated] (CASSANDRA-18180) bulkLoaderSuccessfullyStreamsOverSsl fails with ClassCastException on JDK17

2023-05-23 Thread dan jatnieks (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dan jatnieks updated CASSANDRA-18180:
-
Test and Documentation Plan: 
Patch:

[https://github.com/djatnieks/cassandra/tree/CASSANDRA-18180]

PR:

[https://github.com/apache/cassandra/pull/2358]

 

Tested locally with jdk11 and jdk17:
{code:java}
ant test-jvm-dtest-some 
-Dtest.name=org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest{code}

  was:
Patch:

[https://github.com/djatnieks/cassandra/tree/CASSANDRA-18180]

 

Tested locally with jdk11 and jdk17:
{code:java}
ant test-jvm-dtest-some 
-Dtest.name=org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest{code}


> bulkLoaderSuccessfullyStreamsOverSsl fails with ClassCastException on JDK17
> ---
>
> Key: CASSANDRA-18180
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18180
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI
>Reporter: Ekaterina Dimitrova
>Assignee: dan jatnieks
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> While working on CASSANDRA-17992 we hit: 
> {code:java}
> java.lang.ClassCastException: class 
> org.apache.cassandra.utils.memory.BufferPool$Chunk cannot be cast to class 
> sun.nio.ch.DirectBuffer (org.apache.cassandra.utils.memory.BufferPool$Chunk 
> is in unnamed module of loader 'app'; sun.nio.ch.DirectBuffer is in module 
> java.base of loader 'bootstrap')\n\tat 
> java.base/com.sun.crypto.provider.GaloisCounterMode$GCMEngine.overlapDetection(GaloisCounterMode.java:865)\n\tat
>  
> java.base/com.sun.crypto.provider.GaloisCounterMode$GCMDecrypt.doFinal(GaloisCounterMode.java:1502)\n\tat
>  
> java.base/com.sun.crypto.provider.GaloisCounterMode.engineDoFinal(GaloisCounterMode.java:447)\n\tat
>  
> {code}
> -The issue is exposed with JDK 17, trunk; if interested, ping- [~e.dimitrova] 
> -for current branch as there is no feature branch at the moment-  we can 
> build and start from trunk with JDK17 already. Circle CI can be run for JDK17 
> too. For more information on how to do that, see .circleci/readme.md
> CC [~benedict] 






[jira] [Commented] (CASSANDRA-18180) bulkLoaderSuccessfullyStreamsOverSsl fails with ClassCastException on JDK17

2023-05-23 Thread dan jatnieks (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725576#comment-17725576
 ] 

dan jatnieks commented on CASSANDRA-18180:
--

{quote}I experimented with using a map for this, but ended up learning that 
{{ByteBuffer}}'s are not suited to being used as map keys
{quote}
Just a note that it was pointed out to me that {{IdentityHashMap}} could work 
for this; however, it would need to be made thread-safe, and its performance 
would need to be verified.
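For illustration only (not from the ticket), a minimal sketch of why 
identity-based lookup suits mutable buffers where content-based hashing does 
not; in real use the map would still need a thread-safety wrapper such as 
{{Collections.synchronizedMap}}:
{code:java}
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

public class BufferKeyDemo
{
    public static void main(String[] args)
    {
        ByteBuffer buf = ByteBuffer.allocate(4);
        Map<ByteBuffer, String> byContent = new HashMap<>();
        Map<ByteBuffer, String> byIdentity = new IdentityHashMap<>();
        byContent.put(buf, "chunk");
        byIdentity.put(buf, "chunk");

        buf.put((byte) 42); // mutate the buffer: its content hash changes

        System.out.println(byContent.get(buf));  // null - the entry is lost
        System.out.println(byIdentity.get(buf)); // "chunk" - identity is stable
    }
}
{code}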

> bulkLoaderSuccessfullyStreamsOverSsl fails with ClassCastException on JDK17
> ---
>
> Key: CASSANDRA-18180
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18180
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI
>Reporter: Ekaterina Dimitrova
>Assignee: dan jatnieks
>Priority: Normal
>
> While working on CASSANDRA-17992 we hit: 
> {code:java}
> java.lang.ClassCastException: class 
> org.apache.cassandra.utils.memory.BufferPool$Chunk cannot be cast to class 
> sun.nio.ch.DirectBuffer (org.apache.cassandra.utils.memory.BufferPool$Chunk 
> is in unnamed module of loader 'app'; sun.nio.ch.DirectBuffer is in module 
> java.base of loader 'bootstrap')\n\tat 
> java.base/com.sun.crypto.provider.GaloisCounterMode$GCMEngine.overlapDetection(GaloisCounterMode.java:865)\n\tat
>  
> java.base/com.sun.crypto.provider.GaloisCounterMode$GCMDecrypt.doFinal(GaloisCounterMode.java:1502)\n\tat
>  
> java.base/com.sun.crypto.provider.GaloisCounterMode.engineDoFinal(GaloisCounterMode.java:447)\n\tat
>  
> {code}
> -The issue is exposed with JDK 17, trunk; if interested, ping- [~e.dimitrova] 
> -for current branch as there is no feature branch at the moment-  we can 
> build and start from trunk with JDK17 already. Circle CI can be run for JDK17 
> too. For more information on how to do that, see .circleci/readme.md
> CC [~benedict] 






[jira] [Comment Edited] (CASSANDRA-18180) bulkLoaderSuccessfullyStreamsOverSsl fails with ClassCastException on JDK17

2023-05-23 Thread dan jatnieks (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17723776#comment-17723776
 ] 

dan jatnieks edited comment on CASSANDRA-18180 at 5/23/23 9:07 PM:
---

Recap: The {{attachment}} field of {{ByteBuffer}} is being used to store the 
related {{BufferPool}} {{Chunk}} object that is associated with the buffer. 
Since -JDK11- JDK16, some crypto overlap detection code in 
{{GaloisCounterMode}} expects that any object attached to a {{ByteBuffer}} 
implements {{DirectBuffer}}, and if it does not, it will cause a 
{{ClassCastException}}.

Since we see this {{ClassCastException}} in tests using encryption, it seems 
it's triggered by a supported TLS setting rather than some unintended default.

The patch I provided uses the suggestion made by [~benedict] in this comment in 
CASSANDRA-17992, which is to have {{Chunk}} (and also {{Ref}}) implement 
{{DirectBuffer}}.

The main downside to this is that two additional {{--add-exports}} are required 
to be able to access the JDK-internal class {{DirectBuffer}}.
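A minimal sketch of that suggestion (illustrative only; the real {{Chunk}} 
manages slab memory and the actual patch differs): implementing 
{{sun.nio.ch.DirectBuffer}} is what lets the JDK16+ GCM overlap-detection code 
cast the attachment without throwing.
{code:java}
// Requires the two added exports mentioned above, e.g.:
//   --add-exports java.base/sun.nio.ch=ALL-UNNAMED
//   --add-exports java.base/jdk.internal.ref=ALL-UNNAMED
import java.nio.ByteBuffer;
import jdk.internal.ref.Cleaner;
import sun.nio.ch.DirectBuffer;

final class Chunk implements DirectBuffer
{
    private final ByteBuffer slab; // the direct memory this chunk manages

    Chunk(ByteBuffer slab) { this.slab = slab; }

    // Delegate to the underlying direct slab so overlap detection sees a
    // meaningful address; no extra attachment or cleaner of our own.
    @Override public long address()      { return ((DirectBuffer) slab).address(); }
    @Override public Object attachment() { return null; }
    @Override public Cleaner cleaner()   { return null; }
}
{code}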

Access to this internal class also has a secondary impact on 
{{TestNameCheckTask}} as it uses reflection and tries to load all related 
classes of tests being checked. This led to replacing the 
{{checktestnameshelper}} ant task in {{build.xml}} with a java target so that 
it is possible to pass the needed jvm args to {{{}TestNameCheckTask{}}}.

An alternative approach that avoids accessing JDK internals would, I think, 
still need to associate {{ByteBuffer}} objects to {{Chunk}} objects. I 
experimented with using a map for this, but ended up learning that 
{{ByteBuffer}}s are not suited to being used as map keys, as the docs 
state:
{noformat}
because buffer hash codes are content-dependent, it is inadvisable to use 
buffers as keys in hash maps or similar data structures unless it is known that 
their contents will not change.{noformat}
So, you can put a {{ByteBuffer}} into a map, but if the contents change you 
won't be able to retrieve/find it again.

I'm not sure what other workaround solution to propose that doesn't negatively 
affect complexity and/or performance, but am happy to look into other ideas.

-There is another aspect of this that I don't fully understand that's somewhat 
bothersome - this test only fails with JDK17 even though the same 
{{GaloisCounterMode}} overlap detection code is present in JDK11; for some 
reason the same code path is not executed and the tests pass with JDK11. I 
wonder why that is?-

-I did check the enabled cipher suites for internode messaging and native 
transport that I found looking at test run output - and those are the same with 
11 and 17. So I guess it's a result of some change in the JDK? If I could be 
due to something else, I don't know what that might be.-

My mistake - {{GaloisCounterMode}} overlap detection was only added in JDK16 
with [JDK-8253821|https://bugs.openjdk.org/browse/JDK-8253821], so it does make 
sense that this issue happens with JDK17 and not with JDK11.


was (Author: djatnieks):
Recap: The {{attachment}} field of {{ByteBuffer}} is being used to store the 
related {{BufferPool}} {{Chunk}} object that is associated with the buffer. 
Since JDK11, some crypto overlap detection code in {{GaloisCounterMode}} 
expects that any object attached to a {{ByteBuffer}} implements 
{{DirectBuffer}}, and if it does not, it will cause a {{ClassCastException}}.

Since we see this {{ClassCastException}} in tests using encryption, it seems 
it's triggered by a supported TLS setting rather than some unintended default.

The patch I provided uses the suggestion made by [~benedict] in this comment in 
CASSANDRA-17992, which is to have {{Chunk}} (and also {{{}Ref{}}}) implement 
{{{}DirectBuffer{}}}.

The main downside to this is that two additional {{--add-exports}} are required 
to be able to access JDK internal class {{{}DirectBuffer{}}}.

Access to this internal class also has a secondary impact on 
{{TestNameCheckTask}} as it uses reflection and tries to load all related 
classes of tests being checked. This led to replacing the 
{{checktestnameshelper}} ant task in {{build.xml}} with a java target so that 
it is possible to pass the needed jvm args to {{{}TestNameCheckTask{}}}.

An alternative approach that avoids accessing jdk internals would, I think, 
still need to associate {{ByteBuffer}} objects to {{Chunk}} objects. I 
experimented with using a map for this, but ended up learning that 
{{{}ByteBuffer{}}}'s are not suited to being used as map keys, as the docs 
state:
{noformat}
because buffer hash codes are content-dependent, it is inadvisable to use 
buffers as keys in hash maps or similar data structures unless it is known that 
their contents will not change.{noformat}
So, you can put a {{ByteBuffer}} into a map, but if the contents change you 
won't be able to retrieve/find it again.

[jira] [Updated] (CASSANDRA-18544) Make cassandra-stress possible to authenticate against JMX with username and password

2023-05-23 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-18544:
--
Change Category: Operability
 Complexity: Normal
  Fix Version/s: 5.x
   Assignee: Stefan Miklosovic
 Status: Open  (was: Triage Needed)

> Make cassandra-stress possible to authenticate against JMX with username and 
> password
> -
>
> Key: CASSANDRA-18544
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18544
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/stress
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.x
>
> Attachments: signature.asc
>
>
> If a username/password is required in order to connect via JMX, there is 
> currently no way to provide it, as the NodeProbe constructor in 
> cassandra-stress does not accept any credentials. 
> The way we read credentials should also be refactored. Currently, if 
> credentials are passed on the command line, they will be visible in the 
> logs. Making JMX credentials visible on the command line is not ideal either.
> What I would like to see is reading all credentials necessary for 
> cassandra-stress from ONE file, CQL and JMX combined (a hypothetical sketch 
> follows below). 
> Because there is already some logic in place and it would be good to keep 
> this backward compatible, we may still support command-line credentials for 
> CQL, but deprecate them and remove them in 6.0, so that cassandra-stress 
> would read credentials from the file only.
> I looked into the implementation and I have an idea of how to "inject" 
> credentials where necessary so they would be used even when they are not 
> given on the command line, but I have not coded up anything yet.
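Purely as a hypothetical illustration of the combined file (no format has been 
agreed in the ticket; every key name below is an assumption), a 
properties-style file could cover both sets of credentials:
{code}
# cassandra-stress credentials file (hypothetical format)
cql.username=stress_user
cql.password=stress_password
jmx.username=jmx_user
jmx.password=jmx_password
{code}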






[jira] [Commented] (CASSANDRA-18453) Use WithProperties to ensure that system properties are handled

2023-05-23 Thread Bernardo Botella Corbi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725571#comment-17725571
 ] 

Bernardo Botella Corbi commented on CASSANDRA-18453:


Will take a look at the new comments

> Use WithProperties to ensure that system properties are handled
> ---
>
> Key: CASSANDRA-18453
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18453
> Project: Cassandra
>  Issue Type: Task
>  Components: CI
>Reporter: Maxim Muzafarov
>Assignee: Bernardo Botella Corbi
>Priority: Normal
>  Labels: low-hanging-fruit
> Fix For: 5.x
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> {{WithProperties}} is used to set and reset system property values during a 
> test run; instead of try/finally, it uses the try-with-resources approach, 
> which facilitates test development.
> We need to replace all the try/finally clauses that work with system 
> properties with {{WithProperties}} and try-with-resources, for all similar 
> cases where it is technically possible.
> Example:
> {code:java}
> try
> {
> COMMITLOG_IGNORE_REPLAY_ERRORS.setBoolean(true);
> testRecoveryWithGarbageLog();
> }
> finally
> {
> COMMITLOG_IGNORE_REPLAY_ERRORS.clearValue();
> }
> {code}
> Can be replaced with:
> {code:java}
> try (WithProperties properties = new 
> WithProperties().with(COMMITLOG_IGNORE_REPLAY_ERRORS, "true"))
> {
> testRecoveryWithGarbageLog();
> }
> {code}






[jira] [Updated] (CASSANDRA-18400) Nodetool info should report on the new networking cache

2023-05-23 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-18400:
--
Reviewers: Brandon Williams, Stefan Miklosovic  (was: Brandon Williams)
   Status: Review In Progress  (was: Needs Committer)

> Nodetool info should report on the new networking cache
> ---
>
> Key: CASSANDRA-18400
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18400
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/nodetool
>Reporter: Brad Schoening
>Assignee: Ningzi Zhan
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.x
>
> Attachments: Screen Shot 2023-05-18 at 13.57.45.png, Screen Shot 
> 2023-05-19 at 12.59.46.png
>
>
> CASSANDRA-15229 separated the chunk and network cache, creating a new 
> network_cache_size parameter.
> However, *nodetool info* does not report the in-use size of this cache or a 
> hit ratio as it does for key, row, counter and chunk cache.  
> {quote}Exceptions : 4
> Key Cache : entries 2852, size 297.59 KiB, capacity 100 MiB, 2406561 hits, 
> 2409424 requests, 0.999 recent hit rate, 14400 save period in seconds
> Row Cache : entries 0, size 0 bytes, capacity 0 bytes, 0 hits, 0 requests, 
> NaN recent hit rate, 0 save period in seconds
> Counter Cache : entries 0, size 0 bytes, capacity 50 MiB, 0 hits, 0 requests, 
> NaN recent hit rate, 7200 save period in seconds
> Chunk Cache : entries 1118, size 55.02 MiB, capacity 992 MiB, 4695794 misses, 
> 7179421 requests, 0.346 recent hit rate, 24.145 microseconds miss latency
> Percent Repaired : 0.0%
> {quote}
> However, when it is full, a message is logged:
> {quote}[INFO ] [epollEventLoopGroup-5-12] cluster_id=1 ip_address=127.1.1.1  
> NoSpamLogger.java:92 - Maximum memory usage reached (128.000MiB), cannot 
> allocate chunk of 8.000MiB
> {quote}
> It should report a line similar to the above:
> {quote}Network Cache : entries ?, size ? MiB, capacity ? MiB, ? misses, ? 
> requests, ? recent hit rate
> {quote}
> Also, I am not sure why the above shows NaN for the row and counter caches; 
> is it a divide-by-zero (0 hits / 0 requests yields NaN) when there are no 
> requests?






[jira] [Commented] (CASSANDRA-18444) CEP-15: (C*) Transactional Metadata Integration

2023-05-23 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725553#comment-17725553
 ] 

David Capwell commented on CASSANDRA-18444:
---

Accord: https://github.com/apache/cassandra-accord/pull/47
C*: https://github.com/bdeggleston/cassandra/pull/9

> CEP-15: (C*) Transactional Metadata Integration
> ---
>
> Key: CASSANDRA-18444
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18444
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Accord
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 5.0
>
>
> Integrate transactional metadata with Accord. TCM should update Accord 
> topology and schema, and Accord epochs should map to TCM epochs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-18444) CEP-15: (C*) Transactional Metadata Integration

2023-05-23 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725553#comment-17725553
 ] 

David Capwell edited comment on CASSANDRA-18444 at 5/23/23 7:38 PM:


w/ bootstrap:

Accord: https://github.com/apache/cassandra-accord/pull/47
C*: https://github.com/bdeggleston/cassandra/pull/9


was (Author: dcapwell):
Accord: https://github.com/apache/cassandra-accord/pull/47
C*: https://github.com/bdeggleston/cassandra/pull/9

> CEP-15: (C*) Transactional Metadata Integration
> ---
>
> Key: CASSANDRA-18444
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18444
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Accord
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 5.0
>
>
> Integrate transactional metadata with Accord. TCM should update Accord 
> topology and schema, and Accord epochs should map to TCM epochs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[cassandra-website] branch asf-staging updated (6e95c42df -> 12c0eaa69)

2023-05-23 Thread git-site-role
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


 discard 6e95c42df generate docs for 16320f1d
 new 12c0eaa69 generate docs for 16320f1d

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (6e95c42df)
\
 N -- N -- N   refs/heads/asf-staging (12c0eaa69)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../doc/4.2/cassandra/managing/tools/cqlsh.html|  17 +
 .../doc/trunk/cassandra/managing/tools/cqlsh.html  |  17 +
 site-ui/build/ui-bundle.zip| Bin 4796900 -> 4796900 
bytes
 3 files changed, 34 insertions(+)


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-18540) negotiatedProtocolMustBeAcceptedProtocolTest tests fail with "TLSv1.1 failed to negotiate" on JDK17

2023-05-23 Thread dan jatnieks (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dan jatnieks updated CASSANDRA-18540:
-
Description: 
Note: This depends on having a fix for CASSANDRA-18180, otherwise most/all 
tests in {{NativeTransportEncryptionOptionsTest}} and 
{{InternodeEncryptionOptionsTest}} are failing due to that issue.

Using the patch for CASSANDRA-18180, the 
{{negotiatedProtocolMustBeAcceptedProtocolTest}} test in both 
{{NativeTransportEncryptionOptionsTest}} and {{InternodeEncryptionOptionsTest}} 
fails with "TLSv1.1 failed to negotiate" on JDK17.

From what I can see, the {{negotiatedProtocolMustBeAcceptedProtocolTest}} is 
failing because in JDK11 and JDK17 the "TLSv1.1" protocol is disabled.

Since TLSv1.1 is disabled in JDK11 and 17, one possibility is to change the 
test to use TLSv1.2 instead of TLSv1.1. That should work directly with JDK11 
and 17, since TLSv1.2 is one of the defaults, and it won't be an issue for JDK8 
as that will be dropped.

Also, I think the point of the {{negotiatedProtocolMustBeAcceptedProtocolTest}} 
is to test that the {{accepted_protocols}} option works correctly, rather than 
the choice of _which_ protocol is used. Meaning, I don't think the intent was 
to test TLSv1.1 specifically; rather, the test checks that the mechanism of 
accepted protocols works, and choosing TLSv1.1 was simply convenient at the 
time - but I could be wrong.

It also seems to me like a bit of a coincidence that these tests currently 
work on JDK11, at least in CI. Indeed, running locally with JDK11, they fail 
for me:

{noformat}
$ pwd
/Users/dan.jatnieks/apache/cassandra-4.0

$ java -version
openjdk version "11.0.11" 2021-04-20
OpenJDK Runtime Environment AdoptOpenJDK-11.0.11+9 (build 11.0.11+9)
OpenJDK 64-Bit Server VM AdoptOpenJDK-11.0.11+9 (build 11.0.11+9, mixed mode)

$ ant test-jvm-dtest-some 
-Dtest.name=org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest
 -Duse.jdk11=true

...

[junit-timeout] Testcase: 
negotiatedProtocolMustBeAcceptedProtocolTest(org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest):
 FAILED
[junit-timeout] Should be possible to establish a TLSv1.1 connection 
expected:<true> but was:<false>
[junit-timeout] junit.framework.AssertionFailedError: Should be possible to 
establish a TLSv1.1 connection expected:<true> but 
was:<false>
[junit-timeout] at 
org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest.negotiatedProtocolMustBeAcceptedProtocolTest(NativeTransportEncryptionOptionsTest.java:160)
[junit-timeout] at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit-timeout] at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[junit-timeout] at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
{noformat}

I believe these work in CI because of CASSANDRA-16848 - in that ticket, the 
2021-Apr JDK8 update dropped TLSv1.1, which led to a fix in the 
[cassandra-builds|https://github.com/apache/cassandra-builds/commit/d1a3a0c59b3c5c17697d6a6656cd5d4f3a1cdbe9]
 docker code to make sure TLSv1.1 is accepted. 

I say coincidence because this change also makes it work for JDK11 and JDK17, 
and I've been able to verify that by making a local change to the JDK 
{{java.security}} file. I'm not sure that at the time of CASSANDRA-16848 it 
was intended for any particular JDK versions.

The point of mentioning this is that if 
{{negotiatedProtocolMustBeAcceptedProtocolTest}} is changed to use TLSv1.2, and 
support for JDK8 is dropped, then the changes made in CASSANDRA-16848 could 
also be reverted.


  was:
Note: This depends on having a fix for CASSANDRA-18180, otherwise most/all 
tests in {{NativeTransportEncryptionOptionsTest}} and 
{{InternodeEncryptionOptionsTest}} are failing due to that issue.

Using the patch for CASSANDRA-18180, the 
{{negotiatedProtocolMustBeAcceptedProtocolTest}} test in both 
{{NativeTransportEncryptionOptionsTest}} and {{InternodeEncryptionOptionsTest}} 
fails with "TLSv1.1 failed to negotiate" on JDK17.

From what I can see, the {{negotiatedProtocolMustBeAcceptedProtocolTest}} is 
failing because in JDK11 and JDK17 the "TLSv1.1" protocol is disabled.

Since TLSv1.1 is disabled in JDK11 and 17, one possibility is to change the 
test to use TLSv1.2 instead of TLSv1.1. That should work directly with JDK11 
and 17, since TLSv1.2 is one of the defaults, and it won't be an issue for JDK8 
as that will be dropped.

Also, I think the point of the {{negotiatedProtocolMustBeAcceptedProtocolTest}} 
is to test that the {{accepted_protocols}} option is working correctly rather 
than the choice of _which_ protocol is used. Meaning, I don’t think the intent 
was to test TLSv1.1 specifically, rather that the mechanism of accepted 
protocols works and choosing TLSv1.1 was at the time conveni
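As a quick way to observe the behaviour described above, a small illustrative 
sketch (plain JDK APIs, not test code from the patch) that prints which TLS 
versions the running JDK enables by default versus merely supports:

{code:java}
import javax.net.ssl.SSLContext;
import java.util.Arrays;

public class TlsProtocols
{
    public static void main(String[] args) throws Exception
    {
        SSLContext ctx = SSLContext.getDefault();
        // On recent JDK 11/17 java.security defaults, TLSv1.1 typically shows
        // up as supported but not enabled, matching the negotiation failure.
        System.out.println("enabled:   " + Arrays.toString(ctx.getDefaultSSLParameters().getProtocols()));
        System.out.println("supported: " + Arrays.toString(ctx.getSupportedSSLParameters().getProtocols()));
    }
}
{code}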

[jira] [Assigned] (CASSANDRA-18545) [Analytics] Abstract mTLS provisioning via a SecretsProvider

2023-05-23 Thread Francisco Guerrero (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francisco Guerrero reassigned CASSANDRA-18545:
--

Assignee: Francisco Guerrero

> [Analytics] Abstract mTLS provisioning via a SecretsProvider
> 
>
> Key: CASSANDRA-18545
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18545
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Francisco Guerrero
>Assignee: Francisco Guerrero
>Priority: Normal
>
> This enhancement abstracts mTLS secrets provisioning behind a 
> {{SecretsProvider}} interface, allowing custom implementations of 
> {{SecretsProvider}} to hook into the secrets provisioning. We need to provide 
> a default implementation, {{SslConfigSecretsProvider}}, which supplies 
> secrets via the {{SslConfig}} parsed from the reader options.
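A hedged sketch of the shape such an interface could take; the method names 
below are illustrative assumptions, not the actual Analytics API:

{code:java}
import java.io.InputStream;

// Abstracts where mTLS material comes from, so implementations can fetch it
// from a config file, a secrets manager, or any other source.
public interface SecretsProvider
{
    InputStream keystore() throws Exception;      // client-side keystore for the mTLS handshake
    char[] keystorePassword();

    InputStream truststore() throws Exception;    // truststore used to validate the peer
    char[] truststorePassword();
}
{code}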



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-18545) [Analytics] Abstract mTLS provisioning via a SecretsProvider

2023-05-23 Thread Francisco Guerrero (Jira)
Francisco Guerrero created CASSANDRA-18545:
--

 Summary: [Analytics] Abstract mTLS provisioning via a 
SecretsProvider
 Key: CASSANDRA-18545
 URL: https://issues.apache.org/jira/browse/CASSANDRA-18545
 Project: Cassandra
  Issue Type: Improvement
Reporter: Francisco Guerrero


This enhancement abstracts mTLS secrets provisioning behind a 
{{SecretsProvider}} interface, allowing custom implementations of 
{{SecretsProvider}} to hook into the secrets provisioning. We need to provide 
a default implementation, {{SslConfigSecretsProvider}}, which supplies secrets 
via the {{SslConfig}} parsed from the reader options.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18017) Jenkins: Consider using the no-build-test flag

2023-05-23 Thread Maxim Muzafarov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725510#comment-17725510
 ] 

Maxim Muzafarov commented on CASSANDRA-18017:
-

Sure, no problem. But is there any way to test it without merging?

I'm struggling with a very suspicious test failure (see related issues). It 
happens only on Jenkins, not on CircleCI, and it seems related to the 
checkstyle checks run. So my theory is that if we can skip it for tests, as 
was suggested here, the issue will never occur.

> Jenkins: Consider using the no-build-test flag
> --
>
> Key: CASSANDRA-18017
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18017
> Project: Cassandra
>  Issue Type: Task
>  Components: CI
>Reporter: Andres de la Peña
>Assignee: Maxim Muzafarov
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> CASSANDRA-16625 and CASSANDRA-18000 added a {{-Dno-build-test=true}} flag to 
> skip the dependencies of the Ant test targets. This is useful to speed up the 
> test targets if the {{build-test}} target has already been run, for example 
> as part of {{ant jar}}.
> That was created mainly with CircleCI's multiplexer in mind, where we run the 
> same test target repeatedly. Skipping the already-run dependency targets can 
> significantly speed up the tests. The flag, however, is also useful for all 
> other test jobs, because every parallel runner can skip the test-building 
> step, and we have hundreds of parallel runners. Saving around 30s on every 
> runner adds up to considerable savings.
> Maybe this flag can be used to skip test builds on Jenkins too, so each 
> parallel test split can benefit from a slight boost. That could be done if 
> either {{build-test}} or {{jar}} has already been run before calling the test 
> target. I'm not familiar with the Jenkins config, so I'm not sure whether it 
> makes sense.
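For illustration, the flow being described looks roughly like this - a sketch, 
assuming {{build-test}} is directly invocable and using a placeholder test 
class name:

{noformat}
# build main and test artifacts once
ant jar build-test

# each subsequent runner skips rebuilding the tests
ant test-jvm-dtest-some -Dtest.name=org.apache.cassandra.distributed.test.SomeTest -Dno-build-test=true
{noformat}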



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18444) CEP-15: (C*) Transactional Metadata Integration

2023-05-23 Thread Blake Eggleston (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725509#comment-17725509
 ] 

Blake Eggleston commented on CASSANDRA-18444:
-

with bootstrap integration:
[C*|https://github.com/bdeggleston/cassandra/tree/bootstrap-integration]
[accord|https://github.com/bdeggleston/cassandra-accord/tree/bootstrap-integration]

> CEP-15: (C*) Transactional Metadata Integration
> ---
>
> Key: CASSANDRA-18444
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18444
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Accord
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 5.0
>
>
> Integrate transactional metadata with Accord. TCM should update Accord 
> topology and schema, and Accord epochs should map to TCM epochs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-18512) nodetool describecluster command is not showing correct Down count.

2023-05-23 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-18512:
-
Test and Documentation Plan: run CI
 Status: Patch Available  (was: Open)

> nodetool describecluster command is not showing correct Down count.  
> -
>
> Key: CASSANDRA-18512
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18512
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/nodetool
>Reporter: Ranju
>Assignee: Ranju
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.x
>
> Attachments: 
> 0001-Fix-Down-nodes-counter-in-nodetool-describeCluster-c.patch
>
>
> There are some nodes down in a cluster running Cassandra version 4.x.
> # The nodetool describecluster command output shows these IPs as unreachable.
> UNREACHABLE: [, , ]
> Stats for all nodes:
>     Live: 3
>     Joining: 0
>     Moving: 0
>     Leaving: 0
>     Unreachable: 3
> But under each data center, the count of down nodes is always shown as 0.
> Data Centers: 
>     dc1 #Nodes: 3 #Down: 0
>     dc2 #Nodes: 3 #Down: 0
>  
> Steps to reproduce:
>  # Set up two data centers dc1 and dc2, each with 3 nodes - dc1:3, dc2:3.
>  # Mark down any 3 nodes across the two data centers.
>  # Run the nodetool describecluster command from a live node and check the 
> Unreachable count (3) against the Down count (0) - they do not match.
>  
> Expected output: the Unreachable and Down counts should have the same value.
> Data Centers:
>         dc1 #Nodes: 3 #Down: 1
>         dc2 #Nodes: 3 #Down: 2
>  
>  
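For intuition, the tally the ticket expects can be expressed in a few lines of 
code; the helper names here are hypothetical, not the attached patch:

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class DownPerDc
{
    // endpointToDc: endpoint -> datacenter name; unreachable: endpoints seen as down.
    // Derives the per-DC "#Down" counts from the same liveness data that feeds
    // the "Unreachable" total, so the two figures cannot disagree.
    public static Map<String, Integer> downPerDc(Map<String, String> endpointToDc,
                                                 Set<String> unreachable)
    {
        Map<String, Integer> down = new HashMap<>();
        for (Map.Entry<String, String> e : endpointToDc.entrySet())
            down.merge(e.getValue(), unreachable.contains(e.getKey()) ? 1 : 0, Integer::sum);
        return down; // e.g. {dc1=1, dc2=2} when Unreachable is 3
    }
}
{code}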



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-18512) nodetool describecluster command is not showing correct Down count.

2023-05-23 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-18512:
-
Status: Needs Committer  (was: Patch Available)

> nodetool describecluster command is not showing correct Down count.  
> -
>
> Key: CASSANDRA-18512
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18512
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/nodetool
>Reporter: Ranju
>Assignee: Ranju
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.x
>
> Attachments: 
> 0001-Fix-Down-nodes-counter-in-nodetool-describeCluster-c.patch
>
>
> There are some nodes down in a cluster running Cassandra version 4.x.
> # The nodetool describecluster command output shows these IPs as unreachable.
> UNREACHABLE: [, , ]
> Stats for all nodes:
>     Live: 3
>     Joining: 0
>     Moving: 0
>     Leaving: 0
>     Unreachable: 3
> But under each data center, the count of down nodes is always shown as 0.
> Data Centers: 
>     dc1 #Nodes: 3 #Down: 0
>     dc2 #Nodes: 3 #Down: 0
>  
> Steps to reproduce:
>  # Set up two data centers dc1 and dc2, each with 3 nodes - dc1:3, dc2:3.
>  # Mark down any 3 nodes across the two data centers.
>  # Run the nodetool describecluster command from a live node and check the 
> Unreachable count (3) against the Down count (0) - they do not match.
>  
> Expected output: the Unreachable and Down counts should have the same value.
> Data Centers:
>         dc1 #Nodes: 3 #Down: 1
>         dc2 #Nodes: 3 #Down: 2
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18512) nodetool describecluster command is not showing correct Down count.

2023-05-23 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725488#comment-17725488
 ] 

Brandon Williams commented on CASSANDRA-18512:
--

4.0 has CASSANDRA-18366 but is otherwise clean, 4.1 is clean aside from an 
unrelated timeout, and trunk is clean. +1 from me.



> nodetool describecluster command is not showing correct Down count.  
> -
>
> Key: CASSANDRA-18512
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18512
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/nodetool
>Reporter: Ranju
>Assignee: Ranju
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.x
>
> Attachments: 
> 0001-Fix-Down-nodes-counter-in-nodetool-describeCluster-c.patch
>
>
> There are some nodes down in a cluster running Cassandra version 4.x.
> # The nodetool describecluster command output shows these IPs as unreachable.
> UNREACHABLE: [, , ]
> Stats for all nodes:
>     Live: 3
>     Joining: 0
>     Moving: 0
>     Leaving: 0
>     Unreachable: 3
> But under each data center, the count of down nodes is always shown as 0.
> Data Centers: 
>     dc1 #Nodes: 3 #Down: 0
>     dc2 #Nodes: 3 #Down: 0
>  
> Steps to reproduce:
>  # Set up two data centers dc1 and dc2, each with 3 nodes - dc1:3, dc2:3.
>  # Mark down any 3 nodes across the two data centers.
>  # Run the nodetool describecluster command from a live node and check the 
> Unreachable count (3) against the Down count (0) - they do not match.
>  
> Expected output: the Unreachable and Down counts should have the same value.
> Data Centers:
>         dc1 #Nodes: 3 #Down: 1
>         dc2 #Nodes: 3 #Down: 2
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-18453) Use WithProperties to ensure that system properties are handled

2023-05-23 Thread Jacek Lewandowski (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacek Lewandowski updated CASSANDRA-18453:
--
Reviewers: Jacek Lewandowski, Stefan Miklosovic  (was: Stefan Miklosovic)

> Use WithProperties to ensure that system properties are handled
> ---
>
> Key: CASSANDRA-18453
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18453
> Project: Cassandra
>  Issue Type: Task
>  Components: CI
>Reporter: Maxim Muzafarov
>Assignee: Bernardo Botella Corbi
>Priority: Normal
>  Labels: low-hanging-fruit
> Fix For: 5.x
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> {{WithProperties}} is used to set and reset system properties during a test 
> run; instead of try-finally, it uses the try-with-resources approach, which 
> simplifies test development.
> We need to replace all try-finally clauses that manipulate system properties 
> with {{WithProperties}} and try-with-resources, in all similar cases where it 
> is technically possible.
> Example:
> {code:java}
> try
> {
>     COMMITLOG_IGNORE_REPLAY_ERRORS.setBoolean(true);
>     testRecoveryWithGarbageLog();
> }
> finally
> {
>     COMMITLOG_IGNORE_REPLAY_ERRORS.clearValue();
> }
> {code}
> Can be replaced with:
> {code:java}
> try (WithProperties properties = new WithProperties().with(COMMITLOG_IGNORE_REPLAY_ERRORS, "true"))
> {
>     testRecoveryWithGarbageLog();
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-18453) Use WithProperties to ensure that system properties are handled

2023-05-23 Thread Jacek Lewandowski (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacek Lewandowski updated CASSANDRA-18453:
--
Status: Changes Suggested  (was: Review In Progress)

> Use WithProperties to ensure that system properties are handled
> ---
>
> Key: CASSANDRA-18453
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18453
> Project: Cassandra
>  Issue Type: Task
>  Components: CI
>Reporter: Maxim Muzafarov
>Assignee: Bernardo Botella Corbi
>Priority: Normal
>  Labels: low-hanging-fruit
> Fix For: 5.x
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> {{WithProperties}} is used to set and reset system properties during a test 
> run; instead of try-finally, it uses the try-with-resources approach, which 
> simplifies test development.
> We need to replace all try-finally clauses that manipulate system properties 
> with {{WithProperties}} and try-with-resources, in all similar cases where it 
> is technically possible.
> Example:
> {code:java}
> try
> {
>     COMMITLOG_IGNORE_REPLAY_ERRORS.setBoolean(true);
>     testRecoveryWithGarbageLog();
> }
> finally
> {
>     COMMITLOG_IGNORE_REPLAY_ERRORS.clearValue();
> }
> {code}
> Can be replaced with:
> {code:java}
> try (WithProperties properties = new WithProperties().with(COMMITLOG_IGNORE_REPLAY_ERRORS, "true"))
> {
>     testRecoveryWithGarbageLog();
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18453) Use WithProperties to ensure that system properties are handled

2023-05-23 Thread Jacek Lewandowski (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725447#comment-17725447
 ] 

Jacek Lewandowski commented on CASSANDRA-18453:
---

Left some comments

> Use WithProperties to ensure that system properties are handled
> ---
>
> Key: CASSANDRA-18453
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18453
> Project: Cassandra
>  Issue Type: Task
>  Components: CI
>Reporter: Maxim Muzafarov
>Assignee: Bernardo Botella Corbi
>Priority: Normal
>  Labels: low-hanging-fruit
> Fix For: 5.x
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> {{WithProperties}} is used to set and reset system properties during a test 
> run; instead of try-finally, it uses the try-with-resources approach, which 
> simplifies test development.
> We need to replace all try-finally clauses that manipulate system properties 
> with {{WithProperties}} and try-with-resources, in all similar cases where it 
> is technically possible.
> Example:
> {code:java}
> try
> {
>     COMMITLOG_IGNORE_REPLAY_ERRORS.setBoolean(true);
>     testRecoveryWithGarbageLog();
> }
> finally
> {
>     COMMITLOG_IGNORE_REPLAY_ERRORS.clearValue();
> }
> {code}
> Can be replaced with:
> {code:java}
> try (WithProperties properties = new WithProperties().with(COMMITLOG_IGNORE_REPLAY_ERRORS, "true"))
> {
>     testRecoveryWithGarbageLog();
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-18453) Use WithProperties to ensure that system properties are handled

2023-05-23 Thread Jacek Lewandowski (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacek Lewandowski updated CASSANDRA-18453:
--
Status: Review In Progress  (was: Needs Committer)

> Use WithProperties to ensure that system properties are handled
> ---
>
> Key: CASSANDRA-18453
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18453
> Project: Cassandra
>  Issue Type: Task
>  Components: CI
>Reporter: Maxim Muzafarov
>Assignee: Bernardo Botella Corbi
>Priority: Normal
>  Labels: low-hanging-fruit
> Fix For: 5.x
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> {{WithProperties}} is used to set and reset system properties during a test 
> run; instead of try-finally, it uses the try-with-resources approach, which 
> simplifies test development.
> We need to replace all try-finally clauses that manipulate system properties 
> with {{WithProperties}} and try-with-resources, in all similar cases where it 
> is technically possible.
> Example:
> {code:java}
> try
> {
>     COMMITLOG_IGNORE_REPLAY_ERRORS.setBoolean(true);
>     testRecoveryWithGarbageLog();
> }
> finally
> {
>     COMMITLOG_IGNORE_REPLAY_ERRORS.clearValue();
> }
> {code}
> Can be replaced with:
> {code:java}
> try (WithProperties properties = new WithProperties().with(COMMITLOG_IGNORE_REPLAY_ERRORS, "true"))
> {
>     testRecoveryWithGarbageLog();
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18544) Make cassandra-stress possible to authenticate against JMX with username and password

2023-05-23 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725444#comment-17725444
 ] 

Brandon Williams commented on CASSANDRA-18544:
--

bq. I would say command line parameter would beat the file.

Agreed.

> Make cassandra-stress possible to authenticate against JMX with username and 
> password
> -
>
> Key: CASSANDRA-18544
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18544
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/stress
>Reporter: Stefan Miklosovic
>Priority: Normal
> Attachments: signature.asc
>
>
> If a username/password is required to connect via JMX, there is currently no 
> way to provide one, as the NodeProbe constructor in cassandra-stress does 
> not accept any credentials. 
> The way we read credentials should be refactored. Currently, if credentials 
> are given on the command line, they will be visible in the logs. Making JMX 
> credentials visible on the command line is not ideal either.
> What I would like to see is for cassandra-stress to read all necessary 
> credentials, CQL and JMX combined, from ONE file. 
> Because there is already some logic in place and it would be good to keep 
> this backward compatible, we may still support command-line credentials for 
> CQL, but we would deprecate that and remove it in 6.0, so cassandra-stress 
> would read credentials from the file only.
> I looked into the implementation and have an idea of how to "inject" 
> credentials where necessary so they would be used even when not given on the 
> command line, but I have not coded anything up yet.
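On the precedence point discussed above (command line beats file), a tiny 
illustrative sketch with hypothetical names - not cassandra-stress code, and 
the property keys are assumptions:

{code:java}
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;

public class CredentialResolution
{
    // Command-line value wins; otherwise fall back to the credentials file.
    static String resolve(String cliValue, Properties credentialsFile, String key)
    {
        return cliValue != null ? cliValue : credentialsFile.getProperty(key);
    }

    // Loads a java.util.Properties credentials file, e.g. with keys such as
    // "jmx.username" / "jmx.password".
    static Properties load(String path) throws IOException
    {
        Properties props = new Properties();
        try (FileInputStream in = new FileInputStream(path))
        {
            props.load(in);
        }
        return props;
    }
}
{code}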



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18542) Add a simple way to run pycodestyle checks

2023-05-23 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725442#comment-17725442
 ] 

Brandon Williams commented on CASSANDRA-18542:
--

These would be nice, but I think the reason this is done in a dtest is that 
the dtest environment can control the dependencies, and thus have pycodestyle 
installed.  We can't do that unless we ship it, which is undesirable.  I 
haven't thought of the best way to handle this yet.

> Add a simple way to run pycodestyle checks
> --
>
> Key: CASSANDRA-18542
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18542
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Build
>Reporter: Brandon Williams
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.x
>
>
> As the title says.  Basically I want to solve the problem of having 
> perfectly valid, good-looking Python, only to push and have the dtests run 
> pycodestyle and fail everything for a missing blank line.  We should have a 
> convenient way to catch this sooner.
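For a local pre-push check, something as small as this works, assuming 
pycodestyle is installed in the developer's environment (the line length shown 
is illustrative, not the project's setting):

{noformat}
pip install pycodestyle
pycodestyle --max-line-length=125 pylib/cqlshlib
{noformat}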



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-18544) Make cassandra-stress possible to authenticate against JMX with username and password

2023-05-23 Thread miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

miklosovic updated CASSANDRA-18544:
---
Attachment: signature.asc

I am fine with leaving it as it is and reading it from one file if specified.

What should have priority if both are provided? I would say command line 
parameter would beat the file.


Sent from ProtonMail mobile

> Make cassandra-stress possible to authenticate against JMX with username and 
> password
> -
>
> Key: CASSANDRA-18544
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18544
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/stress
>Reporter: Stefan Miklosovic
>Priority: Normal
> Attachments: signature.asc
>
>
> If a username/password is required to connect via JMX, there is currently no 
> way to provide one, as the NodeProbe constructor in cassandra-stress does 
> not accept any credentials. 
> The way we read credentials should be refactored. Currently, if credentials 
> are given on the command line, they will be visible in the logs. Making JMX 
> credentials visible on the command line is not ideal either.
> What I would like to see is for cassandra-stress to read all necessary 
> credentials, CQL and JMX combined, from ONE file. 
> Because there is already some logic in place and it would be good to keep 
> this backward compatible, we may still support command-line credentials for 
> CQL, but we would deprecate that and remove it in 6.0, so cassandra-stress 
> would read credentials from the file only.
> I looked into the implementation and have an idea of how to "inject" 
> credentials where necessary so they would be used even when not given on the 
> command line, but I have not coded anything up yet.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18512) nodetool describecluster command is not showing correct Down count.

2023-05-23 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725440#comment-17725440
 ] 

Brandon Williams commented on CASSANDRA-18512:
--

Thanks for the patch, [~ranju]!  I've applied this from 4.0 up to trunk and 
started CI:

||Branch||CI||
|[4.0|https://github.com/driftx/cassandra/tree/CASSANDRA-18512-4.0]|[j8|https://app.circleci.com/pipelines/github/driftx/cassandra/1023/workflows/0230c5f3-1b96-4edc-83e7-638bd52bc896],
 
[j11|https://app.circleci.com/pipelines/github/driftx/cassandra/1023/workflows/4482c093-a469-47dc-9e94-9b721cdedeab]|
|[4.1|https://github.com/driftx/cassandra/tree/CASSANDRA-18512-4.1]|[j8|https://app.circleci.com/pipelines/github/driftx/cassandra/1022/workflows/9531fbae-e7b5-4d23-8f86-dc6707feb8f6],
 
[j11|https://app.circleci.com/pipelines/github/driftx/cassandra/1022/workflows/6854bd36-cbe9-403a-8d57-6a732e94134b]|
|[trunk|https://github.com/driftx/cassandra/tree/CASSANDRA-18512-trunk]|[j8|https://app.circleci.com/pipelines/github/driftx/cassandra/1024/workflows/87b8427e-2d1e-4f51-83ce-6664d916b0ad],
 
[j11|https://app.circleci.com/pipelines/github/driftx/cassandra/1024/workflows/dbf3742d-3a3b-43a9-9316-27d14fe159df]|

I'll check back later when that has completed.

> nodetool describecluster command is not showing correct Down count.  
> -
>
> Key: CASSANDRA-18512
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18512
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/nodetool
>Reporter: Ranju
>Assignee: Ranju
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.x
>
> Attachments: 
> 0001-Fix-Down-nodes-counter-in-nodetool-describeCluster-c.patch
>
>
> There are some nodes down in a cluster running Cassandra version 4.x.
> # The nodetool describecluster command output shows these IPs as unreachable.
> UNREACHABLE: [, , ]
> Stats for all nodes:
>     Live: 3
>     Joining: 0
>     Moving: 0
>     Leaving: 0
>     Unreachable: 3
> But under each data center, the count of down nodes is always shown as 0.
> Data Centers: 
>     dc1 #Nodes: 3 #Down: 0
>     dc2 #Nodes: 3 #Down: 0
>  
> Steps to reproduce:
>  # Set up two data centers dc1 and dc2, each with 3 nodes - dc1:3, dc2:3.
>  # Mark down any 3 nodes across the two data centers.
>  # Run the nodetool describecluster command from a live node and check the 
> Unreachable count (3) against the Down count (0) - they do not match.
>  
> Expected output: the Unreachable and Down counts should have the same value.
> Data Centers:
>         dc1 #Nodes: 3 #Down: 1
>         dc2 #Nodes: 3 #Down: 2
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18544) Make cassandra-stress possible to authenticate against JMX with username and password

2023-05-23 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725430#comment-17725430
 ] 

Brandon Williams commented on CASSANDRA-18544:
--

bq. we would make it deprecated and we would remove it in 6.0

I don't think we should remove this, as it's quite convenient and some users 
will still want to use it.  In general, I don't think people who don't care 
about the logging, or who know how to avoid it, will want to have to create a 
file for credentials all the time, but having the file as another option makes sense.

> Make cassandra-stress possible to authenticate against JMX with username and 
> password
> -
>
> Key: CASSANDRA-18544
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18544
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/stress
>Reporter: Stefan Miklosovic
>Priority: Normal
>
> If a username/password is required to connect via JMX, there is currently no 
> way to provide one, as the NodeProbe constructor in cassandra-stress does 
> not accept any credentials. 
> The way we read credentials should be refactored. Currently, if credentials 
> are given on the command line, they will be visible in the logs. Making JMX 
> credentials visible on the command line is not ideal either.
> What I would like to see is for cassandra-stress to read all necessary 
> credentials, CQL and JMX combined, from ONE file. 
> Because there is already some logic in place and it would be good to keep 
> this backward compatible, we may still support command-line credentials for 
> CQL, but we would deprecate that and remove it in 6.0, so cassandra-stress 
> would read credentials from the file only.
> I looked into the implementation and have an idea of how to "inject" 
> credentials where necessary so they would be used even when not given on the 
> command line, but I have not coded anything up yet.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-18537) Add JMX utility class to in-jvm dtest to ease development of new tests using JMX

2023-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CASSANDRA-18537:
---
Labels: pull-request-available  (was: )

> Add JMX utility class to in-jvm dtest to ease development of new tests using 
> JMX
> 
>
> Key: CASSANDRA-18537
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18537
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest/java
>Reporter: Doug Rohrer
>Priority: Normal
>  Labels: pull-request-available
> Fix For: NA
>
>
> While reviewing CASSANDRA-18511, some repetitive code was identified across 
> the 4 branches, and 2 different tests, that would also be repeated for any 
> new usages of the JMX support in the in-jvm dtest framework. Therefore, a 
> utility class should be added to the dtest-api's `shared` package that will 
> simplify some of this repetitive and error-prone code.
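A sketch of the kind of helper being proposed, using only standard 
javax.management APIs; the class and method names are illustrative 
assumptions, not the actual dtest-api surface:

{code:java}
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Wraps the connect/lookup boilerplate each JMX test would otherwise repeat.
public final class JmxUtil
{
    public static JMXConnector connect(String host, int port) throws Exception
    {
        JMXServiceURL url = new JMXServiceURL(
            String.format("service:jmx:rmi:///jndi/rmi://%s:%d/jmxrmi", host, port));
        return JMXConnectorFactory.connect(url, null);
    }

    public static MBeanServerConnection mbeanServer(JMXConnector connector) throws Exception
    {
        return connector.getMBeanServerConnection();
    }
}
{code}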



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-18544) Make cassandra-stress possible to authenticate against JMX with username and password

2023-05-23 Thread Stefan Miklosovic (Jira)
Stefan Miklosovic created CASSANDRA-18544:
-

 Summary: Make cassandra-stress possible to authenticate against 
JMX with username and password
 Key: CASSANDRA-18544
 URL: https://issues.apache.org/jira/browse/CASSANDRA-18544
 Project: Cassandra
  Issue Type: Improvement
  Components: Tool/stress
Reporter: Stefan Miklosovic


If a username/password is required to connect via JMX, there is currently no 
way to provide one, as the NodeProbe constructor in cassandra-stress does not 
accept any credentials. 

The way we read credentials should be refactored. Currently, if credentials 
are given on the command line, they will be visible in the logs. Making JMX 
credentials visible on the command line is not ideal either.

What I would like to see is for cassandra-stress to read all necessary 
credentials, CQL and JMX combined, from ONE file. 

Because there is already some logic in place and it would be good to keep this 
backward compatible, we may still support command-line credentials for CQL, 
but we would deprecate that and remove it in 6.0, so cassandra-stress would 
read credentials from the file only.

I looked into the implementation and have an idea of how to "inject" 
credentials where necessary so they would be used even when not given on the 
command line, but I have not coded anything up yet.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-18529) Remove legacy thrift options from cassandra-stress

2023-05-23 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic reassigned CASSANDRA-18529:
-

Assignee: Stefan Miklosovic

> Remove legacy thrift options from cassandra-stress
> --
>
> Key: CASSANDRA-18529
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18529
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/stress
>Reporter: Brad Schoening
>Assignee: Stefan Miklosovic
>Priority: Low
> Fix For: 5.0
>
>
> The cassandra-stress *mode* option allows specifying options for the native 
> protocol and cql3, but these no longer seem useful, as there are no other 
> valid options now that cql3 is the standard and thrift is no longer 
> supported. 
> -mode "native cql3 user=cassandra password=xx" 
> While user and password remain needed, the "native" and "cql3" mode options 
> could be removed.  Perhaps changing the arguments for user and password to 
> match those used with cqlsh would align the tools.
> I.e., 
> cassandra-stress -u cassandra -p x
> Also, the readme.txt in tools/stress states "cassandra-stress supports 
> benchmarking any Cassandra cluster of version 2.0+" but should maybe be 
> updated to a supported Cassandra version, e.g., 3.11.x.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15046) Add a "history" command to cqlsh. Perhaps "show history"?

2023-05-23 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-15046:
--
  Fix Version/s: 5.0
 (was: 5.x)
Source Control Link: 
https://github.com/apache/cassandra/commit/61333964f42e27ec78fec5b4ec25d9313dfc4eee
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Add a "history" command to cqlsh.  Perhaps "show history"?
> --
>
> Key: CASSANDRA-15046
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15046
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL/Interpreter
>Reporter: Wes Peters
>Assignee: Brad Schoening
>Priority: Low
>  Labels: lhf
> Fix For: 5.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> I was trying to capture some create keyspace and create table commands from 
> a running cqlsh, and found there was no equivalent to the '\s' history 
> command in Postgres' psql shell.  It's a great tool for figuring out what 
> you were doing yesterday.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[cassandra] branch trunk updated: Add HISTORY command for CQLSH

2023-05-23 Thread smiklosovic
This is an automated email from the ASF dual-hosted git repository.

smiklosovic pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/trunk by this push:
 new 61333964f4 Add HISTORY command for CQLSH
61333964f4 is described below

commit 61333964f42e27ec78fec5b4ec25d9313dfc4eee
Author: Brad Schoening 
AuthorDate: Fri May 19 13:52:33 2023 -0400

Add HISTORY command for CQLSH

patch by Brad Schoening; reviewed by Stefan Miklosovic and Brandon Williams 
for CASSANDRA-15046
---
 CHANGES.txt|  1 +
 .../cassandra/pages/managing/tools/cqlsh.adoc  | 14 +
 pylib/cqlshlib/cqlshhandling.py| 10 +++-
 pylib/cqlshlib/cqlshmain.py| 65 ++
 pylib/cqlshlib/test/test_cqlsh_completion.py   |  4 +-
 5 files changed, 69 insertions(+), 25 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index a5a6dffab7..12741fad68 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 5.0
+ * Add HISTORY command for CQLSH (CASSANDRA-15046)
  * Fix sstable formats configuration (CASSANDRA-18441)
  * Add guardrail to bound timestamps (CASSANDRA-18352)
  * Add keyspace_name column to system_views.clients (CASSANDRA-18525)
diff --git a/doc/modules/cassandra/pages/managing/tools/cqlsh.adoc 
b/doc/modules/cassandra/pages/managing/tools/cqlsh.adoc
index afafa1d875..b5209e9b4c 100644
--- a/doc/modules/cassandra/pages/managing/tools/cqlsh.adoc
+++ b/doc/modules/cassandra/pages/managing/tools/cqlsh.adoc
@@ -266,6 +266,20 @@ Gives information about cqlsh commands. To see available 
topics, enter
 `HELP `. Also see the `--browser` argument for controlling what
 browser is used to display help.
 
+=== `HISTORY`
+
+Prints to the screen the last `n` commands executed in cqlsh.
+The number of lines defaults to 50 if not specified. `n` is set for
+the current CQL session, so if you set it to e.g. `10`, from that point
+on at most the last 10 commands will be shown.
+
+`Usage`:
+
+[source,none]
+----
+HISTORY <n>
+----
+
 === `TRACING`
 
 Enables or disables tracing for queries. When tracing is enabled, once a
diff --git a/pylib/cqlshlib/cqlshhandling.py b/pylib/cqlshlib/cqlshhandling.py
index 57ec28b763..e6a121fd80 100644
--- a/pylib/cqlshlib/cqlshhandling.py
+++ b/pylib/cqlshlib/cqlshhandling.py
@@ -37,7 +37,8 @@ my_commands_ending_with_newline = (
 'exit',
 'quit',
 'clear',
-'cls'
+'cls',
+'history'
 )
 
 cqlsh_syntax_completers = []
@@ -73,6 +74,7 @@ cqlsh_special_cmd_command_syntax_rules = r'''
| 
| 
| 
+   | <historyCommand>
;
 '''
 
@@ -207,6 +209,11 @@ cqlsh_clear_cmd_syntax_rules = r'''
  ;
 '''
 
+cqlsh_history_cmd_syntax_rules = r'''
+ ::= "history" (n=)?
+;
+'''
+
 cqlsh_question_mark = r'''
  ::= "?" ;
 '''
@@ -232,6 +239,7 @@ cqlsh_extra_syntax_rules = cqlsh_cmd_syntax_rules + \
 cqlsh_login_cmd_syntax_rules + \
 cqlsh_exit_cmd_syntax_rules + \
 cqlsh_clear_cmd_syntax_rules + \
+cqlsh_history_cmd_syntax_rules + \
 cqlsh_question_mark
 
 
diff --git a/pylib/cqlshlib/cqlshmain.py b/pylib/cqlshlib/cqlshmain.py
index c550ed84c6..7cffd68e0e 100755
--- a/pylib/cqlshlib/cqlshmain.py
+++ b/pylib/cqlshlib/cqlshmain.py
@@ -1836,6 +1836,26 @@ class Shell(cmd.Cmd):
 else:
 self.printerr("*** No help on %s" % (t,))
 
+def do_history(self, parsed):
+"""
+HISTORY [cqlsh only]
+
+   Displays the most recent commands executed in cqlsh
+
+HISTORY (<n>)
+
+   If n is specified, the history display length is set to n for this session
+"""
+
+history_length = readline.get_current_history_length()
+
+n = parsed.get_binding('n')
+if (n is not None):
+self.max_history_length_shown = int(n)
+
+for index in range(max(1, history_length - self.max_history_length_shown), history_length):
+print(readline.get_history_item(index))
+
 def do_unicode(self, parsed):
 """
 Textual input/output
@@ -1913,6 +1933,27 @@ class Shell(cmd.Cmd):
 self.cov.save()
 self.cov = None
 
+def init_history(self):
+if readline is not None:
+try:
+readline.read_history_file(HISTORY)
+except IOError:
+pass
+delims = readline.get_completer_delims()
+delims = delims.replace("'", "")
+delims += '.'
+readline.set_completer_delims(delims)
+
+# configure length of history shown
+self.max_history_length_shown = 50
+
+def save_history(self):
+if readline is not None:
+try:
+readline.write_history_file(HISTORY)
+except IOError:
+pass
+

[jira] [Updated] (CASSANDRA-18512) nodetool describecluster command is not showing correct Down count.

2023-05-23 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-18512:
-
 Bug Category: Parent values: Correctness(12982)Level 1 values: Transient 
Incorrect Response(12987)
   Complexity: Low Hanging Fruit
Discovered By: User Report
Fix Version/s: 4.0.x
   4.1.x
   5.x
 Severity: Low
   Status: Open  (was: Triage Needed)

> nodetool describecluster command is not showing correct Down count.  
> -
>
> Key: CASSANDRA-18512
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18512
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/nodetool
>Reporter: Ranju
>Assignee: Ranju
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.x
>
> Attachments: 
> 0001-Fix-Down-nodes-counter-in-nodetool-describeCluster-c.patch
>
>
> There are some nodes down in a cluster running Cassandra version 4.x.
> # The nodetool describecluster command output shows these IPs as unreachable.
> UNREACHABLE: [, , ]
> Stats for all nodes:
>     Live: 3
>     Joining: 0
>     Moving: 0
>     Leaving: 0
>     Unreachable: 3
> But under each data center, the count of down nodes is always shown as 0.
> Data Centers: 
>     dc1 #Nodes: 3 #Down: 0
>     dc2 #Nodes: 3 #Down: 0
>  
> Steps to reproduce:
>  # Set up two data centers dc1 and dc2, each with 3 nodes - dc1:3, dc2:3.
>  # Mark down any 3 nodes across the two data centers.
>  # Run the nodetool describecluster command from a live node and check the 
> Unreachable count (3) against the Down count (0) - they do not match.
>  
> Expected output: the Unreachable and Down counts should have the same value.
> Data Centers:
>         dc1 #Nodes: 3 #Down: 1
>         dc2 #Nodes: 3 #Down: 2
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-18512) nodetool describecluster command is not showing correct Down count.

2023-05-23 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-18512:
-
Reviewers: Brandon Williams

> nodetool describecluster command is not showing correct Down count.  
> -
>
> Key: CASSANDRA-18512
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18512
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/nodetool
>Reporter: Ranju
>Assignee: Ranju
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.x
>
> Attachments: 
> 0001-Fix-Down-nodes-counter-in-nodetool-describeCluster-c.patch
>
>
> There are some nodes down in a cluster running Cassandra version 4.x.
> # The nodetool describecluster command output shows these IPs as unreachable.
> UNREACHABLE: [, , ]
> Stats for all nodes:
>     Live: 3
>     Joining: 0
>     Moving: 0
>     Leaving: 0
>     Unreachable: 3
> But under each data center, the count of down nodes is always shown as 0.
> Data Centers: 
>     dc1 #Nodes: 3 #Down: 0
>     dc2 #Nodes: 3 #Down: 0
>  
> Steps to reproduce:
>  # Set up two data centers dc1 and dc2, each with 3 nodes - dc1:3, dc2:3.
>  # Mark down any 3 nodes across the two data centers.
>  # Run the nodetool describecluster command from a live node and check the 
> Unreachable count (3) against the Down count (0) - they do not match.
>  
> Expected output: the Unreachable and Down counts should have the same value.
> Data Centers:
>         dc1 #Nodes: 3 #Down: 1
>         dc2 #Nodes: 3 #Down: 2
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-18453) Use WithProperties to ensure that system properties are handled

2023-05-23 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725228#comment-17725228
 ] 

Stefan Miklosovic edited comment on CASSANDRA-18453 at 5/23/23 11:49 AM:
-

j11 precommit 
https://app.circleci.com/pipelines/github/instaclustr/cassandra/2277/workflows/6d0d282b-0a05-4852-b067-a4c12390051c
j8 precommit 
https://app.circleci.com/pipelines/github/instaclustr/cassandra/2277/workflows/c0557226-1353-4705-9073-35758be14422

nicer j8 precommit here 
https://app.circleci.com/pipelines/github/instaclustr/cassandra/2278/workflows/960692f9-9bf3-4ea8-873e-5023be807095

this is the flake https://issues.apache.org/jira/browse/CASSANDRA-18440

+1 

https://github.com/instaclustr/cassandra/commits/CASSANDRA-18453-trunk


was (Author: smiklosovic):
j11 precommit 
https://app.circleci.com/pipelines/github/instaclustr/cassandra/2277/workflows/6d0d282b-0a05-4852-b067-a4c12390051c
j8 precommit 
https://app.circleci.com/pipelines/github/instaclustr/cassandra/2277/workflows/c0557226-1353-4705-9073-35758be14422

nicer j8 precommit here 
https://app.circleci.com/pipelines/github/instaclustr/cassandra/2278/workflows/960692f9-9bf3-4ea8-873e-5023be807095

this is the flake https://issues.apache.org/jira/browse/CASSANDRA-18440

+1

> Use WithProperties to ensure that system properties are handled
> ---
>
> Key: CASSANDRA-18453
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18453
> Project: Cassandra
>  Issue Type: Task
>  Components: CI
>Reporter: Maxim Muzafarov
>Assignee: Bernardo Botella Corbi
>Priority: Normal
>  Labels: low-hanging-fruit
> Fix For: 5.x
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> {{WithProperties}} is used to set and reset system properties during a test 
> run; instead of try-finally, it uses the try-with-resources approach, which 
> simplifies test development.
> We need to replace all try-finally clauses that manipulate system properties 
> with {{WithProperties}} and try-with-resources, in all similar cases where it 
> is technically possible.
> Example:
> {code:java}
> try
> {
>     COMMITLOG_IGNORE_REPLAY_ERRORS.setBoolean(true);
>     testRecoveryWithGarbageLog();
> }
> finally
> {
>     COMMITLOG_IGNORE_REPLAY_ERRORS.clearValue();
> }
> {code}
> Can be replaced with:
> {code:java}
> try (WithProperties properties = new WithProperties().with(COMMITLOG_IGNORE_REPLAY_ERRORS, "true"))
> {
>     testRecoveryWithGarbageLog();
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15046) Add a "history" command to cqlsh. Perhaps "show history"?

2023-05-23 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-15046:
-
Status: Ready to Commit  (was: Review In Progress)

> Add a "history" command to cqlsh.  Perhaps "show history"?
> --
>
> Key: CASSANDRA-15046
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15046
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL/Interpreter
>Reporter: Wes Peters
>Assignee: Brad Schoening
>Priority: Low
>  Labels: lhf
> Fix For: 5.x
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> I was trying to capture some create keyspace and create table commands from 
> a running cqlsh, and found there was no equivalent to the '\s' history 
> command in Postgres' psql shell.  It's a great tool for figuring out what 
> you were doing yesterday.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-18453) Use WithProperties to ensure that system properties are handled

2023-05-23 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725228#comment-17725228
 ] 

Stefan Miklosovic edited comment on CASSANDRA-18453 at 5/23/23 11:47 AM:
-

j11 precommit 
https://app.circleci.com/pipelines/github/instaclustr/cassandra/2277/workflows/6d0d282b-0a05-4852-b067-a4c12390051c
j8 precommit 
https://app.circleci.com/pipelines/github/instaclustr/cassandra/2277/workflows/c0557226-1353-4705-9073-35758be14422

nicer j8 precommit here 
https://app.circleci.com/pipelines/github/instaclustr/cassandra/2278/workflows/960692f9-9bf3-4ea8-873e-5023be807095

this is the flake https://issues.apache.org/jira/browse/CASSANDRA-18440

+1


was (Author: smiklosovic):
j11 precommit 
https://app.circleci.com/pipelines/github/instaclustr/cassandra/2277/workflows/6d0d282b-0a05-4852-b067-a4c12390051c
j8 precommit 
https://app.circleci.com/pipelines/github/instaclustr/cassandra/2277/workflows/c0557226-1353-4705-9073-35758be14422

> Use WithProperties to ensure that system properties are handled
> ---
>
> Key: CASSANDRA-18453
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18453
> Project: Cassandra
>  Issue Type: Task
>  Components: CI
>Reporter: Maxim Muzafarov
>Assignee: Bernardo Botella Corbi
>Priority: Normal
>  Labels: low-hanging-fruit
> Fix For: 5.x
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> {{WithProperties}} is used to set and reset system property values during a 
> test run; instead of try-catch it uses the try-with-resources approach, which 
> facilitates test development.
> We need to replace all the try-catch clauses that work with system properties 
> with {{WithProperties}} and try-with-resources, in all similar cases and 
> wherever it is technically possible.
> Example:
> {code:java}
> try
> {
> COMMITLOG_IGNORE_REPLAY_ERRORS.setBoolean(true);
> testRecoveryWithGarbageLog();
> }
> finally
> {
> COMMITLOG_IGNORE_REPLAY_ERRORS.clearValue();
> }
> {code}
> Can be replaced with:
> {code:java}
> try (WithProperties properties = new 
> WithProperties().with(COMMITLOG_IGNORE_REPLAY_ERRORS, "true"))
> {
> testRecoveryWithGarbageLog();
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-18453) Use WithProperties to ensure that system properties are handled

2023-05-23 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-18453:
--
Status: Review In Progress  (was: Changes Suggested)

> Use WithProperties to ensure that system properties are handled
> ---
>
> Key: CASSANDRA-18453
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18453
> Project: Cassandra
>  Issue Type: Task
>  Components: CI
>Reporter: Maxim Muzafarov
>Assignee: Bernardo Botella Corbi
>Priority: Normal
>  Labels: low-hanging-fruit
> Fix For: 5.x
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> {{WithProperties}} is used to set and reset system property values during a 
> test run; instead of try-catch it uses the try-with-resources approach, which 
> facilitates test development.
> We need to replace all the try-catch clauses that work with system properties 
> with {{WithProperties}} and try-with-resources, in all similar cases and 
> wherever it is technically possible.
> Example:
> {code:java}
> try
> {
> COMMITLOG_IGNORE_REPLAY_ERRORS.setBoolean(true);
> testRecoveryWithGarbageLog();
> }
> finally
> {
> COMMITLOG_IGNORE_REPLAY_ERRORS.clearValue();
> }
> {code}
> Can be replaced with:
> {code:java}
> try (WithProperties properties = new 
> WithProperties().with(COMMITLOG_IGNORE_REPLAY_ERRORS, "true"))
> {
> testRecoveryWithGarbageLog();
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-18453) Use WithProperties to ensure that system properties are handled

2023-05-23 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-18453:
--
Status: Needs Committer  (was: Review In Progress)

> Use WithProperties to ensure that system properties are handled
> ---
>
> Key: CASSANDRA-18453
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18453
> Project: Cassandra
>  Issue Type: Task
>  Components: CI
>Reporter: Maxim Muzafarov
>Assignee: Bernardo Botella Corbi
>Priority: Normal
>  Labels: low-hanging-fruit
> Fix For: 5.x
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> {{WithProperties}} is used to set and reset system property values during a 
> test run; instead of try-catch it uses the try-with-resources approach, which 
> facilitates test development.
> We need to replace all the try-catch clauses that work with system properties 
> with {{WithProperties}} and try-with-resources, in all similar cases and 
> wherever it is technically possible.
> Example:
> {code:java}
> try
> {
> COMMITLOG_IGNORE_REPLAY_ERRORS.setBoolean(true);
> testRecoveryWithGarbageLog();
> }
> finally
> {
> COMMITLOG_IGNORE_REPLAY_ERRORS.clearValue();
> }
> {code}
> Can be replaced with:
> {code:java}
> try (WithProperties properties = new 
> WithProperties().with(COMMITLOG_IGNORE_REPLAY_ERRORS, "true"))
> {
> testRecoveryWithGarbageLog();
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15046) Add a "history" command to cqlsh. Perhaps "show history"?

2023-05-23 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725377#comment-17725377
 ] 

Brandon Williams commented on CASSANDRA-15046:
--

bq. I think it is enough to build this for j8 as we are not touching any Java 
code.

I concur.  +1

> Add a "history" command to cqlsh.  Perhaps "show history"?
> --
>
> Key: CASSANDRA-15046
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15046
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL/Interpreter
>Reporter: Wes Peters
>Assignee: Brad Schoening
>Priority: Low
>  Labels: lhf
> Fix For: 5.x
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> I was trying to capture some create keyspace and create table commands from 
> a running cqlsh, and found there was no equivalent to the '\s' history 
> command in Postgres' psql shell.  It's a great tool for figuring out what you 
> were doing yesterday.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15046) Add a "history" command to cqlsh. Perhaps "show history"?

2023-05-23 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-15046:
-
Reviewers: Brandon Williams, Stefan Miklosovic  (was: Ekaterina Dimitrova, 
Paulo Motta, Stefan Miklosovic)
   Status: Review In Progress  (was: Needs Committer)

> Add a "history" command to cqlsh.  Perhaps "show history"?
> --
>
> Key: CASSANDRA-15046
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15046
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL/Interpreter
>Reporter: Wes Peters
>Assignee: Brad Schoening
>Priority: Low
>  Labels: lhf
> Fix For: 5.x
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> I was trying to capture some create keyspace and create table commands from 
> a running cqlsh, and found there was no equivalent to the '\s' history 
> command in Postgres' psql shell.  It's a great tool for figuring out what you 
> were doing yesterday.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-18512) nodetool describecluster command is not showing correct Down count.

2023-05-23 Thread Ranju (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ranju updated CASSANDRA-18512:
--
Mentor:   (was: Manish Khandelwal)

> nodetool describecluster command is not showing correct Down count.  
> -
>
> Key: CASSANDRA-18512
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18512
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/nodetool
>Reporter: Ranju
>Assignee: Ranju
>Priority: Normal
> Attachments: 
> 0001-Fix-Down-nodes-counter-in-nodetool-describeCluster-c.patch
>
>
> Some nodes are down in a cluster running Cassandra version 4.x.
> # The nodetool describecluster command output shows these IPs as unreachable.
> UNREACHABLE: [, , ]
> Stats for all nodes:
>     Live: 3
>     Joining: 0
>     Moving: 0
>     Leaving: 0
>     Unreachable: 3
> But under each data center, the count of down nodes is always shown as 0.
> Data Centers: 
>     dc1 #Nodes: 3 #Down: 0
>     dc2 #Nodes: 3 #Down: 0
>  
> Steps to reproduce:
>  # Set up two data centers dc1 and dc2, each datacenter having 3 nodes - 
> dc1:3,dc2:3.
>  # Bring down any 3 nodes across the two data centers.
>  # Run the nodetool describecluster command from a live node and compare the 
> Unreachable count, which is 3, with the Down count, which is 0; the two do 
> not match.
>  
> Expected Output: the Unreachable and Down counts should have the same value.
> Data Centers:
>         dc1 #Nodes: 3 #Down: 1
>         dc2 #Nodes: 3 #Down: 2
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18543) Waiting for gossip to settle does not wait for live endpoints

2023-05-23 Thread Cameron Zemek (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725280#comment-17725280
 ] 

Cameron Zemek commented on CASSANDRA-18543:
---

[^gossip4.patch]

Here is the patch applied to 4.0.4. I haven't done any testing of this patch 
against 4.x.

> Waiting for gossip to settle does not wait for live endpoints
> -
>
> Key: CASSANDRA-18543
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18543
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Cameron Zemek
>Priority: Normal
> Attachments: gossip.patch, gossip4.patch
>
>
> When a node starts it will get endpoint states (via a shadow round) but have 
> all nodes marked as down. The problem is that the wait to settle only checks 
> that the size of the endpoint states is stable before starting native 
> transport. Once native transport starts, it will receive queries and fail at 
> consistency levels such as LOCAL_QUORUM, since it still thinks the nodes are 
> down.
> This is a problem for a number of large clusters for our customers. The 
> cluster has quorum, but due to this issue a node restart is causing a bunch 
> of query errors.
> My initial solution to this was to check the live endpoints size in addition 
> to the size of the endpoint states. This worked, but while testing this fix I 
> noticed there is also a lot of duplication in checking the same node (via 
> Echo messages) for liveness, so the patch also removes this duplicate UP 
> check in markAlive.
> The final problem I found while testing is that sometimes I could still not 
> see a change in live endpoints due to the 1-second polling, so the patch 
> allows overriding the settle parameters. I could not reliably reproduce this 
> but think it's worth providing a way to override these hardcoded values.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-18543) Waiting for gossip to settle does not wait for live endpoints

2023-05-23 Thread Cameron Zemek (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cameron Zemek updated CASSANDRA-18543:
--
Attachment: gossip4.patch

> Waiting for gossip to settle does not wait for live endpoints
> -
>
> Key: CASSANDRA-18543
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18543
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Cameron Zemek
>Priority: Normal
> Attachments: gossip.patch, gossip4.patch
>
>
> When a node starts it will get endpoint states (via a shadow round) but have 
> all nodes marked as down. The problem is that the wait to settle only checks 
> that the size of the endpoint states is stable before starting native 
> transport. Once native transport starts, it will receive queries and fail at 
> consistency levels such as LOCAL_QUORUM, since it still thinks the nodes are 
> down.
> This is a problem for a number of large clusters for our customers. The 
> cluster has quorum, but due to this issue a node restart is causing a bunch 
> of query errors.
> My initial solution to this was to check the live endpoints size in addition 
> to the size of the endpoint states. This worked, but while testing this fix I 
> noticed there is also a lot of duplication in checking the same node (via 
> Echo messages) for liveness, so the patch also removes this duplicate UP 
> check in markAlive.
> The final problem I found while testing is that sometimes I could still not 
> see a change in live endpoints due to the 1-second polling, so the patch 
> allows overriding the settle parameters. I could not reliably reproduce this 
> but think it's worth providing a way to override these hardcoded values.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-18543) Waiting for gossip to settle does not wait for live endpoints

2023-05-23 Thread Cameron Zemek (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cameron Zemek updated CASSANDRA-18543:
--
Description: 
When a node starts it will get endpoint states (via a shadow round) but have 
all nodes marked as down. The problem is that the wait to settle only checks 
that the size of the endpoint states is stable before starting native 
transport. Once native transport starts, it will receive queries and fail at 
consistency levels such as LOCAL_QUORUM, since it still thinks the nodes are 
down.

This is a problem for a number of large clusters for our customers. The 
cluster has quorum, but due to this a node restart is causing a bunch of query 
errors.

My initial solution to this was to check the live endpoints size in addition 
to the size of the endpoint states. This worked, but while testing this fix I 
noticed there is also a lot of duplication in checking the same node (via Echo 
messages) for liveness, so the patch also removes this duplicate UP check in 
markAlive.

The final problem I found while testing is that sometimes I could still not 
see a change in live endpoints due to the 1-second polling, so the patch 
allows overriding the settle parameters. I could not reliably reproduce this 
but think it's worth providing a way to override these hardcoded values.

  was:
When a node starts it will get endpoint states (via a shadow round) but have 
all nodes marked as down. The problem is that the wait to settle only checks 
that the size of the endpoint states is stable before starting native 
transport. Once native transport starts, it will receive queries and fail at 
consistency levels such as LOCAL_QUORUM, since it still thinks the nodes are 
down.

My initial solution to this was to also check the live endpoints size in 
addition to the size of the endpoint states. This worked, but while testing 
this fix I noticed there is also a lot of duplication in checking the same 
node (via Echo messages) for liveness, so the patch also removes this 
duplicate UP check.

The final problem I found while testing is that sometimes I could still not 
see a change in live endpoints due to the 1-second polling, so the patch 
allows overriding the settle parameters.


> Waiting for gossip to settle does not wait for live endpoints
> -
>
> Key: CASSANDRA-18543
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18543
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Cameron Zemek
>Priority: Normal
> Attachments: gossip.patch
>
>
> When a node starts it will get endpoint states (via a shadow round) but have 
> all nodes marked as down. The problem is that the wait to settle only checks 
> that the size of the endpoint states is stable before starting native 
> transport. Once native transport starts, it will receive queries and fail at 
> consistency levels such as LOCAL_QUORUM, since it still thinks the nodes are 
> down.
> This is a problem for a number of large clusters for our customers. The 
> cluster has quorum, but due to this a node restart is causing a bunch of 
> query errors.
> My initial solution to this was to check the live endpoints size in addition 
> to the size of the endpoint states. This worked, but while testing this fix I 
> noticed there is also a lot of duplication in checking the same node (via 
> Echo messages) for liveness, so the patch also removes this duplicate UP 
> check in markAlive.
> The final problem I found while testing is that sometimes I could still not 
> see a change in live endpoints due to the 1-second polling, so the patch 
> allows overriding the settle parameters. I could not reliably reproduce this 
> but think it's worth providing a way to override these hardcoded values.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-18543) Waiting for gossip to settle does not wait for live endpoints

2023-05-23 Thread Cameron Zemek (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cameron Zemek updated CASSANDRA-18543:
--
Description: 
When a node starts it will get endpoint states (via a shadow round) but have 
all nodes marked as down. The problem is that the wait to settle only checks 
that the size of the endpoint states is stable before starting native 
transport. Once native transport starts, it will receive queries and fail at 
consistency levels such as LOCAL_QUORUM, since it still thinks the nodes are 
down.

This is a problem for a number of large clusters for our customers. The 
cluster has quorum, but due to this issue a node restart is causing a bunch of 
query errors.

My initial solution to this was to check the live endpoints size in addition 
to the size of the endpoint states. This worked, but while testing this fix I 
noticed there is also a lot of duplication in checking the same node (via Echo 
messages) for liveness, so the patch also removes this duplicate UP check in 
markAlive.

The final problem I found while testing is that sometimes I could still not 
see a change in live endpoints due to the 1-second polling, so the patch 
allows overriding the settle parameters. I could not reliably reproduce this 
but think it's worth providing a way to override these hardcoded values.

  was:
When a node starts it will get endpoint states (via a shadow round) but have 
all nodes marked as down. The problem is that the wait to settle only checks 
that the size of the endpoint states is stable before starting native 
transport. Once native transport starts, it will receive queries and fail at 
consistency levels such as LOCAL_QUORUM, since it still thinks the nodes are 
down.

This is a problem for a number of large clusters for our customers. The 
cluster has quorum, but due to this a node restart is causing a bunch of query 
errors.

My initial solution to this was to check the live endpoints size in addition 
to the size of the endpoint states. This worked, but while testing this fix I 
noticed there is also a lot of duplication in checking the same node (via Echo 
messages) for liveness, so the patch also removes this duplicate UP check in 
markAlive.

The final problem I found while testing is that sometimes I could still not 
see a change in live endpoints due to the 1-second polling, so the patch 
allows overriding the settle parameters. I could not reliably reproduce this 
but think it's worth providing a way to override these hardcoded values.


> Waiting for gossip to settle does not wait for live endpoints
> -
>
> Key: CASSANDRA-18543
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18543
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Cameron Zemek
>Priority: Normal
> Attachments: gossip.patch
>
>
> When a node starts it will get endpoint states (via a shadow round) but have 
> all nodes marked as down. The problem is that the wait to settle only checks 
> that the size of the endpoint states is stable before starting native 
> transport. Once native transport starts, it will receive queries and fail at 
> consistency levels such as LOCAL_QUORUM, since it still thinks the nodes are 
> down.
> This is a problem for a number of large clusters for our customers. The 
> cluster has quorum, but due to this issue a node restart is causing a bunch 
> of query errors.
> My initial solution to this was to check the live endpoints size in addition 
> to the size of the endpoint states. This worked, but while testing this fix I 
> noticed there is also a lot of duplication in checking the same node (via 
> Echo messages) for liveness, so the patch also removes this duplicate UP 
> check in markAlive.
> The final problem I found while testing is that sometimes I could still not 
> see a change in live endpoints due to the 1-second polling, so the patch 
> allows overriding the settle parameters. I could not reliably reproduce this 
> but think it's worth providing a way to override these hardcoded values.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18084) Introduce tags to snitch for better decision making for replica placement in topology strategies

2023-05-23 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725264#comment-17725264
 ] 

Stefan Miklosovic commented on CASSANDRA-18084:
---

I am not sure what is so hard about looking into this for 10 minutes to tell me 
whether I am on the right path and all I need to do is finish the tests. I am 
working on easy patches on purpose, as it is almost impossible to merge 
anything bigger.

> Introduce tags to snitch for better decision making for replica placement in 
> topology strategies
> 
>
> Key: CASSANDRA-18084
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18084
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Cluster/Gossip, Cluster/Schema, Legacy/Distributed 
> Metadata
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We would like to have extra meta-information in cassandra-rackdc.properties 
> which would further differentiate nodes in dc / racks.
> The main motivation behind this is that we have special requirements around 
> a node's characteristics, based on which we want to make further decisions 
> when it comes to replica placement in topology strategies. (A new topology 
> strategy would mostly be derived from NTS / would extend it.)
> The most reasonable way to do that is to introduce a new property into 
> cassandra-rackdc.properties called "tags"
> {code:java}
> # Arbitrary tags to assign to this node; they serve as an additional 
> identifier of a node based on which operators might act. 
> # Value of this property is meant to be a comma-separated list of strings.
> #tags=tag1,tag2 
> {code}
> We also want to introduce a new application state called TAGS. On startup, a 
> node would advertise its tags to the cluster and, vice versa, all nodes would 
> tell that node what tags they have, so everybody would see the same state of 
> tags across the cluster, based on which topology strategies would make the 
> same decisions.
> These tags are not meant to be changed during the whole runtime of a node, 
> just as the datacenter and rack are not.
> For performance reasons, we might limit the maximum size of the tags (the sum 
> of their lengths) to, for example, 64 characters; anything bigger would 
> either be shortened or startup would fail.
> Once we have tags for all nodes, we can access them, cluster-wide, from 
> TokenMetadata, which is used quite heavily in topology strategies and exposes 
> other relevant topology information (dcs, racks, ...). We would just add a 
> new way to look at nodes.
> Tags would be a set.
> This would be persisted to system.local to see what tags the local node has, 
> and to system.peers_v2 to see what tags all other nodes have. The column 
> would be called "tags".
> {code:java}
> admin@cqlsh> select * from system.local ;
> @ Row 1
>  key | local
>  bootstrapped| COMPLETED
>  broadcast_address   | 172.19.0.5
>  broadcast_port  | 7000
>  cluster_name| Test Cluster
>  cql_version | 3.4.6
>  data_center | dc1
>  gossip_generation   | 1669739177
>  host_id | 54f8c6ea-a6ba-40c5-8fa5-484b2b4184c9
>  listen_address  | 172.19.0.5
>  listen_port | 7000
>  native_protocol_version | 5
>  partitioner | org.apache.cassandra.dht.Murmur3Partitioner
>  rack| rack1
>  release_version | 4.2-SNAPSHOT
>  rpc_address | 172.19.0.5
>  rpc_port| 9042
>  schema_version  | ef865449-2491-33b8-95b0-47c09cb14ea9
>  tags| {'tag1', 'tag2'}
>  tokens  | {'6504358681601109713'} 
> {code}
> for system.peers_v2:
> {code:java}
> admin@cqlsh> select peer,tags from system.peers_v2 ;
> @ Row 1
> --+-
>  peer | 172.19.0.15
>  tags | {'tag2', 'tag3'}
> @ Row 2
> --+-
>  peer | 172.19.0.11
>  tags | null
> {code}
> The POC implementation doing exactly that is here:
> We do not want to provide our custom topology strategies which use this 
> feature, as that will most probably be a proprietary solution. This might 
> indeed change in the future. For now, we just want to implement hooks we can 
> base our in-house implementation on. All other people can benefit from this 
> as well if they choose to, as this feature enables them to do that.
> Adding tags is not only about custom topology strategies. Operators could tag 
> their nodes if they wish to make further distinctions at the topology level 
> for their operational needs.
> [https://github.com/instaclustr/cassandra/co

[jira] [Commented] (CASSANDRA-12937) Default setting (yaml) for SSTable compression

2023-05-23 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725262#comment-17725262
 ] 

Stefan Miklosovic commented on CASSANDRA-12937:
---

I added all my changes and fixed tests in this branch 
https://github.com/instaclustr/cassandra/commits/CASSANDRA-12937

I am running a Jenkins build to see how it looks. We still need to cover the 
Python dtests.

Feel free to go through that branch. I squashed and rebased your patch and 
applied changes on top. 

> Default setting (yaml) for SSTable compression
> --
>
> Key: CASSANDRA-12937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12937
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: Michael Semb Wever
>Assignee: Claude Warren
>Priority: Low
>  Labels: AdventCalendar2021, lhf
> Fix For: 5.x
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> In many situations the choice of compression for sstables is more relevant to 
> the disks attached than to the schema and data.
> This issue is to add to cassandra.yaml a default value for sstable 
> compression that new tables will inherit (instead of the defaults found in 
> {{CompressionParams.DEFAULT}}).
> Examples where this can be relevant are filesystems that do on-the-fly 
> compression (btrfs, zfs) or specific disk configurations or even specific C* 
> versions (see CASSANDRA-10995 ).
> +Additional information for newcomers+
> Some new fields need to be added to {{cassandra.yaml}} to allow specifying 
> the default compression parameters. In {{DatabaseDescriptor}} a new 
> {{CompressionParams}} field should be added for the default compression. 
> This field should be initialized in 
> {{DatabaseDescriptor.applySimpleConfig()}}. At the different places where 
> {{CompressionParams.DEFAULT}} was used, the code should call 
> {{DatabaseDescriptor#getDefaultCompressionParams}}, which should return a 
> copy of the configured {{CompressionParams}}.
> A unit test using {{OverrideConfigurationLoader}} should verify that the 
> table schema uses the new default when a new table is created (see 
> {{CreateTest}} for an example).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-12937) Default setting (yaml) for SSTable compression

2023-05-23 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-12937:
--
Status: Patch Available  (was: Review In Progress)

> Default setting (yaml) for SSTable compression
> --
>
> Key: CASSANDRA-12937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12937
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: Michael Semb Wever
>Assignee: Claude Warren
>Priority: Low
>  Labels: AdventCalendar2021, lhf
> Fix For: 5.x
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> In many situations the choice of compression for sstables is more relevant to 
> the disks attached than to the schema and data.
> This issue is to add to cassandra.yaml a default value for sstable 
> compression that new tables will inherit (instead of the defaults found in 
> {{CompressionParams.DEFAULT}}).
> Examples where this can be relevant are filesystems that do on-the-fly 
> compression (btrfs, zfs) or specific disk configurations or even specific C* 
> versions (see CASSANDRA-10995 ).
> +Additional information for newcomers+
> Some new fields need to be added to {{cassandra.yaml}} to allow specifying 
> the default compression parameters. In {{DatabaseDescriptor}} a new 
> {{CompressionParams}} field should be added for the default compression. 
> This field should be initialized in 
> {{DatabaseDescriptor.applySimpleConfig()}}. At the different places where 
> {{CompressionParams.DEFAULT}} was used, the code should call 
> {{DatabaseDescriptor#getDefaultCompressionParams}}, which should return a 
> copy of the configured {{CompressionParams}}.
> A unit test using {{OverrideConfigurationLoader}} should verify that the 
> table schema uses the new default when a new table is created (see 
> {{CreateTest}} for an example).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org