[jira] [Updated] (CASSANDRA-16807) Weak visibility guarantees of Accumulator lead to failed assertions during digest comparison

2021-07-20 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-16807:

Fix Version/s: (was: 4.0.x)

> Weak visibility guarantees of Accumulator lead to failed assertions during 
> digest comparison
> 
>
> Key: CASSANDRA-16807
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16807
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.0, 4.0-rc, 4.x
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> This problem could manifest on all versions, beginning with at least 3.0, but 
> I’ll focus on the way it manifests in 4.0 here.
> In what now seems like a wise move, CASSANDRA-16097 added an assertion to 
> {{DigestResolver#responsesMatch()}} that ensures the responses snapshot has at 
> least one visible element to compare (although of course a single element 
> trivially cannot generate a mismatch and short-circuits immediately). 
> However, at the point {{ReadCallback#onResponse()}} signals the waiting 
> resolver, there is no guarantee that the size of the generated snapshot of 
> the responses {{Accumulator}} is non-zero, or perhaps more worryingly, at 
> least equal to the number of blocked-for responses. This seems to be a 
> consequence of the documented weak visibility guarantees on 
> {{Accumulator#add()}}. In short, if there are concurrent invocations of 
> {{add()}}, it is not guaranteed that there is any visible size change after any 
> one of them returns, but only after all complete.
> The particular exception looks something like this:
> {noformat}
> java.lang.AssertionError: Attempted response match comparison while no responses have been received.
>   at org.apache.cassandra.service.reads.DigestResolver.responsesMatch(DigestResolver.java:110)
>   at org.apache.cassandra.service.reads.AbstractReadExecutor.awaitResponses(AbstractReadExecutor.java:393)
>   at org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:2150)
>   at org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1979)
>   at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1882)
>   at org.apache.cassandra.db.SinglePartitionReadCommand$Group.execute(SinglePartitionReadCommand.java:1121)
>   at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:296)
>   at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:248)
>   at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:90)
> {noformat}
> It’s possible to reproduce this on simple single-partition reads without any 
> short-read protection or replica filtering protection. I’ve also been able to 
> reproduce this synthetically with [a unit 
> test|https://github.com/apache/cassandra/pull/1110] on {{ReadCallback}}.
> It seems like the most straightforward way to fix this would be to avoid 
> signaling in {{ReadCallback#onResponse()}} until the visible size of the 
> accumulator is at least the number of received responses. In most cases, this 
> is trivially true, and our signaling behavior won’t change at all. In the 
> very rare case that there are two (or more) concurrent calls to 
> {{onResponse()}}, the second (or last) will signal, and having one more 
> response than we strictly need should have no negative side effects. (We 
> don’t seem to make any strict assertions about having exactly the number of 
> required responses, only that we have enough.)
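
The guard proposed above can be sketched in Java as follows. This is only an illustration under stated assumptions: {{ToyAccumulator}} and {{Callback}} are hypothetical stand-ins, not Cassandra's {{Accumulator}} or {{ReadCallback}}, and the two-step reserve()/fill() split exists purely so that an interleaving of two concurrent adds can be replayed deterministically on a single thread.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SignalGuardSketch {
    /** Hypothetical accumulator: items become visible only as a contiguous filled prefix. */
    static final class ToyAccumulator {
        private final Object[] slots;
        private int next;     // next slot to hand out
        private int visible;  // length of the contiguous filled prefix

        ToyAccumulator(int capacity) { slots = new Object[capacity]; }

        /** First half of an add: claim a slot. */
        synchronized int reserve() { return next++; }

        /** Second half of an add: store the item, then extend the visible prefix as far as possible. */
        synchronized void fill(int slot, Object item) {
            slots[slot] = item;
            while (visible < slots.length && slots[visible] != null) visible++;
        }

        synchronized int size() { return visible; }
    }

    /** Hypothetical callback that signals only once enough responses are actually visible. */
    static final class Callback {
        final ToyAccumulator responses;
        final int blockFor;
        final AtomicInteger received = new AtomicInteger();
        boolean signaled; // single-threaded demo, so no volatile needed here

        Callback(int capacity, int blockFor) {
            this.responses = new ToyAccumulator(capacity);
            this.blockFor = blockFor;
        }

        /** Invoked after a response's add has returned. */
        void onResponseAdded() {
            int n = received.incrementAndGet();
            // Guard: the visible size must have caught up with the received count.
            if (n >= blockFor && responses.size() >= n) signaled = true;
        }
    }

    public static void main(String[] args) {
        Callback cb = new Callback(2, 1);
        int a = cb.responses.reserve(); // response A claims slot 0, then stalls
        int b = cb.responses.reserve(); // response B claims slot 1
        cb.responses.fill(b, "B");      // B's add returns, but nothing is visible yet
        cb.onResponseAdded();
        System.out.println(cb.signaled + " " + cb.responses.size()); // prints: false 0
        cb.responses.fill(a, "A");      // A completes; both items become visible
        cb.onResponseAdded();
        System.out.println(cb.signaled + " " + cb.responses.size()); // prints: true 2
    }
}
```

With a blocked-for count of 1, the first completed add does not signal because its item is not yet visible; the last add signals, and the waiter then sees one more response than it strictly needs.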



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-16808) Pre-4.0 FWD_FRM message parameter serialization and message-id forwarding is incorrect

2021-07-20 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-16808:

Fix Version/s: (was: 4.0.1)

> Pre-4.0 FWD_FRM message parameter serialization and message-id forwarding is 
> incorrect
> --
>
> Key: CASSANDRA-16808
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16808
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
> Fix For: 4.0, 4.0-rc, 4.x
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Fixing CASSANDRA-16797 has exposed an issue with the way {{FWD_FRM}} is 
> serialized.
> In the code cleanup during the internode messaging refactor, the 
> serialization for {{FWD_FRM}} (the endpoint to respond to for forwarded 
> messages) was implemented using the same serialization format as 
> {{CompactEndpointSerializationHelper}}, which prefixes the address bytes with 
> their length. However, the pre-4.0 {{FWD_FRM}} parameter value does not include 
> a length; pre-4.0 nodes just convert the parameter value to an {{InetAddress}}.
> In a mixed-version cluster this causes the pre-4.0 nodes to fail when 
> deserializing the mutation:
> {code:java}
> java.lang.RuntimeException: java.net.UnknownHostException: addr is of illegal length
>   at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72) ~[dtest-3.0.25.jar:na]
>   at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[na:na]
>   at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) ~[dtest-3.0.25.jar:na]
>   at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134) ~[dtest-3.0.25.jar:na]
>   at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) ~[dtest-3.0.25.jar:na]
>   at java.base/java.lang.Thread.run(Thread.java:834) ~[na:na]
> Caused by: java.net.UnknownHostException: addr is of illegal length
>   at java.base/java.net.InetAddress.getByAddress(InetAddress.java:1208) ~[na:na]
>   at java.base/java.net.InetAddress.getByAddress(InetAddress.java:1571) ~[na:na]
>   at org.apache.cassandra.db.MutationVerbHandler.doVerb(MutationVerbHandler.java:57) ~[dtest-3.0.25.jar:na]
>   at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) ~[dtest-3.0.25.jar:na]
>   ... 5 common frames omitted
> {code}
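
The failure mode in the trace above can be reproduced outside Cassandra with a small Java sketch. It assumes, for illustration only, a one-byte length prefix in front of the address bytes; the class and helper names are hypothetical. The only real API relied on is {{InetAddress.getByAddress(byte[])}}, which rejects arrays that are not 4 or 16 bytes long.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class FwdFrmLengthSketch {
    /** Returns true if the bytes can be read directly as a raw IPv4 or IPv6 address. */
    static boolean parsesAsRawAddress(byte[] value) {
        try {
            InetAddress.getByAddress(value);
            return true;
        } catch (UnknownHostException e) {
            // Thrown when the array is neither 4 nor 16 bytes long.
            return false;
        }
    }

    /** Hypothetical length-prefixed encoding: one length byte, then the address bytes. */
    static byte[] withLengthPrefix(byte[] address) {
        byte[] out = new byte[address.length + 1];
        out[0] = (byte) address.length;
        System.arraycopy(address, 0, out, 1, address.length);
        return out;
    }

    public static void main(String[] args) {
        byte[] raw = {10, 0, 0, 1};
        // A receiver that expects raw bytes accepts the 4-byte form...
        System.out.println(parsesAsRawAddress(raw));                   // prints: true
        // ...but rejects the 5-byte length-prefixed form.
        System.out.println(parsesAsRawAddress(withLengthPrefix(raw))); // prints: false
    }
}
```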
> Unfortunately there isn't a clean fix I can see, as 
> {{org.apache.cassandra.io.IVersionedAsymmetricSerializer#deserialize}}, used 
> to deserialize the {{FWD_FRM}} address, does not take a maximum length to 
> deserialize, and it's impossible to know definitively whether it's an IPv4 or 
> IPv6 address from the first four bytes.
> The patch I'm submitting special-cases the deserialization of pre-4.0 
> {{FWD_FRM}} parameters in the {{Message}} deserializer. That seems preferable to 
> extending the deserialization interface or creating a new {{DataInputBuffer}} 
> limited by the parameter value length.
> Once that was fixed, the INSERT statements were still failing, which I tracked 
> down to the 4.0 optimization of serializing the forwarded message once if the 
> message ID is the same:
>  
> [https://github.com/apache/cassandra/blob/cassandra-4.0/src/java/org/apache/cassandra/db/MutationVerbHandler.java#L76]
> In the test case I wrote, only one message was being forwarded, and it had a 
> different ID from the original forwarded message. The {{useSameMessageID}} 
> method only checked message IDs within the forwarded messages.
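
To make the distinction concrete, here is a hedged Java sketch of the two checks. Both method names and the {{long[]}} representation are hypothetical and are not taken from the Cassandra source; the sketch only contrasts comparing forwarded IDs with each other against also comparing them with the original message's ID.

```java
public class SameMessageIdSketch {
    /** Checks only that the forwarded IDs agree with each other (hypothetical helper). */
    static boolean sameAmongForwards(long[] forwardIds) {
        for (int i = 1; i < forwardIds.length; i++)
            if (forwardIds[i] != forwardIds[0]) return false;
        return true; // trivially true for a single forwarded message
    }

    /** Also requires every forwarded ID to equal the original message's ID (hypothetical helper). */
    static boolean sameAsOriginal(long originalId, long[] forwardIds) {
        for (long id : forwardIds)
            if (id != originalId) return false;
        return true;
    }

    public static void main(String[] args) {
        long originalId = 7L;
        long[] forwardIds = {42L}; // one forward, with a different ID than the original
        System.out.println(sameAmongForwards(forwardIds));          // prints: true
        System.out.println(sameAsOriginal(originalId, forwardIds)); // prints: false
    }
}
```

With a single forwarded message, the first check passes trivially even though the ID differs from the original, which matches the situation described for the test case.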
>  
> Code Details:
> When {{MutationVerbHandler.forwardToLocalNodes}} is constructing the forwarding 
> message, it just stores the byte array representing the IPv4 or IPv6 
> address in the parameter array 
> (link: 
> [https://github.com/apache/cassandra/blob/44604b7316fcbfd7d0d7425e75cd7ebe267e3247/src/java/org/apache/cassandra/db/MutationVerbHandler.java#L90]):
> {code:java}
> private static void forwardToLocalNodes(Mutation mutation, MessagingService.Verb verb, byte[] forwardBytes, InetAddress from) throws IOException
> {
>     try (DataInputStream in = new DataInputStream(new FastByteArrayInputStream(forwardBytes)))
>     {
>         int size = in.readInt();
>         // tell the recipients who to send their ack to
>         MessageOut message = new MessageOut<>(verb, mutation, Mutation.serializer).withParameter(Mutation.FORWARD_FROM, from.getAddress());
> {code}
> When the message is serialized in 3.0 MessageOut.serialize, that raw 

[jira] [Commented] (CASSANDRA-16807) Weak visibility guarantees of Accumulator lead to failed assertions during digest comparison

2021-07-20 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384644#comment-17384644
 ] 

Caleb Rackliffe commented on CASSANDRA-16807:
-

Alright, the trunk patch is looking good, and our only test failure is 
{{HostReplacementTest#replaceAliveHost}}, which appears to be failing on other 
trunk-based branches as we speak.

> Weak visibility guarantees of Accumulator lead to failed assertions during 
> digest comparison
> 
>
> Key: CASSANDRA-16807
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16807
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.0-rc, 4.0.x, 4.x
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>



[jira] [Updated] (CASSANDRA-16807) Weak visibility guarantees of Accumulator lead to failed assertions during digest comparison

2021-07-20 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-16807:

Fix Version/s: 4.0

> Weak visibility guarantees of Accumulator lead to failed assertions during 
> digest comparison
> 
>
> Key: CASSANDRA-16807
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16807
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.0, 4.0-rc, 4.0.x, 4.x
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>



[jira] [Comment Edited] (CASSANDRA-16808) Pre-4.0 FWD_FRM message parameter serialization and message-id forwarding is incorrect

2021-07-20 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384630#comment-17384630
 ] 

Caleb Rackliffe edited comment on CASSANDRA-16808 at 7/21/21, 4:09 AM:
---

The 4.0.0 test run is perfect.

[~jmeredithco] 4.0 and trunk aren't quite there. We're seeing CASSANDRA-16803 
and some other assorted timeout and heap space problems, but those are almost 
certainly unrelated. The troubling bit is what looks like 
{{MixedModeMessageForwardTest.checkWritesForwardedToOtherDcTest}}, which is 
new, 
[failing|https://app.circleci.com/pipelines/github/maedhroz/cassandra/295/workflows/1e5bff9f-36db-46c8-bdf2-419f830a4b3f/jobs/1792/tests#failed-test-1].

UPDATE: This is a 3.0 -> 3.11 problem, so I'm slightly refactoring the 4.0 and 
trunk patches to mimic the single upgrade path test in 4.0.0, which is all we 
should need...


was (Author: maedhroz):
The 4.0.0 test run is perfect.

[~jmeredithco] 4.0 and trunk aren't quite there. We're seeing CASSANDRA-16803 
and some other assorted timeout and heap space problems, but those are almost 
certainly unrelated. The troubling bit is what looks like 
{{MixedModeMessageForwardTest.checkWritesForwardedToOtherDcTest}}, which is 
new, 
[failing|https://app.circleci.com/pipelines/github/maedhroz/cassandra/295/workflows/1e5bff9f-36db-46c8-bdf2-419f830a4b3f/jobs/1792/tests#failed-test-1].

> Pre-4.0 FWD_FRM message parameter serialization and message-id forwarding is 
> incorrect
> --
>
> Key: CASSANDRA-16808
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16808
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
> Fix For: 4.0, 4.0.1, 4.0-rc, 4.x
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>

[jira] [Commented] (CASSANDRA-16808) Pre-4.0 FWD_FRM message parameter serialization and message-id forwarding is incorrect

2021-07-20 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384630#comment-17384630
 ] 

Caleb Rackliffe commented on CASSANDRA-16808:
-

The 4.0.0 test run is perfect.

[~jmeredithco] 4.0 and trunk aren't quite there. We're seeing CASSANDRA-16803 
and some other assorted timeout and heap space problems, but those are almost 
certainly unrelated. The troubling bit is what looks like 
{{MixedModeMessageForwardTest.checkWritesForwardedToOtherDcTest}}, which is 
new, 
[failing|https://app.circleci.com/pipelines/github/maedhroz/cassandra/295/workflows/1e5bff9f-36db-46c8-bdf2-419f830a4b3f/jobs/1792/tests#failed-test-1].

> Pre-4.0 FWD_FRM message parameter serialization and message-id forwarding is 
> incorrect
> --
>
> Key: CASSANDRA-16808
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16808
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
> Fix For: 4.0, 4.0.1, 4.0-rc, 4.x
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>

[jira] [Updated] (CASSANDRA-16808) Pre-4.0 FWD_FRM message parameter serialization and message-id forwarding is incorrect

2021-07-20 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-16808:

Fix Version/s: 4.0

> Pre-4.0 FWD_FRM message parameter serialization and message-id forwarding is 
> incorrect
> --
>
> Key: CASSANDRA-16808
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16808
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
> Fix For: 4.0, 4.0.1, 4.0-rc, 4.x
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>

[jira] [Commented] (CASSANDRA-16807) Weak visibility guarantees of Accumulator lead to failed assertions during digest comparison

2021-07-20 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384514#comment-17384514
 ] 

Caleb Rackliffe commented on CASSANDRA-16807:
-

Thanks [~adelapena]. I've applied all but one of your suggestions. Just waiting 
for a final test run: 
https://app.circleci.com/pipelines/github/maedhroz/cassandra?branch=CASSANDRA-16807-trunk

> Weak visibility guarantees of Accumulator lead to failed assertions during 
> digest comparison
> 
>
> Key: CASSANDRA-16807
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16807
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.0-rc, 4.0.x, 4.x
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> This problem could manifest on all versions, beginning on at least 3.0, but 
> I’ll focus on the way it manifests in 4.0 here.
> In what now seems like a wise move, CASSANDRA-16097 added an assertion to 
> {{DigestResolver#responsesMatch()}} that ensures the responses snapshot has 
> at least one visible element to compare (although of course a single element 
> trivially cannot generate a mismatch and short-circuits immediately). 
> However, at the point {{ReadCallback#onResponse()}} signals the waiting 
> resolver, there is no guarantee that the size of the generated snapshot of 
> the responses {{Accumulator}} is non-zero, or perhaps more worryingly, at 
> least equal to the number of blocked-for responses. This seems to be a 
> consequence of the documented weak visibility guarantees on 
> {{Accumulator#add()}}. In short, if there are concurrent invocations of 
> {{add()}}, it is not guaranteed that there is any visible size change after 
> any one of them returns, but only after all of them complete.
> The particular exception looks something like this:
> {noformat}
> java.lang.AssertionError: Attempted response match comparison while no 
> responses have been received.
>   at 
> org.apache.cassandra.service.reads.DigestResolver.responsesMatch(DigestResolver.java:110)
>   at 
> org.apache.cassandra.service.reads.AbstractReadExecutor.awaitResponses(AbstractReadExecutor.java:393)
>   at 
> org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:2150)
>   at 
> org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1979)
>   at 
> org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1882)
>   at 
> org.apache.cassandra.db.SinglePartitionReadCommand$Group.execute(SinglePartitionReadCommand.java:1121)
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:296)
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:248)
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:90)
> {noformat}
> It’s possible to reproduce this on simple single-partition reads without any 
> short-read protection or replica filtering protection. I’ve also been able to 
> reproduce this synthetically with [a unit 
> test|https://github.com/apache/cassandra/pull/1110] on {{ReadCallback}}.
> It seems like the most straightforward way to fix this would be to avoid 
> signaling in {{ReadCallback#onResponse()}} until the visible size of the 
> accumulator is at least the number of received responses. In most cases, this 
> is trivially true, and our signaling behavior won’t change at all. In the 
> very rare case that there are two (or more) concurrent calls to 
> {{onResponse()}}, the second (or last) will signal, and having one more 
> response than we strictly need should have no negative side effects. (We 
> don’t seem to make any strict assertions about having exactly the number of 
> required responses, only that we have enough.)
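
A minimal sketch of the interleaving described above, replayed on a single thread so that it is deterministic. This is not the Cassandra {{Accumulator}}: the {{SlotAccumulator}} class, its {{reserve()}}/{{publish()}} split, and {{shouldSignal()}} are hypothetical names invented for illustration. The sketch assumes, for the sake of the example, that the visible size only advances over a contiguous prefix of filled slots.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReferenceArray;

class VisibilitySketch
{
    /** Hypothetical stand-in for an accumulator whose add() is split into its two steps. */
    static final class SlotAccumulator<E>
    {
        private final AtomicReferenceArray<E> slots;
        private final AtomicInteger nextIndex = new AtomicInteger();    // slots claimed so far
        private final AtomicInteger visibleCount = new AtomicInteger(); // contiguous filled prefix

        SlotAccumulator(int capacity)
        {
            slots = new AtomicReferenceArray<>(capacity);
        }

        /** Step 1 of add(): claim the next free slot. */
        int reserve()
        {
            return nextIndex.getAndIncrement();
        }

        /** Step 2 of add(): fill the slot, then advance the visible count over filled slots. */
        void publish(int slot, E item)
        {
            slots.set(slot, item);
            while (true)
            {
                int cur = visibleCount.get();
                // Stop when full, or when an earlier slot has been claimed but not yet filled.
                if (cur == slots.length() || slots.get(cur) == null)
                    return;
                visibleCount.compareAndSet(cur, cur + 1);
            }
        }

        int size()
        {
            return visibleCount.get();
        }
    }

    /** Assumed signaling rule: enough responses received, and all of them visible. */
    static boolean shouldSignal(int received, int blockFor, int visibleSize)
    {
        return received >= blockFor && visibleSize >= received;
    }

    /** Replays two overlapping add() calls in which the later slot is filled first. */
    static String replayInterleaving()
    {
        int blockFor = 1;
        SlotAccumulator<String> responses = new SlotAccumulator<>(2);

        int slotA = responses.reserve(); // caller A claims slot 0
        int slotB = responses.reserve(); // caller B claims slot 1

        responses.publish(slotB, "response-from-B"); // B returns first
        int sizeAfterB = responses.size();           // slot 0 is still empty
        boolean signalAfterB = shouldSignal(1, blockFor, sizeAfterB);

        responses.publish(slotA, "response-from-A"); // A returns last
        int sizeAfterA = responses.size();           // both slots are now visible
        boolean signalAfterA = shouldSignal(2, blockFor, sizeAfterA);

        return sizeAfterB + ":" + signalAfterB + "," + sizeAfterA + ":" + signalAfterA;
    }

    public static void main(String[] args)
    {
        System.out.println(replayInterleaving());
    }
}
```

In this replay, the first caller to return sees a visible size of 0 and does not signal, even though one response has nominally been received. The last caller sees a visible size of 2 and signals, which is the "second (or last) will signal" behaviour proposed in the description.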



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16808) Pre-4.0 FWD_FRM message parameter serialization and message-id forwarding is incorrect

2021-07-20 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384500#comment-17384500
 ] 

Caleb Rackliffe commented on CASSANDRA-16808:
-

Here are the pre-commit branches and in-progress Circle CI runs...

||[4.0.0|https://github.com/apache/cassandra/pull/1112]|[Circle 
CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra?branch=CASSANDRA-16808-4.0.0]||
||[4.0|https://github.com/apache/cassandra/pull/1113]|[Circle 
CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra?branch=CASSANDRA-16808-4.0]||
||[trunk|https://github.com/apache/cassandra/pull/1114]|[Circle 
CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra?branch=CASSANDRA-16808-trunk]|
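
The {{UnknownHostException}} in the issue quoted below can be reproduced in isolation with plain JDK calls. This is a minimal sketch, not the Cassandra serialization code: the {{FwdFrmSketch}} class and its helpers are hypothetical, and the one-byte length prefix is an assumption made for illustration. It shows only that a receiver which passes the whole parameter value to {{InetAddress.getByAddress}} accepts raw address bytes and rejects a length-prefixed value.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class FwdFrmSketch
{
    /** Raw form: just the 4 (IPv4) or 16 (IPv6) address bytes. */
    static byte[] rawAddress(InetAddress address)
    {
        return address.getAddress();
    }

    /** Length-prefixed form; the one-byte prefix is an assumption made for this sketch. */
    static byte[] lengthPrefixed(InetAddress address)
    {
        byte[] raw = address.getAddress();
        byte[] out = new byte[raw.length + 1];
        out[0] = (byte) raw.length;
        System.arraycopy(raw, 0, out, 1, raw.length);
        return out;
    }

    /** What a receiver expecting the raw form does: hand the whole value to InetAddress. */
    static String decodeAsRaw(byte[] parameterValue)
    {
        try
        {
            return InetAddress.getByAddress(parameterValue).getHostAddress();
        }
        catch (UnknownHostException e)
        {
            // Thrown when the array is neither 4 nor 16 bytes long.
            return "failed: " + e.getMessage();
        }
    }

    /** Decodes the same address in both forms and returns the two outcomes. */
    static String[] demo()
    {
        InetAddress from;
        try
        {
            from = InetAddress.getByAddress(new byte[] { 10, 0, 0, 1 });
        }
        catch (UnknownHostException e)
        {
            throw new IllegalStateException(e);
        }
        return new String[] { decodeAsRaw(rawAddress(from)), decodeAsRaw(lengthPrefixed(from)) };
    }

    public static void main(String[] args)
    {
        for (String outcome : demo())
            System.out.println(outcome);
    }
}
```

The raw four-byte value decodes to the original address, while the five-byte prefixed value is rejected. The receiver cannot recover from this on its own, because the value alone does not say whether a prefix is present.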

> Pre-4.0 FWD_FRM message parameter serialization and message-id forwarding is 
> incorrect
> --
>
> Key: CASSANDRA-16808
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16808
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
> Fix For: 4.0.1, 4.0-rc, 4.x
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Fixing CASSANDRA-16797 has exposed an issue with the way {{FWD_FRM}} is 
> serialized.
> In the code cleanup during the internode messaging refactor, the 
> serialization for {{FWD_FRM}} (the endpoint to respond to for forwarded 
> messages) was implemented using the same serialization format as 
> {{CompactEndpointSerializationHelper}}, which prefixes the address bytes with 
> their length. However, the pre-4.0 {{FWD_FRM}} parameter value does not 
> include a length; the receiver just converts the parameter value to an 
> {{InetAddress}}.
> In a mixed-version cluster this causes the pre-4.0 nodes to fail when 
> deserializing the mutation:
> {code:java}
> java.lang.RuntimeException: java.net.UnknownHostException: addr is of illegal 
> length
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72) 
> ~[dtest-3.0.25.jar:na]
> at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>  ~[na:na]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
>  ~[dtest-3.0.25.jar:na]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134)
>  ~[dtest-3.0.25.jar:na]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
> ~[dtest-3.0.25.jar:na]
> at java.base/java.lang.Thread.run(Thread.java:834) ~[na:na]
> Caused by: java.net.UnknownHostException: addr is of illegal length
> at java.base/java.net.InetAddress.getByAddress(InetAddress.java:1208) 
> ~[na:na]
> at java.base/java.net.InetAddress.getByAddress(InetAddress.java:1571) 
> ~[na:na]
> at 
> org.apache.cassandra.db.MutationVerbHandler.doVerb(MutationVerbHandler.java:57)
>  ~[dtest-3.0.25.jar:na]
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) 
> ~[dtest-3.0.25.jar:na]
> ... 5 common frames omitted
> {code}
> Unfortunately, there isn't a clean fix that I can see, as 
> {{org.apache.cassandra.io.IVersionedAsymmetricSerializer#deserialize}}, used 
> to deserialize the {{FWD_FRM}} address, does not take a maximum length to 
> deserialize, and it's impossible to know definitively whether it's an IPv4 or 
> IPv6 address from the first four bytes.
> The patch I'm submitting special-cases the deserialization of pre-4.0 
> {{FWD_FRM}} parameters in the {{Message}} deserializer. That seems preferable 
> to extending the deserialization interface or creating a new 
> {{DataInputBuffer}} limited by the parameter value length.
> Once that was fixed, the INSERT statements were still failing, which I 
> tracked down to the 4.0 optimization of serializing the forwarded message 
> once if the message ID is the same:
>  
> [https://github.com/apache/cassandra/blob/cassandra-4.0/src/java/org/apache/cassandra/db/MutationVerbHandler.java#L76]
> In the test case I wrote, only one message was being forwarded, and it had a 
> different ID from the original forwarded message. The {{useSameMessageID}} 
> method only checked message IDs within the forwarded messages.
>  
> Code Details:
> When {{MutationVerbHandler.forwardToLocalNodes}} is constructing the 
> forwarding message, it just stores the byte array representing the IPv4 or 
> IPv6 address in the parameter array.
> (link 
> [https://github.com/apache/cassandra/blob/44604b7316fcbfd7d0d7425e75cd7ebe267e3247/src/java/org/apache/cassandra/db/MutationVerbHandler.java#L90]
>  )
> {code:java}
> private static void forwardToLocalNodes(Mutation 

[jira] [Updated] (CASSANDRA-16808) Pre-4.0 FWD_FRM message parameter serialization and message-id forwarding is incorrect

2021-07-20 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-16808:

Fix Version/s: 4.x
   4.0.1

> Pre-4.0 FWD_FRM message parameter serialization and message-id forwarding is 
> incorrect
> --
>
> Key: CASSANDRA-16808
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16808
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
> Fix For: 4.0.1, 4.0-rc, 4.x
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Fixing CASSANDRA-16797 has exposed an issue with the way {{FWD_FRM}} is 
> serialized.
> In the code cleanup during the internode messaging refactor, the 
> serialization for {{FWD_FRM}} (the endpoint to respond to for forwarded 
> messages) was implemented using the same serialization format as 
> {{CompactEndpointSerializationHelper}}, which prefixes the address bytes with 
> their length. However, the pre-4.0 {{FWD_FRM}} parameter value does not 
> include a length; the receiver just converts the parameter value to an 
> {{InetAddress}}.
> In a mixed-version cluster this causes the pre-4.0 nodes to fail when 
> deserializing the mutation:
> {code:java}
> java.lang.RuntimeException: java.net.UnknownHostException: addr is of illegal 
> length
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72) 
> ~[dtest-3.0.25.jar:na]
> at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>  ~[na:na]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
>  ~[dtest-3.0.25.jar:na]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134)
>  ~[dtest-3.0.25.jar:na]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
> ~[dtest-3.0.25.jar:na]
> at java.base/java.lang.Thread.run(Thread.java:834) ~[na:na]
> Caused by: java.net.UnknownHostException: addr is of illegal length
> at java.base/java.net.InetAddress.getByAddress(InetAddress.java:1208) 
> ~[na:na]
> at java.base/java.net.InetAddress.getByAddress(InetAddress.java:1571) 
> ~[na:na]
> at 
> org.apache.cassandra.db.MutationVerbHandler.doVerb(MutationVerbHandler.java:57)
>  ~[dtest-3.0.25.jar:na]
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) 
> ~[dtest-3.0.25.jar:na]
> ... 5 common frames omitted
> {code}
> Unfortunately, there isn't a clean fix that I can see, as 
> {{org.apache.cassandra.io.IVersionedAsymmetricSerializer#deserialize}}, used 
> to deserialize the {{FWD_FRM}} address, does not take a maximum length to 
> deserialize, and it's impossible to know definitively whether it's an IPv4 or 
> IPv6 address from the first four bytes.
> The patch I'm submitting special-cases the deserialization of pre-4.0 
> {{FWD_FRM}} parameters in the {{Message}} deserializer. That seems preferable 
> to extending the deserialization interface or creating a new 
> {{DataInputBuffer}} limited by the parameter value length.
> Once that was fixed, the INSERT statements were still failing, which I 
> tracked down to the 4.0 optimization of serializing the forwarded message 
> once if the message ID is the same:
>  
> [https://github.com/apache/cassandra/blob/cassandra-4.0/src/java/org/apache/cassandra/db/MutationVerbHandler.java#L76]
> In the test case I wrote, only one message was being forwarded, and it had a 
> different ID from the original forwarded message. The {{useSameMessageID}} 
> method only checked message IDs within the forwarded messages.
>  
> Code Details:
> When {{MutationVerbHandler.forwardToLocalNodes}} is constructing the 
> forwarding message, it just stores the byte array representing the IPv4 or 
> IPv6 address in the parameter array.
> (link 
> [https://github.com/apache/cassandra/blob/44604b7316fcbfd7d0d7425e75cd7ebe267e3247/src/java/org/apache/cassandra/db/MutationVerbHandler.java#L90]
>  )
> {code:java}
> private static void forwardToLocalNodes(Mutation mutation, MessagingService.Verb verb, byte[] forwardBytes, InetAddress from) throws IOException
> {
>     try (DataInputStream in = new DataInputStream(new FastByteArrayInputStream(forwardBytes)))
>     {
>         int size = in.readInt();
>         // tell the recipients who to send their ack to
>         MessageOut message = new MessageOut<>(verb, mutation, Mutation.serializer).withParameter(Mutation.FORWARD_FROM, from.getAddress());
> {code}
> When the message is serialized in 3.0 

[jira] [Updated] (CASSANDRA-16808) Pre-4.0 FWD_FRM message parameter serialization and message-id forwarding is incorrect

2021-07-20 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-16808:

Status: Ready to Commit  (was: Review In Progress)

> Pre-4.0 FWD_FRM message parameter serialization and message-id forwarding is 
> incorrect
> --
>
> Key: CASSANDRA-16808
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16808
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
> Fix For: 4.0-rc
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Fixing CASSANDRA-16797 has exposed an issue with the way {{FWD_FRM}} is 
> serialized.
> In the code cleanup during the internode messaging refactor, the 
> serialization for {{FWD_FRM}} (the endpoint to respond to for forwarded 
> messages) was implemented using the same serialization format as 
> {{CompactEndpointSerializationHelper}}, which prefixes the address bytes with 
> their length. However, the pre-4.0 {{FWD_FRM}} parameter value does not 
> include a length; the receiver just converts the parameter value to an 
> {{InetAddress}}.
> In a mixed-version cluster this causes the pre-4.0 nodes to fail when 
> deserializing the mutation:
> {code:java}
> java.lang.RuntimeException: java.net.UnknownHostException: addr is of illegal 
> length
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72) 
> ~[dtest-3.0.25.jar:na]
> at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>  ~[na:na]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
>  ~[dtest-3.0.25.jar:na]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134)
>  ~[dtest-3.0.25.jar:na]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
> ~[dtest-3.0.25.jar:na]
> at java.base/java.lang.Thread.run(Thread.java:834) ~[na:na]
> Caused by: java.net.UnknownHostException: addr is of illegal length
> at java.base/java.net.InetAddress.getByAddress(InetAddress.java:1208) 
> ~[na:na]
> at java.base/java.net.InetAddress.getByAddress(InetAddress.java:1571) 
> ~[na:na]
> at 
> org.apache.cassandra.db.MutationVerbHandler.doVerb(MutationVerbHandler.java:57)
>  ~[dtest-3.0.25.jar:na]
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) 
> ~[dtest-3.0.25.jar:na]
> ... 5 common frames omitted
> {code}
> Unfortunately, there isn't a clean fix that I can see, as 
> {{org.apache.cassandra.io.IVersionedAsymmetricSerializer#deserialize}}, used 
> to deserialize the {{FWD_FRM}} address, does not take a maximum length to 
> deserialize, and it's impossible to know definitively whether it's an IPv4 or 
> IPv6 address from the first four bytes.
> The patch I'm submitting special-cases the deserialization of pre-4.0 
> {{FWD_FRM}} parameters in the {{Message}} deserializer. That seems preferable 
> to extending the deserialization interface or creating a new 
> {{DataInputBuffer}} limited by the parameter value length.
> Once that was fixed, the INSERT statements were still failing, which I 
> tracked down to the 4.0 optimization of serializing the forwarded message 
> once if the message ID is the same:
>  
> [https://github.com/apache/cassandra/blob/cassandra-4.0/src/java/org/apache/cassandra/db/MutationVerbHandler.java#L76]
> In the test case I wrote, only one message was being forwarded, and it had a 
> different ID from the original forwarded message. The {{useSameMessageID}} 
> method only checked message IDs within the forwarded messages.
>  
> Code Details:
> When {{MutationVerbHandler.forwardToLocalNodes}} is constructing the 
> forwarding message, it just stores the byte array representing the IPv4 or 
> IPv6 address in the parameter array.
> (link 
> [https://github.com/apache/cassandra/blob/44604b7316fcbfd7d0d7425e75cd7ebe267e3247/src/java/org/apache/cassandra/db/MutationVerbHandler.java#L90]
>  )
> {code:java}
> private static void forwardToLocalNodes(Mutation mutation, MessagingService.Verb verb, byte[] forwardBytes, InetAddress from) throws IOException
> {
>     try (DataInputStream in = new DataInputStream(new FastByteArrayInputStream(forwardBytes)))
>     {
>         int size = in.readInt();
>         // tell the recipients who to send their ack to
>         MessageOut message = new MessageOut<>(verb, mutation, Mutation.serializer).withParameter(Mutation.FORWARD_FROM, from.getAddress());
> {code}
> When the message is serialized in 3.0 MessageOut.serialize, 

[jira] [Updated] (CASSANDRA-16775) Reduce the log level on "expected" repair exceptions

2021-07-20 Thread Josh McKenzie (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-16775:
--
Reviewers: Josh McKenzie  (was: Marcus Eriksson)

> Reduce the log level on "expected" repair exceptions
> 
>
> Key: CASSANDRA-16775
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16775
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Repair, Observability/Logging
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.x
>
>
> Many of the repair errors we typically see in the logs are redundant. Say for 
> example that one node has an unreadable SSTable...we should log that fact at 
> ERROR, but then the failing repairs due to that unreadable SSTable should be 
> at WARN, making it easier to find the actual problem.






[jira] [Updated] (CASSANDRA-16675) Preserve Query Performance with ClusteringIndexNamesFilter After Running DROP COMPACT STORAGE

2021-07-20 Thread Josh McKenzie (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-16675:
--
Reviewers:   (was: Josh McKenzie)

> Preserve Query Performance with ClusteringIndexNamesFilter After Running DROP 
> COMPACT STORAGE
> -
>
> Key: CASSANDRA-16675
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16675
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Local Write-Read Paths
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.0.x
>
>
> Before the completion of CASSANDRA-16226, upgrading a cluster from 2.1 to 3.0 
> with compact tables could cause a significant regression in the latency of 
> reads using ClusteringIndexNamesFilter. The details are described in that 
> Jira, but in short, 3.0+ did not skip SSTables it should have during reads, 
> because it thought (wrongly) there might be primary key liveness information 
> in SSTables for compact tables.
> CASSANDRA-16226 addressed this behavior for still-compact tables, and also 
> maintained it after DROP COMPACT STORAGE was run. However, it also allowed 
> tables that were never compact to drop rows from query results if they 
> contained no live non-key columns, which is only a normal behavior for 
> compact tables. This is addressed in CASSANDRA-16671 by reverting the bits of 
> the logic from CASSANDRA-16226 that deal with formerly compact tables where 
> DROP COMPACT STORAGE has been run, in the interest of unblocking the 4.0 
> release and making sure strictly compact and strictly non-compact tables are 
> queried properly and construct properly formed results.
> The goal of this issue is to safely restore the performance of formerly 
> compact tables, which necessarily contain ambiguous primary key liveness 
> info. Roughly, the idea is that we record in a system table (and pull into 
> TableMetadata) the time when DROP COMPACT STORAGE is executed. If a time 
> exists for a table, we can treat it as being formerly compact, and ignore 
> primary key liveness info for determining row completeness in 
> SinglePartitionReadCommand#isComplete(). Otherwise, the normal rules for 
> never-compact tables will apply, avoiding any regression in the scenario 
> described by CASSANDRA-16671.
> This would obviously not be helpful in the case where a user has already 
> dropped compact storage, but it may logically be the best we can do, given we 
> cannot correctly reconstruct liveness info for SSTables created while a table 
> was compact (i.e. there is no way to tell INSERT and UPDATE apart for those). 
> Especially if CASSANDRA-16671 moves in the direction of disabling DROP 
> COMPACT STORAGE by default, I would also propose that we do this only for 
> 4.0+.






[jira] [Commented] (CASSANDRA-16807) Weak visibility guarantees of Accumulator lead to failed assertions during digest comparison

2021-07-20 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384461#comment-17384461
 ] 

Andres de la Peña commented on CASSANDRA-16807:
---

Removing {{received}} and {{waitingFor}} makes sense to me if we don't receive 
responses from unneeded replicas. I have left some very trivial suggestions 
[here|https://github.com/adelapena/cassandra/commit/97ca3a260d2e99fe030a743d5be8b4cd302ba8a9];
feel free to ignore them if you don't agree.

Also, I was wondering whether it would make sense, instead of just removing 
{{waitingFor}}, to replace it with an assertion verifying that we really don't 
receive those messages, for example [this 
way|https://github.com/adelapena/cassandra/commit/a573103522421034fa87030b68422c0d4f775467].
 

> Weak visibility guarantees of Accumulator lead to failed assertions during 
> digest comparison
> 
>
> Key: CASSANDRA-16807
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16807
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.0-rc, 4.0.x, 4.x
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> This problem could manifest on all versions, beginning on at least 3.0, but 
> I’ll focus on the way it manifests in 4.0 here.
> In what now seems like a wise move, CASSANDRA-16097 added an assertion to 
> {{DigestResolver#responsesMatch()}} that ensures the responses snapshot has 
> at least one visible element to compare (although of course a single element 
> trivially cannot generate a mismatch and short-circuits immediately). 
> However, at the point {{ReadCallback#onResponse()}} signals the waiting 
> resolver, there is no guarantee that the size of the generated snapshot of 
> the responses {{Accumulator}} is non-zero, or perhaps more worryingly, at 
> least equal to the number of blocked-for responses. This seems to be a 
> consequence of the documented weak visibility guarantees on 
> {{Accumulator#add()}}. In short, if there are concurrent invocations of 
> {{add()}}, it is not guaranteed that there is any visible size change after 
> any one of them returns, but only after all of them complete.
> The particular exception looks something like this:
> {noformat}
> java.lang.AssertionError: Attempted response match comparison while no 
> responses have been received.
>   at 
> org.apache.cassandra.service.reads.DigestResolver.responsesMatch(DigestResolver.java:110)
>   at 
> org.apache.cassandra.service.reads.AbstractReadExecutor.awaitResponses(AbstractReadExecutor.java:393)
>   at 
> org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:2150)
>   at 
> org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1979)
>   at 
> org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1882)
>   at 
> org.apache.cassandra.db.SinglePartitionReadCommand$Group.execute(SinglePartitionReadCommand.java:1121)
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:296)
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:248)
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:90)
> {noformat}
> It’s possible to reproduce this on simple single-partition reads without any 
> short-read protection or replica filtering protection. I’ve also been able to 
> reproduce this synthetically with [a unit 
> test|https://github.com/apache/cassandra/pull/1110] on {{ReadCallback}}.
> It seems like the most straightforward way to fix this would be to avoid 
> signaling in {{ReadCallback#onResponse()}} until the visible size of the 
> accumulator is at least the number of received responses. In most cases, this 
> is trivially true, and our signaling behavior won’t change at all. In the 
> very rare case that there are two (or more) concurrent calls to 
> {{onResponse()}}, the second (or last) will signal, and having one more 
> response than we strictly need should have no negative side effects. (We 
> don’t seem to make any strict assertions about having exactly the number of 
> required responses, only that we have enough.)






[jira] [Commented] (CASSANDRA-16808) Pre-4.0 FWD_FRM message parameter serialization and message-id forwarding is incorrect

2021-07-20 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384460#comment-17384460
 ] 

Brandon Williams commented on CASSANDRA-16808:
--

+1

> Pre-4.0 FWD_FRM message parameter serialization and message-id forwarding is 
> incorrect
> --
>
> Key: CASSANDRA-16808
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16808
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
> Fix For: 4.0-rc
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Fixing CASSANDRA-16797 has exposed an issue with the way {{FWD_FRM}} is 
> serialized.
> In the code cleanup during the internode messaging refactor, the 
> serialization for {{FWD_FRM}} (the endpoint to respond to for forwarded 
> messages) was implemented using the same serialization format as 
> {{CompactEndpointSerializationHelper}}, which prefixes the address bytes with 
> their length. However, the pre-4.0 {{FWD_FRM}} parameter value does not 
> include a length; the receiver just converts the parameter value to an 
> {{InetAddress}}.
> In a mixed-version cluster this causes the pre-4.0 nodes to fail when 
> deserializing the mutation:
> {code:java}
> java.lang.RuntimeException: java.net.UnknownHostException: addr is of illegal 
> length
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72) 
> ~[dtest-3.0.25.jar:na]
> at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>  ~[na:na]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
>  ~[dtest-3.0.25.jar:na]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134)
>  ~[dtest-3.0.25.jar:na]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
> ~[dtest-3.0.25.jar:na]
> at java.base/java.lang.Thread.run(Thread.java:834) ~[na:na]
> Caused by: java.net.UnknownHostException: addr is of illegal length
> at java.base/java.net.InetAddress.getByAddress(InetAddress.java:1208) 
> ~[na:na]
> at java.base/java.net.InetAddress.getByAddress(InetAddress.java:1571) 
> ~[na:na]
> at 
> org.apache.cassandra.db.MutationVerbHandler.doVerb(MutationVerbHandler.java:57)
>  ~[dtest-3.0.25.jar:na]
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) 
> ~[dtest-3.0.25.jar:na]
> ... 5 common frames omitted
> {code}
> Unfortunately, there isn't a clean fix that I can see, as 
> {{org.apache.cassandra.io.IVersionedAsymmetricSerializer#deserialize}}, used 
> to deserialize the {{FWD_FRM}} address, does not take a maximum length to 
> deserialize, and it's impossible to know definitively whether it's an IPv4 or 
> IPv6 address from the first four bytes.
> The patch I'm submitting special-cases the deserialization of pre-4.0 
> {{FWD_FRM}} parameters in the {{Message}} deserializer. That seems preferable 
> to extending the deserialization interface or creating a new 
> {{DataInputBuffer}} limited by the parameter value length.
> Once that was fixed, the INSERT statements were still failing, which I 
> tracked down to the 4.0 optimization of serializing the forwarded message 
> once if the message ID is the same:
>  
> [https://github.com/apache/cassandra/blob/cassandra-4.0/src/java/org/apache/cassandra/db/MutationVerbHandler.java#L76]
> In the test case I wrote, only one message was being forwarded, and it had a 
> different ID from the original forwarded message. The {{useSameMessageID}} 
> method only checked message IDs within the forwarded messages.
>  
> Code Details:
> When {{MutationVerbHandler.forwardToLocalNodes}} is constructing the 
> forwarding message, it just stores the byte array representing the IPv4 or 
> IPv6 address in the parameter array.
> (link 
> [https://github.com/apache/cassandra/blob/44604b7316fcbfd7d0d7425e75cd7ebe267e3247/src/java/org/apache/cassandra/db/MutationVerbHandler.java#L90]
>  )
> {code:java}
> private static void forwardToLocalNodes(Mutation mutation, MessagingService.Verb verb, byte[] forwardBytes, InetAddress from) throws IOException
> {
>     try (DataInputStream in = new DataInputStream(new FastByteArrayInputStream(forwardBytes)))
>     {
>         int size = in.readInt();
>         // tell the recipients who to send their ack to
>         MessageOut message = new MessageOut<>(verb, mutation, Mutation.serializer).withParameter(Mutation.FORWARD_FROM, from.getAddress());
> {code}
> When the message is serialized in 3.0 MessageOut.serialize, that raw 

[jira] [Updated] (CASSANDRA-16675) Preserve Query Performance with ClusteringIndexNamesFilter After Running DROP COMPACT STORAGE

2021-07-20 Thread Josh McKenzie (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-16675:
--
Reviewers: Josh McKenzie

> Preserve Query Performance with ClusteringIndexNamesFilter After Running DROP 
> COMPACT STORAGE
> -
>
> Key: CASSANDRA-16675
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16675
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Local Write-Read Paths
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.0.x
>
>
> Before the completion of CASSANDRA-16226, upgrading a cluster from 2.1 to 3.0 
> with compact tables could cause a significant regression in the latency of 
> reads using ClusteringIndexNamesFilter. The details are described in that 
> Jira, but in short, 3.0+ did not skip SSTables it should have during reads, 
> because it thought (wrongly) there might be primary key liveness information 
> in SSTables for compact tables.
> CASSANDRA-16226 addressed this behavior for still-compact tables, and also 
> maintained it after DROP COMPACT STORAGE was run. However, it also allowed 
> tables that were never compact to drop rows from query results if they 
> contained no live non-key columns, which is only a normal behavior for 
> compact tables. This is addressed in CASSANDRA-16671 by reverting the bits of 
> the logic from CASSANDRA-16226 that deal with formerly compact tables where 
> DROP COMPACT STORAGE has been run, in the interest of unblocking the 4.0 
> release and making sure strictly compact and strictly non-compact tables are 
> queried properly and construct properly formed results.
> The goal of this issue is to safely restore the performance of formerly 
> compact tables, which necessarily contain ambiguous primary key liveness 
> info. Roughly, the idea is that we record in a system table (and pull into 
> TableMetadata) the time when DROP COMPACT STORAGE is executed. If a time 
> exists for a table, we can treat it as being formerly compact, and ignore 
> primary key liveness info for determining row completeness in 
> SinglePartitionReadCommand#isComplete(). Otherwise, the normal rules for 
> never-compact tables will apply, avoiding any regression in the scenario 
> described by CASSANDRA-16671.
> This would obviously not be helpful in the case where a user has already 
> dropped compact storage, but it may logically be the best we can do, given we 
> cannot correctly reconstruct liveness info for SSTables created while a table 
> was compact (i.e. there is no way to tell INSERT and UPDATE apart for those). 
> Especially if CASSANDRA-16671 moves in the direction of disabling DROP 
> COMPACT STORAGE by default, I would also propose that we do this only for 
> 4.0+.
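The completeness rule proposed above can be illustrated with a minimal sketch. Everything named here is hypothetical: `TableInfo`, `compactStorageDroppedAt`, and `isRowComplete` are illustration-only stand-ins and are not the actual `TableMetadata` or `SinglePartitionReadCommand` code. It only models the two cases the description contrasts: a table with a recorded DROP COMPACT STORAGE time and a never-compact table.

```java
import java.util.OptionalLong;

final class FormerlyCompactSketch {
    /** Hypothetical stand-in for the field the proposal would pull into TableMetadata. */
    static final class TableInfo {
        // Empty when DROP COMPACT STORAGE was never recorded for the table.
        final OptionalLong compactStorageDroppedAt;

        TableInfo(OptionalLong compactStorageDroppedAt) {
            this.compactStorageDroppedAt = compactStorageDroppedAt;
        }

        boolean isFormerlyCompact() {
            return compactStorageDroppedAt.isPresent();
        }
    }

    /**
     * Assumed simplification of the completeness check: a row is complete when
     * every requested column has a live value and, unless the table is formerly
     * compact, the row also carries primary key liveness info.
     */
    static boolean isRowComplete(TableInfo table,
                                 boolean hasPrimaryKeyLiveness,
                                 boolean allRequestedColumnsLive) {
        if (!allRequestedColumnsLive)
            return false;
        // Formerly compact tables carry ambiguous liveness info, so it is ignored.
        return table.isFormerlyCompact() || hasPrimaryKeyLiveness;
    }
}
```

Under this sketch, a row without primary key liveness info can still be treated as complete for a formerly compact table, while a never-compact table keeps the normal rule.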



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-16808) Pre-4.0 FWD_FRM message parameter serialization and message-id forwarding is incorrect

2021-07-20 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-16808:

Reviewers: Brandon Williams, Caleb Rackliffe  (was: Caleb Rackliffe)

> Pre-4.0 FWD_FRM message parameter serialization and message-id forwarding is 
> incorrect
> --
>
> Key: CASSANDRA-16808
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16808
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
> Fix For: 4.0-rc
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Fixing CASSANDRA-16797 has exposed an issue with the way {{FWD_FRM}} is 
> serialized.
> In the code cleanup during the internode messaging refactor, the 
> serialization for {{FWD_FRM}} (the endpoint to respond to for forwarded 
> messages) was implemented using the same serialization format as 
> CompactEndpointSerializationHelper, which prefixes the address bytes with 
> their length. However, the pre-4.0 FWD_FRM parameter value does not include a 
> length; pre-4.0 nodes just convert the raw parameter value to an InetAddress.
> In a mixed version cluster this causes the pre-4.0 nodes to fail when 
> deserializing the mutation:
> {code:java}
> java.lang.RuntimeException: java.net.UnknownHostException: addr is of illegal 
> length
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72) 
> ~[dtest-3.0.25.jar:na]
> at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>  ~[na:na]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
>  ~[dtest-3.0.25.jar:na]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134)
>  ~[dtest-3.0.25.jar:na]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
> ~[dtest-3.0.25.jar:na]
> at java.base/java.lang.Thread.run(Thread.java:834) ~[na:na]
> Caused by: java.net.UnknownHostException: addr is of illegal length
> at java.base/java.net.InetAddress.getByAddress(InetAddress.java:1208) 
> ~[na:na]
> at java.base/java.net.InetAddress.getByAddress(InetAddress.java:1571) 
> ~[na:na]
> at 
> org.apache.cassandra.db.MutationVerbHandler.doVerb(MutationVerbHandler.java:57)
>  ~[dtest-3.0.25.jar:na]
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) 
> ~[dtest-3.0.25.jar:na]
> ... 5 common frames omitted
> {code}
> Unfortunately there isn't a clean fix I can see, as 
> {{org.apache.cassandra.io.IVersionedAsymmetricSerializer#deserialize}}, used 
> to deserialize the FWD_FRM address, does not take a maximum length to 
> deserialize, and it's impossible to know definitively whether it's an IPv4 or 
> IPv6 address from the first four bytes.
> The patch I'm submitting special-cases the deserialization of pre-4.0 
> {{FWD_FRM}} parameters in the {{Message}} deserializer. That seems preferable to 
> extending the deserialization interface or creating a new {{DataInputBuffer}} 
> limited by the parameter value length.
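The format mismatch described above can be reproduced in isolation. This is a minimal sketch, not Cassandra code: the class and method names are hypothetical, and the single length byte is an assumption about the length-prefixed format. Only `InetAddress.getByAddress` is a real JDK call; it rejects any array that is not 4 or 16 bytes long, which is where the "addr is of illegal length" error in the stack trace originates.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.nio.ByteBuffer;
import java.util.Arrays;

final class FwdFrmSketch {
    // Assumed length-prefixed encoding: one length byte, then the raw address bytes.
    static byte[] lengthPrefixed(InetAddress address) {
        byte[] raw = address.getAddress();
        return ByteBuffer.allocate(1 + raw.length).put((byte) raw.length).put(raw).array();
    }

    // Pre-4.0 style decoding: the whole parameter value is taken to be the address.
    static InetAddress decodeRaw(byte[] value) throws UnknownHostException {
        return InetAddress.getByAddress(value);
    }

    // Decoding that honours the length prefix before building the address.
    static InetAddress decodeLengthPrefixed(byte[] value) throws UnknownHostException {
        int length = value[0] & 0xFF;
        return InetAddress.getByAddress(Arrays.copyOfRange(value, 1, 1 + length));
    }

    public static void main(String[] args) throws Exception {
        InetAddress from = InetAddress.getByAddress(new byte[] {127, 0, 0, 1});
        byte[] prefixed = lengthPrefixed(from); // 5 bytes for an IPv4 address

        System.out.println(decodeLengthPrefixed(prefixed).equals(from));
        try {
            decodeRaw(prefixed); // 5 bytes is neither a valid IPv4 nor IPv6 length
        } catch (UnknownHostException e) {
            System.out.println("raw decode failed: " + e.getMessage());
        }
    }
}
```

A reader that treats the value as raw bytes cannot recover from the extra prefix, which is consistent with the fix living in the deserializer that knows which version produced the parameter.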
> Once that was fixed, the INSERT statements were still failing which I tracked 
> down to the 4.0 optimization of serializing the forwarded message once if the 
> message id is the same
>  
> [https://github.com/apache/cassandra/blob/cassandra-4.0/src/java/org/apache/cassandra/db/MutationVerbHandler.java#L76]
> In the test case I wrote, only one message was being forwarded, and it had a 
> different ID from the original forwarded message. The {{useSameMessageID}} 
> method only checked message IDs within the forwarded messages.
>  
> Code Details:
> When MutationVerbHandler.forwardToLocalNodes is constructing the forwarding 
> message it just stores the byte array representing the IPv4 or IPv6 
> address in the parameter array.
> (link 
> [https://github.com/apache/cassandra/blob/44604b7316fcbfd7d0d7425e75cd7ebe267e3247/src/java/org/apache/cassandra/db/MutationVerbHandler.java#L90]
>  )
> {code:java}
> private static void forwardToLocalNodes(Mutation mutation, MessagingService.Verb verb, byte[] forwardBytes, InetAddress from) throws IOException
> {
>     try (DataInputStream in = new DataInputStream(new FastByteArrayInputStream(forwardBytes)))
>     {
>         int size = in.readInt();
>         // tell the recipients who to send their ack to
>         MessageOut<Mutation> message = new MessageOut<>(verb, mutation, Mutation.serializer).withParameter(Mutation.FORWARD_FROM, from.getAddress());
>         // (excerpt ends here; see the linked source for the rest of the method)
> {code}
> When the message is serialized in 3.0 

[jira] [Updated] (CASSANDRA-15985) python dtest TestCqlsh added enable_scripted_user_defined_functions which breaks on 2.2

2021-07-20 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-15985:

Reviewers: Ekaterina Dimitrova, Ekaterina Dimitrova  (was: Ekaterina 
Dimitrova)
   Ekaterina Dimitrova, Ekaterina Dimitrova
   Status: Review In Progress  (was: Patch Available)

> python dtest TestCqlsh added enable_scripted_user_defined_functions which 
> breaks on 2.2
> ---
>
> Key: CASSANDRA-15985
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15985
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: David Capwell
>Assignee: Aleksandr Sorokoumov
>Priority: Normal
> Fix For: 2.2.x
>
>
> {code}
> ERROR [main] 2020-07-26 03:03:14,108 CassandraDaemon.java:744 - Exception 
> encountered during startup
> org.apache.cassandra.exceptions.ConfigurationException: Invalid yaml. Please 
> remove properties [enable_scripted_user_defined_functions] from your 
> cassandra.yaml
>   at 
> org.apache.cassandra.config.YamlConfigurationLoader$MissingPropertiesChecker.check(YamlConfigurationLoader.java:146)
>  ~[main/:na]
>   at 
> org.apache.cassandra.config.YamlConfigurationLoader.loadConfig(YamlConfigurationLoader.java:113)
>  ~[main/:na]
>   at 
> org.apache.cassandra.config.YamlConfigurationLoader.loadConfig(YamlConfigurationLoader.java:85)
>  ~[main/:na]
>   at 
> org.apache.cassandra.config.DatabaseDescriptor.loadConfig(DatabaseDescriptor.java:151)
>  ~[main/:na]
>   at 
> org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:133)
>  ~[main/:na]
>   at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:604)
>  [main/:na]
>   at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:731) 
> [main/:na]]
> {code}
> This test doesn’t put a version limit, so all tests fail on 2.2 since the 
> property was added to all clusters.






[jira] [Updated] (CASSANDRA-15985) python dtest TestCqlsh added enable_scripted_user_defined_functions which breaks on 2.2

2021-07-20 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-15985:

Status: Changes Suggested  (was: Review In Progress)







[jira] [Updated] (CASSANDRA-16663) Request-Based Native Transport Rate-Limiting

2021-07-20 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-16663:

Reviewers: Benedict Elliott Smith, Sam Tunnicliffe  (was: Sam Tunnicliffe)

> Request-Based Native Transport Rate-Limiting
> 
>
> Key: CASSANDRA-16663
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16663
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Messaging/Client
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Together, CASSANDRA-14855, CASSANDRA-15013, and CASSANDRA-15519 added support 
> for a runtime-configurable, per-coordinator limit on the number of bytes 
> allocated for concurrent requests over the native protocol. It supports 
> channel back-pressure by default, and optionally supports throwing 
> OverloadedException if that is requested in the relevant connection’s STARTUP 
> message.
> This can be an effective tool to prevent the coordinator from running out of 
> memory, but it may not correspond to how expensive queries are or provide a 
> direct conceptual mapping to how users think about request capacity. I 
> propose adding the option of request-based (or perhaps more correctly 
> message-based) back-pressure, coexisting with (and reusing the logic that 
> supports) the current bytes-based back-pressure.
> _We can roll this forward in phases_, where the server’s cost accounting 
> becomes more accurate, we segment limits by operation type/keyspace/etc., and 
> the client/driver reacts more intelligently to (especially non-back-pressure) 
> overload, _but something minimally viable could look like this_:
> 1.) Reuse most of the existing logic in Limits, et al. to support a simple 
> per-coordinator limit only on native transport requests per second. Under 
> this limit will be CQL reads and writes, but also auth requests, prepare 
> requests, and batches. This is obviously simplistic, and it does not account 
> for the variation in cost between individual queries, but even a fixed cost 
> model should be useful in aggregate.
>  * If the client specifies THROW_ON_OVERLOAD in its STARTUP message at 
> connection time, a breach of the per-node limit will result in an 
> OverloadedException being propagated to the client, and the server will 
> discard the request.
>  * If THROW_ON_OVERLOAD is not specified, the server will stop consuming 
> messages from the channel/socket, which should back-pressure the client, 
> while the message continues to be processed.
> 2.) This limit is infinite by default (or simply disabled), and can be 
> enabled via the YAML config or JMX at runtime. (It might be cleaner to have a 
> no-op rate limiter that's used when the feature is disabled entirely.)
> 3.) The current value of the limit is available via JMX, and metrics around 
> coordinator operations/second are already available to compare against it.
> 4.) Any interaction with existing byte-based limits will intersect. (i.e. A 
> breach of any limit, bytes or request-based, will actuate back-pressure or 
> OverloadedExceptions.)
> In this first pass, explicitly out of scope would be any work on the 
> client/driver side.
> In terms of validation/testing, our biggest concern with anything that adds 
> overhead on a very hot path is performance. In particular, we want to fully 
> understand how the client and server perform along two axes constituting 4 
> scenarios. Those are a.) whether or not we are breaching the request limit 
> and b.) whether the server is throwing on overload at the behest of the 
> client. Having said that, query execution should dwarf the cost of limit 
> accounting.
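The minimal proposal in items 1 to 4 above can be sketched as a small token bucket. This is an illustration under stated assumptions, not the patch: the class, enum, and method names are hypothetical, the fixed cost of one permit per request and the one-second burst capacity are choices made for the sketch, and time is passed in explicitly so the behaviour is deterministic.

```java
final class RequestRateLimitSketch {
    enum Outcome {
        PROCESS,                      // under the limit
        DISCARD_AND_THROW_OVERLOADED, // THROW_ON_OVERLOAD was set in STARTUP
        PROCESS_AND_PAUSE_READS       // back-pressure: stop consuming from the channel
    }

    private final double requestsPerSecond; // Double.POSITIVE_INFINITY means disabled
    private double available;               // permits currently in the bucket
    private long lastRefillNanos;

    RequestRateLimitSketch(double requestsPerSecond, long nowNanos) {
        this.requestsPerSecond = requestsPerSecond;
        this.available = requestsPerSecond; // assumed burst capacity: one second of permits
        this.lastRefillNanos = nowNanos;
    }

    synchronized Outcome onRequest(boolean throwOnOverload, long nowNanos) {
        if (Double.isInfinite(requestsPerSecond))
            return Outcome.PROCESS; // limit disabled, behaves as a no-op limiter

        // Refill in proportion to elapsed time, capped at the bucket capacity.
        double elapsedSeconds = (nowNanos - lastRefillNanos) / 1e9;
        available = Math.min(requestsPerSecond, available + elapsedSeconds * requestsPerSecond);
        lastRefillNanos = nowNanos;

        if (available >= 1.0) {
            available -= 1.0; // fixed cost: every request counts as one permit
            return Outcome.PROCESS;
        }
        return throwOnOverload ? Outcome.DISCARD_AND_THROW_OVERLOADED
                               : Outcome.PROCESS_AND_PAUSE_READS;
    }
}
```

For example, with a limit of 2 requests per second, two calls at the same instant return `PROCESS`, and a third returns either `DISCARD_AND_THROW_OVERLOADED` or `PROCESS_AND_PAUSE_READS` depending on the connection's flag. A real implementation would also have to combine this with the existing bytes-based limits, which the sketch leaves out.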






[jira] [Commented] (CASSANDRA-15985) python dtest TestCqlsh added enable_scripted_user_defined_functions which breaks on 2.2

2021-07-20 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384445#comment-17384445
 ] 

Ekaterina Dimitrova commented on CASSANDRA-15985:
-

I just committed CASSANDRA-16736 and I am open to take over the rest of the 
cqlsh fixes as part of this patch if you don't have time [~Ge], just let me 
know. Otherwise, I promise at least a review. :) 

PS I decided not to overcomplicate the situation, committed what was already 
approved and the rest is unrelated test changes that we can handle here as per 
our agreements. 







[jira] [Updated] (CASSANDRA-16736) CQL shell should prefer newer TLS version by default

2021-07-20 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-16736:

Source Control Link: 
https://github.com/apache/cassandra/commit/cb0e4386d8ac2d13e7f594ae3d6cacc0b4246855
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> CQL shell should prefer newer TLS version by default
> 
>
> Key: CASSANDRA-16736
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16736
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/cqlsh
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 2.2.x
>
>
> This is a follow up to CASSANDRA-16695 where a patch was committed to 3.0, 
> 3.11, 4.0 and trunk.
> As part of it we saw that CQL shell python DTests are failing due to config 
> issues for version 2.2.
> As part of the current ticket we need to produce a fix to be able to run the 
> CQL shell tests and verify and apply the patch from CASSANDRA-16695 to 
> Cassandra 2.2 






[jira] [Comment Edited] (CASSANDRA-16736) CQL shell should prefer newer TLS version by default

2021-07-20 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384442#comment-17384442
 ] 

Ekaterina Dimitrova edited comment on CASSANDRA-16736 at 7/20/21, 7:25 PM:
---

I committed the current approved patch for Cassandra v 2.2 and DTest repo. The 
rest of the cqlsh failures will be handled in the linked ticket - 
CASSANDRA-15985

 

To [https://github.com/apache/cassandra.git]

   b3f9921881..cb0e4386d8  cassandra-2.2 -> cassandra-2.2

   fbb20b9162..23b0348789  cassandra-3.0 -> cassandra-3.0

   b604bd20cd..8d881d994f  cassandra-3.11 -> cassandra-3.11

   64ec400ab6..e0cecaeec0  cassandra-4.0 -> cassandra-4.0

   fcea6a5509..518b7becf1  trunk -> trunk

 

To [https://github.com/apache/cassandra-dtest.git]

   7bf5be87..af5d69e7  trunk -> trunk


was (Author: e.dimitrova):
I committed the current approved patch for 2.2 and dtest. The rest of the cqlsh 
failures will be handled in the linked ticket - CASSANDRA-15985

 

To https://github.com/apache/cassandra.git

   b3f9921881..cb0e4386d8  cassandra-2.2 -> cassandra-2.2

   fbb20b9162..23b0348789  cassandra-3.0 -> cassandra-3.0

   b604bd20cd..8d881d994f  cassandra-3.11 -> cassandra-3.11

   64ec400ab6..e0cecaeec0  cassandra-4.0 -> cassandra-4.0

   fcea6a5509..518b7becf1  trunk -> trunk

 

To https://github.com/apache/cassandra-dtest.git

   7bf5be87..af5d69e7  trunk -> trunk







[jira] [Commented] (CASSANDRA-16736) CQL shell should prefer newer TLS version by default

2021-07-20 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384442#comment-17384442
 ] 

Ekaterina Dimitrova commented on CASSANDRA-16736:
-

I committed the current approved patch for 2.2 and dtest. The rest of the cqlsh 
failures will be handled in the linked ticket - CASSANDRA-15985

 

To https://github.com/apache/cassandra.git

   b3f9921881..cb0e4386d8  cassandra-2.2 -> cassandra-2.2

   fbb20b9162..23b0348789  cassandra-3.0 -> cassandra-3.0

   b604bd20cd..8d881d994f  cassandra-3.11 -> cassandra-3.11

   64ec400ab6..e0cecaeec0  cassandra-4.0 -> cassandra-4.0

   fcea6a5509..518b7becf1  trunk -> trunk

 

To https://github.com/apache/cassandra-dtest.git

   7bf5be87..af5d69e7  trunk -> trunk







[cassandra] branch cassandra-3.0 updated (fbb20b9 -> 23b0348)

2021-07-20 Thread edimitrova
This is an automated email from the ASF dual-hosted git repository.

edimitrova pushed a change to branch cassandra-3.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from fbb20b9  Receipt of gossip shutdown updates TokenMetadata
 add cb0e438  CQL shell should prefer newer TLS version by default patch by 
Kamlesh Ghoradkar; reviewed by Ekaterina Dimitrova, Adam Holmberg, David 
Capwell, Justin Chu and Brandon Williamms for CASSANDRA-16736
 add 23b0348  Merge branch 'cassandra-2.2' into cassandra-3.0

No new revisions were added by this update.

Summary of changes:




[cassandra] branch cassandra-3.11 updated (b604bd2 -> 8d881d9)

2021-07-20 Thread edimitrova
This is an automated email from the ASF dual-hosted git repository.

edimitrova pushed a change to branch cassandra-3.11
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from b604bd2  Merge branch 'cassandra-3.0' into cassandra-3.11
 add cb0e438  CQL shell should prefer newer TLS version by default patch by 
Kamlesh Ghoradkar; reviewed by Ekaterina Dimitrova, Adam Holmberg, David 
Capwell, Justin Chu and Brandon Williamms for CASSANDRA-16736
 add 23b0348  Merge branch 'cassandra-2.2' into cassandra-3.0
 add 8d881d9  Merge branch 'cassandra-3.0' into cassandra-3.11

No new revisions were added by this update.

Summary of changes:




[cassandra] branch trunk updated (fcea6a5 -> 518b7be)

2021-07-20 Thread edimitrova
This is an automated email from the ASF dual-hosted git repository.

edimitrova pushed a change to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from fcea6a5  Merge branch 'cassandra-4.0' into trunk
 add cb0e438  CQL shell should prefer newer TLS version by default patch by 
Kamlesh Ghoradkar; reviewed by Ekaterina Dimitrova, Adam Holmberg, David 
Capwell, Justin Chu and Brandon Williamms for CASSANDRA-16736
 add 23b0348  Merge branch 'cassandra-2.2' into cassandra-3.0
 add 8d881d9  Merge branch 'cassandra-3.0' into cassandra-3.11
 add e0cecae  Merge branch 'cassandra-3.11' into cassandra-4.0
 new 518b7be  Merge branch 'cassandra-4.0' into trunk

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:




[cassandra] 01/01: Merge branch 'cassandra-4.0' into trunk

2021-07-20 Thread edimitrova
This is an automated email from the ASF dual-hosted git repository.

edimitrova pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit 518b7becf196517aeac3b4d10623f41a4459ced2
Merge: fcea6a5 e0cecae
Author: Ekaterina Dimitrova 
AuthorDate: Tue Jul 20 15:13:47 2021 -0400

Merge branch 'cassandra-4.0' into trunk





[cassandra] branch cassandra-2.2 updated (b3f9921 -> cb0e438)

2021-07-20 Thread edimitrova
This is an automated email from the ASF dual-hosted git repository.

edimitrova pushed a change to branch cassandra-2.2
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from b3f9921  Introduce SemVer4j for version representation, parsing and 
handling. And correct supported upgrade paths. Add v4X to Java DTests (after 
cassandra-4.0 branch was created)
 add cb0e438  CQL shell should prefer newer TLS version by default patch by 
Kamlesh Ghoradkar; reviewed by Ekaterina Dimitrova, Adam Holmberg, David 
Capwell, Justin Chu and Brandon Williamms for CASSANDRA-16736

No new revisions were added by this update.

Summary of changes:
 CHANGES.txt|  1 +
 pylib/cqlshlib/sslhandling.py  | 20 --
 pylib/cqlshlib/test/config/sslhandling.config  |  2 +
 .../test/config/sslhandling_invalid.config |  2 +
 pylib/cqlshlib/test/test_sslhandling.py| 75 ++
 5 files changed, 95 insertions(+), 5 deletions(-)
 create mode 100644 pylib/cqlshlib/test/config/sslhandling.config
 create mode 100644 pylib/cqlshlib/test/config/sslhandling_invalid.config
 create mode 100644 pylib/cqlshlib/test/test_sslhandling.py




[cassandra] branch cassandra-4.0 updated (64ec400 -> e0cecae)

2021-07-20 Thread edimitrova
This is an automated email from the ASF dual-hosted git repository.

edimitrova pushed a change to branch cassandra-4.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from 64ec400  Merge branch 'cassaNDRA-3.11' into cassandra-4.0
 add cb0e438  CQL shell should prefer newer TLS version by default patch by 
Kamlesh Ghoradkar; reviewed by Ekaterina Dimitrova, Adam Holmberg, David 
Capwell, Justin Chu and Brandon Williamms for CASSANDRA-16736
 add 23b0348  Merge branch 'cassandra-2.2' into cassandra-3.0
 add 8d881d9  Merge branch 'cassandra-3.0' into cassandra-3.11
 add e0cecae  Merge branch 'cassandra-3.11' into cassandra-4.0

No new revisions were added by this update.

Summary of changes:




[cassandra-dtest] branch trunk updated: Pass enable_scripted_user_defined_functions to clusters with version >= 3.0 patch by Ekaterina Dimitrova, review by Brandon Williams for CASSANDRA-16736

2021-07-20 Thread edimitrova
This is an automated email from the ASF dual-hosted git repository.

edimitrova pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra-dtest.git


The following commit(s) were added to refs/heads/trunk by this push:
 new af5d69e  Pass enable_scripted_user_defined_functions to clusters with 
version >= 3.0 patch by Ekaterina Dimitrova, review by Brandon Williams for 
CASSANDRA-16736
af5d69e is described below

commit af5d69e7efbab5a609bf0556b2c59f58e7acc5a2
Author: Ekaterina Dimitrova 
AuthorDate: Wed Jul 14 19:58:18 2021 -0400

Pass enable_scripted_user_defined_functions to clusters with version >= 3.0
patch by Ekaterina Dimitrova, review by Brandon Williams for CASSANDRA-16736
---
 cqlsh_tests/test_cqlsh.py | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/cqlsh_tests/test_cqlsh.py b/cqlsh_tests/test_cqlsh.py
index 2e1a659..a5e31e6 100644
--- a/cqlsh_tests/test_cqlsh.py
+++ b/cqlsh_tests/test_cqlsh.py
@@ -97,10 +97,14 @@ class TestCqlsh(Tester, CqlshMixin):
     # override cluster options to enable user defined functions
     # currently only needed for test_describe
     @pytest.fixture
-    def fixture_dtest_setup_overrides(self):
+    def fixture_dtest_setup_overrides(self, dtest_config):
         dtest_setup_overrides = DTestSetupOverrides()
-        dtest_setup_overrides.cluster_options = ImmutableMapping({'enable_user_defined_functions': 'true',
-                                                                  'enable_scripted_user_defined_functions': 'true'})
+        if dtest_config.cassandra_version_from_build >= '3.0':
+            dtest_setup_overrides.cluster_options = ImmutableMapping({'enable_user_defined_functions': 'true',
+                                                                      'enable_scripted_user_defined_functions': 'true'})
+        else:
+            dtest_setup_overrides.cluster_options = ImmutableMapping({'enable_user_defined_functions': 'true'})
+
         return dtest_setup_overrides
 
     @classmethod
@@ -892,6 +896,7 @@ VALUES (4, blobAsInt(0x), '', blobAsBigint(0x), 0x, blobAsBoolean(0x), blobAsDec
         assert "'min_threshold': '10'" in stdout
         assert "'max_threshold': '100'" in stdout
 
+    @since('3.0')
     def test_describe_functions(self, fixture_dtest_setup_overrides):
         """Test DESCRIBE statements for functions and aggregate functions"""
         self.cluster.populate(1)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-16796) Clear pending ranges for a SHUTDOWN peer

2021-07-20 Thread Sam Tunnicliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-16796:

  Fix Version/s: (was: 4.0.x)
 (was: 3.11.x)
 (was: 3.0.x)
 4.0.1
 3.11.11
 3.0.25
  Since Version: 3.0.0
Source Control Link: 
https://github.com/apache/cassandra/commit/fbb20b9162b73c4de8a82cf4ffdde3304e904603
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

Thanks, cleaned up those unused imports and committed to 3.0 and merged to 3.11 
-> 4.0 -> trunk. 

> Clear pending ranges for a SHUTDOWN peer
> 
>
> Key: CASSANDRA-16796
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16796
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Membership
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Normal
> Fix For: 3.0.25, 3.11.11, 4.0.1
>
>
> If a node involved in a MOVE operation should fail, peers can sometimes 
> maintain pending ranges for it even when it has left the ring and/or been 
> replaced (in practice until the peer is next bounced). This in turn can lead 
> to bogus unavailable responses to clients if a replica for any of the 
> pending ranges should go down.
> If the moving node crashes hard, a subsequent replacement will correctly fail 
> as long as cassandra.consistent.rangemovement is set to true because the new 
> node will learn the MOVING status from the remaining peers. A graceful 
> shutdown, however, causes that status to be replaced with SHUTDOWN, but 
> doesn't update TokenMetadata, so pending ranges remain for the down node, 
> even after it has been removed from the ring.
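
The failure mode described above can be sketched in a few lines of Java. This is a hypothetical, heavily simplified model and not Cassandra's code: the class ShutdownNotificationSketch, the Subscriber interface, and the PendingRanges tracker are all invented here for illustration, and endpoints and statuses are plain strings. It only shows the general shape of the problem: recording a SHUTDOWN status is not enough unless listeners that own derived state (here, pending ranges) are also told about it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model; none of these types are Cassandra's own classes.
final class ShutdownNotificationSketch
{
    /** Stand-in for a gossip state-change listener. */
    interface Subscriber
    {
        void onStatusChange(String endpoint, String status);
    }

    /** Stand-in for token metadata: tracks which endpoints still have pending ranges. */
    static final class PendingRanges implements Subscriber
    {
        private final Set<String> endpointsWithPendingRanges = new HashSet<>();

        void addPending(String endpoint) { endpointsWithPendingRanges.add(endpoint); }

        boolean hasPending(String endpoint) { return endpointsWithPendingRanges.contains(endpoint); }

        @Override
        public void onStatusChange(String endpoint, String status)
        {
            // Assumption of this sketch: a peer that shut down no longer has ranges in flight.
            if ("SHUTDOWN".equals(status))
                endpointsWithPendingRanges.remove(endpoint);
        }
    }

    private final Map<String, String> statusByEndpoint = new HashMap<>();
    private final List<Subscriber> subscribers = new ArrayList<>();

    void register(Subscriber subscriber) { subscribers.add(subscriber); }

    void setStatus(String endpoint, String status) { statusByEndpoint.put(endpoint, status); }

    String status(String endpoint) { return statusByEndpoint.get(endpoint); }

    /** Records the SHUTDOWN status and also notifies every subscriber of the change. */
    void markAsShutdown(String endpoint)
    {
        if (!statusByEndpoint.containsKey(endpoint))
            return; // unknown endpoint: nothing to update
        statusByEndpoint.put(endpoint, "SHUTDOWN");
        for (Subscriber subscriber : subscribers)
            subscriber.onStatusChange(endpoint, "SHUTDOWN");
    }
}
```

If the notification loop in markAsShutdown() were omitted, the status map would say SHUTDOWN while the tracker still reported pending ranges for the endpoint, which is the inconsistency this ticket describes.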



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Updated] (CASSANDRA-16796) Clear pending ranges for a SHUTDOWN peer

2021-07-20 Thread Sam Tunnicliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-16796:

Status: Ready to Commit  (was: Review In Progress)

> Clear pending ranges for a SHUTDOWN peer
> 
>
> Key: CASSANDRA-16796
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16796
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Membership
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x
>
>
> If a node involved in a MOVE operation should fail, peers can sometimes 
> maintain pending ranges for it even when it has left the ring and/or been 
> replaced (in practice until the peer is next bounced). This in turn can lead 
> to bogus unavailable responses to clients if a replica for any of the 
> pending ranges should go down.
> If the moving node crashes hard, a subsequent replacement will correctly fail 
> as long as cassandra.consistent.rangemovement is set to true because the new 
> node will learn the MOVING status from the remaining peers. A graceful 
> shutdown, however, causes that status to be replaced with SHUTDOWN, but 
> doesn't update TokenMetadata, so pending ranges remain for the down node, 
> even after it has been removed from the ring.






[cassandra] 01/01: Merge branch 'cassandra-4.0' into trunk

2021-07-20 Thread samt
This is an automated email from the ASF dual-hosted git repository.

samt pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit fcea6a5509eb1b0f83aba3e6eda234383f1b9d11
Merge: e8ba1c3 64ec400
Author: Sam Tunnicliffe 
AuthorDate: Tue Jul 20 19:32:39 2021 +0100

Merge branch 'cassandra-4.0' into trunk

 CHANGES.txt|   1 +
 src/java/org/apache/cassandra/gms/Gossiper.java|   6 +-
 .../cassandra/distributed/shared/ClusterUtils.java |  31 ++
 .../cassandra/distributed/test/GossipTest.java | 112 -
 4 files changed, 144 insertions(+), 6 deletions(-)





[cassandra] 01/01: Merge branch 'cassandra-3.0' into cassandra-3.11

2021-07-20 Thread samt
This is an automated email from the ASF dual-hosted git repository.

samt pushed a commit to branch cassandra-3.11
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit b604bd20cde2a47ed375c1460da4f05182dbd037
Merge: 25dbbfd fbb20b9
Author: Sam Tunnicliffe 
AuthorDate: Tue Jul 20 19:00:25 2021 +0100

Merge branch 'cassandra-3.0' into cassandra-3.11

 CHANGES.txt|   1 +
 src/java/org/apache/cassandra/gms/Gossiper.java|   6 +-
 .../cassandra/distributed/test/GossipTest.java | 138 -
 3 files changed, 139 insertions(+), 6 deletions(-)

diff --cc CHANGES.txt
index a43559e,58ac902..89c7813
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,17 -1,5 +1,18 @@@
 -3.0.25:
 +3.11.11
 + * Make cqlsh use the same set of reserved keywords than the server uses 
(CASSANDRA-15663)
 + * Optimize bytes skipping when reading SSTable files (CASSANDRA-14415)
 + * Enable tombstone compactions when unchecked_tombstone_compaction is set in 
TWCS (CASSANDRA-14496)
 + * Read only the required SSTables for single partition queries 
(CASSANDRA-16737)
 + * Fix LeveledCompactionStrategy compacts last level throw an 
ArrayIndexOutOfBoundsException (CASSANDRA-15669)
 + * Maps $CASSANDRA_LOG_DIR to cassandra.logdir java property when executing 
nodetool (CASSANDRA-16199)
 + * Nodetool garbagecollect should retain SSTableLevel for LCS 
(CASSANDRA-16634)
 + * Ignore stale acks received in the shadow round (CASSANDRA-16588)
 + * Add autocomplete and error messages for provide_overlapping_tombstones 
(CASSANDRA-16350)
 + * Add StorageServiceMBean.getKeyspaceReplicationInfo(keyspaceName) 
(CASSANDRA-16447)
 + * Make sure sstables with moved starts are removed correctly in 
LeveledGenerations (CASSANDRA-16552)
 + * Upgrade jackson-databind to 2.9.10.8 (CASSANDRA-16462)
 +Merged from 3.0:
+  * Receipt of gossip shutdown notification updates TokenMetadata 
(CASSANDRA-16796)
   * Count bloom filter misses correctly (CASSANDRA-12922)
   * Reject token() in MV WHERE clause (CASSANDRA-13464)
   * Ensure java executable is on the path (CASSANDRA-14325)




[cassandra] branch cassandra-4.0 updated (1853006 -> 64ec400)

2021-07-20 Thread samt
This is an automated email from the ASF dual-hosted git repository.

samt pushed a change to branch cassandra-4.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from 1853006  Merge branch 'cassandra-3.11' into cassandra-4.0
 new fbb20b9  Receipt of gossip shutdown updates TokenMetadata
 new b604bd2  Merge branch 'cassandra-3.0' into cassandra-3.11
 new 64ec400  Merge branch 'cassaNDRA-3.11' into cassandra-4.0

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGES.txt|   1 +
 src/java/org/apache/cassandra/gms/Gossiper.java|   6 +-
 .../cassandra/distributed/shared/ClusterUtils.java |  31 ++
 .../cassandra/distributed/test/GossipTest.java | 112 -
 4 files changed, 144 insertions(+), 6 deletions(-)




[cassandra] 01/01: Merge branch 'cassaNDRA-3.11' into cassandra-4.0

2021-07-20 Thread samt
This is an automated email from the ASF dual-hosted git repository.

samt pushed a commit to branch cassandra-4.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit 64ec400ab6504ebf9447eb7250064a7795e29b81
Merge: 1853006 b604bd2
Author: Sam Tunnicliffe 
AuthorDate: Tue Jul 20 19:31:56 2021 +0100

Merge branch 'cassaNDRA-3.11' into cassandra-4.0

 CHANGES.txt|   1 +
 src/java/org/apache/cassandra/gms/Gossiper.java|   6 +-
 .../cassandra/distributed/shared/ClusterUtils.java |  31 ++
 .../cassandra/distributed/test/GossipTest.java | 112 -
 4 files changed, 144 insertions(+), 6 deletions(-)

diff --cc CHANGES.txt
index b51f94f,89c7813..414dc3b
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -9,7 -3,16 +9,8 @@@ Merged from 3.11
   * Optimize bytes skipping when reading SSTable files (CASSANDRA-14415)
   * Enable tombstone compactions when unchecked_tombstone_compaction is set in 
TWCS (CASSANDRA-14496)
   * Read only the required SSTables for single partition queries 
(CASSANDRA-16737)
 - * Fix LeveledCompactionStrategy compacts last level throw an 
ArrayIndexOutOfBoundsException (CASSANDRA-15669)
 - * Maps $CASSANDRA_LOG_DIR to cassandra.logdir java property when executing 
nodetool (CASSANDRA-16199)
 - * Nodetool garbagecollect should retain SSTableLevel for LCS 
(CASSANDRA-16634)
 - * Ignore stale acks received in the shadow round (CASSANDRA-16588)
 - * Add autocomplete and error messages for provide_overlapping_tombstones 
(CASSANDRA-16350)
 - * Add StorageServiceMBean.getKeyspaceReplicationInfo(keyspaceName) 
(CASSANDRA-16447)
 - * Make sure sstables with moved starts are removed correctly in 
LeveledGenerations (CASSANDRA-16552)
 - * Upgrade jackson-databind to 2.9.10.8 (CASSANDRA-16462)
  Merged from 3.0:
+  * Receipt of gossip shutdown notification updates TokenMetadata 
(CASSANDRA-16796)
   * Count bloom filter misses correctly (CASSANDRA-12922)
   * Reject token() in MV WHERE clause (CASSANDRA-13464)
   * Ensure java executable is on the path (CASSANDRA-14325)
diff --cc src/java/org/apache/cassandra/gms/Gossiper.java
index e194807,1603693..2c38cfb
--- a/src/java/org/apache/cassandra/gms/Gossiper.java
+++ b/src/java/org/apache/cassandra/gms/Gossiper.java
@@@ -539,13 -435,15 +539,17 @@@ public class Gossiper implements IFailu
          EndpointState epState = endpointStateMap.get(endpoint);
          if (epState == null)
              return;
-         epState.addApplicationState(ApplicationState.STATUS_WITH_PORT, StorageService.instance.valueFactory.shutdown(true));
+         VersionedValue shutdown = StorageService.instance.valueFactory.shutdown(true);
 -        epState.addApplicationState(ApplicationState.STATUS, shutdown);
++        epState.addApplicationState(ApplicationState.STATUS_WITH_PORT, shutdown);
 +        epState.addApplicationState(ApplicationState.STATUS, StorageService.instance.valueFactory.shutdown(true));
          epState.addApplicationState(ApplicationState.RPC_READY, StorageService.instance.valueFactory.rpcReady(false));
          epState.getHeartBeatState().forceHighestPossibleVersionUnsafe();
          markDead(endpoint, epState);
          FailureDetector.instance.forceConviction(endpoint);
 +        GossiperDiagnostics.markedAsShutdown(this, endpoint);
+         for (IEndpointStateChangeSubscriber subscriber : subscribers)
 -            subscriber.onChange(endpoint, ApplicationState.STATUS, shutdown);
++            subscriber.onChange(endpoint, ApplicationState.STATUS_WITH_PORT, shutdown);
+         logger.debug("Marked {} as shutdown", endpoint);
      }
  
      /**
diff --cc 
test/distributed/org/apache/cassandra/distributed/shared/ClusterUtils.java
index a68e819,000..1755857
mode 100644,00..100644
--- a/test/distributed/org/apache/cassandra/distributed/shared/ClusterUtils.java
+++ b/test/distributed/org/apache/cassandra/distributed/shared/ClusterUtils.java
@@@ -1,774 -1,0 +1,805 @@@
 +/*
 + * Licensed to the Apache Software Foundation (ASF) under one
 + * or more contributor license agreements.  See the NOTICE file
 + * distributed with this work for additional information
 + * regarding copyright ownership.  The ASF licenses this file
 + * to you under the Apache License, Version 2.0 (the
 + * "License"); you may not use this file except in compliance
 + * with the License.  You may obtain a copy of the License at
 + *
 + * http://www.apache.org/licenses/LICENSE-2.0
 + *
 + * Unless required by applicable law or agreed to in writing, software
 + * distributed under the License is distributed on an "AS IS" BASIS,
 + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 + * See the License for the specific language governing permissions and
 + * limitations under the License.
 + */
 +
 +package org.apache.cassandra.distributed.shared;
 +
 +import java.io.File;
 +import java.net.InetSocketAddress;
 +import java.util.ArrayList;
 +import 

[cassandra] branch trunk updated (e8ba1c3 -> fcea6a5)

2021-07-20 Thread samt
This is an automated email from the ASF dual-hosted git repository.

samt pushed a change to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from e8ba1c3  Merge branch 'cassandra-4.0' into trunk
 new fbb20b9  Receipt of gossip shutdown updates TokenMetadata
 new b604bd2  Merge branch 'cassandra-3.0' into cassandra-3.11
 new 64ec400  Merge branch 'cassaNDRA-3.11' into cassandra-4.0
 new fcea6a5  Merge branch 'cassandra-4.0' into trunk

The 4 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGES.txt|   1 +
 src/java/org/apache/cassandra/gms/Gossiper.java|   6 +-
 .../cassandra/distributed/shared/ClusterUtils.java |  31 ++
 .../cassandra/distributed/test/GossipTest.java | 112 -
 4 files changed, 144 insertions(+), 6 deletions(-)




[cassandra] branch cassandra-3.0 updated: Receipt of gossip shutdown updates TokenMetadata

2021-07-20 Thread samt
This is an automated email from the ASF dual-hosted git repository.

samt pushed a commit to branch cassandra-3.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/cassandra-3.0 by this push:
 new fbb20b9  Receipt of gossip shutdown updates TokenMetadata
fbb20b9 is described below

commit fbb20b9162b73c4de8a82cf4ffdde3304e904603
Author: Sam Tunnicliffe 
AuthorDate: Mon Jul 12 17:23:18 2021 +0100

Receipt of gossip shutdown updates TokenMetadata

Patch by Sam Tunnicliffe; reviewed by Caleb Rackliffe for
CASSANDRA-16796
---
 CHANGES.txt|   1 +
 src/java/org/apache/cassandra/gms/Gossiper.java|   6 +-
 .../cassandra/distributed/test/GossipTest.java | 138 -
 3 files changed, 139 insertions(+), 6 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index d4e9322..58ac902 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.25:
+ * Receipt of gossip shutdown notification updates TokenMetadata 
(CASSANDRA-16796)
  * Count bloom filter misses correctly (CASSANDRA-12922)
  * Reject token() in MV WHERE clause (CASSANDRA-13464)
  * Ensure java executable is on the path (CASSANDRA-14325)
diff --git a/src/java/org/apache/cassandra/gms/Gossiper.java 
b/src/java/org/apache/cassandra/gms/Gossiper.java
index 0f37dc9..818df50 100644
--- a/src/java/org/apache/cassandra/gms/Gossiper.java
+++ b/src/java/org/apache/cassandra/gms/Gossiper.java
@@ -426,11 +426,15 @@ public class Gossiper implements IFailureDetectionEventListener, GossiperMBean
         EndpointState epState = endpointStateMap.get(endpoint);
         if (epState == null)
             return;
-        epState.addApplicationState(ApplicationState.STATUS, StorageService.instance.valueFactory.shutdown(true));
+        VersionedValue shutdown = StorageService.instance.valueFactory.shutdown(true);
+        epState.addApplicationState(ApplicationState.STATUS, shutdown);
         epState.addApplicationState(ApplicationState.RPC_READY, StorageService.instance.valueFactory.rpcReady(false));
         epState.getHeartBeatState().forceHighestPossibleVersionUnsafe();
         markDead(endpoint, epState);
         FailureDetector.instance.forceConviction(endpoint);
+        for (IEndpointStateChangeSubscriber subscriber : subscribers)
+            subscriber.onChange(endpoint, ApplicationState.STATUS, shutdown);
+        logger.debug("Marked {} as shutdown", endpoint);
     }
 
     /**
diff --git 
a/test/distributed/org/apache/cassandra/distributed/test/GossipTest.java 
b/test/distributed/org/apache/cassandra/distributed/test/GossipTest.java
index 32ecb95..ba6027b 100644
--- a/test/distributed/org/apache/cassandra/distributed/test/GossipTest.java
+++ b/test/distributed/org/apache/cassandra/distributed/test/GossipTest.java
@@ -20,16 +20,15 @@ package org.apache.cassandra.distributed.test;
 
 import java.io.Closeable;
 import java.net.InetAddress;
+import java.util.ArrayList;
 import java.util.Collection;
-import java.util.concurrent.CountDownLatch;
-import java.util.concurrent.ExecutorService;
-import java.util.concurrent.Executors;
-import java.util.concurrent.Future;
-import java.util.concurrent.TimeUnit;
+import java.util.List;
+import java.util.concurrent.*;
 import java.util.concurrent.locks.LockSupport;
 import java.util.stream.Collectors;
 
 import com.google.common.collect.Iterables;
+import com.google.common.util.concurrent.Futures;
 import com.google.common.util.concurrent.Uninterruptibles;
 import org.junit.Assert;
 import org.junit.Test;
@@ -39,16 +38,21 @@ import net.bytebuddy.dynamic.loading.ClassLoadingStrategy;
 import net.bytebuddy.implementation.MethodDelegation;
 import org.apache.cassandra.dht.Token;
 import org.apache.cassandra.distributed.Cluster;
+import org.apache.cassandra.distributed.api.*;
 import org.apache.cassandra.gms.ApplicationState;
 import org.apache.cassandra.gms.EndpointState;
 import org.apache.cassandra.gms.Gossiper;
+import org.apache.cassandra.service.PendingRangeCalculatorService;
 import org.apache.cassandra.service.StorageService;
+import org.apache.cassandra.streaming.StreamPlan;
+import org.apache.cassandra.streaming.StreamResultFuture;
 import org.apache.cassandra.utils.FBUtilities;
 
 import static net.bytebuddy.matcher.ElementMatchers.named;
 import static net.bytebuddy.matcher.ElementMatchers.takesArguments;
 import static org.apache.cassandra.distributed.api.Feature.GOSSIP;
 import static org.apache.cassandra.distributed.api.Feature.NETWORK;
+import static org.junit.Assert.assertEquals;
 
 public class GossipTest extends TestBaseImpl
 {
@@ -224,4 +228,128 @@ public class GossipTest extends TestBaseImpl
 }
 }
 
+@Test
+public void gossipShutdownUpdatesTokenMetadata() throws Exception
+{
+try (Cluster cluster = Cluster.build(3)
+  .withConfig(c -> c.with(Feature.GOSSIP, 
Feature.NETWORK))
+  

[cassandra] branch cassandra-3.11 updated (25dbbfd -> b604bd2)

2021-07-20 Thread samt
This is an automated email from the ASF dual-hosted git repository.

samt pushed a change to branch cassandra-3.11
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from 25dbbfd  Merge branch 'cassandra-3.0' into cassandra-3.11
 new fbb20b9  Receipt of gossip shutdown updates TokenMetadata
 new b604bd2  Merge branch 'cassandra-3.0' into cassandra-3.11

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGES.txt|   1 +
 src/java/org/apache/cassandra/gms/Gossiper.java|   6 +-
 .../cassandra/distributed/test/GossipTest.java | 138 -
 3 files changed, 139 insertions(+), 6 deletions(-)




[jira] [Commented] (CASSANDRA-16807) Weak visibility guarantees of Accumulator lead to failed assertions during digest comparison

2021-07-20 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384410#comment-17384410
 ] 

Caleb Rackliffe commented on CASSANDRA-16807:
-

I think [~benedict] is right. More specifically, since CASSANDRA-14735, even 
speculating to non-local-DC nodes isn't possible for {{LOCAL_X}} reads. 
{{ReplicaPlans.forRead()}} and {{forRangeRead()}} both hit 
{{candidatesForRead()}}, which filters out non-local nodes for local CLs.

[~adelapena] I've [updated the 
PR|https://github.com/apache/cassandra/pull/1110/files?file-filters%5B%5D=.java#diff-5297967879d7d61d7874555b46b5c853138384f42724af2427271d613725eb2aL139]
 and kicked off new [test 
runs|https://app.circleci.com/pipelines/github/maedhroz/cassandra?branch=CASSANDRA-16807-trunk].
 Let me know what you think...

> Weak visibility guarantees of Accumulator lead to failed assertions during 
> digest comparison
> 
>
> Key: CASSANDRA-16807
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16807
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.0-rc, 4.0.x, 4.x
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> This problem could manifest on all versions, beginning on at least 3.0, but 
> I’ll focus on the way it manifests in 4.0 here.
> In what now seems like a wise move, CASSANDRA-16097 added an assertion to 
> {{DigestResolver#responsesMatch()}} that ensures the responses snapshot has at 
> least one visible element to compare (although of course only one element 
> trivially cannot generate a mismatch and short-circuits immediately). 
> However, at the point {{ReadCallback#onResponse()}} signals the waiting 
> resolver, there is no guarantee that the size of the generated snapshot of 
> the responses {{Accumulator}} is non-zero, or perhaps more worryingly, at 
> least equal to the number of blocked-for responses. This seems to be a 
> consequence of the documented weak visibility guarantees on 
> {{Accumulator#add()}}. In short, if there are concurrent invocations of 
> {{add()}}, it is not guaranteed that there is any visible size change after any 
> one of them returns, but only after all complete.
> The particular exception looks something like this:
> {noformat}
> java.lang.AssertionError: Attempted response match comparison while no 
> responses have been received.
>   at 
> org.apache.cassandra.service.reads.DigestResolver.responsesMatch(DigestResolver.java:110)
>   at 
> org.apache.cassandra.service.reads.AbstractReadExecutor.awaitResponses(AbstractReadExecutor.java:393)
>   at 
> org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:2150)
>   at 
> org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1979)
>   at 
> org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1882)
>   at 
> org.apache.cassandra.db.SinglePartitionReadCommand$Group.execute(SinglePartitionReadCommand.java:1121)
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:296)
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:248)
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:90)
> {noformat}
> It’s possible to reproduce this on simple single-partition reads without any 
> short-read protection or replica filtering protection. I’ve also been able to 
> reproduce this synthetically with [a unit 
> test|https://github.com/apache/cassandra/pull/1110] on {{ReadCallback}}.
> It seems like the most straightforward way to fix this would be to avoid 
> signaling in {{ReadCallback#onResponse()}} until the visible size of the 
> accumulator is at least the number of received responses. In most cases, this 
> is trivially true, and our signaling behavior won’t change at all. In the 
> very rare case that there are two (or more) concurrent calls to 
> {{onResponse()}}, the second (or last) will signal, and having one more 
> response than we strictly need should have no negative side effects. (We 
> don’t seem to make any strict assertions about having exactly the number of 
> required responses, only that we have enough.)
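
The signaling rule proposed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patch in the linked PR: the class VisibleSizeCallbackSketch is hypothetical, a CopyOnWriteArrayList stands in for the responses Accumulator (and has stronger visibility guarantees than the real one), and a CountDownLatch stands in for the condition the resolver waits on. Only the guard in onResponse() is the point: signal once enough responses have been counted and at least that many are visible to a reader.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a read callback; not Cassandra's ReadCallback.
final class VisibleSizeCallbackSketch<T>
{
    private final int blockFor;
    // Stand-in for the responses Accumulator (assumption: stronger visibility than the real one).
    private final List<T> responses = new CopyOnWriteArrayList<>();
    private final AtomicInteger received = new AtomicInteger();
    private final CountDownLatch condition = new CountDownLatch(1);

    VisibleSizeCallbackSketch(int blockFor)
    {
        this.blockFor = blockFor;
    }

    void onResponse(T response)
    {
        responses.add(response);
        int n = received.incrementAndGet();
        // Signal only when enough responses were counted AND a reader can see at least that many.
        if (n >= blockFor && responses.size() >= n)
            condition.countDown();
    }

    /** True once the waiting side has been released. */
    boolean isSignaled()
    {
        return condition.getCount() == 0;
    }

    /** Copy of the responses visible at the time of the call. */
    List<T> snapshot()
    {
        return new ArrayList<>(responses);
    }
}
```

With this guard, any snapshot taken after the signal contains at least blockFor elements, so a "no responses received" assertion of the kind shown in the stack trace cannot fire in this model. With a structure whose size can lag behind concurrent add() calls, the same guard would defer the signal to a later onResponse() call, as the description suggests.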






[jira] [Commented] (CASSANDRA-9430) Add startup options to cqlshrc

2021-07-20 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-9430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384407#comment-17384407
 ] 

Brandon Williams commented on CASSANDRA-9430:
-

Cancelling patch as it needs a slight rebase, if anyone is interested in 
pursuing this.

> Add startup options to cqlshrc
> --
>
> Key: CASSANDRA-9430
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9430
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Tools
>Reporter: Jeremy Hanna
>Priority: Low
>  Labels: cqlsh, lhf
>
> There are certain settings that would be nice to set defaults for in the 
> cqlshrc file.  For example, a user may want to set the paging to off by 
> default for their environment.  You can't simply do
> {code}
> echo "paging off;" | cqlsh
> {code}
> because this would disable paging and immediately exit cqlsh.
> So it would be nice to have a section of the cqlshrc to include default 
> settings on startup.






[jira] [Updated] (CASSANDRA-9430) Add startup options to cqlshrc

2021-07-20 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-9430:

Status: Open  (was: Patch Available)

> Add startup options to cqlshrc
> --
>
> Key: CASSANDRA-9430
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9430
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Tools
>Reporter: Jeremy Hanna
>Priority: Low
>  Labels: cqlsh, lhf
>
> There are certain settings that would be nice to set defaults for in the 
> cqlshrc file.  For example, a user may want to set the paging to off by 
> default for their environment.  You can't simply do
> {code}
> echo "paging off;" | cqlsh
> {code}
> because this would disable paging and immediately exit cqlsh.
> So it would be nice to have a section of the cqlshrc to include default 
> settings on startup.






[jira] [Updated] (CASSANDRA-14346) Scheduled Repair in Cassandra

2021-07-20 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-14346:
-
Status: Open  (was: Patch Available)

> Scheduled Repair in Cassandra
> -
>
> Key: CASSANDRA-14346
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14346
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Repair
>Reporter: Joey Lynch
>Assignee: Joey Lynch
>Priority: Normal
>  Labels: 4.0-feature-freeze-review-requested, 
> CommunityFeedbackRequested
> Fix For: 4.x
>
> Attachments: ScheduledRepairV1_20180327.pdf
>
>
> There have been many attempts to automate repair in Cassandra, which makes 
> sense given that it is necessary to give our users eventual consistency. Most 
> recently CASSANDRA-10070, CASSANDRA-8911 and CASSANDRA-13924 have all looked 
> for ways to solve this problem.
> At Netflix we've built a scheduled repair service within Priam (our sidecar), 
> which we spoke about last year at NGCC. Given the positive feedback at NGCC 
> we focussed on getting it production ready and have now been using it in 
> production to repair hundreds of clusters, tens of thousands of nodes, and 
> petabytes of data for the past six months. Also based on feedback at NGCC we 
> have invested effort in figuring out how to integrate this natively into 
> Cassandra rather than open sourcing it as an external service (e.g. in Priam).
> As such, [~vinaykumarcse] and I would like to re-work and merge our 
> implementation into Cassandra, and have created a [design 
> document|https://docs.google.com/document/d/1RV4rOrG1gwlD5IljmrIq_t45rz7H3xs9GbFSEyGzEtM/edit?usp=sharing]
>  showing how we plan to make it happen, including the user interface.
> As we work on the code migration from Priam to Cassandra, any feedback would 
> be greatly appreciated about the interface or v1 implementation features. I 
> have tried to call out in the document features which we explicitly consider 
> future work (as well as a path forward to implement them in the future) 
> because I would very much like to get this done before the 4.0 merge window 
> closes, and to do that I think aggressively pruning scope is going to be a 
> necessity.






[jira] [Updated] (CASSANDRA-6538) Provide a read-time CQL function to display the data size of columns and rows

2021-07-20 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-6538:

Resolution: Won't Fix
Status: Resolved  (was: Open)

Cancelling patch and due to the amount of time passed, closing as wontfix.  
Please reopen if there is renewed interest in this.

> Provide a read-time CQL function to display the data size of columns and rows
> -
>
> Key: CASSANDRA-6538
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6538
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/CQL
>Reporter: Johnny Miller
>Priority: Low
>  Labels: cql
> Attachments: 6538-v2.patch, 6538.patch, CodeSnippet.txt, sizeFzt.PNG
>
>
> It would be extremely useful to be able to work out the size of rows and 
> columns via CQL. 






[jira] [Updated] (CASSANDRA-6538) Provide a read-time CQL function to display the data size of columns and rows

2021-07-20 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-6538:

Status: Open  (was: Patch Available)

> Provide a read-time CQL function to display the data size of columns and rows
> -
>
> Key: CASSANDRA-6538
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6538
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/CQL
>Reporter: Johnny Miller
>Priority: Low
>  Labels: cql
> Attachments: 6538-v2.patch, 6538.patch, CodeSnippet.txt, sizeFzt.PNG
>
>
> It would be extremely useful to be able to work out the size of rows and 
> columns via CQL. 






[jira] [Comment Edited] (CASSANDRA-14463) Prevent the generation of new tokens when using replace_address flag

2021-07-20 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384393#comment-17384393
 ] 

Brandon Williams edited comment on CASSANDRA-14463 at 7/20/21, 4:46 PM:


Those merge cleanly.

||Branch||CI|
|[https://github.com/driftx/cassandra/tree/CASSANDRA-14463-4.0|4.0]|[!https://ci-cassandra.apache.org/job/Cassandra-devbranch/964/badge/icon!|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/964/pipeline]|
|[https://github.com/driftx/cassandra/tree/CASSANDRA-14463-trunk|trunk]|[!https://ci-cassandra.apache.org/job/Cassandra-devbranch/965/badge/icon!|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/965/pipeline]|


was (Author: brandon.williams):
Those merge cleanly.

||Branch||CI||
[https://github.com/driftx/cassandra/tree/CASSANDRA-14463-4.0|4.0]|[!https://ci-cassandra.apache.org/job/Cassandra-devbranch/964/badge/icon!|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/964/pipeline]|
[https://github.com/driftx/cassandra/tree/CASSANDRA-14463-trunk|trunk]|[!https://ci-cassandra.apache.org/job/Cassandra-devbranch/965/badge/icon!|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/965/pipeline]|


> Prevent the generation of new tokens when using replace_address flag
> 
>
> Key: CASSANDRA-14463
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14463
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Distributed Metadata
>Reporter: Vincent White
>Assignee: Vincent White
>Priority: Low
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> This is a follow up to/replacement of CASSANDRA-14073.
> The behaviour that I want to avoid is someone trying to replace a node with 
> the replace_address flag and mistakenly having that node listed in its own 
> seed list which causes the node to generate a new set of random tokens before 
> joining the ring. 
> Currently anytime an unbootstrapped node is listed in its own seed list and 
> initial_token isn't set in the yaml, Cassandra will generate a new set of 
> random tokens and join the ring regardless of whether it was replacing a 
> previous node or not. 
> We could simply check for this configuration and refuse to start, but it's 
> probably better (particularly for 3.0.X) if it's handled in the same manner 
> as skipping streaming with the allow_unsafe_replace flag that was introduced 
> in 3.X. This would still allow 3.0.X users the ability to re-bootstrap nodes 
> without needing to re-stream all the data to the node again, which can be 
> useful. 
> We currently handle replacing without streaming differently between 3.0.X and 
> 3.X. In 3.X we have the allow_unsafe_replace JVM flag to allow the use of 
> auto_bootstrap: false in combination with the replace_address option.  But in 
> 3.0.X to perform the replacement of a node with the same IP address without 
> streaming I believe you need to:
>  * Set replace_address (because the address is already in gossip)
>  * Include the node in its own seed list (to skip bootstrapping/streaming)
>  * Set the initial_token to the token/s owned by the previous node (to 
> prevent it generating new tokens).
> I believe if 3.0.X simply refused to start when a node has itself in its seed 
> list and replace_address set this will completely block this operation. 
> Example patches to fix this edge case using allow_unsafe_replace:
>  
> ||Branch||
> |[3.0.x|https://github.com/apache/cassandra/compare/trunk...vincewhite:30-no_clobber]|
> |[3.x|https://github.com/apache/cassandra/compare/trunk...vincewhite:311-no_clobber]|
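
The edge case described above can be sketched as a small configuration check. This is only an illustrative Python sketch, not Cassandra's startup code: the function names and parameters are hypothetical, and the guard that refuses to start unless allow_unsafe_replace is set reflects the proposal in the description, not a confirmed implementation.

```python
from typing import Optional, Sequence


def generates_new_tokens(node: str,
                         seeds: Sequence[str],
                         initial_token: Optional[str],
                         bootstrapped: bool = False) -> bool:
    """True for the case the ticket describes: an unbootstrapped node that is
    listed in its own seed list and has no initial_token set."""
    return (not bootstrapped) and node in seeds and initial_token is None


def check_replace_config(node: str,
                         seeds: Sequence[str],
                         replace_address: Optional[str],
                         initial_token: Optional[str],
                         allow_unsafe_replace: bool = False) -> None:
    """Hypothetical guard: reject a replacement that would silently pick new
    random tokens, unless the operator explicitly opted in."""
    clobbers = (replace_address is not None
                and generates_new_tokens(node, seeds, initial_token))
    if clobbers and not allow_unsafe_replace:
        raise ValueError(
            "replace_address is set, the node is in its own seed list and "
            "initial_token is unset: new random tokens would be generated")
```

With these assumptions, a node at 10.0.0.1 that lists itself as a seed and sets replace_address without initial_token is rejected, while the same configuration with initial_token set (the 3.0.X workaround listed above) passes the check.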






[jira] [Comment Edited] (CASSANDRA-14463) Prevent the generation of new tokens when using replace_address flag

2021-07-20 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384393#comment-17384393
 ] 

Brandon Williams edited comment on CASSANDRA-14463 at 7/20/21, 4:45 PM:


Those merge cleanly.

||Branch||CI||
[https://github.com/driftx/cassandra/tree/CASSANDRA-14463-4.0|4.0]|[!https://ci-cassandra.apache.org/job/Cassandra-devbranch/964/badge/icon!|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/964/pipeline]|
[https://github.com/driftx/cassandra/tree/CASSANDRA-14463-trunk|trunk]|[!https://ci-cassandra.apache.org/job/Cassandra-devbranch/965/badge/icon!|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/965/pipeline]|



was (Author: brandon.williams):
Those merge cleanly.

||Branch||CI||
|[https://github.com/driftx/cassandra/tree/CASSANDRA-14463-4.0|4.0]|[!https://ci-cassandra.apache.org/job/Cassandra-devbranch/964/badge/icon!|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/964/pipeline]|
|[https://github.com/driftx/cassandra/tree/CASSANDRA-14463-trunk|trunk]|[!https://ci-cassandra.apache.org/job/Cassandra-devbranch/965/badge/icon!|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/965/pipeline]|


> Prevent the generation of new tokens when using replace_address flag
> 
>
> Key: CASSANDRA-14463
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14463
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Distributed Metadata
>Reporter: Vincent White
>Assignee: Vincent White
>Priority: Low
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> This is a follow up to/replacement of CASSANDRA-14073.
> The behaviour that I want to avoid is someone trying to replace a node with 
> the replace_address flag and mistakenly having that node listed in its own 
> seed list which causes the node to generate a new set of random tokens before 
> joining the ring. 
> Currently anytime an unbootstrapped node is listed in its own seed list and 
> initial_token isn't set in the yaml, Cassandra will generate a new set of 
> random tokens and join the ring regardless of whether it was replacing a 
> previous node or not. 
> We could simply check for this configuration and refuse to start, but it's 
> probably better (particularly for 3.0.X) if it's handled in the same manner 
> as skipping streaming with the allow_unsafe_replace flag that was introduced 
> in 3.X. This would still allow 3.0.X users the ability to re-bootstrap nodes 
> without needing to re-stream all the data to the node again, which can be 
> useful. 
> We currently handle replacing without streaming differently between 3.0.X and 
> 3.X. In 3.X we have the allow_unsafe_replace JVM flag to allow the use of 
> auto_bootstrap: false in combination with the replace_address option.  But in 
> 3.0.X to perform the replacement of a node with the same IP address without 
> streaming I believe you need to:
>  * Set replace_address (because the address is already in gossip)
>  * Include the node in its own seed list (to skip bootstrapping/streaming)
>  * Set the initial_token to the token/s owned by the previous node (to 
> prevent it generating new tokens).
> I believe if 3.0.X simply refused to start when a node has itself in its seed 
> list and replace_address set this will completely block this operation. 
> Example patches to fix this edge case using allow_unsafe_replace:
>  
> ||Branch||
> |[3.0.x|https://github.com/apache/cassandra/compare/trunk...vincewhite:30-no_clobber]|
> |[3.x|https://github.com/apache/cassandra/compare/trunk...vincewhite:311-no_clobber]|






[jira] [Commented] (CASSANDRA-14463) Prevent the generation of new tokens when using replace_address flag

2021-07-20 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384393#comment-17384393
 ] 

Brandon Williams commented on CASSANDRA-14463:
--

Those merge cleanly.

||Branch||CI||
|[https://github.com/driftx/cassandra/tree/CASSANDRA-14463-4.0|4.0]|[!https://ci-cassandra.apache.org/job/Cassandra-devbranch/964/badge/icon!|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/964/pipeline]|
|[https://github.com/driftx/cassandra/tree/CASSANDRA-14463-trunk|trunk]|[!https://ci-cassandra.apache.org/job/Cassandra-devbranch/965/badge/icon!|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/965/pipeline]|


> Prevent the generation of new tokens when using replace_address flag
> 
>
> Key: CASSANDRA-14463
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14463
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Distributed Metadata
>Reporter: Vincent White
>Assignee: Vincent White
>Priority: Low
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> This is a follow up to/replacement of CASSANDRA-14073.
> The behaviour that I want to avoid is someone trying to replace a node with 
> the replace_address flag and mistakenly having that node listed in its own 
> seed list which causes the node to generate a new set of random tokens before 
> joining the ring. 
> Currently anytime an unbootstrapped node is listed in its own seed list and 
> initial_token isn't set in the yaml, Cassandra will generate a new set of 
> random tokens and join the ring regardless of whether it was replacing a 
> previous node or not. 
> We could simply check for this configuration and refuse to start, but it's 
> probably better (particularly for 3.0.X) if it's handled in the same manner 
> as skipping streaming with the allow_unsafe_replace flag that was introduced 
> in 3.X. This would still allow 3.0.X users the ability to re-bootstrap nodes 
> without needing to re-stream all the data to the node again, which can be 
> useful. 
> We currently handle replacing without streaming differently between 3.0.X and 
> 3.X. In 3.X we have the allow_unsafe_replace JVM flag to allow the use of 
> auto_bootstrap: false in combination with the replace_address option.  But in 
> 3.0.X to perform the replacement of a node with the same IP address without 
> streaming I believe you need to:
>  * Set replace_address (because the address is already in gossip)
>  * Include the node in its own seed list (to skip bootstrapping/streaming)
>  * Set the initial_token to the token/s owned by the previous node (to 
> prevent it generating new tokens).
> I believe if 3.0.X simply refused to start when a node has itself in its seed 
> list and replace_address set this will completely block this operation. 
> Example patches to fix this edge case using allow_unsafe_replace:
>  
> ||Branch||
> |[3.0.x|https://github.com/apache/cassandra/compare/trunk...vincewhite:30-no_clobber]|
> |[3.x|https://github.com/apache/cassandra/compare/trunk...vincewhite:311-no_clobber]|






[jira] [Updated] (CASSANDRA-16736) CQL shell should prefer newer TLS version by default

2021-07-20 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-16736:

Status: Ready to Commit  (was: Review In Progress)

> CQL shell should prefer newer TLS version by default
> 
>
> Key: CASSANDRA-16736
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16736
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/cqlsh
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 2.2.x
>
>
> This is a follow up to CASSANDRA-16695 where a patch was committed to 3.0, 
> 3.11, 4.0 and trunk.
> As part of it we saw that CQL shell python DTests are failing due to config 
> issues for version 2.2.
> As part of the current ticket we need to produce a fix to be able to run the 
> CQL shell tests and verify and apply the patch from CASSANDRA-16695 to 
> Cassandra 2.2 






[jira] [Updated] (CASSANDRA-16804) Create config.yml.MIDRES for 3.0 and 3.11

2021-07-20 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-16804:

Source Control Link: 
https://github.com/apache/cassandra/commit/6e0b084d65774d5a973687410a3ebbf64f55bf95
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Create config.yml.MIDRES for 3.0 and 3.11
> -
>
> Key: CASSANDRA-16804
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16804
> Project: Cassandra
>  Issue Type: Task
>  Components: CI
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 3.0.x, 3.11.x
>
>
> [config.yml.MIDRES|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml.MIDRES]
>  was created for 4.0 some time ago but not for 3.0 and 3.11.
> This task is to facilitate that backport






[jira] [Commented] (CASSANDRA-16804) Create config.yml.MIDRES for 3.0 and 3.11

2021-07-20 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384384#comment-17384384
 ] 

Ekaterina Dimitrova commented on CASSANDRA-16804:
-

Patch committed (formatting addressed, thank you!) to 3.0, 3.11 and empty 
commit to 4.0 and trunk

 

To https://github.com/apache/cassandra.git
   87424dabd2..6e0b084d65  cassandra-3.0 -> cassandra-3.0
   b5f0f4cd4c..25dbbfddc2  cassandra-3.11 -> cassandra-3.11
   fd69375af0..1853006067  cassandra-4.0 -> cassandra-4.0
   ddbed08087..e8ba1c3f35  trunk -> trunk

> Create config.yml.MIDRES for 3.0 and 3.11
> -
>
> Key: CASSANDRA-16804
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16804
> Project: Cassandra
>  Issue Type: Task
>  Components: CI
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 3.0.x, 3.11.x
>
>
> [config.yml.MIDRES|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml.MIDRES]
>  was created for 4.0 some time ago but not for 3.0 and 3.11.
> This task is to facilitate that backport






[cassandra] branch cassandra-4.0 updated (fd69375 -> 1853006)

2021-07-20 Thread edimitrova
This is an automated email from the ASF dual-hosted git repository.

edimitrova pushed a change to branch cassandra-4.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from fd69375  Fix CircleCI config to also build dtest jar from cassandra-4.0
 add 6e0b084  Add config.yml.MIDRES for older Cassandra versions patch by 
Ekaterina Dimitrova; review by Andres de la Pena for CASSANDRA-16804
 add 25dbbfd  Merge branch 'cassandra-3.0' into cassandra-3.11
 add 1853006  Merge branch 'cassandra-3.11' into cassandra-4.0

No new revisions were added by this update.

Summary of changes:




[cassandra] branch cassandra-3.11 updated (b5f0f4c -> 25dbbfd)

2021-07-20 Thread edimitrova
This is an automated email from the ASF dual-hosted git repository.

edimitrova pushed a change to branch cassandra-3.11
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from b5f0f4c  Merge branch 'cassandra-3.0' into cassandra-3.11
 add 6e0b084  Add config.yml.MIDRES for older Cassandra versions patch by 
Ekaterina Dimitrova; review by Andres de la Pena for CASSANDRA-16804
 add 25dbbfd  Merge branch 'cassandra-3.0' into cassandra-3.11

No new revisions were added by this update.

Summary of changes:
 .circleci/config-2_1.yml.mid_res.patch             | 75 +
 .circleci/{config.yml.LOWRES => config.yml.MIDRES} | 95 +++---
 .circleci/generate.sh                              |  6 ++
 .circleci/readme.md                                | 16 +++-
 4 files changed, 107 insertions(+), 85 deletions(-)
 create mode 100644 .circleci/config-2_1.yml.mid_res.patch
 copy .circleci/{config.yml.LOWRES => config.yml.MIDRES} (96%)




[cassandra] branch cassandra-3.0 updated (87424da -> 6e0b084)

2021-07-20 Thread edimitrova
This is an automated email from the ASF dual-hosted git repository.

edimitrova pushed a change to branch cassandra-3.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from 87424da  Count bloom filter misses correctly.
 add 6e0b084  Add config.yml.MIDRES for older Cassandra versions patch by 
Ekaterina Dimitrova; review by Andres de la Pena for CASSANDRA-16804

No new revisions were added by this update.

Summary of changes:
 .circleci/config-2_1.yml.mid_res.patch             | 75 ++
 .circleci/{config.yml.LOWRES => config.yml.MIDRES} | 24 +++
 .circleci/generate.sh                              |  6 ++
 .circleci/readme.md                                | 16 -
 4 files changed, 106 insertions(+), 15 deletions(-)
 create mode 100644 .circleci/config-2_1.yml.mid_res.patch
 copy .circleci/{config.yml.LOWRES => config.yml.MIDRES} (99%)




[cassandra] branch trunk updated (ddbed08 -> e8ba1c3)

2021-07-20 Thread edimitrova
This is an automated email from the ASF dual-hosted git repository.

edimitrova pushed a change to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from ddbed08  Merge branch 'cassandra-4.0' into trunk
 add 6e0b084  Add config.yml.MIDRES for older Cassandra versions patch by 
Ekaterina Dimitrova; review by Andres de la Pena for CASSANDRA-16804
 add 25dbbfd  Merge branch 'cassandra-3.0' into cassandra-3.11
 add 1853006  Merge branch 'cassandra-3.11' into cassandra-4.0
 new e8ba1c3  Merge branch 'cassandra-4.0' into trunk

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:




[cassandra] 01/01: Merge branch 'cassandra-4.0' into trunk

2021-07-20 Thread edimitrova
This is an automated email from the ASF dual-hosted git repository.

edimitrova pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit e8ba1c3f354e2f8bbc7f694ad75ae98428de368b
Merge: ddbed08 1853006
Author: Ekaterina Dimitrova 
AuthorDate: Tue Jul 20 12:29:10 2021 -0400

Merge branch 'cassandra-4.0' into trunk





[jira] [Commented] (CASSANDRA-13618) CassandraRoleManager setup task improvement

2021-07-20 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384376#comment-17384376
 ] 

Brandon Williams commented on CASSANDRA-13618:
--

This still looks valid, but needs a rebase for 4.0 and later.

> CassandraRoleManager setup task improvement
> ---
>
> Key: CASSANDRA-13618
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13618
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Authorization
>Reporter: Jeff Jirsa
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> {{CassandraRoleManager}} blocks some functionality during setup, using a 
> delay added in CASSANDRA-9761 . Unfortunately, this setup is scheduled for 
> 10s after startup, and may not be necessary, meaning immediately after 
> startup some auth related queries may not behave as intended. We can skip 
> this delay without any additional risk.






[jira] [Updated] (CASSANDRA-13618) CassandraRoleManager setup task improvement

2021-07-20 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-13618:
-
Status: Open  (was: Patch Available)

> CassandraRoleManager setup task improvement
> ---
>
> Key: CASSANDRA-13618
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13618
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Authorization
>Reporter: Jeff Jirsa
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> {{CassandraRoleManager}} blocks some functionality during setup, using a 
> delay added in CASSANDRA-9761 . Unfortunately, this setup is scheduled for 
> 10s after startup, and may not be necessary, meaning immediately after 
> startup some auth related queries may not behave as intended. We can skip 
> this delay without any additional risk.






[jira] [Commented] (CASSANDRA-16796) Clear pending ranges for a SHUTDOWN peer

2021-07-20 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384363#comment-17384363
 ] 

Caleb Rackliffe commented on CASSANDRA-16796:
-

+1 on all patches

Just watch out for the unused imports in {{GossipTest}} ;)

> Clear pending ranges for a SHUTDOWN peer
> 
>
> Key: CASSANDRA-16796
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16796
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Membership
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x
>
>
> If a node involved in a MOVE operation should fail, peers can sometimes 
> maintain pending ranges for it even when it has left the ring and/or been 
> replaced (in practice until the peer is next bounced). This in turn can lead 
> to bogus unavailable responses to clients if a replica for any of the 
> pending ranges should go down.
> If the moving node crashes hard, a subsequent replacement will correctly fail 
> as long as cassandra.consistent.rangemovement is set to true because the new 
> node will learn the MOVING status from the remaining peers. A graceful 
> shutdown, however, causes that status to be replaced with SHUTDOWN, but 
> doesn't update TokenMetadata, so pending ranges remain for the down node, 
> even after it has been removed from the ring.






[jira] [Updated] (CASSANDRA-16811) Add brandonwilli...@apache.org to KEYS

2021-07-20 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-16811:
-
Change Category: Operability
 Complexity: Normal
Component/s: Build
   Assignee: Alex Petrov
 Status: Open  (was: Triage Needed)

> Add brandonwilli...@apache.org to KEYS
> --
>
> Key: CASSANDRA-16811
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16811
> Project: Cassandra
>  Issue Type: Task
>  Components: Build
>Reporter: Brandon Williams
>Assignee: Alex Petrov
>Priority: Normal
> Attachments: keys.patch
>
>
> Please add my key in the attached patch to 
> https://dist.apache.org/repos/dist/release/cassandra/KEYS






[jira] [Created] (CASSANDRA-16811) Add brandonwilli...@apache.org to KEYS

2021-07-20 Thread Brandon Williams (Jira)
Brandon Williams created CASSANDRA-16811:


 Summary: Add brandonwilli...@apache.org to KEYS
 Key: CASSANDRA-16811
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16811
 Project: Cassandra
  Issue Type: Task
Reporter: Brandon Williams
 Attachments: keys.patch

Please add my key in the attached patch to 
https://dist.apache.org/repos/dist/release/cassandra/KEYS






[jira] [Updated] (CASSANDRA-16804) Create config.yml.MIDRES for 3.0 and 3.11

2021-07-20 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-16804:

Status: Review In Progress  (was: Patch Available)

> Create config.yml.MIDRES for 3.0 and 3.11
> -
>
> Key: CASSANDRA-16804
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16804
> Project: Cassandra
>  Issue Type: Task
>  Components: CI
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 3.0.x, 3.11.x
>
>
> [config.yml.MIDRES|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml.MIDRES]
>  was created for 4.0 some time ago but not for 3.0 and 3.11.
> This task is to facilitate that backport






[jira] [Updated] (CASSANDRA-16804) Create config.yml.MIDRES for 3.0 and 3.11

2021-07-20 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-16804:

Status: Ready to Commit  (was: Review In Progress)

> Create config.yml.MIDRES for 3.0 and 3.11
> -
>
> Key: CASSANDRA-16804
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16804
> Project: Cassandra
>  Issue Type: Task
>  Components: CI
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 3.0.x, 3.11.x
>
>
> [config.yml.MIDRES|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml.MIDRES]
>  was created for 4.0 some time ago but not for 3.0 and 3.11.
> This task is to facilitate that backport






[jira] [Updated] (CASSANDRA-10490) DTCS historic compaction, possibly with major compaction

2021-07-20 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-10490:
-
Resolution: Won't Fix
Status: Resolved  (was: Open)

Closing since DTCS is deprecated.

> DTCS historic compaction, possibly with major compaction
> 
>
> Key: CASSANDRA-10490
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10490
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction
>Reporter: Jonathan Shook
>Priority: Normal
>  Labels: compaction, dtcs, triage
> Fix For: 2.2.x, 3.11.x
>
>
> Presently, it's simply painful to run a major compaction with DTCS. It 
> doesn't really serve a useful purpose. Instead, a DTCS major compaction 
> should allow for a DTCS-style compaction to go back before 
> max_sstable_age_days. We can call this a historic compaction, for lack of a 
> better term.
> Such a compaction should not take precedence over normal compaction work, but 
> should be considered a background task. By default there should be a cap on 
> the number of these tasks running. It would be nice to have a separate 
> "max_historic_compaction_tasks" and possibly a 
> "max_historic_compaction_throughput" in the compaction settings to allow for 
> separate throttles on this. I would set these at 1 and 20% of the usual 
> compaction throughput if they aren't set explicitly.
> It may also be desirable to allow historic compaction to run apart from 
> running a major compaction, and to simply disable major compaction altogether 
> for DTCS.
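
The suggested defaults (one task and 20% of the usual compaction throughput when nothing is set explicitly) can be written out as a short calculation. This is a hedged Python sketch: the two settings are only proposed in the description and do not exist in Cassandra, and the function name and MB/s unit are assumptions made for illustration.

```python
from typing import Optional, Tuple


def historic_compaction_limits(
        compaction_throughput_mb_per_sec: float,
        max_historic_compaction_tasks: Optional[int] = None,
        max_historic_compaction_throughput: Optional[float] = None,
) -> Tuple[int, float]:
    """Resolve the proposed throttles, falling back to the defaults suggested
    in the ticket: 1 task and 20% of the normal compaction throughput."""
    tasks = (1 if max_historic_compaction_tasks is None
             else max_historic_compaction_tasks)
    throughput = (0.2 * compaction_throughput_mb_per_sec
                  if max_historic_compaction_throughput is None
                  else max_historic_compaction_throughput)
    return tasks, throughput
```

For example, with a normal compaction throughput of 16 MB/s and neither setting supplied, the sketch yields one historic compaction task capped at 3.2 MB/s.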






[jira] [Commented] (CASSANDRA-16804) Create config.yml.MIDRES for 3.0 and 3.11

2021-07-20 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384248#comment-17384248
 ] 

Andres de la Peña commented on CASSANDRA-16804:
---

Looks good to me, +1. I'm just leaving a minor formatting nit about the readme 
that can be addressed during commit. Thanks for the patch.

> Create config.yml.MIDRES for 3.0 and 3.11
> -
>
> Key: CASSANDRA-16804
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16804
> Project: Cassandra
>  Issue Type: Task
>  Components: CI
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 3.0.x, 3.11.x
>
>
> [config.yml.MIDRES|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml.MIDRES]
>  was created for 4.0 some time ago but not for 3.0 and 3.11.
> This task is to facilitate that backport






[jira] [Commented] (CASSANDRA-16796) Clear pending ranges for a SHUTDOWN peer

2021-07-20 Thread Sam Tunnicliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384092#comment-17384092
 ] 

Sam Tunnicliffe commented on CASSANDRA-16796:
-

Thanks [~maedhroz]

bq. So just to make things explicit (for me, a non-gossip expert), notifying 
subscribers in onChange() means we hit updateNormalTokens(), which removes the 
endpoint from the "moving endpoints" set and eventually removes the pending 
ranges?

Yes, that's right. The shutting down node is essentially returning to a 
{{NORMAL}} state, so we just make sure that all relevant parties (i.e. 
{{TokenMetadata}}) are aware.

bq. What guarantees that a PendingRangeTask has run by the time 
gossipShutdownUpdatesTokenMetadata() verifies there are no longer pending range 
for node 2?

Good catch. I haven't seen any failures yet, but this definitely has the 
potential for flakiness. I've added a log statement at {{DEBUG}} at the end of 
{{Gossiper::markAsShutdown}} to gate the test on, plus a blocking wait in the 
assertion to ensure the PRT is complete before we check.
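
As a rough illustration of the "blocking wait in the assertion" idea, here is a minimal sketch in plain Java. It is not the code from the patch: the class {{BlockingWaitSketch}} and the method {{awaitTrue}} are hypothetical names, and the condition passed in merely stands in for "the pending range task has completed".

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Cassandra: polls a condition until it
// holds or a timeout elapses, so a test asserts only after the awaited
// background work has had a chance to finish.
final class BlockingWaitSketch {
    static boolean awaitTrue(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean())
                return true;                      // condition reached in time
            if (System.nanoTime() - deadline >= 0)
                return false;                     // timed out, caller should fail the test
            try {
                Thread.sleep(pollMillis);         // back off before checking again
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }
}
```

A test would then assert on the return value, for example failing if {{awaitTrue(...)}} returns false, and only afterwards check that no pending ranges remain.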

Circle runs are in the same place as before, new ASF CI jobs here: 
[3.0|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/961], 
[4.0|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/962]


> Clear pending ranges for a SHUTDOWN peer
> 
>
> Key: CASSANDRA-16796
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16796
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Membership
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x
>
>
> If a node involved in a MOVE operation should fail, peers can sometimes 
> maintain pending ranges for it even when it has left the ring and/or been 
> replaced (in practice until the peer is next bounced). This in turn can lead 
> to bogus unavailable responses to clients if a replica for any of the 
> pending ranges should go down.
> If the moving node crashes hard, a subsequent replacement will correctly fail 
> as long as cassandra.consistent.rangemovement is set to true because the new 
> node will learn the MOVING status from the remaining peers. A graceful 
> shutdown, however, causes that status to be replaced with SHUTDOWN, but 
> doesn't update TokenMetadata, so pending ranges remain for the down node, 
> even after it has been removed from the ring.






[jira] [Commented] (CASSANDRA-16807) Weak visibility guarantees of Accumulator lead to failed assertions during digest comparison

2021-07-20 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383866#comment-17383866
 ] 

Benedict Elliott Smith commented on CASSANDRA-16807:


So, the only thing that has me worried is the code in {{waitingFor}}, which 
implies we might receive responses we don't require for our consistency level. 
I _think_ this is dead code and should be removed, and we should assert that we 
have only contacted relevant hosts. If, however, it isn't dead code then this 
fix would be insufficient, as we could have inserted these first, and may only 
have those visible. Only writes should contact replicas not involved in 
consistency, however, and I did perform a cursory check that this is the case 
here.

So, I'd suggest:

1. Verify we do not send queries to replicas we don't want responses from
2. Do not pre-process responses we aren't {{waitingFor}}; or, assert we are 
{{waitingFor}} the commands 
3. Only test that {{resolver.getMessages().size() >= blockFor}}
4. Maybe remove {{received}} altogether, in favour of 
{{resolver.getMessages().size()}}
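
Points 3 and 4 could be sketched roughly as follows. This is a simplified stand-in, not the actual {{ReadCallback}}: the class {{SketchReadCallback}} is hypothetical, a {{ConcurrentLinkedQueue}} takes the place of {{resolver.getMessages()}}, and responses are plain strings. The only idea it illustrates is that signalling is derived from the visible size of the collected responses rather than from a separate {{received}} counter.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;

// Hypothetical, simplified stand-in for a read callback.
final class SketchReadCallback {
    private final int blockFor;
    // Assumption: stands in for the resolver's response collection.
    private final Queue<String> messages = new ConcurrentLinkedQueue<>();
    private final CountDownLatch condition = new CountDownLatch(1);

    SketchReadCallback(int blockFor) {
        this.blockFor = blockFor;
    }

    void onResponse(String response) {
        messages.add(response);
        // Signal only once the *visible* number of responses satisfies blockFor;
        // there is no separate "received" counter that could run ahead of it.
        if (messages.size() >= blockFor)
            condition.countDown();
    }

    boolean isSignaled() {
        return condition.getCount() == 0;
    }

    int visibleResponses() {
        return messages.size();
    }
}
```

With {{blockFor = 2}}, the first call to {{onResponse()}} leaves the callback unsignalled and the second one signals it, at which point the waiter is guaranteed to see at least two responses.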

> Weak visibility guarantees of Accumulator lead to failed assertions during 
> digest comparison
> 
>
> Key: CASSANDRA-16807
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16807
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.0-rc, 4.0.x, 4.x
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> This problem could manifest on all versions, beginning on at least 3.0, but 
> I’ll focus on the way it manifests in 4.0 here.
> In what now seems like a wise move, CASSANDRA-16097 added an assertion to 
> {{DigestResolver#responsesMatch()}} that ensures the responses snapshot has 
> at least one visible element to compare (although of course a single element 
> trivially cannot generate a mismatch and short-circuits immediately). 
> However, at the point {{ReadCallback#onResponse()}} signals the waiting 
> resolver, there is no guarantee that the size of the generated snapshot of 
> the responses {{Accumulator}} is non-zero, or perhaps more worryingly, at 
> least equal to the number of blocked-for responses. This seems to be a 
> consequence of the documented weak visibility guarantees on 
> {{Accumulator#add()}}. In short, if there are concurrent invocations of 
> add(), it is not guaranteed that there is any visible size change after any 
> one of them returns, but only after all complete.
> The particular exception looks something like this:
> {noformat}
> java.lang.AssertionError: Attempted response match comparison while no 
> responses have been received.
>   at 
> org.apache.cassandra.service.reads.DigestResolver.responsesMatch(DigestResolver.java:110)
>   at 
> org.apache.cassandra.service.reads.AbstractReadExecutor.awaitResponses(AbstractReadExecutor.java:393)
>   at 
> org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:2150)
>   at 
> org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1979)
>   at 
> org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1882)
>   at 
> org.apache.cassandra.db.SinglePartitionReadCommand$Group.execute(SinglePartitionReadCommand.java:1121)
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:296)
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:248)
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:90)
> {noformat}
> It’s possible to reproduce this on simple single-partition reads without any 
> short-read protection or replica filtering protection. I’ve also been able to 
> reproduce this synthetically with [a unit 
> test|https://github.com/apache/cassandra/pull/1110] on {{ReadCallback}}.
> It seems like the most straightforward way to fix this would be to avoid 
> signaling in {{ReadCallback#onResponse()}} until the visible size of the 
> accumulator is at least the number of received responses. In most cases, this 
> is trivially true, and our signaling behavior won’t change at all. In the 
> very rare case that there are two (or more) concurrent calls to 
> {{onResponse()}}, the second (or last) will signal, and having one more 
> response than we strictly need should have no negative side effects. (We 
> don’t seem to make any strict assertions about having exactly the number of 
> required responses, only that we have enough.)
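
The weak visibility described in the issue above can be illustrated with a small model. This is a hypothetical, simplified sketch and not Cassandra's actual {{Accumulator}}: the class {{SketchAccumulator}} splits {{add()}} into an explicit {{claim()}} and {{publish()}} step so that the interleaving of two concurrent adders can be replayed deterministically on one thread.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical model of an append-only accumulator whose visible size is the
// length of the contiguous prefix of published slots.
final class SketchAccumulator<E> {
    private final AtomicReferenceArray<E> values;
    private final AtomicInteger nextIndex = new AtomicInteger();    // next slot to hand out
    private final AtomicInteger presentCount = new AtomicInteger(); // visible prefix length

    SketchAccumulator(int capacity) {
        values = new AtomicReferenceArray<>(capacity);
    }

    // First half of add(): reserve a slot.
    int claim() {
        return nextIndex.getAndIncrement();
    }

    // Second half of add(): store the item, then advance the visible prefix
    // over every slot that has already been filled.
    void publish(int slot, E item) {
        values.set(slot, item);
        while (true) {
            int cur = presentCount.get();
            if (cur == values.length() || values.get(cur) == null)
                return; // an earlier slot is still empty, so nothing new is visible
            presentCount.compareAndSet(cur, cur + 1);
        }
    }

    int size() {
        return presentCount.get();
    }

    // Replays two overlapping adds: B claims the later slot but finishes first.
    static int[] demo() {
        SketchAccumulator<String> acc = new SketchAccumulator<>(2);
        int slotA = acc.claim();
        int slotB = acc.claim();
        acc.publish(slotB, "b");
        int afterB = acc.size();   // B's add has returned, yet slot 0 is still empty
        acc.publish(slotA, "a");
        int afterA = acc.size();   // both adds complete, both items visible
        return new int[] { afterB, afterA };
    }
}
```

In this model the size stays at 0 after the adder holding the later slot returns, and jumps to 2 once the adder holding the earlier slot completes, which is the window in which a signalled reader could observe an empty snapshot.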




[jira] [Commented] (CASSANDRA-16621) Replace spinAsserts code with Awaitility code

2021-07-20 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383794#comment-17383794
 ] 

Berenguer Blasi commented on CASSANDRA-16621:
-

- +1 to {{ThreadPoolMetricsTest}} fixes
- +1 to the optimistic poll. Also, the {{ConnectionTest}} should fail the test 
case, whereas it is timing out, so that needs some tuning as well.
- The v4 protocol thing seems to be something that only happens to Jogesh. I 
need to dig a bit more...

> Replace spinAsserts code with Awaitility code
> -
>
> Key: CASSANDRA-16621
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16621
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Berenguer Blasi
>Assignee: Jogesh Anand
>Priority: Normal
>  Labels: low-hanging-fruit
> Fix For: 4.0.x
>
>
> Currently spinAsserts does a similar thing to Awaitility, which is being used 
> more and more. We now have 2 ways of doing the same thing, so it would be 
> good to consolidate.


