[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements

2022-03-18 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17509053#comment-17509053
 ] 

Ekaterina Dimitrova edited comment on CASSANDRA-17164 at 3/18/22, 7:41 PM:
---

I didn't find a ticket for this one and it is consistently failing. Is there a 
plan? Or a ticket that I am missing? 

CC our build lead this week [~stefan.miklosovic] 


was (Author: e.dimitrova):
I didn't find a ticket for this one and it is consistently failing. Is there a 
plan? Or a ticket for it? 

CC our build lead this week [~stefan.miklosovic] 

> CEP-14: Paxos Improvements
> --
>
> Key: CASSANDRA-17164
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17164
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination, Consistency/Repair
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.1
>
>
> This ticket encompasses work for [CEP-14|
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements].



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements

2022-03-11 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504677#comment-17504677
 ] 

Ekaterina Dimitrova edited comment on CASSANDRA-17164 at 3/11/22, 8:02 PM:
---

This commit broke all unit tests - compression and cdc. I believe because of 
removed empty line in the test cassandra.yaml. I will test now and if I am 
right I will ninja fix it... 

[https://jenkins-cm4.apache.org/job/Cassandra-trunk/1002/]


was (Author: e.dimitrova):
This commit broke all unit tests - compression and cdc. I believe because of 
removed empty line in the test cassandra.yaml. I will test now and if I am 
write I will ninja fix it... 

[https://jenkins-cm4.apache.org/job/Cassandra-trunk/1002/]

> CEP-14: Paxos Improvements
> --
>
> Key: CASSANDRA-17164
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17164
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination, Consistency/Repair
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.1
>
>
> This ticket encompasses work for [CEP-14|
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements].



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements

2022-03-10 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504677#comment-17504677
 ] 

Ekaterina Dimitrova edited comment on CASSANDRA-17164 at 3/11/22, 1:27 AM:
---

This commit broke all unit tests - compression and cdc. I believe because of 
removed empty line in the test cassandra.yaml. I will test now and if I am 
write I will ninja fix it... 

[https://jenkins-cm4.apache.org/job/Cassandra-trunk/1002/]


was (Author: e.dimitrova):
This commit broke all unit tests - compression and cdc. I believe because of 
removed empty line is the tests cassandra.yaml. I will test now and if I am 
write I will ninja fix it... 

https://jenkins-cm4.apache.org/job/Cassandra-trunk/1002/

> CEP-14: Paxos Improvements
> --
>
> Key: CASSANDRA-17164
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17164
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination, Consistency/Repair
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.1
>
>
> This ticket encompasses work for [CEP-14|
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements].



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements

2022-02-14 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17491985#comment-17491985
 ] 

Branimir Lambov edited comment on CASSANDRA-17164 at 2/14/22, 1:14 PM:
---

I am ready with my high-level review. The new solution looks good to me and I 
am convinced of its correctness. It is a great improvement on both the 
performance and a number of issues of the older solution.

I tried to 
[document|https://github.com/blambov/cassandra/blob/17164-trunk/src/java/org/apache/cassandra/service/paxos/Paxos.md]
 everything I found non-trivial to find and/or understand. This text will 
hopefully help all others looking at this code to understand the 
implementation's logic and should be included in the patch (with any 
corrections and additions you may want to add).


was (Author: blambov):
I am ready with my high-level review. The new version looks good to me and I am 
convinced of its correctness. It is a great improvement on both the performance 
and a number of issues of the older solution.

I tried to 
[document|https://github.com/blambov/cassandra/blob/17164-trunk/src/java/org/apache/cassandra/service/paxos/Paxos.md]
 everything I found non-trivial to find and/or understand. This text will 
hopefully help all others looking at this code to understand the 
implementation's logic and should be included in the patch (with any 
corrections and additions you may want to add).

> CEP-14: Paxos Improvements
> --
>
> Key: CASSANDRA-17164
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17164
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination, Consistency/Repair
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.1
>
>
> This ticket encompasses work for [CEP-14|
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements].



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements

2022-01-28 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17483827#comment-17483827
 ] 

Branimir Lambov edited comment on CASSANDRA-17164 at 1/28/22, 4:03 PM:
---

This is okay. I'm writing documentation with proofs, and having them separated 
makes things a little harder to prove. However, it's not hard to map what we 
actually do to the proof's definitions, so I added an explanation instead.

I've reached this far: 
[https://github.com/blambov/cassandra/blob/17164-trunk/src/java/org/apache/cassandra/service/paxos/Paxos.md]

I need some clarification on read commutativity/concurrent reads. It looks like 
we use "promised" (i.e. the read promise) when we decide acceptance of a 
proposal (the JavaDoc description also says so). This is sensible and correct 
and appears necessary, but does this not mean that no reads can actually 
execute concurrently? In other words, do we really gain anything by storing 
read promises separately?

If we do skip proposing on a "safe" read, this is not a problem, but this is a 
separate optimization that does not appear to require read promises.


was (Author: blambov):
This is okay. I'm writing documentation with proofs, and having them separated 
makes things a little harder to prove. However, it's not hard to map what we 
actually do to the proof's definitions, so I added an explanation instead.

I've reached this far: 
[https://github.com/blambov/cassandra/blob/17164-trunk/src/java/org/apache/cassandra/service/paxos/Paxos.md]

I need some clarification on read commutativity/concurrent reads. It looks like 
we use "promised" (i.e. the read promise) when we decide acceptance of a 
proposal (the JavaDoc description also says so). This is sensible and correct 
and appears necessary, but does this not mean that no reads can actually 
execute concurrently? In other words, do we really gain anything by storing 
read promises separately?

If we do skip proposing on a "safe" read, this is not a problem, but this 
optimization does not appear to require the commutativity.

> CEP-14: Paxos Improvements
> --
>
> Key: CASSANDRA-17164
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17164
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination, Consistency/Repair
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.1
>
>
> This ticket encompasses work for [CEP-14|
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements].



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements

2022-01-27 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17482995#comment-17482995
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-17164 at 1/27/22, 9:24 AM:
--

I think I am leaning towards not implementing these suggestions, for the 
following reasons

1) To persist to the {{promised}} (actually {{promisedWrite}}) register safely 
we would need to modify the timestamp behaviour, i.e. writing promises with a 
lower timestamp than accepts or commits, which harms seamless downgrades and 
backwards compatibility. Not doing so safely obviates any benefit, as we still 
need to perform this comparison in memory to resolve the equal timestamp case.
2) Separate {{accept}} and {{commit}} registers are used for essentially the 
same reason - I agree it would be simpler to maintain just one register here 
for both the mutation and the ballot, but the current representation is 100% 
downgradeable, and any modification to persistence would not be.

Do you agree?

We _could_ front-load these changes to the loading from disk phase, and modify 
the {{Snapshot}}, however, if that works for you? This could also modify the 
network representation. I think these changes would be fine, although I think 
they result in only a modest clarity improvement.


was (Author: benedict):
1) To persist to the {{promised}} (actually {{promisedWrite}}) register safely 
we would need to modify the timestamp behaviour, i.e. writing promises with a 
lower timestamp than accepts or commits, which harms seamless downgrades and 
backwards compatibility. Not doing so safely obviates any benefit, as we still 
need to perform this comparison in memory to resolve the equal timestamp case.
2) Separate {{accept}} and {{commit}} registers are used for essentially the 
same reason - I agree it would be simpler to maintain just one register here 
for both the mutation and the ballot, but the current representation is 100% 
downgradeable, and any modification to persistence would not be.

We _could_ front-load these changes to the loading from disk phase, and modify 
the {{Snapshot}}, however, if that works for you? This could also modify the 
network representation. I think these changes would be fine, although I think 
they result in only a modest clarity improvement.

> CEP-14: Paxos Improvements
> --
>
> Key: CASSANDRA-17164
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17164
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination, Consistency/Repair
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.1
>
>
> This ticket encompasses work for [CEP-14|
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements].



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements

2022-01-26 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17482510#comment-17482510
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-17164 at 1/26/22, 2:20 PM:
--

I think this would be fine, I need to think through why I did not do this 
originally, as I did consider it. It may simply have been to minimise the 
change in semantics to {{system.paxos}}, without there having been any 
particularly good reason. Perhaps also for debugging purposes, it helps to be 
able to see whether something had been promised before it was 
accepted/committed.


was (Author: benedict):
I think this would be fine, I need to think through why I did not do this 
originally, as I did consider it. It may simply have been to minimise the 
change in semantics to {{system.paxos}}, without there having been any 
particularly good reason. Perhaps also for debugging purposes, it helps to be 
able to see whether something had been promised before it was 
accepted/committed.

I think there is perhaps also some additional complexity when a replica has 
promised a ballot with a timestamp, and then proceeds to accept a different one 
with the same timestamp. I don't think we would resolve these reliably, so we 
might not actually overwrite it anyway, and I don't think there is a trivial 
solution to this problem in the storage layer. I'm not certain if this would 
actually cause problems, but it would require some careful consideration to 
rule it out.

> CEP-14: Paxos Improvements
> --
>
> Key: CASSANDRA-17164
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17164
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination, Consistency/Repair
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.1
>
>
> This ticket encompasses work for [CEP-14|
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements].



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements

2022-01-26 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17482510#comment-17482510
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-17164 at 1/26/22, 2:19 PM:
--

I think this would be fine, I need to think through why I did not do this 
originally, as I did consider it. It may simply have been to minimise the 
change in semantics to {{system.paxos}}, without there having been any 
particularly good reason. Perhaps also for debugging purposes, it helps to be 
able to see whether something had been promised before it was 
accepted/committed.

I think there is perhaps also some additional complexity when a replica has 
promised a ballot with a timestamp, and then proceeds to accept a different one 
with the same timestamp. I don't think we would resolve these reliably, so we 
might not actually overwrite it anyway, and I don't think there is a trivial 
solution to this problem in the storage layer. I'm not certain if this would 
actually cause problems, but it would require some careful consideration to 
rule it out.


was (Author: benedict):
I think this would be fine, I need to think through why I did not do this 
originally, as I did consider it. It may simply have been to minimise the 
change in semantics to {{system.paxos}}, without there having been any 
particularly good reason. Perhaps also for debugging purposes, it helps to be 
able to see whether something had been promised before it was 
accepted/committed.

I think there is perhaps also some additional complexity when a replica has 
promised a ballot with the same timestamp, and then proceed to accept it. I 
don't think we would resolve these reliably, so we might not actually overwrite 
it anyway, and I don't think there is a trivial solution to this problem in the 
storage layer. I'm not certain if this would actually cause problems, but it 
would require some careful consideration to rule it out.

> CEP-14: Paxos Improvements
> --
>
> Key: CASSANDRA-17164
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17164
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination, Consistency/Repair
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.1
>
>
> This ticket encompasses work for [CEP-14|
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements].



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements

2022-01-26 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17482510#comment-17482510
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-17164 at 1/26/22, 2:18 PM:
--

I think this would be fine, I need to think through why I did not do this 
originally, as I did consider it. It may simply have been to minimise the 
change in semantics to {{system.paxos}}, without there having been any 
particularly good reason. Perhaps also for debugging purposes, it helps to be 
able to see whether something had been promised before it was 
accepted/committed.

I think there is perhaps also some additional complexity when a replica has 
promised a ballot with the same timestamp, and then proceed to accept it. I 
don't think we would resolve these reliably, so we might not actually overwrite 
it anyway, and I don't think there is a trivial solution to this problem in the 
storage layer. I'm not certain if this would actually cause problems, but it 
would require some careful consideration to rule it out.


was (Author: benedict):
I think this would be fine, I need to think through why I did not do this 
originally, as I did consider it. It may simply have been to minimise the 
change in semantics to {{system.paxos}}, without there having been any 
particularly good reason. Perhaps also for debugging purposes, it helps to be 
able to see whether something had been promised before it was 
accepted/committed.

I think there is perhaps also some additional complexity when a replica has 
promised a ballot with the same timestamp, and then proceed to accept it. I 
don't think we would resolve these reliably, so we might not actually overwrite 
it anyway, and I don't think there is a trivial solution to this problem in the 
storage layer.

> CEP-14: Paxos Improvements
> --
>
> Key: CASSANDRA-17164
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17164
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination, Consistency/Repair
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.1
>
>
> This ticket encompasses work for [CEP-14|
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements].



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements

2022-01-25 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17481981#comment-17481981
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-17164 at 1/25/22, 5:21 PM:
--

But this is only an optimisation, I think that would be equivalent to just 
removing the condition and ternary operator entirely, and simply returning 
{{latest(promised, latestWriteOrLowBound)}}? 

i.e. on entry {{latestWriteOrLowBound >= promisedWrite}} so this is just a 
single instruction short-circuit permitting us to avoid evaluating 
{{Ballot#compareTo}}


was (Author: benedict):
But this is only an optimisation, I think that would be equivalent to just 
removing the condition and ternary operator entirely, and saying 
{{latest(promised, latestWriteOrLowBound)}}? i.e. {{latestWriteOrLowBound}} 
already is greater than or equal to {{promisedWrite}} so this is just a single 
instruction short-circuit permitting us to avoid evaluating {{Ballot#compareTo}}

> CEP-14: Paxos Improvements
> --
>
> Key: CASSANDRA-17164
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17164
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination, Consistency/Repair
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.1
>
>
> This ticket encompasses work for [CEP-14|
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements].



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements

2022-01-25 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17481966#comment-17481966
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-17164 at 1/25/22, 5:06 PM:
--

bq. I don’t think there’s a scenario where that’s not a problem, and we’re 
checking, so we may as well log a warning.

I think it's likely that log messages for out of range operations will be 
configurable in the future, but you're right that there's no harm in logging 
warnings in the meantime.

bq. Sorry, I meant for the latestWitnessedOrLowBound method. Edit: although now 
I see this is enforced in the constructor, so nevermind. Although a comment in 
latestWitnessedOrLowBound explaining that would be helpful

Ah, that explains it. Certainly, I'll update with a comment justifying it.


was (Author: benedict):
bq. I don’t think there’s a scenario where that’s not a problem, and we’re 
checking, so we may as well log a warning.

I think it's likely that log messages for out of range operations will be 
configurable in the future, but you're right that there's no harm in logging 
warnings in the meantime.

> CEP-14: Paxos Improvements
> --
>
> Key: CASSANDRA-17164
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17164
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination, Consistency/Repair
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.1
>
>
> This ticket encompasses work for [CEP-14|
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements].



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements

2022-01-25 Thread Blake Eggleston (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17481962#comment-17481962
 ] 

Blake Eggleston edited comment on CASSANDRA-17164 at 1/25/22, 5:01 PM:
---

{quote}should we log out of range requests in isInRangeAndShouldProcess?
I think this is something that still hasn't taken full shape in OSS, so I'm not 
sure what the right course of action is. Happy to do whatever you feel is 
appropriate until a broader strategy is sorted out.
{quote}
I don’t think there’s a scenario where that’s not a problem, and we’re 
checking, so we may as well log a warning.
{quote}I’d prefer we use .equals instead of checking pointer equality here. 
.equals should check for pointer equality, but will also guard against 
unintended behavior
For the assert? That would be fine, but it makes enforcing the identity 
constraint harder with automated testing. In the other case, it's a pure 
optimisation so better to keep IMO (else the benefit is likely lost)
{quote}
Sorry, I meant for the {{latestWitnessedOrLowBound}} method. Edit: although now 
I see this is enforced in the constructor, so nevermind. Although a comment in 
{{latestWitnessedOrLowBound}} explaining that would be helpful


was (Author: bdeggleston):
{quote}
should we log out of range requests in isInRangeAndShouldProcess?
I think this is something that still hasn't taken full shape in OSS, so I'm not 
sure what the right course of action is. Happy to do whatever you feel is 
appropriate until a broader strategy is sorted out.
{quote}

I don’t think there’s a scenario where that’s not a problem, and we’re 
checking, so we may as well log a warning.

{quote}
I’d prefer we use .equals instead of checking pointer equality here. .equals 
should check for pointer equality, but will also guard against unintended 
behavior
For the assert? That would be fine, but it makes enforcing the identity 
constraint harder with automated testing. In the other case, it's a pure 
optimisation so better to keep IMO (else the benefit is likely lost)
{quote}

Sorry, I meant for the {{latestWitnessedOrLowBound}} method

> CEP-14: Paxos Improvements
> --
>
> Key: CASSANDRA-17164
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17164
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination, Consistency/Repair
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.1
>
>
> This ticket encompasses work for [CEP-14|
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements].



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements

2022-01-21 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17480311#comment-17480311
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-17164 at 1/22/22, 1:11 AM:
--

{quote}Why did you remove the logged warning about non linearizable reads?
{quote}
-Hopefully unintentionally...- Skimming your edits, it looks like this might 
have been down to it having been duplicated, as it is also logged at 
StorageProxy startup?

{quote} Why did you change type.decompose to type.decomposeUntyped in 
makeInternalOptionsWithNowInSec?
{quote}

I think because dtest-api uses UUID and so a full migration to TimeUUID in all 
places was kind of tricky, particularly for e.g. upgrade tests? It's hazy after 
so long.

{quote}especially since it’s not checking for the partition key or static row
{quote}
I'm sure this is an oversight that's just not important for the use case - 
which unfortunately I forget, so I'll just fix up this oversight.

{quote}should we log out of range requests in isInRangeAndShouldProcess?
{quote}
I think this is something that still hasn't taken full shape in OSS, so I'm not 
sure what the right course of action is. Happy to do whatever you feel is 
appropriate until a broader strategy is sorted out.

{quote}I’d prefer we use .equals instead of checking pointer equality here. 
.equals should check for pointer equality, but will also guard against 
unintended behavior
{quote}

For the assert? That would be fine, but it makes enforcing the identity 
constraint harder with automated testing. In the other case, it's a pure 
optimisation so better to keep IMO (else the benefit is likely lost)

{quote}WDYT about returning an empty collection if we don’t know about the 
table?
{quote}
I'll have a ponder over the weekend.

{quote}WDYT about renaming rawTimestamp to uuidTimestamp? Raw is ambiguous
{quote}
WFM

{quote}These changes seem like they may have been for debugging purposes, are 
they something that should be committed?
{quote}
Oops, yep.

{quote}A description and explanation of the upgrade procedure should also be 
included, as well as an entry in NEWS.txt
{quote}
Yep, agreed.

Thanks!


was (Author: benedict):
{quote}Why did you remove the logged warning about non linearizable reads?
{quote}
Hopefully unintentionally...

{quote} Why did you change type.decompose to type.decomposeUntyped in 
makeInternalOptionsWithNowInSec?
{quote}

I think because dtest-api uses UUID and so a full migration to TimeUUID in all 
places was kind of tricky, particularly for e.g. upgrade tests? It's hazy after 
so long.

{quote}especially since it’s not checking for the partition key or static row
{quote}
I'm sure this is an oversight that's just not important for the use case - 
which unfortunately I forget, so I'll just fix up this oversight.

{quote}should we log out of range requests in isInRangeAndShouldProcess?
{quote}
I think this is something that still hasn't taken full shape in OSS, so I'm not 
sure what the right course of action is. Happy to do whatever you feel is 
appropriate until a broader strategy is sorted out.

{quote}I’d prefer we use .equals instead of checking pointer equality here. 
.equals should check for pointer equality, but will also guard against 
unintended behavior
{quote}

For the assert? That would be fine, but it makes enforcing the identity 
constraint harder with automated testing. In the other case, it's a pure 
optimisation so better to keep IMO (else the benefit is likely lost)

{quote}WDYT about returning an empty collection if we don’t know about the 
table?
{quote}
I'll have a ponder over the weekend.

{quote}WDYT about renaming rawTimestamp to uuidTimestamp? Raw is ambiguous
{quote}
WFM

{quote}These changes seem like they may have been for debugging purposes, are 
they something that should be committed?
{quote}
Oops, yep.

{quote}A description and explanation of the upgrade procedure should also be 
included, as well as an entry in NEWS.txt
{quote}
Yep, agreed.

Thanks!

> CEP-14: Paxos Improvements
> --
>
> Key: CASSANDRA-17164
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17164
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination, Consistency/Repair
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.1
>
>
> This ticket encompasses work for [CEP-14|
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements].



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements

2022-01-21 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17480311#comment-17480311
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-17164 at 1/22/22, 1:08 AM:
--

{quote}Why did you remove the logged warning about non linearizable reads?
{quote}
Hopefully unintentionally...

{quote} Why did you change type.decompose to type.decomposeUntyped in 
makeInternalOptionsWithNowInSec?
{quote}

I think because dtest-api uses UUID and so a full migration to TimeUUID in all 
places was kind of tricky, particularly for e.g. upgrade tests? It's hazy after 
so long.

{quote}especially since it’s not checking for the partition key or static row
{quote}
I'm sure this is an oversight that's just not important for the use case - 
which unfortunately I forget, so I'll just fix up this oversight.

{quote}should we log out of range requests in isInRangeAndShouldProcess?
{quote}
I think this is something that still hasn't taken full shape in OSS, so I'm not 
sure what the right course of action is. Happy to do whatever you feel is 
appropriate until a broader strategy is sorted out.

{quote}I’d prefer we use .equals instead of checking pointer equality here. 
.equals should check for pointer equality, but will also guard against 
unintended behavior
{quote}

For the assert? That would be fine, but it makes enforcing the identity 
constraint harder with automated testing. In the other case, it's a pure 
optimisation so better to keep IMO (else the benefit is likely lost)

{quote}WDYT about returning an empty collection if we don’t know about the 
table?
{quote}
I'll have a ponder over the weekend.

{quote}WDYT about renaming rawTimestamp to uuidTimestamp? Raw is ambiguous
{quote}
WFM

{quote}These changes seem like they may have been for debugging purposes, are 
they something that should be committed?
{quote}
Oops, yep.

{quote}A description and explanation of the upgrade procedure should also be 
included, as well as an entry in NEWS.txt
{quote}
Yep, agreed.

Thanks!


was (Author: benedict):
{quote}Why did you remove the logged warning about non linearizable 
reads?{quote}
Hopefully unintentionally...

{quote} Why did you change type.decompose to type.decomposeUntyped in 
makeInternalOptionsWithNowInSec? {quote}

I think because dtest-api uses UUID and so a full migration to TimeUUID in all 
places was kind of tricky, particularly for e.g. upgrade tests? It's hazy after 
so long.

{quote}especially since it’s not checking for the partition key or static 
row{quote}
I'm sure this is an oversight that's just not important for the use case - 
which unfortunately I forget, so I'll just fix up this oversight.

{quote}should we log out of range requests in isInRangeAndShouldProcess?{quote}
I think this is something that still hasn't taken full shape in OSS, so I'm not 
sure what the right course of action is. Happy to do whatever you feel is 
appropriate until a broader strategy is sorted out.

{quote}I’d prefer we use .equals instead of checking pointer equality here. 
.equals should check for pointer equality, but will also guard against 
unintended behavior{quote}

For the assert? That would be fine, but it makes enforcing the identity 
constraint harder with automated testing. In the other case, it's a pure 
optimisation so better to keep IMO (else the benefit is likely lost)

{quote}WDYT about returning an empty collection if we don’t know about the 
table?{quote}
I'll have a ponder over the weekend.

{quote}WDYT about renaming rawTimestamp to uuidTimestamp? Raw is 
ambiguous{quote}
WFM

{quote}These changes seem like they may have been for debugging purposes, are 
they something that should be committed?{quote}
Oops, yep.

Thanks!

> CEP-14: Paxos Improvements
> --
>
> Key: CASSANDRA-17164
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17164
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination, Consistency/Repair
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.1
>
>
> This ticket encompasses work for [CEP-14|
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements].



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements

2022-01-20 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17479183#comment-17479183
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-17164 at 1/20/22, 9:58 AM:
--

{quote}Specifically, what stops the following from happening?{quote}

In this example, I believe (3) would not yield {{FOUND_INCOMPLETE_COMMIT}} but 
{{FOUND_INCOMPLETE_ACCEPTED}} and so would correctly re-propose (2), expending 
its ballot?

{quote}The fact that we tie accepting a promise to a majority with matching 
"most recent commit" means that we effectively treat it as the identifier of 
the current session{quote}

It is used as an identifier for a commit that was accepted by a majority using 
that ballot, only to ensure it is durable (made it to at least a quorum) before 
we propose a new value. We don't tie accepting a promise to matching the "most 
recent commit" but we cannot safely propose anything without ensuring the prior 
commit is durable, which is why we may simply "refresh" and continue using the 
promises we sought... I feel like there may be some minor disconnects here in 
our language, though, which might be complicating the discussion a little.

{quote}It seems the next prepare will return FOUND_INCOMPLETE_ACCEPTED and 
trigger reproposal and commit{quote}

{{hasInProgressProposal}} first checks if {{latestAccepted.update.isEmpty()}} 
and if so, returns false, i.e. this would not be considered an incomplete 
proposal.




was (Author: benedict):
{quote}Specifically, what stops the following from happening?{quote}

In this example, I believe (3) would not yield {{FOUND_INCOMPLETE_COMMIT}} but 
{{FOUND_INCOMPLETE_ACCEPTED}} and so would correctly re-propose (2), expending 
its ballot?

{quote}It seems the next prepare will return FOUND_INCOMPLETE_ACCEPTED and 
trigger reproposal and commit{quote}

{{hasInProgressProposal}} first checks if {{latestAccepted.update.isEmpty()}} 
and if so, returns false, i.e. this would not be considered an incomplete 
proposal.

> CEP-14: Paxos Improvements
> --
>
> Key: CASSANDRA-17164
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17164
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination, Consistency/Repair
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.1
>
>
> This ticket encompasses work for [CEP-14|
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements].



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements

2022-01-20 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17479183#comment-17479183
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-17164 at 1/20/22, 9:46 AM:
--

{quote}Specifically, what stops the following from happening?{quote}

In this example, I believe (3) would not yield {{FOUND_INCOMPLETE_COMMIT}} but 
{{FOUND_INCOMPLETE_ACCEPTED}} and so would correctly re-propose (2), expending 
its ballot?

{quote}It seems the next prepare will return FOUND_INCOMPLETE_ACCEPTED and 
trigger reproposal and commit{quote}

{{hasInProgressProposal}} first checks if {{latestAccepted.update.isEmpty()}} 
and if so, returns false, i.e. this would not be considered an incomplete 
proposal.


was (Author: benedict):
{quote}Specifically, what stops the following from happening?{quote}

In this example, I believe (3) would not yield {{FOUND_INCOMPLETE_COMMIT}} but 
{{FOUND_INCOMPLETE_ACCEPTED}} and so would correctly re-propose (2), expending 
its ballot?

{quote}It seems the next prepare will return FOUND_INCOMPLETE_ACCEPTED and 
trigger reproposal and commit{quote}

{{hasInProgressProposal}} first checks if {{latestAccepted.update.isEmpty()}} 
and if so, returns false, i.e. this would not be considered an incomplete 
proposal.

The fast read optimisation does however leave an incomplete _promise_ that we 
might want to change to asynchronously send an empty proposal.

> CEP-14: Paxos Improvements
> --
>
> Key: CASSANDRA-17164
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17164
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination, Consistency/Repair
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.1
>
>
> This ticket encompasses work for [CEP-14|
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements].



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements

2022-01-20 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17479167#comment-17479167
 ] 

Branimir Lambov edited comment on CASSANDRA-17164 at 1/20/22, 8:45 AM:
---

{quote}I think this is actually the typical way of implementing Classic Paxos, 
even though Lamport's paper seems to suggest you must only contact the nodes 
that responded to the prepare (there may be something else specific about his 
formulation that necessitates this, I forget, as I dislike his writings on the 
topic). This is corroborated by Heidi Howard's dissertation, which was the 
easiest place I could find a straight-forward formulation of Classic Paxos 
besides that of Lamport. See Algorithm 3 on Page 30.
{quote}
This is an excellent pointer, thank you. Lamport's proof also works with 
separate proposal quorum – my only request here is that this should be stated 
somewhere as other contributors might start with Lamport's original definition 
too.
{quote}I don't quite follow what you mean by this, as this is not limited to 
"most recent commit", but a ballot directly maps to the instance id of classic 
paxos, it just avoids pre-splitting the range of integers.
{quote}
Well, I don't follow this one. A ballot id is something one replica selects and 
sends it as a prepare message to everyone. At this point some nodes may have 
committed a proposal, others may have accepted it, and yet others may have 
never heard of it. From the point of view of a sequence of paxos instances, the 
first two will definitely make promises for different instances, and likely the 
third one will be promising for yet another one. The fact that we tie accepting 
a promise to a majority with matching "most recent commit" means that we 
effectively treat it as the identifier of the current session.

More interestingly, we can then send a proposal to nodes with older most recent 
commit (i.e. nodes that are effectively working on a previous paxos instance), 
that proposal will be accepted (overwriting the decided value but not advancing 
the most recent commit), and this might lead to a decision using just a small 
number of participants with the right "most recent commit". Specifically, what 
stops the following from happening?
{code:java}
   A mrcA acceptedA promised B mrc   B 
accepted   B promised C mrcC accepted   C promised

prepare(1) -> ABC-   -1-
   -   1   -  -1
propose(1, X) -> ABC - 1,X1-
 1,X   1   - 1,X   1
commit(1, X) -> AB  1,X  -1   1,X   
   -   1   -  -1

prepare(2) -> AB1,X  -2   1,X   
   -   2   - 1,X   1
  majority AB, no refresh
propose(2, Y) -> BC 1,X  -2   1,X   
 2,Y   2   - 2,Y   2
  returns successful Y

prepare(3) -> AC1,X1,X3   1,X   
 2,Y   2  1,X -3
  refresh stale, then forms majority AC
propose(3, Z) -> ABC1,X3,Z3   1,X   
 3,Z   3  1,X3,Z   3
  returns successful Z   
{code}
If this is permitted, there's a further error possibility if B is partitioned 
after the 2,Y proposal succeeds and a commit is executed in B. Then the 
committed value can be read from B, and later wiped with a conflicting decision.

In LWT terms, if 
X: {{{}column = X if not exists{}}}, 
Y: {{column = Y if column == X}} and 
Z: {{{}column = Z if column == X{}}},
we may be able to read (serially) both Y and Z.
{quote}Could you explain what you are referring to here? I think this is all 
standard stuff for Paxos, we're again just recording the most recently used 
instance number for each register.
{quote}
Judging from the above, this may be insufficient. I'm likely missing something 
here – that something needs to be documented.
{quote}The final commit phase is only required to ensure any "decree" 
(decision) is disseminated. If we have proposed that no decree be made, there 
is nothing to disseminate, and nothing to complete if another transaction 
encounters it. This is in some ways an artefact of the feature of Cassandra's 
implementation, that we initiate a paxos round without knowing if it will do 
anything, though this feature would I suppose be present for read-only 
operations anyway.
{quote}
This is also hard to read from the code: {{Paxos.cas}} does not