Re: Improve the performance of CAS
@Jason, pinged Sylvain on the jira.

@Jeremiah, in the contention case, if we combine the prepare and the quorum read, a retry of the Prepare phase may trigger the read on different replicas again, which is extra overhead. We can reduce it by skipping the read if the replica has already promised a ballot greater than the one being prepared.

In the commit failure case, each replica should already have the PartitionUpdate stored in the system table after the Propose phase. A following readWithPaxos or CAS operation can then repair the in-progress Paxos state and commit the data.

Thanks
Dikang.

On Wed, May 16, 2018 at 3:17 PM, J. D. Jordan wrote:
> I have not reasoned through this completely, but something I would want to
> see before messing with this is how changing the number of rounds behaves
> under contention and failure scenarios. Also how ignoring commit success
> behaves in those scenarios especially under contention and with respect to
> obeying CL semantics.
>
> -Jeremiah
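The replica-side check described in this reply (skip the read when a higher ballot has already been promised) could look roughly like the sketch below. It is a toy model, not Cassandra's code: `Replica`, `PrepareResponse`, `prepare_and_read` and the integer ballots are all hypothetical names chosen for illustration, and rejecting an equal ballot as well as a greater one is an assumption based on how Paxos prepare is usually defined.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PrepareResponse:
    promised: bool                 # True if the replica promised this ballot
    highest_ballot: int            # highest ballot the replica has promised so far
    current_value: Optional[str]   # piggybacked read result, only set on success


class Replica:
    """Hypothetical replica holding the Paxos state for a single partition."""

    def __init__(self, value: Optional[str] = None) -> None:
        self.promised_ballot = 0
        self.value = value
        self.reads_executed = 0    # counts how often the (expensive) read ran

    def _read(self) -> Optional[str]:
        self.reads_executed += 1
        return self.value

    def prepare_and_read(self, ballot: int) -> PrepareResponse:
        # A ballot at least as high is already promised: reject the prepare
        # and skip the read, since the coordinator has to retry anyway.
        if ballot <= self.promised_ballot:
            return PrepareResponse(False, self.promised_ballot, None)
        # Otherwise promise the ballot and piggyback the current value.
        self.promised_ballot = ballot
        return PrepareResponse(True, ballot, self._read())


if __name__ == "__main__":
    replica = Replica("v1")
    print(replica.prepare_and_read(5))   # promised, value piggybacked
    print(replica.prepare_and_read(3))   # rejected, no read executed
    print(replica.reads_executed)
```

In this model a losing coordinator costs the replica only a ballot comparison, which is the saving the reply is after under contention.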
Re: Improve the performance of CAS
I have not reasoned through this completely, but something I would want to see before messing with this is how changing the number of rounds behaves under contention and failure scenarios. Also how ignoring commit success behaves in those scenarios, especially under contention and with respect to obeying CL semantics.

-Jeremiah

On May 16, 2018, at 6:05 PM, Jason Brown wrote:
> Hey all,
>
> Before we go bananas, let's see if Sylvain, the primary author of the
> original patch, has the opportunity to chime in with some explanatory notes
> or other guidance. There may be some subtle points or considerations that
> are not obvious, and I'd hate to lose that context.
>
> Thanks,
>
> -Jason
Re: Improve the performance of CAS
Hey all,

Before we go bananas, let's see if Sylvain, the primary author of the original patch, has the opportunity to chime in with some explanatory notes or other guidance. There may be some subtle points or considerations that are not obvious, and I'd hate to lose that context.

Thanks,

-Jason

On Wed, May 16, 2018 at 2:57 PM, Ariel Weisberg wrote:
> Hi,
>
> I think you are looking at the right low-hanging fruit. Cassandra
> deserves a better consensus protocol, but it's a very big project.
>
> Regards,
> Ariel
Re: Improve the performance of CAS
Hi,

I think you are looking at the right low-hanging fruit. Cassandra deserves a better consensus protocol, but it's a very big project.

Regards,
Ariel

On Wed, May 16, 2018, at 5:51 PM, Dikang Gu wrote:
> Cool, created a jira for it,
> https://issues.apache.org/jira/browse/CASSANDRA-14448. I have a draft patch
> working internally, will clean it up.
>
> The EPaxos is more complicated, could be a long term effort.
>
> Thanks
> Dikang.
Re: Improve the performance of CAS
Cool, created a jira for it,
https://issues.apache.org/jira/browse/CASSANDRA-14448. I have a draft patch working internally, will clean it up.

The EPaxos is more complicated, could be a long term effort.

Thanks
Dikang.

On Wed, May 16, 2018 at 2:20 PM, sankalp kohli wrote:
> Hi,
> The idea of combining read with prepare sounds good. Regarding reducing
> the commit round trip, it is possible today by giving a lower consistency
> level for commit I think.
>
> Regarding EPaxos, it is a large change and will take longer to land. I
> think we should do this as it will help lower the latencies a lot.
>
> Thanks,
> Sankalp
Re: Improve the performance of CAS
Hi,

The idea of combining read with prepare sounds good. Regarding reducing the commit round trip, it is possible today by giving a lower consistency level for commit, I think.

Regarding EPaxos, it is a large change and will take longer to land. I think we should do this as it will help lower the latencies a lot.

Thanks,
Sankalp

On Wed, May 16, 2018 at 2:15 PM, Jeremy Hanna wrote:
> Hi Dikang,
>
> Have you seen Blake’s work on implementing egalitarian paxos or epaxos*?
> That might be helpful for the discussion.
>
> Jeremy
>
> * https://issues.apache.org/jira/browse/CASSANDRA-6246
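To illustrate why a lower consistency level for the commit shortens the wait, here is a toy calculation. It is only a sketch under assumed numbers: `commit_wait_ms` is a hypothetical helper, the ack latencies are made up, and it models nothing beyond "the coordinator returns once the required number of acks has arrived".

```python
def commit_wait_ms(ack_latencies_ms, required_acks):
    """Time the coordinator blocks on the commit phase.

    The coordinator returns as soon as `required_acks` replicas have
    acknowledged, so the wait equals the latency of the slowest ack it
    still needs.
    """
    if required_acks <= 0:
        return 0  # nothing to wait for: the commit is effectively asynchronous
    if required_acks > len(ack_latencies_ms):
        raise ValueError("cannot require more acks than there are replicas")
    return sorted(ack_latencies_ms)[required_acks - 1]


if __name__ == "__main__":
    # Assumed latencies: one replica in the local DC, two in a remote DC.
    acks = [2, 85, 90]
    print(commit_wait_ms(acks, 2))  # wait for a quorum of three replicas
    print(commit_wait_ms(acks, 1))  # wait for a single ack
    print(commit_wait_ms(acks, 0))  # do not wait at all
```

With these assumed numbers, requiring fewer acks moves the commit wait from the remote-DC latency down to the local one, or removes it entirely.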
Re: Improve the performance of CAS
Hi Dikang,

Have you seen Blake’s work on implementing egalitarian paxos or epaxos*? That might be helpful for the discussion.

Jeremy

* https://issues.apache.org/jira/browse/CASSANDRA-6246

On May 16, 2018, at 3:37 PM, Dikang Gu wrote:
> Hello C* developers,
>
> I'm working on some performance improvements to lightweight transactions
> (compare and set), and I'd like to hear your thoughts.
>
> As you know, the current CAS requires 4 round trips to finish, which is not
> efficient, especially in the cross-DC case:
> 1) Prepare
> 2) Quorum read current value
> 3) Propose new value
> 4) Commit
>
> I'm proposing the following improvements to reduce it to 2 round trips:
> 1) Combine the prepare and the quorum read: use only one round trip to
> decide the ballot and also piggyback the current value in the response.
> 2) Propose the new value, and then send out the commit request asynchronously,
> so the client will not wait for the ack of the commit. In case of commit
> failures, we should still have a chance to retry/repair it through hints or
> following read/CAS events.
>
> After the improvement, we should be able to finish the CAS operation using
> 2 round trips. There can be further improvements as well, and this can
> be a starting point.
>
> What do you think? Did I miss anything?
>
> Thanks
> Dikang
Improve the performance of CAS
Hello C* developers,

I'm working on some performance improvements to lightweight transactions (compare and set), and I'd like to hear your thoughts.

As you know, the current CAS requires 4 round trips to finish, which is not efficient, especially in the cross-DC case:
1) Prepare
2) Quorum read current value
3) Propose new value
4) Commit

I'm proposing the following improvements to reduce it to 2 round trips:
1) Combine the prepare and the quorum read: use only one round trip to decide the ballot and also piggyback the current value in the response.
2) Propose the new value, and then send out the commit request asynchronously, so the client will not wait for the ack of the commit. In case of commit failures, we should still have a chance to retry/repair it through hints or following read/CAS events.

After the improvement, we should be able to finish the CAS operation using 2 round trips. There can be further improvements as well, and this can be a starting point.

What do you think? Did I miss anything?

Thanks
Dikang
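The round-trip arithmetic in the proposal above can be sketched as follows. This is a back-of-the-envelope model only: the phase names mirror the lists in the message, while `client_latency_ms` and the 80 ms cross-DC round-trip time are assumptions made for illustration. It ignores contention, retries and processing time.

```python
# Phases the client currently waits for, one round trip each.
CURRENT_PHASES = ["prepare", "quorum_read", "propose", "commit"]

# Proposed flow: the read piggybacks on prepare, and the commit is sent
# asynchronously, so only two phases block the client.
PROPOSED_PHASES = ["prepare_and_read", "propose"]


def client_latency_ms(phases, rtt_ms):
    """Latency seen by the client if every blocking phase costs one round trip."""
    return len(phases) * rtt_ms


if __name__ == "__main__":
    rtt_ms = 80  # assumed cross-DC round-trip time, in milliseconds
    print(client_latency_ms(CURRENT_PHASES, rtt_ms))   # four blocking round trips
    print(client_latency_ms(PROPOSED_PHASES, rtt_ms))  # two blocking round trips
```

Under this model the client-visible latency halves, which is why the saving matters most when the round trip crosses data centers.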