Re: Improve the performance of CAS

2018-05-16 Thread Dikang Gu
@Jason, I pinged Sylvain on the JIRA.

@Jeremiah,
In the contention case, if we combine the prepare and the quorum read, we
will retry the Prepare phase, which may trigger the read on different
replicas again; that is extra overhead. We can reduce it by skipping the
read if the replica has already promised a ballot greater than the prepared
one.
In the commit failure case, each replica should already have the
PartitionUpdate stored in the system table after the Propose phase. A
following readWithPaxos or CAS operation can then repair the in-progress
Paxos state and commit the data.
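
To make the two cases above concrete, here is a rough toy sketch in Python. It
is not Cassandra code: the Replica class, prepare_and_read and
repair_in_progress are hypothetical names, ballots are plain integers, and
quorum checks are left out. It only illustrates skipping the read when a
greater ballot was already promised, and finishing an accepted but uncommitted
update on a later operation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Replica:
    """Toy in-memory replica (hypothetical; not Cassandra's real Paxos state)."""
    promised: int = 0                  # highest ballot promised so far
    accepted: Optional[tuple] = None   # (ballot, key, value) accepted, not yet committed
    data: dict = field(default_factory=dict)
    reads: int = 0                     # number of local reads actually executed

    def prepare_and_read(self, ballot: int, key: str) -> dict:
        # A greater ballot was already promised: reject and skip the read.
        if self.promised > ballot:
            return {"promised": False, "in_progress": None, "value": None}
        self.promised = ballot
        self.reads += 1
        # Piggyback the current value and any in-progress update on the reply.
        return {"promised": True, "in_progress": self.accepted,
                "value": self.data.get(key)}

    def propose(self, ballot: int, key: str, value: str) -> bool:
        if ballot < self.promised:
            return False
        self.promised = ballot
        self.accepted = (ballot, key, value)
        return True

    def commit(self, ballot: int, key: str, value: str) -> None:
        self.data[key] = value
        if self.accepted and self.accepted[0] <= ballot:
            self.accepted = None


def repair_in_progress(replicas, ballot: int, key: str):
    """Finish an accepted-but-uncommitted update seen during prepare.

    Simplification: no quorum counting, every reachable replica is used.
    """
    responses = [r.prepare_and_read(ballot, key) for r in replicas]
    pending = [resp["in_progress"] for resp in responses
               if resp["promised"] and resp["in_progress"]]
    if not pending:
        return None
    _, k, v = max(pending, key=lambda p: p[0])  # newest accepted update wins
    for r in replicas:
        if r.propose(ballot, k, v):
            r.commit(ballot, k, v)
    return v
```

In this sketch a stale prepare costs no read on the replica, and a commit that
reached only one replica is completed by the next operation that runs a
prepare with a newer ballot.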

Thanks
Dikang.

On Wed, May 16, 2018 at 3:17 PM, J. D. Jordan wrote:

> I have not reasoned through this completely, but something I would want to
> see before messing with this is how changing the number of rounds behaves
> under contention and failure scenarios. Also how ignoring commit success
> behaves in those scenarios especially under contention and with respect to
> obeying CL semantics.
>
> -Jeremiah

Re: Improve the performance of CAS

2018-05-16 Thread J. D. Jordan
I have not reasoned through this completely, but something I would want to see 
before messing with this is how changing the number of rounds behaves under 
contention and failure scenarios. Also how ignoring commit success behaves in 
those scenarios especially under contention and with respect to obeying CL 
semantics.

-Jeremiah

> On May 16, 2018, at 6:05 PM, Jason Brown  wrote:
> 
> Hey all,
> 
> Before we go bananas, let's see if Sylvain, the primary author of the
> original patch, has the opportunity to chime in with some explanatory notes or
> other guidance. There may be some subtle points or considerations that are
> not obvious, and I'd hate to lose that context.
> 
> Thanks,
> 
> -Jason

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Improve the performance of CAS

2018-05-16 Thread Jason Brown
Hey all,

Before we go bananas, let's see if Sylvain, the primary author of the
original patch, has the opportunity to chime in with some explanatory notes or
other guidance. There may be some subtle points or considerations that are
not obvious, and I'd hate to lose that context.

Thanks,

-Jason

On Wed, May 16, 2018 at 2:57 PM, Ariel Weisberg  wrote:

> Hi,
>
> I think you are looking at the right low hanging fruit.  Cassandra
> deserves a better consensus protocol, but it's a very big project.
>
> Regards,
> Ariel


Re: Improve the performance of CAS

2018-05-16 Thread Ariel Weisberg
Hi,

I think you are looking at the right low hanging fruit.  Cassandra deserves a 
better consensus protocol, but it's a very big project.

Regards,
Ariel
On Wed, May 16, 2018, at 5:51 PM, Dikang Gu wrote:
> Cool, create a jira for it,
> https://issues.apache.org/jira/browse/CASSANDRA-14448. I have a draft patch
> working internally, will clean it up.
> 
> The EPaxos is more complicated, could be a long term effort.
> 
> Thanks
> Dikang.



Re: Improve the performance of CAS

2018-05-16 Thread Dikang Gu
Cool, I created a JIRA for it:
https://issues.apache.org/jira/browse/CASSANDRA-14448. I have a draft patch
working internally and will clean it up.

EPaxos is more complicated and could be a long-term effort.

Thanks
Dikang.

On Wed, May 16, 2018 at 2:20 PM, sankalp kohli wrote:

> Hi,
> The idea of combining read with prepare sounds good. Regarding reducing
> the commit round trip, it is possible today by giving a lower consistency
> level for commit I think.
>
> Regarding EPaxos, it is a large change and will take longer to land. I
> think we should do this as it will help lower the latencies a lot.
>
> Thanks,
> Sankalp



-- 
Dikang


Re: Improve the performance of CAS

2018-05-16 Thread sankalp kohli
Hi,
The idea of combining the read with the prepare sounds good. Regarding reducing
the commit round trip, I think it is already possible today by using a lower
consistency level for the commit.

Regarding EPaxos, it is a large change and will take longer to land. I
think we should do this as it will help lower the latencies a lot.

Thanks,
Sankalp

On Wed, May 16, 2018 at 2:15 PM, Jeremy Hanna wrote:

> Hi Dikang,
>
> Have you seen Blake’s work on implementing egalitarian paxos or epaxos*?
> That might be helpful for the discussion.
>
> Jeremy
>
> * https://issues.apache.org/jira/browse/CASSANDRA-6246


Re: Improve the performance of CAS

2018-05-16 Thread Jeremy Hanna
Hi Dikang,

Have you seen Blake’s work on implementing Egalitarian Paxos (EPaxos)*? That
might be helpful for the discussion.

Jeremy

* https://issues.apache.org/jira/browse/CASSANDRA-6246

> On May 16, 2018, at 3:37 PM, Dikang Gu  wrote:
> 
> Hello C* developers,
> 
> I'm working on some performance improvements of the lightweight transactions
> (compare and set), I'd like to hear your thoughts about it.
> 
> As you know, current CAS requires 4 round trips to finish, which is not
> efficient, especially in cross DC case.
> 1) Prepare
> 2) Quorum read current value
> 3) Propose new value
> 4) Commit
> 
> I'm proposing the following improvements to reduce it to 2 round trips,
> which is:
> 1) Combine prepare and quorum read together, use only one round trip to
> decide the ballot and also piggyback the current value in response.
> 2) Propose new value, and then send out the commit request asynchronously,
> so client will not wait for the ack of the commit. In case of commit
> failures, we should still have chance to retry/repair it through hints or
> following read/cas events.
> 
> After the improvement, we should be able to finish the CAS operation using
> 2 round trips. There can be following improvements as well, and this can
> be a start point.
> 
> What do you think? Did I miss anything?
> 
> Thanks
> Dikang



Improve the performance of CAS

2018-05-16 Thread Dikang Gu
Hello C* developers,

I'm working on some performance improvements for lightweight transactions
(compare and set), and I'd like to hear your thoughts.

As you know, CAS currently requires 4 round trips to finish, which is not
efficient, especially in the cross-DC case:
1) Prepare
2) Quorum read current value
3) Propose new value
4) Commit

I'm proposing the following improvements to reduce it to 2 round trips:
1) Combine the prepare and the quorum read, using only one round trip to
decide the ballot and piggyback the current value in the response.
2) Propose the new value, and then send out the commit request asynchronously,
so the client will not wait for the ack of the commit. In case of commit
failures, we should still have a chance to retry/repair it through hints or
subsequent read/CAS events.

After the improvement, we should be able to finish the CAS operation using
2 round trips. There can be further improvements as well, and this can
be a starting point.
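
As a rough back-of-the-envelope illustration of the saving, here is a small
Python sketch. The 80 ms cross-DC round-trip time is a made-up number, the
function and list names are hypothetical, and replica processing time is
ignored; it only counts the round trips the client has to wait for.

```python
# Phases the client blocks on, per the lists above.
CURRENT_PHASES = ["prepare", "quorum read", "propose", "commit"]
PROPOSED_PHASES = ["prepare + quorum read", "propose"]  # commit is sent asynchronously


def cas_latency_ms(rtt_ms: float, phases: list) -> float:
    """Client-visible latency if every blocking phase costs one round trip."""
    return rtt_ms * len(phases)


if __name__ == "__main__":
    rtt_ms = 80.0  # assumed cross-DC round-trip time, for illustration only
    print("current :", cas_latency_ms(rtt_ms, CURRENT_PHASES), "ms")
    print("proposed:", cas_latency_ms(rtt_ms, PROPOSED_PHASES), "ms")
```

Under these assumptions the client-visible latency is halved, since the
number of blocking round trips drops from 4 to 2.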

What do you think? Did I miss anything?

Thanks
Dikang