Re: CEP-15 Update

Alex Petrov Tue, 11 Mar 2025 18:40:02 -0700

While I agree that time spent working on a feature is not necessarily a clear 
indicator of maturity, one can judge the scope of work and thought that went 
into Accord by both its separate repository and the working branch.


I think that merging/accepting SASI was not a mistake. There were several 
efforts to make it work, and back in 2016 we could've made it quite viable with 
just CASSANDRA-11990 and a lot of testing. It did get superseded by SAI, but I 
can imagine a universe where SASI would have been developed into a stable 
feature. 

> is there a known path forward to fix the drop schema w nodes down issue and 
> anything written on it?
Yes, there is a clear, known path for fixing schema changes and, fortunately, it does 
not require a protocol change, just a slightly deeper integration with TCM.


On Fri, Mar 7, 2025, at 4:44 PM, Jordan West wrote:
> I would love to have my questions answered and see some graphs. I don’t think 
> those are unreasonable asks nor do they take away from the awesome work done. 
> I was suggesting 1-2 weeks for folks to have the opportunity to produce that 
> data if the original authors didn’t have time. I also don’t think that’s 
> unreasonable. But to be clear, I’m not blocking anything. If folks want to 
> merge I am not objecting.
> 
> I do think we should hold features to a high standard and personally “time 
> worked on a feature” is not a criterion for me when considering why we should 
> merge. It is absolutely worth recognizing and celebrating the massive 
> investment and effort made here. It’s just an orthogonal point to me. As a contrived 
> example: If 15452 was not as impactful performance-wise after a year of on 
> and off work I would’ve happily continued to address it or taken a different 
> approach. SASI took a year and a half or more and I still regret that we 
> merged it into 3.x in the form we did using the same early contribution 
> model. That was an extreme, out-of-our-control case of an entire team 
> disbanding right after merge. 
> 
> Jordan 
> 
> On Fri, Mar 7, 2025 at 06:28 Jon Haddad <j...@rustyrazorblade.com> wrote:
>> I defer to the judgement of the folks that are most impacted by it - ones 
>> that are in the code, working on the next release.  If you all think it's 
>> good to merge, then I am 100% in support of it.  I suspect merging will help 
>> get it out faster, and I don't see any future in which we don't ship this in 
>> the next release.
>> 
>> I will be happy to help answer the "how does it compare to paxos v2" 
>> question post-merge.
>> 
>> Jon
>> 
>> 
>> 
>> On Fri, Mar 7, 2025 at 5:52 AM Josh McKenzie <jmcken...@apache.org> wrote:
>>> 3.5 years is an incredible amount of time and work; it really is 
>>> significant and thanks to everyone involved for the investment of time and 
>>> energy.
>>> 
>>> We have a rocky history with large, disruptive contributions in the past 
>>> that have either blocked forward progress post-merge (CASSANDRA-8099), or 
>>> lingered in the code-base increasing maintenance burden on other 
>>> contributors for minimal or no user benefit (early open post SSD 
>>> transition, witness replicas, materialized views). I'm sympathetic to where 
>>> Jordan's questions stem from, as our history of leaving things in the 
>>> codebase long after they've become vestigial or abandoned has slowed down 
>>> our collective momentum maintaining the project on actively used features.
>>> 
>>> That said, I don't think Accord will run afoul of some of those same 
>>> patterns. Aside from the degree of investment already in it and sheer 
>>> number of pmc members and committers involved, I believe it's a feature 
>>> that's universally impactful and that if we had a metaphorical bus-factor 
>>> change (entire group of people working on it disappeared the day after 
>>> merge or decided to go on vacation for 5 years), others in the community 
>>> would be willing to pick things up and keep it moving given its proximity 
>>> to release readiness.
>>> 
>>> The 2 questions Jordan asked resonate with me: 1) do we have line of sight 
>>> to a fix on the schema issues, and I'll take the liberty of reframing 2) do 
>>> we have line of sight to improvement on the performance front to be usable 
>>> for multi-key transactions? (subtle: I don't think "parity with PaxosV2" is 
>>> the right target, but rather "fast enough to be usable for multi-key 
>>> transactions" since it's a new query paradigm).
>>> 
>>> Given the context on contributor backing and if the answer is yes to those 
>>> 2 questions (which I believe it is), I think we should generally be 
>>> comfortable with merging the feature as experimental at this time.
>>> 
>>> On Fri, Mar 7, 2025, at 12:54 AM, Benedict wrote:
>>>> 
>>>> There are essentially three possible timelines to choose from here: 
>>>> 
>>>> 1) We agree in the next few days to merge to trunk. We will then 
>>>> prioritise rebasing onto trunk and resolving any pre-merge items starting 
>>>> next week.
>>>> 2) There’s some more debate and agreement to merge to trunk in a week or 
>>>> two. In the meantime we will shift to internal-first development but we’ll 
>>>> likely prioritise the above work as soon as we can, which may be in a few 
>>>> weeks, so we can shift to trunk first development.
>>>> 3) We don’t agree to merge accord anytime soon, so we shift to 
>>>> internal-first development for the time being. I’m not sure when we will 
>>>> prioritise any of the above.
>>>> 
>>>> Our resources are finite and we’ve exhausted them (literally), so it’s 
>>>> pretty much pick one of the above. I don’t really mind which you pick, but 
>>>> I won’t personally be prioritising merge after this third attempt.
>>>> 
>>>> 
>>>>> On 6 Mar 2025, at 22:01, Jon Haddad <j...@rustyrazorblade.com> wrote:
>>>>> 
>>>>> Hmm... I took a look at the cep-15-accord branch in GitHub, it looks like 
>>>>> it's several hundred commits behind trunk.  Since you'll need to rebase 
>>>>> again before merge *anyways*, would it make sense to do it once more, and 
>>>>> I can publish easy-cass-lab with the latest branch?  If folks have 
>>>>> concerns, it's easy to fire up a cluster (I do it constantly) and try it 
>>>>> out.
>>>>> 
>>>>> I think if we were to do this, out of consideration we should time box 
>>>>> the amount of time for an evaluation and unless someone raises an 
>>>>> objection, consider lazy consensus achieved.
>>>>> 
>>>>> Jon
>>>>> 
>>>>> 
>>>>> 
>>>>> On Thu, Mar 6, 2025 at 12:46 PM Benedict Elliott Smith 
>>>>> <bened...@apache.org> wrote:
>>>>>> Because we want to validate against the latest code in trunk, else we 
>>>>>> are validating stale behaviours. The cost of rebasing is high, so we do 
>>>>>> not do it frequently. That means we will likely stop developing 
>>>>>> OSS-first, as the focus will have to move to our internal branch that 
>>>>>> satisfies these criteria.
>>>>>> 
>>>>>> Exactly what this might be for upstreaming I cannot say. Personally, I 
>>>>>> aim to work exclusively on the branch we are stabilising. If that is not 
>>>>>> trunk, the latency for my contributions being made public might be high, 
>>>>>> as I have a huge imbalance of over-investment to recoup, and anything 
>>>>>> unnecessary will be deferred.
>>>>>> 
>>>>>> Since the feature is disabled, and the code is almost entirely isolated, 
>>>>>> I cannot imagine the cost to the community of removing this work would 
>>>>>> be very high. But, I do not intend to argue Accord’s case here. I will 
>>>>>> let you all decide.
>>>>>> 
>>>>>> Please decide soon though, as it shapes our work planning. The positive 
>>>>>> reception so far had led me to consider prioritising a move to 
>>>>>> trunk-first development within the next week or two, and the associated 
>>>>>> work that entails. However, if that was optimistic we will have to shift 
>>>>>> our plans.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On 6 Mar 2025, at 20:16, Jordan West <jw...@apache.org> wrote:
>>>>>>> 
>>>>>>> The work and effort in accord has been amazing. And I’m sure it sets a 
>>>>>>> new standard for code quality and correctness testing which I’m also 
>>>>>>> entirely behind. I also trust the folks working on it want to take it 
>>>>>>> to a fully production-ready solution. But I’m worried about 
>>>>>>> circumstances out of our control leaving us with a very complex feature 
>>>>>>> that isn’t complete. 
>>>>>>> 
>>>>>>> I do have some questions. Could folks help me better understand why 
>>>>>>> testing real workloads necessitates a merge (my understanding from the 
>>>>>>> original reason is this is the impetus for why we would merge now)? 
>>>>>>> Also I think the performance and schema change caveats are rather large 
>>>>>>> ones. One of Accord’s promises was better performance, and I think not 
>>>>>>> supporting schema changes with nodes down is a big gap. Could 
>>>>>>> we have some criteria like “supports all the operations PaxosV2 
>>>>>>> supports” or “performs as well or better than PaxosV2 on 
>>>>>>> [workload(s)]”? 
>>>>>>> 
>>>>>>> I understand waiting asks a lot of the authors in terms of bearing the 
>>>>>>> burden of a more complex merge. But I think we also need to consider 
>>>>>>> what merging is asking the community to bear if the worst happens and 
>>>>>>> we are unable to take the feature from its current state to something 
>>>>>>> that can be widely used in production.
>>>>>>> 
>>>>>>> Jordan 
>>>>>>> 
>>>>>>> 
>>>>>>> On Wed, Mar 5, 2025 at 15:52 Blake Eggleston <bl...@ultrablake.com> 
>>>>>>> wrote:
>>>>>>>> +1 to merging it
>>>>>>>> 
>>>>>>>> On Wed, Mar 5, 2025, at 12:22 PM, Patrick McFadin wrote:
>>>>>>>>> You have my +1
>>>>>>>>> 
>>>>>>>>> On Wed, Mar 5, 2025 at 12:16 PM Benedict <bened...@apache.org> wrote:
>>>>>>>>> >
>>>>>>>>> > Correct, these caveats should only apply to tables that have 
>>>>>>>>> > opted-in to accord.
>>>>>>>>> >
>>>>>>>>> > On 5 Mar 2025, at 20:08, Jeremiah Jordan <jerem...@apache.org> 
>>>>>>>>> > wrote:
>>>>>>>>> >
>>>>>>>>> > 
>>>>>>>>> > So great to see all this hard work about to pay off!
>>>>>>>>> >
>>>>>>>>> > On the questions/concerns front, the only concern I would have 
>>>>>>>>> > towards merging this to trunk is if any of the caveats apply when 
>>>>>>>>> > someone is not using Accord.  Assuming they only apply when the 
>>>>>>>>> > feature flag is enabled, I see no reason not to get this merged 
>>>>>>>>> > into trunk once everyone involved is happy with the state of it.
>>>>>>>>> >
>>>>>>>>> > -Jeremiah
>>>>>>>>> >
>>>>>>>>> > On Mar 5, 2025 at 12:15:23 PM, Benedict Elliott Smith 
>>>>>>>>> > <bened...@apache.org> wrote:
>>>>>>>>> >>
>>>>>>>>> >> That depends on all of you lovely people :D
>>>>>>>>> >>
>>>>>>>>> >> I think we should have finished merging everything we want before 
>>>>>>>>> >> QA by ~Monday; certainly not much later.
>>>>>>>>> >>
>>>>>>>>> >> I think we have some upgrade and python dtest failures to address 
>>>>>>>>> >> as well.
>>>>>>>>> >>
>>>>>>>>> >> So it could be pretty soon if the community is supportive.
>>>>>>>>> >>
>>>>>>>>> >> On 5 Mar 2025, at 17:22, Patrick McFadin <pmcfa...@gmail.com> 
>>>>>>>>> >> wrote:
>>>>>>>>> >>
>>>>>>>>> >>
>>>>>>>>> >> What is the timing for starting the merge process? I'm asking 
>>>>>>>>> >> because
>>>>>>>>> >>
>>>>>>>>> >> I have (yet another) presentation and this would be a cool update.
>>>>>>>>> >>
>>>>>>>>> >>
>>>>>>>>> >> On Wed, Mar 5, 2025 at 1:22 AM Benedict Elliott Smith
>>>>>>>>> >>
>>>>>>>>> >> <bened...@apache.org> wrote:
>>>>>>>>> >>
>>>>>>>>> >> >
>>>>>>>>> >>
>>>>>>>>> >> > Thanks everyone.
>>>>>>>>> >>
>>>>>>>>> >> >
>>>>>>>>> >>
>>>>>>>>> >> > Jon - your help will be greatly appreciated. We’ll let you know 
>>>>>>>>> >> > when we’ve got the cycles to invest in performance work 
>>>>>>>>> >> > (hopefully fairly soon). I expect the first step will be 
>>>>>>>>> >> > improving visibility so we can better understand what the system 
>>>>>>>>> >> > is doing (particularly the caching layers), but we can dig in 
>>>>>>>>> >> > together when ready.
>>>>>>>>> >>
>>>>>>>>> >> >
>>>>>>>>> >>
>>>>>>>>> >> > On 4 Mar 2025, at 18:15, Jon Haddad <j...@rustyrazorblade.com> 
>>>>>>>>> >> > wrote:
>>>>>>>>> >>
>>>>>>>>> >> >
>>>>>>>>> >>
>>>>>>>>> >> > Very exciting!
>>>>>>>>> >>
>>>>>>>>> >> >
>>>>>>>>> >>
>>>>>>>>> >> > I have a client that's very interested in Accord, so I should 
>>>>>>>>> >> > have budget to dig into it, especially on the performance side 
>>>>>>>>> >> > of things.
>>>>>>>>> >>
>>>>>>>>> >> >
>>>>>>>>> >>
>>>>>>>>> >> > Jon
>>>>>>>>> >>
>>>>>>>>> >> >
>>>>>>>>> >>
>>>>>>>>> >> > On Tue, Mar 4, 2025 at 9:57 AM Dmitry Konstantinov 
>>>>>>>>> >> > <netud...@gmail.com> wrote:
>>>>>>>>> >>
>>>>>>>>> >> >>
>>>>>>>>> >>
>>>>>>>>> >> >> Thank you to all Accord and TCM contributors, it is really 
>>>>>>>>> >> >> exciting to see a development of such huge and wonderful 
>>>>>>>>> >> >> features moving forward and opening the door to the new 
>>>>>>>>> >> >> Cassandra epoch!
>>>>>>>>> >>
>>>>>>>>> >> >>
>>>>>>>>> >>
>>>>>>>>> >> >> On Tue, 4 Mar 2025 at 20:45, Blake Eggleston 
>>>>>>>>> >> >> <bl...@ultrablake.com> wrote:
>>>>>>>>> >>
>>>>>>>>> >> >>>
>>>>>>>>> >>
>>>>>>>>> >> >>> Thanks Benedict!
>>>>>>>>> >>
>>>>>>>>> >> >>>
>>>>>>>>> >>
>>>>>>>>> >> >>> I’m really excited to see accord reach this milestone, even 
>>>>>>>>> >> >>> with these caveats. You seem to have left yourself off the 
>>>>>>>>> >> >>> list of contributors though, even though you’ve been a central 
>>>>>>>>> >> >>> figure in its development :) So thanks to all accord & tcm 
>>>>>>>>> >> >>> contributors, including Benedict, for making this possible!
>>>>>>>>> >>
>>>>>>>>> >> >>>
>>>>>>>>> >>
>>>>>>>>> >> >>> On Tue, Mar 4, 2025, at 8:00 AM, Benedict Elliott Smith wrote:
>>>>>>>>> >>
>>>>>>>>> >> >>>
>>>>>>>>> >>
>>>>>>>>> >> >>> Hi everyone,
>>>>>>>>> >>
>>>>>>>>> >> >>>
>>>>>>>>> >>
>>>>>>>>> >> >>> It’s been exactly 3.5 years since the first commit to 
>>>>>>>>> >> >>> cassandra-accord. Yes, really, it’s been that long.
>>>>>>>>> >>
>>>>>>>>> >> >>>
>>>>>>>>> >>
>>>>>>>>> >> >>> We will be starting to validate the feature against real 
>>>>>>>>> >> >>> workloads in the near future, so we can’t sensibly push off 
>>>>>>>>> >> >>> merging much longer. The following is a brief run-down of the 
>>>>>>>>> >> >>> state of play. There are no known bugs, but there remain a 
>>>>>>>>> >> >>> number of caveats we will be incrementally addressing in the 
>>>>>>>>> >> >>> run-up to a full release:
>>>>>>>>> >>
>>>>>>>>> >> >>>
>>>>>>>>> >>
>>>>>>>>> >> >>> [1] Accord is likely to be SLOW until further optimisations 
>>>>>>>>> >> >>> are implemented
>>>>>>>>> >>
>>>>>>>>> >> >>> [2] Schema changes have a number of hard edges
>>>>>>>>> >>
>>>>>>>>> >> >>> [3] Validation is ongoing, so there are likely still a number 
>>>>>>>>> >> >>> of bugs to shake out
>>>>>>>>> >>
>>>>>>>>> >> >>> [4] Many operator visibility/tooling/documentation 
>>>>>>>>> >> >>> improvements are pending
>>>>>>>>> >>
>>>>>>>>> >> >>>
>>>>>>>>> >>
>>>>>>>>> >> >>> To expand a little:
>>>>>>>>> >>
>>>>>>>>> >> >>>
>>>>>>>>> >>
>>>>>>>>> >> >>> [1] As of the last experiment we conducted, accord’s 
>>>>>>>>> >> >>> throughput was poor - also leading to higher LAN latencies. We 
>>>>>>>>> >> >>> have done no WAN experiments to date, but the protocol 
>>>>>>>>> >> >>> guarantees should already achieve better round-trip 
>>>>>>>>> >> >>> performance, in particular under contention. Improving 
>>>>>>>>> >> >>> throughput will be the main focus of attention once we are 
>>>>>>>>> >> >>> satisfied the protocol is otherwise stable, but our focus 
>>>>>>>>> >> >>> remains validation for the moment.
>>>>>>>>> >>
>>>>>>>>> >> >>> [2] Schema changes have not yet been well integrated with TCM. 
>>>>>>>>> >> >>> Dropping a table for instance will currently cause problems if 
>>>>>>>>> >> >>> nodes are offline.
>>>>>>>>> >>
>>>>>>>>> >> >>> [3] We have a range of validations we are already performing 
>>>>>>>>> >> >>> against cassandra-accord directly, and against its integration 
>>>>>>>>> >> >>> with Cassandra in cep-15-accord. We have run hundreds of 
>>>>>>>>> >> >>> billions of simulated transactions, and are still discovering 
>>>>>>>>> >> >>> some minor fault every few billion simulated transactions or 
>>>>>>>>> >> >>> so. There remains a lot more simulated validation to explore, 
>>>>>>>>> >> >>> as well as with real clusters serving real workloads.
>>>>>>>>> >>
>>>>>>>>> >> >>> [4] There are already a range of virtual tables for exploring 
>>>>>>>>> >> >>> internal state in Accord, and reasonably good metric support. 
>>>>>>>>> >> >>> However, tracing is not yet supported, and our metric and 
>>>>>>>>> >> >>> virtual table integrations need some further development.
>>>>>>>>> >>
>>>>>>>>> >> >>> [5] There are also other edge cases to address such as 
>>>>>>>>> >> >>> ensuring we do not reuse HLCs after restart, supporting 
>>>>>>>>> >> >>> ByteOrderPartitioner, and live migration from/to Paxos is 
>>>>>>>>> >> >>> undergoing fine-tuning and validation; probably there are some 
>>>>>>>>> >> >>> other things I am forgetting.
>>>>>>>>> >>
>>>>>>>>> >> >>>
>>>>>>>>> >>
>>>>>>>>> >> >>> Altogether the feature is fairly mature, despite these 
>>>>>>>>> >> >>> caveats. This is the fruit of the labour of a long list of 
>>>>>>>>> >> >>> contributors, including Aleksey Yeschenko, Alex Petrov, Ariel 
>>>>>>>>> >> >>> Weisberg, Blake Eggleston, Caleb Rackliffe and David Capwell, 
>>>>>>>>> >> >>> and represents a huge undertaking. It also wouldn’t have been 
>>>>>>>>> >> >>> possible without the work of Alex Petrov, Marcus Eriksson and 
>>>>>>>>> >> >>> Sam Tunnicliffe on delivering transactional cluster metadata. 
>>>>>>>>> >> >>> I hope you will join me in thanking them all for their 
>>>>>>>>> >> >>> contributions.
>>>>>>>>> >>
>>>>>>>>> >> >>>
>>>>>>>>> >>
>>>>>>>>> >> >>> Alex has also kindly produced some initial overview 
>>>>>>>>> >> >>> documentation for developers, that can be found here: 
>>>>>>>>> >> >>> https://github.com/apache/cassandra/blob/cep-15-accord/doc/modules/cassandra/pages/developing/accord/index.adoc.
>>>>>>>>> >> >>>  This will be expanded as time permits.
>>>>>>>>> >>
>>>>>>>>> >> >>>
>>>>>>>>> >>
>>>>>>>>> >> >>> Does anyone have any questions or concerns?
>>>>>>>>> >>
>>>>>>>>> >> >>>
>>>>>>>>> >>
>>>>>>>>> >> >>>
>>>>>>>>> >>
>>>>>>>>> >> >>
>>>>>>>>> >>
>>>>>>>>> >> >>
>>>>>>>>> >>
>>>>>>>>> >> >> --
>>>>>>>>> >>
>>>>>>>>> >> >> Dmitry Konstantinov
>>>>>>>>> >>
>>>>>>>>> >> >
>>>>>>>>> >>
>>>>>>>>> >> >
>>>>>>>>> >>
>>>>>>>>> >>
>>>>>>>>> 
>>>>>>>> 
>>>
