I think you awesome guys added so much additional testing that Jenkins
probably can’t keep up now 🤣 It seems some CI jobs timed out while still
running tests after Accord got merged 😀

Jokes aside, awesome work! Huge congratulations to all the people involved!
👏🏻👏🏻👏🏻 Thank you all!!

On Fri, 18 Apr 2025 at 10:12, Paulo Motta <pa...@apache.org> wrote:

> Awesome milestone, congrats and thanks to all involved! 👏👏👏
>
> On Fri, 18 Apr 2025 at 05:19 Dmitry Konstantinov <netud...@gmail.com>
> wrote:
>
>> Hooray! Huge thanks to all! Now, I have no more excuses — it's time to
>> try it :-D
>>
>> On Thu, 17 Apr 2025 at 23:42, Jordan West <jorda...@gmail.com> wrote:
>>
>>> Congrats all! My previous reservations (that have been addressed) aside,
>>> this is an amazing milestone. Awesome, awesome work!
>>>
>>> Jordan
>>>
>>> On Thu, Apr 17, 2025 at 15:07 David Capwell <dcapw...@apple.com> wrote:
>>>
>>>> I have merged cep-15-accord into trunk.  If you experience any issues
>>>> please reach out to me
>>>>
>>>>
>>>> On Apr 17, 2025, at 12:55 AM, Benedict Elliott Smith <
>>>> bened...@apache.org> wrote:
>>>>
>>>> Final update: David has completed a second rebase after we reached
>>>> parity with trunk on our CI, and has confirmed tests remain stable. So I
>>>> expect CEP-15 to merge to trunk sometime today.
>>>>
>>>> No doubt there will be some unexpected disruption to others after a
>>>> patch like this lands. Reach out via slack if you have any trouble.
>>>>
>>>> On 16 Mar 2025, at 10:44, Benedict Elliott Smith <bened...@apache.org>
>>>> wrote:
>>>>
>>>> Hi everyone,
>>>>
>>>> To update you: the last patches we considered blockers have landed in
>>>> the cep-15-accord branch. Caleb has now started rebasing the branch onto
>>>> trunk. I expect there will be a few failing tests still to resolve at that
>>>> point, but once they have been squashed we will proceed with the merge.
>>>>
>>>> There remains more work to do before release, and I will publish a
>>>> detailed roadmap to Jira when I’m back in a couple of weeks.
>>>>
>>>>
>>>> On 11 Mar 2025, at 20:12, Nate McCall <zznat...@gmail.com> wrote:
>>>>
>>>> It sounds like we are all pretty interested in seeing this feature land
>>>> and the branch maintenance is causing overhead that could be spent on
>>>> finalisation. +1 on merging, particularly given the feature flag work.
>>>>
>>>> Once more unto the breach 💪
>>>>
>>>> On Fri, 7 Mar 2025 at 6:56 PM, Benedict <bened...@apache.org> wrote:
>>>>
>>>>> There are essentially three possible timelines to choose from here:
>>>>>
>>>>> 1) We agree in the next few days to merge to trunk. We will then
>>>>> prioritise rebasing onto trunk and resolving any pre-merge items starting
>>>>> next week.
>>>>> 2) There’s some more debate and agreement to merge to trunk in a week
>>>>> or two. In the meantime we will shift to internal-first development but
>>>>> we’ll likely prioritise the above work as soon as we can, which may be in a
>>>>> few weeks, so we can shift to trunk-first development.
>>>>> 3) We don’t agree to merge accord anytime soon, so we shift to
>>>>> internal-first development for the time being. I’m not sure when we will
>>>>> prioritise any of the above.
>>>>>
>>>>> Our resources are finite and we’ve exhausted them (literally), so it’s
>>>>> pretty much pick one of the above. I don’t really mind which you pick, but
>>>>> I won’t personally be prioritising merge after this third attempt.
>>>>>
>>>>> On 6 Mar 2025, at 22:01, Jon Haddad <j...@rustyrazorblade.com> wrote:
>>>>>
>>>>> 
>>>>>
>>>>> Hmm... I took a look at the cep-15-accord branch in GitHub, it looks
>>>>> like it's several hundred commits behind trunk.  Since you'll need to
>>>>> rebase again before merge *anyways*, would it make sense to do it once
>>>>> more, and I can publish easy-cass-lab with the latest branch?  If folks
>>>>> have concerns, it's easy to fire up a cluster (I do it constantly) and try
>>>>> it out.
>>>>>
>>>>> I think if we were to do this, out of consideration we should time box
>>>>> the amount of time for an evaluation and unless someone raises an
>>>>> objection, consider lazy consensus achieved.
>>>>>
>>>>> Jon
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Mar 6, 2025 at 12:46 PM Benedict Elliott Smith <
>>>>> bened...@apache.org> wrote:
>>>>>
>>>>>> Because we want to validate against the latest code in trunk, else we
>>>>>> are validating stale behaviours. The cost of rebasing is high, so we do not
>>>>>> do it frequently. That means we will likely stop developing OSS-first, as
>>>>>> the focus will have to move to our internal branch that satisfies these
>>>>>> criteria.
>>>>>>
>>>>>> Exactly what this might be for upstreaming I cannot say. Personally,
>>>>>> I aim to work exclusively on the branch we are stabilising. If that is not
>>>>>> trunk, the latency for my contributions being made public might be high, as
>>>>>> I have a huge imbalance of over-investment to recoup, and anything
>>>>>> unnecessary will be deferred.
>>>>>>
>>>>>> Since the feature is disabled, and the code is almost entirely
>>>>>> isolated, I cannot imagine the cost to the community of removing this work
>>>>>> would be very high. But, I do not intend to argue Accord’s case here. I
>>>>>> will let you all decide.
>>>>>>
>>>>>> Please decide soon though, as it shapes our work planning. The
>>>>>> positive reception so far had led me to consider prioritising a move to
>>>>>> trunk-first development within the next week or two, and the associated
>>>>>> work that entails. However, if that was optimistic we will have to shift
>>>>>> our plans.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 6 Mar 2025, at 20:16, Jordan West <jw...@apache.org> wrote:
>>>>>>
>>>>>> The work and effort in accord has been amazing. And I’m sure it sets
>>>>>> a new standard for code quality and correctness testing which I’m also
>>>>>> entirely behind. I also trust the folks working on it want to take it to
>>>>>> a fully production-ready solution. But I’m worried about circumstances
>>>>>> out of our control leaving us with a very complex feature that isn’t
>>>>>> complete.
>>>>>>
>>>>>> I do have some questions. Could folks help me better understand why
>>>>>> testing real workloads necessitates a merge (my understanding from the
>>>>>> original reason is this is the impetus for why we would merge now)? Also, I
>>>>>> think the performance and schema change caveats are rather large ones. One
>>>>>> of Accord’s promises was better performance, and I think schema changes
>>>>>> with nodes down not being supported is a big gap. Could we have some
>>>>>> criteria like “supports all the operations PaxosV2 supports” or “performs
>>>>>> as well or better than PaxosV2 on [workload(s)]”?
>>>>>>
>>>>>> I understand waiting asks a lot of the authors in terms of bearing the
>>>>>> burden of a more complex merge. But I think we also need to consider what
>>>>>> merging is asking the community to bear if the worst happens and we are
>>>>>> unable to take the feature from its current state to something that can be
>>>>>> widely used in production.
>>>>>>
>>>>>>
>>>>>> Jordan
>>>>>>
>>>>>>
>>>>>> On Wed, Mar 5, 2025 at 15:52 Blake Eggleston <bl...@ultrablake.com>
>>>>>> wrote:
>>>>>>
>>>>>>> +1 to merging it
>>>>>>>
>>>>>>> On Wed, Mar 5, 2025, at 12:22 PM, Patrick McFadin wrote:
>>>>>>>
>>>>>>> You have my +1
>>>>>>>
>>>>>>> On Wed, Mar 5, 2025 at 12:16 PM Benedict <bened...@apache.org>
>>>>>>> wrote:
>>>>>>> >
>>>>>>> > Correct, these caveats should only apply to tables that have
>>>>>>> opted-in to accord.
>>>>>>> >
>>>>>>> > On 5 Mar 2025, at 20:08, Jeremiah Jordan <jerem...@apache.org>
>>>>>>> wrote:
>>>>>>> >
>>>>>>> > 
>>>>>>> > So great to see all this hard work about to pay off!
>>>>>>> >
>>>>>>> > On the questions/concerns front, the only concern I would have
>>>>>>> towards merging this to trunk is if any of the caveats apply when someone
>>>>>>> is not using Accord.  Assuming they only apply when the feature flag is
>>>>>>> enabled, I see no reason not to get this merged into trunk once everyone
>>>>>>> involved is happy with the state of it.
>>>>>>> >
>>>>>>> > -Jeremiah
>>>>>>> >
>>>>>>> > On Mar 5, 2025 at 12:15:23 PM, Benedict Elliott Smith <
>>>>>>> bened...@apache.org> wrote:
>>>>>>> >>
>>>>>>> >> That depends on all of you lovely people :D
>>>>>>> >>
>>>>>>> >> I think we should have finished merging everything we want before
>>>>>>> QA by ~Monday; certainly not much later.
>>>>>>> >>
>>>>>>> >> I think we have some upgrade and python dtest failures to address
>>>>>>> as well.
>>>>>>> >>
>>>>>>> >> So it could be pretty soon if the community is supportive.
>>>>>>> >>
>>>>>>> >> On 5 Mar 2025, at 17:22, Patrick McFadin <pmcfa...@gmail.com>
>>>>>>> wrote:
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> What is the timing for starting the merge process? I'm asking
>>>>>>> because
>>>>>>> >>
>>>>>>> >> I have (yet another) presentation and this would be a cool update.
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> On Wed, Mar 5, 2025 at 1:22 AM Benedict Elliott Smith
>>>>>>> >>
>>>>>>> >> <bened...@apache.org> wrote:
>>>>>>> >>
>>>>>>> >> >
>>>>>>> >>
>>>>>>> >> > Thanks everyone.
>>>>>>> >>
>>>>>>> >> >
>>>>>>> >>
>>>>>>> >> > Jon - your help will be greatly appreciated. We’ll let you know
>>>>>>> when we’ve got the cycles to invest in performance work (hopefully fairly
>>>>>>> soon). I expect the first step will be improving visibility so we can
>>>>>>> better understand what the system is doing (particularly the caching
>>>>>>> layers), but we can dig in together when ready.
>>>>>>> >>
>>>>>>> >> >
>>>>>>> >>
>>>>>>> >> > On 4 Mar 2025, at 18:15, Jon Haddad <j...@rustyrazorblade.com>
>>>>>>> wrote:
>>>>>>> >>
>>>>>>> >> >
>>>>>>> >>
>>>>>>> >> > Very exciting!
>>>>>>> >>
>>>>>>> >> >
>>>>>>> >>
>>>>>>> >> > I have a client that's very interested in Accord, so I should
>>>>>>> have budget to dig into it, especially on the performance side of things.
>>>>>>> >>
>>>>>>> >> >
>>>>>>> >>
>>>>>>> >> > Jon
>>>>>>> >>
>>>>>>> >> >
>>>>>>> >>
>>>>>>> >> > On Tue, Mar 4, 2025 at 9:57 AM Dmitry Konstantinov <
>>>>>>> netud...@gmail.com> wrote:
>>>>>>> >>
>>>>>>> >> >>
>>>>>>> >>
>>>>>>> >> >> Thank you to all Accord and TCM contributors, it is really
>>>>>>> exciting to see a development of such huge and wonderful features moving
>>>>>>> forward and opening the door to the new Cassandra epoch!
>>>>>>> >>
>>>>>>> >> >>
>>>>>>> >>
>>>>>>> >> >> On Tue, 4 Mar 2025 at 20:45, Blake Eggleston <
>>>>>>> bl...@ultrablake.com> wrote:
>>>>>>> >>
>>>>>>> >> >>>
>>>>>>> >>
>>>>>>> >> >>> Thanks Benedict!
>>>>>>> >>
>>>>>>> >> >>>
>>>>>>> >>
>>>>>>> >> >>> I’m really excited to see accord reach this milestone, even
>>>>>>> with these caveats. You seem to have left yourself off the list of
>>>>>>> contributors though, even though you’ve been a central figure in its
>>>>>>> development :) So thanks to all accord & tcm contributors, including
>>>>>>> Benedict, for making this possible!
>>>>>>> >>
>>>>>>> >> >>>
>>>>>>> >>
>>>>>>> >> >>> On Tue, Mar 4, 2025, at 8:00 AM, Benedict Elliott Smith wrote:
>>>>>>> >>
>>>>>>> >> >>>
>>>>>>> >>
>>>>>>> >> >>> Hi everyone,
>>>>>>> >>
>>>>>>> >> >>>
>>>>>>> >>
>>>>>>> >> >>> It’s been exactly 3.5 years since the first commit to
>>>>>>> cassandra-accord. Yes, really, it’s been that long.
>>>>>>> >>
>>>>>>> >> >>>
>>>>>>> >>
>>>>>>> >> >>> We will be starting to validate the feature against real
>>>>>>> workloads in the near future, so we can’t sensibly push off merging much
>>>>>>> longer. The following is a brief run-down of the state of play. There are
>>>>>>> no known bugs, but there remain a number of caveats we will be
>>>>>>> incrementally addressing in the run-up to a full release:
>>>>>>> >>
>>>>>>> >> >>>
>>>>>>> >>
>>>>>>> >> >>> [1] Accord is likely to be SLOW until further optimisations
>>>>>>> are implemented
>>>>>>> >>
>>>>>>> >> >>> [2] Schema changes have a number of hard edges
>>>>>>> >>
>>>>>>> >> >>> [3] Validation is ongoing, so there are likely still a number
>>>>>>> of bugs to shake out
>>>>>>> >>
>>>>>>> >> >>> [4] Many operator visibility/tooling/documentation
>>>>>>> improvements are pending
>>>>>>> >>
>>>>>>> >> >>>
>>>>>>> >>
>>>>>>> >> >>> To expand a little:
>>>>>>> >>
>>>>>>> >> >>>
>>>>>>> >>
>>>>>>> >> >>> [1] As of the last experiment we conducted, accord’s
>>>>>>> throughput was poor - also leading to higher LAN latencies. We have done no
>>>>>>> WAN experiments to date, but the protocol guarantees should already achieve
>>>>>>> better round-trip performance, in particular under contention. Improving
>>>>>>> throughput will be the main focus of attention once we are satisfied the
>>>>>>> protocol is otherwise stable, but our focus remains validation for the
>>>>>>> moment.
>>>>>>> >>
>>>>>>> >> >>> [2] Schema changes have not yet been well integrated with
>>>>>>> TCM. Dropping a table for instance will currently cause problems if nodes
>>>>>>> are offline.
>>>>>>> >>
>>>>>>> >> >>> [3] We have a range of validations we are already performing
>>>>>>> against cassandra-accord directly, and against its integration with
>>>>>>> Cassandra in cep-15-accord. We have run hundreds of billions of simulated
>>>>>>> transactions, and are still discovering some minor fault every few billion
>>>>>>> simulated transactions or so. There remains a lot more simulated validation
>>>>>>> to explore, as well as with real clusters serving real workloads.
>>>>>>> >>
>>>>>>> >> >>> [4] There are already a range of virtual tables for exploring
>>>>>>> internal state in Accord, and reasonably good metric support. However,
>>>>>>> tracing is not yet supported, and our metric and virtual table integrations
>>>>>>> need some further development.
>>>>>>> >>
>>>>>>> >> >>> [5] There are also other edge cases to address, such as
>>>>>>> ensuring we do not reuse HLCs after restart and supporting
>>>>>>> ByteOrderedPartitioner. Live migration from/to Paxos is also undergoing
>>>>>>> fine-tuning and validation; probably there are some other things I am
>>>>>>> forgetting.
>>>>>>> >>
>>>>>>> >> >>>
>>>>>>> >>
>>>>>>> >> >>> Altogether the feature is fairly mature, despite these
>>>>>>> caveats. This is the fruit of the labour of a long list of contributors,
>>>>>>> including Aleksey Yeschenko, Alex Petrov, Ariel Weisberg, Blake Eggleston,
>>>>>>> Caleb Rackliffe and David Capwell, and represents a huge undertaking. It
>>>>>>> also wouldn’t have been possible without the work of Alex Petrov, Marcus
>>>>>>> Eriksson and Sam Tunnicliffe on delivering transactional cluster metadata.
>>>>>>> I hope you will join me in thanking them all for their contributions.
>>>>>>> >>
>>>>>>> >> >>>
>>>>>>> >>
>>>>>>> >> >>> Alex has also kindly produced some initial overview
>>>>>>> documentation for developers, that can be found here:
>>>>>>> https://github.com/apache/cassandra/blob/cep-15-accord/doc/modules/cassandra/pages/developing/accord/index.adoc.
>>>>>>> This will be expanded as time permits.
>>>>>>> >>
>>>>>>> >> >>>
>>>>>>> >>
>>>>>>> >> >>> Does anyone have any questions or concerns?
>>>>>>> >>
>>>>>>> >> >>>
>>>>>>> >>
>>>>>>> >> >>>
>>>>>>> >>
>>>>>>> >> >>
>>>>>>> >>
>>>>>>> >> >>
>>>>>>> >>
>>>>>>> >> >> --
>>>>>>> >>
>>>>>>> >> >> Dmitry Konstantinov
>>>>>>> >>
>>>>>>> >> >
>>>>>>> >>
>>>>>>> >> >
>>>>>>> >>
>>>>>>> >>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>>
>>>>
>>
>> --
>> Dmitry Konstantinov
>>
>
