So great to see all this hard work about to pay off! On the questions/concerns front, the only concern I would have towards merging this to trunk is if any of the caveats apply when someone is not using Accord. Assuming they only apply when the feature flag is enabled, I see no reason not to get this merged into trunk once everyone involved is happy with the state of it.
-Jeremiah

On Mar 5, 2025 at 12:15:23 PM, Benedict Elliott Smith <bened...@apache.org> wrote:

> That depends on all of you lovely people :D
>
> I think we should have finished merging everything we want before QA by ~Monday; certainly not much later.
>
> I think we have some upgrade and python dtest failures to address as well.
>
> So it could be pretty soon if the community is supportive.
>
> On 5 Mar 2025, at 17:22, Patrick McFadin <pmcfa...@gmail.com> wrote:
>
> > What is the timing for starting the merge process? I'm asking because I have (yet another) presentation and this would be a cool update.
> >
> > On Wed, Mar 5, 2025 at 1:22 AM Benedict Elliott Smith <bened...@apache.org> wrote:
> > >
> > > Thanks everyone.
> > >
> > > Jon - your help will be greatly appreciated. We’ll let you know when we’ve got the cycles to invest in performance work (hopefully fairly soon). I expect the first step will be improving visibility so we can better understand what the system is doing (particularly the caching layers), but we can dig in together when ready.
> > >
> > > On 4 Mar 2025, at 18:15, Jon Haddad <j...@rustyrazorblade.com> wrote:
> > >
> > > Very exciting!
> > >
> > > I have a client that's very interested in Accord, so I should have budget to dig into it, especially on the performance side of things.
> > >
> > > Jon
> > >
> > > On Tue, Mar 4, 2025 at 9:57 AM Dmitry Konstantinov <netud...@gmail.com> wrote:
> > >>
> > >> Thank you to all Accord and TCM contributors, it is really exciting to see a development of such huge and wonderful features moving forward and opening the door to the new Cassandra epoch!
> > >>
> > >> On Tue, 4 Mar 2025 at 20:45, Blake Eggleston <bl...@ultrablake.com> wrote:
> > >>>
> > >>> Thanks Benedict!
> > >>>
> > >>> I’m really excited to see accord reach this milestone, even with these caveats.
> > >>> You seem to have left yourself off the list of contributors though, even though you’ve been a central figure in its development :) So thanks to all accord & tcm contributors, including Benedict, for making this possible!
> > >>>
> > >>> On Tue, Mar 4, 2025, at 8:00 AM, Benedict Elliott Smith wrote:
> > >>>
> > >>> Hi everyone,
> > >>>
> > >>> It’s been exactly 3.5 years since the first commit to cassandra-accord. Yes, really, it’s been that long.
> > >>>
> > >>> We will be starting to validate the feature against real workloads in the near future, so we can’t sensibly push off merging much longer. The following is a brief run-down of the state of play. There are no known bugs, but there remain a number of caveats we will be incrementally addressing in the run-up to a full release:
> > >>>
> > >>> [1] Accord is likely to be SLOW until further optimisations are implemented
> > >>> [2] Schema changes have a number of hard edges
> > >>> [3] Validation is ongoing, so there are likely still a number of bugs to shake out
> > >>> [4] Many operator visibility/tooling/documentation improvements are pending
> > >>>
> > >>> To expand a little:
> > >>>
> > >>> [1] As of the last experiment we conducted, accord’s throughput was poor - also leading to higher LAN latencies. We have done no WAN experiments to date, but the protocol guarantees should already achieve better round-trip performance, in particular under contention. Improving throughput will be the main focus of attention once we are satisfied the protocol is otherwise stable, but our focus remains validation for the moment.
> > >>> [2] Schema changes have not yet been well integrated with TCM. Dropping a table for instance will currently cause problems if nodes are offline.
> > >>> [3] We have a range of validations we are already performing against cassandra-accord directly, and against its integration with Cassandra in cep-15-accord.
> > >>> We have run hundreds of billions of simulated transactions, and are still discovering some minor fault every few billion simulated transactions or so. There remains a lot more simulated validation to explore, as well as with real clusters serving real workloads.
> > >>> [4] There are already a range of virtual tables for exploring internal state in Accord, and reasonably good metric support. However, tracing is not yet supported, and our metric and virtual table integrations need some further development.
> > >>> [5] There are also other edge cases to address such as ensuring we do not reuse HLCs after restart, supporting ByteOrderPartitioner, and live migration from/to Paxos is undergoing fine-tuning and validation; probably there are some other things I am forgetting.
> > >>>
> > >>> Altogether the feature is fairly mature, despite these caveats. This is the fruit of the labour of a long list of contributors, including Aleksey Yeschenko, Alex Petrov, Ariel Weisberg, Blake Eggleston, Caleb Rackliffe and David Capwell, and represents a huge undertaking. It also wouldn’t have been possible without the work of Alex Petrov, Marcus Eriksson and Sam Tunnicliffe on delivering transactional cluster metadata. I hope you will join me in thanking them all for their contributions.
> > >>>
> > >>> Alex has also kindly produced some initial overview documentation for developers, that can be found here: https://github.com/apache/cassandra/blob/cep-15-accord/doc/modules/cassandra/pages/developing/accord/index.adoc. This will be expanded as time permits.
> > >>>
> > >>> Does anyone have any questions or concerns?
> > >>>
> > >>
> > >> --
> > >> Dmitry Konstantinov