In case you're keeping score on how frequently these are coming out: *please stop*. ;)
Silver lining - looks like we have a lot to discuss this round! Last update was late July and we've been churning through the 5.0 freeze and stabilization phase.

*[New Contributors Getting Started]*
Check out https://the-asf.slack.com, channel #cassandra-dev. Reply directly to me on this email if you need an invite for your account, and reach out to the @cassandra_mentors alias in the channel if you need to get oriented.

We have a list of curated "getting started" tickets you can find here, filtered to "ToDo" (i.e. not yet worked): https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484&quickFilter=2160&quickFilter=2162&quickFilter=2652.

*Helpful links:*
- Getting Started with Development on C*: https://cassandra.apache.org/_/development/gettingstarted.html
- Building and IDE integration (worktrees are your friend; msg me on slack if you need pointers): https://cassandra.apache.org/_/development/ide.html
- Code Style: https://cassandra.apache.org/_/development/code_style.html

*[Dev mailing list]*
https://lists.apache.org/list?dev@cassandra.apache.org:dfr=2023-7-20%7Cdto=2023-10-27:
My last email of shame was 35 threads. Drumroll for this one... 91. *Yeesh*. Let me stick to highlights.

Ekaterina pushed through dropping JDK8 support and adding JDK17 support... back in July. If you didn't know about it by now, consider yourself doubly notified. :) https://lists.apache.org/thread/9pwz3vtpf88fly27psc7yxvcv0lwbz8k I think I can speak on behalf of all of us when I say: **Thank You Ekaterina.**

This came up recently on another thread about when to branch 5.1, but we discussed our freeze plans and exception rules for TCM and Accord here: https://lists.apache.org/thread/mzj3dq8b7mzf60k6mkby88b9n9ywmsgw. Mick was essentially looking for a similar waiver for Vector search since it was well abstracted, depended on SAI and external libs, and in general shouldn't be too big of a disruption to get into 5.0.
General consensus at the time was "sure", and the work has since been completed. But here's the reminder and link for posterity (and in case you missed it).

Jaydeep reached out about a potential short-term solution to detecting token-ownership mismatch while we don't yet have TCM; this seems more pressing now as we're looking at a 5.0 without yet having TCM in it. The dev ML thread is here: https://lists.apache.org/thread/4p0orhom42g36osnknqj3fqmqhvqml1g, and he created https://issues.apache.org/jira/browse/CASSANDRA-18758 dealing with the topic. There's a relatively modest (7 files, just over 300 lines) PR available here: https://github.com/apache/cassandra/pull/2595/files; I haven't looked into it, but it might be worth considering getting this into 5.0 since it looks like we're moving to cutting w/out TCM. Any thoughts?

We had a pretty good discussion about automated repair scheduling, discussing whether it should live in the DB proper vs. in the sidecar, pros and cons, pressures, etc. Not sure if things moved beyond that; I know there's at least a few implementations out there that haven't yet made their way back to the ASF project proper. Thread: https://lists.apache.org/thread/glvmkwknf91rxc5l6w4d4m1kcvlr6mrv. My hope is we can avoid the gridlock we hit for a long time with the sidecar where there are multiple implementations with different tradeoffs and everyone's disincentivized from accepting a solution different from their own in-house one since it'd theoretically require re-tooling. Tough problem with no easy solutions, but would love to see this become a first class citizen in the ecosystem.

Paulo brought up a discussion about moving to disk_access_mode = mmap_index_only on 5.0. Seemed to be a consensus there but I'm not sure we actually changed that in the 5.0 branch? Thread: https://lists.apache.org/thread/nhp6vftc4kc3dxskngxy5rpo1lp19drw.
Just pulled on cassandra-5.0 and it looks like auto + hasLargeAddressSpace() == .mmap rather than .mmap_index_only.

David Capwell worked on adding some retries to repair messages when they're failing to make the process more robust: https://lists.apache.org/thread/wxv6k6slljqcw73xcmpxj4kn5lz95jd1. Reception was positive enough that he went so far as to back-port it and also work on some for IR. Looks like he could use a reviewer here: https://issues.apache.org/jira/browse/CASSANDRA-18962 - and it's patch available.

Mike Adamson reached out about adding / taking a dependency on jvector: https://lists.apache.org/thread/zkqg7mk9hp35zn0cf1tvywc2m3l63jrn. The general gist of it was "looks good, written by committer(s) / pmc members, permissively licensed. Go for it". There was some discussion about copyright holders and whether that matters from an ASF perspective, and we've further had some good discussion about the application of generative AI tooling to not just code contributed to the ASF, but also in dependencies we bring into the project. If you're curious about more details, check out the Apache LEGAL-656 JIRA here: https://issues.apache.org/jira/browse/LEGAL-656. The TL;DR comment is from Roman here: https://issues.apache.org/jira/browse/LEGAL-656?focusedCommentId=17779813&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17779813.

Maxim Muzafarov keeps fighting the good fight of helping to clean up our codebase; he opened a thread about Cassandra's code style and source analysis here: https://lists.apache.org/thread/lr90ckt7scgs4tqjwd2t7928plngo5zl. We have a "code-polishing" label you can check; we're holding off on those tickets until after Accord and TCM merge so they don't take on a painful rebase burden mid-integration work (https://issues.apache.org/jira/issues/?jql=labels%20%3D%20code-polishing).
Mick had some ideas around improving how we announce and handle having broken branches and merging to them: https://lists.apache.org/thread/n7zhzk4svdh1v3pswkrfwxw4o3g2f6xy. The gist of this: it's not great when a branch is straight up broken in ASF CI and then folks merge more code on top of that break; makes it harder to root out what's going on. We didn't _really_ get too far in closure on how we'd prevent this case in the future beyond "email the dev ML, post in #cassandra-dev slack, and... pray?". I'm in favor of a slack-bot that yells at us hourly if our builds are formally broken so we can't forget, with the assumption it _should_ be a pretty rare situation. If anyone else has input here that'd be helpful.

Builds for 5.0 and trunk are now based on in-tree build scripts (found in .build). The scripts were moved from the cassandra-builds repo here: https://github.com/apache/cassandra-builds, where you can find build scripts used for other branches. Expect this to continue to evolve as we take some of the best learnings from circleci and other build systems and integrate them upstream.

Claude discovered that our documentation for development dependencies is out of date: https://lists.apache.org/thread/91l7x7r0w7yycndslfc8kjs74s3jyqr2. Looks like Abe's working on an update there, but if anyone has opinions or cycles to help out this is high leverage work.

Yifan Cai reached out about merging some changes for CQLSSTableWriter to 4.0 and up. Since this is offline tools only the general consensus was "go for it": https://lists.apache.org/thread/nwqdmqzoht2nyw9hg8o061vh6vk2oxd5

Maxim could use a reviewer for allowing UPDATE on settings virtual tables (ML: https://lists.apache.org/thread/rsgtwdlg411d76kptkbxv292hnv1s1c5, original ML thread here: https://lists.apache.org/thread/8kywzv24n0dp07mhvch7hwhjypssoh0l, JIRA: https://issues.apache.org/jira/browse/CASSANDRA-15254).
I have to imagine most users would prefer to use CQL to interact w/their node settings than JMX, though I assume most of us have some Stockholm Syndrome at this point.

Amit Pawar reached out about how we're approaching our defaults for the CommitLog (mmap vs. the new DirectI/O they have a PR up for). The general consensus was "that looks and sounds great, and we shouldn't change defaults until it's had time to bake as an option". https://lists.apache.org/thread/t6v0p10737p0joob2vcsdt0r3g8zt94q

*[CI]*
https://butler.cassandra.apache.org/#/
Since late July (~ 3 months):
3.0: 9 -> 18
• Was hovering around 12 ish for a good while there
3.11: 16 -> 20
• There's a lot more variance on this one. Curious why the delta from 3.0.
4.0: 24 -> 11
• Looks like long-term trend is around the 8? mark
4.1: 12 -> 12
• Pretty stable around 12 failures here
5.0: Averaging around 10
• Do we have too many branches yet?
trunk: 16 -> 12
• One pretty big spike in there when CI was transitioning over, but on the whole in a pretty "tame" place.

Low-grade noise on each of the branches. Spot-checking failures on 3.0, 4.0, and trunk, nothing really pops as being common between them.

*[What's been closed out]*
Updated quick-filter with new, ridiculous 90 day duration: https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484&quickFilter=2278
JQL sorted by priority then type: https://issues.apache.org/jira/issues/?jql=project%20%3D%20CASSANDRA%20and%20resolution%20%3D%20fixed%20and%20resolved%20%3E%20-90d%20order%20by%20priority%20DESC%2C%20type%20DESC

Due to the sheer volume of tickets (170 in the past 90 days!), I'll refrain from including them all in this email thread. I should be considerably less "compressed for time" in the near future, so fingers crossed we can get back to a more digestible volume on these updates, on a monthly cadence, as we go into aggressive "release-mode".
Being a part of an open-source community that's this mature, in a domain this complex, that's not only firing on all cylinders but going further and self-improving and accelerating is really gratifying and humbling for me. Thanks everyone for being a part of this.

~Josh