Re: [DISCUSS] in-tree AI-assisted code-review prompts

David Capwell Fri, 29 May 2026 10:55:20 -0700

BTW there is https://github.com/apache/cassandra/pull/4837 which adds skills 
for property / stateful testing.  It also ships tests for utilities that had no 
direct tests; the skills were used to create all of them.


> On May 28, 2026, at 1:38 AM, Štefan Miklošovič <[email protected]> wrote:
> 
> Hi Alex,
> 
> Has the situation around your skills improved in relation to what you
> have described or can we move forward with it already?
> 
> I think it is better to have something in rather than trying to
> perfect it on the first merge. The skills are useful as they are
> already and they can be calibrated in the future.
> 
> Regards
> 
> On Fri, May 15, 2026 at 6:58 PM Alex Petrov <[email protected]> wrote:
>> 
>> It performs poorly on larger patches, so I was trying to chunk it. I was 
>> also experimenting with reverse checklists: you generate a review checklist 
>> per patch and take the skill as input for inspiration. Kind of like semgrep 
>> rules, but encoded verbally.
>> 
>> On Fri, May 15, 2026, at 4:37 PM, Maxim Muzafarov wrote:
>> 
>> As for large patches used to test new skills, I think the “CEP-38: CQL
>> Management API” PR ( https://github.com/apache/cassandra/pull/4582 )
>> could be a good playground to validate the relevance and accuracy of
>> the suggestions provided by the deep-review and patch-explainer
>> skills.
>> 
>> (By the way, we still need a reviewer to move this patch forward.)
>> 
>> I used patch-explainer to generate a description. This is what it looks like:
>> https://github.com/Mmuzaf/cassandra/blob/cassandra-19476-bug-hunting/CASSANDRA-19476-PR-DESCRIPTION.md
>> 
>> Thoughts,
>> 
>> I think it would be useful to explicitly mention a strategy to split
>> large patches into some reviewable parts, for example by logically
>> separating them by component. There is already a “Skip or minimize”
>> section, but it does not mention breaking large patches into blocks
>> (if it's possible). The skill currently does not mention trade-offs,
>> although during implementation I constantly kept them in mind and even
>> tracked them separately in my notes for each critical section. For
>> example, what is actually preferable: issuing a direct command QUERY
>> request or invoking pre-registered prepared statements?
>> 
>> I also experimented with Mermaid diagrams [1] instead of ASCII
>> diagrams. This is how they could look [2]; they look better than the
>> text versions, although I noticed they tend to be less accurate.
>> 
>> 
>> I also tested deep-review, and although I had already used Claude to
>> review my changes, it still highlighted several issues that need to be
>> fixed:
>> https://github.com/Mmuzaf/cassandra/blob/cassandra-19476-bug-hunting/CEP-38_DEEP_REVIEW.md
>> 
>> Overall, I think it’s good.
>> Could you share any deficiencies you’ve spotted, Alex?
>> 
>> 
>> [1] https://de.wikipedia.org/wiki/Mermaid_(Software)
>> [2] 
>> https://github.com/Mmuzaf/cassandra/blob/cassandra-19476-bug-hunting/CASSANDRA-19476-PR-DESCRIPTION-MERMAID.md
>> 
>> 
>> On Fri, 15 May 2026 at 09:18, Alex Petrov <[email protected]> wrote:
>>> 
>>> I have spotted some deficiencies, particularly when reviewing large 
>>> patches. I have an experiment running that might improve the situation. 
>>> I’ll report as soon as I have a result.
>>> 
>>> On Thu, May 14, 2026, at 12:31 PM, Štefan Miklošovič wrote:
>>> 
>>> I just merged (1) and created (2) for tracking the patch of Alex. (1) and 
>>> (2) don't collide.
>>> 
>>> It would be cool to include this (2) in upcoming weeks, let's just live 
>>> with what Alex provided for a while to evaluate that set of skills. If the 
>>> general vibe is OK I would approach the merge. Let's give it what ... few 
>>> weeks? Until the end of the month  at least.
>>> 
>>> (1) https://issues.apache.org/jira/browse/CASSANDRA-21301
>>> (2) https://issues.apache.org/jira/browse/CASSANDRA-21373
>>> 
>>> On Mon, May 11, 2026 at 3:21 PM Štefan Miklošovič <[email protected]> 
>>> wrote:
>>> 
>>> BTW I really appreciate TLA+ machinery in that patch, I let it scan 
>>> compression dictionaries code and how we disperse notifications around the 
>>> cluster when a dict is trained etc. and it spit out stuff like this (1). There 
>>> is an IDEA plugin for TLA+ I ran it in and it just worked and verified :) I 
>>> can imagine these specs might be theoretically something we commit into the 
>>> repo as well when applicable. That way we would at least conceptually 
>>> codify the protocols and could elaborate on them on a high level and run 
>>> some formal verifications etc ... Really appreciate this aspect of it.
>>> 
>>> (1) https://gist.github.com/smiklosovic/24b4db51f9ee2b64d76cb0bbb104e29a
>>> 
>>> On Mon, May 11, 2026 at 11:31 AM C. Scott Andreas <[email protected]> 
>>> wrote:
>>> 
>>> Alex - thanks so much for putting this together and sharing.
>>> 
>>> Here are three additional data loss / corruption bugs identified by Arjun 
>>> Ashok using this set of skills last week:
>>> 
>>> – https://issues.apache.org/jira/browse/CASSANDRA-21356: 
>>> CursorBasedCompaction: ReusableLivenessInfo.isExpiring() incorrectly 
>>> returns true for tombstone cells, corrupting cursor-compacted SSTable 
>>> format and cell reconciliation
>>> – https://issues.apache.org/jira/browse/CASSANDRA-21357: 
>>> CursorBasedCompaction: prevUnfilteredSize always written as 0 in 
>>> SSTableCursorWriter
>>> – https://issues.apache.org/jira/browse/CASSANDRA-21358: 
>>> CursorBasedCompaction: Final index block width off by one byte in 
>>> SSTableCursorWriter#appendBIGIndex()
>>> 
>>> Stepping back a bit --
>>> 
>>> This set of skills combined with the Opus model have enabled folks to find 
>>> 14 data loss, corruption, and correctness bugs in the project in the past 
>>> ~two weeks. These are bugs that likely would have gone undetected - and if 
>>> encountered in the wild, would have required extensive manual fuzz testing 
>>> to reproduce and identify.
>>> 
>>> In the case of the issue that I'd found and reported:
>>> https://issues.apache.org/jira/browse/CASSANDRA-21340: GROUP BY queries 
>>> silently return incomplete results due to premature SRP abort
>>> 
>>> I found this by invoking the skill with the prompt "Review Cassandra's 
>>> implementation of GROUP BY for correctness. Identify edge cases that might 
>>> result in incorrect responses. After identifying candidate bugs, fan out 
>>> subagents to write unit tests and fuzz tests attempting to reproduce them. 
>>> Assess their veracity, and present them in order of concern."
>>> 
>>> In less than 30 minutes while sitting on the sofa, the model and skill 
>>> identified CASSANDRA-21340. In another hour, I was able to establish its 
>>> veracity, then leave the model and prompt behind to work through the issue 
>>> and write up the Jira ticket by hand.
>>> 
>>> I'm *really* impressed by what this set of skills enables, and I think they 
>>> may be transformative for quality in Apache Cassandra – especially when 
>>> combined with the ability to write in-JVM dtests; Harry tests; and to use 
>>> the Simulator. These also make it a lot easier to use each of these tools.
>>> 
>>> Here's how I'm thinking about this work so far:
>>> 
>>> – The ensemble review skills are a great first-pass review that can be used 
>>> by anyone preparing a patch to identify potential issues.
>>> – They're incredible for pointing at existing and/or new + experimental 
>>> components in Cassandra to find serious correctness issues.
>>> – I'm sure we'd find latent issues if we directed the skills at interaction 
>>> between multiple components, like "range tombstones x short read protection 
>>> x reverse reads x compact storage" (etc).
>>> – I think these skills could be generalized to support bug-finding and 
>>> validation in other Apache projects.
>>> – I also think there is a generalization of these skills that could be 
>>> applied to CPU + allocation profiling and optimization.
>>> 
>>> For those who have access to a suitable model, I'd love to hear your 
>>> experience attempting to find a latent bug in the database.
>>> 
>>> I was shocked how easy it was, and am hopeful for what this might do for 
>>> quality and data integrity in the project.
>>> 
>>> – Scott
>>> 
>>> On May 8, 2026, at 5:22 PM, Alex Petrov <[email protected]> wrote:
>>> 
>>> 
>>> I would recommend Opus 4.6+ for /deep-review, but /shallow-review is 
>>> probably fine with Sonnet.
>>> 
>>> Maybe time permitting, I can do evals for different models at some point.
>>> 
>>>> Review process is always a bottleneck and introducing such skills should 
>>>> help to make it faster and more reliable.
>>> 
>>> That is the hope here, but this is also just a start: we need to reduce 
>>> false-positives, and do more with specifications (P, TLA+) for critical 
>>> parts of code.
>>> 
>>> On Fri, May 8, 2026, at 5:56 PM, Dmitry Konstantinov wrote:
>>> 
>>> Hi, Alex, thank you a lot for sharing it. I have been using Claude Code for 
>>> review of my changes but in a very basic ad-hoc way, it works for simple 
>>> issues. The skills look much much more powerful. I am going to read and try 
>>> them in the upcoming weeks.
>>> Review process is always a bottleneck and introducing such skills should 
>>> help to make it faster and more reliable.
>>> 
>>> A question: what model(s) do you use to run them? Is Sonnet 4.6 enough?
>>> 
>>> Thanks,
>>> Dmitry
>>> 
>>> On Fri, 8 May 2026 at 14:03, Alex Petrov <[email protected]> wrote:
>>> 
>>> 
>>> Hello folks,
>>> 
>>> We have been working on some tooling [1] around Apache Cassandra 
>>> correctness, and wanted to share it with Cassandra community.
>>> 
>>> We have approached this by "indexing" ~3k Cassandra issues and extracting 
>>> common patterns from them, generalizing them, then running evals, tweaking, 
>>> and extending them until we had a strong signal that it performs 
>>> better than the run-of-the-mill code review skill. We have benchmarked it 
>>> against some popular OSS skills (by presenting bugs we knew existed from 
>>> "indexing" Apache Kafka, inferring commit bug source from the fix, and 
>>> making sure benchmarked skills actually find it).
>>> 
>>> In addition, I did my best to codify some things I knew about correctness, 
>>> researching code, and writing repros, and what I could find in research 
>>> papers and public blog posts.
>>> 
>>> So far we were able to find (at the very least) the following issues (in reality 
>>> the number is higher but I have a backlog of potential leads to investigate 
>>> and reproduce longer than the time I have available for these pursuits).
>>> 
>>> deep review + fuzzer:
>>> 
>>> CASSANDRA-21307: Lower bound [SSTABLE_UPPER_BOUND(row000063)] is bigger 
>>> than first returned value
>>> CASSANDRA-21292: Row re-inserted at the exact start of a range tombstone 
>>> disappears after major compaction
>>> CASSANDRA-21255: Differentiate between legitimate cases where the first 
>>> entry is the same as the last entry and empty bounds in 
>>> SSTableCursorWriter#addIndexBlock()
>>> 
>>> shallow + deep review:
>>> 
>>> (latent) issue of unused keepFrom in linearSubtract 
>>> https://github.com/apache/cassandra-accord/pull/272
>>> CASSANDRA-21336: CursorBasedCompaction: trailing present columns are 
>>> silently dropped in encodeLargeColumnsSubset()
>>> CASSANDRA-21340: GROUP BY queries silently return incomplete results due to 
>>> premature SRP abort
>>> CASSANDRA-21352: TCM: AtomicLongBackedProcessor sort inversion
>>> CASSANDRA-21353: putShortVolatile is not volatile in InMemoryTrie
>>> 
>>> Via specifications:
>>> 
>>> CASSANDRA-21337: Difference in behavior between Cursor-Based compaction and 
>>> "Regular" compaction
>>> CASSANDRA-21336: CursorBasedCompaction: trailing present columns are 
>>> silently dropped in encodeLargeColumnsSubset()
>>> CASSANDRA-21339: CursorBasedCompaction: expiring cells, same timestamp, 
>>> same ldt, different ttl
>>> CASSANDRA-21338: value comparison direction reversed in CursorCompactor
>>> 
>>> A few folks were using this skill to test some of the subsystems, and might 
>>> report more issues that I am not directly attributing here. I have also 
>>> used these skills for self-review and have caught a couple of issues before 
>>> they made it into the codebase.
>>> 
>>> Despite some early success, I still consider this a very raw set of 
>>> prompts, but I think this has utility, and based on the success we have 
>>> seen so far, can be helpful and is (according to my measurement 
>>> methodology) faring better than one-shot code review prompts that an LLM 
>>> would generate by user request.
>>> 
>>> Since I was focusing on finding issues, running evals, and trying several 
>>> other methodologies that did not make it into this version/cut, I did not have 
>>> a chance to sit and re-read the entire final result just yet, which is why 
>>> I am not suggesting merging this into Cassandra codebase until we better 
>>> vet it, but with your help and feedback maybe we can do this quicker.
>>> 
>>> Hope you find this useful, please share your opinion, experience, and 
>>> criticism.
>>> 
>>> Happy bug hunting!
>>> --Alex
>>> 
>>> [1] https://github.com/apache/cassandra/pull/4794
>>> 
>>> 
>>> On Mon, Apr 13, 2026, at 1:12 PM, Štefan Miklošovič wrote:
>>> 
>>> I noticed this PR just landed.
>>> 
>>> Volunteers reviewing / improving greatly appreciated!
>>> 
>>> (1) https://github.com/apache/cassandra/pull/4734
>>> 
>>> On Thu, Feb 26, 2026 at 5:43 PM Jon Haddad <[email protected]> wrote:
>>> 
>>> I wanted to share a couple of other things I thought of.  I wrote this:
>>> 
>>>> C*'s technical debt will make using an agent in the codebase much harder 
>>>> than using one in my own
>>> 
>>> I want to clarify my intent with this statement.  I was trying to convey 
>>> that I've had the luxury of refactoring my code several times, because I 
>>> don't have to worry about messing with other people's branches.  I usually 
>>> write something, use it briefly, find its faults, redo it, and iterate 
>>> several times.  I never consider anything done and am always looking to 
>>> improve. This is very difficult with a project involving many people who 
>>> have in-flight branches spanning several months.  Changes I consider 
>>> no-brainers might be a headache for C*.  For example, I can just add a code 
>>> formatter and rewrite every file in the codebase.  I make major changes 
>>> regularly without any consequences. Here, it impacts dozens of people.  I 
>>> proactively improve my code's architecture because there are few, if any, 
>>> negative reasons not to.  It's enabled me to pay off a ton of technical 
>>> debt that accumulated over the eight years I handwrote everything.
>>> 
>>> Another example: I've been working on an orchestration tool around 
>>> easy-db-lab to automate running my tests across several clusters in 
>>> parallel.  I recently refactored it to split the REST server code from the 
>>> execution into Gradle submodules.  Now I can create different agents 
>>> specializing in each module's content, which slims down the context for 
>>> each agent.  Since I have a very clear boundary on each agent's 
>>> responsibility, I avoid the overhead of having one agent manage one huge 
>>> codebase.  I can specifically tell that one agent is responsible for this 
>>> directory, and its expertise is in Ktor.  Another agent is a Gradle expert. 
>>>  Another is Kubernetes.  When I work on tasks they can be decomposed into 
>>> task lists for each specialized agent.
>>> 
>>> I've always thought this would be a great architectural improvement for the 
>>> C* codebase regardless of LLMs. For example, putting the CQL parser in a 
>>> standalone module would allow us to publish it so people could consume it 
>>> in their own ecosystem without pulling in C*-all.  Isolating a few of these 
>>> subsystems could reduce cognitive overhead and simplify test design.  I'm 
>>> sure making the commit log reader standalone would make it much easier to 
>>> use in the sidecar. Easily using the SSTable readers and writers without 
>>> all the other dependencies would reduce workarounds in bulk analytics and 
>>> make these types of projects more feasible, benefiting the wider ecosystem.
>>> 
>>> Regardless of this approach, creating a devcontainer environment for the 
>>> project and pushing the image to GHCR would also be beneficial.  I am now 
>>> using one with each of my tools.  I don't trust Claude not to wipe my 
>>> system, so I sandbox it in a container. It only has access to the local 
>>> project and cannot push code or reach GitHub.  Devcontainers are supported 
>>> directly in IDEA, Zed, and VSCode.  You can also launch them directly from 
>>> GitHub or use the Claude mobile app.  I haven't spent much time on this yet 
>>> though, I still prefer two big 5k screens and a deafening mechanical 
>>> keyboard.
>>> 
>>> Jon
>>> 
>>> [1] 
>>> https://github.com/rustyrazorblade/easy-db-lab/blob/main/.devcontainer/devcontainer.json
>>> [2] 
>>> https://github.com/rustyrazorblade/easy-db-lab/blob/main/.devcontainer/Dockerfile
>>> 
>>> 
>>> 
>>> On Thu, Feb 26, 2026 at 12:58 AM Štefan Miklošovič <[email protected]> 
>>> wrote:
>>> 
>>> Thank you Jon for sharing, that was very helpful. All these insights are 
>>> invaluable.
>>> 
>>> On Wed, Feb 25, 2026 at 11:50 PM Jon Haddad <[email protected]> 
>>> wrote:
>>> 
>>> Regarding ant, we'd probably want a wrapper shell script that is more 
>>> LLM-friendly, hiding the excessive text and providing more actionable 
>>> output.  You can also delegate any task to a subagent so you don't waste 
>>> your context on the `ant` output, and use Claude's new Agent Teams [1] 
>>> feature to have a "builder" agent run in its own process.
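
A minimal sketch of the kind of wrapper described above: run a noisy build command, keep only the lines that look actionable plus the tail, and report how much was dropped. The function names and the regular expression are assumptions for illustration, and a real Ant build would likely need different patterns.

```python
import re
import subprocess

# Hypothetical notion of an "interesting" build line; tune for the real build.
INTERESTING = re.compile(
    r"BUILD (SUCCESSFUL|FAILED)|\berror\b|\bFAILED\b|Exception",
    re.IGNORECASE,
)


def condense(output, tail=20):
    """Keep matching lines and the last `tail` lines, in their original order."""
    lines = output.splitlines()
    first_tail_index = max(len(lines) - tail, 0)
    kept = [
        line
        for index, line in enumerate(lines)
        if index >= first_tail_index or INTERESTING.search(line)
    ]
    omitted = len(lines) - len(kept)
    header = [f"[{omitted} lines omitted]"] if omitted else []
    return "\n".join(header + kept)


def run_quietly(command, tail=20):
    """Run `command` (a list of arguments) and return (exit code, condensed output)."""
    proc = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave stderr so errors keep their position
        text=True,
    )
    return proc.returncode, condense(proc.stdout, tail)
```

Something like `run_quietly(["ant", "jar"])` would then hand the agent a few dozen lines instead of the full log. The exit code is returned separately so the caller never has to infer success from the text.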
>>> 
>>> Docs help Claude find code, big time.  You can give it your organizational 
>>> structure and that institutional knowledge so it doesn't have to pull in 
>>> many tokens from dozens of files.  It *definitely* works.  I've pushed over 
>>> a quarter million LOC this month alone [2], and many of you may already 
>>> know I'm obsessed with efficiency.  I constantly test new ideas and 
>>> approaches to refine my process; I've found good documentation is 
>>> *critical*.
>>> 
>>> I've recently started working with both Spec-Kit (Microsoft, but it looks 
>>> abandoned) and OpenSpec, as both are designed to maintain long-term memory 
>>> for a project's product requirements and technical decisions.  OpenSpec is 
>>> supposed to work better for brownfield and iterative projects.  I haven't 
>>> tried BMAD yet.  It seemed a bit more heavyweight, but it may be better for 
>>> this project than my personal ones, where I don't collaborate with anyone.
>>> 
>>> I have found that the best results come from loosely coupled systems.  C*'s 
>>> technical debt will make using an agent in the codebase much harder than 
>>> using one in my own.  I haven't tried to work on a patch in C* yet with an 
>>> agent, but when I do I'll be sure to share what I've learned.
>>> 
>>> Today I introduced OpenSpec to easy-db-lab, you can see what it looks like 
>>> [3] if you're curious.  A number of markdown commands were added to the 
>>> repo, and Spec-Kit was removed.  I haven't reviewed it yet.  By the time 
>>> you read this I will have likely made some changes in a review. If you want 
>>> to see the before and after, the pre-review commit is c6a94e1.
>>> 
>>> Jon
>>> 
>>> [1] https://code.claude.com/docs/en/agent-teams
>>> [2] my 2 main projects, not including client work:
>>> git log --since="$(date +%Y-%m-01)" --numstat --pretty=tformat: | awk 
>>> 'NF==3 {added+=$1; removed+=$2} END {print "Added:", added, "Removed:", 
>>> removed}'
>>> Added: 90339 Removed: 45222
>>> 
>>> git log --since="$(date +%Y-%m-01)" --numstat --pretty=tformat: | awk 
>>> 'NF==3 {added+=$1; removed+=$2} END {print "Added:", added, "Removed:", 
>>> removed}'
>>> Added: 124863 Removed: 52923
>>> 
>>> 
>>> [3] https://github.com/rustyrazorblade/easy-db-lab/pull/530/changes
>>> 
>>> On Wed, Feb 25, 2026 at 6:18 AM David Capwell <[email protected]> wrote:
>>> 
>>> I’m not against memory / skills being added, but do want to request we 
>>> think / test to make sure we can quantify the gains
>>> 
>>> Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding 
>>> Agents?
>>> arxiv.org
>>> 
>>> SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks
>>> arxiv.org
>>> 
>>> 
>>> These papers actually match my lived experience with this project and 
>>> others.
>>> 
>>> 1) using /init to create CLAUDE.md / AGENTS.md yields negative results.  
>>> This is how I started and have moved away.  What is the context you need 
>>> 100% of the time? It’s things that Claude can’t discover easily, such as 
>>> tribal knowledge (e.g. a link to our style guide).
>>> 2) Ant is horrible for agents, not to figure out what to do (Claude is good 
>>> at that) but at context bloat… do “ant jar” and you add like 10-20k tokens… 
>>> you MUST have tooling to fix this (I ban Claude from touching the ant command, 
>>> it’s only allowed to run “ai-build” and “ai-ci-test”, as these fix the 
>>> context problems; rtk “might” work here, not tested as I’m on leave)
>>> 3) Claude doesn’t need docs to find code, that actually confuses it more.  
>>> When it needs to modify code it’s going to have to explore and will most 
>>> likely find what it needs.  I agree docs for humans would help, but let’s 
>>> keep it out of AI memory files.
>>> 4) I only really use sonnet/opus 4.5+, these claims might not be true for 
>>> older models or the open weight models.
>>> 
>>> As for skills, the following make sense to me, but I really hope a human 
>>> writes them, as AI doesn’t do well at understanding the WHY and makes bad 
>>> assumptions: property testing, stateful property testing, harry, The 
>>> Simulator.  I left out cqltester because I found Claude doesn’t suck at it, 
>>> so not sure what a skill would add. The others I found it struggles with 
>>> and produces bad quality tests.
>>> 
>>> Last comment: Stefan, your link about ai code in the project didn’t take 
>>> into account what happened in the PR.  Our global static state world caused 
>>> a single test to fail which required a complete rewrite of the patch that I 
>>> ended up doing by hand.  So that patch ended up being 100% human.
>>> 
>>> 
>>> On Feb 18, 2026, at 6:29 PM, Štefan Miklošovič <[email protected]> 
>>> wrote:
>>> 
>>> These are great points. I like how granular the approach of having
>>> multiple files is. That means we do not need to craft one
>>> "uber-claude.md" but we can do this iteratively and per specific
>>> domain which is easier to handle.
>>> 
>>> One consequence of having these "context files" is that a contributor
>>> does not even need to use any AI whatsoever in order to be more
>>> productive and organized. There is a lot of time lost when a new
>>> contributor wants to understand how the project "thinks", what the
>>> dos and don'ts are, etc. All stuff which appears once a patch is
>>> submitted. If we explained to everybody in plain English how this all
>>> works on a detailed level, per domain, that would be tremendously
>>> helpful even without AI.
>>> 
>>> It will be interesting to watch how these files are written. To
>>> formalize and write it down is quite a task on its own.
>>> 
>>> 
>>> On Wed, Feb 18, 2026 at 6:47 PM Patrick McFadin <[email protected]> wrote:
>>> 
>>> 
>>> Context size is the hardest thing to manage right now in agentic coding. 
>>> I’ve stopped using MCP and switched to skills as a result.
>>> 
>>> 
>>> A couple of things worth noting. You can use multiple 
>>> CLAUDE.md/AGENTS.md files in a large code base. I’ve started doing this and 
>>> it is remarkable. For example, in the pylib directory a CLAUDE.md file 
>>> would provide the Python specific info if making changes. The standard 
>>> layout for each should be
>>> 
>>> - What is this
>>> 
>>> - Where do I get more information
>>> 
>>> - How do I run or test
>>> 
>>> - What are the non-negotiable rules
>>> 
>>> - What does done look like
>>> 
>>> 
>>> Imagine one in all sorts of places. fqtool, sstableloader, o.a.c.io.*, 
>>> o.a.c.repair.* etc etc. And they can evolve over time as people use them.
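
To make the per-directory idea concrete, here is a small sketch that collects the context files applying to a given source file, from the repository root (most general) down to the file's own directory (most specific). It is purely illustrative: the function name is made up, and no claim is made that any particular agent resolves these files this way.

```python
from pathlib import Path

# File names to look for in each directory; an assumption for this sketch.
CONTEXT_NAMES = ("CLAUDE.md", "AGENTS.md")


def context_files_for(repo_root, target):
    """Return context files that apply to `target`, root-most first.

    `target` may be a file or a directory inside `repo_root`. A ValueError
    is raised if it lies outside the repository.
    """
    root = Path(repo_root).resolve()
    target = Path(target).resolve()
    start = target if target.is_dir() else target.parent
    relative = start.relative_to(root)

    found = []
    current = root
    # Visit the root itself, then each directory on the way down to `start`.
    for part in ("",) + relative.parts:
        if part:
            current = current / part
        for name in CONTEXT_NAMES:
            candidate = current / name
            if candidate.is_file():
                found.append(candidate)
    return found
```

With a root-level file and one under pylib, a change to a Python file below pylib would pick up both, while a change under src/java would see only the root one. That keeps the domain-specific rules out of context unless they are relevant.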
>>> 
>>> 
>>> The other thing to bring up is Brokk built by Jonathan Ellis. He 
>>> specifically built it for large code bases and specifically tests on the 
>>> Cassandra code base. (I’ll let him jump in here)
>>> 
>>> 
>>> Patrick
>>> 
>>> 
>>> On Feb 18, 2026, at 8:51 AM, Josh McKenzie <[email protected]> wrote:
>>> 
>>> 
>>> I’ve had trouble using Claude effectively on C*’s large codebase without a 
>>> lot of repeated “repo discovery” prompting.
>>> 
>>> 
>>> Just to keep beating the drum: I've had trouble working in our codebase 
>>> effectively without a lot of repeated "repo discovery" time. In fact, a 
>>> huge portion of the time I spend working on the codebase consists of 
>>> reading into adjacent coupled classes and modules since things are a) not 
>>> consistently or thoroughly documented, and b) generally not that decoupled.
>>> 
>>> 
>>> This is also / primarily a "human <-> information interfacing efficiency 
>>> problem" and it just so happens LLM's and agents being blocked from working 
>>> on our codebase is giving us an immediate short-term pain-proxy for 
>>> something I strongly believe has been a long-term tax on us.
>>> 
>>> 
>>> On Wed, Feb 18, 2026, at 10:04 AM, Isaac Reath wrote:
>>> 
>>> 
>>> I'm a +1 for the same reason that Josh lays out. Markdown files that detail 
>>> the structure of the repo, how to build & run tests, how to get checkstyle 
>>> to pass, etc. are all very valuable to new contributors even if LLMs went 
>>> away today.
>>> 
>>> 
>>> On Tue, Feb 17, 2026 at 7:33 PM Jon Haddad <[email protected]> wrote:
>>> 
>>> 
>>> It's all part of the same topic, Yifan.  You're making a distinction 
>>> without a difference. We could just as easily be discussing supporting 
>>> certain MCP servers like serena, or baking claude into a devcontainer.  
>>> It's all relevant. There's no need to police the discussion.
>>> 
>>> 
>>> On Tue, Feb 17, 2026 at 4:25 PM Yifan Cai <[email protected]> wrote:
>>> 
>>> 
>>> The original post was about adding AI tooling, prompt, command, or skill. 
>>> The thread is shifted to AI memory files.
>>> 
>>> 
>>> I do not have an objection to any of these, but want to make sure that we 
>>> are still on the original topic.
>>> 
>>> 
>>> IMO, AI tooling has a clear scope / definition and is easier to reach 
>>> consensus on. Meanwhile, AI memory files are vague to define clearly. 
>>> Different developers on different domains could have quite different 
>>> preferences.
>>> 
>>> 
>>> - Yifan
>>> 
>>> 
>>> On Tue, Feb 17, 2026 at 3:37 PM Dmitry Konstantinov <[email protected]> 
>>> wrote:
>>> 
>>> 
>>> I do not have my own, but here are a few examples from other Apache 
>>> projects:
>>> 
>>> https://github.com/apache/camel/blob/main/AGENTS.md
>>> 
>>> https://github.com/apache/ignite-3/blob/main/CLAUDE.md
>>> 
>>> https://github.com/apache/superset/blob/master/superset/mcp_service/CLAUDE.md
>>> 
>>> 
>>> On Tue, 17 Feb 2026 at 23:22, Jon Haddad <[email protected]> wrote:
>>> 
>>> 
>>> I think a few folks are already using CLAUDE.md files in their repo and 
>>> they're just not committing them.
>>> 
>>> Anyone want to share what's already done?  I'm happy to help share what I 
>>> know about the agentic side of things, but since I don't do much in the way 
>>> of patching C* it would be a lot of guessing.
>>> 
>>> 
>>> If I'm wrong and nobody shares one, I'll take a stab at it.
>>> 
>>> 
>>> 
>>> 
>>> On Tue, Feb 17, 2026 at 3:08 PM Štefan Miklošovič <[email protected]> 
>>> wrote:
>>> 
>>> 
>>> Great feedback everybody! Really appreciate it!
>>> 
>>> 
>>> Reading what Jon posted ... Jon, I think you are the most experienced
>>> in this based on what you wrote. Would you mind doing some POC here
>>> for Cassandra repo? For the trunk it is enough ... Something we might
>>> build further on. I think we need to build the foundations of that and
>>> put some structure into it and all things considered I think you are
>>> best for the job here.
>>> 
>>> If the basics are there we can play with it more before merging, this
>>> is not something which needs to be done "tomorrow", we can collaborate
>>> on something together for some time and add things into it as patches
>>> come. I think it takes some time to "tune" it.
>>> 
>>> Everybody else feel free to help! My experience in this space is
>>> limited, I think there are people who are using it more often than me
>>> for sure.
>>> 
>>> 
>>> Regards
>>> 
>>> 
>>> On Wed, Feb 18, 2026 at 12:59 AM Joel Shepherd <[email protected]> wrote:
>>> 
>>> 
>>> There's been some momentum building for AGENTS.md files, both on the
>>> project and on the agent side:
>>> 
>>>    https://agents.md/
>>> 
>>> Same idea and benefits, but it might help to align folks on a "standard"
>>> that will work well across agents.
>>> 
>>> I also think that more and better code documentation can be very
>>> beneficial when using agents to help with working out implementation
>>> details. I spent a bunch of time in January writing an introduction to
>>> Apache Ratis (Raft as a library:
>>> https://github.com/apache/ratis/blob/master/ratis-docs/src/site/markdown/index.md).
>>> The code itself is pretty well-documented but it was hard for me to
>>> build a mental model of how to integrate with it. AI was very effective in
>>> taking the granular in-code documentation and synthesizing an overview
>>> from it. Going the other way, the in-code documentation has made it
>>> possible for me to deep dive the Ratis code to root cause bugs, etc.
>>> Agents can get a lot out of good class- and method-level documentation.
>>> 
>>> 
>>> -- Joel.
>>> 
>>> 
>>> On 2/16/2026 8:03 PM, Bernardo Botella wrote:
>>> 
>>> 
>>> 
>>> 
>>> 
>>> Thanks for bringing this up Stefan!!
>>> 
>>> 
>>> A really interesting topic indeed.
>>> 
>>> 
>>> 
>>> I’ve also heard ideas around even having Claude.md type of files that help 
>>> LLMs understand the code base without having to do a full scan every time.
>>> 
>>> 
>>> So, all in all, putting together something that we as a community think 
>>> describes good practices + repository information, not only for the main 
>>> Cassandra repository but also for its subprojects, will definitely help 
>>> contributors adhere to standards and help us reviewers ensure that some 
>>> steps at least will have been considered.
>>> 
>>> 
>>> Things like:
>>> 
>>> - Repository structure. What every folder is
>>> 
>>> - Test suites and how they work and run
>>> 
>>> - Git commits standards
>>> 
>>> - Specific project lint rules (like braces in new lines!)
>>> 
>>> - Preferred wording style for patches/documentation
>>> 
>>> 
>>> Committed to the projects, and accessible to LLMs, these sound like really 
>>> useful context for those types of contributions (that are going to keep 
>>> happening regardless).
>>> 
>>> 
>>> So curious to read what others think.
>>> 
>>> Bernardo
>>> 
>>> 
>>> PS. Totally agree that this should change nothing of the quality bar for 
>>> code reviews and merged code
>>> 
>>> 
>>> On Feb 16, 2026, at 6:27 PM, Štefan Miklošovič <[email protected]> 
>>> wrote:
>>> 
>>> 
>>> Hey,
>>> 
>>> 
>>> This happened recently in kernel space. (1), (2).
>>> 
>>> 
>>> What that is doing, as I understand it, is that you can point an LLM to
>>> these resources and then it would be more capable when reviewing
>>> patches or even writing them. It is kind of a guide / context provided
>>> to the AI prompt.
>>> 
>>> I can imagine we would just compile something similar, merge it to the
>>> repo, then if somebody is prompting it then they would have an easier
>>> job etc etc, less error prone ... adhered to code style etc ...
>>> 
>>> This might look like a controversial topic but I think we need to
>>> discuss this. The usage of AI is just more and more frequent. From
>>> Cassandra's perspective there is just this (3) but I do not think we
>>> reached any conclusions there (please correct me if I am wrong where
>>> we are at with AI generated patches).
>>> 
>>> This is becoming an elephant in the room, I am noticing that some
>>> patches for Cassandra were prompted by AI completely. I think it would
>>> be way better if we make it easy for everybody contributing like that.
>>> 
>>> This does not mean that we, as committers, would believe what AI
>>> generated blindly. Not at all. It would still need to go over the
>>> formal review as anything else. But acting like this is not happening
>>> and people are just not going to use AI when trying to contribute is
>>> not right. We should embrace it in some form ...
>>> 
>>> 
>>> 1) https://github.com/masoncl/review-prompts
>>> 
>>> 2) 
>>> https://lore.kernel.org/lkml/[email protected]/
>>> 
>>> 3) https://lists.apache.org/thread/j90jn83oz9gy88g08yzv3rgyy0vdqrv7
>>> 
>>> 
>>> 
>>> 
>>> --
>>> 
>>> Dmitry Konstantinov
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> --
>>> Dmitry Konstantinov
>>> 
>>> 
>>> 
>>> 
>> 
>>
