Hi Lucas,

On 2/20/26 19:31, Lucas Nussbaum wrote:
> On 20/02/26 at 02:45 +0900, Simon Richter wrote:
>> For the long term health of the free software ecosystem, we also need an accessible entry point for novice developers.
>> This means leaving some simple onboarding tasks that could probably be solved by an AI agent for a human to do: the value is not in the code created, but in the knowledge transfer.
>> This also means limiting the complexity of software projects, and finding a good balance between boilerplate and deeply nested dependencies that reuse so much code that modifying a single line can affect hundreds of different uses.
> In the above, I think that you make the assumption that AI tools would cause us to run into a situation where there would not be enough simple tasks (suitable for new contributors) remaining. Unfortunately, I think that we have enough things to do, and sufficiently few new contributors, to avoid that issue for the foreseeable future.
No, I expect that people performing the simple tasks with AI assistance will no longer gain the understanding needed to graduate from there and take on more complex tasks over time, yet that progression is how we have traditionally gained new members.
Commercial software development is, in my opinion, already wrong to look at this through a productivity lens, because that ignores the long-term effects on the market. For a community project, it is even more wrong, because we can't even throw money at failures.
That's my point: even if AI were to deliver all that is promised and the legal, ecological and economic problems were solved, it would still not be a good fit for Debian. Debating whether the water usage amortizes over a greater number of queries[1], or whether the legal system agrees with the people with the deep pockets, is the second step, and presumes that we
1. actually want it,
2. can use it in a way that is consistent with our values, and
3. can build social structures for the project to keep operating in this changed landscape.
We are divided on 1, which is why we're having this debate at all, and we will not reach a consensus here. I personally detest it, because it's taking over precisely the fun aspects of coding and leaves me with the kind of drudgework that I used to raise my consulting fees for. There is a reason I only do one or two hours of code review per day at work, and spend the rest actively developing code that is as easy to review as possible.
For 2, I think that it is mostly incompatible with our values. AI coding assistants aren't free software, and they cannot be free software. At best they can be OSI-licensed frontends to a proprietary online service that could not be replicated even if the source code to all the server-side components were published. We have established elsewhere in this thread that an "ethically sourced" training dataset would not be sufficient.
Like a system that has been locked down with cryptography so only approved binaries can be loaded, this technology is outside the user's control, and we should not be endorsing it, even if we are only using its output.
We cannot stop individual contributors from using it, which leads to point 3: it requires stronger vetting of contributors before allowing them unsupervised access, and a more rigorous review culture than we currently have for first-time contributors.
Basically, my concern is less with seasoned developers, who have the necessary skills to solve the problem without AI assistance and who perform a thorough review of the generated code, than with newcomers who use AI for their first contributions. I expect that any non-trivial contribution will be larger and more complex, which creates extra review effort and at the same time gives us less insight into people's actual skill set.
For that reason, I think we should not accept AI-assisted contributions from newcomers; however, that leaves us needing to create and justify a policy on who is "allowed" to use AI in a Debian context.
> Also, I think that this assumes that AI assistance will mainly help with easy tasks.
No, I mean that it will be used on precisely the easy tasks that we used to assess people's skills with.
> This is not necessarily true: it could help with making harder tasks more accessible (by making it easier to perform refactoring to reduce the complexity or technical debt of our software, or making it more contributor-friendly by improving test suites or documentation).
TBQH, I don't expect to see this potential realized, because it is still review-bottlenecked: reviewing a large amount of text and verifying it for correctness is much harder than writing the text in the first place, since it requires comparing one's existing mental model of the software with the one described by the documentation, without mixing them up.
Any AI policy we come up with needs to solve this onboarding problem. We neither want to discourage people by rejecting their contributions, nor do we want to expend mentoring resources on people who do not want to be mentored.
> You might find it interesting to read https://arxiv.org/abs/2601.20245 (disclaimer: authors paid by Anthropic). It's a study that evaluates how much time it took developers to perform a given task, and also how much understanding they gained about the technology they used. A takeaway is that there are very different ways to interact with AI, which produce very different results both in terms of speed and of understanding (see figure 11 in particular).
Measuring understanding is precisely the problem we have in onboarding. The authors of the paper have devised a quiz to be answered after the task has been completed, and have created a controlled environment to ensure that the quiz was not answered with AI assistance.
This will be hard for us to replicate during the NM process, and a policy of allowing AI-assisted contributions will make it look even more arbitrary if we require applicants to be able to maintain packages without AI assistance, at least in theory. Yet I would feel uneasy admitting new DDs who require AI assistance for daily tasks, so we need to measure.
That's why my email also mentions debhelper as an aspect where we expect candidates to be familiar with layers of the software they will usually not interact with: it seems silly to teach things people won't use, but having a group of fully-fledged DDs who are helpless when they encounter a debian/rules file with more than three lines is equally silly, so we do test candidates on low-level packaging minutiae.
I am fairly sure there are also aspects outside of newcomer onboarding where the use of AI creates an efficiency gain on one side, but creates additional cost on another.
It also seems hypocritical of us to operate our bug tracking system with scraper protection, which diminishes the efficiency of AI assistance, while at the same time accepting AI-assisted contributions.
Last but not least: Even if we ignore the massive burden this technology has imposed on the rest of the world[2], our project has been and continues to be negatively affected by it, and I strongly believe we should not contribute to the impression that this is remotely acceptable, which brings me back to question 1: do we really want this?
   Simon

[1] wtf
[2] and we shouldn't ignore these, as Debian does not exist in a vacuum