Hi,

On 19/02/26 at 11:05 -0600, Gunnar Wolf wrote:
> Lucas Nussbaum dijo [Thu, Feb 19, 2026 at 12:18:34PM +0100]:
> > Could you elaborate on how you would use such distinction in ballot
> > options?  I think that the core issue is the fact that tools generate
> > content that is then integrated into Debian. I don't really see how it
> > could be useful to distinguish between uses of such tools as long as
> > they are used to generate code.
> > 
> > Also, in terms of terminology, I find talking about "AI" slightly better
> > because it encompasses the whole ecosystem: trained model, inference
> > provider, agent (client-side tool that interacts with the codebase).
> 
> But, besides the issue that AI is very poorly defined, just using AI as an
> umbrella term conflates too many different uses. We have already discussed
> we have several packages including sets-of-weights-derived-from-ML
> (i.e. for chess gameplay, for image editing). The discussion at hand
> centers on having code generated by a LLM, not on having models derived
> from learning on a given dataset.

Well, the subject of this email mentions "AI-assisted Contributions",
that is, contributions (code or other content) to Debian that were
partially or fully generated with the assistance of AI-powered
tooling. (I also argued in [0] that the fact that such tools are
AI-powered might not be very relevant, but it probably is relevant to
those opposed to AI, who might not object to other kinds of tools.)

So this is not at all about the inclusion in Debian of datasets
(typically models) that are used to build a service enabling such tools.
That's an entirely different topic, which was covered by a withdrawn GR
last year[1].

Note however that while Sean suggests distinguishing between uses of
LLMs, another angle for introducing other ballot options would be to
distinguish based on the "freeness" of the tooling. That tooling is
typically composed of:
- an "agent" that runs on the developer's local machine,
- ... which talks to an inference provider over an API,
- ... which provides access to one or more models,
- ... which were trained on data.

In the worst case, a proprietary agent talks to a commercial inference
provider that provides access to non-open-weight models trained on data
acquired via dubious means. That's what you would get with Claude Code.

There's some hope with open-weight models (including for coding) that
could be self-hosted (with enough resources) or hosted by Free
Software-friendly organizations, and that can work with open source
agents. An example combination would be the Mistral Vibe agent with the
Mistral Devstral 2 model.

However, I'm not sure it would be useful in the context of this GR to
distinguish between fully proprietary AI ecosystems and more "open"
ones.

Also, none of the other organizations or projects that established
similar policies[2] made this distinction.

Lucas


[0] https://lists.debian.org/debian-vote/2026/02/msg00006.html
[1] https://www.debian.org/vote/2025/vote_002
[2] https://lists.debian.org/debian-vote/2026/02/msg00007.html
