On Wed, Oct 08, 2025 at 09:18:04AM +0200, Markus Armbruster wrote:
> Paolo Bonzini <[email protected]> writes:
> 
> > [People in Cc are a mix of Python people, tracing people, and people
> >  who followed the recent AI discussions. - Paolo]
> >
> > This series adds type annotations to tracetool. While useful on its own, 
> > it also served as an experiment in whether AI tools could be useful and
> > appropriate for mechanical code transformations that may not involve
> > copyrightable expression.
> >
> > In this version, the types were added mostly with the RightTyper tool
> > (https://github.com/RightTyper/RightTyper), which uses profiling to detect
> > the types of arguments and return types at run time.  However, because
> > adding type annotations is such a narrow and verifiable task, I also
> > developed a parallel version using an LLM, to provide some data on
> > topics such as:
> >
> > - how much choice/creativity is there in writing type annotations?
> >   Is it closer to writing functional code or to refactoring?
> 
> Based on my work with John Snow on typing of the QAPI generator: there
> is some choice.
> 
> Consider typing a function's argument.  Should we pick it based on what
> the function requires from its argument?  Or should the type reflect how
> the function is used?
> 
> Say the function iterates over the argument.  So we make the argument
> Iterable[...], right?  But what if all callers pass a list?  Making it
> List[...] could be clearer then.  It's a choice.
> 
> I think the choice depends on context and taste.  At some library's
> external interface, picking a more general type can make the function
> more generally useful.  But for some internal helper, I'd pick the
> actual type.
> 
> My point isn't that an LLM could not possibly do the right thing based
> on context, and maybe even "taste" distilled from its training data.  My
> point is that this isn't entirely mechanical with basically one correct
> output.
>
> Once we have such judgement calls, there's the question how an LLM's
> choice depends on its training data (first order approximation today:
> nobody knows), and whether and when that makes the LLM's output a
> derived work of its training data (to be settled in court).

There's perhaps a missing step here. The work first has to be
considered eligible for copyright protection before we get onto
questions wrt derivative works, but that opens a can of worms...


The big challenge is that traditionally projects did not really have
to think much about where the "threshold of originality"[1] came into
play for a work to be considered eligible for copyright.

It was fine to take the conservative view that any patch benefits
from copyright protection for the individual (or more often their
employer). There was rarely any compelling need for the project to
understand if something failed to meet the threshold. We see this
in cases where a project re-licenses code - it is simpler/cheaper
just to contact all contributors for permission than to evaluate
which subset actually held copyright.

This is a very good thing. Understanding where that threshold applies
is a challenging intellectual task that consumes time better spent
on other tasks. Especially notable is that the threshold varies
across countries, and some jurisdictions have at times even
considered merely expending labour enough to make a work eligible.


In trying to create a policy that permits AI contributions in some
narrow scenarios, we're trying to thread the needle, and as a global
project that implies satisfying a wide variety of interpretations
of the copyright threshold. There's no clear cut answer to that;
the only option is to mitigate risk to a tolerable level.

That's something all projects already do. For example, when choosing
between a CLA, a DCO sign-off, or implicitly expecting contributors
to act in good faith, we're making a risk/cost tradeoff.



Back to the LLMs / type annotations question. Let us accept that
there is a judgement call in situations like Iterable[...] vs
List[...], and that an LLM makes that call. The LLM's basis for that
judgement is certainly biased by what it saw in training data, just
as our own judgement would be biased by what we've seen before in
Python type annotations in other codebases.
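
To make that judgement call concrete, here's a minimal sketch (with
hypothetical helpers, not actual tracetool code) of two equally
defensible annotations for the same function body:

  from typing import Iterable, List

  # Typed by what the function needs: it only iterates over the
  # argument, so the most general correct annotation is Iterable.
  def format_args(args: Iterable[str]) -> str:
      return ', '.join(args)

  # Typed by how the function is used: if every caller passes a
  # list, List arguably documents the call sites more clearly.
  def format_args_narrow(args: List[str]) -> str:
      return ', '.join(args)

Both versions pass a type checker; picking between them is exactly
the kind of call Markus describes.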

In either case though, I have a hard time arguing that the result
of that judgement call makes the output a derived work of anything
other than the current QEMU codebase (and possibly the LLM prompt).
To say otherwise would mean my own work is tainted by every codebase
I've ever read.

The most plausible scenario of "copying" existing copyrighted
content is where a fork of the QEMU code already exists in the
training material with the type annotations present. If that were
the case though, that pre-existing work would already be considered
a derivative of QEMU code and thus compatibly licensed. What we'd
be losing would be the explicit SoB of the original author (if
any), rather than facing any clear license compatibility problem.


If we consider that type annotations are (a) borderline for
copyright eligibility and (b) unlikely to be claimed as a derived
work of anything other than QEMU, then this looks like a low risk
scenario for use of an LLM.
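
For contrast, the purely mechanical end of the spectrum looks
something like this (again a hypothetical helper, not actual
tracetool code), where the body and the call sites dictate
essentially one correct annotation:

  # Every caller passes a str, and str.replace() returns a str;
  # the annotation is dictated by the code rather than chosen.
  def c_name(name: str) -> str:
      return name.replace('-', '_')

It is hard to see any creative expression in annotations of that
kind, which is what makes case (a) plausible.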


The remaining problem I have is the practicality of expressing
this in a policy, and of having contributors & maintainers apply
it well in practice.

There's a definite "slippery slope" risk here. The incentive for
contributors will be to search out reasons to justify why a work
matches the AI exception, while overworked maintainers probably
aren't going to spend much (if any) mental energy on objectively
analysing the situation, unless the proposal looks terribly out
of line with expectations.


I'm not saying we shouldn't have an exception. I think the mypy case
is a fairly compelling example of a low risk activity. We will need
to become comfortable with the implications around the incentives
for contributors & maintainers when applying it, though.

With regards,
Daniel

[1] https://en.wikipedia.org/wiki/Threshold_of_originality
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

