Daniel P. Berrangé <[email protected]> writes:

> On Wed, Oct 08, 2025 at 09:18:04AM +0200, Markus Armbruster wrote:
>> Paolo Bonzini <[email protected]> writes:
>>
>> > [People in Cc are a mix of Python people, tracing people, and people
>> > who followed the recent AI discussions. - Paolo]
>> >
>> > This series adds type annotations to tracetool. While useful on its
>> > own, it also served as an experiment in whether AI tools could be
>> > useful and appropriate for mechanical code transformations that may
>> > not involve copyrightable expression.
>> >
>> > In this version, the types were added mostly with the RightTyper tool
>> > (https://github.com/RightTyper/RightTyper), which uses profiling to
>> > detect the types of arguments and return types at run time. However,
>> > because adding type annotations is such a narrow and verifiable task,
>> > I also developed a parallel version using an LLM, to provide some
>> > data on topics such as:
>> >
>> > - how much choice/creativity is there in writing type annotations?
>> >   Is it closer to writing functional code or to refactoring?
>>
>> Based on my work with John Snow on typing of the QAPI generator: there
>> is some choice.
>>
>> Consider typing a function's argument. Should we pick it based on what
>> the function requires from its argument? Or should the type reflect
>> how the function is used?
>>
>> Say the function iterates over the argument. So we make the argument
>> Iterable[...], right? But what if all callers pass a list? Making it
>> List[...] could be clearer then. It's a choice.
>>
>> I think the choice depends on context and taste. At some library's
>> external interface, picking a more general type can make the function
>> more generally useful. But for some internal helper, I'd pick the
>> actual type.
>>
>> My point isn't that an LLM could not possibly do the right thing based
>> on context, and maybe even "taste" distilled from its training data.
>> My point is that this isn't entirely mechanical with basically one
>> correct output.
>>
>> Once we have such judgement calls, there's the question how an LLM's
>> choice depends on its training data (first order approximation today:
>> nobody knows), and whether and when that makes the LLM's output a
>> derived work of its training data (to be settled in court).
>
> There's perhaps a missing step here. The work first has to be
> considered eligible for copyright protection before we get onto
> questions wrt derivative works, but that opens a can of worms....
>
> The big challenge is that traditionally projects did not really have
> to think much about where the "threshold of originality"[1] came into
> play for a work to be considered eligible for copyright.
>
> It was fine to take the conservative view that any patch benefits
> from copyright protection for the individual (or more often their
> employer). There was rarely any compelling need for the project to
> understand if something failed to meet the threshold. We see this
> in cases where a project re-licenses code - it is simpler/cheaper
> just to contact all contributors for permission than to evaluate
> which subset held copyright.
>
> This is a very good thing. Understanding where that threshold applies
> is a challenging intellectual task that consumes time better spent
> on other tasks. Especially notable though is that the threshold varies
> across countries in the world, and some places have at times even
> considered merely expending labour sufficient to make a work eligible.
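[As an aside, the Iterable-vs-List judgment call quoted above is easy to
show concretely. A minimal sketch, assuming a hypothetical string-joining
helper; the names are illustrative and not taken from tracetool or the
QAPI generator:

    from typing import Iterable, List


    def join_general(args: Iterable[str]) -> str:
        # Typed by what the function requires: it only iterates over
        # args, so any iterable of strings works, including generators.
        return ", ".join(args)


    def join_concrete(args: List[str]) -> str:
        # Typed by how the function is actually used: if every caller
        # passes a list, the narrower annotation documents that usage.
        return ", ".join(args)


    # Both accept a list:
    join_general(["int", "const char *"])
    join_concrete(["int", "const char *"])

    # Only the Iterable version also accepts a generator expression:
    join_general(arg.strip() for arg in ["int ", " bool"])

Both versions type-check today, but the Iterable one keeps working if a
caller later switches to a generator, while the List one is the tighter,
more self-documenting choice for an internal helper.]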
Moreover, having software developers apply copyright law is about as
smart as having copyright lawyers write mission-critical code. Both
tasks require education and experience.

> In trying to create a policy that permits AI contributions in some
> narrow scenarios, we're trying to thread the needle, and as a global
> project that implies satisfying a wide variety of interpretations
> of the copyright threshold. There's no clear-cut answer to that;
> the only option is to mitigate risk to a tolerable level.

Yes! The boundary between legal and illegal is a superposition of
fuzzy, squiggly lines, one per jurisdiction. We can only try to
approximate it from the legal side. The tighter we try to approximate,
the more risk we take on. In addition, tighter approximations can be
difficult to understand and apply.

> That's something all projects already do. For example, when choosing
> between a CLA, vs a DCO signup, vs implicitly expecting contributors
> to be good, we're making a risk/cost tradeoff.

A tradeoff made with competent legal advice. In addition, it's easy to
understand and apply.

[Strong argument why type annotations are low risk snipped...]

> The problem I have remaining is around the practicality of expressing
> this in a policy and having contributors & maintainers apply it well
> in practice.
>
> There's a definite "slippery slope" situation. The incentive for
> contributors will be to search out reasons to justify why a work
> matches the AI exception,

„Libenter homines id quod volunt credunt.“ [People readily believe what
they want to believe.]

> while overworked maintainers probably
> aren't going to spend much (any) mental energy on objectively
> analysing the situation, unless the proposal looks terribly out
> of line with expectations.

Yup. Code review takes so much brain capacity that it can impair my
spelling. Legal reasoning is right out.

> I'm not saying we shouldn't have an exception. I think the mypy case
> is a fairly compelling example of a low-risk activity. We will need
> to become comfortable with the implications around the incentives
> for contributors & maintainers when applying it though.

Makes sense.

> With regards,
> Daniel
>
> [1] https://en.wikipedia.org/wiki/Threshold_of_originality
