On Mon, 2025-11-10 at 14:54 -0500, Steven Rostedt wrote:
> On Mon, 10 Nov 2025 11:36:00 -0800
> Linus Torvalds <[email protected]> wrote:
>
> > What's the copyright difference between artificial intelligence and
> > good oldfashioned wetware that isn't documented by "I used this
> > tool and these sources".
>
> Probably no difference. I would guess the real liability is for those
> that use AI to submit patches. With the usual disclaimers of IANAL,
> I'm assuming that when you place your "Signed-off-by", you are
> stating that you have the right to submit this code. If it comes down
> that you did not have the right to submit the code, the original
> submitter is liable.
Liable for what? Signed-off-by is a representation by you that you
followed the DCO, nothing more:

https://developercertificate.org/

Liability arises when someone reasonably relies on that representation
for some purpose, and remember that most licences actually disclaim
fitness for a particular purpose in all situations (the no-warranty
clause), so we have loads of protection from general "liability"
fears.

> I guess the question also is, is the maintainer that took that patch
> and added their SoB also liable?

See above for liability. If you mean what representations a maintainer
gives with a signoff, that's usually section (c):

    (c) The contribution was provided directly to me by some other
        person who certified (a), (b) or (c) and I have not modified
        it.

> If it is discovered that the AI tool was using source code that it
> wasn't supposed to be using, and then injected code that was pretty
> much verbatim to the original source, where it would be a copyright
> infringement, would the submitter of the patch be responsible? Would
> the maintainer?
>
> I guess this would be no different if the submitter saw some code
> from a proprietary project and cut and pasted it without
> understanding they were not allowed to, and submitted that.

Right, the situation is analogous. However, remember that today there
is no case law holding that a model's output is a derivative work of
its training data (although several cases are still ongoing).

> If the lawyers come back and say the onus is on the submitter and not
> the maintainer that the code being submitted is legal to be submitted
> under copyright law, then I'm perfectly fine in accepting any AI code
> (as long as the submitter can prove they understand that code and the
> code is clean).

Again, what do you mean by liable?
The representations in the DCO are fairly clear, and as long as you
have a good-faith basis for following their requirements, the chances
are that even if something like the CRA pierces the licence
no-warranty clauses, you wouldn't end up on the hook for a copyright
violation committed by a downstream author. Remember also that a big
part of the design of the signoff is that if someone does do something
wrong, their contributions can be quickly identified and excised
(which is probably why AI contributions should be tagged with which AI
they came from).

If you want more assurance, take the example of the 10 lines of code
SCO eventually decided had been cut and pasted from UnixWare by an SGI
engineer. Their goal was to go after the shipper of the code with the
biggest pockets (IBM); they never made a case against the individual
engineer (probably mostly because the GPL no-warranty clause would
make the case very hard to bring, and in minor part because the
recovery would be minimal).

> But until the lawyers state that explicitly, I can see why
> maintainers can be nervous about accepting AI generated code. Perhaps
> this transparency can make matters worse. As it can be argued that
> the maintainer knew it was a questionable AI that generated the code?
> (Like it would be if a maintainer knew the code being submitted was
> copied from a proprietary project).

So far no court has held that AI output infringes copyright (there are
cases that have decided that AI training breached copyright, but none
that makes the output a derivative work of that training), which
currently means that everyone can accept in good faith that
AI-generated code is not infringing. Even if the copyright lobby
eventually wins a case on the derivative nature of the output, that
won't change your historical good-faith basis for accepting code,
although it may mean the project needs to undertake an effort to
excise it.
As far as the copyright status of AI output in the US goes, as long as
it's not derivative of something, it's a non-human creation and as
such cannot be copyrighted at all, so it's equivalent to public
domain.

Regards,

James