On the subject of whether gen-AI code violates copyright maybe this
pull request can be used as a relevant and real example:

https://github.com/sympy/sympy/pull/29150

The PR description has an AI Generation Disclosure that says:

> Initial code generated by codex, then edited by me.

I believe the author that this is exactly what happened, but it leaves
me unsure of the copyright status of the code. Among recent PRs this
author stands out as a new contributor who has demonstrated their
ability to produce good PRs while apparently using AI to do so. If we
envisage a future in which most contributors are using AI, then this
is exactly the kind of contributor that we should want. I have
questioned them about the copyright in that PR and they have not
responded yet, but I think that is because it is just not at all clear
how to answer that question.

It is not clear to me without close inspection whether the code in the
PR is doing something simple and obvious or something nontrivial. I
suspect that I might be able to simplify the code a lot if I studied
exactly what it is doing, but that would take some time to figure out.
It is also possible, though, that the code implements some nontrivial
named algorithm and is perhaps more or less directly copied (by the
AI) from another codebase.

Another example: I recently made this PR myself using codex (GPT-5.3):

https://github.com/sympy/sympy/pull/29127

That PR copies/adapts the type annotations that I wrote in one library
into another. After a lot of back and forth, codex decided to look at
the contents of the other library and then adapted the type
annotations from there, copying them across. I am the author of the
code in question in both codebases, but just so we're clear: the agent
literally went and read the code for another package in
.venv/lib/python3.14/site-packages/ to do this (the UI tells you when
the agent is reading files).

This is not a case of the model having that code in its training data:
the agent actively decided to go and read and copy the code after its
previous attempts at just writing the code had failed. I gave the
agent what I thought were clear instructions for how to write the
code, but it ignored them, cycled through every possible wrong
approach, and at some point decided to check for type errors by
running py_compile instead of mypy. It then told me that it had fixed
all the errors. When I said "you're supposed to run mypy, not
py_compile", it replied:

```
• Correct. py_compile was the wrong validation for this task.

  I re-ran with mypy and confirmed there are still many errors. I also
  tested wiring FMPZ_POLY/FMPQ_POLY directly into ringflint.py; that
  made mypy worse, so I reverted that experiment. Current workspace is
  back to only this change:

  - sympy/external/gmpy.py: added POLY_P, FMPZ_POLY, FMPQ_POLY under
    if TYPE_CHECKING.

  So at this point:

  - requested fictitious poly types/protocol exist
  - mypy sympy still fails with many existing flint-related typing
    errors (including ringflint.py and polyclasses.py)

  If you want, I can proceed with the next step: a broader typing pass
  to actually consume FMPZ_POLY/FMPQ_POLY in ringflint.py/
  polyclasses.py in a way that reduces mypy sympy errors instead of
  increasing them.
```

I said yes, we should indeed be reducing the errors rather than
increasing them (that part amused me enough that I saved the output
shown for posterity). The agent then spent some time talking to itself
about the difficulties it had previously encountered, then decided to
go and look at the other codebase, and realised that it could
basically just copy the code from there.
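As an aside, the reason py_compile was the wrong validation is that it
only checks that the source parses and compiles to bytecode;
annotations are never inspected, so type errors sail straight through
(mypy would reject the same file). A quick throwaway demonstration,
nothing here is from the PR:

```python
# py_compile only checks that a source file parses and compiles to
# bytecode.  Annotations are never inspected, so a file full of type
# errors still "passes".  mypy, by contrast, would reject this file.
import os
import py_compile
import tempfile

# Syntactically valid but badly typed: double() expects an int, and
# the assignment target is annotated str.
source = (
    "def double(x: int) -> int:\n"
    "    return 2 * x\n"
    "\n"
    "result: str = double('not an int')\n"
)

with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write(source)
    path = f.name

try:
    # doraise=True makes syntax errors raise; this succeeds because
    # the file is syntactically fine despite the type errors.
    compiled = py_compile.compile(path, doraise=True)
    print("py_compile accepted the file:", compiled is not None)
finally:
    os.unlink(path)
```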

I don't think there is any copyright violation in writing out type
annotations for a dependency's interface in another library like that,
and as the author (not using AI) of the type annotations being copied
I think I hold the copyright anyway. It is not hard to see, though,
how this sort of thing could lead to a copyright violation.
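For anyone unfamiliar with the pattern in question: the annotations
live under if TYPE_CHECKING so that the type checker sees the
dependency's interface without any runtime import. A minimal sketch
with made-up names (this is not the actual SymPy/flint code):

```python
from __future__ import annotations

from typing import TYPE_CHECKING, Protocol

if TYPE_CHECKING:
    # Seen only by the type checker; nothing in this block exists at
    # runtime, so the dependency is never imported when the module
    # loads.
    class PolyLike(Protocol):  # made-up protocol name
        def degree(self) -> int: ...


def leading_degree(p: PolyLike) -> int:
    # Annotated against the protocol; any object with a degree()
    # method satisfies it structurally.
    return p.degree()


# A toy type that satisfies PolyLike without inheriting from it.
class SimplePoly:
    def __init__(self, coeffs: list[int]) -> None:
        self.coeffs = coeffs

    def degree(self) -> int:
        return len(self.coeffs) - 1


print(leading_degree(SimplePoly([1, 0, 3])))  # → 2
```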

Having played around with codex a few times, I now realise that having
it just write the code is possibly one of the least useful things you
can ask it to do. When you ask it to write the actual code you then
have to review its changes in detail, and if you allow it to make any
nontrivial decisions you end up in a review/prompt cycle that is worse
than just writing the code yourself. When you ask it to do other
things, though, it is great: "bisect this", "make a benchmark script
and time these 5 approaches", "can you see any obvious problems in the
diff?", and so on. Anything that requires investigating things and/or
writing throwaway code is a great task for the AI, because you don't
have to review the code it generates, and that review is what
otherwise becomes the bottleneck.

If I were to redo the PR above, I would start writing the code myself
and then, once all of the high-level decisions about how to write it
had been made, ask codex to finish the job. It would be more like
"I've done 3 classes, now you do the other 10 classes just like I did"
or "I've written the code but mypy now shows 1000 errors; fix the
trivial errors, then categorise and explain the remaining ones".

--
Oscar
_______________________________________________
NumPy-Discussion mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3//lists/numpy-discussion.python.org
Member address: [email protected]
