Hi Matt,

There are two aspects: can we use AI-generated code in NumPy/SciPy, and if we can, should we? To make it more complicated, the type of AI usage affects those questions differently. For example, I think almost nobody would object to the use I described originally: using chats to research, analyze literature, and understand existing codebases under an acceptable license. No code is generated there. The other extreme is all code generated and reviewed by AI.
I will for now continue my original approach (no AI to generate any code unless trivial, plus disclosure of its use when PR time comes).

David

On Sun, Feb 8, 2026 at 2:52 AM Matthew Brett via NumPy-Discussion <[email protected]> wrote:
> Hi
>
> On Sat, Feb 7, 2026 at 4:54 PM Charles R Harris
> <[email protected]> wrote:
> >
> > On Sat, Feb 7, 2026 at 7:05 AM Matthew Brett via NumPy-Discussion <
> > [email protected]> wrote:
> >>
> >> Hi,
> >>
> >> This is just a plea for some careful thought at this point.
> >>
> >> There are futures here that we likely don't want. For example,
> >> imagine NumPy filling up with large blocks of AI-generated code, and
> >> huge PRs that are effectively impossible for humans to review. As
> >> Oscar and Stefan have pointed out, consider what effect that is going
> >> to have on the social enterprise of open-source coding, and on our
> >> ability to train new contributors.
> >>
> >> I believe we are also obliged to think hard about the consequences for
> >> copyright. We discussed that a bit here:
> >>
> >> https://github.com/matthew-brett/sp-ai-post/blob/main/notes.md
> >>
> >> In particular, there is no good way to ensure that the AI has not
> >> sucked in copyrighted code, even if you've asked it for a simple
> >> port of other, clearly licensed code. There is some evidence that
> >> AI coding agents are, for whatever reason, particularly reluctant to
> >> point to GPL licensing when asked for code attribution.
> >>
> >> I don't think the argument that AI is inevitable is useful. Yes, it's
> >> clear that AI will be part of coding in some sense, but we have yet to
> >> work out what part that will be.
> >>
> >> For example, there are different models of AI use. Some of us are
> >> starting to generate large bodies of code with AI, such as Matthew
> >> Rocklin: https://matthewrocklin.com/ai-zealotry/ - and his discussion
> >> is useful.
> >> Here are some key quotes:
> >>
> >> * "LLMs generate a lot of junk"
> >> * "AI creates technical debt, but it can clean some of it up too. (at
> >> least at a certain granularity)"
> >> * "The code we write with AI probably won't be as good as hand-crafted
> >> code, but we'll write 10x more of it"
> >>
> >> https://matthewrocklin.com/ai-zealotry/
> >>
> >> Another experienced engineer reflecting on his use of AI:
> >>
> >> """ ... LLM coding will split up engineers based on those who
> >> primarily liked coding and those who primarily liked building.
> >>
> >> Atrophy. I've already noticed that I am slowly starting to atrophy my
> >> ability to write code manually. Generation (writing code) and
> >> discrimination (reading code) are different capabilities in the brain.
> >> Largely due to all the little, mostly syntactic details involved in
> >> programming, you can review code just fine even if you struggle to
> >> write it.
> >> """
> >>
> >> https://x.com/karpathy/status/2015883857489522876
> >>
> >> Conversely, Linus Torvalds has a different model of how AI should work:
> >>
> >> """
> >> Torvalds said he's "much less interested in AI for writing code" and
> >> far more excited about "AI as the tool to help maintain code,
> >> including automated patch checking and code review before changes ever
> >> reach him."
> >> """
> >>
> >> https://www.zdnet.com/article/linus-torvalds-ai-tool-maintaining-linux-code/
> >>
> >> I guess y'all saw the recent Anthropic research paper comparing groups
> >> randomized to AI vs no-AI working on code problems. They found little
> >> speedup from AI, but a dramatic drop in the level of understanding of
> >> the library they were using (in fact this was Trio). This effect was
> >> particularly marked for experienced developers - see their figure 7.
> >>
> >> https://arxiv.org/pdf/2601.20245
> >>
> >> But in general, my argument is that now is a good time to step back
> >> and ask where we want AI to fit into the open-source world. We
> >> open-source developers tend to care a lot about copyright, and we
> >> depend very greatly on the social aspects of coding, including our
> >> ability to train the next generation of developers in the particular
> >> and informal way that we have learned. We have much to lose from
> >> careless use of AI.
> >>
> >
> > E. S. Raymond is another recent convert:
> >
> > Programming with AI assistance is very revealing. It turns out I'm not
> > quite who I thought I was.
> >
> > There are a lot of programmers out there who have a tremendous amount of
> > ego and identity invested in the craft of coding. In knowing how to beat
> > useful and correct behavior out of one language and system environment, or
> > better yet many.
> >
> > If you asked me a week ago, I might have said I was one of those people.
> > But a curious thing has occurred. LLMs are so good now that I can validate
> > and generate a tremendous amount of code while doing hardly any hand-coding
> > at all.
> >
> > And it's dawning on me that I don't miss it.
> >
> > Things are moving fast.
>
> Yes - but - it's important to separate how people feel using AI from
> the actual outcome. Many of y'all will, I am sure, have seen this
> study:
>
> https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/
>
> which showed that developers estimated they would get a 25% speedup
> from AI before they did the task; after they did the task, they felt
> they had got a 20% speedup, and in fact (compared to matched
> tasks without AI), they suffered a 20% slowdown.
>
> Personally, I am not very egotistical about my code, but I am
> extremely suspicious. I know my tendency to become sloppy, to make
> and miss mistakes - what David Donoho called "the ubiquity of error":
> https://blog.nipy.org/ubiquity-of-error.html .
> So AI makes me
> increasingly uncomfortable, as I feel my skill starting to atrophy (in
> the words of Andrej Karpathy, quoted above).
>
> So it seems to me we have to take someone like Linus Torvalds
> seriously when he says he's "much less interested in AI for writing
> code". Perhaps it is possible, at some point, to show that
> delegating coding to the AI leads to increased learning and greater
> ability to spot error - but so far the evidence seems to go the other
> way. And if we "embrace" AI for that use, we run the risk of
> deskilling ourselves, filling the code base with maintenance debt,
> effectively voiding copyright, and making it much harder to train the
> next generation.
>
> Cheers,
>
> Matthew
>
> --
> This email is fully human-source. Unless I'm quoting AI, I did not
> use AI for any text in this email.
> _______________________________________________
> NumPy-Discussion mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
> https://mail.python.org/mailman3//lists/numpy-discussion.python.org
> Member address: [email protected]
