[Numpy-discussion] Re: Current policy on AI-generated code in NumPy

Bill Ross Sat, 14 Feb 2026 11:20:21 -0800

Write out here what you would want a PR author to provide.


Please let me honk in from philosophy/science/psychology to reiterate,

"The prompts, Boss, the prompts!"

Don't you think that some sample at least would add some conviction that a human was involved?

Case in point is mrdope, who I imagine could be a person talking through an LLM out of shyness, overproductivity syndrome, or as an experiment. Per my comment:

https://github.com/numpy/numpy/pull/30828 [1].
Is the work the product of ai? Yes, but the author claims to have verified the code. Is the author ai or not? Should we proceed with the PR?
Matti
The interactions ... have the flat agreeableness of an LLM. mrdope, are you hear and clear? :-)


Bill

--

https://github.com/phobrain/phobrain

"Don't dope me, I'm just the racehorse here."

On 2026-02-14 09:38, Robert Kern via NumPy-Discussion wrote:

On Sat, Feb 14, 2026 at 12:17 PM Matthew Brett <[email protected]> wrote:
Hi,
On Fri, Feb 13, 2026 at 9:45 PM Robert Kern <[email protected]> wrote:
On Wed, Feb 11, 2026 at 6:26 PM Matthew Brett via NumPy-Discussion <[email protected]> wrote:
Just to clarify - in case it wasn't clear, what I'm floating as a proposal, would be something like this, as a message to PR authors:
Please specify one of these:
1) I wrote this code myself, without looking at significant AI-generated code OR
2) The code contains AI-generated content, but the AI-generated code is sufficiently trivial that it cannot reasonably be subject to copyright OR
3) There is non-trivial AI-generated code in this PR, and I have documented my searches to confirm that no parts of the code are subject to existing copyright.
So - the burden for the reviewer is just to confirm, in case 3, that the author has documented their searches. We take the word of the contributor for the option they have chosen. Obviously, the documentation requirement of case 3 is somewhat of a burden for the contributor, and may therefore encourage them to write the code themselves, to avoid that burden. That might not be a bad thing, long term, for the project, and it seems reasonable to me as some defence against copyright violation, and a message that the project cares about such violation.
For Case 3, I would love to see an example of the search that you would accept. If you could take a recent PR (human or AI, doesn't really matter for this purpose), and show the search that would satisfy you, that would go a long way towards clarifying what you are asking for here. We'd need a worked example or two before adopting this policy because if I don't know what you are asking for, no new contributor will, either.
Yes, that's a reasonable request.   But how do you think I should
proceed?   Make an issue on Numpy, and start drafting?   Start another
email thread?  Or a Discourse / Scientific Python thread?
Just here should be fine. Take an existing PR that has copyrightable content (e.g. an entire new function or three, each more than ~10 lines, not just many one-line updates scattered around; the most interesting ones would be those that implement a known algorithm). Do the code search that would satisfy you. Write out here what you would want a PR author to provide.
--
Robert Kern
_______________________________________________
NumPy-Discussion mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3//lists/numpy-discussion.python.org
Member address: [email protected]



Links:
------
[1] https://github.com/numpy/numpy/pull/30828

