Write out here what you would want a PR author to provide.
Please let me honk in from philosophy/science/psychology to reiterate,
"The prompts, Boss, the prompts!"
Don't you think that some sample at least would add some conviction that
a human was involved?
Case in point is mrdope, who I imagine could be a person talking through
an LLM out of shyness, overproductivity syndrome, or as an experiment.
Per my comment:
https://github.com/numpy/numpy/pull/30828 [1].
Is the work the product of AI? Yes, but the author claims to have
verified the code. Is the author AI or not? Should we proceed with the
PR?
Matti
The interactions ... have the flat agreeableness of an LLM. mrdope, are
you hear and clear? :-)
Bill
--
https://github.com/phobrain/phobrain
"Don't dope me, I'm just the racehorse here."
On 2026-02-14 09:38, Robert Kern via NumPy-Discussion wrote:
On Sat, Feb 14, 2026 at 12:17 PM Matthew Brett
<[email protected]> wrote:
Hi,
On Fri, Feb 13, 2026 at 9:45 PM Robert Kern <[email protected]>
wrote:
On Wed, Feb 11, 2026 at 6:26 PM Matthew Brett via NumPy-Discussion
<[email protected]> wrote:
Just to clarify - in case it wasn't clear, what I'm floating as a
proposal, would be something like this, as a message to PR authors:
Please specify one of these:
1) I wrote this code myself, without looking at significant
AI-generated code OR
2) The code contains AI-generated content, but the AI-generated code
is sufficiently trivial that it cannot reasonably be subject to
copyright OR
3) There is non-trivial AI-generated code in this PR, and I have
documented my searches to confirm that no parts of the code are
subject to existing copyright.
So - the burden for the reviewer is just to confirm, in case 3, that
the author has documented their searches. We take the word of the
contributor for the option they have chosen. Obviously, the
documentation requirement of case 3 is somewhat of a burden for the
contributor, and may therefore encourage them to write the code
themselves, to avoid that burden. That might not be a bad thing,
long term, for the project, and it seems reasonable to me as some
defence against copyright violation, and a message that the project
cares about such violation.
For Case 3, I would love to see an example of the search that you
would accept. If you could take a recent PR (human or AI, doesn't
really matter for this purpose), and show the search that would
satisfy you, that would go a long way towards clarifying what you are
asking for here. We'd need a worked example or two before adopting
this policy because if I don't know what you are asking for, no new
contributor will, either.
Yes, that's a reasonable request. But how do you think I should
proceed? Make an issue on NumPy, and start drafting? Start another
email thread? Or a Discourse / Scientific Python thread?
Just here should be fine. Take an existing PR that has copyrightable
content (e.g. an entire new function or three, each more than ~10
lines, not just many one-line updates scattered around; the most
interesting ones would be those that implement a known algorithm). Do
the code search that would satisfy you. Write out here what you would
want a PR author to provide.
--
Robert Kern
_______________________________________________
NumPy-Discussion mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3//lists/numpy-discussion.python.org
Member address: [email protected]
Links:
------
[1] https://github.com/numpy/numpy/pull/30828