Hi All,

I think the discussion of ethics is important, but somewhat abstract and
difficult to act on.

Personally, I think the gravest danger is maintainer burn-out.  The
matplotlib "shame posting" referred to by Matthew earlier is quite
worrisome -- for those who haven't seen it, see
https://github.com/matplotlib/matplotlib/pull/31132
https://github.com/matplotlib/matplotlib/pull/31138

I'd estimate that those particular AI-generated PRs have led to a direct
loss of at least a day of maintainer time (spread over multiple
maintainers), plus, if those maintainers are at all like me, a further
indirect loss of hours to days spent being upset and irritated.

Over at astropy, we currently see a smaller effect of AI bots: we are
being spammed by two "developers" whose GitHub pages state "currently
working on upskilling myself in the field of AI" (both with identical
phrasing).  Even just their asking to be assigned to lots of issues is
annoying, as it generates e-mails in my inbox for every issue I happened
to comment on.

Also, I was misled into reviewing and commenting on one PR, spending
more time than it would have taken to fix things myself, only to find
out that it was generated by a bot, not by a person who might become a
valuable contributor.  It left me feeling violated, and I hate feeling
forced into a mode where I only review PRs from people/handles I know,
and avoid getting notified except when tagged (though a few days ago, I
got tens of e-mails from an account that seems to be replaying astropy
PRs, yielding pings for any with @mhvk in them).

Anyway, to me the most pressing question is how to devise some sort of
policy that stops people, or their bots, who have no true interest in
making real contributions.

Sadly, I do not have a good suggestion.  But perhaps a start would be
some kind of tick box for any new contributor, asking them to state
that they are human, to introduce themselves, to explain how they found
the problem, to describe how the PR was created, and, e.g., to confirm
that it complies with the AI policy (including copyright, etc.).  Or,
more strongly, require explicit permission to create PRs (following a
similar questionnaire).  A non-standard API might help...
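
Just to make the idea concrete, the questionnaire could live in a pull
request template, so every new PR description starts from it.  This is
only a hypothetical sketch -- the wording, and the existence of a
project "AI policy" document to point at, are assumptions:

```
<!-- .github/PULL_REQUEST_TEMPLATE.md (hypothetical sketch) -->
- [ ] I confirm that I am a human submitting this PR myself.
- [ ] Who I am and how I found this issue:
- [ ] How this PR was created (editor, tools, any AI assistance used):
- [ ] I confirm this contribution complies with the project's AI policy
      (including copyright).
```

Of course a bot can tick boxes too, but an unticked or untouched
template would at least give reviewers a cheap signal to deprioritize.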

Of course, neither of these suggestions will guard against blatant
liars, but perhaps they would at least reduce the problem.

All the best,

Marten
_______________________________________________
NumPy-Discussion mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3//lists/numpy-discussion.python.org
Member address: [email protected]