Hi all,

I am increasingly seeing pull requests in the SymPy repo that were
written by AI, e.g. Claude Code or ChatGPT. I don't think that any of
these PRs come from actual AI bots; rather, they are "written" by
contributors who are using AI tooling.

There are two separate categories:

- Some contributors are making reasonable changes to the code and then
using LLMs to write things like the PR description or comments on
issues.
- Some contributors are essentially vibe coding: they have an LLM
write all the code for them and then open PRs, usually with very
obvious problems.

In the first case, some people use LLMs to write things like PR
descriptions because English is not their first language. I can
understand this, and I think it is definitely possible to do it with
LLMs in a way that is fine, but that amounts to using them like Google
Translate rather than asking them to write the text. The problems are
that:

- LLM summaries for something like a PR are too verbose and include
lots of irrelevant information, making it harder to see what the
actual point is.
- LLMs often include information that is simply false, such as "fixes
issue #12345" when the issue is not fixed.

I think some people are doing this in a way that is not good, and I
would prefer that they just write in broken English or use Google
Translate or something, but I don't see this as a major problem.

For the vibe coding case, I think there is a real problem. Many SymPy
contributors are novice programmers and are nowhere near experienced
enough to turn vibe coding into output that can be included in the
codebase. The result is spammy PRs with false claims about what they
do, like "fixes X" or "10x faster", where the code has not been even
lightly tested and clearly does not work, or possibly does not do
anything at all.

I think what has happened is that the combination of user-friendly
editors with easy git/GitHub integration and LLM agent plugins has
brought us to the point where there are pretty much no technical
barriers preventing someone from opening gibberish spam PRs while
having no real idea what they are doing.

Really this is just inexperienced people using the tools badly, which
is not new. Low-quality spammy PRs are not new either. There are some
significant differences, though:

- I think that the number of low-quality PRs is going to explode. It
was already bad last year in the run-up to GSoC (January to March) and
I think it will be much worse this year.
- I don't think it is reasonable to give meaningful feedback on such
PRs, because the contributor has not spent any time studying the code
they are changing and any feedback is just going to be fed into an
LLM.

I'm not sure what we can do about this, so for now I am regularly
closing low-quality PRs without much feedback, but some contributors
will just go on to open new PRs. The "anyone can submit a PR" model
has been under threat for some time, but I worry that the whole idea
is going to become unsustainable.

In the context of the Russia-Ukraine war I have often seen references
to the "cost-exchange problem". This refers to the fact that, while
both sides have a lot of anti-air defence capability, they can still
be overrun by cheap drones, because million-dollar interceptor
missiles are just too expensive to use against any large number of
incoming thousand-dollar drones. The solution there would be some kind
of cheap interceptor, like an automatic AA gun, that can take out many
cheap drones efficiently even if it is much less effective against
fancier targets like enemy planes.

The first time I heard about ChatGPT was when I got an email from
Stack Overflow saying that any use of ChatGPT was banned. The reason
given was that it was just too easy to generate superficially
reasonable text that was low-quality spam, and too much effort for
real humans to filter that spam out manually. In other words,
bad/incorrect answers were nothing new, but large numbers of
inexperienced people using ChatGPT had ruined the cost-exchange ratio
of filtering them out.

I think that for SymPy pull requests there is an analogous
"effort-exchange problem". The effort reviewers put in to help with a
PR is not reasonable unless the author is putting in a lot more effort
themselves, because there are many times more people trying to author
PRs than to review them. I don't think it is sustainable, in the face
of this spam, to review PRs in the same way as if they had been
written by humans who are at least trying to understand what they are
doing (and therefore learning from feedback). Even just closing PRs
without giving any feedback needs to become more efficient somehow.

We need some sort of clear guidance or policy on the use of AI that
sets out clear expectations like "you still need to understand the
code". I think we will also need to ban people for spam if they do
things like opening AI-generated PRs without even testing the code.
The hype spun by AI companies probably has many novice programmers
believing that it really is reasonable to behave like this, but it is
not, and that needs to be clearly stated somewhere. I don't think any
of this is malicious, but I think it has the potential to become very
harmful to open source projects.

The situation right now is not so bad, but if you project forwards a
bit to when the repo gets a lot busier after Christmas, I think this
is going to be a big problem, and it will only get worse in future
years.

It is very unfortunate that right now AI is being used in all the
wrong places. It can do a student's homework because it knows the
answers to all the standard homework problems, but it can't do the
more complicated, more realistic things, and the students then haven't
learned anything from doing their homework. In the context of SymPy it
would be much more useful to have AI doing other things, like
reviewing code and finding bugs, rather than helping novices get a PR
merged without actually investing the time to learn anything from the
process.

--
Oscar
