On Mon, Jan 27, 2025 at 3:18 PM Oscar Benjamin <[email protected]> wrote:
>
> On Mon, 27 Jan 2025 at 22:21, Aaron Meurer <[email protected]> wrote:
> >
> > There's also, separately, the question of the quality of LLM generated
> > code. I think that we need to use the GitHub review process we have
> > always been using to ensure the SymPy code remains high quality
> > regardless of its source. This means the usual things: good, thorough
> > tests that check for correctness, readable code, avoiding various
> > antipatterns, etc. LLM generated code won't always fit these
> > parameters, especially if not prompted correctly.
>
> It would be great if we could have an LLM that could review basic PRs
> rather than being used by the PR author to write the code. Much of the
> back and forth in some PRs is about very basic things like "there
> should be some tests" that could be easily handled. I have seen some
> other projects using LLMs like this but it didn't seem to be very
> useful.
The GitHub Copilot review feature that's built into PRs is actually not
terrible (you have to join this waitlist to use it:
https://github.com/github-copilot/code-review-waitlist). I was a little
surprised, because I expected it to be mostly noise, but they seem to have
heavily biased it towards not commenting on something unless it's actually
a real issue (so it will miss a lot of stuff). I don't know if there's any
way to tell it about project-specific coding standards, though, which is
probably something we'd want. We could experiment with cursorrules, which
are a way to give the AI in the Cursor editor project-specific guidelines,
but I don't know whether any PR reviewing tools pick those up. It would at
least help any contributor who uses Cursor (I've put a rough sketch of
what such a file might look like at the end of this message). Honestly,
this is an area where the tooling is still not as well developed as it
could be.

>
> > I think the biggest concern here is contributors (especially newer
> > contributors) contributing code that exclusively comes from an LLM
> > without any thought from the contributor themselves. This is
> > especially likely from potential GSoC applicants. This we should
> > disallow, because LLMs are not good enough to do this right now, and
> > in the case of a GSoC applicant, it tells us nothing about their
> > coding ability. Basically, any contributor to SymPy should be
> > responsible for all the code they contribute. This especially makes it
> > harder to evaluate GSoC applicants, but that's unfortunately the world
> > we live in and we just need to learn how to evaluate people better
> > (happy to discuss ideas for this. Should we do video call interviews
> > with top GSoC applicants?)
>
> This is where we are heading with students in my University.
> Programming is increasingly assessed by interview. It just isn't
> reasonable any more to take the raw code submitted and evaluate it at
> face value for anything that is remotely near beginner level. There
> always was the possibility that a student might have paid someone else
> to do their homework but now any reasonable beginner programming
> assignment can be done in seconds by LLMs that are available for free.

The reason I mentioned interviews is that some people from other projects
said at the GSoC mentor summit that they do this.

Aaron Meurer

>
> --
> Oscar
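
P.S. If we want to try the cursorrules idea, my understanding is that it's
just a plain text file named .cursorrules at the root of the repo that
Cursor feeds to its AI as project-specific instructions. As a rough,
illustrative sketch (these particular guidelines are just examples off the
top of my head, not anything the project has agreed on):

    # .cursorrules (illustrative sketch only; wording not agreed on)
    - All new public functions and classes need docstrings with an
      Examples section containing doctests.
    - Every bug fix needs a regression test; every new feature needs
      tests in the corresponding tests/ directory.
    - Do not change or remove public API without going through the
      SymPy deprecation policy.
    - Follow the SymPy style guide in the docs; keep code pure Python
      and avoid adding new hard dependencies.

Whether the Copilot review tool (or any other PR review bot) would pick up
a file like this is exactly the part I'm unsure about, but it would at
least guide contributors who are writing code locally in Cursor.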
