Thanks Nic for raising this!

I totally agree with your suggestions and would like to add a few more based
on my review experience:

- Submitters should review every line of generated code before creating the PR
  and understand it in detail, just as if they had written it themselves.
- AI tools are notorious for generating overly verbose comments and unnecessary
  test cases, and for fixing test failures with the wrong approach. Make sure
  these issues are checked and fixed.
- Reviewers are human, so please break large PRs into smaller ones; this makes
  reviewers' lives easier and helps PRs get reviewed promptly.

Best,
Gang

On Mon, Jan 19, 2026 at 3:14 AM Nic Crane <[email protected]> wrote:

> Hi folks,
>
> I'm just emailing to solicit opinions on adding a page about AI-generated
> contributions to the docs. The ASF has its own guidance[1] which is fairly
> high-level and is mainly concerned with licensing. However, we are seeing
> more AI-generated contributions in which the author doesn't seem to have
> engaged with the code at all and appears to have no intention of engaging
> with review comments, and I feel like it would be beneficial to have
> somewhere in the docs to point to if we close the pull request.
>
> Having guidelines also makes it easier to tell whether a contributor has
> made any effort to follow them.
>
> I experimented with approaches to being transparent about AI use in my own
> PRs and have an example here, where the changes were needed but the subject
> matter was a little out of my comfort zone[2] - see resolved comments.
>
> I've made a rough draft[3] of what I think could constitute some
> guidelines, but keen to hear what folks think. Happy to hear thoughts on
> the wording, whether this belongs in the contributor guide, or if there are
> concerns I haven't considered.
>
> Nic
>
>
> [1] https://www.apache.org/legal/generative-tooling.html
>
> [2] https://github.com/apache/arrow/pull/48634
>
> [3]
> We recognise that AI coding assistants are now a regular part of many
> developers' workflows and can improve productivity. Thoughtful use of these
> tools can be beneficial, but AI-generated PRs can sometimes lead to
> undesirable additional maintainer burden. Human-generated mistakes tend to
> be easier to spot and reason about, and code review often feels like a
> collaborative learning experience that benefits both submitter and
> reviewer. When a PR appears to have been generated without much engagement
> from the submitter, it can feel like work that the maintainer might as well
> have done themselves.
>
> We are not opposed to the use of AI tools in generating PRs, but recommend
> the following:
> - Only take on a PR if you are able to debug and own the changes yourself
> - Make sure that the PR title and body match the style and length of others
> in this repo
> - Follow coding conventions used in the rest of the codebase
> - Be upfront about AI usage and summarise what was AI-generated
> - If there are parts you don't fully understand, add inline comments,
> explaining what steps you took to verify correctness
>   - Reference any sources that guided your changes (e.g. "took a similar
> approach to #123456")
>
> PR authors are also responsible for disclosing any copyrighted materials in
> submitted contributions, as discussed in the ASF generative tooling
> guidance: https://www.apache.org/legal/generative-tooling.html
>
> If a PR appears to be AI-generated, and the submitter hasn't engaged with
> the output, doesn't respond to review feedback, or hasn't disclosed AI
> usage, we may close it without further review.
>
