Re: [DISCUSS] AI-generated contributions

Nic Crane Thu, 22 Jan 2026 16:47:41 -0800

PR here for anyone interested: https://github.com/apache/arrow/pull/48952


On Thu, 22 Jan 2026 at 09:56, Nic Crane <[email protected]> wrote:

> Thanks Andrew, I really like how you spell out the reasoning around it. I
> will see how we can incorporate some of those ideas.
>
> On Thu, 22 Jan 2026 at 09:23, Andrew Lamb <[email protected]> wrote:
>
>> > We have had repeated attempts at contributions by some folks who simply
>> > do not understand their generated code and when asked for clarification,
>> > have the LLM generate more incorrect commentary.  It's very
>> > Dunning-Kruger and leads to lots of frustration all around.
>>
>> We saw this too in DataFusion and I was pleased with what we came up with
>> for rationale about why it is not helpful[1]. Basically the reviewers are
>> more efficient using the LLM tools directly and the contributor isn't
>> learning anything either.
>>
>> Andrew
>>
>>
>> [1]:
>>
>> https://datafusion.apache.org/contributor-guide/index.html#why-fully-ai-generated-prs-without-understanding-are-not-helpful
>>
>> On Mon, Jan 19, 2026 at 12:48 PM R Tyler Croy <[email protected]> wrote:
>>
>> > (replies inline)
>> >
>> > On Sunday, January 18th, 2026 at 7:43 PM, Gang Wu <[email protected]>
>> > wrote:
>> >
>> > > - Submitters should review every line of generated code before
>> > >   creating the PR and understand every detail, just as if they had
>> > >   written it themselves.
>> > > - AI tools are notorious for generating overly verbose comments,
>> > >   unnecessary test cases, fixing test failures using wrong approaches,
>> > >   etc. Make sure these are checked and fixed.
>> > > - Reviewers are humans, so please try to break down large PRs into
>> > >   smaller ones to make reviewers' lives easier and get PRs reviewed
>> > >   promptly.
>> >
>> >
>> > Like others, I think Nic's draft is a good one. I would like to offer
>> > some thoughts as a maintainer of delta-rs, which has received an
>> > increasing number of AI-assisted pull requests over the past six months.
>> >
>> >
>> > The "PR may be closed without further review" statement I would strongly
>> > encourage moving to the very beginning of the policy.  I would also
>> > encourage labels being used like "ai-assisted" to signal to other
>> > contributors who may or may not wish to engage in reviewing potential
>> > slop.
>> >
>> > We have had repeated attempts at contributions by some folks who simply
>> > do not understand their generated code and when asked for clarification,
>> > have the LLM generate more incorrect commentary.  It's very
>> > Dunning-Kruger and leads to lots of frustration all around.
>> >
>> > As with most policies, it's important to speak to those who are acting
>> > in good faith, but not to rely on everybody following the rules, and to
>> > come up with an agreed-upon way to handle those who don't.
>> >
>> >
>> > Either way I think it's good to ship! :)
>> >
>> >
>> >
>> > Cheers
>> >
>> >
>>
>
