If an AI model is trained on material whose licenses are incompatible with the
Apache license, and it produces code nearly identical to that training material,
I would be surprised if the output were considered code that could be put under
a different license.

Given that the Apache organization has a large code base and many machine
learning related projects, it may be better to train models on that code base
and use this to improve Apache projects.

On Tue, Jan 20, 2026, at 12:13 PM, Alenka Frim wrote:
> Hi,
>
> Thank you Nic for bringing this up!
> I agree with all that has been said and look forward to the guidelines being
> added to the documentation.
>
> Best,
> Alenka
>
> On Tue, Jan 20, 2026 at 09:36, Jean-Baptiste Onofré <[email protected]> wrote:
>
>> Hi,
>>
>> This is a great suggestion.
>>
>> I would also like to remind everyone that code generated primarily by AI
>> may not be copyrightable unless it has been significantly modified by the
>> author. As such, our file headers should reflect this.
>>
>> Regards,
>> JB
>>
>> On Sun, Jan 18, 2026 at 8:14 PM Nic Crane <[email protected]> wrote:
>>
>> > Hi folks,
>> >
>> > I'm just emailing to solicit opinions on adding a page about AI-generated
>> > contributions to the docs. The ASF has its own guidance[1], which is fairly
>> > high-level and is mainly concerned with licensing. However, we are seeing
>> > more AI-generated contributions in which the author doesn't seem to have
>> > engaged with the code at all and appears to have no intention of engaging
>> > with review comments, and I feel like it would be beneficial to have
>> > somewhere in the docs to point to if we close the pull request.
>> >
>> > Having guidelines also makes it easier to tell whether a contributor has
>> > made any effort to follow them.
>> >
>> > I experimented with approaches to being transparent about AI use in my own
>> > PRs and have an example here, where the changes were needed but the subject
>> > matter was a little out of my comfort zone[2] - see resolved comments.
>> >
>> > I've made a rough draft[3] of what I think could constitute some
>> > guidelines, but I'm keen to hear what folks think. Happy to hear thoughts on
>> > the wording, whether this belongs in the contributor guide, or if there are
>> > concerns I haven't considered.
>> >
>> > Nic
>> >
>> >
>> > [1] https://www.apache.org/legal/generative-tooling.html
>> >
>> > [2] https://github.com/apache/arrow/pull/48634
>> >
>> > [3]
>> > We recognise that AI coding assistants are now a regular part of many
>> > developers' workflows and can improve productivity. Thoughtful use of these
>> > tools can be beneficial, but AI-generated PRs can sometimes lead to
>> > undesirable additional maintainer burden. Human-generated mistakes tend to
>> > be easier to spot and reason about, and code review often feels like a
>> > collaborative learning experience that benefits both submitter and
>> > reviewer. When a PR appears to have been generated without much engagement
>> > from the submitter, it can feel like work that the maintainer might as well
>> > have done themselves.
>> >
>> > We are not opposed to the use of AI tools in generating PRs, but recommend
>> > the following:
>> > - Only take on a PR if you are able to debug and own the changes yourself
>> > - Make sure that the PR title and body match the style and length of
>> >   others in this repo
>> > - Follow coding conventions used in the rest of the codebase
>> > - Be upfront about AI usage and summarise what was AI-generated
>> > - If there are parts you don't fully understand, add inline comments,
>> >   explaining what steps you took to verify correctness
>> >   - Reference any sources that guided your changes (e.g. "took a similar
>> >     approach to #123456")
>> >
>> > PR authors are also responsible for disclosing any copyrighted materials in
>> > submitted contributions, as discussed in the ASF generative tooling
>> > guidance: https://www.apache.org/legal/generative-tooling.html
>> >
>> > If a PR appears to be AI-generated, and the submitter hasn't engaged with
>> > the output, doesn't respond to review feedback, or hasn't disclosed AI
>> > usage, we may close it without further review.
>> >
>>
