Sorry - reposting from my subscribed address:

Hi,

Sorry to top-post! But - I wanted to bring the discussion back to licensing.

I have great sympathy for the ecological and code-quality concerns, but licensing is a separate question, and, it seems to me, an urgent question.

Imagine I asked some AI to give me code to replicate a particular algorithm A. It is perfectly possible that the AI will largely or completely reproduce some existing GPL code for A, from its training data. There is no way that I could know that the AI has done that without some substantial research. Surely, this is a license violation of the GPL code? Let's say we accept that code. Others pick up the code and modify it for other algorithms. The code-base gets infected with GPL code, in a way that will make it very difficult to disentangle.

Have we consulted a copyright lawyer on this? Specifically, have we consulted someone who advocates the GPL?

Cheers,

Matthew

On Thu, Jul 4, 2024 at 11:27 AM Marten van Kerkwijk <m...@astro.utoronto.ca> wrote:
>
> Hi All,
>
> I agree with Dan that the actual contributions to the documentation are
> of little value: it is not easy to write good documentation, with
> examples that show not just the mechanics but the purpose of the
> function, i.e., go well beyond just showing some random inputs and
> outputs. And poorly constructed examples are detrimental in that they
> just hide the fact that the documentation is bad.
>
> I also second his worries about ecological and social costs.
>
> But let me add a third issue: the costs to maintainers. I had a quick
> glance at some of those PRs when they were first posted, but basically
> decided they were not worth my time to review. For a human contributor,
> I might well have decided differently, since helping someone to improve
> their contribution often leads to higher quality further contributions.
> But here there seems to be no such hope.
>
> All the best,
>
> Marten
>
> Daniele Nicolodi <dani...@grinta.net> writes:
>
> > On 03/07/24 23:40, Matthew Brett wrote:
> >> Hi,
> >>
> >> We recently got a set of well-labeled PRs containing (reviewed)
> >> AI-generated code:
> >>
> >> https://github.com/numpy/numpy/pull/26827
> >> https://github.com/numpy/numpy/pull/26828
> >> https://github.com/numpy/numpy/pull/26829
> >> https://github.com/numpy/numpy/pull/26830
> >> https://github.com/numpy/numpy/pull/26831
> >>
> >> Do we have a policy on AI-generated code? It seems to me that
> >> AI-code in general must be a license risk, as the AI may well generate
> >> code that was derived from, for example, code with a GPL-license.
> >
> > There is definitely the issue of copyright to keep in mind, but I see
> > two other issues: the quality of the contributions and one moral issue.
> >
> > IMHO the PRs linked above are not high quality contributions: for
> > example, the added examples are often redundant with each other. In my
> > experience these are representative of automatically generated content:
> > as there is little to no effort involved in writing it, the content is
> > often repetitive and with very low information density. In the case of
> > documentation, I find this very detrimental to the overall quality.
> >
> > Contributions generated with AI have huge ecological and social costs.
> > Encouraging AI-generated contributions, especially where there is
> > absolutely no need to involve AI to get to the solution, as in the
> > examples above, makes the project co-responsible for these costs.
> >
> > Cheers,
> > Dan
> >
> > _______________________________________________
> > NumPy-Discussion mailing list -- numpy-discussion@python.org
> > To unsubscribe send an email to numpy-discussion-le...@python.org
> > https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> > Member address: m...@astro.utoronto.ca