Hi Greg,

On 2026-05-05 at 11:05-04:00, Greg Hogan wrote:
> On Tue, May 5, 2026 at 2:40 AM Nguyễn Gia Phong wrote:
> [...]
> > So yes, Oracle doesn't like contributions already in the public domain
> > (so it can't sue other parties for infringements), but it's also
> > not thrilled about infringing copyrights either (so it can be sued
> > by other parties).
> >
> > I think the latter might apply to us.
>
> This has always been a risk accepting contributions. A developer could
> copy and/or modify example code off Stack Overflow and fail to
> properly attribute per CC BY-SA.

Indeed, though it'd be put more aptly as

> A developer could copy and/or modify example code off an LLM
> and fail to check whether it is a (near-)verbatim copy of
> the model's training data.

The issue is that such a check is impossible to carry out, given
the vast volume of the training data as well as the legality
of obtaining it.

Considering only copyright, I think it helps to treat LLM output
like a loose page you found on the street.  It might be from a book
under copyright, or it might be really old and have entered
the public domain, who knows, so it's probably unwise
to redistribute it.

On 2026-05-05 at 11:05-04:00, Greg Hogan wrote:
> The risk to our project is mitigated in that most Guix contributions
> are not copyrightable "factual" updates for versions, checksums, and
> applying patches.

Agreed, and these updates are blocked not by a lack of patches
but by a lack of reviewers.  (IMHO an LLM cannot meaningfully
participate in opening an editor and changing version and checksum
strings, or downloading a patch file, anyway.)

Best wishes,
Phong
