On Fri, 21 Apr 2023 at 20:36, William Stein <wst...@gmail.com> wrote:
> > There is a discussion right now on HN about LLMs trained on code:
> > https://news.ycombinator.com/item?id=35657982
>
> One of the comments https://news.ycombinator.com/item?id=35658118
> points out that most of the non-GPL, super-permissive licenses require
> explicit attribution when creating derived works. If the output of an
> LLM is a derived work (and not just some fair use of that input), then
> there is legally nothing particularly special about GPL in the context
> of training LLMs. That, I think, successfully undercuts my point in
> starting this thread.
I don't think GPL or other licenses really matter here. It won't be long before these models can produce code that is sufficiently original/distinct that it would not be considered "derived" anyway.

The fact that the model happened to learn in part from looking at lots of code with different licenses is not really that different from the way that humans learn programming. If I happen to look at the Sage source code and learn something in the process, it does not mean that Sage's GPL conditions apply to all future code I write after gaining that knowledge.

There is a spectrum: from using knowledge learned from code, through adapting the code, to, in the extreme, just copying the code. Pretty soon these models will be able to position themselves wherever you want on that spectrum.

--
Oscar

--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/sage-devel/CAHVvXxSKzm3QHxQqSx5-k8kpFZFxOSf368eCUnM-3TqM04GN-w%40mail.gmail.com.