On Thu Feb 19, 2026 at 9:53 AM GMT, Jonathan Carter wrote:
> In terms of LLMs, I agree with the sentiment of others that they are,
> in many ways, mass plagiarism tools.
> ...
> For example, a few weeks ago I was working late one night and I
> couldn't put my finger on it, but one of my loops just looked really
> wrong and ugly, so I searched on DuckDuckGo for some patterns that
> would fit my use case nicely, and DuckDuckGo's AI popped up and
> suggested a very neat and elegant list comprehension that was such an
> obviously good choice that I really should have thought of it in the
> first place.
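(The email doesn't show the actual code, but the kind of rewrite described above typically looks something like this hypothetical sketch, where an accumulate-in-a-loop pattern collapses into a single comprehension:)

```python
# Hypothetical illustration only -- the data and the filter condition
# are assumptions, not the code from the anecdote.
paths = ["a.txt", "b.py", "c.py", "d.md"]

# The awkward loop version: build a list by appending in a loop.
py_files = []
for p in paths:
    if p.endswith(".py"):
        py_files.append(p)

# The equivalent list comprehension: same result, one line.
py_files_lc = [p for p in paths if p.endswith(".py")]

assert py_files == py_files_lc
```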
Personally, as part of refining my own position on these matters, I've
wanted to explore what would be acceptable to me, with respect to
copyright, to harness the value you demonstrate in your anecdote.
Imagine an LLM trained solely on a corpus of free software with
intra-compatible licensing (for the sake of this example, say GPL2 or
later, and anything compatible with it), such that we declare the
resulting weights to be a derivative work, licensed GPL2+, attribute
authorship to the union of the authors of *all* the inputs, and
consider anything it outputs to be a derivative work, likewise GPL2+.
Would that be acceptable? Would that be useful?
--
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ Jonathan Dowland
⢿⡄⠘⠷⠚⠋⠀ https://jmtd.net
⠈⠳⣄⠀⠀⠀⠀