Dixi quod…

>I don’t expect to reply much (if I’m even allowed after this) here.

… but I think I have to make one addition. I don’t normally read the
list (too much traffic, not enough spoons), but I wanted to check on
the web interface whether the mail made it through, and I saw the mail
from peb.

The legalities aspect (when not used as distraction or waved away) is
a bit misrepresented.

Yes, the TDM exception gives an exception to copyright for the training
of models… to analyse things, for trends and the like. Nowhere does
this allow using the models to produce output. And please, do not use
the word “generate”, they don’t generate (generative art is something
entirely different and good), they regurgitate. LLMs are a sort of
lossy compressor/decompressor, with the decompression attempting a best
*average* match to continue the “prompt” (it’s really just autocomplete
with sparks).
https://explainextended.com/2023/12/31/happy-new-year-15/ demonstrated
very nicely how they actually work, using an actually obtainable model
as example.
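
To make the “autocomplete” point concrete, here is a minimal sketch,
assuming a toy word-level bigram table. This is nowhere near a real LLM
in scale or architecture, and the names train and complete are made up
for illustration. It only shows the principle: every continuation is
drawn from what followed the same word in the input, and a fixed PRNG
seed gives the same output every time.

```python
import random
from collections import defaultdict


def train(text):
    """Record, for each word, every word that followed it in the input."""
    words = text.split()
    table = defaultdict(list)
    for cur, nxt in zip(words, words[1:]):
        table[cur].append(nxt)
    return table


def complete(table, prompt, length, seed):
    """Continue the prompt by repeatedly sampling a recorded successor."""
    rng = random.Random(seed)  # same seed, same choices
    out = prompt.split()
    for _ in range(length):
        followers = table.get(out[-1])
        if not followers:
            break  # the last word never had a successor in the input
        out.append(rng.choice(followers))
    return " ".join(out)


corpus = "the cat sat on the mat and the cat ate the fish"
table = train(corpus)
print(complete(table, "the cat", 5, seed=1))
```

Whatever this prints, each adjacent word pair in it already occurs in
the corpus; nothing comes out that did not go in.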

Incidentally, this is also why their output alone is not copyrightable
as a new work: it is produced by a deterministic machine, not a human,
and therefore does not pass the threshold of originality… in two different
ways, one in the legal meaning of that term, the other in the ordinary
meaning of “originality”: there’s nothing new there, it merely

        r e g u r g i t a t e s

from its inputs. (If the companies didn’t filter the possible prompts,
it’d be easy to extract near-complete copies of individual “training
data” by the millions, as studies have shown.)

⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠ ⚠

HOWEVER, this does not mean that their output is free from copyright.
Rather, due to the above-mentioned properties (machine transformation
of copyrighted works), the sum of all outputs from such a model is a
derived work from all of its inputs (how much this holds for each
individual combination of input and output depends, of course, on the
prompt, PRNG seed and output in question). This does not, of course,
give you carte blanche to just use *any* of its output… not even small
pieces. Citation rules do exist, after all. Especially the academics
should know some…

So.

bye,
//mirabilos
-- 
  "Using Lynx is like wearing a really good pair of shades: cuts out
   the glare and harmful UV (ultra-vanity), and you feel so-o-o COOL."
                                         -- Henry Nelson, March 1999
