The "Copyright and the Generative-AI Supply Chain" overview paper by the GenLaw folks (https://blog.genlaw.org/explainers/talkin.html) is generally an excellent reference for this kind of thing, in the US context at least. They take great care to disentangle the many different copyright questions surrounding generative AI which continue to get confused in lots of discussions.
They cover this particular question on p.59-60 (chapter "II. Tracing Copyright Through the Supply Chain", section: "Pre-Trained/Base Models"). This contains some academic hedging: on the one hand, "in some cases, a model’s creators will have made creative choices that imbue the model with copyrightable expression", but on the other hand, "applying an existing algorithm and well-known architecture to an existing dataset does not involve sufficient creative choices". But at the very least they seem to encourage some healthy skepticism about copyright claims by AI labs, in the slightly snarky footnote 325: "It is worth noting that many model trainers creators certainly believe that models are copyrightable, and have released those models under licenses that are only intelligible if there is something copyrightable to license in the first place."

Then there is a recent policy paper which among other things argues more strongly that "model weights [...] are largely not copyrightable":
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5049562
("The Mirage of Artificial Intelligence Terms of Use Restrictions" by Lemley and Henderson)

Regards, Tilman

On Fri, Feb 21, 2025 at 4:20 AM Stefan Kaufmann (WMDE) <[email protected]> wrote:

> Hello everybody,
>
> I could not help but fall into a series of rabbit holes over the past
> weeks. The question this time: Are trained models (of the connectionist
> AI flavour) actually protected by IP legislation? And if so: Why?
>
> I stumbled across this through a semirelated question that made me
> realize: I had read a lot about TDM exceptions and whether they apply
> for training connectionist models. And on the output side of things,
> there is ongoing discussion on whether the outputs of generative systems
> deserve IP protection.
>
> However, what I had to dig into and found just a bit of discussion, was
> the question of IP protection of the trained models themselves.
> Or, in other terms, if providers slap some license on a model they
> trained: On what basis can they even constrain the way in which they
> can be re-used?
>
> I am certainly not the first person to stumble over this, so I will be
> very glad about any pointers towards articles on the topic.
>
> What I found so far was a paper by the IPO from 2020 that laments the
> supposed lack of sui generis protection for trained models:
> <https://ipo.org/wp-content/uploads/2020/11/SG-model-rights-committee-paper-pub.pdf>.
>
> In Section III A they argue that trained models as machine-created works
> might not be subject to (US) copyright law, at least if no human
> creative input or at least interaction is involved.
> It could be argued that RLHF would constitute a human involvement
> (albeit mostly outsourced to clickworkers in the global south) –
> provided the feedback fulfills the minimum threshold of creativity and
> is not merely a human in the loop that acts on predetermined rules that
> leave no leeway for individual expression.
> However, even this argument would fall flat whenever reinforcement
> learning is based on synthetic input, e.g. a model being trained through
> RL by another trained model.
>
> After looking at patents and trade secrets, the IPO looks longingly at
> existing sui generis rights, namely the European SGDR and asks for
> similar SGR for trained connectionist AI models.
>
> Nuno Sousa e Silva argues in
> <https://copyrightblog.kluweriplaw.com/2024/01/18/are-ai-models-weights-protected-databases/>
> (after stumbling over the very same question I had) that SGDR could be
> applied out-of-the-box at least to the weights of a connectionist model,
> checking pretty much all of the boxes.
>
> So, generally:
>
> a) yay, another case for the consequences of SGDR
> b) did I miss stuff? Had you come across arguments that trained models
> check some other category for IP protection?
> bb) If not: what even is the legal basis for any license at least
> outside the EU?
>
> Any thoughts are highly appreciated!
>
> regards,
> -stk
> --
> Stefan Kaufmann (he)
> Policy and Public Sector Officer
>
> Wikimedia Deutschland e. V. | Tempelhofer Ufer 23–24 | 10963 Berlin |
> Tel. +49 (0)30-577 11 62-0 | <https://wikimedia.de>
>
> Stay up to date! Current news and exciting stories about Wikimedia,
> Wikipedia and Free Knowledge in the newsletter:
> <https://www.wikimedia.de/newsletter/>
>
> Our vision is a world in which all people can share in, use and add to
> the knowledge of humanity. Help us with that!
> https://spenden.wikimedia.de
>
> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
> Registered in the register of associations of the Amtsgericht
> Charlottenburg, VR 23855. Recognized as a charitable organization by
> the Finanzamt für Körperschaften I Berlin, tax number 27/029/42207.
> Executive directors: Franziska Heine, Dr. Christian Humborg.
>
> _______________________________________________
> Publicpolicy mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
