Hi Ian, Ludovic.

Ludovic Courtès <l...@gnu.org> writes:

> Hi Ian,
>
> Ian Eure <i...@retrospec.tv> skribis:
>
>> Summarizing the situation:
>>
>> - SHF has an opaque, difficult, and undocumented process for
>>   handling name changes.  I’d like to stress again that this is
>>   *not* strictly a transgender issue (though it likely affects them
>>   more, or in worse/different ways) -- it is a human respect issue.
>>   Many, many more cisgender people change their name than
>>   transgender people.
>
> It is also not strictly an SWH issue: how does Internet Archive handle
> name changes?  What about append-only storage in general?  We’ve
> discussed this already.
>
>> - SHF gave their archive to HuggingFace, an "AI" company which is
>>   generating derived works with no attribution or provenance, in
>>   ways which violate both the licenses of the projects used to train
>>   their model, and the SHF principles for LLMs.
>
> [...]
>
>> - Has Guix reached out to SHF to express these concerns / get a
>>   response?
>
> I’ve seen and participated in informal discussions, but that’s all I
> know.  Maintainers?

We haven't.  Given some improvements were apparently already made by
SWH in response to concerns raised, it seems the dialogue should
continue.

>> - Whether a public or private response, what would Guix consider to
>>   be an acceptable response?  An unacceptable response?
>> - How long is Guix willing to wait for a response?
>
> Free software people, myself included, have expressed disappointment
> regarding the use of code harvested by SWH for HuggingFace’s training.
> Stefano Zacchiroli of SWH responded to these concerns on Mastodon back
> in March, as you probably saw.
>
> One important point is that copyleft code is excluded from the training
> dataset; I was able to anecdotally check that for GPL code such as Guix
> using their interface (there was a thread on Mastodon but I can’t find
> it): <https://huggingface.co/spaces/bigcode/in-the-stack>.  That
> addresses my main concern.
>
> Remaining concerns include the weak wording of the principles put
> forward by SWH in its statement on LLMs:
> <https://www.softwareheritage.org/2023/10/19/swh-statement-on-llm-for-code/>.
> I think this is something worth discussing further with them (it’s
> already been brought up notably on Mastodon).  It’s not clear to me
> whether this is a task for Guix as a project.

I don't think it is a task for Guix specifically, but rather for all
users of SWH or interested parties.

-- 
Thanks,
Maxim