On Fri Feb 20, 2026 at 3:48 PM GMT, Theodore Tso wrote:
> In practice an LLM is trained on a very large corpus of information, including things that are not code, so that it can understand a plain English text prompt. This includes, but is not limited to, mailing list archives, where the software licensing of code fragments is not necessarily going to be clear.

That's a very good point. And we don't have a clear license for our own mailing list content, nor for some of our other 'large' corpora, e.g. the wiki. However, large public-domain collections of text in (at least) English are widely available. (And I see that much of Wikipedia is CC BY-SA now, rather than GFDL.)

I tentatively believe (without really having robust evidence) that there is sufficient material to train a DFSG-free LLM of some scale that produces some level of useful output. Clearly, it would be less efficient than one trained on a larger corpus with copyright ignored.

However this remains a thought experiment for me, to explore some of the moral issues, rather than a practical plan.

> I would also suggest that we are holding LLMs to a much higher
> standard than we are for human beings.

Yes, but LLMs (and machines in general) are not equivalent to humans (or even nearly so), and so we *should* hold them to different standards. Not "higher": orthogonal. They are tools. They are not remotely close to conscious. I think it's a fallacy to compare them in this sense at all.


Best wishes,

--
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ Jonathan Dowland
⢿⡄⠘⠷⠚⠋⠀ https://jmtd.net
⠈⠳⣄⠀⠀⠀⠀
