On 20/02/26 at 10:38 +0000, Sean Whitton wrote:
> Lucas Nussbaum [20/Feb 10:09am +01] wrote:
> > I'm not sure that would work in practice. As soon as content is added
> > (in the sense of + lines in a diff), all the problematic aspects of
> > AI-assisted coding are there and the contributor must behave responsibly.
>
> Maybe *some* of the problematic aspects of LLM-assisted coding are
> there. I think you have to be overgeneralising in saying that they are
> all there, if all that happened was a refactoring because you preferred
> to describe it in prose to an LLM instead of figuring out the Emacs
> keyboard macro to do it yourself.
Refactoring is a good example, because there are various levels of
refactoring, some of which can be performed automatically by
non-AI-powered IDEs. What an LLM could do is identify
similar-but-not-identical functionality in various parts of a codebase,
extract a generalized version of that functionality into a function, and
use it in place of the duplicated code.

Now, what happens if that generalized function is sufficiently generic
that it could come unmodified from a different codebase? If you want to
allow pre-AI-style refactorings (since they do not carry any potential
copyright issue) but not LLM-style refactorings, you would have to
carefully draw a line in the GR text.

As a made-up example, see each commit in
https://github.com/lnussbaum/llm-refactor-example/commits/main/ :
the caesar_cipher() implementation introduced in the last commit can
probably be found in the training data.

Lucas
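To make the extract-and-generalize step concrete, here is a hypothetical
sketch of the kind of refactoring described above; the code below is
illustrative only and is not copied from the example repository (only the
caesar_cipher() name comes from the email):

```python
# Before the refactoring, the codebase might contain several ad-hoc,
# similar-but-not-identical shift loops, e.g. a hard-coded ROT13:
def rot13(text):
    # site 1: shift of 13 is baked in, lowercase letters only
    return "".join(
        chr((ord(c) - ord("a") + 13) % 26 + ord("a")) if c.islower() else c
        for c in text
    )

# After the refactoring, an LLM would extract a generalized function and
# rewrite the call sites to use it. A function this generic is exactly
# the kind of code that may exist verbatim in the training data:
def caesar_cipher(text, shift):
    result = []
    for c in text:
        if c.islower():
            result.append(chr((ord(c) - ord("a") + shift) % 26 + ord("a")))
        elif c.isupper():
            result.append(chr((ord(c) - ord("A") + shift) % 26 + ord("A")))
        else:
            result.append(c)  # non-letters pass through unchanged
    return "".join(result)
```

The point is that the "before" code is clearly original to the codebase,
while the "after" function could plausibly have come from anywhere,
which is what makes drawing a line in the GR text difficult.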

