On 20/02/26 at 10:38 +0000, Sean Whitton wrote:
> Lucas Nussbaum [20/Feb 10:09am +01] wrote:
> > I'm not sure that would work in practice. As soon as content is added
> > (in the sense of + lines in a diff), all the problematic aspects of
> > AI-assisted coding are there and the contributor must behave responsibly.
> 
> Maybe *some* of the problematic aspects of LLM-assisted coding are
> there.  I think you have to be overgeneralising in saying that they are
> all there, if all that happened was a refactoring because you preferred
> to describe it in prose to an LLM instead of figuring out the Emacs
> keyboard macro to do it yourself.

Refactoring is a good example, because there are various levels of
refactoring, some of which can be performed automatically by
non-AI-powered IDEs.

What an LLM could do is identify similar-but-not-identical functionality
in various parts of a codebase, extract a generalized version of that
functionality into a function, and replace the duplicated code with
calls to it.

Now what happens if that generalized function is generic enough that it
could plausibly appear unmodified in a different codebase?
If you want to allow pre-AI-style refactorings, on the grounds that they
do not carry any potential copyright issue, while disallowing LLM-style
refactorings, you would have to draw that line very carefully in the GR
text.

As a made-up example, see each commit in
https://github.com/lnussbaum/llm-refactor-example/commits/main/ :
the caesar_cipher() implementation introduced in the last commit can
probably be found in the training data.
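To make the concern concrete, here is the kind of generic helper I have
in mind (a made-up sketch, not the actual code from that repository):
a textbook caesar_cipher() that countless public codebases implement in
essentially the same way, so an LLM emitting it verbatim is hard to
distinguish from an LLM reproducing training data.

```python
def caesar_cipher(text, shift):
    """Shift alphabetic characters by `shift` positions, preserving case.

    Non-alphabetic characters are passed through unchanged.
    """
    result = []
    for ch in text:
        if ch.isalpha():
            # Pick the right base so case is preserved.
            base = ord('A') if ch.isupper() else ord('a')
            result.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            result.append(ch)
    return "".join(result)
```

There are only so many idiomatic ways to write this in Python, which is
exactly why its provenance is undecidable after the fact.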

Lucas
