On Fri, Feb 20, 2026 at 08:26:08PM +0100, Lucas Nussbaum wrote:
> As a made-up example, see each commit in
> https://github.com/lnussbaum/llm-refactor-example/commits/main/ :
> the caesar_cipher() implementation introduced in the last commit can
> probably be found in the training data.
As another thought experiment, suppose a program contains a bubble
sort, and a human programmer is asked to optimize it by replacing the
bubble sort with a quick sort.
There are only so many ways to code the quick sort algorithm in C, and
the programmer may well be vaguely remembering how they saw it done in
some non-free source (for example, Sedgewick's Algorithms book) and,
perhaps subconsciously, reproducing what they once saw there.
Would that be problematic? I very much doubt it, even if it looked
frighteningly similar to the one found in AT&T's proprietary Unix
code.
Now replace "a human being" with "an LLM". How does this change the
calculus?
- Ted