Hi, Pavel,
I am broadly in agreement with your analysis. Practically speaking, if
existing developers want to use AI to help them fix bugs, then I have no
problem with it. I trust all of us to review the code the same way we
usually would. It probably wouldn't hurt to have a +1 sort of policy,
the way we do just before releases, just to guarantee an additional
level of review in such cases, at least at first. It also wouldn't hurt
for people to post things, along with the patches, similar to what you
did. But I'm not sure we need to require that.
Code submissions from unknown external people are a different matter. I
don't see a need to rule these out entirely, but they would need careful
review. But that's already true for contributions from such people.
Maybe a somewhat more careful review than usual would be good.
It's pretty freaking impressive what Claude did. I recently got sent a
paper by a colleague in which he used AI to help him prove a central
lemma. His view was that he could have done it himself, but he was a bit
rusty on certain things, and so AI made it much easier.
Riki
On 5/6/26 10:00 AM, Pavel Sanda wrote:
Hi all,
---
[TL;DR: Are you OK with committing small-enough changes to the code,
provided they were reviewed, there is a clear maintenance commitment, and
the change is transparently tagged in the commit as AI-assisted?
I would be for it, but community discussion is needed here.]
---
I will open the question which had to come sooner or later:
what should be our stance towards AI-assisted/generated code?
Do you have some strong or weak opinions about that?
I am asking for very practical reasons - there are some issues which have
bugged me for years (comparison feature troubles being the major one).
I know I can fix those; at the same time I know it would take week(s) of
full-time work to address them - and that's totally out of my reach in the
foreseeable future.
Below I briefly summarize my stance and add some links in case you don't
follow the current frenzy much. After that I add one particular example, to
give you a taste.
----
To start somewhere, let's look at the landscape of AI policies in OSS [1].
We can see a wide range of approaches, from outright bans (gentoo/gimp/qemu)
to permissive ones (kernel/llvm/scipy). It's clear that everyone has concerns,
but often about different things. I will add my (current) views while using
their methodology. I will try to be brief, but can discuss at length if needed.
a) Quality.
Cons: AI slop is real. If you don't have the technical know-how and don't
push the model at the right spots, it will self-confidently lie to you
without restraint.
Go discuss regexps with gemini if you want to experience it.
Pros: The current (2026+) agentic models are no longer toys for writing
syntactic sugar routines.
They can correctly grasp large-scale logical patterns in the project,
understand the API on their own, and write code which is correct after
review and has better error handling than I would (lazily) do myself.
My current stance: if you can actually do a proper review and have a
reasonable technical background, it's a valuable assistant saving a lot of
time. If you are a technically non-competent prompt engineer blindly
trusting the code, you are asking for big trouble. (I have been challenged
on this point by friends (competent in CS) who are in the full-vibe regime,
and was shown *working* projects with nontrivial codebases in a computer
language not known to the maintainers. I have a gut feeling this is wrong,
but the future might have its own plans.)
b) Copyright. Many issues here (of course I am not a lawyer); the outstanding
ones:
- Both the US and the EU grant copyright to human output, not machine output.
You can claim it if your interaction/tuning with the AI is nontrivial.
But a simple prompt is not enough - our (C) LyX Team might not apply
to such code.
- The AI model might have been trained on licensed code. Some output
might be a verbatim snippet from elsewhere. There are ongoing legal battles
over whether training an LLM is transformative enough that it does (not)
constitute copyright infringement.
Some paid services (like claude) grant the rights to you and also give
explicit umbrella legal cover in case you get sued.
My current (practical) stance:
- If you spent enough time tuning with the AI, or apply the 10%
transformation rule of thumb [2] (manual refactor/comments), it's ours.
- If the changes are tiny targeted patches (see example below),
infringement is out of the question; larger chunks can also be checked by
snippet scanners (FOSSA, GitHub Copilot Guardrails).
- Practically, we could use commit tags like other projects do, e.g.
Assisted-by: Anthropic Claude Code (Reviewed by Human)
and use a certificate of origin by sign-off, as seen in the kernel,
certifying that
... The contribution is based upon previous work that, to the best of my
knowledge, is covered under an appropriate open source license ...
c) Ethics. Very briefly and unpolished:
Major AI companies are in different stages of lawsuits claiming that their
models were trained on pirated data (books). They also probably took all the
OSS on the planet regardless of the exact license, trained the models on it,
and are now making large sums of cash based on volunteer work. A nontrivial
energy/environmental footprint exists.
My current stance is that whether training was fair or not, the genie is
irreversibly out of the bottle and we should not refrain from using the
fruit - or, even better, we could ask for something like anthropic summer of
code credits as a way to give back to the community ;)
The other issues are real, and while they do not reach my current threshold,
I can totally see that other people would be strictly against.
[1] https://redmonk.com/kholterhoff/2026/02/26/generative-ai-policy-landscape-in-open-source/
[2] https://www.thefirewall-blog.com/2026/04/ai-code-leak-exposes-the-fault-lines-of-copyright/
-------------------------------------------------
To give some taste of the workflow I am proposing, here you go: for more
than a decade we have had a potential dataloss bug, #9395: when comparing
documents, changes in nested insets are sometimes not detected. I did some
initial analysis a while ago and have a patch which does not do a nice job
for proper CT changes, but at least it does not miss that there *is* some
change. It sits in my personal patchset and will continue to sit there for
the next decade unless someone else gets into studying mode for the Myers
et al. algorithm for diffs.
Today I asked claude sonnet (not even the most powerful one) to look at
https://www.lyx.org/trac/ticket/9395 and read potentially interesting
documentation at https://wiki.lyx.org/Devel/Diagrams, along with a
two-sentence description/instructions.
After that he went into 20 minutes of work with occasional ping-pong between
us, and this was his result:
--------
Now I have a full picture of the bug. Let me explain the root cause and the fix
before implementing anything.
---
Root cause analysis
The bug is in diff_i() at lines 763-766 of src/Compare.cpp:
    if (L_ses == 0)
        writeToDestBuffer(rp.o); // <- the bug
L_ses == 0 means findMiddleSnake() found that every position in the current
range compares as equal at the shallow level. This happens inside diffInset()
(called when the Float is found in a snake) because equal() returns true for
any two same-type text-editable insets regardless of content, so [Flex Chunk,
Caption] vs [Flex Chunk, Caption] inside the Float all match shallowly.
With L_ses == 0, diff_i() just copies the old content with
writeToDestBuffer(rp.o) and never calls processSnake(), so diffInset() is
never called for the nested Flex Chunk or Caption. Changes inside them are
silently discarded.
The top-level diff() function handles the identical case correctly by
calling processSnake() before diff_i(), but recursive calls from diffInset()
go directly into diff_i() without that initial traversal.
The fix is one line: change writeToDestBuffer(rp.o) to processSnake(rp).
processSnake copies the content AND recurses into every nested editable text
inset via diffInset().
Two more things need updating:
1. nested_insets/diffs.expected.lyx: the expected file has the nested changes
with old/new swapped (delete "2", insert "1" instead of delete "1", insert
"2"); the outer text is correct
2. nested_insets/skip.test: needs to be removed
Do you want me to go ahead with the fix? And should I also check whether
there's a built binary we can test against, or will you build separately?
--------
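To see the described failure mode in isolation, here is a minimal toy
sketch. It is not the LyX code: Item, Range, chr, inset, shallowEqual,
matchesShallowly, diffRange, diffDocuments, makeDoc and the "fixed" flag are
hypothetical names made up for illustration, and the real diff search is
replaced by a placeholder that counts any shallow mismatch as one change.
Only the roles of diff(), diff_i(), processSnake(), diffInset() and the
L_ses == 0 case are taken from the analysis above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for one position in a paragraph: either a plain character
// or a nested inset with its own content. Hypothetical, not a LyX type.
struct Item {
    char ch = 0;                // meaningful only when inset_type is empty
    std::string inset_type;     // non-empty marks a nested inset
    std::vector<Item> content;  // content of the nested inset
};

using Range = std::vector<Item>;

Item chr(char c) { Item i; i.ch = c; return i; }

Item inset(std::string type, Range content)
{
    Item i;
    i.inset_type = std::move(type);
    i.content = std::move(content);
    return i;
}

// Shallow comparison: two insets of the same type match regardless of content.
bool shallowEqual(Item const & a, Item const & b)
{
    if (a.inset_type.empty() != b.inset_type.empty())
        return false;
    return a.inset_type.empty() ? a.ch == b.ch : a.inset_type == b.inset_type;
}

// True when both ranges match position by position at the shallow level
// (the situation the analysis describes as L_ses == 0).
bool matchesShallowly(Range const & o, Range const & n)
{
    if (o.size() != n.size())
        return false;
    for (std::size_t i = 0; i < o.size(); ++i)
        if (!shallowEqual(o[i], n[i]))
            return false;
    return true;
}

int diffRange(Range const & o, Range const & n, bool fixed);

// Toy processSnake(): the range matched shallowly, so descend into each
// nested inset (the role diffInset() plays in the analysis).
int processSnake(Range const & o, Range const & n, bool fixed)
{
    assert(o.size() == n.size());
    int changes = 0;
    for (std::size_t i = 0; i < o.size(); ++i)
        if (!o[i].inset_type.empty())
            changes += diffRange(o[i].content, n[i].content, fixed);
    return changes;
}

// Toy diff_i(): returns the number of detected changes.
int diffRange(Range const & o, Range const & n, bool fixed)
{
    if (!matchesShallowly(o, n))
        return 1;  // placeholder for the real diff search
    // Shallow match: the buggy variant only copies the old content
    // (0 changes); the fixed variant traverses the nested insets.
    return fixed ? processSnake(o, n, fixed) : 0;
}

// Toy top-level diff(): always traverses a shallow match before delegating.
int diffDocuments(Range const & o, Range const & n, bool fixed)
{
    return matchesShallowly(o, n) ? processSnake(o, n, fixed) : 1;
}

// A Float containing [Flex Chunk, Caption]; only the Caption text varies.
Range makeDoc(char caption_text)
{
    return { inset("Float", { inset("Flex Chunk", { chr('a') }),
                              inset("Caption", { chr(caption_text) }) }) };
}
```

In this toy, diffDocuments(makeDoc('1'), makeDoc('2'), false) reports no
change, because the Float's content matches shallowly and is only copied,
while the same call with fixed = true reaches the Caption and reports one
change.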
Launching our GUI tests obviously failed, but his analysis follows my initial
observation; unlike me, he gets the right spot, and the one-liner patch works.
If it was my personal project I would spend some time double-checking the
argument and the surrounding code, and commit if everything fits.
What would you do? :)
Pavel
--
----------------------------
Richard Kimberly (Riki) Heck
Professor of Philosophy and Professor of Linguistics
Brown University
Pronouns: they/them/their
Website: http://rkheck.frege.org/