Hi all,

---
[TL;DR: Are you OK with committing small-enough code changes which have been
reviewed, come with a clear maintenance commitment, and are transparently
tagged in the commit as AI-assisted?

I would be for it, but community discussion is needed here.]
---

I will open the question which had to come sooner or later:
what should be our stance towards AI-assisted/generated code?

Do you have some strong or weak opinions about that?

I am asking for very practical reasons - there are some issues which have bugged
me for years (the comparison feature troubles being the major one).
I know I can fix them, but I also know it would take week(s) of full-time work,
and that's totally out of my reach in the foreseeable future.

Below I briefly summarize my stance and add some links in case you don't follow
the current frenzy much. After that I add one particular example to give you
a taste.

----

To start somewhere, let's look at the landscape of AI policies in OSS [1].
We can see a wide range of approaches, from outright bans (gentoo/gimp/qemu) to
permissive ones (kernel/llvm/scipy). It's clear that everyone has concerns, but
often about different things. I will add my (current) views while using
their methodology. I will try to be brief, but can discuss at length if needed.


a) Quality.
Cons: AI slop is real. If you don't have the technical know-how to push the
      model at the right spots, it will self-confidently lie to you without
      restraint. Go discuss regexps with Gemini if you want to experience it.

Pros: The current (2026+) agentic models are no longer toys for writing
      syntactic-sugar routines. They can correctly grasp large-scale logical
      patterns in a project, understand the API on their own, and write code
      which is correct after review and has better error handling than I
      would (lazily) do myself.

My current stance: if you can actually do a proper review and have a reasonable
  technical background, it's a valuable assistant that saves a lot of time.

  If you are a technically incompetent prompt engineer blindly trusting the
  code, you are asking for big trouble. (I have been challenged on this point
  by friends (competent in CS) who are in the full-vibe regime, and was shown
  *working* projects with nontrivial codebases in a programming language not
  known to the maintainers. I have a gut feeling this is wrong, but the future
  might have its own plans.)


b) Copyright. Many issues here (of course I am not a lawyer). Outstanding ones:
   - Both the US and the EU grant copyright to human output, not machine output.
     You can claim it if your interaction with/tuning of the AI is nontrivial,
     but a simple prompt is not enough - our "(C) LyX Team" might not apply
     to such code.
   - The AI model might have been trained on licensed code, and some output
     might be a verbatim snippet from elsewhere. There are ongoing legal battles
     over whether training an LLM is transformative enough that it does (not)
     constitute copyright infringement.
     Some paid services (like Claude) grant the rights to you and also give
     explicit umbrella legal cover in case you get sued.

My current (practical) stance:
- If you spend enough time tuning the AI output, or apply the 10% transformation
  rule of thumb [2] (manual refactoring/comments), it's ours.

- If the changes are tiny targeted patches (see example below), infringement is
  out of the question; larger chunks can also be checked by snippet scanners
  (FOSSA, GitHub Copilot Guardrails).

- Practically, we could use commit tags as other projects do, e.g.
  Assisted-by: Anthropic Claude Code (Reviewed by Human)
  and use a certificate of origin via sign-off, as seen in the kernel,
  certifying that
  ... The contribution is based upon previous work that, to the best of my
  knowledge, is covered under an appropriate open source license ...
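
To make the tagging idea a bit more tangible, here is a minimal sketch of a
check that a commit hook could run. Everything in it is an assumption for
illustration: the function names are hypothetical, and the rule "a commit tagged
Assisted-by must also carry Signed-off-by" is just one possible policy, not
something we have agreed on. It also looks at any line of the message, whereas
real git trailers live in the last paragraph.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// True if some line of the commit message starts with "<key>: " and has
// at least one character after it. (Hypothetical helper, not a git API.)
bool hasTrailer(const std::string& message, const std::string& key) {
    const std::string prefix = key + ": ";
    std::istringstream lines(message);
    std::string line;
    while (std::getline(lines, line)) {
        if (line.size() > prefix.size() &&
            line.compare(0, prefix.size(), prefix) == 0)
            return true;
    }
    return false;
}

// Assumed policy: an AI-assisted commit must also be signed off.
bool trailersAcceptable(const std::string& message) {
    if (!hasTrailer(message, "Assisted-by"))
        return true;  // not tagged as AI-assisted, nothing extra to check
    return hasTrailer(message, "Signed-off-by");
}
```

A message carrying both tags passes, a message with only Assisted-by is
rejected, and untagged commits are left alone.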


c) Ethics. Very briefly and unpolished:
   Major AI companies are at various stages of lawsuits alleging that their
   models were trained on pirated data (books). They also probably took all the
   OSS on the planet regardless of the exact license, trained the models on it,
   and are now making large sums of cash based on volunteer work. There is also
   a nontrivial energy/environmental footprint.

My current stance is that whether training was fair or not, the genie is
  irreversibly out of the bottle and we should not refrain from using its fruit
  - or, even better, we should ask for something like Anthropic summer of code
  credits as a way to give back to the community ;)
  The other issues are real, and while they do not reach my current threshold,
  I can totally see that other people would be strictly against.

[1] https://redmonk.com/kholterhoff/2026/02/26/generative-ai-policy-landscape-in-open-source/
[2] https://www.thefirewall-blog.com/2026/04/ai-code-leak-exposes-the-fault-lines-of-copyright/


-------------------------------------------------

To give a taste of the workflow I am proposing, here you go: for more than a
decade we have had the potential data-loss bug #9395: when comparing documents,
changes in nested insets are sometimes not detected. I did some initial
analysis a while ago and have a patch which does not do a nice job of producing
proper CT changes, but at least it does not miss that there *is* some change.

It sits in my personal patchset and will continue to sit there for the next
decade unless someone else gets into study mode for the Myers et al. diff
algorithm.

Today I asked Claude Sonnet (not even the most powerful model) to look at
https://www.lyx.org/trac/ticket/9395 and to read the potentially interesting
documentation at https://wiki.lyx.org/Devel/Diagrams, and gave it a two-sentence
description/instructions.

After that he worked for about 20 minutes, with occasional ping-pong between us,
and this was his result:
--------
Now I have a full picture of the bug. Let me explain the root cause and the fix
before implementing anything.

  ---
  Root cause analysis

  The bug is in diff_i() at lines 763-766 of src/Compare.cpp:

  if (L_ses == 0)
      writeToDestBuffer(rp.o);   // <- the bug

  L_ses == 0 means findMiddleSnake() found that every position in the current
  range compares as equal at the shallow level. This happens inside diffInset()
  (called when the Float is found in a snake) because equal() returns true for
  any two same-type text-editable insets regardless of content, so [Flex Chunk,
  Caption] vs [Flex Chunk, Caption] inside the Float all match shallowly.

  With L_ses == 0, diff_i() just copies the old content with
  writeToDestBuffer(rp.o) and never calls processSnake(), so diffInset() is
  never called for the nested Flex Chunk or Caption. Changes inside them are
  silently discarded.

  The top-level diff() function handles the identical case correctly by calling
  processSnake() before diff_i(), but recursive calls from diffInset() go
  directly into diff_i() without that initial traversal.

  The fix is one line: change writeToDestBuffer(rp.o) to processSnake(rp).
  processSnake copies the content AND recurses into every nested editable text
  inset via diffInset().

  Two more things need updating:
  1. nested_insets/diffs.expected.lyx: the expected file has the nested changes
  with old/new swapped (delete "2", insert "1" instead of delete "1", insert
  "2"); the outer text is correct
  2. nested_insets/skip.test: needs to be removed

  Do you want me to go ahead with the fix? And should I also check whether
  there's a built binary we can test against, or will you build separately?

--------

Launching our GUI tests obviously failed, but his analysis follows my initial
observation and, unlike me, he gets the right spot, and the one-liner patch
works.
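
For those who prefer code to prose, the argument in the quoted analysis can be
reduced to a toy model. This is only a sketch under my own assumptions, not LyX
code: Node, shallowSame(), countChanges() and makeFloat() are hypothetical names,
and the boolean flag stands in for the choice between "copy the old content"
and "copy and descend into nested insets".

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an inset: a type, its own text, nested insets.
struct Node {
    std::string type;
    std::string text;
    std::vector<Node> children;
};

// Shallow comparison: own text matches and the children pair up by type only.
// What is *inside* the children is deliberately not inspected.
bool shallowSame(const Node& o, const Node& n) {
    if (o.text != n.text || o.children.size() != n.children.size())
        return false;
    for (std::size_t i = 0; i < o.children.size(); ++i)
        if (o.children[i].type != n.children[i].type)
            return false;
    return true;
}

// Counts detected changes between two nodes.
// recurseWhenEqual == false mimics "copy the old content and stop";
// recurseWhenEqual == true mimics "copy the content and descend into children".
int countChanges(const Node& o, const Node& n, bool recurseWhenEqual) {
    if (!shallowSame(o, n))
        return 1;   // a real diff would run here; the toy counts one change
    if (!recurseWhenEqual)
        return 0;   // nested differences are never looked at
    int total = 0;
    for (std::size_t i = 0; i < o.children.size(); ++i)
        total += countChanges(o.children[i], n.children[i], recurseWhenEqual);
    return total;
}

// A Float holding a Flex Chunk (with the given text) and a Caption.
Node makeFloat(const std::string& chunkText) {
    return Node{"Float", "", {Node{"Flex Chunk", chunkText, {}},
                              Node{"Caption", "caption", {}}}};
}
```

Comparing makeFloat("1") with makeFloat("2") reports no change when the flag is
false and one change when it is true, which is the silent-discard behaviour and
its fix in miniature.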

If it were my personal project, I would spend some time double-checking the
argument and the surrounding code, and commit if everything fits.

What would you do? :)

Pavel
-- 
lyx-devel mailing list
[email protected]
https://lists.lyx.org/mailman/listinfo/lyx-devel
