On Fri, May 22, 2026 at 10:40:36AM +0200, [email protected] wrote:
> On Fri, May 22, 2026 at 08:15:16AM +0200, Renaud Allard wrote:
> > From: Renaud Allard <[email protected]>
> > To: [email protected]
> > Subject: Fwd: [ Re: lex(1): signed integer overflow in repetition count ]
> > Date: Fri, 22 May 2026 08:15:16 +0200
> >
> > > > Sorry about that. doxygen's pre.l uses {0,1000} on several string
> > > > patterns (lines 642, 645, 658, 668, 685), which exceeds OpenBSD's
> > > > RE_DUP_MAX of 255.
> > > >
> > > > The UBSan trigger I was originally fixing is "lb - 1" wrapping when
> > > > lb is INT_MIN (sscanf("%d") clamps overflow that way on OpenBSD).
> > > > That only requires forbidding negative values, not capping at 255.
> > > > The grammars-too-large case is already caught downstream by the
> > > > "input rules are too complicated" check in mkstate().
> > > >
> > > > Minimal follow-up that keeps the overflow guard but drops the cap:
> > > >
> > > Here is the correct one rebased on 1.14 after the revert:
> > >
> >
> > Small ping about this one: it has been tested against doxygen and
> > does not impose an arbitrary limit.
> >
>
> It seems I mangled my patch in my previous mail.
> This one should be better:
Survived an llvm-22 bulk.
ok tb
>
>
> Index: nfa.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/lex/nfa.c,v
> diff -u -r1.14 nfa.c
> --- nfa.c 17 May 2026 15:32:55 -0000 1.14
> +++ nfa.c 22 May 2026 08:39:15 -0000
> @@ -556,6 +556,9 @@
>
> base_mach = copysingl(mach, lb - 1);
>
> + if (lb < 0 || (ub < 0 && ub != INFINITE_REPEAT))
> + flexfatal(_("negative repetition value"));
> +
> if (ub == INFINITE_REPEAT) {
> copy = dupmachine(mach);
> mach = link_machines(mach,
>
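
For readers following along, the guard described above can be sketched
in isolation. This is only an illustration, not the lex(1) code:
rep_bounds_ok() is a hypothetical helper, and INFINITE_REPEAT is assumed
here to be the -1 sentinel for an open-ended {n,} repetition. It rejects
negative bounds before any "lb - 1" arithmetic is done, and places no
upper cap on the count, so {0,1000} is still accepted.

```c
#include <limits.h>
#include <stdbool.h>

/* Assumption: sentinel for an open-ended {n,} repetition. */
#define INFINITE_REPEAT (-1)

/*
 * Hypothetical helper, not part of lex(1): returns true when the
 * repetition bounds are safe to use.  A negative lower bound (for
 * example INT_MIN from an overflowing scan) is rejected, so a later
 * "lb - 1" cannot wrap.  A negative upper bound is rejected unless it
 * is the INFINITE_REPEAT sentinel.  There is deliberately no upper cap.
 */
static bool
rep_bounds_ok(int lb, int ub)
{
	if (lb < 0)
		return false;
	if (ub < 0 && ub != INFINITE_REPEAT)
		return false;
	return true;
}
```

A caller would run this check first and report a fatal error on
failure, and only then compute anything derived from lb.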