On Tue, May 13, 2025 at 11:04 AM Eric Blake <ebl...@redhat.com> wrote:
>
> On Tue, May 13, 2025 at 12:13:42AM -0400, Nikolaos Chatzikonstantinou wrote:
> > I'm trying to think of what the output of this program should be:
> >
> > eval(1 = 2 = 3 / 0)
>
> In 1.4.20, division by zero _is_ a late warning, while = is a warning
> produced the moment it is encountered, whether or not it is operating
> on integers or on late warnings.

I should be able to follow that with m4p.

> In branch-1.6, where = has been ripped out (well, turned into an
> invalid operator like += and friends), the behavior is different again
> - like other bad operators, it stops the parse immediately (whether
> bad operator or missing ')' takes precedence is the subject of my
> patch yesterday).  That opens up the door for an eventual m4 2.0 to
> repurpose assignment operators into actually assigning values into
> macro names [ie. eval(`a=1') could behave like define(`a', eval(1))].

I wonder if m4 should use := instead for variable assignment?

> >
> > Since it should be equal to
> >
> > eval((1 = 2) = (3 / 0))
> >
> > I'd assume it should warn as follows:
> >
> > m4:stdin:1: Warning: recommend ==, not =, for equality operator
> > m4:stdin:1: divide by zero in eval: 1 = 2 = 3 / 0
> > m4:stdin:1: Warning: recommend ==, not =, for equality operator
> >
> > which would be the breadth-first left-to-right evaluation strategy,
> > but instead it warns like so:
> >
> > m4:stdin:1: Warning: recommend ==, not =, for equality operator
> > m4:stdin:1: Warning: recommend ==, not =, for equality operator
> > m4:stdin:1: divide by zero in eval: 1 = 2 = 3 / 0
> >
> > which is the depth-first left-to-right evaluation. Although, I'm not
> > entirely sure.
>
> Stepping through the code is the best way to see what actually
> happened.  The first call to parse_expr starts with v1=1 and
> consumes/peeks at <v1=1>" = 2 ="; the second = is the same priority,
> so that is enough to set v1 to the result of "1=2", and that
> computation is what outputs the first warning.  The loop then repeats,
> and it now consumes/peeks at <v1=0>"= 3 /" which is an operator of
> higher precedence, so it must recurse into a second parse_expr.  The
> second parse_expr starts with v1=3 and consumes/peeks at <v1=3>" /
> 0\0" where end-of-string is a lower precedence operator, so it
> computes the division "3/0", which results in a divide-by-zero value,
> but no warning is issued yet.  Control flow then returns to the first
> parse_expr, which is now sitting on <v1=0>=<v2=div0>"\0", and since
> end of string is lower precedence, it is time to compute the =
> operator, which outputs the second warning about = before returning a
> div0 result.  Finally, the top-level evaluate() gets a result of div0,
> and produces the final warning and empty output.
>
> If we wanted, we could treat the warning for misuse of = the same as
> division by 0 (something that only occurs at most once if the
> expression is otherwise parseable, and can be silenced if it occurs in
> the dead branch of || or &&) rather than something that occurs every
> time it is encountered; and maybe even mask it out so that an eval()
> produces at most one warning overall (ie. a div0 warning could silence
> any = warnings).  But given that = is disappearing entirely for 1.6,
> I'm not sure it is worth a 1.4.21 just for tweaks like that.  Further,
> MOST m4 code won't trip the = messages (they have switched to == long
> ago to silence it), and the more important fact is whether the
> expansion of eval() is blank when there is a division by 0 error, with
> a message explaining how eval failed, rather than if or how many other
> warnings occurred alongside that error.

I think as it is it is fine. The division by zero is a catastrophic
error while deprecation is not.

> That said, I think it's fairly easy to change m4p to more closely
> match m4 1.4.20 behavior, including better short-circuiting of || and
> &&

I forgot about your previous report on that. I'll definitely have to fix it.

> and warnings that match m4 1.4.20 for all of eval(1%0),
> eval(1=2=3/0), eval(1+=(2=3)), eval(1+=1+=1), and eval(`++a').

I will also try to fix this. I'm mostly focused on adding the last
missing bits (debugmode, $*, $@) before I fine tune the behavior.

> It's up to you on how
> hard you want to try to exactly match m4 1.4.20's behavior, short of
> duplicating that parser identically rather than relying on Lark to
> parse for you.

I'm mostly concerned with stdout output and I hope that real world
applications of m4 do not rely on byte-precise debug output of eval().
If they do, then I'd like to fix m4p to be correct, but otherwise I'll
just consider it part of the user experience, which can differ
slightly.

> My patch cheats and turns rules for invalid operators into consuming
> the rest of the input (use of /.+/) as closest to what m4 1.4.20 does,
> but I note that even without that cheat, your grammar isn't quite
> right (in C, the += production is closer to 'assignment: atom "+="
> assignment', rather than 'assignment: logical_op "+=" logical_op',
> since the left hand side must be a single term and you can chain
> assignments with right-associativity).

That's true, however I do not know what m4 intends to do with variable
assignment so I left it at that then.

> Not touched in this patch: I think there are places where you get the
> wrong answer because you don't convert to c_int32 until at the end of
> the evaluation, whereas m4 1.4.20 is doing ALL operations in signed
> 32-bit.

Actually I think we might be in the clear with this one thanks to the
properties of modular arithmetic that say e.g. (a % c) + (b % c) is
equal to (a + b) % c. It certainly means that m4p uses more memory as
it uses bignums right up until the final computation.

I will look at your patch later to keep my goals in order; but from a
brief look I'm also wondering if I should perform warning filtering
after the parsing has occurred. Right now parsing returns a list of
warnings in order of appearance (like a flattened tree). If I'd like
to have eval((1 / 0) = (2 / 0)) warn once on divide by zero, I can
filter all DivByZero and append a single DivByZero to that list.

Regards,
Nikolaos Chatzikonstantinou

Reply via email to