On Tue, May 13, 2025 at 7:10 PM Eric Blake <ebl...@redhat.com> wrote:
>
> On Tue, May 13, 2025 at 04:49:58PM -0400, Nikolaos Chatzikonstantinou wrote:
> > > Not touched in this patch: I think there are places where you get the
> > > wrong answer because you don't convert to c_int32 until at the end of
> > > the evaluation, whereas m4 1.4.20 is doing ALL operations in signed
> > > 32-bit.
> >
> > Actually I think we might be in the clear with this one thanks to the
> > properties of modular arithmetic that say e.g. (a % c) + (b % c) is
> > equal to (a + b) % c. It certainly means that m4p uses more memory as
> > it uses bignums right up until the final computation.
>
> True for unsigned math, not so true for signed math. Then again, POSIX
> says:
>
> https://pubs.opengroup.org/onlinepubs/9799919799/utilities/m4.html
>
> | eval
> |     The eval macro shall evaluate its first argument as an arithmetic
> |     expression, using signed integer arithmetic with at least 32-bit
> |     precision. At least the following C-language operators shall be
> |     supported, with precedence, associativity, and behavior as described
> |     in 1.1.2.1 Arithmetic Precision and Operations:
>
> which in turn says math should be done as if by C's 'unsigned long'
> (and m4 1.4.x is only doing it with int, which is not the same as
> unsigned long on most 64-bit platforms), m4 1.4.x is NOT quite
> POSIX-compliant. (That's something that I plan to change for m4 1.6,
> but for back-compat reasons I'm reluctant to change it for 1.4.x).

Yeah, I'm not sure about it yet. I am at least aware of the issue, but
I didn't pursue it further when I saw that both m4 and m4p seem to give
the same answer regardless of overflow. I think I read the same POSIX
text you quoted above at some point, and it convinced me of my approach.

> At any rate, I suspect you will find that you need to tweak m4p's eval
> to match:
>
> $ m4
> eval(-1%3)
> -1
> eval((0x7fffffff+2)%3)
> -1
> $ m4p
> eval(-1%3)
> 2
> eval((0x7fffffff+2)%3)
> 0

Those are useful examples. In both cases the +2 accomplishes the same
thing as writing 0x80000001, which in turn equals (int)0x80000001L,
assuming the cast reinterprets the bits in two's complement with a
32-bit int.

Where things really go south is in modular arithmetic, not because of
the casts but because Python's % behaves differently from C's: in short,
it seems that in C the result takes the sign of the dividend, while in
Python it takes the sign of the divisor. I fixed that in the eval
parser; let me know if there are further examples where the results
differ from C due to casts, but I hope that settles it!

P.S. The latest version, v0.3.6, supports -D, $@ and $*, although I have
not implemented tail-call recursion yet. (To be frank, I'm not sure how
to do that; I'll have to look into it.)

Regards,
Nikolaos Chatzikonstantinou
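
For reference, here is a minimal Python sketch of the two behaviours discussed above. It is not m4p's actual code: the helper names (to_int32, c_rem, eval_rem32) are made up for illustration, and wrapping each operand to signed 32-bit before taking the remainder is an assumption about how m4 1.4.x arrives at the results in the quoted examples.

```python
def to_int32(n: int) -> int:
    """Wrap an arbitrary Python int to a signed 32-bit two's complement value."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n


def c_rem(a: int, b: int) -> int:
    """Remainder with C semantics: the result takes the sign of the dividend a."""
    if b == 0:
        raise ZeroDivisionError("remainder by zero")
    r = abs(a) % abs(b)
    return -r if a < 0 else r


def eval_rem32(a: int, b: int) -> int:
    """Hypothetical helper: wrap both operands to int32, then take the C-style remainder."""
    return to_int32(c_rem(to_int32(a), to_int32(b)))


# Python's own % takes the sign of the divisor, as in the m4p output above.
print(-1 % 3)                          # 2
# C-style remainder takes the sign of the dividend, as in the m4 output above.
print(c_rem(-1, 3))                    # -1
# Without wrapping, the bignum 0x80000001 is positive and divisible by 3.
print(c_rem(0x7fffffff + 2, 3))        # 0
# With the operand wrapped to int32 first, the result matches m4 1.4.x.
print(eval_rem32(0x7fffffff + 2, 3))   # -1
```

In this sketch the second example only comes out as -1 when the sum is wrapped to 32 bits before the remainder is taken, so it may be worth testing that case against the fixed parser as well.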