[Bug c++/110622] x86: Miscompilation at O1 level (O0 is working)

2023-07-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110622

--- Comment #6 from Andrew Pinski  ---
(In reply to Mathieu Malaterre from comment #5)
> > Does adding -fexcess-precision=standard help? What about -ffloat-store ?
> 
> highway is c++ only. Those flags are C only

No, they are not. Support for -fexcess-precision=standard in C++ was added in
GCC 13. -ffloat-store is a generic middle-end option and has been supported by
all front-ends since at least GCC 3.
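
One way to see whether excess precision can be in play for a given target and
set of flags is to print FLT_EVAL_METHOD from <cfloat>. The sketch below is not
part of the bug report, and the helper names (describe_eval_method,
has_excess_precision, print_eval_method) are made up for illustration. On i386
with x87 math the value is typically 2, but that is an assumption about the
configuration, so check it on the machine in question.

```cpp
#include <cfloat>
#include <cstdio>

// Hypothetical helper: maps a FLT_EVAL_METHOD value to a short description.
const char* describe_eval_method(int method) {
  switch (method) {
    case 0: return "operations evaluated in their own type";
    case 1: return "float and double evaluated as double";
    case 2: return "all operations evaluated as long double";
    default: return "indeterminable or implementation-defined";
  }
}

// Hypothetical helper: true when float arithmetic may carry extra precision.
bool has_excess_precision(int method) { return method == 1 || method == 2; }

// Prints the evaluation method the compiler reports for this translation unit.
void print_eval_method() {
  std::printf("FLT_EVAL_METHOD=%d (%s)\n", FLT_EVAL_METHOD,
              describe_eval_method(FLT_EVAL_METHOD));
}
```

Calling print_eval_method() from a build with and without
-fexcess-precision=standard or -ffloat-store shows what the compiler reports in
each configuration.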

[Bug c++/110622] x86: Miscompilation at O1 level (O0 is working)

2023-07-11 Thread malat at debian dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110622

--- Comment #5 from Mathieu Malaterre  ---
> Does adding -fexcess-precision=standard help? What about -ffloat-store ?

highway is c++ only. Those flags are C only

[Bug c++/110622] x86: Miscompilation at O1 level (O0 is working)

2023-07-11 Thread malat at debian dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110622

--- Comment #4 from Mathieu Malaterre  ---
Here is my current state of work:

*
https://github.com/malaterre/highway/commit/771ca57d2d29b48f91beae033f6854f9b2dfb730

I am open to suggestions for further reducing the test case, as I am not
familiar with this sort of symptom or with how to drive c-reduce in this case.

Thanks all

[Bug c++/110622] x86: Miscompilation at O1 level (O0 is working)

2023-07-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110622

Andrew Pinski  changed:

           What    |Removed |Added
----------------------------------------------------------------
           See Also|        |https://github.com/google/highway/issues/1488#issuecomment-1621528097

--- Comment #3 from Andrew Pinski  ---
Does adding -fexcess-precision=standard help? What about -ffloat-store ?

[Bug c++/110622] x86: Miscompilation at O1 level (O0 is working)

2023-07-11 Thread malat at debian dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110622

--- Comment #2 from Mathieu Malaterre  ---
Created attachment 55518
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55518&action=edit
Preprocessed source

% /usr/bin/g++-13 -save-temps -DHWY_STATIC_DEFINE -DTOOLCHAIN_MISS_ASM_HWCAP_H
-I/home/mathieu/Perso/highway -O1 -fPIE -fvisibility=hidden
-fvisibility-inlines-hidden -Wno-builtin-macro-redefined
-D__DATE__=\"redacted\" -D__TIMESTAMP__=\"redacted\" -D__TIME__=\"redacted\"
-fmerge-all-constants -Wall -Wextra -Wconversion -Wsign-conversion -Wvla
-Wnon-virtual-dtor -fmath-errno -fno-exceptions -DHWY_IS_TEST=1
-DGTEST_HAS_PTHREAD=1 -MD -MT
CMakeFiles/math_test.dir/hwy/contrib/math/math_test.cc.o -MF
CMakeFiles/math_test.dir/hwy/contrib/math/math_test.cc.o.d -o
CMakeFiles/math_test.dir/hwy/contrib/math/math_test.cc.o -c
/home/mathieu/Perso/highway/hwy/contrib/math/math_test.cc

[Bug c++/110622] x86: Miscompilation at O1 level (O0 is working)

2023-07-11 Thread malat at debian dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110622

--- Comment #1 from Mathieu Malaterre  ---
For reference:
* https://github.com/google/highway/issues/1488#issuecomment-1621528097

[Bug c++/110622] New: x86: Miscompilation at O1 level (O0 is working)

2023-07-11 Thread malat at debian dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110622

Bug ID: 110622
   Summary: x86: Miscompilation at O1 level (O0 is working)
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: malat at debian dot org
  Target Milestone: ---

I can trigger an assertion in the highway unit test suite on Debian/i386 when
using -O1 (it does not happen at -O0).

Symptoms:

[...]
f32x2: Log1p(5.5968112995194757e-20) expected 5.5968112995194757e-20 actual 0
ulp 5.28754e+08 max ulp 3
[...]

For some reason the code returns exactly zero (0) during the computation of a
log1p function. The code runs fine at O2 on all other Debian arches (amd64,
mipsel, armel, ppc32 ...).
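
For context on the symptom: for an input this small, log1p(x) is essentially x,
whereas evaluating log(1 + x) with the sum rounded to float gives exactly 0,
because 1 + x rounds to 1. The sketch below only illustrates that arithmetic.
It is not highway's implementation, the function names are hypothetical, and
nothing here establishes that the miscompiled code takes this path.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>

// Naive formulation: 1 + x is rounded to float before the logarithm is taken.
// The volatile store forces that rounding even where x87 registers would
// otherwise keep extra precision.
float naive_log1p(float x) {
  assert(x > -1.0f);  // log1p is only defined for x > -1
  volatile float sum = 1.0f + x;
  return std::log(sum);
}

// Library formulation, which keeps precision for |x| much smaller than 1.
float reference_log1p(float x) { return std::log1p(x); }

// Prints both results for the input quoted in the failing test.
void print_comparison() {
  const float x = 5.5968112995194757e-20f;
  std::printf("naive=%g reference=%g\n", naive_log1p(x), reference_log1p(x));
}
```

With the input from the failing test, the naive form returns 0 and the library
form returns a value equal to x to within float precision, which matches the
"expected" and "actual" columns of the test output.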

Since the return value is exactly 0, I suspect this is not a case of excess
precision (x87). This is not a hardware issue, as I can reproduce it on multiple
hosts (Debian buildd machines and an x86 porterbox).

If I extract the math logic and call it in a `main` function, then the compiler
optimizes the code correctly (even at O2) and the result is correct (within
tolerance).

I did not observe any suspicious behavior under valgrind and/or -fsanitize.
