[Bug target/103008] poor inlined builtin_fmod on x86_64

2022-02-13 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008 --- Comment #17 from Richard Biener --- (In reply to Uroš Bizjak from comment #14) > Created attachment 52428 [details] > Proposed patch > > The attached patch implements: > > fmod (a, p) = a - trunc (a/p) * p > drem (a, p) = a - roundeven (a/

[Bug target/103008] poor inlined builtin_fmod on x86_64

2022-02-13 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008 --- Comment #16 from rguenther at suse dot de --- On Fri, 11 Feb 2022, ubizjak at gmail dot com wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008 > > --- Comment #13 from Uroš Bizjak --- > (In reply to Richard Biener from comment #

[Bug target/103008] poor inlined builtin_fmod on x86_64

2022-02-13 Thread Dave.Love at manchester dot ac.uk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008 --- Comment #15 from Dave.Love at manchester dot ac.uk --- "ubizjak at gmail dot com" writes: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008 > > --- Comment #14 from Uroš Bizjak --- > Created attachment 52428 > --> https://gcc.gnu.org

[Bug target/103008] poor inlined builtin_fmod on x86_64

2022-02-12 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008 --- Comment #14 from Uroš Bizjak --- Created attachment 52428 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52428&action=edit Proposed patch The attached patch implements: fmod (a, p) = a - trunc (a/p) * p drem (a, p) = a - roundeven (a

[Bug target/103008] poor inlined builtin_fmod on x86_64

2022-02-11 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008 --- Comment #13 from Uroš Bizjak --- (In reply to Richard Biener from comment #12) > Just as data-point on znver2 Uros testcase shows > > rguenther@ryzen:/tmp> gcc-11 t.c -Ofast -lm -march=znver2 > rguenther@ryzen:/tmp> numactl --physcpubind=3

[Bug target/103008] poor inlined builtin_fmod on x86_64

2022-02-10 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008 --- Comment #12 from Richard Biener --- Just as data-point on znver2 Uros testcase shows rguenther@ryzen:/tmp> gcc-11 t.c -Ofast -lm -march=znver2 rguenther@ryzen:/tmp> numactl --physcpubind=3 /usr/bin/time ./a.out 19.18user 0.00system 0:19.18

[Bug target/103008] poor inlined builtin_fmod on x86_64

2022-02-10 Thread joseph at codesourcery dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008 --- Comment #11 from joseph at codesourcery dot com --- An implementation using division like that definitely isn't valid without -funsafe-math-optimizations (it gives nonsense results when the exponent difference between the arguments is too

[Bug target/103008] poor inlined builtin_fmod on x86_64

2022-02-10 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008 --- Comment #10 from Uroš Bizjak --- FYI, the following testcase: --cut here-- #include float __attribute__((noinline)) _fmodf (float x, float y) { return x - truncf (x/y) * y; } int main () { float a, b; volatile float z; for (a =

[Bug target/103008] poor inlined builtin_fmod on x86_64

2022-02-10 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008 Richard Biener changed: What|Removed |Added CC||jsm28 at gcc dot gnu.org --- Comment #

[Bug target/103008] poor inlined builtin_fmod on x86_64

2022-02-10 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008 Richard Biener changed: What|Removed |Added CC||rguenth at gcc dot gnu.org --- Comment

[Bug target/103008] poor inlined builtin_fmod on x86_64

2021-11-01 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008 --- Comment #7 from Uroš Bizjak --- IMO, inlined fmod (and drem) should eventually be expanded in a generic way in the middle-end as: fmod (a, p) = a - trunc (a/p) * p drem (a, p) = a - roundeven (a/p) * p so division can be later simplified t

[Bug target/103008] poor inlined builtin_fmod on x86_64

2021-10-31 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008 Uroš Bizjak changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed|

[Bug target/103008] poor inlined builtin_fmod on x86_64

2021-10-30 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008 --- Comment #5 from anlauf at gcc dot gnu.org --- There's a mixture of single and double precision in the testcase variants. I haven't checked thoroughly enough if both variants are really equivalent. Do you see the issue if you have only single

[Bug target/103008] poor inlined builtin_fmod on x86_64

2021-10-30 Thread fx at gnu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008 --- Comment #4 from Dave Love --- On further consideration, perhaps this is just a Fortran issue. I thought -ffast-math should turn off all the relevant checks to allow reducing mod to the arithmetic expression, but it probably doesn't. Also,

[Bug target/103008] poor inlined builtin_fmod on x86_64

2021-10-30 Thread fx at gnu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008 --- Comment #3 from Dave Love --- Created attachment 51709 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51709&action=edit gglx.s extract

[Bug target/103008] poor inlined builtin_fmod on x86_64

2021-10-30 Thread fx at gnu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008 --- Comment #2 from Dave Love --- Created attachment 51708 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51708&action=edit ggl.s extract

[Bug target/103008] poor inlined builtin_fmod on x86_64

2021-10-30 Thread fx at gnu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008 --- Comment #1 from Dave Love --- Created attachment 51707 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51707&action=edit gglx.f90