[Bug fortran/104535] New: don't use fmod?

2022-02-14 Thread fx at gnu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104535

          Bug ID: 104535
         Summary: don't use fmod?
         Product: gcc
         Version: unknown
          Status: UNCONFIRMED
        Severity: normal
        Priority: P3
       Component: fortran
        Assignee: unassigned at gcc dot gnu.org
        Reporter: fx at gnu dot org
Target Milestone: ---

I was reminded by comments on the report I made about poor fmod performance on
x86 that I should have commented on the original observation.

I'd looked at one of the Polyhedron benchmarks which suffers badly from a
simple random number routine that calls DMOD.  That gets compiled to fmod,
which is only inlined, albeit poorly on x86, with the relevant component(s) of
-ffast-math.  It seems to me that MOD should compile to the arithmetical
expression in the standard, which doesn't have the complication of having to
treat errors.  (When I defined DMOD as a statement function for it in that
routine, I got performance much closer to ifort.  I should have kept the
profiles I compared, but could regenerate them.)
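
A minimal sketch of the substitution described above, assuming the standard's
definition MOD(A,P) = A - INT(A/P)*P for P /= 0.  The statement function name
dmodx and the sample values are hypothetical and are not taken from the
benchmark; AINT is used in place of INT here only so that the truncated
quotient stays in double precision.

```fortran
program mod_stmt_fn
  implicit none
  double precision :: dmodx, a, p   ! a and p give the dummy arguments their type
  double precision :: x, y

  ! Statement function (hypothetical name dmodx): the arithmetic
  ! expression A - AINT(A/P)*P, with no error handling.
  dmodx(a, p) = a - aint(a / p) * p

  x = 7.5d0
  y = 2.0d0
  print *, dmod(x, y)    ! intrinsic; compiled to fmod according to the report
  print *, dmodx(x, y)   ! arithmetic expression

  ! For these sample values both forms give exactly 1.5.
  if (dmod(x, y) /= dmodx(x, y)) error stop "forms disagree"
end program mod_stmt_fn
```

The two forms are not interchangeable in general: fmod returns the exact
remainder, whereas the expression rounds both the quotient and the product, so
the results may differ when A/P is large.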

Is there a good reason not to do that (and maybe similarly with other
intrinsics I haven't checked)?  I could probably have a go at implementing it
if appropriate, though I don't know my way around now.

[Bug target/103008] poor inlined builtin_fmod on x86_64

2021-10-30 Thread fx at gnu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008

--- Comment #4 from Dave Love ---
On further consideration, perhaps this is just a Fortran issue.  I thought
-ffast-math should turn off all the relevant checks to allow reducing mod to
the arithmetic expression, but it probably doesn't.  Also, MAQAO complained
about x87 instructions being generated, but I'm not sure about that either if
it's just for status.  Apologies if this is invalid, and correction welcome.

[Bug target/103008] poor inlined builtin_fmod on x86_64

2021-10-30 Thread fx at gnu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008

--- Comment #3 from Dave Love ---
Created attachment 51709
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51709&action=edit
gglx.s extract

[Bug target/103008] poor inlined builtin_fmod on x86_64

2021-10-30 Thread fx at gnu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008

--- Comment #2 from Dave Love ---
Created attachment 51708
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51708&action=edit
ggl.s extract

[Bug target/103008] poor inlined builtin_fmod on x86_64

2021-10-30 Thread fx at gnu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008

--- Comment #1 from Dave Love ---
Created attachment 51707
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51707&action=edit
gglx.f90

[Bug target/103008] New: poor inlined builtin_fmod on x86_64

2021-10-30 Thread fx at gnu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008

          Bug ID: 103008
         Summary: poor inlined builtin_fmod on x86_64
         Product: gcc
         Version: 11.2.0
          Status: UNCONFIRMED
        Keywords: missed-optimization
        Severity: normal
        Priority: P3
       Component: target
        Assignee: unassigned at gcc dot gnu.org
        Reporter: fx at gnu dot org
Target Milestone: ---
          Target: x86_64

Created attachment 51706
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51706&action=edit
ggl.f90

This is from looking at a Fortran benchmark set, but presumably
isn't Fortran-specific.

One of the cases in that set (ac.f90) gets bottlenecked on a random
number routine (which may be rubbish, but it's there).  It uses DMOD,
which gets compiled to __builtin_fmod according to the tree dump, and
is inlined.  However, the benchmark performance is still 50% worse
with gfortran than Intel ifort, and if I replace DMOD with its
definition, gfortran is much closer to ifort.
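
As a rough illustration of the substitution described above: the routine below
is not the attached ggl.f90, whose contents are not reproduced in this thread.
It is a hypothetical stand-in of the same general kind, a multiplicative
congruential generator kept in double precision, with the names next_dmod and
next_expr and the constants 16807 and 2147483647 chosen for the sketch.

```fortran
program ggl_sketch
  implicit none
  double precision :: seed_a, seed_b, ra, rb
  integer :: i

  seed_a = 1.0d0
  seed_b = 1.0d0
  do i = 1, 5
     ra = next_dmod(seed_a)
     rb = next_expr(seed_b)
     print *, ra, rb
     ! With these magnitudes every intermediate value is exactly
     ! representable, so the two forms are expected to agree.
     if (seed_a /= seed_b) error stop "forms disagree"
  end do

contains

  ! Hypothetical generator step using the DMOD intrinsic.
  double precision function next_dmod(seed)
    double precision, intent(inout) :: seed
    double precision, parameter :: m = 2147483647.0d0
    seed = dmod(16807.0d0 * seed, m)
    next_dmod = seed / m
  end function next_dmod

  ! The same step with DMOD replaced by A - AINT(A/P)*P.
  double precision function next_expr(seed)
    double precision, intent(inout) :: seed
    double precision, parameter :: m = 2147483647.0d0
    double precision :: t
    t = 16807.0d0 * seed
    seed = t - aint(t / m) * m
    next_expr = seed / m
  end function next_expr

end program ggl_sketch
```

Building both variants with the options quoted below and comparing the
generated assembler is one way to see the difference being reported.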

I'll attach files ggl.f90, the original, and gglx.f90 which avoids the
call to the intrinsic, along with assembler from each.  The assembler
is from GCC 11.2.0, run (on SKX) as

  gfortran -Ofast -march=native

(I note that the generated fmod isn't inlined with -O3, which looks to
me like a Fortran miss that I should report.)

I only take benchmarks too seriously for understanding the results
but, at least with PDO, GCC is pretty much on a par with ifort on the
bottom line of that set, despite also #40770, and another poor case. :-)

[Bug fortran/100724] -fwhole-program breaks module use

2021-05-25 Thread fx at gnu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100724

--- Comment #5 from Dave Love ---
Thanks for the explanation.

Could the manual entry for -fwhole-program just be amended to clarify that it's
a fallback for when a linker plugin isn't available for -flto?  That may be
what it was intended to say, but it's not clear to me.  I used -fwhole-program
because it seemed to fit my case exactly.

[Bug fortran/100724] -fwhole-program breaks module use

2021-05-24 Thread fx at gnu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100724

--- Comment #2 from Dave Love ---
The manual says not to use -flto with -fwhole-program.  Is that misleading?

I checked self-built gfortran 10.2.0 again, and it definitely works for me
without -flto on Debian 10, but it fails with Red Hat devtoolset's 10.2.1 on
RHEL7.  Odd...

[Bug debug/100725] dwarf error with --whole-program

2021-05-24 Thread fx at gnu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100725

--- Comment #2 from Dave Love ---
(In reply to Jakub Jelinek from comment #1)
> Those binutils are too old for dwarf5.
> When the linker doesn't print any diagnostics, that isn't a big deal, but if
> it needs to diagnose something and parse DWARF for that, you need 2.35 or
> later.

Does that mean you can't reasonably use 11 on most distributions without
explicitly using -gdwarf-4?  The release notes suggested to me it would still
work, just not with full functionality somehow, and there is some adjustment
for the binutils version.  Is there some way to configure it to default to
DWARF 4 other than, I guess, adding specs to treat -g as -gdwarf-4?

[Bug debug/100725] New: dwarf error with --whole-program

2021-05-22 Thread fx at gnu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100725

          Bug ID: 100725
         Summary: dwarf error with --whole-program
         Product: gcc
         Version: 11.1.0
          Status: UNCONFIRMED
        Severity: normal
        Priority: P3
       Component: debug
        Assignee: unassigned at gcc dot gnu.org
        Reporter: fx at gnu dot org
Target Milestone: ---

Extending my example in #100724 with -g, I see a dwarf error, which I assume is
a separate issue:

$ cat test.f90
module tw
  interface
     real function twice (x)
     end function twice
  end interface
end module tw

real function twice (x)
  twice = 2*x
end function twice

use tw
read *, x
print *, twice (x)
end
$ gfortran-11 -O2 -g -fwhole-program test.f90
/usr/bin/ld: /usr/bin/ld: DWARF error: can't find .debug_ranges section.
/tmp/cc8sFtAX.o: in function `MAIN__':
test.f90:(.text+0x7f): undefined reference to `twice_'
collect2: error: ld returned 1 exit status

Changing -g to -gdwarf-4 avoids the error, as does removing -fwhole-program. 
It also works with gcc-10 and -gdwarf-5.  The system is Debian 10 amd64
(binutils 2.31.1, if that matters).

[Bug fortran/100724] New: -fwhole-program breaks module use

2021-05-22 Thread fx at gnu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100724

          Bug ID: 100724
         Summary: -fwhole-program breaks module use
         Product: gcc
         Version: 11.1.0
          Status: UNCONFIRMED
        Severity: normal
        Priority: P3
       Component: fortran
        Assignee: unassigned at gcc dot gnu.org
        Reporter: fx at gnu dot org
Target Milestone: ---

I found that trying gfortran -fwhole-program failed to link a case I tried,
with undefined references to routines with interface blocks.  It's OK with
gfortran 10.
Here's a trivial example (on Debian 10, but I don't suppose that matters):

$ gfortran-11 --version
GNU Fortran (GCC) 11.1.0
$ cat test.f90
module tw
  interface
     real function twice (x)
     end function twice
  end interface
end module tw

real function twice (x)
  twice = 2*x
end function twice

use tw
read *, x
print *, twice (x)
end
$ gfortran-11 -O -fwhole-program test.f90 
/usr/bin/ld: /tmp/ccBKHiLp.o: in function `MAIN__':
test.f90:(.text+0x7d): undefined reference to `twice_'
collect2: error: ld returned 1 exit status

[Bug target/97160] Regression from GCC 8 optimizing to sincos on ppc64le

2020-09-25 Thread fx at gnu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97160

Dave Love changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #3 from Dave Love ---
Apologies; this isn't the compiler.  It seems the glibc that Advanced Toolkit
gcc 8 uses is responsible for the difference between that and 10.  I tried to
test that with LD_LIBRARY_PATH, but find you have to build with AT's ld64.so.2
to see the effect.

It looks as though sincos in ppc64le glibc-2.17 is just sin+cos, though I can't
immediately find my way in the source to check.  I'd assumed it was optimized.