[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2010-07-27 Thread burnus at gcc dot gnu dot org


--- Comment #27 from burnus at gcc dot gnu dot org  2010-07-27 08:44 ---
Subject: Bug 40873

Author: burnus
Date: Tue Jul 27 08:44:22 2010
New Revision: 162557

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=162557
Log:
2010-07-26  Tobias Burnus  bur...@net-b.de

PR fortran/40873
* trans-decl.c (gfc_get_extern_function_decl): Fix generation
for functions which are later in the same file.
(gfc_create_function_decl, build_function_decl,
build_entry_thunks): Add global argument.
* trans.c (gfc_generate_module_code): Update
gfc_create_function_decl call.
* trans.h (gfc_create_function_decl): Update prototype.
* resolve.c (resolve_global_procedure): Also resolve for
IFSRC_IFBODY.

2010-07-26  Tobias Burnus  bur...@net-b.de

PR fortran/40873
* gfortran.dg/whole_file_22.f90: New test.
* gfortran.dg/whole_file_23.f90: New test.


Added:
trunk/gcc/testsuite/gfortran.dg/whole_file_22.f90
trunk/gcc/testsuite/gfortran.dg/whole_file_23.f90
Modified:
trunk/gcc/fortran/ChangeLog
trunk/gcc/fortran/resolve.c
trunk/gcc/fortran/trans-decl.c
trunk/gcc/fortran/trans.c
trunk/gcc/fortran/trans.h
trunk/gcc/testsuite/ChangeLog


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2010-07-27 Thread burnus at gcc dot gnu dot org


--- Comment #28 from burnus at gcc dot gnu dot org  2010-07-27 08:46 ---
FIXED on the trunk (4.6). Thanks for the reports, comments, patches,
suggestions, and reviews!

See PR 45077, PR 45087, and PR 44945 for remaining -fwhole-(file,program) bugs.


-- 

burnus at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2010-07-26 Thread burnus at gcc dot gnu dot org


--- Comment #23 from burnus at gcc dot gnu dot org  2010-07-26 13:25 ---
Created an attachment (id=21315)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21315&action=view)
New trans-decl.c patch - seems to work well

New patch. Found the problem with the help of Jakub (thanks!); not yet
regtested but it works with the previously failing examples.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2010-07-26 Thread dominiq at lps dot ens dot fr


--- Comment #24 from dominiq at lps dot ens dot fr  2010-07-26 17:02 ---
With the patch in comment #23, the polyhedron tests gas_dyn.f90 and
test_fpu.f90 do not link, and compiling the test in
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31867#c6 gives an ICE (Segmentation
fault). Otherwise, all the other polyhedron tests compile with -O3 -g
-fwhole-program and I did not see any miscompilation; the patch regtested
without regression. Nice work, thanks!


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2010-07-26 Thread burnus at gcc dot gnu dot org


--- Comment #25 from burnus at gcc dot gnu dot org  2010-07-26 17:03 ---
(In reply to comment #23)
> Created an attachment (id=21315)
> --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21315&action=view) [edit]
> New trans-decl.c patch - seems to work well

Dominique has found a failure (segfault) with PR 31867 comment 6.

If one generates in gfc_get_extern_function_decl the code for lensum, one
finds that its argument "words" has locally the correct type:
  (gdb) p sym->formal->sym->as->type
  $4 = AS_ASSUMED_SHAPE
but the gsym has the wrong type:
  (gdb) p gsym->ns->proc_name->formal->sym->as->type
  $10 = AS_DEFERRED
Thus, one enters the code path for descriptor-free arrays and crashes as UBOUND
is NULL.

In principle, this should get fixed in resolve_formal_arglist. One problem is
that if one enters find_arglists with sym->ns != gfc_current_ns, it fails.

But the actual problem seems to be in resolve_global_procedure. One has:

(gdb) p sym->attr.if_source
$27 = IFSRC_IFBODY
(gdb) p sym->formal->sym->as->type
$28 = AS_ASSUMED_SHAPE

That is: The symbol in the interface block of the module is resolved. But the
gsym is not:

(gdb) p gsym->ns->resolved
$29 = 0
(gdb) p gsym->ns->proc_name->formal->sym->as->type
$30 = AS_DEFERRED

The following patch fixes the program. (Side remark, one could do more argument
checking, cf. PR 45086.)

Index: gcc/fortran/resolve.c
===================================================================
--- gcc/fortran/resolve.c   (revision 162538)
+++ gcc/fortran/resolve.c   (working copy)
@@ -1816,7 +1816,8 @@ resolve_global_procedure (gfc_symbol *sy
     gfc_global_used (gsym, where);
 
   if (gfc_option.flag_whole_file
-      && sym->attr.if_source == IFSRC_UNKNOWN
+      && (sym->attr.if_source == IFSRC_UNKNOWN
+          || sym->attr.if_source == IFSRC_IFBODY)
       && gsym->type != GSYM_UNKNOWN
       && gsym->ns
       && gsym->ns->resolved != -1
@@ -1902,7 +1903,7 @@ resolve_global_procedure (gfc_symbol *sy
                    sym->name, &sym->declared_at, gfc_typename (&sym->ts),
                    gfc_typename (&def_sym->ts));
 
-      if (def_sym->formal)
+      if (def_sym->formal && sym->attr.if_source != IFSRC_IFBODY)
         {
           gfc_formal_arglist *arg = def_sym->formal;
           for ( ; arg; arg = arg->next)
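
The mismatch described in this comment can be pictured with a small toy model
in C. This is only a sketch under loud assumptions: every toy_* name is
hypothetical, none of it is gfortran code, and it assumes that a symbol records
its final array kind only once it has been resolved. It shows why choosing the
argument-passing path from an unresolved global symbol gives a different answer
than choosing it from the resolved interface copy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature; the names only echo those in the thread.  */
typedef enum { TOY_AS_DEFERRED, TOY_AS_ASSUMED_SHAPE } toy_as_type;
typedef enum { TOY_PATH_NO_DESCRIPTOR, TOY_PATH_DESCRIPTOR } toy_path;

typedef struct toy_proc {
  toy_as_type declared;  /* array kind written in the source */
  toy_as_type as_type;   /* array kind currently recorded on the symbol */
  int resolved;          /* 0 until toy_resolve has run */
} toy_proc;

/* Assumption: resolution is what updates the recorded array kind.  */
void toy_resolve (toy_proc *p)
{
  assert (p != NULL);
  if (!p->resolved)
    {
      p->as_type = p->declared;
      p->resolved = 1;
    }
}

/* Choose the argument-passing path from whatever the symbol records.  */
toy_path toy_select_path (const toy_proc *p)
{
  return p->as_type == TOY_AS_ASSUMED_SHAPE
         ? TOY_PATH_DESCRIPTOR : TOY_PATH_NO_DESCRIPTOR;
}

/* Path chosen for the global definition, optionally resolving it first.  */
toy_path toy_path_for_global (toy_proc *global_def, int resolve_first)
{
  if (resolve_first)
    toy_resolve (global_def);
  return toy_select_path (global_def);
}
```

In this model, calling toy_path_for_global with resolve_first set to 0 on an
unresolved definition picks the descriptor-free path, while resolving first
picks the same path as the interface copy.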


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2010-07-26 Thread dominiq at lps dot ens dot fr


--- Comment #26 from dominiq at lps dot ens dot fr  2010-07-26 21:00 ---
The patch in comment #25 fixes the ICE for PR 31867 comment 6, but also causes
several regressions with "... must have an explicit interface" errors:


FAIL: gfortran.dg/allocatable_function_1.f90 (test for excess errors)
FAIL: gfortran.dg/allocatable_function_3.f90  (test for excess errors)
FAIL: gfortran.dg/char_result_3.f90  (test for excess errors)
FAIL: gfortran.dg/char_result_4.f90  (test for excess errors)
FAIL: gfortran.dg/f2c_6.f90  (test for excess errors)
FAIL: gfortran.dg/import6.f90  (test for excess errors)
FAIL: gfortran.dg/pointer_check_6.f90  (test for excess errors)
FAIL: gfortran.dg/value_tests_f03.f90  (test for excess errors)
FAIL: gfortran.fortran-torture/execute/entry_7.f90 compilation


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2010-07-25 Thread dominiq at lps dot ens dot fr


--- Comment #21 from dominiq at lps dot ens dot fr  2010-07-25 10:03 ---
With the patch in comment #18, I see all the failures reported in comment #19,
plus

FAIL: gfortran.dg/whole_file_9.f90  -O  (internal compiler error)
FAIL: gfortran.dg/g77/13037.f  -O3 -g  (internal compiler error)

With -fwhole-program -O3 -g, most of the polyhedron tests fail either with
"internal compiler error: in output_die, at dwarf2out.c:11046" or with a
segmentation fault.

In addition, without any option, I see several failures for codes with
recursive functions, such as (from pr27613):

program test
  interface
    function bad_stuff(n)
      integer :: bad_stuff (2)
      integer :: n(2)
    end function bad_stuff
    recursive function rec_stuff(n) result (tmp)
      integer :: n(2), tmp(2)
    end function rec_stuff
  end interface
  integer :: res(2)
  res = bad_stuff((/-19,-30/))
  print *,  res
  if (any (res .ne. (/25,25/))) call abort ()
  if (any (rec_stuff((/-19,-30/)) .ne. (/25,25/))) call abort ()

end program test

  recursive function bad_stuff(n)
    integer :: bad_stuff (2)
    integer :: n(2), tmp(2)
    bad_stuff = rec_stuff (n)
    if((maxval (n)>0).and.(maxval (n) < 2)) then
      bad_stuff = bad_stuff + bad_stuff (maxval (n)+1)
    endif
   entry rec_stuff(n) result (tmp)
    tmp=1
    if(maxval (n) < 5) then
      tmp = tmp + rec_stuff (n+1)
    endif
  end function bad_stuff


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2010-07-25 Thread burnus at gcc dot gnu dot org


--- Comment #22 from burnus at gcc dot gnu dot org  2010-07-25 10:14 ---
The patch in comment 18 causes a segfault (in gfc_generate_function_code for
"cfun->function_end_locus = input_location" [Invalid write of size 4]) for the
test case in PR 40011 comment 0 (the one after "Your patch fixes some
Segmentation faults (a couple)") - the test case works otherwise. No
-fwhole-program is needed.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2010-07-24 Thread jv244 at cam dot ac dot uk


--- Comment #17 from jv244 at cam dot ac dot uk  2010-07-24 18:15 ---
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873#c1 still fails with current
trunk


-- 

jv244 at cam dot ac dot uk changed:

   What|Removed |Added

   Last reconfirmed|2010-05-03 10:53:20 |2010-07-24 18:15:21
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873




[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2010-07-24 Thread burnus at gcc dot gnu dot org


--- Comment #18 from burnus at gcc dot gnu dot org  2010-07-24 19:13 ---
(In reply to comment #17)
> comment 1 still fails with current trunk

(In reply to comment #1)
>   subroutine two()
>     call three()
>   end subroutine two
>
>   subroutine three()
>   end subroutine three

The problem is that one first generates code for two and thus calls
  gfc_get_extern_function_decl
to generate the decl for three - there is no existing declaration. As the next
step, one works on three and calls
   gfc_create_function_decl
which creates another declaration.


The test case in comment 1 and the one in comment 16 (with the bogus "use demo"
commented out) worked with an initial version of the following patch. However,
it gave an ICE for the test in comment 4:
22.f90:5:0: internal compiler error: in build_function_decl, at
fortran/trans-decl.c:1599

The solution was to mark the newly generated procedure as global - which it
surely is; otherwise, it would not be accessible.

Index: gcc/fortran/trans-decl.c
===================================================================
--- gcc/fortran/trans-decl.c    (revision 162502)
+++ gcc/fortran/trans-decl.c    (working copy)
@@ -1411,9 +1411,19 @@ gfc_get_extern_function_decl (gfc_symbol
       && !sym->attr.use_assoc
       && !sym->backend_decl
       && gsym && gsym->ns
-      && ((gsym->type == GSYM_SUBROUTINE) || (gsym->type == GSYM_FUNCTION))
-      && gsym->ns->proc_name->backend_decl)
+      && ((gsym->type == GSYM_SUBROUTINE) || (gsym->type == GSYM_FUNCTION)))
     {
+      if (!gsym->ns->proc_name->backend_decl)
+        {
+          tree save_fn_decl = current_function_decl;
+          /* By construction, the external function cannot be
+             a contained procedure.  */
+          current_function_decl = NULL_TREE;
+          gfc_create_function_decl (gsym->ns);
+          current_function_decl = save_fn_decl;
+          gcc_assert (gsym->ns->proc_name->backend_decl);
+        }
+
       /* If the namespace has entries, the proc_name is the
          entry master.  Find the entry and use its backend_decl.
          otherwise, use the proc_name backend_decl.  */
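
The idea behind this patch can be sketched outside of GCC with a toy model in
C. This is a minimal illustration, not the compiler's code: all toy_* types,
functions and variables are hypothetical stand-ins. When a caller refers to a
procedure that is defined later in the same file, the declaration is built on
demand (with the "current function" saved and restored around the call), so the
later definition reuses the same declaration instead of creating a second one.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins only; none of these are real GCC types or functions.  */
typedef struct toy_decl {
  const char *name;
  int has_body;
} toy_decl;

typedef struct toy_gsym {
  const char *name;
  toy_decl storage;        /* backing store for the single decl */
  toy_decl *backend_decl;  /* NULL until the decl has been built */
} toy_gsym;

/* Stand-in for "the function currently being translated".  */
toy_decl *toy_current_function = NULL;
int toy_decls_created = 0;

/* Build the decl for a procedure defined in this file, at most once.  */
void toy_create_function_decl (toy_gsym *g)
{
  if (g->backend_decl != NULL)
    return;
  g->storage.name = g->name;
  g->storage.has_body = 0;
  g->backend_decl = &g->storage;
  toy_decls_created++;
}

/* A caller refers to G: build the decl on demand rather than
   creating a second, body-less one.  */
toy_decl *toy_get_extern_function_decl (toy_gsym *g)
{
  if (g->backend_decl == NULL)
    {
      toy_decl *saved = toy_current_function;
      toy_current_function = NULL;   /* not a contained procedure */
      toy_create_function_decl (g);
      toy_current_function = saved;
      assert (g->backend_decl != NULL);
    }
  return g->backend_decl;
}

/* The procedure itself is translated later and reuses the same decl.  */
toy_decl *toy_generate_function_code (toy_gsym *g)
{
  toy_create_function_decl (g);
  g->backend_decl->has_body = 1;
  return g->backend_decl;
}
```

With this arrangement the pointer returned to the caller and the pointer used
for the definition are identical, which is the property the middle end needs in
order to see the call as a call to a local function.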


-- 

burnus at gcc dot gnu dot org changed:

   What|Removed |Added

 AssignedTo|pault at gcc dot gnu dot org|burnus at gcc dot gnu dot
   ||org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2010-07-24 Thread burnus at gcc dot gnu dot org


--- Comment #19 from burnus at gcc dot gnu dot org  2010-07-24 20:16 ---
And of course the patch won't work out of the box:

$ gfortran -O2 -g gfortran.dg/entry_array_specs_2.f && ./a.out
gfortran.dg/entry_array_specs_2.f:16:0: internal compiler error: in output_die,
at dwarf2out.c:11046

Ditto for gfortran.dg/pr25603.f, gfortran.dg/proc_decl_2.f90,
gfortran.fortran-torture/execute/mystery_proc.f90,
gfortran.fortran-torture/execute/procarg.f90, 
gfortran.dg/loc_1.f90, gfortran.dg/value_test.f90,
gfortran.dg/value_tests_f03.f90.

Probably something is wrong with the "current_function_decl = NULL_TREE;" -
hopefully, it is easily fixable and does not require a completely different
approach.


But at least:

(In reply to comment #0)
> the following Polyhedron testcases fail
>   ac, aermod, doduc, gas_dyn, linpk, mdbx, rnflow and test_fpu

Using -march=native -ffast-math -funroll-loops -ftree-loop-linear -O3
-fwhole-program -fwhole-file,

*ALMOST ALL* the polyhedron tests succeed - except for gas_dyn and test_fpu
(for which there are still undefined references).
[Using -g one also gets issues with polyhedron tests.]


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2010-07-24 Thread burnus at gcc dot gnu dot org


--- Comment #20 from burnus at gcc dot gnu dot org  2010-07-24 22:05 ---
(In reply to comment #19)
> $ gfortran -O2 -g gfortran.dg/entry_array_specs_2.f && ./a.out
> gfortran.dg/entry_array_specs_2.f:16:0: internal compiler error:
> in output_die, at dwarf2out.c:11046

I had a closer look at loc_1.f90. With this patch, the functions fn and foo are
actually cloned and inlined - without it, this does not happen. Consequently,
the ICE goes away with -fno-inline. Thus, the patch might actually be correct
and the bug could be at a completely different place. Additionally, the patch
might give some performance boost :-)

But first, the cause of the ICE has to be found. Any idea where to start
searching?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2010-06-09 Thread fxcoudert at gcc dot gnu dot org


--- Comment #16 from fxcoudert at gcc dot gnu dot org  2010-06-09 22:10 ---
Another one that fails:


subroutine func (x)
  use demo
  integer :: x
  x = 999
end subroutine func

subroutine foo
  interface
    subroutine func(x)
      integer :: x
    end subroutine func
  end interface

  integer :: x
  call func(x)
  if (x /= 999) call abort ()

end subroutine foo

program test
  call foo
end


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2010-05-26 Thread burnus at gcc dot gnu dot org


--- Comment #13 from burnus at gcc dot gnu dot org  2010-05-26 14:28 ---
Is this now fixed by the following commit? Or is something else to be done
(additional fix, backporting, ...)?

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=159852
Log:
2010-05-26  Paul Thomas  pa...@gcc.gnu.org

PR fortran/40011
* resolve.c (resolve_global_procedure): Resolve the gsymbol's
namespace before trying to reorder the gsymbols.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2010-05-26 Thread dominiq at lps dot ens dot fr


--- Comment #14 from dominiq at lps dot ens dot fr  2010-05-26 14:41 ---
> Is this now fixed by the following commit? Or is something else to be done
> (additional fix, backporting, ...)?

At least ac.f90 (probably all the items of the list) fails to link with -O
-fwhole-program at revision 159855.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2010-05-26 Thread burnus at gcc dot gnu dot org


--- Comment #15 from burnus at gcc dot gnu dot org  2010-05-26 14:45 ---
(In reply to comment #13)
> Is this now fixed by the following commit?

Answer: It is not. Comment 1 now works with -fwhole-program -O1, but comment
0 and comment 4 still fail. (Though, they work with -fwhole-file -O1.)

(In reply to comment #12)
> My belief is that with this patch and corrections of the legacy style
> testsuite cases, -fwhole-file could be finally made the default.

As there are no -fwhole-file failures in this PR (though -fwhole-file bugs,
revealed through -fwhole-program), I think this PR does not prevent enabling
-fwhole-file by default.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2010-05-24 Thread pault at gcc dot gnu dot org


--- Comment #12 from pault at gcc dot gnu dot org  2010-05-24 12:31 ---
Created an attachment (id=20734)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20734&action=view)
Fix for this PR and PR40011 #42

This patch regtests OK apart from some peculiarities in proc_ptr_comp_9.f90 and
proc_ptr_23.f90, which fail to link with -g.  The problems do not appear to be
associated with this patch, however.

My belief is that with this patch and corrections of the legacy style
testsuite cases, -fwhole-file could be finally made the default.

Paul  


-- 

pault at gcc dot gnu dot org changed:

   What|Removed |Added

 AssignedTo|unassigned at gcc dot gnu   |pault at gcc dot gnu dot org
   |dot org |
 Status|NEW |ASSIGNED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2010-05-20 Thread pault at gcc dot gnu dot org


--- Comment #11 from pault at gcc dot gnu dot org  2010-05-20 13:51 ---
(In reply to comment #10)
Am I right in thinking that -fwhole-file could be enabled by default, if this
PR were to be fixed?  (The appropriate changes in the testsuite would have to
be made too.)

Paul


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2010-05-16 Thread rguenth at gcc dot gnu dot org


--- Comment #6 from rguenth at gcc dot gnu dot org  2010-05-16 10:53 ---
(In reply to comment #5)
> A few comments:
>
> (1) adding -flto or -fwhopr solves the linking problem for the polyhedron
> tests and the reduced one in comment #1.
>
> (2) the test in comment #4 is different as it shows up for -fwhole-file and is
> not solved with -flto or -fwhopr.
>
> (3) I have been puzzled by the results in
> http://users.physik.fu-berlin.de/~tburnus/gcc-trunk/benchmark/ for
> fatigue.f90.
> It is due to -fwhole-program:
>
> [macbook] lin/test% gfc -O3 -ffast-math -fwhole-file -flto fatigue.f90
> [macbook] lin/test% time a.out
> ...
> 9.223u 0.004s 0:09.23 99.8% 0+0k 0+0io 0pf+0w
> [macbook] lin/test% gfc -O3 -ffast-math -fwhole-program fatigue.f90
> [macbook] lin/test% time a.out
> ...
> 6.482u 0.004s 0:06.49 99.8% 0+0k 0+0io 0pf+0w
>
> It would be interesting to understand why and to keep this nice speed up when
> fixing this pr.

-fwhole-program enables -fwhole-file.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2010-05-16 Thread dominiq at lps dot ens dot fr


--- Comment #7 from dominiq at lps dot ens dot fr  2010-05-16 11:00 ---
> -fwhole-program enables -fwhole-file.

Yes, but -fwhole-file does not enable -fwhole-program. All the polyhedron tests
pass with -fwhole-file (and say -O3 -ffast-math), but the test in comment #4
fails with -fwhole-file.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2010-05-16 Thread rguenther at suse dot de


--- Comment #8 from rguenther at suse dot de  2010-05-16 11:04 ---
Subject: Re:  -fwhole-file -fwhole-program: Wrong decls
 cause too much to be optimized away

On Sun, 16 May 2010, dominiq at lps dot ens dot fr wrote:

> --- Comment #7 from dominiq at lps dot ens dot fr  2010-05-16 11:00 ---
> > -fwhole-program enables -fwhole-file.
>
> Yes, but -fwhole-file does not enable -fwhole-program. All the polyhedron
> tests pass with -fwhole-file (and say -O3 -ffast-math), but the test in
> comment #4 fails with -fwhole-file.

-fwhole-file cannot enable -fwhole-program.  -fwhole-program says
to the optimizers that they do see the whole program - all callers
to functions defined in the current TU have to be visible (and
have correct callgraph edges, thus -fwhole-file).

You can't compare -fwhole-file numbers to -fwhole-program numbers.
-fwhole-file is a correctness option, w/o it the Frontend generates
an invalid representation for the middle-end.

Richard.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2010-05-16 Thread dominiq at lps dot ens dot fr


--- Comment #9 from dominiq at lps dot ens dot fr  2010-05-16 11:16 ---
> You can't compare -fwhole-file numbers to -fwhole-program numbers.
> -fwhole-file is a correctness option, w/o it the Frontend generates
> an invalid representation for the middle-end.

Well, from what I saw running the polyhedron tests, -fwhole-file is more than a
correctness option. I think it exposes more optimization opportunities to the
middle end, giving faster executable for ac, aermod, and doduc. Note that
adding -flto gives also some speed up for these tests. Due to this pr one
cannot test the effect of -fwhole-program on half the tests. However using it
for fatigue gives a quite large speed up I do not see for the seven other
tests.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2010-05-16 Thread rguenther at suse dot de


--- Comment #10 from rguenther at suse dot de  2010-05-16 11:21 ---
Subject: Re:  -fwhole-file -fwhole-program: Wrong decls
 cause too much to be optimized away

On Sun, 16 May 2010, dominiq at lps dot ens dot fr wrote:

> --- Comment #9 from dominiq at lps dot ens dot fr  2010-05-16 11:16 ---
> > You can't compare -fwhole-file numbers to -fwhole-program numbers.
> > -fwhole-file is a correctness option, w/o it the Frontend generates
> > an invalid representation for the middle-end.
>
> Well, from what I saw running the polyhedron tests, -fwhole-file is more than
> a correctness option. I think it exposes more optimization opportunities to
> the middle end, giving faster executable for ac, aermod, and doduc. Note that
> adding -flto gives also some speed up for these tests. Due to this pr one
> cannot test the effect of -fwhole-program on half the tests. However using it
> for fatigue gives a quite large speed up I do not see for the seven other
> tests.

It enables more optimization opportunities because calls inside the
unit are visible as such.  Without -fwhole-file nearly all calls
look like calls to external functions and local functions appear
unused.

Richard.
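
The effect described in this message, and the symptom in this PR's title, can
be illustrated with a toy call graph in C. This is only a sketch: the node
layout and all toy_* names are hypothetical and are not GCC's cgraph. If the
caller's edge points at a separate, body-less node for "three" instead of the
node that carries the body, the real body is never marked reachable and can be
discarded, leaving an undefined reference at link time.

```c
#include <assert.h>
#include <stddef.h>

/* Toy call graph; the node layout is hypothetical, not GCC's cgraph.  */
#define TOY_MAX_CALLEES 4

typedef struct toy_node {
  const char *name;
  int has_body;
  int reachable;
  struct toy_node *callees[TOY_MAX_CALLEES];
  size_t n_callees;
} toy_node;

/* Record a call edge from CALLER to CALLEE.  */
void toy_add_call (toy_node *caller, toy_node *callee)
{
  assert (caller->n_callees < TOY_MAX_CALLEES);
  caller->callees[caller->n_callees++] = callee;
}

/* Mark everything reachable from NODE by following call edges.  */
void toy_mark_reachable (toy_node *node)
{
  if (node->reachable)
    return;
  node->reachable = 1;
  for (size_t i = 0; i < node->n_callees; i++)
    toy_mark_reachable (node->callees[i]);
}

/* A body is kept only if its own node was reached.  */
int toy_body_survives (const toy_node *node)
{
  return node->has_body && node->reachable;
}

/* Returns 1 if the body of "three" survives.  With SHARE_DECL == 0 the
   caller's edge points at a separate body-less node, as in the bug.  */
int toy_demo (int share_decl)
{
  toy_node prog = { "prog", 1, 0, { NULL }, 0 };
  toy_node three_def = { "three", 1, 0, { NULL }, 0 };
  toy_node three_ext = { "three", 0, 0, { NULL }, 0 };

  toy_add_call (&prog, share_decl ? &three_def : &three_ext);
  toy_mark_reachable (&prog);
  return toy_body_survives (&three_def);
}
```

Here toy_demo (0) models the duplicated declaration, where the body is dropped,
and toy_demo (1) models a shared declaration, where the body is kept.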


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2010-05-15 Thread dominiq at lps dot ens dot fr


--- Comment #5 from dominiq at lps dot ens dot fr  2010-05-15 19:53 ---
A few comments:

(1) adding -flto or -fwhopr solves the linking problem for the polyhedron tests
and the reduced one in comment #1.

(2) the test in comment #4 is different as it shows up for -fwhole-file and is
not solved with -flto or -fwhopr.

(3) I have been puzzled by the results in
http://users.physik.fu-berlin.de/~tburnus/gcc-trunk/benchmark/ for fatigue.f90.
It is due to -fwhole-program:

[macbook] lin/test% gfc -O3 -ffast-math -fwhole-file -flto fatigue.f90
[macbook] lin/test% time a.out
...
9.223u 0.004s 0:09.23 99.8% 0+0k 0+0io 0pf+0w
[macbook] lin/test% gfc -O3 -ffast-math -fwhole-program fatigue.f90
[macbook] lin/test% time a.out
...
6.482u 0.004s 0:06.49 99.8% 0+0k 0+0io 0pf+0w

It would be interesting to understand why and to keep this nice speed up when
fixing this pr.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2009-09-22 Thread rguenth at gcc dot gnu dot org


--- Comment #3 from rguenth at gcc dot gnu dot org  2009-09-22 15:34 ---
Reconfirmed with current trunk.  If I move three to the top of the file I
even get

/tmp/ccGHHCCU.o: In function `main':
t.f90:(.text+0x1e): undefined reference to `three_'
t.f90:(.text+0x28): undefined reference to `three_'
collect2: ld returned 1 exit status

this makes -fwhole-file less useful than necessary.  LTO manages to fix up
the decls (thus, with -flto it builds and links fine with -fwhole-program).


-- 

rguenth at gcc dot gnu dot org changed:

   What|Removed |Added

   Last reconfirmed|2009-07-28 13:12:15 |2009-09-22 15:34:34
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2009-09-22 Thread rguenth at gcc dot gnu dot org


--- Comment #4 from rguenth at gcc dot gnu dot org  2009-09-22 15:42 ---
Similar testcase from PR40011

SUBROUTINE c()
 CALL a()
END SUBROUTINE c

SUBROUTINE a()
END SUBROUTINE a

MODULE M
CONTAINS
 SUBROUTINE b()
   CALL c()
 END SUBROUTINE
END MODULE

USE M
CALL b()
END

> gfortran -fwhole-file -O t.f90
/tmp/ccW7Uhc6.o: In function `__m_MOD_b':
t.f90:(.text+0xa): undefined reference to `c_'
collect2: ld returned 1 exit status


-- 

rguenth at gcc dot gnu dot org changed:

   What|Removed |Added

OtherBugsDependingO||40011
  nThis||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2009-07-28 Thread rguenth at gcc dot gnu dot org


--- Comment #2 from rguenth at gcc dot gnu dot org  2009-07-28 13:12 ---
We have the following cgraph nodes related to daxpy:

daxpy/17(-1) @0x75fc8700
  called by: dgesl/3 (1.00 per call) dgesl/3 (1.00 per call)
  calls:
dgefa/7(7) @0x75f7b100 174 time, 45 benefit 138 size, 36 benefit 20 bytes
stack usage reachable body finalized inlinable
  called by: linpk/9 (1.00 per call)
  calls: daxpy/4 (1.00 per call) dscal/5 (1.00 per call) idamax/6 (1.00 per
call)
daxpy/4(4) @0x75f66400 125 time, 56 benefit 125 size, 47 benefit reachable
body finalized inlinable
  called by: dgefa/7 (1.00 per call)
  calls:
dgesl/3(3) @0x75f45f00 219 time, 52 benefit 165 size, 43 benefit 24 bytes
stack usage reachable body finalized inlinable
  called by: linpk/9 (1.00 per call)
  calls: ddot/2 (1.00 per call) ddot/2 (1.00 per call) daxpy/17 (1.00 per call)
daxpy/17 (1.00 per call)

where daxpy/17 is the one without a body (not merged with daxpy/4), called
by dgesl.  The call in dgefa is inlined (as single remaining call to a
then reclaimable function).

Confirmed with Paul's latest patch applied.


-- 

rguenth at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever Confirmed|0   |1
   Keywords||wrong-code
   Last reconfirmed|-00-00 00:00:00 |2009-07-28 13:12:15
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873



[Bug fortran/40873] -fwhole-file -fwhole-program: Wrong decls cause too much to be optimized away

2009-07-27 Thread burnus at gcc dot gnu dot org


--- Comment #1 from burnus at gcc dot gnu dot org  2009-07-27 14:47 ---
(Note: -fwhole-program implies -fwhole-file; the -On option is required.)

Test case. Run with -O1 -fwhole-program.
Fails with: test.f90:(.text+0x1b): undefined reference to `three_'

  program prog
    call one()
    call two()
  end program prog
  subroutine one()
    call three()
  end subroutine one
  subroutine two()
    call three()
  end subroutine two
  subroutine three()
  end subroutine three


-- 

burnus at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||pault at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40873