[Bug middle-end/81441] slowdown due to -fpeel-loops and -ftracer added by -fprofile-use

2017-07-17 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81441

Joost VandeVondele  changed:

   What|Removed |Added

 CC||Joost.VandeVondele at mat dot ethz.ch

--- Comment #2 from Joost VandeVondele  
---
No, this is profile-use after a profile-generate, so standard PGO setup. The
profile-generate is using the same benchmark for generation, so coverage should
be good.

[Bug middle-end/81441] New: slowdown due to -fpeel-loops and -ftracer added by -fprofile-use

2017-07-14 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81441

Bug ID: 81441
   Summary: slowdown due to -fpeel-loops and -ftracer added by
-fprofile-use
   Product: gcc
   Version: 5.4.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

For our code, we see a slowdown (3%-7% depending on the user reporting) due to
the options -fpeel-loops and -ftracer added by default when using
-fprofile-use.

The code is stockfish, which is presumably the strongest open source chess
engine, and part of benchmark suites such as
https://openbenchmarking.org/test/pts/stockfish 

The same behaviour has been observed for gcc versions from 4.X to 7.1 so it is
not some recent regression and quite persistent. (Discussions in
https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/YzV_fG7ejR4 and
https://github.com/official-stockfish/Stockfish/pull/1165 )

It is not easy for me to pinpoint the location in the code that is affected
most (despite the code being only ~5000 lines of C++). I tried differential
profiling with perf, but didn't get profiles that made sense to me. 

It is easy to reproduce, by testing two successive git commits where the change
of options in the Makefile is the only difference:

git clone https://github.com/official-stockfish/Stockfish.git
cd Stockfish/src/

# version with -fprofile-use -fno-peel-loops -fno-tracer
# ==
git checkout c8e5384c3a4a5d9ac709c9b50954907a7f07109c
make clean && make -j ARCH=x86-64-modern profile-build
./stockfish bench 128 1 16 default depth 2>&1 | grep 'Total time (ms)'
# (locally reports Total time (ms) : 9947)

# version with just -fprofile-use
# ==
git checkout 0371a8f8c4a043cb3e7d08b5b8e7d08d49f28324
make clean && make -j ARCH=x86-64-modern profile-build
./stockfish bench 128 1 16 default depth 2>&1 | grep 'Total time (ms)'
# (locally reports Total time (ms) : 10456)

So '-fprofile-use -fno-peel-loops -fno-tracer' is 5% faster than
'-fprofile-use' in my case.

Let me know if I can provide more info. The length of the benchmarks can be
adjusted easily by changing the '16' in the bench command to smaller (shorter)
or larger (longer) numbers (time increases/decreases exponentially, change in
steps of 1 to have ~2x change).

[Bug target/80817] [missed optimization][x86] relaxed atomics

2017-05-22 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80817

--- Comment #3 from Joost VandeVondele  
---
If I compile with -m32

gcc -std=c++11 -m32 -S -O3  test.cpp

I get 

.cfi_startproc
subl    $12, %esp
.cfi_def_cfa_offset 16
movl    16(%esp), %ecx
fildq   (%ecx)
fistpq  (%esp)
movl    (%esp), %eax
movl    4(%esp), %edx
addl    $1, %eax
adcl    $0, %edx
movl    %eax, (%esp)
movl    %edx, 4(%esp)
fildq   (%esp)
fistpq  (%ecx)
addl    $12, %esp
.cfi_def_cfa_offset 4
ret
.cfi_endproc


Is the above expected ? This causes a measurable slowdown in the piece of code
I'm looking at.

[Bug target/80817] [missed optimization][x86] relaxed atomics

2017-05-20 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80817

Joost VandeVondele  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2017-05-20
 CC||Joost.VandeVondele at mat dot ethz.ch
 Ever confirmed|0   |1

[Bug target/80817] New: [missed optimization][x86] relaxed atomics

2017-05-18 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80817

Bug ID: 80817
   Summary: [missed optimization][x86] relaxed atomics
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

Using gcc 7.1 on x86, the following

#include <atomic>

void increment_relaxed(std::atomic<uint64_t>& counter) {

 atomic_store_explicit(&counter,
  atomic_load_explicit(&counter, std::memory_order_relaxed) + 1,
  std::memory_order_relaxed);
}

compiles to:

.cfi_startproc
movq    (%rdi), %rax
addq    $1, %rax
movq    %rax, (%rdi)
ret
.cfi_endproc

while I would expect that 

.cfi_startproc
addq    $1, (%rdi)
ret
.cfi_endproc

would be fine and more efficient. 

I also looked at 

atomic_fetch_add_explicit(&counter, uint64_t(1), std::memory_order_relaxed);

but that surprised me with

.cfi_startproc
lock addq   $1, (%rdi)
ret
.cfi_endproc
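
For comparison, the two forms discussed above can be put side by side. This is only a sketch: `increment_rmw` is a name made up here, and `std::uint64_t` is assumed as the counter type. The first function is a separate load and store, so a concurrent increment from another thread can be lost; the second is a single atomic read-modify-write, which is presumably why x86 uses the lock prefix for it even with relaxed ordering.

```cpp
#include <atomic>
#include <cstdint>

// Separate relaxed load and relaxed store. Each access is atomic on its own,
// but the increment as a whole is not: another thread may update the counter
// between the load and the store.
void increment_relaxed(std::atomic<std::uint64_t>& counter) {
    std::atomic_store_explicit(
        &counter,
        std::atomic_load_explicit(&counter, std::memory_order_relaxed) + 1,
        std::memory_order_relaxed);
}

// Atomic read-modify-write (hypothetical name): the whole increment is
// indivisible, regardless of the memory order requested.
void increment_rmw(std::atomic<std::uint64_t>& counter) {
    std::atomic_fetch_add_explicit(&counter, std::uint64_t(1),
                                   std::memory_order_relaxed);
}
```

Single-threaded, both functions give the same result; they only differ once several threads update the same counter.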

[Bug libfortran/51119] MATMUL slow for large matrices

2016-11-08 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51119

--- Comment #37 from Joost VandeVondele  
---
(In reply to Joost VandeVondele from comment #36)
> #pragma GCC optimize ( "-Ofast -fvariable-expansion-in-unroller
> -funroll-loops" )

and really beneficial for larger matrices would be 

-floop-nest-optimize

in particular the blocking (it would be an additional motivation for PR14741
and work on graphite in general), don't know if one can give the parameter for
the blocking. In principle the loop-nest-optimization, together with the -Ofast
(and ideally -march=native, which we can't have in libgfortran, I assume) would
yield near peak performance.

[Bug libfortran/51119] MATMUL slow for large matrices

2016-11-08 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51119

--- Comment #36 from Joost VandeVondele  
---
(In reply to Jerry DeLisle from comment #34)
> -Ofast does reorder execution.. 
> Opinions welcome.

That is absolutely OK for a matmul, and all techniques to get near peak
performance require that (e.g. use of fma, blocking, etc.). 

I didn't realize that one can easily put pragmas for single routines, so you
could experiment with something like 

#pragma GCC optimize ( "-Ofast -fvariable-expansion-in-unroller -funroll-loops" )
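
A minimal sketch of what a per-routine pragma could look like, assuming GCC. The `matmul` routine below is a naive stand-in written for illustration, not the libgfortran code, and the options "O3" and "unroll-loops" are chosen here only to keep the example conservative; the options quoted above would go in the same place. The push_options/pop_options pair limits the effect to the functions defined in between.

```cpp
#include <cassert>
#include <cstddef>

// GCC-specific: the options below apply only to functions defined between
// push_options and pop_options. Other compilers may ignore these pragmas.
#pragma GCC push_options
#pragma GCC optimize ("O3", "unroll-loops")

// Naive n x n matrix product c = a * b on row-major storage
// (illustrative stand-in, hypothetical name).
void matmul(const double* a, const double* b, double* c, std::size_t n) {
    assert(a != nullptr && b != nullptr && c != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k) {
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
}

#pragma GCC pop_options
```

Code outside the push/pop pair is still compiled with whatever options were given on the command line.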

[Bug fortran/68649] [6/7 Regression] note: code may be misoptimized unless -fno-strict-aliasing is used

2016-10-18 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68649

--- Comment #18 from Joost VandeVondele  
---
This PR and the related PR77278 can presumably only be fixed by changing the
libgfortran ABI (at least if I understand Richard's suggestion for fixing
this), so the announced major version bump of libgfortran
(https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01376.html) could be a good
opportunity for this change. It is the major thing holding back the use of LTO
with Fortran projects, I think.

[Bug tree-optimization/77719] [7 Regression] ICE in pp_string, at pretty-print.c:955

2016-09-25 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77719

--- Comment #6 from Joost VandeVondele  
---
(In reply to kugan from comment #5)
> Sent a patch to fix this at
> https://gcc.gnu.org/ml/gcc-patches/2016-09/msg01760.html.

Thanks, add this line before the first IF statement to silence the warnings:

  INTEGER :: isp,spdim,jsp,nsp

[Bug tree-optimization/77719] [7 Regression] ICE in pp_string, at pretty-print.c:955

2016-09-24 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77719

--- Comment #4 from Joost VandeVondele  
---
(In reply to Dominique d'Humieres from comment #3)
> I
> don't think the code is valid: spdim is an implicit real used uninitialized.

yeah, auto-reduced from valid code.

but thanks for confirming, BTW!

[Bug tree-optimization/77719] [7 Regression] ICE in pp_string, at pretty-print.c:955

2016-09-24 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77719

Joost VandeVondele  changed:

   What|Removed |Added

 CC||Joost.VandeVondele at mat dot ethz.ch

--- Comment #2 from Joost VandeVondele  
---
Why P4? It is a middle-end bug on valid code.

[Bug tree-optimization/77719] New: [7 Regression] ICE in pp_string, at pretty-print.c:955

2016-09-24 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77719

Bug ID: 77719
   Summary: [7 Regression] ICE in pp_string, at pretty-print.c:955
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

recent trunk regression:

> cat bug.f90
SUBROUTINE urep_egr(erep,derep,surr)
  INTEGER, PARAMETER :: dp=8
  REAL(dp), INTENT(inout)  :: erep, derep(3)
  REAL(dp), INTENT(in) :: surr(2)
  REAL(dp) :: de_z, rz
  IF (n_urpoly > 0) THEN
IF (r < spxr(1,1)) THEN
  ispg: DO isp = 1,spdim ! condition ca)
IF (isp /= spdim) THEN
  nsp = 5 ! condition cb
  DO jsp = 0,nsp
IF( jsp <= 3 ) THEN
ELSE
  erep = erep + surr(jsp-3)*rz**(jsp)
ENDIF
  END DO
END IF
  END DO ispg
END IF
  END IF
END SUBROUTINE urep_egr

> gfortran  -c -O3 -ffast-math bug.f90
[...]
in pp_string, at pretty-print.c:955
0x14506c6 pp_string
../../gcc/gcc/pretty-print.c:955
0x14506c6 pp_string(pretty_printer*, char const*)
../../gcc/gcc/pretty-print.c:953
0x14514e9 pp_format(pretty_printer*, text_info*)
../../gcc/gcc/pretty-print.c:597
0x14445f1 diagnostic_report_diagnostic(diagnostic_context*, diagnostic_info*)
../../gcc/gcc/diagnostic.c:941
0x1444e48 diagnostic_impl
../../gcc/gcc/diagnostic.c:1064
0x1444f74 internal_error(char const*, ...)
../../gcc/gcc/diagnostic.c:1349
0x9130f8 gimple_check_failed(gimple const*, char const*, int, char const*,
gimple_code, tree_code)
../../gcc/gcc/gimple.c:1177
0xd992d7 GIMPLE_CHECK2
../../gcc/gcc/gimple.h:73
0xd8a037 gimple_phi_arg
../../gcc/gcc/tree-phinodes.h:37
0xd8a037 gimple_phi_arg_imm_use_ptr
../../gcc/gcc/tree-phinodes.h:37
0xd8a037 op_iter_next_use
../../gcc/gcc/ssa-iterators.h:490
0xd8a037 link_use_stmts_after
../../gcc/gcc/ssa-iterators.h:902
0xd8a037 next_imm_use_stmt
../../gcc/gcc/ssa-iterators.h:955
0xd8a037 make_new_ssa_for_def
../../gcc/gcc/tree-ssa-reassoc.c:1167
0xd8d908 make_new_ssa_for_all_defs
../../gcc/gcc/tree-ssa-reassoc.c:1194
0xd8d908 zero_one_operation
../../gcc/gcc/tree-ssa-reassoc.c:1338
0xd95430 undistribute_ops_list
../../gcc/gcc/tree-ssa-reassoc.c:1684
0xd96178 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5393
0xd95fa7 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5528
0xd95fa7 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5528
Please submit a full bug report,

> gfortran -v
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/data/vjoost/gnu/gcc_trunk/install/libexec/gcc/x86_64-pc-linux-gnu/7.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --prefix=/data/vjoost/gnu/gcc_trunk/install
--enable-languages=c,c++,fortran --disable-multilib --enable-plugins
--enable-lto --disable-bootstrap
Thread model: posix
gcc version 7.0.0 20160924 (experimental) [trunk revision 240461] (GCC)

[Bug tree-optimization/77644] New: missed optimization with sqrt in comparison

2016-09-19 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77644

Bug ID: 77644
   Summary: missed optimization with sqrt in comparison
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

> cat t.f90
LOGICAL FUNCTION F1(A,B)
  REAL :: A,B
  F1=(abs(A)<sqrt(B))
END FUNCTION
LOGICAL FUNCTION F2(A,B)
  REAL :: A,B
  F2=(A*A<B)
END FUNCTION

LOGICAL FUNCTION F3(A,B)
  REAL :: A,B
  F3=sqrt(A*A)<sqrt(B*B)
END FUNCTION
LOGICAL FUNCTION F4(A,B)
  REAL :: A,B
  F4=(A*A)<(B*B)
END FUNCTION

In the testcase above F1 could be optimized as F2, and F3 as F4, at least with
-ffast-math, getting rid of a sqrt: a < b is equivalent to a*a < b*b if a>0
and b>0.
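
As a quick sanity check of the equivalence, here is a small C++ transcription of the four functions (a sketch, not the Fortran testcase itself; `forms_agree_on_grid` is a helper name invented here). It only compares the forms on values where the squares and square roots are exactly representable. For general inputs, rounding can make the two forms disagree in borderline cases, which is why the rewrite would need something like -ffast-math.

```cpp
#include <cassert>
#include <cmath>

// F1: compares |a| against sqrt(b).
bool f1(float a, float b) { return std::fabs(a) < std::sqrt(b); }
// F2: the sqrt-free form F1 could be rewritten to.
bool f2(float a, float b) { return a * a < b; }
// F3: compares two square roots of squares.
bool f3(float a, float b) { return std::sqrt(a * a) < std::sqrt(b * b); }
// F4: the sqrt-free form F3 could be rewritten to.
bool f4(float a, float b) { return (a * a) < (b * b); }

// Hypothetical helper: checks F1 against F2 and F3 against F4 on a grid of
// values whose squares and roots are exact in single precision.
bool forms_agree_on_grid() {
    const float roots[] = {0.0f, 0.5f, 1.0f, 1.5f, 2.0f, 3.0f, 4.0f};
    for (float a = -4.0f; a <= 4.0f; a += 0.5f) {
        for (float r : roots) {
            const float b = r * r;  // perfect square, so sqrt(b) == r exactly
            if (f1(a, b) != f2(a, b)) return false;
            if (f3(a, r) != f4(a, r)) return false;
        }
    }
    return true;
}
```

On this grid the forms agree, which says nothing about borderline inputs where the rounding of a*a or sqrt(b) matters.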

[Bug fortran/50259] Internal Error at (1): gfc_resolve_expr(): Bad expression type

2016-08-04 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50259

Joost VandeVondele  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 CC||Joost.VandeVondele at mat dot ethz.ch
 Resolution|--- |FIXED

--- Comment #7 from Joost VandeVondele  
---
confirmed fixed.

[Bug fortran/71961] [7 Regression] 178.galgel in SPEC CPU 2000 is miscompiled

2016-07-27 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71961

--- Comment #11 from Joost VandeVondele  
---
This even gives wrong results at -O0 ... 

> cat test.f90 
INTEGER, DIMENSION(:,:), POINTER :: a
INTEGER, DIMENSION(:,:), ALLOCATABLE :: b
ALLOCATE(a(4,4),b(4,2))
a=1 ; b=2
a(:,1:2)=MATMUL(a(:,1:4),b(:,:))
write(6,*) a
IF (ANY(a.NE.RESHAPE((/8,8,8,8,8,8,8,8,1,1,1,1,1,1,1,1/),(/4,4/)))) &
CALL ABORT
END

gives correct results with gcc 5.3

[Bug fortran/71961] [7 Regression] 178.galgel in SPEC CPU 2000 is miscompiled

2016-07-27 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71961

--- Comment #10 from Joost VandeVondele  
---
(In reply to Thomas Koenig from comment #9)
> With a test case, it would be OK with me if somebody reverted the
> patch. I can then rework it to take care of that particular bug.

A revert would be good I think.. this is a small testcase showing the wrong
results and the missing warning. I suspect it could be matmul specific.

> cat test.f90 
REAL, DIMENSION(:,:), POINTER :: a
REAL, DIMENSION(:,:), ALLOCATABLE :: b
ALLOCATE(a(4,4),b(4,2))
CALL RANDOM_NUMBER(a)
CALL RANDOM_NUMBER(b)
a(1:4,1:2)=MATMUL(a(1:4,1:4),b(1:4,1:2))
WRITE(6,*) a(1,1)
END

> gfortran -O0 -Warray-temporaries test.f90 ; ./a.out
test.f90:6:11:

 a(1:4,1:2)=MATMUL(a(1:4,1:4),b(1:4,1:2))
   1
Warning: Creating array temporary at (1) [-Warray-temporaries]
  0.770401359

> gfortran -O1 -Warray-temporaries test.f90 ; ./a.out
  0.515214324

[Bug fortran/71961] [7 Regression] 178.galgel in SPEC CPU 2000 is miscompiled

2016-07-27 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71961

Joost VandeVondele  changed:

   What|Removed |Added

 CC||Joost.VandeVondele at mat dot ethz.ch

--- Comment #8 from Joost VandeVondele  
---
also miscompiles CP2K, but haven't been able to narrow it down.

[Bug middle-end/71898] [7 Regression] [graphite] ICE: verify_ssa failed

2016-07-20 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71898

--- Comment #4 from Joost VandeVondele  
---
(In reply to Martin Liška from comment #2)
> Created attachment 38939 [details]
> Candidate patch

Since this is a graphite fix, it might also fix PR71351 ?

[Bug middle-end/71898] New: [7 Regression] [graphite] ICE: verify_ssa failed

2016-07-15 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71898

Bug ID: 71898
   Summary: [7 Regression] [graphite] ICE: verify_ssa failed
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

The following started regressing roughly July 7th/8th:

> cat bug.f90
MODULE d3_poly
INTEGER, PUBLIC, PARAMETER :: max_grad2=5
INTEGER, PUBLIC, PARAMETER :: max_grad3=3
INTEGER, PUBLIC, PARAMETER :: cached_dim2=(max_grad2+1)*(max_grad2+2)/2
INTEGER, PUBLIC, PARAMETER :: cached_dim3=(max_grad3+1)*(max_grad3+2)*(max_grad3+3)/6
INTEGER, SAVE, DIMENSION(3,cached_dim3) :: a_mono_exp3
INTEGER, SAVE, DIMENSION(cached_dim2,cached_dim2) :: a_mono_mult2
INTEGER, SAVE, DIMENSION(cached_dim3,cached_dim3) :: a_mono_mult3
INTEGER, SAVE, DIMENSION(4,cached_dim3) :: a_mono_mult3a
CONTAINS
SUBROUTINE init_d3_poly_module()
INTEGER  :: grad, i, ii, ij, j, subG
INTEGER, DIMENSION(3):: monoRes3
DO grad=0,max_grad2
DO i=grad,0,-1
DO j=grad-i,0,-1
END DO
END DO
END DO
DO ii=1,cached_dim3
DO ij=ii,cached_dim2
a_mono_mult2(ij,ii)=a_mono_mult2(ii,ij)
END DO
END DO
DO ii=1,cached_dim3
DO ij=ii,cached_dim3
monoRes3=a_mono_exp3(:,ii)+a_mono_exp3(:,ij)
a_mono_mult3(ii,ij)=mono_index3(monoRes3(1),monoRes3(2),monoRes3(3))+1
a_mono_mult3(ij,ii)=a_mono_mult3(ii,ij)
END DO
END DO
DO i=1,cached_dim3
   DO j=1,4
  a_mono_mult3a(j,i)=a_mono_mult3(j,i)
   END DO
END DO
END SUBROUTINE
PURE FUNCTION mono_index3(i,j,k) RESULT(res)
INTEGER, INTENT(in)  :: i, j, k
res=grad*(grad+1)*(grad+2)/6+(sgrad)*(sgrad+1)/2+k
END FUNCTION
END MODULE d3_poly

> gfortran -c -floop-nest-optimize -O1 bug.f90
bug.f90:11:0:

 SUBROUTINE init_d3_poly_module()

Error: definition in block 74 follows the use
for SSA_NAME: _56 in statement:
_104 = _56 + graphite_IV.7_99;
bug.f90:11:0: internal compiler error: verify_ssa failed
0xdd4efc verify_ssa(bool, bool)
../../gcc/gcc/tree-ssa.c:1039
0xd24a41 verify_loop_closed_ssa(bool)
../../gcc/gcc/tree-ssa-loop-manip.c:736
0x131afa9 checking_verify_loop_closed_ssa
../../gcc/gcc/tree-ssa-loop-manip.h:35
0x131afa9 graphite_verify
../../gcc/gcc/graphite-isl-ast-to-gimple.c:98
0x131afa9 graphite_regenerate_ast_isl(scop*)
../../gcc/gcc/graphite-isl-ast-to-gimple.c:3205
0x13124c3 graphite_transform_loops()
../../gcc/gcc/graphite.c:329
0x1312990 graphite_transforms
../../gcc/gcc/graphite.c:356
0x1312990 execute
../../gcc/gcc/graphite.c:433
Please submit a full bug report,

[Bug web/69601] current/ redirect is off by at least a day

2016-07-01 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69601

Joost VandeVondele  changed:

   What|Removed |Added

   Last reconfirmed|2016-06-26 00:00:00 |2016-7-1

--- Comment #4 from Joost VandeVondele  
---
and also in July...

[Bug debug/71642] New: [7 Regression] ICE: in gen_type_die_with_usage, at dwarf2out.c:22729

2016-06-24 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71642

Bug ID: 71642
   Summary: [7 Regression] ICE:  in gen_type_die_with_usage, at
dwarf2out.c:22729
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: debug
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

overnight trunk regression:

> cat bug.f90
MODULE gauss_colloc
  INTEGER, PARAMETER :: dp=8
CONTAINS
SUBROUTINE collocGauss(h,h_inv,grid,poly,alphai,posi,max_r2,&
periodic,gdim,local_bounds,local_shift,poly_shift,scale,lgrid,error)
REAL(dp), DIMENSION(0:, 0:, 0:), &
  INTENT(inout)  :: grid
INTEGER, INTENT(inout), OPTIONAL :: lgrid
CONTAINS
SUBROUTINE kloop6
IF (kJump/=1 .AND. (ikstart+kmax-kstart>=ndim(2)+l_shift(2) .OR.&
ikstart2+kmin-kstart2<=l_ub(2)-ndim(2))) THEN
DO
DO k=kstart2,kend2,-1
IF ( PRESENT ( lgrid ) ) THEN
  grid(ik,ij,ii) = grid(ik,ij,ii) + p_v*res_k
END IF
END DO
END DO
END IF
END SUBROUTINE
END SUBROUTINE
END MODULE gauss_colloc

> gfortran -c -g bug.f90
bug.f90:21:0:

 END SUBROUTINE

internal compiler error: in gen_type_die_with_usage, at dwarf2out.c:22729
0x826d80 gen_type_die_with_usage
../../gcc/gcc/dwarf2out.c:22729
0x825fa5 gen_type_die_with_usage
../../gcc/gcc/dwarf2out.c:22811
0x827246 gen_type_die
../../gcc/gcc/dwarf2out.c:22907
0x820c1b gen_decl_die
../../gcc/gcc/dwarf2out.c:23522
0x82239c process_scope_var
../../gcc/gcc/dwarf2out.c:23029
0x822607 decls_for_scope
../../gcc/gcc/dwarf2out.c:23054
0x823177 gen_subprogram_die
../../gcc/gcc/dwarf2out.c:20773
0x820e5c gen_decl_die
../../gcc/gcc/dwarf2out.c:23476
0x821bce dwarf2out_decl
../../gcc/gcc/dwarf2out.c:23959
0x821f69 dwarf2out_early_global_decl
../../gcc/gcc/dwarf2out.c:23632
0x7b0bd8 symbol_table::finalize_compilation_unit()
../../gcc/gcc/cgraphunit.c:2557
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.


> gfortran -v 
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/data/vjoost/gnu/gcc_trunk/install/libexec/gcc/x86_64-pc-linux-gnu/7.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --prefix=/data/vjoost/gnu/gcc_trunk/install
--enable-languages=c,c++,fortran --disable-multilib --enable-plugins
--enable-lto --disable-bootstrap
Thread model: posix
gcc version 7.0.0 20160624 (experimental) [trunk revision 237753] (GCC)

[Bug middle-end/71526] New: [7 Regression] ICE: verify_gimple failed

2016-06-14 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71526

Bug ID: 71526
   Summary: [7 Regression] ICE: verify_gimple failed
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

Overnight trunk regression, requires LTO.

> cat bug.f90
MODULE util
  INTERFACE sort
 MODULE PROCEDURE sort_cv
  END INTERFACE
CONTAINS
  SUBROUTINE sort_cv ( arr, n, index )
CHARACTER(LEN=*), INTENT(INOUT)  :: arr(1:n)
INTEGER, INTENT(OUT) :: INDEX(1:n)
INTEGER, ALLOCATABLE, DIMENSION(:, :):: entries
ALLOCATE(entries(max_length,SIZE(arr)))
  END SUBROUTINE sort_cv
END MODULE util
USE util
INTEGER, ALLOCATABLE :: ind(:)
character(len=3), ALLOCATABLE :: d(:)
CALL sort(d,N,ind)
END

> gfortran -fno-inline -flto -O2 bug.f90
bug.f90: In function ‘sort_cv.constprop’:
bug.f90:6:0: error: non-trivial conversion at assignment
   SUBROUTINE sort_cv ( arr, n, index )

logical(kind=4)
bool
_60 = _37;
bug.f90:6:0: error: type mismatch in binary expression
logical(kind=4)

bool

logical(kind=4)

_68 = _37 | _67;
bug.f90:6:0: internal compiler error: verify_gimple failed
0xaaada4 verify_gimple_in_cfg(function*, bool)
../../gcc/gcc/tree-cfg.c:5212
0x9908ac execute_function_todo
../../gcc/gcc/passes.c:1964
0x99135b execute_todo
../../gcc/gcc/passes.c:2016
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
lto-wrapper: fatal error: gfortran returned 1 exit status
compilation terminated.
/data/vjoost/gnu/binutils-2.23.2/install/bin/ld: lto-wrapper failed
collect2: error: ld returned 1 exit status

> gfortran -v
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/data/vjoost/gnu/gcc_trunk/install/libexec/gcc/x86_64-pc-linux-gnu/7.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --prefix=/data/vjoost/gnu/gcc_trunk/install
--enable-languages=c,c++,fortran --disable-multilib --enable-plugins
--enable-lto --disable-bootstrap
Thread model: posix
gcc version 7.0.0 20160614 (experimental) [trunk revision 237423] (GCC)

[Bug tree-optimization/71252] [7 Regression] ICE: verify_ssa failed : definition in block 7 does not dominate use in block 6

2016-06-10 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71252

Joost VandeVondele  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #24 from Joost VandeVondele  
---
yes, afaik.

[Bug tree-optimization/71414] 2x slower than clang summing small float array, GCC should consider larger vectorization factor for "unrolling" reductions

2016-06-07 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71414

Joost VandeVondele  changed:

   What|Removed |Added

 CC||Joost.VandeVondele at mat dot ethz.ch

--- Comment #6 from Joost VandeVondele  
---
Isn't this a case where -fvariable-expansion-in-unroller is helpful ?

> gcc -Ofast t.c -lrt ; ./a.out
285.670206

> gcc -Ofast -funroll-loops -fvariable-expansion-in-unroller  t.c -lrt ; ./a.out
151.246083

> gcc -Ofast -funroll-loops  t.c -lrt ; ./a.out
277.047507

There is some relation with PR25621 I think.
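
To illustrate the idea behind variable expansion: a reduction with a single accumulator forms one long dependency chain, while several independent accumulators let the additions overlap. The sketch below does the expansion by hand. Both function names are made up for this example, and it shows the transformation conceptually, not what GCC actually emits.

```cpp
#include <cassert>
#include <cstddef>

// Single accumulator: every addition has to wait for the previous one.
float sum_single(const float* x, std::size_t n) {
    assert(x != nullptr || n == 0);
    float s = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        s += x[i];
    }
    return s;
}

// Four independent accumulators, combined after the unrolled loop.
// This reorders the floating-point additions, so for general inputs the
// result may differ slightly from sum_single.
float sum_expanded(const float* x, std::size_t n) {
    assert(x != nullptr || n == 0);
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += x[i];
        s1 += x[i + 1];
        s2 += x[i + 2];
        s3 += x[i + 3];
    }
    float s = (s0 + s1) + (s2 + s3);
    for (; i < n; ++i) {  // remainder elements
        s += x[i];
    }
    return s;
}
```

Because the additions are reordered, a compiler can only do this on its own when reassociation is allowed, as it is with -Ofast.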

[Bug tree-optimization/71351] [7 Regression] ICE: Segmentation fault

2016-05-31 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71351

Joost VandeVondele  changed:

   What|Removed |Added

 CC||Joost.VandeVondele at mat dot ethz.ch

--- Comment #2 from Joost VandeVondele  
---
simplified:

> cat bug.f90
  SUBROUTINE print_crys_symmetry(nc,v)
INTEGER :: nc
REAL(KIND=8), DIMENSION(3,48) :: v
INTEGER  :: n,i
vs = 0.0_8
DO n = 1, nc 
   DO i = 1, 3
  vs = vs + ABS(v(i,n))
   END DO
END DO
CALL foo(vs)
  END SUBROUTINE print_crys_symmetry

> gfortran  -c -O2 -floop-nest-optimize bug.f90
bug.f90:1:0:

   SUBROUTINE print_crys_symmetry(nc,v)

internal compiler error: Segmentation fault
0xba222f crash_signal
../../gcc/gcc/toplev.c:333
0xbf6247 ssa_default_def(function*, tree_node*)
../../gcc/gcc/tree-dfa.c:305
0xbf87a8 get_or_create_ssa_default_def(function*, tree_node*)
../../gcc/gcc/tree-dfa.c:357
0xc33ab3 get_reaching_def
../../gcc/gcc/tree-into-ssa.c:1172
0xc33ab3 get_reaching_def

[Bug tree-optimization/71142] [6/7 Regression] ICE: Segmentation fault in ssa_default_def

2016-05-31 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71142

Joost VandeVondele  changed:

   What|Removed |Added

 CC||Joost.VandeVondele at mat dot ethz.ch

--- Comment #2 from Joost VandeVondele  
---
Doesn't fail for me on x86, but similar to PR71351, which does fail.

[Bug tree-optimization/71351] New: [7 Regression] ICE: Segmentation fault

2016-05-31 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71351

Bug ID: 71351
   Summary: [7 Regression] ICE: Segmentation fault
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

recent trunk regression:

> cat bug.f90
MODULE k290
  INTEGER, PARAMETER :: dp=8
  TYPE csym_type
  INTEGER   :: isy, nc
  REAL(KIND=dp), DIMENSION(3,48) :: v
  END TYPE csym_type
CONTAINS
  SUBROUTINE print_crys_symmetry(csym)
TYPE(csym_type), POINTER  :: csym
INTEGER  :: n,i, unit
vs = 0.0_dp
DO n = 1, csym%nc
   DO i = 1, 3
  vs = vs + ABS(csym%v(i,n))
   END DO
END DO
IF (csym%isy==0) THEN
   WRITE (unit,*) &
' (sum of translation vectors=', vs, ')'
END IF
IF (indpg>0) THEN
   CALL xstring(pgrp(-indpg),i,j)
END IF
  END SUBROUTINE print_crys_symmetry
END MODULE k290

> gfortran  -c -O2 -floop-nest-optimize bug.f90
bug.f90:8:0:

   SUBROUTINE print_crys_symmetry(csym)

internal compiler error: Segmentation fault
0xba222f crash_signal
../../gcc/gcc/toplev.c:333
0xbf6247 ssa_default_def(function*, tree_node*)
../../gcc/gcc/tree-dfa.c:305
0xbf87a8 get_or_create_ssa_default_def(function*, tree_node*)
../../gcc/gcc/tree-dfa.c:357
0xc33ab3 get_reaching_def
../../gcc/gcc/tree-into-ssa.c:1172
0xc33ab3 get_reaching_def
../../gcc/gcc/tree-into-ssa.c:1159
0xc340e7 rewrite_update_phi_arguments
../../gcc/gcc/tree-into-ssa.c:2025
0xc340e7 rewrite_update_dom_walker::before_dom_children(basic_block_def*)
../../gcc/gcc/tree-into-ssa.c:2145
0xc340e7 rewrite_update_dom_walker::before_dom_children(basic_block_def*)
../../gcc/gcc/tree-into-ssa.c:2078
0x12d2330 dom_walker::walk(basic_block_def*)
../../gcc/gcc/domwalk.c:265
0xc2fc67 rewrite_blocks
../../gcc/gcc/tree-into-ssa.c:2202
0xc372f8 update_ssa(unsigned int)
../../gcc/gcc/tree-into-ssa.c:3364
0x12fd6ea graphite_regenerate_ast_isl(scop*)
../../gcc/gcc/graphite-isl-ast-to-gimple.c:3203
0x12f4cb3 graphite_transform_loops()
../../gcc/gcc/graphite.c:329
0x12f5180 graphite_transforms
../../gcc/gcc/graphite.c:356
0x12f5180 execute
../../gcc/gcc/graphite.c:433
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.

[Bug tree-optimization/71230] [7 Regression] ICE : in zero_one_operation, at tree-ssa-reassoc.c:1230

2016-05-31 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71230

Joost VandeVondele  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|FIXED   |---

--- Comment #17 from Joost VandeVondele  
---
This seems not yet fully fixed. While comment #10 doesn't fail any more in its
reduced form, I still get:

> cat bug.f90
MODULE ai_coulomb_test
  INTEGER, PARAMETER :: dp=8
  INTERFACE
  SUBROUTINE g2gemint(intabc,la_max,npgfa,zeta,a,lb_max,npgfb,zetb,b,&
  lr_max,ls_max,ngemc,zetc,c,nderivative)
  INTEGER, PARAMETER :: dp=8
REAL(KIND=dp), &
  DIMENSION(:, :, :, :, :, :), &
  INTENT(INOUT)  :: intabc
REAL(KIND=dp), DIMENSION(:), INTENT(IN)  :: zeta, a
REAL(KIND=dp), DIMENSION(:), INTENT(IN)  :: zetb, b
REAL(KIND=dp), DIMENSION(:, :, :), &
  INTENT(IN) :: zetc
REAL(KIND=dp), DIMENSION(:), INTENT(IN)  :: c
  END SUBROUTINE
  END INTERFACE
  PRIVATE
  PUBLIC :: eri_test
CONTAINS
  SUBROUTINE eri_test (iw)
IF ( iw>0 ) THEN
   DO l=0,lmax
 WRITE(iw,'(A,T40,A,T66,F15.3)') " Performance [Mintegrals/s] ",i2c(l),perf
   END DO
END IF
CALL geminal_test2 (iw)
CALL geminal_test4 (iw)
  END SUBROUTINE eri_test
  SUBROUTINE geminal_test2 (iw)
DO ma=0,la
  DO mb=0,lb
DO mc=0,lc
  DO md=0,ld
DO iax=0,ma
  DO iay=0,ma-iax
DO ibx=0,mb
  DO iby=0,mb-ibx
DO icx=0,mc
  DO icy=0,mc-icx
DO idx=0,md
  DO idy=0,md-idx
res1=os(na,nb,nc,nd)
  END DO
END DO
  END DO
END DO
  END DO
END DO
  END DO
END DO
  END DO
END DO
  END DO
END DO
  END SUBROUTINE geminal_test2
  SUBROUTINE geminal_test4 (iw)
REAL(KIND=dp):: d1, da, db, dc, delta, dmax, &
xa, xb, xc, xd, xr, xs
REAL(KIND=dp), ALLOCATABLE, &
  DIMENSION(:, :, :, :, :, :):: iabc1m, iabc1p, iabc2m, &
iabc2p, iabc3m, iabc3p, iabcd
REAL(KIND=dp), DIMENSION(2, 2, 1):: za, zb
REAL(KIND=dp), DIMENSION(3)  :: a, b, c, d
REAL(KIND=dp), DIMENSION(6)  :: ra, rb
CALL g2gemint(iabc2p,la,1,(/xa/),a,lc,1,(/xb/),c,llb,llb,1,zb,rb,0)
CALL g2gemint(iabc2m,la,1,(/xa/),a,lc,1,(/xb/),c,llb,llb,1,zb,rb,0)
iabc2p = (iabc2p-iabc2m)/delta
  END SUBROUTINE geminal_test4
END MODULE ai_coulomb_test

> gfortran  -c -g -O3 -ffast-math  -fprefetch-loop-arrays  bug.f90

bug.f90:20:0:

   SUBROUTINE eri_test (iw)

in pp_string, at pretty-print.c:937
0x13ee657 pp_string
../../gcc/gcc/pretty-print.c:937
0x13ee657 pp_string(pretty_printer*, char const*)
../../gcc/gcc/pretty-print.c:935
0x13ef1d6 pp_format(pretty_printer*, text_info*)
../../gcc/gcc/pretty-print.c:579
0x13ea0a1 diagnostic_report_diagnostic(diagnostic_context*, diagnostic_info*)
../../gcc/gcc/diagnostic.c:823
0x13eb905 internal_error(char const*, ...)
../../gcc/gcc/diagnostic.c:1258
0x8f9b98 gimple_check_failed(gimple const*, char const*, int, char const*,
gimple_code, tree_code)
../../gcc/gcc/gimple.c:1174
0xd6a90b GIMPLE_CHECK2
../../gcc/gcc/gimple.h:73
0xd5f267 zero_one_operation
../../gcc/gcc/tree-ssa-reassoc.c:1250
0xd66cc8 undistribute_ops_list
../../gcc/gcc/tree-ssa-reassoc.c:1604
0xd67758 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5252
0xd67597 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5382
0xd67597 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5382
0xd67597 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5382
0xd67597 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5382
0xd67597 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5382
0xd67597 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5382
0xd69d73 do_reassoc
../../gcc/gcc/tree-ssa-reassoc.c:5496
0xd69d73 execute_reassoc
../../gcc/gcc/tree-ssa-reassoc.c:5583
0xd69d73 execute
../../gcc/gcc/tree-ssa-reassoc.c:5622
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.

[Bug tree-optimization/71252] [7 Regression] ICE: verify_ssa failed : definition in block 7 does not dominate use in block 6

2016-05-25 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71252

--- Comment #11 from Joost VandeVondele ---
(In reply to Joost VandeVondele from comment #10)
> I had a single file version of it, I'll try to recreate that once for our
> current version.

To use and compile the single file version try this:

wget https://www.dropbox.com/s/18oi02srbot3h9p/cp2k_single_file.f90.gz
gunzip cp2k_single_file.f90.gz
gfortran cp2k_single_file.f90 -llapack -lblas

It takes a few minutes and about 6 GB to compile.

[Bug tree-optimization/71252] [7 Regression] ICE: verify_ssa failed : definition in block 7 does not dominate use in block 6

2016-05-25 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71252

--- Comment #10 from Joost VandeVondele ---
(In reply to kugan from comment #9)
> What application is this testcase from? I have a patch which I want to try.

This is from the CP2K code we develop; it is available from

https://www.cp2k.org/download

and is part of a few Linux distros. Setting up the compilation takes a little
work, but it serves as a good testcase :-)

I had a single file version of it, I'll try to recreate that once for our
current version.

[Bug tree-optimization/71252] [7 Regression] ICE: verify_ssa failed : definition in block 7 does not dominate use in block 6

2016-05-25 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71252

--- Comment #7 from Joost VandeVondele ---
The following testcase is slightly different in that it leads to a segfault:

> cat bug.f90
MODULE xc_pbe
  INTEGER, PARAMETER :: dp=8
CONTAINS
SUBROUTINE pbe_lsd_calc(rhoa, rhob, norm_drho, norm_drhoa, norm_drhob,&
 e_0, e_ra, e_rb, e_ra_ndra, e_rb_ndrb, e_ndr_ndr,&
 e_ndra, e_ndrb, e_ra_ra, e_ra_rb, e_rb_rb, e_ra_ndr, e_rb_ndr,&
 grad_deriv,npoints,epsilon_rho,epsilon_drho,param,scale_ec,scale_ex,error)
REAL(kind=dp), DIMENSION(*), INTENT(inout) :: e_0, e_ra, e_rb, e_ra_ndra, &
  e_ra_ra, e_ra_rb, e_rb_rb, e_ra_ndr, e_rb_ndr
INTEGER, INTENT(in)  :: grad_deriv, npoints
REAL(kind=dp) :: A, A1rhoa, A1rhob, A_1, A_2, A_3, alpha_1_1, alpha_1_2, &
  t789, t79, t795, t798, t8, t80, t801, t812, t82, t820, t821
  SELECT CASE(grad_deriv)
  CASE default
 DO ii=1,npoints
IF (my_rho>epsilon_rho) THEN
   IF (grad_deriv>=2.OR.grad_deriv==-2) THEN
  k_s1rhoa = k_srhoa
  t801 = t96 * t798 * k_srhoa / 0.2e1_dp
  trhoarhoa = t775 * t776 * phi1rhoa + t779 * t776 * k_s1rhoa / &
   0.2e1_dp + t785 - t269 * t98 * phirhoarhoa / 0.2e1_dp + t779 * t789 &
   * phi1rhoa / 0.2e1_dp + t795 * t789 * k_s1rhoa + t801 - t96 * t274 *&
   0.2e1_dp + t269 * t277 * phi1rhoa / 0.2e1_dp + t96 * t798 * k_s1rhoa&
   / 0.2e1_dp + t812
  t959 = t908 + t911 + t914 + t917 + t919 + 0.2e1_dp * A1rhoa * t116&
   * Arhoa + 0.8e1_dp * t944 * Arhoa * t1rhoa + 0.2e1_dp * t314 * &
   trhoa * t1rhoa + 0.4e1_dp * t318 * trhoarhoa
  t962 = 0.2e1_dp * t101 * t1rhoa * t299 + 0.2e1_dp * t297 * t868 * &
   t321 + 0.2e1_dp * t310 * t936 * t321 * t876 - t310 * t313 * t959
  e_ra_ra(ii) = e_ra_ra(ii)+&
   scale_ex * (0.2e1_dp * ex_unif_a1rhoa * Fx_a + &
   my_rho * (epsilon_c_unifrhoarhoa + 0.6e1_dp * t858 * t294 * phi1rhoa +&
   phirhoarhoa + 0.3e1_dp * t293 * t326 * phi1rhoa + t110 * t962 * t325&
   - t110 * t967 * t879))
  trhoarhob = t775 * t776 * phirhob + t779 * t776 * k_srhob / &
   scale_ex * (0.2e1_dp * ex_unif_b * &
   Fx_bnorm_drhob + 0.2e1_dp * t481 * Fx_bnorm_drhob + 0.2e1_dp * t156 &
   * (-0.8e1_dp * t1712 * t1713 * s_bnorm_drhob + 0.2e1_dp * t477 * &
   s_bnorm_drhob * s_brhob + 0.2e1_dp * t477 * s_b * (-t467 * t148 * &
   kf_brhob / 0.2e1_dp - t146 * t472 / 0.2e1_dp))) / 0.2e1_dp
   END IF
END IF
 END DO
  END SELECT
END SUBROUTINE pbe_lsd_calc
END MODULE xc_pbe


> gfortran  -c -O2 -ffast-math bug.f90
bug.f90:4:0:

 SUBROUTINE pbe_lsd_calc(rhoa, rhob, norm_drho, norm_drhoa, norm_drhob,&

internal compiler error: Segmentation fault
0xba068f crash_signal
../../gcc/gcc/toplev.c:333
0x904906 bb_seq_addr
../../gcc/gcc/gimple.h:1655
0x904906 gsi_start_bb
../../gcc/gcc/gimple-iterator.h:129
0x904906 gsi_for_stmt(gimple*)
../../gcc/gcc/gimple-iterator.c:617
0xd57ff2 insert_stmt_after
../../gcc/gcc/tree-ssa-reassoc.c:1323
0xd59cd5 build_and_add_sum
../../gcc/gcc/tree-ssa-reassoc.c:1392
0xd5b37e rewrite_expr_tree_parallel
../../gcc/gcc/tree-ssa-reassoc.c:4128
0xd65296 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5339
0xd649c7 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5391

[Bug tree-optimization/71252] [7 Regression] ICE: verify_ssa failed : definition in block 7 does not dominate use in block 6

2016-05-25 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71252

--- Comment #6 from Joost VandeVondele ---
reduced testcase

> cat bug.f90
MODULE xc_pbe
  INTEGER, PARAMETER :: dp=8
  PRIVATE
  PUBLIC :: pbe_lda_info, pbe_lsd_info, pbe_lda_eval, pbe_lsd_eval
CONTAINS
SUBROUTINE pbe_lsd_eval(rho_set,deriv_set,grad_deriv,pbe_params)
INTEGER, INTENT(in)  :: grad_deriv
INTEGER  :: handle, npoints, param, stat
LOGICAL  :: failure
REAL(kind=dp):: epsilon_drho, epsilon_rho, &
scale_ec, scale_ex
REAL(kind=dp), DIMENSION(:, :, :), POINTER :: dummy, e_0, e_ndr, &
  e_ndr_ndr, e_ndr_ra, e_ndr_rb, e_ndra, e_ndra_ndra, e_ndra_ra, e_ndrb, &
  e_ndrb_ndrb, e_ndrb_rb, e_ra, e_ra_ra, e_ra_rb, e_rb, e_rb_rb, &
  norm_drho, norm_drhoa, norm_drhob, rhoa, rhob
  IF (.NOT. failure) THEN
 CALL pbe_lsd_calc(&
  rhoa=rhoa, rhob=rhob, norm_drho=norm_drho, norm_drhoa=norm_drhoa,&
  norm_drhob=norm_drhob, e_0=e_0, e_ra=e_ra, e_rb=e_rb,&
  e_ra_ndra=e_ndra_ra,&
  e_rb_ndrb=e_ndrb_rb, e_ndr_ndr=e_ndr_ndr,&
  e_ndra_ndra=e_ndra_ndra, e_ndrb_ndrb=e_ndrb_ndrb, e_ndr=e_ndr,&
  e_ndra=e_ndra, e_ndrb=e_ndrb, e_ra_ra=e_ra_ra, &
  e_ra_rb=e_ra_rb, e_rb_rb=e_rb_rb, e_ra_ndr=e_ndr_ra,&
  e_rb_ndr=e_ndr_rb,&
  grad_deriv=grad_deriv, npoints=npoints, &
  epsilon_rho=epsilon_rho,epsilon_drho=epsilon_drho,&
  param=param,scale_ec=scale_ec,scale_ex=scale_ex)
  END IF
END SUBROUTINE pbe_lsd_eval
SUBROUTINE pbe_lsd_calc(rhoa, rhob, norm_drho, norm_drhoa, norm_drhob,&
 e_0, e_ra, e_rb, e_ra_ndra, e_rb_ndrb, e_ndr_ndr,&
 e_ndra_ndra, e_ndrb_ndrb, e_ndr,&
 e_ndra, e_ndrb, e_ra_ra, e_ra_rb, e_rb_rb, e_ra_ndr, e_rb_ndr,&
 grad_deriv,npoints,epsilon_rho,epsilon_drho,param,scale_ec,scale_ex)
REAL(kind=dp), DIMENSION(*), INTENT(in)  :: rhoa, rhob, norm_drho, &
norm_drhoa, norm_drhob
REAL(kind=dp), DIMENSION(*), INTENT(inout) :: e_0, e_ra, e_rb, e_ra_ndra, &
  e_rb_ndrb, e_ndr_ndr, e_ndra_ndra, e_ndrb_ndrb, e_ndr, e_ndra, e_ndrb, &
  e_ra_ra, e_ra_rb, e_rb_rb, e_ra_ndr, e_rb_ndr
INTEGER, INTENT(in)  :: grad_deriv, npoints
REAL(kind=dp), INTENT(in):: epsilon_rho, epsilon_drho
INTEGER, INTENT(in)  :: param
REAL(kind=dp), INTENT(in):: scale_ec, scale_ex
REAL(kind=dp) :: epsilon_c_unifrhoarhob, epsilon_c_unifrhob, &
  t7, t70, t705, t708, t71, t711, t72, t726, t73, t733, t736, t74, t745, &
  trhob, trhobnorm_drho, trhobrhob
  SELECT CASE(grad_deriv)
  CASE default
 DO ii=1,npoints
IF (my_rho>epsilon_rho) THEN
   IF (grad_deriv>=2.OR.grad_deriv==-2) THEN
  alpha_c1rhoa = alpha_crhoa
  f1rhoa = frhoa
  t745 = -0.4e1_dp * t77 * t245 * chirhoarhoa + (-0.2e1_dp * t194 * &
   t212 + t626 * t596 * t628 * t34 / 0.2e1_dp - e_c_u_0rhoarhoa) * f * &
   t733 * t254 + 0.4e1_dp * t736 * t254 + 0.12e2_dp * t85 * t79 * t682 &
   + 0.4e1_dp * t85 * t244 * chirhoarhoa
  epsilon_c_unifrhoarhoa = e_c_u_0rhoarhoa + (0.2e1_dp * t215 * &
   t233 - t674 * t644 * t676 * t52 / 0.2e1_dp) * f * t82 + alpha_crhoa &
   * f1rhoa * t82 - 0.4e1_dp * t240 * t246 + alpha_c1rhoa * frhoa * t82&
   + t745
  Arhoarhoa = 0.2e1_dp * t820 * t822 * t828 - t101 * t281 * (&
   0.3e1_dp * t102 * t285 * phirhoarhoa) * t107 - t851 * t289 * t828 * &
   t321 + 0.2e1_dp * t310 * t936 * t321 * t876 - t310 * t313 * t959
  e_ra_ra(ii) = e_ra_ra(ii)+&
   scale_ex * (0.2e1_dp * ex_unif_a1rhoa * Fx_a + &
   my_rho * (epsilon_c_unifrhoarhoa + 0.6e1_dp * t858 * t294 * phi1rhoa +&
   - t110 * t967 * t879))
  t1674 = t1630 + t1632 + t1635 + t1638 + t1640 + 0.2e1_dp * A1rhob &
   * Arhobrhob + 0.8e1_dp * t944 * trhob * A1rhob + 0.12e2_dp * t953 * &
   scale_ex * (0.2e1_dp * ex_unif_a * &
   Fx_anorm_drhoa + 0.2e1_dp * t350 * Fx_anorm_drhoa + 0.2e1_dp * t140 &
   * (-0.8e1_dp * t1000 * t1001 * s_anorm_drhoa + 0.2e1_dp * t346 * &
   s_anorm_drhoa * s_arhoa + 0.2e1_dp * t346 * s_a * (-t336 * t131 * &
   kf_arhoa / 0.2e1_dp - t129 * t341 / 0.2e1_dp))) / 0.2e1_dp
   END IF
END IF
 END DO
  END SELECT
END SUBROUTINE pbe_lsd_calc
END MODULE xc_pbe


> gfortran -c -O3 -ffast-math -march=westmere bug.f90
bug.f90:69:20:

- t110 * t967 * t879))
1
Warning: Extension: Unary operator following arithmetic operator (use
parentheses) at (1)
bug.f90:6:0:

 SUBROUTINE 

[Bug tree-optimization/71252] [7 Regression] ICE: verify_ssa failed : definition in block 7 does not dominate use in block 6

2016-05-24 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71252

Joost VandeVondele  changed:

   What|Removed |Added
     CC|        |Joost.VandeVondele at mat dot ethz.ch

--- Comment #5 from Joost VandeVondele ---
This error still seems to be present on a different testcase:

/data/vjoost/gnu/cp2k/cp2k/makefiles/../src/xc_b97.F:360:0:

   SUBROUTINE b97_lsd_eval(rho_set,deriv_set,grad_deriv,b97_params,error)

Error: definition in block 38 does not dominate use in block 37
for SSA_NAME: _1530 in statement:
_3039 = _1530 * 2.0e+0;
/data/vjoost/gnu/cp2k/cp2k/makefiles/../src/xc_b97.F:360:0: internal compiler
error: verify_ssa failed
0xdbe8fc verify_ssa(bool, bool)
../../gcc/gcc/tree-ssa.c:1039
0xad114d execute_function_todo
../../gcc/gcc/passes.c:1971
0xad1b0b execute_todo
../../gcc/gcc/passes.c:2016

I'll try to reduce a testcase.

[Bug tree-optimization/71230] [7 Regression] ICE : in zero_one_operation, at tree-ssa-reassoc.c:1230

2016-05-24 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71230

--- Comment #10 from Joost VandeVondele ---
new testcase:

> cat bug.f90
MODULE ai_coulomb_test
  INTEGER, PARAMETER :: dp=8
  INTERFACE
  SUBROUTINE g2gemint(intabc,la_max,npgfa,zeta,a,lb_max,npgfb,zetb,b,&
  lr_max,ls_max,ngemc,zetc,c,nderivative)
  INTEGER, PARAMETER :: dp=8
REAL(KIND=dp), &
  DIMENSION(:, :, :, :, :, :), &
  INTENT(INOUT)  :: intabc
REAL(KIND=dp), DIMENSION(:), INTENT(IN)  :: zeta, a
REAL(KIND=dp), DIMENSION(:), INTENT(IN)  :: zetb, b
REAL(KIND=dp), DIMENSION(:, :, :), &
  INTENT(IN) :: zetc
REAL(KIND=dp), DIMENSION(:), INTENT(IN)  :: c
  END SUBROUTINE
  END INTERFACE
  PRIVATE
  PUBLIC :: eri_test
CONTAINS
  SUBROUTINE eri_test (iw)
   IF ( iw>0 ) THEN
  WRITE(iw,'(//,A,/)') "foo"
   END IF
CALL geminal_test4 (iw)
  END SUBROUTINE eri_test
  SUBROUTINE geminal_test4 (iw)
REAL(KIND=dp):: d1, da, db, dc, delta, dmax, &
xa, xb, xc, xd, xr, xs
REAL(KIND=dp), ALLOCATABLE, &
  DIMENSION(:, :, :, :, :, :):: iabc1m, iabc1p, iabc2m, &
iabc2p, iabc3m, iabc3p, iabcd
REAL(KIND=dp), DIMENSION(2, 2, 1):: za, zb
REAL(KIND=dp), DIMENSION(3)  :: a, b, c, d
REAL(KIND=dp), DIMENSION(6)  :: ra, rb
DO k=1,3
  CALL g2gemint(iabc3p,la,1,(/xa/),a,lc,1,(/xb/),c,llb,llb,1,zb,rb,0)
  CALL g2gemint(iabc3m,la,1,(/xa/),a,lc,1,(/xb/),c,llb,llb,1,zb,rb,0)
  iabc3p = (iabc3p-iabc3m)/delta
END DO
  END SUBROUTINE geminal_test4
END MODULE ai_coulomb_test

> gfortran -c -O3 -ffast-math -fprefetch-loop-arrays bug.f90

bug.f90:20:0:

   SUBROUTINE eri_test (iw)

in pp_string, at pretty-print.c:937
0x13e8397 pp_string
../../gcc/gcc/pretty-print.c:937
0x13e8397 pp_string(pretty_printer*, char const*)
../../gcc/gcc/pretty-print.c:935
0x13e8f16 pp_format(pretty_printer*, text_info*)
../../gcc/gcc/pretty-print.c:579
0x13e3de1 diagnostic_report_diagnostic(diagnostic_context*, diagnostic_info*)
../../gcc/gcc/diagnostic.c:823
0x13e5645 internal_error(char const*, ...)
../../gcc/gcc/diagnostic.c:1258
0x8f88f8 gimple_check_failed(gimple const*, char const*, int, char const*,
gimple_code, tree_code)
../../gcc/gcc/gimple.c:1174
0xd671cb GIMPLE_CHECK2
../../gcc/gcc/gimple.h:73
0xd5bab7 zero_one_operation
../../gcc/gcc/tree-ssa-reassoc.c:1232
0xd63528 undistribute_ops_list
../../gcc/gcc/tree-ssa-reassoc.c:1586
0xd63fb8 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5237
0xd63df7 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5367
0xd63df7 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5367
0xd63df7 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5367
0xd63df7 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5367
0xd63df7 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5367
0xd63df7 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5367
0xd63df7 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5367
0xd66633 do_reassoc
../../gcc/gcc/tree-ssa-reassoc.c:5481
0xd66633 execute_reassoc
../../gcc/gcc/tree-ssa-reassoc.c:5568
0xd66633 execute
../../gcc/gcc/tree-ssa-reassoc.c:5607
Please submit a full bug report,

> gfortran -v
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/data/vjoost/gnu/gcc_trunk/install/libexec/gcc/x86_64-pc-linux-gnu/7.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --prefix=/data/vjoost/gnu/gcc_trunk/install
--enable-languages=c,c++,fortran --disable-multilib --enable-plugins
--enable-lto --disable-bootstrap
Thread model: posix
gcc version 7.0.0 20160524 (experimental) [trunk revision 236623] (GCC)

[Bug tree-optimization/71230] [7 Regression] ICE : in zero_one_operation, at tree-ssa-reassoc.c:1230

2016-05-24 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71230

--- Comment #9 from Joost VandeVondele ---
(In reply to Richard Biener from comment #8)
> There is another bug in this function remaining.

I indeed see:
bug.f90:47:0:

   SUBROUTINE eri_test (iw,error)

in pp_string, at pretty-print.c:937
0x13e8397 pp_string
../../gcc/gcc/pretty-print.c:937
0x13e8397 pp_string(pretty_printer*, char const*)
../../gcc/gcc/pretty-print.c:935
0x13e8f16 pp_format(pretty_printer*, text_info*)
../../gcc/gcc/pretty-print.c:579
0x13e3de1 diagnostic_report_diagnostic(diagnostic_context*, diagnostic_info*)
../../gcc/gcc/diagnostic.c:823
0x13e5645 internal_error(char const*, ...)
../../gcc/gcc/diagnostic.c:1258
0x8f88f8 gimple_check_failed(gimple const*, char const*, int, char const*,
gimple_code, tree_code)
../../gcc/gcc/gimple.c:1174
0xd671cb GIMPLE_CHECK2
../../gcc/gcc/gimple.h:73
0xd5bab7 zero_one_operation
../../gcc/gcc/tree-ssa-reassoc.c:1232
0xd63528 undistribute_ops_list
../../gcc/gcc/tree-ssa-reassoc.c:1586
0xd63fb8 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5237

I'll have a look at reducing it.

[Bug middle-end/71252] New: [7 Regression] ICE: verify_ssa failed : definition in block 7 does not dominate use in block 6

2016-05-24 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71252

Bug ID: 71252
   Summary: [7 Regression] ICE: verify_ssa failed : definition in
block 7 does not dominate use in block 6
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

today's trunk:

> cat bug.f90
MODULE xc_b97
  INTEGER, PARAMETER :: dp=8
  PRIVATE
  PUBLIC :: b97_lda_info, b97_lsd_info, b97_lda_eval, b97_lsd_eval
CONTAINS
  SUBROUTINE b97_lsd_eval(rho_set,deriv_set,grad_deriv,b97_params)
INTEGER, INTENT(in)  :: grad_deriv
INTEGER  :: handle, npoints, param, stat
LOGICAL  :: failure
REAL(kind=dp):: epsilon_drho, epsilon_rho, &
scale_c, scale_x
REAL(kind=dp), DIMENSION(:, :, :), POINTER :: dummy, e_0, e_ndra, &
  e_ndra_ndra, e_ndra_ndrb, e_ndra_ra, e_ndra_rb, e_ndrb, e_ndrb_ndrb, &
  e_ndrb_ra, e_ndrb_rb, e_ra, e_ra_ra, e_ra_rb, e_rb, e_rb_rb, &
  norm_drhoa, norm_drhob, rhoa, rhob
IF (.NOT. failure) THEN
   CALL b97_lsd_calc(&
rhoa=rhoa, rhob=rhob, norm_drhoa=norm_drhoa,&
norm_drhob=norm_drhob, e_0=e_0, &
e_ra=e_ra, e_rb=e_rb, &
e_ndra=e_ndra, e_ndrb=e_ndrb, &
e_ra_ra=e_ra_ra, e_ra_rb=e_ra_rb, e_rb_rb=e_rb_rb,&
e_ra_ndra=e_ndra_ra, e_ra_ndrb=e_ndrb_ra, &
e_rb_ndrb=e_ndrb_rb, e_rb_ndra=e_ndra_rb,&
e_ndra_ndra=e_ndra_ndra, e_ndrb_ndrb=e_ndrb_ndrb,&
e_ndra_ndrb=e_ndra_ndrb,&
grad_deriv=grad_deriv, npoints=npoints, &
epsilon_rho=epsilon_rho,epsilon_drho=epsilon_drho,&
param=param,scale_c_in=scale_c,scale_x_in=scale_x)
END IF
  END SUBROUTINE b97_lsd_eval
  SUBROUTINE b97_lsd_calc(rhoa, rhob, norm_drhoa, norm_drhob,&
   e_0, e_ra, e_rb, e_ndra, e_ndrb, &
   e_ra_ndra,e_ra_ndrb, e_rb_ndra, e_rb_ndrb,&
   e_ndra_ndra, e_ndrb_ndrb, e_ndra_ndrb, &
   e_ra_ra, e_ra_rb, e_rb_rb,&
   grad_deriv,npoints,epsilon_rho,epsilon_drho, &
   param, scale_c_in, scale_x_in)
REAL(kind=dp), DIMENSION(*), INTENT(in)  :: rhoa, rhob, norm_drhoa, &
norm_drhob
REAL(kind=dp), DIMENSION(*), INTENT(inout) :: e_0, e_ra, e_rb, e_ndra, &
  e_ndrb, e_ra_ndra, e_ra_ndrb, e_rb_ndra, e_rb_ndrb, e_ndra_ndra, &
  e_ndrb_ndrb, e_ndra_ndrb, e_ra_ra, e_ra_rb, e_rb_rb
INTEGER, INTENT(in)  :: grad_deriv, npoints
REAL(kind=dp), INTENT(in):: epsilon_rho, epsilon_drho
INTEGER, INTENT(in)  :: param
REAL(kind=dp), INTENT(in):: scale_c_in, scale_x_in
REAL(kind=dp) :: A_1, A_2, A_3, alpha_1_1, alpha_1_2, alpha_1_3, alpha_c, &
  t133, t134, t1341, t1348, t1351, t1360, t1368, t138, t1388, t139, &
  u_x_bnorm_drhobnorm_drhob, u_x_brhob, u_x_brhobnorm_drhob, u_x_brhobrhob
SELECT CASE(grad_deriv)
CASE default
   DO ii=1,npoints
  IF (rho>epsilon_rho) THEN
 IF (grad_deriv/=0) THEN
IF (grad_deriv>1 .OR. grad_deriv<-1) THEN
   alpha_c1rhob = alpha_crhob
   f1rhob = frhob
   t1360 = -0.4e1_dp * t105 * t290 * chirhobrhob + (-0.2e1_dp * t239 &
* t257 + t709 * t1236 * t711 * t62 / 0.2e1_dp - e_c_u_0rhobrhob) * f&
* t108 + t438 * f1rhob * t108 + 0.4e1_dp * t439 * t443 + t1341 * &
0.4e1_dp * t1348 * t443 + 0.4e1_dp * t1351 * t443 + 0.12e2_dp * t113&
* t107 * t1299 + 0.4e1_dp * t113 * t289 * chirhobrhob
   IF (grad_deriv>1 .OR. grad_deriv==-2) THEN
   exc_rhob_rhob = scale_x * (-t4 * t6 / t1152 * gx_b / &
0.6e1_dp + e_lsda_x_brhob * (u_x_b1rhob * t31 + u_x_b * u_x_b1rhob *&
u_x_brhobrhob * c_x_2)) + scale_c * (((e_c_u_0rhobrhob + (0.2e1_dp *&
t726 * t1270 * t278 - t266 * (-t731 * t1205 / 0.4e1_dp + t267 * &
t1205 * t647) * t278 - t757 * t1270 * t759 * t80 / 0.2e1_dp) * f * &
t110 + alpha_crhob * f1rhob * t110 - 0.4e1_dp * t431 * t435 + &
alpha_c1rhob * frhob * t110 + alpha_c * frhobrhob * t110 - 0.4e1_dp &
* t433 * t435 - 0.4e1_dp * t1321 * t435 - 0.4e1_dp * t1324 * t435 - &
0.12e2_dp * t105 * t79

[Bug middle-end/71230] [7 Regression] ICE : in zero_one_operation, at tree-ssa-reassoc.c:1230

2016-05-23 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71230

Joost VandeVondele  changed:

   What|Removed |Added
     CC|        |Joost.VandeVondele at mat dot ethz.ch

--- Comment #2 from Joost VandeVondele ---
smaller testcase (different compiler options needed):

> gfortran -c -fbounds-check -O1 -ffast-math  bug.f90
bug.f90:1:0:

   FUNCTION pw_integral_aa ( cc ) RESULT ( integral_value )

internal compiler error: in zero_one_operation, at tree-ssa-reassoc.c:1230
0xd5b4f8 zero_one_operation
../../gcc/gcc/tree-ssa-reassoc.c:1229
0xd62f98 undistribute_ops_list
../../gcc/gcc/tree-ssa-reassoc.c:1583
0xd63a28 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5199
0xd63867 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5325
0xd63867 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5325

> cat bug.f90
  FUNCTION pw_integral_aa ( cc ) RESULT ( integral_value )
COMPLEX(KIND=8), DIMENSION(:), POINTER :: cc
integral_value = accurate_sum ( CONJG ( cc (:) ) * cc (:) )
  END FUNCTION pw_integral_aa

[Bug middle-end/71230] New: [7 Regression] ICE : in zero_one_operation, at tree-ssa-reassoc.c:1230

2016-05-23 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71230

Bug ID: 71230
   Summary: [7 Regression] ICE : in zero_one_operation, at
tree-ssa-reassoc.c:1230
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

A recent trunk regression leads to:

> gfortran -c -O1 -ffast-math bug.f90
bug.f90:6:0:

   SUBROUTINE b97_lsd_eval(rho_set,deriv_set,grad_deriv,b97_params)

internal compiler error: in zero_one_operation, at tree-ssa-reassoc.c:1230
0xd5b4f8 zero_one_operation
../../gcc/gcc/tree-ssa-reassoc.c:1229
0xd62f98 undistribute_ops_list
../../gcc/gcc/tree-ssa-reassoc.c:1583
0xd63a28 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5199
0xd63867 reassociate_bb
../../gcc/gcc/tree-ssa-reassoc.c:5325
0xd63867 reassociate_bb
[...]


> cat bug.f90
MODULE xc_b97
  INTEGER, PARAMETER :: dp=8
  PRIVATE
  PUBLIC :: b97_lsd_eval
CONTAINS
  SUBROUTINE b97_lsd_eval(rho_set,deriv_set,grad_deriv,b97_params)
INTEGER, INTENT(in)  :: grad_deriv
INTEGER  :: handle, npoints, param, stat
LOGICAL  :: failure
REAL(kind=dp):: epsilon_drho, epsilon_rho, &
scale_c, scale_x
REAL(kind=dp), DIMENSION(:, :, :), POINTER :: dummy, e_0, e_ndra, &
  e_ndra_ndra, e_ndra_ndrb, e_ndra_ra, e_ndra_rb, e_ndrb, e_ndrb_ndrb, &
  e_ndrb_ra, e_ndrb_rb, e_ra, e_ra_ra, e_ra_rb, e_rb, e_rb_rb, &
  norm_drhoa, norm_drhob, rhoa, rhob
IF (.NOT. failure) THEN
   CALL b97_lsd_calc(&
rhoa=rhoa, rhob=rhob, norm_drhoa=norm_drhoa,&
norm_drhob=norm_drhob, e_0=e_0, &
e_ra=e_ra, e_rb=e_rb, &
e_ndra=e_ndra, e_ndrb=e_ndrb, &
e_ra_ra=e_ra_ra, e_ra_rb=e_ra_rb, e_rb_rb=e_rb_rb,&
e_ra_ndra=e_ndra_ra, e_ra_ndrb=e_ndrb_ra, &
e_rb_ndrb=e_ndrb_rb, e_rb_ndra=e_ndra_rb,&
e_ndra_ndra=e_ndra_ndra, e_ndrb_ndrb=e_ndrb_ndrb,&
e_ndra_ndrb=e_ndra_ndrb,&
grad_deriv=grad_deriv, npoints=npoints, &
epsilon_rho=epsilon_rho,epsilon_drho=epsilon_drho,&
param=param,scale_c_in=scale_c,scale_x_in=scale_x)
END IF
  END SUBROUTINE b97_lsd_eval
  SUBROUTINE b97_lsd_calc(rhoa, rhob, norm_drhoa, norm_drhob,&
   e_0, e_ra, e_rb, e_ndra, e_ndrb, &
   e_ra_ndra,e_ra_ndrb, e_rb_ndra, e_rb_ndrb,&
   e_ndra_ndra, e_ndrb_ndrb, e_ndra_ndrb, &
   e_ra_ra, e_ra_rb, e_rb_rb,&
   grad_deriv,npoints,epsilon_rho,epsilon_drho, &
   param, scale_c_in, scale_x_in)
REAL(kind=dp), DIMENSION(*), INTENT(in)  :: rhoa, rhob, norm_drhoa, &
norm_drhob
REAL(kind=dp), DIMENSION(*), INTENT(inout) :: e_0, e_ra, e_rb, e_ndra, &
  e_ndrb, e_ra_ndra, e_ra_ndrb, e_rb_ndra, e_rb_ndrb, e_ndra_ndra, &
  e_ndrb_ndrb, e_ndra_ndrb, e_ra_ra, e_ra_rb, e_rb_rb
INTEGER, INTENT(in)  :: grad_deriv, npoints
REAL(kind=dp), INTENT(in):: epsilon_rho, epsilon_drho
INTEGER, INTENT(in)  :: param
REAL(kind=dp), INTENT(in):: scale_c_in, scale_x_in
REAL(kind=dp) :: A_1, A_2, A_3, alpha_1_1, alpha_1_2, alpha_1_3, alpha_c, &
  rs_b, rs_brhob, rs_brhobrhob, rsrhoa, rsrhoarhoa, rsrhoarhob, rsrhob, &
  t1014, t102, t1047, t1049, t105, t106, t107
  rsrhoa = -t4 * t212 * t208 / 0.12e2_dp
  t235 = t224 * rsrhoa / 0.2e1_dp + beta_2_1 * rsrhoa + & 
  0.3e1_dp / 0.2e1_dp * t228 * rsrhoa + t50 * t48 * rsrhoa * t232
  t237 = t235 * t236
  e_c_u_0rhoa = -0.2e1_dp * t216 * rsrhoa * t56 + t222 * t237
  epsilon_c_unifrhoa = e_c_u_0rhoa + t285 * t110 + t287 * t110 - &
  t293 + t295 * t108 + t297 * t108 + t301
  e_lsda_c_abrhoa = epsilon_c_unifrhoa * rho + epsilon_c_unif - e_lsda_c_arhoa
  exc_rhoa = scale_x * (e_lsda_x_arhoa * gx_a + e_lsda_x_a * gx_arhoa) + &
  scale_c * (e_lsda_c_abrhoa * gc_ab + e_lsda_c_ab * gc_abrhoa + &
  e_lsda_c_arhoa * gc_a + e_lsda_c_a * gc_arhoa)
  e_ra(ii)=e_ra(ii)+exc_rhoa
  END SUBROUTINE b97_lsd_calc
END MODULE xc_b97


> gfortran -v
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/data/vjoost/gnu/gcc_trunk/install/libexec/gcc/x86_64-pc-linux-gnu/7.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --prefix=/data/vjoost/gnu/gcc_trunk/install
--enable-languages=c,c++,fortran --disable-multilib --enable-plugins
--enable-lto --disable-bootstrap
Thread model: posix
gcc version 7.0.0 20160523 (experimental) [trunk revision 236575] (GCC)

[Bug tree-optimization/71078] New: x/abs(x) -> sign(1.0,x)

2016-05-12 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71078

Bug ID: 71078
   Summary: x/abs(x) -> sign(1.0,x)
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

I just noticed the equivalent of this construct in some legacy code:

REAL FUNCTION mysign(x)
   REAL :: x
   mysign=x/abs(x)
END FUNCTION

which I would expect to be converted to some form of copysign function with
'-O3 -ffast-math', but it is not.
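
As a hedged illustration of the rewrite being asked for, the sketch below puts
the legacy form next to a hand-written SIGN(1.0, x) version. The names
sign_demo, mysign_div and mysign_intrinsic are hypothetical and do not come
from the legacy code. The two forms agree only for finite, non-zero x: at
x = 0 the division yields NaN while SIGN(1.0, 0.0) yields 1.0, which is
presumably why such a transformation could only be considered under
-ffast-math.

```fortran
PROGRAM sign_demo
   IMPLICIT NONE
   ! Sample inputs: finite and non-zero, where both forms should agree.
   REAL :: vals(4) = (/ -2.5, -1.0, 0.5, 3.0 /)
   INTEGER :: i

   DO i = 1, SIZE(vals)
      ! Columns: x, x/abs(x), sign(1.0,x)
      WRITE(*,'(3F8.2)') vals(i), mysign_div(vals(i)), mysign_intrinsic(vals(i))
   END DO

CONTAINS

   ! Legacy form from the report: division by the absolute value.
   REAL FUNCTION mysign_div(x)
      REAL, INTENT(IN) :: x
      mysign_div = x/ABS(x)
   END FUNCTION mysign_div

   ! Hand-written replacement using the SIGN intrinsic.
   REAL FUNCTION mysign_intrinsic(x)
      REAL, INTENT(IN) :: x
      mysign_intrinsic = SIGN(1.0, x)
   END FUNCTION mysign_intrinsic

END PROGRAM sign_demo
```

Comparing the assembly generated for the two functions (for example with
gfortran -S -O3 -ffast-math) is one way to check whether a given compiler
version performs the conversion.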

[Bug middle-end/70960] New: [7.0 Regression] ICE: tree check: expected ssa_name, have integer_cst in ifcvt_walk_pattern_tree, at tree-if-conv.c:2465

2016-05-05 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70960

Bug ID: 70960
   Summary: [7.0 Regression] ICE: tree check: expected ssa_name,
have integer_cst in ifcvt_walk_pattern_tree, at
tree-if-conv.c:2465
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

recent trunk regression:

> cat bug.f90 
  SUBROUTINE calbrec(a,ai,error)
REAL(KIND=8):: a(3,3), ai(3,3)
DO i = 1, 3
   il = 1
   IF (i==1) il = 2
   DO j = 1, 3
  ai(j,i) = (-1.0_8)**(i+j)*det*(a(il,jl)*a(iu,ju)-a(il,ju)*a(iu,jl))
   END DO
END DO
  END SUBROUTINE calbrec

> gfortran -c -fprofile-generate -O3 bug.f90 
bug.f90:1:0:

   SUBROUTINE calbrec(a,ai,error)

internal compiler error: tree check: expected ssa_name, have integer_cst in
ifcvt_walk_pattern_tree, at tree-if-conv.c:2465
0xe42ad4 tree_check_failed(tree_node const*, char const*, int, char const*,
...)
../../gcc/gcc/tree.c:9753
0xc095d6 tree_check
../../gcc/gcc/tree.h:3025
0xc095d6 ifcvt_walk_pattern_tree
../../gcc/gcc/tree-if-conv.c:2465
0xc094e0 ifcvt_walk_pattern_tree
../../gcc/gcc/tree-if-conv.c:2491
0xc0edbe ifcvt_repair_bool_pattern
../../gcc/gcc/tree-if-conv.c:2580
0xc0edbe tree_if_conversion
../../gcc/gcc/tree-if-conv.c:2746
0xc0edbe execute
../../gcc/gcc/tree-if-conv.c:2829
0xc0edbe execute
../../gcc/gcc/tree-if-conv.c:2808
Please submit a full bug report,

> gfortran -v
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/data/vjoost/gnu/gcc_trunk/install/libexec/gcc/x86_64-pc-linux-gnu/7.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --prefix=/data/vjoost/gnu/gcc_trunk/install
--enable-languages=c,c++,fortran --disable-multilib --enable-plugins
--enable-lto --disable-bootstrap
Thread model: posix
gcc version 7.0.0 20160505 (experimental) [trunk revision 235918] (GCC)

[Bug middle-end/70937] New: [7.0 Regression] ICE: tree code ‘ssa_name’ is not supported in LTO streams

2016-05-03 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70937

Bug ID: 70937
   Summary: [7.0 Regression] ICE: tree code ‘ssa_name’ is not
supported in LTO streams
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

overnight regression in trunk:

> cat bug.f90 
  SUBROUTINE dbcsr_test_read_args(narg, args)
CHARACTER(len=*), DIMENSION(:), &
  INTENT(out) :: args
CHARACTER(len=80) :: line
DO
   args(narg) = line
ENDDO
  END SUBROUTINE dbcsr_test_read_args

> gfortran -flto -c -O0 bug.f90 
bug.f90:8:0: internal compiler error: tree code ‘ssa_name’ is not supported in
LTO streams
   END SUBROUTINE dbcsr_test_read_args

0xa62703 DFS::DFS(output_block*, tree_node*, bool, bool, bool)
../../gcc/gcc/lto-streamer-out.c:676
0xa6338b lto_output_tree(output_block*, tree_node*, bool, bool)
../../gcc/gcc/lto-streamer-out.c:1616
0xa5960d write_global_stream
../../gcc/gcc/lto-streamer-out.c:2415
0xa5960d lto_output_decl_state_streams(output_block*, lto_out_decl_state*)
../../gcc/gcc/lto-streamer-out.c:2462
0xa60fa4 produce_asm_for_decls()
../../gcc/gcc/lto-streamer-out.c:2839
0xacd65f write_lto
../../gcc/gcc/passes.c:2459
0xad08ae ipa_write_summaries_1
../../gcc/gcc/passes.c:2520
0xad08ae ipa_write_summaries()
../../gcc/gcc/passes.c:2580
0x7a6059 ipa_passes
../../gcc/gcc/cgraphunit.c:2310
0x7a6059 symbol_table::compile()
../../gcc/gcc/cgraphunit.c:2404
0x7a880d symbol_table::finalize_compilation_unit()
../../gcc/gcc/cgraphunit.c:2564
Please submit a full bug report,

[Bug tree-optimization/68715] [6 Regression] ice: in harmful_stmt_in_region, at graphite-scop-detection.c:1043

2016-03-15 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68715

--- Comment #7 from Joost VandeVondele ---
(In reply to vries from comment #6)
> Created attachment 37976 [details]
> tentative patch, fixes examples from comment 4 and 5.

also fixes the first testcase, thanks!

For whatever reason I had to apply it manually.

[Bug web/69601] current/ redirect is off by at least a day

2016-03-01 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69601

--- Comment #2 from Joost VandeVondele ---
same issue for March

[Bug middle-end/69987] [6 Regression] internal compiler error: in verify_loop_structure, at cfgloop.c:1639

2016-02-27 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69987

Joost VandeVondele  changed:

   What|Removed |Added
     CC|        |Joost.VandeVondele at mat dot ethz.ch, law at gcc dot gnu.org

--- Comment #1 from Joost VandeVondele ---
> cat bug.f90 
MODULE cp_lbfgs
  INTEGER, PARAMETER :: dp=8
CONTAINS
  SUBROUTINE mainlb(n, m, x, l, u, nbd, f, g, factr, pgtol, ws, wy, &
   csave, lsave, isave, dsave)
REAL(KIND=dp):: x(n), l(n), u(n)
REAL(KIND=dp) :: f, g(n), factr, pgtol, ws(n, m), wy(n, m), sy(m, m), &
  ss(m, m), wt(m, m), wn(2*m, 2*m), snd(2*m, 2*m), z(n), r(n), d(n), &
  t(n), wa(8*m)
CHARACTER(len=60):: task
IF (task == 'START') THEN
   IF (task(1:5) == 'FG_LN') GOTO 666
ENDIF
222 CONTINUE
DO 40 i = 1, n
   d(i) = z(i) - x(i)
40  ENDDO
666 CONTINUE
IF (info /= 0 .OR. iback >= 20) THEN
   CALL dcopy(n,r,1,g,1)
ENDIF
GOTO 222
  END SUBROUTINE mainlb
END MODULE cp_lbfgs

> gfortran  -c -O3 -fprefetch-loop-arrays bug.f90 
bug.f90:4:0:

   SUBROUTINE mainlb(n, m, x, l, u, nbd, f, g, factr, pgtol, ws, wy, &

Error: loop verification on loop tree that needs fixup
bug.f90:4:0: internal compiler error: in verify_loop_structure, at
cfgloop.c:1639


last known good: r233732, first known bad: r233775

[Bug middle-end/69987] New: [6 Regression] internal compiler error: in verify_loop_structure, at cfgloop.c:1639

2016-02-27 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69987

Bug ID: 69987
   Summary: [6 Regression]  internal compiler error: in
verify_loop_structure, at cfgloop.c:1639
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

an overnight regression on trunk, which I'll try to reduce:

/data/vjoost/gnu/cp2k/cp2k/makefiles/../src/cp_lbfgs.F:242:0:

   SUBROUTINE mainlb(n, m, x, l, u, nbd, f, g, factr, pgtol, ws, wy, &

Error: loop verification on loop tree that needs fixup
/data/vjoost/gnu/cp2k/cp2k/makefiles/../src/cp_lbfgs.F:242:0: internal compiler
error: in verify_loop_structure, at cfgloop.c:1639
0x773ecc verify_loop_structure()
../../gcc/gcc/cfgloop.c:1639
0xd07b50 checking_verify_loop_structure
../../gcc/gcc/cfgloop.h:324
0xd07b50 tree_transform_and_unroll_loop(loop*, unsigned int, edge_def*,
tree_niter_desc*, void (*)(loop*, void*), void*)
../../gcc/gcc/tree-ssa-loop-manip.c:1362
0xd19205 loop_prefetch_arrays
../../gcc/gcc/tree-ssa-loop-prefetch.c:1908
0xd19205 tree_ssa_prefetch_arrays()
../../gcc/gcc/tree-ssa-loop-prefetch.c:1977
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
make[2]: *** [cp_lbfgs.o] Error 1

First guess would be this is related to the fix of PR69740 (which was also
backported to the 5 branch).

[Bug fortran/69695] slice of an array retains pointer attribute

2016-02-07 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69695

--- Comment #3 from Joost VandeVondele  
---
(In reply to Mikael Morin from comment #2)

> This seems to be allowed, see 12.5.2.7:

Interesting, so that's an F2008 feature. The Cray compiler indeed gets this
right.

> So this is probably a plain wrong-code bug.

thanks for looking carefully!

[Bug web/69601] current/ redirect is off by at least a day

2016-02-07 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69601

Joost VandeVondele  changed:

   What|Removed |Added

 CC||gerald at pfeifer dot com,
   ||Joost.VandeVondele at mat dot 
ethz
   ||.ch

--- Comment #1 from Joost VandeVondele  
---
mentioned on the mailing list:

https://gcc.gnu.org/ml/gcc/2016-02/msg00066.html

Is this now fixed? It took a couple of days for the switch to happen.

[Bug fastjar/69695] New: slice of an array retains pointer attribute

2016-02-05 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69695

Bug ID: 69695
   Summary: slice of an array retains pointer attribute
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fastjar
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

The following testcase:

> cat test.f90
module point
  implicit none
  type point_type
integer, dimension(:,:), pointer :: array
  end type point_type
contains
  subroutine ptest(a)
integer, dimension(:), intent(in), pointer :: a
write(*,*) a**2
  end subroutine ptest
end module point
program test
  use point
  implicit none
  integer :: i, j
  type(point_type), pointer:: p1
  integer, dimension(:,:), pointer :: a
  allocate(p1)
  allocate(p1%array(5,2))
  p1%array=42
  a => p1%array
  call ptest(a(:,2))
end program test

returns a valgrind error and seemingly wrong output:

> gfortran -g test.f90 
> valgrind ./a.out
==81284== Memcheck, a memory error detector
==81284== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==81284== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==81284== Command: ./a.out
==81284== 
==81284== Invalid read of size 4
==81284==    at 0x400920: __point_MOD_ptest (test.f90:9)
==81284==    by 0x400B64: MAIN__ (test.f90:22)
==81284==    by 0x400BA3: main (test.f90:13)
==81284==  Address 0x4dbf458 is 0 bytes after a block of size 40 alloc'd
==81284==    at 0x4A06B3F: malloc (vg_replace_malloc.c:299)
==81284==    by 0x4009B9: MAIN__ (test.f90:19)
==81284==    by 0x400BA3: main (test.f90:13)
==81284== 
        1764        1764        1764        1764           0
==81284== 
==81284== HEAP SUMMARY:
==81284==     in use at exit: 112 bytes in 2 blocks
==81284==   total heap usage: 21 allocs, 19 frees, 11,952 bytes allocated
==81284== 
==81284== LEAK SUMMARY:
==81284==    definitely lost: 72 bytes in 1 blocks
==81284==    indirectly lost: 40 bytes in 1 blocks
==81284==      possibly lost: 0 bytes in 0 blocks
==81284==    still reachable: 0 bytes in 0 blocks
==81284==         suppressed: 0 bytes in 0 blocks
==81284== Rerun with --leak-check=full to see details of leaked memory
==81284== 
==81284== For counts of detected and suppressed errors, rerun with: -v
==81284== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 5 from 5)

However, the underlying problem is that gfortran doesn't generate a
compile-time error when the array slice is passed to a subroutine that expects
a pointer argument. Ifort diagnoses this clearly:

> ifort test.f90
test.f90(22): error #7121: A ptr dummy may only be argument associated with a
ptr, and this array element or section does not inherit the POINTER attr from
its parent array.   [A]
  call ptest(a(:,2))
-^
compilation aborted for test.f90 (code 1)

I think this should be diagnosed by gfortran as well.
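
One way to avoid passing the section directly is to go through an explicit
rank-1 pointer first. The following is only a minimal sketch: the names
point_workaround, psum and column_psum are hypothetical and not part of the
testcase, and it has not been checked against the affected compiler.

```fortran
MODULE point_workaround
  IMPLICIT NONE
CONTAINS
  ! Same dummy declaration as ptest in the testcase, but returning the
  ! sum of squares instead of printing it (hypothetical helper).
  INTEGER FUNCTION psum(a)
    INTEGER, DIMENSION(:), INTENT(IN), POINTER :: a
    psum = SUM(a**2)
  END FUNCTION psum

  ! Associates an explicit rank-1 pointer with column j of a and passes
  ! that pointer, rather than the array section itself, to psum.
  INTEGER FUNCTION column_psum(a, j)
    INTEGER, DIMENSION(:,:), POINTER :: a
    INTEGER, INTENT(IN) :: j
    INTEGER, DIMENSION(:), POINTER :: col
    col => a(:, j)
    column_psum = psum(col)
  END FUNCTION column_psum
END MODULE point_workaround
```

With the data of the testcase, this would be called as column_psum(a, 2)
after a => p1%array.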

[Bug fortran/69646] New: multiple warnings with -Wintrinsics-std

2016-02-02 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69646

Bug ID: 69646
   Summary: multiple warnings with -Wintrinsics-std
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

This testcase, related to Janus' question on the ml:

> cat test.f90
 WRITE(6,*) BGE(1,2)
END

yields a similar warning three times with -Wintrinsics-std:

>  gfortran -c -std=f95 -Wintrinsics-std test.f90
test.f90:1:15:

  WRITE(6,*) BGE(1,2)
   1
Warning: The intrinsic ‘bge’ at (1) is not included in the selected standard
but new in Fortran 2008 and ‘bge’ will be treated as if declared EXTERNAL.  Use
an appropriate -std=* option or define -fall-intrinsics to allow this
intrinsic. [-Wintrinsics-std]
test.f90:1:15: Warning: The intrinsic ‘bge’ at (1) is not included in the
selected standard but new in Fortran 2008 and ‘bge’ will be treated as if
declared EXTERNAL.  Use an appropriate -std=* option or define -fall-intrinsics
to allow this intrinsic. [-Wintrinsics-std]
test.f90:1:11:

  WRITE(6,*) BGE(1,2)
   1
Warning: The intrinsic ‘bge’ at (1) is not included in the selected standard
but new in Fortran 2008 and ‘bge’ will be treated as if declared EXTERNAL.  Use
an appropriate -std=* option or define -fall-intrinsics to allow this
intrinsic. [-Wintrinsics-std]

[Bug web/69601] New: current/ redirect is off by at least a day

2016-02-01 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69601

Bug ID: 69601
   Summary: current/ redirect is off by at least a day
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: web
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

The current/ redirect as in e.g. :

https://gcc.gnu.org/ml/gcc-patches/current/

doesn't seem to redirect to https://gcc.gnu.org/ml/gcc-patches/2016-02/ at the
right point, i.e. it keeps pointing to 2016-01 even though the archives have
started appearing at 2016-02. Somehow the switching point is off by a day or
more.

[Bug tree-optimization/68976] [6 Regression] ICE w/ -O2 (and above) -fgraphite-identity (or -floop-nest-optimize)

2016-01-13 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68976

--- Comment #5 from Joost VandeVondele  
---
I'm somewhat surprised graphite regressions get a P4. 

Discussions on the list suggested that graphite would be enabled by default in
the near future. Lowering graphite regression priority to 'not serious' signals
the opposite.

It is a bit of a chicken-and-egg problem: graphite won't get good testing if it
is not reliable, and it won't get reliable if it doesn't get good testing.

[Bug fortran/66461] [4.9/5/6 Regression] ICE on missing end program in fixed source

2016-01-10 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66461

Joost VandeVondele  changed:

   What|Removed |Added

 CC||Joost.VandeVondele at mat dot 
ethz
   ||.ch

--- Comment #7 from Joost VandeVondele  
---
(In reply to Jerry DeLisle from comment #6)
> Hiesenbug. Steves fix, probably on BSD, does not work on linux. Some of my
> attempts work when running within the debugger, but do not work outside the
> debugger. I have traced through the scanner with and without the END
> statement and see nothing yet.
> 
> We might try a regression hunt to see where it broke to get a hint. One
> thing I have not checked yet is in fixed form we skip past the first
> characters of each line.  We may need to trap an EOF when we are doing that
> skipping.

This might be obvious, but valgrind might give a hint:

-linux-gnu/6.0.0/finclude -o /tmp/ccyEi6l3.s
==20860== 
==20860== Invalid read of size 4
==20860==    at 0x5DF7C1: free_expr0(gfc_expr*) (expr.c:431)
==20860==    by 0x5DF9BD: gfc_free_expr(gfc_expr*) (expr.c:513)
==20860==    by 0x60FE57: gfc_match_if(gfc_statement*) (match.c:1441)
==20860==    by 0x62BF91: decode_statement (parse.c:398)
==20860==    by 0x62BF91: decode_statement() (parse.c:293)
==20860==    by 0x62D434: next_free (parse.c:1076)
==20860==    by 0x62D434: next_statement() (parse.c:1310)
==20860==    by 0x62F4D4: parse_executable(gfc_statement) (parse.c:4792)
==20860==    by 0x630311: parse_progunit(gfc_statement) (parse.c:5215)
==20860==    by 0x6316C0: gfc_parse_file() (parse.c:5698)
==20860==    by 0x6745A2: gfc_be_parse_file() (f95-lang.c:201)
==20860==    by 0xB64FD2: compile_file() (toplev.c:464)
==20860==    by 0xB6706B: do_compile (toplev.c:1985)
==20860==    by 0xB6706B: toplev::main(int, char**) (toplev.c:2092)
==20860==    by 0x1354346: main (main.c:39)
==20860==  Address 0x4dc1b50 is 0 bytes inside a block of size 192 free'd
==20860==    at 0x4A06D9B: free (vg_replace_malloc.c:530)
==20860==    by 0x60FE12: gfc_match_if(gfc_statement*) (match.c:1425)
==20860==    by 0x62BF91: decode_statement (parse.c:398)
==20860==    by 0x62BF91: decode_statement() (parse.c:293)
==20860==    by 0x62D434: next_free (parse.c:1076)
==20860==    by 0x62D434: next_statement() (parse.c:1310)
==20860==    by 0x62F4D4: parse_executable(gfc_statement) (parse.c:4792)
==20860==    by 0x630311: parse_progunit(gfc_statement) (parse.c:5215)
==20860==    by 0x6316C0: gfc_parse_file() (parse.c:5698)
==20860==    by 0x6745A2: gfc_be_parse_file() (f95-lang.c:201)
==20860==    by 0xB64FD2: compile_file() (toplev.c:464)
==20860==    by 0xB6706B: do_compile (toplev.c:1985)
==20860==    by 0xB6706B: toplev::main(int, char**) (toplev.c:2092)
==20860==    by 0x1354346: main (main.c:39)
==20860==  Block was alloc'd at
==20860==    at 0x4A07B05: calloc (vg_replace_malloc.c:711)
==20860==    by 0x13C2368: xcalloc (xmalloc.c:163)
==20860==    by 0x5DF29F: gfc_get_expr() (expr.c:48)
==20860==    by 0x5DF397: gfc_get_operator_expr(locus*, gfc_intrinsic_op, gfc_expr*, gfc_expr*) (expr.c:106)
==20860==    by 0x5B1090: eval_intrinsic(gfc_intrinsic_op, eval_f, gfc_expr*, gfc_expr*) (arith.c:1611)
==20860==    by 0x5B1440: eval_intrinsic_f3 (arith.c:1725)
==20860==    by 0x5B1440: eval_intrinsic_f3(gfc_intrinsic_op, arith (*)(gfc_expr*, gfc_expr*, gfc_expr**), gfc_expr*, gfc_expr*) (arith.c:1713)
==20860==    by 0x6131B2: match_equiv_operand(gfc_expr**) (matchexp.c:784)
==20860==    by 0x613276: match_level_5(gfc_expr**) (matchexp.c:811)
==20860==    by 0x612572: gfc_match_expr(gfc_expr**) (matchexp.c:870)
==20860==    by 0x60B870: gfc_match(char const*, ...) (match.c:1015)
==20860==    by 0x60FCC7: gfc_match_if(gfc_statement*) (match.c:1347)
==20860==    by 0x62BF91: decode_statement (parse.c:398)
==20860==    by 0x62BF91: decode_statement() (parse.c:293)

[Bug fortran/68829] [4.9/5/6 Regression] Segfaults with -Ofast due to large array on stack

2016-01-10 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68829

Joost VandeVondele  changed:

   What|Removed |Added

 CC||Joost.VandeVondele at mat dot 
ethz
   ||.ch

--- Comment #4 from Joost VandeVondele  
---
(In reply to Thomas Koenig from comment #3)
> 
> It seems that -fmax-stack-var-size=N does not override -Ofast.  Maybe
> it should.
> 

Ah interesting, I wasn't aware that '-fmax-stack-var-size' existed, but indeed,
if -fstack-arrays would honour that flag, this problem would be fixed. That
would be a good solution from my point of view.

[Bug fortran/69154] [6 Regression] ICE in gfc_trans_where_2, at fortran/trans-stmt.c:5005 on *-linux

2016-01-06 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69154

Joost VandeVondele  changed:

   What|Removed |Added

 CC||Joost.VandeVondele at mat dot 
ethz
   ||.ch

--- Comment #3 from Joost VandeVondele  
---
reduced

MODULE m_numeric_tools
 INTEGER, PARAMETER :: dp=8
CONTAINS
subroutine llsfit_svd(xx,yy,sigma,nfuncs,funcs,chisq,par,var,cov,info)
 real(dp),intent(in) :: xx(:),yy(:),sigma(:)
 real(dp),dimension(SIZE(xx)) :: bb,sigm1
 real(dp) :: tmp(nfuncs)
 real(dp),allocatable :: work(:),Vt(:,:),U(:,:),S(:)
 WHERE (S>TOL_*MAXVAL(S))
  tmp=MATMUL(bb,U)/S
 END WHERE
end subroutine llsfit_svd
END MODULE m_numeric_tools

[Bug fortran/68993] MERGE does not evaluate its arguments

2015-12-21 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68993

--- Comment #5 from Joost VandeVondele  
---
(In reply to Steve Kargl from comment #4)
> 
> I would urge anyone trying to be clever to use clear syntax:
> 

https://github.com/hfp/libxsmm/commit/cc308fc5debe6151157a4fa9efacc7aa03351283

is what we used indeed, but it is not quite as concise as one would like.

[Bug fortran/68993] MERGE does not evaluate its arguments

2015-12-20 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68993

Joost VandeVondele  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 CC||Joost.VandeVondele at mat dot 
ethz
   ||.ch
 Resolution|--- |INVALID

--- Comment #3 from Joost VandeVondele  
---
I believe there was some confusion about the second testcase. The real question
is if this is valid in all cases:

MERGE(C_NULL_PTR, C_LOC(pc), .NOT.PRESENT(pc)))

and I believe it is not, because all arguments might be evaluated. However, it
is also OK not to evaluate all arguments for the reasons quoted.
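
For illustration, a variant that does not depend on how MERGE treats its
arguments is an explicit branch on PRESENT. This is a minimal sketch only; the
module name optional_loc and the function loc_or_null are hypothetical, and pc
is assumed here to be an interoperable integer with the TARGET attribute.

```fortran
MODULE optional_loc
  USE ISO_C_BINDING, ONLY: C_PTR, C_NULL_PTR, C_LOC, C_INT
  IMPLICIT NONE
CONTAINS
  ! Returns the C address of pc, or C_NULL_PTR when pc is absent.
  ! C_LOC(pc) is only referenced in the branch where pc is present.
  FUNCTION loc_or_null(pc) RESULT(res)
    INTEGER(C_INT), OPTIONAL, TARGET :: pc
    TYPE(C_PTR) :: res
    IF (PRESENT(pc)) THEN
       res = C_LOC(pc)
    ELSE
       res = C_NULL_PTR
    END IF
  END FUNCTION loc_or_null
END MODULE optional_loc
```

This is more verbose than the one-line MERGE, but it never references an
absent optional argument.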

[Bug fortran/68993] New: MERGE does not evaluate its arguments

2015-12-19 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68993

Bug ID: 68993
   Summary: MERGE does not evaluate its arguments
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

I'm not 100% sure what the right answer is, i.e. if MERGE is defined by the
standard to do something special with respect to evaluating its arguments. The
origin is code like this:

MERGE(C_NULL_PTR, C_LOC(pc), .NOT.PRESENT(pc)))

Is this standard conforming if pc is not present? In that case MERGE is
supposed to return C_NULL_PTR, but I see no reason why C_LOC(pc) would not be
evaluated first.

Gfortran and ifort behave differently in this respect. In the code below, ifort
calls foo 4x while gfortran calls it 2x.

While gfortran's way of doing things seems natural, I suspect it is not standard
conforming.

> cat test.f90
MODULE test
  INTEGER, SAVE :: i=0
CONTAINS
  INTEGER FUNCTION foo()
 i=i+1
 foo=i
  END FUNCTION
END MODULE test

USE test
WRITE(6,*) MERGE(foo(),foo(),.FALSE.)
WRITE(6,*) MERGE(foo(),foo(),.FALSE.)
WRITE(6,*) i
END

> gfortran test.f90 && ./a.out
   1
   2
   2

> ifort test.f90 && ./a.out
   2
   4
   4

[Bug fortran/68649] [6 Regression] note: code may be misoptimized unless -fno-strict-aliasing is used

2015-12-11 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68649

--- Comment #11 from Joost VandeVondele  
---
(In reply to Jerry DeLisle from comment #10)
> This PR is tagged as a regression.  Has anyone determined when it last
> worked or is it longstanding bug uncovered by recent non-fortran fe changes?

For users it is a regression in the sense that packages that compiled with
-flto -Werror with 5.3 will now stop building.

The underlying FE issue is much older, I guess.

[Bug fortran/68829] New: [4.7/4.8/4.9/5.3/6.0 Regression] Segfaults with -Ofast

2015-12-10 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68829

Bug ID: 68829
   Summary: [4.7/4.8/4.9/5.3/6.0 Regression] Segfaults with -Ofast
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

Currently, switching from -O3 to -Ofast might result in segfaulting Fortran
programs:

> cat test.f90
MODULE foo
CONTAINS
  SUBROUTINE mysum(a)
INTEGER :: a(:)
WRITE(6,*) SUM(a)
  END SUBROUTINE
END MODULE foo

USE foo
INTEGER, ALLOCATABLE :: a(:)
INTEGER, PARAMETER :: N=2**26 ! 256Mb array
ALLOCATE(a(N)) ; a=1
CALL mysum(a*a)
END

> gfortran -O3 test.f90 && ./a.out
67108864

> gfortran -Ofast test.f90 && ./a.out
Segmentation fault

The reason for this is that -Ofast enables -fstack-arrays. This option puts
temporary arrays of unlimited size on the stack. While there is precedent for
this (ifort), I think this is a poor choice. I think -fstack-arrays should only
do this for sufficiently small arrays.

This is a regression from gcc 4.6 (presumably because -Ofast didn't imply
-fstack-arrays).
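
On the user side, one possible way around this is to avoid the
compiler-generated temporary for a*a by using an explicit ALLOCATABLE array,
which is allocated with ALLOCATE rather than placed on the stack. A minimal
sketch, in which the module foo_heap and the function sum_of_squares are
hypothetical names and mysum is turned into a function for brevity:

```fortran
MODULE foo_heap
  IMPLICIT NONE
CONTAINS
  INTEGER FUNCTION mysum(a)
    INTEGER, INTENT(IN) :: a(:)
    mysum = SUM(a)
  END FUNCTION mysum

  ! Computes SUM(a*a) through an explicitly allocated temporary, so that
  ! no array temporary has to be created for the actual argument.
  INTEGER FUNCTION sum_of_squares(a)
    INTEGER, INTENT(IN) :: a(:)
    INTEGER, ALLOCATABLE :: tmp(:)
    ALLOCATE(tmp(SIZE(a)))
    tmp = a*a
    sum_of_squares = mysum(tmp)
    DEALLOCATE(tmp)
  END FUNCTION sum_of_squares
END MODULE foo_heap
```

This only sidesteps the issue in user code; it does not change the default
behaviour of -Ofast.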

[Bug middle-end/68002] retaining unused static functions at -O1

2015-12-08 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68002

Joost VandeVondele  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Joost VandeVondele  
---
yes

[Bug fortran/39772] -fcheck=bounds could check for overflow of size intrinsic.

2015-12-05 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39772

Joost VandeVondele  changed:

   What|Removed |Added

 Status|WAITING |NEW
Summary|add a correctness check for |-fcheck=bounds could check
   |the size intrinsic to   |for overflow of size
   |-fbounds-check  |intrinsic.

--- Comment #13 from Joost VandeVondele  
---
(In reply to Dominique d'Humieres from comment #11)
> Am I correct that you are asking for something like -fcheck=undefined
> (overflow, range, ... what ever is deemed suitable) for intrinsics returning
> by default an INTEGER(4)?

yes.

> 
> If yes, the summary should probably changed to be less misleading (I had to
> read the thread twice to understand).

done

[Bug fortran/55916] Alignment issues with real(16) on i686

2015-12-05 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55916

Joost VandeVondele  changed:

   What|Removed |Added

 CC||Joost.VandeVondele at mat dot 
ethz
   ||.ch

--- Comment #12 from Joost VandeVondele  
---
(In reply to Jouko Orava from comment #8)
> These issues affect probably all versions of gfortran,
> I have a patch under testing against trunk that modifies
> libgfortran internal xmalloc() and xcalloc() calls, as well
> as the intrinsic malloc() calls, to use GNU libc specific
> memalign() call. I will attach it as soon as I verify it
> works correctly.

I'm wondering if you had a working patch at some point. There are other PRs
that can only be solved if some alignment better than malloc's is used in
gfortran, e.g. PR64247 and PR68101, which concern non-reproducible results and
slow performance with avx/vector instructions, respectively.

[Bug tree-optimization/68715] [6 Regression] ice: in harmful_stmt_in_region, at graphite-scop-detection.c:1043

2015-12-05 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68715

Joost VandeVondele  changed:

   What|Removed |Added

   Last reconfirmed||2015-12-5
 CC||Joost.VandeVondele at mat dot 
ethz
   ||.ch, spop at gcc dot gnu.org
   Target Milestone|--- |6.0
  Known to fail||6.0

--- Comment #1 from Joost VandeVondele  
---
another graphite ice.

[Bug tree-optimization/68715] New: [6 Regression] ice: in harmful_stmt_in_region, at graphite-scop-detection.c:1043

2015-12-05 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68715

Bug ID: 68715
   Summary: [6 Regression] ice: in harmful_stmt_in_region, at
graphite-scop-detection.c:1043
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

not a dup of PR68693:

> cat bug.f90 
SUBROUTINE se_core_core_interaction(calculate_forces)
  INTEGER, PARAMETER :: dp=8
  LOGICAL, INTENT(in):: calculate_forces
  REAL(KIND=dp), DIMENSION(3):: force_ab, rij
  LOGICAL :: lfoo,kfoo,mfoo,nfoo,ffoo
  INTEGER, PARAMETER :: mi2=42
  CALL dummy(lfoo,kfoo,mfoo,nfoo,method_id,core_core)
  IF(lfoo) THEN
 DO WHILE (ffoo())
IF (lfoo) CYCLE
IF (kfoo) CYCLE
dr1 = DOT_PRODUCT(rij,rij)
IF ( dr1 > rij_threshold ) THEN
   SELECT CASE (method_id)
   CASE (mi2)
  IF(calculate_forces) THEN
 CALL dummy2(force_ab)
 IF (nfoo) THEN
force_ab = force_ab + core_core*dr3inv
 END IF
  END IF
   END SELECT
END IF
enuclear = enuclear + enucij
 END DO
 CALL dummy3(enuclear)
  END IF
END SUBROUTINE se_core_core_interaction

> gfortran  -c -floop-nest-optimize -O1 bug.f90 
bug.f90:1:0:

 SUBROUTINE se_core_core_interaction(calculate_forces)


internal compiler error: in harmful_stmt_in_region, at
graphite-scop-detection.c:1043
0x128e27e harmful_stmt_in_region
../../gcc/gcc/graphite-scop-detection.c:1043
0x128e27e merge_sese
../../gcc/gcc/graphite-scop-detection.c:848
0x128e5cc build_scop_breadth
../../gcc/gcc/graphite-scop-detection.c:901
0x128e5cc build_scop_depth
../../gcc/gcc/graphite-scop-detection.c:879
0x128e2e5 build_scop_depth
../../gcc/gcc/graphite-scop-detection.c:865
0x128e2e5 build_scop_depth
../../gcc/gcc/graphite-scop-detection.c:865
0x128fb62 build_scops(vec<scop*, va_heap, vl_ptr>*)
../../gcc/gcc/graphite-scop-detection.c:1913
0x127df71 graphite_transform_loops()
../../gcc/gcc/graphite.c:314
0x127e5b0 graphite_transforms
../../gcc/gcc/graphite.c:363
0x127e5b0 execute
../../gcc/gcc/graphite.c:440
Please submit a full bug report,

> gfortran  -v
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/data/vjoost/gnu/gcc_trunk/install/libexec/gcc/x86_64-pc-linux-gnu/6.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --prefix=/data/vjoost/gnu/gcc_trunk/install
--enable-languages=c,c++,fortran --disable-multilib --enable-plugins
--enable-lto --disable-bootstrap
Thread model: posix
gcc version 6.0.0 20151205 (experimental) [trunk revision 231314] (GCC)

[Bug tree-optimization/68692] New: [graphite] ice: Segmentation fault

2015-12-03 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68692

Bug ID: 68692
   Summary: [graphite] ice: Segmentation fault
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

> cat bug.f90
MODULE spme
  INTEGER, PARAMETER :: dp=8
  PRIVATE
  PUBLIC :: get_patch
CONTAINS
  SUBROUTINE get_patch ( part, box, green, npts, p, rhos, is_core, is_shell,&
 unit_charge, charges, coeff, n )
INTEGER, POINTER :: box
REAL(KIND=dp), &
  DIMENSION(-(n-1):n-1, 0:n-1), &
  INTENT(IN) :: coeff
INTEGER, DIMENSION(3), INTENT(IN):: npts
REAL(KIND=dp), DIMENSION(:, :, :), &
  INTENT(OUT):: rhos
REAL(KIND=dp):: q
REAL(KIND=dp), DIMENSION(3)  :: delta, r
CALL get_delta ( box, r, npts, delta, nbox )
CALL spme_get_patch ( rhos, nbox, delta, q, coeff )
  END SUBROUTINE get_patch
  SUBROUTINE spme_get_patch ( rhos, n, delta, q, coeff )
REAL(KIND=dp), DIMENSION(:, :, :), &
  INTENT(OUT):: rhos
REAL(KIND=dp), DIMENSION(3), INTENT(IN)  :: delta
REAL(KIND=dp), INTENT(IN):: q
REAL(KIND=dp), &
  DIMENSION(-(n-1):n-1, 0:n-1), &
  INTENT(IN) :: coeff
INTEGER, PARAMETER   :: nmax = 12
REAL(KIND=dp), DIMENSION(3, -nmax:nmax)  :: w_assign
REAL(KIND=dp), DIMENSION(3, 0:nmax-1):: deltal
REAL(KIND=dp), DIMENSION(3, 1:nmax)  :: f_assign
DO l = 1, n-1
   deltal ( 3, l ) = deltal ( 3, l-1 ) * delta ( 3 )
END DO
DO j = -(n-1), n-1, 2
   DO l = 0, n-1
  w_assign ( 1, j ) =  w_assign ( 1, j ) + &
 coeff ( j, l ) * deltal ( 1, l )
   END DO
   f_assign (3, i ) = w_assign ( 3, j )
   DO i2 = 1, n
  DO i1 = 1, n
 rhos ( i1, i2, i3 ) = r2 * f_assign ( 1, i1 )
  END DO
   END DO
END DO
  END SUBROUTINE spme_get_patch
  SUBROUTINE get_delta ( box, r, npts, delta, n )
INTEGER, POINTER :: box
REAL(KIND=dp), DIMENSION(3), INTENT(IN)  :: r
INTEGER, DIMENSION(3), INTENT(IN):: npts
REAL(KIND=dp), DIMENSION(3), INTENT(OUT) :: delta
INTEGER, DIMENSION(3):: center
REAL(KIND=dp), DIMENSION(3)  :: ca, grid_i, s
CALL real_to_scaled(s,r,box)
s = s - REAL ( NINT ( s ),KIND=dp)
IF ( MOD ( n, 2 ) == 0 ) THEN
   ca ( : ) = REAL ( center ( : ) )
END IF
delta ( : ) = grid_i ( : ) - ca ( : )
  END SUBROUTINE get_delta
END MODULE spme


> gfortran  -c -O3  -floop-nest-optimize  bug.f90
bug.f90:6:0:

   SUBROUTINE get_patch ( part, box, green, npts, p, rhos, is_core, is_shell,&


internal compiler error: Segmentation fault
0xb676cf crash_signal
../../gcc/gcc/toplev.c:334
0xbba647 ssa_default_def(function*, tree_node*)
../../gcc/gcc/tree-dfa.c:305
0xbbd088 get_or_create_ssa_default_def(function*, tree_node*)
../../gcc/gcc/tree-dfa.c:357
0xbf3e83 get_reaching_def
../../gcc/gcc/tree-into-ssa.c:1168
0xbf3e83 get_reaching_def
../../gcc/gcc/tree-into-ssa.c:1155
0xbf5dbe maybe_replace_use
../../gcc/gcc/tree-into-ssa.c:1753
0xbf5dbe rewrite_update_stmt
../../gcc/gcc/tree-into-ssa.c:1948
0xbf5dbe rewrite_update_dom_walker::before_dom_children(basic_block_def*)
../../gcc/gcc/tree-into-ssa.c:2128
0xbf5dbe rewrite_update_dom_walker::before_dom_children(basic_block_def*)
../../gcc/gcc/tree-into-ssa.c:2068
0x125a71a dom_walker::walk(basic_block_def*)
../../gcc/gcc/domwalk.c:176
0xbf28b5 rewrite_blocks
../../gcc/gcc/tree-into-ssa.c:2190
0xbf9a68 update_ssa(unsigned int)
../../gcc/gcc/tree-into-ssa.c:3351
0x128530a graphite_regenerate_ast_isl(scop*)
../../gcc/gcc/graphite-isl-ast-to-gimple.c:3271
0x127cea3 graphite_transform_loops()
../../gcc/gcc/graphite.c:336
0x127d370 graphite_transforms
../../gcc/gcc/graphite.c:363
0x127d370 execute
../../gcc/gcc/graphite.c:440
Please submit a full bug report,

> gfortran -v
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/data/vjoost/gnu/gcc_trunk/install/libexec/gcc/x86_64-pc-linux-gnu/6.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --prefix=/data/vjoost/gnu/gcc_trunk/install
--enable-languages=c,c++,fortran --disable-multilib --enable-plugins
--enable-lto --disable-bootstrap
Thread model: posix
gcc version 6.0.0 20151204 (experimental) [trunk revision 231243] (GCC)

[Bug tree-optimization/68693] New: [6 Regression] ice: in harmful_stmt_in_region, at graphite-scop-detection.c:1052

2015-12-03 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68693

Bug ID: 68693
   Summary: [6 Regression] ice: in harmful_stmt_in_region, at
graphite-scop-detection.c:1052
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

> cat bug.f90 
MODULE dbcsr_index_operations
  INTERFACE dbcsr_build_row_index
  END INTERFACE
CONTAINS
  SUBROUTINE merge_index_arrays (new_row_i, new_col_i, new_blk_p, new_size,&
   old_row_i, old_col_i, old_blk_p, old_size,&
   add_ip, add_size, new_blk_d, old_blk_d,&
   added_size_offset, added_sizes, added_size, added_nblks, error)
INTEGER, DIMENSION(new_size), &
  INTENT(OUT):: new_blk_p, new_col_i, &
new_row_i
INTEGER, INTENT(IN)  :: old_size
INTEGER, DIMENSION(old_size), INTENT(IN) :: old_blk_p, old_col_i, &
old_row_i
INTEGER, DIMENSION(new_size), &
  INTENT(OUT), OPTIONAL  :: new_blk_d
INTEGER, DIMENSION(old_size), &
  INTENT(IN), OPTIONAL   :: old_blk_d
INTEGER, DIMENSION(:), INTENT(IN), &
  OPTIONAL   :: added_sizes
INTEGER, INTENT(OUT), OPTIONAL   :: added_size, added_nblks
LOGICAL  :: multidata
IF (add_size .GT. 0) THEN
   IF (old_size .EQ. 0) THEN
  IF (PRESENT (added_size)) added_size = SUM (added_sizes)
   ENDIF
ELSE
   new_row_i(1:old_size) = old_row_i(1:old_size)
   new_col_i(1:old_size) = old_col_i(1:old_size)
   new_blk_p(1:old_size) = old_blk_p(1:old_size)
   IF (multidata) new_blk_d(1:old_size) = old_blk_d(1:old_size)
ENDIF
  END SUBROUTINE merge_index_arrays
END MODULE dbcsr_index_operations


> gfortran -c -floop-nest-optimize -O2 bug.f90 
bug.f90:5:0:

   SUBROUTINE merge_index_arrays (new_row_i, new_col_i, new_blk_p, new_size,&


internal compiler error: in harmful_stmt_in_region, at
graphite-scop-detection.c:1052
0x128b761 harmful_stmt_in_region
../../gcc/gcc/graphite-scop-detection.c:1052
0x128b761 merge_sese
../../gcc/gcc/graphite-scop-detection.c:857
0x128bd5c build_scop_breadth
../../gcc/gcc/graphite-scop-detection.c:910
0x128bd5c build_scop_depth
../../gcc/gcc/graphite-scop-detection.c:888
0x128bd21 build_scop_breadth
../../gcc/gcc/graphite-scop-detection.c:902
0x128bd21 build_scop_depth
../../gcc/gcc/graphite-scop-detection.c:888
0x128bb1f build_scop_depth
../../gcc/gcc/graphite-scop-detection.c:886
0x128ba75 build_scop_depth
../../gcc/gcc/graphite-scop-detection.c:874
0x128e3da build_scops(vec<scop*, va_heap, vl_ptr>*)
../../gcc/gcc/graphite-scop-detection.c:1922
0x127cd31 graphite_transform_loops()
../../gcc/gcc/graphite.c:314
0x127d370 graphite_transforms
../../gcc/gcc/graphite.c:363
0x127d370 execute
../../gcc/gcc/graphite.c:440
Please submit a full bug report,


> gfortran  -v
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/data/vjoost/gnu/gcc_trunk/install/libexec/gcc/x86_64-pc-linux-gnu/6.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --prefix=/data/vjoost/gnu/gcc_trunk/install
--enable-languages=c,c++,fortran --disable-multilib --enable-plugins
--enable-lto --disable-bootstrap
Thread model: posix
gcc version 6.0.0 20151204 (experimental) [trunk revision 231243] (GCC)

[Bug tree-optimization/68693] [6 Regression] ice: in harmful_stmt_in_region, at graphite-scop-detection.c:1052

2015-12-03 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68693

Joost VandeVondele  changed:

   What|Removed |Added

   Last reconfirmed||2015-12-4
 CC||Joost.VandeVondele at mat dot 
ethz
   ||.ch, spop at gcc dot gnu.org
   Target Milestone|--- |6.0
  Known to fail||6.0

--- Comment #1 from Joost VandeVondele  
---
another graphite ice

[Bug tree-optimization/68692] [6 Regression] ice: Segmentation fault

2015-12-03 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68692

Joost VandeVondele  changed:

   What|Removed |Added

   Last reconfirmed||2015-12-4
 CC||Joost.VandeVondele at mat dot 
ethz
   ||.ch, spop at gcc dot gnu.org
   Target Milestone|--- |6.0
Summary|[graphite] ice: |[6 Regression] ice:
   |Segmentation fault  |Segmentation fault
  Known to fail||6.0

--- Comment #1 from Joost VandeVondele  
---
trying to get the nightly tester to run, another graphite ice

[Bug tree-optimization/68550] [6 Regression] ICE: verify_gimple failed Error: missing PHI def

2015-12-03 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68550

--- Comment #7 from Joost VandeVondele  
---
(In reply to Sebastian Pop from comment #5)
> fixed

BTW, with this fixed, I can compile our CP2K code with -floop-nest-optimize at
various -Ox and all seems correct. Thanks!

I'll try to integrate '-floop-nest-optimize' in our nightly testers.

[Bug tree-optimization/68550] [6 Regression] ICE: verify_gimple failed Error: missing PHI def

2015-12-02 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68550

--- Comment #2 from Joost VandeVondele  
---

The following simpler looking testcase fails at -O1:

> cat bug.f90
SUBROUTINE PD2VAL(RES,NDERIV,TG1,TG2,C0)
INTEGER, PARAMETER :: dp=8
REAL(KIND=dp), INTENT(OUT)  :: res(*)
REAL(KIND=dp), INTENT(IN)   :: TG1, TG2, C0(105,*)
REAL(KIND=dp)   :: T1(0:13), T2(0:13)
 DO K=1,NDERIV+1
  RES(K)=RES(K)+DOT_PRODUCT(T1(0:7),C0(70:77,K))*T2(6)
  RES(K)=RES(K)+DOT_PRODUCT(T1(0:4),C0(91:95,K))*T2(9)
  RES(K)=RES(K)+DOT_PRODUCT(T1(0:3),C0(96:99,K))*T2(10)
 ENDDO
END SUBROUTINE PD2VAL

> gfortran  -O1 -fcheck=bounds -floop-nest-optimize bug.f90
bug.f90:1:0:

 SUBROUTINE PD2VAL(RES,NDERIV,TG1,TG2,C0)


Error: missing PHI def
val.0_68 = PHI <(4), 0.0(6)>
bug.f90:1:0: internal compiler error: verify_gimple failed
0xbad719 verify_gimple_in_cfg(function*, bool)
../../gcc/gcc/tree-cfg.c:5082

[Bug fortran/68649] [6 Regression] note: code may be misoptimized unless -fno-strict-aliasing is used

2015-12-02 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68649

Joost VandeVondele  changed:

   What|Removed |Added

 Status|WAITING |UNCONFIRMED
 Depends on||68560
 Ever confirmed|1   |0

--- Comment #5 from Joost VandeVondele  
---
(In reply to Dominique d'Humieres from comment #4)
> I think it is a duplicate of pr68560.

It certainly seems related, but PR68560 doesn't yield the worrying 'code may be
misoptimized unless -fno-strict-aliasing is used' note. I'll just add a dependency.

> 
> > This smells like a fortran front-end issue where _gfortran_reshape_r8's decl
> > is created twice with two different argument types.
> 
> I don't think so. I think -Wlto-type-mismatch does not know part of the
> Fortran syntax.

I'm thinking the issue is on the Fortran FE side, LTO shouldn't know the
language involved. I guess some middle end person might need to have a look
however.

I'm guessing it is related to the fact that _gfortran_reshape_r8 is being
called with different pointer arguments (from -fdump-tree-original):

struct array1_real(kind=8) parm.0;
 _gfortran_reshape_r8 (, D.3433, D.3437, 0B, 0B);
struct array2_real(kind=8) parm.4;
 _gfortran_reshape_r8 (, D.3446, D.3450, 0B, 0B);

maybe for correctness there should be some casts ?


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68560
[Bug 68560] [6 Regression] The test gfortran.dg/shape_8.f90 now fails when
compiled with -flto

[Bug rtl-optimization/68641] undefined variables implicitly considered to be zero

2015-12-02 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68641

--- Comment #10 from Joost VandeVondele  
---
(In reply to rguent...@suse.de from comment #9)
> Though with the testcase you gave we warn at both -O0 and -O1:

Yes, but unfortunately -Wuninitialized also warns about 'may be used
uninitialized' cases, which are too often false positives and sometimes even
concern compiler-generated variables, as in PR67679.

A -Wmust-be-uninitialized (which seemingly the compiler could do as the comment
on top of init-regs.c suggests) would be valuable. [I just checked that
-Wuninitialized -Wno-maybe-uninitialized doesn't work]

[Bug rtl-optimization/68641] undefined variables implicitly considered to be zero

2015-12-02 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68641

Joost VandeVondele  changed:

   What|Removed |Added

 CC||Joost.VandeVondele at mat dot ethz.ch

--- Comment #8 from Joost VandeVondele  
---
(In reply to Eric Botcazou from comment #3)
> The whole reasoning looks fairly dubious to me, the optimizer is free to do
> whatever it wants on undefined behavior and requests that the generated code
> behaves identically at all optimization levels on it have little merit IMO.

Of course, I agree that the code has undefined behavior, and that 'all bets are
off'. 

It just makes it more difficult for users to spot this undefined behavior. We
run our testsuite every night under valgrind, but can't move from -O1 to -O0,
since that would add a couple of hours to the test run. Admittedly,
-fsanitize=memory would be a better solution, but it is not available with gcc.

[Bug rtl-optimization/68641] undefined variables implicitly considered to be zero

2015-12-02 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68641

--- Comment #12 from Joost VandeVondele  
---
(In reply to Marc Glisse from comment #11)
> That sounds like a bug. It works for me on a simple C testcase.

sorry, fat fingers on my side. So, yes, this works

gfortran -c -Werror=uninitialized -Wno-maybe-uninitialized test.f90

I believe this would practically resolve this PR from my point of view.

[Bug middle-end/68641] New: undefined variables implicitly considered to be zero

2015-12-01 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68641

Bug ID: 68641
   Summary: undefined variables implicitly considered to be zero
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

This is an enhancement request. At -O1 and higher, undefined variables are
considered equal to 0 during optimization. This leads to code that compiles at
-O1 but fails to compile at -O0, for example:

> cat test.f90
  INTEGER :: i
  INTEGER, POINTER :: foo
  IF (i/=0) CALL link_error
  IF (ASSOCIATED(foo)) CALL link_error
END

Another disadvantage is that tools such as valgrind will not flag the above
program when it is compiled at -O1. These cases are only found at -O0 (leading
to much longer testing times).

Using -Wmaybe-uninitialized is not a real solution: there are just too many
false positives (I'm not aware of a -Wmust-be-uninitialized, which would be
valuable).

So, ideally no optimization is done based on an assumed value.

[Bug fortran/38312] Unexpected STATEMENT FUNCTION statement

2015-12-01 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38312

Joost VandeVondele  changed:

   What|Removed |Added

 CC||Joost.VandeVondele at mat dot ethz.ch

--- Comment #8 from Joost VandeVondele  
---
The error message has changed:

test.f90:7:6:

   co(i,j)=t1(i,k)*t2(j,k)
  1

Error: The function result on the lhs of the assignment at (1) must have the
pointer attribute.

now the location gives a hint.

[Bug middle-end/68649] New: note: code may be misoptimized unless -fno-strict-aliasing is used

2015-12-01 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68649

Bug ID: 68649
   Summary: note: code may be misoptimized unless
-fno-strict-aliasing is used
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

Today's trunk produces a lot of warnings / notes all referring to functions
from libgfortran, when compiling CP2K with LTO. I'm looking at generating a
reduced testcase, but the first obvious tries didn't work.

/data/vjoost/gnu/cp2k/cp2k/makefiles/../src/dbcsr_lib/dbcsr_block_access_c.F:709:0:
note: ‘_gfortran_reshape_c4’ was previously declared here
  data_block%p(:,:) = RESHAPE (block, (/row_size, col_size/))


/data/vjoost/gnu/cp2k/cp2k/makefiles/../src/dbcsr_lib/dbcsr_block_access_c.F:709:0:
note: code may be misoptimized unless -fno-strict-aliasing is used
/data/vjoost/gnu/cp2k/cp2k/makefiles/../src/helium_methods.F:1071:0: warning:
type of ‘_gfortran_unpack1’ does not match original declaration
[-Wlto-type-mismatch]
 helium%pos(:,:,:) = UNPACK(message(offset+1:offset+msglen), MASK=m,
FIELD=f )


/data/vjoost/gnu/cp2k/cp2k/makefiles/../src/helium_methods.F:1699:0: warning:
type of ‘_gfortran_unpack1’ does not match original declaration
[-Wlto-type-mismatch]
 helium%rho_rstr(:,:,:,:) = UNPACK(message(1:msglen), MASK=m, FIELD=f)


/data/vjoost/gnu/cp2k/cp2k/makefiles/../src/helium_methods.F:1547:0: note:
‘_gfortran_unpack1’ was previously declared here
   ig(:,:) = UNPACK(message(offset+33:offset+38), MASK=m, FIELD=f )


/data/vjoost/gnu/cp2k/cp2k/makefiles/../src/helium_methods.F:1547:0: note: code
may be misoptimized unless -fno-strict-aliasing is used
/data/vjoost/gnu/cp2k/cp2k/makefiles/../src/dbcsr_lib/dbcsr_block_access_s.F:709:0:
warning: type of ‘_gfortran_reshape_r4’ does not match original declaration
[-Wlto-type-mismatch]
  data_block%p(:,:) = RESHAPE (block, (/row_size, col_size/))


/data/vjoost/gnu/cp2k/cp2k/makefiles/../src/dbcsr_lib/dbcsr_block_access_s.F:709:0:
warning: type of ‘_gfortran_reshape_r4’ does not match original declaration
[-Wlto-type-mismatch]

[Bug middle-end/68649] [6 Regression] note: code may be misoptimized unless -fno-strict-aliasing is used

2015-12-01 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68649

Joost VandeVondele  changed:

   What|Removed |Added

 CC||hubicka at gcc dot gnu.org,
   ||Joost.VandeVondele at mat dot ethz.ch
Summary|note: code may be   |[6 Regression] note: code
   |misoptimized unless |may be misoptimized unless
   |-fno-strict-aliasing is |-fno-strict-aliasing is
   |used|used

--- Comment #1 from Joost VandeVondele  
---
testcase

> cat foo.f90 
SUBROUTINE s8(a8,b8)
 REAL*8 :: a8(16),b8(4,4)
 a8=RESHAPE(b8,(/16/))
 b8=RESHAPE(a8,(/4,4/))
END SUBROUTINE
 REAL*8 :: a8(16),b8(4,4)
 CALL RANDOM_NUMBER(b8)
 CALL s8(a8,b8)
 WRITE(6,*) MAXVAL(a8),MAXVAL(b8)
END

> gfortran  -flto -O3 foo.f90 
foo.f90:3:0: warning: type of ‘_gfortran_reshape_r8’ does not match original
declaration [-Wlto-type-mismatch]
  a8=RESHAPE(b8,(/16/))


foo.f90:4:0: note: ‘_gfortran_reshape_r8’ was previously declared here
  b8=RESHAPE(a8,(/4,4/))


foo.f90:4:0: note: code may be misoptimized unless -fno-strict-aliasing is used

[Bug fortran/68649] [6 Regression] note: code may be misoptimized unless -fno-strict-aliasing is used

2015-12-01 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68649

--- Comment #3 from Joost VandeVondele  
---
Grepping the list of 'note:' in our build process, it triggers for at least
these functions:

_gfortran_matmul_r8
_gfortran_reshape_4
_gfortran_reshape_c4
_gfortran_reshape_c8
_gfortran_reshape_r4
_gfortran_reshape_r8
_gfortran_unpack1

so it is somewhat general.

[Bug fortran/38312] Unexpected STATEMENT FUNCTION statement

2015-12-01 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38312

Joost VandeVondele  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #10 from Joost VandeVondele  
---
let's go for fixed.

[Bug tree-optimization/68639] New: [6 Regression] ICE: Floating point exception

2015-12-01 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68639

Bug ID: 68639
   Summary: [6 Regression] ICE: Floating point exception
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

trunk regression:

> cat bug.f90
  SUBROUTINE makeCoulE0(natorb,Coul)
    INTEGER, PARAMETER :: dp=8
    REAL(KIND=dp), PARAMETER :: fourpi=432.42, oorootpi=13413.3142
    INTEGER :: natorb
    REAL(KIND=dp), DIMENSION(45, 45), &
      INTENT(OUT)                            :: Coul
    INTEGER                                  :: gpt, imA, imB, k1, k2, k3, &
                                                k4, lp, mp, np
    REAL(KIND=dp)                            :: alpha, d2f(3,3), &
                                                d4f(3,3,3,3), f, ff, w
    REAL(KIND=dp), DIMENSION(3, 45)          :: M1A
    REAL(KIND=dp), DIMENSION(45)             :: M0A
    DO imA=1, (natorb*(natorb+1))/2
       DO imB=1, (natorb*(natorb+1))/2
          w= M0A(imA)*M0A(imB)
          DO k1=1,3
            w=w+ M1A(k1,imA)*M1A(k1,imB)
          ENDDO
          Coul(imA,imB)=Coul(imA,imB)-4.0_dp*alpha**3*oorootpi*w/3.0_dp
       ENDDO
    ENDDO
  END SUBROUTINE makeCoulE0

> gfortran -c  -O3 bug.f90
bug.f90:1:0:

   SUBROUTINE makeCoulE0(natorb,Coul)


internal compiler error: Floating point exception
0xb793ff crash_signal
/data/vjoost/toolchain-trunk/build/gcc-master/gcc/toplev.c:334
0xdab9b6 vectorizable_load
   
/data/vjoost/toolchain-trunk/build/gcc-master/gcc/tree-vect-stmts.c:6292
0xdb2ee9 vect_analyze_stmt(gimple*, bool*, _slp_tree*)
   
/data/vjoost/toolchain-trunk/build/gcc-master/gcc/tree-vect-stmts.c:8009
0xdc24ca vect_analyze_loop_operations
/data/vjoost/toolchain-trunk/build/gcc-master/gcc/tree-vect-loop.c:1711
0xdc24ca vect_analyze_loop_2
/data/vjoost/toolchain-trunk/build/gcc-master/gcc/tree-vect-loop.c:1998
0xdc24ca vect_analyze_loop(loop*)
/data/vjoost/toolchain-trunk/build/gcc-master/gcc/tree-vect-loop.c:2248
0xdd6059 vectorize_loops()
/data/vjoost/toolchain-trunk/build/gcc-master/gcc/tree-vectorizer.c:532
Please submit a full bug report,

> gfortran -v bug.f90
Driving: gfortran -v bug.f90 -l gfortran -l m -shared-libgcc
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/data/vjoost/toolchain-trunk/install/libexec/gcc/x86_64-pc-linux-gnu/6.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /data/vjoost/toolchain-trunk/build/gcc-master/configure
--prefix=/data/vjoost/toolchain-trunk/install --enable-languages=c,c++,fortran
--disable-multilib --disable-bootstrap --enable-lto --enable-plugins
Thread model: posix
gcc version 6.0.0 20151201 (experimental) (GCC)

[Bug fortran/68600] Inlined MATMUL is too slow.

2015-11-30 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68600

Joost VandeVondele  changed:

   What|Removed |Added

 CC||Joost.VandeVondele at mat dot ethz.ch

--- Comment #7 from Joost VandeVondele  
---
(In reply to Dominique d'Humieres from comment #6)
> Note a problem when 16x16 matrices are inlined with -mavx (I'll investigate
> and file a PR for it).

that's a good find!

I ran this locally on Haswell and get these numbers, which include openblas and
libxsmm.

./a.out
 Size Loops  Matmul   newmatmul dgemm-like dgemm
  fixed explicit  internal  libxsmm openblas
 =
220   1.562   0.107   0.104   0.139
420   6.781   0.779   1.012   0.887
820   7.424   3.360   6.150   4.732
   1620   2.954   7.290  14.421  11.527
   3220  10.401  10.251  24.396  18.071
   64 30757  12.696  14.196  27.385  24.547
  128  3829   8.646  17.684  31.460  31.530
  256   477   7.834  19.123  37.457  37.471
  51259   8.064  19.473  40.738  40.755
 1024 7   8.334  19.475  40.931  41.112
 2048 1   3.042  19.157  41.225  41.279


So the 'newmatmul' code gets about 50% of peak. Inlined matmul is good up to
size 8/16, for 16-64 libxsmm wins, and for >64 openblas is better. For the small
sizes the difference is mostly due to the eliminated call overhead, I think.
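
The crossover points can be checked mechanically from the quoted figures. The sketch below is only an illustration and rests on two assumptions about the flattened table: that the four data columns are, in order, inlined MATMUL, newmatmul, libxsmm and openblas, and that the rows with a fused first field are sizes 2 to 32. The program name and variable names are made up for this example.

```fortran
! Which implementation is fastest at each size, using the figures quoted above.
! Assumption: the four data columns are inlined MATMUL, newmatmul, libxsmm and
! openblas, in that order, and the fused first fields are sizes 2..32.
PROGRAM matmul_crossover
  IMPLICIT NONE
  INTEGER, PARAMETER :: dp = KIND(1.0D0)
  INTEGER, PARAMETER :: nimpl = 4, nsize = 11
  CHARACTER(LEN=9), PARAMETER :: label(nimpl) = &
    (/ 'inlined  ', 'newmatmul', 'libxsmm  ', 'openblas ' /)
  INTEGER, PARAMETER :: msize(nsize) = &
    (/ 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048 /)
  REAL(KIND=dp) :: rate(nimpl, nsize)
  INTEGER :: i, best

  ! One table row per source line; RESHAPE stores row i of the table
  ! in rate(:,i) because Fortran fills arrays in column-major order.
  rate = RESHAPE( (/ &
     1.562_dp,  0.107_dp,  0.104_dp,  0.139_dp, &
     6.781_dp,  0.779_dp,  1.012_dp,  0.887_dp, &
     7.424_dp,  3.360_dp,  6.150_dp,  4.732_dp, &
     2.954_dp,  7.290_dp, 14.421_dp, 11.527_dp, &
    10.401_dp, 10.251_dp, 24.396_dp, 18.071_dp, &
    12.696_dp, 14.196_dp, 27.385_dp, 24.547_dp, &
     8.646_dp, 17.684_dp, 31.460_dp, 31.530_dp, &
     7.834_dp, 19.123_dp, 37.457_dp, 37.471_dp, &
     8.064_dp, 19.473_dp, 40.738_dp, 40.755_dp, &
     8.334_dp, 19.475_dp, 40.931_dp, 41.112_dp, &
     3.042_dp, 19.157_dp, 41.225_dp, 41.279_dp /), (/ nimpl, nsize /) )

  DO i = 1, nsize
    ! Index of the largest figure in this row, i.e. the fastest implementation.
    best = MAXLOC(rate(:,i), DIM=1)
    WRITE(*,'(I6,2X,A9,2X,F6.1,A)') msize(i), label(best), &
      100.0_dp*rate(2,i)/MAXVAL(rate(:,i)), ' % (newmatmul vs. best)'
  END DO
END PROGRAM matmul_crossover
```

Under those assumptions the first column wins up to size 8, the third from 16 to 64 and the fourth from 128 on, and the second column ends up a little under half of the best figure for the large sizes.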

[Bug bootstrap/68540] 6.0 build process broken on Linux Mint, potential include ordering problem

2015-11-27 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68540

Joost VandeVondele  changed:

   What|Removed |Added

 CC||Joost.VandeVondele at mat dot ethz.ch

--- Comment #6 from Joost VandeVondele  
---
Graphite without a recent ISL is not a good idea: with ISL before 0.15 (and
certainly with 0.12), graphite could lead to exponential memory and CPU time
requirements, see

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53852#c19

[Bug middle-end/68565] New: [6 Regression] graphite : -O2 -floop-nest-optimize miscompile

2015-11-26 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68565

Bug ID: 68565
   Summary: [6 Regression] graphite : -O2 -floop-nest-optimize
miscompile
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

Current trunk leads to wrong code for the following testcase if
-floop-nest-optimize is used:

> cat test.f90
MODULE test
  IMPLICIT NONE
  TYPE subset_type
 INTEGER:: ncon_tot
 REAL(KIND=8),DIMENSION(:,:),ALLOCATABLE   :: coeff
  END TYPE
CONTAINS
  SUBROUTINE foo(subset)
  TYPE(subset_type):: subset
  INTEGER :: icon1
  DO icon1=1,subset%ncon_tot
   subset%coeff(:,icon1)=subset%coeff(:,icon1)/&
SQRT(DOT_PRODUCT(subset%coeff(:,icon1),subset%coeff(:,icon1)))
  END DO
  END SUBROUTINE
END MODULE

USE test
TYPE(subset_type):: subset
INTEGER, VOLATILE :: n1=7,n2=4
ALLOCATE(subset%coeff(n1,n2))
CALL RANDOM_NUMBER(subset%coeff)
subset%coeff=subset%coeff-0.5
subset%ncon_tot=n2
CALL foo(subset)
WRITE(6,*) MAXVAL(subset%coeff)
END

> gfortran -g -O2 -floop-nest-optimize test.f90 && ./a.out

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic
operation.

Backtrace for this error:
#0  0x302a43269f in ???
#1  0x40095a in __test_MOD_foo
at /data/vjoost/gnu/bugs/test.f90:13
#2  0x400bb8 in MAIN__
at /data/vjoost/gnu/bugs/test.f90:25
#3  0x4007dc in main
at /data/vjoost/gnu/bugs/test.f90:18
Floating point exception

> gfortran -g -O2 test.f90 && ./a.out
  0.75457607554184802 

> gfortran -v
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/data/vjoost/gnu/gcc_trunk/install/libexec/gcc/x86_64-pc-linux-gnu/6.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --prefix=/data/vjoost/gnu/gcc_trunk/install
--enable-languages=c,c++,fortran --disable-multilib --enable-plugins
--enable-lto --disable-bootstrap
Thread model: posix
gcc version 6.0.0 20151126 (experimental) [trunk revision 230923] (GCC)

[Bug middle-end/68565] [6 Regression] graphite : -O2 -floop-nest-optimize miscompile

2015-11-26 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68565

Joost VandeVondele  changed:

   What|Removed |Added

   Last reconfirmed||2015-11-26
 CC||Joost.VandeVondele at mat dot ethz.ch,
   ||spop at gcc dot gnu.org
   Target Milestone|--- |6.0
  Known to fail||6.0

[Bug middle-end/68575] New: [6 Regression] ice: verify_ssa failed, definition in block 2 follows the use

2015-11-26 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68575

Bug ID: 68575
   Summary: [6 Regression] ice: verify_ssa failed, definition in
block 2 follows the use
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

overnight regression with current trunk:

> cat bug.f90
MODULE qs_efield_berry
  TYPE cp_error_type
  END TYPE
  INTEGER, PARAMETER :: dp=8
  TYPE qs_energy_type
REAL(KIND=dp), POINTER :: efield
  END TYPE
  TYPE qs_environment_type
  END TYPE
  INTERFACE 
SUBROUTINE foo(qs_env,energy,error)
   IMPORT 
   TYPE(qs_environment_type), POINTER :: qs_env
   TYPE(cp_error_type)  :: error
   TYPE(qs_energy_type), POINTER   :: energy
END SUBROUTINE
  END INTERFACE
CONTAINS
  SUBROUTINE qs_efield_mo_derivatives()
TYPE(qs_environment_type), POINTER :: qs_env
TYPE(cp_error_type)  :: error
COMPLEX(dp)  ::   zi(3), zphase(3)
REAL(dp) :: ci(3)
TYPE(qs_energy_type), POINTER  :: energy
CALL foo(qs_env, energy, error)
zi = zi * zphase
ci = AIMAG(LOG(zi))
DO idir=1,3
   ener_field=ener_field+ci(idir)*fieldfac(idir)
END DO
energy%efield=ener_field
  END SUBROUTINE qs_efield_mo_derivatives
END MODULE qs_efield_berry

> gfortran  -c -O3 bug.f90
bug.f90:19:0:

   SUBROUTINE qs_efield_mo_derivatives()


Error: definition in block 2 follows the use
for SSA_NAME: _65 in statement:
vectp.27_93 = _EXPR <zi[_65]>;
bug.f90:19:0: internal compiler error: verify_ssa failed
0xd78664 verify_ssa(bool, bool)
../../gcc/gcc/tree-ssa.c:1039
0xa9346d execute_function_todo
../../gcc/gcc/passes.c:1965
0xa93e0b execute_todo
../../gcc/gcc/passes.c:2010
Please submit a full bug report,
with preprocessed source if appropriate.

> gfortran -v
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/data/vjoost/gnu/gcc_trunk/install/libexec/gcc/x86_64-pc-linux-gnu/6.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --prefix=/data/vjoost/gnu/gcc_trunk/install
--enable-languages=c,c++,fortran --disable-multilib --enable-plugins
--enable-lto --disable-bootstrap
Thread model: posix
gcc version 6.0.0 20151127 (experimental) [trunk revision 230990] (GCC)

[Bug tree-optimization/68453] [6 Regression] graphite ICE: segfault

2015-11-26 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68453

--- Comment #3 from Joost VandeVondele  
---
At r230923 this testcase no longer seems to fail. Should this be closed as
fixed (maybe after adding the testcase)?

I ran into a new ICE, however; I'll open a PR once I have it reduced.

[Bug tree-optimization/68550] New: [6 Regression] ICE: verify_gimple failed Error: missing PHI def

2015-11-26 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68550

Bug ID: 68550
   Summary: [6 Regression] ICE: verify_gimple failed Error:
missing PHI def
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Joost.VandeVondele at mat dot ethz.ch
  Target Milestone: ---

graphite-triggered ICE in current trunk.

> cat bug.f90
  SUBROUTINE integrate_core_1(grid,coef_xyz,pol_x,pol_y,&
 pol_z,map,sphere_bounds,cmax,gridbounds)
INTEGER, PARAMETER :: dp=8
INTEGER, INTENT(IN):: sphere_bounds(*), cmax, &
  map(-cmax:cmax,1:3), &
  gridbounds(2,3)
REAL(dp), INTENT(IN) :: grid(gridbounds(1,1):gridbounds(2,1), &
 gridbounds(1,2):gridbounds(2,2),&
 gridbounds(1,3):gridbounds(2,3))
INTEGER, PARAMETER :: lp = 1
REAL(dp), INTENT(IN)   :: pol_x(0:lp,-cmax:cmax), &
  pol_y(1:2,0:lp,-cmax:0), &
  pol_z(1:2,0:lp,-cmax:0)
REAL(dp), INTENT(OUT) :: coef_xyz(((lp+1)*(lp+2)*(lp+3))/6)
INTEGER   :: i, ig, igmax, igmin, j, j2, &
 jg, jg2, jgmin, k, k2, kg, &
 kg2, kgmin, lxp, sci
REAL(dp)  :: coef_x(4,0:lp), &
 coef_xy(2,((lp+1)*(lp+2))/2), &
 s(4)
DO kg=kgmin,0
   DO jg=jgmin,0
  coef_x=0.0_dp
  DO ig=igmin,igmax
 DO lxp=0,lp
coef_x(:,lxp)=coef_x(:,lxp)+s(:)*pol_x(lxp,ig)
 ENDDO
  END DO
 coef_xy(:,3)=coef_xy(:,3)+coef_x(3:4,0)*pol_y(2,1,jg)
   END DO
coef_xyz(3)=coef_xyz(3)+coef_xy(1,3)*pol_z(1,0,kg)
END DO
  END SUBROUTINE integrate_core_1

> gfortran -c -O2 -floop-nest-optimize  bug.f90
bug.f90:1:0:

   SUBROUTINE integrate_core_1(grid,coef_xyz,pol_x,pol_y,&


Error: missing PHI def
bug.f90:1:0: Error: missing PHI def
prephitmp_133 = PHI <(8), (16)>
bug.f90:1:0: internal compiler error: verify_gimple failed
0xbab969 verify_gimple_in_cfg(function*, bool)
../../gcc/gcc/tree-cfg.c:5082
0xa92b8c execute_function_todo
../../gcc/gcc/passes.c:1958
0xa9361b execute_todo
../../gcc/gcc/passes.c:2010
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.

> gfortran -v
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/data/vjoost/gnu/gcc_trunk/install/libexec/gcc/x86_64-pc-linux-gnu/6.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --prefix=/data/vjoost/gnu/gcc_trunk/install
--enable-languages=c,c++,fortran --disable-multilib --enable-plugins
--enable-lto --disable-bootstrap
Thread model: posix
gcc version 6.0.0 20151126 (experimental) [trunk revision 230923] (GCC)

[Bug tree-optimization/68453] [6 Regression] graphite ICE: segfault

2015-11-26 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68453

--- Comment #5 from Joost VandeVondele  
---
(In reply to Sebastian Pop from comment #4)
> fixed in r230918

Thanks!

I think if you make the changelog part of the commit message (in particular the
line containing PR tree-optimization/68453) an entry should appear
automatically in bugzilla (or else I don't quite know why this doesn't happen
in your case, but normally it does).

[Bug tree-optimization/68379] [6 Regression] BB vectorization: definition in block 13 follows the use for SSA_NAME

2015-11-26 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68379

--- Comment #2 from Joost VandeVondele  
---
(In reply to Markus Trippelsdorf from comment #1)
> *** Bug 68575 has been marked as a duplicate of this bug. ***

The testcase in PR68575 doesn't require avx, so might be useful for the
testsuite.

[Bug tree-optimization/68379] [6 Regression] BB vectorization: definition in block 13 follows the use for SSA_NAME

2015-11-26 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68379

Joost VandeVondele  changed:

   What|Removed |Added

 Target|aarch64 |aarch64,
   ||x86_64-pc-linux-gnu
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2015-11-27
 Ever confirmed|0   |1

--- Comment #3 from Joost VandeVondele  
---
additionally, the target is x86_64-pc-linux-gnu

[Bug rtl-optimization/68432] [6 Regression] internal compiler error: in expand_insn, at optabs.c:6947

2015-11-25 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68432

--- Comment #11 from Joost VandeVondele  
---
(In reply to rsand...@gcc.gnu.org from comment #10)
> Series finally posted here:
> 
>   https://gcc.gnu.org/ml/gcc-patches/2015-11/msg03020.html
> 
> Sorry for the delay.

Many thanks for fixing this seemingly non-trivial issue!

[Bug libfortran/51119] MATMUL slow for large matrices

2015-11-24 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51119

--- Comment #29 from Joost VandeVondele  
---
(In reply to Thomas Koenig from comment #27)
> (In reply to Joost VandeVondele from comment #22)
> If the compiler turns out not to be reasonably smart, file a bug report :-)

What is needed for large matrices (in my opinion) is some simple loop tiling,
as can, in principle, be achieved with graphite: this is my PR14741.
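
To make "simple loop tiling" concrete, here is a hand-written sketch of a cache-blocked matrix multiply. It is not what graphite or libgfortran generates; the tile size bs, the matrix size n and all names are illustrative assumptions chosen only so the example runs quickly.

```fortran
! Sketch of a cache-blocked (tiled) matrix multiply, C = A*B.
! The tile size "bs" and the matrix size "n" are illustrative assumptions,
! not values taken from libgfortran or from the bug report.
PROGRAM tiled_matmul
  IMPLICIT NONE
  INTEGER, PARAMETER :: dp = KIND(1.0D0)
  INTEGER, PARAMETER :: n = 200, bs = 32
  REAL(KIND=dp), ALLOCATABLE :: a(:,:), b(:,:), c(:,:)
  INTEGER :: i, j, k, jj, kk

  ALLOCATE(a(n,n), b(n,n), c(n,n))
  CALL RANDOM_NUMBER(a)
  CALL RANDOM_NUMBER(b)
  c = 0.0_dp

  ! Walk over tiles of the column index (jj) and the summation index (kk),
  ! so the columns of A touched for one kk tile are reused for every column
  ! j of the current tile of C before moving on.
  DO jj = 1, n, bs
    DO kk = 1, n, bs
      DO j = jj, MIN(jj+bs-1, n)
        DO k = kk, MIN(kk+bs-1, n)
          ! Innermost loop runs down a column: unit stride in Fortran.
          DO i = 1, n
            c(i,j) = c(i,j) + a(i,k)*b(k,j)
          END DO
        END DO
      END DO
    END DO
  END DO

  ! Compare against the intrinsic; the difference should be rounding noise.
  WRITE(*,*) 'max abs deviation from MATMUL:', MAXVAL(ABS(c - MATMUL(a,b)))
END PROGRAM tiled_matmul
```

The MIN bounds handle sizes that are not a multiple of the tile size. A tuned kernel would also tile the row index and pick bs from the cache sizes, which this sketch leaves out.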

Good vectorization, which gcc already does well, just requires the proper
compiler options for the matmul implementation, i.e. '-O3 -march=native
-ffast-math'. However, this would require the Fortran runtime to be compiled
with such options, or at least a way to provide specialized (avx2 etc)
routines.

There is however the related PR (inner loop of matmul) : PR25621, where some
unusual flag combo helps (-fvariable-expansion-in-unroller -funroll-loops)

I think external blas and inlining of small matmuls are good things, but I
would expect the default library implementation to reach at least 50% of peak
(for e.g. a 4000x4000 matrix), which is not all that hard. It would be worth
an experiment, but a Fortran loop nest that implements a matmul and is compiled
with ifort would presumably reach that or higher :-).

These slides show how to reach 90% of peak:
http://wiki.cs.utexas.edu/rvdg/HowToOptimizeGemm/
the code actually is not too ugly, and I think there is no need for the
explicit vector intrinsics with gcc.

I believe I once had a bug report open for small matrices, but this might have
been at least partly fixed in the meantime.

[Bug sanitizer/59302] tsan: Unexpected mmap in InternalAllocator!

2015-11-23 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59302

Joost VandeVondele  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Joost VandeVondele  
---
I suppose this is fixed by now, haven't seen it again.

[Bug libfortran/51119] MATMUL slow for large matrices

2015-11-23 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51119

--- Comment #22 from Joost VandeVondele  
---
(In reply to Thomas Koenig from comment #21)
> I assume that for  small matrices bordering on the silly
> (say, a matrix multiplication with dimensions of (1,2) and (2,1))
> the inline code will be faster if the code is compiled with the
> right options, due to function call overhead.  I also assume that
> libxsmm will become faster quite soon for bigger sizes.
> 
> Do you have an idea where the crossover is?

I agree that inline code should be faster if the compiler is reasonably smart
and the matrix dimensions are known at compile time (i.e. it should be able to
generate the same kernel). I haven't checked yet.
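
As a small illustration of "dimensions known at compile time", the sketch below contrasts a routine whose array bounds are a named constant with one that receives the size as an argument. The module and routine names and the size n = 4 are assumptions made for this example; whether a given compiler actually inlines or unrolls either version depends on its version and options, which the sketch does not test.

```fortran
MODULE small_mm
  IMPLICIT NONE
  INTEGER, PARAMETER :: dp = KIND(1.0D0)
  INTEGER, PARAMETER :: n = 4          ! illustrative compile-time size
CONTAINS
  ! Bounds are compile-time constants: the compiler sees the full trip counts.
  SUBROUTINE mm_fixed(a, b, c)
    REAL(KIND=dp), INTENT(IN)  :: a(n,n), b(n,n)
    REAL(KIND=dp), INTENT(OUT) :: c(n,n)
    c = MATMUL(a, b)
  END SUBROUTINE mm_fixed

  ! Bounds arrive at run time: the same statement, but with unknown trip counts.
  SUBROUTINE mm_runtime(m, a, b, c)
    INTEGER, INTENT(IN)        :: m
    REAL(KIND=dp), INTENT(IN)  :: a(m,m), b(m,m)
    REAL(KIND=dp), INTENT(OUT) :: c(m,m)
    c = MATMUL(a, b)
  END SUBROUTINE mm_runtime
END MODULE small_mm

PROGRAM compare_small_mm
  USE small_mm
  IMPLICIT NONE
  REAL(KIND=dp) :: a(n,n), b(n,n), c1(n,n), c2(n,n)

  CALL RANDOM_NUMBER(a)
  CALL RANDOM_NUMBER(b)
  CALL mm_fixed(a, b, c1)
  CALL mm_runtime(n, a, b, c2)
  ! Both routines compute the same product; only the information available
  ! to the optimizer differs.
  WRITE(*,*) 'max abs difference:', MAXVAL(ABS(c1 - c2))
END PROGRAM compare_small_mm
```

Timing the two routines in a loop, or inspecting the generated assembly, would be one way to check where the crossover asked about above lies on a given machine.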

[Bug middle-end/68279] ICE: in create_pw_aff_from_tree, at graphite-sese-to-poly.c:836

2015-11-23 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68279

--- Comment #6 from Joost VandeVondele  
---
(In reply to Sebastian Pop from comment #5)
> After fixing the graphite fail, I get these warnings from the testcase in

Thanks, these are due to the testcase reduction stripping variable definitions.

> Is there a flag I can set to avoid these warnings?

gfortran -c -std=legacy t.f90

[Bug libfortran/51119] MATMUL slow for large matrices

2015-11-22 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51119

--- Comment #20 from Joost VandeVondele  
---
(In reply to Jerry DeLisle from comment #19)
> If I can get something working I am thinking something like
> -fexternal-blas-n, if -n not given then default to current libblas
> behaviour. This way users have some control. With GPUs, it is not unusual to
> have hundreds of cores.  We can also, at run time, see if the opencl is
> already initialized which may mean used elsewhere so don't mess with it.

Hiding it behind a -fexternal-blas-n switch might be an option. Including GPUs
seems even a tad more tricky. We have a paper on GPU (small) matrix
multiplication: http://dbcsr.cp2k.org/_media/gpu_book_chapter_submitted.pdf .
BTW, another interesting project is the libxsmm library, which is aimed more at
small (<128) matrices; see https://github.com/hfp/libxsmm . Not sure if this
info is useful in this context, but it might provide inspiration.

[Bug libfortran/51119] MATMUL slow for large matrices

2015-11-22 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51119

--- Comment #18 from Joost VandeVondele  
---
(In reply to Jerry DeLisle from comment #17)
> I have done some experimenting.  Since gcc supports OMP and I think to some
> extent ACC why not come up with a MATMUL that exploits these if present?  On
> the darwin platform discussed in comment #12, the performance is excellent. 
> Does darwin implementation provided exploit OpenCL?  What is it using?  Why
> not enable that on other platforms if present.
> 
> I am going to explore OpenCL and clBLAS to see if I can get it to work.  If
> I am successful, I would like to hide it behind MATMUL if possible.  Any
> other opinions?

yes, this is tricky. In a multithreaded code executing matmul, what is the
strategy (nested parallelism, serial, ...) ? We usually link in a serial blas
because threading in the library is usually not good for performance of the
code overall, i.e. nested parallelism tends to perform badly. Also, how many
threads would you use by default (depending on matrix size, machine load) ?
Users on an N core machine might run N jobs in parallel, and not expect those
to start several threads each. 

Maybe this could be part of the auto-parallelize (or similar) option that gcc
has?
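
The usage pattern described above, threading in the application with a serial matmul inside each thread, might look like the following sketch. The block count, block size and program name are assumptions for illustration. Built without OpenMP support the directives are plain comments and the loop simply runs serially.

```fortran
! Application-level threading over independent small products, with a
! serial MATMUL inside each iteration (no nested parallelism).
! Sizes are illustrative assumptions.
PROGRAM serial_matmul_per_thread
  IMPLICIT NONE
  INTEGER, PARAMETER :: dp = KIND(1.0D0)
  INTEGER, PARAMETER :: n = 16, nblocks = 64
  REAL(KIND=dp), ALLOCATABLE :: a(:,:,:), b(:,:,:), c(:,:,:)
  INTEGER :: i

  ALLOCATE(a(n,n,nblocks), b(n,n,nblocks), c(n,n,nblocks))
  CALL RANDOM_NUMBER(a)
  CALL RANDOM_NUMBER(b)

  ! Each iteration writes its own slice c(:,:,i), so there are no data races.
  !$OMP PARALLEL DO DEFAULT(NONE) SHARED(a, b, c) PRIVATE(i)
  DO i = 1, nblocks
    c(:,:,i) = MATMUL(a(:,:,i), b(:,:,i))
  END DO
  !$OMP END PARALLEL DO

  WRITE(*,*) 'checksum:', SUM(c)
END PROGRAM serial_matmul_per_thread
```

If MATMUL itself started threads here, every one of the outer threads would spawn its own team, which is the oversubscription scenario the comment warns about.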

[Bug rtl-optimization/68432] [6 Regression] internal compiler error: in expand_insn, at optabs.c:6947

2015-11-19 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68432

Joost VandeVondele  changed:

   What|Removed |Added

 CC||Joost.VandeVondele at mat dot ethz.ch

--- Comment #2 from Joost VandeVondele  
---
(In reply to rsand...@gcc.gnu.org from comment #1)
> Just to check: is this x86_64-linux-gnu?

Yes, it is: x86_64-pc-linux-gnu

[Bug middle-end/68279] ICE: in create_pw_aff_from_tree, at graphite-sese-to-poly.c:836

2015-11-19 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68279

Joost VandeVondele  changed:

   What|Removed |Added

   Last reconfirmed||2015-11-20

--- Comment #4 from Joost VandeVondele  
---
still happens at r230637

I notice the Fortran testcase is missing its last line; for completeness:

> cat PR68279.f90

MODULE dbcsr_mm_accdrv
  INTEGER, SAVE :: accdrv_binning_nbins = 4096
  INTEGER, SAVE :: accdrv_binning_binsize = 16
  INTEGER, PARAMETER, PUBLIC :: dbcsr_ps_width = 7
  CONTAINS
  SUBROUTINE stack_binning(params_in, params_out, stack_size)
    INTEGER, INTENT(IN)                      :: stack_size
    INTEGER, DIMENSION(dbcsr_ps_width, &
      stack_size), INTENT(OUT)               :: params_out
    INTEGER, DIMENSION(dbcsr_ps_width, &
      stack_size), INTENT(IN)                :: params_in
    INTEGER, DIMENSION(accdrv_binning_nbins) :: bin_top
    INTEGER, DIMENSION(dbcsr_ps_width)       :: val
    INTEGER, DIMENSION(dbcsr_ps_width, &
      accdrv_binning_binsize, &
      accdrv_binning_nbins)                  :: bin_arr
    DO i=1,stack_size
       val(:) = params_in(:,i)
       IF(bin_top(bin_id) > accdrv_binning_binsize) THEN
          params_out(:, top:top+bin_top(bin_id)-2) = bin_arr(:, 1:bin_top(bin_id)-1, bin_id)
       ENDIF
       bin_arr(:, bin_top(bin_id), bin_id) =  val(:)
       bin_top(bin_id) = bin_top(bin_id) + 1
    END DO
  END SUBROUTINE  stack_binning
END MODULE
