[Bug target/96559] Wrong code with -march=z900 -mtune=z9-109

2020-08-11 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96559

--- Comment #1 from Ulrich Weigand  ---
> [...] as __clzdi2 points to the very same place as _Z11CeilingLog2v.

How do you get to that conclusion?  Nothing in that assembler source sets
__clzdi2 to point to the same place as _Z11CeilingLog2v.  The ".globl" simply
declares that there is a globally visible definition, from someplace outside
this file.

And in fact if I compile all the way to an object file and look at it using
"objdump --disassemble --reloc", I see:

 <_Z11CeilingLog2v>:
   0:   eb cf f0 60 00 24       stmg    %r12,%r15,96(%r15)
   6:   a7 fb ff 60             aghi    %r15,-160
   a:   c0 c0 00 00 00 00       larl    %r12,a <_Z11CeilingLog2v+0xa>
                        c: R_390_PC32DBL        .bss+0x2
  10:   e3 20 c0 00 00 04       lg      %r2,0(%r12)
  16:   c0 e5 00 00 00 00       brasl   %r14,16 <_Z11CeilingLog2v+0x16>
                        18: R_390_PC32DBL       __clzdi2+0x2
  1c:   42 20 c0 08             stc     %r2,8(%r12)
  20:   e3 40 f1 10 00 04       lg      %r4,272(%r15)
  26:   eb cf f1 00 00 04       lmg     %r12,%r15,256(%r15)
  2c:   07 f4                   br      %r4

And "nm" shows:
 U __clzdi2
0008 B compute___trans_tmp_3
 B CountLeadingZeroes64_aValue
 T _Z11CeilingLog2v

So if the call to __clzdi2 ends up going to the wrong place, something must
have gone wrong at the link stage.
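
For illustration only: the symbol names above suggest source roughly along
the following lines, although the actual test case is not shown in this
comment.  In this sketch (both function names are hypothetical) a ceiling
log2 is computed via __builtin_clzll.  On targets without a
count-leading-zeros instruction, GCC typically lowers that builtin to a call
to libgcc's __clzdi2, which would explain the undefined "U __clzdi2"
reference in the object file.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of leading zero bits in a nonzero 64-bit
   value.  __builtin_clzll is undefined for 0, hence the precondition.  */
static unsigned count_leading_zeroes64 (uint64_t value)
{
  assert (value != 0);
  return (unsigned) __builtin_clzll (value);
}

/* Hypothetical helper: smallest n such that 2^n >= value (0 for 0 and 1).  */
unsigned ceiling_log2 (uint64_t value)
{
  if (value <= 1)
    return 0;
  return 64u - count_leading_zeroes64 (value - 1);
}
```

Compiling such a file to an object and running "nm" on it is one way to
check whether the builtin was expanded inline or left as an external call.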

[Bug target/56184] [4.8 Regression] Internal compiler error in push_reload during bootstrap stage 2

2013-02-05 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56184

Ulrich Weigand  changed:

   What|Removed |Added

 CC||uweigand at gcc dot gnu.org

--- Comment #4 from Ulrich Weigand 2013-02-05 13:51:24 UTC ---
This is weird; I cannot reproduce the behaviour even with the exact configure
and command lines you specify.  I've been using SVN rev. 195717; which revision
do you see the problem with?

In the generated test.ii.208r.ira file I get, I see different register uses
even before IRA, compared to your version.

Would you mind sending me (offline) a full set of the dump files so I can see
where my compile run starts to diverge from yours?


[Bug target/56184] [4.8 Regression] Internal compiler error in push_reload during bootstrap stage 2

2013-02-06 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56184

--- Comment #5 from Ulrich Weigand 2013-02-06 19:27:31 UTC ---
Depending on configure tests of the installed (cross-)assembler, the ICE may
not occur.  In those cases, I'm now able to reliably reproduce the ICE by using
-fno-section-anchors (in addition to the flags given above).


[Bug target/56184] [4.8 Regression] Internal compiler error in push_reload during bootstrap stage 2

2013-02-06 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56184

Ulrich Weigand  changed:

   What|Removed |Added

 CC||vmakarov at gcc dot gnu.org

--- Comment #6 from Ulrich Weigand 2013-02-06 19:40:30 UTC ---
The problem occurs with the following insn:

(insn 539 383 384 46
 (set (reg:DI 355 [313]) (const_int 256 [0x100]))
 test.ii:128 643 {*movdi_vfp}
 (expr_list:REG_EQUIV (const_int 256 [0x100])
 (nil)))

Register 355 is recognized as always-equal to the constant 256, and insn 539 is
the insn that originally sets up the equivalence.  If the register doesn't get
a hard reg, what ought to happen is that users of reg 355 get replaced by the
constant, and the insn setting the equivalence ought to be deleted.  Because
the insn will get deleted anyway, it also ought to be skipped for find_reloads.

To achieve that, reg_equiv_constant(355) should hold the constant, and
reg_equiv_init(355) should point to the above insn.  However, what actually
happens in this test case is that reg_equiv_init(355) is NULL.  Therefore, the
insn is *not* skipped for find_reloads, which then aborts since it tries to
push an output reload for an always-constant register, which is not supposed to
happen.

Now the register is somewhat special in that it was created by IRA via live
range splitting.  The original register was reg 313; and this still has
reg_equiv_init(313) pointing to the above insn.  However, reg_equiv_init(355)
is NULL.  There is a routine fix_reg_equiv_init in ira.c which appears to be
intended to fix the reg_equiv_init settings of new registers created by live
range splitting.  However, this doesn't seem to have worked in this case ...

Unfortunately I'm not really familiar with the live range splitting code; maybe
Vladimir can help with this?


[Bug testsuite/49443] gcc.dg/vect/vect-peel-3.c and vect-peel-4.c fail on IA64 after testsuite change

2012-08-10 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49443

--- Comment #7 from Ulrich Weigand 2012-08-10 13:26:51 UTC ---
Author: uweigand
Date: Fri Aug 10 13:26:44 2012
New Revision: 190296

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190296
Log:
ChangeLog:

Backport from mainline
2012-07-30  Ulrich Weigand  
Richard Earnshaw  

* target.def (vector_alignment): New target hook.
* doc/tm.texi.in (TARGET_VECTOR_ALIGNMENT): Document new hook.
* doc/tm.texi: Regenerate.
* targhooks.c (default_vector_alignment): New function.
* targhooks.h (default_vector_alignment): Add prototype.
* stor-layout.c (layout_type): Use targetm.vector_alignment.
* config/arm/arm.c (arm_vector_alignment): New function.
(TARGET_VECTOR_ALIGNMENT): Define.

* tree-vect-data-refs.c (vect_update_misalignment_for_peel): Use
vector type alignment instead of size.
* tree-vect-loop-manip.c (vect_do_peeling_for_loop_bound): Use
element type size directly instead of computing it from alignment.
Fix variable naming and comment.


testsuite/ChangeLog:

Backport from mainline
2012-07-30  Ulrich Weigand  

* lib/target-supports.exp
(check_effective_target_vect_natural_alignment): New function.
* gcc.dg/align-2.c: Only run on targets with natural alignment
of vector types.
* gcc.dg/vect/slp-25.c: Adjust tests for targets without natural
alignment of vector types.

2011-12-21  Michael Zolotukhin  

* gcc.dg/vect/vect-peel-1.c: Adjust test diag-scans to fix fail on AVX.
* gcc.dg/vect/vect-peel-2.c: Ditto.

2011-06-21  Ira Rosen  

PR testsuite/49443
* gcc.dg/vect/vect-peel-3.c: Expect to fail on vect_no_align
targets.
* gcc.dg/vect/vect-peel-4.c: Likewise.

2011-06-14  Ira Rosen  

* gcc.dg/vect/vect-peel-3.c: Adjust misalignment values
for double-word vectors.
* gcc.dg/vect/vect-peel-4.c: Likewise.

Modified:
branches/gcc-4_6-branch/gcc/ChangeLog
branches/gcc-4_6-branch/gcc/config/arm/arm.c
branches/gcc-4_6-branch/gcc/doc/tm.texi
branches/gcc-4_6-branch/gcc/doc/tm.texi.in
branches/gcc-4_6-branch/gcc/stor-layout.c
branches/gcc-4_6-branch/gcc/target.def
branches/gcc-4_6-branch/gcc/targhooks.c
branches/gcc-4_6-branch/gcc/targhooks.h
branches/gcc-4_6-branch/gcc/testsuite/ChangeLog
branches/gcc-4_6-branch/gcc/testsuite/gcc.dg/align-2.c
branches/gcc-4_6-branch/gcc/testsuite/gcc.dg/vect/slp-25.c
branches/gcc-4_6-branch/gcc/testsuite/gcc.dg/vect/vect-peel-1.c
branches/gcc-4_6-branch/gcc/testsuite/gcc.dg/vect/vect-peel-2.c
branches/gcc-4_6-branch/gcc/testsuite/gcc.dg/vect/vect-peel-3.c
branches/gcc-4_6-branch/gcc/testsuite/gcc.dg/vect/vect-peel-4.c
branches/gcc-4_6-branch/gcc/testsuite/lib/target-supports.exp
branches/gcc-4_6-branch/gcc/tree-vect-data-refs.c
branches/gcc-4_6-branch/gcc/tree-vect-loop-manip.c


[Bug rtl-optimization/54739] [4.8 regression] FAIL: gcc.dg/lower-subreg-1.c scan-rtl-dump subreg1 "Splitting reg"

2012-10-01 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54739

--- Comment #3 from Ulrich Weigand 2012-10-01 12:16:53 UTC ---
It seems all three of those targets have an "iordi3" pattern that triggers even
for 32-bit compiles.  In this case, the lower-subreg pass now no longer splits
the register, so that the DImode pattern is actually used.  (Prior to my patch,
the register would have been split anyway.)

The test case is intended to run on 32-bit targets where an ior:DI operation is
supposed to be split; it will now fail on targets with an iordi3 pattern.

For those targets, I guess it's up to the target maintainers to decide whether:

- you want the iordi3 pattern to trigger since it gives better code than having
  lower-subreg split the operation: in this case, just disable the test case for
  your target (this is what I did for ARM)

or

- you'd really prefer to have lower-subreg split the operation, in which case
  you should remove the iordi3 pattern


[Bug middle-end/54957] Two crashes introduced by rev192488

2012-10-23 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54957

Ulrich Weigand  changed:

   What|Removed |Added

 CC||uweigand at gcc dot gnu.org

--- Comment #14 from Ulrich Weigand 2012-10-23 17:10:11 UTC ---
I'm getting the same crash when building libstdc++ for spu-elf:

Program received signal SIGSEGV, Segmentation fault.
emit_case_dispatch_table (index_expr=0xf601b1c0, index_type=0xf5e70420,
    case_list=0x110001c0, default_label=0xeda05398, minval=0xf5dd0bc0,
    maxval=0xf5dd3740,
    range=0xf5dd3740, stmt_bb=0x0) at
/home/uweigand/fsf/gcc-head/gcc/stmt.c:1919
1919  edge default_edge = EDGE_SUCC(stmt_bb, 0);
(gdb) bt
#0  emit_case_dispatch_table (index_expr=0xf601b1c0, index_type=0xf5e70420,
    case_list=0x110001c0, default_label=0xeda05398, minval=0xf5dd0bc0,
    maxval=0xf5dd3740,
    range=0xf5dd3740, stmt_bb=0x0) at
/home/uweigand/fsf/gcc-head/gcc/stmt.c:1919
#1  0x1079240c in expand_sjlj_dispatch_table (dispatch_index=, dispatch_table=0x10fe6108) at /home/uweigand/fsf/gcc-head/gcc/stmt.c:2292
#2  0x104ac3c4 in sjlj_emit_dispatch_table (dispatch_label=0xeda03980,
    num_dispatch=8) at /home/uweigand/fsf/gcc-head/gcc/except.c:1363
#3  0x104ac6f0 in sjlj_build_landing_pads () at
/home/uweigand/fsf/gcc-head/gcc/except.c:1420
#4  0x104acb4c in finish_eh_generation () at
/home/uweigand/fsf/gcc-head/gcc/except.c:1454
#5  0x103ddc24 in gimple_expand_cfg () at
/home/uweigand/fsf/gcc-head/gcc/cfgexpand.c:4579
#6  0x106e1608 in execute_one_pass (pass=0x10ec19b4) at
/home/uweigand/fsf/gcc-head/gcc/passes.c:2320
#7  0x106e1cb4 in execute_pass_list (pass=0x10ec19b4) at
/home/uweigand/fsf/gcc-head/gcc/passes.c:2381
#8  0x10406770 in expand_function (node=0xf1998e50) at
/home/uweigand/fsf/gcc-head/gcc/cgraphunit.c:1601
#9  0x10407b44 in expand_all_functions () at
/home/uweigand/fsf/gcc-head/gcc/cgraphunit.c:1705
#10 0x10408060 in compile () at
/home/uweigand/fsf/gcc-head/gcc/cgraphunit.c:2003
#11 0x1040942c in finalize_compilation_unit () at
/home/uweigand/fsf/gcc-head/gcc/cgraphunit.c:2080
#12 0x101c5fc0 in cp_write_global_declarations () at
/home/uweigand/fsf/gcc-head/gcc/cp/decl2.c:4286
#13 0x107a6ef4 in compile_file () at
/home/uweigand/fsf/gcc-head/gcc/toplev.c:560
#14 0x107a77ec in do_compile () at
/home/uweigand/fsf/gcc-head/gcc/toplev.c:1866
#15 0x107a85bc in toplev_main (argc=23, argv=0xffabf8c4) at
/home/uweigand/fsf/gcc-head/gcc/toplev.c:1942
#16 0x10c0f6d0 in main (argc=, argv=)
    at /home/uweigand/fsf/gcc-head/gcc/main.c:36


[Bug tree-optimization/47179] New: [4.5/4.6 Regression] SPU: errno misoptimization around malloc call

2011-01-05 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47179

   Summary: [4.5/4.6 Regression] SPU: errno misoptimization around
malloc call
   Product: gcc
   Version: 4.5.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: uweig...@gcc.gnu.org
CC: rguent...@suse.de
Target: spu-elf


The problem described in PR 42944 is still present on SPU.

This is because the SPU system library (newlib) uses:

  extern struct _reent _impure_data;
  #define errno (_impure_data._errno)

in its version of errno.h, but GCC's call_may_clobber_ref_p_1 routine assumes
errno is always either a global or else accessed via a pointer.  It does not
handle component references.

(Note that the problem seems to be specific to the SPU, because of an
optimization that makes use of the fact that SPU code is always
single-threaded.  On other platforms, newlib accesses errno via a pointer as
well.)
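
To make the failure mode concrete, here is a minimal sketch of the access
pattern.  The "demo_" names, the size limit, and the error value 12 are
stand-ins invented for this illustration; they are not newlib's real
declarations.  The point is only that errno is reached through a component
reference into a global struct, so an alias check that looks for a plain
global or a pointer dereference may wrongly conclude that the call cannot
modify it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the newlib declarations quoted above.  */
struct demo_reent { int _errno; };
struct demo_reent demo_impure_data;
#define demo_errno (demo_impure_data._errno)

/* Stand-in allocator: refuses requests above an arbitrary limit and
   records an error code in demo_errno, as malloc does with errno.  */
void *demo_malloc (size_t size)
{
  if (size > 1024)
    {
      demo_errno = 12;   /* arbitrary nonzero error code for this demo */
      return NULL;
    }
  return malloc (size);
}

/* Clears demo_errno, calls the allocator, and reads demo_errno back.
   If the compiler assumed the call cannot clobber the struct member,
   it could fold the final read to the constant 0.  */
int errno_after_alloc (size_t size)
{
  void *p;

  assert (size != 0);
  demo_errno = 0;
  p = demo_malloc (size);
  free (p);
  return demo_errno;
}
```

With a correct compiler, errno_after_alloc returns 0 for a small request and
the stored error code for an oversized one.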


[Bug tree-optimization/46021] 3 tree-ssa tests XPASS almost everywhere

2011-01-05 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46021

Ulrich Weigand  changed:

   What|Removed |Added

 Target|i386-pc-solaris2.*, |i386-pc-solaris2.*,
   |sparc-sun-solaris2.*,   |sparc-sun-solaris2.*,
   |alpha-dec-osf5.1b,  |alpha-dec-osf5.1b,
   |mips-sgi-irix6.5|mips-sgi-irix6.5, spu-elf
 CC||uweigand at gcc dot gnu.org

--- Comment #4 from Ulrich Weigand 2011-01-05 23:38:40 UTC ---
gcc.dg/tree-ssa/20040204-1.c XPASSes on spu-elf as well ...


[Bug rtl-optimization/47299] New: Widening multiply optimization generates bad code

2011-01-14 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47299

   Summary: Widening multiply optimization generates bad code
   Product: gcc
   Version: 4.5.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: rtl-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: uweig...@gcc.gnu.org
CC: ber...@codesourcery.com, rguent...@suse.de


Building the following test case with current mainline on i386:

#include <stdio.h>

unsigned short test (unsigned char val) __attribute__ ((noinline));

unsigned short
test (unsigned char val)
{
  return val * 255;
}

int
main(int argc, char**argv)
{
  printf ("test(val=40) = %x\n", test(0x40));
  return 0;
}

We get the following (correct) output with -O0:
test(val=40) = 3fc0

and the following incorrect output with -O2:
test(val=40) = ffc0

The problem appears to be related to this piece of code in expand_expr_real_2,
case WIDEN_MULT_EXPR:

  expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
                   EXPAND_NORMAL);
  temp = expand_widening_mult (mode, op0, op1, target,
                               unsignedp, this_optab);

expand_operands will expand the constant 255 into QImode and return a
(const_int -1) for op1.  Passing this constant into expand_widening_mult then
apparently generates a simple negation operation in HImode instead (via
expand_const_mult) ...

It seems this code came in here:
http://gcc.gnu.org/ml/gcc-patches/2010-04/msg01327.html
Any suggestions how this ought to be handled?
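
The arithmetic behind the two outputs can be reproduced in plain C.  This is
only a model of the effect described above, not of the expander code itself;
the helper names are made up for the illustration.  Truncating 255 to 8 bits
and reading it back as a signed value gives -1, and multiplying by -1 in
16 bits is a negation.

```c
#include <assert.h>
#include <stdint.h>

/* What the source asks for: val * 255, truncated to 16 bits.  */
uint16_t mul255_expected (uint8_t val)
{
  return (uint16_t) (val * 255);
}

/* Model of a constant squeezed into an 8-bit mode and then read back
   sign-extended.  */
int sign_extend8 (int c)
{
  int low = c & 0xff;
  return low >= 0x80 ? low - 0x100 : low;
}

/* Model of the wrong result: the multiplier 255 has become -1, so the
   16-bit product is simply the negated input.  */
uint16_t mul255_miscompiled (uint8_t val)
{
  int op1 = sign_extend8 (255);

  assert (op1 == -1);
  return (uint16_t) (val * op1);
}
```

For val = 0x40 the first function yields 0x3fc0 and the second 0xffc0,
matching the -O0 and -O2 outputs shown above.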


[Bug tree-optimization/47179] [4.5/4.6 Regression] SPU: errno misoptimization around malloc call

2011-01-18 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47179

--- Comment #4 from Ulrich Weigand 2011-01-18 20:13:59 UTC ---
Author: uweigand
Date: Tue Jan 18 20:13:56 2011
New Revision: 168961

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=168961
Log:
PR tree-optimization/47179
* config/spu/spu.c (spu_ref_may_alias_errno): New function.
(TARGET_REF_MAY_ALIAS_ERRNO): Define.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/spu/spu.c


[Bug tree-optimization/47179] [4.5/4.6 Regression] SPU: errno misoptimization around malloc call

2011-01-18 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47179

Ulrich Weigand  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED

--- Comment #5 from Ulrich Weigand 2011-01-18 20:16:10 UTC ---
Fixed.


[Bug tree-optimization/46021] 3 tree-ssa tests XPASS almost everywhere

2011-01-19 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46021

--- Comment #6 from Ulrich Weigand 2011-01-19 12:56:19 UTC ---
Author: uweigand
Date: Wed Jan 19 12:56:16 2011
New Revision: 168990

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=168990
Log:
PR tree-optimization/46021
* gcc.dg/tree-ssa/20040204-1.c: Do not XFAIL on spu-*-*.

Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/tree-ssa/20040204-1.c


[Bug testsuite/45342] FAIL: gcc.dg/tls/thr-cse-1.c scan-assembler-not emutls_get_address.*emutls_get_address.*

2011-01-19 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45342

--- Comment #3 from Ulrich Weigand 2011-01-19 13:09:56 UTC ---
Author: uweigand
Date: Wed Jan 19 13:09:51 2011
New Revision: 168992

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=168992
Log:
PR testsuite/45342
* gcc.dg/tls/thr-cse-1.c: Fix match on spu-*.*.

Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/tls/thr-cse-1.c


[Bug middle-end/47401] New: Support for must-not-throw regions with SJLJ exceptions broken

2011-01-21 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47401

   Summary: Support for must-not-throw regions with SJLJ
exceptions broken
   Product: gcc
   Version: 4.6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: uweig...@gcc.gnu.org
Target: spu-elf


A couple of exception related tests are failing on SPU:
FAIL: g++.dg/cpp0x/noexcept04.C execution test
FAIL: g++.dg/cpp0x/noexcept05.C scan-assembler LSDA
FAIL: g++.dg/eh/spec10.C execution test
FAIL: g++.dg/eh/spec11.C scan-assembler LSDA

These are related to must-not-throw regions.  To implement those,
we may have to register a personality function with empty LSDA.
This has been implemented for DWARF CFI exception handling, but
apparently not (fully?) for SJLJ exception handling.

For example, the test case for noexcept05.C:
struct A { ~A(); };
void g();
void f() noexcept
{
  A var;
  g();
}

compiles on the SPU to:
        .global _Z1fv
        .type   _Z1fv, @function
_Z1fv:
        stqd    $lr,16($sp)
        stqd    $sp,-48($sp)
        ai      $sp,$sp,-48
        brsl    $lr,_Z1gv
        ai      $2,$sp,32
        ori     $3,$2,0
        brsl    $lr,_ZN1AD1Ev
        ai      $sp,$sp,48
        lqd     $lr,16($sp)
        bi      $lr
        .size   _Z1fv, .-_Z1fv

Note the absence of any call to _Unwind_SJLJ_Register.


[Bug middle-end/47401] Support for must-not-throw regions with SJLJ exceptions broken

2011-01-21 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47401

--- Comment #1 from Ulrich Weigand 2011-01-21 16:33:43 UTC ---
Created attachment 23064
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23064
Proposed fix

Patch from http://gcc.gnu.org/ml/gcc-patches/2011-01/msg01406.html


[Bug middle-end/47401] Support for must-not-throw regions with SJLJ exceptions broken

2011-01-21 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47401

Ulrich Weigand  changed:

   What|Removed |Added

  Attachment #23064|0   |1
is obsolete||

--- Comment #2 from Ulrich Weigand 2011-01-21 17:18:59 UTC ---
Created attachment 23066
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23066
Updated fix

Updated patch from http://gcc.gnu.org/ml/gcc-patches/2011-01/msg01483.html


[Bug middle-end/47401] Support for must-not-throw regions with SJLJ exceptions broken

2011-01-21 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47401

Ulrich Weigand  changed:

   What|Removed |Added

  Attachment #23066|0   |1
is obsolete||

--- Comment #3 from Ulrich Weigand 2011-01-21 17:29:40 UTC ---
Created attachment 23067
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23067
Updated fix

Minor update as requested


[Bug middle-end/47401] Support for must-not-throw regions with SJLJ exceptions broken

2011-01-22 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47401

--- Comment #5 from Ulrich Weigand 2011-01-22 21:24:57 UTC ---
Author: uweigand
Date: Sat Jan 22 21:24:54 2011
New Revision: 169134

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=169134
Log:
PR middle-end/47401
* except.c (sjlj_assign_call_site_values): Move setting the
crtl->uses_eh_lsda flag to ...
(sjlj_mark_call_sites): ... here.
(sjlj_emit_function_enter): Support NULL dispatch label.
(sjlj_build_landing_pads): In a function with no landing pads
that still has must-not-throw regions, generate code to register
a personality function with empty LSDA.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/except.c


[Bug middle-end/47401] Support for must-not-throw regions with SJLJ exceptions broken

2011-01-22 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47401

Ulrich Weigand  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution||FIXED

--- Comment #6 from Ulrich Weigand 2011-01-22 21:26:37 UTC ---
Fixed.


[Bug middle-end/45505] [4.6 Regression] gfortran.dg/pr25923.f90

2011-01-29 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45505

Ulrich Weigand  changed:

   What|Removed |Added

 CC||uweigand at gcc dot gnu.org

--- Comment #18 from Ulrich Weigand 2011-01-29 17:55:51 UTC ---
The test case XPASSes on spu-*-* as well.


[Bug tree-optimization/47621] New: Missed dependencies in address-taken optimization

2011-02-06 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47621

   Summary: Missed dependencies in address-taken optimization
   Product: gcc
   Version: 4.6.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: uweig...@gcc.gnu.org
CC: rguent...@suse.de


The following program is mis-optimized with 4.6 (not earlier versions) at -O or
higher optimization levels:

int
main (void)
{
  int data = 1;

  struct ptr
    {
      int val;
    } *ptr = (struct ptr *) &data;

  ptr->val = 0;

  return data;
}

This program should return 0, but actually returns 1.

After the "address-taken" optimization pass, we have:

  dataD.1975_4 = 1;
  MEM[(struct ptr *)&dataD.1975].valD.1977 = 0;
  D.3453_2 = dataD.1975_4;
  return D.3453_2;

so the dependency between the assignment to ptr->val and the use of data is
lost.

See this post for more information:
http://gcc.gnu.org/ml/gcc/2011-02/msg00079.html

H.J. Lu identified the following patch as introducing the regression:
http://gcc.gnu.org/ml/gcc-patches/2010-10/msg00788.html


[Bug debug/47283] [4.6 regression] ICE in refs_may_alias_p_1, at tree-ssa-alias.c

2011-02-17 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47283

--- Comment #16 from Ulrich Weigand 2011-02-17 23:24:45 UTC ---
I tested the patch from comment #12 on spu-elf with no regressions.


[Bug target/50305] New: Inline asm reload failure when building Linux kernel

2011-09-06 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50305

 Bug #: 50305
   Summary: Inline asm reload failure when building Linux kernel
Classification: Unclassified
   Product: gcc
   Version: 4.7.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: target
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: uweig...@gcc.gnu.org
Target: arm-linux-gnueabi


Created attachment 25204
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25204
Reduced test case

Building the Linux kernel with current mainline fails with certain sets of
options due to a reload failure on an atomic.h inline asm.

The attached test case reproduces the problem when compiled with
  -O2 -fno-omit-frame-pointer -marm -march=armv7-a

core.i: In function ‘event_alloc’:
core.i:34:2: error: ‘asm’ operand requires impossible reload
core.i:34:2: error: ‘asm’ operand requires impossible reload

Note that while the inline asm may be somewhat inefficient (enforcing a "+Qo"
constraint on an operand that is never used in the assembly), there is nothing
invalid about the set of constraints (and there are enough registers
available).

The problem seems instead to be caused by unfortunate interactions between
ARM's legitimize_reload_address routine and subsequent iterations through
find_reloads common code.


[Bug target/50305] Inline asm reload failure when building Linux kernel

2011-09-06 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50305

Ulrich Weigand  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2011-09-06
 AssignedTo|unassigned at gcc dot   |uweigand at gcc dot gnu.org
   |gnu.org |
 Ever Confirmed|0   |1

--- Comment #1 from Ulrich Weigand 2011-09-06 17:06:45 UTC ---
Created attachment 25205
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25205
Possible fix

I'm currently testing this patch to fix the problem.


[Bug middle-end/50307] New: SSA checking ICE when building Linux kernel

2011-09-06 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50307

 Bug #: 50307
   Summary: SSA checking ICE when building Linux kernel
Classification: Unclassified
   Product: gcc
   Version: 4.7.0
Status: UNCONFIRMED
  Keywords: ice-checking
  Severity: normal
  Priority: P3
 Component: middle-end
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: uweig...@gcc.gnu.org


Created attachment 25206
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25206
Reduced test case

When building the attached test case (reduced from Linux kernel code) with -O2
(tested on i386 and arm), I'm getting the following ICE:

vsprintf.i: In function ‘vscnprintf’:
vsprintf.i:5:1: error: number of operands and imm-links don’t agree in
statement
# .MEM_26 = VDEF <.MEM_17>
args = args_7(D);
vsprintf.i:5:1: internal compiler error: verify_ssa failed

Seems to be related to re-inlining an outlined part of the original function
...


[Bug middle-end/50307] [4.7 Regression] SSA checking ICE when building Linux kernel

2011-09-07 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50307

--- Comment #2 from Ulrich Weigand 2011-09-07 18:33:30 UTC ---
I've confirmed that this is a regression introduced by rev. 178386
http://gcc.gnu.org/ml/gcc-cvs/2011-08/msg01405.html

So it does seem like this is another duplicate of PR 50287 / PR 50295.


[Bug target/50305] Inline asm reload failure when building Linux kernel

2011-10-06 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50305

--- Comment #3 from Ulrich Weigand 2011-10-06 11:50:35 UTC ---
Author: uweigand
Date: Thu Oct  6 11:50:26 2011
New Revision: 179603

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=179603
Log:
gcc/
PR target/50305
* config/arm/arm.c (arm_legitimize_reload_address): Recognize
output of a previous pass through legitimize_reload_address.
Do not attempt to optimize addresses if the base register is
equivalent to a constant.

gcc/testsuite/
PR target/50305
* gcc.target/arm/pr50305.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/arm/pr50305.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/arm/arm.c
trunk/gcc/testsuite/ChangeLog


[Bug target/50305] Inline asm reload failure when building Linux kernel

2011-10-06 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50305

Ulrich Weigand  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||FIXED

--- Comment #4 from Ulrich Weigand 2011-10-06 11:53:26 UTC ---
Fixed.


[Bug target/50310] [4.7 Regression] ICE: in gen_vcondv2div2df, at config/i386/sse.md:1435 with -O -ftree-vectorize and __builtin_isunordered()

2011-10-19 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50310

--- Comment #16 from Ulrich Weigand 2011-10-19 12:17:41 UTC ---
Author: uweigand
Date: Wed Oct 19 12:17:35 2011
New Revision: 180184

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=180184
Log:
PR target/50310
* config/spu/spu.c (spu_emit_vector_compare): Support unordered
floating-point comparisons.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/spu/spu.c


[Bug bootstrap/50801] macro location tracking patch set breaks bootstrap

2011-10-20 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50801

Ulrich Weigand  changed:

   What|Removed |Added

 CC||uweigand at gcc dot gnu.org

--- Comment #4 from Ulrich Weigand 2011-10-20 08:21:18 UTC ---
I can confirm that this patch fixes the problem I was seeing on SPU as well.


[Bug rtl-optimization/50762] ICE: in extract_insn, at recog.c:2137 (unrecognizable insn)

2011-11-08 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50762

--- Comment #4 from Ulrich Weigand 2011-11-08 16:00:36 UTC ---
It seems to me (part of) the problem is that the operand constraint is too
generic here:

(define_insn "*lea_4_zext"
  [(set (match_operand:DI 0 "register_operand" "=r")
        (zero_extend:DI
          (match_operand:SI 1 "lea_address_operand" "p")))]

"p" accepts any legitimate address operand, including const_int.  But those are
then not actually accepted by the overall pattern (due to the mode check).

When reloading within an address operand, reload verifies only the constraint;
it does not expect the case where a constraint accepts something that is later
rejected by some other check.

One fix might be to use a more specific constraint here that does not accept
const_int.


[Bug rtl-optimization/50762] ICE: in extract_insn, at recog.c:2137 (unrecognizable insn)

2011-11-09 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50762

--- Comment #8 from Ulrich Weigand 2011-11-09 18:52:16 UTC ---
(In reply to comment #7)
> Redefining "j" constraint as "define_address_constraint" results in:

Yes, it needs to be define_address_constraint.

> pr50762.c:48:1: error: unrecognizable insn:
> (insn 29 28 30 3 (set (reg:DI 0 ax [77])
> (zero_extend:DI (const_int 1 [0x1]))) pr50762.c:35 -1
>  (expr_list:REG_DEAD (reg/v:SI 59 [ p_60 ])
> (nil)))
> pr50762.c:48:1: internal compiler error: in extract_insn, at recog.c:2137
> Please submit a full bug report,

That's odd.

> So, _why_ reload insists on pushing zero_extended constant to the register?
> I'd expect that (const_int 1) is pushed into the register.
> 
> And finally, (zero_extend:DI (const_int 1 [0x1])) equals to (const_int 1
> [0x1]), so why this RTX isn't simplified on-the-fly?

The zero_extend is a fixed part of the insn pattern in question:

(define_insn "*lea_4_zext"
  [(set (match_operand:DI 0 "register_operand" "=r")
        (zero_extend:DI
          (match_operand:SI 1 "lea_address_operand" "p")))]

Reload only ever changes what is *inside* the operands.  It does not change the
fixed parts of the pattern (outside of operands).

What I would have expected to happen is for reload to load the (const_int 1)
into an SImode register, and then zero-extend that one ...

Not sure why that doesn't happen.  I'll have a look.


[Bug rtl-optimization/50762] ICE: in extract_insn, at recog.c:2137 (unrecognizable insn)

2011-11-10 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50762

--- Comment #10 from Ulrich Weigand 2011-11-10 10:10:04 UTC ---
(In reply to comment #9)
> (In reply to comment #8)
> 
> > The zero_extend is a fixed part of the insn pattern in question:
> > 
> > (define_insn "*lea_4_zext"
> >   [(set (match_operand:DI 0 "register_operand" "=r")
> > (zero_extend:DI
> >   (match_operand:SI 1 "lea_address_operand" "p")))]
> > 
> > Reload only ever changes what is *inside* the operands.  It does not change
> > the fixed parts of the pattern (outside of operands).
> 
> If this is the case, then (const_int 1) is perfectly acceptable here. 

However, it is rejected by the lea_address_operand predicate check
due to its mode (VOIDmode != SImode).  This is a bit odd because most
standard predicates accept a CONST_INT no matter what mode is requested.

But in this case, lea_address_operand is defined as "normal" predicate
(using define_predicate), but does not fall back onto any other normal
predicate, only the "special" predicate address_operand.  Therefore,
genpreds thinks it needs to add the mode check by itself, and this
version is quite strict and does not special-case CONST_INTs.

(Note that "address_operand" is special in that it uses its mode not
to check the mode of the address, but the mode of the memory operand
that is to be referred to -- on some machines, this makes a difference.)

Maybe one way to fix this would be to define lea_address_operand
using define_special_predicate and then implement an appropriate
mode check that handles CONST_INT by hand ...
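
To make the difference between the two mode checks concrete, here is a
minimal sketch in C.  The types toy_mode, toy_code and toy_rtx and both
function names are stand-ins invented for illustration; they are not GCC's
real definitions, and the real check generated by genpreds may differ in
detail.

```c
#include <stdbool.h>

/* Toy stand-ins for GCC's machine modes and RTL codes (hypothetical,
   not the real definitions from the GCC sources).  */
enum toy_mode { TOY_VOIDmode, TOY_SImode, TOY_DImode };
enum toy_code { TOY_REG, TOY_PLUS, TOY_CONST_INT };

struct toy_rtx
{
  enum toy_code code;
  enum toy_mode mode;   /* A CONST_INT always carries VOIDmode.  */
};

/* The strict test described above: no special case for constants.  */
static bool
strict_mode_check (const struct toy_rtx *op, enum toy_mode mode)
{
  return mode == TOY_VOIDmode || op->mode == mode;
}

/* A hand-written test that additionally lets a CONST_INT through,
   whatever mode was requested.  */
static bool
lenient_mode_check (const struct toy_rtx *op, enum toy_mode mode)
{
  return strict_mode_check (op, mode) || op->code == TOY_CONST_INT;
}
```

With these definitions, a CONST_INT checked against SImode fails the strict
test but passes the lenient one, while a register of the wrong mode is
rejected by both.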

> > Not sure why that doesn't happen.  I'll have a look.

Hmm, there's this special code in find_reloads that attempts to
re-recognize patterns after structural changes to an address
were made:

      /* If we now have a simple operand where we used to have a
         PLUS or MULT, re-recognize and try again.  */
      if ((OBJECT_P (*recog_data.operand_loc[i])
           || GET_CODE (*recog_data.operand_loc[i]) == SUBREG)
          && (GET_CODE (recog_data.operand[i]) == MULT
              || GET_CODE (recog_data.operand[i]) == PLUS))
        {
          INSN_CODE (insn) = -1;
          retval = find_reloads (insn, replace, ind_levels, live_known,
                                 reload_reg_p);
          return retval;
        }

It's a bit unfortunate that this just ICEs if the modified
address is no longer recognized.

But if the lea_address_operand predicate is fixed as outlined
above, the problem should go away since the insn would then
be recognized.


[Bug rtl-optimization/50762] ICE: in extract_insn, at recog.c:2137 (unrecognizable insn)

2011-11-10 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50762

--- Comment #16 from Ulrich Weigand  2011-11-10 
14:04:42 UTC ---
(In reply to comment #15)
> (In reply to comment #14)
> > Is this with your patch from comment 6? You really can't have a CONST_INT
> > inside a zero_extend; the abort is justified.
> 
> No, this is with the patch from comment #11. This patch in fact just reverts
> the definition to define_special_predicate as was in the past.

You probably need the patch from comment 6 (with the define_address_constraint
fix) *in addition* to the patch from comment 11: this way, the zero_extend
(const_int) will be accepted temporarily by the predicate, but immediately
afterwards reloaded into a register.


[Bug target/50762] [4.7 Regression] ICE: in extract_insn, at recog.c:2137 (unrecognizable insn)

2011-11-10 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50762

--- Comment #21 from Ulrich Weigand  2011-11-10 
18:22:50 UTC ---
(In reply to comment #20)
> 
> The documentation is wrong, so following patch removes all the blurb about
> handling of constants.
> 
> Index: doc/md.texi
> ===
> --- doc/md.texi(revision 181258)
> +++ doc/md.texi(working copy)
> @@ -1001,16 +1001,7 @@
> 
>  Predicates written with @code{define_predicate} automatically include
>  a test that @var{mode} is @code{VOIDmode}, or @var{op} has the same
> -mode as @var{mode}, or @var{op} is a @code{CONST_INT} or
> -@code{CONST_DOUBLE}.  They do @emph{not} check specifically for
> -integer @code{CONST_DOUBLE}, nor do they test that the value of either
> -kind of constant fits in the requested mode.  This is because
> -target-specific predicates that take constants usually have to do more
> -stringent value checks anyway.  If you need the exact same treatment
> -of @code{CONST_INT} or @code{CONST_DOUBLE} that the generic predicates
> -provide, use a @code{MATCH_OPERAND} subexpression to call
> -@code{const_int_operand}, @code{const_double_operand}, or
> -@code{immediate_operand}.
> +mode as @var{mode}.

But this is not quite true either.  genpreds will *omit* generation of
the explicit test (mode == VOIDmode || mode == GET_MODE (op)) if the
predicate has as a sub-test a call to one of the standard predicates,
arguing that "the standard predicate already did the test, so we don't
have to repeat it".  But the test performed by the standard predicates
*does* comply with the more elaborate rule spelled out in the docs ...


[Bug target/50762] [4.7 Regression] ICE: in extract_insn, at recog.c:2137 (unrecognizable insn)

2011-11-11 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50762

--- Comment #24 from Ulrich Weigand  2011-11-11 
13:04:02 UTC ---
(In reply to comment #22)

> Yeah, this is also what I thought at the first sight. But please don't forget
> that additional c-code test effectively creates
> 
> (and (match_operand (...)
>  (match_test (call to c-code)))
> 
> and this construct will _always_ force generation of mode checks.

No, it actually does not; see e.g. in the x86-64 insn-preds.c we have

(define_predicate "long_memory_operand"
  (and (match_operand 0 "memory_operand")
   (match_test "memory_address_length (op)")))

implemented as:

int
long_memory_operand (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED)
{
  return (memory_operand (op, mode)) && (
#line 994 "../../gcc-head/gcc/config/i386/predicates.md"
(memory_address_length (op)));
}

This is because the genpreds routines have code to follow boolean
operators and reason correctly that if match_operand performs the
mode test, and some other test is joined via "and", the test is
still not necessary.
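
The reasoning over boolean operators can be sketched as a small recursive
walk.  This is a simplified model, not the actual genpreds code: the
toy_exp type and the function name are invented here, and it assumes every
match_operand refers to a predicate that performs the mode test itself
(which holds for the standard predicates, but not for a special predicate
such as address_operand).

```c
#include <stdbool.h>
#include <stddef.h>

enum toy_kind { TOY_AND, TOY_IOR, TOY_MATCH_OPERAND, TOY_MATCH_TEST };

struct toy_exp
{
  enum toy_kind kind;
  const struct toy_exp *lhs, *rhs;   /* Used by TOY_AND and TOY_IOR only.  */
};

/* Return true if every way of satisfying EXP goes through a
   match_operand, i.e. through a predicate that already tested the mode.  */
static bool
mode_already_tested (const struct toy_exp *exp)
{
  switch (exp->kind)
    {
    case TOY_MATCH_OPERAND:
      return true;
    case TOY_MATCH_TEST:
      return false;
    case TOY_AND:
      /* Both arms must hold, so one arm doing the test is enough.  */
      return mode_already_tested (exp->lhs) || mode_already_tested (exp->rhs);
    case TOY_IOR:
      /* Either arm may hold alone, so both arms must do the test.  */
      return mode_already_tested (exp->lhs) && mode_already_tested (exp->rhs);
    }
  return false;
}
```

For the long_memory_operand shape (and (match_operand ...) (match_test ...))
this returns true, so no extra mode test would be emitted; replacing the
"and" by an "ior" makes it return false.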

In fact, except for the address_operand case, the mode checks seem
to be correct in their actual effect: we either call a standard predicate
and omit the mode check, or else we have the genpreds-generated mode check,
but in conjunction with some match_code test that excludes CONST_INT
and CONST_DOUBLE anyway.


[Bug tree-optimization/52633] [4.7/4.8 Regression] Compiler ICE in vect_is_simple_use_1 (ARM)

2012-05-04 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52633

--- Comment #3 from Ulrich Weigand  2012-05-04 
12:46:14 UTC ---
Author: uweigand
Date: Fri May  4 12:46:04 2012
New Revision: 187158

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=187158
Log:
gcc/
PR tree-optimization/52633
* tree-vect-patterns.c (vect_vect_recog_func_ptrs): Swap order of
vect_recog_widen_shift_pattern and vect_recog_over_widening_pattern.
(vect_recog_over_widening_pattern): Remove handling of code that was
already detected as over-widening pattern.  Remove special handling
of "unsigned" cases.  Instead, support general case of conversion
of the shift result to another type.

gcc/testsuite/
PR tree-optimization/52633
* gcc.dg/vect/vect-over-widen-1.c: Two patterns should now be
recognized as widening shifts instead of over-widening.
* gcc.dg/vect/vect-over-widen-1-big-array.c: Likewise.
* gcc.dg/vect/vect-over-widen-4.c: Likewise.
* gcc.dg/vect/vect-over-widen-4-big-array.c: Likewise.
* gcc.target/arm/pr52633.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/arm/pr52633.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-1-big-array.c
trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-1.c
trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-4-big-array.c
trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-4.c
trunk/gcc/tree-vect-patterns.c


[Bug tree-optimization/52633] [4.7/4.8 Regression] Compiler ICE in vect_is_simple_use_1 (ARM)

2012-05-04 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52633

Ulrich Weigand  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

--- Comment #4 from Ulrich Weigand  2012-05-04 
12:49:02 UTC ---
Fixed.


[Bug tree-optimization/52633] [4.7/4.8 Regression] Compiler ICE in vect_is_simple_use_1 (ARM)

2012-05-04 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52633

--- Comment #6 from Ulrich Weigand  2012-05-04 
14:56:52 UTC ---
Author: uweigand
Date: Fri May  4 14:56:48 2012
New Revision: 187162

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=187162
Log:
2012-05-04  Ulrich Weigand  

 Backport from mainline:

 2012-05-04  Ulrich Weigand  

 PR tree-optimization/52633
 * tree-vect-patterns.c (vect_vect_recog_func_ptrs): Swap order of
 vect_recog_widen_shift_pattern and vect_recog_over_widening_pattern.
 (vect_recog_over_widening_pattern): Remove handling of code that was
 already detected as over-widening pattern.  Remove special handling
 of "unsigned" cases.  Instead, support general case of conversion
 of the shift result to another type.

 2012-05-04  Ulrich Weigand  

 * tree-vect-patterns.c (vect_single_imm_use): New function.
 (vect_recog_widen_mult_pattern): Use it instead of open-coding loop.
 (vect_recog_over_widening_pattern): Likewise.
 (vect_recog_widen_shift_pattern): Likewise.

 2012-04-10  Ulrich Weigand  

 PR tree-optimization/52870
 * tree-vect-patterns.c (vect_recog_widen_mult_pattern): Verify that
 presumed pattern statement is within the same loop or basic block.

2012-05-04  Ulrich Weigand  

 Backport from mainline:

 2012-05-04  Ulrich Weigand  

 PR tree-optimization/52633
 * gcc.dg/vect/vect-over-widen-1.c: Two patterns should now be
 recognized as widening shifts instead of over-widening.
 * gcc.dg/vect/vect-over-widen-1-big-array.c: Likewise.
 * gcc.dg/vect/vect-over-widen-4.c: Likewise.
 * gcc.dg/vect/vect-over-widen-4-big-array.c: Likewise.
 * gcc.target/arm/pr52633.c: New test.

 2012-04-10  Ulrich Weigand  

 PR tree-optimization/52870
 * gcc.dg/vect/pr52870.c: New test.

Added:
branches/gcc-4_7-branch/gcc/testsuite/gcc.dg/vect/pr52870.c
branches/gcc-4_7-branch/gcc/testsuite/gcc.target/arm/pr52633.c
Modified:
branches/gcc-4_7-branch/gcc/ChangeLog
branches/gcc-4_7-branch/gcc/testsuite/ChangeLog
branches/gcc-4_7-branch/gcc/testsuite/gcc.dg/vect/vect-over-widen-1-big-array.c
branches/gcc-4_7-branch/gcc/testsuite/gcc.dg/vect/vect-over-widen-1.c
branches/gcc-4_7-branch/gcc/testsuite/gcc.dg/vect/vect-over-widen-4-big-array.c
branches/gcc-4_7-branch/gcc/testsuite/gcc.dg/vect/vect-over-widen-4.c
branches/gcc-4_7-branch/gcc/tree-vect-patterns.c


[Bug tree-optimization/52870] ICE during SLP pattern matching

2012-05-04 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52870

--- Comment #6 from Ulrich Weigand  2012-05-04 
14:56:52 UTC ---
Author: uweigand
Date: Fri May  4 14:56:48 2012
New Revision: 187162

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=187162
Log:
2012-05-04  Ulrich Weigand  

 Backport from mainline:

 2012-05-04  Ulrich Weigand  

 PR tree-optimization/52633
 * tree-vect-patterns.c (vect_vect_recog_func_ptrs): Swap order of
 vect_recog_widen_shift_pattern and vect_recog_over_widening_pattern.
 (vect_recog_over_widening_pattern): Remove handling of code that was
 already detected as over-widening pattern.  Remove special handling
 of "unsigned" cases.  Instead, support general case of conversion
 of the shift result to another type.

 2012-05-04  Ulrich Weigand  

 * tree-vect-patterns.c (vect_single_imm_use): New function.
 (vect_recog_widen_mult_pattern): Use it instead of open-coding loop.
 (vect_recog_over_widening_pattern): Likewise.
 (vect_recog_widen_shift_pattern): Likewise.

 2012-04-10  Ulrich Weigand  

 PR tree-optimization/52870
 * tree-vect-patterns.c (vect_recog_widen_mult_pattern): Verify that
 presumed pattern statement is within the same loop or basic block.

2012-05-04  Ulrich Weigand  

 Backport from mainline:

 2012-05-04  Ulrich Weigand  

 PR tree-optimization/52633
 * gcc.dg/vect/vect-over-widen-1.c: Two patterns should now be
 recognized as widening shifts instead of over-widening.
 * gcc.dg/vect/vect-over-widen-1-big-array.c: Likewise.
 * gcc.dg/vect/vect-over-widen-4.c: Likewise.
 * gcc.dg/vect/vect-over-widen-4-big-array.c: Likewise.
 * gcc.target/arm/pr52633.c: New test.

 2012-04-10  Ulrich Weigand  

 PR tree-optimization/52870
 * gcc.dg/vect/pr52870.c: New test.

Added:
branches/gcc-4_7-branch/gcc/testsuite/gcc.dg/vect/pr52870.c
branches/gcc-4_7-branch/gcc/testsuite/gcc.target/arm/pr52633.c
Modified:
branches/gcc-4_7-branch/gcc/ChangeLog
branches/gcc-4_7-branch/gcc/testsuite/ChangeLog
branches/gcc-4_7-branch/gcc/testsuite/gcc.dg/vect/vect-over-widen-1-big-array.c
branches/gcc-4_7-branch/gcc/testsuite/gcc.dg/vect/vect-over-widen-1.c
branches/gcc-4_7-branch/gcc/testsuite/gcc.dg/vect/vect-over-widen-4-big-array.c
branches/gcc-4_7-branch/gcc/testsuite/gcc.dg/vect/vect-over-widen-4.c
branches/gcc-4_7-branch/gcc/tree-vect-patterns.c


[Bug tree-optimization/52633] [4.7/4.8 Regression] Compiler ICE in vect_is_simple_use_1 (ARM)

2012-05-04 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52633

Ulrich Weigand  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|                            |FIXED

--- Comment #7 from Ulrich Weigand  2012-05-04 
14:59:23 UTC ---
Backported to 4.7.  (This incidentally also fixes PR 52870, even though we
don't have a test case exposing that problem on 4.7 ...)


[Bug rtl-optimization/53227] [4.8 Regression] FAIL: gcc.target/i386/movbe-2.c scan-assembler-times movbe[ \t] 4

2012-05-04 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53227

--- Comment #2 from Ulrich Weigand  2012-05-04 
16:03:35 UTC ---
Why do you consider this a reload/RA problem?  Code before ira looks like:

(insn 2 4 3 2 (set (reg/v:DI 62 [ i ])
        (mem/c:DI (reg/f:SI 16 argp) [2 i+0 S8 A32])) test.i:6 63 {*movdi_internal}
     (expr_list:REG_EQUIV (mem/c:DI (reg/f:SI 16 argp) [2 i+0 S8 A32])
        (nil)))

(insn 8 7 9 2 (clobber (reg:DI 60 [ D.1367 ])) test.i:7 -1
     (nil))

(insn 9 8 10 2 (set (subreg:SI (reg:DI 60 [ D.1367 ]) 0)
        (bswap:SI (subreg:SI (reg/v:DI 62 [ i ]) 4))) test.i:7 719 {*bswapsi2_movbe}
     (nil))

(insn 10 9 11 2 (set (subreg:SI (reg:DI 60 [ D.1367 ]) 4)
        (bswap:SI (subreg:SI (reg/v:DI 62 [ i ]) 0))) test.i:7 719 {*bswapsi2_movbe}
     (expr_list:REG_DEAD (reg/v:DI 62 [ i ])
        (nil)))

(insn 11 10 0 2 (set (mem/c:DI (symbol_ref:SI ("x") [flags 0x40]  ) [2 x+0 S8 A64])
        (reg:DI 60 [ D.1367 ])) test.i:7 63 {*movdi_internal}
     (expr_list:REG_DEAD (reg:DI 60 [ D.1367 ])
        (nil)))

with the memory accesses both in DImode, but the bswap already split into
SImode.  This causes the two DImode registers to be live at the same time, so
RA cannot allocate the same register for both.
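
What insns 9 and 10 compute can be written out in C as follows.  This is
only an illustration of the data flow (the function name is made up, and
__builtin_bswap32 is the GCC builtin): each half of the result is the
byte-swapped opposite half of the input, so the input is still needed after
the first half of the result has been written.

```c
#include <stdint.h>

/* Byte-swap a 64-bit value using two 32-bit swaps, mirroring insns 9
   and 10 above.  */
static uint64_t
bswap64_by_halves (uint64_t i)
{
  uint32_t lo = (uint32_t) i;
  uint32_t hi = (uint32_t) (i >> 32);

  uint32_t res_lo = __builtin_bswap32 (hi);   /* insn 9 */
  uint32_t res_hi = __builtin_bswap32 (lo);   /* insn 10; LO still live */

  return ((uint64_t) res_hi << 32) | res_lo;
}
```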

Given the limited register availability on i386, which allocation would you
have suggested instead?

[ Note that I'd consider this a case where the moves certainly *ought* to have
been split into SImode, because:
- on i386 the moves will be split later on anyway
- accesses to subregs of the registers being moved already happen elsewhere. ]


[Bug rtl-optimization/53227] [4.8 Regression] FAIL: gcc.target/i386/movbe-2.c scan-assembler-times movbe[ \t] 4

2012-05-04 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53227

--- Comment #4 from Ulrich Weigand  2012-05-04 
16:56:59 UTC ---
(In reply to comment #3)
> However, reload should notice that memory could be propagated into bswap.

Since register allocation already assigned a hard reg to the pseudo, reload is
happy.  Reload doesn't generally attempt to go back and search for whether the
value is also available in memory (unless it has to ...).

In general, it is preferable for earlier passes to leave an operand as MEM (if
an insn accepts memory operands in some alternative), so that reload can then
make the decision whether to use the MEM or to reload into a register.


[Bug rtl-optimization/44141] Redundant loads and stores generated for AMD bdver1 target

2012-05-07 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44141

--- Comment #16 from Ulrich Weigand  2012-05-07 
17:17:06 UTC ---
Reload inheritance generally gives up on handling subregs of pseudos, mostly
because there is no mechanism to track invalidation of parts of pseudos.

Now, in this particular case where a subreg still refers to the whole of the
pseudo and just re-interprets it as a different vector mode, it might be
possible to enhance inheritance support to handle such cases.

But this looks like a significant change to what is already one of the most
complex parts of reload -- and would certainly involve some risk of introducing
subtle bugs.   I'm not sure if this is worth the effort ...

In particular, in this specific case the back-end could make things a whole lot
simpler by not insisting on a particular mode.  I understand MOVUPS simply
moves 16 bytes between (unaligned) memory and registers -- there's nothing in
particular that requires this to be encoded as V4SF mode.

I'd suggest simply extending the movups patterns to handle moves between
arbitrary 16-byte (vector?) modes, all in the end resolving to the same
assembler instruction.  Then you'd be able to just encode the move in question
along the lines of
   (set (reg:V2DF) (unspec:V2DF [(mem:V2DF ...)] UNSPEC_MOVU))
which would probably generate better code not just in reload, but other parts
of the RTL middle-end as well ...

[ An even more radical change might be to encode both movups and movaps as just
plain moves, with no unspec, and have the final assembly output make the choice
between the two based on the operand's MEM_ALIGN value ... Of course, this
needs MEM_ALIGN to be always correct. ]
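
As a rough sketch of that last idea, the choice at output time could look
like the helper below.  The function name is hypothetical and this is not
code from the i386 back end; it only assumes that the alignment is known in
bits, as MEM_ALIGN reports it, and that a 16-byte move needs 128-bit
alignment to use the aligned form.

```c
/* Hypothetical helper: choose the mnemonic for a 16-byte vector move
   from the known alignment of the memory operand, given in bits.  */
static const char *
pick_vector_move (unsigned int mem_align_bits)
{
  return mem_align_bits >= 128 ? "movaps" : "movups";
}
```

The result is only as good as the alignment information passed in: if the
recorded alignment claims more than the address actually has, the aligned
form would be chosen wrongly.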


[Bug tree-optimization/53636] New: SLP may create invalid unaligned memory accesses

2012-06-11 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53636

 Bug #: 53636
   Summary: SLP may create invalid unaligned memory accesses
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: uweig...@gcc.gnu.org


The following test case:

void test (unsigned char *dst)
{
  short tmp[11 * 8], *tptr;
  int i;

  fill (tmp);

  tptr = tmp;
  for (i = 0; i < 8; i++)
    {
      dst[0] = (-tptr[0] + 9 * tptr[0 + 1] + 9 * tptr[0 + 2] - tptr[0 + 3]) >> 7;
      dst[1] = (-tptr[1] + 9 * tptr[1 + 1] + 9 * tptr[1 + 2] - tptr[1 + 3]) >> 7;
      dst[2] = (-tptr[2] + 9 * tptr[2 + 1] + 9 * tptr[2 + 2] - tptr[2 + 3]) >> 7;
      dst[3] = (-tptr[3] + 9 * tptr[3 + 1] + 9 * tptr[3 + 2] - tptr[3 + 3]) >> 7;
      dst[4] = (-tptr[4] + 9 * tptr[4 + 1] + 9 * tptr[4 + 2] - tptr[4 + 3]) >> 7;
      dst[5] = (-tptr[5] + 9 * tptr[5 + 1] + 9 * tptr[5 + 2] - tptr[5 + 3]) >> 7;
      dst[6] = (-tptr[6] + 9 * tptr[6 + 1] + 9 * tptr[6 + 2] - tptr[6 + 3]) >> 7;
      dst[7] = (-tptr[7] + 9 * tptr[7 + 1] + 9 * tptr[7 + 2] - tptr[7 + 3]) >> 7;

      dst += 8;
      tptr += 11;
    }
}

when built on ARM with -mcpu=cortex-a9 -mfpu=neon -mfloat-abi=softfp -O
-ftree-vectorize creates code that uses a VLDR instruction to access unaligned
memory, which causes a Bus error at runtime.

The problem seems to be that the check in vect_compute_data_ref_alignment is
not enough for SLP.  Even though SLP only considers a basic block, the data-ref
analysis still looks at innermost loops to compute scalar evolutions.  This
results in concluding that the access "tptr[0]" is based on "tmp", which is
aligned to 8 bytes, using a step of 22 bytes.

The alignment check currently verifies only that the *base* is aligned.  This is OK
if we're actually vectorizing the loop.  But in the SLP case, we really need to
verify instead that the access is aligned on *every* iteration through the loop
...
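
The missing condition can be sketched as below.  This is a simplified model
rather than the code in vect_compute_data_ref_alignment; the function and
parameter names are invented for illustration, with all quantities in
bytes.

```c
#include <stdbool.h>

/* BASE_MISALIGN: byte offset of the first access from an ALIGN boundary.
   STEP: bytes the address advances per iteration of the enclosing loop.
   ALIGN: required alignment in bytes (assumed nonzero).  */
static bool
aligned_on_every_iteration (unsigned long base_misalign, long step,
                            unsigned long align)
{
  return base_misalign % align == 0 && step % (long) align == 0;
}
```

For the test case above, the base is aligned but the step of 22 bytes is
not a multiple of 8, so the check fails: the second iteration already
accesses an address that is 6 bytes past an 8-byte boundary.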


[Bug tree-optimization/53636] SLP may create invalid unaligned memory accesses

2012-06-11 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53636

Ulrich Weigand  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2012-06-11
         AssignedTo|unassigned at gcc dot       |uweigand at gcc dot gnu.org
                   |gnu.org                     |
     Ever Confirmed|0                           |1


[Bug tree-optimization/53636] SLP may create invalid unaligned memory accesses

2012-06-15 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53636

--- Comment #1 from Ulrich Weigand  2012-06-15 
13:30:40 UTC ---
Author: uweigand
Date: Fri Jun 15 13:30:36 2012
New Revision: 188661

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188661
Log:
gcc/
PR tree-optimization/53636
* tree-vect-data-refs.c (vect_compute_data_ref_alignment): Verify
stride when doing basic-block vectorization.

gcc/testsuite/
PR tree-optimization/53636
* gcc.target/arm/pr53636.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/arm/pr53636.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vect-data-refs.c


[Bug tree-optimization/53636] SLP may create invalid unaligned memory accesses

2012-06-15 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53636

--- Comment #2 from Ulrich Weigand  2012-06-15 
15:11:51 UTC ---
Now fixed on mainline; still fails on 4.7.

(While the bug is probably latent even earlier, this particular test case does
not crash on 4.6.)


[Bug regression/53729] [4.8 regression] PR53636 fix caused bb-slp-16.c to FAIL on sparc64 and powerpc64

2012-06-20 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53729

Ulrich Weigand  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2012-06-20
         AssignedTo|unassigned at gcc dot       |uweigand at gcc dot gnu.org
                   |gnu.org                     |
     Ever Confirmed|0                           |1

--- Comment #1 from Ulrich Weigand  2012-06-20 
15:22:45 UTC ---
The problem is that SLP tests *all* accesses within the basic block for
alignment, even those that aren't actually part of a SLP instance.

This is of course broken, but that bug had been hidden by the PR53636 problem
(due to which accesses were considered aligned that actually are not).

The fix for this problem is to only check *relevant* accesses for alignment. 
(This requires moving the alignment check until after relevant statements are
actually marked ...)
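
In outline, the intended check looks like the sketch below.  The toy_access
type and the function name are invented for illustration and greatly
simplify what the vectorizer tracks per data reference.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_access
{
  bool relevant;   /* Part of an SLP instance?  */
  bool aligned;    /* Alignment known to be supportable?  */
};

/* Return true if every *relevant* access has supportable alignment.  */
static bool
relevant_accesses_aligned (const struct toy_access *refs, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      if (!refs[i].relevant)
        continue;   /* Not vectorized, so its alignment does not matter.  */
      if (!refs[i].aligned)
        return false;
    }
  return true;
}
```

This only works if the relevant flags have already been computed, which is
why the alignment check has to run after the marking step.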

I'm testing a fix.


[Bug rtl-optimization/53706] [4.8 Regression] Bootstrap failure due to "Invalid write of size 8 at 0xBDC35E: variable_htab_free (var-tracking.c:1418)

2012-06-21 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53706

Ulrich Weigand  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |uweigand at gcc dot gnu.org

--- Comment #11 from Ulrich Weigand  2012-06-21 
13:18:53 UTC ---
I'm seeing what appears to be a similar issue bootstrapping powerpc64:

*** glibc detected ***
/home/uweigand/fsf/gcc-head-build-ppu/./prev-gcc/cc1plus: corrupted
double-linked list: 0x11afd980 ***
=== Backtrace: =
/lib64/power6x/libc.so.6[0x80f2ff9bd0]
/lib64/power6x/libc.so.6[0x80f2ffba4c]
/lib64/power6x/libc.so.6(cfree-0x117fc8)[0x80f2ffc050]
/home/uweigand/fsf/gcc-head-build-ppu/./prev-gcc/cc1plus(_Z16empty_alloc_poolP14alloc_pool_def-0xd186ac)[0x1035da6c]
/home/uweigand/fsf/gcc-head-build-ppu/./prev-gcc/cc1plus(_Z15free_alloc_poolP14alloc_pool_def-0xd18638)[0x1035daf8]
/home/uweigand/fsf/gcc-head-build-ppu/./prev-gcc/cc1plus[0x109b81dc]
/home/uweigand/fsf/gcc-head-build-ppu/./prev-gcc/cc1plus(_Z22variable_tracking_mainv-0x6e1734)[0x109cd57c]
/home/uweigand/fsf/gcc-head-build-ppu/./prev-gcc/cc1plus(_Z16execute_one_passP8opt_pass-0xa1248c)[0x1067f7cc]
/home/uweigand/fsf/gcc-head-build-ppu/./prev-gcc/cc1plus(_Z17execute_pass_listP8opt_pass-0xa1203c)[0x1067fc34]
/home/uweigand/fsf/gcc-head-build-ppu/./prev-gcc/cc1plus(_Z17execute_pass_listP8opt_pass-0xa12024)[0x1067fc4c]
/home/uweigand/fsf/gcc-head-build-ppu/./prev-gcc/cc1plus(_Z17execute_pass_listP8opt_pass-0xa12024)[0x1067fc4c]
/home/uweigand/fsf/gcc-head-build-ppu/./prev-gcc/cc1plus[0x103e4ea8]
/home/uweigand/fsf/gcc-head-build-ppu/./prev-gcc/cc1plus(_Z7compilev-0xc935c4)[0x103e7474]
/home/uweigand/fsf/gcc-head-build-ppu/./prev-gcc/cc1plus(_Z25finalize_compilation_unitv-0xc92e84)[0x103e7bcc]
/home/uweigand/fsf/gcc-head-build-ppu/./prev-gcc/cc1plus(_Z28cp_write_global_declarationsv-0xeb57fc)[0x101b5414]
/home/uweigand/fsf/gcc-head-build-ppu/./prev-gcc/cc1plus[0x1074b894]
/home/uweigand/fsf/gcc-head-build-ppu/./prev-gcc/cc1plus(_Z11toplev_mainiPPc-0x94c944)[0x1074da44]
/home/uweigand/fsf/gcc-head-build-ppu/./prev-gcc/cc1plus(main-0x436788)[0x10c8d4b0]
/lib64/power6x/libc.so.6[0x80f2f99d34]
/lib64/power6x/libc.so.6(__libc_start_main-0x176cf0)[0x80f2f99fd0]


[Bug tree-optimization/53636] SLP may create invalid unaligned memory accesses

2012-06-26 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53636

--- Comment #3 from Ulrich Weigand  2012-06-26 
09:05:56 UTC ---
Author: uweigand
Date: Tue Jun 26 09:05:48 2012
New Revision: 188979

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188979
Log:
PR tree-optimization/53729
PR tree-optimization/53636
* tree-vect-slp.c (vect_slp_analyze_bb_1): Delay call to
vect_verify_datarefs_alignment until after statements have
been marked as relevant/irrelevant.
* tree-vect-data-refs.c (vect_verify_datarefs_alignment):
Skip irrelevant statements.
(vect_enhance_data_refs_alignment): Use STMT_VINFO_RELEVANT_P
instead of STMT_VINFO_RELEVANT.
(vect_get_data_access_cost): Do not check for supportable
alignment before calling vect_get_load_cost/vect_get_store_cost.
* tree-vect-stmts.c (vect_get_store_cost): Do not abort when
handling unsupported alignment.
(vect_get_load_cost): Likewise.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-vect-data-refs.c
trunk/gcc/tree-vect-slp.c
trunk/gcc/tree-vect-stmts.c


[Bug regression/53729] [4.8 regression] PR53636 fix caused bb-slp-16.c to FAIL on sparc64 and powerpc64

2012-06-26 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53729

--- Comment #2 from Ulrich Weigand  2012-06-26 
09:05:55 UTC ---
Author: uweigand
Date: Tue Jun 26 09:05:48 2012
New Revision: 188979

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188979
Log:
PR tree-optimization/53729
PR tree-optimization/53636
* tree-vect-slp.c (vect_slp_analyze_bb_1): Delay call to
vect_verify_datarefs_alignment until after statements have
been marked as relevant/irrelevant.
* tree-vect-data-refs.c (vect_verify_datarefs_alignment):
Skip irrelevant statements.
(vect_enhance_data_refs_alignment): Use STMT_VINFO_RELEVANT_P
instead of STMT_VINFO_RELEVANT.
(vect_get_data_access_cost): Do not check for supportable
alignment before calling vect_get_load_cost/vect_get_store_cost.
* tree-vect-stmts.c (vect_get_store_cost): Do not abort when
handling unsupported alignment.
(vect_get_load_cost): Likewise.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-vect-data-refs.c
trunk/gcc/tree-vect-slp.c
trunk/gcc/tree-vect-stmts.c


[Bug regression/53729] [4.8 regression] PR53636 fix caused bb-slp-16.c to FAIL on sparc64 and powerpc64

2012-06-26 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53729

Ulrich Weigand  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

--- Comment #3 from Ulrich Weigand  2012-06-26 
09:09:28 UTC ---
Fixed.


[Bug target/53854] ICE in find_constant_pool_ref

2012-07-04 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53854

--- Comment #2 from Ulrich Weigand  2012-07-04 
17:17:22 UTC ---
The problem with this is that there was a reason why I originally supported
only a single constant pool reference per instruction: there needs to be an
upper bound on the number of bytes consumed in the text section (code +
literals) by a single insn, otherwise the "pool chunkify" mechanism doesn't
work.

As an obvious extreme example, if the asm were to refer to more than 4096 bytes
of literals, it would be impossible to refer to them all using a single literal
pool base pointer.   As another obvious extreme example, if the asm *code* were
to span more than 4096 bytes, it would be impossible to have even a single
literal in the same chunk as the asm and thus be referenced from it (using the
current chunkify algorithm).
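
To illustrate why a per-insn bound matters, here is a toy model of packing
insns into chunks.  It is not the chunkify algorithm from the s390 back
end: the names are invented, and it simply assumes that everything in one
chunk must fit within the 4096 bytes mentioned above.

```c
#include <stddef.h>

enum { TOY_CHUNK_LIMIT = 4096 };   /* Bytes reachable from one base pointer.  */

/* SIZES[i] is the number of text-section bytes (code plus literals)
   needed by insn I.  Greedily pack insns into chunks of at most
   TOY_CHUNK_LIMIT bytes.  Return the number of chunks, or -1 if some
   single insn is too large to fit in any chunk.  */
static int
count_pool_chunks (const unsigned int *sizes, size_t n)
{
  int chunks = 0;
  unsigned int used = 0;

  for (size_t i = 0; i < n; i++)
    {
      if (sizes[i] > TOY_CHUNK_LIMIT)
        return -1;
      if (chunks == 0 || used + sizes[i] > TOY_CHUNK_LIMIT)
        {
          chunks++;
          used = 0;
        }
      used += sizes[i];
    }
  return chunks;
}
```

As long as every insn stays below the limit, splitting always succeeds; a
single asm that exceeds it leaves no valid split at all, which corresponds
to the extreme examples above.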

All this is less of an issue in 64-bit code since it's much easier to address
literals, but we should still be correct for 31-bit on old machines too ...

Why do the literals end up in the pool anyway in your example, as opposed to a
register?  They did with older compilers; this seems to have changed recently
due to different IRA cost computations ...   Maybe it would be better to
prevent asm statements from generating pool references; it is likely that the
asm will expect this to happen anyway?


[Bug c/52181] [4.7 Regression] merge_decls doesn't handle DECL_USER_ALIGN properly

2012-02-09 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52181

Ulrich Weigand  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |uweigand at gcc dot gnu.org

--- Comment #2 from Ulrich Weigand  2012-02-09 
18:33:12 UTC ---
The regression is present on tip of 4.6 branch as well, exposed by recent
backports from mainline.


[Bug c/52181] [4.7 Regression] merge_decls doesn't handle DECL_USER_ALIGN properly

2012-02-14 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52181

--- Comment #5 from Ulrich Weigand  2012-02-14 
17:22:23 UTC ---
Thanks for the quick fix!  Are you planning to backport to 4.6 as well?


[Bug tree-optimization/52298] ICE: verify_ssa failed: definition in block follows use

2012-02-28 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52298

--- Comment #8 from Ulrich Weigand  2012-02-28 
23:40:37 UTC ---
Author: uweigand
Date: Tue Feb 28 23:40:32 2012
New Revision: 184645

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=184645
Log:
Partially revert:

2012-02-20  Richard Guenther  
PR tree-optimization/52298
* tree-vect-stmts.c (vectorizable_load): Properly use
STMT_VINFO_DR_STEP instead of DR_STEP when vectorizing
outer loops.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-vect-stmts.c


[Bug tree-optimization/52686] New: SLP crashes with WIDEN_LSHIFT_EXPR

2012-03-23 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52686

 Bug #: 52686
   Summary: SLP crashes with WIDEN_LSHIFT_EXPR
Classification: Unclassified
   Product: gcc
   Version: 4.7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: uweig...@gcc.gnu.org


When building the following test case:

unsigned int output[4];

void test (unsigned short *p)
{
  unsigned int x = *p;
  if (x)
    {
      output[0] = x << 1;
      output[1] = x << 1;
      output[2] = x << 1;
      output[3] = x << 1;
    }
}

GCC crashes with various symptoms (segmentation fault if checking is disabled;
assertion failures along the lines of "vector VEC(tree,base) index domain
error, in vect_create_vectorized_promotion_stmts at tree-vect-stmts.c:2130" if
checking is enabled) when building on ARM with -O1 -ftree-vectorize -mfpu=neon
-mfloat-abi=softfp.


[Bug tree-optimization/52686] SLP crashes with WIDEN_LSHIFT_EXPR

2012-03-23 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52686

--- Comment #2 from Ulrich Weigand  2012-03-23 
15:42:00 UTC ---
Created attachment 26968
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26968
Handle WIDEN_LSHIFT_EXPR in vect_get_smallest_scalar_type

The attached patch fixes the ICEs for me.


[Bug tree-optimization/52686] SLP crashes with WIDEN_LSHIFT_EXPR

2012-03-23 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52686

Ulrich Weigand  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2012-03-23
     Ever Confirmed|0                           |1

--- Comment #1 from Ulrich Weigand  2012-03-23 
15:38:43 UTC ---
It seems the root cause of all the memory problems is that
vect_get_smallest_scalar_type neglects to handle WIDEN_LSHIFT_EXPR, which
causes the SLP pass to mis-compute the number of statements needed to vectorize
the expression.
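
A simplified model of why the smallest scalar type matters is sketched
below.  The function names are invented and the arithmetic is only an
approximation of what the vectorizer does; the vector size is an assumed
parameter, not something taken from this report.

```c
/* Smallest scalar size a statement touches: for a widening operation
   the narrower input counts, not just the result.  */
static unsigned int
smallest_scalar_bytes (unsigned int lhs_bytes, unsigned int rhs_bytes,
                       int is_widening)
{
  if (is_widening && rhs_bytes < lhs_bytes)
    return rhs_bytes;
  return lhs_bytes;
}

/* Number of result vectors needed when one vector of the smallest
   type is processed at a time.  */
static unsigned int
result_vectors_needed (unsigned int vector_bytes, unsigned int lhs_bytes,
                       unsigned int smallest_bytes)
{
  unsigned int elements = vector_bytes / smallest_bytes;
  return elements * lhs_bytes / vector_bytes;
}
```

For a shift that widens a 2-byte input to a 4-byte result, assuming 16-byte
vectors, treating the operation as widening gives two result vectors; if
the narrow input is overlooked, the count comes out as one, and later code
that expects two has nothing to index.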


[Bug tree-optimization/52633] [4.7 Regression] Compiler ICE in vect_is_simple_use_1 (ARM)

2012-03-23 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52633

Ulrich Weigand  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2012-03-23
                 CC|                            |uweigand at gcc dot gnu.org
         AssignedTo|unassigned at gcc dot       |uweigand at gcc dot gnu.org
                   |gnu.org                     |
     Ever Confirmed|0                           |1


[Bug tree-optimization/52686] SLP crashes with WIDEN_LSHIFT_EXPR

2012-03-26 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52686

--- Comment #3 from Ulrich Weigand  2012-03-26 
13:13:14 UTC ---
Author: uweigand
Date: Mon Mar 26 13:13:07 2012
New Revision: 185795

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=185795
Log:
gcc/
PR tree-optimization/52686
* tree-vect-data-refs.c (vect_get_smallest_scalar_type): Handle
WIDEN_LSHIFT_EXPR.

gcc/testsuite/
PR tree-optimization/52686
* gcc.target/arm/pr52686.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/arm/pr52686.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vect-data-refs.c


[Bug tree-optimization/52686] SLP crashes with WIDEN_LSHIFT_EXPR

2012-03-26 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52686

Ulrich Weigand  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED

--- Comment #4 from Ulrich Weigand  2012-03-26 13:16:30 UTC ---
Fixed.


[Bug tree-optimization/52870] New: ICE during SLP pattern matching

2012-04-04 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52870

 Bug #: 52870
   Summary: ICE during SLP pattern matching
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: uweig...@gcc.gnu.org
Target: x86_64-linux-gnu


Building the following testcase with -O -ftree-vectorize on x86_64:

long
test (int *x)
{
  unsigned long sx, xprec;

  sx = *x >= 0 ? *x : -*x;

  xprec = sx * 64;

  if (sx < 16384)
    foo (sx);

  return xprec;
}


results in an ICE:

crash1.c:5:1: internal compiler error: vector VEC(vec_void_p,base) index domain
error, in vinfo_for_stmt at tree-vectorizer.h:628

(When building with --disable-checking, we get a segmentation fault instead.)


[Bug tree-optimization/52870] ICE during SLP pattern matching

2012-04-04 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52870

Ulrich Weigand  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2012-04-04
 AssignedTo|unassigned at gcc dot   |uweigand at gcc dot gnu.org
   |gnu.org |
 Ever Confirmed|0   |1

--- Comment #1 from Ulrich Weigand  2012-04-04 15:56:06 UTC ---
It seems the problem is that vect_recog_widen_mult_pattern includes, in a
pattern it detects, a statement that is actually outside of the basic block
that SLP is currently operating on.  This later causes the ICE, since that
statement does not have an assigned stmt_vinfo.

I'm testing a fix.


[Bug tree-optimization/52870] ICE during SLP pattern matching

2012-04-06 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52870

--- Comment #2 from Ulrich Weigand  2012-04-06 18:31:58 UTC ---
Patch posted:
http://gcc.gnu.org/ml/gcc-patches/2012-04/msg00360.html


[Bug rtl-optimization/48496] [4.7/4.8 Regression] 'asm' operand requires impossible reload

2012-04-08 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48496

--- Comment #18 from Ulrich Weigand  2012-04-08 17:32:23 UTC ---
According to Vlad's comment #4, the validity check fails because a reload insn
contains a spilled pseudo that will later be replaced by a MEM.

However, recog.c contains checks in various places that will *accept*, during
reload, a pseudo where a memory constraint is required, precisely because such
pseudos will actually get replaced by a MEM:

  case TARGET_MEM_CONSTRAINT:
[snip]
    /* During reload, accept a pseudo  */
    else if (reload_in_progress && REG_P (op)
             && REGNO (op) >= FIRST_PSEUDO_REGISTER)
      win = 1;

Note that those checks were originally added in the same patch that added this
asm validity check ...

So really that validity check shouldn't have failed just because of the
presence of a spilled pseudo.  The question is, why doesn't this work for the
ia64 test case as expected?


[Bug tree-optimization/52870] ICE during SLP pattern matching

2012-04-10 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52870

--- Comment #3 from Ulrich Weigand  2012-04-10 10:56:17 UTC ---
Author: uweigand
Date: Tue Apr 10 10:56:11 2012
New Revision: 186272

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=186272
Log:
gcc/
PR tree-optimization/52870
* tree-vect-patterns.c (vect_recog_widen_mult_pattern): Verify that
presumed pattern statement is within the same loop or basic block.

gcc/testsuite/
PR tree-optimization/52870
* gcc.dg/vect/pr52870.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/vect/pr52870.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vect-patterns.c


[Bug tree-optimization/52870] ICE during SLP pattern matching

2012-04-10 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52870

Ulrich Weigand  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED

--- Comment #4 from Ulrich Weigand  2012-04-10 10:58:25 UTC ---
Fixed.


[Bug rtl-optimization/48496] [4.7/4.8 Regression] 'asm' operand requires impossible reload

2012-04-10 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48496

--- Comment #21 from Ulrich Weigand  2012-04-10 12:16:36 UTC ---
(In reply to comment #19)

> and the matching alternative would be (f, Q) with
> 
> ;; Note that while this accepts mem, it only accepts non-volatile mem,
> ;; and so cannot be "fixed" by adjusting the address.  Thus it cannot
> ;; and does not use define_memory_constraint.
> (define_constraint "Q"
>   "Non-volatile memory for FP_REG loads/stores"
>   (and (match_operand 0 "memory_operand")
>(match_test "!MEM_VOLATILE_P (op)")))

Ah, I see.  So the fix would probably be to simply add an equivalent "if
reload_in_progress, accept pseudos" check (since pseudo stack slots are never
volatile) to this constraint ...


[Bug target/51819] [4.7 Regression] Neon wrong code generation, Error: unsupported alignment for instruction -- `vst1.32 {d2[0]},[r0:64]'

2012-04-16 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51819

--- Comment #6 from Ulrich Weigand  2012-04-16 15:19:47 UTC ---
Author: uweigand
Date: Mon Apr 16 15:19:43 2012
New Revision: 186498

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=186498
Log:
2012-04-16  Ulrich Weigand  

PR target/51819
* config/arm/arm.c (arm_print_operand): Fix invalid alignment
hints for 'A' operand types.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/arm/arm.c


[Bug tree-optimization/52633] [4.7/4.8 Regression] Compiler ICE in vect_is_simple_use_1 (ARM)

2012-04-24 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52633

--- Comment #2 from Ulrich Weigand  2012-04-24 16:52:59 UTC ---
Some more details on what's going on here:
http://gcc.gnu.org/ml/gcc-patches/2012-04/msg01486.html


[Bug middle-end/59119] New: Segfault in -fisolate-erroneous-paths pass

2013-11-13 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59119

Bug ID: 59119
   Summary: Segfault in -fisolate-erroneous-paths pass
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: uweigand at gcc dot gnu.org

Building the following test case (reduced from Python 2.7.5) with -O2 -g:

extern void *memmove (void *, const void *, __SIZE_TYPE__);
extern void *memset (void *, int, __SIZE_TYPE__);

typedef struct {
    long n_prefix;
    long n_spadding;
} NumberFieldWidths;

void
fill_number(char *buf, const NumberFieldWidths *spec)
{
    if (spec->n_prefix) {
        memmove(buf,
                (char *) 0,
                spec->n_prefix * sizeof(char));
        buf += spec->n_prefix;
    }
    if (spec->n_spadding) {
        memset(buf, 0, spec->n_spadding);
        buf += spec->n_spadding;
    }
}

crashes the compiler with:

formatter_string.i: In function ‘fill_number’:
formatter_string.i:11:1: internal compiler error: Segmentation fault
 fill_number(char *buf, const NumberFieldWidths *spec)
 ^
0x1064f147 crash_signal
/home/uweigand/src/gcc/gcc/toplev.c:334
0x1090806c ptrofftype_p
/home/uweigand/src/gcc/gcc/tree.h:4463
0x1090806c build2_stat(tree_code, tree_node*, tree_node*, tree_node*)
/home/uweigand/src/gcc/gcc/tree.c:4151
0x1022c0db gimple_assign_rhs_to_tree(gimple_statement_d*)
/home/uweigand/src/gcc/gcc/cfgexpand.c:103
0x10861e67 insert_debug_temp_for_var_def(gimple_stmt_iterator_d*, tree_node*)
/home/uweigand/src/gcc/gcc/tree-ssa.c:442
0x10862397 insert_debug_temps_for_defs(gimple_stmt_iterator_d*)
/home/uweigand/src/gcc/gcc/tree-ssa.c:549
0x103ff49b gsi_remove(gimple_stmt_iterator_d*, bool)
/home/uweigand/src/gcc/gcc/gimple-iterator.c:563
0x10ba8083 insert_trap_and_remove_trailing_statements
/home/uweigand/src/gcc/gcc/gimple-ssa-isolate-paths.c:110
0x10ba8a47 gimple_ssa_isolate_erroneous_paths
/home/uweigand/src/gcc/gcc/gimple-ssa-isolate-paths.c:305
0x10ba8a47 execute
/home/uweigand/src/gcc/gcc/gimple-ssa-isolate-paths.c:370

[Bug target/57363] IBM long double: adding NaN and number raises inexact exception

2013-11-13 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57363

Ulrich Weigand  changed:

   What|Removed |Added

 CC||uweigand at gcc dot gnu.org

--- Comment #3 from Ulrich Weigand  ---
Hi Adhemerval, I'm also seeing that this patch fixes some glibc failures.

What's the status of this?  Were you planning to submit it for inclusion?

B.t.w. I'm wondering if we don't need to use

+  if (fabs (z) != inf())
+   return z;

instead; z could still be minus infinity, right?
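
To illustrate the concern, here is a minimal standalone sketch.  It is not
the code from the patch: the two function names are made up for this note,
and INFINITY from <math.h> stands in for the patch's inf() helper.  Comparing
z directly against positive infinity lets minus infinity through, while
comparing fabs (z) catches both signs.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical check that only compares against +inf.
   Returns true for every value except +inf, so -inf slips through.  */
bool
differs_from_pos_inf (double z)
{
  return z != INFINITY;
}

/* Hypothetical check using fabs, as suggested above.
   Returns false for both +inf and -inf; NaN still compares unequal.  */
bool
differs_from_any_inf (double z)
{
  return fabs (z) != INFINITY;
}
```

With these definitions, differs_from_pos_inf (-INFINITY) is true, whereas
differs_from_any_inf (-INFINITY) is false.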


[Bug target/57949] [powerpc64] Structure parameter alignment issue with vector extensions

2013-11-15 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57949

--- Comment #9 from Ulrich Weigand  ---
Author: uweigand
Date: Fri Nov 15 23:39:50 2013
New Revision: 204870

URL: http://gcc.gnu.org/viewcvs?rev=204870&root=gcc&view=rev
Log:
gcc:

2013-11-15  Ulrich Weigand  

Backport from mainline r201750.
Note: Default setting of -mcompat-align-parm inverted!

2013-08-14  Bill Schmidt  

PR target/57949
* doc/invoke.texi: Add documentation of mcompat-align-parm
option.
* config/rs6000/rs6000.opt: Add mcompat-align-parm option.
* config/rs6000/rs6000.c (rs6000_function_arg_boundary): For AIX
and Linux, correct BLKmode alignment when 128-bit alignment is
required and compatibility flag is not set.
(rs6000_gimplify_va_arg): For AIX and Linux, honor specified
alignment for zero-size arguments when compatibility flag is not
set.

gcc/testsuite:

2013-11-15  Ulrich Weigand  

Backport from mainline r201750.
Note: Default setting of -mcompat-align-parm inverted!

2013-08-14  Bill Schmidt  

PR target/57949
* gcc.target/powerpc/pr57949-1.c: New.
* gcc.target/powerpc/pr57949-2.c: New.


Added:
branches/ibm/gcc-4_8-branch/gcc/testsuite/gcc.target/powerpc/pr57949-1.c
branches/ibm/gcc-4_8-branch/gcc/testsuite/gcc.target/powerpc/pr57949-2.c
Modified:
branches/ibm/gcc-4_8-branch/gcc/ChangeLog.ibm
branches/ibm/gcc-4_8-branch/gcc/config/rs6000/rs6000.c
branches/ibm/gcc-4_8-branch/gcc/config/rs6000/rs6000.opt
branches/ibm/gcc-4_8-branch/gcc/doc/invoke.texi
branches/ibm/gcc-4_8-branch/gcc/testsuite/ChangeLog.ibm


[Bug target/57363] IBM long double: adding NaN and number raises inexact exception

2013-12-03 Thread uweigand at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57363

Ulrich Weigand  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Ulrich Weigand  ---
Fixed.
http://gcc.gnu.org/ml/gcc-cvs/2013-12/msg00087.html


[Bug target/85075] powerpc: ICE in iszero testcase

2018-04-10 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85075

--- Comment #3 from Ulrich Weigand  ---
Maybe I'm confused, but: How does this even build?

_Float128 is a C-only extension, this type is not supposed to be available at
all in C++ mode as far as I know.

[Bug target/69286] trunk/libgcc/config/s390/tpf-unwind.h: 28 redundant condition ?

2019-09-28 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69286

--- Comment #2 from Ulrich Weigand  ---
Yes, it does appear I checked in this code, but the tpf-unwind.h changes were
actually provided by Jim Johnston on the IBM TPF team:
https://gcc.gnu.org/ml/gcc-patches/2014-07/msg02104.html

In any case, feel free to make the obvious change here :-)

[Bug bootstrap/83396] [8 Regression] Bootstrap failures with Statement Frontiers

2017-12-14 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83396

--- Comment #64 from Ulrich Weigand  ---
I'm seeing the same error on spu-elf when building newlib with GCC revision
255614.  In case this isn't fixed by more recent changes already, here's a
reduced test case (build with -O -g):

const char *
test (const char *s)
{
  for (; ; s++)
    switch (*s)
      {
      case '-':
      case 0:
        return 0;

      case '\t':
      case '\n':
      case '\v':
      case '\f':
      case '\r':
      case ' ':
        continue;

      default:
        goto break2;
      }
break2:

  for (; *s >= '0' && *s <= '9'; s++)
    ;

  return s;
}

[Bug target/86807] spu port needs updating for CVE-2017-5753

2018-08-06 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86807

--- Comment #1 from Ulrich Weigand  ---
Author: uweigand
Date: Mon Aug  6 14:40:56 2018
New Revision: 263335

URL: https://gcc.gnu.org/viewcvs?rev=263335&root=gcc&view=rev
Log:
[spu, commit] Define TARGET_HAVE_SPECULATION_SAFE_VALUE

The SPU processor is not affected by speculation, so this macro can
safely be defined as speculation_safe_value_not_needed.

gcc/ChangeLog:

PR target/86807
* config/spu/spu.c (TARGET_HAVE_SPECULATION_SAFE_VALUE):
Define to speculation_safe_value_not_needed.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/spu/spu.c

[Bug target/86807] spu port needs updating for CVE-2017-5753

2018-08-06 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86807

Ulrich Weigand  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED
   Assignee|unassigned at gcc dot gnu.org  |uweigand at gcc dot gnu.org

--- Comment #2 from Ulrich Weigand  ---
Fixed.

[Bug target/86772] [meta-bug] tracking port status for CVE-2017-5753

2018-08-06 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86772
Bug 86772 depends on bug 86807, which changed state.

Bug 86807 Summary: spu port needs updating for CVE-2017-5753
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86807

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

[Bug rtl-optimization/85180] New: Infinite (?) loop in RTL DSE optimizer

2018-04-03 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85180

Bug ID: 85180
   Summary: Infinite (?) loop in RTL DSE optimizer
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: uweigand at gcc dot gnu.org
CC: krebbel at gcc dot gnu.org
  Target Milestone: ---
Target: s390x-ibm-linux

Created attachment 43828
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43828&action=edit
Test case - run with "cc1plus -O"

When attempting to compile the attached testcase simply with "cc1plus -O" on an
s390x-ibm-linux target, the compilation process never terminates.  The problem
appears to originate in the dse.c pass; building with -fno-dse makes the
problem go away.

I'm not completely sure that this is really an infinite loop, strictly
speaking, or just some exponential time behavior somewhere.  In any case, at
the time the compiler hangs, it sits in a long chain of find_base_term calls
ultimately originating at the canon_output_dependence in dse.c:1593
(record_store).

[Bug target/70117] ppc long double isinf() is wrong?

2016-03-07 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70117

Ulrich Weigand  changed:

   What|Removed |Added

 CC||amodra at gcc dot gnu.org,
   ||dje at gcc dot gnu.org,
   ||uweigand at gcc dot gnu.org

--- Comment #1 from Ulrich Weigand  ---
Hmm.  For some reason, the gnulib definition of LDBL_MAX differs from GCC's
definition.   With gnulib, we get:

  high double: 7FEF 
  low  double: 7C8F 

while with GCC, we get:

  high double: 7FEF 
  low  double: 7C8F FFFE

In any case, someone is wrong here -- values of LDBL_MAX should certainly
agree.

Now, while I'm not completely sure why GCC chooses the value it does, I notice
that GCC's choice is certainly deliberate; there's extra code to flip the last
bit:

real.c:get_max_float

  if (fmt->pnan < fmt->p)
    {
      /* This is an IBM extended double format made up of two IEEE
         doubles.  The value of the long double is the sum of the
         values of the two parts.  The most significant part is
         required to be the value of the long double rounded to the
         nearest double.  Rounding means we need a slightly smaller
         value for LDBL_MAX.  */
      buf[4 + fmt->pnan / 4] = "7bde"[fmt->pnan % 4];
    }

and similarly real.c:real_maxval

  if (fmt->pnan < fmt->p)
    /* This is an IBM extended double format made up of two IEEE
       doubles.  The value of the long double is the sum of the
       values of the two parts.  The most significant part is
       required to be the value of the long double rounded to the
       nearest double.  Rounding means we need a slightly smaller
       value for LDBL_MAX.  */
    clear_significand_bit (r, SIGNIFICAND_BITS - fmt->pnan - 1);

This code was originally added by Alan back in 2004 ... adding Alan on CC if he
recalls what this was all about.

[Bug target/70117] ppc long double isinf() is wrong?

2016-03-07 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70117

--- Comment #4 from Ulrich Weigand  ---
(In reply to Alan Modra from comment #3)
> > while with GCC, we get:
> > 
> >   high double: 7FEF 
> >   low  double: 7C8F FFFE
> 
> Right.  This is 0x1.f78p+1023
> 
> gnulib isn't correct here.  As the comment says the high double must be the
> value of the long double correctly rounded to double (to nearest since that
> is the only mode supported for IBM extended double).  Any long double value
> higher than the above will round up the high double to inf.

Well, what I don't quite understand is that the gnulib value, which is

0x1.f7cp+1023

likewise should round to the same double value, shouldn't it?  I notice that if
I actually attempt to use that value in C source code, the compiler does indeed
round it to inf -- but I don't see why it actually should do so ...

[Bug target/70117] ppc long double isinf() is wrong?

2016-03-07 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70117

--- Comment #7 from Ulrich Weigand  ---
Ah, OK.  I didn't realize this value didn't fit into a 106-bit mantissa.

I agree that it probably doesn't make sense to change the internal
representation to allow larger mantissas.  First of all, there's nothing really
special about 107 bits; there can be IBM long double values that would require
a much larger mantissa in the internal representation, since we can have many
implicit zero bits.  But more problematically, if we change the internal
representation to a mantissa larger than 106 bits, there will be values in that
internal format that cannot be represented directly in the target IBM long
double format.

In any case, I certainly agree that the is* routines for IBM long double should
simply operate on the high double of the pair.

I still think that it would be better for gnulib to use the same LDBL_MAX as
GCC, which means gnulib should probably be changed to use the 106-bit value.
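
As a rough sketch of what "operate on the high double of the pair" means
(this is not the glibc or gnulib implementation; struct ibm_ldouble and the
ibm_is* names are made up for this note):

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical stand-in for an IBM extended double.  The value is
   hi + lo, and hi is required to be that value rounded to the
   nearest double.  */
struct ibm_ldouble
{
  double hi;
  double lo;
};

/* Classify the pair by looking at the high double only; the low
   double is deliberately ignored.  */
bool
ibm_isinf (struct ibm_ldouble x)
{
  return isinf (x.hi) != 0;
}

bool
ibm_isnan (struct ibm_ldouble x)
{
  return isnan (x.hi) != 0;
}

bool
ibm_isfinite (struct ibm_ldouble x)
{
  return isfinite (x.hi) != 0;
}
```

Since the high double is the whole value rounded to nearest, it is already
infinite or NaN whenever the long double is, so the low part carries no
extra classification information in this model.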

[Bug rtl-optimization/70168] New: [5 Regression] Wrong code generation in __sync_val_compare_and_swap on PowerPC

2016-03-10 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70168

Bug ID: 70168
   Summary: [5 Regression] Wrong code generation in
__sync_val_compare_and_swap on PowerPC
   Product: gcc
   Version: 5.4.0
Status: UNCONFIRMED
  Severity: major
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: uweigand at gcc dot gnu.org
CC: amodra at gcc dot gnu.org, dje at gcc dot gnu.org
  Target Milestone: ---
Target: powerpc64le-linux

Building the following test case on powerpc64le-linux with -O2 using the
current GCC 5 branch:

unsigned long atomicAND (volatile unsigned long *memRef, unsigned long mask)
{
  unsigned long oldValue, newValue, prevValue;

  for (oldValue = *memRef; ; oldValue = prevValue)
    {
      newValue = oldValue & mask;
      if (newValue == oldValue)
        break;

      prevValue = __sync_val_compare_and_swap (memRef, oldValue, newValue);
      if (prevValue == oldValue)
        break;
    }

  return oldValue;
}

results in this assembler output:

atomicAND:
        ld 9,0(3)
        b .L6
        .p2align 4,,15
.L10:
        sync
.L3:
        ldarx 10,0,3
        cmpd 0,10,9
        bne- 0,.L4
        stdcx. 9,0,3
        bne- 0,.L3
.L4:
        isync
        cmpld 7,9,10
        beq 7,.L7
        mr 9,10
.L6:
        and 10,9,4
        cmpld 7,9,10
        bne 7,.L10
.L7:
        mr 3,10
        blr

Note how the stdcx. stores r9, which holds the original value, not the masked
value.  Therefore, this will usually succeed without updating the memory
location.

Debugging this problem, it turns out that rs6000_expand_atomic_compare_and_swap
is called with retval (operands[1]) equal to newval (operands[4]), and the
expander then proceeds to clobber newval before using it.

There is code to detect overlap of retval with oldval (operands[3]), but not
with newval.  Adding the equivalent detection code fixes the problem for me.

For some reason, I can reproduce this problem only with GCC 5; the same problem
should still be latently present in mainline since there's still no overlap
check, but I seem unable to construct a test case that would actually cause
retval == newval with mainline.
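
The hazard can be modelled in plain C.  This is only a sketch: it is not the
rs6000 expander, the function names are made up, and two pointers to the same
variable stand in for retval and newval sharing one register.

```c
/* Hypothetical model of the problematic expansion: retval is written
   before newval is read.  If retval and newval alias, the store uses
   the value just loaded from memory instead of the intended new one.  */
void
cas_clobbering (unsigned long *mem, unsigned long *retval,
                unsigned long oldval, const unsigned long *newval)
{
  *retval = *mem;               /* clobbers *newval when they alias */
  if (*retval == oldval)
    *mem = *newval;
}

/* Hypothetical model with an overlap guard: newval is copied to a
   temporary before retval is written.  */
void
cas_overlap_safe (unsigned long *mem, unsigned long *retval,
                  unsigned long oldval, const unsigned long *newval)
{
  unsigned long saved_new = *newval;

  *retval = *mem;
  if (*retval == oldval)
    *mem = saved_new;
}
```

Calling cas_clobbering with the same variable as retval and newval leaves
*mem unchanged even though the comparison succeeds, which mirrors the stdcx.
storing the original value in the assembler output above.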

[Bug rtl-optimization/70168] [5 Regression] Wrong code generation in __sync_val_compare_and_swap on PowerPC

2016-03-10 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70168

--- Comment #1 from Ulrich Weigand  ---
Created attachment 37925
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37925&action=edit
Patch to add retval vs. newval overlap check

This patch fixes the problem for me with the GCC 5 branch.  Not fully
regression tested yet.

[Bug target/70168] [5 Regression] Wrong code generation in __sync_val_compare_and_swap on PowerPC

2016-03-10 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70168

Ulrich Weigand  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
URL||https://gcc.gnu.org/ml/gcc-
   ||patches/2016-03/msg00671.ht
   ||ml
  Component|rtl-optimization|target
   Assignee|unassigned at gcc dot gnu.org  |uweigand at gcc dot gnu.org

--- Comment #3 from Ulrich Weigand  ---
Patch posted.

[Bug target/70168] [5 Regression] Wrong code generation in __sync_val_compare_and_swap on PowerPC

2016-03-10 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70168

--- Comment #4 from Ulrich Weigand  ---
Author: uweigand
Date: Thu Mar 10 23:58:44 2016
New Revision: 234126

URL: https://gcc.gnu.org/viewcvs?rev=234126&root=gcc&view=rev
Log:
PR target/70168
* config/rs6000/rs6000.c (rs6000_expand_atomic_compare_and_swap):
Handle overlapping retval and newval.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/rs6000/rs6000.c

[Bug target/70168] [5 Regression] Wrong code generation in __sync_val_compare_and_swap on PowerPC

2016-03-10 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70168

--- Comment #5 from Ulrich Weigand  ---
Author: uweigand
Date: Thu Mar 10 23:59:20 2016
New Revision: 234127

URL: https://gcc.gnu.org/viewcvs?rev=234127&root=gcc&view=rev
Log:
PR target/70168
* config/rs6000/rs6000.c (rs6000_expand_atomic_compare_and_swap):
Handle overlapping retval and newval.

Modified:
branches/gcc-5-branch/gcc/ChangeLog
branches/gcc-5-branch/gcc/config/rs6000/rs6000.c

[Bug target/70168] [5 Regression] Wrong code generation in __sync_val_compare_and_swap on PowerPC

2016-03-10 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70168

Ulrich Weigand  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Ulrich Weigand  ---
Fixed.

[Bug rtl-optimization/78812] New: Wrong code generation due to hoisting memory load across function call

2016-12-14 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78812

Bug ID: 78812
   Summary: Wrong code generation due to hoisting memory load
across function call
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: uweigand at gcc dot gnu.org
  Target Milestone: ---

Compiling the following test case with g++ -Os -fpic on s390x-ibm-linux results
in abort() being called unexpectedly:

extern "C" void abort (void) __attribute__ ((__noreturn__));

class Transaction
{
public:
  bool Aborted;

  Transaction () : Aborted (false) { }

  ~Transaction ()
  {
    if (!Aborted)
      abort ();
  }
};

void test (Transaction &Trans) __attribute__ ((noinline));
void test (Transaction &Trans)
{
  Trans.Aborted = true;
}

int main (void)
{
  Transaction T;
  test (T);
}


What happens is that the destructor for Transaction is inlined into main
(twice, once in the regular exit and once in the exception path).  The load of
the Aborted member is then moved by the code hoisting pass (pass_rtl_hoist) in
gcse.c to a single location in basic block 2.

However, that basic block ends with the call to the "test" routine, and for
some reason, code hoisting thinks it therefore needs to move the hoisted
instruction to *before* that call (see the CALL_P case of
insert_insn_end_basic_block).  This causes wrong code generation since that
call actually modifies the memory that is being loaded.

There seems to be other code that appears intended to recognize that case and
avoid hoisting then (see prune_expressions), but that apparently only looks for
abnormal edges, which we don't have here.

Not sure how this is supposed to work correctly ...
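
The effect of the misplaced load can be shown with a small C model.  This is
a sketch of the ordering problem only, not of the gcse.c pass; all names are
made up for this note.

```c
#include <stdbool.h>

struct transaction
{
  bool aborted;
};

/* Stand-in for the "test" routine: it modifies the object.  */
void
mark_aborted (struct transaction *t)
{
  t->aborted = true;
}

/* Load placed after the call: sees the store made by the callee.  */
bool
load_after_call (struct transaction *t)
{
  mark_aborted (t);
  return t->aborted;
}

/* Load placed before the call, modelling the hoisted instruction:
   returns the stale value from before the callee's store.  */
bool
load_before_call (struct transaction *t)
{
  bool stale = t->aborted;

  mark_aborted (t);
  return stale;
}
```

Starting from aborted == false, load_after_call returns true while
load_before_call returns false; in the test case, that stale value is what
makes the destructor call abort ().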

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-03 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #22 from Ulrich Weigand  ---
(In reply to Jakub Jelinek from comment #21)
> Could libsanitizer call __tls_get_offset instead, after setting %r12 or
> whatever else is needed for it to make work and then perhaps adjust the
> result if needed?
> E.g. on s390x __tls_get_offset is internally:
> __tls_get_offset:\n\
> la  %r2,0(%r2,%r12)\n\
> jg  __tls_get_addr\n\
> and in the interceptor:
> #ifdef __s390x__
>   "la %r2, 0(%r2,%r12)\n"
>   "jg __interceptor___tls_get_addr_internal_protected\n"
> #else
> at which point the original %r2 and %r12 is lost and it is hard to call the
> original __tls_get_offset, it might be better to pass the original %r2 and
> %r12 values to some C function and from that compute the r2 + r12 the code
> perhaps needs for its own thing, but then we could (again in assembly) call
> the original __tls_get_offset again if needed.

Yes, it would appear to be safer to call __tls_get_offset instead.
You probably do not even need the original %r12, but simply subtract
%r12 (whatever it currently is) from %r2 before calling the original
__tls_get_offset.  The value of %r12 is not used for anything except
adding it to %r2.
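
The arithmetic can be checked with a toy C model.  This is an illustration
only: registers are modelled as integers, the model_* and call_original_*
names are made up, and model_tls_get_addr simply returns its argument so
that the value it receives is visible.

```c
#include <stdint.h>

/* Hypothetical stand-in for __tls_get_addr: returns its argument.  */
uintptr_t
model_tls_get_addr (uintptr_t arg)
{
  return arg;
}

/* Models "la %r2,0(%r2,%r12); jg __tls_get_addr".  */
uintptr_t
model_tls_get_offset (uintptr_t r2, uintptr_t r12)
{
  return model_tls_get_addr (r2 + r12);
}

/* Interceptor side: only the sum is left in %r2.  Subtracting whatever
   %r12 currently holds before calling the original cancels the
   addition it performs, whatever that %r12 value is.  */
uintptr_t
call_original_with_sum (uintptr_t sum, uintptr_t current_r12)
{
  return model_tls_get_offset (sum - current_r12, current_r12);
}
```

Unsigned wrap-around makes the subtraction and the later addition cancel for
any value of current_r12 in this model.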

> That said, if asan wants to intercept also what dlsym will internally call,
> then that will not really work.  But does libasan on other targets rely on
> dlsym calling __tls_get_addr internally in those cases?  That would be yet
> another reliance on glibc internals.

As I understand it, they do make that assumption; libsanitizer must get
involved at the point any new TLS data section is allocated.  Since this
allocation may happen as a result of a dlsym call, those cases have to
be intercepted as well.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-11 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #48 from Ulrich Weigand  ---
s390(x) has -fasynchronous-unwind-tables on by default anyway, and .eh_frame
based DWARF unwinding is the only way to create stack backtraces that always
works.

However, I understood that asan deliberately doesn't want to use DWARF
unwinding for the malloc/free case since it can be slow.  That's why Marcin
actually added -mbackchain to LLVM in the first place.  (We've had -mbackchain
in GCC forever, but it has defaulted to off for a very long time.)

I don't think we should switch to *always* using backchain unwinding in asan,
since system libraries on s390 will be built without backchain.  However,
switching -mbackchain on by default when building for asan might make sense.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-15 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #59 from Ulrich Weigand  ---
(In reply to Dominik Vogt from comment #57)
> libsanitizer miscalculates the Pcs in the backtrace:
> 
> #0 0x1000839 in NullDeref
> #1 0x10006c1 in main
> #2 0x3fff6e23069 in __libc_start_main
> #3 0x100073d
> 
> These are all odd addresses, pointing to the last byte of the previous
> instruction.  In case of null-deref-1.c that byte belongs to some
> instrumentation code that is associated with line 11.

Normally you should decrement the return address by one for normal frames (in
order to identify the call instruction), but you should not decrement the
return address for signal frames (since the address already identifies the
faulting instruction).

That's why there's usually a bit to distinguish signal frames from normal
frames during unwinding.  Maybe this somehow doesn't work correctly with the
libsanitizer unwinding?
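
A minimal sketch of that rule in C (not libsanitizer code; lookup_pc and its
parameters are made up for this note):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: choose the address to use for symbol and line
   lookup for one frame of a backtrace.  */
uintptr_t
lookup_pc (uintptr_t frame_pc, bool is_signal_frame)
{
  /* A signal frame already records the faulting instruction.  */
  if (is_signal_frame)
    return frame_pc;

  /* A normal frame records the return address, i.e. the instruction
     after the call; stepping back one byte lands inside the call.  */
  return frame_pc - 1;
}
```

For example, a normal frame with return address 0x10006c2 is looked up at
0x10006c1, while a signal frame at 0x1000839 is looked up unchanged.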

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-15 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #60 from Ulrich Weigand  ---
... well, as Florian said as well :-)

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-15 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #71 from Ulrich Weigand  ---
(In reply to Dominik Vogt from comment #70)
> If funny line information is the only consequence, no.  Is it safe to assume
> that libsanitizer won't crash or produce garbege because of this?

Why should line information be "funny"?  With the odd addresses (decremented by
one), line information should identify the line of the call; otherwise we'd get
the line *after* the call.  IMO identifying the call is actually better ...

[Bug target/82960] spu_machine_dependent_reorg does not handle jump_table_data insn

2017-11-20 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82960

--- Comment #3 from Ulrich Weigand  ---
I'll have a look.   I still need to get my SPU build environment back up and
running, the build currently fails due to unrelated issues.

I remember looking at this a few years back:
https://gcc.gnu.org/ml/gcc-patches/2013-04/msg00151.html

This seemed to have fixed the problem back then, not sure why this no longer
works.
