[Bug target/91035] [10 Regression] gotools fails to build on s390x-linux-gnu

2019-10-14 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91035

--- Comment #9 from Andreas Krebbel  ---
I've just posted two patches to fix the remaining GO build problems on S/390.
Ian, could you please pick those up to make GO build again on S/390?

Sync hardware facility names with other files in os_linux_s390x.go
https://gcc.gnu.org/ml/gcc-patches/2019-10/msg00963.html

GO S/390: Add kdsaQuery function
https://gcc.gnu.org/ml/gcc-patches/2019-10/msg00964.html

[Bug go/91035] [10 Regression] gotools fails to build on s390x-linux-gnu

2019-10-10 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91035

--- Comment #8 from Andreas Krebbel  ---
With that patch GCCGO bootstraps fine until r275473 where libgo got updated to
version 1.13beta1. Then there are a few problems with hardware crypto support
on Z. I'll try to address these with separate patches and BZs.

I plan to commit the patch also to GCC 9 and 8 after giving it some time on
master.

[Bug go/91035] [10 Regression] gotools fails to build on s390x-linux-gnu

2019-10-10 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91035

--- Comment #7 from Andreas Krebbel  ---
Author: krebbel
Date: Thu Oct 10 07:56:25 2019
New Revision: 276790

URL: https://gcc.gnu.org/viewcvs?rev=276790&root=gcc&view=rev
Log:
S/390: PR91035 Fix call to __morestack

For the call to __morestack we use a special ABI in the S/390 back-end
which requires us to emit a parameter block to the .rodata section.
It contains the label to which __morestack needs to return.  The
parameter block needs to be explicit in RTL since we also need its
address loaded into r1 in order to pass it to __morestack.  In order
to express correctly what __morestack does, its RTX also contained the
return label.  Hence the return label occurred twice in the insn
stream.  This is problematic when it comes to redirecting edges.  The
correlation between these two occurrences of the label cannot be
expressed, so when doing a redirect only the label in the jump RTX
gets modified while the parameter block label stays as is.

The patch avoids having two instances of the label by merging the
parameter block generation and the __morestack call RTX into one. By
doing this I could also get rid of the unspec which was required for
the parameter block generation so far.
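
The failure mode described in the log can be illustrated with a small
stand-alone C model.  This is only a sketch: the struct and function
names are hypothetical and do not correspond to anything in the GCC
sources.  It shows why a label stored in two unrelated places goes stale
when a redirect only knows about one of them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified model of the old shape: the return label is recorded
   twice, once in the jump and once in the parameter block.  */
struct call_with_param_block
{
  int jump_label;         /* label referenced by the call/jump RTX */
  int param_block_label;  /* copy of the label emitted to .rodata */
};

/* Stand-in for an edge redirect: it only knows about the jump label.  */
static void
redirect_jump (struct call_with_param_block *call, int new_label)
{
  assert (call != NULL);
  call->jump_label = new_label;
}

/* True when both references still name the same label.  */
static bool
labels_consistent (const struct call_with_param_block *call)
{
  assert (call != NULL);
  return call->jump_label == call->param_block_label;
}
```

In this model, a redirect leaves param_block_label untouched, so
labels_consistent returns false afterwards.  Keeping a single reference,
as the merged insn does, removes the possibility of such a mismatch.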

gcc/ChangeLog:

2019-10-10  Andreas Krebbel  

PR target/91035
* config/s390/s390-protos.h (s390_output_split_stack_data): Add
prototype.
* config/s390/s390.md (UNSPECV_SPLIT_STACK_DATA): Remove.
("split_stack_data", "split_stack_call")
("split_stack_call_<mode>", "split_stack_cond_call")
("split_stack_cond_call_<mode>"): Remove.
("@split_stack_call<mode>", "@split_stack_cond_call<mode>"): New
insn definition.
* config/s390/s390.c (s390_output_split_stack_data): New function.
(s390_expand_split_stack_prologue): Use the merged expander.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/s390/s390-protos.h
trunk/gcc/config/s390/s390.c
trunk/gcc/config/s390/s390.md

[Bug rtl-optimization/88751] Performance regression reload vs lra

2019-09-20 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88751

--- Comment #9 from Andreas Krebbel  ---
Author: krebbel
Date: Fri Sep 20 12:18:26 2019
New Revision: 276000

URL: https://gcc.gnu.org/viewcvs?rev=276000&root=gcc&view=rev
Log:
Fix PR88751

This patch implements a small improvement for the heuristic in lra
which decides when it has to activate the simpler register allocation
algorithm.

gcc/ChangeLog:

2019-09-20  Andreas Krebbel  

Backport from mainline
2019-06-06  Andreas Krebbel  

PR rtl-optimization/88751
* ira.c (ira): Use the number of the actually referenced registers
when calculating the threshold.


Modified:
branches/gcc-9-branch/gcc/ChangeLog
branches/gcc-9-branch/gcc/ira.c

[Bug rtl-optimization/88751] Performance regression reload vs lra

2019-09-20 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88751

--- Comment #8 from Andreas Krebbel  ---
Author: krebbel
Date: Fri Sep 20 09:23:50 2019
New Revision: 275993

URL: https://gcc.gnu.org/viewcvs?rev=275993&root=gcc&view=rev
Log:
Fix PR88751

This patch implements a small improvement for the heuristic in lra
which decides when it has to activate the simpler register allocation
algorithm.

gcc/ChangeLog:

2019-09-20  Andreas Krebbel  

Backport from mainline
2019-06-06  Andreas Krebbel  

PR rtl-optimization/88751
* ira.c (ira): Use the number of the actually referenced registers
when calculating the threshold.


Modified:
branches/gcc-7-branch/gcc/ChangeLog
branches/gcc-7-branch/gcc/ira.c

[Bug rtl-optimization/88751] Performance regression reload vs lra

2019-09-20 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88751

--- Comment #7 from Andreas Krebbel  ---
Author: krebbel
Date: Fri Sep 20 09:03:44 2019
New Revision: 275991

URL: https://gcc.gnu.org/viewcvs?rev=275991&root=gcc&view=rev
Log:
Fix PR88751

This patch implements a small improvement for the heuristic in lra
which decides when it has to activate the simpler register allocation
algorithm.

gcc/ChangeLog:

2019-09-20  Andreas Krebbel  

Backport from mainline
2019-06-06  Andreas Krebbel  

PR rtl-optimization/88751
* ira.c (ira): Use the number of the actually referenced registers
when calculating the threshold.


Modified:
branches/gcc-8-branch/gcc/ChangeLog
branches/gcc-8-branch/gcc/ira.c

[Bug go/91035] [10 Regression] gotools fails to build on s390x-linux-gnu

2019-09-05 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91035

Andreas Krebbel  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2019-09-05
     Ever confirmed|0                           |1

--- Comment #6 from Andreas Krebbel  ---
Bisect indicates that the problem might be related to that change:

Author: ian
Date: Thu May 30 17:26:46 2019
New Revision: 271784

URL: https://gcc.gnu.org/viewcvs?rev=271784&root=gcc&view=rev
Log:
compiler: intrinsify sync/atomic functions

Let the Go frontend recognize sync/atomic functions and turn them
into intrinsics.

Also make sure not to intrinsify calls in go or defer statements.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/178937

Modified:
trunk/gcc/go/gofrontend/MERGE
trunk/gcc/go/gofrontend/expressions.cc
trunk/gcc/go/gofrontend/statements.cc

[Bug target/69142] missing documentation for s/390 zvector builtin features

2019-07-15 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69142

Andreas Krebbel  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2019-07-15
     Ever confirmed|0                           |1

[Bug rtl-optimization/88751] Performance regression reload vs lra

2019-06-06 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88751

--- Comment #6 from Andreas Krebbel  ---
Author: krebbel
Date: Thu Jun  6 11:35:04 2019
New Revision: 271996

URL: https://gcc.gnu.org/viewcvs?rev=271996&root=gcc&view=rev
Log:
Fix PR88751

This patch implements a small improvement for the heuristic in lra
which decides when it has to activate the simpler register allocation
algorithm.

gcc/ChangeLog:

2019-06-06  Andreas Krebbel  

PR rtl-optimization/88751
* ira.c (ira): Use the number of the actually referenced registers
when calculating the threshold.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/ira.c

[Bug rtl-optimization/88751] Performance regression reload vs lra

2019-05-31 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88751

--- Comment #4 from Andreas Krebbel  ---
(In reply to Babneet Singh from comment #3)
> Hi Andreas and Richard: What's the status for this issue? Which approach
> will be used to resolve this issue?

I would like to have Vladimir comment on this first, since he wrote the code
and definitely knows this stuff best.

Richard, would it be ok with you to raise the prio? OpenJ9 is a pretty
important workload I think.

[Bug target/89952] S/390: Inconsistent CFI info when restoring frame pointer from fpr

2019-04-24 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89952

Andreas Krebbel  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #3 from Andreas Krebbel  ---
Fixed upstream with the patch from comment #2

[Bug target/89952] S/390: Inconsistent CFI info when restoring frame pointer from fpr

2019-04-24 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89952

--- Comment #2 from Andreas Krebbel  ---
Author: krebbel
Date: Wed Apr 24 13:40:38 2019
New Revision: 270544

URL: https://gcc.gnu.org/viewcvs?rev=270544&root=gcc&view=rev
Log:
S/390: Fix PR89952 incorrect CFI

This patch fixes a case where inconsistent CFI is generated.

After restoring the hard frame pointer (r11) from an FPR we have to
set the CFA register.  In order to be able to set it back to the stack
pointer (r15) we have to make sure that r15 has been restored already.

The patch also adds a scheduler dependency to prevent the instruction
scheduler from swapping the r11 and r15 restore again.

gcc/ChangeLog:

2019-04-24  Andreas Krebbel  

PR target/89952
* config/s390/s390.c (s390_restore_gprs_from_fprs): Restore GPRs
from FPRs in reverse order.  Generate REG_CFA_DEF_CFA note also
for restored hard frame pointer.
(s390_sched_dependencies_evaluation): Implement new target hook.
(TARGET_SCHED_DEPENDENCIES_EVALUATION_HOOK): New macro definition.

gcc/testsuite/ChangeLog:

2019-04-24  Andreas Krebbel  

PR target/89952
* gcc.target/s390/pr89952.c: New test.


Added:
trunk/gcc/testsuite/gcc.target/s390/pr89952.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/s390/s390.c
trunk/gcc/testsuite/ChangeLog

[Bug target/89952] S/390: Inconsistent CFI info when restoring frame pointer from fpr

2019-04-03 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89952

--- Comment #1 from Andreas Krebbel  ---
Created attachment 46083
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46083&action=edit
Experimental patch

[Bug target/89952] New: S/390: Inconsistent CFI info when restoring frame pointer from fpr

2019-04-03 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89952

            Bug ID: 89952
           Summary: S/390: Inconsistent CFI info when restoring frame
                    pointer from fpr
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: krebbel at gcc dot gnu.org
  Target Milestone: ---

Created attachment 46082
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46082&action=edit
Testcase

Compiling the attached testcase with GCC built with checking enabled ICEs:

during RTL pass: dwarf2
t.c:10:1: internal compiler error: in maybe_record_trace_start, at
dwarf2cfi.c:2348   
 }
 ^
0x13a7649 maybe_record_trace_start
/home/andreas/gcc/gcc/dwarf2cfi.c:2348
0x13a9d6f scan_trace
/home/andreas/gcc/gcc/dwarf2cfi.c:2541
0x13aa40b create_cfi_notes
/home/andreas/gcc/gcc/dwarf2cfi.c:2694
0x13aa40b execute_dwarf2_frame
/home/andreas/gcc/gcc/dwarf2cfi.c:3057
0x13aa40b execute
/home/andreas/gcc/gcc/dwarf2cfi.c:3545

There is an edge from very early in the function to right before the call to
"j", which is executed as a sibcall. In between, the hard frame pointer (r11) is
saved to an FPR, set to the stack pointer, and restored from the FPR. After
restoring the hard frame pointer register to its former value the backend
fails to set the CFA register back to r15. That's why the sibcall insn can be
reached with the CFA register being either r11 or r15.
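
The inconsistency can be modelled with a toy CFA tracker in C.  This is
a rough sketch under simplifying assumptions: the step names and helper
functions are hypothetical, and the model only tracks which register the
CFA is based on, not the real dwarf2cfi.c state.  One path reaches the
sibcall directly, the other sets up and tears down the frame pointer
first, and the two must agree at the join.

```c
#include <assert.h>
#include <stdbool.h>

enum { R11 = 11, R15 = 15 };

/* Hypothetical, simplified events affecting the CFA register.  */
enum step
{
  SET_FP_FROM_SP,        /* r11 := r15, CFA becomes r11-based */
  RESTORE_FP_KEEP_CFA,   /* r11 restored from FPR, CFA left at r11 (the bug) */
  RESTORE_FP_RESET_CFA   /* r11 restored from FPR, CFA set back to r15 */
};

/* Return the CFA register after running N steps, starting with r15.  */
static int
cfa_after (const enum step *steps, int n)
{
  assert (n >= 0);
  int cfa = R15;
  for (int i = 0; i < n; i++)
    switch (steps[i])
      {
      case SET_FP_FROM_SP:       cfa = R11; break;
      case RESTORE_FP_KEEP_CFA:  break;
      case RESTORE_FP_RESET_CFA: cfa = R15; break;
      }
  return cfa;
}

/* Two paths meeting at one insn must have the same CFA register.  */
static bool
paths_agree (const enum step *a, int na, const enum step *b, int nb)
{
  return cfa_after (a, na) == cfa_after (b, nb);
}
```

With an empty early path and a long path ending in RESTORE_FP_KEEP_CFA
the two paths disagree (r15 versus r11), which corresponds to the state
the checking assertion rejects.  Ending the long path with
RESTORE_FP_RESET_CFA makes them agree.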

[Bug target/89775] [7/8 Regression] S/390: Stackpointer save/restore instructions optimized away

2019-03-25 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89775

--- Comment #4 from Andreas Krebbel  ---
Author: krebbel
Date: Mon Mar 25 10:18:57 2019
New Revision: 269910

URL: https://gcc.gnu.org/viewcvs?rev=269910&root=gcc&view=rev
Log:
S/390: Fix PR89775. Stackpointer save/restore instructions removed

Even if a global register is being clobbered in a function we usually
do not save and restore it. However, we still have to do this if it is
a special register. Most of the places in the backend handle this
correctly but not the prologue/epilogue optimization.

gcc/ChangeLog:

2019-03-25  Andreas Krebbel  

Backport from mainline
2019-03-20  Andreas Krebbel  

PR target/89775
* config/s390/s390.c (global_not_special_regno_p): Move to make it
available to ...
(s390_optimize_register_info): Use global_not_special_regno_p to
check for global regs.

2019-03-25  Andreas Krebbel  

Backport from mainline
2019-03-20  Jakub Jelinek  

PR target/89775
* gcc.target/s390/pr89775-1.c: New test.
* gcc.target/s390/pr89775-2.c: New test.


Added:
branches/gcc-8-branch/gcc/testsuite/gcc.target/s390/pr89775-1.c
branches/gcc-8-branch/gcc/testsuite/gcc.target/s390/pr89775-2.c
Modified:
branches/gcc-8-branch/gcc/ChangeLog
branches/gcc-8-branch/gcc/config/s390/s390.c
branches/gcc-8-branch/gcc/testsuite/ChangeLog

[Bug target/89775] [7/8/9 Regression] S/390: Stackpointer save/restore instructions optimized away

2019-03-20 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89775

--- Comment #2 from Andreas Krebbel  ---
Author: krebbel
Date: Wed Mar 20 15:28:38 2019
New Revision: 269823

URL: https://gcc.gnu.org/viewcvs?rev=269823&root=gcc&view=rev
Log:
S/390: Fix PR89775. Stackpointer save/restore instructions removed

Even if a global register is being clobbered in a function we usually
do not save and restore it. However, we still have to do this if it is
a special register. Most of the places in the backend handle this
correctly but not the prologue/epilogue optimization.

gcc/ChangeLog:

2019-03-20  Andreas Krebbel  

PR target/89775
* config/s390/s390.c (global_not_special_regno_p): Move to make it
available to ...
(s390_optimize_register_info): Use global_not_special_regno_p to
check for global regs.

2019-03-20  Jakub Jelinek  

PR target/89775
* gcc.target/s390/pr89775-1.c: New test.
* gcc.target/s390/pr89775-2.c: New test.


Added:
trunk/gcc/testsuite/gcc.target/s390/pr89775-1.c
trunk/gcc/testsuite/gcc.target/s390/pr89775-2.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/s390/s390.c
trunk/gcc/testsuite/ChangeLog

[Bug target/89775] New: Stackpointer save/restore instructions optimized away

2019-03-20 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89775

            Bug ID: 89775
           Summary: Stackpointer save/restore instructions optimized away
           Product: gcc
           Version: 7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: krebbel at gcc dot gnu.org
  Target Milestone: ---

Defining the stack pointer register as a global register causes wrong code to
be generated due to a bug in the prologue/epilogue optimization. In this case
the prologue save instruction for the stack pointer is removed while the
epilogue insn is kept. So r15 will be restored from f0 and hence loaded with
garbage.

register a __asm__("15");
b() { char c = 0; }

b:
        ldgr    %f2,%r11
        lay     %r15,-168(%r15)
        lgr     %r11,%r15
        mvi     167(%r11),0
        lr      0,0
        lgr     %r2,%r1
        lgdr    %r11,%f2
        lgdr    %r15,%f0
        br      %r14


Even if a global register is being clobbered in a function we usually do not
save and restore it. However, we still have to do this if it is a special
register. Most of the places in the backend handle this correctly but not the
prologue/epilogue optimization.
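
The rule in the paragraph above can be written down as a small
predicate.  The following C sketch is an assumption-laden illustration,
not the backend code: the function name, the array, and the choice to
treat only r15 as special are hypothetical simplifications (the real
backend may consider further registers special).

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_GPRS 16
#define STACK_POINTER_REGNO 15  /* r15, the register from the report */

/* Return true if REGNO is a user-declared global register for which
   save/restore may be skipped.  Special registers are excluded, so the
   stack pointer is always saved and restored even when it is global.  */
static bool
may_skip_save_restore (const bool *is_global, int regno)
{
  assert (regno >= 0 && regno < NUM_GPRS);
  return is_global[regno] && regno != STACK_POINTER_REGNO;
}
```

Applying the same check in every place that decides about save and
restore slots keeps the prologue and epilogue consistent, which is the
property that was violated in the assembly shown above.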

A fix is being tested and will be available soon.

[Bug target/89203] Linux S/390: Unable to build GCC 8.2.0 on Red Hat Enterprise Linux Server release 6.9

2019-02-27 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89203

Andreas Krebbel  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2019-02-27
     Ever confirmed|0                           |1

--- Comment #3 from Andreas Krebbel  ---
I can confirm that this is caused by PR89361 as Jakub mentioned. With the fix
libgomp configures successfully with GCC 8 branch.

Would it be possible for you to just use GCC 8 from DTS?

Jakub: Could you please apply the patch also to GCC 8 branch?

[Bug target/88856] [8/9 Regression] gfortran producing wrong code with -funroll-loops

2019-02-05 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88856

--- Comment #21 from Andreas Krebbel  ---
(In reply to Jakub Jelinek from comment #17)
> (In reply to Andreas Krebbel from comment #16)
> > I'll commit a patch which just removes the splitter for now. I'll try to
> > come up with a nicer testcase.
> 
> All 3 s390 splitters that do this?

I've only removed the load and test splitter for now. The other two are only
used for access register setters. There is only that one user in Glibc and we
have it that way since the very beginning. I will revisit these for GCC >9 but
would rather leave them in for now.

[Bug target/88856] [8/9 Regression] gfortran producing wrong code with -funroll-loops

2019-02-05 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88856

--- Comment #20 from Andreas Krebbel  ---
Author: krebbel
Date: Tue Feb  5 17:19:26 2019
New Revision: 268552

URL: https://gcc.gnu.org/viewcvs?rev=268552&root=gcc&view=rev
Log:
S/390: Remove load and test fp splitter

gcc/ChangeLog:

2019-02-05  Andreas Krebbel  

Backport from mainline
2019-02-05  Andreas Krebbel  

PR target/88856
* config/s390/s390.md: Remove load and test FP splitter.


Modified:
branches/gcc-7-branch/gcc/ChangeLog
branches/gcc-7-branch/gcc/config/s390/s390.md

[Bug target/88856] [8/9 Regression] gfortran producing wrong code with -funroll-loops

2019-02-05 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88856

--- Comment #19 from Andreas Krebbel  ---
Author: krebbel
Date: Tue Feb  5 17:17:00 2019
New Revision: 268551

URL: https://gcc.gnu.org/viewcvs?rev=268551&root=gcc&view=rev
Log:
S/390: Remove load and test fp splitter

gcc/ChangeLog:

2019-02-05  Andreas Krebbel  

Backport from mainline
2019-02-05  Andreas Krebbel  

PR target/88856
* config/s390/s390.md: Remove load and test FP splitter.


Modified:
branches/gcc-8-branch/gcc/ChangeLog
branches/gcc-8-branch/gcc/config/s390/s390.md

[Bug target/88856] [8/9 Regression] gfortran producing wrong code with -funroll-loops

2019-02-05 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88856

--- Comment #18 from Andreas Krebbel  ---
Author: krebbel
Date: Tue Feb  5 17:14:11 2019
New Revision: 268550

URL: https://gcc.gnu.org/viewcvs?rev=268550&root=gcc&view=rev
Log:
S/390: Remove load and test fp splitter

gcc/ChangeLog:

2019-02-05  Andreas Krebbel  

PR target/88856
* config/s390/s390.md: Remove load and test FP splitter.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/s390/s390.md

[Bug target/88856] [8/9 Regression] gfortran producing wrong code with -funroll-loops

2019-02-05 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88856

--- Comment #16 from Andreas Krebbel  ---
I'll commit a patch which just removes the splitter for now. I'll try to come
up with a nicer testcase.

[Bug target/88856] [8/9 Regression] gfortran producing wrong code with -funroll-loops

2019-02-04 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88856

--- Comment #14 from Andreas Krebbel  ---
(In reply to Jakub Jelinek from comment #11)
> ... Can't what you are doing in the splitters be done in
> define_peephole2 instead?

Not that easy unfortunately.  peephole2 will run after reload. So the FP
constant 0.0 will already be reloaded into a register first or pushed into the
literal pool. The point of doing the transformation is to avoid this.

[Bug target/88856] [8/9 Regression] gfortran producing wrong code with -funroll-loops

2019-02-01 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88856

--- Comment #9 from Andreas Krebbel  ---
Created attachment 45588
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45588&action=edit
experimental patch

That patch appears to fix the problem for me.

[Bug target/88856] [8/9 Regression] gfortran producing wrong code with -funroll-loops

2019-02-01 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88856

--- Comment #8 from Andreas Krebbel  ---
The r265193 patch was found via reghunt. However, it just reveals an underlying
issue.

The problem can also be seen with mainline.

The miscompile happens in the following loop:
      do 110 j = 1, n
         if (sdiag(j) .eq. zero .and. nsing .eq. n) nsing = j - 1
         if (nsing .lt. n) wa(j) = zero
  110    continue

The problem appears to be rather related to ifcvt. ifcvt generates a load on
condition for the sdiag(j) .eq. zero comparison by inserting insns: 2480, 2481,
2482:

265.ce2

(insn 915 918 916 88 (set (reg:DF 590 [ MEM[base: sdiag_143(D), index:
ivtmp.67_240, offset: 0B] ])
(mem:DF (reg/v/f:DI 239 [ sdiag ]) [2 MEM[base: sdiag_143(D), index:
ivtmp.67_240, offset: 0B]+0 S8 A64])) "min.qrsolv.f":51 1289 {*movdf_64dfp}
 (nil))
(insn 916 915 2480 88 (set (reg:CCZ 33 %cc)
(compare:CCZ (reg:DF 590 [ MEM[base: sdiag_143(D), index: ivtmp.67_240,
offset: 0B] ])
(const_double:DF 0.0 [0x0.0p+0]))) "min.qrsolv.f":51 1255
{*cmpdf_ccs}
 (expr_list:REG_DEAD (reg:DF 590 [ MEM[base: sdiag_143(D), index:
ivtmp.67_240, offset: 0B] ])
(nil)))
(insn 2480 916 2481 88 (set (reg:SI 733)
(const_int 0 [0])) 1274 {*movsi_zarch}
 (nil))
(insn 2481 2480 2482 88 (set (reg:CCZ 33 %cc)
(compare:CCZ (reg:DF 590 [ MEM[base: sdiag_143(D), index: ivtmp.67_240,
offset: 0B] ])
(const_double:DF 0.0 [0x0.0p+0]))) 1255 {*cmpdf_ccs}
 (nil))
(insn 2482 2481 927 88 (set (reg/v:SI 109 [ nsing ])
(if_then_else:SI (ne (reg:CCZ 33 %cc)
(const_int 0 [0]))
(reg/v:SI 109 [ nsing ])
(reg:SI 733))) 1676 {*movsicc}
 (nil))
(note 927 2482 928 88 NOTE_INSN_DELETED)
(jump_insn 928 927 932 88 (parallel [
(set (pc)
(if_then_else (le (reg:SI 320 [ _444 ])
(reg/v:SI 109 [ nsing ]))
(label_ref:DI 943)
(pc)))
(clobber (reg:CC 33 %cc))
]) "min.qrsolv.f":52 1260 {*cmp_and_br_signed_si}
 (expr_list:REG_UNUSED (reg:CC 33 %cc)
(int_list:REG_BR_PROB 536870916 (nil)))
 -> 943)

In the backend we have that interesting splitter which triggers for the old and
now obsolete compare in insn 916

(define_split
  [(set (match_operand 0 "cc_reg_operand")
(compare (match_operand:FP 1 "register_operand")
 (match_operand:FP 2 "const0_operand")))]
  "TARGET_HARD_FLOAT && REG_P (operands[1]) && dead_or_set_p (insn,
operands[1])"
  [(parallel
[(set (match_dup 0) (match_dup 3))
 (clobber (match_dup 1))])]
 {
   /* s390_match_ccmode requires the compare to have the same CC mode
  as the CC destination register.  */
   operands[3] = gen_rtx_COMPARE (GET_MODE (operands[0]),
  operands[1], operands[2]);
 })

268.split1   insn 916 -> insn 2677
The REG_DEAD note becomes a clobber due to that

(insn 915 918 2677 105 (set (reg:DF 590 [ MEM[base: sdiag_143(D), index:
ivtmp.67_240, offset: 0B] ])
(mem:DF (reg/v/f:DI 239 [ sdiag ]) [2 MEM[base: sdiag_143(D), index:
ivtmp.67_240, offset: 0B]+0 S8 A64])) "min.qrsolv.f":51 1289 {*movdf_64dfp}
 (nil))
(insn 2677 915 2480 105 (parallel [
(set (reg:CCZ 33 %cc)
(compare:CCZ (reg:DF 590 [ MEM[base: sdiag_143(D), index:
ivtmp.67_240, offset: 0B] ])
(const_double:DF 0.0 [0x0.0p+0])))
(clobber (reg:DF 590 [ MEM[base: sdiag_143(D), index: ivtmp.67_240,
offset: 0B] ]))
]) "min.qrsolv.f":51 -1
 (nil))
(insn 2480 2677 2481 105 (set (reg:SI 733)
(const_int 0 [0])) 1274 {*movsi_zarch}
 (nil))
(insn 2481 2480 2482 105 (set (reg:CCZ 33 %cc)
(compare:CCZ (reg:DF 590 [ MEM[base: sdiag_143(D), index: ivtmp.67_240,
offset: 0B] ])
(const_double:DF 0.0 [0x0.0p+0]))) 1255 {*cmpdf_ccs}
 (nil))

294.cprop_hardreg appears to mess up things: REG_DEAD note in insn 2677 does
not appear to fit the used reg but CC is unused now

(insn 3107 927 915 121 (set (reg:DI 3 %r3 [1060])
(mem/f/c:DI (plus:DI (reg/f:DI 15 %r15)
(const_int 520 [0x208])) [3 sdiag+0 S8 A64])) "min.qrsolv.f":51
1270 {*movdi_64}
 (nil))
(insn 915 3107 2677 121 (set (reg:DF 19 %f6 [orig:590 MEM[base: sdiag_143(D),
index: ivtmp.67_240, offset: 0B] ] [590])
(mem:DF (reg:DI 3 %r3 [1060]) [2 MEM[base: sdiag_143(D), index:
ivtmp.67_240, offset: 0B]+0 S8 A64])) "min.qrsolv.f":51 1289 {*movdf_64dfp}
 (expr_list:REG_DEAD (reg:DI 3 %r3 [1060])
(nil)))
(insn 2677 915 3108 121 (parallel [
(set (reg:CCZ 33 %cc)
(compare:CCZ (reg:DF 19 %f6 [orig:590 MEM[base: sdiag_143(D),
index: ivtmp.67_240, offset: 0B] ] [590])
(const_double:DF 0.0 [0x0.0p+0])))
(clobber (reg:DF 19 %f6 [orig:590 MEM[base: sdiag_143(D), index:
ivtmp.67_240, offset: 0B] ] [590]))
]) "min.qrsolv.f":51 

[Bug target/88856] [8/9 Regression] gfortran producing wrong code with -funroll-loops

2019-02-01 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88856

--- Comment #7 from Andreas Krebbel  ---
gfortran -O3 -march=zEC12 -funroll-loops -fpie qrsolv-reduc.f -c
gcc qrsolv-caller.c -c
gcc qrsolv-caller.o qrsolv-reduc.o -o t

r265191
./t
1.359429

r265193
./t
0.00

[Bug target/88856] [8/9 Regression] gfortran producing wrong code with -funroll-loops

2019-02-01 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88856

--- Comment #6 from Andreas Krebbel  ---
Created attachment 45587
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45587&action=edit
A C wrapper to call the qrsolv function in the fortran snippet

[Bug target/88856] [8/9 Regression] gfortran producing wrong code with -funroll-loops

2019-02-01 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88856

--- Comment #5 from Andreas Krebbel  ---
Created attachment 45586
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45586&action=edit
qrsolv-reduc.f   the miscompiled fortran file autoreduced from scipy

[Bug target/88856] [8/9 Regression] gfortran producing wrong code with -funroll-loops

2019-01-28 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88856

Andreas Krebbel  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |krebbel at gcc dot gnu.org

--- Comment #4 from Andreas Krebbel  ---
I'm able to reproduce the problem now and will try to have a look.

[Bug rtl-optimization/88953] Unrecognizable insn on architecture zEC12 with boost::bimap

2019-01-22 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88953

--- Comment #4 from Andreas Krebbel  ---
Looks like a problem which was fixed with r265158:

S/390: Fix problem with vec_init expander

gcc/ChangeLog:

2018-10-15  Andreas Krebbel  

* config/s390/s390.c (s390_expand_vec_init): Force vector element
into reg if it isn't a general operand.

gcc/testsuite/ChangeLog:

2018-10-15  Andreas Krebbel  

* g++.dg/vec-init-1.C: New test.



I've backported the patch to GCC 7 and 8 branch on 2018-10-19. Canonical is
aware of the problem and will pick the patch up for their next GCC updates.

Could you please check whether this fixes your problem?

[Bug target/88856] [8/9 Regression] gfortran producing wrong code with -funroll-loops

2019-01-17 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88856

--- Comment #3 from Andreas Krebbel  ---
I've tried building scipy 1.1.0 from github on a Fedora installation. The build
already uses -funroll-loops. But I couldn't reproduce the problem with the
resulting binary.

gcc version 8.0.1 20180324

Aurelien already tracked it down to a miscompilation of
scipy/optimize/minpack/qrsolv.f

This source file appears to contain just a single function (qrsolv) which is
not too big. I think I can work with that after being able to reproduce the
problem.

As Jakub mentioned the exact compiler cmdline would be good.

[Bug rtl-optimization/88751] Performance regression reload vs lra

2019-01-09 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88751

--- Comment #2 from Andreas Krebbel  ---
(In reply to Richard Biener from comment #1)
...
> Would be interesting to know the sparseness of regs / BBs for your testcase
> at the point of LRA and whether compacting regs (do we ever do that?) might
> be a good idea in general.  (we do compact BBs regularly)

Good point. Only 9352 of the 27089 pseudos appear to be actually referenced.
Hence the following patch fixes the problem for me:

diff --git a/gcc/ira.c b/gcc/ira.c
index c8f2df43dd1..965819e1ef9 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -5157,6 +5157,7 @@ ira (FILE *f)
   int ira_max_point_before_emit;
   bool saved_flag_caller_saves = flag_caller_saves;
   enum ira_region saved_flag_ira_region = flag_ira_region;
+  int i, num_used_regs = 0;

   clear_bb_flags ();

@@ -5172,12 +5173,17 @@ ira (FILE *f)

   ira_conflicts_p = optimize > 0;

+  /* Determine the number of pseudos actually requiring coloring.  */
+  for (i = FIRST_PSEUDO_REGISTER; i < max_reg_num (); i++)
+num_used_regs += !!(DF_REG_USE_COUNT (i) + DF_REG_DEF_COUNT (i));
+
   /* If there are too many pseudos and/or basic blocks (e.g. 10K
  pseudos and 10K blocks or 100K pseudos and 1K blocks), we will
  use simplified and faster algorithms in LRA.  */
   lra_simple_p
 = (ira_use_lra_p
-   && max_reg_num () >= (1 << 26) / last_basic_block_for_fn (cfun));
+   && num_used_regs >= (1 << 26) / last_basic_block_for_fn (cfun));
+
   if (lra_simple_p)
 {
   /* It permits to skip live range splitting in LRA.  */
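
The counting idiom used in the patch can be tried out in isolation.  The
sketch below is stand-alone C under stated assumptions: the uses and
defs arrays are hypothetical stand-ins for DF_REG_USE_COUNT and
DF_REG_DEF_COUNT, and count_referenced is not a GCC function.

```c
#include <assert.h>
#include <stddef.h>

/* Count the entries that have at least one use or one definition.
   USES and DEFS are hypothetical per-register counters.  */
static int
count_referenced (const int *uses, const int *defs, size_t n)
{
  assert (uses != NULL && defs != NULL);
  int count = 0;
  for (size_t i = 0; i < n; i++)
    /* !! collapses any nonzero sum to 1, so each referenced register
       contributes exactly one to the total.  */
    count += !!(uses[i] + defs[i]);
  return count;
}
```

Registers whose use and definition counts are both zero contribute
nothing, which is how the patch arrives at a count that is smaller than
max_reg_num () when many pseudos are no longer referenced.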

[Bug rtl-optimization/88751] New: Performance regression reload vs lra

2019-01-08 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88751

            Bug ID: 88751
           Summary: Performance regression reload vs lra
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: krebbel at gcc dot gnu.org
  Target Milestone: ---

There is a big performance drop in OpenJ9 after they have updated from GCC
4.8.5 to GCC 7.3.0.

- The performance regression disappears after compiling the byte code
interpreter loop with -mno-lra.
https://github.com/eclipse/openj9/blob/master/runtime/vm/BytecodeInterpreter.hpp

- The problem comes from the frequently accessed _pc and _sp variables being
assigned to stack slots instead of registers. With GCC 4.8 both variables end
up in hard regs.

- The problem can be seen on x86 as well as on S/390.

- In LRA the root cause of the problem is a threshold which prevents LRA from
running the full register coloring step (ira.c):

   /* If there are too many pseudos and/or basic blocks (e.g. 10K
  pseudos and 10K blocks or 100K pseudos and 1K blocks), we will
  use simplified and faster algorithms in LRA.  */
  lra_simple_p = (ira_use_lra_p && max_reg_num () >= (1 << 26) /
  last_basic_block_for_fn (cfun));

  For the huge run() function in the byte code interpreter the numbers are:

  (gdb) p max_reg_num()
  $6 = 27089
  (gdb) p last_basic_block_for_fn(cfun)
  $7 = 4799

  Forcing GCC to run the full coloring pass makes the _pc and _sp variables
get hard regs assigned again.
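
To see how the reported numbers interact with the threshold, the
condition can be evaluated outside the compiler.  This is a minimal
sketch: uses_simple_lra is a hypothetical helper, and it assumes
ira_use_lra_p is true so that only the arithmetic part of the quoted
condition remains.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the arithmetic of the quoted condition: the simplified
   algorithms are selected once the register count reaches
   (1 << 26) divided by the number of basic blocks.  */
static bool
uses_simple_lra (int num_regs, int num_basic_blocks)
{
  assert (num_basic_blocks > 0);
  return num_regs >= (1 << 26) / num_basic_blocks;
}
```

For 4799 basic blocks the integer division (1 << 26) / 4799 yields
13983, so a function with 27089 registers is above the limit and
uses_simple_lra (27089, 4799) returns true.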


As a quick workaround we might want to turn this threshold into a parameter.

Long-term it would be good if we could either enable the heuristic to estimate
whether full coloring would be beneficial or improve the fallback coloring to
cover such important cases.

[Bug middle-end/88246] Abort signal terminated program collect2 - munmap_chunk(): invalid pointer

2018-12-03 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88246

Andreas Krebbel  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED

--- Comment #11 from Andreas Krebbel  ---
Bootstrap works again.

[Bug middle-end/88246] Abort signal terminated program collect2 - munmap_chunk(): invalid pointer

2018-11-28 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88246

Andreas Krebbel  changed:

   What|Removed |Added

 Target||s390x-redhat-linux
   Priority|P3  |P1
 CC||marxin at gcc dot gnu.org
   Host||s390x-redhat-linux
  Build||s390x-redhat-linux

[Bug middle-end/88246] New: Abort signal terminated program collect2 - munmap_chunk(): invalid pointer

2018-11-28 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88246

Bug ID: 88246
   Summary: Abort signal terminated program collect2 -
munmap_chunk(): invalid pointer
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: critical
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: krebbel at gcc dot gnu.org
  Target Milestone: ---

GCC build fails on s390x since r266508 with:

/home/andreas/bisect/gcc-266508-build/./gcc/xgcc
-B/home/andreas/bisect/gcc-266508-build/./gcc/
-B/home/andreas/bisect/gcc-266508-install/s390x-ibm-linux-gnu/bin/
-B/home/andreas/bisect/gcc-266508-install/s390x-ibm-linux-gnu/lib/ -isystem
/home/andreas/bisect/gcc-266508-install/s390x-ibm-linux-gnu/include -isystem
/home/andreas/bisect/gcc-266508-install/s390x-ibm-linux-gnu/sys-include  
-fno-checking -O2  -g -O2 -DIN_GCC -W -Wall -Wno-narrowing -Wwrite-strings
-Wcast-qual -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition 
-isystem ./include   -fPIC -mlong-double-128 -g -DIN_LIBGCC2 -fbuilding-libgcc
-fno-stack-protector  -shared -nodefaultlibs -Wl,--soname=libgcc_s.so.1
-Wl,--version-script=libgcc.map -o 32/libgcc_s.so.1.tmp -g -O2 -m31 -B./
_muldi3_s.o _negdi2_s.o _lshrdi3_s.o _ashldi3_s.o _ashrdi3_s.o _cmpdi2_s.o
_ucmpdi2_s.o _clear_cache_s.o _trampoline_s.o __main_s.o _absvsi2_s.o
_absvdi2_s.o _addvsi3_s.o _addvdi3_s.o _subvsi3_s.o _subvdi3_s.o _mulvsi3_s.o
_mulvdi3_s.o _negvsi2_s.o _negvdi2_s.o _ctors_s.o _ffssi2_s.o _ffsdi2_s.o
_clz_s.o _clzsi2_s.o _clzdi2_s.o _ctzsi2_s.o _ctzdi2_s.o _popcount_tab_s.o
_popcountsi2_s.o _popcountdi2_s.o _paritysi2_s.o _paritydi2_s.o _powisf2_s.o
_powidf2_s.o _powixf2_s.o _powitf2_s.o _mulhc3_s.o _mulsc3_s.o _muldc3_s.o
_mulxc3_s.o _multc3_s.o _divhc3_s.o _divsc3_s.o _divdc3_s.o _divxc3_s.o
_divtc3_s.o _bswapsi2_s.o _bswapdi2_s.o _clrsbsi2_s.o _clrsbdi2_s.o
_fixunssfsi_s.o _fixunsdfsi_s.o _fixunsxfsi_s.o _fixxfdi_s.o _fixunsxfdi_s.o
_floatdisf_s.o _floatdidf_s.o _floatdixf_s.o _floatditf_s.o _floatundisf_s.o
_floatundidf_s.o _floatundixf_s.o _floatunditf_s.o _divdi3_s.o _moddi3_s.o
_divmoddi4_s.o _udivdi3_s.o _umoddi3_s.o _udivmoddi4_s.o _udiv_w_sdiv_s.o
_fixsfdi_s.o _fixdfdi_s.o _fixtfdi_s.o _fixunssfdi_s.o _fixunsdfdi_s.o
_fixunstfdi_s.o enable-execute-stack_s.o unwind-dw2_s.o unwind-dw2-fde-dip_s.o
unwind-sjlj_s.o unwind-c_s.o emutls_s.o libgcc.a -lc && rm -f 32/libgcc_s.so &&
if [ -f 32/libgcc_s.so.1 ]; then mv -f 32/libgcc_s.so.1
32/libgcc_s.so.1.backup; else true; fi && mv 32/libgcc_s.so.1.tmp
32/libgcc_s.so.1 && ln -s libgcc_s.so.1 32/libgcc_s.so
munmap_chunk(): invalid pointer
munmap_chunk(): invalid pointer
xgcc: internal compiler error: Aborted signal terminated program collect2
Please submit a full bug report,
with preprocessed source if appropriate.
See <https://gcc.gnu.org/bugs/> for instructions.
make[5]: *** [Makefile:992: libgcc_s.so] Error 4
make[5]: Leaving directory
'/home/andreas/bisect/gcc-266508-build/s390x-ibm-linux-gnu/32/libgcc'
make[4]: *** [Makefile:1210: multi-do] Error 1
make[4]: Leaving directory
'/home/andreas/bisect/gcc-266508-build/s390x-ibm-linux-gnu/libgcc'
make[3]: *** [Makefile:127: all-multi] Error 2
make[3]: *** Waiting for unfinished jobs
xgcc: internal compiler error: Aborted signal terminated program collect2
Please submit a full bug report,
with preprocessed source if appropriate.
See <https://gcc.gnu.org/bugs/> for instructions.
make[3]: *** [Makefile:992: libgcc_s.so] Error 4
make[3]: Leaving directory
'/home/andreas/bisect/gcc-266508-build/s390x-ibm-linux-gnu/libgcc'
make[2]: *** [Makefile:17482: all-stage2-target-libgcc] Error 2
make[2]: Leaving directory '/home/andreas/bisect/gcc-266508-build'
make[1]: *** [Makefile:21818: stage2-bubble] Error 2
make[1]: Leaving directory '/home/andreas/bisect/gcc-266508-build'
make: *** [Makefile:978: all] Error 2

[Bug middle-end/88085] User alignments on var decls not respected if smaller than type alignment

2018-11-19 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88085

--- Comment #2 from Andreas Krebbel  ---
(In reply to Richard Biener from comment #1)
...
> which is bogus again unless the caller already had pre-existing attrs
> on the MEM.  I guess using
> 
>   attrs.align = refattrs ? MAX (refattrs.align, obj_align) : obj_align;
> 
> fixes your issue?  Or are you objectp == true?  I think an INDIRECT_REF never
> happens today.

Yes, it does. objectp is true as well. t is a var decl.

[Bug middle-end/88085] New: User alignments on var decls not respected if smaller than type alignment

2018-11-19 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88085

Bug ID: 88085
   Summary: User alignments on var decls not respected if smaller
than type alignment
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: krebbel at gcc dot gnu.org
  Target Milestone: ---

typedef int __attribute__((vector_size(16))) v4si; 

v4si a4 __attribute__((aligned(4))); 

void 
foo (v4si a) 
{ 
 a4 += a; 
}

-O3 -dP

MEM_ALIGN for variable a4 is 128 bits although the variable is emitted with an
alignment requirement of just 4 bytes: .comm a4,16,4

It works when moving the aligned attribute over to the typedef instead.

foo:
#(insn:TI 6 3 7 2 (set (reg:V4SI 20 xmm0 [85])
#(plus:V4SI (reg:V4SI 20 xmm0 [86])
#(mem/c:V4SI (symbol_ref:DI ("a4") [flags 0x2] ) [1 a4+0 S16 A128]))) "t.c":8:6 3122 {*addv4si3}
# (expr_list:REG_EQUIV (mem/c:V4SI (symbol_ref:DI ("a4") [flags 0x2]
) [1 a4+0 S16 A128])
#(nil)))
paddd   a4(%rip), %xmm0 # 6 [c=12 l=8]  *addv4si3/0
#(insn:TI 7 6 18 2 (set (mem/c:V4SI (symbol_ref:DI ("a4") [flags 0x2] ) [1 a4+0 S16 A128])
#(reg:V4SI 20 xmm0 [85])) "t.c":8:6 1198 {movv4si_internal}
# (expr_list:REG_DEAD (reg:V4SI 20 xmm0 [85])
#(nil)))
movaps  %xmm0, a4(%rip) # 7 [c=4 l=7]  movv4si_internal/3
#(jump_insn:TI 14 18 15 2 (simple_return) "t.c":9:1 688
{simple_return_internal}
# (nil)
# -> simple_return)
ret # 14[c=0 l=1]  simple_return_internal
.size   foo, .-foo
.comm   a4,16,4
.ident  "GCC: (GNU) 9.0.0 20181114 (experimental)"



set_mem_attributes_minus_bitpos in emit-rtl.c:

This code sets the alignment field based on the type alignment:

      /* ??? If T is a type, respecting mode alignment may *also* be wrong
         e.g. if the type carries an alignment attribute.  Should we be
         able to simply always use TYPE_ALIGN?  */
    }

  /* We can set the alignment from the type if we are making an object or if
     this is an INDIRECT_REF.  */
  if (objectp || TREE_CODE (t) == INDIRECT_REF)
    attrs.align = MAX (attrs.align, TYPE_ALIGN (type));


The code further down would pick up the alignment from the aligned attribute
(in obj_align) but, because it takes the MAX, ignores it when it is lower than
the alignment already recorded?! As expected, setting bigger alignments on the
var decl works.

      /* If this is an indirect reference, record it.  */
      else if (TREE_CODE (t) == MEM_REF
               || TREE_CODE (t) == TARGET_MEM_REF)
        {
          attrs.expr = t;
          attrs.offset_known_p = true;
          attrs.offset = 0;
          apply_bitpos = bitpos;
        }

      /* Compute the alignment.  */
      unsigned int obj_align;
      unsigned HOST_WIDE_INT obj_bitpos;
      get_object_alignment_1 (t, &obj_align, &obj_bitpos);
      unsigned int diff_align = known_alignment (obj_bitpos - bitpos);
      if (diff_align != 0)
        obj_align = MIN (obj_align, diff_align);
      attrs.align = MAX (attrs.align, obj_align);
    }
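
The effect of the two MAX steps quoted above can be modelled outside of GCC.
This is a simplified sketch with hypothetical function names; alignments are in
bits as in MEM_ALIGN, and it assumes get_object_alignment_1 reports 32 bits for
a4. The second function follows the expression suggested in comment #1 of this
PR.

```c
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Model of the current behaviour: the type alignment is applied
   first, afterwards the object alignment can only raise it.  */
unsigned int
mem_align_current (unsigned int type_align, unsigned int obj_align)
{
  unsigned int align = 0;
  align = MAX (align, type_align);  /* objectp: use TYPE_ALIGN.  */
  align = MAX (align, obj_align);   /* Alignment of the decl.  */
  return align;
}

/* Model of the expression suggested in comment #1: without
   pre-existing attrs on the MEM the object alignment is used as is.  */
unsigned int
mem_align_suggested (int have_refattrs, unsigned int ref_align,
                     unsigned int obj_align)
{
  return have_refattrs ? MAX (ref_align, obj_align) : obj_align;
}
```

For a4 (type alignment 128, object alignment 32) the first function yields 128,
matching the A128 in the dump, while the second yields 32.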

[Bug tree-optimization/88044] [9 regression] gfortran.dg/transfer_intrinsic_3.f90 hangs after r266171

2018-11-16 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88044

Andreas Krebbel  changed:

   What|Removed |Added

 Target|powerpc64*-*-*  |powerpc64*-*-*, s390x-*-*
 CC||krebbel at gcc dot gnu.org
   Host|powerpc64*-*-*  |powerpc64*-*-*, s390x-*-*
  Build|powerpc64*-*-*  |powerpc64*-*-*, s390x-*-*

--- Comment #2 from Andreas Krebbel  ---
The testcase hangs also on S/390.

[Bug target/87723] [9 Regression] ICE: output_operand: invalid %-code on s390x

2018-11-06 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87723

--- Comment #3 from Andreas Krebbel  ---
Author: krebbel
Date: Tue Nov  6 10:22:05 2018
New Revision: 265832

URL: https://gcc.gnu.org/viewcvs?rev=265832&root=gcc&view=rev
Log:
S/390: Fix PR87723

gcc/ChangeLog:

2018-11-06  Andreas Krebbel  

PR target/87723
* config/s390/s390.md ("*rsbg_di_rotl"): Remove mode
attributes for operands 3 and 4.

gcc/testsuite/ChangeLog:

2018-11-06  Andreas Krebbel  

PR target/87723
* gcc.target/s390/pr87723.c: New test.


Added:
trunk/gcc/testsuite/gcc.target/s390/pr87723.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/s390/s390.md
trunk/gcc/testsuite/ChangeLog

[Bug target/87762] [9 Regression] extract_constrain_insn, at recog.c:2206 on s390x

2018-10-26 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87762

Andreas Krebbel  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1

[Bug target/87762] [9 Regression] extract_constrain_insn, at recog.c:2206 on s390x

2018-10-26 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87762

Andreas Krebbel  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |iii at gcc dot gnu.org

--- Comment #1 from Andreas Krebbel  ---
Caused by r265490. Ilya please have a look.

[Bug target/87723] [9 Regression] ICE: output_operand: invalid %-code on s390x

2018-10-25 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87723

--- Comment #2 from Andreas Krebbel  ---
(In reply to Andreas Krebbel from comment #1)
> Created attachment 44898 [details]
> Patch
> 
> The "*rsbg_di_rotl" output string uses mode attributes with actually
> using a mode iterator.

s/with/without/

[Bug target/87723] [9 Regression] ICE: output_operand: invalid %-code on s390x

2018-10-25 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87723

--- Comment #1 from Andreas Krebbel  ---
Created attachment 44898
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44898&action=edit
Patch

The "*rsbg_di_rotl" output string uses mode attributes with actually
using a mode iterator.

[Bug target/87723] [9 Regression] ICE: output_operand: invalid %-code on s390x

2018-10-25 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87723

Andreas Krebbel  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed|2018-10-24 00:00:00 |2018-10-25
   Assignee|unassigned at gcc dot gnu.org  |krebbel at gcc dot gnu.org
 Ever confirmed|0   |1

[Bug target/86804] s390 port needs updating for CVE-2017-5753

2018-10-16 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86804

Andreas Krebbel  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #1 from Andreas Krebbel  ---
Fixed with:

Author: krebbel
Date: Thu Sep 27 08:03:42 2018
New Revision: 264663

URL: https://gcc.gnu.org/viewcvs?rev=264663&root=gcc&view=rev
Log:
S/390: Implement speculation barrier

gcc/ChangeLog:

2018-09-27  Andreas Krebbel  

* config/s390/s390.md (PPA_TX_ABORT, PPA_OOO_BARRIER): New
constant definitions.
("tx_assist"): Replace magic number with PPA_TX_ABORT.
("*ppa"): Enable pattern also for -march=zEC12 -mno-htm.
("speculation_barrier"): New expander definition.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/s390/s390.md

[Bug target/86772] [meta-bug] tracking port status for CVE-2017-5753

2018-10-16 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86772
Bug 86772 depends on bug 86804, which changed state.

Bug 86804 Summary: s390 port needs updating for CVE-2017-5753
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86804

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

[Bug target/80080] S390: Isses with emitted cs-instructions for __atomic builtins.

2018-09-06 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80080

--- Comment #14 from Andreas Krebbel  ---
Author: krebbel
Date: Thu Sep  6 07:38:42 2018
New Revision: 264143

URL: https://gcc.gnu.org/viewcvs?rev=264143&root=gcc&view=rev
Log:
S/390: Prohibit SYMBOL_REF in UNSPECV_CAS

Inhibit constant propagation inlining SYMBOL_REF loads into
UNSPECV_CAS.  Even though reload can later undo it, the resulting
code will be less efficient.

gcc/ChangeLog:

2018-09-06  Ilya Leoshkevich  

PR target/80080
* config/s390/predicates.md: Add nonsym_memory_operand.
* config/s390/s390.c (s390_legitimize_cs_operand): If operand
contains a SYMBOL_REF, load it into an intermediate pseudo.
(s390_emit_compare_and_swap): Legitimize operand.
* config/s390/s390.md: Use the new nonsym_memory_operand
with UNSPECV_CAS patterns.

gcc/testsuite/ChangeLog:

2018-09-06  Ilya Leoshkevich  

PR target/80080
* gcc.target/s390/pr80080-3.c: New test.
* gcc.target/s390/s390.exp: Make sure the new test passes
on all optimization levels.



Added:
trunk/gcc/testsuite/gcc.target/s390/pr80080-3.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/s390/predicates.md
trunk/gcc/config/s390/s390.c
trunk/gcc/config/s390/s390.md
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/s390/s390.exp

[Bug target/80080] S390: Isses with emitted cs-instructions for __atomic builtins.

2018-09-06 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80080

--- Comment #13 from Andreas Krebbel  ---
Author: krebbel
Date: Thu Sep  6 07:35:35 2018
New Revision: 264142

URL: https://gcc.gnu.org/viewcvs?rev=264142&root=gcc&view=rev
Log:
S/390: Register pass_s390_early_mach statically

The dump file used to come at the end of the sorted dump file list,
because the pass was registered dynamically. This did not reflect the
order in which passes are executed. Static registration fixes this:

* foo4.c.277r.split2
* foo4.c.281r.early_mach
* foo4.c.282r.pro_and_epilogue

gcc/ChangeLog:

2018-09-06  Ilya Leoshkevich  

PR target/80080
* config/s390/s390-passes.def: New file.
* config/s390/s390-protos.h (class rtl_opt_pass): Add forward
declaration.
(make_pass_s390_early_mach): Add declaration.
* config/s390/s390.c (make_pass_s390_early_mach):
(s390_option_override): Remove dynamic registration.
* config/s390/t-s390: Add s390-passes.def.



Added:
trunk/gcc/config/s390/s390-passes.def
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/s390/s390-protos.h
trunk/gcc/config/s390/s390.c
trunk/gcc/config/s390/t-s390

[Bug target/84332] ICE in insn_default_length, at config/s390/s390.md:9697 for -fstack-clash-protection

2018-08-09 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84332

Andreas Krebbel  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Andreas Krebbel  ---
Fixed with commit from comment 3

[Bug target/84332] ICE in insn_default_length, at config/s390/s390.md:9697 for -fstack-clash-protection

2018-08-09 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84332

--- Comment #3 from Andreas Krebbel  ---
Author: krebbel
Date: Thu Aug  9 07:06:23 2018
New Revision: 263441

URL: https://gcc.gnu.org/viewcvs?rev=263441&root=gcc&view=rev
Log:
S/390: Fix PR84332 ICE with stack clash protection

Our implementation of the stack probe requires the probe interval to
be used as displacement in an address operand.  The maximum probe
interval currently is 64k.  This would exceed short displacements.
Trim that value down to 4k if that happens.  This might result in too
many probes being generated only on the oldest supported machine level
z900.

gcc/ChangeLog:

2018-08-09  Andreas Krebbel  

PR target/84332
* config/s390/s390.c (s390_option_override_internal): Reduce the
stack-clash-protection-probe-interval param if it would be too big
for z900.

gcc/testsuite/ChangeLog:

2018-08-09  Andreas Krebbel  

PR target/84332
* gcc.target/s390/pr84332.c: New testcase.


Added:
trunk/gcc/testsuite/gcc.target/s390/pr84332.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/s390/s390.c
trunk/gcc/testsuite/ChangeLog
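
The trimming described in the log message above can be sketched as follows.
This is not the code from s390.c: the function name is hypothetical, and the
sketch assumes that the probe interval parameter is given as a power of two in
bytes (16 for 64k, 12 for 4k) and that z900 is the only supported machine level
without long displacements.

```c
#include <stdbool.h>

/* Hypothetical model of the adjustment: INTERVAL_LOG2 is the probe
   interval expressed as a power of two in bytes.  Without long
   displacements the interval is trimmed down to 4k (2^12).  */
int
clamp_probe_interval_log2 (int interval_log2, bool has_long_displacement)
{
  const int short_disp_log2 = 12;  /* 4k */

  if (!has_long_displacement && interval_log2 > short_disp_log2)
    return short_disp_log2;
  return interval_log2;
}
```

Under these assumptions a 64k interval is kept on machines with long
displacements and reduced to 4k otherwise, which is where the additional probes
mentioned in the log message come from.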

[Bug target/84332] ICE in insn_default_length, at config/s390/s390.md:9697 for -fstack-clash-protection

2018-08-08 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84332

Andreas Krebbel  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2018-08-08
   Assignee|unassigned at gcc dot gnu.org  |krebbel at gcc dot gnu.org
 Ever confirmed|0   |1

[Bug target/79895] ICE in extract_constrain_insn, at recog.c:2213

2018-08-08 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79895

Andreas Krebbel  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Andreas Krebbel  ---
Fixed

[Bug target/85295] ICE in extract_constrain_insn, at recog.c:2205

2018-08-08 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85295

Andreas Krebbel  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Andreas Krebbel  ---
Fixed with the patch from comment 3

[Bug c++/86082] user-defined literals are not converted to the execution charset

2018-08-08 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86082

Andreas Krebbel  changed:

   What|Removed |Added

 Status|RESOLVED|CLOSED

--- Comment #9 from Andreas Krebbel  ---
Closing

[Bug c++/86082] user-defined literals are not converted to the execution charset

2018-08-08 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86082

Andreas Krebbel  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Andreas Krebbel  ---
Fixed for GCC 9. No backport planned for GCC 8.

[Bug rtl-optimization/83420] S/390 bootstrap failure starting with r255569

2018-08-08 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83420

Andreas Krebbel  changed:

   What|Removed |Added

 Status|RESOLVED|CLOSED

--- Comment #4 from Andreas Krebbel  ---
Closing

[Bug rtl-optimization/83420] S/390 bootstrap failure starting with r255569

2018-08-08 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83420

Andreas Krebbel  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Andreas Krebbel  ---
Fixed

[Bug target/85295] ICE in extract_constrain_insn, at recog.c:2205

2018-08-08 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85295

--- Comment #3 from Andreas Krebbel  ---
Author: krebbel
Date: Wed Aug  8 12:38:51 2018
New Revision: 263396

URL: https://gcc.gnu.org/viewcvs?rev=263396&root=gcc&view=rev
Log:
S/390: Fix PR85295

gcc/ChangeLog:

2018-08-08  Andreas Krebbel  

PR target/85295
* config/s390/constraints.md ("NxHD0", "NxSD0"): New constraint
definitions.
* config/s390/s390.md ("movti"): Add more alternatives for
constant to GPR copies.

gcc/testsuite/ChangeLog:

2018-08-08  Andreas Krebbel  

PR target/85295
* gcc.target/s390/TI-constants-lra.c: New testcase.
* gcc.target/s390/TI-constants-nolra.c: New testcase.


Added:
trunk/gcc/testsuite/gcc.target/s390/TI-constants-lra.c
trunk/gcc/testsuite/gcc.target/s390/TI-constants-nolra.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/s390/constraints.md
trunk/gcc/config/s390/s390.md
trunk/gcc/testsuite/ChangeLog

[Bug tree-optimization/86844] wrong code generation cause by store merging pass

2018-08-03 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86844

Andreas Krebbel  changed:

   What|Removed |Added

   Keywords||wrong-code
   Priority|P3  |P1
   Severity|normal  |major

[Bug tree-optimization/86844] wrong code generation cause by store merging pass

2018-08-03 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86844

--- Comment #1 from Andreas Krebbel  ---
Created attachment 44503
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44503&action=edit
experimental patch

This patch adds a check to check_no_overlap which rejects overlaps if it has
seen a non-constant store in between. This fixes the testcase for me.

[Bug tree-optimization/86844] New: wrong code generation cause by store merging pass

2018-08-03 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86844

Bug ID: 86844
   Summary: wrong code generation cause by store merging pass
   Product: gcc
   Version: 8.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: krebbel at gcc dot gnu.org
  Target Milestone: ---

Created attachment 44502
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44502&action=edit
Reduced testcase

Compiling the attached testcase with -O2 results in the following code:

movzbl  4(%rdi), %eax
movl$33024, 8(%rdi)
movb%al, 10(%rdi)

33024 -> 0  0 129 0

The store of 222 gets optimized away.

Without store merging:

movzbl  4(%rdi), %eax
movl$0, 8(%rdi)
movb$-34, 11(%rdi)
movb$-127, 9(%rdi)
movb%al, 10(%rdi)

The original order of stores:

  a->b.wd0.u4i = 0;
  a->b.wd0.s2.w = 222;
  a->b.wd0.s2.y = 129;
  a->b.wd0.s2.z = a->f.wd1.s2.z;


coalesce_immediate_stores first reorders the stores according to their bit
positions:

  a->b.wd0.u4i = 0;
  a->b.wd0.s2.y = 129;
  a->b.wd0.s2.z = a->f.wd1.s2.z;
  a->b.wd0.s2.w = 222;

It then merges the first and the second and has to end the group seeing the
third. So the last ends up in its own group. Emitting the stores in the
original order makes the 222 store dead. The first two should not be merged.

coalesce_immediate_stores already tries to detect cases where stores later in
the chain might get invalidated by merging early stores but it also assumes
that if the later store also stores a constant it will be possible to merge it
as well.  However, in this case the non-constant store in between prevents
this.
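
The effect on memory can be replayed with a small byte-level model. This is
only a sketch: the type and function names are hypothetical, the four bytes
stand for offsets 8..11 of the assembler output above, and the 32-bit store is
assumed to be little-endian as on x86.

```c
#include <stdint.h>

/* The four bytes of a->b.wd0.  Index 0 corresponds to offset 8 in
   the assembler output, index 3 to offset 11.  */
typedef struct { uint8_t byte[4]; } word0;

/* 32-bit little-endian store, as done by movl on x86.  */
static void
store32 (word0 *w, uint32_t val)
{
  for (int i = 0; i < 4; i++)
    w->byte[i] = (uint8_t) (val >> (8 * i));
}

/* The stores in their original order.  */
word0
stores_in_source_order (uint8_t z)
{
  word0 w;
  store32 (&w, 0);      /* a->b.wd0.u4i = 0;     offsets 8..11 */
  w.byte[3] = 222;      /* a->b.wd0.s2.w = 222;  offset 11 */
  w.byte[1] = 129;      /* a->b.wd0.s2.y = 129;  offset 9 */
  w.byte[2] = z;        /* a->b.wd0.s2.z = ...;  offset 10 */
  return w;
}

/* The stores as they appear in the dump after store merging.  */
word0
stores_after_merging (uint8_t z)
{
  word0 w;
  w.byte[3] = 222;      /* a->b.wd0.s2.w = 222;  */
  store32 (&w, 33024);  /* Merged u4i = 0 and y = 129.  */
  w.byte[2] = z;        /* a->b.wd0.s2.z = _1;  */
  return w;
}
```

In the first sequence the byte at offset 11 ends up as 222. In the second it is
overwritten with 0 by the merged 32-bit store, which is why the store of 222 is
dead and gets removed.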


Store merging pass output:

;; Function f (f, funcdef_no=0, decl_uid=1922, cgraph_uid=1, symbol_order=0)

Processing basic block <2>:
Starting new chain with statement:
a_3(D)->b.wd0.u4i = 0;
The base object is:
a_3(D)
Recording immediate store from stmt:
a_3(D)->b.wd0.s2.w = 222;
Recording immediate store from stmt:
a_3(D)->b.wd0.s2.y = 129;
Recording immediate store from stmt:
a_3(D)->b.wd0.s2.z = _1;
stmt causes chain termination:
return;
Attempting to coalesce 4 stores in chain
New store group
Store 0:
bitsize:32 bitpos:64 val:0
Store 1:
bitsize:8 bitpos:72 val:129
After writing 0 of size 32 at position 0
  the merged value contains 00 00 00 00 
  the merged mask contains  00 00 00 00 
After writing 129 of size 8 at position 8
  the merged value contains 00 81 00 00 
  the merged mask contains  00 00 00 00 
New store group
Store 2:
bitsize:8 bitpos:80 val:_1
New store group
Store 3:
bitsize:8 bitpos:88 val:222
Coalescing successful!
Merged into 1 stores
New sequence of 1 stores to replace old one of 2 stores
# .MEM_6 = VDEF <.MEM_5>
MEM[(union  *)a_3(D) + 8B] = 33024;
Merging successful!
f (struct bar * a)
{
  unsigned char _1;

   [local count: 1073741825]:
  a_3(D)->b.wd0.s2.w = 222;
  MEM[(union  *)a_3(D) + 8B] = 33024;
  _1 = a_3(D)->D.1919.f.wd1.s2.z;
  a_3(D)->b.wd0.s2.z = _1;
  return;

}

[Bug target/86547] s390x: Maximum number of LRA assignment passes is achieved (30) when compiling a small inline assembler snippet

2018-07-30 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86547

--- Comment #7 from Andreas Krebbel  ---
Author: krebbel
Date: Mon Jul 30 08:30:06 2018
New Revision: 263063

URL: https://gcc.gnu.org/viewcvs?rev=263063&root=gcc&view=rev
Log:
lra: consider clobbers when selecting hard_regno to spill

The idea behind the rclass loop in spill_hard_reg_in_range() seems to
be: find a hard_regno, which in general conflicts with reload regno,
but does not do so between `from` and `to`, and then do the live range
splitting based on this information. To check the absence of conflicts,
we make use of insn_bitmap, which does not contain insns which clobber
the hard_regno.

gcc/ChangeLog:

2018-07-30  Ilya Leoshkevich  

PR target/86547
* lra-constraints.c (spill_hard_reg_in_range): When selecting the
hard_regno, make sure no insn between `from` and `to` clobbers it.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/lra-constraints.c

[Bug rtl-optimization/80818] LRA clobbers live hard reg clobbered during rematerialization

2018-07-25 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80818

Andreas Krebbel  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
   Last reconfirmed||2018-07-25
Version|8.0 |7.3.1
 Resolution|FIXED   |---
 Ever confirmed|0   |1

--- Comment #14 from Andreas Krebbel  ---
I ran into the same problem with current GCC 7 branch. Could you apply your fix
also to GCC 7 branch?

[Bug c++/86082] user-defined literals are not converted to the execution charset

2018-07-03 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86082

--- Comment #6 from Andreas Krebbel  ---
(In reply to jwakely from comment #5)
> On 03/07/18 10:07 +, r...@gcc.gnu.org wrote:
> >--- Comment #4 from Jonathan Wakely  ---
> >Is this fixed now?
> 
> Or do you plan to backport it?

I would like to backport it to GCC 7 and 8 branch as well. Ok?

[Bug c++/86082] user-defined literals are not converted to the execution charset

2018-06-25 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86082

--- Comment #3 from Andreas Krebbel  ---
Author: krebbel
Date: Mon Jun 25 07:16:59 2018
New Revision: 262003

URL: https://gcc.gnu.org/viewcvs?rev=262003&root=gcc&view=rev
Log:
C++: Fix PR86082

When turning a user-defined numerical literal into an operator
invocation the literal needs to be translated to the execution
character set.

gcc/cp/ChangeLog:

2018-06-25  Andreas Krebbel  

PR C++/86082
* parser.c (make_char_string_pack): Pass this literal chars
through cpp_interpret_string.
(cp_parser_userdef_numeric_literal): Check the result of
make_char_string_pack.

gcc/testsuite/ChangeLog:

2018-06-25  Andreas Krebbel  

PR C++/86082
* g++.dg/pr86082.C: New test.


Added:
trunk/gcc/testsuite/g++.dg/pr86082.C
Modified:
trunk/gcc/ChangeLog
trunk/gcc/cp/parser.c
trunk/gcc/testsuite/ChangeLog

[Bug c++/86082] user-defined literals are not converted to the execution charset

2018-06-20 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86082

--- Comment #2 from Andreas Krebbel  ---
Created attachment 44300
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44300&action=edit
experimental patch

[Bug c++/86082] user-defined literals are not converted to the execution charset

2018-06-07 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86082

Andreas Krebbel  changed:

   What|Removed |Added

 Target||x86_64
   Host||x86_64
  Build||x86_64

--- Comment #1 from Andreas Krebbel  ---
fails at least since r228905

[Bug c++/86082] New: user-defined literals are not converted to the execution charset

2018-06-07 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86082

Bug ID: 86082
   Summary: user-defined literals are not converted to the
execution charset
   Product: gcc
   Version: 8.0.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: krebbel at gcc dot gnu.org
  Target Milestone: ---

template <char...> void q();
template <> void q<'1','2','3'>() {}

template <char... cs> void operator""_test() { q<cs...> (); }

int
main ()
{
  123_test;
}

builds fine with 'g++ t.cpp'
but triggers a link error compiled with 'g++ t.cpp -fexec-charset=IBM1047'

In the specialization of q the character literals '1', '2', '3' get converted
to the target character set as expected. However, the call generated in the
body of the operator still uses the source character set:

10:  2 FUNC    GLOBAL DEFAULT 2   void q<(char)-15, (char)-14, (char)-13>()
11: 000216 FUNC    GLOBAL DEFAULT 2   main
12: 12 FUNC    WEAK   DEFAULT 6   void operator"" _test<(char)49, (char)50, (char)51>()
13:  0 NOTYPE  GLOBAL DEFAULT UND void q<(char)49, (char)50, (char)51>()

When a user-defined literal is converted into character literals, a conversion
into the execution charset is required as well.
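
The two sets of template arguments in the symbol table can be related with a
small sketch. The function names are hypothetical and no iconv is involved; the
sketch only relies on the decimal digits being 0x30..0x39 in ASCII and
0xF0..0xF9 in EBCDIC code pages such as IBM1047.

```c
/* Convert an ASCII decimal digit to its EBCDIC code point.  The
   digits '0'..'9' are 0x30..0x39 in ASCII and 0xF0..0xF9 in EBCDIC.
   Returns -1 for anything which is not a digit.  */
int
ascii_digit_to_ebcdic (int c)
{
  if (c < 0x30 || c > 0x39)
    return -1;
  return 0xF0 + (c - 0x30);
}

/* View a code point as a signed 8-bit char, the way it shows up in
   the demangled template arguments.  */
int
as_signed_char (int code_point)
{
  return code_point >= 0x80 ? code_point - 0x100 : code_point;
}
```

The digits 1, 2, 3 are 49, 50, 51 in ASCII, which are the arguments used by the
call in the operator body. Converted to EBCDIC and viewed as signed chars they
become -15, -14, -13, the arguments of the explicit specialization.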

[Bug tree-optimization/85478] ICE with single element vector

2018-06-04 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85478

Andreas Krebbel  changed:

   What|Removed |Added

 Status|RESOLVED|CLOSED

--- Comment #13 from Andreas Krebbel  ---
Fixed.

[Bug tree-optimization/85478] ICE with single element vector

2018-04-24 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85478

--- Comment #10 from Andreas Krebbel  ---
Author: krebbel
Date: Tue Apr 24 12:18:26 2018
New Revision: 259593

URL: https://gcc.gnu.org/viewcvs?rev=259593&root=gcc&view=rev
Log:
Fix PR85478

gcc/ChangeLog:

2018-04-24  Andreas Krebbel  

PR tree-optimization/85478
* tree-vect-loop.c (vect_analyze_loop_2): Do not call
vect_grouped_store_supported for single element vectors.

gcc/testsuite/ChangeLog:

2018-04-24  Andreas Krebbel  

PR tree-optimization/85478
* g++.dg/pr85478.C: New test.


Added:
trunk/gcc/testsuite/g++.dg/pr85478.C
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vect-loop.c

[Bug tree-optimization/85478] ICE with single element vector

2018-04-23 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85478

--- Comment #8 from Andreas Krebbel  ---
The problem is similar to PR83753 but with a different call-chain. Richard
Sandiford fixed it by adding:

  /* First cope with the degenerate case of a single-element
     vector.  */
  if (known_eq (TYPE_VECTOR_SUBPARTS (vectype), 1U))
    *memory_access_type = VMAT_CONTIGUOUS;

to get_group_load_store_type.  This prevents vect_grouped_store_supported from
being called for single element vectors. 

For this PR vect_grouped_store_supported is called from vect_analyze_loop_2. I
don't know if there is also a better way to deal with it in the caller?!

But regardless I think vect_grouped_store_supported should return false for
single element vectors as proposed in:

https://gcc.gnu.org/ml/gcc-patches/2018-04/msg00758.html

[Bug tree-optimization/85478] ICE with single element vector

2018-04-23 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85478

--- Comment #7 from Andreas Krebbel  ---
The cross from comment #6 did not trigger the problem because I accidentally
built it with --disable-checking. Dropping this and adding
--with-long-double-128 triggers the ICE on a full cross as well as on a cross
without sysroot.

[Bug tree-optimization/85478] ICE with single element vector

2018-04-20 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85478

--- Comment #6 from Andreas Krebbel  ---
The difference I have seen so far was triggered by building the cross with
"--without-headers". As a result the detected glibc version is 0.0:

config.log:

configure:28145: checking for target glibc version
configure:28169: result: 0.0

This in turn fails to set the proper default for the long double data type in
configure:

if test $glibc_version_major -gt 2 \
  || ( test $glibc_version_major -eq 2 && test $glibc_version_minor -ge 4 );
then :
  gcc_cv_target_ldbl128=yes
else
  ...


Configuring the cross with --with-long-double-128 makes the first set of
differences disappear. However, the testcase still doesn't ICE when compiled
with the cross.

I will retry with a full cross. There appear to be more settings depending on
the Glibc version.
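
To make the effect of the detected version 0.0 concrete, here is a minimal
sketch in C of the comparison the configure fragment above performs. The
function name is hypothetical and the code is not taken from GCC; it only
mirrors the shape of the quoted shell test.

```c
#include <stdbool.h>

/* Hypothetical helper mirroring the quoted configure test: the 128-bit
   long double default is only selected for glibc 2.4 or newer.  */
static bool
ldbl128_default_for_glibc (int major, int minor)
{
  return major > 2 || (major == 2 && minor >= 4);
}
```

With major = 0 and minor = 0, as detected in a --without-headers build, the
test is false and the else branch of the configure fragment is taken.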

[Bug tree-optimization/85478] ICE with single element vector

2018-04-20 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85478

--- Comment #4 from Andreas Krebbel  ---
Indeed it does not appear to fail with a cross from x86. I've checked with
r259518 on s390x as well as on x86. With an x86 cross no tree dump is generated
after 012t.ompexp and the generated assembler file does not contain any code.

x86->s390x cross 012.ompexp:
...
;; Function c::f (_ZN1c1fIP2abIfEPS1_IeEEEiT_T0_,
funcdef_no=15, decl_uid
=2862, cgraph_uid=9, symbol_order=9)

c::f (struct ab * g, struct ab * h)
{
  struct ab * i;
  struct ab D.2925;

   :
  if (i == g)
goto ; [INV]
  else
goto ; [INV]

   :
  ab::ab (, MEM[(const struct ab &)i]);
  *h = D.2925;
  h = h + 16;
  i = i + 16;
  goto ; [INV]

   :
  __builtin_unreachable ();

}



;; Function ab::ab (_ZN2abIeEC2ES_IfE, funcdef_no=6,
decl_uid=2666, cgraph_uid=2, symbol_order=2)

ab::ab (struct ab * const this, struct ab g)
{
  complex double D.2939;

   :
  MEM[(struct  &)this] = {CLOBBER};
  D.2939 = ab::m ();
  _1 = REALPART_EXPR ;
  _2 = IMAGPART_EXPR ;
  _3 = COMPLEX_EXPR <_1, _2>;
  this->n = _3;
  return;

}


s390x native 012.ompexp:

;; Function c::f (_ZN1c1fIP2abIfEPS1_IgEEEiT_T0_,
funcdef_no=15, decl_uid
=2896, cgraph_uid=9, symbol_order=9)

c::f (struct ab * g, struct ab * h)
{
  struct ab * i;
  struct ab D.2959;

   :
  if (i == g)
goto ; [INV]
  else
goto ; [INV]

   :
  ab::ab (, MEM[(const struct ab &)i]);
  *h = D.2959;
  D.2959 = {CLOBBER};
  h = h + 32;
  i = i + 16;
  goto ; [INV]

   :
  __builtin_unreachable ();

}



;; Function ab::ab (_ZN2abIgEC2ES_IfE, funcdef_no=6,
decl_uid=2700, cgraph_uid=2, symbol_order=2)

ab::ab (struct ab * const this, struct ab g)
{
  complex double D.2973;

   :
  MEM[(struct  &)this] = {CLOBBER};
  D.2973 = ab::m ();
  _1 = REALPART_EXPR ;
  _2 = (long double) _1;
  _3 = IMAGPART_EXPR ;
  _4 = (long double) _3;
  _5 = COMPLEX_EXPR <_2, _4>;
  this->n = _5;
  return;

}

[Bug tree-optimization/85478] ICE with single element vector

2018-04-20 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85478

--- Comment #2 from Andreas Krebbel  ---
I've opened another bugzilla for a probably unrelated problem triggered by a
testcase reduced from the same source file:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85481

[Bug c++/85481] New: ICE in maybe_explain_implicit_delete

2018-04-20 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85481

Bug ID: 85481
   Summary: ICE in maybe_explain_implicit_delete
   Product: gcc
   Version: 8.0.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: krebbel at gcc dot gnu.org
  Target Milestone: ---

Created attachment 43998
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43998=edit
Autoreduced testcase

cc1plus t.cc

 } class b {
t.cc:2:1: error: expected ‘,’ or ‘...’ before ‘}’ token
  )
 ^
t.cc:2:2: error: expected ‘;’ after class definition
  ^
t.cc:3:14: error: expected ‘;’ at end of member declaration
t.cc:4:2: error: expected ‘;’ after class definition
 } class B {   virtual ~B(;
  b d
   
^
   
 ;
t.cc:5:2: error: expected ‘;’ after class definition
 } template class : B {
  ^
  ;
t.cc:5:22: error: expected ‘}’ at end of input
 } template class : B {
  ^
t.cc:5:18: error: use of deleted function ‘b::~b()’
 } template class : B {
  ^
t.cc:3:7: internal compiler error: in maybe_explain_implicit_delete, at
cp/method.c:1873
   a c ~b() = default
   ^
0x12e6821 maybe_explain_implicit_delete(tree_node*)
/home/andreas/build/../gcc/gcc/cp/method.c:1873
0x12888b1 mark_used(tree_node*, int)
/home/andreas/build/../gcc/gcc/cp/decl2.c:5255
0x11ad759 build_over_call
/home/andreas/build/../gcc/gcc/cp/call.c:7736
0x11b221b build_new_method_call_1
/home/andreas/build/../gcc/gcc/cp/call.c:9378
0x11b221b build_new_method_call(tree_node*, tree_node*, vec<tree_node*, va_gc,
vl_embed>**, tree_node*, int, tree_node**, int)
/home/andreas/build/../gcc/gcc/cp/call.c:9453
0x12da887 locate_fn_flags
/home/andreas/build/../gcc/gcc/cp/method.c:1024
0x12ddee3 walk_field_subobs
/home/andreas/build/../gcc/gcc/cp/method.c:1439
0x12dedab synthesized_method_walk
/home/andreas/build/../gcc/gcc/cp/method.c:1741
0x12e38f3 get_defaulted_eh_spec(tree_node*, int)
/home/andreas/build/../gcc/gcc/cp/method.c:1775
0x13c5105 maybe_instantiate_noexcept(tree_node*, int)
/home/andreas/build/../gcc/gcc/cp/pt.c:23256
0x13d9de5 check_final_overrider
/home/andreas/build/../gcc/gcc/cp/search.c:1935
0x13d9de5 look_for_overrides_r
/home/andreas/build/../gcc/gcc/cp/search.c:2089
0x13d9de5 look_for_overrides(tree_node*, tree_node*)
/home/andreas/build/../gcc/gcc/cp/search.c:2034
0x11d6b8d check_for_override(tree_node*, tree_node*)
/home/andreas/build/../gcc/gcc/cp/class.c:2774
0x12e1a13 lazily_declare_fn(special_function_kind, tree_node*)
/home/andreas/build/../gcc/gcc/cp/method.c:2404
0x11c9125 dfs_declare_virt_assop_and_dtor
/home/andreas/build/../gcc/gcc/cp/class.c:3011
0x13d4d57 dfs_walk_all(tree_node*, tree_node* (*)(tree_node*, void*),
tree_node* (*)(tree_node*, void*), void*)
/home/andreas/build/../gcc/gcc/cp/search.c:1410
0x13d4ded dfs_walk_all(tree_node*, tree_node* (*)(tree_node*, void*),
tree_node* (*)(tree_node*, void*), void*)
/home/andreas/build/../gcc/gcc/cp/search.c:1422
0x11ea4c5 declare_virt_assop_and_dtor
/home/andreas/build/../gcc/gcc/cp/class.c:3030
0x11ea4c5 add_implicitly_declared_members
/home/andreas/build/../gcc/gcc/cp/class.c:3170

[Bug tree-optimization/85478] ICE with single element vector

2018-04-20 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85478

--- Comment #1 from Andreas Krebbel  ---
The testcase ICEs since r253196:

S/390: Set the preferred mode for float vectors

gcc/ChangeLog:

2017-09-26  Andreas Krebbel  

* config/s390/s390.c (s390_preferred_simd_mode): Return V4SFmode
for SFmode.


with:

during RTL pass: reload
t2.cc: In member function ‘dealii::FullMatrix&
dealii::FullMatrix::operator=(const dealii::FullMatrix&) [with
number2 = std::complex; number = std::complex]’:
t2.cc:199:3: internal compiler error: Max. number of generated reload insns per
insn is achieved (90)

   }
   ^
0x185f553 lra_constraints(bool)
/home/andreas/gcc/gcc/lra-constraints.c:4756
0x1845459 lra(_IO_FILE*)
/home/andreas/gcc/gcc/lra.c:2390
0x17f260b do_reload
/home/andreas/gcc/gcc/ira.c:5440
0x17f260b execute
/home/andreas/gcc/gcc/ira.c:5624
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.



With the poly-int patches the ICE is already triggered during vectorization,
probably papering over the original ICE.

With the patch posted here the vectorization will not continue and does not
appear to end up in that situation anymore:

https://gcc.gnu.org/ml/gcc-patches/2018-04/msg00758.html

[Bug tree-optimization/85478] New: ICE with single element vector

2018-04-20 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85478

Bug ID: 85478
   Summary: ICE with single element vector
   Product: gcc
   Version: 8.0.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: krebbel at gcc dot gnu.org
  Target Milestone: ---

Created attachment 43996
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43996=edit
Autoreduced testcase

Compiling the attached testcase triggers an ICE

cc1plus -march=arch12 -O3 -fpermissive t.cc

Performing interprocedural optimizations   
 <*free_lang_data>   

  
 Assembling functions:
  s& s::operator=(const s&) [with t =
ab; ae = ab]during GIMPLE pass: vect

t2.cc: In member function ‘s& s::operator=(const s&) [with t =
ab; ae = ab]’:
t2.cc:39:8: internal compiler error: in exact_div, at poly-int.h:2139   
 s ::operator=(const s ) {  
^
0x21e8941 poly_int<1u, poly_result::is_poly>::type, poly_coeff_pair_traits::is_poly>::type>::result_kind>::type>
exact_div<1u, unsigned long, int>(poly_int_pod<1u, unsigned long> const&, int)
/home/andreas/build/../gcc/gcc/poly-int.h:2139  
0x21e8941 vect_grouped_store_supported(tree_node*, unsigned long)   
/home/andreas/build/../gcc/gcc/tree-vect-data-refs.c:5150   
0x1ce5115 vect_analyze_loop_2  
/home/andreas/build/../gcc/gcc/tree-vect-loop.c:2495
0x1ce5115 vect_analyze_loop(loop*, _loop_vec_info*) 
/home/andreas/build/../gcc/gcc/tree-vect-loop.c:2621
0x1d03e13 vectorize_loops()
/home/andreas/build/../gcc/gcc/tree-vectorizer.c:664

[Bug testsuite/85326] `make check` fails with `--disable-bootstrap` and `--enable-languages=c`

2018-04-13 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85326

--- Comment #4 from Andreas Krebbel  ---
Author: krebbel
Date: Fri Apr 13 09:14:32 2018
New Revision: 259369

URL: https://gcc.gnu.org/viewcvs?rev=259369=gcc=rev
Log:
IBM Z: Get rid of target specific C++ testcase

gcc/testsuite/ChangeLog:

2018-04-13  Andreas Krebbel  

PR testsuite/85326
* gcc.target/s390/pr77822-1.C: Rename to ...
* gcc.target/s390/pr77822-1.c: ... this. Add asm scan check.
* gcc.target/s390/pr77822-2.c: Add asm scan check.
* gcc.target/s390/s390.exp: Remove C from testcase regexps.


Added:
trunk/gcc/testsuite/gcc.target/s390/pr77822-1.c
Removed:
trunk/gcc/testsuite/gcc.target/s390/pr77822-1.C
Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/s390/pr77822-2.c
trunk/gcc/testsuite/gcc.target/s390/s390.exp

[Bug middle-end/85369] New: no -Wstringop-overflow for a strcpy / stpcpy call with a nonstring pointer when providing movstr pattern

2018-04-12 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85369

Bug ID: 85369
   Summary: no -Wstringop-overflow for a strcpy / stpcpy call with
a nonstring pointer when providing movstr pattern
   Product: gcc
   Version: 8.0.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: krebbel at gcc dot gnu.org
  Target Milestone: ---

c-c++-common/attr-nonstring-3.c fails on IBM Z. A warning only appears when the
strcpy/stpcpy are expanded as normal calls. If the back-end provides the movstr
expander no warning will appear (if the expander can be used).

Just issuing a warning in the builtin expansion logic might end up emitting two
warnings: see PR85359

[Bug tree-optimization/85368] [8 regression] phi-opt-11 test fails on IBM Z

2018-04-12 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85368

--- Comment #1 from Andreas Krebbel  ---
For e.g. Power this has been fixed as part of PR81184

[Bug tree-optimization/81184] [8 regression] gcc.dg/pr21643.c and gcc.dg/tree-ssa/phi-opt-11.c fail starting with r249450

2018-04-12 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81184

--- Comment #10 from Andreas Krebbel  ---
I've verified that the problem is fixed on Power. So I've opened a separate BZ
for this #85368

[Bug tree-optimization/85368] New: [8 regression] phi-opt-11 test fails on IBM Z

2018-04-12 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85368

Bug ID: 85368
   Summary: [8 regression] phi-opt-11 test fails on IBM Z
   Product: gcc
   Version: 8.0.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: krebbel at gcc dot gnu.org
  Target Milestone: ---

No IF statements remain although LOGICAL_OP_NON_SHORT_CIRCUIT is not
defined on S/390 and hence defaults to true when using
-mbranch-cost=2.

The testcase appears to expect 2 IFs to remain for function
h. However, these get removed in phiopt1.

This code turns the TRUTH_ANDIF_EXPR condition into a TRUTH_AND_EXPR:

fold-const.c:8178

  if (LOGICAL_OP_NON_SHORT_CIRCUIT
      && !flag_sanitize_coverage
      && (code == TRUTH_AND_EXPR
          || code == TRUTH_ANDIF_EXPR
          || code == TRUTH_OR_EXPR
          || code == TRUTH_ORIF_EXPR))
    {
      enum tree_code ncode, icode;

      ncode = (code == TRUTH_ANDIF_EXPR || code == TRUTH_AND_EXPR)
              ? TRUTH_AND_EXPR : TRUTH_OR_EXPR;
      icode = ncode == TRUTH_AND_EXPR ? TRUTH_ANDIF_EXPR : TRUTH_ORIF_EXPR;
...

      /* Transform (A AND-IF B) into (A AND B), or (A OR-IF B)
         into (A OR B).
         For sequence point consistancy, we need to check for trapping,
         and side-effects.  */
      else if (code == icode && simple_operand_p_2 (arg0)
               && simple_operand_p_2 (arg1))
        return fold_build2_loc (loc, ncode, type, arg0, arg1);


ANDIFs would be split into two separate IFs, but since it had been replaced with
an AND instead, the truth value gets computed by the gimplifier:

004t.gimple

h (int a, int b, int c, int d)
{
  int D.2246;

  _1 = a == d;
  _2 = b > c;
  _3 = _1 & _2;
  if (_3 != 0) goto ; else goto ;
  :
  D.2246 = d;
  // predicted unlikely by early return (on trees) predictor.
  return D.2246;
  :
  D.2246 = a;
  return D.2246;
}

which eventually gets optimized in phiopt1 to:

Removing basic block 3
;; basic block 3, loop depth 0
;;  pred:   2
;;  succ:   4


COND_EXPR in block 2 and PHI in block 4 converted to straightline code.
Merging blocks 2 and 4
fix_loop_structure: fixing up loops for function
h (int a, int b, int c, int d)
{
  _Bool _1;
  _Bool _2;
  _Bool _3;

   [local count: 1073741825]:
  _1 = a_5(D) == d_6(D);
  _2 = b_7(D) > c_8(D);
  _3 = _1 & _2;
  return a_5(D);
}

No IFs.
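
The three stages above can be sketched at source level in C. This is a
reconstruction from the dumps shown in this report, not the actual
phi-opt-11.c testcase, and all three function names are made up for
illustration.

```c
/* Shape matching the 004t.gimple dump before the fold: a short-circuit
   condition that would normally be split into two IFs.  */
int
h_short_circuit (int a, int b, int c, int d)
{
  if (a == d && b > c)
    return d;
  return a;
}

/* After the fold: both comparisons are evaluated unconditionally and
   combined with a bitwise AND, leaving a single IF.  */
int
h_non_short_circuit (int a, int b, int c, int d)
{
  int t1 = a == d;
  int t2 = b > c;
  if (t1 & t2)
    return d;
  return a;
}

/* What phiopt1 leaves: d is only returned when a == d, so both arms
   yield the value of a and the remaining IF can go as well.  */
int
h_after_phiopt (int a, int b, int c, int d)
{
  (void) b;
  (void) c;
  (void) d;
  return a;
}
```

All three functions return the same value for every input, which is why the
single remaining IF can be removed once the condition has been turned into a
plain AND.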

[Bug tree-optimization/81184] [8 regression] gcc.dg/pr21643.c and gcc.dg/tree-ssa/phi-opt-11.c fail starting with r249450

2018-04-10 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81184

Andreas Krebbel  changed:

   What|Removed |Added

 CC||krebbel at gcc dot gnu.org

--- Comment #9 from Andreas Krebbel  ---
Is this really fixed for Power64? At least on s390 I still see these testcases
failing.
I had a look into phi-opt-11.c.

No IF statements remain although LOGICAL_OP_NON_SHORT_CIRCUIT is not
defined on S/390 and hence defaults to true when using
-mbranch-cost=2.

The testcase appears to expect 2 IFs to remain for function
h. However, these get removed in phiopt1.

This code turns the TRUTH_ANDIF_EXPR condition into a TRUTH_AND_EXPR:

fold-const.c:8178

  if (LOGICAL_OP_NON_SHORT_CIRCUIT
      && !flag_sanitize_coverage
      && (code == TRUTH_AND_EXPR
          || code == TRUTH_ANDIF_EXPR
          || code == TRUTH_OR_EXPR
          || code == TRUTH_ORIF_EXPR))
    {
      enum tree_code ncode, icode;

      ncode = (code == TRUTH_ANDIF_EXPR || code == TRUTH_AND_EXPR)
              ? TRUTH_AND_EXPR : TRUTH_OR_EXPR;
      icode = ncode == TRUTH_AND_EXPR ? TRUTH_ANDIF_EXPR : TRUTH_ORIF_EXPR;
...

      /* Transform (A AND-IF B) into (A AND B), or (A OR-IF B)
         into (A OR B).
         For sequence point consistancy, we need to check for trapping,
         and side-effects.  */
      else if (code == icode && simple_operand_p_2 (arg0)
               && simple_operand_p_2 (arg1))
        return fold_build2_loc (loc, ncode, type, arg0, arg1);


This prevents the gimplifier from splitting the condition into two
separate IF statements. Instead the truth value gets computed:

004t.gimple

h (int a, int b, int c, int d)
{
  int D.2246;

  _1 = a == d;
  _2 = b > c;
  _3 = _1 & _2;
  if (_3 != 0) goto ; else goto ;
  :
  D.2246 = d;
  // predicted unlikely by early return (on trees) predictor.
  return D.2246;
  :
  D.2246 = a;
  return D.2246;
}

which eventually gets optimized in phiopt1 to:

Removing basic block 3
;; basic block 3, loop depth 0
;;  pred:   2
;;  succ:   4


COND_EXPR in block 2 and PHI in block 4 converted to straightline code.
Merging blocks 2 and 4
fix_loop_structure: fixing up loops for function
h (int a, int b, int c, int d)
{
  _Bool _1;
  _Bool _2;
  _Bool _3;

   [local count: 1073741825]:
  _1 = a_5(D) == d_6(D);
  _2 = b_7(D) > c_8(D);
  _3 = _1 & _2;
  return a_5(D);
}

No IFs.

[Bug target/85295] ICE in extract_constrain_insn, at recog.c:2205

2018-04-10 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85295

Andreas Krebbel  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed|2018-04-09 00:00:00 |2018-04-10
   Assignee|unassigned at gcc dot gnu.org  |krebbel at gcc dot 
gnu.org
 Ever confirmed|0   |1

--- Comment #2 from Andreas Krebbel  ---
(In reply to Jakub Jelinek from comment #1)
> Shouldn't we just remove -mno-lra support for s390*?
> I mean, -mlra is the default on s390* already since GCC 4.9?, so the testing
> period must be over already.

If this can be fixed easily I would prefer to keep -mno-lra. It sometimes helps
when debugging lra problems. I'll try to have a look.

[Bug tree-optimization/84486] [7/8 Regression] code hoisting removes alignment assumption

2018-03-05 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84486

--- Comment #3 from Andreas Krebbel  ---
(In reply to Richard Biener from comment #2)
> Created attachment 43540 [details]
> candidate patch
> 
> Can you check whether this patch works for you (on the unreduced testcase
> which likely exists)?

Yes, it does fix the bigger testcase as well. Thanks!

Do you plan to backport this also for GCC 7 branch?

[Bug ada/84706] New: Ada bootstrap fails on s390x since r258124

2018-03-05 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84706

Bug ID: 84706
   Summary: Ada bootstrap fails on s390x since r258124
   Product: gcc
   Version: 8.0.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ada
  Assignee: unassigned at gcc dot gnu.org
  Reporter: krebbel at gcc dot gnu.org
  Target Milestone: ---

raised TYPES.UNRECOVERABLE_ERROR : comperr.adb:407
gnatmake: "/home/andreas/gcc/gcc/ada/xref_lib.adb" compilation error
make[3]: *** [../gcc-interface/Makefile:2212: common-tools] Error 4
make[3]: Leaving directory
'/home/andreas/build/gcc-64-master-z13-build/gcc/ada/tools'
make[2]: *** [Makefile:191: gnattools-native] Error 2
make[2]: Leaving directory
'/home/andreas/build/gcc-64-master-z13-build/gnattools'
make[1]: *** [Makefile:13026: all-gnattools] Error 2
make[1]: *** Waiting for unfinished jobs

[Bug tree-optimization/84486] New: code hoisting removes alignment assumption

2018-02-20 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84486

Bug ID: 84486
   Summary: code hoisting removes alignment assumption
   Product: gcc
   Version: 7.3.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: krebbel at gcc dot gnu.org
  Target Milestone: ---

Created attachment 43473
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43473=edit
Autoreduced testcase

The __atomic_compare_exchange_n builtin on s390 uses the cdsg (compare and swap
double) instruction for 16 byte aligned operands and falls back to a library
call otherwise. Since the code hoisting change r238242 alignment hints applied
with __builtin_assume_aligned appear to get lost and we get a library call for
the attached testcase.

hardware instruction: -O1
libatomic call with: -O1 -fcode-hoisting
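
The pattern in question looks roughly like the following minimal sketch,
assuming the GCC builtins named above. It is not the attached testcase, and
the function name is made up. The report is about 16 byte operands; an 8 byte
value is used here only so that the sketch links without libatomic on any
target.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: promise the compiler that P is suitably aligned,
   then do a strong compare-and-swap through the annotated pointer.  */
bool
cas_aligned (void *p, uint64_t *expected, uint64_t desired)
{
  uint64_t *q = (uint64_t *) __builtin_assume_aligned (p, 8);
  return __atomic_compare_exchange_n (q, expected, desired, false,
                                      __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
```

If the alignment information attached to q is dropped by an optimization
pass, the expander can no longer prove the alignment and has to fall back to
the library call.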

[Bug target/84295] [7 Regression] glibc failed to build

2018-02-09 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84295

Andreas Krebbel  changed:

   What|Removed |Added

 Status|RESOLVED|CLOSED

--- Comment #3 from Andreas Krebbel  ---
GCC 7 backport:

https://gcc.gnu.org/viewcvs/gcc?view=revision=257523

[Bug target/84295] [7 Regression] glibc failed to build

2018-02-09 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84295

Andreas Krebbel  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #2 from Andreas Krebbel  ---
Fixed with:

2018-02-09  Andreas Krebbel  

PR target/PR84295
* config/s390/s390.c (s390_set_current_function): Invoke
s390_indirect_branch_settings also if fndecl didn't change.

gcc/testsuite/ChangeLog:

2018-02-09  Andreas Krebbel  

PR target/PR84295
* gcc.target/s390/pr84295.c: New test.

[Bug target/84295] [7 Regression] glibc failed to build

2018-02-09 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84295

Andreas Krebbel  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2018-02-09
 Ever confirmed|0   |1

--- Comment #1 from Andreas Krebbel  ---
I'm testing the following fix:

index 62a60e2..298fdd1 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -16135,7 +16135,10 @@ s390_set_current_function (tree fndecl)
  several times in the course of compiling a function, and we don't want to
  slow things down too much or call target_reinit when it isn't safe.  */
   if (fndecl == s390_previous_fndecl)
-return;
+{
+  s390_indirect_branch_settings (fndecl);
+  return;
+}

   tree old_tree;
   if (s390_previous_fndecl == NULL_TREE)

[Bug rtl-optimization/83147] LRA inheritance undo on multiple sets problem

2018-01-19 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83147

--- Comment #3 from Andreas Krebbel  ---
(In reply to Vladimir Makarov from comment #2)
> (In reply to Andreas Krebbel from comment #1)
> > Created attachment 42714 [details]
> > Experimental patch
> > 
> > This patch appears to fix the problem for me. However, it isn't really
> > tested yet.
> 
> Hi, Andreas.  Thank you for working on this problem.
> 
> Although I can not reproduce the bug on today trunk, I am completely agree
> with your analysis and the patch.  If you don't mind i'll test the patch
> (under your name) and commit it to the trunk if it is ok for you.  Please,
> let me know.

Of course. Thanks!

[Bug rtl-optimization/83420] S/390 bootstrap failure starting with r255569

2017-12-18 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83420

--- Comment #2 from Andreas Krebbel  ---
Author: krebbel
Date: Mon Dec 18 11:31:06 2017
New Revision: 255777

URL: https://gcc.gnu.org/viewcvs?rev=255777=gcc=rev
Log:
S/390: PR83420: Improve hotpatch option parsing.

With the attached patch we get rid of the following build failure:

/home/andreas/build/../gcc/gcc/config/s390/s390.c: In function ‘void
s390_option_override()’:
/home/andreas/build/../gcc/gcc/config/s390/s390.c:15361:16: error: ‘char*
strncpy(char*, const char*, size_t)’ specified bound 256 equals destination
size [-Werror=stringop-truncation]
strncpy (s, opt->arg, 256);
^~

gcc/ChangeLog:

2017-12-18  Andreas Krebbel  

PR target/83420
* config/s390/s390.c (s390_option_override): Avoid strncpy.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/s390/s390.c
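
For reference, the warning fires because strncpy with a bound equal to the
destination size leaves the buffer without a terminating NUL when the source
has 256 or more characters. One common way to avoid strncpy in such a
situation is sketched below. This is only an illustration with a made-up
helper name, not the code that was committed.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: copy SRC into DST (capacity DST_SIZE bytes) only
   if it fits together with its terminating NUL.  DST is left untouched
   otherwise.  */
static bool
copy_bounded (char *dst, size_t dst_size, const char *src)
{
  size_t len = strlen (src);

  if (len >= dst_size)
    return false;
  memcpy (dst, src, len + 1);
  return true;
}
```

The caller can then report an over-long option argument instead of silently
working on a truncated, possibly unterminated copy.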

[Bug rtl-optimization/83420] S/390 bootstrap failure starting with r255569

2017-12-14 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83420

Andreas Krebbel  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2017-12-14
   Assignee|unassigned at gcc dot gnu.org  |krebbel at gcc dot 
gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Andreas Krebbel  ---
The error messages I've posted were not complete. The important one is this:

/home/andreas/build/../gcc/gcc/config/s390/s390.c: In function ‘void
s390_option_override()’:
/home/andreas/build/../gcc/gcc/config/s390/s390.c:15361:16: error: ‘char*
strncpy(char*, const char*, size_t)’ specified bound 256 equals destination
size [-Werror=stringop-truncation]
strncpy (s, opt->arg, 256);
^~


And this should be easy to fix. Mine

[Bug rtl-optimization/83420] New: S/390 bootstrap failure starting with r255569

2017-12-14 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83420

Bug ID: 83420
   Summary: S/390 bootstrap failure starting with r255569
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: major
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: krebbel at gcc dot gnu.org
  Target Milestone: ---

S/390 64 bit currently fails to bootstrap in stage 3. The problem started with
r255569

In file included from /home/andreas/build/../gcc/gcc/system.h:691,
 from /home/andreas/build/../gcc/gcc/ada/adaint.c:107:
/home/andreas/build/../gcc/gcc/ada/adaint.c: In function ‘char*
__gnat_locate_exec(char*, char*)’:
/home/andreas/build/../gcc/gcc/ada/adaint.c:2890:12: warning: argument 1 null
where non-null expected [-Wnonnull]
(strlen (exec_name) + strlen (HOST_EXECUTABLE_SUFFIX) + 1);
 ~~~^~~
/home/andreas/build/../gcc/gcc/../include/libiberty.h:722:37: note: in
definition of macro ‘alloca’
 # define alloca(x) __builtin_alloca(x)
 ^
In file included from
/home/andreas/build/gcc-8.0.0-64-13122017-build/prev-s390x-ibm-linux-gnu/libstdc++-v3/include/cstring:42,
 from /home/andreas/build/../gcc/gcc/system.h:235,
 from /home/andreas/build/../gcc/gcc/ada/adaint.c:107:
/usr/include/string.h:394:15: note: in a call to function ‘size_t strlen(const
char*)’ declared here
 extern size_t strlen (const char *__s)
   ^~
/home/andreas/build/../gcc/gcc/ada/adaint.c:2892:14: warning: argument 2 null
where non-null expected [-Wnonnull]
   strcpy (full_exec_name, exec_name);
   ~~~^~~
In file included from
/home/andreas/build/gcc-8.0.0-64-13122017-build/prev-s390x-ibm-linux-gnu/libstdc++-v3/include/cstring:42,
 from /home/andreas/build/../gcc/gcc/system.h:235,
 from /home/andreas/build/../gcc/gcc/ada/adaint.c:107:
/usr/include/string.h:125:14: note: in a call to function ‘char* strcpy(char*,
const char*)’ declared here
 extern char *strcpy (char *__restrict __dest, const char *__restrict __src)
  ^~

[Bug rtl-optimization/83147] LRA inheritance undo on multiple sets problem

2017-11-24 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83147

Andreas Krebbel  changed:

   What|Removed |Added

   Keywords||wrong-code
 Target||s390x-ibm-linux
   Priority|P3  |P2
   Host||s390x-ibm-linux
  Build||s390x-ibm-linux

[Bug rtl-optimization/83147] LRA inheritance undo on multiple sets problem

2017-11-24 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83147

--- Comment #1 from Andreas Krebbel  ---
Created attachment 42714
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42714=edit
Experimental patch

This patch appears to fix the problem for me. However, it isn't really tested
yet.

[Bug rtl-optimization/83147] New: LRA inheritance undo on multiple sets problem

2017-11-24 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83147

Bug ID: 83147
   Summary: LRA inheritance undo on multiple sets problem
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: krebbel at gcc dot gnu.org
  Target Milestone: ---

Created attachment 42713
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42713=edit
Autoreduced testcase

Compiling the attached testcase with:

gcc -march=z196 -m64 -mzarch -O2 -o t.s t.cc

produces the following sequence:

...
stmg    %r2,%r3,160(%r15)
ltg     %r2,184(%r15)      <--- read from uninitialized memory
lghi    %r3,0
ltg     %r1,168(%r15)
lghi    %r1,1
locgre  %r2,%r1
...

This currently makes bootstrap with "--with-arch=z196" fail on S/390.

The ltg instruction is a load and test, represented as a parallel of a compare
and a set using the same source operand (272r.ira):

(insn 122 62 48 6 (parallel [
(set (reg:CCZ 33 %cc)
(compare:CCZ (subreg:DI (reg:TI 100 [ width+-8 ]) 8)
(const_int 0 [0])))
(set (reg:DI 118 [ nbwc ])
(subreg:DI (reg:TI 100 [ width+-8 ]) 8))
]) 1213 {*tstdi_extimm}
 (expr_list:REG_UNUSED (reg:CCZ 33 %cc)
(nil)))

LRA generates an inheritance reload replacing both occurrences of the source
operand r100 with r132 (273r.reload):

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
  Creating newreg=132 from oldreg=100, assigning class GENERAL_REGS to
inheritance r132
Original reg change 100->132 (bb6):
  122: {%cc:CCZ=cmp(r132:TI#8,0);r118:DI=r132:TI#8;}
  REG_UNUSED %cc:CCZ
Add inheritance<-original before:
  162: r132:TI=r100:TI

Inheritance reuse change 100->132 (bb6):
  158: r129:DI=r132:TI#8
  REG_DEAD r132:TI
  >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

And another one for r100 stacking on top of the first:
163: r133=r100
162: r132=r133
122: use r132

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
  Creating newreg=133 from oldreg=100, assigning class GENERAL_REGS to
inheritance r133
Original reg change 100->133 (bb5):
   41: r78:DI=r133:TI#8
Add inheritance<-original before:
  163: r133:TI=r100:TI

Inheritance reuse change 100->133 (bb6):
  162: r132:TI=r133:TI
  >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

The inheritance undo code then tries to replace r132 in insn 122 with r133.
Unfortunately it only replaces one of the source operands.

The reason is that the target of the first part of the parallel (the cmp) is
REG_UNUSED and hence single_set ignores it and returns just the second part of
the insn. The code then operates on the source operand returned by single_set
(lra-constraints.c:6698):

  if (GET_CODE (SET_SRC (set)) == SUBREG)
    SUBREG_REG (SET_SRC (set)) = SET_SRC (prev_set);
  else
    SET_SRC (set) = SET_SRC (prev_set);

The replacement perhaps needs to be done recursively to get all the sources?

** Undoing inheritance #2: **

Inherit 3 out of 4 (75.00%)
   Insn after restoring regs:
  158: r129:DI=r100:TI#8
  REG_DEAD r100:TI
Change reload insn:
  122: {%cc:CCZ=cmp(r132:TI#8,0);r118:DI=r133:TI#8;}   <--- 2 different sources
  REG_UNUSED %cc:CCZ
   Insn after restoring regs:
  162: r100:TI=r133:TI
  REG_DEAD r133:TI
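
The difference between rewriting only the set returned by single_set and
rewriting every source can be illustrated with a toy model in C. This is not
GCC's RTL representation and not a proposed patch; the struct and both
functions are hypothetical stand-ins.

```c
#include <stddef.h>

/* Toy stand-in for a PARALLEL with several SETs; not GCC's RTL.  */
struct toy_insn
{
  int src_reg[2];   /* Source register of each SET.  */
  size_t n_sets;    /* Number of SETs in use.  */
};

/* Rewrite only the SET at index WHICH, similar to acting on the one
   set that single_set hands back.  */
static void
replace_one (struct toy_insn *insn, size_t which, int old_reg, int new_reg)
{
  if (which < insn->n_sets && insn->src_reg[which] == old_reg)
    insn->src_reg[which] = new_reg;
}

/* Rewrite every source that names OLD_REG and return how many SETs
   were changed.  */
static int
replace_all (struct toy_insn *insn, int old_reg, int new_reg)
{
  int changed = 0;

  for (size_t i = 0; i < insn->n_sets; i++)
    if (insn->src_reg[i] == old_reg)
      {
        insn->src_reg[i] = new_reg;
        changed++;
      }
  return changed;
}
```

Starting from two sources that both name r132, replace_one on the second set
leaves the insn with r132 and r133 as in the dump above, while replace_all
keeps both sources in sync.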
