Re: [PATCH], Fix PR/target 80099 (internal error with -mno-upper-regs-sf)
On Fri, Apr 14, 2017 at 05:07:17PM -0500, Segher Boessenkool wrote: > On Fri, Apr 14, 2017 at 05:43:28PM -0400, Michael Meissner wrote: > > On Fri, Apr 14, 2017 at 03:48:47AM -0500, Segher Boessenkool wrote: > > > On Wed, Apr 12, 2017 at 06:45:31PM -0400, Michael Meissner wrote: > > > > The problem is rs6000_expand_vector_extract did not check for SFmode > > > > being > > > > allowed in the Altivec (upper) registers, but the insn implementing the > > > > variable extract had it as a condition. > > > > > > > > In looking at the variable extract code, it currently does not require > > > > SFmode > > > > to go in the Altivec registers, but it does require DImode to go into > > > > the > > > > Altivec registers (vec_extract of V2DFmode will require DFmode to go in > > > > Altivec > > > > registers instead of DImode). > > > > > > > > > > @@ -7586,15 +7586,23 @@ rs6000_expand_vector_extract (rtx target > > > >switch (mode) > > > > { > > > > case V2DFmode: > > > > - emit_insn (gen_vsx_extract_v2df_var (target, vec, elt)); > > > > - return; > > > > + if (TARGET_UPPER_REGS_DF) > > > > + { > > > > + emit_insn (gen_vsx_extract_v2df_var (target, vec, elt)); > > > > + return; > > > > + } > > > > + break; > > > > > > If the option is not set, we then ICE later on as far as I see (since > > > elt is not a CONST_INT). Is that what we want, can that not happen? > > > In that case, please just do an assert (TARGET_UPPER_REGS_DF) here? > > > > No. I guess I was unclear. > > I'm asking about this exact code that I quoted. After the change here, > if not TARGET_UPPER_REGS_DF, it breaks out of the switch, and then > immediately ICEs (failing assert on CONST_INT_P). Right? Or do I read > this wrong. You are right, and my patch was too complicated. All I needed to do was remove the upper register checks. In looking at it, since the insn before being split has both register and memory versions, if the register allocator can't allocate a register, it will push the value on to the stack, and adjust the address with the variable index and do a load. Performance with the store and load, likely will not be ideal, but it should work. Because of the interactions with the debug switches -mno-upper-regs-, I decided to add tests for all of the variable extract built-ins with each of the no-upper regs switches. I've tested this on a little endian power8 system and it bootstrapped and ran make check with no regressions. Is it ok for the trunk? [gcc] 2017-04-15 Michael MeissnerPR target/80099 * config/rs6000/rs6000.c (rs6000_expand_vector_extract): Eliminate unneeded test for TARGET_UPPER_REGS_SF. * config/rs6000/vsx.md (vsx_extract_v4sf_var): Likewise. [gcc/testsuite] 2017-04-15 Michael Meissner PR target/80099 * gcc.target/powerpc/pr80099-1.c: New test. * gcc.target/powerpc/pr80099-2.c: Likewise. * gcc.target/powerpc/pr80099-3.c: Likewise. * gcc.target/powerpc/pr80099-4.c: Likewise. * gcc.target/powerpc/pr80099-5.c: Likewise. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797 Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (revision 246815) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -7562,12 +7562,8 @@ rs6000_expand_vector_extract (rtx target return; case V4SFmode: - if (TARGET_UPPER_REGS_SF) - { - emit_insn (gen_vsx_extract_v4sf_var (target, vec, elt)); - return; - } - break; + emit_insn (gen_vsx_extract_v4sf_var (target, vec, elt)); + return; case V4SImode: emit_insn (gen_vsx_extract_v4si_var (target, vec, elt)); Index: gcc/config/rs6000/vsx.md === --- gcc/config/rs6000/vsx.md(revision 246815) +++ gcc/config/rs6000/vsx.md(working copy) @@ -2419,8 +2419,7 @@ (define_insn_and_split "vsx_extract_v4sf UNSPEC_VSX_EXTRACT)) (clobber (match_scratch:DI 3 "=r,,")) (clobber (match_scratch:V2DI 4 "=,X,X"))] - "VECTOR_MEM_VSX_P (V4SFmode) && TARGET_DIRECT_MOVE_64BIT - && TARGET_UPPER_REGS_SF" + "VECTOR_MEM_VSX_P (V4SFmode) && TARGET_DIRECT_MOVE_64BIT" "#" "&& reload_completed" [(const_int 0)] Index: gcc/testsuite/gcc.target/powerpc/pr80099-1.c === --- gcc/testsuite/gcc.target/powerpc/pr80099-1.c(revision 0) +++ gcc/testsuite/gcc.target/powerpc/pr80099-1.c(revision 0) @@ -0,0 +1,12 @@ +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ +/* { dg-require-effective-target
[PATCH] rs6000: Testcase 20050830-1.c no longer fails (PR66612)
Bin's commit r246810, for PR80153, fixes 20050830-1.c for -m64 (it already passed for -m32). So, this patch removes the remaining xfail. Tested on powerpc64-linux {-m32,-m64}; committing to trunk. Segher 2017-04-15 Segher Boessenkoolgcc/testsuite/ PR tree-optimization/66612 * gcc.target/powerpc/20050830-1.c: Remove xfail. --- gcc/testsuite/gcc.target/powerpc/20050830-1.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/gcc/testsuite/gcc.target/powerpc/20050830-1.c b/gcc/testsuite/gcc.target/powerpc/20050830-1.c index 0b1397a..4a8f71a 100644 --- a/gcc/testsuite/gcc.target/powerpc/20050830-1.c +++ b/gcc/testsuite/gcc.target/powerpc/20050830-1.c @@ -1,8 +1,7 @@ /* Make sure the doloop optimization is done for this loop. */ /* { dg-do compile { target powerpc*-*-* } } */ /* { dg-options "-O2" } */ -/* XFAIL for now, see PR66612. */ -/* { dg-final { scan-assembler "bdn" { xfail lp64 } } } */ +/* { dg-final { scan-assembler "bdn" } } */ extern int a[]; int foo(int w) { int n = w; -- 1.9.3
Re: [PATCH] Destroy arguments for _Cilk_spawn calling in the child (PR 80038)
On 2017-04-14 15:00 +0800, Xi Ruoyao wrote: > On 2017-04-13 09:05 +0200, Richard Biener wrote: > > > Did you verify LTO bootstrap still works with the patch? > > I've just done a LTO bootstrapp (boarding a train :) ). > It works with my patch. I've done dejagnu tests in lto.exp and built a Linux kernel with lto bootstrapped GCC. They seem good. -- Xi RuoyaoSchool of Aerospace Science and Technology, Xidian University
PATCH for Re: mirrors
On Sat, 8 Apr 2017, Ionut Vatavu wrote: > I would like to announce a new mirror in Germany Gunzenhausen: > > http://www.bothelp.net/mirrors/gcc - updated daily by rsync Thanks, Ionut. This is now part of our mirrors list per the patch below. Gerald Index: mirrors.html === RCS file: /cvs/gcc/wwwdocs/htdocs/mirrors.html,v retrieving revision 1.241 diff -u -r1.241 mirrors.html --- mirrors.html7 Feb 2017 21:50:17 - 1.241 +++ mirrors.html14 Apr 2017 22:46:41 - @@ -31,6 +31,7 @@ thanks to Tim Semeijn (noc@babylon.network) at Babylon Network. France, Versailles: ftp://ftp.uvsq.fr/pub/gcc/;>ftp.uvsq.fr, thanks to ftpmaint at uvsq.fr Germany, Berlin: ftp://ftp.fu-berlin.de/unix/languages/gcc/;>ftp.fu-berlin.de, thanks to ftp at fu-berlin.de +Germany, Gunzenhausen: http://www.bothelp.net/mirrors/gcc/;>www.bothelp.net, thanks to Ionut Vatavu (iva...@googlemail.com). Germany: ftp://ftp.gwdg.de/pub/misc/gcc/;>ftp.gwdg.de, thanks to emoenke at gwdg.de Germany: ftp://ftp.mpi-sb.mpg.de/pub/gnu/mirror/gcc.gnu.org/pub/gcc/;>ftp.mpi-sb.mpg.de, thanks to ftpadmin at mpi-sb.mpg.de Germany: http://gcc.cybermirror.org;>http://gcc.cybermirror.org, thanks to Sascha Schwarz (cm at cybermirror.org)
[PATCH v2] PR80101: Fix ICE in store_data_bypass_p
This problem reports an assertion error when certain rtl expressions which are not eligible as producers or consumers of a store bypass optimization are passed as arguments to the store_data_bypass_p function. The proposed patch returns false from store_data_bypass_p rather than terminating with an assertion error. False indicates that the passed arguments are not eligible for the store bypass scheduling optimization. Thank you for feedback and guidance received in response to my first patch submission and the follow-on RFC post from Eric Botcazou, Segher Boessenkool, Richard Sandiford, and Pat Haugen. With all of your help, I now have a much better understanding of the intended role of store_data_bypass_p. This new revision of the patch differs from the original submission in the following ways: 1. I have modified the comment that describes this function to clarify that this function is only called if it is already determined that there exists at least one variable that is set by OUT_INSN and read by IN_INSN. My modified comment also clarifies the function's new behavior, as implemented with this patch. 2. I have added comments to the body of the function to clarify some of the rationale for the existing code and the newly inserted code, especially where I was originally confused because I did not understand the rationale. 3. I have added code to allow USE expressions beneath a PARALLEL node without invalidating store data bypass (for consistency, for example, with the implementation of single_set, and as mentioned in feedback from Richard Sandiford). I gather that it is extremely unlikely that in_insn would represent a PARALLEL with multiple store operations beneath it, but this function, as originally implemented, supports that possibility, and my changes to the function do as well. The patch has been boostrapped without regressions on powerpc64le-unknown-linux-gnu. Is this ok for the trunk? gcc/testsuite/ChangeLog: 2017-04-14 Kelvin Nilsen* gcc.target/powerpc/pr80101-1.c: New test. gcc/ChangeLog: 2017-04-14 Kelvin Nilsen * recog.c (store_data_bypass_p): Rather than terminate with assertion error, return false if either of the function's arguments is not a singe_set or a PARALLEL with only SETS inside. Allow USE subexpressions in addition to CLOBBER subexpressions within a PARALLEL that represents either of the function's arguments. Add and modify comments to clarify behavior. Index: gcc/recog.c === --- gcc/recog.c (revision 246469) +++ gcc/recog.c (working copy) @@ -3663,9 +3663,14 @@ peephole2_optimize (void) /* Common predicates for use with define_bypass. */ -/* True if the dependency between OUT_INSN and IN_INSN is on the store - data not the address operand(s) of the store. IN_INSN and OUT_INSN - must be either a single_set or a PARALLEL with SETs inside. */ +/* Given that there exists at least one variable that is set (produced) + by OUT_INSN and read (consumed) by IN_INSN, return true iff + IN_INSN represents one or more memory store operations and none of + the variables set by OUT_INSN is used by IN_INSN as the address of a + store operation. If either IN_INSN or OUT_INSN does not represent + a "single" RTL SET expression (as loosely defined by the + implementation of the single_set function) or a PARALLEL with only + SETs, CLOBBERs, and USEs inside, this function returns false. */ int store_data_bypass_p (rtx_insn *out_insn, rtx_insn *in_insn) @@ -3678,6 +3683,8 @@ store_data_bypass_p (rtx_insn *out_insn, rtx_insn in_set = single_set (in_insn); if (in_set) { + /* If in_set does not represent a store operation, this insn +pair is not eligible for store data bypass. */ if (!MEM_P (SET_DEST (in_set))) return false; @@ -3684,6 +3691,9 @@ store_data_bypass_p (rtx_insn *out_insn, rtx_insn out_set = single_set (out_insn); if (out_set) { + /* If the address stored by in_set is set by out_set, the +dependency is on the address of the store operation, so +this insn pair is not eligible for store data bypass. */ if (reg_mentioned_p (SET_DEST (out_set), SET_DEST (in_set))) return false; } @@ -3698,11 +3708,15 @@ store_data_bypass_p (rtx_insn *out_insn, rtx_insn { out_exp = XVECEXP (out_pat, 0, i); -if (GET_CODE (out_exp) == CLOBBER) - continue; + if ((GET_CODE (out_exp) == CLOBBER) || (GET_CODE (out_exp) == USE)) + continue; +else if (GET_CODE (out_exp) != SET) + return false; -gcc_assert (GET_CODE (out_exp) == SET); - + /* If the address to which the in_set store operation + writes is set by any of the SET subexpressions in
Re: [PATCH], Fix PR/target 80099 (internal error with -mno-upper-regs-sf)
On Fri, Apr 14, 2017 at 05:43:28PM -0400, Michael Meissner wrote: > On Fri, Apr 14, 2017 at 03:48:47AM -0500, Segher Boessenkool wrote: > > On Wed, Apr 12, 2017 at 06:45:31PM -0400, Michael Meissner wrote: > > > The problem is rs6000_expand_vector_extract did not check for SFmode being > > > allowed in the Altivec (upper) registers, but the insn implementing the > > > variable extract had it as a condition. > > > > > > In looking at the variable extract code, it currently does not require > > > SFmode > > > to go in the Altivec registers, but it does require DImode to go into the > > > Altivec registers (vec_extract of V2DFmode will require DFmode to go in > > > Altivec > > > registers instead of DImode). > > > > > > > @@ -7586,15 +7586,23 @@ rs6000_expand_vector_extract (rtx target > > >switch (mode) > > > { > > > case V2DFmode: > > > - emit_insn (gen_vsx_extract_v2df_var (target, vec, elt)); > > > - return; > > > + if (TARGET_UPPER_REGS_DF) > > > + { > > > + emit_insn (gen_vsx_extract_v2df_var (target, vec, elt)); > > > + return; > > > + } > > > + break; > > > > If the option is not set, we then ICE later on as far as I see (since > > elt is not a CONST_INT). Is that what we want, can that not happen? > > In that case, please just do an assert (TARGET_UPPER_REGS_DF) here? > > No. I guess I was unclear. I'm asking about this exact code that I quoted. After the change here, if not TARGET_UPPER_REGS_DF, it breaks out of the switch, and then immediately ICEs (failing assert on CONST_INT_P). Right? Or do I read this wrong. Segher
Re: [PATCH], Fix PR/target 80099 (internal error with -mno-upper-regs-sf)
On Fri, Apr 14, 2017 at 03:48:47AM -0500, Segher Boessenkool wrote: > On Wed, Apr 12, 2017 at 06:45:31PM -0400, Michael Meissner wrote: > > The problem is rs6000_expand_vector_extract did not check for SFmode being > > allowed in the Altivec (upper) registers, but the insn implementing the > > variable extract had it as a condition. > > > > In looking at the variable extract code, it currently does not require > > SFmode > > to go in the Altivec registers, but it does require DImode to go into the > > Altivec registers (vec_extract of V2DFmode will require DFmode to go in > > Altivec > > registers instead of DImode). > > > > @@ -7586,15 +7586,23 @@ rs6000_expand_vector_extract (rtx target > >switch (mode) > > { > > case V2DFmode: > > - emit_insn (gen_vsx_extract_v2df_var (target, vec, elt)); > > - return; > > + if (TARGET_UPPER_REGS_DF) > > + { > > + emit_insn (gen_vsx_extract_v2df_var (target, vec, elt)); > > + return; > > + } > > + break; > > If the option is not set, we then ICE later on as far as I see (since > elt is not a CONST_INT). Is that what we want, can that not happen? > In that case, please just do an assert (TARGET_UPPER_REGS_DF) here? No. I guess I was unclear. We DO NOT NEED the test for SFmode in Altivec registers here, but we do need DImode in Altivec registers. The problem was the generator did not not have the test, but this place did. Before the split, the variable insn looks like: (parallel [(set (reg:SF t) ;; constraint ww (unspec:SF [(reg:V4SF u) ;; constraint v (reg:DI v)] ;; constraint r UNSPEC_VSX_EXTRACT)) (clobber (reg:DI w)) ;; constraint r (clobber (reg:V2DI x))]) ;; constraint v The split generates: ;; VSLO (set (reg:DI x) ;; this is the 2nd clobber (unspec:DI [(reg:V2DI u) ;; this is the vector input (reg:V2DI x)] ;; this is the 2nd clobber UNSPEC_VSX_VSLO)) ;; XSCVSPDPN (set (reg:SF t) ;; this is the destination (unspec:SF [(reg:V2DI x)] ;; this is the 2nd clobber UNSPEC_VSX_CVSPDP)) Because the target constraint is ww, that will be VSX_REGS normally on power8, and FLOAT_REGS if -mupper-regs-sf is used. Note, the variable extraction will not be done this way on power7, since it does not have direct move. So, since the only place we care about whether or not SFmode can go in Altivec registers is the desination, the test should be eliminated. However, in looking at the code generated, I did discover that we need DImode to go in the Altivec registers and we weren't checking for that. Note, variable extraction of DFmode values does not need DImode in Altivec registers, but it does need DFmode in Altivec registers. Now that it is on by default, we really should eliminate the -mno-upper-regs-* options. They were useful when the code was being debugged, but they are less useful now. However, that is not something we should do in stage4. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
[Patch, Fortran, committed] PR 80361: [5/6/7 Regression] bogus recursive call to nonrecursive procedure with -fcheck=recursion
Hi all, I have just committed a close-to-obvious patch for PR 80361, as approved by Thomas on bugzilla: https://gcc.gnu.org/viewcvs?rev=246934=gcc=rev I will backport it to the 5 and 6 branches after a week or so ... Cheers, Janus
Re: [PATCH 0/2] ARC Zero Overhead Loop Fixes
* Claudiu Zissulescu[2017-04-14 11:20:43 +]: > Hi, > > > Andrew Burgess (2): > > arc: Use @pcl assembler syntax instead of invalid expressions > > arc: Fix for loop end detection > > > > Both patches looks good to me. Thanks committed as r246932 and r246933. Andrew
Re: [PATCH] Fix newlib build failure for mips16
On 04/14/2017 10:24 AM, Richard Sandiford wrote: Jeff Lawwrites: On 04/14/2017 05:22 AM, Richard Sandiford wrote: Jeff Law writes: The mips64vr-elf target will fail building newlib, particularly the mips16 newlib as we emit bogus assembly code. In particular the compiler will emit something like lwu $2,0($sp) That's invalid in mips16 mode AFAICT. That's emitted by the extendsidi pattern. It's a case where the operand predicates are looser that the constraints. The code we get out of reload is fine, but hard register propagation substitutes sp for a (valid mips16) hard register that as the same value. Since hard register propagation tests predicates, not constraints, the substitution is successful and the bogus code is generated. Isn't that the bug though? Post-reload passes must test the constraints as well as the predicates, to make sure that the change aligns with one of the available alternatives. I thought that as well and was quite surprised to see regcprop not honor that. regcprop uses the validate_change interface, which normally checks contraints -- except for MEMs which is just checks for validity via memory_address_addr_space_p. Thus inside a MEM it'll allow any change that is recognized as valid according to the legitimate_address hooks. I would claim that is fundamentally broken, but trying to change that at this stage is, IMHO, unwise. I think we should seriously consider doing something different for stage1. For example, after validating the MEM using the legitimate address hooks, call insn_invalid_p as well to verify constraints. I think it only does that if the "containing object" that you're validating is a MEM. If the object you're validating is an insn (which it always is for regcprop) then normal constrain_operands does happen. Adding code to do that to individual MIPS patterns feels like a hack to me. The pass should be doing it itself. Agreed. It's a hack. But it was the best I could see to do at this stage. Been looking at it a bit more, and I think the problem is that we're somehow ending up with a second stack pointer rtx, distinct from stack_pointer_rtx. And then I remembered that this had been discussed before, see the tail end of: https://gcc.gnu.org/ml/gcc-patches/2016-01/msg02362.html I'd be happier with the mips_stack_address_p change described there, although it still seems like a hack. Here's what I'm going to spin in my testers over the weekend. It reverts my mips.md change and instead rejects making a copy of the stack pointer. Jeff diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md index dd5e1e7..7acf00d 100644 --- a/gcc/config/mips/mips.md +++ b/gcc/config/mips/mips.md @@ -3493,10 +3493,7 @@ (define_insn_and_split "*zero_extendsidi2" [(set (match_operand:DI 0 "register_operand" "=d,d") (zero_extend:DI (match_operand:SI 1 "nonimmediate_operand" "d,W")))] - "TARGET_64BIT && !ISA_HAS_EXT_INS - && !(TARGET_MIPS16 -&& MEM_P (operands[1]) -&& reg_mentioned_p (stack_pointer_rtx, operands[1]))" + "TARGET_64BIT && !ISA_HAS_EXT_INS" "@ # lwu\t%0,%1" @@ -3512,10 +3509,7 @@ (define_insn "*zero_extendsidi2_dext" [(set (match_operand:DI 0 "register_operand" "=d,d") (zero_extend:DI (match_operand:SI 1 "nonimmediate_operand" "d,W")))] - "TARGET_64BIT && ISA_HAS_EXT_INS - && !(TARGET_MIPS16 -&& MEM_P (operands[1]) -&& reg_mentioned_p (stack_pointer_rtx, operands[1]))" + "TARGET_64BIT && ISA_HAS_EXT_INS" "@ dext\t%0,%1,0,32 lwu\t%0,%1" diff --git a/gcc/regcprop.c b/gcc/regcprop.c index ddc6252..367d85a 100644 --- a/gcc/regcprop.c +++ b/gcc/regcprop.c @@ -396,6 +396,13 @@ maybe_mode_change (machine_mode orig_mode, machine_mode copy_mode, && GET_MODE_SIZE (copy_mode) < GET_MODE_SIZE (new_mode)) return NULL_RTX; + /* Avoid creating multiple copies of the stack pointer. Some ports + assume there is one and only one stack pointer. + + It's unclear if we need to do the same for other special registers. */ + if (regno == STACK_POINTER_REGNUM) +return NULL_RTX; + if (orig_mode == new_mode) return gen_raw_REG (new_mode, regno); else if (mode_change_ok (orig_mode, new_mode, regno))
Re: [PATCH] Fix newlib build failure for mips16
On 04/14/2017 11:01 AM, Jeff Law wrote: On 04/14/2017 10:24 AM, Richard Sandiford wrote: I think it only does that if the "containing object" that you're validating is a MEM. If the object you're validating is an insn (which it always is for regcprop) then normal constrain_operands does happen. Hmm, you've of course right! Adding code to do that to individual MIPS patterns feels like a hack to me. The pass should be doing it itself. Agreed. It's a hack. But it was the best I could see to do at this stage. Been looking at it a bit more, and I think the problem is that we're somehow ending up with a second stack pointer rtx, distinct from stack_pointer_rtx. And then I remembered that this had been discussed before, see the tail end of: https://gcc.gnu.org/ml/gcc-patches/2016-01/msg02362.html I'd be happier with the mips_stack_address_p change described there, although it still seems like a hack. Let me dig a little further. Two stack pointers just sounds fundamentally wrong. See maybe_mode_change and cry. jeff
Re: [PATCH] Fix newlib build failure for mips16
On 04/14/2017 10:24 AM, Richard Sandiford wrote: I think it only does that if the "containing object" that you're validating is a MEM. If the object you're validating is an insn (which it always is for regcprop) then normal constrain_operands does happen. Hmm, you've of course right! Adding code to do that to individual MIPS patterns feels like a hack to me. The pass should be doing it itself. Agreed. It's a hack. But it was the best I could see to do at this stage. Been looking at it a bit more, and I think the problem is that we're somehow ending up with a second stack pointer rtx, distinct from stack_pointer_rtx. And then I remembered that this had been discussed before, see the tail end of: https://gcc.gnu.org/ml/gcc-patches/2016-01/msg02362.html I'd be happier with the mips_stack_address_p change described there, although it still seems like a hack. Let me dig a little further. Two stack pointers just sounds fundamentally wrong. jeff
Re: [PATCH] Fix newlib build failure for mips16
Jeff Lawwrites: > On 04/14/2017 05:22 AM, Richard Sandiford wrote: >> Jeff Law writes: >>> The mips64vr-elf target will fail building newlib, particularly the >>> mips16 newlib as we emit bogus assembly code. >>> >>> In particular the compiler will emit something like >>> >>> lwu $2,0($sp) >>> >>> That's invalid in mips16 mode AFAICT. >>> >>> That's emitted by the extendsidi pattern. It's a case where the operand >>> predicates are looser that the constraints. The code we get out of >>> reload is fine, but hard register propagation substitutes sp for a >>> (valid mips16) hard register that as the same value. Since hard >>> register propagation tests predicates, not constraints, the substitution >>> is successful and the bogus code is generated. >> >> Isn't that the bug though? Post-reload passes must test the constraints >> as well as the predicates, to make sure that the change aligns with >> one of the available alternatives. > I thought that as well and was quite surprised to see regcprop not honor > that. > > regcprop uses the validate_change interface, which normally checks > contraints -- except for MEMs which is just checks for validity via > memory_address_addr_space_p. Thus inside a MEM it'll allow any change > that is recognized as valid according to the legitimate_address hooks. > > I would claim that is fundamentally broken, but trying to change that at > this stage is, IMHO, unwise. I think we should seriously consider doing > something different for stage1. For example, after validating the MEM > using the legitimate address hooks, call insn_invalid_p as well to > verify constraints. I think it only does that if the "containing object" that you're validating is a MEM. If the object you're validating is an insn (which it always is for regcprop) then normal constrain_operands does happen. >> Adding code to do that to individual MIPS patterns feels like a hack to me. >> The pass should be doing it itself. > Agreed. It's a hack. But it was the best I could see to do at this stage. Been looking at it a bit more, and I think the problem is that we're somehow ending up with a second stack pointer rtx, distinct from stack_pointer_rtx. And then I remembered that this had been discussed before, see the tail end of: https://gcc.gnu.org/ml/gcc-patches/2016-01/msg02362.html I'd be happier with the mips_stack_address_p change described there, although it still seems like a hack. Thanks, Richard
Re: [PATCH] Fix newlib build failure for mips16
On 04/14/2017 05:22 AM, Richard Sandiford wrote: Jeff Lawwrites: The mips64vr-elf target will fail building newlib, particularly the mips16 newlib as we emit bogus assembly code. In particular the compiler will emit something like lwu $2,0($sp) That's invalid in mips16 mode AFAICT. That's emitted by the extendsidi pattern. It's a case where the operand predicates are looser that the constraints. The code we get out of reload is fine, but hard register propagation substitutes sp for a (valid mips16) hard register that as the same value. Since hard register propagation tests predicates, not constraints, the substitution is successful and the bogus code is generated. Isn't that the bug though? Post-reload passes must test the constraints as well as the predicates, to make sure that the change aligns with one of the available alternatives. I thought that as well and was quite surprised to see regcprop not honor that. regcprop uses the validate_change interface, which normally checks contraints -- except for MEMs which is just checks for validity via memory_address_addr_space_p. Thus inside a MEM it'll allow any change that is recognized as valid according to the legitimate_address hooks. I would claim that is fundamentally broken, but trying to change that at this stage is, IMHO, unwise. I think we should seriously consider doing something different for stage1. For example, after validating the MEM using the legitimate address hooks, call insn_invalid_p as well to verify constraints. Adding code to do that to individual MIPS patterns feels like a hack to me. The pass should be doing it itself. Agreed. It's a hack. But it was the best I could see to do at this stage. jeff
Re: RFC: seeking insight on store_data_bypass_p (recog.c)
Pat Haugenwrites: > On 04/12/2017 06:33 PM, Kelvin Nilsen wrote: >> >> 1. As input arguments, out_insn represents an rtl expression that >> potentially "produces" a store to memory and in_insn represents an rtl >> expression that potentially "consumes" a value recently stored to memory. >> > You have this reversed, the code is trying to find situations where > out_insn is producing a value that in_insn will be storing to memory. > >> 2. If the memory store produced matches the memory fetch consumed, this >> function returns true to indicate that this sequence of two instructions >> qualifies for a special "bypass" latency that represents the fact that >> the fetch will obtain the value out of the write buffer. So, whereas >> the instruction scheduler might normally expect that this sequence of >> two instructions would experience Load-Hit-Store penalties associated >> with cache coherency hardware costs, since these two instruction qualify >> for the store_data_bypass optimization, the instruction scheduler counts >> the latency as only 1 or 2 cycles (potentially). [This is what I >> understand, but I may be wrong, so please correct me if so.] >> > In general, yes, if the function returns true then the sequence has been > identified and the target will take appropriate action (adjusting > latency or whatever). > > > As for the remainder below dealing with PARALLELs, I don't have any > history on that so hopefully others can chime in. For the rs6000 port, I > don't see the handling of multiple SET operations making much sense, but > then again I don't know if it will actually occur either based on the > places where store_data_bypass_p is used. > > -Pat > Reordering the message, sorry: >> 2. A "bigger" concern is that any time any SETs are buried within a >> PARALLEL tree, I'm not sure the answer produced by this function, as >> currently implemented, is at all reliable: >> >> a) PARALLEL does not necessarily mean all of its subtrees happen in >> parallel on hardware. It just means that there is no sequencing imposed >> by the source code, so the final order in which the multiple subtrees >> beneath the PARALLEL node is not known at this stage of compilation. >> >> b) It seems to me that it doesn't really make sense to speak of whether >> a whole bunch of producers combined with a whole bunch of consumers >> qualify for an optimized store data bypass latency. If we say that they >> do qualify (as a group), which pair(s) of producer and consumer machine >> instructions qualify? It seems we need to know which producer matches >> with which consumer in order to know where the bypass latencies "fit" >> into the schedule. >> >> c) Furthermore, if it turns out that the "arbitrary" order in which the >> producer instructions and consumer instructions are emitted places too >> much "distance" between a producer and the matching consumer, then it is >> possible that by the time the hardware executes the consumer, the stored >> value is no longer in the write buffer, so even though we might have >> "thought" two PARALLEL rtl expressions qualified for the store bypass >> optimization, we really should have returned false. The bypass function only operates on individual rtl instructions: i.e. one producer and one consumer. Usually rtl instructions map to single machine instructions, with PARALLELs being used if the machine instruction does more than one thing (more below). .md files do have the option of using a single rtl instruction to represent a sequence of several machine instructions but: (a) they're then effectively asking the target-independent code to treat the sequence "as-if" it was a single indivisble instruction. (b) that hampers scheduling in lots of ways, so should be avoided unless there's really no alternative. One problem is that it stops other machine instructions from being scheduled in the sequence. Another is that it makes it harder to describe the microarchitecture effects of the sequence, since more than one instruction is going through the pipeline. So yeah, if a target does put several machine instructions into a single rtl instruction, and one of those instructions is a store, using store_data_bypass_p on it is going to give poor results. But it would give poor results in general, even without the bypass. I think it's case of "don't do that". Sometimes it is (or was) useful to treat multiple machine instructions as single rtl instructions during early rtl optimisation. It's still better to split them into individual machine instructions for scheduling though, via define_split or define_insn_and_split. >> 3. Actually, what I described above is only the "simple" case. It may >> be that the rtl for either out_insn or in_insn is really a parallel >> clause with multiple rtl trees beneath it. In this case, we compare the >> subtrees in a "similar" way to see if the compound expressions qualify >> for the
Re: [PATCH] Fix simplify-rtx.c ICE with vector float (not (neg)) (PR rtl-optimization/80385)
Jakub Jelinekwrites: > Hi! > > The x86 intrinsics allow andnot on MODE_VECTOR_FLOAT modes, but > such modes have NULL CONSTM1_RTX and are not appropriate for the > transformation anyway. > > The following patch fixes that, ok if bootstrap/regtest passes? > Or would you prefer to replace the > && CONSTM1_RTX (mode) > check with e.g. > && (MODE_CLASS (mode) == MODE_INT > || MODE_CLASS (mode) == MODE_VECTOR_INT) > (dunno if we want to handle that way also partial int modes or not, > no experience with those)? > The transformation relies on 2's complement, so certainly doesn't apply > to floating modes (scalar or vector), but even MODE_COMPLEX_INT doesn't > have CONSTM1_RTX. Is it valid to have (not: ...) of a floating-point mode? I thought it had to have an integer mode. FWIW, in the SVE patches, we deliberately used unspecs for floating-point logic to avoid this. Thanks, Richard
Re: [PATCH] Validate that destination gcov file does not exist for, gcov-tool (PR gcov-profile/78783).
On 04/14/2017 02:48 PM, Nathan Sidwell wrote: On 04/14/2017 08:45 AM, Martin Liška wrote: Hello. Patch handles ICE when gcov-tool either merges or scales gcov files and destination folder (where all files are merged) is non-empty. Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. And gcov.exp works on x86_64-linux-gnu. Ok. Perhaps change the message from: fatal_error (input_location, "Output file %s should be removed " + "in destination folder: %s", filename, out); Thanks, I'll change that. to 'output file %s already exists in folder %s' or something? I.e report what the problem is, not what the solution might be -- after all, if the only solution is to delete the output file, why isn't gcov-tool doing that? It's doing that on targets that have nftw: static int unlink_profile_dir (const char *path ATTRIBUTE_UNUSED) { #if HAVE_FTW_H return nftw(path, unlink_gcda_file, 64, FTW_DEPTH | FTW_PHYS); #else return -1; #endif } Martin nathan
[PATCH, gfortran, committed] PR59910 ICE in gfc_conv_array_initializer, at fortran/trans-array.c:5327
Patch committed to the gcc5 branch 2017-04-14 Dominique d'HumieresBackport from trunk 2015-11-18 Steven G. Kargl PR fortran/59910 PR fortran/80388 * primary.c (gfc_match_structure_constructor): Reduce a structure constructor in a DATA statement. * gfortran.dg/pr59910.f90: New test. Dominique
Re: [PATCH] Validate that destination gcov file does not exist for, gcov-tool (PR gcov-profile/78783).
On 04/14/2017 08:45 AM, Martin Liška wrote: Hello. Patch handles ICE when gcov-tool either merges or scales gcov files and destination folder (where all files are merged) is non-empty. Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. And gcov.exp works on x86_64-linux-gnu. Ok. Perhaps change the message from: fatal_error (input_location, "Output file %s should be removed " +"in destination folder: %s", filename, out); to 'output file %s already exists in folder %s' or something? I.e report what the problem is, not what the solution might be -- after all, if the only solution is to delete the output file, why isn't gcov-tool doing that? nathan -- Nathan Sidwell
[PATCH] Validate that destination gcov file does not exist for, gcov-tool (PR gcov-profile/78783).
Hello. Patch handles ICE when gcov-tool either merges or scales gcov files and destination folder (where all files are merged) is non-empty. Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. And gcov.exp works on x86_64-linux-gnu. Ready to be installed? Martin >From 14b0d700bccb87559203d40216975e67ca52415b Mon Sep 17 00:00:00 2001 From: marxinDate: Thu, 13 Apr 2017 16:10:26 +0200 Subject: [PATCH] Validate that destination gcov file does not exist for gcov-tool (PR gcov-profile/78783). libgcc/ChangeLog: 2017-04-13 Martin Liska * libgcov-driver.c (gcov_get_filename): New function. gcc/ChangeLog: 2017-04-13 Martin Liska * gcov-tool.c (gcov_output_files): Validate that destination file is either removed by the tool or by a user. --- gcc/gcov-tool.c | 9 + libgcc/libgcov-driver.c | 9 + 2 files changed, 18 insertions(+) diff --git a/gcc/gcov-tool.c b/gcc/gcov-tool.c index cadf09377dd..e241de2bfe2 100644 --- a/gcc/gcov-tool.c +++ b/gcc/gcov-tool.c @@ -46,6 +46,7 @@ extern int gcov_profile_normalize (struct gcov_info*, gcov_type); extern int gcov_profile_scale (struct gcov_info*, float, int, int); extern struct gcov_info* gcov_read_profile_dir (const char*, int); extern void gcov_do_dump (struct gcov_info *, int); +extern const char *gcov_get_filename (struct gcov_info *list); extern void gcov_set_verbose (void); /* Set to verbose output mode. */ @@ -114,6 +115,14 @@ gcov_output_files (const char *out, struct gcov_info *profile) if (ret) fatal_error (input_location, "Cannot change directory to %s", out); + /* Verify that output file does not exist (either was removed by + unlink_profile_data or removed by user). */ + const char *filename = gcov_get_filename (profile); + + if (access (filename, F_OK) != -1) +fatal_error (input_location, "Output file %s should be removed " + "in destination folder: %s", filename, out); + gcov_do_dump (profile, 0); ret = chdir (pwd); diff --git a/libgcc/libgcov-driver.c b/libgcc/libgcov-driver.c index 70fe69f26b5..c3b2fd4d5ce 100644 --- a/libgcc/libgcov-driver.c +++ b/libgcc/libgcov-driver.c @@ -852,6 +852,15 @@ gcov_do_dump (struct gcov_info *list, int run_counted) free (gf.filename); } +#if IN_GCOV_TOOL +const char * +__attribute__ ((unused)) +gcov_get_filename (struct gcov_info *list) +{ + return list->filename; +} +#endif + #if !IN_GCOV_TOOL void __gcov_dump_one (struct gcov_root *root) -- 2.12.2
[PATCH 6/7] [ARC] [Cxx] Fix calling multiple inheritances.
The TARGET_ASM_OUTPUT_MI_THUNK hook doesn't take into account the variant when we compile for PIC. gcc/ 2016-12-13 Claudiu Zissulescu* config/arc/arc.c (arc_output_mi_thunk): Emit PIC calls. --- gcc/config/arc/arc.c | 24 +--- 1 file changed, 21 insertions(+), 3 deletions(-) diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c index 73d72c68..36582d2 100644 --- a/gcc/config/arc/arc.c +++ b/gcc/config/arc/arc.c @@ -6434,10 +6434,28 @@ arc_output_mi_thunk (FILE *file, tree thunk ATTRIBUTE_UNUSED, fnaddr = XEXP (DECL_RTL (function), 0); if (arc_is_longcall_p (fnaddr)) -fputs ("\tj\t", file); +{ + if (flag_pic) + { + asm_fprintf (file, "\tld\t%s, [pcl, @", + ARC_TEMP_SCRATCH_REG); + assemble_name (file, XSTR (fnaddr, 0)); + fputs ("@gotpc]\n", file); + asm_fprintf (file, "\tj\t[%s]", ARC_TEMP_SCRATCH_REG); + } + else + { + fputs ("\tj\t@", file); + assemble_name (file, XSTR (fnaddr, 0)); + } +} else -fputs ("\tb\t", file); - assemble_name (file, XSTR (fnaddr, 0)); +{ + fputs ("\tb\t@", file); + assemble_name (file, XSTR (fnaddr, 0)); + if (flag_pic) + fputs ("@plt\n", file); +} fputc ('\n', file); } -- 1.9.1
[PATCH 7/7] [ARC] Addresses can use long immediate for offsets.
gcc/ 2016-12-13 Claudiu Zissulescu* config/arc/arc.c (LEGITIMATE_OFFSET_ADDRESS_P): Delete macro. (legitimate_offset_address_p): New function. (arc_legitimate_address_p): Use above function. --- gcc/config/arc/arc.c | 44 +++- 1 file changed, 35 insertions(+), 9 deletions(-) diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c index 36582d2..a113c41 100644 --- a/gcc/config/arc/arc.c +++ b/gcc/config/arc/arc.c @@ -77,13 +77,6 @@ static const char *arc_cpu_string = arc_cpu_name; ? 0 \ : -(-GET_MODE_SIZE (MODE) | -4) >> 1))) -#define LEGITIMATE_OFFSET_ADDRESS_P(MODE, X, INDEX, STRICT) \ -(GET_CODE (X) == PLUS \ - && RTX_OK_FOR_BASE_P (XEXP (X, 0), (STRICT)) \ - && ((INDEX && RTX_OK_FOR_INDEX_P (XEXP (X, 1), (STRICT)) \ - && GET_MODE_SIZE ((MODE)) <= 4) \ - || RTX_OK_FOR_OFFSET_P (MODE, XEXP (X, 1 - #define LEGITIMATE_SCALED_ADDRESS_P(MODE, X, STRICT) \ (GET_CODE (X) == PLUS \ && GET_CODE (XEXP (X, 0)) == MULT \ @@ -246,6 +239,39 @@ static bool arc_use_by_pieces_infrastructure_p (unsigned HOST_WIDE_INT, /* Globally visible information about currently selected cpu. */ const arc_cpu_t *arc_selected_cpu; +/* Check for constructions like REG + OFFS, where OFFS can be a + register, an immediate or an long immediate. */ + +static bool +legitimate_offset_address_p (enum machine_mode mode, rtx x, bool index, +bool strict) +{ + if (GET_CODE (x) != PLUS) +return false; + + if (!RTX_OK_FOR_BASE_P (XEXP (x, 0), (strict))) +return false; + + /* Check for: [Rx + small offset] or [Rx + Ry]. */ + if (((index && RTX_OK_FOR_INDEX_P (XEXP (x, 1), (strict)) + && GET_MODE_SIZE ((mode)) <= 4) + || RTX_OK_FOR_OFFSET_P (mode, XEXP (x, 1 +return true; + + /* Check for [Rx + symbol]. */ + if (!flag_pic + && (GET_CODE (XEXP (x, 1)) == SYMBOL_REF) + /* Avoid this type of address for double or larger modes. */ + && (GET_MODE_SIZE (mode) <= 4) + /* Avoid small data which ends in something like GP + +symb@sda. */ + && (!SYMBOL_REF_SMALL_P (XEXP (x, 1)) + || TARGET_NO_SDATA_SET)) +return true; + + return false; +} + /* Implements target hook vector_mode_supported_p. */ static bool @@ -5600,7 +5626,7 @@ arc_legitimate_address_p (machine_mode mode, rtx x, bool strict) { if (RTX_OK_FOR_BASE_P (x, strict)) return true; - if (LEGITIMATE_OFFSET_ADDRESS_P (mode, x, TARGET_INDEXED_LOADS, strict)) + if (legitimate_offset_address_p (mode, x, TARGET_INDEXED_LOADS, strict)) return true; if (LEGITIMATE_SCALED_ADDRESS_P (mode, x, strict)) return true; @@ -5641,7 +5667,7 @@ arc_legitimate_address_p (machine_mode mode, rtx x, bool strict) if ((GET_CODE (x) == PRE_MODIFY || GET_CODE (x) == POST_MODIFY) && GET_CODE (XEXP ((x), 1)) == PLUS && rtx_equal_p (XEXP ((x), 0), XEXP (XEXP (x, 1), 0)) - && LEGITIMATE_OFFSET_ADDRESS_P (QImode, XEXP (x, 1), + && legitimate_offset_address_p (QImode, XEXP (x, 1), TARGET_AUTO_MODIFY_REG, strict)) return true; return false; -- 1.9.1
[PATCH 5/7] [ARC] Use ACCL, ACCH registers whenever they are available.
gcc/ 2016-12-09 Claudiu Zissulescu* config/arc/arc.c (arc_conditional_register_usage): Use ACCL, ACCH registers whenever they are available. --- gcc/config/arc/arc.c | 9 + 1 file changed, 9 insertions(+) diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c index f820622..73d72c68 100644 --- a/gcc/config/arc/arc.c +++ b/gcc/config/arc/arc.c @@ -1588,6 +1588,15 @@ arc_conditional_register_usage (void) SET_HARD_REG_BIT (reg_class_contents[WRITABLE_CORE_REGS], ACCH_REGNO); SET_HARD_REG_BIT (reg_class_contents[CHEAP_CORE_REGS], ACCL_REGNO); SET_HARD_REG_BIT (reg_class_contents[CHEAP_CORE_REGS], ACCH_REGNO); +SET_HARD_REG_BIT (reg_class_contents[GENERAL_REGS], ACCL_REGNO); +SET_HARD_REG_BIT (reg_class_contents[GENERAL_REGS], ACCH_REGNO); +SET_HARD_REG_BIT (reg_class_contents[MPY_WRITABLE_CORE_REGS], ACCL_REGNO); +SET_HARD_REG_BIT (reg_class_contents[MPY_WRITABLE_CORE_REGS], ACCH_REGNO); + + /* Allow the compiler to freely use them. */ +fixed_regs[ACCL_REGNO] = 0; +fixed_regs[ACCH_REGNO] = 0; + arc_hard_regno_mode_ok[ACC_REG_FIRST] = D_MODES; } } -- 1.9.1
[PATCH 3/7] [ARC] Allow extension core registers to be used for addresses.
gcc/ 2016-12-09 Claudiu Zissulescu* config/arc/arc.h (REGNO_OK_FOR_BASE_P): Consider also extension core registers. (REG_OK_FOR_INDEX_P_NONSTRICT): Likewise. (REG_OK_FOR_BASE_P_NONSTRICT): Likewise. --- gcc/config/arc/arc.h | 20 +--- 1 file changed, 9 insertions(+), 11 deletions(-) diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h index 0c7e561..9f6a272 100644 --- a/gcc/config/arc/arc.h +++ b/gcc/config/arc/arc.h @@ -640,7 +640,8 @@ extern enum reg_class arc_regno_reg_class[]; #define REGNO_OK_FOR_BASE_P(REGNO) \ ((REGNO) < 29 || ((REGNO) == ARG_POINTER_REGNUM) || ((REGNO) == 63) \ || ((unsigned) reg_renumber[REGNO] < 29)\ - || ((unsigned) (REGNO) == (unsigned) arc_tp_regno)) + || ((unsigned) (REGNO) == (unsigned) arc_tp_regno) \ + || (fixed_regs[REGNO] == 0 && IN_RANGE (REGNO, 32, 59))) #define REGNO_OK_FOR_INDEX_P(REGNO) REGNO_OK_FOR_BASE_P(REGNO) @@ -922,18 +923,15 @@ extern int arc_initial_elimination_offset(int from, int to); /* Nonzero if X is a hard reg that can be used as an index or if it is a pseudo reg. */ -#define REG_OK_FOR_INDEX_P_NONSTRICT(X) \ -((unsigned) REGNO (X) >= FIRST_PSEUDO_REGISTER || \ - (unsigned) REGNO (X) < 29 || \ - (unsigned) REGNO (X) == 63 || \ - (unsigned) REGNO (X) == ARG_POINTER_REGNUM) +#define REG_OK_FOR_INDEX_P_NONSTRICT(X)\ + ((unsigned) REGNO (X) >= FIRST_PSEUDO_REGISTER \ + || REGNO_OK_FOR_BASE_P (REGNO (X))) + /* Nonzero if X is a hard reg that can be used as a base reg or if it is a pseudo reg. */ -#define REG_OK_FOR_BASE_P_NONSTRICT(X) \ -((unsigned) REGNO (X) >= FIRST_PSEUDO_REGISTER || \ - (unsigned) REGNO (X) < 29 || \ - (unsigned) REGNO (X) == 63 || \ - (unsigned) REGNO (X) == ARG_POINTER_REGNUM) +#define REG_OK_FOR_BASE_P_NONSTRICT(X) \ + ((unsigned) REGNO (X) >= FIRST_PSEUDO_REGISTER \ + || REGNO_OK_FOR_BASE_P (REGNO (X))) /* Nonzero if X is a hard reg that can be used as an index. */ #define REG_OK_FOR_INDEX_P_STRICT(X) REGNO_OK_FOR_INDEX_P (REGNO (X)) -- 1.9.1
[PATCH 4/7] [ARC] Make D0, D1 double regs fix when not used.
gcc/ 2016-12-09 Claudiu Zissulescu* config/arc/arc.c (arc_conditional_register_usage): Make D0, D1 double regs fix when not used. --- gcc/config/arc/arc.c | 5 + 1 file changed, 5 insertions(+) diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c index 8a8ac86..f820622 100644 --- a/gcc/config/arc/arc.c +++ b/gcc/config/arc/arc.c @@ -1546,6 +1546,11 @@ arc_conditional_register_usage (void) arc_regno_reg_class[42] = ALL_REGS; arc_regno_reg_class[43] = ALL_REGS; + fixed_regs[40] = 1; + fixed_regs[41] = 1; + fixed_regs[42] = 1; + fixed_regs[43] = 1; + arc_hard_regno_mode_ok[40] = 0; arc_hard_regno_mode_ok[42] = 0; -- 1.9.1
[PATCH 2/7] [ARC] Differentiate between ARCv1 and ARCv2 'h'-reg class for ADD insns.
gcc/ 2016-12-08 Claudiu Zissulescu* config/arc/arc.c (arc_output_addsi): Check for h-register class when emitting short ADD instructions. --- gcc/config/arc/arc.c | 42 +++--- 1 file changed, 27 insertions(+), 15 deletions(-) diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c index d8ac6a6..8a8ac86 100644 --- a/gcc/config/arc/arc.c +++ b/gcc/config/arc/arc.c @@ -7455,6 +7455,10 @@ arc_output_addsi (rtx *operands, bool cond_p, bool output_p) int short_p = (!cond_p && short_0 && satisfies_constraint_Rcq (operands[1])); int ret = 0; +#define REG_H_P(OP) (REG_P (OP) && ((TARGET_V2 && REGNO (OP) <= 31 \ +&& REGNO (OP) != 30) \ + || !TARGET_V2)) + #define ADDSI_OUTPUT1(FORMAT) do {\ if (output_p) \ output_asm_insn (FORMAT, operands);\ @@ -7477,32 +7481,40 @@ arc_output_addsi (rtx *operands, bool cond_p, bool output_p) but add1 r0,sp,35 doesn't. */ && (!output_p || (get_attr_length (current_output_insn) & 2))) { + /* Generate add_s a,b,c; add_s b,b,u7; add_s c,b,u3; add_s b,b,h +patterns. */ if (short_p - && (REG_P (operands[2]) - ? (match || satisfies_constraint_Rcq (operands[2])) - : (unsigned) intval <= (match ? 127 : 7))) - ADDSI_OUTPUT1 ("add%? %0,%1,%2"); - if (short_0 && REG_P (operands[1]) && match2) - ADDSI_OUTPUT1 ("add%? %0,%2,%1"); + && ((REG_H_P (operands[2]) + && (match || satisfies_constraint_Rcq (operands[2]))) + || (CONST_INT_P (operands[2]) + && ((unsigned) intval <= (match ? 127 : 7) + ADDSI_OUTPUT1 ("add%? %0,%1,%2 ;1"); + + /* Generate add_s b,b,h patterns. */ + if (short_0 && match2 && REG_H_P (operands[1])) + ADDSI_OUTPUT1 ("add%? %0,%2,%1 ;2"); + + /* Generate add_s b,sp,u7; add_s sp,sp,u7 patterns. */ if ((short_0 || REGNO (operands[0]) == STACK_POINTER_REGNUM) && REGNO (operands[1]) == STACK_POINTER_REGNUM && !(intval & ~124)) - ADDSI_OUTPUT1 ("add%? %0,%1,%2"); + ADDSI_OUTPUT1 ("add%? %0,%1,%2 ;3"); if ((short_p && (unsigned) neg_intval <= (match ? 31 : 7)) || (REGNO (operands[0]) == STACK_POINTER_REGNUM && match && !(neg_intval & ~124))) - ADDSI_OUTPUT1 ("sub%? %0,%1,%n2"); + ADDSI_OUTPUT1 ("sub%? %0,%1,%n2 ;4"); - if (REG_P(operands[0]) && REG_P(operands[1]) - && (REGNO(operands[0]) <= 31) && (REGNO(operands[0]) == REGNO(operands[1])) - && CONST_INT_P (operands[2]) && ( (intval>= -1) && (intval <= 6))) - ADDSI_OUTPUT1 ("add%? %0,%1,%2"); + /* Generate add_s h,h,s3 patterns. */ + if (REG_H_P (operands[0]) && match && TARGET_V2 + && CONST_INT_P (operands[2]) && ((intval>= -1) && (intval <= 6))) + ADDSI_OUTPUT1 ("add%? %0,%1,%2 ;5"); - if (TARGET_CODE_DENSITY && REG_P(operands[0]) && REG_P(operands[1]) - && ((REGNO(operands[0]) == 0) || (REGNO(operands[0]) == 1)) + /* Generate add_s r0,b,u6; add_s r1,b,u6 patterns. */ + if (TARGET_CODE_DENSITY && REG_P (operands[0]) && REG_P (operands[1]) + && ((REGNO (operands[0]) == 0) || (REGNO (operands[0]) == 1)) && satisfies_constraint_Rcq (operands[1]) && satisfies_constraint_L (operands[2])) - ADDSI_OUTPUT1 ("add%? %0,%1,%2 ;3"); + ADDSI_OUTPUT1 ("add%? %0,%1,%2 ;6"); } /* Now try to emit a 32 bit insn without long immediate. */ -- 1.9.1
[PATCH 0/7] [ARC] Fix constraint letters and allow extra registers
From: clazissHi, There is an issue with 'h'- register class for ARCv2, which accepts only the first 32 general purposes registers as oposite to the ARCv1 which accepts all 64 GPRs. Fix this issue in two patches for CMP and ADD instructions. Also, allow the compiler to use extra GPRs if they are available and mark D0, D1 registers fixed when not available. Fix also C++ calling multiple inheritances when compiling for PIC, and allow addresses to use Rx + @symbol. -- Claudiu Zissulescu (7): [ARC] Differentiate between ARCv1 and ARCv2 'h'-reg class for CMP insns. [ARC] Differentiate between ARCv1 and ARCv2 'h'-reg class for ADD insns. [ARC] Allow extension core registers to be used for addresses. [ARC] Make D0, D1 double regs fix when not used. [ARC] Use ACCL, ACCH registers whenever they are available. [ARC] [Cxx] Fix calling multiple inheritances. [ARC] Addresses can use long immediate for offsets. gcc/config/arc/arc.c | 124 +-- gcc/config/arc/arc.h | 20 --- gcc/config/arc/arc.md| 28 +- gcc/config/arc/predicates.md | 13 + 4 files changed, 135 insertions(+), 50 deletions(-) -- 1.9.1
[PATCH 1/7] [ARC] Differentiate between ARCv1 and ARCv2 'h'-reg class for CMP insns.
gcc/ 2016-12-08 Claudiu Zissulescu* config/arc/arc.md (cmpsi_cc_insn_mixed): Use 'h' register constraint. (cmpsi_cc_c_insn): Likewise. (cbranchsi4_scratch): Compute proper instruction length using compact_hreg_operand. * config/arc/predicates.md (compact_hreg_operand): New predicate. --- gcc/config/arc/arc.md| 28 gcc/config/arc/predicates.md | 13 + 2 files changed, 29 insertions(+), 12 deletions(-) diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md index da760ed..053f8a6 100644 --- a/gcc/config/arc/arc.md +++ b/gcc/config/arc/arc.md @@ -3458,15 +3458,16 @@ ;; modifed cc user if second, but not first operand is a compact register. (define_insn "cmpsi_cc_insn_mixed" [(set (reg:CC CC_REG) - (compare:CC (match_operand:SI 0 "register_operand" "Rcq#q, h, c, c,qRcq,c") - (match_operand:SI 1 "nonmemory_operand" "cO,Cm1,cI,cL, Cal,Cal")))] + (compare:CC (match_operand:SI 0 "register_operand" "Rcq#q,Rcqq, h, c, c,qRcq,c") + (match_operand:SI 1 "nonmemory_operand" "cO, hO,Cm1,cI,cL, Cal,Cal")))] "" "cmp%? %0,%B1%&" [(set_attr "type" "compare") - (set_attr "iscompact" "true,true,false,false,true_limm,false") - (set_attr "predicable" "no,no,no,yes,no,yes") + (set_attr "iscompact" "true,true,true,false,false,true_limm,false") + (set_attr "predicable" "no,no,no,no,yes,no,yes") (set_attr "cond" "set") - (set_attr "length" "*,*,4,4,*,8")]) + (set_attr "length" "*,*,*,4,4,*,8") + (set_attr "cpu_facility" "av1,av2,*,*,*,*,*")]) (define_insn "*cmpsi_cc_zn_insn" [(set (reg:CC_ZN CC_REG) @@ -3542,14 +3543,15 @@ (define_insn "*cmpsi_cc_c_insn" [(set (reg:CC_C CC_REG) - (compare:CC_C (match_operand:SI 0 "register_operand" "Rcqq, h, c,Rcqq, c") - (match_operand:SI 1 "nonmemory_operand" "cO,Cm1,cI, Cal,Cal")))] + (compare:CC_C (match_operand:SI 0 "register_operand" "Rcqq,Rcqq, h, c,Rcqq, c") + (match_operand:SI 1 "nonmemory_operand" "cO, hO,Cm1,cI, Cal,Cal")))] "" "cmp%? %0,%S1%&" [(set_attr "type" "compare") - (set_attr "iscompact" "true,true,false,true_limm,false") + (set_attr "iscompact" "true,true,true,false,true_limm,false") (set_attr "cond" "set") - (set_attr "length" "*,*,4,*,8")]) + (set_attr "length" "*,*,*,4,*,8") + (set_attr "cpu_facility" "av1,av2,*,*,*,*")]) ;; Next come the scc insns. @@ -4844,7 +4846,7 @@ return \"br%d0%* %1, %B2, %^%l3\"; /* FALLTHRU */ case 6: case 10: - case 12:return \"cmp%? %1, %B2\\n\\tb%d0%* %^%l3%&;br%d0 out of range\"; + case 12:return \"cmp%? %1, %B2\\n\\tb%d0%* %^%l3%& ;br%d0 out of range\"; default: fprintf (stderr, \"unexpected length %d\\n\", get_attr_length (insn)); fflush (stderr); gcc_unreachable (); } " @@ -4874,13 +4876,15 @@ (minus (const_int 244) (symbol_ref "get_attr_delay_slot_length (insn)" (const_int 4) - (match_operand:SI 1 "compact_register_operand" "") + (and (match_operand:SI 1 "compact_register_operand" "") + (match_operand:SI 2 "compact_hreg_operand" "")) (const_int 6)] (const_int 8))] (cond [(and (ge (minus (match_dup 3) (pc)) (const_int -256)) (le (minus (match_dup 3) (pc)) (const_int 244))) (const_int 8) - (match_operand:SI 1 "compact_register_operand" "") + (and (match_operand:SI 1 "compact_register_operand" "") +(match_operand:SI 2 "compact_hreg_operand" "")) (const_int 10)] (const_int 12 (set (attr "iscompact") diff --git a/gcc/config/arc/predicates.md b/gcc/config/arc/predicates.md index 9e60cb7..f4c2a80 100644 --- a/gcc/config/arc/predicates.md +++ b/gcc/config/arc/predicates.md @@ -189,6 +189,19 @@ } ) +(define_predicate "compact_hreg_operand" + (match_code "reg, subreg") + { + if ((GET_MODE (op) != mode) && (mode != VOIDmode)) +return 0; + + return (GET_CODE (op) == REG) + && (REGNO (op) >= FIRST_PSEUDO_REGISTER + || (TARGET_V2 && REGNO (op) <= 31 && REGNO (op) != 30) + || !TARGET_V2); + } +) + ;; Return true if OP is an acceptable memory operand for ARCompact ;; 16-bit store instructions (define_predicate "compact_store_memory_operand" -- 1.9.1
RE: [PATCH 0/3] ARC patches series
> > Claudiu Zissulescu (3): > > [ARC] Update mode_dependent_address_p hook. > > [ARC] DWARF emitting cleanup. > > [ARC] Use long jumps for CRT calls > > These all look fine. > Committed. Thank you for your review, Claudiu
Re: [PATCH] Fix newlib build failure for mips16
Jeff Lawwrites: > The mips64vr-elf target will fail building newlib, particularly the > mips16 newlib as we emit bogus assembly code. > > In particular the compiler will emit something like > > lwu $2,0($sp) > > That's invalid in mips16 mode AFAICT. > > That's emitted by the extendsidi pattern. It's a case where the operand > predicates are looser that the constraints. The code we get out of > reload is fine, but hard register propagation substitutes sp for a > (valid mips16) hard register that as the same value. Since hard > register propagation tests predicates, not constraints, the substitution > is successful and the bogus code is generated. Isn't that the bug though? Post-reload passes must test the constraints as well as the predicates, to make sure that the change aligns with one of the available alternatives. Adding code to do that to individual MIPS patterns feels like a hack to me. The pass should be doing it itself. Thanks, Richard
RE: [PATCH 0/2] ARC Zero Overhead Loop Fixes
Hi, > Andrew Burgess (2): > arc: Use @pcl assembler syntax instead of invalid expressions > arc: Fix for loop end detection > Both patches looks good to me. Claudiu
Re: [PATCH] Fix fixincludes for canadian cross builds
On 14 April 2017 at 12:29, Bernd Edlingerwrote: > Hi Yvan, > > On 04/14/17 10:24, Yvan Roux wrote: >> Hi Bernd, >> >> On 14 April 2017 at 06:18, Bernd Edlinger wrote: >>> On 04/12/17 17:58, Yvan Roux wrote: Hi, On 20 February 2017 at 18:53, Bruce Korb wrote: > On 02/18/17 01:01, Bernd Edlinger wrote: >> On 02/18/17 00:37, Bruce Korb wrote: >>> On 02/06/17 10:44, Bernd Edlinger wrote: I tested this change with different arm-linux-gnueabihf cross compilers, and verified that mkheaders still works on the host system. Bootstrapped and reg-tested on x86_64-pc-linux-gnu. Is it OK for trunk? >>> >>> As long as you certify that this is correct for all systems we care >>> about: >>> >>> +BUILD_SYSTEM_HEADER_DIR = ` >>> +echo $(CROSS_SYSTEM_HEADER_DIR) | \ >>> +sed -e :a -e 's,[^/]*/\.\.\/,,' -e ta` >>> >>> that is pretty obtuse sed-speak to me. I suggest a comment >>> explaining what sed is supposed to be doing. What should >>> "$(CROSS_SYSTEM_HEADER_DIR)" look like? >>> >> >> I took it just from a few lines above, so I thought that comment would >> sufficiently explain the syntax: > > I confess, I didn't pull a new copy of gcc, sorry. > So it looks good to me. We just noticed that this patch brakes canadian cross builds when configured with --with-build-sysroot, since headers are searched into the target sysroot instead of the one specified for builds. Maybe there's a cleaner way to fix this and avoid the duplication but I didn't find another to test if --with-build-sysroot is used. The attached patch fixes the issue. Tested with a Full canadian cross build for i686-w64-mingw32 host and arm-linux-gnueabihf. Thanks Yvan 2017-04-12 Yvan Roux * Makefile.in (BUILD_SYSTEM_HEADER_DIR): Set to SYSTEM_HEADER_DIR when configured with --with-build-sysroot. >>> >>> Oops, sorry for the breakage... >>> >>> However I think the patch simply restores the previous behavior, because >>> ifdef SYSROOT_CFLAGS_FOR_TARGET is always true, even if it's empty, >>> but it does still not work correctly. >> >> hmm, according to Makefile manual ifdef VAR is true if VAR is >> non-empty, but maybe SYSROOT_CFLAGS_FOR_TARGET is set to an empty >> string when --with-build-sysroot is not used, sorry for not having >> tested this case :/ >> > > Yes, interesting point, the reason must have been somewhere else then. > Actually the "else ifdef" syntax appears to be working to my surprise: > I use gnu make 3.81 here. But the manual says "else" has to be on a > line of it's own, and I have never seen that syntax being used before. Oh, maybe the syntax was changed, I use gnu make 4.2 and the syntax for complex conditional in the manual (https://www.gnu.org/software/make/manual/make.html#Conditional-Syntax) contains: conditional-directive-one text-if-one-is-true else conditional-directive-two ... I made more tests, and when configured without --with-build-sysroot, SYSROOT_CFLAGS_FOR_TARGET really is empty in gcc/Makefile, so I don't understand why my patch didn't work... But anyway, your version is better than mine ;) >>> I tried to build a cross with your patch and a --with-build-sysroot >>> but the target compiler does fix the wrong includes for me: >>> >>> ../gcc-trunk/configure --prefix=/home/ed/gnu/arm-linux-gnueabihf-cross >>> --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf >>> --enable-languages=c,c++,ada,fortran --with-arch=armv7-a >>> --with-tune=cortex-a9 --with-fpu=vfpv3-d16 --with-float=hard >>> --with-build-sysroot=/home/ed/gnu/arm-linux-gnueabihf-3 >>> => >>> Fixing headers into >>> /home/ed/gnu/gcc-build-arm-linux-gnueabihf-2/gcc/include-fixed for >>> arm-unknown-linux-gnueabihf target >>> Forbidden identifiers: linux unix >>> Finding directories and links to directories >>> Searching /usr/include/. >>> Searching /usr/include/./c++/4.8.4 >>> Searching /usr/include/./numpy >>> Searching /usr/include/./python2.7/numpy >>> Making symbolic directory links >>> Fixing directory /usr/include into >>> /home/ed/gnu/gcc-build-arm-linux-gnueabihf-2/gcc/include-fixed >>> >>> but it should fix headers in .../arm-linux-gnueabihf-3/usr/include >>> >>> I think it would work if I use --with-sysroot together with >>> --with-build-sysroot in the config above, but why should the >>> target compiler use --with-sysroot ? >>> There is no need for that on the target, just the cross-compiler >>> might need to support that. >> >> Ah yes, sorry ! we use both together, my understanding was that >> --with-build-sysroot only makes sense we --with-sysroot is used. >> > > No problem, there are certainly many situations when that is true. > >>> I tried to fix some possible combinations of
PR80357: Negative register pressure
In the PR testcase, there were two instructions that had a large number of insn_reg_use records for the same register. model_recompute was instead expecting the records to be unique and so decremented the register pressure for each one. We then ended up with a negative pressure. I think the records *should* be unique here, and that this is really a bug in the generic -fsched-pressure code. Making them unique could be too invasive for GCC 7 though. There are at least two problems: (1) sched-deps.c uses rtx_insn_lists instead of bitmaps to record the set of instructions that use a live register. sched-rgn.c then propagates this information between blocks in a region using list concatenation: succ_rl->uses = concat_INSN_LIST (pred_rl->uses, succ_rl->uses); So dependencies for common predecessors will appear multiple times in the list. In this case (and for the PR), it might be enough to make setup_insn_reg_uses detect duplicate entries. However... (2) setup_insn_reg_uses adds entries for all queued uses of a register at the point that the register is expected to die. It looks like this doesn't work reliably for REG_INC registers: if a register R is auto-incremented in I1 and then used for the last time in I2, setup_insn_reg_uses will first treat the original R as dying in I1 (rightly IMO). But that use of R in I1 is still in the reg_last->uses list when processing I2, so setup_insn_reg_uses adds it a second time. There might be more reasons for multiple records: I stopped looking at this point. I think for GCC 7 it'd be more pragmatic to live with the duplicate entries and keep the fix specific to SCHED_PRESSURE_MODEL. Tested on aarch64-linux-gnu (which uses SCHED_PRESSURE_MODEL by default) and on x86_64-linux-gnu. OK to install? Thanks, Richard gcc/ PR rtl-optimization/80357 * haifa-sched.c (tmp_bitmap): New variable. (model_recompute): Handle duplicate use records. (alloc_global_sched_pressure_data): Initialize tmp_bitmap. (free_global_sched_pressure_data): Free it. gcc/testsuite/ PR rtl-optimization/80357 * gcc.c-torture/compile/pr80357.c: New test. diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c index 07de1bb..0ebf110 100644 --- a/gcc/haifa-sched.c +++ b/gcc/haifa-sched.c @@ -930,6 +930,9 @@ static bitmap saved_reg_live; /* Registers mentioned in the current region. */ static bitmap region_ref_regs; +/* Temporary bitmap used for SCHED_PRESSURE_MODEL. */ +static bitmap tmp_bitmap; + /* Effective number of available registers of a given class (see comment in sched_pressure_start_bb). */ static int sched_class_regs_num[N_REG_CLASSES]; @@ -2135,10 +2138,11 @@ model_recompute (rtx_insn *insn) registers that will be born in the range [model_curr_point, POINT). */ num_uses = 0; num_pending_births = 0; + bitmap_clear (tmp_bitmap); for (use = INSN_REG_USE_LIST (insn); use != NULL; use = use->next_insn_use) { new_last = model_last_use_except (use); - if (new_last < point) + if (new_last < point && bitmap_set_bit (tmp_bitmap, use->regno)) { gcc_assert (num_uses < ARRAY_SIZE (uses)); uses[num_uses].last_use = new_last; @@ -7241,6 +7245,8 @@ alloc_global_sched_pressure_data (void) saved_reg_live = BITMAP_ALLOC (NULL); region_ref_regs = BITMAP_ALLOC (NULL); } + if (sched_pressure == SCHED_PRESSURE_MODEL) + tmp_bitmap = BITMAP_ALLOC (NULL); /* Calculate number of CALL_SAVED_REGS and FIXED_REGS in register classes that we calculate register pressure for. */ @@ -7274,6 +7280,8 @@ free_global_sched_pressure_data (void) BITMAP_FREE (region_ref_regs); BITMAP_FREE (saved_reg_live); } + if (sched_pressure == SCHED_PRESSURE_MODEL) + BITMAP_FREE (tmp_bitmap); BITMAP_FREE (curr_reg_live); free (sched_regno_pressure_class); } diff --git a/gcc/testsuite/gcc.c-torture/compile/pr80357.c b/gcc/testsuite/gcc.c-torture/compile/pr80357.c new file mode 100644 index 000..a3c6ab0 --- /dev/null +++ b/gcc/testsuite/gcc.c-torture/compile/pr80357.c @@ -0,0 +1,18 @@ +typedef char a; +a b, c; +int d, e; +void f(void *g) { *(volatile int *)g; } +void j() { + a h, i; + for (; b; b += 2) { +d = b; +i = i >> b; +if (i) + continue; +f( + (b >> 2)); +h = 0; +for (; h < 8 / 2; h++) + if (i << h) +e = 0; + } +}
Re: [PATCH] Fix fixincludes for canadian cross builds
Hi Yvan, On 04/14/17 10:24, Yvan Roux wrote: > Hi Bernd, > > On 14 April 2017 at 06:18, Bernd Edlingerwrote: >> On 04/12/17 17:58, Yvan Roux wrote: >>> Hi, >>> >>> On 20 February 2017 at 18:53, Bruce Korb wrote: On 02/18/17 01:01, Bernd Edlinger wrote: > On 02/18/17 00:37, Bruce Korb wrote: >> On 02/06/17 10:44, Bernd Edlinger wrote: >>> I tested this change with different arm-linux-gnueabihf cross >>> compilers, and verified that mkheaders still works on the host system. >>> >>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu. >>> Is it OK for trunk? >> >> As long as you certify that this is correct for all systems we care >> about: >> >> +BUILD_SYSTEM_HEADER_DIR = ` >> +echo $(CROSS_SYSTEM_HEADER_DIR) | \ >> +sed -e :a -e 's,[^/]*/\.\.\/,,' -e ta` >> >> that is pretty obtuse sed-speak to me. I suggest a comment >> explaining what sed is supposed to be doing. What should >> "$(CROSS_SYSTEM_HEADER_DIR)" look like? >> > > I took it just from a few lines above, so I thought that comment would > sufficiently explain the syntax: I confess, I didn't pull a new copy of gcc, sorry. So it looks good to me. >>> >>> >>> We just noticed that this patch brakes canadian cross builds when >>> configured with --with-build-sysroot, since headers are searched into >>> the target sysroot instead of the one specified for builds. >>> >>> Maybe there's a cleaner way to fix this and avoid the duplication but >>> I didn't find another to test if --with-build-sysroot is used. The >>> attached patch fixes the issue. Tested with a Full canadian cross >>> build for i686-w64-mingw32 host and arm-linux-gnueabihf. >>> >>> Thanks >>> Yvan >>> >>> 2017-04-12 Yvan Roux >>> >>>* Makefile.in (BUILD_SYSTEM_HEADER_DIR): Set to SYSTEM_HEADER_DIR >>>when configured with --with-build-sysroot. >>> >> >> Oops, sorry for the breakage... >> >> However I think the patch simply restores the previous behavior, because >> ifdef SYSROOT_CFLAGS_FOR_TARGET is always true, even if it's empty, >> but it does still not work correctly. > > hmm, according to Makefile manual ifdef VAR is true if VAR is > non-empty, but maybe SYSROOT_CFLAGS_FOR_TARGET is set to an empty > string when --with-build-sysroot is not used, sorry for not having > tested this case :/ > Yes, interesting point, the reason must have been somewhere else then. Actually the "else ifdef" syntax appears to be working to my surprise: I use gnu make 3.81 here. But the manual says "else" has to be on a line of it's own, and I have never seen that syntax being used before. >> I tried to build a cross with your patch and a --with-build-sysroot >> but the target compiler does fix the wrong includes for me: >> >> ../gcc-trunk/configure --prefix=/home/ed/gnu/arm-linux-gnueabihf-cross >> --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf >> --enable-languages=c,c++,ada,fortran --with-arch=armv7-a >> --with-tune=cortex-a9 --with-fpu=vfpv3-d16 --with-float=hard >> --with-build-sysroot=/home/ed/gnu/arm-linux-gnueabihf-3 >> => >> Fixing headers into >> /home/ed/gnu/gcc-build-arm-linux-gnueabihf-2/gcc/include-fixed for >> arm-unknown-linux-gnueabihf target >> Forbidden identifiers: linux unix >> Finding directories and links to directories >> Searching /usr/include/. >> Searching /usr/include/./c++/4.8.4 >> Searching /usr/include/./numpy >> Searching /usr/include/./python2.7/numpy >> Making symbolic directory links >> Fixing directory /usr/include into >> /home/ed/gnu/gcc-build-arm-linux-gnueabihf-2/gcc/include-fixed >> >> but it should fix headers in .../arm-linux-gnueabihf-3/usr/include >> >> I think it would work if I use --with-sysroot together with >> --with-build-sysroot in the config above, but why should the >> target compiler use --with-sysroot ? >> There is no need for that on the target, just the cross-compiler >> might need to support that. > > Ah yes, sorry ! we use both together, my understanding was that > --with-build-sysroot only makes sense we --with-sysroot is used. > No problem, there are certainly many situations when that is true. >> I tried to fix some possible combinations of --with-sysroot/ >> --with-build-sysroot, and moved the logic to the configure >> script. When I did that I noticed also some more glitches >> when grabbing the TARGET_GLIBC_MAJOR/MINOR defines from sysroot >> files. >> >> This updated patch seems to work for me, could you give it a try? > > The patch looks good to me, and works fine. > > Thanks a lot Bernd. > > Cheers, > Yvan > Hi RMs: I am sorry that this happened so close to the imminent gcc-7 release date. To my best knowledge it would be fine to apply this update patch on the trunk: https://gcc.gnu.org/ml/gcc-patches/2017-04/msg00649.html But if you decide otherwise, I am also ready to revert my original commit:
Re: [PATCH,rs6000] PR80315: Add test cases to confirm ICE has been fixed
On Wed, Apr 12, 2017 at 06:33:17PM -0600, Kelvin Nilsen wrote: > PR80315 Reported an Internal Compiler Error when the third argument to > __builtin_crypto_vshasigmaw was an integer constant with a value > greater than 15. The patch to correct this problem was committed > yesterday. This patch adds 4 new test cases to the regression suite. Sure, please commit. Thanks, Segher
Re: [PATCH], Fix PR/target 80099 (internal error with -mno-upper-regs-sf)
On Wed, Apr 12, 2017 at 06:45:31PM -0400, Michael Meissner wrote: > The problem is rs6000_expand_vector_extract did not check for SFmode being > allowed in the Altivec (upper) registers, but the insn implementing the > variable extract had it as a condition. > > In looking at the variable extract code, it currently does not require SFmode > to go in the Altivec registers, but it does require DImode to go into the > Altivec registers (vec_extract of V2DFmode will require DFmode to go in > Altivec > registers instead of DImode). > @@ -7586,15 +7586,23 @@ rs6000_expand_vector_extract (rtx target >switch (mode) > { > case V2DFmode: > - emit_insn (gen_vsx_extract_v2df_var (target, vec, elt)); > - return; > + if (TARGET_UPPER_REGS_DF) > + { > + emit_insn (gen_vsx_extract_v2df_var (target, vec, elt)); > + return; > + } > + break; If the option is not set, we then ICE later on as far as I see (since elt is not a CONST_INT). Is that what we want, can that not happen? In that case, please just do an assert (TARGET_UPPER_REGS_DF) here? Segher
Re: [PATCH], Fix PR 80098, Add better checking for disabling features
On Wed, Apr 12, 2017 at 04:30:51PM -0400, Michael Meissner wrote: > > Or we could just remove -mmodulo etc. What good do they do? They make > > testing infeasible: an exponential number of combinations to test. > > We can't remove -mmodulo, -mpopcntd, -mcmpb, -mpopcntb as these are the basic > markers for -mcpu=power9/power7/power6/power5, and lots of other things depend > on these options. Yes, and that's exactly backward. The main point though is that we allow fine-grained selection of trivial ISA additions, which results in an exponential number of possibilities. But whether some machine insn is generated also matters for *other* patterns (and elsewhere even), so we really do need to test all possible combinations; but that cannot be done. And then bug reports come in, and we spend more time making patches to the option maze than we spend on anything else :-/ > I'm not sure we have a marker for power8 that isn't vector related. I can't think of any. Pretty much anything in Power8 is vector. > 2017-04-12 Michael Meissner> > PR target/80098 > * config/rs6000/rs6000-cpus.def (OTHER_P9_VECTOR_MASKS): Define > masks of options that should be turned off if the VSX vector > options are turned off. > (OTHER_P8_VECTOR_MASKS): Likewise. > (OTHER_VSX_VECTOR_MASKS): Likewise. > * config/rs6000/rs6000.c (rs6000_option_override_internal): Call > rs6000_disable_incompatible_switches to validate no type switches > like -mvsx. > (rs6000_incompatible_switch): New function to disallow turning on > other vector options if -mno-vsx, -mno-power8-vector, or > -mno-power9-vector are specified. > > [gcc/testsuite] > 2017-04-12 Michael Meissner > > PR target/80098 > * gcc.target/powerpc/pr80098-1.c: New test. > * gcc.target/powerpc/pr80098-2.c: Likewise. > * gcc.target/powerpc/pr80098-3.c: Likewise. > * gcc.target/powerpc/pr80098-4.c: Likewise. > +/* Handle explicit -mno-vsx, -mno-power8-vector, and -mno-power9-vector > options > + and turn off all setting other options by default if those options that > + depend on the flags that are turned off. */ Could you this a bit clearer please? Okay for trunk. Thanks, Segher
Re: [PATCH] Fix fixincludes for canadian cross builds
Hi Bernd, On 14 April 2017 at 06:18, Bernd Edlingerwrote: > On 04/12/17 17:58, Yvan Roux wrote: >> Hi, >> >> On 20 February 2017 at 18:53, Bruce Korb wrote: >>> On 02/18/17 01:01, Bernd Edlinger wrote: On 02/18/17 00:37, Bruce Korb wrote: > On 02/06/17 10:44, Bernd Edlinger wrote: >> I tested this change with different arm-linux-gnueabihf cross >> compilers, and verified that mkheaders still works on the host system. >> >> Bootstrapped and reg-tested on x86_64-pc-linux-gnu. >> Is it OK for trunk? > > As long as you certify that this is correct for all systems we care about: > > +BUILD_SYSTEM_HEADER_DIR = ` > +echo $(CROSS_SYSTEM_HEADER_DIR) | \ > +sed -e :a -e 's,[^/]*/\.\.\/,,' -e ta` > > that is pretty obtuse sed-speak to me. I suggest a comment > explaining what sed is supposed to be doing. What should > "$(CROSS_SYSTEM_HEADER_DIR)" look like? > I took it just from a few lines above, so I thought that comment would sufficiently explain the syntax: >>> >>> I confess, I didn't pull a new copy of gcc, sorry. >>> So it looks good to me. >> >> >> We just noticed that this patch brakes canadian cross builds when >> configured with --with-build-sysroot, since headers are searched into >> the target sysroot instead of the one specified for builds. >> >> Maybe there's a cleaner way to fix this and avoid the duplication but >> I didn't find another to test if --with-build-sysroot is used. The >> attached patch fixes the issue. Tested with a Full canadian cross >> build for i686-w64-mingw32 host and arm-linux-gnueabihf. >> >> Thanks >> Yvan >> >> 2017-04-12 Yvan Roux >> >>* Makefile.in (BUILD_SYSTEM_HEADER_DIR): Set to SYSTEM_HEADER_DIR >>when configured with --with-build-sysroot. >> > > Oops, sorry for the breakage... > > However I think the patch simply restores the previous behavior, because > ifdef SYSROOT_CFLAGS_FOR_TARGET is always true, even if it's empty, > but it does still not work correctly. hmm, according to Makefile manual ifdef VAR is true if VAR is non-empty, but maybe SYSROOT_CFLAGS_FOR_TARGET is set to an empty string when --with-build-sysroot is not used, sorry for not having tested this case :/ > I tried to build a cross with your patch and a --with-build-sysroot > but the target compiler does fix the wrong includes for me: > > ../gcc-trunk/configure --prefix=/home/ed/gnu/arm-linux-gnueabihf-cross > --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf > --enable-languages=c,c++,ada,fortran --with-arch=armv7-a > --with-tune=cortex-a9 --with-fpu=vfpv3-d16 --with-float=hard > --with-build-sysroot=/home/ed/gnu/arm-linux-gnueabihf-3 > => > Fixing headers into > /home/ed/gnu/gcc-build-arm-linux-gnueabihf-2/gcc/include-fixed for > arm-unknown-linux-gnueabihf target > Forbidden identifiers: linux unix > Finding directories and links to directories > Searching /usr/include/. > Searching /usr/include/./c++/4.8.4 > Searching /usr/include/./numpy > Searching /usr/include/./python2.7/numpy > Making symbolic directory links > Fixing directory /usr/include into > /home/ed/gnu/gcc-build-arm-linux-gnueabihf-2/gcc/include-fixed > > but it should fix headers in .../arm-linux-gnueabihf-3/usr/include > > I think it would work if I use --with-sysroot together with > --with-build-sysroot in the config above, but why should the > target compiler use --with-sysroot ? > There is no need for that on the target, just the cross-compiler > might need to support that. Ah yes, sorry ! we use both together, my understanding was that --with-build-sysroot only makes sense we --with-sysroot is used. > I tried to fix some possible combinations of --with-sysroot/ > --with-build-sysroot, and moved the logic to the configure > script. When I did that I noticed also some more glitches > when grabbing the TARGET_GLIBC_MAJOR/MINOR defines from sysroot > files. > > This updated patch seems to work for me, could you give it a try? The patch looks good to me, and works fine. Thanks a lot Bernd. Cheers, Yvan
Re: [wwwdocs,fortran] Update link to CHKSYS
Hi Jerry, thanks for the suggestions - I agree that a download link which is more visible would be a good idea. I will adapt the pages with this in mind. Regards, Arjen 2017-04-13 21:36 GMT+02:00 Jerry DeLisle: > On 04/13/2017 02:13 AM, Arjen Markus wrote: >> Hi Gerald, Jerry, >> >> as the author of this program (and the maintainer of that site) I am >> honoured that you refer to this. But I would like to tell you a bit >> about the history: >> I started CHKSYS in the time that FORTRAN 77 was still the main >> standard, so much of the tests that are contained in this program are >> specific to all the variations I had encountered with various FORTRAN >> compilers. I did start a version for Fortran 90, but I was rather a >> novice wrt that standard back then and I never really achieved a >> satisfactory set of tests. >> >> Some time last year I started anew with a set of programs - see the >> chkfeatures subdirectory in the source repository. These programs >> check whether a compiler supports features found in Fortran 2003 and >> 2008 and various common extensions that I am aware of. Since the >> checks frequently cause the compilers to complain - and are even >> designed for this - I have set this up as a collection of programs, >> rather than a single one. >> >> Checking the web pages, I see I should really update the original page >> and integrate the new stuff with it. If you have any suggestions to >> make the reference easier (access to the source code from that page?), >> then let me know. I am no hero when it comes to web design, but a >> better set-up of links and such is well within my capacity. >> > > First, the original page link we had is broken. so clarify for us the URL you > intend to update. > > A download link on each page is helpful. Our link was to CHKSYS so I had to > manually navigate up one level to hunt around for the actual source code. We > probably should link to your main page so one navigates down to what one is > looking for. I think the download page link has one word like 'here' or > something. Maybe pull that out to a separate line that says "Download > Sources" > in bold or something. Just a suggestion, by all means do as you please. > > Jerry
Re: [PATCH] Destroy arguments for _Cilk_spawn calling in the child (PR 80038)
On 2017-04-13 09:05 +0200, Richard Biener wrote: > Did you verify LTO bootstrap still works with the patch? I've just done a LTO bootstrapp (boarding a train :) ). It works with my patch. -- Xi RuoyaoSchool of Aerospace Science and Technology, Xidian University