Re: [patch,avr][Ping #3] PR81075: Move jump-tables out of .text

2017-07-07 Thread Denis Chertykov
I'm sorry for such a long delay.

Please apply the patch.

2017-07-05 14:19 GMT+04:00 Georg-Johann Lay :
> Ping #3
>
> http://gcc.gnu.org/ml/gcc-patches/2017-06/msg01029.html
>
> As avr maintainers are off-line, would a global maintainer have
> a look at this?
>
> Thanks,
>
> Johann
>
>
>
>
> On 27.06.2017 12:01, Georg-Johann Lay wrote:
>>
>> Ping #2
>>
>> http://gcc.gnu.org/ml/gcc-patches/2017-06/msg01029.html
>>
>> On 14.06.2017 14:03, Georg-Johann Lay wrote:
>>>
>>> Hi,
>>>
>>> Since PR71151 we have jump-tables in .text so that branches
>>> crossing the tables have longer offsets than needed.
>>>
>>> This moves jump-tables out of .text again, but not into
>>> .progmem.gcc_sw_tables like before PR71151, but into
>>> the currently unused but existing .jumptables.
>>>
>>> Since PR63223 there is no restriction on the location
>>> of jump-tables, they can even reside above 128KiB without
>>> problems.
>>>
>>> Also adds -mlog=insn_addresses to dump insn addresses
>>> as asm comments before respective instruction.
>>>
>>> The patch implements ASM_OUTPUT_ADDR_VEC so that avr.c
>>> gains full control over the table generation.
>>>
>>> Tested on ATmega2560.
>>>
>>> Ok to apply?
>>>
>>> Johann
>>>
>>>
>>> gcc/
>>>  Move jump-tables out of .text again.
>>>
>>>  PR target/81075
>>>  * config/avr/avr.c (ASM_OUTPUT_ADDR_VEC_ELT): Remove function.
>>>  (ASM_OUTPUT_ADDR_VEC): New function.
>>>  (avr_adjust_insn_length) [JUMP_TABLE_DATA_P]: Return 0.
>>>  (avr_final_prescan_insn) [avr_log.insn_addresses]: Dump
>>>  INSN_ADDRESSes as asm comment.
>>>  * config/avr/avr.h (JUMP_TABLES_IN_TEXT_SECTION): Adjust comment.
>>>  (ASM_OUTPUT_ADDR_VEC_ELT): Remove define.
>>>  (ASM_OUTPUT_ADDR_VEC): Define to avr_output_addr_vec.
>>>  * config/avr/avr.md (*tablejump): Adjust comment.
>>>  * config/avr/elf.h (ASM_OUTPUT_BEFORE_CASE_LABEL): Remove.
>>>  * config/avr/avr-log.c (avr_log_set_avr_log) :
>>>  New detail.
>>>  * config/avr/avr-protos.h (avr_output_addr_vec_elt): Remove proto.
>>>  (avr_output_addr_vec): New proto.
>>>  (avr_log_t) : New field.
>>
>>
>>
>
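
For readers less familiar with the term used in this thread, the kind of
source that can give rise to a jump table is sketched below.  The function
name and the case bodies are made up for illustration; whether a table is
actually emitted depends on the target, the optimization level and the
compiler's heuristics.

```c
#include <assert.h>

/* Hypothetical dispatcher: eight consecutive case values, each with a
   different body.  A compiler may lower such a switch to an indexed
   jump through a table of code addresses (a jump table) instead of a
   chain of compare-and-branch instructions.  */
static int
apply (unsigned op, int a, int b)
{
  switch (op)
    {
    case 0: return a + b;
    case 1: return a - b;
    case 2: return a * b;
    case 3: return a & b;
    case 4: return a | b;
    case 5: return a ^ b;
    case 6: return a < b ? a : b;   /* minimum */
    case 7: return a > b ? a : b;   /* maximum */
    default: return 0;              /* out-of-range selector */
    }
}

/* Self-check with small operands so no case can overflow.  */
static void
check_apply (void)
{
  assert (apply (0, 6, 3) == 9);
  assert (apply (2, 6, 3) == 18);
  assert (apply (5, 6, 3) == 5);
  assert (apply (8, 6, 3) == 0);
}
```

Where the table for such a switch ends up (inside .text or in a separate
section) is exactly what the patch above changes for avr.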


Re: [patch,avr] Fix PR20296 / PR81268: Better ISR prologues / epilogues

2017-07-07 Thread Denis Chertykov
2017-07-07 18:31 GMT+04:00 Georg-Johann Lay :
> Hi,
>
> this patch addresses a very old issue, the non-optimal
> generation of ISR prologues and epilogues.
>
> As GAS now provides the __gcc_isr pseudo instruction to
> overcome some problems, see
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=21683
>
> this can now be used to address PR20296.
>
>
> This patch does:
>
> * Add a configure test if GAS supports __gcc_isr and -mgcc-isr.
>
> * Add new option -mgas-isr-prologues to switch on / off
>   generating of __gcc_isr in ISR prologues / epilogues.
>
> * Switch on the feature per default except for -O0 and -Og.
>
> * Add a new no_gccisr function attribute to disable __gcc_isr
>   generation for individual ISRs.
>
> * Add a new pass .avr-gasisr that filters out situations where
>   __gcc_isr is not appropriate.
>
> * Extend prologue and epilogue generation to emit __gcc_isr chunks
>   during prologue and epilogue(s).
>
> * Implement final_postscan_insn to emit final __gcc_isr Done chunk.
>
> * Add -mgcc-isr to ASM_SPEC if appropriate.
>
>
> We currently have only 3 torture tests for ISRs, namely
>
> gcc.target/avr/torture/isr-*.c
>
> All these tests PASS when
>
> * Run with -mgas-isr-prologues
> * Run with -mno-gas-isr-prologues
> * Run for: atmega8 atmega64 atmega103 atmega2560 atmega128 atxmega128a1
> attiny40
>
> Ok for trunk?


Please apply.

>
> Johann
>
> PR target/20296
> PR target/81268
> * configure.ac [target=avr]: Add GAS check for -mgcc-isr.
> (HAVE_AS_AVR_MGCCISR_OPTION):  If so, AC_DEFINE it.
> * config.in: Regenerate.
> * configure: Regenerate.
>
> * doc/extend.texi (AVR Function Attributes) : Document
> it.
> * doc/invoke.texi (AVR Options) <-mgas-isr-prologues>: Document it.
>
> * config/avr/avr.opt (-mgas-isr-prologues): New option and...
> (TARGET_GASISR_PROLOGUES): ...target mask.
> * common/config/avr/avr-common.c
> (avr_option_optimization_table) [OPT_LEVELS_1_PLUS_NOT_DEBUG]:
> Set -mgas-isr-prologues.
> * config/avr/avr-passes.def (avr_pass_maybe_gasisr): Add
> INSERT_PASS_BEFORE for it.
> * config/avr/avr-protos.h (make_avr_pass_maybe_gasisr): New proto.
> * config/avr/avr.c (avr_option_override)
> [!HAVE_AS_AVR_MGCCISR_OPTION]: Unset TARGET_GASISR_PROLOGUES.
> (avr_no_gccisr_function_p, avr_hregs_split_lsb): New static
> functions.
> (avr_attribute_table) : Add new function attribute.
> (avr_set_current_function) : Init machine field.
> (avr_pass_data_gasisr, avr_pass_maybe_gasisr): New pass data
> and rtl_opt_pass.
> (make_avr_pass_maybe_gasisr): New function.
> (emit_push_sfr) : Add argument to function and use it
> instead of TMP_REG.
> (avr_expand_prologue) [machine->gasisr.maybe]: Emit gasisr insn
> and set machine->gasisr.yes.
> (avr_expand_epilogue) [machine->gasisr.yes]: Similar.
> (avr_asm_function_end_prologue) [machine->gasisr.yes]: Add
> __gcc_isr.n_pushed to .L__stack_usage.
> (TARGET_ASM_FINAL_POSTSCAN_INSN): Define to...
> (avr_asm_final_postscan_insn): ...this new static function.
> * config/avr/avr.h (machine_function)
> : New fields.
> : New fields.
> * config/avr/avr.md (UNSPECV_GASISR): Add unspecv enum.
> (GASISR_Prologue, GASISR_Epilogue, GASISR_Done): New
> define_constants.
> (gasisr, *gasisr): New expander and insn.
> * config/avr/gen-avr-mmcu-specs.c (print_mcu)
> [HAVE_AS_AVR_MGCCISR_OPTION]: Print asm_gccisr spec.
> * config/avr/specs.h (ASM_SPEC) : Add sub spec.


Re: [PATCHv2][PR 57371] Remove useless floating point casts in comparisons

2017-07-07 Thread Yuri Gribov
On Sat, Jul 8, 2017 at 5:30 AM, Yuri Gribov  wrote:
> On Fri, Jul 7, 2017 at 11:51 PM, Joseph Myers  wrote:
>> On Fri, 7 Jul 2017, Yuri Gribov wrote:
>>
>>> > I suspect infinities would already work with the patch as-is (the logic
>>> > dealing with constants outside the range of the integer type).  I'm less
>>> > clear that NaNs would work properly.  (If the comparison is == or != you
>>> > can optimize it for quiet NaNs, to false and true respectively.  If it's a
>>> > signaling NaN, or < <= > >=, optimizing to false is only valid with
>>> > -fno-trapping-math, as it would lose an "invalid" exception.)
>>>
>>> It's actually under -fsignaling-nans (which is off by default). I've
>>
>> No, ordered comparisons with qNaNs should result in exceptions,
>
> I assume you mean sNaNs.
>
>> so it's not valid by default to optimize them to false (whereas it is valid 
>> for
>> equality comparisons, as those only raise exceptions for signaling NaNs).
>
> I'm afraid this is default GCC behavior atm - e.g. check existing
>/* a CMP (-0) -> a CMP 0  */
>...
>(if (REAL_VALUE_ISNAN (TREE_REAL_CST (@1))
> && ! HONOR_SNANS (@1))
> { constant_boolean_node (cmp == NE_EXPR, type); })
> (this pattern causes testcase in my patch pr53731-5.c to be optimized).
>
> Or documentation for -fsignaling-nans which says that "the default is
> -fno-signaling-nans" and it "may change the number of exceptions
> visible with signaling NaNs".
>
> In any case, decision to optimize sNaNs is done in HONOR_NANS macro
> which my code duly checks

Actually I should probably change this to be HONOR_SNANS to enable sNaN
optimization by default (like other matchers do).

> so I'm not really sure what else needs to be
> done about the discussed patch in this regard.
>
> -Y


Re: [PATCHv2][PR 57371] Remove useless floating point casts in comparisons

2017-07-07 Thread Yuri Gribov
On Fri, Jul 7, 2017 at 11:51 PM, Joseph Myers  wrote:
> On Fri, 7 Jul 2017, Yuri Gribov wrote:
>
>> > I suspect infinities would already work with the patch as-is (the logic
>> > dealing with constants outside the range of the integer type).  I'm less
>> > clear that NaNs would work properly.  (If the comparison is == or != you
>> > can optimize it for quiet NaNs, to false and true respectively.  If it's a
>> > signaling NaN, or < <= > >=, optimizing to false is only valid with
>> > -fno-trapping-math, as it would lose an "invalid" exception.)
>>
>> It's actually under -fsignaling-nans (which is off by default). I've
>
> No, ordered comparisons with qNaNs should result in exceptions,

I assume you mean sNaNs.

> so it's not valid by default to optimize them to false (whereas it is valid 
> for
> equality comparisons, as those only raise exceptions for signaling NaNs).

I'm afraid this is default GCC behavior atm - e.g. check existing
   /* a CMP (-0) -> a CMP 0  */
   ...
   (if (REAL_VALUE_ISNAN (TREE_REAL_CST (@1))
&& ! HONOR_SNANS (@1))
{ constant_boolean_node (cmp == NE_EXPR, type); })
(this pattern causes testcase in my patch pr53731-5.c to be optimized).

Or documentation for -fsignaling-nans which says that "the default is
-fno-signaling-nans" and it "may change the number of exceptions
visible with signaling NaNs".

In any case, the decision to optimize sNaNs is made in the HONOR_NANS macro,
which my code duly checks, so I'm not really sure what else needs to be
done about the discussed patch in this regard.

-Y


[PATCH] i386: Avoid stack realignment if possible

2017-07-07 Thread H.J. Lu
On Fri, Jul 07, 2017 at 09:58:42AM -0700, H.J. Lu wrote:
> On Fri, Dec 20, 2013 at 8:06 AM, Jakub Jelinek  wrote:
> > Hi!
> >
> > Honza recently changed the i?86 backend, so that it often doesn't
> > do -maccumulate-outgoing-args by default on x86_64.
> > Unfortunately, on some of the here included testcases this regressed
> > quite a bit the generated code.  As AVX vectors are used, the dynamic
> > realignment code needs to assume e.g. that some of them will need to be
> > spilled, and for -mno-accumulate-outgoing-args the code needs to set
> > need_drap early as well.  But when emitting the prologue/epilogue,
> > if need_drap is set, we don't perform the optimization for leaf functions
> > which have zero size stack frame, thus we end up with uselessly doing
> > dynamic stack realignment, setting up DRAP that nothing uses and later on
> > restore everything back.
> >
> > This patch improves it, if the DRAP register isn't live at the start of
> > entry bb successor and we aren't going to realign the stack, we don't
> > need DRAP at all, and even if we need DRAP register, that can't be the sole
> > reason for doing stack realignment, the prologue code is able to set up DRAP
> > even without dynamic stack realignment.
> >
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> >
> > 2013-12-20  Jakub Jelinek  
> >
> > PR target/59501
> > * config/i386/i386.c (ix86_save_reg): Don't return true for drap_reg
> > if !crtl->stack_realign_needed.
> > (ix86_finalize_stack_realign_flags): If drap_reg isn't live on entry
> > and stack_realign_needed will be false, clear drap_reg and 
> > need_drap.
> > Optimize leaf functions that don't need stack frame even if
> > crtl->need_drap.
> >
> > * gcc.target/i386/pr59501-1.c: New test.
> > * gcc.target/i386/pr59501-1a.c: New test.
> > * gcc.target/i386/pr59501-2.c: New test.
> > * gcc.target/i386/pr59501-2a.c: New test.
> > * gcc.target/i386/pr59501-3.c: New test.
> > * gcc.target/i386/pr59501-3a.c: New test.
> > * gcc.target/i386/pr59501-4.c: New test.
> > * gcc.target/i386/pr59501-4a.c: New test.
> > * gcc.target/i386/pr59501-5.c: New test.
> > * gcc.target/i386/pr59501-6.c: New test.
> >
> >
> > --- gcc/testsuite/gcc.target/i386/pr59501-4a.c.jj   2013-12-20 
> > 12:19:20.603212859 +0100
> > +++ gcc/testsuite/gcc.target/i386/pr59501-4a.c  2013-12-20 
> > 12:23:33.647881672 +0100
> > @@ -0,0 +1,8 @@
> > +/* PR target/59501 */
> > +/* { dg-do compile { target { ! ia32 } } } */
> > +/* { dg-options "-O2 -mavx -maccumulate-outgoing-args" } */
> > +
> > +#include "pr59501-3a.c"
> > +
> > +/* Verify no dynamic realignment is performed.  */
> > +/* { dg-final { scan-assembler-not "and\[^\n\r]*sp" { xfail *-*-* } } } */
> >
> 
> Since DRAP isn't used with -maccumulate-outgoing-args, pr59501-4a.c was
> xfailed due to stack frame access via frame pointer instead of DRAP.
> This patch finds the maximum stack alignment from the stack frame access
> instructions and avoids stack realignment if stack alignment needed is
> less than incoming stack boundary.
> 
> I am testing this patch.  OK for trunk if there is no regression?
> 
> 

We need to keep the preferred stack alignment as the minimum stack
alignment. Here is the updated patch.  Tested on x86-64.  OK for
trunk?

Thanks.

H.J.
---
Since DRAP isn't used with -maccumulate-outgoing-args, pr59501-4a.c was
xfailed due to stack frame access via frame pointer instead of DRAP.
This patch finds the maximum stack alignment from the stack frame access
instructions and avoids stack realignment if stack alignment needed is
less than incoming stack boundary.
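
The decision described above can be modeled roughly as follows.  This is
only a sketch and not the code from i386.c: the function names, the array
of per-access alignments and the treatment of the "equal" case are
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model.  Alignments are given in bits (an assumption of
   this sketch).  ACCESS_ALIGN lists the alignment each stack frame
   access needs; PREFERRED acts as the floor, so the result is never
   smaller than the preferred stack alignment.  */
static unsigned
max_stack_alignment (unsigned preferred, const unsigned *access_align,
		     size_t n)
{
  unsigned align = preferred;
  for (size_t i = 0; i < n; i++)
    if (access_align[i] > align)
      align = access_align[i];
  return align;
}

/* Realign only when the frame needs more than the caller guarantees.
   Assumption: an alignment equal to the incoming one needs no
   realignment either.  */
static int
needs_realign (unsigned needed, unsigned incoming)
{
  return needed > incoming;
}

static void
check_realign (void)
{
  const unsigned sse_only[] = { 64, 128 };
  const unsigned with_avx[] = { 128, 256 };

  assert (max_stack_alignment (128, sse_only, 2) == 128);
  assert (!needs_realign (128, 128));
  assert (max_stack_alignment (128, with_avx, 2) == 256);
  assert (needs_realign (256, 128));
}
```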

gcc/

PR target/59501
* config/i386/i386.c (ix86_finalize_stack_realign_flags): Don't
realign stack if stack alignment needed is less than incoming
stack boundary.

gcc/testsuite/

PR target/59501
* gcc.target/i386/pr59501-4a.c: Remove xfail.
---
 gcc/config/i386/i386.c | 84 +++---
 gcc/testsuite/gcc.target/i386/pr59501-4a.c |  2 +-
 2 files changed, 56 insertions(+), 30 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index b041524..28febd0 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -14161,6 +14161,11 @@ ix86_finalize_stack_realign_flags (void)
   add_to_hard_reg_set (&set_up_by_prologue, Pmode, ARG_POINTER_REGNUM);
   add_to_hard_reg_set (&set_up_by_prologue, Pmode,
   HARD_FRAME_POINTER_REGNUM);
+
+  /* The preferred stack alignment is the minimum stack alignment.  */
+  unsigned int stack_alignment = crtl->preferred_stack_boundary;
+  bool require_stack_frame = false;
+
   FOR_EACH_BB_FN (bb, cfun)
 {
   rtx_insn *insn;
@@ -14169,43 +14174,64 @@ ix86_finalize_stack_realign_flags (void)
  

Re: [PATCH, rs6000] Modify libgcc's float128 IFUNC resolver functions to use __builtin_cpu_supports()

2017-07-07 Thread Peter Bergner
On 7/7/17 4:13 PM, Peter Bergner wrote:
> On 7/7/17 10:18 AM, Segher Boessenkool wrote:
>> On Thu, Jul 06, 2017 at 04:21:48PM -0500, Peter Bergner wrote:
>>> * config/rs6000/float128-ifunc.c: Don't include auxv.h.
>>> (have_ieee_hw_p): Delete function.
>>> (SW_OR_HW): Use __builtin_cpu_supports().
>>
>> Okay for trunk.  Thanks!
> 
> Given Florian wants this in now to fix a Fedora blocker, I have
> committed this to trunk.  We need the fix on GCC 6 and GCC 5 as
> well, so ok there assuming bootstrapping / regtesting are clean?

FYI, the bootstrap and regtesting were clean on both the GCC 7 and
GCC 6 release branches.  Ok for those branches?

Peter





[patch] Fix Unwind support on DragonFly BSD after sigtramp move

2017-07-07 Thread John Marino
Right after DragonFly 4.8 was released (27 Mar 2017), the signal 
trampoline was moved (twice) in response to a Ryzen bug.  This broke 
GCC's unwind support for DragonFly.


To avoid hardcoding the sigtramp location and running into issues like
this in the future, a new sysctl was added to DragonFly that returns the
signal trampoline address range (FreeBSD has a similar sysctl for similar
reasons).  The attached patch fixes unwind support for current
DragonFly and maintains support for Release 4.8 and earlier.
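
The "query once, cache the range, treat a failed query as outside" logic
can be sketched portably as below.  The query function is a hypothetical
stand-in for the sysctl call and returns a made-up address range, so this
shows only the control flow, not the DragonFly interface.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the sysctl query; returns 0 on success.
   The address range is invented for this sketch.  */
static int lookup_calls;

static int
query_sigtramp_range (uintptr_t *start, uintptr_t *end)
{
  lookup_calls++;
  *start = 0x1000;
  *end = 0x1020;
  return 0;
}

/* State: 0 = not yet queried, 1 = query failed, 2 = range known.  */
static int
outside_sigtramp_range (uintptr_t pc)
{
  static int state = 0;
  static uintptr_t start, end;

  if (state == 0)
    state = query_sigtramp_range (&start, &end) == 0 ? 2 : 1;
  if (state < 2)
    return 1;			/* on failure, report everything as outside */
  return pc < start || pc >= end;
}

static void
check_outside_sigtramp_range (void)
{
  assert (outside_sigtramp_range (0x1000) == 0);  /* first byte */
  assert (outside_sigtramp_range (0x101f) == 0);  /* last byte */
  assert (outside_sigtramp_range (0x1020) == 1);  /* end is exclusive */
  assert (outside_sigtramp_range (0x0fff) == 1);
  assert (lookup_calls == 1);			  /* queried only once */
}
```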


This patch has been in use for a few months and works fine.  It is 
similar in function to the FreeBSD Aarch64 unwind support I submitted 
through Andreas T. a few months ago.


I believe the patch can be applied to trunk and release 7 branch.
I am the closest thing to a maintainer for DragonFly, so I don't know if 
additional approval is needed.  This patch is purely DragonFly-specific 
and cannot affect other platforms in any way.


If agreed, it would be great if somebody could commit this for me 
against the trunk and GCC-7-branch.


Thanks!
John

P.S.  Yes, my copyright assignment is on file (I've contributed a few 
patches already).


suggested log entry of libgcc/ChangeLog:

2017-07-XX  John Marino  
   * config/i386/dragonfly-unwind.h: Handle sigtramp relocation.
--- libgcc/config/i386/dragonfly-unwind.h.orig	2017-02-06 16:26:52 UTC
+++ libgcc/config/i386/dragonfly-unwind.h
@@ -28,9 +28,13 @@ see the files COPYING3 and COPYING.RUNTI
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
+#if __DragonFly_version > 400800
+#include 
+#endif
 
 
 #define REG_NAME(reg)	sf_uc.uc_mcontext.mc_## reg
@@ -39,20 +43,44 @@ see the files COPYING3 and COPYING.RUNTI
 #define MD_FALLBACK_FRAME_STATE_FOR x86_64_dragonfly_fallback_frame_state
 
 
-static void
-x86_64_sigtramp_range (unsigned char **start, unsigned char **end)
+static int
+x86_64_outside_sigtramp_range (unsigned char *pc)
 {
-  unsigned long ps_strings;
-  int mib[2];
-  size_t len;
-
-  mib[0] = CTL_KERN;
-  mib[1] = KERN_PS_STRINGS;
-  len = sizeof (ps_strings);
-  sysctl (mib, 2, &ps_strings, &len, NULL, 0);
+  static int sigtramp_range_determined = 0;
+  static unsigned char *sigtramp_start, *sigtramp_end;
 
-  *start = (unsigned char *)ps_strings - 32;
-  *end   = (unsigned char *)ps_strings;
+  if (sigtramp_range_determined == 0)
+{
+#if __DragonFly_version > 400800
+  struct kinfo_sigtramp kst = {0};
+  size_t len = sizeof (kst);
+  int mib[3] = { CTL_KERN, KERN_PROC, KERN_PROC_SIGTRAMP };
+
+  sigtramp_range_determined = 1;
+  if (sysctl (mib, 3, &kst, &len, NULL, 0) == 0)
+  {
+sigtramp_range_determined = 2;
+sigtramp_start = kst.ksigtramp_start;
+sigtramp_end   = kst.ksigtramp_end;
+  }
+#else
+  unsigned long ps_strings;
+  size_t len = sizeof (ps_strings);
+  int mib[2] = { CTL_KERN, KERN_PS_STRINGS };
+  
+  sigtramp_range_determined = 1;
+  if (sysctl (mib, 2, &ps_strings, &len, NULL, 0) == 0)
+  {
+sigtramp_range_determined = 2;
+sigtramp_start = (unsigned char *)ps_strings - 32;
+sigtramp_end   = (unsigned char *)ps_strings;
+  }
+#endif
+}
+  if (sigtramp_range_determined < 2)  /* sysctl failed if < 2 */
+return 1;
+
+  return (pc < sigtramp_start || pc >= sigtramp_end );
 }
 
 
@@ -60,13 +88,10 @@ static _Unwind_Reason_Code
 x86_64_dragonfly_fallback_frame_state
 (struct _Unwind_Context *context, _Unwind_FrameState *fs)
 {
-  unsigned char *pc = context->ra;
-  unsigned char *sigtramp_start, *sigtramp_end;
   struct sigframe *sf;
   long new_cfa;
 
-  x86_64_sigtramp_range(&sigtramp_start, &sigtramp_end);
-  if (pc >= sigtramp_end || pc < sigtramp_start)
+  if (x86_64_outside_sigtramp_range(context->ra))
 return _URC_END_OF_STACK;
 
   sf = (struct sigframe *) context->cfa;


[Committed/AARCH64] Fix ICE with -mcpu=thunderx2t99

2017-07-07 Thread Andrew Pinski
Hi,
  After https://gcc.gnu.org/ml/gcc-cvs/2017-06/msg01066.html, there
were many crashes with -mcpu=thunderx2t99.  This patch fixes the
crashes.

Committed after bootstrap and test.

Thanks,
Andrew Pinski

ChangeLog:
* config/aarch64/aarch64.c (aarch_macro_fusion_pair_p): Check prev_set
and curr_set for AARCH64_FUSE_ALU_BRANCH.
Index: config/aarch64/aarch64.c
===
--- config/aarch64/aarch64.c(revision 250059)
+++ config/aarch64/aarch64.c(working copy)
@@ -14324,7 +14324,9 @@
}
 }
 
-  if (aarch64_fusion_enabled_p (AARCH64_FUSE_ALU_BRANCH)
+  if (prev_set
+  && curr_set
+  && aarch64_fusion_enabled_p (AARCH64_FUSE_ALU_BRANCH)
   && any_condjump_p (curr))
 {
   /* We're trying to match:


Re: [PATCHv2][PR 57371] Remove useless floating point casts in comparisons

2017-07-07 Thread Joseph Myers
On Fri, 7 Jul 2017, Yuri Gribov wrote:

> > I suspect infinities would already work with the patch as-is (the logic
> > dealing with constants outside the range of the integer type).  I'm less
> > clear that NaNs would work properly.  (If the comparison is == or != you
> > can optimize it for quiet NaNs, to false and true respectively.  If it's a
> > signaling NaN, or < <= > >=, optimizing to false is only valid with
> > -fno-trapping-math, as it would lose an "invalid" exception.)
> 
> It's actually under -fsignaling-nans (which is off by default). I've

No, ordered comparisons with qNaNs should result in exceptions, so it's 
not valid by default to optimize them to false (whereas it is valid for 
equality comparisons, as those only raise exceptions for signaling NaNs).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCHv2][PR 57371] Remove useless floating point casts in comparisons

2017-07-07 Thread Yuri Gribov
On Fri, Jul 7, 2017 at 6:07 PM, Joseph Myers  wrote:
> On Fri, 7 Jul 2017, Yuri Gribov wrote:
>
>> Hi all,
>>
>> This is an updated version of patch in
>> https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00034.html . It should
>> be much more complete, both in functionality and in tests.
>
> I think there should be tests when the constant is an infinity (of either
> sign) or NaN (quiet or signaling).

Indeed.

> I suspect infinities would already work with the patch as-is (the logic
> dealing with constants outside the range of the integer type).  I'm less
> clear that NaNs would work properly.  (If the comparison is == or != you
> can optimize it for quiet NaNs, to false and true respectively.  If it's a
> signaling NaN, or < <= > >=, optimizing to false is only valid with
> -fno-trapping-math, as it would lose an "invalid" exception.)

It's actually under -fsignaling-nans (which is off by default). I've
attached updated patch, tests are still running but perhaps you could
take a look?

-Y


pr57371-3.patch
Description: Binary data
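
For illustration, the kind of equivalence this thread is about can be
written out in plain C.  This is a sketch, not the patch itself: the
function names are made up, and it assumes that double represents every
int exactly (true for 32-bit int and IEEE double).  Only == and != are
shown for the NaN case, since folding the ordered comparisons would drop
the "invalid" exception discussed above.

```c
#include <assert.h>
#include <limits.h>
#include <math.h>		/* NAN and INFINITY macros only */

/* (double) i < 2.5 holds exactly when i <= 2, so the cast is not
   needed to decide the comparison.  */
static int lt_with_cast (int i) { return (double) i < 2.5; }
static int lt_without_cast (int i) { return i <= 2; }

/* A constant outside the range of int (infinity is the extreme case):
   every int is smaller, so the comparison is always true.  */
static int lt_infinity (int i) { return (double) i < INFINITY; }

/* Quiet NaN constant: == is always false and != is always true.  */
static int eq_nan (int i) { return (double) i == (double) NAN; }
static int ne_nan (int i) { return (double) i != (double) NAN; }

static void
check_equivalences (void)
{
  static const int samples[] = { INT_MIN, -3, 0, 2, 3, 7, INT_MAX };

  for (unsigned k = 0; k < sizeof samples / sizeof samples[0]; k++)
    {
      int i = samples[k];
      assert (lt_with_cast (i) == lt_without_cast (i));
      assert (lt_infinity (i) == 1);
      assert (eq_nan (i) == 0);
      assert (ne_nan (i) == 1);
    }
}
```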


Re: [PATCH, rs6000] Modify libgcc's float128 IFUNC resolver functions to use __builtin_cpu_supports()

2017-07-07 Thread Peter Bergner
On 7/7/17 10:18 AM, Segher Boessenkool wrote:
> On Thu, Jul 06, 2017 at 04:21:48PM -0500, Peter Bergner wrote:
>>  * config/rs6000/float128-ifunc.c: Don't include auxv.h.
>>  (have_ieee_hw_p): Delete function.
>>  (SW_OR_HW): Use __builtin_cpu_supports().
> 
> Okay for trunk.  Thanks!

Given Florian wants this in now to fix a Fedora blocker, I have
committed this to trunk.  We need the fix on GCC 6 and GCC 5 as
well, so ok there assuming bootstrapping / regtesting are clean?

Peter



[PATCH, rs6000] Fix builtins-1-p9-runnable.c

2017-07-07 Thread Carl Love
GCC Maintainers:

The following patch causes builtins-1-p9-runnable.c to be reported
as "unsupported test" rather than "unexpected fail" on non-power9
systems.  The patched test does compile and run successfully on Power 9
with a report of 2 "expected passes".  I was expecting the test to fail
on non power 9 systems, but this fix makes the test results much
cleaner.

  Carl Love




2017-07-07  Carl Love  

* gcc.target/powerpc/builtins-1-p9-runnable.c (dg-do run): Add
lp64 && p9vector_hw.
---
 gcc/testsuite/gcc.target/powerpc/builtins-1-p9-runnable.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-1-p9-runnable.c
b/gcc/testsuite/gcc.target/powerpc/builtins-1-p9-runnable.c
index 790f64c..acaebb6 100644
--- a/gcc/testsuite/gcc.target/powerpc/builtins-1-p9-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-1-p9-runnable.c
@@ -1,4 +1,4 @@
-/* { dg-do run { target { powerpc*-*-linux* } } } */
+/* { dg-do run { target { powerpc*-*-linux* && { lp64 &&
p9vector_hw } } } } */
 /* { dg-require-effective-target powerpc_p9vector_ok } */
 /* { dg-options "-O2 -mcpu=power9" } */
 /* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" }
{ "-mcpu=power9" } } */
-- 
1.9.1





Re: [PATCH] gcc/doc: list what version each attribute was introduced in

2017-07-07 Thread Mike Stump
On Jul 7, 2017, at 10:01 AM, Jeff Law  wrote:
> 
> On 07/06/2017 07:25 AM, Daniel P. Berrange wrote:
>> There are several hundred named attribute keys that have been
>> introduced over many GCC releases. Applications typically need
>> to be compilable with multiple GCC versions, so it is important
>> for developers to know when GCC introduced support for each
>> attribute.

> Keying on version #s is generally a terrible way to make your code
> portable.

> It's far better to actually *test* what your particular compiler
> supports

So, if someone wanted to explore ways to make code that uses these better; a 
possibility might be to use __has_builtin a la clang:

  https://clang.llvm.org/docs/LanguageExtensions.html

It also has __has_feature and __has_extension.  At least it seems reasonably 
complete, and then people can feature test specific bits on a fine grained 
basis.

It doesn't solve history (without a re-release of old versions), but, it can 
provide a framework for solving the problem for the future.
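
A minimal sketch of that style of feature testing follows.  The fallback
definitions make the code build on compilers that lack the macros.
__has_attribute is not mentioned above; it is included here as an
assumption about what fits, since the original patch is about attributes.
MY_NOINLINE and popcount32 are made-up names.

```c
#include <assert.h>

/* Compilers without these feature-test macros see every query as "no".  */
#ifndef __has_builtin
# define __has_builtin(x) 0
#endif
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Use the attribute only where the compiler reports support for it.  */
#if __has_attribute(noinline)
# define MY_NOINLINE __attribute__ ((noinline))
#else
# define MY_NOINLINE
#endif

/* Count set bits, using the builtin when available and a portable
   loop otherwise.  */
MY_NOINLINE static unsigned
popcount32 (unsigned v)
{
#if __has_builtin(__builtin_popcount)
  return (unsigned) __builtin_popcount (v);
#else
  unsigned n = 0;
  for (; v != 0; v &= v - 1)	/* clear the lowest set bit */
    n++;
  return n;
#endif
}

static void
check_popcount32 (void)
{
  assert (popcount32 (0u) == 0);
  assert (popcount32 (0xffu) == 8);
  assert (popcount32 (0x80000001u) == 2);
}
```

Both branches give the same result, so the test of what the compiler
supports only affects which code is generated, not the behavior.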

[PATCH v12] add -fpatchable-function-entry=N,M option

2017-07-07 Thread Torsten Duwe
Change since v11:

< +#if TARGET_HAVE_NAMED_SECTIONS

> +  if (record_p && targetm_common.have_named_sections)

(plus > +#include "common/common-target.h" )

Torsten
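
For orientation, a possible use of the attribute added by this patch is
sketched below.  The PATCHABLE macro and traced_add are made-up names,
and the __has_attribute guard is an assumption added so that the code
still builds with compilers that do not know the attribute.

```c
#include <assert.h>

#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Request N NOPs for a function, with the entry point placed M NOPs
   into that area.  Expands to nothing where the attribute is unknown.  */
#if __has_attribute(patchable_function_entry)
# define PATCHABLE(n, m) __attribute__ ((patchable_function_entry (n, m)))
#else
# define PATCHABLE(n, m)
#endif

/* Two NOPs in total, one of them before the entry point.  The padding
   does not change what the function computes.  */
PATCHABLE (2, 1) static int
traced_add (int a, int b)
{
  return a + b;
}

static void
check_traced_add (void)
{
  assert (traced_add (2, 3) == 5);
  assert (traced_add (-4, 4) == 0);
}
```

The same padding can be requested for all functions in a translation unit
with -fpatchable-function-entry=2,1 on the command line.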


gcc/c-family/ChangeLog
2017-07-07  Torsten Duwe  

* c-attribs.c (c_common_attribute_table): Add entry for
"patchable_function_entry".

gcc/lto/ChangeLog
2017-07-07  Torsten Duwe  

* lto-lang.c (lto_attribute_table): Add entry for
"patchable_function_entry".

gcc/ChangeLog
2017-07-07  Torsten Duwe  

* common.opt: Introduce -fpatchable-function-entry
command line option, and its variables function_entry_patch_area_size
and function_entry_patch_area_start.
* opts.c (common_handle_option): Add -fpatchable_function_entry_ case,
including a two-value parser.
* target.def (print_patchable_function_entry): New target hook.
* targhooks.h (default_print_patchable_function_entry): New function.
* targhooks.c (default_print_patchable_function_entry): Likewise.
* toplev.c (process_options): Switch off IPA-RA if
patchable function entries are being generated.
* varasm.c (assemble_start_function): Look at the
patchable-function-entry command line switch and current
function attributes and maybe generate NOP instructions by
calling the print_patchable_function_entry hook.
* doc/extend.texi: Document patchable_function_entry attribute.
* doc/invoke.texi: Document -fpatchable_function_entry
command line option.
* doc/tm.texi.in (TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY):
New target hook.
* doc/tm.texi: Likewise.

gcc/testsuite/ChangeLog
2017-07-07  Torsten Duwe  

* c-c++-common/patchable_function_entry-default.c: New test.
* c-c++-common/patchable_function_entry-decl.c: Likewise.
* c-c++-common/patchable_function_entry-definition.c: Likewise.

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 626ffa1cde7..ecb00c1d5b9 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -142,6 +142,8 @@ static tree handle_bnd_variable_size_attribute (tree *, 
tree, tree, int, bool *)
 static tree handle_bnd_legacy (tree *, tree, tree, int, bool *);
 static tree handle_bnd_instrument (tree *, tree, tree, int, bool *);
 static tree handle_fallthrough_attribute (tree *, tree, tree, int, bool *);
+static tree handle_patchable_function_entry_attribute (tree *, tree, tree,
+  int, bool *);
 
 /* Table of machine-independent attributes common to all C-like languages.
 
@@ -351,6 +353,9 @@ const struct attribute_spec c_common_attribute_table[] =
  handle_bnd_instrument, false },
   { "fallthrough",   0, 0, false, false, false,
  handle_fallthrough_attribute, false },
+  { "patchable_function_entry",1, 2, true, false, false,
+ handle_patchable_function_entry_attribute,
+ false },
   { NULL, 0, 0, false, false, false, NULL, false }
 };
 
@@ -3260,3 +3265,10 @@ handle_fallthrough_attribute (tree *, tree name, tree, 
int,
   *no_add_attrs = true;
   return NULL_TREE;
 }
+
+static tree
+handle_patchable_function_entry_attribute (tree *, tree, tree, int, bool *)
+{
+  /* Nothing to be done here.  */
+  return NULL_TREE;
+}
diff --git a/gcc/common.opt b/gcc/common.opt
index e81165c488b..78cfa568a95 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -163,6 +163,13 @@ bool flag_stack_usage_info = false
 Variable
 int flag_debug_asm
 
+; How many NOP insns to place at each function entry by default
+Variable
+HOST_WIDE_INT function_entry_patch_area_size
+
+; And how far the real asm entry point is into this area
+Variable
+HOST_WIDE_INT function_entry_patch_area_start
 
 ; Balance between GNAT encodings and standard DWARF to emit.
 Variable
@@ -2030,6 +2037,10 @@ fprofile-reorder-functions
 Common Report Var(flag_profile_reorder_functions)
 Enable function reordering that improves code placement.
 
+fpatchable-function-entry=
+Common Joined Optimization
+Insert NOP instructions at each function entry.
+
 frandom-seed
 Common Var(common_deferred_options) Defer
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 5cb512fe575..9c171abc121 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -3105,6 +3105,27 @@ that affect more than one function.
 This attribute should be used for debugging purposes only.  It is not
 suitable in production code.
 
+@item patchable_function_entry
+@cindex @code{patchable_function_entry} function attribute
+@cindex extra NOP instructions at the function entry point
+In case the target's text segment can be made writable at run time by
+any means, padding the function entry with a number of NOPs can be
+used to provide a universal 

[PATCH] document IntegerRange in internals manual

2017-07-07 Thread Martin Sebor

A conflict in my patch for bug 81345 made me notice that r249734
recently added a new option property, IntegerRange.  The change
below adds brief documentation of the property to the manual.

Martin, can you please check to make sure I didn't miss anything?

Btw., while experimenting with the property I noticed that there
is no error when an option that specifies IntegerRange is set in
the .opt file to a value outside that range.  Would it be hard
to add some checks to the awk scripts to validate that the
argument values are in the range?  It might help avoid bugs
similar to 81345.

By way of an example, the following invalid specification
is accepted but then causes errors when GCC runs.

Wfoobar
C ObjC C++ ObjC++ Warning Alias(Wfoobar=, 1, 0)

Wfoobar=
C ObjC C++ ObjC++ Joined RejectNegative UInteger Var(warn_foobar) 
Warning LangEnabledBy(C ObjC C++ ObjC++, Wall, 2, 0) Init (7) 
IntegerRange(3, 5)
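
The missing check amounts to something like the following.  It is written
in C purely as a sketch of the logic; the real option machinery is driven
by awk scripts, and the struct and function here are hypothetical.

```c
#include <assert.h>

/* Hypothetical, simplified view of one option record.  */
struct opt_spec
{
  const char *name;
  int init;			/* value from Init(...) */
  int min, max;			/* bounds from IntegerRange(min, max), inclusive */
};

/* Return 1 when the initial value lies inside the declared range.  */
static int
init_in_range (const struct opt_spec *o)
{
  return o->min <= o->init && o->init <= o->max;
}

static void
check_init_in_range (void)
{
  /* Mirrors the example above: Init (7) with IntegerRange(3, 5).  */
  const struct opt_spec bad = { "Wfoobar=", 7, 3, 5 };
  const struct opt_spec good = { "Wfoobar=", 4, 3, 5 };

  assert (init_in_range (&bad) == 0);
  assert (init_in_range (&good) == 1);
}
```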


Martin

diff --git a/gcc/doc/options.texi b/gcc/doc/options.texi
index 3b68aab..af56e9f 100644
--- a/gcc/doc/options.texi
+++ b/gcc/doc/options.texi
@@ -264,6 +264,12 @@ option handler.  @code{UInteger} should also be 
used on options like

 @code{-falign-loops}=@var{n} are supported to make sure the saved
 options are given a full integer.

+@item IntegerRange(@var{min}, @var{max})
+The option's integer argument is expected to be in the range specified
+by @var{min} and @var{max}, inclusive.  The option parser will check
+and reject option arguments that are outside the range before passing
+it to the relevant option handler.
+
 @item ToLower
 The option's argument should be converted to lowercase as part of
 putting it in canonical form, and before comparing with the strings


[committed] libcpp: preserve ranges within macro expansions (PR c++/79300)

2017-07-07 Thread David Malcolm
Comment #4 of PR c++/79300 noted a place in which we fail to underline
the full token of a macro name in maybe_unwind_expanded_macro_loc.

Root cause is that linemap_macro_loc_to_def_point drops range
information too early when following the macro expansions; fixed thusly.

Successfully bootstrapped on x86_64-pc-linux-gnu.

Committed to trunk as r250058.

gcc/testsuite/ChangeLog:
PR c++/79300
* g++.dg/diagnostic/pr79300.C: New test case.

libcpp/ChangeLog:
PR c++/79300
* line-map.c (linemap_macro_loc_to_def_point): Preserve range
information for macro expansions by delaying resolving ad-hoc
locations until within the loop.
---
 gcc/testsuite/g++.dg/diagnostic/pr79300.C | 44 +++
 libcpp/line-map.c | 14 +-
 2 files changed, 52 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/diagnostic/pr79300.C

diff --git a/gcc/testsuite/g++.dg/diagnostic/pr79300.C 
b/gcc/testsuite/g++.dg/diagnostic/pr79300.C
new file mode 100644
index 000..6805e85
--- /dev/null
+++ b/gcc/testsuite/g++.dg/diagnostic/pr79300.C
@@ -0,0 +1,44 @@
+// { dg-options "-fdiagnostics-show-caret" }
+
+#define TEST_1_DEPTH_0 throw bad_alloc; // { dg-line define_TEST_1_DEPTH_0 }
+
+void test_1 ()
+{
+  TEST_1_DEPTH_0 // { dg-line use_TEST_1_DEPTH_0 }
+}
+
+// { dg-error "'bad_alloc' was not declared in this scope" "" { target *-*-* } define_TEST_1_DEPTH_0 }
+/* { dg-begin-multiline-output "" }
+ #define TEST_1_DEPTH_0 throw bad_alloc;
+  ^
+   { dg-end-multiline-output "" } */
+// { dg-message "in expansion of macro 'TEST_1_DEPTH_0'" "" { target *-*-* } use_TEST_1_DEPTH_0 }
+/* { dg-begin-multiline-output "" }
+   TEST_1_DEPTH_0
+   ^~
+   { dg-end-multiline-output "" } */
+
+
+#define TEST_2_DEPTH_0 throw bad_alloc; // { dg-line define_TEST_2_DEPTH_0 }
+#define TEST_2_DEPTH_1 TEST_2_DEPTH_0 // { dg-line define_TEST_2_DEPTH_1 }
+
+void test_2 ()
+{
+  TEST_2_DEPTH_1 // { dg-line use_TEST_2_DEPTH_1 }
+}
+
+// { dg-error "'bad_alloc' was not declared in this scope" "" { target *-*-* } define_TEST_2_DEPTH_0 }
+/* { dg-begin-multiline-output "" }
+ #define TEST_2_DEPTH_0 throw bad_alloc;
+  ^
+   { dg-end-multiline-output "" } */
+// { dg-message "in expansion of macro 'TEST_2_DEPTH_0'" "" { target *-*-* } define_TEST_2_DEPTH_1 }
+/* { dg-begin-multiline-output "" }
+ #define TEST_2_DEPTH_1 TEST_2_DEPTH_0
+^~
+   { dg-end-multiline-output "" } */
+// { dg-message "in expansion of macro 'TEST_2_DEPTH_1'" "" { target *-*-* } use_TEST_2_DEPTH_1 }
+/* { dg-begin-multiline-output "" }
+   TEST_2_DEPTH_1
+   ^~
+   { dg-end-multiline-output "" } */
diff --git a/libcpp/line-map.c b/libcpp/line-map.c
index 1c1acc8..300cfd0 100644
--- a/libcpp/line-map.c
+++ b/libcpp/line-map.c
@@ -1439,21 +1439,23 @@ linemap_macro_loc_to_def_point (struct line_maps *set,
 {
   struct line_map *map;
 
-  if (IS_ADHOC_LOC (location))
-location = set->location_adhoc_data_map.data[location
-& MAX_SOURCE_LOCATION].locus;
-
   linemap_assert (set && location >= RESERVED_LOCATION_COUNT);
 
   while (true)
 {
-  map = const_cast <line_map *> (linemap_lookup (set, location));
+  source_location caret_loc;
+  if (IS_ADHOC_LOC (location))
+   caret_loc = get_location_from_adhoc_loc (set, location);
+  else
+   caret_loc = location;
+
+  map = const_cast <line_map *> (linemap_lookup (set, caret_loc));
   if (!linemap_macro_expansion_map_p (map))
break;
 
   location =
linemap_macro_map_loc_to_def_point (linemap_check_macro (map),
-   location);
+   caret_loc);
 }
 
   if (original_map)
-- 
1.8.5.3



Re: [PATCH], PowerPC target_clones minor support

2017-07-07 Thread Michael Meissner
On Fri, Jul 07, 2017 at 07:22:04AM -0500, Segher Boessenkool wrote:
> On Wed, Jun 28, 2017 at 02:28:23PM -0400, Michael Meissner wrote:
> > Some minor changes to the PowerPC target_clones support:
> > 
> > 1) I added a warning if target_clones was used and the compiler was configured
> > with an older glibc where __builtin_cpu_supports always returns 0;
> > 
> > 2) I reworked how the ifunc resolver function is generated, and always made it
> > a static function;
> > 
> > 3) I added an executable target_clones test, and I made both clone tests
> > dependent on GCC being configured with a new glibc.
> 
> > * config/rs6000/rs6000.c
> > (rs6000_get_function_versions_dispatcher): Add warning if the
> > compiler is not configured to use at least GLIBC version 2.23.
> 
> Please say what is really tested for here (namely,
> TARGET_LIBC_PROVIDES_HWCAP_IN_TCB).

I've reworded both the warning message and the ChangeLog entry.
 
> >/* Append the filename to the resolver function if the versions are
> >   not externally visible.  This is because the resolver function has
> >   to be externally visible for the loader to find it.  So, appending
> >   the filename will prevent conflicts with a resolver function from
> >   another module which is based on the same version name.  */
> > -  char *resolver_name = make_unique_name (default_decl, "resolver", is_uniq);
> > +  tree decl_name = clone_function_name (default_decl, "resolver");
> > +  const char *resolver_name = IDENTIFIER_POINTER (decl_name);
> 
> I think the comment needs some updating now?

Yes.

> > --- gcc/testsuite/gcc.target/powerpc/clone2.c   
> > (.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc)
> >  (revision 0)
> > +++ gcc/testsuite/gcc.target/powerpc/clone2.c   
> > (.../gcc/testsuite/gcc.target/powerpc)  (revision 249738)
> > @@ -0,0 +1,31 @@
> > +/* { dg-do run { target { powerpc*-*-linux* } } } */
> > +/* { dg-options "-mvsx -O2" } */
> > +/* { dg-require-effective-target powerpc_p9vector_ok } */
> > +/* { dg-require-effective-target ppc_cpu_supports_hw } */
> 
> What a funny name (it reads as "the CPU supports the hardware").  Yes
> I'm easily amused ;-)
> 
> The patch is okay for trunk with those things looked at.  Sorry
> for the slow review.

Here is the patch I committed:

[gcc]
2017-07-07  Michael Meissner  

* config/rs6000/rs6000.c (rs6000_get_function_versions_dispatcher):
Add warning if GCC was not configured to link against a GLIBC that
exports the hardware capability bits.
(make_resolver_func): Make resolver function private and not a
COMDAT function.  Create the name with clone_function_name instead
of make_unique_name.

[gcc/testsuite]
2017-07-07  Michael Meissner  

* gcc.target/powerpc/clone1.c: Add check to make sure the
__builtin_cpu_supports function is fully supported.
* gcc.target/powerpc/clone2.c: New runtime test for
target_clones.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 250054)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -37283,6 +37283,12 @@ rs6000_get_function_versions_dispatcher 
 
   default_node = default_version_info->this_node;
 
+#ifndef TARGET_LIBC_PROVIDES_HWCAP_IN_TCB
+  warning_at (DECL_SOURCE_LOCATION (default_node->decl), 0,
+ "target_clone needs GLIBC (2.23 and newer) to export hardware "
+ "capability bits");
+#endif
+
   if (targetm.has_ifunc_p ())
 {
   struct cgraph_function_version_info *it_v = NULL;
@@ -37328,29 +37334,19 @@ make_resolver_func (const tree default_d
const tree dispatch_decl,
basic_block *empty_bb)
 {
-  /* IFUNC's have to be globally visible.  So, if the default_decl is
- not, then the name of the IFUNC should be made unique.  */
-  bool is_uniq = (TREE_PUBLIC (default_decl) == 0);
-
-  /* Append the filename to the resolver function if the versions are
- not externally visible.  This is because the resolver function has
- to be externally visible for the loader to find it.  So, appending
- the filename will prevent conflicts with a resolver function from
- another module which is based on the same version name.  */
-  char *resolver_name = make_unique_name (default_decl, "resolver", is_uniq);
-
-  /* The resolver function should return a (void *).  */
+  /* Make the resolver function static.  The resolver function returns
+ void *.  */
+  tree decl_name = clone_function_name (default_decl, "resolver");
+  const char *resolver_name = IDENTIFIER_POINTER (decl_name);
   tree type = build_function_type_list 

Re: Avoid global optimize flag checks in LTO

2017-07-07 Thread Jan Hubicka
> On 7 July 2017 15:31:55 CEST, Jan Hubicka  wrote:
> >Hi,
> >this patch fixes some places where we check global optimize flag rather than
> >doing it per-function. This makes optimization attribute work closer to
> >what one gets when passing the same flag at command line.
> >This requires to run IPA passes even with !optimize, but having fast way through
> >which does mostly nothing except when it sees functions with optimize attributes
> >set.
> 
> Sounds gross.

Yep, supporting units compiled with different optimization flags is not the
prettiest thing. But with LTO they are a sad reality.
> >
> >Bootstrapped/regtested x86_64-linux, committed.
> > 
> 
> >Index: ipa-visibility.c
> >===
> >--- ipa-visibility.c (revision 250021)
> >+++ ipa-visibility.c (working copy)
> >@@ -622,9 +622,12 @@ function_and_variable_visibility (bool w
> >   int flags = flags_from_decl_or_type (node->decl);
> > 
> >  /* Optimize away PURE and CONST constructors and destructors.  */
> >-  if (optimize
> >+  if (node->analyzed
> >+  && (DECL_STATIC_CONSTRUCTOR (node->decl)
> >+  || DECL_STATIC_CONSTRUCTOR (node->decl))
> 
> Typo DECL_STATIC_DESTRUCTOR

Oops, thanks! I will fix it shortly.

Honza
> 
> thanks,


Re: Avoid global optimize flag checks in LTO

2017-07-07 Thread Bernhard Reutner-Fischer
On 7 July 2017 15:31:55 CEST, Jan Hubicka  wrote:
>Hi,
>this patch fixes some places where we check global optimize flag rather than
>doing it per-function. This makes optimization attribute work closer to
>what one gets when passing the same flag at command line.
>This requires to run IPA passes even with !optimize, but having fast way through
>which does mostly nothing except when it sees functions with optimize attributes
>set.

Sounds gross.
>
>Bootstrapped/regtested x86_64-linux, committed.
>   

>Index: ipa-visibility.c
>===
>--- ipa-visibility.c   (revision 250021)
>+++ ipa-visibility.c   (working copy)
>@@ -622,9 +622,12 @@ function_and_variable_visibility (bool w
>   int flags = flags_from_decl_or_type (node->decl);
> 
>  /* Optimize away PURE and CONST constructors and destructors.  */
>-  if (optimize
>+  if (node->analyzed
>+&& (DECL_STATIC_CONSTRUCTOR (node->decl)
>+|| DECL_STATIC_CONSTRUCTOR (node->decl))

Typo DECL_STATIC_DESTRUCTOR

thanks,


Re: [PATCHv2][PR 57371] Remove useless floating point casts in comparisons

2017-07-07 Thread Joseph Myers
On Fri, 7 Jul 2017, Yuri Gribov wrote:

> Hi all,
> 
> This is an updated version of patch in
> https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00034.html . It should
> be much more complete, both in functionality and in tests.

I think there should be tests when the constant is an infinity (of either 
sign) or NaN (quiet or signaling).

I suspect infinities would already work with the patch as-is (the logic 
dealing with constants outside the range of the integer type).  I'm less 
clear that NaNs would work properly.  (If the comparison is == or != you 
can optimize it for quiet NaNs, to false and true respectively.  If it's a 
signaling NaN, or < <= > >=, optimizing to false is only valid with 
-fno-trapping-math, as it would lose an "invalid" exception.)
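
To make the NaN and infinity cases concrete, here is a small C sketch of the value semantics any such folding has to preserve. The helper names are hypothetical, and the sketch says nothing about how the patch itself is written. It checks only the comparison results, not whether an "invalid" exception is raised.

```c
#include <assert.h>
#include <limits.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical helpers: compare an int, converted to double, against a
   floating-point value supplied at run time.  */
static bool int_eq (int i, double c) { return (double) i == c; }
static bool int_ne (int i, double c) { return (double) i != c; }
static bool int_lt (int i, double c) { return (double) i < c; }
static bool int_gt (int i, double c) { return (double) i > c; }

/* Check the results every int must give for a quiet NaN and for
   infinities of either sign.  */
static void
check_special_constants (int i)
{
  assert (!int_eq (i, NAN));      /* == with a quiet NaN is always false.  */
  assert (int_ne (i, NAN));       /* != with a quiet NaN is always true.  */
  assert (!int_lt (i, NAN));      /* Ordered comparisons are false too.  */
  assert (int_lt (i, INFINITY));  /* Every int is below +Inf.  */
  assert (int_gt (i, -INFINITY)); /* ...and above -Inf.  */
}
```

Calling `check_special_constants` with `INT_MIN`, `0`, and `INT_MAX` covers the extremes of the integer range; these results assume the code is not built with `-ffast-math` or a similar option.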

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] gcc/doc: list what version each attribute was introduced in

2017-07-07 Thread Jeff Law
On 07/06/2017 07:25 AM, Daniel P. Berrange wrote:
> There are several hundred named attribute keys that have been
> introduced over many GCC releases. Applications typically need
> to be compilable with multiple GCC versions, so it is important
> for developers to know when GCC introduced support for each
> attribute.
> 
> This augments the texi docs that list attribute keys with
> a note of what version introduced the feature. The version
> information was obtained through archaeology of the GCC source
> repository release tags, back to gcc-4_0_0-release. For
> attributes added in 4.0.0 or later, an explicit version will
> be noted. Any attribute that predates 4.0.0 will simply note
> that it has existed prior to 4.0.0. It is thought there is
> little need to go further back in time than 4.0.0 since few,
> if any, apps will still be using such old compiler versions.
> 
> Where a named attribute can be used in many contexts (ie the
> 'visibility' attribute can be used for both functions or
> variables), it was assumed that the attribute was supported
> in all use contexts at the same time.
> 
> Future patches that add new attributes to GCC should be
> required to follow this new practice, by documenting the
> version.
Keying on version #s is generally a terrible way to make your code
portable.  It's easy to get wrong and due to backporting there's not
always a strong tie between a version number and the existence of a
particular feature.

It's far better to actually *test* what your particular compiler
supports.  I suspect autoconf, for example, probably has some
infrastructure for testing if specific attributes are supported by the
compiler.
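
One way to test for an attribute rather than key on a version number is the `__has_attribute` preprocessor operator, which GCC 5 and later and Clang provide. The sketch below is an illustration only; the macro `EXAMPLE_WARN_UNUSED` and the function `example_parse_digit` are made-up names.

```c
#include <assert.h>

/* Older compilers lack __has_attribute; make it evaluate to 0 there.  */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Use the attribute only if this compiler reports support for it.  */
#if __has_attribute(warn_unused_result)
# define EXAMPLE_WARN_UNUSED __attribute__ ((warn_unused_result))
#else
# define EXAMPLE_WARN_UNUSED
#endif

/* Hypothetical function whose result should not be ignored.  Returns the
   digit value of C, or -1 if C is not a decimal digit.  */
EXAMPLE_WARN_UNUSED static int
example_parse_digit (char c)
{
  int value = (c >= '0' && c <= '9') ? c - '0' : -1;
  assert (value >= -1 && value <= 9);
  return value;
}
```

The code compiles the same way whether or not the attribute exists, so no version table is needed; a configure-time compile test is another way to reach the same result.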

Jeff


Re: [PATCH] Improve i?86/x86_64 prologue_and_epilogue for leaf functions (PR target/59501)

2017-07-07 Thread H.J. Lu
On Fri, Dec 20, 2013 at 8:06 AM, Jakub Jelinek  wrote:
> Hi!
>
> Honza recently changed the i?86 backend, so that it often doesn't
> do -maccumulate-outgoing-args by default on x86_64.
> Unfortunately, on some of the here included testcases this regressed
> quite a bit the generated code.  As AVX vectors are used, the dynamic
> realignment code needs to assume e.g. that some of them will need to be
> spilled, and for -mno-accumulate-outgoing-args the code needs to set
> need_drap early as well.  But when emitting the prologue/epilogue,
> if need_drap is set, we don't perform the optimization for leaf functions
> which have zero size stack frame, thus we end up with uselessly doing
> dynamic stack realignment, setting up DRAP that nothing uses and later on
> restore everything back.
>
> This patch improves it, if the DRAP register isn't live at the start of
> entry bb successor and we aren't going to realign the stack, we don't
> need DRAP at all, and even if we need DRAP register, that can't be the sole
> reason for doing stack realignment, the prologue code is able to set up DRAP
> even without dynamic stack realignment.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2013-12-20  Jakub Jelinek  
>
> PR target/59501
> * config/i386/i386.c (ix86_save_reg): Don't return true for drap_reg
> if !crtl->stack_realign_needed.
> (ix86_finalize_stack_realign_flags): If drap_reg isn't live on entry
> and stack_realign_needed will be false, clear drap_reg and need_drap.
> Optimize leaf functions that don't need stack frame even if
> crtl->need_drap.
>
> * gcc.target/i386/pr59501-1.c: New test.
> * gcc.target/i386/pr59501-1a.c: New test.
> * gcc.target/i386/pr59501-2.c: New test.
> * gcc.target/i386/pr59501-2a.c: New test.
> * gcc.target/i386/pr59501-3.c: New test.
> * gcc.target/i386/pr59501-3a.c: New test.
> * gcc.target/i386/pr59501-4.c: New test.
> * gcc.target/i386/pr59501-4a.c: New test.
> * gcc.target/i386/pr59501-5.c: New test.
> * gcc.target/i386/pr59501-6.c: New test.
>
>
> --- gcc/testsuite/gcc.target/i386/pr59501-4a.c.jj   2013-12-20 12:19:20.603212859 +0100
> +++ gcc/testsuite/gcc.target/i386/pr59501-4a.c  2013-12-20 12:23:33.647881672 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/59501 */
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-O2 -mavx -maccumulate-outgoing-args" } */
> +
> +#include "pr59501-3a.c"
> +
> +/* Verify no dynamic realignment is performed.  */
> +/* { dg-final { scan-assembler-not "and\[^\n\r]*sp" { xfail *-*-* } } } */
>

Since DRAP isn't used with -maccumulate-outgoing-args, pr59501-4a.c was
xfailed due to stack frame access via frame pointer instead of DRAP.
This patch finds the maximum stack alignment from the stack frame access
instructions and avoids stack realignment if stack alignment needed is
less than incoming stack boundary.
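
The idea can be sketched in plain C as below. This is an illustration, not the i386.c change: `max_access_alignment` and `needs_stack_realign` are hypothetical helpers, alignments are given in bits, and treating an alignment equal to the incoming boundary as not needing realignment is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return the largest alignment (in bits) among N stack accesses,
   or 0 if there are none.  */
static unsigned int
max_access_alignment (const unsigned int *align_bits, size_t n)
{
  unsigned int max = 0;
  for (size_t i = 0; i < n; i++)
    if (align_bits[i] > max)
      max = align_bits[i];
  return max;
}

/* Realign only when some access needs more alignment than the incoming
   stack boundary already guarantees.  */
static bool
needs_stack_realign (const unsigned int *align_bits, size_t n,
                     unsigned int incoming_boundary_bits)
{
  assert (incoming_boundary_bits > 0);
  return max_access_alignment (align_bits, n) > incoming_boundary_bits;
}
```

For example, accesses aligned to 64 and 128 bits need no realignment with a 128-bit incoming boundary, while a single 256-bit access does.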

I am testing this patch.  OK for trunk if there is no regression?


-- 
H.J.
From d9844cba6ce20498aab42a32927230cb7b56475d Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Fri, 7 Jul 2017 06:01:16 -0700
Subject: [PATCH] i386: Avoid stack realignment if possible

Since DRAP isn't used with -maccumulate-outgoing-args, pr59501-4a.c was
xfailed due to stack frame access via frame pointer instead of DRAP.
This patch finds the maximum stack alignment from the stack frame access
instructions and avoids stack realignment if stack alignment needed is
less than incoming stack boundary.

gcc/

	PR target/59501
	* config/i386/i386.c (ix86_finalize_stack_realign_flags): Don't
	realign stack if stack alignment needed is less than incoming
	stack boundary.

gcc/testsuite/

	PR target/59501
	* gcc.target/i386/pr59501-4a.c: Remove xfail.
---
 gcc/config/i386/i386.c | 83 +++---
 gcc/testsuite/gcc.target/i386/pr59501-4a.c |  2 +-
 2 files changed, 55 insertions(+), 30 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index b041524..0c61998 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -14161,6 +14161,10 @@ ix86_finalize_stack_realign_flags (void)
   add_to_hard_reg_set (&set_up_by_prologue, Pmode, ARG_POINTER_REGNUM);
   add_to_hard_reg_set (&set_up_by_prologue, Pmode,
 			   HARD_FRAME_POINTER_REGNUM);
+
+  unsigned int stack_alignment = 0;
+  bool require_stack_frame = false;
+
   FOR_EACH_BB_FN (bb, cfun)
 {
   rtx_insn *insn;
@@ -14169,43 +14173,64 @@ ix86_finalize_stack_realign_flags (void)
 		&& requires_stack_frame_p (insn, prologue_used,
 	   set_up_by_prologue))
 	  {
-		if (crtl->stack_realign_needed != stack_realign)
-		  recompute_frame_layout_p = true;
-		crtl->stack_realign_needed = stack_realign;
-		crtl->stack_realign_finalized = true;
-		if (recompute_frame_layout_p)
-		  

Re: [PATCH] prevent -Wall from resetting -Wstringop-overflow=2 to 1 (pr 81345)

2017-07-07 Thread Joseph Myers
This patch is OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH][PR 57371] Remove useless floating point casts in comparisons

2017-07-07 Thread Jeff Law
On 07/03/2017 12:59 PM, Marc Glisse wrote:

>> What happens if @0 is a floating point type?  Based on the variable name
>> "itype" and passing TYPE_PRECISION (itype) to real_to_integer, it seems
>> like you're expecting @0 to be an integer.  If so, you should verify
>> that it really is an integer type.  Seems like a good thing to verify
>> with tests as well.
> 
> @0 is the argument of a FLOAT_EXPR. verify_gimple_assign_unary
> guarantees that it is INTEGRAL_TYPE_P (or VECTOR_INTEGER_TYPE_P but then
> the result would have to be VECTOR_FLOAT_TYPE_P, and since it gets
> compared to REAL_CST... the test SCALAR_FLOAT_TYPE_P is actually
> redundant).
Duh.  I should have realized that.  My bad.

jeff



Re: [PATCH][2/2] PR60510, reduction chain vectorization w/o SLP

2017-07-07 Thread Szabolcs Nagy
On 03/07/17 14:42, Richard Biener wrote:
> 
> The following is the patch enabling non-SLP vectorization of failed SLP
> reduction chains.  It simply dissolves the group composing the SLP
> reduction chain when vect_analyze_slp fails to detect the SLP and then
> fixes up the remaining pieces in reduction vectorization.
> 
> I've made sure that SPEC CPU 2006 is clean on x86_64 (-Ofast 
> -march=haswell, test run only) and gathered some statistics and
> -fopt-info-vec shows 2220 more vectorized loops (from a now total
> of 13483) which is a nice improvement of 15%.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
> 
> One day left to fix fallout before I leave for vacation.
> 
> Richard.
> 
> 2017-07-03  Richard Biener  
> 
>   PR tree-optimization/60510
>   * tree-vect-loop.c (vect_create_epilog_for_reduction): Pass in
>   the scalar reduction PHI and use it.
>   (vectorizable_reduction): Properly guard the single_defuse_cycle
>   path for non-SLP reduction chains where we cannot use it.
>   Rework reduc_def/index and vector type deduction.  Rework
>   vector operand gathering during reduction op code-gen.
>   * tree-vect-slp.c (vect_analyze_slp): For failed SLP reduction
>   chains dissolve the chain and leave it to non-SLP reduction
>   handling.
> 
>   * gfortran.dg/vect/pr60510.f: New testcase.
> 

i saw

FAIL: gfortran.dg/vect/pr60510.f   -O0   scan-tree-dump vect "reduction chain"
FAIL: gfortran.dg/vect/pr60510.f   -O0   scan-tree-dump-times vect "vectorized 1 loops" 2
...

on arm-none-linux-gnueabihf

committed the patch below as obvious:

Index: gcc/testsuite/gfortran.dg/vect/pr60510.f
===
--- gcc/testsuite/gfortran.dg/vect/pr60510.f    (revision 250052)
+++ gcc/testsuite/gfortran.dg/vect/pr60510.f    (working copy)
@@ -1,4 +1,5 @@
 ! { dg-do run }
+! { dg-require-effective-target vect_double }
 ! { dg-additional-options "-fno-inline -ffast-math" }
   subroutine foo(a,x,y,n)
   implicit none



Re: [PATCH-v3] [SPARC] Add a workaround for the LEON3FT store-store errata

2017-07-07 Thread Eric Botcazou
> Great! Would you mind to apply the patch for us? The only person here
> with write access just went on vacation. I have submitted a new version
> (v4) with the change that applies to both main and 7.

OK, will do.

-- 
Eric Botcazou


Re: Handle data dependence relations with different bases

2017-07-07 Thread Eric Botcazou
> Ah, yeah.  And doing that shows that I'd not handled safelen for
> DDR_COULD_BE_INDEPENDENT_P.  I've fixed that locally.
> 
> How does this look?  Tested on x86_64-linux-gnu both without the
> vectoriser changes and with the fixed vectoriser patch.
> 
> Thanks,
> Richard
> 
> 
> 2017-07-07  Richard Sandiford  
> 
> gcc/testsuite/
>   * gnat.dg/vect15.ads (Sarray): Increase range to 1 .. 5.
>   * gnat.dg/vect16.ads (Sarray): Likewise.
>   * gnat.dg/vect17.ads (Sarray): Likewise.
>   * gnat.dg/vect15.adb (Add): Create a dependence distance of 1.
>   * gnat.dg/vect16.adb (Add): Likewise.
>   * gnat.dg/vect17.adb (Add): Likewise.

OK, thanks.

-- 
Eric Botcazou


Re: [PATCH] add vec_pack_to_short builtin.

2017-07-07 Thread Segher Boessenkool
On Fri, Jul 07, 2017 at 08:17:22AM -0700, Carl Love wrote:
> The following patch adds support for the vec_pack_to_short builtin.  The
> patch has been tested on powerpc64le-unknown-linux-gnu (Power 8 LE) and
> powerpc64-unknown-linux-gnu (Power 9 LE).
> 
> Please let me know if the following patch is acceptable.  Thanks.

It looks fine, thanks!


Segher


Re: [PATCH V2 0/7] Support for the SPARC M8 cpu

2017-07-07 Thread Jose E. Marchesi

> See the individual patch descriptions for more information and
> associated ChangeLog entries.
> 
> After this series gets integrated upstream we will be contributing more
> support for M8 capabilities, such as support for using the new
> misaligned load/store instructions for memory accesses known to be
> misaligned at compile-time.
> 
> Note that full binutils support for M8 was upstreamed in May 19.
> 
> Bootstrapped and tested in sparc64-linux-gnu.  No regressions.
> Bootstrapped and tested in sparc-sun-solaris2.12.  No regressions.

OK for mainline and 7 branch (I can do the backport to the branch).

I will be committing to svn in both trunk and the gcc 7 branch.

Committed in both trunk and branches/gcc-7-branch.


Re: [PATCH], PR target/81348: Fix compiler segfault on -mcpu=power9 code

2017-07-07 Thread Segher Boessenkool
On Fri, Jul 07, 2017 at 09:19:56AM -0400, Michael Meissner wrote:
> This patch fixes a typo where the wrong operand was used (a memory was used
> where a register was intended), and the compiler segfaulted.
> 
> I did the bootstrap and make check with no regression on a little endian power8
> system.  Can I install this in the trunk and the gcc 7.x branch (it isn't an
> issue on the gcc 6.x branch)?

Yes please.  Thanks,


Segher


Re: [PATCH, rs6000] Modify libgcc's float128 IFUNC resolver functions to use __builtin_cpu_supports()

2017-07-07 Thread Segher Boessenkool
On Thu, Jul 06, 2017 at 04:21:48PM -0500, Peter Bergner wrote:
>   * config/rs6000/float128-ifunc.c: Don't include auxv.h.
>   (have_ieee_hw_p): Delete function.
>   (SW_OR_HW): Use __builtin_cpu_supports().

Okay for trunk.  Thanks!


Segher


[PATCH] add vec_pack_to_short builtin.

2017-07-07 Thread Carl Love
GCC Maintainers:

The following patch adds support for the vec_pack_to_short builtin.  The
patch has been tested on powerpc64le-unknown-linux-gnu (Power 8 LE) and
powerpc64-unknown-linux-gnu (Power 9 LE).

Please let me know if the following patch is acceptable.  Thanks.

Carl Love

-

gcc/ChangeLog:

2017-07-06 Carl Love  

* config/rs6000/rs6000-c.c: Add support for built-in function
vector unsigned short vec_pack_to_short_fp32 (vector float,
  vector float).
* config/rs6000/rs6000-builtin.def (CONVERT_4F32_8I16): Add
BU_P9V_AV_2 and BU_P9V_OVERLOAD_2 definitions.
* config/rs6000/altivec.h (vec_pack_to_short_fp32): Add define.
* config/rs6000/altivec.md (UNSPEC_CONVERT_4F32_8I16): Add UNSPEC.
(convert_4f32_8i16): Add define_expand.
* doc/extend.texi: Update the built-in documentation file for the
new built-in function.

gcc/testsuite/ChangeLog:

2017-07-06  Carl Love  

* gcc.target/powerpc/builtins-1-p9-runnable.c:  Add new test
file for built-ins.
---
 gcc/config/rs6000/altivec.h|  1 +
 gcc/config/rs6000/altivec.md   | 18 +++
 gcc/config/rs6000/rs6000-builtin.def   |  2 ++
 gcc/config/rs6000/rs6000-c.c   |  4 
 gcc/doc/extend.texi|  2 ++
 .../gcc.target/powerpc/builtins-1-p9-runnable.c| 26 ++
 6 files changed, 53 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/builtins-1-p9-runnable.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 806675a..5af7eec 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -418,6 +418,7 @@
 
 #ifdef __POWER9_VECTOR__
 /* Vector additions added in ISA 3.0.  */
+#define vec_pack_to_short_fp32 __builtin_vec_convert_4f32_8i16
 #define vec_vctz __builtin_vec_vctz
 #define vec_cnttz __builtin_vec_vctz
 #define vec_vctzb __builtin_vec_vctzb
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 5629d77..d5f7a8f 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -79,6 +79,7 @@
UNSPEC_VUNPACK_LO_SIGN_DIRECT
UNSPEC_VUPKHPX
UNSPEC_VUPKLPX
+   UNSPEC_CONVERT_4F32_8I16
UNSPEC_DARN
UNSPEC_DARN_32
UNSPEC_DARN_RAW
@@ -3170,6 +3171,23 @@
 }
   [(set_attr "type" "veccomplex")])
 
+;; Generate two vector F32 converted to packed vector I16 vector
+(define_expand "convert_4f32_8i16"
+  [(set (match_operand:V8HI 0 "register_operand" "=v")
+   (unspec:V8HI [(match_operand:V4SF 1 "register_operand" "v")
+ (match_operand:V4SF 2 "register_operand" "v")]
+UNSPEC_CONVERT_4F32_8I16))]
+  "TARGET_P9_VECTOR"
+{
+  rtx rtx_tmp_hi = gen_reg_rtx (V4SImode);
+  rtx rtx_tmp_lo = gen_reg_rtx (V4SImode);
+
+  emit_insn (gen_altivec_vctuxs (rtx_tmp_hi, operands[1], const0_rtx));
+  emit_insn (gen_altivec_vctuxs (rtx_tmp_lo, operands[2], const0_rtx));
+  emit_insn (gen_altivec_vpkswss (operands[0], rtx_tmp_hi, rtx_tmp_lo));
+  DONE;
+})
+
 ;; Generate
 ;;xxlxor/vxor SCRATCH0,SCRATCH0,SCRATCH0
 ;;vsubu?m SCRATCH2,SCRATCH1,%1
diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index c5017aa..258c5f8 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -1990,10 +1990,12 @@ BU_P8V_OVERLOAD_3 (VSUBEUQM,"vsubeuqm")
 /* ISA 3.0 vector overloaded 2-argument functions. */
 BU_P9V_AV_2 (VSLV, "vslv", CONST, vslv)
 BU_P9V_AV_2 (VSRV, "vsrv", CONST, vsrv)
+BU_P9V_AV_2 (CONVERT_4F32_8I16, "convert_4f32_8i16", CONST, convert_4f32_8i16)
 
 /* ISA 3.0 vector overloaded 2-argument functions. */
 BU_P9V_OVERLOAD_2 (VSLV,   "vslv")
 BU_P9V_OVERLOAD_2 (VSRV,   "vsrv")
+BU_P9V_OVERLOAD_2 (CONVERT_4F32_8I16, "convert_4f32_8i16")
 
 /* 2 argument vector functions added in ISA 3.0 (power9). */
 BU_P9V_AV_2 (VADUB,"vadub",CONST,  vaduv16qi3)
diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index 1a40797..2b5193b 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -2417,6 +2417,10 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
 RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V2DI, 0 },
   { ALTIVEC_BUILTIN_VEC_PACK, P8V_BUILTIN_VPKUDUM,
 RS6000_BTI_V4SF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 },
+
+  { P9V_BUILTIN_VEC_CONVERT_4F32_8I16, P9V_BUILTIN_CONVERT_4F32_8I16,
+RS6000_BTI_unsigned_V8HI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 },
+
   { ALTIVEC_BUILTIN_VEC_VPKUWUM, ALTIVEC_BUILTIN_VPKUWUM,
 RS6000_BTI_V8HI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 },
   { 

Re: [PATCH, rs6000] 1/2 Add x86 MMX <mmintrin.h> intrinsics to GCC PPC64LE target

2017-07-07 Thread Segher Boessenkool
On Thu, Jul 06, 2017 at 09:41:00AM -0500, Steven Munroe wrote:
> +   Net. Most MMX intrinsic operations can be performed efficiently as
> +   C language 64-bit scalar operation or optimized to use the newer
> +   128-bit SSE/Altivec operations.  */

I don't understand what "Net." is, maybe you can write it out?

> +typedef __attribute__ ((__aligned__ (8)))
> +union
> +  {
> +__m64 as_m64;
> +char as_char [8];
> +signed char as_signed_char [8];
> +short as_short [4];
> +int as_int [2];
> +long long as_long_long;
> +float as_float [2];
> +double as_double;
> +  }__m64_union;

Please add a space after "}" and remove those before "[".

> +/* Convert I to a __m64 object.  The integer is zero-extended to
> 64-bits.  */
> +extern __inline __m64  __attribute__((__gnu_inline__,
> __always_inline__, __artificial__))
> +_mm_cvtsi32_si64 (int __i)
> +{
> +  return ((__m64)((unsigned int) __i));
> +}

No superfluous parentheses please, here and elsewhere.  This is just

  return (__m64) (unsigned int) __i;

(and the lines wrapped; misconfigured mail client?)

> +extern __inline __m64  __attribute__((__gnu_inline__,
> __always_inline__, __artificial__))
> +_mm_set_pi64x (long long __i)
> +{
> +  return (__m64) __i;
> +}
> +/* Convert the __m64 object to a 64bit integer.  */

Newline between the function and the following comment.

> +extern __inline long long __attribute__((__gnu_inline__,
> __always_inline__, __artificial__))
> +_mm_cvtm64_si64 (__m64 __i)
> +{
> +  return (long long)__i;
> +}

Space after cast.

> +extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__,
> __artificial__))
> +_mm_setzero_si64 (void)
> +{
> +  return (__m64)0LL;
> +}

return (__m64) 0;

> +  else
> +  return (0);

  else
return 0;

Okay for trunk, but please do something about the coding style stuff
(at least make things a bit more consistent).  Thanks,


Segher


Re: [PATCH, VAX] Correct ffs instruction constraint

2017-07-07 Thread Felix Deichmann
Am 06.07.2017 um 20:53 schrieb Jeff Law:
> Hmm, unfortunately I consistently get a call into libgcc for the
> __builtin_ffs code rather than an ffs instruction.  That's with a
> gcc-4.8.3 as well as with trunk compiler.
> 
> Can you include "-v" output from compiling scsipi_base?

Hope this is right/enough:

Using built-in specs.
COLLECT_GCC=/nb8/obj/tooldir.NetBSD-7.0-amd64/bin/vax--netbsdelf-gcc
Target: vax--netbsdelf
Configured with:
/nb8/src/tools/gcc/../../external/gpl3/gcc/dist/configure
--target=vax--netbsdelf --enable-long-long --enable-threads
--with-bugurl=http://www.NetBSD.org/Misc/send-pr.html
--with-pkgversion='NetBSD nb1 20160606' --with-system-zlib
--enable-__cxa_atexit --enable-libstdcxx-time=rt
--enable-libstdcxx-threads --with-diagnostics-color=auto-if-env
--with-sysroot=/nb8/obj/destdir.vax
--with-mpc=/nb8/obj/tooldir.NetBSD-7.0-amd64
--with-mpfr=/nb8/obj/tooldir.NetBSD-7.0-amd64
--with-gmp=/nb8/obj/tooldir.NetBSD-7.0-amd64 --disable-nls
--disable-multilib --program-transform-name=s,^,vax--netbsdelf-,
--enable-languages='c c++ objc' --prefix=/nb8/obj/tooldir.NetBSD-7.0-amd64
Thread model: posix
gcc version 5.4.0 (NetBSD nb1 20160606)
COLLECT_GCC_OPTIONS='-v' '-fno-pic' '-ffreestanding'
'-fno-zero-initialized-in-bss' '-Os' '-fno-strict-aliasing'
'-fno-common' '-std=gnu99' '-Werror' '-Wall' '-Wno-main'
'-Wno-format-zero-length' '-Wpointer-arith' '-Wmissing-prototypes'
'-Wstrict-prototypes' '-Wold-style-definition' '-Wswitch' '-Wshadow'
'-Wcast-qual' '-Wwrite-strings' '-Wno-pointer-sign' '-Wno-attributes'
'-Wno-sign-compare' '-D' '_VAX_INLINE_' '-I' '.' '-I'
'/nb8/src/sys/../common/lib/libx86emu' '-I'
'/nb8/src/sys/../common/include' '-I' '/nb8/src/sys/arch' '-I'
'/nb8/src/sys' '-nostdinc' '-D' '_KERNEL' '-D' '_KERNEL_OPT'
'-std=gnu99' '-I'
'/nb8/src/sys/lib/libkern/../../../common/lib/libc/quad' '-I'
'/nb8/src/sys/lib/libkern/../../../common/lib/libc/string' '-I'
'/nb8/src/sys/lib/libkern/../../../common/lib/libc/arch/vax/string' '-c'
'-o' 'scsipi_base.o'
 /nb8/obj/tooldir.NetBSD-7.0-amd64/libexec/gcc/vax--netbsdelf/5.4.0/cc1
-quiet -nostdinc -v -I . -I /nb8/src/sys/../common/lib/libx86emu -I
/nb8/src/sys/../common/include -I /nb8/src/sys/arch -I /nb8/src/sys -I
/nb8/src/sys/lib/libkern/../../../common/lib/libc/quad -I
/nb8/src/sys/lib/libkern/../../../common/lib/libc/string -I
/nb8/src/sys/lib/libkern/../../../common/lib/libc/arch/vax/string
-isysroot /nb8/obj/destdir.vax -D _VAX_INLINE_ -D _KERNEL -D _KERNEL_OPT
/nb8/src/sys/dev/scsipi/scsipi_base.c -quiet -dumpbase scsipi_base.c
-auxbase-strip scsipi_base.o -Os -Werror -Wall -Wno-main
-Wno-format-zero-length -Wpointer-arith -Wmissing-prototypes
-Wstrict-prototypes -Wold-style-definition -Wswitch -Wshadow -Wcast-qual
-Wwrite-strings -Wno-pointer-sign -Wno-attributes -Wno-sign-compare
-std=gnu99 -std=gnu99 -version -fno-pic -ffreestanding
-fno-zero-initialized-in-bss -fno-strict-aliasing -fno-common -o
/var/tmp//ccy92Bnl.s
GNU C99 (NetBSD nb1 20160606) version 5.4.0 (vax--netbsdelf)
compiled by GNU C version 4.8.4, GMP version 5.1.3, MPFR version 3.1.2,
MPC version 1.0.1
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
#include "..." search starts here:
#include <...> search starts here:
 .
 /nb8/src/sys/../common/lib/libx86emu
 /nb8/src/sys/../common/include
 /nb8/src/sys/arch
 /nb8/src/sys
 /nb8/src/sys/lib/libkern/../../../common/lib/libc/quad
 /nb8/src/sys/lib/libkern/../../../common/lib/libc/string
 /nb8/src/sys/lib/libkern/../../../common/lib/libc/arch/vax/string
End of search list.
GNU C99 (NetBSD nb1 20160606) version 5.4.0 (vax--netbsdelf)
compiled by GNU C version 4.8.4, GMP version 5.1.3, MPFR version 3.1.2,
MPC version 1.0.1
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 530755d1b0ada9bef015017ee74733db
COLLECT_GCC_OPTIONS='-v' '-fno-pic' '-ffreestanding'
'-fno-zero-initialized-in-bss' '-Os' '-fno-strict-aliasing'
'-fno-common' '-std=gnu99' '-Werror' '-Wall' '-Wno-main'
'-Wno-format-zero-length' '-Wpointer-arith' '-Wmissing-prototypes'
'-Wstrict-prototypes' '-Wold-style-definition' '-Wswitch' '-Wshadow'
'-Wcast-qual' '-Wwrite-strings' '-Wno-pointer-sign' '-Wno-attributes'
'-Wno-sign-compare' '-D' '_VAX_INLINE_' '-I' '.' '-I'
'/nb8/src/sys/../common/lib/libx86emu' '-I'
'/nb8/src/sys/../common/include' '-I' '/nb8/src/sys/arch' '-I'
'/nb8/src/sys' '-nostdinc' '-D' '_KERNEL' '-D' '_KERNEL_OPT'
'-std=gnu99' '-I'
'/nb8/src/sys/lib/libkern/../../../common/lib/libc/quad' '-I'
'/nb8/src/sys/lib/libkern/../../../common/lib/libc/string' '-I'
'/nb8/src/sys/lib/libkern/../../../common/lib/libc/arch/vax/string' '-c'
'-o' 'scsipi_base.o'
 
/nb8/obj/tooldir.NetBSD-7.0-amd64/lib/gcc/vax--netbsdelf/5.4.0/../../../../vax--netbsdelf/bin/as
 -v -I . -I /nb8/src/sys/../common/lib/libx86emu -I 
/nb8/src/sys/../common/include -I /nb8/src/sys/arch -I /nb8/src/sys -I 
/nb8/src/sys/lib/libkern/../../../common/lib/libc/quad -I 

[patch,avr] Fix PR20296 / PR81268: Better ISR prologues / epilogues

2017-07-07 Thread Georg-Johann Lay

Hi,

this patch addresses a very old issue, the non-optimal
generation of ISR prologues and epilogues.

As GAS now provides the __gcc_isr pseudo instruction to
overcome some problems, see

https://sourceware.org/bugzilla/show_bug.cgi?id=21683

this can now be used to address PR20296.


This patch does:

* Add a configure test for whether GAS supports __gcc_isr and -mgcc-isr.

* Add new option -mgas-isr-prologues to switch generation of
  __gcc_isr in ISR prologues / epilogues on or off.

* Switch on the feature by default except for -O0 and -Og.

* Add a new no_gccisr function attribute to disable __gcc_isr
  generation for individual ISRs.

* Add a new pass .avr-gasisr that filters out situations where
  __gcc_isr is not appropriate.

* Extend prologue and epilogue generation to emit __gcc_isr chunks
  during prologue and epilogue(s).

* Implement final_postscan_insn to emit final __gcc_isr Done chunk.

* Add -mgcc-isr to ASM_SPEC if appropriate.


We currently have only 3 torture tests for ISRs, namely

gcc.target/avr/torture/isr-*.c

All these tests PASS when

* Run with -mgas-isr-prologues
* Run with -mno-gas-isr-prologues
* Run for: atmega8 atmega64 atmega103 atmega2560 atmega128 atxmega128a1
  attiny40


Ok for trunk?

Johann

PR target/20296
PR target/81268
* configure.ac [target=avr]: Add GAS check for -mgcc-isr.
(HAVE_AS_AVR_MGCCISR_OPTION):  If so, AC_DEFINE it.
* config.in: Regenerate.
* configure: Regenerate.

* doc/extend.texi (AVR Function Attributes) : Document it.
* doc/invoke.texi (AVR Options) <-mgas-isr-prologues>: Document it.

* config/avr/avr.opt (-mgas-isr-prologues): New option and...
(TARGET_GASISR_PROLOGUES): ...target mask.
* common/config/avr/avr-common.c
(avr_option_optimization_table) [OPT_LEVELS_1_PLUS_NOT_DEBUG]:
Set -mgas-isr-prologues.
* config/avr/avr-passes.def (avr_pass_maybe_gasisr): Add
INSERT_PASS_BEFORE for it.
* config/avr/avr-protos.h (make_avr_pass_maybe_gasisr): New proto.
* config/avr/avr.c (avr_option_override)
[!HAVE_AS_AVR_MGCCISR_OPTION]: Unset TARGET_GASISR_PROLOGUES.
(avr_no_gccisr_function_p, avr_hregs_split_lsb): New static functions.
(avr_attribute_table) : Add new function attribute.
(avr_set_current_function) : Init machine field.
(avr_pass_data_gasisr, avr_pass_maybe_gasisr): New pass data
and rtl_opt_pass.
(make_avr_pass_maybe_gasisr): New function.
(emit_push_sfr) : Add argument to function and use it
instead of TMP_REG.
(avr_expand_prologue) [machine->gasisr.maybe]: Emit gasisr insn
and set machine->gasisr.yes.
(avr_expand_epilogue) [machine->gasisr.yes]: Similar.
(avr_asm_function_end_prologue) [machine->gasisr.yes]: Add
__gcc_isr.n_pushed to .L__stack_usage.
(TARGET_ASM_FINAL_POSTSCAN_INSN): Define to...
(avr_asm_final_postscan_insn): ...this new static function.
* config/avr/avr.h (machine_function)
: New fields.
: New fields.
* config/avr/avr.md (UNSPECV_GASISR): Add unspecv enum.
(GASISR_Prologue, GASISR_Epilogue, GASISR_Done): New define_constants.
(gasisr, *gasisr): New expander and insn.
* config/avr/gen-avr-mmcu-specs.c (print_mcu)
[HAVE_AS_AVR_MGCCISR_OPTION]: Print asm_gccisr spec.
* config/avr/specs.h (ASM_SPEC) : Add sub spec.
Index: common/config/avr/avr-common.c
===
--- common/config/avr/avr-common.c	(revision 249982)
+++ common/config/avr/avr-common.c	(working copy)
@@ -31,6 +31,7 @@ static const struct default_options avr_
 // The only effect of -fcaller-saves might be that it triggers
 // a frame without need when it tries to be smart around calls.
 { OPT_LEVELS_ALL, OPT_fcaller_saves, NULL, 0 },
+{ OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_mgas_isr_prologues, NULL, 1 },
 { OPT_LEVELS_NONE, 0, NULL, 0 }
   };
 
Index: config/avr/avr-passes.def
===
--- config/avr/avr-passes.def	(revision 249982)
+++ config/avr/avr-passes.def	(working copy)
@@ -17,9 +17,11 @@
along with GCC; see the file COPYING3.  If not see
.  */
 
-/* FIXME: We have to add the last pass first, otherwise
-  gen-pass-instances.awk won't work as expected. */
-  
+/* Compute cfun->machine->gasisr.maybe which is used in prologue and
+   epilogue generation provided -mgas-isr-prologues is on.  */
+
+INSERT_PASS_BEFORE (pass_thread_prologue_and_epilogue, 1, avr_pass_maybe_gasisr);
+
 /* This avr-specific pass (re)computes insn notes, in particular REG_DEAD
notes which are used by `avr.c::reg_unused_after' and branch offset
computations.  These notes must be correct, 

Re: [PATCH, GCC/ARM] Remove ARMv8-M code for D17-D31

2017-07-07 Thread Richard Earnshaw (lists)
On 06/07/17 13:36, Thomas Preudhomme wrote:
> Hi Richard,
> 
> On 28/06/17 16:56, Richard Earnshaw (lists) wrote:
> 
>>>
>>
>> This is silently baking in dangerous assumptions about GCC's internal
>> numbering of the registers.  That's not a good idea from a long-term
>> portability perspective.
>>
>> At the very least you need to assert that all the interesting registers
>> are numbered in the range 0..63; but ideally the code should just handle
>> pretty much any assignment of internal register numbers.
> 
> There is already such an assert in my patch. :-)
> 
>>
>> Did you consider using sbitmaps rather than doing all the multi-word
>> stuff by steam?
> 
> I did now, most of it is trivial but interaction with
> compute_not_to_clear_mask is now more verbose because it returns a
> bitfield and one assert got quite ugly and expensive.
> 
> Please find an updated patch in attachment and judge by yourself.
> 

Hmm, I think that's because really this is a partial conversion.  It
looks like doing this properly would involve moving that existing code
to use sbitmaps as well.  I think doing that would be better from a
long-term maintenance perspective, but I'm not going to insist that you
do it now.

As a result I'll let you take the call as to whether you keep this
version or go back to your earlier patch.  If you do decide to keep this
version, then see the comment below.

> Best regards,
> 
> Thomas
> 
> remove_d16-d31_armv8m_clearing_code.patch
> 
> 
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 259597d8890ee84c5bd92b12b6f9f6521c8dcd2e..93e152b1f38d3675e4ada1de7a34c2c209d8db1f 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -3620,6 +3620,11 @@ arm_option_override (void)
>if (use_cmse && !arm_arch_cmse)
>  error ("target CPU does not support ARMv8-M Security Extensions");
>  
> +  /* We don't clear D16-D31 VFP registers for cmse_nonsecure_call functions
> + and ARMv8-M Baseline and Mainline do not allow such configuration.  */
> +  if (use_cmse && LAST_VFP_REGNUM > LAST_LO_VFP_REGNUM)
> +error ("ARMv8-M Security Extensions incompatible with selected FPU");
> +
>/* Disable scheduling fusion by default if it's not armv7 processor
>   or doesn't prefer ldrd/strd.  */
>if (flag_schedule_fusion == 2
> @@ -24996,42 +25001,41 @@ thumb1_expand_prologue (void)
>  void
>  cmse_nonsecure_entry_clear_before_return (void)
>  {
> -  uint64_t to_clear_mask[2];
> +  sbitmap to_clear_bitmap;

I see the sbitmap_alloc, but not the sbitmap_free once it's dead.  I think
you should use an auto_sbitmap here so that it will clean up
automatically when it goes out of scope.
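
To illustrate the allocate/use/free pairing in isolation, here is a
stand-alone sketch.  It is not GCC code and does not use the sbitmap API:
regset, regset_alloc, regset_free and the register numbers are all made up
for the example.  In GCC itself, auto_sbitmap would make the explicit free
unnecessary.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical stand-in for an sbitmap: the size is chosen at run time,
   so nothing assumes that register numbers fit into 0..63.  */
typedef struct
{
  unsigned n_bits;
  unsigned long *words;
} regset;

#define REGSET_WORD_BITS (sizeof (unsigned long) * CHAR_BIT)

static regset *
regset_alloc (unsigned n_bits)
{
  regset *s = malloc (sizeof *s);
  if (!s)
    return NULL;
  s->n_bits = n_bits;
  /* calloc zero-fills, so the set starts out empty.  */
  s->words = calloc ((n_bits + REGSET_WORD_BITS - 1) / REGSET_WORD_BITS,
                     sizeof *s->words);
  if (!s->words)
    {
      free (s);
      return NULL;
    }
  return s;
}

static void
regset_free (regset *s)
{
  if (s)
    {
      free (s->words);
      free (s);
    }
}

/* Set COUNT bits starting at START; bits beyond the set's size are ignored.  */
static void
regset_set_range (regset *s, unsigned start, unsigned count)
{
  for (unsigned i = start; i < start + count && i < s->n_bits; i++)
    s->words[i / REGSET_WORD_BITS] |= 1UL << (i % REGSET_WORD_BITS);
}

static void
regset_clear_bit (regset *s, unsigned bit)
{
  if (bit < s->n_bits)
    s->words[bit / REGSET_WORD_BITS] &= ~(1UL << (bit % REGSET_WORD_BITS));
}

static int
regset_test_bit (const regset *s, unsigned bit)
{
  return (bit < s->n_bits
          && ((s->words[bit / REGSET_WORD_BITS]
               >> (bit % REGSET_WORD_BITS)) & 1UL) != 0);
}

/* Returns the number of registers marked for clearing, or -1 if the
   allocation failed.  The register numbers are illustrative only.  */
static int
count_regs_to_clear (unsigned max_regno)
{
  regset *to_clear = regset_alloc (max_regno + 1);
  if (!to_clear)
    return -1;

  regset_set_range (to_clear, 0, 4);   /* e.g. argument registers */
  regset_set_range (to_clear, 70, 2);  /* numbers above 63 work too */
  regset_clear_bit (to_clear, 3);      /* keep one register untouched */

  int count = 0;
  for (unsigned regno = 0; regno <= max_regno; regno++)
    count += regset_test_bit (to_clear, regno);

  regset_free (to_clear);              /* the release that is easy to forget */
  return count;
}
```

Calling count_regs_to_clear (100) marks registers 0..2 and 70..71; with a
smaller limit such as 12 the out-of-range registers are simply skipped.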

>uint32_t padding_bits_to_clear = 0;
>uint32_t * padding_bits_to_clear_ptr = &padding_bits_to_clear;
>int regno, maxregno = IP_REGNUM;
>tree result_type;
>rtx result_rtl;
>  
> -  to_clear_mask[0] = (1ULL << (NUM_ARG_REGS)) - 1;
> -  to_clear_mask[0] |= (1ULL << IP_REGNUM);
> +  to_clear_bitmap = sbitmap_alloc (maxregno + 1);
> +  bitmap_clear (to_clear_bitmap);
> +  bitmap_set_range (to_clear_bitmap, R0_REGNUM, NUM_ARG_REGS);
> +  bitmap_set_bit (to_clear_bitmap, IP_REGNUM);
>  
>/* If we are not dealing with -mfloat-abi=soft we will need to clear VFP
>   registers.  We also check that TARGET_HARD_FLOAT and !TARGET_THUMB1 hold
>   to make sure the instructions used to clear them are present.  */
>if (TARGET_HARD_FLOAT && !TARGET_THUMB1)
>  {
> -  uint64_t float_mask = (1ULL << (D7_VFP_REGNUM + 1)) - 1;
> +  int float_bits = D7_VFP_REGNUM - FIRST_VFP_REGNUM + 1;
>maxregno = LAST_VFP_REGNUM;
> +  to_clear_bitmap = sbitmap_resize (to_clear_bitmap, maxregno, 0);
>  
> -  float_mask &= ~((1ULL << FIRST_VFP_REGNUM) - 1);
> -  to_clear_mask[0] |= float_mask;
> -
> -  float_mask = (1ULL << (maxregno - 63)) - 1;
> -  to_clear_mask[1] = float_mask;
> +  bitmap_set_range (to_clear_bitmap, FIRST_VFP_REGNUM, float_bits);
>  
>/* Make sure we don't clear the two scratch registers used to clear the
>relevant FPSCR bits in output_return_instruction.  */
>emit_use (gen_rtx_REG (SImode, IP_REGNUM));
> -  to_clear_mask[0] &= ~(1ULL << IP_REGNUM);
> +  bitmap_clear_bit (to_clear_bitmap, IP_REGNUM);
>emit_use (gen_rtx_REG (SImode, 4));
> -  to_clear_mask[0] &= ~(1ULL << 4);
> +  bitmap_clear_bit (to_clear_bitmap, 4);
>  }
>  
>/* If the user has defined registers to be caller saved, these are no longer
>   restored by the function before returning and must thus be cleared for
>   security purposes.  */
> -  for (regno = NUM_ARG_REGS; regno < LAST_VFP_REGNUM; regno++)
> +  for (regno = NUM_ARG_REGS; regno <= maxregno; regno++)
>  {
>/* We do not touch registers that can be used to pass arguments as per
>the AAPCS, since these should never be made callee-saved by user
> @@ -25041,29 +25045,50 @@ cmse_nonsecure_entry_clear_before_return (void)
>if 

Re: Handle data dependence relations with different bases

2017-07-07 Thread Richard Sandiford
Eric Botcazou  writes:
> [Sorry for missing the previous messages]
>
>> Thanks.  Just been retesting, and I think I must have forgotten
>> to include Ada last time.  It turns out that the patch causes a dg-scan
>> regression in gnat.dg/vect17.adb, because we now think that if the
>> array RECORD_TYPEs *do* alias in:
>> 
>>procedure Add (X, Y : aliased Sarray; R : aliased out Sarray) is
>>begin
>>   for I in Sarray'Range loop
>>  R(I) := X(I) + Y(I);
>>   end loop;
>>end;
>> 
>> then the dependence distance must be zero.  Eric, does that hold true
>> for Ada?  I.e. if X and R (or Y and R) alias, must it be the case that
>> X(I) can only alias R(I) and not for example R(I-1) or R(I+1)?
>
> Yes, I'd think so (even without the artificial RECORD_TYPE around the arrays).

Good!

>> 2017-06-07  Richard Sandiford  
>> 
>> gcc/testsuite/
>>  * gnat.dg/vect17.ads (Sarray): Increase range to 1 .. 5.
>>  * gnat.dg/vect17.adb (Add): Create a dependence distance of 1
>>  when X = R or Y = R.
>
> I think that you need to modify vect15 and vect16 the same way.

Ah, yeah.  And doing that shows that I'd not handled safelen for
DDR_COULD_BE_INDEPENDENT_P.  I've fixed that locally.

How does this look?  Tested on x86_64-linux-gnu both without the
vectoriser changes and with the fixed vectoriser patch.

Thanks,
Richard


2017-07-07  Richard Sandiford  

gcc/testsuite/
* gnat.dg/vect15.ads (Sarray): Increase range to 1 .. 5.
* gnat.dg/vect16.ads (Sarray): Likewise.
* gnat.dg/vect17.ads (Sarray): Likewise.
* gnat.dg/vect15.adb (Add): Create a dependence distance of 1.
* gnat.dg/vect16.adb (Add): Likewise.
* gnat.dg/vect17.adb (Add): Likewise.
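
For readers who do not speak Ada, the new shape of the Add procedures can
be pictured in C roughly as follows.  This is only an analogue under stated
assumptions: add_shifted and N are made-up names, indices are 0-based, and
double stands in for Long_Float.

```c
#include <stddef.h>

#define N 5

/* First element comes from the last inputs; every later element r[i + 1]
   is computed from x[i] and y[i].  If r and x are the same array,
   iteration i reads what iteration i - 1 wrote, which is a dependence
   distance of 1.  */
static void
add_shifted (const double *x, const double *y, double *r)
{
  r[0] = x[N - 1] + y[N - 1];
  for (size_t i = 0; i + 1 < N; i++)
    r[i + 1] = x[i] + y[i];
}
```

With distinct arrays every iteration is independent, but calling
add_shifted (a, y, a) makes each result feed the next iteration, so the
loop can no longer be treated as if the distance were zero.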

Index: gcc/testsuite/gnat.dg/vect15.ads
===
--- gcc/testsuite/gnat.dg/vect15.ads2015-10-14 14:58:56.0 +0100
+++ gcc/testsuite/gnat.dg/vect15.ads2017-07-07 13:12:51.509540701 +0100
@@ -1,6 +1,6 @@
 package Vect15 is
 
-   type Sarray is array (1 .. 4) of Long_Float;
+   type Sarray is array (1 .. 5) of Long_Float;
for Sarray'Alignment use 16;
 
procedure Add (X, Y : Sarray; R : out Sarray);
Index: gcc/testsuite/gnat.dg/vect16.ads
===
--- gcc/testsuite/gnat.dg/vect16.ads2015-10-14 14:58:56.0 +0100
+++ gcc/testsuite/gnat.dg/vect16.ads2017-07-07 13:12:51.511540636 +0100
@@ -1,6 +1,6 @@
 package Vect16 is
 
-   type Sarray is array (1 .. 4) of Long_Float;
+   type Sarray is array (1 .. 5) of Long_Float;
for Sarray'Alignment use 16;
 
procedure Add_Sub (X, Y : Sarray; R,S : out Sarray);
Index: gcc/testsuite/gnat.dg/vect17.ads
===
--- gcc/testsuite/gnat.dg/vect17.ads2017-06-07 22:13:29.692531472 +0100
+++ gcc/testsuite/gnat.dg/vect17.ads2017-07-07 13:12:51.514540538 +0100
@@ -1,6 +1,6 @@
 package Vect17 is
 
-   type Sarray is array (1 .. 4) of Long_Float;
+   type Sarray is array (1 .. 5) of Long_Float;
for Sarray'Alignment use 16;
 
procedure Add (X, Y : aliased Sarray; R : aliased out Sarray);
Index: gcc/testsuite/gnat.dg/vect15.adb
===
--- gcc/testsuite/gnat.dg/vect15.adb2015-10-14 14:58:56.0 +0100
+++ gcc/testsuite/gnat.dg/vect15.adb2017-07-07 13:12:51.509540701 +0100
@@ -5,8 +5,9 @@ package body Vect15 is
 
procedure Add (X, Y : Sarray; R : out Sarray) is
begin
-  for I in Sarray'Range loop
- R(I) := X(I) + Y(I);
+  R(1) := X(5) + Y(5);
+  for I in 1 .. 4 loop
+ R(I + 1) := X(I) + Y(I);
   end loop;
end;
 
Index: gcc/testsuite/gnat.dg/vect16.adb
===
--- gcc/testsuite/gnat.dg/vect16.adb2015-10-14 14:58:56.0 +0100
+++ gcc/testsuite/gnat.dg/vect16.adb2017-07-07 13:12:51.510540669 +0100
@@ -5,9 +5,11 @@ package body Vect16 is
 
procedure Add_Sub (X, Y : Sarray; R,S : out Sarray) is
begin
-  for I in Sarray'Range loop
- R(I) := X(I) + Y(I);
- S(I) := X(I) - Y(I);
+  R(1) := X(5) + Y(5);
+  S(1) := X(5) - Y(5);
+  for I in 1 .. 4 loop
+ R(I + 1) := X(I) + Y(I);
+ S(I + 1) := X(I) - Y(I);
   end loop;
end;
 
Index: gcc/testsuite/gnat.dg/vect17.adb
===
--- gcc/testsuite/gnat.dg/vect17.adb2017-06-07 22:13:29.692531472 +0100
+++ gcc/testsuite/gnat.dg/vect17.adb2017-07-07 13:12:51.512540603 +0100
@@ -5,8 +5,9 @@ package body Vect17 is
 
procedure Add (X, Y : aliased Sarray; R : aliased out Sarray) is
begin
-  for I in Sarray'Range loop
- R(I) := X(I) + Y(I);
+  R(1) := X(5) + Y(5);
+  for I in 1 .. 4 loop
+ R(I + 

Re: [PATCH v11] add -fpatchable-function-entry=N,M option

2017-07-07 Thread Richard Earnshaw (lists)
On 06/07/17 15:03, Torsten Duwe wrote:
> Permit A 38
> 
> gcc/c-family/ChangeLog
> 2017-07-06  Torsten Duwe  
> 
>   * c-attribs.c (c_common_attribute_table): Add entry for
>   "patchable_function_entry".
> 
> gcc/lto/ChangeLog
> 2017-07-06  Torsten Duwe  
> 
>   * lto-lang.c (lto_attribute_table): Add entry for
>   "patchable_function_entry".
> 
> gcc/ChangeLog
> 2017-07-06  Torsten Duwe  
> 
>   * common.opt: Introduce -fpatchable-function-entry
>   command line option, and its variables function_entry_patch_area_size
>   and function_entry_patch_area_start.
>   * opts.c (common_handle_option): Add -fpatchable_function_entry_ case,
>   including a two-value parser.
>   * target.def (print_patchable_function_entry): New target hook.
>   * targhooks.h (default_print_patchable_function_entry): New function.
>   * targhooks.c (default_print_patchable_function_entry): Likewise.
>   * toplev.c (process_options): Switch off IPA-RA if
>   patchable function entries are being generated.
>   * varasm.c (assemble_start_function): Look at the
>   patchable-function-entry command line switch and current
>   function attributes and maybe generate NOP instructions by
>   calling the print_patchable_function_entry hook.
>   * doc/extend.texi: Document patchable_function_entry attribute.
>   * doc/invoke.texi: Document -fpatchable_function_entry
>   command line option.
>   * doc/tm.texi.in (TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY):
>   New target hook.
>   * doc/tm.texi: Likewise.
> 
> gcc/testsuite/ChangeLog
> 2017-07-06  Torsten Duwe  
> 
>   * c-c++-common/patchable_function_entry-default.c: New test.
>   * c-c++-common/patchable_function_entry-decl.c: Likewise.
>   * c-c++-common/patchable_function_entry-definition.c: Likewise.
> 
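
As a rough usage sketch of the attribute added by this patch (not taken
from the patch or its tests): PATCHABLE_ENTRY and traced_add are made-up
names, and the feature check is an assumption so that compilers without
the attribute still build the file.

```c
/* Use the attribute only where the compiler reports support for it;
   elsewhere the macro expands to nothing.  */
#if defined (__has_attribute)
# if __has_attribute (patchable_function_entry)
#  define PATCHABLE_ENTRY(n, m) \
     __attribute__ ((patchable_function_entry (n, m)))
# endif
#endif
#ifndef PATCHABLE_ENTRY
# define PATCHABLE_ENTRY(n, m)
#endif

/* Ask for a patch area of 2 NOPs with the entry point 1 NOP into it,
   mirroring N,M in -fpatchable-function-entry=N,M.  The function's
   behaviour is unchanged; only the padding around its entry differs.  */
PATCHABLE_ENTRY (2, 1) int
traced_add (int a, int b)
{
  return a + b;
}
```

The same padding can be requested for every function in a translation
unit with -fpatchable-function-entry=2,1 on the command line.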
> diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
> index 626ffa1cde7..ecb00c1d5b9 100644
> --- a/gcc/c-family/c-attribs.c
> +++ b/gcc/c-family/c-attribs.c
> @@ -142,6 +142,8 @@ static tree handle_bnd_variable_size_attribute (tree *, 
> tree, tree, int, bool *)
>  static tree handle_bnd_legacy (tree *, tree, tree, int, bool *);
>  static tree handle_bnd_instrument (tree *, tree, tree, int, bool *);
>  static tree handle_fallthrough_attribute (tree *, tree, tree, int, bool *);
> +static tree handle_patchable_function_entry_attribute (tree *, tree, tree,
> +int, bool *);
>  
>  /* Table of machine-independent attributes common to all C-like languages.
>  
> @@ -351,6 +353,9 @@ const struct attribute_spec c_common_attribute_table[] =
> handle_bnd_instrument, false },
>{ "fallthrough", 0, 0, false, false, false,
> handle_fallthrough_attribute, false },
> +  { "patchable_function_entry",  1, 2, true, false, false,
> +   handle_patchable_function_entry_attribute,
> +   false },
>{ NULL, 0, 0, false, false, false, NULL, false }
>  };
>  
> @@ -3260,3 +3265,10 @@ handle_fallthrough_attribute (tree *, tree name, tree, 
> int,
>*no_add_attrs = true;
>return NULL_TREE;
>  }
> +
> +static tree
> +handle_patchable_function_entry_attribute (tree *, tree, tree, int, bool *)
> +{
> +  /* Nothing to be done here.  */
> +  return NULL_TREE;
> +}
> diff --git a/gcc/common.opt b/gcc/common.opt
> index e81165c488b..78cfa568a95 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -163,6 +163,13 @@ bool flag_stack_usage_info = false
>  Variable
>  int flag_debug_asm
>  
> +; How many NOP insns to place at each function entry by default
> +Variable
> +HOST_WIDE_INT function_entry_patch_area_size
> +
> +; And how far the real asm entry point is into this area
> +Variable
> +HOST_WIDE_INT function_entry_patch_area_start
>  
>  ; Balance between GNAT encodings and standard DWARF to emit.
>  Variable
> @@ -2030,6 +2037,10 @@ fprofile-reorder-functions
>  Common Report Var(flag_profile_reorder_functions)
>  Enable function reordering that improves code placement.
>  
> +fpatchable-function-entry=
> +Common Joined Optimization
> +Insert NOP instructions at each function entry.
> +
>  frandom-seed
>  Common Var(common_deferred_options) Defer
>  
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 03ba8fc436c..86d567783f7 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -3105,6 +3105,27 @@ that affect more than one function.
>  This attribute should be used for debugging purposes only.  It is not
>  suitable in production code.
>  
> +@item patchable_function_entry
> +@cindex @code{patchable_function_entry} function attribute
> +@cindex extra NOP instructions at the function entry point
> +In case the target's text segment can be made writable at run time by
> +any means, padding the function entry with 

Avoid global optimize flag checks in LTO

2017-07-07 Thread Jan Hubicka
Hi,
this patch fixes some places where we check the global optimize flag rather than
checking it per function.  This makes the optimize attribute behave more like
passing the same flag on the command line.
It requires running the IPA passes even with !optimize, but they take a fast
path that does almost nothing unless it sees functions with optimize attributes
set.
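
A small sketch of the kind of code this is about, assuming GCC's optimize
function attribute; NO_OPT, square and the two sum functions are made-up
names, and the macro is empty on compilers that lack the attribute.

```c
/* The optimize attribute is a GCC extension; fall back to nothing
   elsewhere.  */
#if defined (__GNUC__) && !defined (__clang__)
# define NO_OPT __attribute__ ((optimize ("O0")))
#else
# define NO_OPT
#endif

static int
square (int x)
{
  return x * x;
}

/* Compiled as if with -O0 even when the rest of the file uses -O2, so
   the per-function optimize setting is what the IPA passes should look at.  */
NO_OPT int
sum_squares_unoptimized (int n)
{
  int sum = 0;
  for (int i = 1; i <= n; i++)
    sum += square (i);
  return sum;
}

/* Same body, compiled with whatever the command line asked for.  */
int
sum_squares (int n)
{
  int sum = 0;
  for (int i = 1; i <= n; i++)
    sum += square (i);
  return sum;
}
```

Both functions compute the same result; the difference is only in which
optimization level each function's own options report.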

Bootstrapped/regtested x86_64-linux, committed.

* ipa-comdats.c: Remove optimize check from gate.
* ipa-fnsummary.c (ipa_fn_summary_generate): Do not generate summary
for functions not optimized.
(ipa_fn_summary_read): Skip optimize check.
(ipa_fn_summary_write): Likewise.
* ipa-inline-analysis.c (do_estimate_growth_1): Check that caller
is optimized.
* ipa-inline.c (can_inline_edge_p): Not optimized functions are
uninlinable.
(can_inline_edge_p): Check flag_pcc_struct_return for match.
(check_callers): Give up on caller which is not optimized.
(inline_small_functions): Likewise.
(ipa_inline): Do not give up when not optimizing.
* ipa-visibility.c (function_and_variable_visibility): Do not optimize
away unoptimized cdtors.
(whole_program_function_and_variable_visibility): Do
ipa_discover_readonly_nonaddressable_vars in LTO mode.
* ipa.c (process_references): Do not check optimize.
(symbol_table::remove_unreachable_nodes): Update optimize check.
(set_writeonly_bit): Update optimize check.
(pass_ipa_cdtor_merge::gate): Do not check optimize.
(pass_ipa_single_use::gate): Remove.
Index: ipa-comdats.c
===
--- ipa-comdats.c   (revision 250021)
+++ ipa-comdats.c   (working copy)
@@ -416,7 +416,7 @@ public:
 bool
 pass_ipa_comdats::gate (function *)
 {
-  return HAVE_COMDAT_GROUP && optimize;
+  return HAVE_COMDAT_GROUP;
 }
 
 } // anon namespace
Index: ipa-fnsummary.c
===
--- ipa-fnsummary.c (revision 250021)
+++ ipa-fnsummary.c (working copy)
@@ -3174,22 +3174,20 @@ ipa_fn_summary_generate (void)
 
   FOR_EACH_DEFINED_FUNCTION (node)
 if (DECL_STRUCT_FUNCTION (node->decl))
-  node->local.versionable = tree_versionable_function_p (node->decl);
-
-  /* When not optimizing, do not bother to analyze.  Inlining is still done
- because edge redirection needs to happen there.  */
-  if (!optimize && !flag_generate_lto && !flag_generate_offload && !flag_wpa)
-return;
+  node->local.versionable = 
+   (opt_for_fn (node->decl, optimize)
+   && tree_versionable_function_p (node->decl));
 
   ipa_fn_summary_alloc ();
 
   ipa_fn_summaries->enable_insertion_hook ();
 
   ipa_register_cgraph_hooks ();
-  ipa_free_fn_summary ();
 
   FOR_EACH_DEFINED_FUNCTION (node)
-if (!node->alias)
+if (!node->alias
+   && (flag_generate_lto || flag_generate_offload|| flag_wpa
+   || opt_for_fn (node->decl, optimize)))
   inline_analyze_function (node);
 }
 
@@ -3342,12 +3340,9 @@ ipa_fn_summary_read (void)
fatal_error (input_location,
 "ipa inline summary is missing in input file");
 }
-  if (optimize)
-{
-  ipa_register_cgraph_hooks ();
-  if (!flag_ipa_cp)
-   ipa_prop_read_jump_functions ();
-}
+  ipa_register_cgraph_hooks ();
+  if (!flag_ipa_cp)
+ipa_prop_read_jump_functions ();
 
   gcc_assert (ipa_fn_summaries);
   ipa_fn_summaries->enable_insertion_hook ();
@@ -3462,7 +3457,7 @@ ipa_fn_summary_write (void)
   produce_asm (ob, NULL);
   destroy_output_block (ob);
 
-  if (optimize && !flag_ipa_cp)
+  if (!flag_ipa_cp)
 ipa_prop_write_jump_functions ();
 }
 
Index: ipa-inline-analysis.c
===
--- ipa-inline-analysis.c   (revision 250021)
+++ ipa-inline-analysis.c   (working copy)
@@ -326,7 +326,8 @@ do_estimate_growth_1 (struct cgraph_node
 {
   gcc_checking_assert (e->inline_failed);
 
-  if (cgraph_inline_failed_type (e->inline_failed) == CIF_FINAL_ERROR)
+  if (cgraph_inline_failed_type (e->inline_failed) == CIF_FINAL_ERROR
+ || !opt_for_fn (e->caller->decl, optimize))
{
  d->uninlinable = true;
   continue;
Index: ipa-inline.c
===
--- ipa-inline.c(revision 250021)
+++ ipa-inline.c(working copy)
@@ -322,6 +322,11 @@ can_inline_edge_p (struct cgraph_edge *e
   e->inline_failed = CIF_BODY_NOT_AVAILABLE;
   inlinable = false;
 }
+  if (!early && !opt_for_fn (callee->decl, optimize))
+{
+  e->inline_failed = CIF_FUNCTION_NOT_OPTIMIZED;
+  inlinable = false;
+}
   else if (callee->calls_comdat_local)
 {
   e->inline_failed = CIF_USES_COMDAT_LOCAL;
@@ -402,6 +407,7 @@ can_inline_edge_p (struct 

[PATCH], PR target/81348: Fix compiler segfault on -mcpu=power9 code

2017-07-07 Thread Michael Meissner
This patch fixes a typo where the wrong operand was used (a memory was used
where a register was intended), and the compiler segfaulted.

I did the bootstrap and make check with no regressions on a little-endian power8
system.  Can I install this in the trunk and the gcc 7.x branch (it isn't an
issue on the gcc 6.x branch)?

[gcc]
2017-07-07  Michael Meissner  

PR target/81348
* config/rs6000/rs6000.md (HI sign_extend splitter): Use the
correct operand in doing the split.

[gcc/testsuite]
2017-07-07  Michael Meissner  

PR target/81348
* gcc.target/powerpc/pr81348.c: New test.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 250043)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -940,7 +940,7 @@ (define_split
(set (match_dup 0)
(sign_extend:EXTHI (match_dup 2)))]
 {
-  operands[2] = gen_rtx_REG (HImode, REGNO (operands[1]));
+  operands[2] = gen_rtx_REG (HImode, REGNO (operands[0]));
 })
 
 (define_insn_and_split "*extendhi2_dot"
Index: gcc/testsuite/gcc.target/powerpc/pr81348.c
===
--- gcc/testsuite/gcc.target/powerpc/pr81348.c  (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr81348.c  (working copy)
@@ -0,0 +1,24 @@
+/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -Og" } */
+
+/* PR target/81348: Compiler died in doing short->float conversion due to using
+   the wrong register in a define_split.  */
+
+int a;
+short b;
+float ***c;
+
+void d(void)
+{
+int e = 3;
+
+if (a)
+e = b;
+
+***c = e;
+}
+
+/* { dg-final { scan-assembler {\mlxsihzx\M}  } } */
+/* { dg-final { scan-assembler {\mvextsh2d\M} } } */


Re: [PATCH], PowerPC target_clones minor support

2017-07-07 Thread Segher Boessenkool
On Wed, Jun 28, 2017 at 02:28:23PM -0400, Michael Meissner wrote:
> Some minor changes to the PowerPC target_clones support:
> 
> 1) I added a warning if target_clones was used and the compiler was configured
> with an older glibc where __builtin_cpu_supports always returns 0;
> 
> 2) I reworked how the ifunc resolver function is generated, and always made it
> a static function;
> 
> 3) I added an executable target_clones test, and I made both clone tests
> dependent on GCC being configured with a new glibc.

>   * config/rs6000/rs6000.c
>   (rs6000_get_function_versions_dispatcher): Add warning if the
>   compiler is not configured to use at least GLIBC version 2.23.

Please say what is really tested for here (namely,
TARGET_LIBC_PROVIDES_HWCAP_IN_TCB).

>/* Append the filename to the resolver function if the versions are
>   not externally visible.  This is because the resolver function has
>   to be externally visible for the loader to find it.  So, appending
>   the filename will prevent conflicts with a resolver function from
>   another module which is based on the same version name.  */
> -  char *resolver_name = make_unique_name (default_decl, "resolver", is_uniq);
> +  tree decl_name = clone_function_name (default_decl, "resolver");
> +  const char *resolver_name = IDENTIFIER_POINTER (decl_name);

I think the comment needs some updating now?

> --- gcc/testsuite/gcc.target/powerpc/clone2.c 
> (.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc)
>  (revision 0)
> +++ gcc/testsuite/gcc.target/powerpc/clone2.c 
> (.../gcc/testsuite/gcc.target/powerpc)  (revision 249738)
> @@ -0,0 +1,31 @@
> +/* { dg-do run { target { powerpc*-*-linux* } } } */
> +/* { dg-options "-mvsx -O2" } */
> +/* { dg-require-effective-target powerpc_p9vector_ok } */
> +/* { dg-require-effective-target ppc_cpu_supports_hw } */

What a funny name (it reads as "the CPU supports the hardware").  Yes
I'm easily amused ;-)

The patch is okay for trunk with those things looked at.  Sorry
for the slow review.


Segher


Re: [PR80693] drop value of parallel SETs dropped by combine

2017-07-07 Thread Segher Boessenkool
Hi again, sorry for the delay,

On Fri, Jun 23, 2017 at 11:01:12PM -0300, Alexandre Oliva wrote:
> > Things should probably be restructured a bit so we keep the sets count
> > correct, if that is possible?
> 
> I'll have to think a bit to figure out the exact conditions in which to
> decrement the sets count, and reset the recorded value.  I was thinking
> the conditions were the same; am I missing something?
> 
> Or are you getting at cases in which we should do both and don't, or
> vice-versa?  E.g., if reg_referenced_p holds but the subsequent test
> doesn't?  I guess we do, but don't we have to distinguish the cases of
> an original unused set remaining from that of reusing the pseudo for a
> new set?
> 
> Do we have to test whether from_insn still reg_sets_p the REG_UNUSED
> operand, when from_insn is not i3?  (e.g., it could be something that
> remains set in i1 as a side effect, but that's not used in either i2 or
> i3)
> 
> Am I overdoing this?  The situations I had to analyze in the patch I
> posted before were much simpler, and even then I now think I missed a
> number of them :-)

Yeah you're overdoing it ;-)  I meant, just double check if your new
code does the correct thing for the set count.  It wasn't obvious to
me (this code is horribly complicated).  Whether all existing code is
correct...  it's probably best not to look too closely :-/

If you have a patch you feel confident in, could you post it again
please?


Segher


p.s.  What I still want to do is never reuse a set pseudo, always
create a new pseudo instead.  This will get rid of many existing bugs,
and a lot of the complications in existing code.  Unfortunately not
everything is set up yet for creating new pseudos during combine.


Re: [PATCH] Fix expand_builtin_atomic_fetch_op for pre-op (PR80902)

2017-07-07 Thread Segher Boessenkool
On Thu, Jun 22, 2017 at 10:59:05PM -0600, Jeff Law wrote:
> On 05/28/2017 06:31 AM, Segher Boessenkool wrote:
> > __atomic_add_fetch adds a value to some memory, and returns the result.
> > If there is no direct support for this, expand_builtin_atomic_fetch_op
> > is asked to implement this as __atomic_fetch_add (which returns the
> > original value of the mem), followed by the addition.  Now, the
> > __atomic_add_fetch could have been a tail call, but we shouldn't
> > perform the __atomic_fetch_add as a tail call: following code would
> > not be executed, and in fact thrown away because there is a barrier
> > after tail calls.
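
To make the distinction concrete, here is a minimal C sketch of the two forms. It assumes a compiler that provides the GCC `__atomic` builtins; the function names are made up for illustration and do not appear in the patch.

```c
/* Sketch only: the helper names below are hypothetical.  */

/* Emulate __atomic_add_fetch the way the fallback expansion is described:
   perform the fetch-add, then redo the addition on the returned old value.
   The trailing add is the "following code" that must still run, which is
   why the fetch-add itself must not be emitted as a tail call.  */
static int
add_fetch_via_fetch_add (int *mem, int val)
{
  int old = __atomic_fetch_add (mem, val, __ATOMIC_SEQ_CST);
  return old + val;
}

/* The direct form: adds VAL to *MEM and returns the new value.  */
static int
add_fetch_direct (int *mem, int val)
{
  return __atomic_add_fetch (mem, val, __ATOMIC_SEQ_CST);
}
```

Both helpers return the updated value, so callers cannot tell them apart; the difference is only in how the compiler has to sequence the generated code.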

> Hmmm.  I wonder if we have similar problems elsewhere.  For example
> expand_builtin_int_roundingfn_2, stack_protect_epilogue,
> expand_builtin_trap (though this one probably isn't broken in practice),
> expand_ifn_atomic_compare_exchange_into_call.
> 
> OK, but please check the other instances where we call expand_call, then
> continue generating code afterwards.  Fixing those can be a follow-up patch.

We certainly have similar problems elsewhere.

I'm doing tests detecting whenever we create dead code (right after a
barrier); it finds a few things, mostly harmless, but there are quite
a few places where we create dead code during expand.  This will take
a while, but it will need to happen during stage 1...  I'm trying to
fit it in :-/


Segher


Re: [PATCH-v3] [SPARC] Add a workaround for the LEON3FT store-store errata

2017-07-07 Thread Daniel Cederman


On 2017-07-07 12:01, Eric Botcazou wrote:
> > We can drop the define if necessary, but we would like to keep the two
> > flags. Would that be OK to apply?
>
> Yes, OK to apply on mainline and 7 branch with this change, thanks.



Great! Would you mind applying the patch for us? The only person here
with write access just went on vacation. I have submitted a new version
(v4) with the change that applies to both main and 7.


--
Daniel Cederman
Cobham Gaisler


[PATCH-v4] [SPARC] Add a workaround for the LEON3FT store-store errata

2017-07-07 Thread Daniel Cederman
This patch adds a workaround to the Sparc backend for the LEON3FT
store-store errata. It is enabled when using the -mfix-ut699,
-mfix-ut700, or -mfix-gr712rc flag.

The workaround inserts NOP instructions to prevent the following two
instruction sequences from being generated:

std -> stb/sth/st/std
stb/sth/st -> any single non-store/load instruction -> stb/sth/st/std

The __FIX_B2BST define can be used to only enable workarounds in assembly
code when the flag is used.

See GRLIB-TN-0009, "LEON3FT Stale Cache Entry After Store with Data Tag
Parity Error", for more information.
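
As an illustration of the two sequences described above, the following is a simplified C model of the check. It is only a sketch: the `insn_kind` enum and `needs_nop_after` are hypothetical names, it works on a flat array instead of GCC's RTL, and it ignores branches, delay slots and empty asm statements, all of which the real pass has to handle.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical instruction classes for this sketch; not GCC's RTL.  */
enum insn_kind { INSN_OTHER, INSN_LOAD, INSN_STORE_SMALL, INSN_STORE_DOUBLE };

static bool
is_store (enum insn_kind k)
{
  return k == INSN_STORE_SMALL || k == INSN_STORE_DOUBLE;
}

/* Return true if a NOP would be needed after insns[i] to break up one of
   the two problematic sequences.  */
static bool
needs_nop_after (const enum insn_kind *insns, size_t n, size_t i)
{
  if (i + 1 >= n || !is_store (insns[i]))
    return false;

  /* std -> any store.  */
  if (insns[i] == INSN_STORE_DOUBLE)
    return is_store (insns[i + 1]);

  /* stb/sth/st -> one non-load/store instruction -> any store.
     A load or store in the middle makes the sequence harmless.  */
  if (insns[i + 1] == INSN_LOAD || is_store (insns[i + 1]))
    return false;
  return i + 2 < n && is_store (insns[i + 2]);
}
```

For example, a word store followed by an add and then another store is flagged, while a word store directly followed by another store is not.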

gcc/ChangeLog:

2017-07-07  Daniel Cederman  

* config/sparc/sparc.c (sparc_do_work_around_errata): Insert NOP
instructions to prevent sequences that can trigger the store-store
errata for certain LEON3FT processors.
(sparc_option_override): -mfix-ut699, -mfix-ut700, and
-mfix-gr712rc enables the errata workaround.
* config/sparc/sparc.md: Prevent stores in delay slot.
* config/sparc/sparc.opt: Add -mfix-ut700 and -mfix-gr712rc flag.
* doc/invoke.texi: Document -mfix-ut700 and -mfix-gr712rc flag.
---
 gcc/config/sparc/sparc.c   | 98 +-
 gcc/config/sparc/sparc.md  | 10 -
 gcc/config/sparc/sparc.opt | 12 ++
 gcc/doc/invoke.texi        | 14 ++-
 4 files changed, 128 insertions(+), 6 deletions(-)

diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index 790a036..ebf2eda 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -896,6 +896,12 @@ mem_ref (rtx x)
to properly detect the various hazards.  Therefore, this machine specific
pass runs as late as possible.  */
 
+/* True if INSN is a md pattern or asm statement.  */
+#define USEFUL_INSN_P(INSN)\
+  (NONDEBUG_INSN_P (INSN)  \
+   && GET_CODE (PATTERN (INSN)) != USE \
+   && GET_CODE (PATTERN (INSN)) != CLOBBER)
+
 static unsigned int
 sparc_do_work_around_errata (void)
 {
@@ -915,6 +921,81 @@ sparc_do_work_around_errata (void)
if (rtx_sequence *seq = dyn_cast <rtx_sequence *> (PATTERN (insn)))
  insn = seq->insn (1);
 
+  /* Look for either of these two sequences:
+
+Sequence A:
+1. store of word size or less (e.g. st / stb / sth / stf)
+2. any single instruction that is not a load or store
+3. any store instruction (e.g. st / stb / sth / stf / std / stdf)
+
+Sequence B:
+1. store of double word size (e.g. std / stdf)
+2. any store instruction (e.g. st / stb / sth / stf / std / stdf)  */
+  if (sparc_fix_b2bst
+ && NONJUMP_INSN_P (insn)
+ && (set = single_set (insn)) != NULL_RTX
+ && MEM_P (SET_DEST (set)))
+   {
+ /* Sequence B begins with a double-word store.  */
+ bool seq_b = GET_MODE_SIZE (GET_MODE (SET_DEST (set))) == 8;
+ rtx_insn *after;
+ int i;
+
+ next = next_active_insn (insn);
+ if (!next)
+   break;
+
+ for (after = next, i = 0; i < 2; i++)
+   {
+ /* Skip empty assembly statements.  */
+ if ((GET_CODE (PATTERN (after)) == UNSPEC_VOLATILE)
+ || (USEFUL_INSN_P (after)
+ && (asm_noperands (PATTERN (after))>=0)
+ && !strcmp (decode_asm_operands (PATTERN (after),
+  NULL, NULL, NULL,
+  NULL, NULL), "")))
+   after = next_active_insn (after);
+ if (!after)
+   break;
+
+ /* If the insn is a branch, then it cannot be problematic.  */
+ if (!NONJUMP_INSN_P (after)
+ || GET_CODE (PATTERN (after)) == SEQUENCE)
+   break;
+
+ /* Sequence B is only two instructions long.  */
+ if (seq_b)
+   {
+ /* Add NOP if followed by a store.  */
+ if ((set = single_set (after)) != NULL_RTX
+ && MEM_P (SET_DEST (set)))
+   insert_nop = true;
+
+ /* Otherwise it is ok.  */
+ break;
+   }
+
+ /* If the second instruction is a load or a store,
+then the sequence cannot be problematic.  */
+ if (i == 0)
+   {
+ if (((set = single_set (after)) != NULL_RTX)
+ && (MEM_P (SET_DEST (set)) || MEM_P (SET_SRC (set))))
+   break;
+
+ after = next_active_insn (after);
+ if (!after)
+   break;
+   }
+
+ /* Add NOP if third instruction is a store.  */
+ if (i == 1
+ && ((set = single_set (after)) != NULL_RTX)
+ && MEM_P 

[PATCH][AArch64] Improve aarch64_legitimate_constant_p

2017-07-07 Thread Wilco Dijkstra
This patch further improves aarch64_legitimate_constant_p.  Allow all
integer, floating point and vector constants.  Allow label references
and non-anchor symbols with an immediate offset.  This allows such
constants to be rematerialized, resulting in smaller code and fewer stack
spills.
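
The decision order described above can be modelled roughly as follows. This is a toy sketch, not the hook itself: `struct expr`, the `K_*` codes and `legitimate_constant_p` are hypothetical stand-ins for GCC's RTL and for `aarch64_legitimate_constant_p`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for RTL codes; not GCC's real data structures.  */
enum kind { K_CONST_INT, K_CONST_DOUBLE, K_CONST_VECTOR, K_CONST_WIDE_INT,
            K_SYMBOL, K_LABEL, K_HIGH, K_CONST_PLUS, K_OTHER };

struct expr
{
  enum kind kind;
  bool is_anchor;            /* K_SYMBOL: section anchor symbol.  */
  bool is_tls;               /* K_SYMBOL: thread-local symbol.  */
  const struct expr *base;   /* K_HIGH / K_CONST_PLUS: wrapped operand.  */
};

static bool
legitimate_constant_p (bool vect_struct_mode, const struct expr *x)
{
  /* Integer, floating-point and vector constants are always accepted.  */
  if (x->kind == K_CONST_INT || x->kind == K_CONST_DOUBLE
      || x->kind == K_CONST_VECTOR)
    return true;

  /* Vector struct modes and wide integers are rejected.  */
  if (vect_struct_mode || x->kind == K_CONST_WIDE_INT)
    return false;

  /* symbol + offset: look at the symbol, reject anchors.  */
  if (x->kind == K_CONST_PLUS)
    {
      x = x->base;
      if (x->kind == K_SYMBOL && x->is_anchor)
        return false;
    }

  if (x->kind == K_HIGH)
    x = x->base;

  /* Non-TLS symbols and label references count as constants.  */
  if (x->kind == K_SYMBOL && !x->is_tls)
    return true;
  return x->kind == K_LABEL;
}
```

In this model a plain symbol plus offset is accepted, while an anchor plus offset or a TLS symbol is not.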

SPEC2006 codesize reduces by 0.08%, SPEC2017 by 0.13%.

Bootstrap OK, OK for commit?

ChangeLog:
2017-07-07  Wilco Dijkstra  

* config/aarch64/aarch64.c (aarch64_legitimate_constant_p):
Return true for more constants, symbols and label references.
(aarch64_valid_floating_const): Remove unused function.

--
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index a2eca64a9c13e44d223b5552c079ef4e09659e84..810c17416db01681e99a9eb8cc9f5af137ed2054 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -10173,49 +10173,46 @@ aarch64_legitimate_pic_operand_p (rtx x)
   return true;
 }
 
-/* Return true if X holds either a quarter-precision or
- floating-point +0.0 constant.  */
-static bool
-aarch64_valid_floating_const (machine_mode mode, rtx x)
-{
-  if (!CONST_DOUBLE_P (x))
-return false;
-
-  if (aarch64_float_const_zero_rtx_p (x))
-return true;
-
-  /* We only handle moving 0.0 to a TFmode register.  */
-  if (!(mode == SFmode || mode == DFmode))
-return false;
-
-  return aarch64_float_const_representable_p (x);
-}
+/* Implement TARGET_LEGITIMATE_CONSTANT_P hook.  Return true for constants
+   that should be rematerialized rather than spilled.  */
 
 static bool
 aarch64_legitimate_constant_p (machine_mode mode, rtx x)
 {
+  /* Support CSE and rematerialization of common constants.  */
+  if (CONST_INT_P (x) || CONST_DOUBLE_P (x) || GET_CODE (x) == CONST_VECTOR)
+return true;
+
   /* Do not allow vector struct mode constants.  We could support
  0 and -1 easily, but they need support in aarch64-simd.md.  */
-  if (TARGET_SIMD && aarch64_vect_struct_mode_p (mode))
+  if (aarch64_vect_struct_mode_p (mode))
 return false;
 
-  /* This could probably go away because
- we now decompose CONST_INTs according to expand_mov_immediate.  */
-  if ((GET_CODE (x) == CONST_VECTOR
-   && aarch64_simd_valid_immediate (x, mode, false, NULL))
-  || CONST_INT_P (x) || aarch64_valid_floating_const (mode, x))
-   return !targetm.cannot_force_const_mem (mode, x);
+  /* Do not allow wide int constants - this requires support in movti.  */
+  if (CONST_WIDE_INT_P (x))
+return false;
 
-  if (GET_CODE (x) == HIGH
-  && aarch64_valid_symref (XEXP (x, 0), GET_MODE (XEXP (x, 0))))
-return true;
+  /* Do not allow const (plus (anchor_symbol, const_int)).  */
+  if (GET_CODE (x) == CONST && GET_CODE (XEXP (x, 0)) == PLUS)
+  {
+x = XEXP (XEXP (x, 0), 0);
+if (SYMBOL_REF_P (x) && SYMBOL_REF_ANCHOR_P (x))
+  return false;
+  }
+
+  if (GET_CODE (x) == HIGH)
+x = XEXP (x, 0);
 
   /* Treat symbols as constants.  Avoid TLS symbols as they are complex,
  so spilling them is better than rematerialization.  */
   if (SYMBOL_REF_P (x) && !SYMBOL_REF_TLS_MODEL (x))
 return true;
 
-  return aarch64_constant_address_p (x);
+  /* Label references are always constant.  */
+  if (GET_CODE (x) == LABEL_REF)
+return true;
+
+  return false;
 }
 
 rtx

[patch] Fix ICE on CONSTRUCTOR containing absolute addresses

2017-07-07 Thread Eric Botcazou
Hi,

this fixes the following ICE in decode_addr_const:

+===GNAT BUG DETECTED==+
| 8.0.0 20170704 (experimental) [trunk revision 249942] (x86_64-suse-linux) GCC error:|
| in decode_addr_const, at varasm.c:2880 

stemming from a CONSTRUCTOR containing absolute addresses hidden behind a 
COMPONENT_REF or similar references.

Fixed by adding support for INDIRECT_REF <INTEGER_CST> to decode_addr_const.

Tested on x86_64-suse-linux, OK for the mainline?


2017-07-07  Eric Botcazou  

* varasm.c (decode_addr_const): Deal with INDIRECT_REF <INTEGER_CST>.


2017-07-07  Eric Botcazou  

* gnat.dg/aggr22.ad[sb]: New test.

-- 
Eric Botcazou

Index: varasm.c
===
--- varasm.c	(revision 249942)
+++ varasm.c	(working copy)
@@ -2876,6 +2876,13 @@ decode_addr_const (tree exp, struct addr
   x = output_constant_def (target, 1);
   break;
 
+case INDIRECT_REF:
+  /* This deals with absolute addresses.  */
+  offset += tree_to_shwi (TREE_OPERAND (target, 0));
+  x = gen_rtx_MEM (QImode,
+		   gen_rtx_SYMBOL_REF (Pmode, "origin of addresses"));
+  break;
+
 default:
   gcc_unreachable ();
 }

-- { dg-do compile }

package body Aggr22 is

   type Ptr is access all Integer;
   type Arr is array (Positive range <>) of Ptr;

   procedure Proc is
      A : Arr (1 .. 33);
   begin
      A := (1 => null, 2 .. 32 => My_Rec.I'Access, 33 => null);
   end;

end Aggr22;

with System;

package Aggr22 is

   type Rec is record
      C : Character;
      I : aliased Integer;
   end record;

   My_Rec : aliased Rec;
   pragma Import (Ada, My_Rec);
   for My_Rec'Address use System'To_Address (16#40086000#);

   procedure Proc;

end Aggr22;


Re: [PATCH V2 0/7] Support for the SPARC M8 cpu

2017-07-07 Thread David Miller
From: jose.march...@oracle.com (Jose E. Marchesi)
Date: Fri, 07 Jul 2017 12:53:37 +0200

> I will be committing to svn in both trunk and the gcc 7 branch.

Thank you for doing this work.


Re: [PATCH V2 0/7] Support for the SPARC M8 cpu

2017-07-07 Thread Jose E. Marchesi

Hi Eric.

> > This patch serie adds support for the SPARC M8 processor to GCC.
> > The SPARC M8 processor implements the Oracle SPARC Architecture 2017.
>
> Thanks for the contribution!

Thank you for the review :)

> > The first four patches are preparatory work:
> >
> > - bmask* instructions are put in their own instruction type.  It makes
> >   little sense to have them in the same category than array
> >   instructions.
> >
> > - Similarly, VIS compare instructions are put in their own instruction
> >   type.  This is to better accommodate subtypes, which are not quite
> >   the same than the subtypes of `visl' instructions.
> >
> > - The introduction of a new `subtype' insn attribute in sparc.md
> >   avoids the need for adjusting the instruction scheduler DFAs for
> >   previous cpu models every time a new cpu is introduced.
> >
> > - The full set of SPARC instructions used in sparc.md, and their
> >   position in the type/subtype hierarchy, is documented in a comment.
> >   This eases the modification of the DFA schedulers, and the addition
> >   of new cpus.
> >
> > - The M7 DFA scheduler is reworked:
> >
> >   + To use the new type/subtype hierarchy.
> >   + The v3pipe insn attribute is no longer needed.
> >   + More accurate latencies for instructions.
> >   + The C4 core pipeline is documented in a comment in niagara7.md.
>
> S4 core I presume?

The cores are known internally as both S4/S5 and C4/C5 (dunno why) so I
tend to use both denominations randomly.  Sorry for the confusion: S* is
better established externally I think.

> > The next three patches introduce M8 support proper:
> >
> > - Support for -mcpu=m8 (we are thus suggesting to abandon the niagaraN
> >   denomination for M8 and later processors.)
>
> If this mirrors the established practice, no objections by me.

> > - Support for a new VIS level, VIS4B, covering the new VIS
> >   instructions introduced in OSA2017 and implemented in the M8.  Also
> >   built-ins.
> >
> >   Note that no new VIS level was formally introduced in OSA2017, even
> >   if many new VIS instructions were added to the spec.  We introduced
> >   VIS4B for coherence (like availability of builtins and visintrin.h
> >   depending on the value of __VIS__) and avoided using VIS5 in case it
> >   is introduced in future versions of the Oracle SPARC Architecture.
>
> This sounds sensible indeed.

> > - A M8 DFA scheduler:
> >
> >   + Also based on the new type/subtype hierarchy.
> >   + The functional units in the C5 core are explicitly documented in a
> >     comment in m8.md.
>
> S5 core I presume?

> > See the individual patch descriptions for more information and
> > associated ChangeLog entries.
> >
> > After this serie gets integrated upstream we will be contributing more
> > support for M8 capabilities, such as support for using the new
> > misaligned load/store instructions for memory accesses known to be
> > misaligned at compile-time.
> >
> > Note that full binutils support for M8 was upstreamed in May 19.
> >
> > Bootstrapped and tested in sparc64-linux-gnu.  No regressions.
> > Bootstrapped and tested in sparc-sun-solaris2.12.  No regressions.
>
> OK for mainline and 7 branch (I can do the backport to the branch).

I will be committing to svn in both trunk and the gcc 7 branch.
Thanks again.


Re: [PATCH V2 0/7] Support for the SPARC M8 cpu

2017-07-07 Thread Eric Botcazou
> This patch serie adds support for the SPARC M8 processor to GCC.
> The SPARC M8 processor implements the Oracle SPARC Architecture 2017.

Thanks for the contribution!

> The first four patches are preparatory work:
> 
> - bmask* instructions are put in their own instruction type.  It makes
>   little sense to have them in the same category than array
>   instructions.
> 
> - Similarly, VIS compare instructions are put in their own instruction
>   type.  This is to better accommodate subtypes, which are not quite
>   the same than the subtypes of `visl' instructions.
> 
> - The introduction of a new `subtype' insn attribute in sparc.md
>   avoids the need for adjusting the instruction scheduler DFAs for
>   previous cpu models every time a new cpu is introduced.
> 
> - The full set of SPARC instructions used in sparc.md, and their
>   position in the type/subtype hierarchy, is documented in a comment.
>   This eases the modification of the DFA schedulers, and the addition
>   of new cpus.
> 
> - The M7 DFA scheduler is reworked:
> 
>   + To use the new type/subtype hierarchy.
>   + The v3pipe insn attribute is no longer needed.
>   + More accurate latencies for instructions.
>   + The C4 core pipeline is documented in a comment in niagara7.md.

S4 core I presume?

> The next three patches introduce M8 support proper:
> 
> - Support for -mcpu=m8 (we are thus suggesting to abandon the niagaraN
>   denomination for M8 and later processors.)

If this mirrors the established practice, no objections by me.

> - Support for a new VIS level, VIS4B, covering the new VIS
>   instructions introduced in OSA2017 and implemented in the M8.  Also
>   built-ins.
> 
>   Note that no new VIS level was formally introduced in OSA2017, even
>   if many new VIS instructions were added to the spec.  We introduced
>   VIS4B for coherence (like availability of builtins and visintrin.h
>   depending on the value of __VIS__) and avoided using VIS5 in case it
>   is introduced in future versions of the Oracle SPARC Architecture.

This sounds sensible indeed.

> - A M8 DFA scheduler:
> 
>   + Also based on the new type/subtype hierarchy.
>   + The functional units in the C5 core are explicitly documented in a
> comment in m8.md.

S5 core I presume?

> See the individual patch descriptions for more information and
> associated ChangeLog entries.
> 
> After this serie gets integrated upstream we will be contributing more
> support for M8 capabilities, such as support for using the new
> misaligned load/store instructions for memory accesses known to be
> misaligned at compile-time.
> 
> Note that full binutils support for M8 was upstreamed in May 19.
> 
> Bootstrapped and tested in sparc64-linux-gnu.  No regressions.
> Bootstrapped and tested in sparc-sun-solaris2.12.  No regressions.

OK for mainline and 7 branch (I can do the backport to the branch).

-- 
Eric Botcazou


Re: [PATCH, rs6000] Modify libgcc's float128 IFUNC resolver functions to use __builtin_cpu_supports()

2017-07-07 Thread Florian Weimer
On 07/06/2017 11:21 PM, Peter Bergner wrote:
> I will note that this patch causes issues in some tests in the GLIBC testsuite,
> which Tulio is working on fixing (it's a GLIBC issue, not a GCC issue), so if
> this patch is "ok", I plan on holding off on committing this, until the GLIBC
> fix is committed.

This issue currently blocks Fedora rawhide development.  I would prefer
if we could move this forward despite temporary glibc test suite failures.

Thanks,
Florian


Re: [PATCH-v3] [SPARC] Add a workaround for the LEON3FT store-store errata

2017-07-07 Thread Eric Botcazou
> We can drop the define if necessary, but we would like to keep the two
> flags. Would that be OK to apply?

Yes, OK to apply on mainline and 7 branch with this change, thanks.

-- 
Eric Botcazou


[PING^3][PATCH][Aarch64] Relational compare zero not merged into subtract

2017-07-07 Thread Michael Collison
Ping^3. Original patch posted here:

https://gcc.gnu.org/ml/gcc-patches/2017-06/msg00091.html