Re: [Patch, fortran] PR 37131, inline matmul

2015-05-05 Thread Mikael Morin
Le 05/05/2015 08:25, Thomas Koenig a écrit :
 Hello world,
 
 this is an update of the matmul inline patch.  The only difference to
 the last version is that it has the ubound simplification taken out.

Sorry, I delayed this, but it wasn't (yet) forgotten.

Below a few answers to
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01247.html
and documentation fixes.


   * simplify.c (simplify_bound):  Simplify the case of the
   lower bound of an assumed-shape argument.
Entry to be removed. ;-)


 Index: fortran/array.c
 ===
 --- fortran/array.c (Revision 18)
 +++ fortran/array.c (Arbeitskopie)
 @@ -338,6 +338,9 @@ gfc_resolve_array_spec (gfc_array_spec *as, int ch
if (as == NULL)
  return true;
  
 +  if (as-resolved)
 +return true;
 +
 Why this?
 
 Because you get regressions otherwise.  Not resolving an array spec
 twice should do no harm, and resolving it twice does so - I hit the
 error message in check_restricted.  I'm not sure what is wrong, maybe
 PR 23466 was not fully fixed, but this works.
 
Hum, it seems to work without it here.  Can you double check?

 @@ -524,29 +542,11 @@ constant_string_length (gfc_expr *e)
  
  }
  
 -/* Returns a new expression (a variable) to be used in place of the old 
 one,
 -   with an assignment statement before the current statement to set
 -   the value of the variable. Creates a new BLOCK for the statement if
 -   that hasn't already been done and puts the statement, plus the
 -   newly created variables, in that block.  Special cases:  If the
 -   expression is constant or a temporary which has already
 -   been created, just copy it.  */
 -
 -static gfc_expr*
 -create_var (gfc_expr * e)
 Keep a comment here.
 
 Still exists, further down.
 
I don't mind the comment being moved around together with create_var. :-)
I was asking for a comment for the new function insert_block.


 
 + gfc_simplify_expr (ar-start[i], 0);
 +   }
 + else if (was_fullref)
 +   {
 + ar-dimen_type[i] = DIMEN_RANGE;
 + ar-start[i] = NULL;
 + ar-end[i] = NULL;
 + ar-stride[i] = NULL;
 +   }
 Is this reachable ?
 
 Not in the current incarnation, I wanted to keep it around for
 a full segment later.  I can also remove this.
 
At least add an assert, a comment, something telling that it's not used.
I would rather remove it until it's actually used, but I can live with
it, if it becomes used shortly.


 [...]

 Index: fortran/options.c
 ===
 --- fortran/options.c   (Revision 18)
 +++ fortran/options.c   (Arbeitskopie)
[...]
 +
 +  if (flag_external_blas  flag_inline_matmul_limit  0)
 +flag_inline_matmul_limit = flag_blas_matmul_limit;
 Hum, shouldn't we do something for flag_inline_matmul_limit  0 as well?
 
 
 This is done automatically, by the options machinery.  That is cool 
 
Huh? is it?
I was talking about this:
Using -fblas-matmul-limit=10, one gets inlining until 10.
Using -fblas-matmul-limit=10 -finline-matmul-limit=20 the inlining limit
is _increased_ to 20.

It is of course questionable for a user to specify a low limit for blas
being lower than the high limit for inlining, so that we have a
contradictory specification for the matrix sizes in between.
But I think it would make more sense to have blas take precedence, that
is have flag_inline_matmul_limit clamped to flag_blas_matmul_limit, even
for flag_inline_matmul_limit  0.
Is this done by the options machinery?

Either way, I don't mind too much, this is a corner case.


 Index: invoke.texi
 ===
 --- invoke.texi   (Revision 18)
 +++ invoke.texi   (Arbeitskopie)
 @@ -1537,6 +1538,20 @@ geometric mean of the dimensions of the argument a
  
  The default value for @var{n} is 30.
  
 +@item -finline-matmul-limit=@var{n}
 +@opindex @code{finline-matmul-limit}
 +When front-end optimiztion is active,
optimization
 some calls to the @code{MATMUL}
 +intrinsic function will be inlined.
s/some calls ... inlined/calls ... inlined for small matrix sizes/
or something like that.

  Setting
 +@code{-finline-matmul-limit=0} will disable inlining in all cases.

 +Setting this option it to a specified value will call the library
 +routines for matrices with size larger than @var{n}. 
s/it //
Setting ... value @var{n} will produce inline code for matrix sizes up
to @var{n}; the library routines will be called for bigger matrices.

Maybe add something about code bloat here.  Suggestion:

This may result in code size increase if the matrix size can't be
determined at compile time, as code for both cases is generated.

 If the matrices
 +involved are not square, the size comparison is performed using the
 +geometric mean of the dimensions of the argument and result matrices.
 +
 +The default value for @var{n} is the value specified for
 

Re: [PING^3] [PATCH] [AArch64, NEON] Improve vmulX intrinsics

2015-05-05 Thread Jiangjiji
Hi James,

Thanks for your comment.

Seems we need a 'dup' before 'fmul' if we use the GCC vector extension syntax 
way.

Example:
dup v1.2s, v1.s[0]
fmulv0.2s, v1.2s, v0.2s

And we need another pattern to combine this two insns into 'fmul 
%0.2s,%1.2s,%2.s[0]', which is kind of complex.

BTW: maybe it's better to reconsider this issue after this patch, right?


Thanks.
Jiang jiji



On Sat, Apr 11, 2015 at 11:37:47AM +0100, Jiangjiji wrote:
 Hi,
   This is a ping for: https://gcc.gnu.org/ml/gcc-patches/2015-03/msg00772.html
   Regtested with aarch64-linux-gnu on QEMU.
   This patch has no regressions for aarch64_be-linux-gnu big-endian target 
 too.
   OK for the trunk?
 
 Thanks.
 Jiang jiji
 
 
 --
 Re: [PING^2] [PATCH] [AArch64, NEON] Improve vmulX intrinsics
 
 Hi, Kyrill
   Thank you for your suggestion.
   I fixed it and regtested with aarch64-linux-gnu on QEMU.
   This patch has no regressions for aarch64_be-linux-gnu big-endian target 
 too.
   OK for the trunk?

Hi Jiang,

I'm sorry that I've taken so long to get to this, I've been out of office for 
several weeks. I have one comment.

 +__extension__ static __inline float32x2_t __attribute__ 
 +((__always_inline__))
 +vmul_n_f32 (float32x2_t __a, float32_t __b) {
 +  return __builtin_aarch64_mul_nv2sf (__a, __b); }
 +

For vmul_n_* intrinsics, is there a reason we don't want to use the GCC vector 
extension syntax to allow us to write these as:

  __extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
  vmul_n_f32 (float32x2_t __a, float32_t __b)
  {
return __a * __b;
  }

It would be great if we could make that work.

Thanks,
James



Re: [Bug target/66015] New: align directives not propagated after __attribute__ ((__optimize__ (O2)))

2015-05-05 Thread Christian Bruel
Hi Jim, Steve, Andreas

Please find here a fix for the issue reported by Andreas
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64835 for ia64.

same than x86 and aarch64.

I don't the environment to run the testsuite for ia64. would you mind
giving it a try and verify that it fixes the issue ?

many thanks

Christian





2015-05-05  Christian Bruel  christian.br...@st.com

	PR target/66015
	* config/ia64/ia64.c (ia64_option_override): Move align_loops,
	and align_functions into ia64_override_options_after_change.

2015-05-05  Christian Bruel  christian.br...@st.com

	PR target/66015
	* gcc.target/ia64/iinline-attr-1.c: New test.

Index: gcc/config/ia64/ia64.c
===
--- gcc/config/ia64/ia64.c	(revision 222803)
+++ gcc/config/ia64/ia64.c	(working copy)
@@ -6051,10 +6051,6 @@
 
   init_machine_status = ia64_init_machine_status;
 
-  if (align_functions = 0)
-align_functions = 64;
-  if (align_loops = 0)
-align_loops = 32;
   if (TARGET_ABI_OPEN_VMS)
 flag_no_common = 1;
 
@@ -6066,6 +6062,11 @@
 static void
 ia64_override_options_after_change (void)
 {
+  if (align_functions = 0)
+align_functions = 64;
+  if (align_loops = 0)
+align_loops = 32;
+
   if (optimize = 3
!global_options_set.x_flag_selective_scheduling
!global_options_set.x_flag_selective_scheduling2)
Index: gcc/testsuite/gcc.target/ia64/iinline-attr-1.c
===
--- gcc/testsuite/gcc.target/ia64/iinline-attr-1.c	(revision 0)
+++ gcc/testsuite/gcc.target/ia64/iinline-attr-1.c	(working copy)
@@ -0,0 +1,28 @@
+/* Verify that alignment flags are set when  attribute __optimize is used.  */
+/* { dg-do compile } */
+
+extern void non_existent(int);
+
+__attribute__ ((__optimize__ (O2)))
+static void hooray ()
+{
+  non_existent (1);
+}
+
+__attribute__ ((__optimize__ (O2)))
+static void hiphip (void (*f)())
+{
+  non_existent (2);
+  f ();
+}
+
+__attribute__ ((__optimize__ (O2)))
+int test (void)
+{
+  hiphip (hooray);
+  return 0;
+}
+
+/* { dg-final { scan-assembler .align 64 } } */
+
+


Re: [PATCH, i386]: Switch x86 to TARGET_SUPPORTS_WIDE_INT

2015-05-05 Thread Evgeny Stupachenko
Hi HJ,
I've checked spec2000 performance.
Only few spec binaries differ. Anyway performance is unchanged.
Thanks,
Evgeny

On Thu, Apr 30, 2015 at 10:08 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Thu, Apr 30, 2015 at 12:01 PM, Uros Bizjak ubiz...@gmail.com wrote:
 Hello!

 Attached patch switches x86 to TARGET_SUPPORTS_WIDE_INT.

 The patch builds on the fact that build requires
 HOST_BITS_PER_WIDE_INT = 64 capable host. Taking this in account,
 noticeable blocks of code can be removed, and all but one
 immed_double_const can be removed.

 The only wide-int mode that remains is TImode.

 2015-04-30  Uros Bizjak  ubiz...@gmail.com

 * config/i386/i386.h (TARGET_SUPPORTS_WIDE_INT): New define.
 * config/i386/i386.c (ix86_legitimate_constant_p): Handle TImode
 as CONST_WIDE_INT, not CONST_DOUBLE.
 (ix86_cannot_force_const_mem): Handle CONST_WIDE_INT.
 (output_pic_addr_const): Do not handle VOIDmode CONST_DOUBLEs.
 (ix86_find_base_term): Do not check for CONST_DOUBLE.
 (ix86_print_operand): Do not handle non-FPmode CONST_DOUBLEs.
 (ix86_build_signbit_mask): Rewrite using wide ints.
 (ix86_split_to_parts) [HOST_BITS_PER_WIDE_INT  64]: Remove.
 (ix86_rtx_costs): Handle CONST_WIDE_INT.
 (find_constant): Ditto.
 * config/i386/i386.md (bts, btr, btc peepholes): Rewrite
 using gen_int_mode.
 * config/i386/predicates.md (x86_64_immediate_operand)
 case CONST_INT: Remove HOST_BITS_PER_WIDE_INT == 32 code.
 (x86_64_zext_immediate_operand): Remove CONST_DOUBLE handling.
 case CONST_INT: Remove HOST_BITS_PER_WIDE_INT == 32 code.
 (const0_operand): Also match const_wide_int.
 (constm1_operand): Ditto.
 (const1_operand): Ditto.

 Patch was bootstrapped and regression tested on x86_64-linux-gnu
 {,-m32} and i686-linux-gnu.

 I won't be able to commit the patch until Monday. H.J., can you please
 test it on your SPEC testers, so there won't be any surprises w.r.t.
 performance issues.

 Hi Igor,

 Can your team run SPEC CPU on this patch?

 Thanks.

 --
 H.J.


Re: [PATCH] Fix eipa_sra AAPCS issue (PR target/65956)

2015-05-05 Thread Richard Earnshaw
On 05/05/15 13:37, Richard Biener wrote:
 On May 5, 2015 1:01:59 PM GMT+02:00, Richard Earnshaw 
 richard.earns...@foss.arm.com wrote:
 On 05/05/15 11:54, Richard Earnshaw wrote:
 On 05/05/15 08:32, Jakub Jelinek wrote:
 On Mon, May 04, 2015 at 05:00:11PM +0200, Jakub Jelinek wrote:
 So at least changing arm_needs_doubleword_align for non-aggregates
 would
 likely not break anything that hasn't been broken already and would
 unbreak
 the majority of cases.

 Attached (untested so far).  It indeed changes code generated for
 over-aligned va_arg, but as I believe you can't properly pass those
 in the
 ... caller, this should just fix it so that va_arg handling matches
 the
 caller (and likewise for callees for named argument passing).

 The following testcase shows that eipa_sra changes alignment even
 for the
 aggregates.  Change aligned (8) to aligned (4) to see another
 possibility.

 Actually I misread it, for the aggregates esra actually doesn't
 change
 anything, which is the reason why the testcase doesn't fail.
 The problem with the scalars is that esra first changed it to the
 over-aligned MEM_REFs and then later on eipa_sra used the types of
 the
 MEM_REFs created by esra.

 2015-05-05  Jakub Jelinek  ja...@redhat.com

PR target/65956
* config/arm/arm.c (arm_needs_doubleword_align): For non-aggregate
types check TYPE_ALIGN of TYPE_MAIN_VARIANT rather than type
 itself.

* gcc.c-torture/execute/pr65956.c: New test.

 --- gcc/config/arm/arm.c.jj2015-05-04 21:51:42.0 +0200
 +++ gcc/config/arm/arm.c   2015-05-05 09:20:52.481693337 +0200
 @@ -6063,8 +6063,13 @@ arm_init_cumulative_args (CUMULATIVE_ARG
  static bool
  arm_needs_doubleword_align (machine_mode mode, const_tree type)
  {
 -  return (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY
 -|| (type  TYPE_ALIGN (type)  PARM_BOUNDARY));
 +  if (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY)
 +return true;

 I don't think this is right (though I suspect the existing code has
 the
 same problem).  We should only look at mode if there is no type
 information.  The problem is that GCC has a nasty habit of assigning
 real machine modes to things that are really BLKmode and we've run
 into
 several cases where this has royally screwed things up.  So for
 consistency in the ARM back-end we are careful to only use mode when
 type is NULL (= it's a libcall).

 +  if (type == NULL_TREE)
 +return false;
 +  if (AGGREGATE_TYPE_P (type))
 +return TYPE_ALIGN (type)  PARM_BOUNDARY;
 +  return TYPE_ALIGN (TYPE_MAIN_VARIANT (type))  PARM_BOUNDARY;
  }
  



 It ought to be possible to re-order this, though, to

  static bool
  arm_needs_doubleword_align (machine_mode mode, const_tree type)
  {
 -  return (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY
 - || (type  TYPE_ALIGN (type)  PARM_BOUNDARY));
 +  if (type != NULL_TREE)
 +{
 +  if (AGGREGATE_TYPE_P (type))
 +return TYPE_ALIGN (type)  PARM_BOUNDARY;
 +  return TYPE_ALIGN (TYPE_MAIN_VARIANT (type))  PARM_BOUNDARY;
 +}
 +  return (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY);
  }


 Either way, this would need careful cross-testing against an existing
 compiler.


 It looks as though either patch would cause ABI incompatibility for

 typedef int alignedint __attribute__((aligned((8;

 int  __attribute__((weak)) foo (int a, alignedint b)
 {return b;}

 void bar (alignedint x)
 {
  foo (1, x);
 }

 Where currently gcc uses r2 as the argument register for b in foo.
 
 And for foo (1,2) or an int typed 2nd arg?
 
 Richard.
 
 R.
 
 


As is

int foo2 (int a, int b);

void bar2 (alignedint x)
{
  foo2 (1, x);
}

...

R


Re: [PATCH] Fix eipa_sra AAPCS issue (PR target/65956)

2015-05-05 Thread Richard Earnshaw
On 05/05/15 13:46, Richard Earnshaw wrote:
 On 05/05/15 13:37, Richard Biener wrote:
 On May 5, 2015 1:01:59 PM GMT+02:00, Richard Earnshaw 
 richard.earns...@foss.arm.com wrote:
 On 05/05/15 11:54, Richard Earnshaw wrote:
 On 05/05/15 08:32, Jakub Jelinek wrote:
 On Mon, May 04, 2015 at 05:00:11PM +0200, Jakub Jelinek wrote:
 So at least changing arm_needs_doubleword_align for non-aggregates
 would
 likely not break anything that hasn't been broken already and would
 unbreak
 the majority of cases.

 Attached (untested so far).  It indeed changes code generated for
 over-aligned va_arg, but as I believe you can't properly pass those
 in the
 ... caller, this should just fix it so that va_arg handling matches
 the
 caller (and likewise for callees for named argument passing).

 The following testcase shows that eipa_sra changes alignment even
 for the
 aggregates.  Change aligned (8) to aligned (4) to see another
 possibility.

 Actually I misread it, for the aggregates esra actually doesn't
 change
 anything, which is the reason why the testcase doesn't fail.
 The problem with the scalars is that esra first changed it to the
 over-aligned MEM_REFs and then later on eipa_sra used the types of
 the
 MEM_REFs created by esra.

 2015-05-05  Jakub Jelinek  ja...@redhat.com

   PR target/65956
   * config/arm/arm.c (arm_needs_doubleword_align): For non-aggregate
   types check TYPE_ALIGN of TYPE_MAIN_VARIANT rather than type
 itself.

   * gcc.c-torture/execute/pr65956.c: New test.

 --- gcc/config/arm/arm.c.jj   2015-05-04 21:51:42.0 +0200
 +++ gcc/config/arm/arm.c  2015-05-05 09:20:52.481693337 +0200
 @@ -6063,8 +6063,13 @@ arm_init_cumulative_args (CUMULATIVE_ARG
  static bool
  arm_needs_doubleword_align (machine_mode mode, const_tree type)
  {
 -  return (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY
 -   || (type  TYPE_ALIGN (type)  PARM_BOUNDARY));
 +  if (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY)
 +return true;

 I don't think this is right (though I suspect the existing code has
 the
 same problem).  We should only look at mode if there is no type
 information.  The problem is that GCC has a nasty habit of assigning
 real machine modes to things that are really BLKmode and we've run
 into
 several cases where this has royally screwed things up.  So for
 consistency in the ARM back-end we are careful to only use mode when
 type is NULL (= it's a libcall).

 +  if (type == NULL_TREE)
 +return false;
 +  if (AGGREGATE_TYPE_P (type))
 +return TYPE_ALIGN (type)  PARM_BOUNDARY;
 +  return TYPE_ALIGN (TYPE_MAIN_VARIANT (type))  PARM_BOUNDARY;
  }
  



 It ought to be possible to re-order this, though, to

  static bool
  arm_needs_doubleword_align (machine_mode mode, const_tree type)
  {
 -  return (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY
 -|| (type  TYPE_ALIGN (type)  PARM_BOUNDARY));
 +  if (type != NULL_TREE)
 +{
 +  if (AGGREGATE_TYPE_P (type))
 +return TYPE_ALIGN (type)  PARM_BOUNDARY;
 +  return TYPE_ALIGN (TYPE_MAIN_VARIANT (type))  PARM_BOUNDARY;
 +}
 +  return (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY);
  }


 Either way, this would need careful cross-testing against an existing
 compiler.


 It looks as though either patch would cause ABI incompatibility for

 typedef int alignedint __attribute__((aligned((8;

 int  __attribute__((weak)) foo (int a, alignedint b)
 {return b;}

 void bar (alignedint x)
 {
  foo (1, x);
 }

 Where currently gcc uses r2 as the argument register for b in foo.

 And for foo (1,2) or an int typed 2nd arg?

 Richard.

 R.


 
 
 As is
 
 int foo2 (int a, int b);
 
 void bar2 (alignedint x)
 {
   foo2 (1, x);
 }
 

The real question here is why is TYPE the type of the value, rather than
the type of the formal as expressed by the prototype (or implicit
prototype in the case of variadics or KR)?  Surely this is the mid-end
passing the wrong information to the back-end.

R.

 ...
 
 R
 



Re: [PATCH] Fix eipa_sra AAPCS issue (PR target/65956)

2015-05-05 Thread Jakub Jelinek
On Tue, May 05, 2015 at 01:49:55PM +0100, Richard Earnshaw wrote:
 The real question here is why is TYPE the type of the value, rather than
 the type of the formal as expressed by the prototype (or implicit
 prototype in the case of variadics or KR)?  Surely this is the mid-end
 passing the wrong information to the back-end.

There is nothing else for unnamed arguments (KR, stdarg).
For named arguments, the backend has the option to save the fntype in
CUMULATIVE_ARGS and look it up when it needs that.  But, that will
still mean KR and stdarg will be just broken on arm.

Jakub


Re: [PATCH] Fix eipa_sra AAPCS issue (PR target/65956)

2015-05-05 Thread Jakub Jelinek
On Tue, May 05, 2015 at 02:02:19PM +0100, Richard Earnshaw wrote:
 In a literal sense, yes.  However, even KR  stdarg have standard
 promotion and conversion rules (size  int = int, floats promoted to
 double, etc).  What are those rules for GCC's overaligned types (ie
 where in the docs does it say what happens and how a back-end should
 interpret them)?  Shouldn't the mid-end be doing that work so as to

For the middle-end, the TYPE_ALIGN info on expression types is considered
useless, you can get there anything.  There is no conversion rule to what
you get for myalignedint + int, or (myalignedint) int, or (int)
myalignedint, etc.

 create a consistent view of the values passed into the back-end?  It
 seems to me that at present the back-end has to be psychic as to what is
 really happening.

No, the backend just shouldn't consider TYPE_ALIGN on the scalars, and it
seems only arm ever looks at that.

Jakub


Re: [PATCH] Fix eipa_sra AAPCS issue (PR target/65956)

2015-05-05 Thread Jakub Jelinek
On Tue, May 05, 2015 at 12:01:59PM +0100, Richard Earnshaw wrote:
  Either way, this would need careful cross-testing against an existing
  compiler.
  
 
 It looks as though either patch would cause ABI incompatibility for
 
 typedef int alignedint __attribute__((aligned((8;
 
 int  __attribute__((weak)) foo (int a, alignedint b)
 {return b;}
 
 void bar (alignedint x)
 {
   foo (1, x);
 }
 
 Where currently gcc uses r2 as the argument register for b in foo.

That is true, but as you can't call it with almost anything else:

void bar2 (alignedint x)
{
  foo (1, (alignedint) 1);
}

void bar3 (alignedint x)
{
  foo (1, x + (alignedint) 1);
}

alignedint q;

void bar4 (alignedint x)
{
  foo (1, q);
}

eextern int baz (int, int);

void bar5 (alignedint x)
{
  baz (1, x);
}

is all broken, I'd seriously doubt anybody is actually using it
successfully.
Having an ABI feature that most of the time doesn't work at
all, and occassionally happens to work, is just unsupportable.

You could add a -Wpsabi warning in the callee, warning if any of the
non-aggregate arguments has
TYPE_ALIGN (type) != TYPE_ALIGN (TYPE_MAIN_VARIANT (type))
to let users know that this really didn't work in = 5.1.
And supposedly we need the tree-sra.c patch then so that eipa_sra
doesn't create the over-aligned parameters, because -Wpsabi then would warn
about those.

In any case, this is a P1 issue that needs resolving RSN, not just on the
trunk, but also on the branch, and the tree-sra.c patch I've posted isn't
anywhere close to resolving it without fixing the backend.

I'm in the middle of bootstrapping/regtesting this on armv7hl (both
profiledbootstrap to see if it fixes it and normal one).

Jakub


Re: [PATCH] add self-tuning to x86 hardware fast path in libitm

2015-05-05 Thread Nuno Diegues
Today I have received the news that the Copyright Assignment was
completed with the FSF.


On Thu, Apr 30, 2015 at 8:10 AM, Nuno Diegues n...@ist.utl.pt wrote:

  Patch looks good to me now. It would be perhaps nice to have an
  environment variable to turn the adaptive algorithm off for tests,
  but that's not critical.

 Yes, that makes perfect sense.


  It would be also nice to test it on something else, but I understand
  it's difficult to find other software using the STM syntax.

 Indeed. I'll try to find some time to work on that, but it may take a while.


  I can't approve the patch however. I believe it's big enough that you
  may need a copy right assignment.

 I have signed a Form Assignment from the Free Software Foundation to
 deal exactly with those matters for this patch to the libitm. Torvald
 Riegel had advised me to do so.

 I have not, however, received any further information; so I'm left
 wondering if it went through or if it is still hanging. I will ping
 back to FSF to check that out perhaps?


 Best regards,
 -- Nuno Diegues



 
  -Andi
 
  --
  a...@linux.intel.com -- Speaking for myself only


Re: [PATCH] Fix eipa_sra AAPCS issue (PR target/65956)

2015-05-05 Thread Richard Earnshaw
On 05/05/15 13:54, Jakub Jelinek wrote:
 On Tue, May 05, 2015 at 01:49:55PM +0100, Richard Earnshaw wrote:
 The real question here is why is TYPE the type of the value, rather than
 the type of the formal as expressed by the prototype (or implicit
 prototype in the case of variadics or KR)?  Surely this is the mid-end
 passing the wrong information to the back-end.
 
 There is nothing else for unnamed arguments (KR, stdarg).
 For named arguments, the backend has the option to save the fntype in
 CUMULATIVE_ARGS and look it up when it needs that.  But, that will
 still mean KR and stdarg will be just broken on arm.
 
   Jakub
 

In a literal sense, yes.  However, even KR  stdarg have standard
promotion and conversion rules (size  int = int, floats promoted to
double, etc).  What are those rules for GCC's overaligned types (ie
where in the docs does it say what happens and how a back-end should
interpret them)?  Shouldn't the mid-end be doing that work so as to
create a consistent view of the values passed into the back-end?  It
seems to me that at present the back-end has to be psychic as to what is
really happening.

R.


Re: [PATCH, PR65915] Fix float conversion split.

2015-05-05 Thread Uros Bizjak
On Tue, May 5, 2015 at 3:03 PM, Ilya Tocar tocarip.in...@gmail.com wrote:
  +++ b/gcc/testsuite/gcc.target/i386/pr65915.c
  @@ -0,0 +1,6 @@
  +/* { dg-do run } */
  +/* { dg-options -O2 -mavx512f -fpic -mcmodel=medium } */
  +/* { dg-require-effective-target avx512f } */
  +/* { dg-require-effective-target lp64 } */
  +
  +#include avx512f-vrndscalepd-2.c
 
  Missing testcases for
 
  FAIL: gcc.target/i386/avx512f-vrndscaleps-2.c (test for excess errors)
  FAIL: gcc.target/i386/avx512vl-vrndscaleps-2.c (internal compiler error)

 The attached test is OK, since these two would test for the same problem.

  as well as ChangeLog entries.

 ChangeLog is missing. Please add PR number and describe *each* change
 accurately. You can say (vector convert to float spltiter) for this
 particular nameless splitter.

 Please repost the patch with updated ChangeLog.


 ChangeLog

 PR c/65915
 * config/i386/i386.md (vector convert to float spltiter): Check for
 xmm16+, when splitting scalar float conversion.
 * config/i386/sse.md (sse2_cvtsi2sd): Support EVEX version.

 And for tests

 PR c/65915
 * gcc.target/i386/pr65915.c: New.

OK for mainline.

Thanks,
Uros.


Re: [PATCH] Fix eipa_sra AAPCS issue (PR target/65956)

2015-05-05 Thread Richard Earnshaw
On 05/05/15 14:06, Jakub Jelinek wrote:
 On Tue, May 05, 2015 at 02:02:19PM +0100, Richard Earnshaw wrote:
 In a literal sense, yes.  However, even KR  stdarg have standard
 promotion and conversion rules (size  int = int, floats promoted to
 double, etc).  What are those rules for GCC's overaligned types (ie
 where in the docs does it say what happens and how a back-end should
 interpret them)?  Shouldn't the mid-end be doing that work so as to
 
 For the middle-end, the TYPE_ALIGN info on expression types is considered
 useless, you can get there anything.  There is no conversion rule to what
 you get for myalignedint + int, or (myalignedint) int, or (int)
 myalignedint, etc.
 
 create a consistent view of the values passed into the back-end?  It
 seems to me that at present the back-end has to be psychic as to what is
 really happening.
 
 No, the backend just shouldn't consider TYPE_ALIGN on the scalars, and it
 seems only arm ever looks at that.
 

Nothing in the specification for TARGET_FUNCTION_ARG (or any of the
related functions) makes any mention of this...

R.

   Jakub
 



Re: [PATCH] Fix eipa_sra AAPCS issue (PR target/65956)

2015-05-05 Thread Richard Biener
On May 5, 2015 1:01:59 PM GMT+02:00, Richard Earnshaw 
richard.earns...@foss.arm.com wrote:
On 05/05/15 11:54, Richard Earnshaw wrote:
 On 05/05/15 08:32, Jakub Jelinek wrote:
 On Mon, May 04, 2015 at 05:00:11PM +0200, Jakub Jelinek wrote:
 So at least changing arm_needs_doubleword_align for non-aggregates
would
 likely not break anything that hasn't been broken already and would
unbreak
 the majority of cases.

 Attached (untested so far).  It indeed changes code generated for
 over-aligned va_arg, but as I believe you can't properly pass those
in the
 ... caller, this should just fix it so that va_arg handling matches
the
 caller (and likewise for callees for named argument passing).

 The following testcase shows that eipa_sra changes alignment even
for the
 aggregates.  Change aligned (8) to aligned (4) to see another
possibility.

 Actually I misread it, for the aggregates esra actually doesn't
change
 anything, which is the reason why the testcase doesn't fail.
 The problem with the scalars is that esra first changed it to the
 over-aligned MEM_REFs and then later on eipa_sra used the types of
the
 MEM_REFs created by esra.

 2015-05-05  Jakub Jelinek  ja...@redhat.com

 PR target/65956
 * config/arm/arm.c (arm_needs_doubleword_align): For non-aggregate
 types check TYPE_ALIGN of TYPE_MAIN_VARIANT rather than type
itself.

 * gcc.c-torture/execute/pr65956.c: New test.

 --- gcc/config/arm/arm.c.jj 2015-05-04 21:51:42.0 +0200
 +++ gcc/config/arm/arm.c2015-05-05 09:20:52.481693337 +0200
 @@ -6063,8 +6063,13 @@ arm_init_cumulative_args (CUMULATIVE_ARG
  static bool
  arm_needs_doubleword_align (machine_mode mode, const_tree type)
  {
 -  return (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY
 - || (type  TYPE_ALIGN (type)  PARM_BOUNDARY));
 +  if (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY)
 +return true;
 
 I don't think this is right (though I suspect the existing code has
the
 same problem).  We should only look at mode if there is no type
 information.  The problem is that GCC has a nasty habit of assigning
 real machine modes to things that are really BLKmode and we've run
into
 several cases where this has royally screwed things up.  So for
 consistency in the ARM back-end we are careful to only use mode when
 type is NULL (= it's a libcall).
 
 +  if (type == NULL_TREE)
 +return false;
 +  if (AGGREGATE_TYPE_P (type))
 +return TYPE_ALIGN (type)  PARM_BOUNDARY;
 +  return TYPE_ALIGN (TYPE_MAIN_VARIANT (type))  PARM_BOUNDARY;
  }
  
 
 
 
 It ought to be possible to re-order this, though, to
 
  static bool
  arm_needs_doubleword_align (machine_mode mode, const_tree type)
  {
 -  return (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY
 -  || (type  TYPE_ALIGN (type)  PARM_BOUNDARY));
 +  if (type != NULL_TREE)
 +{
 +  if (AGGREGATE_TYPE_P (type))
 +return TYPE_ALIGN (type)  PARM_BOUNDARY;
 +  return TYPE_ALIGN (TYPE_MAIN_VARIANT (type))  PARM_BOUNDARY;
 +}
 +  return (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY);
  }
 
 
 Either way, this would need careful cross-testing against an existing
 compiler.
 

It looks as though either patch would cause ABI incompatibility for

typedef int alignedint __attribute__((aligned((8;

int  __attribute__((weak)) foo (int a, alignedint b)
{return b;}

void bar (alignedint x)
{
  foo (1, x);
}

Where currently gcc uses r2 as the argument register for b in foo.

And for foo (1,2) or an int typed 2nd arg?

Richard.

R.




Re: [PATCH] Fix eipa_sra AAPCS issue (PR target/65956)

2015-05-05 Thread Richard Earnshaw
On 05/05/15 13:37, Richard Biener wrote:
 On May 5, 2015 1:01:59 PM GMT+02:00, Richard Earnshaw 
 richard.earns...@foss.arm.com wrote:
 On 05/05/15 11:54, Richard Earnshaw wrote:
 On 05/05/15 08:32, Jakub Jelinek wrote:
 On Mon, May 04, 2015 at 05:00:11PM +0200, Jakub Jelinek wrote:
 So at least changing arm_needs_doubleword_align for non-aggregates
 would
 likely not break anything that hasn't been broken already and would
 unbreak
 the majority of cases.

 Attached (untested so far).  It indeed changes code generated for
 over-aligned va_arg, but as I believe you can't properly pass those
 in the
 ... caller, this should just fix it so that va_arg handling matches
 the
 caller (and likewise for callees for named argument passing).

 The following testcase shows that eipa_sra changes alignment even
 for the
 aggregates.  Change aligned (8) to aligned (4) to see another
 possibility.

 Actually I misread it, for the aggregates esra actually doesn't
 change
 anything, which is the reason why the testcase doesn't fail.
 The problem with the scalars is that esra first changed it to the
 over-aligned MEM_REFs and then later on eipa_sra used the types of
 the
 MEM_REFs created by esra.

 2015-05-05  Jakub Jelinek  ja...@redhat.com

PR target/65956
* config/arm/arm.c (arm_needs_doubleword_align): For non-aggregate
types check TYPE_ALIGN of TYPE_MAIN_VARIANT rather than type
 itself.

* gcc.c-torture/execute/pr65956.c: New test.

 --- gcc/config/arm/arm.c.jj2015-05-04 21:51:42.0 +0200
 +++ gcc/config/arm/arm.c   2015-05-05 09:20:52.481693337 +0200
 @@ -6063,8 +6063,13 @@ arm_init_cumulative_args (CUMULATIVE_ARG
  static bool
  arm_needs_doubleword_align (machine_mode mode, const_tree type)
  {
 -  return (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY
 -|| (type  TYPE_ALIGN (type)  PARM_BOUNDARY));
 +  if (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY)
 +return true;

 I don't think this is right (though I suspect the existing code has
 the
 same problem).  We should only look at mode if there is no type
 information.  The problem is that GCC has a nasty habit of assigning
 real machine modes to things that are really BLKmode and we've run
 into
 several cases where this has royally screwed things up.  So for
 consistency in the ARM back-end we are careful to only use mode when
 type is NULL (= it's a libcall).

 +  if (type == NULL_TREE)
 +return false;
 +  if (AGGREGATE_TYPE_P (type))
 +return TYPE_ALIGN (type)  PARM_BOUNDARY;
 +  return TYPE_ALIGN (TYPE_MAIN_VARIANT (type))  PARM_BOUNDARY;
  }
  



 It ought to be possible to re-order this, though, to

  static bool
  arm_needs_doubleword_align (machine_mode mode, const_tree type)
  {
 -  return (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY
 - || (type  TYPE_ALIGN (type)  PARM_BOUNDARY));
 +  if (type != NULL_TREE)
 +{
 +  if (AGGREGATE_TYPE_P (type))
 +return TYPE_ALIGN (type)  PARM_BOUNDARY;
 +  return TYPE_ALIGN (TYPE_MAIN_VARIANT (type))  PARM_BOUNDARY;
 +}
 +  return (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY);
  }


 Either way, this would need careful cross-testing against an existing
 compiler.


 It looks as though either patch would cause ABI incompatibility for

 typedef int alignedint __attribute__((aligned((8;

 int  __attribute__((weak)) foo (int a, alignedint b)
 {return b;}

 void bar (alignedint x)
 {
  foo (1, x);
 }

 Where currently gcc uses r2 as the argument register for b in foo.
 
 And for foo (1,2) or an int typed 2nd arg?
 

Yep, looks like that's horribly broken too ;-(

R.

 Richard.
 
 R.
 
 



Re: [PR testsuite/65205, libgomp/65993] Fix dg-shouldfail usage in OpenACC libgomp tests

2015-05-05 Thread John David Anglin

On 2015-05-05 5:43 AM, Thomas Schwinge wrote:

FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/lib-62.c
-DACC_DEVICE_TYPE_hos
t=1 -DACC_MEM_SHARED=1 output pattern test, is , should match invalid size

With this one I'll need your help: please cite from libgomp.log (or, from
a manual run) the actual output message that you're getting.

There's no output message:
# ./lib-62.exe
Segmentation fault (core dumped)

(gdb) r
Starting program: 
/test/gnu/gcc/objdir/hppa2.0w-hp-hpux11.11/libgomp/testsuite/lib-62.exe

warning: Private mapping of shared library text was not specified
by the executable; setting a breakpoint in a shared library which
is not privately mapped will not work.  See the HP-UX 11i v3 chatr
manpage for methods to privately map shared library text.

Program received signal SIGSEGV, Segmentation fault.
acc_init (d=acc_device_nvidia) at ../../../gcc/libgomp/oacc-init.c:178
178   ndevs = base_dev-get_num_devices_func ();
(gdb) disass $pc-16,$pc+16
Dump of assembler code from 0xc12731c8 to 0xc12731e8:
   0xc12731c8 acc_init+64:copy r4,r19
   0xc12731cc acc_init+68:copy r6,r26
   0xc12731d0 acc_init+72:b,l 0xc1272af0 resolve_device,rp
   0xc12731d4 acc_init+76:copy r19,r4
= 0xc12731d8 acc_init+80:ldw 1c(ret0),r22
   0xc12731dc acc_init+84:copy r4,r19
   0xc12731e0 acc_init+88:copy ret0,r3
   0xc12731e4 acc_init+92:copy r19,r4
End of assembler dump.
(gdb) p/x $ret0
$1 = 0x0

It would seem resolve_device returned 0.

Dave

--
John David Anglin  dave.ang...@bell.net



Re: [PATCH, PR target/66015]: Fix alignments with attribute_optimize for aarch64

2015-05-05 Thread Marcus Shawcroft
On 5 May 2015 at 12:07, Christian Bruel christian.br...@st.com wrote:
 This fixes PR target/66015 and a latent issue revealed by
 gcc.dg/ipa/iinline-attr.c since
 https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01609.html

 Regtested on aarch64-linux-gnu by Linaro.

 OK for trunk ?

OK.  Is this issue present in gcc-5? If so can you backport?

Thanks
/Marcus


Re: [PATCH, PR65915] Fix float conversion split.

2015-05-05 Thread Ilya Tocar
  +++ b/gcc/testsuite/gcc.target/i386/pr65915.c
  @@ -0,0 +1,6 @@
  +/* { dg-do run } */
  +/* { dg-options -O2 -mavx512f -fpic -mcmodel=medium } */
  +/* { dg-require-effective-target avx512f } */
  +/* { dg-require-effective-target lp64 } */
  +
  +#include avx512f-vrndscalepd-2.c
 
  Missing testcases for
 
  FAIL: gcc.target/i386/avx512f-vrndscaleps-2.c (test for excess errors)
  FAIL: gcc.target/i386/avx512vl-vrndscaleps-2.c (internal compiler error)
 
 The attached test is OK, since these two would test for the same problem.
 
  as well as ChangeLog entries.
 
 ChangeLog is missing. Please add PR number and describe *each* change
 accurately. You can say (vector convert to float spltiter) for this
 particular nameless splitter.
 
 Please repost the patch with updated ChangeLog.


ChangeLog

PR c/65915
* config/i386/i386.md (vector convert to float spltiter): Check for
xmm16+, when splitting scalar float conversion.
* config/i386/sse.md (sse2_cvtsi2sd): Support EVEX version.

And for tests

PR c/65915
* gcc.target/i386/pr65915.c: New.


Reposted patch below.

---
 gcc/config/i386/i386.md | 8 ++--
 gcc/config/i386/sse.md  | 6 +++---
 gcc/testsuite/gcc.target/i386/pr65915.c | 6 ++
 3 files changed, 15 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr65915.c

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 937871a..af1cd9b 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -4897,7 +4897,9 @@
   TARGET_SSE2  TARGET_SSE_MATH
 TARGET_USE_VECTOR_CONVERTS  optimize_function_for_speed_p (cfun)
 reload_completed  SSE_REG_P (operands[0])
-(MEM_P (operands[1]) || TARGET_INTER_UNIT_MOVES_TO_VEC)
+(MEM_P (operands[1]) || TARGET_INTER_UNIT_MOVES_TO_VEC)
+(!EXT_REX_SSE_REG_P (operands[0])
+   || TARGET_AVX512VL)
   [(const_int 0)]
 {
   operands[3] = simplify_gen_subreg (ssevecmodemode, operands[0],
@@ -4921,7 +4923,9 @@
   TARGET_SSE2  TARGET_SSE_MATH
 TARGET_SSE_PARTIAL_REG_DEPENDENCY
 optimize_function_for_speed_p (cfun)
-reload_completed  SSE_REG_P (operands[0])
+reload_completed  SSE_REG_P (operands[0])
+(!EXT_REX_SSE_REG_P (operands[0])
+   || TARGET_AVX512VL)
   [(const_int 0)]
 {
   const machine_mode vmode = MODEF:ssevecmodemode;
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 9b7009a..c61098d 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -4258,11 +4258,11 @@
(set_attr mode TI)])
 
 (define_insn sse2_cvtsi2sd
-  [(set (match_operand:V2DF 0 register_operand =x,x,x)
+  [(set (match_operand:V2DF 0 register_operand =x,x,v)
(vec_merge:V2DF
  (vec_duplicate:V2DF
(float:DF (match_operand:SI 2 nonimmediate_operand r,m,rm)))
- (match_operand:V2DF 1 register_operand 0,0,x)
+ (match_operand:V2DF 1 register_operand 0,0,v)
  (const_int 1)))]
   TARGET_SSE2
   @
@@ -4275,7 +4275,7 @@
(set_attr amdfam10_decode vector,double,*)
(set_attr bdver1_decode double,direct,*)
(set_attr btver2_decode double,double,double)
-   (set_attr prefix orig,orig,vex)
+   (set_attr prefix orig,orig,maybe_evex)
(set_attr mode DF)])
 
 (define_insn sse2_cvtsi2sdqround_name
diff --git a/gcc/testsuite/gcc.target/i386/pr65915.c 
b/gcc/testsuite/gcc.target/i386/pr65915.c
new file mode 100644
index 000..990c5aa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr65915.c
@@ -0,0 +1,6 @@
+/* { dg-do run } */
+/* { dg-options -O2 -mavx512f -fpic -mcmodel=medium } */
+/* { dg-require-effective-target avx512f } */
+/* { dg-require-effective-target lp64 } */
+
+#include avx512f-vrndscalepd-2.c
-- 
1.8.3.1



C++ PATCH for self-referential variable template

2015-05-05 Thread Jason Merrill
A variable template or partial specialization trying to refer to the 
template in its initializer got confused because name lookup found the 
VAR_DECL rather than the template.  Fixed by avoiding pushing the VAR_DECL.


Tested x86_64-pc-linux-gnu, applying to trunk and 5.
commit fb2e3bd25d2b8ce5cde5facf156941c7eae5d643
Author: Jason Merrill ja...@redhat.com
Date:   Mon May 4 21:09:11 2015 -0500

	* decl.c (start_decl): Don't push the plain VAR_DECL for a
	variable template.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 6ec1579..261a12d 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -4825,8 +4825,11 @@ start_decl (const cp_declarator *declarator,
 
   was_public = TREE_PUBLIC (decl);
 
-  /* Enter this declaration into the symbol table.  */
-  decl = maybe_push_decl (decl);
+  /* Enter this declaration into the symbol table.  Don't push the plain
+ VAR_DECL for a variable template.  */
+  if (!template_parm_scope_p ()
+  || TREE_CODE (decl) != VAR_DECL)
+decl = maybe_push_decl (decl);
 
   if (processing_template_decl)
 decl = push_template_decl (decl);
diff --git a/gcc/testsuite/g++.dg/cpp1y/var-templ24.C b/gcc/testsuite/g++.dg/cpp1y/var-templ24.C
new file mode 100644
index 000..d8f7cbf
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/var-templ24.C
@@ -0,0 +1,5 @@
+// { dg-do compile { target c++14 } }
+
+template class T bool Foo = Fooint;
+template  bool Fooint = true;
+int i = Foochar;


Re: [PATCH 5/8] always define HAVE_simple_return and HAVE_return

2015-05-05 Thread Trevor Saunders
On Tue, May 05, 2015 at 06:12:50PM -0700, Mike Stump wrote:
 On Apr 26, 2015, at 10:55 PM, tbsaunde+...@tbsaunde.org wrote:
  From: Trevor Saunders tbsaunde+...@tbsaunde.org
  
  gcc/ChangeLog:
  
  2015-04-27  Trevor Saunders  tbsaunde+...@tbsaunde.org
  
  * bb-reorder.c (HAVE_return): Don't check if its undefined.
  * defaults.h (gen_simple_return): New function.
  (gen_simple_return): Likewise.
  (HAVE_return): Add default definition to false.
  (HAVE_simple_return): Likewise.
  * cfgrtl.c (force_nonfallthru_and_redirect): Remove checks if
  HAVE_return and HAVE_simple_return are defined.
  * function.c (gen_return_pattern): Likewise.
  (convert_jumps_to_returns): Likewise.
  (thread_prologue_and_epilogue_insns): Likewise.
  * reorg.c (find_end_label): Likewise.
  (dbr_schedule): Likewise.
  * shrink-wrap.c: Likewise.
  * shrink-wrap.h: Likewise.
 
 I’m seeing:
 
 In file included from ./tm.h:30:0,
  from ../../gcc/gcc/c-family/c-semantics.c:24:
 ../../gcc/gcc/defaults.h: In function ‘rtx_def* gen_simple_return()’:
 ../../gcc/gcc/defaults.h:1422:1: error: redefinition of ‘rtx_def* 
 gen_simple_return()’
  gen_simple_return ()
  ^
 In file included from ./tm.h:22:0,
  from ../../gcc/gcc/c-family/c-semantics.c:24:
 ./insn-flags.h:1744:1: error: ‘rtx_def* gen_simple_return()’ previously 
 defined here
  gen_simple_return(void)
  ^
 
 in my port.
 
 I have a simple_return and a return that is “0” enabled.
 
 defaults.h has:
 
 #ifndef HAVE_simple_return
 #define HAVE_simple_return 0
 static inline rtx
 gen_simple_return ()
 {
   gcc_unreachable ();
   return NULL;
 }
 #endif
 
 and insn-flags.h has:
 
 static inline rtx gen_simple_return  
 (void);
 static inline rtx
 gen_simple_return(void)
 {
   return 0;
 }
 
 If I change the enable to “1” or “” then it compiles better.  Also, I can 
 delete the pattern or change the name of the pattern and it works ok as well.
 
 If they both did #ifndef HAVE_simple_return, and then insn-flags did #define 
 HAVE_simple_return 0, I think it might work better.  I’ve not thought about 
 it much.

hrm, I didn't realize you could sometimes get a dummy function in
insn-flags.h (especially without it defining the macro that seems kind
of broken on genflags' part).  What I think we could actually do is
change genflags to #define HAVE_x 0 if it is going to emit the static
function and some how have a list of insns that always get emited so you
don't need the define_insn with enabled = 0 in every target's md file.
The one big trick with this is that all the #ifdef HAVE_foo need to be
updated so it'll take a little time.

Trev



Re: [PATCH] PR debug/61352 back port from mainline

2015-05-05 Thread Mike Stump
On Apr 12, 2015, at 11:11 AM, Jack Howarth howarth.at@gmail.com wrote:
   The attached patch is a back port of the change from
 https://gcc.gnu.org/viewcvs/gcc?view=revisionrevision=211067 for
 gcc-4_9-branch. Bootstrap and regression tested on
 x86_64-apple-darwin14 with Xcode 6.3. Okay for gcc-4_9-branch?
   Jack
 PR61352_backport.diff

Ok.

Committed revision 222835.


C++ PATCH for DR 1518 (explicit default ctor and copy-list-initialization)

2015-05-05 Thread Jason Merrill
I went to draft standardese for DR 1518 this week and realized that it 
had already been resolved by DR 1630, which clarified that 
default-initialization can call explicit default constructors even in 
the context of copy-list-initialization.  So this patch adjusts my 
earlier approximation to reflect that.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 733826123adfc74ffb07e423fb8cd9311c8d6a86
Author: Jason Merrill ja...@redhat.com
Date:   Mon May 4 06:45:03 2015 -0500

	DR 1518
	DR 1630
	PR c++/54835
	PR c++/60417
	* call.c (convert_like_real): Check value-initialization before
	explicit.
	* typeck2.c (process_init_constructor_record): Don't set
	CONSTRUCTOR_IS_DIRECT_INIT.
	(process_init_constructor_array): Likewise.
	* init.c (build_vec_init): Likewise.

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 7bdf236..b77f69a 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -6243,19 +6243,9 @@ convert_like_real (conversion *convs, tree expr, tree fn, int argnum,
 	tree convfn = cand-fn;
 	unsigned i;
 
-	/* When converting from an init list we consider explicit
-	   constructors, but actually trying to call one is an error.  */
-	if (DECL_NONCONVERTING_P (convfn)  DECL_CONSTRUCTOR_P (convfn)
-	/* Unless this is for direct-list-initialization.  */
-	 !DIRECT_LIST_INIT_P (expr))
-	  {
-	if (!(complain  tf_error))
-	  return error_mark_node;
-	error (converting to %qT from initializer list would use 
-		   explicit constructor %qD, totype, convfn);
-	  }
-
-	/* If we're initializing from {}, it's value-initialization.  */
+	/* If we're initializing from {}, it's value-initialization.  Note
+	   that under the resolution of core 1630, value-initialization can
+	   use explicit constructors.  */
 	if (BRACE_ENCLOSED_INITIALIZER_P (expr)
 	 CONSTRUCTOR_NELTS (expr) == 0
 	 TYPE_HAS_DEFAULT_CONSTRUCTOR (totype))
@@ -6271,6 +6261,18 @@ convert_like_real (conversion *convs, tree expr, tree fn, int argnum,
 	return expr;
 	  }
 
+	/* When converting from an init list we consider explicit
+	   constructors, but actually trying to call one is an error.  */
+	if (DECL_NONCONVERTING_P (convfn)  DECL_CONSTRUCTOR_P (convfn)
+	/* Unless this is for direct-list-initialization.  */
+	 !DIRECT_LIST_INIT_P (expr))
+	  {
+	if (!(complain  tf_error))
+	  return error_mark_node;
+	error (converting to %qT from initializer list would use 
+		   explicit constructor %qD, totype, convfn);
+	  }
+
 	expr = mark_rvalue_use (expr);
 
 	/* Set user_conv_p on the argument conversions, so rvalue/base
diff --git a/gcc/cp/init.c b/gcc/cp/init.c
index a4fc9ff..c41e30c 100644
--- a/gcc/cp/init.c
+++ b/gcc/cp/init.c
@@ -3720,7 +3720,6 @@ build_vec_init (tree base, tree maxindex, tree init,
 	  if (cxx_dialect = cxx11  AGGREGATE_TYPE_P (type))
 	{
 	  init = build_constructor (init_list_type_node, NULL);
-	  CONSTRUCTOR_IS_DIRECT_INIT (init) = true;
 	}
 	  else
 	{
diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
index c0df823..6e0c777 100644
--- a/gcc/cp/typeck2.c
+++ b/gcc/cp/typeck2.c
@@ -1285,7 +1285,6 @@ process_init_constructor_array (tree type, tree init,
 	   we can't rely on the back end to do it for us, so make the
 	   initialization explicit by list-initializing from T{}.  */
 	next = build_constructor (init_list_type_node, NULL);
-	CONSTRUCTOR_IS_DIRECT_INIT (next) = true;
 	next = massage_init_elt (TREE_TYPE (type), next, complain);
 	if (initializer_zerop (next))
 	  /* The default zero-initialization is fine for us; don't
@@ -1396,9 +1395,6 @@ process_init_constructor_record (tree type, tree init,
 	 for us, so build up TARGET_EXPRs.  If the type in question is
 	 a class, just build one up; if it's an array, recurse.  */
 	  next = build_constructor (init_list_type_node, NULL);
-	  /* Call this direct-initialization pending DR 1518 resolution so
-	 that explicit default ctors don't break valid C++03 code.  */
-	  CONSTRUCTOR_IS_DIRECT_INIT (next) = true;
 	  next = massage_init_elt (TREE_TYPE (field), next, complain);
 
 	  /* Warn when some struct elements are implicitly initialized.  */
diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist-dr1518.C b/gcc/testsuite/g++.dg/cpp0x/initlist-dr1518.C
new file mode 100644
index 000..9f8ee0b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist-dr1518.C
@@ -0,0 +1,27 @@
+// DR 1518
+// { dg-do compile { target c++11 } }
+
+struct A {
+  explicit A() = default;
+};
+
+struct B : A {
+  explicit B() = default;
+};
+
+struct C {
+  explicit C();
+};
+
+struct D : A {
+  C c;
+  explicit D() = default;
+};
+
+templatetypename T void f() {
+  T t = {};
+}
+templatetypename T void g() {
+  void x(T t);
+  x({});
+}
diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist40.C b/gcc/testsuite/g++.dg/cpp0x/initlist40.C
index 6e6a11a..de2e19d 100644
--- a/gcc/testsuite/g++.dg/cpp0x/initlist40.C
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist40.C
@@ 

Three C++ PATCHes to fix testsuite regressions with cxx_dialect defaulting to cxx11

2015-05-05 Thread Jason Merrill
These patches fix several issues I found by switching the C++ dialect 
default to C++11.


The first patch fixes obj-c++.dg/encode-10.mm: When 
build_non_dependent_expr tries to get a constant value for a 
non-dependent expression, we need to know that AT_ENCODE_EXPR can't be a 
constant expression.


The second patch fixes the GDB test gdb.cp/anon-struct.cc.  In C++11 
mode we declare the default constructor before we see the typedef which 
gives the anonymous struct a name for linkage purposes, and as a result 
the DWARF output for the constructor had no name.  The patch causes us 
to ignore the constructor if we see it before it has a name, and update 
the name of the constructor when the type gets one.


The third patch fixes g++.dg/torture/Wsizeof-memaccess2.C.  We shouldn't 
emit those warnings when we're in SFINAE mode.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 3e0151efa351acdbb0a294a5d9c36b187c798d19
Author: Jason Merrill ja...@redhat.com
Date:   Tue May 5 20:55:56 2015 -0500

	Fix obj-c++.dg/encode-10.mm with cxx_dialect == cxx11.
	* constexpr.c (potential_constant_expression_1) [AT_ENCODE_EXPR]:
	Return false.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 9ebb640..f35ec1e 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -4159,6 +4159,7 @@ potential_constant_expression_1 (tree t, bool want_rval, bool strict,
 case OBJ_TYPE_REF:
 case TRANSACTION_EXPR:
 case ASM_EXPR:
+case AT_ENCODE_EXPR:
 fail:
   if (flags  tf_error)
 error (expression %qE is not a constant-expression, t);
commit 2d097581fbd5c0e3de866d879f2296d68753caff
Author: Jason Merrill ja...@redhat.com
Date:   Tue May 5 20:52:13 2015 -0500

	Fix gdb.cp/anon-struct.cc with -std=c++11.
gcc/
	* dwarf2out.c (gen_member_die): Don't emit anything for an
	anonymous class constructor.
gcc/cp/
	* decl2.c (reset_type_linkage_2): Update the DECL_NAME of a
	maybe-in-charge constructor.

diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index b2251d8..0d47847 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -2648,7 +2648,12 @@ reset_type_linkage_2 (tree type)
 	if (TREE_CODE (m) == VAR_DECL)
 	  reset_decl_linkage (m);
   for (tree m = TYPE_METHODS (type); m; m = DECL_CHAIN (m))
-	reset_decl_linkage (m);
+	{
+	  reset_decl_linkage (m);
+	  if (DECL_MAYBE_IN_CHARGE_CONSTRUCTOR_P (m))
+	/* Also update its name, for cxx_dwarf_name.  */
+	DECL_NAME (m) = TYPE_IDENTIFIER (type);
+	}
   binding_table_foreach (CLASSTYPE_NESTED_UTDS (type),
 			 bt_reset_linkage_2, NULL);
 }
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index ddca2a8..cb2656c 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -19951,6 +19951,10 @@ gen_member_die (tree type, dw_die_ref context_die)
   /* Don't include clones in the member list.  */
   if (DECL_ABSTRACT_ORIGIN (member))
 	continue;
+  /* Nor constructors for anonymous classes.  */
+  if (DECL_ARTIFICIAL (member)
+	   dwarf2_name (member, 0) == NULL)
+	continue;
 
   child = lookup_decl_die (member);
   if (child)
commit 2fe1e2823adecf5b62248f996391b787e4b62926
Author: Jason Merrill ja...@redhat.com
Date:   Tue May 5 20:55:00 2015 -0500

	Fix g++.dg/torture/Wsizeof-pointer-memaccess2.C with -std=c++11
	* semantics.c (finish_call_expr): Check complain.

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 0fc08b5f..701a8eb 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -2377,6 +2377,7 @@ finish_call_expr (tree fn, vectree, va_gc **args, bool disallow_virtual,
   if (!result)
 	{
 	  if (warn_sizeof_pointer_memaccess
+	   (complain  tf_warning)
 	   !vec_safe_is_empty (*args)
 	   !processing_template_decl)
 	{


Re: [PATCH] Expand PIC calls without PLT with -fno-plt

2015-05-05 Thread Rich Felker
On Mon, May 04, 2015 at 11:42:20AM -0600, Jeff Law wrote:
 On 05/04/2015 11:39 AM, Jakub Jelinek wrote:
 On Mon, May 04, 2015 at 11:34:05AM -0600, Jeff Law wrote:
 On 05/04/2015 10:37 AM, Alexander Monakov wrote:
 This patch introduces option -fno-plt that allows to expand calls that 
 would
 go via PLT to load the address of the function immediately at call site 
 (which
 introduces a GOT load).  Cover letter explains the motivation for this 
 patch.
 
 New option documentation for invoke.texi is missing from the patch; if 
 this is
 accepted I'll be happy to send a v2 with documentation added.
 
* calls.c (prepare_call_address): Transform PLT call to GOT lookup and
indirect call by forcing address into a pseudo with -fno-plt.
* common.opt (flag_plt): New option.
 OK once you cobble together the invoke.texi changes.
 
 Isn't what Michael/Alan suggested better?  I mean as/ld/compiler changes to
 inline the plt slot's first part, then lazy binding will work fine.
 I must have missed Alan/Michael's message.
 
 ISTM the win here is that by going through the GOT, you can CSE the
 GOT reference and possibly get some more register allocation
 freedom.  Is that still the case with Alan/Michael's approach?

There are many advantages to 'going through the GOT'. CSE'ing the
reference is just one. The biggest (IMO) is that you can avoid the bad
PLT ABI that most targets have, where making a call to a PLT slot
requires the GOT address to be pre-loaded into a fixed, call-saved
register. This precludes sibcalls and forces many functions which
otherwise would not need their own stack frames to create one for
saving the old value of the GOT register. See my blog entry on the
topic here: http://ewontfix.com/18/

Anyone who really wants lazy binding can use -fplt (which is
presumably still the default; I didn't check) but lazy binding should
largely be considered deprecated anyway since effective use of relro
protection requires -z now too, in which case you're paying all the
costs (which are considerable!) for lazy binding support even though
you won't get it.

Rich


Re: [AArch64][PR65375] Fix RTX cost for vector SET

2015-05-05 Thread Kugan


On 05/05/15 16:17, James Greenhalgh wrote:
 On Sat, Apr 25, 2015 at 12:26:16AM +0100, Kugan wrote:

 Thanks for the review. I have updated the patch based on the comments
 with some other minor changes. Bootstrapped and regression tested on
 aarch64-none-linux-gnu with no-new regressions. Is this OK for trunk?


 Thanks,
 Kugan


 gcc/ChangeLog:

 2015-04-24  Kugan Vivekanandarajah  kug...@linaro.org
  Jim Wilson  jim.wil...@linaro.org

  * config/arm/aarch-common-protos.h (struct mem_cost_table): Added
  new  fields loadv and storev.
  * config/aarch64/aarch64-cost-tables.h (thunderx_extra_costs):
  Initialize loadv and storev.
  * config/arm/aarch-cost-tables.h (generic_extra_costs): Likewise.
  (cortexa53_extra_costs): Likewise.
  (cortexa57_extra_costs): Likewise.
  (xgene1_extra_costs): Likewise.
  * config/aarch64/aarch64.c (aarch64_rtx_costs): Update vector
  rtx_costs.
 
 Hi Kugan,
 
 Just a few syle comments, regarding the placements of comments in single-line
 if statements. I know the current code does not neccesarily always follow the
 comments below, I'll write a patch cleaning that up at some point when I'm 
 back
 at my desk.
 
 Thanks,
 James
 
 @@ -5667,6 +5668,14 @@ aarch64_rtx_costs (rtx x, int code, int outer 
 ATTRIBUTE_UNUSED,
  case NEG:
op0 = XEXP (x, 0);
  
 +  if (VECTOR_MODE_P (mode))
 +{
 +  if (speed)
 +/* FNEG.  */
 +*cost += extra_cost-vect.alu;
 +  return false;
 +}
 +
if (GET_MODE_CLASS (GET_MODE (x)) == MODE_INT)
 {
if (GET_RTX_CLASS (GET_CODE (op0)) == RTX_COMPARE
 
 Personally, I find commented if statements without braces hard to
 quickly parse. Something like this is much faster for me:
 
 if (speed)
   {
 /* FNEG.  */
 *cost += extra_cost-vect.alu;
   }
 
 @@ -5844,7 +5872,10 @@ cost_minus:
  
  if (speed)
{
 -if (GET_MODE_CLASS (mode) == MODE_INT)
 +if (VECTOR_MODE_P (mode))
 +  /* Vector SUB.  */
 +  *cost += extra_cost-vect.alu;
 +else if (GET_MODE_CLASS (mode) == MODE_INT)
/* SUB(S).  */
*cost += extra_cost-alu.arith;
  else if (GET_MODE_CLASS (mode) == MODE_FLOAT)
 
 As above.
 
 @@ -5888,7 +5919,6 @@ cost_plus:
{
  if (speed)
*cost += extra_cost-alu.arith_shift;
 -
  *cost += rtx_cost (XEXP (XEXP (op0, 0), 0),
 (enum rtx_code) GET_CODE (op0),
 0, speed);
 
 Drop this whitespace change.
 
 @@ -5913,7 +5943,10 @@ cost_plus:
  
  if (speed)
{
 -if (GET_MODE_CLASS (mode) == MODE_INT)
 +if (VECTOR_MODE_P (mode))
 +  /* Vector ADD.  */
 +  *cost += extra_cost-vect.alu;
 +else if (GET_MODE_CLASS (mode) == MODE_INT)
/* ADD.  */
*cost += extra_cost-alu.arith;
  else if (GET_MODE_CLASS (mode) == MODE_FLOAT)
 
 As above.
 
 @@ -6013,10 +6061,15 @@ cost_plus:
return false;
  
  case NOT:
 -  /* MVN.  */
if (speed)
 -*cost += extra_cost-alu.logical;
 -
 +{
 +  /* Vector NOT.  */
 +  if (VECTOR_MODE_P (mode))
 +*cost += extra_cost-vect.alu;
 +  /* MVN.  */
 +  else
 +*cost += extra_cost-alu.logical;
 +}
/* The logical instruction could have the shifted register form,
   but the cost is the same if the shift is processed as a separate
   instruction, so we don't bother with it here.  */
 
 As above.
 
 @@ -6055,10 +6108,15 @@ cost_plus:
return true;
  }
  
 -  /* UXTB/UXTH.  */
if (speed)
 -*cost += extra_cost-alu.extend;
 -
 +{
 +  if (VECTOR_MODE_P (mode))
 +/* UMOV.  */
 +*cost += extra_cost-vect.alu;
 +  else
 +/* UXTB/UXTH.  */
 +*cost += extra_cost-alu.extend;
 +}
return false;
  
  ca§se SIGN_EXTEND:
 
 And again :)
 
 @@ -6087,10 +6150,16 @@ cost_plus:
  
if (CONST_INT_P (op1))
  {
 -  /* LSL (immediate), UBMF, UBFIZ and friends.  These are all
 - aliases.  */
if (speed)
 -*cost += extra_cost-alu.shift;
 +{
 +  /* Vector shift (immediate).  */
 +  if (VECTOR_MODE_P (mode))
 +*cost += extra_cost-vect.alu;
 +  /* LSL (immediate), UBMF, UBFIZ and friends.  These are all
 + aliases.  */
 +  else
 +*cost += extra_cost-alu.shift;
 +}
  
/* We can incorporate zero/sign extend for free.  */
if (GET_CODE (op0) == ZERO_EXTEND
 
 Again, the comment here makes it very difficult to spot the form of
 the if/else statement.
 
 @@ -6102,10 +6171,15 @@ cost_plus:
  }
else
  {
 -  /* LSLV.  */
if (speed)
 -*cost += extra_cost-alu.shift_reg;
 -
 +{
 +  /* Vector shift 

Re: [PATCH 5/8] always define HAVE_simple_return and HAVE_return

2015-05-05 Thread Mike Stump
On Apr 26, 2015, at 10:55 PM, tbsaunde+...@tbsaunde.org wrote:
 From: Trevor Saunders tbsaunde+...@tbsaunde.org
 
 gcc/ChangeLog:
 
 2015-04-27  Trevor Saunders  tbsaunde+...@tbsaunde.org
 
   * bb-reorder.c (HAVE_return): Don't check if its undefined.
   * defaults.h (gen_simple_return): New function.
   (gen_simple_return): Likewise.
   (HAVE_return): Add default definition to false.
   (HAVE_simple_return): Likewise.
   * cfgrtl.c (force_nonfallthru_and_redirect): Remove checks if
   HAVE_return and HAVE_simple_return are defined.
   * function.c (gen_return_pattern): Likewise.
   (convert_jumps_to_returns): Likewise.
   (thread_prologue_and_epilogue_insns): Likewise.
   * reorg.c (find_end_label): Likewise.
   (dbr_schedule): Likewise.
   * shrink-wrap.c: Likewise.
   * shrink-wrap.h: Likewise.

I’m seeing:

In file included from ./tm.h:30:0,
 from ../../gcc/gcc/c-family/c-semantics.c:24:
../../gcc/gcc/defaults.h: In function ‘rtx_def* gen_simple_return()’:
../../gcc/gcc/defaults.h:1422:1: error: redefinition of ‘rtx_def* 
gen_simple_return()’
 gen_simple_return ()
 ^
In file included from ./tm.h:22:0,
 from ../../gcc/gcc/c-family/c-semantics.c:24:
./insn-flags.h:1744:1: error: ‘rtx_def* gen_simple_return()’ previously defined 
here
 gen_simple_return(void)
 ^

in my port.

I have a simple_return and a return that is “0” enabled.

defaults.h has:

#ifndef HAVE_simple_return
#define HAVE_simple_return 0
static inline rtx
gen_simple_return ()
{
  gcc_unreachable ();
  return NULL;
}
#endif

and insn-flags.h has:

static inline rtx gen_simple_return  (void);
static inline rtx
gen_simple_return(void)
{
  return 0;
}

If I change the enable to “1” or “” then it compiles better.  Also, I can 
delete the pattern or change the name of the pattern and it works ok as well.

If they both did #ifndef HAVE_simple_return, and then insn-flags did #define 
HAVE_simple_return 0, I think it might work better.  I’ve not thought about it 
much.

[PATCH] Add SPECIAL_FLOAT_MODE to enable adding IEEE 128-bit floating point to PowerPC

2015-05-05 Thread Michael Meissner
This patch contains the machine independent changes that I will need to add
IEEE 128-bit floating point to the GCC compiler for PowerPC.

It adds a new mode defintion macro: SPECIAL_FLOAT_MODE that is similar to
FLOAT_MODE.  The difference is the conversion system will skip
SPECIAL_FLOAT_MODE types when it is looking for a larger type.

The problem is the PowerPC will have 2 128-bit floating point types, one using
the IBM double-double format (a pair of doubles to give the user more mantissa
bits, but no greater exponent range), and IEEE 128-bit floating point.  I don't
want DFmode to automatically convert to KFmode (IEEE 128-bit floating point),
but to TFmode (IBM double-double).

With these patches applied, in the places where we are setting up types, we
include the special floating point types, but for the normal
GET_MODE_WIDER_MODE usage we do not want to use the special floating point
mode.

I found some of the GET_MODE_WIDER_MODE loops needed to add a check for running
out of modes if the mode type was special.  For those loops, I added a test for
mode not being VOIDmode.


2015-05-05  Michael Meissner  meiss...@linux.vnet.ibm.com

* machmode.h (GET_MODE_WIDER_MODE_SPECIAL): New macro to get the
wider modes that are normally skipped by default.

* rtlanal.c (init_num_sign_bit_copies_in_rep): In going from
narrow to wider modes, check whether we receive VOIDmode, since
special floating point types are not listed in the normal widening
tables.  When doing initializations, go through all modes in a
class, even special modes, that are normally skipped by default
widening.
* cse.c (cse_insn): Likewise.
* expr.c (init_expr_target): Likewise.
(compress_float_constant): Likewise.
* dse.c (find_shift_sequence): Likewise.
* emit-rtl.c (init_derived_machine_modes): Likewise.
(init_emit_once): Likewise.
* combine.c (simplify_comparison): Likewise.

* machmode.def (SPECIAL_FLOAT_MODE): New type of floating point
that is special, and is not automatically part of the normal
widening rules.

* genmodes.c (struct mode_data): Add special field.
(blank_mode): Initialize special.
(complete_mode): Complex and vector types inherit the special mode
class.
(FLOAT_MODE): Add special field for floating point to sort special
nodes higher than normal nodes for the same size.  The intention
is to allow __float128 on PowerPC (KFmode) to be higher than long
double (TFmode), so that automatic widening uses the long double
type.
(FRACTIONAL_FLOAT_MODE): Likewise.
(SPECIAL_FLOAT_MODE): Likewise.
(FLOAT_MODE_INTERNAL): Likewise.
(make_float_mode): Likewise.
(emit_mode_wider): Likewise.

These patches have been part of my IEEE 128-bit floating point branch for some
time, and I have bootstrapped that branch many times.  At present, I am just
submitting these patches to add the necessary infrastructure for the patches I
will be contributing later.  I have done an x86 bootstrap on the IEEE 128-bit
floating point branch to make sure it does not cause problems for ports that do
not define SPECIAL_FLOAT_MODE types.

Are these patches ok to commit to the trunk?

I would also be open to alternative ways of creating a mode that is not part of
the normal wider conversion methods or in adding options to control the order
the widening types are searched.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/cse.c
===
--- gcc/cse.c   (.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/cse.c)
(revision 222823)
+++ gcc/cse.c   (.../gcc/cse.c) (working copy)
@@ -4856,7 +4856,7 @@ cse_insn (rtx_insn *insn)
  rtx new_and = gen_rtx_AND (VOIDmode, NULL_RTX, XEXP (src, 1));
 
  for (tmode = GET_MODE_WIDER_MODE (mode);
-  GET_MODE_SIZE (tmode) = UNITS_PER_WORD;
+  GET_MODE_SIZE (tmode) = UNITS_PER_WORD  tmode != VOIDmode;
   tmode = GET_MODE_WIDER_MODE (tmode))
{
  rtx inner = gen_lowpart (tmode, XEXP (src, 0));
@@ -4908,7 +4908,7 @@ cse_insn (rtx_insn *insn)
  XEXP (memory_extend_rtx, 0) = src;
 
  for (tmode = GET_MODE_WIDER_MODE (mode);
-  GET_MODE_SIZE (tmode) = UNITS_PER_WORD;
+  GET_MODE_SIZE (tmode) = UNITS_PER_WORD  tmode != VOIDmode;
   tmode = GET_MODE_WIDER_MODE (tmode))
{
  struct table_elt *larger_elt;
Index: gcc/expr.c
===
--- gcc/expr.c  (.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/expr.c)   
(revision 222823)
+++ gcc/expr.c  (.../gcc/expr.c)(working copy)
@@ -261,11 +261,11 @@ 

C++ PATCH to add -Wterminate

2015-05-05 Thread Jason Merrill
C++11 changed destructors to be noexcept by default; as a result, code 
with throwing destructors that works in C++98 immediately calls 
terminate in C++11.  When testing changing the default for cxx_dialect 
to C++11 I noticed that this was causing one of the tests to fail, and 
decided to add a warning for a throw-expression in a noexcept function 
without an intervening try block.  This patch also adds the related 
warning to -Wc++11-compat.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 1ec3d94aa682212752ff3e1cef13ac118a71aa78
Author: Jason Merrill ja...@redhat.com
Date:   Tue May 5 14:01:08 2015 -0500

gcc/c-family/
	* c.opt (Wterminate): New.
gcc/cp/
	* cp-gimplify.c (cp_genericize_r): Track TRY_BLOCK and
	MUST_NOT_THROW_EXPR, warn about a THROW_EXPR directly within a
	MUST_NOT_THROW_EXPR.
	(cp_genericize_data): Add try_block field.
	(cp_genericize_tree): Initialize it.
	* except.c (expand_end_catch_block): Set TREE_NO_WARNING on
	implicit rethrow.

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 983f4a8..8ef0cea 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -829,6 +829,10 @@ Wsystem-headers
 C ObjC C++ ObjC++ Warning
 ; Documented in common.opt
 
+Wterminate
+C++ ObjC++ Warning Var(warn_terminate) Init(1)
+Warn if a throw expression will always result in a call to terminate()
+
 Wtraditional
 C ObjC CPP(cpp_warn_traditional) CppReason(CPP_W_TRADITIONAL) Var(warn_traditional) Init(0) Warning
 Warn about features not present in traditional C
diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index 70645b5..35749ef 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -905,6 +905,7 @@ struct cp_genericize_data
   hash_settree *p_set;
   vectree bind_expr_stack;
   struct cp_genericize_omp_taskreg *omp_ctx;
+  tree try_block;
 };
 
 /* Perform any pre-gimplification lowering of C++ front end trees to
@@ -1193,6 +1194,54 @@ cp_genericize_r (tree *stmt_p, int *walk_subtrees, void *data)
   wtd-omp_ctx = omp_ctx.outer;
   splay_tree_delete (omp_ctx.variables);
 }
+  else if (TREE_CODE (stmt) == TRY_BLOCK)
+{
+  *walk_subtrees = 0;
+  tree try_block = wtd-try_block;
+  wtd-try_block = stmt;
+  cp_walk_tree (TRY_STMTS (stmt), cp_genericize_r, data, NULL);
+  wtd-try_block = try_block;
+  cp_walk_tree (TRY_HANDLERS (stmt), cp_genericize_r, data, NULL);
+}
+  else if (TREE_CODE (stmt) == MUST_NOT_THROW_EXPR)
+{
+  /* MUST_NOT_THROW_COND might be something else with TM.  */
+  if (MUST_NOT_THROW_COND (stmt) == NULL_TREE)
+	{
+	  *walk_subtrees = 0;
+	  tree try_block = wtd-try_block;
+	  wtd-try_block = stmt;
+	  cp_walk_tree (TREE_OPERAND (stmt, 0), cp_genericize_r, data, NULL);
+	  wtd-try_block = try_block;
+	}
+}
+  else if (TREE_CODE (stmt) == THROW_EXPR)
+{
+  location_t loc = location_of (stmt);
+  if (TREE_NO_WARNING (stmt))
+	/* Never mind.  */;
+  else if (wtd-try_block)
+	{
+	  if (TREE_CODE (wtd-try_block) == MUST_NOT_THROW_EXPR
+	   warning_at (loc, OPT_Wterminate,
+			 throw will always call terminate())
+	   cxx_dialect = cxx11
+	   DECL_DESTRUCTOR_P (current_function_decl))
+	inform (loc, in C++11 destructors default to noexcept);
+	}
+  else
+	{
+	  if (warn_cxx0x_compat  cxx_dialect  cxx11
+	   DECL_DESTRUCTOR_P (current_function_decl)
+	   (TYPE_RAISES_EXCEPTIONS (TREE_TYPE (current_function_decl))
+		  == NULL_TREE)
+	   (get_defaulted_eh_spec (current_function_decl)
+		  == empty_except_spec))
+	warning_at (loc, OPT_Wc__0x_compat,
+			in C++11 this throw will terminate because 
+			destructors default to noexcept);
+	}
+}
   else if (TREE_CODE (stmt) == CONVERT_EXPR)
 gcc_assert (!CONVERT_EXPR_VBASE_PATH (stmt));
   else if (TREE_CODE (stmt) == FOR_STMT)
@@ -1269,6 +1318,7 @@ cp_genericize_tree (tree* t_p)
   wtd.p_set = new hash_settree;
   wtd.bind_expr_stack.create (0);
   wtd.omp_ctx = NULL;
+  wtd.try_block = NULL_TREE;
   cp_walk_tree (t_p, cp_genericize_r, wtd, NULL);
   delete wtd.p_set;
   wtd.bind_expr_stack.release ();
diff --git a/gcc/cp/except.c b/gcc/cp/except.c
index 3ff1ce6..614f2e9 100644
--- a/gcc/cp/except.c
+++ b/gcc/cp/except.c
@@ -579,7 +579,11 @@ expand_end_catch_block (void)
   if (in_function_try_handler
(DECL_CONSTRUCTOR_P (current_function_decl)
 	  || DECL_DESTRUCTOR_P (current_function_decl)))
-finish_expr_stmt (build_throw (NULL_TREE));
+{
+  tree rethrow = build_throw (NULL_TREE);
+  TREE_NO_WARNING (rethrow) = true;
+  finish_expr_stmt (rethrow);
+}
 }
 
 tree
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index debd8ed..9c8aa99 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -2882,6 +2882,12 @@ Warn when overload resolution chooses a promotion from unsigned or
 enumerated type to a signed type, over a conversion to an unsigned type of
 the same size.  Previous versions of G++ tried to preserve
 

Re: [PATCH] Fix eipa_sra AAPCS issue (PR target/65956)

2015-05-05 Thread Jakub Jelinek
On Tue, May 05, 2015 at 09:32:28AM +0200, Jakub Jelinek wrote:
 2015-05-05  Jakub Jelinek  ja...@redhat.com
 
   PR target/65956
   * config/arm/arm.c (arm_needs_doubleword_align): For non-aggregate
   types check TYPE_ALIGN of TYPE_MAIN_VARIANT rather than type itself.
 
   * gcc.c-torture/execute/pr65956.c: New test.

Note, the patch fixed the profiledbootstrap that has been broken in 5+.

Jakub


Re: [PATCH] Fix eipa_sra AAPCS issue (PR target/65956)

2015-05-05 Thread Richard Earnshaw
On 05/05/15 15:29, Jakub Jelinek wrote:
 On Tue, May 05, 2015 at 02:20:43PM +0100, Richard Earnshaw wrote:
 On 05/05/15 14:06, Jakub Jelinek wrote:
 On Tue, May 05, 2015 at 02:02:19PM +0100, Richard Earnshaw wrote:
 In a literal sense, yes.  However, even KR  stdarg have standard
 promotion and conversion rules (size  int = int, floats promoted to
 double, etc).  What are those rules for GCC's overaligned types (ie
 where in the docs does it say what happens and how a back-end should
 interpret them)?  Shouldn't the mid-end be doing that work so as to

 For the middle-end, the TYPE_ALIGN info on expression types is considered
 useless, you can get there anything.  There is no conversion rule to what
 you get for myalignedint + int, or (myalignedint) int, or (int)
 myalignedint, etc.

 create a consistent view of the values passed into the back-end?  It
 seems to me that at present the back-end has to be psychic as to what is
 really happening.

 No, the backend just shouldn't consider TYPE_ALIGN on the scalars, and it
 seems only arm ever looks at that.


 Nothing in the specification for TARGET_FUNCTION_ARG (or any of the
 related functions) makes any mention of this...
 
 While this requirement isn't documented, it is clearly common sense or at
 least something any kind of testing would reveal immediately.

Then clearly no such tests exist in the testsuite :-(

R.

 And it is nothing broken recently (except that with the SRA changes it hits
 much more often), looking e.g. at GCC 3.2, I'm seeing that expand_call is on
 that testcase also called with pretty random TYPE_ALIGN on the argument
 types; we didn't have GIMPLE then, so it is nothing that GIMPLE brought in.
 
   Jakub
 



Re: [PATCH] Fix eipa_sra AAPCS issue (PR target/65956)

2015-05-05 Thread Richard Earnshaw
On 05/05/15 15:06, Richard Biener wrote:
 On May 5, 2015 2:49:55 PM GMT+02:00, Richard Earnshaw 
 richard.earns...@foss.arm.com wrote:
 On 05/05/15 13:46, Richard Earnshaw wrote:
 On 05/05/15 13:37, Richard Biener wrote:
 On May 5, 2015 1:01:59 PM GMT+02:00, Richard Earnshaw
 richard.earns...@foss.arm.com wrote:
 On 05/05/15 11:54, Richard Earnshaw wrote:
 On 05/05/15 08:32, Jakub Jelinek wrote:
 On Mon, May 04, 2015 at 05:00:11PM +0200, Jakub Jelinek wrote:
 So at least changing arm_needs_doubleword_align for
 non-aggregates
 would
 likely not break anything that hasn't been broken already and
 would
 unbreak
 the majority of cases.

 Attached (untested so far).  It indeed changes code generated for
 over-aligned va_arg, but as I believe you can't properly pass
 those
 in the
 ... caller, this should just fix it so that va_arg handling
 matches
 the
 caller (and likewise for callees for named argument passing).

 The following testcase shows that eipa_sra changes alignment
 even
 for the
 aggregates.  Change aligned (8) to aligned (4) to see another
 possibility.

 Actually I misread it, for the aggregates esra actually doesn't
 change
 anything, which is the reason why the testcase doesn't fail.
 The problem with the scalars is that esra first changed it to the
 over-aligned MEM_REFs and then later on eipa_sra used the types
 of
 the
 MEM_REFs created by esra.

 2015-05-05  Jakub Jelinek  ja...@redhat.com

 PR target/65956
 * config/arm/arm.c (arm_needs_doubleword_align): For
 non-aggregate
 types check TYPE_ALIGN of TYPE_MAIN_VARIANT rather than type
 itself.

 * gcc.c-torture/execute/pr65956.c: New test.

 --- gcc/config/arm/arm.c.jj 2015-05-04 21:51:42.0 +0200
 +++ gcc/config/arm/arm.c2015-05-05 09:20:52.481693337 +0200
 @@ -6063,8 +6063,13 @@ arm_init_cumulative_args (CUMULATIVE_ARG
  static bool
  arm_needs_doubleword_align (machine_mode mode, const_tree type)
  {
 -  return (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY
 - || (type  TYPE_ALIGN (type)  PARM_BOUNDARY));
 +  if (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY)
 +return true;

 I don't think this is right (though I suspect the existing code
 has
 the
 same problem).  We should only look at mode if there is no type
 information.  The problem is that GCC has a nasty habit of
 assigning
 real machine modes to things that are really BLKmode and we've run
 into
 several cases where this has royally screwed things up.  So for
 consistency in the ARM back-end we are careful to only use mode
 when
 type is NULL (= it's a libcall).

 +  if (type == NULL_TREE)
 +return false;
 +  if (AGGREGATE_TYPE_P (type))
 +return TYPE_ALIGN (type)  PARM_BOUNDARY;
 +  return TYPE_ALIGN (TYPE_MAIN_VARIANT (type))  PARM_BOUNDARY;
  }
  



 It ought to be possible to re-order this, though, to

  static bool
  arm_needs_doubleword_align (machine_mode mode, const_tree type)
  {
 -  return (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY
 -  || (type  TYPE_ALIGN (type)  PARM_BOUNDARY));
 +  if (type != NULL_TREE)
 +{
 +  if (AGGREGATE_TYPE_P (type))
 +return TYPE_ALIGN (type)  PARM_BOUNDARY;
 +  return TYPE_ALIGN (TYPE_MAIN_VARIANT (type)) 
 PARM_BOUNDARY;
 +}
 +  return (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY);
  }


 Either way, this would need careful cross-testing against an
 existing
 compiler.


 It looks as though either patch would cause ABI incompatibility for

 typedef int alignedint __attribute__((aligned((8;

 int  __attribute__((weak)) foo (int a, alignedint b)
 {return b;}

 void bar (alignedint x)
 {
  foo (1, x);
 }

 Where currently gcc uses r2 as the argument register for b in foo.

 And for foo (1,2) or an int typed 2nd arg?

 Richard.

 R.




 As is

 int foo2 (int a, int b);

 void bar2 (alignedint x)
 {
   foo2 (1, x);
 }


 The real question here is why is TYPE the type of the value, rather
 than
 the type of the formal as expressed by the prototype (or implicit
 prototype in the case of variadics or KR)?  Surely this is the mid-end
 passing the wrong information to the back-end.
 
 No - the issue is you are looking at the type of the value at all, instead of 
 at the type of the formal.
 

And I would argue that passing the type of the acutal is stupid in all
cases except when the type of the formal does not exist.  It's just
asking for problems like these.

R.

 Richard.
 
 R.

 ...

 R

 
 



RFA: RL78: Save the frame pointer if it is used.

2015-05-05 Thread Nick Clifton
Hi DJ,

  Below is a small patch to fix a small problem with the need_to_save()
  function in the RL78 backend.  It was only marking the frame pointer
  register as needing a save if the frame pointer was in use.  But since
  the frame pointer is call-saved it needs to be saved any time it is
  used, even if it is not being used as a frame pointer.  This problem
  has not arisen before because the frame pointer is also marked as
  being fixed.  But it turns out that it can still be used inside some
  inline assembler, and so the prologue and epilogue code still need to
  save and restore it.

  Tested with no regressions on an rl78-elf toolchain.

  OK to apply ?

Cheers
  Nick

gcc/ChangeLog
2015-05-05  Nick Clifton  ni...@redhat.com

* config/rl78/rl78.c (need_to_save): Save the frame pointer any
time that it is used.

Index: gcc/config/rl78/rl78.c
===
--- gcc/config/rl78/rl78.c  (revision 222796)
+++ gcc/config/rl78/rl78.c  (working copy)
@@ -687,7 +687,8 @@
   return df_regs_ever_live_p (regno);
 }
 
-  if (regno == FRAME_POINTER_REGNUM  frame_pointer_needed)
+  if (regno == FRAME_POINTER_REGNUM
+   (frame_pointer_needed || df_regs_ever_live_p (regno)))
 return true;
   if (fixed_regs[regno])
 return false;


PING: [PATCH] PR target/65846: Optimize data access in PIE with copy reloc

2015-05-05 Thread H.J. Lu
On Wed, Apr 22, 2015 at 9:34 AM, H.J. Lu hongjiu...@intel.com wrote:
 Normally, with PIE, GCC accesses globals that are extern to the module
 using GOT.  This is two instructions, one to get the address of the global
 from GOT and the other to get the value.  Examples:

 ---
 extern int a_glob;
 int
 main ()
 {
   return a_glob;
 }
 ---

 With PIE, the generated code accesses global via GOT using two memory
 loads:

 movqa_glob@GOTPCREL(%rip), %rax
 movl(%rax), %eax

 for 64-bit or

 movla_glob@GOT(%ecx), %eax
 movl(%eax), %eax

 for 32-bit.

 Some experiments on google and SPEC CPU benchmarks show that the extra
 instruction affects performance by 1% to 5%.

 Solution - Copy Relocations:

 When the linker supports copy relocations, GCC can always assume that
 the global will be defined in the executable.  For globals that are
 truly extern (come from shared objects), the linker will create copy
 relocations and have them defined in the executable.  Result is that
 no global access needs to go through GOT and hence improves performance.
 We can generate

 movla_glob(%rip), %eax

 for 64-bit and

 movla_glob@GOTOFF(%eax), %eax

 for 32-bit.  This optimization only applies to undefined non-weak
 non-TLS global data.  Undefined weak global or TLS data access still
 must go through GOT.

 This patch reverts legitimate_pic_address_disp_p change made in revision
 218397, which only applies to x86-64.  Instead, this patch updates
 targetm.binds_local_p to indicate if undefined non-weak non-TLS global
 data is defined locally in PIE.  It also introduces a new target hook,
 binds_tls_local_p to distinguish TLS variable from non-TLS variable.  By
 default, binds_tls_local_p is the same as binds_local_p.

 This patch checks if 32-bit and 64-bit linkers support PIE with copy
 reloc at configure time.  64-bit linker is enabled in binutils 2.25
 and 32-bit linker is enabled in binutils 2.26.  This optimization
 is enabled only if the linker support is available.

 Tested on Linux/x86-64 with -m32 and -m64, using linkers with and without
 support for copy relocation in PIE.  OK for trunk?

 Thanks.

 H.J.
 ---
 gcc/

 PR target/65846
 * configure.ac (HAVE_LD_PIE_COPYRELOC): Renamed to ...
 (HAVE_LD_64BIT_PIE_COPYRELOC): This.
 (HAVE_LD_32BIT_PIE_COPYRELOC): New.   Defined to 1 if Linux/ia32
 linker supports PIE with copy reloc.
 * output.h (default_binds_tls_local_p): New.
 (default_binds_local_p_3): Add 2 bool arguments.
 * target.def (binds_tls_local_p): New target hook.
 * varasm.c (decl_default_tls_model): Replace targetm.binds_local_p
 with targetm.binds_tls_local_p.
 (default_binds_local_p_3): Add a bool argument to indicate TLS
 variable and a bool argument to indicate if an undefined non-TLS
 non-weak data is local.  Double check TLS variable.  If an
 undefined non-TLS non-weak data is local, treat it as defined
 locally.
 (default_binds_local_p): Pass false and false to
 default_binds_local_p_3.
 (default_binds_local_p_2): Likewise.
 (default_binds_local_p_1): Likewise.
 (default_binds_tls_local_p): New.
 * config.in: Regenerated.
 * configure: Likewise.
 * doc/tm.texi: Likewise.
 * config/i386/i386.c (legitimate_pic_address_disp_p): Don't
 check HAVE_LD_PIE_COPYRELOC here.
 (ix86_binds_local): New.
 (ix86_binds_tls_local_p): Likewise.
 (ix86_binds_local_p): Use it.
 (TARGET_BINDS_TLS_LOCAL_P): New.
 * doc/tm.texi.in (TARGET_BINDS_TLS_LOCAL_P): New hook.

 gcc/testsuite/

 PR target/65846
 * gcc.target/i386/pie-copyrelocs-1.c: Updated for ia32.
 * gcc.target/i386/pie-copyrelocs-2.c: Likewise.
 * gcc.target/i386/pie-copyrelocs-3.c: Likewise.
 * gcc.target/i386/pie-copyrelocs-4.c: Likewise.
 * gcc.target/i386/pr32219-9.c: Likewise.
 * gcc.target/i386/pr32219-10.c: New file.

 * lib/target-supports.exp (check_effective_target_pie_copyreloc):
 Check HAVE_LD_64BIT_PIE_COPYRELOC and HAVE_LD_32BIT_PIE_COPYRELOC
 instead of HAVE_LD_64BIT_PIE_COPYRELOC.

Richard, Jeff,

Can you review this patch:

https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01331.html

Thanks.



-- 
H.J.


Re: [PATCH] Fix eipa_sra AAPCS issue (PR target/65956)

2015-05-05 Thread Richard Earnshaw
On 05/05/15 15:33, Richard Earnshaw wrote:
 On 05/05/15 15:29, Jakub Jelinek wrote:
 On Tue, May 05, 2015 at 02:20:43PM +0100, Richard Earnshaw wrote:
 On 05/05/15 14:06, Jakub Jelinek wrote:
 On Tue, May 05, 2015 at 02:02:19PM +0100, Richard Earnshaw wrote:
 In a literal sense, yes.  However, even KR  stdarg have standard
 promotion and conversion rules (size  int = int, floats promoted to
 double, etc).  What are those rules for GCC's overaligned types (ie
 where in the docs does it say what happens and how a back-end should
 interpret them)?  Shouldn't the mid-end be doing that work so as to

 For the middle-end, the TYPE_ALIGN info on expression types is considered
 useless, you can get there anything.  There is no conversion rule to what
 you get for myalignedint + int, or (myalignedint) int, or (int)
 myalignedint, etc.

 create a consistent view of the values passed into the back-end?  It
 seems to me that at present the back-end has to be psychic as to what is
 really happening.

 No, the backend just shouldn't consider TYPE_ALIGN on the scalars, and it
 seems only arm ever looks at that.


 Nothing in the specification for TARGET_FUNCTION_ARG (or any of the
 related functions) makes any mention of this...

 While this requirement isn't documented, it is clearly common sense or at
 least something any kind of testing would reveal immediately.
 
 Then clearly no such tests exist in the testsuite :-(
 

Or, more precisely, no such tests existed in 2008, when the code went in.

R.

 R.
 
 And it is nothing broken recently (except that with the SRA changes it hits
 much more often), looking e.g. at GCC 3.2, I'm seeing that expand_call is on
 that testcase also called with pretty random TYPE_ALIGN on the argument
 types; we didn't have GIMPLE then, so it is nothing that GIMPLE brought in.

  Jakub

 



Re: Next set of OpenACC changes: C family

2015-05-05 Thread Cesar Philippidis
On 05/05/2015 07:18 AM, Jakub Jelinek wrote:
 On Tue, May 05, 2015 at 10:57:28AM +0200, Thomas Schwinge wrote:
 --- gcc/c-family/c-common.c
 +++ gcc/c-family/c-common.c
 @@ -809,7 +809,7 @@ const struct attribute_spec c_common_attribute_table[] =
handle_omp_declare_simd_attribute, false },
{ cilk simd function, 0, -1, true,  false, false,
handle_omp_declare_simd_attribute, false },
 -  { omp declare target, 0, 0, true, false, false,
 +  { omp declare target, 0, -1, true, false, false,
handle_omp_declare_target_attribute, false },
{ alloc_align,1, 1, false, true, true,
handle_alloc_align_attribute, false },
 
 Can you explain this change?  omp declare target doesn't take any
 arguments, so 0, 0, looks right to me.

Because we are using that attribute for oacc routines, and routines may
have contain clauses.

Thinking about this some more, we could probably revert this change. I
have another patch to disable exception handling inside openacc
accelerated regions because the nvptx target doesn't support them. In
that patch I introduced a new oacc function attribute. Maybe we should
attach the acc routine clauses on that oacc function attribute.

 @@ -823,6 +823,7 @@ const struct attribute_spec c_common_attribute_table[] =
handle_bnd_legacy, false },
{ bnd_instrument, 0, 0, true, false, false,
handle_bnd_instrument, false },
 +  { oacc declare,   0, -1, true,  false, false, NULL, false },
{ NULL, 0, 0, false, false, false, NULL, false }
 
 If oacc declare is different, then supposedly you shouldn't reuse
 omp declare target attribute for the OpenACC thingie.

I'm not sure about this one. Oacc has enough quirks where it may be
justifiable though. I'll find out who wrote this patch.

 --- gcc/c-family/c-omp.c
 +++ gcc/c-family/c-omp.c
 @@ -1087,3 +1087,108 @@ c_omp_predetermined_sharing (tree decl)
  
return OMP_CLAUSE_DEFAULT_UNSPECIFIED;
  }
 +
 +/* Return a numerical code representing the device_type.  Currently,
 +   only device_type(nvidia) is supported.  All device_type parameters
 +   are treated as case-insensitive keywords.  */
 +
 +int
 +oacc_extract_device_id (const char *device)
 +{
 +  if (!strcasecmp (device, nvidia))
 +return GOMP_DEVICE_NVIDIA_PTX;
 +  return GOMP_DEVICE_NONE;
 +}
 
 Why do you support just one particular device_type?  That sounds broken.
 You should just have some table with names - GOMP_DEVICE_* mappings.

I kind of wanted to keep this patch local in gomp-4_0-branch until it
was a little more functional. Adding proper support for device_type is
going to be more involved. For instance, the the tile clause changes the
shape of a loop, so if you have

  #pragma acc loop tile (2, 4) device_type (nvidia) tile (5, 5) \
 device_type (something_else) tile (1, 4)

we're going to have to generate three different versions of that
parallel region. Then we'd have to teach the compiler to the offload
regions with the proper number of gangs, workers and vectors, etc.

For our initial implementation, we just decided to support device_type
(nvidia), since openacc is really only working on nvptx and host
devices. And the runtime is rigged up to ignore num_gangs, num_workers
and vector_length for the host anyway. So that's why I filtered out the
device_type clauses in the front end.

Also, for full disclosure, we're parsing the tile clause, but we're not
actually tiling the loops yet. We're still in the process of getting the
oacc execution model working on the nvptx target. Things which are
easy to do in cpu threads (barriers and synchronization, global
memory, etc.) are not as straightforward on gpus, unfortunately.

 +  if (code  (1  GOMP_DEVICE_NVIDIA_PTX))
 +{
 +  if (seen_nvidia)
 +{
 +  seen_nvidia = NULL_TREE;
 +  error_at (OMP_CLAUSE_LOCATION (c),
 +duplicate device_type (nvidia));
 +  goto filter_error;
 +}
 +  else
 +seen_nvidia = OMP_CLAUSE_DEVICE_TYPE_CLAUSES (c);
 
 Again, I must say I don't like the hardcoding of one particular
 device type here.
 Doesn't Intel want to support OpenACC for XeonPhi?  What about HSA
 eventually, etc.?
 
 @@ -4624,7 +4657,7 @@ c_parser_compound_statement_nostart (c_parser *parser)
last_label = false;
mark_valid_location_for_stdc_pragma (false);
c_parser_declaration_or_fndef (parser, true, true, true, true,
 - true, NULL, vNULL);
 + true, NULL, vNULL, NULL_TREE, false);
 
 Wouldn't default arguments be in order here?  Though, even those will mean
 compile time cost of passing all the zeros almost all the time.

I'll check who did this.

 -/* OpenMP 2.5:
 +/* OpenACC:
 +   num_gangs ( 

Re: [Bug target/66015] New: align directives not propagated after __attribute__ ((__optimize__ (O2)))

2015-05-05 Thread Jim Wilson
On Tue, 2015-05-05 at 14:19 +0200, Christian Bruel wrote:
 I don't the environment to run the testsuite for ia64. would you mind
 giving it a try and verify that it fixes the issue ?

I don't have ia64 hardware anymore.

The GCC compile farm advertises ia64 hardware
https://gcc.gnu.org/wiki/CompileFarm
though the web site suggests that all ia64 machines are offline, so I
don't know if that is going to help.

Jim




Re: [PATCH] Fix eipa_sra AAPCS issue (PR target/65956)

2015-05-05 Thread Jakub Jelinek
On Tue, May 05, 2015 at 02:20:43PM +0100, Richard Earnshaw wrote:
 On 05/05/15 14:06, Jakub Jelinek wrote:
  On Tue, May 05, 2015 at 02:02:19PM +0100, Richard Earnshaw wrote:
  In a literal sense, yes.  However, even KR  stdarg have standard
  promotion and conversion rules (size  int = int, floats promoted to
  double, etc).  What are those rules for GCC's overaligned types (ie
  where in the docs does it say what happens and how a back-end should
  interpret them)?  Shouldn't the mid-end be doing that work so as to
  
  For the middle-end, the TYPE_ALIGN info on expression types is considered
  useless, you can get there anything.  There is no conversion rule to what
  you get for myalignedint + int, or (myalignedint) int, or (int)
  myalignedint, etc.
  
  create a consistent view of the values passed into the back-end?  It
  seems to me that at present the back-end has to be psychic as to what is
  really happening.
  
  No, the backend just shouldn't consider TYPE_ALIGN on the scalars, and it
  seems only arm ever looks at that.
  
 
 Nothing in the specification for TARGET_FUNCTION_ARG (or any of the
 related functions) makes any mention of this...

While this requirement isn't documented, it is clearly common sense or at
least something any kind of testing would reveal immediately.
And it is nothing broken recently (except that with the SRA changes it hits
much more often), looking e.g. at GCC 3.2, I'm seeing that expand_call is on
that testcase also called with pretty random TYPE_ALIGN on the argument
types; we didn't have GIMPLE then, so it is nothing that GIMPLE brought in.

Jakub


Commit: MSP430: Add -mcode-region= and -mdata-region= options

2015-05-05 Thread Nick Clifton
Hi Guys,

  I am applying the attached patch to add two new command line options
  to the MSP430 backend.  The -mcode-region= and -mdata-region= options
  allow the user to specify whether functions and data should be placed
  into low memory (below 64K) or high memory.  This only applies to the
  MSP430X, and only when the specific MSP430 part actually has both low
  and high memory regions.

  An interesting ability of this new feature is to select a memory
  region of either, which tells the linker to place the object into
  low memory if there is room available and into high memory otherwise.

  Three new function and data attributes have also been added - lower,
  either and upper - which allow the programmer to explicitly state
  which memory region an object should use.
  
  Tested with no regressions and 77 less unexpected failures on an
  msp430-elf toolchain.

Cheers
  Nick

gcc/ChangeLog
2015-05-05  Nick Clifton  ni...@redhat.com

* config/msp430/msp430-opts.h (enum msp430_regions): New.
* config/msp430/msp430.c (msp430_override_options): Complain if
-mcode-region or -mdata-region is used on a non MSP430X.
(msp430_section_attr): New function.  Checks lower, upper and
either attributes.
(msp430_attribute_table): Add lower, upper and either.
(gen_prefix): New function.  Generates a prefix for a section
name.
(msp430_select_section): New function - handles the choice of
section for an object.  Takes into account memory region
attributes and options.
(msp430_function_section): Use gen_prefix.
(TARGET_SECTION_TYPE_FLAGS): Define.
(msp430_section_type_flags): New function.
(TARGET_ASM_UNIQUE_SECTION): Define.
(msp430_unique_section): New function.
(msp430_output_aligned_decl_common): New function.
(msp430_do_not_relax_short_jumps): New function.
* config/msp430/msp430.h (USE_SELECT_SECTION_FOR_FUNCTIONS):
Define.
(ASM_OUTPUT_ALIGNED_DECL_COMMON): Define.
* config/msp430/msp430-protos.h
(msp430_do_not_relax_short_jumps): New prototype.
(msp430_output_aligned_decl_common): New prototype.
* config/msp430/msp430.md (length): New attribute.
(cbranchhi4_real): If msp430_do_not_relax_short_jumps is true
then use a long code sequence for short jumps.
* config/msp430/msp430.opt (mcode-region): New.
(mdata-region): New.
* doc/invoke.texi: Document new options.
* doc/extend.texi: Document new attributes.

Index: gcc/config/msp430/msp430-opts.h
===
--- gcc/config/msp430/msp430-opts.h	(revision 222796)
+++ gcc/config/msp430/msp430-opts.h	(working copy)
@@ -29,4 +29,12 @@
   F5SERIES
 };
 
+enum msp430_regions
+{
+  ANY,
+  EITHER,
+  LOWER,
+  UPPER
+};
+
 #endif
Index: gcc/config/msp430/msp430-protos.h
===
--- gcc/config/msp430/msp430-protos.h	(revision 222796)
+++ gcc/config/msp430/msp430-protos.h	(working copy)
@@ -21,6 +21,7 @@
 #ifndef GCC_MSP430_PROTOS_H
 #define GCC_MSP430_PROTOS_H
 
+bool	msp430_do_not_relax_short_jumps (void);
 rtx	msp430_eh_return_stackadj_rtx (void);
 void	msp430_expand_eh_return (rtx);
 void	msp430_expand_epilogue (int);
@@ -40,6 +41,7 @@
 const char * msp430x_logical_shift_right (rtx);
 const char * msp430_mcu_name (void);
 bool	msp430_modes_tieable_p (machine_mode, machine_mode);
+voidmsp430_output_aligned_decl_common (FILE *, const tree, const char *, unsigned HOST_WIDE_INT, unsigned);
 void	msp430_output_labelref (FILE *, const char *);
 void	msp430_register_pragmas (void);
 rtx	msp430_return_addr_rtx (int);
Index: gcc/config/msp430/msp430.c
===
--- gcc/config/msp430/msp430.c	(revision 222796)
+++ gcc/config/msp430/msp430.c	(working copy)
@@ -244,6 +244,11 @@
   if (TARGET_LARGE  !msp430x)
 error (-mlarge requires a 430X-compatible -mmcu=);
 
+  if (msp430_code_region == UPPER  ! msp430x)
+error (-mcode-region=upper requires 430X-compatible cpu);
+  if (msp430_data_region == UPPER  ! msp430x)
+error (-mdata-region=upper requires 430X-compatible cpu);
+
   if (flag_exceptions || flag_non_call_exceptions
   || flag_unwind_tables || flag_asynchronous_unwind_tables)
 flag_omit_frame_pointer = false;
@@ -1146,46 +1151,234 @@
 			  + cfun-machine-framesize_outgoing);
 }
 
+/* Attribute Handling.  */
+
+const char * const  ATTR_INTR   = interrupt;
+const char * const  ATTR_WAKEUP = wakeup;
+const char * const  ATTR_NAKED  = naked;
+const char * const  ATTR_REENT  = reentrant;
+const char * const  ATTR_CRIT   = critical;
+const char * const  ATTR_LOWER  = lower;
+const char * const  ATTR_UPPER  = upper;
+const char * const  ATTR_EITHER = either;
+
 static inline bool
-is_attr_func (const char * attr)
+has_attr (const char 

RFA: Doc update: Describe new MSP430 feature

2015-05-05 Thread Nick Clifton
Hi Gerald, Hi Joesph, Hi Sandra,

  I recently contributes a patch to enhance the MSP430 backend with the
  ability to automatically distribute code and data between high and low
  memory regions.  Below is a patch to update the html documentation
  with a description of this new feature.  Is it OK to apply ?

Cheers
  Nick

Index: htdocs/gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.120
diff -u -3 -p -r1.120 changes.html
--- htdocs/gcc-5/changes.html   1 May 2015 22:44:45 -   1.120
+++ htdocs/gcc-5/changes.html   5 May 2015 15:31:03 -
@@ -859,6 +859,27 @@ here/a./p
   when compiling for soft-float targets./li
   /ul
 
+h3 id=msp430MSP430/h3
+  ul
+lipThe MSP430 compiler now has the ability to automatically distribute 
code
+  and data between low (less than 64K) memory and high memory.  This only
+  applies parts that actually have both memory regions and only if the
+  linker script for the part has been specifically set up to support this
+  feature./p
+
+  pA new attribute of codeeither/code can be applied to both 
functions
+  and data, and this tells the compiler to place the object into low memory
+  if there is room and into high memory otherwise.  Two other new 
attributes
+  - codelower/code and codeupper/code - can be used to explicitly
+  state that an object should be placed in the specified memory region.  If
+  there is not enough left in that region the compilation will fail./p
+
+  pTwo new command line options - 
code-mcode-region=[lower|upper|either]/code
+  and code-mdata-region=[lower|upper|either]/code - can be used to tell
+  the compiler what to do with objects that do not have one of these new
+  attributes./p/li
+  /ul
+
 h3 id=nds32NDS32/h3
   ul
 liThe variadic function ABI implementation is now compatible with


ping**n re [patch, ARM] Add support for crtfastmath.o

2015-05-05 Thread Sandra Loosemore

This patch I posted last fall:

https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00711.html

still has not been reviewed, in spite of me pinging it several times 
before GCC 5 went into stage 4.  Now that we're back in stage 1 again, 
can somebody please take a look at it?  I've just re-tested it on 
mainline head.


FAOD, NVIDIA assigned the copyright on their original version of the 
patch to Mentor Graphics so there should not be any legal issue with it.


-Sandra



Re: [PATCH] Fix eipa_sra AAPCS issue (PR target/65956)

2015-05-05 Thread Richard Biener
On May 5, 2015 4:33:58 PM GMT+02:00, Richard Earnshaw 
richard.earns...@foss.arm.com wrote:
On 05/05/15 15:33, Richard Earnshaw wrote:
 On 05/05/15 15:29, Jakub Jelinek wrote:
 On Tue, May 05, 2015 at 02:20:43PM +0100, Richard Earnshaw wrote:
 On 05/05/15 14:06, Jakub Jelinek wrote:
 On Tue, May 05, 2015 at 02:02:19PM +0100, Richard Earnshaw wrote:
 In a literal sense, yes.  However, even KR  stdarg have
standard
 promotion and conversion rules (size  int = int, floats
promoted to
 double, etc).  What are those rules for GCC's overaligned types
(ie
 where in the docs does it say what happens and how a back-end
should
 interpret them)?  Shouldn't the mid-end be doing that work so as
to

 For the middle-end, the TYPE_ALIGN info on expression types is
considered
 useless, you can get there anything.  There is no conversion rule
to what
 you get for myalignedint + int, or (myalignedint) int, or (int)
 myalignedint, etc.

 create a consistent view of the values passed into the back-end? 
It
 seems to me that at present the back-end has to be psychic as to
what is
 really happening.

 No, the backend just shouldn't consider TYPE_ALIGN on the scalars,
and it
 seems only arm ever looks at that.


 Nothing in the specification for TARGET_FUNCTION_ARG (or any of the
 related functions) makes any mention of this...

 While this requirement isn't documented, it is clearly common sense
or at
 least something any kind of testing would reveal immediately.
 
 Then clearly no such tests exist in the testsuite :-(
 

Or, more precisely, no such tests existed in 2008, when the code went
in.

That just means that the code was the first that make this matter.

BTW, what do other compilers do (I suppose llvm supports the gcc extensions for 
alignment)?  Can llvm and gcc interoperate properly here? Do the cited 
testcases work with llvm?

Richard.


R.

 R.
 
 And it is nothing broken recently (except that with the SRA changes
it hits
 much more often), looking e.g. at GCC 3.2, I'm seeing that
expand_call is on
 that testcase also called with pretty random TYPE_ALIGN on the
argument
 types; we didn't have GIMPLE then, so it is nothing that GIMPLE
brought in.

 Jakub

 




[wwwdocs] Skeleton for GCC 6 release notes

2015-05-05 Thread Gerald Pfeifer
I started working on this over the weekend, and then Jason 
wondered about it yesterday, so I completed and committed the 
following skeleton for the GCC 6 release notes yesterday.

Gerald

Index: gcc-6/changes.html
===
RCS file: gcc-6/changes.html
diff -N gcc-6/changes.html
--- /dev/null   1 Jan 1970 00:00:00 -
+++ gcc-6/changes.html  5 May 2015 19:17:11 -
@@ -0,0 +1,81 @@
+html
+
+head
+titleGCC 6 Release Series mdash; Changes, New Features, and Fixes/title
+/head
+
+!-- GCC maintainers, please do not hesitate to update/contribute entries
+ concerning those part of GCC you maintain!  2002-03-23, Gerald.
+--
+
+body
+h1GCC 6 Release Seriesbr /Changes, New Features, and Fixes/h1
+
+!-- .. --
+h2Caveats/h2
+
+
+!-- .. --
+!-- h2 id=generalGeneral Optimizer Improvements/h2 --
+
+
+!-- .. --
+h2 id=languagesNew Languages and Language specific improvements/h2
+
+!-- h3 id=adaAda/h3 --
+
+!-- h3 id=c-familyC family/h3 --
+
+h3 id=cxxC++/h3
+
+!-- h3 id=goGo/h3 --
+
+!-- h3 id=javaJava (GCJ)/h3 --
+
+
+!-- .. --
+!-- h2 id=jitlibgccjit/h2 --
+
+
+!-- .. --
+!-- h2 id=targetsNew Targets and Target Specific Improvements/h2 --
+
+!-- h3 id=aarch64AArch64/h3 --
+
+!-- h3 id=armARM/h3 --
+
+!-- h3 id=avrAVR/h3 --
+
+!-- h3 id=x86IA-32/x86-64/h3 --
+
+!-- h3 id=mipsMIPS/h3 --
+
+!-- h3 id=nds32NDS32/h3 --
+
+!-- h3 id=powerpcPowerPC / PowerPC64 / RS6000/h3 --
+
+!-- h3 id=rxRX/h3 --
+
+!-- h3 id=shSH/h3 --
+
+
+!-- .. --
+!-- h2 id=osOperating Systems/h2 --
+
+!-- h3 id=dragonflyDragonFly BSD/h3 --
+
+!-- h3 id=freebsdFreeBSD/h3 --
+
+!-- h3 id=vxmilsVxWorks MILS/h3 --
+
+
+!-- .. --
+!-- h2Documentation improvements/h2 --
+
+
+!-- .. --
+!-- h2Other significant improvements/h2 --
+
+
+/body
+/html


Re: RFA: Doc update: Describe new MSP430 feature

2015-05-05 Thread Gerald Pfeifer
Hi Nick,

On Tue, 5 May 2015, Nick Clifton wrote:
   I recently contributes a patch to enhance the MSP430 backend with the
   ability to automatically distribute code and data between high and low
   memory regions.  Below is a patch to update the html documentation
   with a description of this new feature.  Is it OK to apply ?

as maintainer of the port you do not need anyone else's approval.

That said, I am always happy to provide a second pair of eyes, so
here are some comments:

 Index: htdocs/gcc-5/changes.html
 ===
 +lipThe MSP430 compiler now has the ability to automatically 
 distribute code
 +  and data between low (less than 64K) memory and high memory. 

Should that be addresses below ... or something like that, since
the system would still have more than 64K of memory, not less?

 +  This only
 +  applies parts 

Is a to missing here?

 +  pTwo new command line options - 
 code-mcode-region=[lower|upper|either]/code

command-line options


Does this really apply to GCC 5, or GCC 6 and the (new) gcc-6/changes.html?

Gerald


RE: [RFC]: Remove Mem/address type assumption in combiner

2015-05-05 Thread Kumar, Venkataramanan
Hi Segher, 

Thank you for testing the patch and approving it.

Before I commit it, I wanted to check with you on the changelog entry.

Please find the updated patch with the changelog entry. 
I have just removed the comments that says checks for PLUS/MINUS RTX is a hack.

Let me know if it ok.  

Regards,
Venkat.

Change Log 
---

2015-05-05  Venkataramanan Kumar  venkataramanan.ku...@amd.com

* combine.c (make_compound_operation): Remove checks for PLUS/MINUS
rtx type.

---patch---

diff --git a/gcc/combine.c b/gcc/combine.c
index c04146a..9e3eb03 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -7723,9 +7723,8 @@ extract_left_shift (rtx x, int count)
We try, as much as possible, to re-use rtl expressions to save memory.

IN_CODE says what kind of expression we are processing.  Normally, it is
-   SET.  In a memory address (inside a MEM, PLUS or minus, the latter two
-   being kludges), it is MEM.  When processing the arguments of a comparison
-   or a COMPARE against zero, it is COMPARE.  */
+   SET.  In a memory address it is MEM.  When processing the arguments of
+   a comparison or a COMPARE against zero, it is COMPARE.  */

rtx
 make_compound_operation (rtx x, enum rtx_code in_code)
@@ -7745,8 +7744,6 @@ make_compound_operation (rtx x, enum rtx_code in_code)
  but once inside, go back to our default of SET.  */

   next_code = (code == MEM ? MEM
-  : ((code == PLUS || code == MINUS)
-  SCALAR_INT_MODE_P (mode)) ? MEM
   : ((code == COMPARE || COMPARISON_P (x))
   XEXP (x, 1) == const0_rtx) ? COMPARE
   : in_code == COMPARE ? SET : in_code);

---patch---

 
-Original Message-
From: Segher Boessenkool [mailto:seg...@kernel.crashing.org] 
Sent: Friday, May 01, 2015 8:48 PM
To: Kumar, Venkataramanan
Cc: Jeff Law (l...@redhat.com); gcc-patches@gcc.gnu.org; 
maxim.kuvyr...@linaro.org
Subject: Re: [RFC]: Remove Mem/address type assumption in combiner

On Wed, Apr 29, 2015 at 12:03:35PM -0500, Segher Boessenkool wrote:
 On Wed, Apr 29, 2015 at 09:25:21AM +, Kumar, Venkataramanan wrote:
  diff --git a/gcc/combine.c b/gcc/combine.c index 5c763b4..945abdb 
  100644
  --- a/gcc/combine.c
  +++ b/gcc/combine.c
  @@ -7703,8 +7703,6 @@ make_compound_operation (rtx x, enum rtx_code in_code)
but once inside, go back to our default of SET.  */
  
 next_code = (code == MEM ? MEM
  -  : ((code == PLUS || code == MINUS)
  -  SCALAR_INT_MODE_P (mode)) ? MEM
 : ((code == COMPARE || COMPARISON_P (x))
 XEXP (x, 1) == const0_rtx) ? COMPARE
 : in_code == COMPARE ? SET : in_code);
  
  
  On X86_64, it passes bootstrap and regression tests.
  But on Aarch64 the test in PR passed, but I got a few test case failures.
 
  There are few patterns based on multiplication operations in Aarch64 
  backend which are used to match with the pattern combiner generated.
  Now those patterns have to be fixed to use SHIFTS.  Also need to see any 
  impact on other targets.
 
 Right.  It would be good if you could find out for what targets it matters.
 The thing is, if a target expects only the patterns as combine used to 
 make them, it will regress (as you've seen on aarch64); but it will 
 regress _silently_.  Which isn't so nice.
 
  But  before that  I wanted to check if the assumption in combiner,  can 
  simply be removed ?
 
 Seeing for what targets / patterns it makes a difference would tell us 
 the answer to that, too :-)
 
 I'll run some tests with and without your patch.

So I ran the tests.  These are text sizes for vmlinux built for most 
architectures (before resp. after the patch).  Results are good, but they show 
many targets need some work...


BIGGER
 5410496   5432744   alpha
 3848987   3849143   arm
 2190276   2190292   blackfin
 2188431   2191503   c6x (but data shrank by more than text grew)
 2212322   2212706   cris
10896454  10896965   i386
 7471891   7488771   parisc
 6168799   6195391   parisc64
 1545840   1545872   shnommu
 1946649   1954149   xtensa

I looked at some of those.  Alpha regresses with s4add things, arm with add 
,lsl things, parisc with shladd things.  I tried to fix the parisc one (it 
seemed simplest), and the 32-bit kernel got most (but not all) of the size 
difference back; and the 64-bit wouldn't build (an assert in the kernel failed, 
and it uses a floating point reg where an integer one is needed -- I gave up).


SMALLER
 8688171   8688003   powerpc
20252605  20251797   powerpc64
11425334  11422558   s390
12300135  12299767   x86_64

The PowerPC differences are mostly what first was rlwinm;addi;rlwinm;lwz and 
now is rlwinm;lwz; i.e. the add is moved after the two rotates, the rotates are 
merged, and the add is made part of the offset in the load.


NO DIFF
 3321492   3321492   frv
 3252747   3252747   m32r
 4708232   4708232   microblaze
 3949145   3949145   mips
 5693019   

[PATCH, i386]: Fix PR65990, unrecognizable insn with -mmemcpy-strategy=rep_8byte

2015-05-05 Thread Uros Bizjak
Hello!

Attached patch fixes PR65990. rep_8byte is not a valid stringop
strategy for 32bit targets.

2015-05-05  Uros Bizjak  ubiz...@gmail.com

PR target/65990
* config/i386/i386.c (ix86_parse_stringop_strategy_string): Error out
if rep_8byte stringop strategy was specified for 32-bit target.

testsuite/ChangeLog:

2015-05-05  Uros Bizjak  ubiz...@gmail.com

PR target/65990
* gcc.target/i386/pr65990.c: New test.

Tested on x86_64-linux-gnu{,-m32} and committed to mainline SVN.

RMs, should this patch be backported to release branches?

Uros.
Index: testsuite/gcc.target/i386/pr65990.c
===
--- testsuite/gcc.target/i386/pr65990.c (revision 0)
+++ testsuite/gcc.target/i386/pr65990.c (revision 0)
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options -mtune=btver2 -mmemcpy-strategy=rep_8byte:-1:noalign }
+
+/* { dg-error stringop strategy name rep_8byte specified for option 
-mmemcpy_strategy= not supported for 32-bit code  { target ia32 } 0 } */
+
+struct U9
+{
+  unsigned a[9];
+};
+
+struct U9 u9;
+
+void
+foo ()
+{
+  u9 = (struct U9) {
+.a = {
+  0xFF,
+  0xFF,
+  0xFF,
+  0xFF,
+  0xFF,
+  0xFF,
+  0xFF,
+  0xFF,
+  0xFF
+}
+  };
+}
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 222804)
+++ config/i386/i386.c  (working copy)
@@ -2988,6 +2988,17 @@ ix86_parse_stringop_strategy_string (char *strateg
   return;
 }
 
+  if ((stringop_alg) i == rep_prefix_8_byte
+  !TARGET_64BIT)
+   {
+ /* rep; movq isn't available in 32-bit code.  */
+ error (stringop strategy name %s specified for option %s 
+not supported for 32-bit code,
+ alg_name,
+ is_memset ? -mmemset_strategy= : -mmemcpy_strategy=);
+ return;
+   }
+
   input_ranges[n].max = maxs;
   input_ranges[n].alg = (stringop_alg) i;
   if (!strcmp (align, align))


Re: [PATCH, i386]: Fix PR65990, unrecognizable insn with -mmemcpy-strategy=rep_8byte

2015-05-05 Thread Jakub Jelinek
On Tue, May 05, 2015 at 06:58:08PM +0200, Uros Bizjak wrote:
 RMs, should this patch be backported to release branches?

I think so (4.9+, I think 4.8 doesn't have -mmemcpy-strategy=).

Jakub


[patch 0/27] RFC: Use automake-1.11.6 across the tree

2015-05-05 Thread Michael Haubenwallner
Hello build machinery maintainers,

following up
http://thread.gmane.org/gmane.comp.gcc.patches/331902/focus=334462
http://thread.gmane.org/gmane.comp.gcc.patches/332160

On 01/25/2015 08:42 PM, Jan-Benedict Glaw wrote:
 On Sun, 2015-01-04 20:20:40 +0100, Michael Haubenwallner 
 michael.haubenwall...@ssi-schaefer.com wrote:

 Updating to 1.14 might require more work like updating some .in
 files as well. I've seen automake-1.11 being explicitly used, so for
 now we really want 1.11.6 eventually?
 
 Probably yes, we may want to stick to a well-known version. (Maybe
 another way could be to really upgrade to current versions, with is, I
 guess, more work than just sync files and rerun. That might be
 fruitful (ie. to not stick to older versions), but this is a change I
 don't see in the current stage.)
 
   To cut a long story short:
 
   * Do we actually see *problems* from the version inconsistencies
 already introduced by me and/or others?

There's a problem for gcc-developers: When I need to import a libtool
upstream patch, by its nature it affects each library. As I prefer to
avoid mixing these diffs with an automake version change in one commit,
I need to bootstrap different libraries with different automake versions.

Even if I probably need to split this change into one commit per library
anyway, the need for multiple automake versions still feels pointless.

   * ...and: Do we want to stick to known versions, or update if?
 (Probably not in such a late stage, though...)

Now that gcc-5 is out, what about an automake-1.11.6 update for gcc-6?

BTW, the actual commands I use to re-run automake for everything (I found) is:
  $ export AUTOMAKE='automake-1.11 --add-missing --copy --force-missing'
  $ /src/gcc-trunk/configure --prefix=/install \
  --enable-languages=c,c++,fortran,go,java,lto,objc,obj-c++ \
  --enable-liboffloadmic=target \
  --enable-libmpx \
  --enable-maintainer-mode 
  $ make bootstrap

Thanks!
/haubi/


libgo patch committed: Permit nil Func in Func.Name

2015-05-05 Thread Ian Lance Taylor
The gc toolchain permits Func to be nil when calling the
runtime.Func.Name method, returning an empty string.  This patch
changes libgo to do the same, rather than crashing.  This is GCC PR
66016.  Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline and 4.9 branch.

Ian
diff -r f3c36747f019 libgo/runtime/go-caller.c
--- a/libgo/runtime/go-caller.c Thu Apr 30 13:40:30 2015 -0700
+++ b/libgo/runtime/go-caller.c Tue May 05 07:12:06 2015 -0700
@@ -231,6 +231,8 @@
 String
 runtime_funcname_go (Func *f)
 {
+  if (f == NULL)
+return runtime_gostringnocopy ((const byte *) );
   return f-name;
 }
 


Re: [RFC]: Remove Mem/address type assumption in combiner

2015-05-05 Thread Segher Boessenkool
On Tue, May 05, 2015 at 04:14:03PM +, Kumar, Venkataramanan wrote:
 Hi Segher, 
 
 Thank you for testing the patch and approving it.
 
 Before I commit it, I wanted to check with you on the changelog entry.
 
 Please find the updated patch with the changelog entry. 
 I have just removed the comments that says checks for PLUS/MINUS RTX is a 
 hack.
 
 Let me know if it ok.  
 
 Regards,
 Venkat.
 
 Change Log 
 ---
 
 2015-05-05  Venkataramanan Kumar  venkataramanan.ku...@amd.com
 
 * combine.c (make_compound_operation): Remove checks for PLUS/MINUS
 rtx type.

Yes, this is okay for trunk.


Segher


[PING^2, AArch64] Add long-call attribute and pragma interfaces

2015-05-05 Thread Yangfei (Felix)
Patch ping ...


 
  On 04/02/2015 11:59 PM, Yangfei (Felix) wrote:
   Patch ping: https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01148.html
 
  This patch needs documentation for the new attributes and pragmas
  before it can be committed.  (Since this is a new feature I think it
  has to wait until stage 1, too, but that's not my call.)
 
  -Sandra
 
 
 Sure, I will update the docs when this patch is approved by the port 
 maintainers.
 I didn't get any feedback from Richard for the v2 patch.  Thanks.


[PATCH 2/4 v2: part 1] Move linemap_assert higher up within the file

2015-05-05 Thread David Malcolm
---
 libcpp/include/line-map.h | 71 ---
 1 file changed, 36 insertions(+), 35 deletions(-)

diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index e1681ea..7001552 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -279,6 +279,42 @@ struct GTY(()) line_map {
   } GTY((desc (%1.reason == LC_ENTER_MACRO))) d;
 };
 
+#if defined ENABLE_CHECKING  (GCC_VERSION = 2007)
+
+/* Assertion macro to be used in line-map code.  */
+#define linemap_assert(EXPR)  \
+  do {\
+if (! (EXPR)) \
+  abort ();   \
+  } while (0)
+
+/* Assert that becomes a conditional expression when checking is disabled at
+   compilation time.  Use this for conditions that should not happen but if
+   they happen, it is better to handle them gracefully rather than crash
+   randomly later.
+   Usage:
+
+   if (linemap_assert_fails(EXPR)) handle_error(); */
+#define linemap_assert_fails(EXPR) __extension__ \
+  ({linemap_assert (EXPR); false;})
+
+/* Assert that MAP encodes locations of tokens that are not part of
+   the replacement-list of a macro expansion.  */
+#define linemap_check_ordinary(LINE_MAP) __extension__ \
+  ({linemap_assert (!linemap_macro_expansion_map_p (LINE_MAP)); \
+(LINE_MAP);})
+#else
+/* Include EXPR, so that unused variable warnings do not occur.  */
+#define linemap_assert(EXPR) ((void)(0  (EXPR)))
+#define linemap_assert_fails(EXPR) (! (EXPR))
+#define linemap_check_ordinary(LINE_MAP) (LINE_MAP)
+#endif
+
+/* Return TRUE if MAP encodes locations coming from a macro
+   replacement-list at macro expansion point.  */
+bool
+linemap_macro_expansion_map_p (const struct line_map *);
+
 #define MAP_START_LOCATION(MAP) (MAP)-start_location
 
 #define ORDINARY_MAP_FILE_NAME(MAP) \
@@ -568,10 +604,6 @@ extern const struct line_map *linemap_lookup
macro expansion, FALSE otherwise.  */
 bool linemap_tracks_macro_expansion_locs_p (struct line_maps *);
 
-/* Return TRUE if MAP encodes locations coming from a macro
-   replacement-list at macro expansion point.  */
-bool linemap_macro_expansion_map_p (const struct line_map *);
-
 /* Return the name of the macro associated to MACRO_MAP.  */
 const char* linemap_map_get_macro_name (const struct line_map*);
 
@@ -638,37 +670,6 @@ bool linemap_location_from_macro_expansion_p (const struct 
line_maps *,
 #define MAIN_FILE_P(MAP)   \
   ((linemap_check_ordinary (MAP)-d.ordinary.included_from  0))
 
-#if defined ENABLE_CHECKING  (GCC_VERSION = 2007)
-
-/* Assertion macro to be used in line-map code.  */
-#define linemap_assert(EXPR)   \
-  do { \
-if (! (EXPR))  \
-  abort ();\
-  } while (0)
- 
-/* Assert that becomes a conditional expression when checking is disabled at
-   compilation time.  Use this for conditions that should not happen but if
-   they happen, it is better to handle them gracefully rather than crash
-   randomly later. 
-   Usage:
-
-   if (linemap_assert_fails(EXPR)) handle_error(); */
-#define linemap_assert_fails(EXPR) __extension__ \
-  ({linemap_assert (EXPR); false;}) 
-
-/* Assert that MAP encodes locations of tokens that are not part of
-   the replacement-list of a macro expansion.  */
-#define linemap_check_ordinary(LINE_MAP) __extension__ \
-  ({linemap_assert (!linemap_macro_expansion_map_p (LINE_MAP)); \
-(LINE_MAP);})
-#else
-/* Include EXPR, so that unused variable warnings do not occur.  */
-#define linemap_assert(EXPR) ((void)(0  (EXPR)))
-#define linemap_assert_fails(EXPR) (! (EXPR))
-#define linemap_check_ordinary(LINE_MAP) (LINE_MAP)
-#endif
-
 /* Encode and return a source_location from a column number. The
source line considered is the last source line used to call
linemap_line_start, i.e, the last source line which a location was
-- 
1.8.5.3



[PATCH 2/4 v2: part 2] libcpp: Replace macro usage with C++ constructs

2015-05-05 Thread David Malcolm
libcpp/ChangeLog:
* include/line-map.h (MAX_SOURCE_LOCATION): Convert from a macro
to a const source_location.
(RESERVED_LOCATION_COUNT): Likewise.
(linemap_check_ordinary): Convert from a macro to a pair of inline
functions, for const/non-const arguments.
(MAP_START_LOCATION): Likewise.
(ORDINARY_MAP_STARTING_LINE_NUMBER): Likewise.
(ORDINARY_MAP_INCLUDER_FILE_INDEX): Likewise.
(ORDINARY_MAP_IN_SYSTEM_HEADER_P): Likewise.
(ORDINARY_MAP_NUMBER_OF_COLUMN_BITS): Convert from a macro to a
pair of inline functions, for const/non-const arguments, where the
latter is named...
(SET_ORDINARY_MAP_NUMBER_OF_COLUMN_BITS): New function.
(ORDINARY_MAP_FILE_NAME): Convert from a macro to a pair of inline
functions, for const/non-const arguments.
(MACRO_MAP_MACRO): Likewise.
(MACRO_MAP_NUM_MACRO_TOKENS): Likewise.
(MACRO_MAP_LOCATIONS): Likewise.
(MACRO_MAP_EXPANSION_POINT_LOCATION): Likewise.
(LINEMAPS_MAP_INFO): Likewise.
(LINEMAPS_MAPS): Likewise.
(LINEMAPS_ALLOCATED): Likewise.
(LINEMAPS_USED): Likewise.
(LINEMAPS_CACHE): Likewise.
(LINEMAPS_ORDINARY_CACHE): Likewise.
(LINEMAPS_MACRO_CACHE): Likewise.
(LINEMAPS_MAP_AT): Convert from a macro to an inline function.
(LINEMAPS_LAST_MAP): Likewise.
(LINEMAPS_LAST_ALLOCATED_MAP): Likewise.
(LINEMAPS_ORDINARY_MAPS): Likewise.
(LINEMAPS_ORDINARY_MAP_AT): Likewise.
(LINEMAPS_ORDINARY_ALLOCATED): Likewise.
(LINEMAPS_ORDINARY_USED): Likewise.
(LINEMAPS_LAST_ORDINARY_MAP): Likewise.
(LINEMAPS_LAST_ALLOCATED_ORDINARY_MAP): Likewise.
(LINEMAPS_MACRO_MAPS): Likewise.
(LINEMAPS_MACRO_MAP_AT): Likewise.
(LINEMAPS_MACRO_ALLOCATED): Likewise.
(LINEMAPS_MACRO_USED): Likewise.
(LINEMAPS_MACRO_LOWEST_LOCATION): Likewise.
(LINEMAPS_LAST_MACRO_MAP): Likewise.
(LINEMAPS_LAST_ALLOCATED_MACRO_MAP): Likewise.
(IS_ADHOC_LOC): Likewise.
(COMBINE_LOCATION_DATA): Likewise.
(SOURCE_LINE): Likewise.
(SOURCE_COLUMN): Likewise.
(LAST_SOURCE_LINE_LOCATION): Likewise.
(LAST_SOURCE_LINE): Likewise.
(LAST_SOURCE_COLUMN): Likewise.
(LAST_SOURCE_LINE_LOCATION)
(INCLUDED_FROM): Likewise.
(MAIN_FILE_P): Likewise.
(LINEMAP_FILE): Likewise.
(LINEMAP_LINE): Likewise.
(LINEMAP_SYSP): Likewise.
(linemap_location_before_p): Likewise.
* line-map.c (linemap_check_files_exited): Make local map const.
(linemap_add): Use SET_ORDINARY_MAP_NUMBER_OF_COLUMN_BITS.
(linemap_line_start): Likewise.
---
 libcpp/include/line-map.h | 562 +-
 libcpp/line-map.c |   6 +-
 2 files changed, 455 insertions(+), 113 deletions(-)

diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index 7001552..67a64e9 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -155,7 +155,7 @@ struct GTY(()) line_map_ordinary {
 
 /* This is the highest possible source location encoded within an
ordinary or macro map.  */
-#define MAX_SOURCE_LOCATION 0x7FFF
+const source_location MAX_SOURCE_LOCATION = 0x7FFF;
 
 struct cpp_hashnode;
 
@@ -298,16 +298,10 @@ struct GTY(()) line_map {
 #define linemap_assert_fails(EXPR) __extension__ \
   ({linemap_assert (EXPR); false;})
 
-/* Assert that MAP encodes locations of tokens that are not part of
-   the replacement-list of a macro expansion.  */
-#define linemap_check_ordinary(LINE_MAP) __extension__ \
-  ({linemap_assert (!linemap_macro_expansion_map_p (LINE_MAP)); \
-(LINE_MAP);})
 #else
 /* Include EXPR, so that unused variable warnings do not occur.  */
 #define linemap_assert(EXPR) ((void)(0  (EXPR)))
 #define linemap_assert_fails(EXPR) (! (EXPR))
-#define linemap_check_ordinary(LINE_MAP) (LINE_MAP)
 #endif
 
 /* Return TRUE if MAP encodes locations coming from a macro
@@ -315,30 +309,196 @@ struct GTY(()) line_map {
 bool
 linemap_macro_expansion_map_p (const struct line_map *);
 
-#define MAP_START_LOCATION(MAP) (MAP)-start_location
+/* Assert that MAP encodes locations of tokens that are not part of
+   the replacement-list of a macro expansion.  */
+inline struct line_map *
+linemap_check_ordinary (struct line_map *map)
+{
+  linemap_assert (!linemap_macro_expansion_map_p (map));
+  return map;
+}
+
+/* Assert that MAP encodes locations of tokens that are not part of
+   the replacement-list of a macro expansion.  */
+
+inline const struct line_map *
+linemap_check_ordinary (const struct line_map *map)
+{
+  linemap_assert (!linemap_macro_expansion_map_p (map));
+  return map;
+}
+
+/* Read the start location of MAP, as an rvalue.  */
+
+inline source_location
+MAP_START_LOCATION (const line_map *map)
+{
+  return map-start_location;

[PATCH 2/4 v2] libcpp: Replace macro usage with C++ constructs)

2015-05-05 Thread David Malcolm
On Mon, 2015-05-04 at 13:15 -0600, Jeff Law wrote:
On 05/01/2015 06:56 PM, David Malcolm wrote:
  libcpp makes extensive use of the C preprocessor.  Whilst this has a
  pleasingly self-referential quality, I find the code hard-to-read;
  implementing source location support in my JIT branch was much harder than
  I felt it should have been.
 
  In an attempt at making the code easier to follow, and to build towards
  a followup patch, this patch converts most of these macros to C++
  equivalents: using const for compile-time constants, and inline
  functions where macros aren't used as lvalues.
 
  This effectively documents the expected types of the params, and makes
  them available from the debugger e.g.:
 
 (gdb) p LINEMAP_FILE ($3)
 $1 = 0x13b8b37 command-line
 
  and indeed the constants also:
 
 (gdb) p IS_ADHOC_LOC(MAX_SOURCE_LOCATION)
 $2 = false
 (gdb) p IS_ADHOC_LOC(MAX_SOURCE_LOCATION + 1)
 $3 = true
 
  [I didn't mark the inline functions as static; should they be?]
 
  [FWIW, I posted a reduced version of this patch about a year ago as:
 https://gcc.gnu.org/ml/gcc-patches/2014-05/msg01092.html
  which covered a smaller subset of the macros].
 
  libcpp/ChangeLog:
  * include/line-map.h (MAX_SOURCE_LOCATION): Convert from a macro
  to a const source_location.
  (RESERVED_LOCATION_COUNT): Likewise.
  (linemap_check_ordinary): Convert from a macro to a pair of inline
  functions, for const/non-const arguments.
  (MAP_START_LOCATION): Likewise.
  (ORDINARY_MAP_STARTING_LINE_NUMBER): Likewise.
  (ORDINARY_MAP_INCLUDER_FILE_INDEX): Likewise.
  (ORDINARY_MAP_IN_SYSTEM_HEADER_P): Likewise.
  (ORDINARY_MAP_NUMBER_OF_COLUMN_BITS): Convert from a macro to a
  pair of inline functions, for const/non-const arguments, where the
  latter is named...
  (SET_ORDINARY_MAP_NUMBER_OF_COLUMN_BITS): New function.
  (ORDINARY_MAP_FILE_NAME): Convert from a macro to a pair of inline
  functions, for const/non-const arguments.
  (MACRO_MAP_MACRO): Likewise.
  (MACRO_MAP_NUM_MACRO_TOKENS): Likewise.
  (MACRO_MAP_LOCATIONS): Likewise.
  (MACRO_MAP_EXPANSION_POINT_LOCATION): Likewise.
  (LINEMAPS_MAP_INFO): Likewise.
  (LINEMAPS_MAPS): Likewise.
  (LINEMAPS_ALLOCATED): Likewise.
  (LINEMAPS_USED): Likewise.
  (LINEMAPS_CACHE): Likewise.
  (LINEMAPS_ORDINARY_CACHE): Likewise.
  (LINEMAPS_MACRO_CACHE): Likewise.
  (LINEMAPS_MAP_AT): Convert from a macro to an inline function.
  (LINEMAPS_LAST_MAP): Likewise.
  (LINEMAPS_LAST_ALLOCATED_MAP): Likewise.
  (LINEMAPS_ORDINARY_MAPS): Likewise.
  (LINEMAPS_ORDINARY_MAP_AT): Likewise.
  (LINEMAPS_ORDINARY_ALLOCATED): Likewise.
  (LINEMAPS_ORDINARY_USED): Likewise.
  (LINEMAPS_LAST_ORDINARY_MAP): Likewise.
  (LINEMAPS_LAST_ALLOCATED_ORDINARY_MAP): Likewise.
  (LINEMAPS_MACRO_MAPS): Likewise.
  (LINEMAPS_MACRO_MAP_AT): Likewise.
  (LINEMAPS_MACRO_ALLOCATED): Likewise.
  (LINEMAPS_MACRO_USED): Likewise.
  (LINEMAPS_MACRO_LOWEST_LOCATION): Likewise.
  (LINEMAPS_LAST_MACRO_MAP): Likewise.
  (LINEMAPS_LAST_ALLOCATED_MACRO_MAP): Likewise.
  (IS_ADHOC_LOC): Likewise.
  (COMBINE_LOCATION_DATA): Likewise.
  (SOURCE_LINE): Likewise.
  (SOURCE_COLUMN): Likewise.
  (LAST_SOURCE_LINE_LOCATION): Likewise.
  (LAST_SOURCE_LINE): Likewise.
  (LAST_SOURCE_COLUMN): Likewise.
  (LAST_SOURCE_LINE_LOCATION)
  (INCLUDED_FROM): Likewise.
  (MAIN_FILE_P): Likewise.
  (LINEMAP_FILE): Likewise.
  (LINEMAP_LINE): Likewise.
  (LINEMAP_SYSP): Likewise.
  (linemap_location_before_p): Likewise.
  * line-map.c (linemap_check_files_exited): Make local map const.
  (linemap_add): Use SET_ORDINARY_MAP_NUMBER_OF_COLUMN_BITS.
  (linemap_line_start): Likewise.
  ---
  -#define MAP_START_LOCATION(MAP) (MAP)-start_location
  +#if defined ENABLE_CHECKING  (GCC_VERSION = 2007)
  +
  +/* Assertion macro to be used in line-map code.  */
  +#define linemap_assert(EXPR)  \
  +  do {\
  +if (! (EXPR)) \
  +  abort ();   \
  +  } while (0)
  +
  +/* Assert that becomes a conditional expression when checking is disabled 
  at
  +   compilation time.  Use this for conditions that should not happen but if
  +   they happen, it is better to handle them gracefully rather than crash
  +   randomly later.
  +   Usage:
  +
  +   if (linemap_assert_fails(EXPR)) handle_error(); */
  +#define linemap_assert_fails(EXPR) __extension__ \
  +  ({linemap_assert (EXPR); false;})
  +
  +#else
  +/* Include EXPR, so that unused variable warnings do not occur.  */
  +#define linemap_assert(EXPR) ((void)(0  (EXPR)))
  +#define linemap_assert_fails(EXPR) (! (EXPR))
  +#endif
 
I moved linemap_assert and linemap_assert_fails higher up within the
file so that 

Re: Patch ping

2015-05-05 Thread Andreas Krebbel
On 05/05/2015 08:52 PM, Jakub Jelinek wrote:
 Hi!
 
 http://gcc.gnu.org/ml/gcc-patches/2015-04/msg01432.html
   - this got approved for arm and aarch64, but not for s390{,x}
   Ok for trunk?

Yes. Thanks!

-Andreas-



Re: [PATCH 3/3] Fix indentation issues seen by -Wmisleading-indentation

2015-05-05 Thread David Malcolm
On Wed, 2015-04-29 at 14:10 +0200, Mikael Morin wrote:
 Hello,
 
 Le 29/04/2015 02:02, David Malcolm a écrit :
  diff --git a/gcc/fortran/parse.c b/gcc/fortran/parse.c
  index 2c7c554..30e4eab 100644
  --- a/gcc/fortran/parse.c
  +++ b/gcc/fortran/parse.c
  @@ -4283,7 +4283,7 @@ parse_oacc_structured_block (gfc_statement acc_st)
  unexpected_eof ();
 else if (st != acc_end_st)
  gfc_error (Expecting %s at %C, gfc_ascii_statement (acc_end_st));
  -   reject_statement ();
  +  reject_statement ();
   }
 while (st != acc_end_st);
   
 I think this one is a bug; there should be braces around 'gfc_error' and
 'reject_statement'.
 At least that's the pattern in 'parse_oacc_loop', and how the
 'unexpected_statement' function is used.
 Author CC'ed in any case.

Thanks.

FWIW, Jeff had approved that patch, so I've committed the patch to trunk
(as r222823), making the indentation reflect the block structure.

Thomas:  should the
  reject_statement ();
call in the above be guarded by the
 else if (st != acc_end_st)
clause?

Thanks
Dave




Re: RFA: RL78: Save the frame pointer if it is used.

2015-05-05 Thread DJ Delorie

   OK to apply ?

Ok, but...

 -  if (regno == FRAME_POINTER_REGNUM  frame_pointer_needed)
 +  if (regno == FRAME_POINTER_REGNUM
 +   (frame_pointer_needed || df_regs_ever_live_p (regno)))

Do we want regs_ever_live or regs_ever_written_to ?  I seem to recall
changing a port... mep perhaps... to only save registers that are
changed, not registers that are used but read-only.


Re: [AArch64][PR65375] Fix RTX cost for vector SET

2015-05-05 Thread James Greenhalgh
On Sat, Apr 25, 2015 at 12:26:16AM +0100, Kugan wrote:
 
 Thanks for the review. I have updated the patch based on the comments
 with some other minor changes. Bootstrapped and regression tested on
 aarch64-none-linux-gnu with no-new regressions. Is this OK for trunk?
 
 
 Thanks,
 Kugan
 
 
 gcc/ChangeLog:
 
 2015-04-24  Kugan Vivekanandarajah  kug...@linaro.org
   Jim Wilson  jim.wil...@linaro.org
 
   * config/arm/aarch-common-protos.h (struct mem_cost_table): Added
   new  fields loadv and storev.
   * config/aarch64/aarch64-cost-tables.h (thunderx_extra_costs):
   Initialize loadv and storev.
   * config/arm/aarch-cost-tables.h (generic_extra_costs): Likewise.
   (cortexa53_extra_costs): Likewise.
   (cortexa57_extra_costs): Likewise.
   (xgene1_extra_costs): Likewise.
   * config/aarch64/aarch64.c (aarch64_rtx_costs): Update vector
   rtx_costs.

Hi Kugan,

Just a few syle comments, regarding the placements of comments in single-line
if statements. I know the current code does not neccesarily always follow the
comments below, I'll write a patch cleaning that up at some point when I'm back
at my desk.

Thanks,
James

 @@ -5667,6 +5668,14 @@ aarch64_rtx_costs (rtx x, int code, int outer 
 ATTRIBUTE_UNUSED,
  case NEG:
op0 = XEXP (x, 0);
  
 +  if (VECTOR_MODE_P (mode))
 + {
 +   if (speed)
 + /* FNEG.  */
 + *cost += extra_cost-vect.alu;
 +   return false;
 + }
 +
if (GET_MODE_CLASS (GET_MODE (x)) == MODE_INT)
 {
if (GET_RTX_CLASS (GET_CODE (op0)) == RTX_COMPARE

Personally, I find commented if statements without braces hard to
quickly parse. Something like this is much faster for me:

  if (speed)
{
  /* FNEG.  */
  *cost += extra_cost-vect.alu;
}

 @@ -5844,7 +5872,10 @@ cost_minus:
  
   if (speed)
 {
 - if (GET_MODE_CLASS (mode) == MODE_INT)
 + if (VECTOR_MODE_P (mode))
 +   /* Vector SUB.  */
 +   *cost += extra_cost-vect.alu;
 + else if (GET_MODE_CLASS (mode) == MODE_INT)
 /* SUB(S).  */
 *cost += extra_cost-alu.arith;
   else if (GET_MODE_CLASS (mode) == MODE_FLOAT)

As above.

 @@ -5888,7 +5919,6 @@ cost_plus:
 {
   if (speed)
 *cost += extra_cost-alu.arith_shift;
 -
   *cost += rtx_cost (XEXP (XEXP (op0, 0), 0),
  (enum rtx_code) GET_CODE (op0),
  0, speed);

Drop this whitespace change.

 @@ -5913,7 +5943,10 @@ cost_plus:
  
   if (speed)
 {
 - if (GET_MODE_CLASS (mode) == MODE_INT)
 + if (VECTOR_MODE_P (mode))
 +   /* Vector ADD.  */
 +   *cost += extra_cost-vect.alu;
 + else if (GET_MODE_CLASS (mode) == MODE_INT)
 /* ADD.  */
 *cost += extra_cost-alu.arith;
   else if (GET_MODE_CLASS (mode) == MODE_FLOAT)

As above.

 @@ -6013,10 +6061,15 @@ cost_plus:
return false;
  
  case NOT:
 -  /* MVN.  */
if (speed)
 - *cost += extra_cost-alu.logical;
 -
 + {
 +   /* Vector NOT.  */
 +   if (VECTOR_MODE_P (mode))
 + *cost += extra_cost-vect.alu;
 +   /* MVN.  */
 +   else
 + *cost += extra_cost-alu.logical;
 + }
/* The logical instruction could have the shifted register form,
   but the cost is the same if the shift is processed as a separate
   instruction, so we don't bother with it here.  */

As above.

 @@ -6055,10 +6108,15 @@ cost_plus:
 return true;
   }
  
 -  /* UXTB/UXTH.  */
if (speed)
 - *cost += extra_cost-alu.extend;
 -
 + {
 +   if (VECTOR_MODE_P (mode))
 + /* UMOV.  */
 + *cost += extra_cost-vect.alu;
 +   else
 + /* UXTB/UXTH.  */
 + *cost += extra_cost-alu.extend;
 + }
return false;
  
  ca§se SIGN_EXTEND:

And again :)

 @@ -6087,10 +6150,16 @@ cost_plus:
  
if (CONST_INT_P (op1))
  {
 -   /* LSL (immediate), UBMF, UBFIZ and friends.  These are all
 -  aliases.  */
 if (speed)
 - *cost += extra_cost-alu.shift;
 + {
 +   /* Vector shift (immediate).  */
 +   if (VECTOR_MODE_P (mode))
 + *cost += extra_cost-vect.alu;
 +   /* LSL (immediate), UBMF, UBFIZ and friends.  These are all
 +  aliases.  */
 +   else
 + *cost += extra_cost-alu.shift;
 + }
  
/* We can incorporate zero/sign extend for free.  */
if (GET_CODE (op0) == ZERO_EXTEND

Again, the comment here makes it very difficult to spot the form of
the if/else statement.

 @@ -6102,10 +6171,15 @@ cost_plus:
  }
else
  {
 -   /* LSLV.  */
 if (speed)
 - *cost += extra_cost-alu.shift_reg;
 -
 + {
 +   /* Vector 

Re: [PING^3] [PATCH] [AArch64, NEON] Improve vmulX intrinsics

2015-05-05 Thread James Greenhalgh
On Sat, Apr 11, 2015 at 11:37:47AM +0100, Jiangjiji wrote:
 Hi,
   This is a ping for: https://gcc.gnu.org/ml/gcc-patches/2015-03/msg00772.html
   Regtested with aarch64-linux-gnu on QEMU.
   This patch has no regressions for aarch64_be-linux-gnu big-endian target 
 too.
   OK for the trunk?
 
 Thanks.
 Jiang jiji
 
 
 --
 Re: [PING^2] [PATCH] [AArch64, NEON] Improve vmulX intrinsics
 
 Hi, Kyrill
   Thank you for your suggestion.
   I fixed it and regtested with aarch64-linux-gnu on QEMU.
   This patch has no regressions for aarch64_be-linux-gnu big-endian target 
 too.
   OK for the trunk?

Hi Jiang,

I'm sorry that I've taken so long to get to this, I've been out of office
for several weeks. I have one comment.

 +__extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
 +vmul_n_f32 (float32x2_t __a, float32_t __b)
 +{
 +  return __builtin_aarch64_mul_nv2sf (__a, __b);
 +}
 +

For vmul_n_* intrinsics, is there a reason we don't want to use the
GCC vector extension syntax to allow us to write these as:

  __extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
  vmul_n_f32 (float32x2_t __a, float32_t __b)
  {
return __a * __b;
  }

It would be great if we could make that work.

Thanks,
James



[Patch, fortran] PR 37131, inline matmul

2015-05-05 Thread Thomas Koenig
Hello world,

this is an update of the matmul inline patch.  The only difference to
the last version is that it has the ubound simplification taken out.

Any further comments?  OK for trunk?

Thomas

2015-05-05  Thomas Koenig  tkoe...@gcc.gnu.org

PR fortran/37131
* gfortran.h (gfc_isym_id):  Add GFC_ISYM_FE_RUNTIME_ERROR.
(gfc_array_spec):  Add resolved flag.
(gfc_intrinsic_sym):  Add vararg.
* intrinsic.h (gfc_check_fe_runtime_error):  Add prototype.
(gfc_resolve_re_runtime_error):  Likewise.
Add prototype for gfc_is_reallocatable_lhs.
* array.c (gfc_resolve_array_spec):  Do not resolve if it has
already been resolved.
* trans-array.h (gfc_is_reallocatable_lhs):  Remove prototype.
* check.c (gfc_check_fe_runtime_error):  New function.
* intrinsic.c (add_sym_1p):  New function.
(make_vararg):  New function.
(add_subroutines):  Add fe_runtime_error.
(gfc_intrinsic_sub_interface): Skip sorting for variable number
of arguments.
* iresolve.c (gfc_resolve_fe_runtime_error):  New function.
* lang.opt (inline-matmul-limit):  New option.
(gfc_post_options): If no inline matmul limit has been set and
BLAS is called externally, use the BLAS limit.
* frontend-passes.c:  Include intrinsic.h.
(var_num):  New global counter for naming temporary variablbles.
(matrix_case):  Enum for differentiating the different matmul
cases.
(realloc_string_callback):  Add trim to the variable name.
(create_var): Add optional argument vname as part of the name.
Use var_num. Set dimension of result correctly. Split off block
creation into
(insert_block): New function.
(cfe_expr_0): Use fcn as part of temporary variable name.
(optimize_namesapce): Also set gfc_current_ns. Call
inline_matmul_assign.
(combine_array_constructor):  Use constr as part of
temporary name.
(get_array_inq_function):  New function.
(build_logical_expr):  New function.
(get_operand):  new function.
(inline_limit_check):  New function.
(runtime_error_ne):  New function.
(matmul_lhs_realloc):  New function.
(is_functino_or_op):  New function.
(has_function_or_op):  New function.
(freeze_expr):  New function.
(freeze_references):  New function.
(convert_to_index_kind):  New function.
(create_do_loop):  New function.
(get_size_m1):  New function.
(scalarized_expr):  New function.
(inline_matmul_assign):  New function.
* simplify.c (simplify_bound):  Simplify the case of the
lower bound of an assumed-shape argument.

2015-05-05  Thomas Koenig  tkoe...@gcc.gnu.org

PR fortran/37131
* gfortran.dg/dependency_26.f90: Add option to suppress inlining
matmul.
* gfortran.dg/function_optimize_1.f90:  Likewise.
* gfortran.dg/function_optimize_2.f90:  Likewise.
* gfortran.dg/function_optimize_5.f90:  Likewise.
* gfortran.dg/function_optimize_7.f90:  Likewise.
* gfortran.dg/inline_matmul_1.f90:  New test.
* gfortran.dg/inline_matmul_2.f90:  New test.
* gfortran.dg/inline_matmul_3.f90:  New test.
* gfortran.dg/inline_matmul_4.f90:  New test.
* gfortran.dg/inline_matmul_5.f90:  New test.
Index: fortran/array.c
===
--- fortran/array.c	(Revision 222771)
+++ fortran/array.c	(Arbeitskopie)
@@ -338,6 +338,9 @@ gfc_resolve_array_spec (gfc_array_spec *as, int ch
   if (as == NULL)
 return true;
 
+  if (as-resolved)
+return true;
+
   for (i = 0; i  as-rank + as-corank; i++)
 {
   e = as-lower[i];
@@ -364,6 +367,8 @@ gfc_resolve_array_spec (gfc_array_spec *as, int ch
 	}
 }
 
+  as-resolved = true;
+
   return true;
 }
 
Index: fortran/check.c
===
--- fortran/check.c	(Revision 222771)
+++ fortran/check.c	(Arbeitskopie)
@@ -5527,7 +5527,37 @@ gfc_check_random_seed (gfc_expr *size, gfc_expr *p
   return true;
 }
 
+bool
+gfc_check_fe_runtime_error (gfc_actual_arglist *a)
+{
+  gfc_expr *e;
+  int len, i;
+  int num_percent, nargs;
 
+  e = a-expr;
+  if (e-expr_type != EXPR_CONSTANT)
+return true;
+
+  len = e-value.character.length;
+  if (e-value.character.string[len-1] != '\0')
+gfc_internal_error (fe_runtime_error string must be null terminated);
+
+  num_percent = 0;
+  for (i=0; ilen-1; i++)
+if (e-value.character.string[i] == '%')
+  num_percent ++;
+
+  nargs = 0;
+  for (; a; a = a-next)
+nargs ++;
+
+  if (nargs -1 != num_percent)
+gfc_internal_error (fe_runtime_error: Wrong number of arguments (%d instead of %d),
+			nargs, num_percent++);
+
+  return true;
+}
+
 bool
 gfc_check_second_sub (gfc_expr *time)
 {

Re: [PATCH] Fix eipa_sra AAPCS issue (PR target/65956)

2015-05-05 Thread Jakub Jelinek
On Mon, May 04, 2015 at 05:00:11PM +0200, Jakub Jelinek wrote:
 So at least changing arm_needs_doubleword_align for non-aggregates would
 likely not break anything that hasn't been broken already and would unbreak
 the majority of cases.

Attached (untested so far).  It indeed changes code generated for
over-aligned va_arg, but as I believe you can't properly pass those in the
... caller, this should just fix it so that va_arg handling matches the
caller (and likewise for callees for named argument passing).

 The following testcase shows that eipa_sra changes alignment even for the
 aggregates.  Change aligned (8) to aligned (4) to see another possibility.

Actually I misread it, for the aggregates esra actually doesn't change
anything, which is the reason why the testcase doesn't fail.
The problem with the scalars is that esra first changed it to the
over-aligned MEM_REFs and then later on eipa_sra used the types of the
MEM_REFs created by esra.

2015-05-05  Jakub Jelinek  ja...@redhat.com

PR target/65956
* config/arm/arm.c (arm_needs_doubleword_align): For non-aggregate
types check TYPE_ALIGN of TYPE_MAIN_VARIANT rather than type itself.

* gcc.c-torture/execute/pr65956.c: New test.

--- gcc/config/arm/arm.c.jj 2015-05-04 21:51:42.0 +0200
+++ gcc/config/arm/arm.c2015-05-05 09:20:52.481693337 +0200
@@ -6063,8 +6063,13 @@ arm_init_cumulative_args (CUMULATIVE_ARG
 static bool
 arm_needs_doubleword_align (machine_mode mode, const_tree type)
 {
-  return (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY
- || (type  TYPE_ALIGN (type)  PARM_BOUNDARY));
+  if (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY)
+return true;
+  if (type == NULL_TREE)
+return false;
+  if (AGGREGATE_TYPE_P (type))
+return TYPE_ALIGN (type)  PARM_BOUNDARY;
+  return TYPE_ALIGN (TYPE_MAIN_VARIANT (type))  PARM_BOUNDARY;
 }
 
 
--- gcc/testsuite/gcc.c-torture/execute/pr65956.c.jj2015-05-01 
10:32:34.730150257 +0200
+++ gcc/testsuite/gcc.c-torture/execute/pr65956.c   2015-05-01 
10:32:13.0 +0200
@@ -0,0 +1,67 @@
+/* PR target/65956 */
+
+struct A { char *a; int b; long long c; };
+char v[3];
+
+__attribute__((noinline, noclone)) void
+fn1 (char *x, char *y)
+{
+  if (x != v[1] || y != v[2])
+__builtin_abort ();
+  v[1]++;
+}
+
+__attribute__((noinline, noclone)) int
+fn2 (char *x)
+{
+  asm volatile ( : +g (x) : : memory);
+  return x == v[0];
+}
+
+__attribute__((noinline, noclone)) void
+fn3 (const char *x)
+{
+  if (x[0] != 0)
+__builtin_abort ();
+}
+
+static struct A
+foo (const char *x, struct A y, struct A z)
+{
+  struct A r = { 0, 0, 0 };
+  if (y.b  z.b)
+{
+  if (fn2 (y.a)  fn2 (z.a))
+   switch (x[0])
+ {
+ case '|':
+   break;
+ default:
+   fn3 (x);
+ }
+  fn1 (y.a, z.a);
+}
+  return r;
+}
+
+__attribute__((noinline, noclone)) int
+bar (int x, struct A *y)
+{
+  switch (x)
+{
+case 219:
+  foo (+, y[-2], y[0]);
+case 220:
+  foo (-, y[-2], y[0]);
+}
+}
+
+int
+main ()
+{
+  struct A a[3] = { { v[1], 1, 1LL }, { v[0], 0, 0LL }, { v[2], 2, 2LL } };
+  bar (220, a + 2);
+  if (v[1] != 1)
+__builtin_abort ();
+  return 0;
+}


Jakub


Re: [Patch, Fortran, PR58586, v3] ICE with derived type with allocatable component passed by value

2015-05-05 Thread Andre Vehreschild
Hi Mikael, hi all,

Mikael, thanks for the review so far. I have inserted the changes requested and
updated this patch to trunk. For the e-expr_type == EXPR_FUNCTION I have
choose !DECL_P(parmse.expr) to not only prevent VAR_DECLs aliasing, but also
prevent aliasing for PARM_DECLs and similar.

Bootstraps and regtests ok on x86_64-linux-gnu/F21. 

Note, this patch also fixes the regression in alloc_comp_constructor_1.f90
introduced by my patch for pr44672, v4. The memory loss is also fixed again.

Ok for trunk?

Regards,
Andre

On Sun, 26 Apr 2015 12:22:56 +0200
Mikael Morin mikael.mo...@sfr.fr wrote:

 Hello,
 
 I'm reviewing the original patch only for now.
 The added bits in v2 will have to wait.
 
 Le 23/04/2015 20:00, Andre Vehreschild a écrit :
  diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
  index 9e6432f..80dfed1 100644
  --- a/gcc/fortran/trans-expr.c
  +++ b/gcc/fortran/trans-expr.c
  @@ -5344,8 +5344,19 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol *
  sym,  (e-expr_type != EXPR_VARIABLE  !e-rank))
   {
  int parm_rank;
  -   tmp = build_fold_indirect_ref_loc (input_location,
  -  parmse.expr);
  +   /* It is known the e returns a structure type with at least one
  +  allocatable component.  When e is a function, ensure that
  the
  +  function is called once only by using a temporary variable.
  */
  +   if (e-expr_type == EXPR_FUNCTION)
  + parmse.expr = gfc_evaluate_now_loc (input_location,
  + parmse.expr, se-pre);
  You need not limit this to functions only.
  I think you can even do this without condition.
  
  Yes, one could do that, but then an unnecessary temporary variable in the -
  for my taste - already too clobbered pseudo code is introduced.
  Furthermore, is the penalty on doing the check for a function negligible. I
  therefore have not changed that.
 
 All right; would you mind writing it either
   if (e-expr_type != EXPR_VARIABLE)
 or
   if (!DECL_P (parmse.expr))
 or
   if (!VAR_P (parmse.expr))
 instead?
 
  
  +   if (POINTER_TYPE_P (TREE_TYPE (parmse.expr)))
  This distinguishes arguments with/without value attribute, right?
  I think it's better to use the frontend information here
  (fsym-attr.value).
  
  Changed to check for value.
 
 Please check fsym  fsym-attr.value
 and add the following testcase (it fails with the patch).
 
 
 module m
   type :: t
 integer, allocatable :: comp
   end type
   type :: u
 type(t), allocatable :: comp
   end type
 end module m
 
   use m
   call sub(u())
 end
 
 
 OK with these changes.
 
  
  Ah, and don't forget to provide a ChangeLog entry with it.
  
  The Changelog entry comes in an additional attachment. 
  
 It doesn't appear inline in my mailer as its content type is
 application/octet-stream, so I missed it.  Sorry.
 
 Thanks for the patch.  I will review the rest later.
 
 Mikael


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
gcc/testsuite/ChangeLog:

2015-05-05  Andre Vehreschild  ve...@gmx.de

* gfortran.dg/alloc_comp_class_3.f03: New test.
* gfortran.dg/alloc_comp_class_4.f03: New test.


gcc/fortran/ChangeLog:

2015-04-23  Andre Vehreschild  ve...@gmx.de

PR fortran/58586
* resolve.c (resolve_symbol): Non-private functions in modules
with allocatable or pointer components are marked referenced
now. Furthermore is the default init especially for those
components now done in gfc_conf_procedure_call preventing
duplicate code.
* trans-decl.c (build_function_decl): Set the result decl when
the function symbol and the result symbol are equal.
(gfc_generate_function_code): Generate a fake result decl for
functions returning an object with allocatable components and
initialize them.
* trans-expr.c (gfc_conv_procedure_call): For value typed trees
use the tree without indirect ref. And for non-decl trees
add a temporary variable to prevent evaluating the tree
multiple times (prevent multiple function evaluations).
* trans.h: Made gfc_trans_structure_assign () protoype
available, which is now needed by trans-decl.c:gfc_generate_
function_code(), too.

gcc/fortran/ChangeLog:

2015-05-05  Andre Vehreschild  ve...@gmx.de

* resolve.c (resolve_symbol):
* trans-decl.c (build_function_decl):
(gfc_generate_function_code):
* trans-expr.c (gfc_conv_procedure_call):
* trans.h:

diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index 2ac4689..72df35e 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -14093,10 +14093,15 @@ resolve_symbol (gfc_symbol *sym)
 
   if ((!a-save  !a-dummy  !a-pointer
 	!a-in_common  !a-use_assoc
-	(a-referenced || a-result)
-	!(a-function  sym != sym-result))
+	!a-result  !a-function)
 	  || (a-dummy  a-intent == 

Re: [PATCH, x86] Add TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE hook

2015-05-05 Thread Yvan Roux
Hi Christian,

On 4 May 2015 at 11:29, Christian Bruel christian.br...@st.com wrote:

 Hi Christian,
 I noticed case gcc.dg/ipa/iinline-attr.c failed on aarch64.  The
 original patch is x86 specific, while the case is added as general
 one.  Could you please have a look at this?

 FAIL: gcc.dg/ipa/iinline-attr.c scan-ipa-dump inline
 hooray[^\\n]*inline copy in test


 that is the same latent bug for aarch64:  alignment flags are not
 propagated with attribute optimize (O2).

 testing attached patch

The patch looks good to me, maybe you can just fix the original typo
on optimizing in the comments while moving the code.  I've
bootstrapped and regtested it on aarch64-linux-gnu with the same
target testcase you added on i386 (as we discussed offline) and
everything is ok (gcc.dg/ipa/iinline-attr.c now PASS).  Notice that I
can't approve your patch.

Cheers,
Yvan


[PATCH,testsuite] Xfail gcc.dg/tree-ssa/stdarg-2.c f15 scans

2015-05-05 Thread Tom de Vries

Hi,

after checking in the 'postpone expanding va_arg till pass_stdarg' patch series, 
some scans related to function f15 in gcc.dg/tree-ssa/stdarg-2.c have started 
failing.


[ The committed patch series contained a modification of stdarg-2.c, but that 
seems to be not complete and not correct. ]


F.i., for s390 we find at 
https://gcc.gnu.org/ml/gcc-testresults/2015-05/msg00507.html:

...
FAIL: gcc.dg/tree-ssa/stdarg-2.c scan-tree-dump stdarg f15: va_list escapes 0, 
needs to save 1 GPR units and 2 FPR units

...

Furthermore, at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64950#c11 it was 
noted that:

...
Unfortunately, the gcc.dg/tree-ssa/stdarg-2.c part of the patch is wrong: the 
test now FAILs on i686-unknown-linux-gnu, i686-apple-darwin, and i386-pc-solaris 
with -m64: both dumps (i.e. -m32 and -m64) contain


m32/stdarg-2.c.084t.stdarg:f15: va_list escapes 1, needs to save all GPR units 
and all FPR units.
m64/stdarg-2.c.084t.stdarg:f15: va_list escapes 1, needs to save all GPR units 
and all FPR units.

...

I've filed two PRs:
- PR66010 '[6 Regression] Missed optimization after inlining va_list parameter'
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66010
- PR66013 'Missed optimization after inlining va_list parameter, -m32 case'
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66013
to track these errors.

AFAIU now from the investigations in those PRs, we can expect all f15 scans that 
check for the presence of 'va_list escapes 0' to fail.


I'd like to commit two patches.
- The first patch undoes the modification of stdarg-2.c as committed in the
  original patch series (omitted).
- The second patch adds appropriate xfails (attached).

OK for trunk?

Thanks,
- Tom
[PATCH 2/2] Xfail gcc.dg/tree-ssa/stdarg-2.c f15 scans

2015-05-05  Tom de Vries  t...@codesourcery.com

	* gcc.dg/tree-ssa/stdarg-2.c: Xfail f15 scans which test for presence of
	'va_list escapes 0'.
---
 gcc/testsuite/gcc.dg/tree-ssa/stdarg-2.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/stdarg-2.c b/gcc/testsuite/gcc.dg/tree-ssa/stdarg-2.c
index fe39da3..f09b5de 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/stdarg-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/stdarg-2.c
@@ -288,10 +288,18 @@ f15 (int i, ...)
   f15_1 (ap);
   va_end (ap);
 }
-/* { dg-final { scan-tree-dump f15: va_list escapes 0, needs to save \[148\] GPR units and \[1-9\]\[0-9\]* FPR units stdarg { target { { i?86-*-* x86_64-*-* }  { ! { ia32 || llp64 } } } } } } */
-/* { dg-final { scan-tree-dump f15: va_list escapes 0, needs to save \[148\] GPR units and \[1-9\]\[0-9\]* FPR units stdarg { target { powerpc*-*-linux*  { powerpc_fprs  ilp32 } } } } } */
+
+/* Following three dg-finals are marked as xfail due to PR66010/PR66013.  */
+/* Was: { target { { i?86-*-* x86_64-*-* }  { ! { ia32 || llp64 } } } }.  */
+/* { dg-final { scan-tree-dump f15: va_list escapes 0, needs to save \[148\] GPR units and \[1-9\]\[0-9\]* FPR units stdarg { xfail *-*-* } } } */
+/* Was: { target { powerpc*-*-linux*  { powerpc_fprs  ilp32 } } }.  */
+/* { dg-final { scan-tree-dump f15: va_list escapes 0, needs to save \[148\] GPR units and \[1-9\]\[0-9\]* FPR units stdarg { xfail *-*-* } } } */
+/* Was: { target s390*-*-linux* }.  */
+/* { dg-final { scan-tree-dump f15: va_list escapes 0, needs to save 1 GPR units and 2 FPR units stdarg { xfail *-*-* } } } */
+
+/* We may be able to improve upon this after fixing PR66010/PR66013.  */
 /* { dg-final { scan-tree-dump f15: va_list escapes 1, needs to save all GPR units and all FPR units stdarg { target alpha*-*-linux* } } } */
-/* { dg-final { scan-tree-dump f15: va_list escapes 0, needs to save 1 GPR units and 2 FPR units stdarg { target s390*-*-linux* } } } */
+
 /* { dg-final { scan-tree-dump-not f15: va_list escapes 0, needs to save 0 GPR units stdarg { target { { i?86-*-* x86_64-*-* }  ia32 } } } } */
 /* { dg-final { scan-tree-dump-not f15: va_list escapes 0, needs to save 0 GPR units stdarg { target ia64-*-* } } } */
 /* { dg-final { scan-tree-dump-not f15: va_list escapes 0, needs to save 0 GPR units stdarg { target { powerpc*-*-*  lp64 } } } } */
-- 
1.9.1



Re: [CHKP, PATCH] Fix LTO cgraph merge for instrumented functions

2015-05-05 Thread Ilya Enkovich
Ping

2015-04-14 12:14 GMT+03:00 Ilya Enkovich enkovich@gmail.com:
 On 10 Apr 03:15, Jan Hubicka wrote:
 
  References are not streamed out for nodes which are referenced in a
  partition but don't belong to it ('continue' condition in output_refs
  loop).

 Yeah, but it already special cases aliases (because we now always preserve 
 IPA_REF_ALIAS link
 in the boundary, so I think making IPA_REF_CHKP to be rpeserved should go 
 the same path)
 so we can special case the instrumentation thunks too?

 Any cgraph_node having instrumented clone should have it, not only 
 instrumentation thunks.  Surely we can make an exception for IPA_REF_CHKP.

 
  
   I suppose we still need to
1) write_symbol and lto-symtab should know that transparent aliases are 
   not real
   symbols and ignore them. We have predicate 
   symtab_node::real_symbol_p,
   I suppose it should return true.
  
   I think cgraph_merge and varpool_merge in lto-symtab needs to know 
   what to do
   when merging two symbols with transparent aliases.
2) At some point we need to populate transparent alias links as 
   discussed in the
   other thread.
3) For safety, I guess we want symbol_table::change_decl_assembler_name 
   to either
   sanity check there are no trasparent alias links pointing to old 
   assembler
   names or update it.
 
  Wouldn't search for possible referring via transparent aliases nodes
  be too expensive?

 Changing of decl_assembler_name is not very common and if we set up the links
 only after renaming of statics, i think we are safe. But it would be better 
 to
 keep verify it at some point.

 I suppose verify_node check that there is no transparent alias link on 
 symbols
 were it is not supposed to be and that there is link when it is supposed to 
 be (i.e.
 symtab_state is =EXPANSION and symbol is either weakref on tragets w/o 
 native weakrefs
 or instrumentation cone.

 Would you please mind adding the test and making sure it triggers when 
 things are
 out of sync?


 OK, but I suppose it should be a part of transparent alias links rework 
 discussed in another thread.  BTW are you going to intoroduce transparent 
 aliases in cgraph soon?

 
4) I think we want verify_node to check that transparent alias links 
   and chkp points
   as intended and only on those symbols where they need to be
  
   There is also logic in lto-partition to always insert alias target
OK, so you have the chkp references that links to 
instrumented_version.
You do not stream them for boundary symbols but for some reason they 
are needed,
but when you can end up with wrong CHKP link?
  
   Wrong links may appear when cgraph nodes are merged.
  
   I see, I think you really want to fix these at merging time as sugested 
   in 1).
   In this case even the node-instrumented_version pointer may be out of 
   date and
   point to a node that lost in merging?
 
  node-instrumented_version is redirected and never points to lost
  symbol. But during merge we may have situations when
  (node-instrumented_version-instrumented_version != node) because of
  multiple not merged (yet) symbols referencing the same merged target.

 OK, which should not be a problem if we stream the links instead of 
 rebuilding them, right?

 IPA_REF_CHKP refs don't affect instrumented_version merging. And I actually 
 don't see here any problems, instrumented_version links always come into 
 consistent state when symbol merging finishes.


 Here is a new patch version.  It has no chkp_maybe_fix_chkp_ref function,  
 IPA_REF_CHKP references are streamed out instead.  Bootstrapped and tested on 
 x86_64-unknown-linux-gnu.  OK for trunk?  I also want to port it to GCC 5 
 branch after release.

 Thanks,
 Ilya
 --
 gcc/

 2015-04-14  Ilya Enkovich  ilya.enkov...@intel.com

 * ipa.c (symbol_table::remove_unreachable_nodes): Don't
 remove instumentation thunks calling reachable functions.
 * lto-cgraph.c (output_refs): Always output IPA_REF_CHKP.
 * lto/lto-partition.c (privatize_symbol_name_1): New.
 (privatize_symbol_name): Privatize both decl and orig_decl
 names for instrumented functions.

 gcc/testsuite/

 2015-04-14  Ilya Enkovich  ilya.enkov...@intel.com

 * gcc.dg/lto/chkp-privatize-1_0.c: New.
 * gcc.dg/lto/chkp-privatize-1_1.c: New.
 * gcc.dg/lto/chkp-privatize-2_0.c: New.
 * gcc.dg/lto/chkp-privatize-2_1.c: New.


 diff --git a/gcc/ipa.c b/gcc/ipa.c
 index b3752de..3054afe 100644
 --- a/gcc/ipa.c
 +++ b/gcc/ipa.c
 @@ -492,7 +492,22 @@ symbol_table::remove_unreachable_nodes (FILE *file)
 }
   else if (cnode-thunk.thunk_p)
 enqueue_node (cnode-callees-callee, first, reachable);
 -
 +
 + /* For instrumentation clones we always need original
 +function node for proper LTO privatization.  */
 + if (cnode-instrumentation_clone
 +  reachable.contains 

Re: [CHKP, PATCH] Fix instrumented indirect calls with propagated pointers

2015-05-05 Thread Ilya Enkovich
Ping

2015-04-14 17:35 GMT+03:00 Ilya Enkovich enkovich@gmail.com:
 On 10 Apr 03:27, Jan Hubicka wrote:
 
  +  /* We might propagate instrumented function pointer into
  + not instrumented function and vice versa.  In such a
  + case we need to either fix function declaration or
  + remove bounds from call statement.  */
  +  if (flag_check_pointer_bounds  callee)
  +skip_bounds = chkp_redirect_edge (e);

 I think this gets wrong the case where the edge is speculative and the new
 direct call needs adjustement.  You probably need to do the right think in
 the if (e-speculative) branch so direct call is updated by indirect is not
 or at least give an explanation why this is not needed :)

 The speculative edge handling works in a way that the runtime conditoinal is
 built and then the edge is updated to the direct path, so perhaps
 you can just move all this after the ocnditoinal?

 I think you are right, it should be OK to move it after apeculative call 
 processing.


  diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
  index 404cb68..ffb6ad7 100644
  --- a/gcc/lto-wrapper.c
  +++ b/gcc/lto-wrapper.c
  @@ -273,6 +273,7 @@ merge_and_complain (struct cl_decoded_option 
  **decoded_options,
  case OPT_fwrapv:
  case OPT_fopenmp:
  case OPT_fopenacc:
  +   case OPT_fcheck_pointer_bounds:
/* For selected options we can merge conservatively.  */
for (j = 0; j  *decoded_options_count; ++j)
  if ((*decoded_options)[j].opt_index == foption-opt_index)
  @@ -503,6 +504,7 @@ append_compiler_options (obstack *argv_obstack, struct 
  cl_decoded_option *opts,
  case OPT_Ofast:
  case OPT_Og:
  case OPT_Os:
  +   case OPT_fcheck_pointer_bounds:

 Hmm, will this make mixed units contaiing some check bounds and some 
 non-check bounds to work?
 Perhaps these should be function specific? Does things like inlining bounds 
 checked function into
 non-checked function work?

 This actually should make it work better because solves a possible problem 
 with uninitialized static bounds data (chkp static constructors are generated 
 only when OPT_fcheck_pointer_bounds is passed).

 Inlining of instrumentation thunks is not supported (similar to all other 
 thunks).  But we may have a not instrumented call in an instrumented function 
 and do inlining for it.


 Otherwise the patch seems resonable.
 Honza


 Here is a fixed version with chkp redirection moved.  Bootstrap and testing 
 passed.  Is it OK for trunk and later for GCC 5?

 Thanks,
 Ilya
 --
 gcc/

 2015-04-14  Ilya Enkovich  ilya.enkov...@intel.com

 PR target/65527
 * cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Add
 redirection for instrumented calls.
 * lto-wrapper.c (merge_and_complain): Merge -fcheck-pointer-bounds.
 (append_compiler_options): Append -fcheck-pointer-bounds.
 * tree-chkp.h (chkp_copy_call_skip_bounds): New.
 (chkp_redirect_edge): New.
 * tree-chkp.c (chkp_copy_call_skip_bounds): New.
 (chkp_redirect_edge): New.

 gcc/testsuite/

 2015-04-14  Ilya Enkovich  ilya.enkov...@intel.com

 PR target/65527
 * gcc.target/i386/mpx/chkp-fix-calls-1.c: New.
 * gcc.target/i386/mpx/chkp-fix-calls-2.c: New.
 * gcc.target/i386/mpx/chkp-fix-calls-3.c: New.
 * gcc.target/i386/mpx/chkp-fix-calls-4.c: New.


 diff --git a/gcc/cgraph.c b/gcc/cgraph.c
 index 85531c8..38e71fc 100644
 --- a/gcc/cgraph.c
 +++ b/gcc/cgraph.c
 @@ -1281,6 +1281,7 @@ cgraph_edge::redirect_call_stmt_to_callee (void)
tree lhs = gimple_call_lhs (e-call_stmt);
gcall *new_stmt;
gimple_stmt_iterator gsi;
 +  bool skip_bounds = false;
  #ifdef ENABLE_CHECKING
cgraph_node *node;
  #endif
 @@ -1389,8 +1390,16 @@ cgraph_edge::redirect_call_stmt_to_callee (void)
 }
  }

 +  /* We might propagate instrumented function pointer into
 + not instrumented function and vice versa.  In such a
 + case we need to either fix function declaration or
 + remove bounds from call statement.  */
 +  if (flag_check_pointer_bounds  e-callee)
 +skip_bounds = chkp_redirect_edge (e);
 +
if (e-indirect_unknown_callee
 -  || decl == e-callee-decl)
 +  || (decl == e-callee-decl
 +  !skip_bounds))
  return e-call_stmt;

  #ifdef ENABLE_CHECKING
 @@ -1415,13 +1424,19 @@ cgraph_edge::redirect_call_stmt_to_callee (void)
 }
  }

 -  if (e-callee-clone.combined_args_to_skip)
 +  if (e-callee-clone.combined_args_to_skip
 +  || skip_bounds)
  {
int lp_nr;

 -  new_stmt
 -   = gimple_call_copy_skip_args (e-call_stmt,
 - e-callee-clone.combined_args_to_skip);
 +  new_stmt = e-call_stmt;
 +  if (e-callee-clone.combined_args_to_skip)
 +   new_stmt
 + = gimple_call_copy_skip_args (new_stmt,
 +   
 e-callee-clone.combined_args_to_skip);
 +  if (skip_bounds)
 

[PATCH][ARM] Fix PR 65955: Do not take REGNO on non-REG operand in movcond_addsi

2015-05-05 Thread Kyrill Tkachov

Hi all,

As the PR says, the movcond_addsi pattern takes the REGNO of an operand that 
may be a CONST_INT.
The fix for that is rather simple (perhaps even obvious?).

Unfortunately the testcase is in ada, and I'm not sure how to integrate this 
into the testsuite.
Does anyone know how to add an ada testcase?

The patch that introduced this faulty check was added for GCC 5 and backported 
to 4.9, so this patch
should be backported everywhere as well.

I'll be testing it on those branches shortly.

Tested on arm-none-eabi. Bootstrapped on arm-none-linux-gnueabihf with arm and 
thumb default mode.
I think this is the right thing to do in any case since the current code is 
clearly doing the wrong
thing if operands[2] is a CONST_INT.

Ok for trunk?

Thanks,
Kyrill

P.S. I'm attaching the patch with a larger than usual context so that the 
definition of operands[2]
can be seen more clearly.

2015-05-05  Kyrylo Tkachov  kyrylo.tkac...@arm.com

PR target/65955
* config/arm/arm.md (movcond_addsi): Check that operands[2] is a
REG before taking its REGNO.
commit e88f515909e185e7add7429b11672ccc3ad17be9
Author: Kyrylo Tkachov kyrylo.tkac...@arm.com
Date:   Fri May 1 10:01:13 2015 +0100

[ARM] PR 65955: check for REG_P before taking REGNO in movcond_addsi

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index b2f6b4f..e439e7a 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -9413,51 +9413,51 @@ (define_insn_and_split movcond_addsi
 	 (match_operator 5 comparison_operator
 	  [(plus:SI (match_operand:SI 3 s_register_operand r,r,r)
 	(match_operand:SI 4 arm_add_operand rIL,rIL,rIL))
 (const_int 0)])
 	 (match_operand:SI 1 arm_rhs_operand rI,rPy,r)
 	 (match_operand:SI 2 arm_rhs_operand rI,rPy,r)))
(clobber (reg:CC CC_REGNUM))]
TARGET_32BIT
#
 reload_completed
   [(set (reg:CC_NOOV CC_REGNUM)
 	(compare:CC_NOOV
 	 (plus:SI (match_dup 3)
 		  (match_dup 4))
 	 (const_int 0)))
(set (match_dup 0) (match_dup 1))
(cond_exec (match_dup 6)
 	  (set (match_dup 0) (match_dup 2)))]
   
   {
 machine_mode mode = SELECT_CC_MODE (GET_CODE (operands[5]),
 	 operands[3], operands[4]);
 enum rtx_code rc = GET_CODE (operands[5]);
 operands[6] = gen_rtx_REG (mode, CC_REGNUM);
 gcc_assert (!(mode == CCFPmode || mode == CCFPEmode));
-if (REGNO (operands[2]) != REGNO (operands[0]))
+if (!REG_P (operands[2]) || REGNO (operands[2]) != REGNO (operands[0]))
   rc = reverse_condition (rc);
 else
   std::swap (operands[1], operands[2]);
 
 operands[6] = gen_rtx_fmt_ee (rc, VOIDmode, operands[6], const0_rtx);
   }
   
   [(set_attr conds clob)
(set_attr enabled_for_depr_it no,yes,yes)
(set_attr type multiple)]
 )
 
 (define_insn movcond
   [(set (match_operand:SI 0 s_register_operand =r,r,r)
 	(if_then_else:SI
 	 (match_operator 5 arm_comparison_operator
 	  [(match_operand:SI 3 s_register_operand r,r,r)
 	   (match_operand:SI 4 arm_add_operand rIL,rIL,rIL)])
 	 (match_operand:SI 1 arm_rhs_operand 0,rI,?rI)
 	 (match_operand:SI 2 arm_rhs_operand rI,0,rI)))
(clobber (reg:CC CC_REGNUM))]
   TARGET_ARM
   *
   if (GET_CODE (operands[5]) == LT
(operands[4] == const0_rtx))


Re: [PATCH, AArch64] [4.9] Backport PR64304 fix (miscompilation with -mgeneral-regs-only )

2015-05-05 Thread Marcus Shawcroft
On 4 May 2015 at 10:13, Chen Shanyao chenshan...@huawei.com wrote:
 According to your opinion, I split the backports of pr64304 into 2 emails,
 and this one is for 4.9 branch.
 This patch backport the fix of PR target/64304 , miscompilation with
 -mgeneral-regs-only, to the 4.9 branch from trunk r219844. Tested on x86_64
 by using qemu of aarch64.
 OK for 4.9?

 diff -rupN gcc-4.9-20150225/gcc/ChangeLog
 gcc-4.9-20150225.pr64304//gcc/ChangeLog
 --- gcc-4.9-20150225/gcc/ChangeLog2015-03-04 20:48:30.0 -0500
 +++ gcc-4.9-20150225.pr64304//gcc/ChangeLog2015-03-04 20:55:59.0
 -0500
 @@ -1,3 +1,13 @@
 +2015-03-05  Shanyao Chen  chenshan...@huawei.com
 +
 +Backported from mainline
 +2015-01-19  Jiong Wang  jiong.w...@arm.com
 +Andrew Pinski  apin...@cavium.com
 +
 +PR target/64304
 +* config/aarch64/aarch64.md (define_insn *ashlmode3_insn): Deleted.
 +(ashlmode3): Don't expand if operands[2] is not constant.
 +

OK /Marcus


Next set of OpenACC changes

2015-05-05 Thread Thomas Schwinge
Hi!

In follow-up messages, I'll be posting the separated parts (for easier
review) of a next set of OpenACC changes that we'd like to commit.
ChangeLog updates not yet written; will do that before commit, obviously.

Overall diffstat:

 gcc/c-family/c-common.c|3 +-
 gcc/c-family/c-common.h|2 +
 gcc/c-family/c-omp.c   |  105 ++
 gcc/c-family/c-pragma.c|4 +
 gcc/c-family/c-pragma.h|   14 +-
 gcc/c/c-parser.c   | 1353 
 gcc/c/c-tree.h |3 +-
 gcc/c/c-typeck.c   |  112 +-
 gcc/cp/cp-gimplify.c   |3 +-
 gcc/cp/cp-tree.h   |3 +-
 gcc/cp/parser.c| 1382 +
 gcc/cp/parser.h|4 +
 gcc/cp/pt.c|   43 +-
 gcc/cp/semantics.c |  151 +-
 gcc/fortran/dump-parse-tree.c  |   12 +-
 gcc/fortran/gfortran.h |   50 +-
 gcc/fortran/match.h|1 +
 gcc/fortran/openmp.c   |  581 +--
 gcc/fortran/parse.c|   65 +-
 gcc/fortran/parse.h|2 +-
 gcc/fortran/resolve.c  |5 +
 gcc/fortran/st.c   |7 +
 gcc/fortran/trans-decl.c   |   62 +-
 gcc/fortran/trans-openmp.c |   66 +-
 gcc/fortran/trans-stmt.c   |7 +-
 gcc/fortran/trans-stmt.h   |2 +-
 gcc/fortran/trans.c|2 +
 gcc/gimplify.c |   16 +-
 gcc/omp-low.c  |   11 +-
 gcc/testsuite/c-c++-common/goacc-gomp/nesting-1.c  |   46 +
 .../c-c++-common/goacc-gomp/nesting-fail-1.c   |   25 -
 gcc/testsuite/c-c++-common/goacc/asyncwait-1.c |4 +-
 gcc/testsuite/c-c++-common/goacc/data-2.c  |   12 +-
 gcc/testsuite/c-c++-common/goacc/declare-1.c   |   84 +
 gcc/testsuite/c-c++-common/goacc/declare-2.c   |   67 +
 gcc/testsuite/c-c++-common/goacc/dtype-1.c |  113 ++
 gcc/testsuite/c-c++-common/goacc/dtype-2.c |   31 +
 gcc/testsuite/c-c++-common/goacc/host_data-1.c |   14 +
 gcc/testsuite/c-c++-common/goacc/host_data-2.c |   14 +
 gcc/testsuite/c-c++-common/goacc/host_data-3.c |   16 +
 gcc/testsuite/c-c++-common/goacc/host_data-4.c |   15 +
 gcc/testsuite/c-c++-common/goacc/kernels-1.c   |6 -
 gcc/testsuite/c-c++-common/goacc/kernels-empty.c   |6 +
 gcc/testsuite/c-c++-common/goacc/kernels-eternal.c |   11 +
 .../c-c++-common/goacc/kernels-noreturn.c  |   12 +
 gcc/testsuite/c-c++-common/goacc/loop-1.c  |2 -
 gcc/testsuite/c-c++-common/goacc/parallel-1.c  |6 -
 gcc/testsuite/c-c++-common/goacc/parallel-empty.c  |6 +
 .../c-c++-common/goacc/parallel-eternal.c  |   11 +
 .../c-c++-common/goacc/parallel-noreturn.c |   12 +
 gcc/testsuite/c-c++-common/goacc/reduction-1.c |   25 +-
 gcc/testsuite/c-c++-common/goacc/reduction-2.c |   22 +-
 gcc/testsuite/c-c++-common/goacc/reduction-3.c |   22 +-
 gcc/testsuite/c-c++-common/goacc/reduction-4.c |   40 +-
 gcc/testsuite/c-c++-common/goacc/routine-1.c   |   35 +
 gcc/testsuite/c-c++-common/goacc/routine-2.c   |   36 +
 gcc/testsuite/c-c++-common/goacc/routine-3.c   |   52 +
 gcc/testsuite/c-c++-common/goacc/routine-4.c   |   87 ++
 gcc/testsuite/c-c++-common/goacc/tile.c|   26 +
 gcc/testsuite/g++.dg/goacc/template-reduction.C|  100 ++
 gcc/testsuite/g++.dg/goacc/template.C  |  131 ++
 gcc/testsuite/gfortran.dg/goacc/cache-1.f95|1 -
 gcc/testsuite/gfortran.dg/goacc/coarray.f95|2 +-
 gcc/testsuite/gfortran.dg/goacc/coarray_2.f90  |1 +
 gcc/testsuite/gfortran.dg/goacc/combined_loop.f90  |2 +-
 gcc/testsuite/gfortran.dg/goacc/cray.f95   |1 -
 gcc/testsuite/gfortran.dg/goacc/declare-1.f95  |3 +-
 gcc/testsuite/gfortran.dg/goacc/declare-2.f95  |   44 +
 gcc/testsuite/gfortran.dg/goacc/default.f95|   17 +
 gcc/testsuite/gfortran.dg/goacc/dtype-1.f95|  161 ++
 gcc/testsuite/gfortran.dg/goacc/dtype-2.f95|   39 +
 gcc/testsuite/gfortran.dg/goacc/host_data-tree.f95 |2 +-
 gcc/testsuite/gfortran.dg/goacc/loop-1.f95 |1 -
 gcc/testsuite/gfortran.dg/goacc/loop-2.f95 |   26 +-
 gcc/testsuite/gfortran.dg/goacc/modules.f95|   55 +
 gcc/testsuite/gfortran.dg/goacc/parameter.f95  |1 -
 gcc/testsuite/gfortran.dg/goacc/update.f95 | 

RE: [RFC: Patch, PR 60158] gcc/varasm.c : Pass actual alignment value to output_constant_pool_2

2015-05-05 Thread rohitarul...@freescale.com


 -Original Message-
 From: Jeff Law [mailto:l...@redhat.com]
 Sent: Thursday, April 30, 2015 9:14 PM
 On 04/30/2015 09:34 AM, rohitarul...@freescale.com wrote:
 
 
  -Original Message-
  From: Jeff Law [mailto:l...@redhat.com]
  Sent: Thursday, April 30, 2015 8:32 PM
  To: Dharmakan Rohit-B30502; gcc-patches@gcc.gnu.org;
  rguent...@suse.de; Jakub Jelinek
  Cc: Alan Modra; David Edelsohn; Wienskoski Edmar-RA8797
  Subject: Re: [RFC: Patch, PR 60158] gcc/varasm.c : Pass actual
  alignment value to output_constant_pool_2
 
  On 04/29/2015 04:30 AM, rohitarul...@freescale.com wrote:
 
 
  Jeff, I have made the changes as per your comments and attached the
  patch.
  If the patch is OK, I will proceed with the regression tests.
  This patch refers back to 60158 and based on what I see in 60158, it
  appears I should be looking for a .data.rel.ro.local section which
  contains the address of a string constant.  But the constants are
  being put into .rodata.str1.4.  And if the issue is we're putting
  bits into the wrong section and don't have an appropriate .fixup
  section, then ISTM that the test should be compiled, then objdump used
 to verify the sections and/or relocations.
 
  An additional concern is that I get the same code for the included
  testcase with or without your changes.  This is with a
  powerpc-softfloat-linux-gnuspe configured compiler -- which matches
 what I saw in pr 60158.
 
  So while the patch seems reasonable, I'm concerned that I've been
  unable to show it changing anything.
 
  Thoughts?
 
 
  Jeff, the issue is still reproducible with GCC v4.8 branch but not with GCC
 v4.9 or trunk.
 Was it fixed by some other approach or has the bug gone latent?
 Obviously if the former, then the patch is only relevant to gcc-4.8 if the 
 latter,
 then we'll still want to get it fixed on the trunk and possibly in the release
 branches.
 
 Can you please investigate if the bug has been fixed by other means or if it's
 just gone latent on the trunk?

Jeff, 

Just to summarize:
By default in GCC v4.7.x, all the constants are put into '.rodata.str1.4' 
section. In GCC v4.8.x from r192719 onwards, one of the move instruction of the 
string constant .LC0 is getting spilled. The reload pass, for any constants 
that aren't allowed and can't be reloaded in to registers tries to change them 
into memory references. Then while emitting that string constant to asm code 
(A:varasm.c: output_constant_pool_1), it explicitly passes the alignment as 1 
which prevents the generation of fix-up table entries in  'B: 
rs6000.c:rs6000_assemble_integer' because the data is considered unaligned now.

The bug seems to have gone latent with an unrelated trunk commit r204695 [* 
tree-ssa-loop-ivopts.c (force_expr_to_var_cost): Refactor the code. Handle type 
conversion.]. This commit chooses different spill candidates hence all the 
string constants are being put in to '.rodata.str1.4´section.

The check I had in the test case is that if there is a '.data.rel.ro.local', 
then there should be '.fixup' section generated.

Please let me know if you need any other details.

Regards,
Rohit


Re: [PATCH, AArch64] Add Cortex-A53 erratum 843419 configure-time option

2015-05-05 Thread Marcus Shawcroft
On 4 May 2015 at 09:58, Yvan Roux yvan.r...@linaro.org wrote:

 Yes this is a better formulation.

 +corresponding flag to the linker.  It can be explicitly disabled
 during
 +compilation by passing the @option{-mno-fix-cortex-a53-835769} option.

 Copy paste error here with the previous errata number.

 Here is the patch with the modifications.  Is it needed to backport it
 into 4.9 and 5.1 branches ?


This is OK for trunk.  The back ports make sense, but leave it 48
hours, if there are no objections then go ahead.

Cheers
/Marcus


Re: [PATCH, ARM] Fix testcases that require Thumb2 effective target.

2015-05-05 Thread Kyrill Tkachov


On 04/05/15 10:14, Yvan Roux wrote:

Hi,

This patch fixes two ARM testcases that require target to be Thumb2
effective.  One is built for Cortex-m3, the purpose of the second one
is to generate thumb2_addsi3_compare0_scratch insn and both are
failing when compiled for armv5t for instance.

Built and regtested, is it OK for trunk ?


Ok.
Thanks,
Kyrill



Thanks,
Yvan

2015-05-04  Yvan Roux  yvan.r...@linaro.org

 * gcc.target/arm/pr65067.c: Require Thumb2 effective target.
 * gcc.target/arm/pr65924.c: Likewise.




Next set of OpenACC changes: middle end, libgomp

2015-05-05 Thread Thomas Schwinge
Hi!

On Tue, 05 May 2015 10:54:02 +0200, I wrote:
 In follow-up messages, I'll be posting the separated parts (for easier
 review) of a next set of OpenACC changes that we'd like to commit.
 ChangeLog updates not yet written; will do that before commit, obviously.

 gcc/gimplify.c |   16 +-
 gcc/omp-low.c  |   11 +-
 gcc/tree-core.h|   14 +-
 gcc/tree-pretty-print.c|6 +
 gcc/tree.c |   13 +-
 gcc/tree.h |   21 +-
 include/gomp-constants.h   |4 +
 libgomp/oacc-mem.c |3 +
 libgomp/oacc-ptx.h |   28 +
 libgomp/plugin/plugin-nvptx.c  |   10 +

diff --git gcc/gimplify.c gcc/gimplify.c
index bda62ce..12efdc8 100644
--- gcc/gimplify.c
+++ gcc/gimplify.c
@@ -6385,6 +6385,7 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq 
*pre_p,
case OMP_CLAUSE_MERGEABLE:
case OMP_CLAUSE_PROC_BIND:
case OMP_CLAUSE_SAFELEN:
+   case OMP_CLAUSE_TILE:
  break;
 
case OMP_CLAUSE_ALIGNED:
@@ -6770,6 +6771,7 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, tree 
*list_p)
case OMP_CLAUSE_VECTOR:
case OMP_CLAUSE_AUTO:
case OMP_CLAUSE_SEQ:
+   case OMP_CLAUSE_TILE:
  break;
 
default:
@@ -8410,21 +8412,7 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, 
gimple_seq *post_p,
  break;
 
case OACC_KERNELS:
- if (OACC_KERNELS_COMBINED (*expr_p))
-   sorry (directive not yet implemented);
- else
-   gimplify_omp_workshare (expr_p, pre_p);
- ret = GS_ALL_DONE;
- break;
-
case OACC_PARALLEL:
- if (OACC_PARALLEL_COMBINED (*expr_p))
-   sorry (directive not yet implemented);
- else
-   gimplify_omp_workshare (expr_p, pre_p);
- ret = GS_ALL_DONE;
- break;
-
case OACC_DATA:
case OMP_SECTIONS:
case OMP_SINGLE:
diff --git gcc/omp-low.c gcc/omp-low.c
index 34e2e5c..6ec5145 100644
--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -1928,6 +1928,9 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
case OMP_CLAUSE_INDEPENDENT:
case OMP_CLAUSE_AUTO:
case OMP_CLAUSE_SEQ:
+   case OMP_CLAUSE_BIND:
+   case OMP_CLAUSE_NOHOST:
+   case OMP_CLAUSE_TILE:
  sorry (Clause not supported yet);
  break;
 
@@ -2055,6 +2058,9 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
case OMP_CLAUSE_INDEPENDENT:
case OMP_CLAUSE_AUTO:
case OMP_CLAUSE_SEQ:
+   case OMP_CLAUSE_BIND:
+   case OMP_CLAUSE_NOHOST:
+   case OMP_CLAUSE_TILE:
  sorry (Clause not supported yet);
  break;
 
@@ -2742,7 +2748,10 @@ check_omp_nesting_restrictions (gimple stmt, omp_context 
*ctx)
 {
   for (omp_context *ctx_ = ctx; ctx_ != NULL; ctx_ = ctx_-outer)
if (is_gimple_omp (ctx_-stmt)
-is_gimple_omp_oacc (ctx_-stmt))
+is_gimple_omp_oacc (ctx_-stmt)
+   /* Except for atomic codes that we share with OpenMP.  */
+! (gimple_code (stmt) == GIMPLE_OMP_ATOMIC_LOAD
+ || gimple_code (stmt) == GIMPLE_OMP_ATOMIC_STORE))
  {
error_at (gimple_location (stmt),
  non-OpenACC construct inside of OpenACC region);
diff --git gcc/tree-core.h gcc/tree-core.h
index ad1bb23..ffbccda 100644
--- gcc/tree-core.h
+++ gcc/tree-core.h
@@ -390,7 +390,19 @@ enum omp_clause_code {
   OMP_CLAUSE_NUM_WORKERS,
 
   /* OpenACC clause: vector_length (integer-expression).  */
-  OMP_CLAUSE_VECTOR_LENGTH
+  OMP_CLAUSE_VECTOR_LENGTH,
+
+  /* OpenACC clause: bind ( identifer | string ).  */
+  OMP_CLAUSE_BIND,
+
+  /* OpenACC clause: nohost.  */
+  OMP_CLAUSE_NOHOST,
+
+  /* OpenACC clause: tile ( size-expr-list ).  */
+  OMP_CLAUSE_TILE,
+
+  /* OpenACC clause: device_type ( device-type-list).  */
+  OMP_CLAUSE_DEVICE_TYPE
 };
 
 #undef DEFTREESTRUCT
diff --git gcc/tree-pretty-print.c gcc/tree-pretty-print.c
index d7c049f..5eb4daf 100644
--- gcc/tree-pretty-print.c
+++ gcc/tree-pretty-print.c
@@ -799,6 +799,12 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, 
int flags)
 case OMP_CLAUSE_INDEPENDENT:
   pp_string (pp, independent);
   break;
+case OMP_CLAUSE_TILE:
+  pp_string (pp, tile();
+  dump_generic_node (pp, OMP_CLAUSE_TILE_LIST (clause),
+spc, flags, false);
+  pp_right_paren (pp);
+  break;
 
 default:
   /* Should never happen.  */
diff --git gcc/tree.c gcc/tree.c
index daf0292..43f80b7 100644
--- gcc/tree.c
+++ gcc/tree.c
@@ -369,6 +369,10 @@ unsigned const char omp_clause_num_ops[] =
   1, /* OMP_CLAUSE_NUM_GANGS  */
   1, /* OMP_CLAUSE_NUM_WORKERS  */
   1, /* 

Re: [PATCH, i386]: Switch x86 to TARGET_SUPPORTS_WIDE_INT

2015-05-05 Thread Uros Bizjak
On Tue, May 5, 2015 at 8:10 PM, Richard Biener
richard.guent...@gmail.com wrote:

 I've checked spec2000 performance.
 Only few spec binaries differ. Anyway performance is unchanged.

Thanks, that confirms expectations.

 Are changes in generated code expected at all?

The change by itself should have no effect on the code, but it is a
big change, so some unrelated perturbations in the code are expected.

Uros.


Go patch committed: Don't make temporaries for string concatenation of constants

2015-05-05 Thread Ian Lance Taylor
This patch from Chris Manghane changes the Go frontend to not use
temporaries for constants when doing string concatenation.  This fixes
http://golang.org/issue/10642.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline and GCC 5 branch.

Ian
diff -r b2bef4b0764a go/expressions.cc
--- a/go/expressions.cc Tue May 05 09:38:04 2015 -0700
+++ b/go/expressions.cc Tue May 05 13:50:49 2015 -0700
@@ -5120,13 +5120,15 @@
   if (this-left_-type()-is_string_type()
this-op_ == OPERATOR_PLUS)
 {
-  if (!this-left_-is_variable())
+  if (!this-left_-is_variable()
+  !this-left_-is_constant())
 {
   temp = Statement::make_temporary(NULL, this-left_, loc);
   inserter-insert(temp);
   this-left_ = Expression::make_temporary_reference(temp, loc);
 }
-  if (!this-right_-is_variable())
+  if (!this-right_-is_variable()
+  !this-right_-is_constant())
 {
   temp =
   Statement::make_temporary(this-left_-type(), this-right_, 
loc);


Re: libgo patch committed: Permit nil Func in Func.Name

2015-05-05 Thread Ian Lance Taylor
On Tue, May 5, 2015 at 10:35 AM, Andreas Schwab sch...@linux-m68k.org wrote:
 Ian Lance Taylor i...@golang.org writes:

 Committed to mainline and 4.9 branch.

 What about the 5 branch?

Oh yeah.  Thanks.  Committing to GCC 5 branch shortly, after tests complete.

Ian


Re: [PATCH, i386]: Switch x86 to TARGET_SUPPORTS_WIDE_INT

2015-05-05 Thread Richard Biener
On May 5, 2015 6:16:14 PM GMT+02:00, Uros Bizjak ubiz...@gmail.com wrote:
On Tue, May 5, 2015 at 2:23 PM, Evgeny Stupachenko evstu...@gmail.com
wrote:

 I've checked spec2000 performance.
 Only few spec binaries differ. Anyway performance is unchanged.

Thanks, that confirms expectations.

Are changes in generated code expected at all?

Richard.

Patch committed.

Uros.




Patch ping

2015-05-05 Thread Jakub Jelinek
Hi!

http://gcc.gnu.org/ml/gcc-patches/2015-04/msg01432.html
  - this got approved for arm and aarch64, but not for s390{,x}
  Ok for trunk?

Jakub


[PATCH 5/4] libcpp: Eliminate most of the non-const/reference-returning inline fns

2015-05-05 Thread David Malcolm
gcc/java/ChangeLog:
* jcf-parse.c (set_source_filename): Replace write through
ORDINARY_MAP_FILE_NAME with direct access to to_file.

libcpp/ChangeLog:
* include/line-map.h (MAP_START_LOCATION): Eliminate the non-const
variant, and tweak comment for the const variant.
(ORDINARY_MAP_STARTING_LINE_NUMBER): Drop the non-const variant.
(ORDINARY_MAP_INCLUDER_FILE_INDEX): Likewise.
(ORDINARY_MAP_IN_SYSTEM_HEADER_P): Likewise.
(SET_ORDINARY_MAP_NUMBER_OF_COLUMN_BITS): Delete.
(ORDINARY_MAP_FILE_NAME): Drop the non-const variant.
(MACRO_MAP_MACRO): Likewise.
(MACRO_MAP_NUM_MACRO_TOKENS): Likewise.
(MACRO_MAP_LOCATIONS): Likewise.
(MACRO_MAP_EXPANSION_POINT_LOCATION): Likewise.
* line-map.c (linemap_add): Replace writes through macros with
direct field accesses.
(linemap_enter_macro): Likewise.
(linemap_line_start): Likewise.
---
 gcc/java/jcf-parse.c  |  2 +-
 libcpp/include/line-map.h | 86 +--
 libcpp/line-map.c | 29 
 3 files changed, 16 insertions(+), 101 deletions(-)

diff --git a/gcc/java/jcf-parse.c b/gcc/java/jcf-parse.c
index e609331..d1c9fc4 100644
--- a/gcc/java/jcf-parse.c
+++ b/gcc/java/jcf-parse.c
@@ -374,7 +374,7 @@ set_source_filename (JCF *jcf, int index)
 }
   
   sfname = find_sourcefile (sfname);
-  ORDINARY_MAP_FILE_NAME (LINEMAPS_LAST_ORDINARY_MAP (line_table)) = sfname;
+  LINEMAPS_LAST_ORDINARY_MAP (line_table)-to_file = sfname;
   if (current_class == main_class) main_input_filename = sfname;
 }
 
diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index 95a1d85..0530072 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -343,7 +343,7 @@ linemap_check_macro (const line_map *map)
   return (const line_map_macro *)map;
 }
 
-/* Read the start location of MAP, as an rvalue.  */
+/* Read the start location of MAP.  */
 
 inline source_location
 MAP_START_LOCATION (const line_map *map)
@@ -351,15 +351,6 @@ MAP_START_LOCATION (const line_map *map)
   return map-start_location;
 }
 
-/* Access the start location of MAP as a reference
-   (e.g. as an lvalue).  */
-
-inline source_location
-MAP_START_LOCATION (line_map *map)
-{
-  return map-start_location;
-}
-
 /* Get the starting line number of ordinary map MAP.  */
 
 inline linenum_type
@@ -368,15 +359,6 @@ ORDINARY_MAP_STARTING_LINE_NUMBER (const line_map_ordinary 
*ord_map)
   return ord_map-to_line;
 }
 
-/* Access the starting line number of ordinary map MAP by
-   reference (e.g. as an lvalue).  */
-
-inline linenum_type
-ORDINARY_MAP_STARTING_LINE_NUMBER (line_map_ordinary *ord_map)
-{
-  return ord_map-to_line;
-}
-
 /* Get the index of the ordinary map at whose end
ordinary map MAP was included.
 
@@ -388,14 +370,6 @@ ORDINARY_MAP_INCLUDER_FILE_INDEX (const line_map_ordinary 
*ord_map)
   return ord_map-included_from;
 }
 
-/* As above, but by reference (e.g. as an lvalue).  */
-
-inline int
-ORDINARY_MAP_INCLUDER_FILE_INDEX (line_map_ordinary *ord_map)
-{
-  return ord_map-included_from;
-}
-
 /* Return a positive value if map encodes locations from a system
header, 0 otherwise. Returns 1 if ordinary map MAP encodes locations
in a system header and 2 if it encodes locations in a C system header
@@ -407,14 +381,6 @@ ORDINARY_MAP_IN_SYSTEM_HEADER_P (const line_map_ordinary 
*ord_map)
   return ord_map-sysp;
 }
 
-/* As above, but by reference (e.g. as an lvalue).  */
-
-inline unsigned char 
-ORDINARY_MAP_IN_SYSTEM_HEADER_P (line_map_ordinary *ord_map)
-{
-  return ord_map-sysp;
-}
-
 /* Get the number of the low-order source_location bits used for a
column number within ordinary map MAP.  */
 
@@ -424,16 +390,6 @@ ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (const 
line_map_ordinary *ord_map)
   return ord_map-column_bits;
 }
 
-/* Set the number of the low-order source_location bits used for a
-   column number within ordinary map MAP.  */
-
-inline void
-SET_ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (line_map_ordinary *ord_map,
-   int col_bits)
-{
-  ord_map-column_bits = col_bits;
-}
-
 /* Get the filename of ordinary map MAP.  */
 
 inline const char *
@@ -442,14 +398,6 @@ ORDINARY_MAP_FILE_NAME (const line_map_ordinary *ord_map)
   return ord_map-to_file;
 }
 
-/* As above, but by reference (e.g. as an lvalue).  */
-
-inline const char * 
-ORDINARY_MAP_FILE_NAME (line_map_ordinary *ord_map)
-{
-  return ord_map-to_file;
-}
-
 /* Get the cpp macro whose expansion gave birth to macro map MAP.  */
 
 inline cpp_hashnode *
@@ -458,14 +406,6 @@ MACRO_MAP_MACRO (const line_map_macro *macro_map)
   return macro_map-macro;
 }
 
-/* As above, but by reference (e.g. as an lvalue).  */
-
-inline cpp_hashnode * 
-MACRO_MAP_MACRO (line_map_macro *macro_map)
-{
-  return macro_map-macro;
-}
-
 /* Get the number of tokens inside the replacement-list of the 

Re: [PATCH, x86] Add TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE hook

2015-05-05 Thread Christian Bruel
thanks for the validation and the confirmation that iinline-attr.c is
now fixed on aarch64. I can now send the patch for submission request
(this one was just illustrative).

thanks

Christian

On 05/05/2015 11:10 AM, Yvan Roux wrote:
 Hi Christian,
 
 On 4 May 2015 at 11:29, Christian Bruel christian.br...@st.com wrote:

 Hi Christian,
 I noticed case gcc.dg/ipa/iinline-attr.c failed on aarch64.  The
 original patch is x86 specific, while the case is added as general
 one.  Could you please have a look at this?

 FAIL: gcc.dg/ipa/iinline-attr.c scan-ipa-dump inline
 hooray[^\\n]*inline copy in test


 that is the same latent bug for aarch64:  alignment flags are not
 propagated with attribute optimize (O2).

 testing attached patch
 
 The patch looks good to me, maybe you can just fix the original typo
 on optimizing in the comments while moving the code.  I've
 bootstrapped and regtested it on aarch64-linux-gnu with the same
 target testcase you added on i386 (as we discussed offline) and
 everything is ok (gcc.dg/ipa/iinline-attr.c now PASS).  Notice that I
 can't approve your patch.
 
 Cheers,
 Yvan
 


Re: [PATCH,testsuite] Xfail gcc.dg/tree-ssa/stdarg-2.c f15 scans

2015-05-05 Thread Richard Biener
On May 5, 2015 11:15:11 AM GMT+02:00, Tom de Vries tom_devr...@mentor.com 
wrote:
Hi,

after checking in the 'postpone expanding va_arg till pass_stdarg'
patch series, 
some scans related to function f15 in gcc.dg/tree-ssa/stdarg-2.c have
started 
failing.

[ The committed patch series contained a modification of stdarg-2.c,
but that 
seems to be not complete and not correct. ]

F.i., for s390 we find at 
https://gcc.gnu.org/ml/gcc-testresults/2015-05/msg00507.html:
...
FAIL: gcc.dg/tree-ssa/stdarg-2.c scan-tree-dump stdarg f15: va_list
escapes 0, 
needs to save 1 GPR units and 2 FPR units
...

Furthermore, at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64950#c11
it was 
noted that:
...
Unfortunately, the gcc.dg/tree-ssa/stdarg-2.c part of the patch is
wrong: the 
test now FAILs on i686-unknown-linux-gnu, i686-apple-darwin, and
i386-pc-solaris 
with -m64: both dumps (i.e. -m32 and -m64) contain

m32/stdarg-2.c.084t.stdarg:f15: va_list escapes 1, needs to save all
GPR units 
and all FPR units.
m64/stdarg-2.c.084t.stdarg:f15: va_list escapes 1, needs to save all
GPR units 
and all FPR units.
...

I've filed two PRs:
- PR66010 '[6 Regression] Missed optimization after inlining va_list
parameter'
   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66010
- PR66013 'Missed optimization after inlining va_list parameter, -m32
case'
   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66013
to track these errors.

AFAIU now from the investigations in those PRs, we can expect all f15
scans that 
check for the presence of 'va_list escapes 0' to fail.

I'd like to commit two patches.
- The first patch undoes the modification of stdarg-2.c as committed in
the
   original patch series (omitted).
- The second patch adds appropriate xfails (attached).

OK for trunk?

OK.

Thanks,
Richard.

Thanks,
- Tom




Re: [PATCH][AArch64] Add branch-cost to cpu tuning information.

2015-05-05 Thread Matthew Wahab

On 01/05/15 10:18, Marcus Shawcroft wrote:

On 21 April 2015 at 15:00, Matthew Wahab matthew.wa...@arm.com wrote:

+int aarch64_branch_cost (bool, bool);
+

You would never guess looking at this .h today, but long ago there was
something close to alphabetical order by function name in place.
Please lift this definition between aarch64_bitmask_imm and
aarch64_classify_symbolic_expression.

+int
+aarch64_branch_cost (bool speed_p, bool predictable_p)
+{

Add an appropriate comment before the function please.


Attached reworked patch:

- Moved declaration of aarch64_branch_cost to after aarch64_bitmask_imm.
- Added comment before definition of aarch64_branch_cost.

Tested aarch64-none-linux-gnu with gcc-check.

Ok for trunk?
Matthew

2015-05-05  Matthew Wahab  matthew.wa...@arm.com

* gcc/config/aarch64-protos.h (struct cpu_branch_cost): New.
(tune_params): Add field branch_costs.
(aarch64_branch_cost): Declare.
* gcc/config/aarch64.c (generic_branch_cost): New.
(generic_tunings): Set field cpu_branch_cost to generic_branch_cost.
(cortexa53_tunings): Likewise.
(cortexa57_tunings): Likewise.
(thunderx_tunings): Likewise.
(xgene1_tunings): Likewise.
(aarch64_branch_cost): Define.
* gcc/config/aarch64/aarch64.h (BRANCH_COST): Redefine.

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 08ce5f1..931c8b8 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -162,12 +162,20 @@ struct cpu_vector_cost
   const int cond_not_taken_branch_cost;  /* Cost of not taken branch.  */
 };
 
+/* Branch costs.  */
+struct cpu_branch_cost
+{
+  const int predictable;/* Predictable branch or optimizing for size.  */
+  const int unpredictable;  /* Unpredictable branch or optimizing for speed.  */
+};
+
 struct tune_params
 {
   const struct cpu_cost_table *const insn_extra_cost;
   const struct cpu_addrcost_table *const addr_cost;
   const struct cpu_regmove_cost *const regmove_cost;
   const struct cpu_vector_cost *const vec_costs;
+  const struct cpu_branch_cost *const branch_costs;
   const int memmov_cost;
   const int issue_rate;
   const unsigned int fuseable_ops;
@@ -184,6 +192,7 @@ struct tune_params
 HOST_WIDE_INT aarch64_initial_elimination_offset (unsigned, unsigned);
 int aarch64_get_condition_code (rtx);
 bool aarch64_bitmask_imm (HOST_WIDE_INT val, machine_mode);
+int aarch64_branch_cost (bool, bool);
 enum aarch64_symbol_type
 aarch64_classify_symbolic_expression (rtx, enum aarch64_symbol_context);
 bool aarch64_const_vec_all_same_int_p (rtx, HOST_WIDE_INT);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 374b0a9..7bc28ae 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -340,12 +340,20 @@ static const struct cpu_vector_cost xgene1_vector_cost =
 #define AARCH64_FUSE_ADRP_LDR	(1  3)
 #define AARCH64_FUSE_CMP_BRANCH	(1  4)
 
+/* Generic costs for branch instructions.  */
+static const struct cpu_branch_cost generic_branch_cost =
+{
+  2,  /* Predictable.  */
+  2   /* Unpredictable.  */
+};
+
 static const struct tune_params generic_tunings =
 {
   cortexa57_extra_costs,
   generic_addrcost_table,
   generic_regmove_cost,
   generic_vector_cost,
+  generic_branch_cost,
   4, /* memmov_cost  */
   2, /* issue_rate  */
   AARCH64_FUSE_NOTHING, /* fuseable_ops  */
@@ -365,6 +373,7 @@ static const struct tune_params cortexa53_tunings =
   generic_addrcost_table,
   cortexa53_regmove_cost,
   generic_vector_cost,
+  generic_branch_cost,
   4, /* memmov_cost  */
   2, /* issue_rate  */
   (AARCH64_FUSE_MOV_MOVK | AARCH64_FUSE_ADRP_ADD
@@ -385,6 +394,7 @@ static const struct tune_params cortexa57_tunings =
   cortexa57_addrcost_table,
   cortexa57_regmove_cost,
   cortexa57_vector_cost,
+  generic_branch_cost,
   4, /* memmov_cost  */
   3, /* issue_rate  */
   (AARCH64_FUSE_MOV_MOVK | AARCH64_FUSE_ADRP_ADD
@@ -405,6 +415,7 @@ static const struct tune_params thunderx_tunings =
   generic_addrcost_table,
   thunderx_regmove_cost,
   generic_vector_cost,
+  generic_branch_cost,
   6, /* memmov_cost  */
   2, /* issue_rate  */
   AARCH64_FUSE_CMP_BRANCH, /* fuseable_ops  */
@@ -424,6 +435,7 @@ static const struct tune_params xgene1_tunings =
   xgene1_addrcost_table,
   xgene1_regmove_cost,
   xgene1_vector_cost,
+  generic_branch_cost,
   6, /* memmov_cost  */
   4, /* issue_rate  */
   AARCH64_FUSE_NOTHING, /* fuseable_ops  */
@@ -5409,6 +5421,23 @@ aarch64_address_cost (rtx x,
   return cost;
 }
 
+/* Return the cost of a branch.  If SPEED_P is true then the compiler is
+   optimizing for speed.  If PREDICTABLE_P is true then the branch is predicted
+   to be taken.  */
+
+int
+aarch64_branch_cost (bool speed_p, bool predictable_p)
+{
+  /* When optimizing for speed, use the cost of unpredictable branches.  */
+  const struct cpu_branch_cost *branch_costs =
+

Re: [PR testsuite/65205, libgomp/65993] Fix dg-shouldfail usage in OpenACC libgomp tests

2015-05-05 Thread Thomas Schwinge
Hi!

On Mon, 4 May 2015 10:20:14 -0400, John David Anglin dave.ang...@bell.net 
wrote:
 On 2015-05-04 4:32 AM, Thomas Schwinge wrote:
  Dave, would you please test the following patch, and report the
  regression status compared to before r222620?  (Compared to your existing
  r222021 results, as posted in the PR, for example.)
 With patch, we have the following fails on hppa2.0w-hp-hpux11.11:

Thanks for testing!

 FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/lib-3.c 
 -DACC_DEVICE_TYPE_host
 =1 -DACC_MEM_SHARED=1 output pattern test, is
 libgomp: no device found
 , should match device [0-9]+\([0-9]+\) is initialized
 FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/lib-42.c 
 -DACC_DEVICE_TYPE_hos
 t=1 -DACC_MEM_SHARED=1 output pattern test, is , should match 
 \[[0-9a-fA-FxX]+,2
 56\] is not mapped

These are known issues.

 FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/lib-62.c 
 -DACC_DEVICE_TYPE_hos
 t=1 -DACC_MEM_SHARED=1 output pattern test, is , should match invalid size

With this one I'll need your help: please cite from libgomp.log (or, from
a manual run) the actual output message that you're getting.


Committed in r222799:

commit 92d46f1ab3a56ffa0deb412141fbbcbe7036f61d
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Tue May 5 09:39:29 2015 +

[PR testsuite/65205, libgomp/65993] Fix dg-shouldfail usage in OpenACC 
libgomp tests

In dg-output, don't expect 0x prefix for %p format specifier, don't 
expect
(nil) for NULL pointer.

PR testsuite/65205
PR libgomp/65993
libgomp/
* testsuite/libgomp.oacc-c-c++-common/clauses-2.c: In dg-output,
don't expect 0x prefix for %p format specifier, don't expect
(nil) for NULL pointer.
* testsuite/libgomp.oacc-c-c++-common/lib-16.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-17.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-18.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-20.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-21.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-22.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-23.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-25.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-26.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-27.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-28.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-29.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-30.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-34.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-35.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-36.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-39.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-40.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-42.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-43.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-44.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-47.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-48.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-52.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-53.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-54.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-57.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-58.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/data-already-1.c: More
accurately specify what we're looking for.
* testsuite/libgomp.oacc-c-c++-common/data-already-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/data-already-8.c: Likewise.
* testsuite/libgomp.oacc-fortran/data-already-1.f: Likewise.
* testsuite/libgomp.oacc-fortran/data-already-2.f: Likewise.
* testsuite/libgomp.oacc-fortran/data-already-8.f: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@222799 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog  |   43 
 .../libgomp.oacc-c-c++-common/clauses-2.c  |2 +-
 .../libgomp.oacc-c-c++-common/data-already-1.c |4 +-
 .../libgomp.oacc-c-c++-common/data-already-2.c |4 +-
 .../libgomp.oacc-c-c++-common/data-already-8.c |4 +-
 .../testsuite/libgomp.oacc-c-c++-common/lib-16.c   |2 +-
 .../testsuite/libgomp.oacc-c-c++-common/lib-17.c   |2 +-
 .../testsuite/libgomp.oacc-c-c++-common/lib-18.c   |2 +-
 .../testsuite/libgomp.oacc-c-c++-common/lib-20.c   |2 +-
 .../testsuite/libgomp.oacc-c-c++-common/lib-21.c   |2 +-
 .../testsuite/libgomp.oacc-c-c++-common/lib-22.c   |2 +-
 .../testsuite/libgomp.oacc-c-c++-common/lib-23.c   |2 +-
 

Re: [PATCH 0/3][AArch64] DImode vector compares

2015-05-05 Thread Alan Lawrence

Alan Lawrence wrote:

Hi,

Comparing 64x1 vector types (defined by hand or from arm_neon.h) using GCC 
vector extensions currently generates very poor assembly code, for example 
uint64x1_t foo (uint64x1_t a, uint64x1_t b) { return a = b; } generates (at -O3):


fmov x0, d0 // 22 movdi_aarch64/12 [length = 4]
fmov x1, d1 // 23 movdi_aarch64/12 [length = 4]
cmp x0, x1 // 10 cmpdi/1 [length = 4]
csinv x0, xzr, xzr, cc // 17 cmovdi_insn/3 [length = 4]
fmov d0, x0 // 24 *movdi_aarch64/11 [length = 4]
ret // 27 simple_return [length = 4]

Meaning that arm_neon.h instead has to use rather awkward forms like return 
(uint64x1_t) {__a[0] = __b[0] ? -1ll : 0ll}; to produce the desired assembly


cmhs d0, d0, d1
ret

This series adds vcond(u?)didi patterns for AArch64, to generate appropriate RTL 
from direct comparisons of 64x1 vectors (which are of DImode). However, as 
things stand, adding a vconddidi pattern causes an ICE in vector_compare_rtx 
(maybe_legitimize_operands), because a DImode constant-zero (vector or 
otherwise) is expanded as const0_rtx, which has mode VOIDmode. I tried quite a 
bit to generate an RTL const_vector, or even just something with mode DImode, 
but without success, hence the first patch fixes vector_compare_rtx to use the 
mode from the tree if necessary. (DImode vectors are specifically allowed by 
stor-layout.c, but no other platform defines vconddidi.)


Can I ping the AArch64 parts of this (patches 2+3)? These then provide the 
testcases requested by Jeff Law in his approval of the first patch 
(https://gcc.gnu.org/ml/gcc-patches/2015-05/msg00076.html).


Thanks, Alan



Re: Next set of OpenACC changes: Fortran

2015-05-05 Thread Bernhard Reutner-Fischer
On 5 May 2015 at 10:58, Thomas Schwinge tho...@codesourcery.com wrote:
 Hi!

 +/* Node in the linked list used for storing !$oacc declare constructs.  */

The clause is called $ACC declare, isn't it?


 +  for (oc = new_oc; oc; oc = oc-next)
 +{
 +  c = oc-clauses;
 +  for (n = c-lists[OMP_LIST_MAP]; n != NULL; n = n-next)
 +   n-sym-mark = 0;
 +}
 +
 +  for (oc = new_oc; oc; oc = oc-next)
 +{
 +  c = oc-clauses;
 +  for (n = c-lists[OMP_LIST_MAP]; n != NULL; n = n-next)
 +   {
 + if (n-sym-mark)
 +   {
 + gfc_error (Symbol %qs present on multiple clauses at %C,
 +n-sym-name);
 + return MATCH_ERROR;
 +   }
 + else
 +   n-sym-mark = 1;
 +   }
 +}
 +
 +  for (oc = new_oc; oc; oc = oc-next)
 +{
 +  c = oc-clauses;
 +  for (n = c-lists[OMP_LIST_MAP]; n != NULL; n = n-next)
 +   n-sym-mark = 1;
 +}

Much code for setting n-sym-mark = 1. What am i missing?

 +
 +  ns-oacc_declare = new_oc;
 +
return MATCH_YES;
  }

 @@ -1304,10 +1580,21 @@ match
  gfc_match_oacc_update (void)
  {
gfc_omp_clauses *c;
 -  if (gfc_match_omp_clauses (c, OACC_UPDATE_CLAUSES, false, false, true)
 +  locus here = gfc_current_locus;
 +
 +  if (gfc_match_omp_clauses (c, OACC_UPDATE_CLAUSES,
 +OACC_UPDATE_CLAUSE_DEVICE_TYPE_MASK, false,
 +false, true)
!= MATCH_YES)
  return MATCH_ERROR;

 +  if (!c-lists[OMP_LIST_MAP])
 +{
 +  gfc_error (%acc update% must contain at least one 
 +%device% or %host/self% clause at %L, here);
 +  return MATCH_ERROR;

$ACC UPDATE instead of %acc update % ?

 -  else if (code-ext.omp_clauses-gang
 -   code-ext.omp_clauses-worker
 -   code-ext.omp_clauses-vector)
 +  if (code-ext.omp_clauses-tile_list  code-ext.omp_clauses-gang
 +   code-ext.omp_clauses-worker  code-ext.omp_clauses-vector)


conditions on separate lines, please.

 -  for (list = OMP_LIST_DEVICE_RESIDENT;
 -   list = OMP_LIST_DEVICE_RESIDENT; list++)
 -for (n = ns-oacc_declare_clauses-lists[list]; n; n = n-next)
 -  {
 -   n-sym-mark = 0;
 -   if (n-sym-attr.flavor == FL_PARAMETER)
 - gfc_error (PARAMETER object %qs is not allowed at %L, 
 n-sym-name, loc);
 -  }
 +  for (list = OMP_LIST_DEVICE_RESIDENT;
 +  list = OMP_LIST_DEVICE_RESIDENT; list++)
 +   for (n = oc-clauses-lists[list]; n; n = n-next)
 + {
 +   n-sym-mark = 0;
 +   if (n-sym-attr.flavor == FL_PARAMETER)
 + gfc_error (PARAMETER object %qs is not allowed at %L,
 +n-sym-name, loc);
 + }

 -  for (list = OMP_LIST_DEVICE_RESIDENT;
 -   list = OMP_LIST_DEVICE_RESIDENT; list++)
 -for (n = ns-oacc_declare_clauses-lists[list]; n; n = n-next)
 -  {
 -   if (n-sym-mark)
 - gfc_error (Symbol %qs present on multiple clauses at %L,
 -n-sym-name, loc);
 -   else
 - n-sym-mark = 1;
 -  }
 +  for (list = OMP_LIST_DEVICE_RESIDENT;
 +   list = OMP_LIST_DEVICE_RESIDENT; list++)
 +   for (n = oc-clauses-lists[list]; n; n = n-next)
 + {
 +   if (n-sym-mark)
 + gfc_error (Symbol %qs present on multiple clauses at %L,
 +n-sym-name, loc);
 +   else
 + n-sym-mark = 1;
 + }

 -  for (n = ns-oacc_declare_clauses-lists[OMP_LIST_DEVICE_RESIDENT]; n;
 -   n = n-next)
 -check_array_not_assumed (n-sym, loc, DEVICE_RESIDENT);
 +  for (n = oc-clauses-lists[OMP_LIST_DEVICE_RESIDENT]; n; n = n-next)
 +   check_array_not_assumed (n-sym, loc, DEVICE_RESIDENT);
 +
 +  for (n = oc-clauses-lists[OMP_LIST_MAP]; n; n = n-next)
 +   {
 + if (n-expr  n-expr-ref-type == REF_ARRAY)
 + gfc_error (Subarray: %qs not allowed in $!ACC DECLARE at %L,
 +n-sym-name, loc);
 +   }
 +}
  }

The -mark setting looks complicated (as noted above)?

thanks,


[C++ Patch/RFC] PR 53184

2015-05-05 Thread Paolo Carlini

Hi,

per the audit trail, this issue appears to boil down to two separate issues:
- The warning doesn't appear universally useful, thus it would be nice 
to give it a name in order to enable disabling it.
- As shown by the testcase, sometimes the wording is misleading: it 
talks about 'anonymous namespace', where, as clarified by Jason in the 
trail, the issue is really about a type with no linkage, no namespace 
involved.


- The former is easy done, I picked: -Wsubobject-linkage. Makes sense?
- The latter is a little more tricky, because it doesn't seem always 
easy to tell one case from the other, in particular when templates are 
involved (eg, g++.dg/warn/anonymous-namespace-3.C) and the linkage issue 
involves template arguments. Given that the warning doesn't seem 
terribly important (as another data point, clang doesn't have it), so 
far I have conditionals which reliably figure out cases of anonymous 
namespace and cases of no linkage (per the testcase at issue, for 
example) and otherwise fall back to an 'or' wording. I hope the 
improvement is good enough. Alternately, I suppose the warning could use 
a completely different, more generic, wording, but in that case 
testcases like anonymous-namespace-3.C will need adjustment.


Tested x86_64-linux.

Thanks,
Paolo.


Index: c-family/c.opt
===
--- c-family/c.opt  (revision 222822)
+++ c-family/c.opt  (working copy)
@@ -913,6 +913,11 @@ Wuseless-cast
 C++ ObjC++ Var(warn_useless_cast) Warning
 Warn about useless casts
 
+Wsubobject-linkage
+C++ ObjC++ Var(warn_subobject_linkage) Warning Init(1)
+Warn if a class type has a base or a field whose type uses the anonymous
+namespace or depends on a type with no linkage
+
 ansi
 C ObjC C++ ObjC++
 A synonym for -std=c89 (for C) or -std=c++98 (for C++)
Index: cp/decl2.c
===
--- cp/decl2.c  (revision 222822)
+++ cp/decl2.c  (working copy)
@@ -2554,9 +2554,22 @@ constrain_class_visibility (tree type)
if (subvis == VISIBILITY_ANON)
  {
if (!in_main_input_context ())
- warning (0, \
+ {
+   if (MAYBE_CLASS_TYPE_P (ftype)
+decl_anon_ns_mem_p (TYPE_MAIN_DECL (ftype)))
+ warning (OPT_Wsubobject_linkage, \
 %qT has a field %qD whose type uses the anonymous namespace,
   type, t);
+   else if (no_linkage_check (ftype, /*relaxed_p=*/false))
+ warning (OPT_Wsubobject_linkage, \
+%qT has a field %qD whose type depends on a type with no linkage,
+  type, t);
+   else
+ warning (OPT_Wsubobject_linkage, \
+%qT has a field %qD whose type uses the anonymous namespace or depends on a 
+  type with no linkage,
+  type, t);
+ }
  }
else if (MAYBE_CLASS_TYPE_P (ftype)
  vis  VISIBILITY_HIDDEN
@@ -2574,9 +2587,21 @@ constrain_class_visibility (tree type)
   if (subvis == VISIBILITY_ANON)
 {
  if (!in_main_input_context())
-   warning (0, \
+   {
+ if (decl_anon_ns_mem_p (TYPE_MAIN_DECL (TREE_TYPE (t
+   warning (OPT_Wsubobject_linkage, \
 %qT has a base %qT whose type uses the anonymous namespace,
-type, TREE_TYPE (t));
+type, TREE_TYPE (t));
+ else if (no_linkage_check (TREE_TYPE (t), /*relaxed_p=*/false))
+   warning (OPT_Wsubobject_linkage, \
+%qT has a base %qT whose type depends on a type with no linkage,
+type, TREE_TYPE (t));
+ else
+   warning (OPT_Wsubobject_linkage, \
+%qT has a base %qT whose type uses the anonymous namespace or depends on a 
+type with no linkage,
+type, TREE_TYPE (t));
+   }
}
   else if (vis  VISIBILITY_HIDDEN
subvis = VISIBILITY_HIDDEN)
Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 222822)
+++ doc/invoke.texi (working copy)
@@ -278,7 +278,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wstrict-aliasing=n @gol -Wstrict-overflow -Wstrict-overflow=@var{n} @gol
 -Wsuggest-attribute=@r{[}pure@r{|}const@r{|}noreturn@r{|}format@r{]} @gol
 -Wsuggest-final-types @gol -Wsuggest-final-methods -Wsuggest-override @gol
--Wmissing-format-attribute @gol
+-Wmissing-format-attribute -Wsubobject-linkage @gol
 -Wswitch  -Wswitch-default  -Wswitch-enum -Wswitch-bool -Wsync-nand @gol
 -Wsystem-headers  -Wtrampolines  -Wtrigraphs  -Wtype-limits  -Wundef @gol
 -Wuninitialized  -Wunknown-pragmas  -Wno-pragmas @gol
@@ -4801,6 +4801,13 @@ types. @option{-Wconversion-null} is enabled by de
 Warn when a literal '0' is used as null pointer 

Re: [debug-early] fix problem with template parameter packs

2015-05-05 Thread Jason Merrill

On 05/04/2015 09:29 PM, Aldy Hernandez wrote:

The code handling parameter DIEs needed a little tweaking for variable
length template arguments.  I've relaxed the original assert, but this
may require tweaking at branch review time-- hopefully later this week.


What testcase motivated this?  We're within a formal_parameter_pack, but 
DECL_ABSTRACT is set, so I guess the earlier parm_die was from a 
declaration?  If we're going to re-use the individual parms, I'd think 
we want to reuse the pack as well.


Jason




Re: [debug-early] fix problem with template parameter packs

2015-05-05 Thread Aldy Hernandez

On 05/05/2015 02:08 PM, Jason Merrill wrote:

On 05/04/2015 09:29 PM, Aldy Hernandez wrote:

The code handling parameter DIEs needed a little tweaking for variable
length template arguments.  I've relaxed the original assert, but this
may require tweaking at branch review time-- hopefully later this week.


What testcase motivated this?  We're within a formal_parameter_pack, but


Pretty much every other test in the libstdc++-v3 testsuite was failing 
with the ICE I elided in my patch.


I wasn't able to narrow it down to a tiny test, but I can do so if you 
want. ??


Aldy


Re: [PATCH] Fix eipa_sra AAPCS issue (PR target/65956)

2015-05-05 Thread Richard Earnshaw
On 05/05/15 08:32, Jakub Jelinek wrote:
 On Mon, May 04, 2015 at 05:00:11PM +0200, Jakub Jelinek wrote:
 So at least changing arm_needs_doubleword_align for non-aggregates would
 likely not break anything that hasn't been broken already and would unbreak
 the majority of cases.
 
 Attached (untested so far).  It indeed changes code generated for
 over-aligned va_arg, but as I believe you can't properly pass those in the
 ... caller, this should just fix it so that va_arg handling matches the
 caller (and likewise for callees for named argument passing).
 
 The following testcase shows that eipa_sra changes alignment even for the
 aggregates.  Change aligned (8) to aligned (4) to see another possibility.
 
 Actually I misread it, for the aggregates esra actually doesn't change
 anything, which is the reason why the testcase doesn't fail.
 The problem with the scalars is that esra first changed it to the
 over-aligned MEM_REFs and then later on eipa_sra used the types of the
 MEM_REFs created by esra.
 
 2015-05-05  Jakub Jelinek  ja...@redhat.com
 
   PR target/65956
   * config/arm/arm.c (arm_needs_doubleword_align): For non-aggregate
   types check TYPE_ALIGN of TYPE_MAIN_VARIANT rather than type itself.
 
   * gcc.c-torture/execute/pr65956.c: New test.
 
 --- gcc/config/arm/arm.c.jj   2015-05-04 21:51:42.0 +0200
 +++ gcc/config/arm/arm.c  2015-05-05 09:20:52.481693337 +0200
 @@ -6063,8 +6063,13 @@ arm_init_cumulative_args (CUMULATIVE_ARG
  static bool
  arm_needs_doubleword_align (machine_mode mode, const_tree type)
  {
 -  return (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY
 -   || (type  TYPE_ALIGN (type)  PARM_BOUNDARY));
 +  if (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY)
 +return true;

I don't think this is right (though I suspect the existing code has the
same problem).  We should only look at mode if there is no type
information.  The problem is that GCC has a nasty habit of assigning
real machine modes to things that are really BLKmode and we've run into
several cases where this has royally screwed things up.  So for
consistency in the ARM back-end we are careful to only use mode when
type is NULL (= it's a libcall).

 +  if (type == NULL_TREE)
 +return false;
 +  if (AGGREGATE_TYPE_P (type))
 +return TYPE_ALIGN (type)  PARM_BOUNDARY;
 +  return TYPE_ALIGN (TYPE_MAIN_VARIANT (type))  PARM_BOUNDARY;
  }
  



It ought to be possible to re-order this, though, to

 static bool
 arm_needs_doubleword_align (machine_mode mode, const_tree type)
 {
-  return (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY
- || (type  TYPE_ALIGN (type)  PARM_BOUNDARY));
+  if (type != NULL_TREE)
+{
+  if (AGGREGATE_TYPE_P (type))
+return TYPE_ALIGN (type)  PARM_BOUNDARY;
+  return TYPE_ALIGN (TYPE_MAIN_VARIANT (type))  PARM_BOUNDARY;
+}
+  return (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY);
 }


Either way, this would need careful cross-testing against an existing
compiler.

R.

  
 --- gcc/testsuite/gcc.c-torture/execute/pr65956.c.jj  2015-05-01 
 10:32:34.730150257 +0200
 +++ gcc/testsuite/gcc.c-torture/execute/pr65956.c 2015-05-01 
 10:32:13.0 +0200
 @@ -0,0 +1,67 @@
 +/* PR target/65956 */
 +
 +struct A { char *a; int b; long long c; };
 +char v[3];
 +
 +__attribute__((noinline, noclone)) void
 +fn1 (char *x, char *y)
 +{
 +  if (x != v[1] || y != v[2])
 +__builtin_abort ();
 +  v[1]++;
 +}
 +
 +__attribute__((noinline, noclone)) int
 +fn2 (char *x)
 +{
 +  asm volatile ( : +g (x) : : memory);
 +  return x == v[0];
 +}
 +
 +__attribute__((noinline, noclone)) void
 +fn3 (const char *x)
 +{
 +  if (x[0] != 0)
 +__builtin_abort ();
 +}
 +
 +static struct A
 +foo (const char *x, struct A y, struct A z)
 +{
 +  struct A r = { 0, 0, 0 };
 +  if (y.b  z.b)
 +{
 +  if (fn2 (y.a)  fn2 (z.a))
 + switch (x[0])
 +   {
 +   case '|':
 + break;
 +   default:
 + fn3 (x);
 +   }
 +  fn1 (y.a, z.a);
 +}
 +  return r;
 +}
 +
 +__attribute__((noinline, noclone)) int
 +bar (int x, struct A *y)
 +{
 +  switch (x)
 +{
 +case 219:
 +  foo (+, y[-2], y[0]);
 +case 220:
 +  foo (-, y[-2], y[0]);
 +}
 +}
 +
 +int
 +main ()
 +{
 +  struct A a[3] = { { v[1], 1, 1LL }, { v[0], 0, 0LL }, { v[2], 2, 2LL } 
 };
 +  bar (220, a + 2);
 +  if (v[1] != 1)
 +__builtin_abort ();
 +  return 0;
 +}
 
 
   Jakub
 



Re: [PATCH] Fix eipa_sra AAPCS issue (PR target/65956)

2015-05-05 Thread Richard Earnshaw
On 05/05/15 11:54, Richard Earnshaw wrote:
 On 05/05/15 08:32, Jakub Jelinek wrote:
 On Mon, May 04, 2015 at 05:00:11PM +0200, Jakub Jelinek wrote:
 So at least changing arm_needs_doubleword_align for non-aggregates would
 likely not break anything that hasn't been broken already and would unbreak
 the majority of cases.

 Attached (untested so far).  It indeed changes code generated for
 over-aligned va_arg, but as I believe you can't properly pass those in the
 ... caller, this should just fix it so that va_arg handling matches the
 caller (and likewise for callees for named argument passing).

 The following testcase shows that eipa_sra changes alignment even for the
 aggregates.  Change aligned (8) to aligned (4) to see another possibility.

 Actually I misread it, for the aggregates esra actually doesn't change
 anything, which is the reason why the testcase doesn't fail.
 The problem with the scalars is that esra first changed it to the
 over-aligned MEM_REFs and then later on eipa_sra used the types of the
 MEM_REFs created by esra.

 2015-05-05  Jakub Jelinek  ja...@redhat.com

  PR target/65956
  * config/arm/arm.c (arm_needs_doubleword_align): For non-aggregate
  types check TYPE_ALIGN of TYPE_MAIN_VARIANT rather than type itself.

  * gcc.c-torture/execute/pr65956.c: New test.

 --- gcc/config/arm/arm.c.jj  2015-05-04 21:51:42.0 +0200
 +++ gcc/config/arm/arm.c 2015-05-05 09:20:52.481693337 +0200
 @@ -6063,8 +6063,13 @@ arm_init_cumulative_args (CUMULATIVE_ARG
  static bool
  arm_needs_doubleword_align (machine_mode mode, const_tree type)
  {
 -  return (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY
 -  || (type  TYPE_ALIGN (type)  PARM_BOUNDARY));
 +  if (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY)
 +return true;
 
 I don't think this is right (though I suspect the existing code has the
 same problem).  We should only look at mode if there is no type
 information.  The problem is that GCC has a nasty habit of assigning
 real machine modes to things that are really BLKmode and we've run into
 several cases where this has royally screwed things up.  So for
 consistency in the ARM back-end we are careful to only use mode when
 type is NULL (= it's a libcall).
 
 +  if (type == NULL_TREE)
 +return false;
 +  if (AGGREGATE_TYPE_P (type))
 +return TYPE_ALIGN (type)  PARM_BOUNDARY;
 +  return TYPE_ALIGN (TYPE_MAIN_VARIANT (type))  PARM_BOUNDARY;
  }
  
 
 
 
 It ought to be possible to re-order this, though, to
 
  static bool
  arm_needs_doubleword_align (machine_mode mode, const_tree type)
  {
 -  return (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY
 -   || (type  TYPE_ALIGN (type)  PARM_BOUNDARY));
 +  if (type != NULL_TREE)
 +{
 +  if (AGGREGATE_TYPE_P (type))
 +return TYPE_ALIGN (type)  PARM_BOUNDARY;
 +  return TYPE_ALIGN (TYPE_MAIN_VARIANT (type))  PARM_BOUNDARY;
 +}
 +  return (GET_MODE_ALIGNMENT (mode)  PARM_BOUNDARY);
  }
 
 
 Either way, this would need careful cross-testing against an existing
 compiler.
 

It looks as though either patch would cause ABI incompatibility for

typedef int alignedint __attribute__((aligned((8;

int  __attribute__((weak)) foo (int a, alignedint b)
{return b;}

void bar (alignedint x)
{
  foo (1, x);
}

Where currently gcc uses r2 as the argument register for b in foo.

R.



[PATCH, i386]: Use HOST_WIDE_INT_1{,U} some more

2015-05-05 Thread Uros Bizjak
Hello!

No functional changes.

2015-05-05  Uros Bizjak  ubiz...@gmail.com

* config/i386/i386.c: Use HOST_WIDE_INT_1 instead of (HOST_WIDE_INT) 1
and HOST_WIDE_INT_1U instead of (unsigned HOST_WIDE_INT) 1.
* config/i386/i386.md: Ditto.
* config/i386/winnt.c: Ditto.

Tested on x86_64-linux-gnu {,-m32} and committed to mainline SVN:

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 222796)
+++ config/i386/i386.md (working copy)
@@ -6087,7 +6087,7 @@
 GET_MODE (operands[0]) == GET_MODE (operands[1])
 ((unsigned HOST_WIDE_INT) INTVAL (operands[2])) - 1  3
 ((unsigned HOST_WIDE_INT) INTVAL (operands[3])
-((unsigned HOST_WIDE_INT) 1  INTVAL (operands[2])))
+(HOST_WIDE_INT_1U  INTVAL (operands[2])))
   #
reload_completed
   [(const_int 0)]
Index: config/i386/winnt.c
===
--- config/i386/winnt.c (revision 222796)
+++ config/i386/winnt.c (working copy)
@@ -599,7 +599,7 @@ i386_pe_asm_output_aligned_decl_common (FILE *stre
   assemble_name (stream, name);
   if (use_pe_aligned_common)
 fprintf (stream, ,  HOST_WIDE_INT_PRINT_DEC , %d\n,
-  size ? size : (HOST_WIDE_INT) 1,
+  size ? size : HOST_WIDE_INT_1,
   exact_log2 (align) - exact_log2 (CHAR_BIT));
   else
 fprintf (stream, ,  HOST_WIDE_INT_PRINT_DEC \t ASM_COMMENT_START
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 222796)
+++ config/i386/i386.c  (working copy)
@@ -19366,7 +19366,7 @@ ix86_expand_adjust_ufix_to_sfix_si (rtx val, rtx *
 OPTAB_DIRECT);
   else
 {
-  rtx two31 = GEN_INT ((unsigned HOST_WIDE_INT) 1  31);
+  rtx two31 = GEN_INT (HOST_WIDE_INT_1U  31);
   two31 = ix86_build_const_vector (intmode, 1, two31);
   *xorp = expand_simple_binop (intmode, AND,
   gen_lowpart (intmode, tmp[0]),
@@ -24772,7 +24772,7 @@ ix86_expand_set_or_movmem (rtx dst, rtx src, rtx c
  }
 
   /* Make sure we don't need to care about overflow later on.  */
-  if (count  ((unsigned HOST_WIDE_INT) 1  30))
+  if (count  (HOST_WIDE_INT_1U  30))
 return false;
 
   /* Step 0: Decide on preferred algorithm, desired alignment and
@@ -47734,7 +47734,7 @@ expand_vec_perm_interleave2 (struct expand_vec_per
   /* Examine from whence the elements come.  */
   contents = 0;
   for (i = 0; i  nelt; ++i)
-contents |= ((unsigned HOST_WIDE_INT) 1)  d-perm[i];
+contents |= HOST_WIDE_INT_1U  d-perm[i];
 
   memset (remap, 0xff, sizeof (remap));
   dremap = *d;
@@ -47744,7 +47744,7 @@ expand_vec_perm_interleave2 (struct expand_vec_per
   unsigned HOST_WIDE_INT h1, h2, h3, h4;
 
   /* Split the two input vectors into 4 halves.  */
-  h1 = (((unsigned HOST_WIDE_INT) 1)  nelt2) - 1;
+  h1 = (HOST_WIDE_INT_1U  nelt2) - 1;
   h2 = h1  nelt2;
   h3 = h2  nelt2;
   h4 = h3  nelt2;
@@ -47826,7 +47826,7 @@ expand_vec_perm_interleave2 (struct expand_vec_per
   unsigned int nonzero_halves[4];
 
   /* Split the two input vectors into 8 quarters.  */
-  q[0] = (((unsigned HOST_WIDE_INT) 1)  nelt4) - 1;
+  q[0] = (HOST_WIDE_INT_1U  nelt4) - 1;
   for (i = 1; i  8; ++i)
q[i] = q[0]  (nelt4 * i);
   for (i = 0; i  4; ++i)


Re: [PATCH, PR target/65103, 2/3] Propagate address constants into loops for i386

2015-05-05 Thread Ilya Enkovich
2015-04-21 8:52 GMT+03:00 Jeff Law l...@redhat.com:
 On 04/17/2015 02:34 AM, Ilya Enkovich wrote:

 On 15 Apr 14:07, Ilya Enkovich wrote:

 2015-04-14 8:22 GMT+03:00 Jeff Law l...@redhat.com:

 On 03/15/2015 02:30 PM, Richard Sandiford wrote:


 Ilya Enkovich enkovich@gmail.com writes:


 This patch allows propagation of loop invariants for i386 if
 propagated
 value is a constant to be used in address operand.  Bootstrapped and
 tested on x86_64-unknown-linux-gnu.  OK for trunk or stage 1?



 Is it necessary for this to be a target hook?  The concept doesn't seem
 particularly target-specific.  We should only propagate into the
 address
 if the new cost is no greater than the old cost, but if the address
 meets that condition and if propagating at this point in the pipeline
 is
 a win on x86, then wouldn't it be a win for other targets too?


 I agree with Richard here.  I can't see a strong reason why this should
 be a
 target hook.

 Perhaps part of the issue here is the address costing metrics may not
 have
 enough context to make good decisions.  In which case what context do
 they
 need?


 At this point I don't insist on a target hook.  The main reasoning was
 to not affect other targets. If we extend propagation for non constant
 values different aspects may appear. E.g. possible register pressure
 changes may significantly affect ia32. I just wanted to have an
 instrument to play with a propagation on x86 not affecting other
 targets. I don't have an opportunity to test possible performance
 implications on non-x86 targets. Don't expect (significant)
 regressions there but who knows...

 I'll remove the hook from this patch. Will probably introduce it later
 if some target specific cases are found.

 Thanks,
 Ilya


 Jeff


 Here is a version with no hook.  Bootstrapped and tested on
 x86_64-unknown-linux-gnu.  Is it OK for trunk?

 Thanks,
 Ilya
 --
 gcc/

 2015-04-17  Ilya Enkovich  ilya.enkov...@intel.com

 PR target/65103
 * fwprop.c (forward_propagate_into): Propagate loop
 invariants if a target says so.

 gcc/testsuite/

 2015-04-17  Ilya Enkovich  ilya.enkov...@intel.com

 PR target/65103
 * gcc.target/i386/pr65103-2.c: New.

 It seems to me there's a key piece missing here -- metrics.

 When is this profitable, when is it not profitable.   Just blindly undoing
 LICM seems wrong here.

 The first thought is to look at register pressure through the loop.  I
 thought we had some infrastructure for this kind of query available. It'd
 probably be wise to re-use it.  In fact, one might reasonably ask if LICM
 should have hoisted the expression to start with.


 I'd also think the cost of the constant may come into play here.  A really
 cheap constant probably should not have been hoisted by LICM to start with
 -- but the code may have been written in such a way that some low cost
 constants are pulled out as loop invariants at the source level.  So this
 isn't strictly an issue of un-doing bad LICM

 So I think to go forward we need to be working on solving the when is this
 a profitable transformation to make.

This patch doesn't force propagation.  The patch just allows
propagation and regular fwprop cost estimation is used to compute if
this is profitable.  For i386 I don't see cases when we shouldn't
propagate. We remove instruction, reduce register pressure and having
constant in memory operand is free which is reflected in address_cost
hook.

Ilya


 jeff


Re: [PATCH 2/3][AArch64] Add vcond(u?)didi pattern

2015-05-05 Thread Marcus Shawcroft
On 17 April 2015 at 16:40, Alan Lawrence alan.lawre...@arm.com wrote:
 This just adds the necessary patterns used for comparisons of DImode
 vectors. Used as part of arm_neon.h, in next/final patch.

 Tested on aarch64-none-elf.

 gcc/ChangeLog:

 * config/aarch64/aarch64-simd.md
 (aarch64_vcond_internalmodemode,
 vcondmodemode, vcondumode,mode): Add DImode variant.

OK /Marcus


Re: [PATCH 3/3][AArch64] Idiomatic 64x1 comparisons in arm_neon.h

2015-05-05 Thread Marcus Shawcroft
On 17 April 2015 at 16:41, Alan Lawrence alan.lawre...@arm.com wrote:
 This also makes the existing intrinsics tests apply to the new patterns.

 Tested on aarch64-none-elf.

 gcc/ChangeLog:

 * config/aarch64/arm_neon.h (vceq_s64, vceq_u64, vceqz_s64,
 vceqz_u64,
 vcge_s64, vcge_u64, vcgez_s64, vcgt_s64, vcgt_u64, vcgtz_s64,
 vcle_s64,
 vcle_u64, vclez_s64, vclt_s64, vclt_u64, vcltz_s64, vtst_s64,
 vtst_u64): Rewrite using gcc vector extensions.

 gcc/testsuite/ChangeLog:

 * gcc.target/aarch64/singleton_intrinsics_1.c: Generalize regex to
 allow cmlt or sshr.

OK /Marcus