Re: [PATCH] fixup libobjc usage of PCC_BITFIELD_TYPE_MATTERS

2015-05-04 Thread Trevor Saunders
On Sun, May 03, 2015 at 10:59:46AM +0200, Andreas Schwab wrote:
 tbsaunde+...@tbsaunde.org writes:
 
  +AC_DEFUN([gt_BITFIELD_TYPE_MATTERS],
  +[
  +  AC_CACHE_CHECK([if the type of bitfields matters], 
  gt_cv_bitfield_type_matters,
  +  [
  +AC_TRY_COMPILE(
  +  [struct foo1 { char x; char :0; char y; };
  +struct foo2 { char x; int :0; char y; };
  +int foo1test[ sizeof (struct foo1) == 2 ? 1 : -1 ];
  +int foo2test[ sizeof (struct foo2) == 5 ? 1 : -1]; ],
  +  [], gt_cv_bitfield_type_matters=yes, gt_cv_bitfield_type_matters=no)
  +  ])
  +  if test $gt_cv_bitfield_type_matters = yes; then
  +AC_DEFINE(HAVE_BITFIELD_TYPE_MATTERS, 1,
  +  [Define if the type of bitfields effects alignment.])
  +  fi
  +])
 
 gcc/config/aarch64/aarch64.h:#define PCC_BITFIELD_TYPE_MATTERS  1
 
 configure:11554: /opt/gcc/gcc-20150503/Build/./gcc/xgcc 
 -B/opt/gcc/gcc-20150503/Build/./gcc/ -B/usr/aarch64-suse-linux/bin/ 
 -B/usr/aarch64-suse-linux/lib/ -isystem /usr/aarch64-suse-linux/include 
 -isystem /usr/aarch64-suse-linux/sys-include -c -O2 -g  conftest.c >&5
 conftest.c:27:5: error: size of array 'foo2test' is negative
  int foo2test[ sizeof (struct foo2) == 5 ? 1 : -1];
  ^
 configure:11554: $? = 1

OK, a quick test seems to show that Jakub's version of the test works in
this case, so let's try that.

Trev
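
For readers following the thread, the layout question the configure check probes can be inspected directly. The sketch below reuses the two structs from the quoted macro; the helper name is made up for illustration and is not part of the patch. It only reports what the compiler in use does, since the answer is ABI-dependent (the configure log quoted above shows that sizeof (struct foo2) == 5 does not hold for that aarch64 build).

```c
#include <assert.h>
#include <stddef.h>

/* Same structs as in the quoted configure check.  */
struct foo1 { char x; char :0; char y; };
struct foo2 { char x; int :0; char y; };

/* Hypothetical helper: nonzero when declaring the zero-width bit-field
   as int instead of char changes the struct size or the offset of y,
   i.e. when the declared type of the bit-field influences layout.  */
static int
bitfield_type_changes_layout (void)
{
  /* y always follows x, so it can never sit at offset 0.  */
  assert (offsetof (struct foo1, y) >= 1);
  return sizeof (struct foo2) != sizeof (struct foo1)
         || offsetof (struct foo2, y) != offsetof (struct foo1, y);
}
```

Checking exact sizes, as the quoted test does with 2 and 5, is stricter than this comparison and is what tripped on aarch64; a comparison-based probe is one possible alternative, not necessarily the form that ends up in the patch.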

 
 Andreas.
 
 -- 
 Andreas Schwab, sch...@linux-m68k.org
 GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
 And now for something completely different.


Re: [PATCH, PR65915] Fix float conversion split.

2015-05-04 Thread Uros Bizjak
On Thu, Apr 30, 2015 at 5:18 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Thu, Apr 30, 2015 at 8:15 AM, Ilya Tocar tocarip.in...@gmail.com wrote:
 Hi,

 Looks like I missed some splits, which caused PR65915.
 Patch below fixes it.
 Ok for trunk?

 2015-04-28  Ilya Tocar  ilya.to...@intel.com

   * config/i386/i386.md (define_split): Check for xmm16+,
   when splitting scalar float conversion.


 ---
  gcc/config/i386/i386.md | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)

 diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
 index 937871a..af1cd9b 100644
 --- a/gcc/config/i386/i386.md
 +++ b/gcc/config/i386/i386.md
 @@ -4897,7 +4897,9 @@
    "TARGET_SSE2 && TARGET_SSE_MATH
     && TARGET_USE_VECTOR_CONVERTS && optimize_function_for_speed_p (cfun)
     && reload_completed && SSE_REG_P (operands[0])
 -   && (MEM_P (operands[1]) || TARGET_INTER_UNIT_MOVES_TO_VEC)"
 +   && (MEM_P (operands[1]) || TARGET_INTER_UNIT_MOVES_TO_VEC)
 +   && (!EXT_REX_SSE_REG_P (operands[0])
 +       || TARGET_AVX512VL)"
    [(const_int 0)]
  {
    operands[3] = simplify_gen_subreg (<ssevecmode>mode, operands[0],
 @@ -4921,7 +4923,9 @@
    "TARGET_SSE2 && TARGET_SSE_MATH
     && TARGET_SSE_PARTIAL_REG_DEPENDENCY
     && optimize_function_for_speed_p (cfun)
 -   && reload_completed && SSE_REG_P (operands[0])"
 +   && reload_completed && SSE_REG_P (operands[0])
 +   && (!EXT_REX_SSE_REG_P (operands[0])
 +       || TARGET_AVX512VL)"
    [(const_int 0)]
  {
    const machine_mode vmode = <MODEF:ssevecmode>mode;
 --
 1.8.3.1


 Updated version below (now with test).

 ---
  gcc/config/i386/i386.md | 8 ++--
  gcc/config/i386/sse.md  | 6 +++---
  gcc/testsuite/gcc.target/i386/pr65915.c | 6 ++
  3 files changed, 15 insertions(+), 5 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/i386/pr65915.c

 diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
 index 937871a..af1cd9b 100644
 --- a/gcc/config/i386/i386.md
 +++ b/gcc/config/i386/i386.md
 @@ -4897,7 +4897,9 @@
    "TARGET_SSE2 && TARGET_SSE_MATH
     && TARGET_USE_VECTOR_CONVERTS && optimize_function_for_speed_p (cfun)
     && reload_completed && SSE_REG_P (operands[0])
 -   && (MEM_P (operands[1]) || TARGET_INTER_UNIT_MOVES_TO_VEC)"
 +   && (MEM_P (operands[1]) || TARGET_INTER_UNIT_MOVES_TO_VEC)
 +   && (!EXT_REX_SSE_REG_P (operands[0])
 +       || TARGET_AVX512VL)"
    [(const_int 0)]
  {
    operands[3] = simplify_gen_subreg (<ssevecmode>mode, operands[0],
 @@ -4921,7 +4923,9 @@
    "TARGET_SSE2 && TARGET_SSE_MATH
     && TARGET_SSE_PARTIAL_REG_DEPENDENCY
     && optimize_function_for_speed_p (cfun)
 -   && reload_completed && SSE_REG_P (operands[0])"
 +   && reload_completed && SSE_REG_P (operands[0])
 +   && (!EXT_REX_SSE_REG_P (operands[0])
 +       || TARGET_AVX512VL)"
    [(const_int 0)]
  {
    const machine_mode vmode = <MODEF:ssevecmode>mode;
 diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
 index 9b7009a..c61098d 100644
 --- a/gcc/config/i386/sse.md
 +++ b/gcc/config/i386/sse.md
 @@ -4258,11 +4258,11 @@
 (set_attr mode TI)])

  (define_insn sse2_cvtsi2sd
 -  [(set (match_operand:V2DF 0 register_operand =x,x,x)
 +  [(set (match_operand:V2DF 0 register_operand =x,x,v)
 (vec_merge:V2DF
   (vec_duplicate:V2DF
 (float:DF (match_operand:SI 2 nonimmediate_operand r,m,rm)))
 - (match_operand:V2DF 1 register_operand 0,0,x)
 + (match_operand:V2DF 1 register_operand 0,0,v)
   (const_int 1)))]
TARGET_SSE2
@
 @@ -4275,7 +4275,7 @@
 (set_attr amdfam10_decode vector,double,*)
 (set_attr bdver1_decode double,direct,*)
 (set_attr btver2_decode double,double,double)
 -   (set_attr prefix orig,orig,vex)
 +   (set_attr prefix orig,orig,maybe_evex)
 (set_attr mode DF)])

  (define_insn sse2_cvtsi2sdqround_name
 diff --git a/gcc/testsuite/gcc.target/i386/pr65915.c 
 b/gcc/testsuite/gcc.target/i386/pr65915.c
 new file mode 100644
 index 000..990c5aa
 --- /dev/null
 +++ b/gcc/testsuite/gcc.target/i386/pr65915.c
 @@ -0,0 +1,6 @@
 +/* { dg-do run } */
 +/* { dg-options "-O2 -mavx512f -fpic -mcmodel=medium" } */
 +/* { dg-require-effective-target avx512f } */
 +/* { dg-require-effective-target lp64 } */
 +
 +#include "avx512f-vrndscalepd-2.c"

 Missing testcases for

 FAIL: gcc.target/i386/avx512f-vrndscaleps-2.c (test for excess errors)
 FAIL: gcc.target/i386/avx512vl-vrndscaleps-2.c (internal compiler error)

The attached test is OK, since these two would test for the same problem.

 as well as ChangeLog entries.

ChangeLog is missing. Please add PR number and describe *each* change
accurately. You can say (vector convert to float splitter) for this
particular nameless splitter.

Please repost the patch with updated ChangeLog.

Uros.
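
As a reading aid, the condition the patch adds to both splitters can be written as a plain predicate. This is only a sketch: the function names and the flat 0..31 register numbering are assumptions made for illustration, and the real EXT_REX_SSE_REG_P macro works on RTL operands rather than integers.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption for this sketch: SSE registers are numbered 0..31 and
   xmm16..xmm31 are the ones EXT_REX_SSE_REG_P is meant to catch.  */
static bool
is_ext_rex_sse_reg (int xmm_regno)
{
  assert (xmm_regno >= 0 && xmm_regno <= 31);
  return xmm_regno >= 16;
}

/* Mirrors the added split condition:
   !EXT_REX_SSE_REG_P (operands[0]) || TARGET_AVX512VL.  */
static bool
split_allowed (int dest_xmm_regno, bool target_avx512vl)
{
  return !is_ext_rex_sse_reg (dest_xmm_regno) || target_avx512vl;
}
```

Read this way, the split is skipped only when the destination is one of the upper sixteen registers and AVX512VL is not enabled.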


Re: [PATCH] Fix eipa_sra AAPCS issue (PR target/65956)

2015-05-04 Thread Richard Biener
On Sat, 2 May 2015, Jakub Jelinek wrote:

 Hi!
 
 This is an attempt to fix the following testcase (reduced from gsoap),
 similar to how you fixed, with r221795, other AAPCS regressions
 introduced by the r221348 change.
 This patch passed bootstrap/regtest on
 {x86_64,i686,armv7hl,aarch64,powerpc64{,le},s390{,x}}-linux.
 
 Though, it still doesn't fix profiledbootstrap on armv7hl that is broken
 since r221348, so other issues are lurking in there, and I must say
 I'm not entirely sure about this, because it changes alignment even when
 the original access had higher alignment.
 
 I was trying something like:
 struct B { char *a, *b; };
 typedef struct B C __attribute__((aligned (8)));
 struct A { C a; int b; long long c; };
 char v[3];
 
 __attribute__((noinline, noclone)) void
 fn1 (C x, C y)
 {
   if (x.a != &v[1] || y.a != &v[2])
     __builtin_abort ();
   v[1]++;
 }
 
 __attribute__((noinline, noclone)) int
 fn2 (C x)
 {
   asm volatile ("" : "+g" (x.a) : : "memory");
   asm volatile ("" : "+g" (x.b) : : "memory");
   return x.a == &v[0];
 }
 
 __attribute__((noinline, noclone)) void
 fn3 (const char *x)
 {
   if (x[0] != 0)
     __builtin_abort ();
 }
 
 static struct A
 foo (const char *x, struct A y, struct A z)
 {
   struct A r = { { 0, 0 }, 0, 0 };
   if (y.b && z.b)
     {
       if (fn2 (y.a) && fn2 (z.a))
         switch (x[0])
           {
           case '|':
             break;
           default:
             fn3 (x);
           }
       fn1 (y.a, z.a);
     }
   return r;
 }
 
 __attribute__((noinline, noclone)) int
 bar (int x, struct A *y)
 {
   switch (x)
     {
     case 219:
       foo ("+", y[-2], y[0]);
     case 220:
       foo ("-", y[-2], y[0]);
     }
 }
 
 int
 main ()
 {
   struct A a[3] = { { { &v[1], &v[0] }, 1, 1LL },
                     { { &v[0], &v[0] }, 0, 0LL },
                     { { &v[2], &v[0] }, 2, 2LL } };
   bar (220, a + 2);
   if (v[1] != 1)
     __builtin_abort ();
   return 0;
 }
 
 and this patch indeed changes the register passing, even though it probably
 shouldn't (though, the testcase doesn't fail).  Wouldn't it be possible to
 preserve the original type (before we call build_aligned_type on it)
 somewhere in SRA data structures, perhaps keep expr (the new MEM_REF) using
 the aligned type, but have the type field be the non-aligned one?

Not sure how this helps when SRA tears apart the parameter.  That is,
isn't the important thing that both the IPA modified function argument
types/decls have the same type as the types of the parameters SRA ends
up passing?  (as far as alignment goes?)

Yes, of course using natural alignment makes sure that the backend
can handle alignment properly and we don't run into oddball bugs here.

 2015-05-02  Jakub Jelinek  ja...@redhat.com
 
   PR target/65956
   * tree-sra.c (turn_representatives_into_adjustments): For
   adj.type, use TYPE_MAIN_VARIANT of repr->type with TYPE_QUALS.
 
   * gcc.c-torture/execute/pr65956.c: New test.
 
 --- gcc/tree-sra.c.jj 2015-04-20 14:35:47.0 +0200
 +++ gcc/tree-sra.c2015-05-01 01:08:34.092636496 +0200
 @@ -4427,7 +4427,11 @@ turn_representatives_into_adjustments (v
        gcc_assert (repr->base == parm);
        adj.base_index = index;
        adj.base = repr->base;
 -      adj.type = repr->type;
 +      /* Drop any special alignment on the type if it's not on the
 +         main variant.  This avoids issues with weirdo ABIs like
 +         AAPCS.  */
 +      adj.type = build_qualified_type (TYPE_MAIN_VARIANT (repr->type),
 +                                       TYPE_QUALS (repr->type));

So - this changes the function argument type of the clone?  Does it
also change the type of the value we pass to the function?  That is,
why drop the alignment here but not avoid attaching it to repr->type
in the first place as my fix for the other issue did?

Doesn't the above just make it inconsistent by default?

There is also the correctness issue of under-aligned types (which
was what the original code using build_aligned_type cared for - before
I fixed it to also preserve over-alignment).
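
To make the over-alignment case concrete, the typedef from the quoted testcase can be compared with its main variant. The sketch below is only an illustration of what the attribute does at the C level; the helper names are made up, and nothing here shows how SRA itself represents the types.

```c
#include <stddef.h>

/* Types taken from the quoted testcase.  */
struct B { char *a, *b; };
typedef struct B C __attribute__((aligned (8)));

/* Alignment of the typedef'd variant, which carries the attribute.  */
static size_t
align_of_variant (void)
{
  return _Alignof (C);
}

/* Alignment of the main variant, i.e. plain struct B.  */
static size_t
align_of_main_variant (void)
{
  return _Alignof (struct B);
}

/* Nonzero when the variant is over-aligned relative to the main variant,
   which is expected on targets with 4-byte pointers.  */
static int
variant_is_over_aligned (void)
{
  return align_of_variant () > align_of_main_variant ();
}
```

On a target with 8-byte pointers the two alignments coincide, so the difference between the variant and the main variant only becomes visible on 32-bit targets such as armv7hl.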

That said - somewhere we create the register we use for passing the
argument, and only the type of that register needs fixing IMHO.

We also have

  ptype = adj->type;
  if (is_gimple_reg_type (ptype))
    {
      unsigned malign = GET_MODE_ALIGNMENT (TYPE_MODE (ptype));
      if (TYPE_ALIGN (ptype) < malign)
        ptype = build_aligned_type (ptype, malign);

in ipa_modify_formal_parameters.  That looks odd for by-value passing
as well.  When modifying the function bodies we simply take what was
set in ->new_decl, which we'd populate above in
ipa_modify_formal_parameters.  It seems to me that ipa_modify_expr
should look to preserve alignment at the caller's site (for loading
into the regs we pass) for non-reference passing.  Esp.

  if (cand->by_ref)
    src = build_simple_mem_ref (cand->new_decl);

looks bogus in this 

Re: [PR testsuite/65205, libgomp/65993] Fix dg-shouldfail usage in OpenACC libgomp tests

2015-05-04 Thread Thomas Schwinge
Hi!

On Thu, 30 Apr 2015 14:47:03 +0200, I wrote:
 Here is a patch, prepared by Jim Norris, to fix dg-shouldfail usage in
 OpenACC libgomp tests.  It introduces two regressions (that is, makes the
 existing errors visible), which shall then be fixed later on:
 libgomp.oacc-c-c++-common/lib-3.c, and
 libgomp.oacc-c-c++-common/lib-42.c.
 
 As obvious, committed to trunk in r222620: [...]

So much for obvious ;-) -- https://gcc.gnu.org/PR65993.

Dave, would you please test the following patch, and report the
regression status compared to before r222620?  (Compared to your existing
r222021 results, as posted in the PR, for example.)

Additionally to the %p format specifier printing a 0x prefix vs. not
doing that, I've also changed the expected (nil) output for NULL
pointers to instead match basically everything.

 libgomp/testsuite/libgomp.oacc-c-c++-common/clauses-2.c  | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-1.c | 4 ++--
 libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-2.c | 4 ++--
 libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-8.c | 4 ++--
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-16.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-17.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-18.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-20.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-21.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-22.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-23.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-25.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-26.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-27.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-28.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-29.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-30.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-34.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-35.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-36.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-39.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-40.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-42.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-43.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-44.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-47.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-48.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-52.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-53.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-54.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-57.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-58.c | 2 +-
 libgomp/testsuite/libgomp.oacc-fortran/data-already-1.f  | 2 +-
 libgomp/testsuite/libgomp.oacc-fortran/data-already-2.f  | 2 +-
 libgomp/testsuite/libgomp.oacc-fortran/data-already-8.f  | 2 +-
 35 files changed, 38 insertions(+), 38 deletions(-)

diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/clauses-2.c 
libgomp/testsuite/libgomp.oacc-c-c++-common/clauses-2.c
index fec2214..c0a5d00 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/clauses-2.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/clauses-2.c
@@ -64,5 +64,5 @@ main (int argc, char **argv)
 
 return 0;
 }
-/* { dg-output Trying to map into device \\\[0x\[0-9a-f\]+..0x\[0-9a-f\]+\\\) 
object when \\\[0x\[0-9a-f\]+..0x\[0-9a-f\]+\\\) is already mapped }
+/* { dg-output Trying to map into device 
\\\[\[0-9a-fA-FxX\]+..\[0-9a-fA-FxX\]+\\\) object when 
\\\[\[0-9a-fA-FxX\]+..\[0-9a-fA-FxX\]+\\\) is already mapped } */
 /* { dg-shouldfail  } */
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-1.c 
libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-1.c
index 83c0a42..0c61a66 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-1.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-1.c
@@ -15,5 +15,5 @@ main (int argc, char *argv[])
   return 0;
 }
 
-/* { dg-shouldfail  }
-   { dg-output Trying to map into device .* object when .* is already mapped 
} */
+/* { dg-output Trying to map into device 
\\\[\[0-9a-fA-FxX\]+..\[0-9a-fA-FxX\]+\\\) object when 
\\\[\[0-9a-fA-FxX\]+..\[0-9a-fA-FxX\]+\\\) is already mapped } */
+/* { dg-shouldfail  } */
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-2.c 
libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-2.c
index 137d8ce..cd9fea3 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-2.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-2.c
@@ -12,5 +12,5 @@ main (int argc, char *argv[])
   return 0;
 }
 
-/* 

Re: [rs6000] Fix compare debug failure on AIX

2015-05-04 Thread Richard Biener
On Mon, May 4, 2015 at 2:32 AM, David Edelsohn dje@gmail.com wrote:
 On Sat, May 2, 2015 at 6:04 AM, Eric Botcazou ebotca...@adacore.com wrote:
 Why should GCC unnecessarily create stack frames to avoid
 compare-debug testcase failures?

 I'm not sure I understand the question... compare-debug failures are failures
 (-g is not supposed to change the generated code and this XCOFF-specific bug
 was reported to us) so they need to be fixed.

 From there on, as Alan said, there are 2 cases: either AIX needs a frame for
 debugging or it doesn't.  If the latter, then the lines can simply be 
 deleted.
 If the former, we have to draw a line somewhere; Alan suggests always 
 creating
 a frame while I suggest creating it only at -O0 and -Og.

 I believe that AIX does need a frame for debugging.  I don't remember
 the exact reason off hand.

 I'm sorry that XCOFF debugging changes the generated code (only in the
 sense of allocating a frame), but that is a system dependency.  It's
 been this way for over 20 years.  I see no reason to produce worse
 code at -O0 when not debugging simply to make testcases happier.

The simple reason is because it is policy for GCC to generate the same
code with -g0 and -g.  You can't simply say you don't care.

You never want to run into the situation that you miscompile a program
with -g0 but not with -g because that's very much no fun to debug.

Yes, I don't think we have this policy written down anywhere - something
we should improve on.

Richard.

 By the way, I'm still waiting for the DWARF debugging patches from
 Adacore compatible with AIX as and ld.  DWARF debugging would not
 require pushing a frame, and would resolve the failure when testing
 with DWARF.  The patch would be adjusted to only push a frame when
 writing XCOFF debugging.

 - David


Re: [patch] Perform anonymous constant propagation during inlining

2015-05-04 Thread Richard Biener
On Fri, May 1, 2015 at 8:09 PM, Eric Botcazou ebotca...@adacore.com wrote:
 OK, how aggressive then?  We could as well do the substitution for all
 copies:

   /* For EXPAND_INITIALIZER try harder to get something simpler.
      Otherwise, substitute copies on the RHS, this can propagate
      constants at -O0 and thus simplify arithmetic operations.  */
   if (g == NULL
       && !SSA_NAME_IS_DEFAULT_DEF (exp)
       && (optimize || DECL_IGNORED_P (SSA_NAME_VAR (exp)))
       && (modifier == EXPAND_INITIALIZER
           || (modifier != EXPAND_WRITE
               && gimple_assign_copy_p (SSA_NAME_DEF_STMT (exp))))
       && stmt_is_replaceable_p (SSA_NAME_DEF_STMT (exp)))
     g = SSA_NAME_DEF_STMT (exp);

 This doesn't work (this generates wrong code because this creates overlapping
 live ranges for SSA_NAMEs with the same base variable).  Here's the latest
 working version, all the predicates and accessors used are inlined.

Hum, the fact that your earlier version created wrong code
(get_gimple_for_ssa_name already returned false here) points at some
issues with EXPAND_INITIALIZER as well, no...?

That said, the path you add is certainly safe, though maybe we want to
change get_gimple_for_ssa_name to return tcc_constant single-use defs
even if TER is disabled (thus at -O0, and only at -O0; otherwise it
shouldn't happen).  That would cover more cases of
get_gimple_for_ssa_name uses (I can see optimize_bitfield_expansion,
for example...).

So, your patch is ok for trunk unless you want to explore the
get_gimple_for_ssa_name improvement suggestion.

I also wonder about EXPAND_INITIALIZER creating overlapping
live ranges (or moving loads across stores).

Thanks,
Richard.

 Tested on x86_64-suse-linux, OK for the mainline?


 2015-05-01  Eric Botcazou  ebotca...@adacore.com

 * expr.c (expand_expr_real_1) SSA_NAME: Try to substitute constants
 on the RHS of expressions.
 * gimple-expr.h (is_gimple_constant): Reorder.


 --
 Eric Botcazou


[PING^4] [PATCH] [AArch64, NEON] Improve vmulX intrinsics

2015-05-04 Thread Jiangjiji

Hi, 
  This is a ping for: https://gcc.gnu.org/ml/gcc-patches/2015-03/msg00772.html
  Regtested with aarch64-linux-gnu on QEMU.
  The patch also shows no regressions on the aarch64_be-linux-gnu big-endian target.
  OK for trunk?

Thanks.
Jiang jiji







Re: PR 64454: Improve VRP for %

2015-05-04 Thread Richard Biener
On Sat, May 2, 2015 at 12:46 AM, Marc Glisse marc.gli...@inria.fr wrote:
 Hello,

 this patch tries to tighten a bit the range estimate for x%y. slp-perm-7.c
 started failing by vectorizing more than expected, I assumed it was a good
 thing and updated the test. I am less conservative than Jakub with division
 by 0, but I still don't really understand how empty ranges are supposed to
 be represented in VRP.

 Bootstrap+testsuite on x86_64-linux-gnu.

Hmm, so I don't like how you (continue to) use trees for the constant
computations.  wide-ints would be a better fit today.  I also notice
that fold_unary_to_constant can return NULL_TREE and neither the old
nor your code handles that.

Empty ranges are basically UNDEFINED.

Aren't you pessimizing the case where the old code used
value_range_nonnegative_p() by just using TYPE_UNSIGNED?

Thanks,
Richard.
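
The range reasoning behind the new vrp97.c test can be checked outside the compiler. The following is a minimal sketch using plain long arithmetic rather than trees or wide-ints; struct range, mod_range and check_mod_range are hypothetical names, not functions from tree-vrp.c, and the sketch assumes constant (non-symbolic) ranges where the divisor range contains at least one nonzero value. It relies on three facts about C's truncating remainder: |a % b| < |b|, |a % b| <= |a|, and the result is zero or has the sign of a.

```c
#include <assert.h>
#include <stdlib.h>

struct range { long min, max; };

/* Bound a % b for a in RA and b in RB (b != 0).  */
static struct range
mod_range (struct range ra, struct range rb)
{
  long bmag = labs (rb.min) > labs (rb.max) ? labs (rb.min) : labs (rb.max);
  long lim;
  struct range r;

  /* The divisor range must contain a nonzero value.  */
  assert (bmag > 0);
  lim = bmag - 1;   /* largest |a % b| the divisor allows */

  /* Negative results need a negative dividend and cannot exceed it
     in magnitude; the same holds for positive results.  */
  r.min = ra.min < 0 ? (-ra.min < lim ? ra.min : -lim) : 0;
  r.max = ra.max > 0 ? (ra.max < lim ? ra.max : lim) : 0;
  return r;
}

/* Exhaustively verify the bound; returns 1 if every a % b is inside.  */
static int
check_mod_range (struct range ra, struct range rb)
{
  struct range r = mod_range (ra, rb);
  for (long a = ra.min; a <= ra.max; a++)
    for (long b = rb.min; b <= rb.max; b++)
      {
        if (b == 0)
          continue;
        if (a % b < r.min || a % b > r.max)
          return 0;
      }
  return 1;
}
```

For a in [-3, 13] and b in [-6, 9], as in the test, mod_range yields [-3, 8], which is the interval the test expects VRP to prove.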

 2015-05-02  Marc Glisse  marc.gli...@inria.fr

 PR tree-optimization/64454
 gcc/
 * tree-vrp.c (extract_range_from_binary_expr_1) TRUNC_MOD_EXPR:
 Rewrite.
 gcc/testsuite/
 * gcc.dg/tree-ssa/vrp97.c: New file.
 * gcc.dg/vect/slp-perm-7.c: Update.

 --
 Marc Glisse
 Index: gcc/testsuite/gcc.dg/tree-ssa/vrp97.c
 ===
 --- gcc/testsuite/gcc.dg/tree-ssa/vrp97.c   (revision 0)
 +++ gcc/testsuite/gcc.dg/tree-ssa/vrp97.c   (working copy)
 @@ -0,0 +1,13 @@
 +/* PR tree-optimization/64454 */
 +/* { dg-options "-O2 -fdump-tree-vrp1" } */
 +
 +int f(int a, int b)
 +{
 +    if (a < -3 || a > 13) __builtin_unreachable();
 +    if (b < -6 || b > 9) __builtin_unreachable();
 +    int c = a % b;
 +    return c >= -3 && c <= 8;
 +}
 +
 +/* { dg-final { scan-tree-dump "return 1;" "vrp1" } } */
 +/* { dg-final { cleanup-tree-dump "vrp1" } } */
 Index: gcc/testsuite/gcc.dg/vect/slp-perm-7.c
 ===
 --- gcc/testsuite/gcc.dg/vect/slp-perm-7.c  (revision 222708)
 +++ gcc/testsuite/gcc.dg/vect/slp-perm-7.c  (working copy)
 @@ -63,15 +63,15 @@ int main (int argc, const char* argv[])

foo (input, output, input2, output2);

   for (i = 0; i < N; i++)
   if (output[i] != check_results[i] || output2[i] != check_results2[i])
 abort ();

return 0;
  }

 -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_perm } } } */
 +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target vect_perm } } } */
  /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target vect_perm } } } */
  /* { dg-final { cleanup-tree-dump "vect" } } */


 Index: gcc/tree-vrp.c
 ===
 --- gcc/tree-vrp.c  (revision 222708)
 +++ gcc/tree-vrp.c  (working copy)
 @@ -3189,40 +3189,83 @@ extract_range_from_binary_expr_1 (value_
 }
 }
else
 {
   extract_range_from_multiplicative_op_1 (vr, code, vr0, vr1);
   return;
 }
  }
else if (code == TRUNC_MOD_EXPR)
  {
 -  if (vr1.type != VR_RANGE
 - || range_includes_zero_p (vr1.min, vr1.max) != 0
 - || vrp_val_is_min (vr1.min))
 +  if (range_is_null (vr1))
 +   {
 + set_value_range_to_undefined (vr);
 + return;
 +   }
 +  // Some propagation of symbolic ranges should be possible
 +  // at least in the unsigned case.
 +  bool has_vr0 = vr0.type == VR_RANGE && !symbolic_range_p (vr0);
 +  bool has_vr1 = vr1.type == VR_RANGE && !symbolic_range_p (vr1);
 +  if (!has_vr0 && !has_vr1)
 {
   set_value_range_to_varying (vr);
   return;
 }
type = VR_RANGE;
 -  /* Compute MAX |vr1.min|, |vr1.max| - 1.  */
 -  max = fold_unary_to_constant (ABS_EXPR, expr_type, vr1.min);
 -  if (tree_int_cst_lt (max, vr1.max))
 -   max = vr1.max;
 -  max = int_const_binop (MINUS_EXPR, max, build_int_cst (TREE_TYPE
 (max), 1));
 -  /* If the dividend is non-negative the modulus will be
 -non-negative as well.  */
 -  if (TYPE_UNSIGNED (expr_type)
 - || value_range_nonnegative_p (vr0))
 -   min = build_int_cst (TREE_TYPE (max), 0);
 +  if (TYPE_UNSIGNED (expr_type))
 +   {
 + // A % B is at most A and smaller than B.
 + min = build_int_cst (expr_type, 0);
 + if (has_vr0 && (!has_vr1 || tree_int_cst_lt (vr0.max, vr1.max)))
 +   max = vr0.max;
 + else
 +   max = int_const_binop (MINUS_EXPR, vr1.max,
 +  build_int_cst (expr_type, 1));
 +   }
else
 -   min = fold_unary_to_constant (NEGATE_EXPR, expr_type, max);
 +   {
 + tree min1 = NULL_TREE;
 + tree max1 = NULL_TREE;
 + if (has_vr1)
 +   {
 + // ABS (A % B) < ABS (B)
 + max1 = fold_unary_to_constant (ABS_EXPR, expr_type, vr1.min);
 + if (tree_int_cst_lt (max1, 

Re: [PATCH, AArch64] Add Cortex-A53 erratum 843419 configure-time option

2015-05-04 Thread Yvan Roux
Hi Marcus,

On 1 May 2015 at 17:18, Marcus Shawcroft marcus.shawcr...@gmail.com wrote:
 On 1 May 2015 at 14:56, Yvan Roux yvan.r...@linaro.org wrote:

 2015-05-01  Yvan Roux  yvan.r...@linaro.org

  * configure.ac: Add --enable-fix-cortex-a53-843419 option.
  * configure: Regenerate.
  * config/aarch64/aarch64-elf-raw.h (CA53_ERR_843419_SPEC): Define.
  (LINK_SPEC): Include CA53_ERR_843419_SPEC.
  * config/aarch64/aarch64-linux.h (CA53_ERR_843419_SPEC): Define.
  (LINK_SPEC): Include CA53_ERR_843419_SPEC.
  * doc/install.texi (aarch64*-*-*): Document
  new --enable-fix-cortex-a53-843419 option.
  * config/aarch64/aarch64.opt (mfix-cortex-a53-843419): New option.
  * doc/invoke.texi (AArch64 Options): Document -mfix-cortex-a53-843419
  and -mno-fix-cortex-a53-843419 options.


 +@option{--enable-fix-cortex-a53-843419} option.  This erratum
 workaround is
 +made at link time and enabling it by default in GCC will only pass
 the

 How about something like "The workaround is applied at link time.
 Enabling the workaround will cause GCC to pass the relevant option to
 the linker."?

Yes, this is a better formulation.

 +corresponding flag to the linker.  It can be explicitly disabled
 during
 +compilation by passing the @option{-mno-fix-cortex-a53-835769} option.

 Copy paste error here with the previous errata number.

Here is the patch with the modifications.  Does it need to be backported
to the 4.9 and 5.1 branches?

Cheers,
Yvan
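
The two variants of CA53_ERR_843419_SPEC in the patch below boil down to a small decision table. The sketch restates them as a C function purely for illustration; the enum and function names are hypothetical, and the real selection is done by the GCC driver's spec processing, not by code like this.

```c
#include <stdbool.h>

/* What the user put on the command line, if anything.  */
enum user_choice { CHOICE_UNSET, CHOICE_FIX, CHOICE_NO_FIX };

/* Returns true when --fix-cortex-a53-843419 would be handed to the linker.
   CONFIGURED_DEFAULT stands for TARGET_FIX_ERR_A53_843419_DEFAULT being
   defined, i.e. GCC configured with --enable-fix-cortex-a53-843419.  */
static bool
passes_fix_843419 (bool configured_default, enum user_choice choice)
{
  if (configured_default)
    /* %{!mno-fix-cortex-a53-843419:--fix-cortex-a53-843419}  */
    return choice != CHOICE_NO_FIX;
  /* %{mfix-cortex-a53-843419:--fix-cortex-a53-843419}  */
  return choice == CHOICE_FIX;
}
```

So the configure option only changes what happens when neither -mfix-cortex-a53-843419 nor -mno-fix-cortex-a53-843419 is given.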
diff --git a/gcc/config/aarch64/aarch64-elf-raw.h 
b/gcc/config/aarch64/aarch64-elf-raw.h
index ebeeb50..bd5e51c 100644
--- a/gcc/config/aarch64/aarch64-elf-raw.h
+++ b/gcc/config/aarch64/aarch64-elf-raw.h
@@ -35,10 +35,19 @@
%{mfix-cortex-a53-835769:--fix-cortex-a53-835769}
 #endif
 
+#ifdef TARGET_FIX_ERR_A53_843419_DEFAULT
+#define CA53_ERR_843419_SPEC \
+   %{!mno-fix-cortex-a53-843419:--fix-cortex-a53-843419}
+#else
+#define CA53_ERR_843419_SPEC \
+   %{mfix-cortex-a53-843419:--fix-cortex-a53-843419}
+#endif
+
 #ifndef LINK_SPEC
 #define LINK_SPEC %{mbig-endian:-EB} %{mlittle-endian:-EL} -X \
   -maarch64elf%{mabi=ilp32*:32}%{mbig-endian:b} \
-  CA53_ERR_835769_SPEC
+  CA53_ERR_835769_SPEC \
+  CA53_ERR_843419_SPEC
 #endif
 
 #endif /* GCC_AARCH64_ELF_RAW_H */
diff --git a/gcc/config/aarch64/aarch64-linux.h 
b/gcc/config/aarch64/aarch64-linux.h
index 9abb252..7973268 100644
--- a/gcc/config/aarch64/aarch64-linux.h
+++ b/gcc/config/aarch64/aarch64-linux.h
@@ -49,8 +49,17 @@
%{mfix-cortex-a53-835769:--fix-cortex-a53-835769}
 #endif
 
+#ifdef TARGET_FIX_ERR_A53_843419_DEFAULT
+#define CA53_ERR_843419_SPEC \
+   %{!mno-fix-cortex-a53-843419:--fix-cortex-a53-843419}
+#else
+#define CA53_ERR_843419_SPEC \
+   %{mfix-cortex-a53-843419:--fix-cortex-a53-843419}
+#endif
+
 #define LINK_SPEC LINUX_TARGET_LINK_SPEC \
-  CA53_ERR_835769_SPEC
+  CA53_ERR_835769_SPEC \
+  CA53_ERR_843419_SPEC
 
 #define GNU_USER_TARGET_MATHFILE_SPEC \
   %{Ofast|ffast-math|funsafe-math-optimizations:crtfastmath.o%s}
diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
index f2ef124..6d72ac2 100644
--- a/gcc/config/aarch64/aarch64.opt
+++ b/gcc/config/aarch64/aarch64.opt
@@ -71,6 +71,10 @@ mfix-cortex-a53-835769
 Target Report Var(aarch64_fix_a53_err835769) Init(2)
 Workaround for ARM Cortex-A53 Erratum number 835769
 
+mfix-cortex-a53-843419
+Target Report
+Workaround for ARM Cortex-A53 Erratum number 843419
+
 mlittle-endian
 Target Report RejectNegative InverseMask(BIG_END)
 Assume target CPU is configured as little endian
diff --git a/gcc/configure b/gcc/configure
index 84f58ce..e563e94 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -923,6 +923,7 @@ enable_gnu_indirect_function
 enable_initfini_array
 enable_comdat
 enable_fix_cortex_a53_835769
+enable_fix_cortex_a53_843419
 with_glibc_version
 enable_gnu_unique_object
 enable_linker_build_id
@@ -1648,6 +1649,14 @@ Optional Features:
   disable workaround for AArch64 Cortex-A53 erratum
   835769 by default
 
+
+  --enable-fix-cortex-a53-843419
+  enable workaround for AArch64 Cortex-A53 erratum
+  843419 by default
+  --disable-fix-cortex-a53-843419
+  disable workaround for AArch64 Cortex-A53 erratum
+  843419 by default
+
   --enable-gnu-unique-object
   enable the use of the @gnu_unique_object ELF
   extension on glibc systems
@@ -18153,7 +18162,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat  conftest.$ac_ext _LT_EOF
-#line 18156 configure
+#line 18165 configure
 #include confdefs.h
 
 #if HAVE_DLFCN_H
@@ -18259,7 +18268,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat  conftest.$ac_ext _LT_EOF
-#line 18262 configure
+#line 18271 configure
 

Re: [RFA] More type narrowing in match.pd V2

2015-05-04 Thread Richard Biener
On Sat, May 2, 2015 at 2:36 AM, Jeff Law l...@redhat.com wrote:
 Here's an updated patch to add more type narrowing to match.pd.

 Changes since the last version:

 Slight refactoring of the condition by using types_match as suggested by
 Richi.  I also applied the new types_match to 2 other patterns in match.pd
 where it seemed clearly appropriate.

 Additionally the transformation is restricted by using the new single_use
 predicate.  I didn't change other patterns in match.pd to use the new
 single_use predicate.  But some probably could be changed.

 This (of course) continues to pass the bootstrap and regression check for
 x86-linux-gnu.

 There's still a ton of work to do in this space.  This is meant to be an
 incremental stand-alone improvement.

 OK now?

Ok with the {gimple,generic}-match-head.c changes mentioned in the ChangeLog.

Thanks,
Richard.
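
The arithmetic fact that makes this kind of narrowing valid can be checked exhaustively for 8-bit operands. The sketch below is not the match.pd pattern (its full condition is cut off in the patch as archived here); it only shows, with made-up helper names and assuming an 8-bit unsigned char, that doing the plus/minus in the wide type and masking gives the same value as doing it in the narrow unsigned type, as long as the mask fits in the narrow type.

```c
#include <limits.h>
#include <stdbool.h>

/* Operate in int (the widened type), then mask.  */
static unsigned
wide_op_masked (unsigned char a, unsigned char b, unsigned mask, bool minus)
{
  int wide = minus ? (int) a - (int) b : (int) a + (int) b;
  return (unsigned) wide & mask;
}

/* Operate with wraparound in unsigned char (the narrow type), then mask.  */
static unsigned
narrow_op_masked (unsigned char a, unsigned char b, unsigned mask, bool minus)
{
  unsigned char narrow = (unsigned char) (minus ? a - b : a + b);
  return narrow & mask;
}

/* True when both forms agree for every pair of operands and both ops.  */
static bool
narrowing_is_safe (unsigned mask)
{
  for (int a = 0; a <= UCHAR_MAX; a++)
    for (int b = 0; b <= UCHAR_MAX; b++)
      for (int minus = 0; minus <= 1; minus++)
        if (wide_op_masked (a, b, mask, minus)
            != narrow_op_masked (a, b, mask, minus))
          return false;
  return true;
}
```

A mask wider than the narrow type keeps carry or borrow bits that the narrow computation has already discarded, so the two forms stop agreeing; that is why the mask has to be taken into account before narrowing.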



 Jeff

 diff --git a/gcc/ChangeLog b/gcc/ChangeLog
 index e006b26..5ee89de 100644
 --- a/gcc/ChangeLog
 +++ b/gcc/ChangeLog
 @@ -1,3 +1,8 @@
 +2015-05-01  Jeff Law  l...@redhat.com
 +
 +   * match.pd (bit_and (plus/minus (convert @0) (convert @1) mask): New
 +   simplifier to narrow arithmetic.
 +
  2015-05-01  Rasmus Villemoes  r...@rasmusvillemoes.dk

 * match.pd: New simplification patterns.
 diff --git a/gcc/generic-match-head.c b/gcc/generic-match-head.c
 index daa56aa..303b237 100644
 --- a/gcc/generic-match-head.c
 +++ b/gcc/generic-match-head.c
 @@ -70,4 +70,20 @@ along with GCC; see the file COPYING3.  If not see
  #include "dumpfile.h"
  #include "generic-match.h"

 +/* Routine to determine if the types T1 and T2 are effectively
 +   the same for GENERIC.  */

 +inline bool
 +types_match (tree t1, tree t2)
 +{
 +  return TYPE_MAIN_VARIANT (t1) == TYPE_MAIN_VARIANT (t2);
 +}
 +
 +/* Return if T has a single use.  For GENERIC, we assume this is
 +   always true.  */
 +
 +inline bool
 +single_use (tree t)
 +{
 +  return true;
 +}
 diff --git a/gcc/gimple-match-head.c b/gcc/gimple-match-head.c
 index c7b2f95..dc13218 100644
 --- a/gcc/gimple-match-head.c
 +++ b/gcc/gimple-match-head.c
 @@ -861,3 +861,21 @@ do_valueize (tree (*valueize)(tree), tree op)
return op;
  }

 +/* Routine to determine if the types T1 and T2 are effectively
 +   the same for GIMPLE.  */
 +
 +inline bool
 +types_match (tree t1, tree t2)
 +{
 +  return types_compatible_p (t1, t2);
 +}
 +
 +/* Return if T has a single use.  For GIMPLE, we also allow any
 +   non-SSA_NAME (ie constants) and zero uses to cope with uses
 +   that aren't linked up yet.  */
 +
 +inline bool
 +single_use (tree t)
 +{
 +  return TREE_CODE (t) != SSA_NAME || has_zero_uses (t) || has_single_use (t);
 +}
 diff --git a/gcc/match.pd b/gcc/match.pd
 index 87ecaf1..51a950a 100644
 --- a/gcc/match.pd
 +++ b/gcc/match.pd
 @@ -289,8 +289,7 @@ along with GCC; see the file COPYING3.  If not see
    (if (((TREE_CODE (@1) == INTEGER_CST
           && INTEGRAL_TYPE_P (TREE_TYPE (@0))
           && int_fits_type_p (@1, TREE_TYPE (@0)))
 -        || (GIMPLE && types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1)))
 -        || (GENERIC && TREE_TYPE (@0) == TREE_TYPE (@1)))
 +        || types_match (TREE_TYPE (@0), TREE_TYPE (@1)))
     /* ???  This transform conflicts with fold-const.c doing
        Convert (T)(x & c) into (T)x & (T)c, if c is an integer
        constants (if x has signed type, the sign bit cannot be set
 @@ -949,8 +948,7 @@ along with GCC; see the file COPYING3.  If not see
  /* Unordered tests if either argument is a NaN.  */
  (simplify
   (bit_ior (unordered @0 @0) (unordered @1 @1))
 - (if ((GIMPLE && types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1)))
 -      || (GENERIC && TREE_TYPE (@0) == TREE_TYPE (@1)))
 + (if (types_match (TREE_TYPE (@0), TREE_TYPE (@1)))
(unordered @0 @1)))
  (simplify
   (bit_ior:c (unordered @0 @0) (unordered:c@2 @0 @1))
 @@ -1054,7 +1052,7 @@ along with GCC; see the file COPYING3.  If not see
 operation and convert the result to the desired type.  */
  (for op (plus minus)
(simplify
 -(convert (op (convert@2 @0) (convert@3 @1)))
 +(convert (op@4 (convert@2 @0) (convert@3 @1)))
  (if (INTEGRAL_TYPE_P (type)
  /* We check for type compatibility between @0 and @1 below,
 so there's no need to check that @1/@3 are integral types.  */
 @@ -1070,15 +1068,45 @@ along with GCC; see the file COPYING3.  If not see
           && TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))
           /* The inner conversion must be a widening conversion.  */
           && TYPE_PRECISION (TREE_TYPE (@2)) > TYPE_PRECISION (TREE_TYPE (@0))
 -         && ((GENERIC
 -              && (TYPE_MAIN_VARIANT (TREE_TYPE (@0))
 -                  == TYPE_MAIN_VARIANT (TREE_TYPE (@1)))
 -              && (TYPE_MAIN_VARIANT (TREE_TYPE (@0))
 -                  == TYPE_MAIN_VARIANT (type)))
 -             || (GIMPLE
 -                 && types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1))
 -                 && types_compatible_p (TREE_TYPE (@0), type
 +   

[PATCH, ARM] Fix testcases that require Thumb2 effective target.

2015-05-04 Thread Yvan Roux
Hi,

This patch fixes two ARM testcases that need a Thumb2-capable effective
target.  One is built for Cortex-M3; the other is meant to generate the
thumb2_addsi3_compare0_scratch insn.  Both fail when compiled for a
target without Thumb2, armv5t for instance.

Built and regtested.  Is it OK for trunk?

Thanks,
Yvan

2015-05-04  Yvan Roux  yvan.r...@linaro.org

* gcc.target/arm/pr65067.c: Require Thumb2 effective target.
* gcc.target/arm/pr65924.c: Likewise.
diff --git a/gcc/testsuite/gcc.target/arm/pr65067.c b/gcc/testsuite/gcc.target/arm/pr65067.c
index 9ddd7bb..05da294 100644
--- a/gcc/testsuite/gcc.target/arm/pr65067.c
+++ b/gcc/testsuite/gcc.target/arm/pr65067.c
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_thumb2_ok } */
 /* { dg-options "-mthumb -mcpu=cortex-m3 -O2" } */
 
 struct tmp {
diff --git a/gcc/testsuite/gcc.target/arm/pr65924.c b/gcc/testsuite/gcc.target/arm/pr65924.c
index 746749f..e1ad394 100644
--- a/gcc/testsuite/gcc.target/arm/pr65924.c
+++ b/gcc/testsuite/gcc.target/arm/pr65924.c
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_thumb2_ok } */
 /* { dg-options "-O2 -mthumb" } */
 
 int a, b, c;


Re: [committed, gcc-5-branch] Set DEV-PHASE to prerelease

2015-05-04 Thread Rainer Orth
Jakub Jelinek ja...@redhat.com writes:

 On Thu, Apr 23, 2015 at 04:31:52PM -0700, H.J. Lu wrote:
 Hi,
 
 I checked this patch into gcc-5-branch.

 That's wrong according to https://gcc.gnu.org/develop.html#num_scheme

HJ has a point, though: with DEV-PHASE remaining empty, all post-5.1.0
versions of gcc identify as 5.1.1, with no way of telling them apart,
like datestamp and revision.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [committed, gcc-5-branch] Set DEV-PHASE to prerelease

2015-05-04 Thread Jakub Jelinek
On Mon, May 04, 2015 at 11:13:51AM +0200, Rainer Orth wrote:
 Jakub Jelinek ja...@redhat.com writes:
 
  On Thu, Apr 23, 2015 at 04:31:52PM -0700, H.J. Lu wrote:
  Hi,
  
  I checked this patch into gcc-5-branch.
 
  That's wrong according to https://gcc.gnu.org/develop.html#num_scheme
 
 HJ has a point, though: with DEV-PHASE remaining empty, all post-5.1.0
 versions of gcc identify as 5.1.1, with no way of telling them apart,
 like datestamp and revision.

That suggests we should change
DATESTAMP_s := "\"$(if $(DEVPHASE_c), $(DATESTAMP_c))\""
so that it would expand to DATESTAMP_c also if DEVPHASE_c is empty,
but BASEVER_c does not end with .0
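
As a rough illustration only (this is not part of any patch in the thread), the rule being proposed can be written out in C.  The helper names patchlevel_of and show_datestamp are made up for this sketch; the real logic lives in make variables in gcc/Makefile.in.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return the third component of a
   "major.minor.patch" version string, or -1 if the string does not
   have that shape.  */
static int
patchlevel_of (const char *basever)
{
  const char *dot1 = strchr (basever, '.');
  if (!dot1)
    return -1;
  const char *dot2 = strchr (dot1 + 1, '.');
  if (!dot2 || dot2[1] == '\0')
    return -1;
  char *end;
  long v = strtol (dot2 + 1, &end, 10);
  if (*end != '\0' || v < 0)
    return -1;
  return (int) v;
}

/* Hypothetical predicate: the datestamp is shown when a development
   phase is set, or when the version does not end in ".0".  */
static bool
show_datestamp (const char *basever, const char *devphase)
{
  return devphase[0] != '\0' || patchlevel_of (basever) > 0;
}
```

Under these assumptions, show_datestamp ("5.1.0", "") is false (a release), while show_datestamp ("5.1.1", "") is true, which is the case Rainer was asking about.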

Jakub


Re: [PATCH, x86] Add TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE hook

2015-05-04 Thread Christian Bruel

 Hi Christian,
 I noticed case gcc.dg/ipa/iinline-attr.c failed on aarch64.  The
 original patch is x86 specific, while the case is added as general
 one.  Could you please have a look at this?
 
 FAIL: gcc.dg/ipa/iinline-attr.c scan-ipa-dump inline
 hooray[^\\n]*inline copy in test
 

That is the same latent bug on aarch64: alignment flags are not
propagated with attribute optimize (O2).

I'm testing the attached patch.

Christian


Index: config/aarch64/aarch64.c
===
--- config/aarch64/aarch64.c	(revision 222627)
+++ config/aarch64/aarch64.c	(working copy)
@@ -6908,18 +6908,6 @@
 #endif
 }
 
-  /* If not opzimizing for size, set the default
-     alignment to what the target wants */
-  if (!optimize_size)
-    {
-      if (align_loops <= 0)
-        align_loops = aarch64_tune_params->loop_align;
-      if (align_jumps <= 0)
-        align_jumps = aarch64_tune_params->jump_align;
-      if (align_functions <= 0)
-        align_functions = aarch64_tune_params->function_align;
-    }
-
   if (AARCH64_TUNE_FMA_STEERING)
 aarch64_register_fma_steering ();
 
@@ -6935,6 +6923,18 @@
 flag_omit_leaf_frame_pointer = false;
   else if (flag_omit_leaf_frame_pointer)
 flag_omit_frame_pointer = true;
+
+  /* If not opzimizing for size, set the default
+     alignment to what the target wants */
+  if (!optimize_size)
+    {
+      if (align_loops <= 0)
+        align_loops = aarch64_tune_params->loop_align;
+      if (align_jumps <= 0)
+        align_jumps = aarch64_tune_params->jump_align;
+      if (align_functions <= 0)
+        align_functions = aarch64_tune_params->function_align;
+    }
 }
 
 static struct machine_function *


Re: [committed, gcc-5-branch] Set DEV-PHASE to prerelease

2015-05-04 Thread Richard Biener
On Mon, 4 May 2015, Jakub Jelinek wrote:

 On Mon, May 04, 2015 at 11:13:51AM +0200, Rainer Orth wrote:
  Jakub Jelinek ja...@redhat.com writes:
  
   On Thu, Apr 23, 2015 at 04:31:52PM -0700, H.J. Lu wrote:
   Hi,
   
   I checked this patch into gcc-5-branch.
  
   That's wrong according to https://gcc.gnu.org/develop.html#num_scheme
  
  HJ has a point, though: with DEV-PHASE remaining empty, all post-5.1.0
  versions of gcc identify as 5.1.1, with no way of telling them apart,
  like datestamp and revision.
 
 That suggests we should change
 DATESTAMP_s := "\"$(if $(DEVPHASE_c), $(DATESTAMP_c))\""
 so that it would expand to DATESTAMP_c also if DEVPHASE_c is empty,
 but BASEVER_c does not end with .0

Yes.

Richard.

-- 
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


Re: [patch] Perform anonymous constant propagation during inlining

2015-05-04 Thread Eric Botcazou
 Hum, the fact that your earlier version created wrong code
 (get_gimple_for_ssa_name
 already returned false here) points at some issues with
 EXPAND_INITIALIZER as well, no...?

Theoretically yes but, in practice, EXPAND_INITIALIZER is used in varasm.c and 
for debugging stuff only, so I don't think that's a real concern.

 That said, the path you add is certainly safe (though maybe we want to
 change get_gimple_for_ssa_name to return tcc_constant single-use defs even
 if TER is disabled
 (thus at -O0 - and only at -O0, otherwise it shouldn't happen).  That
 would cover
 more cases of get_gimple_for_ssa_name uses (I can see
 optimize_bitfield_expansion
 for example...)

optimize_bitfield_assignment_op is only interested in loads from bitfields 
though.  The get_gimple_for_ssa_name route would be interesting to bypass the 
stmt_is_replaceable_p test, i.e. to bypass the single-use test, but this could 
be counter-productive at -O0 so I'm not sure it's worth the trouble.

-- 
Eric Botcazou


[PATCH, AArch64] [4.8] Backport PR64304 fix (miscompilation with -mgeneral-regs-only )

2015-05-04 Thread Chen Shanyao
As you suggested, I have split the backports of PR64304 into two
emails; this one is for the 4.8 branch.
This patch backports the fix for PR target/64304, a miscompilation with
-mgeneral-regs-only, from trunk r219844 to the 4.8 branch.  Tested on
x86_64 using qemu for aarch64.

OK for 4.8?


diff -rupN gcc-4.8-20150226/gcc/ChangeLog gcc-4.8-20150226.pr64304//gcc/ChangeLog
--- gcc-4.8-20150226/gcc/ChangeLog  2015-03-04 21:13:46.0 -0500
+++ gcc-4.8-20150226.pr64304//gcc/ChangeLog  2015-03-04 21:19:49.0 -0500
@@ -1,3 +1,13 @@
+2015-03-05  Shanyao Chen  chenshan...@huawei.com
+
+        Backported from mainline
+        2015-01-19  Jiong Wang  jiong.w...@arm.com
+                    Andrew Pinski  apin...@cavium.com
+
+        PR target/64304
+        * config/aarch64/aarch64.md (define_insn "*ashl<mode>3_insn"): Deleted.
+        (ashl<mode>3): Don't expand if operands[2] is not constant.
+
 2015-02-26  Peter Bergner  berg...@vnet.ibm.com

         Backport from mainline
diff -rupN gcc-4.8-20150226/gcc/config/aarch64/aarch64.md gcc-4.8-20150226.pr64304//gcc/config/aarch64/aarch64.md
--- gcc-4.8-20150226/gcc/config/aarch64/aarch64.md  2015-03-04 21:14:29.0 -0500
+++ gcc-4.8-20150226.pr64304//gcc/config/aarch64/aarch64.md  2015-03-04 21:21:54.0 -0500
@@ -2612,6 +2612,8 @@
             DONE;
           }
       }
+    else
+      FAIL;
   }
 )

@@ -2681,16 +2683,6 @@
    (set_attr "mode" "SI")]
 )

-(define_insn "*ashl<mode>3_insn"
-  [(set (match_operand:SHORT 0 "register_operand" "=r")
-        (ashift:SHORT (match_operand:SHORT 1 "register_operand" "r")
-                      (match_operand:QI 2 "aarch64_reg_or_shift_imm_si" "rUss")))]
-  ""
-  "lsl\\t%w0, %w1, %w2"
-  [(set_attr "v8type" "shift")
-   (set_attr "mode" "<MODE>")]
-)
-
 (define_insn "*<optab><mode>3_insn"
   [(set (match_operand:SHORT 0 "register_operand" "=r")
         (ASHIFT:SHORT (match_operand:SHORT 1 "register_operand" "r")
diff -rupN gcc-4.8-20150226/gcc/testsuite/ChangeLog gcc-4.8-20150226.pr64304//gcc/testsuite/ChangeLog
--- gcc-4.8-20150226/gcc/testsuite/ChangeLog  2015-03-04 21:16:54.0 -0500
+++ gcc-4.8-20150226.pr64304//gcc/testsuite/ChangeLog  2015-03-04 21:22:58.0 -0500
@@ -1,3 +1,10 @@
+2015-03-05  Shanyao chen  chenshan...@huawei.com
+
+        Backported from mainline
+        2015-01-19  Jiong Wang  jiong.w...@arm.com
+
+        * gcc.target/aarch64/pr64304.c: New testcase.
+
 2015-02-26  Peter Bergner  berg...@vnet.ibm.com

         Backport from mainline
diff -rupN gcc-4.8-20150226/gcc/testsuite/gcc.target/aarch64/pr64304.c gcc-4.8-20150226.pr64304//gcc/testsuite/gcc.target/aarch64/pr64304.c
--- gcc-4.8-20150226/gcc/testsuite/gcc.target/aarch64/pr64304.c  1969-12-31 19:00:00.0 -0500
+++ gcc-4.8-20150226.pr64304//gcc/testsuite/gcc.target/aarch64/pr64304.c  2015-03-04 21:12:15.0 -0500
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 --save-temps" } */
+
+unsigned char byte = 0;
+
+void
+set_bit (unsigned int bit, unsigned char value)
+{
+  unsigned char mask = (unsigned char) (1 << (bit & 7));
+
+  if (! value)
+    byte &= (unsigned char)~mask;
+  else
+    byte |= mask;
+  /* { dg-final { scan-assembler "and\tw\[0-9\]+, w\[0-9\]+, 7" } } */
+}
+
+/* { dg-final { cleanup-saved-temps } } */


Re: [committed, gcc-5-branch] Set DEV-PHASE to prerelease

2015-05-04 Thread Jakub Jelinek
On Mon, May 04, 2015 at 11:31:11AM +0200, Richard Biener wrote:
 On Mon, 4 May 2015, Jakub Jelinek wrote:
 
  On Mon, May 04, 2015 at 11:13:51AM +0200, Rainer Orth wrote:
   Jakub Jelinek ja...@redhat.com writes:
   
On Thu, Apr 23, 2015 at 04:31:52PM -0700, H.J. Lu wrote:
Hi,

I checked this patch into gcc-5-branch.
   
That's wrong according to https://gcc.gnu.org/develop.html#num_scheme
   
   HJ has a point, though: with DEV-PHASE remaining empty, all post-5.1.0
   versions of gcc identify as 5.1.1, with no way of telling them apart,
   like datestamp and revision.
  
  That suggests we should change
  DATESTAMP_s := "\"$(if $(DEVPHASE_c), $(DATESTAMP_c))\""
  so that it would expand to DATESTAMP_c also if DEVPHASE_c is empty,
  but BASEVER_c does not end with .0
 
 Yes.

Here is a patch to do that, ok for trunk/5?

2015-05-04  Jakub Jelinek  ja...@redhat.com

* Makefile.in (PATCHLEVEL_c): New variable.
(DATESTAMP_s, REVISION_s): If PATCHLEVEL_c is not 0,
expand the same way as if DEVPHASE_c was non-empty.

--- gcc/Makefile.in.jj  2015-04-12 21:50:12.0 +0200
+++ gcc/Makefile.in 2015-05-04 12:03:03.394797230 +0200
@@ -828,14 +828,20 @@ endif
 
 version := $(BASEVER_c)
 
+PATCHLEVEL_c := \
+  $(shell echo $(BASEVER_c) | sed -e 's/^[0-9]*\.[0-9]*\.\([0-9]*\)$$/\1/')
+
+
 # For use in version.c - double quoted strings, with appropriate
 # surrounding punctuation and spaces, and with the datestamp and
 # development phase collapsed to the empty string in release mode
-# (i.e. if DEVPHASE_c is empty).  The space immediately after the
-# comma in the $(if ...) constructs is significant - do not remove it.
+# (i.e. if DEVPHASE_c is empty and PATCHLEVEL_c is 0).  The space
+# immediately after the comma in the $(if ...) constructs is
+# significant - do not remove it.
 BASEVER_s   := "\"$(BASEVER_c)\""
 DEVPHASE_s  := "\"$(if $(DEVPHASE_c), ($(DEVPHASE_c)))\""
-DATESTAMP_s := "\"$(if $(DEVPHASE_c), $(DATESTAMP_c))\""
+DATESTAMP_s := \
+  "\"$(if $(DEVPHASE_c)$(filter-out 0,$(PATCHLEVEL_c)), $(DATESTAMP_c))\""
 PKGVERSION_s:= "\"@PKGVERSION@\""
 BUGURL_s    := "\"@REPORT_BUGS_TO@\""
 
@@ -843,7 +849,8 @@ PKGVERSION  := @PKGVERSION@
 BUGURL_TEXI := @REPORT_BUGS_TEXI@
 
 ifdef REVISION_c
-REVISION_s  := "\"$(if $(DEVPHASE_c), $(REVISION_c))\""
+REVISION_s  := \
+  "\"$(if $(DEVPHASE_c)$(filter-out 0,$(PATCHLEVEL_c)), $(REVISION_c))\""
 else
 REVISION_s  := "\"\""
 endif


Jakub


Re: [committed, gcc-5-branch] Set DEV-PHASE to prerelease

2015-05-04 Thread Richard Biener
On Mon, 4 May 2015, Jakub Jelinek wrote:

 On Mon, May 04, 2015 at 11:31:11AM +0200, Richard Biener wrote:
  On Mon, 4 May 2015, Jakub Jelinek wrote:
  
   On Mon, May 04, 2015 at 11:13:51AM +0200, Rainer Orth wrote:
Jakub Jelinek ja...@redhat.com writes:

 On Thu, Apr 23, 2015 at 04:31:52PM -0700, H.J. Lu wrote:
 Hi,
 
 I checked this patch into gcc-5-branch.

 That's wrong according to https://gcc.gnu.org/develop.html#num_scheme

HJ has a point, though: with DEV-PHASE remaining empty, all post-5.1.0
versions of gcc identify as 5.1.1, with no way of telling them apart,
like datestamp and revision.
   
   That suggests we should change
   DATESTAMP_s := "\"$(if $(DEVPHASE_c), $(DATESTAMP_c))\""
   so that it would expand to DATESTAMP_c also if DEVPHASE_c is empty,
   but BASEVER_c does not end with .0
  
  Yes.
 
 Here is a patch to do that, ok for trunk/5?

Looks good to me.

Thanks,
Richard.

 2015-05-04  Jakub Jelinek  ja...@redhat.com
 
   * Makefile.in (PATCHLEVEL_c): New variable.
   (DATESTAMP_s, REVISION_s): If PATCHLEVEL_c is not 0,
   expand the same way as if DEVPHASE_c was non-empty.
 
 --- gcc/Makefile.in.jj2015-04-12 21:50:12.0 +0200
 +++ gcc/Makefile.in   2015-05-04 12:03:03.394797230 +0200
 @@ -828,14 +828,20 @@ endif
  
  version := $(BASEVER_c)
  
 +PATCHLEVEL_c := \
 +  $(shell echo $(BASEVER_c) | sed -e 's/^[0-9]*\.[0-9]*\.\([0-9]*\)$$/\1/')
 +
 +
  # For use in version.c - double quoted strings, with appropriate
  # surrounding punctuation and spaces, and with the datestamp and
  # development phase collapsed to the empty string in release mode
 -# (i.e. if DEVPHASE_c is empty).  The space immediately after the
 -# comma in the $(if ...) constructs is significant - do not remove it.
 +# (i.e. if DEVPHASE_c is empty and PATCHLEVEL_c is 0).  The space
 +# immediately after the comma in the $(if ...) constructs is
 +# significant - do not remove it.
  BASEVER_s   := "\"$(BASEVER_c)\""
  DEVPHASE_s  := "\"$(if $(DEVPHASE_c), ($(DEVPHASE_c)))\""
 -DATESTAMP_s := "\"$(if $(DEVPHASE_c), $(DATESTAMP_c))\""
 +DATESTAMP_s := \
 +  "\"$(if $(DEVPHASE_c)$(filter-out 0,$(PATCHLEVEL_c)), $(DATESTAMP_c))\""
  PKGVERSION_s:= "\"@PKGVERSION@\""
  BUGURL_s    := "\"@REPORT_BUGS_TO@\""
  
 @@ -843,7 +849,8 @@ PKGVERSION  := @PKGVERSION@
  BUGURL_TEXI := @REPORT_BUGS_TEXI@
  
  ifdef REVISION_c
 -REVISION_s  := "\"$(if $(DEVPHASE_c), $(REVISION_c))\""
 +REVISION_s  := \
 +  "\"$(if $(DEVPHASE_c)$(filter-out 0,$(PATCHLEVEL_c)), $(REVISION_c))\""
  else
  REVISION_s  := "\"\""
  endif
 
 
   Jakub
 
 

-- 
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


[PATCH, AArch64] [4.9] Backport PR64304 fix (miscompilation with -mgeneral-regs-only )

2015-05-04 Thread Chen Shanyao
As you suggested, I have split the backports of PR64304 into two
emails; this one is for the 4.9 branch.
This patch backports the fix for PR target/64304, a miscompilation with
-mgeneral-regs-only, from trunk r219844 to the 4.9 branch.  Tested on
x86_64 using qemu for aarch64.

OK for 4.9?

diff -rupN gcc-4.9-20150225/gcc/ChangeLog gcc-4.9-20150225.pr64304//gcc/ChangeLog
--- gcc-4.9-20150225/gcc/ChangeLog  2015-03-04 20:48:30.0 -0500
+++ gcc-4.9-20150225.pr64304//gcc/ChangeLog  2015-03-04 20:55:59.0 -0500
@@ -1,3 +1,13 @@
+2015-03-05  Shanyao Chen  chenshan...@huawei.com
+
+        Backported from mainline
+        2015-01-19  Jiong Wang  jiong.w...@arm.com
+                    Andrew Pinski  apin...@cavium.com
+
+        PR target/64304
+        * config/aarch64/aarch64.md (define_insn "*ashl<mode>3_insn"): Deleted.
+        (ashl<mode>3): Don't expand if operands[2] is not constant.
+
 2015-02-25  Kai Tietz  kti...@redhat.com

         PR tree-optimization/61917
diff -rupN gcc-4.9-20150225/gcc/config/aarch64/aarch64.md gcc-4.9-20150225.pr64304//gcc/config/aarch64/aarch64.md
--- gcc-4.9-20150225/gcc/config/aarch64/aarch64.md  2015-03-04 20:41:03.0 -0500
+++ gcc-4.9-20150225.pr64304//gcc/config/aarch64/aarch64.md  2015-03-04 20:46:44.0 -0500
@@ -2719,6 +2719,8 @@
             DONE;
           }
       }
+    else
+      FAIL;
   }
 )

@@ -2947,15 +2949,6 @@
   [(set_attr "type" "shift_reg")]
 )

-(define_insn "*ashl<mode>3_insn"
-  [(set (match_operand:SHORT 0 "register_operand" "=r")
-        (ashift:SHORT (match_operand:SHORT 1 "register_operand" "r")
-                      (match_operand:QI 2 "aarch64_reg_or_shift_imm_si" "rUss")))]
-  ""
-  "lsl\\t%w0, %w1, %w2"
-  [(set_attr "type" "shift_reg")]
-)
-
 (define_insn "*<optab><mode>3_insn"
   [(set (match_operand:SHORT 0 "register_operand" "=r")
         (ASHIFT:SHORT (match_operand:SHORT 1 "register_operand" "r")
diff -rupN gcc-4.9-20150225/gcc/testsuite/ChangeLog gcc-4.9-20150225.pr64304//gcc/testsuite/ChangeLog
--- gcc-4.9-20150225/gcc/testsuite/ChangeLog  2015-03-04 21:00:24.0 -0500
+++ gcc-4.9-20150225.pr64304//gcc/testsuite/ChangeLog  2015-03-04 21:03:21.0 -0500
@@ -1,3 +1,10 @@
+2015-03-05  Shanyao chen  chenshan...@huawei.com
+
+        Backported from mainline
+        2015-01-19  Jiong Wang  jiong.w...@arm.com
+
+        * gcc.target/aarch64/pr64304.c: New testcase.
+
 2015-02-25  Kai Tietz  kti...@redhat.com

         Backported from mainline
diff -rupN gcc-4.9-20150225/gcc/testsuite/gcc.target/aarch64/pr64304.c gcc-4.9-20150225.pr64304//gcc/testsuite/gcc.target/aarch64/pr64304.c
--- gcc-4.9-20150225/gcc/testsuite/gcc.target/aarch64/pr64304.c  1969-12-31 19:00:00.0 -0500
+++ gcc-4.9-20150225.pr64304//gcc/testsuite/gcc.target/aarch64/pr64304.c  2015-03-04 20:59:24.0 -0500
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 --save-temps" } */
+
+unsigned char byte = 0;
+
+void
+set_bit (unsigned int bit, unsigned char value)
+{
+  unsigned char mask = (unsigned char) (1 << (bit & 7));
+
+  if (! value)
+    byte &= (unsigned char)~mask;
+  else
+    byte |= mask;
+  /* { dg-final { scan-assembler "and\tw\[0-9\]+, w\[0-9\]+, 7" } } */
+}
+
+/* { dg-final { cleanup-saved-temps } } */


[PATCH] Fix PR65965

2015-05-04 Thread Richard Biener

We don't support vectorizing group stores with gaps - so the natural
thing is to just split groups at such boundaries which enables
more BB vectorization (and likely loop vectorization as well, though
that would be some weird cases I suspect).
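
To make the idea concrete, here is a minimal standalone sketch, not the vectorizer's actual code.  It assumes the store offsets are already sorted and share one base; split_store_groups and its parameters are names invented for this illustration.

```c
#include <stddef.h>

/* Hypothetical sketch: OFFSETS holds N byte offsets of stores to the
   same base, sorted ascending; ELEM_SIZE is the size of each stored
   element.  Record the start index of every gap-free group in STARTS
   (which must have room for N entries) and return the group count.  */
static size_t
split_store_groups (const long *offsets, size_t n, long elem_size,
                    size_t *starts)
{
  size_t ngroups = 0;
  for (size_t i = 0; i < n; i++)
    /* A new group begins at the first store and after every gap,
       i.e. whenever a store is not adjacent to the previous one.  */
    if (i == 0 || offsets[i] - offsets[i - 1] != elem_size)
      starts[ngroups++] = i;
  return ngroups;
}
```

For the stores in the new testcase (a[0..3] and a[5..8], assuming 4-byte int), the offsets 0, 4, 8, 12, 20, 24, 28, 32 split into two groups of four, which is what the test's expected count of two SLP vectorizations relies on.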

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2015-05-04  Richard Biener  rguent...@suse.de

PR tree-optimization/65965
* tree-vect-data-refs.c (vect_analyze_data_ref_accesses): Split
store groups at gaps.

* gcc.dg/vect/bb-slp-33.c: New testcase.

Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   (revision 222758)
+++ gcc/tree-vect-data-refs.c   (working copy)
@@ -2602,6 +2602,15 @@ vect_analyze_data_ref_accesses (loop_vec
  if ((init_b - init_a) % type_size_a != 0)
break;
 
+ /* If we have a store, the accesses are adjacent.  This splits
+groups into chunks we support (we don't support vectorization
+of stores with gaps).  */
+         if (!DR_IS_READ (dra)
+             && (((unsigned HOST_WIDE_INT)init_b
+                  - TREE_INT_CST_LOW (DR_INIT (datarefs_copy[i-1])))
+                 != type_size_a))
+           break;
+
  /* The step (if not zero) is greater than the difference between
 data-refs' inits.  This splits groups into suitable sizes.  */
  HOST_WIDE_INT step = tree_to_shwi (DR_STEP (dra));
Index: gcc/testsuite/gcc.dg/vect/bb-slp-33.c
===
--- gcc/testsuite/gcc.dg/vect/bb-slp-33.c   (revision 0)
+++ gcc/testsuite/gcc.dg/vect/bb-slp-33.c   (working copy)
@@ -0,0 +1,49 @@
+/* { dg-require-effective-target vect_int } */
+
+#include "tree-vect.h"
+
+extern void abort (void);
+
+void __attribute__((noinline,noclone))
+test(int *__restrict__ a, int *__restrict__ b)
+{
+  a[0] = b[0];
+  a[1] = b[1];
+  a[2] = b[2];
+  a[3] = b[3];
+  a[5] = 0;
+  a[6] = 0;
+  a[7] = 0;
+  a[8] = 0;
+}
+
+int main()
+{
+  int a[9];
+  int b[4];
+  b[0] = 1;
+  __asm__ volatile ("");
+  b[1] = 2;
+  __asm__ volatile ("");
+  b[2] = 3;
+  __asm__ volatile ("");
+  b[3] = 4;
+  __asm__ volatile ("");
+  a[4] = 7;
+  check_vect ();
+  test(a, b);
+  if (a[0] != 1
+  || a[1] != 2
+  || a[2] != 3
+  || a[3] != 4
+  || a[4] != 7
+  || a[5] != 0
+  || a[6] != 0
+  || a[7] != 0
+  || a[8] != 0)
+abort ();
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "slp2" { target { vect_element_align || vect_hw_misalign } } } } */
+/* { dg-final { cleanup-tree-dump "slp2" } } */


[PATCH] Fix PR65935

2015-05-04 Thread Richard Biener

The following fixes PR65935 where the vectorizer is confused after
SLP operands swapping to see the stmts in the IL with unswapped
operands.  As we already swap for different def-kinds just swap
for other swaps as well.
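
A toy model of the fix, assuming nothing about GCC's real data structures: struct binop and apply_operand_swaps are invented for this sketch, and the operation is assumed to be commutative so that exchanging operands does not change the result.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a two-operand statement in the IL.  */
struct binop
{
  int rhs1;
  int rhs2;
};

/* MATCHES[i] is false when the analysis treated statement I as having
   its operands swapped.  Apply that swap to the statements themselves,
   so that later code indexing operands by position sees the same
   order the analysis assumed.  */
static void
apply_operand_swaps (struct binop *stmts, const bool *matches, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (!matches[i])
      {
        int tmp = stmts[i].rhs1;
        stmts[i].rhs1 = stmts[i].rhs2;
        stmts[i].rhs2 = tmp;
      }
}
```

The point of the sketch is only the ordering: the swap is recorded during analysis and written back once the retry has succeeded, instead of leaving the statements in their original order.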

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2015-05-04  Richard Biener  rguent...@suse.de

PR tree-optimization/65935
* tree-vect-slp.c (vect_build_slp_tree): If we swapped operands
then make sure to apply that swapping to the IL.

* gcc.dg/vect/pr65935.c: New testcase.

Index: gcc/tree-vect-slp.c
===
*** gcc/tree-vect-slp.c (revision 222758)
--- gcc/tree-vect-slp.c (working copy)
*************** vect_build_slp_tree (loop_vec_info loop_
*** 1081,1093 ****
                      dump_printf (MSG_NOTE, "%d ", j);
                    }
                dump_printf (MSG_NOTE, "\n");
!               /* And try again ... */
                if (vect_build_slp_tree (loop_vinfo, bb_vinfo, &child,
                                         group_size, max_nunits, loads,
                                         vectorization_factor,
!                                        matches, npermutes, &this_tree_size,
                                         max_tree_size))
                  {
                    oprnd_info->def_stmts = vNULL;
                    SLP_TREE_CHILDREN (*node).quick_push (child);
                    continue;
--- 1081,1105 ----
                      dump_printf (MSG_NOTE, "%d ", j);
                    }
                dump_printf (MSG_NOTE, "\n");
!               /* And try again with scratch 'matches' ... */
!               bool *tem = XALLOCAVEC (bool, group_size);
                if (vect_build_slp_tree (loop_vinfo, bb_vinfo, &child,
                                         group_size, max_nunits, loads,
                                         vectorization_factor,
!                                        tem, npermutes, &this_tree_size,
                                         max_tree_size))
                  {
+                   /* ... so if successful we can apply the operand swapping
+                      to the GIMPLE IL.  This is necessary because for example
+                      vect_get_slp_defs uses operand indexes and thus expects
+                      canonical operand order.  */
+                   for (j = 0; j < group_size; ++j)
+                     if (!matches[j])
+                       {
+                         gimple stmt = SLP_TREE_SCALAR_STMTS (*node)[j];
+                         swap_ssa_operands (stmt, gimple_assign_rhs1_ptr (stmt),
+                                            gimple_assign_rhs2_ptr (stmt));
+                       }
                    oprnd_info->def_stmts = vNULL;
                    SLP_TREE_CHILDREN (*node).quick_push (child);
                    continue;
Index: gcc/testsuite/gcc.dg/vect/pr65935.c
===
*** gcc/testsuite/gcc.dg/vect/pr65935.c (revision 0)
--- gcc/testsuite/gcc.dg/vect/pr65935.c (working copy)
***************
*** 0 ****
--- 1,63 ----
+ /* { dg-do run } */
+ /* { dg-additional-options "-O3" } */
+ /* { dg-require-effective-target vect_double } */
+ 
+ #include "tree-vect.h"
+ 
+ extern void abort (void);
+ extern void *malloc (__SIZE_TYPE__);
+ 
+ struct site {
+     struct {
+         struct {
+             double real;
+             double imag;
+         } e[3][3];
+     } link[32];
+     double phase[32];
+ } *lattice;
+ int sites_on_node;
+ 
+ void rephase (void)
+ {
+   int i,j,k,dir;
+   struct site *s;
+   for(i=0,s=lattice;i<sites_on_node;i++,s++)
+     for(dir=0;dir<32;dir++)
+       for(j=0;j<3;j++)for(k=0;k<3;k++)
+         {
+           s->link[dir].e[j][k].real *= s->phase[dir];
+           s->link[dir].e[j][k].imag *= s->phase[dir];
+         }
+ }
+ 
+ int main()
+ {
+   int i,j,k;
+   check_vect ();
+   sites_on_node = 1;
+   lattice = malloc (sizeof (struct site) * sites_on_node);
+   for (i = 0; i < 32; ++i)
+     {
+       lattice->phase[i] = i;
+       for (j = 0; j < 3; ++j)
+         for (k = 0; k < 3; ++k)
+           {
+             lattice->link[i].e[j][k].real = 1.0;
+             lattice->link[i].e[j][k].imag = 1.0;
+             __asm__ volatile ("" : : : "memory");
+           }
+     }
+   rephase ();
+   for (i = 0; i < 32; ++i)
+     for (j = 0; j < 3; ++j)
+       for (k = 0; k < 3; ++k)
+         if (lattice->link[i].e[j][k].real != i
+             || lattice->link[i].e[j][k].imag != i)
+           abort ();
+   return 0;
+ }
+ 
+ /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "slp1" } } */
+ /* { dg-final { cleanup-tree-dump "slp1" } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */


Re: [PATCH] Fix PR65935

2015-05-04 Thread H.J. Lu
On Mon, May 4, 2015 at 4:15 AM, Richard Biener rguent...@suse.de wrote:

 The following fixes PR65935 where the vectorizer is confused after
 SLP operands swapping to see the stmts in the IL with unswapped
 operands.  As we already swap for different def-kinds just swap
 for other swaps as well.

 Bootstrap and regtest running on x86_64-unknown-linux-gnu.

 Richard.

 2015-05-04  Richard Biener  rguent...@suse.de

 PR tree-optimization/65935
 * tree-vect-slp.c (vect_build_slp_tree): If we swapped operands
 then make sure to apply that swapping to the IL.

 * gcc.dg/vect/pr65935.c: New testcase.

 Index: gcc/tree-vect-slp.c
 ===
 *** gcc/tree-vect-slp.c (revision 222758)
 --- gcc/tree-vect-slp.c (working copy)
 *************** vect_build_slp_tree (loop_vec_info loop_
 *** 1081,1093 ****
                       dump_printf (MSG_NOTE, "%d ", j);
                     }
                 dump_printf (MSG_NOTE, "\n");
 !               /* And try again ... */
                 if (vect_build_slp_tree (loop_vinfo, bb_vinfo, &child,
                                          group_size, max_nunits, loads,
                                          vectorization_factor,
 !                                        matches, npermutes, &this_tree_size,
                                          max_tree_size))
                   {
                     oprnd_info->def_stmts = vNULL;
                     SLP_TREE_CHILDREN (*node).quick_push (child);
                     continue;
 --- 1081,1105 ----
                       dump_printf (MSG_NOTE, "%d ", j);
                     }
                 dump_printf (MSG_NOTE, "\n");
 !               /* And try again with scratch 'matches' ... */
 !               bool *tem = XALLOCAVEC (bool, group_size);
                 if (vect_build_slp_tree (loop_vinfo, bb_vinfo, &child,
                                          group_size, max_nunits, loads,
                                          vectorization_factor,
 !                                        tem, npermutes, &this_tree_size,
                                          max_tree_size))
                   {
 +                   /* ... so if successful we can apply the operand swapping
 +                      to the GIMPLE IL.  This is necessary because for example
 +                      vect_get_slp_defs uses operand indexes and thus expects
 +                      canonical operand order.  */
 +                   for (j = 0; j < group_size; ++j)
 +                     if (!matches[j])
 +                       {
 +                         gimple stmt = SLP_TREE_SCALAR_STMTS (*node)[j];
 +                         swap_ssa_operands (stmt, gimple_assign_rhs1_ptr (stmt),
 +                                            gimple_assign_rhs2_ptr (stmt));
 +                       }
                     oprnd_info->def_stmts = vNULL;
                     SLP_TREE_CHILDREN (*node).quick_push (child);
                     continue;
 Index: gcc/testsuite/gcc.dg/vect/pr65935.c
 ===
 *** gcc/testsuite/gcc.dg/vect/pr65935.c (revision 0)
 --- gcc/testsuite/gcc.dg/vect/pr65935.c (working copy)
 ***************
 *** 0 ****
 --- 1,63 ----
 + /* { dg-do run } */
 + /* { dg-additional-options "-O3" } */
 + /* { dg-require-effective-target vect_double } */
 +
 + #include "tree-vect.h"
 +
 + extern void abort (void);
 + extern void *malloc (__SIZE_TYPE__);
 +
 + struct site {
 +     struct {
 +         struct {
 +             double real;
 +             double imag;
 +         } e[3][3];
 +     } link[32];
 +     double phase[32];
 + } *lattice;
 + int sites_on_node;
 +
 + void rephase (void)
 + {
 +   int i,j,k,dir;
 +   struct site *s;
 +   for(i=0,s=lattice;i<sites_on_node;i++,s++)
 +     for(dir=0;dir<32;dir++)
 +       for(j=0;j<3;j++)for(k=0;k<3;k++)
 +         {
 +           s->link[dir].e[j][k].real *= s->phase[dir];
 +           s->link[dir].e[j][k].imag *= s->phase[dir];
 +         }
 + }
 +
 + int main()
 + {
 +   int i,j,k;
 +   check_vect ();
 +   sites_on_node = 1;
 +   lattice = malloc (sizeof (struct site) * sites_on_node);
 +   for (i = 0; i < 32; ++i)
 +     {
 +       lattice->phase[i] = i;
 +       for (j = 0; j < 3; ++j)
 +         for (k = 0; k < 3; ++k)
 +           {
 +             lattice->link[i].e[j][k].real = 1.0;
 +             lattice->link[i].e[j][k].imag = 1.0;
 +             __asm__ volatile ("" : : : "memory");
 +           }
 +     }
 +   rephase ();
 +   for (i = 0; i < 32; ++i)
 +     for (j = 0; j < 3; ++j)
 +       for (k = 0; k < 3; ++k)
 +         if (lattice->link[i].e[j][k].real != i
 +             || lattice->link[i].e[j][k].imag != i)
 +           abort ();
 +   return 0;
 + }
 +
 + /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "slp1" } } */
 + /* { dg-final { cleanup-tree-dump "slp1" } } */
 + /* { dg-final { cleanup-tree-dump "vect" } } */

Need for these when it is a run-time test.

-- 
H.J.


[PATCH] Remove dead code.

2015-05-04 Thread Dominik Vogt
This patch removes a write-only variable from the C++ code.

ChangeLog:

--

2015-05-04  Dominik Vogt  v...@linux.vnet.ibm.com

* call.c (print_z_candidates): Remove dead code.

--

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
From 6943ad84a5a5b69c7cf5df1ea5bb6ab5fd254825 Mon Sep 17 00:00:00 2001
From: Dominik Vogt v...@linux.vnet.ibm.com
Date: Mon, 4 May 2015 12:46:21 +0100
Subject: [PATCH] Remove dead code.

---
 gcc/cp/call.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 31d2b9c..55350f8 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -3436,7 +3436,6 @@ print_z_candidates (location_t loc, struct z_candidate *candidates)
 {
   struct z_candidate *cand1;
   struct z_candidate **cand2;
-  int n_candidates;
 
   if (!candidates)
 return;
@@ -3478,9 +3477,6 @@ print_z_candidates (location_t loc, struct z_candidate *candidates)
 	}
 }
 
-  for (n_candidates = 0, cand1 = candidates; cand1; cand1 = cand1->next)
-    n_candidates++;
-
   for (; candidates; candidates = candidates->next)
     print_z_candidate (loc, "candidate:", candidates);
 }
-- 
2.3.0



Re: [rs6000] Fix compare debug failure on AIX

2015-05-04 Thread Tristan Gingold

 On 04 May 2015, at 02:32, David Edelsohn dje@gmail.com wrote:
 
 On Sat, May 2, 2015 at 6:04 AM, Eric Botcazou ebotca...@adacore.com wrote:
 Why should GCC unnecessarily create stack frames to avoid
 compare-debug testcase failures?
 
 I'm not sure I understand the question... compare-debug failures are failures
 (-g is not supposed to change the generated code and this XCOFF-specific bug
 was reported to us) so they need to be fixed.
 
 From there on, as Alan said, there are 2 cases: either AIX needs a frame for
 debugging or it doesn't.  If the latter, then the lines can simply be 
 deleted.
 If the former, we have to draw a line somewhere; Alan suggests always 
 creating
 a frame while I suggest creating it only at -O0 and -Og.
 
 I believe that AIX does need a frame for debugging.  I don't remember
 the exact reason off hand.
 
 I'm sorry that XCOFF debugging changes the generated code (only in the
 sense of allocating a frame), but that is a system dependency.  It's
 been this way for over 20 years.  I see no reason to produce worse
 code at -O0 when not debugging simply to make testcases happier.
 
 By the way, I'm still waiting for the DWARF debugging patches from
 Adacore compatible with AIX as and ld.  DWARF debugging would not
 require pushing a frame, and would resolve the failure when testing
 with DWARF.  The patch would be adjusted to only push a frame when
 writing XCOFF debugging.

Sorry, but we don't have these patches.  We have a tiny patch to generate
DWARF debug info on XCOFF platforms, but it requires GNU as and ld.

Tristan.



Re: [PR testsuite/65205, libgomp/65993] Fix dg-shouldfail usage in OpenACC libgomp tests

2015-05-04 Thread Rainer Orth
Thomas Schwinge tho...@codesourcery.com writes:

 Additionally to the %p format specifier printing a 0x prefix vs. not
 doing that, I've also changed the expected (nil) output for NULL
 pointers to instead match basically everything.

You cannot expect printf to print (nil) or variant for NULL pointers.
E.g. on Solaris 10 you get a SEGV instead.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: PR 64454: Improve VRP for %

2015-05-04 Thread Marc Glisse

On Mon, 4 May 2015, Richard Biener wrote:


On Sat, May 2, 2015 at 12:46 AM, Marc Glisse marc.gli...@inria.fr wrote:

Hello,

this patch tries to tighten a bit the range estimate for x%y. slp-perm-7.c
started failing by vectorizing more than expected, I assumed it was a good
thing and updated the test. I am less conservative than Jakub with division
by 0, but I still don't really understand how empty ranges are supposed to
be represented in VRP.

Bootstrap+testsuite on x86_64-linux-gnu.


Hmm, so I don't like how you (continue to) use trees for the constant 
computations. wide-ints would be a better fit today.  I also notice that 
fold_unary_to_constant can return NULL_TREE and neither the old nor your 
code handles that.


You are right. I was lazy and tried to keep this part of the old code, I 
shouldn't have...



empty ranges are basically UNDEFINED.


Cool, that's what I did. But I don't see code adding calls to 
__builtin_unreachable() when an empty range is detected. Maybe that almost 
never happens?


Aren't you pessimizing the case where the old code used 
value_range_nonnegative_p() by just using TYPE_UNSIGNED?


I don't think so. The old code only handled signed types in the positive 
case, while I have a more complete handling of signed types, which should 
do at least as good as the old one even in the positive case.
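
To make the kind of bound under discussion concrete, here is a minimal sketch in plain C, not the VRP code itself. It relies only on the C rule that x % y truncates toward zero, so the result has the sign of x (or is zero) and a magnitude below |y| and at most |x|. The names struct range and rem_range are made up for illustration, and the sketch assumes the range of y contains a nonzero value and that all bounds are int-sized values held in long long.

```c
#include <stdlib.h>

/* Hypothetical closed interval [lo, hi]; not GCC's value_range. */
struct range { long long lo, hi; };

static long long ll_min (long long a, long long b) { return a < b ? a : b; }
static long long ll_max (long long a, long long b) { return a > b ? a : b; }

/* Conservative range for x % y with C truncating semantics.
   Assumes y's range is not just {0} and that bounds are small enough
   for llabs not to overflow.  */
struct range
rem_range (struct range x, struct range y)
{
  /* |x % y| <= max |y| - 1.  */
  long long m = ll_max (llabs (y.lo), llabs (y.hi)) - 1;
  struct range r;
  /* The result is negative only if x can be negative, and never below x.  */
  r.lo = x.lo < 0 ? ll_max (x.lo, -m) : 0;
  /* The result is positive only if x can be positive, and never above x.  */
  r.hi = x.hi > 0 ? ll_min (x.hi, m) : 0;
  return r;
}
```

With x in [0, 100] and y in [1, 10] this gives [0, 9]; with x in [-100, -1] and y fixed at 7 it gives [-6, 0], which covers the signed case as well as the nonnegative one.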


--
Marc Glisse


[PATCH] Fix(?) PR66002

2015-05-04 Thread Richard Biener

This fixes a missed vectorization of a function in paq8p.  Without
merged PHI nodes phiopt doesn't recognize adjacent MIN/MAX_EXPRs.
Certainly no other pass I schedule mergephi over cares for merged
PHIs (DCE might even be confused here).

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2015-05-04  Richard Biener  rguent...@suse.de

PR tree-optimization/66002
* passes.def: Schedule pass_merge_phi after VRP, right before
ifcombine and phiopt.

* gcc.dg/vect/vect-125.c: New testcase.

Index: gcc/passes.def
===
*** gcc/passes.def  (revision 222760)
--- gcc/passes.def  (working copy)
*** along with GCC; see the file COPYING3.
*** 168,174 
NEXT_PASS (pass_build_alias);
NEXT_PASS (pass_return_slot);
NEXT_PASS (pass_fre);
-   NEXT_PASS (pass_merge_phi);
NEXT_PASS (pass_vrp);
NEXT_PASS (pass_chkp_opt);
NEXT_PASS (pass_dce);
--- 168,173 
*** along with GCC; see the file COPYING3.
*** 176,181 
--- 175,181 
NEXT_PASS (pass_call_cdce);
NEXT_PASS (pass_cselim);
NEXT_PASS (pass_copy_prop);
+   NEXT_PASS (pass_merge_phi);
NEXT_PASS (pass_tree_ifcombine);
NEXT_PASS (pass_phiopt);
NEXT_PASS (pass_tail_recursion);
Index: gcc/testsuite/gcc.dg/vect/vect-125.c
===
*** gcc/testsuite/gcc.dg/vect/vect-125.c(revision 0)
--- gcc/testsuite/gcc.dg/vect/vect-125.c(working copy)
***
*** 0 
--- 1,19 
+ /* { dg-do compile } */
+ /* { dg-require-effective-target vect_int } */
+ /* { dg-require-effective-target vect_pack_trunc } */
+ /* { dg-require-effective-target vect_unpack } */
+ 
+ void train(short *t, short *w, int n, int err)
+ {
+   n=(n+7)&-8;
+   for (int i=0; i<n; ++i)
+     {
+       int wt=w[i]+((t[i]*err*2>>16)+1>>1);
+       if (wt<-32768) wt=-32768;
+       if (wt>32767) wt=32767;
+       w[i]=wt;
+     }
+ }
+ 
+ /* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { xfail vect_no_int_max } } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */


Re: [PR testsuite/65205, libgomp/65993] Fix dg-shouldfail usage in OpenACC libgomp tests

2015-05-04 Thread John David Anglin

On 2015-05-04 4:32 AM, Thomas Schwinge wrote:

Dave, would you please test the following patch, and report the
regression status compared to before r222620?  (Compared to your existing
r222021 results, as posted in the PR, for example.)

With patch, we have the following fails on hppa2.0w-hp-hpux11.11:

FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/lib-3.c 
-DACC_DEVICE_TYPE_host

=1 -DACC_MEM_SHARED=1 output pattern test, is
libgomp: no device found
, should match device [0-9]+\([0-9]+\) is initialized
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/lib-42.c 
-DACC_DEVICE_TYPE_hos
t=1 -DACC_MEM_SHARED=1 output pattern test, is , should match 
\[[0-9a-fA-FxX]+,2

56\] is not mapped
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/lib-62.c 
-DACC_DEVICE_TYPE_hos

t=1 -DACC_MEM_SHARED=1 output pattern test, is , should match invalid size
Running /test/gnu/gcc/gcc/libgomp/testsuite/libgomp.oacc-c++/c++.exp ...
FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/lib-3.c 
-DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 output pattern test, is

libgomp: no device found
, should match device [0-9]+\([0-9]+\) is initialized
FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/lib-42.c 
-DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 output pattern test, is , 
should match \[[0-9a-fA-FxX]+,256\] is not mapped
FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/lib-62.c 
-DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 output pattern test, is , 
should match invalid size


Note this is a 32-bit build and not the 64-bit build reported in the PR.
However, I would expect similar printf support.  I don't have a 64-bit
build handy.

Dave

--
John David Anglin  dave.ang...@bell.net



Re: Extend verify_type to check various uses of TYPE_MINVAL

2015-05-04 Thread Rainer Orth
Jan Hubicka hubi...@ucw.cz writes:

 Hi,
 this patch extends verify_type to check various uses of TYPE_MINVAL.
 I also added a check that MIN_VALUE has a type compatible with T:
  useless_type_conversion_p (const_cast <tree> (t), TREE_TYPE (TYPE_MIN_VALUE (t)))
 but that one fails in interesting ways for C sizetype. I will try to look
 into this and thus this patch omits it.

 The main motivation is to check that various frontend overrides of TYPE_MINVAL
 are under control.

 Bootstrapped/regtested x86_64-linux, will commit it as obvious.

Not obvious enough, it seems: this patch broke gnat.dg/lto* tests at
least on i386-pc-solaris2.10.  E.g.

FAIL: gnat.dg/lto1.adb (test for excess errors)
WARNING: gnat.dg/lto1.adb compilation failed to produce executable

FAIL: gnat.dg/lto1.adb (test for excess errors)
Excess errors:
/vol/gcc/src/hg/trunk/solaris/gcc/testsuite/gnat.dg/lto1_pkg.adb:23:1: error: 
TYPE_MIN_VALUE is not constant
 placeholder_expr feb2b9b0
type integer_type fea16000 sizetype public unsigned SI
size integer_cst fea041cc constant 32
unit size integer_cst fea041e0 constant 4
align 32 symtab 0 alias set -1 canonical type fea16000 precision 32 min 
integer_cst fea041f4 0 max integer_cst fea04000 4294967295
   
 integer_type feb67ba0 lto1_pkg__Tfiltering_levels_tB___UB0
type integer_type fea16000 sizetype public unsigned SI
size integer_cst fea041cc constant 32
unit size integer_cst fea041e0 constant 4
align 32 symtab 0 alias set -1 canonical type fea16000 precision 32 min 
integer_cst fea041f4 0 max integer_cst fea04000 4294967295
sizes-gimplified visited SI size integer_cst fea041cc 32 unit size 
integer_cst fea041e0 4
align 32 symtab 0 alias set -1 canonical type feb67ba0 precision 32 min 
placeholder_expr feb2b9b0 max placeholder_expr feb2b9c0
index type integer_type feb67b40
type enumeral_type feb67960 lto1_pkg__filtering_level_t 
sizes-gimplified visited unsigned QI
size integer_cst fea042d0 constant 8
unit size integer_cst fea042e4 constant 1
align 8 symtab 0 alias set -1 canonical type feb67960 precision 8 
min integer_cst feb60e4c 0 max integer_cst feb60f3c 255
values tree_list feb648e8
purpose identifier_node feb639d8 lto1_pkg__none
value integer_cst feb60e4c constant visited 0
chain tree_list feb64918
purpose identifier_node feb639f4 lto1_pkg__pr_in_clutter
value integer_cst feb60f50 constant 1
chain tree_list feb64930
purpose identifier_node feb63a10 lto1_pkg__ssr_plots
value integer_cst feb60f78 constant 2
chain tree_list feb64948 purpose identifier_node 
feb63a2c lto1_pkg__pr_plots value integer_cst feb60fa0 3 context 
translation_unit_decl fed805f0 D.18
chain type_decl feb687e8 lto1_pkg__filtering_level_t
QI size integer_cst fea042d0 8 unit size integer_cst fea042e4 1
align 8 symtab 0 alias set -1 canonical type feb67b40 precision 8 min 
integer_cst feb60e4c 0 max integer_cst feb60fa0 3 RM min component_ref 
feb63a9c RM max component_ref feb63ab8
chain type_decl feb68a10 D.4194

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PR testsuite/65205, libgomp/65993] Fix dg-shouldfail usage in OpenACC libgomp tests

2015-05-04 Thread Andreas Schwab
Rainer Orth r...@cebitec.uni-bielefeld.de writes:

 You cannot expect printf to print (nil) or variant for NULL pointers.
 E.g. on Solaris 10 you get a SEGV instead.

You are probably mixing it up with %s.  %p is required to handle NULL
like any other valid pointer value.
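
A small sketch of the distinction, assuming only standard C: a null pointer is a valid argument for %p (the text it produces is implementation-defined, for example "(nil)" or "0"), whereas passing NULL to %s is undefined behaviour. The helper name format_pointer is made up for illustration.

```c
#include <stdio.h>

/* Format any object pointer, including a null one, with %p.
   Returns the number of characters snprintf reports.  */
int
format_pointer (char *buf, size_t size, const void *p)
{
  /* %p expects a void *; the cast only drops the const qualifier.  */
  return snprintf (buf, size, "%p", (void *) p);
}
```

A test that wants to be portable can therefore print NULL with %p safely, but should match the output loosely rather than expect one particular spelling.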

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.


Re: [PR testsuite/65205, libgomp/65993] Fix dg-shouldfail usage in OpenACC libgomp tests

2015-05-04 Thread Rainer Orth
Andreas Schwab sch...@linux-m68k.org writes:

 Rainer Orth r...@cebitec.uni-bielefeld.de writes:

 You cannot expect printf to print (nil) or variant for NULL pointers.
 E.g. on Solaris 10 you get a SEGV instead.

 You are probably mixing it up with %s.  %p is required to handle NULL
 like any other valid pointer value.

Seems so.  Sorry for the noise.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [RFC][PATCH][X86_64] Eliminate PLT stubs for specified external functions via -fno-plt=

2015-05-04 Thread Michael Matz
Hi,

On Thu, 30 Apr 2015, Sriraman Tallam wrote:

 We noticed that one of our benchmarks sped-up by ~1% when we eliminated 
 PLT stubs for some of the hot external library functions like memcmp, 
 pow.  The win was from better icache and itlb performance. The main 
 reason was that the PLT stubs had no spatial locality with the 
 call-sites. I have started looking at ways to tell the compiler to 
 eliminate PLT stubs (in-effect inline them) for specified external 
 functions, for x86_64. I have a proposal and a patch and I would like to 
 hear what you think.
 
 This comes with caveats.  This cannot be generally done for all 
 functions marked extern as it is impossible for the compiler to say if a 
 function is truly extern (defined in a shared library). If a function 
 is not truly extern(ends up defined in the final executable), then 
 calling it indirectly is a performance penalty as it could have been a 
 direct call.

This can be fixed by Alan's idea.

 Further, the newly created GOT entries are fixed up at 
 start-up and do not get lazily bound.

And this can be fixed by some enhancements in the linker and dynamic 
linker.  The idea is to still generate a PLT stub and make its GOT entry 
point to it initially (like a normal got.plt slot).  Then the first 
indirect call will use the address of PLT entry (starting lazy resolution) 
and update the GOT slot with the real address, so further indirect calls 
will directly go to the function.

This requires a new asm marker (and hence new reloc) as normally if 
there's a GOT slot it's filled by the real symbol's address, unlike if 
there's only a got.plt slot.  E.g. a

  call *foo@GOTPLT(%rip)

would generate a GOT slot (and fill its address into above call insn), but 
generate a JUMP_SLOT reloc in the final executable, not a GLOB_DAT one.
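
The intended run-time behaviour can be mimicked in plain C, as a rough analogy only: a function-pointer variable stands in for the GOT slot, and it initially points at a stub that plays the role of the PLT entry plus the dynamic linker's lazy resolver. All names here (foo_slot, foo_lazy_stub, real_foo, call_foo) are hypothetical; no real linker interface is involved.

```c
typedef int (*fn_t) (int);

/* Stand-in for the real definition living in a shared library.  */
static int real_foo (int x) { return x + 1; }

static int resolver_calls;	/* how often the lazy path ran */
static int foo_lazy_stub (int x);

/* Stand-in for the GOT slot: starts out pointing at the stub,
   like a got.plt entry that points back into the PLT.  */
static fn_t foo_slot = foo_lazy_stub;

/* Stand-in for PLT entry + lazy resolution: patch the slot, then
   forward this first call to the real function.  */
static int
foo_lazy_stub (int x)
{
  resolver_calls++;
  foo_slot = real_foo;
  return real_foo (x);
}

/* Every call site is an indirect call through the slot, roughly what
   "call *foo@GOTPLT(%rip)" would do.  */
int call_foo (int x) { return foo_slot (x); }

int resolver_call_count (void) { return resolver_calls; }
```

Only the first call_foo goes through the stub; later calls jump straight to real_foo, which is the point of filling the slot lazily instead of at start-up.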


Ciao,
Michael.


Re: Extend verify_type to check various uses of TYPE_MINVAL

2015-05-04 Thread Eric Botcazou
 Not obvious enough, it seems: this patch broke gnat.dg/lto* tests at
 least on i386-pc-solaris2.10.  E.g.
 
 FAIL: gnat.dg/lto1.adb (test for excess errors)
 WARNING: gnat.dg/lto1.adb compilation failed to produce executable
 
 FAIL: gnat.dg/lto1.adb (test for excess errors)
 Excess errors:
 /vol/gcc/src/hg/trunk/solaris/gcc/testsuite/gnat.dg/lto1_pkg.adb:23:1:
 error: TYPE_MIN_VALUE is not constant 

TYPE_MIN_VALUE can be arbitrary in Ada, with or without LTO.  For

package Q is

   function LB return Natural;
   function UB return Natural;

end Q;
with Q;

package P is

   type Arr1 is array (Natural range <>) of Boolean;

   subtype Arr2 is Arr1 (Q.LB .. Q.UB);

end P;

the TYPE_DOMAIN of Arr2 is

domain integer_type 0x769be000 type integer_type 0x76d0e0a8 
sizetype
sizes-gimplified visited DI size integer_cst 0x76d0abb8 64 unit 
size integer_cst 0x76d0abd0 8
align 64 symtab 0 alias set -1 canonical type 0x769be000 precision 
64 min nop_expr 0x769bd000 max cond_expr 0x769b9420

-- 
Eric Botcazou


Re: [PATCH] Fix eipa_sra AAPCS issue (PR target/65956)

2015-05-04 Thread Jakub Jelinek
On Mon, May 04, 2015 at 10:11:13AM +0200, Richard Biener wrote:
 Not sure how this helps when SRA tears apart the parameter.  That is,
 isn't the important thing that both the IPA modified function argument
 types/decls have the same type as the types of the parameters SRA ends
 up passing?  (as far as alignment goes?)
 
 Yes, of course using natural alignment makes sure that the backend
 can handle alignment properly and we don't run into oddball bugs here.

On IRC we were discussing making

 /* Return true if mode/type need doubleword alignment.  */
 static bool
 arm_needs_doubleword_align (machine_mode mode, const_tree type)
 {
   return (GET_MODE_ALIGNMENT (mode) > PARM_BOUNDARY
-	  || (type && TYPE_ALIGN (type) > PARM_BOUNDARY));
+	  || (type && TYPE_ALIGN (TYPE_MAIN_VARIANT (type)) > PARM_BOUNDARY));
 }
 }


Looking at

struct S { char a[16]; }; 
typedef struct S T;
typedef struct S U __attribute__((aligned (16))); 
struct V { U u; T v; };
typedef int N __attribute__((aligned (16)));

T t1;
U u1;
int a[3];

void
f5 (__builtin_va_list *ap)
{
  t1 = __builtin_va_arg (*ap, T);
  a[0] = __builtin_va_arg (*ap, int);
  u1 = __builtin_va_arg (*ap, U);
  a[1] = __builtin_va_arg (*ap, int);
  a[2] = __builtin_va_arg (*ap, N);
}

void f6 (int, N, int, U);

void
f7 (void)
{
  U u = {};
  f6 (0, (N) 1, 0, u);
}

and s/16/8/g output, it seems that neither i?86 nor x86_64 care about
the alignment for any passing, ppc64le cares about aggregates, but not
scalars apparently (with a warning that the passing changed), arm cares
about both.  And the f7 function shows that for non-aggregates, what arm
does is simply never going to work, because there is no way to pass down
the scalars aligned, f6 is still called with 1 in int type rather than N.

So at least changing arm_needs_doubleword_align for non-aggregates would
likely not break anything that hasn't been broken already and would unbreak
the majority of cases.
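
For reference, the difference between a typedef's alignment and that of its main variant can be seen from user code. This is a sketch assuming GCC or a compatible compiler (the aligned attribute is an extension) and C11 _Alignof; the helper names are made up. T and U below are the typedefs from the example above, and struct S plays the role of what TYPE_MAIN_VARIANT would return for both.

```c
#include <stddef.h>

struct S { char a[16]; };
typedef struct S T;
typedef struct S U __attribute__((aligned (16)));

/* Alignment of the plain typedef: that of struct S itself.  */
size_t align_of_T (void) { return _Alignof (T); }

/* Alignment of the over-aligned typedef variant.  */
size_t align_of_U (void) { return _Alignof (U); }
```

align_of_T returns 1 and align_of_U returns 16, so a check based on the main variant treats a U argument the same as a T argument.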

The following testcase shows that eipa_sra changes alignment even for the
aggregates.  Change aligned (8) to aligned (4) to see another possibility.

/* PR target/65956 */

struct B { char *a, *b; };
typedef struct B C __attribute__((aligned (8)));
struct A { C a; int b; long long c; };
char v[3];

__attribute__((noinline, noclone)) void
fn1 (int v, ...)
{
  __builtin_va_list ap;
  __builtin_va_start (ap, v);
  C c, d;
  c = __builtin_va_arg (ap, C);
  __builtin_va_arg (ap, int);
  d = __builtin_va_arg (ap, C);
  __builtin_va_end (ap);
  if (c.a != v[1] || d.a != v[2])
__builtin_abort ();
  v[1]++;
}

__attribute__((noinline, noclone)) int
fn2 (C x)
{
  asm volatile ("" : "+g" (x.a) : : "memory");
  asm volatile ("" : "+g" (x.b) : : "memory");
  return x.a == &v[0];
}

__attribute__((noinline, noclone)) void
fn3 (const char *x)
{
  if (x[0] != 0)
__builtin_abort ();
}

static struct A
foo (const char *x, struct A y, struct A z)
{
  struct A r = { { 0, 0 }, 0, 0 };
  if (y.b && z.b)
    {
      if (fn2 (y.a) && fn2 (z.a))
switch (x[0])
  {
  case '|':
break;
  default:
fn3 (x);
  }
  fn1 (0, y.a, 0, z.a);
}
  return r;
}

__attribute__((noinline, noclone)) int
bar (int x, struct A *y)
{
  switch (x)
{
case 219:
      foo ("+", y[-2], y[0]);
    case 220:
      foo ("-", y[-2], y[0]);
}
}

int
main ()
{
  struct A a[3] = { { { &v[1], &v[0] }, 1, 1LL },
                    { { &v[0], &v[0] }, 0, 0LL },
                    { { &v[2], &v[0] }, 2, 2LL } };
  bar (220, a + 2);
  if (v[1] != 1)
__builtin_abort ();
  return 0;
}

Jakub


Re: [C++17] Implement N3928 - Extending static_assert

2015-05-04 Thread Marek Polacek
On Sat, May 02, 2015 at 04:16:18PM -0400, Ed Smith-Rowland wrote:
 This extends static_assert to not require a message string.
 I elected to make this work also for C++11 and C++14 and warn only with
 -pedantic.
 I think many people just write
   static_assert(thing, "");
 .
 
 I took the path of building an empty string in the parser in this case.
 I wasn't sure if setting message to NULL_TREE would cause sadness later on
 or not.
 
 I also, perhaps in a fit of overzealousness, made finish_static_assert not
 print the extra ": " and an empty message in this case.
 
 I didn't modify _Static_assert for C.

I'm not aware of any C DR that is asking for _Static_assert (cst-expr), so
I suppose there's no need to change C at this point.

Marek


[Committed] Restore bootstrap for ARM

2015-05-04 Thread Andreas Tobler

All,

I committed the below as obvious.

Andreas

2015-05-04  Andreas Tobler  andre...@gcc.gnu.org

* config/arm/arm.c: Restore bootstrap.


Index: config/arm/arm.c
===
--- config/arm/arm.c(revision 222767)
+++ config/arm/arm.c(working copy)
@@ -150,7 +150,7 @@
 static void assign_minipool_offsets (Mfix *);
 static void arm_print_value (FILE *, rtx);
 static void dump_minipool (rtx_insn *);
-static int arm_barrier_cost (rtx);
+static int arm_barrier_cost (rtx_insn *);
 static Mfix *create_fix_barrier (Mfix *, HOST_WIDE_INT);
 static void push_minipool_barrier (rtx_insn *, HOST_WIDE_INT);
 static void push_minipool_fix (rtx_insn *, HOST_WIDE_INT, rtx *,


PIC calls without PLT, generic implementation

2015-05-04 Thread Alexander Monakov
A recent post by Sriraman prompts me to post my -fno-plt approach sooner rather
than later; I was working on no-PLT PIC codegen in the last few days too.
Although I'm posting a patch series, half of it is i386 backend tuning and can
go in independently.  Except one patch where it's noted specifically, the
patches were bootstrapped and regtested together, not separately, on x86-64.
Likewise the improvement claimed below is obtained with GCC with all patches
applied, the difference being only in -fno-plt flag.

The approach taken here is different.  Instead of adjusting call expansion in
the back end, I force the callee address to be loaded into a pseudo at RTL
expansion time, similar to function CSE, which is not enabled on most
targets.  The address load (which loads from GOT) can be moved out of loops,
scheduled, or, on x86, re-fused with indirect jump by peepholes.  On 32-bit
x86, it also allows the compiler to use registers other than %ebx for GOT
pointer (which can be a win since %ebx is callee-saved).
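
At the source level the effect is roughly comparable to the following sketch, given here only as an analogy: the callee address is read once into a local (playing the role of the pseudo that holds the GOT load) and the loop then makes indirect calls through it. The function half is a hypothetical stand-in for an external function that would normally be reached through the PLT.

```c
/* Hypothetical stand-in for an external library function.  */
double half (double x) { return x * 0.5; }

double
sum_halves (const double *v, int n)
{
  /* One address load before the loop, analogous to loading the
     callee address from the GOT into a pseudo.  */
  double (*callee) (double) = half;
  double s = 0.0;
  for (int i = 0; i < n; ++i)
    s += callee (v[i]);	/* indirect call, no PLT stub involved */
  return s;
}
```

Once the address lives in a pseudo, whether the load is hoisted, scheduled earlier or fused back into the call is up to the usual RTL passes.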

The benefit of PLT is the possibility of lazy relocation.  It is not possible
with BIND_NOW, in particular when -z relro -z now flags were used at link time
as security hardening measure.  Performance-critical executables do not
particularly need PLT and lazy relocation too, except if they are used very
frequently, with each individual run time extremely small -- but in that case
they can benefit massively from static linking or less massively from
prelinking, and with prelinking they can get the benefit of no-plt.

I've used LLVM/Clang to evaluate performance impact of PLT-less PIC codegen.
I configured with
  cmake -DLLVM_ENABLE_PIC=ON -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=OFF
from 3.6 release branch; this configuration mimics non-static build that e.g.
OpenSUSE is using, and produces Clang dependent on 112 clang/llvm shared
libraries, with roughly 24000 externally visible functions.

Without input files time is mostly spent on dynamic linking, so without
prelink there's a predictable regression, from 55 to 140 ms.  On C++ hello
world, I get:
         PLT      no-PLT   PLT+BIND_NOW
[32bit]  430 ms   535 ms   590 ms
[64bit]  410 ms   495 ms   555 ms

So no-PLT is 20% slower than default, but already 10% faster when non-lazy
binding is forced.

On tramp3d compilation with -O2 -g I get:
         PLT      no-PLT
[32bit]  49.0 s   43.3 s
[64bit]  41.6 s   36.8 s

So on long-running compiles -fno-plt is a very significant win.  Note that I'm
using Clang as (perhaps extreme) example of PIC-call-intensive code, but the
argument about -fno-plt being useful for performance should apply generally.

When looking at code size changes, there's a 1% improvement on 32-bit
libstdc++ and a small regression on 64-bit.  On LLVM/Clang, there's overall size
regression on both 32-bit and 64-bit; I've tried to analyze it and so far came
up with one possible cause, which is detailed in IRA REG_EQUIV patch.

Thanks.
Alexander


[PATCH i386] Move CLOBBERED_REGS earlier in register class list

2015-05-04 Thread Alexander Monakov
On 32-bit x86, register class CLOBBERED_REGS is a proper subset of
LEGACY_REGS, which causes IRA not to consider it separately for register
allocation, even when it has lower cost than other classes.  This patch is
useful to fix code generation problem that appears with no-PLT PIC tailcalls.

Was there a specific reason for CLOBBERED_REGS class to be listed as late as
it is?  On 32-bit this class contains only EAX, ECX, EDX.
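
The subset relation can be checked directly on the first words of the register masks as they appear in REG_CLASS_CONTENTS in the patch below (0x07 for CLOBBERED_REGS, 0x0f for Q_REGS, 0x1100ff for LEGACY_REGS). This is a stand-alone sketch; the helper names are made up and it is not how IRA itself computes class relations.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if every register in mask A is also in mask B.  */
bool
is_subset (uint32_t a, uint32_t b)
{
  return (a & ~b) == 0;
}

/* True if A is a subset of B and B has at least one more register.  */
bool
is_proper_subset (uint32_t a, uint32_t b)
{
  return is_subset (a, b) && a != b;
}
```

With these masks, is_proper_subset (0x07, 0x1100ff) holds, and so does is_proper_subset (0x07, 0x0f), which is consistent with the new position of the class just before Q_REGS.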

OK?
* config/i386/i386.h (enum reg_class): Move CLOBBERED_REGS before 
Q_REGS.
(REG_CLASS_NAMES): Ditto.
(REG_CLASS_CONTENTS): Ditto.

diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 1e755d3..75071ac 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -1300,17 +1300,17 @@ extern const char *host_detect_local_cpu (int argc, 
const char **argv);
 
 enum reg_class
 {
   NO_REGS,
   AREG, DREG, CREG, BREG, SIREG, DIREG,
   AD_REGS, /* %eax/%edx for DImode */
+  CLOBBERED_REGS,  /* call-clobbered integer registers */
   Q_REGS,  /* %eax %ebx %ecx %edx */
   NON_Q_REGS,  /* %esi %edi %ebp %esp */
   INDEX_REGS,  /* %eax %ebx %ecx %edx %esi %edi %ebp */
   LEGACY_REGS, /* %eax %ebx %ecx %edx %esi %edi %ebp %esp */
-  CLOBBERED_REGS,  /* call-clobbered integer registers */
   GENERAL_REGS,/* %eax %ebx %ecx %edx %esi %edi %ebp 
%esp
   %r8 %r9 %r10 %r11 %r12 %r13 %r14 %r15 */
   FP_TOP_REG, FP_SECOND_REG,   /* %st(0) %st(1) */
   FLOAT_REGS,
   SSE_FIRST_REG,
   NO_REX_SSE_REGS,
@@ -1361,16 +1361,16 @@ enum reg_class
 
 #define REG_CLASS_NAMES \
 {  NO_REGS,  \
AREG, DREG, CREG, BREG, \
SIREG, DIREG,   \
AD_REGS,  \
+   CLOBBERED_REGS,   \
Q_REGS, NON_Q_REGS, \
INDEX_REGS,   \
LEGACY_REGS,  \
-   CLOBBERED_REGS,   \
GENERAL_REGS, \
FP_TOP_REG, FP_SECOND_REG,  \
FLOAT_REGS,   \
SSE_FIRST_REG,\
NO_REX_SSE_REGS,  \
SSE_REGS, \
@@ -1400,17 +1400,17 @@ enum reg_class
   { 0x02,   0x0,0x0 },   /* DREG */  \
   { 0x04,   0x0,0x0 },   /* CREG */  \
   { 0x08,   0x0,0x0 },   /* BREG */  \
   { 0x10,   0x0,0x0 },   /* SIREG */ \
   { 0x20,   0x0,0x0 },   /* DIREG */ \
   { 0x03,   0x0,0x0 },   /* AD_REGS */   \
+  { 0x07,   0x0,0x0 },   /* CLOBBERED_REGS */\
   { 0x0f,   0x0,0x0 },   /* Q_REGS */\
   { 0x1100f0,0x1fe0,0x0 },   /* NON_Q_REGS */\
   { 0x7f,0x1fe0,0x0 },   /* INDEX_REGS */\
   { 0x1100ff,   0x0,0x0 },   /* LEGACY_REGS */   \
-  { 0x07,   0x0,0x0 },   /* CLOBBERED_REGS */\
   { 0x1100ff,0x1fe0,0x0 },   /* GENERAL_REGS */  \
  { 0x100,   0x0,0x0 },   /* FP_TOP_REG */\
 { 0x0200,   0x0,0x0 },   /* FP_SECOND_REG */ \
 { 0xff00,   0x0,0x0 },   /* FLOAT_REGS */\
   { 0x20,   0x0,0x0 },   /* SSE_FIRST_REG */ \
 { 0x1fe0,  0x00,0x0 },   /* NO_REX_SSE_REGS */   \


[PATCH i386] PR65753: allow PIC tail calls via function pointers

2015-05-04 Thread Alexander Monakov
In the i386 backend, tailcalls are incorrectly disallowed in PIC mode for
calls via function pointers on the basis that indirect calls, like direct
calls, would go via PLT and thus require %ebx to point to GOT -- but that is
not true.  Quoting Rich Felker who reported the bug,

  For PLT slots in the non-PIE main executable, %ebx is not required at all.
  PLT slots in PIE or shared libraries need %ebx, but a function pointer can
  never evaluate to such a PLT slot; it always evaluates to the nominal address
  of the function which is the same in all DSOs and therefore fundamentally
  cannot depend on the address of the GOT in the calling DSO

As far as I can see it's simply a mistake that was there from day 1 (comment 4
in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65753 points to original patch).

Bootstrapped and regtested on 32-bit x86, OK for trunk?
(the comment before the condition will need to be adjusted too, i.e.
s/optimize any indirect call, or a direct call/optimize any direct call/ )

PR target/65753
* config/i386/i386.c (ix86_function_ok_for_sibcall): Allow PIC sibcalls
via function pointers.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 3263656..f29e053 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5448,13 +5448,13 @@ ix86_function_ok_for_sibcall (tree decl, tree exp)
   /* If we are generating position-independent code, we cannot sibcall
  optimize any indirect call, or a direct call to a global function,
  as the PLT requires %ebx be live. (Darwin does not have a PLT.)  */
   if (!TARGET_MACHO
       && !TARGET_64BIT
       && flag_pic
-      && (!decl || !targetm.binds_local_p (decl)))
+      && (decl && !targetm.binds_local_p (decl)))
     return false;
 
   /* If we need to align the outgoing stack, then sibcalling would
      unalign the stack, which may break the called function.  */
   if (ix86_minimum_incoming_stack_boundary (true)
       < PREFERRED_STACK_BOUNDARY)


[PATCH i386] Allow sibcalls in no-PLT PIC

2015-05-04 Thread Alexander Monakov
With -fno-plt, we don't have to reject even direct calls as sibcall
candidates.

This patch depends on '-fplt' flag that is introduced in another patch.

This patch requires that with -fno-plt all sibcall candidates go through
prepare_call_address that transforms the call to a GOT lookup.

OK?
* config/i386/i386.c (ix86_function_ok_for_sibcall): Check flag_plt.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index f29e053..b734350 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5448,12 +5448,13 @@ ix86_function_ok_for_sibcall (tree decl, tree exp)
   /* If we are generating position-independent code, we cannot sibcall
  optimize any indirect call, or a direct call to a global function,
  as the PLT requires %ebx be live. (Darwin does not have a PLT.)  */
   if (!TARGET_MACHO
       && !TARGET_64BIT
       && flag_pic
+      && flag_plt
       && (decl && !targetm.binds_local_p (decl)))
     return false;
 
   /* If we need to align the outgoing stack, then sibcalling would
  unalign the stack, which may break the called function.  */
   if (ix86_minimum_incoming_stack_boundary (true)


[PATCH i386] Extend sibcall peepholes to allow source in %eax

2015-05-04 Thread Alexander Monakov
On i386, peepholes that transform memory load and register-indirect jump into
memory-indirect jump are overly restrictive in that they don't allow combining
when the jump target is loaded into %eax, and the called function returns a
value (also in %eax, so it's not dead after the call).  Fix this by checking
for same source and output register operands separately.

OK?
* config/i386/i386.md (sibcall_value_memory): Extend peepholes to
allow memory address in %eax.
(sibcall_value_pop_memory): Likewise.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 729db75..7f81bcc 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -11872,13 +11872,14 @@
   [(set (match_operand:W 0 register_operand)
(match_operand:W 1 memory_operand))
(set (match_operand 2)
(call (mem:QI (match_dup 0))
 (match_operand 3)))]
   "!TARGET_X32 && SIBLING_CALL_P (peep2_next_insn (1))
-   && peep2_reg_dead_p (2, operands[0])"
+   && (REGNO (operands[2]) == REGNO (operands[0])
+       || peep2_reg_dead_p (2, operands[0]))"
   [(parallel [(set (match_dup 2)
   (call (mem:QI (match_dup 1))
 (match_dup 3)))
  (unspec [(const_int 0)] UNSPEC_PEEPSIB)])])
 
 (define_peephole2
@@ -11886,13 +11887,14 @@
(match_operand:W 1 memory_operand))
(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)
(set (match_operand 2)
(call (mem:QI (match_dup 0))
  (match_operand 3)))]
   "!TARGET_X32 && SIBLING_CALL_P (peep2_next_insn (2))
-   && peep2_reg_dead_p (3, operands[0])"
+   && (REGNO (operands[2]) == REGNO (operands[0])
+       || peep2_reg_dead_p (3, operands[0]))"
   [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)
(parallel [(set (match_dup 2)
   (call (mem:QI (match_dup 1))
 (match_dup 3)))
  (unspec [(const_int 0)] UNSPEC_PEEPSIB)])])
 
@@ -11951,13 +11953,14 @@
   (call (mem:QI (match_dup 0))
 (match_operand 3)))
  (set (reg:SI SP_REG)
   (plus:SI (reg:SI SP_REG)
                            (match_operand:SI 4 "immediate_operand")))])]
   "!TARGET_64BIT && SIBLING_CALL_P (peep2_next_insn (1))
-   && peep2_reg_dead_p (2, operands[0])"
+   && (REGNO (operands[2]) == REGNO (operands[0])
+       || peep2_reg_dead_p (2, operands[0]))"
   [(parallel [(set (match_dup 2)
   (call (mem:QI (match_dup 1))
 (match_dup 3)))
  (set (reg:SI SP_REG)
   (plus:SI (reg:SI SP_REG)
(match_dup 4)))
@@ -11971,13 +11974,14 @@
   (call (mem:QI (match_dup 0))
 (match_operand 3)))
  (set (reg:SI SP_REG)
   (plus:SI (reg:SI SP_REG)
                            (match_operand:SI 4 "immediate_operand")))])]
   "!TARGET_64BIT && SIBLING_CALL_P (peep2_next_insn (2))
-   && peep2_reg_dead_p (3, operands[0])"
+   && (REGNO (operands[2]) == REGNO (operands[0])
+       || peep2_reg_dead_p (3, operands[0]))"
   [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)
(parallel [(set (match_dup 2)
   (call (mem:QI (match_dup 1))
 (match_dup 3)))
  (set (reg:SI SP_REG)
   (plus:SI (reg:SI SP_REG)


[PATCH] Expand PIC calls without PLT with -fno-plt

2015-05-04 Thread Alexander Monakov
This patch introduces option -fno-plt that allows to expand calls that would
go via PLT to load the address of the function immediately at call site (which
introduces a GOT load).  Cover letter explains the motivation for this patch.
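
To make the transformation concrete, here is a minimal source-level model of what such a call site does (a sketch only: the real change happens at RTL expansion, and external_add, got_slot and call_via_got are hypothetical names invented for this example). The function address is loaded from a slot and then called indirectly, instead of jumping through a PLT stub.

```cpp
#include <cassert>

// Stand-in for a function that would live in another shared object.
int external_add (int a, int b) { return a + b; }

// Stand-in for the GOT entry that the dynamic linker fills in at start-up.
static int (*volatile got_slot) (int, int) = external_add;

// Roughly what a call site looks like with -fno-plt: load the address
// from the slot, then make an indirect call through it.
int call_via_got (int a, int b)
{
  int (*target) (int, int) = got_slot;
  assert (target != 0);
  return target (a, b);
}
```

Because the address is an ordinary loaded value in this model, it can in principle be kept in a register and reused across several calls.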

New option documentation for invoke.texi is missing from the patch; if this is
accepted I'll be happy to send a v2 with documentation added.

* calls.c (prepare_call_address): Transform PLT call to GOT lookup and
indirect call by forcing address into a pseudo with -fno-plt.
* common.opt (flag_plt): New option.

diff --git a/gcc/calls.c b/gcc/calls.c
index 970415d..0c3b9aa 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -222,12 +222,18 @@ prepare_call_address (tree fndecl_or_type, rtx funexp, rtx static_chain_value,
     /* If we are using registers for parameters, force the
        function address into a register now.  */
     funexp = ((reg_parm_seen
                && targetm.small_register_classes_for_mode_p (FUNCTION_MODE))
               ? force_not_mem (memory_address (FUNCTION_MODE, funexp))
               : memory_address (FUNCTION_MODE, funexp));
+  else if (flag_pic && !flag_plt && fndecl_or_type
+          && TREE_CODE (fndecl_or_type) == FUNCTION_DECL
+          && !targetm.binds_local_p (fndecl_or_type))
+    {
+      funexp = force_reg (Pmode, funexp);
+    }
   else if (! sibcallp)
     {
 #ifndef NO_FUNCTION_CSE
       if (optimize && ! flag_no_function_cse)
         funexp = force_reg (Pmode, funexp);
 #endif
diff --git a/gcc/common.opt b/gcc/common.opt
index b49ac46..cd8b256 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1773,12 +1773,16 @@ Common Report Var(flag_pic,1) Negative(fpie)
 Generate position-independent code if possible (small mode)
 
 fpie
 Common Report Var(flag_pie,1) Negative(fPIC)
 Generate position-independent code for executables if possible (small mode)
 
+fplt
+Common Report Var(flag_plt) Init(1)
+Use PLT for PIC calls (-fno-plt: load the address from GOT at call site)
+
 fplugin=
 Common Joined RejectNegative Var(common_deferred_options) Defer
 Specify a plugin to load
 
 fplugin-arg-
 Common Joined RejectNegative Var(common_deferred_options) Defer


[RFC PATCH] ira: accept loads via argp rtx in validate_equiv_mem

2015-05-04 Thread Alexander Monakov
With this patch at hand, I'd like to discuss a code generation problem, which
my patch solves only partially.  FWIW, it passes bootstrap/regtest on x86-64.

With other patches in series applied, GCC with -fno-plt can generate tail
calls in PIC mode more frequently, but sometimes poorer code is generated.
I've tried to look for possible causes, and found one issue so far.

Consider the following testcase:

void foo1(int a, int b, int c, int d, int e, int f, int g, int h);
int bar(int x);
void foo2(int a, int b, int c, int d, int e, int f, int g, int h)
{
  bar(a);
  foo1(a, b, c, d, e, f, g, h);
}

Comparing x86 code generation with -O2 -m32 and with/without -fPIC, you can
see that -fPIC happens to produce smaller code.  Without -fPIC, GCC
saves/restores all arguments before/after call to 'bar'.

The reason for that is without -fPIC, GCC performs tail call optimization on
'foo1', and that causes it to drop REG_EQUIV notes for incoming arguments in
fixup_tail_calls.  After that, code generation diverges at IRA stage, where
lack of equivalences prevents loads of pseudos to be moved to the point of
first use.

The patch tries to repair the problem by allowing REG_EQUIV notes to be
resynthesized at ira init for loads that happen via `argp' rtx.  It helps for
the simple testcase above, but not for problematic Clang/LLVM functions where
I noticed the issue.

I hope there's a way around the 'big hammer' approach of fixup_tail_calls.
Might it be possible instead of dropping REG_EQUIV notes, to copy incoming
arguments into other pseudos just prior to stack pointer adjustment in
preparation for tailcall?

diff --git a/gcc/ira.c b/gcc/ira.c
index ea2b69f..e6b82e2 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -3001,13 +3001,16 @@ validate_equiv_mem (rtx_insn *start, rtx reg, rtx memref)
 
   /* This used to ignore readonly memory and const/pure calls.  The problem
 is the equivalent form may reference a pseudo which gets assigned a
 call clobbered hard reg.  When we later replace REG with its
 equivalent form, the value in the call-clobbered reg has been
 changed and all hell breaks loose.  */
-  if (CALL_P (insn))
+  rtx addr = XEXP (memref, 0);
+  if (GET_CODE (addr) == PLUS && GET_CODE (XEXP (addr, 1)) == CONST_INT)
+   addr = XEXP (addr, 0);
+  if (CALL_P (insn) && addr != arg_pointer_rtx)
 return 0;
 
   note_stores (PATTERN (insn), validate_equiv_mem_from_store, NULL);
 
   /* If a register mentioned in MEMREF is modified via an
 auto-increment, we lose the equivalence.  Do the same if one


Re: [RFC][PATCH][X86_64] Eliminate PLT stubs for specified external functions via -fno-plt=

2015-05-04 Thread Xinliang David Li
The use case proposed by Sri allows user to selectively eliminate PLT
overhead for hot external calls only. In such scenarios, lazy binding
is not something that matters to the user.

David

On Mon, May 4, 2015 at 7:45 AM, Michael Matz m...@suse.de wrote:
> Hi,
>
> On Thu, 30 Apr 2015, Sriraman Tallam wrote:
>
> > We noticed that one of our benchmarks sped-up by ~1% when we eliminated
> > PLT stubs for some of the hot external library functions like memcmp,
> > pow.  The win was from better icache and itlb performance. The main
> > reason was that the PLT stubs had no spatial locality with the
> > call-sites. I have started looking at ways to tell the compiler to
> > eliminate PLT stubs (in-effect inline them) for specified external
> > functions, for x86_64. I have a proposal and a patch and I would like to
> > hear what you think.
> >
> > This comes with caveats.  This cannot be generally done for all
> > functions marked extern as it is impossible for the compiler to say if a
> > function is truly extern (defined in a shared library). If a function
> > is not truly extern (ends up defined in the final executable), then
> > calling it indirectly is a performance penalty as it could have been a
> > direct call.
>
> This can be fixed by Alan's idea.
>
> > Further, the newly created GOT entries are fixed up at
> > start-up and do not get lazily bound.
>
> And this can be fixed by some enhancements in the linker and dynamic
> linker.  The idea is to still generate a PLT stub and make its GOT entry
> point to it initially (like a normal got.plt slot).  Then the first
> indirect call will use the address of PLT entry (starting lazy resolution)
> and update the GOT slot with the real address, so further indirect calls
> will directly go to the function.
>
> This requires a new asm marker (and hence new reloc) as normally if
> there's a GOT slot it's filled by the real symbol's address, unlike if
> there's only a got.plt slot.  E.g. a
>
>   call *foo@GOTPLT(%rip)
>
> would generate a GOT slot (and fill its address into above call insn), but
> generate a JUMP_SLOT reloc in the final executable, not a GLOB_DAT one.
>
>
> Ciao,
> Michael.
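
The lazy-binding scheme described in the quoted message can be modelled in plain C++ (a toy sketch, not linker or dynamic-linker code; got_slot, lazy_stub, real_square and resolve_count are hypothetical names used only here). The slot starts out pointing at a stub; the first indirect call lands in the stub, which patches the slot with the real address, so later calls go straight to the function.

```cpp
#include <cassert>

typedef int (*fn_t) (int);

// Stand-in for the real library function.
int real_square (int x) { return x * x; }

// Counts how often the "resolver" ran; it should run exactly once.
int resolve_count = 0;

int lazy_stub (int x);

// Stand-in for the GOT slot: initially points at the stub.
fn_t got_slot = lazy_stub;

// Stand-in for the PLT stub plus dynamic-linker resolution: patch the
// slot with the real address, then forward the call.
int lazy_stub (int x)
{
  ++resolve_count;
  got_slot = real_square;
  return got_slot (x);
}

// Every call site is just an indirect call through the slot.
int call_square (int x)
{
  assert (got_slot != 0);
  return got_slot (x);
}
```

After the first call_square, got_slot holds real_square and resolve_count stays at 1 for all further calls.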


Re: [RFC][PATCH][X86_64] Eliminate PLT stubs for specified external functions via -fno-plt=

2015-05-04 Thread Michael Matz
Hi,

On Mon, 4 May 2015, Xinliang David Li wrote:

> The use case proposed by Sri allows user to selectively eliminate PLT
> overhead for hot external calls only.

Yes, but only _because_ his approach doesn't use lazy binding.  With the 
full solution such restriction to a subset of functions isn't necessary.
And we should strive for going the full way, instead of adding hacks, 
shouldn't we?


Ciao,
Michael.


Re: [RFA] More type narrowing in match.pd V2

2015-05-04 Thread Jeff Law

On 05/02/2015 03:17 PM, Bernhard Reutner-Fischer wrote:


I should find time to commit the already approved auto-wipe dump file patch.
So let's assume I'll get to it maybe next weekend and nobody will notice the 2
leftover .original dumps in this patch :)

Doh!  Not sure how there'd be a .original dump left lying around, but as
posted it'll definitely leave a .optimized lying around.  I'll fix that
before committing.


Thanks for pointing it out.

jeff


Re: [PATCH] Remove dead code.

2015-05-04 Thread Jeff Law

On 05/04/2015 05:50 AM, Dominik Vogt wrote:

This patch removes a write only variable from the C++ code.

ChangeLog:

--

2015-05-04  Dominik Vogt  v...@linux.vnet.ibm.com

* call.c (print_z_candidates): Remove dead code.

OK.  Please install.

FWIW, removing a write-only variable seems like it ought to fall under
the obvious rule.


jeff


Re: [PATCH 00/13] further rtx_insn *ification

2015-05-04 Thread Jeff Law

On 05/02/2015 03:01 PM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders tbsaunde+...@tbsaunde.org

Hi,

This set of patches changes rtx to rtx_insn * in many places where it's fairly
trivial to do so.

each was bootstrapped + regtested on x86_64-linux-gnu, and the series was run
through config-list.mk.  I believe this all falls under Jeff's preapproval from
last year for this sort of thing which I assume is still valid, so committing
to trunk.

And just to be explicit, it does fall under that preapproval for such
changes.


Jeff



Re: [PATCH 1/4] libcpp: Improvements to comments in line-map.h/c

2015-05-04 Thread Jeff Law

On 05/01/2015 06:56 PM, David Malcolm wrote:

This patch updates and expands some comments in libcpp, adding
a big table to try to clarify what an individual source_location
value can mean.

libcpp/ChangeLog:
* include/line-map.h: Fix comment at the top of the file.
(source_location): Rewrite and expand the comment for this
typedef, adding an ascii-art table to clarify how source_location
values are allocated.
* line-map.c: Fix comment at the top of the file.

OK.
jeff



Re: [RFC][PATCH][X86_64] Eliminate PLT stubs for specified external functions via -fno-plt=

2015-05-04 Thread Xinliang David Li
Yes -- a full solution that supports lazy binding would be nice.

David

On Mon, May 4, 2015 at 9:58 AM, Michael Matz m...@suse.de wrote:
> Hi,
>
> On Mon, 4 May 2015, Xinliang David Li wrote:
>
> > The use case proposed by Sri allows user to selectively eliminate PLT
> > overhead for hot external calls only.
>
> Yes, but only _because_ his approach doesn't use lazy binding.  With the
> full solution such restriction to a subset of functions isn't necessary.
> And we should strive for going the full way, instead of adding hacks,
> shouldn't we?
>
>
> Ciao,
> Michael.


Re: [rfc, stage 1] default to -fno-delete-null-pointer-checks on nios2-elf

2015-05-04 Thread Jeff Law

On 05/01/2015 02:33 PM, Sandra Loosemore wrote:

Re https://gcc.gnu.org/ml/gcc-patches/2015-03/msg01510.html :

On 04/15/2015 10:42 PM, Jeff Law wrote:

It looks very sane to me.  This is probably how the AVR and CR16 should
have been handled to begin with IMHO.

FWIW, I generally discourage ports overriding default options, but this
is a case where I believe it makes some sense.

Please move forward with an official submission.


I've now bootstrapped and regression-tested the previously posted patch
on x86_64-linux-gnu, as well as retesting it on nios2-elf after updating
my source tree to current mainline head.

Are the target-independent parts OK to commit?

Yes.  Please install.

Thanks,
Jeff


Re: [PATCH] Expand PIC calls without PLT with -fno-plt

2015-05-04 Thread Jeff Law

On 05/04/2015 10:37 AM, Alexander Monakov wrote:

This patch introduces option -fno-plt that allows to expand calls that would
go via PLT to load the address of the function immediately at call site (which
introduces a GOT load).  Cover letter explains the motivation for this patch.

New option documentation for invoke.texi is missing from the patch; if this is
accepted I'll be happy to send a v2 with documentation added.

* calls.c (prepare_call_address): Transform PLT call to GOT lookup and
indirect call by forcing address into a pseudo with -fno-plt.
* common.opt (flag_plt): New option.

OK once you cobble together the invoke.texi changes.

Jeff




Re: [RFC PATCH] ira: accept loads via argp rtx in validate_equiv_mem

2015-05-04 Thread Jeff Law

On 05/04/2015 10:37 AM, Alexander Monakov wrote:

With this patch at hand, I'd like to discuss a code generation problem, which
my patch solves only partially.  FWIW, it passes bootstrap/regtest on x86-64.

With other patches in series applied, GCC with -fno-plt can generate tail
calls in PIC mode more frequently, but sometimes poorer code is generated.
I've tried to look for possible causes, and found one issue so far.

Consider the following testcase:

void foo1(int a, int b, int c, int d, int e, int f, int g, int h);
int bar(int x);
void foo2(int a, int b, int c, int d, int e, int f, int g, int h)
{
   bar(a);
   foo1(a, b, c, d, e, f, g, h);
}

Comparing x86 code generation with -O2 -m32 and with/without -fPIC, you can
see that -fPIC happens to produce smaller code.  Without -fPIC, GCC
saves/restores all arguments before/after call to 'bar'.

The reason for that is without -fPIC, GCC performs tail call optimization on
'foo1', and that causes it to drop REG_EQUIV notes for incoming arguments in
fixup_tail_calls.  After that, code generation diverges at IRA stage, where
lack of equivalences prevents loads of pseudos to be moved to the point of
first use.

The patch tries to repair the problem by allowing REG_EQUIV notes to be
resynthesized at ira init for loads that happen via `argp' rtx.  It helps for
the simple testcase above, but not for problematic Clang/LLVM functions where
I noticed the issue.

I hope there's a way around the 'big hammer' approach of fixup_tail_calls.
Might it be possible instead of dropping REG_EQUIV notes, to copy incoming
arguments into other pseudos just prior to stack pointer adjustment in
preparation for tailcall?

Isn't the whole point of dropping the notes to indicate that those
argument slots are no longer guaranteed to hold the value at all points
throughout the function?


That can certainly be relaxed, but you'll have to have some kind of code 
to analyze the data in the argument slots to ensure they haven't 
changed.  You can't just blindly put the notes back if I remember this 
stuff correctly.


Jeff



Re: [PATCH] Expand PIC calls without PLT with -fno-plt

2015-05-04 Thread Jakub Jelinek
On Mon, May 04, 2015 at 11:34:05AM -0600, Jeff Law wrote:
> On 05/04/2015 10:37 AM, Alexander Monakov wrote:
> > This patch introduces option -fno-plt that allows to expand calls that would
> > go via PLT to load the address of the function immediately at call site (which
> > introduces a GOT load).  Cover letter explains the motivation for this patch.
> >
> > New option documentation for invoke.texi is missing from the patch; if this is
> > accepted I'll be happy to send a v2 with documentation added.
> >
> >  * calls.c (prepare_call_address): Transform PLT call to GOT lookup and
> >  indirect call by forcing address into a pseudo with -fno-plt.
> >  * common.opt (flag_plt): New option.
> OK once you cobble together the invoke.texi changes.

Isn't what Michael/Alan suggested better?  I mean as/ld/compiler changes to
inline the plt slot's first part, then lazy binding will work fine.

Jakub


Re: [PATCH] Expand PIC calls without PLT with -fno-plt

2015-05-04 Thread Jeff Law

On 05/04/2015 11:39 AM, Jakub Jelinek wrote:

On Mon, May 04, 2015 at 11:34:05AM -0600, Jeff Law wrote:

On 05/04/2015 10:37 AM, Alexander Monakov wrote:

This patch introduces option -fno-plt that allows to expand calls that would
go via PLT to load the address of the function immediately at call site (which
introduces a GOT load).  Cover letter explains the motivation for this patch.

New option documentation for invoke.texi is missing from the patch; if this is
accepted I'll be happy to send a v2 with documentation added.

* calls.c (prepare_call_address): Transform PLT call to GOT lookup and
indirect call by forcing address into a pseudo with -fno-plt.
* common.opt (flag_plt): New option.

OK once you cobble together the invoke.texi changes.


Isn't what Michael/Alan suggested better?  I mean as/ld/compiler changes to
inline the plt slot's first part, then lazy binding will work fine.

I must have missed Alan/Michael's message.

ISTM the win here is that by going through the GOT, you can CSE the GOT 
reference and possibly get some more register allocation freedom.  Is 
that still the case with Alan/Michael's approach?


jeff


Re: [PATCH] fixup libobjc usage of PCC_BITFIELD_TYPE_MATTERS

2015-05-04 Thread Jeff Law

On 05/01/2015 09:30 PM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders tbsaunde+...@tbsaunde.org

Hi,

This adds a configure check to libobjc to find out whether the types of bitfields
affect their layout, and uses it to replace the rather broken usage of
PCC_BITFIELD_TYPE_MATTERS.
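
As a rough illustration of what such a check looks for (a sketch only, not the configure test from the patch; narrow_pad, wide_pad and bitfield_type_affects_layout are hypothetical names), one can compare two structs that differ only in the declared type of a zero-width bitfield. If the declared type matters, the int variant forces extra padding and the struct grows.

```cpp
#include <cassert>

// Two structs that differ only in the declared type of the
// zero-width bitfield between the two char members.
struct narrow_pad { char x; char : 0; char y; };
struct wide_pad { char x; int : 0; char y; };

// True when the declared bitfield type changes the struct layout
// on the target this is compiled for.
bool bitfield_type_affects_layout ()
{
  return sizeof (wide_pad) > sizeof (narrow_pad);
}
```

Comparing the two sizes, rather than hard-coding an expected size, keeps the sketch independent of a particular target's alignment rules.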

bootstrapped + regtested x86_64-linux-gnu, bootstrapped on ppc64le-linux-gnu
and ran check-objc there without failures, and checked the correct part of the
ifdef is used on a cross to m68k-linux-elf.  ok?  I'm sure I've gotten
something wrong since this is a bunch of auto tools ;-)

Trev

libobjc/ChangeLog:

2015-05-01  Trevor Saunders  tbsaunde+...@tbsaunde.org

* acinclude.m4: Include bitfields.m4.
* config.h.in: Regenerate.
* configure: Likewise.
* configure.ac: Invoke gt_BITFIELD_TYPE_MATTERS.
* encoding.c: Check HAVE_BITFIELD_TYPE_MATTERS.

OK with the general direction here.  If Jakub's test is better, then go
with it as a follow-up.


jeff


[C++ Patch] PR 66007

2015-05-04 Thread Paolo Carlini

Hi,

unfortunately we have to return to these few lines of code :(

This regression is a more subtle variant of c++/65858: if the user
passes -Wno-error=narrowing, the pedwarn doesn't result in an actual error
(even though we are forcing -pedantic-errors around it) but still emits
a warning and therefore returns true.  As a result ok isn't set to true,
and we have a miscompilation in this case too. Jakub suggested simply
checking errorcount by hand, which passes all my tests.
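
The difference between the two checks can be modelled in isolation (a toy sketch, not GCC code; emit_pedwarn, narrowing_is_error and error_count are hypothetical stand-ins for pedwarn, the effect of -Wno-error=narrowing and errorcount). The old check trusts the return value, which is true for a mere warning as well; the new one compares the error counter before and after.

```cpp
#include <cassert>

int error_count = 0;

// false models the user passing -Wno-error=narrowing.
bool narrowing_is_error = true;

// Returns true whenever a diagnostic was emitted, whether it ended
// up as a warning or as an error.
bool emit_pedwarn ()
{
  if (narrowing_is_error)
    ++error_count;
  return true;
}

// Old logic: treats "something was emitted" as "an error happened".
bool narrowing_ok_old ()
{
  return !emit_pedwarn ();
}

// New logic: the conversion is ok unless the error counter moved.
bool narrowing_ok_new ()
{
  int saved_error_count = error_count;
  emit_pedwarn ();
  assert (error_count >= saved_error_count);
  return error_count == saved_error_count;
}
```

With narrowing_is_error set to false, narrowing_ok_old reports failure although no error was counted, while narrowing_ok_new reports success.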


Thanks,
Paolo.


/cp
2015-05-04  Paolo Carlini  paolo.carl...@oracle.com
Jakub Jelinek  ja...@redhat.com

PR c++/66007
* typeck2.c (check_narrowing): Check by-hand that the pedwarn didn't
result in an actual error.

/testsuite
2015-05-04  Paolo Carlini  paolo.carl...@oracle.com
Jakub Jelinek  ja...@redhat.com

PR c++/66007
* g++.dg/cpp0x/Wnarrowing4.C: New.
Index: cp/typeck2.c
===
--- cp/typeck2.c(revision 222767)
+++ cp/typeck2.c(working copy)
@@ -958,10 +958,12 @@ check_narrowing (tree type, tree init, tsubst_flag
}
   else if (complain & tf_error)
 {
+ int savederrorcount = errorcount;
   global_dc->pedantic_errors = 1;
- if (!pedwarn (EXPR_LOC_OR_LOC (init, input_location), OPT_Wnarrowing,
-   "narrowing conversion of %qE from %qT to %qT "
-   "inside { }", init, ftype, type))
+ pedwarn (EXPR_LOC_OR_LOC (init, input_location), OPT_Wnarrowing,
+  "narrowing conversion of %qE from %qT to %qT "
+  "inside { }", init, ftype, type);
+ if (errorcount == savederrorcount)
 ok = true;
   global_dc->pedantic_errors = flag_pedantic_errors;
}
Index: testsuite/g++.dg/cpp0x/Wnarrowing4.C
===
--- testsuite/g++.dg/cpp0x/Wnarrowing4.C(revision 0)
+++ testsuite/g++.dg/cpp0x/Wnarrowing4.C(working copy)
@@ -0,0 +1,14 @@
+// PR c++/66007
+// { dg-do run { target c++11 } }
+// { dg-options "-Wno-error=narrowing" }
+
+extern "C" void abort();
+
+int main()
+{
+  unsigned foo[] = { 1, -1, 3 };
+  if (foo[0] != 1 || foo[1] != __INT_MAX__ * 2U + 1 || foo[2] != 3)
+    abort();
+}
+
+// { dg-prune-output "narrowing conversion" }


[PATCH] Fix ubsan non-call-exceptions ICE (PR tree-optimization/65984)

2015-05-04 Thread Jakub Jelinek
Hi!

The code I added in r217755 assumed that a memory read for which
stmt_could_throw_p holds always ends a bb, but that is clearly not the case.
Thus, the following patch uses stmt_ends_bb_p instead.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/5?

2015-05-04  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/65984
* ubsan.c: Include tree-cfg.h.
(instrument_bool_enum_load): Use stmt_ends_bb_p instead of
stmt_could_throw_p test, rename can_throw variable to ends_bb.

* c-c++-common/ubsan/pr65984.c: New test.

--- gcc/ubsan.c.jj  2015-04-09 21:49:59.0 +0200
+++ gcc/ubsan.c 2015-05-04 17:17:34.273661884 +0200
@@ -87,6 +87,7 @@ along with GCC; see the file COPYING3.
 #include "builtins.h"
 #include "tree-object-size.h"
 #include "tree-eh.h"
+#include "tree-cfg.h"
 
 /* Map from a tree to a VAR_DECL tree.  */
 
@@ -1420,7 +1421,7 @@ instrument_bool_enum_load (gimple_stmt_i
   || TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME)
 return;
 
-  bool can_throw = stmt_could_throw_p (stmt);
+  bool ends_bb = stmt_ends_bb_p (stmt);
   location_t loc = gimple_location (stmt);
   tree lhs = gimple_assign_lhs (stmt);
   tree ptype = build_pointer_type (TREE_TYPE (rhs));
@@ -1432,7 +1433,7 @@ instrument_bool_enum_load (gimple_stmt_i
   tree mem = build2 (MEM_REF, utype, gimple_assign_lhs (g),
 build_int_cst (atype, 0));
   tree urhs = make_ssa_name (utype);
-  if (can_throw)
+  if (ends_bb)
 {
   gimple_assign_set_lhs (stmt, urhs);
   g = gimple_build_assign (lhs, NOP_EXPR, urhs);
@@ -1469,7 +1470,7 @@ instrument_bool_enum_load (gimple_stmt_i
   gimple_set_location (g, loc);
   gsi_insert_after (gsi, g, GSI_NEW_STMT);
 
-  if (!can_throw)
+  if (!ends_bb)
 {
   gimple_assign_set_rhs_with_ops (gsi2, NOP_EXPR, urhs);
   update_stmt (stmt);
--- gcc/testsuite/c-c++-common/ubsan/pr65984.c.jj   2015-05-04 14:16:59.655378975 +0200
+++ gcc/testsuite/c-c++-common/ubsan/pr65984.c  2015-05-04 17:19:55.875447821 +0200
@@ -0,0 +1,23 @@
+/* PR tree-optimization/65984 */
+/* { dg-do compile } */
+/* { dg-options "-fnon-call-exceptions -fsanitize=bool,enum" } */
+
+#ifndef __cplusplus
+#define bool _Bool
+#endif
+
+enum E { E0, E1, E2 };
+enum E e[2];
+bool *b;
+
+int
+foo (int i)
+{
+  return e[i];
+}
+
+int
+bar (int i)
+{
+  return b[i];
+}

Jakub


Re: [PATCH 2/4] libcpp: Replace macro usage with C++ constructs

2015-05-04 Thread Jeff Law

On 05/01/2015 06:56 PM, David Malcolm wrote:

libcpp makes extensive use of the C preprocessor.  Whilst this has a
pleasingly self-referential quality, I find the code hard-to-read;
implementing source location support in my JIT branch was much harder than
I felt it should have been.

In an attempt at making the code easier to follow, and to build towards
a followup patch, this patch converts most of these macros to C++
equivalents: using const for compile-time constants, and inline
functions where macros aren't used as lvalues.
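
A small sketch of the kind of conversion described above (the struct, its fields, the function names and the constant's value are all hypothetical and are not taken from line-map.h): an accessor macro becomes a typed inline function, a compile-time constant becomes a const object, and a macro that used to be assigned through gets a separate setter.

```cpp
#include <cassert>

// Hypothetical stand-in for a line map; not the real libcpp struct.
struct map_sketch
{
  unsigned start_location;
  unsigned column_bits;
};

// Was a #define in this sketch; now a typed constant the debugger can
// print.  The value is arbitrary.
const unsigned reserved_count_sketch = 2;

// Was: #define MAP_START(MAP) (MAP)->start_location
inline unsigned map_start (const map_sketch *map)
{
  return map->start_location;
}

// Read access to a field that some callers also need to modify.
inline unsigned map_column_bits (const map_sketch *map)
{
  return map->column_bits;
}

// Separate setter replacing uses of the macro as an lvalue.
inline void set_map_column_bits (map_sketch *map, unsigned bits)
{
  assert (map != 0);
  map->column_bits = bits;
}
```

Unlike the macro forms, these have declared parameter and return types, so a mismatched argument is diagnosed at compile time.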

This effectively documents the expected types of the params, and makes
them available from the debugger e.g.:

   (gdb) p LINEMAP_FILE ($3)
   $1 = 0x13b8b37 "<command-line>"

and indeed the constants also:

   (gdb) p IS_ADHOC_LOC(MAX_SOURCE_LOCATION)
   $2 = false
   (gdb) p IS_ADHOC_LOC(MAX_SOURCE_LOCATION + 1)
   $3 = true

[I didn't mark the inline functions as static; should they be?]

[FWIW, I posted a reduced version of this patch about a year ago as:
   https://gcc.gnu.org/ml/gcc-patches/2014-05/msg01092.html
which covered a smaller subset of the macros].

libcpp/ChangeLog:
* include/line-map.h (MAX_SOURCE_LOCATION): Convert from a macro
to a const source_location.
(RESERVED_LOCATION_COUNT): Likewise.
(linemap_check_ordinary): Convert from a macro to a pair of inline
functions, for const/non-const arguments.
(MAP_START_LOCATION): Likewise.
(ORDINARY_MAP_STARTING_LINE_NUMBER): Likewise.
(ORDINARY_MAP_INCLUDER_FILE_INDEX): Likewise.
(ORDINARY_MAP_IN_SYSTEM_HEADER_P): Likewise.
(ORDINARY_MAP_NUMBER_OF_COLUMN_BITS): Convert from a macro to a
pair of inline functions, for const/non-const arguments, where the
latter is named...
(SET_ORDINARY_MAP_NUMBER_OF_COLUMN_BITS): New function.
(ORDINARY_MAP_FILE_NAME): Convert from a macro to a pair of inline
functions, for const/non-const arguments.
(MACRO_MAP_MACRO): Likewise.
(MACRO_MAP_NUM_MACRO_TOKENS): Likewise.
(MACRO_MAP_LOCATIONS): Likewise.
(MACRO_MAP_EXPANSION_POINT_LOCATION): Likewise.
(LINEMAPS_MAP_INFO): Likewise.
(LINEMAPS_MAPS): Likewise.
(LINEMAPS_ALLOCATED): Likewise.
(LINEMAPS_USED): Likewise.
(LINEMAPS_CACHE): Likewise.
(LINEMAPS_ORDINARY_CACHE): Likewise.
(LINEMAPS_MACRO_CACHE): Likewise.
(LINEMAPS_MAP_AT): Convert from a macro to an inline function.
(LINEMAPS_LAST_MAP): Likewise.
(LINEMAPS_LAST_ALLOCATED_MAP): Likewise.
(LINEMAPS_ORDINARY_MAPS): Likewise.
(LINEMAPS_ORDINARY_MAP_AT): Likewise.
(LINEMAPS_ORDINARY_ALLOCATED): Likewise.
(LINEMAPS_ORDINARY_USED): Likewise.
(LINEMAPS_LAST_ORDINARY_MAP): Likewise.
(LINEMAPS_LAST_ALLOCATED_ORDINARY_MAP): Likewise.
(LINEMAPS_MACRO_MAPS): Likewise.
(LINEMAPS_MACRO_MAP_AT): Likewise.
(LINEMAPS_MACRO_ALLOCATED): Likewise.
(LINEMAPS_MACRO_USED): Likewise.
(LINEMAPS_MACRO_LOWEST_LOCATION): Likewise.
(LINEMAPS_LAST_MACRO_MAP): Likewise.
(LINEMAPS_LAST_ALLOCATED_MACRO_MAP): Likewise.
(IS_ADHOC_LOC): Likewise.
(COMBINE_LOCATION_DATA): Likewise.
(SOURCE_LINE): Likewise.
(SOURCE_COLUMN): Likewise.
(LAST_SOURCE_LINE_LOCATION): Likewise.
(LAST_SOURCE_LINE): Likewise.
(LAST_SOURCE_COLUMN): Likewise.
(LAST_SOURCE_LINE_LOCATION)
(INCLUDED_FROM): Likewise.
(MAIN_FILE_P): Likewise.
(LINEMAP_FILE): Likewise.
(LINEMAP_LINE): Likewise.
(LINEMAP_SYSP): Likewise.
(linemap_location_before_p): Likewise.
* line-map.c (linemap_check_files_exited): Make local map const.
(linemap_add): Use SET_ORDINARY_MAP_NUMBER_OF_COLUMN_BITS.
(linemap_line_start): Likewise.
---
-#define MAP_START_LOCATION(MAP) (MAP)->start_location
+#if defined ENABLE_CHECKING && (GCC_VERSION >= 2007)
+
+/* Assertion macro to be used in line-map code.  */
+#define linemap_assert(EXPR)  \
+  do {\
+if (! (EXPR)) \
+  abort ();   \
+  } while (0)
+
+/* Assert that becomes a conditional expression when checking is disabled at
+   compilation time.  Use this for conditions that should not happen but if
+   they happen, it is better to handle them gracefully rather than crash
+   randomly later.
+   Usage:
+
+   if (linemap_assert_fails(EXPR)) handle_error(); */
+#define linemap_assert_fails(EXPR) __extension__ \
+  ({linemap_assert (EXPR); false;})
+
+#else
+/* Include EXPR, so that unused variable warnings do not occur.  */
+#define linemap_assert(EXPR) ((void)(0 && (EXPR)))
+#define linemap_assert_fails(EXPR) (! (EXPR))
+#endif

So if we're generally trying to get away from #define programming, then
this part seems like a bit of a step backwards.


Re: [PATCH 3/4] libcpp/input.c: Add a way to visualize the linemaps

2015-05-04 Thread Jeff Law

On 05/01/2015 06:56 PM, David Malcolm wrote:

As a relative newcomer to GCC, one of the issues I had was
becoming comfortable with the linemap API and its internal
representation.

To familiarize myself with it, I wrote a dumping routine
to try to visualize how the source_location space is carved
up between line maps, and what each number can mean.

It struck me that this would benefit others, so this patch
adds this visualization, via an undocumented option
-fdump-locations, and adds a text file to libcpp's sources
documenting a simple example of compiling a small C file,
with a header and macro expansions (built using the
-fdump-locations option and a little hand-editing).

gcc/ChangeLog:
* common.opt (fdump-locations): New option.
* input.c: Include diagnostic-core.h.
(get_end_location): New function.
(write_digit): New function.
(write_digit_row): New function.
(dump_location_range): New function.
(dump_labelled_location_range): New function.
(dump_location_info): New function.
* input.h (dump_location_info): New prototype.
* toplev.c (compile_file): Handle flag_dump_locations.

libcpp/ChangeLog:
* include/line-map.h (source_location): Add a reference to
location-example.txt to the descriptive comment.
* location-example.txt: New file.

Maybe dump-internal-locations?  Not sure I want to bikeshed on the
name any more than that.   If you feel strongly about the option name,
then I won't stress about it.





+void
+dump_location_info (FILE *stream)
+{
+  if (0)
+line_table_dump (stream,
+line_table,
+LINEMAPS_ORDINARY_USED (line_table),
+LINEMAPS_MACRO_USED (line_table));

Should the if (0) code go away?


+
+  /* A brute-force visualization: emit a warning at every location.  */
+  if (0)
+    for (source_location loc = 0; loc < line_table->highest_location; loc++)
+      warning_at (loc, 0, "this is location %i", loc);
+  /* Alternatively, we could use inform (), though this
+also shows lots of locations in stdc-predef.h */

And again.


So I think with removing the if (0) code and the possible option name 
change this is good to go.


Jeff


Re: Extend verify_type to check various uses of TYPE_MINVAL

2015-05-04 Thread Jan Hubicka
Hi,
if my wifi connection allows, I will commit the following patch, which I tested
in the meantime.  It also adds sanity checking for TYPE_MAXVAL, which no longer
seems to trigger any issues.

From type_non_common, what remains is to check values and binfo. I hope to kill
all those fields and move them to derived structures where they belong, but it
is harder than it seems because of the way obj-c++ shares data structures with
the C++ and C FEs and abuses these fields in interesting ways. (I got stuck on
these last stage1.)

Honza

Index: ChangeLog
===
--- ChangeLog   (revision 222791)
+++ ChangeLog   (working copy)
@@ -1,3 +1,9 @@
+2015-05-02  Jan Hubicka  hubi...@ucw.cz
+
+   * tree.c (verify_type): Check various uses of TYPE_MAXVAL;
+   fix overactive TYPE_MIN_VALUE check and add FIXME for type
+   compatibility problems.
+
 2015-05-04  Ajit Agarwal  ajit...@xilinx.com
 
* config/microblaze/microblaze.md (cbranchsi4): Added immediate
Index: tree.c
===
--- tree.c  (revision 222753)
+++ tree.c  (working copy)
@@ -12621,14 +12621,9 @@ verify_type (const_tree t)
 }
   else if (INTEGRAL_TYPE_P (t) || TREE_CODE (t) == REAL_TYPE || TREE_CODE (t) == FIXED_POINT_TYPE)
     {
-      if (!TYPE_MIN_VALUE (t))
-       ;
-      else if (!TREE_CONSTANT (TYPE_MIN_VALUE (t)))
-        {
-         error ("TYPE_MIN_VALUE is not constant");
-         debug_tree (TYPE_MIN_VALUE (t));
-         error_found = true;
-        }
+      /* FIXME: The following check should pass:
+         useless_type_conversion_p (const_cast <tree> (t), TREE_TYPE (TYPE_MIN_VALUE (t))
+        bud does not for C sizetypes in LTO.  */
 }
   else if (TYPE_MINVAL (t))
 {
@@ -12637,6 +12632,62 @@ verify_type (const_tree t)
   error_found = true;
 }
 
+  /* Check various uses of TYPE_MAXVAL.  */
+  if (RECORD_OR_UNION_TYPE_P (t))
+    {
+      if (TYPE_METHODS (t) && TREE_CODE (TYPE_METHODS (t)) != FUNCTION_DECL
+         && TREE_CODE (TYPE_METHODS (t)) != TEMPLATE_DECL)
+       {
+         error ("TYPE_METHODS is not FUNCTION_DECL nor TEMPLATE_DECL");
+         debug_tree (TYPE_METHODS (t));
+         error_found = true;
+       }
+    }
+  else if (TREE_CODE (t) == FUNCTION_TYPE || TREE_CODE (t) == METHOD_TYPE)
+    {
+      if (TYPE_METHOD_BASETYPE (t)
+         && TREE_CODE (TYPE_METHOD_BASETYPE (t)) != RECORD_TYPE
+         && TREE_CODE (TYPE_METHOD_BASETYPE (t)) != UNION_TYPE)
+       {
+         error ("TYPE_METHOD_BASETYPE is not record nor union");
+         debug_tree (TYPE_METHOD_BASETYPE (t));
+         error_found = true;
+       }
+    }
+  else if (TREE_CODE (t) == OFFSET_TYPE)
+    {
+      if (TYPE_OFFSET_BASETYPE (t)
+         && TREE_CODE (TYPE_OFFSET_BASETYPE (t)) != RECORD_TYPE
+         && TREE_CODE (TYPE_OFFSET_BASETYPE (t)) != UNION_TYPE)
+       {
+         error ("TYPE_OFFSET_BASETYPE is not record nor union");
+         debug_tree (TYPE_OFFSET_BASETYPE (t));
+         error_found = true;
+       }
+    }
+  else if (INTEGRAL_TYPE_P (t) || TREE_CODE (t) == REAL_TYPE || TREE_CODE (t) == FIXED_POINT_TYPE)
+    {
+      /* FIXME: The following check should pass:
+         useless_type_conversion_p (const_cast <tree> (t), TREE_TYPE (TYPE_MAX_VALUE (t))
+        bud does not for C sizetypes in LTO.  */
+    }
+  else if (TREE_CODE (t) == ARRAY_TYPE)
+    {
+      if (TYPE_ARRAY_MAX_SIZE (t)
+         && TREE_CODE (TYPE_ARRAY_MAX_SIZE (t)) != INTEGER_CST)
+        {
+         error ("TYPE_ARRAY_MAX_SIZE not INTEGER_CST");
+         debug_tree (TYPE_ARRAY_MAX_SIZE (t));
+         error_found = true;
+        }
+    }
+  else if (TYPE_MAXVAL (t))
+    {
+      error ("TYPE_MAXVAL non-NULL");
+      debug_tree (TYPE_MAXVAL (t));
+      error_found = true;
+    }
+
 
   if (error_found)
 {


RE: [PATCH, combine] Try REG_EQUAL for nonzero_bits

2015-05-04 Thread Thomas Preud'homme
 From: Jeff Law [mailto:l...@redhat.com]
 Sent: Tuesday, April 28, 2015 12:27 AM
 OK.  No need for heroics -- give it a shot, but don't burn an insane
 amount of time on it.  If we can't get to a reasonable testcase, then so
 be it.

Ok, I tried but really didn't manage to create a testcase. I did, however,
understand the condition under which this patch is helpful. In the function
reg_nonzero_bits_for_combine () in combine.c there is a test to check whether
last_set_nonzero_bits for a given register is still valid.

In the case I'm considering, the test evaluates to false because:

(i) the register rX whose nonzero bits are being evaluated was set in an
earlier basic block than the one with the instruction using rX (hence
rsp->last_set_label < label_tick)
(ii) the predecessor of the basic block for that same insn is not the
previous basic block analyzed by combine_instructions (hence
label_tick_ebb_start == label_tick)
(iii) the register rX is set multiple times (hence
REG_N_SETS (REGNO (x)) != 1)

Yet, the block being processed is dominated by the SET for rX so there
is a REG_EQUAL available to narrow down the set of nonzero bits.
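
As a rough illustration of conditions (i) to (iii), the validity test can be
modelled as below. This is a simplified sketch, not the code in combine.c:
struct reg_record, last_set_info_valid and the set_earlier_in_block flag are
hypothetical stand-ins, and the real test also looks at modes and at liveness
on function entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, pared-down stand-in for combine's per-register record.  */
struct reg_record
{
  int last_set_label;  /* label tick of the block containing the last SET */
  int n_sets;          /* number of SETs of the register in the function */
};

/* Simplified model of the test guarding last_set_nonzero_bits.  */
static bool
last_set_info_valid (const struct reg_record *rsp, int label_tick,
                     int label_tick_ebb_start, bool set_earlier_in_block)
{
  assert (rsp != NULL);
  /* Set in an earlier block of the extended basic block being scanned.  */
  if (rsp->last_set_label >= label_tick_ebb_start
      && rsp->last_set_label < label_tick)
    return true;
  /* Set earlier in the current block.  */
  if (rsp->last_set_label == label_tick && set_earlier_in_block)
    return true;
  /* Set exactly once in the whole function.  */
  return rsp->n_sets == 1;
}
```

With last_set_label = 3, label_tick = label_tick_ebb_start = 5 and n_sets = 2,
all three alternatives fail, which is the situation described above.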

Based on my understanding of your answer quoted above, I'll commit
it as is, despite not having been able to come up with a testcase. I'll
wait until tomorrow to do so, though, in case you change your mind about it.

Best regards,

Thomas




[PATCH] Improve the test in bitfields.m4

2015-05-04 Thread tbsaunde+gcc
From: Trevor Saunders tbsaunde+...@tbsaunde.org

Hi,

here's what I committed.  bootstrapped + regtested x86_64-linux-gnu.

Trev

Using a named bitfield with a width more than 0 means we won't hit
weirdness caused by the bitfield not really needing to exist.  Changing
int to long long means we won't have trouble with some arch where size
of int is 1 or 2.
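
For readers who want to see the effect outside of configure, here is a small
sketch using the same two struct layouts as the new test. The function name
bitfield_type_matters is made up for illustration, and the result depends on
the target ABI, so no particular value is promised.

```c
#include <assert.h>
#include <stddef.h>

/* Same layouts as the new configure test.  */
struct foo1 { char x; char y:1; char z; };
struct foo2 { char x; long long int y:1; char z; };

/* Returns 1 when the declared type of the bit-field changes the layout,
   i.e. when the configure test would answer "yes".  */
static int
bitfield_type_matters (void)
{
  /* Sanity check: a wider declared type is not expected to shrink the
     struct.  */
  assert (sizeof (struct foo1) <= sizeof (struct foo2));
  return sizeof (struct foo1) < sizeof (struct foo2);
}
```

Unlike the old test, this does not hard-code sizes such as 2 or 5, so it does
not depend on sizeof (int) or on how zero-width bit-fields are laid out.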

libobjc/ChangeLog:

2015-05-04  Trevor Saunders  tbsaunde+...@tbsaunde.org

* configure: Regenerate.

config/ChangeLog:

2015-05-04  Trevor Saunders  tbsaunde+...@tbsaunde.org

* bitfields.m4: Change int to long long, and use bitfields of
width 1 instead of 0.
---
 config/bitfields.m4 | 7 +++
 libobjc/configure   | 7 +++
 2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/config/bitfields.m4 b/config/bitfields.m4
index ee8f3b5..8185cd3 100644
--- a/config/bitfields.m4
+++ b/config/bitfields.m4
@@ -13,10 +13,9 @@ AC_DEFUN([gt_BITFIELD_TYPE_MATTERS],
   AC_CACHE_CHECK([if the type of bitfields matters], 
gt_cv_bitfield_type_matters,
   [
 AC_TRY_COMPILE(
-  [struct foo1 { char x; char :0; char y; };
-struct foo2 { char x; int :0; char y; };
-int foo1test[ sizeof (struct foo1) == 2 ? 1 : -1 ];
-int foo2test[ sizeof (struct foo2) == 5 ? 1 : -1]; ],
+  [struct foo1 { char x; char y:1; char z; };
+struct foo2 { char x; long long int y:1; char z; };
+int foo1test[ sizeof (struct foo1) < sizeof (struct foo2) ? 1 : -1 ]; ],
   [], gt_cv_bitfield_type_matters=yes, gt_cv_bitfield_type_matters=no)
   ])
   if test $gt_cv_bitfield_type_matters = yes; then
diff --git a/libobjc/configure b/libobjc/configure
index 0547f91..2f71735 100755
--- a/libobjc/configure
+++ b/libobjc/configure
@@ -11539,10 +11539,9 @@ else
 
 cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 /* end confdefs.h.  */
-struct foo1 { char x; char :0; char y; };
-struct foo2 { char x; int :0; char y; };
-int foo1test[ sizeof (struct foo1) == 2 ? 1 : -1 ];
-int foo2test[ sizeof (struct foo2) == 5 ? 1 : -1];
+struct foo1 { char x; char y:1; char z; };
+struct foo2 { char x; long long int y:1; char z; };
+int foo1test[ sizeof (struct foo1) < sizeof (struct foo2) ? 1 : -1 ];
 int
 main ()
 {
-- 
2.4.0



Re: [Patch,microblaze]: Optimized usage of fint instruction.

2015-05-04 Thread Michael Eager

On 03/04/2015 08:20 AM, Michael Eager wrote:

On 03/04/15 03:53, Ajit Kumar Agarwal wrote:



-Original Message-
From: Michael Eager [mailto:ea...@eagerm.com]
Sent: Thursday, February 26, 2015 4:33 AM
To: Ajit Kumar Agarwal; GCC Patches
Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: [Patch,microblaze]: Optimized usage of fint instruction.

On 02/25/15 02:20, Ajit Kumar Agarwal wrote:

Hello All:

Please find the patch for the optimized usage of fint instruction
changes. No regression is seen in the deja GNU tests.

commit ed4dc0b96bf43c200cacad97f73a98ab7048e51b
Author: Ajit Kumar Agarwal ajitkum@xhdspdgnu.(none)
Date:   Wed Feb 25 15:36:29 2015 +0530

  [Patch,microblaze]: Optimized usage of fint instruction.

  The changes are made in the patch for optimized usage of fint instruction.
  The sequence of fint/cond_branch is replaced with fcmp/cond_branch. The
  fint instruction takes 6/7 cycles as compared to fcmp instruction which
  takes 1 cycles. The conversion from float to int with fint instruction
  is not required and can directly compared with fcmp instruction which
  takes 1 cycle as compared to 6/7 cycles with fint instruction.

  ChangeLog:
  2015-02-25  Ajit Agarwal  ajit...@xilinx.com

  * config/microblaze/microblaze.md (peephole2): New.





+emit_insn (gen_cstoresf4 (comp_reg, operands[2],
+  gen_rtx_REG(SFmode,REGNO(cmp_op0)),
+  gen_rtx_REG(SFmode,REGNO(cmp_op1))));


Spaces before left parens and after comma in last two lines.


Changes are incorporated. Please find the log for updated patch.

commit 492b0d0b67a5b12d2dc239de3215630c8838edea
Author: Ajit Kumar Agarwal ajitkum@xhdspdgnu.(none)
Date:   Wed Mar 4 17:15:16 2015 +0530

 [Patch,microblaze]: Optimized usage of fint instruction.

 The changes are made in the patch for optimized usage of fint instruction.
 The sequence of fint/cond_branch is replaced with fcmp/cond_branch. The
 fint instruction takes 6/7 cycles as compared to fcmp instruction which
 takes 1 cycles. The conversion from float to int with fint instruction
 is not required and can directly compared with fcmp instruction which
 takes 1 cycle as compared to 6/7 cycles with fint instruction.

 ChangeLog:
 2015-03-04  Ajit Agarwal  ajit...@xilinx.com

 * config/microblaze/microblaze.md (peephole2): New.

 Signed-off-by:Ajit Agarwal ajit...@xilinx.com

Thanks & Regards
Ajit


OK.


Committed revision 222790.


--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077


Re: [Patch,microblaze]: Optimized usage of pcmp conditional instruction.

2015-05-04 Thread Michael Eager

On 03/06/2015 07:33 AM, Michael Eager wrote:

On 03/05/15 21:12, Ajit Kumar Agarwal wrote:





Changes are  incorporated. Please find the log of the updated patch.

commit 91f275c144165320850ddf18e3a1e059a66c
Author: Ajit Kumar Agarwal ajitkum@xhdspdgnu.(none)
Date:   Fri Mar 6 09:55:11 2015 +0530

 [Patch,microblaze]: Optimized usage of pcmp conditional instruction.

 The changes are made in the patch for optimized usage of pcmpne/pcmpeq
 instructions. The xor with register to register is replaced with pcmpeq
 /pcmpne instructions and for immediate check still the xori will be used.
 The purpose of the change is to achieve the aggressive usage of pcmpne
 /pcmpeq instructions instead of xor being used for comparison.

 ChangeLog:
 2015-03-06  Ajit Agarwal  ajit...@xilinx.com

 * config/microblaze/microblaze.md (cbranchsi4): Added immediate
 constraints.
 (cbranchsi4_reg): New.
 * config/microblaze/microblaze.c
 (microblaze_expand_conditional_branch_reg): New.
 * config/microblaze/microblaze-protos.h
 (microblaze_expand_conditional_branch_reg): New prototype.

 Signed-off-by:Ajit Agarwal ajit...@xilinx.com

Thanks & Regards
Ajit


OK.


Committed revision 222791.



--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077


[PATCH, i386]: Fix PR65871, add *bmi_andn_<mode>_ccno pattern

2015-05-04 Thread Uros Bizjak
Hello!

Another pattern that seems useful.

2015-05-05  Uros Bizjak  ubiz...@gmail.com

PR target/65871
* config/i386/i386.md (*bmi_andn_<mode>_ccno): New pattern.

testsuite/ChangeLog:

2015-05-05  Uros Bizjak  ubiz...@gmail.com

PR target/65871
* gcc.target/i386/pr65871-3.c: New test.

Tested on x86_64-linux-gnu {,-m32} and committed to mainline SVN.

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 222774)
+++ config/i386/i386.md (working copy)
@@ -12565,11 +12564,25 @@
    (set_attr "btver2_decode" "direct, double")
    (set_attr "mode" "<MODE>")])
 
+(define_insn "*bmi_andn_<mode>_ccno"
+  [(set (reg FLAGS_REG)
+   (compare
+ (and:SWI48
+   (not:SWI48 (match_operand:SWI48 1 "register_operand" "r,r"))
+   (match_operand:SWI48 2 "nonimmediate_operand" "r,m"))
+ (const_int 0)))
+   (clobber (match_scratch:SWI48 0 "=r,r"))]
+  "TARGET_BMI && ix86_match_ccmode (insn, CCNOmode)"
+  "andn\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "type" "bitmanip")
+   (set_attr "btver2_decode" "direct, double")
+   (set_attr "mode" "<MODE>")])
+
 (define_insn "bmi_bextr_<mode>"
   [(set (match_operand:SWI48 0 "register_operand" "=r,r")
 (unspec:SWI48 [(match_operand:SWI48 1 "nonimmediate_operand" "r,m")
-   (match_operand:SWI48 2 "register_operand" "r,r")]
-   UNSPEC_BEXTR))
+   (unspec:SWI48 [(match_operand:SWI48 1 "nonimmediate_operand" "r,m")
+  (match_operand:SWI48 2 "register_operand" "r,r")]
+ UNSPEC_BEXTR))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_BMI"
   "bextr\t{%2, %1, %0|%0, %1, %2}"
Index: testsuite/gcc.target/i386/pr65871-3.c
===
--- testsuite/gcc.target/i386/pr65871-3.c   (revision 0)
+++ testsuite/gcc.target/i386/pr65871-3.c   (working copy)
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mbmi" } */
+
+int foo (int x, int y)
+{
+  if (~x & y)
+return 1;
+
+  return 0;
+}
+
+int bar (int x, int y)
+{
+  if ((~x & y) > 0)
+return 1;
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-not "test" } } */


Re: [PATCH/libiberty] fix build of gdb/binutils with clang.

2015-05-04 Thread Yunlian Jiang
There was a similar discussion here:
https://gcc.gnu.org/ml/gcc/2005-11/msg01190.html

The problem is that at the configure stage _GNU_SOURCE is not defined, so
configure cannot find the declaration of asprintf and libiberty.h ends up
declaring asprintf itself. For floatformat.c, however, _GNU_SOURCE is
defined, so it picks up another asprintf from /usr/include/bits/stdio2.h
while also including libiberty.h. The two asprintf declarations conflict
when __USE_FORTIFY_LEVEL is set.
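
A minimal sketch of the visibility issue, assuming a glibc-style libc where
asprintf is an extension exposed through feature-test macros. The helper name
format_level is made up for illustration. The point is only that the macro
has to be defined before the first system header, which is what the configure
test program does not do.

```c
#define _GNU_SOURCE   /* must come before any system header */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'd string such as "level=2",
   or NULL on failure.  The caller frees the result.  */
static char *
format_level (int level)
{
  char *out = NULL;
  if (asprintf (&out, "level=%d", level) < 0)
    return NULL;
  assert (out != NULL);
  return out;
}
```

If a second, hand-written prototype for asprintf is also in scope (as
libiberty.h provides when HAVE_DECL_ASPRINTF is not set), the two can clash
with the fortified definition in the system headers.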

On Sat, May 2, 2015 at 11:58 AM, Ian Lance Taylor i...@google.com wrote:
 On Fri, May 1, 2015 at 4:45 PM, Yunlian Jiang yunl...@google.com wrote:
 The test case does not have #define _GNU_SOURCE, so it says
 error: ‘asprintf’ undeclared (first use in this function)

 OK, then my next question is: why does the test case (I assume you
 mean the test case for whether to set HAVE_DECL_ASPRINTF) not have
 #define _GNU_SOURCE?

 What is the background here?

 Ian

 On Fri, May 1, 2015 at 3:45 PM, Ian Lance Taylor i...@google.com wrote:
 On Tue, Apr 28, 2015 at 2:59 PM, Yunlian Jiang yunl...@google.com wrote:
 I believe this is the same problem as
 https://gcc.gnu.org/ml/gcc-patches/2008-07/msg00292.html

 The asprintf declaration is messed up when using clang to build gdb.

 diff --git a/include/libiberty.h b/include/libiberty.h
 index b33dd65..a294903 100644
 --- a/include/libiberty.h
 +++ b/include/libiberty.h
 @@ -625,8 +625,10 @@ extern int pwait (int, int *, int);
  /* Like sprintf but provides a pointer to malloc'd storage, which must
 be freed by the caller.  */

 +#ifndef asprintf
  extern int asprintf (char **, const char *, ...) ATTRIBUTE_PRINTF_2;
  #endif
 +#endif

  /* Like asprintf but allocates memory without fail. This works like
 xmalloc.  */

 Why is HAVE_DECL_ASPRINTF not defined?

 Ian


[patch committed SH] Fix PR target/65987

2015-05-04 Thread Kaz Kojima
I've committed the attached patch to fix PR target/65987
which is a 6 regression.  The recent stdarg change reveals
the target problem for section crossing jumps.
Some SH specific jump optimizations don't take into account
such jumps.  The attached patch is a minimal fix to solve
the above PR.  Tested on sh4-unknown-linux-gnu.

Regards,
kaz
--
2015-05-04  Kaz Kojima  kkoj...@gcc.gnu.org

PR target/65987
* config/sh/sh.c (output_far_jump): Take into account crossing jumps.
(split_branches): Likewise.

diff --git a/config/sh/sh.c b/config/sh/sh.c
index 1cf6ed0..a4c9c4c 100644
--- a/config/sh/sh.c
+++ b/config/sh/sh.c
@@ -2747,7 +2747,8 @@ output_far_jump (rtx_insn *insn, rtx op)
 
   if (TARGET_SH2
       && offset >= -32764
-      && offset - get_attr_length (insn) <= 32766)
+      && offset - get_attr_length (insn) <= 32766
+      && ! CROSSING_JUMP_P (insn))
 {
   far = 0;
   jump =   mov.w  %O0,%1 \n
@@ -6753,6 +6754,13 @@ split_branches (rtx_insn *first)
 
if (type == TYPE_JUMP)
  {
+   if (CROSSING_JUMP_P (insn))
+ {
+   emit_insn_before (gen_block_branch_redirect (const0_rtx),
+ insn);
+   continue;
+ }
+
far_label = as_a <rtx_insn *> (
  XEXP (SET_SRC (PATTERN (insn)), 0));
dest_uid = get_dest_uid (far_label, max_uid);


match.pd patch reverted

2015-05-04 Thread Jeff Law


I've reverted my latest match.pd change.  It's causing a bootstrap 
failure on i686.


Jeff


Re: Extend verify_type to check various uses of TYPE_MINVAL

2015-05-04 Thread Jan Hubicka
  Not obvious enough, it seems: this patch broke gnat.dg/lto* tests at
  least on i386-pc-solaris2.10.  E.g.
  
  FAIL: gnat.dg/lto1.adb (test for excess errors)
  WARNING: gnat.dg/lto1.adb compilation failed to produce executable
  
  FAIL: gnat.dg/lto1.adb (test for excess errors)
  Excess errors:
  /vol/gcc/src/hg/trunk/solaris/gcc/testsuite/gnat.dg/lto1_pkg.adb:23:1:
  error: TYPE_MIN_VALUE is not constant 
 
 TYPE_MIN_VALUE can be arbitrary in Ada, with or without LTO.  For
 
 package Q is
 
function LB return Natural;
function UB return Natural;
 
 end Q;
 with Q;
 
 package P is
 
type Arr1 is array (Natural range <>) of Boolean;
 
subtype Arr2 is Arr1 (Q.LB .. Q.UB);
 
 end P;
 
 the TYPE_DOMAIN of Arr2 is
 
 domain integer_type 0x769be000 type integer_type 0x76d0e0a8 
 sizetype
 sizes-gimplified visited DI size integer_cst 0x76d0abb8 64 unit 
 size integer_cst 0x76d0abd0 8
 align 64 symtab 0 alias set -1 canonical type 0x769be000 
 precision 
 64 min nop_expr 0x769bd000 max cond_expr 0x769b9420

Thanks, I just noticed the failures.  I will revert that check; it is indeed
valid for min values not to be constants (and even in C, max values may be
variable).
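
A small C illustration of a bound that is not a compile-time constant, using
a C99 variable-length array. The function name vla_bytes is made up for this
sketch. The index type of such an array has an upper bound of n - 1, which is
only known at run time.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the size in bytes of a variable-length array of N ints.  */
static size_t
vla_bytes (int n)
{
  assert (n > 0);
  int a[n];          /* array type whose index upper bound is n - 1 */
  return sizeof a;   /* evaluated at run time for a VLA */
}
```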

Honza


Re: [PATCH] Fix ubsan non-call-exceptions ICE (PR tree-optimization/65984)

2015-05-04 Thread Jeff Law

On 05/04/2015 12:16 PM, Jakub Jelinek wrote:

Hi!

The code I've added in r217755 was assuming that stmt_could_throw_p
memory read will always end a bb, but that is clearly not the case.
Thus, the following patch uses stmt_ends_bb_p instead.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/5?

2015-05-04  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/65984
* ubsan.c: Include tree-cfg.h.
(instrument_bool_enum_load): Use stmt_ends_bb_p instead of
stmt_could_throw_p test, rename can_throw variable to ends_bb.

* c-c++-common/ubsan/pr65984.c: New test.

OK.
Jeff



Fix PR48052: loop not vectorized if index is unsigned int

2015-05-04 Thread Abderrazek Zaafrani
This is an old thread and we are still running into similar issues:
Code is not being vectorized on 64-bit target due to scev not being
able to optimally analyze overflow condition.

While the original test case shown here seems to work now, it does not
work if the start value is not a constant and the loop index variable
is of unsigned type. For example:

void loop2( double const * __restrict__ x_in, double * __restrict__
x_out, double const * __restrict__ c, unsigned int N, unsigned int
start) {
 for(unsigned int i=start; i!=N; ++i)
   x_out[i] = c[i]*x_in[i];
}

Here is our unit test:

int foo(int* A, int* B, unsigned start, unsigned BS)
{
  int s = 0;
  for (unsigned k = start; k < start + BS; k++)
    s += A[k] * B[k];
  return s;
}

Our unit test case is extracted from a matrix multiply of a
two-dimensional array in which all loops are blocked by hand by a factor of
BS. Even though it is slightly modified, the above loop corresponds to the
innermost loop of the blocked matrix multiply.
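
To make the overflow concern concrete, the sketch below (function names are
invented for illustration) contrasts incrementing a 32-bit unsigned index
before widening it with widening it first. On a 64-bit target the address
computation wants the second form, and the two only agree when the 32-bit
increment does not wrap, which is what the analysis has to prove.

```c
#include <assert.h>
#include <stdint.h>

/* Increment in 32 bits, then widen: the sum wraps modulo 2^32 first.  */
static uint64_t
widen_after_increment (uint32_t k)
{
  return (uint64_t) (uint32_t) (k + 1u);
}

/* Widen first, then increment in 64 bits: this cannot wrap.  */
static uint64_t
increment_after_widen (uint32_t k)
{
  uint64_t result = (uint64_t) k + 1u;
  assert (result > k);
  return result;
}

/* Nonzero when both orders give the same value for this K.  */
static int
orders_agree (uint32_t k)
{
  return widen_after_increment (k) == increment_after_widen (k);
}
```

The two orders differ only for k == UINT32_MAX, so a loop whose exit test
bounds k below that value can safely treat the widened index as a simple
64-bit induction variable.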

We worked on a patch to solve the problem (see attachment).
The attached patch passed bootstrap and make check on x86_64-linux.
Ok for trunk?

Thanks,
Abderrazek Zaafrani
From eedbcd1ef6a81bb9c000e0dba9ff2a6c524576ac Mon Sep 17 00:00:00 2001
From: Abderrazek Zaafrani a.zaafr...@samsung.com
Date: Mon, 4 May 2015 11:00:12 -0500
Subject: [PATCH] scev for vectorization

PR optimization/48052
* tree-ssa-loop-niter.c (variable_appears_in_loop_exit_condition): New.
(scev_probably_wraps_p): Handle unsigned convert expressions to a larger type
than the basic induction variable.

* gcc.dg/vect/pr48052.c: New.
---
 gcc/testsuite/gcc.dg/vect/pr48052.c | 27 
 gcc/tree-ssa-loop-niter.c   | 84 +
 2 files changed, 111 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr48052.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr48052.c 
b/gcc/testsuite/gcc.dg/vect/pr48052.c
new file mode 100644
index 000..8e406d7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr48052.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3" } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+
+int foo(int* A, int* B,  unsigned start, unsigned BS)
+{
+  int s;
+  for (unsigned k = start;  k < start + BS; k++)
+{
+  s += A[k] * B[k];
+}
+
+  return s;
+}
+
+int bar(int* A, int* B, unsigned BS)
+{
+  int s;
+  for (unsigned k = 0;  k < BS; k++)
+{
+  s += A[k] * B[k];
+}
+
+  return s;
+}
+
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 042f8df..345fb93 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -3773,6 +3773,30 @@ nowrap_type_p (tree type)
   return false;
 }
 
+/* Returns true when T appears in the exit condition of LOOP.  */
+
+static bool
+variable_appears_in_loop_exit_condition (tree t, struct loop *loop)
+{
+  struct nb_iter_bound *bound;
+
+  /* For now, we are only interested in loops with one exit condition.  */
+  if (loop->bounds == NULL || loop->bounds->next != NULL)
+  return false;
+
+  for (bound = loop->bounds; bound; bound = bound->next)
+{
+  if (gimple_code (bound->stmt) != GIMPLE_COND)
+return false;
+
+  if (t == gimple_cond_lhs(bound->stmt)
+ || t == gimple_cond_rhs(bound->stmt))
+return true;
+}
+
+  return false;
+}
+
 /* Return false only when the induction variable BASE + STEP * I is
known to not overflow: i.e. when the number of iterations is small
enough with respect to the step and initial condition in order to
@@ -3879,6 +3903,66 @@ scev_probably_wraps_p (tree base, tree step,
 
   fold_undefer_and_ignore_overflow_warnings ();
 
+  /* At this point, we could not determine that the current scalar
+ evolution composed of base and step does not overflow.  In order
+ to improve this analysis, go back to the context of this scev,
+ i.e., statement and loop, and determine from there if we can
+ deduce that there is no overflow.
+
+ We are so far interested in convert statement of this form
+
+ _1 = (some cast) I;
+
+ where I is a basic induction variable.  This case is common when
+ computing addresses for 64-bit targets.  */
+  if (loop != NULL && loop->nb_iterations != NULL && loop->bounds != NULL
+   && at_stmt != NULL && integer_onep (step))
+{
+  enum tree_code nbi_code = TREE_CODE (loop->nb_iterations);
+  enum gimple_code stmt_code = gimple_code (at_stmt);
+
+  if (nbi_code != SCEV_NOT_KNOWN && stmt_code == GIMPLE_ASSIGN)
+{
+  tree rhs1 = gimple_assign_rhs1 (at_stmt);
+  enum tree_code tree_code = gimple_assign_rhs_code (at_stmt);
+  tree rhs2 = gimple_assign_rhs2 (at_stmt);
+
+  /* If at_stmt is a convert statement: _1 = (some cast) I;  */
+  if (rhs1 != NULL && rhs2 == NULL
+   && (tree_code == CONVERT_EXPR || tree_code == NOP_EXPR))
+{
+

Demangle symbols in debug assertion messages

2015-05-04 Thread François Dumont

Hi

Here is the patch to demangle symbols in debug messages. I have
also simplified the code in formatter.h.


Here is an example of assertion message:

/home/fdt/dev/gcc/build/x86_64-unknown-linux-gnu/libstdc++-v3/include/debug/functions.h:213:
error: function requires a valid iterator range [__first, __last).

Objects involved in the operation:
iterator "__first" @ 0x0x7fff165d68b0 {
  type = __gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<int*, std::__cxx1998::vector<int, std::allocator<int> > >, std::__debug::vector<int, std::allocator<int> > > (mutable iterator);

  state = dereferenceable;
  references sequence with type `std::__debug::vector<int, std::allocator<int> >' @ 0x0x7fff165d69d0

}
iterator "__last" @ 0x0x7fff165d68e0 {
  type = __gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<int*, std::__cxx1998::vector<int, std::allocator<int> > >, std::__debug::vector<int, std::allocator<int> > > (mutable iterator);

  state = dereferenceable;
  references sequence with type `std::__debug::vector<int, std::allocator<int> >' @ 0x0x7fff165d69d0

}


* include/debug/formatter.h (_GLIBCXX_TYPEID): New macro to simplify
usage of typeid.
(_Error_formatter::_M_print_type): New.
* src/c++11/debug.cc
(_Error_formatter::_Parameter::_M_print_field): Use latter.
(_Error_formatter::_M_print_type): Implement latter using
__cxxabiv1::__cxa_demangle to print demangled type name.

I just hope that __cxa_demangle is portable.

Ok to commit ?

François

diff --git a/libstdc++-v3/include/debug/formatter.h b/libstdc++-v3/include/debug/formatter.h
index 6767cd9..32dcf92 100644
--- a/libstdc++-v3/include/debug/formatter.h
+++ b/libstdc++-v3/include/debug/formatter.h
@@ -31,7 +31,17 @@
 
 #include <bits/c++config.h>
 #include <bits/cpp_type_traits.h>
-#include <typeinfo>
+
+#if __cpp_rtti
+# include <typeinfo>
+# define _GLIBCXX_TYPEID(_Type) typeid(_Type)
+#else
+namespace std
+{
+  class type_info;
+}
+# define _GLIBCXX_TYPEID(_Type) 0
+#endif
 
 namespace __gnu_debug
 {
@@ -218,21 +228,13 @@ namespace __gnu_debug
 	{
 	  _M_variant._M_iterator._M_name = __name;
 	  _M_variant._M_iterator._M_address = __it;
-#if __cpp_rtti
-	  _M_variant._M_iterator._M_type = typeid(__it);
-#else
-	  _M_variant._M_iterator._M_type = 0;
-#endif
+	  _M_variant._M_iterator._M_type = _GLIBCXX_TYPEID(__it);
 	  _M_variant._M_iterator._M_constness =
 	std::__are_same<_Safe_iterator<_Iterator, _Sequence>,
 			typename _Sequence::iterator>::
 	  __value ? __mutable_iterator : __const_iterator;
 	  _M_variant._M_iterator._M_sequence = __it._M_get_sequence();
-#if __cpp_rtti
-	  _M_variant._M_iterator._M_seq_type = typeid(_Sequence);
-#else
-	  _M_variant._M_iterator._M_seq_type = 0;
-#endif
+	  _M_variant._M_iterator._M_seq_type = _GLIBCXX_TYPEID(_Sequence);
 
 	  if (__it._M_singular())
 	_M_variant._M_iterator._M_state = __singular;
@@ -256,21 +258,13 @@ namespace __gnu_debug
 	{
 	  _M_variant._M_iterator._M_name = __name;
 	  _M_variant._M_iterator._M_address = __it;
-#if __cpp_rtti
-	  _M_variant._M_iterator._M_type = typeid(__it);
-#else
-	  _M_variant._M_iterator._M_type = 0;
-#endif
+	  _M_variant._M_iterator._M_type = _GLIBCXX_TYPEID(__it);
 	  _M_variant._M_iterator._M_constness =
 	std::__are_same<_Safe_local_iterator<_Iterator, _Sequence>,
 			typename _Sequence::local_iterator>::
 	  __value ? __mutable_iterator : __const_iterator;
 	  _M_variant._M_iterator._M_sequence = __it._M_get_sequence();
-#if __cpp_rtti
-	  _M_variant._M_iterator._M_seq_type = typeid(_Sequence);
-#else
-	  _M_variant._M_iterator._M_seq_type = 0;
-#endif
+	  _M_variant._M_iterator._M_seq_type = _GLIBCXX_TYPEID(_Sequence);
 
 	  if (__it._M_singular())
 	_M_variant._M_iterator._M_state = __singular;
@@ -291,11 +285,7 @@ namespace __gnu_debug
 	{
 	  _M_variant._M_iterator._M_name = __name;
 	  _M_variant._M_iterator._M_address = __it;
-#if __cpp_rtti
-	  _M_variant._M_iterator._M_type = typeid(__it);
-#else
-	  _M_variant._M_iterator._M_type = 0;
-#endif
+	  _M_variant._M_iterator._M_type = _GLIBCXX_TYPEID(__it);
 	  _M_variant._M_iterator._M_constness = __mutable_iterator;
 	  _M_variant._M_iterator._M_state = __it? __unknown_state : __singular;
 	  _M_variant._M_iterator._M_sequence = 0;
@@ -308,11 +298,7 @@ namespace __gnu_debug
 	{
 	  _M_variant._M_iterator._M_name = __name;
 	  _M_variant._M_iterator._M_address = __it;
-#if __cpp_rtti
-	  _M_variant._M_iterator._M_type = typeid(__it);
-#else
-	  _M_variant._M_iterator._M_type = 0;
-#endif
+	  _M_variant._M_iterator._M_type = _GLIBCXX_TYPEID(__it);
 	  _M_variant._M_iterator._M_constness = __const_iterator;
 	  _M_variant._M_iterator._M_state = __it? __unknown_state : __singular;
 	  _M_variant._M_iterator._M_sequence = 0;
@@ -325,11 +311,7 @@ namespace __gnu_debug
 	{
 	  _M_variant._M_iterator._M_name = __name;
 	  _M_variant._M_iterator._M_address = __it;
-#if __cpp_rtti
-	  _M_variant._M_iterator._M_type = typeid(__it);
-#else
-	  _M_variant._M_iterator._M_type = 0;

Re: [PATCH, RFC]: Next stage1, refactoring: propagating rtx subclasses

2015-05-04 Thread Trevor Saunders
 OK. Fixed the patch. Rebased and tested on x86_64-linux (fortunately, it
 did not conflict with Trevor's series of rtx_insn-related patches).

good :) fwiw I have another series that'll probably be ready about the
end of the week (the punishment for writing small patches is making the
testing box spin for days ;-)

 I'm trying to continue and the next patch (peep_split.patch,
 peep_split.cl) is addressing the same task in some of the generated code
 (namely, gen_peephole2_* and gen_split_* series of functions).

ok, I've stayed away from the generators and just done more trivial
changes of rtx -> rtx_insn * in arguments.

Trev

  If you're going to continue this work, you should probably get
  write-after-approval access so that you can commit your own approved
  changes.
 Is it OK to mention you as a maintainer who can approve my request for
 write access?
 
 -- 
 Regards,
 Mikhail Maltsev
 




Re: [PING 2][PATCH] libgcc: Add CFI directives to the soft floating point support code for ARM

2015-05-04 Thread Martin Galvan
Hi Ramana! Sorry to bother, but I looked at the repository and didn't
see this committed. As I don't have write access could you please
commit this for me?

Thanks a lot!

On Tue, Apr 28, 2015 at 2:07 PM, Martin Galvan
martin.gal...@tallertechnologies.com wrote:
 Thanks a lot. I don't have write access to the repository, could you
 commit this for me?

 On Tue, Apr 28, 2015 at 1:21 PM, Ramana Radhakrishnan
 ramana@googlemail.com wrote:
 On Tue, Apr 28, 2015 at 4:19 PM, Martin Galvan
 martin.gal...@tallertechnologies.com wrote:
 This patch adds CFI directives to the soft floating point support code for 
 ARM.

 Previously, if we tried to do a backtrace from that code in a debug session 
 we'd
 get something like this:

 (gdb) bt
 #0  __nedf2 () at 
 ../../../../../../gcc-4.9.2/libgcc/config/arm/ieee754-df.S:1082
 #1  0x0db6 in __aeabi_cdcmple () at 
 ../../../../../../gcc-4.9.2/libgcc/config/arm/ieee754-df.S:1158
 #2  0xf5c28f5c in ?? ()
 Backtrace stopped: previous frame identical to this frame (corrupt stack?)

 Now we'll get something like this:

 (gdb) bt
 #0  __nedf2 () at 
 ../../../../../../gcc-4.9.2/libgcc/config/arm/ieee754-df.S:1156
 #1  0x0db6 in __aeabi_cdcmple () at 
 ../../../../../../gcc-4.9.2/libgcc/config/arm/ieee754-df.S:1263
 #2  0x0dc8 in __aeabi_dcmpeq () at 
 ../../../../../../gcc-4.9.2/libgcc/config/arm/ieee754-df.S:1285
 #3  0x0504 in main ()

 I have a company-wide copyright assignment. I don't have commit access, 
 though, so it would be great if anyone could commit this for me.

 Thanks a lot!


 this is OK , thanks. Sorry about the delay in reviewing this.

 Ramana


Re: [patch] Perform anonymous constant propagation during inlining

2015-05-04 Thread Eric Botcazou
 2015-05-01  Eric Botcazou  ebotca...@adacore.com
 
   * expr.c (expand_expr_real_1) <SSA_NAME>: Try to substitute constants
   on the RHS of expressions.
   * gimple-expr.h (is_gimple_constant): Reorder.

Bummer.  This breaks C++ debugging:

+FAIL: gdb.cp/class2.exp: print alpha at marker return 0
+FAIL: gdb.cp/class2.exp: print beta at marker return 0
+FAIL: gdb.cp/class2.exp: print * aap at marker return 0
+FAIL: gdb.cp/class2.exp: print * bbp at marker return 0
+FAIL: gdb.cp/class2.exp: print * abp at marker return 0, s-p-o off
+FAIL: gdb.cp/class2.exp: print * (B *) abp at marker return 0
+FAIL: gdb.cp/class2.exp: p acp
+FAIL: gdb.cp/class2.exp: p acp->c1
+FAIL: gdb.cp/class2.exp: p acp->c2

because C++ is apparently relying on the assignment to the anonymous return 
object to preserve the debug info attached to a return statement.

Would you be OK with a slight variation of your earlier idea, i.e. calling 
fold_stmt with a specific valueizer from fold_marked_statements instead of the 
implicit no_follow_ssa_edges in the inliner?  Something like:

tree
follow_anonymous_single_use_edges (tree val)
{
  if (TREE_CODE (val) == SSA_NAME
      && (!SSA_NAME_VAR (val) || DECL_IGNORED_P (SSA_NAME_VAR (val)))
      && has_single_use (val))
    return val;
  return NULL_TREE;
}

-- 
Eric Botcazou


Re: [C++ Patch] PR 66007

2015-05-04 Thread Jason Merrill

On 05/04/2015 01:17 PM, Paolo Carlini wrote:

This regression is a more subtle variant of c++/65858: if the user
passes -Wno-error=narrowing the pedwarn didn't result in an actual error
(even if we are forcing -pedantic-errors around it) but produces anyway
a warning, thus returns true, and ok isn't set to true, thus we have a
miscompilation in this case too. Jakub suggested simply checking by hand
errorcount, which passes all my tests.


OK.

Jason




Re: [PATCH, RFC]: Next stage1, refactoring: propagating rtx subclasses

2015-05-04 Thread Mikhail Maltsev
(the original message was bounced by the mailing list, resending with
compressed attachment)

On 30.04.2015 8:00, Jeff Law wrote:
 
 Can you please check the changes to do_jump_1, the indention looked 
 weird in the patch.  If it's correct, just say so.
It is ok. Probably that's because the surrounding code is indented with
spaces.

 The definition of PEEP2_EOB looks wrong.  I don't see how you can
 safely cast pc_rtx to an rtx_insn * since it's an RTX rather than rtx
 chain object.  Maybe you're getting away with it because it's used as
 marker. But it still feels wrong.
Yes, FWIW, it is only needed for assertions in peep2_regno_dead_p and
peep2_reg_dead_p which check it against NULL (they are intended to
verify that live_before field in peep2_insn_data struct is valid). At
least, when I removed the assertions and changed PEEP2_EOB to NULL (as
an experiment), the testsuite passed without regressions.

 You'd probably be better off creating a unique rtx_insn * object and
 using that as the marker.
OK. Fixed the patch. Rebased and tested on x86_64-linux (fortunately, it
did not conflict with Trevor's series of rtx_insn-related patches).

I'm trying to continue and the next patch (peep_split.patch,
peep_split.cl) is addressing the same task in some of the generated code
(namely, gen_peephole2_* and gen_split_* series of functions).

 If you're going to continue this work, you should probably get
 write-after-approval access so that you can commit your own approved
 changes.
Is it OK to mention you as a maintainer who can approve my request for
write access?

-- 
Regards,
Mikhail Maltsev



as_insn.tar.gz
Description: GNU Zip compressed data


Re: [PATCH 4/4] Replace line_map union with C++ class hierarchy

2015-05-04 Thread Jeff Law

On 05/01/2015 06:56 PM, David Malcolm wrote:

This patch eliminates the union in struct line_map in favor of
a simple class hierarchy, making struct line_map a base class,
with line_map_ordinary and line_map_macro subclasses.

The patch eliminates all usage of linemap_check_ordinary and
linemap_check_macro from line-map.h, updating return types and
signatures throughout libcpp and gcc's usage of it to use the
appropriate subclasses.

This moves the checking of linemap kind from run-time to
compile-time, and also implicitly documents everywhere where
the code is expecting an ordinary map vs a macro map vs
either kind of map.  I believe it makes the code significantly
simpler: most of the accessor functions in line-map.h become
trivial field-lookups.

I attempted to use templates for maps_info, but was stymied by
gengtype, so in the end I simply split it manually into
maps_info_ordinary and maps_info_macro.  In theory it's just
a vec, but vec.h is in gcc, and thus not available
for use from libcpp.

In a similar vein, gcc/is-a.h is presumably not usable
from within libcpp.  If it were, there would be the following
rough equivalences:

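-------------------------------------------------------------------
line-map.h                          is-a.h
-------------------------------------------------------------------
linemap_check_ordinary (m)          as_a <line_map_ordinary *> (m)
linemap_check_macro (m)             as_a <line_map_macro *> (m)
linemap_macro_expansion_map_p (m)   (m ? is_a <line_map_macro *> (m)
                                       : false)
-------------------------------------------------------------------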
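
As a rough sketch of what such equivalences amount to (all type and function names here are hypothetical stand-ins, not the libcpp or is-a.h definitions), a kind tag in the base class is enough to write a null-safe test and checked downcasts:

```cpp
#include <cassert>

// Hypothetical stand-ins for the map types; only the kind tag matters.
enum map_kind { KIND_ORDINARY, KIND_MACRO };

struct map_base { map_kind kind; };
struct ordinary_map : map_base { unsigned to_line; };
struct macro_map : map_base { unsigned n_tokens; };

// Null-safe kind test, in the spirit of linemap_macro_expansion_map_p.
inline bool is_macro_map (const map_base *m)
{
  return m != nullptr && m->kind == KIND_MACRO;
}

// Checked downcasts, in the spirit of linemap_check_ordinary/_macro.
inline const ordinary_map *as_ordinary (const map_base *m)
{
  assert (m != nullptr && m->kind == KIND_ORDINARY);
  return static_cast<const ordinary_map *> (m);
}

inline const macro_map *as_macro (const map_base *m)
{
  assert (is_macro_map (m));
  return static_cast<const macro_map *> (m);
}

// Once a parameter has the subclass type, no run-time check is needed:
// the compiler rejects callers that pass the wrong kind of map.
inline unsigned ordinary_line (const ordinary_map *ord_map)
{
  return ord_map->to_line;
}
```

The last function shows the compile-time side of the change: the check moves into the signature.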

There are numerous places in libcpp that offset a
line_map * using array notation to get the next/prev line_map of the
same kind, e.g.:
MAP_START_LOCATION (cached[1])
which breaks due to the different sizes of line_map vs its subclasses.

On x86_64 host, before:
(gdb) p sizeof(line_map)
$1 = 40

after:
(gdb) p sizeof(line_map)
$1 = 8
(gdb) p sizeof(line_map_ordinary)
$2 = 32
(gdb) p sizeof(line_map_macro)
$3 = 40

Tracking down all of these array-based offsets to use a pointer to the
appropriate subclass (and thus use the correct offset) was rather
involved, but I believe the patch fixes them all now.

(the patch thus also gives a very modest saving of 8 bytes per ordinary
line map).
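
A small sketch of why the array-based offsets break, using hypothetical types rather than the real line_map layout: once the base class is smaller than its subclasses, stepping a base pointer advances by the wrong number of bytes, so indexing has to go through a pointer of the element's real type.

```cpp
#include <cstddef>

// Hypothetical slim base and subclass; field names are illustrative only.
struct base_map { int kind; };
struct ord_map : base_map { int start; int line; };

// Byte distance between consecutive elements as seen through each type.
inline std::size_t stride_as_base () { return sizeof (base_map); }
inline std::size_t stride_as_ord () { return sizeof (ord_map); }

// Correct: index through the subclass pointer, so "+ 1" advances by
// sizeof (ord_map) and lands on the next element of the array.
inline const ord_map *next_ord_map (const ord_map *m) { return m + 1; }
```

Indexing the same array through a base_map * would advance by stride_as_base () instead and land inside the first element.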

I've tried to use the naming convention ord_map and macro_map
whenever the typesystem ensures we're dealing with such a map,
wherever this is doable without needing to touch lines of code that
would otherwise not need touching by the patch.

gcc/ChangeLog:
* diagnostic.c (diagnostic_report_current_module): Strengthen
local new_map from const line_map * to
const line_map_ordinary *.
* genmatch.c (error_cb): Likewise for local map.
(output_line_directive): Likewise for local map.
* input.c (expand_location_1): Likewise for local map.
Pass NULL rather than map to
linemap_unwind_to_first_non_reserved_loc, since the value is never
read from there, and the value written back not read from here.
(is_location_from_builtin_token): Strengthen local map from
const line_map * to const line_map_ordinary *.
(dump_location_info): Strengthen locals map from
line_map *, one to const line_map_ordinary *, the other
to const line_map_macro *.
* tree-diagnostic.c (loc_map_pair): Strengthen field map from
const line_map * to const line_map_macro *.
(maybe_unwind_expanded_macro_loc): Add a call to
linemap_check_macro when writing to the map field of the
loc_map_pair.
Introduce local const line_map_ordinary * ord_map, using it in
place of map in the part of the function where we know we have
an ordinary map.  Strengthen local m from const line_map * to
const line_map_ordinary *.

gcc/ada/ChangeLog:
* gcc-interface/trans.c (Sloc_to_locus1): Strengthen local map
from line_map * to line_map_ordinary *.

gcc/c-family/ChangeLog:
* c-common.h (fe_file_change): Strengthen param from
const line_map * to const line_map_ordinary *.
(pp_file_change): Likewise.
* c-lex.c (fe_file_change): Likewise.
(cb_define): Use linemap_check_ordinary when invoking
SOURCE_LINE.
(cb_undef): Likewise.
* c-opts.c (c_finish_options): Use linemap_check_ordinary when
invoking cb_file_change.
(c_finish_options): Likewise.
(push_command_line_include): Likewise.
(cb_file_change): Strengthen param new_map from
const line_map * to const line_map_ordinary *.
* c-ppoutput.c (cb_define): Likewise for local map.
(pp_file_change): Likewise for param map and local from.

gcc/fortran/ChangeLog:
* cpp.c (maybe_print_line): Strengthen local map from
const line_map * to const line_map_ordinary *.
(cb_file_change): Likewise for param map and local 

[PATCH, i386]: Some trivial const_wide_int/const_double related cleanups

2015-05-04 Thread Uros Bizjak
Hello!

2015-05-04  Uros Bizjak  ubiz...@gmail.com

* config/i386/i386.c: Change GET_CODE (...) == CONST_DOUBLE check
to CONST_DOUBLE_P predicate.
(standard_sse_constant_p): Return 0 for !TARGET_SSE.
(ix86_legitimate_constant_p) case CONST_WIDE_INT: For 32bit targets,
allow only operands that satisfy standard_sse_constant_p predicate.
* config/i386/i386.md: Change GET_CODE (...) == CONST_DOUBLE check
to CONST_DOUBLE_P predicate.

Tested on x86_64-linux-gnu {,-m32} and committed to mainline SVN.

Uros.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 222767)
+++ config/i386/i386.c  (working copy)
@@ -9368,7 +9368,7 @@ standard_80387_constant_p (rtx x)
 
   REAL_VALUE_TYPE r;
 
-  if (!(X87_FLOAT_MODE_P (mode) && (GET_CODE (x) == CONST_DOUBLE)))
+  if (!(CONST_DOUBLE_P (x) && X87_FLOAT_MODE_P (mode)))
 return -1;
 
   if (x == CONST0_RTX (mode))
@@ -9469,9 +9469,14 @@ standard_80387_constant_rtx (int idx)
 int
 standard_sse_constant_p (rtx x)
 {
-  machine_mode mode = GET_MODE (x);
+  machine_mode mode;
 
-  if (x == const0_rtx || x == CONST0_RTX (GET_MODE (x)))
+  if (!TARGET_SSE)
+return 0;
+
+  mode = GET_MODE (x);
+  
+  if (x == const0_rtx || x == CONST0_RTX (mode))
 return 1;
   if (vector_all_ones_operand (x, mode))
 switch (mode)
@@ -13078,9 +13083,7 @@ ix86_legitimate_constant_p (machine_mode, rtx x)
   break;
 
 case CONST_WIDE_INT:
-  if (GET_MODE (x) == TImode
-  && x != CONST0_RTX (TImode)
-  && !TARGET_64BIT)
+  if (!TARGET_64BIT && !standard_sse_constant_p (x))
return false;
   break;
 
@@ -15903,7 +15906,7 @@ ix86_print_operand (FILE *file, rtx x, int code)
output_address (x);
 }
 
-  else if (GET_CODE (x) == CONST_DOUBLE && GET_MODE (x) == SFmode)
+  else if (CONST_DOUBLE_P (x) && GET_MODE (x) == SFmode)
 {
   REAL_VALUE_TYPE r;
   long l;
@@ -15921,7 +15924,7 @@ ix86_print_operand (FILE *file, rtx x, int code)
fprintf (file, "0x%08x", (unsigned int) l);
 }
 
-  else if (GET_CODE (x) == CONST_DOUBLE && GET_MODE (x) == DFmode)
+  else if (CONST_DOUBLE_P (x) && GET_MODE (x) == DFmode)
 {
   REAL_VALUE_TYPE r;
   long l[2];
@@ -15935,7 +15938,7 @@ ix86_print_operand (FILE *file, rtx x, int code)
 }
 
   /* These float cases don't actually occur as immediate operands.  */
-  else if (GET_CODE (x) == CONST_DOUBLE && GET_MODE (x) == XFmode)
+  else if (CONST_DOUBLE_P (x) && GET_MODE (x) == XFmode)
 {
   char dstr[30];
 
@@ -17364,8 +17367,7 @@ ix86_expand_move (machine_mode mode, rtx operands[
op1 = copy_to_mode_reg (mode, op1);
 
   if (can_create_pseudo_p ()
-  && FLOAT_MODE_P (mode)
-  && GET_CODE (op1) == CONST_DOUBLE)
+  && CONST_DOUBLE_P (op1))
{
  /* If we are loading a floating point constant to a register,
 force the value to memory now, since we'll get better code
@@ -19563,7 +19565,7 @@ ix86_expand_copysign (rtx operands[])
   else
 vmode = mode;
 
-  if (GET_CODE (op0) == CONST_DOUBLE)
+  if (CONST_DOUBLE_P (op0))
 {
   rtx (*copysign_insn)(rtx, rtx, rtx, rtx);
 
@@ -22632,7 +22634,7 @@ ix86_split_to_parts (rtx operand, rtx *parts, mach
  for (i = 1; i < size; i++)
parts[i] = adjust_address (operand, SImode, 4 * i);
}
- else if (GET_CODE (operand) == CONST_DOUBLE)
+ else if (CONST_DOUBLE_P (operand))
{
  REAL_VALUE_TYPE r;
  long l[4];
@@ -22683,7 +22685,7 @@ ix86_split_to_parts (rtx operand, rtx *parts, mach
  parts[0] = operand;
  parts[1] = adjust_address (operand, upper_mode, 8);
}
- else if (GET_CODE (operand) == CONST_DOUBLE)
+ else if (CONST_DOUBLE_P (operand))
{
  REAL_VALUE_TYPE r;
  long l[4];
@@ -41208,7 +41210,7 @@ ix86_preferred_reload_class (rtx x, reg_class_t re
 return SSE_CLASS_P (regclass) ? regclass : NO_REGS;
 
   /* Floating-point constants need more complex checks.  */
-  if (GET_CODE (x) == CONST_DOUBLE && GET_MODE (x) != VOIDmode)
+  if (CONST_DOUBLE_P (x))
 {
   /* General regs can load everything.  */
   if (reg_class_subset_p (regclass, GENERAL_REGS))
@@ -44551,9 +44553,9 @@ ix86_expand_vector_init (bool mmx_ok, rtx target,
   for (i = 0; i < n_elts; ++i)
 {
   x = XVECEXP (vals, 0, i);
-  if (!(CONST_INT_P (x)
-   || GET_CODE (x) == CONST_DOUBLE
-   || GET_CODE (x) == CONST_FIXED))
+  if (!(CONST_SCALAR_INT_P (x)
+   || CONST_DOUBLE_P (x)
+   || CONST_FIXED_P (x)))
n_var++, one_var = i;
   else if (x != CONST0_RTX (inner_mode))
all_const_zero = false;
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 222767)
+++ config/i386/i386.md (working copy)
@@ -2955,7 

Re: [RFA] More type narrowing in match.pd V2

2015-05-04 Thread H.J. Lu
I think this caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66009

H.J.


On Mon, May 4, 2015 at 2:02 AM, Richard Biener
richard.guent...@gmail.com wrote:
 On Sat, May 2, 2015 at 2:36 AM, Jeff Law l...@redhat.com wrote:
 Here's an updated patch to add more type narrowing to match.pd.

 Changes since the last version:

 Slight refactoring of the condition by using types_match as suggested by
 Richi.  I also applied the new types_match to 2 other patterns in match.pd
 where it seemed clearly appropriate.

 Additionally the transformation is restricted by using the new single_use
 predicate.  I didn't change other patterns in match.pd to use the new
 single_use predicate.  But some probably could be changed.

 This (of course) continues to pass the bootstrap and regression check for
 x86-linux-gnu.

 There's still a ton of work to do in this space.  This is meant to be an
 incremental stand-alone improvement.

 OK now?

 Ok with the {gimple,generic}-match-head.c changes mentioned in the ChangeLog.

 Thanks,
 Richard.



 Jeff

 diff --git a/gcc/ChangeLog b/gcc/ChangeLog
 index e006b26..5ee89de 100644
 --- a/gcc/ChangeLog
 +++ b/gcc/ChangeLog
 @@ -1,3 +1,8 @@
 +2015-05-01  Jeff Law  l...@redhat.com
 +
 +   * match.pd (bit_and (plus/minus (convert @0) (convert @1) mask): New
 +   simplifier to narrow arithmetic.
 +
  2015-05-01  Rasmus Villemoes  r...@rasmusvillemoes.dk

 * match.pd: New simplification patterns.
 diff --git a/gcc/generic-match-head.c b/gcc/generic-match-head.c
 index daa56aa..303b237 100644
 --- a/gcc/generic-match-head.c
 +++ b/gcc/generic-match-head.c
 @@ -70,4 +70,20 @@ along with GCC; see the file COPYING3.  If not see
  #include "dumpfile.h"
  #include "generic-match.h"

 +/* Routine to determine if the types T1 and T2 are effectively
 +   the same for GENERIC.  */

 +inline bool
 +types_match (tree t1, tree t2)
 +{
 +  return TYPE_MAIN_VARIANT (t1) == TYPE_MAIN_VARIANT (t2);
 +}
 +
 +/* Return if T has a single use.  For GENERIC, we assume this is
 +   always true.  */
 +
 +inline bool
 +single_use (tree t)
 +{
 +  return true;
 +}
 diff --git a/gcc/gimple-match-head.c b/gcc/gimple-match-head.c
 index c7b2f95..dc13218 100644
 --- a/gcc/gimple-match-head.c
 +++ b/gcc/gimple-match-head.c
 @@ -861,3 +861,21 @@ do_valueize (tree (*valueize)(tree), tree op)
return op;
  }

 +/* Routine to determine if the types T1 and T2 are effectively
 +   the same for GIMPLE.  */
 +
 +inline bool
 +types_match (tree t1, tree t2)
 +{
 +  return types_compatible_p (t1, t2);
 +}
 +
 +/* Return if T has a single use.  For GIMPLE, we also allow any
 +   non-SSA_NAME (ie constants) and zero uses to cope with uses
 +   that aren't linked up yet.  */
 +
 +inline bool
 +single_use (tree t)
 +{
 +  return TREE_CODE (t) != SSA_NAME || has_zero_uses (t) || has_single_use
 (t);
 +}
 diff --git a/gcc/match.pd b/gcc/match.pd
 index 87ecaf1..51a950a 100644
 --- a/gcc/match.pd
 +++ b/gcc/match.pd
 @@ -289,8 +289,7 @@ along with GCC; see the file COPYING3.  If not see
(if (((TREE_CODE (@1) == INTEGER_CST
   && INTEGRAL_TYPE_P (TREE_TYPE (@0))
   && int_fits_type_p (@1, TREE_TYPE (@0)))
 -   || (GIMPLE && types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1)))
 -   || (GENERIC && TREE_TYPE (@0) == TREE_TYPE (@1)))
 +   || types_match (TREE_TYPE (@0), TREE_TYPE (@1)))
 /* ???  This transform conflicts with fold-const.c doing
   Convert (T)(x & c) into (T)x & (T)c, if c is an integer
   constants (if x has signed type, the sign bit cannot be set
 @@ -949,8 +948,7 @@ along with GCC; see the file COPYING3.  If not see
  /* Unordered tests if either argument is a NaN.  */
  (simplify
   (bit_ior (unordered @0 @0) (unordered @1 @1))
 - (if ((GIMPLE && types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1)))
 -  || (GENERIC && TREE_TYPE (@0) == TREE_TYPE (@1)))
 + (if (types_match (TREE_TYPE (@0), TREE_TYPE (@1)))
(unordered @0 @1)))
  (simplify
   (bit_ior:c (unordered @0 @0) (unordered:c@2 @0 @1))
 @@ -1054,7 +1052,7 @@ along with GCC; see the file COPYING3.  If not see
 operation and convert the result to the desired type.  */
  (for op (plus minus)
(simplify
 -(convert (op (convert@2 @0) (convert@3 @1)))
 +(convert (op@4 (convert@2 @0) (convert@3 @1)))
  (if (INTEGRAL_TYPE_P (type)
  /* We check for type compatibility between @0 and @1 below,
 so there's no need to check that @1/@3 are integral types.  */
 @@ -1070,15 +1068,45 @@ along with GCC; see the file COPYING3.  If not see
   && TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))
  /* The inner conversion must be a widening conversion.  */
   && TYPE_PRECISION (TREE_TYPE (@2)) > TYPE_PRECISION (TREE_TYPE (@0))
 - && ((GENERIC
 -  && (TYPE_MAIN_VARIANT (TREE_TYPE (@0))
 - == TYPE_MAIN_VARIANT (TREE_TYPE (@1)))
 -  && (TYPE_MAIN_VARIANT (TREE_TYPE (@0))
 - == TYPE_MAIN_VARIANT (type)))
 - 

Re: [patch] Perform anonymous constant propagation during inlining

2015-05-04 Thread Richard Biener
On May 4, 2015 11:38:42 PM GMT+02:00, Eric Botcazou ebotca...@adacore.com 
wrote:
 2015-05-01  Eric Botcazou  ebotca...@adacore.com
 
  * expr.c (expand_expr_real_1) SSA_NAME: Try to substitute
constants
  on the RHS of expressions.
  * gimple-expr.h (is_gimple_constant): Reorder.

Bummer.  This breaks C++ debugging:

+FAIL: gdb.cp/class2.exp: print alpha at marker return 0
+FAIL: gdb.cp/class2.exp: print beta at marker return 0
+FAIL: gdb.cp/class2.exp: print * aap at marker return 0
+FAIL: gdb.cp/class2.exp: print * bbp at marker return 0
+FAIL: gdb.cp/class2.exp: print * abp at marker return 0, s-p-o off
+FAIL: gdb.cp/class2.exp: print * (B *) abp at marker return 0
+FAIL: gdb.cp/class2.exp: p acp
+FAIL: gdb.cp/class2.exp: p acp->c1
+FAIL: gdb.cp/class2.exp: p acp->c2

because C++ is apparently relying on the assignment to the anonymous
return 
object to preserve the debug info attached to a return statement.

Would you be OK with a slight variation of your earlier idea, i.e.
calling 
fold_stmt with a specific valueizer from fold_marked_statements instead
of the 
implicit no_follow_ssa_edges in the inliner?  Something like:

tree
follow_anonymous_single_use_edges (tree val)
{
  if (TREE_CODE (val) == SSA_NAME
   && (!SSA_NAME_VAR (val) || DECL_IGNORED_P (SSA_NAME_VAR (val)))
   && has_single_use (val))
    return val;
  return NULL_TREE;
}

Yes, that works for me as well.

Richard.



Re: [PATCH/libiberty] fix build of gdb/binutils with clang.

2015-05-04 Thread Ian Lance Taylor
On Mon, May 4, 2015 at 3:49 PM, Yunlian Jiang yunl...@google.com wrote:
 There was a similar discussion here
 https://gcc.gnu.org/ml/gcc/2005-11/msg01190.html

That was a discussion about libiberty.  Your subject says you have
trouble building gdb.

Can you describe the exact problem that you are having?  What
precisely are you doing?  What precisely happens?


 The problem is in the configure stage: _GNU_SOURCE is not
 defined, so it cannot find
 the declaration of asprintf and makes its own declaration of asprintf in
 libiberty.h. For the file floatformat.c,
 _GNU_SOURCE is defined, so it finds another asprintf in
 /usr/include/bits/stdio2.h while also including
 libiberty.h. These two asprintf declarations conflict when __USE_FORTIFY_LEVEL is set.

I think the basic guideline should be that HAVE_DECL_ASPRINTF should
be correct.  If libiberty compiled with _GNU_SOURCE defined, then it
should test HAVE_DECL_ASPRINTF with _GNU_SOURCE defined.  If not, then
not.  So perhaps the problem is that libiberty is compiling some files
with _GNU_SOURCE defined and some not.
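
A minimal sketch of that guideline, assuming a glibc-like system where asprintf is declared once _GNU_SOURCE is defined: the fallback declaration is emitted only when the configure result says the system headers do not provide one, and that result must be computed with the same feature macros the file is compiled with. HAVE_DECL_ASPRINTF is hard-wired here in place of a real config.h, and format_id is a hypothetical helper.

```cpp
// Feature macro must be set before any system header is included, and
// must match what configure used when it tested for the declaration.
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1
#endif
#include <cstdio>
#include <cstdlib>
#include <string>

// Assumption for this sketch: configure found a declaration.
#ifndef HAVE_DECL_ASPRINTF
#define HAVE_DECL_ASPRINTF 1
#endif

// Declare asprintf ourselves only if the system headers do not.
#if !HAVE_DECL_ASPRINTF
extern "C" int asprintf (char **, const char *, ...);
#endif

// Hypothetical helper that formats into a freshly allocated buffer.
inline std::string format_id (int id)
{
  char *buf = nullptr;
  if (asprintf (&buf, "id=%d", id) < 0)
    return std::string ();
  std::string out (buf);
  std::free (buf);
  return out;
}
```

If the macro and the headers disagree, the fallback declaration and the system one can both be visible, which is the conflict described above.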

Ian


Re: [patch] libstdc++/56117 make std::async launch new threads by default

2015-05-04 Thread Jonathan Wakely

On 02/05/15 19:56 +0100, Jonathan Wakely wrote:

One last patch before I head to Lenexa, this fixes the long standing
not-a-bug that our default launch policy is launch::deferred.

This way std::async with no explicit policy or with any policy that
contains launch::async will run in a new thread.

Apparently libc++ does the same and they aren't getting lots of
complaints about fork-bombs, so let's try the same thing. If people
don't like it we have plenty of time in stage 1 to reconsider.

Tested x86_64-linux and powerpc64le-linux, I'm going to commit this to
trunk unless someone strongly objects.
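
To illustrate the difference between the two explicit policies (the default policy is left out, since its behaviour is what the patch changes), here is a small sketch; run_with_policy is a hypothetical helper, not part of libstdc++.

```cpp
#include <future>
#include <thread>

// Returns the id of the thread that actually ran the task.
inline std::thread::id run_with_policy (std::launch policy)
{
  std::future<std::thread::id> f
    = std::async (policy, [] { return std::this_thread::get_id (); });
  // For launch::deferred the task runs here, inside get (), on the
  // calling thread; for launch::async it runs on a separate thread.
  return f.get ();
}
```

Calling it with std::launch::deferred yields the caller's own thread id, while std::launch::async yields a different one.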


Committed to trunk.


[debug-early] fix problem with template parameter packs

2015-05-04 Thread Aldy Hernandez
The code handling parameter DIEs needed a little tweaking for variable 
length template arguments.  I've relaxed the original assert, but this 
may require tweaking at branch review time-- hopefully later this week.


Committing to branch.

Aldy

p.s. Richi/Jason: Winter is coming.  Down to 1 GCC regression which is 
actually a missed DIE optimization which I hope I can fix post merge.
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index c51cea1..a5b155f 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -18018,8 +18018,20 @@ gen_formal_parameter_die (tree node, tree origin, bool 
emit_name_p,
 DW_AT_abstract_origin.  */
   if (parm_die && parm_die->die_parent != context_die)
{
- gcc_assert (!DECL_ABSTRACT_P (node));
- parm_die = NULL;
+ if (!DECL_ABSTRACT_P (node))
+   {
+ gcc_assert (!DECL_ABSTRACT_P (node));
+ parm_die = NULL;
+   }
+ else
+   {
+ /* Reuse DIE even with a differing context.  This
+happens when called through
+dwarf2out_abstract_function for
+formal parameter packs.  */
+ gcc_assert (parm_die->die_parent->die_tag
+ == DW_TAG_GNU_formal_parameter_pack);
+   }
}
 
   if (parm_die && parm_die->die_parent == NULL)