[PATCH] Windows libiberty: Don't quote args unnecessarily

2014-05-07 Thread Ray Donnelly
We only quote arguments that contain spaces, \t or "
characters to prevent wasting 2 characters per
argument of the CreateProcess() 32,768 limit.
---
 libiberty/pex-win32.c | 46 +-
 1 file changed, 37 insertions(+), 9 deletions(-)

diff --git a/libiberty/pex-win32.c b/libiberty/pex-win32.c
index eae72c5..8b9d4f0 100644
--- a/libiberty/pex-win32.c
+++ b/libiberty/pex-win32.c
@@ -340,17 +340,25 @@ argv_to_cmdline (char *const *argv)
   char *p;
   size_t cmdline_len;
   int i, j, k;
+  int needs_quotes;
 
   cmdline_len = 0;
   for (i = 0; argv[i]; i++)
 {
-  /* We quote every last argument.  This simplifies the problem;
-we need only escape embedded double-quotes and immediately
+  /* We only quote arguments that contain spaces, \t or " characters to
+prevent wasting 2 chars per argument of the CreateProcess 32k char
+limit.  We need only escape embedded double-quotes and immediately
 preceeding backslash characters.  A sequence of backslach characters
 that is not follwed by a double quote character will not be
 escaped.  */
+  needs_quotes = 0;
   for (j = 0; argv[i][j]; j++)
{
+ if (argv[i][j] == ' ' || argv[i][j] == '\t' || argv[i][j] == '"')
+   {
+ needs_quotes = 1;
+   }
+
  if (argv[i][j] == '"')
{
  /* Escape preceeding backslashes.  */
@@ -362,16 +370,33 @@ argv_to_cmdline (char *const *argv)
}
   /* Trailing backslashes also need to be escaped because they will be
  followed by the terminating quote.  */
-  for (k = j - 1; k >= 0 && argv[i][k] == '\\'; k--)
-   cmdline_len++;
+  if (needs_quotes)
+{
+  for (k = j - 1; k >= 0 && argv[i][k] == '\\'; k--)
+cmdline_len++;
+}
   cmdline_len += j;
-  cmdline_len += 3;  /* for leading and trailing quotes and space */
+  /* for leading and trailing quotes and space */
+  cmdline_len += needs_quotes * 2 + 1;
 }
   cmdline = XNEWVEC (char, cmdline_len);
   p = cmdline;
   for (i = 0; argv[i]; i++)
 {
-  *p++ = '"';
+  needs_quotes = 0;
+  for (j = 0; argv[i][j]; j++)
+{
+  if (argv[i][j] == ' ' || argv[i][j] == '\t' || argv[i][j] == '"')
+{
+  needs_quotes = 1;
+  break;
+}
+}
+
+  if (needs_quotes)
+{
+  *p++ = '"';
+}
   for (j = 0; argv[i][j]; j++)
{
  if (argv[i][j] == '"')
@@ -382,9 +407,12 @@ argv_to_cmdline (char *const *argv)
}
  *p++ = argv[i][j];
}
-  for (k = j - 1; k >= 0 && argv[i][k] == '\\'; k--)
-   *p++ = '\\';
-  *p++ = '"';
+  if (needs_quotes)
+{
+  for (k = j - 1; k >= 0 && argv[i][k] == '\\'; k--)
+*p++ = '\\';
+  *p++ = '"';
+}
   *p++ = ' ';
 }
   p[-1] = '\0';
-- 
1.9.2
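
The quoting rule in the patch above can be sketched for a single argument as follows. This is only an illustration of the rule, not the libiberty code: quote_arg is a hypothetical helper, it handles one argument rather than a whole argv, and it allocates with malloc instead of XNEWVEC.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated copy of ARG, wrapped in
   double quotes only if it contains a space, a tab or a double quote.
   Returns NULL if allocation fails.  */
char *
quote_arg (const char *arg)
{
  size_t len = strlen (arg);
  int needs_quotes = strpbrk (arg, " \t\"") != NULL;
  /* Worst case: every character doubled, plus two quotes and the NUL.  */
  char *out = malloc (2 * len + 3);
  char *p = out;
  size_t j, k;

  if (out == NULL)
    return NULL;

  if (needs_quotes)
    *p++ = '"';
  for (j = 0; j < len; j++)
    {
      if (arg[j] == '"')
        {
          /* Double the backslashes immediately preceding the quote...  */
          for (k = j; k > 0 && arg[k - 1] == '\\'; k--)
            *p++ = '\\';
          /* ...then escape the quote itself.  */
          *p++ = '\\';
        }
      *p++ = arg[j];
    }
  if (needs_quotes)
    {
      /* Trailing backslashes are doubled only when a closing quote
         follows them.  */
      for (k = len; k > 0 && arg[k - 1] == '\\'; k--)
        *p++ = '\\';
      *p++ = '"';
    }
  *p = '\0';
  return out;
}
```

With this sketch, foo stays foo, while a b becomes "a b"; a trailing backslash is left alone in an unquoted argument and doubled in a quoted one.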



[PATCH] Windows libiberty: Don't quote args unnecessarily (v2)

2014-05-07 Thread Ray Donnelly
We only quote arguments that contain spaces, \t or "
characters to prevent wasting 2 characters per
argument of the CreateProcess() 32,768 limit.

libiberty/
* pex-win32.c (argv_to_cmdline): Don't quote
args unnecessarily

Ray Donnelly (1):
  Windows libibery: Don't quote args unnecessarily

 libiberty/pex-win32.c | 46 +-
 1 file changed, 37 insertions(+), 9 deletions(-)

-- 
1.9.2



Re: we are starting the wide int merge

2014-05-07 Thread Andreas Schwab
Christophe Lyon christophe.l...@linaro.org writes:

> It also looks like the git-svn-id property is now wrong/incomplete.
> For instance, commit 9a5942c1d4d9116ab74b0741cfe3894a89fd17fb has:
> git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/wide-int@201706
> 138bc75d-0d04-0410-961f-82ee72b054a4
>
> How does it map to the SVN commit in trunk?

This is a commit on the wide-int branch (the one that created it).

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
And now for something completely different.


[PATCH, nds32] Committed: Enable HONOR_REG_ALLOC_ORDER when optimizing for size.

2014-05-07 Thread Chung-Ju Wu
Hi, all,

There was a patch to have HONOR_REG_ALLOC_ORDER using C expression:
  http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01546.html
  http://gcc.gnu.org/ml/gcc-patches/2014-05/msg00048.html

This is very helpful to the nds32 port, since we can decide when to apply
HONOR_REG_ALLOC_ORDER based on the trade-off between code size and performance.
Currently, HONOR_REG_ALLOC_ORDER only benefits code size in the nds32 port.

ChangeLog and patch are as below, committed as Rev.210137:


Index: gcc/ChangeLog
===
--- gcc/ChangeLog   (revision 210135)
+++ gcc/ChangeLog   (revision 210137)
@@ -1,3 +1,8 @@
+2014-05-07  Chung-Ju Wu  jasonw...@gmail.com
+
+   * config/nds32/nds32.h (HONOR_REG_ALLOC_ORDER): Have it in favor
+   of using optimize_size.
+
 2014-05-06  Mike Stump  mikest...@comcast.net

* wide-int.h (wi::int_traits HOST_WIDE_INT): Always define.

Index: gcc/config/nds32/nds32.h
===
--- gcc/config/nds32/nds32.h	(revision 210135)
+++ gcc/config/nds32/nds32.h	(revision 210137)
@@ -553,7 +553,7 @@

 /* Tell IRA to use the order we define rather than messing it up with its
own cost calculations.  */
-#define HONOR_REG_ALLOC_ORDER 1
+#define HONOR_REG_ALLOC_ORDER optimize_size

 /* The number of consecutive hard regs needed starting at
reg regno for holding a value of mode mode.  */


Best regards,
jasonwucj
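
To illustrate what the change to a C expression buys, here is a minimal sketch. The optimize_size variable below is only a stand-in for GCC's flag of the same name, and use_target_order_p is a hypothetical consumer, not the register allocator's actual code.

```c
/* Stand-in for GCC's optimize_size flag (nonzero under -Os).  */
int optimize_size = 0;

/* As in the nds32.h hunk: the macro expands to an expression that is
   evaluated each time it is used, not to a fixed constant.  */
#define HONOR_REG_ALLOC_ORDER optimize_size

/* Hypothetical consumer: decide at run time whether the target's
   register allocation order should be honored.  */
int
use_target_order_p (void)
{
  return HONOR_REG_ALLOC_ORDER != 0;
}
```

Because the macro is evaluated at each use, the same compiler binary can honor the order when optimizing for size and ignore it otherwise.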


[PATCH][4.7] Fix PR57864

2014-05-07 Thread Richard Biener

This backports a piece of

2012-09-24  Richard Guenther  rguent...@suse.de

   * tree-ssa-pre.c (bitmap_find_leader, create_expression_by_pieces,
   find_or_generate_expression): Remove dominating stmt argument.
   (find_leader_in_sets, phi_translate_1, bitmap_find_leader,
   create_component_ref_by_pieces_1, create_component_ref_by_pieces,
   do_regular_insertion, do_partial_partial_insertion): Adjust.
   (compute_avail): Do not set uids.

to the 4.7 branch.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to
the branch (and the testcase added to 4.8, 4.9 and trunk).

Richard.

2014-05-06  Richard Biener  rguent...@suse.de

PR tree-optimization/57864
* tree-ssa-pre.c (phi_translate_1): Backport NAME case
simplification from mainline.  Do not lookup the VN
value-number here.

* gcc.dg/torture/pr57864.c: New testcase.

Index: gcc/tree-ssa-pre.c
===
*** gcc/tree-ssa-pre.c  (revision 210104)
--- gcc/tree-ssa-pre.c  (working copy)
*************** phi_translate_1 (pre_expr expr, bitmap_s
*** 1756,1794 ****
  
  case NAME:
{
-   gimple phi = NULL;
-   edge e;
-   gimple def_stmt;
tree name = PRE_EXPR_NAME (expr);
! 
!   def_stmt = SSA_NAME_DEF_STMT (name);
if (gimple_code (def_stmt) == GIMPLE_PHI
 && gimple_bb (def_stmt) == phiblock)
- phi = def_stmt;
-   else
- return expr;
- 
-   e = find_edge (pred, gimple_bb (phi));
-   if (e)
  {
!   tree def = PHI_ARG_DEF (phi, e->dest_idx);
!   pre_expr newexpr;
! 
!   if (TREE_CODE (def) == SSA_NAME)
! def = VN_INFO (def)->valnum;
  
/* Handle constant. */
if (is_gimple_min_invariant (def))
  return get_or_alloc_expr_for_constant (def);
  
!   if (TREE_CODE (def) == SSA_NAME && ssa_undefined_value_p (def))
! return NULL;
! 
!   newexpr = get_or_alloc_expr_for_name (def);
!   return newexpr;
  }
}
-   return expr;
  
  default:
gcc_unreachable ();
--- 1756,1781 ----
  
  case NAME:
{
tree name = PRE_EXPR_NAME (expr);
!   gimple def_stmt = SSA_NAME_DEF_STMT (name);
!   /* If the SSA name is defined by a PHI node in this block,
!  translate it.  */
if (gimple_code (def_stmt) == GIMPLE_PHI
 && gimple_bb (def_stmt) == phiblock)
  {
!   edge e = find_edge (pred, gimple_bb (def_stmt));
!   tree def = PHI_ARG_DEF (def_stmt, e->dest_idx);
  
/* Handle constant. */
if (is_gimple_min_invariant (def))
  return get_or_alloc_expr_for_constant (def);
  
!   return get_or_alloc_expr_for_name (def);
  }
+   /* Otherwise return it unchanged - it will get cleaned if its
+  value is not available in PREDs AVAIL_OUT set of expressions.  */
+   return expr;
}
  
  default:
gcc_unreachable ();
Index: gcc/testsuite/gcc.dg/torture/pr57864.c
===
*** gcc/testsuite/gcc.dg/torture/pr57864.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr57864.c  (working copy)
***************
*** 0 ****
--- 1,37 ----
+ /* { dg-do compile } */
+ 
+ union U {
+ double val;
+ union U *ptr;
+ };
+ 
+ union U *d;
+ double a;
+ int b;
+ int c;
+ 
+ static void fn1(union U *p1, int p2, _Bool p3)
+ {
+ union U *e;
+ 
+ if (p2 == 0)
+   a = ((union U*)((unsigned long)p1 & ~1))->val;
+ 
+ if (b) {
+   e = p1;
+ } else if (c) {
+   e = ((union U*)((unsigned long)p1 & ~1))->ptr;
+   d = e;
+ } else {
+   e = 0;
+   d = ((union U*)0)->ptr;
+ }
+ 
+ fn1 (e, 0, 0);
+ fn1 (0, 0, p3);
+ }
+ 
+ void fn2 (void)
+ {
+   fn1 (0, 0, 0);
+ }


Re: [PATCH] Change HONOR_REG_ALLOC_ORDER to a marco for C expression

2014-05-07 Thread Chung-Ju Wu
2014-05-02 14:41 GMT+08:00 Kito Cheng kito.ch...@gmail.com:
 Hi Jeff:

 I fixed up some minor whitespace issues and committed your patch.

 Thanks for your help :)

Hi,

I noticed the commit date in ChangeLog was incorrect for the patch.
Fixed it as obvious.  Committed into Rev.210138.

Index: gcc/ChangeLog
===
--- gcc/ChangeLog   (revision 210137)
+++ gcc/ChangeLog   (revision 210138)
@@ -1092,7 +1092,7 @@

* doc/invoke.texi: Describe -fsanitize=float-divide-by-zero.

-2014-02-26  Kito Cheng  k...@0xlab.org
+2014-05-02  Kito Cheng  k...@0xlab.org

* defaults.h (HONOR_REG_ALLOC_ORDER): Change HONOR_REG_ALLOC_ORDER
to a C expression marco.


Best regards,
jasonwucj


patch1.diff updated + test results Was: Re: GCC's -fsplit-stack disturbing Mach's vm_allocate

2014-05-07 Thread Svante Signell
On Tue, 2014-05-06 at 15:26 +0200, Samuel Thibault wrote:
 Svante Signell, le Tue 06 May 2014 15:25:38 +0200, a écrit :
  On Tue, 2014-05-06 at 15:07 +0200, Samuel Thibault wrote:
   Svante Signell, le Tue 06 May 2014 15:05:20 +0200, a écrit :
On Tue, 2014-05-06 at 14:51 +0200, Samuel Thibault wrote:
 Just to explicitly ask for it:
 
 Svante Signell, le Tue 06 May 2014 10:06:49 +0200, a écrit :
  For some (yet) unknown reason all libgo tests fails with a segfault 
  when
  run in the build tree: make, sh or something else, the test 
  commands are
  rather hard to track.
 
 Doesn't that dump a core?  Do you have /servers/crash properly 
 pointing
 to /servers/crash-dump-core and ulimit -u set to unlimited?

More good news:
- Installing the modified libpthread.so.0.3 made the segfault go away. I
could now run the check from the build tree :-)

- Adding
#define TARGET_THREAD_SSP_OFFSET 0x14
to patch1.diff and building gcc-4.9.0-2 the test results are summarised
as follows :-)
=== libgo Summary ===

# of expected passes		101
# of unexpected failures	21

I think some of the remaining failures are rather easy to fix.

Attached is an updated patch1.diff.
What remains is to solve the problem with patch8.diff: adding arch-specific
code to src/libgo/mksysinfo.sh
--- a/src/gcc/config/i386/gnu.h
+++ b/src/gcc/config/i386/gnu.h
@@ -37,11 +37,14 @@
 
 #ifdef TARGET_LIBC_PROVIDES_SSP
 
-/* Not supported yet.  */
-# undef TARGET_THREAD_SSP_OFFSET
-
-/* Not supported yet.  */
-# undef TARGET_CAN_SPLIT_STACK
-# undef TARGET_THREAD_SPLIT_STACK_OFFSET
+/* i386 glibc provides __stack_chk_guard in %gs:0x14.  */
+#define TARGET_THREAD_SSP_OFFSET	0x14
 
+/* We only build the -fsplit-stack support in libgcc if the
+   assembler has full support for the CFI directives.  */
+#if HAVE_GAS_CFI_PERSONALITY_DIRECTIVE
+#define TARGET_CAN_SPLIT_STACK
+#endif
+/* We steal the last transactional memory word.  */
+#define TARGET_THREAD_SPLIT_STACK_OFFSET 0x30
 #endif


[PATCH] [PING^2] Fix for PR libstdc++/60758

2014-05-07 Thread Yury Gribov

 Original Message 
Subject: [PING] [PATCH] Fix for PR libstdc++/60758
Date: Thu, 17 Apr 2014 17:48:12 +0400
From: Alexey Merzlyakov alexey.merzlya...@samsung.com
To: Ramana Radhakrishnan ramra...@arm.com
CC: gcc-patches@gcc.gnu.org gcc-patches@gcc.gnu.org, Viacheslav 
Garbuzov v.garbu...@samsung.com, Yury Gribov y.gri...@samsung.com


Hi,

This fixes an infinite backtrace in __cxa_end_cleanup().
Regtest finished with no regressions on arm-linux-gnueabi(sf).

The patch was posted at:
  http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00496.html

Thanks in advance.

Best regards,
Merzlyakov Alexey



2014-05-07  Alexey Merzlyakov alexey.merzlya...@samsung.com

	PR libstdc++/60758
	* libsupc++/eh_arm.cc (__cxa_end_cleanup): Change r4 to lr in save/restore
	and add unwind directives.

diff --git a/libstdc++-v3/libsupc++/eh_arm.cc b/libstdc++-v3/libsupc++/eh_arm.cc
index aa453dd..6a45af5 100644
--- a/libstdc++-v3/libsupc++/eh_arm.cc
+++ b/libstdc++-v3/libsupc++/eh_arm.cc
@@ -199,27 +199,33 @@ asm ("	.global __cxa_end_cleanup\n"
 "	nop		5\n");
 #else
 // Assembly wrapper to call __gnu_end_cleanup without clobbering r1-r3.
-// Also push r4 to preserve stack alignment.
+// Also push lr to preserve stack alignment and to allow backtracing.
 #ifdef __thumb__
 asm ("  .pushsection .text.__cxa_end_cleanup\n"
 "	.global __cxa_end_cleanup\n"
 "	.type __cxa_end_cleanup, \"function\"\n"
 "	.thumb_func\n"
 "__cxa_end_cleanup:\n"
-"	push\t{r1, r2, r3, r4}\n"
+"	.fnstart\n"
+"	push\t{r1, r2, r3, lr}\n"
+"	.save\t{r1, r2, r3, lr}\n"
 "	bl\t__gnu_end_cleanup\n"
-"	pop\t{r1, r2, r3, r4}\n"
+"	pop\t{r1, r2, r3, lr}\n"
 "	bl\t_Unwind_Resume @ Never returns\n"
+"	.fnend\n"
 "	.popsection\n");
 #else
 asm ("  .pushsection .text.__cxa_end_cleanup\n"
 "	.global __cxa_end_cleanup\n"
 "	.type __cxa_end_cleanup, \"function\"\n"
 "__cxa_end_cleanup:\n"
-"	stmfd\tsp!, {r1, r2, r3, r4}\n"
+"	.fnstart\n"
+"	stmfd\tsp!, {r1, r2, r3, lr}\n"
+"	.save\t{r1, r2, r3, lr}\n"
 "	bl\t__gnu_end_cleanup\n"
-"	ldmfd\tsp!, {r1, r2, r3, r4}\n"
+"	ldmfd\tsp!, {r1, r2, r3, lr}\n"
 "	bl\t_Unwind_Resume @ Never returns\n"
+"	.fnend\n"
 "	.popsection\n");
 #endif
 #endif


Re: [PATCH] [PING^2] Fix for PR libstdc++/60758

2014-05-07 Thread Paolo Carlini

Hi,

On 05/07/2014 10:19 AM, Yury Gribov wrote:

 Original Message 
Subject: [PING] [PATCH] Fix for PR libstdc++/60758
Date: Thu, 17 Apr 2014 17:48:12 +0400
From: Alexey Merzlyakov alexey.merzlya...@samsung.com
To: Ramana Radhakrishnan ramra...@arm.com
CC: gcc-patches@gcc.gnu.org gcc-patches@gcc.gnu.org, Viacheslav 
Garbuzov v.garbu...@samsung.com, Yury Gribov y.gri...@samsung.com


Hi,

This fixes infinite backtrace in __cxa_end_cleanup().
Regtest was finished with no regressions on arm-linux-gnueabi(sf).

The patch posted at:
  http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00496.html
I think you want an ARM maintainer for this. I'm adding some in CC. 
Also, remember to send patches touching the C++ library to the mailing 
list too.


Paolo.


[C++ Patch] PR 61080

2014-05-07 Thread Paolo Carlini

Hi,

thus I prepared this simple patch. Tested x86_64-linux.

Thanks,
Paolo.

/
/cp
2014-05-07  Paolo Carlini  paolo.carl...@oracle.com

PR c++/61080
* pt.c (instantiate_decl): Avoid generating the body of a
deleted function.

/testsuite
2014-05-07  Paolo Carlini  paolo.carl...@oracle.com

PR c++/61080
* g++.dg/cpp0x/deleted7.C: New.
Index: cp/pt.c
===
--- cp/pt.c (revision 210140)
+++ cp/pt.c (working copy)
@@ -19542,6 +19542,7 @@ instantiate_decl (tree d, int defer_ok,
   int saved_unevaluated_operand = cp_unevaluated_operand;
   int saved_inhibit_evaluation_warnings = c_inhibit_evaluation_warnings;
   bool external_p;
+  bool deleted_p;
   tree fn_context;
   bool nested;
 
@@ -19623,11 +19624,17 @@ instantiate_decl (tree d, int defer_ok,
 args = gen_args;
 
   if (TREE_CODE (d) == FUNCTION_DECL)
-pattern_defined = (DECL_SAVED_TREE (code_pattern) != NULL_TREE
-  || DECL_DEFAULTED_OUTSIDE_CLASS_P (code_pattern)
-  || DECL_DELETED_FN (code_pattern));
+{
+  deleted_p = DECL_DELETED_FN (code_pattern);
+  pattern_defined = (DECL_SAVED_TREE (code_pattern) != NULL_TREE
+|| DECL_DEFAULTED_OUTSIDE_CLASS_P (code_pattern)
+|| deleted_p);
+}
   else
-pattern_defined = ! DECL_IN_AGGR_P (code_pattern);
+{
+  deleted_p = false;
+  pattern_defined = ! DECL_IN_AGGR_P (code_pattern);
+}
 
   /* We may be in the middle of deferred access check.  Disable it now.  */
   push_deferring_access_checks (dk_no_deferred);
@@ -19671,7 +19678,10 @@ instantiate_decl (tree d, int defer_ok,
 elsewhere, we don't want to instantiate the entire data
 member, but we do want to instantiate the initializer so that
 we can substitute that elsewhere.  */
-  || (external_p && VAR_P (d)))
+  || (external_p && VAR_P (d))
+  /* Handle here a deleted function too, avoid generating
+its body (c++/61080).  */
+  || deleted_p)
 {
   /* The definition of the static data member is now required so
 we must substitute the initializer.  */
@@ -19867,17 +19877,14 @@ instantiate_decl (tree d, int defer_ok,
   tf_warning_or_error, tmpl,
   /*integral_constant_expression_p=*/false);
 
- if (DECL_STRUCT_FUNCTION (code_pattern))
-   {
- /* Set the current input_location to the end of the function
-so that finish_function knows where we are.  */
- input_location
-   = DECL_STRUCT_FUNCTION (code_pattern)->function_end_locus;
+ /* Set the current input_location to the end of the function
+so that finish_function knows where we are.  */
+ input_location
+   = DECL_STRUCT_FUNCTION (code_pattern)->function_end_locus;
 
- /* Remember if we saw an infinite loop in the template.  */
- current_function_infinite_loop
-   = DECL_STRUCT_FUNCTION (code_pattern)->language->infinite_loop;
-   }
+ /* Remember if we saw an infinite loop in the template.  */
+ current_function_infinite_loop
+   = DECL_STRUCT_FUNCTION (code_pattern)->language->infinite_loop;
}
 
   /* We don't need the local specializations any more.  */
Index: testsuite/g++.dg/cpp0x/deleted7.C
===
--- testsuite/g++.dg/cpp0x/deleted7.C   (revision 0)
+++ testsuite/g++.dg/cpp0x/deleted7.C   (working copy)
@@ -0,0 +1,36 @@
+// PR c++/61080
+// { dg-do compile { target c++11 } }
+// { dg-options "-Wreturn-type" }
+
+struct AAA
+{
+  int a1, a2, a3;
+  void *p;
+};
+
+template <typename K, typename V>
+class WeakMapPtr
+{
+  public:
+WeakMapPtr() : ptr(nullptr) {};
+bool init(AAA *cx);
+  private:
+void *ptr;
+WeakMapPtr(const WeakMapPtr &wmp) = delete;
+WeakMapPtr &operator=(const WeakMapPtr &wmp) = delete;
+};
+
+template <typename K, typename V>
+bool WeakMapPtr<K, V>::init(AAA *cx)
+{
+ptr = cx->p;
+return true;
+}
+
+struct JSObject
+{
+  int blah;
+  float meh;
+};
+
+template class WeakMapPtr<JSObject*, JSObject*>;


PR 61084: SPARC fallout from wide-int merge

2014-05-07 Thread Richard Sandiford
The DImode constant splitter assigned the result of trunc_int_for_mode
to an unsigned int rather than a HOST_WIDE_INT.  This then produced const_ints
that were zero-extended rather than sign-extended and tripped the assert:

gcc_checking_assert (INTVAL (x.first)
 == sext_hwi (INTVAL (x.first), precision)
 || (x.second == BImode && INTVAL (x.first) == 1));

The other hunks are just by inspection, but I think gen_int_mode is
preferred over GEN_INT when the mode is obvious.

Tested by Rainer, who says that the bootstrap now completes.
OK to install?

Thanks,
Richard
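
The failure mode described above can be reproduced in isolation. The sketch below assumes a 32-bit unsigned int; trunc_to_si is a hypothetical stand-in for trunc_int_for_mode (x, SImode), not the GCC function.

```c
#include <stdint.h>

/* Hypothetical stand-in for trunc_int_for_mode (x, SImode): keep the
   low 32 bits of X and sign-extend them to 64 bits.  */
int64_t
trunc_to_si (int64_t x)
{
  uint32_t low = (uint32_t) x;
  if (low & 0x80000000u)
    return (int64_t) low - 4294967296LL;
  return (int64_t) low;
}

/* Old splitter behavior: storing the result in an unsigned int drops
   the sign, so widening it again zero-extends (assumes a 32-bit
   unsigned int).  */
int64_t
via_unsigned_int (int64_t x)
{
  unsigned int low = (unsigned int) trunc_to_si (x);
  return (int64_t) low;
}

/* Fixed behavior: a 64-bit signed variable keeps the sign extension.  */
int64_t
via_wide_int (int64_t x)
{
  int64_t low = trunc_to_si (x);
  return low;
}
```

For the input 0xfffffffe the wide variable holds -2, while the unsigned int path yields 4294967294, which is the kind of zero-extended value the assert rejects.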


gcc/
PR target/61084
* config/sparc/sparc.md: Fix types of low and high in DI constant
splitter.  Use gen_int_mode in some other splitters.

Index: gcc/config/sparc/sparc.md
===
--- gcc/config/sparc/sparc.md   2014-05-07 10:15:23.051156294 +0100
+++ gcc/config/sparc/sparc.md   2014-05-07 10:15:27.922201361 +0100
@@ -1886,7 +1886,7 @@ (define_split
   emit_insn (gen_movsi (gen_lowpart (SImode, operands[0]),
operands[1]));
 #else
-  unsigned int low, high;
+  HOST_WIDE_INT low, high;
 
   low = trunc_int_for_mode (INTVAL (operands[1]), SImode);
   high = trunc_int_for_mode (INTVAL (operands[1]) >> 32, SImode);
@@ -4822,7 +4822,7 @@ (define_split
   [(set (match_dup 3) (match_dup 4))
(set (match_dup 0) (ior:SI (not:SI (match_dup 3)) (match_dup 1)))]
 {
-  operands[4] = GEN_INT (~INTVAL (operands[2]));
+  operands[4] = gen_int_mode (~INTVAL (operands[2]), SImode);
 })
 
 (define_insn_and_split "*or_not_di_sp32"
@@ -4899,7 +4899,7 @@ (define_split
   [(set (match_dup 3) (match_dup 4))
(set (match_dup 0) (not:SI (xor:SI (match_dup 3) (match_dup 1))))]
 {
-  operands[4] = GEN_INT (~INTVAL (operands[2]));
+  operands[4] = gen_int_mode (~INTVAL (operands[2]), SImode);
 })
 
 (define_split
@@ -4911,7 +4911,7 @@ (define_split
   [(set (match_dup 3) (match_dup 4))
(set (match_dup 0) (xor:SI (match_dup 3) (match_dup 1)))]
 {
-  operands[4] = GEN_INT (~INTVAL (operands[2]));
+  operands[4] = gen_int_mode (~INTVAL (operands[2]), SImode);
 })
 
 ;; Split DImode logical operations requiring two instructions.


Re: [PATCH] [PING^2] Fix for PR libstdc++/60758

2014-05-07 Thread Ramana Radhakrishnan

On 05/07/14 09:19, Yury Gribov wrote:

 Original Message 
Subject: [PING] [PATCH] Fix for PR libstdc++/60758
Date: Thu, 17 Apr 2014 17:48:12 +0400
From: Alexey Merzlyakov alexey.merzlya...@samsung.com
To: Ramana Radhakrishnan ramra...@arm.com
CC: gcc-patches@gcc.gnu.org gcc-patches@gcc.gnu.org, Viacheslav
Garbuzov v.garbu...@samsung.com, Yury Gribov y.gri...@samsung.com

Hi,

This fixes infinite backtrace in __cxa_end_cleanup().
Regtest was finished with no regressions on arm-linux-gnueabi(sf).

The patch posted at:
http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00496.html


This is OK to apply if no regressions.

Thanks,
Ramana



Thanks in advance.

Best regards,
Merzlyakov Alexey






[PATCH GCC]Add 'force-dwarf-lexical-blocks' command line option

2014-05-07 Thread Herman, Andrei

Hi,

Currently GCC only emits DWARF debug information (DW_TAG_lexical_block DIEs)
for compound statements containing significant local declarations.
However, code coverage tools that process the DWARF debug information to
implement block/path coverage need more complete lexical block information. 

This patch adds the necessary functionality under the control of a new 
command line argument: -fforce-dwarf-lexical-blocks.

When this flag is set, a DW_TAG_lexical_block DIE will be emitted for every
function body, loop body, switch body, case statement, if-then and if-else
statement, even if the body is a single statement. 
Likewise, a lexical block will be emitted for the first label of a labeled
statement. This block ends at the end of the current lexical scope, or when
a break, continue, goto or return statement is encountered at the same lexical
scope level. 
Consequently, any case in a switch statement that does not fall through to
the next case will have its own DWARF lexical block.
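
As an illustration of the rules just described, the comments in the following function mark where a DW_TAG_lexical_block would be expected under -fforce-dwarf-lexical-blocks. This is a reading of the description above, not verified compiler output, and classify is a hypothetical example function.

```c
/* Hypothetical example; the comments mark the blocks the proposal
   describes, they are not actual compiler output.  */
int
classify (int n)                /* function body: block */
{
  int total = 0;
  int i;

  for (i = 0; i < n; i++)
    total += i;                 /* single-statement loop body: block */

  if (total > 10)
    total = 10;                 /* if-then statement: block */
  else
    total += 1;                 /* if-else statement: block */

  switch (n)                    /* switch body: block */
    {
    case 0:                     /* first label: label block starts */
      total = -1;
      break;                    /* label block ends at the break */
    case 1:                     /* first label of the next group */
    case 2:
      total += 100;
      break;
    default:
      break;
    }
  return total;
}
```

None of these bodies declares a local variable, so without the flag only the function itself would normally get scope information.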

The complete change proposal contains 4 patches (attached first 3):
1. Add command line option -fforce-dwarf-lexical-blocks
2. Use of flag_force_dwarf_blocks
3. Create label scopes

A fourth patch, extending the proposed functionality to C++, will be submitted in 
a separate message.

Attached are the proposed ChangeLog additions, named according to the directory 
each one belongs to.

Best regards,
Andrei Herman
Mentor Graphics Corporation
Israel branch 



gcc_c_ChangeLog
Description: gcc_c_ChangeLog


gcc_c-family_ChangeLog
Description: gcc_c-family_ChangeLog


gcc_ChangeLog
Description: gcc_ChangeLog


0001-Add-command-line-option-fforce_dwarf_lexical_blocks.patch
Description: 0001-Add-command-line-option-fforce_dwarf_lexical_blocks.patch


0002-Use-flag_force_dwarf_blocks.patch
Description: 0002-Use-flag_force_dwarf_blocks.patch


0003-Create-label-scopes.patch
Description: 0003-Create-label-scopes.patch


Re: [PATCH GCC]Add 'force-dwarf-lexical-blocks' command line option

2014-05-07 Thread pinskia


 On May 7, 2014, at 2:32 AM, Herman, Andrei andrei_her...@codesourcery.com 
 wrote:
 
 
 Hi,
 
 Currently GCC only emits DWARF debug information (DW_TAG_lexical_block DIEs)
 for compound statements containing significant local declarations.
 However, code coverage tools that process the DWARF debug information to
 implement block/path coverage need more complete lexical block information. 
 
 This patch adds the necessary functionality under the control of a new 
 command line argument: -fforce-dwarf-lexical-blocks.
 
 When this flag is set, a DW_TAG_lexical_block DIE will be emitted for every
 function body, loop body, switch body, case statement, if-then and if-else
 statement, even if the body is a single statement. 
 Likewise, a lexical block will be emitted for the first label of a labeled
 statement. This block ends at the end of the current lexical scope, or when
 a break, continue, goto or return statement is encountered at the same lexical
 scope level. 
 Consequently, any case in a switch statement that does not flow through to 
 the next case, will have its own dwarf lexical block.
 
 The complete change proposal contains 4 patches (attached first 3):
 1. Add command line option -fforce-dwarf-lexical-blocks

Since this option is specific to the C front end, it should go into c.opt instead 
of common.opt, unless you are going to extend it to Ada, Java and Fortran. 

Thanks,
Andrew


 2. Use of flag_force_dwarf_blocks
 3. Create label scopes
 
 A forth patch, extending the proposed functionality to C++ will be submitted 
 in a separate message.
 
 Attached are the proposed ChangeLog additions, named according to the 
 directory each one belongs to.
 
 Best regards,
 Andrei Herman
 Mentor Graphics Corporation
 Israel branch 
 
 gcc_c_ChangeLog
 gcc_c-family_ChangeLog
 gcc_ChangeLog
 0001-Add-command-line-option-fforce_dwarf_lexical_blocks.patch
 0002-Use-flag_force_dwarf_blocks.patch
 0003-Create-label-scopes.patch


Re: we are starting the wide int merge

2014-05-07 Thread Christophe Lyon
On 7 May 2014 09:48, Andreas Schwab sch...@suse.de wrote:
> Christophe Lyon christophe.l...@linaro.org writes:
>
>> It also looks like the git-svn-id property is now wrong/incomplete.
>> For instance, commit 9a5942c1d4d9116ab74b0741cfe3894a89fd17fb has:
>> git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/wide-int@201706
>> 138bc75d-0d04-0410-961f-82ee72b054a4
>>
>> How does it map to the SVN commit in trunk?
>
> This is a commit on the wide-int branch (the one that created it).


I had a bug in my script while parsing the output of git log,
hopefully fixed now.


RE: [PATCH GCC]Add 'force-dwarf-lexical-blocks' command line option

2014-05-07 Thread Herman, Andrei
Thanks for the note.
I will make the needed changes and resubmit.

Regards,
Andrei Herman
Mentor Graphics Corporation
Israel branch 

 -Original Message-
 From: pins...@gmail.com [mailto:pins...@gmail.com]
 Sent: Wednesday, May 07, 2014 12:37 PM
 To: Herman, Andrei
 Cc: gcc-patches@gcc.gnu.org; herman_and...@mentor.com
 Subject: Re: [PATCH GCC]Add 'force-dwarf-lexical-blocks' command line
 option
 
 
 
  On May 7, 2014, at 2:32 AM, Herman, Andrei
 andrei_her...@codesourcery.com wrote:
 
 
  Hi,
 
  Currently GCC only emits DWARF debug information
 (DW_TAG_lexical_block
  DIEs) for compound statements containing significant local declarations.
  However, code coverage tools that process the DWARF debug information
  to implement block/path coverage need more complete lexical block
 information.
 
  This patch adds the necessary functionality under the control of a new
  command line argument: -fforce-dwarf-lexical-blocks.
 
  When this flag is set, a DW_TAG_lexical_block DIE will be emitted for
  every function body, loop body, switch body, case statement, if-then
  and if-else statement, even if the body is a single statement.
  Likewise, a lexical block will be emitted for the first label of a
  labeled statement. This block ends at the end of the current lexical
  scope, or when a break, continue, goto or return statement is
  encountered at the same lexical scope level.
  Consequently, any case in a switch statement that does not flow
  through to the next case, will have its own dwarf lexical block.
 
  The complete change proposal contains 4 patches (attached first 3):
  1. Add command line option -fforce-dwarf-lexical-blocks
 
 This option since it is specific to the c frontend should go into c.opt 
 instead
 of common.opt. Unless you are going to extend this to Ada, Java and
 fortran.
 
 Thanks,
 Andrew
 
 
  2. Use of flag_force_dwarf_blocks
  3. Create label scopes
 
  A forth patch, extending the proposed functionality to C++ will be
 submitted in a separate message.
 
  Attached are the proposed ChangeLog additions, named according to the
 directory each one belongs to.
 
  Best regards,
  Andrei Herman
  Mentor Graphics Corporation
  Israel branch
 
  gcc_c_ChangeLog
  gcc_c-family_ChangeLog
  gcc_ChangeLog
  0001-Add-command-line-option-fforce_dwarf_lexical_blocks.patch
  0002-Use-flag_force_dwarf_blocks.patch
  0003-Create-label-scopes.patch


[PATCH][1/n] Always-64bit HWI cleanups

2014-05-07 Thread Richard Biener

This removes the need_64bit_hwint logic and nothing else (well, it also
brings libcpp in line with gcc).

Bootstrap / regtest pending on x86_64-unknown-linux-gnu.

As promised, I am sending this before committing the "let's try this"
patch (which is now said to fix wide-int fallout).

Richard.

2014-05-07  Richard Biener  rguent...@suse.de

gcc/
* config.gcc: Remove need_64bit_hwint.
* configure.ac: Do not define NEED_64BIT_HOST_WIDE_INT.
* hwint.h: Do not check NEED_64BIT_HOST_WIDE_INT but assume
it to be true.
* config.in: Regenerate.
* configure: Likewise.

libcpp/
* configure.ac: Copy gcc logic of detecting a 64bit type.
Remove HOST_WIDE_INT define.
* include/cpplib.h: typedef cpp_num_part to a 64bit type,
similar to how hwint.h does it.
* config.in: Regenerate.
* configure: Likewise.
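
A rough sketch of what unconditionally picking a 64-bit type can look like is given below. It is an assumption for illustration only: hwi_sketch_t and hwi_sketch_bits are hypothetical names, and the real hwint.h and cpplib.h logic relies on configure-time checks rather than on limits.h.

```c
#include <limits.h>

/* Hypothetical selection: use long if it is already 64 bits wide,
   otherwise fall back to long long.  */
#if LONG_MAX >= 0x7fffffffffffffff
typedef long hwi_sketch_t;
#else
typedef long long hwi_sketch_t;
#endif

/* Width in bits of the selected type.  */
int
hwi_sketch_bits (void)
{
  return (int) (sizeof (hwi_sketch_t) * CHAR_BIT);
}
```

Either branch yields a type of at least 64 bits, so no per-target need_64bit_hwint setting is required.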

Index: trunk/gcc/config.gcc
===
*** trunk.orig/gcc/config.gcc   2014-04-30 10:16:58.491135331 +0200
--- trunk/gcc/config.gcc2014-04-30 10:24:43.902103288 +0200
***
*** 164,176 
  #  gasSet to yes or no depending on whether the target
  # system normally uses GNU as.
  #
- #  need_64bit_hwint   Set to yes if HOST_WIDE_INT must be 64 bits wide
- # for this target.  This is true if this target
- # supports long or wchar_t wider than 32 bits,
- # or BITS_PER_WORD is wider than 32 bits.
- # The setting made here must match the one made in
- # other locations such as libcpp/configure.ac
- #
  #  configure_default_options
  # Set to an initializer for configure_default_options
  # in configargs.h, based on --with-cpu et cetera.
--- 164,169 
*** gnu_ld=$gnu_ld_flag
*** 233,239 
  default_use_cxa_atexit=no
  default_gnu_indirect_function=no
  target_gtfiles=
- need_64bit_hwint=yes
  need_64bit_isa=
  native_system_header_dir=/usr/include
  target_type_format_char='@'
--- 226,231 
*** m32c*-*-*)
*** 310,323 
  ;;
  aarch64*-*-*)
cpu_type=aarch64
-   need_64bit_hwint=yes
extra_headers=arm_neon.h
extra_objs=aarch64-builtins.o aarch-common.o
target_has_targetm_common=yes
;;
  alpha*-*-*)
cpu_type=alpha
-   need_64bit_hwint=yes
extra_options=${extra_options} g.opt
;;
  am33_2.0-*-linux*)
--- 302,313 
*** arm*-*-*)
*** 333,339 
target_type_format_char='%'
c_target_objs=arm-c.o
cxx_target_objs=arm-c.o
-   need_64bit_hwint=yes
extra_options=${extra_options} arm/arm-tables.opt
;;
  avr-*-*)
--- 323,328 
*** i[34567]86-*-*)
*** 363,369 
cpu_type=i386
c_target_objs=i386-c.o
cxx_target_objs=i386-c.o
-   need_64bit_hwint=yes
extra_options=${extra_options} fused-madd.opt
extra_headers=cpuid.h mmintrin.h mm3dnow.h xmmintrin.h emmintrin.h
   pmmintrin.h tmmintrin.h ammintrin.h smmintrin.h
--- 352,357 
*** x86_64-*-*)
*** 393,403 
   adxintrin.h fxsrintrin.h xsaveintrin.h xsaveoptintrin.h
   avx512cdintrin.h avx512erintrin.h avx512pfintrin.h
   shaintrin.h
-   need_64bit_hwint=yes
;;
  ia64-*-*)
extra_headers=ia64intrin.h
-   need_64bit_hwint=yes
extra_options=${extra_options} g.opt fused-madd.opt
;;
  hppa*-*-*)
--- 381,389 
*** microblaze*-*-*)
*** 420,426 
  ;;
  mips*-*-*)
cpu_type=mips
-   need_64bit_hwint=yes
extra_headers=loongson.h
extra_options=${extra_options} g.opt mips/mips-tables.opt
;;
--- 406,411 
*** picochip-*-*)
*** 438,444 
  powerpc*-*-*)
cpu_type=rs6000
extra_headers=ppc-asm.h altivec.h spe.h ppu_intrinsics.h paired.h 
spu2vmx.h vec_types.h si2vmx.h htmintrin.h htmxlintrin.h
-   need_64bit_hwint=yes
case x$with_cpu in

xpowerpc64|xdefault64|x6[23]0|x970|xG5|xpower[345678]|xpower6x|xrs64a|xcell|xa2|xe500mc64|xe5500|Xe6500)
cpu_is_64bit=yes
--- 423,428 
*** powerpc*-*-*)
*** 447,453 
extra_options=${extra_options} g.opt fused-madd.opt 
rs6000/rs6000-tables.opt
;;
  rs6000*-*-*)
-   need_64bit_hwint=yes
extra_options=${extra_options} g.opt fused-madd.opt 
rs6000/rs6000-tables.opt
;;
  score*-*-*)
--- 431,436 
*** sparc*-*-*)
*** 459,480 
c_target_objs=sparc-c.o
cxx_target_objs=sparc-c.o
extra_headers=visintrin.h
-   need_64bit_hwint=yes
;;
  spu*-*-*)
cpu_type=spu
-   need_64bit_hwint=yes
;;
  s390*-*-*)

Re: [PATCH][RFC] Always require a 64bit HWI

2014-05-07 Thread Richard Biener
On Wed, 30 Apr 2014, Richard Biener wrote:

 On Tue, 29 Apr 2014, Jeff Law wrote:
 
  On 04/29/14 05:21, Richard Biener wrote:
   
   The following patch forces the availability of a 64bit HWI
   (without applying the cleanups that result from this).  I propose
   this exact patch for a short time to get those that are affected
   and do not want to be affected scream.
   
   But honestly I don't see any important host architecture that
   does not already require a 64bit HWI.
   
   Another concern is that the host compiler may not provide a
   64bit type.  I'm not sure that this is an issue nowadays
   (even though C++98 doesn't have 'long long', so it's maybe
   more an issue now with C++ than it was previously with
   requiring C89).  But given that it wasn't an issue for
   the existing 64bit HWI requiring host archs it shouldn't
   be an issue now.
   
   The benefit of this change is obviously the cleanup that
   can result from it - especially getting rid of code
   generation dependences on the host (!need_64bit_hwi
   doesn't mean we force a 32bit hwi).  As followup
   we can replace HOST_WIDE_INT and its friends with
   int64_t variants and appear less confusing to
   newcomers (and it's also less characters to type! yay!).
   
   We'd still retain HOST_WIDEST_FAST_INT, and as Kenny
   said elsewhere wide-int should internally operate on that,
   not on the eventually slow int64_t.  But that's a separate
   issue.
   
   So - any objections?
   
   Thanks,
   Richard.
   
   2014-04-29  Richard Biener  rguent...@suse.de
   
 libcpp/
 * configure.ac: Always set need_64bit_hwint to yes.
 * configure: Regenerated.
   
 * config.gcc: Always set need_64bit_hwint to yes.
  No objections.  The requirement for 64 bit HWINT traces its origins back to
  the MIPS R5900 target IIRC.  It's probably well past the time when we should
  just bite the bullet and make HWINT 64 bits across the board.
  
  If the host compiler doesn't support 64-bit HWINT, then it seems to me the
  host compiler can be used to bootstrap 4.9, which can then be used to
  bootstrap more modern GCCs.
  
  And like you I suspect it's really not going to be an issue in practice.
 
 I realized I forgot to copy gcc-patches, so done now (patch copied
 below again for reference).
 
 I propose to apply the patch after the wide-int merge for a short
 period of time and then followup with a patch to remove the
 need_64bit_hwint code (I'll make sure to send that out for review
 before applying this one).
 
 Testing coverage for non-64bit hwi configs is really low these
 days (I know of only 32bit hppa-*-* that is still built and
 tested semi-regularly - Dave, I suppose the host compiler
 has a 64bit long long type there, right?).

I have now applied the patch (as it is said to fix wide-int merge
fallout).  The plan is to go forward with cleanups that are
possible after this throughout stage1 (I sent the first cleanup
patch already, but further ones should wait until we released
4.9.1 to not make backports harder than necessary).

Richard.

 Thanks,
 Richard.
 
 2014-04-29  Richard Biener  rguent...@suse.de
 
   libcpp/
   * configure.ac: Always set need_64bit_hwint to yes.
   * configure: Regenerated.
 
   * config.gcc: Always set need_64bit_hwint to yes.
 
 Index: libcpp/configure.ac
 ===
 --- libcpp/configure.ac   (revision 209890)
 +++ libcpp/configure.ac   (working copy)
 @@ -200,7 +200,7 @@ case $target in
   tilegx*-*-* | tilepro*-*-* )
   need_64bit_hwint=yes ;;
   *)
 - need_64bit_hwint=no ;;
 + need_64bit_hwint=yes ;;
  esac
  
  case $need_64bit_hwint:$ac_cv_sizeof_long in
 Index: gcc/config.gcc
 ===
 --- gcc/config.gcc(revision 209890)
 +++ gcc/config.gcc(working copy)
 @@ -233,7 +233,7 @@ gnu_ld=$gnu_ld_flag
  default_use_cxa_atexit=no
  default_gnu_indirect_function=no
  target_gtfiles=
 -need_64bit_hwint=
 +need_64bit_hwint=yes
  need_64bit_isa=
  native_system_header_dir=/usr/include
  target_type_format_char='@'
 


Re: we are starting the wide int merge

2014-05-07 Thread Richard Sandiford
Jan-Benedict Glaw jbg...@lug-owl.de writes:
 On Tue, 2014-05-06 12:20:54 -0700, Mike Stump mikest...@comcast.net wrote:
 On May 6, 2014, at 8:19 AM, Kenneth Zadeck zad...@naturalbridge.com wrote:
  please hold off on committing patches for the next couple of hours
  as we have a very large merge to do.
  thanks.
 
 All done…  It is in.

 Just found one more:

 g++ -c   -g -O2 -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE  -fno-exceptions 
 -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing 
 -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual 
 -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings 
 -fno-common  -DHAVE_CONFIG_H -I. -I. -I/home/vaxbuild/repos/gcc/gcc 
 -I/home/vaxbuild/repos/gcc/gcc/. -I/home/vaxbuild/repos/gcc/gcc/../include 
 -I/home/vaxbuild/repos/gcc/gcc/../libcpp/include  
 -I/home/vaxbuild/repos/gcc/gcc/../libdecnumber 
 -I/home/vaxbuild/repos/gcc/gcc/../libdecnumber/dpd -I../libdecnumber 
 -I/home/vaxbuild/repos/gcc/gcc/../libbacktrace -o loop-iv.o -MT loop-iv.o 
 -MMD -MP -MF ./.deps/loop-iv.TPo /home/vaxbuild/repos/gcc/gcc/loop-iv.c
 In file included from /home/vaxbuild/repos/gcc/gcc/real.h:25:0,
  from /home/vaxbuild/repos/gcc/gcc/rtl.h:27,
  from /home/vaxbuild/repos/gcc/gcc/loop-iv.c:54:
 /home/vaxbuild/repos/gcc/gcc/wide-int.h: In instantiation of 
 ‘fixed_wide_int_storage<N>::fixed_wide_int_storage(const T&) [with T = long 
 long unsigned int; int N = 160]’:
 /home/vaxbuild/repos/gcc/gcc/wide-int.h:724:15:   required from 
 ‘generic_wide_int<T>::generic_wide_int(const T&) [with T = long long unsigned 
 int; storage = fixed_wide_int_storage<160>]’
 /home/vaxbuild/repos/gcc/gcc/loop-iv.c:2628:48:   required from here
 /home/vaxbuild/repos/gcc/gcc/wide-int.h:1172:45: error: incomplete type 
 ‘wi::int_traits<long long unsigned int>’ used in nested name specifier
WI_BINARY_RESULT (T, FIXED_WIDE_INT (N)) *assertion ATTRIBUTE_UNUSED;
  ^
 /home/vaxbuild/repos/gcc/gcc/wide-int.h:1173:47: error: incomplete type 
 ‘wi::int_traits<long long unsigned int>’ used in nested name specifier
wi::copy (*this, WIDE_INT_REF_FOR (T) (x, N));
^
 make[1]: *** [loop-iv.o] Error 1

Looks like this is specific to 32-bit HOST_WIDE_INTs.  The problem was
that loop-iv.c was using HOST_WIDEST_INT and no template specialisations
were defined for that.

Richard B's patch to force HOST_WIDE_INT to 64 bits will fix this.
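
The failure mode can be reproduced in miniature.  The sketch below is
not GCC's wide-int.h: demo::int_traits and demo::has_traits are
hypothetical names.  It assumes a traits template whose primary is only
declared, so any integer type without an explicit specialisation is an
incomplete type, which appears to be what the diagnostic above reports
for long long unsigned int.

```cpp
#include <cassert>
#include <type_traits>

namespace demo {

// Primary template is declared but never defined: naming it for a type
// without a specialisation yields an incomplete type.
template <typename T> struct int_traits;

// Specialisations exist only for 'long' and 'unsigned long' here, a
// hypothetical stand-in for a configuration with a 32-bit HOST_WIDE_INT.
template <> struct int_traits<long> {
  static const unsigned precision = sizeof(long) * 8;
};
template <> struct int_traits<unsigned long> {
  static const unsigned precision = sizeof(unsigned long) * 8;
};

// Detects whether int_traits<T> is a complete type.  sizeof on an
// incomplete type is a substitution failure, so the primary is chosen.
template <typename T, typename = void>
struct has_traits : std::false_type {};
template <typename T>
struct has_traits<T, decltype(void(sizeof(int_traits<T>)))>
  : std::true_type {};

}  // namespace demo
```

With these definitions, demo::has_traits<long> is true while
demo::has_traits<unsigned long long> is false; using
demo::int_traits<unsigned long long>::precision directly would fail to
compile with an "incomplete type" error of the same shape.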

Thanks,
Richard


Re: [AArch64] Fix integer vabs intrinsics

2014-05-07 Thread Richard Earnshaw
On 05/05/14 09:04, Richard Biener wrote:
 On Fri, May 2, 2014 at 12:39 PM, Richard Earnshaw rearn...@arm.com wrote:
 On 02/05/14 11:28, James Greenhalgh wrote:
 On Fri, May 02, 2014 at 10:29:06AM +0100, pins...@gmail.com wrote:


 On May 2, 2014, at 2:21 AM, James Greenhalgh james.greenha...@arm.com 
 wrote:

 On Fri, May 02, 2014 at 10:00:15AM +0100, Andrew Pinski wrote:
 On Fri, May 2, 2014 at 1:48 AM, James Greenhalgh
 james.greenha...@arm.com wrote:

 Hi,

 Unlike the mid-end's concept of an ABS_EXPR, which treats overflow as
 undefined/impossible, the neon intrinsics vabs intrinsics should behave 
 as
 the hardware. That is to say, the pseudo-code sequence:


 Only for signed integer types.  You should be able to use an unsigned
 integer type here instead.

 If anything, I think that puts us in a worse position.

 Not if you cast it back.


 The issue that
 inspires this patch is that GCC will happily fold:

  t1 = ABS_EXPR (x)
  t2 = GE_EXPR (t1, 0)

 to

  t2 = TRUE

 Surely an unsigned integer type is going to suffer the same fate? 
 Certainly I
 can imagine somewhere in the compiler there being a fold path for:

 Yes but if add a cast from the unsigned type to the signed type gcc does 
 not
 optimize that. If it does it is a bug since the overflow is defined there.

 I'm not sure I understand, are you saying I want to fold to:

   t1 = VIEW_CONVERT_EXPR (x, unsigned)
   t2 = ABS_EXPR (t1)
   t3 = VIEW_CONVERT_EXPR (t2, signed)

 Surely ABS_EXPR (unsigned) is a nop, and the two VIEW_CONVERTs cancel each
 other out leading to an overall NOP? It might just be Friday morning and a
 lack of coffee talking, but I think I need you to spell this one out to
 me in big letters!


 I agree.  I think what you need is a type widening so that you get

 t1 = VEC_WIDEN (x)
 t2 = ABS_EXPR (t1)
 t3 = VEC_NARROW (t2)

 This then guarantees that the ABS expression cannot be undefined.  I'm
 less sure, however about the narrow causing a change in 'sign'.  Has it
 just punted the problem?  Maybe you need
 
 Another option is to allow ABS_EXPR to have a TYPE_UNSIGNED
 result type, thus do abs(int) -> unsigned (what we have as absu_hwi).
 That is, have an ABS_EXPR that doesn't have the undefined issue
 (at expense of optimization in case the result is immediately casted
 back to signed)
 

Yes, that would make more sense, and is, in effect, what the ARM VABS
instruction is doing (producing an unsigned result with no undefined
behaviour).

I'm not sure I understand your 'at expense of optimization' comment,
though.  Surely a cast back to signed is essentially a no-op, since
there's no representational change in the value (at least, not on 2's
complement machines)?
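
As a scalar illustration of the unsigned-result abs discussed here, a
minimal sketch follows.  The names absu and abs_wrapping are
illustrative only and are not GCC API; the sketch assumes a two's
complement target, and the final cast back to int is
implementation-defined before C++20.

```cpp
#include <cassert>
#include <climits>

// Absolute value with an unsigned result.  The negation is done in
// unsigned arithmetic, so absu(INT_MIN) is well defined.
inline unsigned absu(int x) {
  return x >= 0 ? static_cast<unsigned>(x)
                : 0u - static_cast<unsigned>(x);
}

// Casting back to int keeps the bit pattern on two's complement targets
// (implementation-defined before C++20, modular from C++20 on).
inline int abs_wrapping(int x) { return static_cast<int>(absu(x)); }
```

Under those assumptions absu(INT_MIN) is INT_MAX + 1 as an unsigned
value, and abs_wrapping(INT_MIN) is INT_MIN again, matching the
abs (-128) == -128 behaviour described for the hardware.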


 Richard.
 

 t1 = VEC_WIDEN (x)
 t2 = ABS_EXPR (t1)
 t3 = VIEW_CONVERT_EXPR (x, unsigned)
 t4 = VEC_NARROW (t3)
 t5 = VIEW_CONVERT_EXPR (t4, signed)

 !!!

 How you capture this into RTL during expand, though, is another thing.

 R.


  (unsigned >= 0) == TRUE


  a = vabs_s8 (vdup_n_s8 (-128));
  assert (a >= 0);

 does not hold. As in hardware

  abs (-128) == -128

 Folding vabs intrinsics to an ABS_EXPR is thus a mistake, and we should 
 avoid
 it. In fact, we have to be even more careful than that, and keep the 
 integer
 vabs intrinsics as an unspec in the back end.

 No it is not.  The mistake is to use signed integer types here.  Just
 add a conversion to an unsigned integer vector and it will work
 correctly.
 In fact the ABS rtl code is not undefined for the overflow.

 Here we are covering ourselves against a separate issue. For 
 auto-vectorized
 code we want the SABD combine patterns to kick in whenever sensible. For
 intrinsics code, in the case where vsub_s8 (x, y) would cause an 
 underflow:

  vabs_s8 (vsub_s8 (x, y)) != vabd_s8 (x, y)

 So in this case, the combine would be erroneous. Likewise SABA.
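
A scalar model makes the mismatch concrete.  This is only a sketch: the
helper names are hypothetical, each lane is modelled as an int wrapped
to 8 bits, and abd_s8 assumes the absolute difference is formed at full
precision and then truncated to the lane width.

```cpp
#include <cassert>

// Wrap an int to the value an 8-bit two's complement lane would hold.
inline int wrap8(int v) {
  unsigned u = static_cast<unsigned>(v) & 0xFFu;
  return u >= 128u ? static_cast<int>(u) - 256 : static_cast<int>(u);
}

// Models one lane of vsub_s8: wrapping subtraction.
inline int sub_s8(int x, int y) { return wrap8(x - y); }

// Models one lane of vabs_s8: wrapping absolute value.
inline int abs_s8(int x) { return wrap8(x < 0 ? -x : x); }

// Models one lane of vabd_s8 (assumption: |x - y| is computed without
// intermediate wrapping, then truncated to 8 bits).
inline int abd_s8(int x, int y) {
  int d = x - y;
  return wrap8(d < 0 ? -d : d);
}
```

For x = 127 and y = -128 the subtraction wraps to -1, so the abs of the
difference is 1, while the modelled abd yields -1.  For inputs where
the subtraction does not wrap, such as 5 and 9, both give 4.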

 This sounds like it would be problematic for unsigned types and not just for
 vabs_s8 with vsub_s8. So I think you should be using unspec for vabd_s8
 instead. Since in rtl overflow and underflow is defined to be wrapping.

 There are no vabs_u8/vabd_u8 so I don't see how we can reach this point
 with unsigned types. Further, I have never thought of RTL having signed
 and unsigned types, just a bag of bits. We'll want to use unspec for the
 intrinsic version of vabd_s8 - but we'll want to specify the

   (abs (minus (reg) (reg)))

 behaviour so that auto-vectorized code can pick it up.

 So in the end we'll have these patterns:

   (abs
 (abs (reg)))

   (intrinsic_abs
 (unspec [(reg)] UNSPEC_ABS))

   (abd
 (abs (minus (reg) (reg))))

   (intrinsic_abd
 (unspec [(reg) (reg)] UNSPEC_ABD))

   (aba
 (plus (abs (minus (reg) (reg))) (reg)))

   (intrinsic_aba
 (plus (unspec [(reg) (reg)] UNSPEC_ABD) (reg)))

 which should give us reasonable auto-vectorized code without triggering any
 of the issues mapping the semantics of the instructions to intrinsics.

 Thanks,
 James


 Thanks,
 Andrew Pinski


 Thanks,
 James




 




Re: [AArch64] Fix integer vabs intrinsics

2014-05-07 Thread Richard Biener
On Wed, May 7, 2014 at 12:30 PM, Richard Earnshaw rearn...@arm.com wrote:
 On 05/05/14 09:04, Richard Biener wrote:
 On Fri, May 2, 2014 at 12:39 PM, Richard Earnshaw rearn...@arm.com wrote:
 On 02/05/14 11:28, James Greenhalgh wrote:
 On Fri, May 02, 2014 at 10:29:06AM +0100, pins...@gmail.com wrote:


 On May 2, 2014, at 2:21 AM, James Greenhalgh james.greenha...@arm.com 
 wrote:

 On Fri, May 02, 2014 at 10:00:15AM +0100, Andrew Pinski wrote:
 On Fri, May 2, 2014 at 1:48 AM, James Greenhalgh
 james.greenha...@arm.com wrote:

 Hi,

 Unlike the mid-end's concept of an ABS_EXPR, which treats overflow as
 undefined/impossible, the neon intrinsics vabs intrinsics should 
 behave as
 the hardware. That is to say, the pseudo-code sequence:


 Only for signed integer types.  You should be able to use an unsigned
 integer type here instead.

 If anything, I think that puts us in a worse position.

 Not if you cast it back.


 The issue that
 inspires this patch is that GCC will happily fold:

  t1 = ABS_EXPR (x)
  t2 = GE_EXPR (t1, 0)

 to

  t2 = TRUE

 Surely an unsigned integer type is going to suffer the same fate? 
 Certainly I
 can imagine somewhere in the compiler there being a fold path for:

 Yes but if add a cast from the unsigned type to the signed type gcc does 
 not
 optimize that. If it does it is a bug since the overflow is defined there.

 I'm not sure I understand, are you saying I want to fold to:

   t1 = VIEW_CONVERT_EXPR (x, unsigned)
   t2 = ABS_EXPR (t1)
   t3 = VIEW_CONVERT_EXPR (t2, signed)

 Surely ABS_EXPR (unsigned) is a nop, and the two VIEW_CONVERTs cancel each
 other out leading to an overall NOP? It might just be Friday morning and a
 lack of coffee talking, but I think I need you to spell this one out to
 me in big letters!


 I agree.  I think what you need is a type widening so that you get

 t1 = VEC_WIDEN (x)
 t2 = ABS_EXPR (t1)
 t3 = VEC_NARROW (t2)

 This then guarantees that the ABS expression cannot be undefined.  I'm
 less sure, however about the narrow causing a change in 'sign'.  Has it
 just punted the problem?  Maybe you need

 Another option is to allow ABS_EXPR to have a TYPE_UNSIGNED
 result type, thus do abs(int) -> unsigned (what we have as absu_hwi).
 That is, have an ABS_EXPR that doesn't have the undefined issue
 (at expense of optimization in case the result is immediately casted
 back to signed)


 Yes, that would make more sense, and is, in effect, what the ARM VABS
 instruction is doing (producing an unsigned result with no undefined
 behaviour).

 I'm not sure I understand your 'at expense of optimization' comment,
 though.  Surely a cast back to signed is essentially a no-op, since
 there's no representational change in the value (at least, not on 2's
 complement machines)?

We can't derive a value range of [0, INT_MAX] for the (int)ABSU_EXPR.

Richard.


 Richard.


 t1 = VEC_WIDEN (x)
 t2 = ABS_EXPR (t1)
 t3 = VIEW_CONVERT_EXPR (x, unsigned)
 t4 = VEC_NARROW (t3)
 t5 = VIEW_CONVERT_EXPR (t4, signed)

 !!!

 How you capture this into RTL during expand, though, is another thing.

 R.


  (unsigned >= 0) == TRUE


  a = vabs_s8 (vdup_n_s8 (-128));
  assert (a >= 0);

 does not hold. As in hardware

  abs (-128) == -128

 Folding vabs intrinsics to an ABS_EXPR is thus a mistake, and we 
 should avoid
 it. In fact, we have to be even more careful than that, and keep the 
 integer
 vabs intrinsics as an unspec in the back end.

 No it is not.  The mistake is to use signed integer types here.  Just
 add a conversion to an unsigned integer vector and it will work
 correctly.
 In fact the ABS rtl code is not undefined for the overflow.

 Here we are covering ourselves against a separate issue. For 
 auto-vectorized
 code we want the SABD combine patterns to kick in whenever sensible. For
 intrinsics code, in the case where vsub_s8 (x, y) would cause an 
 underflow:

  vabs_s8 (vsub_s8 (x, y)) != vabd_s8 (x, y)

 So in this case, the combine would be erroneous. Likewise SABA.

 This sounds like it would be problematic for unsigned types and not just for
 vabs_s8 with vsub_s8. So I think you should be using unspec for vabd_s8
 instead. Since in rtl overflow and underflow is defined to be wrapping.

 There are no vabs_u8/vabd_u8 so I don't see how we can reach this point
 with unsigned types. Further, I have never thought of RTL having signed
 and unsigned types, just a bag of bits. We'll want to use unspec for the
 intrinsic version of vabd_s8 - but we'll want to specify the

   (abs (minus (reg) (reg)))

 behaviour so that auto-vectorized code can pick it up.

 So in the end we'll have these patterns:

   (abs
 (abs (reg)))

   (intrinsic_abs
 (unspec [(reg)] UNSPEC_ABS))

   (abd
 (abs (minus (reg) (reg))))

   (intrinsic_abd
 (unspec [(reg) (reg)] UNSPEC_ABD))

   (aba
 (plus (abs (minus (reg) (reg))) (reg)))

   (intrinsic_aba
 (plus (unspec [(reg) (reg)] UNSPEC_ABD) (reg)))

 which should give us reasonable auto-vectorized code without 

Re: [PATCH][RFC] Remove RTL loop unswitching

2014-05-07 Thread Thomas Schwinge
Hi!

On Tue, 15 Apr 2014 11:26:29 +0200 (CEST), Richard Biener rguent...@suse.de 
wrote:
 This removes RTL loop unswitching

 2014-04-15  Richard Biener  rguent...@suse.de
 
   * Makefile.in (OBJS): Remove loop-unswitch.o.
   * loop-unswitch.c: Delete.
   * tree-pass.h (make_pass_rtl_unswitch): Remove.
   * passes.def (pass_rtl_unswitch): Likewise.
   * loop-init.c (gate_rtl_unswitch): Likewise.
   (rtl_unswitch): Likewise.
   (pass_data_rtl_unswitch): Likewise.
   (pass_rtl_unswitch): Likewise.
   (make_pass_rtl_unswitch): Likewise.
   * rtl.h (reversed_condition): Likewise.
   (compare_and_jump_seq): Likewise.
   * loop-iv.c (reversed_condition): Move here from loop-unswitch.c
   and make static.
   * loop-unroll.c (compare_and_jump_seq): Likewise.

After checking with Richard on IRC, I applied the following in r210150:

commit 81283dac62a91d2fbdf154fe51e9f84e0b1db816
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Wed May 7 10:31:26 2014 +

Really delete gcc/loop-unswitch.c.

gcc/
* loop-unswitch.c: Delete.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@210150 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git gcc/ChangeLog gcc/ChangeLog
index d5e6a0a..e5033a0 100644
--- gcc/ChangeLog
+++ gcc/ChangeLog
@@ -1,3 +1,7 @@
+2014-05-07  Thomas Schwinge  tho...@codesourcery.com
+
+   * loop-unswitch.c: Delete.
+
 2014-05-07  Richard Biener  rguent...@suse.de
 
* config.gcc: Always set need_64bit_hwint to yes.
@@ -2294,7 +2298,6 @@
 2014-04-23  Richard Biener  rguent...@suse.de
 
* Makefile.in (OBJS): Remove loop-unswitch.o.
-   * loop-unswitch.c: Delete.
* tree-pass.h (make_pass_rtl_unswitch): Remove.
* passes.def (pass_rtl_unswitch): Likewise.
* loop-init.c (gate_rtl_unswitch): Likewise.
diff --git gcc/loop-unswitch.c gcc/loop-unswitch.c
deleted file mode 100644
index fff0fd1..000


Grüße,
 Thomas




Re: [AArch64] Fix integer vabs intrinsics

2014-05-07 Thread Richard Earnshaw
On 07/05/14 11:32, Richard Biener wrote:
 On Wed, May 7, 2014 at 12:30 PM, Richard Earnshaw rearn...@arm.com wrote:
 On 05/05/14 09:04, Richard Biener wrote:
 On Fri, May 2, 2014 at 12:39 PM, Richard Earnshaw rearn...@arm.com wrote:
 On 02/05/14 11:28, James Greenhalgh wrote:
 On Fri, May 02, 2014 at 10:29:06AM +0100, pins...@gmail.com wrote:


 On May 2, 2014, at 2:21 AM, James Greenhalgh james.greenha...@arm.com 
 wrote:

 On Fri, May 02, 2014 at 10:00:15AM +0100, Andrew Pinski wrote:
 On Fri, May 2, 2014 at 1:48 AM, James Greenhalgh
 james.greenha...@arm.com wrote:

 Hi,

 Unlike the mid-end's concept of an ABS_EXPR, which treats overflow as
 undefined/impossible, the neon intrinsics vabs intrinsics should 
 behave as
 the hardware. That is to say, the pseudo-code sequence:


 Only for signed integer types.  You should be able to use an unsigned
 integer type here instead.

 If anything, I think that puts us in a worse position.

 Not if you cast it back.


 The issue that
 inspires this patch is that GCC will happily fold:

  t1 = ABS_EXPR (x)
  t2 = GE_EXPR (t1, 0)

 to

  t2 = TRUE

 Surely an unsigned integer type is going to suffer the same fate? 
 Certainly I
 can imagine somewhere in the compiler there being a fold path for:

 Yes but if add a cast from the unsigned type to the signed type gcc does 
 not
 optimize that. If it does it is a bug since the overflow is defined 
 there.

 I'm not sure I understand, are you saying I want to fold to:

   t1 = VIEW_CONVERT_EXPR (x, unsigned)
   t2 = ABS_EXPR (t1)
   t3 = VIEW_CONVERT_EXPR (t2, signed)

 Surely ABS_EXPR (unsigned) is a nop, and the two VIEW_CONVERTs cancel each
 other out leading to an overall NOP? It might just be Friday morning and a
 lack of coffee talking, but I think I need you to spell this one out to
 me in big letters!


 I agree.  I think what you need is a type widening so that you get

 t1 = VEC_WIDEN (x)
 t2 = ABS_EXPR (t1)
 t3 = VEC_NARROW (t2)

 This then guarantees that the ABS expression cannot be undefined.  I'm
 less sure, however about the narrow causing a change in 'sign'.  Has it
 just punted the problem?  Maybe you need

 Another option is to allow ABS_EXPR to have a TYPE_UNSIGNED
 result type, thus do abs(int) -> unsigned (what we have as absu_hwi).
 That is, have an ABS_EXPR that doesn't have the undefined issue
 (at expense of optimization in case the result is immediately casted
 back to signed)


 Yes, that would make more sense, and is, in effect, what the ARM VABS
 instruction is doing (producing an unsigned result with no undefined
 behaviour).

 I'm not sure I understand your 'at expense of optimization' comment,
 though.  Surely a cast back to signed is essentially a no-op, since
 there's no representational change in the value (at least, not on 2's
 complement machines)?
 
 We can't derive a value range of [0, INT_MAX] for the (int)ABSU_EXPR.
 

Unless you're assuming that ABS_EXPR(INT_MIN) will always trap, then if
you can derive it for ABS_EXPR (which really returns [0,
INT_MAX]+UNSPECIFIED), I don't really see why you can't derive it for
(int)ABSU_EXPR, which returns [0, INT_MAX]+INT_MIN, since the latter is
a subset of the former.

R.

 Richard.
 

 Richard.


 t1 = VEC_WIDEN (x)
 t2 = ABS_EXPR (t1)
 t3 = VIEW_CONVERT_EXPR (x, unsigned)
 t4 = VEC_NARROW (t3)
 t5 = VIEW_CONVERT_EXPR (t4, signed)

 !!!

 How you capture this into RTL during expand, though, is another thing.

 R.


  (unsigned >= 0) == TRUE


  a = vabs_s8 (vdup_n_s8 (-128));
  assert (a >= 0);

 does not hold. As in hardware

  abs (-128) == -128

 Folding vabs intrinsics to an ABS_EXPR is thus a mistake, and we 
 should avoid
 it. In fact, we have to be even more careful than that, and keep the 
 integer
 vabs intrinsics as an unspec in the back end.

 No it is not.  The mistake is to use signed integer types here.  Just
 add a conversion to an unsigned integer vector and it will work
 correctly.
 In fact the ABS rtl code is not undefined for the overflow.

 Here we are covering ourselves against a separate issue. For 
 auto-vectorized
 code we want the SABD combine patterns to kick in whenever sensible. For
 intrinsics code, in the case where vsub_s8 (x, y) would cause an 
 underflow:

  vabs_s8 (vsub_s8 (x, y)) != vabd_s8 (x, y)

 So in this case, the combine would be erroneous. Likewise SABA.

 This sounds like it would be problematic for unsigned types and not just 
 for
 vabs_s8 with vsub_s8. So I think you should be using unspec for vabd_s8
 instead. Since in rtl overflow and underflow is defined to be wrapping.

 There are no vabs_u8/vabd_u8 so I don't see how we can reach this point
 with unsigned types. Further, I have never thought of RTL having signed
 and unsigned types, just a bag of bits. We'll want to use unspec for the
 intrinsic version of vabd_s8 - but we'll want to specify the

   (abs (minus (reg) (reg)))

 behaviour so that auto-vectorized code can pick it up.

 So in the end we'll have these patterns:

   (abs
 (abs 

[patch] libstdc++/61086 - fix ubsan errors in std::vector

2014-05-07 Thread Jonathan Wakely

The testcase in the PR calls __position._M_const_cast() to get a
mutable iterator and that dereferences the pointer as suggested in
http://gcc.gnu.org/ml/libstdc++/2013-05/msg00031.html
That's invalid because the pointer is not dereferenceable (in this
case it's null but is past-the-end at all times).

I played around with changing the __normal_iterator so we would do
__position._M_const_cast(begin()) then decided we don't need it at
all and can just as easily obtain a mutable iterator using:

  auto __pos = begin() + (__position - cbegin());
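
Outside the library the same idea can be sketched with public names
only.  The helper to_mutable below is hypothetical and not part of
libstdc++; it rebuilds the iterator from an offset and so never
dereferences the position, which also makes it safe for cend() and for
an empty vector.

```cpp
#include <cassert>
#include <vector>

// Rebuild a mutable iterator from a const_iterator by offset from the
// start of the container, never by dereferencing the position.
template <typename T>
typename std::vector<T>::iterator
to_mutable(std::vector<T>& v, typename std::vector<T>::const_iterator pos) {
  return v.begin() + (pos - v.cbegin());
}
```

The second parameter is a non-deduced context, so T is deduced from the
vector argument alone and to_mutable(v, v.cend()) works without
explicit template arguments.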

I plan to commit the attached patch to trunk and 4.9 soon. I've tested
it on x86_64-linux but not added a testcase because we don't test with
-fsanitize (though we should do) and it only shows up with Clang
anyway.
commit 566623def309c70387e41da2346ff89aa7619b13
Author: Jonathan Wakely jwak...@redhat.com
Date:   Wed May 7 12:17:41 2014 +0100

	PR libstdc++/61086
	* include/bits/stl_iterator.h (__normal_iterator::_M_const_cast):
	Remove.
	* include/bits/stl_vector.h (vector::insert, vector::erase): Use
	arithmetic to obtain a mutable iterator from const_iterator.
	* include/bits/vector.tcc (vector::insert): Likewise.
	* include/debug/vector (vector::erase): Likewise.
	* testsuite/23_containers/vector/requirements/dr438/assign_neg.cc:
	Adjust dg-error line number.
	* testsuite/23_containers/vector/requirements/dr438/
	constructor_1_neg.cc: Likewise.
	* testsuite/23_containers/vector/requirements/dr438/
	constructor_2_neg.cc: Likewise.
	* testsuite/23_containers/vector/requirements/dr438/insert_neg.cc:
	Likewise.

diff --git a/libstdc++-v3/include/bits/stl_iterator.h b/libstdc++-v3/include/bits/stl_iterator.h
index 16f992c..f4522a4 100644
--- a/libstdc++-v3/include/bits/stl_iterator.h
+++ b/libstdc++-v3/include/bits/stl_iterator.h
@@ -736,21 +736,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 		  _Container::__type __i) _GLIBCXX_NOEXCEPT
 : _M_current(__i.base()) { }
 
-#if __cplusplus >= 201103L
-  __normal_iterator<typename _Container::pointer, _Container>
-  _M_const_cast() const noexcept
-  {
-	using _PTraits = std::pointer_traits<typename _Container::pointer>;
-	return __normal_iterator<typename _Container::pointer, _Container>
-	  (_PTraits::pointer_to(const_cast<typename _PTraits::element_type&>
-(*_M_current)));
-  }
-#else
-  __normal_iterator
-  _M_const_cast() const
-  { return *this; }
-#endif
-
   // Forward iterator requirements
   reference
   operator*() const _GLIBCXX_NOEXCEPT
diff --git a/libstdc++-v3/include/bits/stl_vector.h b/libstdc++-v3/include/bits/stl_vector.h
index 3d3a2cf..0a56c65 100644
--- a/libstdc++-v3/include/bits/stl_vector.h
+++ b/libstdc++-v3/include/bits/stl_vector.h
@@ -1051,7 +1051,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   insert(const_iterator __position, size_type __n, const value_type& __x)
   {
 	difference_type __offset = __position - cbegin();
-	_M_fill_insert(__position._M_const_cast(), __n, __x);
+	_M_fill_insert(begin() + __offset, __n, __x);
 	return begin() + __offset;
   }
 #else
@@ -1096,7 +1096,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 	   _InputIterator __last)
 {
 	  difference_type __offset = __position - cbegin();
-	  _M_insert_dispatch(__position._M_const_cast(),
+	  _M_insert_dispatch(begin() + __offset,
 			 __first, __last, __false_type());
 	  return begin() + __offset;
 	}
@@ -1144,10 +1144,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   iterator
 #if __cplusplus >= 201103L
   erase(const_iterator __position)
+  { return _M_erase(begin() + (__position - cbegin())); }
 #else
   erase(iterator __position)
+  { return _M_erase(__position); }
 #endif
-  { return _M_erase(__position._M_const_cast()); }
 
   /**
*  @brief  Remove a range of elements.
@@ -1170,10 +1171,15 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   iterator
 #if __cplusplus >= 201103L
   erase(const_iterator __first, const_iterator __last)
+  {
+	const auto __beg = begin();
+	const auto __cbeg = cbegin();
+	return _M_erase(__beg + (__first - __cbeg), __beg + (__last - __cbeg));
+  }
 #else
   erase(iterator __first, iterator __last)
+  { return _M_erase(__first, __last); }
 #endif
-  { return _M_erase(__first._M_const_cast(), __last._M_const_cast()); }
 
   /**
*  @brief  Swaps data with another %vector.
diff --git a/libstdc++-v3/include/bits/vector.tcc b/libstdc++-v3/include/bits/vector.tcc
index 299e614..5c3dfae 100644
--- a/libstdc++-v3/include/bits/vector.tcc
+++ b/libstdc++-v3/include/bits/vector.tcc
@@ -121,14 +121,17 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   else
 	{
 #if __cplusplus >= 201103L
+	  const auto __pos = begin() + (__position - cbegin());
 	  if (this->_M_impl._M_finish != this->_M_impl._M_end_of_storage)
 	{
 	  _Tp __x_copy = __x;
-	  _M_insert_aux(__position._M_const_cast(), std::move(__x_copy));
+	  

Re: [patch] libstdc++/61086 - fix ubsan errors in std::vector

2014-05-07 Thread Paolo Carlini


On 05/07/2014 02:07 PM, Jonathan Wakely wrote:

The testcase in the PR calls __position._M_const_cast() to get a
mutable iterator and that dereferences the pointer as suggested in
http://gcc.gnu.org/ml/libstdc++/2013-05/msg00031.html
That's invalid because the pointer is not dereferenceable (in this
case it's null but is past-the-end at all times).
Uhmm, I see, at the time I scratched my head a bit. Nice that we can 
avoid the whole thing. Are we sure we don't have something similar 
elsewhere?


Paolo.


Re: [patch] libstdc++/61086 - fix ubsan errors in std::vector

2014-05-07 Thread Jonathan Wakely

On 07/05/14 14:21 +0200, Paolo Carlini wrote:


On 05/07/2014 02:07 PM, Jonathan Wakely wrote:

The testcase in the PR calls __position._M_const_cast() to get a
mutable iterator and that dereferences the pointer as suggested in
http://gcc.gnu.org/ml/libstdc++/2013-05/msg00031.html
That's invalid because the pointer is not dereferenceable (in this
case it's null but is past-the-end at all times).
Uhmm, I see, at the time I scratched my head a bit. Nice that we can 
avoid the whole thing. Are we sure we don't have something similar 
elsewhere?


Yes, I checked. deque::const_iterator, list::const_iterator,
vectorbool::const_iterator and the _Rb_tree_const_iterator types all
have _M_const_cast but they do not dereference anything.

It only really affected std::vector because that's the only one of our
containers that correctly supports custom pointer types (when my fixes
for PR57272 are ready I'll need to deal with the issue again and will
be careful about dereferencing).


Re: [PATCH][RFC] Remove RTL loop unswitching

2014-05-07 Thread Thomas Schwinge
Hi!

On Tue, 15 Apr 2014 11:26:29 +0200 (CEST), Richard Biener rguent...@suse.de 
wrote:
 This removes RTL loop unswitching

 2014-04-15  Richard Biener  rguent...@suse.de
 
   * Makefile.in (OBJS): Remove loop-unswitch.o.
   * loop-unswitch.c: Delete.
   * tree-pass.h (make_pass_rtl_unswitch): Remove.
   * passes.def (pass_rtl_unswitch): Likewise.
   * loop-init.c (gate_rtl_unswitch): Likewise.
   (rtl_unswitch): Likewise.
   (pass_data_rtl_unswitch): Likewise.
   (pass_rtl_unswitch): Likewise.
   (make_pass_rtl_unswitch): Likewise.
   * rtl.h (reversed_condition): Likewise.
   (compare_and_jump_seq): Likewise.
   * loop-iv.c (reversed_condition): Move here from loop-unswitch.c
   and make static.
   * loop-unroll.c (compare_and_jump_seq): Likewise.

I found some more; OK to commit?  Is a non-bootstrap build enough for
this, or is a full bootstrap build and test needed?

commit 8a703b1e7adc6001f665a12f93601382e3eea806
Author: Thomas Schwinge tho...@codesourcery.com
Date:   Wed May 7 13:01:47 2014 +0200

More gcc/loop-unswitch.c cleanup.

gcc/
* cfgloop.h (unswitch_loops): Remove.
* doc/passes.texi: Remove references to loop-unswitch.c
* timevar.def (TV_LOOP_UNSWITCH): Remove.

diff --git gcc/cfgloop.h gcc/cfgloop.h
index ab8b809..62a656a 100644
--- gcc/cfgloop.h
+++ gcc/cfgloop.h
@@ -711,8 +711,6 @@ extern void loop_optimizer_init (unsigned);
 extern void loop_optimizer_finalize (void);
 
 /* Optimization passes.  */
-extern void unswitch_loops (void);
-
 enum
 {
   UAP_PEEL = 1,/* Enables loop peeling.  */
diff --git gcc/doc/passes.texi gcc/doc/passes.texi
index 2727b2c..fb064db 100644
--- gcc/doc/passes.texi
+++ gcc/doc/passes.texi
@@ -474,10 +474,7 @@ merging and induction variable elimination.  The pass is 
implemented in
 Loop unswitching.  This pass moves the conditional jumps that are invariant
 out of the loops.  To achieve this, a duplicate of the loop is created for
 each possible outcome of conditional jump(s).  The pass is implemented in
-@file{tree-ssa-loop-unswitch.c}.  This pass should eventually replace the
-RTL level loop unswitching in @file{loop-unswitch.c}, but currently
-the RTL level pass is not completely redundant yet due to deficiencies
-in tree level alias analysis.
+@file{tree-ssa-loop-unswitch.c}.
 
 The optimizations also use various utility functions contained in
 @file{tree-ssa-loop-manip.c}, @file{cfgloop.c}, @file{cfgloopanal.c} and
@@ -793,8 +790,8 @@ The source files @file{cfgloopanal.c} and 
@file{cfgloopmanip.c} contain
 generic loop analysis and manipulation code.  Initialization and finalization
 of loop structures is handled by @file{loop-init.c}.
 A loop invariant motion pass is implemented in @file{loop-invariant.c}.
-Basic block level optimizations---unrolling, peeling and unswitching loops---
-are implemented in @file{loop-unswitch.c} and @file{loop-unroll.c}.
+Basic block level optimizations---unrolling, and peeling loops---
+are implemented in @file{loop-unroll.c}.
 Replacing of the exit condition of loops by special machine-dependent
 instructions is handled by @file{loop-doloop.c}.
 
diff --git gcc/timevar.def gcc/timevar.def
index 9faf98b..2db1943 100644
--- gcc/timevar.def
+++ gcc/timevar.def
@@ -207,7 +207,6 @@ DEFTIMEVAR (TV_DSE2                  , "dead store elim2")
 DEFTIMEVAR (TV_LOOP                  , "loop analysis")
 DEFTIMEVAR (TV_LOOP_INIT             , "loop init")
 DEFTIMEVAR (TV_LOOP_MOVE_INVARIANTS  , "loop invariant motion")
-DEFTIMEVAR (TV_LOOP_UNSWITCH         , "loop unswitching")
 DEFTIMEVAR (TV_LOOP_UNROLL           , "loop unrolling")
 DEFTIMEVAR (TV_LOOP_DOLOOP           , "loop doloop")
 DEFTIMEVAR (TV_LOOP_FINI             , "loop fini")


Grüße,
 Thomas
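
For readers who have not met the transformation described in the passes.texi
text touched by the patch above, here is a hand-written, source-level sketch
of loop unswitching. The function names are invented for illustration and
this is not GCC output; it only shows an invariant branch being moved out of
the loop, with the loop duplicated once per outcome of the condition.

```cpp
#include <cstddef>
#include <vector>

// Original form: the test on `negate` is loop-invariant, yet it is
// evaluated on every iteration.
long sum_original(const std::vector<int>& v, bool negate)
{
  long total = 0;
  for (std::size_t i = 0; i < v.size(); ++i)
    {
      if (negate)
        total -= v[i];
      else
        total += v[i];
    }
  return total;
}

// Unswitched form: the invariant test is hoisted out of the loop and the
// loop is duplicated, one copy for each outcome of the condition.
long sum_unswitched(const std::vector<int>& v, bool negate)
{
  long total = 0;
  if (negate)
    for (std::size_t i = 0; i < v.size(); ++i)
      total -= v[i];
  else
    for (std::size_t i = 0; i < v.size(); ++i)
      total += v[i];
  return total;
}
```

Both functions compute the same result; the second trades code size for a
branch-free loop body, which is the trade-off the pass makes.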




Re: [PATCH][RFC] Remove RTL loop unswitching

2014-05-07 Thread Richard Biener
On Wed, 7 May 2014, Thomas Schwinge wrote:

> Hi!
>
> On Tue, 15 Apr 2014 11:26:29 +0200 (CEST), Richard Biener <rguent...@suse.de> wrote:
> > This removes RTL loop unswitching
> >
> > 2014-04-15  Richard Biener  <rguent...@suse.de>
> >
> > 	* Makefile.in (OBJS): Remove loop-unswitch.o.
> > 	* loop-unswitch.c: Delete.
> > 	* tree-pass.h (make_pass_rtl_unswitch): Remove.
> > 	* passes.def (pass_rtl_unswitch): Likewise.
> > 	* loop-init.c (gate_rtl_unswitch): Likewise.
> > 	(rtl_unswitch): Likewise.
> > 	(pass_data_rtl_unswitch): Likewise.
> > 	(pass_rtl_unswitch): Likewise.
> > 	(make_pass_rtl_unswitch): Likewise.
> > 	* rtl.h (reversed_condition): Likewise.
> > 	(compare_and_jump_seq): Likewise.
> > 	* loop-iv.c (reversed_condition): Move here from loop-unswitch.c
> > 	and make static.
> > 	* loop-unroll.c (compare_and_jump_seq): Likewise.
>
> I found some more; OK to commit?  Is a non-bootstrap build enough for
> this, or is a full bootstrap build and test needed?

That's enough.

Ok.

Thanks,
Richard.


-- 
Richard Biener rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer

Re: [patch] libstdc++/61086 - fix ubsan errors in std::vector

2014-05-07 Thread Paolo Carlini

Hi,

On 05/07/2014 02:33 PM, Jonathan Wakely wrote:
> Yes, I checked. deque::const_iterator, list::const_iterator,
> vector<bool>::const_iterator and the _Rb_tree_const_iterator types all
> have _M_const_cast but they do not dereference anything.
>
> It only really affected std::vector because that's the only one of our
> containers that correctly supports custom pointer types (when my fixes
> for PR57272 are ready I'll need to deal with the issue again and will
> be careful about dereferencing).

Excellent. Thanks again!

Paolo.


Re: debug container patch

2014-05-07 Thread Ramana Radhakrishnan
On Wed, May 7, 2014 at 2:13 AM, Paolo Carlini <paolo.carl...@oracle.com> wrote:
> -- Francois,
>
> remember to regenerate and commit the Makefile.in changes.

Can someone regenerate and commit the Makefile.in changes soon ? I'm
seeing testsuite failures thanks to missing debug/safe_container.h on
arm-none-linux-gnueabihf

I don't have access to a machine right now with the right versions of
autoconf and automake that can do this easily.

Ramana

>
> Thanks,
> Paolo.


[PATCH][1/n] Fix PR61034

2014-05-07 Thread Richard Biener

The following fixes part of PR61034 - we are hindered by false
clobbering during FRE/PRE on paths we try to look through by
means of the alias walker.  The following makes us also
consider lattice-based disambiguation there and in particular
also try harder to disambiguate against builtins.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2014-05-07  Richard Biener  <rguent...@suse.de>

	PR tree-optimization/61034
	* tree-ssa-alias.c (call_may_clobber_ref_p_1): Export.
	(maybe_skip_until): Use translate to take into account
	lattices when trying to do disambiguations.
	(get_continuation_for_phi_1): Likewise.
	(get_continuation_for_phi): Adjust for added translate
	arguments.
	(walk_non_aliased_vuses): Likewise.
	* tree-ssa-alias.h (get_continuation_for_phi): Adjust
	prototype.
	(walk_non_aliased_vuses): Likewise.
	(call_may_clobber_ref_p_1): Declare.
	* tree-ssa-sccvn.c (vn_reference_lookup_3): Also
	disambiguate against calls.  Stop early if we are
	only supposed to disambiguate.
	* tree-ssa-pre.c (translate_vuse_through_block): Adjust.

	* g++.dg/tree-ssa/pr61034.C: New testcase.

Index: gcc/tree-ssa-alias.c
===
*** gcc/tree-ssa-alias.c.orig   2014-05-07 13:53:47.015599960 +0200
--- gcc/tree-ssa-alias.c2014-05-07 14:07:09.087544738 +0200
*** ref_maybe_used_by_stmt_p (gimple stmt, t
*** 1835,1841 
  /* If the call in statement CALL may clobber the memory reference REF
 return true, otherwise return false.  */
  
! static bool
  call_may_clobber_ref_p_1 (gimple call, ao_ref *ref)
  {
tree base;
--- 1835,1841 
  /* If the call in statement CALL may clobber the memory reference REF
 return true, otherwise return false.  */
  
! bool
  call_may_clobber_ref_p_1 (gimple call, ao_ref *ref)
  {
tree base;
*** stmt_kills_ref_p (gimple stmt, tree ref)
*** 2318,2324 
  static bool
  maybe_skip_until (gimple phi, tree target, ao_ref *ref,
  tree vuse, unsigned int *cnt, bitmap *visited,
! bool abort_on_visited)
  {
basic_block bb = gimple_bb (phi);
  
--- 2318,2326 
  static bool
  maybe_skip_until (gimple phi, tree target, ao_ref *ref,
  tree vuse, unsigned int *cnt, bitmap *visited,
! bool abort_on_visited,
! void *(*translate)(ao_ref *, tree, void *, bool),
! void *data)
  {
basic_block bb = gimple_bb (phi);
  
*** maybe_skip_until (gimple phi, tree targe
*** 2338,2344 
  if (bitmap_bit_p (*visited, SSA_NAME_VERSION (PHI_RESULT (def_stmt))))
return !abort_on_visited;
  vuse = get_continuation_for_phi (def_stmt, ref, cnt,
!  visited, abort_on_visited);
  if (!vuse)
return false;
  continue;
--- 2340,2347 
  if (bitmap_bit_p (*visited, SSA_NAME_VERSION (PHI_RESULT (def_stmt))))
return !abort_on_visited;
  vuse = get_continuation_for_phi (def_stmt, ref, cnt,
!  visited, abort_on_visited,
!  translate, data);
  if (!vuse)
return false;
  continue;
*** maybe_skip_until (gimple phi, tree targe
*** 2350,2356 
  /* A clobbering statement or the end of the IL ends it failing.  */
  ++*cnt;
  if (stmt_may_clobber_ref_p_1 (def_stmt, ref))
!   return false;
}
/* If we reach a new basic-block see if we already skipped it
   in a previous walk that ended successfully.  */
--- 2353,2365 
  /* A clobbering statement or the end of the IL ends it failing.  */
  ++*cnt;
  if (stmt_may_clobber_ref_p_1 (def_stmt, ref))
!   {
! if (translate
!  && (*translate) (ref, vuse, data, true) == NULL)
!   ;
! else
!   return false;
!   }
}
/* If we reach a new basic-block see if we already skipped it
   in a previous walk that ended successfully.  */
*** maybe_skip_until (gimple phi, tree targe
*** 2372,2378 
  static tree
  get_continuation_for_phi_1 (gimple phi, tree arg0, tree arg1,
ao_ref *ref, unsigned int *cnt,
!   bitmap *visited, bool abort_on_visited)
  {
gimple def0 = SSA_NAME_DEF_STMT (arg0);
gimple def1 = SSA_NAME_DEF_STMT (arg1);
--- 2381,2389 
  static tree
  get_continuation_for_phi_1 (gimple phi, tree arg0, tree arg1,
ao_ref *ref, unsigned int *cnt,
!   bitmap *visited, bool abort_on_visited,
!   void *(*translate)(ao_ref *, tree, void *, bool),
!

Re: debug container patch

2014-05-07 Thread Jonathan Wakely

On 07/05/14 14:17 +0100, Ramana Radhakrishnan wrote:
> Can someone regenerate and commit the Makefile.in changes soon ? I'm
> seeing testsuite failures thanks to missing debug/safe_container.h on
> arm-none-linux-gnueabihf


It was done hours ago by
http://gcc.gnu.org/ml/gcc-cvs/2014-05/msg00170.html


Re: debug container patch

2014-05-07 Thread Ramana Radhakrishnan
On Wed, May 7, 2014 at 2:22 PM, Jonathan Wakely <jwak...@redhat.com> wrote:
> On 07/05/14 14:17 +0100, Ramana Radhakrishnan wrote:
>>
>> Can someone regenerate and commit the Makefile.in changes soon ? I'm
>> seeing testsuite failures thanks to missing debug/safe_container.h on
>> arm-none-linux-gnueabihf
>
>
> It was done hours ago by
> http://gcc.gnu.org/ml/gcc-cvs/2014-05/msg00170.html

Sorry about the noise. I realized that just after I had hit send. Not
enough coffee today.

Ramana


Re: [C++ Patch] PR 61080

2014-05-07 Thread Jason Merrill

OK.

Jason


[patch] libstdc++/61023 - copy comparison functor in RB tree move assignment

2014-05-07 Thread Jonathan Wakely

As noted in the PR, the standard doesn't actually say what containers
should do with their functors on move construction/assignment.

Our unordered containers currently move the hash and predicate
functions.

Our RB trees copy the comparison function in the move constructor but
do nothing with it in the move assignment. I think moving in both
cases is probably correct, but rather than change the existing move
constructor this patch just makes the move assignment copy the
function, for consistency.

When the standard is clarified we can review whether we should be
moving instead of copying.

Tested x86_64-linux, committed to trunk and the 4.9 branch.
commit 42ea108aeb7528ff3b41f7c1b9d11f3a8ba1bae8
Author: Jonathan Wakely <jwak...@redhat.com>
Date:   Wed May 7 14:25:48 2014 +0100

	PR libstdc++/61023
	* include/bits/stl_tree.h (_Rb_tree::_M_move_assign): Copy the
	comparison function.
	* testsuite/23_containers/set/cons/61023.cc: New.

diff --git a/libstdc++-v3/include/bits/stl_tree.h b/libstdc++-v3/include/bits/stl_tree.h
index 288c9fa..ce43ab8 100644
--- a/libstdc++-v3/include/bits/stl_tree.h
+++ b/libstdc++-v3/include/bits/stl_tree.h
@@ -1073,6 +1073,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     _Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>::
     _M_move_assign(_Rb_tree& __x)
 {
+  _M_impl._M_key_compare = __x._M_impl._M_key_compare;
   if (_Alloc_traits::_S_propagate_on_move_assign()
 	  || _Alloc_traits::_S_always_equal()
 	  || _M_get_Node_allocator() == __x._M_get_Node_allocator())
diff --git a/libstdc++-v3/testsuite/23_containers/set/cons/61023.cc b/libstdc++-v3/testsuite/23_containers/set/cons/61023.cc
new file mode 100644
index 0000000..087b9cc
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/set/cons/61023.cc
@@ -0,0 +1,56 @@
+// { dg-options "-std=gnu++11" }
+
+// Copyright (C) 2014 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#include <set>
+#include <stdexcept>
+
+struct Comparator
+{
+  Comparator() : valid(false) { }
+  explicit Comparator(bool) : valid(true) { }
+
+  bool operator()(int i, int j) const
+  {
+    if (!valid)
+      throw std::logic_error("Comparator is invalid");
+    return i < j;
+  }
+
+private:
+  bool valid;
+};
+
+int main()
+{
+  using test_type = std::set<int, Comparator>;
+
+  Comparator cmp{true};
+
+  test_type good{cmp};
+
+  test_type s1;
+  s1 = good; // copy-assign
+  s1.insert(1);
+  s1.insert(2);
+
+  test_type s2;
+  s2 = std::move(good);  // move-assign
+  s2.insert(1);
+  s2.insert(2);
+}
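
As a complement to the testcase above, here is a small user-side check of
the property the patch is after: the target of a move assignment should end
up with a usable comparator instead of keeping its own default-constructed
one. This is only a sketch; `FlaggedLess` and
`comparator_survives_move_assign` are hypothetical names, and only
std::set and its key_comp() member are used. Because the comparator state
is a plain bool, moving and copying it are indistinguishable here, so the
check says nothing about which of the two the library performs.

```cpp
#include <set>
#include <utility>

// Hypothetical stateful comparator: a default-constructed instance is
// flagged as invalid, mirroring the idea of the testcase in the patch.
struct FlaggedLess
{
  FlaggedLess() : valid(false) { }
  explicit FlaggedLess(bool v) : valid(v) { }

  bool operator()(int i, int j) const { return i < j; }

  bool valid;
};

// Returns true if the target of a move assignment holds the source's
// comparator afterwards, rather than its original "invalid" one.
bool comparator_survives_move_assign()
{
  std::set<int, FlaggedLess> source(FlaggedLess{true});
  std::set<int, FlaggedLess> target;   // default comparator: valid == false
  target = std::move(source);
  return target.key_comp().valid;
}
```

A library that leaves the comparator untouched in move assignment, which is
the behaviour the message describes for the RB trees before this patch,
would make the function return false.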


Re: [Patch ARM 1/3] Neon intrinsics TLC : Replace intrinsics with GNU C implementations where possible.

2014-05-07 Thread Richard Earnshaw
On 28/04/14 14:01, Ramana Radhakrishnan wrote:
>
> On Mon, Apr 28, 2014 at 12:44 PM, Julian Brown <jul...@codesourcery.com> wrote:
>> On Mon, 28 Apr 2014 11:44:01 +0100
>> Ramana Radhakrishnan <ramra...@arm.com> wrote:
>>
>>> I've special cased the ffast-math case for the _f32 intrinsics to
>>> prevent the auto-vectorizer from coming along and vectorizing addv2sf
>>> and addv4sf type operations which we don't want to happen by default.
>>> Patch 1/3 causes apparent regressions in the rather ineffective
>>> neon intrinsics tests that we currently carry soon hopefully to be
>>> replaced by Christophe Lyon's rewrite that is being reviewed. On the
>>> whole I deem this patch stack to be safe to go in if necessary. These
>>> regressions are for -O0 with the vbic and vorn intrinsics which
>>> don't now get combined and well, so be it.
>>
>> I think reimplementing these intrinsics in C is a mistake if we ever
>> hope to make big-endian mode work properly, and fixing the generated
>> header file by bypassing the generator makes it harder to accurately
>> perform the sweeping changes that will probably be necessary to do that.
>>
>> Recall e.g. the discussion around:
>>
>> http://gcc.gnu.org/ml/gcc-patches/2013-03/msg00161.html
>
> Well, it would help if the generator were written in a better language
> than ML :) . While I don't mind the different language in the backend
> once in a while the problem is that every time anyone needs to make a
> change to this file, we spend far more time relearning ML than actually
> doing the change :(.
>

I agree: it's time the ML files went.  They're an impediment to
maintenance these days.

When the ML description was added it did three things: generated
arm_neon.h, generated the testsuite and generated a pipeline description
for Cortex-A8.  As we've progressed the second and third of these have
gone away (or at least, are about to in the case of the testsuite),
leaving only the arm_neon.h generation.  I don't see any real merit in
having that file generated from the ML file; we might as well just
maintain the existing code directly and that brings about the chance to
have more people actively work on fixing issues there without having to
learn ML first.

R.




[PATCH, PR 60897] Clear DECL_LANG_SPECIFIC when creating ISRA clones

2014-05-07 Thread Martin Jambor
Hi,

I nearly forgot about this patch to fix PR 60897 where we get a
mangled name in a warning for IPA-SRA functions because IPA-SRA
currently does not clear DECL_LANG_SPECIFIC when it messes with formal
parameters and the front-end then does not look at abstract origin
when it is not NULL.

Bootstrapped and tested on x86_64-linux.  OK for trunk?  Also,
although I have not tested it there yet, I suppose this should also be
committed to the 4.9 branch.

Thanks,

Martin


2014-04-22  Martin Jambor  <mjam...@suse.cz>

	PR ipa/60897
	* ipa-prop.c (ipa_modify_formal_parameters): Reset DECL_LANG_SPECIFIC.

diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index 9f144fa..0bc44d3 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -3650,6 +3650,7 @@ ipa_modify_formal_parameters (tree fndecl, 
ipa_parm_adjustment_vec adjustments)
 
   TREE_TYPE (fndecl) = new_type;
   DECL_VIRTUAL_P (fndecl) = 0;
+  DECL_LANG_SPECIFIC (fndecl) = NULL;
   otypes.release ();
   oparms.release ();
 }


Re: [PATCH] Fix GDB PR15559 (inferior calls using thiscall calling convention)

2014-05-07 Thread Tom Tromey
Tom> The usual approach is some appropriate text somewhere on the GCC wiki
Tom> (though I suppose a note in the mail archives would do in a pinch)
Tom> along with a URL in a comment in the appropriate file (dwarf2.h or
Tom> dwarf2.def).

Tom> Could you please do that?

Julian> How's this, as a first attempt?
Julian> http://gcc.gnu.org/wiki/GNUDwarfExtensions

Sorry I didn't reply to this sooner.
That page looks great.  Thanks for doing this.

Tom


[C++ PATCH] demangler fix

2014-05-07 Thread Gary Benson
Hi all,

A patch I committed to libiberty last year [1, 2] introduced a regression
that makes the demangler segfault on certain symbols [3, 4, 5, 6].
The attached patch fixes this and adds regression tests for all symbols
referenced in those bugs.

Ok to commit?

Thanks,
Gary

--
http://gbenson.net/

[1] http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01299.html
[2] http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01755.html
[3] https://sourceware.org/bugzilla/show_bug.cgi?id=14963
[4] https://sourceware.org/bugzilla/show_bug.cgi?id=16593
[5] https://sourceware.org/bugzilla/show_bug.cgi?id=16752
[6] https://sourceware.org/bugzilla/show_bug.cgi?id=16845

2014-05-07  Gary Benson  <gben...@redhat.com>

	* cp-demangle.c (struct d_component_stack): New structure.
	(struct d_print_info): New field component_stack.
	(d_print_init): Initialize the above.
	(d_print_comp_inner): Renamed from d_print_comp.
	Do not restore template stack if it would cause a loop.
	(d_print_comp): New function.
	* testsuite/demangle-expected: New test cases.
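
The technique named in the ChangeLog, a stack of the components currently
being printed that lives in the recursion's own stack frames, can be
sketched in isolation as follows. This is a simplified illustration and not
the libiberty code: `Node`, `Frame`, `print`, `print_inner` and `render` are
hypothetical names, and the sketch refuses to re-enter a node that is
already on the stack, whereas the patch uses the same kind of lookup only to
decide whether the saved template stack should be restored.

```cpp
#include <string>
#include <vector>

// Hypothetical node type; `children` may point back at an ancestor.
struct Node
{
  std::string name;
  std::vector<const Node*> children;
};

// One entry per active recursion frame, innermost first.
struct Frame
{
  const Node* node;
  const Frame* parent;
};

// True if `n` is already being printed somewhere up the call chain.
bool on_stack(const Frame* top, const Node* n)
{
  for (const Frame* f = top; f != nullptr; f = f->parent)
    if (f->node == n)
      return true;
  return false;
}

void print_inner(const Node* n, const Frame* top, std::string& out);

// Wrapper: push a frame that lives in this call's stack storage, run the
// real printer, and pop implicitly when the call returns.
void print(const Node* n, const Frame* top, std::string& out)
{
  if (on_stack(top, n))
    {
      out += "<loop>";   // do not recurse into a node we are already inside
      return;
    }
  Frame self = { n, top };
  print_inner(n, &self, out);
}

void print_inner(const Node* n, const Frame* top, std::string& out)
{
  out += n->name;
  for (const Node* child : n->children)
    {
      out += '(';
      print(child, top, out);
      out += ')';
    }
}

std::string render(const Node& root)
{
  std::string out;
  print(&root, nullptr, out);
  return out;
}
```

Keeping the frames on the call stack means no allocation and no cleanup
path: the list shrinks automatically as the recursion unwinds.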

diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index bf2ffa9..41c86c7 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -275,6 +275,16 @@ struct d_growable_string
   int allocation_failure;
 };
 
+/* Stack of components, innermost first, used to avoid loops.  */
+
+struct d_component_stack
+{
+  /* This component.  */
+  const struct demangle_component *dc;
+  /* This component's parent.  */
+  const struct d_component_stack *parent;
+};
+
 /* A demangle component and some scope captured when it was first
traversed.  */
 
@@ -327,6 +337,8 @@ struct d_print_info
   int pack_index;
   /* Number of d_print_flush calls so far.  */
   unsigned long int flush_count;
+  /* Stack of components, innermost first, used to avoid loops.  */
+  const struct d_component_stack *component_stack;
   /* Array of saved scopes for evaluating substitutions.  */
   struct d_saved_scope *saved_scopes;
   /* Index of the next unused saved scope in the above array.  */
@@ -3934,6 +3946,8 @@ d_print_init (struct d_print_info *dpi, 
demangle_callbackref callback,
 
   dpi->demangle_failure = 0;
 
+  dpi->component_stack = NULL;
+
   dpi->saved_scopes = NULL;
   dpi->next_saved_scope = 0;
   dpi->num_saved_scopes = 0;
@@ -4269,8 +4283,8 @@ d_get_saved_scope (struct d_print_info *dpi,
 /* Subroutine to handle components.  */
 
 static void
-d_print_comp (struct d_print_info *dpi, int options,
-  const struct demangle_component *dc)
+d_print_comp_inner (struct d_print_info *dpi, int options,
+ const struct demangle_component *dc)
 {
   /* Magic variable to let reference smashing skip over the next modifier
  without needing to modify *dc.  */
@@ -4673,11 +4687,30 @@ d_print_comp (struct d_print_info *dpi, int options,
  }
else
  {
+   const struct d_component_stack *dcse;
+   int found_self_or_parent = 0;
+
/* This traversal is reentering SUB as a substition.
-  Restore the original templates temporarily.  */
-   saved_templates = dpi->templates;
-   dpi->templates = scope->templates;
-   need_template_restore = 1;
+  If we are not beneath SUB or DC in the tree then we
+  need to restore SUB's template stack temporarily.  */
+   for (dcse = dpi->component_stack; dcse != NULL;
+dcse = dcse->parent)
+ {
+   if (dcse->dc == sub
+   || (dcse->dc == dc
+&& dcse != dpi->component_stack))
+ {
+   found_self_or_parent = 1;
+   break;
+ }
+ }
+
+   if (!found_self_or_parent)
+ {
+   saved_templates = dpi->templates;
+   dpi->templates = scope->templates;
+   need_template_restore = 1;
+ }
  }
 
a = d_lookup_template_argument (dpi, sub);
@@ -5316,6 +5349,21 @@ d_print_comp (struct d_print_info *dpi, int options,
 }
 }
 
+static void
+d_print_comp (struct d_print_info *dpi, int options,
+ const struct demangle_component *dc)
+{
+  struct d_component_stack self;
+
+  self.dc = dc;
+  self.parent = dpi->component_stack;
+  dpi->component_stack = &self;
+
+  d_print_comp_inner (dpi, options, dc);
+
+  dpi->component_stack = self.parent;
+}
+
 /* Print a Java dentifier.  For Java we try to handle encoded extended
Unicode characters.  The C++ ABI doesn't mention Unicode encoding,
so we don't it for C++.  Characters are encoded as
diff --git a/libiberty/testsuite/demangle-expected 
b/libiberty/testsuite/demangle-expected
index 3ff08e6..453f9a3 100644
--- a/libiberty/testsuite/demangle-expected
+++ b/libiberty/testsuite/demangle-expected
@@ 

Re: PR 61084: SPARC fallout from wide-int merge

2014-05-07 Thread Mike Stump
On May 7, 2014, at 2:26 AM, Richard Sandiford <rdsandif...@googlemail.com> wrote:
> The DImode constant splitter assigned the result of trunc_int_for_mode
> to an unsigned int rather than a HOST_WIDE_INT.  This then produced const_ints
> that were zero-extended rather than sign-extended and tripped the assert:
>
>   gcc_checking_assert (INTVAL (x.first)
>                        == sext_hwi (INTVAL (x.first), precision)
>                        || (x.second == BImode && INTVAL (x.first) == 1));
>
> The other hunks are just by inspection, but I think gen_int_mode is
> preferred over GEN_INT when the mode is obvious.
>
> Tested by Rainer, who says that the bootstrap now completes.
> OK to install?

Ok.
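
The failure mode described in the quoted text can be reproduced outside the
compiler. The sketch below is written from scratch and assumes a 32-bit
`unsigned int` and a 32-bit value precision; `sign_extend` is a stand-in
for what sext_hwi computes, and the other function names are hypothetical.
It shows why routing a sign-extended value through a narrower unsigned
variable yields a constant that is no longer in sign-extended form.

```cpp
#include <cstdint>

// Sign-extend the low `precision` bits of `value` to 64 bits.
// Assumes 1 <= precision <= 64.
std::int64_t sign_extend(std::int64_t value, unsigned precision)
{
  if (precision >= 64)
    return value;
  const std::uint64_t sign_bit = std::uint64_t{1} << (precision - 1);
  const std::uint64_t mask = (std::uint64_t{1} << precision) - 1;
  const std::uint64_t low = static_cast<std::uint64_t>(value) & mask;
  return static_cast<std::int64_t>((low ^ sign_bit) - sign_bit);
}

// The problematic pattern: the sign-extended value passes through a
// 32-bit unsigned variable, so widening it again zero-extends.
std::int64_t through_unsigned_int(std::int64_t value)
{
  unsigned int tmp = static_cast<unsigned int>(sign_extend(value, 32));
  return tmp;
}

// The corrected pattern: keep the value in a 64-bit signed variable.
std::int64_t through_wide_int(std::int64_t value)
{
  std::int64_t tmp = sign_extend(value, 32);
  return tmp;
}

// Same shape as the first half of the assert quoted above: is `value`
// already in sign-extended form for the given precision?
bool is_canonical(std::int64_t value, unsigned precision)
{
  return value == sign_extend(value, precision);
}
```

For the input 0xFFFFFFFF the wide path yields -1, which is canonical, while
the unsigned path yields 4294967295, which is not.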

[Committed] Add myself to MAINTAINERS

2014-05-07 Thread Charles Baylis
Committed as r210164.
Index: MAINTAINERS
===
--- MAINTAINERS (revision 210161)
+++ MAINTAINERS (working copy)
@@ -315,6 +315,7 @@
 Simon Baldwin                  <sim...@google.com>
 Scott Bambrough                <sco...@netwinder.org>
 Wolfgang Bangerth              <bange...@dealii.org>
+Charles Baylis                 <charles.bay...@linaro.org>
 Tejas Belagod                  <tejas.bela...@arm.com>
 Andrey Belevantsev             <a...@ispras.ru>
 Jon Beniston                   <j...@beniston.com>


Re: [C++ PATCH] demangler fix

2014-05-07 Thread Jason Merrill

OK, thanks.

Jason


[PATCH] copyprop_hardreg_forward needs to check HARD_REGNO_CALL_PART_CLOBBERED

2014-05-07 Thread Matthew Fortune
The MIPS O32 FPXX ABI exposes a bug in regcprop where call part
clobbered information is not checked when calculating clobbered
registers. This is only one of many places that 
regs_invalidated_by_call is used without also checking 
HARD_REGNO_CALL_PART_CLOBBERED. This patch ensures that a part 
clobbered register is treated as if fully clobbered.

Other places where this same issue occurs are not so easily
fixed as they do not always have mode information available
when calculating clobbered registers. A solution to the larger
problem will be significantly more involved.
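
To make the distinction concrete, here is a toy model of the check the patch
adds. It is not GCC's data structures: `CallInfo`, `Mode`, `part_clobbered`
and `value_survives_call` are hypothetical names, and the rule that only
64-bit values are affected by a partial clobber is an assumption made for
the example. The point is only that a register missing from the
fully-clobbered set can still lose a value across a call, depending on the
mode of that value.

```cpp
#include <bitset>

// Toy register file; all names below are illustrative only.
constexpr int kNumRegs = 8;

enum class Mode { SF, DF };   // 32-bit and 64-bit values

struct CallInfo
{
  // Registers whose whole contents are lost across a call
  // (the role regs_invalidated_by_call plays in the message).
  std::bitset<kNumRegs> fully_clobbered;
  // Registers for which the callee preserves only the low half.
  std::bitset<kNumRegs> upper_half_clobbered;
};

// Analogue of HARD_REGNO_CALL_PART_CLOBBERED (regno, mode): in this model
// only a 64-bit value can be damaged by a partial clobber.
bool part_clobbered(const CallInfo& ci, int regno, Mode mode)
{
  return mode == Mode::DF && ci.upper_half_clobbered.test(regno);
}

// A value of the given mode held in `regno` survives a call only if the
// register is neither fully clobbered nor part clobbered in that mode.
bool value_survives_call(const CallInfo& ci, int regno, Mode mode)
{
  return !ci.fully_clobbered.test(regno) && !part_clobbered(ci, regno, mode);
}
```

Note that the check needs the mode of the value, which is why the message
says the other users of regs_invalidated_by_call are harder to fix.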

Exposed in a testcase as part of:
http://gcc.gnu.org/ml/gcc-patches/2014-05/msg00401.html

Regards,
Matthew

2014-05-07  Matthew Fortune  <matthew.fort...@imgtec.com>

gcc/
* regcprop.c (copyprop_hardreg_forward_1): Account for
HARD_REGNO_CALL_PART_CLOBBERED.


0001-copyprop-part-clobbered.patch
Description: 0001-copyprop-part-clobbered.patch


Re: PR 61084: SPARC fallout from wide-int merge

2014-05-07 Thread Richard Sandiford
Mike Stump <mikest...@comcast.net> writes:
> On May 7, 2014, at 2:26 AM, Richard Sandiford
> <rdsandif...@googlemail.com> wrote:
>> The DImode constant splitter assigned the result of trunc_int_for_mode
>> to an unsigned int rather than a HOST_WIDE_INT.  This then produced
>> const_ints
>> that were zero-extended rather than sign-extended and tripped the assert:
>>
>>  gcc_checking_assert (INTVAL (x.first)
>>                       == sext_hwi (INTVAL (x.first), precision)
>>                       || (x.second == BImode && INTVAL (x.first) == 1));
>>
>> The other hunks are just by inspection, but I think gen_int_mode is
>> preferred over GEN_INT when the mode is obvious.
>>
>> Tested by Rainer, who says that the bootstrap now completes.
>> OK to install?
>
> Ok.

I think this needs a backend maintainer.  Although it was exposed by
the wide-int assert, it isn't really wide-int-related as such.

Thanks,
Richard



Re: [PATCH GCC]Add 'force-dwarf-lexical-blocks' command line option

2014-05-07 Thread Mike Stump
On May 7, 2014, at 2:32 AM, Herman, Andrei <andrei_her...@codesourcery.com> wrote:
> However, code coverage tools that process the DWARF debug information to
> implement block/path coverage need more complete lexical block information.

So, it would be nice to give a hint in the actual documentation, why a user 
might use the flag, or for a maintainer to be able to predict exactly what was 
desired in some obscure corner of dwarf semantics given the documentation.  I 
think it can be as simple as “This option is useful for code coverage tools 
that utilize the dwarf debug information.”  A user, upon seeing that, would 
then ask, do I have such a tool, say no, and then know they don’t have to 
contemplate the goodness of the option further.  If one is writing a coverage 
tool, upon seeing the documentation, they might then ask themselves, how might 
I use that flag profitably for my users.

Re: [PATCH] rs6000: New attributes for load/store: sign_extend, update and indexed

2014-05-07 Thread David Edelsohn
On Sun, May 4, 2014 at 10:13 PM, Segher Boessenkool
<seg...@kernel.crashing.org> wrote:
> The new attributes replace the instruction types *_ext*, *_u, *_ux.
>
> This simplifies all code that does not care about the addressing modes,
> putting the burden on the code that does care (mostly the scheduling
> descriptions for certain CPUs).
>
> It fixes a few minor bugs in the process.
>
> The update and indexed attributes are automatic for any insn that
> has a MEM as operand 0 or 1.  Other insns have to set it manually, if
> they do not like the default (which is no).  Insns that are type
> load/store/fpload/fpstore but have fewer than two operands need to set
> it too, or the compiler will crash.  There are very few of those.
>
> This tries not to change semantics anywhere; in particular, the string
> and multiple instructions set both update and indexed (although
> they are neither).
>
> Bootstrapped on powerpc64-linux c,c++,fortran,ada,go; tested
> {-m64,-m64/-mcpu=power8,-m32,-m32/-mpowerpc64}, no regressions.
>
> OK for mainline?
>
>
> Segher
>

> gcc/
>
> 2014-05-04  Segher Boessenkool  <seg...@kernel.crashing.org>
>
> 	* config/rs6000/predicates.md (indexed_address_mem): New.
> 	* config/rs6000/rs6000.md (type): Remove load_ext, load_ext_u,
> 	load_ext_ux, load_ux, load_u, store_ux, store_u, fpload_ux, fpload_u,
> 	fpstore_ux, fpstore_u.
> 	(sign_extend, indexed, update): New.
> 	(cell_micro): Adjust.
> 	(*zero_extend<mode>di2_internal1, *zero_extendsidi2_lfiwzx,
> 	*extendsidi2_lfiwax, *extendsidi2_nocell, *extendsfdf2_fpr,
> 	*movsi_internal1, *movsi_internal1_single, *movhi_internal,
> 	*movqi_internal, *movcc_internal1, mov<mode>_hardfloat,
> 	*mov<mode>_softfloat, *mov<mode>_hardfloat32, *mov<mode>_hardfloat64,
> 	*mov<mode>_softfloat64, *movdi_internal32, *movdi_internal64,
> 	*mov<mode>_string, *ldmsi8, *ldmsi7, *ldmsi6, *ldmsi5, *ldmsi4,
> 	*ldmsi3, *stmsi8, *stmsi7, *stmsi6, *stmsi5, *stmsi4, *stmsi3,
> 	*movdi_update1, movdi_<mode>_update, movdi_<mode>_update_stack,
> 	*movsi_update1, *movsi_update2, movsi_update, movsi_update_stack,
> 	*movhi_update1, *movhi_update2, *movhi_update3, *movhi_update4,
> 	*movqi_update1, *movqi_update2, *movqi_update3, *movsf_update1,
> 	*movsf_update2, *movsf_update3, *movsf_update4, *movdf_update1,
> 	*movdf_update2, load_toc_aix_si, load_toc_aix_di, probe_stack_<mode>,
> 	*stmw, *lmw, as well as 10 anonymous patterns): Adjust.
>
> 	* config/rs6000/dfp.md (movsd_store, movsd_load): Adjust.
> 	* config/rs6000/vsx.md (*vsx_movti_32bit, *vsx_extract_<mode>_load,
> 	*vsx_extract_<mode>_store): Adjust.
> 	* config/rs6000/rs6000.c (rs6000_adjust_cost, is_microcoded_insn,
> 	is_cracked_insn, insn_must_be_first_in_group,
> 	insn_must_be_last_in_group): Adjust.
>
> 	* config/rs6000/40x.md (ppc403-load, ppc403-store, ppc405-float):
> 	Adjust.
> 	* config/rs6000/440.md (ppc440-load, ppc440-store, ppc440-fpload,
> 	ppc440-fpstore): Adjust.
> 	* config/rs6000/476.md (ppc476-load, ppc476-store, ppc476-fpload,
> 	ppc476-fpstore): Adjust.
> 	* config/rs6000/601.md (ppc601-load, ppc601-store, ppc601-fpload,
> 	ppc601-fpstore): Adjust.
> 	* config/rs6000/603.md (ppc603-load, ppc603-store, ppc603-fpload):
> 	Adjust.
> 	* config/rs6000/6xx.md (ppc604-load, ppc604-store, ppc604-fpload):
> 	Adjust.
> 	* config/rs6000/7450.md (ppc7450-load, ppc7450-store, ppc7450-fpload,
> 	ppc7450-fpstore): Adjust.
> 	* config/rs6000/7xx.md (ppc750-load, ppc750-store): Adjust.
> 	* config/rs6000/8540.md (ppc8540_load, ppc8540_store): Adjust.
> 	* config/rs6000/a2.md (ppca2-load, ppca2-fp-load, ppca2-fp-store):
> 	Adjust.
> 	* config/rs6000/cell.md (cell-load, cell-load-ux, cell-load-ext,
> 	cell-fpload, cell-fpload-update, cell-store, cell-store-update,
> 	cell-fpstore, cell-fpstore-update): Adjust.
> 	* config/rs6000/e300c2c3.md (ppce300c3_load, ppce300c3_fpload,
> 	ppce300c3_store, ppce300c3_fpstore): Adjust.
> 	* config/rs6000/e500mc.md (e500mc_load, e500mc_fpload, e500mc_store,
> 	e500mc_fpstore): Adjust.
> 	* config/rs6000/e500mc64.md (e500mc64_load, e500mc64_fpload,
> 	e500mc64_store, e500mc64_fpstore): Adjust.
> 	* config/rs6000/e5500.md (e5500_load, e5500_fpload, e5500_store,
> 	e5500_fpstore): Adjust.
> 	* config/rs6000/e6500.md (e6500_load, e6500_fpload, e6500_store,
> 	e6500_fpstore): Adjust.
> 	* config/rs6000/mpc.md (mpccore-load, mpccore-store, mpccore-fpload):
> 	Adjust.
> 	* config/rs6000/power4.md (power4-load, power4-load-ext,
> 	power4-load-ext-update, power4-load-ext-update-indexed,
> 	power4-load-update-indexed, power4-load-update, power4-fpload,
> 	power4-fpload-update, power4-store, power4-store-update,
> 	power4-store-update-indexed, power4-fpstore, 

[4.7] Various backports

2014-05-07 Thread Jakub Jelinek
Hi!

I've backported some fixes I've committed (plus one support change from
Jason and one fix from Marek) to 4.8 branch in the last year or so to
4.7 branch, after bootstrapping/regtesting them on x86_64-linux and
i686-linux.
Sorry for the delay.

Jakub
2014-05-07  Jakub Jelinek  ja...@redhat.com

Backported from mainline
2013-06-27  Jakub Jelinek  ja...@redhat.com

PR target/57623
* config/i386/i386.md (bmi2_bzhi_<mode>3): Swap AND arguments
to match RTL canonicalization.  Swap predicates and
constraints of operand 1 and 2.

* gcc.target/i386/bmi2-bzhi-1.c: New test.

--- gcc/config/i386/i386.md (revision 200477)
+++ gcc/config/i386/i386.md (revision 200478)
@@ -12174,9 +12174,9 @@ (define_insn "*bmi_blsr_<mode>"
 ;; BMI2 instructions.
 (define_insn "bmi2_bzhi_<mode>3"
   [(set (match_operand:SWI48 0 "register_operand" "=r")
-        (and:SWI48 (match_operand:SWI48 1 "register_operand" "r")
-                   (lshiftrt:SWI48 (const_int -1)
-                                   (match_operand:SWI48 2 "nonimmediate_operand" "rm"))))
+        (and:SWI48 (lshiftrt:SWI48 (const_int -1)
+                                   (match_operand:SWI48 2 "register_operand" "r"))
+                   (match_operand:SWI48 1 "nonimmediate_operand" "rm")))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_BMI2"
   "bzhi\t{%2, %1, %0|%0, %1, %2}"
--- gcc/testsuite/gcc.target/i386/bmi2-bzhi-1.c (revision 0)
+++ gcc/testsuite/gcc.target/i386/bmi2-bzhi-1.c (revision 200478)
@@ -0,0 +1,31 @@
+/* PR target/57623 */
+/* { dg-do assemble { target bmi2 } } */
+/* { dg-options "-O2 -mbmi2" } */
+
+#include <x86intrin.h>
+
+unsigned int
+f1 (unsigned int x, unsigned int *y)
+{
+  return _bzhi_u32 (x, *y);
+}
+
+unsigned int
+f2 (unsigned int *x, unsigned int y)
+{
+  return _bzhi_u32 (*x, y);
+}
+
+#ifdef  __x86_64__
+unsigned long long
+f3 (unsigned long long x, unsigned long long *y)
+{
+  return _bzhi_u64 (x, *y);
+}
+
+unsigned long long
+f4 (unsigned long long *x, unsigned long long y)
+{
+  return _bzhi_u64 (*x, y);
+}
+#endif
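
For readers without the instruction reference to hand, the `_bzhi_u32` calls in the test above zero the high bits of the first argument, starting at the bit index held in the low 8 bits of the second. The following is a rough portable model of that behaviour, not code from the patch; the names `bzhi32_ref` and `bzhi32_ref_selftest` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference model of 32-bit BZHI; not taken from the patch.
   Bits of SRC at positions >= the index (low 8 bits of N) are cleared.
   An index of 32 or more leaves SRC unchanged.  */
uint32_t
bzhi32_ref (uint32_t src, uint32_t n)
{
  uint32_t index = n & 0xff;
  if (index >= 32)
    return src;
  return src & ((UINT32_C (1) << index) - 1);
}

/* A few spot checks of the model.  */
void
bzhi32_ref_selftest (void)
{
  assert (bzhi32_ref (0xffffffffu, 4) == 0xfu);
  assert (bzhi32_ref (0x12345678u, 0) == 0u);
  assert (bzhi32_ref (0x12345678u, 32) == 0x12345678u);
}
```

The test itself only needs to assemble: f1 passes the index from memory and f2 passes the source from memory, so both operand orders reach the pattern.
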
2014-05-07  Jakub Jelinek  ja...@redhat.com

Backported from mainline
2013-06-27  Jakub Jelinek  ja...@redhat.com

PR target/57623
* config/i386/i386.md (bmi_bextr_<mode>): Swap predicates and
constraints of operand 1 and 2.

* gcc.target/i386/bmi-bextr-3.c: New test.

--- gcc/config/i386/i386.md (revision 200479)
+++ gcc/config/i386/i386.md (revision 200480)
@@ -12077,8 +12077,8 @@
 
 (define_insn "bmi_bextr_<mode>"
   [(set (match_operand:SWI48 0 "register_operand" "=r")
-        (unspec:SWI48 [(match_operand:SWI48 1 "register_operand" "r")
-                       (match_operand:SWI48 2 "nonimmediate_operand" "rm")]
+        (unspec:SWI48 [(match_operand:SWI48 1 "nonimmediate_operand" "rm")
+                       (match_operand:SWI48 2 "register_operand" "r")]
                       UNSPEC_BEXTR))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_BMI"
--- gcc/testsuite/gcc.target/i386/bmi-bextr-3.c (revision 0)
+++ gcc/testsuite/gcc.target/i386/bmi-bextr-3.c (revision 200480)
@@ -0,0 +1,31 @@
+/* PR target/57623 */
+/* { dg-do assemble { target bmi } } */
+/* { dg-options "-O2 -mbmi" } */
+
+#include <x86intrin.h>
+
+unsigned int
+f1 (unsigned int x, unsigned int *y)
+{
+  return __bextr_u32 (x, *y);
+}
+
+unsigned int
+f2 (unsigned int *x, unsigned int y)
+{
+  return __bextr_u32 (*x, y);
+}
+
+#ifdef  __x86_64__
+unsigned long long
+f3 (unsigned long long x, unsigned long long *y)
+{
+  return __bextr_u64 (x, *y);
+}
+
+unsigned long long
+f4 (unsigned long long *x, unsigned long long y)
+{
+  return __bextr_u64 (*x, y);
+}
+#endif
2014-05-07  Jakub Jelinek  ja...@redhat.com

Backported from mainline
2013-07-03  Jakub Jelinek  ja...@redhat.com

PR target/5
* config/i386/predicates.md (vsib_address_operand): Disallow
SYMBOL_REF or LABEL_REF in parts.disp if TARGET_64BIT && flag_pic.

* gcc.target/i386/pr5.c: New test.

--- gcc/config/i386/predicates.md   (revision 200649)
+++ gcc/config/i386/predicates.md   (revision 200650)
@@ -835,19 +835,28 @@ (define_predicate "vsib_address_operand"
 return false;
 
   /* VSIB addressing doesn't support (%rip).  */
-  if (parts.disp && GET_CODE (parts.disp) == CONST)
+  if (parts.disp)
 {
-  disp = XEXP (parts.disp, 0);
-  if (GET_CODE (disp) == PLUS)
-   disp = XEXP (disp, 0);
-  if (GET_CODE (disp) == UNSPEC)
-   switch (XINT (disp, 1))
- {
- case UNSPEC_GOTPCREL:
- case UNSPEC_PCREL:
- case UNSPEC_GOTNTPOFF:
-   return false;
- }
+  disp = parts.disp;
+  if (GET_CODE (disp) == CONST)
+   {
+ disp = XEXP (disp, 0);
+ if (GET_CODE (disp) == PLUS)
+   disp = XEXP (disp, 0);
+ if (GET_CODE (disp) == UNSPEC)
+   switch (XINT (disp, 1))
+ {
+ case UNSPEC_GOTPCREL:
+ case UNSPEC_PCREL:
+ case 

Re: [PATCH, PR 60897] Clear DECL_LANG_SPECIFIC when creating ISRA clones

2014-05-07 Thread Richard Biener
On May 7, 2014 5:30:53 PM CEST, Martin Jambor mjam...@suse.cz wrote:
>Hi,
>
>I nearly forgot about this patch to fix PR 60897 where we get a
>mangled name in a warning for IPA-SRA functions because IPA-SRA
>currently does not clear DECL_LANG_SPECIFIC when it messes with formal
>parameters and the front-end then does not look at abstract origin
>when it is not NULL.
>
>Bootstrapped and tested on x86_64-linux.  OK for trunk?  Also,
>although I have not tested it there yet, I suppose this should also be
>committed to the 4.9 branch.

OK for both.
Thanks,
Richard.

>Thanks,
>
>Martin
>
>
>2014-04-22  Martin Jambor  mjam...@suse.cz
>
>   PR ipa/60897
>   * ipa-prop.c (ipa_modify_formal_parameters): Reset DECL_LANG_SPECIFIC.
>
>diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
>index 9f144fa..0bc44d3 100644
>--- a/gcc/ipa-prop.c
>+++ b/gcc/ipa-prop.c
>@@ -3650,6 +3650,7 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments)
> 
>   TREE_TYPE (fndecl) = new_type;
>   DECL_VIRTUAL_P (fndecl) = 0;
>+  DECL_LANG_SPECIFIC (fndecl) = NULL;
>   otypes.release ();
>   oparms.release ();
> }




Re: [SH, committed] Fix PR 61026 sh-*-* Fails to Compile on FreeBSD

2014-05-07 Thread Joseph S. Myers
On Sat, 3 May 2014, Oleg Endo wrote:

 +#include <sstream>
 +#include <vector>
 +#include <algorithm>
 +
  #include "config.h"

It's never OK to include any system headers (C or C++) before config.h.  
config.h may define feature test macros such as _FILE_OFFSET_BITS that 
affect system headers in various ways and are only effective if defined 
before any system headers are included, and if different files in GCC are 
built with different settings of such feature test macros then they may 
expect incompatible choices of ABI for C library types.

(This is a general principle for any software using autoconf, at least if 
it uses any of the autoconf macros that can define feature test macros - 
which GCC does - not just for GCC.)

-- 
Joseph S. Myers
jos...@codesourcery.com
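
The ordering rule described above can be sketched as follows. This is an illustration only: the `#define` stands in for whatever an autoconf-generated config.h might provide, the value 64 is an assumption, and the function names are hypothetical.

```c
/* Stand-in for #include "config.h": an autoconf-generated header may
   define feature test macros such as this one.  The value is assumed
   here for illustration.  */
#define _FILE_OFFSET_BITS 64

/* System headers come only after the configuration macros are visible.  */
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Width in bits of off_t as seen by this translation unit.  */
size_t
off_t_bits (void)
{
  return sizeof (off_t) * 8;
}

/* With the macro defined first, off_t is the 64-bit variant.  */
void
large_file_selftest (void)
{
  assert (off_t_bits () == 64);
}
```

If a system header were included above the define, a 32-bit target could end up with a 32-bit off_t in this file and a 64-bit one in files that include config.h first, which is the ABI mismatch being warned about.
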


Re: [patch] change specific int128 - generic intN

2014-05-07 Thread Joseph S. Myers
On Sun, 4 May 2014, DJ Delorie wrote:

  I'm not aware of any reason those macros need to have decimal values.  I'd 
  suggest removing the precomputed table and printing them in hex, which is 
  easy for values of any precision.
 
 Here's an independent change that removes the decimal table and
 replaces it with generated hex values.  I included the relevant output
 of gcc -E -dM also.

OK (presuming the usual bootstrap and regression test, which should 
provide a reasonably thorough test of this code through the stdint.h 
tests).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [DOC PATCH] Rewrite docs for inline asm

2014-05-07 Thread Joseph S. Myers
On Mon, 5 May 2014, Gerald Pfeifer wrote:

  I've changed this to @code{"="}.  Is that what you meant?
 
 This is a question for Joseph.  I see how a single character
 under @code{} won't work, yet @code{"="} doesn't feel right,
 either.  Perhaps ``@code{=}''?

If you are referring to an actual string constant

  "="

in the user's source code, then @code{"="} is correct.  If you are 
referring just to the single character

  =

in the user's source code, whether as a token on its own or as part of a 
larger token, then @samp{=} is the way to get it quoted (with the 
character being in a fixed-width font, but the quotes around it not being 
in such a font).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [C PATCH] Don't reject valid code with _Alignas (PR c/61053)

2014-05-07 Thread Joseph S. Myers
On Mon, 5 May 2014, Marek Polacek wrote:

 In this PR the issue is that we reject (valid) code such as
 _Alignas (long long) long long foo;
 with -m32, because we trip this condition:
 
    alignas_align = 1U << declspecs->align_log;
    if (alignas_align < TYPE_ALIGN_UNIT (type))
      {
        if (name)
          error_at (loc, "%<_Alignas%> specifiers cannot reduce "
                    "alignment of %qE", name);
 
 and error later on, since alignas_align is 4 (correct, see PR52023 for
 why), but TYPE_ALIGN_UNIT of long long is 8.  I think TYPE_ALIGN_UNIT
 is wrong here as that won't give us minimal alignment required.
 In c_sizeof_or_alignof_type we already have the code to compute such
 minimal alignment so I just moved the code to a separate function
 and used that instead of TYPE_ALIGN_UNIT.
 
 Note that the test is run only on i?86 and x86_64, because we can't (?)
 easily determine which target requires what alignment.
 
 Regtested/bootstrapped on x86_64-unknown-linux-gnu and
 powerpc64-unknown-linux-gnu, ok for trunk?

OK, though I'm not sure if the lp64 conditions are right in the testcase 
(i.e. if x32 has the same peculiarity as -m32 here, which is what's 
implied by the use of lp64).

-- 
Joseph S. Myers
jos...@codesourcery.com
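
A small C11 sketch of the declaration under discussion, assuming a compiler with the fix applied. The helper names are hypothetical, and the code does not reproduce the testcase from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The declaration from the report: requesting the type's own alignment
   must be accepted, since it cannot reduce the required alignment.  */
_Alignas (long long) long long foo;

/* Minimum alignment the target requires for long long, which may be
   smaller than the alignment the compiler prefers for the type.  */
size_t
min_align_long_long (void)
{
  return _Alignof (long long);
}

/* The object must at least satisfy the minimum alignment.  */
void
alignas_selftest (void)
{
  assert ((uintptr_t) &foo % min_align_long_long () == 0);
}
```

With -m32 on x86, `_Alignof (long long)` is expected to be 4 per the message above, while the type's preferred alignment is 8; the rejected-code bug came from comparing against the latter.
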


RE: [PATCH GCC]Add 'force-dwarf-lexical-blocks' command line option

2014-05-07 Thread Herman, Andrei
Thanks for the suggestion.
The current patch includes the following text added in gcc/doc/invoke.texi:

@item -fforce-dwarf-lexical-blocks
Produce debug information (a DW_TAG_lexical_block) for every function
body, loop body, switch body, case statement, if-then and if-else statement,
even if the body is a single statement.  Likewise, a lexical block will be
emitted for the first label of a statement.  This block ends at the end of the
current lexical scope, or when a break, continue, goto or return statement is
encountered at the same lexical scope level.
This option is available when using DWARF Version 4 or higher.

I can add the suggested sentence at the beginning of the description, to save 
time for users not interested in the more detailed explanation.

Regards,
Andrei Herman
Mentor Graphics Corporation
Israel branch 


 -----Original Message-----
 From: Mike Stump [mailto:mikest...@comcast.net]
 Sent: Wednesday, May 07, 2014 7:00 PM
 To: Herman, Andrei
 Cc: gcc-patches@gcc.gnu.org; herman_and...@mentor.com
 Subject: Re: [PATCH GCC]Add 'force-dwarf-lexical-blocks' command line
 option
 
 On May 7, 2014, at 2:32 AM, Herman, Andrei
 andrei_her...@codesourcery.com wrote:
  However, code coverage tools that process the DWARF debug information
  to implement block/path coverage need more complete lexical block
 information.
 
 So, it would be nice to give a hint in the actual documentation, why a user
 might use the flag, or for a maintainer to be able to predict exactly what
 was desired in some obscure corner of dwarf semantics given the
 documentation.  I think it can be as simple as "This option is useful for code
 coverage tools that utilize the dwarf debug information."  A user, upon
 seeing that, would then ask, do I have such a tool, say no, and then know
 they don't have to contemplate the goodness of the option further.  If one
 is writing a coverage tool, upon seeing the documentation, they might then
 ask themselves, how might I use that flag profitably for my users.


Re: [PATCH GCC]Add 'force-dwarf-lexical-blocks' command line option

2014-05-07 Thread Andrew Pinski
On Wed, May 7, 2014 at 10:19 AM, Herman, Andrei
andrei_her...@codesourcery.com wrote:
 Thanks for the suggestion.
 The current patch includes the following text added in gcc/doc/invoke.texi:

 @item -fforce-dwarf-lexical-blocks
 Produce debug information (a DW_TAG_lexical_block) for every function
 body, loop body, switch body, case statement, if-then and if-else statement,
 even if the body is a single statement.  Likewise, a lexical block will be
 emitted for the first label of a statement.  This block ends at the end of the
 current lexical scope, or when a break, continue, goto or return statement is
 encountered at the same lexical scope level.
 This option is available when using DWARF Version 4 or higher.

 I can add the suggested sentence at the beginning of the description, to save 
 time for users not interested in the more detailed explanation.

Also be explicit that the option only applies to C/C++ code in the
documentation.

Thanks,
Andrew Pinski


 Regards,
 Andrei Herman
 Mentor Graphics Corporation
 Israel branch


 -----Original Message-----
 From: Mike Stump [mailto:mikest...@comcast.net]
 Sent: Wednesday, May 07, 2014 7:00 PM
 To: Herman, Andrei
 Cc: gcc-patches@gcc.gnu.org; herman_and...@mentor.com
 Subject: Re: [PATCH GCC]Add 'force-dwarf-lexical-blocks' command line
 option

 On May 7, 2014, at 2:32 AM, Herman, Andrei
 andrei_her...@codesourcery.com wrote:
  However, code coverage tools that process the DWARF debug information
  to implement block/path coverage need more complete lexical block
 information.

 So, it would be nice to give a hint in the actual documentation, why a user
 might use the flag, or for a maintainer to be able to predict exactly what
 was desired in some obscure corner of dwarf semantics given the
 documentation.  I think it can be as simple as "This option is useful for code
 coverage tools that utilize the dwarf debug information."  A user, upon
 seeing that, would then ask, do I have such a tool, say no, and then know
 they don't have to contemplate the goodness of the option further.  If one
 is writing a coverage tool, upon seeing the documentation, they might then
 ask themselves, how might I use that flag profitably for my users.


Re: [C PATCH] Warn about variadic main (PR c/60156)

2014-05-07 Thread Joseph S. Myers
On Tue, 6 May 2014, Marek Polacek wrote:

 On Thu, May 01, 2014 at 11:37:58PM +, Joseph S. Myers wrote:
  As a matter of QoI we should also diagnose use of _Atomic in the return 
  type or argument types of main (something I deferred doing in the initial 
  _Atomic support).
 
 Ok, I opened PR61077 and I'm taking it.  But I wonder if I should
 diagnose if the second parameter is e.g.:
 _Atomic char **argv;
 char *_Atomic *argv;

Yes, those should be diagnosed (remember that _Atomic char is allowed to 
be bigger than char, so those certainly aren't reasonable types for 
arguments to main).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH GCC]Add 'force-dwarf-lexical-blocks' command line option

2014-05-07 Thread Joseph S. Myers
On Wed, 7 May 2014, Herman, Andrei wrote:

 When this flag is set, a DW_TAG_lexical_block DIE will be emitted for every
 function body, loop body, switch body, case statement, if-then and if-else
 statement, even if the body is a single statement. 
 Likewise, a lexical block will be emitted for the first label of a labeled
 statement. This block ends at the end of the current lexical scope, or when
 a break, continue, goto or return statement is encountered at the same lexical
 scope level. 
 Consequently, any case in a switch statement that does not flow through to 
 the next case, will have its own dwarf lexical block.

The documentation appears to suggest it's purely about debug info and has 
no effect on language semantics.  However, the implementation appears to 
force C99 scoping rules.  I don't think it's appropriate for a debug info 
option to have that effect; that is, gcc.dg/c90-scope-1.c should still 
pass even with the option enabled (more generally, the whole C testsuite 
should be verified to work with the option enabled).  I suspect the 
changes adding scopes for labels would also affect language semantics; 
it's valid in C to have a declaration (not having variably modified type) 
after one case in a switch statement that gets used in another case even 
when control does not flow through.

If you can't avoid affecting language semantics then you need to be very 
clear in the documentation that the option makes some invalid programs 
valid and vice versa and changes the semantics of some valid programs 
(even if you then assert the affected cases are uncommon in real C code).

-- 
Joseph S. Myers
jos...@codesourcery.com
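
The switch-statement case mentioned above can be illustrated with a short C99 sketch. The function name `pick` and the values are hypothetical; the point is only that the declaration's scope is the whole switch body, so giving each case its own language-level scope would make this valid program fail to compile.

```c
#include <assert.h>

/* X is declared after the first case label but belongs to the scope of
   the whole switch body, so the second case may use it even though
   control never flows from case 0 into case 1.  */
int
pick (int n)
{
  switch (n)
    {
    case 0:
      ;            /* before C23, a label must be followed by a statement */
      int x;
      x = 10;
      return x;
    case 1:
      x = 20;      /* same object; assigned before it is read */
      return x;
    }
  return -1;
}

/* Both cases are reachable and use the same declaration.  */
void
pick_selftest (void)
{
  assert (pick (0) == 10);
  assert (pick (1) == 20);
}
```
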


Re: [PATCH GCC]Add 'force-dwarf-lexical-blocks' command line option

2014-05-07 Thread Mike Stump
On May 7, 2014, at 10:19 AM, Herman, Andrei andrei_her...@codesourcery.com 
wrote:
 Thanks for the suggestion.

 I can add the suggested sentence at the beginning of the description, to save 
 time for users not interested in the more detailed explanation.

I’d put it at the end…  I think the description you have is more important.

[committed] PR 61095: tsan fallout from wide-int merge

2014-05-07 Thread Richard Sandiford
This PR was due to code in which -(int) foo was supposed to be sign-extended,
but was being ORed with an unsigned int and so ended up being zero-extended.
Fixed by using the proper-width type.

Tested on x86_64-linux-gnu and applied as obvious.  Sorry for the breakage.

Thanks,
Richard


gcc/
PR tree-optimization/61095
* tree-ssanames.c (get_nonzero_bits): Fix type extension in wi::shwi.

Index: gcc/tree-ssanames.c
===
--- gcc/tree-ssanames.c 2014-05-07 16:50:15.136064484 +0100
+++ gcc/tree-ssanames.c 2014-05-07 16:50:15.422063737 +0100
@@ -271,7 +271,8 @@ get_nonzero_bits (const_tree name)
 {
   struct ptr_info_def *pi = SSA_NAME_PTR_INFO (name);
   if (pi && pi->align)
-   return wi::shwi (-(int) pi->align | pi->misalign, precision);
+   return wi::shwi (-(HOST_WIDE_INT) pi->align
+                    | (HOST_WIDE_INT) pi->misalign, precision);
   return wi::shwi (-1, precision);
 }
 

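The extension problem fixed above can be reproduced in plain C. This is a sketch under the assumption of a 32-bit int, with int64_t standing in for HOST_WIDE_INT; the function names are hypothetical and are not GCC's.

```c
#include <assert.h>
#include <stdint.h>

/* int | unsigned converts the negative left operand to unsigned, so
   widening the result to 64 bits zero-extends it.  */
int64_t
mask_narrow (unsigned int align, unsigned int misalign)
{
  return -(int) align | misalign;
}

/* Widening both operands first keeps the arithmetic signed, so the
   high bits stay set.  */
int64_t
mask_wide (unsigned int align, unsigned int misalign)
{
  return -(int64_t) align | (int64_t) misalign;
}

/* For an 8-byte alignment the two forms differ in the upper 32 bits.  */
void
mask_selftest (void)
{
  assert (mask_narrow (8, 0) == INT64_C (0xfffffff8));
  assert (mask_wide (8, 0) == -8);
}
```

The narrow form loses the upper bits of the mask, which is what the wi::shwi call was receiving before the patch.
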

Re: [C PATCH] Don't reject valid code with _Alignas (PR c/61053)

2014-05-07 Thread H.J. Lu
On Wed, May 7, 2014 at 10:15 AM, Joseph S. Myers
jos...@codesourcery.com wrote:
 On Mon, 5 May 2014, Marek Polacek wrote:

 In this PR the issue is that we reject (valid) code such as
 _Alignas (long long) long long foo;
 with -m32, because we trip this condition:

    alignas_align = 1U << declspecs->align_log;
    if (alignas_align < TYPE_ALIGN_UNIT (type))
      {
        if (name)
          error_at (loc, "%<_Alignas%> specifiers cannot reduce "
                    "alignment of %qE", name);

 and error later on, since alignas_align is 4 (correct, see PR52023 for
 why), but TYPE_ALIGN_UNIT of long long is 8.  I think TYPE_ALIGN_UNIT
 is wrong here as that won't give us minimal alignment required.
 In c_sizeof_or_alignof_type we already have the code to compute such
 minimal alignment so I just moved the code to a separate function
 and used that instead of TYPE_ALIGN_UNIT.

 Note that the test is run only on i?86 and x86_64, because we can't (?)
 easily determine which target requires what alignment.

 Regtested/bootstrapped on x86_64-unknown-linux-gnu and
 powerpc64-unknown-linux-gnu, ok for trunk?

 OK, though I'm not sure if the lp64 conditions are right in the testcase

It should be !ia32 instead of lp64.

 (i.e. if x32 has the same peculiarity as -m32 here, which is what's
 implied by the use of lp64).


Alignments of long long and long double on x32 are the same as x86-64.

-- 
H.J.


Re: [C++ Patch] PR 61083

2014-05-07 Thread Jason Merrill

On 05/07/2014 01:15 PM, Paolo Carlini wrote:

> curiously, convert_nontype_argument still has most of its error calls
> not protected by complain & tf_error. The obvious fix works for this
> SFINAE issue. Not a regression, but could be safe for the branch too?


Sure, OK for trunk and 4.9.

Jason



[patch libgcc]: Fix PR c++/57440

2014-05-07 Thread Kai Tietz
Hi,

this patch defines _GTHREAD_USE_MUTEX_INIT_FUNC for Windows targets,
which is necessary because the pthread emulation for those targets
only handles pthread_mutex_init and pthread_mutex_destroy properly.

ChangeLog libgcc

2014-05-07  Kai Tietz  kti...@redhat.com

PR c++/57440
* gthr-posix.h (_GTHREAD_USE_MUTEX_INIT_FUNC): Define for native windows
targets.

The patch has already passed regression testing on x86_64-unknown-linux-gnu.
Testing on i686-w64-mingw32 is still running (with the posix threading
model).  OK to apply this patch after the last test passes?

Regards,
Kai



Index: gthr-posix.h
===
--- gthr-posix.h (revision 210070)
+++ gthr-posix.h (working copy)
@@ -34,6 +34,10 @@ see the files COPYING3 and COPYING.RUNTIME respect

 #include <pthread.h>

+#if defined (_WIN32) && !defined (__CYGWIN__)
+#define _GTHREAD_USE_MUTEX_INIT_FUNC 1
+#endif
+
 #if ((defined(_LIBOBJC) || defined(_LIBOBJC_WEAK)) \
  || !defined(_GTHREAD_USE_MUTEX_TIMEDLOCK))
 # include <unistd.h>


Re: [PATCH, MIPS] Alter default number of single-precision registers

2014-05-07 Thread Richard Sandiford
Matthew Fortune matthew.fort...@imgtec.com writes:
 diff --git a/gcc/testsuite/gcc.target/mips/oddspreg-6.c 
 b/gcc/testsuite/gcc.target/mips/oddspreg-6.c
 new file mode 100644
 index 000..2d1b129
 --- /dev/null
 +++ b/gcc/testsuite/gcc.target/mips/oddspreg-6.c
 @@ -0,0 +1,15 @@
 +/* Check that we disable odd-numbered single precision registers and can
 +   still generate code.  */
 +/* { dg-options "-mabi=64 -mno-odd-spreg -mhard-float" } */

"Check that we enable odd-numbered single precision registers." for this one?

OK otherwise once the copyright is sorted out, thanks.

Richard


[jit] Add a soname

2014-05-07 Thread David Malcolm
gcc/jit/
* Make-lang.in (LIBGCCJIT_LINKER_NAME): New.
(LIBGCCJIT_VERSION_NUM): New.
(LIBGCCJIT_MINOR_NUM): New.
(LIBGCCJIT_RELEASE_NUM): New.
(LIBGCCJIT_SONAME): New.
(LIBGCCJIT_FILENAME): New.
(LIBGCCJIT_LINKER_NAME_SYMLINK): New.
(LIBGCCJIT_SONAME_SYMLINK): New.
(jit): Add symlink targets.
(libgccjit.so): Convert to...
(LIBGCCJIT_FILENAME): ...and add a soname.
(jit.install-common): Install the library with a soname, and
symlinks.  Install libgccjit++.h.
---
 gcc/jit/ChangeLog.jit | 16 
 gcc/jit/Make-lang.in  | 38 +-
 2 files changed, 49 insertions(+), 5 deletions(-)

diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index ccf8a10..f5c4742 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,3 +1,19 @@
+2014-05-07  David Malcolm  dmalc...@redhat.com
+
+   * Make-lang.in (LIBGCCJIT_LINKER_NAME): New.
+   (LIBGCCJIT_VERSION_NUM): New.
+   (LIBGCCJIT_MINOR_NUM): New.
+   (LIBGCCJIT_RELEASE_NUM): New.
+   (LIBGCCJIT_SONAME): New.
+   (LIBGCCJIT_FILENAME): New.
+   (LIBGCCJIT_LINKER_NAME_SYMLINK): New.
+   (LIBGCCJIT_SONAME_SYMLINK): New.
+   (jit): Add symlink targets.
+   (libgccjit.so): Convert to...
+   (LIBGCCJIT_FILENAME): ...and add a soname.
+   (jit.install-common): Install the library with a soname, and
+   symlinks.  Install libgccjit++.h.
+
 2014-04-25  David Malcolm  dmalc...@redhat.com
 
* internal-api.c (gcc::jit::playback::context::compile): Put
diff --git a/gcc/jit/Make-lang.in b/gcc/jit/Make-lang.in
index 776ee81..ce0cdc5 100644
--- a/gcc/jit/Make-lang.in
+++ b/gcc/jit/Make-lang.in
@@ -40,7 +40,18 @@
 # into the jit rule, but that needs a little bit of work
 # to do the right thing within all.cross.
 
-jit: libgccjit.so
+LIBGCCJIT_LINKER_NAME = libgccjit.so
+LIBGCCJIT_VERSION_NUM = 0
+LIBGCCJIT_MINOR_NUM = 0
+LIBGCCJIT_RELEASE_NUM = 1
+LIBGCCJIT_SONAME = $(LIBGCCJIT_LINKER_NAME).$(LIBGCCJIT_VERSION_NUM)
+LIBGCCJIT_FILENAME = \
+  $(LIBGCCJIT_SONAME).$(LIBGCCJIT_MINOR_NUM).$(LIBGCCJIT_RELEASE_NUM)
+
+LIBGCCJIT_LINKER_NAME_SYMLINK = $(LIBGCCJIT_LINKER_NAME)
+LIBGCCJIT_SONAME_SYMLINK = $(LIBGCCJIT_SONAME)
+
+jit: $(LIBGCCJIT_FILENAME) $(LIBGCCJIT_SYMLINK) $(LIBGCCJIT_LINKER_NAME_SYMLINK)
 
 # Tell GNU make to ignore these if they exist.
 .PHONY: jit
@@ -53,14 +64,21 @@ jit-warn = $(STRICT_WARN)
 
 # We avoid using $(BACKEND) from Makefile.in in order to avoid pulling
 # in main.o
-libgccjit.so: $(jit_OBJS) \
+$(LIBGCCJIT_FILENAME): $(jit_OBJS) \
libbackend.a libcommon-target.a libcommon.a \
$(CPPLIB) $(LIBDECNUMBER) \
$(LIBDEPS) $(srcdir)/jit/libgccjit.map
+$(LLINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) -o $@ -shared \
 $(jit_OBJS) libbackend.a libcommon-target.a libcommon.a \
 $(CPPLIB) $(LIBDECNUMBER) $(LIBS) $(BACKENDLIBS) \
--Wl,--version-script=$(srcdir)/jit/libgccjit.map
+-Wl,--version-script=$(srcdir)/jit/libgccjit.map \
+-Wl,-soname,$(LIBGCCJIT_SONAME)
+
+$(LIBGCCJIT_SONAME_SYMLINK): $(LIBGCCJIT_FILENAME)
+   ln -sf $(LIBGCCJIT_FILENAME) $(LIBGCCJIT_SONAME_SYMLINK)
+
+$(LIBGCCJIT_LINKER_NAME_SYMLINK): $(LIBGCCJIT_SONAME_SYMLINK)
+   ln -sf $(LIBGCCJIT_SONAME_SYMLINK) $(LIBGCCJIT_LINKER_NAME_SYMLINK)
 
 #
 # Build hooks:
@@ -87,8 +105,18 @@ jit.srcman:
 #
 # Install hooks:
 jit.install-common: installdirs
-   $(INSTALL_PROGRAM) libgccjit.so $(DESTDIR)/$(libdir)/libgccjit.so
-   $(INSTALL_PROGRAM) $(srcdir)/jit/libgccjit.h $(DESTDIR)/$(includedir)/libgccjit.h
+   $(INSTALL_PROGRAM) $(LIBGCCJIT_FILENAME) \
+ $(DESTDIR)/$(libdir)/$(LIBGCCJIT_FILENAME)
+   ln -sf \
+ $(LIBGCCJIT_FILENAME) \
+ $(DESTDIR)/$(libdir)/$(LIBGCCJIT_SONAME_SYMLINK)
+   ln -sf \
+ $(LIBGCCJIT_SONAME_SYMLINK)\
+ $(DESTDIR)/$(libdir)/$(LIBGCCJIT_LINKER_NAME_SYMLINK)
+   $(INSTALL_PROGRAM) $(srcdir)/jit/libgccjit.h \
+ $(DESTDIR)/$(includedir)/libgccjit.h
+   $(INSTALL_PROGRAM) $(srcdir)/jit/libgccjit++.h \
+ $(DESTDIR)/$(includedir)/libgccjit++.h
 
 jit.install-man:
 
-- 
1.8.5.3



RE: [PATCH, MIPS] Alter default number of single-precision registers

2014-05-07 Thread Matthew Fortune
Richard Sandiford rdsandif...@googlemail.com writes:
 Matthew Fortune matthew.fort...@imgtec.com writes:
  diff --git a/gcc/testsuite/gcc.target/mips/oddspreg-6.c
 b/gcc/testsuite/gcc.target/mips/oddspreg-6.c
  new file mode 100644
  index 000..2d1b129
  --- /dev/null
  +++ b/gcc/testsuite/gcc.target/mips/oddspreg-6.c
  @@ -0,0 +1,15 @@
  +/* Check that we disable odd-numbered single precision registers and can
  +   still generate code.  */
  +/* { dg-options "-mabi=64 -mno-odd-spreg -mhard-float" } */
 
 "Check that we enable odd-numbered single precision registers." for this one?

Yes.

 OK otherwise once the copyright is sorted out, thanks.
 
 Richard


Committed: [PATCH 19/89] Const-correctness of gimple_call_builtin_p

2014-05-07 Thread David Malcolm
On Mon, 2014-04-21 at 12:56 -0400, David Malcolm wrote:
 gcc/
   * gimple.h (gimple_builtin_call_types_compatible_p): Accept a
   const_gimple, rather than a gimple.
   (gimple_call_builtin_p): Likewise, for the three variants.
 
   * gimple.c (gimple_builtin_call_types_compatible_p): Likewise.
   (gimple_call_builtin_p): Likewise, for the three variants.
 ---
  gcc/gimple.c | 8 
  gcc/gimple.h | 8 
  2 files changed, 8 insertions(+), 8 deletions(-)
 
 diff --git a/gcc/gimple.c b/gcc/gimple.c
 index 13c5a08..943fa7c 100644
 --- a/gcc/gimple.c
 +++ b/gcc/gimple.c
 @@ -2383,7 +2383,7 @@ validate_type (tree type1, tree type2)
 a decl of a builtin function.  */
  
  bool
 -gimple_builtin_call_types_compatible_p (gimple stmt, tree fndecl)
 +gimple_builtin_call_types_compatible_p (const_gimple stmt, tree fndecl)
  {
gcc_checking_assert (DECL_BUILT_IN_CLASS (fndecl) != NOT_BUILT_IN);
  
 @@ -2412,7 +2412,7 @@ gimple_builtin_call_types_compatible_p (gimple stmt, tree fndecl)
  /* Return true when STMT is builtins call.  */
  
  bool
 -gimple_call_builtin_p (gimple stmt)
 +gimple_call_builtin_p (const_gimple stmt)
  {
tree fndecl;
if (is_gimple_call (stmt)
 @@ -2425,7 +2425,7 @@ gimple_call_builtin_p (gimple stmt)
  /* Return true when STMT is builtins call to CLASS.  */
  
  bool
 -gimple_call_builtin_p (gimple stmt, enum built_in_class klass)
 +gimple_call_builtin_p (const_gimple stmt, enum built_in_class klass)
  {
tree fndecl;
if (is_gimple_call (stmt)
 @@ -2438,7 +2438,7 @@ gimple_call_builtin_p (gimple stmt, enum built_in_class klass)
  /* Return true when STMT is builtins call to CODE of CLASS.  */
  
  bool
 -gimple_call_builtin_p (gimple stmt, enum built_in_function code)
 +gimple_call_builtin_p (const_gimple stmt, enum built_in_function code)
  {
tree fndecl;
if (is_gimple_call (stmt)
 diff --git a/gcc/gimple.h b/gcc/gimple.h
 index a8a8d72..62f9756 100644
 --- a/gcc/gimple.h
 +++ b/gcc/gimple.h
 @@ -1458,10 +1458,10 @@ extern tree gimple_unsigned_type (tree);
  extern tree gimple_signed_type (tree);
  extern alias_set_type gimple_get_alias_set (tree);
  extern bool gimple_ior_addresses_taken (bitmap, gimple);
 -extern bool gimple_builtin_call_types_compatible_p (gimple, tree);
 -extern bool gimple_call_builtin_p (gimple);
 -extern bool gimple_call_builtin_p (gimple, enum built_in_class);
 -extern bool gimple_call_builtin_p (gimple, enum built_in_function);
 +extern bool gimple_builtin_call_types_compatible_p (const_gimple, tree);
 +extern bool gimple_call_builtin_p (const_gimple);
 +extern bool gimple_call_builtin_p (const_gimple, enum built_in_class);
 +extern bool gimple_call_builtin_p (const_gimple, enum built_in_function);
  extern bool gimple_asm_clobbers_memory_p (const_gimple);
  extern void dump_decl_set (FILE *, bitmap);
  extern bool nonfreeing_call_p (gimple);

Successfully bootstrapped & regtested on its own on
x86_64-unknown-linux-gnu (Fedora 20).

Committed to trunk as r210185 (this is just fixing const-correctness,
and so it falls under Jeff's preapproval for such fixes here:
  http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01240.html )




Re: [patch libgcc]: Fix PR c++/57440

2014-05-07 Thread Jonathan Wakely
On 7 May 2014 20:06, Kai Tietz wrote:

 PR c++/57440

N.B. that should be libstdc++/57440 in the ChangeLog


[SH, committed] PR 60884 - reduce code size of inlined strlen

2014-05-07 Thread Oleg Endo
Hi,

The attached patch reduces the code size of inlined builtin strlen
functions on SH a little bit.
Tested on r210083 with
make -k check RUNTESTFLAGS="--target_board=sh-sim
\{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"

and no new failures, except for gcc.target/sh/pr53976-1.c on SH2 and
SH2A.  Using builtin strlen for checking the sett/clrt optimization pass
was a bit inappropriate in this case.

Committed as r210187.

Cheers,
Oleg

gcc/ChangeLog:
PR target/60884
* config/sh/sh-mem.cc (sh_expand_strlen): Use loop when emitting
unrolled byte insns.  Emit address increments after move insns.

gcc/testsuite/ChangeLog:
PR target/60884
* gcc.target/sh/pr53976-1.c (test_02): Remove inappropriate test case.
(test_03): Rename to test_02.
Index: gcc/testsuite/gcc.target/sh/pr53976-1.c
===
--- gcc/testsuite/gcc.target/sh/pr53976-1.c	(revision 210185)
+++ gcc/testsuite/gcc.target/sh/pr53976-1.c	(working copy)
@@ -24,15 +24,8 @@
 }
 
 int
-test_02 (const char* a)
+test_02 (int a, int b, int c, int d)
 {
-  /* Must not see a sett after the inlined strlen.  */
-  return __builtin_strlen (a);
-}
-
-int
-test_03 (int a, int b, int c, int d)
-{
   /* One of the blocks should have a sett and the other one should not.  */
   if (d  4)
 return a + b + 1;
Index: gcc/config/sh/sh-mem.cc
===
--- gcc/config/sh/sh-mem.cc	(revision 210185)
+++ gcc/config/sh/sh-mem.cc	(working copy)
@@ -568,7 +568,7 @@
 
   addr1 = adjust_automodify_address (addr1, SImode, current_addr, 0);
 
-  /*start long loop.  */
+  /* start long loop.  */
   emit_label (L_loop_long);
 
   /* tmp1 is aligned, OK to load.  */
@@ -589,29 +589,15 @@
   addr1 = adjust_address (addr1, QImode, 0);
 
   /* unroll remaining bytes.  */
-  emit_insn (gen_extendqisi2 (tmp1, addr1));
-  emit_insn (gen_cmpeqsi_t (tmp1, const0_rtx));
-  jump = emit_jump_insn (gen_branch_true (L_return));
-  add_int_reg_note (jump, REG_BR_PROB, prob_likely);
+  for (int i = 0; i < 4; ++i)
+    {
+      emit_insn (gen_extendqisi2 (tmp1, addr1));
+      emit_move_insn (current_addr, plus_constant (Pmode, current_addr, 1));
+      emit_insn (gen_cmpeqsi_t (tmp1, const0_rtx));
+      jump = emit_jump_insn (gen_branch_true (L_return));
+      add_int_reg_note (jump, REG_BR_PROB, prob_likely);
+    }
 
-  emit_move_insn (current_addr, plus_constant (Pmode, current_addr, 1));
-
-  emit_insn (gen_extendqisi2 (tmp1, addr1));
-  emit_insn (gen_cmpeqsi_t (tmp1, const0_rtx));
-  jump = emit_jump_insn (gen_branch_true (L_return));
-  add_int_reg_note (jump, REG_BR_PROB, prob_likely);
-
-  emit_move_insn (current_addr, plus_constant (Pmode, current_addr, 1));
-
-  emit_insn (gen_extendqisi2 (tmp1, addr1));
-  emit_insn (gen_cmpeqsi_t (tmp1, const0_rtx));
-  jump = emit_jump_insn (gen_branch_true (L_return));
-  add_int_reg_note (jump, REG_BR_PROB, prob_likely);
-
-  emit_move_insn (current_addr, plus_constant (Pmode, current_addr, 1));
-
-  emit_insn (gen_extendqisi2 (tmp1, addr1));
-  jump = emit_jump_insn (gen_jump_compact (L_return));
   emit_barrier_after (jump);
 
   /* start byte loop.  */
@@ -626,10 +612,9 @@
 
   /* end loop.  */
 
-  emit_insn (gen_addsi3 (start_addr, start_addr, GEN_INT (1)));
-
   emit_label (L_return);
 
+  emit_insn (gen_addsi3 (start_addr, start_addr, GEN_INT (1)));
   emit_insn (gen_subsi3 (operands[0], current_addr, start_addr));
 
   return true;


Re: [patch libgcc]: Fix PR c++/57440

2014-05-07 Thread Kai Tietz
2014-05-07 21:41 GMT+02:00 Jonathan Wakely <jwakely@gmail.com>:
> On 7 May 2014 20:06, Kai Tietz wrote:
>
>> PR c++/57440
>
> N.B. that should be libstdc++/57440 in the ChangeLog

Oh, yes of course.

Thanks.
Kai


RFC: Faster for_each_rtx-like iterators

2014-05-07 Thread Richard Sandiford
I noticed for_each_rtx showing up in profiles and thought I'd have a go
at using worklist-based iterators instead.  So far I have three:

  FOR_EACH_SUBRTX: iterates over const_rtx subrtxes of a const_rtx
  FOR_EACH_SUBRTX_VAR: iterates over rtx subrtxes of an rtx
  FOR_EACH_SUBRTX_PTR: iterates over subrtx pointers of an rtx *

with FOR_EACH_SUBRTX_PTR being the direct for_each_rtx replacement.

I made FOR_EACH_SUBRTX the default (unsuffixed) version because
most walks really don't modify the structure.  I think we should
encourage const_rtxes to be used wherever possible.  E.g. it might
make it easier to have non-GC storage for temporary rtxes in future.

I've locally replaced all for_each_rtx calls in the generic code with
these iterators and they make things reproducibly faster.  The speed-up
on full --enable-checking=release ./cc1 and ./cc1plus times is only about 1%,
but maybe that's enough to justify the churn.

Implementation-wise, the main observation is that most subrtxes are part
of a single contiguous sequence of "e" fields.  E.g. when compiling an
oldish combine.ii on x86_64-linux-gnu with -O2, we iterate over the
subrtxes of 7,636,542 rtxes.  Of those:

(A) 4,459,135 (58.4%) are leaf rtxes with no "e" or "E" fields,
(B) 3,133,875 (41.0%) are rtxes with a single block of "e" fields and
      no "E" fields, and
(C)    43,532 (00.6%) are more complicated.

(A) is really a special case of (B) in which the block has zero length.
Those are the only two cases that really need to be handled inline.
The implementation does this by having a mapping from an rtx code to the
bounds of its "e" sequence, in the form of a start index and count.

Out of (C), the vast majority (43,509) are PARALLELs.  However, as you'd
probably expect, bloating the inline code with that case made things
slower rather than faster.

The vast majority (in fact all in the combine.ii run above) of iterations
can be done with a 16-element stack worklist.  We obviously still need a
heap fallback for the pathological cases though.

I spent a bit of time trying different iterator implementations and
seeing which produced the best code.  Specific results from that were:

- The storage used for the worklist is separate from the iterator,
  in order to avoid capturing iterator fields.

- Although the natural type of the storage would be auto_vec <..., 16>,
  that produced some overhead compared with a separate stack array and heap
  vector pointer.  With the heap vector pointer, the only overhead is an
  assignment in the constructor and an "if (x) release (x)"-style sequence
  in the destructor.  I think the extra complication over auto_vec is worth
  it because in this case the heap version is so very rarely needed.

- Several existing for_each_rtx callbacks have something like:

    if (GET_CODE (x) == CONST)
      return -1;

  or:

    if (CONSTANT_P (x))
      return -1;

  to avoid walking subrtxes of constants.  That can be done without
  extra code checks and branches by having a separate code-bound
  mapping in which all constants are treated as leaf rtxes.  This usage
  should be common enough to outweigh the cache penalty of two arrays.

  The choice between iterating over constants or not is given in the
  final parameter of the FOR_EACH_* iterator.

- The maximum number of fields in (B)-type rtxes is 3.  We get better
  code by making that explicit rather than having a general loop.

- (C) codes map to an "e" count of UCHAR_MAX, so we can use a single
  check to test for that and for cases where the stack worklist is
  too small.
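
The worklist idea in the points above can be sketched roughly as follows.
This is only an illustration under stated assumptions: Node, Worklist and
count_nodes are hypothetical names invented here, not the types or macros
from the patch, and a std::vector of children stands in for the "e" field
bounds table.  It shows a 16-element stack array with a heap fallback that
is allocated only on overflow, plus a predicate that plays the role of
"return -1" (skip the subtrees of this node).

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical tree node standing in for an rtx; not GCC's data structure.
struct Node {
  int code;
  std::vector<const Node *> kids;
};

// Worklist storage kept separate from the traversal logic: a small stack
// array for the common case and a heap vector used only after overflow.
class Worklist {
public:
  static constexpr std::size_t LOCAL_SIZE = 16;

  void push(const Node *n) {
    if (!heap_ && count_ < LOCAL_SIZE) {
      local_[count_++] = n;
      return;
    }
    if (!heap_)  // First overflow: move the pending entries to the heap.
      heap_.reset(new std::vector<const Node *>(local_, local_ + count_));
    heap_->push_back(n);
    ++count_;
  }

  const Node *pop() {
    assert(count_ > 0);
    --count_;
    if (heap_) {
      const Node *n = heap_->back();
      heap_->pop_back();
      return n;
    }
    return local_[count_];
  }

  bool empty() const { return count_ == 0; }
  bool used_heap() const { return heap_ != nullptr; }

private:
  const Node *local_[LOCAL_SIZE];
  std::size_t count_ = 0;
  std::unique_ptr<std::vector<const Node *>> heap_;
};

// Visit ROOT and its descendants in pre-order and return how many nodes
// were visited.  Children of nodes for which SKIP_CHILDREN returns true
// are not walked.
template <typename Pred>
std::size_t count_nodes(const Node *root, Pred skip_children) {
  Worklist work;
  std::size_t visited = 0;
  work.push(root);
  while (!work.empty()) {
    const Node *x = work.pop();
    ++visited;
    if (skip_children(*x))
      continue;
    // Push in reverse so that children are popped left to right.
    for (auto it = x->kids.rbegin(); it != x->kids.rend(); ++it)
      work.push(*it);
  }
  return visited;
}
```

The point of the split is that the common case touches only the stack
array; the heap vector costs one null check on destruction unless a walk
really does overflow.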

To give an example:

/* Callback for for_each_rtx, that returns 1 upon encountering a VALUE
   whose UID is greater than the int uid that D points to.  */

static int
refs_newer_value_cb (rtx *x, void *d)
{
  if (GET_CODE (*x) == VALUE && CSELIB_VAL_PTR (*x)->uid > *(int *)d)
    return 1;

  return 0;
}

/* Return TRUE if EXPR refers to a VALUE whose uid is greater than
   that of V.  */

static bool
refs_newer_value_p (rtx expr, rtx v)
{
  int minuid = CSELIB_VAL_PTR (v)->uid;

  return for_each_rtx (&expr, refs_newer_value_cb, &minuid);
}

becomes:

/* Return TRUE if EXPR refers to a VALUE whose uid is greater than
   that of V.  */

static bool
refs_newer_value_p (const_rtx expr, rtx v)
{
  int minuid = CSELIB_VAL_PTR (v)->uid;
  subrtx_iterator::array_type array;
  FOR_EACH_SUBRTX (iter, array, expr, NONCONST)
    if (GET_CODE (*iter) == VALUE && CSELIB_VAL_PTR (*iter)->uid > minuid)
      return true;
  return false;
}

The iterator also allows subrtxes of a specific rtx to be skipped;
this is the equivalent of returning -1 from a for_each_rtx callback.
It also allows the current rtx to be replaced in the worklist by
another.  E.g.:

static void
mark_constants_in_pattern (rtx insn)
{
  subrtx_iterator::array_type array;
  FOR_EACH_SUBRTX (iter, array, PATTERN (insn), ALL)
{
  const_rtx x = *iter;
  if (GET_CODE (x) == SYMBOL_REF)
{
  if (CONSTANT_POOL_ADDRESS_P (x))
   

genattrtab error reporting

2014-05-07 Thread Mike Stump
genattrtab loses track of which file a given rtl came from during error
reporting.  A port that uses multiple .md files will tend to list the
last .md file processed instead of the correct .md file.  We preserve the
filename upon read and, during post-processing, reset the filename to the
right context as we process that context.

Ok?

2014-05-07  Mike Stump  mikest...@comcast.net

* genattrtab.c (struct insn_def): Add filename.
(convert_set_attr_alternative): Improve error message.
(check_defs): Ensure read_md_filename is set appropriately.
(gen_insn): Save read_md_filename.

diff --git a/gcc/genattrtab.c b/gcc/genattrtab.c
index 99b1b83..0f14b4d 100644
--- a/gcc/genattrtab.c
+++ b/gcc/genattrtab.c
@@ -139,6 +139,7 @@ struct insn_def
   rtx def; /* The DEFINE_...  */
   int insn_code;   /* Instruction number.  */
   int insn_index;  /* Expression number in file, for errors.  */
+  const char *filename;/* Filename.  */
   int lineno;  /* Line number.  */
   int num_alternatives;/* Number of alternatives.  */
   int vec_idx; /* Index of attribute vector in `def'.  */
@@ -1066,7 +1067,8 @@ convert_set_attr_alternative (rtx exp, struct insn_def *id)
   if (XVECLEN (exp, 1) != num_alt)
     {
       error_with_line (id->lineno,
-                      "bad number of entries in SET_ATTR_ALTERNATIVE");
+                      "bad number of entries in SET_ATTR_ALTERNATIVE, was %d expected %d",
+                      XVECLEN (exp, 1), num_alt);
       return NULL_RTX;
     }
 
@@ -1137,6 +1139,7 @@ check_defs (void)
       if (XVEC (id->def, id->vec_idx) == NULL)
        continue;
 
+      read_md_filename = id->filename;
       for (i = 0; i < XVECLEN (id->def, id->vec_idx); i++)
        {
          value = XVECEXP (id->def, id->vec_idx, i);
@@ -3280,6 +3283,7 @@ gen_insn (rtx exp, int lineno)
   id->next = defs;
   defs = id;
   id->def = exp;
+  id->filename = read_md_filename;
   id->lineno = lineno;
 
   switch (GET_CODE (exp))
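
The save-and-restore pattern this patch relies on can be shown in
isolation.  The sketch below is not genattrtab code: Def, record_def,
check_defs and current_filename are hypothetical stand-ins (the last one
playing the role of read_md_filename), chosen only to show why the
filename has to be captured when a definition is read and restored when
it is checked later.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the reader's "file currently being read" global.
static const char *current_filename = "";

// One recorded definition, tagged with the file that was current when it
// was read.
struct Def {
  const char *filename;
  int lineno;
  bool valid;
};

static std::vector<Def> defs;

// Reading phase: capture the current file alongside the line number.
void record_def(int lineno, bool valid) {
  assert(current_filename != nullptr);
  defs.push_back(Def{current_filename, lineno, valid});
}

// Checking phase: runs after every file has been read, so the global
// would otherwise still name the last file.  Restore it per definition
// before reporting.
std::vector<std::string> check_defs() {
  std::vector<std::string> errors;
  for (const Def &d : defs) {
    current_filename = d.filename;
    if (!d.valid)
      errors.push_back(std::string(current_filename) + ":"
                       + std::to_string(d.lineno) + ": bad definition");
  }
  return errors;
}
```

Without the assignment at the top of the loop, every message would be
attributed to whichever file happened to be read last.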


Re: RFC: Faster for_each_rtx-like iterators

2014-05-07 Thread Mike Stump
On May 7, 2014, at 1:52 PM, Richard Sandiford rdsandif...@googlemail.com wrote:
>
> I've locally replaced all for_each_rtx calls in the generic code with
> these iterators and they make things reproducibly faster.  The speed-up
> on full --enable-checking=release ./cc1 and ./cc1plus times is only about 1%,
> but maybe that's enough to justify the churn.

100 1% fixes would make the compiler 100% faster.  :-)  I think 1% is actually
a really good improvement.  If you have times for -O0, it would be
interesting to see them.


Re: [PATCH] AutoFDO patch for trunk

2014-05-07 Thread Xinliang David Li
Have you announced the AutoFDO profile tool to the gcc list?

David

On Wed, May 7, 2014 at 2:24 PM, Dehao Chen <de...@google.com> wrote:
> Hi,
>
> I'm planning to port the AutoFDO patch upstream. Attached is the
> prepared patch. You can also find the patch in
> http://codereview.appspot.com/99010043
>
> I've tested the patch with SPECCPU2006. For the CINT2006 benchmarks,
> the speedup comparison between O2, FDO and AutoFDO is as follows:
>
> Reference: o2
> (1): auto_fdo
> (2): fdo
>
> Benchmark                          Base:Reference      (1)       (2)
> ---------------------------------------------------------------------
> spec/2006/int/C++/471.omnetpp               23.18   +3.11%    +5.09%
> spec/2006/int/C++/473.astar                 21.15   +6.79%    +9.80%
> spec/2006/int/C++/483.xalancbmk             36.68  +11.56%   +14.47%
> spec/2006/int/C/400.perlbench               34.57   +6.59%   +18.56%
> spec/2006/int/C/401.bzip2                   23.17   +0.95%    +2.49%
> spec/2006/int/C/403.gcc                     32.33   +8.27%    +9.76%
> spec/2006/int/C/429.mcf                     42.13   +4.72%    +5.23%
> spec/2006/int/C/445.gobmk                   26.53   -1.39%    +0.05%
> spec/2006/int/C/456.hmmer                   23.72   +7.12%    +7.87%
> spec/2006/int/C/458.sjeng                   26.17   +4.65%    +6.04%
> spec/2006/int/C/462.libquantum              57.23   +4.04%    +1.42%
> spec/2006/int/C/464.h264ref                 46.3    +1.07%    +8.97%
>
> geometric mean                                      +4.73%    +7.36%
>
> The majority of the performance difference between AutoFDO and FDO
> comes from the lack of instruction level discriminator support. Cary
> Coutant is planning to port that patch upstream too.
>
> Please let me know if you have any question about this patch, and
> thanks in advance for reviewing such a huge patch.
>
> Dehao
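
For readers who want to check the summary row of the quoted table, here is
a small sketch of how a geometric mean of percentage speedups can be
computed.  It assumes the summary row is the geometric mean of the
per-benchmark ratios (1 + p/100); the function and variable names are
made up for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Geometric mean of percentage speedups: turn each percentage into a
// ratio, average the logarithms, and convert back to a percentage.
double geomean_percent(const std::vector<double> &percents) {
  assert(!percents.empty());
  double log_sum = 0.0;
  for (double p : percents)
    log_sum += std::log(1.0 + p / 100.0);
  return (std::exp(log_sum / percents.size()) - 1.0) * 100.0;
}

// Columns (1) and (2) of the quoted table, in row order.
const std::vector<double> auto_fdo = {3.11, 6.79, 11.56, 6.59, 0.95, 8.27,
                                      4.72, -1.39, 7.12, 4.65, 4.04, 1.07};
const std::vector<double> fdo = {5.09, 9.80, 14.47, 18.56, 2.49, 9.76,
                                 5.23, 0.05, 7.87, 6.04, 1.42, 8.97};
```

Under that assumption both columns reproduce the quoted figures to within
rounding.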


libgo patch committed: Define CLONE flags in syscall package

2014-05-07 Thread Ian Lance Taylor
Dominik Vogt pointed out that the gccgo syscall package does not define
the CLONE flags.  This patch defines them.  Bootstrapped and ran Go
testsuite on x86_64-unknown-linux-gnu.  Committed to mainline and 4.9
branch.

Ian

diff -r c8ae29f0c4c6 libgo/configure.ac
--- a/libgo/configure.ac	Tue May 06 12:23:00 2014 -0700
+++ b/libgo/configure.ac	Wed May 07 14:40:49 2014 -0700
@@ -475,7 +475,7 @@
   ;;
 esac
 
-AC_CHECK_HEADERS(sys/file.h sys/mman.h syscall.h sys/epoll.h sys/inotify.h sys/ptrace.h sys/syscall.h sys/user.h sys/utsname.h sys/select.h sys/socket.h net/if.h net/if_arp.h net/route.h netpacket/packet.h sys/prctl.h sys/mount.h sys/vfs.h sys/statfs.h sys/timex.h sys/sysinfo.h utime.h linux/ether.h linux/fs.h linux/reboot.h netinet/icmp6.h netinet/in_syst.h netinet/ip.h netinet/ip_mroute.h netinet/if_ether.h)
+AC_CHECK_HEADERS(sched.h sys/file.h sys/mman.h syscall.h sys/epoll.h sys/inotify.h sys/ptrace.h sys/syscall.h sys/user.h sys/utsname.h sys/select.h sys/socket.h net/if.h net/if_arp.h net/route.h netpacket/packet.h sys/prctl.h sys/mount.h sys/vfs.h sys/statfs.h sys/timex.h sys/sysinfo.h utime.h linux/ether.h linux/fs.h linux/reboot.h netinet/icmp6.h netinet/in_syst.h netinet/ip.h netinet/ip_mroute.h netinet/if_ether.h)
 
 AC_CHECK_HEADERS([linux/filter.h linux/if_addr.h linux/if_ether.h linux/if_tun.h linux/netlink.h linux/rtnetlink.h], [], [],
 [#ifdef HAVE_SYS_SOCKET_H
diff -r c8ae29f0c4c6 libgo/mksysinfo.sh
--- a/libgo/mksysinfo.sh	Tue May 06 12:23:00 2014 -0700
+++ b/libgo/mksysinfo.sh	Wed May 07 14:40:49 2014 -0700
@@ -163,6 +163,9 @@
 #if defined(HAVE_NETINET_ICMP6_H)
 #include <netinet/icmp6.h>
 #endif
+#if defined(HAVE_SCHED_H)
+#include <sched.h>
+#endif
 
 /* Constants that may only be defined as expressions on some systems,
expressions too complex for -fdump-go-spec to handle.  These are
@@ -1130,6 +1133,10 @@
       -e 's/\[0\]byte/[0]int8/' \
     >> ${OUT}
 
+# The GNU/Linux CLONE flags.
+grep '^const _CLONE_' gen-sysinfo.go | \
+  sed -e 's/^\(const \)_\(CLONE_[^= ]*\)\(.*\)$/\1\2 = _\2/' >> ${OUT}
+
 # The Solaris 11 Update 1 _zone_net_addr_t struct.
 grep '^type _zone_net_addr_t ' gen-sysinfo.go | \
 sed -e 's/_in6_addr/[16]byte/' \
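
To make the CLONE substitution in the hunk above concrete, here is the
same rewrite expressed with std::regex.  This is only an illustration of
what the pattern does to one line; the script itself uses grep and sed,
and the helper name and the sample input line are invented for the
example.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Rewrite "const _CLONE_X = <value>" as "const CLONE_X = _CLONE_X",
// i.e. export the name and define it in terms of the private constant.
// Lines that do not match are returned unchanged.
std::string export_clone_const(const std::string &line) {
  static const std::regex pattern("^(const )_(CLONE_[^= ]*)(.*)$");
  return std::regex_replace(line, pattern, "$1$2 = _$2");
}

// Usage example with a made-up input line.
void example() {
  assert(export_clone_const("const _CLONE_VM = 0x100")
         == "const CLONE_VM = _CLONE_VM");
}
```

The third group swallows the original value, so the exported constant
always refers back to the underscore-prefixed one instead of repeating the
number.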


libgo patch committed: Define more TIOC constants

2014-05-07 Thread Ian Lance Taylor
This patch to libgo defines more TIOC constants, ones that are
non-trivial constants on GNU/Linux systems.  Bootstrapped and ran Go
testsuite on x86_64-unknown-linux-gnu.  Committed to mainline and 4.9
branch.

Ian

diff -r bbf6c7c22954 libgo/mksysinfo.sh
--- a/libgo/mksysinfo.sh	Wed May 07 14:42:39 2014 -0700
+++ b/libgo/mksysinfo.sh	Wed May 07 14:58:48 2014 -0700
@@ -180,6 +180,18 @@
 #ifdef TIOCSCTTY
   TIOCSCTTY_val = TIOCSCTTY,
 #endif
+#ifdef TIOCGPTN
+  TIOCGPTN_val = TIOCGPTN,
+#endif
+#ifdef TIOCSPTLCK
+  TIOCSPTLCK_val = TIOCSPTLCK,
+#endif
+#ifdef TIOCGDEV
+  TIOCGDEV_val = TIOCGDEV,
+#endif
+#ifdef TIOCSIG
+  TIOCSIG_val = TIOCSIG,
+#endif
 };
 EOF
 
@@ -778,6 +790,26 @@
     echo 'const TIOCSCTTY = _TIOCSCTTY_val' >> ${OUT}
   fi
 fi
+if ! grep '^const TIOCGPTN' ${OUT} >/dev/null 2>&1; then
+  if grep '^const _TIOCGPTN_val' ${OUT} >/dev/null 2>&1; then
+    echo 'const TIOCGPTN = _TIOCGPTN_val' >> ${OUT}
+  fi
+fi
+if ! grep '^const TIOCSPTLCK' ${OUT} >/dev/null 2>&1; then
+  if grep '^const _TIOCSPTLCK_val' ${OUT} >/dev/null 2>&1; then
+    echo 'const TIOCSPTLCK = _TIOCSPTLCK_val' >> ${OUT}
+  fi
+fi
+if ! grep '^const TIOCGDEV' ${OUT} >/dev/null 2>&1; then
+  if grep '^const _TIOCGDEV_val' ${OUT} >/dev/null 2>&1; then
+    echo 'const TIOCGDEV = _TIOCGDEV_val' >> ${OUT}
+  fi
+fi
+if ! grep '^const TIOCSIG' ${OUT} >/dev/null 2>&1; then
+  if grep '^const _TIOCSIG_val' ${OUT} >/dev/null 2>&1; then
+    echo 'const TIOCSIG = _TIOCSIG_val' >> ${OUT}
+  fi
+fi
 
 # The ioctl flags for terminal control
 grep '^const _TC[GS]ET' gen-sysinfo.go | \


AutoFDO profile toolchain is open-sourced

2014-05-07 Thread Dehao Chen
We have open-sourced AutoFDO profile toolchain in:

https://github.com/google/autofdo

For GCC developers, the most important tool is create_gcov, which
converts sampling based profile to GCC-readable profile. Please refer
to the readme file
(https://raw.githubusercontent.com/google/autofdo/master/README) for
more details.

To use the profile, one needs to check out
https://gcc.gnu.org/svn/gcc/branches/google/gcc-4_8. We are working on
porting AutoFDO to trunk
(http://gcc.gnu.org/ml/gcc-patches/2014-05/msg00438.html).

We have limited documentation inside the open-sourced package, and we are
planning to add more content to the wiki page
(https://github.com/google/autofdo/wiki). Feel free to send me emails
or discuss on github if you have any questions.

Cheers,
Dehao


Re: [PATCH, PR58066] preferred_stack_boundary update for tls expanded call

2014-05-07 Thread Wei Mi
This is the updated patch of pr58066-3.patch.

The calls added in the templates of tls_local_dynamic_base_32 and
tls_global_dynamic_32 in pr58066-3.patch are used to prevent sched2
from moving sp setting across implicit tls calls, but those calls make
the combine of UNSPEC_TLS_LD_BASE and UNSPEC_DTPOFF difficult, so that
the optimization in tls_local_dynamic_32_once to convert local_dynamic
to global_dynamic mode for single tls reference cannot take effect. In
the updated patch, I remove those calls from insn templates and add
reg:SI SP_REG explicitly in the templates of UNSPEC_TLS_GD and
UNSPEC_TLS_LD_BASE. It solves the sched2 and combine problems above,
and now the optimization in tls_local_dynamic_32_once works.

Bootstrapped OK on x86_64-linux-gnu. Regression testing is in progress.
Is it OK if regression testing passes?

Thanks.
Wei.

ChangeLog:

gcc/
2014-05-07  Wei Mi  w...@google.com

* config/i386/i386.c (ix86_compute_frame_layout):
preferred_stack_boundary updated for tls expanded call.
* config/i386/i386.md: Set ix86_tls_descriptor_calls_expanded_in_cfun.

gcc/testsuite/
2014-05-07  Wei Mi  w...@google.com

* gcc.target/i386/pr58066.c: New test.

Index: testsuite/gcc.target/i386/pr58066.c
===
--- testsuite/gcc.target/i386/pr58066.c (revision 0)
+++ testsuite/gcc.target/i386/pr58066.c (revision 0)
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-fPIC -O2" } */
+
+/* Check whether the stack frame starting addresses of tls expanded calls
+   in foo and goo are 16bytes aligned.  */
+static __thread char ccc1;
+void* foo()
+{
+ return &ccc1;
+}
+
+__thread char ccc2;
+void* goo()
+{
+ return &ccc2;
+}
+
+/* { dg-final { scan-assembler-times ".cfi_def_cfa_offset 16" 2 } } */
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 209979)
+++ config/i386/i386.c  (working copy)
@@ -9485,20 +9485,30 @@ ix86_compute_frame_layout (struct ix86_f
   frame->nregs = ix86_nsaved_regs ();
   frame->nsseregs = ix86_nsaved_sseregs ();

-  stack_alignment_needed = crtl->stack_alignment_needed / BITS_PER_UNIT;
-  preferred_alignment = crtl->preferred_stack_boundary / BITS_PER_UNIT;
-
   /* 64-bit MS ABI seem to require stack alignment to be always 16 except for
      function prologues and leaf.  */
-  if ((TARGET_64BIT_MS_ABI && preferred_alignment < 16)
+  if ((TARGET_64BIT_MS_ABI && crtl->preferred_stack_boundary < 128)
       && (!crtl->is_leaf || cfun->calls_alloca != 0
           || ix86_current_function_calls_tls_descriptor))
     {
-      preferred_alignment = 16;
-      stack_alignment_needed = 16;
       crtl->preferred_stack_boundary = 128;
       crtl->stack_alignment_needed = 128;
     }
+  /* preferred_stack_boundary is never updated for call
+     expanded from tls descriptor. Update it here. We don't update it in
+     expand stage because according to the comments before
+     ix86_current_function_calls_tls_descriptor, tls calls may be optimized
+     away.  */
+  else if (ix86_current_function_calls_tls_descriptor
+           && crtl->preferred_stack_boundary < PREFERRED_STACK_BOUNDARY)
+    {
+      crtl->preferred_stack_boundary = PREFERRED_STACK_BOUNDARY;
+      if (crtl->stack_alignment_needed < PREFERRED_STACK_BOUNDARY)
+       crtl->stack_alignment_needed = PREFERRED_STACK_BOUNDARY;
+    }
+
+  stack_alignment_needed = crtl->stack_alignment_needed / BITS_PER_UNIT;
+  preferred_alignment = crtl->preferred_stack_boundary / BITS_PER_UNIT;

   gcc_assert (!size || stack_alignment_needed);
   gcc_assert (preferred_alignment >= STACK_BOUNDARY / BITS_PER_UNIT);
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 209979)
+++ config/i386/i386.md (working copy)
@@ -12530,7 +12530,8 @@
        (unspec:SI
         [(match_operand:SI 1 "register_operand" "b")
          (match_operand 2 "tls_symbolic_operand")
-         (match_operand 3 "constant_call_address_operand" "z")]
+         (match_operand 3 "constant_call_address_operand" "z")
+         (reg:SI SP_REG)]
         UNSPEC_TLS_GD))
    (clobber (match_scratch:SI 4 "=d"))
    (clobber (match_scratch:SI 5 "=c"))
@@ -12555,11 +12556,14 @@
     [(set (match_operand:SI 0 "register_operand")
          (unspec:SI [(match_operand:SI 2 "register_operand")
                      (match_operand 1 "tls_symbolic_operand")
-                     (match_operand 3 "constant_call_address_operand")]
+                     (match_operand 3 "constant_call_address_operand")
+                     (reg:SI SP_REG)]
                     UNSPEC_TLS_GD))
      (clobber (match_scratch:SI 4))
      (clobber (match_scratch:SI 5))
-     (clobber (reg:CC FLAGS_REG))])])
+     (clobber (reg:CC FLAGS_REG))])]
+  ""
+  "ix86_tls_descriptor_calls_expanded_in_cfun = true;")

 (define_insn "*tls_global_dynamic_64_<mode>"
   [(set (match_operand:P 0 "register_operand" "=a")
@@ -12614,13 +12618,15 @@
   (const_int 0)))
   

Re: genattrtab error reporting

2014-05-07 Thread H.J. Lu
On Wed, May 7, 2014 at 2:21 PM, Mike Stump mikest...@comcast.net wrote:
> genattrtab loses track of which file a given rtl came from during error
> reporting.  A port that uses multiple .md files will tend to list the
> last .md file processed instead of the correct .md file.  We preserve
> the filename upon read and, during post-processing, reset the filename to
> the right context as we process that context.


Does this fix

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31778

-- 
H.J.


Re: genattrtab error reporting

2014-05-07 Thread Mike Stump
On May 7, 2014, at 5:22 PM, H.J. Lu hjl.to...@gmail.com wrote:
> On Wed, May 7, 2014 at 2:21 PM, Mike Stump mikest...@comcast.net wrote:
>> genattrtab loses track of which file a given rtl came from during error
>> reporting.  A port that uses multiple .md files will tend to list the
>> last .md file processed instead of the correct .md file.  We preserve
>> the filename upon read and, during post-processing, reset the filename
>> to the right context as we process that context.
>
> Does this fix
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31778

Only if it is applied to the tree!  :-)  Yes.

[v3] Mini-tweak to acinclude.m4

2014-05-07 Thread Paolo Carlini

Hi,

I don't think we have any reason to trigger a -Wwrite-strings warning, 
thus, barring objections, I'm going to commit the below.


Thanks,
Paolo.

///
2014-05-08  Paolo Carlini  paolo.carl...@oracle.com

* acinclude.m4 ([GLIBCXX_ENABLE_C99]): Avoid -Wwrite-strings warning.
* configure: Regenerate.
Index: acinclude.m4
===
--- acinclude.m4(revision 210183)
+++ acinclude.m4(working copy)
@@ -1052,8 +1052,8 @@ AC_DEFUN([GLIBCXX_ENABLE_C99], [
            vscanf("%i", args);
            vsnprintf(fmt, 0, "%i", args);
            vsscanf(fmt, "%i", args);
-         }],
-        [snprintf("12", 0, "%i");],
+           snprintf(fmt, 0, "%i");
+         }], [],
  [glibcxx_cv_c99_stdio=yes], [glibcxx_cv_c99_stdio=no])
   ])
   AC_MSG_RESULT($glibcxx_cv_c99_stdio)


Fix some tests for TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL

2014-05-07 Thread Joseph S. Myers
Having fixed TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL to apply only to
128-bit vectors, some --with-arch=bdver3 --with-cpu=bdver3
scan-assembler failures relating to that tuning remain, because of
different choices of instructions for 128-bit vectors from the choices
expected by the tests.

This patch fixes affected tests to allow the different instruction
choices seen in this case.  Tested for x86_64-linux-gnu
(--with-arch=bdver3 --with-cpu=bdver3).  OK to commit?

2014-05-07  Joseph Myers  jos...@codesourcery.com

* gcc.target/i386/avx256-unaligned-load-2.c,
gcc.target/i386/pr49002-1.c, gcc.target/i386/pr53712.c,
gcc.target/i386/pr53907.c, gcc.target/i386/pr59539-1.c: Allow
packed-single instructions.

Index: gcc/testsuite/gcc.target/i386/pr59539-1.c
===
--- gcc/testsuite/gcc.target/i386/pr59539-1.c   (revision 210124)
+++ gcc/testsuite/gcc.target/i386/pr59539-1.c   (working copy)
@@ -13,4 +13,4 @@
   return _mm_movemask_epi8 (result);
 }
 
-/* { dg-final { scan-assembler-times "vmovdqu" 1 } } */
+/* { dg-final { scan-assembler-times "vmovdqu|vmovups" 1 } } */
Index: gcc/testsuite/gcc.target/i386/pr53712.c
===
--- gcc/testsuite/gcc.target/i386/pr53712.c (revision 210124)
+++ gcc/testsuite/gcc.target/i386/pr53712.c (working copy)
@@ -10,4 +10,4 @@
   return __builtin_ia32_pcmpistri128 (s1chars, s2chars, 0);
 }
 
-/* { dg-final { scan-assembler-times "movdqu" 1 } } */
+/* { dg-final { scan-assembler-times "movdqu|movups" 1 } } */
Index: gcc/testsuite/gcc.target/i386/avx256-unaligned-load-2.c
===
--- gcc/testsuite/gcc.target/i386/avx256-unaligned-load-2.c (revision 210124)
+++ gcc/testsuite/gcc.target/i386/avx256-unaligned-load-2.c (working copy)
@@ -11,5 +11,5 @@
 }
 
 /* { dg-final { scan-assembler-not "(avx_loaddqu256|vmovdqu\[^\n\r]*movv32qi_internal)" } } */
-/* { dg-final { scan-assembler "(sse2_loaddqu|vmovdqu\[^\n\r]*movv16qi_internal)" } } */
+/* { dg-final { scan-assembler "(sse2_loaddqu|(vmovdqu|vmovups)\[^\n\r]*movv16qi_internal)" } } */
 /* { dg-final { scan-assembler "vinsert.128" } } */
Index: gcc/testsuite/gcc.target/i386/pr49002-1.c
===
--- gcc/testsuite/gcc.target/i386/pr49002-1.c   (revision 210124)
+++ gcc/testsuite/gcc.target/i386/pr49002-1.c   (working copy)
@@ -13,4 +13,4 @@
 
 /* Ensure we load into xmm, not ymm.  */
 /* { dg-final { scan-assembler-not "vmovapd\[\t \]*\[^,\]*,\[\t \]*%ymm" } } */
-/* { dg-final { scan-assembler "vmovapd\[\t \]*\[^,\]*,\[\t \]*%xmm" } } */
+/* { dg-final { scan-assembler "vmovap\[ds\]\[\t \]*\[^,\]*,\[\t \]*%xmm" } } */
Index: gcc/testsuite/gcc.target/i386/pr53907.c
===
--- gcc/testsuite/gcc.target/i386/pr53907.c (revision 210124)
+++ gcc/testsuite/gcc.target/i386/pr53907.c (working copy)
@@ -13,4 +13,4 @@
   return sz;
 }
 
-/* { dg-final { scan-assembler "movdqa" } } */
+/* { dg-final { scan-assembler "movdqa|movaps" } } */

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: genattrtab error reporting

2014-05-07 Thread Segher Boessenkool
>> Does this fix
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31778
>
> Only if it is applied to the tree!  :-)  Yes.

It also is PR57062.  Thanks for fixing it!


Segher


Re: RFC: Faster for_each_rtx-like iterators

2014-05-07 Thread Trevor Saunders
On Wed, May 07, 2014 at 09:52:49PM +0100, Richard Sandiford wrote:
> I noticed for_each_rtx showing up in profiles and thought I'd have a go
> at using worklist-based iterators instead.  So far I have three:
>
>   FOR_EACH_SUBRTX: iterates over const_rtx subrtxes of a const_rtx
>   FOR_EACH_SUBRTX_VAR: iterates over rtx subrtxes of an rtx
>   FOR_EACH_SUBRTX_PTR: iterates over subrtx pointers of an rtx *
>
> with FOR_EACH_SUBRTX_PTR being the direct for_each_rtx replacement.
>
> I made FOR_EACH_SUBRTX the default (unsuffixed) version because
> most walks really don't modify the structure.  I think we should
> encourage const_rtxes to be used wherever possible.  E.g. it might
> make it easier to have non-GC storage for temporary rtxes in future.
>
> I've locally replaced all for_each_rtx calls in the generic code with
> these iterators and they make things reproducibly faster.  The speed-up
> on full --enable-checking=release ./cc1 and ./cc1plus times is only about 1%,
> but maybe that's enough to justify the churn.

seems pretty nice, and it seems like it'll make code a little more
readable too :)

> Implementation-wise, the main observation is that most subrtxes are part
> of a single contiguous sequence of "e" fields.  E.g. when compiling an
> oldish combine.ii on x86_64-linux-gnu with -O2, we iterate over the
> subrtxes of 7,636,542 rtxes.  Of those:
>
> (A) 4,459,135 (58.4%) are leaf rtxes with no "e" or "E" fields,
> (B) 3,133,875 (41.0%) are rtxes with a single block of "e" fields and
>       no "E" fields, and
> (C)    43,532 (00.6%) are more complicated.
>
> (A) is really a special case of (B) in which the block has zero length.
> Those are the only two cases that really need to be handled inline.
> The implementation does this by having a mapping from an rtx code to the
> bounds of its "e" sequence, in the form of a start index and count.
>
> Out of (C), the vast majority (43,509) are PARALLELs.  However, as you'd
> probably expect, bloating the inline code with that case made things
> slower rather than faster.
>
> The vast majority (in fact all in the combine.ii run above) of iterations
> can be done with a 16-element stack worklist.  We obviously still need a
> heap fallback for the pathological cases though.
>
> I spent a bit of time trying different iterator implementations and
> seeing which produced the best code.  Specific results from that were:
>
> - The storage used for the worklist is separate from the iterator,
>   in order to avoid capturing iterator fields.
>
> - Although the natural type of the storage would be auto_vec <..., 16>,
>   that produced some overhead compared with a separate stack array and heap
>   vector pointer.  With the heap vector pointer, the only overhead is an
>   assignment in the constructor and an "if (x) release (x)"-style sequence
>   in the destructor.  I think the extra complication over auto_vec is worth
>   it because in this case the heap version is so very rarely needed.

hm, where does the overhead come from exactly? it seems like if it's
faster to use vec<T, va_heap, vl_embed> *foo; we should fix something
about vectors since this isn't the only place it could matter.  does it
matter if you use vec<T, va_heap, vl_embed> * or vec<T> ? the second
is basically just a wrapper around the former, which I'd expect has no
effect.  I'm not saying you're doing the wrong thing here, but if we can
make generic vectors faster we probably should ;) or is the issue the
__builtin_expect()s you can add?

> - Several existing for_each_rtx callbacks have something like:
>
>     if (GET_CODE (x) == CONST)
>       return -1;
>
>   or:
>
>     if (CONSTANT_P (x))
>       return -1;
>
>   to avoid walking subrtxes of constants.  That can be done without
>   extra code checks and branches by having a separate code-bound
>   mapping in which all constants are treated as leaf rtxes.  This usage
>   should be common enough to outweigh the cache penalty of two arrays.
>
>   The choice between iterating over constants or not is given in the
>   final parameter of the FOR_EACH_* iterator.

less repetition \O/

> - The maximum number of fields in (B)-type rtxes is 3.  We get better
>   code by making that explicit rather than having a general loop.
>
> - (C) codes map to an "e" count of UCHAR_MAX, so we can use a single
>   check to test for that and for cases where the stack worklist is
>   too small.

 can we use uint8_t?

> To give an example:
>
> /* Callback for for_each_rtx, that returns 1 upon encountering a VALUE
>    whose UID is greater than the int uid that D points to.  */
>
> static int
> refs_newer_value_cb (rtx *x, void *d)
> {
>   if (GET_CODE (*x) == VALUE && CSELIB_VAL_PTR (*x)->uid > *(int *)d)
>     return 1;
>
>   return 0;
> }
>
> /* Return TRUE if EXPR refers to a VALUE whose uid is greater than
>    that of V.  */
>
> static bool
> refs_newer_value_p (rtx expr, rtx v)
> {
>   int minuid = CSELIB_VAL_PTR (v)->uid;
>
>   return for_each_rtx (&expr, refs_newer_value_cb, &minuid);
> }
>
> becomes:
>
> /* Return TRUE if EXPR refers to a VALUE whose 

[RS6000] Fix PR61098, Poor code setting count register

2014-05-07 Thread Alan Modra
On powerpc64, to set a large loop count we have code like the
following after split1:

(insn 67 14 68 4 (set (reg:DI 160)
        (const_int 99942400 [0x5f50000])) /home/amodra/unaligned_load.c:14 -1
     (nil))
(insn 68 67 42 4 (set (reg:DI 160)
        (ior:DI (reg:DI 160)
            (const_int 57600 [0xe100]))) /home/amodra/unaligned_load.c:14 -1
     (expr_list:REG_EQUAL (const_int 100000000 [0x5f5e100])
        (nil)))

and then test for loop exit with:

(jump_insn 65 31 45 5 (parallel [
            (set (pc)
                (if_then_else (ne (reg:DI 160)
                        (const_int 1 [0x1]))
                    (label_ref:DI 42)
                    (pc)))
            (set (reg:DI 160)
                (plus:DI (reg:DI 160)
                    (const_int -1 [0xffffffffffffffff])))
            (clobber (scratch:CC))
            (clobber (scratch:DI))
        ]) /home/amodra/unaligned_load.c:15 800 {*ctrdi_internal1}
     (int_list:REG_BR_PROB 9899 (nil))
 -> 42)

The jump_insn of course is meant for use with bdnz, which implies a
strong preference for reg 160 to live in the count register.  Trouble
is, the count register doesn't do arithmetic.

So, use a new pseudo for intermediate results.  On looking at this,
I noticed the !TARGET_POWERPC64 code in rs6000_emit_set_long_const was
broken, apparently expecting c1 and c2 to be the high and low 32 bits
of the constant.  That's no longer true, so I've fixed that as well.
Bootstrapped and regression tested powerpc64-linux.  OK for mainline
and branches?
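
As an aside, the arithmetic behind the two-insn sequence quoted above can
be checked in isolation.  The sketch below only models the split of a
small positive constant into the part above the low 16 bits and the low
16 bits, which are then combined with an OR; it is not the logic of
rs6000_emit_set_long_const, and SplitConst, split_const and recombine are
names invented for the illustration.

```cpp
#include <cassert>
#include <cstdint>

// The two pieces of a constant that is built with "load high part,
// then OR in the low 16 bits".
struct SplitConst {
  int64_t high;  // value with the low 16 bits cleared
  int64_t low;   // low 16 bits only
};

// Split a non-negative constant that fits in 31 bits.  Larger or negative
// constants need more care and are outside this sketch.
SplitConst split_const(int64_t c) {
  assert(c >= 0 && c <= 0x7fffffff);
  return SplitConst{c & ~INT64_C(0xffff), c & INT64_C(0xffff)};
}

// Recombine the two pieces the way the second insn does.
int64_t recombine(SplitConst s) { return s.high | s.low; }
```

For 100000000 (0x5f5e100) this gives 99942400 (0x5f50000) and 57600
(0xe100), matching the two const_int operands in the dump.  The change
described here is about where the intermediate value lives: in a fresh
pseudo rather than in the register that wants to be the count register.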

PR target/61098
* config/rs6000/rs6000.c (rs6000_emit_set_const): Remove unneeded
params and return value.  Simplify.  Update comment.
(rs6000_emit_set_long_const): Remove unneeded param and return
value.  Correct !TARGET_POWERPC64 handling of constants  2G.
If we can, use a new pseudo for intermediate calculations.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c  (revision 209926)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -1068,7 +1069,7 @@ static tree rs6000_handle_longcall_attribute (tree
 static tree rs6000_handle_altivec_attribute (tree *, tree, tree, int, bool *);
 static tree rs6000_handle_struct_attribute (tree *, tree, tree, int, bool *);
 static tree rs6000_builtin_vectorized_libmass (tree, tree, tree);
-static rtx rs6000_emit_set_long_const (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
+static void rs6000_emit_set_long_const (rtx, HOST_WIDE_INT);
 static int rs6000_memory_move_cost (enum machine_mode, reg_class_t, bool);
 static bool rs6000_debug_rtx_costs (rtx, int, int, int, int *, bool);
 static int rs6000_debug_address_cost (rtx, enum machine_mode, addr_space_t,
@@ -7826,53 +7811,36 @@ rs6000_conditional_register_usage (void)
 }
 
 
-/* Try to output insns to set TARGET equal to the constant C if it can
-   be done in less than N insns.  Do all computations in MODE.
-   Returns the place where the output has been placed if it can be
-   done and the insns have been emitted.  If it would take more than N
-   insns, zero is returned and no insns and emitted.  */
+/* Output insns to set DEST equal to the constant SOURCE.  */
 
-rtx
-rs6000_emit_set_const (rtx dest, enum machine_mode mode,
-  rtx source, int n ATTRIBUTE_UNUSED)
+void
+rs6000_emit_set_const (rtx dest, rtx source)
 {
-  rtx result, insn, set;
-  HOST_WIDE_INT c0, c1;
+  enum machine_mode mode = GET_MODE (dest);
+  rtx temp, insn, set;
+  HOST_WIDE_INT c;
 
+  gcc_checking_assert (CONST_INT_P (source));
+  c = INTVAL (source);
   switch (mode)
 {
-case  QImode:
+case QImode:
 case HImode:
-  if (dest == NULL)
-   dest = gen_reg_rtx (mode);
   emit_insn (gen_rtx_SET (VOIDmode, dest, source));
-  return dest;
+  return;
 
 case SImode:
-  result = !can_create_pseudo_p () ? dest : gen_reg_rtx (SImode);
+  temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (SImode);
 
-  emit_insn (gen_rtx_SET (VOIDmode, copy_rtx (result),
- GEN_INT (INTVAL (source)
-   & (~ (HOST_WIDE_INT) 0xffff))));
+  emit_insn (gen_rtx_SET (VOIDmode, copy_rtx (temp),
+ GEN_INT (c & (~ (HOST_WIDE_INT) 0xffff))));
   emit_insn (gen_rtx_SET (VOIDmode, dest,
- gen_rtx_IOR (SImode, copy_rtx (result),
-  GEN_INT (INTVAL (source) & 0xffff))));
-  result = dest;
+ gen_rtx_IOR (SImode, copy_rtx (temp),
+  GEN_INT (c & 0xffff))));
   break;
 
 case DImode:
-  switch (GET_CODE (source))
-   {
-   case CONST_INT:
- c0 = INTVAL (source);
- c1 = -(c0 < 0);
- break;
-
-   default:
- gcc_unreachable ();
-   }
-
-  result = rs6000_emit_set_long_const (dest, c0, c1);
+  

Re: [RS6000] PR60737, expand_block_clear uses word stores

2014-05-07 Thread Alan Modra
On Wed, May 07, 2014 at 01:39:50PM -0400, David Edelsohn wrote:
> On Tue, May 6, 2014 at 4:32 AM, Alan Modra amo...@gmail.com wrote:
> > BTW, the latest patch in my tree has a slight refinement, the
> > reload-by-hand addition.
> >
> > PR target/60737
> > * config/rs6000/rs6000.c (expand_block_move): Allow 64-bit
> > loads and stores when -mno-strict-align at any alignment.
> > (expand_block_clear): Similarly.  Also correct calculation of
> > instruction count.
>
> Based on results of your experiment, the revised patch is okay.
>
> You did not include gcc-patches in the distribution list for the revised
> patch.

Thanks, David.  Patch copied here for gcc-patches and committed
revision 210201.

PR target/60737
* config/rs6000/rs6000.c (expand_block_move): Allow 64-bit
loads and stores when -mno-strict-align at any alignment.
(expand_block_clear): Similarly.  Also correct calculation of
instruction count.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c  (revision 210200)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -15443,7 +15443,7 @@ expand_block_clear (rtx operands[])
  load zero and three to do clearing.  */
   if (TARGET_ALTIVEC && align >= 128)
 clear_step = 16;
-  else if (TARGET_POWERPC64 && align >= 32)
+  else if (TARGET_POWERPC64 && (align >= 64 || !STRICT_ALIGNMENT))
 clear_step = 8;
   else if (TARGET_SPE && align >= 64)
 clear_step = 8;
@@ -15471,12 +15471,27 @@ expand_block_clear (rtx operands[])
   mode = V2SImode;
 }
   else if (bytes >= 8 && TARGET_POWERPC64
-  /* 64-bit loads and stores require word-aligned
- displacements.  */
-   && (align >= 64 || (!STRICT_ALIGNMENT && align >= 32)))
+   && (align >= 64 || !STRICT_ALIGNMENT))
{
  clear_bytes = 8;
  mode = DImode;
+ if (offset == 0 && align < 64)
+   {
+ rtx addr;
+
+ /* If the address form is reg+offset with offset not a
+multiple of four, reload into reg indirect form here
+rather than waiting for reload.  This way we get one
+reload, not one per store.  */
+ addr = XEXP (orig_dest, 0);
+ if ((GET_CODE (addr) == PLUS || GET_CODE (addr) == LO_SUM)
+  && GET_CODE (XEXP (addr, 1)) == CONST_INT
+  && (INTVAL (XEXP (addr, 1)) & 3) != 0)
+   {
+ addr = copy_addr_to_reg (addr);
+ orig_dest = replace_equiv_address (orig_dest, addr);
+   }
+   }
}
   else if (bytes >= 4 && (align >= 32 || !STRICT_ALIGNMENT))
{   /* move 4 bytes */
@@ -15604,13 +15619,36 @@ expand_block_move (rtx operands[])
  gen_func.movmemsi = gen_movmemsi_4reg;
}
   else if (bytes >= 8 && TARGET_POWERPC64
-  /* 64-bit loads and stores require word-aligned
- displacements.  */
-   && (align >= 64 || (!STRICT_ALIGNMENT && align >= 32)))
+   && (align >= 64 || !STRICT_ALIGNMENT))
{
  move_bytes = 8;
  mode = DImode;
  gen_func.mov = gen_movdi;
+ if (offset == 0 && align < 64)
+   {
+ rtx addr;
+
+ /* If the address form is reg+offset with offset not a
+multiple of four, reload into reg indirect form here
+rather than waiting for reload.  This way we get one
+reload, not one per load and/or store.  */
+ addr = XEXP (orig_dest, 0);
+ if ((GET_CODE (addr) == PLUS || GET_CODE (addr) == LO_SUM)
+  && GET_CODE (XEXP (addr, 1)) == CONST_INT
+  && (INTVAL (XEXP (addr, 1)) & 3) != 0)
+   {
+ addr = copy_addr_to_reg (addr);
+ orig_dest = replace_equiv_address (orig_dest, addr);
+   }
+ addr = XEXP (orig_src, 0);
+ if ((GET_CODE (addr) == PLUS || GET_CODE (addr) == LO_SUM)
+  && GET_CODE (XEXP (addr, 1)) == CONST_INT
+  && (INTVAL (XEXP (addr, 1)) & 3) != 0)
+   {
+ addr = copy_addr_to_reg (addr);
+ orig_src = replace_equiv_address (orig_src, addr);
+   }
+   }
}
   else if (TARGET_STRING && bytes > 4 && !TARGET_POWERPC64)
{   /* move up to 8 bytes at a time */
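
For readers skimming the hunks above, the two decisions the patch
changes can be sketched in standalone C.  This is only a simplified
model under stated assumptions: the function names and parameters are
hypothetical, the vector, SPE, string and 2-byte cases are left out,
and the real code works on RTL addresses rather than plain integers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the width choice after the patch: 8-byte
   accesses are allowed on 64-bit targets whenever the data is 64-bit
   aligned or strict alignment is off, regardless of the alignment
   otherwise known.  ALIGN_BITS is the known alignment in bits.  */
static int
pick_access_bytes (int bytes, int align_bits, bool powerpc64,
                   bool strict_alignment)
{
  if (bytes >= 8 && powerpc64
      && (align_bits >= 64 || !strict_alignment))
    return 8;
  if (bytes >= 4 && (align_bits >= 32 || !strict_alignment))
    return 4;
  return 1;  /* Simplification: the real code has more cases.  */
}

/* Hypothetical model of the new address check: a reg+offset address
   whose constant offset is not a multiple of four is copied into a
   register once, up front, instead of being reloaded per access.  */
static bool
offset_needs_reg_indirect (long offset)
{
  bool needs = (offset & 3) != 0;
  assert (needs == (offset % 4 != 0));
  return needs;
}
```

With strict alignment off, a 16-byte clear of byte-aligned memory on a
64-bit target now picks 8-byte stores, and an offset such as 6 would
trigger the up-front copy to a register while an offset of 8 would not.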

-- 
Alan Modra
Australia Development Lab, IBM


Re: genattrtab error reporting

2014-05-07 Thread Mike Stump
On May 7, 2014, at 6:12 PM, Segher Boessenkool seg...@kernel.crashing.org wrote:
 Does this fix
 
 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31778
 
 Only if it is applied to the tree!  :-)  Yes.
 
 It also is PR57062.  Thanks for fixing it!

Thanks, marked as dup.

Re: [patch] change specific int128 - generic intN

2014-05-07 Thread DJ Delorie

> OK (presuming the usual bootstrap and regression test, which should
> provide a reasonably thorough test of this code through the stdint.h
> tests).

Bootstrapped with and without the patch on x86-64, no regressions.
Committed.  Thanks!