[RS6000] PR61300 K&R incoming args

2014-05-29 Thread Alan Modra
One of the nice features of the ELFv2 ABI is that stack frames are
smaller compared to ELFv1.  We don't allocate a parameter save area
unless we actually use it.  However, for variable argument lists, we
kept the simple va_list type which is a pointer to the memory location
of the next parameter.  This means calls to variable argument list
functions must allocate the parameter save area, and hence calls to
unprototyped functions must also do so.

The wrinkle with K&R style C functions is that function *definitions*
may be unprototyped.  So when compiling a function body we can't use
!prototype_p() to say we have a parameter save area.  A call in some
other compilation unit might be prototyped and so not allocate a
parameter save area.  Another consequence of unprototyped function
definitions is that the return type and argument types may not be
available on the function type node.  Instead you need to look at the
return and arguments on the function decl.
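
To make that concrete, consider an illustrative pair of files (not part
of the patch):

/* a.c: K&R-style definition.  The function *type* carries no argument
   information, so the return and argument types must come from the
   decl, and the body cannot tell how its callers were compiled.  */
double
f (x, n)
     double x;
     int n;
{
  return x + n;
}

/* b.c: this caller has seen a prototype, so under ELFv2 it need not
   allocate a parameter save area...  */
double f (double, int);
double g (void) { return f (1.0, 2); }

/* ...while an unprototyped caller must allocate one, and the body of
   f has to cope with both kinds of caller.  */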

Now, function.c always passes a decl to REG_PARM_STACK_SPACE, but
calls.c sometimes passes a decl and sometimes a type.  This latter
fact makes it necessary, I think, to define an
INCOMING_REG_PARM_STACK_SPACE used by function.c.  You can't blindly
use a decl from calls.c as that falls foul of C++.

The following implements this.  Bootstrapped and regression tested
powerpc64le-linux and powerpc64-linux all langs (except Ada since I
didn't have gnat installed.)  OK to apply?

PR target/61300
* doc/tm.texi.in (INCOMING_REG_PARM_STACK_SPACE): Document.
* doc/tm.texi: Regenerate.
* function.c (INCOMING_REG_PARM_STACK_SPACE): Provide default.
Use throughout in place of REG_PARM_STACK_SPACE.
* config/rs6000/rs6000.c (rs6000_reg_parm_stack_space): Add
incoming param.  Pass to rs6000_function_parms_need_stack.
(rs6000_function_parms_need_stack): Add incoming param, ignore
prototype_p when incoming.  Use function decl when incoming
to handle KR style functions.
* config/rs6000/rs6000.h (REG_PARM_STACK_SPACE): Adjust.
(INCOMING_REG_PARM_STACK_SPACE): Define.
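
For reference, the rs6000.h side described by the last two entries
presumably ends up along these lines (a sketch inferred from the
ChangeLog, not the literal patch):

#define REG_PARM_STACK_SPACE(FNDECL) \
  rs6000_reg_parm_stack_space ((FNDECL), false)
#define INCOMING_REG_PARM_STACK_SPACE(FNDECL) \
  rs6000_reg_parm_stack_space ((FNDECL), true)

where the new second argument says whether we are compiling the
function body (incoming) or emitting a call (outgoing).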

Index: gcc/doc/tm.texi.in
===================================================================
--- gcc/doc/tm.texi.in  (revision 210919)
+++ gcc/doc/tm.texi.in  (working copy)
@@ -3499,6 +3499,13 @@ which.
 @c above is overfull.  not sure what to do.  --mew 5feb93  did
 @c something, not sure if it looks good.  --mew 10feb93
 
+@defmac INCOMING_REG_PARM_STACK_SPACE (@var{fndecl})
+Like @code{REG_PARM_STACK_SPACE}, but for incoming register arguments.
+Define this macro if space guaranteed when compiling a function body
+is different to space required when making a call, a situation that
+can arise with unprototyped functions.
+@end defmac
+
 @defmac OUTGOING_REG_PARM_STACK_SPACE (@var{fntype})
 Define this to a nonzero value if it is the responsibility of the
 caller to allocate the area reserved for arguments passed in registers
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi (revision 210919)
+++ gcc/doc/tm.texi (working copy)
@@ -3948,6 +3948,13 @@ which.
 @c above is overfull.  not sure what to do.  --mew 5feb93  did
 @c something, not sure if it looks good.  --mew 10feb93
 
+@defmac INCOMING_REG_PARM_STACK_SPACE (@var{fndecl})
+Like @code{REG_PARM_STACK_SPACE}, but for incoming register arguments.
+Define this macro if space guaranteed when compiling a function body
+is different to space required when making a call, a situation that
+can arise with unprototyped functions.
+@end defmac
+
 @defmac OUTGOING_REG_PARM_STACK_SPACE (@var{fntype})
 Define this to a nonzero value if it is the responsibility of the
 caller to allocate the area reserved for arguments passed in registers
Index: gcc/function.c
===================================================================
--- gcc/function.c  (revision 210919)
+++ gcc/function.c  (working copy)
@@ -1348,9 +1348,13 @@ static int cfa_offset;
 #define STACK_POINTER_OFFSET   0
 #endif
 
+#if defined (REG_PARM_STACK_SPACE) && !defined (INCOMING_REG_PARM_STACK_SPACE)
+#define INCOMING_REG_PARM_STACK_SPACE REG_PARM_STACK_SPACE
+#endif
+
 /* If not defined, pick an appropriate default for the offset of dynamically
allocated memory depending on the value of ACCUMULATE_OUTGOING_ARGS,
-   REG_PARM_STACK_SPACE, and OUTGOING_REG_PARM_STACK_SPACE.  */
+   INCOMING_REG_PARM_STACK_SPACE, and OUTGOING_REG_PARM_STACK_SPACE.  */
 
 #ifndef STACK_DYNAMIC_OFFSET
 
@@ -1362,12 +1366,12 @@ static int cfa_offset;
`crtl->outgoing_args_size'.  Nevertheless, we must allow
for it when allocating stack dynamic objects.  */
 
-#if defined(REG_PARM_STACK_SPACE)
+#ifdef INCOMING_REG_PARM_STACK_SPACE
 #define STACK_DYNAMIC_OFFSET(FNDECL)   \
 ((ACCUMULATE_OUTGOING_ARGS   \
   ? (crtl->outgoing_args_size\
  + 

Re: ipa-visibility TLC 2/n

2014-05-29 Thread Richard Sandiford
Jan Hubicka hubi...@ucw.cz writes:
 Richard Sandiford wrote the original section anchors implementation,
 so he would be a good person to comment about the interaction between
 aliases and section anchors.

 Thanks! Richard, does this patch seem sane?

Looks good to me in principle, but with:

 +  struct symtab_node *snode;
decl = SYMBOL_REF_DECL (symbol);
 +
 +  snode = symtab_node (decl);
 +  if (snode->alias)
 +   {
 + rtx target = DECL_RTL (symtab_alias_ultimate_target (snode)->decl);
 + SYMBOL_REF_BLOCK_OFFSET (symbol) = SYMBOL_REF_BLOCK_OFFSET (target);
 + return;
 +   }

is SYMBOL_REF_BLOCK_OFFSET (target) guaranteed to be valid at this point?
It looked at face value like you'd need a recursive call to place_block_symbol
on the target before the copy.
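
In other words, a sketch of what I have in mind (untested):

      snode = symtab_node (decl);
      if (snode->alias)
	{
	  rtx target = DECL_RTL (symtab_alias_ultimate_target (snode)->decl);

	  /* Make sure the ultimate target has been placed before
	     copying its block offset.  */
	  place_block_symbol (target);
	  SYMBOL_REF_BLOCK_OFFSET (symbol) = SYMBOL_REF_BLOCK_OFFSET (target);
	  return;
	}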

Thanks,
Richard


RE: [PATCH] Fix PR54733 Optimize endian independent load/store

2014-05-29 Thread Thomas Preud'homme
 From: Andreas Schwab [mailto:sch...@linux-m68k.org]
 
 This adds a full byte of padding between each bitfield.  If you want a
 single padding bit you should use :1, but you also need to update the
 test to check for 0x44434241 (0x88868482 is impossible, since that
 requires at least 8 bits per bitfield).

Actually, if I understood C99 correctly, it depends on the storage unit
allocated for the bitfield preceding the zero-length bitfield. Instead of
trying to cover all possible values read from this bitfield, I rewrote the
test to check whether bswap misinterprets the expression and replaces it
with a load or load+bswap. This reduces the number of possible values to 2
and thus makes the test less fragile and easier to understand.

By the way, I couldn't understand how you reached the value
0x44434241. Can you explain it to me?

Here is the ChangeLog:

2014-05-29  Thomas Preud'homme  thomas.preudho...@arm.com

* gcc.c-torture/execute/bswap-2.c: Add alignment constraints to
bitfield and test wrong results instead of correct results to make the
test more portable.

And the patch:

diff --git a/gcc/testsuite/gcc.c-torture/execute/bswap-2.c 
b/gcc/testsuite/gcc.c-torture/execute/bswap-2.c
index 38f18fd..a47e01a 100644
--- a/gcc/testsuite/gcc.c-torture/execute/bswap-2.c
+++ b/gcc/testsuite/gcc.c-torture/execute/bswap-2.c
@@ -6,8 +6,11 @@ typedef __UINT32_TYPE__ unsigned;
 
 struct bitfield {
   unsigned char f0:7;
+  unsigned char   :1;
   unsigned char f1:7;
+  unsigned char   :1;
   unsigned char f2:7;
+  unsigned char   :1;
   unsigned char f3:7;
 };
 
@@ -74,11 +77,17 @@ main ()
 return 0;
   bfin.inval = (struct ok) { 0x83, 0x85, 0x87, 0x89 };
   out = partial_read_le32 (bfin);
-  if (out != 0x09070503 && out != 0x88868482 && out != 0x78306141)
+  /* Test what bswap would do if its checks are not strict enough, instead
+ of the expected result, as there are too many possible results with
+ bitfields.  */
+  if (out == 0x89878583)
 __builtin_abort ();
   bfin.inval = (struct ok) { 0x83, 0x85, 0x87, 0x89 };
   out = partial_read_be32 (bfin);
-  if (out != 0x03050709 && out != 0x82848688 && out != 0x41613078)
+  /* Test what bswap would do if its checks are not strict enough, instead
+ of the expected result, as there are too many possible results with
+ bitfields.  */
+  if (out == 0x83858789)
 __builtin_abort ();
   out = fake_read_le32 (cin, &cin[2]);
   if (out != 0x89018583)

Best regards,

Thomas



Re: [Patch] Minor fixes for regtesting gfortran with -flto

2014-05-29 Thread Uros Bizjak
Hello!

 With the following patch, gfortran can be regtested with -flto
 with no failures except pr54852 and pr60061.

-! { dg-final { scan-assembler-times "myBindC" 1 { target { ! { hppa*-*-hpux* } } } } }
-! { dg-final { scan-assembler-times "myBindC,%r2" 1 { target { hppa*-*-hpux* } } } }
+! { dg-final { scan-assembler-times "call\[^\n\r\]*myBindC" 1 { target { ! { hppa*-*-hpux* } } } } }
+! { dg-final { scan-assembler-times "call\[^\n\r\]*myBindC,%r2" 1 { target { hppa*-*-hpux* } } } }

The change above fails on alpha, which doesn't emit "call" in the assembly, but:

$ grep myBindC bind_c_array_params_2.s
jsr $26,myBindC

Probably, alpha is not the only one that fails this assumption.

Uros.


Re: -fuse-caller-save - Collect register usage information

2014-05-29 Thread Tom de Vries

On 29-05-14 00:42, Bill Schmidt wrote:

Tom, the final version of this patch that you committed breaks bootstrap
on powerpc64le-linux-gnu.  The problem is that all uses of the variable
"i" are guarded by #ifdef STACK_REGS, but the declaration of "i" is
unconditional.  We get an unused variable warning that becomes an error
during stage 3.



Bill,

thanks for letting me know.

I've bootstrapped the attached patch on x86_64, and committed it.

Thanks,
- Tom

2014-05-29  Tom de Vries  t...@codesourcery.com

	* final.c (collect_fn_hard_reg_usage): Guard variable declaration
	with #ifdef STACK_REGS.

diff --git a/gcc/final.c b/gcc/final.c
index a345fe7..c32e177 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -4750,7 +4750,9 @@ static void
 collect_fn_hard_reg_usage (void)
 {
   rtx insn;
+#ifdef STACK_REGS
   int i;
+#endif
   struct cgraph_rtl_info *node;
 
   /* ??? To be removed when all the ports have been fixed.  */


Re: [PATCH] Fix PR54733 Optimize endian independent load/store

2014-05-29 Thread Andreas Schwab
Thomas Preud'homme thomas.preudho...@arm.com writes:

 By the way, I couldn't understand how you reached the value
 0x44434241. Can you explain it to me?

Each byte is composed of the first 7 bits of the original byte.
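
Spelled out (assuming a one-bit pad after each 7-bit field, so each
field occupies its own byte, and MSB-first bitfield layout within the
byte):

  0x83 >> 1 == 0x41 ('A')   0x85 >> 1 == 0x42 ('B')
  0x87 >> 1 == 0x43 ('C')   0x89 >> 1 == 0x44 ('D')

so a little-endian 32-bit read of the four bytes gives 0x44434241.
(With LSB-first layout the low seven bits survive instead:
0x83 & 0x7f == 0x03, etc., giving 0x09070503.)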

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.


RE: [MIPS] Add basic support for MSA (SIMD)

2014-05-29 Thread Matthew Fortune
Mike Stump mikest...@comcast.net writes:
 On May 28, 2014, at 7:27 AM, Richard Earnshaw rearn...@arm.com wrote:
 
  Speed of implementation.  We're gradually replacing these with proper
  builtins, but that takes a lot more work.
 
 As an owner of a port with more builtins than yours, I can offer a
 technological solution to reduce the cost of builtins to:
 
 (define_builtin "my_stop"
   [
 (define_outputs [(void_operand 0)])
 (define_rtl_pattern "my_stop" [])
   ]
 )
 
 (define_insn "my_stop"
   [(unspec_volatile [(const_int 0)]
 UNSPECV_STOP)]
   ""
   "stop")
 
 for example.  This creates the builtins, allows overloading, allows
 input/output parameters, can reorder operands, allows for complex types,
 allows memory reference parameters, allows pure markings, does vectors,
 conditional availability, generates documentation, creates test suites and
 more.  If you wire up a speaker it even sings.
 
 Someone would have to step forward with a need and some time to port
 their port over to the new scheme and help with the reason for why the
 technology should go in.  It is mostly contained in 5600 lines of self
 contained python code, and is built to solve the problem generally.  It adds
 about 800 lines to builtins.c.  It has a macro system that is more powerful
 than the macro system .md files use, so one gets to share and collapse
 builtins rather nicely.  It is known to work for C and C++.  Other languages
 may need extending; C, for example, costs around 250 lines to support.

Others at IMG and I would be interested in reviewing/evaluating the
implementation, and assuming it looks useful we would of course help to
get it in shape for submission.
 
 One promise, you will never have to create an argument list, or a type, for
 example here is a two output, type input functional instruction with some
 doc content:
 
 (define_mode_iterator MYTYPE
 [V8QI V4HI V2SI DI ...])
 
 (define_builtin "my_foo" my_foo2_type
   [
 (define_desc "Doc string for operation")
 (define_outputs [(var_operand:T_MYTYPE 0)
  (var_operand:T_MYTYPE 1)])
 (define_inputs  [(var_operand:T_MYTYPE 2)
  (var_operand:T_MYTYPE 3)])
 (define_rtl_pattern "my_foo2_<mode>" [0 2 1 3])
 (attributes [pure])
   ]
 )
 
 I stripped it so you can't know what the instruction was, but you get a
 flavor of multiple outputs, doc bits, pure, overloading, arguments and
 argument rearranging.

Can you post the implementation as an RFC? I suspect the python aspect
will cause the most trouble, as GCC builds do not currently require python;
I guess that could change depending on the value added. Otherwise it would
need a rewrite, I guess.

Before digging in too deep though it would be useful to know if RichardS
would be willing to consider this kind of thing for the MIPS port?

Regards,
Matthew


Re: [Patch] Minor fixes for regtesting gfortran with -flto

2014-05-29 Thread Dominique Dhumieres
 Probably, alpha is not the only one that fails this assumption.

Indeed! see the thread starting at
https://gcc.gnu.org/ml/fortran/2014-05/msg00127.html

Could you test the following patch

--- ../_clean/gcc/testsuite/gfortran.dg/bind_c_array_params_2.f90	2014-05-24 16:17:53.0 +0200
+++ gcc/testsuite/gfortran.dg/bind_c_array_params_2.f90	2014-05-29 11:34:40.0 +0200
@@ -16,7 +16,7 @@ integer :: aa(4,4)
 call test(aa)
 end
 
-! { dg-final { scan-assembler-times "call\[^\n\r\]*myBindC" 1 { target { ! { hppa*-*-hpux* } } } } }
-! { dg-final { scan-assembler-times "call\[^\n\r\]*myBindC,%r2" 1 { target { hppa*-*-hpux* } } } }
+! { dg-final { scan-assembler-times "\[ \t\]\[$,_0-9\]*myBindC" 1 { target { ! { hppa*-*-hpux* } } } } }
+! { dg-final { scan-assembler-times "\[ \t\]\[$,_0-9\]*myBindC,%r2" 1 { target { hppa*-*-hpux* } } } }
 ! { dg-final { scan-tree-dump-times "test \\\(parm\\." 1 "original" } }
 ! { dg-final { cleanup-tree-dump "original" } }

with

make -k check-gfortran RUNTESTFLAGS="dg.exp=bind_c_array_params_2.f90 --target_board=unix'{-m32,-m64,-m32/-flto,-m64/-flto}'"

Can you pre-approve it?

TIA

Dominique


Re: [MIPS] Add basic support for MSA (SIMD)

2014-05-29 Thread Ramana Radhakrishnan
On Wed, May 28, 2014 at 6:49 PM, Mike Stump mikest...@comcast.net wrote:
 On May 28, 2014, at 7:27 AM, Richard Earnshaw rearn...@arm.com wrote:

 Speed of implementation.  We're gradually replacing these with proper
 builtins, but that takes a lot more work.

 As an owner of a port with more builtins than yours, I can offer a 
 technological solution to reduce the cost of builtins to:

 (define_builtin "my_stop"
   [
 (define_outputs [(void_operand 0)])
 (define_rtl_pattern "my_stop" [])
   ]
 )

 (define_insn "my_stop"
   [(unspec_volatile [(const_int 0)]
 UNSPECV_STOP)]
   ""
   "stop")

 for example.  This creates the builtins, allows overloading, allows 
 input/output parameters, can reorder operands, allows for complex types, 
 allows memory reference parameters, allows pure markings, does vectors, 
 conditional availability, generates documentation, creates test suites and 
 more.  If you wire up a speaker it even sings.

 Someone would have to step forward with a need and some time to port 
 their port over to the new scheme and help with the reason for why the 
 technology should go in.  It is mostly contained in 5600 lines of self 
 contained python code, and is built to solve the problem generally.  It adds 
 about 800 lines to builtins.c.  It has a macro system that is more powerful 
 than the macro system .md files use, so one gets to share and collapse 
 builtins rather nicely.  It is known to work for C and C++.  Other languages 
 may need extending; C, for example, costs around 250 lines to support.

 One promise, you will never have to create an argument list, or a type, for 
 example here is a two output, type input functional instruction with some doc 
 content:

 (define_mode_iterator MYTYPE
 [V8QI V4HI V2SI DI ...])

 (define_builtin "my_foo" my_foo2_type
   [
 (define_desc "Doc string for operation")
 (define_outputs [(var_operand:T_MYTYPE 0)
  (var_operand:T_MYTYPE 1)])
 (define_inputs  [(var_operand:T_MYTYPE 2)
  (var_operand:T_MYTYPE 3)])
 (define_rtl_pattern "my_foo2_<mode>" [0 2 1 3])
 (attributes [pure])
   ]
 )

 I stripped it so you can’t know what the instruction was, but you get a 
 flavor of multiple outputs, doc bits, pure, overloading, arguments and 
 argument rearranging.


 Let me know if you’re interested.

This sounds interesting - could you post something for an RFC or in a
branch so that one can play with it?

Ramana


Re: RFA: cache enabled attribute by insn code

2014-05-29 Thread Richard Earnshaw
On 27/05/14 16:07, Richard Sandiford wrote:
 Richard Sandiford rdsandif...@googlemail.com writes:
 Richard Sandiford rsand...@linux.vnet.ibm.com writes:
 Does the following patch help?

 Bah, it won't of course: %i1 needs to be the operator.
 
 Here's v2.  I tested that it worked for simple tests like:
 
 int f1 (int x, int y) { return x + (y << 4); }
 int f2 (int x, int y) { return x - (y << 4); }
 int f3 (int x, int y) { return x & (y << 4); }
 int f4 (int x, int y) { return x | (y << 4); }
 int f5 (int x, int y) { return x ^ (y << 4); }
 int f6 (int x, int y) { return (y << 4) - x; }
 int g1 (int x, int y, int z) { return x + (y << z); }
 int g2 (int x, int y, int z) { return x - (y << z); }
 int g3 (int x, int y, int z) { return x & (y << z); }
 int g4 (int x, int y, int z) { return x | (y << z); }
 int g5 (int x, int y, int z) { return x ^ (y << z); }
 int g6 (int x, int y, int z) { return (y << z) - x; }
 
 as well as the testcase.
 
 Thanks,
 Richard
 
 
 gcc/
   * config/arm/iterators.md (shiftable_ops): New code iterator.
   (t2_binop0, arith_shift_insn): New code attributes.
   * config/arm/arm.md (insn_enabled): Delete.
   (enabled): Remove insn_enabled test.
   (*arith_shiftsi): Split out...
   (*arith_multsi): ...this pattern and remove insn_enabled attribute.
 


Thanks, Richard.
I've tweaked this as follows and committed it.

I now consider shift_operator in the arm backend deprecated.  We
should be moving towards using shift_nomul_operator.

There's one final wart still to be handled, though.  'rotate' can only
take an immediate operand, not a register.  We can currently deal with
this, but it's not clean in terms of constraint handling.  I'll see if I
can fix this up sometime, but not today.
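
For reference, the difference is visible at the C level (illustrative
functions, not from the testsuite): only the constant-amount rotate can
be folded into the shifter operand of these combined patterns.

/* Rotate by a constant: representable as an immediate in the shifter
   operand.  */
unsigned rot3 (unsigned x) { return (x >> 3) | (x << 29); }

/* Rotate by a register: no immediate form, so it cannot use the same
   constraint.  */
unsigned rotn (unsigned x, unsigned n)
{
  n &= 31;
  return (x >> n) | (x << ((32 - n) & 31));
}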

R.

2014-05-29  Richard Earnshaw rearn...@arm.com
Richard Sandiford  rdsandif...@googlemail.com

* arm/iterators.md (shiftable_ops): New code iterator.
(t2_binop0, arith_shift_insn): New code attributes.
* arm/predicates.md (shift_nomul_operator): New predicate.
* arm/arm.md (insn_enabled): Delete.
(enabled): Remove insn_enabled test.
(*arith_shiftsi): Delete.  Replace with ...
(*arith_shift_insn_multsi): ... new pattern.
(*arith_shift_insn_shiftsi): ... new pattern.
* config/arm/arm.c (arm_print_operand): Handle operand format 'b'.





diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index ccad548..b514757 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -21271,7 +21271,15 @@ arm_print_condition (FILE *stream)
 }
 
 
-/* If CODE is 'd', then the X is a condition operand and the instruction
+/* Globally reserved letters: acln
+   Punctuation letters currently used: @_|?().!#
+   Lower case letters currently used: bcdefhimpqtvwxyz
+   Upper case letters currently used: ABCDFGHJKLMNOPQRSTU
+   Letters previously used, but now deprecated/obsolete: sVWXYZ.
+
+   Note that the global reservation for 'c' is only for CONSTANT_ADDRESS_P.
+
+   If CODE is 'd', then the X is a condition operand and the instruction
should only be executed if the condition is true.
if CODE is 'D', then the X is a condition operand and the instruction
should only be executed if the condition is false: however, if the mode
@@ -21411,6 +21419,19 @@ arm_print_operand (FILE *stream, rtx x, int code)
}
   return;
 
+case 'b':
+  /* Print the log2 of a CONST_INT.  */
+  {
+   HOST_WIDE_INT val;
+
+   if (!CONST_INT_P (x)
+   || (val = exact_log2 (INTVAL (x) & 0xffffffff)) < 0)
+ output_operand_lossage ("Unsupported operand for code '%c'", code);
+   else
+ fprintf (stream, "#" HOST_WIDE_INT_PRINT_DEC, val);
+  }
+  return;
+
 case 'L':
   /* The low 16 bits of an immediate constant.  */
   fprintf (stream, HOST_WIDE_INT_PRINT_DEC, INTVAL (x) & 0xffff);
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 348a89c..cd7495f 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -200,17 +200,9 @@
  (const_string "yes")]
 (const_string "no")))
 
-; Allows an insn to disable certain alternatives for reasons other than
-; arch support.
-(define_attr "insn_enabled" "no,yes"
-  (const_string "yes"))
-
 ; Enable all alternatives that are both arch_enabled and insn_enabled.
  (define_attr "enabled" "no,yes"
-   (cond [(eq_attr "insn_enabled" "no")
- (const_string "no")
-
- (and (eq_attr "predicable_short_it" "no")
+   (cond [(and (eq_attr "predicable_short_it" "no")
    (and (eq_attr "predicated" "yes")
 (match_test "arm_restrict_it")))
  (const_string "no")
@@ -9876,38 +9868,34 @@
 
 ;; Patterns to allow combination of arithmetic, cond code and shifts
 
-(define_insn "*arith_shiftsi"
-  [(set (match_operand:SI 0 "s_register_operand" "=r,r,r,r")
-(match_operator:SI 1 "shiftable_operator"
-  [(match_operator:SI 3 "shift_operator"
- [(match_operand:SI 4 "s_register_operand" "r,r,r,r")
- 

RE: [PATCH] Fix PR54733 Optimize endian independent load/store

2014-05-29 Thread Thomas Preud'homme
 From: Andreas Schwab [mailto:sch...@linux-m68k.org]
 Thomas Preud'homme thomas.preudho...@arm.com writes:
 
  By the way, I couldn't understand how you reached the value
  0x44434241. Can you explain it to me?
 
 Each byte is composed of the first 7 bits of the original byte.

Sorry, it seems I wasn't very awake when I checked that. Makes sense now.
Thanks. 

Does the patch solve the problem you had? What about you, Christophe?

Best regards,

Thomas





Re: [PATCH] Inline asm asan instrumentation

2014-05-29 Thread Evgeniy Stepanov
Cool, we don't have this in LLVM-ASan, but we have plans to instrument
inline asm soon (not just constraints).

The asm-struct-1.c test looks like a false positive, though: the code does
not access any invalid memory; it only does a harmless pointer cast.


On Wed, May 28, 2014 at 10:36 PM, Konstantin Serebryany
konstantin.s.serebry...@gmail.com wrote:
 On Wed, May 28, 2014 at 5:33 PM, Marat Zakirov m.zaki...@samsung.com wrote:
 Hi all,

 Here's a patch for optional Asan instrumentation of inline assembly.

 This version scans gimple for GIMPLE_ASMs and performs usual instrumentation
 of arguments with memory constraints ("m", "o", etc.) with fixed size.

 Instrumentation is turned off by default.

 This was successfully bootstrapped and regtested on x64. I have also
 instrumented and ran ffmpeg regression testsuite (it seems to have quite
 some inline asm).

 --Marat


Re: [Patch] Minor fixes for regtesting gfortran with -flto

2014-05-29 Thread Uros Bizjak
On Thu, May 29, 2014 at 11:38 AM, Dominique Dhumieres
domi...@lps.ens.fr wrote:
 Probably, alpha is not the only one that fails this assumption.

 Indeed! see the thread starting at
 https://gcc.gnu.org/ml/fortran/2014-05/msg00127.html

 Could you test the following patch

 --- ../_clean/gcc/testsuite/gfortran.dg/bind_c_array_params_2.f90	2014-05-24 16:17:53.0 +0200
 +++ gcc/testsuite/gfortran.dg/bind_c_array_params_2.f90	2014-05-29 11:34:40.0 +0200
 @@ -16,7 +16,7 @@ integer :: aa(4,4)
  call test(aa)
  end

 -! { dg-final { scan-assembler-times "call\[^\n\r\]*myBindC" 1 { target { ! { hppa*-*-hpux* } } } } }
 -! { dg-final { scan-assembler-times "call\[^\n\r\]*myBindC,%r2" 1 { target { hppa*-*-hpux* } } } }
 +! { dg-final { scan-assembler-times "\[ \t\]\[$,_0-9\]*myBindC" 1 { target { ! { hppa*-*-hpux* } } } } }
 +! { dg-final { scan-assembler-times "\[ \t\]\[$,_0-9\]*myBindC,%r2" 1 { target { hppa*-*-hpux* } } } }
  ! { dg-final { scan-tree-dump-times "test \\\(parm\\." 1 "original" } }
  ! { dg-final { cleanup-tree-dump "original" } }

 with

 make -k check-gfortran RUNTESTFLAGS="dg.exp=bind_c_array_params_2.f90 --target_board=unix'{-m32,-m64,-m32/-flto,-m64/-flto}'"

This works on alpha with --target_board=unix'{,-flto}' and x86_64, so
I guess it is OK.

 Can you pre-approve it?

I'm not a testsuite maintainer (one is CC'd for a final approval), but
the situation is definitely better with the patched regexp.

Uros.


Re: [PATCH] Inline asm asan instrumentation

2014-05-29 Thread Jakub Jelinek
On Wed, May 28, 2014 at 05:33:44PM +0400, Marat Zakirov wrote:
 Here's a patch for optional Asan instrumentation of inline assembly.
 
 This version scans gimple for GIMPLE_ASMs and performs usual instrumentation
 of arguments with memory constraints (m, o, etc.) with fixed size.

That doesn't look right to me.  The fact that some region appears in "m"
doesn't mean the inline asm actually accesses it; it might not touch it at
all, or only some part of it.
If you look e.g. at Linux kernel headers, you'll see lots of
struct __large_struct { unsigned long buf[100]; };
#define __m(x) (*(struct __large_struct __user *)(x))
...
  "m" (__m(addr))
and similar cases, if Asan wants to check that the whole 100*sizeof(long)
region is accessible, it could often just have false positives, because the
inline asm really accesses just some small part of it.
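
A minimal self-contained illustration of that pattern (hypothetical,
x86-64 AT&T syntax, modelled on the kernel idiom above):

struct __large_struct { unsigned long buf[100]; };
#define __m(x) (*(struct __large_struct *)(x))

unsigned long
read_first_word (unsigned long *addr)
{
  unsigned long ret;
  /* The "m" constraint spans sizeof (struct __large_struct) bytes,
     but the instruction loads only the first word, so checking the
     whole region would be a false positive.  */
  __asm__ ("movq %1, %0" : "=r" (ret) : "m" (__m (addr)));
  return ret;
}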

Jakub


Re: RTABI half-precision conversion functions (ping)

2014-05-29 Thread Julian Brown
On Thu, 19 Jul 2012 14:47:54 +0100
Julian Brown jul...@codesourcery.com wrote:

 On Thu, 19 Jul 2012 13:54:57 +0100
 Paul Brook p...@codesourcery.com wrote:
 
   But, that means EABI-conformant callers are also perfectly
   entitled to sign-extend half-float values before calling our
   helper functions (although GCC itself won't do that). Using
   unsigned int and taking care to only examine the low-order bits
   of the value in the helper function itself serves to fix the
   latent bug, is compatible with existing code, allows us to be
   conformant with the eabi, and allows use of aliases to make the
   __gnu and __aeabi functions the same.
  
  As long as LTO never sees this mismatch we should be fine :-)  AFAIK
  we don't curently have any way of expressing the actual ABI.
 
 Let's not worry about that for now :-).
 
   The patch no longer applied as-is, so I've updated it (attached,
   re-tested). Note that there are no longer any target-independent
   changes (though I'm not certain that the symbol versions are still
   correct).
   
   OK to apply?
  
  I think this deserves a comment in the source.  Otherwise it's
  liable to get fixed in the future :-) Something along the lines
  of "While the EABI describes the arguments to the half-float helper
  routines as 'short', it does not require that they be extended to
  full register width.  The normal ABI requires that the caller
  sign/zero extend short values to 32 bit.  We use unsigned int
  arguments to prevent gcc making assumptions about the high half
  of the register."
 
 Here's a version with an explanatory comment. I also fixed a couple of
 minor formatting nits I noticed (they don't upset the diff too much, I
 don't think).

It looks like this one got forgotten about. Ping?

Context:

https://gcc.gnu.org/ml/gcc-patches/2012-07/msg00902.html
https://gcc.gnu.org/ml/gcc-patches/2012-07/msg00912.html

This is an EABI-conformance fix.
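
For context, the convention under discussion amounts to something like
this (a sketch with illustrative names, not the actual libgcc code):
the helper takes an unsigned int and examines only its low 16 bits, so
nothing is assumed about the upper half of the register.

float
h2f_sketch (unsigned int a)
{
  unsigned int h = a & 0xffffu;		/* low-order bits only */
  unsigned int sign = (h & 0x8000u) << 16;
  unsigned int exp = (h >> 10) & 0x1fu;
  unsigned int frac = h & 0x3ffu;
  unsigned int bits;
  float f;

  if (exp == 0x1fu)			/* Inf or NaN */
    bits = sign | 0x7f800000u | (frac << 13);
  else if (exp != 0)			/* normal number */
    bits = sign | ((exp + 112) << 23) | (frac << 13);
  else if (frac != 0)			/* subnormal: renormalize */
    {
      exp = 113;
      while ((frac & 0x400u) == 0)
	{
	  frac <<= 1;
	  exp--;
	}
      bits = sign | (exp << 23) | ((frac & 0x3ffu) << 13);
    }
  else
    bits = sign;			/* signed zero */

  __builtin_memcpy (&f, &bits, sizeof f);
  return f;
}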

Thanks,

Julian


Re: [DOC PATCH] Rewrite docs for inline asm

2014-05-29 Thread Eric Botcazou
 Yes.  We already know that this is better than the current docs.
 Let's check it in.

As far as I can see you did it, but didn't add a ChangeLog entry (so David 
isn't properly credited with the rewrite)?

-- 
Eric Botcazou


RE: [PATCH] Inline asm asan instrumentation

2014-05-29 Thread Marat Zakirov
 The asm-struct-1.c test looks like a false positive, though: the code does not 
 access any invalid memory; it only does a harmless pointer cast.

It is not, because st1 has a smaller size than st2:

struct st1 {
   int a[110];
};
struct st2 {
   int a[111];
};

And the asm constraint was declared as:

: "=m" (*((struct st2 *)s1)));

The test violates the memory access constraint via the cast
(struct st2 *)s1.  We check only constraints, and by such a cast we
assume the user declares that he wants to access the full st2
structure, which has a bigger size than st1.
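
A self-contained sketch of the shape of the test (a reconstruction for
illustration, not the exact asm-struct-1.c):

struct st1 { int a[110]; };
struct st2 { int a[111]; };

void
f (struct st1 *s1)
{
  /* The "=m" constraint claims sizeof (struct st2) bytes starting at
     *s1, more than was actually allocated, so the instrumented
     constraint check flags it.  */
  __asm__ ("" : "=m" (*(struct st2 *) s1));
}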



Re: RFA: cache enabled attribute by insn code

2014-05-29 Thread Richard Sandiford
Richard Earnshaw rearn...@arm.com writes:
 On 27/05/14 16:07, Richard Sandiford wrote:
 Richard Sandiford rdsandif...@googlemail.com writes:
 Richard Sandiford rsand...@linux.vnet.ibm.com writes:
 Does the following patch help?

 Bah, it won't of course: %i1 needs to be the operator.
 
 Here's v2.  I tested that it worked for simple tests like:
 
 int f1 (int x, int y) { return x + (y << 4); }
 int f2 (int x, int y) { return x - (y << 4); }
 int f3 (int x, int y) { return x & (y << 4); }
 int f4 (int x, int y) { return x | (y << 4); }
 int f5 (int x, int y) { return x ^ (y << 4); }
 int f6 (int x, int y) { return (y << 4) - x; }
 int g1 (int x, int y, int z) { return x + (y << z); }
 int g2 (int x, int y, int z) { return x - (y << z); }
 int g3 (int x, int y, int z) { return x & (y << z); }
 int g4 (int x, int y, int z) { return x | (y << z); }
 int g5 (int x, int y, int z) { return x ^ (y << z); }
 int g6 (int x, int y, int z) { return (y << z) - x; }
 
 as well as the testcase.
 
 Thanks,
 Richard
 
 
 gcc/
  * config/arm/iterators.md (shiftable_ops): New code iterator.
  (t2_binop0, arith_shift_insn): New code attributes.
  * config/arm/arm.md (insn_enabled): Delete.
  (enabled): Remove insn_enabled test.
  (*arith_shiftsi): Split out...
  (*arith_multsi): ...this pattern and remove insn_enabled attribute.
 


 Thanks, Richard.
 I've tweaked this as follows and committed it.

 I now consider shift_operator in the arm backend deprecated.  We
 should be moving towards using shift_nomul_operator.

 There's one final wart still to be handled, though.  'rotate' can only
 take an immediate operand, not a register.  We can currently deal with
 this, but it's not clean in terms of constraint handling.  I'll see if I
 can fix this up sometime, but not today.

Thanks for picking it up.  I realised later that I'd fluffed the
MULT check in:

 @@ -9876,38 +9868,34 @@
  
  ;; Patterns to allow combination of arithmetic, cond code and shifts
  
 -(define_insn "*arith_shiftsi"
 -  [(set (match_operand:SI 0 "s_register_operand" "=r,r,r,r")
 -(match_operator:SI 1 "shiftable_operator"
 -  [(match_operator:SI 3 "shift_operator"
 - [(match_operand:SI 4 "s_register_operand" "r,r,r,r")
 -  (match_operand:SI 5 "shift_amount_operand" "M,M,M,r")])
 -   (match_operand:SI 2 "s_register_operand" "rk,rk,r,rk")]))]
 +(define_insn "*arith_shift_insn_multsi"
 +  [(set (match_operand:SI 0 "s_register_operand" "=r,r")
 + (shiftable_ops:SI
 +  (mult:SI (match_operand:SI 2 "s_register_operand" "r,r")
 +   (match_operand:SI 3 "power_of_two_operand" ""))
 +  (match_operand:SI 1 "s_register_operand" "rk,t2_binop0")))]
    "TARGET_32BIT"
 -  "%i1%?\\t%0, %2, %4%S3"
 +  "<arith_shift_insn>%?\\t%0, %1, %2, lsl %b3"
 +  [(set_attr "predicable" "yes")
 +   (set_attr "predicable_short_it" "no")
 +   (set_attr "shift" "4")
 +   (set_attr "arch" "a,t2")
 +   (set_attr "type" "alu_shift_imm")])
 +
 +(define_insn "*arith_shift_insn_shiftsi"
 +  [(set (match_operand:SI 0 "s_register_operand" "=r,r,r")
 + (shiftable_ops:SI
 +  (match_operator:SI 2 "shift_nomul_operator"
 +   [(match_operand:SI 3 "s_register_operand" "r,r,r")
 +    (match_operand:SI 4 "shift_amount_operand" "M,M,r")])
 +  (match_operand:SI 1 "s_register_operand" "rk,t2_binop0,rk")))]
 +  "TARGET_32BIT && GET_CODE (operands[3]) != MULT"

...this condition: operands[3] was the old numbering of the operator
rather than the new numbering.  It looks like shift_nomul_operator
should make it redundant anyway.

Richard


[PATCH, i386, Pointer Bounds Checker 11/x] Keep bounds initial values

2014-05-29 Thread Ilya Enkovich
Hi,

This patch tries to keep the initial values of bounds when they may be
needed.  Even if the initial value is not fully known (e.g. we know only
the low bound), it may still help to remove some redundant checks.

Bootstrapped and tested on linux-x86_64.

Thanks,
Ilya
--
gcc/

2014-05-29  Ilya Enkovich  ilya.enkov...@intel.com

* ipa.c (symtab_remove_unreachable_nodes): Keep initial values for
pointer bounds to be used for check elimination.
* lto-cgraph.c (compute_ltrans_boundary): Likewise.
* lto/lto-partition.c (add_references_to_partition): Add
references to pointer bounds vars.


diff --git a/gcc/ipa.c b/gcc/ipa.c
index 1d7fa35..958cabe 100644
--- a/gcc/ipa.c
+++ b/gcc/ipa.c
@@ -568,7 +568,8 @@ symtab_remove_unreachable_nodes (bool before_inlining_p, FILE *file)
  vnode->aux = NULL;
 
  /* Keep body if it may be useful for constant folding.  */
 - if ((init = ctor_for_folding (vnode->decl)) == error_mark_node)
 + if ((init = ctor_for_folding (vnode->decl)) == error_mark_node
 +     && !POINTER_BOUNDS_P (vnode->decl))
    varpool_remove_initializer (vnode);
   else
 DECL_INITIAL (vnode->decl) = init;
diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c
index 58105f0..7089516 100644
--- a/gcc/lto-cgraph.c
+++ b/gcc/lto-cgraph.c
@@ -829,7 +829,8 @@ compute_ltrans_boundary (lto_symtab_encoder_t in_encoder)
{
  if (!lto_symtab_encoder_encode_initializer_p (encoder,
vnode)
-      && ctor_for_folding (vnode->decl) != error_mark_node)
+      && (ctor_for_folding (vnode->decl) != error_mark_node
+	  || POINTER_BOUNDS_P (vnode->decl)))
{
  lto_set_symtab_encoder_encode_initializer (encoder, vnode);
  add_references (encoder, vnode->ref_list);
diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
index 2967d73..330253b 100644
--- a/gcc/lto/lto-partition.c
+++ b/gcc/lto/lto-partition.c
@@ -96,7 +96,8 @@ add_references_to_partition (ltrans_partition part, symtab_node *node)
Recursively look into the initializers of the constant variable and add
references, too.  */
 else if (is_a <varpool_node> (ref->referred)
-	  && ctor_for_folding (ref->referred->decl) != error_mark_node
+	  && (ctor_for_folding (ref->referred->decl) != error_mark_node
+	      || POINTER_BOUNDS_P (ref->referred->decl))
 	  && !lto_symtab_encoder_in_partition_p (part->encoder, ref->referred))
    {
      if (!part->initializers_visited)


[PATCH, i386, Pointer Bounds Checker 12/x] Recognize instrumented special functions

2014-05-29 Thread Ilya Enkovich
Hi,

This patch makes it possible to recognize an instrumented call to a special
function by using the original function name for recognition.

Bootstrapped and tested on linux-x86_64.

Thanks,
Ilya
--
gcc/

2014-05-29  Ilya Enkovich  ilya.enkov...@intel.com

* calls.c (special_function_p): Use original decl name
when analyzing instrumentation clone.

diff --git a/gcc/calls.c b/gcc/calls.c
index f0c92dd..e1dc8eb 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -502,8 +502,16 @@ emit_call_1 (rtx funexp, tree fntree ATTRIBUTE_UNUSED, tree fndecl ATTRIBUTE_UNU
 static int
 special_function_p (const_tree fndecl, int flags)
 {
-  if (fndecl && DECL_NAME (fndecl)
-      && IDENTIFIER_LENGTH (DECL_NAME (fndecl)) <= 17
+  tree name_decl = DECL_NAME (fndecl);
+
+  /* For instrumentation clones we want to derive flags
+ from the original name.  */
+  if (cgraph_get_node (fndecl)
+      && cgraph_get_node (fndecl)->instrumentation_clone)
+    name_decl = DECL_NAME (cgraph_get_node (fndecl)->orig_decl);
+
+  if (fndecl && name_decl
+      && IDENTIFIER_LENGTH (name_decl) <= 17
   /* Exclude functions not at the file scope, or not `extern',
 since they are not the magic functions we would otherwise
 think they are.
@@ -515,16 +523,16 @@ special_function_p (const_tree fndecl, int flags)
  || TREE_CODE (DECL_CONTEXT (fndecl)) == TRANSLATION_UNIT_DECL)
      && TREE_PUBLIC (fndecl))
 {
-  const char *name = IDENTIFIER_POINTER (DECL_NAME (fndecl));
+  const char *name = IDENTIFIER_POINTER (name_decl);
   const char *tname = name;
 
   /* We assume that alloca will always be called by name.  It
 makes no sense to pass it as a pointer-to-function to
 anything that does not understand its behavior.  */
-  if (((IDENTIFIER_LENGTH (DECL_NAME (fndecl)) == 6
+  if (((IDENTIFIER_LENGTH (name_decl) == 6
 	&& name[0] == 'a'
 	&& ! strcmp (name, "alloca"))
-      || (IDENTIFIER_LENGTH (DECL_NAME (fndecl)) == 16
+      || (IDENTIFIER_LENGTH (name_decl) == 16
 	  && name[0] == '_'
 	  && ! strcmp (name, "__builtin_alloca"))))
    flags |= ECF_MAY_BE_ALLOCA;


[PATCH, Pointer Bounds Checker 13/x] Early versioning

2014-05-29 Thread Ilya Enkovich
Hi,

This patch makes it possible to perform function versioning when some
structures are not yet available.  It is required to create clones for the
Pointer Bounds Checker right after SSA build.

Bootstrapped and tested on linux-x86_64.

Thanks,
Ilya
--
gcc/

2014-05-29  Ilya Enkovich  ilya.enkov...@intel.com

* tree-inline.c (copy_cfg_body): Check loop tree
existence before accessing it.
(tree_function_versioning): Check DF info existence
before accessing it.

diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 4293241..23fef90 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -2544,7 +2544,8 @@ copy_cfg_body (copy_body_data * id, gcov_type count, int 
frequency_scale,
 
   /* If the loop tree in the source function needed fixup, mark the
  destination loop tree for fixup, too.  */
-  if (loops_for_fn (src_cfun)->state & LOOPS_NEED_FIXUP)
+  if (loops_for_fn (src_cfun)
+      && loops_for_fn (src_cfun)->state & LOOPS_NEED_FIXUP)
 loops_state_set (LOOPS_NEED_FIXUP);
 
   if (gimple_in_ssa_p (cfun))
@@ -5350,8 +5351,9 @@ tree_function_versioning (tree old_decl, tree new_decl,
   DECL_ARGUMENTS (new_decl) = DECL_ARGUMENTS (old_decl);
   initialize_cfun (new_decl, old_decl,
   old_entry_block-count);
-  DECL_STRUCT_FUNCTION (new_decl)->gimple_df->ipa_pta
-    = id.src_cfun->gimple_df->ipa_pta;
+  if (DECL_STRUCT_FUNCTION (new_decl)->gimple_df)
+    DECL_STRUCT_FUNCTION (new_decl)->gimple_df->ipa_pta
+      = id.src_cfun->gimple_df->ipa_pta;
 
   /* Copy the function's static chain.  */
   p = DECL_STRUCT_FUNCTION (old_decl)->static_chain_decl;


RE: [PATCH] Inline asm asan instrumentation

2014-05-29 Thread Marat Zakirov
Actually, I do not think it is a good idea to use constraints in such an
arbitrary way.  By setting constraints, the user takes responsibility on
himself.  So even if full inline asm support is done one day, I do think
that checking memory constraints will still exist.
It is the same situation as with compiler warnings: sometimes they are
bothersome, but if you think you do not need them, just do not use them.




Re: [PATCH 7/7] Move SYMBOL_REF_FLAGS to rtx header

2014-05-29 Thread Tom de Vries

On 10-05-14 22:24, Richard Sandiford wrote:

  /* A set of flags on a symbol_ref that are, in some respects, redundant with
 information derivable from the tree decl associated with this symbol.
@@ -1791,7 +1794,9 @@ #define SYMBOL_REF_CONSTANT(RTX) \
 this information to avoid recomputing it.  Finally, this allows space for
 the target to store more than one bit of information, as with
 SYMBOL_REF_FLAG.  */
-#define SYMBOL_REF_FLAGS(RTX)  X0INT ((RTX), 1)
+#define SYMBOL_REF_FLAGS(RTX) \
+  (RTL_FLAG_CHECK1 ("SYMBOL_REF_FLAGS", (RTX), SYMBOL_REF) \
+   ->u2.symbol_ref_flags)



Richard,

with an arm-linux-gnueabi non-bootstrap build with --enable-checking=yes,rtl, I 
ran into the following error:

...
/home/vries/gcc_versions/devel/src/libgcc/libgcc2.c:819:1: internal compiler 
error: RTL check: attempt to treat non-block symbol as a block symbol in 
create_block_symbol, at varasm.c:394

 };
 ^
0xc3c16b rtl_check_failed_block_symbol(char const*, int, char const*)
/home/vries/gcc_versions/devel/src/gcc/rtl.c:844
0x103c09d create_block_symbol
/home/vries/gcc_versions/devel/src/gcc/varasm.c:394
0x103f42d make_decl_rtl(tree_node*)
/home/vries/gcc_versions/devel/src/gcc/varasm.c:1379
0x103fc87 notice_global_symbol(tree_node*)
/home/vries/gcc_versions/devel/src/gcc/varasm.c:1552
0x7588bf varpool_finalize_decl(tree_node*)
/home/vries/gcc_versions/devel/src/gcc/cgraphunit.c:823
0xb4eaa0 rest_of_decl_compilation(tree_node*, int, int)
/home/vries/gcc_versions/devel/src/gcc/passes.c:241
0x5902c4 finish_decl(tree_node*, unsigned int, tree_node*, tree_node*, 
tree_node*)
/home/vries/gcc_versions/devel/src/gcc/c/c-decl.c:4521
0x5e8586 c_parser_declaration_or_fndef
/home/vries/gcc_versions/devel/src/gcc/c/c-parser.c:1782
0x5e7644 c_parser_external_declaration
/home/vries/gcc_versions/devel/src/gcc/c/c-parser.c:1399
0x5e72c7 c_parser_translation_unit
/home/vries/gcc_versions/devel/src/gcc/c/c-parser.c:1286
0x606c6d c_parse_file()
/home/vries/gcc_versions/devel/src/gcc/c/c-parser.c:14077
0x66b7fa c_common_parse_file()
/home/vries/gcc_versions/devel/src/gcc/c-family/c-opts.c:1067
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See http://gcc.gnu.org/bugs.html for instructions.
...

It looks like BLOCK_SYMBOL_CHECK hasn't been updated.

Patch below fixes it for me. OK for trunk if bootstrap on x86_64 succeeds?

Thanks,
- Tom


2014-05-29  Tom de Vries  t...@codesourcery.com

	* rtl.h (BLOCK_SYMBOL_CHECK): Use SYMBOL_REF_FLAGS.
---
 gcc/rtl.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/rtl.h b/gcc/rtl.h
index 02ce424..51cfae5 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -708,7 +708,7 @@ struct GTY(()) rtvec_def {
 
 #define BLOCK_SYMBOL_CHECK(RTX) __extension__\
 ({ __typeof (RTX) const _symbol = (RTX);\
-   const unsigned int flags = RTL_CHECKC1 (_symbol, 1, SYMBOL_REF).rt_int; \
+   const unsigned int flags = SYMBOL_REF_FLAGS (_symbol);		\
   if ((flags & SYMBOL_FLAG_HAS_BLOCK_INFO) == 0)			\
  rtl_check_failed_block_symbol (__FILE__, __LINE__,			\
 __FUNCTION__);			\
-- 
1.9.1



Re: [DOC PATCH] Rewrite docs for inline asm

2014-05-29 Thread Andrew Haley
On 05/29/2014 11:22 AM, Eric Botcazou wrote:
 Yes.  We already know that this is better than the current docs.
 Let's check it in.
 
 As far as I can see you did it, but didn't add a ChangeLog entry (so David 
 isn't properly credited with the rewrite)?

Fixed.

Thanks,
Andrew.




Re: [PATCH 7/7] Move SYMBOL_REF_FLAGS to rtx header

2014-05-29 Thread Richard Sandiford
Tom de Vries tom_devr...@mentor.com writes:
 On 10-05-14 22:24, Richard Sandiford wrote:
   /* A set of flags on a symbol_ref that are, in some respects, redundant 
 with
  information derivable from the tree decl associated with this symbol.
 @@ -1791,7 +1794,9 @@ #define SYMBOL_REF_CONSTANT(RTX) \
  this information to avoid recomputing it.  Finally, this allows space 
 for
  the target to store more than one bit of information, as with
  SYMBOL_REF_FLAG.  */
 -#define SYMBOL_REF_FLAGS(RTX)   X0INT ((RTX), 1)
 +#define SYMBOL_REF_FLAGS(RTX) \
 +  (RTL_FLAG_CHECK1 ("SYMBOL_REF_FLAGS", (RTX), SYMBOL_REF) \
 +   ->u2.symbol_ref_flags)


 Richard,

 with an arm-linux-gnueabi non-bootstrap build with --enable-checking=yes,rtl, 
 I 
 ran into the following error:
 ...
 /home/vries/gcc_versions/devel/src/libgcc/libgcc2.c:819:1: internal compiler 
 error: RTL check: attempt to treat non-block symbol as a block symbol in 
 create_block_symbol, at varasm.c:394
   };
   ^
 0xc3c16b rtl_check_failed_block_symbol(char const*, int, char const*)
   /home/vries/gcc_versions/devel/src/gcc/rtl.c:844
 0x103c09d create_block_symbol
   /home/vries/gcc_versions/devel/src/gcc/varasm.c:394
 0x103f42d make_decl_rtl(tree_node*)
   /home/vries/gcc_versions/devel/src/gcc/varasm.c:1379
 0x103fc87 notice_global_symbol(tree_node*)
   /home/vries/gcc_versions/devel/src/gcc/varasm.c:1552
 0x7588bf varpool_finalize_decl(tree_node*)
   /home/vries/gcc_versions/devel/src/gcc/cgraphunit.c:823
 0xb4eaa0 rest_of_decl_compilation(tree_node*, int, int)
   /home/vries/gcc_versions/devel/src/gcc/passes.c:241
 0x5902c4 finish_decl(tree_node*, unsigned int, tree_node*, tree_node*, 
 tree_node*)
   /home/vries/gcc_versions/devel/src/gcc/c/c-decl.c:4521
 0x5e8586 c_parser_declaration_or_fndef
   /home/vries/gcc_versions/devel/src/gcc/c/c-parser.c:1782
 0x5e7644 c_parser_external_declaration
   /home/vries/gcc_versions/devel/src/gcc/c/c-parser.c:1399
 0x5e72c7 c_parser_translation_unit
   /home/vries/gcc_versions/devel/src/gcc/c/c-parser.c:1286
 0x606c6d c_parse_file()
   /home/vries/gcc_versions/devel/src/gcc/c/c-parser.c:14077
 0x66b7fa c_common_parse_file()
   /home/vries/gcc_versions/devel/src/gcc/c-family/c-opts.c:1067
 Please submit a full bug report,
 with preprocessed source if appropriate.
 Please include the complete backtrace with any bug report.
 See http://gcc.gnu.org/bugs.html for instructions.
 ...

 It looks like BLOCK_SYMBOL_CHECK hasn't been updated.

 Patch below fixes it for me. OK for trunk if bootstrap on x86_64 succeeds?

Can't really approve it, but it looks obviously correct to me.
Thanks for the fix.

Richard


Re: [PATCH 7/7] Move SYMBOL_REF_FLAGS to rtx header

2014-05-29 Thread Jeff Law

On 05/29/14 05:27, Tom de Vries wrote:

On 10-05-14 22:24, Richard Sandiford wrote:

  /* A set of flags on a symbol_ref that are, in some respects,
redundant with
 information derivable from the tree decl associated with this
symbol.
@@ -1791,7 +1794,9 @@ #define SYMBOL_REF_CONSTANT(RTX) \
 this information to avoid recomputing it.  Finally, this allows
space for
 the target to store more than one bit of information, as with
 SYMBOL_REF_FLAG.  */
-#define SYMBOL_REF_FLAGS(RTX)X0INT ((RTX), 1)
+#define SYMBOL_REF_FLAGS(RTX) \
+  (RTL_FLAG_CHECK1 ("SYMBOL_REF_FLAGS", (RTX), SYMBOL_REF) \
+   ->u2.symbol_ref_flags)



Richard,

with an arm-linux-gnueabi non-bootstrap build with
--enable-checking=yes,rtl, I ran into the following error:
...
/home/vries/gcc_versions/devel/src/libgcc/libgcc2.c:819:1: internal
compiler error: RTL check: attempt to treat non-block symbol as a block
symbol in create_block_symbol, at varasm.c:394
  };
  ^
0xc3c16b rtl_check_failed_block_symbol(char const*, int, char const*)
 /home/vries/gcc_versions/devel/src/gcc/rtl.c:844
0x103c09d create_block_symbol
 /home/vries/gcc_versions/devel/src/gcc/varasm.c:394
0x103f42d make_decl_rtl(tree_node*)
 /home/vries/gcc_versions/devel/src/gcc/varasm.c:1379
0x103fc87 notice_global_symbol(tree_node*)
 /home/vries/gcc_versions/devel/src/gcc/varasm.c:1552
0x7588bf varpool_finalize_decl(tree_node*)
 /home/vries/gcc_versions/devel/src/gcc/cgraphunit.c:823
0xb4eaa0 rest_of_decl_compilation(tree_node*, int, int)
 /home/vries/gcc_versions/devel/src/gcc/passes.c:241
0x5902c4 finish_decl(tree_node*, unsigned int, tree_node*, tree_node*,
tree_node*)
 /home/vries/gcc_versions/devel/src/gcc/c/c-decl.c:4521
0x5e8586 c_parser_declaration_or_fndef
 /home/vries/gcc_versions/devel/src/gcc/c/c-parser.c:1782
0x5e7644 c_parser_external_declaration
 /home/vries/gcc_versions/devel/src/gcc/c/c-parser.c:1399
0x5e72c7 c_parser_translation_unit
 /home/vries/gcc_versions/devel/src/gcc/c/c-parser.c:1286
0x606c6d c_parse_file()
 /home/vries/gcc_versions/devel/src/gcc/c/c-parser.c:14077
0x66b7fa c_common_parse_file()
 /home/vries/gcc_versions/devel/src/gcc/c-family/c-opts.c:1067
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See http://gcc.gnu.org/bugs.html for instructions.
...

It looks like BLOCK_SYMBOL_CHECK hasn't been updated.

Patch below fixes it for me. OK for trunk if bootstrap on x86_64 succeeds?

Yes.  Ok.

jeff


Re: [PATCH 7/7] Move SYMBOL_REF_FLAGS to rtx header

2014-05-29 Thread Jeff Law

On 05/29/14 06:07, Richard Sandiford wrote:


Can't really approve it, but it looks obviously correct to me.
Thanks for the fix.

Is that something you'd like to see changed?

Jeff


[PATCH AArch64 0/2] Correct signedness of builtins, remove casts from arm_neon.h

2014-05-29 Thread Alan Lawrence
The __builtin_ functions registered by aarch64_init_simd_builtins use signed 
and/or unsigned types according to the qualifiers defined in aarch64-builtins.c 
and used in aarch64-simd-builtins.def. These __builtin functions are then used 
in arm_neon.h, with explicit casts converting between the signed/unsigned types 
declared for the intrinsics and for the builtins.


These two patches add a few more sets of qualifiers, and use existing ones more 
widely, allowing removing lots of the explicit casts from arm_neon.h.
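
As a concrete illustration of what the cast removal looks like at one
intrinsic (vrshl_u8, shown in full in the second patch):

/* Before: the builtin is declared with all-signed types, so the
   wrapper must cast in both directions.  */
__extension__ static __inline uint8x8_t __attribute__ ((__always_inline__))
vrshl_u8 (uint8x8_t __a, int8x8_t __b)
{
  return (uint8x8_t) __builtin_aarch64_urshlv8qi ((int8x8_t) __a, __b);
}

/* After: the _uus builtin carries unsigned/unsigned/signed types
   itself, so no casts are needed.  */
__extension__ static __inline uint8x8_t __attribute__ ((__always_inline__))
vrshl_u8 (uint8x8_t __a, int8x8_t __b)
{
  return __builtin_aarch64_urshlv8qi_uus (__a, __b);
}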


There is no particular logic in the division between the two patches beyond that 
the second patch uses sets of qualifiers defined in the first.


No regressions on aarch64-none-elf or aarch64_be-none-elf.



[PATCH AArch64 1/2] Correct signedness of builtins, remove casts from arm_neon.h

2014-05-29 Thread Alan Lawrence
This adds three new sets of qualifiers to aarch64-builtins.c, and uses the 
already-present-but-unused USHIFTIMM.


gcc/ChangeLog:

* gcc/config/aarch64/aarch64-builtins.c
(aarch64_types_binop_uus_qualifiers,
aarch64_types_shift_to_unsigned_qualifiers,
aarch64_types_unsigned_shiftacc_qualifiers): Define.
* gcc/config/aarch64/aarch64-simd-builtins.def (uqshl, uqrshl, uqadd,
uqsub, usqadd, usra_n, ursra_n, uqshrn_n, uqrshrn_n, usri_n, usli_n,
sqshlu_n, uqshl_n): Update qualifiers.
* gcc/config/aarch64/arm_neon.h (vqadd_u8, vqadd_u16, vqadd_u32,
vqadd_u64, vqaddq_u8, vqaddq_u16, vqaddq_u32, vqaddq_u64, vqsub_u8,
vqsub_u16, vqsub_u32, vqsub_u64, vqsubq_u8, vqsubq_u16, vqsubq_u32,
vqsubq_u64, vqaddb_u8, vqaddh_u16, vqadds_u32, vqaddd_u64, vqrshl_u8,
vqrshl_u16, vqrshl_u32, vqrshl_u64, vqrshlq_u8, vqrshlq_u16,
vqrshlq_u32, vqrshlq_u64, vqrshlb_u8, vqrshlh_u16, vqrshls_u32,
vqrshld_u64, vqrshrn_n_u16, vqrshrn_n_u32, vqrshrn_n_u64,
vqrshrnh_n_u16, vqrshrns_n_u32, vqrshrnd_n_u64, vqshl_u8, vqshl_u16,
vqshl_u32, vqshl_u64, vqshlq_u8, vqshlq_u16, vqshlq_u32, vqshlq_u64,
vqshlb_u8, vqshlh_u16, vqshls_u32, vqshld_u64, vqshl_n_u8, vqshl_n_u16,
vqshl_n_u32, vqshl_n_u64, vqshlq_n_u8, vqshlq_n_u16, vqshlq_n_u32,
vqshlq_n_u64, vqshlb_n_u8, vqshlh_n_u16, vqshls_n_u32, vqshld_n_u64,
vqshlu_n_s8, vqshlu_n_s16, vqshlu_n_s32, vqshlu_n_s64, vqshluq_n_s8,
vqshluq_n_s16, vqshluq_n_s32, vqshluq_n_s64, vqshlub_n_s8,
vqshluh_n_s16, vqshlus_n_s32, vqshlud_n_s64, vqshrn_n_u16,
vqshrn_n_u32, vqshrn_n_u64, vqshrnh_n_u16, vqshrns_n_u32,
vqshrnd_n_u64, vqsubb_u8, vqsubh_u16, vqsubs_u32, vqsubd_u64,
vrsra_n_u8, vrsra_n_u16, vrsra_n_u32, vrsra_n_u64, vrsraq_n_u8,
vrsraq_n_u16, vrsraq_n_u32, vrsraq_n_u64, vrsrad_n_u64, vsli_n_u8,
vsli_n_u16, vsli_n_u32,vsli_n_u64, vsliq_n_u8, vsliq_n_u16,
vsliq_n_u32, vsliq_n_u64, vslid_n_u64, vsqadd_u8, vsqadd_u16,
vsqadd_u32, vsqadd_u64, vsqaddq_u8, vsqaddq_u16, vsqaddq_u32,
vsqaddq_u64, vsqaddb_u8, vsqaddh_u16, vsqadds_u32, vsqaddd_u64,
vsra_n_u8, vsra_n_u16, vsra_n_u32, vsra_n_u64, vsraq_n_u8,
vsraq_n_u16, vsraq_n_u32, vsraq_n_u64, vsrad_n_u64, vsri_n_u8,
vsri_n_u16, vsri_n_u32, vsri_n_u64, vsriq_n_u8, vsriq_n_u16,
vsriq_n_u32, vsriq_n_u64, vsrid_n_u64): Remove casts.



Alan Lawrence wrote:
The __builtin_ functions registered by aarch64_init_simd_builtins use signed 
and/or unsigned types according to the qualifiers defined in aarch64-builtins.c 
and used in aarch64-simd-builtins.def. These __builtin functions are then used 
in arm_neon.h, with explicit casts converting between the signed/unsigned types 	

declared for the intrinsics and for the builtins.

These two patches add a few more sets of qualifiers, and use existing ones more 
widely, allowing removing lots of the explicit casts from arm_neon.h.


There is no particular logic in the division between the two patches beyond that 
the second patch uses sets of qualifiers defined in the first.


No regressions on aarch64-none-elf or aarch64_be-none-elf.



diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index 591260f18bcc084bcc6cc16b6597a3d2ec098d05..036da3e8646124cda7e1cb36db2ec48f84b58214 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -173,6 +173,10 @@ aarch64_types_binopu_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   = { qualifier_unsigned, qualifier_unsigned, qualifier_unsigned };
 #define TYPES_BINOPU (aarch64_types_binopu_qualifiers)
 static enum aarch64_type_qualifiers
+aarch64_types_binop_uus_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_unsigned, qualifier_unsigned, qualifier_none };
+#define TYPES_BINOP_UUS (aarch64_types_binop_uus_qualifiers)
+static enum aarch64_type_qualifiers
 aarch64_types_binopp_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   = { qualifier_poly, qualifier_poly, qualifier_poly };
 #define TYPES_BINOPP (aarch64_types_binopp_qualifiers)
@@ -199,9 +203,14 @@ aarch64_types_getlane_qualifiers[SIMD_MAX_BUILTIN_ARGS]
 #define TYPES_GETLANE (aarch64_types_getlane_qualifiers)
 #define TYPES_SHIFTIMM (aarch64_types_getlane_qualifiers)
 static enum aarch64_type_qualifiers
+aarch64_types_shift_to_unsigned_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_unsigned, qualifier_none, qualifier_immediate };
+#define TYPES_SHIFTIMM_USS (aarch64_types_shift_to_unsigned_qualifiers)
+static enum aarch64_type_qualifiers
 aarch64_types_unsigned_shift_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   = { qualifier_unsigned, qualifier_unsigned, qualifier_immediate };
 #define TYPES_USHIFTIMM (aarch64_types_unsigned_shift_qualifiers)
+
 static enum aarch64_type_qualifiers
 aarch64_types_setlane_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   = { qualifier_none, qualifier_none, qualifier_none, qualifier_immediate };
@@ 

[PATCH AArch64 2/2] Correct signedness of builtins, remove casts from arm_neon.h

2014-05-29 Thread Alan Lawrence
This adds another set of qualifiers to aarch64-builtins.c, and removes more 
casts from arm_neon.h, for the suqadd, ushl, urshl, urshr_n, ushll_n, and sshl 
intrinsics.


gcc/ChangeLog:

* gcc/config/aarch64/aarch64-builtins.c
(aarch64_types_binop_ssu_qualifiers): New static data.
(TYPES_BINOP_SSU): Define.
* gcc/config/aarch64/aarch64-simd-builtins.def (suqadd, ushl, urshl,
urshr_n, ushll_n): Use appropriate unsigned qualifiers.
* gcc/config/aarch64/arm_neon.h (vrshl_u8, vrshl_u16, vrshl_u32,
vrshl_u64, vrshlq_u8, vrshlq_u16, vrshlq_u32, vrshlq_u64, vrshld_u64,
vrshr_n_u8, vrshr_n_u16, vrshr_n_u32, vrshr_n_u64, vrshrq_n_u8,
vrshrq_n_u16, vrshrq_n_u32, vrshrq_n_u64, vrshrd_n_u64, vshll_n_u8,
vshll_n_u16, vshll_n_u32, vuqadd_s8, vuqadd_s16, vuqadd_s32,
vuqadd_s64, vuqaddq_s8, vuqaddq_s16, vuqaddq_s32, vuqaddq_s64,
vuqaddb_s8, vuqaddh_s16, vuqadds_s32, vuqaddd_s64): Add signedness
suffix to builtin function name, remove cast.
(vshl_s8, vshl_s16, vshl_s32, vshl_s64, vshl_u8, vshl_u16, vshl_u32,
vshl_u64, vshlq_s8, vshlq_s16, vshlq_s32, vshlq_s64, vshlq_u8,
vshlq_u16, vshlq_u32, vshlq_u64, vshld_s64, vshld_u64): Remove cast.

diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index 036da3e8646124cda7e1cb36db2ec48f84b58214..481f68fe7f070545b63d39d2589720566288d638 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -177,6 +177,10 @@ aarch64_types_binop_uus_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   = { qualifier_unsigned, qualifier_unsigned, qualifier_none };
 #define TYPES_BINOP_UUS (aarch64_types_binop_uus_qualifiers)
 static enum aarch64_type_qualifiers
+aarch64_types_binop_ssu_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_none, qualifier_unsigned };
+#define TYPES_BINOP_SSU (aarch64_types_binop_ssu_qualifiers)
+static enum aarch64_type_qualifiers
 aarch64_types_binopp_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   = { qualifier_poly, qualifier_poly, qualifier_poly };
 #define TYPES_BINOPP (aarch64_types_binopp_qualifiers)
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index 03e820f426fb0d99634a3e3d675ab648fca1691b..8e3dcf1d04de2e8ccef75bd2f37d5a5ac17321c0 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -86,7 +86,7 @@
   BUILTIN_VSDQ_I (BINOP, sqsub, 0)
   BUILTIN_VSDQ_I (BINOPU, uqsub, 0)
   /* Implemented by aarch64_<sur>qadd<mode>.  */
-  BUILTIN_VSDQ_I (BINOP, suqadd, 0)
+  BUILTIN_VSDQ_I (BINOP_SSU, suqadd, 0)
   BUILTIN_VSDQ_I (BINOP_UUS, usqadd, 0)
 
   /* Implemented by aarch64_get_dreg<VSTRUCT:mode><VDC:mode>.  */
@@ -197,9 +197,9 @@
   BUILTIN_VSDQ_I_DI (BINOP, ashl, 3)
   /* Implemented by aarch64_<sur>shl<mode>.  */
   BUILTIN_VSDQ_I_DI (BINOP, sshl, 0)
-  BUILTIN_VSDQ_I_DI (BINOP, ushl, 0)
+  BUILTIN_VSDQ_I_DI (BINOP_UUS, ushl, 0)
   BUILTIN_VSDQ_I_DI (BINOP, srshl, 0)
-  BUILTIN_VSDQ_I_DI (BINOP, urshl, 0)
+  BUILTIN_VSDQ_I_DI (BINOP_UUS, urshl, 0)
 
   BUILTIN_VDQ_I (SHIFTIMM, ashr, 3)
   VAR1 (SHIFTIMM, ashr_simd, 0, di)
@@ -207,7 +207,7 @@
   VAR1 (USHIFTIMM, lshr_simd, 0, di)
   /* Implemented by aarch64_<sur>shr_n<mode>.  */
   BUILTIN_VSDQ_I_DI (SHIFTIMM, srshr_n, 0)
-  BUILTIN_VSDQ_I_DI (SHIFTIMM, urshr_n, 0)
+  BUILTIN_VSDQ_I_DI (USHIFTIMM, urshr_n, 0)
   /* Implemented by aarch64_<sur>sra_n<mode>.  */
   BUILTIN_VSDQ_I_DI (SHIFTACC, ssra_n, 0)
   BUILTIN_VSDQ_I_DI (USHIFTACC, usra_n, 0)
@@ -215,7 +215,7 @@
   BUILTIN_VSDQ_I_DI (USHIFTACC, ursra_n, 0)
   /* Implemented by aarch64_<sur>shll_n<mode>.  */
   BUILTIN_VDW (SHIFTIMM, sshll_n, 0)
-  BUILTIN_VDW (SHIFTIMM, ushll_n, 0)
+  BUILTIN_VDW (USHIFTIMM, ushll_n, 0)
   /* Implemented by aarch64_<sur>shll2_n<mode>.  */
   BUILTIN_VQW (SHIFTIMM, sshll2_n, 0)
   BUILTIN_VQW (SHIFTIMM, ushll2_n, 0)
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index af86d8e127404c7bdf2763cbefbdabb2db047dfc..f96871aeaa4613988352a10a2a2a718dd24702ab 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -22605,25 +22605,25 @@ vrshl_s64 (int64x1_t __a, int64x1_t __b)
 __extension__ static __inline uint8x8_t __attribute__ ((__always_inline__))
 vrshl_u8 (uint8x8_t __a, int8x8_t __b)
 {
-  return (uint8x8_t) __builtin_aarch64_urshlv8qi ((int8x8_t) __a, __b);
+  return __builtin_aarch64_urshlv8qi_uus (__a, __b);
 }
 
 __extension__ static __inline uint16x4_t __attribute__ ((__always_inline__))
 vrshl_u16 (uint16x4_t __a, int16x4_t __b)
 {
-  return (uint16x4_t) __builtin_aarch64_urshlv4hi ((int16x4_t) __a, __b);
+  return __builtin_aarch64_urshlv4hi_uus (__a, __b);
 }
 
 __extension__ static __inline uint32x2_t __attribute__ ((__always_inline__))
 vrshl_u32 (uint32x2_t __a, int32x2_t __b)
 {
-  return (uint32x2_t) __builtin_aarch64_urshlv2si ((int32x2_t) __a, __b);
+  return 

Re: [PATCH] Inline asm asan instrumentation

2014-05-29 Thread Yury Gribov

The fact that some region appears in an "m" constraint
doesn't mean the inline asm actually accesses it; it might not touch it at
all, or only some part of it.


Do we have precise semantics of "m" written somewhere?
My understanding was that even though the asm may not touch the buffer at all
(like e.g. in our tests),
the user still declares the whole region accessible to the compiler.


if Asan wants to check that the whole 100*sizeof(long)
region is accessible, it could often just have false positives, because the
inline asm really accesses just some small part of it.


We've seen this abused (e.g. casting to struct { char x[0x]; } *)
and that's the main reason why we turned this off by default.
On the other hand, we've seen no problems with ffmpeg's testsuite,
and the ability to detect overflows in inline asm would be rather useful.

Do you see how we could make this more robust?
We could e.g. only check the first byte,
although this wouldn't be as useful.
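
To make that concrete, a hypothetical x86-64 sketch (not from our tests): the
"=m" operand covers all of buf, yet the template stores only the first
element, so demanding that the whole 100*sizeof(long) region be addressable
could flag valid code:

long buf[100];

void
first_element_only (void)
{
  /* "=m" (buf) declares the whole array as potentially written;
     the asm itself touches only buf[0].  */
  __asm__ ("movq $0, %0" : "=m" (buf));
}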

-Y


Re: [C++ Patch] PR 57543

2014-05-29 Thread Jason Merrill

On 05/28/2014 01:06 PM, Paolo Carlini wrote:

Now, I got this insane idea: would it make sense to simply invert the
substitutions (args and return) unconditionally?


If we're going to change the order, I want to do it in a more correct, 
rather than differently wrong, way.  DR 1227 clarified that substitution 
should proceed in lexical order.


http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1227
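
A hypothetical illustration of the lexical-order rule (my example, not from
the PR): with T = int below, 'typename T::type' fails in the immediate
context and the candidate is quietly dropped (SFINAE) before the trailing
return type is ever substituted.

template <typename T>
auto f (typename T::type x) -> decltype (x + 1) { return x + 1; }

template <typename T>
int f (...) { return 0; }          // fallback

int main () { return f<int> (0); } // picks the fallback, no hard error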

Jason



Re: [C++ Patch] PR 57543

2014-05-29 Thread Paolo Carlini

Hi,

On 05/29/2014 03:34 PM, Jason Merrill wrote:

On 05/28/2014 01:06 PM, Paolo Carlini wrote:

Now, I got this insane idea: would it make sense to simply invert the
substitutions (args and return) unconditionally?


If we're going to change the order, I want to do it in a more correct, 
rather than differently wrong, way.  DR 1227 clarified that 
substitution should proceed in lexical order.


http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1227
Ok, I had no idea we had this kind of much more general issue. Then this 
is really something for you to handle ;)


Thanks!
Paolo.


Re: ipa-visibility TLC 2/n

2014-05-29 Thread Yufeng Zhang

Hi Honza,

I can confirm that with your commit r211045 the 
arm-none-linux-gnueabi{hf} builds are OK now.  Thanks for the fix.


Yufeng

On 05/28/14 22:56, Jan Hubicka wrote:

Any update?

I've managed to generate a simple test case from
libstdc++-v3/src/c++98/strstream.cc which reproduces the issue on
ARM that Ramana has reported previously:


template<class _CharT> struct char_traits;

template<typename _CharT, typename _Traits>
class basic_ios
{
};

template<typename _CharT, typename _Traits = char_traits<_CharT> >
class basic_istream : virtual public basic_ios<_CharT, _Traits>
{
protected:
   int _M_gcount;
   virtual ~basic_istream()
 { }
};

class istrstream : public basic_istream<char>
{
   virtual ~istrstream();
};

istrstream::~istrstream()
{
}

-- CUT --

With an arm-none-linux-gnueabi gcc configured as:

./gcc/configure --target=arm-none-linux-gnueabi
--enable-gnu-indirect-function --enable-shared --with-arch=armv7-a
--with-fpu=vfpv3-d16 --with-float=softfp --with-arch=armv7-a
(irrelevant parts omitted)

With the following command line options:

  -fdata-sections -O2 -fPIC -S ./test.cpp

We'll see

./test.cpp:17:7: error: 'istrstream::_ZTV10istrstream.localalias.0'
causes a section type conflict with 'istrstream::_ZTV10istrstream'
 class istrstream : public basic_istream<char>
       ^
./test.cpp:17:7: note: 'istrstream::_ZTV10istrstream' was declared here


This seems to be the same cause as on AIX - we compute the section for the
decl rather than the original.  The following patch seems to fix it.  Does it
allow bootstrap for you?  (It doesn't for AIX, but that seems to be a bug in
the output machinery.)

Index: varasm.c
===
--- varasm.c(revision 210914)
+++ varasm.c(working copy)
@@ -1083,6 +1083,9 @@
  {
addr_space_t as = ADDR_SPACE_GENERIC;
int reloc;
+  symtab_node *snode = symtab_get_node (decl);
+  if (snode)
+    decl = symtab_alias_ultimate_target (snode)->decl;

if (TREE_TYPE (decl) != error_mark_node)
  as = TYPE_ADDR_SPACE (TREE_TYPE (decl));



Yufeng







Re: detecting container overflow bugs in std::vector

2014-05-29 Thread Jonathan Wakely

On 26/05/14 19:19 +0400, Konstantin Serebryany wrote:

It does look useful but I'm concerned about a proliferation of
container checks, we already have the libstdc++ Debug Mode
and I'd
like to see some of the lightweight checks from the Google branch
added to trunk too.

Me too, but these checks are mostly orthogonal to the proposed annotations.


Thanks for clarifying that (and to Paul).


Aren't they still much cheaper than asan
instrumentation?


Of course, they are much cheaper than asan.  But they do not cover the case
that motivated the container overflow annotations (when the contents
of the vector are accessed via vector<T>::data()).


Yes, I don't think I've ever seen that error in code I work with, but
if Asan can be made to detect it then I'm in favour of the changes.
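
For reference, the data() case looks something like this (a hypothetical
sketch):

#include <vector>

int main ()
{
  std::vector<int> v;
  v.reserve (100);      // capacity 100, size 0
  int *p = v.data ();
  p[5] = 42;            // write into unconstructed capacity: invisible to
                        // Debug Mode checks, caught by the annotations
  return 0;
}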

Thanks.



[PATCH][ARM] Use mov_imm type for movw operations consistently

2014-05-29 Thread Kyrill Tkachov

Hi all,

I noticed that in some of our move patterns the movw instruction is given the 
mov_reg type rather than the
mov_imm type that all other uses of movw have. This patch fixes that.

Scanning through our pipeline descriptions I see that mov_imm is treated the 
same way as mov_reg everywhere anyway. In the Cortex-A7 description we do have 
a bit more complicated logic:

;; ALU instruction with an immediate operand can dual-issue.
(define_insn_reservation "cortex_a7_alu_imm" 2
  (and (eq_attr "tune" "cortexa7")
       (ior (eq_attr "type" "adr,alu_imm,alus_imm,logic_imm,logics_imm,\
                             mov_imm,mvn_imm,extend")
            (and (eq_attr "type" "mov_reg,mov_shift,mov_shift_reg")
                 (not (eq_attr "length" "8")))))
  "cortex_a7_ex2|cortex_a7_ex1")

In the two patterns that I change, the mov_imm alternatives have a length of 4
and hence will still use this reservation.  Thus I don't expect codegen to
change at all from this patch, but for future scheduling work it could make a
difference.

Tested arm-none-eabi on qemu.

Ok for trunk?

Thanks,
Kyrill

2014-05-29  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* config/arm/thumb2.md (*thumb2_movhi_insn): Set type of movw
to mov_imm.
* config/arm/vfp.md (*thumb2_movsi_vfp): Likewise.
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index 10bc8b1..6ea0810 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -329,7 +329,7 @@
movw%?\\t%0, %L1\\t%@ movhi
str%(h%)\\t%1, %0\\t%@ movhi
ldr%(h%)\\t%0, %1\\t%@ movhi
-  [(set_attr "type" "mov_reg,mov_imm,mov_imm,mov_reg,store1,load1")
+  [(set_attr "type" "mov_reg,mov_imm,mov_imm,mov_imm,store1,load1")
   (set_attr "predicable" "yes")
   (set_attr "predicable_short_it" "yes,no,yes,no,no,no")
   (set_attr "length" "2,4,2,4,4,4")
diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
index e1a48ee..8147624 100644
--- a/gcc/config/arm/vfp.md
+++ b/gcc/config/arm/vfp.md
@@ -100,7 +100,7 @@
   
   [(set_attr "predicable" "yes")
    (set_attr "predicable_short_it" "yes,no,yes,no,no,no,no,no,no,no,no,no,no,no")
-   (set_attr "type" "mov_reg,mov_reg,mov_reg,mvn_reg,mov_reg,load1,load1,store1,store1,f_mcr,f_mrc,fmov,f_loads,f_stores")
+   (set_attr "type" "mov_reg,mov_reg,mov_reg,mvn_reg,mov_imm,load1,load1,store1,store1,f_mcr,f_mrc,fmov,f_loads,f_stores")
    (set_attr "length" "2,4,2,4,4,4,4,4,4,4,4,4,4,4")
    (set_attr "pool_range" "*,*,*,*,*,1018,4094,*,*,*,*,*,1018,*")
    (set_attr "neg_pool_range" "*,*,*,*,*,   0,   0,*,*,*,*,*,1008,*")]

Re: PING: PATCH [4/n]: Prepare x32: Permute the conversion and addition if one operand is a constant

2014-05-29 Thread H.J. Lu
On Wed, May 28, 2014 at 9:52 PM, Andrew Pinski pins...@gmail.com wrote:
 On Wed, Jul 13, 2011 at 9:39 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Wed, Jul 13, 2011 at 9:13 AM, Paolo Bonzini bonz...@gnu.org wrote:
 On 07/11/2011 05:54 PM, H.J. Lu wrote:

 The key is the
 
  XEXP (x, 1) == convert_memory_address_addr_space
 (to_mode, XEXP (x, 1), as)
 
   test.  It ensures basically that the constant has 31-bit precision,
  because
   otherwise the constant would change from e.g. (const_int -0x7ffc)
  to
   (const_int 0x8004) when zero-extending it from SImode to DImode.
 
   But I'm not sure it's safe.  You have,
 
 (zero_extend:DI (plus:SI FOO:SI) (const_int Y))
 
   and you want to convert it to
 
 (plus:DI FOO:DI (zero_extend:DI (const_int Y)))
 
   (where the zero_extend is folded).  Ignore that FOO is a SYMBOL_REF
  (this
   piece of code does not assume anything about its shape); if FOO ==
   0xfffc and Y = 8, the result will be respectively 0x4 (valid) and
   0x10004 (invalid).

 This example contradicts what you said above It ensures basically that
 the
 constant has 31-bit precision.

 Why?  Certainly Y = 8 has 31-bit (or less) precision.  So it has the same
 representation in SImode and DImode, and the test above on XEXP (x, 1)
 succeeds.

 And then we permute conversion and addition, which leads to the issue you
 raised above.  In another word, the current code permutes conversion
 and addition.
 It leads to different values in case of symbol (0xfffc) + 8.
 Basically the current
 test for 31-bit (or less) precision is bogus.  The real question is
 for a address
 computation, A + B, if address wrap-around is supported in
 convert_memory_address_addr_space.

 Unless the code has already reassociated the additions already.
 Like in the AARCH64 ILP32 case:

 (plus:SI (plus:SI (mult:SI (reg/v:SI 80 [ b ])
 (const_int -4 [0xfffc]))
 (subreg/s/u:SI (reg/v/f:DI 79 [ a ]) 0))
 (const_int -1073742592 [0xbd00]))

 The Tree level is correct in that it did not reassociate the addition
 but the RTL level ignores that.

 So this patch is invalid and incorrect unless you know the non
 constant part of the addition is a pointer (which is not the case
 here).


There is an address overflow.  Is the address overflow behavior
defined here?


-- 
H.J.


Re: PING: PATCH [4/n]: Prepare x32: Permute the conversion and addition if one operand is a constant

2014-05-29 Thread pinskia


 On May 29, 2014, at 9:13 AM, H.J. Lu hjl.to...@gmail.com wrote:
 
 On Wed, May 28, 2014 at 9:52 PM, Andrew Pinski pins...@gmail.com wrote:
 On Wed, Jul 13, 2011 at 9:39 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Wed, Jul 13, 2011 at 9:13 AM, Paolo Bonzini bonz...@gnu.org wrote:
 On 07/11/2011 05:54 PM, H.J. Lu wrote:
 
 The key is the
 
XEXP (x, 1) == convert_memory_address_addr_space
   (to_mode, XEXP (x, 1), as)
 
 test.  It ensures basically that the constant has 31-bit precision,
 because
 otherwise the constant would change from e.g. (const_int -0x7ffc)
 to
 (const_int 0x8004) when zero-extending it from SImode to DImode.
 
 But I'm not sure it's safe.  You have,
 
   (zero_extend:DI (plus:SI FOO:SI) (const_int Y))
 
 and you want to convert it to
 
   (plus:DI FOO:DI (zero_extend:DI (const_int Y)))
 
 (where the zero_extend is folded).  Ignore that FOO is a SYMBOL_REF
 (this
 piece of code does not assume anything about its shape); if FOO ==
 0xfffc and Y = 8, the result will be respectively 0x4 (valid) and
 0x10004 (invalid).
 
 This example contradicts what you said above It ensures basically that
 the
 constant has 31-bit precision.
 
 Why?  Certainly Y = 8 has 31-bit (or less) precision.  So it has the same
 representation in SImode and DImode, and the test above on XEXP (x, 1)
 succeeds.
 
 And then we permute conversion and addition, which leads to the issue you
 raised above.  In another word, the current code permutes conversion
 and addition.
 It leads to different values in case of symbol (0xfffc) + 8.
 Basically the current
 test for 31-bit (or less) precision is bogus.  The real question is
 for a address
 computation, A + B, if address wrap-around is supported in
 convert_memory_address_addr_space.
 
 Unless the code has already reassociated the additions already.
 Like in the AARCH64 ILP32 case:
 
 (plus:SI (plus:SI (mult:SI (reg/v:SI 80 [ b ])
(const_int -4 [0xfffc]))
(subreg/s/u:SI (reg/v/f:DI 79 [ a ]) 0))
(const_int -1073742592 [0xbd00]))
 
 The Tree level is correct in that it did not reassociate the addition
 but the RTL level ignores that.
 
 So this patch is invalid and incorrect unless you know the non
 constant part of the addition is a pointer (which is not the case
 here).
 
 There is an address overflow.  Is the address overflow behavior
 defined here?

There was no address overflow in the original code and there was no address
overflow at the tree level.  The RTL level does introduce an address overflow,
but the semantics of plus are defined to be wrapping, so there is no overflow.
This is blocking me from testing ILP32 under GNU/Linux, as ld.so gets
miscompiled and stack addresses have the sign bit set.
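
For concreteness, the reassociation hazard in plain arithmetic (a sketch
following Paolo's earlier example, with full 32-bit values assumed for
illustration):

#include <cstdint>

// With 32-bit pointers zero-extended into a 64-bit Pmode, extending
// before or after the addition differs once the 32-bit sum wraps.
uint64_t add_then_extend (uint32_t foo) { return foo + 8u; }            // 0xfffffffc -> 0x4
uint64_t extend_then_add (uint32_t foo) { return (uint64_t) foo + 8u; } // 0xfffffffc -> 0x100000004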

Thanks,
Andrew


 
 
 -- 
 H.J.


Re: [PATCH AArch64] Remove from arm_neon.h functions not in the spec

2014-05-29 Thread Alan Lawrence

Patch retaining vfmaq_n_f64 attached, updated gcc/ChangeLog:

* config/aarch64/arm_neon.h (vmlaq_n_f64, vmlsq_n_f64, vrsrtsq_f64,
vcge_p8, vcgeq_p8, vcgez_p8, vcgez_u8, vcgez_u16, vcgez_u32, vcgez_u64,
vcgezq_p8, vcgezq_u8, vcgezq_u16, vcgezq_u32, vcgezq_u64, vcgezd_u64,
vcgt_p8, vcgtq_p8, vcgtz_p8, vcgtz_u8, vcgtz_u16, vcgtz_u32, vcgtz_u64,
vcgtzq_p8, vcgtzq_u8, vcgtzq_u16, vcgtzq_u32, vcgtzq_u64, vcgtzd_u64,
vcle_p8, vcleq_p8, vclez_p8, vclez_u64, vclezq_p8, vclezd_u64, vclt_p8,
vcltq_p8, vcltz_p8, vcltzq_p8, vcltzd_u64): Remove functions as they are
not in the spec.


Alan Lawrence wrote:
No, hold that, vfmaq_n_f64 has been added back in the latest version (to which I 
linked). Hang on...


--Alan

Alan Lawrence wrote:
arm_neon.h contains a bunch of functions (for example, the wonderful vcgez_u* 
intrinsics - that's an unsigned comparison of greater-than-or-equal-to zero) 
that are not present in the current ARM Neon Intrinsics spec:


http://infocenter.arm.com/help/topic/com.arm.doc.ihi0073a/index.html

This patch just deletes those intrinsics.
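
(For what it's worth, the u-forms are vacuous anyway; a hypothetical sketch of
what e.g. vcgez_u8 would have to compute:)

#include <arm_neon.h>

/* An unsigned lane is always >= 0, so the comparison can only ever
   produce all-ones.  */
uint8x8_t
what_vcgez_u8_would_return (uint8x8_t __a)
{
  (void) __a;
  return vdup_n_u8 (0xff);
}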

OK for trunk?

Cheers, Alan

gcc/ChangeLog:
2014-05-27  Alan Lawrence  alan.lawre...@arm.com

* config/aarch64/arm_neon.h (vfmaq_n_f64, vmlaq_n_f64, vmlsq_n_f64,
vrsrtsq_f64, vtst_p16, vtstq_p16, vcge_p8, vcgeq_p8, vcgez_p8, vcgez_u8,
vcgez_u16, vcgez_u32, vcgez_u64, vcgezq_p8, vcgezq_u8, vcgezq_u16,
vcgezq_u32, vcgezq_u64, vcgezd_u64, vcgt_p8, vcgtq_p8, vcgtz_p8,
vcgtz_u8, vcgtz_u16, vcgtz_u32, vcgtz_u64, vcgtzq_p8, vcgtzq_u8,
vcgtzq_u16, vcgtzq_u32, vcgtzq_u64, vcgtzd_u64, vcle_p8, vcleq_p8,
vclez_p8, vclez_u64, vclezq_p8, vclezd_u64, vclt_p8, vcltq_p8, vcltz_p8,
vcltzq_p8, vcltzd_u64): Remove functions as they are not in the spec.






diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 747a292ba9b2260e74566c946fe57afaea267969..bbf47349ae0e21761637a670c6e59c5c1e3f5195 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -7243,18 +7243,6 @@ vmlaq_n_f32 (float32x4_t a, float32x4_t b, float32_t c)
   return result;
 }
 
-__extension__ static __inline float64x2_t __attribute__ ((__always_inline__))
-vmlaq_n_f64 (float64x2_t a, float64x2_t b, float64_t c)
-{
-  float64x2_t result;
-  float64x2_t t1;
-  __asm__ ("fmul %1.2d, %3.2d, %4.d[0]; fadd %0.2d, %0.2d, %1.2d"
-           : "=w"(result), "=w"(t1)
-           : "0"(a), "w"(b), "w"(c)
-           : /* No clobbers */);
-  return result;
-}
-
 __extension__ static __inline int16x8_t __attribute__ ((__always_inline__))
 vmlaq_n_s16 (int16x8_t a, int16x8_t b, int16_t c)
 {
@@ -7943,18 +7931,6 @@ vmlsq_n_f32 (float32x4_t a, float32x4_t b, float32_t c)
   return result;
 }
 
-__extension__ static __inline float64x2_t __attribute__ ((__always_inline__))
-vmlsq_n_f64 (float64x2_t a, float64x2_t b, float64_t c)
-{
-  float64x2_t result;
-  float64x2_t t1;
-  __asm__ ("fmul %1.2d, %3.2d, %4.d[0]; fsub %0.2d, %0.2d, %1.2d"
-           : "=w"(result), "=w"(t1)
-           : "0"(a), "w"(b), "x"(c)
-           : /* No clobbers */);
-  return result;
-}
-
 __extension__ static __inline int16x8_t __attribute__ ((__always_inline__))
 vmlsq_n_s16 (int16x8_t a, int16x8_t b, int16_t c)
 {
@@ -11329,17 +11305,6 @@ vrsqrtss_f32 (float32_t a, float32_t b)
   return result;
 }
 
-__extension__ static __inline float64x2_t __attribute__ ((__always_inline__))
-vrsrtsq_f64 (float64x2_t a, float64x2_t b)
-{
-  float64x2_t result;
-  __asm__ ("frsqrts %0.2d,%1.2d,%2.2d"
-           : "=w"(result)
-           : "w"(a), "w"(b)
-           : /* No clobbers */);
-  return result;
-}
-
 __extension__ static __inline int8x16_t __attribute__ ((__always_inline__))
 vrsubhn_high_s16 (int8x8_t a, int16x8_t b, int16x8_t c)
 {
@@ -16082,13 +16047,6 @@ vcge_f64 (float64x1_t __a, float64x1_t __b)
 }
 
 __extension__ static __inline uint8x8_t __attribute__ ((__always_inline__))
-vcge_p8 (poly8x8_t __a, poly8x8_t __b)
-{
-  return (uint8x8_t) __builtin_aarch64_cmgev8qi ((int8x8_t) __a,
-		 (int8x8_t) __b);
-}
-
-__extension__ static __inline uint8x8_t __attribute__ ((__always_inline__))
 vcge_s8 (int8x8_t __a, int8x8_t __b)
 {
   return (uint8x8_t) __builtin_aarch64_cmgev8qi (__a, __b);
@@ -16152,13 +16110,6 @@ vcgeq_f64 (float64x2_t __a, float64x2_t __b)
 }
 
 __extension__ static __inline uint8x16_t __attribute__ ((__always_inline__))
-vcgeq_p8 (poly8x16_t __a, poly8x16_t __b)
-{
-  return (uint8x16_t) __builtin_aarch64_cmgev16qi ((int8x16_t) __a,
-		   (int8x16_t) __b);
-}
-
-__extension__ static __inline uint8x16_t __attribute__ ((__always_inline__))
 vcgeq_s8 (int8x16_t __a, int8x16_t __b)
 {
   return (uint8x16_t) __builtin_aarch64_cmgev16qi (__a, __b);
@@ -16252,14 +16203,6 @@ vcgez_f64 (float64x1_t __a)
 }
 
 __extension__ static __inline uint8x8_t __attribute__ ((__always_inline__))
-vcgez_p8 (poly8x8_t __a)
-{
-  poly8x8_t __b = {0, 0, 0, 0, 0, 0, 0, 0};
-  return (uint8x8_t) 

Re: PING: PATCH [4/n]: Prepare x32: Permute the conversion and addition if one operand is a constant

2014-05-29 Thread H.J. Lu
On Thu, May 29, 2014 at 9:23 AM,  pins...@gmail.com wrote:


 On May 29, 2014, at 9:13 AM, H.J. Lu hjl.to...@gmail.com wrote:

 On Wed, May 28, 2014 at 9:52 PM, Andrew Pinski pins...@gmail.com wrote:
 On Wed, Jul 13, 2011 at 9:39 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Wed, Jul 13, 2011 at 9:13 AM, Paolo Bonzini bonz...@gnu.org wrote:
 On 07/11/2011 05:54 PM, H.J. Lu wrote:

 The key is the

XEXP (x, 1) == convert_memory_address_addr_space
   (to_mode, XEXP (x, 1), as)

 test.  It ensures basically that the constant has 31-bit precision,
 because
 otherwise the constant would change from e.g. (const_int -0x7ffc)
 to
 (const_int 0x8004) when zero-extending it from SImode to DImode.

 But I'm not sure it's safe.  You have,

   (zero_extend:DI (plus:SI FOO:SI) (const_int Y))

 and you want to convert it to

   (plus:DI FOO:DI (zero_extend:DI (const_int Y)))

 (where the zero_extend is folded).  Ignore that FOO is a SYMBOL_REF
 (this
 piece of code does not assume anything about its shape); if FOO ==
 0xfffc and Y = 8, the result will be respectively 0x4 (valid) and
 0x10004 (invalid).

 This example contradicts what you said above It ensures basically that
 the
 constant has 31-bit precision.

 Why?  Certainly Y = 8 has 31-bit (or less) precision.  So it has the same
 representation in SImode and DImode, and the test above on XEXP (x, 1)
 succeeds.

 And then we permute conversion and addition, which leads to the issue you
 raised above.  In another word, the current code permutes conversion
 and addition.
 It leads to different values in case of symbol (0xfffc) + 8.
 Basically the current
 test for 31-bit (or less) precision is bogus.  The real question is
 for a address
 computation, A + B, if address wrap-around is supported in
 convert_memory_address_addr_space.

 Unless the code has already reassociated the additions already.
 Like in the AARCH64 ILP32 case:

 (plus:SI (plus:SI (mult:SI (reg/v:SI 80 [ b ])
(const_int -4 [0xfffc]))
(subreg/s/u:SI (reg/v/f:DI 79 [ a ]) 0))
(const_int -1073742592 [0xbd00]))

 The Tree level is correct in that it did not reassociate the addition
 but the RTL level ignores that.

 So this patch is invalid and incorrect unless you know the non
 constant part of the addition is a pointer (which is not the case
 here).

 There is an address overflow.  Is the address overflow behavior
 defined here?

 There was no address overflow in the original code and there was no address 
 overflow in the tree level. The rtl level does introduce an address overflow 
 but the semantics of plus is defined to be wrapping so there is no overflow.  
  This is blocking me from testing ilp32 under gnu/Linux as ld.so gets 
 miscompiled and stack addresses have the sign bit set.


What is your Pmode?


-- 
H.J.


Re: ipa-visibility TLC 2/n

2014-05-29 Thread Jan Hubicka
 Jan Hubicka hubi...@ucw.cz writes:
  Richard Sandiford wrote the original section anchors implementation,
  so he would be a good person to comment about the interaction between
  aliases and section anchors.
 
  Thanks! Richard, does this patch seem sane?
 
 Looks good to me in principle, but with:
 
  +  struct symtab_node *snode;
     decl = SYMBOL_REF_DECL (symbol);
  +
  +  snode = symtab_node (decl);
  +  if (snode->alias)
  +    {
  +      rtx target = DECL_RTL (symtab_alias_ultimate_target (snode)->decl);
  +      SYMBOL_REF_BLOCK_OFFSET (symbol) = SYMBOL_REF_BLOCK_OFFSET (target);
  +      return;
  +    }
 
 is SYMBOL_REF_BLOCK_OFFSET (target) guaranteed to be valid at this point?
 It looked at face value like you'd need a recursive call to place_block_symbol
 on the target before the copy.

My reading was that SYMBOL_REF_BLOCK_OFFSET is computed at DECL_RTL calculation
time.  But you are right - it is done by validize_mem, which is not called by
DECL_RTL.
Shall I just call it on the target first?
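
I.e. something like the following untested sketch on top of the hunk above,
recursing before the copy:

      snode = symtab_node (decl);
      if (snode->alias)
	{
	  rtx target
	    = XEXP (DECL_RTL (symtab_alias_ultimate_target (snode)->decl), 0);

	  /* Make sure the target's block offset has been computed
	     before copying it.  */
	  place_block_symbol (target);
	  SYMBOL_REF_BLOCK_OFFSET (symbol) = SYMBOL_REF_BLOCK_OFFSET (target);
	  return;
	}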

Honza
 
 Thanks,
 Richard


Re: PING: PATCH [4/n]: Prepare x32: Permute the conversion and addition if one operand is a constant

2014-05-29 Thread pinskia


 On May 29, 2014, at 10:09 AM, H.J. Lu hjl.to...@gmail.com wrote:
 
 On Thu, May 29, 2014 at 9:23 AM,  pins...@gmail.com wrote:
 
 
 On May 29, 2014, at 9:13 AM, H.J. Lu hjl.to...@gmail.com wrote:
 
 On Wed, May 28, 2014 at 9:52 PM, Andrew Pinski pins...@gmail.com wrote:
 On Wed, Jul 13, 2011 at 9:39 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Wed, Jul 13, 2011 at 9:13 AM, Paolo Bonzini bonz...@gnu.org wrote:
 On 07/11/2011 05:54 PM, H.J. Lu wrote:
 
 The key is the
 
   XEXP (x, 1) == convert_memory_address_addr_space
  (to_mode, XEXP (x, 1), as)
 
 test.  It ensures basically that the constant has 31-bit precision,
 because
 otherwise the constant would change from e.g. (const_int -0x7ffc)
 to
 (const_int 0x8004) when zero-extending it from SImode to DImode.
 
 But I'm not sure it's safe.  You have,
 
  (zero_extend:DI (plus:SI FOO:SI) (const_int Y))
 
 and you want to convert it to
 
  (plus:DI FOO:DI (zero_extend:DI (const_int Y)))
 
 (where the zero_extend is folded).  Ignore that FOO is a SYMBOL_REF
 (this
 piece of code does not assume anything about its shape); if FOO ==
 0xfffc and Y = 8, the result will be respectively 0x4 (valid) and
 0x10004 (invalid).
 
 This example contradicts what you said above It ensures basically that
 the
 constant has 31-bit precision.
 
 Why?  Certainly Y = 8 has 31-bit (or less) precision.  So it has the same
 representation in SImode and DImode, and the test above on XEXP (x, 1)
 succeeds.
 
 And then we permute conversion and addition, which leads to the issue you
 raised above.  In another word, the current code permutes conversion
 and addition.
 It leads to different values in case of symbol (0xfffc) + 8.
 Basically the current
 test for 31-bit (or less) precision is bogus.  The real question is
 for a address
 computation, A + B, if address wrap-around is supported in
 convert_memory_address_addr_space.
 
 Unless the code has already reassociated the additions already.
 Like in the AARCH64 ILP32 case:
 
 (plus:SI (plus:SI (mult:SI (reg/v:SI 80 [ b ])
   (const_int -4 [0xfffc]))
   (subreg/s/u:SI (reg/v/f:DI 79 [ a ]) 0))
   (const_int -1073742592 [0xbd00]))
 
 The Tree level is correct in that it did not reassociate the addition
 but the RTL level ignores that.
 
 So this patch is invalid and incorrect unless you know the non
 constant part of the addition is a pointer (which is not the case
 here).
 
 There is an address overflow.  Is the address overflow behavior
 defined here?
 
 There was no address overflow in the original code and there was no address 
 overflow in the tree level. The rtl level does introduce an address overflow 
 but the semantics of plus is defined to be wrapping so there is no overflow. 
   This is blocking me from testing ilp32 under gnu/Linux as ld.so gets 
 miscompiled and stack addresses have the sign bit set.
 
 
 What is your Pmode?

Pmode is DImode while ptr_mode is SImode.  Pointers are zero-extended when
converting between SImode and DImode.

Thanks,
Andrew


 
 
 -- 
 H.J.


Patch RFA: Move x86 _mm_pause out of pragma target(sse) scope

2014-05-29 Thread Ian Lance Taylor
The _mm_pause intrinsic is defined in xmmintrin.h.  Right now using it
with -m32 with the default -march option gives an error:

/home/iant/foo.c: In function ‘f’:
/home/iant/gcc/go-install/lib/gcc/x86_64-unknown-linux-gnu/4.10.0/include/xmmintrin.h:1238:1:
 error: inlining failed in call to always_inline ‘_mm_pause’: target specific 
option mismatch
 _mm_pause (void)
 ^
/home/iant/foo5.c:3:13: error: called from here
 void f () { _mm_pause (); }
 ^

This error is because _mm_pause is defined in the scope of #pragma GCC
target(sse).  But _mm_pause, which simply generates the pause
instruction, does not require SSE support.  The pause instruction has
nothing really to do with SSE, and it works on all x86 processors (on
processors that do not explicitly recognize it, it is a nop).

I propose the following patch, which moves _mm_pause out of the pragma
target scope.

I know that x86intrin.h provides a similar intrinsic, __pause, but I
think it's worth making _mm_pause work reasonably as well.
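
For what it's worth, the typical use this enables (a sketch, not part of the
patch) is a spin-wait that now compiles with -m32 and no -msse:

#include <xmmintrin.h>

void
spin_wait (volatile int *flag)
{
  while (!*flag)
    _mm_pause ();   /* encodes as rep;nop -- a plain nop on older CPUs */
}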

I'm running a full testsuite run.  OK for mainline if it passes?

Ian


gcc/ChangeLog:

2014-05-29  Ian Lance Taylor  i...@google.com

* config/i386/xmmintrin.h (_mm_pause): Move out of scope of pragma
target(sse).

gcc/testsuite/ChangeLog:

2014-05-29  Ian Lance Taylor  i...@google.com

* gcc.target/i386/pause-2.c: New test.


Index: config/i386/xmmintrin.h
===
--- config/i386/xmmintrin.h	(revision 211057)
+++ config/i386/xmmintrin.h	(working copy)
@@ -1231,15 +1231,6 @@ _mm_sfence (void)
   __builtin_ia32_sfence ();
 }
 
-/* The execution of the next instruction is delayed by an implementation
-   specific amount of time.  The instruction does not modify the
-   architectural state.  */
-extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_mm_pause (void)
-{
-  __builtin_ia32_pause ();
-}
-
 /* Transpose the 4x4 matrix composed of row[0-3].  */
 #define _MM_TRANSPOSE4_PS(row0, row1, row2, row3)			\
 do {	\
@@ -1262,4 +1253,15 @@ do {	\
 #pragma GCC pop_options
 #endif /* __DISABLE_SSE__ */
 
+/* The execution of the next instruction is delayed by an implementation
+   specific amount of time.  The instruction does not modify the
+   architectural state.  This is after the pop_options pragma because
+   it does not require SSE support in the processor--the encoding is a
+   nop on processors that do not support it.  */
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_pause (void)
+{
+  __builtin_ia32_pause ();
+}
+
 #endif /* _XMMINTRIN_H_INCLUDED */
Index: testsuite/gcc.target/i386/pause-2.c
===
--- testsuite/gcc.target/i386/pause-2.c	(revision 0)
+++ testsuite/gcc.target/i386/pause-2.c	(revision 0)
@@ -0,0 +1,12 @@
+/* Test that pause instruction works even when SSE is not enabled.  */
+/* { dg-do compile } */
+/* { dg-options "-O2 -dp" } */
+/* { dg-final { scan-assembler-times "\\*pause" 1 } } */
+
+#include <xmmintrin.h>
+
+void
+foo (void)
+{
+  _mm_pause ();
+}


Re: [AArch64/ARM 3/3] Add execution tests of ARM EXT intrinsics

2014-05-29 Thread Alan Lawrence
I've just committed this as revision 211059, with the change of adding a _1 
suffix to the names of all the new tests (as per standard testsuite convention).


All passed on arm-none-eabi and armeb-none-eabi.

Cheers, Alan

Ramana Radhakrishnan wrote:

On Wed, Apr 23, 2014 at 9:32 PM, Alan Lawrence alan.lawre...@arm.com wrote:

Final patch in series, adds new tests of the ARM EXT Intrinsics, that also
check
the execution results, reusing the test bodies introduced into AArch64 in
the
first patch. (These tests subsume the autogenerated ones in
testsuite/gcc.target/arm/neon/ that only check assembler output.)

Tests use gcc.target/arm/simd/simd.exp from corresponding patch for ZIP
Intrinsics http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01500.html, will
commit that first.

All tests passing on arm-none-eabi.



Ok if no regressions.

Thanks,
Ramana

gcc/testsuite/ChangeLog:
2014-04-23  Alan Lawrence  alan.lawre...@arm.com

gcc.target/arm/simd/vextQf32.c: New file.
gcc.target/arm/simd/vextQp16.c: New file.
gcc.target/arm/simd/vextQp8.c: New file.
gcc.target/arm/simd/vextQs16.c: New file.
gcc.target/arm/simd/vextQs32.c: New file.
gcc.target/arm/simd/vextQs64.c: New file.
gcc.target/arm/simd/vextQs8.c: New file.
gcc.target/arm/simd/vextQu16.c: New file.
gcc.target/arm/simd/vextQu32.c: New file.
gcc.target/arm/simd/vextQu64.c: New file.
gcc.target/arm/simd/vextQu8.c: New file.
gcc.target/arm/simd/vextQp64.c: New file.
gcc.target/arm/simd/vextf32.c: New file.
gcc.target/arm/simd/vextp16.c: New file.
gcc.target/arm/simd/vextp8.c: New file.
gcc.target/arm/simd/vexts16.c: New file.
gcc.target/arm/simd/vexts32.c: New file.
gcc.target/arm/simd/vexts64.c: New file.
gcc.target/arm/simd/vexts8.c: New file.
gcc.target/arm/simd/vextu16.c: New file.
gcc.target/arm/simd/vextu32.c: New file.
gcc.target/arm/simd/vextu64.c: New file.
gcc.target/arm/simd/vextu8.c: New file.
gcc.target/arm/simd/vextp64.c: New file.

diff --git a/gcc/testsuite/gcc.target/arm/simd/vextQf32.c
b/gcc/testsuite/gcc.target/arm/simd/vextQf32.c
new file mode 100644
index 000..c1da6d3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/vextQf32.c
@@ -0,0 +1,12 @@
+/* Test the `vextQf32' ARM Neon intrinsic.  */
+
+/* { dg-do run } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O3 -fno-inline" } */
+/* { dg-add-options arm_neon } */
+
+#include <arm_neon.h>
+#include "../../aarch64/simd/extq_f32.x"
+
+/* { dg-final { scan-assembler-times "vext\.32\[ \t\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(?:\[ \t\]+@\[a-zA-Z0-9 \]+\)?\n" 3 } } */
+/* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/arm/simd/vextQp16.c
b/gcc/testsuite/gcc.target/arm/simd/vextQp16.c
new file mode 100644
index 000..adc0861
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/vextQp16.c
@@ -0,0 +1,12 @@
+/* Test the `vextQp16' ARM Neon intrinsic.  */
+
+/* { dg-do run } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O3 -fno-inline" } */
+/* { dg-add-options arm_neon } */
+
+#include <arm_neon.h>
+#include "../../aarch64/simd/extq_p16.x"
+
+/* { dg-final { scan-assembler-times "vext\.16\[ \t\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(?:\[ \t\]+@\[a-zA-Z0-9 \]+\)?\n" 7 } } */
+/* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/arm/simd/vextQp64.c
b/gcc/testsuite/gcc.target/arm/simd/vextQp64.c
new file mode 100644
index 000..e8b688d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/vextQp64.c
@@ -0,0 +1,33 @@
+/* Test the `vextQp64' ARM Neon intrinsic.  */
+
+/* { dg-do run } */
+/* { dg-require-effective-target arm_crypto_ok } */
+/* { dg-options "-save-temps -O3 -fno-inline" } */
+/* { dg-add-options arm_crypto } */
+
+#include <arm_neon.h>
+
+extern void abort (void);
+
+poly64x2_t
+test_vextq_p64_1 (poly64x2_t a, poly64x2_t b)
+{
+  return vextq_p64(a, b, 1);
+}
+
+int
+main (int argc, char **argv)
+{
+  int i, off;
+  poly64x2_t in1 = {0, 1};
+  poly64x2_t in2 = {2, 3};
+  poly64x2_t actual = test_vextq_p64_1 (in1, in2);
+  for (i = 0; i  2; i++)
+if (actual[i] != i + 1)
+  abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "vext\.64\[ \t\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(?:\[ \t\]+@\[a-zA-Z0-9 \]+\)?\n" 1 } } */
+/* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/arm/simd/vextQp8.c
b/gcc/testsuite/gcc.target/arm/simd/vextQp8.c
new file mode 100644
index 000..5f2cc53
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/vextQp8.c
@@ -0,0 +1,12 @@
+/* Test the `vextQp8' ARM Neon intrinsic.  */
+
+/* { dg-do run } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O3 -fno-inline" } */
+/* { dg-add-options arm_neon 

patch to fix PR61325

2014-05-29 Thread Vladimir Makarov
  The following patch fixes PR61325.  The details can be found at

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61325

  The patch was bootstrapped and tested on x86/x86-64.

  Committed as rev. 211060 to gcc-4.9 branch and as rev.211061 to trunk.

2014-05-29  Vladimir Makarov  vmaka...@redhat.com

PR rtl-optimization/61325
* lra-constraints.c (process_address): Rename to
process_address_1.
(process_address): New function.

2014-05-29  Vladimir Makarov  vmaka...@redhat.com

PR rtl-optimization/61325
* gcc.target/aarch64/pr61325.c: New.

Index: lra-constraints.c
===
--- lra-constraints.c	(revision 210973)
+++ lra-constraints.c	(working copy)
@@ -2784,9 +2784,14 @@
 
Add reloads to the lists *BEFORE and *AFTER.  We might need to add
reloads to *AFTER because of inc/dec, {pre, post} modify in the
-   address.  Return true for any RTL change.  */
+   address.  Return true for any RTL change.
+
+   The function is a helper function which does not produce all
+   transformations which can be necessary.  It does just basic steps.
+   To do all necessary transformations use function
+   process_address.  */
 static bool
-process_address (int nop, rtx *before, rtx *after)
+process_address_1 (int nop, rtx *before, rtx *after)
 {
   struct address_info ad;
   rtx new_reg;
@@ -2986,6 +2991,18 @@
   return true;
 }
 
+/* Do address reloads until it is necessary.  Use process_address_1 as
+   a helper function.  Return true for any RTL changes.  */
+static bool
+process_address (int nop, rtx *before, rtx *after)
+{
+  bool res = false;
+
+  while (process_address_1 (nop, before, after))
+res = true;
+  return res;
+}
+
 /* Emit insns to reload VALUE into a new register.  VALUE is an
auto-increment or auto-decrement RTX whose operand is a register or
memory location; so reloading involves incrementing that location.
@@ -3270,7 +3287,7 @@
 	change_p = true;
 	lra_update_dup (curr_id, i);
   }
-
+  
   if (change_p)
 /* If we've changed the instruction then any alternative that
we chose previously may no longer be valid.  */
Index: testsuite/gcc.target/aarch64/pr61325.c
===
--- testsuite/gcc.target/aarch64/pr61325.c	(revision 0)
+++ testsuite/gcc.target/aarch64/pr61325.c	(working copy)
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options -O2 } */
+typedef unsigned int wchar_t;
+typedef long unsigned int size_t;
+
+size_t
+wcstombs(char *s , const wchar_t *pwcs , size_t n)
+{
+  int count = 0;
+  
+  if (n != 0) {
+do {
+  if ((*s++ = (char) *pwcs++) == 0)
+break;
+  count++;
+} while (--n != 0);
+  }
+  return count;
+}


[patch, avr] ata6289 device ISA is updated

2014-05-29 Thread S, Pitchumani
Hi,

The ATA6289 device has the MUL instruction and belongs to the avr4 ISA.
It is currently incorrectly listed under avr25.  The attached patch corrects
this.

Please commit if the patch is OK; I do not have commit access.

Regards,
Pitchumani

2014-05-29  Pitchumani Sivanupandi pitchuman...@atmel.com
 
* config/avr/avr-mcus.def: Change ATA6289 ISA to AVR4
* config/avr/avr-tables.opt: Regenerate.
* config/avr/t-multilib: Regenerate.
* doc/avr-mmcu.texi: Regenerate.



ata6289-isa-change.patch
Description: ata6289-isa-change.patch


Re: [PATCH] Do not build libsanitizer also for powerpc*-*-linux*

2014-05-29 Thread Peter Bergner
On Wed, 2014-05-28 at 09:36 +0200, Thomas Schwinge wrote:
 Hi!
 
 On Mon, 26 May 2014 18:53:22 +0800, Arseny Solokha asolo...@gmx.com wrote:
  Recent changes in GetPcSpBp() (libsanitizer/asan/asan_linux.cc) made it
  impossible to build 4.10.0-alpha20140525 snapshot for powerpc targets.
 
 I hit this, too.
 
  The
  proposed patch disables building libsanitizer for powerpc*-*-linux* in 
  addition
  to already disabled powerpc*le-*-linux* until the smarter solution will 
  emerge.
  
  The actual issue preventing ASAN from porting to PPC seems to be inability 
  to
  retrieve values of PC and BP on this architecture.
 
 This is being discussed in the thread at
 http://gcc.gnu.org/ml/gcc-patches/2014-05/msg02031.html.  Until that
 has been resolved, I do agree to check in the following patch (and have
 successfully tested it, but cannot formally approve it for commit; thus
 copying the libsanitizer maintainers):

The re-enablement patch was submitted to the llvm mailing list here:

  
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140526/219249.html

Once that is committed and merged into gcc, we can re-enable building
libsanitizer for powerpc*-linux.

Peter





Re: RFA: A couple of ira_get_dup_out_num fixes

2014-05-29 Thread Vladimir Makarov
On 05/28/2014 04:32 PM, Richard Sandiford wrote:
 While working on patches to speed up the handling of constraints,
 I hit some behaviour in ira_get_dup_out_num that looked unintentional:

 - the check for output operands was part of the !ignored_p condition
   so would be skipped if the first alternative is disabled/excluded.

 - the first disabled/excluded alternative stops all following alternatives
   from being processed, since we get stuck in the first part of the
   if statement and never increment curr_alt.

 This seems to have some effect on the testsuite.  E.g. at -O2
 gcc.c-torture/compile/20071117-1.c has changes like:

  .LCFI2:
 movq%rsp, %rbx
 subq%rax, %rsp
 -   leaq15(%rsp), %rax
 -   andq$-16, %rax
 -   movq%rax, %rdi
 +   leaq15(%rsp), %rdi
 +   andq$-16, %rdi
 callbar
 xorl%esi, %esi
 movq%rbx, %rsp

 There are also some cases where the change introduces a move though.
 E.g. gcc.c-torture/compat/struct-ic.c has:

 movabsq $4294967296, %rdx
 addq$8, %rsp
  .LCFI4:
 -   andq%rdi, %rax
 +   andq%rax, %rdi
 +   movq%rdi, %rax
 orq %rdx, %rax
 ret
  .L9:

 But AFAICT the patch is what was originally intended.

 Tested on x86_64-linux-gnu.  OK to install?


Ok, Richard.  Thanks for fixing this.


Re: [C++ Patch] PR 57543

2014-05-29 Thread Paolo Carlini

Hi again,

On 05/29/2014 03:34 PM, Jason Merrill wrote:

On 05/28/2014 01:06 PM, Paolo Carlini wrote:

Now, I got this insane idea: would it make sense to simply invert the
substitutions (args and return) unconditionally?


If we're going to change the order, I want to do it in a more correct, 
rather than differently wrong, way.  DR 1227 clarified that 
substitution should proceed in lexical order.


http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1227
So, here is another iteration, sorry about the ping-pong. I put together 
the below which already passes testing. How does it look?


Thanks again for your patience,
Paolo.

/
Index: cp/cp-tree.h
===
--- cp/cp-tree.h(revision 211052)
+++ cp/cp-tree.h(working copy)
@@ -125,7 +125,7 @@ c-common.h, not after.
Usage of TYPE_LANG_FLAG_?:
0: TYPE_DEPENDENT_P
1: TYPE_HAS_USER_CONSTRUCTOR.
-   2: unused
+   2: TYPE_HAS_LATE_RETURN_TYPE (in FUNCTION_TYPE, METHOD_TYPE)
3: TYPE_FOR_JAVA.
4: TYPE_HAS_NONTRIVIAL_DESTRUCTOR
5: CLASS_TYPE_P (in RECORD_TYPE and UNION_TYPE)
@@ -3404,6 +3404,11 @@ more_aggr_init_expr_args_p (const aggr_init_expr_a
user-declared constructor.  */
 #define TYPE_HAS_USER_CONSTRUCTOR(NODE) (TYPE_LANG_FLAG_1 (NODE))
 
+/* Nonzero means that the FUNCTION_TYPE or METHOD_TYPE has a
+   late-specified return type.  */
+#define TYPE_HAS_LATE_RETURN_TYPE(NODE) \
+  (TYPE_LANG_FLAG_2 (FUNC_OR_METHOD_CHECK (NODE)))
+
 /* When appearing in an INDIRECT_REF, it means that the tree structure
underneath is actually a call to a constructor.  This is needed
when the constructor must initialize local storage (which can
Index: cp/decl.c
===
--- cp/decl.c   (revision 211052)
+++ cp/decl.c   (working copy)
@@ -8817,6 +8817,7 @@ grokdeclarator (const cp_declarator *declarator,
   bool template_parm_flag = false;
   bool typedef_p = decl_spec_seq_has_spec_p (declspecs, ds_typedef);
   bool constexpr_p = decl_spec_seq_has_spec_p (declspecs, ds_constexpr);
+  bool late_return_type_p = false;
   source_location saved_loc = input_location;
   const char *errmsg;
 
@@ -9660,6 +9661,9 @@ grokdeclarator (const cp_declarator *declarator,
if (type == error_mark_node)
  return error_mark_node;
 
+   if (declarator-u.function.late_return_type)
+ late_return_type_p = true;
+
if (ctype == NULL_TREE
 decl_context == FIELD
 funcdecl_p
@@ -10590,6 +10594,10 @@ grokdeclarator (const cp_declarator *declarator,
  decl_function_context (TYPE_MAIN_DECL (ctype)) : NULL_TREE;
 	publicp = (! friendp || ! staticp)
 		  && function_context == NULL_TREE;
+
+   if (late_return_type_p)
+ TYPE_HAS_LATE_RETURN_TYPE (type) = 1;
+
decl = grokfndecl (ctype, type,
   TREE_CODE (unqualified_id) != TEMPLATE_ID_EXPR
   ? unqualified_id : dname,
@@ -10814,6 +10822,9 @@ grokdeclarator (const cp_declarator *declarator,
publicp = (ctype != NULL_TREE
   || storage_class != sc_static);
 
+   if (late_return_type_p)
+ TYPE_HAS_LATE_RETURN_TYPE (type) = 1;
+
decl = grokfndecl (ctype, type, original_name, parms, unqualified_id,
   virtualp, flags, memfn_quals, rqual, raises,
   1, friendp,
Index: cp/pt.c
===
--- cp/pt.c (revision 211052)
+++ cp/pt.c (working copy)
@@ -11322,8 +11322,42 @@ tsubst_function_type (tree t,
   /* The TYPE_CONTEXT is not used for function/method types.  */
   gcc_assert (TYPE_CONTEXT (t) == NULL_TREE);
 
-  /* Substitute the return type.  */
-  return_type = tsubst (TREE_TYPE (t), args, complain, in_decl);
+  /* DR 1227:  Mixing immediate and non-immediate contexts in deduction
+ failure.  */
+  bool late_return_type_p = TYPE_HAS_LATE_RETURN_TYPE (t);
+
+  if (late_return_type_p)
+{
+  /* Substitute the argument types.  */
+  arg_types = tsubst_arg_types (TYPE_ARG_TYPES (t), args, NULL_TREE,
+   complain, in_decl);
+  if (arg_types == error_mark_node)
+   return error_mark_node;
+
+  tree save_ccp = current_class_ptr;
+  tree save_ccr = current_class_ref;
+  tree this_type = (TREE_CODE (t) == METHOD_TYPE
+   ? TREE_TYPE (TREE_VALUE (arg_types)) : NULL_TREE);
+  bool do_inject = this_type && !dependent_type_p (this_type);
+  if (do_inject)
+   {
+ /* DR 1207: 'this' is in scope in the trailing return type.  */
+ inject_this_parameter (this_type, cp_type_quals (this_type));
+   }
+
+  /* Substitute the return type.  */
+  return_type = tsubst (TREE_TYPE (t), args, complain, in_decl);
+
+  

libgo patch committed: Add --without-libatomic configure option

2014-05-29 Thread Ian Lance Taylor
This patch from Peter Collingbourne adds a --without-libatomic configure
option to libgo, to make it easier to build libgo outside of the GCC
build system.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 9e7a28ffe425 libgo/Makefile.am
--- a/libgo/Makefile.am	Wed May 28 17:02:52 2014 -0700
+++ b/libgo/Makefile.am	Thu May 29 13:05:27 2014 -0700
@@ -30,6 +30,8 @@
 LIBFFI = @LIBFFI@
 LIBFFIINCS = @LIBFFIINCS@
 
+LIBATOMIC = @LIBATOMIC@
+
 WARN_CFLAGS = $(WARN_FLAGS) $(WERROR)
 
 # -I/-D flags to pass when compiling.
@@ -1949,8 +1951,7 @@
 
 libgo_la_LIBADD = \
 	$(libgo_go_objs) ../libbacktrace/libbacktrace.la \
-	../libatomic/libatomic_convenience.la \
-	$(LIBFFI) $(PTHREAD_LIBS) $(MATH_LIBS) $(NET_LIBS)
+	$(LIBATOMIC) $(LIBFFI) $(PTHREAD_LIBS) $(MATH_LIBS) $(NET_LIBS)
 
 libgobegin_a_SOURCES = \
 	runtime/go-main.c
diff -r 9e7a28ffe425 libgo/configure.ac
--- a/libgo/configure.ac	Wed May 28 17:02:52 2014 -0700
+++ b/libgo/configure.ac	Thu May 29 13:05:27 2014 -0700
@@ -122,6 +122,21 @@
 AC_SUBST(LIBFFI)
 AC_SUBST(LIBFFIINCS)
 
+# See if the user wants to configure without libatomic. This is useful if we are
+# on an architecture for which libgo does not need an atomic support library and
+# libatomic does not support our C compiler.
+AC_ARG_WITH(libatomic,
+  AS_HELP_STRING([--without-libatomic],
+ [don't use libatomic]),
+  [:],
+  [with_libatomic=${with_libatomic_default-yes}])
+
+LIBATOMIC=
+if test "$with_libatomic" != no; then
+   LIBATOMIC=../libatomic/libatomic_convenience.la
+fi
+AC_SUBST(LIBATOMIC)
+
 # Used to tell GNU make to include a file without telling automake to
 # include it.
 go_include=-include


Re: [DOC PATCH] Rewrite docs for inline asm

2014-05-29 Thread Eric Botcazou
 Fixed.

Thanks!

-- 
Eric Botcazou


Re: [PATCH][ARM] FAIL: gcc.target/arm/pr58041.c scan-assembler ldrb

2014-05-29 Thread Maciej W. Rozycki
On Wed, 28 May 2014, Richard Earnshaw wrote:

 Ah, light dawns (maybe).
 
 I guess the problems stem from the attempts to combine Neon with ARMv5.
  Neon shouldn't be used with anything prior to ARMv7, since that's the
 earliest version of the architecture that can support it.

 Good to know, thanks for the hint.  Anyway, it's the test case doing
something silly or maybe just odd.  After all, IIUC, ARMv5 code will run
just fine on ARMv7/NEON hardware, so mixing ARMv5 scalar code with NEON
vector code is nothing wrong per se.

 I guess that what is happening is that we see we have Neon, so start to
 generate a Neon-based copy sequence, but then notice that we don't have
 misaligned access (something that must exist if we have Neon) and
 generate VLDR instructions in a mistaken attempt to work around the
 first inconsistency.
 
 Maybe we should tie -mfpu=neon to having at least ARMv7 (though ARMv6
 also has misaligned access support).

 So, to move away from the odd mixture of instruction selection options,
just as a quick test I rebuilt the same file with `-march=armv7-a
-mno-unaligned-access' and the result is the same: a pair of VLDR
instructions accessing unaligned memory, i.e. the same problem.

 So based on observations made so far I think there are two sensible 
ways to move forward:

1. Fix GCC so that a manual byte-wise copy is made whenever 
  `-mno-unaligned-access' is in effect.

2. Revert the change being discussed here as its lone purpose was to 
   disable the use of VLD1.8, etc. where `-mno-unaligned-access' is in 
   effect, and it does no good.

  Maciej


run dsymutil post lto

2014-05-29 Thread Mike Stump
Jack finally found the answer to a question I had back in 2010…  Why, yes, one 
does have to arrange to run the post ld pass when lto runs but doesn’t have to 
relink.

Committed revision 211067.

Thanks Jack.

PR debug/61352
* collect2.c (maybe_run_lto_and_relink): Be sure to always run
post ld passes when lto is used.

Index: collect2.c
===
--- collect2.c  (revision 211062)
+++ collect2.c  (working copy)
@@ -848,6 +848,8 @@ maybe_run_lto_and_relink (char **lto_ld_
   fork_execute (ld, lto_ld_argv);
   post_ld_pass (false);
 }
+  else
+post_ld_pass (true);
 }
 

 /* Main program.  */