[Patch, Fortran, committed] CO_MIN/MAX/SUM fixes

2014-06-19 Thread Tobias Burnus

This patch fixes a few bugs related to CO_MIN/MAX/SUM:

* The recent patch failed to update the argument passing in trans-intrinsic;
it had the changes only in trans-decl and libgfortran/caf.

* In libcaf_single, setting stat to 0 had a bug.
* There were several multi-image bugs in the collectives_2 test.

Additionally, passing an array with vector subscript doesn't make sense 
as the first argument is (at least without image_index) intent(inout) – 
but that's not permitted with vector subscripts.


I will add a check to check.c in a follow-up patch – and also update the 
caf_send/caf_sendget API, as the same applies there and one can remove 
that extra argument.


Committed as Rev. 211816.

Tobias
Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog	(Revision 211815)
+++ gcc/fortran/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,8 @@
+2014-06-19  Tobias Burnus  bur...@net-b.de
+
+	* trans-intrinsic.c (conv_co_minmaxsum): Fix argument
+	passing.
+
 2014-06-18  Tobias Burnus  bur...@net-b.de
 
 	* gfortran.texi (OpenMP): Update refs to OpenMP 4.0.
Index: gcc/fortran/trans-intrinsic.c
===
--- gcc/fortran/trans-intrinsic.c	(Revision 211815)
+++ gcc/fortran/trans-intrinsic.c	(Arbeitskopie)
@@ -8300,13 +8300,11 @@ conv_co_minmaxsum (gfc_code *code)
 gcc_unreachable ();
 
   if (code->resolved_isym->id == GFC_ISYM_CO_SUM)
-fndecl = build_call_expr_loc (input_location, fndecl, 6, array,
-  null_pointer_node, image_index, stat, errmsg,
-  errmsg_len);
+fndecl = build_call_expr_loc (input_location, fndecl, 5, array,
+  image_index, stat, errmsg, errmsg_len);
   else
-fndecl = build_call_expr_loc (input_location, fndecl, 7, array,
-  null_pointer_node, image_index, stat, errmsg,
-  strlen, errmsg_len);
+fndecl = build_call_expr_loc (input_location, fndecl, 6, array, image_index,
+  stat, errmsg, strlen, errmsg_len);
   gfc_add_expr_to_block (block, fndecl);
   gfc_add_block_to_block (block, post_block);
 
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog	(Revision 211815)
+++ gcc/testsuite/ChangeLog	(Arbeitskopie)
@@ -1,9 +1,14 @@
+2014-06-19  Tobias Burnus  bur...@net-b.de
+
+	* gfortran.dg/coarray/collectives_2.f90: Extend
+	and make valid.
+
 2014-06-18  Tom de Vries  t...@codesourcery.com
 
 	* gcc.target/aarch64/fuse-caller-save.c: New test.
 
 2014-06-18  Radovan Obradovic  robrado...@mips.com
-Tom de Vries  t...@codesourcery.com
+	Tom de Vries  t...@codesourcery.com
 
 	* gcc.target/arm/fuse-caller-save.c: New test.
 
Index: gcc/testsuite/gfortran.dg/coarray/collectives_2.f90
===
--- gcc/testsuite/gfortran.dg/coarray/collectives_2.f90	(Revision 211815)
+++ gcc/testsuite/gfortran.dg/coarray/collectives_2.f90	(Arbeitskopie)
@@ -7,7 +7,7 @@ program test
   intrinsic co_max
   intrinsic co_min
   intrinsic co_sum
-  integer :: val(3)
+  integer :: val(3), tmp_val(3)
   integer :: vec(3)
   vec = [2,3,1]
   if (this_image() == 1) then
@@ -21,14 +21,25 @@ program test
   else
 val(3) = 101
   endif
+  tmp_val = val
   call test_min
+  val = tmp_val
   call test_max
+  val = tmp_val
   call test_sum
 contains
   subroutine test_max
-call co_max (val(vec))
-!write(*,*) Maximal value, val
+integer :: tmp
+call co_max (val(::2))
 if (num_images() > 1) then
+  if (any (val /= [42, this_image(), 101])) call abort()
+else
+  if (any (val /= [42, this_image(), -55])) call abort()
+endif
+
+val = tmp_val
+call co_max (val(:))
+if (num_images() > 1) then
   if (any (val /= [42, num_images(), 101])) call abort()
 else
   if (any (val /= [42, num_images(), -55])) call abort()
@@ -40,20 +51,26 @@ contains
 if (this_image() == num_images()) then
   !write(*,*) Minimal value, val
   if (num_images() > 1) then
-if (any (val /= [-99, num_images(), -55])) call abort()
+if (any (val /= [-99, 1, -55])) call abort()
   else
-if (any (val /= [42, num_images(), -55])) call abort()
+if (any (val /= [42, 1, -55])) call abort()
   endif
+else
+  if (any (val /= tmp_val)) call abort()
 endif
   end subroutine test_min
 
   subroutine test_sum
 integer :: n
-call co_sum (val, result_image=1)
+n = 88
+call co_sum (val, result_image=1, stat=n)
+if (n /= 0) call abort()
 if (this_image() == 1) then
   n = num_images()
   !write(*,*) The sum is , val
   if (any (val /= [42 + (n-1)*(-99), (n**2 + n)/2, -55+(n-1)*101])) call abort()
+else
+  if (any (val /= tmp_val)) call abort()
 end if
   end subroutine test_sum
 end program test
Index: libgfortran/ChangeLog
===
--- libgfortran/ChangeLog	(Revision 211815)
+++ 

Re: [PATCH, AARCH64] Enable fuse-caller-save for AARCH64

2014-06-19 Thread Tom de Vries

On 19-06-14 05:53, Richard Henderson wrote:

Do we in fact make sure this isn't an ifunc resolver?  I don't immediately see
how those get wired up in the cgraph...


Richard,

using the patch below I changed the 
gcc/testsuite/gcc.target/i386/fuse-caller-save.c testcase to use an ifunc 
resolver, and observed that the fuse-caller-save optimization didn't work.


The reason the optimization doesn't work in this case is that 
default_binds_local_p_1 checks the ifunc attribute:

...
  /* Weakrefs may not bind locally, even though the weakref itself is always
 static and therefore local.  Similarly, the resolver for ifunc functions
 might resolve to a non-local function.
 FIXME: We can resolve the weakref case more curefuly by looking at the
 weakref alias.  */
  else if (lookup_attribute ("weakref", DECL_ATTRIBUTES (exp))
           || (TREE_CODE (exp) == FUNCTION_DECL
               && lookup_attribute ("ifunc", DECL_ATTRIBUTES (exp))))
    local_p = false;
...

The default_binds_local_p_1 function is used via this path in the optimization:
get_call_reg_set_usage -> get_call_cgraph_rtl_info -> 
decl_binds_to_current_def_p -> default_binds_local_p -> default_binds_local_p_1.


Thanks,
- Tom


diff --git a/gcc/testsuite/gcc.target/i386/fuse-caller-save.c b/gcc/testsuite/gcc.target/i386/fuse-caller-save.c
index 4ec4995..012dc12 100644
--- a/gcc/testsuite/gcc.target/i386/fuse-caller-save.c
+++ b/gcc/testsuite/gcc.target/i386/fuse-caller-save.c
@@ -5,11 +5,18 @@
 /* Testing -fuse-caller-save optimization option.  */
 
 static int __attribute__((noinline))
-bar (int x)
+my_bar (int x)
 {
   return x + 3;
 }
 
+static void (*resolve_bar (void)) (void)
+{
+  return (void*) my_bar;
+}
+
+static int __attribute__((noinline)) __attribute__((ifunc ("resolve_bar"))) bar (int x);
+
 int __attribute__((noinline))
 foo (int y)
 {
-- 
1.9.1



Re: C++ PATCH for c++/59296 (rvalue object and lvalue ref-qualifier)

2014-06-19 Thread Jason Merrill

On 06/19/2014 12:12 AM, Jason Merrill wrote:

We were treating a const & member function like a normal const
reference, and binding an rvalue object argument to it.  But it doesn't
work that way.


In 4.9 we also need to set LOOKUP_NO_TEMP_BIND.



commit 48ca9803695872d984b0f4efa56f7f58987d0928
Author: jason jason@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Wed Jun 18 22:13:51 2014 +

	PR c++/59296
	* call.c (add_function_candidate): Set LOOKUP_NO_RVAL_BIND
	|LOOKUP_NO_TEMP_BIND for ref-qualifier handling.

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 2c3d4ac..fc65b97 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -1999,6 +1999,9 @@ add_function_candidate (struct z_candidate **candidates,
 		 object parameter has reference type.  */
 		  bool rv = FUNCTION_RVALUE_QUALIFIED (TREE_TYPE (fn));
 		  parmtype = cp_build_reference_type (parmtype, rv);
+		  /* Don't bind an rvalue to a const lvalue ref-qualifier.  */
+		  if (!rv)
+		lflags |= LOOKUP_NO_RVAL_BIND|LOOKUP_NO_TEMP_BIND;
 		}
 	  else
 		{
diff --git a/gcc/testsuite/g++.dg/cpp0x/ref-qual15.C b/gcc/testsuite/g++.dg/cpp0x/ref-qual15.C
new file mode 100644
index 000..ca333c2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/ref-qual15.C
@@ -0,0 +1,13 @@
+// PR c++/59296
+// { dg-do compile { target c++11 } }
+
+struct Type
+{
+  void get() const& { }
+  void get() const&& { }
+};
+
+int main()
+{
+  Type{}.get();
+}


[PATCH][MIPS] Enable load-load/store-store bonding

2014-06-19 Thread Sameera Deshpande
Hi Richard,

Please find attached the patch implementing load-load/store-store bonding 
supported by P5600.

In P5600, two consecutive loads/stores of the same type which access contiguous 
memory locations are bonded together by the instruction issue unit to dispatch 
a single load/store instruction which accesses both locations. This allows 2X 
improvement in memory intensive code. This optimization can be performed for 
LH, SH, LW, SW, LWC, SWC, LDC, SDC instructions.

This patch adds peephole2 patterns to identify such loads/stores and put them 
in parallel, so that the scheduler will not split them, thereby guaranteeing h/w 
level load/store bonding.
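
As a rough illustration of the source-level shape this targets, here is a minimal sketch (the function names are made up for this example and are not part of the patch). Each function performs two same-width accesses to adjacent addresses; whether a given compiler actually emits them as a bonded pair depends on the target, the options and the scheduler, so this only shows the candidate pattern.

```cpp
#include <cassert>

// Two loads of the same width from adjacent addresses: a candidate
// for load-load bonding on a core that supports it.
int sum_adjacent(const int *p)
{
  return p[0] + p[1];
}

// Two stores of the same width to adjacent addresses: a candidate
// for store-store bonding.
void store_adjacent(int *p, int a, int b)
{
  p[0] = a;
  p[1] = b;
}

// Small self-check of the two helpers above.
void check_adjacent_accesses()
{
  int buf[2] = {3, 4};
  assert(sum_adjacent(buf) == 7);
  store_adjacent(buf, 10, 20);
  assert(buf[0] == 10 && buf[1] == 20);
}
```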

The patch is tested with dejagnu for correctness.
Local performance testing on hardware is currently in progress.
Ok for trunk? 

Changelog:
gcc/
* config/mips/mips.md (JOINLDST1): New mode iterator.
(insn_type): New mode attribute.
(reg): Update mode attribute.
(join2_load_Store<JOINLDST1:mode>): New pattern.
(join2_loadhi): Likewise.
(join2_storehi): Likewise.
(define_peephole2): Add peephole2 patterns to join 2 HI/SI/SF/DF-mode
load-load and store-stores.
* config/mips/mips.opt (mld-st-pairing): New option.
* config/mips/mips.c (mips_option_override): New exception.
* config/mips/mips.h (ENABLE_LD_ST_PAIRING): New macro.

- Thanks and regards,
   Sameera D.



load-store-pairing.patch
Description: load-store-pairing.patch


Re: [PATCH, cprop] Check rtx_cost when propagating constant

2014-06-19 Thread Zhenqiang Chen
On 17 June 2014 17:42, Zhenqiang Chen zhenqiang.c...@linaro.org wrote:
 On 17 June 2014 16:15, Richard Biener richard.guent...@gmail.com wrote:
 On Tue, Jun 17, 2014 at 4:11 AM, Zhenqiang Chen
 zhenqiang.c...@linaro.org wrote:
 Hi,

 For some large constants, ports like ARM need more than one instruction to
 operate on them, e.g.

 #define MASK 0xfe00ff
 void maskdata (int * data, int len)
 {
int i = len;
for (; i > 0; i -= 2)
 {
   data[i] &= MASK;
   data[i + 1] &= MASK;
 }
 }

 Need two instructions for each AND operation:

 and r3, r3, #16711935
 bic r3, r3, #65536
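
 The arithmetic behind that two-instruction sequence can be checked with a small sketch (the helper names are invented for this example; only the three constants come from the message): ANDing with 16711935 (0x00ff00ff) and then clearing bit 16 (65536) gives the same result as a single AND with 0xfe00ff.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t MASK = 0xfe00ff;

// One-step form: what the source code asks for.
constexpr std::uint32_t and_mask(std::uint32_t x)
{
  return x & MASK;
}

// Two-step form matching the listing: AND with 16711935, then clear
// the bit set in 65536 (which is what a bit-clear instruction does).
constexpr std::uint32_t and_then_bic(std::uint32_t x)
{
  return (x & 16711935u) & ~std::uint32_t{65536};
}

static_assert(16711935u == 0x00ff00ffu, "first immediate is 0x00ff00ff");
static_assert((0x00ff00ffu & ~0x10000u) == MASK, "the two steps compose to MASK");

// Runtime spot checks on a few values.
void check_mask_split()
{
  const std::uint32_t samples[] = {0u, 0xffffffffu, 0x12345678u, 0x00010000u};
  for (std::uint32_t x : samples)
    assert(and_mask(x) == and_then_bic(x));
}
```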

 If we keep the MASK in a register, loop2_invariant pass can hoist it
 out the loop. And it can be shared by different references.

 So the patch skips constant propagation if it makes INSN's cost higher.

 So cprop undoes invariant motion's work here?

 Yes. GLOBAL CONST-PROP will undo invariant motion.

 Should we make sure we add a REG_EQUAL note when not propagating?

 Logs show there is already a REG_EQUAL note.

 Bootstrapped with no make check regressions on X86-64 and ARM Chromebook.

 OK for trunk?

 Thanks!
 -Zhenqiang

 ChangeLog:
 2014-06-17  Zhenqiang Chen  zhenqiang.c...@linaro.org

 * cprop.c (try_replace_reg): Check cost for constants.

 diff --git a/gcc/cprop.c b/gcc/cprop.c
 index aef3ee8..c9cf02a 100644
 --- a/gcc/cprop.c
 +++ b/gcc/cprop.c
 @@ -733,6 +733,14 @@ try_replace_reg (rtx from, rtx to, rtx insn)
rtx src = 0;
int success = 0;
rtx set = single_set (insn);
 +  int old_cost = 0;
 +  bool copy_p = false;
 +  bool speed = optimize_bb_for_speed_p (BLOCK_FOR_INSN (insn));
 +
 +  if (set && SET_SRC (set) && REG_P (SET_SRC (set)))
 +copy_p = true;
 +  else
 +old_cost = set_rtx_cost (set, speed);

 Looks bogus for set == NULL?

 set_rtx_cost has checked it. If it is NULL, the function will return 0;

 Also what about register pressure?

 Do you think it has big register pressure impact? I think it does not
 increase register pressure.

 I think this kind of change needs wider testing as RTX costs are
 usually not fully implemented and you introduce a new use kind
 (or is it already used elsewhere in this way to compute cost
 difference of a set with s/reg/const?).

 Passes like fwprop, cse and auto_inc_dec use RTX costs to make such
 decisions, e.g. function attempt_change in auto-inc-dec.c has
 code segments like:

   old_cost = (set_src_cost (mem, speed)
   + set_rtx_cost (PATTERN (inc_insn.insn), speed));
   new_cost = set_src_cost (mem_tmp, speed);
   ...
   if (old_cost < new_cost)
 {
   ...
   return false;
 }

 The usage of RTX costs in this patch is similar.

 I had run X86-64 bootstrap and regression tests with
 --enable-languages=c,c++,lto,fortran,go,ada,objc,obj-c++,java

 And ARM bootstrap and regression tests with
 --enable-languages=c,c++,fortran,lto,objc,obj-c++

 I will run tests on i686. What other tests do you think I have to run?

 What kind of performance difference do you see?

 I ran coremark, dhrystone and eembc on ARM Cortex-M4 (with some arm
 backend changes). Coremark with some options shows a 10% performance
 improvement. dhrystone is a little better. There is some fluctuation in
 eembc, but the overall result is better.

 I will run spec2000 on X86-64 and ARM, and get back to you about the
 performance changes.

Please ignore my previous comments about Cortex-M4 performance, since
they were not based on clean code.

Here is a summary for performance result on X86-64 and ARM.

For X86-64, I ran SPEC2000 INT and FP (-O3). There is no improvement
or regression. As a test, I moved the code segment to the end of function
try_replace_reg and checked insns which meet success && new_cost >
old_cost. Logs show only 52 occurrences across the whole SPEC2000 build,
and only one instruction pattern, *adddi_1, is affected. For *adddi_1,
rtx_cost increases from 8 to 10 when changing a register operand to a
constant.
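
The decision rule being discussed can be modelled with a toy sketch. This is not GCC code: the struct and function names are hypothetical, and the costs are plain integers standing in for what set_rtx_cost would return before and after the replacement.

```cpp
#include <cassert>

// Hypothetical stand-in for the cost of an insn before and after
// replacing a register operand with a constant.
struct CostPair
{
  int old_cost;
  int new_cost;
};

// Keep the propagated constant unless it makes the insn more expensive.
bool keep_constant_propagation(CostPair c)
{
  return c.new_cost <= c.old_cost;
}

// Spot checks, including the 8 -> 10 figures quoted for *adddi_1.
void check_cost_rule()
{
  assert(!keep_constant_propagation(CostPair{8, 10}));
  assert(keep_constant_propagation(CostPair{8, 8}));
  assert(keep_constant_propagation(CostPair{8, 4}));
}
```

With the *adddi_1 figures above (8 before, 10 after), this rule would reject the replacement and leave the constant in a register.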

For ARM Cortex-M4, there are minimal changes for Coremark, Dhrystone and EEMBC.
For ARM Chromebook (Cortex-A15), there is some fluctuation in the SPEC2000 INT
tests, but the final result does not show improvement or regression.

The patch is updated to remove the bogus code and keep more constants.

Bootstrapped with no make check regressions on X86-64, i686 and ARM.

diff --git a/gcc/cprop.c b/gcc/cprop.c
index aef3ee8..6ea6be0 100644
--- a/gcc/cprop.c
+++ b/gcc/cprop.c
@@ -733,6 +733,28 @@ try_replace_reg (rtx from, rtx to, rtx insn)
   rtx src = 0;
   int success = 0;
   rtx set = single_set (insn);
+  int old_cost = 0;
+  bool const_p = false;
+  bool speed = optimize_bb_for_speed_p (BLOCK_FOR_INSN (insn));
+
+  if (set && SET_SRC (set))
+{
+  rtx src = SET_SRC (set);
+  if (REG_P (src) || GET_CODE (src) == SUBREG)
+const_p = true;
+  else
+   {
+ if (note != 0
+  && REG_NOTE_KIND (note) == REG_EQUAL
+  && (GET_CODE (XEXP (note, 0)) == CONST
+ || CONSTANT_P (XEXP (note, 0))))
+   {
+ const_p = true;
+   

Re: [PATCH, cpp] Fix line directive bug

2014-06-19 Thread Dodji Seketeli
Hello Nicholas,

First of all, thank you for taking the time to dive into this code and
provide such a detailed analysis along with a patch.  This is
appreciated.

Please find below my comments to some parts of your message.

Nicholas Ormrod nicholas.orm...@hotmail.com wrote:

 PR preprocessor/60723
 
 Description:
 
 When line directives are inserted into the expansion of a macro, the line
 directive may erroneously specify the file as being a system file. This
 causes certain warnings to be suppressed in the rest of the file.

Agreed.

 The fact that line directives are ever inserted into a macro is itself a
 half-bug. Please see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60723 for
 full details.

We could discuss that, but I might slightly disagree.  I would rather
say that it's the way the directives are inserted that is a bug.  But
let's put this topic aside for now.

To ease the discussion at hand, let me paste here the test case that you
submitted to the bugzilla:

$ cat inc.h
#define FOO() static const char * F = __FILE__ ;
$ cat src.cpp 
#include inc.h
FOO(
)

int main() {
  int z = 1 / 0;
  return z;
}
$
$ g++ -E -isystem . src.cpp 
# 1 "src.cpp"
# 1 "<interne>"
# 1 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 1 "<command-line>" 2
# 1 "src.cpp"
# 1 "./inc.h" 1 3 4
# 2 "src.cpp" 2
static const char * F =
 "src.cpp"
# 2 "src.cpp" 3 4
 ;


int main() {
  int z = 1 / 0;
  return z;
}
$ 

So when compiling the resulting pre-processed file, the warning
concerning the division by zero is not emitted, and that is a bug, I
agree with you.

 Patch:
 
 Information for locations is, for similar pieces of data, read from a
 LOCATION_* macro. The sysp read which was causing the error was using an
 inconsistent method to read the data. Resolving this is a two-line
 fix.

Maybe I am missing something, but my understanding seems to differ
here, sorry.

I think that we really want to be able to say if the macro FOO that got
expanded in the file src.cpp was actually *defined* in a system header
or not.  That is, if the macro is a system macro[1] or not.  Because the
expansion of a system macro should (generally) not emit a warning, even
when that expansion occurs in a file (src.cpp in your example from
bugzilla) that is not a system file.

So in that case we want to suppress the potential warnings that arise
from the macro expansion.

But the issue here is, I think, that (in src.cpp) we consider the tokens
resulting from the expansion of the macro FOO as being system tokens[2]
(and rightly so) and *also* that all the subsequent tokens of the
src.cpp file are being system tokens; and this is wrong.

So, I would tend to think that a potential proper fix would emit a
subsequent line directive after the lines:

# 2 "src.cpp" 3 4
 ;

So that the pre-processed file looks like:

$ g++ -E -isystem . src.cpp 
# 1 "src.cpp"
# 1 "<interne>"
# 1 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 1 "<command-line>" 2
# 1 "src.cpp"
# 1 "./inc.h" 1 3 4
# 2 "src.cpp" 2
static const char * F =
 "src.cpp"
# 2 "src.cpp" 3 4
 ;
# 3 "src.cpp" 4


int main() {
  int z = 1 / 0;
  return z;
}
$ 

Note the additional line directive # 3 "src.cpp" 4 that doesn't
mention the '3' flag and thus says that the rest of the tokens are
*not* system tokens.
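
To make the role of the '3' flag concrete, here is a minimal sketch of how a consumer of the pre-processed output could classify a line directive. The function name is made up for this example, and the parser is deliberately simplistic: it assumes the file name contains no spaces and does not validate the other flags.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Returns true if a line directive of the form
//   # <line> <file> [flags...]
// carries flag 3, i.e. marks the following text as system-header text.
bool marks_system_header(const std::string &directive)
{
  std::istringstream in(directive);
  std::string hash, file;
  long lineno = 0;
  if (!(in >> hash >> lineno >> file) || hash != "#")
    return false;               // not a line directive we understand

  int flag = 0;
  while (in >> flag)
    if (flag == 3)
      return true;
  return false;
}

// Spot checks using directives from the output shown above.
void check_line_directives()
{
  assert(marks_system_header("# 2 \"src.cpp\" 3 4"));
  assert(!marks_system_header("# 3 \"src.cpp\" 4"));
  assert(!marks_system_header("int main() {"));
}
```

Under this reading, everything after `# 2 "src.cpp" 3 4` is treated as system text until a later directive without the '3' flag appears, which is what the proposed extra directive provides.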

What do you think?

[1]: A system macro is a macro defined in a system header.
[2]: A system token is a token coming from the expansion of a system macro

Cheers,

-- 
Dodji



Re: Fwd: [RFC][gomp4] Offloading patches (2/3): Add tables generation

2014-06-19 Thread Ilya Verbin
On 18 Jun 16:22, Bernd Schmidt wrote:
 What I think you need to do is
 For the first compiler:
 --enable-as-accelerator-for=x86_64-pc-linux-gnu
 --target=x86_64-intelmic-linux-gnu --prefix=/somewhere
 
 No --enable-accelerator options at all. This should work, if it
 doesn't let me know what you find in /somewhere after installation
 for both compilers.

It doesn't work without --enable-accelerator:

--enable-as-accelerator-for requires --enable-accelerator
make[1]: *** [configure-gcc] Error 1

  -- Ilya


[fortran,patch] One-line fix to PR61454 (init expression simplification)

2014-06-19 Thread FX
In expr.c:scalarize_intrinsic_call(), we don't deal correctly with intrinsics 
that have an optional kind argument, while simplifying initialization 
expressions. The attached one-line patch fixes it, and adds a testcase so we 
don’t regress.

Bootstrapped and regtested on x86_64-apple-darwin13.
OK to commit?

FX




pr61454.diff
Description: Binary data


pr61454.ChangeLog
Description: Binary data


Re: [fortran,patch] One-line fix to PR61454 (init expression simplification)

2014-06-19 Thread Paul Richard Thomas
Dear FX,

Not only is it 'obvious' but it can do no harm in any circumstances
:-)  OK to commit

Thanks

Paul

On 19 June 2014 13:14, FX fxcoud...@gmail.com wrote:
 In expr.c:scalarize_intrinsic_call(), we don't deal correctly with intrinsics 
 that have an optional kind argument, while simplifying initialization 
 expressions. The attached one-line patch fixes it, and adds a testcase so we 
 don’t regress.

 Bootstrapped and regtested on x86_64-apple-darwin13.
 OK to commit?

 FX





-- 
The knack of flying is learning how to throw yourself at the ground and miss.
   --Hitchhikers Guide to the Galaxy


Re: C++ PATCH for c++/59296 (rvalue object and lvalue ref-qualifier)

2014-06-19 Thread Marc Glisse

On Thu, 19 Jun 2014, Jason Merrill wrote:

We were treating a const & member function like a normal const reference, and 
binding an rvalue object argument to it.  But it doesn't work that way.


That looks weird to me. The const&& version is a better match than the 
const&, so we should pick that one in overload resolution, but if we 
remove the const&& version, the other one seems valid to me, we shouldn't 
reject this program:


struct Type
{
  void get() const& { }
};

int main()
{
  Type{}.get();
}

At least both clang and intel accept it, and I hope the standard didn't 
make it behave differently from regular function calls.
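
The two cases can be put side by side in a small sketch (the type and function names are invented for this example and are not from the testsuite). With both overloads present, an rvalue object selects the const&& one; with only the const& overload, the call on an rvalue is expected to compile as well, which matches what clang accepts.

```cpp
#include <cassert>

// Hypothetical types for illustration only.
struct Both
{
  int get() const&  { return 1; }  // selected when the object is an lvalue
  int get() const&& { return 2; }  // selected when the object is an rvalue
};

struct LvalueRefOnly
{
  int get() const& { return 1; }   // the only candidate
};

int call_on_lvalue() { Both b; return b.get(); }
int call_on_rvalue() { return Both{}.get(); }
int call_only_candidate_on_rvalue() { return LvalueRefOnly{}.get(); }

// Spot checks of which overload each call selects.
void check_ref_qualifiers()
{
  assert(call_on_lvalue() == 1);
  assert(call_on_rvalue() == 2);
  assert(call_only_candidate_on_rvalue() == 1);
}
```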


--
Marc Glisse


Re: [PATCH, PR 61540] Do not ICE on impossible devirtualization

2014-06-19 Thread Martin Jambor
Hi,

On Wed, Jun 18, 2014 at 06:12:34PM +0200, Bernhard Reutner-Fischer wrote:
 On 18 June 2014 10:24:16 Martin Jambor mjam...@suse.cz wrote:
 
 @@ -3002,10 +3014,8 @@ try_make_edge_direct_virtual_call (struct
 cgraph_edge *ie,
 
if (target)
  {
 -#ifdef ENABLE_CHECKING
 -  gcc_assert (possible_polymorphic_call_target_p
 - (ie, cgraph_get_node (target)));
 -#endif
 +  if (!possible_polymorphic_call_target_p (ie, cgraph_get_node 
 (target)))
 +return ipa_make_edge_direct_to_target (ie, target);
return ipa_make_edge_direct_to_target (ie, target);
  }
 
 The above looks odd. You return the same thing both conditionally
 and unconditionally?
 

You are obviously right, apparently I was too tired to attempt to work
that night.  Thanks for spotting it.  The following patch has this
corrected and it also passes bootstrap and testing on x86_64-linux on
both the trunk and the 4.9 branch. OK for both?

Thanks,

Martin


2014-06-19  Martin Jambor  mjam...@suse.cz

PR ipa/61540
* ipa-prop.c (impossible_devirt_target): New function.
(try_make_edge_direct_virtual_call): Use it, also instead of
asserting.

testsuite/
* g++.dg/ipa/pr61540.C: New test.

diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index b67deed..d9dca52 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -2912,6 +2912,29 @@ try_make_edge_direct_simple_call (struct cgraph_edge *ie,
   return cs;
 }
 
+/* Return the target to be used in cases of impossible devirtualization.  IE
+   and target (the latter can be NULL) are dumped when dumping is enabled.  */
+
+static tree
+impossible_devirt_target (struct cgraph_edge *ie, tree target)
+{
+  if (dump_file)
+{
+  if (target)
+   fprintf (dump_file,
+"Type inconsident devirtualization: %s/%i->%s\n",
+ie->caller->name (), ie->caller->order,
+IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (target)));
+  else
+   fprintf (dump_file,
+"No devirtualization target in %s/%i\n",
+ie->caller->name (), ie->caller->order);
+}
+  tree new_target = builtin_decl_implicit (BUILT_IN_UNREACHABLE);
+  cgraph_get_create_node (new_target);
+  return new_target;
+}
+
 /* Try to find a destination for indirect edge IE that corresponds to a virtual
call based on a formal parameter which is described by jump function JFUNC
and if it can be determined, make it direct and return the direct edge.
@@ -2946,15 +2969,7 @@ try_make_edge_direct_virtual_call (struct cgraph_edge 
*ie,
DECL_FUNCTION_CODE (target) == BUILT_IN_UNREACHABLE)
  || !possible_polymorphic_call_target_p
   (ie, cgraph_get_node (target)))
-   {
- if (dump_file)
-   fprintf (dump_file,
-"Type inconsident devirtualization: %s/%i->%s\n",
-ie->caller->name (), ie->caller->order,
-IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (target)));
- target = builtin_decl_implicit (BUILT_IN_UNREACHABLE);
- cgraph_get_create_node (target);
-   }
+   target = impossible_devirt_target (ie, target);
  return ipa_make_edge_direct_to_target (ie, target);
}
}
@@ -2984,10 +2999,7 @@ try_make_edge_direct_virtual_call (struct cgraph_edge 
*ie,
   if (targets.length () == 1)
target = targets[0]->decl;
   else
-   {
-  target = builtin_decl_implicit (BUILT_IN_UNREACHABLE);
- cgraph_get_create_node (target);
-   }
+   target = impossible_devirt_target (ie, NULL_TREE);
 }
   else
 {
@@ -3002,10 +3014,8 @@ try_make_edge_direct_virtual_call (struct cgraph_edge 
*ie,
 
   if (target)
 {
-#ifdef ENABLE_CHECKING
-  gcc_assert (possible_polymorphic_call_target_p
-(ie, cgraph_get_node (target)));
-#endif
+  if (!possible_polymorphic_call_target_p (ie, cgraph_get_node (target)))
+   target = impossible_devirt_target (ie, target);
   return ipa_make_edge_direct_to_target (ie, target);
 }
   else
diff --git a/gcc/testsuite/g++.dg/ipa/pr61540.C 
b/gcc/testsuite/g++.dg/ipa/pr61540.C
new file mode 100644
index 000..d298964
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ipa/pr61540.C
@@ -0,0 +1,41 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fno-early-inlining -fdump-ipa-cp" } */
+
+struct data {
+  data(int) {}
+};
+
+struct top {
+  virtual int topf() {}
+};
+
+struct intermediate: top {
+int topf() /* override */ { return 0; }
+};
+
+struct child1: top {
+void childf()
+{
+data d(topf());
+}
+};
+
+struct child2: intermediate {};
+
+void test(top& t)
+{
+child1& c = static_cast<child1&>(t);
+c.childf();
+child2 d;
+test(d);
+}
+
+int main (int argc, char **argv)
+{
+  child1 c;
+  test (c);
+  return 0;
+}
+
+/* { dg-final { scan-ipa-dump Type inconsident devirtualization cp 

Re: [fortran,patch] One-line fix to PR61454 (init expression simplification)

2014-06-19 Thread FX
 Not only is it 'obvious' but it can do no harm in any circumstances
 :-)  OK to commit

True! Committed as rev. 211822

FX


[PATCH AArch64 0/2] PR/60825 Make {int,uint,float}64x1_t in arm_neon.h a proper vector type

2014-06-19 Thread Alan Lawrence
According to the ARM C Language Extensions the 64x1 types should all be passed 
in the SIMD registers rather than GPRs, and should not be assignment-compatible 
with [u]int64_t / float64_t (as they are at present). These two patches (first 
for float64x1_t, second for [u]int64x1_t) make these types into vector types as 
per GNU vector extensions.
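
As a minimal sketch of what "a vector type as per GNU vector extensions" means here, the following uses a made-up typedef as a stand-in for float64x1_t (the real definition lives in arm_neon.h and may differ). It needs a compiler that supports the vector_size attribute, such as GCC or clang, and only demonstrates that the one-lane vector is a distinct type whose element is read with subscripting.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for float64x1_t: one 64-bit lane of double.
typedef double f64x1_sketch __attribute__ ((vector_size (8)));

static_assert(sizeof (f64x1_sketch) == sizeof (double),
              "one 64-bit lane");
static_assert(!std::is_same<f64x1_sketch, double>::value,
              "distinct from the scalar type");

// Build a one-lane vector from a scalar.
f64x1_sketch make_f64x1(double d)
{
  f64x1_sketch v = { d };
  return v;
}

// Read the single lane back with vector subscripting.
double first_lane(f64x1_sketch v)
{
  return v[0];
}

// Round-trip check.
void check_f64x1()
{
  assert(first_lane(make_f64x1(1.5)) == 1.5);
}
```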


In the int64x1 patch I also fix the type signatures of the many scalar 
(d_s64/d_u64) intrinsics, which had previously used int64x1_t in place of 
int64_t (the two previously having been indistinguishable).


I expect these to backport to 4.9 straightforwardly...

Ok for trunk?

--Alan



Re: [PATCH, AARCH64] Enable fuse-caller-save for AARCH64

2014-06-19 Thread Tom de Vries

On 19-06-14 05:21, Richard Henderson wrote:

On 06/01/2014 03:00 AM, Tom de Vries wrote:

+/* Emit call insn with PAT and do aarch64-specific handling.  */
+
+bool
+aarch64_emit_call_insn (rtx pat)
+{
+  rtx insn = emit_call_insn (pat);
+
+  rtx *fusage = CALL_INSN_FUNCTION_USAGE (insn);
+  clobber_reg (fusage, gen_rtx_REG (word_mode, IP0_REGNUM));
+  clobber_reg (fusage, gen_rtx_REG (word_mode, IP1_REGNUM));
+}
+


Which can't have been bootstrapped, since this has no return stmt.
Why the bool return type anyway?  Nothing appears to use it.



Richard,

Indeed, the return type should be void, this patch fixes that.

I have no setup to bootstrap this on aarch64. I've built an aarch64 compiler and 
run the gcc.target/aarch64/fuse-caller-save.c testcase.


Committed as obvious.

Thanks,
- Tom
2014-06-19  Tom de Vries  t...@codesourcery.com

	* config/aarch64/aarch64-protos.h (aarch64_emit_call_insn): Change
	return type to void.
	* config/aarch64/aarch64.c (aarch64_emit_call_insn): Same.

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 213c8dc..53023ba 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -245,7 +245,7 @@ void aarch64_init_cumulative_args (CUMULATIVE_ARGS *, const_tree, rtx,
 void aarch64_init_expanders (void);
 void aarch64_print_operand (FILE *, rtx, char);
 void aarch64_print_operand_address (FILE *, rtx);
-bool aarch64_emit_call_insn (rtx);
+void aarch64_emit_call_insn (rtx);
 
 /* Initialize builtins for SIMD intrinsics.  */
 void init_aarch64_simd_builtins (void);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index b2d005b..f0aafbd 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3395,7 +3395,7 @@ aarch64_fixed_condition_code_regs (unsigned int *p1, unsigned int *p2)
 
 /* Emit call insn with PAT and do aarch64-specific handling.  */
 
-bool
+void
 aarch64_emit_call_insn (rtx pat)
 {
   rtx insn = emit_call_insn (pat);
-- 
1.9.1



[PATCH AArch64 1/2] PR/60825 Make float64x1_t in arm_neon.h a proper vector type

2014-06-19 Thread Alan Lawrence
This updates the .md files to generate V1DFmode patterns instead of DFmode for 
create and reinterpret, and the corresponding __builtins.


The various other float64x1_t intrinsics can then be rewritten, generally I've 
tried to use gcc vector extensions rather than unnecessary/custom builtins where 
possible, and have started adding some range checking using 
__builtin_aarch64_im_lane_boundsi.


Finally, rewrite the cases in arm_neon.h and various tests, that relied on 
float64[x1]_t being assignment-compatible, including arm_neon.h vfma functions 
which had the wrong (but previously equivalent) type signature; and add some new 
ABI tests.


gcc/ChangeLog:
2014-06-19  Alan Lawrence  alan.lawre...@arm.com

* config/aarch64/aarch64.c (aarch64_simd_mangle_map): Add entry for
V1DFmode.
* config/aarch64/aarch64-builtins.c (aarch64_simd_builtin_type_mode):
Add V1DFmode.
(BUILTIN_VD1): New.
(BUILTIN_VD_RE): Remove.
(aarch64_init_simd_builtins): Add V1DF to modes/modenames.
(aarch64_fold_builtin): Update reinterpret patterns, df becomes v1df.
* config/aarch64/aarch64-simd-builtins.def (create): Make a v1df
variant but not df.
(vreinterpretv1df*, vreinterpret*v1df): New.
(vreinterpretdf*, vreinterpret*df): Remove.
* config/aarch64/aarch64-simd.md (aarch64_create, aarch64_reinterpret*):
Generate V1DFmode pattern not DFmode.
* config/aarch64/iterators.md (VD_RE): Include V1DF, remove DF.
(VD1): New.
* config/aarch64/arm_neon.h (float64x1_t): typedef with gcc extensions.
(vcreate_f64): Remove cast, use v1df builtin.
(vcombine_f64): Remove cast, get elements with gcc vector extensions.
(vget_low_f64, vabs_f64, vceq_f64, vceqz_f64, vcge_f64, vcgez_f64,
vcgt_f64, vcgtz_f64, vcle_f64, vclez_f64, vclt_f64, vcltz_f64,
vdup_n_f64, vdupq_lane_f64, vld1_f64, vld2_f64, vld3_f64, vld4_f64,
vmov_n_f64, vst1_f64): Use gcc vector extensions.
(vget_lane_f64, vdupd_lane_f64, vmulq_lane_f64): Use gcc extensions,
add range check using __builtin_aarch64_im_lane_boundsi.
(vfma_lane_f64, vfmad_lane_f64, vfma_laneq_f64, vfmaq_lane_f64,
vfms_lane_f64, vfmsd_lane_f64, vfms_laneq_f64, vfmsq_lane_f64): Fix
type signature, use gcc vector extensions.
(vreinterpret_p8_f64, vreinterpret_p16_f64, vreinterpret_f32_f64,
vreinterpret_f64_f32, vreinterpret_f64_p8, vreinterpret_f64_p16,
vreinterpret_f64_s8, vreinterpret_f64_s16, vreinterpret_f64_s32,
vreinterpret_f64_s64, vreinterpret_f64_u8, vreinterpret_f64_u16,
vreinterpret_f64_u32, vreinterpret_f64_u64, vreinterpret_s8_f64,
vreinterpret_s16_f64, vreinterpret_s32_f64, vreinterpret_s64_f64,
vreinterpret_u8_f64, vreinterpret_u16_f64, vreinterpret_u32_f64,
vreinterpret_u64_f64): Use v1df builtin not df.

gcc/testsuite/ChangeLog:
2014-06-19  Alan Lawrence  alan.lawre...@arm.com

* g++.dg/abi/mangle-neon-aarch64.C: Also test mangling of float64x1_t.
* gcc.target/aarch64/aapcs/test_64x1_1.c: New test.
* gcc.target/aarch64/aapcs/func-ret-64x1_1.c: New test.
* gcc.target/aarch64/simd/ext_f64_1.c (main): Compare vector elements.
* gcc.target/aarch64/vadd_f64.c: Rewrite with macro to use vector types.
* gcc.target/aarch64/vsub_f64.c: Likewise.
* gcc.target/aarch64/vdiv_f.c (INDEX*, RUN_TEST): Remove indexing scheme
as now the same for all variants.
* gcc.target/aarch64/vrnd_f64_1.c (compare_f64): Return float64_t not
float64x1_t.
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index fe4d39283b05f244b400f62d4e44097f51b237d7..51407cbef59e0135a897ccdf4224b847dccdad88 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -53,6 +53,7 @@ enum aarch64_simd_builtin_type_mode
   T_V4HI,
   T_V2SI,
   T_V2SF,
+  T_V1DF,
   T_DI,
   T_DF,
   T_V16QI,
@@ -76,6 +77,7 @@ enum aarch64_simd_builtin_type_mode
 #define v4hi_UP  T_V4HI
 #define v2si_UP  T_V2SI
 #define v2sf_UP  T_V2SF
+#define v1df_UP  T_V1DF
 #define di_UP    T_DI
 #define df_UP    T_DF
 #define v16qi_UP T_V16QI
@@ -346,6 +348,8 @@ aarch64_types_storestruct_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   VAR2 (T, N, MAP, v8qi, v16qi)
 #define BUILTIN_VD(T, N, MAP) \
   VAR4 (T, N, MAP, v8qi, v4hi, v2si, v2sf)
+#define BUILTIN_VD1(T, N, MAP) \
+  VAR5 (T, N, MAP, v8qi, v4hi, v2si, v2sf, v1df)
 #define BUILTIN_VDC(T, N, MAP) \
   VAR6 (T, N, MAP, v8qi, v4hi, v2si, v2sf, di, df)
 #define BUILTIN_VDIC(T, N, MAP) \
@@ -380,8 +384,6 @@ aarch64_types_storestruct_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   VAR3 (T, N, MAP, v8qi, v4hi, v2si)
 #define BUILTIN_VD_HSI(T, N, MAP) \
   VAR2 (T, N, MAP, v4hi, v2si)
-#define BUILTIN_VD_RE(T, N, MAP) \
-  VAR6 (T, N, MAP, v8qi, v4hi, v2si, v2sf, di, df)
 #define BUILTIN_VQ(T, N, MAP) \
  

RE: [PATCH, Cilk+, PR57541] Additional fix for issues with array notations

2014-06-19 Thread Zamyatin, Igor
 On 06/16/14 14:13, Zamyatin, Igor wrote:
  Hi All!
 
  The patch fixes an ICE in array notation for the cases of incorrect arguments
  of Cilk+ builtins and an undeclared initial index.
 
  Is it ok for trunk and 4.9?
 
  Thanks,
  Igor
 
  diff --git a/gcc/c/ChangeLog b/gcc/c/ChangeLog
  index 54d0de7..56e1b0b 100644
  --- a/gcc/c/ChangeLog
  +++ b/gcc/c/ChangeLog
  @@ -1,3 +1,12 @@
  +2014-06-16  Igor Zamyatin  igor.zamya...@intel.com
  +
  +   PR middle-end/57541
  +   * c-array-notation.c (fix_builtin_array_notation_fn):
  +   Check for 0 arguments in builtin call. Check that builtin argument is
  +   correct.
  +   * c-parser.c (c_parser_array_notation): Check for incorrect initial
  +   index.
 Shouldn't this have been caught earlier?  ISTM we should be catching any
 argument mix-ups during parsing?!?  Is there some reason we don't do
 that?

But the call stack for fix_builtin_array_notation_fn comes from c-parser...

Thanks,
Igor

 
 jeff



Re: -fuse-caller-save - Collect register usage information

2014-06-19 Thread Tom de Vries

On 19-06-14 07:13, Richard Henderson wrote:

On 05/19/2014 07:30 AM, Tom de Vries wrote:

+  for (insn = get_insns (); insn != NULL_RTX; insn = next_insn (insn))
+{
+  HARD_REG_SET insn_used_regs;
+
+  if (!NONDEBUG_INSN_P (insn))
+   continue;
+
+  find_all_hard_reg_sets (insn, insn_used_regs, false);
+
+  if (CALL_P (insn)
+  && !get_call_reg_set_usage (insn, insn_used_regs, call_used_reg_set))
+   {
+ CLEAR_HARD_REG_SET (node->function_used_regs);
+ return;
+   }
+
+  IOR_HARD_REG_SET (node->function_used_regs, insn_used_regs);
+}

As an aside, wouldn't it work out better if we collect into a local variable
instead of writing to memory here in node->function_used_regs each time?


Richard,

Agreed. This patch implements that. I'll bootstrap and reg-test on x86_64 and 
commit as obvious.
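
The idea can be illustrated outside GCC with a minimal sketch. Everything in it is an assumption made for illustration: register sets are modelled as bits in a uint32_t, and struct fn_info and collect_usage are hypothetical names, not GCC's. The scan accumulates into a local set and writes to the node only once, on success, so a failed scan has nothing to undo.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a hard register set: one bit per register.  */
typedef uint32_t reg_set;

/* Hypothetical per-function record, loosely modelled on the fields named
   in the patch.  */
struct fn_info
{
  reg_set function_used_regs;
  bool function_used_regs_valid;
};

/* OR together the per-insn sets in INSN_REGS[0..N).  An entry of 0 marks an
   instruction whose usage is unknown (an assumption of this sketch); in
   that case NODE is left untouched and false is returned.  */
static bool
collect_usage (struct fn_info *node, const reg_set *insn_regs, size_t n)
{
  reg_set used = 0;  /* Local accumulator.  */

  assert (node != NULL);

  for (size_t i = 0; i < n; i++)
    {
      if (insn_regs[i] == 0)
        return false;  /* Nothing to undo: NODE was never written.  */
      used |= insn_regs[i];
    }

  /* Publish the result with a single store.  */
  node->function_used_regs = used;
  node->function_used_regs_valid = true;
  return true;
}
```

Because the node is only written at the end, the early-return path no longer needs the CLEAR_HARD_REG_SET that the original loop had.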


Thanks,
- Tom


2014-06-19  Tom de Vries  t...@codesourcery.com

	* final.c (collect_fn_hard_reg_usage): Add and use variable
	function_used_regs.

diff --git a/gcc/final.c b/gcc/final.c
index 4f08073..e39930d 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -4760,13 +4760,13 @@ collect_fn_hard_reg_usage (void)
   int i;
 #endif
   struct cgraph_rtl_info *node;
+  HARD_REG_SET function_used_regs;
 
   /* ??? To be removed when all the ports have been fixed.  */
   if (!targetm.call_fusage_contains_non_callee_clobbers)
 return;
 
-  node = cgraph_rtl_info (current_function_decl);
-  gcc_assert (node != NULL);
+  CLEAR_HARD_REG_SET (function_used_regs);
 
   for (insn = get_insns (); insn != NULL_RTX; insn = next_insn (insn))
 {
@@ -4779,25 +4779,26 @@ collect_fn_hard_reg_usage (void)
 
   if (CALL_P (insn)
 	  && !get_call_reg_set_usage (insn, insn_used_regs, call_used_reg_set))
-	{
-	  CLEAR_HARD_REG_SET (node->function_used_regs);
-	  return;
-	}
+	return;
 
-  IOR_HARD_REG_SET (node->function_used_regs, insn_used_regs);
+  IOR_HARD_REG_SET (function_used_regs, insn_used_regs);
 }
 
   /* Be conservative - mark fixed and global registers as used.  */
-  IOR_HARD_REG_SET (node->function_used_regs, fixed_reg_set);
+  IOR_HARD_REG_SET (function_used_regs, fixed_reg_set);
 
 #ifdef STACK_REGS
   /* Handle STACK_REGS conservatively, since the df-framework does not
  provide accurate information for them.  */
 
   for (i = FIRST_STACK_REG; i <= LAST_STACK_REG; i++)
-SET_HARD_REG_BIT (node->function_used_regs, i);
+SET_HARD_REG_BIT (function_used_regs, i);
 #endif
 
+  node = cgraph_rtl_info (current_function_decl);
+  gcc_assert (node != NULL);
+
+  COPY_HARD_REG_SET (node->function_used_regs, function_used_regs);
   node->function_used_regs_valid = 1;
 }
 
-- 
1.9.1



Re: [AArch64] Implement ADD in vector registers for 32-bit scalar values.

2014-06-19 Thread James Greenhalgh
On Fri, May 16, 2014 at 11:30:38AM +0100, James Greenhalgh wrote:
 On Fri, Mar 28, 2014 at 03:39:53PM +, James Greenhalgh wrote:
  On Fri, Mar 28, 2014 at 03:09:22PM +, pins...@gmail.com wrote:
On Mar 28, 2014, at 7:48 AM, James Greenhalgh 
james.greenha...@arm.com wrote:
On Fri, Mar 28, 2014 at 11:11:58AM +, pins...@gmail.com wrote:
On Mar 28, 2014, at 2:12 AM, James Greenhalgh 
james.greenha...@arm.com wrote:
There is no way to perform scalar addition in the vector register 
file,
but with the RTX costs in place we start rewriting (x << 1) to (x + x)
on almost all cores. The code which makes this decision has no idea 
that we
will end up doing this (it happens well before reload) and so we end 
up with
very ugly code generation in the case where addition was selected, but
we are operating in vector registers.

This patch relies on the same gimmick we are already using to allow
shifts on 32-bit scalars in the vector register file - Use a vector 
32x2
operation instead, knowing that we can safely ignore the top bits.
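
A minimal sketch of that trick, written with the GCC/Clang vector extension rather than AArch64 instructions. The function names and the junk parameter are assumptions for illustration: junk models whatever happens to sit in the upper lane of the vector register, and the result is read from lane 0 only.

```c
#include <assert.h>
#include <stdint.h>

/* GCC/Clang vector extension: two 32-bit lanes in a 64-bit vector.  */
typedef uint32_t v2si __attribute__ ((vector_size (8)));

/* Add two 32-bit scalars using a two-lane vector add.  Lane 1 holds
   arbitrary bits (modelled by JUNK); it is computed but never read.  */
static uint32_t
add_via_v2si (uint32_t x, uint32_t y, uint32_t junk)
{
  v2si a = { x, junk };
  v2si b = { y, ~junk };
  v2si sum = a + b;
  return sum[0];
}

/* A left shift by one rewritten as an addition, done in the vector lanes.  */
static uint32_t
shl1_via_add (uint32_t x)
{
  uint32_t r = add_via_v2si (x, x, 0xdeadbeefu);
  assert (r == (uint32_t) (x << 1));
  return r;
}
```

The upper lane never influences the value that is returned, which is why the top bits can be ignored.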

This restores some normality to scalar_shift_1.c, however the test
that we generate a left shift by one is clearly bogus, so remove that.

This patch is pretty ugly, but it does generate superficially better
looking code for this testcase.

Tested on aarch64-none-elf with no issues.

OK for stage 1?

It seems we should also discourage the neon alternatives as there 
might be
extra movement between the two register sets which we don't want.

I see your point, but we've tried to avoid doing that elsewhere in the
AArch64 backend. Our argument has been that strictly speaking, it isn't 
that
the alternative is expensive, it is the movement between the register 
sets. We
do model that elsewhere, and the register allocator should already be 
trying to
avoid unnecessary moves between register classes.

   
   What about on a specific core where that alternative is expensive; that is
   the vector instructions are worse than the scalar ones. How are we going 
   to
   handle this case?
  
  Certainly not by discouraging the alternative for all cores. We would need
  a more nuanced approach which could be tuned on a per-core basis. Otherwise
  we are bluntly and inaccurately pessimizing those cases where we can cheaply
  perform the operation in the vector register file (e.g. we are cleaning up
  loose ends after a vector loop, we have spilled to the vector register
  file, etc.). The register preference mechanism feels the wrong place to
  catch this as it does not allow for that degree of per-core flexibility,
  an alternative is simply disparaged slightly (?, * in LRA) or
  disparaged severely (!).
  
  I would think that we don't want to start polluting the machine description
  trying to hack around this as was done with the ARM backend's
  neon_for_64_bits/avoid_neon_for_64_bits.
  
  How have other targets solved this issue?
 
 Did you have any further thoughts on this? I've pushed the costs patches, so
 we will start to see gcc.target/aarch64/scalar_shift_1.c failing without
 this or an equivalent patch.

This has been sitting waiting for comment for a while now. If we do need a
mechanism to describe individual costs for alternatives, it will need to be
applied to all the existing uses in aarch64.md/aarch64-simd.md. I think
solving that problem (if we need to) is a separate patch, and shouldn't
prevent this one from going in.

*pingx2*.

Thanks,
James

 ---
 gcc/
 
 2014-05-16  James Greenhalgh  james.greenha...@arm.com
 
   * config/aarch64/aarch64.md (*addsi3_aarch64): Add alternative in
   vector registers.
 
 gcc/testsuite/
 
 2014-05-16  James Greenhalgh  james.greenha...@arm.com
 
   * gcc.target/aarch64/scalar_shift_1.c: Fix expected assembler.
 
If those mechanisms are broken, we should fix them - in that case fixing
this by discouraging valid alternatives would seem to be gaffer-taping 
over the
real problem.

Thanks,
James


Thanks,
Andrew


Thanks,
James

---
gcc/

2014-03-27  James Greenhalgh  james.greenha...@arm.com

  * config/aarch64/aarch64.md (*addsi3_aarch64): Add alternative in
  vector registers.

gcc/testsuite/
2014-03-27  James Greenhalgh  james.greenha...@arm.com

  * gcc.target/aarch64/scalar_shift_1.c: Fix expected assembler.
0001-AArch64-Implement-ADD-in-vector-registers-for-32-bit.patch

   
  
 
 
 diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
 index 
 

[PATCH 4.8 ARM] Backport of r211369: PR/61062 Fix arm_neon.h ZIP/UZP/TRN for bigendian

2014-06-19 Thread Alan Lawrence
This backports straightforwardly; no regressions on arm-none-eabi or 
armeb-none-eabi, and FAIL->PASS of the new ZIP, UZP, and TRN execution tests 
from r209908, r209947 and r210422 (running locally).


--Alan

diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
index 4d945ce..a930e05 100644
--- a/gcc/config/arm/arm_neon.h
+++ b/gcc/config/arm/arm_neon.h
@@ -7504,12 +7504,32 @@ vbslq_p16 (uint16x8_t __a, poly16x8_t __b, poly16x8_t __c)
   return (poly16x8_t)__builtin_neon_vbslv8hi ((int16x8_t) __a, (int16x8_t) __b, (int16x8_t) __c);
 }
 
+/* For big-endian, the shuffle masks for ZIP, UZP and TRN must be changed as
+   follows. (nelt = the number of elements within a vector.)
+
+   Firstly, a value of N within a mask, becomes (N ^ (nelt - 1)), as gcc vector
+   extension's indexing scheme is reversed *within each vector* (relative to the
+   neon intrinsics view), but without changing which of the two vectors.
+
+   Secondly, the elements within each mask are reversed, as the mask is itself a
+   vector, and will itself be loaded in reverse order (again, relative to the
+   neon intrinsics view, i.e. that would result from a vld1 instruction).  */
+
 __extension__ static __inline int8x8x2_t __attribute__ ((__always_inline__))
 vtrn_s8 (int8x8_t __a, int8x8_t __b)
 {
   int8x8x2_t __rv;
-  __rv.val[0] = (int8x8_t) __builtin_shuffle (__a, __b, (uint8x8_t) { 0, 8, 2, 10, 4, 12, 6, 14 });
-  __rv.val[1] = (int8x8_t) __builtin_shuffle (__a, __b, (uint8x8_t) { 1, 9, 3, 11, 5, 13, 7, 15 });
+#ifdef __ARM_BIG_ENDIAN
+  __rv.val[0] = __builtin_shuffle (__a, __b, (uint8x8_t)
+  { 9, 1, 11, 3, 13, 5, 15, 7 });
+  __rv.val[1] = __builtin_shuffle (__a, __b, (uint8x8_t)
+  { 8, 0, 10, 2, 12, 4, 14, 6 });
+#else
+  __rv.val[0] = __builtin_shuffle (__a, __b, (uint8x8_t)
+  { 0, 8, 2, 10, 4, 12, 6, 14 });
+  __rv.val[1] = __builtin_shuffle (__a, __b, (uint8x8_t)
+  { 1, 9, 3, 11, 5, 13, 7, 15 });
+#endif
   return __rv;
 }
 
@@ -7517,8 +7537,13 @@ __extension__ static __inline int16x4x2_t __attribute__ ((__always_inline__))
 vtrn_s16 (int16x4_t __a, int16x4_t __b)
 {
   int16x4x2_t __rv;
-  __rv.val[0] = (int16x4_t) __builtin_shuffle (__a, __b, (uint16x4_t) { 0, 4, 2, 6 });
-  __rv.val[1] = (int16x4_t) __builtin_shuffle (__a, __b, (uint16x4_t) { 1, 5, 3, 7 });
+#ifdef __ARM_BIG_ENDIAN
+  __rv.val[0] = __builtin_shuffle (__a, __b, (uint16x4_t) { 5, 1, 7, 3 });
+  __rv.val[1] = __builtin_shuffle (__a, __b, (uint16x4_t) { 4, 0, 6, 2 });
+#else
+  __rv.val[0] = __builtin_shuffle (__a, __b, (uint16x4_t) { 0, 4, 2, 6 });
+  __rv.val[1] = __builtin_shuffle (__a, __b, (uint16x4_t) { 1, 5, 3, 7 });
+#endif
   return __rv;
 }
 
@@ -7526,8 +7551,17 @@ __extension__ static __inline uint8x8x2_t __attribute__ ((__always_inline__))
 vtrn_u8 (uint8x8_t __a, uint8x8_t __b)
 {
   uint8x8x2_t __rv;
-  __rv.val[0] = (uint8x8_t) __builtin_shuffle (__a, __b, (uint8x8_t) { 0, 8, 2, 10, 4, 12, 6, 14 });
-  __rv.val[1] = (uint8x8_t) __builtin_shuffle (__a, __b, (uint8x8_t) { 1, 9, 3, 11, 5, 13, 7, 15 });
+#ifdef __ARM_BIG_ENDIAN
+  __rv.val[0] = __builtin_shuffle (__a, __b, (uint8x8_t)
+  { 9, 1, 11, 3, 13, 5, 15, 7 });
+  __rv.val[1] = __builtin_shuffle (__a, __b, (uint8x8_t)
+  { 8, 0, 10, 2, 12, 4, 14, 6 });
+#else
+  __rv.val[0] = __builtin_shuffle (__a, __b, (uint8x8_t)
+  { 0, 8, 2, 10, 4, 12, 6, 14 });
+  __rv.val[1] = __builtin_shuffle (__a, __b, (uint8x8_t)
+  { 1, 9, 3, 11, 5, 13, 7, 15 });
+#endif
   return __rv;
 }
 
@@ -7535,8 +7569,13 @@ __extension__ static __inline uint16x4x2_t __attribute__ ((__always_inline__))
 vtrn_u16 (uint16x4_t __a, uint16x4_t __b)
 {
   uint16x4x2_t __rv;
-  __rv.val[0] = (uint16x4_t) __builtin_shuffle (__a, __b, (uint16x4_t) { 0, 4, 2, 6 });
-  __rv.val[1] = (uint16x4_t) __builtin_shuffle (__a, __b, (uint16x4_t) { 1, 5, 3, 7 });
+#ifdef __ARM_BIG_ENDIAN
+  __rv.val[0] = __builtin_shuffle (__a, __b, (uint16x4_t) { 5, 1, 7, 3 });
+  __rv.val[1] = __builtin_shuffle (__a, __b, (uint16x4_t) { 4, 0, 6, 2 });
+#else
+  __rv.val[0] = __builtin_shuffle (__a, __b, (uint16x4_t) { 0, 4, 2, 6 });
+  __rv.val[1] = __builtin_shuffle (__a, __b, (uint16x4_t) { 1, 5, 3, 7 });
+#endif
   return __rv;
 }
 
@@ -7544,8 +7583,17 @@ __extension__ static __inline poly8x8x2_t __attribute__ ((__always_inline__))
 vtrn_p8 (poly8x8_t __a, poly8x8_t __b)
 {
   poly8x8x2_t __rv;
-  __rv.val[0] = (poly8x8_t) __builtin_shuffle (__a, __b, (uint8x8_t) { 0, 8, 2, 10, 4, 12, 6, 14 });
-  __rv.val[1] = (poly8x8_t) __builtin_shuffle (__a, __b, (uint8x8_t) { 1, 9, 3, 11, 5, 13, 7, 15 });
+#ifdef __ARM_BIG_ENDIAN
+  __rv.val[0] = __builtin_shuffle (__a, __b, (uint8x8_t)
+  { 9, 1, 11, 3, 13, 5, 15, 7 });
+  __rv.val[1] = __builtin_shuffle (__a, __b, (uint8x8_t)
+  { 8, 0, 10, 2, 12, 4, 14, 6 });
+#else
+  __rv.val[0] = __builtin_shuffle (__a, __b, (uint8x8_t)
+  { 0, 8, 2, 10, 4, 12, 6, 14 });
+  __rv.val[1] = __builtin_shuffle (__a, __b, (uint8x8_t)
+  { 1, 9, 3, 11, 5, 13, 7, 15 });
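
The two-step rule in the patch comment can be checked mechanically. The sketch below is not part of arm_neon.h; be_mask_from_le is a hypothetical helper that derives a big-endian mask from the little-endian one by mapping each value N to N ^ (nelt - 1) and then reversing the element order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Derive a big-endian shuffle mask from the little-endian one, following
   the two steps in the patch comment: map each value N to N ^ (nelt - 1),
   then reverse the order of the mask elements.  NELT must be a power of
   two so that the XOR changes only the index within a vector and leaves
   the bit that selects between the two input vectors alone.  */
static void
be_mask_from_le (const uint8_t *le, uint8_t *be, size_t nelt)
{
  assert (nelt != 0 && (nelt & (nelt - 1)) == 0);

  for (size_t i = 0; i < nelt; i++)
    be[i] = (uint8_t) (le[nelt - 1 - i] ^ (nelt - 1));
}
```

Applied to the little-endian vtrn_s8 mask { 0, 8, 2, 10, 4, 12, 6, 14 } with nelt = 8, this yields { 9, 1, 11, 3, 13, 5, 15, 7 }, the mask the patch uses under __ARM_BIG_ENDIAN.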

[PATCH, Testsuite, AArch64] Make aapcs64.exp Tests Big-Endian Friendly

2014-06-19 Thread Yufeng Zhang

Hi,

This patch updates a number of aapcs64 tests to make them big-endian 
friendly.  Changes are mainly:


* checking the W regs instead of X regs for integral arguments less than 
8 bytes

* correcting the corresponding stack location checks in big-endian mode

With this patch, make check-gcc RUNTESTFLAGS=aapcs64.exp gives a clean 
result on aarch64_be-none-elf.
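
One way to see why a stack check moves by 4 bytes in big-endian mode (STACK+32 versus STACK+36 for the final int in va_arg-1.c) is the sketch below. It assumes the int is widened to a full 8-byte slot; the helper names are hypothetical and the code runs on the host rather than modelling the AAPCS64 rules in full.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Detect the byte order of the machine this sketch runs on.  */
static int
host_is_big_endian (void)
{
  const uint16_t probe = 0x0102;
  unsigned char first;
  memcpy (&first, &probe, 1);
  return first == 0x01;
}

/* Byte offset of the low SIZE bytes of a value inside the 8-byte slot it
   was widened into.  */
static size_t
offset_in_slot (size_t size, int big_endian)
{
  assert (size >= 1 && size <= 8);
  return big_endian ? 8 - size : 0;
}

/* Read a 32-bit value back out of a 64-bit slot on the host, using the
   offset computed above.  */
static uint32_t
read_int_from_slot (uint64_t slot)
{
  uint32_t out;
  memcpy (&out,
          (const unsigned char *) &slot
            + offset_in_slot (sizeof out, host_is_big_endian ()),
          sizeof out);
  return out;
}
```

With a slot that starts at STACK+32, a 4-byte value is found at offset 0 in little-endian mode and at offset 4, that is STACK+36, in big-endian mode.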


OK for trunk?

Thanks,
Yufeng

gcc/testsuite/

Make the tests big-endian friendly.
* gcc.target/aarch64/aapcs64/test_25.c: Update.
* gcc.target/aarch64/aapcs64/va_arg-1.c: Ditto.
* gcc.target/aarch64/aapcs64/va_arg-12.c: Ditto.
* gcc.target/aarch64/aapcs64/va_arg-2.c: Ditto.
* gcc.target/aarch64/aapcs64/va_arg-3.c: Ditto.
* gcc.target/aarch64/aapcs64/va_arg-4.c: Ditto.
* gcc.target/aarch64/aapcs64/va_arg-5.c: Ditto.
* gcc.target/aarch64/aapcs64/va_arg-6.c: Ditto.
* gcc.target/aarch64/aapcs64/va_arg-7.c: Ditto.

diff --git a/gcc/testsuite/gcc.target/aarch64/aapcs64/test_25.c b/gcc/testsuite/gcc.target/aarch64/aapcs64/test_25.c
index 2f942ff..2febb79 100644
--- a/gcc/testsuite/gcc.target/aarch64/aapcs64/test_25.c
+++ b/gcc/testsuite/gcc.target/aarch64/aapcs64/test_25.c
@@ -42,20 +42,20 @@ void init_data ()
   s2.df[0] = 123.456;
   s2.df[1] = 234.567;
   s2.df[2] = 345.678;
-  s3.v[0] = (vf2_t){ 19.f, 20.f, 21.f, 22.f };
-  s3.v[1] = (vf2_t){ 23.f, 24.f, 25.f, 26.f };
-  s4.v[0] = (vf2_t){ 27.f, 28.f, 29.f, 30.f };
-  s4.v[1] = (vf2_t){ 31.f, 32.f, 33.f, 34.f };
-  s4.v[2] = (vf2_t){ 35.f, 36.f, 37.f, 38.f };
+  s3.v[0] = (vf2_t){ 19.f, 20.f };
+  s3.v[1] = (vf2_t){ 23.f, 24.f };
+  s4.v[0] = (vf2_t){ 27.f, 28.f };
+  s4.v[1] = (vf2_t){ 31.f, 32.f };
+  s4.v[2] = (vf2_t){ 35.f, 36.f };
 }
 
 #include "abitest.h"
 #else
-ARG_NONFLAT (struct x0, s0, Q0, f32in64)
+ARG (struct x0, s0, D0)
 ARG (struct x2, s2, D1)
 ARG (struct x1, s1, Q4)
 ARG (struct x3, s3, D5)
 ARG (struct x4, s4, STACK)
-ARG_NONFLAT (int, 0xdeadbeef, X0, i32in64)
+ARG (int, 0xdeadbeef, W0)
 LAST_ARG (double, 456.789, STACK+24)
 #endif
diff --git a/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-1.c b/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-1.c
index 4eb569e..4fb9a03 100644
--- a/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-1.c
+++ b/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-1.c
@@ -30,14 +30,14 @@ void init_data ()
 
 #include "abitest.h"
 #else
-  ARG  ( int  , 0xff  ,X0, LAST_NAMED_ARG_ID)
+  ARG  ( int  , 0xff  ,W0, LAST_NAMED_ARG_ID)
   DOTS
-  ANON_PROMOTED(unsigned char , 0xfe  , unsigned int, 0xfe   , X1, 1)
-  ANON_PROMOTED(  signed char , sc,   signed int, sc_promoted, X2, 2)
-  ANON_PROMOTED(unsigned short, 0xdcba, unsigned int, 0xdcba , X3, 3)
-  ANON_PROMOTED(  signed short, ss,   signed int, ss_promoted, X4, 4)
-  ANON (unsigned int  , 0xdeadbeef,X5, 5)
-  ANON (  signed int  , 0xcafebabe,X6, 6)
+  ANON_PROMOTED(unsigned char , 0xfe  , unsigned int, 0xfe   , W1, 1)
+  ANON_PROMOTED(  signed char , sc,   signed int, sc_promoted, W2, 2)
+  ANON_PROMOTED(unsigned short, 0xdcba, unsigned int, 0xdcba , W3, 3)
+  ANON_PROMOTED(  signed short, ss,   signed int, ss_promoted, W4, 4)
+  ANON (unsigned int  , 0xdeadbeef,W5, 5)
+  ANON (  signed int  , 0xcafebabe,W6, 6)
   ANON (unsigned long long, 0xba98765432101234ULL, X7, 7)
   ANON (  signed long long, 0xa987654321012345LL , STACK, 8)
   ANON (  __int128, qword.i  , STACK+16, 9)
@@ -46,5 +46,9 @@ void init_data ()
   ANON (long double   , 98765432123456789.987654321L,  Q2, 12)
   ANON ( vf2_t, vf2   ,D3, 13)
   ANON ( vi4_t, vi4   ,Q4, 14)
+#ifndef __AAPCS64_BIG_ENDIAN__
   LAST_ANON( int  , 0x, STACK+32,15)
+#else
+  LAST_ANON( int  , 0x, STACK+36,15)
+#endif
+#endif
 #endif
diff --git a/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-12.c b/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-12.c
index a12ccfd..3eddaa2 100644
--- a/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-12.c
+++ b/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-12.c
@@ -45,16 +45,20 @@ void init_data ()
 #include "abitest.h"
 #else
   PTR(struct z, a, X0, 0)
-  ARG(int, 0xdeadbeef, X1, 1)
-  ARG(int, 0xcafebabe, X2, 2)
-  ARG(int, 0xdeadbabe, X3, 3)
-  ARG(int, 0xcafebeef, X4, 4)
-  ARG(int, 0xbeefdead, X5, 5)
-  ARG(int, 0xbabecafe, X6, LAST_NAMED_ARG_ID)
+  ARG(int, 

Re: [PATCH] Implement -fsanitize=bounds and internal calls in FEs

2014-06-19 Thread Marek Polacek
On Mon, Jun 16, 2014 at 01:23:04PM +0200, Jakub Jelinek wrote:
 On Mon, Jun 16, 2014 at 12:39:07PM +0200, Marek Polacek wrote:
  Jason/Joseph, could you please look at the C++/C FE parts?
 
 As mentioned on IRC, you need to differentiate between taking address
 and not taking address.
 
 struct S { int a; int b; } s[4], *t;
 int *a, *b, *c;
 void *d;
 int e[4][4];
 
 void
 foo ()
 {
   t = &s[4];  // Should be fine
   a = &s[4].a; // Error
   b = &s[4].b; // Error
   d = &e[4];  // Should be fine
   c = &e[4][0]; // Error
 }
 
 So, supposedly when e.g. in cp_genericize_r, for ADDR_EXPR ARRAY_REF
 allow off-by-one, for all other ARRAY_REFs (e.g. those not appearing
 inside of ADDR_EXPR, or not directly inside of ADDR_EXPR, e.g. with
 COMPONENT_REF or another ARRAY_REF in between) disallow off-by-one.
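
The distinction between forming a one-past-the-end address and accessing an element can be modelled with a small sketch. It is not the instrumentation from the patch: index_in_bounds and one_past_distance are hypothetical helpers, and address_only stands for the "ARRAY_REF directly inside ADDR_EXPR" case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the check: INDEX into an array of BOUND elements.
   When only the address of the element is taken, the one-past-the-end
   index is allowed; any access through the element is not.  */
static bool
index_in_bounds (size_t index, size_t bound, bool address_only)
{
  return address_only ? index <= bound : index < bound;
}

struct S { int a; int b; };

/* Forming &s[4] for a 4-element array is valid C as long as the result is
   never dereferenced; here it is only used for pointer arithmetic.  */
static ptrdiff_t
one_past_distance (void)
{
  static struct S s[4];
  struct S *end = &s[4];

  assert (index_in_bounds (4, 4, true));
  return end - s;
}
```

Under this model index 4 of a 4-element array passes when only its address is taken and fails for any other use, including a member access through it.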

Should be fixed in this new patch.  Another change is in the handling of
flexible array member-like arrays; what I had in the last patch was wrong.
I use the non-strict mode by default, which means flexible array
member-like arrays are not instrumented.  We could have e.g. a
-fsanitize=bounds-strict option that would instrument even those.
(Regular FAMs are never instrumented.)

I moved the code that instruments arrays to c_genericize (works for
both C and C++).  Also I fixed an ICE (we shouldn't instrument
initializers of TREE_STATIC, otherwise internal calls from FE
survive up to expansion).

Oh, I see I forgot to update docs, I'll write something up in
another iteration.

Regtested/bootstrapped on x86_64-linux.  How does this look?

2014-06-19  Marek Polacek  pola...@redhat.com

* asan.c (pass_sanopt::execute): Handle IFN_UBSAN_BOUNDS.
* flag-types.h (enum sanitize_code): Add SANITIZE_BOUNDS and or it
into SANITIZE_UNDEFINED.
* gimplify.c (gimplify_call_expr): Add gimplification of internal
functions created in the FEs.
* internal-fn.c: Move internal-fn.h after tree.h.
(expand_UBSAN_BOUNDS): New function.
* internal-fn.def (UBSAN_BOUNDS): New internal function.
* internal-fn.h: Don't define internal functions here.
* opts.c (common_handle_option): Add -fsanitize=bounds.
* sanitizer.def (BUILT_IN_UBSAN_HANDLE_OUT_OF_BOUNDS,
BUILT_IN_UBSAN_HANDLE_OUT_OF_BOUNDS_ABORT): Add.
* tree-core.h: Define internal functions here.
(struct tree_base): Add ifn field.
* tree-pretty-print.c (print_call_name): Handle functions without
CALL_EXPR_FN.
* tree.c (get_callee_fndecl): Likewise.
(build_call_expr_internal_loc): New function.
* tree.def (CALL_EXPR): Update description.
* tree.h (CALL_EXPR_IFN): Define.
(build_call_expr_internal_loc): Declare.
* ubsan.c (get_ubsan_type_info_for_type): Return 0 for non-arithmetic
types.
(ubsan_type_descriptor): Change bool parameter to enum
ubsan_print_style.  Adjust the code.  Add handling of
UBSAN_PRINT_ARRAY.
(ubsan_expand_bounds_btn): New function.
(ubsan_expand_null_ifn): Adjust ubsan_type_descriptor call.
(ubsan_build_overflow_builtin): Likewise.
(instrument_bool_enum_load): Likewise.
(ubsan_instrument_float_cast): Likewise.
* ubsan.h (enum ubsan_print_style): New enum.
(ubsan_expand_bounds_btn): Declare.
(ubsan_type_descriptor): Adjust declaration.  Use a default parameter.
c-family/
* c-gimplify.c: Include c-ubsan.h and pointer-set.h.
(ubsan_walk_array_refs_r): New function.
(c_genericize): Instrument array bounds.
* c-ubsan.c: Include internal-fn.h.
(ubsan_instrument_division): Mark instrumented arrays as having
side effects.  Adjust ubsan_type_descriptor call.
(ubsan_instrument_shift): Likewise.
(ubsan_instrument_vla): Adjust ubsan_type_descriptor call.
(ubsan_instrument_bounds): New function.
(ubsan_array_ref_instrumented_p): New function.
(ubsan_maybe_instrument_array_ref): New function.
* c-ubsan.h (ubsan_instrument_bounds): Declare.
(ubsan_array_ref_instrumented_p): Declare.
(ubsan_maybe_instrument_array_ref): Declare.
testsuite/
* c-c++-common/ubsan/bounds-1.c: New test.
* c-c++-common/ubsan/bounds-2.c: New test.
* c-c++-common/ubsan/bounds-3.c: New test.
* c-c++-common/ubsan/bounds-4.c: New test.
* c-c++-common/ubsan/bounds-5.c: New test.
* c-c++-common/ubsan/bounds-6.c: New test.

diff --git gcc/asan.c gcc/asan.c
index 281a795..5f5dcaa 100644
--- gcc/asan.c
+++ gcc/asan.c
@@ -2761,6 +2761,9 @@ pass_sanopt::execute (function *fun)
  case IFN_UBSAN_NULL:
ubsan_expand_null_ifn (gsi);
break;
+ case IFN_UBSAN_BOUNDS:
+   ubsan_expand_bounds_btn (gsi);
+   break;
  default:
break;
  }
@@ -2771,6 +2774,10 @@ pass_sanopt::execute (function *fun)
 

[PATCH] Fix for PR 61561

2014-06-19 Thread Marat Zakirov
Hi all,

Here's a patch for PR 61561
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61561). 

It fixes ICE.

Reg. tested on arm15.

--Marat


arm.md.diff.diff
Description: Binary data


Re: [PATCH] Fix for PR 61561

2014-06-19 Thread Kyrill Tkachov


On 19/06/14 16:05, Marat Zakirov wrote:

Hi all,

Here's a patch for PR 61561
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61561).

It fixes ICE.

Reg. tested on arm15.


CC'ing the arm maintainers...

Kyrill



--Marat





Re: -fuse-caller-save - Collect register usage information

2014-06-19 Thread Richard Henderson
On 06/19/2014 05:39 AM, Tom de Vries wrote:
 
 2014-06-19  Tom de Vries  t...@codesourcery.com
 
   * final.c (collect_fn_hard_reg_usage): Add and use variable
   function_used_regs.

Looks good, thanks.


r~


Re: [PATCH, AARCH64] Enable fuse-caller-save for AARCH64

2014-06-19 Thread Richard Henderson
On 06/19/2014 01:39 AM, Tom de Vries wrote:
 On 19-06-14 05:53, Richard Henderson wrote:
 Do we in fact make sure this isn't an ifunc resolver?  I don't immediately 
 see
 how those get wired up in the cgraph...
 
 Richard,
 
 using the patch below I changed the
 gcc/testsuite/gcc.target/i386/fuse-caller-save.c testcase to use an ifunc
 resolver, and observed that the fuse-caller-save optimization didn't work.
 
 The reason the optimization doesn't work in this case is that
 default_binds_local_p_1 checks the ifunc attribute:
 ...
   /* Weakrefs may not bind locally, even though the weakref itself is always
  static and therefore local.  Similarly, the resolver for ifunc functions
  might resolve to a non-local function.
  FIXME: We can resolve the weakref case more carefully by looking at the
  weakref alias.  */
   else if (lookup_attribute ("weakref", DECL_ATTRIBUTES (exp))
 || (TREE_CODE (exp) == FUNCTION_DECL
  && lookup_attribute ("ifunc", DECL_ATTRIBUTES (exp))))
 local_p = false;
 ...
 
 The default_binds_local_p_1 function is used via this path in the 
 optimization:
 get_call_reg_set_usage -> get_call_cgraph_rtl_info ->
 decl_binds_to_current_def_p -> default_binds_local_p ->
 default_binds_local_p_1.

Excellent.  Thanks for doing the digging I was too lazy to finish last night.


r~



Re: [PATCH] Fix for PR 61561

2014-06-19 Thread Yuri Gribov
 +  (if_then_else (match_operand 1 const_int_operand 
 )
 +(const_string mov_imm )
 +(const_string mov_reg))])]

Why not just mov_reg?

 * config/arm/arm.md: New templates see pr61561.

I think you shouldn't reference PR in file change description, just
add a general comment PR arm/61561 to each ChangeLog (see existing
ChangeLog entries).

 * c-c++-common/pr61561.c: New test for pr61561.

Likewise.

 -  @
 +  @

Accidental change?

-Y


Re: [PATCH] Fix for PR 61561

2014-06-19 Thread Richard Earnshaw
On 19/06/14 16:05, Marat Zakirov wrote:
 Hi all,
 
 Here's a patch for PR 61561
 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61561). 
 
 It fixes ICE.
 
 Reg. tested on arm15.
 
 --Marat
 
 
 arm.md.diff.diff
 
 
 gcc/ChangeLog:
 
 2014-06-19  Marat Zakirov  m.zaki...@samsung.com
 
   * config/arm/arm.md: New templates see pr61561.
 

ChangeLog entries should list the names of the patterns changed.

 gcc/testsuite/ChangeLog:
 
 2014-06-19  Marat Zakirov  m.zaki...@samsung.com
 
   * c-c++-common/pr61561.c: New test for pr61561.
 

Not OK.  I don't see why you need to zero out the top bits.

Firstly, it shouldn't be necessary to have a new alternative; I see no
reason why these can't use the MOV instruction in alternative 0 (just
change rI to rkI).

Secondly, uxt[bh] are ARMv6 and later, but this pattern needs to work
for armv4 and later.

Thirdly, we also need to fix movhi_bytes (for pre-v4) thumb2_movhi_insn
(for thumb2) and, quite possibly, thumb1_movhi_insn (for thumb1).  There
may well be additional changes for movqi variants as well.

R.

 
 diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
 index 42c12c8..7ed8abc 100644
 --- a/gcc/config/arm/arm.md
 +++ b/gcc/config/arm/arm.md
 @@ -6290,27 +6290,31 @@
  
  ;; Pattern to recognize insn generated default case above
  (define_insn *movhi_insn_arch4
 -  [(set (match_operand:HI 0 nonimmediate_operand =r,r,m,r)
 - (match_operand:HI 1 general_operand  rI,K,r,mi))]
 +  [(set (match_operand:HI 0 nonimmediate_operand =r,r,m,r,r)
 + (match_operand:HI 1 general_operand  rI,K,r,mi,k))]
TARGET_ARM
  arm_arch4
  (register_operand (operands[0], HImode)
 || register_operand (operands[1], HImode))
 -  @
 +  @   
 mov%?\\t%0, %1\\t%@ movhi
 mvn%?\\t%0, #%B1\\t%@ movhi
 str%(h%)\\t%1, %0\\t%@ movhi
 -   ldr%(h%)\\t%0, %1\\t%@ movhi
 +   ldr%(h%)\\t%0, %1\\t%@ movhi
 +   uxth%?\\t%0, %1\\t%@ movhi
[(set_attr predicable yes)
 -   (set_attr pool_range *,*,*,256)
 -   (set_attr neg_pool_range *,*,*,244)
 +   (set_attr pool_range *,*,*,256,*)
 +   (set_attr neg_pool_range *,*,*,244,*)
 (set_attr_alternative type
   [(if_then_else (match_operand 1 const_int_operand 
 )
  (const_string mov_imm )
  (const_string mov_reg))
(const_string mvn_imm)
(const_string store1)
 -  (const_string load1)])]
 +  (const_string load1)
 +  (if_then_else (match_operand 1 const_int_operand 
 )
 +(const_string mov_imm )
 +(const_string mov_reg))])]
  )
  
  (define_insn *movhi_bytes
 @@ -6429,8 +6433,8 @@
  )
  
  (define_insn *arm_movqi_insn
 -  [(set (match_operand:QI 0 nonimmediate_operand =r,r,r,l,r,l,Uu,r,m)
 - (match_operand:QI 1 general_operand r,r,I,Py,K,Uu,l,m,r))]
 +  [(set (match_operand:QI 0 nonimmediate_operand =r,r,r,l,r,l,Uu,r,m,r,r)
 + (match_operand:QI 1 general_operand r,r,I,Py,K,Uu,l,m,r,k,k))]
TARGET_32BIT
  (   register_operand (operands[0], QImode)
 || register_operand (operands[1], QImode))
 @@ -6443,12 +6447,14 @@
 ldr%(b%)\\t%0, %1
 str%(b%)\\t%1, %0
 ldr%(b%)\\t%0, %1
 -   str%(b%)\\t%1, %0
 -  [(set_attr type 
 mov_reg,mov_reg,mov_imm,mov_imm,mvn_imm,load1,store1,load1,store1)
 +   str%(b%)\\t%1, %0
 +   uxtb%?\\t%0, %1
 +   uxtb%?\\t%0, %1
 +  [(set_attr type 
 mov_reg,mov_reg,mov_imm,mov_imm,mvn_imm,load1,store1,load1,store1,mov_reg,mov_reg)
 (set_attr predicable yes)
 -   (set_attr predicable_short_it yes,yes,yes,no,no,no,no,no,no)
 -   (set_attr arch t2,any,any,t2,any,t2,t2,any,any)
 -   (set_attr length 2,4,4,2,4,2,2,4,4)]
 +   (set_attr predicable_short_it yes,yes,yes,no,no,no,no,no,no,no,no)
 +   (set_attr arch t2,any,any,t2,any,t2,t2,any,any,any,t2)
 +   (set_attr length 2,4,4,2,4,2,2,4,4,4,2)]
  )
  
  ;; HFmode moves
 diff --git a/gcc/testsuite/c-c++-common/pr61561.c 
 b/gcc/testsuite/c-c++-common/pr61561.c
 new file mode 100644
 index 000..0f4b716
 --- /dev/null
 +++ b/gcc/testsuite/c-c++-common/pr61561.c
 @@ -0,0 +1,15 @@
 +/* PR c/61561 */
 +/* { dg-do assemble } */
 +/* { dg-options  -w } */
 +
 +int dummy(int a);
 +
 +char a;
 +short b;
 +
 +void mmm (void)
 +{
 +  char dyn[ dummy(3) ];
 +  a = (char)dyn[0];
 +  b = (short)dyn[0];
 +}
 




Re: -fuse-caller-save - Collect register usage information

2014-06-19 Thread Tom de Vries

On 19-06-14 07:13, Richard Henderson wrote:

On 05/19/2014 07:30 AM, Tom de Vries wrote:

+  for (insn = get_insns (); insn != NULL_RTX; insn = next_insn (insn))
+{
+  HARD_REG_SET insn_used_regs;
+
+  if (!NONDEBUG_INSN_P (insn))
+   continue;
+
+  find_all_hard_reg_sets (insn, insn_used_regs, false);
+
+  if (CALL_P (insn)
+  && !get_call_reg_set_usage (insn, insn_used_regs, call_used_reg_set))
+   {
+ CLEAR_HARD_REG_SET (node->function_used_regs);
+ return;
+   }
+
+  IOR_HARD_REG_SET (node->function_used_regs, insn_used_regs);
+}


SNIP


Let's suppose that we've got a rather large function, with only local calls for
which we can acquire usage.  Let's suppose that even one of those callees
further calls something else, such that insn_used_regs == call_used_reg_set.

We fill node->function_used_regs immediately, but keep scanning the rest of the
large function.



+
+  /* Be conservative - mark fixed and global registers as used.  */
+  IOR_HARD_REG_SET (node->function_used_regs, fixed_reg_set);
+  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
+if (global_regs[i])
+  SET_HARD_REG_BIT (node->function_used_regs, i);
+
+#ifdef STACK_REGS
+  /* Handle STACK_REGS conservatively, since the df-framework does not
+ provide accurate information for them.  */
+
+  for (i = FIRST_STACK_REG; i <= LAST_STACK_REG; i++)
+SET_HARD_REG_BIT (node->function_used_regs, i);
+#endif
+
+  node->function_used_regs_valid = 1;


Wouldn't it be better to compare the collected function_used_regs; if it
contains all of call_used_reg_set, decline to set function_used_regs_valid.
That way, we'll early exit from the above loop whenever we see that we can't
improve over the default call-clobber set.



Richard,

Agreed.  Attached patch implements this (on top of the minor rewrite of 
https://gcc.gnu.org/ml/gcc-patches/2014-06/msg01535.html ).
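
The test itself is a plain subset check. A minimal sketch, assuming register sets fit in a uint32_t bitmask; reg_set_subset_p, struct fn_info and maybe_record_usage are hypothetical names that only mirror the shape of the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a hard register set: one bit per register.  */
typedef uint32_t reg_set;

/* Hypothetical per-function record.  */
struct fn_info
{
  reg_set function_used_regs;
  bool function_used_regs_valid;
};

/* True if every register in A is also in B.  */
static bool
reg_set_subset_p (reg_set a, reg_set b)
{
  return (a & ~b) == 0;
}

/* Record FUNCTION_USED in NODE only if it leaves at least one register of
   CALL_USED untouched; otherwise it is no better than the default
   call-clobber set and NODE stays invalid.  */
static void
maybe_record_usage (struct fn_info *node, reg_set call_used,
                    reg_set function_used)
{
  if (reg_set_subset_p (call_used, function_used))
    return;

  assert (node != NULL);
  node->function_used_regs = function_used;
  node->function_used_regs_valid = true;
}
```

A caller that finds function_used_regs_valid unset falls back to the ordinary call-clobber set, so skipping the store loses nothing.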



Although perhaps function_used_regs_valid is no longer the best name in that
case...



I think the name is still ok.  The field function_used_regs_valid just states 
that the function_used_regs field is valid and can be used.



OK for trunk if bootstrap and reg-test on x86_64 are OK?

Thanks,
- Tom
2014-06-19  Tom de Vries  t...@codesourcery.com

	* final.c (collect_fn_hard_reg_usage): Don't save function_used_regs if
	it contains all call_used_regs.

diff --git a/gcc/final.c b/gcc/final.c
index e39930d..e67e84b 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -4795,6 +4795,11 @@ collect_fn_hard_reg_usage (void)
 SET_HARD_REG_BIT (function_used_regs, i);
 #endif
 
+  /* The information we have gathered is only interesting if it exposes a
+ register from the call_used_regs that is not used in this function.  */
+  if (hard_reg_set_subset_p (call_used_reg_set, function_used_regs))
+return;
+
   node = cgraph_rtl_info (current_function_decl);
   gcc_assert (node != NULL);
 
-- 
1.9.1
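
The early-exit test in the patch above can be illustrated with a minimal,
self-contained sketch.  It is not GCC code: reg_set is a hypothetical 64-bit
stand-in for HARD_REG_SET, and reg_set_subset_p and usage_worth_saving_p are
illustrative names that only mirror the role hard_reg_set_subset_p plays in
the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for HARD_REG_SET: one bit per hard register.  */
typedef uint64_t reg_set;

/* Return true if every register in SUB is also in SUPER.  This models
   the subset test the patch performs with hard_reg_set_subset_p.  */
bool
reg_set_subset_p (reg_set sub, reg_set super)
{
  return (sub & ~super) == 0;
}

/* Return true if the collected usage is worth recording, i.e. at least
   one call-clobbered register is left untouched by the function.  */
bool
usage_worth_saving_p (reg_set call_used, reg_set function_used)
{
  return !reg_set_subset_p (call_used, function_used);
}
```

When the function's set covers the whole call-clobbered set, nothing is
gained over the default assumption, so the sketch reports that the
information is not worth saving.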



Fix finding reg-sets of call insn in collect_fn_hard_reg_usage

2014-06-19 Thread Tom de Vries

Richard,

At the moment, when processing a call in collect_fn_hard_reg_usage, we get the 
used regs from the callee, but forget to register the regs in the call insn 
itself (ouch).  This patch fixes this by introducing an extra IOR_HARD_REG_SET.


We also switch the order of find_all_hard_reg_sets and get_call_reg_set_usage. 
There's no point in doing find_all_hard_reg_sets on a call if 
get_call_reg_set_usage returns false.


OK for trunk if bootstrap and reg-test on x86_64 are OK?

Thanks,
- Tom
2014-06-19  Tom de Vries  t...@codesourcery.com

	* final.c (collect_fn_hard_reg_usage): Add separate IOR_HARD_REG_SET for
	get_call_reg_set_usage.

diff --git a/gcc/final.c b/gcc/final.c
index e67e84b..bbeb50d 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -4775,12 +4775,16 @@ collect_fn_hard_reg_usage (void)
   if (!NONDEBUG_INSN_P (insn))
 	continue;
 
-  find_all_hard_reg_sets (insn, insn_used_regs, false);
+  if (CALL_P (insn))
+	{
+	  if (!get_call_reg_set_usage (insn, insn_used_regs,
+   call_used_reg_set))
+	return;
 
-  if (CALL_P (insn)
-	  && !get_call_reg_set_usage (insn, insn_used_regs, call_used_reg_set))
-	return;
+	  IOR_HARD_REG_SET (function_used_regs, insn_used_regs);
+	}
 
+  find_all_hard_reg_sets (insn, insn_used_regs, false);
   IOR_HARD_REG_SET (function_used_regs, insn_used_regs);
 }
 
-- 
1.9.1
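
The control flow after this fix can be sketched in a simplified model.  The
names below (reg_set, struct insn_model, collect_usage) are hypothetical and
are not GCC interfaces; the sketch only assumes that each instruction
contributes the registers it sets itself and that a call additionally
contributes the registers its callee is known to use.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for HARD_REG_SET: one bit per hard register.  */
typedef uint64_t reg_set;

/* Hypothetical, heavily simplified view of one instruction.  */
struct insn_model
{
  bool is_call;             /* the insn is a call */
  bool callee_usage_known;  /* register usage of the callee is available */
  reg_set callee_used;      /* registers the callee is known to use */
  reg_set insn_sets;        /* registers set by the insn itself */
};

/* Accumulate register usage over N instructions.  Return false, leaving
   *FUNCTION_USED untouched, as soon as a call has no usable callee
   information; otherwise store the union and return true.  */
bool
collect_usage (const struct insn_model *insns, size_t n,
               reg_set *function_used)
{
  reg_set used = 0;

  for (size_t i = 0; i < n; i++)
    {
      if (insns[i].is_call)
        {
          /* Ask about the callee first; nothing else matters if this
             fails.  */
          if (!insns[i].callee_usage_known)
            return false;
          used |= insns[i].callee_used;
        }

      /* The registers set by the insn itself, calls included.  */
      used |= insns[i].insn_sets;
    }

  *function_used = used;
  return true;
}
```

Both contributions are merged for a call, which is the step the original
code skipped for the call insn's own sets.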



[PATCH] Change default for --param allow-...-data-races to off

2014-06-19 Thread Bernd Edlinger
Hi,

from a recent discussion on g...@gcc.gnu.org I have learned that the default of
--param allow-store-data-races is still 1, and it is causing problems.
Therefore I would like to suggest changing the default of this option to 0.

Boot-strapped and regression tested on x86_64-linux-gnu.
Ok for trunk?


Thanks
Bernd.
  gcc/ChangeLog:
2014-06-19  Bernd Edlinger  bernd.edlin...@hotmail.de

Set default for --param allow-...-data-races to off.
* params.def (PARAM_ALLOW_LOAD_DATA_RACES,
PARAM_ALLOW_STORE_DATA_RACES, PARAM_ALLOW_PACKED_LOAD_DATA_RACES,
PARAM_ALLOW_PACKED_STORE_DATA_RACES): Set default to off.

testsuite/ChangeLog:
2014-06-19  Bernd Edlinger  bernd.edlin...@hotmail.de

Adjust to new default for --param allow-...-data-races.
* c-c++-common/cxxbitfields-3.c: Adjust.
* c-c++-common/cxxbitfields-6.c: Adjust.
* c-c++-common/simulate-thread/bitfields-1.c: Adjust.
* c-c++-common/simulate-thread/bitfields-2.c: Adjust.
* c-c++-common/simulate-thread/bitfields-3.c: Adjust.
* c-c++-common/simulate-thread/bitfields-4.c: Adjust.
* g++.dg/simulate-thread/bitfields.C: Adjust.
* g++.dg/simulate-thread/bitfields-2.C: Adjust.
* gcc.dg/lto/pr52097_0.c: Adjust.
* gcc.dg/simulate-thread/speculative-store.c: Adjust.
* gcc.dg/simulate-thread/speculative-store-2.c: Adjust.
* gcc.dg/simulate-thread/speculative-store-3.c: Adjust.
* gcc.dg/simulate-thread/speculative-store-4.c: Adjust.
* gcc.dg/simulate-thread/strict-align-global.c: Adjust.
* gcc.dg/simulate-thread/subfields.c: Adjust.
* gcc.dg/tree-ssa/20050314-1.c: Adjust.



patch-allow-races.diff
Description: Binary data


Re: [PATCH] Implement -fsanitize=bounds and internal calls in FEs

2014-06-19 Thread Marek Polacek
On Thu, Jun 19, 2014 at 04:56:53PM +0200, Marek Polacek wrote:
 Regtested/bootstrapped on x86_64-linux.  How does this look?

Now even bootstrap-ubsan passed, with 92258 runtime errors:
index 1 out of bounds for type 'rtunion [1]' - heh.

Marek


Re: -fuse-caller-save - Collect register usage information

2014-06-19 Thread Richard Henderson
On 06/19/2014 09:06 AM, Tom de Vries wrote:
 
 2014-06-19  Tom de Vries  t...@codesourcery.com
 
   * final.c (collect_fn_hard_reg_usage): Don't save function_used_regs if
   it contains all call_used_regs.

Ok.


r~


Re: [PATCH] Fix for PR 61561

2014-06-19 Thread Ramana Radhakrishnan



On 19/06/14 16:12, Kyrill Tkachov wrote:


On 19/06/14 16:05, Marat Zakirov wrote:

Hi all,

Here's a patch for PR 61561
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61561).

It fixes ICE.


Thanks for your contribution.

However, this is *really* not the way to submit a patch and is the sort 
of patch that will make me avoid reviewing it instantly.


But given this is your first time .. :)

Can you please explain why your fix is required ? One can understand 
what you are attempting to achieve which is adding constraints and 
alternatives for moves between sp and general purpose registers for 
HImode and QImode operands, but how did we get here in the first place ? 
My next reaction was which pass is doing this and why ?


Ah, then I go and read your bug report and your submission there in the 
description. Now I understand.


Please read - https://gcc.gnu.org/contribute.html#patches




Reg. tested on arm15.


What's an ARM15 ? I have never seen it or even heard about it. 
Presumably you mean a Cortex-A15 and that is inferred from your comments 
in the bugzilla report.


What configuration did you test, languages ?


Now on to the patch itself.



gcc/ChangeLog:

2014-06-19  Marat Zakirov  m.zaki...@samsung.com

* config/arm/arm.md: New templates see pr61561.


See Changelog formats and comments about using PR numbers in Changelogs.

DATE  NAME  email

PR target/61561
* config/arm/arm.md (*movhi_insn_arch4): Handle stack pointer
  (*arm_movqi_insn): Likewise.


gcc/testsuite/ChangeLog:

2014-06-19  Marat Zakirov  m.zaki...@samsung.com

* c-c++-common/pr61561.c: New test for pr61561.


PR target/61561
* file_name: New test.




diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 42c12c8..7ed8abc 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -6290,27 +6290,31 @@

 ;; Pattern to recognize insn generated default case above
 (define_insn *movhi_insn_arch4
-  [(set (match_operand:HI 0 nonimmediate_operand =r,r,m,r)
-   (match_operand:HI 1 general_operand  rI,K,r,mi))]
+  [(set (match_operand:HI 0 nonimmediate_operand =r,r,m,r,r)
+   (match_operand:HI 1 general_operand  rI,K,r,mi,k))]
   TARGET_ARM
 arm_arch4
 (register_operand (operands[0], HImode)
|| register_operand (operands[1], HImode))
-  @
+  @
mov%?\\t%0, %1\\t%@ movhi
mvn%?\\t%0, #%B1\\t%@ movhi
str%(h%)\\t%1, %0\\t%@ movhi
-   ldr%(h%)\\t%0, %1\\t%@ movhi
+   ldr%(h%)\\t%0, %1\\t%@ movhi
+   uxth%?\\t%0, %1\\t%@ movhi


Instead replace the first source constraint with rIk rather than adding 
another alternative. You don't need a widening operation here. There is 
no need to do a zero extend here. Any widening or narrowing should be 
dealt with where required, and the store would remain a store byte and 
therefore this can be a move like the other moves between registers.


This pattern matches for arm_arch4, where uxth is not a valid 
instruction. Furthermore, on earlier architectures this will cause an 
undefined instruction trap.





   [(set_attr predicable yes)
-   (set_attr pool_range *,*,*,256)
-   (set_attr neg_pool_range *,*,*,244)
+   (set_attr pool_range *,*,*,256,*)
+   (set_attr neg_pool_range *,*,*,244,*)
(set_attr_alternative type
  [(if_then_else (match_operand 1 const_int_operand 
)
 (const_string mov_imm )
 (const_string mov_reg))
   (const_string mvn_imm)
   (const_string store1)
-  (const_string load1)])]
+  (const_string load1)
+  (if_then_else (match_operand 1 const_int_operand 
)
+(const_string mov_imm )
+(const_string mov_reg))])]



This should not be needed now. This is just a move_reg.


 )

 (define_insn *movhi_bytes
@@ -6429,8 +6433,8 @@
 )

 (define_insn *arm_movqi_insn
-  [(set (match_operand:QI 0 nonimmediate_operand =r,r,r,l,r,l,Uu,r,m)
-   (match_operand:QI 1 general_operand r,r,I,Py,K,Uu,l,m,r))]
+  [(set (match_operand:QI 0 nonimmediate_operand =r,r,r,l,r,l,Uu,r,m,r,r)
+   (match_operand:QI 1 general_operand r,r,I,Py,K,Uu,l,m,r,k,k))]


Why do you need 2 alternatives which appear to do the same thing ? 
Instead just do a move.





   TARGET_32BIT
 (   register_operand (operands[0], QImode)
|| register_operand (operands[1], QImode))
@@ -6443,12 +6447,14 @@
ldr%(b%)\\t%0, %1
str%(b%)\\t%1, %0
ldr%(b%)\\t%0, %1
-   str%(b%)\\t%1, %0
-  [(set_attr type 
mov_reg,mov_reg,mov_imm,mov_imm,mvn_imm,load1,store1,load1,store1)
+   str%(b%)\\t%1, %0
+   uxtb%?\\t%0, %1
+   uxtb%?\\t%0, %1



+  [(set_attr type 
mov_reg,mov_reg,mov_imm,mov_imm,mvn_imm,load1,store1,load1,store1,mov_reg,mov_reg)
(set_attr predicable yes)
-   (set_attr predicable_short_it 

Re: [PATCH, ARM] Enable fuse-caller-save for ARM

2014-06-19 Thread Tom de Vries

On 19-06-14 05:59, Richard Henderson wrote:

On 06/01/2014 04:27 AM, Tom de Vries wrote:

+  if (TARGET_AAPCS_BASED)
+{
+  /* For AAPCS, IP and CC can be clobbered by veneers inserted by the
+linker.  We need to add these to allow
+arm_call_fusage_contains_non_callee_clobbers to return true.  */
+  rtx *fusage = CALL_INSN_FUNCTION_USAGE (insn);
+  clobber_reg (fusage, gen_rtx_REG (word_mode, IP_REGNUM));
+  clobber_reg (fusage, gen_rtx_REG (word_mode, CC_REGNUM));


Why are you adding CC_REGNUM if fixed registers are automatically included?



Richard,

You're right, setting a fixed register here is not required for fuse-caller-save 
to work safely.


But it fits the definition of the hook 
TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS:

...
Set to true if all the calls in the current function contain clobbers in 
CALL_INSN_FUNCTION_USAGE for the registers that are clobbered by the call rather 
than by the callee, and are not already set or clobbered in the call pattern.

...

We can adapt the definition to not include fixed registers. I can make a patch 
for that, if you like.


Thanks,
- Tom


Re: Fix finding reg-sets of call insn in collect_fn_hard_reg_usage

2014-06-19 Thread Richard Henderson
On 06/19/2014 09:07 AM, Tom de Vries wrote:
 
 2014-06-19  Tom de Vries  t...@codesourcery.com
 
   * final.c (collect_fn_hard_reg_usage): Add separate IOR_HARD_REG_SET for
   get_call_reg_set_usage.

Ok, as far as it goes, but...

It seems like there should be quite a bit of overlap with regs_ever_live here.
 How much of that previous computation can we leverage?

It appears that regs_ever_live includes any register mentioned explicitly, and
thus the only registers it doesn't contain are those killed by the callees.
That should be an easier scan than the rtl, since we have those already
collected in the cgraph.

Sorry I wasn't paying much attention earlier when this was first posted, when
questions like this may have been answered.


r~


Re: [PATCH, ARM] Enable fuse-caller-save for ARM

2014-06-19 Thread Richard Henderson
On 06/19/2014 09:37 AM, Tom de Vries wrote:
 On 19-06-14 05:59, Richard Henderson wrote:
 On 06/01/2014 04:27 AM, Tom de Vries wrote:
 +  if (TARGET_AAPCS_BASED)
 +{
 +  /* For AAPCS, IP and CC can be clobbered by veneers inserted by the
 + linker.  We need to add these to allow
 + arm_call_fusage_contains_non_callee_clobbers to return true.  */
 +  rtx *fusage = CALL_INSN_FUNCTION_USAGE (insn);
 +  clobber_reg (fusage, gen_rtx_REG (word_mode, IP_REGNUM));
 +  clobber_reg (fusage, gen_rtx_REG (word_mode, CC_REGNUM));

 Why are you adding CC_REGNUM if fixed registers are automatically included?

 
 Richard,
 
 You're right, setting a fixed register here is not required for
 fuse-caller-save to work safely.
 
 But it fits the definition of the hook
 TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS:
 ...
 Set to true if all the calls in the current function contain clobbers in
 CALL_INSN_FUNCTION_USAGE for the registers that are clobbered by the call
 rather than by the callee, and are not already set or clobbered in the call
 pattern.
 ...
 
 We can adapt the definition to not include fixed registers. I can make a patch
 for that, if you like.

I think that would be best.  It'll save just a bit of memory and scanning time.


r~



Re: Fix finding reg-sets of call insn in collect_fn_hard_reg_usage

2014-06-19 Thread Richard Henderson
On 06/19/2014 09:40 AM, Richard Henderson wrote:
 It appears that regs_ever_live includes any register mentioned explicitly, and
 thus the only registers it doesn't contain are those killed by the callees.
 That should be an easier scan than the rtl, since we have those already
 collected in the cgraph.

And I forgot to mention it might be worth while to notice simple recursion.
Avoid the early exit path if caller == callee, despite the caller-save info not
being valid.
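
A minimal sketch of the simple-recursion idea, under stated assumptions: the
types and the function merge_call_usage are hypothetical, not GCC code, and
functions are identified by a plain integer.  The assumption is that a direct
self-call can only touch registers the function itself touches, and those are
already being collected by the scan.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for HARD_REG_SET: one bit per hard register.  */
typedef uint64_t reg_set;

/* Hypothetical description of one direct call.  */
struct call_model
{
  int callee_id;        /* identifies the called function */
  bool usage_known;     /* caller-save info for the callee is valid */
  reg_set callee_used;  /* registers the callee is known to use */
};

/* Merge the callee's usage into *USED.  Return false if the scan has to
   give up.  A self-recursive call is skipped instead of forcing the
   early exit, even though its own usage info is not valid yet.  */
bool
merge_call_usage (int caller_id, const struct call_model *call,
                  reg_set *used)
{
  if (call->callee_id == caller_id)
    return true;

  if (!call->usage_known)
    return false;

  *used |= call->callee_used;
  return true;
}
```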


r~


Re: [PATCH, PR 61540] Do not ICE on impossible devirtualization

2014-06-19 Thread Jan Hubicka
 Hi,
 
 On Wed, Jun 18, 2014 at 06:12:34PM +0200, Bernhard Reutner-Fischer wrote:
  On 18 June 2014 10:24:16 Martin Jambor mjam...@suse.cz wrote:
  
  @@ -3002,10 +3014,8 @@ try_make_edge_direct_virtual_call (struct
  cgraph_edge *ie,
  
 if (target)
   {
  -#ifdef ENABLE_CHECKING
  -  gcc_assert (possible_polymorphic_call_target_p
  -   (ie, cgraph_get_node (target)));
  -#endif
  +  if (!possible_polymorphic_call_target_p (ie, cgraph_get_node 
  (target)))
  +  return ipa_make_edge_direct_to_target (ie, target);
 return ipa_make_edge_direct_to_target (ie, target);
   }
  
  The above looks odd. You return the same thing both conditionally
  and unconditionally?
  
 
 You are obviously right, apparently I was too tired to attempt to work
 that night.  Thanks, for spotting it.  The following patch has this
 corrected and it also passes bootstrap and testing on x86_64-linux on
 both the trunk and the 4.9 branch. OK for both?

OK,
Honza
 
 Thanks,
 
 Martin
 
 
 2014-06-19  Martin Jambor  mjam...@suse.cz
 
   PR ipa/61540
   * ipa-prop.c (impossible_devirt_target): New function.
   (try_make_edge_direct_virtual_call): Use it, also instead of
   asserting.
 
 testsuite/
 * g++.dg/ipa/pr61540.C: New test.
 
 diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
 index b67deed..d9dca52 100644
 --- a/gcc/ipa-prop.c
 +++ b/gcc/ipa-prop.c
 @@ -2912,6 +2912,29 @@ try_make_edge_direct_simple_call (struct cgraph_edge 
 *ie,
return cs;
  }
  
 +/* Return the target to be used in cases of impossible devirtualization.  IE
 +   and target (the latter can be NULL) are dumped when dumping is enabled.  
 */
 +
 +static tree
 +impossible_devirt_target (struct cgraph_edge *ie, tree target)
 +{
 +  if (dump_file)
 +{
 +  if (target)
 + fprintf (dump_file,
 +  "Type inconsident devirtualization: %s/%i->%s\n",
 +  ie->caller->name (), ie->caller->order,
 +  IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (target)));
 +  else
 + fprintf (dump_file,
 +  "No devirtualization target in %s/%i\n",
 +  ie->caller->name (), ie->caller->order);
 +}
 +  tree new_target = builtin_decl_implicit (BUILT_IN_UNREACHABLE);
 +  cgraph_get_create_node (new_target);
 +  return new_target;
 +}
 +
  /* Try to find a destination for indirect edge IE that corresponds to a 
 virtual
 call based on a formal parameter which is described by jump function JFUNC
 and if it can be determined, make it direct and return the direct edge.
 @@ -2946,15 +2969,7 @@ try_make_edge_direct_virtual_call (struct cgraph_edge 
 *ie,
   DECL_FUNCTION_CODE (target) == BUILT_IN_UNREACHABLE)
 || !possible_polymorphic_call_target_p
  (ie, cgraph_get_node (target)))
 - {
 -   if (dump_file)
 - fprintf (dump_file,
 -  "Type inconsident devirtualization: %s/%i->%s\n",
 -  ie->caller->name (), ie->caller->order,
 -  IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (target)));
 -   target = builtin_decl_implicit (BUILT_IN_UNREACHABLE);
 -   cgraph_get_create_node (target);
 - }
 + target = impossible_devirt_target (ie, target);
 return ipa_make_edge_direct_to_target (ie, target);
   }
   }
 @@ -2984,10 +2999,7 @@ try_make_edge_direct_virtual_call (struct cgraph_edge 
 *ie,
if (targets.length () == 1)
   target = targets[0]->decl;
else
 - {
 -  target = builtin_decl_implicit (BUILT_IN_UNREACHABLE);
 -   cgraph_get_create_node (target);
 - }
 + target = impossible_devirt_target (ie, NULL_TREE);
  }
else
  {
 @@ -3002,10 +3014,8 @@ try_make_edge_direct_virtual_call (struct cgraph_edge 
 *ie,
  
if (target)
  {
 -#ifdef ENABLE_CHECKING
 -  gcc_assert (possible_polymorphic_call_target_p
 -  (ie, cgraph_get_node (target)));
 -#endif
 +  if (!possible_polymorphic_call_target_p (ie, cgraph_get_node (target)))
 + target = impossible_devirt_target (ie, target);
return ipa_make_edge_direct_to_target (ie, target);
  }
else
 diff --git a/gcc/testsuite/g++.dg/ipa/pr61540.C 
 b/gcc/testsuite/g++.dg/ipa/pr61540.C
 new file mode 100644
 index 000..d298964
 --- /dev/null
 +++ b/gcc/testsuite/g++.dg/ipa/pr61540.C
 @@ -0,0 +1,41 @@
 +/* { dg-do compile } */
 +/* { dg-options "-O3 -fno-early-inlining -fdump-ipa-cp" } */
 +
 +struct data {
 +  data(int) {}
 +};
 +
 +struct top {
 +  virtual int topf() {}
 +};
 +
 +struct intermediate: top {
 +int topf() /* override */ { return 0; }
 +};
 +
 +struct child1: top {
 +void childf()
 +{
 +data d(topf());
 +}
 +};
 +
 +struct child2: intermediate {};
 +
 +void test(top& t)
 +{
 +child1& c = static_cast<child1&>(t);
 +c.childf();
 +child2 d;
 +test(d);
 +}
 +
 +int main (int argc, char **argv)
 +{
 + 

Re: [PATCH] Implement -fsanitize=bounds and internal calls in FEs

2014-06-19 Thread Jakub Jelinek
Hi!

On Thu, Jun 19, 2014 at 04:56:53PM +0200, Marek Polacek wrote:

Thanks for working on this.

 --- gcc/asan.c
 +++ gcc/asan.c
 @@ -2761,6 +2761,9 @@ pass_sanopt::execute (function *fun)
 case IFN_UBSAN_NULL:
   ubsan_expand_null_ifn (gsi);
   break;
 +   case IFN_UBSAN_BOUNDS:
 + ubsan_expand_bounds_btn (gsi);
 + break;
 default:

Why *_btn instead of *_ifn ?

 +static tree
 +ubsan_walk_array_refs_r (tree *tp, int *walk_subtrees, void *data)
 +{
 +  struct pointer_set_t *pset = (struct pointer_set_t *) data;
 +
 +  if (TREE_CODE (*tp) == BIND_EXPR)
 +{

I think it would be worth adding here a comment why do you handle BIND_EXPR
here, that it doesn't walk the vars, but only their initializers etc. and
thus in order to prevent walking DECL_INITIAL of TREE_STATIC decls
we have to duplicate this part of walk_tree.

 +  for (tree decl = BIND_EXPR_VARS (*tp); decl; decl = DECL_CHAIN (decl))
 + {
 +   if (TREE_STATIC (decl))
 + {
 +   *walk_subtrees = 0;
 +   continue;
 + }
 +   walk_tree (DECL_INITIAL (decl), ubsan_walk_array_refs_r, NULL, pset);
 +   walk_tree (DECL_SIZE (decl), ubsan_walk_array_refs_r, NULL, pset);
 +   walk_tree (DECL_SIZE_UNIT (decl), ubsan_walk_array_refs_r, NULL, 
 pset);

Shouldn't that use pset, pset); or data, pset); ?
Also, too long lines (at least the last one, first one likely too).

 +  tree bound = TYPE_MAX_VALUE (domain);
 +  if (ignore_off_by_one)
 +bound = fold_build2 (PLUS_EXPR, TREE_TYPE (bound), bound,
 +  build_int_cst (TREE_TYPE (bound), 1));
 +
 +  /* Detect flexible array members and suchlike.  */
 +  tree base = get_base_address (array);
 +  if (base && TREE_CODE (base) == INDIRECT_REF)

I'd check also == MEM_REF here, while the FEs often use INDIRECT_REFs,
there are already spots where it creates MEM_REFs.

 +void
 +ubsan_maybe_instrument_array_ref (tree *expr_p, bool ignore_off_by_one)
 +{
 +  if (!ubsan_array_ref_instrumented_p (*expr_p)
 +   && current_function_decl != 0

Please use != NULL_TREE.

 +   && !lookup_attribute ("no_sanitize_undefined",
 + DECL_ATTRIBUTES (current_function_decl)))
 +{
 +  tree t = copy_node (*expr_p);
 +  tree op0 = TREE_OPERAND (t, 0);
 +  tree op1 = TREE_OPERAND (t, 1);

Please don't call copy_node until you know you want to instrument it.
I.e.
  tree op0 = TREE_OPERAND (*expr_p, 0);
  tree op1 = TREE_OPERAND (*expr_p, 1);
 +  tree e = ubsan_instrument_bounds (EXPR_LOCATION (t), op0, op1,

s/t/*expr_p/ above.

 + ignore_off_by_one);
 +  if (e != NULL_TREE)
 + {

and only here add:
  tree t = copy_node (*expr_p);
 +   TREE_OPERAND (t, 1) = build2 (COMPOUND_EXPR, TREE_TYPE (op1),
 + e, op1);
 +   *expr_p = t;
 + }
 +}
 +}

 --- gcc/tree-pretty-print.c
 +++ gcc/tree-pretty-print.c
 @@ -3218,6 +3218,13 @@ print_call_name (pretty_printer *buffer, tree node, 
 int flags)
  {
tree op0 = node;
  
 +  if (node == NULL_TREE)
 +{
 +  /* TODO Print builtin name.  */
 +  pp_string (buffer, "internal function call");

Use internal_fn_name function?

Jakub


Re: [PATCH] Implement -fsanitize=bounds and internal calls in FEs

2014-06-19 Thread Jakub Jelinek
On Thu, Jun 19, 2014 at 04:56:53PM +0200, Marek Polacek wrote:
 + /* Don't instrument this FMA-like array in non-strict

Also, please don't use FMA to mean flexible member array, it is
flexible array member, but more importantly, FMA is used for fused
multiply-add, so IMHO it is better to spell it without acronym.

Jakub


Re: [PATCH, AARCH64] Enable fuse-caller-save for AARCH64

2014-06-19 Thread Tom de Vries

On 19-06-14 05:53, Richard Henderson wrote:

On 06/01/2014 03:00 AM, Tom de Vries wrote:

+aarch64_emit_call_insn (rtx pat)
+{
+  rtx insn = emit_call_insn (pat);
+
+  rtx *fusage = CALL_INSN_FUNCTION_USAGE (insn);
+  clobber_reg (fusage, gen_rtx_REG (word_mode, IP0_REGNUM));
+  clobber_reg (fusage, gen_rtx_REG (word_mode, IP1_REGNUM));

Actually, I'd like to know more about how this is supposed to work.

Why are you only marking the two registers that would be used by a PLT entry,
but not those clobbered by the ld.so trampoline, or indeed the unknown function
that would be called from the PLT.

Oh, I see, looking at the code we do actually follow the cgraph and make sure
it is a direct call with a known destination.  So, in fact, it's only the
registers that could be clobbered by ld branch islands (so these two are still
correct for aarch64).

This means the documentation is actually wrong when it mentions PLTs at all.


Yes, if we go from the point of view that the 
TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS hook's sole purpose is to enable 
the fuse-caller-save optimization.


How about this updated definition ? OK for trunk if re-testing on arm succeeds ?

Thanks,
- Tom


2014-06-19  Tom de Vries  t...@codesourcery.com

	* config/arm/arm.c (arm_emit_call_insn): Remove clobber of CC_REGNUM.
	* target.def: Update definition.
	* doc/tm.texi: Regenerate.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index d293b5b..178f08b 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -17642,11 +17642,11 @@ arm_emit_call_insn (rtx pat, rtx addr, bool sibcall)
   if (TARGET_AAPCS_BASED)
 {
   /* For AAPCS, IP and CC can be clobbered by veneers inserted by the
-	 linker.  We need to add these to allow setting
-	 TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS to true.  */
+	 linker.  We need to add IP to allow setting
+	 TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS to true.  CC is not
+	 needed since it's a fixed register.  */
   rtx *fusage = CALL_INSN_FUNCTION_USAGE (insn);
   clobber_reg (fusage, gen_rtx_REG (word_mode, IP_REGNUM));
-  clobber_reg (fusage, gen_rtx_REG (word_mode, CC_REGNUM));
 }
 }
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index c272630..b0a8dbd 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -4884,14 +4884,11 @@ Whether this target supports splitting the stack when the options described in @
 @cindex miscellaneous register hooks
 
 @deftypevr {Target Hook} bool TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS
-set to true if all the calls in the current function contain clobbers in
-CALL_INSN_FUNCTION_USAGE for the registers that are clobbered by the call
-rather than by the callee, and are not already set or clobbered in the call
-pattern.  Examples of such registers are registers used in PLTs and stubs,
-and temporary registers used in the call instruction but not present in the
-rtl pattern.  Another way to formulate it is the registers not present in the
-rtl pattern that are clobbered by the call assuming the callee does not
-clobber any register.  The default version of this hook is set to false.
+Set to true if each call that binds to a local definition contain clobbers
+in CALL_INSN_FUNCTION_USAGE for the non-fixed registers that are clobbered by
+the call rather than by the callee, and are not already set or clobbered in
+the call pattern.  The default version of this hook is set to false.  The
+purpose of this hook it to enable the fuse-caller-save optimization.
 @end deftypevr
 
 @node Varargs
diff --git a/gcc/target.def b/gcc/target.def
index e455211..b738281 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -5128,18 +5128,15 @@ FRAME_POINTER_REGNUM, ARG_POINTER_REGNUM, and the PIC_OFFSET_TABLE_REGNUM.,
  hook_void_bitmap)
 
 /* Targets should define this target hook to mark that non-callee clobbers are
-   present in CALL_INSN_FUNCTION_USAGE for all the calls in the current
-   function.  */
+   present in CALL_INSN_FUNCTION_USAGE for all the calls that bind to a local
+   definition.  */
 DEFHOOKPOD
 (call_fusage_contains_non_callee_clobbers,
- set to true if all the calls in the current function contain clobbers in\n\
-CALL_INSN_FUNCTION_USAGE for the registers that are clobbered by the call\n\
-rather than by the callee, and are not already set or clobbered in the call\n\
-pattern.  Examples of such registers are registers used in PLTs and stubs,\n\
-and temporary registers used in the call instruction but not present in the\n\
-rtl pattern.  Another way to formulate it is the registers not present in the\n\
-rtl pattern that are clobbered by the call assuming the callee does not\n\
-clobber any register.  The default version of this hook is set to false.,
+ Set to true if each call that binds to a local definition contain clobbers\n\
+in CALL_INSN_FUNCTION_USAGE for the non-fixed registers that are clobbered by\n\
+the call rather than by the callee, and are not already set or clobbered in\n\
+the call pattern.  

Re: [PATCH, AARCH64] Enable fuse-caller-save for AARCH64

2014-06-19 Thread Richard Henderson
On 06/19/2014 11:25 AM, Tom de Vries wrote:
 On 19-06-14 05:53, Richard Henderson wrote:
 On 06/01/2014 03:00 AM, Tom de Vries wrote:
 +aarch64_emit_call_insn (rtx pat)
 +{
 +  rtx insn = emit_call_insn (pat);
 +
 +  rtx *fusage = CALL_INSN_FUNCTION_USAGE (insn);
 +  clobber_reg (fusage, gen_rtx_REG (word_mode, IP0_REGNUM));
 +  clobber_reg (fusage, gen_rtx_REG (word_mode, IP1_REGNUM));
 Actually, I'd like to know more about how this is supposed to work.

 Why are you only marking the two registers that would be used by a PLT entry,
 but not those clobbered by the ld.so trampoline, or indeed the unknown
 function that would be called from the PLT.

 Oh, I see, looking at the code we do actually follow the cgraph and make sure
 it is a direct call with a known destination.  So, in fact, it's only the
 registers that could be clobbered by ld branch islands (so these two are
 still correct for aarch64).

 This means the documentation is actually wrong when it mentions PLTs at all.
 
 Yes, if we go from the point of view that the
 TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS hook's sole purpose is to
 enable the fuse-caller-save optimization.
 
 How about this updated definition ? OK for trunk if re-testing on arm
 succeeds ?

I did like the doc including mention of stubs, because they're easy to
forget.  How about

Set to true if each call that binds to a local definition explicitly clobbers
or sets all non-fixed registers modified by performing the call.  That is, by
the call pattern itself, or by code that might be inserted by the linker
(e.g. stubs, veneers, branch islands), but not including those modifiable by
the callee.  The affected registers may be mentioned explicitly in the
call pattern, or included as clobbers in CALL_INSN_FUNCTION_USAGE.
The default version of this hook is set to false.  The purpose of this hook
is to enable the fuse-caller-save optimization.


r~


Re: -fuse-caller-save - Collect register usage information

2014-06-19 Thread Jan Hubicka
 On 06/19/2014 09:06 AM, Tom de Vries wrote:
  
  2014-06-19  Tom de Vries  t...@codesourcery.com
  
  * final.c (collect_fn_hard_reg_usage): Don't save function_used_regs if
  it contains all call_used_regs.
 
 Ok.

Now that we have a way to represent different register usage for functions,
what would be the best way to make local functions default to saving some SSE
registers on x86/x86-64?

Honza


Re: -fuse-caller-save - Collect register usage information

2014-06-19 Thread Richard Henderson
On 06/19/2014 12:36 PM, Jan Hubicka wrote:
 On 06/19/2014 09:06 AM, Tom de Vries wrote:

 2014-06-19  Tom de Vries  t...@codesourcery.com

 * final.c (collect_fn_hard_reg_usage): Don't save function_used_regs if
 it contains all call_used_regs.

 Ok.
 
 Now that we have a way to represent different register usage for functions,
 what would be the best way to make local functions default to saving some SSE
 registers on x86/x86-64?

I wouldn't do that at all.  Leave all sse registers call-clobbered.  This way
you don't need to have different entry points (or one possibly less efficient
entry point) when a function is used both locally and globally.

What I would investigate is how to use this hard reg usage data in the register
allocator.  If we know that the callee only uses xmm0-xmm4, then we can keep
xmm5-xmm15 live across the call.
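
As a rough illustration of that last point, here is a sketch with
hypothetical names (reg_set, XMM, regs_live_across_call); it is not how the
register allocator is implemented.  It assumes one bit per SSE register and
that the callee's usage set is complete.

```c
#include <stdint.h>

/* Hypothetical register set: bit N stands for xmmN.  */
typedef uint32_t reg_set;

#define XMM(n) ((reg_set) 1u << (n))

/* Registers that are nominally call-clobbered but that a particular
   callee is known not to touch; values in them could stay live across
   a call to that callee.  */
reg_set
regs_live_across_call (reg_set call_clobbered, reg_set callee_used)
{
  return call_clobbered & ~callee_used;
}
```

With all sixteen registers call-clobbered and a callee that uses only
xmm0-xmm4, the result is the set xmm5-xmm15, matching the example in the
message.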


r~


Re: [PATCH] Fix for PR 61561

2014-06-19 Thread Yuri Gribov
 Thirdly, we also need to fix movhi_bytes (for pre-v4) thumb2_movhi_insn
 (for thumb2) and, quite possibly, thumb1_movhi_insn (for thumb1).  There
 may well be additional changes for movqi variants as well.

A general question: how should one test ARM backend patches? Is it
enough to regtest ARM and Thumb2 on some modern Cortex? If not - what
other variants should be tested?

-Y


Re: [PATCH 0/5] let gdb reuse gcc'c C compiler

2014-06-19 Thread Tom Tromey
 Tom == Tom Tromey tro...@redhat.com writes:

Tom This patch series is half of a project to let gdb reuse gcc (which
Tom half depends on which list you are seeing this on), so that users can
Tom compile small snippets of code and evaluate them in the current
Tom context of the inferior.

We've updated the patches according to the reviews.

Since I'm not sure if GCC prefers new patch series submissions or
follow-ups; and not all the patches changed; and patch #5 didn't make it
through the email filter unmangled the first time -- I'm just sending
the new patches as individual follow-ups.  I'm happy to git send-email
again though if you'd prefer.

I believe we've addressed all the review comments.

On the gdb side a new series will be ready very soon.  We're just
finishing fixing up the documentation according to the review.

thanks,
Tom


Re: [PATCH 3/5] introduce the binding oracle

2014-06-19 Thread Tom Tromey
Jeff Just a nit.  C-style comment would be appreciated.  It might also help
Jeff to clarify what much more sane really means here.

Jeff Otherwise, it looks OK to me.

Here's the updated patch.

Tom

2014-06-19  Phil Muldoon  pmuld...@redhat.com
Tom Tromey  tro...@redhat.com

* c-tree.h (enum c_oracle_request): New.
(c_binding_oracle_function): New typedef.
(c_binding_oracle, c_pushtag, c_bind): Declare.
* c-decl.c (c_binding_oracle): New global.
(I_SYMBOL_CHECKED): New macro.
(i_symbol_binding): New function.
(I_SYMBOL_BINDING, I_SYMBOL_DECL): Redefine.
(I_TAG_CHECKED): New macro.
(i_tag_binding): New function.
(I_TAG_BINDING, I_TAG_DECL): Redefine.
(I_LABEL_CHECKED): New macro.
(i_label_binding): New function.
(I_LABEL_BINDING, I_LABEL_DECL): Redefine.
(c_print_identifier): Save and restore c_binding_oracle.
(c_pushtag, c_bind): New functions.

---
 gcc/c/ChangeLog |  19 +++
 gcc/c/c-decl.c  | 164 ++--
 gcc/c/c-tree.h  |  24 +
 3 files changed, 192 insertions(+), 15 deletions(-)

diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 3456030..b0f47af 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -216,21 +216,6 @@ struct GTY((chain_next ("%h.prev"))) c_binding {
 #define B_IN_FILE_SCOPE(b) ((b)->depth == 1 /*file_scope->depth*/)
 #define B_IN_EXTERNAL_SCOPE(b) ((b)->depth == 0 /*external_scope->depth*/)
 
-#define I_SYMBOL_BINDING(node) \
-  (((struct lang_identifier *) IDENTIFIER_NODE_CHECK(node))->symbol_binding)
-#define I_SYMBOL_DECL(node) \
- (I_SYMBOL_BINDING(node) ? I_SYMBOL_BINDING(node)->decl : 0)
-
-#define I_TAG_BINDING(node) \
-  (((struct lang_identifier *) IDENTIFIER_NODE_CHECK(node))->tag_binding)
-#define I_TAG_DECL(node) \
- (I_TAG_BINDING(node) ? I_TAG_BINDING(node)->decl : 0)
-
-#define I_LABEL_BINDING(node) \
-  (((struct lang_identifier *) IDENTIFIER_NODE_CHECK(node))->label_binding)
-#define I_LABEL_DECL(node) \
- (I_LABEL_BINDING(node) ? I_LABEL_BINDING(node)->decl : 0)
-
-
 /* Each C symbol points to three linked lists of c_binding structures.
These describe the values of the identifier in the three different
namespaces defined by the language.  */
@@ -246,6 +231,96 @@ struct GTY(()) lang_identifier {
 extern char C_SIZEOF_STRUCT_LANG_IDENTIFIER_isnt_accurate
 [(sizeof(struct lang_identifier) == C_SIZEOF_STRUCT_LANG_IDENTIFIER) ? 1 : -1];
 
+/* The binding oracle; see c-tree.h.  */
+void (*c_binding_oracle) (enum c_oracle_request, tree identifier);
+
+/* This flag is set on an identifier if we have previously asked the
+   binding oracle for this identifier's symbol binding.  */
+#define I_SYMBOL_CHECKED(node) \
+  (TREE_LANG_FLAG_4 (IDENTIFIER_NODE_CHECK (node)))
+
+static inline struct c_binding* *
+i_symbol_binding (tree node)
+{
+  struct lang_identifier *lid
+    = (struct lang_identifier *) IDENTIFIER_NODE_CHECK (node);
+
+  if (lid->symbol_binding == NULL
+      && c_binding_oracle != NULL
+      && !I_SYMBOL_CHECKED (node))
+    {
+      /* Set the checked flag first, to avoid infinite recursion
+         when the binding oracle calls back into gcc.  */
+      I_SYMBOL_CHECKED (node) = 1;
+      c_binding_oracle (C_ORACLE_SYMBOL, node);
+    }
+
+  return &lid->symbol_binding;
+}
+
+#define I_SYMBOL_BINDING(node) (*i_symbol_binding (node))
+
+#define I_SYMBOL_DECL(node) \
+ (I_SYMBOL_BINDING(node) ? I_SYMBOL_BINDING(node)->decl : 0)
+
+/* This flag is set on an identifier if we have previously asked the
+   binding oracle for this identifier's tag binding.  */
+#define I_TAG_CHECKED(node) \
+  (TREE_LANG_FLAG_5 (IDENTIFIER_NODE_CHECK (node)))
+
+static inline struct c_binding **
+i_tag_binding (tree node)
+{
+  struct lang_identifier *lid
+    = (struct lang_identifier *) IDENTIFIER_NODE_CHECK (node);
+
+  if (lid->tag_binding == NULL
+      && c_binding_oracle != NULL
+      && !I_TAG_CHECKED (node))
+    {
+      /* Set the checked flag first, to avoid infinite recursion
+         when the binding oracle calls back into gcc.  */
+      I_TAG_CHECKED (node) = 1;
+      c_binding_oracle (C_ORACLE_TAG, node);
+    }
+
+  return &lid->tag_binding;
+}
+
+#define I_TAG_BINDING(node) (*i_tag_binding (node))
+
+#define I_TAG_DECL(node) \
+ (I_TAG_BINDING(node) ? I_TAG_BINDING(node)->decl : 0)
+
+/* This flag is set on an identifier if we have previously asked the
+   binding oracle for this identifier's label binding.  */
+#define I_LABEL_CHECKED(node) \
+  (TREE_LANG_FLAG_6 (IDENTIFIER_NODE_CHECK (node)))
+
+static inline struct c_binding **
+i_label_binding (tree node)
+{
+  struct lang_identifier *lid
+    = (struct lang_identifier *) IDENTIFIER_NODE_CHECK (node);
+
+  if (lid->label_binding == NULL
+      && c_binding_oracle != NULL
+      && !I_LABEL_CHECKED (node))
+    {
+      /* Set the checked flag first, to avoid infinite recursion
+         when the binding oracle calls back into gcc.  */
+  
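
[Editorial sketch, not part of the patch.]  The three i_*_binding
helpers above share one pattern: consult the oracle at most once per
identifier, and set the "checked" flag before calling out so that a
re-entrant lookup cannot recurse.  The standalone C fragment below
illustrates that pattern only.  The types and the names lookup_binding,
toy_oracle and demo_oracle are hypothetical stand-ins, not GCC
identifiers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for GCC's binding and identifier types.  */
struct binding { int value; };

struct identifier
{
  struct binding *symbol_binding; /* NULL until something binds it.  */
  int symbol_checked;             /* Nonzero once the oracle was asked.  */
};

/* The oracle callback; NULL when no client (e.g. a debugger) is attached.  */
static void (*binding_oracle) (struct identifier *);

static int oracle_calls;
static struct binding oracle_result = { 42 };

/* Return the address of ID's binding slot, asking the oracle at most once.  */
static struct binding **
lookup_binding (struct identifier *id)
{
  if (id->symbol_binding == NULL
      && binding_oracle != NULL
      && !id->symbol_checked)
    {
      /* Set the flag first so a re-entrant lookup does not recurse.  */
      id->symbol_checked = 1;
      binding_oracle (id);
    }
  return &id->symbol_binding;
}

/* A toy oracle that re-enters the lookup before supplying a binding.  */
static void
toy_oracle (struct identifier *id)
{
  oracle_calls++;
  (void) lookup_binding (id); /* Re-entrant call: returns immediately.  */
  id->symbol_binding = &oracle_result;
}

/* Run the scenario; return the number of oracle calls, or -1 on mismatch.  */
static int
demo_oracle (void)
{
  struct identifier id = { NULL, 0 };
  oracle_calls = 0;
  binding_oracle = toy_oracle;
  struct binding *first = *lookup_binding (&id);
  struct binding *second = *lookup_binding (&id);
  return (first == &oracle_result && second == first) ? oracle_calls : -1;
}
```

Returning the address of the slot rather than its value is what lets a
macro of the form (*lookup (node)) remain usable as an lvalue, as the
redefined I_SYMBOL_BINDING above is.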

Re: [PATCH 4/5] add gcc/gdb interface files

2014-06-19 Thread Tom Tromey
>>>>> "Jeff" == Jeff Law <l...@redhat.com> writes:

>> One other random idea was something like:

>> GCC_METHOD7 (gcc_decl, build_decl,
>>              const char *,             /* Argument NAME.  */
>>              enum gcc_c_symbol_kind,   /* Argument SYM_KIND.  */

Jeff> Works for me.

I took this approach.  Other changes in this version are a minor
change to one of the generic gcc methods, to make it possible to choose
the correct compiler, and changing "GDB" to "GCC" in some of the
comments.

Tom

2014-06-19  Phil Muldoon  pmuld...@redhat.com
Jan Kratochvil  jan.kratoch...@redhat.com
Tom Tromey  tro...@redhat.com

* gcc-c-fe.def: New file.
* gcc-c-interface.h: New file.
* gcc-interface.h: New file.

---
 include/ChangeLog |   8 ++
 include/gcc-c-fe.def  | 197 +
 include/gcc-c-interface.h | 220 ++
 include/gcc-interface.h   | 127 ++
 4 files changed, 552 insertions(+)
 create mode 100644 include/gcc-c-fe.def
 create mode 100644 include/gcc-c-interface.h
 create mode 100644 include/gcc-interface.h

diff --git a/include/gcc-c-fe.def b/include/gcc-c-fe.def
new file mode 100644
index 000..19cb867
--- /dev/null
+++ b/include/gcc-c-fe.def
@@ -0,0 +1,197 @@
+/* Interface between GCC C FE and GDB  -*- c -*-
+
+   Copyright (C) 2014 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see http://www.gnu.org/licenses/.  */
+
+
+
+/* Create a new decl in GCC.  A decl is a declaration, basically a
+   kind of symbol.
+
+   NAME is the name of the new symbol.  SYM_KIND is the kind of
+   symbol being requested.  SYM_TYPE is the new symbol's C type;
+   except for labels, where this is not meaningful and should be
+   zero.  If SUBSTITUTION_NAME is not NULL, then a reference to this
+   decl in the source will later be substituted with a dereference
+   of a variable of the given name.  Otherwise, for symbols having
+   an address (e.g., functions), ADDRESS is the address.  FILENAME
+   and LINE_NUMBER refer to the symbol's source location.  If this
+   is not known, FILENAME can be NULL and LINE_NUMBER can be 0.
+   This function returns the new decl.  */
+
+GCC_METHOD7 (gcc_decl, build_decl,
+             const char *,             /* Argument NAME.  */
+             enum gcc_c_symbol_kind,   /* Argument SYM_KIND.  */
+             gcc_type,                 /* Argument SYM_TYPE.  */
+             const char *,             /* Argument SUBSTITUTION_NAME.  */
+             gcc_address,              /* Argument ADDRESS.  */
+             const char *,             /* Argument FILENAME.  */
+             unsigned int)             /* Argument LINE_NUMBER.  */
+
+/* Insert a GCC decl into the symbol table.  DECL is the decl to
+   insert.  IS_GLOBAL is true if this is an outermost binding, and
+   false if it is a possibly-shadowing binding.  */
+
+GCC_METHOD2 (int /* bool */, bind,
+             gcc_decl,                 /* Argument DECL.  */
+             int /* bool */)           /* Argument IS_GLOBAL.  */
+
+/* Insert a tagged type into the symbol table.  NAME is the tag name
+   of the type and TAGGED_TYPE is the type itself.  TAGGED_TYPE must
+   be either a struct, union, or enum type, as these are the only
+   types that have tags.  FILENAME and LINE_NUMBER refer to the type's
+   source location.  If this is not known, FILENAME can be NULL and
+   LINE_NUMBER can be 0.  */
+
+GCC_METHOD4 (int /* bool */, tagbind,
+             const char *,             /* Argument NAME.  */
+             gcc_type,                 /* Argument TAGGED_TYPE.  */
+             const char *,             /* Argument FILENAME.  */
+             unsigned int)             /* Argument LINE_NUMBER.  */
+
+/* Return the type of a pointer to a given base type.  */
+
+GCC_METHOD1 (gcc_type, build_pointer_type,
+             gcc_type)                 /* Argument BASE_TYPE.  */
+
+/* Create a new 'struct' type.  Initially it has no fields.  */
+
+GCC_METHOD0 (gcc_type, build_record_type)
+
+/* Create a new 'union' type.  Initially it has no fields.  */
+
+GCC_METHOD0 (gcc_type, build_union_type)
+
+/* Add a field to a struct or union type.  FIELD_NAME is the field's
+   name.  FIELD_TYPE is the type of the field.  BITSIZE and BITPOS
+   indicate where in the struct the field occurs.  */
+
+GCC_METHOD5 (int /* bool */, build_add_field,
+  
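
[Editorial sketch, not part of the patch.]  A .def file of GCC_METHODn
entries is typically consumed by defining the macros differently before
each inclusion (the "X-macro" technique).  The fragment below shows one
plausible way such a list can be expanded: into a table of function
pointers, an enumeration of request ids, and an initializer.  To stay
self-contained it uses a list macro instead of a separate file, and
every name in it (TOY_METHODS, toy_vtable, toy_add and so on) is
hypothetical rather than taken from gcc-c-interface.h.

```c
#include <assert.h>

/* Hypothetical method list standing in for a .def file.  Each entry is
   METHOD2 (return type, name, first argument type, second argument type).  */
#define TOY_METHODS(METHOD2)   \
  METHOD2 (int, add, int, int) \
  METHOD2 (int, max_of, int, int)

/* First expansion: one function-pointer field per method.  */
#define AS_FIELD(R, N, A, B) R (*N) (A, B);
struct toy_vtable
{
  TOY_METHODS (AS_FIELD)
};
#undef AS_FIELD

/* Second expansion: one enumerator per method, usable as a request id.  */
#define AS_ENUM(R, N, A, B) TOY_##N,
enum toy_method_id
{
  TOY_METHODS (AS_ENUM)
  TOY_METHOD_COUNT
};
#undef AS_ENUM

/* Implementations, named toy_<method> by convention.  */
static int toy_add (int a, int b) { return a + b; }
static int toy_max_of (int a, int b) { return a > b ? a : b; }

/* Third expansion: fill the table in declaration order.  */
#define AS_INIT(R, N, A, B) toy_##N,
static const struct toy_vtable toy_impl = { TOY_METHODS (AS_INIT) };
#undef AS_INIT
```

Because all three expansions come from the same list, adding a method
in one place keeps the struct, the ids and the initializer in step.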

Re: [PATCH 2/5] c_diagnostic_ignored_function hack

2014-06-19 Thread Tom Tromey
Joseph> I'd say this global actually belongs somewhere in the
Joseph> diagnostic_context (i.e., instead of the
Joseph> diagnostic_context_auxiliary_data (DC) actually being a tree as
Joseph> it is at present, it should point to a structure with whatever
Joseph> extra information clients wish to use to control aspects of
Joseph> diagnostic reporting).

We dropped this patch from the series and instead the diagnostic stuff
is all handled in the plugin itself.  You can see it in the new patch #5.

Tom


Re: [PATCH 5/5] add libcc1

2014-06-19 Thread Tom Tromey
Joseph> I don't see anything obvious that would disable the plugin if
Joseph> plugins are unsupported (e.g. on Windows host) or disabled
Joseph> (--disable-plugin).  Probably the relevant support from
Joseph> gcc/configure.ac needs to go somewhere it can be used at
Joseph> toplevel.

Here's the patch to pull this out to a separate file.
I've omitted generated code from this mail.

Tom

b/config/ChangeLog:
2014-06-19  Tom Tromey  tro...@redhat.com

* gcc-plugin.m4: New file.

b/gcc/ChangeLog:
2014-06-19  Tom Tromey  tro...@redhat.com

* aclocal.m4, configure: Rebuild.
* Makefile.in (aclocal_deps): Add gcc-plugin.m4.
* configure.ac: Use GCC_ENABLE_PLUGINS.

---
 config/ChangeLog |   4 ++
 config/gcc-plugin.m4 | 113 +
 gcc/ChangeLog|   6 ++
 gcc/Makefile.in  |   1 +
 gcc/aclocal.m4   |  89 +-
 gcc/configure| 172 +--
 gcc/configure.ac | 100 +-
 7 files changed, 212 insertions(+), 273 deletions(-)
 create mode 100644 config/gcc-plugin.m4

diff --git a/config/gcc-plugin.m4 b/config/gcc-plugin.m4
new file mode 100644
index 000..dd06a58
--- /dev/null
+++ b/config/gcc-plugin.m4
@@ -0,0 +1,113 @@
+# gcc-plugin.m4 -*- Autoconf -*-
+# Check whether GCC is able to be built with plugin support.
+
+dnl Copyright (C) 2014 Free Software Foundation, Inc.
+dnl This file is free software, distributed under the terms of the GNU
+dnl General Public License.  As a special exception to the GNU General
+dnl Public License, this file may be distributed as part of a program
+dnl that contains a configuration script generated by Autoconf, under
+dnl the same distribution terms as the rest of that program.
+
+# Check for plugin support.
+# Respects --enable-plugin.
+# Sets the shell variables enable_plugin and pluginlibs.
+AC_DEFUN([GCC_ENABLE_PLUGINS],
+  [# Check for plugin support
+   AC_ARG_ENABLE(plugin,
+   [AS_HELP_STRING([--enable-plugin], [enable plugin support])],
+   enable_plugin=$enableval,
+   enable_plugin=yes; default_plugin=yes)
+
+   pluginlibs=
+
+   case "${host}" in
+     *-*-darwin*)
+       if test x$build = x$host; then
+         export_sym_check="nm${exeext} -g"
+       elif test x$host = x$target; then
+         export_sym_check="$gcc_cv_nm -g"
+       else
+         export_sym_check=
+       fi
+     ;;
+     *)
+       if test x$build = x$host; then
+         export_sym_check="objdump${exeext} -T"
+       elif test x$host = x$target; then
+         export_sym_check="$gcc_cv_objdump -T"
+       else
+         export_sym_check=
+       fi
+     ;;
+   esac
+
+   if test x"$enable_plugin" = x"yes"; then
+
+     AC_MSG_CHECKING([for exported symbols])
+     if test "x$export_sym_check" != x; then
+       echo "int main() {return 0;} int foobar() {return 0;}" > conftest.c
+       ${CC} ${CFLAGS} ${LDFLAGS} conftest.c -o conftest$ac_exeext > /dev/null 2>&1
+       if $export_sym_check conftest$ac_exeext | grep -q foobar > /dev/null; then
+         : # No need to use a flag
+         AC_MSG_RESULT([yes])
+       else
+         AC_MSG_RESULT([yes])
+         AC_MSG_CHECKING([for -rdynamic])
+         ${CC} ${CFLAGS} ${LDFLAGS} -rdynamic conftest.c -o conftest$ac_exeext > /dev/null 2>&1
+         if $export_sym_check conftest$ac_exeext | grep -q foobar > /dev/null; then
+           plugin_rdynamic=yes
+           pluginlibs="-rdynamic"
+         else
+           plugin_rdynamic=no
+           enable_plugin=no
+         fi
+         AC_MSG_RESULT([$plugin_rdynamic])
+       fi
+     else
+       AC_MSG_RESULT([unable to check])
+     fi
+
+     # Check -ldl
+     saved_LIBS="$LIBS"
+     AC_SEARCH_LIBS([dlopen], [dl])
+     if test x"$ac_cv_search_dlopen" = x"-ldl"; then
+       pluginlibs="$pluginlibs -ldl"
+     fi
+     LIBS="$saved_LIBS"
+
+     # Check that we can build shared objects with -fPIC -shared
+     saved_LDFLAGS="$LDFLAGS"
+     saved_CFLAGS="$CFLAGS"
+     case "${host}" in
+       *-*-darwin*)
+         CFLAGS=`echo $CFLAGS | sed s/-mdynamic-no-pic//g`
+         CFLAGS="$CFLAGS -fPIC"
+         LDFLAGS="$LDFLAGS -shared -undefined dynamic_lookup"
+       ;;
+       *)
+         CFLAGS="$CFLAGS -fPIC"
+         LDFLAGS="$LDFLAGS -fPIC -shared"
+       ;;
+     esac
+     AC_MSG_CHECKING([for -fPIC -shared])
+     AC_TRY_LINK(
+       [extern int X;],[return X == 0;],
+       [AC_MSG_RESULT([yes]); have_pic_shared=yes],
+       [AC_MSG_RESULT([no]); have_pic_shared=no])
+     if test x"$have_pic_shared" != x"yes" -o x"$ac_cv_search_dlopen" = x"no"; then
+       pluginlibs=
+       enable_plugin=no
+     fi
+     LDFLAGS="$saved_LDFLAGS"
+     CFLAGS="$saved_CFLAGS"
+
+     # If plugin support had been requested but not available, fail.
+     if test x"$enable_plugin" = x"no" ; then
+       if test x"$default_plugin" != x"yes"; then
+AC_MSG_ERROR([
+   Building GCC with plugin support requires a host that supports
+   -fPIC, -shared, -ldl and -rdynamic.])
+   fi
+ fi
+   fi
+])
diff 

Re: [PATCH GCC]Add 'force-dwarf-lexical-blocks' command line option

2014-06-19 Thread Joseph S. Myers
On Sun, 1 Jun 2014, Herman, Andrei wrote:

+  /* The -fforce-dwarf-lexical-blocks option is only relevant when debug
+ info is in DWARF4 format */
+  if (flag_force_dwarf_blocks) {

Watch coding style: the opening '{' always goes on the next line.

+fforce-dwarf-lexical-blocks
+C C++ Var(flag_force_dwarf_blocks)
+Force generation of lexical blocks in dwarf output

I don't see a good reason for this not to be supported for ObjC and ObjC++ 
as well.  Say DWARF, not dwarf.

+@item -fforce-dwarf-lexical-blocks
+Produce debug information (a DW_TAG_lexical_block) for every function
+body, loop body, switch body, case statement, if-then and if-else statement,
+even if the body is a single statement.  Likewise, a lexical block will be
+emitted for the first label of a statement.  This block ends at the end of the
+current lexical scope, or when a break, continue, goto or return statement is
+encountered at the same lexical scope level.  This option is usefull for
+coverage tools that utilize the dwarf debug information.
+This option only applies to C/C++ code and is available when using DWARF
+Version 4 or higher.

Use @code{} markup for keywords (if, else, break, continue, goto, return).  
useful not usefull.  DWARF not dwarf.

+/* Create a block_loc struct for a statement list created on behalf of
+   flag_force_dwarf_blocks.  We use this for label or forced c99 scopes.  */
+
+void
+push_block_info (tree block, location_t loc, bool is_label)
+{
+  if (TREE_CODE(block) != STATEMENT_LIST)

Watch coding style: space before '(' in function and macro calls (and 
similar calls such as sizeof) (many places in this patch, not just this 
one).

+tree
+pop_block_info (location_t &loc)

It's not documented in codingconventions.html, but I think it's preferred 
to avoid returning values through reference arguments (see e.g. 
https://gcc.gnu.org/ml/gcc-patches/2013-11/msg00198.html).

+{
+  block_loc  tl = NULL;

Excess space between block_loc and tl.

@@ -4679,7 +4712,7 @@ c_parser_compound_statement_nostart (c_parser *parser)
expressions being rejected later.  */
 
 static void
-c_parser_label (c_parser *parser)
+c_parser_label (c_parser *parser, bool prev_label)

You're adding a new argument - you need to update the comment above this 
function to explain the semantics of this argument.

In general, make sure that new functions have comments above them that 
explain the semantics of the arguments (by name) and any return value.

+/* If current scope is a label scope, pop it from block info stack
+   and close it's compound statement.  */

its not it's.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 5/5] add libcc1

2014-06-19 Thread Jakub Jelinek
On Thu, Jun 19, 2014 at 02:52:12PM -0600, Tom Tromey wrote:
> Tom> I've edited this one down by removing the auto-generated stuff, and
> Tom> then compressed it.
> 
> Here's a new version of patch #5.
> I've removed the generated code; let's see if it gets through without
> compression.
> 
> I think this addresses all the reviews:
> 
> * It uses gcc-plugin.m4 to disable the plugin
> * It does some configure checks for needed functionality, and disables
>   the plugin if they are not found
> * libcc1 and the plugin now do a protocol version handshake at
>   startup
> * The diagnostic overriding code is now in the plugin, not in gcc proper
> * gdb now tells libcc1 about the target triplet, and libcc1 uses
>   this to invoke the proper GCC.  This is done by (ewww) searching $PATH.

If you plan to implement this for other frontends (cc1plus, f951?) in the
future, would that be still libcc1 and perhaps new plugins in there, or are
we going to have new toplevel directories for each such a plugin?

Jakub


Re: [PATCH 5/5] add libcc1

2014-06-19 Thread Tom Tromey
>>>>> "Jakub" == Jakub Jelinek <ja...@redhat.com> writes:

Jakub> If you plan to implement this for other frontends (cc1plus,
Jakub> f951?) in the future, would that be still libcc1 and perhaps new
Jakub> plugins in there, or are we going to have new toplevel
Jakub> directories for each such a plugin?

We're planning to do this for g++ but, as far as I know, not anything
else.  I was planning to put the C++ plugin into this same directory, as
I expect it to share a reasonable amount of code with the C plugin -- at
least all the RPC stuff.

Tom


[RFC] Add a .gitattributes file for use with git-merge-changelog

2014-06-19 Thread Samuel Bronson
[Am I really supposed to CC this to gcc@ like binutils/MAINTAINERS
says I should?]

Individual users will still have to:

 1. Install git-merge-changelog

 2. Set up the merge driver in their git config

See gnulib's lib/git-merge-changelog.c [1] for details.

For example, I:

 1. Patched Debian's gnulib package to build git-merge-changelog, and
sent the patch to the Debian maintainer, who then proceeded to not
only accept my patch but even write a *manpage* for
git-merge-changelog! (Let's hear it for Ian Beckwith.)

So now, I can install it simply by running apt-get install
git-merge-changelog.  (Except, of course, that I already have it
installed from when I was testing my patch.)

 2. Added this to my ~/.gitconfig:

--8<---------------cut here---------------start------------->8---
[merge "merge-changelog"]
        name = GNU-style ChangeLog merge driver
        driver = git-merge-changelog %O %A %B
--8<---------------cut here---------------end--------------->8---

(You could just put it in the .git/config file for a given
repository, but I can't really see much point in that.)

With this patch applied and the above two tasks done by whatever means
you deem best, you can say goodbye to merge conflicts in ChangeLog
files.

*IF* people will stop renaming the danged things, anyway.

[1]: 
http://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=blob;f=lib/git-merge-changelog.c

[Note: The docs for git-merge-changelog (the comments at the top) say
you need a .gitattributes in every directory.  The docs are wrong.
Ignore the docs.

You really only need one at the top level, since .gitattributes uses
the same pattern matching rules as .gitignore, which match files in
any subdirectory unless you prefix the pattern with a /.]

[Note 2: I already have copyright assignment papers on file for GDB.]
---
 .gitattributes | 3 +++
 1 file changed, 3 insertions(+)
 create mode 100644 .gitattributes

diff --git a/.gitattributes b/.gitattributes
new file mode 100644
index 000..75abe79
--- /dev/null
+++ b/.gitattributes
@@ -0,0 +1,3 @@
+# See gnulib's lib/git-merge-changelog.c (or git-merge-changelog(1))
+# to activate this
+ChangeLog   merge=merge-changelog
-- 
2.0.0



[PATCH] Power/GCC: Remove trailing NOP from byte-swap code

2014-06-19 Thread Maciej W. Rozycki
Hi,

 This change removes an extraneous NOP instruction placed at the end of 
code produced by each of byte-swap patterns due to the expansion of the 
(const_int 0) RTL unnecessarily produced by `define_split' definitions.  
Updated patterns follow what other targets do in corresponding situations.

 The change in output produced can be illustrated with the following 
simple example:

$ cat bswap.c
long long
bswap (long long i)
{
  return __builtin_bswap64 (i);
}
$ powerpc-linux-gnu-gcc -S -dp -o bswap.s bswap.c

This currently produces the following code:

.file   "bswap.c"
.section        ".text"
.align 2
.globl bswap
.type   bswap, @function
bswap:
stwu 1,-32(1)# 20   movsi_update/2  [length = 4]
stw 31,28(1) # 21   *movsi_internal1/4  [length = 4]
mr 31,1  # 22   *movsi_internal1/1  [length = 4]
stw 3,8(31)  # 31   *movsi_internal1/4  [length = 4]
stw 4,12(31) # 32   *movsi_internal1/4  [length = 4]
lwz 9,8(31)  # 33   *movsi_internal1/3  [length = 4]
lwz 10,12(31)# 34   *movsi_internal1/3  [length = 4]
rlwinm 7,10,8,0x # 38   rotlsi3/2   [length = 4]
rlwimi 7,10,24,0,7   # 39   insvsi_internal [length = 4]
rlwimi 7,10,24,16,23 # 40   *insvsi_internal1   [length = 4]
rlwinm 8,9,8,0x  # 41   rotlsi3/2   [length = 4]
rlwimi 8,9,24,0,7# 42   insvsi_internal [length = 4]
rlwimi 8,9,24,16,23  # 43   *insvsi_internal1   [length = 4]
nop  # 37   nop [length = 4]
mr 10,8  # 44   *movsi_internal1/1  [length = 4]
mr 9,7   # 45   *movsi_internal1/1  [length = 4]
mr 3,9   # 46   *movsi_internal1/1  [length = 4]
mr 4,10  # 47   *movsi_internal1/1  [length = 4]
addi 11,31,32# 25   *addsi3_internal1/2 [length = 4]
lwz 31,-4(11)# 26   *movsi_internal1/3  [length = 4]
mr 1,11  # 28   *movsi_internal1/1  [length = 4]
blr  # 29   *return_internal_si [length = 4]
.size   bswap,.-bswap

Notice the NOP in the middle.  With this change applied this code is 
produced instead:

.file   "bswap.c"
.section        ".text"
.align 2
.globl bswap
.type   bswap, @function
bswap:
stwu 1,-32(1)# 20   movsi_update/2  [length = 4]
stw 31,28(1) # 21   *movsi_internal1/4  [length = 4]
mr 31,1  # 22   *movsi_internal1/1  [length = 4]
stw 3,8(31)  # 31   *movsi_internal1/4  [length = 4]
stw 4,12(31) # 32   *movsi_internal1/4  [length = 4]
lwz 9,8(31)  # 33   *movsi_internal1/3  [length = 4]
lwz 10,12(31)# 34   *movsi_internal1/3  [length = 4]
rlwinm 7,10,8,0x # 37   rotlsi3/2   [length = 4]
rlwimi 7,10,24,0,7   # 38   insvsi_internal [length = 4]
rlwimi 7,10,24,16,23 # 39   *insvsi_internal1   [length = 4]
rlwinm 8,9,8,0x  # 40   rotlsi3/2   [length = 4]
rlwimi 8,9,24,0,7# 41   insvsi_internal [length = 4]
rlwimi 8,9,24,16,23  # 42   *insvsi_internal1   [length = 4]
mr 10,8  # 43   *movsi_internal1/1  [length = 4]
mr 9,7   # 44   *movsi_internal1/1  [length = 4]
mr 3,9   # 45   *movsi_internal1/1  [length = 4]
mr 4,10  # 46   *movsi_internal1/1  [length = 4]
addi 11,31,32# 25   *addsi3_internal1/2 [length = 4]
lwz 31,-4(11)# 26   *movsi_internal1/3  [length = 4]
mr 1,11  # 28   *movsi_internal1/1  [length = 4]
blr  # 29   *return_internal_si [length = 4]
.size   bswap,.-bswap
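
[Editorial sketch, not part of the patch.]  For readers less used to
the rlwinm/rlwimi idiom in the listings above, the C fragment below
mirrors what the sequence computes: each 32-bit half is rotated left by
8, then the bytes at PowerPC bit positions 0..7 and 16..23 are replaced
by the corresponding bytes of the value rotated left by 24, and the two
swapped halves trade places.  The function names are hypothetical and
the code illustrates the arithmetic only; it is not what the compiler
emits.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value left by N bits, for 0 < N < 32.  */
static uint32_t
rotl32 (uint32_t x, unsigned n)
{
  return (x << n) | (x >> (32 - n));
}

/* Byte-swap a 32-bit word with one rotate and two masked inserts.  */
static uint32_t
bswap32_rot (uint32_t x)
{
  uint32_t r = rotl32 (x, 8);   /* Like rlwinm rD,rS,8,0xffffffff.  */
  uint32_t t = rotl32 (x, 24);  /* Source of both inserts.  */

  /* Like rlwimi rD,rS,24,0,7: replace the most significant byte.  */
  r = (r & ~UINT32_C (0xff000000)) | (t & UINT32_C (0xff000000));
  /* Like rlwimi rD,rS,24,16,23: replace the third byte from the top.  */
  r = (r & ~UINT32_C (0x0000ff00)) | (t & UINT32_C (0x0000ff00));
  return r;
}

/* Byte-swap a 64-bit value: swap each half, then exchange the halves.  */
static uint64_t
bswap64_rot (uint64_t x)
{
  uint32_t hi = (uint32_t) (x >> 32);
  uint32_t lo = (uint32_t) x;
  return ((uint64_t) bswap32_rot (lo) << 32) | bswap32_rot (hi);
}
```

The exchange of halves corresponds to the "mr 10,8" / "mr 9,7" pair in
the listings; the removed NOP contributes nothing to the result.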

 This has been regression tested with the powerpc-eabi target and the 
following multilibs:

-mcpu=603e
-mcpu=603e -msoft-float
-mcpu=8540 -mfloat-gprs=single -mspe=yes -mabi=spe
-mcpu=8540 -mfloat-gprs=single -mspe=yes -mabi=spe -msoft-float
-mcpu=8548 -mfloat-gprs=double -mspe=yes -mabi=spe
-mcpu=8548 -mfloat-gprs=double -mspe=yes -mabi=spe -mlittle
-mcpu=8548 -mfloat-gprs=double -mspe=yes -mabi=spe -msoft-float
-mcpu=7400 -maltivec -mabi=altivec

as well as the powerpc-linux-gnu target and the following multilibs:

-mcpu=603e
-mcpu=603e -msoft-float
-mcpu=8540 -mfloat-gprs=single -mspe=yes -mabi=spe
-mcpu=8548 -mfloat-gprs=double -mspe=yes -mabi=spe
-mcpu=7400 -maltivec -mabi=altivec
-mcpu=e5500 -m64

 OK to apply?

2014-06-20  Maciej W. Rozycki  ma...@codesourcery.com

gcc/
* config/rs6000/rs6000.md: Append `DONE' to preparation
statements of `bswap' pattern splitters.

  Maciej

gcc-ppc-bswap-done.diff
Index: gcc-fsf-trunk-quilt/gcc/config/rs6000/rs6000.md
===
--- gcc-fsf-trunk-quilt.orig/gcc/config/rs6000/rs6000.md2014-06-10 

Re: [Patch AArch64] Define TARGET_FLAGS_REGNUM

2014-06-19 Thread Richard Henderson
On 02/28/2014 01:32 AM, Ramana Radhakrishnan wrote:
> Hi,
> 
> This defines TARGET_FLAGS_REGNUM for AArch64 to be CC_REGNUM. Noticed this
> turns on the cmpelim pass after reload and in a few examples and a couple of
> benchmarks I noticed a number of comparisons getting deleted. A similar patch
> for AArch32 is being tested.
> 
> Tested cross with aarch64-none-elf on a model with no regressions.
> 
> Ok for stage1 ?
> 
> regards
> Ramana
> 
> DATE  Ramana Radhakrishnan  ramana.radhakrish...@arm.com
> 
>     * config/aarch64/aarch64.c (TARGET_FLAGS_REGNUM): Define.

This appears to cause PR bootstrap/61565.


r~



Re: [PATCH, PR 61211] Fix a bug in clone_of_p verification

2014-06-19 Thread Jan Hubicka
> Ping.
> 
> Thanks,
> 
> Martin
> 
> 
> On Sat, May 31, 2014 at 12:46:03AM +0200, Martin Jambor wrote:
> > Hi,
> > 
> > after a clone is materialized, its clone_of field is cleared which in
> > PR 61211 leads to a failure in the skipped_thunk path in clone_of_p in
> > cgraph.c, which then leads to a false positive verification failure.
> > 
> > Fixed by the following patch.  Bootstrapped and tested on x86_64-linux
> > on both the trunk and the 4.9 branch.  OK for both?
> > 
> > Thanks,
> > 
> > Martin
> > 
> > 
> > 2014-05-30  Martin Jambor  mjam...@suse.cz
> > 
> >     PR ipa/61211
> >     * cgraph.c (clone_of_p): Allow skipped_branch to deal with
> >     expanded clones.

OK, thanks!
Honza
  
> > diff --git a/gcc/cgraph.c b/gcc/cgraph.c
> > index ff65b86..f18f977 100644
> > --- a/gcc/cgraph.c
> > +++ b/gcc/cgraph.c
> > @@ -2566,11 +2566,16 @@ clone_of_p (struct cgraph_node *node, struct cgraph_node *node2)
> >        skipped_thunk = true;
> >      }
> >  
> > -  if (skipped_thunk
> > -      && (!node2->clone_of
> > -          || !node2->clone.args_to_skip
> > -          || !bitmap_bit_p (node2->clone.args_to_skip, 0)))
> > -    return false;
> > +  if (skipped_thunk)
> > +    {
> > +      if (!node2->clone.args_to_skip
> > +          || !bitmap_bit_p (node2->clone.args_to_skip, 0))
> > +        return false;
> > +      if (node2->former_clone_of == node->decl)
> > +        return true;
> > +      else if (!node2->clone_of)
> > +        return false;
> > +    }
> >  
> >    while (node != node2 && node2)
> >      node2 = node2->clone_of;


[Committed] [PATCH] PR61123 : Fix the ABI mis-matching error caused by LTO

2014-06-19 Thread Hale Wang


> -----Original Message-----
> From: Mike Stump [mailto:mikest...@comcast.net]
> Sent: June 19, 2014 1:42
> To: Richard Biener
> Cc: Hale Wang; Mike Stump; GCC Patches
> Subject: Re: [PATCH] PR61123 : Fix the ABI mis-matching error caused by LTO
>
> On Jun 18, 2014, at 3:22 AM, Richard Biener <richard.guent...@gmail.com>
> wrote:
> > Space after the *.
> >
> > I think you don't need to copy the LTO harness but you can simply use
> > dg.exp and sth similar to gcc.dg/20081223-1.c (there is an effective
> > target 'lto' to guard for lto support).
> >
> > So simply place the testcase in gcc.target/arm/ (make sure to put a
> > dg-do compile on the 2nd file and use dg-additional-sources).
> >
> > If that doesn't work I'd say put the testcase in gcc.dg/lto/ instead
> > and do a dg-skip-if for non-arm targets.
> >
> > Ok with one of those changes.
> >
> > Oh, I see you need a new object-readelf ... I defer to a testsuite
> > maintainer for this part.
>
> The testsuite bits are Ok.  My guidance on the test suite would be this,
> all lto test cases in .*lto directories.  20 or fewer test cases for a
> given target, in the main lto directory, more than 50, in the arm/lto
> directory.  When one is tracking down bugs and trying to clean test suite
> results if they break, it is nice to be able to skip in mass all lto bugs
> first, and resolve all non-lto issues and then come back to the lto issues
> last, in hopes that they are all then resolved.  Also, if one is redoing
> lto bits, and a test case with lto in the name pops up as a regression,
> and you’re not an lto person, you can stop thinking about it and just pass
> to the lto person, it is a slightly different mindset.  :-)

Thanks! The patch was committed as r211832 with minimal formatting changes
due to TABs. The final ChangeLog and patch are:

2014-06-20 Hale Wang hale.w...@arm.com

PR lto/61123
* c.opt (fshort-enums): Add to LTO.
* c.opt (fshort-wchar): Likewise.

testsuite/ChangeLog
2014-06-20 Hale Wang hale.w...@arm.com

* gcc.target/arm/lto/: New folder to verify the LTO option.
* gcc.target/arm/lto/pr61123-enum-size_0.c: New test case.
* gcc.target/arm/lto/pr61123-enum-size_1.c: Likewise.
* gcc.target/arm/lto/lto.exp: New exp file used to test LTO option.
* lib/lto.exp (object-readelf): New procedure.


Index: gcc/c-family/c.opt
===
--- gcc/c-family/c.opt  (revision 211394)
+++ gcc/c-family/c.opt  (working copy)
@@ -1189,11 +1189,11 @@
 Use the same size for double as for float
 
 fshort-enums
-C ObjC C++ ObjC++ Optimization Var(flag_short_enums)
+C ObjC C++ ObjC++ LTO Optimization Var(flag_short_enums)
 Use the narrowest integer type possible for enumeration types
 
 fshort-wchar
-C ObjC C++ ObjC++ Optimization Var(flag_short_wchar)
+C ObjC C++ ObjC++ LTO Optimization Var(flag_short_wchar)
 Force the underlying type for \"wchar_t\" to be \"unsigned short\"
 
 fsigned-bitfields
Index: gcc/testsuite/lib/lto.exp
===
--- gcc/testsuite/lib/lto.exp   (revision 211394)
+++ gcc/testsuite/lib/lto.exp   (working copy)
@@ -650,3 +650,82 @@
        fail "scan-symbol $args"
     }
 }
+
+# Call pass if object readelf is ok, otherwise fail.
+# example: /* { dg-final { object-readelf Tag_ABI_enum_size int} } */
+proc object-readelf { args } {
+    global readelf
+    global base_dir
+    upvar 2 execname execname
+
+    if { [llength $args] < 2 } {
+        error "object-readelf: too few arguments"
+        return
+    }
+    if { [llength $args] > 3 } {
+        error "object-readelf: too many arguments"
+        return
+    }
+    if { [llength $args] >= 3 } {
+        switch [dg-process-target [lindex $args 2]] {
+            "S" { }
+            "N" { return }
+            "F" { setup_xfail "*-*-*" }
+            "P" { }
+        }
+    }
+
+    # Find size like we find g++ in g++.exp.
+    if ![info exists readelf]  {
+        set readelf [findfile $base_dir/../../../binutils/readelf \
+                $base_dir/../../../binutils/readelf \
+                [findfile $base_dir/../../readelf $base_dir/../../readelf \
+                [findfile $base_dir/readelf $base_dir/readelf \
+                [transform readelf]]]]
+        verbose -log "readelf is $readelf"
+    }
+
+    set what [lindex $args 0]
+    set with [lindex $args 1]
+
+    if ![file_on_host exists $execname] {
+        verbose -log "$execname does not exist"
+        unresolved "object-readelf $what "
+        return
+    }
+
+    set output [remote_exec host "$readelf -A" "$execname"]
+    set status [lindex $output 0]
+    if { $status != 0 } {
+        verbose -log "object-readelf: $readelf failed"
+        unresolved "object-readelf $what $execname"
+        return
+    }
+
+    set text [lindex $output 1]
+    set lines [split $text "\n"]
+
+    set done 0
+    set i 0
+    while { !$done } {
+        set line_tex [lindex $lines $i]
+        if { [llength ${line_tex}] > 1} {
+            incr i
+   if [regexp -- $what 

Re: [Patch, Fortran, committed] CO_MIN/MAX/SUM fixes

2014-06-19 Thread Tobias Burnus

Tobias Burnus wrote:

> This patches fixes a few bugs related to CO_MIN/MAX/SUM:
> Committed as Rev. 211816.


That patch also required the attached testsuite changes, committed as 
Rev. 211833.


Tobias
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog	(Revision 211832)
+++ gcc/testsuite/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,10 @@
+2014-06-20  Tobias Burnus  bur...@net-b.de
+
+	PR testsuite/61567
+	* gfortran.dg/coarray_collectives_5.f90: Update
+	dg-final scan-tree-dump-times.
+	* gfortran.dg/coarray_collectives_6.f90: Ditto.
+
 2014-06-20 Hale Wang hale.w...@arm.com
 
 	* gcc.target/arm/lto/: New folder to verify the LTO option.
Index: gcc/testsuite/gfortran.dg/coarray_collectives_5.f90
===
--- gcc/testsuite/gfortran.dg/coarray_collectives_5.f90	(Revision 211832)
+++ gcc/testsuite/gfortran.dg/coarray_collectives_5.f90	(Arbeitskopie)
@@ -13,7 +13,7 @@ program test
   call co_sum(val, stat=stat3)
 end program test
 
-! { dg-final { scan-tree-dump-times "_gfortran_caf_co_max \\(&desc.., 0B, 0, &stat1, 0B, 0, 0\\);" 1 "original" } }
-! { dg-final { scan-tree-dump-times "_gfortran_caf_co_min \\(&desc.., 0B, 0, &stat2, 0B, 0, 0\\);" 1 "original" } }
-! { dg-final { scan-tree-dump-times "_gfortran_caf_co_sum \\(&desc.., 0B, 0, &stat3, 0B, 0\\);" 1 "original" } }
+! { dg-final { scan-tree-dump-times "_gfortran_caf_co_max \\(&desc.., 0, &stat1, 0B, 0, 0\\);" 1 "original" } }
+! { dg-final { scan-tree-dump-times "_gfortran_caf_co_min \\(&desc.., 0, &stat2, 0B, 0, 0\\);" 1 "original" } }
+! { dg-final { scan-tree-dump-times "_gfortran_caf_co_sum \\(&desc.., 0, &stat3, 0B, 0\\);" 1 "original" } }
 ! { dg-final { cleanup-tree-dump original } }
Index: gcc/testsuite/gfortran.dg/coarray_collectives_6.f90
===
--- gcc/testsuite/gfortran.dg/coarray_collectives_6.f90	(Revision 211832)
+++ gcc/testsuite/gfortran.dg/coarray_collectives_6.f90	(Arbeitskopie)
@@ -20,7 +20,7 @@ program test
   call co_min(val3, result_image=res,stat=stat3, errmsg=errmesg3)
 end program test
 
-! { dg-final { scan-tree-dump-times "_gfortran_caf_co_max \\(&desc.., 0B, 0, &stat1, errmesg1, 0, 6\\);" 1 "original" } }
-! { dg-final { scan-tree-dump-times "_gfortran_caf_co_sum \\(&val2, 0B, 4, &stat2, errmesg2, 7\\);" 1 "original" } }
-! { dg-final { scan-tree-dump-times "_gfortran_caf_co_min \\(&desc.., 0B, res, &stat3, errmesg3, 99, 8\\);" 1 "original" } }
+! { dg-final { scan-tree-dump-times "_gfortran_caf_co_max \\(&desc.., 0, &stat1, errmesg1, 0, 6\\);" 1 "original" } }
+! { dg-final { scan-tree-dump-times "_gfortran_caf_co_sum \\(&val2, 4, &stat2, errmesg2, 7\\);" 1 "original" } }
+! { dg-final { scan-tree-dump-times "_gfortran_caf_co_min \\(&desc.., res, &stat3, errmesg3, 99, 8\\);" 1 "original" } }
 ! { dg-final { cleanup-tree-dump original } }