date:20140403

Re: [PATCH][1/3] Fix PR54733 Optimize endian independent load/store

2014-04-03 Thread Rainer Orth

Thomas Preud'homme thomas.preudho...@arm.com writes:

 From: Joseph Myers [mailto:jos...@codesourcery.com]
 
  +   if { [is-effective-target bswap]
  + ![istarget x86_64-*-*] } {
 
 That x86_64-*-* test is wrong.  x86_64-*-* and i?86-*-* should always be
 handled the same (if you then want to distinguish 32-bit and 64-bit
 multilibs, you check the appropriate effective-target there, depending on
 whether the condition is one on the ABI or which register size is being
 used, which affects how x32 should be counted).

 Indeed, it's a mistake. i?86 should be in there too. Please find attached an
 updated patch.

 diff --git a/gcc/testsuite/gcc.dg/optimize-bswapdi-1.c 
 b/gcc/testsuite/gcc.dg/optimize-bswapdi-1.c
 index 7d557f3..a9c3443 100644
 --- a/gcc/testsuite/gcc.dg/optimize-bswapdi-1.c
 +++ b/gcc/testsuite/gcc.dg/optimize-bswapdi-1.c
 @@ -1,6 +1,6 @@
 -/* { dg-do compile { target arm*-*-* alpha*-*-* ia64*-*-* x86_64-*-* 
 s390x-*-* powerpc*-*-* rs6000-*-* } } */
 +/* { dg-do compile { target *-*-* } } */

Just omit the { target *-*-* } completely, also a few more times.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: Fix ipa-devirt ICE

2014-04-03 Thread Marc Glisse


On Thu, 3 Apr 2014, Jan Hubicka wrote:


+ /* Use OTR_TOKEN = INT_MAX as a marker of probably type 
inconsistent
+ /* Use OTR_TOKEN = INT_MAX as a marker of probably type 
inconsistent
+   OTR_TOKEN == INT_MAX is used to mark calls that are provably


Did you mean provably instead of probably in the first two?

--
Marc Glisse

[Ping][Patch]Simplify SUBREG with operand whose target bits are cleared by AND operation

2014-04-03 Thread Terry Guo

Hello Eric,

Would you please review my patch at
http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01582.html? Thanks.

BR,
Terry

Re: RFA: PATCH to add -fno-gnu-unique for c++/60731

2014-04-03 Thread Richard Biener

On Wed, Apr 2, 2014 at 9:24 PM, Jason Merrill ja...@redhat.com wrote:
 Use of STB_GNU_UNIQUE to avoid problems with variable symbols shared between
 two RTLD_LOCAL plugins and a common library dependency causes problems with
 libraries that depend on dlclose/dlopen to reinitialize state.  This patch
 adds a -fno-gnu-unique flag that such libraries can use.

 Tested x86_64-pc-linux-gnu.  OK for trunk?

Ok.  Can you add a testcase as well please?

Thanks,
Richard.

Re: [PATCH][1/3] Fix PR54733 Optimize endian independent load/store

2014-04-03 Thread Andreas Schwab

Thomas Preud'homme thomas.preudho...@arm.com writes:

 +# Return 1 if the target supports byte swap instructions.
 +
 +proc check_effective_target_bswap { } {
 +global et_bswap_saved
 +
 +if [info exists et_bswap_saved] {
 +verbose "check_effective_target_bswap: using cached result" 2
 +} else {
 + set et_bswap_saved 0
 + if { [istarget aarch64-*-*]
 +  || [istarget alpha*-*-*]
 +  || [istarget arm*-*-*]
 +  || [istarget i?86-*-*]
 +  || [istarget powerpc*-*-*]
 +  || [istarget rs6000-*-*]
 +  || [istarget s390*-*-*]
 +  || [istarget x86_64-*-*] } {

Please add m68k-*-*.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
And now for something completely different.

[PATCH][LTO] Reduce WPA memory usage

2014-04-03 Thread Richard Biener


This reduces WPA memory usage at stream-out time by not allocating
the streamer cache node array and by freeing the global
out-decl-states hash tables (we do that already for the fn-decl-states).

LTO bootstrapped and bootstrapped on x86_64-unknown-linux-gnu, testing
in progress.

Ok?

Not sure if it will make a notable difference.  The pointer-map
overhead is at least 4 times that of the vector.  If we make
pointer-map behave like hash_table (3/4 full) or htab_t
(half full) then that would improve (we could even make that
configurable at pointer-set/map construction time).  Like with

Index: gcc/pointer-set.c
===
--- gcc/pointer-set.c   (revision 209018)
+++ gcc/pointer-set.c   (working copy)
@@ -125,7 +125,7 @@ pointer_set_insert (struct pointer_set_t
 
   /* For simplicity, expand the set even if P is already there.  This can be
      superfluous but can happen at most once.  */
-  if (pset->n_elements > pset->n_slots / 4)
+  if (pset->n_elements * 4 > pset->n_slots * 3)
     {
       size_t old_n_slots = pset->n_slots;
       const void **old_slots = pset->slots;
Index: gcc/pointer-set.h
===
--- gcc/pointer-set.h   (revision 209018)
+++ gcc/pointer-set.h   (working copy)
@@ -109,7 +109,7 @@ pointer_map<T>::insert (const void *p, b
   /* For simplicity, expand the map even if P is already there.  This can be
      superfluous but can happen at most once.  */
   /* ???  Fugly that we have to inline that here.  */
-  if (n_elements > n_slots / 4)
+  if (n_elements * 4 > n_slots * 3)
     {
       size_t old_n_slots = n_slots;
       const void **old_keys = slots;

might be worth checking how much memory we save from the above.
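
To get a rough feel for the effect of the threshold change, here is a
minimal C sketch.  It is not the pointer-set code: the function names are
made up for illustration, and it assumes a power-of-two table that simply
doubles until the growth check no longer fires.

```c
#include <assert.h>
#include <stddef.h>

/* Current check: grow once the table is more than 1/4 full.  */
static int
grow_at_quarter (size_t n_elements, size_t n_slots)
{
  assert (n_slots > 0);
  return n_elements > n_slots / 4;
}

/* Proposed check: grow once the table is more than 3/4 full.
   Multiplying both sides avoids integer-division rounding.  */
static int
grow_at_three_quarters (size_t n_elements, size_t n_slots)
{
  assert (n_slots > 0);
  return n_elements * 4 > n_slots * 3;
}

/* Smallest power-of-two slot count for which GROW no longer
   requests an expansion for N elements (hypothetical helper).  */
static size_t
slots_needed (size_t n, int (*grow) (size_t, size_t))
{
  size_t n_slots = 1;
  while (grow (n, n_slots))
    n_slots *= 2;
  return n_slots;
}
```

Under these assumptions 700 elements need 4096 slots with the 1/4 rule but
only 1024 with the 3/4 rule; for 1000 elements it is 4096 versus 2048.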

Thanks,
Richard.

2014-04-03  Richard Biener  rguent...@suse.de

* tree-streamer.h (struct streamer_tree_cache_d): Add next_idx
member.
(streamer_tree_cache_create): Adjust.
* tree-streamer.c (streamer_tree_cache_add_to_node_array): Adjust
to allow optional nodes array.
(streamer_tree_cache_insert_1): Use next_idx to assign idx.
(streamer_tree_cache_append): Likewise.
(streamer_tree_cache_create): Create nodes array optionally
as specified by parameter.
* lto-streamer-out.c (create_output_block): Avoid maintaining
the node array in the writer cache.
(DFS_write_tree): Remove assertion.
(produce_asm_for_decls): Free the out decl state hash table
early.
* lto-streamer-in.c (lto_data_in_create): Adjust for
streamer_tree_cache_create prototype change.

Index: gcc/tree-streamer.c
===
*** gcc/tree-streamer.c (revision 209018)
--- gcc/tree-streamer.c (working copy)
*** static void
*** 101,120 
  streamer_tree_cache_add_to_node_array (struct streamer_tree_cache_d *cache,
   unsigned ix, tree t, hashval_t hash)
  {
!   /* Make sure we're either replacing an old element or
!  appending consecutively.  */
!   gcc_assert (ix <= cache->nodes.length ());
! 
!   if (ix == cache->nodes.length ())
  {
!   cache->nodes.safe_push (t);
!   if (cache->hashes.exists ())
!   cache->hashes.safe_push (hash);
  }
!   else
  {
!   cache->nodes[ix] = t;
!   if (cache->hashes.exists ())
cache->hashes[ix] = hash;
  }
  }
--- 101,119 
  streamer_tree_cache_add_to_node_array (struct streamer_tree_cache_d *cache,
   unsigned ix, tree t, hashval_t hash)
  {
!   /* We're either replacing an old element or appending consecutively.  */
!   if (cache->nodes.exists ())
  {
!   if (cache->nodes.length () == ix)
!   cache->nodes.safe_push (t);
!   else
!   cache->nodes[ix] = t;
  }
!   if (cache->hashes.exists ())
  {
!   if (cache->hashes.length () == ix)
!   cache->hashes.safe_push (hash);
!   else
cache->hashes[ix] = hash;
  }
  }
*** streamer_tree_cache_insert_1 (struct str
*** 146,152 
  {
/* Determine the next slot to use in the cache.  */
if (insert_at_next_slot_p)
!   ix = cache->nodes.length ();
else
ix = *ix_p;
 *slot = ix;
--- 145,151 
  {
/* Determine the next slot to use in the cache.  */
if (insert_at_next_slot_p)
!   ix = cache->next_idx++;
else
ix = *ix_p;
 *slot = ix;
*** void
*** 211,217 
  streamer_tree_cache_append (struct streamer_tree_cache_d *cache,
tree t, hashval_t hash)
  {
!   unsigned ix = cache->nodes.length ();
if (!cache->node_map)
  streamer_tree_cache_add_to_node_array (cache, ix, t, hash);
else
--- 210,216 
  streamer_tree_cache_append (struct streamer_tree_cache_d *cache,
tree t, hashval_t hash)
  {
!

[PATCH] Fix PR60740

2014-04-03 Thread Richard Biener


The following fixes the graphite ICE that results from
stmt_simple_for_scop_p not walking all GIMPLE_COND operands
but only SSA name ones.

Bootstrap and regtest in progress on x86_64-unknown-linux-gnu.

Richard.

2014-04-03  Richard Biener  rguent...@suse.de

PR tree-optimization/60740
* graphite-scop-detection.c (stmt_simple_for_scop_p): Iterate
over all GIMPLE_COND operands.

* gcc.dg/graphite/pr60740.c: New testcase.

Index: gcc/graphite-scop-detection.c
===
*** gcc/graphite-scop-detection.c   (revision 209018)
--- gcc/graphite-scop-detection.c   (working copy)
*** stmt_simple_for_scop_p (basic_block scop
*** 346,358 
  
  case GIMPLE_COND:
{
-   tree op;
-   ssa_op_iter op_iter;
- enum tree_code code = gimple_cond_code (stmt);
- 
/* We can handle all binary comparisons.  Inequalities are
   also supported as they can be represented with union of
   polyhedra.  */
  if (!(code == LT_EXPR
  || code == GT_EXPR
  || code == LE_EXPR
--- 346,355 
  
  case GIMPLE_COND:
{
/* We can handle all binary comparisons.  Inequalities are
   also supported as they can be represented with union of
   polyhedra.  */
+ enum tree_code code = gimple_cond_code (stmt);
  if (!(code == LT_EXPR
  || code == GT_EXPR
  || code == LE_EXPR
*** stmt_simple_for_scop_p (basic_block scop
*** 361,371 
  || code == NE_EXPR))
return false;
  
!   FOR_EACH_SSA_TREE_OPERAND (op, stmt, op_iter, SSA_OP_ALL_USES)
! if (!graphite_can_represent_expr (scop_entry, loop, op)
! /* We can not handle REAL_TYPE. Failed for pr39260.  */
! || TREE_CODE (TREE_TYPE (op)) == REAL_TYPE)
!   return false;
  
return true;
}
--- 358,371 
  || code == NE_EXPR))
return false;
  
!   for (unsigned i = 0; i < 2; ++i)
! {
!   tree op = gimple_op (stmt, i);
!   if (!graphite_can_represent_expr (scop_entry, loop, op)
!   /* We can not handle REAL_TYPE. Failed for pr39260.  */
!   || TREE_CODE (TREE_TYPE (op)) == REAL_TYPE)
! return false;
! }
  
return true;
}
Index: gcc/testsuite/gcc.dg/graphite/pr60740.c
===
*** gcc/testsuite/gcc.dg/graphite/pr60740.c (revision 0)
--- gcc/testsuite/gcc.dg/graphite/pr60740.c (working copy)
***
*** 0 
--- 1,16 
+ /* { dg-options "-O2 -floop-interchange" } */
+ 
+ int **db6 = 0;
+ 
+ void
+ k26(void)
+ {
+   static int geb = 0;
+   int *a22 = &geb;
+   int **l30 = &a22;
+   int *c4b;
+   int ndf;
+   for (ndf = 0; ndf <= 1; ++ndf)
+ *c4b = (db6 == l30) && (*a22)--;
+ }
+

[PATCH, ARM] Fix PR60609 (Error: value of 256 too large for field of 1 bytes)

2014-04-03 Thread Charles Baylis

Hi

This bug causes the compiler to create a Thumb-2 TBB instruction with
a jump table containing an out of range value in a .byte field:

whatever.s:148: Error: value of 256 too large for field of 1 bytes at 100

This occurs because the jump table is followed with a .align 1 due
to ASM_OUTPUT_CASE_END, but the 'shorten' phase does not account for
the space taken by this align directive.

This patch addresses the issue by removing ASM_OUTPUT_CASE_END from
arm.h, and ensuring that the alignment after an ADDR_DIFF_VEC is
instead inserted by aligning the label following the barrier which
follows it. This is achieved by defining LABEL_ALIGN_AFTER_BARRIER
appropriately.
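
The arithmetic behind the assembler error can be sketched in C.  Each TBB
table entry is a single unsigned byte holding half the forward byte
distance to the target, so the largest encodable distance is 510 bytes.
The helper name below is hypothetical, and the 510/512 figures are an
illustrative assumption about how two bytes of unaccounted .align padding
could push an entry from 255 to 256.

```c
#include <assert.h>

/* Return nonzero if a forward branch of BYTE_DISTANCE bytes can be
   encoded in a one-byte TBB table entry, which stores the distance
   divided by two.  */
static int
tbb_entry_fits (unsigned byte_distance)
{
  /* Thumb branch targets are halfword aligned.  */
  assert (byte_distance % 2 == 0);
  return byte_distance / 2 <= 255;
}
```

With these assumptions tbb_entry_fits (510) holds, while 512 bytes would
need the entry value 256 that the assembler rejects.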

Bootstrapped/checked on arm-unknown-linux-gnueabihf.

OK for trunk, and backporting to 4.8?



2014-04-02  Charles Baylis  charles.bay...@linaro.org

PR target/60609
* config/arm/arm.h (ASM_OUTPUT_CASE_END): Remove.
(LABEL_ALIGN_AFTER_BARRIER): Align barriers which occur after
ADDR_DIFF_VEC.


2014-04-02  Charles Baylis  charles.bay...@linaro.org

PR target/60609
* g++.dg/torture/pr60609.C: New test.
From 9b0c1ada23e2b210b02ebaee2f599bb5205a91d6 Mon Sep 17 00:00:00 2001
From: Charles Baylis charles.bay...@linaro.org
Date: Thu, 3 Apr 2014 10:57:33 +0100
Subject: [PATCH] fix for PR target/60609

2014-04-02  Charles Baylis  charles.bay...@linaro.org

	PR target/60609
	* config/arm/arm.h (ASM_OUTPUT_CASE_END): Remove.
	(LABEL_ALIGN_AFTER_BARRIER): Align barriers which occur after
	ADDR_DIFF_VEC.

2014-04-02  Charles Baylis  charles.bay...@linaro.org

	PR target/60609
	* g++.dg/torture/pr60609.C: New test.
---
 gcc/config/arm/arm.h   |  11 +-
 gcc/testsuite/g++.dg/torture/pr60609.C | 252 +
 2 files changed, 255 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/torture/pr60609.C

diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 7ca47a7..a4bbd12 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -2194,14 +2194,9 @@ extern int making_const_table;
 #undef ASM_OUTPUT_BEFORE_CASE_LABEL
 #define ASM_OUTPUT_BEFORE_CASE_LABEL(FILE, PREFIX, NUM, TABLE) /* Empty.  */
 
-/* Make sure subsequent insns are aligned after a TBB.  */
-#define ASM_OUTPUT_CASE_END(FILE, NUM, JUMPTABLE)	\
-  do			\
-{			\
-  if (GET_MODE (PATTERN (JUMPTABLE)) == QImode)	\
-	ASM_OUTPUT_ALIGN (FILE, 1);			\
-}			\
-  while (0)
+#define LABEL_ALIGN_AFTER_BARRIER(LABEL)\
+   (GET_CODE (PATTERN (prev_active_insn (LABEL))) == ADDR_DIFF_VEC \
+   ? 1 : 0)
 
 #define ARM_DECLARE_FUNCTION_NAME(STREAM, NAME, DECL) 	\
   do			\
diff --git a/gcc/testsuite/g++.dg/torture/pr60609.C b/gcc/testsuite/g++.dg/torture/pr60609.C
new file mode 100644
index 000..9ddec0b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr60609.C
@@ -0,0 +1,252 @@
+/* { dg-do assemble } */
+
+class exception
+{
+};
+class bad_alloc:exception
+{
+};
+class logic_error:exception
+{
+};
+class domain_error:logic_error
+{
+};
+class invalid_argument:logic_error
+{
+};
+class length_error:logic_error
+{
+};
+class overflow_error:exception
+{
+};
+typedef int mpz_t[];
+template < class > class __gmp_expr;
+template <> class __gmp_expr < mpz_t >
+{
+~__gmp_expr ();
+};
+
+class PIP_Solution_Node;
+class internal_exception
+{
+~internal_exception ();
+};
+class not_an_integer:internal_exception
+{
+};
+class not_a_variable:internal_exception
+{
+};
+class not_an_optimization_mode:internal_exception
+{
+};
+class not_a_bounded_integer_type_width:internal_exception
+{
+};
+class not_a_bounded_integer_type_representation:internal_exception
+{
+};
+class not_a_bounded_integer_type_overflow:internal_exception
+{
+};
+class not_a_complexity_class:internal_exception
+{
+};
+class not_a_control_parameter_name:internal_exception
+{
+};
+class not_a_control_parameter_value:internal_exception
+{
+};
+class not_a_pip_problem_control_parameter_name:internal_exception
+{
+};
+class not_a_pip_problem_control_parameter_value:internal_exception
+{
+};
+class not_a_relation:internal_exception
+{
+};
+class ppl_handle_mismatch:internal_exception
+{
+};
+class timeout_exception
+{
+~timeout_exception ();
+};
+class deterministic_timeout_exception:timeout_exception
+{
+};
+void __assert_fail (const char *, const char *, int, int *)
+__attribute__ ((__noreturn__));
+void PL_get_pointer (void *);
+int Prolog_is_address ();
+inline int
+Prolog_get_address (void **p1)
+{
+Prolog_is_address ()? static_cast <
+void >(0) : __assert_fail ("Prolog_is_address", "./swi_cfli.hh", 0, 0);
+PL_get_pointer (p1);
+return 0;
+}
+
+class non_linear:internal_exception
+{
+};
+class not_unsigned_integer:internal_exception
+{
+};
+class not_universe_or_empty:internal_exception
+{
+};
+class not_a_nil_terminated_list:internal_exception
+{
+};
+class PPL_integer_out_of_range
+{
+__gmp_expr < mpz_t > n;
+};
+void handle_exception ();
+template < typename T > T * term_to_handle

[PATCH][LTO] Fix(?) parallel WPA memory unsharing

2014-04-03 Thread Richard Biener


The following fixes(?) parallel WPA memory unsharing caused by
streamer_write_chain writing to TREE_CHAIN (for no good reason).
The patch removes this historical code.
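
For illustration only, here is a minimal C sketch of walking a chain
without storing to it.  The struct and function are hypothetical
stand-ins, not the tree or streamer API; the point is just that a
read-only walk leaves every node untouched, whereas the removed code
wrote to each node's chain field and restored it afterwards.

```c
#include <stddef.h>

/* Hypothetical stand-in for a chained tree node.  */
struct node
{
  int value;
  struct node *chain;
};

/* Visit every node of the chain starting at T.  The pointer is
   const-qualified, so the compiler rejects any store to the nodes.  */
static int
sum_chain (const struct node *t)
{
  int sum = 0;
  while (t)
    {
      sum += t->value;
      t = t->chain;
    }
  return sum;
}
```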

LTO bootstrap and testing running on x86_64-unknown-linux-gnu.

Richard.

2014-04-03  Richard Biener  rguent...@suse.de

* tree-streamer-out.c (streamer_write_chain): Do not temporarily
set TREE_CHAIN to NULL_TREE.

Index: gcc/tree-streamer-out.c
===
--- gcc/tree-streamer-out.c (revision 209054)
+++ gcc/tree-streamer-out.c (working copy)
@@ -523,13 +523,6 @@ streamer_write_chain (struct output_bloc
 {
   while (t)
 {
-  tree saved_chain;
-
-  /* Clear TREE_CHAIN to avoid blindly recursing into the rest
-of the list.  */
-  saved_chain = TREE_CHAIN (t);
-  TREE_CHAIN (t) = NULL_TREE;
-
   /* We avoid outputting external vars or functions by reference
 to the global decls section as we do not want to have them
 enter decl merging.  This is, of course, only for the call
@@ -541,7 +534,6 @@ streamer_write_chain (struct output_bloc
   else
stream_write_tree (ob, t, ref_p);
 
-  TREE_CHAIN (t) = saved_chain;
   t = TREE_CHAIN (t);
 }

Re: [Patch]Simplify SUBREG with operand whose target bits are cleared by AND operation

2014-04-03 Thread Eric Botcazou

 I find that the GCC function simplify_subreg fails to simplify the rtx (subreg:SI
 (and:DI (reg/v:DI 115 [ a ]) (const_int 4294967295 [0xffffffff])) 4) to zero
 during the fwprop1 pass, even though the high 32-bit part of
 (a & 0xffffffff) is zero. This leads to some unnecessary multiplications
 for the high 32-bit part of the result of the AND operation. The attached patch
 tries to improve simplify_rtx to handle such cases. Other targets such as x86
 do not seem to have this issue because they generate different RTX to handle
 64-bit multiplication on a 32-bit machine.

See http://gcc.gnu.org/ml/gcc-patches/2013-05/msg00073.html for another try,
which led to the simplification in combine.c:combine_simplify_rtx line 5448.

Your variant is both more general, because it isn't restricted to the lowpart, 
and less general, because it is artificially restricted to AND.

Some remarks:
 - this needs to be restricted to non-paradoxical subregs,
 - you need to test HWI_COMPUTABLE_MODE_P (innermode),
 - you need to test !side_effects_p (op).

I think we need to find a common ground between Jakub's patch and yours and 
put a single transformation in simplify_subreg.
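
The quoted case can be modelled in plain C.  This is only a sketch of the
arithmetic, assuming a little-endian word layout in which byte offset 4 of
a DImode value selects its high 32 bits; subreg_si is a hypothetical
helper, not a GCC function.

```c
#include <assert.h>
#include <stdint.h>

/* Model of (subreg:SI (x:DI) OFFSET) under the little-endian
   assumption: offset 0 is the low word, offset 4 the high word.  */
static uint32_t
subreg_si (uint64_t x, unsigned offset)
{
  assert (offset == 0 || offset == 4);
  return (uint32_t) (x >> (offset * 8));
}
```

For any a, subreg_si (a & 0xffffffff, 4) is zero because the mask clears
every bit the subreg selects, while offset 0 still yields the low word of a.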

-- 
Eric Botcazou

Re: [4.8, PATCH 0/26] Backport Power8 and LE support

2014-04-03 Thread David Edelsohn

On Wed, Mar 19, 2014 at 3:23 PM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 Support for Power8 features and the new powerpc64le-linux-gnu target,
 including the ELFv2 ABI, has been developed up till now on the
 ibm/gcc-4_8-branch.  It was appropriate to use this separate branch
 while the support was unstable, but this branch will not represent a
 particularly good support mechanism for distributions going forward.
 Most distros are set up to pull from the major release branches, and
 having a separate branch for one target is quite inconvenient.  Also,
 the ibm/gcc-4_8-branch's original purpose is to serve as the code base
 for IBM's Advance Toolchain 7.0.  Over time the two purposes that the
 branch currently serves will diverge and make things even more
 complicated.

 The code is now tested and stable enough that we are ready to backport
 this support to the FSF 4.8 branch.  This patch series constitutes that
 backport.

 Almost all of the changes are specific to PowerPC portions of the code,
 and for those patches I am only CCing David.  However, some of the
 patches require changes to common code, and for these I will CC Richard
 and Jakub.  Three of these are slightly unrelated but necessary patches,
 one to enable decimal float ABS builtins, and two others to fix PR54537
 and PR56843.  In addition there are patches that update configuration
 files throughout for the new target, and some small changes in common
 call support (call.c, expr.h, function.c) to support how the new ABI
 handles calls.

 I realize it is unusual to backport such a large amount of code, but we
 have been asked by distribution partners to do this, and we feel it
 makes good sense for long-term support.

 I have tested the patch series by applying it to a clean FSF 4.8 branch
 and comparing the test results against those from the IBM 4.8 branch on
 three systems:
  * Power8, little endian (--mcpu=power8)
  * Power8, big endian (--mcpu=power8)
  * Power7, big endian (--mcpu=power7)

 I also checked a recursive diff against the two source directories to
 ensure that no patches were missed.

 Thanks,
 Bill

 [ 1/26] diff-p8
 [ 2/26] diff-p8-htm
 [ 3/26] diff-le-config
 [ 4/26] diff-le-libtool
 [ 5/26] diff-le-tests
 [ 6/26] diff-le-dfp
 [ 7/26] diff-le-vector
 [ 8/26] diff-abi-compat
 [ 9/26] diff-abi-calls
 [10/26] diff-abi-elfv2
 [11/26] diff-abi-gotest
 [12/26] diff-le-align
 [13/26] diff-abi-libffi
 [14/26] diff-dfp-abs
 [15/26] diff-pr54537
 [16/26] diff-pr56843
 [17/26] diff-direct-move
 [18/26] diff-le-config-2
 [19/26] diff-quad-memory
 [20/26] diff-lra
 [21/26] diff-le-vector-api
 [22/26] diff-mcall
 [23/26] diff-pr60137-pr60203
 [24/26] diff-reload
 [25/26] diff-v1ti
 [26/26] diff-trunk-missing

With the positive feedback from Darwin and RTEMS, the additional
backports for AIX and the bug fix for SPE, I am going to approve this
patch series.

There is a remaining issue with e600, but IBM LTC cannot reproduce it.
If IBM can get more information, it can be addressed in a later patch
to trunk and 4.8 branch.

Thanks, David

Re: [4.8, PATCH 2/26] Backport Power8 and LE support: HTM support

2014-04-03 Thread David Edelsohn

On Wed, Mar 19, 2014 at 3:25 PM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 This patch (diff-p8-htm) backports hardware transactional memory
 support.  Copying Jakub and Richard for the libitm support.

 Thanks,
 Bill


 [gcc]

 2014-03-29  Bill Schmidt wschm...@linux.vnet.ibm.com

 Backport from mainline
 2013-12-03  Peter Bergner  berg...@vnet.ibm.com

 * config/rs6000/htmintrin.h (_TEXASR_INSTRUCTION_FETCH_CONFLICT): Fix
 typo in macro name.
 (_TEXASRU_INSTRUCTION_FETCH_CONFLICT): Likewise.

 Backport from mainline r205233.
 2013-11-21  Peter Bergner  berg...@vnet.ibm.com

 * doc/extend.texi: Document htm builtins.

 Backport from mainline
 2013-07-17  Iain Sandoe  i...@codesourcery.com

 * config/rs6000/darwin.h (REGISTER_NAMES): Add HTM registers.

 Backport from mainline
 2013-07-16  Peter Bergner berg...@vnet.ibm.com

 * config/rs6000/rs6000.c (rs6000_option_override_internal): Do not
 enable extra ISA flags with TARGET_HTM.

 2013-07-16  Jakub Jelinek  ja...@redhat.com
 Peter Bergner  berg...@vnet.ibm.com

 * config/rs6000/rs6000.h (FIRST_PSEUDO_REGISTERS): Mention HTM
 registers in the comment.
 (DWARF_FRAME_REGISTERS): Subtract also the 3 HTM registers.
 (DWARF_REG_TO_UNWIND_COLUMN): Use DWARF_FRAME_REGISTERS
 rather than FIRST_PSEUDO_REGISTERS.

 * config.gcc (powerpc*-*-*): Install htmintrin.h and htmxlintrin.h.
 * config/rs6000/t-rs6000 (MD_INCLUDES): Add htm.md.
 * config/rs6000/rs6000.opt: Add -mhtm option.
 * config/rs6000/rs6000-cpus.def (POWERPC_MASKS): Add OPTION_MASK_HTM.
 (ISA_2_7_MASKS_SERVER): Add OPTION_MASK_HTM.
 * config/rs6000/rs6000-c.c (rs6000_target_modify_macros): Define
 __HTM__ if the HTM instructions are available.
 * config/rs6000/predicates.md (u3bit_cint_operand, 
 u10bit_cint_operand,
 htm_spr_reg_operand): New define_predicates.
 * config/rs6000/rs6000.md (define_attr "type"): Add htm.
 (TFHAR_REGNO, TFIAR_REGNO, TEXASR_REGNO): New define_constants.
 Include htm.md.
 * config/rs6000/rs6000-builtin.def (BU_HTM_0, BU_HTM_1, BU_HTM_2,
 BU_HTM_3, BU_HTM_SPR0, BU_HTM_SPR1): Add support macros for defining
 HTM builtin functions.
 * config/rs6000/rs6000.c (RS6000_BUILTIN_H): New macro.
 (rs6000_reg_names, alt_reg_names): Add HTM SPR register names.
 (rs6000_init_hard_regno_mode_ok): Add support for HTM instructions.
 (rs6000_builtin_mask_calculate): Likewise.
 (rs6000_option_override_internal): Likewise.
 (bdesc_htm): Add new HTM builtin support.
 (htm_spr_num): New function.
 (htm_spr_regno): Likewise.
 (rs6000_htm_spr_icode): Likewise.
 (htm_expand_builtin): Likewise.
 (htm_init_builtins): Likewise.
 (rs6000_expand_builtin): Add support for HTM builtin functions.
 (rs6000_init_builtins): Likewise.
 (rs6000_invalid_builtin, rs6000_opt_mask): Add support for -mhtm 
 option.
 * config/rs6000/rs6000.h (ASM_CPU_SPEC): Add support for -mhtm.
 (TARGET_HTM, MASK_HTM): Define macros.
 (FIRST_PSEUDO_REGISTER): Adjust for new HTM SPR registers.
 (FIXED_REGISTERS): Likewise.
 (CALL_USED_REGISTERS): Likewise.
 (CALL_REALLY_USED_REGISTERS): Likewise.
 (REG_ALLOC_ORDER): Likewise.
 (enum reg_class): Likewise.
 (REG_CLASS_NAMES): Likewise.
 (REG_CLASS_CONTENTS): Likewise.
 (REGISTER_NAMES): Likewise.
 (ADDITIONAL_REGISTER_NAMES): Likewise.
 (RS6000_BTC_SPR, RS6000_BTC_VOID, RS6000_BTC_32BIT, RS6000_BTC_64BIT,
 RS6000_BTC_MISC_MASK, RS6000_BTM_HTM): New macros.
 (RS6000_BTM_COMMON): Add RS6000_BTM_HTM.
 * config/rs6000/htm.md: New file.
 * config/rs6000/htmintrin.h: New file.
 * config/rs6000/htmxlintrin.h: New file.

 [libitm]

 2014-03-29  Bill Schmidt wschm...@linux.vnet.ibm.com

 Backport from mainline
 * acinclude.m4 (LIBITM_CHECK_AS_HTM): New.
 * configure: Rebuild.
 * configure.tgt (target_cpu): Add -mhtm to XCFLAGS.
 * config/powerpc/target.h: Include sys/auxv.h and htmintrin.h.
 (USE_HTM_FASTPATH): Define.
 (_TBEGIN_STARTED, _TBEGIN_INDETERMINATE, _TBEGIN_PERSISTENT,
 _HTM_RETRIES): New macros.
 (htm_abort, htm_abort_should_retry, htm_available, htm_begin, 
 htm_init,
 htm_begin_success, htm_commit, htm_transaction_active): New functions.

 [gcc/testsuite]

 2014-03-29  Bill Schmidt wschm...@linux.vnet.ibm.com

 Backport from mainline
 * lib/target-supports.exp (check_effective_target_powerpc_htm_ok): New
 function to test if HTM is available.
 * gcc.target/powerpc/htm-xl-intrin-1.c: New test.
 *

Re: [4.8, PATCH 5/26] Backport Power8 and LE support: Test adjustments

2014-04-03 Thread David Edelsohn

On Wed, Mar 19, 2014 at 11:25 AM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 This patch (diff-le-tests) backports adjustments to a few tests for
 powerpc64le and the ELFv2 ABI.

 Thanks,
 Bill


 2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

 Backport from mainline
 2013-11-27  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * gfortran.dg/nan_7.f90: Disable for little endian PowerPC.

 Backport from mainline r205106:

 2013-11-20  Ulrich Weigand  ulrich.weig...@de.ibm.com

 * gcc.target/powerpc/darwin-longlong.c (msw): Make endian-safe.

 Backport from mainline r205046:

 2013-11-19  Ulrich Weigand  ulrich.weig...@de.ibm.com

 * gcc.target/powerpc/ppc64-abi-2.c (MAKE_SLOT): New macro to
 construct parameter slot value in endian-independent way.
 (fcevv, fciievv, fcvevv): Use it.

Okay.

Thanks, David

Re: [4.8, PATCH 6/26] Backport Power8 and LE support: TDmode for LE

2014-04-03 Thread David Edelsohn

On Wed, Mar 19, 2014 at 3:29 PM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 This patch (diff-le-dfp) backports fixes for TDmode on a little endian
 target.

 Thanks,
 Bill


 2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

 Backport from mainline r205123:

 2013-11-20  Ulrich Weigand  ulrich.weig...@de.ibm.com

 * config/rs6000/rs6000.c (rs6000_cannot_change_mode_class): Do not
 allow subregs of TDmode in FPRs of smaller size in little-endian.
 (rs6000_split_multireg_move): When splitting an access to TDmode
 in FPRs, do not use simplify_gen_subreg.

 Backport from mainline r204927:

 2013-11-17  Ulrich Weigand  ulrich.weig...@de.ibm.com

 * config/rs6000/rs6000.c (rs6000_emit_move): Use low word of
 sdmode_stack_slot also in little-endian mode.

Okay.

Thanks, David

Re: [4.8, PATCH 7/26] Backport Power8 and LE support: Vector LE

2014-04-03 Thread David Edelsohn

On Wed, Mar 19, 2014 at 3:30 PM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 This patch (diff-le-vector) backports the changes to support vector
 infrastructure on powerpc64le.  Copying Richard and Jakub for the libcpp
 bits.

 Thanks,
 Bill


 [gcc]

 2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

 Backport from mainline r205333
 2013-11-24  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * config/rs6000/rs6000.c (rs6000_expand_vec_perm_const_1): Correct
 for little endian.

 Backport from mainline r205241
 2013-11-21  Bill Schmidt  wschm...@vnet.ibm.com

 * config/rs6000/vector.md (vec_pack_trunc_v2df): Revert previous
 little endian change.
 (vec_pack_sfix_trunc_v2df): Likewise.
 (vec_pack_ufix_trunc_v2df): Likewise.
 * config/rs6000/rs6000.c (rs6000_expand_interleave): Correct
 double checking of endianness.

 Backport from mainline r205146
 2013-11-20  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * config/rs6000/vsx.md (vsx_set_<mode>): Adjust for little endian.
 (vsx_extract_<mode>): Likewise.
 (*vsx_extract_<mode>_one_le): New LE variant on
 *vsx_extract_<mode>_zero.
 (vsx_extract_v4sf): Adjust for little endian.

 Backport from mainline r205080
 2013-11-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Adjust
 V16QI vector splat case for little endian.

 Backport from mainline r205045:

 2013-11-19  Ulrich Weigand  ulrich.weig...@de.ibm.com

 * config/rs6000/vector.md (mov<mode>): Do not call
 rs6000_emit_le_vsx_move to move into or out of GPRs.
 * config/rs6000/rs6000.c (rs6000_emit_le_vsx_move): Assert
 source and destination are not GPR hard regs.

 Backport from mainline r204920
 2011-11-17  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * config/rs6000/rs6000.c (rs6000_frame_related): Add split_reg
 parameter and use it in REG_FRAME_RELATED_EXPR note.
 (emit_frame_save): Call rs6000_frame_related with extra NULL_RTX
 parameter.
 (rs6000_emit_prologue): Likewise, but for little endian VSX
 stores, pass the source register of the store instead.

 Backport from mainline r204862
 2013-11-15  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * config/rs6000/altivec.md (UNSPEC_VPERM_X, UNSPEC_VPERM_UNS_X):
 Remove.
 (altivec_vperm_<mode>): Revert earlier little endian change.
 (*altivec_vperm_<mode>_internal): Remove.
 (altivec_vperm_<mode>_uns): Revert earlier little endian change.
 (*altivec_vperm_<mode>_uns_internal): Remove.
 * config/rs6000/vector.md (vec_realign_load_<mode>): Revise
 commentary.

 Backport from mainline r204441
 2013-11-05  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * config/rs6000/rs6000.c (rs6000_option_override_internal):
 Remove restriction against use of VSX instructions when generating
 code for little endian mode.

 Backport from mainline r204440
 2013-11-05  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * config/rs6000/altivec.md (mulv4si3): Ensure we generate vmulouh
 for both big and little endian.
 (mulv8hi3): Swap input operands for merge high and merge low
 instructions for little endian.

 Backport from mainline r204439
 2013-11-05  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * config/rs6000/altivec.md (vec_widen_umult_even_v16qi): Change
 define_insn to define_expand that uses even patterns for big
 endian and odd patterns for little endian.
 (vec_widen_smult_even_v16qi): Likewise.
 (vec_widen_umult_even_v8hi): Likewise.
 (vec_widen_smult_even_v8hi): Likewise.
 (vec_widen_umult_odd_v16qi): Likewise.
 (vec_widen_smult_odd_v16qi): Likewise.
 (vec_widen_umult_odd_v8hi): Likewise.
 (vec_widen_smult_odd_v8hi): Likewise.
 (altivec_vmuleub): New define_insn.
 (altivec_vmuloub): Likewise.
 (altivec_vmulesb): Likewise.
 (altivec_vmulosb): Likewise.
 (altivec_vmuleuh): Likewise.
 (altivec_vmulouh): Likewise.
 (altivec_vmulesh): Likewise.
 (altivec_vmulosh): Likewise.

 Backport from mainline r204395
 2013-11-05  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * config/rs6000/vector.md (vec_pack_sfix_trunc_v2df): Adjust for
 little endian.
 (vec_pack_ufix_trunc_v2df): Likewise.

 Backport from mainline r204363
 2013-11-04  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * config/rs6000/altivec.md (vec_widen_umult_hi_v16qi): Swap
 arguments to merge instruction for little endian.
 (vec_widen_umult_lo_v16qi): Likewise.

Re: [4.8, PATCH 8/26] Backport Power8 and LE support: PR57949

2014-04-03 Thread David Edelsohn

On Wed, Mar 19, 2014 at 3:30 PM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 This patch (diff-abi-compat) backports the ABI compatibility fix for
 PR57949.

 Thanks,
 Bill


 [gcc]

 2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

 Backport from mainline r201750.
 2013-11-15  Ulrich Weigand  ulrich.weig...@de.ibm.com
 Note: Default setting of -mcompat-align-parm inverted!

 2013-08-14  Bill Schmidt  wschm...@linux.vnet.ibm.com

 PR target/57949
 * doc/invoke.texi: Add documentation of mcompat-align-parm
 option.
 * config/rs6000/rs6000.opt: Add mcompat-align-parm option.
 * config/rs6000/rs6000.c (rs6000_function_arg_boundary): For AIX
 and Linux, correct BLKmode alignment when 128-bit alignment is
 required and compatibility flag is not set.
 (rs6000_gimplify_va_arg): For AIX and Linux, honor specified
 alignment for zero-size arguments when compatibility flag is not
 set.

 [gcc/testsuite]

 2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

 Backport from mainline r201750.
 2013-11-15  Ulrich Weigand  ulrich.weig...@de.ibm.com
 Note: Default setting of -mcompat-align-parm inverted!

 2013-08-14  Bill Schmidt  wschm...@linux.vnet.ibm.com

 PR target/57949
 * gcc.target/powerpc/pr57949-1.c: New.
 * gcc.target/powerpc/pr57949-2.c: New.

Okay.

Thanks, David

Re: [4.8, PATCH 10/26] Backport Power8 and LE support: ELFv2 ABI

2014-04-03 Thread David Edelsohn

On Wed, Mar 19, 2014 at 3:31 PM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 This patch (diff-abi-elfv2) backports the fundamental changes for the
 ELFv2 ABI for powerpc64le.  Copying Richard and Jakub for the libgcc,
 libitm, and libstdc++ bits.

 Thanks,
 Bill


 [gcc]

 2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

 Backport from mainline r204842:

 2013-11-15  Ulrich Weigand  ulrich.weig...@de.ibm.com

 * doc/invoke.texi (-mabi=elfv1, -mabi=elfv2): Document.

 Backport from mainline r204809:

 2013-11-14  Ulrich Weigand  ulrich.weig...@de.ibm.com

 * config/rs6000/sysv4le.h (LINUX64_DEFAULT_ABI_ELFv2): Define.

 Backport from mainline r204808:

 2013-11-14  Ulrich Weigand  ulrich.weig...@de.ibm.com
 Alan Modra  amo...@gmail.com

 * config/rs6000/rs6000.h (RS6000_SAVE_AREA): Handle ABI_ELFv2.
 (RS6000_SAVE_TOC): Remove.
 (RS6000_TOC_SAVE_SLOT): New macro.
 * config/rs6000/rs6000.c (rs6000_parm_offset): New function.
 (rs6000_parm_start): Use it.
 (rs6000_function_arg_advance_1): Likewise.
 (rs6000_emit_prologue): Use RS6000_TOC_SAVE_SLOT.
 (rs6000_emit_epilogue): Likewise.
 (rs6000_call_aix): Likewise.
 (rs6000_output_function_prologue): Do not save/restore r11
 around calling _mcount for ABI_ELFv2.

 2013-11-14  Ulrich Weigand  ulrich.weig...@de.ibm.com
 Alan Modra  amo...@gmail.com

 * config/rs6000/rs6000-protos.h (rs6000_reg_parm_stack_space):
 Add prototype.
 * config/rs6000/rs6000.h (RS6000_REG_SAVE): Remove.
 (REG_PARM_STACK_SPACE): Call rs6000_reg_parm_stack_space.
 * config/rs6000/rs6000.c (rs6000_parm_needs_stack): New function.
 (rs6000_function_parms_need_stack): Likewise.
 (rs6000_reg_parm_stack_space): Likewise.
 (rs6000_function_arg): Do not replace BLKmode by Pmode when
 returning a register argument.

 2013-11-14  Ulrich Weigand  ulrich.weig...@de.ibm.com
 Michael Gschwind  m...@us.ibm.com

 * config/rs6000/rs6000.h (FP_ARG_MAX_RETURN): New macro.
 (ALTIVEC_ARG_MAX_RETURN): Likewise.
 (FUNCTION_VALUE_REGNO_P): Use them.
 * config/rs6000/rs6000.c (TARGET_RETURN_IN_MSB): Define.
 (rs6000_return_in_msb): New function.
 (rs6000_return_in_memory): Handle ELFv2 homogeneous aggregates.
 Handle aggregates of up to 16 bytes for ELFv2.
 (rs6000_function_value): Handle ELFv2 homogeneous aggregates.

 2013-11-14  Ulrich Weigand  ulrich.weig...@de.ibm.com
 Michael Gschwind  m...@us.ibm.com

 * config/rs6000/rs6000.h (AGGR_ARG_NUM_REG): Define.
 * config/rs6000/rs6000.c (rs6000_aggregate_candidate): New function.
 (rs6000_discover_homogeneous_aggregate): Likewise.
 (rs6000_function_arg_boundary): Handle homogeneous aggregates.
 (rs6000_function_arg_advance_1): Likewise.
 (rs6000_function_arg): Likewise.
 (rs6000_arg_partial_bytes): Likewise.
 (rs6000_psave_function_arg): Handle BLKmode arguments.

 2013-11-14  Ulrich Weigand  ulrich.weig...@de.ibm.com

 * config/rs6000/rs6000.c (machine_function): New member
 r2_setup_needed.
 (rs6000_emit_prologue): Set r2_setup_needed if necessary.
 (rs6000_output_mi_thunk): Set r2_setup_needed.
 (rs6000_output_function_prologue): Output global entry point
 prologue and local entry point marker if needed for ABI_ELFv2.
 Output -mprofile-kernel code here.
 (output_function_profiler): Do not output -mprofile-kernel
 code here; moved to rs6000_output_function_prologue.
 (rs6000_file_start): Output .abiversion 2 for ABI_ELFv2.

 (rs6000_emit_move): Do not handle dot symbols for ABI_ELFv2.
 (rs6000_output_function_entry): Likewise.
 (rs6000_assemble_integer): Likewise.
 (rs6000_elf_encode_section_info): Likewise.
 (rs6000_elf_declare_function_name): Do not create dot symbols
 or .opd section for ABI_ELFv2.

 (rs6000_trampoline_size): Update for ABI_ELFv2 trampolines.
 (rs6000_trampoline_init): Likewise.
 (rs6000_elf_file_end): Call file_end_indicate_exec_stack

Re: [4.8, PATCH 11/26] Backport Power8 and LE support: gotest

2014-04-03 Thread David Edelsohn

On Wed, Mar 19, 2014 at 3:31 PM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 This patch (diff-abi-gotest) backports enablement of the Go testsuite
 for powerpc64le.

 Thanks,
 Bill


 2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

 Backport from mainline r205000.
 2013-11-19  Ulrich Weigand  ulrich.weig...@de.ibm.com

 gotest: Recognize PPC ELF v2 function pointers in text section.

Okay.

Thanks, David

Re: [4.8, PATCH 12/26] Backport Power8 and LE support: Defaults

2014-04-03 Thread David Edelsohn

On Wed, Mar 19, 2014 at 3:32 PM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 This patch (diff-le-align) sets some miscellaneous defaults for little
 endian support.

 Thanks,
 Bill


 2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

 Apply mainline r205060.
 2013-11-20  Alan Modra  amo...@gmail.com
 * config/rs6000/sysv4.h (CC1_ENDIAN_LITTLE_SPEC): Define as empty.
 * config/rs6000/rs6000.c (rs6000_option_override_internal): Default
 to strict alignment on older processors when little-endian.
 * config/rs6000/linux64.h (PROCESSOR_DEFAULT64): Default to power8
 for ELFv2.

Okay.

Thanks, David

Re: [4.8, PATCH 14/26] Backport Power8 and LE support: DFP absolute value

2014-04-03 Thread David Edelsohn

On Wed, Mar 19, 2014 at 3:32 PM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 This patch (diff-dfp-abs) backports some unrelated but necessary work to
 enable the DFP absolute value builtins.  Copying Jakub who was involved
 with the original patch.

 Thanks,
 Bill


 2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

 Backport from mainline
 2013-08-19  Peter Bergner  berg...@vnet.ibm.com
 Jakub Jelinek  ja...@redhat.com

 * builtins.def (BUILT_IN_FABSD32): New DFP ABS builtin.
 (BUILT_IN_FABSD64): Likewise.
 (BUILT_IN_FABSD128): Likewise.
 * builtins.c (expand_builtin): Add support for
 new DFP ABS builtins.
 (fold_builtin_1): Likewise.
 * config/rs6000/dfp.md
 (*abstd2_fpr): Handle non-overlapping destination
 and source operands.
 (*nabstd2_fpr): Likewise.

 2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

 Backport from mainline
 2013-08-19  Peter Bergner  berg...@vnet.ibm.com

 * gcc.target/powerpc/dfp-dd-2.c: New test.
 * gcc.target/powerpc/dfp-td-2.c: Likewise.
 * gcc.target/powerpc/dfp-td-3.c: Likewise.

Okay.

Thanks, David

Re: [4.8, PATCH 17/26] Backport Power8 and LE support: Direct moves

2014-04-03 Thread David Edelsohn

On Wed, Mar 19, 2014 at 3:33 PM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 This patch (diff-direct-move) backports support for the Power8 direct
 move instructions for little endian.

 Thanks,
 Bill


 2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

 Backport from mainline
 2013-10-23  Pat Haugen  pthau...@us.ibm.com

 * gcc.target/powerpc/direct-move.h: Fix header for executable tests.

 Back port from mainline
 2014-01-16  Michael Meissner  meiss...@linux.vnet.ibm.com

 PR target/59844
 * config/rs6000/rs6000.md (reload_vsx_from_gprsf): Add little
 endian support, remove tests for WORDS_BIG_ENDIAN.
 (p8_mfvsrd_3_mode): Likewise.
 (reload_gpr_from_vsxmode): Likewise.
 (reload_gpr_from_vsxsf): Likewise.
 (p8_mfvsrd_4_disf): Likewise.

Okay.

Thanks, David

Re: [4.8, PATCH 16/26] Backport Power8 and LE support: PR56843

2014-04-03 Thread David Edelsohn

On Wed, Mar 19, 2014 at 3:33 PM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 This patch (diff-pr56843) backports the fix for PR56843.

 Thanks,
 Bill


 [gcc]

 2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

 Backport from mainline
 2013-04-05  Bill Schmidt  wschm...@linux.vnet.ibm.com

 PR target/56843
 * config/rs6000/rs6000.c (rs6000_emit_swdiv_high_precision): Remove.
 (rs6000_emit_swdiv_low_precision): Remove.
 (rs6000_emit_swdiv): Rewrite to handle between one and four
 iterations of Newton-Raphson generally; modify required number of
 iterations for some cases.
 * config/rs6000/rs6000.h (RS6000_RECIP_HIGH_PRECISION_P): Remove.

 [gcc/testsuite]

 2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

 Backport from mainline
 2013-04-05  Bill Schmidt  wschm...@linux.vnet.ibm.com

 PR target/56843
 * gcc.target/powerpc/recip-1.c: Modify expected output.
 * gcc.target/powerpc/recip-3.c: Likewise.
 * gcc.target/powerpc/recip-4.c: Likewise.
 * gcc.target/powerpc/recip-5.c: Add expected output for iterations.

Okay.

Thanks, David
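
For readers unfamiliar with the technique named in the PR56843 ChangeLog, here is a minimal sketch of software division via Newton-Raphson refinement of a reciprocal estimate. It is the textbook iteration, not the code in rs6000_emit_swdiv; the function names, the caller-supplied starting estimate x0, and the passes parameter are assumptions for illustration.

```c
/* Refine an estimate x0 of 1/d with `passes` Newton-Raphson steps:
     x_{k+1} = x_k * (2 - d * x_k)
   Each step roughly squares the relative error, so a better initial
   estimate needs fewer passes.  (Hypothetical helper.)  */
double
refine_reciprocal (double d, double x0, int passes)
{
  double x = x0;
  for (int i = 0; i < passes; i++)
    x = x * (2.0 - d * x);
  return x;
}

/* n / d computed as n * (1/d), with 1/d refined from estimate x0.  */
double
software_divide (double n, double d, double x0, int passes)
{
  return n * refine_reciprocal (d, x0, passes);
}
```

Starting from a 10% error, the relative error goes to about 1e-2, 1e-4, 1e-8 and 1e-16 over four passes, which is consistent with the ChangeLog's range of one to four iterations depending on how precise the hardware estimate is.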

Re: [4.8, PATCH 18/26] Backport Power8 and LE support: Configure bits 2

2014-04-03 Thread David Edelsohn

On Wed, Mar 19, 2014 at 3:33 PM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 This patch (diff-le-config-2) backports more configure changes,
 particularly for multilib/multiarch targeting powerpc64le.

 Thanks,
 Bill


 2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

 Apply mainline r202190, powerpc64le multilibs and multiarch dir
 2013-09-03  Alan Modra  amo...@gmail.com

 * config.gcc (powerpc*-*-linux*): Add support for little-endian
 multilibs to big-endian target and vice versa.
 * config/rs6000/t-linux64: Use := assignment on all vars.
 (MULTILIB_EXTRA_OPTS): Remove fPIC.
 (MULTILIB_OSDIRNAMES): Specify using mapping from multilib_options.
 * config/rs6000/t-linux64le: New file.
 * config/rs6000/t-linux64bele: New file.
 * config/rs6000/t-linux64lebe: New file.

Okay.

Thanks, David

Re: [4.8, PATCH 19/26] Backport Power8 and LE support: Quad memory atomic

2014-04-03 Thread David Edelsohn

On Wed, Mar 19, 2014 at 3:33 PM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 This patch (diff-quad-memory) backports support for quad-memory atomic
 operations.

 Thanks,
 Bill


 [gcc/testsuite]

 2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

 Back port from mainline
 2014-01-23  Michael Meissner  meiss...@linux.vnet.ibm.com

 PR target/59909
 * gcc.target/powerpc/quad-atomic.c: New file to test power8 quad
 word atomic functions at runtime.

 [gcc]

 2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

 Back port from mainline
 2014-01-23  Michael Meissner  meiss...@linux.vnet.ibm.com

 PR target/59909
 * doc/invoke.texi (RS/6000 and PowerPC Options): Document
 -mquad-memory-atomic.  Update -mquad-memory documentation to say
 it is only used for non-atomic loads/stores.

 * config/rs6000/predicates.md (quad_int_reg_operand): Allow either
 -mquad-memory or -mquad-memory-atomic switches.

 * config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Add
 -mquad-memory-atomic to ISA 2.07 support.

 * config/rs6000/rs6000.opt (-mquad-memory-atomic): Add new switch
 to separate support of normal quad word memory operations (ldq,
 stq) from the atomic quad word memory operations.

 * config/rs6000/rs6000.c (rs6000_option_override_internal): Add
 support to separate non-atomic quad word operations from atomic
 quad word operations.  Disable non-atomic quad word operations in
 little endian mode so that we don't have to swap words after the
 load and before the store.
 (quad_load_store_p): Add comment about atomic quad word support.
 (rs6000_opt_masks): Add -mquad-memory-atomic to the list of
 options printed with -mdebug=reg.

 * config/rs6000/rs6000.h (TARGET_SYNC_TI): Use
 -mquad-memory-atomic as the test for whether we have quad word
 atomic instructions.
 (TARGET_SYNC_HI_QI): If either -mquad-memory-atomic,
 -mquad-memory, or -mp8-vector are used, allow byte/half-word
 atomic operations.

 * config/rs6000/sync.md (load_lockedti): Ensure that the address
 is a proper indexed or indirect address for the lqarx instruction.
 On little endian systems, swap the hi/lo registers after the lqarx
 instruction.
 (load_lockedpti): Use indexed_or_indirect_operand predicate to
 ensure the address is valid for the lqarx instruction.
 (store_conditionalti): Ensure that the address is a proper indexed
 or indirect address for the stqcx. instruction.  On little endian
 systems, swap the hi/lo registers before doing the stqcx.
 instruction.
 (store_conditionalpti): Use indexed_or_indirect_operand predicate to
 ensure the address is valid for the stqcx. instruction.

 * gcc/config/rs6000/rs6000-c.c (rs6000_target_modify_macros):
 Define __QUAD_MEMORY__ and __QUAD_MEMORY_ATOMIC__ based on what
 type of quad memory support is available.

Okay.

Thanks, David
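
The sync.md entries above mention swapping the hi/lo registers around the quad-word atomic instructions on little endian. A minimal sketch of that swap follows. It assumes, as the ChangeLog implies, that the instruction's register pair holds the two doublewords in the opposite order from the one the compiler uses on little endian. The quad_pair type and swap_halves function are hypothetical names, not anything in GCC.

```c
#include <stdint.h>

/* A 128-bit value held as two 64-bit registers.  `first` is the
   lower-numbered register of the pair, `second` the higher one.  */
typedef struct
{
  uint64_t first;
  uint64_t second;
} quad_pair;

/* Exchange the two doublewords.  In this model the swap is applied
   after a quad-word load and before a quad-word store, so the
   instruction and the surrounding code each see the order they
   expect.  */
quad_pair
swap_halves (quad_pair v)
{
  quad_pair r = { v.second, v.first };
  return r;
}
```

Applying the swap twice returns the original pair, which is why doing it once after the load and once before the store keeps a load/modify/store sequence consistent.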

Re: [4.8, PATCH 22/26] Backport Power8 and LE support: -mcall-* endianness

2014-04-03 Thread David Edelsohn

On Wed, Mar 19, 2014 at 3:34 PM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 This patch (diff-mcall) fixes big-endian assumptions for -mcall-aixdesc
 and various others.

 Thanks,
 Bill


 2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

 Backport from mainline r207658
 2014-02-06  Ulrich Weigand  ulrich.weig...@de.ibm.com

 * config/rs6000/sysv4.h (ENDIAN_SELECT): Do not attempt to enforce
 big-endian mode for -mcall-aixdesc, -mcall-freebsd, -mcall-netbsd,
 -mcall-openbsd, or -mcall-linux.
 (CC1_ENDIAN_BIG_SPEC): Remove.
 (CC1_ENDIAN_LITTLE_SPEC): Remove.
 (CC1_ENDIAN_DEFAULT_SPEC): Remove.
 (CC1_SPEC): Remove (always empty) %cc1_endian_... spec.
 (SUBTARGET_EXTRA_SPECS): Remove %cc1_endian_big, %cc1_endian_little,
 and %cc1_endian_default.
 * config/rs6000/sysv4le.h (CC1_ENDIAN_DEFAULT_SPEC): Remove.

Okay.

Thanks, David

Re: [4.8, PATCH 21/26] Backport Power8 and LE support: Vector APIs

2014-04-03 Thread David Edelsohn

On Wed, Mar 19, 2014 at 3:34 PM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 This patch (diff-le-vector-api) backports enablement of LE support for
 the Altivec APIs, including support for -maltivec=be.

 Thanks,
 Bill


 [gcc]

 2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

 Backport from mainline r206443
 2014-01-08  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Remove
 two duplicate entries.

 Backport from mainline r206494
 2014-01-09  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * doc/invoke.texi: Add -maltivec={be,le} options, and document
 default element-order behavior for -maltivec.
 * config/rs6000/rs6000.opt: Add -maltivec={be,le} options.
 * config/rs6000/rs6000.c (rs6000_option_override_internal): Ensure
 that -maltivec={le,be} implies -maltivec; disallow -maltivec=le
 when targeting big endian, at least for now.
 * config/rs6000/rs6000.h: Add #define of VECTOR_ELT_ORDER_BIG.

 Backport from mainline r206541
 2014-01-10  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * config/rs6000/rs6000-builtin.def: Fix pasto for VPKSDUS.

 Backport from mainline r206590
 2014-01-13  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
 Implement -maltivec=be for vec_insert and vec_extract.

 Backport from mainline r206641
 2014-01-15  Bill Schmidt  wschm...@vnet.linux.ibm.com

 * config/rs6000/altivec.md (mulv8hi3): Explicitly generate vmulesh
 and vmulosh rather than call gen_vec_widen_smult_*.
 (vec_widen_umult_even_v16qi): Test VECTOR_ELT_ORDER_BIG rather
 than BYTES_BIG_ENDIAN to determine use of even or odd instruction.
 (vec_widen_smult_even_v16qi): Likewise.
 (vec_widen_umult_even_v8hi): Likewise.
 (vec_widen_smult_even_v8hi): Likewise.
 (vec_widen_umult_odd_v16qi): Likewise.
 (vec_widen_smult_odd_v16qi): Likewise.
 (vec_widen_umult_odd_v8hi): Likewise.
 (vec_widen_smult_odd_v8hi): Likewise.
 (vec_widen_umult_hi_v16qi): Explicitly generate vmuleub and
 vmuloub rather than call gen_vec_widen_umult_*.
 (vec_widen_umult_lo_v16qi): Likewise.
 (vec_widen_smult_hi_v16qi): Explicitly generate vmulesb and
 vmulosb rather than call gen_vec_widen_smult_*.
 (vec_widen_smult_lo_v16qi): Likewise.
 (vec_widen_umult_hi_v8hi): Explicitly generate vmuleuh and vmulouh
 rather than call gen_vec_widen_umult_*.
 (vec_widen_umult_lo_v8hi): Likewise.
 (vec_widen_smult_hi_v8hi): Explicitly generate vmulesh and vmulosh
 rather than call gen_vec_widen_smult_*.
 (vec_widen_smult_lo_v8hi): Likewise.

 Backport from mainline r207062
 2014-01-24  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * config/rs6000/rs6000.c (rs6000_expand_vec_perm_const_1): Remove
 correction for little endian...
 * config/rs6000/vsx.md (vsx_xxpermdi2_mode_1): ...and move it to
 here.

 Backport from mainline r207262
 2014-01-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * config/rs6000/rs6000.c (altivec_expand_vec_perm_const):  Use
 CODE_FOR_altivec_vmrg*_direct rather than CODE_FOR_altivec_vmrg*.
 * config/rs6000/vsx.md (vsx_mergel_mode): Adjust for
 -maltivec=be with LE targets.
 (vsx_mergeh_mode): Likewise.
 * config/rs6000/altivec.md (UNSPEC_VMRG[HL]_DIRECT): New
 unspecs.
 (mulv8hi3): Use gen_altivec_vmrg[hl]w_direct.
 (altivec_vmrghb): Replace with define_expand and new
 *altivec_vmrghb_internal insn; adjust for -maltivec=be with LE
 targets.
 (altivec_vmrghb_direct): New define_insn.
 (altivec_vmrghh): Replace with define_expand and new
 *altivec_vmrghh_internal insn; adjust for -maltivec=be with LE
 targets.
 (altivec_vmrghh_direct): New define_insn.
 (altivec_vmrghw): Replace with define_expand and new
 *altivec_vmrghw_internal insn; adjust for -maltivec=be with LE
 targets.
 (altivec_vmrghw_direct): New define_insn.
 (*altivec_vmrghsf): Adjust for endianness.
 (altivec_vmrglb): Replace with define_expand and new
 *altivec_vmrglb_internal insn; adjust for -maltivec=be with LE
 targets.
 (altivec_vmrglb_direct): New define_insn.
 (altivec_vmrglh): Replace with define_expand and new
 *altivec_vmrglh_internal insn; adjust for -maltivec=be with LE
 targets.
 (altivec_vmrglh_direct): New define_insn.
 (altivec_vmrglw): Replace with define_expand and new
 *altivec_vmrglw_internal insn; adjust for -maltivec=be with LE
 targets.

Re: [4.8, PATCH 23/26] Backport Power8 and LE support: PR60137, PR60203

2014-04-03 Thread David Edelsohn

On Wed, Mar 19, 2014 at 3:34 PM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 This patch (diff-pr60137-pr60203) backports fixes for two little-endian
 vector mode problems.

 Thanks,
 Bill


 [gcc]

 2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

 Backport from mainline r207699.
 2014-02-11  Michael Meissner  meiss...@linux.vnet.ibm.com

 PR target/60137
 * config/rs6000/rs6000.md (128-bit GPR splitter): Add a splitter
 for VSX/Altivec vectors that land in GPR registers.

 Backport from mainline r207808.
 2014-02-15  Michael Meissner  meiss...@linux.vnet.ibm.com

 PR target/60203
 * config/rs6000/rs6000.md (rreg): Add TFmode, TDmode constraints.
 (movmode_internal, TFmode/TDmode): Split TFmode/TDmode moves
 into 64-bit and 32-bit moves.  On 64-bit moves, add support for
 using direct move instructions on ISA 2.07.  Also adjust
 instruction length for 64-bit.
 (movmode_64bit, TFmode/TDmode): Likewise.
 (movmode_32bit, TFmode/TDmode): Likewise.

 Backport from mainline r207868.
 2014-02-18  Michael Meissner  meiss...@linux.vnet.ibm.com

 PR target/60203
 * config/rs6000/rs6000.md (movmode_64bit, TF/TDmode moves):
 Split 64-bit moves into 2 patterns.  Do not allow the use of
 direct move for TDmode in little endian, since the decimal value
 has little endian bytes within a word, but the 64-bit pieces are
 ordered in a big endian fashion, and normal subreg's of TDmode are
 not allowed.
 (movmode_64bit_dm): Likewise.
 (movtd_64bit_nodm): Likewise.

 [gcc/testsuite]

 2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

 Backport from mainline r207699.
 2014-02-11  Michael Meissner  meiss...@linux.vnet.ibm.com

 PR target/60137
 * gcc.target/powerpc/pr60137.c: New file.

 Backport from mainline r207808.
 2014-02-15  Michael Meissner  meiss...@linux.vnet.ibm.com

 PR target/60203
 * gcc.target/powerpc/pr60203.c: New testsuite.

Okay.

Thanks, David

Re: [4.8, PATCH 24/26] Backport Power8 and LE support: Reload issues

2014-04-03 Thread David Edelsohn

On Wed, Mar 19, 2014 at 3:34 PM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 This patch (diff-reload) backports fixes for a couple of problems in
 PowerPC reload handling.

 Thanks,
 Bill


 2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

 Apply mainline r207798
 2014-02-26  Alan Modra  amo...@gmail.com
 PR target/58675
 PR target/57935
 * config/rs6000/rs6000.c (rs6000_secondary_reload_inner): Use
 find_replacement on parts of insn rtl that might be reloaded.

 Backport from mainline r208287
 2014-03-03  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * config/rs6000/rs6000.c (rs6000_preferred_reload_class): Disallow
 reload of PLUS rtx's outside of GENERAL_REGS or BASE_REGS; relax
 constraint on constants to permit them being loaded into
 GENERAL_REGS or BASE_REGS.

Okay.

Thanks, David

Re: [4.8, PATCH 25/26] Backport Power8 and LE support: V1TI support

2014-04-03 Thread David Edelsohn

On Wed, Mar 19, 2014 at 3:34 PM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 This patch (diff-v1ti) backports the V1TI support.

 Thanks,
 Bill


 [gcc]

 2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

 Back port from trunk
 2014-03-12  Michael Meissner  meiss...@linux.vnet.ibm.com

 * config/rs6000/vector.md (VEC_L): Add V1TI mode to vector types.
 (VEC_M): Likewise.
 (VEC_N): Likewise.
 (VEC_R): Likewise.
 (VEC_base): Likewise.
 (movMODE, VEC_M modes): If we are loading TImode into VSX
 registers, we need to swap double words in little endian mode.

 * config/rs6000/rs6000-modes.def (V1TImode): Add new vector mode
 to be a container mode for 128-bit integer operations added in ISA
 2.07.  Unlike TImode and PTImode, the preferred register set is
 the Altivec/VMX registers for the 128-bit operations.

 * config/rs6000/rs6000-protos.h (rs6000_move_128bit_ok_p): Add
 declarations.
 (rs6000_split_128bit_ok_p): Likewise.

 * config/rs6000/rs6000-builtin.def (BU_P8V_AV_3): Add new support
 macros for creating ISA 2.07 normal and overloaded builtin
 functions with 3 arguments.
 (BU_P8V_OVERLOAD_3): Likewise.
 (VPERM_1T): Add support for V1TImode in 128-bit vector operations
 for use as overloaded functions.
 (VPERM_1TI_UNS): Likewise.
 (VSEL_1TI): Likewise.
 (VSEL_1TI_UNS): Likewise.
 (ST_INTERNAL_1ti): Likewise.
 (LD_INTERNAL_1ti): Likewise.
 (XXSEL_1TI): Likewise.
 (XXSEL_1TI_UNS): Likewise.
 (VPERM_1TI): Likewise.
 (VPERM_1TI_UNS): Likewise.
 (XXPERMDI_1TI): Likewise.
 (SET_1TI): Likewise.
 (LXVD2X_V1TI): Likewise.
 (STXVD2X_V1TI): Likewise.
 (VEC_INIT_V1TI): Likewise.
 (VEC_SET_V1TI): Likewise.
 (VEC_EXT_V1TI): Likewise.
 (EQV_V1TI): Likewise.
 (NAND_V1TI): Likewise.
 (ORC_V1TI): Likewise.
 (VADDCUQ): Add support for 128-bit integer arithmetic instructions
 added in ISA 2.07.  Add both normal 'altivec' builtins, and the
 overloaded builtin.
 (VADDUQM): Likewise.
 (VSUBCUQ): Likewise.
 (VADDEUQM): Likewise.
 (VADDECUQ): Likewise.
 (VSUBEUQM): Likewise.
 (VSUBECUQ): Likewise.

 * config/rs6000/rs6000-c.c (__int128_type): New static to hold
 __int128_t and __uint128_t types.
 (__uint128_type): Likewise.
 (altivec_categorize_keyword): Add support for vector __int128_t,
 vector __uint128_t, vector __int128, and vector unsigned __int128
 as a container type for TImode operations that need to be done in
 VSX/Altivec registers.
 (rs6000_macro_to_expand): Likewise.
 (altivec_overloaded_builtins): Add ISA 2.07 overloaded functions
 to support 128-bit integer instructions vaddcuq, vadduqm,
 vaddecuq, vaddeuqm, vsubcuq, vsubuqm, vsubecuq, vsubeuqm.
 (altivec_resolve_overloaded_builtin): Add support for V1TImode.

 * config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): Add support
 for V1TImode, and set up preferences to use VSX/Altivec
 registers.  Setup VSX reload handlers.
 (rs6000_debug_reg_global): Likewise.
 (rs6000_init_hard_regno_mode_ok): Likewise.
 (rs6000_preferred_simd_mode): Likewise.
 (vspltis_constant): Do not allow V1TImode as easy altivec
 constants.
 (easy_altivec_constant): Likewise.
 (output_vec_const_move): Likewise.
 (rs6000_expand_vector_set): Convert V1TImode set and extract to
 simple move.
 (rs6000_expand_vector_extract): Likewise.
 (reg_offset_addressing_ok_p): Setup V1TImode to use VSX reg+reg
 addressing.
 (rs6000_const_vec): Add support for V1TImode.
 (rs6000_emit_le_vsx_load): Swap double words when loading or
 storing TImode/V1TImode.
 (rs6000_emit_le_vsx_store): Likewise.
 (rs6000_emit_le_vsx_move): Likewise.
 (rs6000_emit_move): Add support for V1TImode.
 (altivec_expand_ld_builtin): Likewise.
 (altivec_expand_st_builtin): Likewise.
 (altivec_expand_vec_init_builtin): Likewise.
 (altivec_expand_builtin): Likewise.
 (rs6000_init_builtins): Add support for V1TImode type.  Add
 support for ISA 2.07 128-bit integer builtins.  Define type names
 for the VSX/Altivec vector types.
 (altivec_init_builtins): Add support for overloaded vector
 functions with V1TImode type.
 (rs6000_preferred_reload_class): Prefer Altivec registers for
 V1TImode.
 (rs6000_move_128bit_ok_p): Move 128-bit move/split validation to
 external function.
 (rs6000_split_128bit_ok_p): Likewise.
 (rs6000_handle_altivec_attribute): Create

Re: [4.8, PATCH 26/26] Backport Power8 and LE support: Missing support

2014-04-03 Thread David Edelsohn

On Wed, Mar 19, 2014 at 3:35 PM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 This patch (diff-trunk-missing) backports some LE pieces that were found
 not to have been backported from trunk to the IBM 4.8 branch until
 relatively recently.

 Thanks,
 Bill


 2014-03-19  Bill Schmidt  wschm...@linux.vnet.ibm.com

 Back port from trunk
 2013-04-25  Alan Modra  amo...@gmail.com

 PR target/57052
 * config/rs6000/rs6000.md (rotlsi3_internal7): Rename to
 rotlsi3_internal7le and condition on !BYTES_BIG_ENDIAN.
 (rotlsi3_internal8be): New BYTES_BIG_ENDIAN insn.
 Repeat for many other rotate/shift and mask patterns using subregs.
 Name lshiftrt insns.
 (ashrdisi3_noppc64): Rename to ashrdisi3_noppc64be and condition
 on WORDS_BIG_ENDIAN.

 2013-06-07  Alan Modra  amo...@gmail.com

 * config/rs6000/rs6000.c (rs6000_option_override_internal): Don't
 override user -mfp-in-toc.
 (offsettable_ok_by_alignment): Consider just the current access
 rather than the whole object, unless BLKmode.  Handle
 CONSTANT_POOL_ADDRESS_P constants that lack a decl too.
 (use_toc_relative_ref): Allow CONSTANT_POOL_ADDRESS_P constants
 for -mcmodel=medium.
 * config/rs6000/linux64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Don't
 override user -mfp-in-toc or -msum-in-toc.  Default to
 -mno-fp-in-toc for -mcmodel=medium.

 2013-06-18  Alan Modra  amo...@gmail.com

 * config/rs6000/rs6000.h (enum data_align): New.
 (LOCAL_ALIGNMENT, DATA_ALIGNMENT): Use rs6000_data_alignment.
 (DATA_ABI_ALIGNMENT): Define.
 (CONSTANT_ALIGNMENT): Correct comment.
 * config/rs6000/rs6000-protos.h (rs6000_data_alignment): Declare.
 * config/rs6000/rs6000.c (rs6000_data_alignment): New function.

 2013-07-11  Ulrich Weigand  ulrich.weig...@de.ibm.com

 * config/rs6000/rs6000.md (*tls_gd_lowTLSmode:tls_abi_suffix):
 Require GOT register as additional operand in UNSPEC.
 (*tls_ld_lowTLSmode:tls_abi_suffix): Likewise.
 (*tls_got_dtprel_lowTLSmode:tls_abi_suffix): Likewise.
 (*tls_got_tprel_lowTLSmode:tls_abi_suffix): Likewise.
 (*tls_gdTLSmode:tls_abi_suffix): Update splitter.
 (*tls_ldTLSmode:tls_abi_suffix): Likewise.
 (tls_got_dtprel_TLSmode:tls_abi_suffix): Likewise.
 (tls_got_tprel_TLSmode:tls_abi_suffix): Likewise.

 2014-01-23  Pat Haugen  pthau...@us.ibm.com

 * config/rs6000/rs6000.c (rs6000_option_override_internal): Don't
 force flag_ira_loop_pressure if set via command line.

 2014-02-06  Alan Modra  amo...@gmail.com

 PR target/60032
 * config/rs6000/rs6000.c (rs6000_secondary_memory_needed_mode): Only
 change SDmode to DDmode when lra_in_progress.

Okay.

Thanks, David

[PATCH] Fix PR c++/21113

2014-04-03 Thread Patrick Palka

Hi,

This patch fixes c++/21113 which reports that the C++ frontend does not
forbid jumps into the scope of identifiers with variably-modified types.

The patch simply augments decl_jump_unsafe() to disallow jumping into
blocks that initialize variably-modified decls.
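
As an illustration of the rule being enforced (written in C, where the same constraint already applies, and not taken from the patch), here is a sketch of a jump that stays legal: the variable-length array is confined to its own block, so the goto passes over the block instead of entering the array's scope. The function name sum_with_vla is made up for this example.

```c
/* Sum the integers 1..n, using a variable-length array as scratch
   space.  (Hypothetical example.)  */
int
sum_with_vla (int n)
{
  int total = 0;

  if (n <= 0)
    goto done;          /* Fine: `done' is outside the scope of `scratch'.  */

  {
    int scratch[n];     /* Variably-modified type, confined to this block.  */
    for (int i = 0; i < n; i++)
      scratch[i] = i + 1;
    for (int i = 0; i < n; i++)
      total += scratch[i];
  }

done:
  return total;
}
```

Had scratch been declared at function scope between the goto and the label, the jump would enter the scope of a variably-modified declaration, which is the case the new vla14.C test expects to be diagnosed.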

I bootstrapped and regtested this change on x86_64-unknown-linux-gnu.

2014-04-03  Patrick Palka  patr...@parcs.ath.cx

PR c++/21113
* decl.c (decl_jump_unsafe): Consider variably-modified decls.
---
 gcc/cp/decl.c|  5 ++---
 gcc/testsuite/g++.dg/ext/vla14.C | 23 +++
 2 files changed, 25 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/vla14.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 5bd33c5..6571af5 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -2785,9 +2785,8 @@ decl_jump_unsafe (tree decl)
   || type == error_mark_node)
 return 0;
 
-  type = strip_array_types (type);
-
-  if (DECL_NONTRIVIALLY_INITIALIZED_P (decl))
+  if (DECL_NONTRIVIALLY_INITIALIZED_P (decl)
+  || variably_modified_type_p (type, NULL_TREE))
 return 2;
 
   if (TYPE_HAS_NONTRIVIAL_DESTRUCTOR (TREE_TYPE (decl)))
diff --git a/gcc/testsuite/g++.dg/ext/vla14.C b/gcc/testsuite/g++.dg/ext/vla14.C
new file mode 100644
index 000..278cb63
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/vla14.C
@@ -0,0 +1,23 @@
+// PR c++/21113
+// { dg-options "" }
+
+void
+f (int n)
+{
+  goto label; // { dg-error "from here" }
+  int a[n]; // { dg-error "crosses initialization" }
+label: // { dg-error "jump to label" }
+  ;
+}
+
+void
+g (int n)
+{
+  switch (1)
+  {
+  case 1:
+int (*a)[n]; // { dg-error "crosses initialization" }
+  default: // { dg-error "jump to case label" }
+;
+  }
+}
-- 
1.9.1

[PATCH] Fix PR c++/44613

2014-04-03 Thread Patrick Palka

Hi,

This patch fixes a wrong code issue in the code generated for VLAs in
the C++ frontend.  This exact issue was fixed in the C frontend with
r85849, and this patch is essentially a port of r85849 for the C++
frontend.

The issue is that this C++ code:

  {
    foo:
    int x[n];
    f ();
  }

gets gimplified into this:

  {
    int x[n];
    void *saved_stack;
    saved_stack = __builtin_stack_save ();
    try
      {
        foo:  // <-- jump to foo will bypass initialization of saved_stack
        x = alloca (...);
        f ();
      }
    finally
      {
        __builtin_stack_restore (saved_stack);
      }
  }

In order to ensure that labels such as foo that occur before the
initialization of a VLA are emitted in the right place by the
gimplifier, the C++ frontend is changed to handle the above C++ code
as if it looked like this:

  {
    foo:
    {
      int x[n];
      f ();
    }
  }

thereby forcing the label foo to be placed before the initialization
of saved_stack during gimplification.  This is the same approach that
the C frontend uses (see r85849).
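
The equivalence described above can be checked from the source level with a small sketch, written in C and modeled on the vla15.C test: one function places the label directly before the VLA declaration, the other wraps the VLA in an explicit inner block. The function names and the limit parameter are assumptions for illustration. Both are expected to compute the same result, since each pass through the label creates a fresh array.

```c
/* Label directly before a VLA declaration, as in the test case.  */
int
sum_flat (int limit)
{
  int n = 0;
  int total = 0;
lab:;
  int x[n % 1000 + 1];      /* Re-created on every pass through `lab'.  */
  x[0] = 1;
  x[n % 1000] = 2;
  total += x[n % 1000];
  n++;
  if (n < limit)
    goto lab;
  return total;
}

/* The same loop with the VLA wrapped in its own block, which is the
   shape the front end is described as producing internally.  */
int
sum_nested (int limit)
{
  int n = 0;
  int total = 0;
lab:
  {
    int x[n % 1000 + 1];
    x[0] = 1;
    x[n % 1000] = 2;
    total += x[n % 1000];
    n++;
    if (n < limit)
      goto lab;
  }
  return total;
}
```

Every iteration adds 2 to the total, so 100 iterations give 200 in both versions. The interesting property is not the sum but that the stack does not grow across iterations once the save and restore bracket each pass correctly.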

I bootstrapped and regtested this patch on x86_64-unknown-linux-gnu.

2014-04-03  Patrick Palka  patr...@parcs.ath.cx

PR c++/44613
* semantics.c (add_stmt): Set STATEMENT_LIST_HAS_LABEL.
* decl.c (cp_finish_decl): Create a new BIND_EXPR before
instantiating a variable-sized type.
---
 gcc/cp/decl.c| 19 ++-
 gcc/cp/semantics.c   |  3 +++
 gcc/testsuite/g++.dg/ext/vla15.C | 20 
 3 files changed, 41 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/vla15.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index f3a081b..5bd33c5 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -6441,7 +6441,24 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p,
  after the call to check_initializer so that the DECL_EXPR for a
  reference temp is added before the DECL_EXPR for the reference itself.  */
   if (DECL_FUNCTION_SCOPE_P (decl))
-add_decl_expr (decl);
+{
+  /* If we're building a variable sized type, and we might be
+reachable other than via the top of the current binding
+level, then create a new BIND_EXPR so that we deallocate
+the object at the right time.  */
+  if (VAR_P (decl)
+  && DECL_SIZE (decl)
+  && !TREE_CONSTANT (DECL_SIZE (decl))
+  && STATEMENT_LIST_HAS_LABEL (cur_stmt_list))
+   {
+ tree bind;
+ bind = build3 (BIND_EXPR, void_type_node, NULL, NULL, NULL);
+ TREE_SIDE_EFFECTS (bind) = 1;
+ add_stmt (bind);
+ BIND_EXPR_BODY (bind) = push_stmt_list ();
+   }
+  add_decl_expr (decl);
+}
 
   /* Let the middle end know about variables and functions -- but not
  static data members in uninstantiated class templates.  */
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index fb1e404..b00294e 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -386,6 +386,9 @@ add_stmt (tree t)
   STMT_IS_FULL_EXPR_P (t) = stmts_are_full_exprs_p ();
 }
 
+  if (code == LABEL_EXPR || code == CASE_LABEL_EXPR)
+STATEMENT_LIST_HAS_LABEL (cur_stmt_list) = 1;
+
   /* Add T to the statement-tree.  Non-side-effect statements need to be
  recorded during statement expressions.  */
   gcc_checking_assert (!stmt_list_stack->is_empty ());
diff --git a/gcc/testsuite/g++.dg/ext/vla15.C b/gcc/testsuite/g++.dg/ext/vla15.C
new file mode 100644
index 000..feeb49f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/vla15.C
@@ -0,0 +1,20 @@
+// PR c++/44613
+// { dg-do run }
+// { dg-options "" }
+
+void *volatile p;
+
+int
+main (void)
+{
+  int n = 0;
+ lab:;
+  int x[n % 1000 + 1];
+  x[0] = 1;
+  x[n % 1000] = 2;
+  p = x;
+  n++;
+  if (n < 100)
+goto lab;
+  return 0;
+}
-- 
1.9.1

[PING][C++ Patch, 4.8] Backport fix for c++/54537 to FSF 4.8

2014-04-03 Thread Peter Bergner

I'd like to ping the following backport patch for the fix for PR54537.
This did bootstrap and regtest with no regressions on powerpc64-linux.

  http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01148.html

Peter

Re: [PING^8][PATCH] Add a couple of dialect and warning options regarding Objective-C instance variable scope

2014-04-03 Thread Dimitris Papavasiliou


Still pinging.

On 03/28/2014 11:58 AM, Dimitris Papavasiliou wrote:

Ping!

On 03/23/2014 03:20 AM, Dimitris Papavasiliou wrote:

Ping!

On 03/13/2014 11:54 AM, Dimitris Papavasiliou wrote:

Ping!

On 03/06/2014 07:44 PM, Dimitris Papavasiliou wrote:

Ping!

On 02/27/2014 11:44 AM, Dimitris Papavasiliou wrote:

Ping!

On 02/20/2014 12:11 PM, Dimitris Papavasiliou wrote:

Hello all,

Pinging this patch review request again. See previous messages quoted
below for details.

Regards,
Dimitris

On 02/13/2014 04:22 PM, Dimitris Papavasiliou wrote:

Hello,

Pinging this patch review request. Can someone involved in the
Objective-C language frontend have a quick look at the
description of
the proposed features and tell me if it'd be ok to have them in the
trunk so I can go ahead and create proper patches?

Thanks,
Dimitris

On 02/06/2014 11:25 AM, Dimitris Papavasiliou wrote:

Hello,

This is a patch regarding a couple of Objective-C related dialect
options and warning switches. I have already submitted it a while
ago
but gave up after pinging a couple of times. I am now informed that I
should have kept pinging until I got someone's attention so I'm
resending it.

The patch is now against an old revision and as I stated originally
it's
probably not in a state that can be adopted as is. I'm sending it
as is
so that the implemented features can be assessed in terms of their
usefulness and if they're welcome I'd be happy to make any
necessary
changes to bring it up-to-date, split it into smaller patches, add
test-cases and anything else that is deemed necessary.

Here's the relevant text from my initial message:

Two of these switches are related to a feature request I
submitted a
while ago, Bug 56044
(http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56044). I won't
reproduce
the entire argument here since it is available in the feature
request.
The relevant functionality in the patch comes in the form of two
switches:

-Wshadow-ivars which controls the local declaration of ‘somevar’
hides
instance variable warning which curiously is enabled by default
instead
of being controlled at least by -Wshadow. The patch changes it so
that
this warning can be enabled and disabled specifically through
-Wshadow-ivars as well as with all other shadowing-related warnings
through -Wshadow.

The reason for the extra switch is that, while searching through
the
Internet for a solution to this problem I have found out that other
people are inconvenienced by this particular warning as well so it
might
be useful to be able to turn it off while keeping all the other
shadowing-related warnings enabled.

-flocal-ivars which when true, as it is by default, treats instance
variables as having local scope. If false (-fno-local-ivars)
instance
variables must always be referred to as self->ivarname and
references of
ivarname resolve to the local or global scope as usual.

I've also taken the opportunity of adding another switch
unrelated to
the above but related to instance variables:

-fivar-visibility which can be set to either private, protected
(the
default), public and package. This sets the default instance
variable
visibility which normally is implicitly protected. My use-case for
it is
basically to be able to set it to public and thus effectively
disable
this visibility mechanism altogether which I find no use for and
therefore have to circumvent. I'm not sure if anyone else feels the
same
way towards this but I figured it was worth a try.

I'm attaching a preliminary patch against the current revision in
case
anyone wants to have a look. The changes are very small and any
blatant
mistakes should be immediately obvious. I have to admit to having
virtually no knowledge of the internals of GCC but I have tried to
keep
in line with formatting guidelines and general style as well as
looking
up the particulars of the way options are handled in the available
documentation to avoid blind copy-pasting. I have also tried to
test
the
functionality both in my own (relatively large, or at least not too
small) project and with small test programs and everything works as
expected. Finally, I tried running the tests too but these fail to
complete both in the patched and unpatched version, possibly due to
the
way I've configured GCC.

Dimitris

Re: [committed, libjava] XFAIL sourcelocation (PR libgcj/55637) backported to 4.8.3

2014-04-03 Thread Dominique d'Humières


Thanks for the tip. What should I do now? Should I fix the ChangeLog entry and 
add a new one or do nothing?

Dominique

Le 2 avr. 2014 à 12:47, Rainer Orth r...@cebitec.uni-bielefeld.de a écrit :

 domi...@lps.ens.fr (Dominique Dhumieres) writes:
 
 r...@cebitec.uni-bielefeld.de (Rainer Orth) wrote:
 Sure, patch preapproved.
 
 Committed as r208983:
 
 2014-04-01  Dominique d'Humieres domi...@lps.ens.fr
Rainer Orth  r...@cebitec.uni-bielefeld.de
 
PR libgcj/55637
* testsuite/libjava.lang/sourcelocation.xfail: New file.
 
 Btw, the customary format for such a ChangeLog entry is
 
 2014-04-01  Dominique d'Humieres domi...@lps.ens.fr
 
   Backport from mainline
   2014-02-20  Rainer Orth  r...@cebitec.uni-bielefeld.de
 
PR libgcj/55637
* testsuite/libjava.lang/sourcelocation.xfail: New file.
 
 This way, you can easily see when the original went in.
 
   Rainer
 
 -- 
 -
 Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [committed, libjava] XFAIL sourcelocation (PR libgcj/55637) backported to 4.8.3

2014-04-03 Thread Rainer Orth

Hi Dominique,

 Thanks for the tip. What should I do now? Should I fix the ChangeLog entry
 and add a new one or do nothing?

if you want, you could fix the ChangeLog entry in place, but don't add a
new one for that change.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PING][C++ Patch, 4.8] Backport fix for c++/54537 to FSF 4.8

2014-04-03 Thread Jonathan Wakely


On 03/04/14 10:25 -0500, Peter Bergner wrote:

I'd like to ping the following backport patch for the fix for PR54537.
This did bootstrap and regtest with no regressions on powerpc64-linux.

 http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01148.html


I don't know how risky the front-end change is, but if it gets
approved then the library part is obviously fine.

That said, my kneejerk reaction is if it's only really needed to
allow inclusion of tr1/cmath then my solution would be to not use
that TR1 header!

Re: [gomp4] Add tables generation

2014-04-03 Thread Bernd Schmidt


On 04/02/2014 10:36 AM, Thomas Schwinge wrote:

I see regressions in the libgomp testsuite for configurations where
offloading is not enabled:

 spawn [...]/build/gcc/xgcc -B[...]/build/gcc/ 
[...]/source/libgomp/testsuite/libgomp.c/for-3.c 
-B[...]/build/x86_64-unknown-linux-gnu/./libgomp/ 
-B[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs 
-I[...]/build/x86_64-unknown-linux-gnu/./libgomp 
-I[...]/source/libgomp/testsuite/.. -fmessage-length=0 
-fno-diagnostics-show-caret -fdiagnostics-color=never -fopenmp -std=gnu99 
-fopenmp -L[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs -lm -o 
./for-3.exe
 /tmp/ccGnT0ei.o: In function `main':
 for-3.c:(.text+0x21032): undefined reference to `__OPENMP_TARGET__'
 collect2: error: ld returned 1 exit status

I suppose that's because [...]


Workaround committed in r209015:



libgcc/
* crtstuff.c [!ENABLE_OFFLOADING] (__OPENMP_TARGET__): Define to
NULL.


The patch below should be a better fix, making the references to 
__OPENMP_TARGET__ weak. Does this work for you?



Bernd

Index: gcc/omp-low.c
===
--- gcc/omp-low.c	(revision 429741)
+++ gcc/omp-low.c	(working copy)
@@ -221,6 +221,28 @@ static tree scan_omp_1_op (tree *, int *
   *handled_ops_p = false; \
   break;
 
+static GTY(()) tree offload_symbol_decl;
+
+/* Get the __OPENMP_TARGET__ symbol.  */
+static tree
+get_offload_symbol_decl (void)
+{
+  if (!offload_symbol_decl)
+{
+  tree decl = build_decl (UNKNOWN_LOCATION, VAR_DECL,
+			  get_identifier ("__OPENMP_TARGET__"),
+			  ptr_type_node);
+  TREE_PUBLIC (decl) = 1;
+  DECL_EXTERNAL (decl) = 1;
+  DECL_WEAK (decl) = 1;
+  DECL_ATTRIBUTES (decl)
+	= tree_cons (get_identifier ("weak"),
+		 NULL_TREE, DECL_ATTRIBUTES (decl));
+  offload_symbol_decl = decl;
+}
+  return offload_symbol_decl;
+}
+
 /* Convenience function for calling scan_omp_1_op on tree operands.  */
 
 static inline tree
@@ -5148,11 +5170,7 @@ expand_oacc_offload (struct omp_region *
 }
 
   gimple g;
-  tree openmp_target
-= build_decl (UNKNOWN_LOCATION, VAR_DECL,
-		  get_identifier ("__OPENMP_TARGET__"), ptr_type_node);
-  TREE_PUBLIC (openmp_target) = 1;
-  DECL_EXTERNAL (openmp_target) = 1;
+  tree openmp_target = get_offload_symbol_decl ();
   tree fnaddr = build_fold_addr_expr (child_fn);
   g = gimple_build_call (builtin_decl_explicit (start_ix), 10, device,
 			 fnaddr, build_fold_addr_expr (openmp_target),
@@ -8686,11 +8704,7 @@ expand_omp_target (struct omp_region *re
 }
 
   gimple g;
-  tree openmp_target
-= build_decl (UNKNOWN_LOCATION, VAR_DECL,
-		  get_identifier ("__OPENMP_TARGET__"), ptr_type_node);
-  TREE_PUBLIC (openmp_target) = 1;
-  DECL_EXTERNAL (openmp_target) = 1;
+  tree openmp_target = get_offload_symbol_decl ();
   if (kind == GF_OMP_TARGET_KIND_REGION)
 {
   tree fnaddr = build_fold_addr_expr (child_fn);
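
For readers less familiar with weak references, here is a minimal C sketch of the effect the patch above is after: an undefined weak symbol does not cause a link error, and its address evaluates to NULL. The symbol and function names below are hypothetical stand-ins (not the real __OPENMP_TARGET__ machinery), and the behaviour shown assumes a typical ELF target such as GNU/Linux.

```c
#include <stddef.h>

/* Hypothetical stand-in for the offload table symbol.  It is declared
   weak, so the link succeeds even if no object file defines it.  */
extern void *sketch_offload_table __attribute__ ((weak));

/* Return the address of the table, or NULL when nothing defined it.  */
static void **
sketch_get_offload_table (void)
{
  return &sketch_offload_table;
}

/* Return 1 if an offload table was linked in, 0 otherwise.  */
int
sketch_have_offload_table (void)
{
  return sketch_get_offload_table () != NULL;
}
```

With a strong (non-weak) declaration the same program would fail at link time with an undefined reference, which is the failure reported for for-3.c above.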

Re: [gomp4] Add tables generation

2014-04-03 Thread Ilya Verbin

2014-04-03 20:13 GMT+04:00 Bernd Schmidt ber...@codesourcery.com:
 The patch below should be a better fix, making the references to  
 __OPENMP_TARGET__ weak. Does this work for you?

Shouldn't we just remove __OPENMP_TARGET__ argument from GOMP_target,
since we decided to pass it to GOMP_offload_register?

  -- Ilya

[PATCH] Initialize sanitizer builtins (PR sanitizer/60745)

2014-04-03 Thread Marek Polacek

Under certain circumstances the sanitizer builtins are not initialized
properly and ubsan_instrument_return must make sure they are
initialized.  Otherwise builtin_decl_explicit returns NULL and
we'll ICE in build_call_expr_loc_array.  I'm not sure which other
ubsan routines need similar fix.

No testcase attached since it's not trivial to reproduce this.

Bootstrapped/ran ubsan testsuite on x86_64-linux, ok for trunk?

2014-04-03  Marek Polacek  pola...@redhat.com

PR sanitizer/60745
* c-ubsan.c: Include asan.h.
(ubsan_instrument_return): Call initialize_sanitizer_builtins.

diff --git gcc/c-family/c-ubsan.c gcc/c-family/c-ubsan.c
index dc4d981..9d2403c 100644
--- gcc/c-family/c-ubsan.c
+++ gcc/c-family/c-ubsan.c
@@ -29,6 +29,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "ubsan.h"
 #include "c-family/c-common.h"
 #include "c-family/c-ubsan.h"
+#include "asan.h"
 
 /* Instrument division by zero and INT_MIN / -1.  If not instrumenting,
return NULL_TREE.  */
@@ -185,6 +186,8 @@ ubsan_instrument_vla (location_t loc, tree size)
 tree
 ubsan_instrument_return (location_t loc)
 {
+  initialize_sanitizer_builtins ();
+
   tree data = ubsan_create_data ("__ubsan_missing_return_data", loc,
 NULL, NULL_TREE);
   tree t = builtin_decl_explicit (BUILT_IN_UBSAN_HANDLE_MISSING_RETURN);

Marek
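
The fix above follows a common "initialize on first use" pattern. A minimal C sketch of that pattern, under the assumption that initialization is idempotent; every name here is hypothetical and only mimics the shape of initialize_sanitizer_builtins and builtin_decl_explicit, not their real implementation:

```c
#include <stddef.h>

typedef int (*sketch_handler_fn) (int);

/* Table of "builtins"; entries stay NULL until initialization runs.  */
static sketch_handler_fn sketch_table[1];
static int sketch_initialized;

static int
sketch_handler (int x)
{
  return x + 1;
}

/* Idempotent initializer: safe to call from every instrumentation
   routine, does the work only once.  */
void
sketch_initialize_builtins (void)
{
  if (sketch_initialized)
    return;
  sketch_table[0] = sketch_handler;
  sketch_initialized = 1;
}

/* Plain lookup; returns NULL if nobody initialized the table yet.  */
sketch_handler_fn
sketch_builtin_decl (int idx)
{
  return sketch_table[idx];
}

/* Instrumentation routine that guards itself against the NULL case.  */
int
sketch_instrument (int x)
{
  sketch_initialize_builtins ();
  return sketch_builtin_decl (0) (x);
}
```

Without the call in sketch_instrument, the lookup would hand back NULL and the call through it would crash, which is analogous to the ICE described above.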

Re: [PATCH, ARM] Fix PR60609 (Error: value of 256 too large for field of 1 bytes)

2014-04-03 Thread Ramana Radhakrishnan

On Thu, Apr 3, 2014 at 2:27 PM, Charles Baylis
charles.bay...@linaro.org wrote:
 Hi

 This bug causes the compiler to create a Thumb-2 TBB instruction with
 a jump table containing an out of range value in a .byte field:

 whatever.s:148: Error: value of 256 too large for field of 1 bytes at 100

 This occurs because the jump table is followed with a .align 1 due
 to ASM_OUTPUT_CASE_END, but the 'shorten' phase does not account for
 the space taken by this align directive.

My first reaction is to wonder why this is not a bug in the
shorten phase.


 This patch addresses the issue by removing ASM_OUTPUT_CASE_END from
 arm.h, and ensuring that the alignment after an ADDR_DIFF_VEC is
 instead inserted by aligning the label following the barrier which
 follows it. This is achieved by defining LABEL_ALIGN_AFTER_BARRIER
 appropriately.

At first glance this feels like a blunt hammer.  What is the code size
bloat from putting out such an alignment after each barrier that the
compiler emits, rather than tracking this in ASM_OUTPUT_CASE_END?

I'll try and have a look at this again tomorrow morning.

regards
Ramana


 Bootstrapped/checked on arm-unknown-linux-gnueabihf.

 OK for trunk, and backporting to 4.8?



 2014-04-02  Charles Baylis  charles.bay...@linaro.org

 PR target/60609
 * config/arm/arm.h (ASM_OUTPUT_CASE_END) Remove.
 (LABEL_ALIGN_AFTER_BARRIER) Align barriers which occur after
 ADDR_DIFF_VEC.


 2014-04-02  Charles Baylis  charles.bay...@linaro.org

 PR target/60609
 * g++.dg/torture/pr60609.C: New test.
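
The assembler error quoted in this thread can be reproduced with simple arithmetic. Each TBB table entry is a single byte holding half of the byte offset to the branch target, so the largest reachable offset is 510 bytes. A small C sketch of that range check (the function name is hypothetical, and this is not the check GCC itself performs):

```c
/* Return 1 if a branch target BYTE_OFFSET bytes past the table base
   can be encoded as a one-byte TBB table entry, 0 otherwise.  The
   entry stores byte_offset / 2, so it must be even and at most 510.  */
int
sketch_tbb_entry_fits (unsigned byte_offset)
{
  return (byte_offset % 2 == 0) && (byte_offset / 2 <= 255);
}
```

A target at offset 510 still fits (entry value 255). If two bytes of alignment padding that the length calculation did not account for push it to 512, the entry becomes 256, which matches the "value of 256 too large for field of 1 bytes" message.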

Re: [gomp4] Add tables generation

2014-04-03 Thread Bernd Schmidt


On 04/03/2014 06:53 PM, Ilya Verbin wrote:

2014-04-03 20:13 GMT+04:00 Bernd Schmidt ber...@codesourcery.com:

The patch below should be a better fix, making the references to  
__OPENMP_TARGET__ weak. Does this work for you?


Shouldn't we just remove __OPENMP_TARGET__ argument from GOMP_target,
since we decided to pass it to GOMP_offload_register?


I thought it was used to look up the right function? With shared 
libraries you'd get multiple __OPENMP_TARGET__ tables.



Bernd

Re: [PATCH] PowerPC, PR60735: _Decimal64 moves broken on -mspe

2014-04-03 Thread David Edelsohn

On Tue, Apr 1, 2014 at 7:55 PM, Michael Meissner
meiss...@linux.vnet.ibm.com wrote:
 In backporting the power8 changes to the 4.8 branch, one of the testers of
 these patches noticed that libgcc cannot be built on a linux SPE target.  The
 reason was the _Decimal64 type did not have a proper move insn in the SPE
 environment.  This patch fixes that issue.  In looking at the patch, I
 discovered two other thinkos that are fixed in this patch.

 The first problem is the movdf/movdd insns for 32-bit without hardware 
 floating
 point, checked whether we had hardware single precision support, when it 
 should
 have been checking that we had hardware double precision support.

 The second problem was that some of the types believed they could use the
 floating point registers in a SPE or software emulation environment.  So I
 added additional code to turn off the use of the FPRs in this case.

 I have done bootstraps and make check on 64-bit PowerPC linux systems with no
 regression.  In addition, I tested the code generated using cross compilers to
 the Linux SPE system.  Is this patch acceptable to be checked in the trunk 
 (and
 to the 4.8 branch when the other patches are approved)?

Mike,

Can you work with Edmar and Rohit to create a testcase for the GCC
testsuite as well?

Thanks, David

Re: [gomp4] Add tables generation

2014-04-03 Thread Ilya Verbin

2014-04-03 21:06 GMT+04:00 Bernd Schmidt ber...@codesourcery.com:
 On 04/03/2014 06:53 PM, Ilya Verbin wrote:

 2014-04-03 20:13 GMT+04:00 Bernd Schmidt ber...@codesourcery.com:

 The patch below should be a better fix, making the references to 
 __OPENMP_TARGET__ weak. Does this work for you?


 Shouldn't we just remove __OPENMP_TARGET__ argument from GOMP_target,
 since we decided to pass it to GOMP_offload_register?


 I thought it was used to look up the right function? With shared libraries
 you'd get multiple __OPENMP_TARGET__ tables.


 Bernd


Yes, initially the idea was to use it to look up the right function.
But now each DSO will call GOMP_offload_register and pass a unique
pointer to __OPENMP_TARGET__ (host_table) for this DSO.  Then
gomp_register_images_for_device registers all these host tables in the
plugin.  And when libgomp calls device_get_table_func, the plugin
returns the joint table for all DSOs.

  -- Ilya

Re: [gomp4] Add tables generation

2014-04-03 Thread Bernd Schmidt


On 04/03/2014 07:25 PM, Ilya Verbin wrote:

Yes, initially the idea was to use it to look up the right function.
But now each DSO will call GOMP_offload_register and pass a unique
pointer to __OPENMP_TARGET__ (host_table) for this DSO.  Then
gomp_register_images_for_device registers all these host tables in the
plugin.  And when libgomp calls device_get_table_func, the plugin
returns the joint table for all DSOs.


Why make a joint table? It seems better to use the __OPENMP_TARGET__ 
symbol to restrict lookups to the subset of symbols that could actually 
be found.
BTW, I still expect that the lookup by ordering will turn out to be 
fundamentally unreliable and we'll need to use the unique id patch I 
posted a while ago. In that case using __OPENMP_TARGET__ as a first 
order key for the lookups eliminates any problem with duplicate names 
across multiple libraries.



Bernd

Re: [gomp4] Add tables generation

2014-04-03 Thread Ilya Verbin

2014-04-03 21:28 GMT+04:00 Bernd Schmidt ber...@codesourcery.com:
 On 04/03/2014 07:25 PM, Ilya Verbin wrote:

 Yes, initially the idea was to use it to look up the right function.
 But now each DSO will call GOMP_offload_register and pass a unique
 pointer to __OPENMP_TARGET__ (host_table) for this DSO.  Then
 gomp_register_images_for_device registers all these host tables in the
 plugin.  And when libgomp calls device_get_table_func, the plugin
 returns the joint table for all DSOs.


 Why make a joint table? It seems better to use the __OPENMP_TARGET__ symbol
 to restrict lookups to the subset of symbols that could actually be found.
 BTW, I still expect that the lookup by ordering will turn out to be
 fundamentally unreliable and we'll need to use the unique id patch I posted
 a while ago. In that case using __OPENMP_TARGET__ as a first order key for
 the lookups eliminates any problem with duplicate names across multiple
 libraries.


 Bernd


In the current implementation each gomp_device_descr contains one
dev_splay_tree, and all addresses are inserted into this splay tree.
There is no need to restrict the lookup, because the addresses from
multiple DSOs cannot overlap.

  -- Ilya
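
To illustrate the argument that one joint table suffices when host address ranges never overlap, here is a minimal C sketch. It uses a sorted array with binary search instead of libgomp's splay tree, and all type and function names are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* One host address range [start, end) contributed by some DSO.  */
struct sketch_range
{
  uintptr_t start;
  uintptr_t end;
  int dso_id;
};

/* Look ADDR up in N ranges sorted by start address.  Because the
   ranges do not overlap, at most one range can contain ADDR, so no
   per-DSO key is needed to disambiguate.  Returns NULL if not found.  */
const struct sketch_range *
sketch_lookup (const struct sketch_range *ranges, size_t n, uintptr_t addr)
{
  size_t lo = 0, hi = n;
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (addr < ranges[mid].start)
        hi = mid;
      else if (addr >= ranges[mid].end)
        lo = mid + 1;
      else
        return &ranges[mid];
    }
  return NULL;
}
```

The sketch only covers lookup by address. It says nothing about lookup by ordering or by name, which is the part Bernd expects to need a per-table key.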

Re: [PATCH] PowerPC, PR60735: _Decimal64 moves broken on -mspe

2014-04-03 Thread Michael Meissner

On Thu, Apr 03, 2014 at 01:24:25PM -0400, David Edelsohn wrote:
 On Tue, Apr 1, 2014 at 7:55 PM, Michael Meissner
 meiss...@linux.vnet.ibm.com wrote:
  In backporting the power8 changes to the 4.8 branch, one of the testers of
  these patches noticed that libgcc cannot be built on a linux SPE target.  
  The
  reason was the _Decimal64 type did not have a proper move insn in the SPE
  environment.  This patch fixes that issue.  In looking at the patch, I
  discovered two other thinkos that are fixed in this patch.
 
  The first problem is the movdf/movdd insns for 32-bit without hardware 
  floating
  point, checked whether we had hardware single precision support, when it 
  should
  have been checking that we had hardware double precision support.
 
  The second problem was that some of the types believed they could use the
  floating point registers in a SPE or software emulation environment.  So I
  added additional code to turn off the use of the FPRs in this case.
 
  I have done bootstraps and make check on 64-bit PowerPC linux systems with 
  no
  regression.  In addition, I tested the code generated using cross compilers 
  to
  the Linux SPE system.  Is this patch acceptable to be checked in the trunk 
  (and
  to the 4.8 branch when the other patches are approved)?
 
 Mike,
 
 Can you work with Edmar and Rohit to create a testcase for the GCC
 testsuite as well?

Sure, but I won't be able to run it under the test suite.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

[4.8, PATCH 28/26] Backport Power8 and LE support: Fix for SPE (PR60735)

2014-04-03 Thread Bill Schmidt

Hi,

This patch (diff-pr60735) adds to the 4.8 PowerPC backport patch series
with a backported fix for PR60735, an unrecognized insn problem for SPE.

Thanks,
Bill


[gcc]

2014-04-03  Bill Schmidt  wschm...@linux.vnet.ibm.com

Back port mainline subversion id 209025.
2014-04-02  Michael Meissner  meiss...@linux.vnet.ibm.com

PR target/60735
* config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): If we have
software floating point or no floating point registers, do not
allow any type in the FPRs.  Eliminate a test for SPE SIMD types
in GPRs that occurs after we tested for GPRs that would never be
true.

* config/rs6000/rs6000.md (mov<mode>_softfloat32, FMOVE64):
Rewrite tests to use TARGET_DOUBLE_FLOAT and TARGET_E500_DOUBLE,
since the FMOVE64 type is DFmode/DDmode.  If TARGET_E500_DOUBLE,
specifically allow DDmode, since that does not use the SPE SIMD
instructions.

Index: gcc-4_8-test2/gcc/config/rs6000/rs6000.c
===
--- gcc-4_8-test2.orig/gcc/config/rs6000/rs6000.c
+++ gcc-4_8-test2/gcc/config/rs6000/rs6000.c
@@ -1733,6 +1733,9 @@ rs6000_hard_regno_mode_ok (int regno, en
  modes and DImode.  */
   if (FP_REGNO_P (regno))
 {
+  if (TARGET_SOFT_FLOAT || !TARGET_FPRS)
+   return 0;
+
   if (SCALAR_FLOAT_MODE_P (mode)
   && (mode != TDmode || (regno % 2) == 0)
   && FP_REGNO_P (last_regno))
@@ -1761,10 +1764,6 @@ rs6000_hard_regno_mode_ok (int regno, en
 return (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)
|| mode == V1TImode);
 
-  /* ...but GPRs can hold SIMD data on the SPE in one register.  */
-  if (SPE_SIMD_REGNO_P (regno) && TARGET_SPE && SPE_VECTOR_MODE (mode))
-return 1;
-
   /* We cannot put non-VSX TImode or PTImode anywhere except general register
  and it must be able to fit within the register set.  */
 
Index: gcc-4_8-test2/gcc/config/rs6000/rs6000.md
===
--- gcc-4_8-test2.orig/gcc/config/rs6000/rs6000.md
+++ gcc-4_8-test2/gcc/config/rs6000/rs6000.md
@@ -9428,8 +9428,9 @@
   [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=Y,r,r,r,r,r")
	(match_operand:FMOVE64 1 "input_operand" "r,Y,r,G,H,F"))]
   "! TARGET_POWERPC64 
-   && ((TARGET_FPRS && TARGET_SINGLE_FLOAT) 
-       || TARGET_SOFT_FLOAT || TARGET_E500_SINGLE)
+   && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) 
+       || TARGET_SOFT_FLOAT
+       || (<MODE>mode == DDmode && TARGET_E500_DOUBLE))
    && (gpc_reg_operand (operands[0], <MODE>mode)
        || gpc_reg_operand (operands[1], <MODE>mode))"
   "#"

[4.8, PATCH 29/26] Backport Power8 and LE support: Document vec_vgbbd

2014-04-03 Thread Bill Schmidt

Hi,

This patch (diff-vecdoc) is the last addition to the 4.8 PowerPC
backport patch series.  It simply adds some missing documentation that
should have been part of one of the previous patches.

I'm currently doing one more quick round of testing with the three
late-addition patches, and will then be ready to commit the series.

Thanks,
Bill


[gcc]

2014-04-03  Bill Schmidt  wschm...@linux.vnet.ibm.com

Back port from main line:
2014-04-01  Michael Meissner  meiss...@linux.vnet.ibm.com

* doc/extend.texi (PowerPC AltiVec/VSX Built-in Functions):
Document vec_vgbbd.

Index: gcc-4_8-test3/gcc/doc/extend.texi
===
--- gcc-4_8-test3.orig/gcc/doc/extend.texi
+++ gcc-4_8-test3/gcc/doc/extend.texi
@@ -14132,6 +14132,9 @@ vector unsigned short vec_vclzh (vector
 vector int vec_vclzw (vector int);
 vector unsigned int vec_vclzw (vector int);
 
+vector signed char vec_vgbbd (vector signed char);
+vector unsigned char vec_vgbbd (vector unsigned char);
+
 vector long long vec_vmaxsd (vector long long, vector long long);
 
 vector unsigned long long vec_vmaxud (vector unsigned long long,

Re: [GOOGLE] Updates SSA after VPT transofrmations in AFDO pass

2014-04-03 Thread Xinliang David Li

looks fine.

David

On Thu, Apr 3, 2014 at 10:56 AM, Dehao Chen de...@google.com wrote:
 This patch updates SSA after VPT transformation. This is needed
 because compute_inline_parameters will ICE without updated SSA.

 Testing on-going.

 OK for google-4_8?

 Thanks,
 Dehao

 Index: gcc/auto-profile.c
 ===
 --- gcc/auto-profile.c (revision 209059)
 +++ gcc/auto-profile.c (working copy)
 @@ -1448,6 +1448,7 @@ afdo_vpt_for_early_inline (stmt_set *promoted_stmt
free_dominance_info (CDI_POST_DOMINATORS);
calculate_dominance_info (CDI_POST_DOMINATORS);
calculate_dominance_info (CDI_DOMINATORS);
 +  update_ssa (TODO_update_ssa);
rebuild_cgraph_edges ();
return true;
  }

[GOOGLE] Updates SSA after VPT transofrmations in AFDO pass

2014-04-03 Thread Dehao Chen

This patch updates SSA after VPT transformation. This is needed
because compute_inline_parameters will ICE without updated SSA.

Testing on-going.

OK for google-4_8?

Thanks,
Dehao

Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c (revision 209059)
+++ gcc/auto-profile.c (working copy)
@@ -1448,6 +1448,7 @@ afdo_vpt_for_early_inline (stmt_set *promoted_stmt
   free_dominance_info (CDI_POST_DOMINATORS);
   calculate_dominance_info (CDI_POST_DOMINATORS);
   calculate_dominance_info (CDI_DOMINATORS);
+  update_ssa (TODO_update_ssa);
   rebuild_cgraph_edges ();
   return true;
 }

[Fortran-CAF, committed] Add array sending support for coarrays

2014-04-03 Thread Tobias Burnus

This patch handles assigning to coarray array (sections) from local 
arrays for array RHS and for scalar RHS. I have lightly tested it with 
libcaf_single.


On the library side, I added a minimal implementation for libcaf_single, 
which handles only rank==1 arrays, but which otherwise seems to work.


With that patch, the most common cases for sending should be handled. 
Features still missing for sending to remote images: character strings, 
type conversion (e.g. assigning a real to an integer), 
allocatable/pointer components of coarrays, and array vector sections 
are not yet handled. And, of course, reading from remote coarrays 
(get, pull) is not supported.


Built on x86-64-gnu-linux and committed to the branch as Rev. 209060

Tobias

PS: Minimal test case to be run with gfortran -fdump-tree-original 
-fcoarray=single -lcaf_single:


integer :: foo(5)[*]
integer :: bar(5)

bar = [1,2,3,4,5]
foo(:)[1] = bar
print *, foo
foo(:)[1] = 45
print *, foo
end
 gcc/fortran/ChangeLog.fortran-caf |9 +
 gcc/fortran/trans-decl.c  |   15 +++-
 gcc/fortran/trans-intrinsic.c |   34 +--
 gcc/fortran/trans.h   |2 +
 libgfortran/ChangeLog.fortran-caf |   13 +++
 libgfortran/caf/libcaf.h  |   34 +++
 libgfortran/caf/single.c  |   67 ++
 7 files changed, 163 insertions(+), 11 deletions(-)

Index: libgfortran/ChangeLog.fortran-caf
===
--- libgfortran/ChangeLog.fortran-caf	(revision 208931)
+++ libgfortran/ChangeLog.fortran-caf	(working copy)
@@ -1,3 +1,16 @@
+2014-04-03  Tobias Burnus  bur...@net-b.de
+
+	* caf/libcaf.h (descriptor_dimension, gfc_descriptor_t): New
+	structs.
+	(GFC_MAX_DIMENSIONS, GFC_DTYPE_RANK_MASK, GFC_DTYPE_TYPE_SHIFT,
+	GFC_DTYPE_TYPE_MASK, GFC_DTYPE_SIZE_SHIFT, GFC_DESCRIPTOR_RANK,
+	GFC_DESCRIPTOR_TYPE, GFC_DESCRIPTOR_SIZE): New defines.
+	(_gfortran_caf_send_desc, _gfortran_caf_send_desc_scalar): New
+	prototypes.
+	* caf/single.c (_gfortran_caf_send_desc,
+	_gfortran_caf_send_desc_scalar): New functions, supporting
+	rank == 1 only.
+
 2014-03-14  Tobias Burnus  bur...@net-b.de
 
 	* caf/libcaf.h (caf_token_t): New typedef.
Index: libgfortran/caf/libcaf.h
===
--- libgfortran/caf/libcaf.h	(revision 208931)
+++ libgfortran/caf/libcaf.h	(working copy)
@@ -58,6 +58,38 @@ caf_register_t;
 
 typedef void* caf_token_t;
 
+
+/* GNU Fortran's array descriptor.  Keep in sync with libgfortran.h.  */
+
+typedef struct descriptor_dimension
+{
+  ptrdiff_t _stride;
+  ptrdiff_t lower_bound;
+  ptrdiff_t _ubound;
+}
+descriptor_dimension;
+
+typedef struct gfc_descriptor_t {
+  void *base_addr;
+  size_t offset;
+  ptrdiff_t dtype;
+  descriptor_dimension dim[];
+} gfc_descriptor_t;
+
+
+#define GFC_MAX_DIMENSIONS 7
+
+#define GFC_DTYPE_RANK_MASK 0x07
+#define GFC_DTYPE_TYPE_SHIFT 3
+#define GFC_DTYPE_TYPE_MASK 0x38
+#define GFC_DTYPE_SIZE_SHIFT 6
+#define GFC_DESCRIPTOR_RANK(desc) ((desc)->dtype & GFC_DTYPE_RANK_MASK)
+#define GFC_DESCRIPTOR_TYPE(desc) (((desc)->dtype & GFC_DTYPE_TYPE_MASK) \
+                                   >> GFC_DTYPE_TYPE_SHIFT)
+#define GFC_DESCRIPTOR_SIZE(desc) ((desc)->dtype >> GFC_DTYPE_SIZE_SHIFT)
+
+
+
 /* Linked list of static coarrays registered.  */
 typedef struct caf_static_t {
   caf_token_t token;
@@ -77,6 +109,8 @@ void *_gfortran_caf_register (size_t, caf_register
 void _gfortran_caf_deregister (caf_token_t *, int *, char *, int);
 
 void _gfortran_send (caf_token_t, size_t, int, void *, size_t, bool);
+void _gfortran_send_desc (caf_token_t, size_t, int, gfc_descriptor_t*, gfc_descriptor_t*, bool);
+void _gfortran_send_desc_scalar (caf_token_t, size_t, int, gfc_descriptor_t*, void*, bool);
 
 void _gfortran_caf_sync_all (int *, char *, int);
 void _gfortran_caf_sync_images (int, int[], int *, char *, int);
Index: libgfortran/caf/single.c
===
--- libgfortran/caf/single.c	(revision 208931)
+++ libgfortran/caf/single.c	(working copy)
@@ -149,6 +149,7 @@ _gfortran_caf_deregister (caf_token_t *token, int
 *stat = 0;
 }
 
+/* Send scalar (or contiguous) data from buffer to a remote image.  */
 
 void
 _gfortran_caf_send (caf_token_t token, size_t offset,
@@ -161,7 +162,73 @@ _gfortran_caf_send (caf_token_t token, size_t offs
 }
 
 
+/* Send array data from src to dest on a remote image.  */
+
 void
+_gfortran_caf_send_desc (caf_token_t token, size_t offset,
+			 int image_id __attribute__ ((unused)),
+			 gfc_descriptor_t *dest, gfc_descriptor_t *src,
+			 bool asyn __attribute__ ((unused)))
+{
+  fprintf (stderr, "COARRAY ERROR: Array communication "
+	   "[_gfortran_caf_send_desc] not yet implemented for rank /= 0");
+  exit (EXIT_FAILURE);
+  size_t i, j;
+  size_t size = GFC_DESCRIPTOR_SIZE (dest);
+  int rank = GFC_DESCRIPTOR_RANK (dest);
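
As an aside on the dtype field used by the macros in this patch: rank, type and element size are packed into one integer using the masks and shifts defined above (rank in the low 3 bits, type in the next 3, size in the remaining bits). A minimal C sketch of packing and unpacking, using hypothetical SKETCH_ names with the same constant values:

```c
#include <stddef.h>

#define SKETCH_RANK_MASK 0x07
#define SKETCH_TYPE_SHIFT 3
#define SKETCH_TYPE_MASK 0x38
#define SKETCH_SIZE_SHIFT 6

/* Pack RANK (0..7), TYPE (0..7) and element SIZE into one dtype.  */
ptrdiff_t
sketch_pack_dtype (int rank, int type, ptrdiff_t size)
{
  return (size << SKETCH_SIZE_SHIFT)
	 | ((ptrdiff_t) type << SKETCH_TYPE_SHIFT)
	 | (ptrdiff_t) rank;
}

/* Extract the rank: low three bits.  */
int
sketch_rank (ptrdiff_t dtype)
{
  return (int) (dtype & SKETCH_RANK_MASK);
}

/* Extract the type: mask bits 3..5, then shift down.  */
int
sketch_type (ptrdiff_t dtype)
{
  return (int) ((dtype & SKETCH_TYPE_MASK) >> SKETCH_TYPE_SHIFT);
}

/* Extract the element size: everything above bit 5.  */
ptrdiff_t
sketch_size (ptrdiff_t dtype)
{
  return dtype >> SKETCH_SIZE_SHIFT;
}
```

For example, rank 1, type 1 and element size 4 pack to 1 + 8 + 256 = 265.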

[Fortran-caf] Merge from the trunk to the branch

2014-04-03 Thread Tobias Burnus


Committed to the fortran-caf branch as Rev. 209062

Tobias

[PR target/60657] [P1 regression] Fix operand predicates for a few ARM insns

2014-04-03 Thread Jeff Law




As noted in the PR, there are a few insns in the ARM backend which use 
const_int_operand as a predicate, but which have constraints like I or 
M.


With the predicate accepting all constants, it's possible for a pass 
such as combine to create an insn where the constant operand matches the 
loose predicate, but will not match the tighter constraint.  With no 
other alternatives to choose from, lra/reload won't be able to fix up the 
insn.


The right way (IMHO) is to tighten the predicate in these cases.  This 
patch introduces const_int_I_operand and const_int_M_operand.
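
To illustrate the mismatch described above, here is a small stand-alone C model.  It is not GCC code: the demo_* names are made up for this sketch, and it assumes the ARM-state meaning of the M constraint (an integer in the range 0 to 32).

```c
#include <assert.h>
#include <stdbool.h>

/* Loose predicate, in the spirit of const_int_operand: every integer
   constant is accepted.  */
static bool
demo_loose_predicate (long value)
{
  (void) value;
  return true;
}

/* Model of the "M" constraint (assumed range: 0 to 32 inclusive).  */
static bool
demo_constraint_M (long value)
{
  return value >= 0 && value <= 32;
}

/* Tightened predicate: accepts exactly what the constraint accepts.  */
static bool
demo_tight_predicate_M (long value)
{
  return demo_constraint_M (value);
}

/* True if PREDICATE lets VALUE through although the constraint rejects
   it, i.e. the situation that reload cannot repair when there is no
   other alternative.  */
static bool
demo_mismatch (bool (*predicate) (long), long value)
{
  return predicate (value) && !demo_constraint_M (value);
}

/* With the tightened predicate no value can slip through.  */
static void
demo_self_check (void)
{
  for (long v = -64; v <= 64; v++)
    assert (!demo_mismatch (demo_tight_predicate_M, v));
}
```

In this model a value such as 40 passes demo_loose_predicate but fails demo_constraint_M, whereas demo_tight_predicate_M rejects it up front, so the problematic insn is never formed.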


Bootstrapped on arm7l-unknown-linux-gnu (without Java, which fails for 
unrelated reasons) and regression tested.  One system didn't have GDB 
installed, so the atomic and guality tests were noisy; due to time 
constraints, I haven't re-run them.


OK for the trunk?

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 8d0c021..6c170d3 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,15 @@
+2014-04-03  Jeff Law  l...@redhat.com
+
+PR target/60657
+   * arm/predicates.md (const_int_I_operand): New predicate.
+   (const_int_M_operand): Similarly.
+   * arm/arm.md (insv_zero): Use const_int_M_operand instead of
+   const_int_operand.
+   (insv_t2, extv_reg, extzv_t2): Likewise.
+   (load_multiple_with_writeback): Similarly for const_int_I_operand.
+   (pop_multiple_with_writeback_and_return): Likewise.
+   (vfp_pop_multiple_with_writeback): Likewise
+
 2014-04-03  Richard Biener  rguent...@suse.de
 
* tree-streamer.h (struct streamer_tree_cache_d): Add next_idx
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 4df24a2..4b81ee2 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -2784,8 +2784,8 @@
 
 (define_insn "insv_zero"
   [(set (zero_extract:SI (match_operand:SI 0 "s_register_operand" "+r")
- (match_operand:SI 1 "const_int_operand" "M")
- (match_operand:SI 2 "const_int_operand" "M"))
+ (match_operand:SI 1 "const_int_M_operand" "M")
+ (match_operand:SI 2 "const_int_M_operand" "M"))
 (const_int 0))]
   "arm_arch_thumb2"
   "bfc%?\t%0, %2, %1"
@@ -2797,8 +2797,8 @@
 
 (define_insn "insv_t2"
   [(set (zero_extract:SI (match_operand:SI 0 "s_register_operand" "+r")
- (match_operand:SI 1 "const_int_operand" "M")
- (match_operand:SI 2 "const_int_operand" "M"))
+ (match_operand:SI 1 "const_int_M_operand" "M")
+ (match_operand:SI 2 "const_int_M_operand" "M"))
 (match_operand:SI 3 "s_register_operand" "r"))]
   "arm_arch_thumb2"
   "bfi%?\t%0, %3, %2, %1"
@@ -4480,8 +4480,8 @@
 (define_insn "*extv_reg"
   [(set (match_operand:SI 0 "s_register_operand" "=r")
(sign_extract:SI (match_operand:SI 1 "s_register_operand" "r")
- (match_operand:SI 2 "const_int_operand" "M")
- (match_operand:SI 3 "const_int_operand" "M")))]
+ (match_operand:SI 2 "const_int_M_operand" "M")
+ (match_operand:SI 3 "const_int_M_operand" "M")))]
   "arm_arch_thumb2"
   "sbfx%?\t%0, %1, %3, %2"
   [(set_attr "length" "4")
@@ -4493,8 +4493,8 @@
 (define_insn "extzv_t2"
   [(set (match_operand:SI 0 "s_register_operand" "=r")
(zero_extract:SI (match_operand:SI 1 "s_register_operand" "r")
- (match_operand:SI 2 "const_int_operand" "M")
- (match_operand:SI 3 "const_int_operand" "M")))]
+ (match_operand:SI 2 "const_int_M_operand" "M")
+ (match_operand:SI 3 "const_int_M_operand" "M")))]
   "arm_arch_thumb2"
   "ubfx%?\t%0, %1, %3, %2"
   [(set_attr "length" "4")
@@ -12073,7 +12073,7 @@
   [(match_parallel 0 "load_multiple_operation"
 [(set (match_operand:SI 1 "s_register_operand" "+rk")
   (plus:SI (match_dup 1)
-   (match_operand:SI 2 "const_int_operand" "I")))
+   (match_operand:SI 2 "const_int_I_operand" "I")))
  (set (match_operand:SI 3 "s_register_operand" "=rk")
   (mem:SI (match_dup 1)))
 ])]
@@ -12102,7 +12102,7 @@
 [(return)
  (set (match_operand:SI 1 "s_register_operand" "+rk")
   (plus:SI (match_dup 1)
-   (match_operand:SI 2 "const_int_operand" "I")))
+   (match_operand:SI 2 "const_int_I_operand" "I")))
  (set (match_operand:SI 3 "s_register_operand" "=rk")
   (mem:SI (match_dup 1)))
 ])]
@@ -12155,7 +12155,7 @@
   [(match_parallel 0 "pop_multiple_fp"
 [(set (match_operand:SI 1 "s_register_operand" "+rk")
   (plus:SI (match_dup 1)
-   (match_operand:SI 2 "const_int_operand" "I")))
+   (match_operand:SI 2 "const_int_I_operand" "I")))
  (set (match_operand:DF 3 "vfp_hard_register_operand" "")
   (mem:DF (match_dup 1)))])]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
index ce5c9a8..6273e88 100644

[RFA jit] clear timevar_enable in timevar_print

2014-04-03 Thread Tom Tromey

The timevar module doesn't properly re-initialize timevar_print
between invocations of the compiler.  In particular, if the compiler
is put into verbose mode, and subsequently put back into quiet mode,
then timevar_enable is never set to false -- leading to unwanted
timevar display.

This patch fixes the problem by clearing timevar_enable in
timevar_print.
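
A minimal stand-alone sketch of the state problem, for illustration only.  The demo_* names are hypothetical and this is not the actual timevar.c code; it merely models a file-scope flag that is set by verbose runs and, with the fix, cleared again once the report has been printed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the file-scope flag in timevar.c.  */
static bool demo_timevar_enable = false;

/* Called once per compiler invocation; only a verbose run turns
   timing on.  A quiet run never touches the flag.  */
static void
demo_timevar_init (bool verbose)
{
  if (verbose)
    demo_timevar_enable = true;
}

/* Print the report if timing is enabled and return whether anything
   was printed.  Clearing the flag here keeps a later quiet run from
   inheriting the state of an earlier verbose run.  */
static bool
demo_timevar_print (FILE *fp)
{
  assert (fp != NULL);

  if (!demo_timevar_enable)
    return false;
  /* Clean up for a possible next run.  */
  demo_timevar_enable = false;

  fputs ("timing report\n", fp);
  return true;
}
```

Calling demo_timevar_init (true) followed by demo_timevar_print prints a report; a subsequent demo_timevar_init (false) followed by demo_timevar_print prints nothing.  Without the reset, the second call would still print.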
---
 gcc/ChangeLog.jit | 4 
 gcc/timevar.c | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/gcc/ChangeLog.jit b/gcc/ChangeLog.jit
index 5145cf9..6ef9794 100644
--- a/gcc/ChangeLog.jit
+++ b/gcc/ChangeLog.jit
@@ -1,5 +1,9 @@
 2014-03-24  Tom Tromey  tro...@redhat.com
 
+   * timevar.c (timevar_print): Clear timevar_enable.
+
+2014-03-24  Tom Tromey  tro...@redhat.com
+
* toplev.c (general_init): Initialize input_location.
* input.c (input_location): Initialize to UNKNOWN_LOCATION.
 
diff --git a/gcc/timevar.c b/gcc/timevar.c
index 2ceee51..5e4c4c49 100644
--- a/gcc/timevar.c
+++ b/gcc/timevar.c
@@ -491,6 +491,8 @@ timevar_print (FILE *fp)
 
   if (!timevar_enable)
 return;
+  // Clean up for a possible next run.
+  timevar_enable = false;
 
   /* Update timing information in case we're calling this from GDB.  */
 
-- 
1.9.0

Re: [PATCH, PR 60640] When creating virtual clones, clone thunks too

2014-04-03 Thread Jan Hubicka

   +/* If E does not lead to a thunk, simply redirect it to N.  Otherwise create
   +   one or more equivalent thunks for N and redirect E to the first in the
   +   chain.  */
   +
   +void
   +redirect_edge_duplicating_thunks (struct cgraph_edge *e, struct cgraph_node *n,
   +   bitmap args_to_skip)
   +{
   +  cgraph_node *orig_to = cgraph_function_or_thunk_node (e->callee);
   +  if (orig_to->thunk.thunk_p)
   +n = duplicate_thunk_for_node (orig_to, n, args_to_skip);
  
  Is there anything that would prevent us from creating a new thunk for
  each call?
 
 No, given how late we have discovered it, it probably only happens
 very rarely.  Moreover, since you have plans to always inline only
 directly called thunks for the next release, which should be the
 ultimate solution, I did not think it was necessary or even
 appropriate at this stage.

A lot of code iterates over thunks/aliases and expects this to be a cheap
operation.  We thus need to be sure we won't create very many thunks or
aliases of a given function internally.

To trigger quadratic behaviour here, we only need a single function
call used very often in a big project, like Mozilla, to create an uncontrolled
number of thunks.  I would suggest just walking the existing thunks before
creating a new one, to see whether one already matches our needs.  The code
that makes local aliases does the same.  This change is pre-approved.
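
Roughly, the lookup-before-create idea could look like the sketch below.  The structures and names (demo_node, demo_thunk, demo_find_or_create_thunk) are hypothetical simplifications and not the cgraph API; a real thunk would have to match on more than a single offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, much simplified stand-ins for cgraph nodes and
   thunks.  */
struct demo_thunk
{
  long fixed_offset;		/* Property a caller needs to match.  */
  struct demo_thunk *next;	/* Next thunk of the same function.  */
};

struct demo_node
{
  struct demo_thunk *thunks;	/* Thunks already created for this node.  */
  size_t n_thunks;		/* How many thunks are on the list.  */
};

/* Return a thunk of NODE with FIXED_OFFSET.  Walk the existing thunks
   first and only create a new one when none matches, so repeated
   requests do not grow the list.  */
static struct demo_thunk *
demo_find_or_create_thunk (struct demo_node *node, long fixed_offset)
{
  for (struct demo_thunk *t = node->thunks; t != NULL; t = t->next)
    if (t->fixed_offset == fixed_offset)
      return t;

  struct demo_thunk *t = malloc (sizeof *t);
  assert (t != NULL);
  t->fixed_offset = fixed_offset;
  t->next = node->thunks;
  node->thunks = t;
  node->n_thunks++;
  return t;
}
```

With this shape, the number of thunks per function is bounded by the number of distinct requirements rather than by the number of redirected call sites.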
 
  
  Also I think you need to avoid this logic when the THIS parameter is being
  optimized out (i.e. it is part of skip_args).
 
 You are of course right.  However, skipping the creation of a new
 thunk when we are also removing parameter this leads to verification
 errors again, so I had to also teach the verifier that this case is
 actually OK.  Moreover, although it seems that currently all
That is fine with me.
 non-this_adjusting thunks are expanded before IPA-CP runs, I made sure
 the skipping logic checked that flag.

Yes, we only keep the simple thunks in non-lowered form, but I do not
see how it makes a difference for you.
 
 Accidentally, the two original testcases are removing parameter this so
 I added a new one, which also shows how current trunk miscompiles
 stuff.  Unfortunately, at the moment it relies on speculative edges
 and so when IPA-CP correctly redirects calls to a thunk, inlining
 gives up and removes the edge, so the IPA-CP transformation is not
 run-time checked.  However, the cgraph verifier does see the edge
 before that happens and is OK with it.

You can probably play with anonymous namespaces and final flags to get
it devirtualized unconditionally.
 
 I have also taken the liberty of removing an extra call to
 cgraph_function_or_thunk_node (clone_of_p calls it too) and a clearly
 obsolete comment from verify_edge_corresponds_to_fndecl.
 
 Bootstrapped and tested on x86_64-linux.  OK for trunk?
 
 Thanks,
 
 Martin
 
 
 2014-03-31  Martin Jambor  mjam...@suse.cz
 
 * cgraph.h (cgraph_clone_node): New parameter added to declaration.
 Adjust all callers.
   * cgraph.c (clone_of_p): Also return true if thunks match.
   (verify_edge_corresponds_to_fndecl): Removed extraneous call to
   cgraph_function_or_thunk_node and an obsolete comment.
 * cgraphclones.c (build_function_type_skip_args): Moved upwards in the
 file.
 (build_function_decl_skip_args): Likewise.
   (set_new_clone_decl_and_node_flags): New function.
 (duplicate_thunk_for_node): Likewise.
 (redirect_edge_duplicating_thunks): Likewise.
 (cgraph_clone_node): New parameter args_to_skip, pass it to
 redirect_edge_duplicating_thunks which is called instead of
 cgraph_redirect_edge_callee.
 (cgraph_create_virtual_clone): Pass args_to_skip to cgraph_clone_node,
   moved setting of a lot of flags to set_new_clone_decl_and_node_flags.
 
 testsuite/
 * g++.dg/ipa/pr60640-1.C: New test.
 * g++.dg/ipa/pr60640-2.C: Likewise.
 * g++.dg/ipa/pr60640-3.C: Likewise.

OK, with the change above.

Honza

Re: [4.8, PATCH 29/26] Backport Power8 and LE support: Document vec_vgbbd

2014-04-03 Thread Bill Schmidt

On Thu, 2014-04-03 at 13:01 -0500, Bill Schmidt wrote:
 I'm currently doing one more quick round of testing with the three
 late-addition patches, and will then be ready to commit the series.
 

Final tests have all passed (BE Linux, LE Linux, BE AIX).

Thanks,
Bill

Re: [PATCH] PR debug/57519 - Emit DW_TAG_imported_declaration under the right class for 'using' statements in a class

2014-04-03 Thread Cary Coutant

 ChangeLog:
 2014-03-25  Siva Chandra Reddy  sivachan...@google.com

 Fix PR debug/57519

 /cp

 PR debug/57519
 * class.c (handle_using_decl): Pass the correct scope to
 cp_emit_debug_info_for_using.

 testsuite/

 PR debug/57519
 * g++.dg/debug/dwarf2/imported-decl-2.C: New testcase.

This looks right to me, but you'll need approval from a C++ front end
maintainer.

Thanks!

-cary

RE: [PATCH][1/3] Fix PR54733 Optimize endian independent load/store

2014-04-03 Thread Thomas Preud'homme

 From: Andreas Schwab [mailto:sch...@suse.de]

 Please add m68k-*-*.

 From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Rainer Orth

 Just omit the { target *-*-* } completely, also a few more times.

Please find attached an updated patch.

gcc32rm-84.3.2.part1.diff
Description: Binary data

RE: [PATCH][2/3] Fix PR54733 Optimize endian independent load/store

2014-04-03 Thread Thomas Preud'homme

 From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Rainer Orth
 
 Just omit the { target *-*-* } completely, also a few more times.

Please find attached an updated patch.

Best regards,

Thomas

gcc32rm-84.3.2.part2.diff
Description: Binary data

Re: [gomp4] Add tables generation

2014-04-03 Thread Thomas Schwinge

Hi!

On Thu, 3 Apr 2014 18:13:08 +0200, Bernd Schmidt ber...@codesourcery.com 
wrote:
 On 04/02/2014 10:36 AM, Thomas Schwinge wrote:
  I see regressions in the libgomp testsuite for configurations where
  offloading is not enabled:
 
   spawn [...]/build/gcc/xgcc -B[...]/build/gcc/ 
  [...]/source/libgomp/testsuite/libgomp.c/for-3.c 
  -B[...]/build/x86_64-unknown-linux-gnu/./libgomp/ 
  -B[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs 
  -I[...]/build/x86_64-unknown-linux-gnu/./libgomp 
  -I[...]/source/libgomp/testsuite/.. -fmessage-length=0 
  -fno-diagnostics-show-caret -fdiagnostics-color=never -fopenmp -std=gnu99 
  -fopenmp -L[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs -lm -o 
  ./for-3.exe
   /tmp/ccGnT0ei.o: In function `main':
   for-3.c:(.text+0x21032): undefined reference to `__OPENMP_TARGET__'
   collect2: error: ld returned 1 exit status
 
  I suppose that's because [...]
 
  Workaround committed in r209015:
 
  libgcc/
  * crtstuff.c [!ENABLE_OFFLOADING] (__OPENMP_TARGET__): Define to
  NULL.
 
 The patch below should be a better fix, making the references to 
 __OPENMP_TARGET__ weak. Does this work for you?

Yes, it does, thanks!  Please revert my patch when committing yours.
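
For readers unfamiliar with the technique, a weak reference can be sketched as follows.  This assumes GCC (or a compatible compiler) and an ELF target; the symbol demo_offload_table_entries is hypothetical and deliberately not defined anywhere, standing in for a symbol that only exists when offloading is enabled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical symbol that nothing in this program defines.  Because
   the declaration is weak, the link still succeeds and the address
   resolves to a null pointer.  */
extern int demo_offload_table_entries (void) __attribute__ ((weak));

/* Use the symbol only when some object actually provides it.  */
static int
demo_count_offload_entries (void)
{
  if (demo_offload_table_entries == NULL)
    return 0;
  return demo_offload_table_entries ();
}
```

Without the weak attribute, the same program would fail to link with an undefined reference, which is the kind of failure shown in the log above.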


Oh, and please use ChangeLog.gomp files on gomp-4_0-branch; also please
move the entries for your recent commits from the ChangeLog file(s) to
the respective ChangeLog.gomp one(s).


Regards,
 Thomas


