[PING] [PATCH 1/n] Add conditional compare support

2013-11-20 Thread Zhenqiang Chen
Ping?

Thanks!
-Zhenqiang

 -Original Message-
 From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
 ow...@gcc.gnu.org] On Behalf Of Zhenqiang Chen
 Sent: Wednesday, November 06, 2013 3:39 PM
 To: 'Richard Henderson'
 Cc: Richard Earnshaw; 'Richard Biener'; GCC Patches
 Subject: RE: [PATCH 1/n] Add conditional compare support
 
 
  -Original Message-
  From: Richard Henderson [mailto:r...@redhat.com]
  Sent: Tuesday, November 05, 2013 4:39 AM
  To: Zhenqiang Chen
  Cc: Richard Earnshaw; 'Richard Biener'; GCC Patches
  Subject: Re: [PATCH 1/n] Add conditional compare support
 
  On 11/04/2013 08:00 PM, Zhenqiang Chen wrote:
   Thanks. I added a new hook. The default function will return -1 if the
   target does not care about the order.
  
   +DEFHOOK
   +(select_ccmp_cmp_order,
   + "For some targets (like ARM), the order of two compares is sensitive for\n\
   +conditional compare.  cmp0->cmp1 might be an invalid combination.  But when\n\
   +swapping the order, cmp1->cmp0 is valid.  The function will return\n\
   +  -1: if @code{code1} and @code{code2} are a valid combination.\n\
   +   1: if @code{code2} and @code{code1} are a valid combination.\n\
   +   0: both are invalid.",
   + int, (int code1, int code2),
   + default_select_ccmp_cmp_order)
 
   Fair enough.  I'd originally been thinking that returning a tri-state
   value akin to the comparison callback to qsort would allow easy
   sorting of a whole list of comparisons.  But it is probably just as easy
   to open-code this while checking for invalid combinations.
  
   Checking for invalid combinations while sorting means that we can then
   disallow returning NULL from the other two hooks, because the backend
   has already had a chance to indicate failure.
 
  The check covers only the first two compares; the following compares are
  not checked.  In addition, the backend might check more things (e.g.
  arm_select_dominance_cc_mode) to generate a valid compare instruction.
 
    For gen_ccmp_next, I add another parameter CC_P to indicate whether
    the result is used as a CC or not.  If CC_P is false, gen_ccmp_next
    will return a general register.  This is for code like
   
    int test (int a, int b)
    {
      return a < 0 && b < 0;
    }

    During expansion, there might be no branch at all, so gen_ccmp_next
    cannot return a CC for a < 0 && b < 0.
 
  Uh, no, this is a terrible idea.  There's no need for gen_ccmp_next to
  re-do the work of cstore_optab.
 
  I believe you can use emit_store_flag as a high-level interface here,
  since there are technically vagaries due to STORE_FLAG_VALUE.  If that
  turns out to crash or fail in some way, we can talk about using
  cstore_optab directly given some restrictions.
 
  emit_store_flag does too many checks.  I use cstore_optab to emit the insn.
 
 +  icode = optab_handler (cstore_optab, CCmode);
 +  if (icode != CODE_FOR_nothing)
 + {
 +   rtx target = gen_reg_rtx (word_mode);
 +   tmp = emit_cstore (target, icode, NE, CCmode, CCmode,
 +  0, tmp, const0_rtx, 1, word_mode);
 +   if (tmp)
 + return tmp;
 + }
 
  It also means that you shouldn't need all of and_scc_scc, ior_scc_scc,
  ccmp_and_scc_scc, ccmp_ior_scc_scc.
 
 Yes. We only need ccmp_and and ccmp_ior now.
 
  I will verify that the existing and_scc_scc, ior_scc_scc,
  and_scc_scc_cmp, ior_scc_scc_cmp can be removed once conditional compare
  is enabled.
 
  Although I don't see cstorecc4 defined for ARM, so there is something
  missing.
 
 cstorecc4 is added.
 
    +static int
    +arm_select_ccmp_cmp_order (int cond1, int cond2)
    +{
    +  if (cond1 == cond2)
    +    return -1;
    +  if (comparison_dominates_p ((enum rtx_code) cond1, (enum rtx_code) cond2))
    +    return 1;
    +  if (comparison_dominates_p ((enum rtx_code) cond2, (enum rtx_code) cond1))
    +    return -1;
    +  return 0;
    +}
 
  This sort does not look stable.  In particular,
 
if (cond1 == cond2)
  return 1;
 
  would seem to better preserve the original order of the comparisons.
 
  -1 is to keep the original order.  Anyway, I changed the function to:
 
  +/* COND1 and COND2 should be enum rtx_code, which represent two compares.
  +   The order is sensitive for conditional compare.  It returns
  +      1: Keep current order.
  +     -1: Swap the two compares.
  +      0: Invalid combination.  */
  +
  +static int
  +arm_select_ccmp_cmp_order (int cond1, int cond2)
  +{
  +  /* THUMB1 does not support conditional compare.  */
  +  if (TARGET_THUMB1)
  +    return 0;
  +
  +  if (cond1 == cond2)
  +    return 1;
  +  if (comparison_dominates_p ((enum rtx_code) cond1, (enum rtx_code) cond2))
  +    return -1;
  +  if (comparison_dominates_p ((enum rtx_code) cond2, (enum rtx_code) cond1))
  +    return 1;
  +
  +  return 0;
  +}
 
 Thanks!
 -Zhenqiang





Re: [PATCH] Support -fsanitize=leak

2013-11-20 Thread Jakub Jelinek
On Wed, Nov 20, 2013 at 11:40:14AM +0400, Yury Gribov wrote:
  Jakub already wrote the invoke.texi part:
  http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02061.html
 
  Ah, thanks, I somehow missed it.  When will this end up in trunk?  I
  have a dependent patch.

When it is reviewed.  I have various other pending asan patches:
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02070.html
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02061.html
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01874.html
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01791.html
plus the libbacktrace symbolization in libsanitizer (but that depends
on upstream acceptance and because of the non-trivial changes in upstream
will need to wait for acceptance there plus another merge back).

Dodji, do you think you could review these soon?

Jakub


[PATCH] Improve { x, x + 3, x + 6, x + 9 } expansion

2013-11-20 Thread Jakub Jelinek
Hi!

I've noticed we generate terrible code for the testcase below.
E.g. with -mavx2 it is:
	leal	6(%rdi), %edx
	leal	12(%rdi), %ecx
	leal	18(%rdi), %esi
	leal	3(%rdi), %eax
	movl	%edx, -20(%rsp)
	movl	%ecx, -24(%rsp)
	leal	9(%rdi), %edx
	movl	%esi, -28(%rsp)
	vmovd	-20(%rsp), %xmm6
	leal	15(%rdi), %ecx
	movl	%edi, -20(%rsp)
	leal	21(%rdi), %esi
	vmovd	-28(%rsp), %xmm4
	vmovd	-24(%rsp), %xmm5
	vpinsrd	$1, %edx, %xmm6, %xmm3
	vmovd	-20(%rsp), %xmm7
	vpinsrd	$1, %esi, %xmm4, %xmm2
	vpinsrd	$1, %ecx, %xmm5, %xmm1
	vpinsrd	$1, %eax, %xmm7, %xmm0
	vpunpcklqdq	%xmm2, %xmm1, %xmm1
	vpunpcklqdq	%xmm3, %xmm0, %xmm0
	vinserti128	$0x1, %xmm1, %ymm0, %ymm0
to load the { x, x + 3, x + 6, x + 9, x + 12, x + 15, x + 18, x + 21 }
CONSTRUCTOR into a vector register.  This patch expands it as:
	movl	%edi, -20(%rsp)
	vbroadcastss	-20(%rsp), %ymm0
	vpaddd	.LC0(%rip), %ymm0, %ymm0
instead.  Similarly for SSE2:
	movl	%edi, -12(%rsp)
	movd	-12(%rsp), %xmm3
	pshufd	$0, %xmm3, %xmm0
	paddd	.LC0(%rip), %xmm0
instead of:
	leal	3(%rdi), %eax
	movl	%eax, -20(%rsp)
	leal	6(%rdi), %eax
	movd	-20(%rsp), %xmm4
	movl	%eax, -16(%rsp)
	leal	9(%rdi), %eax
	movd	-16(%rsp), %xmm1
	movl	%edi, -16(%rsp)
	movl	%eax, -12(%rsp)
	movd	-16(%rsp), %xmm0
	movd	-12(%rsp), %xmm3
	punpckldq	%xmm4, %xmm0
	punpckldq	%xmm3, %xmm1
	punpcklqdq	%xmm1, %xmm0

The patch leaves that decision to the target (in its
*_expand_vector_init).

Ok?

2013-11-20  Jakub Jelinek  ja...@redhat.com

* expr.c (store_constructor): If all CONSTRUCTOR values are
the same SSA_NAME plus optional constant, pass plus_constant results
down to vec_init.
* config/i386/i386.c (ix86_expand_vector_init): Optimize it using
broadcast followed by addition of a constant vector.

* gcc.dg/vect/vect-124.c: New test.

--- gcc/expr.c.jj   2013-11-19 21:56:27.0 +0100
+++ gcc/expr.c  2013-11-20 09:24:14.435860130 +0100
@@ -6296,6 +6296,8 @@ store_constructor (tree exp, rtx target,
rtvec vector = NULL;
unsigned n_elts;
alias_set_type alias;
+   tree base = NULL_TREE;
+   rtx base_rtx = NULL_RTX;
 
gcc_assert (eltmode != BLKmode);
 
@@ -6334,6 +6336,31 @@ store_constructor (tree exp, rtx target,
TYPE_SIZE (TREE_TYPE (value)),
TYPE_SIZE (elttype)));
 
+   /* Try to detect fairly common pattern of
+  { a_5, a_5 + 1, a_5 + 2, a_5 + 3 }.  */
+	if (vector
+	    && (count == 0 || base)
+	    && TREE_CODE (value) == SSA_NAME)
+ {
+   gimple g = get_gimple_for_ssa_name (value);
+   tree this_base = value;
+	    if (g && is_gimple_assign (g))
+ switch (gimple_assign_rhs_code (g))
+   {
+   case PLUS_EXPR:
+   case POINTER_PLUS_EXPR:
+		if (TREE_CODE (gimple_assign_rhs1 (g)) == SSA_NAME
+		    && tree_fits_shwi_p (gimple_assign_rhs2 (g)))
+   this_base = gimple_assign_rhs1 (g);
+ break;
+   default:
+ break;
+   }
+   if (count == 0)
+ base = this_base;
+   else if (base != this_base)
+ base = NULL_TREE;
+ }
count += n_elts_here;
if (mostly_zeros_p (value))
  zero_count += n_elts_here;
@@ -6342,6 +6369,9 @@ store_constructor (tree exp, rtx target,
	/* Clear the entire vector first if there are any missing elements,
	   or if the incidence of zero elements is >= 75%.  */
	need_to_clear = (count < n_elts || 4 * zero_count >= 3 * count);
+
+	if (count < n_elts)
+ base = NULL_TREE;
  }
 
	if (need_to_clear && size > 0 && !vector)
@@ -6362,6 +6392,9 @@ store_constructor (tree exp, rtx target,
else
  alias = get_alias_set (elttype);
 
+   if (base)
+ base_rtx = expand_normal (base);
+
 /* Store each element of the constructor into the corresponding
   element of TARGET, determined by counting the elements.  */
for (idx = 0, i = 0;
@@ -6385,8 +6418,21 @@ store_constructor (tree exp, rtx target,
/* Vector CONSTRUCTORs should only be built from smaller
   vectors in the case of BLKmode vectors.  */

Re: [PATCH, i386, MPX, 2/X] Pointers Checker [21/25] Size relocation

2013-11-20 Thread Uros Bizjak
On Tue, Nov 19, 2013 at 9:43 PM, Ilya Enkovich enkovich@gmail.com wrote:

  Here is a patch to add size relocation and instruction to obtain object's 
  size in i386 target.

 +(define_insn "move_size_reloc_<mode>"
 +  [(set (match_operand:SWI48 0 "register_operand" "=r")
 +	(match_operand:<MODE> 1 "size_relocation" "Z"))]
 +  ""
 +{
 +  return "mov{<imodesuffix>}\t{%1, %0|%0, %1}";

 Please don't change x86_64_immediate_operand just to use the Z
 constraint.  The predicate is used in a couple of other places that for
 sure don't accept your change.

 Better write this insn in an explicit way (see for example
 tls_initial_exec_64_sun). Something like:

 (define_insn "move_size_reloc_<mode>"
   [(set (match_operand:SWI48 0 "register_operand" "=r")
	 (unspec:SWI48
	  [(match_operand 1 "symbolic_operand" "..." )]
	  UNSPEC_SIZEOF))]
   ""
   "mov{<imodesuffix>}\t{%a1@SIZE, %0|%0, %a1@SIZE}")

 You will probably need to define new operand 1 predicate and constraint.

 Uros.

 Hi, Uros!  Thanks for comments!  Here is what I got trying to follow your 
 suggestion.  Does it look better?

You actually don't need any operand modifiers in the insn template. Simply use:

mov{<imodesuffix>}\t{%1@SIZE, %0|%0, %1@SIZE}

and you will automatically get

movl $zzz, %eax or mov %eax, OFFSET FLAT: zzz.

Since your pattern allows only symbolic_operand, there is no reload,
so you can avoid the constraint altogether.

BTW: Did you consider the various -mcmodel=... options? For DImode moves,
you should check the x86_64_zext_immediate_operand predicate and output
either movl $zzz, %eax or movabs $zzz, %rax. There is no movq with a
64-bit immediate. Please see the movdi pattern.

Uros.


Re: [AArch64] AArch64 SIMD Builtins Better Type Correctness.

2013-11-20 Thread Marcus Shawcroft
On 18 November 2013 09:10, James Greenhalgh james.greenha...@arm.com wrote:

 2013-11-18  James Greenhalgh  james.greenha...@arm.com

 * gcc/config/aarch64/aarch64-builtins.c
 (aarch64_simd_itype): Remove.
 (aarch64_simd_builtin_datum): Remove itype, add
 qualifiers pointer.
 (VAR1): Use qualifiers.
 (aarch64_build_scalar_type): New.
 (aarch64_build_vector_type): Likewise.
 (aarch64_build_type): Likewise.
 (aarch64_init_simd_builtins): Refactor, remove special cases,
 consolidate main loop.
 (aarch64_simd_expand_args): Likewise.

OK
/Marcus


[buildrobot] [PATCH] mips: Really remove ENTRY_BLOCK_PTR

2013-11-20 Thread Jan-Benedict Glaw
Hi!

One usage of ENTRY_BLOCK_PTR was missed:

g++ -c   -g -O2 -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE  -fno-exceptions 
-fno-rtti -fasynchronous-unwind-tables -W -Wall -Wwrite-strings -Wcast-qual 
-Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros 
-Wno-overlength-strings -fno-common  -DHAVE_CONFIG_H -I. -I. 
-I/home/jbglaw/repos/gcc/gcc -I/home/jbglaw/repos/gcc/gcc/. 
-I/home/jbglaw/repos/gcc/gcc/../include 
-I/home/jbglaw/repos/gcc/gcc/../libcpp/include  
-I/home/jbglaw/repos/gcc/gcc/../libdecnumber 
-I/home/jbglaw/repos/gcc/gcc/../libdecnumber/dpd -I../libdecnumber 
-I/home/jbglaw/repos/gcc/gcc/../libbacktrace-o mips.o -MT mips.o -MMD -MP 
-MF ./.deps/mips.TPo /home/jbglaw/repos/gcc/gcc/config/mips/mips.c
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c: In function ‘void 
mips_insert_attributes(tree_node*, tree_node**)’:
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:1403: warning: unknown conversion 
type character ‘E’ in format
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:1403: warning: format ‘%qs’ 
expects type ‘char*’, but argument 2 has type ‘tree_node*’
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:1403: warning: too many arguments 
for format
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:1408: warning: unknown conversion 
type character ‘E’ in format
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:1408: warning: format ‘%qs’ 
expects type ‘char*’, but argument 2 has type ‘tree_node*’
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:1408: warning: too many arguments 
for format
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c: In function ‘tree_node* 
mips_merge_decl_attributes(tree_node*, tree_node*)’:
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:1437: warning: unknown conversion 
type character ‘E’ in format
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:1437: warning: format ‘%qs’ 
expects type ‘char*’, but argument 2 has type ‘tree_node*’
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:1437: warning: too many arguments 
for format
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:1443: warning: unknown conversion 
type character ‘E’ in format
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:1443: warning: format ‘%qs’ 
expects type ‘char*’, but argument 2 has type ‘tree_node*’
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:1443: warning: too many arguments 
for format
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c: In function ‘rtx_def* 
mips_expand_builtin(tree_node*, rtx_def*, rtx_def*, machine_mode, int)’:
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:14458: warning: unknown 
conversion type character ‘E’ in format
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:14458: warning: too many 
arguments for format
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c: In function ‘rtx_def* 
r10k_simplify_address(rtx_def*, rtx_def*)’:
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:14845: error: ‘ENTRY_BLOCK_PTR’ 
was not declared in this scope
make[1]: *** [mips.o] Error 1

(http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=36303)

This should fix it:

2013-11-20  Jan-Benedict Glaw  jbg...@lug-owl.de

* config/mips/mips.c (r10k_simplify_address): Eliminate macro usage.

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 82ca719..d06d574 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -14842,7 +14842,7 @@ r10k_simplify_address (rtx x, rtx insn)
  /* Replace the incoming value of $sp with
 virtual_incoming_args_rtx.  */
  if (x == stack_pointer_rtx
-	  && DF_REF_BB (def) == ENTRY_BLOCK_PTR)
+	  && DF_REF_BB (def) == ENTRY_BLOCK_PTR_FOR_FN (cfun))
newx = virtual_incoming_args_rtx;
}
  else if (dominated_by_p (CDI_DOMINATORS, DF_REF_BB (use),


Ok?

MfG, JBG

-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
Signature of: 23:53 @jbglaw So, I'm climbing into bed now.
the second  : 23:57 @jever2 .oO( climbing ..., does he still have bars on his bed,
like my kids used to?)
  00:00 @jbglaw jever2: *slap*
  00:01 @jever2 *ouch*, what for?  Thoughts are free!
  00:02 @jbglaw Nah, free thoughts have been out since 1984!
  00:03 @jever2 1984?  I've only been married since 1985!




Re: [PATCH] Support addsub/subadd as non-isomorphic operations for SLP vectorizer.

2013-11-20 Thread Richard Biener
On Tue, 19 Nov 2013, Cong Hou wrote:

 On Tue, Nov 19, 2013 at 1:45 AM, Richard Biener rguent...@suse.de wrote:
 
  On Mon, 18 Nov 2013, Cong Hou wrote:
 
   I tried your method and it works well for doubles. But for float,
   there is an issue. For the following gimple code:
  
  c1 = a - b;
  c2 = a + b;
   c = VEC_PERM <c1, c2, [0, 5, 2, 7]>
  
   It needs two instructions to implement the VEC_PERM operation in
   SSE2-4, one of which should be using shufps which is represented by
   the following pattern in rtl:
  
  
    (define_insn "sse_shufps_<mode>"
      [(set (match_operand:VI4F_128 0 "register_operand" "=x,x")
	    (vec_select:VI4F_128
	      (vec_concat:<ssedoublevecmode>
		(match_operand:VI4F_128 1 "register_operand" "0,x")
		(match_operand:VI4F_128 2 "nonimmediate_operand" "xm,xm"))
	      (parallel [(match_operand 3 "const_0_to_3_operand")
			 (match_operand 4 "const_0_to_3_operand")
			 (match_operand 5 "const_4_to_7_operand")
			 (match_operand 6 "const_4_to_7_operand")])))]
    ...)
  
   Note that it contains two rtl instructions.
 
  It's a single instruction as far as combine is concerned (RTL
  instructions have arbitrary complexity).
 
 
  Even if it is one instruction, we will end up with four rtl statements,
  which still cannot be combined, as there are restrictions on combining
  four instructions (loads of constants or binary operations involving a
  constant).  Note that vec_select instead of vec_merge is used here
  because currently vec_merge is emitted only if SSE4 is enabled (thus
  blend instructions can be used; if you look at
  ix86_expand_vec_perm_const_1() in i386.c, you can find that vec_merge
  is generated in expand_vec_perm_1() with SSE4).  Without SSE4 support,
  in most cases a vec_merge statement could not be translated into one SSE
  instruction.
 
 
 
    Together with the minus, the plus, and one more shuffling instruction,
    we have at least five instructions for the addsub pattern.  I think
    during the combine pass only four instructions are considered for
    combining, right?  So unless we compress those five instructions into
    four or fewer, we could not use this method for float values.
 
  At the moment addsubv4sf looks like
 
   (define_insn "sse3_addsubv4sf3"
     [(set (match_operand:V4SF 0 "register_operand" "=x,x")
	   (vec_merge:V4SF
	     (plus:V4SF
	       (match_operand:V4SF 1 "register_operand" "0,x")
	       (match_operand:V4SF 2 "nonimmediate_operand" "xm,xm"))
	     (minus:V4SF (match_dup 1) (match_dup 2))
	     (const_int 10)))]
 
  to match this it's best to have the VEC_SHUFFLE retained as
  vec_merge and thus support arbitrary(?) vec_merge for the aid
  of combining until reload(?) after which we can split it.
 
 
 
  You mean VEC_PERM (this is generated in gimple by your patch)?  Note,
  as I mentioned above, without SSE4 it is difficult to translate
  VEC_PERM into vec_merge.  Even if we can do it, we still need to define
  a split to convert one vec_merge into two or more other statements
  later.  ADDSUB instructions are provided by SSE3, and I think we should
  not rely on SSE4 to perform this transformation, right?

Sure not, we just keep vec_merge in the early RTL IL to give
combine an easier job.  We split it later to expose the target details
(see below).

 To sum up, if we use vec_select instead of vec_merge, we may have four
 rtl statements for float types, in which case they cannot be combined.
 If we use vec_merge, we need to define the split for it without SSE4
 support, and we need also to change the behavior of
 ix86_expand_vec_perm_const_1().

Yes.  But that would be best IMHO.  Otherwise we're not going to
have PR56766 fixed for the plain C generic vector testcase

typedef double v2df __attribute__((vector_size(16)));
typedef long long v2di __attribute__((vector_size(16)));
v2df foo (v2df x, v2df y)
{
  v2df tem1 = x - y;
  v2df tem2 = x + y;
  return __builtin_shuffle (tem1, tem2, (v2di) { 0, 3 });
}

   What do you think?
 
   Besides addsub, are there other instructions that can be expressed
   similarly?  Thus, how far should the combiner pattern go?
 
 
 
  I think your method is quite flexible.  Besides blending add/sub, we
  could blend other combinations of two operations, and even one
  operation and a no-op.  For example, consider vectorizing the complex
  conjugate operation:
  
  for (int i = 0; i < N; i += 2) {
    a[i] = b[i];
    a[i+1] = -b[i+1];
  }
  
  This loop is better vectorized by hybrid SLP.  The second
  statement has a unary minus operation but there is no operation in the
  first one.  We can improve our SLP grouping algorithm to let GCC SLP
  vectorize it.

Yes indeed - I also liked the flexibility in the end.  If cost
considerations allow it we can handle mismatched ops
quite well (albeit with a cost overhead).  Note that the
VEC_PERM permutations generated are always of the form that allow
implementing them with

  res = VEC_PERM <op1, op2, selector>

=>

  op1 = op1 & selector-mask1;
  op2 = op2 & selector-mask2;
  res = op1 | op2;


Re: [buildrobot] [PATCH] mips: Really remove ENTRY_BLOCK_PTR

2013-11-20 Thread Richard Sandiford
Jan-Benedict Glaw jbg...@lug-owl.de writes:
 2013-11-20  Jan-Benedict Glaw  jbg...@lug-owl.de

   * config/mips/mips.c (r10k_simplify_address): Eliminate macro usage.

OK.  And thanks for the catch.

Richard


Re: [buildrobot] [PATCH] mips: Really remove ENTRY_BLOCK_PTR

2013-11-20 Thread Steven Bosscher
On Wed, Nov 20, 2013 at 10:04 AM, Jan-Benedict Glaw wrote:
 2013-11-20  Jan-Benedict Glaw  ...

 * config/mips/mips.c (r10k_simplify_address): Eliminate macro usage.

 diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
 index 82ca719..d06d574 100644
 --- a/gcc/config/mips/mips.c
 +++ b/gcc/config/mips/mips.c
 @@ -14842,7 +14842,7 @@ r10k_simplify_address (rtx x, rtx insn)
   /* Replace the incoming value of $sp with
  virtual_incoming_args_rtx.  */
   if (x == stack_pointer_rtx
 -	  && DF_REF_BB (def) == ENTRY_BLOCK_PTR)
 +	  && DF_REF_BB (def) == ENTRY_BLOCK_PTR_FOR_FN (cfun))
 newx = virtual_incoming_args_rtx;
 }
   else if (dominated_by_p (CDI_DOMINATORS, DF_REF_BB (use),


 Ok?

 MfG, JBG

This patch is obvious and it fixes breakage. Please go ahead and commit it.

I wonder if there are any more cases like this missed... Could you
please check that? Something like:

egrep -w "ENTRY_BLOCK_PTR|EXIT_BLOCK_PTR" gcc/*.[ch] gcc/config/*.[ch] \
gcc/config/*/*.{c,h,md}

should do the trick. (I'd do it myself if I had access to a Linux box
right now...)

Ciao!
Steven


Re: RFA: patch to fix PR58785 (an ARM LRA crash)

2013-11-20 Thread Yvan Roux
Hi,

as Richard said, only a subset of rclass is allowed to be returned by
preferred_reload_class.  I've tested the attached patch in Thumb
mode, on ARMv5, A9 and A9hf, and on cross A15 without regression.

Yvan

2013-11-20  Yvan Roux  yvan.r...@linaro.org

PR target/58785
* config/arm/arm.c (arm_preferred_reload_class): Only return LO_REGS
when rclass is GENERAL_REGS.
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 5c53440..63f10bd 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -6882,10 +6882,7 @@ arm_preferred_reload_class (rtx x ATTRIBUTE_UNUSED, reg_class_t rclass)
 return rclass;
   else
 {
-  if (rclass == GENERAL_REGS
- || rclass == HI_REGS
- || rclass == NO_REGS
- || rclass == STACK_REG)
+  if (rclass == GENERAL_REGS)
return LO_REGS;
   else
return rclass;


Re: [PATCH] Improve { x, x + 3, x + 6, x + 9 } expansion

2013-11-20 Thread Richard Biener
On Wed, 20 Nov 2013, Jakub Jelinek wrote:

 Hi!
 
 [patch description and generated code snipped; quoted in full above]
 
 Ok?

Aww ;)  Nice improvement.  Generally when I see this I always wonder
whether we want to do this kind of stuff pre RTL expansion.
1st to not rely on being able to TER, 2nd to finally eventually
get rid of TER.

These patches are unfortunately a step backward for #2.

As for the patch, do we have a way to query whether the target
can efficiently broadcast?  If so, this IMHO belongs in generic
code, not in ix86_expand_vector_init - passing down plus_constant
results may pessimize other targets that don't understand this
trick, no?  (OTOH, which vector ISA does _not_ know how to
broadcast...?)

Thanks,
Richard.

 [ChangeLog and patch snipped]

Re: Ping Re: [gomp4] Dumping gimple for offload.

2013-11-20 Thread Richard Biener
On Tue, Nov 19, 2013 at 10:58 AM, Ilya Tocar tocarip.in...@gmail.com wrote:
 On 14 Nov 11:27, Richard Biener wrote:
  +  /* Set when symbol needs to be dumped for lto/offloading.  */
  +  unsigned need_dump : 1;
  +

 That's very non-descriptive.  What's offloading?  But yes, something
 like this is what I was asking for.

 I've changed it into:
 Set when symbol needs to be dumped into LTO bytecode for LTO,
 or in pragma omp target case, for separate compilation targeting
 a different architecture.

 Ok for gomp4 branch now?

Works for me.  I'll let branch maintainers decide if it follows whatever
is done there (I haven't found time to follow stuff here).

Richard.

 2013-11-19 Ilya Tocar  ilya.to...@intel.com

 * cgraph.h (symtab_node): Add need_dump.
 * cgraphunit.c (ipa_passes): Run ipa_write_summaries for omp.
 (compile): Initialize streamer for omp.
 * ipa-inline-analysis.c (inline_generate_summary): Add flag_openmp.
 * lto-cgraph.c (lto_set_symtab_encoder_in_partition): Respect
 need_dump flag.
 (select_what_to_dump): New.
 * lto-streamer.c (section_name_prefix): New.
 (lto_get_section_name): Use section_name_prefix.
 (lto_streamer_init): Add flag_openmp.
 * lto-streamer.h (OMP_SECTION_NAME_PREFIX): New.
 (section_name_prefix): Ditto.
 (select_what_to_dump): Ditto.
 * lto/lto-partition.c (add_symbol_to_partition_1): Set need_dump.
 (lto_promote_cross_file_statics): Dump everything.
 * passes.c (ipa_write_summaries): Add parameter,
 call select_what_to_dump.
 * tree-pass.h (void ipa_write_summaries): Add parameter.


 ---
  gcc/cgraph.h  |  5 +
  gcc/cgraphunit.c  | 15 +--
  gcc/ipa-inline-analysis.c |  2 +-
  gcc/lto-cgraph.c  | 14 ++
  gcc/lto-streamer.c|  5 +++--
  gcc/lto-streamer.h|  6 ++
  gcc/lto/lto-partition.c   |  3 +++
  gcc/passes.c  |  6 --
  gcc/tree-pass.h   |  2 +-
  9 files changed, 50 insertions(+), 8 deletions(-)

 diff --git a/gcc/cgraph.h b/gcc/cgraph.h
 index fb0fe93..9f799f4 100644
 --- a/gcc/cgraph.h
 +++ b/gcc/cgraph.h
 @@ -105,6 +105,11 @@ public:
/* Set when symbol has address taken. */
unsigned address_taken : 1;

 +  /* Set when symbol needs to be dumped into LTO bytecode for LTO,
 + or in pragma omp target case, for separate compilation targeting
 + a different architecture.  */
 +  unsigned need_dump : 1;
 +

/* Ordering of all symtab entries.  */
int order;
 diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
 index c3a8967..53cd250 100644
 --- a/gcc/cgraphunit.c
 +++ b/gcc/cgraphunit.c
 @@ -2019,7 +2019,18 @@ ipa_passes (void)
   passes->all_lto_gen_passes);

if (!in_lto_p)
 -ipa_write_summaries ();
 +{
 +  if (flag_openmp)
 +   {
 + section_name_prefix = OMP_SECTION_NAME_PREFIX;
 + ipa_write_summaries (true);
 +   }
 +  if (flag_lto)
 +   {
 + section_name_prefix = LTO_SECTION_NAME_PREFIX;
 + ipa_write_summaries (false);
 +   }
 +}

if (flag_generate_lto)
  targetm.asm_out.lto_end ();
 @@ -2110,7 +2121,7 @@ compile (void)
cgraph_state = CGRAPH_STATE_IPA;

/* If LTO is enabled, initialize the streamer hooks needed by GIMPLE.  */
 -  if (flag_lto)
 +  if (flag_lto || flag_openmp)
  lto_streamer_hooks_init ();

/* Don't run the IPA passes if there was any error or sorry messages.  */
 diff --git a/gcc/ipa-inline-analysis.c b/gcc/ipa-inline-analysis.c
 index 4458723..62faa52 100644
 --- a/gcc/ipa-inline-analysis.c
 +++ b/gcc/ipa-inline-analysis.c
 @@ -3813,7 +3813,7 @@ inline_generate_summary (void)

/* When not optimizing, do not bother to analyze.  Inlining is still done
   because edge redirection needs to happen there.  */
 -  if (!optimize && !flag_lto && !flag_wpa)
 +  if (!optimize && !flag_lto && !flag_wpa && !flag_openmp)
  return;

function_insertion_hook_holder =
 diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c
 index 6a52da8..697c069 100644
 --- a/gcc/lto-cgraph.c
 +++ b/gcc/lto-cgraph.c
 @@ -238,6 +238,9 @@ void
  lto_set_symtab_encoder_in_partition (lto_symtab_encoder_t encoder,
  symtab_node *node)
  {
 +  /* Ignore not needed nodes.  */
 +  if (!node->need_dump)
 +return;
int index = lto_symtab_encoder_encode (encoder, node);
encoder-nodes[index].in_partition = true;
  }
 @@ -751,6 +754,17 @@ add_references (lto_symtab_encoder_t encoder,
 lto_symtab_encoder_encode (encoder, ref->referred);
  }

 +/* Select what needs to be dumped.  In the lto case dump everything.
 +   In the omp target case only dump stuff marked with the attribute.  */
 +void
 +select_what_to_dump (bool is_omp)
 +{
 +  struct symtab_node *snode;
 +  FOR_EACH_SYMBOL (snode)
 +    snode->need_dump = !is_omp || lookup_attribute ("omp declare target",
 +   

Re: Ping Re: [gomp4] Dumping gimple for offload.

2013-11-20 Thread Jakub Jelinek
On Wed, Nov 20, 2013 at 10:34:30AM +0100, Richard Biener wrote:
 On Tue, Nov 19, 2013 at 10:58 AM, Ilya Tocar tocarip.in...@gmail.com wrote:
  On 14 Nov 11:27, Richard Biener wrote:
   +  /* Set when symbol needs to be dumped for lto/offloading.  */
   +  unsigned need_dump : 1;
   +
 
  That's very non-descriptive.  What's offloading?  But yes, something
  like this is what I was asking for.
 
  I've changed it into:
  Set when symbol needs to be dumped into LTO bytecode for LTO,
  or in pragma omp target case, for separate compilation targeting
  a different architecture.
 
  Ok for gomp4 branch now?
 
 Works for me.  I'll let branch maintainers decide if it follows whatever
 is done there (I haven't found time to follow stuff here).

Ok then.

Jakub


Re: [RFA/RFC patch]: Follow-up on type-demotion pass ...

2013-11-20 Thread Richard Biener
On Mon, Nov 18, 2013 at 8:46 PM, Kai Tietz kti...@redhat.com wrote:
 Hello,

 this is the second approach of the patch I sent some months ago.  I think 
 we saw that no other approach was shown, so I will continue on that ...
 The idea of this pass is to introduce a statement normalization and 
 denormalization.  The normalization happens in forward propagation, and 
 denormalization is done by this type-demotion pass.  Therefore it is 
 recommended that the type-demotion pass runs just before the forwprop pass, 
 as otherwise we might end up in a denormalized representation, which is in 
 some cases less performant.

 The current pass supports transcription of addition/subtraction into 
 unsigned-integer logic to prevent overflow issues.  By this we transform 
 operations like:
 int a,b; ... (unsigned int) (a + b) ... into ((unsigned int) a) + ((unsigned 
 int) b).
 This transformation needs to be done after the initial code analysis, as we 
 would otherwise destroy overflow detection by this transcription.  We could 
 do the same for MULT_EXPR, if there were not that late analysis of its 
 possible overflow.  Therefore I disabled such transformation for it.

 Then it supports shift-operation type-sinking/type-raising.  This seems to be 
 the most interesting part of this patch - beside the constant-propagation 
 optimization as seen in the ts-add-3.c testcase - as this kind of optimization 
 is new to gcc.  The part in forward propagation is required to define the 
 normal notation of such shift expressions.  As a nice side effect of these 
 transformations we find that type over-widening on some operations no longer 
 occurs in vectorized code.

 Additionally this pass should catch useless type conversions and simplify 
 them.  Therefore I kept the code about type-cast reduction within that 
 pass.  Of course we can move it out to some other place, like 
 tree-ssa-gimple, or even gimple might be a place.  But right now I would like 
 to keep all those pieces together at the beginning.

This is not a review of the pass either.

1) Can you please split out the forwprop pieces and motivate them
separately?
2) How did you decide on pass placement?  It looks like you do
it twice (eh?) and before forwprop passes.  What's the difference
between the passes?
3) You don't add a flag to disable this pass, please add one.
4) You compute post-dominators but apparently only to specify
basic-block walk order - instead use one of the basic-block
ordering functions.

Richard.


 ChangeLog gcc

 * Makefile.in: Add tree-ssa-te.c to build.
 * passes.def: Add typedemote passes.
 * tree-pass.h (make_pass_demote1): Add prototype.
 (make_pass_demote2): Likewise.
 * tree-ssa-forwprop.c (simplify_shift):  New function.
 (ssa_forward_propagate_and_combine): Use it.
 *  tree-ssa-te.c: New pass.

 Changelog gcc/testsuite:

 * gcc.dg/tree-ssa/scev-cast.c: Adjust test.
 * gcc.dg/tree-ssa/ssa-fre-2.c: Likewise.
 * gcc.dg/tree-ssa/ssa-fre-3.c: Likewise.
 * gcc.dg/tree-ssa/ssa-fre-4.c: Likewise.
 * gcc.dg/tree-ssa/ssa-fre-5.c: Likewise.
 * gcc.dg/vect/vect-over-widen-1-big-array.c: Likewise.
 * gcc.dg/vect/vect-over-widen-1.c: Likewise.
 * gcc.dg/vect/vect-over-widen-3-big-array.c: Likewise.
 * gcc.dg/vect/vect-over-widen-3.c: Likewise.
 * gcc.dg/vect/vect-over-widen-4-big-array.c: Likewise.
 * gcc.dg/vect/vect-over-widen-4.c: Likewise.
 * gcc.dg/tree-ssa/ts-add-1.c: New test.
 * gcc.dg/tree-ssa/ts-add-2.c: New test.
 * gcc.dg/tree-ssa/ts-add-3.c: New test.
 * gcc.dg/tree-ssa/ts-shift-1.c: New test.

 Tested for x86_64-unknown-linux-gnu, and i686-pc-cygwin, and x86_64-pc-cygwin.

 Ok for apply?

 Regards,
 Kai


 Index: gcc-trunk/gcc/Makefile.in
 ===
 --- gcc-trunk.orig/gcc/Makefile.in
 +++ gcc-trunk/gcc/Makefile.in
 @@ -1430,6 +1430,7 @@ OBJS = \
 tree-ssa-pre.o \
 tree-ssa-propagate.o \
 tree-ssa-reassoc.o \
 +   tree-ssa-te.o \
 tree-ssa-sccvn.o \
 tree-ssa-sink.o \
 tree-ssa-strlen.o \
 Index: gcc-trunk/gcc/passes.def
 ===
 --- gcc-trunk.orig/gcc/passes.def
 +++ gcc-trunk/gcc/passes.def
 @@ -64,6 +64,7 @@ along with GCC; see the file COPYING3.
   NEXT_PASS (pass_remove_cgraph_callee_edges);
   NEXT_PASS (pass_rename_ssa_copies);
   NEXT_PASS (pass_ccp);
 + NEXT_PASS (pass_demote1);
   /* After CCP we rewrite no longer addressed locals into SSA
  form if possible.  */
   NEXT_PASS (pass_forwprop);
 @@ -141,6 +142,7 @@ along with GCC; see the file COPYING3.
/* After CCP we rewrite no longer addressed locals into SSA
  form if possible.  */
NEXT_PASS (pass_phiprop);
 +  NEXT_PASS 

Re: [1/4] Use tree_to_uhwi with an inlined tree_fits_uhwi_p test

2013-11-20 Thread Richard Biener
On Tue, Nov 19, 2013 at 12:54 PM, Richard Sandiford
rdsandif...@googlemail.com wrote:
 check_function_arguments_recurse has an assert that is equivalent
 to tree_fits_uhwi_p.  The extraction can then use tree_to_uhwi.

 Asserting here makes the intent obvious, but tree_to_uhwi also asserts
 for the same thing, so an alternative would be to use tree_to_uhwi on
 its own.

Please do that.  Ok with this change.

Thanks,
Richard.

 Thanks,
 Richard


 gcc/c-family/
 2013-11-19  Kenneth Zadeck  zad...@naturalbridge.com

 * c-common.c (check_function_arguments_recurse): Use tree_fits_uhwi_p
 and tree_to_uhwi.

 Index: gcc/c-family/c-common.c
 ===
 --- gcc/c-family/c-common.c 2013-11-19 10:53:54.965643984 +
 +++ gcc/c-family/c-common.c 2013-11-19 11:08:41.797920627 +
 @@ -9209,10 +9209,9 @@ check_function_arguments_recurse (void (
to be valid.  */
 format_num_expr = TREE_VALUE (TREE_VALUE (attrs));

 -   gcc_assert (TREE_CODE (format_num_expr) == INTEGER_CST
 -               && !TREE_INT_CST_HIGH (format_num_expr));
 +   gcc_assert (tree_fits_uhwi_p (format_num_expr));

 -   format_num = TREE_INT_CST_LOW (format_num_expr);
 +   format_num = tree_to_uhwi (format_num_expr);

 for (inner_arg = first_call_expr_arg (param, iter), i = 1;
  inner_arg != 0;


Re: [PATCH] Improve { x, x + 3, x + 6, x + 9 } expansion

2013-11-20 Thread Jakub Jelinek
On Wed, Nov 20, 2013 at 10:31:38AM +0100, Richard Biener wrote:
 Aww ;)  Nice improvement.  Generally when I see this I always wonder
 whether we want to do this kind of stuff pre RTL expansion.
 1st to not rely on being able to TER, 2nd to finally eventually
 get rid of TER.
 
 These patches are unfortunately a step backward for #2.
 
 As of the patch, do we have a way to query whether the target
 can efficiently broadcast?  If so this IMHO belongs in generic

We don't.  Perhaps if we'd add optab for vec_dup<mode> and mentioned
clearly in the documentation that it should be used only if it is reasonably
efficient.  But still, even with optab, it would probably be better to do it
in the veclower* passes than in the vectorizer itself.

Jakub


Re: [3/4] Avoid undefined operation in overflow check

2013-11-20 Thread Richard Biener
On Tue, Nov 19, 2013 at 1:06 PM, Richard Sandiford
rdsandif...@googlemail.com wrote:
 This is a case where tree_to_shwi can be used instead of TREE_INT_CST_LOW.
 I separated it out because it was using a signed x * y / y == x to check
 whether x * y overflows a HWI, which relies on undefined behaviour.

Please CSE tree_to_shwi (size)  (yeah, you may use an ugly

   (hsz = tree_to_shwi (size)) <= HOST_WIDE_INT_MAX / BITS_PER_UNIT

Ok with that change.
Thanks,
Richard.

 Thanks,
 Richard


 gcc/
 * tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Avoid signed
 overflow.  Use tree_to_shwi.

 Index: gcc/tree-ssa-alias.c
 ===
 --- gcc/tree-ssa-alias.c2013-11-19 10:53:54.965643984 +
 +++ gcc/tree-ssa-alias.c2013-11-19 11:08:51.882992035 +
 @@ -615,9 +615,8 @@ ao_ref_init_from_ptr_and_size (ao_ref *r
   ref->offset += extra_offset;
   if (size
       && tree_fits_shwi_p (size)
 -     && TREE_INT_CST_LOW (size) * BITS_PER_UNIT / BITS_PER_UNIT
 -        == TREE_INT_CST_LOW (size))
 -    ref->max_size = ref->size = TREE_INT_CST_LOW (size) * BITS_PER_UNIT;
 +     && tree_to_shwi (size) <= HOST_WIDE_INT_MAX / BITS_PER_UNIT)
 +    ref->max_size = ref->size = tree_to_shwi (size) * BITS_PER_UNIT;
   else
     ref->max_size = ref->size = -1;
   ref->ref_alias_set = 0;


Re: [wwwdocs] Broken links

2013-11-20 Thread Patrick Marlier
+<li>Secify <code>[docs]</code> in the subject header.</li>

Also [wwwdocs] instead of [docs].

--
Patrick Marlier


Re: [PATCH, MPX, 2/X] Pointers Checker [14/25] Function splitting

2013-11-20 Thread Richard Biener
On Tue, Nov 19, 2013 at 9:18 PM, Ilya Enkovich enkovich@gmail.com wrote:
 2013/11/19 Jeff Law l...@redhat.com:
 On 11/19/13 05:20, Ilya Enkovich wrote:

 2013/11/19 Richard Biener richard.guent...@gmail.com:

 On Mon, Nov 18, 2013 at 8:12 PM, Ilya Enkovich enkovich@gmail.com
 wrote:

 2013/11/18 Jeff Law l...@redhat.com:

 On 11/18/13 11:27, Ilya Enkovich wrote:



 How does pointer passed to regular function differ from pointer passed
 to splitted function? How do I know then which pointer is to be passed
  with bounds and which one is not? Moreover current ABI does not allow
 to pass bounds with no pointer or pass bounds for some pointers in the
 call only.


 But I don't see any case in function splitting where we're going to
 want to
 pass the pointer without the bounds.  If you want the former, you're
 going
 to want the latter.


 There are at least cases when checks are eliminated or when lots of
 pointer usages are accompanied with few checks performed earlier (e.g.
 we are working with array). In such cases splitted part may easily get
 no bounds.


 I really don't see why you need to do anything special here.  At the
 most an
 assert in the splitting code to ensure that you don't have a situation
 where
 there's mixed pointers with bounds and pointers without bounds should
 be all
 you need or that you passed a bounds with no associated pointer :-)


 It would also require generation of proper bind_bounds calls in the
 original function and arg_bounds calls in a separated part. So,
 special support is required.


 Well, only to keep proper instrumentation.  I hope code still works
 (doesn't trap) when optimizations wreck the bounds?  Thus all
 these patches are improving bounds propagation but are not required
 for correctness?  If so please postpone all of them until after the
 initial support is merged.  If not, please make sure BND instrumentation
 works conservatively when optimizations wreck it.


 All patches I sent for optimization passes are required to avoid ICEs
 when compiling instrumented code.

 Then I think we're going to need to understand them in more detail. That's
 going to mean testcases, probably dumps and some commentary about what went
 wrong.

 I can't speak for Richi, but when optimizations get disabled, I tend to want
 to really understand why and make sure we're not papering over a larger
 problem.

 The tail recursion elimination one we're discussing now is a great example.
 At this point I understand the problem you're running into, but I'm still
 trying to wrap my head around the implications of the funny semantics of
 __builtin_arg_bounds and how they may cause other problems.

 Root of all problems is implicit data flow hidden in arg_bounds and
 bind_bounds.  Calls consume bounds and compiler does not know it. And
 input bounds are always expressed via arg_bounds calls and never
 expressed via formal args. Obviously optimizers have to be taught
 about these data dependencies to work correctly.

 I agree semantics of arg_bounds call creates many issues for
 optimizers but currently I do not see a better replacement for it.

But it looks incredibly fragile if you ICE once something you don't like
happens.  You should be able to easily detect the case and punt,
that is, drop to non-instrumented aka invalidating bounds.

Thus, I really really don't like these patches.  They hint at some
deeper problem with the overall design (or the HW feature or the
 accompanying ABI).

Richard.

 Ilya



 jeff



Re: [PATCH, MPX, 2/X] Pointers Checker [8/25] Languages support

2013-11-20 Thread Richard Biener
On Tue, Nov 19, 2013 at 9:29 PM, Ilya Enkovich enkovich@gmail.com wrote:
 On 19 Nov 12:33, Jeff Law wrote:
 On 11/19/13 05:13, Ilya Enkovich wrote:
 On 19 Nov 13:00, Richard Biener wrote:
 I'd say not in the gimplifier either but in varpool (symbol table) code
 where the symbols are ultimatively registered with?
 
 Something like that?
 
 --- a/gcc/varpool.c
 +++ b/gcc/varpool.c
 @@ -151,6 +151,10 @@ varpool_node_for_decl (tree decl)
 node = varpool_create_empty_node ();
 node-decl = decl;
 symtab_register_node (node);
 +
 +  if (DECL_INITIAL (decl))
 +chkp_register_var_initializer (decl);
 +
 return node;
   }
 Yea, I think that's what Richi is suggesting.
 jeff


 Great!  Here is a full version of the patch.  Bootstrap, make check and MPX 
 tests are OK with the change.

Please don't do this in varpool_node_for_decl but in
varpool_finalize_decl.

Richard.

 Thanks,
 Ilya
 --
 2013-11-19  Ilya Enkovich  ilya.enkov...@intel.com

 * varpool.c: Include tree-chkp.h.
 (varpool_node_for_decl): Register statically
 initialized decls in Pointer Bounds Checker.


 diff --git a/gcc/varpool.c b/gcc/varpool.c
 index 471db82..8487b6e 100644
 --- a/gcc/varpool.c
 +++ b/gcc/varpool.c
 @@ -34,6 +34,7 @@ along with GCC; see the file COPYING3.  If not see
  #include "output.h"
  #include "gimple.h"
  #include "flags.h"
 +#include "tree-chkp.h"

  /* List of hooks triggered on varpool_node events.  */
  struct varpool_node_hook_list {
 @@ -151,6 +152,10 @@ varpool_node_for_decl (tree decl)
node = varpool_create_empty_node ();
node-decl = decl;
symtab_register_node (node);
 +
 +  if (DECL_INITIAL (decl))
 +chkp_register_var_initializer (decl);
 +
return node;
  }



Re: [PATCH, MPX, 2/X] Pointers Checker [14/25] Function splitting

2013-11-20 Thread Richard Biener
On Wed, Nov 20, 2013 at 10:57 AM, Richard Biener
richard.guent...@gmail.com wrote:
 On Tue, Nov 19, 2013 at 9:18 PM, Ilya Enkovich enkovich@gmail.com wrote:
 2013/11/19 Jeff Law l...@redhat.com:
 On 11/19/13 05:20, Ilya Enkovich wrote:

 2013/11/19 Richard Biener richard.guent...@gmail.com:

 On Mon, Nov 18, 2013 at 8:12 PM, Ilya Enkovich enkovich@gmail.com
 wrote:

 2013/11/18 Jeff Law l...@redhat.com:

 On 11/18/13 11:27, Ilya Enkovich wrote:



 How does pointer passed to regular function differ from pointer passed
 to splitted function? How do I know then which pointer is to be passed
 with bounds and wchich one is not? Moreover current ABI does not allow
 to pass bounds with no pointer or pass bounds for some pointers in the
 call only.


 But I don't see any case in function splitting where we're going to
 want to
 pass the pointer without the bounds.  If you want the former, you're
 going
 to want the latter.


 There are at least cases when checks are eliminated or when lots of
 pointer usages are accompanied with few checks performed earlier (e.g.
 we are working with array). In such cases splitted part may easily get
 no bounds.


 I really don't see why you need to do anything special here.  At the
 most an
 assert in the splitting code to ensure that you don't have a situation
 where
 there's mixed pointers with bounds and pointers without bounds should
 be all
 you need or that you passed a bounds with no associated pointer :-)


 It would also require generation of proper bind_bounds calls in the
 original function and arg_bounds calls in a separated part. So,
 special support is required.


 Well, only to keep proper instrumentation.  I hope code still works
 (doesn't trap) when optimizations wreck the bounds?  Thus all
 these patches are improving bounds propagation but are not required
 for correctness?  If so please postpone all of them until after the
 initial support is merged.  If not, please make sure BND instrumentation
 works conservatively when optimizations wreck it.


 All patches I sent for optimization passes are required to avoid ICEs
 when compiling instrumented code.

 Then I think we're going to need to understand them in more detail. That's
 going to mean testcases, probably dumps and some commentary about what went
 wrong.

 I can't speak for Richi, but when optimizations get disabled, I tend to want
 to really understand why and make sure we're not papering over a larger
 problem.

 The tail recursion elimination one we're discussing now is a great example.
 At this point I understand the problem you're running into, but I'm still
 trying to wrap my head around the implications of the funny semantics of
 __builtin_arg_bounds and how they may cause other problems.

  Root of all problems is implicit data flow hidden in arg_bounds and
 bind_bounds.  Calls consume bounds and compiler does not know it. And
 input bounds are always expressed via arg_bounds calls and never
 expressed via formal args. Obviously optimizers have to be taught
 about these data dependencies to work correctly.

 I agree semantics of arg_bounds call creates many issues for
 optimizers but currently I do not see a better replacement for it.

 But it looks incredibly fragile if you ICE once something you don't like
 happens.  You should be able to easily detect the case and punt,
 that is, drop to non-instrumented aka invalidating bounds.

 Thus, I really really don't like these patches.  They hint at some
 deeper problem with the overall design (or the HW feature or the
  accompanying ABI).

Note that this, the intrusiveness of the feature and the questionable
gain makes me question whether GCC should have support for this
feature (and whether we really should rush this in this late).

Thus, I hereby formally ask to push back this feature to 4.10.

Thanks,
Richard.

 Richard.

 Ilya



 jeff



Re: [buildrobot] [PATCH] mips: Really remove ENTRY_BLOCK_PTR

2013-11-20 Thread Jan-Benedict Glaw
On Wed, 2013-11-20 10:08:45 +0100, Steven Bosscher stevenb@gmail.com 
wrote:
[...]
 I wonder if there are any more cases like this missed... Could you
 please check that? Something like:
 
 egrep -w "ENTRY_BLOCK_PTR|EXIT_BLOCK_PTR" gcc/*.[ch] gcc/config/*.[ch]
 gcc/config/*/*.{c,h,md}

No more uses, but 21 comment lines contain references to the macros.

MfG, JBG

-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
Signature of: Friends are relatives you make for yourself.
the second  :




Re: [PATCH, MPX, 2/X] Pointers Checker [16/25] Tail recursion

2013-11-20 Thread Richard Biener
On Tue, Nov 19, 2013 at 8:02 PM, Jeff Law l...@redhat.com wrote:
 On 11/18/13 14:03, Ilya Enkovich wrote:

 2013/11/19 Jeff Law l...@redhat.com:

 On 11/18/13 12:16, Ilya Enkovich wrote:


 With current recursion elimination we will have:

 test (int *param1)
 {
 bb1:

 bb2:
 _7 = PHI <param1(D) (bb1), _6 (bb2)>
 bounds2 = __builtin_arg_bounds (_7) -- WRONG


 I wouldn't say it's wrong.  It's conservatively correct if properly
 implemented.   Why precisely do you consider this wrong?  If your code
 can't
 handle it, it really should be fixed.


 It is wrong because __builtin_arg_bounds is used to get bounds of
 input parameter and PHI node here is not an input parameter.

 OK, now we're getting somewhere.  So we've got this odd little function
 which only works on parameters.   I can't help but feel this is a bit of
 mis-design coming back to haunt us and I wonder if we're going to see other
 instances of this kind of problem.

Right.

 There's something just wrong with the semantics of __builtin_arg_bounds, but
 I'm having trouble putting my finger on it.

Well, the issue is that it accepts any pointer argument as input.
I originally thought it may just get an integer constant argument - the
argument position.  It really seems to me that we have
__builtin_arg_boundsN (), but to avoid having N builtins we specify N
via an argument to the function.  But as it may not change we should
use something that cannot change - like a constant.  A constant
that identifies a parameter is one that gives its position for example.

Note that any restrictions you impose on what is valid for GIMPLE
should be verified in tree-cfg.c:verify_gimple_* - in the
__builtin_arg_bounds call case in verify_gimple_call.

Richard.


  Correct

 handling of __builtin_arg_bounds in this 'wrong' example requires
 adding it a PHI node semantics based on it's arg.  For me it seems
 more complex then generating a regular PHI node for bounds and makes
 GIMPLE less readable.

 But presumably you have code to merge bound information already, right?  You
 need that for PHI nodes.  Can't you use that to walk up the use-def chains
 and build the bounds information?

 Or if you want to fix the tailcall stuff, you can start down that direction.
 I don't think that'll fix the nagging concerns about the overall semantics
 of builtin_arg_bounds, but it might be enough for you to go forward.

 jeff


Re: [PATCH, 4.8] Backport patches to fix parallel build fail PR 57683

2013-11-20 Thread Richard Biener
On Wed, Nov 20, 2013 at 2:30 AM, Zhenqiang Chen
zhenqiang.c...@linaro.org wrote:
 Hi,

 The issue files in PR 57683 include lra-constraints.o,
 lra-eliminations.o and tree-switch-conversion.o.

 Jeff had fixed them in trunk (r197467 and r198999):

 http://gcc.gnu.org/ml/gcc-patches/2013-05/msg00957.html
 http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00211.html

 Can we backport the two patches to fix the parallel build issues in 4.8?

Sure, please go ahead.

Richard.

 Thanks!
 -Zhenqiang


Re: [PATCH] Improve { x, x + 3, x + 6, x + 9 } expansion

2013-11-20 Thread Richard Biener
On Wed, 20 Nov 2013, Jakub Jelinek wrote:

 On Wed, Nov 20, 2013 at 10:31:38AM +0100, Richard Biener wrote:
  Aww ;)  Nice improvement.  Generally when I see this I always wonder
  whether we want to do this kind of stuff pre RTL expansion.
  1st to not rely on being able to TER, 2nd to finally eventually
  get rid of TER.
  
  These patches are unfortunately a step backward for #2.
  
  As of the patch, do we have a way to query whether the target
  can efficiently broadcast?  If so this IMHO belongs in generic
 
 We don't.  Perhaps if we'd add optab for vec_dup<mode> and mentioned
 clearly in the documentation that it should be used only if it is reasonably
 efficient.  But still, even with optab, it would probably be better to do it
 in the veclower* passes than in the vectorizer itself.

Yes, veclower would be a good fit.

I was of course thinking of a new apply TER pass right before cfgexpand
(for the cases where TER ends up doing some kind of folding).

Richard.


[PATCH] Fix PR59035

2013-11-20 Thread Richard Biener

The following makes sure the languages default of -ffp-contract
is properly transfered to the LTO stage (which defaults to fast).
Merging different options from different TUs is done conservatively.

LTO bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2013-11-20  Richard Biener  rguent...@suse.de

PR lto/59035
* lto-opts.c (lto_write_options): Write defaults only if
they were not explicitly specified.  Also write
-ffp-contract default.
* lto-wrapper.c (merge_and_complain): Merge -ffp-contract
conservatively.
(run_gcc): Pass through -ffp-contract.

Index: gcc/lto-opts.c
===
*** gcc/lto-opts.c  (revision 205009)
--- gcc/lto-opts.c  (working copy)
*** lto_write_options (void)
*** 85,98 
   function rather than per compilation unit.  */
/* -fexceptions causes the EH machinery to be initialized, enabling
   generation of unwind data so that explicit throw() calls work.  */
!   if (global_options.x_flag_exceptions)
  append_to_collect_gcc_options (temporary_obstack, first_p,
   -fexceptions);
/* -fnon-call-exceptions changes the generation of exception
regions.  It is enabled implicitly by the Go frontend.  */
!   if (global_options.x_flag_non_call_exceptions)
  append_to_collect_gcc_options (temporary_obstack, first_p,
   -fnon-call-exceptions);
  
/* Output explicitly passed options.  */
for (i = 1; i  save_decoded_options_count; ++i)
--- 85,119 
   function rather than per compilation unit.  */
/* -fexceptions causes the EH machinery to be initialized, enabling
   generation of unwind data so that explicit throw() calls work.  */
!   if (!global_options_set.x_flag_exceptions
!       && global_options.x_flag_exceptions)
  append_to_collect_gcc_options (temporary_obstack, first_p,
   -fexceptions);
/* -fnon-call-exceptions changes the generation of exception
regions.  It is enabled implicitly by the Go frontend.  */
!   if (!global_options_set.x_flag_non_call_exceptions
!       && global_options.x_flag_non_call_exceptions)
  append_to_collect_gcc_options (temporary_obstack, first_p,
   -fnon-call-exceptions);
+   /* The default -ffp-contract changes depending on the language
+  standard.  Pass thru conservative standard settings.  */
+   if (!global_options_set.x_flag_fp_contract_mode)
+ switch (global_options.x_flag_fp_contract_mode)
+   {
+   case FP_CONTRACT_OFF:
+   append_to_collect_gcc_options (temporary_obstack, first_p,
+  -ffp-contract=off);
+   break;
+   case FP_CONTRACT_ON:
+   append_to_collect_gcc_options (temporary_obstack, first_p,
+  -ffp-contract=on);
+   break;
+   case FP_CONTRACT_FAST:
+   /* Nothing.  That merges conservatively and is the default for LTO.  */
+   break;
+   default:
+   gcc_unreachable ();
+   }
  
/* Output explicitly passed options.  */
for (i = 1; i  save_decoded_options_count; ++i)
Index: gcc/lto-wrapper.c
===
*** gcc/lto-wrapper.c   (revision 205009)
--- gcc/lto-wrapper.c   (working copy)
*** merge_and_complain (struct cl_decoded_op
*** 422,427 
--- 422,439 
append_option (decoded_options, decoded_options_count, foption);
  break;
  
+   case OPT_ffp_contract_:
+ /* For selected options we can merge conservatively.  */
+ for (j = 0; j  *decoded_options_count; ++j)
+   if ((*decoded_options)[j].opt_index == foption->opt_index)
+ break;
+ if (j == *decoded_options_count)
+   append_option (decoded_options, decoded_options_count, foption);
+ /* FP_CONTRACT_OFF < FP_CONTRACT_ON < FP_CONTRACT_FAST.  */
+ else if (foption->value < (*decoded_options)[j].value)
+   (*decoded_options)[j] = *foption;
+ break;
+ 
case OPT_freg_struct_return:
case OPT_fpcc_struct_return:
  for (j = 0; j  *decoded_options_count; ++j)
*** run_gcc (unsigned argc, char *argv[])
*** 578,583 
--- 590,596 
case OPT_fgnu_tm:
case OPT_freg_struct_return:
case OPT_fpcc_struct_return:
+   case OPT_ffp_contract_:
  break;
  
default:


Re: [PATCH, MPX, 2/X] Pointers Checker [16/25] Tail recursion

2013-11-20 Thread Ilya Enkovich
2013/11/20 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 19, 2013 at 8:02 PM, Jeff Law l...@redhat.com wrote:
 On 11/18/13 14:03, Ilya Enkovich wrote:

 2013/11/19 Jeff Law l...@redhat.com:

 On 11/18/13 12:16, Ilya Enkovich wrote:


 With current recursion elimination we will have:

 test (int *param1)
 {
 bb1:

 bb2:
 _7 = PHI <param1(D) (bb1), _6 (bb2)>
 bounds2 = __builtin_arg_bounds (_7) -- WRONG


 I wouldn't say it's wrong.  It's conservatively correct if properly
 implemented.   Why precisely do you consider this wrong?  If your code
 can't
 handle it, it really should be fixed.


 It is wrong because __builtin_arg_bounds is used to get bounds of
 input parameter and PHI node here is not an input parameter.

 OK, now we're getting somewhere.  So we've got this odd little function
 which only works on parameters.   I can't help but feel this is a bit of
 mis-design coming back to haunt us and I wonder if we're going to see other
 instances of this kind of problem.

 Right.

 There's something just wrong with the semantics of __builtin_arg_bounds, but
 I'm having trouble putting my finger on it.

 Well, the issue is that it accepts any pointer argument as input.
 I originally thought it may just get an integer constant argument - the
 argument position.  It really seems to me that we have
 __builtin_arg_boundsN (), but to avoid having N builtins we specify N
 via an argument to the function.  But as it may not change we should
 use something that cannot change - like a constant.  A constant
 that identifies a parameter is one that gives its position for example.

I have tried to pass a constant to arg_bounds.  It still requires
additional modification of optimization passes (including tail
recursion, param propagation, inline etc.).  But with a constant as an
argument you really cannot detect errors.  E.g. if you inline and do
not fix inlined arg_bounds calls correctly, you may not get an ICE but
a wrong program which uses wrong bounds and produces false bounds
violations.


 Note that any restrictions you impose on what is valid for GIMPLE
 should be verified in tree-cfg.c:verify_gimple_* - in the
 __builtin_arg_bounds call case in verify_gimple_call.

Will add it.

Thanks,
Ilya


 Richard.


  Correct

 handling of __builtin_arg_bounds in this 'wrong' example requires
 adding it a PHI node semantics based on it's arg.  For me it seems
 more complex then generating a regular PHI node for bounds and makes
 GIMPLE less readable.

 But presumably you have code to merge bound information already, right?  You
 need that for PHI nodes.  Can't you use that to walk up the use-def chains
 and build the bounds information?

 Or if you want to fix the tailcall stuff, you can start down that direction.
 I don't think that'll fix the nagging concerns about the overall semantics
 of builtin_arg_bounds, but it might be enough for you to go forward.

 jeff


[PATCH] Strict volatile bit-fields clean-up

2013-11-20 Thread Bernd Edlinger
Hello Richard,

as a follow-up patch to the bit-fields patch(es), I wanted to remove the 
dependencies on
the variable flag_strict_volatile_bitfields from expand_assignment and 
expand_expr_real_1.
Additionally I want the access mode of the field to be selected in the memory 
context,
instead of the structure's mode.

Boot-strapped and regression-tested on x86_64-linux-gnu.

OK for trunk?

Thanks
Bernd.


FYI - these are my in-flight patches, which would be nice to go into the GCC 
4.9.0 release:


http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02046.html :[PATCH] reimplement 
-fstrict-volatile-bitfields v4, part 1/2
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02025.html :[PATCH] Fix C++0x 
memory model for -fno-strict-volatile-bitfields on ARM
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02291.html :
[PATCH, PR 57748] Check for out of bounds access, Part 2
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00581.html :[PATCH, PR 57748] 
Check for out of bounds access, Part 3
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00133.html:[PATCH] Fix PR58115 
  2013-11-20  Bernd Edlinger  bernd.edlin...@hotmail.de

* expr.c (expand_assignment): Remove dependency on
flag_strict_volatile_bitfields. Always set the memory 
access mode.
(expand_expr_real_1): Likewise.



patch-bitfields-cleanup.diff
Description: Binary data


Re: [PATCH, MPX, 2/X] Pointers Checker [14/25] Function splitting

2013-11-20 Thread Ilya Enkovich
2013/11/20 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 19, 2013 at 9:18 PM, Ilya Enkovich enkovich@gmail.com wrote:
 2013/11/19 Jeff Law l...@redhat.com:
 On 11/19/13 05:20, Ilya Enkovich wrote:

 2013/11/19 Richard Biener richard.guent...@gmail.com:

 On Mon, Nov 18, 2013 at 8:12 PM, Ilya Enkovich enkovich@gmail.com
 wrote:

 2013/11/18 Jeff Law l...@redhat.com:

 On 11/18/13 11:27, Ilya Enkovich wrote:



 How does a pointer passed to a regular function differ from a pointer
 passed to a split function? How do I know then which pointer is to be
 passed with bounds and which one is not? Moreover, the current ABI does
 not allow passing bounds with no pointer, or passing bounds for only
 some pointers in the call.


 But I don't see any case in function splitting where we're going to
 want to
 pass the pointer without the bounds.  If you want the former, you're
 going
 to want the latter.


 There are at least cases when checks are eliminated, or when lots of
 pointer uses are accompanied by few checks performed earlier (e.g. when
 we are working with an array). In such cases the split part may easily
 get no bounds.


 I really don't see why you need to do anything special here.  At most,
 an assert in the splitting code should be all you need, to ensure that
 you don't have a situation where there are mixed pointers with bounds
 and pointers without bounds, or that you passed a bounds with no
 associated pointer :-)


 It would also require generation of proper bind_bounds calls in the
 original function and arg_bounds calls in a separated part. So,
 special support is required.


 Well, only to keep proper instrumentation.  I hope code still works
 (doesn't trap) when optimizations wreck the bounds?  Thus all
 these patches are improving bounds propagation but are not required
 for correctness?  If so please postpone all of them until after the
 initial support is merged.  If not, please make sure BND instrumentation
 works conservatively when optimizations wreck it.


 All patches I sent for optimization passes are required to avoid ICEs
 when compiling instrumented code.

 Then I think we're going to need to understand them in more detail. That's
 going to mean testcases, probably dumps and some commentary about what went
 wrong.

 I can't speak for Richi, but when optimizations get disabled, I tend to want
 to really understand why and make sure we're not papering over a larger
 problem.

 The tail recursion elimination one we're discussing now is a great example.
 At this point I understand the problem you're running into, but I'm still
 trying to wrap my head around the implications of the funny semantics of
 __builtin_arg_bounds and how they may cause other problems.

 The root of all the problems is the implicit data flow hidden in
 arg_bounds and bind_bounds.  Calls consume bounds and the compiler does
 not know it.  And input bounds are always expressed via arg_bounds calls
 and never expressed via formal args.  Obviously optimizers have to be
 taught about these data dependencies to work correctly.

 I agree semantics of arg_bounds call creates many issues for
 optimizers but currently I do not see a better replacement for it.

 But it looks incredibly fragile if you ICE once something you don't like
 happens.  You should be able to easily detect the case and punt,
 that is, drop to non-instrumented aka invalidating bounds.

Actually, it is easy to detect such cases and invalidate bounds each time
some optimization breaks the bounds data flow.  With such a feature, 5-6
of the sent patches would be unnecessary to successfully build
instrumented code at -O2, but without them the quality of checks would be
much lower (mostly due to inlining).

Ilya


 Thus, I really really don't like these patches.  They hint at some
 deeper problem with the overall design (or the HW feature or the
 accompanying ABI).

 Richard.

 Ilya



 jeff



Re: [PATCH, MPX, 2/X] Pointers Checker [16/25] Tail recursion

2013-11-20 Thread Richard Biener
On Wed, Nov 20, 2013 at 11:41 AM, Ilya Enkovich enkovich@gmail.com wrote:
 2013/11/20 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 19, 2013 at 8:02 PM, Jeff Law l...@redhat.com wrote:
 On 11/18/13 14:03, Ilya Enkovich wrote:

 2013/11/19 Jeff Law l...@redhat.com:

 On 11/18/13 12:16, Ilya Enkovich wrote:


 With current recursion elimination we will have:

 test (int *param1)
 {
 bb1:

 bb2:
 _7 = PHI <param1(D) (bb1), _6 (bb2)>
 bounds2 = __builtin_arg_bounds (_7) -- WRONG


 I wouldn't say it's wrong.  It's conservatively correct if properly
 implemented.   Why precisely do you consider this wrong?  If your code
 can't
 handle it, it really should be fixed.


 It is wrong because __builtin_arg_bounds is used to get bounds of
 input parameter and PHI node here is not an input parameter.

 OK, now we're getting somewhere.  So we've got this odd little function
 which only works on parameters.   I can't help but feel this is a bit of
 mis-design coming back to haunt us and I wonder if we're going to see other
 instances of this kind of problem.

 Right.

 There's something just wrong with the semantics of __builtin_arg_bounds, but
 I'm having trouble putting my finger on it.

 Well, the issue is that it accepts any pointer argument as input.
 I originally thought it may just get an integer constant argument - the
 argument position.  It really seems to me that we have
 __builtin_arg_boundsN (), but to avoid having N builtins we specify N
 via an argument to the function.  But as it may not change we should
 use something that cannot change - like a constant.  A constant
 that identifies a parameter is one that gives its position for example.

 I have tried passing a constant to arg_bounds.  It still requires
 additional modification of optimization passes (including tail
 recursion, param propagation, inline, etc.).  But with a constant as the
 argument you really cannot detect errors.  E.g. if you inline and do not
 fix the inlined arg_bounds calls correctly, you may not get an ICE but
 get a wrong program which uses wrong bounds and produces false bounds
 violations.

Yeah, that's why I refrained from suggesting this earlier ;)  You lose
the connection between bounds for argument X and function entry
value for argument X.

Richard.


 Note that any restrictions you impose on what is valid for GIMPLE
 should be verified in tree-cfg.c:verify_gimple_* - in the
 __builtin_arg_bounds call case in verify_gimple_call.

 Will add it.

 Thanks,
 Ilya


 Richard.


 Correct handling of __builtin_arg_bounds in this 'wrong' example requires
 giving it PHI-node semantics based on its arg.  For me that seems more
 complex than generating a regular PHI node for bounds, and it makes
 GIMPLE less readable.

 But presumably you have code to merge bound information already, right?  You
 need that for PHI nodes.  Can't you use that to walk up the use-def chains
 and build the bounds information?

 Or if you want to fix the tailcall stuff, you can start down that direction.
 I don't think that'll fix the nagging concerns about the overall semantics
 of builtin_arg_bounds, but it might be enough for you to go forward.

 jeff


Re: [PATCH, MPX, 2/X] Pointers Checker [14/25] Function splitting

2013-11-20 Thread Richard Biener
On Wed, Nov 20, 2013 at 11:55 AM, Ilya Enkovich enkovich@gmail.com wrote:
 2013/11/20 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 19, 2013 at 9:18 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/19 Jeff Law l...@redhat.com:
 On 11/19/13 05:20, Ilya Enkovich wrote:

 2013/11/19 Richard Biener richard.guent...@gmail.com:

 On Mon, Nov 18, 2013 at 8:12 PM, Ilya Enkovich enkovich@gmail.com
 wrote:

 2013/11/18 Jeff Law l...@redhat.com:

 On 11/18/13 11:27, Ilya Enkovich wrote:



 How does a pointer passed to a regular function differ from a pointer
 passed to a split function? How do I know then which pointer is to be
 passed with bounds and which one is not? Moreover, the current ABI does
 not allow passing bounds with no pointer, or passing bounds for only
 some pointers in the call.


 But I don't see any case in function splitting where we're going to
 want to
 pass the pointer without the bounds.  If you want the former, you're
 going
 to want the latter.


 There are at least cases when checks are eliminated, or when lots of
 pointer uses are accompanied by few checks performed earlier (e.g. when
 we are working with an array). In such cases the split part may easily
 get no bounds.


 I really don't see why you need to do anything special here.  At most,
 an assert in the splitting code should be all you need, to ensure that
 you don't have a situation where there are mixed pointers with bounds
 and pointers without bounds, or that you passed a bounds with no
 associated pointer :-)


 It would also require generation of proper bind_bounds calls in the
 original function and arg_bounds calls in a separated part. So,
 special support is required.


 Well, only to keep proper instrumentation.  I hope code still works
 (doesn't trap) when optimizations wreck the bounds?  Thus all
 these patches are improving bounds propagation but are not required
 for correctness?  If so please postpone all of them until after the
 initial support is merged.  If not, please make sure BND instrumentation
 works conservatively when optimizations wreck it.


 All patches I sent for optimization passes are required to avoid ICEs
 when compiling instrumented code.

 Then I think we're going to need to understand them in more detail. That's
 going to mean testcases, probably dumps and some commentary about what went
 wrong.

 I can't speak for Richi, but when optimizations get disabled, I tend to 
 want
 to really understand why and make sure we're not papering over a larger
 problem.

 The tail recursion elimination one we're discussing now is a great example.
 At this point I understand the problem you're running into, but I'm still
 trying to wrap my head around the implications of the funny semantics of
 __builtin_arg_bounds and how they may cause other problems.

 The root of all the problems is the implicit data flow hidden in
 arg_bounds and bind_bounds.  Calls consume bounds and the compiler does
 not know it.  And input bounds are always expressed via arg_bounds calls
 and never expressed via formal args.  Obviously optimizers have to be
 taught about these data dependencies to work correctly.

 I agree semantics of arg_bounds call creates many issues for
 optimizers but currently I do not see a better replacement for it.

 But it looks incredibly fragile if you ICE once something you don't like
 happens.  You should be able to easily detect the case and punt,
 that is, drop to non-instrumented aka invalidating bounds.

 Actually, it is easy to detect such cases and invalidate bounds each time
 some optimization breaks the bounds data flow.  With such a feature, 5-6
 of the sent patches would be unnecessary to successfully build
 instrumented code at -O2, but without them the quality of checks would be
 much lower (mostly due to inlining).

Sure, but you want to have this feature for a production compiler as
for sure you are going to miss an odd sequence of transforms that
breaks your assumptions.

Richard.

 Ilya


 Thus, I really really don't like these patches.  They hint at some
 deeper problem with the overall design (or the HW feature or the
 accompanying ABI).

 Richard.

 Ilya



 jeff



Re: [AArch64] Remove mode, mode2 attributes

2013-11-20 Thread Richard Earnshaw
On 19/11/13 18:12, James Greenhalgh wrote:
 
 There are no consumers for these attributes, nor should there
 ever be. Remove them.
 
 Regression tested on aarch64-none-elf with no issues.
 
 OK?
 
 Thanks,
 James
 
 ---
 2013-11-19  James Greenhalgh  james.greenha...@arm.com
 
   * config/aarch64/aarch64.md: Remove mode and mode2 attributes
   from all insns.
 
 

OK.

R.




Re: [PATCH, MPX, 2/X] Pointers Checker [14/25] Function splitting

2013-11-20 Thread Ilya Enkovich
2013/11/20 Richard Biener richard.guent...@gmail.com:
 On Wed, Nov 20, 2013 at 11:55 AM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/20 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 19, 2013 at 9:18 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/19 Jeff Law l...@redhat.com:
 On 11/19/13 05:20, Ilya Enkovich wrote:

 2013/11/19 Richard Biener richard.guent...@gmail.com:

 On Mon, Nov 18, 2013 at 8:12 PM, Ilya Enkovich enkovich@gmail.com
 wrote:

 2013/11/18 Jeff Law l...@redhat.com:

 On 11/18/13 11:27, Ilya Enkovich wrote:



 How does a pointer passed to a regular function differ from a pointer
 passed to a split function? How do I know then which pointer is to be
 passed with bounds and which one is not? Moreover, the current ABI does
 not allow passing bounds with no pointer, or passing bounds for only
 some pointers in the call.


 But I don't see any case in function splitting where we're going to
 want to
 pass the pointer without the bounds.  If you want the former, you're
 going
 to want the latter.


 There are at least cases when checks are eliminated, or when lots of
 pointer uses are accompanied by few checks performed earlier (e.g. when
 we are working with an array). In such cases the split part may easily
 get no bounds.


 I really don't see why you need to do anything special here.  At most,
 an assert in the splitting code should be all you need, to ensure that
 you don't have a situation where there are mixed pointers with bounds
 and pointers without bounds, or that you passed a bounds with no
 associated pointer :-)


 It would also require generation of proper bind_bounds calls in the
 original function and arg_bounds calls in a separated part. So,
 special support is required.


 Well, only to keep proper instrumentation.  I hope code still works
 (doesn't trap) when optimizations wreck the bounds?  Thus all
 these patches are improving bounds propagation but are not required
 for correctness?  If so please postpone all of them until after the
 initial support is merged.  If not, please make sure BND instrumentation
 works conservatively when optimizations wreck it.


 All patches I sent for optimization passes are required to avoid ICEs
 when compiling instrumented code.

 Then I think we're going to need to understand them in more detail. That's
 going to mean testcases, probably dumps and some commentary about what 
 went
 wrong.

 I can't speak for Richi, but when optimizations get disabled, I tend to 
 want
 to really understand why and make sure we're not papering over a larger
 problem.

 The tail recursion elimination one we're discussing now is a great 
 example.
 At this point I understand the problem you're running into, but I'm still
 trying to wrap my head around the implications of the funny semantics of
 __builtin_arg_bounds and how they may cause other problems.

 The root of all the problems is the implicit data flow hidden in
 arg_bounds and bind_bounds.  Calls consume bounds and the compiler does
 not know it.  And input bounds are always expressed via arg_bounds calls
 and never expressed via formal args.  Obviously optimizers have to be
 taught about these data dependencies to work correctly.

 I agree semantics of arg_bounds call creates many issues for
 optimizers but currently I do not see a better replacement for it.

 But it looks incredibly fragile if you ICE once something you don't like
 happens.  You should be able to easily detect the case and punt,
 that is, drop to non-instrumented aka invalidating bounds.

 Actually, it is easy to detect such cases and invalidate bounds each time
 some optimization breaks the bounds data flow.  With such a feature, 5-6
 of the sent patches would be unnecessary to successfully build
 instrumented code at -O2, but without them the quality of checks would be
 much lower (mostly due to inlining).

 Sure, but you want to have this feature for a production compiler as
 for sure you are going to miss an odd sequence of transforms that
 breaks your assumptions.

There still should be a way to detect such cases.  If all unsupported
cases are silently ignored, then the user never knows about them and
never reports the problem.

I may fix the expand changes to make them less vulnerable to different
code modifications and get no ICEs without fixes in the optimizers (well,
except for the fix in tree-ssa-ccp.c, where instrumentation breaks an
assumption about data flow between the stack save/restore builtins).  But
even with a stable expand I would still want all those changes in the
optimizers, to get better checker quality.

Ilya

 Richard.

 Ilya


 Thus, I really really don't like these patches.  They hint at some
 deeper problem with the overall design (or the HW feature or the
 accompanying ABI).

 Richard.

 Ilya



 jeff



Re: [PATCH i386 8/8] [AVX-512] Add SHA support.

2013-11-20 Thread Kirill Yukhin
On 19 Nov 15:36, Uros Bizjak wrote:
 Please also add new command options to g++.dg/other/sse-2.C and
 g++.dg/other/sse-3.C
 
 OK with the small nit below and with above testsute addition.
 
 No need to document negative option here.

Thanks! I'll address your comments and check it in after the rest is reviewed.

--
K


Re: Factor unrelated declarations out of tree.h (1/2)

2013-11-20 Thread Eric Botcazou
 Thanks. Committed as rev 205023.

Nice work, but why did you antedate the entries in the various ChangeLog 
files?  That's rather confusing when you use them to track things in specific 
directories (yes, we all know your opinion about ChangeLog files ;-)

-- 
Eric Botcazou


Re: Enable -fno-fat-lto-objects by default

2013-11-20 Thread Andreas Schwab
Jan Hubicka hubi...@ucw.cz writes:

   * config/bootstrap-lto.mk: Use -ffat-lto-objects.
   * common.opt (ffat-lto-objects): Disable by default.
   * doc/invoke.texi (fat-lto-objects): Update documentation.
  * opts.c: Enable fat-lto-objects on lto plugin disabled setups.

This is breaking all tests with -flto -fno-use-linker-plugin.

$ gcc/xgcc -B gcc/ ../gcc/testsuite/gcc.c-torture/execute/900409-1.c -w -flto 
-fno-use-linker-plugin 
/daten/cross/m68k-linux/m68k-linux/sys-root/usr/lib/crt1.o: In function 
`_start':
/daten/m68k/build/glibc-2.18/csu/../ports/sysdeps/m68k/start.S:85: undefined 
reference to `main'
collect2: error: ld returned 1 exit status

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
And now for something completely different.


Re: Enable -fno-fat-lto-objects by default

2013-11-20 Thread Jan Hubicka
 Jan Hubicka hubi...@ucw.cz writes:
 
  * config/bootstrap-lto.mk: Use -ffat-lto-objects.
  * common.opt (ffat-lto-objects): Disable by default.
  * doc/invoke.texi (fat-lto-objects): Update documentation.
  * opts.c: Enable fat-lto-objects on lto plugin disabled setups.
 
 This is breaking all tests with -flto -fno-use-linker-plugin.
 
 $ gcc/xgcc -B gcc/ ../gcc/testsuite/gcc.c-torture/execute/900409-1.c -w -flto 
 -fno-use-linker-plugin 
 /daten/cross/m68k-linux/m68k-linux/sys-root/usr/lib/crt1.o: In function 
 `_start':
 /daten/m68k/build/glibc-2.18/csu/../ports/sysdeps/m68k/start.S:85: undefined 
 reference to `main'
 collect2: error: ld returned 1 exit status

Hmm, I see.  Seems I profile-bootstrapped but later regtested only on a
plugin-disabled gccfarm machine.  I wonder if we should require users to
use -ffat-lto-objects in combination with -fno-use-linker-plugin (and thus
document it and fix the testsuite) or find a way to make the driver pass
the option transparently?

Honza
 
 Andreas.
 
 -- 
 Andreas Schwab, SUSE Labs, sch...@suse.de
 GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
 And now for something completely different.


Re: Enable -fno-fat-lto-objects by default

2013-11-20 Thread Jan Hubicka
Hi,
actually the flag is being passed to the frontends instead of being
consumed by the driver, which makes life easier.  I am testing the
attached patch and will commit it as obvious if it passes (to unbreak the
testing).

Index: opts.c
===================================================================
--- opts.c  (revision 205108)
+++ opts.c  (working copy)
@@ -809,10 +809,13 @@ finish_options (struct gcc_options *opts
 #else
   error_at (loc, "LTO support has not been enabled in this configuration");
 #endif
-  if (!opts->x_flag_fat_lto_objects && !HAVE_LTO_PLUGIN)
+  if (!opts->x_flag_fat_lto_objects
+      && (!HAVE_LTO_PLUGIN
+	  || (opts_set->x_flag_use_linker_plugin
+	      && !opts->x_flag_use_linker_plugin)))
     {
       if (opts_set->x_flag_fat_lto_objects)
-	error_at (loc, "-fno-fat-lto-objects are supported only with linker plugin.");
+	error_at (loc, "-fno-fat-lto-objects are supported only with linker plugin");
       opts->x_flag_fat_lto_objects = 1;
     }
 }


Re: [PATCH, i386, MPX, 2/X] Pointers Checker [21/25] Size relocation

2013-11-20 Thread Ilya Enkovich
On 20 Nov 09:49, Uros Bizjak wrote:
 On Tue, Nov 19, 2013 at 9:43 PM, Ilya Enkovich enkovich@gmail.com wrote:
 
   Here is a patch to add size relocation and instruction to obtain 
   object's size in i386 target.
 
   +(define_insn "move_size_reloc_<mode>"
   +  [(set (match_operand:SWI48 0 "register_operand" "=r")
   +	(match_operand:<MODE> 1 "size_relocation" "Z"))]
   +  ""
   +{
   +  return "mov{<imodesuffix>}\t{%1, %0|%0, %1}";
 
  Please don't change x86_64_immediate_operand just to use Z
  constraint The predicate is used in a couple of other places that for
  sure don't accept your change.
 
  Better write this insn in an explicit way (see for example
  tls_initial_exec_64_sun). Something like:
 
   (define_insn "move_size_reloc_<mode>"
     [(set (match_operand:SWI48 0 "register_operand" "=r")
	   (unspec:SWI48
	    [(match_operand 1 "symbolic_operand" ... )]
	    UNSPEC_SIZEOF))]
     ""
     "mov{<imodesuffix>}\t{%a1@SIZE, %0|%0, %a1@SIZE}")
 
  You will probably need to define new operand 1 predicate and constraint.
 
  Uros.
 
  Hi, Uros!  Thanks for comments!  Here is what I got trying to follow your 
  suggestion.  Does it look better?
 
 You actually don't need any operand modifiers in the insn template. Simply 
 use:
 
  "mov{<imodesuffix>}\t{%1@SIZE, %0|%0, %1@SIZE}"
 
 and you will automatically get
 
 movl $zzz, %eax or mov %eax, OFFSET FLAT: zzz.
 
 Since your pattern allows only symbolic_operand, there is no reload,
 so you can avoid the constraint alltogether.
 
 BTW: Did you consider various -mcmodel=... options? For DImode moves,
 you should check x86_64_zext_immediate_operand predicate and output
 either movl $zzz, %eax or movabs $zzz, %rax. There is no movq with
 64bit immediate. Please see movdi pattern.
 
 Uros.

Yep, for large objects it may work wrongly.  Does anyone use static
objects > 4Gb? :)

A large address does not mean a large object, but it seems we have to be
conservative here.  I added the x86_64_zext_immediate_operand check, with
an additional CM_KERNEL check, because in that model the object size
should always fit in 32 bits.

Thanks,
Ilya
--

gcc/

2013-11-19  Ilya Enkovich  ilya.enkov...@intel.com

* config/i386/i386.md (UNSPEC_SIZEOF): New.
(move_size_reloc_mode): New.
* config/i386/predicates.md (symbol_operand): New.

diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md
index 7289ae4..b29fadb 100644
--- a/gcc/config/i386/constraints.md
+++ b/gcc/config/i386/constraints.md
@@ -241,6 +241,7 @@
 ;;   s - address with no segment register
 ;;   i - address with no index and no rip
 ;;   b - address with no base and no rip
+;;   S - symbol reference
 
 (define_address_constraint "Tv"
   "VSIB address operand"
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index e23b3b6..772c09d 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -79,6 +79,7 @@
   UNSPEC_PLTOFF
   UNSPEC_MACHOPIC_OFFSET
   UNSPEC_PCREL
+  UNSPEC_SIZEOF
 
   ;; Prologue support
   UNSPEC_STACK_ALLOC
@@ -18446,6 +18447,22 @@
   "bndstx\t{%2, %3|%3, %2}"
   [(set_attr "type" "mpxst")])
 
+(define_insn "move_size_reloc_<mode>"
+  [(set (match_operand:SWI48 0 "register_operand" "=r")
+	(unspec:<MODE>
+	 [(match_operand:<MODE> 1 "symbol_operand")]
+	 UNSPEC_SIZEOF))]
+  ""
+{
+  if (GET_MODE (operands[0]) == SImode)
+    return "mov{l}\t{%1@SIZE, %k0|%k0, %1@SIZE}";
+  else if (x86_64_zext_immediate_operand (operands[1], VOIDmode)
+	   || ix86_cmodel == CM_KERNEL)
+    return "mov{q}\t{%1@SIZE, %0|%0, %1@SIZE}";
+  else
+    return "movabs{q}\t{%1@SIZE, %0|%0, %1@SIZE}";
+})
+
 (include mmx.md)
 (include sse.md)
 (include sync.md)
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 66ac52f..5c758ab 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -119,6 +119,10 @@
        (match_test "TARGET_64BIT")
        (match_test "REGNO (op) > BX_REG")))
 
+;; Return true if VALUE is a symbol reference.
+(define_predicate "symbol_operand"
+  (match_code "symbol_ref"))
+
 ;; Return true if VALUE can be stored in a sign extended immediate field.
 (define_predicate "x86_64_immediate_operand"
   (match_code "const_int,symbol_ref,label_ref,const")


Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.

2013-11-20 Thread Sergey Ostanevich
Thanks for the comments, hope I got all of them.
Note: I used LOOP_VINFO_LOC (loop_vinfo) to print the loop location,
but it appears to be 0, so the output is somewhat lousy.  The global
vect_location points somewhere inside the loop, which is not much better.
Shall we address this separately?

Sergos

* common.opt: Added new option -fsimd-cost-model.
* tree-vectorizer.h (unlimited_cost_model): Interface update
to rely on particular loop info.
* tree-vect-data-refs.c (vect_peeling_hash_insert): Update to
unlimited_cost_model call according to new interface.
(vect_peeling_hash_choose_best_peeling): Ditto.
(vect_enhance_data_refs_alignment): Ditto.
* tree-vect-slp.c: Ditto.
* tree-vect-loop.c (vect_estimate_min_profitable_iters): Ditto,
plus issue a warning in case cost model overrides users' directive.
* c-family/c.opt: add openmp-simd warning.
* fortran/lang.opt: Ditto.
* doc/invoke.texi: Added new openmp-simd warning.



diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 0026683..a85a8ad 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -592,6 +592,10 @@ Wold-style-definition
 C ObjC Var(warn_old_style_definition) Warning
 Warn if an old-style parameter definition is used

+Wopenmp-simd
+C C++ Var(openmp_simd) Warning EnabledBy(Wall)
+Warn when the simd directive is overridden by the vectorizer cost model
+
 Woverlength-strings
 C ObjC C++ ObjC++ Var(warn_overlength_strings) Warning
LangEnabledBy(C ObjC C++ ObjC++,Wpedantic)
 Warn if a string is longer than the maximum portable length specified
by the standard
diff --git a/gcc/common.opt b/gcc/common.opt
index d5971df..2b0e9e6 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2296,6 +2296,10 @@ fvect-cost-model=
 Common Joined RejectNegative Enum(vect_cost_model)
Var(flag_vect_cost_model) Init(VECT_COST_MODEL_DEFAULT)
 Specifies the cost model for vectorization

+fsimd-cost-model=
+Common Joined RejectNegative Enum(vect_cost_model)
Var(flag_simd_cost_model) Init(VECT_COST_MODEL_UNLIMITED)
+Specifies the vectorization cost model for code marked with simd directive
+
 Enum
 Name(vect_cost_model) Type(enum vect_cost_model) UnknownError(unknown
vectorizer cost model %qs)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c250385..050bd44 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -256,7 +256,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wlogical-op -Wlong-long @gol
 -Wmain -Wmaybe-uninitialized -Wmissing-braces
-Wmissing-field-initializers @gol
 -Wmissing-include-dirs @gol
--Wno-multichar  -Wnonnull  -Wno-overflow @gol
+-Wno-multichar  -Wnonnull  -Wno-overflow -Wopenmp-simd @gol
 -Woverlength-strings  -Wpacked  -Wpacked-bitfield-compat  -Wpadded @gol
 -Wparentheses  -Wpedantic-ms-format -Wno-pedantic-ms-format @gol
 -Wpointer-arith  -Wno-pointer-to-int-cast @gol
@@ -3318,6 +3318,7 @@ Options} and @ref{Objective-C and Objective-C++
Dialect Options}.
 -Wmaybe-uninitialized @gol
 -Wmissing-braces @r{(only for C/ObjC)} @gol
 -Wnonnull  @gol
+-Wopenmp-simd @gol
 -Wparentheses  @gol
 -Wpointer-sign  @gol
 -Wreorder   @gol
@@ -4804,6 +4805,10 @@ attribute.
 @opindex Woverflow
 Do not warn about compile-time overflow in constant expressions.

+@item -Wopenmp-simd
+@opindex Wopenmp-simd
+Warn if the vectorizer cost model overrides the simd directive from the user.
+
 @item -Woverride-init @r{(C and Objective-C only)}
 @opindex Woverride-init
 @opindex Wno-override-init
diff --git a/gcc/fortran/lang.opt b/gcc/fortran/lang.opt
index 5e09cbd..b43c48c 100644
--- a/gcc/fortran/lang.opt
+++ b/gcc/fortran/lang.opt
@@ -257,6 +257,10 @@ Wintrinsics-std
 Fortran Warning
 Warn on intrinsics not part of the selected standard

+Wopenmp-simd
+Fortran Warning
+; Documented in C
+
 Wreal-q-constant
 Fortran Warning
 Warn about real-literal-constants with 'q' exponent-letter
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 83d1f45..977db43 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -1090,7 +1090,8 @@ vect_peeling_hash_insert (loop_vec_info
loop_vinfo, struct data_reference *dr,
   *new_slot = slot;
 }

-  if (!supportable_dr_alignment && unlimited_cost_model ())
+  if (!supportable_dr_alignment
+      && unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
     slot->count += VECT_MAX_COST;
 }

@@ -1200,7 +1201,7 @@ vect_peeling_hash_choose_best_peeling
(loop_vec_info loop_vinfo,
res.peel_info.dr = NULL;
res.body_cost_vec = stmt_vector_for_cost ();

-   if (!unlimited_cost_model ())
+   if (!unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
  {
res.inside_cost = INT_MAX;
res.outside_cost = INT_MAX;
@@ -1429,7 +1430,7 @@ vect_enhance_data_refs_alignment (loop_vec_info
loop_vinfo)
  vectorization factor.
  We do this automtically for cost model, since we
calculate cost
  for every peeling option.  */
-  if 

Re: Enable -fno-fat-lto-objects by default

2013-11-20 Thread Jan Hubicka
Hi,
this is the version I committed - we need to introduce a var for
-fuse-linker-plugin to be able to check it.

I apologize for the breakage.
Honza

* opts.c (finish_options): Imply -ffat-lto-objects with 
-fno-use-linker-plugin.
* common.opt (fuse-linker-plugin): Add var.
Index: opts.c
===================================================================
--- opts.c  (revision 205108)
+++ opts.c  (working copy)
@@ -809,10 +809,13 @@ finish_options (struct gcc_options *opts
 #else
   error_at (loc, "LTO support has not been enabled in this configuration");
 #endif
-  if (!opts->x_flag_fat_lto_objects && !HAVE_LTO_PLUGIN)
+  if (!opts->x_flag_fat_lto_objects
+      && (!HAVE_LTO_PLUGIN
+	  || (opts_set->x_flag_use_linker_plugin
+	      && !opts->x_flag_use_linker_plugin)))
     {
       if (opts_set->x_flag_fat_lto_objects)
-	error_at (loc, "-fno-fat-lto-objects are supported only with linker plugin.");
+	error_at (loc, "-fno-fat-lto-objects are supported only with linker plugin");
       opts->x_flag_fat_lto_objects = 1;
     }
 }
Index: common.opt
===================================================================
--- common.opt  (revision 205108)
+++ common.opt  (working copy)
@@ -2247,7 +2247,7 @@ Common Negative(fuse-ld=bfd)
 Use the gold linker instead of the default linker
 
 fuse-linker-plugin
-Common Undocumented
+Common Undocumented Var(flag_use_linker_plugin)
 
 ; Positive if we should track variables, negative if we should run
 ; the var-tracking pass only to discard debug annotations, zero if


Re: Factor unrelated declarations out of tree.h (1/2)

2013-11-20 Thread H.J. Lu
On Thu, Nov 14, 2013 at 12:28 PM, Diego Novillo dnovi...@google.com wrote:
 This patch applies the rule that functions defined in FOO.c must be
 declared in FOO.h. One of the worst offenders in the code base is
 tree.h, unsurprisingly.

 The first patch contains the actual moves from tree.h into the
 various .h files.  The second patch contains the mechanical
 side-effects of the first one.

 This patch creates several new headers: attribs.h calls.h
 fold-const.h gcc-symtab.h print-rtl.h print-tree.h stmt.h
 stor-layout.h stringpool.h tree-nested.h tree-object-size.h
 varasm.h.

 Functions in each corresponding .c file got moved to those
 headers and others that already existed. I wanted to make this
 patch as mechanical as possible, so I made no attempt to fix
 problems like having build_addr defined in tree-inline.c. I left
 that for later.

 There were some declarations that I could not move out of tree.h
 because of header poisoning. We forbid the inclusion of things
 like expr.h from FE files. While that's a reasonable idea, the FE
 files *still* manage to get at expr.c functionality because the
 declarations they want to use were defined in tree.h.

 If that functionality is allowed to be accessed from the FEs,
 then I will later move those functions out of expr.c into tree.c.
 I have moved these declarations to the bottom of tree.h so they
 are easy to identify later.

 There is a namespace collision with libcpp. The file gcc/symtab.c
 cannot use gcc/symtab.h because the #include command picks up
 libcpp/include/symtab.h first. So I named this file gcc-symtab.h
 for now.

 This patch should offer some minimal incremental build advantages
 by reducing the size of tree.h. Changes that would otherwise have
 affected tree.h will now go to other headers which are less
 frequently included.

 The cleanup is not complete, but this is already bulky:

 - There are some declarations that are still in tree.h that
   should be moved. Andrew is also modifying those files, so I
   chose to leave them in tree.h for now.

 - Some header files always need another header file. I chose to
   #include that header in the file. At this stage we want to do
   the opposite, but this would've added even more bulk to the
   change, so I left a FIXME marker for the next pass.

 I've tested these changes with all the languages and all the
 targets in contrib/config-list.mk. However, I would not be
 surprised if something slipped. Please CC me on any fallout, so I
 can fix it.

 I'm doing a final round of testing and will commit once
 everything finishes.


 2013-11-14  Diego Novillo  dnovi...@google.com

 * tree.h: Include fold-const.h.
 (aggregate_value_p): Moved to function.h.
 (alloca_call_p): Moved to calls.h.
 (allocate_struct_function): Moved to function.h.
 (apply_tm_attr): Moved to attribs.h.
 (array_at_struct_end_p): Moved to expr.h.
 (array_ref_element_size): Moved to tree-dfa.h.
 (array_ref_low_bound): Moved to tree-dfa.h.
 (array_ref_up_bound): Moved to tree.h.
 (assemble_alias): Moved to cgraph.h.
 (avoid_folding_inline_builtin): Moved to builtins.h.
 (bit_from_pos): Moved to stor-layout.h.
 (build_addr): Moved to tree-nested.h.
 (build_call_expr): Moved to builtins.h.
 (build_call_expr_loc): Moved to builtins.h.
 (build_call_expr_loc_array): Moved to builtins.h.
 (build_call_expr_loc_vec): Moved to builtins.h.
 (build_duplicate_type): Moved to tree-inline.h.
 (build_fold_addr_expr): Moved to fold-const.h.
 (build_fold_addr_expr_with_type): Moved to fold-const.h.
 (build_fold_addr_expr_with_type_loc): Moved to fold-const.h.
 (build_fold_indirect_ref): Moved to fold-const.h.
 (build_fold_indirect_ref_loc): Moved to fold-const.h.
 (build_personality_function): Moved to tree.h.
 (build_range_check): Moved to fold-const.h.
 (build_simple_mem_ref): Moved to fold-const.h.
 (build_simple_mem_ref_loc): Moved to fold-const.h.
 (build_string_literal): Moved to builtins.h.
 (build_tm_abort_call): Moved to trans-mem.h.
 (builtin_mathfn_code): Moved to builtins.h.
 (builtin_memset_read_str): Moved to builtins.h.
 (byte_from_pos): Moved to stor-layout.h.
 (c_strlen): Moved to builtins.h.
 (call_expr_flags): Moved to calls.h.
 (can_move_by_pieces): Moved to expr.h.
 (categorize_ctor_elements): Moved to expr.h.
 (change_decl_assembler_name): Moved to gcc-symtab.h.
 (combine_comparisons): Moved to fold-const.h.
 (complete_ctor_at_level_p): Moved to tree.h.
 (component_ref_field_offset): Moved to tree-dfa.h.
 (compute_builtin_object_size): Moved to tree-object-size.h.
 (compute_record_mode): Moved to stor-layout.h.
 (constant_boolean_node): Moved to fold-const.h.
 

[patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Andrew MacLeod

This has been bugging me since I moved it out of gimple.h to gimplify.h.

The gimplify context structure was exposed in the header file to allow a 
few other files to push and pop contexts off the gimplification stack. 
Unfortunately, the struct contains a hash-table template, which means it 
also requires hash-table.h in order to even parse the header.  No file 
other than gimplify.c actually uses what is in the structure, so 
exposing it seems wrong.


This patch primarily changes push_gimplify_context () and 
pop_gimplify_context () to grab a struct gimplify_ctx off a local stack 
pool rather than have each file's local function know about the 
structure and allocate it on their stack.  I didn't want to use 
malloc/free or there would be a lot of churn.  It's open to debate what 
the best approach is, but I decided to simply allocate a bunch in a 
static array local to gimplify.c.  If it goes past the end of the local 
array, malloc the required structure.  It's pretty simplistic, but 
functional and doesn't require any GTY stuff or other machinery.


I also hacked up the compiler to report what the 'top' of the stack was 
for a compilation unit. I then ran it through a bootstrap and full 
testsuite run of all languages, and looked for the maximum number of 
context structs in use at one time.  Normally only 1 or 2 were ever in 
use, but it topped out at 7 during some of the openmp tests.  So my 
current limit of 30 will likely never be reached.


Only two fields were ever set by anyone outside of gimplify.c: the 
'into_ssa' and 'allow_rhs_cond_expr' fields, and they were always set
immediately after pushing the context onto the stack... So I added them 
as parameters to the push and gave them default values.


And that's it... this patch moves the hash function and structure 
completely within gimplify.c so no one needs to know anything about it, 
and removes the hash-table.h include prerequisite for parsing.


Bootstraps on x86_64-unknown-linux-gnu with no new regressions. OK for 
mainline?


Andrew




	* gimplify.h (gimplify_hasher : typed_free_remove, struct gimplify_ctx):
	Move to gimplify.c.
	* gimplify.c (gimplify_hasher:typed_free_remove): Relocate here.
	(struct gimplify_ctx): Relocate here and add malloced field.
	(gimplify_ctxp): Make static.
	(ctx_alloc, ctx_free): Manage a pool of struct gimplify_ctx.
	(push_gimplify_context)): Add default parameters and allocate a 
	struct from the pool.
	(pop_gimplify_context): Free a struct back to the pool.
	(gimplify_scan_omp_clauses, gimplify_omp_parallel, gimplify_omp_task,
	gimplify_omp_workshare, gimplify_transaction, gimplify_body): Don't
	use a local struct gimplify_ctx.
	* gimplify-me.c (force_gimple_operand_1, gimple_regimplify_operands):
	Likewise.
	* omp-low.c (lower_omp_sections, lower_omp_single, lower_omp_master,
	lower_omp_ordered, lower_omp_critical, lower_omp_for,
	create_task_copyfn, lower_omp_taskreg, lower_omp_target,
	lower_omp_teams, execute_lower_omp): Likewise.
	* gimple-fold.c (gimplify_and_update_call_from_tree): Likewise.
	* tree-inline.c (optimize_inline_calls): Likewise.

Index: gimplify.h
===
*** gimplify.h	(revision 205035)
--- gimplify.h	(working copy)
*** enum gimplify_status {
*** 48,86 
GS_OK		= 0,	/* We did something, maybe more to do.  */
GS_ALL_DONE	= 1	/* The expression is fully gimplified.  */
  };
- /* Gimplify hashtable helper.  */
  
! struct gimplify_hasher : typed_free_remove <elt_t>
! {
!   typedef elt_t value_type;
!   typedef elt_t compare_type;
!   static inline hashval_t hash (const value_type *);
!   static inline bool equal (const value_type *, const compare_type *);
! };
! 
! struct gimplify_ctx
! {
!   struct gimplify_ctx *prev_context;
! 
!   vec<gimple> bind_expr_stack;
!   tree temps;
!   gimple_seq conditional_cleanups;
!   tree exit_label;
!   tree return_temp;
! 
!   vec<tree> case_labels;
!   /* The formal temporary table.  Should this be persistent?  */
!   hash_table <gimplify_hasher> temp_htab;
! 
!   int conditions;
!   bool save_stack;
!   bool into_ssa;
!   bool allow_rhs_cond_expr;
!   bool in_cleanup_point_expr;
! };
! 
! extern struct gimplify_ctx *gimplify_ctxp;
! extern void push_gimplify_context (struct gimplify_ctx *);
  extern void pop_gimplify_context (gimple);
  extern gimple gimple_current_bind_expr (void);
  extern vec<gimple> gimple_bind_expr_stack (void);
--- 48,56 
GS_OK		= 0,	/* We did something, maybe more to do.  */
GS_ALL_DONE	= 1	/* The expression is fully gimplified.  */
  };
  
! extern void push_gimplify_context (bool in_ssa = false,
!    bool rhs_cond_ok = false);
  extern void pop_gimplify_context (gimple);
  extern gimple gimple_current_bind_expr (void);
  extern vec<gimple> gimple_bind_expr_stack (void);
Index: gimplify.c
===
*** gimplify.c	(revision 205035)
--- gimplify.c	(working copy)
*** 

[wide-int, committed] Remove remaining wide_int (x)s

2013-11-20 Thread Richard Sandiford
One simple switch to wi::add and one simplification that became possible
after Kenny's patch to remove the other real_from_integer.

Tested on x86_64-linux-gnu and applied as obvious (I hope).

Thanks,
Richard


Index: gcc/objc/objc-act.c
===
--- gcc/objc/objc-act.c 2013-11-20 12:04:28.937491368 +
+++ gcc/objc/objc-act.c 2013-11-20 13:35:05.663472662 +
@@ -4885,7 +4885,8 @@ objc_decl_method_attributes (tree *node,
  number = TREE_VALUE (second_argument);
   if (number && TREE_CODE (number) == INTEGER_CST)
TREE_VALUE (second_argument)
- = wide_int_to_tree (TREE_TYPE (number), wide_int (number) + 2);
+ = wide_int_to_tree (TREE_TYPE (number),
+ wi::add (number, 2));
 
  /* This is the third argument, the first-to-check,
 which specifies the index of the first argument to
@@ -4895,7 +4896,8 @@ objc_decl_method_attributes (tree *node,
  number = TREE_VALUE (third_argument);
   if (number && TREE_CODE (number) == INTEGER_CST)
TREE_VALUE (third_argument)
- = wide_int_to_tree (TREE_TYPE (number), wide_int (number) + 2);
+ = wide_int_to_tree (TREE_TYPE (number),
+ wi::add (number, 2));
}
  filtered_attributes = chainon (filtered_attributes,
 new_attribute);
@@ -4930,7 +4932,8 @@ objc_decl_method_attributes (tree *node,
   if (number && TREE_CODE (number) == INTEGER_CST
       && !wi::eq_p (number, 0))
TREE_VALUE (argument)
- = wide_int_to_tree (TREE_TYPE (number), wide_int (number) + 2);
+ = wide_int_to_tree (TREE_TYPE (number),
+ wi::add (number, 2));
  argument = TREE_CHAIN (argument);
}
 
Index: gcc/tree.c
===
--- gcc/tree.c  2013-11-20 12:55:04.109302505 +
+++ gcc/tree.c  2013-11-20 13:35:05.677472558 +
@@ -1711,8 +1711,8 @@ real_value_from_int_cst (const_tree type
  bitwise comparisons to see if two values are the same.  */
   memset (d, 0, sizeof d);
 
-  real_from_integer (d, type ? TYPE_MODE (type) : VOIDmode,
-wide_int (i), TYPE_SIGN (TREE_TYPE (i)));
+  real_from_integer (d, type ? TYPE_MODE (type) : VOIDmode, i,
+TYPE_SIGN (TREE_TYPE (i)));
   return d;
 }
 



Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Andrew MacLeod

On 11/20/2013 08:35 AM, Andrew MacLeod wrote:



Bootstraps on x86_64-unknown-linux-gnu with no new regressions. OK for 
mainline?


Andrew


And I've already fixed the typo in the changelog :-)  I had it open in a 
window but hadn't saved it when I created the patch.


Andrew

* gimplify.h (gimplify_hasher : typed_free_remove, struct gimplify_ctx):
Move to gimplify.c.
* gimplify.c (gimplify_hasher:typed_free_remove): Relocate here.
(struct gimplify_ctx): Relocate here and add 'malloced' field.
(gimplify_ctxp): Make static.
(ctx_alloc, ctx_free): Manage a pool of struct gimplify_ctx.
(push_gimplify_context): Add default parameters and allocate a 
struct from the pool.
(pop_gimplify_context): Free a struct back to the pool.
(gimplify_scan_omp_clauses, gimplify_omp_parallel, gimplify_omp_task,
gimplify_omp_workshare, gimplify_transaction, gimplify_body): Don't
use a local 'struct gimplify_ctx'.
* gimplify-me.c (force_gimple_operand_1, gimple_regimplify_operands):
Likewise.
* omp-low.c (lower_omp_sections, lower_omp_single, lower_omp_master,
lower_omp_ordered, lower_omp_critical, lower_omp_for,
create_task_copyfn, lower_omp_taskreg, lower_omp_target,
lower_omp_teams, execute_lower_omp): Likewise.
* gimple-fold.c (gimplify_and_update_call_from_tree): Likewise.
* tree-inline.c (optimize_inline_calls): Likewise.



Re: Add slim-lto support to gcc's build machinery

2013-11-20 Thread Paolo Bonzini
Il 20/11/2013 08:23, Markus Trippelsdorf ha scritto:
 Hi,
 
 now that slim-lto objects are enabled by default, it would be nice to
 use them when building gcc with bootstrap-lto. The following patch
 implements the handling of these objects in gcc's build machinery.
 (Once -fuse-linker-plugin is made the default, -ffat-lto-objects in
 config/bootstrap-lto.mk could be dropped.)
 
 LTO-Bootstrapped on x86_64-linux (with default -fuse-linker-plugin).
 The patch was already approved by Paolo Bonzini.
 
 Please apply, because I don't have access.

Note that you need to regenerate all users of libtool.m4.  Please post a
patch _with_ the regeneration so that whoever applies it won't screw up.

Paolo

 Thanks.
 
 2013-11-20  Markus Trippelsdorf  mar...@trippelsdorf.de
 
   * libtool.m4 : Handle slim-lto objects.
   * ltmain.sh: Handle lto options.
 
 diff --git a/libtool.m4 b/libtool.m4
 index 797468f02a5a..c55b6eba7a94 100644
 --- a/libtool.m4
 +++ b/libtool.m4
 @@ -3440,6 +3440,7 @@ for ac_symprfx in  _; do
else
  lt_cv_sys_global_symbol_pipe=sed -n -e 's/^.*[[  
 ]]\($symcode$symcode*\)[[   ]][[
 ]]*$ac_symprfx$sympat$opt_cr$/$symxfrm/p'
fi
 +  lt_cv_sys_global_symbol_pipe="$lt_cv_sys_global_symbol_pipe | sed '/ 
 __gnu_lto/d'"
  
# Check to see that the pipe works correctly.
pipe_works=no
 @@ -4459,7 +4460,7 @@ _LT_EOF
 if $LD --help 2>&1 | $EGREP ': supported targets:.* elf' > /dev/null \
  && test $tmp_diet = no
then
 - tmp_addflag=
 + tmp_addflag=' $pic_flag'
   tmp_sharedflag='-shared'
   case $cc_basename,$host_cpu in
  pgcc*)   # Portland Group C compiler
 @@ -5525,8 +5526,8 @@ if test $_lt_caught_CXX_error != yes; then
# Check if GNU C++ uses GNU ld as the underlying linker, since the
# archiving commands below assume that GNU ld is being used.
if test $with_gnu_ld = yes; then
 -_LT_TAGVAR(archive_cmds, $1)='$CC -shared -nostdlib $predep_objects 
 $libobjs $deplibs $postdep_objects $compiler_flags ${wl}-soname $wl$soname -o 
 $lib'
 -_LT_TAGVAR(archive_expsym_cmds, $1)='$CC -shared -nostdlib 
 $predep_objects $libobjs $deplibs $postdep_objects $compiler_flags 
 ${wl}-soname $wl$soname ${wl}-retain-symbols-file $wl$export_symbols -o $lib'
 +_LT_TAGVAR(archive_cmds, $1)='$CC $pic_flag -shared -nostdlib 
 $predep_objects $libobjs $deplibs $postdep_objects $compiler_flags 
 ${wl}-soname $wl$soname -o $lib'
 +_LT_TAGVAR(archive_expsym_cmds, $1)='$CC $pic_flag -shared -nostdlib 
 $predep_objects $libobjs $deplibs $postdep_objects $compiler_flags 
 ${wl}-soname $wl$soname ${wl}-retain-symbols-file $wl$export_symbols -o $lib'
  
  _LT_TAGVAR(hardcode_libdir_flag_spec, $1)='${wl}-rpath ${wl}$libdir'
  _LT_TAGVAR(export_dynamic_flag_spec, $1)='${wl}--export-dynamic'
 @@ -6503,6 +6504,13 @@ public class foo {
  };
  _LT_EOF
  ])
 +
 +_lt_libdeps_save_CFLAGS=$CFLAGS
 +case "$CC $CFLAGS " in #(
 +*\ -flto*\ *) CFLAGS="$CFLAGS -fno-lto" ;;
 +*\ -fwhopr*\ *) CFLAGS="$CFLAGS -fno-whopr" ;;
 +esac
 +
  dnl Parse the compiler output and extract the necessary
  dnl objects, libraries and library flags.
  if AC_TRY_EVAL(ac_compile); then
 @@ -6551,6 +6559,7 @@ if AC_TRY_EVAL(ac_compile); then
 fi
 ;;
  
 +*.lto.$objext) ;; # Ignore GCC LTO objects
  *.$objext)
 # This assumes that the test object file only shows up
 # once in the compiler output.
 @@ -6586,6 +6595,7 @@ else
  fi
  
  $RM -f confest.$objext
 +CFLAGS=$_lt_libdeps_save_CFLAGS
  
  # PORTME: override above test on systems where it is broken
  m4_if([$1], [CXX],
 diff --git a/ltmain.sh b/ltmain.sh
 index a03433f17894..2e0910194f24 100644
 --- a/ltmain.sh
 +++ b/ltmain.sh
 @@ -4980,7 +4980,8 @@ func_mode_link ()
# @file GCC response files
# -tp=* Portland pgcc target processor selection
-64|-mips[0-9]|-r[0-9][0-9]*|-xarch=*|-xtarget=*|+DA*|+DD*|-q*|-m*| \
 -  -t[45]*|-txscale*|-p|-pg|--coverage|-fprofile-*|-F*|@*|-tp=*)
 +  -t[45]*|-txscale*|-p|-pg|--coverage|-fprofile-*|-F*|@*|-tp=*| \
 +  -O*|-flto*|-fwhopr|-fuse-linker-plugin)
  func_quote_for_eval $arg
   arg=$func_quote_for_eval_result
  func_append compile_command  $arg
 



[v3 patch] add missing 'template' keyword

2013-11-20 Thread Jonathan Wakely
2013-11-20  Jonathan Wakely  jwakely@gmail.com

PR c++/59173
* include/ext/pointer.h (pointer_traits::rebind): Add template
keyword in nested name.

Tested x86_64-linux, committed to trunk.
commit 7359ca380f9e4ec5d1193e1efcc3aec5af95ec93
Author: Jonathan Wakely jwakely@gmail.com
Date:   Wed Nov 20 13:19:28 2013 +

PR c++/59173
* include/ext/pointer.h (pointer_traits::rebind): Add template
keyword in nested name.

diff --git a/libstdc++-v3/include/ext/pointer.h 
b/libstdc++-v3/include/ext/pointer.h
index 12bc749..d1730be 100644
--- a/libstdc++-v3/include/ext/pointer.h
+++ b/libstdc++-v3/include/ext/pointer.h
@@ -580,7 +580,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   template<typename _Up>
     using rebind = typename __gnu_cxx::_Pointer_adapter<
-   typename pointer_traits<_Storage_policy>::rebind<_Up>>;
+   typename pointer_traits<_Storage_policy>::template rebind<_Up>>;
 
   static pointer pointer_to(typename pointer::reference __r) noexcept
   { return pointer(std::addressof(__r)); }


Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Jakub Jelinek
On Wed, Nov 20, 2013 at 08:35:28AM -0500, Andrew MacLeod wrote:
 I also hacked up the compiler to report what the 'top' of the stack
 was for a compilation unit. I then ran it through a bootstrap and
 full testsuite run of all languages, and looked for the maximum
 number of context structs in use at one time.  Normally only 1 or 2
 were ever in use, but it topped out at 7 during some of the openmp
 tests.  So my current limit of 30 will likely never be reached.

???  You can trivially create a testcase where it will be reached
(and you certainly should, so that you actually test the fallback).

Say:

/* { dg-do compile } */
/* { dg-options "-fopenmp" } */
#define P1 _Pragma ("omp parallel")
#define P10 P1 P1 P1 P1 P1 P1 P1 P1 P1 P1
#define P100 P10 P10 P10 P10 P10 P10 P10 P10 P10 P10
void bar (void);
void
foo (void)
{
  P100
bar ();
}

Jakub


Re: Factor unrelated declarations out of tree.h (1/2)

2013-11-20 Thread Diego Novillo
On Wed, Nov 20, 2013 at 6:44 AM, Eric Botcazou ebotca...@adacore.com wrote:
 Thanks. Committed as rev 205023.

 Nice work, but why did you antedate the entries in the various ChangeLog

Oh, that's because of local commits and holding on to the patch for a
few days. That date is the date of the original local commit. Further
pulls from upstream pushed the entries down in the file. I just forgot
to move the entries forward when I committed the patch.

 directories (yes, we all know your opinion about ChangeLog files ;-)

What gave you that idea? ;)


Diego.


Re: [PATCH, MPX, 2/X] Pointers Checker [8/25] Languages support

2013-11-20 Thread Ilya Enkovich
On 20 Nov 10:59, Richard Biener wrote:
 On Tue, Nov 19, 2013 at 9:29 PM, Ilya Enkovich enkovich@gmail.com wrote:
  On 19 Nov 12:33, Jeff Law wrote:
  On 11/19/13 05:13, Ilya Enkovich wrote:
  On 19 Nov 13:00, Richard Biener wrote:
  I'd say not in the gimplifier either but in varpool (symbol table) code
  where the symbols are ultimatively registered with?
  
  Something like that?
  
  --- a/gcc/varpool.c
  +++ b/gcc/varpool.c
  @@ -151,6 +151,10 @@ varpool_node_for_decl (tree decl)
  node = varpool_create_empty_node ();
  node-decl = decl;
  symtab_register_node (node);
  +
  +  if (DECL_INITIAL (decl))
  +chkp_register_var_initializer (decl);
  +
  return node;
}
  Yea, I think that's what Richi is suggesting.
  jeff
 
 
  Great!  Here is a full version of the patch.  Bootstrap, make check and MPX 
  tests are OK with the change.
 
 Please don't do this in varpool_node_for_decl but in
 varpool_finalize_decl.
 
 Richard.
 

Here it is.

Thanks,
Ilya
--

gcc/

2013-11-20  Ilya Enkovich  ilya.enkov...@intel.com

* cgraphunit.c: Include tree-chkp.h.
(varpool_finalize_decl): Register statically
initialized decls in Pointer Bounds Checker.


diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index 8ab274b..6b6ec55 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -201,6 +201,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "regset.h" /* FIXME: For reg_obstack.  */
 #include "context.h"
 #include "pass_manager.h"
+#include "tree-chkp.h"
 
 /* Queue of cgraph nodes scheduled to be added into cgraph.  This is a
secondary queue used during optimization to accommodate passes that
@@ -830,6 +831,9 @@ varpool_finalize_decl (tree decl)
  finished.  */
   if (cgraph_state == CGRAPH_STATE_FINISHED)
 varpool_assemble_decl (node);
+
+  if (DECL_INITIAL (decl))
+chkp_register_var_initializer (decl);
 }
 
 /* EDGE is an polymorphic call.  Mark all possible targets as reachable


Re: Factor unrelated declarations out of tree.h (1/2)

2013-11-20 Thread Diego Novillo
On Wed, Nov 20, 2013 at 8:32 AM, H.J. Lu hjl.to...@gmail.com wrote:

 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59212

Thanks.  Fixed.


PR 59212
* g++.dg/plugin/selfassign.c: Include stringpool.h

diff --git a/gcc/testsuite/g++.dg/plugin/selfassign.c
b/gcc/testsuite/g++.dg/plugin/selfassign.c
index 2498153..cdab74a 100644
--- a/gcc/testsuite/g++.dg/plugin/selfassign.c
+++ b/gcc/testsuite/g++.dg/plugin/selfassign.c
@@ -8,6 +8,7 @@
 #include "coretypes.h"
 #include "tm.h"
 #include "tree.h"
+#include "stringpool.h"
 #include "toplev.h"
 #include "basic-block.h"
 #include "gimple.h"


Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Andrew MacLeod

On 11/20/2013 08:44 AM, Jakub Jelinek wrote:

On Wed, Nov 20, 2013 at 08:35:28AM -0500, Andrew MacLeod wrote:

I also hacked up the compiler to report what the 'top' of the stack
was for a compilation unit. I then ran it through a bootstrap and
full testsuite run of all languages, and looked for the maximum
number of context structs in use at one time.  Normally only 1 or 2
were ever in use, but it topped out at 7 during some of the openmp
tests.  So my current limit of 30 will likely never be reached.

???  You can trivially create a testcase where it will be reached
(and you certainly should, so that you actually test the fallback).


Not something we're likely to see every day, but excellent!  A testcase 
is good :-) I'll add this one to the suite.


Do you think the current limit of 30 is reasonable?

Andrew


Re: [patch] regcprop fix for PR rtl-optimization/54300

2013-11-20 Thread Richard Earnshaw
On 19/11/13 17:48, Jeff Law wrote:
 On 11/19/13 10:32, Steven Bosscher wrote:

 Yes. In the GCC3 days it was important for sincos on i386, and on m68k
 it used to be important for some of the funnier patterns. Not sure if
 it's still useful today, though. Might be worth looking into, just to
 avoid the confusion in the future.
 I doubt it's changed all that much :-)
 

 There's been confusion about this before, where people assumed
 single_set really means just one SET in this pattern. (ISTR fixing
 gcse.c's hash_scan_rtx for this at some point...?). But that's not the
 semantics of single_set.
 Yes.  And I'd expect confusion to continue :(  Not sure if creating a
 renaming to capture the actual semantics would help here.
 

 The proper test for just one SET is (!multiple_sets && single_set).
 At least, that's how I've always coded it...
 Seems reasonable for those cases where you have to ensure there really 
 is just one set.
 
 
 jeff
 

Provided we correctly note the other values that are killed, we can
handle multiple sets safely.  The one restriction we have to watch is
where the dead set operations kill input values to the live set operation.

I've committed my patch to trunk.

I'll leave it to gestate a couple of days, but this is also needed on
the active release branches as well.

R.



[wide-int] Undo some differences with trunk

2013-11-20 Thread Richard Sandiford
I've committed a patch upstream to convert TREE_INT_CST_LOW to tree_to_[su]hwi
if there is an obvious tree_fits_[su]hwi_p guard.  There were some other
changes from TREE_INT_CST_LOW to tree_to_[su]hwi that weren't as obvious
and I think we should deal with them separately.  As before, these changes
aren't really related to the wide-int conversion, since the code that uses
them is operating on HWIs rather than double_ints.  The patch is still here,
so it's not like the work is lost.

There were also some changes from TREE_INT_CST_LOW (X) to
TREE_INT_CST_ELT (X, 0), which are no longer needed now that we define
TREE_INT_CST_LOW too.

There was one case where the branch switched a tree_fits_uhwi_p to
a tree_fits_shwi_p, but with a !!! about the original choice.
The two other instances of the !!! sequence kept the original tests,
so I think this might have been unintentional.  If not, then again
it should go on trunk first.

The patch also undoes a couple of other ordering and whitespace differences.

Tested on x86_64-linux-gnu.  OK for wide-int?

Thanks,
Richard


Index: gcc/c-family/cilk.c
===
--- gcc/c-family/cilk.c 2013-11-20 13:43:30.896150594 +
+++ gcc/c-family/cilk.c 2013-11-20 13:48:38.491410510 +
@@ -1208,7 +1208,7 @@ extract_free_variables (tree t, struct w
int ii = 0;
if (TREE_CODE (TREE_OPERAND (t, 0)) == INTEGER_CST)
  {
-   len = tree_to_uhwi (TREE_OPERAND (t, 0));
+   len = TREE_INT_CST_LOW (TREE_OPERAND (t, 0));
 
for (ii = 0; ii  len; ii++)
  extract_free_variables (TREE_OPERAND (t, ii), wd, ADD_READ);
Index: gcc/c/Make-lang.in
===
--- gcc/c/Make-lang.in  2013-11-20 13:43:30.896150594 +
+++ gcc/c/Make-lang.in  2013-11-20 13:48:38.474410604 +
@@ -137,3 +137,4 @@ c.stageprofile: stageprofile-start
-mv c/*$(objext) stageprofile/c
 c.stagefeedback: stagefeedback-start
-mv c/*$(objext) stagefeedback/c
+
Index: gcc/config/avr/avr.c
===
--- gcc/config/avr/avr.c2013-11-20 13:43:30.896150594 +
+++ gcc/config/avr/avr.c2013-11-20 13:48:38.523410332 +
@@ -12233,7 +12233,7 @@ avr_fold_builtin (tree fndecl, int n_arg
   }
 
 tmap = wide_int_to_tree (map_type, arg[0]);
-map = tree_to_uhwi (tmap);
+map = TREE_INT_CST_LOW (tmap);
 
 if (TREE_CODE (tval) != INTEGER_CST
       && 0 == avr_map_metric (map, MAP_MASK_PREIMAGE_F))
Index: gcc/config/nds32/nds32.c
===
--- gcc/config/nds32/nds32.c2013-11-20 13:43:30.896150594 +
+++ gcc/config/nds32/nds32.c2013-11-20 13:48:38.492410505 +
@@ -1276,7 +1276,7 @@ nds32_construct_isr_vectors_information
  /* Pick up each vector id value.  */
  id = TREE_VALUE (id_list);
  /* Add vector_number_offset to get actual vector number.  */
- vector_id = tree_to_uhwi (id) + vector_number_offset;
+ vector_id = TREE_INT_CST_LOW (id) + vector_number_offset;
 
  /* Enable corresponding vector and set function name.  */
  nds32_isr_vectors[vector_id].category = (intr)
@@ -1318,7 +1318,7 @@ nds32_construct_isr_vectors_information
 
   /* The total vectors = interrupt + exception numbers + reset.
  There are 8 exception and 1 reset in nds32 architecture.  */
-  nds32_isr_vectors[0].total_n_vectors = tree_to_uhwi (id) + 8 + 1;
+  nds32_isr_vectors[0].total_n_vectors = TREE_INT_CST_LOW (id) + 8 + 1;
   strcpy (nds32_isr_vectors[0].func_name, func_name);
 
   /* Retrieve nmi and warm function.  */
Index: gcc/cp/parser.c
===
--- gcc/cp/parser.c 2013-11-20 13:43:30.896150594 +
+++ gcc/cp/parser.c 2013-11-20 13:48:38.503410443 +
@@ -3804,7 +3804,7 @@ make_string_pack (tree value)
   tree charvec;
   tree argpack = make_node (NONTYPE_ARGUMENT_PACK);
   const char *str = TREE_STRING_POINTER (value);
-  int sz = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (value))));
+  int sz = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (value))));
   int len = TREE_STRING_LENGTH (value) / sz - 1;
   tree argvec = make_tree_vec (2);
 
Index: gcc/cp/vtable-class-hierarchy.c
===
--- gcc/cp/vtable-class-hierarchy.c 2013-11-20 13:43:30.896150594 +
+++ gcc/cp/vtable-class-hierarchy.c 2013-11-20 13:48:38.504410438 +
@@ -453,7 +453,7 @@ check_and_record_registered_pairs (tree
 vptr_address = TREE_OPERAND (vptr_address, 0);
 
   if (TREE_OPERAND_LENGTH (vptr_address) > 1)
-offset = tree_to_uhwi (TREE_OPERAND (vptr_address, 1));
+offset = TREE_INT_CST_LOW (TREE_OPERAND (vptr_address, 1));
   else
 offset = 0;
 
@@ -876,7 

Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.

2013-11-20 Thread Richard Biener
On Wed, 20 Nov 2013, Sergey Ostanevich wrote:

 Thanks for comments, hope I got all of em.
 Note: I used a LOOP_VINFO_LOC (loop_vinfo) to print the loop location
 but it appears to be 0, so the output is somewhat lousy. The global
 vect_location points somewhere inside the loop, which is not much better.
 Shall we address this separately?

Use vect_location instead.

Note that c-family/ and fortran/ have their own ChangeLog file
and files there don't have a prefix.

Ok with the change to use vect_location.

Thanks,
Richard.

 Sergos
 
 * common.opt: Added new option -fsimd-cost-model.
 * tree-vectorizer.h (unlimited_cost_model): Interface update
 to rely on particular loop info.
 * tree-vect-data-refs.c (vect_peeling_hash_insert): Update to
 unlimited_cost_model call according to new interface.
 (vect_peeling_hash_choose_best_peeling): Ditto.
 (vect_enhance_data_refs_alignment): Ditto.
 * tree-vect-slp.c: Ditto.
 * tree-vect-loop.c (vect_estimate_min_profitable_iters): Ditto,
 plus issue a warning in case cost model overrides users' directive.
 * c-family/c.opt: add openmp-simd warning.
 * fortran/lang.opt: Ditto.
 * doc/invoke.texi: Added new openmp-simd warning.
 
 
 
 diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
 index 0026683..a85a8ad 100644
 --- a/gcc/c-family/c.opt
 +++ b/gcc/c-family/c.opt
 @@ -592,6 +592,10 @@ Wold-style-definition
  C ObjC Var(warn_old_style_definition) Warning
  Warn if an old-style parameter definition is used
 
 +Wopenmp-simd
 +C C++ Var(openmp_simd) Warning EnabledBy(Wall)
 +Warn about simd directive is overridden by vectorizer cost model
 +
  Woverlength-strings
  C ObjC C++ ObjC++ Var(warn_overlength_strings) Warning
 LangEnabledBy(C ObjC C++ ObjC++,Wpedantic)
  Warn if a string is longer than the maximum portable length specified
 by the standard
 diff --git a/gcc/common.opt b/gcc/common.opt
 index d5971df..2b0e9e6 100644
 --- a/gcc/common.opt
 +++ b/gcc/common.opt
 @@ -2296,6 +2296,10 @@ fvect-cost-model=
  Common Joined RejectNegative Enum(vect_cost_model)
 Var(flag_vect_cost_model) Init(VECT_COST_MODEL_DEFAULT)
  Specifies the cost model for vectorization
 
 +fsimd-cost-model=
 +Common Joined RejectNegative Enum(vect_cost_model)
 Var(flag_simd_cost_model) Init(VECT_COST_MODEL_UNLIMITED)
 +Specifies the vectorization cost model for code marked with simd directive
 +
  Enum
  Name(vect_cost_model) Type(enum vect_cost_model) UnknownError(unknown
 vectorizer cost model %qs)
 
 diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
 index c250385..050bd44 100644
 --- a/gcc/doc/invoke.texi
 +++ b/gcc/doc/invoke.texi
 @@ -256,7 +256,7 @@ Objective-C and Objective-C++ Dialects}.
  -Wlogical-op -Wlong-long @gol
  -Wmain -Wmaybe-uninitialized -Wmissing-braces
 -Wmissing-field-initializers @gol
  -Wmissing-include-dirs @gol
 --Wno-multichar  -Wnonnull  -Wno-overflow @gol
 +-Wno-multichar  -Wnonnull  -Wno-overflow -Wopenmp-simd @gol
  -Woverlength-strings  -Wpacked  -Wpacked-bitfield-compat  -Wpadded @gol
  -Wparentheses  -Wpedantic-ms-format -Wno-pedantic-ms-format @gol
  -Wpointer-arith  -Wno-pointer-to-int-cast @gol
 @@ -3318,6 +3318,7 @@ Options} and @ref{Objective-C and Objective-C++
 Dialect Options}.
  -Wmaybe-uninitialized @gol
  -Wmissing-braces @r{(only for C/ObjC)} @gol
  -Wnonnull  @gol
 +-Wopenmp-simd @gol
  -Wparentheses  @gol
  -Wpointer-sign  @gol
  -Wreorder   @gol
 @@ -4804,6 +4805,10 @@ attribute.
  @opindex Woverflow
  Do not warn about compile-time overflow in constant expressions.
 
 +@item -Wopenmp-simd
 +@opindex Wopenmp-simd
 +Warn if vectorizer cost model overrides simd directive from user.
 +
  @item -Woverride-init @r{(C and Objective-C only)}
  @opindex Woverride-init
  @opindex Wno-override-init
 diff --git a/gcc/fortran/lang.opt b/gcc/fortran/lang.opt
 index 5e09cbd..b43c48c 100644
 --- a/gcc/fortran/lang.opt
 +++ b/gcc/fortran/lang.opt
 @@ -257,6 +257,10 @@ Wintrinsics-std
  Fortran Warning
  Warn on intrinsics not part of the selected standard
 
 +Wopenmp-simd
 +Fortran Warning
 +; Documented in C
 +
  Wreal-q-constant
  Fortran Warning
  Warn about real-literal-constants with 'q' exponent-letter
 diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
 index 83d1f45..977db43 100644
 --- a/gcc/tree-vect-data-refs.c
 +++ b/gcc/tree-vect-data-refs.c
 @@ -1090,7 +1090,8 @@ vect_peeling_hash_insert (loop_vec_info
 loop_vinfo, struct data_reference *dr,
*new_slot = slot;
  }
 
-  if (!supportable_dr_alignment && unlimited_cost_model ())
+  if (!supportable_dr_alignment
+      && unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
     slot->count += VECT_MAX_COST;
  }
 
 @@ -1200,7 +1201,7 @@ vect_peeling_hash_choose_best_peeling
 (loop_vec_info loop_vinfo,
 res.peel_info.dr = NULL;
 res.body_cost_vec = stmt_vector_for_cost ();
 
-   if (!unlimited_cost_model ())
+   if (!unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))

[wide-int] Tweak an rs6000 test

2013-11-20 Thread Richard Sandiford
This test changes from TREE_INT_CST_LOW to TREE_INT_CST_ELT.  I was going
to change it back as part of the previous patch, but using wi:: seemed
more robust.

Only compile-tested so far because of problems with gcc110.  OK for wide-int?

Richard


Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  2013-11-20 11:26:05.996351099 +
+++ gcc/config/rs6000/rs6000.c  2013-11-20 11:43:37.896818266 +
@@ -12131,16 +12131,14 @@ rs6000_expand_ternop_builtin (enum insn_
   /* Check whether the 2nd and 3rd arguments are integer constants and in
 range and prepare arguments.  */
   STRIP_NOPS (arg1);
-  if (TREE_CODE (arg1) != INTEGER_CST
- || !IN_RANGE (TREE_INT_CST_ELT (arg1, 0), 0, 1))
+  if (TREE_CODE (arg1) != INTEGER_CST || wi::geu_p (arg1, 2))
{
 	  error ("argument 2 must be 0 or 1");
  return const0_rtx;
}
 
   STRIP_NOPS (arg2);
-  if (TREE_CODE (arg2) != INTEGER_CST
- || !IN_RANGE (TREE_INT_CST_ELT (arg2, 0), 0, 15))
+  if (TREE_CODE (arg2) != INTEGER_CST || wi::geu_p (arg2, 16))
    {
  error ("argument 3 must be in the range 0..15");
  return const0_rtx;



Re: Enable -fno-fat-lto-objects by default

2013-11-20 Thread Richard Biener
On Wed, Nov 20, 2013 at 2:12 PM, Jan Hubicka hubi...@ucw.cz wrote:
 Hi,
 this is the version I committed - we need to introduce a var for -fuse-linker-plugin
 to be able to check it.

 I apologize for the breakage.
 Honza

 * opts.c (finish_options): Imply -ffat-lto-objects with 
 -fno-use-linker-plugin.
 * common.opt (fuse-linker-plugin): Add var.
 Index: opts.c

Hmm, but that won't help for

gcc -c t.c -flto
gcc t.o -flto -fno-use-linker-plugin

Isn't that the style how we invoke LTO tests?  (ok we put in useless
-fno-use-linker-plugin in compile stage, too)

So in the end it'll be too fragile for users IMHO.  I think we should
disallow -fno-use-linker-plugin (if we decided to enable it by default
then it won't fall back to no linker plugin but errors if we replace ld
with something too old or remove plugin, right?).

Richard.

===
 --- opts.c  (revision 205108)
 +++ opts.c  (working copy)
 @@ -809,10 +809,13 @@ finish_options (struct gcc_options *opts
  #else
 	error_at (loc, "LTO support has not been enabled in this "
 		  "configuration");
  #endif
 -  if (!opts->x_flag_fat_lto_objects && !HAVE_LTO_PLUGIN)
 +  if (!opts->x_flag_fat_lto_objects
 +      && (!HAVE_LTO_PLUGIN
 +          || (opts_set->x_flag_use_linker_plugin
 +              && !opts->x_flag_use_linker_plugin)))
 {
   if (opts_set->x_flag_fat_lto_objects)
 -    error_at (loc, "-fno-fat-lto-objects are supported only with "
 	      "linker plugin.");
 +    error_at (loc, "-fno-fat-lto-objects are supported only with "
 	      "linker plugin");
   opts->x_flag_fat_lto_objects = 1;
 }
  }
 Index: common.opt
 ===
 --- common.opt  (revision 205108)
 +++ common.opt  (working copy)
 @@ -2247,7 +2247,7 @@ Common Negative(fuse-ld=bfd)
  Use the gold linker instead of the default linker

  fuse-linker-plugin
 -Common Undocumented
 +Common Undocumented Var(flag_use_linker_plugin)

  ; Positive if we should track variables, negative if we should run
  ; the var-tracking pass only to discard debug annotations, zero if


Re: [wide-int] Undo some differences with trunk

2013-11-20 Thread Kenneth Zadeck
I would leave out the last frag of tree.c.  It actually gets rid of what
I consider a latent bug.  It does not show up with the current
implementation of wide-int, but if that implementation changed, it
would.  The problem is that you really should not expect that you can
set the min or max value for a type without first setting the precision
of that type.


Aside from that, this looks fine.

kenny

On 11/20/2013 08:58 AM, Richard Sandiford wrote:

I've committed a patch upstream to convert TREE_INT_CST_LOW to tree_to_[su]hwi
if there is an obvious tree_fits_[su]hwi_p guard.  There were some other
changes from TREE_INT_CST_LOW to tree_to_[su]hwi that weren't as obvious
and I think we should deal with them separately.  As before, these changes
aren't really related to the wide-int conversion, since the code that uses
them is operating on HWIs rather than double_ints.  The patch is still here,
so it's not like the work is lost.

There were also some changes from TREE_INT_CST_LOW (X) to
TREE_INT_CST_ELT (X, 0), which are no longer needed now that we define
TREE_INT_CST_LOW too.

There was one case where the branch switched a tree_fits_uhwi_p to
a tree_fits_shwi_p, but with a !!! about the original choice.
The two other instances of the !!! sequence kept the original tests,
so I think this might have been unintentional.  If not, then again
it should go on trunk first.

The patch also undoes a couple of other ordering and whitespace differences.

Tested on x86_64-linux-gnu.  OK for wide-int?

Thanks,
Richard


Index: gcc/c-family/cilk.c
===
--- gcc/c-family/cilk.c 2013-11-20 13:43:30.896150594 +
+++ gcc/c-family/cilk.c 2013-11-20 13:48:38.491410510 +
@@ -1208,7 +1208,7 @@ extract_free_variables (tree t, struct w
int ii = 0;
if (TREE_CODE (TREE_OPERAND (t, 0)) == INTEGER_CST)
  {
-   len = tree_to_uhwi (TREE_OPERAND (t, 0));
+   len = TREE_INT_CST_LOW (TREE_OPERAND (t, 0));
  
  	for (ii = 0; ii < len; ii++)

  extract_free_variables (TREE_OPERAND (t, ii), wd, ADD_READ);
Index: gcc/c/Make-lang.in
===
--- gcc/c/Make-lang.in  2013-11-20 13:43:30.896150594 +
+++ gcc/c/Make-lang.in  2013-11-20 13:48:38.474410604 +
@@ -137,3 +137,4 @@ c.stageprofile: stageprofile-start
-mv c/*$(objext) stageprofile/c
  c.stagefeedback: stagefeedback-start
-mv c/*$(objext) stagefeedback/c
+
Index: gcc/config/avr/avr.c
===
--- gcc/config/avr/avr.c2013-11-20 13:43:30.896150594 +
+++ gcc/config/avr/avr.c2013-11-20 13:48:38.523410332 +
@@ -12233,7 +12233,7 @@ avr_fold_builtin (tree fndecl, int n_arg
}
  
  tmap = wide_int_to_tree (map_type, arg[0]);

-map = tree_to_uhwi (tmap);
+map = TREE_INT_CST_LOW (tmap);
  
  if (TREE_CODE (tval) != INTEGER_CST

       && 0 == avr_map_metric (map, MAP_MASK_PREIMAGE_F))
Index: gcc/config/nds32/nds32.c
===
--- gcc/config/nds32/nds32.c2013-11-20 13:43:30.896150594 +
+++ gcc/config/nds32/nds32.c2013-11-20 13:48:38.492410505 +
@@ -1276,7 +1276,7 @@ nds32_construct_isr_vectors_information
  /* Pick up each vector id value.  */
  id = TREE_VALUE (id_list);
  /* Add vector_number_offset to get actual vector number.  */
- vector_id = tree_to_uhwi (id) + vector_number_offset;
+ vector_id = TREE_INT_CST_LOW (id) + vector_number_offset;
  
  	  /* Enable corresponding vector and set function name.  */

  nds32_isr_vectors[vector_id].category = (intr)
@@ -1318,7 +1318,7 @@ nds32_construct_isr_vectors_information
  
/* The total vectors = interrupt + exception numbers + reset.

   There are 8 exception and 1 reset in nds32 architecture.  */
-  nds32_isr_vectors[0].total_n_vectors = tree_to_uhwi (id) + 8 + 1;
+  nds32_isr_vectors[0].total_n_vectors = TREE_INT_CST_LOW (id) + 8 + 1;
strcpy (nds32_isr_vectors[0].func_name, func_name);
  
/* Retrieve nmi and warm function.  */

Index: gcc/cp/parser.c
===
--- gcc/cp/parser.c 2013-11-20 13:43:30.896150594 +
+++ gcc/cp/parser.c 2013-11-20 13:48:38.503410443 +
@@ -3804,7 +3804,7 @@ make_string_pack (tree value)
tree charvec;
tree argpack = make_node (NONTYPE_ARGUMENT_PACK);
const char *str = TREE_STRING_POINTER (value);
-  int sz = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (value))));
+  int sz = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (value))));
int len = TREE_STRING_LENGTH (value) / sz - 1;
tree argvec = make_tree_vec (2);
  
Index: gcc/cp/vtable-class-hierarchy.c

===

Re: [wide-int] Tweak an rs6000 test

2013-11-20 Thread Kenneth Zadeck

looks fine to me.

kenny

On 11/20/2013 09:00 AM, Richard Sandiford wrote:

This test changes from TREE_INT_CST_LOW to TREE_INT_CST_ELT.  I was going
to change it back as part of the previous patch, but using wi:: seemed
more robust.

Only compile-tested so far because of problems with gcc110.  OK for wide-int?

Richard


Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  2013-11-20 11:26:05.996351099 +
+++ gcc/config/rs6000/rs6000.c  2013-11-20 11:43:37.896818266 +
@@ -12131,16 +12131,14 @@ rs6000_expand_ternop_builtin (enum insn_
/* Check whether the 2nd and 3rd arguments are integer constants and in
 range and prepare arguments.  */
STRIP_NOPS (arg1);
-  if (TREE_CODE (arg1) != INTEGER_CST
- || !IN_RANGE (TREE_INT_CST_ELT (arg1, 0), 0, 1))
+  if (TREE_CODE (arg1) != INTEGER_CST || wi::geu_p (arg1, 2))
{
 	  error ("argument 2 must be 0 or 1");
  return const0_rtx;
}
  
STRIP_NOPS (arg2);

-  if (TREE_CODE (arg2) != INTEGER_CST
- || !IN_RANGE (TREE_INT_CST_ELT (arg2, 0), 0, 15))
+  if (TREE_CODE (arg2) != INTEGER_CST || wi::geu_p (arg2, 16))
{
  error ("argument 3 must be in the range 0..15");
  return const0_rtx;





Re: [PATCH, PR 57363] IBM long double: adding qNaN and number raises inexact exception

2013-11-20 Thread Adhemerval Zanella
Hi,

This is a reformatted patch after Ulrich review:

---

libgcc/ChangeLog:

2013-11-20  Adhemerval Zanella  azane...@linux.vnet.ibm.com

* config/rs6000/ibm-ldouble.c (__gcc_qadd): Fix add
of normal number and qNaN to not raise an inexact exception.

gcc/testsuite/ChangeLog:

2013-11-20  Adhemerval Zanella  azane...@linux.vnet.ibm.com

* gcc.target/powerpc/pr57363.c: New test.

--

diff --git a/gcc/testsuite/gcc.target/powerpc/pr57363.c 
b/gcc/testsuite/gcc.target/powerpc/pr57363.c
new file mode 100644
index 000..45ea3f3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr57363.c
@@ -0,0 +1,19 @@
+/* { dg-do run { target { powerpc*-*-linux* } } } */
+/* { dg-options "-mlong-double-128" } */
+
+/* Check that adding a qNaN and a normal long double does not generate an
+   inexact exception.  */
+
+#define _GNU_SOURCE
+#include <fenv.h>
+
+int main(void)
+{
+  double x = __builtin_nan ("");
+  long double y = 1.1L;
+
+  feenableexcept (FE_INEXACT);
+  feclearexcept (FE_ALL_EXCEPT);
+  x = x + y;
+  return fetestexcept(FE_INEXACT);
+}
diff --git a/libgcc/config/rs6000/ibm-ldouble.c 
b/libgcc/config/rs6000/ibm-ldouble.c
index 28e02e9..7ca900c 100644
--- a/libgcc/config/rs6000/ibm-ldouble.c
+++ b/libgcc/config/rs6000/ibm-ldouble.c
@@ -104,6 +104,8 @@ __gcc_qadd (double a, double aa, double c, double cc)
 
   if (nonfinite (z))
 {
+  if (fabs (z) != inf())
+   return z;
   z = cc + aa + c + a;
   if (nonfinite (z))
return z;



Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Richard Biener
On Wed, Nov 20, 2013 at 2:35 PM, Andrew MacLeod amacl...@redhat.com wrote:
 This has been bugging me since I moved it out of gimple.h to gimplify.h.

 The gimplify context structure was exposed in the header file to allow a few
 other files to push and pop contexts off the gimplification stack.
 Unfortunately, the struct contains a hash-table template, which means it
 also required hash-table.h in order to even parse the header.  No file other
 than gimplify.c actually uses what is in the structure, so exposing it seems
 wrong.

 This patch primarily changes push_gimplify_context () and
 pop_gimplify_context () to grab a struct gimplify_ctx off a local stack pool
 rather than have each file's local function know about the structure and
 allocate it on their stack.  I didn't want to use malloc/free or there would
 be a lot of churn.  It's open to debate what the best approach is, but I
 decided to simply allocate a bunch on a static array local to gimplify.c.
 If it goes past the end of the local array, malloc the required structure.
 It's pretty simplistic, but functional and doesn't require any GTY stuff or
 other machinery.

 I also hacked up the compiler to report what the 'top' of the stack was for
 a compilation unit. I then ran it through a bootstrap and full testsuite run
 of all languages, and looked for the maximum number of context structs in
 use at one time.  Normally only 1 or 2 were ever in use, but it topped out
 at 7 during some of the openmp tests.  So my current limit of 30 will likely
 never be reached.

 only 2 fields were ever set by anyone outside of gimplify.c, the 'into_ssa'
 and 'allow_rhs_cond_expr' fields, and they were always set
 immediately after pushing the context onto the stack... So I added them as
 parameters to the push and gave them default values.

 And that's it... this patch moves the hash function and structure completely
 within gimplify.c so no one needs to know anything about it, and removes the
 hash-table.h include prerequisite for parsing.

 Bootstraps on x86_64-unknown-linux-gnu with no new regressions. OK for
 mainline?

The limit looks reasonable, but you could have used a simple linked
list (and never free).  Also being able to pop a random context
looks fragile ...  that is, pop_gimplify_context shouldn't have an argument.

Richard.



 Andrew




Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.

2013-11-20 Thread Jakub Jelinek
On Wed, Nov 20, 2013 at 02:59:21PM +0100, Richard Biener wrote:
  --- a/gcc/c-family/c.opt
  +++ b/gcc/c-family/c.opt
  @@ -592,6 +592,10 @@ Wold-style-definition
   C ObjC Var(warn_old_style_definition) Warning
   Warn if an old-style parameter definition is used
  
  +Wopenmp-simd
  +C C++ Var(openmp_simd) Warning EnabledBy(Wall)

Please use Var(warn_openmp_simd) here.

  --- a/gcc/common.opt
  +++ b/gcc/common.opt
  @@ -2296,6 +2296,10 @@ fvect-cost-model=
   Common Joined RejectNegative Enum(vect_cost_model)
  Var(flag_vect_cost_model) Init(VECT_COST_MODEL_DEFAULT)
   Specifies the cost model for vectorization
  
  +fsimd-cost-model=
  +Common Joined RejectNegative Enum(vect_cost_model)
  Var(flag_simd_cost_model) Init(VECT_COST_MODEL_UNLIMITED)
  +Specifies the vectorization cost model for code marked with simd directive
  +
   Enum
   Name(vect_cost_model) Type(enum vect_cost_model) UnknownError(unknown
  vectorizer cost model %qs)

I'd say you want to add
EnumValue
Enum(vect_cost_model) String(default) Value(VECT_COST_MODEL_DEFAULT)
here.

  @@ -2929,6 +2929,13 @@ vect_estimate_min_profitable_iters
  (loop_vec_info loop_vinfo,
 /* vector version will never be profitable.  */
 else
   {
  +  if (LOOP_VINFO_LOOP (loop_vinfo)->force_vect)
  +{
  +  warning_at (LOOP_VINFO_LOC (loop_vinfo), OPT_Wopenmp_simd,
  +  "Vectorization did not happen for "
  +  "the loop labeled as simd.");

No {} around single stmt then body.  Also, diagnostic messages
don't start with a capital letter and don't end with dot.
So
"vectorization did not happen for "
"a simd loop"
or so.

   /* Return true if the vect cost model is unlimited.  */
   static inline bool
  -unlimited_cost_model ()
  +unlimited_cost_model (loop_p loop)
   {
  -  return flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED;
  +  return (flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED
  +  || (loop != NULL
  +       && loop->force_vect
  +       && flag_simd_cost_model == VECT_COST_MODEL_UNLIMITED));
   }

IMNSHO this should instead do:
  if (loop != NULL && loop->force_vect
      && flag_simd_cost_model != VECT_COST_MODEL_DEFAULT)
return flag_simd_cost_model == VECT_COST_MODEL_UNLIMITED;
  return flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED;
so, if user said that -fsimd-cost-model=default, then it should honor
-fvect-cost-model.  And, IMHO that should be the default, but I don't
feel strongly about that.
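Jakub's suggested fall-through can be sketched as a standalone C toy (the enum values and field names mimic the patch but are stand-ins, not the real GCC declarations):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum vect_cost_model { VECT_COST_MODEL_DEFAULT, VECT_COST_MODEL_CHEAP,
                       VECT_COST_MODEL_UNLIMITED };

struct loop { bool force_vect; };

static enum vect_cost_model flag_vect_cost_model = VECT_COST_MODEL_CHEAP;
static enum vect_cost_model flag_simd_cost_model = VECT_COST_MODEL_DEFAULT;

/* -fsimd-cost-model=default defers to -fvect-cost-model even for loops
   marked with the simd directive (force_vect).  */
static bool
unlimited_cost_model (struct loop *loop)
{
  if (loop != NULL && loop->force_vect
      && flag_simd_cost_model != VECT_COST_MODEL_DEFAULT)
    return flag_simd_cost_model == VECT_COST_MODEL_UNLIMITED;
  return flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED;
}
```

With these stand-in defaults, a force_vect loop under -fsimd-cost-model=default honors the vect cost model; setting the simd model takes over only when it is not "default".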

Jakub


Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Jakub Jelinek
On Wed, Nov 20, 2013 at 03:12:34PM +0100, Richard Biener wrote:
 The limit looks reasonable, but you could have used a simple linked
 list (and never free).  Also being able to pop a random context
 looks fragile ...  that is, pop_gimplify_context shouldn't have an argument.

Can't we use stack_vec<gimplify_context, 30> for that?  Though that would
mean a global var constructor and destructor, so alternatively just use
a normal vec and .create(30) it somewhere during initialization?

Jakub


Re: patch PLUGIN_HEADER_FILE event for tracing of header inclusions.

2013-11-20 Thread Richard Biener
On Tue, Nov 19, 2013 at 10:57 PM, Basile Starynkevitch
bas...@starynkevitch.net wrote:
 On Tue, 2013-11-19 at 20:33 +0100, Dominique Dhumieres wrote:
  Thanks: Committed revision 205038.

 This seems to break several g++ tests: see 
 http://gcc.gnu.org/ml/gcc-testresults/2013-11/msg01482.html
 On x86_64-apple-darwin13, the ICE is

 cc1: error: cannot load plugin ./one_time_plugin.so

 Actually, when we add hard coded plugin events we should never forget,
 like I just did, to edit invoke_plugin_callbacks_full.

 Diego and all, can I add the following patch to the trunk. One is just a
 comment inside plugin.def to recall that invoke_plugin_callbacks_full
 should be edited, and another is the tiny fix on that function in
 plugin.c

 ### gcc/ChangeLog entry
 2013-11-19  Basile Starynkevitch  bas...@starynkevitch.net

 * plugin.def: Add comment about register_callback and
   invoke_plugin_callbacks_full.

 * plugin.c (register_callback, invoke_plugin_callbacks_full):
   Handle PLUGIN_INCLUDE_FILE event.
 ###

 Apologies for my mistake. Hope this will be approved quickly before 4.9
 stage 3! (I will later submit a documentation patch)

Ok.

Thanks,
Richard.


 BTW, I am not very happy about some events in plugin.def not being real
 events.  IMHO I would prefer that PLUGIN_PASS_MANAGER_SETUP,
 PLUGIN_REGISTER_GGC_ROOTS, PLUGIN_REGISTER_GGC_CACHES be removed,
 because they are not real events (and they should have their own public
 functions in plugin.h, e.g. add functions like
 plugin_pass_manager_setup, plugin_register_ggc_roots,
 plugin_register_ggc_caches in gcc/gcc-plugin.h). Unfortunately, we don't
 have time for such an improvement for 4.9

 Regards.
 --
 Basile STARYNKEVITCH http://starynkevitch.net/Basile/
 email: basileatstarynkevitchdotnet mobile: +33 6 8501 2359
 8, rue de la Faiencerie, 92340 Bourg La Reine, France
 *** opinions {are only mine, sont seulement les miennes} ***



Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Richard Biener
On Wed, Nov 20, 2013 at 3:16 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Wed, Nov 20, 2013 at 03:12:34PM +0100, Richard Biener wrote:
 The limit looks reasonable, but you could have used a simple linked
 list (and never free).  Also being able to pop a random context
 looks fragile ...  that is, pop_gimplify_context shouldn't have an argument.

 Can't we use stack_vec<gimplify_context, 30> for that?  Though that would
 mean a global var constructor and destructor, so alternatively just use
 a normal vec and .create(30) it somewhere during initialization?

only with gimplify_context *, otherwise things will break during re-allocation.

Richard.

 Jakub


Re: Enable -fno-fat-lto-objects by default

2013-11-20 Thread Jan Hubicka
 On Wed, Nov 20, 2013 at 2:12 PM, Jan Hubicka hubi...@ucw.cz wrote:
  Hi,
  this is the version I committed - we need to introduce a var for 
  -fuse-linker-plugin
  to be able to check it.
 
  I apologize for the breakage.
  Honza
 
  * opts.c (finish_options): Imply -ffat-lto-objects with 
  -fno-use-linker-plugin.
  * common.opt (fuse-linker-plugin): Add var.
  Index: opts.c
 
 Hmm, but that won't help for
 
 gcc -c t.c -flto
 gcc t.o -flto -fno-use-linker-plugin
 
 Isn't that the style how we invoke LTO tests?  (ok we put in useless
 -fno-use-linker-plugin in compile stage, too)

We put -fno-use-linker-plugin everywhere and I think it is what I would expect
people to do, too - just drop in -flto -fno-use-linker-plugin as a new
optimization option.

 
 So in the end it'll be too fragile for users IMHO.  I think we should
 disallow -fno-use-linker-plugin (if we decided to enable it by default
 then it won't fall back to no linker plugin but errors if we replace ld
 with something too old or remove plugin, right?).

On plugin-enabled setups, I think the only value of -fno-use-linker-plugin
is to allow us to test the other code path.
Perhaps we can declare it undocumented and for internal use only?

Honza
 
 Richard.
 
 ===
  --- opts.c  (revision 205108)
  +++ opts.c  (working copy)
  @@ -809,10 +809,13 @@ finish_options (struct gcc_options *opts
   #else
 error_at (loc, LTO support has not been enabled in this 
  configuration);
   #endif
  -  if (!opts->x_flag_fat_lto_objects && !HAVE_LTO_PLUGIN)
  +  if (!opts->x_flag_fat_lto_objects
  +      && (!HAVE_LTO_PLUGIN
  +          || (opts_set->x_flag_use_linker_plugin
  +              && !opts->x_flag_use_linker_plugin)))
  {
    if (opts_set->x_flag_fat_lto_objects)
  -    error_at (loc, "-fno-fat-lto-objects are supported only with "
  	      "linker plugin.");
  +    error_at (loc, "-fno-fat-lto-objects are supported only with "
  	      "linker plugin");
    opts->x_flag_fat_lto_objects = 1;
  }
   }
  Index: common.opt
  ===
  --- common.opt  (revision 205108)
  +++ common.opt  (working copy)
  @@ -2247,7 +2247,7 @@ Common Negative(fuse-ld=bfd)
   Use the gold linker instead of the default linker
 
   fuse-linker-plugin
  -Common Undocumented
  +Common Undocumented Var(flag_use_linker_plugin)
 
   ; Positive if we should track variables, negative if we should run
   ; the var-tracking pass only to discard debug annotations, zero if


Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Andrew MacLeod

On 11/20/2013 09:12 AM, Richard Biener wrote:

On Wed, Nov 20, 2013 at 2:35 PM, Andrew MacLeod amacl...@redhat.com wrote:

This has been bugging me since I moved it out of gimple.h to gimplify.h.

The gimplify context structure was exposed in the header file to allow a few
other files to push and pop contexts off the gimplification stack.
Unfortunately, the struct contains a hash-table template, which means it
also required hash-table.h in order to even parse the header.  No file other
than gimplify.c actually uses what is in the structure, so exposing it seems
wrong.

This patch primarily changes push_gimplify_context () and
pop_gimplify_context () to grab a struct gimplify_ctx off a local stack pool
rather than have each file's local function know about the structure and
allocate it on their stack.  I didn't want to use malloc/free or there would
be a lot of churn.  Its open to debate what the best approach is, but I
decided to simply allocate a bunch on a static array local to gimplify.c.
If it goes past the end of the local array, malloc the required structure.
Its pretty simplistic, but functional and doesn't required any GTY stuff or
other machinery.

I also hacked up the compiler to report what the 'top' of the stack was for
a compilation unit. I then ran it through a bootstrap and full testsuite run
of all languages, and looked for the maximum number of context structs in
use at one time.  Normally only 1 or 2 were ever in use, but it topped out
at 7 during some of the openmp tests.  So my current limit of 30 will likely
never be reached.
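The scheme described above — a small static array with a malloc fallback past the limit — can be sketched roughly like this (the names and stub struct are illustrative; the real gimplify.c code differs):

```c
#include <assert.h>
#include <stdlib.h>

#define CTX_POOL_SIZE 30

struct gimplify_ctx_stub { int into_ssa; };

static struct gimplify_ctx_stub ctx_pool[CTX_POOL_SIZE];
static int ctx_depth;

/* Hand out contexts from the static pool; past the limit, fall back to
   malloc.  ctx_depth counts all live contexts, pooled or not.  */
static struct gimplify_ctx_stub *
ctx_alloc (void)
{
  if (ctx_depth < CTX_POOL_SIZE)
    return &ctx_pool[ctx_depth++];
  ctx_depth++;
  return malloc (sizeof (struct gimplify_ctx_stub));
}

/* Contexts are freed strictly LIFO; only overflow ones were malloc'ed.  */
static void
ctx_free (struct gimplify_ctx_stub *c)
{
  ctx_depth--;
  if (ctx_depth >= CTX_POOL_SIZE)
    free (c);
}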

only 2 fields were ever set by anyone outside of gimplify.c, the 'into_ssa'
and 'allow_rhs_cond_expr' fields, and they were always set
immediately after pushing the context onto the stack... So I added them as
parameters to the push and gave them default values.

And that's it... this patch moves the hash function and structure completely
within gimplify.c so no one needs to know anything about it, and removes the
hash-table.h include prerequisite for parsing.

Bootstraps on x86_64-unknown-linux-gnu with no new regressions. OK for
mainline?

The limit looks reasonable, but you could have used a simple linked
list (and never free).  Also being able to pop a random context
looks fragile ...  that is, pop_gimplify_context shouldn't have an argument.


The argument to pop_gimplify_context isn't a context pointer, it's a 
gimple statement and if present calls declare_vars on the temps in the 
body...  the routine always pops the current context.  So that doesn't 
seem too fragile?


I had considered just a linked list and never freeing, but thought that 
policy might be frowned upon and trigger false positives on memory 
allocation analysis tools  :-)   I'm happy to do that though, it's quite 
trivial.



Andrew




Re: [Patch, ARM] New feature to minimize the literal load for armv7-m target

2013-11-20 Thread Richard Earnshaw
On 06/11/13 06:10, Terry Guo wrote:
 Hi,
 
 This patch intends to minimize the use of the literal pool for some armv7-m
 targets that are slower at loading data from flash than at fetching
 instructions from flash.  The normal literal load instruction is now replaced
 by MOVW/MOVT instructions.  A new option -mslow-flash-data is created for
 this purpose.  So far this feature doesn't support PIC code or targets that
 aren't based on armv7-m.
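The MOVW/MOVT replacement materializes a 32-bit constant in two halves instead of loading it from a literal pool: MOVW writes the low 16 bits and zeroes the top, MOVT overwrites the high 16 bits. A rough C model of the two instructions (illustrative only, not the generated assembly):

```c
#include <assert.h>
#include <stdint.h>

/* MOVW rd, #imm16: set the low half, zero-extend into the high half.  */
static uint32_t
movw (uint16_t imm16)
{
  return imm16;
}

/* MOVT rd, #imm16: overwrite the high half, keep the low half.  */
static uint32_t
movt (uint32_t reg, uint16_t imm16)
{
  return (reg & 0xffffu) | ((uint32_t) imm16 << 16);
}
```

Any 32-bit constant c is then materialized as movt (movw (c & 0xffff), c >> 16), so no data load from flash is needed.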
 
 Tested with GCC regression test on QEMU for cortex-m3. No new regressions.
 Is it OK for trunk?
 
 BR,
 Terry
 
 2013-11-06  Terry Guo  terry@arm.com
 
  * doc/invoke.texi (-mslow-flash-data): Document new option.
  * config/arm/arm.opt (mslow-flash-data): New option.
  * config/arm/arm-protos.h
 (arm_max_const_double_inline_cost): Declare it.
  * config/arm/arm.h (TARGET_USE_MOVT): Always true when
 disable literal pools.
literal pools are disabled.

  (arm_disable_literal_pool): Declare it.
  * config/arm/arm.c (arm_disable_literal_pool): New
 variable.
  (arm_option_override): Handle new option.
  (thumb2_legitimate_address_p): Invalid certain address
 format.

Invalidate.  What address formats?

  (arm_max_const_double_inline_cost): New function.
  * config/arm/arm.md (types.md): Include it a little
 earlier.

Include it before ...

  (use_literal_pool): New attribute.
  (enabled): Use new attribute.
  (split pattern): Replace symbol+offset with MOVW/MOVT.
 
 

Comments inline.

 diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
 index 1781b75..25927a1 100644
 --- a/gcc/config/arm/arm.h
 +++ b/gcc/config/arm/arm.h
 @@ -554,6 +556,9 @@ extern int arm_arch_thumb_hwdiv;
 than core registers.  */
  extern int prefer_neon_for_64bits;
  
 +/* Nonzero if shouldn't use literal pool in generated code.  */
'if we shouldn't use literal pools'

 +extern int arm_disable_literal_pool;

This should be a bool, values stored in it should be true/false not 1/0.

 diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
 index 78554e8..de2a9c0 100644
 --- a/gcc/config/arm/arm.c
 +++ b/gcc/config/arm/arm.c
 @@ -864,6 +864,9 @@ int arm_arch_thumb_hwdiv;
 than core registers.  */
  int prefer_neon_for_64bits = 0;
  
 +/* Nonzero if shouldn't use literal pool in generated code.  */
 +int arm_disable_literal_pool = 0;

Similar comments to above.

 @@ -6348,6 +6361,25 @@ thumb2_legitimate_address_p (enum machine_mode mode, 
 rtx x, int strict_p)
  thumb2_legitimate_index_p (mode, xop0, strict_p)));
  }
  
 +  /* Normally we can assign constant values to its target register without
'to target registers'

 + the help of constant pool.  But there are cases we have to use constant
 + pool like:
 + 1) assign a label to register.
 + 2) sign-extend a 8bit value to 32bit and then assign to register.
 +
 + Constant pool access in format:
 + (set (reg r0) (mem (symbol_ref (.LC0
 + will cause the use of literal pool (later in function arm_reorg).
 + So here we mark such format as an invalid format, then compiler
'then the compiler'

 @@ -16114,6 +16146,18 @@ push_minipool_fix (rtx insn, HOST_WIDE_INT address, 
 rtx *loc,
minipool_fix_tail = fix;
  }
  
 +/* Return maximum allowed cost of synthesizing a 64-bit constant VAL inline.
 +   Returns 99 if we always want to synthesize the value.  */

Needs to mention that the cost is in terms of 'insns' (see the function
below it).

 +int
 +arm_max_const_double_inline_cost ()
 +{
 +  /* Let the value get synthesized to avoid the use of literal pools.  */
 +  if (arm_disable_literal_pool)
 +return 99;
 +
 +  return ((optimize_size || arm_ld_sched) ? 3 : 4);
 +}
 +
  /* Return the cost of synthesizing a 64-bit constant VAL inline.
 Returns the number of insns needed, or 99 if we don't know how to
 do it.  */

 diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
 index adbc45b..a5991cb 100644
 --- a/gcc/doc/invoke.texi
 +++ b/gcc/doc/invoke.texi
 @@ -534,6 +534,7 @@ Objective-C and Objective-C++ Dialects}.
  -mfix-cortex-m3-ldrd @gol
  -munaligned-access @gol
  -mneon-for-64bits @gol
 +-mslow-flash-data @gol
  -mrestrict-it}
  
  @emph{AVR Options}
 @@ -12295,6 +12296,12 @@ Enables using Neon to handle scalar 64-bits 
 operations. This is
  disabled by default since the cost of moving data from core registers
  to Neon is high.
  
 +@item -mslow-flash-data
 +@opindex mslow-flash-data
 +Assume loading data from flash is slower than fetching instruction.
 +Therefore literal load is minimized for better performance.
 +This option is off by default.
 +

Needs to mention the limitation on which processors can support this, ie
only v7 m-profile.


R.



[Patchv2] Add slim-lto support to gcc's build machinery

2013-11-20 Thread Markus Trippelsdorf
On 2013.11.20 at 14:41 +0100, Paolo Bonzini wrote:
 On 20/11/2013 08:23, Markus Trippelsdorf wrote:
  Hi,
  
  now that slim-lto objects are enabled by default, it would be nice to
  use them when building gcc with bootstrap-lto. The following patch
  implements the handling of these objects in gcc's build machinery.
  (Once -fuse-linker-plugin is made the default, -ffat-lto-objects in
  config/bootstrap-lto.mk could be dropped.)
  
  LTO-Bootstrapped on x86_64-linux (with default -fuse-linker-plugin).
  The patch was already approved by Paolo Bonzini.
  
  Please apply, because I don't have access.
 
 Note that you need to regenerate all users of libtool.m4.  Please post a
 patch _with_ the regeneration so that whoever applies it won't screw up.

Thanks. Let's hope this one is OK.

2013-11-20  Markus Trippelsdorf  mar...@trippelsdorf.de

* boehm-gc/configure: Regenerate.
* gcc/configure: Likewise.
* libatomic/configure: Likewise.
* libbacktrace/configure: Likewise.
* libcilkrts/configure: Likewise.
* libffi/configure: Likewise.
* libgfortran/configure: Likewise.
* libgomp/configure: Likewise.
* libitm/configure: Likewise.
* libjava/classpath/configure: Likewise.
* libjava/configure: Likewise.
* libobjc/configure: Likewise.
* libquadmath/configure: Likewise.
* libsanitizer/configure: Likewise.
* libssp/configure: Likewise.
* libstdc++-v3/configure: Likewise.
* libtool.m4: Handle slim-lto objects.
* libvtv/configure: Regenerate.
* ltmain.sh: Handle lto options.
* lto-plugin/configure: Regenerate.
* zlib/configure: Likewise.


diff --git a/boehm-gc/configure b/boehm-gc/configure
index 025003cac135..3c57697e0854 100755
--- a/boehm-gc/configure
+++ b/boehm-gc/configure
@@ -6555,6 +6555,7 @@ for ac_symprfx in  _; do
   else
 lt_cv_sys_global_symbol_pipe=sed -n -e 's/^.*[ 
]\($symcode$symcode*\)[ ][  
]*$ac_symprfx$sympat$opt_cr$/$symxfrm/p'
   fi
+  lt_cv_sys_global_symbol_pipe="$lt_cv_sys_global_symbol_pipe | sed '/ __gnu_lto/d'"
 
   # Check to see that the pipe works correctly.
   pipe_works=no
@@ -9081,7 +9082,7 @@ _LT_EOF
   if $LD --help 2>&1 | $EGREP ': supported targets:.* elf' > /dev/null \
	 && test "$tmp_diet" = no
   then
-   tmp_addflag=
+   tmp_addflag=' $pic_flag'
tmp_sharedflag='-shared'
case $cc_basename,$host_cpu in
 pgcc*) # Portland Group C compiler
@@ -11318,7 +11319,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11321 configure
+#line 11322 configure
 #include confdefs.h
 
 #if HAVE_DLFCN_H
@@ -11424,7 +11425,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11427 configure
+#line 11428 configure
 #include confdefs.h
 
 #if HAVE_DLFCN_H
@@ -12019,8 +12020,8 @@ with_gnu_ld=$lt_cv_prog_gnu_ld
   # Check if GNU C++ uses GNU ld as the underlying linker, since the
   # archiving commands below assume that GNU ld is being used.
   if test $with_gnu_ld = yes; then
-archive_cmds_CXX='$CC -shared -nostdlib $predep_objects $libobjs 
$deplibs $postdep_objects $compiler_flags ${wl}-soname $wl$soname -o $lib'
-archive_expsym_cmds_CXX='$CC -shared -nostdlib $predep_objects 
$libobjs $deplibs $postdep_objects $compiler_flags ${wl}-soname $wl$soname 
${wl}-retain-symbols-file $wl$export_symbols -o $lib'
+archive_cmds_CXX='$CC $pic_flag -shared -nostdlib $predep_objects 
$libobjs $deplibs $postdep_objects $compiler_flags ${wl}-soname $wl$soname -o 
$lib'
+archive_expsym_cmds_CXX='$CC $pic_flag -shared -nostdlib 
$predep_objects $libobjs $deplibs $postdep_objects $compiler_flags ${wl}-soname 
$wl$soname ${wl}-retain-symbols-file $wl$export_symbols -o $lib'
 
 hardcode_libdir_flag_spec_CXX='${wl}-rpath ${wl}$libdir'
 export_dynamic_flag_spec_CXX='${wl}--export-dynamic'
@@ -13027,6 +13028,13 @@ private:
 };
 _LT_EOF
 
+
+_lt_libdeps_save_CFLAGS=$CFLAGS
+case "$CC $CFLAGS " in #(
+*\ -flto*\ *) CFLAGS="$CFLAGS -fno-lto" ;;
+*\ -fwhopr*\ *) CFLAGS="$CFLAGS -fno-whopr" ;;
+esac
+
 if { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_compile\""; } >&5
   (eval "$ac_compile") 2>&5
   ac_status=$?
@@ -13077,6 +13085,7 @@ if { { eval echo \\$as_me\:${as_lineno-$LINENO}: 
\$ac_compile\; } 5
fi
;;
 
+*.lto.$objext) ;; # Ignore GCC LTO objects
 *.$objext)
# This assumes that the test object file only shows up
# once in the compiler output.
@@ -13112,6 +13121,7 @@ else
 fi
 
 $RM -f confest.$objext
+CFLAGS=$_lt_libdeps_save_CFLAGS
 
 # PORTME: override above test on systems where it is broken
 case $host_os in
diff --git a/gcc/configure b/gcc/configure
index c9bbd653e52e..0ed594b15839 100755
--- a/gcc/configure
+++ 

Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.

2013-11-20 Thread Sergey Ostanevich
Updated as per Richard's and Jakub's feedback - the default for
simd-cost-model is now unlimited.
Richard - were you OK with it?

Sergos

* common.opt: Added new option -fsimd-cost-model.
* tree-vectorizer.h (unlimited_cost_model): Interface update
to rely on particular loop info.
* tree-vect-data-refs.c (vect_peeling_hash_insert): Update to
unlimited_cost_model call according to new interface.
(vect_peeling_hash_choose_best_peeling): Ditto.
(vect_enhance_data_refs_alignment): Ditto.
* tree-vect-slp.c: Ditto.
* tree-vect-loop.c (vect_estimate_min_profitable_iters): Ditto,
plus issue a warning in case the cost model overrides the user's directive.
* c.opt: Add openmp-simd warning.
* lang.opt: Ditto.
* doc/invoke.texi: Add new openmp-simd warning.



diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 0026683..6173013 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -592,6 +592,10 @@ Wold-style-definition
 C ObjC Var(warn_old_style_definition) Warning
 Warn if an old-style parameter definition is used

+Wopenmp-simd
+C C++ Var(warn_openmp_simd) Warning EnabledBy(Wall)
+Warn if a simd directive is overridden by the vectorizer cost model
+
 Woverlength-strings
 C ObjC C++ ObjC++ Var(warn_overlength_strings) Warning
LangEnabledBy(C ObjC C++ ObjC++,Wpedantic)
 Warn if a string is longer than the maximum portable length specified
by the standard
diff --git a/gcc/common.opt b/gcc/common.opt
index d5971df..6a40a5d 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2296,10 +2296,17 @@ fvect-cost-model=
 Common Joined RejectNegative Enum(vect_cost_model)
Var(flag_vect_cost_model) Init(VECT_COST_MODEL_DEFAULT)
 Specifies the cost model for vectorization

+fsimd-cost-model=
+Common Joined RejectNegative Enum(vect_cost_model)
Var(flag_simd_cost_model) Init(VECT_COST_MODEL_UNLIMITED)
+Specifies the vectorization cost model for code marked with a simd directive
+
 Enum
 Name(vect_cost_model) Type(enum vect_cost_model) UnknownError(unknown
vectorizer cost model %qs)

 EnumValue
+Enum(vect_cost_model) String(default) Value(VECT_COST_MODEL_DEFAULT)
+
+EnumValue
 Enum(vect_cost_model) String(unlimited) Value(VECT_COST_MODEL_UNLIMITED)

 EnumValue
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c250385..050bd44 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -256,7 +256,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wlogical-op -Wlong-long @gol
 -Wmain -Wmaybe-uninitialized -Wmissing-braces
-Wmissing-field-initializers @gol
 -Wmissing-include-dirs @gol
--Wno-multichar  -Wnonnull  -Wno-overflow @gol
+-Wno-multichar  -Wnonnull  -Wno-overflow -Wopenmp-simd @gol
 -Woverlength-strings  -Wpacked  -Wpacked-bitfield-compat  -Wpadded @gol
 -Wparentheses  -Wpedantic-ms-format -Wno-pedantic-ms-format @gol
 -Wpointer-arith  -Wno-pointer-to-int-cast @gol
@@ -3318,6 +3318,7 @@ Options} and @ref{Objective-C and Objective-C++
Dialect Options}.
 -Wmaybe-uninitialized @gol
 -Wmissing-braces @r{(only for C/ObjC)} @gol
 -Wnonnull  @gol
+-Wopenmp-simd @gol
 -Wparentheses  @gol
 -Wpointer-sign  @gol
 -Wreorder   @gol
@@ -4804,6 +4805,10 @@ attribute.
 @opindex Woverflow
 Do not warn about compile-time overflow in constant expressions.

+@item -Wopenmp-simd
+@opindex Wopenmp-simd
+Warn if the vectorizer cost model overrides the simd directive of the user.
+
 @item -Woverride-init @r{(C and Objective-C only)}
 @opindex Woverride-init
 @opindex Wno-override-init
diff --git a/gcc/fortran/lang.opt b/gcc/fortran/lang.opt
index 5e09cbd..b43c48c 100644
--- a/gcc/fortran/lang.opt
+++ b/gcc/fortran/lang.opt
@@ -257,6 +257,10 @@ Wintrinsics-std
 Fortran Warning
 Warn on intrinsics not part of the selected standard

+Wopenmp-simd
+Fortran Warning
+; Documented in C
+
 Wreal-q-constant
 Fortran Warning
 Warn about real-literal-constants with 'q' exponent-letter
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 83d1f45..977db43 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -1090,7 +1090,8 @@ vect_peeling_hash_insert (loop_vec_info
loop_vinfo, struct data_reference *dr,
   *new_slot = slot;
 }

-  if (!supportable_dr_alignment && unlimited_cost_model ())
+  if (!supportable_dr_alignment
+      && unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
     slot->count += VECT_MAX_COST;
 }

@@ -1200,7 +1201,7 @@ vect_peeling_hash_choose_best_peeling
(loop_vec_info loop_vinfo,
res.peel_info.dr = NULL;
res.body_cost_vec = stmt_vector_for_cost ();

-   if (!unlimited_cost_model ())
+   if (!unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
  {
res.inside_cost = INT_MAX;
res.outside_cost = INT_MAX;
@@ -1429,7 +1430,7 @@ vect_enhance_data_refs_alignment (loop_vec_info
loop_vinfo)
  vectorization factor.
 	 We do this automatically for cost model, since we calculate cost
  for every peeling option.  */
-   

Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Trevor Saunders
On Wed, Nov 20, 2013 at 03:18:07PM +0100, Richard Biener wrote:
 On Wed, Nov 20, 2013 at 3:16 PM, Jakub Jelinek ja...@redhat.com wrote:
  On Wed, Nov 20, 2013 at 03:12:34PM +0100, Richard Biener wrote:
  The limit looks reasonable, but you could have used a simple linked
  list (and never free).  Also being able to pop a random context
  looks fragile ...  that is, pop_gimplify_context shouldn't have an 
  argument.
 
   Can't we use stack_vec<gimplify_context, 30> for that?  Though that would
  mean a global var constructor and destructor, so alternatively just use
  a normal vec and .create(30) it somewhere during initialization?
 
 only with gimplify_context *, otherwise things will break during 
 re-allocation.

hm? It seems like the only member of gimplify_ctx that can't just be
memcpy'd is the prev pointer, which presumably could go away if you have a
vec of all the contexts.

Trev

 
 Richard.
 
  Jakub


Re: Factor unrelated declarations out of tree.h (1/2)

2013-11-20 Thread Trevor Saunders
On Wed, Nov 20, 2013 at 08:43:57AM -0500, Diego Novillo wrote:
 On Wed, Nov 20, 2013 at 6:44 AM, Eric Botcazou ebotca...@adacore.com wrote:
  Thanks. Committed as rev 205023.
 
  Nice work, but why did you antedate the entries in the various ChangeLog
 
 Oh, that's because of local commits and holding on to the patch for a
 few days. That date is the date of the original local commit. Further
 pulls from upstream pushed the entries down in the file. I just forgot
 to move the entries forward when I committed the patch.
 
  directories (yes, we all know your opinion about ChangeLog files ;-)
 
 What gave you that idea? ;)

 This even seems like a good argument for that view, git log and I
 assume svn log could give you the same data without the possibility of
 this sort of issue ;)

 Trev

 
 
 Diego.


Re: [RFC] [PATCH, AARCH64] Machine descriptions to support stack smashing protection

2013-11-20 Thread Venkataramanan Kumar
Hi Joseph,

On 19 November 2013 21:53, Joseph S. Myers jos...@codesourcery.com wrote:
 On Tue, 19 Nov 2013, Jakub Jelinek wrote:

 On Tue, Nov 19, 2013 at 04:30:21PM +0530, Venkataramanan Kumar wrote:
  This is RFC patch that adds machine descriptions to support stack
  smashing protection in AArch64.

 Most of the testsuite changes look wrong.  The fact that aarch64
 gets stack protector support doesn't mean all other targets do as well.
 So please leave all the changes that remove native or stack_protector
 requirements out.

 The tests for target native look wrong for me (native conditionals
 only make sense for special cases such as guality tests that expect to
 exec another tool on the host) - so I think they should be removed, but
 that removal needs submitting as a separate patch.


Yes apologies for that, I will send another patch.

 I would like to see a clear description of what happens with all eight
 combinations of (glibc version does or doesn't support this, GCC building
 glibc does or doesn't support this, GCC building user program does or
 doesn't support this).  Which of the (GCC building glibc, glibc)
 combinations will successfully build glibc?  Will all such glibcs be
 binary-compatible?  Will both old and new GCC work for building user
 programs with both old and new glibc?

Can you please clarify why we need to consider the GCC compiler that
builds glibc in the combinations you want me to describe? I am
not able to understand that.

I tested the below three GCC compiler versions for building user programs.

1) GCC trunk (without my patch) generates code for stack protection
set/test when -fstack-protector-all is enabled.
This is based on global variable  __stack_chk_guard access, which
glibc supports from version 2.4.

-snip-
 adrpx0, __stack_chk_guard
 add x0, x0, :lo12:__stack_chk_guard
 ldr x0, [x0]
 str x0, [x29,24]
-snip-

GCC emits generic code for stack protection, enabled for targets
where the frame grows downwards. The patch for downward frame growth
on AArch64 was recently committed to trunk.

2) Now with the patch, GCC compiler will generate TLS based access.

snip-
   mrs x0, tpidr_el0
   ldr x1, [x0,-8]
   str x1, [x29,24]
-snip-

I need to check whether the target glibc for AArch64 supports TLS-based
SSP. Currently some targets check for this and generate TLS-based access
when the glibc version is >= 2.4.
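
The difference between the two schemes is easiest to see at the C level. Below is a minimal, runnable sketch of the checks that -fstack-protector conceptually inserts; guard_value and guard_fail are hypothetical stand-ins for glibc's __stack_chk_guard and __stack_chk_fail so the example runs standalone:

```c
#include <stdio.h>
#include <string.h>

/* Stand-in for __stack_chk_guard (normally a random value set at
   process start, either a global or a TLS slot).  */
static unsigned long guard_value = 0xdeadbeefUL;

/* Stand-in for __stack_chk_fail, which aborts in real glibc.  */
static void guard_fail (void)
{
  fprintf (stderr, "*** stack smashing detected ***\n");
}

int copy_checked (const char *src)
{
  unsigned long canary = guard_value;  /* prologue: copy guard into the frame */
  char buf[16];

  strncpy (buf, src, sizeof buf - 1);
  buf[sizeof buf - 1] = '\0';

  if (canary != guard_value)           /* epilogue: compare before returning */
    guard_fail ();
  return (int) strlen (buf);
}
```

Whether the prologue loads the guard through a global symbol or through tpidr_el0 is exactly the codegen difference shown in the two assembly snippets above.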

3) The GCC Linaro compiler packaged with the OpenEmbedded image, which I
boot in the ARMv8 Foundation Model as I don't have access to AArch64
hardware yet.

It will not emit any stack protection code.
test.c:1:0: warning: -fstack-protector not supported for this target
[enabled by default]

regards,
Venkat.


Re: Committed: gdbhooks.py: reorganization to support regex-matching and typedefs

2013-11-20 Thread Tom Tromey
 David == David Malcolm dmalc...@redhat.com writes:

David I've committed the following to trunk as r205085, as a mixture of
David enabling work towards gdb being able to print vec, and to fix issues
David where the string-matching logic got confused by typedefs (done in a
David rather crude way, by supporting multiple strings in the match, but it
David works).

FWIW for the libstdc++ printers what we do is strip typedefs and
qualifiers to find the type whose name we want to match:

# Get the unqualified type, stripped of typedefs.
type = type.unqualified ().strip_typedefs ()

Tom


Re: [wide-int] Undo some differences with trunk

2013-11-20 Thread Richard Sandiford
Kenneth Zadeck zad...@naturalbridge.com writes:
 I would leave out the last frag of tree.c.  It actually gets rid of what 
 i consider, a latent bug. it does not show up with the current 
 implementation of wide-int, but if that implementation changed, it 
 would.   The problem is that you really should not expect that you can 
 set the min or max value for a type without first setting the precision 
 of that type.

Ah, OK.  I should have realised there was more to it.

 Aside from that, this looks fine.

Thanks, applied without that hunk.

Richard



[wide-int] Remove cst_fits_*

2013-11-20 Thread Richard Sandiford
FWIW this is the last change I had planned.  I thought it would be easier
to do once the host_integerp replacement was in mainline, but that means
it's a bit closer to the deadline than it should have been.

The branch adds two new functions, cst_fits_uhwi_p and cst_fits_shwi_p,
alongside the tree_fits_* variants.  AFAICT, all cst_fits_uhwi_p tests
replace trunk tests of the form:

TREE_CODE (x) == INTEGER_CST && TREE_INT_CST_HIGH (x) == 0

But this is the same as host_integerp (x, 1), now tree_fits_uhwi_p,
so I think we should just use that instead.  FWIW host_integerp was:

int
host_integerp (const_tree t, int pos)
{
  if (t == NULL_TREE)
return 0;

  return (TREE_CODE (t) == INTEGER_CST
	  && ((TREE_INT_CST_HIGH (t) == 0
	       && (HOST_WIDE_INT) TREE_INT_CST_LOW (t) >= 0)
	      || (! pos && TREE_INT_CST_HIGH (t) == -1
		  && (HOST_WIDE_INT) TREE_INT_CST_LOW (t) < 0
		  && !TYPE_UNSIGNED (TREE_TYPE (t)))
	      || (pos && TREE_INT_CST_HIGH (t) == 0)));
}

which if you fold in pos = 1 reduces to current trunk's:

bool
tree_fits_uhwi_p (const_tree t)
{
  return (t != NULL_TREE
	  && TREE_CODE (t) == INTEGER_CST
	  && TREE_INT_CST_HIGH (t) == 0);
}
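
To see that the pos == 1 folding claimed above holds, here is a small standalone sketch (not GCC code; the struct and helper names are hypothetical) modelling the constant as a high/low word pair:

```c
#include <assert.h>
#include <stdint.h>
#include <stdbool.h>

/* Model a double-word INTEGER_CST as a high word, a low word, and an
   unsignedness flag, mirroring TREE_INT_CST_HIGH / TREE_INT_CST_LOW /
   TYPE_UNSIGNED.  */
struct cst { int64_t high; uint64_t low; bool unsigned_type; };

/* The body of host_integerp, transcribed from the message above
   (NULL check dropped since the struct is passed by value).  */
static int host_integerp_sketch (struct cst t, int pos)
{
  return ((t.high == 0 && (int64_t) t.low >= 0)
	  || (!pos && t.high == -1 && (int64_t) t.low < 0 && !t.unsigned_type)
	  || (pos && t.high == 0));
}

/* What the pos == 1 case folds down to: tree_fits_uhwi_p.  */
static bool tree_fits_uhwi_p_sketch (struct cst t)
{
  return t.high == 0;
}
```

For pos == 1 the second disjunct is dead and the third subsumes the first, so the whole test collapses to `high == 0`, which is the equivalence the message relies on.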

cst_fits_shwi_p replaces cst_and_fits_in_hwi, but if cst_fits_uhwi_p
goes away then I think we might as well stick with the original name.
(We're already keeping the associated extraction function, int_cst_value.)

I did a bit of wide-intification of write_integer_cst, which showed
up a missing conversion in the abs routine (my fault).
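
For reference, the small-number path after this change can be sketched as follows; per the Itanium C++ ABI a negative literal is mangled as 'n' followed by its absolute value. The buffer-based helper below is hypothetical, standing in for the mangler's write_char / write_unsigned_number stream:

```c
#include <stdio.h>

/* Sketch of the small-number path of write_integer_cst after the patch:
   take the absolute value first (as the new wi::abs call does), then
   emit 'n' for negative constants followed by the unsigned magnitude.  */
static void mangle_small_int (long long v, char *out)
{
  unsigned long long abs_value;

  if (v < 0)
    {
      *out++ = 'n';                         /* write_char ('n') */
      abs_value = -(unsigned long long) v;  /* well-defined even for LLONG_MIN */
    }
  else
    abs_value = (unsigned long long) v;

  sprintf (out, "%llu", abs_value);         /* write_unsigned_number */
}
```

Note the unsigned negation: negating in the unsigned type avoids the signed-overflow trap that `low = -low` on a HOST_WIDE_INT could hit for the most negative value.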

Tested on x86_64-linux-gnu.  OK for wide-int?

Thanks,
Richard


Index: gcc/ada/gcc-interface/utils.c
===
--- gcc/ada/gcc-interface/utils.c   2013-11-20 14:00:54.151599452 +
+++ gcc/ada/gcc-interface/utils.c   2013-11-20 15:29:17.483262547 +
@@ -6065,7 +6065,7 @@ handle_novops_attribute (tree *node, tre
 get_nonnull_operand (tree arg_num_expr, unsigned HOST_WIDE_INT *valp)
 {
   /* Verify the arg number is a constant.  */
-  if (!cst_fits_uhwi_p (arg_num_expr))
+  if (!tree_fits_uhwi_p (arg_num_expr))
 return false;
 
   *valp = TREE_INT_CST_LOW (arg_num_expr);
Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c 2013-11-20 14:00:54.151599452 +
+++ gcc/c-family/c-common.c 2013-11-20 15:29:17.486262535 +
@@ -8768,7 +8768,7 @@ check_nonnull_arg (void * ARG_UNUSED (ct
 get_nonnull_operand (tree arg_num_expr, unsigned HOST_WIDE_INT *valp)
 {
   /* Verify the arg number is a small constant.  */
-  if (cst_fits_uhwi_p (arg_num_expr))
+  if (tree_fits_uhwi_p (arg_num_expr))
 {
   *valp = TREE_INT_CST_LOW (arg_num_expr);
   return true;
Index: gcc/c-family/c-format.c
===
--- gcc/c-family/c-format.c 2013-11-20 14:00:54.151599452 +
+++ gcc/c-family/c-format.c 2013-11-20 15:29:17.487262531 +
@@ -227,7 +227,7 @@ check_format_string (tree fntype, unsign
 static bool
 get_constant (tree expr, unsigned HOST_WIDE_INT *value, int validated_p)
 {
-  if (!cst_fits_uhwi_p (expr))
+  if (!tree_fits_uhwi_p (expr))
 {
   gcc_assert (!validated_p);
   return false;
Index: gcc/config/sol2-c.c
===
--- gcc/config/sol2-c.c 2013-11-20 14:00:54.151599452 +
+++ gcc/config/sol2-c.c 2013-11-20 15:29:17.433262746 +
@@ -96,7 +96,7 @@ solaris_pragma_align (cpp_reader *pfile
 }
 
   low = TREE_INT_CST_LOW (x);
-  if (!cst_fits_uhwi_p (x)
+  if (!tree_fits_uhwi_p (x)
   || (low != 1 && low != 2 && low != 4 && low != 8 && low != 16
       && low != 32 && low != 64 && low != 128))
 {
Index: gcc/cp/mangle.c
===
--- gcc/cp/mangle.c 2013-11-20 14:00:54.151599452 +
+++ gcc/cp/mangle.c 2013-11-20 15:29:17.434262742 +
@@ -1505,8 +1505,8 @@ write_number (unsigned HOST_WIDE_INT num
 write_integer_cst (const tree cst)
 {
   int sign = tree_int_cst_sgn (cst);
-
-  if (!cst_fits_shwi_p (cst))
+  widest_int abs_value = wi::abs (wi::to_widest (cst));
+  if (!wi::fits_uhwi_p (abs_value))
 {
   /* A bignum. We do this in chunks, each of which fits in a
 HOST_WIDE_INT.  */
@@ -1559,14 +1559,9 @@ write_integer_cst (const tree cst)
   else
 {
   /* A small num.  */
-  unsigned HOST_WIDE_INT low = TREE_INT_CST_LOW (cst);
-
   if (sign < 0)
-   {
- write_char ('n');
- low = -low;
-   }
-  write_unsigned_number (low);
+   write_char ('n');
+  write_unsigned_number (abs_value.to_uhwi ());
 }
 }
 
Index: gcc/dwarf2out.c
===
--- gcc/dwarf2out.c 2013-11-20 15:27:30.843688771 +
+++ gcc/dwarf2out.c 2013-11-20 15:29:17.438262726 +
@@ -16353,8 +16353,7 @@ 

[PATCH, ia64] [PR target/52731] internal compiler error: in ia64_st_address_bypass_p, at config/ia64/ia64.c:9357

2013-11-20 Thread Kirill Yukhin
Hello,
Patch in the bottom fixes PR52731.

The essence of the problem is that `ia64_single_set'
allows a double set as an exception for `prologue_allocate_stack'
and `epilogue_deallocate_stack'.

However, it does not apply this exception to predicated
versions of the patterns. I introduce explicit predicated
versions of the insns and use their codes in the routine.

The test passes with the patch applied and fails without it.
Bootstrap passes (for all languages).
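
The dispatch being extended can be sketched as follows; the enum values are illustrative stand-ins for the CODE_FOR_* values the patch adds to ia64_single_set:

```c
/* Stand-ins for the recog_memoized () insn codes involved; only the
   _PR entries are new with this patch.  */
enum insn_code_sketch {
  CODE_PROLOGUE_ALLOCATE_STACK,
  CODE_PROLOGUE_ALLOCATE_STACK_PR,
  CODE_EPILOGUE_DEALLOCATE_STACK,
  CODE_EPILOGUE_DEALLOCATE_STACK_PR,
  CODE_OTHER
};

/* Returns 1 if the insn code is one of the known two-set exceptions,
   i.e. ia64_single_set should treat its first SET as the single set
   instead of rejecting the insn.  */
static int double_set_exception_p (enum insn_code_sketch code)
{
  switch (code)
    {
    case CODE_PROLOGUE_ALLOCATE_STACK:
    case CODE_PROLOGUE_ALLOCATE_STACK_PR:
    case CODE_EPILOGUE_DEALLOCATE_STACK:
    case CODE_EPILOGUE_DEALLOCATE_STACK_PR:
      return 1;
    default:
      return 0;
    }
}
```

Without the explicit _pr patterns, the predicated forms recognize as a different insn code and fall into the default case, which is exactly the PR52731 failure mode.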

ChangeLog entry:
2013-11-20  Kirill Yukhin  kirill.yuk...@intel.com

* config/ia64/ia64.md (prologue_allocate_stack): Block auto-
generation of predicated version.
(epilogue_deallocate_stack): Ditto.
(prologue_allocate_stack_pr): Add explicit predicated version.
(epilogue_deallocate_stack_pr): Ditto.

testsuite/ChangeLog entry:
2013-11-20  Kirill Yukhin  kirill.yuk...@intel.com

* gcc.target/ia64/pr52731.c: New. 

Is it ok for trunk?

--
Thanks, K

 gcc/config/ia64/ia64.c  |2 +
 gcc/config/ia64/ia64.md |   41 +-
 gcc/testsuite/gcc.target/ia64/pr52731.c |   19 ++
 3 files changed, 60 insertions(+), 2 deletions(-)

diff --git a/gcc/config/ia64/ia64.c b/gcc/config/ia64/ia64.c
index 4fde7aa..6ce7154 100644
--- a/gcc/config/ia64/ia64.c
+++ b/gcc/config/ia64/ia64.c
@@ -7146,7 +7146,9 @@ ia64_single_set (rtx insn)
   switch (recog_memoized (insn))
 {
 case CODE_FOR_prologue_allocate_stack:
+case CODE_FOR_prologue_allocate_stack_pr:
 case CODE_FOR_epilogue_deallocate_stack:
+case CODE_FOR_epilogue_deallocate_stack_pr:
   ret = XVECEXP (x, 0, 0);
   break;
 
diff --git a/gcc/config/ia64/ia64.md b/gcc/config/ia64/ia64.md
index 4d9d4e0..bc4e8cb 100644
--- a/gcc/config/ia64/ia64.md
+++ b/gcc/config/ia64/ia64.md
@@ -4652,6 +4652,8 @@
 
 ;; This prevents the scheduler from moving the SP decrement past FP-relative
 ;; stack accesses.  This is the same as adddi3 plus the extra set.
+;; An explicit predicated version of the insn is needed for the CODE_FOR_
+;; check in ia64_single_set, where despite having 2 sets this define_insn is OK.
 
 (define_insn "prologue_allocate_stack"
   [(set (match_operand:DI 0 "register_operand" "=r,r,r")
@@ -4664,10 +4666,31 @@
    add %0 = %1, %2
    adds %0 = %2, %1
    addl %0 = %2, %1"
-  [(set_attr "itanium_class" "ialu")])
+  [(set_attr "itanium_class" "ialu")
+   (set_attr "predicable" "no")])
+
+(define_insn "prologue_allocate_stack_pr"
+  [(cond_exec (match_operator 0 "predicate_operator"
+		[(match_operand:BI 1 "register_operand" "c,c,c")
+		 (const_int 0)])
+	      (parallel
+	       [(set (match_operand:DI 2 "register_operand" "=r,r,r")
+		     (plus:DI (match_operand:DI 3 "register_operand" "%r,r,a")
+			      (match_operand:DI 4 "gr_reg_or_22bit_operand" "r,I,J")))
+		(set (match_operand:DI 5 "register_operand" "+r,r,r")
+		     (match_dup 5))]))]
+  ""
+  "@
+   (%J0) add %2 = %3, %4
+   (%J0) adds %2 = %3, %4
+   (%J0) addl %2 = %3, %4"
+  [(set_attr "itanium_class" "ialu")
+   (set_attr "predicable" "no")])
 
 ;; This prevents the scheduler from moving the SP restore past FP-relative
 ;; stack accesses.  This is similar to movdi plus the extra set.
+;; An explicit predicated version of the insn is needed for the CODE_FOR_
+;; check in ia64_single_set, where despite having 2 sets this define_insn is OK.
 
 (define_insn "epilogue_deallocate_stack"
   [(set (match_operand:DI 0 "register_operand" "=r")
@@ -4675,7 +4698,21 @@
 	(set (match_dup 1) (match_dup 1))]
   ""
   "mov %0 = %1"
-  [(set_attr "itanium_class" "ialu")])
+  [(set_attr "itanium_class" "ialu")
+   (set_attr "predicable" "no")])
+
+(define_insn "epilogue_deallocate_stack_pr"
+  [(cond_exec (match_operator 0 "predicate_operator"
+		[(match_operand:BI 1 "register_operand" "c")
+		 (const_int 0)])
+	      (parallel
+	       [(set (match_operand:DI 2 "register_operand" "=r")
+		     (match_operand:DI 3 "register_operand" "+r"))
+		(set (match_dup 3) (match_dup 3))]))]
+  ""
+  "(%J0) mov %2 = %3"
+  [(set_attr "itanium_class" "ialu")
+   (set_attr "predicable" "no")])
 
 ;; As USE insns aren't meaningful after reload, this is used instead
 ;; to prevent deleting instructions setting registers for EH handling
diff --git a/gcc/testsuite/gcc.target/ia64/pr52731.c 
b/gcc/testsuite/gcc.target/ia64/pr52731.c
new file mode 100644
index 000..50ef1d7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/ia64/pr52731.c
@@ -0,0 +1,19 @@
+/* { dg-do compile { target ia64-*-* } } */
+/* { dg-options "-O2" } */
+
+char* area;
+long int area_size;
+char* base;
+
+void fun(unsigned long int addr)
+{
+  unsigned long int size32 = (addr + 4096 - 1) & ~(4096 - 1);
+  unsigned long int size = size32 * sizeof(unsigned int);
+
+  if (size > 0) {
+    size = (size + 1) & ~(1);
+  }
+
+  area_size = size;
+  area = base + size;
+}



Re: Factor unrelated declarations out of tree.h (1/2)

2013-11-20 Thread Diego Novillo
On Wed, Nov 20, 2013 at 10:10 AM, Trevor Saunders tsaund...@mozilla.com wrote:
 On Wed, Nov 20, 2013 at 08:43:57AM -0500, Diego Novillo wrote:
 On Wed, Nov 20, 2013 at 6:44 AM, Eric Botcazou ebotca...@adacore.com wrote:
  Thanks. Committed as rev 205023.
 
  Nice work, but why did you antedate the entries in the various ChangeLog

 Oh, that's because of local commits and holding on to the patch for a
 few days. That date is the date of the original local commit. Further
 pulls from upstream pushed the entries down in the file. I just forgot
 to move the entries forward when I committed the patch.

  directories (yes, we all know your opinion about ChangeLog files ;-)

 What gave you that idea? ;)

  This even seems like a good argument for that view, git log and I
  assume svn log could give you the same data without the possibility of
  this sort of issue ;)

This is the main reason why I think ChangeLogs are absolutely
worthless. But I don't want to hijack this thread to discuss ChangeLog
politics.


Diego.


Re: [PATCH, MPX, 2/X] Pointers Checker. Add flag to varpool_node

2013-11-20 Thread Richard Biener
On Tue, Nov 19, 2013 at 8:09 PM, Jeff Law l...@redhat.com wrote:
 On 11/19/13 07:06, Ilya Enkovich wrote:

 Hi,

 Here is a patch to add flag for marking symbols as requiring static
 initialization of bounds.  Used by Pointer Bounds Checker to handle
 statically initialized pointers and static bounds vars.

 Thanks,
 Ilya
 --
 2013-11-19  Ilya Enkovich  ilya.enkov...@intel.com

 * cgraph.h (varpool_node): Add need_bounds_init field.
 * lto-cgraph.c (lto_output_varpool_node): Output
 need_bounds_init value.
 (input_varpool_node): Read need_bounds_init value.
 * varpool.c (dump_varpool_node): Dump need_bounds_init field.

 OK.  This is fine to check in as long as any prerequisites are already in.

Please refrain from cluttering the tree with dead code.  Commit all patches
in one go once all patches are approved.

Thanks,
Richard.

 jeff



Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Richard Biener
On Wed, Nov 20, 2013 at 3:34 PM, Andrew MacLeod amacl...@redhat.com wrote:
 On 11/20/2013 09:12 AM, Richard Biener wrote:

 On Wed, Nov 20, 2013 at 2:35 PM, Andrew MacLeod amacl...@redhat.com
 wrote:

 This has been bugging me since I moved it out of gimple.h to gimplify.h.

 The gimplify context structure was exposed in the header file to allow a
 few
 other files to push and pop contexts off the gimplification stack.
 Unfortunately, the struct contains a hash-table template, which means it
 also required hash-table.h in order to even parse the header.  No file
 other
 than gimplify.c actually uses what is in the structure, so exposing it
 seems
 wrong.

 This patch primarily changes push_gimplify_context () and
 pop_gimplify_context () to grab a struct gimplify_ctx off a local stack
 pool
 rather than have each file's local function know about the structure and
 allocate it on their stack.  I didn't want to use malloc/free or there
 would
 be a lot of churn.  Its open to debate what the best approach is, but I
 decided to simply allocate a bunch on a static array local to gimplify.c.
 If it goes past the end of the local array, malloc the required
 structure.
 It's pretty simplistic, but functional and doesn't require any GTY stuff
 or
 other machinery.

 I also hacked up the compiler to report what the 'top' of the stack was
 for
 a compilation unit. I then ran it through a bootstrap and full testsuite
 run
 of all languages, and looked for the maximum number of context structs in
 use at one time.  Normally only 1 or 2 were ever in use, but it topped out
 at 7 during some of the OpenMP tests.  So my current limit of 30 will likely
 never be reached.
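
The allocation policy described above can be sketched like this (names, fields, and layout are hypothetical; only the static-array-with-malloc-overflow idea and the pool size are from the message):

```c
#include <stdlib.h>

/* A stand-in for struct gimplify_ctx, showing only the two fields the
   message says callers ever set at push time.  */
struct ctx { int into_ssa; int allow_rhs_cond_expr; };

#define CTX_POOL_SIZE 30

static struct ctx ctx_pool[CTX_POOL_SIZE];
static int ctx_depth;

/* Push a context: the first CTX_POOL_SIZE come from the static array,
   anything deeper falls back to malloc.  */
static struct ctx *
push_ctx (int into_ssa, int allow_rhs_cond_expr)
{
  struct ctx *c = ctx_depth < CTX_POOL_SIZE
		  ? &ctx_pool[ctx_depth]
		  : (struct ctx *) malloc (sizeof (struct ctx));
  ctx_depth++;
  c->into_ssa = into_ssa;
  c->allow_rhs_cond_expr = allow_rhs_cond_expr;
  return c;
}

/* Pop the top context; only overflow contexts were malloc'd, so only
   those are freed.  */
static void
pop_ctx (struct ctx *c)
{
  ctx_depth--;
  if (ctx_depth >= CTX_POOL_SIZE)
    free (c);
}
```

This keeps the common 1-2-deep case allocation-free while still being correct for the rare deep nestings, without GTY machinery.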

 only 2 fields were ever set by anyone outside of gimplify.c, the
 'into_ssa'
 and 'allow_rhs_cond_expr' fields, and they were always set
 immediately after pushing the context onto the stack... So I added them
 as
 parameters to the push and gave them default values.

 And thats it... this patch moves the hash function and structure
 completely
 within gimplify.c so no one needs to know anything about it, and removes
 the
 hash-table.h include prerequisite for parsing.

 Bootstraps on x86_64-unknown-linux-gnu with no new regressions. OK for
 mainline?

 The limit looks reasonable, but you could have used a simple linked
 list (and never free).  Also being able to pop a random context
 looks fragile ...  that is, pop_gimplify_context shouldn't have an
 argument.


 The argument to pop_gimplify_context isn't a context pointer, it's a gimple
 statement, and if present it calls declare_vars on the temps in the body... the
 routine always pops the current context.  So that doesn't seem too fragile?

Oh... I see.

 I had considered just a linked list and never freeing, but thought that
 policy might be frowned upon and trigger false positives on memory
 allocation analysis tools  :-)   I'm happy to do that though, it's quite
 trivial.

;)

Richard.


 Andrew




Re: [wwwdocs] Broken links

2013-11-20 Thread Philippe Baril Lecavalier


Embarrassing typos, my apologies. I was told to specify that I don't
have commit access, but since this mention is irrelevant, I modified my
suggested notice in about.html to reflect that. See attached.

Index: projects/beginner.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/beginner.html,v
retrieving revision 1.57
diff -u -r1.57 beginner.html
--- projects/beginner.html	29 Dec 2012 00:24:56 -	1.57
+++ projects/beginner.html	20 Nov 2013 15:13:58 -
@@ -32,7 +32,7 @@
 <a href="http://gcc.gnu.org/onlinedocs/gccint/RTL.html">RTL</a>.
 It may help to understand the higher-level <code>tree</code> structure as
 well.  Unfortunately, for this we only have an <a
-href="http://gcc.gnu.org/onlinedocs/gccint/Trees.html">incomplete, C/C++ specific manual</a>.</p>
+href="http://gcc.gnu.org/onlinedocs/gccint/GENERIC.html">incomplete, C/C++ specific manual</a>.</p>
 
 <p>Remember to <a href="../contributewhy.html">keep other developers
 informed</a> of any substantial projects you intend to work on.</p>
Index: projects/documentation.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/documentation.html,v
retrieving revision 1.5
diff -u -r1.5 documentation.html
--- projects/documentation.html	19 Sep 2007 22:40:29 -	1.5
+++ projects/documentation.html	20 Nov 2013 15:13:58 -
@@ -56,7 +56,7 @@
 
 <p>We've got quite a bit of this but it is scattered all over the
 place.  It belongs in the official manual.  There is a <a
-href="http://gcc.gnu.org/onlinedocs/gccint/Trees.html">C/C++ specific manual</a>,
+href="http://gcc.gnu.org/onlinedocs/gccint/GENERIC.html">C/C++ specific manual</a>,
 which is incomplete, though.  The file
 <code>gcc/LANGUAGES</code> contains incomplete and outdated information
 about changes made in not so recent years to the <code>tree</code>
Index: readings.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/readings.html,v
retrieving revision 1.233
diff -u -r1.233 readings.html
--- readings.html	31 Oct 2013 18:03:28 -	1.233
+++ readings.html	20 Nov 2013 15:13:58 -
@@ -401,7 +401,7 @@
 <h3>Objective-C Information</h3>
 
 <ul>
-  <li><a href="http://developer.apple.com/documentation/Cocoa/Conceptual/ObjectiveC/">
+  <li><a href="https://developer.apple.com/library/mac/documentation/cocoa/conceptual/ProgrammingWithObjectiveC/Introduction/Introduction.html">
   Objective-C Language Description</a> (Apple Computer)</li>
 </ul>
 
Index: about.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/about.html,v
retrieving revision 1.19
diff -u -r1.19 about.html
--- about.html	18 Aug 2013 20:38:58 -	1.19
+++ about.html	20 Nov 2013 15:13:58 -
@@ -69,6 +69,14 @@
 list.</li>
 </ol>
 
+<p>Whether or not you have commit access, you must send an email to <a
+href="mailto:gcc-patches@gcc.gnu.org">gcc-patches@gcc.gnu.org</a>
+and CC <a href="mailto:ger...@pfeifer.com">ger...@pfeifer.com</a>,
+with your patch as an attachment. Specify [wwwdocs] in the subject
+header and follow the instructions on <a
+href="http://gcc.gnu.org/contribute.html#patches">patch
+submission</a>, whenever applicable.</p>
+
 
 <h2 id="system">The host system</h2>
 


Re: [PATCH] [AArch64] Fix over length lines around aarch64_save_or_restore_fprs

2013-11-20 Thread Marcus Shawcroft

On 19/11/13 16:50, Kyrill Tkachov wrote:

On 19/11/13 16:39, Marcus Shawcroft wrote:

Committed.

/Marcus

2013-11-19  Marcus Shawcroft  marcus.shawcr...@arm.com

* config/aarch64/aarch64.c (aarch64_save_or_restore_fprs): Fix over
length lines.



Minor nit... but ENOPATCH ;)

Kyrill



Ooops, sorry.

Attached
/Marcus

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index cfda95e..4faa7cf 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1796,7 +1796,8 @@ aarch64_save_or_restore_fprs (int start_offset, int increment,
   unsigned regno;
   unsigned regno2;
   rtx insn;
-  rtx (*gen_mem_ref)(enum machine_mode, rtx) = (frame_pointer_needed)? gen_frame_mem : gen_rtx_MEM;
+  rtx (*gen_mem_ref)(enum machine_mode, rtx)
+= (frame_pointer_needed)? gen_frame_mem : gen_rtx_MEM;
 
 
   for (regno = V0_REGNUM; regno = V31_REGNUM; regno++)
@@ -1839,16 +1840,17 @@ aarch64_save_or_restore_fprs (int start_offset, int increment,
 		( gen_load_pairdf (gen_rtx_REG (DFmode, regno), mem,
    gen_rtx_REG (DFmode, regno2), mem2));
 
-		  add_reg_note (insn, REG_CFA_RESTORE, gen_rtx_REG (DFmode, regno));
-		  add_reg_note (insn, REG_CFA_RESTORE, gen_rtx_REG (DFmode, regno2));
+		  add_reg_note (insn, REG_CFA_RESTORE,
+gen_rtx_REG (DFmode, regno));
+		  add_reg_note (insn, REG_CFA_RESTORE,
+gen_rtx_REG (DFmode, regno2));
 		}
 
 		  /* The first part of a frame-related parallel insn
 		 is always assumed to be relevant to the frame
 		 calculations; subsequent parts, are only
 		 frame-related if explicitly marked.  */
-	  RTX_FRAME_RELATED_P (XVECEXP (PATTERN (insn), 0,
-	1)) = 1;
+	  RTX_FRAME_RELATED_P (XVECEXP (PATTERN (insn), 0, 1)) = 1;
 	  regno = regno2;
 	  start_offset += increment * 2;
 	}
@@ -1859,7 +1861,8 @@ aarch64_save_or_restore_fprs (int start_offset, int increment,
 	  else
 		{
 		  insn = emit_move_insn (gen_rtx_REG (DFmode, regno), mem);
-		  add_reg_note (insn, REG_CFA_RESTORE, gen_rtx_REG (DImode, regno));
+		  add_reg_note (insn, REG_CFA_RESTORE,
+gen_rtx_REG (DImode, regno));
 		}
 	  start_offset += increment;
 	}
-- 
1.7.9.5


Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Richard Biener
On Wed, Nov 20, 2013 at 4:06 PM, Trevor Saunders tsaund...@mozilla.com wrote:
 On Wed, Nov 20, 2013 at 03:18:07PM +0100, Richard Biener wrote:
 On Wed, Nov 20, 2013 at 3:16 PM, Jakub Jelinek ja...@redhat.com wrote:
  On Wed, Nov 20, 2013 at 03:12:34PM +0100, Richard Biener wrote:
  The limit looks reasonable, but you could have used a simple linked
  list (and never free).  Also being able to pop a random context
  looks fragile ...  that is, pop_gimplify_context shouldn't have an 
  argument.
 
  Can't we use stack_vec<gimplify_context, 30> for that?  Though that would
  mean a global var constructor and destructor, so alternatively just use
  a normal vec and .create(30) it somewhere during initialization?

 only with gimplify_context *, otherwise things will break during 
 re-allocation.

 hm? it seems like the only member of gimplify_ctx that can't just be
 memcpy'd is the prev pointer, which presumably could go away if you have a
 vec of all the contexts.

Callers have a pointer to gimplify_context AFAIK.

Richard.

 Trev


 Richard.

  Jakub


Re: [PATCH, i386, MPX, 2/X] Pointers Checker [21/25] Size relocation

2013-11-20 Thread Uros Bizjak
On Wed, Nov 20, 2013 at 1:32 PM, Ilya Enkovich enkovich@gmail.com wrote:

   Here is a patch to add size relocation and instruction to obtain 
   object's size in i386 target.
 
  +(define_insn "move_size_reloc_<mode>"
  +  [(set (match_operand:SWI48 0 "register_operand" "=r")
  +	(match_operand:<MODE> 1 "size_relocation" "Z"))]
  +  ""
  +{
  +  return "mov{<imodesuffix>}\t{%1, %0|%0, %1}";
 
  Please don't change x86_64_immediate_operand just to use the "Z"
  constraint.  The predicate is used in a couple of other places that for
  sure don't accept your change.
 
  Better write this insn in an explicit way (see for example
  tls_initial_exec_64_sun). Something like:
 
  (define_insn "move_size_reloc_<mode>"
    [(set (match_operand:SWI48 0 "register_operand" "=r")
	  (unspec:SWI48
	   [(match_operand 1 "symbolic_operand" ... )]
	   UNSPEC_SIZEOF))]
  ""
  "mov{<imodesuffix>}\t{%a1@SIZE, %0|%0, %a1@SIZE}")
 
  You will probably need to define new operand 1 predicate and constraint.
 
  Uros.
 
  Hi, Uros!  Thanks for comments!  Here is what I got trying to follow your 
  suggestion.  Does it look better?

 You actually don't need any operand modifiers in the insn template. Simply 
 use:

 mov{<imodesuffix>}\t{%1@SIZE, %0|%0, %1@SIZE}

 and you will automatically get

 movl $zzz, %eax or mov %eax, OFFSET FLAT: zzz.

 Since your pattern allows only symbolic_operand, there is no reload,
 so you can avoid the constraint alltogether.

 BTW: Did you consider various -mcmodel=... options? For DImode moves,
 you should check x86_64_zext_immediate_operand predicate and output
 either movl $zzz, %eax or movabs $zzz, %rax. There is no movq with
 64bit immediate. Please see movdi pattern.

 Yep, for large objects it may work wrongly. Does anyone use static objects
 > 4Gb? :)

 Large address does not mean large object but seems we have to be conservative 
 here. I added  x86_64_zext_immediate_operand check with additional CM_KERNEL 
 check because in this model object size should always fit 32 bits.

IMO, we should list code models that support object sizes > 31bits for
64bit target. The object size in small models will never be > 31bits
(and never negative), so we can use movl unconditionally.

Please try attached patch.

Uros.
Index: i386.md
===
--- i386.md (revision 205051)
+++ i386.md (working copy)
@@ -79,6 +79,7 @@
   UNSPEC_PLTOFF
   UNSPEC_MACHOPIC_OFFSET
   UNSPEC_PCREL
+  UNSPEC_SIZEOF
 
   ;; Prologue support
   UNSPEC_STACK_ALLOC
@@ -18458,6 +18459,30 @@
   "bndstx\t{%2, %3|%3, %2}"
   [(set_attr "type" "mpxst")])
 
+(define_insn "move_size_reloc_si"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(unspec:SI
+	 [(match_operand:SI 1 "symbol_operand")]
+	 UNSPEC_SIZEOF))]
+  "TARGET_MPX"
+  "mov{l}\t{%1@SIZE, %0|%0, %1@SIZE}"
+  [(set_attr "type" "imov")])
+
+(define_insn "move_size_reloc_di"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(unspec:DI
+	 [(match_operand:DI 1 "symbol_operand")]
+	 UNSPEC_SIZEOF))]
+  "TARGET_64BIT && TARGET_MPX"
+{
+  if (ix86_cmodel == CM_MEDIUM || ix86_cmodel == CM_LARGE
+      || ix86_cmodel == CM_MEDIUM_PIC || ix86_cmodel == CM_LARGE_PIC)
+    return "movabs{q}\t{%1@SIZE, %0|%0, %1@SIZE}";
+  else
+    return "mov{l}\t{%1@SIZE, %k0|%k0, %1@SIZE}";
+}
+  [(set_attr "type" "imov")])
+
 (include "mmx.md")
 (include "sse.md")
 (include "sync.md")
Index: predicates.md
===
--- predicates.md   (revision 205051)
+++ predicates.md   (working copy)
@@ -119,6 +119,10 @@
       (match_test "TARGET_64BIT")
       (match_test "REGNO (op) > BX_REG")))
 
+;; Return true if VALUE is a symbol reference
+(define_predicate "symbol_operand"
+  (match_code "symbol_ref"))
+
 ;; Return true if VALUE can be stored in a sign extended immediate field.
 (define_predicate "x86_64_immediate_operand"
   (match_code "const_int,symbol_ref,label_ref,const")


Re: [PATCH][AArch64] vneg testcase made big-endian safe

2013-11-20 Thread Marcus Shawcroft

On 19/11/13 17:29, Alex Velenko wrote:


gcc/testsuite/

2013-11-19  Alex Velenko  alex.vele...@arm.com

* gcc.target/aarch64/vneg_s.c (test_vneg_s8): fixed to not use
vector indexing.
(test_vneg_s16): Likewise.
(test_vneg_s32): Likewise.
(test_vneg_s64): Likewise.
(test_vnegq_s8): Likewise.
(test_vnegq_s16): Likewise.
(test_vnegq_s32): Likewise.
(test_vnegq_s64): Likewise.



Hi, Couple of nits:

-#define RUN_TEST(test_set, answ_set, reg_len, data_len, n, a, b)   \
-  {\

+#define RUN_TEST(test_set, answ_set, reg_len,  \
+ data_len, n, a, b, _a, _b)\
+{  \

This part of the patch changes the indentation for the body of RUN_TEST 
unnecessarily, please re-indent.


-  int16x4_t a;
-  int16x4_t b;
+  int16x4_t a, b;

Unnecessary layout change, please remove.

The ChangeLog should provide a detailed explanation of what changed. 
For example:


* gcc.target/aarch64/vneg_s.c (LANE_POSTFIX): New.
(INDEX64_8, INDEX64_16, INDEX64_32, INDEX64_64, INDEX128_8)
(INDEX128_16, INDEX128_32, INDEX128_64, INDEX): Removed.
(RUN_TEST): New formal arguments _a and  _b...
(test_vneg_s8): Define _a and _b.  Pass to RUN_TEST.
...
etc

Cheers
/Marcus



Re: [PATCH, i386, MPX, 2/X] Pointers Checker [21/25] Size relocation

2013-11-20 Thread Ilya Enkovich
2013/11/20 Uros Bizjak ubiz...@gmail.com:
 On Wed, Nov 20, 2013 at 1:32 PM, Ilya Enkovich enkovich@gmail.com wrote:

   Here is a patch to add size relocation and instruction to obtain 
   object's size in i386 target.
 
  +(define_insn "move_size_reloc_<mode>"
  +  [(set (match_operand:SWI48 0 "register_operand" "=r")
  +	(match_operand:<MODE> 1 "size_relocation" "Z"))]
  +  ""
  +{
  +  return "mov{<imodesuffix>}\t{%1, %0|%0, %1}";
 
  Please don't change x86_64_immediate_operand just to use the "Z"
  constraint.  The predicate is used in a couple of other places that for
  sure don't accept your change.
 
  Better write this insn in an explicit way (see for example
  tls_initial_exec_64_sun). Something like:
 
  (define_insn "move_size_reloc_<mode>"
    [(set (match_operand:SWI48 0 "register_operand" "=r")
	  (unspec:SWI48
	   [(match_operand 1 "symbolic_operand" ... )]
	   UNSPEC_SIZEOF))]
  ""
  "mov{<imodesuffix>}\t{%a1@SIZE, %0|%0, %a1@SIZE}")
 
  You will probably need to define new operand 1 predicate and constraint.
 
  Uros.
 
  Hi, Uros!  Thanks for comments!  Here is what I got trying to follow your 
  suggestion.  Does it look better?

 You actually don't need any operand modifiers in the insn template. Simply 
 use:

 mov{<imodesuffix>}\t{%1@SIZE, %0|%0, %1@SIZE}

 and you will automatically get

 movl $zzz, %eax or mov %eax, OFFSET FLAT: zzz.

 Since your pattern allows only symbolic_operand, there is no reload,
 so you can avoid the constraint alltogether.

 BTW: Did you consider various -mcmodel=... options? For DImode moves,
 you should check x86_64_zext_immediate_operand predicate and output
 either movl $zzz, %eax or movabs $zzz, %rax. There is no movq with
 64bit immediate. Please see movdi pattern.

 Yep, for large objects it may work wrongly. Does anyone use static objects
 > 4Gb? :)

 Large address does not mean large object but seems we have to be 
 conservative here. I added  x86_64_zext_immediate_operand check with 
 additional CM_KERNEL check because in this model object size should always 
 fit 32 bits.

 IMO, we should list code models that support object sizes > 31bits for
 64bit target. The object size in small models will never be > 31bits
 (and never negative), so we can use movl unconditionally.

For medium models x86_64_zext_immediate_operand returns true for
objects known to go to the lower 2Gb space.  It should allow us to use
movl.  Why do you always emit movabs for the medium model?

Ilya


 Please try attached patch.

 Uros.


Re: [PATCH, rs6000] Fix order of TDmode in FPRs in little-endian mode

2013-11-20 Thread David Edelsohn
On Mon, Nov 18, 2013 at 3:07 PM, Ulrich Weigand uweig...@de.ibm.com wrote:


 Also note that this patch does not change how TDmode values are loaded
 into GPRs: on little-endian, this means we do get the usual LE subreg
 order there (least significant word in lowest-numbered register).  This
 does still seem the right thing to do, because there are no instructions
 that operate on DFP values in GPRs, and because this allows us to keep
 using subregs on GPRs in that case, and it allows moving TDmode between
 memory and GPRs (e.g. for vararg arguments) in the usual way.

Does this require special support in GDB so that it understands how to
interpret this type in different classes of registers?  However, the
interpretation always will be consistent for a particular class of
register.

 Tested on powerpc64le-linux, fixes all remaining DFP-related test suite
 failures.

 OK for mainline?

 Bye,
 Ulrich


 ChangeLog:

 * config/rs6000/rs6000.c (rs6000_cannot_change_mode_class): Do not
 allow subregs of TDmode in FPRs of smaller size in little-endian.
 (rs6000_split_multireg_move): When splitting an access to TDmode
 in FPRs, do not use simplify_gen_subreg.

Okay.

Thanks, David


Re: [PATCH, PR 57363] IBM long double: adding qNaN and number raises inexact exception

2013-11-20 Thread David Edelsohn
On Wed, Nov 20, 2013 at 9:11 AM, Adhemerval Zanella
azane...@linux.vnet.ibm.com wrote:

 libgcc/ChangeLog:

 2013-11-20  Adhemerval Zanella  azane...@linux.vnet.ibm.com

 * config/rs6000/ibm-ldouble.c (__gcc_qadd): Fix add
 of normal number and qNaN to not raise an inexact exception.

 gcc/testsuite/ChangeLog:

 2013-11-20  Adhemerval Zanella  azane...@linux.vnet.ibm.com

 * gcc.target/powerpc/pr57363.c: New test.

This is okay.  And the patch is small enough that it does not need a
copyright assignment for GCC.

Do you need someone to commit the patch for you?

Thanks, David


[PATCH][ARM]Use of vcvt for float to fixed point conversions.

2013-11-20 Thread Renlin Li

Hi all,

This patch will make the arm back-end use vcvt for float to fixed point 
conversions when applicable.


Test on arm-none-linux-gnueabi has been done on the model.
Okay for trunk?


Kind regards,
Renlin Li


gcc/ChangeLog:

2013-11-20  Renlin Li  renlin...@arm.com

* config/arm/arm-protos.h (vfp_const_double_for_bits): Declare.
* config/arm/constraints.md (Dp): Define new constraint.
* config/arm/predicates.md (const_double_vcvt_power_of_two): Define
new predicate.
* config/arm/arm.c (arm_print_operand): Add print for new function.
(vfp3_const_double_for_bits): New function.
* config/arm/vfp.md (combine_vcvtf2i): Define new instruction.

gcc/testsuite/ChangeLog:

2013-11-20  Renlin Li  renlin...@arm.com

* gcc.target/arm/fixed_float_conversion.c: New test case.

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 944cf10..f2f8272 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -275,6 +275,8 @@ struct tune_params
 
 extern const struct tune_params *current_tune;
 extern int vfp3_const_double_for_fract_bits (rtx);
+/* Return power of two from operand, otherwise 0.  */
+extern int vfp3_const_double_for_bits (rtx);
 
 extern void arm_emit_coreregs_64bit_shift (enum rtx_code, rtx, rtx, rtx, rtx,
 	   rtx);
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 78554e8..72c4204 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -21175,7 +21175,11 @@ arm_print_operand (FILE *stream, rtx x, int code)
 
 case 'v':
 	gcc_assert (CONST_DOUBLE_P (x));
-	fprintf (stream, "#%d", vfp3_const_double_for_fract_bits (x));
+	int result;
+	result = vfp3_const_double_for_fract_bits (x);
+	if (result == 0)
+	  result = vfp3_const_double_for_bits (x);
+	fprintf (stream, "#%d", result);
 	return;
 
 /* Register specifier for vld1.16/vst1.16.  Translate the S register
@@ -28958,6 +28973,26 @@ vfp3_const_double_for_fract_bits (rtx operand)
 }
   return 0;
 }
+
+int
+vfp3_const_double_for_bits (rtx operand)
+{
+  REAL_VALUE_TYPE r0;
+
+  if (!CONST_DOUBLE_P (operand))
+return 0;
+
+  REAL_VALUE_FROM_CONST_DOUBLE (r0, operand);
+  if (exact_real_truncate (DFmode, r0))
+{
+  HOST_WIDE_INT value = real_to_integer (r0);
+  value = value  0x;
+  if ((value != 0) && ((value & (value - 1)) == 0))
+	return int_log2 (value);
+}
+
+  return 0;
+}
 
 /* Emit a memory barrier around an atomic sequence according to MODEL.  */
 
diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
index e2a3099..59ca4b6 100644
--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -31,7 +31,7 @@
 ;; 'H' was previously used for FPA.
 
 ;; The following multi-letter normal constraints have been used:
-;; in ARM/Thumb-2 state: Da, Db, Dc, Dd, Dn, Dl, DL, Do, Dv, Dy, Di, Dt, Dz
+;; in ARM/Thumb-2 state: Da, Db, Dc, Dd, Dn, Dl, DL, Do, Dv, Dy, Di, Dt, Dp, Dz
 ;; in Thumb-1 state: Pa, Pb, Pc, Pd, Pe
 ;; in Thumb-2 state: Pj, PJ, Ps, Pt, Pu, Pv, Pw, Px, Py
 
@@ -328,12 +328,18 @@
 (and (match_code "const_double")
      (match_test "TARGET_32BIT && TARGET_VFP_DOUBLE && vfp3_const_double_rtx (op)")))
 
-(define_constraint "Dt" 
+(define_constraint "Dt"
  "@internal
   In ARM/ Thumb2 a const_double which can be used with a vcvt.f32.s32 with fract bits operation"
  (and (match_code "const_double")
       (match_test "TARGET_32BIT && TARGET_VFP && vfp3_const_double_for_fract_bits (op)")))
 
+(define_constraint "Dp"
+ "@internal
+  In ARM/ Thumb2 a const_double which can be used with a vcvt.s32.f32 with bits operation"
+  (and (match_code "const_double")
+       (match_test "TARGET_32BIT && TARGET_VFP && vfp3_const_double_for_bits (op)")))
+
 (define_register_constraint "Ts" "(arm_restrict_it) ? LO_REGS : GENERAL_REGS"
  "For arm_restrict_it the core registers @code{r0}-@code{r7}.  GENERAL_REGS otherwise.")
 
diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
index 29e1e5c..2dac581 100644
--- a/gcc/config/arm/predicates.md
+++ b/gcc/config/arm/predicates.md
@@ -639,8 +639,13 @@
 
 (define_predicate "const_double_vcvt_power_of_two_reciprocal"
   (and (match_code "const_double")
-       (match_test "TARGET_32BIT && TARGET_VFP &&
-   		vfp3_const_double_for_fract_bits (op)")))
+       (match_test "TARGET_32BIT && TARGET_VFP
+		    && vfp3_const_double_for_fract_bits (op)")))
+
+(define_predicate "const_double_vcvt_power_of_two"
+  (and (match_code "const_double")
+       (match_test "TARGET_32BIT && TARGET_VFP
+		    && vfp3_const_double_for_bits (op)")))
 
 (define_predicate "neon_struct_operand"
   (and (match_code "mem")
diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
index 22b6325..aec17f9 100644
--- a/gcc/config/arm/vfp.md
+++ b/gcc/config/arm/vfp.md
@@ -1253,6 +1253,19 @@
(set_attr length 8)]
 )
 
+ (define_insn "*combine_vcvtf2i"
+   [(set (match_operand:SI 0 "s_register_operand" "=r")
+	(fix:SI (fix:SF (mult:SF (match_operand:SF 1 "s_register_operand" "t")
+				 (match_operand 2

Re: [PATCH, i386, MPX, 2/X] Pointers Checker [21/25] Size relocation

2013-11-20 Thread Uros Bizjak
On Wed, Nov 20, 2013 at 5:11 PM, Ilya Enkovich enkovich@gmail.com wrote:
 2013/11/20 Uros Bizjak ubiz...@gmail.com:
 On Wed, Nov 20, 2013 at 1:32 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:

   Here is a patch to add size relocation and instruction to obtain 
   object's size in i386 target.
 
  +(define_insn "move_size_reloc_<mode>"
  +  [(set (match_operand:SWI48 0 "register_operand" "=r")
  +	(match_operand:<MODE> 1 "size_relocation" "Z"))]
  +  ""
  +{
  +  return "mov{<imodesuffix>}\t{%1, %0|%0, %1}";
 
  Please don't change x86_64_immediate_operand just to use the "Z"
  constraint.  The predicate is used in a couple of other places that for
  sure don't accept your change.
 
  Better write this insn in an explicit way (see for example
  tls_initial_exec_64_sun). Something like:
 
  (define_insn "move_size_reloc_<mode>"
    [(set (match_operand:SWI48 0 "register_operand" "=r")
	  (unspec:SWI48
	   [(match_operand 1 "symbolic_operand" ... )]
	   UNSPEC_SIZEOF))]
  ""
  "mov{<imodesuffix>}\t{%a1@SIZE, %0|%0, %a1@SIZE}")
 
  You will probably need to define new operand 1 predicate and constraint.
 
  Uros.
 
  Hi, Uros!  Thanks for comments!  Here is what I got trying to follow 
  your suggestion.  Does it look better?

 You actually don't need any operand modifiers in the insn template. Simply 
 use:

 mov{<imodesuffix>}\t{%1@SIZE, %0|%0, %1@SIZE}

 and you will automatically get

 movl $zzz, %eax or mov %eax, OFFSET FLAT: zzz.

 Since your pattern allows only symbolic_operand, there is no reload,
 so you can avoid the constraint alltogether.

 BTW: Did you consider various -mcmodel=... options? For DImode moves,
 you should check x86_64_zext_immediate_operand predicate and output
 either movl $zzz, %eax or movabs $zzz, %rax. There is no movq with
 64bit immediate. Please see movdi pattern.

 Yep, for large objects it may work wrongly. Does anyone use static objects
 > 4Gb? :)

 Large address does not mean large object but seems we have to be 
 conservative here. I added  x86_64_zext_immediate_operand check with 
 additional CM_KERNEL check because in this model object size should always 
 fit 32 bits.

 IMO, we should list code models that support object sizes > 31bits for
 64bit target. The object size in small models will never be > 31bits
 (and never negative), so we can use movl unconditionally.

 For medium models x86_64_zext_immediate_operand returns true for
 objects known to go to the lower 2Gb space.  It should allow us to use
 movl.  Why do you always emit movabs for the medium model?

CM_MEDIUM has unlimited data size.

i386-opts.h:  CM_MEDIUM,	/* Assumes code fits in the low 31
				   bits; data unlimited.  */

The x86_64_zext_immediate_operand allows _address_ to be loaded by
movl. The @SIZE reloc returns object size, which is unlimited and has
no connection to its address. For CM_MEDIUM,
x86_64_zext_immediate_operand allows:

  return (ix86_cmodel == CM_SMALL
	  || (ix86_cmodel == CM_MEDIUM
	      && !SYMBOL_REF_FAR_ADDR_P (op)));

Uros.


Re: [PATCH, i386, MPX, 2/X] Pointers Checker [22/25] Target builtins

2013-11-20 Thread Uros Bizjak
Hello!

 Here is a patch introducing i386 target versions of Pointer Bounds Checker 
 builtins.

 2013-11-15  Ilya Enkovich  ilya.enkov...@intel.com

 * config/i386/i386-builtin-types.def (BND): New.
 (ULONG): New.
 (BND_FTYPE_PCVOID_ULONG): New.
 (VOID_FTYPE_BND_PCVOID): New.
(VOID_FTYPE_PCVOID_PCVOID_BND): New.
 (BND_FTYPE_PCVOID_PCVOID): New.
 (BND_FTYPE_PCVOID): New.
 (BND_FTYPE_BND_BND): New.
 (PVOID_FTYPE_PVOID_PVOID_ULONG): New.
 (PVOID_FTYPE_PCVOID_BND_ULONG): New.
 (ULONG_FTYPE_VOID): New.
 (PVOID_FTYPE_BND): New.
 * config/i386/i386.c: Include tree-chkp.h, rtl-chkp.h.
 (ix86_builtins): Add
 IX86_BUILTIN_BNDMK, IX86_BUILTIN_BNDSTX,
 IX86_BUILTIN_BNDLDX, IX86_BUILTIN_BNDCL,
 IX86_BUILTIN_BNDCU, IX86_BUILTIN_BNDRET,
 IX86_BUILTIN_BNDSET, IX86_BUILTIN_BNDNARROW,
 IX86_BUILTIN_BNDINT, IX86_BUILTIN_ARG_BND,
 IX86_BUILTIN_SIZEOF, IX86_BUILTIN_BNDLOWER,
 IX86_BUILTIN_BNDUPPER.
 (builtin_isa): Add leaf_p and nothrow_p fields.
 (def_builtin): Initialize leaf_p and nothrow_p.
 (ix86_add_new_builtins): Handle leaf_p and nothrow_p
 flags.
 (bdesc_mpx): New.
 (bdesc_mpx_const): New.
 (ix86_init_mpx_builtins): New.
 (ix86_init_builtins): Call ix86_init_mpx_builtins.
 (ix86_expand_builtin): expand IX86_BUILTIN_BNDMK,
 IX86_BUILTIN_BNDSTX, IX86_BUILTIN_BNDLDX,
 IX86_BUILTIN_BNDCL, IX86_BUILTIN_BNDCU,
 IX86_BUILTIN_BNDRET, IX86_BUILTIN_BNDSET,
 IX86_BUILTIN_BNDNARROW, IX86_BUILTIN_BNDINT,
 IX86_BUILTIN_ARG_BND, IX86_BUILTIN_SIZEOF,
 IX86_BUILTIN_BNDLOWER, IX86_BUILTIN_BNDUPPER.

+  /* Avoid registers which cannot be used as index.  */
+  if (REGNO (op1) == VIRTUAL_INCOMING_ARGS_REGNUM
+  || REGNO (op1) == VIRTUAL_STACK_VARS_REGNUM
+  || REGNO (op1) == VIRTUAL_OUTGOING_ARGS_REGNUM)

You can use index_register_operand predicate here.

Uros.


Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Andrew MacLeod

On 11/20/2013 10:51 AM, Richard Biener wrote:

On Wed, Nov 20, 2013 at 4:06 PM, Trevor Saunders tsaund...@mozilla.com wrote:

On Wed, Nov 20, 2013 at 03:18:07PM +0100, Richard Biener wrote:

On Wed, Nov 20, 2013 at 3:16 PM, Jakub Jelinek ja...@redhat.com wrote:

On Wed, Nov 20, 2013 at 03:12:34PM +0100, Richard Biener wrote:

The limit looks reasonable, but you could have used a simple linked
list (and never free).  Also being able to pop a random context
looks fragile ...  that is, pop_gimplify_context shouldn't have an argument.

Can't we use stack_vec<gimplify_context, 30> for that?  Though that would
mean a global var constructor and destructor, so alternatively just use
a normal vec and .create(30) it somewhere during initialization?

only with gimplify_context *, otherwise things will break during re-allocation.

hm? it seems like the only member of gimplify_ctx that can't just be
memcpy'd is the prev pointer, which presumably could go away if you have a
vec of all the contexts.

Callers have a pointer to gimplify_context AFAIK.




No one except gimplify.c can have a pointer to the gimplify_context, so 
it's contained within gimplify.c.  Pretty much everything is based off 
the current context pointer (gimplify_ctxp).  There are places where the 
address of a field within that context structure is passed to another 
routine.  If that routine then eventually triggered another 
push/pop_context call, the address underneath could be changed... and 
chaos ensues.


I don't know if that does happen, but it is a possibility and I don't see 
the need to find out.  So a simple allocation scheme has the minimal 
impact on the code, and my preference is to leave it as it is, or 
otherwise do the simple linked-list malloc-ing only as necessary.


Want me to change it, or leave it as is?

Andrew







Re: [PATCH, i386, MPX, 2/X] Pointers Checker [21/25] Size relocation

2013-11-20 Thread Ilya Enkovich
2013/11/20 Uros Bizjak ubiz...@gmail.com:
 On Wed, Nov 20, 2013 at 5:11 PM, Ilya Enkovich enkovich@gmail.com wrote:
 2013/11/20 Uros Bizjak ubiz...@gmail.com:
 On Wed, Nov 20, 2013 at 1:32 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:

   Here is a patch to add size relocation and instruction to obtain 
   object's size in i386 target.
 
  +(define_insn "move_size_reloc_<mode>"
  +  [(set (match_operand:SWI48 0 "register_operand" "=r")
  +	(match_operand:<MODE> 1 "size_relocation" "Z"))]
  +  ""
  +{
  +  return "mov{<imodesuffix>}\t{%1, %0|%0, %1}";
 
  Please don't change x86_64_immediate_operand just to use the "Z"
  constraint.  The predicate is used in a couple of other places that for
  sure don't accept your change.
 
  Better write this insn in an explicit way (see for example
  tls_initial_exec_64_sun). Something like:
 
  (define_insn "move_size_reloc_<mode>"
    [(set (match_operand:SWI48 0 "register_operand" "=r")
	  (unspec:SWI48
	   [(match_operand 1 "symbolic_operand" ... )]
	   UNSPEC_SIZEOF))]
  ""
  "mov{<imodesuffix>}\t{%a1@SIZE, %0|%0, %a1@SIZE}")
 
  You will probably need to define new operand 1 predicate and 
  constraint.
 
  Uros.
 
  Hi, Uros!  Thanks for comments!  Here is what I got trying to follow 
  your suggestion.  Does it look better?

 You actually don't need any operand modifiers in the insn template. 
 Simply use:

 mov{<imodesuffix>}\t{%1@SIZE, %0|%0, %1@SIZE}

 and you will automatically get

 movl $zzz, %eax or mov %eax, OFFSET FLAT: zzz.

 Since your pattern allows only symbolic_operand, there is no reload,
 so you can avoid the constraint alltogether.

 BTW: Did you consider various -mcmodel=... options? For DImode moves,
 you should check x86_64_zext_immediate_operand predicate and output
 either movl $zzz, %eax or movabs $zzz, %rax. There is no movq with
 64bit immediate. Please see movdi pattern.

 Yep, for large objects it may work wrongly. Does anyone use static objects
 > 4Gb? :)

 Large address does not mean large object but seems we have to be 
 conservative here. I added  x86_64_zext_immediate_operand check with 
 additional CM_KERNEL check because in this model object size should always 
 fit 32 bits.

 IMO, we should list code models that support object sizes > 31bits for
 64bit target. The object size in small models will never be > 31bits
 (and never negative), so we can use movl unconditionally.

 For medium models x86_64_zext_immediate_operand returns true for
 objects known to go to the lower 2Gb space.  It should allow us to use
 movl.  Why do you always emit movabs for the medium model?

 CM_MEDIUM has unlimited data size.

 i386-opts.h:  CM_MEDIUM,	/* Assumes code fits in the low 31
				   bits; data unlimited.  */

 The x86_64_zext_immediate_operand allows _address_ to be loaded by
 movl. The @SIZE reloc returns object size, which is unlimited and has
 no connection to its address. For CM_MEDIUM,
 x86_64_zext_immediate_operand allows:

   return (ix86_cmodel == CM_SMALL
	   || (ix86_cmodel == CM_MEDIUM
	       && !SYMBOL_REF_FAR_ADDR_P (op)));

Yes, but large symbols never have SYMBOL_FLAG_FAR_ADDR set.

Ilya


 Uros.


Re: [PATCH, rs6000] Fix order of TDmode in FPRs in little-endian mode

2013-11-20 Thread Ulrich Weigand
David Edelsohn wrote:
 On Mon, Nov 18, 2013 at 3:07 PM, Ulrich Weigand uweig...@de.ibm.com wrote:
  Also note that this patch does not change how TDmode values are loaded
  into GPRs: on little-endian, this means we do get the usual LE subreg
  order there (least significant word in lowest-numbered register).  This
  does still seem the right thing to do, because there are no instructions
  that operate on DFP values in GPRs, and because this allows us to keep
  using subregs on GPRs in that case, and it allows moving TDmode between
  memory and GPRs (e.g. for vararg arguments) in the usual way.
 
 Does this require special support in GDB so that it understands how to
 interpret this type in different classes of registers?  However, the
 interpretation always will be consistent for a particular class of
 register.

Ah, good point.  Yes, GDB will need to be changed, but this looks
to be straightforward.  GDB already uses pseudo registers to
represent _Decimal128 in FPR pairs, they simply need to be glued
together the right way on little-endian.  I'll fix this.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: [PATCH, i386, MPX, 2/X] Pointers Checker [21/25] Size relocation

2013-11-20 Thread Ilya Enkovich
2013/11/20 Ilya Enkovich enkovich@gmail.com:
 2013/11/20 Uros Bizjak ubiz...@gmail.com:
 On Wed, Nov 20, 2013 at 5:11 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/20 Uros Bizjak ubiz...@gmail.com:
 On Wed, Nov 20, 2013 at 1:32 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:

   Here is a patch to add size relocation and instruction to obtain 
   object's size in i386 target.
 
  +(define_insn "move_size_reloc_<mode>"
  +  [(set (match_operand:SWI48 0 "register_operand" "=r")
  +	(match_operand:<MODE> 1 "size_relocation" "Z"))]
  +  ""
  +{
  +  return "mov{<imodesuffix>}\t{%1, %0|%0, %1}";
 
  Please don't change x86_64_immediate_operand just to use the "Z"
  constraint.  The predicate is used in a couple of other places that for
  sure don't accept your change.
 
  Better write this insn in an explicit way (see for example
  tls_initial_exec_64_sun). Something like:
 
  (define_insn "move_size_reloc_<mode>"
    [(set (match_operand:SWI48 0 "register_operand" "=r")
	  (unspec:SWI48
	   [(match_operand 1 "symbolic_operand" ... )]
	   UNSPEC_SIZEOF))]
  ""
  "mov{<imodesuffix>}\t{%a1@SIZE, %0|%0, %a1@SIZE}")
 
  You will probably need to define new operand 1 predicate and 
  constraint.
 
  Uros.
 
  Hi, Uros!  Thanks for comments!  Here is what I got trying to follow 
  your suggestion.  Does it look better?

 You actually don't need any operand modifiers in the insn template. 
 Simply use:

 mov{<imodesuffix>}\t{%1@SIZE, %0|%0, %1@SIZE}

 and you will automatically get

 movl $zzz, %eax or mov %eax, OFFSET FLAT: zzz.

 Since your pattern allows only symbolic_operand, there is no reload,
 so you can avoid the constraint alltogether.

 BTW: Did you consider various -mcmodel=... options? For DImode moves,
 you should check x86_64_zext_immediate_operand predicate and output
 either movl $zzz, %eax or movabs $zzz, %rax. There is no movq with
 64bit immediate. Please see movdi pattern.

 Yep, for large objects it may work wrongly. Does anyone use static
 objects > 4Gb? :)

 Large address does not mean large object but seems we have to be 
 conservative here. I added  x86_64_zext_immediate_operand check with 
 additional CM_KERNEL check because in this model object size should 
 always fit 32 bits.

 IMO, we should list code models that support object sizes > 31bits for
 64bit target. The object size in small models will never be > 31bits
 (and never negative), so we can use movl unconditionally.

 For medium models x86_64_zext_immediate_operand returns true for
 objects known to go to the lower 2Gb space.  It should allow us to use
 movl.  Why do you always emit movabs for the medium model?

 CM_MEDIUM has unlimited data size.

 i386-opts.h:  CM_MEDIUM,	/* Assumes code fits in the low 31
				   bits; data unlimited.  */

 The x86_64_zext_immediate_operand allows _address_ to be loaded by
 movl. The @SIZE reloc returns object size, which is unlimited and has
 no connection to its address. For CM_MEDIUM,
 x86_64_zext_immediate_operand allows:

   return (ix86_cmodel == CM_SMALL
	   || (ix86_cmodel == CM_MEDIUM
	       && !SYMBOL_REF_FAR_ADDR_P (op)));

 Yes, but large symbols never have SYMBOL_FLAG_FAR_ADDR set.

Sorry, I mean all large symbols have this flag set.


 Ilya


 Uros.


Re: [PATCH][1/3] Re-submission of Altera Nios II port, gcc parts

2013-11-20 Thread Chung-Lin Tang
On 13/11/20 1:34 AM, Richard Sandiford wrote:
 Chung-Lin Tang clt...@codesourcery.com writes:
 +;;  Integer logical Operations
 +
 +(define_code_iterator LOGICAL [and ior xor])
 +(define_code_attr logical_asm [(and "and") (ior "or") (xor "xor")])
 +
 +(define_insn "<code>si3"
 +  [(set (match_operand:SI 0 "register_operand" "=r,r,r")
 +	(LOGICAL:SI (match_operand:SI 1 "register_operand" "%r,r,r")
 +		    (match_operand:SI 2 "logical_operand"  "rM,J,K")))]
 +  ""
 +  "@
 +   <logical_asm>\\t%0, %1, %z2
 +   <logical_asm>%i2\\t%0, %1, %2
 +   <logical_asm>h%i2\\t%0, %1, %U2"
 +  [(set_attr "type" "alu")])
 +
 +(define_insn "*norsi3"
 +  [(set (match_operand:SI 0 "register_operand"  "=r")
 +	(and:SI (not:SI (match_operand:SI 1 "register_operand"  "%r"))
 +		(not:SI (match_operand:SI 2 "reg_or_0_operand"  "rM"))))]
 +  ""
 +  "nor\\t%0, %1, %z2"
 +  [(set_attr "type" "alu")])

 M constraints (for const0_rtx) and reg_or_0 seem unnecessary, no such
 RTL should make it to this point.

 Such RTL does appear under -O0. Removing the 'M' will also
 require a bit of re-working the operand printing mechanics; not a lot of
 work, but I'd rather keep it as is. The behavior of using the zero
 register for a 0-value is also more expected in nios2, I think.

Will any RTL-level passes, say for example, propagate a zero-constant
into the pattern, without also simplifying it to a NOT pattern? (that's
a question, I don't really know)

Personally, I haven't really seen many cases optimizable to NOR, but I
think keeping the 'M' there is pretty harmless.

 Hmm, but if we get (not (const_int 0)) then that sounds like a bug,
 since (and (not X) (not (const_int 0))) should reduce to (not X).
 IMO target-independent code shouldn't try to create the nor-with-0
 form and backends shouldn't match it.
 
 Why would removing 'M' affect the printing mechanism?  Naively I'd
 have expected:

The *norsi3 here is of course straightforward. I was referring to other
cases, like the AND/IOR/XOR pattern above, where I wanted to combine
them into a single alternative. That needs a bit more work to reorganize
the nios2_print_operand() cases.

 (define_insn "*norsi3"
   [(set (match_operand:SI 0 "register_operand"  "=r")
	(and:SI (not:SI (match_operand:SI 1 "register_operand"  "r"))
		(not:SI (match_operand:SI 2 "register_operand"  "r"))))]
   ""
   "nor\\t%0, %1, %2"
   [(set_attr "type" "alu")])
 
 to just work.

That will definitely work, though I don't think the zero case does any harm.
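
For reference, the simplification being discussed is just the underlying
boolean identity: a NOR with a constant-zero operand degenerates to a plain
NOT, which is why target-independent code should never hand this pattern a
(const_int 0). A quick self-contained C check of the identity (not GCC code,
just the bitwise algebra the "nor" instruction implements):

```c
#include <assert.h>
#include <stdint.h>

/* nor (x, y) as the nios2 "nor" instruction computes it: ~(x | y).
   With y == 0 this is exactly ~x, i.e. (and (not X) (not 0)) == (not X).  */
static uint32_t
nor32 (uint32_t x, uint32_t y)
{
  return ~(x | y);
}
```

So if a (not (const_int 0)) operand ever reaches the insn, something upstream
failed to simplify, which matches the point that the backend need not match
that form.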

 +;; Integer comparisons
 +
 +(define_code_iterator EQNE [eq ne])
 +(define_insn "nios2_cmp<code>"
 +  [(set (match_operand:SI 0 "register_operand"          "=r")
 +	(EQNE:SI (match_operand:SI 1 "reg_or_0_operand" "%rM")
 +		 (match_operand:SI 2 "arith_operand"    "rI")))]
 +  ""
 +  "cmp<code>%i2\\t%0, %z1, %z2"
 +  [(set_attr "type" "alu")])

 Once again, using reg_or_0 and M seems pointless.

 The compares don't support all operations, with only the second operand
 capable of an immediate. Using 'rM' should theoretically allow more
 commutative swapping.
 
 But rtl-wise, we should never generate an EQ or NE with two constants.
 And if one operand is constant then it's supposed to be the second.
 
 The % should give commutativity on its own, without the M.

For EQ/NE I guess that's the case, for the other comparisons I'm not so
sure; I'm not familiar enough with the details of the expander machinery
to claim anything.

If this doesn't carry to other comparisons, I intend to keep it in line
with the other cmp patterns. I experimented a bit with the generated
code, nothing is affected.

 +  emit_insn
 +    (gen_rtx_SET (Pmode, tmp,
 +		  gen_int_mode (cfun->machine->save_regs_offset,
 +				Pmode)));

 Shouldn't have a mode on the SET, but really should just call
 emit_move_insn. Similar cases elsewhere, search for gen_rtx_SET (Pmode.

 I've removed the modes on SET, though I prefer the more bare generation
 of the insns in some contexts; emit_move_insn() seems to have a lot
 under the hood.
 
 There shouldn't be anything to be afraid of though.  Target-independent
 code would use emit_move_insn for this though, so it needs to just work.

It will work, and I did use it in some places, though I did not
exhaustively search-and-replace.

For (subtle) performance reasons, emit_move_insn() really does too much
as a backend utility. Usually backend code is already very precise on
what we want to generate.

 +  HOST_WIDE_INT var_size;   /* # of var. bytes allocated.  */

 Not the first time they occur in this file, but I suppose I should
 mention that these in-line comments are probably just outside our coding
 guidelines.

 Deleted the comments outside the struct defs.
 
 FWIW, I think it was more that comments should be above the field rather
 than tagged on the right.  (One of the big problems with right-column comments
 is that people tend to make them too short.)

I will change to use the comment-above-the-field style.

Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Andrew MacLeod

On 11/20/2013 11:30 AM, Andrew MacLeod wrote:

On 11/20/2013 10:51 AM, Richard Biener wrote:
On Wed, Nov 20, 2013 at 4:06 PM, Trevor Saunders 
tsaund...@mozilla.com wrote:

On Wed, Nov 20, 2013 at 03:18:07PM +0100, Richard Biener wrote:
On Wed, Nov 20, 2013 at 3:16 PM, Jakub Jelinek ja...@redhat.com 
wrote:

On Wed, Nov 20, 2013 at 03:12:34PM +0100, Richard Biener wrote:

The limit looks reasonable, but you could have used a simple linked
list (and never free).  Also being able to pop a random context
looks fragile ...  that is, pop_gimplify_context shouldn't have 
an argument.
Can't we use stack_vec<gimplify_context, 30> for that?  Though that would
mean a global var constructor and destructor, so alternatively just use
a normal vec and .create(30) it somewhere during initialization?

only with gimplify_context *, otherwise things will break during
re-allocation.

hm? it seems like the only member of gimplify_ctx that can't just be
memcpy'd is the prev pointer, which presumably could go away if you
have a vec of all the contexts.

Callers have a pointer to gimplify_context AFAIK.




No one except gimplify.c can have a pointer to the gimplify_context, 
so it's contained to within gimplify.c.  Pretty much everything is 
based off the current context pointer (gimplify_ctxp).  There are 
places where the address of a field within that context structure is 
passed to another routine.  If that routine then eventually triggered 
another push/pop_context call, the address underneath could be 
changed... and chaos ensues.


I don't know if that does happen, but it is a possibility and I don't 
see the need to find out.  So a simple allocation scheme has the 
minimal impact on the code, and my preference is to leave it as it is, 
or otherwise do the simple linked list, malloc-ing only as necessary.


Want me to change it, or leave it as is?
In fact, I like the code better for the linked list... it's even simpler.  
I've attached that patch.   It's currently bootstrapping and I'll run the 
tests.   And it wouldn't need a special testcase since there are no edge 
cases.
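
To make the scheme concrete, here is a minimal, self-contained C sketch of the
pool-with-fallback allocation described above. The names ctx_alloc/ctx_free
follow the patch's ChangeLog, but the struct body, the pool size, and the
initialization logic here are placeholder assumptions, not the real
gimplify_ctx:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for struct gimplify_ctx: only the fields the pool needs.  */
struct ctx
{
  struct ctx *prev_context;   /* free-list / context-stack link.  */
  bool malloced;              /* true if not from the static pool.  */
};

#define POOL_SIZE 30
static struct ctx pool[POOL_SIZE];
static struct ctx *free_list;
static bool pool_init_done;

/* Grab a context: reuse a pooled one, else fall back to malloc.  */
static struct ctx *
ctx_alloc (void)
{
  struct ctx *c;
  if (!pool_init_done)
    {
      /* Thread the static pool onto the free list once.  */
      for (int i = 0; i < POOL_SIZE - 1; i++)
	pool[i].prev_context = &pool[i + 1];
      free_list = &pool[0];
      pool_init_done = true;
    }
  if (free_list)
    {
      c = free_list;
      free_list = c->prev_context;
      memset (c, 0, sizeof *c);
    }
  else
    {
      c = calloc (1, sizeof *c);
      c->malloced = true;       /* Remember to really free it.  */
    }
  return c;
}

/* Return a context: pool entries go back on the free list.  */
static void
ctx_free (struct ctx *c)
{
  if (c->malloced)
    free (c);
  else
    {
      c->prev_context = free_list;
      free_list = c;
    }
}
```

The 'malloced' flag is what lets pop_gimplify_context hand any context back
without knowing where it came from, which is the fallback Jakub wants tested.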


Which is your preference?

Andrew


	* gimplify.h (gimplify_hasher : typed_free_remove, struct gimplify_ctx):
	Move to gimplify.c.
	* gimplify.c (gimplify_hasher:typed_free_remove): Relocate here.
	(struct gimplify_ctx): Relocate here and add 'malloced' field.
	(gimplify_ctxp): Make static.
	(ctx_pool, ctx_alloc, ctx_free): Manage a list of struct gimplify_ctx.
	(push_gimplify_context): Add default parameters and allocate a 
	struct from the pool.
	(pop_gimplify_context): Free a struct back to the pool.
	(gimplify_scan_omp_clauses, gimplify_omp_parallel, gimplify_omp_task,
	gimplify_omp_workshare, gimplify_transaction, gimplify_body): Don't
	use a local 'struct gimplify_ctx'.
	* gimplify-me.c (force_gimple_operand_1, gimple_regimplify_operands):
	Likewise.
	* omp-low.c (lower_omp_sections, lower_omp_single, lower_omp_master,
	lower_omp_ordered, lower_omp_critical, lower_omp_for,
	create_task_copyfn, lower_omp_taskreg, lower_omp_target,
	lower_omp_teams, execute_lower_omp): Likewise.
	* gimple-fold.c (gimplify_and_update_call_from_tree): Likewise.
	* tree-inline.c (optimize_inline_calls): Likewise.

Index: gimplify.h
===
*** gimplify.h	(revision 205035)
--- gimplify.h	(working copy)
*** enum gimplify_status {
*** 48,86 
GS_OK		= 0,	/* We did something, maybe more to do.  */
GS_ALL_DONE	= 1	/* The expression is fully gimplified.  */
  };
- /* Gimplify hashtable helper.  */
  
! struct gimplify_hasher : typed_free_remove <elt_t>
! {
!   typedef elt_t value_type;
!   typedef elt_t compare_type;
!   static inline hashval_t hash (const value_type *);
!   static inline bool equal (const value_type *, const compare_type *);
! };
! 
! struct gimplify_ctx
! {
!   struct gimplify_ctx *prev_context;
! 
!   vec<gimple> bind_expr_stack;
!   tree temps;
!   gimple_seq conditional_cleanups;
!   tree exit_label;
!   tree return_temp;
! 
!   vec<tree> case_labels;
!   /* The formal temporary table.  Should this be persistent?  */
!   hash_table <gimplify_hasher> temp_htab;
! 
!   int conditions;
!   bool save_stack;
!   bool into_ssa;
!   bool allow_rhs_cond_expr;
!   bool in_cleanup_point_expr;
! };
! 
! extern struct gimplify_ctx *gimplify_ctxp;
! extern void push_gimplify_context (struct gimplify_ctx *);
  extern void pop_gimplify_context (gimple);
  extern gimple gimple_current_bind_expr (void);
  extern vec<gimple> gimple_bind_expr_stack (void);
--- 48,56 
GS_OK		= 0,	/* We did something, maybe more to do.  */
GS_ALL_DONE	= 1	/* The expression is fully gimplified.  */
  };
  
! extern void push_gimplify_context (bool in_ssa = false,
!    bool rhs_cond_ok = false);
  extern void pop_gimplify_context (gimple);
  extern gimple gimple_current_bind_expr (void);
  extern vec<gimple> gimple_bind_expr_stack (void);
Index: gimplify.c

Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Jeff Law

On 11/20/13 06:44, Jakub Jelinek wrote:

On Wed, Nov 20, 2013 at 08:35:28AM -0500, Andrew MacLeod wrote:

I also hacked up the compiler to report what the 'top' of the stack
was for a compilation unit. I then ran it through a bootstrap and
full testsuite run of all languages, and looked for the maximum
number of context structs in use at one time.  Normally only 1 or 2
were ever in use, but it topped out at 7 during some of the openmp
tests.  So my current limit of 30 will likely never be reached.


???  You can trivially create a testcase where it will be reached
(and you certainly should, so that you actually test the fallback).
I was going to suggest hacking things so that he only had 2, then do a 
bootstrap/comparison to test the fallback.  Obviously that would be for 
testing purposes only.


However, a test we can put into the suite is even better :-)
jeff



Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Jeff Law

On 11/20/13 06:56, Andrew MacLeod wrote:

On 11/20/2013 08:44 AM, Jakub Jelinek wrote:

On Wed, Nov 20, 2013 at 08:35:28AM -0500, Andrew MacLeod wrote:

I also hacked up the compiler to report what the 'top' of the stack
was for a compilation unit. I then ran it through a bootstrap and
full testsuite run of all languages, and looked for the maximum
number of context structs in use at one time.  Normally only 1 or 2
were ever in use, but it topped out at 7 during some of the openmp
tests.  So my current limit of 30 will likely never be reached.

???  You can trivially create a testcase where it will be reached
(and you certainly should, so that you actually test the fallback).


Not something we're likely to see every day, but excellent!  a testcase
is good :-) I'll add this as a testcase.

Do you think the current limit of 30 is reasonable?
Yes, if you've verified the fallback works ;-)  You probably could even 
drop it to 10ish.

jeff


Fix PR target/59207

2013-11-20 Thread Eric Botcazou
This fixes an uninitialized variable introduced by the double_int rewrite.

Tested on SPARC/Solaris, applied on the mainline and 4.8 branch.


2013-11-20  Eric Botcazou ebotca...@adacore.com

PR target/59207
* config/sparc/sparc.c (sparc_fold_builtin) <case CODE_FOR_pdist_vis>:
Make sure neg2_ovf is set before being used.


-- 
Eric Botcazou

Index: config/sparc/sparc.c
===
--- config/sparc/sparc.c	(revision 205090)
+++ config/sparc/sparc.c	(working copy)
@@ -10658,7 +10658,8 @@ sparc_fold_builtin (tree fndecl, int n_a
 	  tmp = e0.add_with_sign (tmp, false, add1_ovf);
 	  if (tmp.is_negative ())
 		tmp = tmp.neg_with_overflow (neg2_ovf);
-
+	  else
+		neg2_ovf = false;
 	  result = result.add_with_sign (tmp, false, add2_ovf);
 	  overflow |= neg1_ovf | neg2_ovf | add1_ovf | add2_ovf;
 	}


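The bug class this fixes is worth illustrating: a flag assigned on only one
branch is later read on both paths, so the untaken path reads it
uninitialized. A self-contained C sketch of the same shape (hypothetical
names and a simplified stand-in; the real code operates on double_int):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for neg_with_overflow: negate and report via
   *ovf whether the negation overflowed (simplified: it never does
   here, since we only model the control flow of the bug).  */
static long long
neg_with_overflow (long long v, bool *ovf)
{
  *ovf = false;               /* Callee sets the flag...  */
  return -v;
}

/* ...but only when it is called.  Without the else branch below,
   neg2_ovf is read uninitialized whenever tmp is non-negative.  */
static bool
accumulate_overflow (long long tmp)
{
  bool neg2_ovf;
  if (tmp < 0)
    tmp = neg_with_overflow (tmp, &neg2_ovf);
  else
    neg2_ovf = false;         /* The fix: initialize on the other path.  */
  return neg2_ovf;            /* Now safe to OR into an overflow mask.  */
}
```

This mirrors the patch: the `else neg2_ovf = false;` arm guarantees the flag
is defined before `overflow |= ... | neg2_ovf | ...` reads it.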