Re: [PATCH x86, PR60451] Expand even/odd permutation using pack insn.

2014-11-21 Thread Uros Bizjak
On Thu, Nov 20, 2014 at 5:25 PM, Evgeny Stupachenko evstu...@gmail.com wrote:
 Bootstrap / make check passed with updated patch.

 Is it still ok?

 It looks like we don't need expand_vec_perm_vpshufb2_vpermq_even_odd
 any more with the patch.
 However the clean up will be in the separate patch after appropriate testing.

 Modified ChangeLog:

 2014-11-20  Evgeny Stupachenko  evstu...@gmail.com

 gcc/testsuite
 PR target/60451
 * gcc.target/i386/pr60451.c: New.

 gcc/
 PR target/60451
 * config/i386/i386.c (expand_vec_perm_even_odd_pack): New.
 (expand_vec_perm_even_odd_1): Add new expand for V8HI mode,
 replace for V16QI, V16HI and V32QI modes.
 (ix86_expand_vec_perm_const_1): Add new expand.

OK.

Thanks,
Uros.


[PATCH] Backport PR61750 fix

2014-11-21 Thread Richard Biener

The following backports a fix I applied to match.pd whilst merging
from match-and-simplify to the original tree-ssa-forwprop.c code
on the 4.9 branch.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2014-11-21  Richard Biener  rguent...@suse.de

PR tree-optimization/61750
* tree-ssa-forwprop.c (simplify_vce): Verify type sizes
match for the resulting VIEW_CONVERT_EXPR.

Index: gcc/tree-ssa-forwprop.c
===
--- gcc/tree-ssa-forwprop.c (revision 217764)
+++ gcc/tree-ssa-forwprop.c (working copy)
@@ -3178,7 +3178,9 @@ simplify_vce (gimple_stmt_iterator *gsi)
       && (INTEGRAL_TYPE_P (TREE_TYPE (def_op))
           || POINTER_TYPE_P (TREE_TYPE (def_op)))
       && (TYPE_PRECISION (TREE_TYPE (op))
-          == TYPE_PRECISION (TREE_TYPE (def_op))))
+          == TYPE_PRECISION (TREE_TYPE (def_op)))
+      && (TYPE_SIZE (TREE_TYPE (op))
+          == TYPE_SIZE (TREE_TYPE (def_op))))
{
  TREE_OPERAND (gimple_assign_rhs1 (stmt), 0) = def_op;
  update_stmt (stmt);
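
For readers who want to see why the extra check matters, here is a minimal
standalone sketch (not GCC code). It assumes a hypothetical toy_type that
carries only a precision and a storage size in bits, and it compares plain
integers where the real code compares TYPE_SIZE trees.

```cpp
#include <cassert>

// Hypothetical stand-in for a GCC type node: only the two properties
// that the check above looks at.
struct toy_type
{
  unsigned precision; // number of value bits
  unsigned size;      // storage size in bits
};

// Return true when both the precision and the storage size of OUTER and
// INNER agree, which is the combined condition after the patch.
bool
vce_forwardable_p (const toy_type &outer, const toy_type &inner)
{
  assert (outer.precision <= outer.size && inner.precision <= inner.size);
  return outer.precision == inner.precision && outer.size == inner.size;
}
```

With only the precision test, a 24-bit value stored in 24 bits and a 24-bit
value stored in 32 bits would be treated as interchangeable; the size test
rejects that pair.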


Re: [PATCH 2/2] PR debug/38757 continued. Handle C11, C++11 and C++14.

2014-11-21 Thread Jakub Jelinek
On Thu, Nov 20, 2014 at 11:30:12PM +0100, Mark Wielaard wrote:
 @@ -19592,13 +19597,28 @@ gen_compile_unit_die (const char *filename)
  
language = DW_LANG_C;
if (strncmp (language_string, "GNU C++", 7) == 0)
 -language = DW_LANG_C_plus_plus;
 +{
 +  language = DW_LANG_C_plus_plus;
 +  if (dwarf_version >= 5 || !dwarf_strict)
 + {
 +   if (strcmp (language_string, "GNU C++11") == 0)
 + language = DW_LANG_C_plus_plus_11;
 +   else if (strcmp (language_string, "GNU C++14") == 0)
 + language = DW_LANG_C_plus_plus_14;
 + }
 +}

I think best would be to tweak
  if (value < 2 || value > 4)
    error_at (loc, "dwarf version %d is not supported", value);
  else
    opts->x_dwarf_version = value;
so that we accept value 5 too, and for now, until the
most common consumers are changed, use
  if (dwarf_version >= 5 /* || !dwarf_strict */)
so that
- you can actually use it in the test with -gdwarf-5
- you can commit it right away
- people can start playing with what it will mean to support DWARF5

GCC 4.5 also allowed -gdwarf-4 even though DWARF4 had not been released yet.
When there are consumers that can grok it, we can uncomment the
|| !dwarf_strict.

Jason, do you agree?

else if (strncmp (language_string, "GNU C", 5) == 0)
  {
language = DW_LANG_C89;
if (dwarf_version >= 3 || !dwarf_strict)
 - if (strcmp (language_string, "GNU C99") == 0)
 -   language = DW_LANG_C99;
 + {
 +   if (strcmp (language_string, "GNU C89") != 0)
 + language = DW_LANG_C99;
 +
 +   if (dwarf_version >= 5 || !dwarf_strict)
 + if (strcmp (language_string, "GNU C11") == 0)
 +   language = DW_LANG_C11;
 + }

Shouldn't we emit at least DW_LANG_C99 for GNU C11 if
dwarf_version >= 5 /* || !dwarf_strict */ is false but
dwarf_version >= 3 || !dwarf_strict is true?

BTW, noticed we don't have anything for Fortran 2003 and 2008,
filed a DWARF Issue for that.

Jakub
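
As a rough illustration of the selection logic being discussed (a standalone
sketch, not the dwarf2out.c code), the following assumes a hypothetical
toy_dw_lang enum and a helper pick_c_language, and uses the suggested
"dwarf_version >= 5" gate with the "|| !dwarf_strict" part left commented out.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-ins for DW_LANG_C89, DW_LANG_C99 and DW_LANG_C11.
enum toy_dw_lang { TOY_LANG_C89, TOY_LANG_C99, TOY_LANG_C11 };

// Pick the language code for a "GNU C..." language string.
toy_dw_lang
pick_c_language (const char *language_string, int dwarf_version,
                 bool dwarf_strict)
{
  assert (std::strncmp (language_string, "GNU C", 5) == 0);

  toy_dw_lang language = TOY_LANG_C89;
  if (dwarf_version >= 3 || !dwarf_strict)
    {
      // Anything newer than C89 is at least C99.
      if (std::strcmp (language_string, "GNU C89") != 0)
        language = TOY_LANG_C99;

      // C11 only when DWARF 5 was requested explicitly.
      if (dwarf_version >= 5 /* || !dwarf_strict */)
        if (std::strcmp (language_string, "GNU C11") == 0)
          language = TOY_LANG_C11;
    }
  return language;
}
```

Under these assumptions "GNU C11" with -gdwarf-4 falls back to the C99 code
rather than C89, and only -gdwarf-5 yields the C11 code.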


Re: [PATCH] OpenACC for C front end

2014-11-21 Thread Jakub Jelinek
On Thu, Nov 20, 2014 at 05:50:33PM -0600, James Norris wrote:
 +   case 'h':
 + if (!strcmp ("host", p))
 +   result = PRAGMA_OMP_CLAUSE_SELF;
 + break;
 Shouldn't this be PRAGMA_OMP_CLAUSE_HOST (PRAGMA_OACC_CLAUSE_HOST)
 instead?  It is _HOST in the C++ patch, are there no C tests with
 that clause covering it?
 
 The host clause is a synonym for the self clause. The initial
 C++ patch did not treat host as a synonym and has been amended
 accordingly.

Can you add a comment mentioning that (for casual readers)?

 There was a mistake in naming the function:
 c_parser_omp_clause_vector_length.
 Once it was renamed to: c_parser_oacc_clause_vector_length, diff was able to
 keep track.

Great.

 OK to commit after middle end is accepted?

Ok, thanks.

Jakub


Re: [PATCH] OpenACC for C++ front end

2014-11-21 Thread Jakub Jelinek
On Thu, Nov 20, 2014 at 05:33:57PM -0600, James Norris wrote:
 + t = OMP_CLAUSE_ASYNC_EXPR (c);
 + if (t == error_mark_node)
 +   remove = true;
 + else if (!type_dependent_expression_p (t)
 +   && !INTEGRAL_TYPE_P (TREE_TYPE (t)))
 +   {
 + error ("%<async%> expression must be integral");
 You have OMP_CLAUSE_LOCATION (c) which you could use for error_at.
 
 I followed the convention that was used elsewhere in the function
 at using error ().

Perhaps it would be better to change even those other spots in the function.
But that can certainly be done as a follow-up patch.

 Thank you for taking the time to review!
 
 OK to commit after middle end has been accepted?

Yes, thanks.

Jakub


Re: Add to maintainers list.

2014-11-21 Thread Alex Velenko

Hi,



2014-11-20  Alex Velenko  alex.vele...@arm.com

*MAINTAINERS (write-after-approval): Add myself.

diff --git a/MAINTAINERS b/MAINTAINERS
index 11a28ef..eada4e9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -566,6 +566,7 @@ David Ung			dav...@mips.com
 Neil Vachharajani		nvach...@gmail.com
 Kris Van Hees			kris.van.h...@oracle.com
 Joost VandeVondele		joost.vandevond...@mat.ethz.ch
+Alex Velenko			alex.vele...@arm.com
 Ilya Verbin			iver...@gmail.com
 Kugan Vivekanandarajah		kug...@linaro.org
 Tom de Vries			t...@codesourcery.com



Can someone, please, approve?
Kind regards,
Alex



Re: Add to maintainers list.

2014-11-21 Thread Marcus Shawcroft
On 20 November 2014 16:27, Alex Velenko alex.vele...@arm.com wrote:

 2014-11-20  Alex Velenko  alex.vele...@arm.com

 *MAINTAINERS (write-after-approval): Add myself.


Your patch looks fine, commit it.  /Marcus


Commit: Rl78: Save ES register in interrupt handlers

2014-11-21 Thread Nick Clifton
Hi Guys,

  I am applying the patch below to fix the RL78 backend so that it will
  preserve the ES register if an interrupt handler uses it.  The ES
  register can be altered if a __far variable is addressed inside the
  handler.

  Tested without any regressions on an rl78-elf toolchain.

Cheers
  Nick

gcc/ChangeLog

2014-11-21  Nick Clifton  ni...@redhat.com

* config/rl78/rl78-real.md (movqi_from_es): New pattern.
* config/rl78/rl78.c (struct machine_function): Add uses_es field.
(rl78_expand_prologue): Save the ES register in interrupt handlers
that use it.
(rl78_expand_epilogue): Restore the ES register if necessary.
(rl78_start_function): Mention if the function uses the ES
register.
(rl78_lo16): Record the use of the ES register.
(transcode_memory_rtx): Likewise.

Index: gcc/config/rl78/rl78-real.md
===
--- gcc/config/rl78/rl78-real.md	(revision 217910)
+++ gcc/config/rl78/rl78-real.md	(working copy)
@@ -36,6 +36,13 @@
   "mov\tes, %0"
 )
 
+(define_insn "movqi_from_es"
+  [(set (match_operand:QI 0 "register_operand" "=a")
+	(reg:QI ES_REG))]
+  ""
+  "mov\t%0, es"
+)
+
 (define_insn "movqi_cs"
   [(set (reg:QI CS_REG)
 	(match_operand:QI 0 "register_operand" "a"))]
Index: gcc/config/rl78/rl78.c
===
--- gcc/config/rl78/rl78.c	(revision 217910)
+++ gcc/config/rl78/rl78.c	(working copy)
@@ -118,6 +118,9 @@
   int virt_insns_ok;
   /* Set if the current function needs to clean up any trampolines.  */
   int trampolines_used;
+  /* True if the ES register is used and hence
+ needs to be saved inside interrupt handlers.  */
+  bool uses_es;
 };
 
 /* This is our init_machine_status, as set in
@@ -136,38 +139,36 @@
 /* This pass converts virtual instructions using virtual registers, to
real instructions using real registers.  Rather than run it as
reorg, we reschedule it before vartrack to help with debugging.  */
-namespace {
-
-const pass_data pass_data_rl78_devirt =
+namespace
 {
-  RTL_PASS, /* type */
-  "devirt", /* name */
-  OPTGROUP_NONE, /* optinfo_flags */
-  TV_MACH_DEP, /* tv_id */
-  0, /* properties_required */
-  0, /* properties_provided */
-  0, /* properties_destroyed */
-  0, /* todo_flags_start */
-  0, /* todo_flags_finish */
-};
+  const pass_data pass_data_rl78_devirt =
+{
+  RTL_PASS, /* type */
+  "devirt", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_MACH_DEP, /* tv_id */
+  0, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  0, /* todo_flags_finish */
+};
 
-class pass_rl78_devirt : public rtl_opt_pass
-{
-public:
-  pass_rl78_devirt(gcc::context *ctxt)
-: rtl_opt_pass(pass_data_rl78_devirt, ctxt)
+  class pass_rl78_devirt : public rtl_opt_pass
   {
-  }
+  public:
+pass_rl78_devirt (gcc::context *ctxt)
+  : rtl_opt_pass (pass_data_rl78_devirt, ctxt)
+  {
+  }
 
-  /* opt_pass methods: */
-  virtual unsigned int execute (function *)
+/* opt_pass methods: */
+virtual unsigned int execute (function *)
 {
   rl78_reorg ();
   return 0;
 }
-
-};
-
+  };
 } // anon namespace
 
 rtl_opt_pass *
@@ -203,8 +204,7 @@
 	 can eliminate the second SET.  */
   if (prev
 	  && rtx_equal_p (SET_DEST (prev), SET_SRC (set))
-	  && rtx_equal_p (SET_DEST (set), SET_SRC (prev))
-	  )	  
+	  && rtx_equal_p (SET_DEST (set), SET_SRC (prev)))
 	{
 	  if (dump_file)
 	fprintf (dump_file, " Delete insn %d because it is redundant\n",
@@ -216,7 +216,7 @@
   else
 	prev = set;
 }
-  
+
   if (dump_file)
 print_rtl_with_bb (dump_file, get_insns (), 0);
 
@@ -223,33 +223,32 @@
   return 0;
 }
 
-namespace {
-
-const pass_data pass_data_rl78_move_elim =
+namespace
 {
-  RTL_PASS, /* type */
-  "move_elim", /* name */
-  OPTGROUP_NONE, /* optinfo_flags */
-  TV_MACH_DEP, /* tv_id */
-  0, /* properties_required */
-  0, /* properties_provided */
-  0, /* properties_destroyed */
-  0, /* todo_flags_start */
-  0, /* todo_flags_finish */
-};
+  const pass_data pass_data_rl78_move_elim =
+{
+  RTL_PASS, /* type */
+  "move_elim", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_MACH_DEP, /* tv_id */
+  0, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  0, /* todo_flags_finish */
+};
 
-class pass_rl78_move_elim : public rtl_opt_pass
-{
-public:
-  pass_rl78_move_elim(gcc::context *ctxt)
-: rtl_opt_pass(pass_data_rl78_move_elim, ctxt)
+  class pass_rl78_move_elim : public rtl_opt_pass
   {
-  }
+  public:
+pass_rl78_move_elim (gcc::context *ctxt)
+  : rtl_opt_pass (pass_data_rl78_move_elim, ctxt)
+  {
+  }
 
-  /* opt_pass methods: */
-  virtual unsigned int execute (function *) { return move_elim_pass (); 

Re: [PATCH x86] Increase PARAM_MAX_COMPLETELY_PEELED_INSNS when branch is costly

2014-11-21 Thread Evgeny Stupachenko
PING.
200 currently looks optimal for x86.
Let's commit the following:

2014-11-21  Evgeny Stupachenko  evstu...@gmail.com
* config/i386/i386.c (ix86_option_override_internal): Increase
PARAM_MAX_COMPLETELY_PEELED_INSNS.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 6337aa5..5ac10eb 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -4081,6 +4081,12 @@ ix86_option_override_internal (bool main_args_p,
 			 opts->x_param_values,
 			 opts_set->x_param_values);

+  /* Extend full peel max insns parameter for x86.  */
+  maybe_set_param_value (PARAM_MAX_COMPLETELY_PEELED_INSNS,
+			 200,
+			 opts->x_param_values,
+			 opts_set->x_param_values);
+
   /* Enable sw prefetching at -O3 for CPUS that prefetching is helpful.  */
   if (opts->x_flag_prefetch_loop_arrays < 0
       && HAVE_prefetch

On Wed, Nov 12, 2014 at 5:02 PM, Evgeny Stupachenko evstu...@gmail.com wrote:
 Code size for spec2000 is almost unchanged (many benchmarks have the
 same binaries).
 For those that are changed we have the following numbers (200 vs 100,
 both dynamic build -Ofast -funroll-loops -flto):
 183.equake +10%
 164.gzip, 173.applu +3.5%
 187.facerec, 191.fma3d +2.5%
 200.sixstrack +2%
 177.mesa, 178.galgel +1%


 On Wed, Nov 12, 2014 at 2:51 AM, Jan Hubicka hubi...@ucw.cz wrote:
  150 and 200 make Silvermont performance better on 173.applu (+8%) and
  183.equake (+3%); Haswell spec2006 performance stays almost unchanged.
  A higher value of 300 leaves the performance of the mentioned tests
  unchanged, but adds some regressions on other benchmarks.
 
  So I like 200 as well as 120 and 150, but can confirm performance
  gains only for x86.

 IMO it's either 150 or 200.  We chose 200 for our 4.9-based compiler because
 this gave the performance boost without affecting the code size (on x86-64)
 and because this was previously 400, but it's your call.

 Both 150 or 200 globally work for me if there is not too much of code size
 bloat (did not see code size mentioned here).

 What I did before decreasing the bounds was strengthening the loop iteration
 count bounds and adding logic that predicts constant propagation enabled by
 unrolling. For this reason 400 became too large as we did a lot more complete
 unrolling than before. Also 400 in older compilers is not really 400 in
 newer ones.

 Because I saw performance drop only with values below 50, I went for 100.
 It would be very interesting to actually analyze what happens for those two
 benchmarks (that should not be too hard with perf).

 Honza


Re: [PATCH] PR lto/63968: 175.vpr from cpu2000 fails to build with LTO

2014-11-21 Thread Martin Liška

On 11/20/2014 10:13 PM, Jan Hubicka wrote:

Hello.

As I reimplemented fibheap as a C++ template, Honza told me that the
replace_key method actually supports just the decrement operation. The old
implementation suppresses any feedback if we try to increase a key:

fibheap.c:
...
   /* If we wanted to, we could actually do a real increase by redeleting and
  inserting. However, this would require O (log n) time. So just bail out
  for now.  */
   if (fibheap_comp_data (heap, key, data, node) > 0)
 return NULL;
...

My reimplementation added an assert for this kind of operation, and as this PR
shows, we try to do an increment in bb-reorder.
Thus, I added a fibonacci_heap::replace_key method that can increment the key
(it deletes the node and then associates the new key with it).

The patch bootstraps on x86_64-linux-pc and no new regression was introduced.
I would like to ask someone whether the increase operation in bb-reorder is
valid or not?
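
To make the "delete and re-insert" idea concrete, here is a small standalone
sketch. It is not the fibonacci_heap.h code: it assumes a hypothetical
toy_heap built on std::multimap, where a node handle is simply an iterator,
and it only mirrors the replace_key / decrease_key split described above.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical keyed min-heap stand-in; ordering comes from std::multimap.
template<class K, class V>
class toy_heap
{
public:
  typedef typename std::multimap<K, V>::iterator node_t;

  node_t insert (K key, V data)
  {
    return m_nodes.insert (std::make_pair (key, data));
  }

  // Set a new KEY on NODE, larger or smaller than the old one, by removing
  // the node and inserting it again.  Returns the old key.
  K replace_key (node_t &node, K key)
  {
    K okey = node->first;
    V data = node->second;
    m_nodes.erase (node);
    node = m_nodes.insert (std::make_pair (key, data));
    return okey;
  }

  // Decrease-only variant: the new key must not be larger than the old one.
  K decrease_key (node_t &node, K key)
  {
    assert (key <= node->first);
    return replace_key (node, key);
  }

  // Data associated with the smallest key.
  const V &min () const
  {
    assert (!m_nodes.empty ());
    return m_nodes.begin ()->second;
  }

private:
  std::multimap<K, V> m_nodes;
};
```

In this sketch an increase costs one erase plus one insert, which is the same
trade-off the old fibheap.c comment mentions when it declines to do a real
increase.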


Can you verify that the implementation is correct? I seem to remember that I
introduced the lazy incrementation to the inliner for both performance and
correctness reasons. I used to get odd orders when keys were increased.

Honza


Hello.

What kind of correctness do you mean? The old implementation didn't support
the increment operation, and that fact was hushed up.


Martin



Thanks,
Martin



gcc/ChangeLog:

2014-11-20  Martin Liska  mli...@suse.cz

* bb-reorder.c (find_traces_1_round): decrease_key is replaced
with replace_key method.
* fibonacci_heap.h (fibonacci_heap::insert): New argument.
(fibonacci_heap::replace_key_data): Likewise.
(fibonacci_heap::replace_key): New method that can even increment key,
this operation costs O(log N).
(fibonacci_heap::extract_min): New argument.
(fibonacci_heap::delete_node): Likewise.



diff --git a/gcc/bb-reorder.c b/gcc/bb-reorder.c
index 689d7b6..b568114 100644
--- a/gcc/bb-reorder.c
+++ b/gcc/bb-reorder.c
@@ -644,7 +644,7 @@ find_traces_1_round (int branch_th, int exec_th, gcov_type count_th,
   (long) bbd[e->dest->index].node->get_key (),
   key);
}
- bbd[e->dest->index].heap->decrease_key
+ bbd[e->dest->index].heap->replace_key
(bbd[e->dest->index].node, key);
}
}
@@ -812,7 +812,7 @@ find_traces_1_round (int branch_th, int exec_th, gcov_type count_th,
   e->dest->index,
   (long) bbd[e->dest->index].node->get_key (), key);
}
- bbd[e->dest->index].heap->decrease_key
+ bbd[e->dest->index].heap->replace_key
(bbd[e->dest->index].node, key);
}
}
diff --git a/gcc/fibonacci_heap.h b/gcc/fibonacci_heap.h
index ecb92f8..3fce370 100644
--- a/gcc/fibonacci_heap.h
+++ b/gcc/fibonacci_heap.h
@@ -183,20 +183,27 @@ public:
}

/* For given NODE, set new KEY value.  */
-  K decrease_key (fibonacci_node_t *node, K key)
+  K replace_key (fibonacci_node_t *node, K key)
{
  K okey = node->m_key;
-gcc_assert (key <= okey);

  replace_key_data (node, key, node->m_data);
  return okey;
}

+  /* For given NODE, decrease value to new KEY.  */
+  K decrease_key (fibonacci_node_t *node, K key)
+  {
+gcc_assert (key <= node->m_key);
+return replace_key (node, key);
+  }
+
/* For given NODE, set new KEY and DATA value.  */
V *replace_key_data (fibonacci_node_t *node, K key, V *data);

-  /* Extract minimum node in the heap. */
-  V *extract_min ();
+  /* Extract minimum node in the heap. If RELEASE is specified,
+ memory is released.  */
+  V *extract_min (bool release = true);

/* Return value associated with minimum node in the heap.  */
V *min ()
@@ -214,12 +221,15 @@ public:
}

/* Delete NODE in the heap.  */
-  V *delete_node (fibonacci_node_t *node);
+  V *delete_node (fibonacci_node_t *node, bool release = true);

/* Union the heap with HEAPB.  */
fibonacci_heap *union_with (fibonacci_heap *heapb);

  private:
+  /* Insert new NODE given by KEY and DATA associated with the key.  */
+  fibonacci_node_t *insert (fibonacci_node_t *node, K key, V *data);
+
/* Insert it into the root list.  */
void insert_root (fibonacci_node_t *node);

@@ -322,6 +332,15 @@ fibonacci_heap<K,V>::insert (K key, V *data)
/* Create the new node.  */
fibonacci_node<K,V> *node = new fibonacci_node_t ();

+  return insert (node, key, data);
+}
+
+/* Insert new NODE given by KEY and DATA associated with the key.  */
+
+template<class K, class V>
+fibonacci_node<K,V>*
+fibonacci_heap<K,V>::insert (fibonacci_node_t *node, K key, V *data)
+{
/* Set the node's data.  */
node->m_data = data;
node->m_key = key;
@@ -345,17 +364,22 @@ V*
  fibonacci_heap<K,V>::replace_key_data (fibonacci_node<K,V> *node, K key,

Re: [PATCH x86] Increase PARAM_MAX_COMPLETELY_PEELED_INSNS when branch is costly

2014-11-21 Thread Uros Bizjak
On Fri, Nov 21, 2014 at 11:46 AM, Evgeny Stupachenko evstu...@gmail.com wrote:
 PING.
 200 currently looks optimal for x86.
 Let's commit the following:

 2014-11-21  Evgeny Stupachenko  evstu...@gmail.com
 * config/i386/i386.c (ix86_option_override_internal): Increase
 PARAM_MAX_COMPLETELY_PEELED_INSNS.

OK. Looks like a good performance vs. codesize tradeoff.

Uros.


Re: [ia64 PATCH] Fix up ia64 attribute handling (PR target/61137)

2014-11-21 Thread Andreas Schwab
Jakub Jelinek ja...@redhat.com writes:

 The following untested patch fixes that (tested on small-addr-1.c with
 a cross-compiler), I don't have ia64 hw nor spare cycles to test this
 though, so I'm just offering the patch as is if anyone wants to test it.
 Perhaps better testsuite coverage wouldn't hurt (test the model (small)
 attribute also in C++, perhaps test the common_object attribute on VMS?).

 2014-11-20  Jakub Jelinek  ja...@redhat.com

   PR target/61137
   * config/ia64/ia64.c (ia64_attribute_takes_identifier_p): New function.
   (TARGET_ATTRIBUTE_TAKES_IDENTIFIER_P): Redefine to it.

Looks good.

http://gcc.gnu.org/ml/gcc-testresults/2014-11/msg02276.html

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.


[PATCH]Add myself to MAINTAINERS

2014-11-21 Thread Renlin Li

Hi,

This patch is to add myself into Write After Approval section of 
MAINTAINERS file.


Is it Okay to commit?

Regards,
Renlin Li


ChangeLog:

2014-11-21  Renlin Li  renlin...@arm.com

* MAINTAINERS (Write After Approval): Add myself.

diff --git a/MAINTAINERS b/MAINTAINERS
index 56e68c5..96a7497 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -461,6 +461,7 @@ Georg-Johann Lay	a...@gjlay.de
 Marc Lehmann	p...@goof.com
 James Lemke	jwle...@codesourcery.com
 Kriang Lerdsuwanakij	lerds...@users.sourceforge.net
+Renlin Li	renlin...@arm.com
 Xinliang David Li	davi...@google.com
 Jiangning Liu	jiangning@arm.com
 Sa Liu		sa...@de.ibm.com

Re: [PATCH 8/9] Negative numbers added for sreal class.

2014-11-21 Thread Martin Liška

On 11/14/2014 11:48 AM, Richard Biener wrote:

On Thu, Nov 13, 2014 at 1:35 PM, mliska mli...@suse.cz wrote:

gcc/ChangeLog:

2014-11-13  Martin Liska  mli...@suse.cz

 * predict.c (propagate_freq): More elegant sreal API is used.
 (estimate_bb_frequencies): New static constants defined by sreal
 replace precomputed ones.
 * sreal.c (sreal::normalize): New function.
 (sreal::to_int): Likewise.
 (sreal::operator+): Likewise.
 (sreal::operator-): Likewise.
 * sreal.h: Definition of new functions added.


Please use gcc_checking_assert()s everywhere.  sreal is supposed
to be fast... (I see it has current uses of gcc_assert - you may want
to mass-convert them as a followup).


---
  gcc/predict.c | 30 +++-
  gcc/sreal.c   | 56 
  gcc/sreal.h   | 75 ---
  3 files changed, 126 insertions(+), 35 deletions(-)

diff --git a/gcc/predict.c b/gcc/predict.c
index 0215e91..0f640f5 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -82,7 +82,7 @@ along with GCC; see the file COPYING3.  If not see

  /* real constants: 0, 1, 1-1/REG_BR_PROB_BASE, REG_BR_PROB_BASE,
1/REG_BR_PROB_BASE, 0.5, BB_FREQ_MAX.  */
-static sreal real_zero, real_one, real_almost_one, real_br_prob_base,
+static sreal real_almost_one, real_br_prob_base,
  real_inv_br_prob_base, real_one_half, real_bb_freq_max;

  static void combine_predictions_for_insn (rtx_insn *, basic_block);
@@ -2528,13 +2528,13 @@ propagate_freq (basic_block head, bitmap tovisit)
 bb->count = bb->frequency = 0;
  }

-  BLOCK_INFO (head)->frequency = real_one;
+  BLOCK_INFO (head)->frequency = sreal::one ();
last = head;
for (bb = head; bb; bb = nextbb)
  {
edge_iterator ei;
-  sreal cyclic_probability = real_zero;
-  sreal frequency = real_zero;
+  sreal cyclic_probability = sreal::zero ();
+  sreal frequency = sreal::zero ();

nextbb = BLOCK_INFO (bb)->next;
BLOCK_INFO (bb)->next = NULL;
@@ -2559,13 +2559,13 @@ propagate_freq (basic_block head, bitmap tovisit)
   * BLOCK_INFO (e->src)->frequency /
   REG_BR_PROB_BASE);  */

-   sreal tmp (e->probability, 0);
+   sreal tmp = e->probability;
 tmp *= BLOCK_INFO (e->src)->frequency;
 tmp *= real_inv_br_prob_base;
 frequency += tmp;
   }

- if (cyclic_probability == real_zero)
+ if (cyclic_probability == sreal::zero ())
 {
   BLOCK_INFO (bb)->frequency = frequency;
 }
@@ -2577,7 +2577,7 @@ propagate_freq (basic_block head, bitmap tovisit)
   /* BLOCK_INFO (bb)->frequency = frequency
   / (1 - cyclic_probability) */

- cyclic_probability = real_one - cyclic_probability;
+ cyclic_probability = sreal::one () - cyclic_probability;
   BLOCK_INFO (bb)->frequency = frequency / cyclic_probability;
 }
 }
@@ -2591,7 +2591,7 @@ propagate_freq (basic_block head, bitmap tovisit)
  = ((e->probability * BLOCK_INFO (bb)->frequency)
  / REG_BR_PROB_BASE); */

- sreal tmp (e->probability, 0);
+ sreal tmp = e->probability;
   tmp *= BLOCK_INFO (bb)->frequency;
   EDGE_INFO (e)->back_edge_prob = tmp * real_inv_br_prob_base;
 }
@@ -2873,13 +2873,11 @@ estimate_bb_frequencies (bool force)
if (!real_values_initialized)
  {
   real_values_initialized = 1;
- real_zero = sreal (0, 0);
- real_one = sreal (1, 0);
- real_br_prob_base = sreal (REG_BR_PROB_BASE, 0);
- real_bb_freq_max = sreal (BB_FREQ_MAX, 0);
+ real_br_prob_base = REG_BR_PROB_BASE;
+ real_bb_freq_max = BB_FREQ_MAX;
   real_one_half = sreal (1, -1);
- real_inv_br_prob_base = real_one / real_br_prob_base;
- real_almost_one = real_one - real_inv_br_prob_base;
+ real_inv_br_prob_base = sreal::one () / real_br_prob_base;
+ real_almost_one = sreal::one () - real_inv_br_prob_base;
 }

mark_dfs_back_edges ();
@@ -2897,7 +2895,7 @@ estimate_bb_frequencies (bool force)

   FOR_EACH_EDGE (e, ei, bb->succs)
 {
- EDGE_INFO (e)->back_edge_prob = sreal (e->probability, 0);
+ EDGE_INFO (e)->back_edge_prob = e->probability;
   EDGE_INFO (e)->back_edge_prob *= real_inv_br_prob_base;
 }
 }
@@ -2906,7 +2904,7 @@ estimate_bb_frequencies (bool force)
   to outermost to examine frequencies for back edges.  */
estimate_loops ();

-  freq_max = real_zero;
+  freq_max = sreal::zero ();
FOR_EACH_BB_FN (bb, cfun)
 if (freq_max < BLOCK_INFO (bb)->frequency)
   

Re: [PATCH 1/4][x86] Add clwb,pcommit,avx512avbmi,avx512ifma.

2014-11-21 Thread Ilya Tocar
On 20 Nov 09:43, Uros Bizjak wrote:
 On Wed, Nov 19, 2014 at 6:32 PM, Ilya Tocar tocarip.in...@gmail.com wrote:
  Hi,
 
  New revision of Intel ISA reference [1] has new instructions:
  Clwb, pcommit and new flavors of AVX512. Patch below adds them.
  I understand that stage 1 is closed, however those changes shouldn't
  affect anything outside of i386 backend. And are extremely unlikely to
  break existing functionality, and I personally think it's desirable for
  newest GCC to support newest spec.
  Bootstrapped/regtested on x86_64-unknown-linux-gnu.
  Ok for trunk?
 
 Please split the patch into patch series, like it was done previously
 for AVX512F patches.
 
 Uros.


This part adds avx512ifma.
Bootstraps/passes make check.

gcc/

* common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512IFMA_SET,
OPTION_MASK_ISA_AVX512IFMA_UNSET): New.
(ix86_handle_option): Handle OPT_mavx512ifma.
* config.gcc: Add avx512ifmaintrin.h, avx512ifmavlintrin.h.
* config/i386/avx512ifmaintrin.h: New file.
* config/i386/avx512ifmavlintrin.h: Ditto.
* config/i386/cpuid.h (bit_AVX512IFMA): New.
* config/i386/driver-i386.c (host_detect_local_cpu): Detect
avx512ifma.
* config/i386/i386-c.c (ix86_target_macros_internal): Define
__AVX512IFMA__.
* config/i386/i386.c (ix86_target_string): Add -mavx512ifma.
(PTA_AVX512IFMA): Define.
(ix86_option_override_internal): Handle new options.
(ix86_valid_target_attribute_inner_p): Add avx512ifma.
(ix86_builtins): Add IX86_BUILTIN_VPMADD52LUQ512,
IX86_BUILTIN_VPMADD52HUQ512, IX86_BUILTIN_VPMADD52LUQ256,
IX86_BUILTIN_VPMADD52HUQ256, IX86_BUILTIN_VPMADD52LUQ128,
IX86_BUILTIN_VPMADD52HUQ128, IX86_BUILTIN_VPMADD52LUQ512_MASKZ,
IX86_BUILTIN_VPMADD52HUQ512_MASKZ, IX86_BUILTIN_VPMADD52LUQ256_MASKZ,
IX86_BUILTIN_VPMADD52HUQ256_MASKZ, IX86_BUILTIN_VPMADD52LUQ128_MASKZ,
IX86_BUILTIN_VPMADD52HUQ128_MASKZ.
(bdesc_special_args): Add __builtin_ia32_vpmadd52luq512_mask,
__builtin_ia32_vpmadd52luq512_maskz,
__builtin_ia32_vpmadd52huq512_mask,
__builtin_ia32_vpmadd52huq512_maskz,
__builtin_ia32_vpmadd52luq256_mask,
__builtin_ia32_vpmadd52luq256_maskz,
__builtin_ia32_vpmadd52huq256_mask,
__builtin_ia32_vpmadd52huq256_maskz,
__builtin_ia32_vpmadd52luq128_mask,
__builtin_ia32_vpmadd52luq128_maskz,
__builtin_ia32_vpmadd52huq128_mask,
__builtin_ia32_vpmadd52huq128_maskz.
* config/i386/i386.h (TARGET_AVX512IFMA, TARGET_AVX512IFMA_P): Define.
* config/i386/i386.opt: Add mavx512ifma.
* config/i386/immintrin.h: Include avx512ifmaintrin.h,
avx512ifmavlintrin.h.
* config/i386/sse.md (unspec): Add UNSPEC_VPMADD52LUQ,
UNSPEC_VPMADD52HUQ.
(VPMADD52): New iterator.
(vpmadd52type): New attribute.
(vpamdd52huq<mode>_maskz): New.
(vpamdd52luq<mode>_maskz): Ditto.
(vpamdd52<vpmadd52type><mode><sd_maskz_name>): Ditto.
(vpamdd52<vpmadd52type><mode>_mask): Ditto.


gcc/testsuite/

* g++.dg/other/i386-2.C: Add -mavx512ifma.
* g++.dg/other/i386-3.C: Ditto.
* gcc.target/i386/avx512f-helper.h: Add avx512ifma-check.h.
* gcc.target/i386/avx512ifma-check.h: New.
* gcc.target/i386/avx512ifma-vpmaddhuq-1.c: Ditto.
* gcc.target/i386/avx512ifma-vpmaddhuq-2.c: Ditto.
* gcc.target/i386/avx512ifma-vpmaddluq-1.c: Ditto.
* gcc.target/i386/avx512ifma-vpmaddluq-2.c: Ditto.
* gcc.target/i386/avx512vl-vpmaddhuq-2.c: Ditto.
* gcc.target/i386/avx512vl-vpmaddluq-2.c: Ditto.
* gcc.target/i386/i386.exp (check_effective_target_avx512ifma): New.
* gcc.target/i386/sse-12.c: Add new options.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.

---
 gcc/common/config/i386/i386-common.c   |  16 ++
 gcc/config.gcc |   6 +-
 gcc/config/i386/avx512ifmaintrin.h | 104 +
 gcc/config/i386/avx512ifmavlintrin.h   | 164 +
 gcc/config/i386/cpuid.h|   1 +
 gcc/config/i386/driver-i386.c  |   5 +-
 gcc/config/i386/i386-c.c   |   2 +
 gcc/config/i386/i386.c |  35 +
 gcc/config/i386/i386.h |   2 +
 gcc/config/i386/i386.opt   |   4 +
 gcc/config/i386/immintrin.h|   4 +
 gcc/config/i386/sse.md |  69 +
 gcc/testsuite/g++.dg/other/i386-2.C|   2 +-
 gcc/testsuite/g++.dg/other/i386-3.C|   2 +-
 gcc/testsuite/gcc.target/i386/avx512f-helper.h |   5 +
 

[PATCH] VRP: don't assume strict overflow semantics when checking if a loop wraps

2014-11-21 Thread Patrick Palka
When adjusting the value range of an induction variable using SCEV, VRP
calls scev_probably_wraps_p() with use_overflow_semantics=true.  This
parameter set to true makes scev_probably_wraps_p() assume that signed
induction variables never wrap, so for these variables it always returns
false (when strict overflow rules are in effect).  This is wrong because
if a signed induction variable really does overflow then we want to give
it an INF(OVF) value range and not the (finite) estimation returned by
SCEV.

While this change shouldn't make a difference in code generation, it
should help improve the coverage of -Wstrict-overflow warnings on
induction variables like in the test case.
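
As a rough picture of what "may wrap" means for a signed induction variable,
the sketch below evaluates the final value in a wider type and checks it
against the 32-bit range. This is only an illustration under simplifying
assumptions (constant init and step, a known iteration count); it is not how
scev_probably_wraps_p is implemented, and toy_iv_may_wrap is a hypothetical
helper name.

```cpp
#include <cstdint>
#include <limits>

// Return true if INIT + STEP * NITER does not fit in a 32-bit signed
// integer.  The computation is done in 64 bits, which cannot overflow for
// these operand widths.  Because the step is constant, every intermediate
// value lies between INIT and the final value, so checking the final
// value is enough.
bool
toy_iv_may_wrap (std::int32_t init, std::int32_t step, std::uint32_t niter)
{
  std::int64_t last = static_cast<std::int64_t> (init)
                      + static_cast<std::int64_t> (step)
                        * static_cast<std::int64_t> (niter);
  return last < std::numeric_limits<std::int32_t>::min ()
         || last > std::numeric_limits<std::int32_t>::max ();
}
```

For a loop like the one in the test case, where the bound is an unsigned
value that can exceed INT_MAX, such a check answers "yes, it may wrap", and
that is the situation in which an INF(OVF) range is wanted.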

OK after bootstrap + regtest on x86_64-unknown-linux-gnu?

gcc/
* tree-vrp.c (adjust_range_with_scev): Call
scev_probably_wraps_p with use_overflow_semantics=false.

gcc/testsuite/
* gcc.dg/Wstrict-overflow-27.c: New test.
---
 gcc/testsuite/gcc.dg/Wstrict-overflow-27.c | 22 ++
 gcc/tree-vrp.c |  2 +-
 2 files changed, 23 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/Wstrict-overflow-27.c

diff --git a/gcc/testsuite/gcc.dg/Wstrict-overflow-27.c b/gcc/testsuite/gcc.dg/Wstrict-overflow-27.c
new file mode 100644
index 000..c1f27ab
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Wstrict-overflow-27.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-fstrict-overflow -O2 -Wstrict-overflow" } */
+
+/* Warn about an overflow when folding i < 0.  */
+
+void bar (unsigned *p);
+
+int
+foo (unsigned *p)
+{
+  int i;
+  int sum = 0;
+
+  for (i = 0; i < *p; i++)
+{
+  if (i < 0) /* { dg-warning "signed overflow" } */
+   sum += 2;
+  bar (p);
+}
+
+  return sum;
+}
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index a75138f..bf9ff61 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -4270,7 +4270,7 @@ adjust_range_with_scev (value_range_t *vr, struct loop *loop,
   dir == EV_DIR_UNKNOWN
   /* ... or if it may wrap.  */
   || scev_probably_wraps_p (init, step, stmt, get_chrec_loop (chrec),
-   true))
+   /*use_overflow_semantics=*/false))
 return;
 
   /* We use TYPE_MIN_VALUE and TYPE_MAX_VALUE here instead of
-- 
2.2.0.rc1.23.gf570943



Re: [PATCH][ARM] Fix names of some rounding intrinsics, implement vrndx_f32 and vrndxq_f32

2014-11-21 Thread Kyrill Tkachov

Ping again.

Thanks,
Kyrill

On 13/11/14 14:45, Kyrill Tkachov wrote:

Ping.

Kyrill

On 04/11/14 10:56, Kyrill Tkachov wrote:

Phew,

This one slipped through the cracks. Ping?
https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01981.html

Thanks,
Kyrill

On 23/09/14 16:25, Kyrill Tkachov wrote:

On 23/09/14 16:07, Kyrill Tkachov wrote:

Hi all,

Some intrinsics had the wrong name (inconsistent with the NEON
intrinsics spec). This patch fixes that and adds the vrndx_f32 and
vrndxq_f32 intrinsics that were missing.

For reference, the NEON intrinsics spec can be found at:
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0073a/IHI0073A_arm_neon_intrinsics_ref.pdf

Kyrill


These map down to vrintx.f32 NEON instructions (d and q forms). We
already had builtins defined for them, just the intrinsics were not
wired up to them properly.

Tested arm-none-eabi

Ok for trunk?

2014-09-23  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * config/arm/arm_neon.h (vrndqn_f32): Rename to...
 (vrndnq_f32): ... this.
 (vrndqa_f32): Rename to...
 (vrndaq_f32): ... this.
 (vrndqp_f32): Rename to...
 (vrndpq_f32): ... this.
 (vrndqm_f32): Rename to...
 (vrndmq_f32): ... this.
 (vrndx_f32): New intrinsic.
 (vrndxq_f32): Likewise.

2014-09-23  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * gcc.target/arm/simd/neon-vrndx_f32_1.c: New test.
 * gcc.target/arm/simd/neon-vrndxq_f32_1.c: Likewise.
 * gcc.target/arm/neon/vrndqaf32.c: Rename to...
 * gcc.target/arm/neon/vrndaqf32.c: ... This. Update intrinsic names.
 * gcc.target/arm/neon/vrndqmf32.c: Rename to...
 * gcc.target/arm/neon/vrndmqf32.c: ... This. Update intrinsic names.
 * gcc.target/arm/neon/vrndqnf32.c: Rename to...
 * gcc.target/arm/neon/vrndnqf32.c: ... This. Update intrinsic names.
 * gcc.target/arm/neon/vrndqpf32.c: Rename to...
 * gcc.target/arm/neon/vrndpqf32.c: ... This. Update intrinsic names.










Re: [patch c++]: Fix PR/53904

2014-11-21 Thread Richard Biener
On Thu, Nov 20, 2014 at 8:48 PM, Kai Tietz ktiet...@googlemail.com wrote:
 Hello,

 this patch fixes a type-overflow issue caused by trying to cast a UHWI
 via tree_to_shwi.
 As soon as the value gets larger than SHWI_MAX, we get an error for it.
 So we need to cast it
 via tree_to_uhwi, and then cast it to the signed variant.

I think it's better to handle the degenerate case (no element) explicitly.
And I would think that something like nelts should have a positive result,
thus why is 'max' not unsigned?  Also 'max' and using 'nelts' looks
like a mismatch?  max == nelts - 1.  Ah, because array_type_nelts
returns nelts - 1 ... how useful ;)

Still you want to special-case the array_type_nelts == -1 case.
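
To make the degenerate case concrete, here is a minimal sketch in plain C
rather than GCC internals. It assumes the highest valid index arrives as an
unsigned 64-bit value in which all ones stands for nelts - 1 == -1;
nelts_from_max_index is a hypothetical name, not an existing GCC function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC: turn the "highest valid index"
   (nelts - 1, as array_type_nelts is described above) into an element
   count.  */
static uint64_t
nelts_from_max_index (uint64_t max_index)
{
  /* Degenerate case: an array with no elements has max index -1,
     which shows up as all ones in an unsigned type.  */
  if (max_index == UINT64_MAX)
    return 0;

  return max_index + 1;
}
```

Unsigned wraparound would also yield 0 for the all-ones input; the explicit
branch is there only to document the intent.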

Richard.

 ChangeLog

 2014-11-20  Kai Tietz  kti...@redhat.com

 PR c++/63904
 * constexpr.c (cxx_eval_vec_init_1): Avoid
 type-overflow issue.

 2014-11-20  Kai Tietz  kti...@redhat.com

 PR c++/63904
 * g++.dg/cpp0x/pr63904.C: New.


 Regression tested for x86_64-unknown-linux-gnu.  Ok for apply?

 Regards,
 Kai

 Index: gcc/gcc/cp/constexpr.c
 ===
 --- gcc.orig/gcc/cp/constexpr.c
 +++ gcc/gcc/cp/constexpr.c
 @@ -2006,12 +2050,12 @@ cxx_eval_vec_init_1 (const constexpr_ctx
   bool *non_constant_p, bool *overflow_p)
  {
tree elttype = TREE_TYPE (atype);
 -  int max = tree_to_shwi (array_type_nelts (atype));
 +  HOST_WIDE_INT max = (HOST_WIDE_INT) tree_to_uhwi (array_type_nelts (atype));
verify_ctor_sanity (ctx, atype);
vec<constructor_elt, va_gc> **p = &CONSTRUCTOR_ELTS (ctx->ctor);
vec_alloc (*p, max + 1);
bool pre_init = false;
 -  int i;
 +  HOST_WIDE_INT i;

/* For the default constructor, build up a call to the default
   constructor of the element type.  We only need to handle class types
 Index: gcc/gcc/testsuite/g++.dg/cpp0x/pr63904.C
 ===
 --- /dev/null
 +++ gcc/gcc/testsuite/g++.dg/cpp0x/pr63904.C
 @@ -0,0 +1,13 @@
 +// { dg-do compile { target c++11 } }
 +
 +template<int N>
 +struct foo {
 +constexpr foo() : a() {}
 +int a[N];
 +};
 +
 +int main() {
 +  foo< (foo<1>{}).a[0] > f;
 +  return 0;
 +}
 +


Re: [PATCH 2/4][x86] Add clwb,pcommit,avx512avbmi,avx512ifma.

2014-11-21 Thread Ilya Tocar
On 20 Nov 09:43, Uros Bizjak wrote:
 On Wed, Nov 19, 2014 at 6:32 PM, Ilya Tocar tocarip.in...@gmail.com wrote:
  Hi,
 
  New revision of Intel ISA reference [1] has new instructions:
  Clwb, pcommit and new flavors of AVX512. Patch below adds them.
  I understand that stage 1 is closed, however those changes shouldn't
  affect anything outside of the i386 backend. They are also extremely
  unlikely to break existing functionality, and I personally think it's
  desirable for the newest GCC to support the newest spec.
  Bootstrapped/regtested on x86_64-unknown-linux-gnu.
  Ok for trunk?
 
 Please split the patch into patch series, like it was done previously
 for AVX512F patches.
 
 Uros.

This part adds avx512vbmi.
I'll send vpermi2b autogen patch together with v64qi const perm later.
Bootstraps/passes make check.
Ok for trunk?


gcc/
* common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512VBMI_SET,
OPTION_MASK_ISA_AVX512VBMI_UNSET): New.
(ix86_handle_option): Handle OPT_mavx512vbmi.
* config.gcc: Add avx512vbmiintrin.h, avx512vbmivlintrin.h.
* config/i386/avx512vbmiintrin.h: New file.
* config/i386/avx512vbmivlintrin.h: Ditto.
* config/i386/cpuid.h (bit_AVX512VBMI): New.
* config/i386/driver-i386.c (host_detect_local_cpu): Detect avx512vbmi.
* config/i386/i386-c.c (ix86_target_macros_internal): Define
__AVX512VBMI__.
* config/i386/i386.c (ix86_target_string): Add -mavx512vbmi.
(PTA_AVX512VBMI): Define.
(ix86_option_override_internal): Handle new options.
(ix86_valid_target_attribute_inner_p): Add avx512vbmi.
(ix86_builtins): Add IX86_BUILTIN_VPMULTISHIFTQB512,
IX86_BUILTIN_VPMULTISHIFTQB256, IX86_BUILTIN_VPMULTISHIFTQB128,
IX86_BUILTIN_VPERMVARQI512_MASK, IX86_BUILTIN_VPERMT2VARQI512,
IX86_BUILTIN_VPERMT2VARQI512_MASKZ, IX86_BUILTIN_VPERMI2VARQI512,
IX86_BUILTIN_VPERMVARQI256_MASK, IX86_BUILTIN_VPERMVARQI128_MASK,
IX86_BUILTIN_VPERMT2VARQI256, IX86_BUILTIN_VPERMT2VARQI256_MASKZ,
IX86_BUILTIN_VPERMT2VARQI128, IX86_BUILTIN_VPERMI2VARQI256,
IX86_BUILTIN_VPERMI2VARQI128.
(bdesc_special_args): Add __builtin_ia32_vpmultishiftqb512_mask,
__builtin_ia32_vpmultishiftqb256_mask,
__builtin_ia32_vpmultishiftqb128_mask,
__builtin_ia32_permvarqi512_mask, __builtin_ia32_vpermt2varqi512_mask,
__builtin_ia32_vpermt2varqi512_maskz,
__builtin_ia32_vpermi2varqi512_mask, __builtin_ia32_permvarqi256_mask,
__builtin_ia32_permvarqi128_mask, __builtin_ia32_vpermt2varqi256_mask,
__builtin_ia32_vpermt2varqi256_maskz,
__builtin_ia32_vpermt2varqi128_mask,
__builtin_ia32_vpermt2varqi128_maskz,
__builtin_ia32_vpermi2varqi256_mask,
__builtin_ia32_vpermi2varqi128_mask.
(ix86_hard_regno_mode_ok): Allow big masks for AVX512VBMI.
* config/i386/i386.h (TARGET_AVX512VBMI, TARGET_AVX512VBMI_P): Define.
* config/i386/i386.opt: Add mavx512vbmi.
* config/i386/immintrin.h: Include avx512vbmiintrin.h,
avx512vbmivlintrin.h.
* config/i386/sse.md (unspec): Add UNSPEC_VPMULTISHIFT.
(VI1_AVX512VL): New iterator.
(avx512_permvarmodemask_name): Use it.
(avx512_vpermi2varmode3_maskz): Ditto.
(avx512_vpermi2varmode3sd_maskz_name): Ditto.
(avx512_vpermi2varmode3_mask): Ditto.
(avx512_vpermt2varmode3_maskz): Ditto.
(avx512_vpermt2varmode3sd_maskz_name): Ditto.
(avx512_vpermt2varmode3_mask): Ditto.
(vpmultishiftqbmodemask_name): Ditto.

gcc/testsuite/

* g++.dg/other/i386-2.C: Add -mavx512vbmi.
* g++.dg/other/i386-3.C: Ditto.
* gcc.target/i386/avx512f-helper.h: Add avx512vbmi-check.h.
* gcc.target/i386/avx512vbmi-check.h: Ditto.
* gcc.target/i386/avx512vbmi-vpermb-1.c: Ditto.
* gcc.target/i386/avx512vbmi-vpermb-2.c: Ditto.
* gcc.target/i386/avx512vbmi-vpermi2b-1.c: Ditto.
* gcc.target/i386/avx512vbmi-vpermi2b-2.c: Ditto.
* gcc.target/i386/avx512vbmi-vpermt2b-1.c: Ditto.
* gcc.target/i386/avx512vbmi-vpermt2b-2.c: Ditto.
* gcc.target/i386/avx512vbmi-vpmultishiftqb-1.c: Ditto.
* gcc.target/i386/avx512vbmi-vpmultishiftqb-2.c: Ditto.
* gcc.target/i386/avx512vl-vpermb-2.c: Ditto.
* gcc.target/i386/avx512vl-vpermi2b-2.c: Ditto.
* gcc.target/i386/avx512vl-vpermt2b-2.c: Ditto.
* gcc.target/i386/avx512vl-vpmaddhuq-2.c: Ditto.
* gcc.target/i386/avx512vl-vpmaddluq-2.c: Ditto.
* gcc.target/i386/avx512vl-vpmultishiftqb-2.c: Ditto.
* gcc.target/i386/i386.exp (check_effective_target_avx512vbmi): New.
* gcc.target/i386/sse-12.c: Add new options.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.

---
 

Re: [PATCH] Fix PR 63952 (Re: [PATCH, ifcvt] Allow CC mode if HAVE_cbranchcc4)

2014-11-21 Thread Richard Biener
On Fri, Nov 21, 2014 at 2:51 AM, Ulrich Weigand uweig...@de.ibm.com wrote:
 Richard Biener wrote:

 This probably caused bootstrap on s390x-linux to fail as in PR63952
 (last checked with rev. 217714).

 It seems we have both a back-end bug and a middle-end bug here.

 First of all, this code in optabs.c:prepare_cmp_insn is quite strange:

if (GET_MODE_CLASS (mode) == MODE_CC)
  {
gcc_assert (can_compare_p (comparison, CCmode, ccp_jump));
*ptest = gen_rtx_fmt_ee (comparison, VOIDmode, x, y);
return;
  }

 Note that can_compare_p checks whether the back-end accepts a test
 RTX created via:
   test = gen_rtx_fmt_ee (code, mode, const0_rtx, const0_rtx);

 All back-end cbranchcc4 patterns however verify that the first operand
 of the comparison is the flags register, so a const0_rtx will never
 match.  It doesn't seem useful to call can_compare_p with CCmode at all.

 The patch below changes prepare_cmp_insn to do an explicit
 insn_operand_matches test using the actual operands, just like
 is also done for non-CCmode comparisons.
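
The mismatch described above can be illustrated with a toy model in plain C.
This is only a sketch: the enum and the toy_* functions are hypothetical
stand-ins for RTL operands and pattern predicates, not GCC code. It shows why
a probe built from two placeholder zeros can never satisfy a predicate that
insists on the flags register, while a probe built from the actual operands
can.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy operand kinds; these stand in for RTL and are not GCC types.  */
enum operand_kind { OP_CONST0, OP_CC_REG, OP_GENERAL_REG };

/* Toy predicate for a cbranchcc4-style pattern: the first operand must
   be the flags register and the second must be constant zero.  */
static bool
toy_cbranchcc4_matches (enum operand_kind op1, enum operand_kind op2)
{
  return op1 == OP_CC_REG && op2 == OP_CONST0;
}

/* Probe in the style described for can_compare_p: both operands are
   placeholder zeros, so the flags-register check cannot pass.  */
static bool
toy_probe_with_placeholders (void)
{
  return toy_cbranchcc4_matches (OP_CONST0, OP_CONST0);
}

/* Probe with the operands that will actually be used.  */
static bool
toy_probe_with_actual_operands (enum operand_kind op1,
                                enum operand_kind op2)
{
  return toy_cbranchcc4_matches (op1, op2);
}
```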


 However, even so this is still rejected by the s390 back end.  This is
 because the s390 cbranchcc4 pattern is really quite wrong; it is restricted
 to accepting only EQ/NE comparisons when it could simply accept any valid
 comparison (i.e. where s390_comparison is true).

 In addition, it has a TARGET_HARD_FLOAT check for no reason I can see,
 and it has custom expander code that is in all cases a no-op and results
 in exactly the pattern in the insn to be emitted anyway.

 Fixed by the patch below as well.


 Tested on s390x-ibm-linux (with and without --with-arch=z196).

 OK for mainline?

Ok.

Thanks,
Richard.

 Bye,
 Ulrich


 ChangeLog:

 PR rtl-optimization/63952
 * optabs.c (prepare_cmp_insn): Do not call can_compare_p for CCmode.
 * config/s390/s390.md (cbranchcc4): Accept any s390_comparison.
 Remove incorrect TARGET_HARD_FLOAT check and no-op expander code.

 Index: gcc/optabs.c
 ===
 *** gcc/optabs.c(revision 217784)
 --- gcc/optabs.c(working copy)
 *** prepare_cmp_insn (rtx x, rtx y, enum rtx
 *** 4167,4174 

 if (GET_MODE_CLASS (mode) == MODE_CC)
   {
 !   gcc_assert (can_compare_p (comparison, CCmode, ccp_jump));
 !   *ptest = gen_rtx_fmt_ee (comparison, VOIDmode, x, y);
 return;
   }

 --- 4167,4177 

 if (GET_MODE_CLASS (mode) == MODE_CC)
   {
 !   enum insn_code icode = optab_handler (cbranch_optab, CCmode);
 !   test = gen_rtx_fmt_ee (comparison, VOIDmode, x, y);
 !   gcc_assert (icode != CODE_FOR_nothing
 !               && insn_operand_matches (icode, 0, test));
 !   *ptest = test;
 return;
   }

 Index: gcc/config/s390/s390.md
 ===
 *** gcc/config/s390/s390.md (revision 217784)
 --- gcc/config/s390/s390.md (working copy)
 ***
 *** 8142,8157 

   (define_expand "cbranchcc4"
     [(set (pc)
 !         (if_then_else (match_operator 0 "s390_eqne_operator"
                          [(match_operand 1 "cc_reg_operand" "")
 !                         (match_operand 2 "const0_operand" "")])
                         (label_ref (match_operand 3 "" ""))
                         (pc)))]
 !   "TARGET_HARD_FLOAT"
 !   "s390_emit_jump (operands[3],
 !     s390_emit_compare (GET_CODE (operands[0]), operands[1], operands[2]));
 !    DONE;")
 !


   ;;
 --- 8142,8154 

   (define_expand "cbranchcc4"
     [(set (pc)
 !         (if_then_else (match_operator 0 "s390_comparison"
                          [(match_operand 1 "cc_reg_operand" "")
 !                         (match_operand 2 "const_int_operand" "")])
                         (label_ref (match_operand 3 "" ""))
                         (pc)))]
 !   ""
 !   "")


   ;;

 --
   Dr. Ulrich Weigand
   GNU/Linux compilers and toolchain
   ulrich.weig...@de.ibm.com



Re: [PATCH]Add myself to MAINTAINERS

2014-11-21 Thread Richard Earnshaw
On 21/11/14 11:16, Renlin Li wrote:
 Hi,
 
 This patch adds me to the Write After Approval section of the
 MAINTAINERS file.
 
 Is it Okay to commit?
 
 Regards,
 Renlin Li
 
 
 ChangeLog:
 
 2014-11-21  Renlin Li  renlin...@arm.com
 
  * MAINTAINERS (Write After Approval): Add myself.
 
 
 tmp.patch
 
 
 diff --git a/MAINTAINERS b/MAINTAINERS
 index 56e68c5..96a7497 100644
 --- a/MAINTAINERS
 +++ b/MAINTAINERS
 @@ -461,6 +461,7 @@ Georg-Johann Lay  a...@gjlay.de
  Marc Lehmann p...@goof.com
  James Lemke  jwle...@codesourcery.com
  Kriang Lerdsuwanakij lerds...@users.sourceforge.net
 +Renlin Li renlin...@arm.com
  Xinliang David Li davi...@google.com
  Jiangning Liu jiangning@arm.com
  Sa Liu   sa...@de.ibm.com
 

OK

R.




Re: [PATCH 3/4][x86] Add clwb,pcommit,avx512avbmi,avx512ifma.

2014-11-21 Thread Ilya Tocar
On 20 Nov 09:43, Uros Bizjak wrote:
 On Wed, Nov 19, 2014 at 6:32 PM, Ilya Tocar tocarip.in...@gmail.com wrote:
  Hi,
 
  New revision of Intel ISA reference [1] has new instructions:
  Clwb, pcommit and new flavors of AVX512. Patch below adds them.
  I understand that stage 1 is closed, however those changes shouldn't
  affect anything outside of the i386 backend. They are also extremely
  unlikely to break existing functionality, and I personally think it's
  desirable for the newest GCC to support the newest spec.
  Bootstrapped/regtested on x86_64-unknown-linux-gnu.
  Ok for trunk?
 
 Please split the patch into patch series, like it was done previously
 for AVX512F patches.
 
 Uros.

Done. This part adds clwb.
Bootstrapped/passes make-check.
Ok for trunk?

gcc/

* common/config/i386/i386-common.c (OPTION_MASK_ISA_CLWB_UNSET,
OPTION_MASK_ISA_CLWB_SET): New.
(ix86_handle_option): Handle OPT_mclwb.
* config.gcc: Add clwbintrin.h.
* config/i386/clwbintrin.h: New file.
* config/i386/cpuid.h (bit_CLWB): Define.
* config/i386/driver-i386.c (host_detect_local_cpu): Detect clwb. 
* config/i386/i386-c.c (ix86_target_macros_internal): Define
__CLWB__.
* config/i386/i386.c (ix86_target_string): Add -mclwb.
(PTA_CLWB): Define.
(ix86_option_override_internal): Handle new option.
(ix86_valid_target_attribute_inner_p): Add clwb.
(ix86_builtins): Add IX86_BUILTIN_CLWB.
(ix86_init_mmx_sse_builtins): Add __builtin_ia32_clwb.
(ix86_expand_builtin): Handle IX86_BUILTIN_CLWB.
* config/i386/i386.h (TARGET_CLWB, TARGET_CLWB_P): Define.
* config/i386/i386.md (unspecv): Add UNSPECV_CLWB.
(clwb): New instruction.
* config/i386/i386.opt: Add mclwb.
* config/i386/x86intrin.h: Include clwbintrin.h.

gcc/testsuite/

* g++.dg/other/i386-2.C: Add -mclwb.
* g++.dg/other/i386-3.C: Ditto.
* gcc.target/i386/clwb-1.c: New test.
* gcc.target/i386/sse-12.c: Add new options.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
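
The option handling listed for ix86_handle_option follows a set/unset
pattern that can be sketched in standalone C. This is an illustration under
assumptions: the struct, the toy_* names and the bit position are
hypothetical, and the sketch assumes that disabling a feature clears its bit
with "&= ~mask" while still recording that the user mentioned it explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignment, for illustration only.  */
#define TOY_ISA_CLWB (UINT64_C (1) << 0)

struct toy_isa_opts
{
  uint64_t isa_flags;           /* features currently enabled */
  uint64_t isa_flags_explicit;  /* features the user named explicitly */
};

/* Enable or disable the feature bits in MASK.  Either way, remember
   that the user asked for it, so later defaulting can leave it alone.  */
static void
toy_handle_isa_option (struct toy_isa_opts *opts, uint64_t mask, bool value)
{
  if (value)
    opts->isa_flags |= mask;
  else
    opts->isa_flags &= ~mask;

  opts->isa_flags_explicit |= mask;
}
```

Keeping a separate "explicit" word lets a later -march default enable a
feature only when the user has not already said -mno-<feature>.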

---
 gcc/common/config/i386/i386-common.c   | 15 +++
 gcc/config.gcc                         |  4 +--
 gcc/config/i386/clwbintrin.h           | 49 ++
 gcc/config/i386/cpuid.h                |  1 +
 gcc/config/i386/driver-i386.c          |  6 +++--
 gcc/config/i386/i386-c.c               |  2 ++
 gcc/config/i386/i386.c                 | 23
 gcc/config/i386/i386.h                 |  2 ++
 gcc/config/i386/i386.md                | 12 +
 gcc/config/i386/i386.opt               |  4 +++
 gcc/config/i386/x86intrin.h            |  2 ++
 gcc/testsuite/g++.dg/other/i386-2.C    |  2 +-
 gcc/testsuite/g++.dg/other/i386-3.C    |  2 +-
 gcc/testsuite/gcc.target/i386/clwb-1.c | 11
 gcc/testsuite/gcc.target/i386/sse-12.c |  2 +-
 gcc/testsuite/gcc.target/i386/sse-13.c |  2 +-
 gcc/testsuite/gcc.target/i386/sse-14.c |  2 +-
 gcc/testsuite/gcc.target/i386/sse-22.c |  2 +-
 gcc/testsuite/gcc.target/i386/sse-23.c |  2 +-
 19 files changed, 134 insertions(+), 11 deletions(-)
 create mode 100644 gcc/config/i386/clwbintrin.h
 create mode 100644 gcc/testsuite/gcc.target/i386/clwb-1.c

diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index 1c4f15e..bad0988 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -85,6 +85,7 @@ along with GCC; see the file COPYING3.  If not see
   (OPTION_MASK_ISA_XSAVES | OPTION_MASK_ISA_XSAVE)
 #define OPTION_MASK_ISA_XSAVEC_SET \
   (OPTION_MASK_ISA_XSAVEC | OPTION_MASK_ISA_XSAVE)
+#define OPTION_MASK_ISA_CLWB_SET OPTION_MASK_ISA_CLWB
 
 /* SSE4 includes both SSE4.1 and SSE4.2. -msse4 should be the same
as -msse4.2.  */
@@ -181,6 +182,7 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA_CLFLUSHOPT_UNSET OPTION_MASK_ISA_CLFLUSHOPT
 #define OPTION_MASK_ISA_XSAVEC_UNSET OPTION_MASK_ISA_XSAVEC
 #define OPTION_MASK_ISA_XSAVES_UNSET OPTION_MASK_ISA_XSAVES
+#define OPTION_MASK_ISA_CLWB_UNSET OPTION_MASK_ISA_CLWB
 
 /* SSE4 includes both SSE4.1 and SSE4.2.  -mno-sse4 should the same
as -mno-sse4.1. */
@@ -901,6 +903,19 @@ ix86_handle_option (struct gcc_options *opts,
}
   return true;
 
+case OPT_mclwb:
+  if (value)
+   {
+ opts->x_ix86_isa_flags |= OPTION_MASK_ISA_CLWB_SET;
+ opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_CLWB_SET;
+   }
+  else
+   {
+ opts->x_ix86_isa_flags &= ~OPTION_MASK_ISA_CLWB_UNSET;
+ opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_CLWB_UNSET;
+   }
+  return true;
+
   /* Comes from final.c -- no real reason to change it.  */
 #define MAX_CODE_ALIGN 16
 
diff --git a/gcc/config.gcc b/gcc/config.gcc
index da2a723..766f13b 100644

Re: [PATCH 1/2] PR debug/38757 gcc does not emit DW_LANG_C99.

2014-11-21 Thread Richard Biener
On Fri, Nov 21, 2014 at 8:56 AM, Jakub Jelinek ja...@redhat.com wrote:
 On Thu, Nov 20, 2014 at 11:30:11PM +0100, Mark Wielaard wrote:
 --- a/gcc/config/avr/avr-c.c
 +++ b/gcc/config/avr/avr-c.c
 @@ -386,7 +386,8 @@ avr_cpu_cpp_builtins (struct cpp_reader *pfile)
   (as mentioned in ISO/IEC DTR 18037; Annex F.2) which is not
   implemented in GCC up to now.  */

 -  if (!strcmp (lang_hooks.name, "GNU C"))
 +  if (strncmp (lang_hooks.name, "GNU C", 5) == 0
 +      && strncmp (lang_hooks.name, "GNU C++", 7) != 0)

 I wonder if the tests for C language shouldn't be better done
 as (strncmp (lang_hooks.name, "GNU C", 5) == 0
     && strchr ("0123456789", lang_hooks.name[5]) != NULL)
 or (strncmp (lang_hooks.name, "GNU C", 5) == 0
     && (ISDIGIT (lang_hooks.name[5]) || lang_hooks.name[5] == '\0'))
 to make it explicit what we are looking for, not what we aren't.

Or even make that a helper function in langhooks.[ch]

  lang_GNU_C (), lang_GNU_CXX ()
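
A minimal standalone sketch of what such helpers might look like, using the
second check suggested above. The toy_ prefix marks these as hypothetical;
they take the name as a parameter instead of reading lang_hooks.name, and use
the standard isdigit in place of GCC's ISDIGIT.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* True for "GNU C" optionally followed by a standard version such as
   "GNU C99" or "GNU C11", but not for "GNU C++..." names.  */
static bool
toy_lang_GNU_C (const char *name)
{
  return strncmp (name, "GNU C", 5) == 0
         && (isdigit ((unsigned char) name[5]) || name[5] == '\0');
}

/* True for any name starting with "GNU C++", e.g. "GNU C++98".  */
static bool
toy_lang_GNU_CXX (const char *name)
{
  return strncmp (name, "GNU C++", 7) == 0;
}
```

Stating what the name must look like, rather than excluding "GNU C++" after
the fact, keeps the C check correct regardless of the order of the tests.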

 +  either, so for now use 0.  Match GNU C++ first, since it needs to
 +  be compared with strncmp, like GNU C, which has the same prefix.  */
 +  if (! strncmp (language_string, "GNU C++", 7)
 +    || ! strcmp (language_string, "GNU Objective-C++"))

 Wrong formatting, || should be below ! on the previous line.

 + i = 9;
 +  else if (! strncmp (language_string, "GNU C", 5)
 || ! strcmp (language_string, "GNU GIMPLE")
 || ! strcmp (language_string, "GNU Go"))

 And here too.  But if you use a different check for C (see above), you could
 avoid moving the C++ case first.

 --- a/gcc/langhooks.h
 +++ b/gcc/langhooks.h
 @@ -261,7 +261,8 @@ struct lang_hooks_for_lto

  struct lang_hooks
  {
 -  /* String identifying the front end.  e.g. "GNU C++".  */
 +  /* String identifying the front end.  e.g. "GNU C++".
 + Might include language version being used.  */

 As we no longer have "GNU C++" as any name, using it as an example
 is weird.  So,
   /* String identifying the front end and optionally language standard
  version, e.g. "GNU C++98" or "GNU Java".  */
 ?

 LGTM otherwise.

Yes, otherwise looks good.

Thanks,
Richard.

 Jakub


Re: [PATCH][x86] Add clwb,pcommit,avx512avbmi,avx512ifma.

2014-11-21 Thread Ilya Tocar
On 20 Nov 09:43, Uros Bizjak wrote:
 On Wed, Nov 19, 2014 at 6:32 PM, Ilya Tocar tocarip.in...@gmail.com wrote:
  Hi,
 
  New revision of Intel ISA reference [1] has new instructions:
  Clwb, pcommit and new flavors of AVX512. Patch below adds them.
  I understand that stage 1 is closed, however those changes shouldn't
  affect anything outside of the i386 backend. They are also extremely
  unlikely to break existing functionality, and I personally think it's
  desirable for the newest GCC to support the newest spec.
  Bootstrapped/regtested on x86_64-unknown-linux-gnu.
  Ok for trunk?
 
 Please split the patch into patch series, like it was done previously
 for AVX512F patches.
 
 Uros.
 
  [1]:https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
 

This part adds pcommit.
Bootstraps/passes make check.
Ok for trunk?

gcc/


* common/config/i386/i386-common.c (OPTION_MASK_ISA_PCOMMIT_UNSET,
OPTION_MASK_ISA_PCOMMIT_SET): New.
(ix86_handle_option): Handle OPT_mpcommit.
* config.gcc: Add pcommitintrin.h.
* config/i386/pcommitintrin.h: New file.
* config/i386/cpuid.h (bit_PCOMMIT): Define.
* config/i386/driver-i386.c (host_detect_local_cpu): Detect pcommit.
* config/i386/i386-c.c (ix86_target_macros_internal): Define
__PCOMMIT__.
* config/i386/i386.c (ix86_target_string): Add -mpcommit.
(PTA_PCOMMIT): Define.
(ix86_option_override_internal): Handle new option.
(ix86_valid_target_attribute_inner_p): Add pcommit.
(ix86_builtins): Add IX86_BUILTIN_PCOMMIT.
(bdesc_special_args): Add __builtin_ia32_pcommit.
* config/i386/i386.h (TARGET_PCOMMIT, TARGET_PCOMMIT_P): Define.
* config/i386/i386.md (unspecv): Add UNSPECV_PCOMMIT.
(pcommit): New instruction.
* config/i386/i386.opt: Add mpcommit.
* config/i386/x86intrin.h: Include pcommitintrin.h.

 
---
 gcc/common/config/i386/i386-common.c  | 15 ++
 gcc/config.gcc|  4 +--
 gcc/config/i386/cpuid.h   |  1 +
 gcc/config/i386/driver-i386.c |  5 +++-
 gcc/config/i386/i386-c.c  |  2 ++
 gcc/config/i386/i386.c| 12 
 gcc/config/i386/i386.h|  2 ++
 gcc/config/i386/i386.md   | 10 +++
 gcc/config/i386/i386.opt  |  4 +++
 gcc/config/i386/pcommitintrin.h   | 49 +++
 gcc/config/i386/x86intrin.h   |  2 ++
 gcc/testsuite/g++.dg/other/i386-2.C   |  2 +-
 gcc/testsuite/g++.dg/other/i386-3.C   |  2 +-
 gcc/testsuite/gcc.target/i386/pcommit-1.c | 11 +++
 gcc/testsuite/gcc.target/i386/sse-12.c|  2 +-
 gcc/testsuite/gcc.target/i386/sse-13.c|  2 +-
 gcc/testsuite/gcc.target/i386/sse-14.c|  2 +-
 gcc/testsuite/gcc.target/i386/sse-22.c|  2 +-
 gcc/testsuite/gcc.target/i386/sse-23.c|  2 +-
 19 files changed, 121 insertions(+), 10 deletions(-)
 create mode 100644 gcc/config/i386/pcommitintrin.h
 create mode 100644 gcc/testsuite/gcc.target/i386/pcommit-1.c

diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index bad0988..2e09d77 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -86,6 +86,7 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA_XSAVEC_SET \
   (OPTION_MASK_ISA_XSAVEC | OPTION_MASK_ISA_XSAVE)
 #define OPTION_MASK_ISA_CLWB_SET OPTION_MASK_ISA_CLWB
+#define OPTION_MASK_ISA_PCOMMIT_SET OPTION_MASK_ISA_PCOMMIT
 
 /* SSE4 includes both SSE4.1 and SSE4.2. -msse4 should be the same
as -msse4.2.  */
@@ -182,6 +183,7 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA_CLFLUSHOPT_UNSET OPTION_MASK_ISA_CLFLUSHOPT
 #define OPTION_MASK_ISA_XSAVEC_UNSET OPTION_MASK_ISA_XSAVEC
 #define OPTION_MASK_ISA_XSAVES_UNSET OPTION_MASK_ISA_XSAVES
+#define OPTION_MASK_ISA_PCOMMIT_UNSET OPTION_MASK_ISA_PCOMMIT
 #define OPTION_MASK_ISA_CLWB_UNSET OPTION_MASK_ISA_CLWB
 
 /* SSE4 includes both SSE4.1 and SSE4.2.  -mno-sse4 should the same
@@ -903,6 +905,19 @@ ix86_handle_option (struct gcc_options *opts,
}
   return true;
 
+case OPT_mpcommit:
+  if (value)
+   {
+ opts->x_ix86_isa_flags |= OPTION_MASK_ISA_PCOMMIT_SET;
+ opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_PCOMMIT_SET;
+   }
+  else
+   {
+ opts->x_ix86_isa_flags &= ~OPTION_MASK_ISA_PCOMMIT_UNSET;
+ opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_PCOMMIT_UNSET;
+   }
+  return true;
+
 case OPT_mclwb:
   if (value)
{
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 766f13b..fa3e1fc 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -369,7 +369,7 @@ i[34567]86-*-*)
   xsavesintrin.h avx512dqintrin.h avx512bwintrin.h
   avx512vlintrin.h avx512vlbwintrin.h avx512vldqintrin.h

Re: [PATCH] Detect a pack-unpack pattern in GCC vectorizer and optimize it.

2014-11-21 Thread Evgeny Stupachenko
Hi,

Please note that currently the test:

int a[N];
short b[N*2];

for (int i = 0; i < N; ++i)
  a[i] = b[i*2];

is compiled to (with -march=corei7 -O2 -ftree-vectorize):

movdqa  b(%rax), %xmm0
movdqa  b-16(%rax), %xmm2
pand    %xmm1, %xmm0
pand    %xmm1, %xmm2
packusdw        %xmm2, %xmm0
pmovsxwd        %xmm0, %xmm2
psrldq  $8, %xmm0
pmovsxwd        %xmm0, %xmm0
movaps  %xmm2, a-32(%rax)
movaps  %xmm0, a-16(%rax)

which is closer to the requested sequence.

Thanks,
Evgeny


On Wed, Jun 25, 2014 at 8:34 PM, Cong Hou co...@google.com wrote:
 On Tue, Jun 24, 2014 at 4:05 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Sat, May 3, 2014 at 2:39 AM, Cong Hou co...@google.com wrote:
 On Mon, Apr 28, 2014 at 4:04 AM, Richard Biener rguent...@suse.de wrote:
 On Thu, 24 Apr 2014, Cong Hou wrote:

 Given the following loop:

 int a[N];
 short b[N*2];

 for (int i = 0; i < N; ++i)
   a[i] = b[i*2];


 After being vectorized, the access to b[i*2] will be compiled into
 several packing statements, while the type promotion from short to int
 will be compiled into several unpacking statements. With this patch,
 each pair of pack/unpack statements will be replaced by less expensive
 statements (with shift or bit-and operations).
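
The shift-based replacement can be sketched for a single 32-bit lane in
scalar C. Assumptions: a little-endian layout, so the even-indexed short sits
in the low half of the lane, and an arithmetic right shift of negative values
(implementation-defined in ISO C, but what GCC provides).
even_short_widened is a hypothetical name used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* LANE holds two adjacent shorts: the even-indexed one in bits 0..15
   and the odd-indexed one in bits 16..31.  Return the even one,
   sign-extended to 32 bits, using only a left and a right shift.  */
static int32_t
even_short_widened (uint32_t lane)
{
  /* The left shift is done on the unsigned value, which discards the
     odd element without signed overflow.  The conversion to int32_t
     and the right shift then replicate the sign bit (assumed
     arithmetic shift, see the note above).  */
  return (int32_t) (lane << 16) >> 16;
}
```

Applied to every lane of a vector at once, this is one shift left by 16
followed by one arithmetic shift right by 16 per input vector, with no
pack/unpack step in between.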

 On x86_64, the loop above will be compiled into the following assembly
 (with -O2 -ftree-vectorize):

 movdqu 0x10(%rcx),%xmm3
 movdqu -0x20(%rcx),%xmm0
 movdqa %xmm0,%xmm2
 punpcklwd %xmm3,%xmm0
 punpckhwd %xmm3,%xmm2
 movdqa %xmm0,%xmm3
 punpcklwd %xmm2,%xmm0
 punpckhwd %xmm2,%xmm3
 movdqa %xmm1,%xmm2
 punpcklwd %xmm3,%xmm0
 pcmpgtw %xmm0,%xmm2
 movdqa %xmm0,%xmm3
 punpckhwd %xmm2,%xmm0
 punpcklwd %xmm2,%xmm3
 movups %xmm0,-0x10(%rdx)
 movups %xmm3,-0x20(%rdx)


 With this patch, the generated assembly is shown below:

 movdqu 0x10(%rcx),%xmm0
 movdqu -0x20(%rcx),%xmm1
 pslld  $0x10,%xmm0
 psrad  $0x10,%xmm0
 pslld  $0x10,%xmm1
 movups %xmm0,-0x10(%rdx)
 psrad  $0x10,%xmm1
 movups %xmm1,-0x20(%rdx)


 Bootstrapped and tested on x86-64. OK for trunk?

 This is an odd place to implement such transform.  Also if it
 is faster or not depends on the exact ISA you target - for
 example ppc has constraints on the maximum number of shifts
 carried out in parallel and the above has 4 in very short
 succession.  Esp. for the sign-extend path.

 Thank you for the information about ppc. If this is an issue, I think
 we can do it in a target dependent way.



 So this looks more like an opportunity for a post-vectorizer
 transform on RTL or for the vectorizer special-casing
 widening loads with a vectorizer pattern.

 I am not sure if the RTL transform is more difficult to implement. I
 prefer the widening loads method, which can be detected in a pattern
 recognizer. The target related issue will be resolved by only
 expanding the widening load on those targets where this pattern is
 beneficial. But this requires new tree operations to be defined. What
 is your suggestion?

 I apologize for the delayed reply.

 Likewise ;)

 I suggest implementing this optimization in vector lowering in
 tree-vect-generic.c.  For your example, this sees

   vect__5.7_32 = MEM[symbol: b, index: ivtmp.15_13, offset: 0B];
   vect__5.8_34 = MEM[symbol: b, index: ivtmp.15_13, offset: 16B];
   vect_perm_even_35 = VEC_PERM_EXPR <vect__5.7_32, vect__5.8_34, { 0, 2, 4, 6, 8, 10, 12, 14 }>;
   vect__6.9_37 = [vec_unpack_lo_expr] vect_perm_even_35;
   vect__6.9_38 = [vec_unpack_hi_expr] vect_perm_even_35;

 where you can apply the pattern matching and transform (after checking
 with the target, of course).

 This sounds good to me! I'll try to make a patch following your suggestion.

 Thank you!


 Cong


 Richard.


 thanks,
 Cong


 Richard.


Re: [ia64 PATCH] Fix up ia64 attribute handling (PR target/61137)

2014-11-21 Thread Richard Biener
On Fri, Nov 21, 2014 at 12:01 PM, Andreas Schwab sch...@suse.de wrote:
 Jakub Jelinek ja...@redhat.com writes:

 The following untested patch fixes that (tested on small-addr-1.c with
 a cross-compiler), I don't have ia64 hw nor spare cycles to test this
 though, so I'm just offering the patch as is if anyone wants to test it.
 Perhaps better testsuite coverage wouldn't hurt (test the model (small)
 attribute also in C++, perhaps test the common_object attribute on VMS?).

 2014-11-20  Jakub Jelinek  ja...@redhat.com

   PR target/61137
   * config/ia64/ia64.c (ia64_attribute_takes_identifier_p): New function.
   (TARGET_ATTRIBUTE_TAKES_IDENTIFIER_P): Redefine to it.

 Looks good.

 http://gcc.gnu.org/ml/gcc-testresults/2014-11/msg02276.html

Ok.

Thanks,
Richard.

 Andreas.

 --
 Andreas Schwab, sch...@linux-m68k.org
 GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
 And now for something completely different.


Re: [PATCH] PR ipa/63909 ICE: SIGSEGV in ipa_icf_gimple::func_checker::compare_bb()

2014-11-21 Thread Martin Liška

On 11/20/2014 05:41 PM, Richard Biener wrote:

On Thu, Nov 20, 2014 at 5:30 PM, Martin Liška mli...@suse.cz wrote:

Hello.

The following patch fixes an ICE in IPA ICF. The problem was that the number
of non-debug statements in a BB can change (for instance, through IPA split),
so the number is now recomputed.


Huh, so can it get different for both candidates?  I think the stmt compare
loop should be terminated on gsi_end_p of either iterator and return
false for any remaining non-debug-stmts on the other.

Thus, not walk all stmts twice here.
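
The suggested loop shape can be sketched over plain arrays. This is not the
ipa-icf-gimple.c code: struct toy_stmt, toy_skip_debug and toy_compare_bb are
hypothetical stand-ins for GIMPLE statements and iterators, and comparing an
integer code stands in for the real statement comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy statement: CODE is what gets compared, IS_DEBUG marks debug
   statements that must be ignored.  */
struct toy_stmt
{
  int code;
  bool is_debug;
};

/* Advance I past any debug statements; returns N at the end.  */
static size_t
toy_skip_debug (const struct toy_stmt *s, size_t n, size_t i)
{
  while (i < n && s[i].is_debug)
    i++;
  return i;
}

/* Walk both sequences once, in lockstep, looking only at non-debug
   statements.  Stop when either runs out; the blocks match only if
   both ran out together.  */
static bool
toy_compare_bb (const struct toy_stmt *a, size_t na,
                const struct toy_stmt *b, size_t nb)
{
  size_t i = toy_skip_debug (a, na, 0);
  size_t j = toy_skip_debug (b, nb, 0);

  while (i < na && j < nb)
    {
      if (a[i].code != b[j].code)
        return false;
      i = toy_skip_debug (a, na, i + 1);
      j = toy_skip_debug (b, nb, j + 1);
    }

  /* Leftover non-debug statements on either side mean a mismatch.  */
  return i == na && j == nb;
}
```

No statement count has to be computed up front, so nothing goes stale if a
block is changed between analysis and comparison.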


Hello.

Sorry for the previous patch; you are right that it can be fixed in a
cleaner way. Please take a look at the attached patch.




As IPA split is run early I don't see how it should affect a real IPA
pass though?




Sorry for the imprecise information; the problematic BB is changed here:
#0  gsi_split_seq_before (i=0x7fffd550, pnew_seq=0x7fffd528) at ../../gcc/gimple-iterator.c:429
#1  0x00b95a2a in gimple_split_block (bb=0x76c41548, stmt=0x0) at ../../gcc/tree-cfg.c:5707
#2  0x007563cf in split_block (bb=0x76c41548, i=i@entry=0x0) at ../../gcc/cfghooks.c:508
#3  0x00756b44 in split_block_after_labels (bb=<optimized out>) at ../../gcc/cfghooks.c:549
#4  make_forwarder_block (bb=<optimized out>, redirect_edge_p=redirect_edge_p@entry=0x75d4e0 <mfb_keep_just(edge_def*)>, new_bb_cbk=new_bb_cbk@entry=0x0) at ../../gcc/cfghooks.c:842
#5  0x0076085a in create_preheader (loop=0x76d56948, flags=<optimized out>) at ../../gcc/cfgloopmanip.c:1563
#6  0x00760aea in create_preheaders (flags=1) at ../../gcc/cfgloopmanip.c:1613
#7  0x009bc6b0 in apply_loop_flags (flags=15) at ../../gcc/loop-init.c:75
#8  0x009bc7d3 in loop_optimizer_init (flags=15) at ../../gcc/loop-init.c:136
#9  0x00957914 in estimate_function_body_sizes (node=0x76c47620, early=false) at ../../gcc/ipa-inline-analysis.c:2480
#10 0x0095948b in compute_inline_parameters (node=0x76c47620, early=false) at ../../gcc/ipa-inline-analysis.c:2907
#11 0x0095bd88 in inline_analyze_function (node=0x76c47620) at ../../gcc/ipa-inline-analysis.c:3994
#12 0x0095bed3 in inline_generate_summary () at ../../gcc/ipa-inline-analysis.c:4045
#13 0x00a70b71 in execute_ipa_summary_passes (ipa_pass=0x1dcb9e0) at ../../gcc/passes.c:2137
#14 0x00777a15 in ipa_passes () at ../../gcc/cgraphunit.c:2074
#15 symbol_table::compile (this=this@entry=0x76c3a000) at ../../gcc/cgraphunit.c:2187
#16 0x00778bcd in symbol_table::finalize_compilation_unit (this=0x76c3a000) at ../../gcc/cgraphunit.c:2340
#17 0x006580ee in c_write_global_declarations () at ../../gcc/c/c-decl.c:10777
#18 0x00b5bb8b in compile_file () at ../../gcc/toplev.c:584
#19 0x00b5def1 in do_compile () at ../../gcc/toplev.c:2041
#20 0x00b5e0fa in toplev::main (this=0x7fffdc9f, argc=20, argv=0x7fffdd98) at ../../gcc/toplev.c:2138
#21 0x0063f1d9 in main (argc=20, argv=0x7fffdd98) at ../../gcc/main.c:38


Patch can bootstrap on x86_64-linux-pc and no regression has been seen.
Ready for trunk?


Thanks,
Martin



Thanks,
Richard.


Patch can bootstrap on x86_64-linux-pc and no regression has been seen.
Ready for trunk?

Thanks,
Martin


From 09b90f6a5ec1e49464f57c333af43574ad8c1375 Mon Sep 17 00:00:00 2001
From: mliska mli...@suse.cz
Date: Thu, 20 Nov 2014 16:28:54 +0100
Subject: [PATCH] Fix and new test.

gcc/ChangeLog:

2014-11-21  Martin Liska  mli...@suse.cz

	* gimple-iterator.h (gsi_start_bb_nondebug): New function.
	* ipa-icf-gimple.c (func_checker::compare_bb): Correct iteration
	replaces loop based on precomputed number of non-debug statements.

gcc/testsuite/ChangeLog:

2014-11-21  Martin Liska  mli...@suse.cz

	* gcc.dg/ipa/pr63909.c: New test.
---
 gcc/gimple-iterator.h  | 13 +
 gcc/ipa-icf-gimple.c   | 25 ++---
 gcc/testsuite/gcc.dg/ipa/pr63909.c | 27 +++
 3 files changed, 50 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/pr63909.c

diff --git a/gcc/gimple-iterator.h b/gcc/gimple-iterator.h
index fb6cc07..e9602b3 100644
--- a/gcc/gimple-iterator.h
+++ b/gcc/gimple-iterator.h
@@ -211,6 +211,19 @@ gsi_stmt (gimple_stmt_iterator i)
   return i.ptr;
 }
 
+/* Return a new iterator pointing to the first non-debug statement
+   in basic block BB.  */
+
+static inline gimple_stmt_iterator
+gsi_start_bb_nondebug (basic_block bb)
+{
+  gimple_stmt_iterator gsi = gsi_start_bb (bb);
+  while (!gsi_end_p (gsi) && is_gimple_debug (gsi_stmt (gsi)))
+    gsi_next (&gsi);
+
+  return gsi;
+}
+
 /* Return a block statement iterator that points to the first non-label
statement in block BB.  */
 
diff --git a/gcc/ipa-icf-gimple.c b/gcc/ipa-icf-gimple.c
index 8f2a438..ec0290a 100644
--- a/gcc/ipa-icf-gimple.c
+++ b/gcc/ipa-icf-gimple.c
@@ -559,24 +559,16 @@ func_checker::parse_labels (sem_bb *bb)
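
For illustration only (this is not part of the posted patch): the idea behind gsi_start_bb_nondebug and the reworked compare_bb loop can be sketched outside GCC as a lockstep walk over two statement sequences that skips debug entries and stops as soon as either side runs out. Stmt, skip_debug and same_nondebug_stmts are hypothetical names, and plain vectors stand in for GIMPLE sequences.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a GIMPLE statement; not a GCC type.
struct Stmt
{
  int code;       // illustrative opcode
  bool is_debug;  // true for debug-only statements
};

// Return the index of the first non-debug statement at or after I.
std::size_t
skip_debug (const std::vector<Stmt> &seq, std::size_t i)
{
  while (i < seq.size () && seq[i].is_debug)
    ++i;
  return i;
}

// Return true if A and B hold the same non-debug statements in the same
// order.  No statement count is computed in advance.
bool
same_nondebug_stmts (const std::vector<Stmt> &a, const std::vector<Stmt> &b)
{
  std::size_t i = skip_debug (a, 0);
  std::size_t j = skip_debug (b, 0);

  while (i < a.size () && j < b.size ())
    {
      if (a[i].code != b[j].code)
        return false;
      i = skip_debug (a, i + 1);
      j = skip_debug (b, j + 1);
    }

  // Equal only if both walks reached the end together; a leftover
  // non-debug statement on either side means a mismatch.
  return i == a.size () && j == b.size ();
}
```

Because each sequence is walked once and the result depends only on what the iterators see, a change in the number of non-debug statements between analysis and comparison cannot push the walk past the end of a block.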
 

Re: [PATCH 8/9] Negative numbers added for sreal class.

2014-11-21 Thread Richard Biener
On Fri, Nov 21, 2014 at 12:21 PM, Martin Liška mli...@suse.cz wrote:
 On 11/14/2014 11:48 AM, Richard Biener wrote:

 On Thu, Nov 13, 2014 at 1:35 PM, mliska mli...@suse.cz wrote:

 gcc/ChangeLog:

 2014-11-13  Martin Liska  mli...@suse.cz

  * predict.c (propagate_freq): More elegant sreal API is used.
  (estimate_bb_frequencies): New static constants defined by sreal
  replace precomputed ones.
  * sreal.c (sreal::normalize): New function.
  (sreal::to_int): Likewise.
  (sreal::operator+): Likewise.
  (sreal::operator-): Likewise.
  * sreal.h: Definition of new functions added.


 Please use gcc_checking_assert()s everywhere.  sreal is supposed
 to be fast... (I see it has current uses of gcc_assert - you may want
 to mass-convert them as a followup).
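
 An illustrative sketch (not taken from the patch) of why a checking-only
 assert suits hot code: when checking is off, the condition is still parsed
 but never evaluated. The macro and helper names here are hypothetical; GCC's
 own gcc_checking_assert is defined in its system.h and may differ in detail.

```cpp
#include <cstdlib>

#ifdef SKETCH_ENABLE_CHECKING
// Checking build: abort if the condition does not hold.
#define sketch_checking_assert(EXPR) \
  ((void) ((EXPR) ? 0 : (std::abort (), 0)))
#else
// Release build: EXPR is type-checked but short-circuited away.
#define sketch_checking_assert(EXPR) ((void) (0 && (EXPR)))
#endif

// Hypothetical hot-path helper: scale SIG by 2^SHIFT.
inline unsigned long
scale_by_pow2 (unsigned long sig, int shift)
{
  sketch_checking_assert (shift >= 0 && shift < 32);
  return sig << shift;
}
```

 With SKETCH_ENABLE_CHECKING undefined, the assert costs nothing at run time,
 which is the property asked for above.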

 ---
   gcc/predict.c | 30 +++-
   gcc/sreal.c   | 56 
   gcc/sreal.h   | 75
 ---
   3 files changed, 126 insertions(+), 35 deletions(-)

 diff --git a/gcc/predict.c b/gcc/predict.c
 index 0215e91..0f640f5 100644
 --- a/gcc/predict.c
 +++ b/gcc/predict.c
 @@ -82,7 +82,7 @@ along with GCC; see the file COPYING3.  If not see

   /* real constants: 0, 1, 1-1/REG_BR_PROB_BASE, REG_BR_PROB_BASE,
 1/REG_BR_PROB_BASE, 0.5, BB_FREQ_MAX.  */
 -static sreal real_zero, real_one, real_almost_one, real_br_prob_base,
 +static sreal real_almost_one, real_br_prob_base,
   real_inv_br_prob_base, real_one_half, real_bb_freq_max;

   static void combine_predictions_for_insn (rtx_insn *, basic_block);
 @@ -2528,13 +2528,13 @@ propagate_freq (basic_block head, bitmap tovisit)
  bb->count = bb->frequency = 0;
   }

 -  BLOCK_INFO (head)->frequency = real_one;
 +  BLOCK_INFO (head)->frequency = sreal::one ();
 last = head;
 for (bb = head; bb; bb = nextbb)
   {
 edge_iterator ei;
 -  sreal cyclic_probability = real_zero;
 -  sreal frequency = real_zero;
 +  sreal cyclic_probability = sreal::zero ();
 +  sreal frequency = sreal::zero ();

 nextbb = BLOCK_INFO (bb)->next;
 BLOCK_INFO (bb)->next = NULL;
 @@ -2559,13 +2559,13 @@ propagate_freq (basic_block head, bitmap tovisit)
   * BLOCK_INFO (e->src)->frequency /
REG_BR_PROB_BASE);  */

 -   sreal tmp (e->probability, 0);
 +   sreal tmp = e->probability;
  tmp *= BLOCK_INFO (e->src)->frequency;
  tmp *= real_inv_br_prob_base;
  frequency += tmp;
}

 - if (cyclic_probability == real_zero)
 + if (cyclic_probability == sreal::zero ())
  {
   BLOCK_INFO (bb)->frequency = frequency;
  }
 @@ -2577,7 +2577,7 @@ propagate_freq (basic_block head, bitmap tovisit)
   /* BLOCK_INFO (bb)->frequency = frequency
/ (1 - cyclic_probability)
 */

 - cyclic_probability = real_one - cyclic_probability;
 + cyclic_probability = sreal::one () - cyclic_probability;
   BLOCK_INFO (bb)->frequency = frequency /
 cyclic_probability;
  }
  }
 @@ -2591,7 +2591,7 @@ propagate_freq (basic_block head, bitmap tovisit)
   = ((e->probability * BLOCK_INFO (bb)->frequency)
   / REG_BR_PROB_BASE); */

 - sreal tmp (e->probability, 0);
 + sreal tmp = e->probability;
   tmp *= BLOCK_INFO (bb)->frequency;
   EDGE_INFO (e)->back_edge_prob = tmp * real_inv_br_prob_base;
  }
 @@ -2873,13 +2873,11 @@ estimate_bb_frequencies (bool force)
 if (!real_values_initialized)
   {
real_values_initialized = 1;
 - real_zero = sreal (0, 0);
 - real_one = sreal (1, 0);
 - real_br_prob_base = sreal (REG_BR_PROB_BASE, 0);
 - real_bb_freq_max = sreal (BB_FREQ_MAX, 0);
 + real_br_prob_base = REG_BR_PROB_BASE;
 + real_bb_freq_max = BB_FREQ_MAX;
real_one_half = sreal (1, -1);
 - real_inv_br_prob_base = real_one / real_br_prob_base;
 - real_almost_one = real_one - real_inv_br_prob_base;
 + real_inv_br_prob_base = sreal::one () / real_br_prob_base;
 + real_almost_one = sreal::one () - real_inv_br_prob_base;
  }

 mark_dfs_back_edges ();
 @@ -2897,7 +2895,7 @@ estimate_bb_frequencies (bool force)

   FOR_EACH_EDGE (e, ei, bb->succs)
  {
 - EDGE_INFO (e)->back_edge_prob = sreal (e->probability, 0);
 + EDGE_INFO (e)->back_edge_prob = e->probability;
   EDGE_INFO (e)->back_edge_prob *= real_inv_br_prob_base;
  }
  }
 @@ -2906,7 +2904,7 @@ estimate_bb_frequencies (bool force)
to outermost to examine frequencies for back edges.  */
 

Re: [PATCH]Add myself to MAINTAINERS

2014-11-21 Thread Markus Trippelsdorf
On 2014.11.21 at 11:42 +, Richard Earnshaw wrote:
 On 21/11/14 11:16, Renlin Li wrote:
  Hi,
  
  This patch is to add myself into Write After Approval section of 
  MAINTAINERS file.
  
  Is it Okay to commit?
 
 OK

There is no need to ask for permission in this case:
http://gcc.gnu.org/svnwrite.html#authenticated

Once you get a gcc.gnu.org account you could just add yourself.

-- 
Markus


FW: [Aarch64][BE][2/2] Fix vector load/stores to not use ld1/st1

2014-11-21 Thread Alan Hayward

On 20/11/2014 18:13, Marcus Shawcroft marcus.shawcr...@gmail.com wrote:

On 14 November 2014 16:48, Alan Hayward alan.hayw...@arm.com wrote:
This is a new version of my BE patch from a few weeks ago.
This is part 2 and covers all the aarch64 changes.

When combined with the first patch, it fixes up movoi/ci/xi for Big
Endian, so that we end up with the lsb of a big-endian integer in
the low byte of the highest-numbered register.

This patch requires part 1 and David Sherwood’s patch:
  [AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe.

When tested with David’s patch and [1/2] of this patch, no regressions
were seen when testing aarch64 and x86_64 on make check.


Changelog:
2014-11-14  Alan Hayward  alan.hayw...@arm.com

 * config/aarch64/aarch64.c
 (aarch64_classify_address): Allow extra addressing modes for BE.
 (aarch64_print_operand): new operand for printing a q
register+1.


Just a bunch of ChangeLog nits.


+void aarch64_simd_emit_reg_reg_move (rtx *operands, enum machine_mode
mode,
+ unsigned int count);

Drop the formal argument names.


Can you respin with these changes please.

/Marcus


New version. Identical to previous version of the patch except for:
* removal of parameter names in aarch64-protos.h
* new changelog


2014-11-21  Alan Hayward  alan.hayw...@arm.com

PR 57233
PR 59810
* config/aarch64/aarch64.c
(aarch64_classify_address): Allow extra addressing modes for BE.
(aarch64_print_operand): New operand for printing a q register+1.
(aarch64_simd_emit_reg_reg_move): Define.
(aarch64_simd_disambiguate_copy): Remove.
* config/aarch64/aarch64-protos.h
(aarch64_simd_emit_reg_reg_move): Define.
(aarch64_simd_disambiguate_copy): Remove.
* config/aarch64/aarch64-simd.md
(define_split): Use aarch64_simd_emit_reg_reg_move.
(define_expand movmode): Less restrictive predicates.
(define_insn *aarch64_movmode): Simplify and only allow for LE.
(define_insn *aarch64_be_movoi): Define.
(define_insn *aarch64_be_movci): Define.
(define_insn *aarch64_be_movxi): Define.
(define_split): OI mov.  Use aarch64_simd_emit_reg_reg_move.
(define_split): CI mov.  Use aarch64_simd_emit_reg_reg_move.
(define_split): XI mov.  Use aarch64_simd_emit_reg_reg_move.


Alan.




0001-BE-fix-load-stores.-Aarch64-code.-v2.patch
Description: Binary data


Re: [PATCH 3/4][x86] Add clwb,pcommit,avx512avbmi,avx512ifma.

2014-11-21 Thread Uros Bizjak
On Fri, Nov 21, 2014 at 12:45 PM, Ilya Tocar tocarip.in...@gmail.com wrote:
 On 20 Nov 09:43, Uros Bizjak wrote:
 On Wed, Nov 19, 2014 at 6:32 PM, Ilya Tocar tocarip.in...@gmail.com wrote:
  Hi,
 
  New revision of Intel ISA reference [1] has new instructions:
  Clwb, pcommit and new flavors of AVX512. Patch below adds them.
  I understand that stage 1 is closed, however those changes shouldn't
  affect anything outside of the i386 backend and are extremely unlikely to
  break existing functionality, and I personally think it's desirable for
  newest GCC to support newest spec.
  Bootstrapped/regtested on x86_64-unknown-linux-gnu.
  Ok for trunk?

 Please split the patch into patch series, like it was done previously
 for AVX512F patches.

 Uros.

 Done. This part adds clwb.
 Bootstrapped/passes make-check.
 Ok for trunk?

 gcc/

 * common/config/i386/i386-common.c (OPTION_MASK_ISA_CLWB_UNSET,
 OPTION_MASK_ISA_CLWB_SET): New.
 (ix86_handle_option): Handle OPT_mclwb.
 * config.gcc: Add clwbintrin.h.
 * config/i386/clwbintrin.h: New file.
 * config/i386/cpuid.h (bit_CLWB): Define.
 * config/i386/driver-i386.c (host_detect_local_cpu): Detect clwb.
 * config/i386/i386-c.c (ix86_target_macros_internal): Define
 __CLWB__.
 * config/i386/i386.c (ix86_target_string): Add -mclwb.
 (PTA_CLWB): Define.
 (ix86_option_override_internal): Handle new option.
 (ix86_valid_target_attribute_inner_p): Add clwb.
 (ix86_builtins): Add IX86_BUILTIN_CLWB.
 (ix86_init_mmx_sse_builtins): Add __builtin_ia32_clwb.
 (ix86_expand_builtin): Handle IX86_BUILTIN_CLWB.
 * config/i386/i386.h (TARGET_CLWB, TARGET_CLWB_P): Define.
 * config/i386/i386.md (unspecv): Add UNSPECV_CLWB.
 (clwb): New instruction.
 * config/i386/i386.opt: Add mclwb.
 * config/i386/x86intrin.h: Include clwbintrin.h.

 gcc/testsuite/

 * g++.dg/other/i386-2.C: Add -mclwb.
 * g++.dg/other/i386-3.C: Ditto.
 * gcc.target/i386/clwb-1.c: New test.
 * gcc.target/i386/sse-12.c: Add new options.
 * gcc.target/i386/sse-13.c: Ditto.
 * gcc.target/i386/sse-14.c: Ditto.
 * gcc.target/i386/sse-22.c: Ditto.
 * gcc.target/i386/sse-23.c: Ditto.

OK.

Thanks,
Uros.

 ---
  gcc/common/config/i386/i386-common.c   | 15 +++
  gcc/config.gcc |  4 +--
  gcc/config/i386/clwbintrin.h   | 49 
 ++
  gcc/config/i386/cpuid.h|  1 +
  gcc/config/i386/driver-i386.c  |  6 +++--
  gcc/config/i386/i386-c.c   |  2 ++
  gcc/config/i386/i386.c | 23 
  gcc/config/i386/i386.h |  2 ++
  gcc/config/i386/i386.md| 12 +
  gcc/config/i386/i386.opt   |  4 +++
  gcc/config/i386/x86intrin.h|  2 ++
  gcc/testsuite/g++.dg/other/i386-2.C|  2 +-
  gcc/testsuite/g++.dg/other/i386-3.C|  2 +-
  gcc/testsuite/gcc.target/i386/clwb-1.c | 11 
  gcc/testsuite/gcc.target/i386/sse-12.c |  2 +-
  gcc/testsuite/gcc.target/i386/sse-13.c |  2 +-
  gcc/testsuite/gcc.target/i386/sse-14.c |  2 +-
  gcc/testsuite/gcc.target/i386/sse-22.c |  2 +-
  gcc/testsuite/gcc.target/i386/sse-23.c |  2 +-
  19 files changed, 134 insertions(+), 11 deletions(-)
  create mode 100644 gcc/config/i386/clwbintrin.h
  create mode 100644 gcc/testsuite/gcc.target/i386/clwb-1.c

 diff --git a/gcc/common/config/i386/i386-common.c 
 b/gcc/common/config/i386/i386-common.c
 index 1c4f15e..bad0988 100644
 --- a/gcc/common/config/i386/i386-common.c
 +++ b/gcc/common/config/i386/i386-common.c
 @@ -85,6 +85,7 @@ along with GCC; see the file COPYING3.  If not see
(OPTION_MASK_ISA_XSAVES | OPTION_MASK_ISA_XSAVE)
  #define OPTION_MASK_ISA_XSAVEC_SET \
(OPTION_MASK_ISA_XSAVEC | OPTION_MASK_ISA_XSAVE)
 +#define OPTION_MASK_ISA_CLWB_SET OPTION_MASK_ISA_CLWB

  /* SSE4 includes both SSE4.1 and SSE4.2. -msse4 should be the same
 as -msse4.2.  */
 @@ -181,6 +182,7 @@ along with GCC; see the file COPYING3.  If not see
  #define OPTION_MASK_ISA_CLFLUSHOPT_UNSET OPTION_MASK_ISA_CLFLUSHOPT
  #define OPTION_MASK_ISA_XSAVEC_UNSET OPTION_MASK_ISA_XSAVEC
  #define OPTION_MASK_ISA_XSAVES_UNSET OPTION_MASK_ISA_XSAVES
 +#define OPTION_MASK_ISA_CLWB_UNSET OPTION_MASK_ISA_CLWB

  /* SSE4 includes both SSE4.1 and SSE4.2.  -mno-sse4 should the same
 as -mno-sse4.1. */
 @@ -901,6 +903,19 @@ ix86_handle_option (struct gcc_options *opts,
 }
return true;

 +case OPT_mclwb:
 +  if (value)
 +   {
 + opts->x_ix86_isa_flags |= OPTION_MASK_ISA_CLWB_SET;
 + opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_CLWB_SET;
 +   }
 +  else
 +   {
 + opts->x_ix86_isa_flags &= ~OPTION_MASK_ISA_CLWB_UNSET;
 + opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_CLWB_UNSET;
 +   }
 +  

Re: [PATCH] VRP: don't assume strict overflow semantics when checking if a loop wraps

2014-11-21 Thread Richard Biener
On Fri, Nov 21, 2014 at 12:29 PM, Patrick Palka patr...@parcs.ath.cx wrote:
 When adjusting the value range of an induction variable using SCEV, VRP
 calls scev_probably_wraps_p() with use_overflow_semantics=true.  This
 parameter set to true makes scev_probably_wraps_p() assume that signed
 induction variables never wrap, so for these variables it always returns
 false (when strict overflow rules are in effect).  This is wrong because
 if a signed induction variable really does overflow then we want to give
 it an INF(OVF) value range and not the (finite) estimation returned by
 SCEV.

 While this change shouldn't make a difference in code generation, it
 should help improve the coverage of -Wstrict-overflow warnings on
 induction variables like in the test case.

 OK after bootstrap + regtest on x86_64-unknown-linux-gnu?

Hmm, I don't think the change won't affect code-generation.  In fact
we check for overflow ourselves in the most interesting case
(the first block) - only the path adjusting min/max based on the
init value and the max value of the type needs to know whether
overflow may happen and fail or drop to +-INF(OVF).

So I'd rather open-code the relevant cases and not call
scev_probably_wraps_p at all.
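
A rough, hypothetical sketch of what open-coding such a check could look
like, written outside GCC (none of these names are GCC APIs, and a fixed
32-bit signed type with a known iteration bound is assumed): compute the
last value in wider arithmetic and, on overflow, flag the range rather
than trusting the finite estimate.

```cpp
#include <cstdint>
#include <limits>

// Illustrative only: not GCC's value_range_t.
struct Range
{
  std::int32_t min;
  std::int32_t max;
  bool overflowed;  // true if the induction variable may wrap
};

// Range of I in "for (i = init; ...; i += step)" after at most NITER
// increments of a 32-bit signed variable.
Range
iv_range (std::int32_t init, std::int32_t step, std::uint32_t niter)
{
  // 64-bit arithmetic: the product is below 2^63 in magnitude, so the
  // check itself cannot overflow.
  const std::int64_t last
    = std::int64_t (init) + std::int64_t (step) * std::int64_t (niter);
  const std::int64_t lo = std::numeric_limits<std::int32_t>::min ();
  const std::int64_t hi = std::numeric_limits<std::int32_t>::max ();

  if (last < lo || last > hi)
    {
      // Saturate and record the overflow instead of assuming it away.
      if (step > 0)
        return Range { init, std::numeric_limits<std::int32_t>::max (), true };
      return Range { std::numeric_limits<std::int32_t>::min (), init, true };
    }

  if (step >= 0)
    return Range { init, std::int32_t (last), false };
  return Range { std::int32_t (last), init, false };
}
```

The overflowed flag plays the role that INF(OVF) plays in the discussion
above: callers can warn or give up instead of using a bogus finite bound.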

Richard.

 gcc/
 * tree-vrp.c (adjust_range_with_scev): Call
 scev_probably_wraps_p with use_overflow_semantics=false.

 gcc/testsuite/
 * gcc.dg/Wstrict-overflow-27.c: New test.
 ---
  gcc/testsuite/gcc.dg/Wstrict-overflow-27.c | 22 ++
  gcc/tree-vrp.c |  2 +-
  2 files changed, 23 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/gcc.dg/Wstrict-overflow-27.c

 diff --git a/gcc/testsuite/gcc.dg/Wstrict-overflow-27.c 
 b/gcc/testsuite/gcc.dg/Wstrict-overflow-27.c
 new file mode 100644
 index 000..c1f27ab
 --- /dev/null
 +++ b/gcc/testsuite/gcc.dg/Wstrict-overflow-27.c
 @@ -0,0 +1,22 @@
 +/* { dg-do compile } */
 +/* { dg-options "-fstrict-overflow -O2 -Wstrict-overflow" } */
 +
 +/* Warn about an overflow when folding i < 0.  */
 +
 +void bar (unsigned *p);
 +
 +int
 +foo (unsigned *p)
 +{
 +  int i;
 +  int sum = 0;
 +
 +  for (i = 0; i < *p; i++)
 +{
 +  if (i < 0) /* { dg-warning "signed overflow" } */
 +   sum += 2;
 +  bar (p);
 +}
 +
 +  return sum;
 +}
 diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
 index a75138f..bf9ff61 100644
 --- a/gcc/tree-vrp.c
 +++ b/gcc/tree-vrp.c
 @@ -4270,7 +4270,7 @@ adjust_range_with_scev (value_range_t *vr, struct loop 
 *loop,
dir == EV_DIR_UNKNOWN
/* ... or if it may wrap.  */
|| scev_probably_wraps_p (init, step, stmt, get_chrec_loop (chrec),
 -   true))
 +   /*use_overflow_semantics=*/false))
  return;

/* We use TYPE_MIN_VALUE and TYPE_MAX_VALUE here instead of
 --
 2.2.0.rc1.23.gf570943



Re: libsanitizer merge from upstream r221802

2014-11-21 Thread Dmitry Vyukov
On Thu, Nov 13, 2014 at 12:16 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Wed, Nov 12, 2014 at 05:35:48PM -0800, Konstantin Serebryany wrote:
 Here is one more merge of libsanitizer (last one was in Sept).

 Tested on x86_64 Ubuntu 14.04 like this:
 rm -rf */{*/,}libsanitizer && make -j 50
 make -j 40 -C gcc check-g{cc,++}
 RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} asan.exp' && \
 make -j 40 -C gcc check-g{cc,++}
 RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} tsan.exp' && \
 make -j 40 -C gcc check
 RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} ubsan.exp' && \
 echo PASS

 Expected ChangeLog entry:

 2014-11-12  Kostya Serebryany  k...@google.com

 * All source files: Merge from upstream r221802.
 * sanitizer_common/sanitizer_symbolizer_libbacktrace.cc
   (LibbacktraceSymbolizer::SymbolizeData): replace 'address'
   with 'start' to follow the new interface.

 Capital R in Replace.  All lines are indented by single tab, not tab
 and two spaces.

 * asan/Makefile.am (AM_CXXFLAGS): added -std=c++11.

 Capital A in Added.  Also, I wonder if we shouldn't use -std=gnu++11
 instead.  As the sources are compiled by newly built compiler, it should be
 generally fine to use extensions in there.

 * interception/Makefile.am (AM_CXXFLAGS): added -std=c++11.
 * libbacktrace/Makefile.am (AM_CXXFLAGS): added -std=c++11.
 * lsan/Makefile.am (AM_CXXFLAGS): added -std=c++11.
 * sanitizer_common/Makefile.am (sanitizer_common_files): Added new
   files.
   (AM_CXXFLAGS): added -std=c++11.
 * tsan/Makefile.am (AM_CXXFLAGS): added -std=c++11.
 * ubsan/Makefile.am (AM_CXXFLAGS): added -std=c++11.

 Ditto.

 * asan/Makefile.in: Regenerate.
 * interception/Makefile.in: Regenerate.
 * libbacktrace/Makefile.in: Regenerate.
 * lsan/Makefile.in: Regenerate.
 * sanitizer_common/Makefile.in: Regenerate.
 * tsan/Makefile.in: Regenerate.
 * ubsan/Makefile.in: Regenerate.

 Other than that, it looks good to me, I've bootstrapped/regtested
 it on x86_64-linux and i686-linux too.  So, with those changes ok for trunk
 (how do you decide about c++11 vs. gnu++11 I'll leave to you).

 A few questions regarding possible changes on the compiler side:
 1) is __asan_poison_intra_object_redzone/__asan_unpoison_intra_object_redzone
just for the ABI incompatible putting of red zones in between fields
in structures?  How do you handle whole struct copying in that case?
Could it be done without changing ABI for a subset of structs
which have natural padding in them?
 2) regarding the tsan memory layout changes, is it now possible to support
non-pie binaries?  If yes, we should probably remove the:
 %{!pie:%{!shared:%e-fsanitize=thread linking must be done with -pie or 
 -shared}}}\
and add testcases that would test that.

Hi Jakub,

Yes, I think it's the way to go.
I've just committed the following revision to clang that removes -pie
when compiling with tsan:
http://llvm.org/viewvc/llvm-project?view=revision&revision=222526
The tests in llvm tree pass with this change.


Re: [PATCH][x86] Add clwb,pcommit,avx512avbmi,avx512ifma.

2014-11-21 Thread Uros Bizjak
On Fri, Nov 21, 2014 at 12:50 PM, Ilya Tocar tocarip.in...@gmail.com wrote:
 On 20 Nov 09:43, Uros Bizjak wrote:
 On Wed, Nov 19, 2014 at 6:32 PM, Ilya Tocar tocarip.in...@gmail.com wrote:
  Hi,
 
  New revision of Intel ISA reference [1] has new instructions:
  Clwb, pcommit and new flavors of AVX512. Patch below adds them.
  I understand that stage 1 is closed, however those changes shouldn't
  affect anything outside of the i386 backend and are extremely unlikely to
  break existing functionality, and I personally think it's desirable for
  newest GCC to support newest spec.
  Bootstrapped/regtested on x86_64-unknown-linux-gnu.
  Ok for trunk?

 Please split the patch into patch series, like it was done previously
 for AVX512F patches.

 Uros.

  [1]:https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
 

 This part adds pcommit.
 Bootstraps/passes make check.
 Ok for trunk?

 gcc/


 * common/config/i386/i386-common.c (OPTION_MASK_ISA_PCOMMIT_UNSET,
 OPTION_MASK_ISA_PCOMMIT_SET): New.
 (ix86_handle_option): Handle OPT_mpcommit.
 * config.gcc: Add pcommitintrin.h
 * config/i386/pcommitintrin.h: New file.
 * config/i386/cpuid.h (bit_PCOMMIT): Define.
 * config/i386/driver-i386.c (host_detect_local_cpu): Detect pcommit.
 * config/i386/i386-c.c (ix86_target_macros_internal): Define
 __PCOMMIT__.
 * config/i386/i386.c (ix86_target_string): Add -mpcommit.
 (PTA_PCOMMIT): Define.
 (ix86_option_override_internal): Handle new option.
 (ix86_valid_target_attribute_inner_p): Add pcommit.
 (ix86_builtins): Add IX86_BUILTIN_PCOMMIT.
 (bdesc_special_args): Add __builtin_ia32_pcommit.
 * config/i386/i386.h (TARGET_PCOMMIT, TARGET_PCOMMIT_P): Define.
 * config/i386/i386.md (unspecv): Add UNSPECV_PCOMMIT.
 (pcommit): New instruction.
 * config/i386/i386.opt: Add mpcommit.
 * config/i386/x86intrin.h: Include pcommitintrin.h.

OK with a small typo fix below.

Thanks,
Uros.


 ---
  gcc/common/config/i386/i386-common.c  | 15 ++
  gcc/config.gcc|  4 +--
  gcc/config/i386/cpuid.h   |  1 +
  gcc/config/i386/driver-i386.c |  5 +++-
  gcc/config/i386/i386-c.c  |  2 ++
  gcc/config/i386/i386.c| 12 
  gcc/config/i386/i386.h|  2 ++
  gcc/config/i386/i386.md   | 10 +++
  gcc/config/i386/i386.opt  |  4 +++
  gcc/config/i386/pcommitintrin.h   | 49 
 +++
  gcc/config/i386/x86intrin.h   |  2 ++
  gcc/testsuite/g++.dg/other/i386-2.C   |  2 +-
  gcc/testsuite/g++.dg/other/i386-3.C   |  2 +-
  gcc/testsuite/gcc.target/i386/pcommit-1.c | 11 +++
  gcc/testsuite/gcc.target/i386/sse-12.c|  2 +-
  gcc/testsuite/gcc.target/i386/sse-13.c|  2 +-
  gcc/testsuite/gcc.target/i386/sse-14.c|  2 +-
  gcc/testsuite/gcc.target/i386/sse-22.c|  2 +-
  gcc/testsuite/gcc.target/i386/sse-23.c|  2 +-
  19 files changed, 121 insertions(+), 10 deletions(-)
  create mode 100644 gcc/config/i386/pcommitintrin.h
  create mode 100644 gcc/testsuite/gcc.target/i386/pcommit-1.c

 diff --git a/gcc/common/config/i386/i386-common.c 
 b/gcc/common/config/i386/i386-common.c
 index bad0988..2e09d77 100644
 --- a/gcc/common/config/i386/i386-common.c
 +++ b/gcc/common/config/i386/i386-common.c
 @@ -86,6 +86,7 @@ along with GCC; see the file COPYING3.  If not see
  #define OPTION_MASK_ISA_XSAVEC_SET \
(OPTION_MASK_ISA_XSAVEC | OPTION_MASK_ISA_XSAVE)
  #define OPTION_MASK_ISA_CLWB_SET OPTION_MASK_ISA_CLWB
 +#define OPTION_MASK_ISA_PCOMMIT_SET OPTION_MASK_ISA_PCOMMIT

  /* SSE4 includes both SSE4.1 and SSE4.2. -msse4 should be the same
 as -msse4.2.  */
 @@ -182,6 +183,7 @@ along with GCC; see the file COPYING3.  If not see
  #define OPTION_MASK_ISA_CLFLUSHOPT_UNSET OPTION_MASK_ISA_CLFLUSHOPT
  #define OPTION_MASK_ISA_XSAVEC_UNSET OPTION_MASK_ISA_XSAVEC
  #define OPTION_MASK_ISA_XSAVES_UNSET OPTION_MASK_ISA_XSAVES
 +#define OPTION_MASK_ISA_PCOMMIT_UNSET OPTION_MASK_ISA_PCOMMIT
  #define OPTION_MASK_ISA_CLWB_UNSET OPTION_MASK_ISA_CLWB

  /* SSE4 includes both SSE4.1 and SSE4.2.  -mno-sse4 should the same
 @@ -903,6 +905,19 @@ ix86_handle_option (struct gcc_options *opts,
 }
return true;

 +case OPT_mpcommit:
 +  if (value)
 +   {
 + opts->x_ix86_isa_flags |= OPTION_MASK_ISA_PCOMMIT_SET;
 + opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_PCOMMIT_SET;
 +   }
 +  else
 +   {
 + opts->x_ix86_isa_flags &= ~OPTION_MASK_ISA_PCOMMIT_UNSET;
 + opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_PCOMMIT_UNSET;
 +   }
 +  return true;
 +
  case OPT_mclwb:
if (value)
 {
 diff --git a/gcc/config.gcc b/gcc/config.gcc
 index 766f13b..fa3e1fc 100644
 --- 
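
For readers unfamiliar with the option-handling idiom in the hunk above, here is a minimal standalone sketch (hypothetical names, not GCC code): one word tracks which ISA features are enabled, a second tracks which ones the user mentioned explicitly, and both -mfoo and -mno-foo set the explicit bit.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-ins; in GCC the fields live in
// struct gcc_options and the masks are OPTION_MASK_ISA_* macros.
constexpr std::uint64_t MASK_FEATURE = std::uint64_t (1) << 3;

struct Options
{
  std::uint64_t isa_flags = 0;           // features currently enabled
  std::uint64_t isa_flags_explicit = 0;  // features the user named explicitly
};

// Handle "-mfeature" (VALUE true) or "-mno-feature" (VALUE false).
bool
handle_feature_option (Options &opts, bool value)
{
  if (value)
    opts.isa_flags |= MASK_FEATURE;
  else
    opts.isa_flags &= ~MASK_FEATURE;

  // Either way the user made an explicit choice, so record it; later
  // defaulting logic can then avoid overriding it.
  opts.isa_flags_explicit |= MASK_FEATURE;
  return true;
}
```

The separate SET and UNSET masks in the real code exist because enabling or disabling one feature can imply others (as with XSAVEC and XSAVE above); for CLWB and PCOMMIT both masks are just the feature's own bit.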

Re: [PATCH] PR ipa/63909 ICE: SIGSEGV in ipa_icf_gimple::func_checker::compare_bb()

2014-11-21 Thread Richard Biener
On Fri, Nov 21, 2014 at 12:52 PM, Martin Liška mli...@suse.cz wrote:
 On 11/20/2014 05:41 PM, Richard Biener wrote:

 On Thu, Nov 20, 2014 at 5:30 PM, Martin Liška mli...@suse.cz wrote:

 Hello.

 Following patch fixes ICE in IPA ICF. Problem was that number of
 non-debug
 statements in a BB can
 change (for instance by IPA split), so that the number is recomputed.


 Huh, so can it get different for both candidates?  I think the stmt
 compare
 loop should be terminated on gsi_end_p of either iterator and return
 false for any remaining non-debug-stmts on the other.

 Thus, not walk all stmts twice here.


 Hello.

 Sorry for the previous patch, you are right it can be fixed in purer way.
 Please take a look at attached patch.


 As IPA split is run early I don't see how it should affect a real IPA
 pass though?




 Sorry for non precise information, the problematic BB is changed here:
 #0  gsi_split_seq_before (i=0x7fffd550, pnew_seq=0x7fffd528) at
 ../../gcc/gimple-iterator.c:429
 #1  0x00b95a2a in gimple_split_block (bb=0x76c41548, stmt=0x0)
 at ../../gcc/tree-cfg.c:5707
 #2  0x007563cf in split_block (bb=0x76c41548, i=i@entry=0x0) at
 ../../gcc/cfghooks.c:508
 #3  0x00756b44 in split_block_after_labels (bb=<optimized out>) at
 ../../gcc/cfghooks.c:549
 #4  make_forwarder_block (bb=<optimized out>,
 redirect_edge_p=redirect_edge_p@entry=0x75d4e0 <mfb_keep_just(edge_def*)>,
 new_bb_cbk=new_bb_cbk@entry=0x0) at ../../gcc/cfghooks.c:842
 #5  0x0076085a in create_preheader (loop=0x76d56948,
 flags=<optimized out>) at ../../gcc/cfgloopmanip.c:1563
 #6  0x00760aea in create_preheaders (flags=1) at
 ../../gcc/cfgloopmanip.c:1613
 #7  0x009bc6b0 in apply_loop_flags (flags=15) at
 ../../gcc/loop-init.c:75
 #8  0x009bc7d3 in loop_optimizer_init (flags=15) at
 ../../gcc/loop-init.c:136
 #9  0x00957914 in estimate_function_body_sizes (node=0x76c47620,
 early=false) at ../../gcc/ipa-inline-analysis.c:2480
 #10 0x0095948b in compute_inline_parameters (node=0x76c47620,
 early=false) at ../../gcc/ipa-inline-analysis.c:2907
 #11 0x0095bd88 in inline_analyze_function (node=0x76c47620) at
 ../../gcc/ipa-inline-analysis.c:3994
 #12 0x0095bed3 in inline_generate_summary () at
 ../../gcc/ipa-inline-analysis.c:4045
 #13 0x00a70b71 in execute_ipa_summary_passes (ipa_pass=0x1dcb9e0) at

So inline_summary is generated after IPA-ICF does its job?

But the bug is obviously that an IPA analysis phase does a code transform
(here it initializes loops without AVOID_CFG_MANIPULATIONS).
Honza - if that is really needed then I think we should make sure
loops are initialized at the start of the IPA analysis phase, not randomly
in between.

Thanks,
Richard.

 ../../gcc/passes.c:2137
 #14 0x00777a15 in ipa_passes () at ../../gcc/cgraphunit.c:2074
 #15 symbol_table::compile (this=this@entry=0x76c3a000) at
 ../../gcc/cgraphunit.c:2187
 #16 0x00778bcd in symbol_table::finalize_compilation_unit
 (this=0x76c3a000) at ../../gcc/cgraphunit.c:2340
 #17 0x006580ee in c_write_global_declarations () at
 ../../gcc/c/c-decl.c:10777
 #18 0x00b5bb8b in compile_file () at ../../gcc/toplev.c:584
 #19 0x00b5def1 in do_compile () at ../../gcc/toplev.c:2041
 #20 0x00b5e0fa in toplev::main (this=0x7fffdc9f, argc=20,
 argv=0x7fffdd98) at ../../gcc/toplev.c:2138
 #21 0x0063f1d9 in main (argc=20, argv=0x7fffdd98) at
 ../../gcc/main.c:38


 Patch can bootstrap on x86_64-linux-pc and no regression has been seen.
 Ready for trunk?


 Thanks,
 Martin


 Thanks,
 Richard.

 Patch can bootstrap on x86_64-linux-pc and no regression has been seen.
 Ready for trunk?

 Thanks,
 Martin




Re: [PATCH 2/4][x86] Add clwb,pcommit,avx512avbmi,avx512ifma.

2014-11-21 Thread Uros Bizjak
On Fri, Nov 21, 2014 at 12:38 PM, Ilya Tocar tocarip.in...@gmail.com wrote:
 On 20 Nov 09:43, Uros Bizjak wrote:
 On Wed, Nov 19, 2014 at 6:32 PM, Ilya Tocar tocarip.in...@gmail.com wrote:
  Hi,
 
  New revision of Intel ISA reference [1] has new instructions:
  Clwb, pcommit and new flavors of AVX512. Patch below adds them.
  I understand that stage 1 is closed, however those changes shouldn't
  affect anything outside of the i386 backend and are extremely unlikely to
  break existing functionality, and I personally think it's desirable for
  newest GCC to support newest spec.
  Bootstrapped/regtested on x86_64-unknown-linux-gnu.
  Ok for trunk?

 Please split the patch into patch series, like it was done previously
 for AVX512F patches.

 Uros.

 This part adds avx512vbmi.
 I'll send vpermi2b autogen patch together with v64qi const perm later.
 Bootstraps/passes make check.
 Ok for trunk?


 gcc/
 * common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512VBMI_SET

Please remove  in the above line.

 OPTION_MASK_ISA_AVX512VBMI_UNSET): New.
 (ix86_handle_option): Handle OPT_mavx512vbmi.
 * config.gcc: Add avx512vbmiintrin.h, avx512vbmivlintrin.h.
 * config/i386/avx512vbmiintrin.h: New file.
 * config/i386/avx512vbmivlintrin.h: Ditto.
 * config/i386/cpuid.h (bit_AVX512VBMI): New.
 * config/i386/driver-i386.c (host_detect_local_cpu): Detect 
 avx512vbmi.
 * config/i386/i386-c.c (ix86_target_macros_internal): Define
 __AVX512VBMI__.
 * config/i386/i386.c (ix86_target_string): Add -mavx512vbmi.
 (PTA_AVX512VBMI): Define.
 (ix86_option_override_internal): Handle new options.
 (ix86_valid_target_attribute_inner_p): Add avx512vbmi,
 (ix86_builtins): Add IX86_BUILTIN_VPMULTISHIFTQB512,
 IX86_BUILTIN_VPMULTISHIFTQB256, IX86_BUILTIN_VPMULTISHIFTQB128,
 IX86_BUILTIN_VPERMVARQI512_MASK, IX86_BUILTIN_VPERMT2VARQI512,
 IX86_BUILTIN_VPERMT2VARQI512_MASKZ, IX86_BUILTIN_VPERMI2VARQI512,
 IX86_BUILTIN_VPERMVARQI256_MASK, IX86_BUILTIN_VPERMVARQI128_MASK,
 IX86_BUILTIN_VPERMT2VARQI256, IX86_BUILTIN_VPERMT2VARQI256_MASKZ,
 IX86_BUILTIN_VPERMT2VARQI128, IX86_BUILTIN_VPERMI2VARQI256,
 IX86_BUILTIN_VPERMI2VARQI128.
 (bdesc_special_args): Add __builtin_ia32_vpmultishiftqb512_mask,
 __builtin_ia32_vpmultishiftqb256_mask,
 __builtin_ia32_vpmultishiftqb128_mask,
 __builtin_ia32_permvarqi512_mask, __builtin_ia32_vpermt2varqi512_mask,
 __builtin_ia32_vpermt2varqi512_maskz,
 __builtin_ia32_vpermi2varqi512_mask, __builtin_ia32_permvarqi256_mask,
 __builtin_ia32_permvarqi128_mask, __builtin_ia32_vpermt2varqi256_mask,
 __builtin_ia32_vpermt2varqi256_maskz,
 __builtin_ia32_vpermt2varqi128_mask,
 __builtin_ia32_vpermt2varqi128_maskz,
 __builtin_ia32_vpermi2varqi256_mask,
 __builtin_ia32_vpermi2varqi128_mask.
 (ix86_hard_regno_mode_ok): Allow big masks for AVX512VBMI.
 * config/i386/i386.h (TARGET_AVX512VBMI, TARGET_AVX512VBMI_P): Define.
 * config/i386/i386.opt: Add mavx512vbmi.
 * config/i386/immintrin.h: Include avx512vbmiintrin.h,
 avx512vbmivlintrin.h.
 * config/i386/sse.md (unspec): Add UNSPEC_VPMULTISHIFT.
 (VI1_AVX512VL): New iterator.
 (avx512_permvarmodemask_name): Use it.
 (avx512_vpermi2varmode3_maskz): Ditto.
 (avx512_vpermi2varmode3sd_maskz_name): Ditto.
 (avx512_vpermi2varmode3_mask): Ditto.
 (avx512_vpermt2varmode3_maskz): Ditto.
 (avx512_vpermt2varmode3sd_maskz_name): Ditto.
 (avx512_vpermt2varmode3_mask): Ditto.
 (vpmultishiftqbmodemask_name): Ditto.

 gcc/testsuite/

 * g++.dg/other/i386-2.C: Add -mavx512vbmi.
 * g++.dg/other/i386-3.C: Ditto.
 * gcc.target/i386/avx512f-helper.h: Add avx512vbmi-check.h.
 * gcc.target/i386/avx512vbmi-check.h: Ditto.
 * gcc.target/i386/avx512vbmi-vpermb-1.c: Ditto.
 * gcc.target/i386/avx512vbmi-vpermb-2.c: Ditto.
 * gcc.target/i386/avx512vbmi-vpermi2b-1.c: Ditto.
 * gcc.target/i386/avx512vbmi-vpermi2b-2.c: Ditto.
 * gcc.target/i386/avx512vbmi-vpermt2b-1.c: Ditto.
 * gcc.target/i386/avx512vbmi-vpermt2b-2.c: Ditto.
 * gcc.target/i386/avx512vbmi-vpmultishiftqb-1.c: Ditto.
 * gcc.target/i386/avx512vbmi-vpmultishiftqb-2.c: Ditto.
 * gcc.target/i386/avx512vl-vpermb-2.c: Ditto.
 * gcc.target/i386/avx512vl-vpermi2b-2.c: Ditto.
 * gcc.target/i386/avx512vl-vpermt2b-2.c: Ditto.
 * gcc.target/i386/avx512vl-vpmaddhuq-2.c: Ditto.
 * gcc.target/i386/avx512vl-vpmaddluq-2.c: Ditto.
 * gcc.target/i386/avx512vl-vpmultishiftqb-2.c: Ditto.
 * gcc.target/i386/i386.exp (check_effective_target_avx512vbmi): New.
 * gcc.target/i386/sse-12.c: Add new options.
 * 

Re: [PATCH 1/4][x86] Add clwb,pcommit,avx512avbmi,avx512ifma.

2014-11-21 Thread Uros Bizjak
On Fri, Nov 21, 2014 at 12:21 PM, Ilya Tocar tocarip.in...@gmail.com wrote:
 On 20 Nov 09:43, Uros Bizjak wrote:
 On Wed, Nov 19, 2014 at 6:32 PM, Ilya Tocar tocarip.in...@gmail.com wrote:
  Hi,
 
  New revision of Intel ISA reference [1] has new instructions:
  Clwb, pcommit and new flavors of AVX512. Patch below adds them.
  I understand that stage 1 is closed, however those changes shouldn't
  affect anything outside of the i386 backend and are extremely unlikely to
  break existing functionality, and I personally think it's desirable for
  newest GCC to support newest spec.
  Bootstrapped/regtested on x86_64-unknown-linux-gnu.
  Ok for trunk?

 Please split the patch into patch series, like it was done previously
 for AVX512F patches.

 Uros.


 This part adds avx512ifma.
 Bootstraps/passes make check.

 gcc/

 * common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512IFMA_SET,
 , OPTION_MASK_ISA_AVX512IFMA_UNSET): New.
 (ix86_handle_option): Handle OPT_mavx512ifma.
 * config.gcc: Add avx512ifmaintrin.h, avx512ifmavlintrin.h.
 * config/i386/avx512ifmaintrin.h: New file.
 * config/i386/avx512ifmaivlntrin.h: Ditto.
 * config/i386/cpuid.h (bit_AVX512IFMA): New.
 * config/i386/driver-i386.c (host_detect_local_cpu): Detect
 avx512ifma.
 * config/i386/i386-c.c (ix86_target_macros_internal): Define
 __AVX512IFMA__.
 * config/i386/i386.c (ix86_target_string): Add -mavx512ifma.
 (PTA_AVX512IFMA): Define.
 (ix86_option_override_internal): Handle new options.
 (ix86_valid_target_attribute_inner_p): Add avx512ifma.
 (ix86_builtins): Add IX86_BUILTIN_VPMADD52LUQ512,
 IX86_BUILTIN_VPMADD52HUQ512, IX86_BUILTIN_VPMADD52LUQ256,
 IX86_BUILTIN_VPMADD52HUQ256, IX86_BUILTIN_VPMADD52LUQ128,
 IX86_BUILTIN_VPMADD52HUQ128, IX86_BUILTIN_VPMADD52LUQ512_MASKZ,
 IX86_BUILTIN_VPMADD52HUQ512_MASKZ, IX86_BUILTIN_VPMADD52LUQ256_MASKZ,
 IX86_BUILTIN_VPMADD52HUQ256_MASKZ, IX86_BUILTIN_VPMADD52LUQ128_MASKZ,
 IX86_BUILTIN_VPMADD52HUQ128_MASKZ.
 (bdesc_special_args): Add __builtin_ia32_vpmadd52luq512_mask,
 __builtin_ia32_vpmadd52luq512_maskz,
 __builtin_ia32_vpmadd52huq512_mask,
 __builtin_ia32_vpmadd52huq512_maskx,
 __builtin_ia32_vpmadd52luq256_mask,
 __builtin_ia32_vpmadd52luq256_maskz,
 __builtin_ia32_vpmadd52huq256_mask,
 __builtin_ia32_vpmadd52huq256_maskz,
 __builtin_ia32_vpmadd52luq128_mask,
 __builtin_ia32_vpmadd52luq128_maskz,
 __builtin_ia32_vpmadd52huq128_mask,
 __builtin_ia32_vpmadd52huq128_maskz,
 * config/i386/i386.h (TARGET_AVX512IFMA, TARGET_AVX512IFMA_P): Define.
 * config/i386/i386.opt: Add mavx512ifma.
 * config/i386/immintrin.h: Include avx512ifmaintrin.h,
 avx512ifmavlintrin.h.
 * config/i386/sse.md (unspec): Add UNSPEC_VPMADD52LUQ,
 UNSPEC_VPMADD52HUQ.
 (VPMADD52): New iterator.
 (vpmadd52type): New attribute.
 (vpamdd52huq<mode>_maskz): New.
 (vpamdd52luq<mode>_maskz): Ditto.
 (vpamdd52<vpmadd52type><mode><sd_maskz_name>): Ditto.
 (vpamdd52<vpmadd52type><mode>_mask): Ditto.


 gcc/testsuite/

 * g++.dg/other/i386-2.C: Add -mavx512ifma.
 * g++.dg/other/i386-3.C: Ditto.
 * gcc.target/i386/avx512f-helper.h: Add avx512ifma-check.h.
 * gcc.target/i386/avx512ifma-check.h: New.
 * gcc.target/i386/avx512ifma-vpmaddhuq-1.c: Ditto.
 * gcc.target/i386/avx512ifma-vpmaddhuq-2.c: Ditto.
 * gcc.target/i386/avx512ifma-vpmaddluq-1.c: Ditto.
 * gcc.target/i386/avx512ifma-vpmaddluq-2.c: Ditto.
 * gcc.target/i386/avx512vl-vpmaddhuq-2.c: Ditto.
 * gcc.target/i386/avx512vl-vpmaddluq-2.c: Ditto.
 * gcc.target/i386/i386.exp (check_effective_target_avx512ifma): New.
 * gcc.target/i386/sse-12.c: Add new options.
 * gcc.target/i386/sse-13.c: Ditto.
 * gcc.target/i386/sse-14.c: Ditto.
 * gcc.target/i386/sse-22.c: Ditto.
 * gcc.target/i386/sse-23.c: Ditto.

As discussed some time ago with Kirill, scan strings in the testsuite
need %xmm\[0-9\]+ (please note + at the end, at least one number
should be present), but this will be mass-fixed in the near future.
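
To illustrate why the quantifier matters, here is a minimal sketch using POSIX extended regular expressions. It is only an analogy: the testsuite uses Tcl regexps through DejaGnu, and the helper name scan_matches and the sample operand strings are assumptions made for this example.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if the POSIX extended regex PATTERN matches anywhere in TEXT.
   Hypothetical helper, not part of the GCC testsuite.  */
bool scan_matches(const char *pattern, const char *text)
{
    regex_t re;
    bool matched;

    assert(pattern != NULL && text != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;   /* treat an invalid pattern as "no match" */
    matched = regexec(&re, text, 0, NULL, 0) == 0;
    regfree(&re);
    return matched;
}
```

With this helper, "%xmm[0-9]," does not match "%xmm15, %xmm14" because only one digit is allowed before the comma, while "%xmm[0-9]+," does. A "*" quantifier would also accept "%xmm," with no register number at all, which is what the "+" rules out.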

OK for mainline.

Thanks,
Uros.

 ---
  gcc/common/config/i386/i386-common.c   |  16 ++
  gcc/config.gcc |   6 +-
  gcc/config/i386/avx512ifmaintrin.h | 104 +
  gcc/config/i386/avx512ifmavlintrin.h   | 164 
 +
  gcc/config/i386/cpuid.h|   1 +
  gcc/config/i386/driver-i386.c  |   5 +-
  gcc/config/i386/i386-c.c   |   2 +
  gcc/config/i386/i386.c |  35 +
  gcc/config/i386/i386.h |   2 

[AArch64, Obvious] Fix formatting of SHLL and friends

2014-11-21 Thread James Greenhalgh
Hi,

I spotted in an assembly dump that the SHLL, SHLL2, SADDL, and SSUBL
instructions appear out of line, as they are missing a tab between their
mnemonic and their operands.

I've committed (revision 217917) the attached as the obvious fix to this.

Tested with a build-test and a run of aarch64.exp/simd.exp for
aarch64-none-elf with no issues.

Cheers,
James

---

2014-11-21  James Greenhalgh  james.greenha...@arm.com

* config/aarch64/aarch64-simd.md
(aarch64_<ANY_EXTEND:su><ADDSUB:optab>l<mode>): Add a tab between
output mnemonic and operands.
(aarch64_simd_vec_unpack<su>_lo_<mode>): Likewise.
(aarch64_simd_vec_unpack<su>_hi_<mode>): Likewise.
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 23345b1df1ebb28075edd2effd5f327749abd61d..926eb765e1bdc84f3f7873dbcd4030c4e2ea62a7 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1175,7 +1175,7 @@ (define_insn aarch64_simd_vec_unpacksu
 			   (match_operand:VQW 2 vect_par_cnst_lo_half )
 			)))]
   TARGET_SIMD
-  "<su>shll %0.<Vwtype>, %1.<Vhalftype>, 0"
+  "<su>shll\t%0.<Vwtype>, %1.<Vhalftype>, 0"
   [(set_attr type neon_shift_imm_long)]
 )
 
@@ -1186,7 +1186,7 @@ (define_insn aarch64_simd_vec_unpacksu
 			   (match_operand:VQW 2 vect_par_cnst_hi_half )
 			)))]
   TARGET_SIMD
-  "<su>shll2 %0.<Vwtype>, %1.<Vtype>, 0"
+  "<su>shll2\t%0.<Vwtype>, %1.<Vtype>, 0"
   [(set_attr type neon_shift_imm_long)]
 )
 
@@ -2601,7 +2601,7 @@ (define_insn aarch64_ANY_EXTEND:suAD
 		   (ANY_EXTEND:VWIDE
 			   (match_operand:VDW 2 register_operand w]
   TARGET_SIMD
-  "<ANY_EXTEND:su><ADDSUB:optab>l %0.<Vwtype>, %1.<Vtype>, %2.<Vtype>"
+  "<ANY_EXTEND:su><ADDSUB:optab>l\t%0.<Vwtype>, %1.<Vtype>, %2.<Vtype>"
   [(set_attr type neon_ADDSUB:optab_long)]
 )
 

[PATCH, committed] Add fgcse-sm test with scan-rtl-dump

2014-11-21 Thread Tom de Vries

Hi,

this patch adds a fgcse-sm test with a scan-rtl-dump directive.

The other fgcse-sm tests:
...
./gcc/testsuite/gcc.dg/pr45352-3.c
./gcc/testsuite/gcc.dg/torture/pr24257.c
./gcc/testsuite/gcc.target/i386/movsi-sm-1.c
./gcc/testsuite/g++.dg/opt/pr36185.C
...
do not check whether fgcse-sm actually does something.
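
For readers unfamiliar with the transformation, the following is a hand-written sketch of the effect being tested: the load of *sum is hoisted before the loop and the store is sunk after it. The function names are made up for this illustration, and the second function is not output from GCC, only a source-level picture of the idea.

```c
#include <assert.h>

/* Loop as written: *sum is loaded and stored on every iteration.  */
void sum_in_loop(const unsigned *restrict a, unsigned *restrict sum, unsigned n)
{
    assert(a != 0 && sum != 0);
    for (unsigned i = 0; i < n; ++i)
        *sum += a[i];
}

/* Hand-applied equivalent: one load before the loop, one store after it.
   The restrict qualifiers are what make this rewrite valid.  */
void sum_hoisted(const unsigned *restrict a, unsigned *restrict sum, unsigned n)
{
    assert(a != 0 && sum != 0);
    unsigned acc = *sum;
    for (unsigned i = 0; i < n; ++i)
        acc += a[i];
    *sum = acc;
}
```

Both functions leave the same value in *sum; the difference is only in how many memory accesses the loop body performs.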

Committed as trivial.

Thanks,
- Tom
2014-11-21  Tom de Vries  t...@codesourcery.com

	* gcc.dg/store-motion-fgcse-sm.c: New test.
---
 gcc/testsuite/gcc.dg/store-motion-fgcse-sm.c | 32 
 1 file changed, 32 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/store-motion-fgcse-sm.c

diff --git a/gcc/testsuite/gcc.dg/store-motion-fgcse-sm.c b/gcc/testsuite/gcc.dg/store-motion-fgcse-sm.c
new file mode 100644
index 000..b331a24
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/store-motion-fgcse-sm.c
@@ -0,0 +1,32 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -ftree-pre -fno-tree-loop-im -fgcse-sm -fdump-rtl-store_motion" } */
+
+/* tree-pre moves the *sum load out of the loop.  ftree-loop-im moves the *sum
+   store out of the loop, so we disable it, to allow fgcse-sm to do it
+   instead.  */
+
+#include <stdlib.h>
+
+void __attribute__((noinline))
+f (unsigned int *__restrict__ a, unsigned int *__restrict__ sum, unsigned int n)
+{
+  unsigned int i;
+  for (i = 0; i < n; ++i)
+    *sum += a[i];
+}
+
+int
+main ()
+{
+  unsigned int a[] = { 1, 10, 100 };
+  unsigned sum = 1000;
+
+  f (a, &sum, 3);
+  if (sum != 1111)
+    abort ();
+
+  return 0;
+}
+
+/* Check that -fgcse-sm did something for f.  */
+/* { dg-final { scan-rtl-dump "STORE_MOTION of f, .* basic blocks, 1 insns deleted, 1 insns created" "store_motion" } } */
-- 
1.9.1



Re: [PATCH 1/2] PR debug/38757 gcc does not emit DW_LANG_C99.

2014-11-21 Thread Mark Wielaard
On Fri, 2014-11-21 at 12:48 +0100, Richard Biener wrote:
 On Fri, Nov 21, 2014 at 8:56 AM, Jakub Jelinek ja...@redhat.com wrote:
  On Thu, Nov 20, 2014 at 11:30:11PM +0100, Mark Wielaard wrote:
  --- a/gcc/config/avr/avr-c.c
  +++ b/gcc/config/avr/avr-c.c
  @@ -386,7 +386,8 @@ avr_cpu_cpp_builtins (struct cpp_reader *pfile)
(as mentioned in ISO/IEC DTR 18037; Annex F.2) which is not
implemented in GCC up to now.  */
 
  -  if (!strcmp (lang_hooks.name, "GNU C"))
  +  if (strncmp (lang_hooks.name, "GNU C", 5) == 0
  +      && strncmp (lang_hooks.name, "GNU C++", 7) != 0)
 
  I wonder if the tests for C language shouldn't be better done
  as (strncmp (lang_hooks.name, "GNU C", 5) == 0
      && strchr ("0123456789", lang_hooks.name[5]) != NULL)
  or (strncmp (lang_hooks.name, "GNU C", 5) == 0
      && (ISDIGIT (lang_hooks.name[5]) || lang_hooks.name[5] == '\0'))
  to make it explicit what we are looking for, not what we aren't.
 
 Or even make that a helper function in langhooks.[ch]
 
   lang_GNU_C (), lang_GNU_CXX ()

Nice idea. I added those. It also fixes the formatting issues and makes
the diff smaller.
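
A standalone sketch of what such helpers could look like, following the checks suggested above. The functions here take the name as a parameter instead of reading lang_hooks.name, and the names name_is_gnu_c and name_is_gnu_cxx are assumptions for this example, not the functions from the patch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* True for "GNU C" optionally followed by a standard version such as
   "GNU C99", but false for "GNU C++..." names.  NAME stands in for
   lang_hooks.name.  */
bool name_is_gnu_c(const char *name)
{
    assert(name != NULL);
    /* strncmp fails before name[5] is read if NAME is shorter than 5.  */
    return strncmp(name, "GNU C", 5) == 0
           && (name[5] == '\0' || isdigit((unsigned char)name[5]));
}

/* True for any name starting with "GNU C++", with or without a version.  */
bool name_is_gnu_cxx(const char *name)
{
    assert(name != NULL);
    return strncmp(name, "GNU C++", 7) == 0;
}
```

Stating what the name must look like, and not what it must differ from, keeps the C test from accidentally accepting future front-end names that merely start with "GNU C".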

  --- a/gcc/langhooks.h
  +++ b/gcc/langhooks.h
  @@ -261,7 +261,8 @@ struct lang_hooks_for_lto
 
   struct lang_hooks
   {
  -  /* String identifying the front end.  e.g. "GNU C++".  */
  +  /* String identifying the front end.  e.g. "GNU C++".
  +     Might include language version being used.  */
 
  As we no longer have "GNU C++" as any name, using it as an example
  is weird.  So,
    /* String identifying the front end and optionally language standard
       version, e.g. "GNU C++98" or "GNU Java".  */
  ?

Used Jakub's example text.

OK to push?

Thanks,

Mark

PR debug/38757 gcc does not emit DW_LANG_C99.

For C and C++ add the language standard version in use to lang_hooks.name.
Change users of lang_hook.name to check with new functions lang_GNU_C or
lang_GNU_CXX. In dwarf2out.c output the DW_LANG_C version from the
lang_hooks.name and merge any LTO TRANSLATION_UNIT_LANGUAGE found. Adds
two testcases to dwarf2.exp to check the right DWARF DW_AT_language is set
on the compile_unit depending on the -std=c89 or -std=c99 setting.
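
As a rough illustration of the merging and mapping described above, here is a minimal sketch. The ranking (C++ above C99 above C89), the function names, and the -1 result for names the sketch does not handle are assumptions for this example; only the DW_LANG_* numeric values come from the DWARF specification.

```c
#include <assert.h>
#include <string.h>

/* Language codes as defined by the DWARF specification.  */
enum {
    DW_LANG_C89 = 0x0001,
    DW_LANG_C_plus_plus = 0x0004,
    DW_LANG_C99 = 0x000c
};

/* Rank used only by this sketch: any C++ > C99 > C89 > anything else.  */
static int language_rank(const char *name)
{
    if (strncmp(name, "GNU C++", 7) == 0)
        return 3;
    if (strcmp(name, "GNU C99") == 0)
        return 2;
    if (strcmp(name, "GNU C89") == 0)
        return 1;
    return 0;
}

/* Pick the "highest" of two translation-unit language names.  */
const char *highest_language(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return language_rank(a) >= language_rank(b) ? a : b;
}

/* Map a language name to a DW_LANG_* code, or -1 if not handled here.  */
int dwarf_language_for(const char *name)
{
    assert(name != NULL);
    switch (language_rank(name)) {
    case 3:  return DW_LANG_C_plus_plus;
    case 2:  return DW_LANG_C99;
    case 1:  return DW_LANG_C89;
    default: return -1;
    }
}
```

For example, merging "GNU C89" with "GNU C99" yields "GNU C99", which the sketch maps to DW_LANG_C99.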

gcc/c-family/ChangeLog

PR debug/38757
* c-opts.c (set_std_c89): Set lang_hooks.name.
(set_std_c99): Likewise.
(set_std_c11): Likewise.
(set_std_cxx98): Likewise.
(set_std_cxx11): Likewise.
(set_std_cxx14): Likewise.
(set_std_cxx1z): Likewise.

gcc/ChangeLog

PR debug/38757
* config/avr/avr-c.c (avr_cpu_cpp_builtins): Use lang_GNU_C.
* config/darwin.c (darwin_file_end): Use lang_GNU_CXX.
(darwin_override_options): Likewise.
* config/ia64/ia64.c (ia64_struct_retval_addr_is_first_parm_p):
Likewise.
* config/rs6000/rs6000.c (rs6000_output_function_epilogue):
Likewise.
* dbxout.c (get_lang_number): Likewise.
(dbxout_type): Likewise.
(dbxout_symbol_location): Likewise.
* dwarf2out.c (add_prototyped_attribute): Add DW_AT_prototype
also for DW_LANG_{C,C99,ObjC}.
(highest_c_language): New function.
(gen_compile_unit_die): Call highest_c_language to merge LTO
TRANSLATION_UNIT_LANGUAGE. Use strncmp language_string to
determine if DW_LANG_C99 or DW_LANG_C89 should be returned.
* fold-const.c (fold_cond_expr_with_comparison): Use lang_GNU_CXX.
* langhooks.h (struct lang_hooks): Add version comment to name.
(lang_GNU_C): New function declaration.
(lang_GNU_CXX): Likewise.
* langhooks.c (lang_GNU_C): New function.
(lang_GNU_CXX): Likewise.
* vmsdbgout.c (vmsdbgout_init): Use lang_GNU_C and lang_GNU_CXX.

gcc/testsuite/ChangeLog

PR debug/38757
* gcc.dg/debug/dwarf2/lang-c89.c: New test.
* gcc.dg/debug/dwarf2/lang-c99.c: Likewise.

diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 000fdd2..08a36f0 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -1450,6 +1450,7 @@ set_std_c89 (int c94, int iso)
   flag_isoc94 = c94;
   flag_isoc99 = 0;
   flag_isoc11 = 0;
+  lang_hooks.name = "GNU C89";
 }
 
 /* Set the C 99 standard (without GNU extensions if ISO).  */
@@ -1463,6 +1464,7 @@ set_std_c99 (int iso)
   flag_isoc11 = 0;
   flag_isoc99 = 1;
   flag_isoc94 = 1;
+  lang_hooks.name = "GNU C99";
 }
 
 /* Set the C 11 standard (without GNU extensions if ISO).  */
@@ -1476,6 +1478,7 @@ set_std_c11 (int iso)
   flag_isoc11 = 1;
   flag_isoc99 = 1;
   flag_isoc94 = 1;
+  lang_hooks.name = "GNU C11";
 }
 
 /* Set the C++ 98 standard (without GNU extensions if ISO).  */
@@ -1487,6 +1490,7 @@ set_std_cxx98 (int iso)
   flag_no_nonansi_builtin = iso;
   flag_iso = iso;
   cxx_dialect = cxx98;
+  lang_hooks.name = "GNU C++98";
 }
 
 /* Set the C++ 2011 standard (without GNU extensions if ISO).  */
@@ -1501,6 +1505,7 @@ set_std_cxx11 (int iso)
   

Re: [PATCH 1/2] PR debug/38757 gcc does not emit DW_LANG_C99.

2014-11-21 Thread Jakub Jelinek
On Fri, Nov 21, 2014 at 02:01:55PM +0100, Mark Wielaard wrote:
 gcc/c-family/ChangeLog
 
   PR debug/38757
   * c-opts.c (set_std_c89): Set lang_hooks.name.
   (set_std_c99): Likewise.
   (set_std_c11): Likewise.
   (set_std_cxx98): Likewise.
   (set_std_cxx11): Likewise.
   (set_std_cxx14): Likewise.
   (set_std_cxx1z): Likewise.
 
 gcc/ChangeLog
 
   PR debug/38757
   * config/avr/avr-c.c (avr_cpu_cpp_builtins): Use lang_GNU_C.
   * config/darwin.c (darwin_file_end): Use lang_GNU_CXX.
   (darwin_override_options): Likewise.
   * config/ia64/ia64.c (ia64_struct_retval_addr_is_first_parm_p):
   Likewise.
   * config/rs6000/rs6000.c (rs6000_output_function_epilogue):
   Likewise.
   * dbxout.c (get_lang_number): Likewise.
   (dbxout_type): Likewise.
   (dbxout_symbol_location): Likewise.
   * dwarf2out.c (add_prototyped_attribute): Add DW_AT_prototype
   also for DW_LANG_{C,C99,ObjC}.
   (highest_c_language): New function.
   (gen_compile_unit_die): Call highest_c_language to merge LTO
   TRANSLATION_UNIT_LANGUAGE. Use strncmp language_string to
   determine if DW_LANG_C99 or DW_LANG_C89 should be returned.
   * fold-const.c (fold_cond_expr_with_comparison): Use lang_GNU_CXX.
   * langhooks.h (struct lang_hooks): Add version comment to name.
   (lang_GNU_C): New function declaration.
   (lang_GNU_CXX): Likewise.
   * langhooks.c (lang_GNU_C): New function.
   (lang_GNU_CXX): Likewise.
   * vmsdbgout.c (vmsdbgout_init): Use lang_GNU_C and lang_GNU_CXX.
 
 gcc/testsuite/ChangeLog
 
   PR debug/38757
   * gcc.dg/debug/dwarf2/lang-c89.c: New test.
   * gcc.dg/debug/dwarf2/lang-c99.c: Likewise.

Ok, thanks.

Jakub


Re: FW: [Aarch64][BE][2/2] Fix vector load/stores to not use ld1/st1

2014-11-21 Thread Marcus Shawcroft
On 21 November 2014 12:11, Alan Hayward alan.hayw...@arm.com wrote:

 2014-11-21  Alan Hayward  alan.hayw...@arm.com

 PR 57233
 PR 59810
 * config/aarch64/aarch64.c
 (aarch64_classify_address): Allow extra addressing modes for BE.
 (aarch64_print_operand): New operand for printing a q register+1.
 (aarch64_simd_emit_reg_reg_move): Define.
 (aarch64_simd_disambiguate_copy): Remove.
 * config/aarch64/aarch64-protos.h
 (aarch64_simd_emit_reg_reg_move): Define.
 (aarch64_simd_disambiguate_copy): Remove.
 * config/aarch64/aarch64-simd.md
 (define_split): Use aarch64_simd_emit_reg_reg_move.
 (define_expand mov<mode>): Less restrictive predicates.
 (define_insn *aarch64_mov<mode>): Simplify and only allow for LE.
 (define_insn *aarch64_be_movoi): Define.
 (define_insn *aarch64_be_movci): Define.
 (define_insn *aarch64_be_movxi): Define.
 (define_split): OI mov.  Use aarch64_simd_emit_reg_reg_move.
 (define_split): CI mov.  Use aarch64_simd_emit_reg_reg_move.
 (define_split): XI mov.  Use aarch64_simd_emit_reg_reg_move.

I don't think we should claim to resolve 57233 here.  The solution to
57233 from Marc just happened to expose the BE issues in aarch64.
Otherwise OK.

/Marcus


[wwwdocs] Document ARM --with-cpu changes for 5.0

2014-11-21 Thread James Greenhalgh
Hi,

As requested by Ramana when he OKed the initial change, the attached patch
documents the changes I made to --with-cpu and --with-tune in this patch:
  https://gcc.gnu.org/ml/gcc-patches/2014-05/msg02618.html
in the changes for GCC 5.0.

OK?

Thanks,
James

---

? .git
? foo.patch
? htdocs/.#index.html.1.888
Index: htdocs/gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.40
diff -u -r1.40 changes.html
--- htdocs/gcc-5/changes.html	20 Nov 2014 09:09:26 -	1.40
+++ htdocs/gcc-5/changes.html	20 Nov 2014 10:46:48 -
@@ -393,6 +393,10 @@
 non-unified syntax is used. However this is subject to change in future releases.
 Eventually the non-unified syntax will be deprecated.
   </li>
+  <li>It is now a configure-time error to use the <code>--with-cpu</code>
+  configure option with either of <code>--with-tune</code> or
+  <code>--with-arch</code>.
+  </li>
  </ul>
 
 <h3 id="x86">IA-32/x86-64</h3>


Re: [PATCH] VRP: don't assume strict overflow semantics when checking if a loop wraps

2014-11-21 Thread Patrick Palka
On Fri, Nov 21, 2014 at 7:18 AM, Richard Biener
richard.guent...@gmail.com wrote:
 On Fri, Nov 21, 2014 at 12:29 PM, Patrick Palka patr...@parcs.ath.cx wrote:
 When adjusting the value range of an induction variable using SCEV, VRP
 calls scev_probably_wraps_p() with use_overflow_semantics=true.  This
 parameter set to true makes scev_probably_wraps_p() assume that signed
 induction variables never wrap, so for these variables it always returns
 false (when strict overflow rules are in effect).  This is wrong because
 if a signed induction variable really does overflow then we want to give
 it an INF(OVF) value range and not the (finite) estimation returned by
 SCEV.

 While this change shouldn't make a difference in code generation, it
 should help improve the coverage of -Wstrict-overflow warnings on
 induction variables like in the test case.

 OK after bootstrap + regtest on x86_64-unknown-linux-gnu?

 Hmm, I don't think the change won't affect code-generation.  In fact
 we check for overflow ourselves in the most interesting case
 (the first block) - only the path adjusting min/max based on the
 init value and the max value of the type needs to know whether
 overflow may happen and fail or drop to +-INF(OVF).

 So I'd rather open-code the relevant cases and not call
 scev_probably_wraps_p at all.

What kind of tests for overflow do you have in mind?
max_loop_iterations() in this test case always returns INT_MAX so there
will be no overflow when computing the upper bound using the number of
loop iterations. Do you mean to compare what max_loop_iterations()
returns with the range that VRP has inferred for the induction
variable?


 Richard.

 gcc/
 * tree-vrp.c (adjust_range_with_scev): Call
 scev_probably_wraps_p with use_overflow_semantics=false.

 gcc/testsuite/
 * gcc.dg/Wstrict-overflow-27.c: New test.
 ---
  gcc/testsuite/gcc.dg/Wstrict-overflow-27.c | 22 ++
  gcc/tree-vrp.c |  2 +-
  2 files changed, 23 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/gcc.dg/Wstrict-overflow-27.c

 diff --git a/gcc/testsuite/gcc.dg/Wstrict-overflow-27.c 
 b/gcc/testsuite/gcc.dg/Wstrict-overflow-27.c
 new file mode 100644
 index 000..c1f27ab
 --- /dev/null
 +++ b/gcc/testsuite/gcc.dg/Wstrict-overflow-27.c
 @@ -0,0 +1,22 @@
 +/* { dg-do compile } */
 +/* { dg-options "-fstrict-overflow -O2 -Wstrict-overflow" } */
 +
 +/* Warn about an overflow when folding i < 0.  */
 +
 +void bar (unsigned *p);
 +
 +int
 +foo (unsigned *p)
 +{
 +  int i;
 +  int sum = 0;
 +
 +  for (i = 0; i < *p; i++)
 +{
 +  if (i < 0) /* { dg-warning "signed overflow" } */
 +   sum += 2;
 +  bar (p);
 +}
 +
 +  return sum;
 +}
 diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
 index a75138f..bf9ff61 100644
 --- a/gcc/tree-vrp.c
 +++ b/gcc/tree-vrp.c
 @@ -4270,7 +4270,7 @@ adjust_range_with_scev (value_range_t *vr, struct loop 
 *loop,
dir == EV_DIR_UNKNOWN
/* ... or if it may wrap.  */
|| scev_probably_wraps_p (init, step, stmt, get_chrec_loop (chrec),
 -   true))
 +   /*use_overflow_semantics=*/false))
  return;

/* We use TYPE_MIN_VALUE and TYPE_MAX_VALUE here instead of
 --
 2.2.0.rc1.23.gf570943



[patch] Fix tilepro includes

2014-11-21 Thread Andrew MacLeod
During the flattening of optabs.h, I updated all the config/* files 
which were affected.   I've been getting spurious failures with 
config-list.mk where my changes would disappear and tracked down why.


I was blissfully unaware that the tilepro port's mul-tables.c file is 
actually generated from gen-mul-tables.cc.


This patch fixes the include issue by adding #include insn-codes.h to 
the generated files.  I also added a comment indicating these are 
generated files, and to make changes in the generator.
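
A compact sketch of the generator idea, for illustration only. It formats the prologue into a buffer so it can be checked easily; the real generator prints line by line with printf, and the function name format_prologue is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format the start of a generated mul-tables.c into BUF.  ARCH selects
   the "<arch>-multiply.h" header.  Returns what snprintf returns.
   Hypothetical helper, not the code in gen-mul-tables.cc.  */
int format_prologue(char *buf, size_t size, const char *arch)
{
    assert(buf != NULL && arch != NULL && strlen(arch) > 0);
    return snprintf(buf, size,
                    "/* Note this file is auto-generated from"
                    " gen-mul-tables.cc.\n"
                    "   Make any required changes there.  */\n"
                    "\n"
                    "#include \"config.h\"\n"
                    "#include \"system.h\"\n"
                    "#include \"coretypes.h\"\n"
                    "#include \"expr.h\"\n"
                    "#include \"insn-codes.h\"\n"
                    "#include \"optabs.h\"\n"
                    "#include \"%s-multiply.h\"\n\n",
                    arch);
}
```

The point of the ordering is that insn-codes.h appears before optabs.h in the generated output, so an include fix made in the generator survives regeneration.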


This allows all the tile* ports to compile properly again.

OK for trunk?

Andrew

	* config/tilepro/gen-mul-tables.cc: Add insn-codes.h to include list
	for generator file.  Add comment indicating it is a generated file.
	* config/tilepro/mul-tables.c: Update generated file.
	* config/tilegx/mul-tables.c: Likewise.

Index: config/tilepro/gen-mul-tables.cc
===
*** config/tilepro/gen-mul-tables.cc	(revision 217787)
--- config/tilepro/gen-mul-tables.cc	(working copy)
*** main ()
*** 1249,1258 
--- 1249,1262 
printf ("   along with GCC; see the file COPYING3.  If not see\n");
printf ("   <http://www.gnu.org/licenses/>.  */\n");
printf ("\n");
+   printf ("/* Note this file is auto-generated from gen-mul-tables.cc.\n");
+   printf ("   Make any required changes there.  */\n");
+   printf ("\n");
printf ("#include \"config.h\"\n");
printf ("#include \"system.h\"\n");
printf ("#include \"coretypes.h\"\n");
printf ("#include \"expr.h\"\n");
+   printf ("#include \"insn-codes.h\"\n");
printf ("#include \"optabs.h\"\n");
printf ("#include \"%s-multiply.h\"\n\n", ARCH);
create_insn_code_compression_table ();
Index: config/tilepro/mul-tables.c
===
*** config/tilepro/mul-tables.c	(revision 217787)
--- config/tilepro/mul-tables.c	(working copy)
***
*** 18,23 
--- 18,26 
 along with GCC; see the file COPYING3.  If not see
 http://www.gnu.org/licenses/.  */
  
+ /* Note this file is auto-generated from gen-mul-tables.cc.
+Make any required changes there.  */
+ 
  #include "config.h"
  #include "system.h"
  #include "coretypes.h"
Index: config/tilegx/mul-tables.c
===
*** config/tilegx/mul-tables.c	(revision 217787)
--- config/tilegx/mul-tables.c	(working copy)
***
*** 18,23 
--- 18,26 
 along with GCC; see the file COPYING3.  If not see
 http://www.gnu.org/licenses/.  */
  
+ /* Note this file is auto-generated from gen-mul-tables.cc.
+Make any required changes there.  */
+ 
  #include "config.h"
  #include "system.h"
  #include "coretypes.h"


[PATCH,MIPS] Refine configure guard for .module availability

2014-11-21 Thread Matthew Fortune
(I'm not sure if I need approval from someone else for MIPS
specific top level 'configure' changes.  I'm cautiously assuming
I do for now.)

Since adding o32 FFPXX support, the MIPS backend uses the .module
directive to emit a .module [no]oddspreg when .module support is
detected in binutils.  The oddspreg option was however only added
to binutils with FPXX and not the initial .module support.  This
leads to errors when using binutils-gdb between the following
commits:

commit 919731affbef19fcad8dddb0a595bb05755cb345
Author: mfortune matthew.fort...@imgtec.com
Date:   Tue May 20 13:28:20 2014 +0100

Add MIPS .module directive

commit 351cdf24d223290b15fa991e5052ec9e9bd1e284
Author: Matthew Fortune matthew.fort...@imgtec.com
Date:   Tue Jul 29 11:27:59 2014 +0100

[MIPS] Implement O32 FPXX, FP64 and FP64A ABI extensions

I have updated the configure check for .module to check for both
.module and FPXX support.  There was no point in separating
the detection of .module from detection of FPXX as there is no
need to switch to .module until using FPXX.
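
For context, a sketch of how a backend might choose between the two directive styles once configure has decided whether .module is usable. The function name, its parameters, and the exact directive text are assumptions for illustration; in particular the attribute value is passed in and not hard-coded.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format the FP ABI directive into BUF.  HAVE_DOT_MODULE plays the role
   of the configure result; FP_NAME is e.g. "xx" and FP_ATTRIBUTE is the
   numeric value used with the older .gnu_attribute form.  Hypothetical
   helper, not the MIPS backend's code.  */
int format_fp_abi_directive(char *buf, size_t size, int have_dot_module,
                            const char *fp_name, int fp_attribute)
{
    assert(buf != NULL && fp_name != NULL && strlen(fp_name) > 0);
    if (have_dot_module)
        return snprintf(buf, size, "\t.module\tfp=%s\n", fp_name);
    return snprintf(buf, size, "\t.gnu_attribute 4, %d\n", fp_attribute);
}
```

Tying the .module check to FPXX support means the first branch is only taken with an assembler that also understands the options introduced alongside FPXX.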

Tested a build of the compiler for mipsel-linux-gnu, mips64el-linux-gnu
with binutils which predates and postdates FPXX and checked that the
configure results are correct and that .module vs .gnu_attribute is
generated appropriately.

Thanks,
Matthew

gcc/

* configure.ac: When checking for .module support ensure that
o32 FPXX is supported to avoid a second configure check.
* configure: Regenerate.

diff --git a/gcc/configure.ac b/gcc/configure.ac
index f6e7ec3..584400d 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -4280,8 +4280,9 @@ LCF0:
  [Define if your assembler supports .gnu_attribute.])])
 
 gcc_GAS_CHECK_FEATURE([.module support],
-  gcc_cv_as_mips_dot_module,,,
-  [.module fp=32],,
+  gcc_cv_as_mips_dot_module,,[-32],
+  [.module mips2
+   .module fp=xx],,
   [AC_DEFINE(HAVE_AS_DOT_MODULE, 1,
  [Define if your assembler supports .module.])])
 if test x$gcc_cv_as_mips_dot_module = xno \




[committed] Cherry-pick a libsanitizer bugfix (PR sanitizer/64013)

2014-11-21 Thread Jakub Jelinek
Hi!

I've committed this as obvious.

2014-11-21  Jakub Jelinek  ja...@redhat.com

PR sanitizer/64013
* sanitizer_common/sanitizer_linux.cc (FileExists): Cherry pick
upstream r222532.

--- libsanitizer/sanitizer_common/sanitizer_linux.cc(revision 222531)
+++ libsanitizer/sanitizer_common/sanitizer_linux.cc(revision 222532)
@@ -283,17 +283,15 @@ uptr internal_execve(const char *filenam
 
 // - sanitizer_common.h
 bool FileExists(const char *filename) {
-#if SANITIZER_USES_CANONICAL_LINUX_SYSCALLS
   struct stat st;
+#if SANITIZER_USES_CANONICAL_LINUX_SYSCALLS
   if (internal_syscall(SYSCALL(newfstatat), AT_FDCWD, filename, &st, 0))
-return false;
 #else
-  struct stat st;
   if (internal_stat(filename, &st))
+#endif
 return false;
   // Sanity check: filename is a regular file.
   return S_ISREG(st.st_mode);
-#endif
 }
 
 uptr GetTid() {

Jakub
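
The shape of the function after this fix can be sketched with plain POSIX stat(). This is an illustration only: the sanitizer runtime uses its own internal_stat and syscall wrappers, and the name file_exists is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>

/* Return true only if FILENAME can be stat'ed and is a regular file.
   Both the error check and the S_ISREG check run on every path.  */
bool file_exists(const char *filename)
{
    struct stat st;

    assert(filename != 0);
    if (stat(filename, &st) != 0)
        return false;
    /* Sanity check: filename is a regular file.  */
    return S_ISREG(st.st_mode);
}
```

A directory such as "/" passes the stat call but fails the regular-file check, so it is reported as not existing.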


Re: [rtlanal.c][BE][1/2] Fix vector load/stores to not use ld1/st1

2014-11-21 Thread Alan Hayward

On 14/11/2014 16:48, Alan Hayward alan.hayw...@arm.com wrote:

This is a new version of my BE patch from a few weeks ago.
This is part 1 and covers rtlanal.c. The second part will be aarch64
specific.

When combined with the second patch, it fixes up movoi/ci/xi for Big
Endian, so that we end up with the lsb of a big-endian integer to be in
the low byte of the highest-numbered register.

This will apply cleanly by itself and no regressions were seen when
testing aarch64 and x86_64 on make check.


Changelog:

2014-11-14  Alan Hayward  alan.hayw...@arm.com

* rtlanal.c
(subreg_get_info): Exit early for simple and common cases


Alan.

Hi,

The second part to this patch (aarch64 specific) has been approved.


Could someone review this one please.


Thanks,
Alan.





[PATCH] Improve PR63679

2014-11-21 Thread Richard Biener

This patch picks up work that was in my working tree already and fixes
it up.  When targets choose not to emit piecewise aggregate inits
during gimplification or when that is disabled for other reasons
(like being too large) then even FRE with all its tricks cannot
constant fold from them.  The following patch teaches it to do that
via allowing offsetted reads (at the moment only reads from offset
zero would have been handled) and finally trying to do a lookup
from the static initializer.  It also generalizes the code doing
that to not only simplify reads from string constants but from
arbitrary constants by means of the recently improved 
native_encode/interpret_expr code and from CONSTRUCTORs via
using fold_ctor_reference.
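
A small source-level example of the pattern described above, for illustration. The type and function names are invented for this sketch, and whether a given GCC version actually folds the read depends on the target and options.

```c
#include <assert.h>

struct point { int x; int y; };

/* Constant static initializer that the aggregate copy below reads from.  */
static const struct point start = { 3, 4 };

int read_y_from_copy(void)
{
    assert(sizeof(struct point) >= 2 * sizeof(int));
    struct point p = start;   /* aggregate copy, possibly not split up  */
    return p.y;               /* read at a non-zero offset into the copy */
}
```

If the copy is kept as a single aggregate assignment, looking through it at the offset of y and then into the initializer of start lets the return value be simplified to the constant 4.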

This exposes several testcases that use static uninitialized globals
for which they don't expect loads to be optimized to zero ...

Bootstrapped on x86_64-unknown-linux-gnu, re-testing in progress
after a minor fix.

To really fix PR63679 fold_ctor_reference would need to learn
to combine several array fields to a vector constant or
native_encode_expr would need to learn to encode CONSTRUCTORs.
Also FRE would have to be run late.

Still referencing that PR as it led me to re-investigate all this.

I'll go ahead and apply this patch as bugfix on Monday unless somebody
screams loudly.

Richard.

2014-11-21  Richard Biener  rguent...@suse.de

PR tree-optimization/63679
* tree-ssa-sccvn.c: Include ipa-ref.h, plugin-api.h and cgraph.h.
(copy_reference_ops_from_ref): Fix non-constant ADDR_EXPR case
to properly leave off at -1.
(fully_constant_vn_reference_p): Generalize folding from
constant initializers.
(vn_reference_lookup_3): When looking through aggregate copies
handle offsetted reads and try simplifying the result to
a constant.
* gimple-fold.h (fold_ctor_reference): Export.
* gimple-fold.c (fold_ctor_reference): Likewise.

* gcc.dg/tree-ssa/ssa-fre-42.c: New testcase.
* gcc.dg/tree-ssa/20030807-5.c: Avoid folding read from global to zero.
* gcc.target/i386/ssetype-1.c: Likewise.
* gcc.target/i386/ssetype-3.c: Likewise.
* gcc.target/i386/ssetype-5.c: Likewise.

Index: gcc/tree-ssa-sccvn.c
===
*** gcc/tree-ssa-sccvn.c.orig   2014-11-21 11:09:55.230818525 +0100
--- gcc/tree-ssa-sccvn.c2014-11-21 14:51:03.328237909 +0100
*** along with GCC; see the file COPYING3.
*** 65,70 
--- 65,73 
  #include "tree-ssa-sccvn.h"
  #include "tree-cfg.h"
  #include "domwalk.h"
+ #include "ipa-ref.h"
+ #include "plugin-api.h"
+ #include "cgraph.h"
  
  /* This algorithm is based on the SCC algorithm presented by Keith
 Cooper and L. Taylor Simpson in SCC-Based Value numbering
*** copy_reference_ops_from_ref (tree ref, v
*** 936,942 
  temp.op0 = ref;
  break;
}
! /* Fallthrough.  */
  /* These are only interesting for their operands, their
 existence, and their type.  They will never be the last
 ref in the chain of references (IE they require an
--- 939,945 
  temp.op0 = ref;
  break;
}
! break;
  /* These are only interesting for their operands, their
 existence, and their type.  They will never be the last
 ref in the chain of references (IE they require an
*** fully_constant_vn_reference_p (vn_refere
*** 1341,1364 
}
  }
  
!   /* Simplify reads from constant strings.  */
!   else if (op->opcode == ARRAY_REF
!            && TREE_CODE (op->op0) == INTEGER_CST
!            && integer_zerop (op->op1)
!            && operands.length () == 2)
!     {
!       vn_reference_op_t arg0;
!       arg0 = &operands[1];
!       if (arg0->opcode == STRING_CST
!           && (TYPE_MODE (op->type)
!               == TYPE_MODE (TREE_TYPE (TREE_TYPE (arg0->op0))))
!           && GET_MODE_CLASS (TYPE_MODE (op->type)) == MODE_INT
!           && GET_MODE_SIZE (TYPE_MODE (op->type)) == 1
!           && tree_int_cst_sgn (op->op0) >= 0
!           && compare_tree_int (op->op0, TREE_STRING_LENGTH (arg0->op0)) < 0)
!         return build_int_cst_type (op->type,
!                                    (TREE_STRING_POINTER (arg0->op0)
!                                     [TREE_INT_CST_LOW (op->op0)]));
  }
  
return NULL_TREE;
--- 1344,1409 
}
  }
  
!   /* Simplify reads from constants or constant initializers.  */
!   else if (BITS_PER_UNIT == 8
!            && is_gimple_reg_type (ref->type)
!            && (!INTEGRAL_TYPE_P (ref->type)
!                || TYPE_PRECISION (ref->type) % BITS_PER_UNIT == 0))
!     {
!       HOST_WIDE_INT off = 0;
!       HOST_WIDE_INT size = tree_to_shwi (TYPE_SIZE (ref->type));
!       if (size % BITS_PER_UNIT != 0
!           || size > MAX_BITSIZE_MODE_ANY_MODE)
!         return NULL_TREE;
!       size /= BITS_PER_UNIT;
!       unsigned i;

Re: SRA: don't drop clobbers

2014-11-21 Thread Richard Biener
On Thu, Nov 20, 2014 at 7:11 PM, Martin Jambor mjam...@suse.cz wrote:
 Hi,

 On Mon, Nov 03, 2014 at 10:46:49PM +0100, Marc Glisse wrote:
 On Mon, 3 Nov 2014, Marc Glisse wrote:

 On Mon, 3 Nov 2014, Martin Jambor wrote:
 
 I just applied your patch on top of trunk revision 217032 on my
 
 Ah, that explains it, thanks. This patch is a follow-up to
 r217034. Still, I didn't expect the ICE you are seeing by applying
 this patch to older trunk, I'll try to reproduce that.

 It is TODO_update_address_taken that used to remove clobbers, and as
 you said ESRA goes straight to TODO_update_ssa, which explains why
 the clobbers caused trouble. In any case, after r217034, update_ssa
 should handle clobbers much better. Could you take an other look
 based on a more recent trunk, please?


 Sorry for the delay.  Anyway, on the current trunk (i.e. Tuesday
 checkout) the patch works as expected, there are assignments from
 default definitions now and even though we do not warn as we should,
 the patch improves the generated code.  The function foo from the
 testcase is optimized to return SR.1_2(D); as soon as release_ssa
 now, whereas unpatched trunk leaves an undefined load even in the
 optimized dump.

 Thus, I like the patch and given that you posted it well before stage1
 end, I'd like to see it committed.  Richi, can you have a look and
 perhaps approve it?

Yes, the patch is ok.

Thanks,
Richard.

 Thanks,

 Martin



Re: [PATCH 8/9] Negative numbers added for sreal class.

2014-11-21 Thread Martin Liška

On 11/21/2014 01:03 PM, Richard Biener wrote:

On Fri, Nov 21, 2014 at 12:21 PM, Martin Liška mli...@suse.cz wrote:

On 11/14/2014 11:48 AM, Richard Biener wrote:


On Thu, Nov 13, 2014 at 1:35 PM, mliska mli...@suse.cz wrote:


gcc/ChangeLog:

2014-11-13  Martin Liska  mli...@suse.cz

  * predict.c (propagate_freq): More elegant sreal API is used.
  (estimate_bb_frequencies): New static constants defined by sreal
  replace precomputed ones.
  * sreal.c (sreal::normalize): New function.
  (sreal::to_int): Likewise.
  (sreal::operator+): Likewise.
  (sreal::operator-): Likewise.
  * sreal.h: Definition of new functions added.



Please use gcc_checking_assert()s everywhere.  sreal is supposed
to be fast... (I see it has current uses of gcc_assert - you may want
to mass-convert them as a followup).


---
   gcc/predict.c | 30 +++-
   gcc/sreal.c   | 56 
   gcc/sreal.h   | 75
---
   3 files changed, 126 insertions(+), 35 deletions(-)

diff --git a/gcc/predict.c b/gcc/predict.c
index 0215e91..0f640f5 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -82,7 +82,7 @@ along with GCC; see the file COPYING3.  If not see

   /* real constants: 0, 1, 1-1/REG_BR_PROB_BASE, REG_BR_PROB_BASE,
 1/REG_BR_PROB_BASE, 0.5, BB_FREQ_MAX.  */
-static sreal real_zero, real_one, real_almost_one, real_br_prob_base,
+static sreal real_almost_one, real_br_prob_base,
   real_inv_br_prob_base, real_one_half, real_bb_freq_max;

   static void combine_predictions_for_insn (rtx_insn *, basic_block);
@@ -2528,13 +2528,13 @@ propagate_freq (basic_block head, bitmap tovisit)
  bb->count = bb->frequency = 0;
   }

-  BLOCK_INFO (head)->frequency = real_one;
+  BLOCK_INFO (head)->frequency = sreal::one ();
 last = head;
 for (bb = head; bb; bb = nextbb)
   {
 edge_iterator ei;
-  sreal cyclic_probability = real_zero;
-  sreal frequency = real_zero;
+  sreal cyclic_probability = sreal::zero ();
+  sreal frequency = sreal::zero ();

 nextbb = BLOCK_INFO (bb)->next;
 BLOCK_INFO (bb)->next = NULL;
@@ -2559,13 +2559,13 @@ propagate_freq (basic_block head, bitmap tovisit)
* BLOCK_INFO (e->src)->frequency /
REG_BR_PROB_BASE);  */

-   sreal tmp (e->probability, 0);
+   sreal tmp = e->probability;
  tmp *= BLOCK_INFO (e->src)->frequency;
  tmp *= real_inv_br_prob_base;
  frequency += tmp;
}

- if (cyclic_probability == real_zero)
+ if (cyclic_probability == sreal::zero ())
  {
BLOCK_INFO (bb)->frequency = frequency;
  }
@@ -2577,7 +2577,7 @@ propagate_freq (basic_block head, bitmap tovisit)
/* BLOCK_INFO (bb)->frequency = frequency
/ (1 - cyclic_probability)
*/

- cyclic_probability = real_one - cyclic_probability;
+ cyclic_probability = sreal::one () - cyclic_probability;
BLOCK_INFO (bb)->frequency = frequency /
cyclic_probability;
  }
  }
@@ -2591,7 +2591,7 @@ propagate_freq (basic_block head, bitmap tovisit)
   = ((e->probability * BLOCK_INFO (bb)->frequency)
   / REG_BR_PROB_BASE); */

- sreal tmp (e->probability, 0);
+ sreal tmp = e->probability;
tmp *= BLOCK_INFO (bb)->frequency;
EDGE_INFO (e)->back_edge_prob = tmp * real_inv_br_prob_base;
  }
@@ -2873,13 +2873,11 @@ estimate_bb_frequencies (bool force)
 if (!real_values_initialized)
   {
real_values_initialized = 1;
- real_zero = sreal (0, 0);
- real_one = sreal (1, 0);
- real_br_prob_base = sreal (REG_BR_PROB_BASE, 0);
- real_bb_freq_max = sreal (BB_FREQ_MAX, 0);
+ real_br_prob_base = REG_BR_PROB_BASE;
+ real_bb_freq_max = BB_FREQ_MAX;
real_one_half = sreal (1, -1);
- real_inv_br_prob_base = real_one / real_br_prob_base;
- real_almost_one = real_one - real_inv_br_prob_base;
+ real_inv_br_prob_base = sreal::one () / real_br_prob_base;
+ real_almost_one = sreal::one () - real_inv_br_prob_base;
  }

 mark_dfs_back_edges ();
@@ -2897,7 +2895,7 @@ estimate_bb_frequencies (bool force)

FOR_EACH_EDGE (e, ei, bb->succs)
  {
- EDGE_INFO (e)->back_edge_prob = sreal (e->probability, 0);
+ EDGE_INFO (e)->back_edge_prob = e->probability;
EDGE_INFO (e)->back_edge_prob *= real_inv_br_prob_base;
  }
  }
@@ -2906,7 +2904,7 @@ estimate_bb_frequencies (bool force)
to outermost to examine frequencies for back edges.  */
 

Re: [PATCH 8/9] Negative numbers added for sreal class.

2014-11-21 Thread Richard Biener
On Fri, Nov 21, 2014 at 3:39 PM, Martin Liška mli...@suse.cz wrote:

 Hello.

 Ok, this is simplified, one can use sreal a = 12345 and it works ;)

 that's a  new API, right?  There is no max () and I think that using
 LONG_MIN here is asking for trouble (host dependence).  The
 comment in the file says the max should be
 sreal (SREAL_MAX_SIG, SREAL_MAX_EXP) and the min
 sreal (-SREAL_MAX_SIG, SREAL_MAX_EXP)?


 Sure, sreal can store much bigger(smaller) numbers :)

 Where do you need sreal::to_double?  The host shouldn't perform
 double calculations so it can be only for dumping?  In which case
 the user should have used sreal::dump (), maybe with extra
 arguments.


 That new function was requested by Honza, only for debugging purposes.
 I agree that dump should do this kind of job.

 If no other problem, I will run tests once more and commit it.
 Thanks,
 Martin

-#define SREAL_MAX_EXP (INT_MAX / 4)
+#define SREAL_MAX_EXP (INT_MAX / 8)

this change doesn't look necessary anymore?

Btw, it's also odd that...

 #define SREAL_PART_BITS 32
...
 #define SREAL_MIN_SIG ((uint64_t) 1 << (SREAL_PART_BITS - 1))
 #define SREAL_MAX_SIG (((uint64_t) 1 << SREAL_PART_BITS) - 1)

thus all m_sig values fit in 32bits but we still use a uint64_t m_sig ...
(the implementation uses 64bit for internal computations, but still
the storage is wasteful?)

Of course the way normalize() works requires that storage to be
64bits to store unnormalized values.
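
To make the point about storage concrete, here is a minimal sketch of the idea, not the real sreal class. It assumes a signed 64-bit significand (in line with dropping the separate m_negative bit), reuses the SREAL_PART_BITS bounds quoted above, and truncates where the real implementation may round. The names mini_sreal, magnitude, min_sig and max_sig are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for GCC's sreal; illustration only.
constexpr int part_bits = 32;  // mirrors SREAL_PART_BITS
constexpr std::uint64_t min_sig = std::uint64_t (1) << (part_bits - 1);
constexpr std::uint64_t max_sig = (std::uint64_t (1) << part_bits) - 1;

struct mini_sreal
{
  std::int64_t sig;  // signed significand; 64 bits so unnormalized input fits
  int exp;           // the value represented is sig * 2^exp

  // Deliberately not explicit, so that "mini_sreal a = 12345;" compiles.
  mini_sreal (std::int64_t n = 0, int e = 0) : sig (n), exp (e)
  {
    normalize ();
  }

  // Absolute value computed in unsigned arithmetic to avoid overflow.
  static std::uint64_t magnitude (std::int64_t v)
  {
    return v < 0 ? 0 - static_cast<std::uint64_t> (v)
                 : static_cast<std::uint64_t> (v);
  }

  // Bring |sig| into [min_sig, max_sig], adjusting exp to compensate.
  void normalize ()
  {
    if (sig == 0)
      {
        exp = 0;
        return;
      }
    const bool negative = sig < 0;
    std::uint64_t mag = magnitude (sig);
    while (mag < min_sig) { mag <<= 1; --exp; }
    while (mag > max_sig) { mag >>= 1; ++exp; }  // truncates low bits
    sig = negative ? -static_cast<std::int64_t> (mag)
                   : static_cast<std::int64_t> (mag);
  }

  // Convert back to an integer, truncating toward zero and saturating.
  std::int64_t to_int () const
  {
    const bool negative = sig < 0;
    std::uint64_t mag = magnitude (sig);
    if (exp <= -64)
      mag = 0;
    else if (exp < 0)
      mag >>= -exp;
    else if (exp <= 30)
      mag <<= exp;  // mag < 2^32, so the result stays below 2^63
    else
      mag = INT64_MAX;
    return negative ? -static_cast<std::int64_t> (mag)
                    : static_cast<std::int64_t> (mag);
  }
};
```

After normalization the significand always fits in 32 bits, which is the "wasteful" part; the 64-bit field only matters between construction and the call to normalize ().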

I'd say ok with the SREAL_MAX_EXP change reverted.

Thanks,
Richard.



 Otherwise looks good to me and sorry for not noticing the above
 earlier.

 Thanks,
 Richard.

 Thanks,
 Martin


};

extern void debug (sreal ref);
 @@ -76,12 +133,12 @@ inline sreal operator+= (sreal a, const sreal
 b)

inline sreal operator-= (sreal a, const sreal b)
{
 -return a = a - b;
 +  return a = a - b;
}

inline sreal operator/= (sreal a, const sreal b)
{
 -return a = a / b;
 +  return a = a / b;
}

inline sreal operator*= (sreal a, const sreal b)
 --
 2.1.2






Re: [PATCH 8/9] Negative numbers added for sreal class.

2014-11-21 Thread Martin Liška

On 11/21/2014 04:02 PM, Richard Biener wrote:

On Fri, Nov 21, 2014 at 3:39 PM, Martin Liška mli...@suse.cz wrote:


Hello.

Ok, this is simplified, one can use sreal a = 12345 and it works ;)


that's a  new API, right?  There is no max () and I think that using
LONG_MIN here is asking for trouble (host dependence).  The
comment in the file says the max should be
sreal (SREAL_MAX_SIG, SREAL_MAX_EXP) and the min
sreal (-SREAL_MAX_SIG, SREAL_MAX_EXP)?



Sure, sreal can store much bigger(smaller) numbers :)


Where do you need sreal::to_double?  The host shouldn't perform
double calculations so it can be only for dumping?  In which case
the user should have used sreal::dump (), maybe with extra
arguments.



That new function was requested by Honza, only for debugging purposes.
I agree that dump should do this kind of job.

If no other problem, I will run tests once more and commit it.
Thanks,
Martin


-#define SREAL_MAX_EXP (INT_MAX / 4)
+#define SREAL_MAX_EXP (INT_MAX / 8)

this change doesn't look necessary anymore?

Btw, it's also odd that...

  #define SREAL_PART_BITS 32
...
  #define SREAL_MIN_SIG ((uint64_t) 1 << (SREAL_PART_BITS - 1))
  #define SREAL_MAX_SIG (((uint64_t) 1 << SREAL_PART_BITS) - 1)

thus all m_sig values fit in 32bits but we still use a uint64_t m_sig ...
(the implementation uses 64bit for internal computations, but still
the storage is wasteful?)

Of course the way normalize() works requires that storage to be
64bits to store unnormalized values.

I'd say ok with the SREAL_MAX_EXP change reverted.



Hi.

You are right, this change was done because I used one bit for 
m_negative (bitfield), not needed any more.


Final version attached.

Thank you,
Martin


Thanks,
Richard.





Otherwise looks good to me and sorry for not noticing the above
earlier.

Thanks,
Richard.


Thanks,
Martin



};

extern void debug (sreal ref);
@@ -76,12 +133,12 @@ inline sreal operator+= (sreal a, const sreal
b)

inline sreal operator-= (sreal a, const sreal b)
{
-return a = a - b;
+  return a = a - b;
}

inline sreal operator/= (sreal a, const sreal b)
{
-return a = a / b;
+  return a = a / b;
}

inline sreal operator*= (sreal a, const sreal b)
--
2.1.2








From b28e4264b5f9965ca5ab4f52ce6f4c9df00d4800 Mon Sep 17 00:00:00 2001
From: mliska mli...@suse.cz
Date: Fri, 21 Nov 2014 12:07:40 +0100
Subject: [PATCH 1/2] Negative numbers added for sreal class.

gcc/ChangeLog:

2014-11-13  Martin Liska  mli...@suse.cz

	* predict.c (propagate_freq): More elegant sreal API is used.
	(estimate_bb_frequencies): Precomputed constants replaced by integer
	constants.
	* sreal.c (sreal::normalize): New function.
	(sreal::to_int): Likewise.
	(sreal::operator+): Likewise.
	(sreal::operator-): Likewise.
	* sreal.h: Definition of new functions added.
---
 gcc/predict.c |  30 
 gcc/sreal.c   | 114 --
 gcc/sreal.h   |  82 +-
 3 files changed, 174 insertions(+), 52 deletions(-)

diff --git a/gcc/predict.c b/gcc/predict.c
index 779af11..0cfe4a9 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -82,7 +82,7 @@ along with GCC; see the file COPYING3.  If not see
 
 /* real constants: 0, 1, 1-1/REG_BR_PROB_BASE, REG_BR_PROB_BASE,
 		   1/REG_BR_PROB_BASE, 0.5, BB_FREQ_MAX.  */
-static sreal real_zero, real_one, real_almost_one, real_br_prob_base,
+static sreal real_almost_one, real_br_prob_base,
 	 real_inv_br_prob_base, real_one_half, real_bb_freq_max;
 
 static void combine_predictions_for_insn (rtx_insn *, basic_block);
@@ -2541,13 +2541,13 @@ propagate_freq (basic_block head, bitmap tovisit)
 	bb->count = bb->frequency = 0;
 }
 
-  BLOCK_INFO (head)->frequency = real_one;
+  BLOCK_INFO (head)->frequency = 1;
   last = head;
   for (bb = head; bb; bb = nextbb)
 {
   edge_iterator ei;
-  sreal cyclic_probability = real_zero;
-  sreal frequency = real_zero;
+  sreal cyclic_probability = 0;
+  sreal frequency = 0;
 
   nextbb = BLOCK_INFO (bb)->next;
   BLOCK_INFO (bb)->next = NULL;
@@ -2572,13 +2572,13 @@ propagate_freq (basic_block head, bitmap tovisit)
   * BLOCK_INFO (e->src)->frequency /
   REG_BR_PROB_BASE);  */
 
-		sreal tmp (e->probability, 0);
+		sreal tmp = e->probability;
 		tmp *= BLOCK_INFO (e->src)->frequency;
 		tmp *= real_inv_br_prob_base;
 		frequency += tmp;
 	  }
 
-	  if (cyclic_probability == real_zero)
+	  if (cyclic_probability == 0)
 	{
 	  BLOCK_INFO (bb)->frequency = frequency;
 	}
@@ -2590,7 +2590,7 @@ propagate_freq (basic_block head, bitmap tovisit)
 	  /* BLOCK_INFO (bb)->frequency = frequency
 	  / (1 - cyclic_probability) */
 
-	  cyclic_probability = real_one - cyclic_probability;
+	  cyclic_probability = sreal (1) - cyclic_probability;
 	  BLOCK_INFO (bb)->frequency = frequency / cyclic_probability;
 	}
 	}
@@ -2604,7 +2604,7 @@ propagate_freq (basic_block head, bitmap tovisit)
 

Re: [PATCH, MPX runtime 1/2] Integrate MPX runtime library

2014-11-21 Thread Ilya Enkovich
On 19 Nov 21:11, Ilya Enkovich wrote:
 2014-11-19 20:55 GMT+03:00 Jeff Law l...@redhat.com:
  On 11/19/14 07:15, Ilya Enkovich wrote:
 
  --
  2014-11-19  Ilya Enkovich  ilya.enkov...@intel.com
 
  * Makefile.def: Add libmpx.
  * configure.ac: Add libmpx.
  * Makefile.in: Regenerate.
  * configure: Regenerate.
 
  gcc/
 
  2014-11-19  Ilya Enkovich  ilya.enkov...@intel.com
 
  * gcc.c (LIBMPX_LIBS): New.
  (LIBMPX_SPEC): New.
  (MPX_SPEC): New.
  (LINK_COMMAND_SPEC): Add MPX_SPEC.
  * c-family/c.opt (static-libmpx): New.
 
  libmpx/
 
  2014-11-19  Ilya Enkovich  ilya.enkov...@intel.com
 
  Initial commit.
 
  So I have only done a cursory peek at this code, but one thing which I did
  immediately note was the CPU feature testing stuff.  Shouldn't all that
  stuff be integrated into the feature testing bits already found in libgcc?
 
 I'll have a look at these features.
 
 
  I've asked the steering committee to vote on accepting the runtime --
  necessary given Intel is keeping copyright ownership to the best of my
  knowledge.
 
 Thanks!
 
 Ilya
 
 
  Jeff
 

Jakub objected to adding the CPUID checks used in the MPX runtime to 
__builtin_cpu_supports.  So I just added the required bits to cpuid.h and removed 
the local implementation of cpuid.  Is it OK?

Thanks,
Ilya
--
2014-11-21  Ilya Enkovich  ilya.enkov...@intel.com

* Makefile.def: Add libmpx.
* configure.ac: Add libmpx.
* Makefile.in: Regenerate.
* configure: Regenerate.

gcc/

2014-11-21  Ilya Enkovich  ilya.enkov...@intel.com

* config/i386/cpuid.h (bit_MPX): New.
(bit_BNDREGS): New.
(bit_BNDCSR): New.
* gcc.c (LIBMPX_LIBS): New.
(LIBMPX_SPEC): New.
(MPX_SPEC): New.
(LINK_COMMAND_SPEC): Add MPX_SPEC.
* c-family/c.opt (static-libmpx): New.

libmpx/

2014-11-21  Ilya Enkovich  ilya.enkov...@intel.com

Initial commit.


diff --git a/Makefile.def b/Makefile.def
index 40bbca9..4a535d2 100644
--- a/Makefile.def
+++ b/Makefile.def
@@ -128,6 +128,9 @@ target_modules = { module= libsanitizer;
   bootstrap=true;
   lib_path=.libs;
   raw_cxx=true; };
+target_modules = { module= libmpx;
+  bootstrap=true;
+  lib_path=.libs; };
 target_modules = { module= libvtv;
   bootstrap=true;
   lib_path=.libs;
diff --git a/configure.ac b/configure.ac
index b27fb1d..ccb119b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -162,6 +162,7 @@ target_libraries=target-libgcc \
target-libstdc++-v3 \
target-libsanitizer \
target-libvtv \
+   target-libmpx \
target-libssp \
target-libquadmath \
target-libgfortran \
@@ -642,6 +643,25 @@ if test -d ${srcdir}/libvtv; then
 fi
 fi
 
+
+# Disable libmpx on unsupported systems.
+if test -d ${srcdir}/libmpx; then
+if test x$enable_libmpx = x; then
+   AC_MSG_CHECKING([for libmpx support])
+   if (srcdir=${srcdir}/libmpx; \
+   . ${srcdir}/configure.tgt; \
+   test "$LIBMPX_SUPPORTED" != "yes")
+   then
+   AC_MSG_RESULT([no])
+   noconfigdirs="$noconfigdirs target-libmpx"
+   else
+   AC_MSG_RESULT([yes])
+   fi
+fi
+fi
+
+
+
 # Disable libquadmath for some systems.
 case ${target} in
   avr-*-*)
@@ -2652,6 +2672,11 @@ if echo  ${target_configdirs}  | grep  libvtv   
/dev/null 21 
   bootstrap_target_libs=${bootstrap_target_libs}target-libvtv,
 fi
 
+# If we are building libmpx, bootstrap it.
+if echo " ${target_configdirs} " | grep " libmpx " > /dev/null 2>&1; then
+  bootstrap_target_libs=${bootstrap_target_libs}target-libmpx,
+fi
+
 # Determine whether gdb needs tk/tcl or not.
 # Use 'maybe' since enable_gdbtk might be true even if tk isn't available
 # and in that case we want gdb to be built without tk.  Ugh!
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 85dcb98..8f5d76c 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1040,6 +1040,9 @@ fchkp-instrument-marked-only
 C ObjC C++ ObjC++ LTO Report Var(flag_chkp_instrument_marked_only) Init(0)
 Instrument only functions marked with bnd_instrument attribute.
 
+static-libmpx
+Driver
+
 fcilkplus
 C ObjC C++ ObjC++ LTO Report Var(flag_cilkplus) Init(0)
 Enable Cilk Plus
diff --git a/gcc/config/i386/cpuid.h b/gcc/config/i386/cpuid.h
index 133e356..f85cebb 100644
--- a/gcc/config/i386/cpuid.h
+++ b/gcc/config/i386/cpuid.h
@@ -72,6 +72,7 @@
 #define bit_AVX2   (1 << 5)
 #define bit_BMI2   (1 << 8)
 #define bit_RTM    (1 << 11)
+#define bit_MPX    (1 << 14)
 #define bit_AVX512F(1 << 16)
 #define bit_AVX512DQ   (1 << 17)
 #define bit_RDSEED (1 << 18)
@@ -87,6 +88,10 @@
 /* %ecx */
 #define bit_PREFETCHWT1  (1 << 0)
 
+/* XFEATURE_ENABLED_MASK register bits (%eax == 13, %ecx == 0) */
+#define 

Re: [PATCH, MPX wrappers 1/3] Add MPX wrappers library

2014-11-21 Thread Ilya Enkovich
On 18 Nov 14:15, Jeff Law wrote:
 On 11/18/14 09:48, Ilya Enkovich wrote:
 On 15 Nov 00:10, Jeff Law wrote:
 On 11/14/14 10:26, Ilya Enkovich wrote:
 Hi,
 
 This patch introduces a simple library with several wrappers to be used 
 with MPX and Pointer Bounds Checker.  The wrappers make it possible to obtain, copy and 
 just keep alive bounds through widely used library calls.  This significantly 
 increases checking quality.
 
 Thanks,
 Ilya
 --
 gcc/
 
 2014-11-14  Ilya Enkovich  ilya.enkov...@intel.com
 
* gcc.c (MPX_SPEC): Add wrappers library.
 
 libmpx/
 
 2014-11-14  Ilya Enkovich  ilya.enkov...@intel.com
 
* Makefile.am (SUBDIRS): New.
(MAKEOVERRIDES): New.
* Makefile.in: Regenerate.
* configure.ac: Add mpxintr/Makefile to config
files.
* configure: Regenerate.
* mpxwrap/Makefile.am: New.
* mpxwrap/Makefile.in: New.
* mpxwrap/libtool-version: New.
* mpxwrap/mpx_wrappers.cc: New.
 As Joseph mentioned, symbol versioning.  Anytime a target side
 library is added to GCC, it should be properly versioned.
 
 Don't forget copyright headers in the new files.  Remember it has to
 be suitable for embeddeding in the target without infecting the
 target with the GPL.  LGPL or GPL + exception clause seem the most
 appropriate to me.
 
 
 Jeff
 
 
 Thank you for review!  Here is a version with license and versioning added.
 
 Thanks,
 Ilya
 --
 gcc/
 
 2014-11-18  Ilya Enkovich  ilya.enkov...@intel.com
 
  * gcc.c (MPX_SPEC): Add wrappers library.
 
 libmpx/
 
 2014-11-18  Ilya Enkovich  ilya.enkov...@intel.com
 
  * Makefile.am (SUBDIRS): New.
  (MAKEOVERRIDES): New.
  * Makefile.in: Regenerate.
  * configure.ac: Add mpxintr/Makefile to config
  files.
  * configure: Regenerate.
  * mpxwrap/Makefile.am: New.
  * mpxwrap/Makefile.in: New.
  * mpxwrap/libtool-version: New.
  * mpxwrap/mpx_wrappers.cc: New.
  * mpxwrap/libmpxwrappers.map: New.
 OK.
 Jeff
 
Hi,

There is a missing check in libmpx configure.  We may try to build mpxwrappers 
when binutils doesn't support MPX and thus get a build failure.  I added a check 
for MPX support in the assembler being used, and the mpxwrappers library is now 
built conditionally.

Since the latest version of the runtime library supports static linking, I also 
added support for the -static-libmpxwrappers option.

Does it look OK?

Thanks,
Ilya
--
gcc/

2014-11-21  Ilya Enkovich  ilya.enkov...@intel.com

* gcc.c (LIBMPXWRAPPERS_SPEC): New.
(MPX_SPEC): Add wrappers library.
* c-family/c.opt (static-libmpxwrappers): New.

libmpx/

2014-11-21  Ilya Enkovich  ilya.enkov...@intel.com

* Makefile.am (SUBDIRS): Add mpxwrap when used
AS supports MPX.
(MAKEOVERRIDES): New.
* Makefile.in: Regenerate.
* configure.ac: Check AS supports MPX.  Add
mpxintr/Makefile to config files.
* configure: Regenerate.
* mpxwrap/Makefile.am: New.
* mpxwrap/Makefile.in: New.
* mpxwrap/libtool-version: New.
* mpxwrap/mpx_wrappers.cc: New.
* mpxwrap/libmpxwrappers.map: New.


diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 8f5d76c..283c632 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1043,6 +1043,9 @@ Instrument only functions marked with bnd_instrument 
attribute.
 static-libmpx
 Driver
 
+static-libmpxwrappers
+Driver
+
 fcilkplus
 C ObjC C++ ObjC++ LTO Report Var(flag_cilkplus) Init(0)
 Enable Cilk Plus
diff --git a/gcc/gcc.c b/gcc/gcc.c
index 75e5767..aa8c9a3 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -828,9 +828,23 @@ proper position among the other output files.  */
 #endif
 #endif
 
+#ifndef LIBMPXWRAPPERS_SPEC
+#if defined(HAVE_LD_STATIC_DYNAMIC)
+#define LIBMPXWRAPPERS_SPEC \
+%{mmpx:%{fcheck-pointer-bounds:%{!fno-chkp-use-wrappers:\
+%{static:-lmpxwrappers}\
+%{!static:%{static-libmpxwrappers: LD_STATIC_OPTION  --whole-archive}\
+-lmpxwrappers %{static-libmpxwrappers:--no-whole-archive \
+LD_DYNAMIC_OPTION }
+#else
+#define LIBMPXWRAPPERS_SPEC \
+%{mmpx:%{fcheck-pointer-bounds:{!fno-chkp-use-wrappers:-lmpxwrappers}}}
+#endif
+#endif
+
 #ifndef MPX_SPEC
 #define MPX_SPEC \
-%{!nostdlib:%{!nodefaultlibs: LIBMPX_SPEC }}
+%{!nostdlib:%{!nodefaultlibs: LIBMPX_SPEC LIBMPXWRAPPERS_SPEC }}
 #endif
 
 /* -u* was put back because both BSD and SysV seem to support it.  */
diff --git a/libmpx/Makefile.am b/libmpx/Makefile.am
index 6cee4ac..bd0a8b6 100644
--- a/libmpx/Makefile.am
+++ b/libmpx/Makefile.am
@@ -2,6 +2,9 @@ ACLOCAL_AMFLAGS = -I .. -I ../config
 
 if LIBMPX_SUPPORTED
 SUBDIRS = mpxrt
+if MPX_AS_SUPPORTED
+SUBDIRS += mpxwrap
+endif
 nodist_toolexeclib_HEADERS = libmpx.spec
 endif
 
@@ -45,3 +48,5 @@ AM_MAKEFLAGS = \
PICFLAG=$(PICFLAG) \
RANLIB=$(RANLIB) \
DESTDIR=$(DESTDIR)
+
+MAKEOVERRIDES =
diff --git a/libmpx/configure.ac b/libmpx/configure.ac
index bd7a5eb..180503c 100644
--- a/libmpx/configure.ac
+++ b/libmpx/configure.ac
@@ -93,6 +93,18 @@ AC_CHECK_TOOL(AS, as)
 

Re: [PATCH][AArch64] Implement vsqrt_f64 intrinsic

2014-11-21 Thread Marcus Shawcroft
On 17 November 2014 17:35, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:

 2014-11-17  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * config/aarch64/arm_neon.h (vsqrt_f64): New intrinsic.

 2014-11-17  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * gcc.target/aarch64/simd/vsqrt_f64_1.c

OK /Marcus


Re: [PATCH][wwwdocs] Add Cortex-A53 erratum workaround note to AArch64 changes for 4.8

2014-11-21 Thread Marcus Shawcroft
On 17 November 2014 11:42, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:

 Makes sense. Here are the changes for the 4.9 and 4.8 changes.html pages.

 Ok?


This looks ok to me, I'd suggest changing...

+  li Starting with GCC 4.8.4 a workaround for the ARM Cortex-A53

to

+ li As of GCC 4.8.4 

OK with that change.
/Marcus


Re: [PATCH][AArch64]Add vec_shr pattern for 64-bit vectors using ush{l,r}; enable tests.

2014-11-21 Thread Marcus Shawcroft
On 14 November 2014 15:46, Alan Lawrence alan.lawre...@arm.com wrote:

 gcc/ChangeLog:

 * config/aarch64/aarch64-simd.md (vec_shrmode): New.

 gcc/testsuite/ChangeLog:

 * lib/target-supports.exp
 (check_effective_target_whole_vector_shift): Add aarch64{,_be}.

OK /Marcus


Re: [PATCH][AArch64]Tidy up aarch64_simd_expand_args

2014-11-21 Thread Marcus Shawcroft
On 17 November 2014 16:56, Alan Lawrence alan.lawre...@arm.com wrote:
 This is a pure tidyup, no new functionality. Changes are
 (1) Use op[0] to store the result operand, rather than a separate variable,
 thus combining the two large switch statements into one;
 (2) The 'arg' and 'mode' arrays were (almost-)only ever used to store data
 *within* each iteration, so turn them into scalar variables.
 (3) Use 'opc' rather than 'argc' as it indexes operands.

 Cross-tested check-gcc on aarch64-none-elf.

 gcc/ChangeLog:
 * config/aarch64/aarch64-builtins.c (aarch64_simd_expand_args):
 Refactor by combining switch statements and make arrays into
 scalars.

OK /Marcus


Re: [PATCH][AArch64] Add vector pattern for __builtin_ctz

2014-11-21 Thread Marcus Shawcroft
On 14 November 2014 16:38, Jiong Wang jiong.w...@arm.com wrote:

 gcc/
   * config/aarch64/iterators.md (VS): New mode iterator.
   (vsi2qi): New mode attribute.
   (VSI2QI): Likewise.
   * config/aarch64/aarch64-simd-builtins.def: New entry for ctz.
   * config/aarch64/aarch64-simd.md (ctzmode2): New pattern for ctz.
   * config/aarch64/aarch64-builtins.c
   (aarch64_builtin_vectorized_function): Support BUILT_IN_CTZ.

 gcc/testsuite/
   * gcc.target/aarch64/vect_ctz_1.c: New testcase.

OK /Marcus


Re: [PATCH][AArch64][1/5] Implement TARGET_SCHED_MACRO_FUSION_PAIR_P

2014-11-21 Thread Marcus Shawcroft
On 18 November 2014 12:20, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:

 On 18/11/14 10:33, Kyrill Tkachov wrote:

 diff --git a/gcc/config/arm/aarch-common-protos.h
 b/gcc/config/arm/aarch-common-protos.h
 index 264bf01..ad7ec43c 100644
 --- a/gcc/config/arm/aarch-common-protos.h
 +++ b/gcc/config/arm/aarch-common-protos.h
 @@ -36,7 +36,6 @@ extern int arm_no_early_alu_shift_value_dep (rtx, rtx);
   extern int arm_no_early_mul_dep (rtx, rtx);
   extern int arm_no_early_store_addr_dep (rtx, rtx);
   extern bool arm_rtx_shift_left_p (rtx);
 -
   /* RTX cost table definitions.  These are used when tuning for speed
 rather
  than for size and should reflect the_additional_  cost over the cost
  of the fastest instruction in the machine, which is COSTS_N_INSNS
 (1).


 This hunk should not be here. I'll remove it when I commit if approved...
 Sorry for that.

 Kyrill


Ok, with that hunk dropped. /Marcus


Re: [PATCH][AArch64][2/5] Implement adrp+add fusion

2014-11-21 Thread Marcus Shawcroft
On 18 November 2014 10:33, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:

 2014-11-18  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * config/aarch64/aarch64.c: Include tm-constrs.h
 (AARCH64_FUSE_ADRP_ADD): Define.
 (cortexa57_tunings): Add AARCH64_FUSE_ADRP_ADD to fuseable_ops.
 (cortexa53_tunings): Likewise.
 (aarch_macro_fusion_pair_p): Handle AARCH64_FUSE_ADRP_ADD.

OK /Marcus


Re: [PATCH][AArch64][3/5] Implement fusion of MOVK+MOVK

2014-11-21 Thread Marcus Shawcroft
On 18 November 2014 10:33, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:

 2014-11-18  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * config/aarch64/aarch64.c (AARCH64_FUSE_MOVK_MOVK): Define.
 (cortexa53_tunings): Specify AARCH64_FUSE_MOVK_MOVK in fuseable_ops.
 (cortexa57_tunings): Likewise.
 (aarch_macro_fusion_pair_p): Handle AARCH64_FUSE_MOVK_MOVK.

OK /Marcus


Re: [PATCH][AArch64][4/5] Implement fusion of ARDP+LDR

2014-11-21 Thread Marcus Shawcroft
On 18 November 2014 10:33, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:

 2014-11-18  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * config/aarch64/aarch64.c (AARCH64_FUSE_ADRP_LDR): Define.
 (cortexa53_tunings): Specify AARCH64_FUSE_ADRP_LDR in fuseable_ops.
 (aarch_macro_fusion_pair_p): Handle AARCH64_FUSE_ADRP_LDR.

OK /Marcus


Re: [PATCH, i386] Add new arg values for __builtin_cpu_supports

2014-11-21 Thread Jeff Law

On 11/20/14 09:40, Jakub Jelinek wrote:

On Thu, Nov 20, 2014 at 07:36:03PM +0300, Ilya Enkovich wrote:

Hi,

MPX runtime checks some feature bits in order to check MPX is fully
supported.  Runtime does it by cpuid calls but there is a
__builtin_cpu_supports which may be used for that.  Unfortunately
currently it doesn't support required bits.  Will it be OK to add them for
trunk?


I think using cpuid for that is just fine.  __builtin_cpu_supports
is for ISA additions users might actually want to version code for,
MPX stuff, as the instructions are nops without hw support, are not
something one would multi-version a function for.
If anything, AVX512F and AVX512BW+VL might be good candidates for that, not
MPX.
Sorry, I didn't know that __builtin_cpu_supports was really only meant for 
user multi-versioning.  In that case, it won't make any sense to put the 
MPX stuff in there.
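
For reference, a direct cpuid check of the kind discussed here can be sketched as follows. This is only an illustration and not the MPX runtime's code: the helper names are hypothetical, it assumes a GCC-compatible compiler that ships <cpuid.h>, and it covers only the CPU feature bit (bit 14 of %ebx for leaf 7, subleaf 0, the same value as bit_MPX in the cpuid.h patch). The OS-enabled state bits that the runtime also checks are left out.

```cpp
#include <cassert>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

// Same value as bit_MPX from the cpuid.h patch: %ebx bit 14 of leaf 7.
constexpr unsigned mpx_feature_bit = 1u << 14;

// Hypothetical helper: test the MPX bit in an %ebx value from leaf 7.
constexpr bool ebx_reports_mpx (unsigned ebx)
{
  return (ebx & mpx_feature_bit) != 0;
}

// Hypothetical helper: query the running CPU; false on non-x86 hosts.
inline bool cpu_reports_mpx ()
{
#if defined(__x86_64__) || defined(__i386__)
  // Leaf 7 must be supported before it can be queried.
  if (__get_cpuid_max (0, nullptr) < 7)
    return false;
  unsigned eax, ebx, ecx, edx;
  __cpuid_count (7, 0, eax, ebx, ecx, edx);
  return ebx_reports_mpx (ebx);
#else
  return false;
#endif
}
```

Keeping the bit test separate from the cpuid call means the decoding can be exercised on any host, which is roughly why a plain mask in cpuid.h is enough for the runtime's purposes.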


Sorry for sending you down a wrong path Ilya.

jeff


[C++ PATCH] Allow void type as a literal type in C++14

2014-11-21 Thread Marek Polacek
I noticed that C++14 [basic.types] says that a type is a literal type
if it is: void, [...].  Yet our literal_type_p doesn't consider void
type as a literal type.  The following is an attempt to fix that along
with a testcase.  It seems that void was only added in C++14, so check
for cxx14 as well.
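
As a small illustration of what this permits (a sketch with hypothetical function names, separate from the testcase in the patch): with void accepted as a literal type, a constexpr function may return void and still be called during constant evaluation. It needs -std=c++14 or later.

```cpp
#include <cassert>

// A constexpr function returning void: rejected in C++11, valid in C++14.
constexpr void bump (int &x)
{
  ++x;
}

// Calls the void helper while being evaluated as a constant expression.
constexpr int bumped_twice (int x)
{
  bump (x);
  bump (x);
  return x;
}

// Checked by the compiler, so this fails to build if the rule is not applied.
static_assert (bumped_twice (1) == 3, "constexpr void helper is usable");
```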

Bootstrapped/regtested on ppc64-linux, ok for trunk?

2014-11-21  Marek Polacek  pola...@redhat.com

* constexpr.c (literal_type_p): Return true for void type in C++14.

* g++.dg/cpp0x/constexpr-function2.C: Limit dg-error to C++11.
* g++.dg/cpp0x/constexpr-neg1.C: Likewise.
* g++.dg/cpp1y/constexpr-void1.C: New test.

diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c
index 2678223..0a258cf 100644
--- gcc/cp/constexpr.c
+++ gcc/cp/constexpr.c
@@ -59,7 +59,8 @@ literal_type_p (tree t)
 {
   if (SCALAR_TYPE_P (t)
   || TREE_CODE (t) == VECTOR_TYPE
-  || TREE_CODE (t) == REFERENCE_TYPE)
+  || TREE_CODE (t) == REFERENCE_TYPE
+  || (VOID_TYPE_P (t) && cxx_dialect >= cxx14))
 return true;
   if (CLASS_TYPE_P (t))
 {
diff --git gcc/testsuite/g++.dg/cpp0x/constexpr-function2.C 
gcc/testsuite/g++.dg/cpp0x/constexpr-function2.C
index 8c51c9d..95ee244 100644
--- gcc/testsuite/g++.dg/cpp0x/constexpr-function2.C
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-function2.C
@@ -23,7 +23,7 @@ constexpr int area = squarei(side); // { dg-error 
side|argument }
 int next(constexpr int x) // { dg-error parameter }
 { return x + 1; }
 
-constexpr void f(int x)   // { dg-error "return type .void" }
+constexpr void f(int x)   // { dg-error "return type .void" "" { target 
c++11_only } }
 { /* ... */ }
 
 constexpr int prev(int x)
diff --git gcc/testsuite/g++.dg/cpp0x/constexpr-neg1.C 
gcc/testsuite/g++.dg/cpp0x/constexpr-neg1.C
index 35f5e8e..dfa1d6b 100644
--- gcc/testsuite/g++.dg/cpp0x/constexpr-neg1.C
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-neg1.C
@@ -29,7 +29,7 @@ int next(constexpr int x) {   // { dg-error parameter }
 extern constexpr int memsz;// { dg-error definition }
 
 // error: return type is void
-constexpr void f(int x)// { dg-error "void" }
+constexpr void f(int x)// { dg-error "void" "" { target 
c++11_only } }
 { /* ... */ }
 // error: use of decrement
 constexpr int prev(int x)
diff --git gcc/testsuite/g++.dg/cpp1y/constexpr-void1.C 
gcc/testsuite/g++.dg/cpp1y/constexpr-void1.C
index e69de29..10ef5bc 100644
--- gcc/testsuite/g++.dg/cpp1y/constexpr-void1.C
+++ gcc/testsuite/g++.dg/cpp1y/constexpr-void1.C
@@ -0,0 +1,13 @@
+// { dg-do compile { target c++14 } }
+
+struct S
+{
+  int i = 20;
+
+  constexpr void
+  foo (void)
+  {
+if (i  20)
+  __builtin_abort ();
+  }
+};

Marek


Re: [PATCH 4/4] OpenMP 4.0 offloading to Intel MIC: non-fallback testing

2014-11-21 Thread Bernd Edlinger
Aehm Kirill,

excuse me please, but if I do

autogen Makefile.def


I get this from svn diff

Index: Makefile.in
===
--- Makefile.in    (revision 217890)
+++ Makefile.in    (working copy)
@@ -35238,9 +35238,6 @@
 $(SHELL) $(srcdir)/mkinstalldirs $(TARGET_SUBDIR)/liboffloadmic ; \
 $(NORMAL_TARGET_EXPORTS)  \
 echo Configuring in $(TARGET_SUBDIR)/liboffloadmic; \
-     \
-    this_target=${target_alias}; \
-     \
 cd $(TARGET_SUBDIR)/liboffloadmic || exit 1; \
 case $(srcdir) in \
   /* | [A-Za-z]:[\\/]*) topdir=$(srcdir) ;; \
@@ -35248,14 +35245,12 @@
     sed -e 's,\./,,g' -e 's,[^/]*/,../,g' `$(srcdir) ;; \
 esac; \
 module_srcdir=liboffloadmic; \
-    srcdiroption=--srcdir=$${topdir}/liboffloadmic; \
-    libsrcdir=$$s/liboffloadmic; \
 rm -f no-such-file || : ; \
 CONFIG_SITE=no-such-file $(SHELL) \
   $$s/$$module_srcdir/configure \
   --srcdir=$${topdir}/$$module_srcdir \
   $(TARGET_CONFIGARGS) --build=${build_alias} --host=${target_alias} \
-      --target=$${this_target} $${srcdiroption} 
@extra_liboffloadmic_configure_flags@ \
+      --target=${target_alias} @extra_liboffloadmic_configure_flags@ \
   || exit 1
 @endif target-liboffloadmic
 

svn blame Makefile.in points to:

r217498 | kyukhin | 2014-11-13 15:03:17 +0100 (Thu, 13 Nov 2014) | 110 lines

[PATCH 2/4] OpenMP 4.0 offloading to Intel MIC: liboffloadmic.

    * Makefile.def: Add liboffloadmic to target_modules.  Make
    liboffloadmic depend on libgomp's configure, libstdc++ and libgcc.
    * Makefile.in: Regenerate.
    * configure: Regenerate.
    * configure.ac: Add liboffloadmic to target binaries.
    Restrict liboffloadmic for POSIX and i*86, and x86_64 architectures.
    Add liboffloadmic to noconfig list when C++ is not supported.


so, did you really regenerate Makefile.in in that patch, or am I missing 
something ?


Regards,
Bernd.
  

libgo patch committed: Use ppc64le for little-endian 64-bit PowerPC architecture

2014-11-21 Thread Ian Lance Taylor
This patch by Lynn A. Boger changes libgo to use ppc64le for
little-endian 64-bit PowerPC.  Bootstrapped and ran testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.
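
The selection logic in the configure.ac change below boils down to two questions: is the target 64-bit, and is it big-endian. A minimal sketch of that decision follows; the function name ppc_goarch is hypothetical, and the two booleans stand in for the preprocessor tests on _ARCH_PPC64 and on _BIG_ENDIAN / __BIG_ENDIAN__.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper mirroring the GOARCH choice for PowerPC targets.
// is_64bit:      corresponds to _ARCH_PPC64 being defined.
// is_big_endian: corresponds to _BIG_ENDIAN or __BIG_ENDIAN__ being defined.
inline std::string ppc_goarch (bool is_64bit, bool is_big_endian)
{
  if (!is_64bit)
    return "ppc";  // 32-bit PowerPC; endianness is not consulted
  return is_big_endian ? "ppc64" : "ppc64le";
}
```

Only the 64-bit case consults endianness, which matches the nested AC_COMPILE_IFELSE in the patch.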

Ian
diff -r 96de84075614 libgo/configure.ac
--- a/libgo/configure.ac	Tue Nov 18 09:28:24 2014 -0800
+++ b/libgo/configure.ac	Fri Nov 21 10:01:45 2014 -0800
@@ -194,6 +194,7 @@
 mips_abi=unknown
 is_ppc=no
 is_ppc64=no
+is_ppc64le=no
 is_s390=no
 is_s390x=no
 is_sparc=no
@@ -266,11 +267,18 @@
 #ifdef _ARCH_PPC64
 #error 64-bit
 #endif],
-[is_ppc=yes], [is_ppc64=yes])
+[is_ppc=yes],
+[AC_COMPILE_IFELSE([
+#if defined(_BIG_ENDIAN) || defined(__BIG_ENDIAN__)
+#error 64be
+#endif],
+[is_ppc64le=yes],[is_ppc64=yes])])
 if test $is_ppc = yes; then
   GOARCH=ppc
+elif test $is_ppc64 = yes; then
+  GOARCH=ppc64
 else
-  GOARCH=ppc64
+  GOARCH=ppc64le
 fi
 ;;
   s390*-*-*)
@@ -310,6 +318,7 @@
 AM_CONDITIONAL(LIBGO_IS_MIPSO64, test $mips_abi = o64)
 AM_CONDITIONAL(LIBGO_IS_PPC, test $is_ppc = yes)
 AM_CONDITIONAL(LIBGO_IS_PPC64, test $is_ppc64 = yes)
+AM_CONDITIONAL(LIBGO_IS_PPC64LE, test $is_ppc64le = yes)
 AM_CONDITIONAL(LIBGO_IS_S390, test $is_s390 = yes)
 AM_CONDITIONAL(LIBGO_IS_S390X, test $is_s390x = yes)
 AM_CONDITIONAL(LIBGO_IS_SPARC, test $is_sparc = yes)
diff -r 96de84075614 libgo/go/go/build/syslist.go
--- a/libgo/go/go/build/syslist.go  Tue Nov 18 09:28:24 2014 -0800
+++ b/libgo/go/go/build/syslist.go  Fri Nov 21 10:01:45 2014 -0800
@@ -5,4 +5,4 @@
 package build
 
 const goosList = "darwin dragonfly freebsd linux nacl netbsd openbsd plan9 
solaris windows "
-const goarchList = "386 amd64 amd64p32 arm arm64 alpha m68k mipso32 mipsn32 
mipsn64 mipso64 ppc ppc64 s390 s390x sparc sparc64 "
+const goarchList = "386 amd64 amd64p32 arm arm64 alpha m68k mipso32 mipsn32 
mipsn64 mipso64 ppc ppc64 ppc64le s390 s390x sparc sparc64 "
diff -r 96de84075614 libgo/testsuite/gotest
--- a/libgo/testsuite/gotest	Tue Nov 18 09:28:24 2014 -0800
+++ b/libgo/testsuite/gotest	Fri Nov 21 10:01:45 2014 -0800
@@ -379,7 +379,7 @@
 {
text=T
case $GOARCH in
-   ppc64) text=[TD] ;;
+   ppc64*) text=[TD] ;;
esac
 
symtogo='sed -e s/_test/XXXtest/ -e s/.*_\([^_]*\.\)/\1/ -e s/XXXtest/_test/'


Re: [PATCH] Set goarch to ppc64le where needed for gccgo testing

2014-11-21 Thread Ian Lance Taylor
On Wed, Nov 19, 2014 at 12:55 PM, Lynn A. Boger
labo...@linux.vnet.ibm.com wrote:
 Updated patch:

Thanks.  Committed.

Ian


 On 11/19/2014 09:01 AM, Lynn A. Boger wrote:

 Hi,

 This change goes along with the change to the GOARCH setting in gccgo for
 ppc64le which will be done in gofrontend.  The description for that change
 is here: https://groups.google.com/forum/#!topic/gofrontend-dev/ocEttrpsw-s

 This change has been bootstrapped and tested along with the above change
 to gofrontend on ppc, ppc64, and ppc64le.

 2014-11-19  Lynn Boger labo...@linux.vnet.ibm.com
 * gcc/testsuite/go.test/go-test.exp:  Add case for ppc64le goarch
 value for go testing


 Index: gcc/testsuite/go.test/go-test.exp
 ===
 --- gcc/testsuite/go.test/go-test.exp   (revision 217507)
 +++ gcc/testsuite/go.test/go-test.exp   (working copy)
 @@ -237,13 +237,15 @@ proc go-set-goarch { } {
 return 
 }
 }
 -   powerpc*-*-* {
 -   if [check_effective_target_ilp32] {
 -   set goarch ppc
 -   } else {
 -   set goarch ppc64
 -   }
 +   powerpc-*-* {
 +   set goarch ppc
 }
 +   powerpc64-*-* {
 +   set goarch ppc64
 +   }
 +   powerpc64le-*-* {
 +   set goarch ppc64le
 +   }
 s390-*-* {
 set goarch s390
 }






[PATCH 0/2, AArch64, v3] APM X-Gene 1 cost-table and pipeline model

2014-11-21 Thread Philipp Tomsich
The following patch-series adds optimized support for the APM X-Gene 1
by providing a cost-model and pipeline-model. The pipeline-model has a 
few long reservation-chains, but looking at the stats for the generated
NDA shows that it's well below other AArch64 cores (e.g. Cortex-A53) in
overall size.

This includes all the requested enhancements and cleans up the naming of
the various states and reservations in 'xgene1.md'.

Even though it isn't wired into the 32bit ARM backend yet, we've decided
to keep the machine-description in config/arm... after all, the X-Gene
family is backwards compatible with ARMv7 and our benchmarking has shown
good potential for performance improvements from improving the instruction
selection and scheduling when using ARMv7 code (after all, X-Gene 1 is a
4-way superscalar design).

After having a few further discussions with my colleagues regarding the
latencies and modelling of divides in the pipeline, we've readjusted the
modelling of the divides another time... even though it doesn't make a
difference in real-world benchmarks.

Thanks to everyone who took the time to review and comment.


Philipp Tomsich (2):
  Core definition for APM XGene-1 and associated cost-table.
  Pipeline model for APM XGene-1.

 gcc/ChangeLog|  14 +
 gcc/config/aarch64/aarch64-cores.def |   1 +
 gcc/config/aarch64/aarch64-tune.md   |   2 +-
 gcc/config/aarch64/aarch64.c |  62 
 gcc/config/aarch64/aarch64.md|   3 +-
 gcc/config/arm/aarch-cost-tables.h   | 101 +++
 gcc/config/arm/xgene1.md | 532 +++
 gcc/doc/invoke.texi  |   3 +-
 8 files changed, 715 insertions(+), 3 deletions(-)
 create mode 100644 gcc/config/arm/xgene1.md

-- 
1.9.1



[PATCH 1/2] Core definition for APM XGene-1 and associated cost-table.

2014-11-21 Thread Philipp Tomsich
To keep this change separately buildable from the pipeline model,
this patch directs the APM XGene-1 to use the generic scheduling
model.
---
 gcc/ChangeLog|   8 +++
 gcc/config/aarch64/aarch64-cores.def |   1 +
 gcc/config/aarch64/aarch64-tune.md   |   2 +-
 gcc/config/aarch64/aarch64.c |  62 +
 gcc/config/arm/aarch-cost-tables.h   | 101 +++
 gcc/doc/invoke.texi  |   3 +-
 6 files changed, 175 insertions(+), 2 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 2fa58ca..c9ac0d9 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2014-11-19  Philipp Tomsich  philipp.toms...@theobroma-systems.com
+
+   * config/aarch64/aarch64-cores.def (xgene1): Update/add the
+   xgene1 (APM XGene-1) core definition.
+   * gcc/config/aarch64/aarch64.c: Add cost tables for APM XGene-1
+   * config/arm/aarch-cost-tables.h: Add cost tables for APM XGene-1
+   * doc/invoke.texi: Document -mcpu=xgene1.
+
 2014-11-18  Maciej W. Rozycki  ma...@codesourcery.com
 
* config/mips/mips.md (compression): Add `micromips32' setting.
diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index 312941f..e553e50 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -37,6 +37,7 @@
 AARCH64_CORE("cortex-a53",  cortexa53, cortexa53, 8,  AARCH64_FL_FPSIMD | AARCH64_FL_CRC, cortexa53)
 AARCH64_CORE("cortex-a57",  cortexa15, cortexa15, 8,  AARCH64_FL_FPSIMD | AARCH64_FL_CRC, cortexa57)
 AARCH64_CORE("thunderx",    thunderx,  thunderx, 8,  AARCH64_FL_FPSIMD | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx)
+AARCH64_CORE("xgene1",      xgene1,    xgene1,    8,  AARCH64_FL_FPSIMD, xgene1)
 
 /* V8 big.LITTLE implementations.  */
 
diff --git a/gcc/config/aarch64/aarch64-tune.md b/gcc/config/aarch64/aarch64-tune.md
index c717ea8..6409082 100644
--- a/gcc/config/aarch64/aarch64-tune.md
+++ b/gcc/config/aarch64/aarch64-tune.md
@@ -1,5 +1,5 @@
 ;; -*- buffer-read-only: t -*-
 ;; Generated automatically by gentune.sh from aarch64-cores.def
 (define_attr "tune"
-	"cortexa53,cortexa15,thunderx,cortexa57cortexa53"
+	"cortexa53,cortexa15,thunderx,xgene1,cortexa57cortexa53"
 	(const (symbol_ref "((enum attr_tune) aarch64_tune)")))
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 4fec21e..9b92527 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -226,6 +226,27 @@ static const struct cpu_addrcost_table cortexa57_addrcost_table =
 #if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007
 __extension__
 #endif
+static const struct cpu_addrcost_table xgene1_addrcost_table =
+{
+#if HAVE_DESIGNATED_INITIALIZERS
+  .addr_scale_costs =
+#endif
+{
+  NAMED_PARAM (hi, 1),
+  NAMED_PARAM (si, 0),
+  NAMED_PARAM (di, 0),
+  NAMED_PARAM (ti, 1),
+},
+  NAMED_PARAM (pre_modify, 1),
+  NAMED_PARAM (post_modify, 0),
+  NAMED_PARAM (register_offset, 0),
+  NAMED_PARAM (register_extend, 1),
+  NAMED_PARAM (imm_offset, 0),
+};
+
+#if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007
+__extension__
+#endif
 static const struct cpu_regmove_cost generic_regmove_cost =
 {
   NAMED_PARAM (GP2GP, 1),
@@ -262,6 +283,17 @@ static const struct cpu_regmove_cost thunderx_regmove_cost =
   NAMED_PARAM (FP2FP, 4)
 };
 
+static const struct cpu_regmove_cost xgene1_regmove_cost =
+{
+  NAMED_PARAM (GP2GP, 1),
+  NAMED_PARAM (GP2FP, 8),
+  NAMED_PARAM (FP2GP, 8),
+  /* We currently do not provide direct support for TFmode Q-Q move.
+ Therefore we need to raise the cost above 2 in order to have
+ reload handle the situation.  */
+  NAMED_PARAM (FP2FP, 4)
+};
+
 /* Generic costs for vector insn classes.  */
 #if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007
 __extension__
@@ -302,6 +334,26 @@ static const struct cpu_vector_cost cortexa57_vector_cost =
   NAMED_PARAM (cond_not_taken_branch_cost, 1)
 };
 
+/* Generic costs for vector insn classes.  */
+#if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007
+__extension__
+#endif
+static const struct cpu_vector_cost xgene1_vector_cost =
+{
+  NAMED_PARAM (scalar_stmt_cost, 1),
+  NAMED_PARAM (scalar_load_cost, 5),
+  NAMED_PARAM (scalar_store_cost, 1),
+  NAMED_PARAM (vec_stmt_cost, 2),
+  NAMED_PARAM (vec_to_scalar_cost, 4),
+  NAMED_PARAM (scalar_to_vec_cost, 4),
+  NAMED_PARAM (vec_align_load_cost, 10),
+  NAMED_PARAM (vec_unalign_load_cost, 10),
+  NAMED_PARAM (vec_unalign_store_cost, 2),
+  NAMED_PARAM (vec_store_cost, 2),
+  NAMED_PARAM (cond_taken_branch_cost, 2),
+  NAMED_PARAM (cond_not_taken_branch_cost, 1)
+};
+
 #if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007
 __extension__
 #endif
@@ -345,6 +397,16 @@ static const struct tune_params thunderx_tunings =
   NAMED_PARAM (issue_rate, 2)
 };
 
+static const struct tune_params xgene1_tunings =
+{
+  &xgene1_extra_costs,
+  &xgene1_addrcost_table,
+  &xgene1_regmove_cost,

[PATCH 2/2] Pipeline model for APM XGene-1.

2014-11-21 Thread Philipp Tomsich
---
 gcc/ChangeLog |   6 +
 gcc/config/aarch64/aarch64.md |   3 +-
 gcc/config/arm/xgene1.md  | 532 ++
 3 files changed, 540 insertions(+), 1 deletion(-)
 create mode 100644 gcc/config/arm/xgene1.md

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index c9ac0d9..dad2278 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,11 @@
 2014-11-19  Philipp Tomsich  philipp.toms...@theobroma-systems.com
 
+   * config/aarch64/aarch64.md: Include xgene1.md.
+   (generic_sched): Set to no for xgene1.
+   * config/arm/xgene1.md: New file.
+
+2014-11-19  Philipp Tomsich  philipp.toms...@theobroma-systems.com
+
* config/aarch64/aarch64-cores.def (xgene1): Update/add the
xgene1 (APM XGene-1) core definition.
* gcc/config/aarch64/aarch64.c: Add cost tables for APM XGene-1
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 597ff8c..1b36384 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -191,7 +191,7 @@
 
 (define_attr "generic_sched" "yes,no"
   (const (if_then_else
-          (eq_attr "tune" "cortexa53,cortexa15,thunderx")
+          (eq_attr "tune" "cortexa53,cortexa15,thunderx,xgene1")
           (const_string "no")
           (const_string "yes"))))
 
@@ -199,6 +199,7 @@
 (include "../arm/cortex-a53.md")
 (include "../arm/cortex-a15.md")
 (include "thunderx.md")
+(include "../arm/xgene1.md")
 
 ;; ---
 ;; Jumps and other miscellaneous insns
diff --git a/gcc/config/arm/xgene1.md b/gcc/config/arm/xgene1.md
new file mode 100644
index 0000000..563a959
--- /dev/null
+++ b/gcc/config/arm/xgene1.md
@@ -0,0 +1,532 @@
+;; Machine description for AppliedMicro xgene1 core.
+;; Copyright (C) 2012-2014 Free Software Foundation, Inc.
+;; Contributed by Theobroma Systems Design und Consulting GmbH.
+;;See http://www.theobroma-systems.com for more info.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but
+;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+;; General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; http://www.gnu.org/licenses/.
+
+;; Pipeline description for the xgene1 micro-architecture
+
+(define_automaton "xgene1")
+
+(define_cpu_unit "xgene1_decode_out0" "xgene1")
+(define_cpu_unit "xgene1_decode_out1" "xgene1")
+(define_cpu_unit "xgene1_decode_out2" "xgene1")
+(define_cpu_unit "xgene1_decode_out3" "xgene1")
+
+(define_cpu_unit "xgene1_divide" "xgene1")
+(define_cpu_unit "xgene1_fp_divide" "xgene1")
+(define_cpu_unit "xgene1_fsu" "xgene1")
+(define_cpu_unit "xgene1_fcmp" "xgene1")
+
+(define_reservation "xgene1_decode1op"
+        "( xgene1_decode_out0 )
+        |( xgene1_decode_out1 )
+        |( xgene1_decode_out2 )
+        |( xgene1_decode_out3 )"
+)
+(define_reservation "xgene1_decode2op"
+        "( xgene1_decode_out0 + xgene1_decode_out1 )
+        |( xgene1_decode_out0 + xgene1_decode_out2 )
+        |( xgene1_decode_out0 + xgene1_decode_out3 )
+        |( xgene1_decode_out1 + xgene1_decode_out2 )
+        |( xgene1_decode_out1 + xgene1_decode_out3 )
+        |( xgene1_decode_out2 + xgene1_decode_out3 )"
+)
+(define_reservation "xgene1_decodeIsolated"
+        "( xgene1_decode_out0 + xgene1_decode_out1 + xgene1_decode_out2 + xgene1_decode_out3 )"
+)
+
+(define_insn_reservation "xgene1_branch" 1
+  (and (eq_attr "tune" "xgene1")
+       (eq_attr "type" "branch"))
+  "xgene1_decode1op")
+
+(define_insn_reservation "xgene1_nop" 1
+  (and (eq_attr "tune" "xgene1")
+       (eq_attr "type" "no_insn"))
+  "xgene1_decode1op")
+
+(define_insn_reservation "xgene1_call" 1
+  (and (eq_attr "tune" "xgene1")
+       (eq_attr "type" "call"))
+  "xgene1_decode2op")
+
+(define_insn_reservation "xgene1_f_load" 10
+  (and (eq_attr "tune" "xgene1")
+       (eq_attr "type" "f_loadd,f_loads"))
+  "xgene1_decode2op")
+
+(define_insn_reservation "xgene1_f_store" 4
+  (and (eq_attr "tune" "xgene1")
+       (eq_attr "type" "f_stored,f_stores"))
+  "xgene1_decode2op")
+
+(define_insn_reservation "xgene1_fmov" 2
+  (and (eq_attr "tune" "xgene1")
+       (eq_attr "type" "fmov,fconsts,fconstd"))
+  "xgene1_decode1op")
+
+(define_insn_reservation "xgene1_f_mcr" 10
+  (and (eq_attr "tune" "xgene1")
+       (eq_attr "type" "f_mcr"))
+  "xgene1_decodeIsolated")
+
+(define_insn_reservation "xgene1_f_mrc" 4
+  (and (eq_attr "tune" "xgene1")
+       (eq_attr "type" "f_mrc"))
+  "xgene1_decode2op")
+
+(define_insn_reservation "xgene1_load_pair" 6
+  (and (eq_attr "tune" "xgene1")
+       (eq_attr "type" "load2"))
+  "xgene1_decodeIsolated")
+
+(define_insn_reservation "xgene1_store_pair" 2
+  (and (eq_attr 

Re: [PATCH 4/4] OpenMP 4.0 offloading to Intel MIC: non-fallback testing

2014-11-21 Thread Ilya Verbin
Hi,

On 21 Nov 19:19, Bernd Edlinger wrote:
 so, did you really regenerate Makefile.in in that patch, or am I missing 
 something ?

You're right.  This patch was rebased so many times that we may have
forgotten to regenerate it before committing.

Do you plan to submit any patch for Makefile.in?
Or should I post this change separately for review? (with regtesting)

  -- Ilya


[RFC] First steps towards segregating types.

2014-11-21 Thread Andrew MacLeod
I've been trying to sort out how to proceed with the gimple_type work,
and the first step always comes back to figuring out all the places types
are used. This has turned out to be non-trivial and is difficult to do 
in an iterative way.  I believe I've found a reasonable way to proceed.


Over the next few months I plan to maintain a branch (tree-type) which 
leaves types still implemented as trees, and introduce 2 new typedefs 
and a few macros:


typedef union tree_node *tree_type_ptr;  // same as tree
typedef const union tree_node *const_tree_type_ptr;  // same as const_tree

I  will introduce their use throughout the compiler where types are 
needed.   This will tag all the type locations and still allow me to 
bootstrap and run tests to ensure things are still working.


Meanwhile, I'll also maintain another patchset which can be applied to
this branch and will switch those types to a completely separate type 
structure not connected to trees.   It changes all the TYPE_ accessor 
macros to be incompatible with trees. This causes compilation errors 
everywhere a type is referenced, passed, used, or whatever.   It is 
likely to pick up a few extra things along the way related to separating 
types that are not appropriate for the main branch.


I can then go through the source files fixing the compilation issues 
raised by adding tree_type_ptr where appropriate and modifying whatever 
else is required to deal with a segregated type (there is no shortage of 
those!).  These changes can then be applied to the main branch, and 
tested with a bootstrap/testrun/target-build cycle.  I'll also try to 
keep the branch relatively current with mainline.


Once the entire compiler has been processed, the next hunk of work would 
involve removing the types from the tree union and a multitude of 
related cleanups (I'm tracking a list) .  The 3 type structs would be 
replaced with a single type node and tree_type_ptr can be replaced with 
a pointer to the new type_node.  const_tree_type_ptr can also be 
replaced with a normal const version of the same pointer.. we will *not* 
be stuck with the const_tree paradigm.   It is just needed to enable 
compatibility with const_tree for now :-P


There are a few issues, of course :-)

The biggest issue is what to do with fields which can be either a type
or a tree, e.g. TREE_VALUE() of a TREE_LIST can be a type, as can a
TREE_VEC element or a DECL_CONTEXT.  I think the DECL_INITIAL field is
overloaded and can sometimes be a type, and this was recently introduced 
to TARGET_STATIC_CHAIN.  I suspect the compilation process will identify 
others.


Looking primarily at TREE_LIST first (which can be a mixed list of
trees and types), the question is how to generally handle this situation.


I have 2 workable approaches in mind, but am open to suggestions.

1 - introduce a TYPE_REF tree node, which is effectively just a 'typed' 
tree node, and the TREE_TYPE() field of a TYPE_REF node would point to 
the type node.  Any routines which utilize a TYPE node in a tree list 
would have to be modified to make use of this new TYPE_REF node to refer 
to the type.


2 - change the field (list-value in this case) to be a tagged union of 
{ tree tree_value, tree_type_ptr type_value } and use a bit in the base 
to flag which kind of value it is. This would be compatible with GTY, 
and would require changing routines and algorithms to check the bit and 
use the right field.


Option 2 also introduces a change in current practice.  TREE_VALUE() can 
be either an rvalue or an lvalue right now. This would no longer be 
possible and would require changing to a get_value(), set_value(), and 
value_ptr() model. There would be a tree variant and a type variant, 
along with asserts to make sure they are being used properly.  These 
algorithmic changes can also be fully tested on the main branch.  I've 
implemented this change, and it impacts 40 files which utilize 
TREE_VALUE  as an lvalue.  The upside of this is we at least have the 
illusion of more control.  I think the union could possibly be turned
into a macro or template to be generally applicable in other circumstances.
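
To make option 2 concrete, here is a rough, self-contained C sketch of a list node whose value is a tagged union with checked accessors. All struct and function names below are stand-ins invented for illustration; they are not the names used on the branch, and the real node would of course live inside the GTY-annotated tree structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for GCC's tree and the proposed segregated type node.  */
struct tree_node { int code; };
struct type_node { int precision; };

/* Hypothetical list node: one flag bit says which union member is live.  */
struct list_node
{
  bool value_is_type;
  union
  {
    struct tree_node *tree_value;
    struct type_node *type_value;
  } u;
  struct list_node *chain;
};

/* Tree variant of the accessors.  */
void
list_set_tree_value (struct list_node *n, struct tree_node *t)
{
  n->value_is_type = false;
  n->u.tree_value = t;
}

struct tree_node *
list_get_tree_value (const struct list_node *n)
{
  assert (!n->value_is_type);   /* catch callers reading the wrong member */
  return n->u.tree_value;
}

/* Type variant of the accessors.  */
void
list_set_type_value (struct list_node *n, struct type_node *t)
{
  n->value_is_type = true;
  n->u.type_value = t;
}

struct type_node *
list_get_type_value (const struct list_node *n)
{
  assert (n->value_is_type);
  return n->u.type_value;
}
```

This also shows why the lvalue use of TREE_VALUE() has to go: a plain assignment through a macro could not keep the flag bit in sync with the member that was written.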


I'm not 100% sure, but I think the TYPE_REF approach could continue with 
the current lvalue or rvalue approach, perhaps with some tweaking...   
All conjecture since I haven't prototyped it.  It also provides a 
general mechanism for referencing a type node in any tree circumstance.  
I have a feeling this is the easiest approach, and lends itself well to 
an initial implementation.  At the moment I'm leaning this way but 
I'm going to think about it over the weekend. Perhaps prototyping it 
next week will give me a stronger feeling one way or the other.


I also suspect it will be worth introducing a TYPE_VEC node which 
parallels the TREE_VEC, only giving us a list of types.  There may be 
places that a TREE_LIST is comprised entirely of types, and I'd consider 
trying to convert those to a TYPE_VEC.


I've attached 2 

RE: [PATCH 4/4] OpenMP 4.0 offloading to Intel MIC: non-fallback testing

2014-11-21 Thread Bernd Edlinger
Hi Ilya,

On Fri, 21 Nov 2014 21:44:40, Ilya Verbin wrote:

 Hi,

 On 21 Nov 19:19, Bernd Edlinger wrote:
 so, did you really regenerate Makefile.in in that patch, or am I missing 
 something ?

 You're right. This patch was rebased so many times that we may have
 forgotten to regenerate it before committing.

 Do you plan to submit any patch for Makefile.in?
 Or should I post this change separately for review? (with regtesting)

 -- Ilya


No, at least not immediately, so I would prefer if you go ahead with your patch 
ASAP.


Thanks,
Bernd.
  

[PATCH 1/2, PR 63814] Strengthen cgraph_edge_brings_value_p

2014-11-21 Thread Martin Jambor
Hi,

PR 63814 is caused by cgraph_edge_brings_value_p misidentifying an
edge to an expanded artificial thunk as an edge to the original node,
which then leads to crazy double-cloning and doubling the thunks along
the call.

This patch fixes the bug by strengthening the predicate so that it
knows where the value is supposed to go and can check that it goes
there and not anywhere else.  It also adds an extra availability check
that was probably missing in it.
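
As an illustration only, the strengthened destination check can be modelled outside of GCC like this. The `struct node` below is a hypothetical stand-in that folds the two relevant `ipa_node_params` fields into the node itself; it is a sketch of the idea, not the ipa-cp.c code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a call graph node plus its IPA-CP info.  */
struct node
{
  struct node *ipcp_orig_node;   /* node this one was cloned from, or NULL */
  bool is_all_contexts_clone;    /* clone valid for every calling context */
};

/* True if NODE is DEST itself or DEST's clone for all contexts.  */
bool
same_node_or_its_all_contexts_clone_p (const struct node *node,
                                       const struct node *dest)
{
  if (node == dest)
    return true;
  return node->is_all_contexts_clone && node->ipcp_orig_node == dest;
}
```

With this shape, an unrelated node (such as an expanded thunk) or a specialized clone of DEST is rejected, whereas the old predicate never compared against an intended destination at all.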

Bootstrapped and tested on x86_64-linux, and i686-linux.  OK for
trunk?

Thanks,

Martin


2014-11-20  Martin Jambor  mjam...@suse.cz

PR ipa/63814
* ipa-cp.c (same_node_or_its_all_contexts_clone_p): New function.
(cgraph_edge_brings_value_p): New parameter dest, use
same_node_or_its_all_contexts_clone_p and check availability.
(cgraph_edge_brings_value_p): Likewise.
(get_info_about_necessary_edges): New parameter dest, pass it to
cgraph_edge_brings_value_p.  Update caller.
(gather_edges_for_value): Likewise.
(perhaps_add_new_callers): Use cgraph_edge_brings_value_p to check
both the destination and availability.


Index: src/gcc/ipa-cp.c
===
--- src.orig/gcc/ipa-cp.c
+++ src/gcc/ipa-cp.c
@@ -2785,17 +2785,31 @@ get_clone_agg_value (struct cgraph_node
   return NULL_TREE;
 }
 
-/* Return true if edge CS does bring about the value described by SRC.  */
+/* Return true if NODE is DEST or its clone for all contexts.  */
 
 static bool
-cgraph_edge_brings_value_p (struct cgraph_edge *cs,
-                            ipcp_value_source<tree> *src)
+same_node_or_its_all_contexts_clone_p (cgraph_node *node, cgraph_node *dest)
+{
+  if (node == dest)
+    return true;
+
+  struct ipa_node_params *info = IPA_NODE_REF (node);
+  return info->is_all_contexts_clone && info->ipcp_orig_node == dest;
+}
+
+/* Return true if edge CS does bring about the value described by SRC to node
+   DEST or its clone for all contexts.  */
+
+static bool
+cgraph_edge_brings_value_p (cgraph_edge *cs, ipcp_value_source<tree> *src,
+                            cgraph_node *dest)
 {
   struct ipa_node_params *caller_info = IPA_NODE_REF (cs->caller);
-  cgraph_node *real_dest = cs->callee->function_symbol ();
-  struct ipa_node_params *dst_info = IPA_NODE_REF (real_dest);
+  enum availability availability;
+  cgraph_node *real_dest = cs->callee->function_symbol (&availability);
 
-  if ((dst_info->ipcp_orig_node && !dst_info->is_all_contexts_clone)
+  if (!same_node_or_its_all_contexts_clone_p (real_dest, dest)
+      || availability <= AVAIL_INTERPOSABLE
       || caller_info->node_dead)
     return false;
   if (!src->val)
@@ -2834,18 +2848,18 @@ cgraph_edge_brings_value_p (struct cgrap
 }
 }
 
-/* Return true if edge CS does bring about the value described by SRC.  */
+/* Return true if edge CS does bring about the value described by SRC to node
+   DEST or its clone for all contexts.  */
 
 static bool
-cgraph_edge_brings_value_p (struct cgraph_edge *cs,
-                            ipcp_value_source<ipa_polymorphic_call_context>
-                            *src)
+cgraph_edge_brings_value_p (cgraph_edge *cs,
+                            ipcp_value_source<ipa_polymorphic_call_context> *src,
+                            cgraph_node *dest)
 {
   struct ipa_node_params *caller_info = IPA_NODE_REF (cs->caller);
   cgraph_node *real_dest = cs->callee->function_symbol ();
-  struct ipa_node_params *dst_info = IPA_NODE_REF (real_dest);
 
-  if ((dst_info->ipcp_orig_node && !dst_info->is_all_contexts_clone)
+  if (!same_node_or_its_all_contexts_clone_p (real_dest, dest)
       || caller_info->node_dead)
     return false;
   if (!src->val)
@@ -2871,13 +2885,14 @@ get_next_cgraph_edge_clone (struct cgrap
   return next_edge_clone[cs->uid];
 }
 
-/* Given VAL, iterate over all its sources and if they still hold, add their
-   edge frequency and their number into *FREQUENCY and *CALLER_COUNT
-   respectively.  */
+/* Given VAL that is intended for DEST, iterate over all its sources and if
+   they still hold, add their edge frequency and their number into *FREQUENCY
+   and *CALLER_COUNT respectively.  */
 
 template <typename valtype>
 static bool
-get_info_about_necessary_edges (ipcp_value<valtype> *val, int *freq_sum,
+get_info_about_necessary_edges (ipcp_value<valtype> *val, cgraph_node *dest,
+                                int *freq_sum,
                                 gcov_type *count_sum, int *caller_count)
 {
   ipcp_value_source<valtype> *src;
@@ -2890,7 +2905,7 @@ get_info_about_necessary_edges (ipcp_val
       struct cgraph_edge *cs = src->cs;
       while (cs)
         {
-          if (cgraph_edge_brings_value_p (cs, src))
+          if (cgraph_edge_brings_value_p (cs, src, dest))
             {
               count++;
               freq += cs->frequency;
@@ -2907,12 +2922,13 @@ get_info_about_necessary_edges (ipcp_val
   return hot;
 }
 
-/* Return a vector of incoming edges that do 

RE: [PATCH] MIPS16/GCC: Optimise `__call_stub_fp_' call/return stubs

2014-11-21 Thread Moore, Catherine


 -Original Message-
 From: Rozycki, Maciej
 Sent: Wednesday, November 19, 2014 8:05 AM
 To: gcc-patches@gcc.gnu.org
 Cc: Moore, Catherine; Eric Christopher; Matthew Fortune
 Subject: [PATCH] MIPS16/GCC: Optimise `__call_stub_fp_' call/return stubs
 
 
 2014-11-19  Maciej W. Rozycki  ma...@codesourcery.com
 
   gcc/
   * config/mips/mips.c (mips16_build_call_stub): Move the save of
   the return address in $18 ahead of passing arguments to FPRs.
 
   Maciej

This looks OK.  Please commit.
 


Re: [PATCH, i386] Add new arg values for __builtin_cpu_supports

2014-11-21 Thread Ilya Enkovich
2014-11-21 20:45 GMT+03:00 Jeff Law l...@redhat.com:
 On 11/20/14 09:40, Jakub Jelinek wrote:

 On Thu, Nov 20, 2014 at 07:36:03PM +0300, Ilya Enkovich wrote:

 Hi,

 MPX runtime checks some feature bits in order to check MPX is fully
 supported.  Runtime does it by cpuid calls but there is a
 __builtin_cpu_supports which may be used for that.  Unfortunately
 currently it doesn't support required bits.  Will it be OK to add them
 for
 trunk?


 I think using cpuid for that is just fine.  __builtin_cpu_supports
 is for ISA additions users might actually want to version code for,
 MPX stuff, as the instructions are nops without hw support, are not
 something one would multi-version a function for.
 If anything, AVX512F and AVX512BW+VL might be good candidates for that,
 not
 MPX.

 Sorry, I didn't know that __builtin_cpu_supports was really only meant for
 user multi-versioning.  In that case, it won't make any sense to put the MPX
 stuff in there.

 Sorry for sending you down a wrong path Ilya.

It's OK, AVX guys will just transform this MPX patch into AVX512 one :)
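
For context, the multi-versioning use that `__builtin_cpu_supports` is aimed at looks roughly like the sketch below. The function name is made up for the example, "avx2" is used because it is a feature string the builtin already accepts, and the non-x86 fallback is an assumption added so the snippet compiles everywhere.

```c
/* Hypothetical dispatch helper: returns 1 if the AVX2 code path may
   be used on the running CPU, 0 otherwise.  */
int
host_has_avx2 (void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
  /* Harmless if the CPU model was already initialised; required when
     this is called from a constructor that runs very early.  */
  __builtin_cpu_init ();
  return __builtin_cpu_supports ("avx2") != 0;
#else
  return 0;   /* assumption: treat other targets as "not supported" */
#endif
}
```

A caller would pick between, say, a generic and an AVX2 implementation of the same routine based on the return value, which is exactly the kind of decision that makes no sense for MPX.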

Ilya


 jeff


[PATCH] Fix VRP handling of {ADD,SUB,MUL}_OVERFLOW (PR tree-optimization/64006)

2014-11-21 Thread Jakub Jelinek
Hi!

As discussed on IRC and in the PR, these internal calls are quite
unique for VRP in that they return _Complex integer result,
which VRP doesn't track, but then extract using REALPART_EXPR/IMAGPART_EXPR
the two results from that _Complex int and to generate good code
it is desirable to get proper ranges of those two results.
The problem is that right now this works only on the first VRP iteration,
the REALPART_EXPR/IMAGPART_EXPR statements are handled if their operand
is set by {ADD,SUB,MUL}_OVERFLOW.  If we iterate because a VR of one
of the internal call arguments changes, nothing in the propagator marks
the REALPART_EXPR/IMAGPART_EXPR statements for reconsideration.

The following patch handles this, by making the internal calls interesting
to the propagator and returning the right SSA_PROP_* for it (depending on
whether any of the value ranges of the REALPART_EXPR/IMAGPART_EXPR immediate
uses would change or not).
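
To show what kind of source this matters for, here is a small sketch using the user-level builtin. The function name and the [0, 100] bounds are invented for the example. As described above, the internal call yields a `_Complex` integer whose two parts are extracted via REALPART_EXPR/IMAGPART_EXPR (the arithmetic result and the overflow flag); once VRP knows both operands are in [0, 100], it can in principle derive [0, 200] for the sum and prove the flag is 0.

```c
#include <stdbool.h>

/* Hypothetical example: both inputs are range-checked first, so the
   overflow branch below is dead for any caller.  */
bool
small_sum (int a, int b, int *sum)
{
  if (a < 0 || a > 100 || b < 0 || b > 100)
    return false;
  /* The builtin returns true on overflow and always stores the
     (possibly wrapped) result through SUM.  */
  if (__builtin_add_overflow (a, b, sum))
    return false;
  return true;
}
```

Whether the dead branch is actually removed depends on the ranges being recomputed when the operands' ranges change, which is the situation the patch addresses.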

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-11-21  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/64006
* tree-vrp.c (stmt_interesting_for_vrp): Return true
for {ADD,SUB,MUL}_OVERFLOW internal calls.
(vrp_visit_assignment_or_call): For {ADD,SUB,MUL}_OVERFLOW
internal calls, check if any REALPART_EXPR/IMAGPART_EXPR
immediate uses would change their value ranges and return
SSA_PROP_INTERESTING if so, or SSA_PROP_NOT_INTERESTING
if there are some REALPART_EXPR/IMAGPART_EXPR immediate uses
interesting for vrp.

* gcc.c-torture/execute/pr64006.c: New test.

--- gcc/tree-vrp.c.jj   2014-11-21 10:17:05.0 +0100
+++ gcc/tree-vrp.c  2014-11-21 13:12:09.895013334 +0100
@@ -6949,6 +6949,20 @@ stmt_interesting_for_vrp (gimple stmt)
   (is_gimple_call (stmt)
  || !gimple_vuse (stmt)))
return true;
+  else if (is_gimple_call (stmt) && gimple_call_internal_p (stmt))
+   switch (gimple_call_internal_fn (stmt))
+ {
+ case IFN_ADD_OVERFLOW:
+ case IFN_SUB_OVERFLOW:
+ case IFN_MUL_OVERFLOW:
+   /* These internal calls return _Complex integer type,
+  but are interesting to VRP nevertheless.  */
+   if (lhs && TREE_CODE (lhs) == SSA_NAME)
+ return true;
+   break;
+ default:
+   break;
+ }
 }
   else if (gimple_code (stmt) == GIMPLE_COND
   || gimple_code (stmt) == GIMPLE_SWITCH)
@@ -7101,6 +7115,74 @@ vrp_visit_assignment_or_call (gimple stm
 
   return SSA_PROP_NOT_INTERESTING;
 }
+  else if (is_gimple_call (stmt) && gimple_call_internal_p (stmt))
+switch (gimple_call_internal_fn (stmt))
+  {
+  case IFN_ADD_OVERFLOW:
+  case IFN_SUB_OVERFLOW:
+  case IFN_MUL_OVERFLOW:
+   /* These internal calls return _Complex integer type,
+  which VRP does not track, but the immediate uses
+  thereof might be interesting.  */
+   if (lhs && TREE_CODE (lhs) == SSA_NAME)
+ {
+   imm_use_iterator iter;
+   use_operand_p use_p;
+   enum ssa_prop_result res = SSA_PROP_VARYING;
+
+   set_value_range_to_varying (get_value_range (lhs));
+
+   FOR_EACH_IMM_USE_FAST (use_p, iter, lhs)
+ {
+   gimple use_stmt = USE_STMT (use_p);
+   if (!is_gimple_assign (use_stmt))
+ continue;
+   enum tree_code rhs_code = gimple_assign_rhs_code (use_stmt);
+   if (rhs_code != REALPART_EXPR && rhs_code != IMAGPART_EXPR)
+ continue;
+   tree rhs1 = gimple_assign_rhs1 (use_stmt);
+   tree use_lhs = gimple_assign_lhs (use_stmt);
+   if (TREE_CODE (rhs1) != rhs_code
+   || TREE_OPERAND (rhs1, 0) != lhs
+   || TREE_CODE (use_lhs) != SSA_NAME
+   || !stmt_interesting_for_vrp (use_stmt)
+   || (!INTEGRAL_TYPE_P (TREE_TYPE (use_lhs))
+   || !TYPE_MIN_VALUE (TREE_TYPE (use_lhs))
+   || !TYPE_MAX_VALUE (TREE_TYPE (use_lhs))))
+ continue;
+
+   /* If there is a change in the value range for any of the
+  REALPART_EXPR/IMAGPART_EXPR immediate uses, return
+  SSA_PROP_INTERESTING.  If there are any REALPART_EXPR
+  or IMAGPART_EXPR immediate uses, but none of them have
+  a change in their value ranges, return
+  SSA_PROP_NOT_INTERESTING.  If there are no
+  {REAL,IMAG}PART_EXPR uses at all,
+  return SSA_PROP_VARYING.  */
+   value_range_t new_vr = VR_INITIALIZER;
+   extract_range_basic (&new_vr, use_stmt);
+   value_range_t *old_vr = get_value_range (use_lhs);
+   if (old_vr->type != new_vr.type
+   || !vrp_operand_equal_p (old_vr->min, new_vr.min)
+

[PATCH] Fix up __builtin_*_overflow expansion on some targets (PR target/63848)

2014-11-21 Thread Jakub Jelinek
Hi!

Apparently, emit_cmp_and_jump_insns can silently generate wrong code
for wider modes on some targets, so this patch changes all those calls in
internal-fn.c to do_compare_rtx_and_jump, which is a wrapper around
emit_cmp_and_jump_insns that should handle the wider mode comparison
expansion.  Unfortunately, the order of arguments is different :(.
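
As a reminder of why the GEU/LEU checks have to be unsigned comparisons, here is a plain C sketch of the same overflow tests for unsigned addition and subtraction. The function names are made up for the example; the comparisons mirror the "result GEU operand" and "result LEU operand" conditions used for the no-overflow path.

```c
#include <stdbool.h>
#include <limits.h>

/* Unsigned add: no overflow iff the wrapped result is >= an operand.  */
bool
uadd_overflows (unsigned a, unsigned b, unsigned *res)
{
  *res = a + b;          /* wraps modulo 2^N, well defined for unsigned */
  return *res < b;       /* unsigned comparison, i.e. !(res GEU b) */
}

/* Unsigned sub: no overflow iff the wrapped result is <= the minuend.  */
bool
usub_overflows (unsigned a, unsigned b, unsigned *res)
{
  *res = a - b;
  return *res > a;       /* unsigned comparison, i.e. !(res LEU a) */
}
```

If either comparison were done as signed, results with the top bit set would compare as negative and the test would give the wrong answer.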

No new testcases provided, the existing testsuite exhibited this on various
targets.

Bootstrapped/regtested on x86_64-linux and i686-linux, tested on the
testcases for ia64 and Uros tested the testcases on Alpha (in both cases
they previously failed), ok for trunk?

2014-11-21  Jakub Jelinek  ja...@redhat.com

PR target/63848
PR target/63975
* internal-fn.c (expand_arith_overflow_result_store,
expand_addsub_overflow, expand_neg_overflow, expand_mul_overflow): Use
do_compare_rtx_and_jump instead of emit_cmp_and_jump_insns everywhere,
adjust arguments to those functions.  Use unsignedp = true for
EQ, NE, GEU, LEU, LTU and GTU comparisons.

--- gcc/internal-fn.c.jj2014-11-19 18:48:02.0 +0100
+++ gcc/internal-fn.c   2014-11-21 17:34:00.634621461 +0100
@@ -386,8 +386,8 @@ expand_arith_overflow_result_store (tree
   int uns = TYPE_UNSIGNED (TREE_TYPE (TREE_TYPE (lhs)));
   lres = convert_modes (tgtmode, mode, res, uns);
   gcc_assert (GET_MODE_PRECISION (tgtmode) < GET_MODE_PRECISION (mode));
-  emit_cmp_and_jump_insns (res, convert_modes (mode, tgtmode, lres, uns),
-  EQ, NULL_RTX, mode, false, done_label,
+  do_compare_rtx_and_jump (res, convert_modes (mode, tgtmode, lres, uns),
+  EQ, true, mode, NULL_RTX, NULL_RTX, done_label,
   PROB_VERY_LIKELY);
   write_complex_part (target, const1_rtx, true);
   emit_label (done_label);
@@ -533,8 +533,8 @@ expand_addsub_overflow (location_t loc,
  ? (CONST_SCALAR_INT_P (op0) && REG_P (op1))
  : CONST_SCALAR_INT_P (op1)))
tem = op1;
-  emit_cmp_and_jump_insns (res, tem, code == PLUS_EXPR ? GEU : LEU,
-  NULL_RTX, mode, false, done_label,
+  do_compare_rtx_and_jump (res, tem, code == PLUS_EXPR ? GEU : LEU,
+  true, mode, NULL_RTX, NULL_RTX, done_label,
   PROB_VERY_LIKELY);
   goto do_error_label;
 }
@@ -549,7 +549,7 @@ expand_addsub_overflow (location_t loc,
   rtx tem = expand_binop (mode, add_optab,
  code == PLUS_EXPR ? res : op0, sgn,
  NULL_RTX, false, OPTAB_LIB_WIDEN);
-  emit_cmp_and_jump_insns (tem, op1, GEU, NULL_RTX, mode, false,
+  do_compare_rtx_and_jump (tem, op1, GEU, true, mode, NULL_RTX, NULL_RTX,
   done_label, PROB_VERY_LIKELY);
   goto do_error_label;
 }
@@ -591,9 +591,9 @@ expand_addsub_overflow (location_t loc,
emit_jump (do_error);
   else if (pos_neg == 3)
/* If ARG0 is not known to be always positive, check at runtime.  */
-   emit_cmp_and_jump_insns (op0, const0_rtx, LT, NULL_RTX, mode, false,
-do_error, PROB_VERY_UNLIKELY);
-  emit_cmp_and_jump_insns (op1, op0, LEU, NULL_RTX, mode, false,
+   do_compare_rtx_and_jump (op0, const0_rtx, LT, false, mode, NULL_RTX,
+NULL_RTX, do_error, PROB_VERY_UNLIKELY);
+  do_compare_rtx_and_jump (op1, op0, LEU, true, mode, NULL_RTX, NULL_RTX,
   done_label, PROB_VERY_LIKELY);
   goto do_error_label;
 }
@@ -607,7 +607,7 @@ expand_addsub_overflow (location_t loc,
  OPTAB_LIB_WIDEN);
   rtx tem = expand_binop (mode, add_optab, op1, sgn, NULL_RTX, false,
  OPTAB_LIB_WIDEN);
-  emit_cmp_and_jump_insns (op0, tem, LTU, NULL_RTX, mode, false,
+  do_compare_rtx_and_jump (op0, tem, LTU, true, mode, NULL_RTX, NULL_RTX,
   done_label, PROB_VERY_LIKELY);
   goto do_error_label;
 }
@@ -619,8 +619,8 @@ expand_addsub_overflow (location_t loc,
 unsigned.  */
   res = expand_binop (mode, add_optab, op0, op1, NULL_RTX, false,
  OPTAB_LIB_WIDEN);
-  emit_cmp_and_jump_insns (res, const0_rtx, LT, NULL_RTX, mode, false,
-  do_error, PROB_VERY_UNLIKELY);
+  do_compare_rtx_and_jump (res, const0_rtx, LT, false, mode, NULL_RTX,
+  NULL_RTX, do_error, PROB_VERY_UNLIKELY);
   rtx tem = op1;
   /* The operation is commutative, so we can pick operand to compare
 against.  For prec <= BITS_PER_WORD, I think preferring REG operand
@@ -633,7 +633,7 @@ expand_addsub_overflow (location_t loc,
  ? (CONST_SCALAR_INT_P (op1) && REG_P (op0))
  : CONST_SCALAR_INT_P (op0))
tem = op0;
-  emit_cmp_and_jump_insns (res, tem, GEU, 

Re: [PATCH 4/4] OpenMP 4.0 offloading to Intel MIC: non-fallback testing

2014-11-21 Thread Ilya Verbin
Hi Jakub!

 On Fri, 21 Nov 2014 21:44:40, Ilya Verbin wrote:
  You're right. This patch was rebased so many times, that we may forget to
  regenerate it before committing.

Build with liboffloadmic passed.  OK for trunk?

  -- Ilya


* Makefile.in: Regenerate.


diff --git a/Makefile.in b/Makefile.in
index f1ff972..0bae570 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -35238,9 +35238,6 @@ configure-target-liboffloadmic:
$(SHELL) $(srcdir)/mkinstalldirs $(TARGET_SUBDIR)/liboffloadmic ; \
$(NORMAL_TARGET_EXPORTS)  \
echo Configuring in $(TARGET_SUBDIR)/liboffloadmic; \
-\
-   this_target=${target_alias}; \
-\
cd $(TARGET_SUBDIR)/liboffloadmic || exit 1; \
case $(srcdir) in \
  /* | [A-Za-z]:[\\/]*) topdir=$(srcdir) ;; \
@@ -35248,14 +35245,12 @@ configure-target-liboffloadmic:
sed -e 's,\./,,g' -e 's,[^/]*/,../,g' `$(srcdir) ;; \
esac; \
module_srcdir=liboffloadmic; \
-   srcdiroption=--srcdir=$${topdir}/liboffloadmic; \
-   libsrcdir=$$s/liboffloadmic; \
rm -f no-such-file || : ; \
CONFIG_SITE=no-such-file $(SHELL) \
  $$s/$$module_srcdir/configure \
  --srcdir=$${topdir}/$$module_srcdir \
  $(TARGET_CONFIGARGS) --build=${build_alias} --host=${target_alias} \
- --target=$${this_target} $${srcdiroption} @extra_liboffloadmic_configure_flags@ \
+ --target=${target_alias} @extra_liboffloadmic_configure_flags@ \
  || exit 1
 @endif target-liboffloadmic
 


[PATCH 2/2, PR 63814] Do not re-create expanded artificial thunks

2014-11-21 Thread Martin Jambor
Hi,

when debugging PR 63814 I noticed that when cgraph_node::create_clone
was using redirect_edge_duplicating_thunks to redirect two edges to a
thunk of a clone, two thunks were created, one for each edge.  The
reason is that even though duplicate_thunk_for_node attempts to locate
an already created thunk, it does so by looking for a caller with
thunk.thunk_p set and the previously created one does not have it set
because (on i686) expand_thunk has expanded the thunk to gimple and
cleared the flag.

This patch fixes the issue by marking such expanded thunks with yet
another flag and then uses the flag to identify such expanded thunks.
Bootstrapped and tested on x86_64-linux and i686-linux.  Honza, do you
think this is a good approach?  Is the patch OK for trunk?
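
For illustration only (not part of the patch): a simplified model of the lookup described above. The struct layouts and the function name find_reusable_thunk are hypothetical stand-ins for the cgraph data structures; only the idea of accepting either flag is taken from the patch.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-in for cgraph_thunk_info.  */
struct thunk_info
{
  long fixed_offset;
  unsigned this_adjusting : 1;
  unsigned thunk_p : 1;           /* still an unexpanded thunk */
  unsigned expanded_thunk_p : 1;  /* was a thunk, already expanded */
};

/* Hypothetical stand-in for a caller node in the list of callers.  */
struct node
{
  struct thunk_info thunk;
  struct node *next_caller;
};

/* Return an existing caller that is (or used to be) a thunk with the
   same adjustment as WANT, or NULL if a new one must be created.  */
static struct node *
find_reusable_thunk (struct node *callers, const struct thunk_info *want)
{
  for (struct node *n = callers; n; n = n->next_caller)
    if ((n->thunk.thunk_p || n->thunk.expanded_thunk_p)
        && n->thunk.this_adjusting == want->this_adjusting
        && n->thunk.fixed_offset == want->fixed_offset)
      return n;
  return NULL;
}
```

In this model, checking thunk_p alone would miss a node whose flag was cleared at expansion time, so a second, duplicate thunk would be created for the next edge.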

Thanks,

Martin


2014-11-21  Martin Jambor  mjam...@suse.cz

* cgraph.h (cgraph_thunk_info): Converted thunk_p to a bit-field.
Added new flag expanded_thunk_p.
* cgraphunit.c (expand_thunk): Set expanded_thunk_p when appropriate.
* cgraphclones.c (duplicate_thunk_for_node): Also re-use an expanded
thunk if available.

Index: src/gcc/cgraph.h
===
--- src.orig/gcc/cgraph.h
+++ src/gcc/cgraph.h
@@ -552,7 +552,9 @@ struct GTY(()) cgraph_thunk_info {
   bool virtual_offset_p;
   bool add_pointer_bounds_args;
   /* Set to true when alias node is thunk.  */
-  bool thunk_p;
+  unsigned thunk_p : 1;
+  /* Set when this is an already expanded thunk.  */
+  unsigned expanded_thunk_p : 1;
 };
 
 /* Information about the function collected locally.
Index: src/gcc/cgraphclones.c
===
--- src.orig/gcc/cgraphclones.c
+++ src/gcc/cgraphclones.c
@@ -311,7 +311,7 @@ duplicate_thunk_for_node (cgraph_node *t
 
   cgraph_edge *cs;
   for (cs = node->callers; cs; cs = cs->next_caller)
-    if (cs->caller->thunk.thunk_p
+    if ((cs->caller->thunk.thunk_p || cs->caller->thunk.expanded_thunk_p)
         && cs->caller->thunk.this_adjusting == thunk->thunk.this_adjusting
         && cs->caller->thunk.fixed_offset == thunk->thunk.fixed_offset
         && cs->caller->thunk.virtual_offset_p == thunk->thunk.virtual_offset_p
Index: src/gcc/cgraphunit.c
===
--- src.orig/gcc/cgraphunit.c
+++ src/gcc/cgraphunit.c
@@ -1504,6 +1504,7 @@ cgraph_node::expand_thunk (bool output_a
   set_cfun (NULL);
   TREE_ASM_WRITTEN (thunk_fndecl) = 1;
   thunk.thunk_p = false;
+  thunk.expanded_thunk_p = true;
   analyzed = false;
 }
   else
@@ -1686,6 +1687,7 @@ cgraph_node::expand_thunk (bool output_a
   /* Since we want to emit the thunk, we explicitly mark its name as
 referenced.  */
   thunk.thunk_p = false;
+  thunk.expanded_thunk_p = true;
   lowered = true;
   bitmap_obstack_release (NULL);
 }


Re: [PATCH 4/4] OpenMP 4.0 offloading to Intel MIC: non-fallback testing

2014-11-21 Thread Jakub Jelinek
On Fri, Nov 21, 2014 at 10:14:21PM +0300, Ilya Verbin wrote:
  On Fri, 21 Nov 2014 21:44:40, Ilya Verbin wrote:
   You're right. This patch was rebased so many times, that we may forget to
   regenerate it before committing.
 
 Build with liboffloadmic passed.  OK for trunk?
 
   -- Ilya
 
 
   * Makefile.in: Regenerate.

Ok.
 --- a/Makefile.in
 +++ b/Makefile.in
 @@ -35238,9 +35238,6 @@ configure-target-liboffloadmic:
   $(SHELL) $(srcdir)/mkinstalldirs $(TARGET_SUBDIR)/liboffloadmic ; \
   $(NORMAL_TARGET_EXPORTS)  \
   echo Configuring in $(TARGET_SUBDIR)/liboffloadmic; \
 -  \
 - this_target=${target_alias}; \
 -  \
   cd $(TARGET_SUBDIR)/liboffloadmic || exit 1; \
   case $(srcdir) in \
 /* | [A-Za-z]:[\\/]*) topdir=$(srcdir) ;; \
 @@ -35248,14 +35245,12 @@ configure-target-liboffloadmic:
   sed -e 's,\./,,g' -e 's,[^/]*/,../,g' `$(srcdir) ;; \
   esac; \
   module_srcdir=liboffloadmic; \
 - srcdiroption=--srcdir=$${topdir}/liboffloadmic; \
 - libsrcdir=$$s/liboffloadmic; \
   rm -f no-such-file || : ; \
   CONFIG_SITE=no-such-file $(SHELL) \
 $$s/$$module_srcdir/configure \
 --srcdir=$${topdir}/$$module_srcdir \
 $(TARGET_CONFIGARGS) --build=${build_alias} --host=${target_alias} \
 -   --target=$${this_target} $${srcdiroption} 
 @extra_liboffloadmic_configure_flags@ \
 +   --target=${target_alias} @extra_liboffloadmic_configure_flags@ \
 || exit 1
  @endif target-liboffloadmic
  

Jakub


Re: [PATCH] PR lto/63968: 175.vpr from cpu2000 fails to build with LTO

2014-11-21 Thread Jan Hubicka
 Can you verify that the implementation is correct? I tend to remember that I
 introduced the lazy incrementation to inliner both for performance and
 correctness reasons. I used to get odd orders when keys were increased.
 
 Honza
 
 Hello.
 
 What kind of correctness do you mean? Old implementation didn't
 support increment operation and the fact was hushed up.

I see, your patch actually implements the variant of the busy (and thus
suboptimal) method of increasing a key by a combination of removal and
insertion.  I guess O(log n) is good enough for everything except for the
inliner, which does the lazy increases instead.  Doing lazy increases probably
means storing a pair of keys per node, which is wasteful, so the patch is OK
as it is.

Honza
 
 Martin


Re: [RFC] First steps towards segregating types.

2014-11-21 Thread Diego Novillo
On Fri, Nov 21, 2014 at 1:48 PM, Andrew MacLeod amacl...@redhat.com wrote:

 1 - introduce a TYPE_REF tree node, which is effectively just a 'typed' tree
 node, and the TREE_TYPE() field of a TYPE_REF node would point to the type
 node.  Any routines which utilize a TYPE node in a tree list would have to
 be modified to make use of this new TYPE_REF node to refer to the type.

 2 - change the field (list-value in this case) to be a tagged union of {
 tree tree_value, tree_type_ptr type_value } and use a bit in the base to
 flag which kind of value it is. This would be compatible with GTY, and would
 require changing routines and algorithms to check the bit and use the right
 field.

Seems to me that option 2 would also help against code that blindly
looks at TREE_VALUE and assumes it to be a tree. Wouldn't that make
initial implementation a bit more challenging?

Option 1 does seem easier, but I kind of like the forcing of rvalues
that option 2 provides.

Also liking option 1. The final change to the final type should be
simpler that way.


Diego.


[PATCH, PR 63551] Use proper type in evaluate_conditions_for_known_args

2014-11-21 Thread Martin Jambor
Hi,

the testcase of PR 63551 passes a union between a signed and an
unsigned integer between two functions as a parameter.  The caller
initializes to an unsigned integer with the highest order bit set, the
callee loads the data through the signed field and compares with zero.
evaluate_conditions_for_known_args then wrongly evaluated the
condition in these circumstances, which later on led to insertion of
builtin_unreachable and miscompilation.

Fixed by fold_converting the known value first.  I use the type of the
value in the condition which should do exactly the right thing because
the value is taken from the corresponding gimple_cond statement in
which types must match.
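
For illustration only (not part of the patch): a small model of the problem in plain C. The function names are hypothetical; they only mimic evaluating the callee's condition on the known constant with and without first converting it to the type used in the condition. The cast to int32_t relies on GCC's wrapping behaviour for out-of-range conversions.

```c
#include <stdbool.h>
#include <stdint.h>

union U
{
  uint32_t f0;
  int32_t f1;
};

/* What the callee really computes: a load through the signed member.  */
static bool
callee_condition (union U p)
{
  return p.f1 <= 0;
}

/* Hypothetical model of the wrong evaluation: the known value keeps
   the caller's unsigned type, so the comparison is done unsigned.  */
static bool
evaluate_without_conversion (uint32_t known)
{
  return known <= 0u;
}

/* Hypothetical model of the fixed evaluation: convert the known value
   to the signed type of the condition before comparing.  */
static bool
evaluate_with_conversion (uint32_t known)
{
  return (int32_t) known <= 0;
}
```

For the value 4294967286 used in the testcase, the callee's condition is true (the signed member reads as -10), the unconverted evaluation claims it is false, and the converted evaluation agrees with the callee.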

Bootstrapped and tested on x86_64-linux.  OK for trunk?

Thanks,

Martin


2014-11-21  Martin Jambor  mjam...@suse.cz

PR ipa/63551
* ipa-inline-analysis.c (evaluate_conditions_for_known_args): Convert
value of the argument to the type of the value in the condition.

testsuite/
* gcc.dg/ipa/pr63551.c: New test.


Index: src/gcc/ipa-inline-analysis.c
===
--- src.orig/gcc/ipa-inline-analysis.c
+++ src/gcc/ipa-inline-analysis.c
@@ -880,6 +880,7 @@ evaluate_conditions_for_known_args (stru
}
   if (c->code == IS_NOT_CONSTANT || c->code == CHANGED)
 continue;
+  val = fold_convert (TREE_TYPE (c->val), val);
   res = fold_binary_to_constant (c->code, boolean_type_node, val, c->val);
   if (res && integer_zerop (res))
continue;
Index: src/gcc/testsuite/gcc.dg/ipa/pr63551.c
===
--- /dev/null
+++ src/gcc/testsuite/gcc.dg/ipa/pr63551.c
@@ -0,0 +1,33 @@
+/* { dg-do run } */
+/* { dg-options "-Os" } */
+
+union U
+{
+  unsigned int f0;
+  int f1;
+};
+
+int a, d;
+
+void
+fn1 (union U p)
+{
+  if (p.f1 <= 0)
+if (a)
+  d = 0;
+}
+
+void
+fn2 ()
+{
+  d = 0;
+  union U b = { 4294967286 };
+  fn1 (b);
+}
+
+int
+main ()
+{
+  fn2 ();
+  return 0;
+}


Re: [RFC] First steps towards segregating types.

2014-11-21 Thread Richard Biener
On November 21, 2014 8:45:09 PM CET, Diego Novillo dnovi...@google.com wrote:
On Fri, Nov 21, 2014 at 1:48 PM, Andrew MacLeod amacl...@redhat.com
wrote:

 1 - introduce a TYPE_REF tree node, which is effectively just a
'typed' tree
 node, and the TREE_TYPE() field of a TYPE_REF node would point to the
type
 node.  Any routines which utilize a TYPE node in a tree list would
have to
 be modified to make use of this new TYPE_REF node to refer to the
type.

 2 - change the field (list-value in this case) to be a tagged union
of {
 tree tree_value, tree_type_ptr type_value } and use a bit in the base
to
 flag which kind of value it is. This would be compatible with GTY,
and would
 require changing routines and algorithms to check the bit and use the
right
 field.

Seems to me that option 2 would also help against code that blindly
looks at TREE_VALUE and assumes it to be a tree. Wouldn't that make
initial implementation a bit more challenging?

Option 1 does seem easier, but I kind of like the forcing of rvalues
that option 2 provides.

Also liking option 1. The final change to the final type should be
simpler that way.

I don't like either :). It seems you are concerned about uses from trees. An 
intermediate step here that would be useful is doing what David did for RTL 
insns and now gimple - expose tree_type as static type but keep tree as its 
base.

Thus make references to trees that are always types use tree_type * while
keeping those that can refer to types and something else refer to tree.

That's something that would not be completely artificial at this point.

Richard.


Diego.




Re: [RFC] First steps towards segregating types.

2014-11-21 Thread Andrew MacLeod

On 11/21/2014 02:45 PM, Diego Novillo wrote:

On Fri, Nov 21, 2014 at 1:48 PM, Andrew MacLeod amacl...@redhat.com wrote:


1 - introduce a TYPE_REF tree node, which is effectively just a 'typed' tree
node, and the TREE_TYPE() field of a TYPE_REF node would point to the type
node.  Any routines which utilize a TYPE node in a tree list would have to
be modified to make use of this new TYPE_REF node to refer to the type.

2 - change the field (list-value in this case) to be a tagged union of {
tree tree_value, tree_type_ptr type_value } and use a bit in the base to
flag which kind of value it is. This would be compatible with GTY, and would
require changing routines and algorithms to check the bit and use the right
field.

Seems to me that option 2 would also help against code that blindly
looks at TREE_VALUE and assumes it to be a tree. Wouldn't that make
initial implementation a bit more challenging?

The opposite, I think...  option 2 requires compile-time correctness.
In order to get option 1 right, anywhere there is a TYPE_REF we're
going to have to change it to look through the TYPE_REF to get the
type.  If we don't, then I'll probably end up with run-time errors in the
main branch when TREE_VALUE() gets an unexpected TYPE_REF, or something
like that.  I'm also somewhat concerned about places which use it as an
LVALUE and write the wrong sort of thing back in.  Won't catch that
until runtime either.  I'll know more next week about that aspect with
some practical implementation.


Maybe even a combination... change to the get,set,ptr model for
accessors, AND then use TYPE_REF... although at the same time it'd be
nice to ditch the TREE_VALUE_PTR variations... that's virtually the same
thing as an LVALUE :-P.  That is non-trivial however.  Perhaps that's a
bad idea :-)




Option 1 does seem easier, but I kind of like the forcing of rvalues
that option 2 provides.

Also liking option 1. The final change to the final type should be
simpler that way.



It's also relatively easy to change individual cases from option 2 to
option 1 down the road.  Vice versa is not true :-)


Andrew


Re: [PATCH, PR 63551] Use proper type in evaluate_conditions_for_known_args

2014-11-21 Thread Martin Jambor
On Fri, Nov 21, 2014 at 09:07:50PM +0100, Martin Jambor wrote:
 Hi,
 
 the testcase of PR 63551 passes a union between a signed and an
 unsigned integer between two functions as a parameter.  The caller
 initializes to an unsigned integer with the highest order bit set, the
 callee loads the data through the signed field and compares with zero.
 evaluate_conditions_for_known_args then wrongly evaluated the
 condition in these circumstances, which later on lead to insertion of
 builtin_unreachable and miscompilation.
 
 Fixed by fold_converting the known value first.  I use the type of the
 value in the condition which should do exactly the right thing because
 the value is taken from the corresponding gimple_cond statement in
 which types must match.
 
 Bootstrapped and tested on x86_64-linux.  OK for trunk?

I forgot, this is also a 4.9 bug and I have bootstrapped and tested it
on top of the 4.9 branch as well.  So OK for trunk and the 4.9 branch?

Thanks,

Martin

 
 2014-11-21  Martin Jambor  mjam...@suse.cz
 
   PR ipa/63551
   * ipa-inline-analysis.c (evaluate_conditions_for_known_args): Convert
   value of the argument to the type of the value in the condition.
 
 testsuite/
   * gcc.dg/ipa/pr63551.c: New test.
 
 
 Index: src/gcc/ipa-inline-analysis.c
 ===
 --- src.orig/gcc/ipa-inline-analysis.c
 +++ src/gcc/ipa-inline-analysis.c
 @@ -880,6 +880,7 @@ evaluate_conditions_for_known_args (stru
   }
if (c->code == IS_NOT_CONSTANT || c->code == CHANGED)
   continue;
 +  val = fold_convert (TREE_TYPE (c->val), val);
res = fold_binary_to_constant (c->code, boolean_type_node, val, 
 c->val);
if (res && integer_zerop (res))
   continue;
 Index: src/gcc/testsuite/gcc.dg/ipa/pr63551.c
 ===
 --- /dev/null
 +++ src/gcc/testsuite/gcc.dg/ipa/pr63551.c
 @@ -0,0 +1,33 @@
 +/* { dg-do run } */
 +/* { dg-options "-Os" } */
 +
 +union U
 +{
 +  unsigned int f0;
 +  int f1;
 +};
 +
 +int a, d;
 +
 +void
 +fn1 (union U p)
 +{
 +  if (p.f1 <= 0)
 +if (a)
 +  d = 0;
 +}
 +
 +void
 +fn2 ()
 +{
 +  d = 0;
 +  union U b = { 4294967286 };
 +  fn1 (b);
 +}
 +
 +int
 +main ()
 +{
 +  fn2 ();
 +  return 0;
 +}


Re: [PATCH, PR 63551] Use proper type in evaluate_conditions_for_known_args

2014-11-21 Thread Richard Biener
On November 21, 2014 9:07:50 PM CET, Martin Jambor mjam...@suse.cz wrote:
Hi,

the testcase of PR 63551 passes a union between a signed and an
unsigned integer between two functions as a parameter.  The caller
initializes to an unsigned integer with the highest order bit set, the
callee loads the data through the signed field and compares with zero.
evaluate_conditions_for_known_args then wrongly evaluated the
condition in these circumstances, which later on lead to insertion of
builtin_unreachable and miscompilation.

Fixed by fold_converting the known value first.  I use the type of the
value in the condition which should do exactly the right thing because
the value is taken from the corresponding gimple_cond statement in
which types must match.

Bootstrapped and tested on x86_64-linux.  OK for trunk?

I think you want to use fold_unary (VIEW_CONVERT_EXPR, ...) here if you
consider the case with int and float.  And fail if that returns NULL or
not a constant.

Thanks,
Richard.

Thanks,

Martin


2014-11-21  Martin Jambor  mjam...@suse.cz

   PR ipa/63551
   * ipa-inline-analysis.c (evaluate_conditions_for_known_args): Convert
   value of the argument to the type of the value in the condition.

testsuite/
   * gcc.dg/ipa/pr63551.c: New test.


Index: src/gcc/ipa-inline-analysis.c
===
--- src.orig/gcc/ipa-inline-analysis.c
+++ src/gcc/ipa-inline-analysis.c
@@ -880,6 +880,7 @@ evaluate_conditions_for_known_args (stru
   }
   if (c->code == IS_NOT_CONSTANT || c->code == CHANGED)
   continue;
+  val = fold_convert (TREE_TYPE (c->val), val);
res = fold_binary_to_constant (c->code, boolean_type_node, val,
c->val);
   if (res && integer_zerop (res))
   continue;
Index: src/gcc/testsuite/gcc.dg/ipa/pr63551.c
===
--- /dev/null
+++ src/gcc/testsuite/gcc.dg/ipa/pr63551.c
@@ -0,0 +1,33 @@
+/* { dg-do run } */
+/* { dg-options "-Os" } */
+
+union U
+{
+  unsigned int f0;
+  int f1;
+};
+
+int a, d;
+
+void
+fn1 (union U p)
+{
+  if (p.f1 <= 0)
+if (a)
+  d = 0;
+}
+
+void
+fn2 ()
+{
+  d = 0;
+  union U b = { 4294967286 };
+  fn1 (b);
+}
+
+int
+main ()
+{
+  fn2 ();
+  return 0;
+}




[PATCH][OpenMP] Fix named critical sections inside target functions

2014-11-21 Thread Ilya Verbin
Hi,

'#pragma omp critical (name)' can be placed in the function, marked
with '#pragma omp declare target', in this case the corresponding node
should be marked as offloadable too.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
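
For illustration only (not the testcase for this patch): a minimal sketch of the source pattern in question, a named critical section inside a function marked with 'declare target' and outside any target region. The names counter, counter_lock, bump and run are hypothetical. The sketch only exercises the host path; it does not check that the mutex symbol is actually marked for offloading.

```c
#pragma omp declare target
int counter;

/* Named critical section inside a declare-target function.  */
void
bump (void)
{
  #pragma omp critical (counter_lock)
  counter++;
}
#pragma omp end declare target

/* Call bump N times from a parallel loop and return the final count.
   Without -fopenmp the pragmas are ignored and the loop runs serially.  */
int
run (int n)
{
  counter = 0;
  #pragma omp parallel for
  for (int i = 0; i < n; i++)
    bump ();
  return counter;
}
```

Either way run (n) should return n, because the critical section serializes the increments.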

  -- Ilya


gcc/
* omp-low.c (lower_omp_critical): Mark critical sections
inside target functions as offloadable.


diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 3924282..6c5774c 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -9366,16 +9366,6 @@ lower_omp_critical (gimple_stmt_iterator *gsi_p, 
omp_context *ctx)
  DECL_ARTIFICIAL (decl) = 1;
  DECL_IGNORED_P (decl) = 1;
 
- /* If '#pragma omp critical' is inside target region, the symbol must
-be marked for offloading.  */
- omp_context *octx;
- for (octx = ctx->outer; octx; octx = octx->outer)
-   if (is_targetreg_ctx (octx))
- {
-   varpool_node::get_create (decl)->offloadable = 1;
-   break;
- }
-
  varpool_node::finalize_decl (decl);
 
  critical_name_mutexes-put (name, decl);
@@ -9383,6 +9373,20 @@ lower_omp_critical (gimple_stmt_iterator *gsi_p, 
omp_context *ctx)
   else
decl = *n;
 
+  /* If '#pragma omp critical' is inside target region or
+inside function marked as offloadable, the symbol must be
+marked as offloadable too.  */
+  omp_context *octx;
+  if (cgraph_node::get (current_function_decl)->offloadable)
+   varpool_node::get_create (decl)->offloadable = 1;
+  else
+   for (octx = ctx->outer; octx; octx = octx->outer)
+ if (is_targetreg_ctx (octx))
+   {
+ varpool_node::get_create (decl)->offloadable = 1;
+ break;
+   }
+
   lock = builtin_decl_explicit (BUILT_IN_GOMP_CRITICAL_NAME_START);
   lock = build_call_expr_loc (loc, lock, 1, build_fold_addr_expr_loc (loc, 
decl));
 


Re: [PATCH] Fix VRP handling of {ADD,SUB,MUL}_OVERFLOW (PR tree-optimization/64006)

2014-11-21 Thread Richard Biener
On November 21, 2014 8:04:39 PM CET, Jakub Jelinek ja...@redhat.com wrote:
Hi!

As discussed on IRC and in the PR, these internal calls are quite
unique for VRP in that they return _Complex integer result,
which VRP doesn't track, but then extract using
REALPART_EXPR/IMAGPART_EXPR
the two results from that _Complex int and to generate good code
it is desirable to get proper ranges of those two results.
The problem is that right now this works only on the first VRP
iteration,
the REALPART_EXPR/IMAGPART_EXPR statements are handled if their operand
is set by {ADD,SUB,MUL}_OVERFLOW.  If we iterate because a VR of one
of the internal call arguments changes, nothing in the propagator marks
the REALPART_EXPR/IMAGPART_EXPR statements for reconsideration.

The following patch handles this, by making the internal calls
interesting
to the propagator and returning the right SSA_PROP_* for it (depending
on
whether any of the value ranges of the REALPART_EXPR/IMAGPART_EXPR
immediate
uses would change or not).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

2014-11-21  Jakub Jelinek  ja...@redhat.com

   PR tree-optimization/64006
   * tree-vrp.c (stmt_interesting_for_vrp): Return true
   for {ADD,SUB,MUL}_OVERFLOW internal calls.
   (vrp_visit_assignment_or_call): For {ADD,SUB,MUL}_OVERFLOW
   internal calls, check if any REALPART_EXPR/IMAGPART_EXPR
   immediate uses would change their value ranges and return
   SSA_PROP_INTERESTING if so, or SSA_PROP_NOT_INTERESTING
   if there are some REALPART_EXPR/IMAGPART_EXPR immediate uses
   interesting for vrp.

   * gcc.c-torture/execute/pr64006.c: New test.

--- gcc/tree-vrp.c.jj  2014-11-21 10:17:05.0 +0100
+++ gcc/tree-vrp.c 2014-11-21 13:12:09.895013334 +0100
@@ -6949,6 +6949,20 @@ stmt_interesting_for_vrp (gimple stmt)
  && (is_gimple_call (stmt)
 || !gimple_vuse (stmt)))
   return true;
+  else if (is_gimple_call (stmt) && gimple_call_internal_p (stmt))
+  switch (gimple_call_internal_fn (stmt))
+{
+case IFN_ADD_OVERFLOW:
+case IFN_SUB_OVERFLOW:
+case IFN_MUL_OVERFLOW:
+  /* These internal calls return _Complex integer type,
+ but are interesting to VRP nevertheless.  */
+  if (lhs && TREE_CODE (lhs) == SSA_NAME)
+return true;
+  break;
+default:
+  break;
+}
 }
   else if (gimple_code (stmt) == GIMPLE_COND
  || gimple_code (stmt) == GIMPLE_SWITCH)
@@ -7101,6 +7115,74 @@ vrp_visit_assignment_or_call (gimple stm
 
   return SSA_PROP_NOT_INTERESTING;
 }
+  else if (is_gimple_call (stmt) && gimple_call_internal_p (stmt))
+switch (gimple_call_internal_fn (stmt))
+  {
+  case IFN_ADD_OVERFLOW:
+  case IFN_SUB_OVERFLOW:
+  case IFN_MUL_OVERFLOW:
+  /* These internal calls return _Complex integer type,
+ which VRP does not track, but the immediate uses
+ thereof might be interesting.  */
+  if (lhs && TREE_CODE (lhs) == SSA_NAME)
+{
+  imm_use_iterator iter;
+  use_operand_p use_p;
+  enum ssa_prop_result res = SSA_PROP_VARYING;
+
+  set_value_range_to_varying (get_value_range (lhs));
+
+  FOR_EACH_IMM_USE_FAST (use_p, iter, lhs)
+{
+  gimple use_stmt = USE_STMT (use_p);
+  if (!is_gimple_assign (use_stmt))
+continue;
+  enum tree_code rhs_code = gimple_assign_rhs_code (use_stmt);
+  if (rhs_code != REALPART_EXPR && rhs_code != IMAGPART_EXPR)
+continue;
+  tree rhs1 = gimple_assign_rhs1 (use_stmt);
+  tree use_lhs = gimple_assign_lhs (use_stmt);
+  if (TREE_CODE (rhs1) != rhs_code
+  || TREE_OPERAND (rhs1, 0) != lhs
+  || TREE_CODE (use_lhs) != SSA_NAME
+  || !stmt_interesting_for_vrp (use_stmt)
+  || (!INTEGRAL_TYPE_P (TREE_TYPE (use_lhs))
+  || !TYPE_MIN_VALUE (TREE_TYPE (use_lhs))
+  || !TYPE_MAX_VALUE (TREE_TYPE (use_lhs))))
+continue;
+
+  /* If there is a change in the value range for any of the
+ REALPART_EXPR/IMAGPART_EXPR immediate uses, return
+ SSA_PROP_INTERESTING.  If there are any REALPART_EXPR
+ or IMAGPART_EXPR immediate uses, but none of them have
+ a change in their value ranges, return
+ SSA_PROP_NOT_INTERESTING.  If there are no
+ {REAL,IMAG}PART_EXPR uses at all,
+ return SSA_PROP_VARYING.  */
+  value_range_t new_vr = VR_INITIALIZER;
+  extract_range_basic (&new_vr, use_stmt);
+  value_range_t *old_vr = get_value_range (use_lhs);
+  if (old_vr->type != new_vr.type
+  || !vrp_operand_equal_p (old_vr->min, 

Re: [RFC] First steps towards segregating types.

2014-11-21 Thread Andrew MacLeod

On 11/21/2014 03:13 PM, Richard Biener wrote:

On November 21, 2014 8:45:09 PM CET, Diego Novillo dnovi...@google.com wrote:

On Fri, Nov 21, 2014 at 1:48 PM, Andrew MacLeod amacl...@redhat.com
wrote:


1 - introduce a TYPE_REF tree node, which is effectively just a

'typed' tree

node, and the TREE_TYPE() field of a TYPE_REF node would point to the

type

node.  Any routines which utilize a TYPE node in a tree list would

have to

be modified to make use of this new TYPE_REF node to refer to the

type.

2 - change the field (list-value in this case) to be a tagged union

of {

tree tree_value, tree_type_ptr type_value } and use a bit in the base

to

flag which kind of value it is. This would be compatible with GTY,

and would

require changing routines and algorithms to check the bit and use the

right

field.

Seems to me that option 2 would also help against code that blindly
looks at TREE_VALUE and assumes it to be a tree. Wouldn't that make
initial implementation a bit more challenging?

Option 1 does seem easier, but I kind of like the forcing of rvalues
that option 2 provides.

Also liking option 1. The final change to the final type should be
simpler that way.

I don't like either :). It seems you are concerned about uses from trees. An 
intermediate step here that would be useful is doing what David did for RTL 
insns and now gimple - expose tree_type as static type but keep tree as its 
base.


Didn't say I was thrilled with either, just the only 2 I had come up 
with :-)

Thus make references to trees that are always types use tree_type * while 
keeping those that can refer to types and sth else refer to tree.

That's something that would not be completely artificial at this point.

Richard.





Or possibly a third type which is a hybrid of the two, and also maps to 
a tree...  something like  tree_type_hybrid *


That could work, and will continue to highlight all the places which 
still need to be dealt with.


And it's  much less work.. :-) I'll give that a go and see how it plays out.

Thanks
Andrew




Re: [PATCH] Fix up __builtin_*_overflow expansion on some targets (PR target/63848)

2014-11-21 Thread Richard Biener
On November 21, 2014 8:08:37 PM CET, Jakub Jelinek ja...@redhat.com wrote:
Hi!

Apparently, emit_cmp_and_jump_insns can silently generate wrong code
for wider modes on some targets, so this patch changes all those calls
in
internal-fn.c to do_compare_rtx_and_jump, which is a wrapper around
emit_cmp_and_jump_insns that should handle the wider mode comparison
expansion.  Unfortunately, the order of arguments is different :(.

No new testcases provided, the existing testsuite exhibited this on
various
targets.

Bootstrapped/regtested on x86_64-linux and i686-linux, tested on the
testcases for ia64 and Uros tested the testcases on Alpha (in both
cases
they previously failed), ok for trunk?

Ok.

Thanks,
Richard.

2014-11-21  Jakub Jelinek  ja...@redhat.com

   PR target/63848
   PR target/63975
   * internal-fn.c (expand_arith_overflow_result_store,
   expand_addsub_overflow, expand_neg_overflow, expand_mul_overflow): Use
   do_compare_rtx_and_jump instead of emit_cmp_and_jump_insns everywhere,
   adjust arguments to those functions.  Use unsignedp = true for
   EQ, NE, GEU, LEU, LTU and GTU comparisons.

--- gcc/internal-fn.c.jj   2014-11-19 18:48:02.0 +0100
+++ gcc/internal-fn.c  2014-11-21 17:34:00.634621461 +0100
@@ -386,8 +386,8 @@ expand_arith_overflow_result_store (tree
   int uns = TYPE_UNSIGNED (TREE_TYPE (TREE_TYPE (lhs)));
   lres = convert_modes (tgtmode, mode, res, uns);
 gcc_assert (GET_MODE_PRECISION (tgtmode) < GET_MODE_PRECISION (mode));
-  emit_cmp_and_jump_insns (res, convert_modes (mode, tgtmode,
lres, uns),
- EQ, NULL_RTX, mode, false, done_label,
+  do_compare_rtx_and_jump (res, convert_modes (mode, tgtmode,
lres, uns),
+ EQ, true, mode, NULL_RTX, NULL_RTX, done_label,
  PROB_VERY_LIKELY);
   write_complex_part (target, const1_rtx, true);
   emit_label (done_label);
@@ -533,8 +533,8 @@ expand_addsub_overflow (location_t loc,
 ? (CONST_SCALAR_INT_P (op0) && REG_P (op1))
 : CONST_SCALAR_INT_P (op1)))
   tem = op1;
-  emit_cmp_and_jump_insns (res, tem, code == PLUS_EXPR ? GEU :
LEU,
- NULL_RTX, mode, false, done_label,
+  do_compare_rtx_and_jump (res, tem, code == PLUS_EXPR ? GEU :
LEU,
+ true, mode, NULL_RTX, NULL_RTX, done_label,
  PROB_VERY_LIKELY);
   goto do_error_label;
 }
@@ -549,7 +549,7 @@ expand_addsub_overflow (location_t loc,
   rtx tem = expand_binop (mode, add_optab,
 code == PLUS_EXPR ? res : op0, sgn,
 NULL_RTX, false, OPTAB_LIB_WIDEN);
-  emit_cmp_and_jump_insns (tem, op1, GEU, NULL_RTX, mode, false,
+  do_compare_rtx_and_jump (tem, op1, GEU, true, mode, NULL_RTX,
NULL_RTX,
  done_label, PROB_VERY_LIKELY);
   goto do_error_label;
 }
@@ -591,9 +591,9 @@ expand_addsub_overflow (location_t loc,
   emit_jump (do_error);
   else if (pos_neg == 3)
   /* If ARG0 is not known to be always positive, check at runtime.  */
-  emit_cmp_and_jump_insns (op0, const0_rtx, LT, NULL_RTX, mode, false,
-   do_error, PROB_VERY_UNLIKELY);
-  emit_cmp_and_jump_insns (op1, op0, LEU, NULL_RTX, mode, false,
+  do_compare_rtx_and_jump (op0, const0_rtx, LT, false, mode, NULL_RTX,
+   NULL_RTX, do_error, PROB_VERY_UNLIKELY);
+  do_compare_rtx_and_jump (op1, op0, LEU, true, mode, NULL_RTX, NULL_RTX,
  done_label, PROB_VERY_LIKELY);
   goto do_error_label;
 }
@@ -607,7 +607,7 @@ expand_addsub_overflow (location_t loc,
 OPTAB_LIB_WIDEN);
rtx tem = expand_binop (mode, add_optab, op1, sgn, NULL_RTX, false,
 OPTAB_LIB_WIDEN);
-  emit_cmp_and_jump_insns (op0, tem, LTU, NULL_RTX, mode, false,
+  do_compare_rtx_and_jump (op0, tem, LTU, true, mode, NULL_RTX, NULL_RTX,
  done_label, PROB_VERY_LIKELY);
   goto do_error_label;
 }
@@ -619,8 +619,8 @@ expand_addsub_overflow (location_t loc,
unsigned.  */
   res = expand_binop (mode, add_optab, op0, op1, NULL_RTX, false,
 OPTAB_LIB_WIDEN);
-  emit_cmp_and_jump_insns (res, const0_rtx, LT, NULL_RTX, mode, false,
- do_error, PROB_VERY_UNLIKELY);
+  do_compare_rtx_and_jump (res, const0_rtx, LT, false, mode, NULL_RTX,
+ NULL_RTX, do_error, PROB_VERY_UNLIKELY);
   rtx tem = op1;
 /* The operation is commutative, so we can pick operand to compare
against.  For prec <= BITS_PER_WORD, I think preferring REG operand
@@ -633,7 +633,7 @@ expand_addsub_overflow (location_t loc,
 ? (CONST_SCALAR_INT_P (op1) && REG_P (op0))
 : CONST_SCALAR_INT_P (op0))
   tem = op0;

Re: [PATCH][OpenMP] Fix named critical sections inside target functions

2014-11-21 Thread Jakub Jelinek
On Fri, Nov 21, 2014 at 11:19:26PM +0300, Ilya Verbin wrote:
 Hi,
 
 '#pragma omp critical (name)' can be placed in the function, marked
 with '#pragma omp declare target', in this case the corresponding node
 should be marked as offloadable too.
 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Please add a testcase for this.
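
A testcase along these lines might look like the sketch below. It is only an illustration, not the testcase that was eventually proposed: the names bump, bump_lock and run_bumps are made up, and it assumes the usual host fallback when no offload target is configured. Without -fopenmp the pragmas are ignored and the loop runs serially.

```c
/* Hypothetical testcase sketch; bump, bump_lock and run_bumps are
   illustrative names only.  Build with -fopenmp to exercise the
   pragmas.  */

#pragma omp declare target
/* A function that may be compiled for the offload target and that
   contains a named critical section.  */
void
bump (int *counter)
{
  #pragma omp critical (bump_lock)
  *counter += 1;
}
#pragma omp end declare target

/* Call bump N times from a parallel loop inside a target region and
   return the final counter value.  */
int
run_bumps (int n)
{
  int total = 0;
  int i;

  #pragma omp target map(tofrom: total)
  #pragma omp parallel for
  for (i = 0; i < n; i++)
    bump (&total);

  return total;
}
```

If the named critical section works in the offloadable function, run_bumps (n) should return n.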

   * omp-low.c (lower_omp_critical): Mark critical sections
   inside target functions as offloadable.
 
 
 diff --git a/gcc/omp-low.c b/gcc/omp-low.c
 index 3924282..6c5774c 100644
 --- a/gcc/omp-low.c
 +++ b/gcc/omp-low.c
 @@ -9366,16 +9366,6 @@ lower_omp_critical (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 DECL_ARTIFICIAL (decl) = 1;
 DECL_IGNORED_P (decl) = 1;
  
 -   /* If '#pragma omp critical' is inside target region, the symbol must
 -  be marked for offloading.  */
 -   omp_context *octx;
 -   for (octx = ctx->outer; octx; octx = octx->outer)
 - if (is_targetreg_ctx (octx))
 -   {
 - varpool_node::get_create (decl)->offloadable = 1;
 - break;
 -   }
 -
 varpool_node::finalize_decl (decl);
  
 critical_name_mutexes->put (name, decl);
 @@ -9383,6 +9373,20 @@ lower_omp_critical (gimple_stmt_iterator *gsi_p, omp_context *ctx)
else
   decl = *n;
  
 +  /* If '#pragma omp critical' is inside target region or
 +  inside function marked as offloadable, the symbol must be
 +  marked as offloadable too.  */
 +  omp_context *octx;
 +  if (cgraph_node::get (current_function_decl)->offloadable)
 + varpool_node::get_create (decl)->offloadable = 1;
 +  else
 + for (octx = ctx->outer; octx; octx = octx->outer)
 +   if (is_targetreg_ctx (octx))
 + {
 +   varpool_node::get_create (decl)->offloadable = 1;
 +   break;
 + }
 +
lock = builtin_decl_explicit (BUILT_IN_GOMP_CRITICAL_NAME_START);
lock = build_call_expr_loc (loc, lock, 1, build_fold_addr_expr_loc (loc, decl));
  

Jakub


Re: [PATCH 2/2] PR debug/38757 continued. Handle C11, C++11 and C++14.

2014-11-21 Thread Mark Wielaard
On Fri, Nov 21, 2014 at 09:28:45AM +0100, Jakub Jelinek wrote:
 I think best would be to tweak
   if (value < 2 || value > 4)
 error_at (loc, "dwarf version %d is not supported", value);
   else
 opts->x_dwarf_version = value;
 so that we accept value 5 too, and for now, until the
 most common consumers are changed, use
   if (dwarf_version >= 5 /* || !dwarf_strict */)
 so that
 - you can actually use it in the test with -gdwarf-5
 - you can commit it right away
 - people can start playing with what it will mean to support DWARF5
 
 GCC 4.5 also allowed -gdwarf-4 even when DWARF4 has not been released yet.
 When there are consumers that can grok it, we can uncomment the
 || !dwarf_strict.

That makes sense and would be convenient for me.

I made the change in opts.c and added some minimal documentation.
And made sure we only emit the new DWARFv5 language values, but not yet
anything else (the table header format has changed for debug_info and
debug_line in v5, but we don't emit new style headers yet). The testcases
were updated to explicitly add -gdwarf-5.

 else if (strncmp (language_string, "GNU C", 5) == 0)
   {
 language = DW_LANG_C89;
 if (dwarf_version >= 3 || !dwarf_strict)
  -   if (strcmp (language_string, "GNU C99") == 0)
  - language = DW_LANG_C99;
  +   {
  + if (strcmp (language_string, "GNU C89") != 0)
  +   language = DW_LANG_C99;
  +
  + if (dwarf_version >= 5 || !dwarf_strict)
  +   if (strcmp (language_string, "GNU C11") == 0)
  + language = DW_LANG_C11;
  +   }
 
 Shouldn't we emit at least DW_LANG_C99 for GNU C11 if
 not dwarf_version >= 5 /* || !dwarf_strict */ but
 dwarf_version >= 3 || !dwarf_strict is true?

Yes, that is the intention. If it is a versioned GNU C then it is
at least DW_LANG_C89, if we have -gdwarf-3 or higher and it isn't
GNU C89 then it is at least DW_LANG_C99 and if we have -gdwarf-5
and it is GNU C11 then we emit DW_LANG_C11.

I added an explicit testcase for this.
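
To make the fallback order concrete, here is a minimal standalone sketch of the selection rules as described above. It is not the dwarf2out.c code: the enum values are stand-ins for the real DW_LANG_* constants, the function name pick_c_language is made up, and it assumes the DWARFv5 check is the plain "dwarf_version >= 5" variant with "|| !dwarf_strict" still commented out.

```c
#include <stdbool.h>
#include <string.h>

/* Stand-ins for the DWARF language codes; the real constants live in
   include/dwarf2.h.  */
enum c_lang { LANG_C89, LANG_C99, LANG_C11 };

/* Pick the language code for a versioned "GNU C.." language string,
   following the rules described in this mail.  Hypothetical helper.  */
enum c_lang
pick_c_language (const char *language_string, int dwarf_version,
                 bool dwarf_strict)
{
  /* A versioned GNU C is always at least C89.  */
  enum c_lang language = LANG_C89;

  if (dwarf_version >= 3 || !dwarf_strict)
    {
      /* Anything newer than C89 is at least C99.  */
      if (strcmp (language_string, "GNU C89") != 0)
        language = LANG_C99;

      /* The DWARFv5 value is only used when -gdwarf-5 is requested.  */
      if (dwarf_version >= 5 && strcmp (language_string, "GNU C11") == 0)
        language = LANG_C11;
    }

  return language;
}
```

With this, "GNU C11" at -gdwarf-4 falls back to the C99 code, and only -gdwarf-5 selects the C11 one.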

 BTW, noticed we don't have anything for Fortran 2003 and 2008,
 filed a DWARF Issue for that.

Thanks. I have only focussed on C and C++ because I don't know anything
about version changes in other language standards.

With the above change everything keeps working fine. You only need a
patched GDB when explicitly using -gdwarf-5.

OK to commit?

Thanks,

Mark
PR debug/38757 continued. Handle C11, C++11 and C++14.

Add experimental (minimal) DWARFv5 support.

This change depends on the new DWARFv5 constants mentioned in the
following draft: http://dwarfstd.org/doc/dwarf5.20141029.pdf

gcc/ChangeLog

* doc/invoke.texi (-gdwarf-@{version}): Mention experimental DWARFv5.
* opts.c (common_handle_option): Accept -gdwarf-5.
* dwarf2out.c (is_cxx): Add DW_LANG_C_plus_plus_11 and
DW_LANG_C_plus_plus_14.
(lower_bound_default): Likewise. Plus DW_LANG_C11.
(gen_compile_unit_die): Output DW_LANG_C_plus_plus_11,
DW_LANG_C_plus_plus_14 or DW_LANG_C11.
(output_compilation_unit_header): Output at most a DWARFv4 header.
(output_skeleton_debug_sections): Likewise.
(output_line_info): Likewise.
(output_aranges): Document header version number.

gcc/testsuite/ChangeLog

* gcc.dg/debug/dwarf2/lang-c11.c: New test.
* gcc.dg/debug/dwarf2/lang-c11-d4-strict.c: Likewise.
* g++.dg/debug/dwarf2/lang-cpp11.C: Likewise.
* g++.dg/debug/dwarf2/lang-cpp14.C: Likewise.
* g++.dg/debug/dwarf2/lang-cpp98.C: Likewise.

include/ChangeLog

* dwarf2.h: Add DW_LANG_C_plus_plus_11, DW_LANG_C11 and
DW_LANG_C_plus_plus_14.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 89edddb..d7bce2a 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -5407,8 +5407,8 @@ assembler (GAS) to fail with an error.
 @item -gdwarf-@var{version}
 @opindex gdwarf-@var{version}
 Produce debugging information in DWARF format (if that is supported).
-The value of @var{version} may be either 2, 3 or 4; the default version
-for most targets is 4.
+The value of @var{version} may be either 2, 3, 4 or 5; the default version
+for most targets is 4.  DWARF Version 5 is only experimental.
 
 Note that with DWARF Version 2, some ports require and always
 use some non-conflicting DWARF 3 extensions in the unwind tables.
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 3d50ac9..d0eaaf1 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -4684,7 +4684,8 @@ is_cxx (void)
 {
   unsigned int lang = get_AT_unsigned (comp_unit_die (), DW_AT_language);
 
-  return lang == DW_LANG_C_plus_plus || lang == DW_LANG_ObjC_plus_plus;
+  return (lang == DW_LANG_C_plus_plus || lang == DW_LANG_ObjC_plus_plus
+ || lang == DW_LANG_C_plus_plus_11 || lang == DW_LANG_C_plus_plus_14);
 }
 
 /* Return TRUE if the language is Java.  */
@@ -8966,7 +8967,9 @@ output_die (dw_die_ref die)
 static void
 

Re: [RFC] First steps towards segregating types.

2014-11-21 Thread Richard Biener
On November 21, 2014 9:22:08 PM CET, Andrew MacLeod amacl...@redhat.com wrote:
On 11/21/2014 03:13 PM, Richard Biener wrote:
 On November 21, 2014 8:45:09 PM CET, Diego Novillo dnovi...@google.com wrote:
 On Fri, Nov 21, 2014 at 1:48 PM, Andrew MacLeod amacl...@redhat.com wrote:

 1 - introduce a TYPE_REF tree node, which is effectively just a 'typed' tree
 node, and the TREE_TYPE() field of a TYPE_REF node would point to the type
 node.  Any routines which utilize a TYPE node in a tree list would have to
 be modified to make use of this new TYPE_REF node to refer to the type.
 2 - change the field (list->value in this case) to be a tagged union of {
 tree tree_value, tree_type_ptr type_value } and use a bit in the base to
 flag which kind of value it is. This would be compatible with GTY, and would
 require changing routines and algorithms to check the bit and use the right
 field.
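
For readers following along, option 2 could be pictured roughly as in the sketch below. This is only an illustration under assumed names: tree_node, type_node, list_node and the two accessors are hypothetical stand-ins, not GCC's actual declarations, and the flag is shown as a plain bool rather than a bit in the tree base.

```c
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical stand-ins, not GCC's declarations.  */
struct tree_node { int code; };
struct type_node { int precision; };

struct list_node
{
  bool value_is_type;   /* The flag that says which member is live.  */
  union
  {
    struct tree_node *tree_value;
    struct type_node *type_value;
  } value;
};

/* Return the type stored in NODE, or NULL if it holds a plain tree.  */
struct type_node *
list_type_value (const struct list_node *node)
{
  return node->value_is_type ? node->value.type_value : NULL;
}

/* Return the tree stored in NODE, or NULL if it holds a type.  */
struct tree_node *
list_tree_value (const struct list_node *node)
{
  return node->value_is_type ? NULL : node->value.tree_value;
}
```

Every reader of the field then has to go through the flag, which is the "check the bit and use the right field" cost mentioned in the proposal.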
 Seems to me that option 2 would also help against code that blindly
 looks at TREE_VALUE and assumes it to be a tree. Wouldn't that make
 initial implementation a bit more challenging?

 Option 1 does seem easier, but I kind of like the forcing of rvalues
 that option 2 provides.

 Also liking option 1. The final change to the final type should be
 simpler that way.
 I don't like either :). It seems you are concerned about uses from
trees. An intermediate step here that would be useful is doing what
David did for RTL insns and now gimple - expose tree_type as static
type but keep tree as its base.

Didn't say I was thrilled with either, just the only 2 I had come up 
with :-)
 Thus make references to trees that are always types use tree_type *
while keeping those that can refer to types and sth else refer to tree.

 That's something that would not be completely artificial at this
point.

 Richard.




Or possibly a third type which is a hybrid of the two, and also maps to
a tree...  something like  tree_type_hybrid *

That could work, and will continue to highlight all the places which 
still need to be dealt with.

Well, on a case by case basis you could find a better union in the tree type 
hierarchy.

Richard.

And it's much less work.. :-) I'll give that a go and see how it plays out.

Thanks
Andrew




[PATCH v2] gcc/ubsan.c: Use 'pretty_print' for 'pretty_name' to avoid memory overflow

2014-11-21 Thread Chen Gang
According to the code below, 'pretty_name' may need more than the 16
additional bytes allocated for it (an array type can produce a name of
unbounded length). There is an easy way to fix it: use 'pretty_print'
for 'pretty_name'.

Also make the code follow the 2-space indentation coding style (originally,
some of the code used 1-space indentation).

It passes the testsuite under Fedora 20 x86_64-unknown-linux-gnu.
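
To illustrate why a growing buffer avoids the problem, here is a small plain-C sketch. It is not the patch itself and does not use GCC's pretty-printer: strbuf, strbuf_append and array_type_name are hypothetical names, and a realloc-based buffer merely stands in for the pretty-printer's internal buffer.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string buffer, standing in for GCC's
   pretty-printer buffer.  */
struct strbuf
{
  char *data;
  size_t len;
  size_t cap;
};

/* Append TEXT, growing the buffer as needed.  Returns 0 on success and
   -1 if memory could not be allocated.  */
int
strbuf_append (struct strbuf *sb, const char *text)
{
  size_t n = strlen (text);
  if (sb->len + n + 1 > sb->cap)
    {
      size_t cap = sb->cap ? sb->cap : 16;
      while (cap < sb->len + n + 1)
        cap *= 2;
      char *p = realloc (sb->data, cap);
      if (p == NULL)
        return -1;
      sb->data = p;
      sb->cap = cap;
    }
  memcpy (sb->data + sb->len, text, n + 1);
  sb->len += n;
  return 0;
}

/* Build a name such as 'int [3][4]' for an array with NDIMS dimensions.
   The caller frees the result; NULL is returned on allocation failure.  */
char *
array_type_name (const char *tname, const unsigned long *dims, size_t ndims)
{
  struct strbuf sb = { NULL, 0, 0 };
  char num[32];
  int err = strbuf_append (&sb, "'");
  err |= strbuf_append (&sb, tname);
  err |= strbuf_append (&sb, " ");
  for (size_t i = 0; i < ndims; i++)
    {
      /* Each dimension adds more text, so no fixed slack is enough.  */
      snprintf (num, sizeof num, "[%lu]", dims[i]);
      err |= strbuf_append (&sb, num);
    }
  err |= strbuf_append (&sb, "'");
  if (err)
    {
      free (sb.data);
      return NULL;
    }
  return sb.data;
}
```

A ten-dimensional array already needs far more than strlen (tname) + 16 bytes, which is the case the fixed-size alloca cannot handle.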

2014-11-22  Chen Gang  gang.chen.5...@gmail.com

* ubsan.c (ubsan_type_descriptor): Use 'pretty_print' for
'pretty_name' to avoid memory overflow
---
 gcc/ubsan.c | 57 +++--
 1 file changed, 27 insertions(+), 30 deletions(-)

diff --git a/gcc/ubsan.c b/gcc/ubsan.c
index 41cf546..c03b000 100644
--- a/gcc/ubsan.c
+++ b/gcc/ubsan.c
@@ -336,7 +336,7 @@ ubsan_type_descriptor (tree type, enum ubsan_print_style pstyle)
   tree dtype = ubsan_type_descriptor_type ();
   tree type2 = type;
   const char *tname = NULL;
-  char *pretty_name;
+  pretty_printer pretty_name;
   unsigned char deref_depth = 0;
   unsigned short tkind, tinfo;
 
@@ -375,54 +375,50 @@ ubsan_type_descriptor (tree type, enum ubsan_print_style pstyle)
     /* We weren't able to determine the type name.  */
     tname = "<unknown>";
 
-  /* Decorate the type name with '', '*', "struct", or "union".  */
-  pretty_name = (char *) alloca (strlen (tname) + 16 + deref_depth);
   if (pstyle == UBSAN_PRINT_POINTER)
 {
-  int pos = sprintf (pretty_name, "'%s%s%s%s%s%s%s",
-TYPE_VOLATILE (type2) ? "volatile " : "",
-TYPE_READONLY (type2) ? "const " : "",
-TYPE_RESTRICT (type2) ? "restrict " : "",
-TYPE_ATOMIC (type2) ? "_Atomic " : "",
-TREE_CODE (type2) == RECORD_TYPE
-? "struct "
-: TREE_CODE (type2) == UNION_TYPE
-  ? "union " : "", tname,
-deref_depth == 0 ? "" : " ");
+  pp_printf (&pretty_name, "'%s%s%s%s%s%s%s",
+TYPE_VOLATILE (type2) ? "volatile " : "",
+TYPE_READONLY (type2) ? "const " : "",
+TYPE_RESTRICT (type2) ? "restrict " : "",
+TYPE_ATOMIC (type2) ? "_Atomic " : "",
+TREE_CODE (type2) == RECORD_TYPE
+? "struct "
+: TREE_CODE (type2) == UNION_TYPE
+  ? "union " : "", tname,
+deref_depth == 0 ? "" : " ");
   while (deref_depth-- > 0)
-pretty_name[pos++] = '*';
-  pretty_name[pos++] = '\'';
-  pretty_name[pos] = '\0';
+   pp_star(&pretty_name);
+  pp_quote(&pretty_name);
 }
   else if (pstyle == UBSAN_PRINT_ARRAY)
 {
   /* Pretty print the array dimensions.  */
   gcc_assert (TREE_CODE (type) == ARRAY_TYPE);
   tree t = type;
-  int pos = sprintf (pretty_name, "'%s ", tname);
+  pp_printf (&pretty_name, "'%s ", tname);
   while (deref_depth-- > 0)
-pretty_name[pos++] = '*';
+   pp_star(&pretty_name);
   while (TREE_CODE (t) == ARRAY_TYPE)
{
- pretty_name[pos++] = '[';
+ pp_left_bracket(&pretty_name);
  tree dom = TYPE_DOMAIN (t);
  if (dom && TREE_CODE (TYPE_MAX_VALUE (dom)) == INTEGER_CST)
-   pos += sprintf (&pretty_name[pos], HOST_WIDE_INT_PRINT_DEC,
-   tree_to_uhwi (TYPE_MAX_VALUE (dom)) + 1);
+   pp_printf (&pretty_name, HOST_WIDE_INT_PRINT_DEC,
+  tree_to_uhwi (TYPE_MAX_VALUE (dom)) + 1);
  else
/* ??? We can't determine the variable name; print VLA unspec.  */
-   pretty_name[pos++] = '*';
- pretty_name[pos++] = ']';
+   pp_star(&pretty_name);
+ pp_right_bracket(&pretty_name);
  t = TREE_TYPE (t);
}
-  pretty_name[pos++] = '\'';
-  pretty_name[pos] = '\0';
+  pp_quote(&pretty_name);
 
- /* Save the tree with stripped types.  */
- type = t;
+  /* Save the tree with stripped types.  */
+  type = t;
 }
   else
-sprintf (pretty_name, "'%s'", tname);
+pp_printf (&pretty_name, "'%s'", tname);
 
   switch (TREE_CODE (type))
 {
@@ -459,8 +455,9 @@ ubsan_type_descriptor (tree type, enum ubsan_print_style pstyle)
   DECL_IGNORED_P (decl) = 1;
   DECL_EXTERNAL (decl) = 0;
 
-  size_t len = strlen (pretty_name);
-  tree str = build_string (len + 1, pretty_name);
+  const char *tmp = pp_formatted_text(&pretty_name);
+  size_t len = strlen (tmp);
+  tree str = build_string (len + 1, tmp);
   TREE_TYPE (str) = build_array_type (char_type_node,
  build_index_type (size_int (len)));
   TREE_READONLY (str) = 1;
-- 
1.9.3


Re: [PATCH][OpenMP] Fix named critical sections inside target functions

2014-11-21 Thread Ilya Verbin
 On 21 Nov 2014, at 23:36, Jakub Jelinek ja...@redhat.com wrote:
 
 On Fri, Nov 21, 2014 at 11:19:26PM +0300, Ilya Verbin wrote:
 Hi,
 
 '#pragma omp critical (name)' can be placed in the function, marked
 with '#pragma omp declare target', in this case the corresponding node
 should be marked as offloadable too.
 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
 
 Please add a testcase for this.

By default with disabled offloading it will always PASS.  Add anyway?

  -- Ilya


Re: [PATCH][OpenMP] Fix named critical sections inside target functions

2014-11-21 Thread H.J. Lu
On Fri, Nov 21, 2014 at 1:08 PM, Ilya Verbin iver...@gmail.com wrote:
 On 21 Nov 2014, at 23:36, Jakub Jelinek ja...@redhat.com wrote:

 On Fri, Nov 21, 2014 at 11:19:26PM +0300, Ilya Verbin wrote:
 Hi,

 '#pragma omp critical (name)' can be placed in the function, marked
 with '#pragma omp declare target', in this case the corresponding node
 should be marked as offloadable too.
 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

 Please add a testcase for this.

 By default with disabled offloading it will always PASS.  Add anyway?


Have you fixed the offloading issue with binutils 2.25?


-- 
H.J.

