[PATCH, PR 58398] Fix regression in gcc.dg/attr-ifunc-4.c

2013-09-17 Thread Bernd Edlinger
The attached patch fixes the regression in gcc.dg/attr-ifunc-4.c (PR 58398).

The problem is that the resolver function just looks like an alias, but it
actually is something completely different. So inlining the resolver function
has to be avoided.
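
For readers unfamiliar with ifunc, here is a minimal sketch of the situation described above. It assumes an ELF target with GNU indirect-function support (for example x86_64 GNU/Linux with glibc). The names `answer`, `answer_impl` and `answer_resolver` are made up for illustration; this is not the testcase itself.

```c
/* Sketch only: needs an ELF target with GNU ifunc support.
   All names below are hypothetical.  */

typedef int answer_fn (void);

/* The real implementation that callers should end up in.  */
static int
answer_impl (void)
{
  return 42;
}

/* The resolver is run once by the dynamic loader, not by callers.
   It returns the address of the implementation to bind `answer' to.  */
static answer_fn *
answer_resolver (void)
{
  return answer_impl;
}

/* `answer' is declared in terms of `answer_resolver', much like an
   alias, but a call to it must reach whatever the resolver returns.
   Substituting the resolver's body for a call to `answer' would yield
   a function pointer where the caller expects an int.  */
int answer (void) __attribute__ ((ifunc ("answer_resolver")));
```

A call such as `answer ()` is expected to evaluate to 42 here, because the loader binds `answer` to `answer_impl` before the first call.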

The patch was bootstrapped and regression-tested without any problems
on x86_64-unknown-linux-gnu.

OK for trunk?

Regards,
Bernd Edlinger

2013-09-17  Bernd Edlinger  bernd.edlin...@hotmail.de

PR ipa/58398
* cgraph.c (cgraph_function_body_availability): Check for ifunc
attribute, and don't inline the resolver in this case.



patch-pr58398.diff
Description: Binary data


Re: [PATCH] Fix segfault with inlining

2013-09-17 Thread Eric Botcazou
 I've looked at the C++ testcase
 
 int foo (int x)
 {
   try {
     return x;
   }
   catch (...)
   {
     return 0;
   }
 }
 
 which exhibits exactly the behavior you quote - return x is considered
 throwing an exception.  The C++ FE doesn't arrange for TREE_THIS_NOTRAP to
 be set here (maybe due to this issue you quote?).

I presume that you compiled with -fnon-call-exceptions?  Otherwise, I don't 
see how something that isn't a call can throw an exception in C++, it should 
be seen at most as possibly trapping, which is less blocking.

 Other than that the patch looks reasonable (I suppose you need
 is_parameter_of only because as we recursively handle the trees
 PARM_DECLs from the destination could already have leaked into
 the tree we recurse into?)

Do you mean that the test on DECL_CONTEXT is superfluous?  Possibly indeed, 
but with nested functions you can have PARM_DECLs of different origins in a 
given function body, although this may be irrelevant for tree-inline.c.

-- 
Eric Botcazou


Re: [PATCH, PR 58398] Fix regression in gcc.dg/attr-ifunc-4.c

2013-09-17 Thread Jan Hubicka
 The attached patch fixes the regression in gcc.dg/attr-ifunc-4.c (PR 58398).
 
 The problem is that the resolver function just looks like an alias, but it
 actually is something completely different. So inlining the resolver function
 has to be avoided.
 
 The patch was bootstrapped and regression-tested without any problems
 on x86_64-unknown-linux-gnu.
 
 OK for trunk?
 
 Regards,
 Bernd Edlinger  

 2013-09-17  Bernd Edlinger  bernd.edlin...@hotmail.de
 
   PR ipa/58398
   * cgraph.c (cgraph_function_body_availability): Check for ifunc
   attribute, and don't inline the resolver in this case.

OK,
thanks!

Honza
 




[PATCH v3] Caller instrumentation with -finstrument-calls

2013-09-17 Thread Paul Woegerer
Hello Jan,
   
the MAINTAINERS file reveals that you are the right person to contact
for profile feedback related changes.

This is the third iteration of the caller instrumentation patch
originally posted and explained here:
http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01593.html

The hooks now conform to the naming scheme suggested by Andrew Pinski
and the extra bitfield for the no_instrument_calls func attribute is
now relocated to tree_decl_with_vis (it was in tree_function_decl
before, but there is no room left there for another bitfield).
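
To make the idea of caller-side hooks concrete, here is a hand-written sketch of roughly what instrumenting a call site amounts to. The hook names `my_call_before` and `my_call_after`, their signatures, and the counters are hypothetical stand-ins chosen for this example; the hooks added by the patch and the arguments they receive are defined by the patch itself and may differ.

```c
/* Hypothetical hooks: here they only count how often they run.  */
unsigned before_count;
unsigned after_count;

void
my_call_before (const char *callee_name)
{
  (void) callee_name;
  before_count++;
}

void
my_call_after (const char *callee_name)
{
  (void) callee_name;
  after_count++;
}

int
square (int x)
{
  return x * x;
}

/* What a caller looks like once the call to `square' has been wrapped
   by hand.  With caller instrumentation the compiler would insert the
   two hook calls, so the callee itself stays untouched.  */
int
caller (int x)
{
  int result;

  my_call_before ("square");
  result = square (x);
  my_call_after ("square");
  return result;
}
```

The point of instrumenting the caller rather than the callee is that calls into code that cannot be recompiled, such as a prebuilt library, can still be observed.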

It would be great if this patch could make it into GCC 4.9.0.

Thanks,
Paul

Paul Woegerer (1):
  Caller instrumentation with -finstrument-calls.

 gcc/builtins.def                                |   5 ++
 gcc/c-family/c-common.c                         |  34 +++
 gcc/c/c-decl.c                                  |   2 +
 gcc/common.opt                                  |  20 -
 gcc/cp/decl.c                                   |   2 +
 gcc/doc/invoke.texi                             |  42 +
 gcc/function.c                                  |   3 +-
 gcc/gimplify.c                                  | 113 +++-
 gcc/ipa.c                                       |   1 +
 gcc/java/jcf-parse.c                            |   1 +
 gcc/libfuncs.h                                  |   6 ++
 gcc/optabs.c                                    |   6 ++
 gcc/opts.c                                      |  10 +++
 gcc/testsuite/g++.dg/other/instrument_calls-1.C |  14 +++
 gcc/testsuite/g++.dg/other/instrument_calls-2.C |  20 +
 gcc/testsuite/g++.dg/other/instrument_calls-3.C |  17
 gcc/testsuite/gcc.dg/instrument_calls-1.c       |   8 ++
 gcc/testsuite/gcc.dg/instrument_calls-2.c       |   8 ++
 gcc/testsuite/gcc.dg/instrument_calls-3.c       |   8 ++
 gcc/testsuite/gcc.dg/instrument_calls-4.c       |   8 ++
 gcc/testsuite/gcc.dg/instrument_calls-5.c       |  11 +++
 gcc/testsuite/gcc.dg/instrument_calls-6.c       |  11 +++
 gcc/testsuite/gcc.dg/instrument_calls-7.c       |  13 +++
 gcc/testsuite/gcc.dg/instrument_calls-8.c       |   7 ++
 gcc/testsuite/gcc.dg/instrument_calls-9.c       |  12 +++
 gcc/tree-core.h                                 |   4 +-
 gcc/tree-streamer-in.c                          |   2 +
 gcc/tree-streamer-out.c                         |   1 +
 gcc/tree.h                                      |   6 ++
 29 files changed, 390 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/other/instrument_calls-1.C
 create mode 100644 gcc/testsuite/g++.dg/other/instrument_calls-2.C
 create mode 100644 gcc/testsuite/g++.dg/other/instrument_calls-3.C
 create mode 100644 gcc/testsuite/gcc.dg/instrument_calls-1.c
 create mode 100644 gcc/testsuite/gcc.dg/instrument_calls-2.c
 create mode 100644 gcc/testsuite/gcc.dg/instrument_calls-3.c
 create mode 100644 gcc/testsuite/gcc.dg/instrument_calls-4.c
 create mode 100644 gcc/testsuite/gcc.dg/instrument_calls-5.c
 create mode 100644 gcc/testsuite/gcc.dg/instrument_calls-6.c
 create mode 100644 gcc/testsuite/gcc.dg/instrument_calls-7.c
 create mode 100644 gcc/testsuite/gcc.dg/instrument_calls-8.c
 create mode 100644 gcc/testsuite/gcc.dg/instrument_calls-9.c

-- 
1.8.4



[PATCH] Caller instrumentation with -finstrument-calls.

2013-09-17 Thread Paul Woegerer
2013-07-01  Paul Woegerer  paul_woege...@mentor.com

Caller instrumentation with -finstrument-calls.
* gcc/builtins.def: Add call-hooks __gnu_profile_call_before and
__gnu_profile_call_after.
* gcc/libfuncs.h (enum libfunc_index): Likewise.
* gcc/optabs.c (init_optabs): Likewise.
* gcc/c-family/c-common.c (no_instrument_calls): Add attribute.
(handle_no_instrument_calls_attribute): New.
* gcc/common.opt (finstrument-calls): New option.
(finstrument-calls-exclude-function-list): Likewise.
(finstrument-calls-exclude-file-list): Likewise.
* gcc/opts.c (common_handle_option): Handle new options.
* gcc/tree-core.h (tree_decl_with_vis): Add bitfield
no_instrument_calls_before_after.
* gcc/tree.h: Macro for no_instrument_calls_before_after access.
* gcc/c/c-decl.c (merge_decls): Handle tree_function_decl field.
* gcc/cp/decl.c (duplicate_decls): Likewise.
* gcc/function.c (expand_function_start): Likewise.
* gcc/ipa.c: Likewise.
* gcc/java/jcf-parse.c: Likewise.
* gcc/tree-streamer-in.c: Likewise.
* gcc/tree-streamer-out.c: Likewise.
(finstrument-calls-exclude-function-list): Likewise.
(finstrument-calls-exclude-file-list): Likewise.
* gcc/gimplify.c (flag_instrument_calls_exclude_p): New.
(addr_expr_for_call_instrumentation): New.
(maybe_add_profile_call): New.
(gimplify_call_expr): Add call-hooks insertion.
(gimplify_modify_expr): Likewise.
* gcc/doc/invoke.texi: Added documentation for
-finstrument-calls-exclude-function-list and
-finstrument-calls-exclude-file-list and
-finstrument-calls.
* gcc/testsuite/g++.dg/other/instrument_calls-1.C: Added
regression test for -finstrument-calls.
* gcc/testsuite/g++.dg/other/instrument_calls-2.C: Likewise.
* gcc/testsuite/g++.dg/other/instrument_calls-3.C: Likewise.
* gcc/testsuite/gcc.dg/instrument_calls-1.c: Likewise.
* gcc/testsuite/gcc.dg/instrument_calls-2.c: Likewise.
* gcc/testsuite/gcc.dg/instrument_calls-3.c: Likewise.
* gcc/testsuite/gcc.dg/instrument_calls-4.c: Likewise.
* gcc/testsuite/gcc.dg/instrument_calls-5.c: Likewise.
* gcc/testsuite/gcc.dg/instrument_calls-6.c: Likewise.
* gcc/testsuite/gcc.dg/instrument_calls-7.c: Likewise.
* gcc/testsuite/gcc.dg/instrument_calls-8.c: Likewise.
* gcc/testsuite/gcc.dg/instrument_calls-9.c: Likewise.

Signed-off-by: Paul Woegerer paul_woege...@mentor.com

Re: [gomp4, trunk] Two simd fixes

2013-09-17 Thread Richard Biener
On Mon, 16 Sep 2013, Jakub Jelinek wrote:

 Hi!
 
 This patch fixes two issues I found on the pr58392.c testcase:
 1) we weren't copying decl attributes, so e.g. inside #pragma omp parallel
 omp simd array temporary arrays lost their attribute and weren't
 adjusted because of that
 2) DR_ALIGNED_TO wasn't reset after resetting DR_OFFSET on simd lane access
 DRs, which resulted in the vectorizer trying to peel for alignment on those.
 Those are always automatic vars that can be just aligned more.
 
 Ok?
 
 2013-09-16  Jakub Jelinek  ja...@redhat.com
 
   * omp-low.c (copy_var_decl): Copy DECL_ATTRIBUTES.
   * tree-vect-data-refs.c (vect_analyze_data_refs): For
   simd_lane_access drs, update also DR_ALIGNED_TO.
 
 --- gcc/omp-low.c.jj  2013-09-16 10:08:43.0 +0200
 +++ gcc/omp-low.c 2013-09-16 15:25:31.683903448 +0200
 @@ -888,6 +888,7 @@ copy_var_decl (tree var, tree name, tree
    TREE_NO_WARNING (copy) = TREE_NO_WARNING (var);
    TREE_USED (copy) = 1;
    DECL_SEEN_IN_BIND_EXPR_P (copy) = 1;
 +  DECL_ATTRIBUTES (copy) = DECL_ATTRIBUTES (var);

    return copy;
  }

Ok.

 --- gcc/tree-vect-data-refs.c.jj  2013-09-13 16:48:28.0 +0200
 +++ gcc/tree-vect-data-refs.c 2013-09-16 14:47:56.500538758 +0200
 @@ -3039,6 +3039,9 @@ again:
   {
 DR_OFFSET (newdr) = ssize_int (0);
 DR_STEP (newdr) = step;
 +   DR_ALIGNED_TO (newdr)
 + = size_int (highest_pow2_factor
 + (DR_OFFSET (newdr)));

That looks odd - DR_OFFSET (newdr) is constant zero, so you can
as well immediately use BIGGEST_ALIGNMENT here (that's what
highest_pow2_factor does).

Ok with that change.
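
The point about a constant-zero offset can be illustrated with a simplified, standalone model of a highest-power-of-two-factor helper. This is not GCC's implementation: it works on plain integers rather than trees, and `MAX_ALIGN_CAP` is a made-up cap standing in for the target's biggest alignment.

```c
/* Hypothetical cap standing in for the target's biggest alignment.  */
#define MAX_ALIGN_CAP 16u

/* Largest power of two that divides N, limited to MAX_ALIGN_CAP.
   Zero is divisible by every power of two, so the model returns the
   cap for it.  */
unsigned
pow2_factor_model (unsigned n)
{
  unsigned factor;

  if (n == 0)
    return MAX_ALIGN_CAP;

  /* In unsigned arithmetic, n & -n isolates the lowest set bit.  */
  factor = n & -n;
  return factor < MAX_ALIGN_CAP ? factor : MAX_ALIGN_CAP;
}
```

In this model a zero argument always yields the cap, which is why computing the factor of a value known to be zero adds nothing over using the cap directly.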

Thanks,
Richard.

 dr = newdr;
 simd_lane_access = true;
   }
 
   Jakub
 


Commit: MSP430: Add support for interrupt handlers

2013-09-17 Thread Nick Clifton
Hi Guys,

  I am applying the patch below to add support for interrupt handlers to
  the MSP430 backend.  The patch also adds a couple of MSP430 specific
  builtin functions intended to be used inside interrupt handlers.  In
  addition the patch adds support for naked functions, critical
  functions (which disable interrupts whilst they execute) and reentrant
  functions (which disable interrupts but always reenable them upon
  exit).
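
  As a rough illustration of how such attributes might be used, here is a
  sketch. It assumes the attribute spellings `interrupt` and `critical`
  and that the compiler predefines `__MSP430__`; the vector number 9, the
  macro names and the function names are invented for the example. On any
  other target the macros expand to nothing so the code still builds.

```c
#if defined (__MSP430__)
# define ISR(vector) __attribute__ ((interrupt (vector)))
# define CRITICAL    __attribute__ ((critical))
#else
/* Host build: the attributes are target-specific, so drop them.  */
# define ISR(vector)
# define CRITICAL
#endif

/* Shared between the handler and normal code.  */
volatile unsigned tick_count;

/* Handler for a hypothetical timer interrupt on vector 9.  */
void ISR (9)
timer_isr (void)
{
  tick_count++;
}

/* Reads and clears the counter.  Marked critical so that, on the
   target, interrupts are disabled while it runs and the handler
   cannot fire between the read and the write.  */
unsigned CRITICAL
take_ticks (void)
{
  unsigned n = tick_count;

  tick_count = 0;
  return n;
}
```

  On real hardware the handler is entered by the interrupt controller
  rather than called directly; calling it from ordinary code only makes
  sense in a host-side test build.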

  Tested with no regressions on an msp430-elf toolchain.

Cheers
  Nick

gcc/ChangeLog
2013-09-17  Nick Clifton  ni...@redhat.com

* config/msp430/msp430-protos.h: Add prototypes for new functions.
* config/msp430/msp430.c (msp430_preserve_reg_p): Add support for
interrupt handlers.
(is_attr_func): New function.
(msp430_is_interrupt_func): New function.
(is_naked_func): New function.
(is_reentrant_func): New function.
(is_critical_func): New function.
(msp430_start_function): Add annotations for function attributes.
(msp430_attr): New function.
(msp430_attribute_table): New.
(msp430_function_section): New function.
(TARGET_ASM_FUNCTION_SECTION): Define.
(msp430_builtin): New enum.
(msp430_init_builtins): New function.
(msp430_builtin_decl): New function.
(msp430_expand_builtin): New function.
(TARGET_INIT_BUILTINS): Define.
(TARGET_EXPAND_BUILTIN): Define.
(TARGET_BUILTIN_DECL): Define.
(msp430_expand_prologue): Add support for naked, interrupt,
critical and reentrant functions.
(msp430_expand_epilogue): Likewise.
(msp430_print_operand): Handle 'O' character.
* config/msp430/msp430.h (TARGET_CPU_CPP_BUILTINS): Define
NO_TRAMPOLINES.
* config/msp430/msp430.md (unspec): Add UNS_DINT, UNS_EINT,
UNS_PUSH_INTR, UNS_POP_INTR, UNS_BIC_SR, UNS_BIS_SR.
(pushm): Use an 'n' rather than an 'i' constraint.
(msp_return): Add generation of the interrupt return instruction.
(disable_interrupts): New pattern.
(enable_interrupts): New pattern.
(push_intr_state): New pattern.
(pop_intr_state): New pattern.
(bic_SR): New pattern.
(bis_SR): New pattern.
* doc/extend.texi: Document MSP430 function attributes and builtin
functions.



msp430.intr.patch.xz
Description: application/xz


Re: Dump framework newline cleanup

2013-09-17 Thread Richard Biener
On Mon, Sep 16, 2013 at 8:36 PM, Teresa Johnson tejohn...@google.com wrote:
 Yep, I looked too quickly every time and thought the newline after "be
 zero" applied to both strings. Here is the patch with the fix. OK for trunk
 pending regression testing?

Ok.

Thanks,
Richard.

 2013-09-16  Teresa Johnson  tejohn...@google.com

 * coverage.c (get_coverage_counts): Add missing newline.

 Index: coverage.c
 ===
 --- coverage.c  (revision 202628)
 +++ coverage.c  (working copy)
 @@ -347,7 +347,7 @@ get_coverage_counts (unsigned counter, unsigned ex
      if (!warned++ && dump_enabled_p ())
        dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, input_location,
                         (flag_guess_branch_prob
 -                        ? "file %s not found, execution counts estimated"
 +                        ? "file %s not found, execution counts estimated\n"
                          : "file %s not found, execution counts assumed to "
                            "be zero\n"),
                         da_file_name);

 Thanks,
 Teresa

 On Mon, Sep 16, 2013 at 11:20 AM, Xinliang David Li davi...@google.com 
 wrote:
 Looks like there is one missing spot:

 @@ -349,7 +349,7 @@ get_coverage_counts (unsigned counter, u
                         (flag_guess_branch_prob
                          ? "file %s not found, execution counts estimated"
                          : "file %s not found, execution counts assumed to "
 -                          "be zero"),
 +                          "be zero\n"),
                         da_file_name);
        return NULL;


 I found this when testing interaction of -fprofile-use and
 -fno-tree-vectorize without a profile.

 thanks,

 David


 On Mon, Sep 16, 2013 at 11:06 AM, Teresa Johnson tejohn...@google.com 
 wrote:
 On Mon, Sep 16, 2013 at 10:57 AM, Xinliang David Li davi...@google.com 
 wrote:
 I noticed there are a couple of dump_printf_loc instances in
 coverage.c not ended with newline. They should be fixed.

 I committed this change this morning as r202628. I believe I fixed all
 the dump_printf_loc calls (just double-checked). Can you let me know
 if you see anymore after you update to this revision?

 Thanks,
 Teresa


 David

 On Tue, Sep 10, 2013 at 6:32 AM, Teresa Johnson tejohn...@google.com 
 wrote:
 On Mon, Sep 9, 2013 at 9:55 PM, Xinliang David Li davi...@google.com 
 wrote:
 looks fine to me.

 In the long run, I wonder if the machinery in diagnostic messages can
 be reused for opt-info dumping -- i.e., support different streams. It
 has many nice features including %qD specifier for printing tree
 decls.

 Yes, this would have some advantages such as getting the function name 
 emitted.

 Teresa


 David

 On Mon, Sep 9, 2013 at 12:01 PM, Teresa Johnson tejohn...@google.com 
 wrote:
 I've attached a patch that implements the cleanup of newline emission
 by the new dump framework as discussed here:

 http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01779.html

 Essentially, I have removed the leading newline emission from
 dump_loc, and updated dump_printf_loc invocations to emit a trailing
 newline as necessary. This will remove unnecessary vertical space in
 the dump output.
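
 One pitfall once every call has to supply its own trailing newline, and
 the one that surfaced in this thread, is a message chosen with a
 conditional where one branch is split across adjacent string literals.
 A small sketch, with hypothetical helpers `coverage_note` and
 `ends_with_newline` standing in for the format argument and a check on
 it:

```c
#include <string.h>

/* Each branch needs its own "\n".  The newline on the second literal
   of the ':' branch belongs only to that branch, because adjacent
   literals are concatenated before the conditional is evaluated.  */
const char *
coverage_note (int guess_branch_prob)
{
  return (guess_branch_prob
          ? "file %s not found, execution counts estimated\n"
          : "file %s not found, execution counts assumed to "
            "be zero\n");
}

/* Nonzero if the non-empty string S ends in a newline.  */
int
ends_with_newline (const char *s)
{
  size_t len = strlen (s);

  return len > 0 && s[len - 1] == '\n';
}
```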

 I did not do any other cleanup of the existing vectorization messages
 - there are IMO a lot of messages being emitted by the vectorizer
 under MSG_NOTE (and probably MSG_MISSED_OPTIMIZATION) that should only
 be emitted to the dump file under -fdump-tree-... and not emitted
 under -fopt-info-all. The ones that stay under -fopt-info-all need
 some formatting/style cleanup. Leaving that for follow-on work.

 Bootstrapped and tested on x86-64-unknown-linux-gnu. Ok for trunk?

 Thanks,
 Teresa

 --
 Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413





[PATCH, i386, MPX, 1/X] Support of Intel MPX ISA. 1/2 Bound type and modes

2013-09-17 Thread Ilya Enkovich
Hi,

Here is a patch introducing new type and mode for bounds. It is a part of MPX 
ISA support patch (http://gcc.gnu.org/ml/gcc-patches/2013-07/msg01094.html).
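
As background on what a bounds value carries, here is a conceptual sketch in plain C. It is not how the patch or the hardware represents bounds: the `bounds` struct and the `make_bounds` and `in_bounds` helpers are invented for illustration, and the only thing taken from the description is that a bounds value describes the valid range for a pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical representation: inclusive lower and upper addresses.  */
struct bounds
{
  uintptr_t lower;
  uintptr_t upper;
};

/* Bounds covering SIZE bytes starting at BASE.  SIZE must be nonzero.  */
struct bounds
make_bounds (const void *base, size_t size)
{
  struct bounds b;

  b.lower = (uintptr_t) base;
  b.upper = b.lower + size - 1;
  return b;
}

/* Nonzero if an access of ACCESS_SIZE bytes at address P stays inside B.
   ACCESS_SIZE must be nonzero; address wrap-around is ignored here.  */
int
in_bounds (struct bounds b, uintptr_t p, size_t access_size)
{
  uintptr_t last = p + access_size - 1;

  return p >= b.lower && last <= b.upper;
}
```

A dedicated type and mode for such values lets the compiler keep them apart from ordinary integers, even though each bound is just an address.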

Bootstrapped and tested on linux-x86_64. Is it OK for trunk?

Thanks,
Ilya
--

gcc/

2013-09-16  Ilya Enkovich  ilya.enkov...@intel.com

* mode-classes.def (MODE_BOUND): New.
* tree.def (BOUND_TYPE): New.
* genmodes.c (complete_mode): Support MODE_BOUND.
(BOUND_MODE): New.
(make_bound_mode): New.
* machmode.h (BOUND_MODE_P): New.
* stor-layout.c (int_mode_for_mode): Support MODE_BOUND.
(layout_type): Support BOUND_TYPE.
* tree-pretty-print.c (dump_generic_node): Support BOUND_TYPE.
* tree.c (build_int_cst_wide): Support BOUND_TYPE.
(type_contains_placeholder_1): Likewise.
* tree.h (BOUND_TYPE_P): New.
* varasm.c (output_constant): Support BOUND_TYPE.
* doc/rtl.texi (MODE_BOUND): New.

diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
index 1d62223..02b1214 100644
--- a/gcc/doc/rtl.texi
+++ b/gcc/doc/rtl.texi
@@ -1382,6 +1382,10 @@ any @code{CC_MODE} modes listed in the @file{@var{machine}-modes.def}.
 @xref{Jump Patterns},
 also see @ref{Condition Code}.
 
+@findex MODE_BOUND
+@item MODE_BOUND
+Bound modes class.  Used to represent values of pointer bounds.
+
 @findex MODE_RANDOM
 @item MODE_RANDOM
 This is a catchall mode class for modes which don't fit into the above
diff --git a/gcc/genmodes.c b/gcc/genmodes.c
index dc38483..89174ec 100644
--- a/gcc/genmodes.c
+++ b/gcc/genmodes.c
@@ -333,6 +333,7 @@ complete_mode (struct mode_data *m)
   break;
 
 case MODE_INT:
+case MODE_BOUND:
 case MODE_FLOAT:
 case MODE_DECIMAL_FLOAT:
 case MODE_FRACT:
@@ -533,6 +534,18 @@ make_special_mode (enum mode_class cl, const char *name,
   new_mode (cl, name, file, line);
 }
 
+#define BOUND_MODE(N, Y) make_bound_mode (#N, Y, __FILE__, __LINE__)
+
+static void ATTRIBUTE_UNUSED
+make_bound_mode (const char *name,
+   unsigned int bytesize,
+   const char *file, unsigned int line)
+{
+  struct mode_data *m = new_mode (MODE_BOUND, name, file, line);
+  m->bytesize = bytesize;
+}
+
+
 #define INT_MODE(N, Y) FRACTIONAL_INT_MODE (N, -1U, Y)
 #define FRACTIONAL_INT_MODE(N, B, Y) \
   make_int_mode (#N, B, Y, __FILE__, __LINE__)
diff --git a/gcc/machmode.h b/gcc/machmode.h
index 981ee92..d4a20b2 100644
--- a/gcc/machmode.h
+++ b/gcc/machmode.h
@@ -174,6 +174,9 @@ extern const unsigned char mode_class[NUM_MACHINE_MODES];
|| CLASS == MODE_ACCUM  \
|| CLASS == MODE_UACCUM)
 
+#define BOUND_MODE_P(MODE)  \
+  (GET_MODE_CLASS (MODE) == MODE_BOUND)
+
 /* Get the size in bytes and bits of an object of mode MODE.  */
 
 extern CONST_MODE_SIZE unsigned char mode_size[NUM_MACHINE_MODES];
diff --git a/gcc/mode-classes.def b/gcc/mode-classes.def
index 7207ef7..c5ea215 100644
--- a/gcc/mode-classes.def
+++ b/gcc/mode-classes.def
@@ -21,6 +21,7 @@ along with GCC; see the file COPYING3.  If not see
   DEF_MODE_CLASS (MODE_RANDOM),        /* other */                        \
   DEF_MODE_CLASS (MODE_CC),            /* condition code in a register */ \
   DEF_MODE_CLASS (MODE_INT),           /* integer */                      \
+  DEF_MODE_CLASS (MODE_BOUND),         /* bounds */                       \
   DEF_MODE_CLASS (MODE_PARTIAL_INT),   /* integer with padding bits */    \
   DEF_MODE_CLASS (MODE_FRACT),         /* signed fractional number */     \
   DEF_MODE_CLASS (MODE_UFRACT),        /* unsigned fractional number */   \
diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index 6f6b310..82611c7 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -383,6 +383,7 @@ int_mode_for_mode (enum machine_mode mode)
 case MODE_VECTOR_ACCUM:
 case MODE_VECTOR_UFRACT:
 case MODE_VECTOR_UACCUM:
+case MODE_BOUND:
   mode = mode_for_size (GET_MODE_BITSIZE (mode), MODE_INT, 0);
   break;
 
@@ -2135,6 +2136,13 @@ layout_type (tree type)
   SET_TYPE_MODE (type, VOIDmode);
   break;
 
+case BOUND_TYPE:
+  SET_TYPE_MODE (type,
+ mode_for_size (TYPE_PRECISION (type), MODE_BOUND, 0));
+  TYPE_SIZE (type) = bitsize_int (GET_MODE_BITSIZE (TYPE_MODE (type)));
+  TYPE_SIZE_UNIT (type) = size_int (GET_MODE_SIZE (TYPE_MODE (type)));
+  break;
+
 case OFFSET_TYPE:
   TYPE_SIZE (type) = bitsize_int (POINTER_SIZE);
   TYPE_SIZE_UNIT (type) = size_int (POINTER_SIZE / BITS_PER_UNIT);
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 69e4006..8b0825c 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -697,6 +697,7 @@ dump_generic_node (pretty_printer *buffer, tree node, int spc, int flags,
   break;
 
 case VOID_TYPE:
+case BOUND_TYPE:
 case INTEGER_TYPE:
 case REAL_TYPE:
 case FIXED_POINT_TYPE:
diff --git a/gcc/tree.c b/gcc/tree.c
index b469b97..bbbe16e 

Re: New GCC options for loop vectorization

2013-09-17 Thread Richard Biener
On Mon, Sep 16, 2013 at 10:24 PM, Xinliang David Li davi...@google.com wrote:
 On Mon, Sep 16, 2013 at 3:13 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Fri, Sep 13, 2013 at 5:16 PM, Xinliang David Li davi...@google.com 
 wrote:
 On Fri, Sep 13, 2013 at 1:30 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Thu, Sep 12, 2013 at 10:31 PM, Xinliang David Li davi...@google.com 
 wrote:
 Currently -ftree-vectorize turns on both loop and slp vectorizations,
 but there is no simple way to turn on loop vectorization alone. The
 logic for default O3 setting is also complicated.

 In this patch, two new options are introduced:

 1) -ftree-loop-vectorize

 This option is used to turn on loop vectorization only. option
 -ftree-slp-vectorize also becomes a first class citizen, and no funny
 business of Init(2) is needed.  With this change, -ftree-vectorize
 becomes a simple alias to -ftree-loop-vectorize +
 -ftree-slp-vectorize.

 For instance, to turn on only slp vectorize at O3, the old way is:

  -O3 -fno-tree-vectorize -ftree-slp-vectorize

 With the new change it becomes:

 -O3 -fno-tree-loop-vectorize


 To turn on only loop vectorize at O2, the old way is

 -O2 -ftree-vectorize -fno-tree-slp-vectorize

 The new way is

 -O2 -ftree-loop-vectorize



 2) -ftree-vect-loop-peeling

 This option is used to turn on/off loop peeling for alignment.  In the
 long run, this should be folded into the cheap cost model proposed by
 Richard.  This option is also useful in scenarios where peeling can
 introduce runtime problems:
 http://gcc.gnu.org/ml/gcc/2005-12/msg00390.html  which happens to be
 common in practice.



 Patch attached. Compiler bootstrapped. Ok after testing?

 I'd like you to split 1) and 2), mainly because I agree on 1) but not on 
 2).

 Ok. Can you also comment on 2) ?

 I think we want to decide how granular we want to control the vectorizer
 and using which mechanism.  My cost-model re-org makes
 ftree-vect-loop-version a no-op (basically removes it), so 2) looks like
 a step backwards in this context.

 Using cost model to do a coarse grain control/configuration is
 certainly something we want, but having a fine grain control is still
 useful.


 So, can you summarize what pieces (including versioning) of the vectorizer
 you'd want to be able to disable separately?

 Loop peeling seems to be the main one. There is also a correctness
 issue related. For instance, the following code is common in practice,
 but loop peeling wrongly assumes initial base-alignment and generates
 aligned mov instruction after peeling, leading to SEGV.  Peeling is
 not something we can blindly turn on -- even when it is on, there
 should be a way to turn it off explicitly:

 char a[1];

 void foo(int n)
 {
   int* b = (int*)(a+n);
   int i = 0;
   for (; i < 1000; ++i)
     b[i] = 1;
 }

 int main(int argn, char** argv)
 {
   foo(argn);
 }

But that's just a bug that should be fixed (looking into it).

  Just disabling peeling for
 alignment may get you into the versioning for alignment path (and thus
 an unvectorized loop at runtime).

 This is not true for targets supporting misaligned access. I have not
 seen a case where alignment-driven loop versioning happens on x86.

Also, it's known that the alignment peeling
 code needs some serious TLC (its outcome depends on the order of DRs,
 the cost model it uses leaves much to be desired as we cannot distinguish
 between unaligned load and store costs).

 Yet another reason to turn it off as it is not effective anyways?

As said I'll disable all remains of -ftree-vect-loop-version with the cost model
patch because it wasn't guarding versioning for aliasing but only versioning
for alignment.

We have to be consistent here - if we add a way to disable peeling for
alignment then we certainly don't want to remove the ability to disable
versioning for alignment, no?

Richard.


 thanks,

 David


 Richard.


 I've stopped a quick try doing 1) myself because

 @@ -1691,6 +1695,12 @@ common_handle_option (struct gcc_options
         opts->x_flag_ipa_reference = false;
       break;

 +    case OPT_ftree_vectorize:
 +      if (!opts_set->x_flag_tree_loop_vectorize)
 +        opts->x_flag_tree_loop_vectorize = value;
 +      if (!opts_set->x_flag_tree_slp_vectorize)
 +        opts->x_flag_tree_slp_vectorize = value;
 +      break;

 doesn't look obviously correct.  Does that handle

   -ftree-vectorize -fno-tree-loop-vectorize -ftree-vectorize

 or

   -ftree-loop-vectorize -fno-tree-vectorize

 properly?  Currently at least

   -ftree-slp-vectorize -fno-tree-vectorize

 doesn't work.
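
 The ordering question can be explored with a small standalone model of
 the handler quoted above. This is a simplification, not GCC's option
 machinery: `struct flags`, `apply` and the `opt` enum are invented
 here, and the `*_set` fields play the role of `opts_set` by recording
 which sub-options were given explicitly.

```c
enum opt { OPT_VECTORIZE, OPT_LOOP_VECTORIZE, OPT_SLP_VECTORIZE };

struct flags
{
  int loop, slp;          /* current values of the two sub-options */
  int loop_set, slp_set;  /* was the sub-option given explicitly? */
};

/* Process one option in command-line order.  VALUE is 1 for -fX and
   0 for -fno-X.  */
void
apply (struct flags *f, enum opt o, int value)
{
  switch (o)
    {
    case OPT_LOOP_VECTORIZE:
      f->loop = value;
      f->loop_set = 1;
      break;
    case OPT_SLP_VECTORIZE:
      f->slp = value;
      f->slp_set = 1;
      break;
    case OPT_VECTORIZE:
      /* The umbrella option only fills in sub-options that were not
         given explicitly, wherever they appeared on the line.  */
      if (!f->loop_set)
        f->loop = value;
      if (!f->slp_set)
        f->slp = value;
      break;
    }
}
```

 In this model, `-ftree-loop-vectorize -fno-tree-vectorize` leaves loop
 vectorization on, because the later umbrella option does not override
 the earlier explicit one. Whether that is the desired reading of such
 a command line is exactly the question raised here.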


 Right -- same is true for -fprofile-use option. FDO enables some
 passes, but can not re-enable them if they are flipped off before.


 That said, the option machinery doesn't handle an option being an alias
 for two other options, so its mechanism to contract positives/negatives
 doesn't work here and the override hooks do not work reliably for
 repeated options.

 Or am I wrong here?  Should 

Re: [PATCH] Fix segfault with inlining

2013-09-17 Thread Richard Biener
On Tue, Sep 17, 2013 at 9:03 AM, Eric Botcazou ebotca...@adacore.com wrote:
 I've looked at the C++ testcase

 int foo (int x)
 {
   try {
     return x;
   }
   catch (...)
   {
     return 0;
   }
 }

 which exhibits exactly the behavior you quote - return x is considered
 throwing an exception.  The C++ FE doesn't arrange for TREE_THIS_NOTRAP to
 be set here (maybe due to this issue you quote?).

 I presume that you compiled with -fnon-call-exceptions?  Otherwise, I don't
 see how something that isn't a call can throw an exception in C++, it should
 be seen at most as possibly trapping, which is less blocking.

Yes, with -fnon-call-exceptions.

 Other than that the patch looks reasonable (I suppose you need
 is_parameter_of only because as we recursively handle the trees
 PARM_DECLs from the destination could already have leaked into
 the tree we recurse into?)

 Do you mean that the test on DECL_CONTEXT is superfluous?  Possibly indeed,
 but with nested functions you can have PARM_DECLs of different origins in a
 given function body, although this may be irrelevant for tree-inline.c.

Yeah, I thought testing for a PARM_DECL should be sufficient?  For
nested functions
references to outer parms should have been lowered via the static
chain at the point
tree-inline.c sees them.

So, if you agree that the DECL_CONTEXT test is superfluous the patch is ok
with the is_parameter_of function removed.

Thanks,
Richard.

 --
 Eric Botcazou


Fwd: GCC internals conditional execution macro?

2013-09-17 Thread Nicklas Bo Jensen
Hi,

I suggest removing the section "Macros to control conditional
execution" from the GCC internals manual. I assume the section is
obsolete, given that it is empty.

Best,
Nicklas

2013-09-17  Nicklas Bo Jensen  nbjen...@gmail.com
* doc/tm.texi (Macros to control conditional execution):
Remove empty section.

Index: gcc/doc/tm.texi
===
--- gcc/doc/tm.texi (revision 202626)
+++ gcc/doc/tm.texi (working copy)
@@ -6106,15 +6106,6 @@
 returns @code{VOIDmode}.
 @end deftypefn

-@node Cond Exec Macros
-@subsection Macros to control conditional execution
-@findex conditional execution
-@findex predication
-
-There is one macro that may need to be defined for targets
-supporting conditional execution, independent of how they
-represent conditional branches.
-
 @node Costs
 @section Describing Relative Costs of Operations
 @cindex costs of instructions



-- Forwarded message --
From: Andreas Schwab sch...@linux-m68k.org
Date: Mon, Sep 16, 2013 at 8:03 PM
Subject: Re: GCC internals conditional execution macro?
To: Nicklas Bo Jensen nbjen...@gmail.com
Cc: g...@gcc.gnu.org


Nicklas Bo Jensen nbjen...@gmail.com writes:

 In the GCC internals manual for GCC 4.8.1 and trunk, the section "Macros to
 control conditional execution" mentions that a macro exists, but does not
 name it. Which macro is meant here?

The macro has been removed in r188983 without removing the now empty
section.

Andreas.

--
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.


Re: [Patch] Implement regex_match and regex_search

2013-09-17 Thread Paolo Carlini

Hi,

On 09/15/2013 03:45 AM, Tim Shen wrote:

...finally.

This patch completes the flags specified in [28.5]. However, `optimize` and
`match_any` are ignored, and `format_*` are not yet implemented.

regex_iterator and regex_token_iterator should work now, but need more
testcases.

Great. Tim, please complete the testing on -m32 etc. If everything goes
well, just wait a day or so and commit.

Next, format string and regex_replace should be worked on.

I see...

Thanks again!
Paolo.



Re: New GCC options for loop vectorization

2013-09-17 Thread Richard Biener
On Tue, Sep 17, 2013 at 10:20 AM, Richard Biener
richard.guent...@gmail.com wrote:
 On Mon, Sep 16, 2013 at 10:24 PM, Xinliang David Li davi...@google.com 
 wrote:
 On Mon, Sep 16, 2013 at 3:13 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Fri, Sep 13, 2013 at 5:16 PM, Xinliang David Li davi...@google.com 
 wrote:
 On Fri, Sep 13, 2013 at 1:30 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Thu, Sep 12, 2013 at 10:31 PM, Xinliang David Li davi...@google.com 
 wrote:
 Currently -ftree-vectorize turns on both loop and slp vectorizations,
 but there is no simple way to turn on loop vectorization alone. The
 logic for default O3 setting is also complicated.

 In this patch, two new options are introduced:

 1) -ftree-loop-vectorize

 This option is used to turn on loop vectorization only. option
 -ftree-slp-vectorize also becomes a first class citizen, and no funny
 business of Init(2) is needed.  With this change, -ftree-vectorize
 becomes a simple alias to -ftree-loop-vectorize +
 -ftree-slp-vectorize.

 For instance, to turn on only slp vectorize at O3, the old way is:

  -O3 -fno-tree-vectorize -ftree-slp-vectorize

 With the new change it becomes:

 -O3 -fno-tree-loop-vectorize


 To turn on only loop vectorize at O2, the old way is

 -O2 -ftree-vectorize -fno-tree-slp-vectorize

 The new way is

 -O2 -ftree-loop-vectorize



 2) -ftree-vect-loop-peeling

 This option is used to turn on/off loop peeling for alignment.  In the
 long run, this should be folded into the cheap cost model proposed by
 Richard.  This option is also useful in scenarios where peeling can
 introduce runtime problems:
 http://gcc.gnu.org/ml/gcc/2005-12/msg00390.html  which happens to be
 common in practice.



 Patch attached. Compiler bootstrapped. Ok after testing?

 I'd like you to split 1) and 2), mainly because I agree on 1) but not on 
 2).

 Ok. Can you also comment on 2) ?

 I think we want to decide how granular we want to control the vectorizer
 and using which mechanism.  My cost-model re-org makes
 ftree-vect-loop-version a no-op (basically removes it), so 2) looks like
 a step backwards in this context.

 Using cost model to do a coarse grain control/configuration is
 certainly something we want, but having a fine grain control is still
 useful.


 So, can you summarize what pieces (including versioning) of the vectorizer
 you'd want to be able to disable separately?

 Loop peeling seems to be the main one. There is also a correctness
 issue related. For instance, the following code is common in practice,
 but loop peeling wrongly assumes initial base-alignment and generates
 aligned mov instruction after peeling, leading to SEGV.  Peeling is
 not something we can blindly turn on -- even when it is on, there
 should be a way to turn it off explicitly:

 char a[1];

 void foo(int n)
 {
   int* b = (int*)(a+n);
   int i = 0;
   for (; i < 1000; ++i)
 b[i] = 1;
 }

 int main(int argn, char** argv)
 {
   foo(argn);
 }

 But that's just a bug that should be fixed (looking into it).

Bug in the testcase.  b[i] asserts that b is aligned to 'int', so this invokes
undefined behavior if peeling cannot reach an alignment of 16.
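
To make the "cannot reach an alignment of 16" part concrete, here is a
minimal sketch. It is not the vectorizer's actual peeling code: peel_count
is a hypothetical helper, and it assumes the vector alignment is a multiple
of the element size.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of scalar iterations to peel so that addr + k * elem_size is
   aligned to vec_align, or -1 if no such k exists.
   Assumption: vec_align is a multiple of elem_size.  */
static long
peel_count (uintptr_t addr, size_t elem_size, size_t vec_align)
{
  /* Stepping by elem_size from an address that is not itself
     elem_size-aligned can never land on a vec_align boundary.  */
  if (addr % elem_size != 0)
    return -1;

  return (long) (((vec_align - addr % vec_align) % vec_align) / elem_size);
}
```

With 4-byte ints and 16-byte vectors, an address of 20 needs 3 peeled
iterations, while an address of 17 can never be fixed up by peeling, which
is the situation the testcase creates when n is not a multiple of 4.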

Richard.


Re: [PATCH, i386, MPX 1/X] Support of Intel MPX ISA. 2/2 New registers and instructions

2013-09-17 Thread Ilya Enkovich
On 16 Sep 11:24, Uros Bizjak wrote:
 On Fri, Sep 13, 2013 at 12:18 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
  2013/9/11 Uros Bizjak ubiz...@gmail.com:
 
 
  Hi Uros,
 
  Thanks a lot for the review!
 
  The x86 part looks mostly OK (I have a couple of comments below), but
  please first get target-independent changes reviewed and committed.
 
  Do you mean I should move bound type and mode declaration into a separate 
  patch?
 
 Yes, target-independent part (middle end) has to go through the
 separate review to check if this part is OK. The target-dependent part
 uses the infrastructure from the middle end, so it can go into the
 code base only after target-independent parts are committed.

I sent a separate patch for bound type and mode class 
(http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01268.html). Here is target part 
of the patch with fixes you mentioned. Does it look OK?

Bootstrapped and checked on linux-x86_64. Still shows incorrect length 
attribute computation (described here 
http://gcc.gnu.org/ml/gcc/2013-07/msg00311.html).

Thanks,
Ilya
--

2013-09-16  Ilya Enkovich  ilya.enkov...@intel.com

* config/i386/constraints.md (B): New.
(Ti): New.
(Tb): New.
* config/i386/i386-c.c (ix86_target_macros_internal): Add __MPX__.
* config/i386/i386-modes.def (BND32): New.
(BND64): New.
* config/i386/i386-protos.h (ix86_bnd_prefixed_insn_p): New.
* config/i386/i386.c (isa_opts): Add mmpx.
(regclass_map): Add bound registers.
(dbx_register_map): Likewise.
(dbx64_register_map): Likewise.
(svr4_dbx_register_map): Likewise.
(PTA_MPX): New.
(ix86_option_override_internal): Support MPX ISA.
(ix86_conditional_register_usage): Support bound registers.
(print_reg): Likewise.
(ix86_code_end): Add MPX bnd prefix.
(output_set_got): Likewise.
(ix86_output_call_insn): Likewise.
(ix86_print_operand): Add '!' (MPX bnd) print prefix support.
(ix86_print_operand_punct_valid_p): Likewise.
(ix86_print_operand_address): Support UNSPEC_BNDMK_ADDR and
UNSPEC_BNDLDX_ADDR.
(ix86_class_likely_spilled_p): Add bound regs support.
(ix86_hard_regno_mode_ok): Likewise.
(x86_order_regs_for_local_alloc): Likewise.
(ix86_bnd_prefixed_insn_p): New.
* config/i386/i386.h (FIRST_PSEUDO_REGISTER): Fix to new value.
(FIXED_REGISTERS): Add bound registers.
(CALL_USED_REGISTERS): Likewise.
(REG_ALLOC_ORDER): Likewise.
(HARD_REGNO_NREGS): Likewise.
(TARGET_MPX): New.
(VALID_BND_REG_MODE): New.
(FIRST_BND_REG): New.
(LAST_BND_REG): New.
(reg_class): Add BND_REGS.
(REG_CLASS_NAMES): Likewise.
(REG_CLASS_CONTENTS): Likewise.
(BND_REGNO_P): New.
(ANY_BND_REG_P): New.
(BNDmode): New.
(HI_REGISTER_NAMES): Add bound registers.
* config/i386/i386.md (UNSPEC_BNDMK): New.
(UNSPEC_BNDMK_ADDR): New.
(UNSPEC_BNDSTX): New.
(UNSPEC_BNDLDX): New.
(UNSPEC_BNDLDX_ADDR): New.
(UNSPEC_BNDCL): New.
(UNSPEC_BNDCU): New.
(UNSPEC_BNDCN): New.
(UNSPEC_MPX_FENCE): New.
(BND0_REG): New.
(BND1_REG): New.
(type): Add mpxmov, mpxmk, mpxchk, mpxld, mpxst.
(length_immediate): Likewise.
(prefix_0f): Likewise.
(memory): Likewise.
(prefix_rep): Check for bnd prefix.
(BND): New.
(bnd_ptr): New.
(BNDCHECK): New.
(bndcheck): New.
(*jcc_1): Add MPX bnd prefix and fix length.
(*jcc_2): Likewise.
(jump): Likewise.
(simple_return_internal): Likewise.
(simple_return_pop_internal): Likewise.
(*indirect_jump): Add MPX bnd prefix.
(*tablejump_1): Likewise.
(simple_return_internal_long): Likewise.
(simple_return_indirect_internal): Likewise.
(mode_mk): New.
(*mode_mk): New.
(movmode): New.
(*movmode_internal_mpx): New.
(mode_bndcheck): New.
(*mode_bndcheck): New.
(mode_ldx): New.
(*mode_ldx): New.
(mode_stx): New.
(*mode_stx): New.
* config/i386/predicates.md (lea_address_operand): Rename to...
(address_no_seg_operand): ... this.
(address_mpx_no_base_operand): New.
(address_mpx_no_index_operand): New.
(bnd_mem_operator): New.
* config/i386/i386.opt (mmpx): New.
* doc/invoke.texi: Add documentation for the flags -mmpx, -mno-mpx.
* doc/rtl.texi: Add documentation for BND32mode and BND64mode.
diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md
index 28e626f..79d02f7 100644
--- a/gcc/config/i386/constraints.md
+++ b/gcc/config/i386/constraints.md
@@ -18,7 +18,7 @@
 ;; http://www.gnu.org/licenses/.
 
 ;;; Unused letters:
-;;; B H   T
+;;;  

Re: [x86,PATCH] Simple fix for Atom LEA splitting.

2013-09-17 Thread Yuri Rumyantsev
Here is a final patch with fixed commentary.

2013/9/16 Uros Bizjak ubiz...@gmail.com:
 On Mon, Sep 16, 2013 at 5:01 PM, Yuri Rumyantsev ysrum...@gmail.com wrote:

 Does this comment look good to you:

   if (start != NULL_RTX)
 {
   bb = BLOCK_FOR_INSN (start);
   if (start != BB_HEAD (bb))
 /* Initialize prev to insn if insn and start belong to the same bb;
   in this case increase_distance can increment distance to 1.  */
 prev = insn;

 I'd say something along the lines of:

 If insn and start belong to the same bb, set prev to insn, so the call
 to increase_distance will increase the distance between insns by 1.

 Best regards,
 Uros.


fixed_patch
Description: Binary data


Re: [PATCH] Fix segfault with inlining

2013-09-17 Thread Eric Botcazou
 Yeah, I thought testing for a PARM_DECL should be sufficient?  For
 nested functions references to outer parms should have been lowered via the
 static chain at the point tree-inline.c sees them.

OK for the latter point, but are you sure for the former?  My understanding is 
that we're already in SSA form, so parameters can be represented by SSA_NAMEs 
without defining statements.

-- 
Eric Botcazou


Re: [PATCH] Fix segfault with inlining

2013-09-17 Thread Jakub Jelinek
On Fri, Sep 13, 2013 at 04:29:48PM +0200, Eric Botcazou wrote:
 @@ -4748,6 +4774,8 @@ copy_gimple_seq_and_replace_locals (gimp
id.transform_call_graph_edges = CB_CGE_DUPLICATE;
id.transform_new_cfg = false;
id.transform_return_to_modify = false;
 +  id.transform_parameter = false;
 +  id.transform_parameter = false;
id.transform_lang_insert_block = NULL;
  
/* Walk the tree once to find local labels.  */

Why are you storing the same thing twice?

Jakub


Re: [PATCH] Fix segfault with inlining

2013-09-17 Thread Richard Biener
On Tue, Sep 17, 2013 at 10:42 AM, Eric Botcazou ebotca...@adacore.com wrote:
 Yeah, I thought testing for a PARM_DECL should be sufficient?  For
 nested functions references to outer parms should have been lowered via the
 static chain at the point tree-inline.c sees them.

 OK for the latter point, but are you sure for the former?  My understanding is
 that we're already in SSA form, so parameters can be represented by SSA_NAMEs
 without defining statements.

That's true...  so you can only simplify is_parameter_of by dropping
the context check.

Richard.

 --
 Eric Botcazou


Re: [PATCH] Handle loops with control flow in loop-distribution

2013-09-17 Thread Andreas Schwab
Installed as obvious.

Andreas.

* gcc.dg/tree-ssa/ldist-22.c (main): Return zero.

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-22.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ldist-22.c
index f6fff77..afc792f 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ldist-22.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-22.c
@@ -25,7 +25,7 @@ int main()
 abort ();
   if (a[0] != 0 || a[101] != 0)
 abort ();
-  return;
+  return 0;
 }
 
 /* { dg-final { scan-tree-dump "generated memset zero" "ldist" } } */
-- 
1.8.4

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
And now for something completely different.


RE: [PATCH, PR 57748] Check for out of bounds access

2013-09-17 Thread Bernd Edlinger
Hi Martin,

On Tue, 17 Sep 2013 01:09:45, Martin Jambor wrote:
 Hi,

 On Sun, Sep 15, 2013 at 06:55:17PM +0200, Bernd Edlinger wrote:
 Hello Richard,

 attached is my second attempt at fixing PR 57748. This time the
 movmisalign path is completely removed and a similar bug in the read
 handling of misaligned structures with a non-BLKmode is fixed
 too. There are several new test cases for the different possible
 failure modes.

 I think the third and fourth testcases are undefined as the
 description of zero-length arrays extension clearly says the whole
 thing only makes sense when used as the last field of the
 outermost-aggregate type. I have not really understood what the third
 testcase is supposed to test but I did not try too much. Instead of
 the fourth testcase, you can demonstrate the need for your change in
 expand_expr_real_1 by augmenting the original testcase a little like
 in attached pr57748-m1.c.

The third test case tries to demonstrate the possible write data store
race (by checking the assembler output). But you are right, this example
is probably not valid C at all.

I was actually worried about unions with non-BLK mode and
a movmisalign optab handler.

When you look at stor-layout.c (compute_record_mode)
you'll see, that in the case of a union usually an integer mode is chosen,
which is exactly the same size as the whole union.
And just by chance this does not have a movmisalign optab.

Therefore I tried to cheat with that zero-sized array, which should
probably be rejected at stor-layout.c in the first place.

When I tried to make a test case out of it, the bug on the read side hit
me as a total surprise...

 The hunk in expand_expr_real_1 can prove problematic if at any point
 we need to pass some other modifier to the expansion of tem. I'll try
 to see if I can come up with a testcase tomorrow. But perhaps we
 never do (and can hope we never will) and then it would be sort of
 OKish (note that I cannot approve anything) even though it can
 pessimize unaligned access paths (by not using movmisalign_optab even
 when perfectly possible - which is always when there is no zero sized
 array). It really just shows how evil non-BLKmode structures with
 zero-sized arrays are and how they complicate things. The expansion
 of component_refs is reasonably built around the assumption that we'd
 expand the structure in its mode in the most efficient manner and then
 chuck the correct part out of it, but here we need to tell the
 expansion of the structure to hold itself back because we'll be
 looking outside of the structure (as specified by mode).

I too am under the very strong impression that this was not the intention
of the design to use a non-BLKmode on a structure with zero-sized arrays.

 I'm not sure to what extent the hunk adding tests for bitregion_start
 and bitregion_end being zero is connected to this issue. I do not see
 any of the testcases exercising that path. If it is indeed another
 problem, I think it should be submitted (and potentially committed) as
 a separate patch, preferably with a testcase.

Yes, you're probably right. I was unable to find a test case where this
code path executes with bitregions. As I said, it may be possible to prove that
bitregion_start and bitregion_end are 0 if the other conditions are satisfied.
What is obvious is that setting bitpos=0 would cause problems when
bitregion_start/end is pointing elsewhere.
It is however much easier to prove that not going into that code path
would not cause any problems if bitregion_start/end is not zero.

So this was just for safer programming, but probably no real bug.


Thanks,
Bernd.

 Having said all that, I think that removing the misalignp path from
 expand_assignment altogether is a good idea. I have verified that
 when the expander is now presented with basically the same thing that
 4.7 choked on, expand_expr (..., EXPAND_WRITE) can cope with it (see
 attached file c.c) and doing that simplifies this complex code path.

 Thanks,

 Martin


 This patch was boot-strapped and regression tested on  
 x86_64-unknown-linux-gnu
 and i686-pc-linux-gnu.

 Additionally I generated eCos and an eCos-application (on ARMv5 using packed
 structures) with an arm-eabi cross compiler, and looked for differences in 
 the
 disassembled code with and without this patch, but there were none.

 OK for trunk?

 Regards
 Bernd.

 2013-09-15 Bernd Edlinger bernd.edlin...@hotmail.de

 PR middle-end/57748
 * expr.c (expand_assignment): Remove misalignp code path.
 Check for bitregion in offset arithmetic.
 (expand_expr_real_1): Use EXPAND_MEMORY on base object.

 testsuite:

 PR middle-end/57748
 * gcc.dg/torture/pr57748-1.c: New test.
 * gcc.dg/torture/pr57748-2.c: New test.
 * gcc.dg/torture/pr57748-3.c: New test.
 * gcc.dg/torture/pr57748-3a.c: New test.
 * gcc.dg/torture/pr57748-4.c: New test.
 * gcc.dg/torture/pr57748-4a.c: New test.


 

Re: Fwd: GCC internals conditional execution macro?

2013-09-17 Thread Marek Polacek
On Tue, Sep 17, 2013 at 10:35:22AM +0200, Nicklas Bo Jensen wrote:
 Hi,
 
 Let me suggest to remove the section Macros to control conditional
 execution in GCC internals. I assume the section is obsolete given
 that it is empty.
 
 Best,
 Nicklas
 
 2013-09-17  Nicklas Bo Jensen  nbjen...@gmail.com
 * doc/tm.texi (Macros to control conditional execution):

Hasn't this been already removed by
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01231.html
?

Marek


Re: [PATCH, PR 57748] Check for out of bounds access

2013-09-17 Thread Richard Biener
On Sun, Sep 15, 2013 at 6:55 PM, Bernd Edlinger
bernd.edlin...@hotmail.de wrote:
 Hello Richard,

 attached is my second attempt at fixing PR 57748. This time the movmisalign
 path is completely removed and a similar bug in the read handling of 
 misaligned
 structures with a non-BLKmode is fixed too. There are several new test cases 
 for the
 different possible failure modes.

 This patch was boot-strapped and regression tested on  
 x86_64-unknown-linux-gnu
 and i686-pc-linux-gnu.

 Additionally I generated eCos and an eCos-application (on ARMv5 using packed
 structures) with an arm-eabi cross compiler, and looked for differences in the
 disassembled code with and without this patch, but there were none.

 OK for trunk?

I agree that the existing movmisalign path that you remove is simply bogus, so
removing it looks fine to me.  Can you give rationale to

@@ -4773,6 +4738,8 @@ expand_assignment (tree to, tree from, b
   if (MEM_P (to_rtx)
       && GET_MODE (to_rtx) == BLKmode
       && GET_MODE (XEXP (to_rtx, 0)) != VOIDmode
+      && bitregion_start == 0
+      && bitregion_end == 0
       && bitsize > 0
       && (bitpos % bitsize) == 0
       && (bitsize % GET_MODE_ALIGNMENT (mode1)) == 0

and especially to

@@ -9905,7 +9861,7 @@ expand_expr_real_1 (tree exp, rtx target
   modifier != EXPAND_STACK_PARM
  ? target : NULL_RTX),
 VOIDmode,
-modifier == EXPAND_SUM ? EXPAND_NORMAL : modifier);
+EXPAND_MEMORY);

/* If the bitfield is volatile, we want to access it in the
   field's mode, not the computed mode.

which AFAIK makes memory expansion of loads/stores from/to registers
change (fail? go through stack memory?) - see handling of non-MEM return
values from that expand_expr call.

That is, do you see anything break with just removing the movmisalign path?
I'd rather install that (with the new testcases that then pass) separately as
this is a somewhat fragile area and being able to more selectively
bisect/backport
would be nice.

Thanks,
Richard.

 Regards
 Bernd.


Re: [PATCH] Don't always instrument shifts (PR sanitizer/58413)

2013-09-17 Thread Marek Polacek
On Mon, Sep 16, 2013 at 08:35:35PM +0200, Jakub Jelinek wrote:
 On Fri, Sep 13, 2013 at 08:01:36PM +0200, Marek Polacek wrote:
 I'd say the above is going to be a maintenance nightmare, with all the code
 duplication.  And you are certainly going to miss cases that way,
 e.g.
 void
 foo (void)
 {
   int A[-2 / -1] = {};
 }
 
 I'd say instead of adding all this, you should just at the right spot insert
 if (integer_zerop (t)) return NULL_TREE; or similar.
 
 For shift instrumentation, I guess you could add
 if (integer_zerop (t) && (tt == NULL_TREE || integer_zerop (tt)))
   return NULL_TREE;
 right before:
   t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (t), op0, t);

Yeah, this is _much_ better.  I'm glad we can live without that
ugliness.
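
To spell out why the "folded to 0" test also covers Jakub's -2 / -1
example, here is a rough standalone model of the conditions. It is only a
sketch: the helper names are made up, the real code builds trees with
fold_build2 rather than evaluating ints, and the extra left-shift overflow
check is left out.

```c
#include <limits.h>
#include <stdbool.h>

/* Models the division check: true when op0 / op1 is undefined for
   signed int operands.  */
static bool
div_is_undefined (int op0, int op1)
{
  return op1 == 0 || (op0 == INT_MIN && op1 == -1);
}

/* Models the shift-count check: true when the count is negative or
   not smaller than the width of the shifted type.  */
static bool
shift_count_is_undefined (long long count, unsigned width_bits)
{
  return count < 0 || count >= (long long) width_bits;
}
```

For constant operands the whole condition is known at compile time, so
-2 / -1 yields false (nothing to instrument) even though the divisor is -1,
which is the case the old divisor-only test did not handle.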

  +/* PR sanitizer/58413 */
  +/* { dg-do run } */
  +/* { dg-options -fsanitize=shift -w } */
  +
  +int x = 7;
  +int
  +main (void)
  +{
  +  /* All of the following should pass.  */
  +  int A[128  5] = {};
  +  int B[128  5] = {};
  +
  +  static int e =
  +((int)
  + (0x | ((31  ((1  (4)) - 1))  (((15) + 6) + 4)) |
  +  ((0)  ((15) + 6)) | ((0)  (15;
 
 This relies on int32plus, so needs to be /* { dg-do run { target int32plus } 
 } */

Fixed.

  --- gcc/testsuite/c-c++-common/ubsan/shift-5.c.mp3  2013-09-13 
  18:25:06.195847144 +0200
  +++ gcc/testsuite/c-c++-common/ubsan/shift-5.c  2013-09-13 
  19:16:38.990211229 +0200
  @@ -0,0 +1,21 @@
  +/* { dg-do compile} */
  +/* { dg-options -fsanitize=shift -w } */
  +/* { dg-shouldfail ubsan } */
  +
  +int x;
  +int
  +main (void)
  +{
  +  /* None of the following should pass.  */
  +  switch (x)
  +{
  +case 1  -1:  /* { dg-error  } */
  +case -1  -1: /* { dg-error  } */
  +case 1  -1:  /* { dg-error  } */
  +case -1  -1: /* { dg-error  } */
  +case -1  200:/* { dg-error  } */
  +case 1  200: /* { dg-error  } */
 
 Can't you fill in the error you are expecting, or is the problem
 that the wording is very different between C and C++?

I discovered { target c } stuff, so I filled in both error messages.

This patch seems to work: bootstrap-ubsan passes + ubsan testsuite
passes too.  Ok for trunk?

2013-09-17  Marek Polacek  pola...@redhat.com
Jakub Jelinek  ja...@redhat.com

PR sanitizer/58413
c-family/
* c-ubsan.c (ubsan_instrument_shift): Don't instrument
an expression if we can prove it is correct.
(ubsan_instrument_division): Likewise.  Remove unnecessary
check.

testsuite/
* c-c++-common/ubsan/shift-4.c: New test.
* c-c++-common/ubsan/shift-5.c: New test.
* c-c++-common/ubsan/div-by-zero-5.c: New test.
* gcc.dg/ubsan/c-shift-1.c: New test.

--- gcc/c-family/c-ubsan.c.mp   2013-09-17 12:24:44.582835840 +0200
+++ gcc/c-family/c-ubsan.c  2013-09-17 12:24:48.772849823 +0200
@@ -51,14 +51,6 @@ ubsan_instrument_division (location_t lo
   if (TREE_CODE (type) != INTEGER_TYPE)
 return NULL_TREE;
 
-  /* If we *know* that the divisor is not -1 or 0, we don't have to
- instrument this expression.
- ??? We could use decl_constant_value to cover up more cases.  */
-  if (TREE_CODE (op1) == INTEGER_CST
-      && integer_nonzerop (op1)
-      && !integer_minus_onep (op1))
-    return NULL_TREE;
-
   t = fold_build2 (EQ_EXPR, boolean_type_node,
op1, build_int_cst (type, 0));
 
@@ -74,6 +66,11 @@ ubsan_instrument_division (location_t lo
   t = fold_build2 (TRUTH_OR_EXPR, boolean_type_node, t, x);
 }
 
+  /* If the condition was folded to 0, no need to instrument
+ this expression.  */
+  if (integer_zerop (t))
+return NULL_TREE;
+
   /* In case we have a SAVE_EXPR in a conditional context, we need to
  make sure it gets evaluated before the condition.  */
   t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (t), op0, t);
@@ -138,6 +135,11 @@ ubsan_instrument_shift (location_t loc,
   tt = fold_build2 (TRUTH_OR_EXPR, boolean_type_node, x, tt);
 }
 
+  /* If the condition was folded to 0, no need to instrument
+ this expression.  */
+  if (integer_zerop (t) && (tt == NULL_TREE || integer_zerop (tt)))
+return NULL_TREE;
+
   /* In case we have a SAVE_EXPR in a conditional context, we need to
  make sure it gets evaluated before the condition.  */
   t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (t), op0, t);
--- gcc/testsuite/c-c++-common/ubsan/shift-4.c.mp   2013-09-17 
12:25:12.130931875 +0200
+++ gcc/testsuite/c-c++-common/ubsan/shift-4.c  2013-09-17 10:19:44.665199565 
+0200
@@ -0,0 +1,30 @@
+/* PR sanitizer/58413 */
+/* { dg-do run { target int32plus } } */
+/* { dg-options -fsanitize=shift -w } */
+
+int x = 7;
+int
+main (void)
+{
+  /* All of the following should pass.  */
+  int A[128  5] = {};
+  int B[128  5] = {};
+
+  static int e =
+((int)
+ (0x | ((31  ((1  (4)) - 1))  (((15) + 6) + 4)) |
+  ((0)  ((15) + 6)) | ((0)  (15;
+
+  if (e != 503316480)
+__builtin_abort ();
+
+  switch (x)
+ 

Re: [PATCH, PR 57748] Check for out of bounds access

2013-09-17 Thread Richard Biener
On Tue, Sep 17, 2013 at 12:00 PM, Richard Biener
richard.guent...@gmail.com wrote:
 On Sun, Sep 15, 2013 at 6:55 PM, Bernd Edlinger
 bernd.edlin...@hotmail.de wrote:
 Hello Richard,

 attached is my second attempt at fixing PR 57748. This time the movmisalign
 path is completely removed and a similar bug in the read handling of 
 misaligned
 structures with a non-BLKmode is fixed too. There are several new test cases 
 for the
 different possible failure modes.

 This patch was boot-strapped and regression tested on  
 x86_64-unknown-linux-gnu
 and i686-pc-linux-gnu.

 Additionally I generated eCos and an eCos-application (on ARMv5 using packed
 structures) with an arm-eabi cross compiler, and looked for differences in 
 the
 disassembled code with and without this patch, but there were none.

 OK for trunk?

 I agree that the existing movmisalign path that you remove is simply bogus, so
 removing it looks fine to me.  Can you give rationale to

 @@ -4773,6 +4738,8 @@ expand_assignment (tree to, tree from, b
    if (MEM_P (to_rtx)
        && GET_MODE (to_rtx) == BLKmode
        && GET_MODE (XEXP (to_rtx, 0)) != VOIDmode
 +      && bitregion_start == 0
 +      && bitregion_end == 0
        && bitsize > 0
        && (bitpos % bitsize) == 0
        && (bitsize % GET_MODE_ALIGNMENT (mode1)) == 0

 and especially to

 @@ -9905,7 +9861,7 @@ expand_expr_real_1 (tree exp, rtx target
modifier != EXPAND_STACK_PARM
   ? target : NULL_RTX),
  VOIDmode,
 -modifier == EXPAND_SUM ? EXPAND_NORMAL : modifier);
 +EXPAND_MEMORY);

 /* If the bitfield is volatile, we want to access it in the
field's mode, not the computed mode.

 which AFAIK makes memory expansion of loads/stores from/to registers
 change (fail? go through stack memory?) - see handling of non-MEM return
 values from that expand_expr call.

In particular this seems to disable all movmisalign handling for MEM_REFs
wrapped in component references which looks wrong.  I was playing with

typedef long long V
  __attribute__ ((vector_size (2 * sizeof (long long)), may_alias));

struct S { long long a[11]; V v; }__attribute__((aligned(8),packed)) ;
struct S a, *b = &a;
V v, w;

int main()
{
  v = b->v;
  b->v = w;
  return 0;
}

(use -fno-common) and I see that we use unaligned stores too often
(even with a properly aligned MEM).

The above at least shows movmisalign opportunities wrapped in component-refs.
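
As a quick sanity check on why v is a misaligned vector in this struct,
here is a minimal sketch using the same declarations. v_misalignment is a
hypothetical helper, and the numbers assume an 8-byte long long and the GCC
vector_size/packed extensions.

```c
#include <stddef.h>

typedef long long V
  __attribute__ ((vector_size (2 * sizeof (long long)), may_alias));

struct S { long long a[11]; V v; } __attribute__ ((aligned (8), packed));

/* Offset of the vector member modulo the vector's natural size.
   A nonzero result means v is not naturally aligned whenever the
   struct itself starts on a 16-byte boundary.  */
static size_t
v_misalignment (void)
{
  return offsetof (struct S, v) % sizeof (V);
}
```

With 11 long longs in front of it, v sits at offset 88, which is 8 bytes
past a 16-byte boundary, so only 8-byte alignment is guaranteed for the
vector access.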

 That is, do you see anything break with just removing the movmisalign path?
 I'd rather install that (with the new testcases that then pass) separately as
 this is a somewhat fragile area and being able to more selectively
 bisect/backport
 would be nice.

 Thanks,
 Richard.

 Regards
 Bernd.


Re: [PATCH] Fix segfault with inlining

2013-09-17 Thread Eric Botcazou
 That's true...  so you can only simplify is_parameter_of by dropping
 the context check.

OK, thanks, installed with this modification and the fix for the oversight 
spotted by Jakub, after retesting on x86-64/Linux.

-- 
Eric Botcazou


Use CreateSemaphoreW instead of CreateSemaphoreA in libgcc.

2013-09-17 Thread Jacek Caban
This is no-op for usual GCC targets, because we don't pass any string to
CreateSemaphore anyway. However this trivial change will help
mingw-w64's efforts to support WinRT, where only unicode variant is
available.

libgcc/Changelog:
config/i386/gthr-win32.c: CreateSemaphoreW instead of CreateSemaphoreA.
config/i386/gthr-win32.h: Likewise.



Re: Use CreateSemaphoreW instead of CreateSemaphoreA in libgcc.

2013-09-17 Thread Kai Tietz
2013/9/17 Jacek Caban cja...@gmail.com:
 This is no-op for usual GCC targets, because we don't pass any string to
 CreateSemaphore anyway. However this trivial change will help
 mingw-w64's efforts to support WinRT, where only unicode variant is
 available.

 libgcc/Changelog:
 config/i386/gthr-win32.c: CreateSemaphoreW instead of CreateSemaphoreA.
 config/i386/gthr-win32.h: Likewise.


Please attach (or inline) patch.

Thanks,
Kai


Re: Fwd: GCC internals conditional execution macro?

2013-09-17 Thread Nicklas Bo Jensen
 Hasn't this been already removed by
 http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01231.html
 ?

Yes. Okay. Please ignore then.

Best,
Nicklas


Re: Use CreateSemaphoreW instead of CreateSemaphoreA in libgcc.

2013-09-17 Thread Jacek Caban
On 09/17/13 13:41, Kai Tietz wrote:
 2013/9/17 Jacek Caban cja...@gmail.com:
 This is no-op for usual GCC targets, because we don't pass any string to
 CreateSemaphore anyway. However this trivial change will help
 mingw-w64's efforts to support WinRT, where only unicode variant is
 available.

 libgcc/Changelog:
 config/i386/gthr-win32.c: CreateSemaphoreW instead of CreateSemaphoreA.
 config/i386/gthr-win32.h: Likewise.

 Please attach (or inline) patch.

It's attached now, sorry.

Jacek

commit eea3738e6103da1d1bc391b99734c93737d292a4
Author: Jacek Caban ja...@codeweavers.com
Date:   Tue May 7 17:22:01 2013 +0200

Use CreateSemaphoreW instead of CreateSemaphoreA in libgcc.

libgcc/Changelog:
config/i386/gthr-win32.c: CreateSemaphoreW instead of CreateSemaphoreA.
config/i386/gthr-win32.h: Likewise.

diff --git a/libgcc/config/i386/gthr-win32.c b/libgcc/config/i386/gthr-win32.c
index f6f661a..f323031 100644
--- a/libgcc/config/i386/gthr-win32.c
+++ b/libgcc/config/i386/gthr-win32.c
@@ -147,7 +147,7 @@ void
 __gthr_win32_mutex_init_function (__gthread_mutex_t *mutex)
 {
   mutex->counter = -1;
-  mutex->sema = CreateSemaphore (NULL, 0, 65535, NULL);
+  mutex->sema = CreateSemaphoreW (NULL, 0, 65535, NULL);
 }
 
 void
@@ -195,7 +195,7 @@ __gthr_win32_recursive_mutex_init_function 
(__gthread_recursive_mutex_t *mutex)
   mutex->counter = -1;
   mutex->depth = 0;
   mutex->owner = 0;
-  mutex->sema = CreateSemaphore (NULL, 0, 65535, NULL);
+  mutex->sema = CreateSemaphoreW (NULL, 0, 65535, NULL);
 }
 
 int
diff --git a/libgcc/config/i386/gthr-win32.h b/libgcc/config/i386/gthr-win32.h
index d2e729a..1e437fc 100644
--- a/libgcc/config/i386/gthr-win32.h
+++ b/libgcc/config/i386/gthr-win32.h
@@ -635,7 +635,7 @@ static inline void
 __gthread_mutex_init_function (__gthread_mutex_t *__mutex)
 {
   __mutex->counter = -1;
-  __mutex->sema = CreateSemaphore (NULL, 0, 65535, NULL);
+  __mutex->sema = CreateSemaphoreW (NULL, 0, 65535, NULL);
 }
 
 static inline void
@@ -697,7 +697,7 @@ __gthread_recursive_mutex_init_function 
(__gthread_recursive_mutex_t *__mutex)
   __mutex->counter = -1;
   __mutex->depth = 0;
   __mutex->owner = 0;
-  __mutex->sema = CreateSemaphore (NULL, 0, 65535, NULL);
+  __mutex->sema = CreateSemaphoreW (NULL, 0, 65535, NULL);
 }
 
 static inline int


Re: Use CreateSemaphoreW instead of CreateSemaphoreA in libgcc.

2013-09-17 Thread Kai Tietz
Hi Jacek,

I applied patch at rev. 202648 with following ChangeLog

2013-09-17  Jacek Caban

* config/i386/gthr-win32.c: CreateSemaphoreW instead of
CreateSemaphoreA.
* config/i386/gthr-win32.h: Likewise.

The wide variant is OK in general, because we no longer support any Windows
version that lacks the wide API.

Thanks,
Kai


Re: [x86,PATCH] Simple fix for Atom LEA splitting.

2013-09-17 Thread Kirill Yukhin
Hello,
On 16 Sep 16:36, Uros Bizjak wrote:
 The patch with a fixed comment is OK otherwise.

Checked into main trunk: http://gcc.gnu.org/ml/gcc-cvs/2013-09/msg00512.html

--
Thanks, K


RE: [PATCH, PR 57748] Check for out of bounds access

2013-09-17 Thread Bernd Edlinger
On Tue, 17 Sep 2013 12:45:40, Richard Biener wrote:

 On Tue, Sep 17, 2013 at 12:00 PM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Sun, Sep 15, 2013 at 6:55 PM, Bernd Edlinger
 bernd.edlin...@hotmail.de wrote:
 Hello Richard,

 attached is my second attempt at fixing PR 57748. This time the movmisalign
 path is completely removed and a similar bug in the read handling of 
 misaligned
 structures with a non-BLKmode is fixed too. There are several new test 
 cases for the
 different possible failure modes.

 This patch was boot-strapped and regression tested on 
 x86_64-unknown-linux-gnu
 and i686-pc-linux-gnu.

 Additionally I generated eCos and an eCos-application (on ARMv5 using packed
 structures) with an arm-eabi cross compiler, and looked for differences in 
 the
 disassembled code with and without this patch, but there were none.

 OK for trunk?

 I agree that the existing movmisalign path that you remove is simply bogus, 
 so
 removing it looks fine to me. Can you give rationale to

 @@ -4773,6 +4738,8 @@ expand_assignment (tree to, tree from, b
  if (MEM_P (to_rtx)
      && GET_MODE (to_rtx) == BLKmode
      && GET_MODE (XEXP (to_rtx, 0)) != VOIDmode
 +    && bitregion_start == 0
 +    && bitregion_end == 0
      && bitsize > 0
      && (bitpos % bitsize) == 0
      && (bitsize % GET_MODE_ALIGNMENT (mode1)) == 0

OK, as already said, I think it could be dangerous to set bitpos=0 without
considering bitregion_start/end, but I think it may be possible that this
cannot happen, because if bitsize is a multiple of ALIGNMENT, and
bitpos is a multiple of bitsize, we probably do not have a bit-field at all.
And of course I have no test case that fails without this hunk.
Maybe it would be better to add an assertion here like:

  {
  gcc_assert (bitregion_start == 0 && bitregion_end == 0);
  to_rtx = adjust_address (to_rtx, mode1, bitpos / BITS_PER_UNIT);
  bitpos = 0;
  }
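
A standalone model of the guard may help when reasoning about it. This is
only a sketch under simplifying assumptions: the helper names are made up,
plain longs stand in for the HOST_WIDE_INT values, BITS_PER_UNIT is taken
to be 8, and the MEM_P/BLKmode parts of the real condition are ignored.

```c
#include <stdbool.h>

#define BITS_PER_UNIT 8

/* Models the arithmetic part of the condition in expand_assignment:
   the access is a whole number of aligned units at a position that is
   a multiple of its own size.  */
static bool
is_whole_aligned_access (long bitpos, long bitsize, long mode_align_bits)
{
  return bitsize > 0
         && bitpos % bitsize == 0
         && bitsize % mode_align_bits == 0;
}

/* Byte offset that adjust_address would be given in that case.  */
static long
byte_offset (long bitpos)
{
  return bitpos / BITS_PER_UNIT;
}
```

For example, a 32-bit access at bit position 64 with 32-bit mode alignment
passes the test and is rewritten to byte offset 8, whereas a 5-bit field at
bit position 3 does not, so a genuine bit-field should not reach the
assertion.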

 and especially to

 @@ -9905,7 +9861,7 @@ expand_expr_real_1 (tree exp, rtx target
  modifier != EXPAND_STACK_PARM
 ? target : NULL_RTX),
 VOIDmode,
 - modifier == EXPAND_SUM ? EXPAND_NORMAL : modifier);
 + EXPAND_MEMORY);

 /* If the bitfield is volatile, we want to access it in the
 field's mode, not the computed mode.

 which AFAIK makes memory expansion of loads/stores from/to registers
 change (fail? go through stack memory?) - see handling of non-MEM return
 values from that expand_expr call.

I wanted to make the expansion of MEM_REF and TARGET_MEM_REF
not go thru the final misalign handling, which is guarded by
if (modifier != EXPAND_WRITE && modifier != EXPAND_MEMORY && ...

What we want here is most likely EXPAND_MEMORY, which returns a
memory context if possible.

Could you specify more explicitly what you mean with handling of non-MEM return
values from that expand_expr call, then I could try finding test cases for
that.


 In particular this seems to disable all movmisalign handling for MEM_REFs
 wrapped in component references which looks wrong. I was playing with

 typedef long long V
 __attribute__ ((vector_size (2 * sizeof (long long)), may_alias));

 struct S { long long a[11]; V v; }__attribute__((aligned(8),packed)) ;
 struct S a, *b = &a;
 V v, w;

 int main()
 {
 v = b->v;
 b->v = w;
 return 0;
 }

 (use -fno-common) and I see that we use unaligned stores too often
 (even with a properly aligned MEM).

 The above at least shows movmisalign opportunities wrapped in component-refs.

hmm, interesting. This does not compile differently with or without this patch.

I have another observation, regarding the testcase pr50444.c:

method:
.LFB4:
    .cfi_startproc
    movq    32(%rdi), %rax
    testq   %rax, %rax
    jne .L7
    addl    $1, 16(%rdi)
    movl    $3, %eax
    movq    %rax, 32(%rdi)
    movdqu  16(%rdi), %xmm0
    pxor    (%rdi), %xmm0
    movdqu  %xmm0, 40(%rdi)

here the first movdqu could as well be movdqa, because 16+rdi is 128-bit
aligned.
In the ctor method a movdqa is used, but the SRA is very pessimistic and
generates an unaligned MEM_REF. This example also does not compile any
differently with this patch.


 That is, do you see anything break with just removing the movmisalign path?
 I'd rather install that (with the new testcases that then pass) separately as
 this is a somewhat fragile area and being able to more selectively
 bisect/backport
 would be nice.

No, I think that is a good idea.

Attached is the first part of the patch, which only removes the movmisalign
path.

Should I apply this one after regression testing?

Bernd.

 Thanks,
 Richard.

 Regards
 Bernd.

2013-09-17  Bernd Edlinger  bernd.edlin...@hotmail.de

PR middle-end/57748
* expr.c (expand_assignment): Remove misalignp code path.

testsuite:

PR middle-end/57748
* gcc.dg/torture/pr57748-1.c: New test.
* gcc.dg/torture/pr57748-2.c: New test.



patch-pr57748.diff
Description: Binary data


[PATCH][RFC] teach loop distribution to distribute loop nests

2013-09-17 Thread Richard Biener

This teaches loop distribution to distribute nested loops.  I plan
to commit the trivial bits of it but not the rest of the patch
until I have an idea how to best limit the loop nest walk
(it tries distributing nests from outer to inner loops, re-doing
dependence analysis and RDG build).

At this point loop distribution needs a better cost model,
the ability to turn flow dependences into data dependences and
turning data dependences into partition ordering dependences.
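
As a rough illustration of what loop distribution does (a hand-written sketch, not output of the pass; the function names `fused` and `distributed` are made up for this example), a loop whose body holds two independent statements can be split into one loop per partition. The first partition then becomes a candidate for pattern recognition such as a memset-like builtin:

```c
#include <stddef.h>

/* Original form: two independent statements share one loop body.  */
static void fused(double *a, double *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        a[i] = 0.0;               /* zeroing store, a memset-like pattern */
        b[i] = b[i] * 2.0 + 1.0;  /* unrelated computation */
    }
}

/* Distributed form: each partition gets its own loop.  This is only
   valid because the two statements do not depend on each other.  */
static void distributed(double *a, double *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = 0.0;
    for (size_t i = 0; i < n; i++)
        b[i] = b[i] * 2.0 + 1.0;
}
```

For a loop nest the same idea would apply at the outer level, which is where the cost of re-running dependence analysis per nest level comes in.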

Still the first thing for me to tackle is some more patterns to recognize.

Bootstrapped with -ftree-loop-distribution and tested on 
x86_64-unknown-linux-gnu.

Richard.

2013-09-17  Richard Biener  rguent...@suse.de

* tree-loop-distribution.c (ssa_name_has_uses_outside_loop_p):
Properly handle loop nests.
(classify_partition): Disable builtins for loop nests.
(similar_memory_accesses): Refine cost model.
(distribute_loop): Dump which loop we are trying to distribute.
(tree_loop_distribution): Handle distribution of nested loops.

* gfortran.dg/ldist-2.f: New testcase.
* gcc.dg/tree-ssa/ldist-5.c: Adjust XFAIL reason.

Index: trunk/gcc/testsuite/gfortran.dg/ldist-2.f
===
*** /dev/null   1970-01-01 00:00:00.0 +
--- trunk/gcc/testsuite/gfortran.dg/ldist-2.f   2013-09-17 13:42:22.144740768 
+0200
***
*** 0 
--- 1,64 
+ ! { dg-do compile }
+ ! { dg-options "-O3 -fno-tree-loop-im -ftree-loop-distribution -fdump-tree-ldist-details" }
+ 
+ ! Testcase from bwaves block_solver.f
+ subroutine mat_times_vec(y,x,a,axp,ayp,azp,axm,aym,azm,
+  $  nb,nx,ny,nz)
+ implicit none
+ integer nb,nx,ny,nz,i,j,k,m,l,kit,im1,ip1,jm1,jp1,km1,kp1
+ 
+ real*8 y(nb,nx,ny,nz),x(nb,nx,ny,nz)
+ 
+ real*8 a(nb,nb,nx,ny,nz),
+  1  axp(nb,nb,nx,ny,nz),ayp(nb,nb,nx,ny,nz),azp(nb,nb,nx,ny,nz),
+  2  axm(nb,nb,nx,ny,nz),aym(nb,nb,nx,ny,nz),azm(nb,nb,nx,ny,nz)
+ 
+ 
+   do k=1,nz
+ c do j=1,ny
+ cdo i=1,nx
+ c   do l=1,nb
+ c  y(l,i,j,k)=0.0d0
+ c   enddo
+ cenddo
+ c enddo
+ 
+  km1=mod(k+nz-2,nz)+1
+  kp1=mod(k,nz)+1
+  do j=1,ny
+ jm1=mod(j+ny-2,ny)+1
+ jp1=mod(j,ny)+1
+ do i=1,nx
+im1=mod(i+nx-2,nx)+1
+ip1=mod(i,nx)+1
+do l=1,nb
+   y(l,i,j,k)=0.0d0
+   do m=1,nb
+  y(l,i,j,k)=y(l,i,j,k)+
+  1   a(l,m,i,j,k)*x(m,i,j,k)+
+  2   axp(l,m,i,j,k)*x(m,ip1,j,k)+
+  3   ayp(l,m,i,j,k)*x(m,i,jp1,k)+
+  4   azp(l,m,i,j,k)*x(m,i,j,kp1)+
+  5   axm(l,m,i,j,k)*x(m,im1,j,k)+
+  6   aym(l,m,i,j,k)*x(m,i,jm1,k)+
+  7   azm(l,m,i,j,k)*x(m,i,j,km1)
+   enddo
+enddo
+ enddo
+  enddo
+ enddo
+ 
+ 
+ 
+ cy=x
+ cwhere (mask) y=tmp
+ return
+ end
+ 
+ ! We fail to distribute the loop because the output dependence for the
+ ! two stores to y(l,i,j,k) forces them into the same partition.  This is
+ ! because loop distribution does not promote such dependences into
+ ! constraints on partition ordering
+ 
+ ! { dg-final { scan-tree-dump "distributed: split to 2 loops" "ldist" { xfail *-*-* } } }
+ ! { dg-final { cleanup-tree-dump "ldist" } }
Index: trunk/gcc/tree-loop-distribution.c
===
*** trunk.orig/gcc/tree-loop-distribution.c 2013-09-17 11:51:49.0 
+0200
--- trunk/gcc/tree-loop-distribution.c  2013-09-17 14:03:53.378065359 +0200
*** ssa_name_has_uses_outside_loop_p (tree d
*** 624,630 
  {
gimple use_stmt = USE_STMT (use_p);
if (!is_gimple_debug (use_stmt)
!   && loop != loop_containing_stmt (use_stmt))
return true;
  }
  
--- 624,631 
  {
gimple use_stmt = USE_STMT (use_p);
if (!is_gimple_debug (use_stmt)
!   && loop != loop_containing_stmt (use_stmt)
!   && !flow_loop_nested_p (loop, loop_containing_stmt (use_stmt)))
return true;
  }
  
*** classify_partition (loop_p loop, struct
*** 1139,1149 
if (stmt_has_scalar_dependences_outside_loop (loop, stmt))
{
  if (dump_file && (dump_flags & TDF_DETAILS))
!   fprintf (dump_file, "not generating builtin, partition has "
!            "scalar uses outside of the loop\n");
  partition->kind = PKIND_REDUCTION;
  return;
}
  }
  
/* Perform general partition disqualification for builtins.  */
--- 1140,1162 
if (stmt_has_scalar_dependences_outside_loop (loop, stmt))
{
  if (dump_file && (dump_flags & TDF_DETAILS))
!   fprintf 

Re: [PATCH] Don't always instrument shifts (PR sanitizer/58413)

2013-09-17 Thread Marek Polacek
On Mon, Sep 16, 2013 at 03:59:12PM +, Joseph S. Myers wrote:
 On Mon, 16 Sep 2013, Marek Polacek wrote:
 
  On Fri, Sep 13, 2013 at 07:18:24PM +, Joseph S. Myers wrote:
   On Fri, 13 Sep 2013, Marek Polacek wrote:
   
This is kind of fugly, but don't have anything better at the moment.
2013-09-13  Marek Polacek  pola...@redhat.com

PR sanitizer/58413
c-family/
* c-ubsan.c (ubsan_instrument_shift): Don't instrument
an expression if we can prove it is correct.
   
   Shouldn't the conditions used here for an expression being proved correct 
   match those for instrumentation, i.e. depend on flag_isoc99 and on 
   (cxx_dialect == cxx11 || cxx_dialect == cxx1y)?
  
  I don't think so: for the unsigned case we could restrict it to C
  only, but it doesn't hurt doing it even for C++; in the signed case
  we care only about C, but we can't restrict it to flag_isoc99 only,
  since we need to prove the correctness even for ANSI C.
 
 I don't understand how this answers my question.

I'm sorry.

Please disregard the original (ugly) patch; the following applies to
the new (pretty) patch
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01283.html.

 * The following principle applies: for any command-line options, with 
 ubsan enabled, if an integer operation with particular (non-constant) 
 operands is accepted by the sanitization code at runtime, the same 
 operation with the same operand values (and types) as constants should be 
 accepted at compile time (and at runtime) in contexts where an integer 
 constant expression is required.  Does this patch make the compiler meet 
 this principle, for all the different command-line options that vary what 
 is accepted at runtime?

I believe so.  E.g.

int i = 4, j = 3, k;
k = i << j;

is ok, thus the following is ok as well

case (4 << 3) (for C++/C with various -std=*).
 
 * The following principle also applies: for any command-line options, with 
 ubsan enabled, if an integer operation with particular (non-constant) 
 operands is rejected by the sanitization code at runtime, the same 
 operation with the same operand values (and types) as constants should be 
 rejected at compile time (or at runtime) in contexts where an integer 
 constant expression is required.  Does this patch make the compiler meet 
 this principle, for all the different command-line options that vary what 
 is accepted at runtime?

And I think this applies as well.  At runtime we reject e.g.
int i = 1, j = 120, k;
k = i << j;

and at compile-time we reject

  enum e {
    red = 0 << 120,
  };
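
To make the two principles concrete, here is a minimal sketch of a validity check, assuming the operation under discussion is a left shift of `int` and using the C99 rules only. The helper name `shift_left_is_defined` is hypothetical; this is not the check ubsan emits, and the C++11 rules differ in detail.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: is lhs << amount defined for int under C99?  */
static bool shift_left_is_defined(int lhs, int amount)
{
    const int width = (int)(sizeof(int) * CHAR_BIT);

    if (amount < 0 || amount >= width)
        return false;                   /* shift count out of range */
    if (lhs < 0)
        return false;                   /* negative left operand */
    return lhs <= (INT_MAX >> amount);  /* result must be representable */
}
```

Under this check, operands 4 and 3 are accepted, while a shift count of 120 is rejected whatever the left operand is, which matches the accept and reject cases given in the reply.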

Marek


Re: [C++ Patch] PR 58435

2013-09-17 Thread Jason Merrill

OK.

Jason


Re: RFA: Testsuite: Add exceptions for MSP430

2013-09-17 Thread nick clifton

Hi Mike,


Ok, I assume that the changes to hppa and return 0 are intentional and good…


-   || [istarget hppa64-hp-hpux11.23] } {
-   return 0;
+|| [istarget hppa64-hp-hpux11.23] } {
+   return 0;


Sorry - yes - they are just whitespace adjustments so that the entries 
line up.


Cheers
  Nick




Re: [PATCH ARM]Extend thumb1_reorg to save more comparison instructions

2013-09-17 Thread Richard Earnshaw
On 17/09/13 03:16, bin.cheng wrote:
 
 
 -Original Message-
 From: Richard Earnshaw
 Sent: Thursday, September 12, 2013 11:24 PM
 To: Bin Cheng
 Cc: gcc-patches@gcc.gnu.org
 Subject: Re: [PATCH ARM]Extend thumb1_reorg to save more comparison
 instructions

 On 18/04/13 06:34, Bin Cheng wrote:

 Sorry for the delay, I've been trying to get my head around this one.

 thumb1_reorg-20130417.txt


 Index: gcc/config/arm/arm.c

 ==
 =
 --- gcc/config/arm/arm.c(revision 197562)
 +++ gcc/config/arm/arm.c(working copy)
 @@ -14026,6 +14026,7 @@ thumb1_reorg (void)
rtx set, dest, src;
rtx pat, op0;
rtx prev, insn = BB_END (bb);
 +  bool insn_clobbered = false;

   while (insn != BB_HEAD (bb) && DEBUG_INSN_P (insn))
 insn = PREV_INSN (insn);
 @@ -14034,12 +14035,29 @@ thumb1_reorg (void)
if (INSN_CODE (insn) != CODE_FOR_cbranchsi4_insn)
 continue;

 -  /* Find the first non-note insn before INSN in basic block BB.
 */
 +  /* Get the register with which we are comparing.  */
 +  pat = PATTERN (insn);
 +  op0 = XEXP (XEXP (SET_SRC (pat), 0), 0);
 +
 +  /* Find the first flag setting insn before INSN in basic block
 + BB.  */
gcc_assert (insn != BB_HEAD (bb));
 -  prev = PREV_INSN (insn);
 -  while (prev != BB_HEAD (bb) && (NOTE_P (prev) || DEBUG_INSN_P (prev)))
 -   prev = PREV_INSN (prev);
 +  for (prev = PREV_INSN (insn);
 +  (!insn_clobbered
 +   && prev != BB_HEAD (bb)
 +   && (NOTE_P (prev)
 +   || DEBUG_INSN_P (prev)
 +   || (GET_CODE (prev) == SET

 This can't be right.  prev is an insn of some form, so the test that it is
 a SET will
 always fail.

 What you need to do here is to initialize 'set' to null before the loop
 and then
 have something like

  || ((set = single_set (prev)) != NULL

 +   && get_attr_conds (prev) == CONDS_NOCOND)));
 +  prev = PREV_INSN (prev))
 +   {
 + if (reg_set_p (op0, prev))
 +   insn_clobbered = true;
 +   }

 +  /* Skip if op0 is clobbered by insn other than prev. */
 +  if (insn_clobbered)
 +   continue;
 +
set = single_set (prev);

 This now becomes redundant and ...

if (!set)
 continue;

 This will be based on the set you extracted above.

 
 Hi Richard, here is the updated patch according to your comments.  Tested on
 thumb1, please review.
 

OK.

R.



Re: [PATCH][Resend][tree-optimization] Fix PR58088

2013-09-17 Thread Richard Earnshaw
On 09/09/13 10:56, Kyrylo Tkachov wrote:
 [Resending, since I was away and not pinging it]
 
 
 Hi all,
 
 In PR58088 the constant folder goes into an infinite recursion and runs out of
 stack space because of two conflicting optimisations:
 (X * C1) & C2 plays dirty when nested inside an IOR expression like so:
 ((X * C1) & C2) | C4. One can undo the other leading to an infinite recursion.
 
 Thanks to Marek for finding the IOR case.
 
 This patch fixes that by checking in the IOR case that the change to C2 will
 not conflict with the AND case transformation. Example testcases in the PR on
 bugzilla.
 
 This affects both trunk and 4.8 and regresses and bootstraps cleanly on both.
 
 Bootstrapped on x86_64-linux-gnu and tested arm-none-eabi on qemu.
 
 Ok for trunk and 4.8?
 
 Thanks,
 Kyrill
 
 2013-09-09  Kyrylo Tkachov  kyrylo.tkac...@arm.com
 
   PR tree-optimization/58088
   * fold-const.c (mask_with_trailing_zeros): New function.
   (fold_binary_loc): Make sure we don't recurse infinitely
   when the X in (X & C1) | C2 is a tree of the form (Y * K1) & K2.
   Use mask_with_trailing_zeros where appropriate.
   
   
 2013-09-09  Kyrylo Tkachov  kyrylo.tkac...@arm.com
 
   PR tree-optimization/58088
   * gcc.c-torture/compile/pr58088.c: New test.
 
 
 pr58088.patch
 
 

@@ -9942,6 +9942,22 @@ exact_inverse (tree type, tree cst)
 }
 }

+/*  Mask out the tz least significant bits of X of type TYPE where
+tz is the number of trailing zeroes in Y.  */
+static double_int
+mask_with_tz (tree type, double_int x, double_int y)
+{
+  int tz = y.trailing_zeros ();
+  if (tz > 0)

blank line between declarations and statements.

@@ -11266,6 +11282,7 @@ fold_binary_loc (location_t loc,
{
  double_int c1, c2, c3, msk;
  int width = TYPE_PRECISION (type), w;
+ bool valid = true;
  c1 = tree_to_double_int (TREE_OPERAND (arg0, 1));
  c2 = tree_to_double_int (arg1);

blank line after declarations before code body.

}
- if (c3 != c1)
+ /* If X is a tree of the form (Y * K1) & K2, this might conflict

Should be a blank line before the comment as well

+with that optimization from the BIT_AND_EXPR optimizations.
+This could end up in an infinite recursion.  */
+ if (TREE_CODE (TREE_OPERAND (arg0, 0)) == MULT_EXPR
+     && TREE_CODE (TREE_OPERAND (TREE_OPERAND (arg0, 0), 1))
+        == INTEGER_CST)
+ {
+   tree t = TREE_OPERAND (TREE_OPERAND (arg0, 0), 1);
+   double_int masked = mask_with_tz (type, c3, tree_to_double_int (t));
+   valid = masked != c1;

blank line before statements after declarations.
+ }
+
+ if (c3 != c1 && valid)

'valid' should come before the comparison test.  Furthermore, I think
'valid' is misleading; 'try_simplify' would be a more accurate
description of what the test is about.

OK with those changes.

R.
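
For readers following the review, the masking step described in the comment on mask_with_tz can be sketched on plain 64-bit integers. This is an illustration only: the real function works on double_int and extends the mask to the precision of TYPE, and the names `trailing_zeros64` and `mask_with_tz64` are made up here.

```c
#include <stdint.h>

/* Count trailing zero bits of y; returns 64 when y is 0.  */
static int trailing_zeros64(uint64_t y)
{
    int tz = 0;

    if (y == 0)
        return 64;
    while ((y & 1) == 0) {
        y >>= 1;
        tz++;
    }
    return tz;
}

/* Clear the tz least significant bits of x, where tz is the number
   of trailing zeros in y.  */
static uint64_t mask_with_tz64(uint64_t x, uint64_t y)
{
    int tz = trailing_zeros64(y);

    if (tz >= 64)
        return 0;  /* avoid an out-of-range shift */
    return x & ~((UINT64_C(1) << tz) - 1);
}
```

The intuition is that multiplying by a constant with tz trailing zeros always produces tz zero low bits, so those bits of the AND mask carry no information.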



[C++ PATCH] demangler fix (take 2)

2013-09-17 Thread Gary Benson
Hi all,

This is a resubmission of my previous demangler fix [1] rewritten
to avoid using hashtables and other libiberty features.

From the above referenced email:

d_print_comp maintains a certain amount of scope across calls (namely
a stack of templates) which is used when evaluating references in
template argument lists.  If such a reference is later used from a
substitution then the scope in force at the time of the substitution is
used.  This appears to be wrong (I say appears because I couldn't find
anything in the API [2] to clarify this).

The attached patch causes the demangler to capture the scope the first
time such a reference is traversed, and to use that captured scope on
subsequent traversals.  This fixes GDB PR 14963 [3] whereby a
reference is resolved against the wrong template, causing an infinite
loop and eventual stack overflow and segmentation fault.
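
The capture-on-first-traversal idea can be sketched independently of the demangler as a small lookup table searched linearly, with no hash table. The types and the function `scope_for` below are hypothetical stand-ins, and an integer takes the place of the captured template list:

```c
#include <stdlib.h>

struct scope_entry {
    const void *container;  /* component whose scope was captured */
    int scope_id;           /* stand-in for the captured template list */
};

struct scope_table {
    struct scope_entry *entries;
    int count;
};

/* Return the scope captured for CONTAINER.  On the first visit the
   current scope is captured and returned; later visits return the
   captured scope even if the current one has changed.  Returns -1
   on allocation failure.  */
static int scope_for(struct scope_table *t, const void *container,
                     int current_scope)
{
    struct scope_entry *grown;

    for (int i = 0; i < t->count; i++)
        if (t->entries[i].container == container)
            return t->entries[i].scope_id;

    grown = realloc(t->entries, (size_t)(t->count + 1) * sizeof *grown);
    if (grown == NULL)
        return -1;
    t->entries = grown;
    t->entries[t->count].container = container;
    t->entries[t->count].scope_id = current_scope;
    t->count++;
    return current_scope;
}
```

A linear search keeps the sketch free of extra dependencies, at the cost of quadratic behaviour if many references are captured.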

I've added the result to the demangler test suite, but I know of no
way to check the validity of the demangled symbol other than by
inspection (and I am no expert here!)  If anybody knows a way to
check this then please let me know!  Otherwise, I hope this
not-really-checked demangled version is acceptable.

Thanks,
Gary

[1] http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00215.html
[2] http://mentorembedded.github.io/cxx-abi/abi.html#mangling
[3] http://sourceware.org/bugzilla/show_bug.cgi?id=14963

-- 
http://gbenson.net/
diff --git a/libiberty/ChangeLog b/libiberty/ChangeLog
index 89e108a..2ff8216 100644
--- a/libiberty/ChangeLog
+++ b/libiberty/ChangeLog
@@ -1,3 +1,20 @@
+2013-09-17  Gary Benson  gben...@redhat.com
+
+   * cp-demangle.c (struct d_saved_scope): New structure.
+   (struct d_print_info): New fields saved_scopes and
+   num_saved_scopes.
+   (d_print_init): Initialize the above.
+   (d_print_free): New function.
+   (cplus_demangle_print_callback): Call the above.
+   (d_copy_templates): New function.
+   (d_print_comp): New variables saved_templates and
+   need_template_restore.
+   [DEMANGLE_COMPONENT_REFERENCE,
+   DEMANGLE_COMPONENT_RVALUE_REFERENCE]: Capture scope the first
+   time the component is traversed, and use the captured scope for
+   subsequent traversals.
+   * testsuite/demangle-expected: Add regression test.
+
 2013-09-10  Paolo Carlini  paolo.carl...@oracle.com
 
PR bootstrap/58386
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index 70f5438..a199f6d 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -275,6 +275,18 @@ struct d_growable_string
   int allocation_failure;
 };
 
+/* A demangle component and some scope captured when it was first
+   traversed.  */
+
+struct d_saved_scope
+{
+  /* The component whose scope this is.  */
+  const struct demangle_component *container;
+  /* The list of templates, if any, that was current when this
+ scope was captured.  */
+  struct d_print_template *templates;
+};
+
 enum { D_PRINT_BUFFER_LENGTH = 256 };
 struct d_print_info
 {
@@ -302,6 +314,10 @@ struct d_print_info
   int pack_index;
   /* Number of d_print_flush calls so far.  */
   unsigned long int flush_count;
+  /* Array of saved scopes for evaluating substitutions.  */
+  struct d_saved_scope *saved_scopes;
+  /* Number of saved scopes in the above array.  */
+  int num_saved_scopes;
 };
 
 #ifdef CP_DEMANGLE_DEBUG
@@ -3665,6 +3681,30 @@ d_print_init (struct d_print_info *dpi, 
demangle_callbackref callback,
   dpi->opaque = opaque;
 
   dpi->demangle_failure = 0;
+
+  dpi->saved_scopes = NULL;
+  dpi->num_saved_scopes = 0;
+}
+
+/* Free a print information structure.  */
+
+static void
+d_print_free (struct d_print_info *dpi)
+{
+  int i;
+
+  for (i = 0; i < dpi->num_saved_scopes; i++)
+    {
+      struct d_print_template *ts, *tn;
+
+      for (ts = dpi->saved_scopes[i].templates; ts != NULL; ts = tn)
+        {
+          tn = ts->next;
+          free (ts);
+        }
+    }
+
+  free (dpi->saved_scopes);
 }
 
 /* Indicate that an error occurred during printing, and test for error.  */
@@ -3749,6 +3789,7 @@ cplus_demangle_print_callback (int options,
demangle_callbackref callback, void *opaque)
 {
   struct d_print_info dpi;
+  int success;
 
   d_print_init (&dpi, callback, opaque);
 
@@ -3756,7 +3797,9 @@ cplus_demangle_print_callback (int options,
 
   d_print_flush (&dpi);
 
-  return ! d_print_saw_error (&dpi);
+  success = ! d_print_saw_error (&dpi);
+  d_print_free (&dpi);
+  return success;
+  return success;
 }
 
 /* Turn components into a human readable string.  OPTIONS is the
@@ -3913,6 +3956,36 @@ d_print_subexpr (struct d_print_info *dpi, int options,
 d_append_char (dpi, ')');
 }
 
+/* Return a shallow copy of the current list of templates.
+   On error d_print_error is called and a partial list may
+   be returned.  Whatever is returned must be freed.  */
+
+static struct d_print_template *
+d_copy_templates (struct d_print_info *dpi)
+{
+  struct d_print_template *src, *result, 

RE: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C (and C++)

2013-09-17 Thread Iyer, Balaji V
Hello,
    Has anyone had a chance to look at this? The C++ part is only a week
old, but the C part has been in review for ~3 weeks. I would greatly appreciate
it if someone could review this and approve it for trunk if it is OK.

Thanks,

Balaji V. Iyer.

 -Original Message-
 From: Iyer, Balaji V
 Sent: Wednesday, September 11, 2013 2:18 PM
 To: r...@redhat.com; Jason Merrill (ja...@redhat.com); Jeff Law; Aldy
 Hernandez (al...@redhat.com)
 Cc: gcc-patches@gcc.gnu.org
 Subject: RE: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C (and 
 C++)
 
 Hello Everyone,
   Couple weeks back, I had submitted a patch for review that will
 implement Cilk keywords (_Cilk_spawn and _Cilk_sync) into the C compiler. I
 recently finished C++ implementation also. In this email, I am attaching 2
 patches: 1 for C (and the common parts for C and C++) and 1 for C++. The C++
 Changelog is labelled cp-ChangeLog.cilkplus and the other one is just
 ChangeLog.cilkplus. There isn't much changes in the C patch. Only noticeable
 changes would be moving functions to the common parts so that C++ can use
 them.
 
   It passes all the tests and does not affect  (by affect I mean fail a 
 passing
 test or pass a failing one) any of the other tests in the testsuite directory.
 
   Is this Ok for trunk?
 
 Thanks,
 
 Balaji V. Iyer.
 
  -Original Message-
  From: Iyer, Balaji V
  Sent: Friday, August 30, 2013 1:02 PM
  To: gcc-patches@gcc.gnu.org
  Subject: FW: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C
 
  The email seem to be bouncing gcc-patches. I have gzipped my patch.
 
  Thanks,
 
  Balaji V. Iyer.
 
 
-Original Message-
From: Iyer, Balaji V
Sent: Friday, August 30, 2013 11:42 AM
To: 'Aldy Hernandez'
Cc: r...@redhat.com; Jeff Law; gcc-patches@gcc.gnu.org
Subject: RE: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync)
for C
   
Hi Aldy,
Attached, please find a fixed patch and the changelog entries.
   
 -Original Message-
 From: Aldy Hernandez [mailto:al...@redhat.com]
 Sent: Wednesday, August 28, 2013 2:36 PM
 To: Iyer, Balaji V
 Cc: r...@redhat.com; Jeff Law; gcc-patches@gcc.gnu.org
 Subject: Re: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync)
 for C

 On 08/27/13 16:27, Iyer, Balaji V wrote:
  Hello Aldy, I went through all the emails and here are the
  major issues that I could gather (other than lowering the
  keywords after gimplification, which I am skipping since it is
  more of an optimization for now).

 Ok, for now I am fine with delaying handling all this as a
 gimple tuple since most of your code lives in it's only little world 
 :).
 But I will go on record saying that part of the reason that you
 have to handle CALL_EXPR, MODIFY_EXPR, INIT_EXPR and such is
 because you don't
have easy gimplified code to examine.
 Anyways, agreed, you can do this later.

 
  1. Calling the gimplify_cilk_spawn on top of the gimplify_expr
     before the switch-statement could slow the compiler down.
  2. I need a CILK_SPAWN_STMT case in the switch statement in
     gimplify_expr ().
  3. No test for catching the suspicious spawned function warning.
  4. Reasoning for expanding the 2 builtin functions in builtins.c
     instead of just inserting the appropriate expanded-code when I
     am inserting the function call.
 
  Did I miss anything else (or misunderstand anything you pointed 
  out)?
 
  Here are my answers to those questions above and am attaching
  a fixed patch with the changelog entries:
 
  1 & 2 (partial): There are 3 places where we could have _Cilk_spawn:
  INIT_EXPR, CALL_EXPR and MODIFY_EXPR. INIT_EXPR and MODIFY_EXPRS are
  both gimplified using gimplify_modify_expr. I have moved the
  cilk_detect_spawn into this function. We will go into the
  cilk_detect_spawn if cilk plus is enabled, and if there is a
  cilk_frame (meaning the function has a Cilk_spawn in it)
  thereby reducing the number of hits into this function significantly.
  Inside this function, it will go into the function that has a
  spawned function call and then unwrap the CILK_SPAWN_STMT
  wrapper and returns true. This shouldn't cause a huge
  compilation time hit.
  2. To handle CALL_EXPR (e.g. _Cilk_spawn foo (x), where foo
  returns a void or the return value of it is ignored), I have
  added a CILK_SPAWN_STMT case.
  Again, I am calling the detect_cilk_spawn and we will only
  step into this function if Cilk Plus is enabled and if there
  is a cilk-frame (i.e saying the function has a cilk spawn in
  it). If there is an error (seen_error () == true), then it
  just falls through into CALL_EXPR and is handled like a normal
  call expr not spawned expression.
  3. This warning rarely get
  

[PATCH] Add no_sanitize_undefined attribute (PR sanitizer/58411)

2013-09-17 Thread Marek Polacek
This patch adds the no_sanitize_undefined attribute, so the user can tell
that a particular function should be ignored by ubsan.

Ran ubsan testsuite/bootstrap-ubsan on x86_64-linux, ok for trunk?
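
A minimal usage sketch follows, assuming a compiler that accepts the attribute as proposed in this patch. The function names are made up for illustration; only the attribute name comes from the patch.

```c
/* Ask the compiler not to instrument this function for undefined
   behaviour when building with -fsanitize=undefined.  */
__attribute__((no_sanitize_undefined))
static int unchecked_shift(int value, int amount)
{
    return value << amount;
}

/* No attribute: with -fsanitize=undefined the shift below would be
   instrumented as usual.  */
static int checked_shift(int value, int amount)
{
    return value << amount;
}
```

Both functions compute the same result for valid operands; the attribute only changes whether run-time checks are emitted.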

2013-09-17  Marek Polacek  pola...@redhat.com

PR sanitizer/58411
* doc/extend.texi: Document no_sanitize_undefined attribute.
* builtins.c (fold_builtin_0): Don't sanitize function if it has the
no_sanitize_undefined attribute.

c-family/
* c-common.c (handle_no_sanitize_undefined_attribute): New function.
Declare it.
(struct attribute_spec c_common_att): Add no_sanitize_undefined.
cp/
* typeck.c (cp_build_binary_op): Don't sanitize function if it has the
no_sanitize_undefined attribute.
c/
* c-typeck.c (build_binary_op): Don't sanitize function if it has the
no_sanitize_undefined attribute.

testsuite/
* c-c++-common/ubsan/attrib-1.c: New test.

--- gcc/c-family/c-common.c.mp2 2013-09-17 15:55:56.417946667 +0200
+++ gcc/c-family/c-common.c 2013-09-17 15:58:55.905513029 +0200
@@ -311,6 +311,8 @@ static tree handle_no_sanitize_address_a
  int, bool *);
 static tree handle_no_address_safety_analysis_attribute (tree *, tree, tree,
 int, bool *);
+static tree handle_no_sanitize_undefined_attribute (tree *, tree, tree, int,
+   bool *);
 static tree handle_noinline_attribute (tree *, tree, tree, int, bool *);
 static tree handle_noclone_attribute (tree *, tree, tree, int, bool *);
 static tree handle_leaf_attribute (tree *, tree, tree, int, bool *);
@@ -722,6 +724,9 @@ const struct attribute_spec c_common_att
   { "no_sanitize_address",    0, 0, true, false, false,
                               handle_no_sanitize_address_attribute,
                               false },
+  { "no_sanitize_undefined",  0, 0, true, false, false,
+                              handle_no_sanitize_undefined_attribute,
+                              false },
   { "warning",                1, 1, true,  false, false,
                               handle_error_attribute, false },
   { "error",                  1, 1, true,  false, false,
@@ -6575,6 +6580,22 @@ handle_no_address_safety_analysis_attrib
   return NULL_TREE;
 }
 
+/* Handle a no_sanitize_undefined attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_no_sanitize_undefined_attribute (tree *node, tree name, tree, int,
+ bool *no_add_attrs)
+{
+  if (TREE_CODE (*node) != FUNCTION_DECL)
+{
+      warning (OPT_Wattributes, "%qE attribute ignored", name);
+  *no_add_attrs = true;
+}
+
+  return NULL_TREE;
+}
+
 /* Handle a noinline attribute; arguments as in
struct attribute_spec.handler.  */
 
--- gcc/doc/extend.texi.mp2 2013-09-17 15:55:44.250907707 +0200
+++ gcc/doc/extend.texi 2013-09-17 16:06:21.439974916 +0200
@@ -2136,6 +2136,7 @@ attributes are currently defined for fun
 @code{warn_unused_result}, @code{nonnull}, @code{gnu_inline},
 @code{externally_visible}, @code{hot}, @code{cold}, @code{artificial},
 @code{no_sanitize_address}, @code{no_address_safety_analysis},
+@code{no_sanitize_undefined},
 @code{error} and @code{warning}.
 Several other attributes are defined for functions on particular
 target systems.  Other attributes, including @code{section} are
@@ -3500,6 +3501,12 @@ The @code{no_address_safety_analysis} is
 @code{no_sanitize_address} attribute, new code should use
 @code{no_sanitize_address}.
 
+@item no_sanitize_undefined
+@cindex @code{no_sanitize_undefined} function attribute
+The @code{no_sanitize_undefined} attribute on functions is used
+to inform the compiler that it should not check for undefined behavior
+in the function when compiling with the @option{-fsanitize=undefined} option.
+
 @item regparm (@var{number})
 @cindex @code{regparm} attribute
 @cindex functions that are passed arguments in registers on the 386
--- gcc/cp/typeck.c.mp2 2013-09-17 16:10:49.935644344 +0200
+++ gcc/cp/typeck.c 2013-09-17 16:11:20.601743694 +0200
@@ -4887,6 +4887,8 @@ cp_build_binary_op (location_t location,
   if ((flag_sanitize & SANITIZE_UNDEFINED)
       && !processing_template_decl
       && current_function_decl != 0
+      && !lookup_attribute ("no_sanitize_undefined",
+                            DECL_ATTRIBUTES (current_function_decl))
       && (doing_div_or_mod || doing_shift))
 {
   /* OP0 and/or OP1 might have side-effects.  */
--- gcc/c/c-typeck.c.mp22013-09-17 16:09:31.423381687 +0200
+++ gcc/c/c-typeck.c2013-09-17 16:10:00.626476422 +0200
@@ -10498,6 +10498,8 @@ build_binary_op (location_t location, en
 
   if (flag_sanitize & SANITIZE_UNDEFINED
       && current_function_decl != 0
+      && !lookup_attribute ("no_sanitize_undefined",
+                            DECL_ATTRIBUTES (current_function_decl))
       && (doing_div_or_mod || 

Disable creation of local aliases on targets w/o alias support

2013-09-17 Thread Jan Hubicka
Hi,
this patch should fix the HP-PA bootstrap issue where we create local aliases
but the target has no support for them.

Bootstrapped/regtested x86_64-linux (with aliases disabled) and committed.
PR middle-end/58329
* ipa-devirt.c (ipa_devirt): Be ready for symtab_nonoverwritable_alias
to return NULL.
* ipa.c (function_and_variable_visibility): Likewise.
* ipa-profile.c (ipa_profile): Likewise.
Index: ipa-devirt.c
===
--- ipa-devirt.c(revision 202650)
+++ ipa-devirt.c(working copy)
@@ -1098,7 +1098,13 @@ ipa_devirt (void)
   cgraph_node_name (likely_target),
   likely_target->symbol.order);
if (!symtab_can_be_discarded ((symtab_node) likely_target))
- likely_target = cgraph (symtab_nonoverwritable_alias ((symtab_node)likely_target));
+ {
+   cgraph_node *alias;
+   alias = cgraph (symtab_nonoverwritable_alias
+((symtab_node)likely_target));
+   if (alias)
+ likely_target = alias;
+ }
nconverted++;
update = true;
cgraph_turn_edge_to_speculative
Index: ipa.c
===
--- ipa.c   (revision 202650)
+++ ipa.c   (working copy)
@@ -998,7 +998,7 @@ function_and_variable_visibility (bool w
{
  struct cgraph_node *alias = cgraph (symtab_nonoverwritable_alias ((symtab_node) node));
 
- if (alias != node)
+ if (alias && alias != node)
{
  while (node->callers)
{
Index: ipa-profile.c
===
--- ipa-profile.c   (revision 202650)
+++ ipa-profile.c   (working copy)
@@ -625,7 +625,13 @@ ipa_profile (void)
 of N2.  Speculate on the local alias to allow inlining.
   */
  if (!symtab_can_be_discarded ((symtab_node) n2))
-   n2 = cgraph (symtab_nonoverwritable_alias 
((symtab_node)n2));
+   {
+ cgraph_node *alias;
+ alias = cgraph (symtab_nonoverwritable_alias
+  ((symtab_node)n2));
+ if (alias)
+   n2 = alias;
+   }
  nconverted++;
  cgraph_turn_edge_to_speculative
(e, n2,
Index: symtab.c
===
--- symtab.c(revision 202650)
+++ symtab.c(working copy)
@@ -1083,6 +1083,10 @@ symtab_nonoverwritable_alias (symtab_nod
   (void *)&new_node, true);
   if (new_node)
 return new_node;
+#ifndef ASM_OUTPUT_DEF
+  /* If aliases aren't supported by the assembler, fail.  */
+  return NULL;
+#endif
 
   /* Otherwise create a new one.  */
   new_decl = copy_node (node->symbol.decl);


Re: [PATCH] Add no_sanitize_undefined attribute (PR sanitizer/58411)

2013-09-17 Thread Jakub Jelinek
On Tue, Sep 17, 2013 at 05:24:22PM +0200, Marek Polacek wrote:
 This patch adds the no_sanitize_undefined attribute, so the user can tell
 that a particular function should be ignored by ubsan.

Does this correspond to some llvm attribute?

 --- gcc/builtins.c.mp22013-09-17 16:13:26.623161281 +0200
 +++ gcc/builtins.c2013-09-17 16:15:20.846557451 +0200
 @@ -10313,7 +10313,9 @@ fold_builtin_0 (location_t loc, tree fnd
return fold_builtin_classify_type (NULL_TREE);
  
  case BUILT_IN_UNREACHABLE:
 -  if (flag_sanitize & SANITIZE_UNREACHABLE)
 +  if (flag_sanitize & SANITIZE_UNREACHABLE
 +      && !lookup_attribute ("no_sanitize_undefined",
 +                            DECL_ATTRIBUTES (current_function_decl)))
   return ubsan_instrument_unreachable (loc);
break;

I wonder if current_function_decl couldn't be NULL here, say if
__builtin_unreachable () appears in C++ global var initializers or similar.

Jakub


Re: [PATCH] Add no_sanitize_undefined attribute (PR sanitizer/58411)

2013-09-17 Thread Marek Polacek
On Tue, Sep 17, 2013 at 05:37:51PM +0200, Jakub Jelinek wrote:
 On Tue, Sep 17, 2013 at 05:24:22PM +0200, Marek Polacek wrote:
  This patch adds the no_sanitize_undefined attribute, so the user can tell
  that a particular function should be ignored by ubsan.
 
 Does this correspond to some llvm attribute?

No, it seems they don't have a flag for disabling the ubsan; they only
have flags for disabling asan/tsan/msan.
 
  --- gcc/builtins.c.mp2 2013-09-17 16:13:26.623161281 +0200
  +++ gcc/builtins.c 2013-09-17 16:15:20.846557451 +0200
  @@ -10313,7 +10313,9 @@ fold_builtin_0 (location_t loc, tree fnd
         return fold_builtin_classify_type (NULL_TREE);
   
       case BUILT_IN_UNREACHABLE:
  -      if (flag_sanitize & SANITIZE_UNREACHABLE)
  +      if (flag_sanitize & SANITIZE_UNREACHABLE
  +          && !lookup_attribute ("no_sanitize_undefined",
  +                                DECL_ATTRIBUTES (current_function_decl)))
          return ubsan_instrument_unreachable (loc);
         break;
 
 I wonder if current_function_decl couldn't be NULL here, say if
 __builtin_unreachable () appears in C++ global var initializers or similar.

Well I wonder too ;)  I thought it can't be NULL, and tried this

struct C {
  C() { __builtin_unreachable (); }
};

C c;

int
main ()
{
  return 0;
}

and here everything looks ok.  Or is this not the proper way of
checking that?  Surely, I can add the check for current_function_decl
!= NULL just to be on the safe side...

Marek


Fix PR58332

2013-09-17 Thread Jan Hubicka
Hi,
this patch makes the inliner not inline functions with the -O0 optimization
attribute, and also not inline into such functions.

Bootstrapped/regtested x86_64-linux, committed.
PR middle-end/58332
* gcc.c-torture/compile/pr58332.c: New testcase.
* cif-code.def (FUNCTION_NOT_OPTIMIZED): New CIF code.
* ipa-inline.c (can_inline_edge_p): Do not downgrade
FUNCTION_NOT_OPTIMIZED.
* ipa-inline-analysis.c (compute_inline_parameters): Function
not optimized is not inlinable unless it is alwaysinline.
(inline_analyze_function): Force calls in not optimized
function not inlinable.

Index: testsuite/gcc.c-torture/compile/pr58332.c
===
--- testsuite/gcc.c-torture/compile/pr58332.c   (revision 0)
+++ testsuite/gcc.c-torture/compile/pr58332.c   (revision 0)
@@ -0,0 +1,2 @@
+static inline int foo (int x) { return x + 1; }
+__attribute__ ((__optimize__ (0))) int bar (void) { return foo (100); }
Index: cif-code.def
===
--- cif-code.def(revision 202656)
+++ cif-code.def(working copy)
@@ -37,6 +37,9 @@ DEFCIFCODE(UNSPECIFIED , "")
    functions that have not been rejected for inlining yet.  */
 DEFCIFCODE(FUNCTION_NOT_CONSIDERED, N_("function not considered for inlining"))
 
+/* Caller is compiled with optimizations disabled.  */
+DEFCIFCODE(FUNCTION_NOT_OPTIMIZED, N_("caller is not optimized"))
+
 /* Inlining failed owing to unavailable function body.  */
 DEFCIFCODE(BODY_NOT_AVAILABLE, N_("function body not available"))
 
Index: ipa-inline.c
===
--- ipa-inline.c(revision 202656)
+++ ipa-inline.c(working copy)
@@ -275,7 +275,8 @@ can_inline_edge_p (struct cgraph_edge *e
     }
   else if (e->call_stmt_cannot_inline_p)
     {
-      e->inline_failed = CIF_MISMATCHED_ARGUMENTS;
+      if (e->inline_failed != CIF_FUNCTION_NOT_OPTIMIZED)
+        e->inline_failed = CIF_MISMATCHED_ARGUMENTS;
       inlinable = false;
     }
   /* Don't inline if the functions have different EH personalities.  */
Index: ipa-inline-analysis.c
===
--- ipa-inline-analysis.c   (revision 202656)
+++ ipa-inline-analysis.c   (working copy)
@@ -2664,7 +2664,11 @@ compute_inline_parameters (struct cgraph
   info->stack_frame_offset = 0;
 
   /* Can this function be inlined at all?  */
-  info->inlinable = tree_inlinable_function_p (node->symbol.decl);
+  if (!optimize && !lookup_attribute ("always_inline",
+                                      DECL_ATTRIBUTES (node->symbol.decl)))
+    info->inlinable = false;
+  else
+    info->inlinable = tree_inlinable_function_p (node->symbol.decl);
 
   /* Type attributes can use parameter indices to describe them.  */
   if (TYPE_ATTRIBUTES (TREE_TYPE (node->symbol.decl)))
@@ -3678,6 +3682,22 @@ inline_analyze_function (struct cgraph_n
   if (optimize && !node->thunk.thunk_p)
     inline_indirect_intraprocedural_analysis (node);
   compute_inline_parameters (node, false);
+  if (!optimize)
+    {
+      struct cgraph_edge *e;
+      for (e = node->callees; e; e = e->next_callee)
+        {
+          if (e->inline_failed == CIF_FUNCTION_NOT_CONSIDERED)
+            e->inline_failed = CIF_FUNCTION_NOT_OPTIMIZED;
+          e->call_stmt_cannot_inline_p = true;
+        }
+      for (e = node->indirect_calls; e; e = e->next_callee)
+        {
+          if (e->inline_failed == CIF_FUNCTION_NOT_CONSIDERED)
+            e->inline_failed = CIF_FUNCTION_NOT_OPTIMIZED;
+          e->call_stmt_cannot_inline_p = true;
+        }
+    }
 
   pop_cfun ();
 }
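
For readers who want to try the situation this patch addresses, here is a
minimal sketch modelled on the new testcase.  It assumes a GCC-compatible
compiler that accepts the optimize function attribute; the string form
"O0" is used here instead of the numeric form in the testcase, and nothing
below claims how any particular GCC release decides to inline.

```c
/* Small helper that an optimizing compiler would normally inline.  */
static inline int foo (int x) { return x + 1; }

/* bar is compiled as if with -O0 whatever the command line says.  With
   the patch, the call to foo is expected to stay an out-of-line call
   rather than being inlined into the unoptimized caller.  */
__attribute__ ((__optimize__ ("O0"))) int bar (void) { return foo (100); }
```

Whether or not the call is inlined, bar () must return 101; only the
generated code differs, which can be inspected in the assembly output.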


Re: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C (and C++)

2013-09-17 Thread Jeff Law

On 09/17/2013 08:50 AM, Iyer, Balaji V wrote:

Hello, has anyone had a chance to look at this? The C++ part is only
a week old, but the C part has been in review for ~3 weeks. I would
greatly appreciate it if someone could review this and approve it for
trunk if it is OK.

Obviously not yet.  Everyone is pretty busy right now.

jeff


Re: [PATCH] Add no_sanitize_undefined attribute (PR sanitizer/58411)

2013-09-17 Thread Jakub Jelinek
On Tue, Sep 17, 2013 at 06:45:25PM +0200, Marek Polacek wrote:
 --- gcc/builtins.c.mp2 2013-09-17 16:13:26.623161281 +0200
 +++ gcc/builtins.c 2013-09-17 18:42:11.338273135 +0200
 @@ -10313,7 +10313,10 @@ fold_builtin_0 (location_t loc, tree fnd
        return fold_builtin_classify_type (NULL_TREE);
  
      case BUILT_IN_UNREACHABLE:
 -      if (flag_sanitize & SANITIZE_UNREACHABLE)
 +      if (flag_sanitize & SANITIZE_UNREACHABLE
 +          && current_function_decl != 0
 +          && !lookup_attribute ("no_sanitize_undefined",
 +                                DECL_ATTRIBUTES (current_function_decl)))
         return ubsan_instrument_unreachable (loc);
        break;
  

I'd say you should instead use
 (current_function_decl == NULL
|| !lookup_attribute (...))
so that you instrument even outside of fn bodies, just with no way to turn
it off in the code (only command line options).

Ok with that change.

Jakub
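
The difference between the two conditions can be illustrated with a small
stand-alone model.  This is only a sketch: struct fn_decl and
should_instrument_unreachable are hypothetical stand-ins, not GCC's tree
type or any function in builtins.c.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a function declaration; not GCC's tree.  */
struct fn_decl
{
  bool no_sanitize_undefined;  /* Models the presence of the attribute.  */
};

/* Return true when __builtin_unreachable should be instrumented.
   CURRENT_FN is NULL outside any function body, e.g. in a global
   initializer; that case is instrumented unconditionally, because
   there is no declaration that could carry the attribute.  */
static bool
should_instrument_unreachable (bool sanitize_unreachable,
                               const struct fn_decl *current_fn)
{
  return sanitize_unreachable
         && (current_fn == NULL || !current_fn->no_sanitize_undefined);
}
```

With the earlier "current_function_decl != 0 &&" form, the NULL case would
skip instrumentation entirely; with the form suggested here it is
instrumented, and only the command-line option can turn it off.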


Re: [PATCH] Add no_sanitize_undefined attribute (PR sanitizer/58411)

2013-09-17 Thread Marek Polacek
On Tue, Sep 17, 2013 at 06:51:59PM +0200, Jakub Jelinek wrote:
 On Tue, Sep 17, 2013 at 06:45:25PM +0200, Marek Polacek wrote:
  --- gcc/builtins.c.mp2 2013-09-17 16:13:26.623161281 +0200
  +++ gcc/builtins.c 2013-09-17 18:42:11.338273135 +0200
  @@ -10313,7 +10313,10 @@ fold_builtin_0 (location_t loc, tree fnd
         return fold_builtin_classify_type (NULL_TREE);
   
       case BUILT_IN_UNREACHABLE:
  -      if (flag_sanitize & SANITIZE_UNREACHABLE)
  +      if (flag_sanitize & SANITIZE_UNREACHABLE
  +          && current_function_decl != 0
  +          && !lookup_attribute ("no_sanitize_undefined",
  +                                DECL_ATTRIBUTES (current_function_decl)))
          return ubsan_instrument_unreachable (loc);
         break;
   
 
 I'd say you should instead use
  (current_function_decl == NULL
 || !lookup_attribute (...))
 so that you instrument even outside of fn bodies, just with no way to turn
 it off in the code (only command line options).
 
 Ok with that change.

Thanks, will commit the following tomorrow if no one objects...

2013-09-17  Marek Polacek  pola...@redhat.com

PR sanitizer/58411
* doc/extend.texi: Document no_sanitize_undefined attribute.
* builtins.c (fold_builtin_0): Don't sanitize function if it has the
no_sanitize_undefined attribute.

c-family/
* c-common.c (handle_no_sanitize_undefined_attribute): New function.
Declare it.
(struct attribute_spec c_common_att): Add no_sanitize_undefined.
cp/
* typeck.c (cp_build_binary_op): Don't sanitize function if it has the
no_sanitize_undefined attribute.
c/
* c-typeck.c (build_binary_op): Don't sanitize function if it has the
no_sanitize_undefined attribute.

testsuite/
* c-c++-common/ubsan/attrib-1.c: New test.

--- gcc/c-family/c-common.c.mp2 2013-09-17 15:55:56.417946667 +0200
+++ gcc/c-family/c-common.c 2013-09-17 15:58:55.905513029 +0200
@@ -311,6 +311,8 @@ static tree handle_no_sanitize_address_a
  int, bool *);
 static tree handle_no_address_safety_analysis_attribute (tree *, tree, tree,
 int, bool *);
+static tree handle_no_sanitize_undefined_attribute (tree *, tree, tree, int,
+   bool *);
 static tree handle_noinline_attribute (tree *, tree, tree, int, bool *);
 static tree handle_noclone_attribute (tree *, tree, tree, int, bool *);
 static tree handle_leaf_attribute (tree *, tree, tree, int, bool *);
@@ -722,6 +724,9 @@ const struct attribute_spec c_common_att
   { "no_sanitize_address",    0, 0, true, false, false,
                               handle_no_sanitize_address_attribute,
                               false },
+  { "no_sanitize_undefined",  0, 0, true, false, false,
+                              handle_no_sanitize_undefined_attribute,
+                              false },
   { "warning",                1, 1, true,  false, false,
                               handle_error_attribute, false },
   { "error",                  1, 1, true,  false, false,
@@ -6575,6 +6580,22 @@ handle_no_address_safety_analysis_attrib
   return NULL_TREE;
 }
 
+/* Handle a no_sanitize_undefined attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_no_sanitize_undefined_attribute (tree *node, tree name, tree, int,
+                                        bool *no_add_attrs)
+{
+  if (TREE_CODE (*node) != FUNCTION_DECL)
+    {
+      warning (OPT_Wattributes, "%qE attribute ignored", name);
+      *no_add_attrs = true;
+    }
+
+  return NULL_TREE;
+}
+
 /* Handle a noinline attribute; arguments as in
    struct attribute_spec.handler.  */
 
--- gcc/doc/extend.texi.mp2 2013-09-17 15:55:44.250907707 +0200
+++ gcc/doc/extend.texi 2013-09-17 16:06:21.439974916 +0200
@@ -2136,6 +2136,7 @@ attributes are currently defined for fun
 @code{warn_unused_result}, @code{nonnull}, @code{gnu_inline},
 @code{externally_visible}, @code{hot}, @code{cold}, @code{artificial},
 @code{no_sanitize_address}, @code{no_address_safety_analysis},
+@code{no_sanitize_undefined},
 @code{error} and @code{warning}.
 Several other attributes are defined for functions on particular
 target systems.  Other attributes, including @code{section} are
@@ -3500,6 +3501,12 @@ The @code{no_address_safety_analysis} is
 @code{no_sanitize_address} attribute, new code should use
 @code{no_sanitize_address}.
 
+@item no_sanitize_undefined
+@cindex @code{no_sanitize_undefined} function attribute
+The @code{no_sanitize_undefined} attribute on functions is used
+to inform the compiler that it should not check for undefined behavior
+in the function when compiling with the @option{-fsanitize=undefined} option.
+
 @item regparm (@var{number})
 @cindex @code{regparm} attribute
 @cindex functions that are passed arguments in registers on the 386
--- gcc/cp/typeck.c.mp2 2013-09-17 16:10:49.935644344 +0200
+++ gcc/cp/typeck.c 

Re: New GCC options for loop vectorization

2013-09-17 Thread Xinliang David Li
On Tue, Sep 17, 2013 at 1:20 AM, Richard Biener
richard.guent...@gmail.com wrote:
 On Mon, Sep 16, 2013 at 10:24 PM, Xinliang David Li davi...@google.com 
 wrote:
 On Mon, Sep 16, 2013 at 3:13 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Fri, Sep 13, 2013 at 5:16 PM, Xinliang David Li davi...@google.com 
 wrote:
 On Fri, Sep 13, 2013 at 1:30 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Thu, Sep 12, 2013 at 10:31 PM, Xinliang David Li davi...@google.com 
 wrote:
 Currently -ftree-vectorize turns on both loop and slp vectorizations,
 but there is no simple way to turn on loop vectorization alone. The
 logic for default O3 setting is also complicated.

 In this patch, two new options are introduced:

 1) -ftree-loop-vectorize

 This option is used to turn on loop vectorization only. option
 -ftree-slp-vectorize also becomes a first class citizen, and no funny
 business of Init(2) is needed.  With this change, -ftree-vectorize
 becomes a simple alias to -ftree-loop-vectorize +
 -ftree-slp-vectorize.

 For instance, to turn on only slp vectorize at O3, the old way is:

  -O3 -fno-tree-vectorize -ftree-slp-vectorize

 With the new change it becomes:

 -O3 -fno-tree-loop-vectorize


 To turn on only loop vectorize at O2, the old way is

 -O2 -ftree-vectorize -fno-tree-slp-vectorize

 The new way is

 -O2 -ftree-loop-vectorize



 2) -ftree-vect-loop-peeling

 This option is used to turn on/off loop peeling for alignment.  In the
 long run, this should be folded into the cheap cost model proposed by
 Richard.  This option is also useful in scenarios where peeling can
 introduce runtime problems:
 http://gcc.gnu.org/ml/gcc/2005-12/msg00390.html  which happens to be
 common in practice.



 Patch attached. Compiler bootstrapped. Ok after testing?

 I'd like you to split 1) and 2), mainly because I agree on 1) but not on 
 2).

 Ok. Can you also comment on 2) ?

 I think we want to decide how granular we want to control the vectorizer
 and using which mechanism.  My cost-model re-org makes
 ftree-vect-loop-version a no-op (basically removes it), so 2) looks like
 a step backwards in this context.

 Using cost model to do a coarse grain control/configuration is
 certainly something we want, but having a fine grain control is still
 useful.


 So, can you summarize what pieces (including versioning) of the vectorizer
 you'd want to be able to disable separately?

 Loop peeling seems to be the main one. There is also a related
 correctness issue. For instance, the following code is common in practice,
 but loop peeling wrongly assumes initial base-alignment and generates
 aligned mov instructions after peeling, leading to SEGV.  Peeling is
 not something we can blindly turn on -- even when it is on, there
 should be a way to turn it off explicitly:

 char a[1];

 void foo(int n)
 {
   int* b = (int*)(a+n);
   int i = 0;
    for (; i < 1000; ++i)
 b[i] = 1;
 }

 int main(int argn, char** argv)
 {
   foo(argn);
 }

 But that's just a bug that should be fixed (looking into it).

This kind of code is not uncommon for certain applications (e.g, group
varint decoding).  Besides, the code like this may be built with
-fno-strict-aliasing.



  Just disabling peeling for
 alignment may get you into the versioning for alignment path (and thus
 an unvectorized loop at runtime).

 This is not true for targets supporting misaligned access. I have not
 seen a case where alignment-driven loop versioning happens on x86.

Also it's known that the alignment peeling
 code needs some serious TLC (its outcome depends on the order of DRs,
 the cost model it uses leaves much to be desired as we cannot distinguish
 between unaligned load and store costs).

 Yet another reason to turn it off as it is not effective anyways?

 As said I'll disable all remains of -ftree-vect-loop-version with the cost 
 model
 patch because it wasn't guarding versioning for aliasing but only versioning
 for alignment.

 We have to be consistent here - if we add a way to disable peeling for
 alignment then we certainly don't want to remove the ability to disable
 versioning for alignment, no?

yes, for consistency, the version control flag may also be useful to be kept.

David


 Richard.


 thanks,

 David


 Richard.


 I've stopped a quick try doing 1) myself because

 @@ -1691,6 +1695,12 @@ common_handle_option (struct gcc_options
          opts->x_flag_ipa_reference = false;
        break;

 +    case OPT_ftree_vectorize:
 +      if (!opts_set->x_flag_tree_loop_vectorize)
 +        opts->x_flag_tree_loop_vectorize = value;
 +      if (!opts_set->x_flag_tree_slp_vectorize)
 +        opts->x_flag_tree_slp_vectorize = value;
 +      break;

 doesn't look obviously correct.  Does that handle

   -ftree-vectorize -fno-tree-loop-vectorize -ftree-vectorize

 or

   -ftree-loop-vectorize -fno-tree-vectorize

 properly?  Currently at least

   -ftree-slp-vectorize -fno-tree-vectorize

 doesn't work.


 Right -- same is true for -fprofile-use option. FDO 

Re: Using gen_int_mode instead of GEN_INT minor testsuite fallout on MIPS

2013-09-17 Thread Mike Stump
On Sep 16, 2013, at 8:41 PM, DJ Delorie d...@redhat.com wrote:
 m32c's PSImode is 24-bits, why does it have 32 in the macro?
 
 /* 24-bit pointers, in 32-bit units */
 -PARTIAL_INT_MODE (SI);
 +PARTIAL_INT_MODE_NAME (SI, 32, PSI);

Sorry, fingers copied the wrong number.  Thanks for the catch.

Index: gcc/config/msp430/msp430-modes.def
===
--- gcc/config/msp430/msp430-modes.def  (revision 202634)
+++ gcc/config/msp430/msp430-modes.def  (working copy)
@@ -1,3 +1,3 @@
 /* 20-bit address */
-PARTIAL_INT_MODE (SI);
+PARTIAL_INT_MODE_NAME (SI, 20, PSI);
 
Index: gcc/config/bfin/bfin-modes.def
===
--- gcc/config/bfin/bfin-modes.def  (revision 202634)
+++ gcc/config/bfin/bfin-modes.def  (working copy)
@@ -19,7 +19,7 @@
http://www.gnu.org/licenses/.  */
 
 /* PDImode for the 40-bit accumulators.  */
-PARTIAL_INT_MODE (DI);
+PARTIAL_INT_MODE_NAME (DI, 40, PDI);
 
 /* Two of those - covering both accumulators for vector multiplications.  */
 VECTOR_MODE (INT, PDI, 2);
Index: gcc/config/m32c/m32c-modes.def
===
--- gcc/config/m32c/m32c-modes.def  (revision 202634)
+++ gcc/config/m32c/m32c-modes.def  (working copy)
@@ -22,7 +22,7 @@
 /*INT_MODE (PI, 3);*/
 
 /* 24-bit pointers, in 32-bit units */
-PARTIAL_INT_MODE (SI);
+PARTIAL_INT_MODE_NAME (SI, 24, PSI);
 
 /* 48-bit MULEX result */
 /* INT_MODE (MI, 6); */
Index: gcc/config/rs6000/rs6000-modes.def
===
--- gcc/config/rs6000/rs6000-modes.def  (revision 202634)
+++ gcc/config/rs6000/rs6000-modes.def  (working copy)
@@ -45,4 +45,4 @@ VECTOR_MODES (FLOAT, 32); /*   V
 /* Replacement for TImode that only is allowed in GPRs.  We also use PTImode
for quad memory atomic operations to force getting an even/odd register
combination.  */
-PARTIAL_INT_MODE (TI);
+PARTIAL_INT_MODE_NAME (TI, 128, PTI);
Index: gcc/config/sh/sh-modes.def
===
--- gcc/config/sh/sh-modes.def  (revision 202634)
+++ gcc/config/sh/sh-modes.def  (working copy)
@@ -18,9 +18,9 @@ along with GCC; see the file COPYING3.
 http://www.gnu.org/licenses/.  */
 
 /* The SH uses a partial integer mode to represent the FPSCR register.  */
-PARTIAL_INT_MODE (SI);
+PARTIAL_INT_MODE_NAME (SI, 32, PSI);
 /* PDI mode is used to represent a function address in a target register.  */
-PARTIAL_INT_MODE (DI);
+PARTIAL_INT_MODE_NAME (DI, 64, PDI);
 
 /* Vector modes.  */
 VECTOR_MODE  (INT, QI, 2);/* V2QI */
Index: gcc/genmodes.c
===
--- gcc/genmodes.c  (revision 202634)
+++ gcc/genmodes.c  (working copy)
@@ -629,10 +629,14 @@ reset_float_format (const char *name, co
   m->format = format;
 }
 
-/* Partial integer modes are specified by relation to a full integer mode.
-   For now, we do not attempt to narrow down their bit sizes.  */
-#define PARTIAL_INT_MODE(M) \
-  make_partial_integer_mode (#M, "P" #M, -1U, __FILE__, __LINE__)
+/* Partial integer modes are specified by relation to a full integer
+   mode.  */
+#define PARTIAL_INT_MODE(M,PREC)               \
+  make_partial_integer_mode (#M, "P" #PREC #M, PREC, __FILE__, __LINE__)
+/* Partial integer modes are specified by relation to a full integer
+   mode.  */
+#define PARTIAL_INT_MODE_NAME(M,PREC,NAME)             \
+  make_partial_integer_mode (#M, #NAME, PREC, __FILE__, __LINE__)
 static void ATTRIBUTE_UNUSED
 make_partial_integer_mode (const char *base, const char *name,
                            unsigned int precision,
@@ -669,7 +673,7 @@ make_vector_mode (enum mode_class bclass
   struct mode_data *v;
   enum mode_class vclass = vector_class (bclass);
   struct mode_data *component = find_mode (base);
-  char namebuf[8];
+  char namebuf[16];
 
   if (vclass == MODE_RANDOM)
 return;
@@ -917,7 +921,7 @@ enum machine_mode\n{");
         end will try to use it for bitfields in structures and the
         like, which we do not want.  Only the target md file should
         generate BImode widgets.  */
-      if (first && first->precision == 1)
+      if (first && first->precision == 1 && c == MODE_INT)
        first = first->next;
 
       if (first && last)
@@ -1187,7 +1191,7 @@ emit_class_narrowest_mode (void)
     /* Bleah, all this to get the comment right for MIN_MODE_INT.  */
     tagged_printf ("MIN_%s", mode_class_names[c],
                   modes[c]
-                  ? (modes[c]->precision != 1
+                  ? ((c != MODE_INT || modes[c]->precision != 1)
                      ? modes[c]->name
                      : (modes[c]->next
                         ? modes[c]->next->name
Index: gcc/machmode.def
===
--- gcc/machmode.def

Re: New GCC options for loop vectorization

2013-09-17 Thread Xinliang David Li
On Tue, Sep 17, 2013 at 8:45 AM, Jakub Jelinek ja...@redhat.com wrote:
 On Tue, Sep 17, 2013 at 08:37:57AM -0700, Xinliang David Li wrote:
  char a[1];
 
  void foo(int n)
  {
int* b = (int*)(a+n);
int i = 0;
 for (; i < 1000; ++i)
  b[i] = 1;
  }
 
  int main(int argn, char** argv)
  {
foo(argn);
  }
 
  But that's just a bug that should be fixed (looking into it).

 This kind of code is not uncommon for certain applications (e.g, group
 varint decoding).  Besides, the code like this may be built with

 That is irrelevant to the fact that it is invalid.

 -fno-strict-aliasing.

 It isn't invalid because of aliasing violations, but because of unaligned
 access without saying that it is unaligned (say accessing through
 aligned(1) type, or packed struct or similar, or doing memcpy).
 On various architectures unaligned accesses don't cause faults, so it
 may appear to work, and even on i?86/x86_64 often appears to work, as
 long as you aren't trying to vectorize code (which doesn't change anything
 on the fact that it is undefined behavior).

ok, undefined behavior it is.  By the way, ICC does loop versioning in
this case and therefore has no problem. Clang/LLVM vectorizes it with
neither peeling nor versioning, and it works fine too. For legacy code
like this, GCC is less tolerant.

thanks,

David


 Jakub


Re: [PATCH] Add no_sanitize_undefined attribute (PR sanitizer/58411)

2013-09-17 Thread Jakub Jelinek
On Tue, Sep 17, 2013 at 06:26:52PM +0200, Marek Polacek wrote:
 Well I wonder too ;)  I thought it can't be NULL, and tried this
 
 struct C {
   C() { __builtin_unreachable (); }
 };

I was more wondering about stuff like:
int a = (__builtin_unreachable (), 1);
or similar.

Jakub


Re: New GCC options for loop vectorization

2013-09-17 Thread Jakub Jelinek
On Tue, Sep 17, 2013 at 08:37:57AM -0700, Xinliang David Li wrote:
  char a[1];
 
  void foo(int n)
  {
int* b = (int*)(a+n);
int i = 0;
 for (; i < 1000; ++i)
  b[i] = 1;
  }
 
  int main(int argn, char** argv)
  {
foo(argn);
  }
 
  But that's just a bug that should be fixed (looking into it).
 
 This kind of code is not uncommon for certain applications (e.g, group
 varint decoding).  Besides, the code like this may be built with

That is irrelevant to the fact that it is invalid.

 -fno-strict-aliasing.

It isn't invalid because of aliasing violations, but because of unaligned
access without saying that it is unaligned (say accessing through
aligned(1) type, or packed struct or similar, or doing memcpy).
On various architectures unaligned accesses don't cause faults, so it
may appear to work, and even on i?86/x86_64 often appears to work, as
long as you aren't trying to vectorize code (which doesn't change anything
on the fact that it is undefined behavior).

Jakub
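
One of the well-defined alternatives mentioned above, going through
memcpy, could look roughly like the following.  This is a sketch only;
fill_unaligned and load_unaligned are hypothetical helper names, not
anything from the thread or from GCC.

```c
#include <stdint.h>
#include <string.h>

/* Store VALUE into COUNT consecutive 32-bit slots starting at DST,
   which may have any alignment.  Going through memcpy means the
   compiler cannot assume that DST is aligned for int32_t.  */
static void
fill_unaligned (unsigned char *dst, size_t count, int32_t value)
{
  for (size_t i = 0; i < count; ++i)
    memcpy (dst + i * sizeof value, &value, sizeof value);
}

/* Read slot INDEX back the same way.  */
static int32_t
load_unaligned (const unsigned char *src, size_t index)
{
  int32_t value;
  memcpy (&value, src + index * sizeof value, sizeof value);
  return value;
}
```

The compiler remains free to vectorize such a loop, but it has to use
code that tolerates a misaligned base address.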


Re: [PATCH] Add no_sanitize_undefined attribute (PR sanitizer/58411)

2013-09-17 Thread Marek Polacek
On Tue, Sep 17, 2013 at 06:34:38PM +0200, Jakub Jelinek wrote:
 On Tue, Sep 17, 2013 at 06:26:52PM +0200, Marek Polacek wrote:
  Well I wonder too ;)  I thought it can't be NULL, and tried this
  
  struct C {
C() { __builtin_unreachable (); }
  };
 
 I was more wondering about stuff like:
 int a = (__builtin_unreachable (), 1);
 or similar.

Oh yeah, that would segfault, so I added the check that c_f_d is
non-NULL.  Ok now?

2013-09-17  Marek Polacek  pola...@redhat.com

PR sanitizer/58411
* doc/extend.texi: Document no_sanitize_undefined attribute.
* builtins.c (fold_builtin_0): Don't sanitize function if it has the
no_sanitize_undefined attribute.

c-family/
* c-common.c (handle_no_sanitize_undefined_attribute): New function.
Declare it.
(struct attribute_spec c_common_att): Add no_sanitize_undefined.
cp/
* typeck.c (cp_build_binary_op): Don't sanitize function if it has the
no_sanitize_undefined attribute.
c/
* c-typeck.c (build_binary_op): Don't sanitize function if it has the
no_sanitize_undefined attribute.

testsuite/
* c-c++-common/ubsan/attrib-1.c: New test.

--- gcc/c-family/c-common.c.mp2 2013-09-17 15:55:56.417946667 +0200
+++ gcc/c-family/c-common.c 2013-09-17 15:58:55.905513029 +0200
@@ -311,6 +311,8 @@ static tree handle_no_sanitize_address_a
  int, bool *);
 static tree handle_no_address_safety_analysis_attribute (tree *, tree, tree,
 int, bool *);
+static tree handle_no_sanitize_undefined_attribute (tree *, tree, tree, int,
+   bool *);
 static tree handle_noinline_attribute (tree *, tree, tree, int, bool *);
 static tree handle_noclone_attribute (tree *, tree, tree, int, bool *);
 static tree handle_leaf_attribute (tree *, tree, tree, int, bool *);
@@ -722,6 +724,9 @@ const struct attribute_spec c_common_att
   { "no_sanitize_address",    0, 0, true, false, false,
                               handle_no_sanitize_address_attribute,
                               false },
+  { "no_sanitize_undefined",  0, 0, true, false, false,
+                              handle_no_sanitize_undefined_attribute,
+                              false },
   { "warning",                1, 1, true,  false, false,
                               handle_error_attribute, false },
   { "error",                  1, 1, true,  false, false,
@@ -6575,6 +6580,22 @@ handle_no_address_safety_analysis_attrib
   return NULL_TREE;
 }
 
+/* Handle a no_sanitize_undefined attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_no_sanitize_undefined_attribute (tree *node, tree name, tree, int,
+                                        bool *no_add_attrs)
+{
+  if (TREE_CODE (*node) != FUNCTION_DECL)
+    {
+      warning (OPT_Wattributes, "%qE attribute ignored", name);
+      *no_add_attrs = true;
+    }
+
+  return NULL_TREE;
+}
+
 /* Handle a noinline attribute; arguments as in
    struct attribute_spec.handler.  */
 
--- gcc/doc/extend.texi.mp2 2013-09-17 15:55:44.250907707 +0200
+++ gcc/doc/extend.texi 2013-09-17 16:06:21.439974916 +0200
@@ -2136,6 +2136,7 @@ attributes are currently defined for fun
 @code{warn_unused_result}, @code{nonnull}, @code{gnu_inline},
 @code{externally_visible}, @code{hot}, @code{cold}, @code{artificial},
 @code{no_sanitize_address}, @code{no_address_safety_analysis},
+@code{no_sanitize_undefined},
 @code{error} and @code{warning}.
 Several other attributes are defined for functions on particular
 target systems.  Other attributes, including @code{section} are
@@ -3500,6 +3501,12 @@ The @code{no_address_safety_analysis} is
 @code{no_sanitize_address} attribute, new code should use
 @code{no_sanitize_address}.
 
+@item no_sanitize_undefined
+@cindex @code{no_sanitize_undefined} function attribute
+The @code{no_sanitize_undefined} attribute on functions is used
+to inform the compiler that it should not check for undefined behavior
+in the function when compiling with the @option{-fsanitize=undefined} option.
+
 @item regparm (@var{number})
 @cindex @code{regparm} attribute
 @cindex functions that are passed arguments in registers on the 386
--- gcc/cp/typeck.c.mp2 2013-09-17 16:10:49.935644344 +0200
+++ gcc/cp/typeck.c 2013-09-17 16:11:20.601743694 +0200
@@ -4887,6 +4887,8 @@ cp_build_binary_op (location_t location,
   if ((flag_sanitize & SANITIZE_UNDEFINED)
       && !processing_template_decl
       && current_function_decl != 0
+      && !lookup_attribute ("no_sanitize_undefined",
+                            DECL_ATTRIBUTES (current_function_decl))
       && (doing_div_or_mod || doing_shift))
     {
       /* OP0 and/or OP1 might have side-effects.  */
--- gcc/c/c-typeck.c.mp2 2013-09-17 16:09:31.423381687 +0200
+++ gcc/c/c-typeck.c 2013-09-17 16:10:00.626476422 +0200
@@ -10498,6 +10498,8 @@ build_binary_op (location_t 

Re: Using gen_int_mode instead of GEN_INT minor testsuite fallout on MIPS

2013-09-17 Thread Richard Sandiford
Mike Stump mikest...@comcast.net writes:
 +/* Partial integer modes are specified by relation to a full integer
 +   mode.  */
 +#define PARTIAL_INT_MODE(M,PREC) \
 +  make_partial_integer_mode (#M, "P" #PREC #M, PREC, __FILE__, __LINE__)
 +/* Partial integer modes are specified by relation to a full integer
 +   mode.  */
 +#define PARTIAL_INT_MODE_NAME(M,PREC,NAME)   \
 +  make_partial_integer_mode (#M, #NAME, PREC, __FILE__, __LINE__)

Sorry for the bikeshedding, but I think it'd be better to have a single macro:

#define PARTIAL_INT_MODE(M, PREC, NAME)

You can easily add an explicit Pnmode if the port happens to want
that name.

Thanks,
Richard
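
A rough stand-alone model of the single-macro form suggested above might
look like this.  It is a sketch under assumptions: struct partial_mode and
this three-argument make_partial_integer_mode are hypothetical
simplifications, not the genmodes.c definitions, and the __FILE__ and
__LINE__ arguments are left out.

```c
/* Hypothetical record standing in for genmodes.c's mode data.  */
struct partial_mode
{
  const char *base;        /* Name of the full integer mode, e.g. "SI".  */
  const char *name;        /* Name of the partial mode, e.g. "PSI".  */
  unsigned int precision;  /* Precision in bits.  */
};

/* Simplified stand-in that just records its arguments.  */
static struct partial_mode
make_partial_integer_mode (const char *base, const char *name,
                           unsigned int precision)
{
  struct partial_mode m = { base, name, precision };
  return m;
}

/* One macro taking the base mode, the precision and an explicit name,
   so no name has to be synthesized by string pasting.  */
#define PARTIAL_INT_MODE(M, PREC, NAME) \
  make_partial_integer_mode (#M, #NAME, PREC)
```

A port that wants a name with the precision in it would then simply spell
that name out as the third argument.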


Re: [PATCH GCC]Catch more MEM_REFs sharing common addressing part in gimple strength reduction

2013-09-17 Thread Dominique Dhumieres
The new test gcc.dg/tree-ssa/slsr-39.c fails in 64 bit mode (see
http://gcc.gnu.org/ml/gcc-regression/2013-09/msg00455.html ).
Looking for MEM in the dump returns

  _12 = MEM[(int[50] *)_17];
  MEM[(int[50] *)_20] = _13;

TIA

Dominique


Re: [PATCH] manage dom-walk_data initialization and finalization with constructors and destructors

2013-09-17 Thread Jeff Law

On 09/17/2013 12:39 PM, Trevor Saunders wrote:

I'd like to go ahead and get your patch installed -- do you have a
GCC copyright assignment on file with the FSF?  Your change is large
enough to require one.


It's my understanding that Mozilla has one covering work done by
employees, which would include me.

OK.  Corporate blanket assignment works for me.



sorry about the formatting issues.

No worries.  It takes time to get up to speed on all the niggling details.

I'll throw it into a build/regression test cycle, assuming nothing bad 
pops out, I'll get it installed.


jeff




Re: [PATCH] RTEMS: Add LEON3/SPARC multilibs

2013-09-17 Thread Joel Sherrill
Committed to the head.

Is this too radical to also go on the 4.8 branch?
We would need to discuss it on the RTEMS side but it
only impacts us if the multilib is there for sparc-elf
on 4.8.

Thanks Sebastian.

On 8/30/2013 6:58 AM, Daniel Hellstrom wrote:
 Hello Sebastian,
 
 That seems like a good idea.
 
 Thanks,
 Daniel
 
 
 On 08/29/2013 01:04 PM, Sebastian Huber wrote:
 Recently support for LEON3-specific instructions was added to GCC.
 Make this support available for RTEMS.

 This patch should be committed to GCC 4.9.

 gcc/ChangeLog
 2013-08-29  Sebastian Huber  sebastian.hu...@embedded-brains.de

  * config/sparc/t-rtems: Add leon3 multilibs.
 ---
   gcc/config/sparc/t-rtems |4 ++--
   1 files changed, 2 insertions(+), 2 deletions(-)

 diff --git a/gcc/config/sparc/t-rtems b/gcc/config/sparc/t-rtems
 index 63d0217..f1a3d84 100644
 --- a/gcc/config/sparc/t-rtems
 +++ b/gcc/config/sparc/t-rtems
 @@ -17,6 +17,6 @@
   # http://www.gnu.org/licenses/.
   #
   
 -MULTILIB_OPTIONS = msoft-float mcpu=v8
 -MULTILIB_DIRNAMES = soft v8
 +MULTILIB_OPTIONS = msoft-float mcpu=v8/mcpu=leon3
 +MULTILIB_DIRNAMES = soft v8 leon3
   MULTILIB_MATCHES = msoft-float=mno-fpu
 



-- 
Joel Sherrill, Ph.D.             Director of Research & Development
joel.sherr...@oarcorp.com        On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available                (256) 722-9985


[v3] More noexcept for lists

2013-09-17 Thread Marc Glisse

Hello,

after vectors, lists. I didn't touch the throw we were discussing earlier 
today for now. There will be an inconsistency with debug list iterators 
because they use a general wrapper:
- I would need François to tell if that wrapper is ever used with 
iterators that can throw,
- the same wrapper is used for several containers, so unless we change all 
containers at once it can't stay consistent.


Bootstrap+testsuite ok.

2013-09-18  Marc Glisse  marc.gli...@inria.fr

PR libstdc++/58338
* include/bits/list.tcc (_List_base::_M_clear, list::erase): Mark as
noexcept.
* include/bits/stl_list.h (_List_iterator) [_List_iterator,
_M_const_cast, operator*, operator->, operator++, operator--,
operator==, operator!=]: Likewise.
(_List_const_iterator) [_List_const_iterator, _M_const_cast, operator*,
operator->, operator++, operator--, operator==, operator!=]: Likewise.
(operator==(const _List_iterator&, const _List_const_iterator&),
operator!=(const _List_iterator&, const _List_const_iterator&)):
Likewise.
(_List_impl) [_List_impl(const _Node_alloc_type&),
_List_impl(_Node_alloc_type&&)]: Likewise.
(_List_base) [_M_put_node, _List_base(const _Node_alloc_type&),
_List_base(_List_base&&), _M_clear, _M_init]: Likewise.
(list) [list(), list(const allocator_type&)]: Merge.
(list) [list(const allocator_type&), front, back, pop_front, pop_back,
erase, _M_erase]: Mark as noexcept.
* include/debug/list (list) [list(const _Allocator&), front, back,
pop_front, pop_back, _M_erase, erase]: Likewise.
* include/profile/list (list) [list(const _Allocator&), front, back,
pop_front, pop_back, erase]: Likewise.
* testsuite/23_containers/list/requirements/dr438/assign_neg.cc:
Adjust line number.
* testsuite/23_containers/list/requirements/dr438/constructor_1_neg.cc:
Likewise.
* testsuite/23_containers/list/requirements/dr438/constructor_2_neg.cc:
Likewise.
* testsuite/23_containers/list/requirements/dr438/insert_neg.cc:
Likewise.

--
Marc Glisse
Index: include/bits/list.tcc
===
--- include/bits/list.tcc   (revision 202655)
+++ include/bits/list.tcc   (working copy)
@@ -56,21 +56,21 @@
 #ifndef _LIST_TCC
 #define _LIST_TCC 1
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 
   template<typename _Tp, typename _Alloc>
     void
     _List_base<_Tp, _Alloc>::
-    _M_clear()
+    _M_clear() _GLIBCXX_NOEXCEPT
     {
       typedef _List_node<_Tp>  _Node;
       _Node* __cur = static_cast<_Node*>(_M_impl._M_node._M_next);
       while (__cur != &_M_impl._M_node)
         {
           _Node* __tmp = __cur;
           __cur = static_cast<_Node*>(__cur->_M_next);
 #if __cplusplus >= 201103L
           _M_get_Node_allocator().destroy(__tmp);
 #else
@@ -138,21 +138,21 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
return __it;
  }
return __position._M_const_cast();
   }
 #endif
 
   template<typename _Tp, typename _Alloc>
     typename list<_Tp, _Alloc>::iterator
     list<_Tp, _Alloc>::
 #if __cplusplus >= 201103L
-    erase(const_iterator __position)
+    erase(const_iterator __position) noexcept
 #else
     erase(iterator __position)
 #endif
     {
       iterator __ret = iterator(__position._M_node->_M_next);
       _M_erase(__position._M_const_cast());
       return __ret;
     }
 
 #if __cplusplus >= 201103L
Index: include/bits/stl_list.h
===
--- include/bits/stl_list.h (revision 202655)
+++ include/bits/stl_list.h (working copy)
@@ -126,76 +126,76 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 {
       typedef _List_iterator<_Tp>                _Self;
       typedef _List_node<_Tp>                    _Node;
 
       typedef ptrdiff_t                          difference_type;
       typedef std::bidirectional_iterator_tag    iterator_category;
       typedef _Tp                                value_type;
       typedef _Tp*                               pointer;
       typedef _Tp&                               reference;
 
-  _List_iterator()
+  _List_iterator() _GLIBCXX_NOEXCEPT
   : _M_node() { }
 
   explicit
-  _List_iterator(__detail::_List_node_base* __x)
+  _List_iterator(__detail::_List_node_base* __x) _GLIBCXX_NOEXCEPT
   : _M_node(__x) { }
 
   _Self
-  _M_const_cast() const
+  _M_const_cast() const _GLIBCXX_NOEXCEPT
   { return *this; }
 
   // Must downcast from _List_node_base to _List_node to get to _M_data.
   reference
-  operator*() const
+  operator*() const _GLIBCXX_NOEXCEPT
       { return static_cast<_Node*>(_M_node)->_M_data; }
 
       pointer
-      operator->() const
+      operator->() const _GLIBCXX_NOEXCEPT
   { return 

patch to canonize small wide-ints.

2013-09-17 Thread Kenneth Zadeck

Richi,

This patch canonizes the bits above the precision for wide ints with 
types or modes that are not a perfect multiple of HOST_BITS_PER_WIDE_INT.
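
To make "canonizing the bits above the precision" concrete, here is a small sketch. It assumes a 64-bit host word and assumes that the canonical form is sign extension from the precision, which is how I read the `sext_hwi` call in the rtl.h hunk; `sign_extend` and `is_canonical` are hypothetical helper names, not wide-int API.

```cpp
#include <cassert>
#include <cstdint>

// Assumed host word width for this sketch (stand-in for HOST_BITS_PER_WIDE_INT).
constexpr unsigned kHostBits = 64;

// Sign-extend the low `prec` bits of `value` to a full host word, so that
// every bit above the precision is a copy of the sign bit.
inline std::int64_t sign_extend(std::int64_t value, unsigned prec) {
  assert(prec >= 1 && prec <= kHostBits);
  if (prec == kHostBits)
    return value;  // nothing above the precision to fix up
  const std::uint64_t mask = (std::uint64_t{1} << prec) - 1;
  const std::uint64_t sign = std::uint64_t{1} << (prec - 1);
  const std::uint64_t low = static_cast<std::uint64_t>(value) & mask;
  // Standard xor/subtract trick; avoids shifting a negative value.
  return static_cast<std::int64_t>((low ^ sign) - sign);
}

// A word is canonical for `prec` if sign extension leaves it unchanged.
inline bool is_canonical(std::int64_t value, unsigned prec) {
  return sign_extend(value, prec) == value;
}
```

Under this assumption a 32-bit value with all bits set is stored as an all-ones host word, which is the "upper 32 bits set" situation for unsigned constants mentioned further down.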


I expect that most of the changes in rtl.h will go away.  In 
particular, when we decide that we can depend on Richard's patch to 
clean up rtl constants, then the only thing that will be left will be 
the addition of the TARGET_SUPPORTS_WIDE_INT test.


I do believe that there is one more conserved force in the universe than 
what physicists generally consider: it is ugliness.  There is a lot of 
truth and beauty in the patch, but in truth there are a lot of places 
where the ugliness is just moved someplace else.


In the pushing-the-ugly-around department, trees and wide-ints are not 
canonized the same way.  I spent several days going down the road 
where I tried to have them be the same, but it got very ugly having 
32-bit unsigned int csts have the upper 32 bits set.  So 
wide_int_to_tree and the wide-int constructors from tree-cst are now 
more complex.


I think that I am in favor of this patch, especially in conjunction with 
Richard's cleanup, but only mildly.


There is also some cleanup where Richard wanted the long lines addressed.

Ok to commit to the wide-int branch?

kenny

Index: gcc/emit-rtl.c
===
--- gcc/emit-rtl.c	(revision 202389)
+++ gcc/emit-rtl.c	(working copy)
@@ -579,8 +579,6 @@ immed_wide_int_const (const wide_int v,
   if (len < 2 || prec <= HOST_BITS_PER_WIDE_INT)
 return gen_int_mode (v.elt (0), mode);
 
-  wide_int copy = v;
-  wi::clear_undef (copy, SIGNED);
 #if TARGET_SUPPORTS_WIDE_INT
   {
 unsigned int i;
@@ -599,12 +597,12 @@ immed_wide_int_const (const wide_int v,
 CWI_PUT_NUM_ELEM (value, len);
 
     for (i = 0; i < len; i++)
-  CONST_WIDE_INT_ELT (value, i) = copy.elt (i);
+  CONST_WIDE_INT_ELT (value, i) = v.elt (i);
 
 return lookup_const_wide_int (value);
   }
 #else
-  return immed_double_const (copy.elt (0), copy.elt (1), mode);
+  return immed_double_const (v.elt (0), v.elt (1), mode);
 #endif
 }
 
Index: gcc/lto-streamer-in.c
===
--- gcc/lto-streamer-in.c	(revision 202389)
+++ gcc/lto-streamer-in.c	(working copy)
@@ -1273,7 +1273,7 @@ lto_input_tree_1 (struct lto_input_block
       for (i = 0; i < len; i++)
 	a[i] = streamer_read_hwi (ib);
       result = wide_int_to_tree (type, wide_int::from_array
-				 (a, len, TYPE_PRECISION (type), false));
+				 (a, len, TYPE_PRECISION (type)));
       streamer_tree_cache_append (data_in->reader_cache, result, hash);
 }
   else if (tag == LTO_tree_scc)
Index: gcc/real.c
===
--- gcc/real.c	(revision 202389)
+++ gcc/real.c	(working copy)
@@ -2248,7 +2248,6 @@ real_from_integer (REAL_VALUE_TYPE *r, e
   /* Clear out top bits so elt will work with precisions that aren't
 	 a multiple of HOST_BITS_PER_WIDE_INT.  */
   val = wide_int::from (val, len, sgn);
-  wi::clear_undef (val, sgn);
   len = len / HOST_BITS_PER_WIDE_INT;
 
   SET_REAL_EXP (r, len * HOST_BITS_PER_WIDE_INT + e);
Index: gcc/rtl.h
===
--- gcc/rtl.h	(revision 202389)
+++ gcc/rtl.h	(working copy)
@@ -1422,6 +1422,7 @@ wi::int_traits rtx_mode_t::get_precisi
   return GET_MODE_PRECISION (x.second);
 }
 
+#if 0
 inline wi::storage_ref
 wi::int_traits rtx_mode_t::decompose (HOST_WIDE_INT *,
 	unsigned int precision,
@@ -1437,13 +1438,57 @@ wi::int_traits rtx_mode_t::decompose (
   return wi::storage_ref (CONST_WIDE_INT_ELT (x.first, 0),
 			  CONST_WIDE_INT_NUNITS (x.first), precision);
   
+#if TARGET_SUPPORTS_WIDE_INT != 0
 case CONST_DOUBLE:
   return wi::storage_ref (CONST_DOUBLE_LOW (x.first), 2, precision);
+#endif
   
 default:
   gcc_unreachable ();
 }
 }
+#else
+/* For now, assume that the storage is not canonical, i.e. that there
+   are bits above the precision that are not all zeros or all ones.
+   If this is fixed in rtl, then we will not need the calls to
+   force_to_size.  */
+inline wi::storage_ref
+wi::int_traits rtx_mode_t::decompose (HOST_WIDE_INT *scratch,
+	unsigned int precision,
+	const rtx_mode_t x)
+{
+  int len;
+  int small_prec = precision & (HOST_BITS_PER_WIDE_INT - 1);
+
+  gcc_checking_assert (precision == get_precision (x));
+  switch (GET_CODE (x.first))
+{
+case CONST_INT:
+  len = 1;
+  if (small_prec)
+	scratch[0] = sext_hwi (INTVAL (x.first), precision);
+  else
+	scratch = INTVAL (x.first);
+  break;
+  
+case CONST_WIDE_INT:
+  len = CONST_WIDE_INT_NUNITS (x.first);
+  scratch = CONST_WIDE_INT_ELT (x.first, 0);
+  break;
+  
+#if TARGET_SUPPORTS_WIDE_INT == 0
+case CONST_DOUBLE:
+  len = 2;
+  scratch = CONST_DOUBLE_LOW (x.first);
+  break;
+#endif  
+
+

[rl78] optimize prologues

2013-09-17 Thread DJ Delorie

Committed.

2013-09-17  Nick Clifton  ni...@redhat.com

* config/rl78/rl78.c (need_to_save): Change return type to bool.
For interrupt functions: save all call clobbered registers if the
interrupt handler is not a leaf function.
(rl78_expand_prologue): Always recompute the frame information.
For interrupt functions: only select bank 0 if one of the bank 0
registers is going to be pushed.
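
As a reading aid, the new need_to_save decision can be modelled outside the compiler roughly as below. This is only a sketch: `FuncInfo`, `RegInfo` and `need_to_save_model` are hypothetical names, the flags stand in for `crtl`, `call_used_regs`, `fixed_regs` and `df_regs_ever_live_p`, and it assumes the comparison that lost its operator in the archived diff is `regno > 23`.

```cpp
#include <cassert>

// Hypothetical summary of the current function's properties.
struct FuncInfo {
  bool is_interrupt;
  bool is_leaf;
  bool frame_pointer_needed;
  bool calls_eh_return;
};

// Hypothetical summary of one register's properties.
struct RegInfo {
  unsigned regno;
  bool is_frame_pointer;
  bool call_used;
  bool fixed;
  bool ever_live;
};

// Model of the patched logic: returns true if the register must be saved.
inline bool need_to_save_model(const FuncInfo& f, const RegInfo& r) {
  if (f.is_interrupt) {
    // Registers reserved for interrupt handlers need no saving
    // (assumed bound: regno > 23).
    if (r.regno > 23)
      return false;
    // A non-leaf handler may call code that clobbers call-used registers.
    if (!f.is_leaf && r.call_used)
      return true;
    // Otherwise save only what the handler itself uses.
    return r.ever_live;
  }
  if (r.is_frame_pointer && f.frame_pointer_needed)
    return true;
  if (r.fixed)
    return false;
  if (f.calls_eh_return)
    return true;
  return r.ever_live && !r.call_used;
}
```

The visible change for leaf interrupt handlers is that low-numbered registers are no longer saved unconditionally; they are saved only when live.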

Index: config/rl78/rl78.c
===
--- config/rl78/rl78.c  (revision 202666)
+++ config/rl78/rl78.c  (working copy)
@@ -537,40 +537,45 @@ rl78_force_nonfar_3 (rtx *operands, rtx 
 static bool
 rl78_can_eliminate (const int from ATTRIBUTE_UNUSED, const int to 
ATTRIBUTE_UNUSED)
 {
   return true;
 }
 
-/* Returns nonzero if the given register needs to be saved by the
+/* Returns true if the given register needs to be saved by the
current function.  */
-static int
-need_to_save (int regno)
+static bool
+need_to_save (unsigned int regno)
 {
   if (is_interrupt_func (cfun->decl))
     {
-      if (regno < 8)
-	return 1; /* don't know what devirt will need */
+       /* We don't need to save registers that have
+	  been reserved for interrupt handlers.  */
       if (regno > 23)
-	return 0; /* don't need to save interrupt registers */
-      if (crtl->is_leaf)
-	{
-	  return df_regs_ever_live_p (regno);
-	}
-      else
-	return 1;
+	return false;
+
+      /* If the handler is a non-leaf function then it may call
+	 non-interrupt aware routines which will happily clobber
+	 any call_used registers, so we have to preserve them.  */
+      if (!crtl->is_leaf && call_used_regs[regno])
+	return true;
+
+      /* Otherwise we only have to save a register, call_used
+	 or not, if it is used by this handler.  */
+      return df_regs_ever_live_p (regno);
     }
+
   if (regno == FRAME_POINTER_REGNUM && frame_pointer_needed)
-    return 1;
+    return true;
   if (fixed_regs[regno])
-    return 0;
+    return false;
   if (crtl->calls_eh_return)
-    return 1;
+    return true;
   if (df_regs_ever_live_p (regno)
       && !call_used_regs[regno])
-    return 1;
-  return 0;
+    return true;
+  return false;
 }
 
 /* We use this to wrap all emitted insns in the prologue.  */
 static rtx
 F (rtx x)
 {
@@ -1023,20 +1028,26 @@ rl78_expand_prologue (void)
   rtx sp = gen_rtx_REG (HImode, STACK_POINTER_REGNUM);
   int rb = 0;
 
   if (rl78_is_naked_func ())
 return;
 
-  if (!cfun->machine->computed)
-    rl78_compute_frame_info ();
+  /* Always re-compute the frame info - the register usage may have changed.  */
+  rl78_compute_frame_info ();
 
   if (flag_stack_usage_info)
     current_function_static_stack_size = cfun->machine->framesize;
 
   if (is_interrupt_func (cfun->decl) && !TARGET_G10)
-    emit_insn (gen_sel_rb (GEN_INT (0)));
+    for (i = 0; i < 4; i++)
+      if (cfun->machine->need_to_push [i])
+	{
+	  /* Select Bank 0 if we are using any registers from Bank 0.   */
+	  emit_insn (gen_sel_rb (GEN_INT (0)));
+	  break;
+	}
 
   for (i = 0; i < 16; i++)
     if (cfun->machine->need_to_push [i])
   {
if (TARGET_G10)
  {


Re: [PATCH v2 1/6] Convert symtab, cgraph and varpool nodes into a real class hierarchy

2013-09-17 Thread David Malcolm
On Tue, 2013-09-10 at 15:34 +0200, Jan Hubicka wrote:

Thanks for reviewing this, and sorry for the late response (I lost most
of last week to illness).  Some questions inline below...

  This patch is the handwritten part of the conversion of these types
  to C++; it requires the followup patch, which is autogenerated.
  
  It converts:
struct GTY(()) symtab_node_base
  to:
class GTY((user)) symtab_node_base
  
  and converts:
struct GTY(()) cgraph_node
  to:
struct GTY((user)) cgraph_node : public symtab_node_base
  
  and:
struct GTY(()) varpool_node
  to:
class GTY((user)) varpool_node : public symtab_node_base
  
  dropping the symtab_node_def union.
  
  Since gengtype is unable to cope with inheritance, we have to mark the
  types with GTY((user)), and handcode the gty field-visiting functions.
  Given the simple hierarchy, we don't need virtual functions for this.
  
  Unfortunately doing so runs into various bugs in gengtype's handling
  of GTY((user)), so the patch also includes workarounds for these bugs.
  
  gengtype walks the graph of the *types* in the code, and produces
  functions in gtype-desc.[ch] for all types that are reachable from a
  GTY root.
  
  However, it ignores the contents of GTY((user)) types when walking
  this graph.
  
  Hence if you have a subgraph of types that are only reachable
  via fields in GTY((user)) types, gengtype won't generate helper code
  for those types.
  
  Ideally there would be some way to mark a GTY((user)) type to say
  which types it references, to avoid having to mark these types as
  GTY((user)).
  
  For now, work around this issue by providing explicit roots of the
  missing types, of dummy variables (see the bottom of cgraph.c)
  
[..]

  diff --git a/gcc/cgraph.c b/gcc/cgraph.c
  index f12bf1b..4b12163 100644
  --- a/gcc/cgraph.c
  +++ b/gcc/cgraph.c
  @@ -2994,4 +2994,222 @@ cgraph_get_body (struct cgraph_node *node)
 return true;
   }
   
  +/* GTY((user)) hooks for symtab_node_base (and its subclasses).
  +   We could use virtual functions for this, but given the presence of the
  +   type field and the trivial size of the class hierarchy, switches are
  +   perhaps simpler and faster.  */
  +
  +void gt_ggc_mx (symtab_node_base *x)
  +{
  +  /* Hand-written equivalent of the chain_next/chain_prev machinery, to
  + avoid deep call-stacks.
  +
  + Locate the neighbors of x (within the linked-list) that haven't been
  + marked yet, so that x through xlimit give a range suitable for 
  marking.
  + Note that x (on entry) itself has already been marked by the
  + gtype-desc.c code, so we first try its successor.
  +  */
  +  symtab_node_base * xlimit = x ? x->next : NULL;
  +  while (ggc_test_and_set_mark (xlimit))
  +   xlimit = xlimit->next;
  +  if (x != xlimit)
  +    for (;;)
  +      {
  +        symtab_node_base * const xprev = x->previous;
  +        if (xprev == NULL) break;
  +        x = xprev;
  +        (void) ggc_test_and_set_mark (xprev);
  +      }
  +  while (x != xlimit)
  +    {
  +      /* Code common to all symtab nodes. */
  +      gt_ggc_m_9tree_node (x->decl);
  +      gt_ggc_mx_symtab_node_base (x->next);
 Aren't you marking next twice? Once by xlimit walk and one by recursion?
Good catch; removed.

  +  gt_ggc_mx_symtab_node_base (x->previous);
The comment above also applies to previous, so I've removed this also.

  +  gt_ggc_mx_symtab_node_base (x->next_sharing_asm_name);
  +  gt_ggc_mx_symtab_node_base (x->previous_sharing_asm_name);
  +  gt_ggc_mx_symtab_node_base (x->same_comdat_group);
 
 You can skip marking these.  They only point within symbol table and
 not externally.
OK; removed.

  +  gt_ggc_m_20vec_ipa_ref_t_va_gc_ (x->ref_list.references);
  +  gt_ggc_m_9tree_node (x->alias_target);
  +  gt_ggc_m_18lto_file_decl_data (x->lto_file_data);
  +
  +  /* Extra code, per subclass. */
  +  switch (x->type)
 Didn't we agreed on the is_a API?

There's just one interesting subclass here, so I've converted this to:

  if (cgraph_node *cgn = dyn_cast <cgraph_node *> (x))
{

eliminating the switch and static_cast.
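
For anyone not familiar with the pattern, a tag-based downcast of this kind can be sketched as follows. This is not GCC's is-a.h; `SymNode`, `FuncNode`, `VarNode`, `as_func` and `count_callees` are hypothetical names used only to show the shape of the conversion.

```cpp
#include <cassert>

// Hypothetical type tag, playing the role of the node's type field.
enum SymtabType { SYMTAB_FUNCTION, SYMTAB_VARIABLE };

struct SymNode {
  SymtabType type;
  explicit SymNode(SymtabType t) : type(t) {}
};

struct FuncNode : SymNode {
  int callees = 0;
  FuncNode() : SymNode(SYMTAB_FUNCTION) {}
};

struct VarNode : SymNode {
  VarNode() : SymNode(SYMTAB_VARIABLE) {}
};

// Checked downcast driven by the tag: returns nullptr when the node is
// not a function node, so no RTTI or virtual functions are needed.
inline FuncNode* as_func(SymNode* n) {
  return (n && n->type == SYMTAB_FUNCTION) ? static_cast<FuncNode*>(n)
                                           : nullptr;
}

// The test and the cast collapse into a single declaration, which is
// what replaces the switch plus static_cast.
inline int count_callees(SymNode* n) {
  if (FuncNode* fn = as_func(n))
    return fn->callees;
  return 0;
}
```

With only one interesting subclass, the single `if` keeps the subclass-specific code next to the check that guards it.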

  +{
  +case SYMTAB_FUNCTION:
  +  {
 
  +cgraph_node *cgn = static_cast <cgraph_node *> (x);
  +gt_ggc_m_11cgraph_edge (cgn->callees);
  +gt_ggc_m_11cgraph_edge (cgn->callers);
  +gt_ggc_m_11cgraph_edge (cgn->indirect_calls);
  +gt_ggc_m_11cgraph_node (cgn->origin);
  +gt_ggc_m_11cgraph_node (cgn->nested);
  +gt_ggc_m_11cgraph_node (cgn->next_nested);
  +gt_ggc_m_11cgraph_node (cgn->next_sibling_clone);
  +gt_ggc_m_11cgraph_node (cgn->prev_sibling_clone);
  +gt_ggc_m_11cgraph_node (cgn->clones);
  +gt_ggc_m_11cgraph_node (cgn->clone_of);
 Same as here.

Sorry, it's not clear to me what you meant by "Same as here." here.  Do
you mean that I can skip marking them because they're 

[rl78] Add -mallregs

2013-09-17 Thread DJ Delorie

GCC typically avoids using virtual registers $r24 through $r31, as
this register bank (bank 3) is reserved for hand-written assembly
interrupt handlers.  If unneeded for that, this new option lets gcc
use those registers also.  Committed.

* config/rl78/constraints.md (Wcv): Allow up to $r31.
* config/rl78/rl78.c (rl78_asm_file_start): Likewise.
(rl78_option_override): Likewise, if -mallregs.
(is_virtual_register): Likewise.
* config/rl78/rl78.h (reg_class): Extend VREGS to $r31.
(REGNO_OK_FOR_BASE_P): Likewise.
* config/rl78/rl78.opt (-mallregs): New.

Index: config/rl78/rl78.h
===
--- config/rl78/rl78.h  (revision 202668)
+++ config/rl78/rl78.h  (working copy)
@@ -262,13 +262,13 @@ enum reg_class
   { 0x000c, 0x },  /* B and C - index regs.  */\
   { 0x00ff, 0x },  /* all real registers.  */  \
   { 0x, 0x0001 },  /* SP */\
   { 0x0300, 0x },  /* R8 - HImode */   \
   { 0x0c00, 0x },  /* R10 - HImode */  \
   { 0xff00, 0x },  /* INT - HImode */  \
-  { 0x007fff00, 0x },  /* Virtual registers.  */   \
+  { 0xff7fff00, 0x },  /* Virtual registers.  */   \
   { 0xff7f, 0x0002 },  /* General registers.  */   \
   { 0x0400, 0x0004 },  /* PSW.  */ \
   { 0xff7f, 0x001f }   /* All registers.  */   \
 }
 
 #define TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P hook_bool_mode_true
@@ -349,13 +349,13 @@ enum reg_class
   (IN_RANGE ((REGNO), (MIN), (MAX))			\
    || (reg_renumber != NULL				\
        && reg_renumber[(REGNO)] >= (MIN)		\
        && reg_renumber[(REGNO)] <= (MAX)))
 
 #ifdef REG_OK_STRICT
-#define REGNO_OK_FOR_BASE_P(regno)  REGNO_IN_RANGE (regno, 16, 23)
+#define REGNO_OK_FOR_BASE_P(regno)  REGNO_IN_RANGE (regno, 16, 31)
 #else
 #define REGNO_OK_FOR_BASE_P(regno) 1
 #endif
 
 #define REGNO_OK_FOR_INDEX_P(regno)REGNO_OK_FOR_BASE_P (regno)
 
Index: config/rl78/constraints.md
===
--- config/rl78/constraints.md  (revision 202668)
+++ config/rl78/constraints.md  (working copy)
@@ -260,16 +260,16 @@
   es:[AX..HL] for calls
   (match_test rl78_es_addr (op)  satisfies_constraint_Cca (rl78_es_base 
(op))
|| satisfies_constraint_Cca (op))
   )
 
 (define_memory_constraint Ccv
-  [AX..HL,r8-r23] for calls
+  [AX..HL,r8-r31] for calls
   (and (match_code mem)
(and (match_code reg 0)
-   (match_test REGNO (XEXP (op, 0))  24)))
+   (match_test REGNO (XEXP (op, 0))  31)))
   )
 (define_memory_constraint Wcv
   es:[AX..HL,r8-r23] for calls
   (match_test rl78_es_addr (op)  satisfies_constraint_Ccv (rl78_es_base 
(op))
|| satisfies_constraint_Ccv (op))
   )
Index: config/rl78/rl78.c
===
--- config/rl78/rl78.c  (revision 202668)
+++ config/rl78/rl78.c  (working copy)
@@ -269,12 +269,13 @@ rl78_asm_file_start (void)
   else
 {
       for (i = 0; i < 8; i++)
 	{
 	  fprintf (asm_out_file, "r%d\t=\t0x%x\n", 8 + i, 0xffef0 + i);
 	  fprintf (asm_out_file, "r%d\t=\t0x%x\n", 16 + i, 0xffee8 + i);
+	  fprintf (asm_out_file, "r%d\t=\t0x%x\n", 24 + i, 0xffee0 + i);
}
 }
 
   opt_pass *rl78_devirt_pass = make_pass_rl78_devirt (g);
   static struct register_pass_info rl78_devirt_info =
 {
@@ -306,12 +307,19 @@ rl78_option_override (void)
 {
   flag_omit_frame_pointer = 1;
   flag_no_function_cse = 1;
   flag_split_wide_types = 0;
 
   init_machine_status = rl78_init_machine_status;
+
+  if (TARGET_ALLREGS)
+{
+  int i;
+      for (i=24; i<32; i++)
+   fixed_regs[i] = 0;
+}
 }
 
 /* Most registers are 8 bits.  Some are 16 bits because, for example,
gcc doesn't like dealing with $FP as a register pair.  This table
maps register numbers to size in bytes.  */
 static const int register_sizes[] =
@@ -2212,13 +2220,13 @@ insn_ok_now (rtx insn)
 /* Returns TRUE if R is a virtual register.  */
 static bool
 is_virtual_register (rtx r)
 {
   return (GET_CODE (r) == REG
 	  && REGNO (r) >= 8
-	  && REGNO (r) < 24);
+	  && REGNO (r) < 32);
 }
 
 /* In all these alloc routines, we expect the following: the insn
pattern is unshared, the insn was previously recognized and failed
due to predicates or constraints, and the operand data is in
recog_data.  */
Index: config/rl78/rl78.opt
===
--- config/rl78/rl78.opt(revision 202668)
+++ config/rl78/rl78.opt(working copy)
@@ -39,12 +39,16 @@ Enum(rl78_mul_types) String(none) Value(
 EnumValue
 Enum(rl78_mul_types) String(rl78) Value(MUL_RL78)
 
 EnumValue
 Enum(rl78_mul_types) 

libgo patch committed: Fix reflect bug in method calls

2013-09-17 Thread Ian Lance Taylor
This patch to libgo fixes a bug when calling a method when the
reflect.Value object holds a pointer to the actual value.  The code was
calling iword which tests v.kind, but for a method value that is always
Func.  This fixes the code to implement iword directly using
v.typ.Kind().  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline and 4.8 branch.

I added a test to the reflect testsuite in the master sources, and it
will be imported into the gccgo sources in due course.

Ian

Index: libgo/go/reflect/value.go
===
--- libgo/go/reflect/value.go	(revision 202233)
+++ libgo/go/reflect/value.go	(working copy)
@@ -611,7 +611,13 @@ func methodReceiver(op string, v Value,
 		}
 		fn = unsafe.Pointer(m.tfn)
 		t = m.mtyp
-		rcvr = v.iword()
+		// Can't call iword here, because it checks v.kind,
+		// and that is always Func.
+		if v.flag&flagIndir != 0 && (v.typ.Kind() == Ptr || v.typ.Kind() == UnsafePointer) {
+			rcvr = loadIword(v.val, v.typ.size)
+		} else {
+			rcvr = iword(v.val)
+		}
 	}
 	return
 }


[PATCH]Fix missed propagation opportunity in DOM

2013-09-17 Thread Jeff Law


This is a repost with fixes to avoid the phase-ordering problem exposed 
by 58387 and 58340.  I've included the testcase for 58387.


--


I recently noticed that we were failing to propagate edge equivalences 
into PHI arguments in non-dominated successors.


The case looks like this:

;;   basic block 11, loop depth 0, count 0, freq 160, maybe hot
;;prev block 10, next block 12, flags: (NEW, REACHABLE)
;;pred:   10 [50.0%]  (FALSE_VALUE,EXECUTABLE)
  _257 = di_13(D)->comps;
  _258 = (long unsigned int) _255;
  _259 = _258 * 24;
  p_260 = _257 + _259;
  _261 = _255 + 1;
  di_13(D)->next_comp = _261;
  if (p_260 != 0B)
    goto <bb 12>;
  else
    goto <bb 13>;
;;succ:   12 [100.0%]  (TRUE_VALUE,EXECUTABLE)
;;13 (FALSE_VALUE,EXECUTABLE)

;;   basic block 12, loop depth 0, count 0, freq 272, maybe hot
;;   Invalid sum of incoming frequencies 160, should be 272
;;prev block 11, next block 13, flags: (NEW, REACHABLE)
;;pred:   11 [100.0%]  (TRUE_VALUE,EXECUTABLE)
  p_260->type = 37;
  p_260->u.s_builtin.type = _139;
;;succ:   13 [100.0%]  (FALLTHRU,EXECUTABLE)

;;   basic block 13, loop depth 0, count 0, freq 319, maybe hot
;;   Invalid sum of incoming frequencies 432, should be 319
;;prev block 12, next block 14, flags: (NEW, REACHABLE)
;;pred:   110 [100.0%]  (FALLTHRU)
;;12 [100.0%]  (FALLTHRU,EXECUTABLE)
;;11 (FALSE_VALUE,EXECUTABLE)
  # _478 = PHI <0B(110), p_260(12), p_260(11)>
  ret = _478;
  _142 = di_13(D)->expansion;
  _143 = _478->u.s_builtin.type;

In particular note block 11 does *not* dominate block 13.  However, we 
know that when we traverse the edge 11->13 that p_260 will have the 
value zero, which should be propagated into the PHI node.
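
At the source level the shape is roughly the following. This is only an illustrative sketch: `Comp`, `before` and `after` are hypothetical names, and the two functions show by hand what the join looks like before and after the edge equivalence is substituted into the PHI argument.

```cpp
#include <cassert>

// Hypothetical stand-in for the structure that p_260 points to.
struct Comp {
  int type;
};

// Before: the value merged at the join is p on both incoming paths,
// even though p is known to be null on the false edge.
inline Comp* before(Comp* p) {
  if (p != nullptr)
    p->type = 37;  // corresponds to block 12
  return p;        // join: PHI <p(12), p(11)>
}

// After: on the false edge the test has established p == nullptr, so the
// constant can stand in for p on that path only.
inline Comp* after(Comp* p) {
  Comp* result;
  if (p != nullptr) {
    p->type = 37;
    result = p;        // argument from block 12 is unchanged
  } else {
    result = nullptr;  // argument from block 11 becomes the constant
  }
  return result;       // join: PHI <p(12), 0B(11)>
}
```

Both versions compute the same result; the second simply makes the constant visible at the join, which is what later passes need in order to notice a null dereference on that path.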


After fixing the propagation with the attached patch we have
_478 = PHI <0B(110), p_260(12), 0B(11)>

I have other code which then discovers the unconditional NULL pointer 
dereferences when we traverse 110->13 or 11->13 and isolates those paths.


That in turn allows blocks 12 and 13 to be combined, which in turn 
allows discovery of additional CSE opportunities.


Bootstrapped and regression tested on x86_64-unknown-linux-gnu.  Applied 
to the trunk.


* gcc.c-torture/execute/pr58387.c: New test.


* tree-ssa-dom.c (cprop_into_successor_phis): Also propagate
edge implied equivalences into successor phis.
* tree-ssa-threadupdate.c (phi_args_equal_on_edges): Moved into
here from tree-ssa-threadedge.c.
(mark_threaded_blocks): When threading through a joiner, if both
successors of the joiner's clone reach the same block, verify the
PHI arguments are equal.  If not, cancel the jump threading request.
* tree-ssa-threadedge.c (phi_args_equal_on_edges): Moved into
tree-ssa-threadupdate.c
(thread_across_edge): Don't check PHI argument equality when
threading through joiner block here.

diff --git a/gcc/testsuite/gcc.c-torture/execute/pr58387.c 
b/gcc/testsuite/gcc.c-torture/execute/pr58387.c
new file mode 100644
index 000..74c32df
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr58387.c
@@ -0,0 +1,11 @@
+extern void abort(void);
+
+int a = -1; 
+
+int main ()
+{
+  int b = a == 0 ? 0 : -a;
+  if (b < 1)
+abort ();
+  return 0;
+}
diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index e02a566..bf75135 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -1642,6 +1642,28 @@ cprop_into_successor_phis (basic_block bb)
   if (gsi_end_p (gsi))
continue;
 
+  /* We may have an equivalence associated with this edge.  While
+we can not propagate it into non-dominated blocks, we can
+propagate them into PHIs in non-dominated blocks.  */
+
+  /* Push the unwind marker so we can reset the const and copies
+table back to its original state after processing this edge.  */
+  const_and_copies_stack.safe_push (NULL_TREE);
+
+  /* Extract and record any simple NAME = VALUE equivalences. 
+
+Don't bother with [01] = COND equivalences, they're not useful
+here.  */
+      struct edge_info *edge_info = (struct edge_info *) e->aux;
+      if (edge_info)
+	{
+	  tree lhs = edge_info->lhs;
+	  tree rhs = edge_info->rhs;
+
+	  if (lhs && TREE_CODE (lhs) == SSA_NAME)
+	    record_const_or_copy (lhs, rhs);
+	}
+
       indx = e->dest_idx;
   for ( ; !gsi_end_p (gsi); gsi_next (gsi))
{
@@ -1667,6 +1689,8 @@ cprop_into_successor_phis (basic_block bb)
 	      && may_propagate_copy (orig_val, new_val))
propagate_value (orig_p, new_val);
}
+
+  restore_vars_to_original_value ();
 }
 }
 
diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index 42474f1..47db280 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -841,28 +841,6 @@ thread_around_empty_blocks (edge taken_edge,
   return false;
 }
   
-/* 

Merge from gcc 4.8 branch to gccgo branch

2013-09-17 Thread Ian Lance Taylor
I merged the GCC 4.8 branch to the gccgo branch.

Ian


[C++ Patch] PR 58448

2013-09-17 Thread Paolo Carlini

Hi,

this ICE is caused by error_mark_node as TREE_TYPE of a TYPE_DECL, which 
leads to a crash at the beginning of the TYPE_DECL case of tsubst_decl.


I tried various approaches - for example, turning all error_operand_p (t) 
== true arguments passed to tsubst into error_mark_nodes also works - 
but I think I have a weak preference for the solution below, because 
it conceptually matches the section of grokdeclarator beginning with:


  /* If this is declaring a typedef name, return a TYPE_DECL.  */
  if (typedef_p && decl_context != TYPENAME)

which seems rather special in terms of producing such TYPE_DECLs in case 
of errors (it does that for error recovery reasons, I suppose: just 
returning error_mark_node leads to worse diagnostics for, e.g., 
parse/error32.C).


Tested x86_64-linux.

Thanks,
Paolo.

///
/cp
2013-09-17  Paolo Carlini  paolo.carl...@oracle.com

PR c++/58448
* pt.c (tsubst_decl, [TYPE_DECL]): Check TREE_TYPE (t) for
error_mark_node.

/testsuite
2013-09-17  Paolo Carlini  paolo.carl...@oracle.com

PR c++/58448
* g++.dg/template/crash117.C: New.
Index: cp/pt.c
===
--- cp/pt.c (revision 202668)
+++ cp/pt.c (working copy)
@@ -10741,19 +10741,23 @@ tsubst_decl (tree t, tree args, tsubst_flags_t com
tree type = NULL_TREE;
bool local_p;
 
-   if (TREE_CODE (t) == TYPE_DECL
-       && t == TYPE_MAIN_DECL (TREE_TYPE (t)))
+   if (TREE_CODE (t) == TYPE_DECL)
  {
-   /* If this is the canonical decl, we don't have to
-  mess with instantiations, and often we can't (for
-  typename, template type parms and such).  Note that
-  TYPE_NAME is not correct for the above test if
-  we've copied the type for a typedef.  */
-   type = tsubst (TREE_TYPE (t), args, complain, in_decl);
-   if (type == error_mark_node)
+   if (TREE_TYPE (t) == error_mark_node)
  RETURN (error_mark_node);
-   r = TYPE_NAME (type);
-   break;
+   else if (t == TYPE_MAIN_DECL (TREE_TYPE (t)))
+ {
+   /* If this is the canonical decl, we don't have to
+  mess with instantiations, and often we can't (for
+  typename, template type parms and such).  Note that
+  TYPE_NAME is not correct for the above test if
+  we've copied the type for a typedef.  */
+   type = tsubst (TREE_TYPE (t), args, complain, in_decl);
+   if (type == error_mark_node)
+ RETURN (error_mark_node);
+   r = TYPE_NAME (type);
+   break;
+ }
  }
 
/* Check to see if we already have the specialization we
Index: testsuite/g++.dg/template/crash117.C
===
--- testsuite/g++.dg/template/crash117.C(revision 0)
+++ testsuite/g++.dg/template/crash117.C(working copy)
@@ -0,0 +1,6 @@
+// PR c++/58448
+
+class SmallVector; struct Types4;
+template <typename, typename, typename, typename> struct Types {
+  typedef Types4::Constructable // { dg-error "template|typedef|expected" }
+} Types<SmallVector, SmallVector, SmallVector, SmallVector>::  // { dg-error "expected" }


[rl78] fix far address optimizations

2013-09-17 Thread DJ Delorie

Track both parts of far addresses so they don't get optimized away.
Committed.

* config/rl78/constraints.md: For each W* constraint, rename to C*
and create a W* constraint that checks for an optional ES: prefix
pattern also.
* config/rl78/rl78.md (UNS_ES_ADDR): New.
(es_addr): New.  Used to wrap far addresses.
* config/rl78/rl78-protos.h (rl78_es_addr): New.
(rl78_es_base): New.
* config/rl78/rl78.c (rl78_as_legitimate_address): Accept unspec
wrapped far addresses.
(rl78_print_operand_1): Unwrap far addresses before processing.
(rl78_lo16): Wrap far addresses in unspecs.
(rl78_es_addr): New.
(rl78_es_base): New.
(insn_ok_now): Check for not-yet-wrapped far addresses.
(transcode_memory_rtx): Properly re-wrap far addresses.

Index: config/rl78/constraints.md
===
--- config/rl78/constraints.md  (revision 202665)
+++ config/rl78/constraints.md  (working copy)
@@ -200,103 +200,155 @@
 (define_register_constraint Zint INT_REGS
  The interrupt registers.)
 
 ; All the memory addressing schemes the RL78 supports
 ; of the form W {register} {bytes of offset}
 ;  or W {register} {register}
+; Additionally, the Cxx forms are the same as the Wxx forms, but without
+; the ES: override.
 
 ; absolute address
-(define_memory_constraint Wab
+(define_memory_constraint Cab
   [addr]
   (and (match_code mem)
(ior (match_test CONSTANT_P (XEXP (op, 0)))
(match_test GET_CODE (XEXP (op, 0)) == PLUS  GET_CODE (XEXP 
(XEXP (op, 0), 0)) == SYMBOL_REF))
)
   )
+(define_memory_constraint Wab
+  es:[addr]
+  (match_test rl78_es_addr (op)  satisfies_constraint_Cab (rl78_es_base 
(op))
+   || satisfies_constraint_Cab (op))
+  )
 
-(define_memory_constraint Wbc
+(define_memory_constraint Cbc
   word16[BC]
   (and (match_code mem)
(ior
(and (match_code reg 0)
 (match_test REGNO (XEXP (op, 0)) == BC_REG))
(and (match_code plus 0)
 (and (and (match_code reg 00)
   (match_test REGNO (XEXP (XEXP (op, 0), 0)) == BC_REG))
   (match_test uword_operand (XEXP (XEXP (op, 0), 1), 
VOIDmode)"))))
        )
   )
+(define_memory_constraint "Wbc"
+  "es:word16[BC]"
+  (match_test "rl78_es_addr (op) && satisfies_constraint_Cbc (rl78_es_base (op))
+   || satisfies_constraint_Cbc (op)")
+  )
 
-(define_memory_constraint "Wde"
+(define_memory_constraint "Cde"
   "[DE]"
   (and (match_code "mem")
        (and (match_code "reg" "0")
             (match_test "REGNO (XEXP (op, 0)) == DE_REG")))
   )
+(define_memory_constraint "Wde"
+  "es:[DE]"
+  (match_test "rl78_es_addr (op) && satisfies_constraint_Cde (rl78_es_base (op))
+   || satisfies_constraint_Cde (op)")
+  )
 
-(define_memory_constraint "Wca"
+(define_memory_constraint "Cca"
   "[AX..HL] for calls"
   (and (match_code "mem")
        (and (match_code "reg" "0")
             (match_test "REGNO (XEXP (op, 0)) <= HL_REG")))
   )
+(define_memory_constraint "Wca"
+  "es:[AX..HL] for calls"
+  (match_test "rl78_es_addr (op) && satisfies_constraint_Cca (rl78_es_base (op))
+   || satisfies_constraint_Cca (op)")
+  )
 
-(define_memory_constraint "Wcv"
+(define_memory_constraint "Ccv"
   "[AX..HL,r8-r23] for calls"
   (and (match_code "mem")
        (and (match_code "reg" "0")
             (match_test "REGNO (XEXP (op, 0)) < 24")))
   )
+(define_memory_constraint "Wcv"
+  "es:[AX..HL,r8-r23] for calls"
+  (match_test "rl78_es_addr (op) && satisfies_constraint_Ccv (rl78_es_base (op))
+   || satisfies_constraint_Ccv (op)")
+  )
 
-(define_memory_constraint "Wd2"
+(define_memory_constraint "Cd2"
   "word16[DE]"
   (and (match_code "mem")
        (ior
         (and (match_code "reg" "0")
              (match_test "REGNO (XEXP (op, 0)) == DE_REG"))
         (and (match_code "plus" "0")
              (and (and (match_code "reg" "00")
                        (match_test "REGNO (XEXP (XEXP (op, 0), 0)) == DE_REG"))
                        (match_test "uword_operand (XEXP (XEXP (op, 0), 1), VOIDmode)"))))
        )
   )
+(define_memory_constraint "Wd2"
+  "es:word16[DE]"
+  (match_test "rl78_es_addr (op) && satisfies_constraint_Cd2 (rl78_es_base (op))
+   || satisfies_constraint_Cd2 (op)")
+  )
 
-(define_memory_constraint "Whl"
+(define_memory_constraint "Chl"
   "[HL]"
   (and (match_code "mem")
        (and (match_code "reg" "0")
             (match_test "REGNO (XEXP (op, 0)) == HL_REG")))
   )
+(define_memory_constraint "Whl"
+  "es:[HL]"
+  (match_test "rl78_es_addr (op) && satisfies_constraint_Chl (rl78_es_base (op))
+   || satisfies_constraint_Chl (op)")
+  )
 
-(define_memory_constraint "Wh1"
+(define_memory_constraint "Ch1"
   "byte8[HL]"
   (and (match_code "mem")
        (and (match_code "plus" "0")
             (and (and (match_code "reg" "00")
                       (match_test "REGNO (XEXP (XEXP (op, 0), 0)) == HL_REG"))
                       (match_test "ubyte_operand 

[GOOGLE] AutoFDO should honor system paths in the profile

2013-09-17 Thread Dehao Chen
This patch makes AutoFDO honor system paths stored in the profile.

Bootstrapped and passed regression tests.

OK for google-4_8 branch?

Thanks,
Dehao

Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c (revision 202672)
+++ gcc/auto-profile.c (working copy)
@@ -616,11 +616,11 @@ bool autofdo_module_profile::read ()
 {
   char *name = xstrdup (gcov_read_string ());
   unsigned total_num = 0;
-  unsigned num_array[6];
+  unsigned num_array[7];
   unsigned exported = gcov_read_unsigned ();
   unsigned lang = gcov_read_unsigned ();
   unsigned ggc_memory = gcov_read_unsigned ();
-  for (unsigned j = 0; j < 6; j++)
+  for (unsigned j = 0; j < 7; j++)
  {
   num_array[j] = gcov_read_unsigned ();
   total_num += num_array[j];
@@ -638,9 +638,10 @@ bool autofdo_module_profile::read ()
   module->ggc_memory = ggc_memory;
   module->num_quote_paths = num_array[1];
   module->num_bracket_paths = num_array[2];
-  module->num_cpp_defines = num_array[3];
-  module->num_cpp_includes = num_array[4];
-  module->num_cl_args = num_array[5];
+  module->num_system_paths = num_array[3];
+  module->num_cpp_defines = num_array[4];
+  module->num_cpp_includes = num_array[5];
+  module->num_cl_args = num_array[6];
   module->source_filename = name;
   module->is_primary = strcmp (name, in_fnames[0]) == 0;
   module->flags = module->is_primary ? exported : 1;
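
The hunk above inserts one counter in the middle of the array, so every later index shifts by one. A minimal sketch of the same read loop with named slots follows, which makes that kind of shift harder to get wrong. The enum names, `struct reader`, and `read_unsigned` are hypothetical stand-ins (the latter for `gcov_read_unsigned ()`), and slot 0 is left unnamed because its meaning is not visible in the hunk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the seven counters read by the patched loop.
   Slot 0 is not visible in the hunk, so it is left unnamed here.  */
enum count_slot
{
  SLOT_UNSHOWN = 0,
  SLOT_QUOTE_PATHS,
  SLOT_BRACKET_PATHS,
  SLOT_SYSTEM_PATHS,   /* the counter added by this patch */
  SLOT_CPP_DEFINES,
  SLOT_CPP_INCLUDES,
  SLOT_CL_ARGS,
  SLOT_COUNT
};

/* Stand-in for a stream of gcov_read_unsigned () results.  */
struct reader
{
  const unsigned *data;
  size_t pos;
  size_t len;
};

static unsigned
read_unsigned (struct reader *r)
{
  assert (r->pos < r->len);
  return r->data[r->pos++];
}

/* Read SLOT_COUNT counters into NUM_ARRAY and return their sum.  */
static unsigned
read_counts (struct reader *r, unsigned num_array[SLOT_COUNT])
{
  unsigned total_num = 0;
  for (unsigned j = 0; j < SLOT_COUNT; j++)
    {
      num_array[j] = read_unsigned (r);
      total_num += num_array[j];
    }
  return total_num;
}
```

With this layout a consumer would write `num_array[SLOT_CPP_DEFINES]` rather than `num_array[4]`, so adding another slot later only touches the enum.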


Re: [GOOGLE] AutoFDO should honor system paths in the profile

2013-09-17 Thread Xinliang David Li
ok.

David



[PATCH], PR target/58452, Fix gcc 4.8/trunk linuxpaired breakage

2013-09-17 Thread Michael Meissner
While doing some work on power8, I wanted to make sure that for existing
systems, I was generating the same code.  So I built some code and ran it
through various -mcpu= options.  When I built a powerpc-linuxpaired
compiler, the compiler has trouble with a simple loop that should be
vectorized.  I traced the code to changes in the vectorizer that required the
predicates for movmismalign* to accept memory operands.

In the main part of the powerpc compiler, we made this change in April, 2011,
but we missed the paired floating point support, since you need to use special
configuration options to enable paired floating point support.

2011-04-01  Andrew Pinski  <pins...@gmail.com>
Michael Meissner  <meiss...@linux.vnet.ibm.com>

PR target/48262
* config/rs6000/vector.md (movmisalign<mode>): Allow for memory
operands, as per the specifications.

* config/rs6000/altivec.md (vec_extract_evenv4si): Correct modes.
(vec_extract_evenv4sf): Ditto.
(vec_extract_evenv8hi): Ditto.
(vec_extract_evenv16qi): Ditto.
(vec_extract_oddv4si): Ditto.

I will do the usual bootstrap/make check tomorrow.  Assuming it has no
regressions, can I check this patch in to both the 4.8 branch and trunk?

2013-09-17  Michael Meissner  <meiss...@linux.vnet.ibm.com>

PR target/58452
* config/rs6000/paired.md (movmisalignv2sf): Fix to allow memory
operands.

Index: gcc/config/rs6000/paired.md
===
--- gcc/config/rs6000/paired.md (revision 202632)
+++ gcc/config/rs6000/paired.md (working copy)
@@ -462,8 +462,8 @@ (define_expand "reduc_splus_v2sf"
 })
 
 (define_expand "movmisalignv2sf"
-  [(set (match_operand:V2SF 0 "gpc_reg_operand" "=f")
-        (match_operand:V2SF 1 "gpc_reg_operand" "f"))]
+  [(set (match_operand:V2SF 0 "nonimmediate_operand" "")
+        (match_operand:V2SF 1 "any_operand" ""))]
   "TARGET_PAIRED_FLOAT"
 {
   paired_expand_vector_move (operands);


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: Using gen_int_mode instead of GEN_INT minor testsuite fallout on MIPS

2013-09-17 Thread Mike Stump
On Sep 17, 2013, at 10:24 AM, Mike Stump mikest...@comcast.net wrote:
 On Sep 16, 2013, at 8:41 PM, DJ Delorie d...@redhat.com wrote:
 m32c's PSImode is 24-bits, why does it have 32 in the macro?
 
 /* 24-bit pointers, in 32-bit units */
 -PARTIAL_INT_MODE (SI);
 +PARTIAL_INT_MODE_NAME (SI, 32, PSI);
 
 Sorry, fingers copied the wrong number.  Thanks for the catch.
 
 partial-1.diffs.txt

p7 bootstrap test complete:

New tests that PASS:

gcc.dg/simulate-thread/atomic-other-short.c  -O3 -g  thread simulation test

It seems someone doesn't flush or wait; I don't think my patch actually fixed
this.

Re: [go-nuts] Solaris gccgo http.Get error?

2013-09-17 Thread Ian Lance Taylor
On Tue, Sep 17, 2013 at 12:28 PM,  ernie.hers...@10gen.com wrote:
 If you don't mind explaining, can you tell me why you didn't apply the
 change to the 4.7 branch?

I'm not maintaining Go on the 4.7 branch.  I don't object to somebody
else doing it, I'm just not doing it myself.  My time is limited and I
have to draw the line somewhere.

Ian

 On Friday, August 9, 2013 4:53:30 PM UTC-4, Ian Lance Taylor wrote:

 On Thu, Aug 8, 2013 at 11:22 PM, Jakob Borg ja...@nym.se wrote:
 
  But, adding a
 
   hints.ai_socktype = SOCK_STREAM;
 
  gives me
 
  jb@zlogin2:~ $ ./test
  canonical name: www.google.com
  26 2 6
  26 2 6
  26 2 6
  26 2 6
  26 2 6
  26 2 6
 
  It seems we might need a tweak to support Solaris... :/

 Looks like it.  I committed a patch to the master repository.  This
 patch copies it over to gccgo.  Bootstrapped and ran Go testsuite on
 x86_64-unknown-linux-gnu.  Committed to mainline and 4.8 branch.

 Note that I have not made the change on the 4.7 branch which is what
 you are using.  The same patch should work for the 4.7 sources,
 though, if you want to copy it over.

 Ian
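
For reference, the tweak Jakob describes amounts to filling in `ai_socktype` before calling `getaddrinfo`. A minimal sketch under that assumption is below; the helper name `count_stream_addrs` and its `extra_flags` parameter are made up for illustration and are not part of the gccgo patch.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Resolve HOST with the socket type pinned to SOCK_STREAM and return
   the number of results, or -1 on failure.  EXTRA_FLAGS is OR-ed into
   ai_flags (pass 0 for an ordinary lookup).  */
int
count_stream_addrs (const char *host, int extra_flags)
{
  struct addrinfo hints, *res, *p;
  int n = 0;

  memset (&hints, 0, sizeof hints);
  hints.ai_family = AF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;   /* the line added in the test program */
  hints.ai_flags = extra_flags;

  if (getaddrinfo (host, NULL, &hints, &res) != 0)
    return -1;

  for (p = res; p != NULL; p = p->ai_next)
    {
      /* Every entry should now carry the requested socket type.  */
      assert (p->ai_socktype == SOCK_STREAM);
      n++;
    }
  freeaddrinfo (res);
  return n;
}
```

Passing `AI_NUMERICHOST` with a literal such as `"127.0.0.1"` exercises the same path without a DNS lookup, which is convenient for a quick check on a new platform.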



[rl78] add bit test/branch insns

2013-09-17 Thread DJ Delorie

A few new patterns.  Committed.


2013-09-17  Nick Clifton  ni...@redhat.com

* config/rl78/rl78-real.md (bf): New pattern.
(bt): New pattern.
* config/rl78/rl78.c (rl78_print_operand_1): Handle %B.
(rl78_print_operand): Do not put a # before a %B.
* config/rl78/rl78.opt: Tweak doc strings.

Index: config/rl78/rl78-real.md
===
--- config/rl78/rl78-real.md(revision 202675)
+++ config/rl78/rl78-real.md(working copy)
@@ -456,6 +456,61 @@
    (set (reg:HI AX_REG)
         (match_dup 0))]
   ""
   [(set (match_dup 0) (reg:HI AX_REG))]
   )
 
+;; Bit test and branch insns.
+
+;; NOTE: These patterns will work for bits in other places, not just A.
+
+(define_insn "bf"
+  [(set (pc)
+        (if_then_else (eq (and (reg:QI A_REG)
+                               (match_operand 0 "immediate_operand" "n"))
+                          (const_int 0))
+                      (label_ref (match_operand 1 "" ""))
+                      (pc)))]
+  ""
+  "bf\tA.%B0, $%1"
+)
+
+(define_insn "bt"
+  [(set (pc)
+        (if_then_else (ne (and (reg:QI A_REG)
+                               (match_operand 0 "immediate_operand" "n"))
+                          (const_int 0))
+                      (label_ref (match_operand 1 "" ""))
+                      (pc)))]
+  ""
+  "bt\tA.%B0, $%1"
+)
+
+;; NOTE: These peepholes are fragile.  They rely upon GCC generating
+;; a specific sequence on insns, based upon examination of test code.
+;; Improvements to GCC or using code other than the test code can result
+;; in the peephole not matching and the optimization being missed.
+
+(define_peephole2
+  [(set (match_operand:QI 1 "register_operand") (reg:QI A_REG))
+   (set (match_dup 1) (and:QI (match_dup 1) (match_operand 2 "immediate_operand")))
+   (set (pc) (if_then_else (eq (match_dup 1) (const_int 0))
+                           (label_ref (match_operand 3 ""))
+                           (pc)))]
+  "peep2_regno_dead_p (3, REGNO (operands[1]))
+   && exact_log2 (INTVAL (operands[2])) >= 0"
+  [(set (pc) (if_then_else (eq (and (reg:QI A_REG) (match_dup 2)) (const_int 0))
+                           (label_ref (match_dup 3)) (pc)))]
+  )
+
+(define_peephole2
+  [(set (match_operand:QI 1 "register_operand") (reg:QI A_REG))
+   (set (match_dup 1) (and:QI (match_dup 1) (match_operand 2 "immediate_operand")))
+   (set (pc) (if_then_else (ne (match_dup 1) (const_int 0))
+                           (label_ref (match_operand 3 ""))
+                           (pc)))]
+  "peep2_regno_dead_p (3, REGNO (operands[1]))
+   && exact_log2 (INTVAL (operands[2])) >= 0"
+  [(set (pc) (if_then_else (ne (and (reg:QI A_REG) (match_dup 2)) (const_int 0))
+                           (label_ref (match_dup 3)) (pc)))]
+  )
+
+
Index: config/rl78/rl78.c
===
--- config/rl78/rl78.c  (revision 202675)
+++ config/rl78/rl78.c  (working copy)
@@ -1283,12 +1283,13 @@ rl78_function_arg_boundary (enum machine
m - minus - negative of CONST_INT value.
c - inverse of a conditional (NE vs EQ for example)
z - collapsed conditional
s - shift count mod 8
S - shift count mod 16
r - reverse shift count (8-(count mod 8))
+   B - bit position
 
h - bottom HI of an SI
H - top HI of an SI
q - bottom QI of an HI
Q - top QI of an HI
e - third QI of an SI (i.e. where the ES register gets values from)
@@ -1409,12 +1410,14 @@ rl78_print_operand_1 (FILE * file, rtx o
   else if (letter == 'q')
     fprintf (file, "%ld", INTVAL (op) & 0xff);
   else if (letter == 'h')
     fprintf (file, "%ld", INTVAL (op) & 0xffff);
   else if (letter == 'e')
     fprintf (file, "%ld", (INTVAL (op) >> 16) & 0xff);
+  else if (letter == 'B')
+    fprintf (file, "%d", exact_log2 (INTVAL (op)));
   else if (letter == 'E')
     fprintf (file, "%ld", (INTVAL (op) >> 24) & 0xff);
   else if (letter == 'm')
     fprintf (file, "%ld", - INTVAL (op));
   else if (letter == 's')
     fprintf (file, "%ld", INTVAL (op) % 8);
@@ -1602,13 +1605,13 @@ rl78_print_operand_1 (FILE * file, rtx o
 #undef  TARGET_PRINT_OPERAND
 #define TARGET_PRINT_OPERAND   rl78_print_operand
 
 static void
 rl78_print_operand (FILE * file, rtx op, int letter)
 {
-  if (CONSTANT_P (op) && letter != 'u' && letter != 's' && letter != 'r' && letter != 'S')
+  if (CONSTANT_P (op) && letter != 'u' && letter != 's' && letter != 'r' && letter != 'S' && letter != 'B')
     fprintf (file, "#");
   rl78_print_operand_1 (file, op, letter);
 }
 
 #undef  TARGET_TRAMPOLINE_INIT
 #define TARGET_TRAMPOLINE_INIT rl78_trampoline_init
Index: config/rl78/rl78.opt
===
--- config/rl78/rl78.opt(revision 202675)
+++ config/rl78/rl78.opt(working copy)
@@ -20,13 +20,13 @@
 ;---
 
 HeaderInclude
 config/rl78/rl78-opts.h
 
 msim
-Target
+Target Report
 Use the simulator 
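
Both the new `%B` operand letter and the peephole condition in this patch lean on `exact_log2`, which yields the bit index of a single-bit mask and -1 for anything else. The following is a small stand-alone sketch of that behaviour; `bit_index_or_minus_one` and `single_bit_mask_p` are illustrative names, not GCC functions.

```c
#include <assert.h>

/* Stand-in for GCC's exact_log2: the bit index if X has exactly one
   bit set, otherwise -1.  */
static int
bit_index_or_minus_one (unsigned long x)
{
  int n = 0;
  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  while ((x >>= 1) != 0)
    n++;
  return n;
}

/* Mirrors the peephole test "exact_log2 (INTVAL (operands[2])) >= 0":
   nonzero only when MASK selects exactly one bit.  */
static int
single_bit_mask_p (unsigned long mask)
{
  return bit_index_or_minus_one (mask) >= 0;
}
```

So an AND with 0x08 followed by a branch on zero can become a single test of bit 3 (printed as `A.3` by `%B`), while an AND with 0x06 is left alone because no single bit describes it.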

RE: [PATCH GCC]Catch more MEM_REFs sharing common addressing part in gimple strength reduction

2013-09-17 Thread bin.cheng


 -Original Message-
 From: Dominique Dhumieres [mailto:domi...@lps.ens.fr]
 Sent: Wednesday, September 18, 2013 1:47 AM
 To: gcc-patches@gcc.gnu.org
 Cc: hjl.to...@gmail.com; Bin Cheng
 Subject: Re: [PATCH GCC]Catch more MEM_REFs sharing common
 addressing part in gimple strength reduction
 
 The new test gcc.dg/tree-ssa/slsr-39.c fails in 64 bit mode (see
 http://gcc.gnu.org/ml/gcc-regression/2013-09/msg00455.html ).
 Looking for MEM in the dump returns
 
   _12 = MEM[(int[50] *)_17];
   MEM[(int[50] *)_20] = _13;
 

Thanks for reporting, I think this can be fixed by patch:
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00761.html

Thanks.
bin





Re: [ping][PATCH][1 of 2] Add value range info to SSA_NAME for zero sign extension elimination in RTL

2013-09-17 Thread Kugan


Thanks Richard for the review.
On 16/09/13 23:43, Richard Biener wrote:

On Mon, 16 Sep 2013, Kugan wrote:


Hi,

Updated the patch to the latest changes in trunk that splits tree.h. I also
noticed an error in printing double_int and fixed it.

Is this OK?


print_gimple_stmt (dump_file, stmt, 0,
-TDF_SLIM | (dump_flags & TDF_LINENO));
+TDF_SLIM | TDF_RANGE | (dump_flags & TDF_LINENO));

this should be (dump_flags & (TDF_LINENO|TDF_RANGE)); do not always
dump range info.  I'd have simply re-used TDF_ALIAS (and interpret
it as SSA annotation info), adding -range in dump file modifiers
is ok with me.
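
The difference between the posted line and the suggested one is whether the range flag is forced on or merely passed through. A small sketch with made-up bit values (GCC's real `TDF_*` constants differ, hence the `X_` prefix) shows the two behaviours side by side:

```c
#include <assert.h>

/* Hypothetical bit values; GCC's real TDF_* constants differ.  */
enum
{
  X_TDF_SLIM   = 1 << 0,
  X_TDF_LINENO = 1 << 1,
  X_TDF_RANGE  = 1 << 2
};

/* Patch as posted: range info is forced on for every dump.  */
static int
flags_as_posted (int dump_flags)
{
  return X_TDF_SLIM | X_TDF_RANGE | (dump_flags & X_TDF_LINENO);
}

/* Suggested form: pass the range flag through only if it was requested.  */
static int
flags_as_suggested (int dump_flags)
{
  return X_TDF_SLIM | (dump_flags & (X_TDF_LINENO | X_TDF_RANGE));
}
```

With `dump_flags == 0` the posted form still sets the range bit, whereas the suggested form leaves it clear unless the user asked for it.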

+static void
+print_double_int (pretty_printer *buffer, double_int cst)
+{
+  tree node = double_int_to_tree (integer_type_node, cst);
+  if (TREE_INT_CST_HIGH (node) == 0)
+    pp_printf (buffer, HOST_WIDE_INT_PRINT_UNSIGNED, TREE_INT_CST_LOW (node));
+  else if (TREE_INT_CST_HIGH (node) == -1
+           && TREE_INT_CST_LOW (node) != 0)
+    pp_printf (buffer, "-" HOST_WIDE_INT_PRINT_UNSIGNED,
+               -TREE_INT_CST_LOW (node));
+  else
+    sprintf (pp_buffer (buffer)->digit_buffer,
+             HOST_WIDE_INT_PRINT_DOUBLE_HEX,
+             (unsigned HOST_WIDE_INT) TREE_INT_CST_HIGH (node),
+             (unsigned HOST_WIDE_INT) TREE_INT_CST_LOW (node));

using sprintf here looks like a layering violation to me.  You
probably want to factor out code from the INTEGER_CST handling
of tree-pretty-print.c:dump_generic_node into a pp_double_int
function in pretty-print.[ch] instead.

@@ -1628,6 +1647,27 @@ dump_gimple_phi (pretty_printer *buffer, gimple phi, int spc, int flags)
       pp_string (buffer, "# ");
     }

+  if ((flags & TDF_RANGE)
+      && !POINTER_TYPE_P (TREE_TYPE (lhs))
+      && SSA_NAME_RANGE_INFO (lhs))
+    {
+      double_int min, max;
+      value_range_type range_type;

I realize the scheme is pre-existing but can you try factoring
out the dumping of SSA_NAME_PTR_INFO / SSA_NAME_RANGE_INFO into
a separate routine that can be shared by dump_gimple_phi and
pp_gimple_stmt_1?

+get_range_info (tree name, double_int &min, double_int &max,
+                enum value_range_type &range_type)
+{
+  gcc_assert (!POINTER_TYPE_P (TREE_TYPE (name)));
+  gcc_assert (TREE_CODE (name) == SSA_NAME);
+  range_info_def *ri = SSA_NAME_RANGE_INFO (name);

the TREE_CODE (name) == SSA_NAME assert is redundant with the
tree-checking performed by SSA_NAME_RANGE_INFO.  Likewise in
the other functions.

+void
+get_range_info (tree name, double_int &min, double_int &max,
+                enum value_range_type &range_type)

I'm not sure we want to use references.  Well - first time.

+  /* If min > max, it is VR_ANTI_RANGE.  */
+  if (ri->min.scmp (ri->max) == 1)
+    {

I think that's wrong and needs to be conditional on TYPE_UNSIGNED
of the SSA name.

+  else if (vr_value[i]->type == VR_ANTI_RANGE)
+    {
+      /* VR_ANTI_RANGE ~[min, max] is encoded compactly as
+         [max + 1, min - 1] without additional attributes.
+         When min value > max value, we know that it is
+         VR_ANTI_RANGE; it is VR_RANGE otherwise.  */
+      set_range_info (name,
+                      tree_to_double_int (vr_value[i]->max)
+                      + double_int_one,
+                      tree_to_double_int (vr_value[i]->min)
+                      - double_int_one);

there is a complication for when max + 1 or min - 1 overflow - those
should be non-canonical ranges I think, but double-check this
(check set_and_canonicalize_value_range).

I have now added a check for min == 0 for unsigned types. AFAIK, for the
double_int type, this is the only case we should check.
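
To make the encoding under discussion concrete, here is a minimal sketch using plain `unsigned` instead of `double_int`. All names are hypothetical. It refuses both wrap-around cases (`lo == 0` and `hi == UINT_MAX`); only the `min == 0` case is mentioned above, so the second check is an extra assumption of this sketch. The comparison is deliberately unsigned, in line with the review comment that the min/max test has to follow the signedness of the type.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* A stored pair; min > max marks an anti-range.  */
struct stored_range
{
  unsigned min;
  unsigned max;
};

/* Encode ~[lo, hi] as [hi + 1, lo - 1].  Returns false when either
   bound would wrap, in which case nothing is written to OUT.  */
static bool
encode_anti_range (unsigned lo, unsigned hi, struct stored_range *out)
{
  if (lo == 0 || hi == UINT_MAX)
    return false;
  out->min = hi + 1;
  out->max = lo - 1;
  return true;
}

/* A stored pair with min > max (unsigned comparison) is an anti-range.  */
static bool
stored_is_anti_range (const struct stored_range *r)
{
  return r->min > r->max;
}

/* Recover the excluded interval [*lo, *hi] from an anti-range pair.  */
static void
decode_anti_range (const struct stored_range *r, unsigned *lo, unsigned *hi)
{
  assert (stored_is_anti_range (r));
  *lo = r->max + 1;
  *hi = r->min - 1;
}
```

For example, ~[10, 20] is stored as [21, 9]; since 21 > 9 the pair is recognised as an anti-range and decodes back to 10 and 20.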


I have also made the other changes you have asked me to do. Please find 
the modified patch and ChangeLog.


Bootstrapped and regtested for x86_64-unknown-linux-gnu.  Is this OK?

Thanks,
Kugan


+2013-09-17  Kugan Vivekanandarajah  <kug...@linaro.org>
+
+   * gimple-pretty-print.c (dump_ssaname_info) : New function.
+   * gimple-pretty-print.c (dump_gimple_phi) : Dump range info.
+   * (pp_gimple_stmt_1) : Likewise.
+   * tree-pretty-print.c (dump_intger_cst_node) : New function.
+   * (dump_generic_node) : Call dump_intger_cst_node for INTEGER_CST.
+   * tree-ssa-alias.c (dump_alias_info) : Check pointer type.
+   * tree-ssa-copy.c (fini_copy_prop) : Check pointer type and copy
+   range info.
+   * tree-ssanames.c (make_ssa_name_fn) : Check pointer type in
+   initialize.
+   * (set_range_info) : New function.
+   * (get_range_info) : Likewise.
+   * (duplicate_ssa_name_range_info) : Likewise.
+   * (duplicate_ssa_name_fn) : Check pointer type and call correct
+   duplicate function.
+   * tree-vrp.c (vrp_finalize): Call set_range_info to update
+   value range of SSA_NAMEs.
+   * tree.h (SSA_NAME_PTR_INFO) : changed to access via union
+   * tree.h