date:20141125

Re: [Patch, libstdc++/63920] Fix regex_constants::match_not_null behavior

2014-11-25 Thread Tim Shen

On Mon, Nov 24, 2014 at 3:17 AM, Jonathan Wakely jwak...@redhat.com wrote:
 OK for trunk - thanks.

Committed. :)

Thanks!


-- 
Regards,
Tim Shen

Re: [PATCH] Fix find_base_term in 32-bit -fpic code (PR lto/64025)

2014-11-25 Thread Uros Bizjak

On Tue, Nov 25, 2014 at 8:40 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Tue, Nov 25, 2014 at 12:25 AM, Jakub Jelinek ja...@redhat.com wrote:

 The fallback delegitimization I've added as last option mainly for
 debug info purposes, when we don't know if the base is a PIC register
 or say a PIC register plus some addend, unfortunately in some tests
 broke find_base_term, which for PLUS looks only at the first operand
 and, if recursion on it finds a base term, returns it immediately.
 So, it found base term of _GLOBAL_OFFSET_TABLE_, when the right base
 term is actually in the second operand.

 This patch fixes it by swapping the operands, debug info doesn't care about
 the order, it won't match in any instruction anyway, but helps alias.c.

 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

 2014-11-24  Jakub Jelinek  ja...@redhat.com

 PR lto/64025
 * config/i386/i386.c (ix86_delegitimize_address): Ensure result
 comes before (addend - _GLOBAL_OFFSET_TABLE_) term.

 Can you also swap operands of (%ecx - %ebx) + foo? There is no point
 digging into RTX involving registers only when we know that we are
 looking for foo. This will also be consistent with the code you
 patched below.

Something like attached prototype patch.

Uros.
Index: i386.c
===
--- i386.c  (revision 218037)
+++ i386.c  (working copy)
@@ -14847,19 +14847,20 @@ ix86_delegitimize_address (rtx x)
 leal (%ebx, %ecx, 4), %ecx
 ...
 movl foo@GOTOFF(%ecx), %edx
-in which case we return (%ecx - %ebx) + foo
-or (%ecx - _GLOBAL_OFFSET_TABLE_) + foo if pseudo_pic_reg
+in which case we return foo + (%ecx - %ebx)
+or foo + (%ecx - _GLOBAL_OFFSET_TABLE_) if pseudo_pic_reg
 and reload has completed.  */
   if (pic_offset_table_rtx
       && (!reload_completed || !ix86_use_pseudo_pic_reg ()))
-result = gen_rtx_PLUS (Pmode, gen_rtx_MINUS (Pmode, copy_rtx (addend),
-pic_offset_table_rtx),
-  result);
+result = gen_rtx_PLUS (Pmode, result,
+  gen_rtx_MINUS (Pmode, copy_rtx (addend),
+ pic_offset_table_rtx));
   else if (pic_offset_table_rtx && !TARGET_MACHO && !TARGET_VXWORKS_RTP)
{
  rtx tmp = gen_rtx_SYMBOL_REF (Pmode, GOT_SYMBOL_NAME);
- tmp = gen_rtx_MINUS (Pmode, copy_rtx (addend), tmp);
- result = gen_rtx_PLUS (Pmode, tmp, result);
+ result = gen_rtx_PLUS (Pmode, result,
+gen_rtx_MINUS (Pmode, copy_rtx (addend),
+   tmp));
}
   else
return orig_x;

Re: [PATCH RFC]Pair load store instructions using a generic scheduling fusion pass

2014-11-25 Thread Bin.Cheng

On Mon, Nov 24, 2014 at 10:28 PM, James Greenhalgh
james.greenha...@arm.com wrote:
 On Fri, Nov 14, 2014 at 02:43:12AM +, Bin.Cheng wrote:
 On Fri, Nov 7, 2014 at 7:13 AM, Jeff Law l...@redhat.com wrote:
  On 11/05/14 02:30, Bin.Cheng wrote:
  Thanks very much for reviewing.  I refined the patch according to your
  comments.  Also made two small changes: a)  skip breaking dependency
  between memory access and the corresponding base-reg modifying
  instruction.  This feature doesn't help load/store pair that much and
  only increases compilation time.  b) a minor bug fix in arm backend
  hook when calculating priority for memory accesses with minus offset.
 
  I am running bootstrap/test against latest trunk, and will adapt
  ChangeLog once get approved generally.  So how about this one?
 
  OK for the trunk.  Thanks for your patience.
 
  Jeff
 

 Thanks for reviewing.  For the record, attached patch is committed.
 The only update is I disabled the pass if peephole2 isn't in effect
 because it relies on peephole2 to do real fusion work.

 Hi Bin,

 The documentation for TARGET_SCHED_FUSION_PRIORITY doesn't look
 right to me (see: https://gcc.gnu.org/onlinedocs/gccint/Scheduling.html ).

 I think you'll need to wrap your examples in something like @smallexample
 tags if you want to maintain their formatting.

Hi James,
Thanks very much for reporting this, will fix it.

Thanks,
bin

Re: [PATCH AARCH64]load store pair optimization using sched_fusion pass.

2014-11-25 Thread Bin.Cheng

Ping.  Anybody have a look?

Thanks,
bin

On Tue, Nov 18, 2014 at 4:34 PM, Bin Cheng bin.ch...@arm.com wrote:
 Hi,
 This is the patch implementing ldp/stp optimization for aarch64.  It
 consists of two parts.  The first one is peephole part, which further
 includes ldp/stp patterns (both peephole patterns and the insn match
 patterns) and auxiliary functions (both checking the validity and merging).
 The second part implements the aarch64 backend hook for sched-fusion pass,
 which calculates appropriate priorities for different kinds of load/store
 instructions.  With these priorities, sched-fusion pass can schedule as many
 load/store instructions together as possible, thus the coming peephole2 pass
 can merge them.

 I collected data for miscellaneous benchmarks.  Some cases are improved;
 most of the remaining cases do not regress; only a couple of them regress
 slightly, by 2-3%.  After looking into the regressions I can confirm that
 the code transformation is generally good, with many loads/stores paired.
 These regressions are most probably false alarms caused by other issues.

 The conclusion is that this patch can pair lots of consecutive load/store
 instructions into ldp/stp.  This is supported by the code size
 improvement of the benchmarks.  E.g., in general it cuts the text size of
 spec2k6 binaries (O3 level, not statically linked in my build) by 1.68%.

 Bootstrap and test on aarch64.  Is it OK?

 2014-11-18  Bin Cheng  bin.ch...@arm.com

 * config/aarch64/aarch64.md (load_pair<mode>): Split to
 load_pairsi, load_pairdi, load_pairsf and load_pairdf.
 (load_pairsi, load_pairdi, load_pairsf, load_pairdf): Split
 from load_pair<mode>.  New alternative to support int/fp
 registers in fp/int mode patterns.
 (store_pair<mode>): Split to store_pairsi, store_pairdi,
 store_pairsf and store_pairdf.
 (store_pairsi, store_pairdi, store_pairsf, store_pairdf): Split
 from store_pair<mode>.  New alternative to support int/fp
 registers in fp/int mode patterns.
 (*load_pair_extendsidi2_aarch64): New pattern.
 (*load_pair_zero_extendsidi2_aarch64): New pattern.
 (aarch64-ldpstp.md): Include.
 * config/aarch64/aarch64-ldpstp.md: New file.
 * config/aarch64/aarch64-protos.h (aarch64_gen_adjusted_ldpstp):
 New.
 (extract_base_offset_in_addr): New.
 (aarch64_operands_ok_for_ldpstp): New.
 (aarch64_operands_adjust_ok_for_ldpstp): New.
 * config/aarch64/aarch64.c (enum sched_fusion_type): New enum.
 (TARGET_SCHED_FUSION_PRIORITY): New hook.
 (fusion_load_store): New function.
 (extract_base_offset_in_addr): New function.
 (aarch64_gen_adjusted_ldpstp): New function.
 (aarch64_sched_fusion_priority): New function.
 (aarch64_operands_ok_for_ldpstp): New function.
 (aarch64_operands_adjust_ok_for_ldpstp): New function.

 2014-11-18  Bin Cheng  bin.ch...@arm.com

 * gcc.target/aarch64/ldp-stp-1.c: New test.
 * gcc.target/aarch64/ldp-stp-2.c: New test.
 * gcc.target/aarch64/ldp-stp-3.c: New test.
 * gcc.target/aarch64/ldp-stp-4.c: New test.
 * gcc.target/aarch64/ldp-stp-5.c: New test.
 * gcc.target/aarch64/lr_free_1.c: Disable scheduling fusion
 and peephole2 pass.

Re: [PATCH] Fix find_base_term in 32-bit -fpic code (PR lto/64025)

2014-11-25 Thread Jakub Jelinek

On Tue, Nov 25, 2014 at 09:13:10AM +0100, Uros Bizjak wrote:
 On Tue, Nov 25, 2014 at 8:40 AM, Uros Bizjak ubiz...@gmail.com wrote:
  On Tue, Nov 25, 2014 at 12:25 AM, Jakub Jelinek ja...@redhat.com wrote:
 
  The fallback delegitimization I've added as last option mainly for
  debug info purposes, when we don't know if the base is a PIC register
  or say a PIC register plus some addend, unfortunately in some tests
  broke find_base_term, which for PLUS looks only at the first operand
  and, if recursion on it finds a base term, returns it immediately.
  So, it found base term of _GLOBAL_OFFSET_TABLE_, when the right base
  term is actually in the second operand.
 
  This patch fixes it by swapping the operands, debug info doesn't care about
  the order, it won't match in any instruction anyway, but helps alias.c.
 
  Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
 
  2014-11-24  Jakub Jelinek  ja...@redhat.com
 
  PR lto/64025
  * config/i386/i386.c (ix86_delegitimize_address): Ensure result
  comes before (addend - _GLOBAL_OFFSET_TABLE_) term.
 
  Can you also swap operands of (%ecx - %ebx) + foo? There is no point
  digging into RTX involving registers only when we know that we are
  looking for foo. This will also be consistent with the code you
  patched below.
 
 Something like attached prototype patch.

Actually, thinking about it more, at least according to
commutative_operand_precedence the canonical order is
what we used to return, i.e. (something - _G_O_T_) + (symbol_ref)
or
(something - _G_O_T_) + (const (symbol_ref +- const)).
So perhaps a better fix is to follow find_base_value, which does something
like:
/* Guess which operand is the base address:
   If either operand is a symbol, then it is the base.  If
   either operand is a CONST_INT, then the other is the base.  */
if (CONST_INT_P (src_1) || CONSTANT_P (src_0))
  return find_base_value (src_0);
else if (CONST_INT_P (src_0) || CONSTANT_P (src_1))
  return find_base_value (src_1);
and do something similar in find_base_term too.  I.e. perhaps even with
higher precedence over REG_P with REG_POINTER (or lower, in these cases
it doesn't really matter, neither argument is REG_P), choose first
operand that is CONSTANT_P and not CONST_INT_P.

Jakub
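
A minimal sketch of the operand choice discussed above, on a toy expression
tree.  The Expr type and the helper names are hypothetical stand-ins, not
GCC's rtx or the real find_base_term; the "old" function is a simplified
model that only captures "try the first operand, return the first base
found", and the "preferred" function picks a PLUS operand that is a symbolic
constant (the CONSTANT_P and not CONST_INT_P case) first.

```cpp
#include <memory>
#include <string>
#include <utility>

// Toy stand-in for RTL expressions; hypothetical, not GCC's rtx.
enum class Kind { Symbol, ConstInt, Reg, Plus, Minus };

struct Expr {
  Kind kind;
  std::string name;                // symbol or register name, empty otherwise
  std::shared_ptr<Expr> op0, op1;  // operands of Plus and Minus
};
using ExprP = std::shared_ptr<Expr>;

ExprP leaf(Kind k, const std::string &name) {
  return std::make_shared<Expr>(Expr{k, name, nullptr, nullptr});
}

ExprP binary(Kind k, ExprP a, ExprP b) {
  return std::make_shared<Expr>(Expr{k, "", std::move(a), std::move(b)});
}

// Models CONSTANT_P (x) && !CONST_INT_P (x): a symbolic constant.
bool symbolic_constant(const ExprP &e) { return e->kind == Kind::Symbol; }

// Simplified model of the old behaviour: recurse into the first operand
// and return as soon as any base term turns up.
const Expr *base_first_operand(const ExprP &e) {
  if (e->kind == Kind::Symbol) return e.get();
  if (e->kind != Kind::Plus && e->kind != Kind::Minus) return nullptr;
  if (const Expr *b = base_first_operand(e->op0)) return b;
  return base_first_operand(e->op1);
}

// Suggested variant: for PLUS, prefer an operand that is itself a symbolic
// constant, and only then fall back to the old order.
const Expr *base_prefer_symbol(const ExprP &e) {
  if (e->kind == Kind::Plus) {
    if (symbolic_constant(e->op0)) return e->op0.get();
    if (symbolic_constant(e->op1)) return e->op1.get();
  }
  return base_first_operand(e);
}
```

For (%ecx - _GLOBAL_OFFSET_TABLE_) + foo the first function reports
_GLOBAL_OFFSET_TABLE_ as the base, while the second reports foo regardless
of operand order.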

Re: [PATCH] Fix building of gengtype

2014-11-25 Thread Jakub Jelinek

On Tue, Nov 25, 2014 at 12:35:09AM +0100, Jakub Jelinek wrote:
 My last 2 bootstraps failed, both because of a race while building
 host gengtype (each time different gengtype*.o).

Found bootstrap failures even with this patch (dunno what changed on my box
that I started getting these last night, make has not changed), that time
with errors.o and gcc-ar.o.
The generated headers are solved these days in automatic dependencies world
through
# In order for parallel make to really start compiling the expensive
# objects from $(OBJS) as early as possible, build all their
# prerequisites strictly before all objects.
$(ALL_HOST_OBJS) : | $(generated_files)
and build/*.o have explicit dependencies.
I've tried to compare $(ALL_HOST_OBJS) on my box with all *.o */*.o files
I had in stage3 directory, and besides build/*.o, I found:

crtbegin.o crtbeginS.o crtbeginT.o crtend.o crtendS.o crtfastmath.o crtprec32.o 
crtprec64.o crtprec80.o
errors.o gcc-ar.o gcc-nm.o gcc-ranlib.o gengtype-lex.o gengtype.o 
gengtype-parse.o gengtype-state.o

not being listed in ALL_HOST_OBJS.  The crt*.o files come from libgcc build
and thus are ok, the rest I've tried to handle in the following updated
patch.  If the #define GENERATOR_FILE inside of the 5 files is too ugly,
another alternative might be to define both -DHOST_GENERATOR_FILE and
-DGENERATOR_FILE in Makefile.in and not error in config.h if GENERATOR_FILE
is defined, as long as HOST_GENERATOR_FILE is also defined.

2014-11-25  Jakub Jelinek  ja...@redhat.com

* Makefile.in (ALL_HOST_BACKEND_OBJS): Add $(GENGTYPE_OBJS),
gcc-ar.o, gcc-nm.o and gcc-ranlib.o.
(GENGTYPE_OBJS): New.
(gengtype-lex.o, gengtype-parse.o, gengtype-state.o, gengtype.o):
Remove explicit dependencies.
(CFLAGS-gengtype-lex.o, CFLAGS-gengtype-parse.o,
CFLAGS-gengtype-state.o, CFLAGS-gengtype.o): Add -DHOST_GENERATOR_FILE
instead of -DGENERATOR_FILE.
(CFLAGS-errors.o): New.
* gengtype.c: Instead of testing GENERATOR_FILE define, test
HOST_GENERATOR_FILE.  If defined, include config.h and define
GENERATOR_FILE afterwards, otherwise include bconfig.h.
* gengtype-parse.c: Likewise.
* gengtype-state.c: Likewise.
* gengtype-lex.l: Likewise.
* errors.c: Likewise.

--- gcc/Makefile.in.jj  2014-11-25 00:06:43.122178737 +0100
+++ gcc/Makefile.in 2014-11-25 08:55:34.727300843 +0100
@@ -1509,7 +1509,8 @@ ALL_HOST_FRONTEND_OBJS = $(foreach v,$(C
 ALL_HOST_BACKEND_OBJS = $(GCC_OBJS) $(OBJS) $(OBJS-libcommon) \
   $(OBJS-libcommon-target) @TREEBROWSER@ main.o c-family/cppspec.o \
   $(COLLECT2_OBJS) $(EXTRA_GCC_OBJS) $(GCOV_OBJS) $(GCOV_DUMP_OBJS) \
-  $(GCOV_TOOL_OBJS) lto-wrapper.o collect-utils.o
+  $(GCOV_TOOL_OBJS) $(GENGTYPE_OBJS) gcc-ar.o gcc-nm.o gcc-ranlib.o \
+  lto-wrapper.o collect-utils.o
 
 # This lists all host object files, whether they are included in this
 # compilation or not.
@@ -2484,30 +2485,31 @@ build/gengenrtl.o : gengenrtl.c $(BCONFI
 # on BCONFIG_H.  For the build objects, add -DGENERATOR_FILE manually,
 # the build-%: rule doesn't apply to them.
 
+GENGTYPE_OBJS = gengtype.o gengtype-parse.o gengtype-state.o \
+  gengtype-lex.o errors.o
+
 gengtype-lex.o build/gengtype-lex.o : gengtype-lex.c gengtype.h $(SYSTEM_H)
-gengtype-lex.o: $(CONFIG_H) $(BCONFIG_H)
-CFLAGS-gengtype-lex.o += -DGENERATOR_FILE
+CFLAGS-gengtype-lex.o += -DHOST_GENERATOR_FILE
 build/gengtype-lex.o: $(BCONFIG_H)
 
 gengtype-parse.o build/gengtype-parse.o : gengtype-parse.c gengtype.h \
   $(SYSTEM_H)
-gengtype-parse.o: $(CONFIG_H)
-CFLAGS-gengtype-parse.o += -DGENERATOR_FILE
+CFLAGS-gengtype-parse.o += -DHOST_GENERATOR_FILE
 build/gengtype-parse.o: $(BCONFIG_H)
 
 gengtype-state.o build/gengtype-state.o: gengtype-state.c $(SYSTEM_H) \
   gengtype.h errors.h double-int.h version.h $(HASHTAB_H) $(OBSTACK_H) \
   $(XREGEX_H)
-gengtype-state.o: $(CONFIG_H)
-CFLAGS-gengtype-state.o += -DGENERATOR_FILE
+CFLAGS-gengtype-state.o += -DHOST_GENERATOR_FILE
 build/gengtype-state.o: $(BCONFIG_H)
 gengtype.o build/gengtype.o : gengtype.c $(SYSTEM_H) gengtype.h\
   rtl.def insn-notes.def errors.h double-int.h version.h   \
   $(HASHTAB_H) $(OBSTACK_H) $(XREGEX_H)
-gengtype.o: $(CONFIG_H)
-CFLAGS-gengtype.o += -DGENERATOR_FILE
+CFLAGS-gengtype.o += -DHOST_GENERATOR_FILE
 build/gengtype.o: $(BCONFIG_H)
 
+CFLAGS-errors.o += -DHOST_GENERATOR_FILE
+
 build/genmddeps.o: genmddeps.c $(BCONFIG_H) $(SYSTEM_H) coretypes.h\
   errors.h $(READ_MD_H)
 build/genmodes.o : genmodes.c $(BCONFIG_H) $(SYSTEM_H) errors.h \
--- gcc/gengtype.c.jj   2014-11-21 10:17:06.135695325 +0100
+++ gcc/gengtype.c  2014-11-25 08:56:18.042523089 +0100
@@ -17,10 +17,11 @@
along with GCC; see the file COPYING3.  If not see
http://www.gnu.org/licenses/.  */
 
-#ifdef GENERATOR_FILE
-#include "bconfig.h"
-#else
+#ifdef HOST_GENERATOR_FILE
 #include "config.h"
+#define GENERATOR_FILE 1
+#else
+#include "bconfig.h"

Re: [Patch, libstdc++/63497] Avoid dereferencing invalid iterator in regex_executor

2014-11-25 Thread Tim Shen

On Wed, Oct 22, 2014 at 8:19 PM, Tim Shen tims...@google.com wrote:
 Committed. Thank you too!

I'm backporting this patch to gcc-4_9-branch. Do we usually bootstrap and
test it and then commit directly, or should it be reviewed again?


-- 
Regards,
Tim Shen
commit 1e146769d08ff19cc01a08b91ca8fd3151f34faf
Author: timshen tims...@google.com
Date:   Tue Nov 25 00:36:25 2014 -0800

PR libstdc++/63497
include/bits/regex_executor.h (_Executor::_M_word_boundary): Remove
unused parameter.
include/bits/regex_executor.tcc (_Executor::_M_dfs,
_Executor::_M_word_boundary): Avoid dereferencing _M_current at _M_end
or other invalid position.

diff --git a/libstdc++-v3/include/bits/regex_executor.h 
b/libstdc++-v3/include/bits/regex_executor.h
index 708c78e..0d1b676 100644
--- a/libstdc++-v3/include/bits/regex_executor.h
+++ b/libstdc++-v3/include/bits/regex_executor.h
@@ -134,7 +134,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   }
 
   bool
-  _M_word_boundary(_State<_TraitsT> __state) const;
+  _M_word_boundary() const;
 
   bool
   _M_lookahead(_State<_TraitsT> __state);
diff --git a/libstdc++-v3/include/bits/regex_executor.tcc 
b/libstdc++-v3/include/bits/regex_executor.tcc
index 052302b..ef49161 100644
--- a/libstdc++-v3/include/bits/regex_executor.tcc
+++ b/libstdc++-v3/include/bits/regex_executor.tcc
@@ -257,7 +257,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 _M_dfs<__match_mode>(__state._M_next);
   break;
 case _S_opcode_word_boundary:
- if (_M_word_boundary(__state) == !__state._M_neg)
+ if (_M_word_boundary() == !__state._M_neg)
 _M_dfs<__match_mode>(__state._M_next);
   break;
 // Here __state._M_alt offers a single start node for a sub-NFA.
@@ -267,9 +267,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 _M_dfs<__match_mode>(__state._M_next);
   break;
 case _S_opcode_match:
+ if (_M_current == _M_end)
+   break;
   if (__dfs_mode)
 {
- if (_M_current != _M_end && __state._M_matches(*_M_current))
+ if (__state._M_matches(*_M_current))
 {
   ++_M_current;
   _M_dfs<__match_mode>(__state._M_next);
@@ -348,25 +350,26 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _BiIter, typename _Alloc, typename _TraitsT,
 bool __dfs_mode>
 bool _Executor<_BiIter, _Alloc, _TraitsT, __dfs_mode>::
-_M_word_boundary(_State<_TraitsT> __state) const
+_M_word_boundary() const
 {
-  // By definition.
-  bool __ans = false;
-  auto __pre = _M_current;
-  --__pre;
-  if (!(_M_at_begin() && _M_at_end()))
+  bool __left_is_word = false;
+  if (_M_current != _M_begin
+ || (_M_flags & regex_constants::match_prev_avail))
 {
- if (_M_at_begin())
-   __ans = _M_is_word(*_M_current)
-  && !(_M_flags & regex_constants::match_not_bow);
- else if (_M_at_end())
-   __ans = _M_is_word(*__pre)
-  && !(_M_flags & regex_constants::match_not_eow);
- else
-   __ans = _M_is_word(*_M_current)
- != _M_is_word(*__pre);
+ auto __prev = _M_current;
+ if (_M_is_word(*std::prev(__prev)))
+   __left_is_word = true;
 }
-  return __ans;
+  bool __right_is_word =
+   _M_current != _M_end && _M_is_word(*_M_current);
+
+  if (__left_is_word == __right_is_word)
+   return false;
+  if (__left_is_word && !(_M_flags & regex_constants::match_not_eow))
+   return true;
+  if (__right_is_word && !(_M_flags & regex_constants::match_not_bow))
+   return true;
+  return false;
 }
 
 _GLIBCXX_END_NAMESPACE_VERSION
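
The new word-boundary logic in the hunk above can be read in isolation as
the following sketch.  It works on a std::string with index positions
instead of iterators, and the flag bits (kNotBow, kNotEow, kPrevAvail) and
the is_word helper are hypothetical stand-ins for
regex_constants::match_not_bow, match_not_eow, match_prev_avail and
_M_is_word; it is not the libstdc++ code itself.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical flag bits for this sketch only.
enum Flags : unsigned { kNone = 0, kNotBow = 1, kNotEow = 2, kPrevAvail = 4 };

// Assumed notion of a "word" character: alphanumeric or underscore.
bool is_word(char c) {
  return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// [begin, end) is the searched range inside text, pos the current position.
bool word_boundary(const std::string &text, std::size_t begin, std::size_t end,
                   std::size_t pos, unsigned flags) {
  bool left_is_word = false;
  // Only look left when a previous character is known to exist.  The
  // pos > 0 test is an extra guard of this sketch; the original relies on
  // the caller's match_prev_avail promise instead.
  if ((pos != begin || (flags & kPrevAvail)) && pos > 0)
    left_is_word = is_word(text[pos - 1]);

  // Never read the character at end.
  const bool right_is_word = pos != end && is_word(text[pos]);

  if (left_is_word == right_is_word)
    return false;  // word/word or non-word/non-word: no boundary
  if (left_is_word && !(flags & kNotEow))
    return true;   // end of a word
  if (right_is_word && !(flags & kNotBow))
    return true;   // beginning of a word
  return false;
}
```

With the text "ab cd" searched over its whole length, positions 0, 2, 3 and
5 are boundaries and position 1 is not; setting kNotBow suppresses the
boundaries at 0 and 3, and kNotEow those at 2 and 5.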

RE: [PATCH, AARCH64] Fix ICE in CCMP (PR64015)

2014-11-25 Thread Zhenqiang Chen

 -Original Message-
 From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
 ow...@gcc.gnu.org] On Behalf Of Richard Henderson
 Sent: Monday, November 24, 2014 4:57 PM
 To: Zhenqiang Chen; gcc-patches@gcc.gnu.org
 Cc: Marcus Shawcroft
 Subject: Re: [PATCH, AARCH64] Fix ICE in CCMP (PR64015)

 On 11/24/2014 06:11 AM, Zhenqiang Chen wrote:
  Expand pass always uses sign-extend to represent constant value. For
  the case in the patch, a 8-bit unsigned value 252 is represented as
  -4, which pass the ccmn check. After mode conversion, -4 becomes
  252, which leads to mismatch.

 This sort of thing is why I suggested from the beginning that expansion
 happen directly from trees instead of sort-of re-expanding from rtl.

 I think you're better off fixing this properly than hacking around it here.

Thanks for the comments.

Here was your previous comment: "We could avoid that by using struct
expand_operand, create_input_operand et al, then expand_insn.  That does
require that the target hooks be given trees rather than rtl as input."

I want to confirm two things with you before I rework it.
(1) expand_insn needs an optab_handler as input. Do I need to define a
ccmp_optab with different mode support in optabs.def?
(2) To make sure later operands do not clobber CC, all operands are expanded
before ccmp-first in the current implementation. If taking tree/gimple as
input, what's your preferred logic to guarantee CC is not clobbered?

Thanks!
-Zhenqiang
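
The 252 versus -4 mismatch quoted above can be reproduced with plain
integer arithmetic.  This is only an illustration: sign_extend8,
zero_extend8 and negated_fits are hypothetical helpers, and the [-31, -1]
range used for the immediate check is an assumption made for the example,
not taken from the aarch64 backend.

```cpp
#include <cstdint>

// Value of an 8-bit quantity when its bit pattern is read as signed,
// which is how the constant is represented after expansion.
std::int64_t sign_extend8(std::uint8_t v) {
  return v >= 128 ? static_cast<std::int64_t>(v) - 256 : v;
}

// Value of the low 8 bits read as unsigned, as after the mode conversion.
std::uint8_t zero_extend8(std::int64_t c) {
  return static_cast<std::uint8_t>(c);
}

// Hypothetical stand-in for the ccmn immediate check (assumed range).
bool negated_fits(std::int64_t c) { return c < 0 && c >= -31; }
```

Here sign_extend8(252) is -4, which passes negated_fits, while
zero_extend8(-4) is 252 again, which does not: the check and the later
value disagree about the same constant.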

[PATCH, PR63995, CHKP] Use single static bounds var for varpool nodes sharing asm name

2014-11-25 Thread Ilya Enkovich

Hi,

This patch partly fixes PR bootstrap/63995 by avoiding duplicating static 
bounds vars.  With this fix bootstrap still fails at stage 2 and 3 comparison.

Bootstrapped and checked on x86_64-unknown-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
gcc/

2014-11-25  Ilya Enkovich  ilya.enkov...@intel.com

PR bootstrap/63995
* tree-chkp.c (chkp_make_static_bounds): Share bounds var
between nodes sharing assembler name.

gcc/testsuite

2014-11-25  Ilya Enkovich  ilya.enkov...@intel.com

PR bootstrap/63995
* g++.dg/dg.exp: Add mpx-dg.exp.
* g++.dg/pr63995-1.C: New.


diff --git a/gcc/testsuite/g++.dg/dg.exp b/gcc/testsuite/g++.dg/dg.exp
index 14beae1..44eab0c 100644
--- a/gcc/testsuite/g++.dg/dg.exp
+++ b/gcc/testsuite/g++.dg/dg.exp
@@ -18,6 +18,7 @@
 
 # Load support procs.
 load_lib g++-dg.exp
+load_lib mpx-dg.exp
 
 # If a testcase doesn't have special options, use these.
 global DEFAULT_CXXFLAGS
diff --git a/gcc/testsuite/g++.dg/pr63995-1.C b/gcc/testsuite/g++.dg/pr63995-1.C
new file mode 100644
index 0000000..82e7606
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr63995-1.C
@@ -0,0 +1,16 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-require-effective-target mpx } */
+/* { dg-options "-O2 -g -fcheck-pointer-bounds -mmpx" } */
+
+int test1 (int i)
+{
+  extern const int arr[10];
+  return arr[i];
+}
+
+extern const int arr[10];
+
+int test2 (int i)
+{
+  return arr[i];
+}
diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
index 3e38691..d425084 100644
--- a/gcc/tree-chkp.c
+++ b/gcc/tree-chkp.c
@@ -2727,9 +2727,29 @@ chkp_make_static_bounds (tree obj)
   /* First check if we already have required var.  */
   if (chkp_static_var_bounds)
 {
-  slot = chkp_static_var_bounds->get (obj);
-  if (slot)
-   return *slot;
+  /* If there is a symbol sharing assembler name with obj,
+we may use its bounds.  */
+  if (TREE_CODE (obj) == VAR_DECL)
+   {
+ varpool_node *node = varpool_node::get_create (obj);
+
+ while (node->previous_sharing_asm_name)
+   node = (varpool_node *)node->previous_sharing_asm_name;
+
+ while (node)
+   {
+ slot = chkp_static_var_bounds->get (node->decl);
+ if (slot)
+   return *slot;
+ node = (varpool_node *)node->next_sharing_asm_name;
+   }
+   }
+  else
+   {
+ slot = chkp_static_var_bounds->get (obj);
+ if (slot)
+   return *slot;
+   }
 }
 
   /* Build decl for bounds var.  */
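
The lookup added by this hunk, rewind to the first node sharing the
assembler name and then walk forward until a cached bounds var is found,
can be sketched on its own as below.  Node, its prev/next links and the
integer decl ids are hypothetical stand-ins for varpool_node and its
sharing_asm_name chain, and the std::unordered_map plays the role of
chkp_static_var_bounds.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a varpool node on a sharing-asm-name chain.
struct Node {
  int decl;              // stands in for node->decl
  Node *prev = nullptr;  // previous_sharing_asm_name
  Node *next = nullptr;  // next_sharing_asm_name
  explicit Node(int d) : decl(d) {}
};

// Return the bounds cached for any declaration on the chain containing
// node, or nullptr if none has been created yet.
const std::string *find_shared_bounds(
    Node *node, const std::unordered_map<int, std::string> &cache) {
  // Rewind to the head of the chain.
  while (node->prev)
    node = node->prev;
  // Walk forward and return the first cached entry.
  for (; node; node = node->next) {
    auto it = cache.find(node->decl);
    if (it != cache.end())
      return &it->second;
  }
  return nullptr;
}
```

Two declarations of the same extern array, as in the pr63995-1.C test, end
up on one chain, so the second lookup finds the bounds var created for the
first instead of making a duplicate.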

Re: [Patch, libstdc++/63775] Fix regex bracket expression parsing

2014-11-25 Thread Tim Shen

On Wed, Nov 12, 2014 at 11:45 PM, Tim Shen tims...@google.com wrote:
 Committed with comment fix and slight change on testcase
 (VERIFY(false) at end of the try block -- must throw).

Is it possible to backport this patch to 4.9 branch? It's an important
fix, but I'm not sure if there's any binary compatibility problem. Is
it fine because it's only _Compiler class, which is an intermediate
class, that's modified?


-- 
Regards,
Tim Shen

[PATCH, PR64056, i386] Fix chkp tests requiring mempcpy

2014-11-25 Thread Ilya Enkovich

Hi,

This patch adds a check for mempcpy availability to the tests requiring it.
Checked with RUNTESTFLAGS="--target_board='unix{-m32,}' i386.exp=chkp-*".
OK for trunk?

Thanks,
Ilya
--
2014-11-25  Ilya Enkovich  ilya.enkov...@intel.com

PR target/64056
* gcc.target/i386/chkp-strlen-4.c: Add mempcpy target check.
* gcc.target/i386/chkp-stropt-4.c: Likewise.
* gcc.target/i386/chkp-stropt-8.c: Likewise.
* gcc.target/i386/chkp-stropt-12.c: Likewise.
* gcc.target/i386/chkp-stropt-16.c: Likewise.


diff --git a/gcc/testsuite/gcc.target/i386/chkp-strlen-4.c 
b/gcc/testsuite/gcc.target/i386/chkp-strlen-4.c
index a9ebe2b..2da762a 100644
--- a/gcc/testsuite/gcc.target/i386/chkp-strlen-4.c
+++ b/gcc/testsuite/gcc.target/i386/chkp-strlen-4.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target mpx } */
+/* { dg-require-effective-target mempcpy } */
 /* { dg-options -fcheck-pointer-bounds -mmpx -O2 -fdump-tree-strlen 
-D_GNU_SOURCE } */
 /* { dg-final { scan-tree-dump-times strlen 1 strlen } } */
 /* { dg-final { cleanup-tree-dump strlen } } */
diff --git a/gcc/testsuite/gcc.target/i386/chkp-stropt-12.c 
b/gcc/testsuite/gcc.target/i386/chkp-stropt-12.c
index 94e936d..01a5159 100644
--- a/gcc/testsuite/gcc.target/i386/chkp-stropt-12.c
+++ b/gcc/testsuite/gcc.target/i386/chkp-stropt-12.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target mpx } */
+/* { dg-require-effective-target mempcpy } */
 /* { dg-options -fcheck-pointer-bounds -mmpx -O2 -fdump-tree-chkpopt 
-fchkp-use-fast-string-functions -D_GNU_SOURCE } */
 /* { dg-final { scan-tree-dump-not mempcpy_nobnd chkpopt } } */
 /* { dg-final { cleanup-tree-dump chkpopt } } */
diff --git a/gcc/testsuite/gcc.target/i386/chkp-stropt-16.c 
b/gcc/testsuite/gcc.target/i386/chkp-stropt-16.c
index 4b26d58..f925ef9 100644
--- a/gcc/testsuite/gcc.target/i386/chkp-stropt-16.c
+++ b/gcc/testsuite/gcc.target/i386/chkp-stropt-16.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target mpx } */
+/* { dg-require-effective-target mempcpy } */
 /* { dg-options -fcheck-pointer-bounds -mmpx -O2 -fdump-tree-chkpopt 
-fchkp-use-nochk-string-functions -fchkp-use-fast-string-functions 
-D_GNU_SOURCE } */
 /* { dg-final { scan-tree-dump mempcpy_nobnd_nochk chkpopt } } */
 /* { dg-final { cleanup-tree-dump chkpopt } } */
diff --git a/gcc/testsuite/gcc.target/i386/chkp-stropt-4.c 
b/gcc/testsuite/gcc.target/i386/chkp-stropt-4.c
index 4ee2390..3ae6bf5 100644
--- a/gcc/testsuite/gcc.target/i386/chkp-stropt-4.c
+++ b/gcc/testsuite/gcc.target/i386/chkp-stropt-4.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target mpx } */
+/* { dg-require-effective-target mempcpy } */
 /* { dg-options -fcheck-pointer-bounds -mmpx -O2 -fdump-tree-chkpopt 
-fchkp-use-nochk-string-functions -D_GNU_SOURCE } */
 /* { dg-final { scan-tree-dump mempcpy_nochk chkpopt } } */
 /* { dg-final { cleanup-tree-dump chkpopt } } */
diff --git a/gcc/testsuite/gcc.target/i386/chkp-stropt-8.c 
b/gcc/testsuite/gcc.target/i386/chkp-stropt-8.c
index 8c3b15d..6d6d55e 100644
--- a/gcc/testsuite/gcc.target/i386/chkp-stropt-8.c
+++ b/gcc/testsuite/gcc.target/i386/chkp-stropt-8.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target mpx } */
+/* { dg-require-effective-target mempcpy } */
 /* { dg-options -fcheck-pointer-bounds -mmpx -O2 -fdump-tree-chkpopt 
-fchkp-use-fast-string-functions -D_GNU_SOURCE } */
 /* { dg-final { scan-tree-dump mempcpy_nobnd chkpopt } } */
 /* { dg-final { cleanup-tree-dump chkpopt } } */

Re: [PATCH, ciklplus]: Use -ffloat-store for 32bit x86 in cilk-plus/AN/builtin_fn_{custom,mutating}.c

2014-11-25 Thread Richard Biener

On Mon, Nov 24, 2014 at 10:33 PM, Jeff Law l...@redhat.com wrote:
 On 11/22/14 11:50, Uros Bizjak wrote:

 Hello!

 These two tests fix PR target/63847 [1], where x87 excess precision
 causes testcase to fail. The problem was triggered by -fpic, please
 see the PR for analysis.

 The patch adds -ffloat-store for 32bit x86 target, a standard and well
 tested solution for this problem.

 2014-11-22  Uros Bizjak  ubiz...@gmail.com

  PR target/63847
  * c-c++-common/cilk-plus/AN/builtin_fn_custom.c: Add -ffloat-store
  for 32bit x86 targets.
  * c-c++-common/cilk-plus/AN/builtin_fn_mutating.c: Ditto.

 OK.

Don't we have -fexcess-precision=standard for this now?

Richard.

 Jeff

Re: [PATCH] Fix linemap_line_start (PR preprocessor/60436)

2014-11-25 Thread Richard Biener

On Tue, Nov 25, 2014 at 12:22 AM, Jakub Jelinek ja...@redhat.com wrote:
 Hi!

 As mentioned in the PR, when preprocessing very large files, if there are
 huge numbers of lines where no #line is emitted, we might not detect
 overflowing into adhoc locations.
 Apparently in the add_map case we already handle that fine, by first
 stopping tracking columns and after another 256M lines give up on tracking
 locations, so this patch just makes sure we enter that path if
 going over those limits.

 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

 2014-11-24  Jakub Jelinek  ja...@redhat.com

 PR preprocessor/60436
 * line-map.c (linemap_line_start): If highest is above 0x60000000
 and we are still tracking columns or highest is above 0x70000000,
 force add_map.

 --- libcpp/line-map.c.jj2014-11-12 08:06:57.0 +0100
 +++ libcpp/line-map.c   2014-11-24 12:14:52.691276169 +0100
 @@ -529,10 +529,10 @@ linemap_line_start (struct line_maps *se
          && line_delta * ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map) > 1000)
       || (max_column_hint >= (1U << ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map)))
       || (max_column_hint <= 80
 -         && ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map) >= 10))
 -    {
 -      add_map = true;
 -    }
 +         && ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map) >= 10)
 +      || (highest > 0x60000000
 +         && (set->max_column_hint || highest > 0x70000000)))
 +    add_map = true;
    else
      max_column_hint = set->max_column_hint;
    if (add_map)
 @@ -543,7 +543,7 @@ linemap_line_start (struct line_maps *se
       /* If the column number is ridiculous or we've allocated a huge
          number of source_locations, give up on column numbers. */
       max_column_hint = 0;
 -     if (highest >0x70000000)
 +     if (highest > 0x70000000)
 return 0;
   column_bits = 0;
 }

 Jakub
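
A minimal sketch of the extra condition this patch adds to
linemap_line_start, assuming the two limits are 0x60000000 and 0x70000000
(their difference is exactly the 256M mentioned in the description).  The
constant and function names are hypothetical; only the shape of the test is
taken from the patch.

```cpp
#include <cstdint>

// Assumed limits: past the first, stop tracking columns; past the second,
// stop tracking locations altogether.
constexpr std::uint32_t kStopColumns = 0x60000000;
constexpr std::uint32_t kStopLocations = 0x70000000;

static_assert(kStopLocations - kStopColumns == 256u * 1024u * 1024u,
              "the gap between the two limits is 256M");

// True when a new line map must be forced so that the existing add_map
// path gets a chance to drop columns or give up on locations.
bool must_force_new_map(std::uint32_t highest, bool tracking_columns) {
  return highest > kStopColumns
         && (tracking_columns || highest > kStopLocations);
}
```

Below the first limit nothing changes.  Between the two limits a new map is
forced only while columns are still tracked, and above the second limit it
is always forced.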

Re: [PATCH, AARCH64] Fix ICE in CCMP (PR64015)

2014-11-25 Thread Richard Henderson

On 11/25/2014 09:41 AM, Zhenqiang Chen wrote:
 I want to confirm with you two things before I rework it.
 (1) expand_insn needs an optab_handler as input. Do I need to define a 
 ccmp_optab with different mode support in optabs.def?

No, look again: expand_insn needs an enum insn_code as input.  Since this is
the backend, you can use any icode name you like, which means that you can use
CODE_FOR_ccmp_and etc directly.

 (2) To make sure later operands not clobber CC, all operands are expanded 
 before ccmp-first in current implementation. If taking tree/gimple as input, 
 what's your preferred logic to guarantee CC not clobbered?

Hmm.  Perhaps the target hook will need to output two sequences, each of which
will be concatenated while looping around the calls to gen_ccmp_next.  The
first sequence will be operand preparation and the second sequence will be ccmp
generation.

Something like

bool
aarch64_gen_ccmp_start(rtx *prep_seq, rtx *gen_seq,
   int cmp_code, int bit_code,
   tree op0, tree op1)
{
  bool success;

  start_sequence ();
  // Widen and expand operands
  *prep_seq = get_insns ();
  end_sequence ();

  start_sequence ();
  // Generate the first compare
  *gen_seq = get_insns ();
  end_sequence ();

  return success;
}

bool
aarch64_gen_ccmp_next(rtx *prep_seq, rtx *gen_seq,
  rtx prev, int cmp_code, int bit_code,
  tree op0, tree op1)
{
  bool success;

  push_to_sequence (*prep_seq);
  // Widen and expand operands
  *prep_seq = get_insns ();
  end_sequence ();

  push_to_sequence (*gen_seq);
  // Generate the next ccmp
  *gen_seq = get_insns ();
  end_sequence ();

  return success;
}

If there are ever any failures, the middle-end can simply discard the
sequences.  If everything succeeds, it simply calls emit_insn on both sequences.


r~
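
The two-sequence idea above can be modelled without any GCC types.  In this
sketch the sequences are plain vectors of strings, and CcmpChain,
add_compare and emit are hypothetical names; the point is only that every
operand preparation ends up ahead of every compare, so nothing emitted
between the compares can clobber CC.

```cpp
#include <string>
#include <vector>

using Seq = std::vector<std::string>;

// Hypothetical holder for the two sequences the hook would fill in.
struct CcmpChain {
  Seq prep;  // operand preparation, may clobber CC
  Seq gen;   // the compare and conditional compares themselves
};

// Append one comparison: its operand setup goes to prep, the compare to gen.
void add_compare(CcmpChain &c, const std::string &op0, const std::string &op1,
                 bool first) {
  c.prep.push_back("expand " + op0);
  c.prep.push_back("expand " + op1);
  c.gen.push_back((first ? "cmp " : "ccmp ") + op0 + ", " + op1);
}

// On success the caller emits prep followed by gen; on failure it would
// simply drop both.
Seq emit(const CcmpChain &c) {
  Seq out = c.prep;
  out.insert(out.end(), c.gen.begin(), c.gen.end());
  return out;
}
```

Chaining (a, b) and then (c, d) yields the four "expand" entries first and
"cmp a, b", "ccmp c, d" last, which mirrors calling emit_insn on the
preparation sequence and then on the compare sequence.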

Re: [PATCH] Add verify_sese

2014-11-25 Thread Richard Biener

On Tue, Nov 25, 2014 at 1:01 AM, Tom de Vries tom_devr...@mentor.com wrote:
 Richard,

 I ran into a problem with my oacc kernels directive patch series where
 tail-merge added another entry into a region that was previously
 single-entry-single-exit.

 That resulted in hitting this assert in calc_dfs_tree:
 ...
   /* This aborts e.g. when there is _no_ path from ENTRY to EXIT at all.  */
   gcc_assert (di-nodes == (unsigned int) n_basic_blocks_for_fn (cfun) - 1);
 ...
 during a call to move_sese_region_to_fn.

 This patch makes sure that we abort earlier, with a clearer message of what
 is actually wrong.

 Bootstrapped and reg-tested on x86_64.

 OK for trunk/stage3?

I believe someone made the function work for SEME regions and I believe
it is actually used to copy loops with multiple exits so I don't see how the
patch can work in these cases?

Thanks,
Richard.

 Thanks,
 - Tom

Re: [RFC] First steps towards segregating types.

2014-11-25 Thread Richard Biener

On Tue, Nov 25, 2014 at 2:06 AM, Joseph Myers jos...@codesourcery.com wrote:
 On Mon, 24 Nov 2014, Richard Biener wrote:

  TREE_LIST should die (with the typical replacement being vec<something>);
  most lists do not need all the overhead of individually allocated objects
  with (code, flags, type, chain, value, purpose).  Probably TREE_VEC too.

 Note that there is nothing wrong with TREE_LIST or TREE_VEC if
 they were based off tree_base only.  That they inherit from
 tree_common is the bug to fix - either by not using TREE_LIST or
 TREE_VEC from the users that use fields from tree_common or
 tree_typed or by adjusting those users to not need those fields.

 Even inheriting from tree_base, typically lists don't need code (because
 you know statically that something is a list) or flags.  And because
 generally something is statically a list or not a list, there is no
 particular benefit from sharing the static type of tree, and better
 compile-time checking if there is no such common base class for list and
 other objects at all.

 (Identifiers are another case that doesn't generally benefit from having a
 common static type of tree.)

All true - but 'tree's were built on the premise that everything is a tree.
An incremental change is to make that sane - removing bits out of
the tree space is also possible (though please not by a wart like
a TYPE_REF tree node ...)

Richard.


 --
 Joseph S. Myers
 jos...@codesourcery.com

Re: [Patch] Improving jump-thread pass for PR 54742

2014-11-25 Thread Richard Biener

On Mon, Nov 24, 2014 at 11:05 PM, Sebastian Pop seb...@gmail.com wrote:
 Jeff Law wrote:
 On 11/23/14 15:22, Sebastian Pop wrote:
 The second patch attached limits the search for FSM jump threads to loops.  
 With
 that patch, we are now down to 470 jump threads in an x86_64-linux bootstrap
 (and 424 jump threads on powerpc64-linux bootstrap.)
 
 Yea, that was one of the things I was going to poke at as well as a
 quick scan of your patch gave me the impression it wasn't limited to
 loops.

 Again, I haven't looked much at the patch, but I got the impression
 you're doing a backwards walk through the predecessors to discover
 the result of the COND_EXPR.  Correct?

 Yes.


 That's something I'd been wanting to do -- basically start with a
 COND_EXPR, then walk the dataflow backwards substituting values into
 the COND_EXPR (possibly creating non-gimple).  Ultimately the goal
 is to substitute and fold, getting to a constant :-)

 The forward exhaustive stuff we do now is, crazy.   The backwards
 approach could be decoupled from DOM & VRP into an independent pass,
 which I think would be wise.

 Using a SEME region copier is also something I really wanted to do
 long term.  In fact, I believe a lot of tree-ssa-threadupdate.c
 ought to be ripped out and replaced with a SEME based copier.

 I did an experiment along these lines over the weekend, and now that you
 mention it, I feel less shy to speak about it.  The patch does not yet pass
 bootstrap, and there are still about 20 failing test cases.  I feel better
 reading the code generation part of jump-threading after this patch ;-)
 Basically I think all the tree-ssa-threadupdate.c can be replaced by
 duplicate_seme_region that generalizes the code generation.

Btw I once thought about doing on-the-fly lattice use/update and folding
during basic-block copying (or even re-generating expressions via
simplifying gimple_build ()).  Or have a substitute-and-fold like
facility that can run on SEME regions and do this.

Richard.
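
A toy sketch of the backward walk discussed above: start from a condition, walk the statements on a path backwards, substitute definitions, and fold once a constant is reached. This is an illustration under simplifying assumptions, not the patch's implementation: Assign and fold_condition_backwards are hypothetical names, statements are limited to copies and constant assignments, and the "path" is a flat list rather than a CFG.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// One simple assignment, "lhs = constant" or "lhs = rhs_var"
// (toy stand-in for a GIMPLE statement).
struct Assign {
  std::string lhs;
  std::optional<int> constant;  // set when the right-hand side is a literal
  std::string rhs_var;          // used when the right-hand side is a variable
};

// Walk the path backwards from the condition "var == value".  Each time the
// tracked variable is defined by a copy, substitute the source and keep
// walking; when it is defined by a constant, fold the comparison.
// Returns nullopt when no defining constant is found on the path.
std::optional<bool> fold_condition_backwards(const std::vector<Assign> &path,
                                             std::string var, int value) {
  for (auto it = path.rbegin(); it != path.rend(); ++it) {
    if (it->lhs != var) continue;
    if (it->constant) return *it->constant == value;
    var = it->rhs_var;  // substitute and continue towards the path entry
  }
  return std::nullopt;
}
```

When the fold succeeds for a given path, the branch outcome is known on that path, which is the information a jump-threading step needs.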

 It appears you've built at least parts of two pieces needed to all
 this as a Bodik style optimizer.  Which is exactly the long term
 direction I think this code ought to take.


 
 One of the reasons I think we see more branches is that in sese region
 copying we do not use the knowledge of the value of the condition for the
 last branch in a jump-thread path: we rely on other propagation passes to
 remove the branch.  The last attached patch adds:
 
/* Remove the last branch in the jump thread path.  */
remove_ctrl_stmt_and_useless_edges (region_copy[n_region - 1], exit->dest);
 That's certainly a possibility.  But I would expect that even with
 this limitation something would be picking up the fact that the
 branch is statically computable (even if it's an RTL optimizer).
 But it's definitely something to look for.

 
 Please let me know if the attached patches are producing better results on 
 gcc.

 For the trunk:
   instructions:1339016494968
   branches :243568982489

 First version of your patch:

   instructions:1339739533291
   branches: 243806615986

 Latest version of your patch:

   instructions:1339749122609
   branches: 243809838262

 I think I got about the same results.

 I got my scripts installed on the gcc-farm.  I first used an x86_64 gcc75 and
 valgrind was crashing not recognizing how to decode an instruction.  Then I
 moved to gcc112 a powerpc64-linux where I got this data from stage2 cc1plus
 compiling the same file alias.ii at -O2: (I got 3 runs of each mostly because
 there is a bit of noise in all these numbers)

 $ valgrind --tool=cachegrind --cache-sim=no --branch-sim=yes ./cc1plus -O2 
 ~/alias.ii

 all 4 patches:

 ==153617== I   refs:  13,914,038,211
 ==153617==
 ==153617== Branches:   1,926,407,760  (1,879,827,481 cond + 46,580,279 ind)
 ==153617== Mispredicts:  144,890,904  (  132,094,105 cond + 12,796,799 ind)
 ==153617== Mispred rate: 7.5% (  7.0% +   27.4%   )

 ==34993== I   refs:  13,915,335,629
 ==34993==
 ==34993== Branches:   1,926,597,919  (1,880,017,558 cond + 46,580,361 ind)
 ==34993== Mispredicts:  144,974,266  (  132,177,440 cond + 12,796,826 ind)
 ==34993== Mispred rate: 7.5% (  7.0% +   27.4%   )

 ==140841== I   refs:  13,915,334,459
 ==140841==
 ==140841== Branches:   1,926,597,819  (1,880,017,458 cond + 46,580,361 ind)
 ==140841== Mispredicts:  144,974,296  (  132,177,470 cond + 12,796,826 ind)
 ==140841== Mispred rate: 7.5% (  7.0% +   27.4%   )

 patch 1:

 ==99902== I   refs:  13,915,069,710
 ==99902==
 ==99902== Branches:   1,926,963,813  (1,880,376,148 cond + 46,587,665 ind)
 ==99902== Mispredicts:  145,501,564  (  132,656,576 cond + 12,844,988 ind)
 ==99902== Mispred rate: 7.5% (  7.0% +   27.5%   )

 ==3907== I   refs:  13,915,082,469
 ==3907==
 ==3907== Branches:

Re: [PATCH, ciklplus]: Use -ffloat-store for 32bit x86 in cilk-plus/AN/builtin_fn_{custom,mutating}.c

2014-11-25 Thread Uros Bizjak

On Tue, Nov 25, 2014 at 10:23 AM, Richard Biener
richard.guent...@gmail.com wrote:
 On Mon, Nov 24, 2014 at 10:33 PM, Jeff Law l...@redhat.com wrote:
 On 11/22/14 11:50, Uros Bizjak wrote:

 Hello!

 These two tests fix PR target/63847 [1], where x87 excess precision
 causes testcase to fail. The problem was triggered by -fpic, please
 see the PR for analysis.

 The patch adds -ffloat-store for 32bit x86 target, a standard and well
 tested solution for this problem.

 2014-11-22  Uros Bizjak  ubiz...@gmail.com

  PR target/63847
  * c-c++-common/cilk-plus/AN/builtin_fn_custom.c: Add -ffloat-store
  for 32bit x86 targets.
  * c-c++-common/cilk-plus/AN/builtin_fn_mutating.c: Ditto.

 OK.

 Don't we have -fexcess-precision=standard for this now?

Oh ... indeed. I will update the patch to enable it for all x86 targets.

Thanks,
Uros.

Re: [PATCH, PR63995, CHKP] Use single static bounds var for varpool nodes sharing asm name

2014-11-25 Thread Richard Biener

On Tue, Nov 25, 2014 at 9:45 AM, Ilya Enkovich enkovich@gmail.com wrote:
 Hi,

 This patch partly fixes PR bootstrap/63995 by avoiding duplicating static 
 bounds vars.  With this fix bootstrap still fails at stage 2 and 3 comparison.

 Bootstrapped and checked on x86_64-unknown-linux-gnu.  OK for trunk?

 Thanks,
 Ilya
 --
 gcc/

 2014-11-25  Ilya Enkovich  ilya.enkov...@intel.com

 PR bootstrap/63995
 * tree-chkp (chkp_make_static_bounds): Share bounds var
 between nodes sharing assembler name.

 gcc/testsuite

 2014-11-25  Ilya Enkovich  ilya.enkov...@intel.com

 PR bootstrap/63995
 * g++.dg/dg.exp: Add mpx-dg.exp.
 * g++.dg/pr63995-1.C: New.


 diff --git a/gcc/testsuite/g++.dg/dg.exp b/gcc/testsuite/g++.dg/dg.exp
 index 14beae1..44eab0c 100644
 --- a/gcc/testsuite/g++.dg/dg.exp
 +++ b/gcc/testsuite/g++.dg/dg.exp
 @@ -18,6 +18,7 @@

  # Load support procs.
  load_lib g++-dg.exp
 +load_lib mpx-dg.exp

  # If a testcase doesn't have special options, use these.
  global DEFAULT_CXXFLAGS
 diff --git a/gcc/testsuite/g++.dg/pr63995-1.C 
 b/gcc/testsuite/g++.dg/pr63995-1.C
 new file mode 100644
 index 000..82e7606
 --- /dev/null
 +++ b/gcc/testsuite/g++.dg/pr63995-1.C
 @@ -0,0 +1,16 @@
 +/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
 +/* { dg-require-effective-target mpx } */
 +/* { dg-options "-O2 -g -fcheck-pointer-bounds -mmpx" } */
 +
 +int test1 (int i)
 +{
 +  extern const int arr[10];
 +  return arr[i];
 +}
 +
 +extern const int arr[10];
 +
 +int test2 (int i)
 +{
 +  return arr[i];
 +}
 diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
 index 3e38691..d425084 100644
 --- a/gcc/tree-chkp.c
 +++ b/gcc/tree-chkp.c
 @@ -2727,9 +2727,29 @@ chkp_make_static_bounds (tree obj)
/* First check if we already have required var.  */
if (chkp_static_var_bounds)
  {
 -  slot = chkp_static_var_bounds->get (obj);
 -  if (slot)
 -   return *slot;
 +  /* If there is a symbol sharing assembler name with obj,
 +we may use its bounds.  */
 +  if (TREE_CODE (obj) == VAR_DECL)
 +   {
 + varpool_node *node = varpool_node::get_create (obj);
 +
 + while (node->previous_sharing_asm_name)
 +   node = (varpool_node *)node->previous_sharing_asm_name;
 +
 + while (node)
 +   {
 + slot = chkp_static_var_bounds->get (node->decl);
 + if (slot)
 +   return *slot;
 + node = (varpool_node *)node->next_sharing_asm_name;
 +   }

Hum.  varpool_node::get returns the ultimate alias target thus the
walking shouldn't be necessary.  Just

  node = varpool_node::get_create (obj);
  slot = chkp_static_var_bounds->get (node->decl);
  if (slot)
    return *slot;

and then making sure to set the decl also for node->decl.  I suppose
it really asks for making chkp_static_var_bounds->get based on
a varpool node and not a decl so you consistently use the ultimate
alias target.

Richard.

 +   }
 +  else
 +   {
 + slot = chkp_static_var_bounds->get (obj);
 + if (slot)
 +   return *slot;
 +   }
  }

/* Build decl for bounds var.  */
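
A self-contained sketch of the lookup the patch performs, under simplifying assumptions: Node is a hypothetical stand-in for varpool_node with only the two sharing-asm-name links, decls are plain integers, and BoundsMap stands in for chkp_static_var_bounds. It rewinds to the head of the chain of nodes sharing an assembler name, then scans forward for the first node that already has a bounds variable.

```cpp
#include <cassert>
#include <unordered_map>

// Toy stand-in for varpool_node: only the fields the walk needs.
struct Node {
  int decl;                                   // stand-in for the decl tree
  Node *next_sharing_asm_name = nullptr;
  Node *previous_sharing_asm_name = nullptr;
};

// Toy stand-in for chkp_static_var_bounds: decl -> bounds variable id.
using BoundsMap = std::unordered_map<int, int>;

// Return a pointer to the bounds entry of any node sharing the assembler
// name with NODE, or nullptr if none of them has one yet.
const int *find_shared_bounds(const BoundsMap &map, Node *node) {
  // Rewind to the first node in the chain.
  while (node->previous_sharing_asm_name)
    node = node->previous_sharing_asm_name;
  // Scan forward over every node sharing the name.
  for (; node; node = node->next_sharing_asm_name) {
    auto it = map.find(node->decl);
    if (it != map.end()) return &it->second;
  }
  return nullptr;
}
```

Keying the map on one canonical node per assembler name would avoid the walk entirely, at the cost of having to pick and maintain that canonical node.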

Re: [PATCH, PR64056, i386] Fix chkp tests requiring mempcpy

2014-11-25 Thread Richard Biener

On Tue, Nov 25, 2014 at 10:11 AM, Ilya Enkovich enkovich@gmail.com wrote:
 Hi,

 This patch adds a check for mempcpy availability to the tests requiring it.
 Checked with RUNTESTFLAGS="--target_board='unix{-m32,}' i386.exp=chkp-*".  OK
 for trunk?

Ok.

Thanks,
Richard.

 Thanks,
 Ilya
 --
 2014-11-25  Ilya Enkovich  ilya.enkov...@intel.com

 PR target/64056
 * gcc.target/i386/chkp-strlen-4.c: Add mempcpy target check.
 * gcc.target/i386/chkp-stropt-4.c: Likewise.
 * gcc.target/i386/chkp-stropt-8.c: Likewise.
 * gcc.target/i386/chkp-stropt-12.c: Likewise.
 * gcc.target/i386/chkp-stropt-16.c: Likewise.


 diff --git a/gcc/testsuite/gcc.target/i386/chkp-strlen-4.c 
 b/gcc/testsuite/gcc.target/i386/chkp-strlen-4.c
 index a9ebe2b..2da762a 100644
 --- a/gcc/testsuite/gcc.target/i386/chkp-strlen-4.c
 +++ b/gcc/testsuite/gcc.target/i386/chkp-strlen-4.c
 @@ -1,5 +1,6 @@
  /* { dg-do compile } */
  /* { dg-require-effective-target mpx } */
 +/* { dg-require-effective-target mempcpy } */
  /* { dg-options "-fcheck-pointer-bounds -mmpx -O2 -fdump-tree-strlen -D_GNU_SOURCE" } */
  /* { dg-final { scan-tree-dump-times "strlen" 1 "strlen" } } */
  /* { dg-final { cleanup-tree-dump "strlen" } } */
 diff --git a/gcc/testsuite/gcc.target/i386/chkp-stropt-12.c 
 b/gcc/testsuite/gcc.target/i386/chkp-stropt-12.c
 index 94e936d..01a5159 100644
 --- a/gcc/testsuite/gcc.target/i386/chkp-stropt-12.c
 +++ b/gcc/testsuite/gcc.target/i386/chkp-stropt-12.c
 @@ -1,5 +1,6 @@
  /* { dg-do compile } */
  /* { dg-require-effective-target mpx } */
 +/* { dg-require-effective-target mempcpy } */
  /* { dg-options "-fcheck-pointer-bounds -mmpx -O2 -fdump-tree-chkpopt -fchkp-use-fast-string-functions -D_GNU_SOURCE" } */
  /* { dg-final { scan-tree-dump-not "mempcpy_nobnd" "chkpopt" } } */
  /* { dg-final { cleanup-tree-dump "chkpopt" } } */
 diff --git a/gcc/testsuite/gcc.target/i386/chkp-stropt-16.c 
 b/gcc/testsuite/gcc.target/i386/chkp-stropt-16.c
 index 4b26d58..f925ef9 100644
 --- a/gcc/testsuite/gcc.target/i386/chkp-stropt-16.c
 +++ b/gcc/testsuite/gcc.target/i386/chkp-stropt-16.c
 @@ -1,5 +1,6 @@
  /* { dg-do compile } */
  /* { dg-require-effective-target mpx } */
 +/* { dg-require-effective-target mempcpy } */
  /* { dg-options "-fcheck-pointer-bounds -mmpx -O2 -fdump-tree-chkpopt -fchkp-use-nochk-string-functions -fchkp-use-fast-string-functions -D_GNU_SOURCE" } */
  /* { dg-final { scan-tree-dump "mempcpy_nobnd_nochk" "chkpopt" } } */
  /* { dg-final { cleanup-tree-dump "chkpopt" } } */
 diff --git a/gcc/testsuite/gcc.target/i386/chkp-stropt-4.c 
 b/gcc/testsuite/gcc.target/i386/chkp-stropt-4.c
 index 4ee2390..3ae6bf5 100644
 --- a/gcc/testsuite/gcc.target/i386/chkp-stropt-4.c
 +++ b/gcc/testsuite/gcc.target/i386/chkp-stropt-4.c
 @@ -1,5 +1,6 @@
  /* { dg-do compile } */
  /* { dg-require-effective-target mpx } */
 +/* { dg-require-effective-target mempcpy } */
  /* { dg-options "-fcheck-pointer-bounds -mmpx -O2 -fdump-tree-chkpopt -fchkp-use-nochk-string-functions -D_GNU_SOURCE" } */
  /* { dg-final { scan-tree-dump "mempcpy_nochk" "chkpopt" } } */
  /* { dg-final { cleanup-tree-dump "chkpopt" } } */
 diff --git a/gcc/testsuite/gcc.target/i386/chkp-stropt-8.c 
 b/gcc/testsuite/gcc.target/i386/chkp-stropt-8.c
 index 8c3b15d..6d6d55e 100644
 --- a/gcc/testsuite/gcc.target/i386/chkp-stropt-8.c
 +++ b/gcc/testsuite/gcc.target/i386/chkp-stropt-8.c
 @@ -1,5 +1,6 @@
  /* { dg-do compile } */
  /* { dg-require-effective-target mpx } */
 +/* { dg-require-effective-target mempcpy } */
  /* { dg-options "-fcheck-pointer-bounds -mmpx -O2 -fdump-tree-chkpopt -fchkp-use-fast-string-functions -D_GNU_SOURCE" } */
  /* { dg-final { scan-tree-dump "mempcpy_nobnd" "chkpopt" } } */
  /* { dg-final { cleanup-tree-dump "chkpopt" } } */

Re: [PATCH] Fix regressions in libgomp testsuite: set flag_fat_lto_objects for offload

2014-11-25 Thread Richard Biener

On Mon, Nov 24, 2014 at 5:44 PM, Ilya Verbin iver...@gmail.com wrote:
 On 17 Nov 10:57, Richard Biener wrote:
 On Fri, Nov 14, 2014 at 6:08 PM, Ilya Verbin iver...@gmail.com wrote:
  On 14 Nov 09:01, H.J. Lu wrote:
  On Fri, Nov 14, 2014 at 8:51 AM, Ilya Verbin iver...@gmail.com wrote:
   On 14 Nov 08:46, H.J. Lu wrote:
   What happens when -flto is used on command line?  Will we
   generate both LTO IR and offload IR?
  
   Right.
  
   I'm not sure whether we should make slim objects in case of LTO + 
   offload IR...
  
 
  Isn't __gnu_lto_slim only applied to regular LTO IR? Should offload IR be
  handled separately from regular LTO IR? It is odd to use 
  flag_fat_lto_objects
  to control offload IR.
 
  It is handled separately, but it uses a common infrastructure with regular 
  LTO
  for streaming, therefore compile_file automatically emits __gnu_lto_slim 
  when
  there is at least one section with IR (flag_generate_lto is set).  You 
  propose
  to introduce a second flag like flag_fat_lto_objects to disable 
  __gnu_lto_slim?

 Err... why is offloading not guarded with a new symbol like 
 __gnu_lto_offload?

 Well, it's possible to guard offload IR with a new symbol, using a patch like
 this (it is not fully regtested).  But I don't like it...  Maybe we could just
 change the meaning of __gnu_lto_v1 from "object contains LTO IR" to "object
 contains any IR"?  In collect2 both LTO and offload cases are handled
 identically.  Are there other places where the symbol is used?

I don't think so (and even collect2.c should be changed to use simple-object
to identify LTO objects rather than ar...).  But I think libtool uses
it as well.

In the patch adding flag_generate_offload sounds like a good solution,
I didn't like emitting fat LTO objects unconditionally just because we offload.

Richard.
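
A simplified sketch of the marker-symbol check being discussed, written against a line of nm-style output. It is not the collect2 code: has_marker and is_ir_object_line are hypothetical helpers, the tokenizer is deliberately naive, and __gnu_offload_v1 is the symbol name proposed in the quoted patch rather than an established one. One extra leading underscore is accepted, mirroring the p[3] == '_' test in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// True if LINE contains MARKER, optionally with one extra leading
// underscore, as a whole whitespace-delimited token.
bool has_marker(const std::string &line, const std::string &marker) {
  std::size_t pos = 0;
  while (pos < line.size()) {
    std::size_t start = line.find_first_not_of(" \t", pos);
    if (start == std::string::npos) break;
    std::size_t end = line.find_first_of(" \t", start);
    if (end == std::string::npos) end = line.size();
    const std::string tok = line.substr(start, end - start);
    if (tok == marker || tok == "_" + marker) return true;
    pos = end;
  }
  return false;
}

// An object is treated as carrying IR if either marker symbol is present.
bool is_ir_object_line(const std::string &line) {
  return has_marker(line, "__gnu_lto_v1") ||
         has_marker(line, "__gnu_offload_v1");
}
```

With two distinct markers, a tool can tell "LTO IR present" apart from "offload IR present" instead of overloading one symbol for both.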

   -- Ilya


 diff --git a/gcc/ada/gcc-interface/decl.c b/gcc/ada/gcc-interface/decl.c
 index c133a22..f09d79d 100644
 --- a/gcc/ada/gcc-interface/decl.c
 +++ b/gcc/ada/gcc-interface/decl.c
 @@ -1490,7 +1490,8 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree 
 gnu_expr, int definition)
  && definition
  && debug_info_p
  && !optimize
 -&& !flag_generate_lto)
 +&& !flag_generate_lto
 +&& !flag_generate_offload)
   {
 tree param = create_param_decl (gnu_entity_name, gnu_type, false);
 gnat_pushdecl (param, gnat_entity);
 diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
 index 2fd99a7..fed1a3e 100644
 --- a/gcc/cgraphunit.c
 +++ b/gcc/cgraphunit.c
 @@ -2075,7 +2075,7 @@ ipa_passes (void)
  }

/* Some targets need to handle LTO assembler output specially.  */
 -  if (flag_generate_lto)
 +  if (flag_generate_lto || flag_generate_offload)
  targetm.asm_out.lto_start ();

if (!in_lto_p)
 @@ -2092,7 +2092,7 @@ ipa_passes (void)
 }
  }

 -  if (flag_generate_lto)
 +  if (flag_generate_lto || flag_generate_offload)
  targetm.asm_out.lto_end ();

if (!flag_ltrans && (in_lto_p || !flag_lto || flag_fat_lto_objects))
 @@ -2176,10 +2176,10 @@ symbol_table::compile (void)

/* Offloading requires LTO infrastructure.  */
if (!in_lto_p && g->have_offload)
 -flag_generate_lto = 1;
 +flag_generate_offload = 1;

/* If LTO is enabled, initialize the streamer hooks needed by GIMPLE.  */
 -  if (flag_generate_lto)
 +  if (flag_generate_lto || flag_generate_offload)
  lto_streamer_hooks_init ();

/* Don't run the IPA passes if there was any error or sorry messages.  */
 diff --git a/gcc/collect2.c b/gcc/collect2.c
 index 9c3a1c5..2dcebcd 100644
 --- a/gcc/collect2.c
 +++ b/gcc/collect2.c
 @@ -2392,12 +2392,16 @@ scan_prog_file (const char *prog_name, scanpass 
 which_pass,
if (found_lto)
  continue;

 -  /* Look for the LTO info marker symbol, and add filename to
 +  /* Look for the LTO or offload info marker symbol, and add 
 filename to
   the LTO objects list if found.  */
for (p = buf; (ch = *p) != '\0' && ch != '\n'; p++)
  if (ch == ' ' && p[1] == '_' && p[2] == '_'
 -&& (strncmp (p + (p[3] == '_' ? 2 : 1), "__gnu_lto_v1", 12) == 0)
 -&& ISSPACE (p[p[3] == '_' ? 14 : 13]))
 +&& (((strncmp (p + (p[3] == '_' ? 2 : 1),
 +  "__gnu_lto_v1", 12) == 0)
 + && ISSPACE (p[p[3] == '_' ? 14 : 13]))
 +   || ((strncmp (p + (p[3] == '_' ? 2 : 1),
 + "__gnu_offload_v1", 16) == 0)
 +&& ISSPACE (p[p[3] == '_' ? 18 : 17]))))
{
  add_lto_object (lto_objects, prog_name);

 diff --git a/gcc/common.opt b/gcc/common.opt
 index 41c8d4e..11a5500 100644
 --- a/gcc/common.opt
 +++ b/gcc/common.opt
 @@ -67,6 +67,10 @@ int *param_values
  Variable
  int flag_generate_lto

 +; Nonzero if we should write GIMPLE bytecode for offload compilation.
 +Variable
 +int

Re: [PATCH, PR63995, CHKP] Use single static bounds var for varpool nodes sharing asm name

2014-11-25 Thread Ilya Enkovich

2014-11-25 12:43 GMT+03:00 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 25, 2014 at 9:45 AM, Ilya Enkovich enkovich@gmail.com wrote:
 Hi,

 This patch partly fixes PR bootstrap/63995 by avoiding duplicating static 
 bounds vars.  With this fix bootstrap still fails at stage 2 and 3 
 comparison.

 Bootstrapped and checked on x86_64-unknown-linux-gnu.  OK for trunk?

 Thanks,
 Ilya
 --
 gcc/

 2014-11-25  Ilya Enkovich  ilya.enkov...@intel.com

 PR bootstrap/63995
 * tree-chkp (chkp_make_static_bounds): Share bounds var
 between nodes sharing assembler name.

 gcc/testsuite

 2014-11-25  Ilya Enkovich  ilya.enkov...@intel.com

 PR bootstrap/63995
 * g++.dg/dg.exp: Add mpx-dg.exp.
 * g++.dg/pr63995-1.C: New.


 diff --git a/gcc/testsuite/g++.dg/dg.exp b/gcc/testsuite/g++.dg/dg.exp
 index 14beae1..44eab0c 100644
 --- a/gcc/testsuite/g++.dg/dg.exp
 +++ b/gcc/testsuite/g++.dg/dg.exp
 @@ -18,6 +18,7 @@

  # Load support procs.
  load_lib g++-dg.exp
 +load_lib mpx-dg.exp

  # If a testcase doesn't have special options, use these.
  global DEFAULT_CXXFLAGS
 diff --git a/gcc/testsuite/g++.dg/pr63995-1.C 
 b/gcc/testsuite/g++.dg/pr63995-1.C
 new file mode 100644
 index 000..82e7606
 --- /dev/null
 +++ b/gcc/testsuite/g++.dg/pr63995-1.C
 @@ -0,0 +1,16 @@
 +/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
 +/* { dg-require-effective-target mpx } */
 +/* { dg-options "-O2 -g -fcheck-pointer-bounds -mmpx" } */
 +
 +int test1 (int i)
 +{
 +  extern const int arr[10];
 +  return arr[i];
 +}
 +
 +extern const int arr[10];
 +
 +int test2 (int i)
 +{
 +  return arr[i];
 +}
 diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
 index 3e38691..d425084 100644
 --- a/gcc/tree-chkp.c
 +++ b/gcc/tree-chkp.c
 @@ -2727,9 +2727,29 @@ chkp_make_static_bounds (tree obj)
/* First check if we already have required var.  */
if (chkp_static_var_bounds)
  {
 -  slot = chkp_static_var_bounds->get (obj);
 -  if (slot)
 -   return *slot;
 +  /* If there is a symbol sharing assembler name with obj,
 +we may use its bounds.  */
 +  if (TREE_CODE (obj) == VAR_DECL)
 +   {
 + varpool_node *node = varpool_node::get_create (obj);
 +
 + while (node->previous_sharing_asm_name)
 +   node = (varpool_node *)node->previous_sharing_asm_name;
 +
 + while (node)
 +   {
 + slot = chkp_static_var_bounds->get (node->decl);
 + if (slot)
 +   return *slot;
 + node = (varpool_node *)node->next_sharing_asm_name;
 +   }

 Hum.  varpool_node::get returns the ultimate alias target thus the
 walking shouldn't be necessary.  Just

   node = varpool_node::get_create (obj);
   slot = chkp_static_var_bounds->get (node->decl);
   if (slot)
     return *slot;

 and then making sure to set the decl also for node->decl.  I suppose
 it really asks for making chkp_static_var_bounds->get based on
 a varpool node and not a decl so you consistently use the ultimate
 alias target.

varpool_node::get just returns symtab_node::get, which returns
decl->decl_with_vis.symtab_node, so no alias chain is walked.  Also,
neither of the two varpool_nodes is an alias.  The only connection between
these nodes seems to be {next,previous}_sharing_asm_name.  Here is how
these nodes look:

(gdb) p *$2
$3 = {symtab_node = {type = SYMTAB_VARIABLE, resolution =
LDPR_UNKNOWN, definition = 0, alias = 0, weakref = 0,
cpp_implicit_alias = 0, analyzed = 0, writeonly = 0,
refuse_visibility_changes = 0, externally_visible = 0,
no_reorder = 0, force_output = 0, forced_by_abi = 0, unique_name =
0, implicit_section = 0, body_removed = 1, used_from_other_partition =
0, in_other_partition = 0, address_taken = 0, in_init_priority_hash =
0,
need_lto_streaming = 0, offloadable = 0, order = 3, decl =
0x77dd2cf0, next = 0x77f46200, previous = 0x77dd67a8,
next_sharing_asm_name = 0x0, previous_sharing_asm_name =
0x77f46200, same_comdat_group = 0x0,
ref_list = {references = 0x0, referring = {m_vec = 0x23856b0}},
alias_target = 0x0, lto_file_data = 0x0, aux = 0x0, x_comdat_group =
0x0, x_section = 0x0}, output = 0, need_bounds_init = 0,
dynamically_initialized = 0,
  tls_model = TLS_MODEL_NONE, used_by_single_function = 0}

(gdb) p *$5
$6 = {symtab_node = {type = SYMTAB_VARIABLE, resolution =
LDPR_UNKNOWN, definition = 0, alias = 0, weakref = 0,
cpp_implicit_alias = 0, analyzed = 0, writeonly = 0,
refuse_visibility_changes = 0, externally_visible = 0,
no_reorder = 0, force_output = 0, forced_by_abi = 0, unique_name =
0, implicit_section = 0, body_removed = 1, used_from_other_partition =
0, in_other_partition = 0, address_taken = 0, in_init_priority_hash =
0,
need_lto_streaming = 0, offloadable = 0, order = 2, decl =
0x77dd2bd0, next = 0x77dd6620, previous = 0x77f46300,
next_sharing_asm_name = 0x77f46300, previous_sharing_asm_name =
0x0, same_comdat_group = 0x0,
ref_list = {references =

Re: [Patch] Improving jump-thread pass for PR 54742

2014-11-25 Thread Markus Trippelsdorf

On 2014.11.24 at 22:05 +, Sebastian Pop wrote:
 I got my scripts installed on the gcc-farm.  I first used an x86_64 gcc75 and
 valgrind was crashing not recognizing how to decode an instruction.  Then I
 moved to gcc112 a powerpc64-linux where I got this data from stage2 cc1plus
 compiling the same file alias.ii at -O2: (I got 3 runs of each mostly because
 there is a bit of noise in all these numbers)
 
 $ valgrind --tool=cachegrind --cache-sim=no --branch-sim=yes ./cc1plus -O2 
 ~/alias.ii

BTW perf is also available on gcc112:

trippels@gcc2-power8 ~ % perf list

List of pre-defined events (to be used in -e):
  cpu-cycles OR cycles   [Hardware event]
  instructions   [Hardware event]
  cache-references   [Hardware event]
  cache-misses   [Hardware event]
  branch-instructions OR branches[Hardware event]
  branch-misses  [Hardware event]
  stalled-cycles-frontend OR idle-cycles-frontend[Hardware event]
  stalled-cycles-backend OR idle-cycles-backend  [Hardware event]

  cpu-clock  [Software event]
  task-clock [Software event]
  page-faults OR faults  [Software event]
  context-switches OR cs [Software event]
  cpu-migrations OR migrations   [Software event]
  minor-faults   [Software event]
  major-faults   [Software event]
  alignment-faults   [Software event]
  emulation-faults   [Software event]
  dummy  [Software event]

  L1-dcache-loads[Hardware cache event]
  L1-dcache-load-misses  [Hardware cache event]
  L1-dcache-store-misses [Hardware cache event]
  L1-dcache-prefetches   [Hardware cache event]
  L1-icache-loads[Hardware cache event]
  L1-icache-load-misses  [Hardware cache event]
  L1-icache-prefetches   [Hardware cache event]
  LLC-loads  [Hardware cache event]
  LLC-load-misses[Hardware cache event]
  LLC-stores [Hardware cache event]
  LLC-store-misses   [Hardware cache event]
  LLC-prefetches [Hardware cache event]
  dTLB-load-misses   [Hardware cache event]
  iTLB-load-misses   [Hardware cache event]
  branch-loads   [Hardware cache event]
  branch-load-misses [Hardware cache event]

  rNNN   [Raw hardware event 
descriptor]
  cpu/t1=v1[,t2=v2,t3 ...]/modifier  [Raw hardware event 
descriptor]
   (see 'man perf-list' on how to encode it)

  mem:addr[:access][Hardware breakpoint]

-- 
Markus

Re: [Patch, libstdc++/63497] Avoid dereferencing invalid iterator in regex_executor

2014-11-25 Thread Jonathan Wakely


On 25/11/14 00:41 -0800, Tim Shen wrote:

On Wed, Oct 22, 2014 at 8:19 PM, Tim Shen tims...@google.com wrote:

Committed. Thank you too!


I'm backporting this patch to gcc-4_9-branch. Do we usually boot &
test it and then commit directly, or should it be reviewed again?


I approved it for the branch (in the bugzilla comments) so usually you
could just test it and commit it ... but since you asked ... maybe you
should leave the _M_word_boundary signature unchanged for the branch,
since the unused parameter doesn't do any harm and removing it isn't
needed for the fix to work.

Re: [PATCH, ciklplus]: Use -ffloat-store for 32bit x86 in cilk-plus/AN/builtin_fn_{custom,mutating}.c

2014-11-25 Thread Uros Bizjak

On Tue, Nov 25, 2014 at 10:38 AM, Uros Bizjak ubiz...@gmail.com wrote:

 These two tests fix PR target/63847 [1], where x87 excess precision
 causes testcase to fail. The problem was triggered by -fpic, please
 see the PR for analysis.

 The patch adds -ffloat-store for 32bit x86 target, a standard and well
 tested solution for this problem.

 2014-11-22  Uros Bizjak  ubiz...@gmail.com

  PR target/63847
  * c-c++-common/cilk-plus/AN/builtin_fn_custom.c: Add -ffloat-store
  for 32bit x86 targets.
  * c-c++-common/cilk-plus/AN/builtin_fn_mutating.c: Ditto.

 OK.

 Don't we have -fexcess-precision=standard for this now?

 Oh ... indeed. I will update the patch to enable it for all x86 targets.

cc1plus: sorry, unimplemented: -fexcess-precision=standard for C++

Uros.
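
Since -fexcess-precision=standard is rejected for C++ here, a per-expression workaround is sometimes used instead of -ffloat-store. The following is a hedged sketch of that idea, not what the testcases do: round_to_double, sum_rounded and excess_precision_possible are hypothetical names, and forcing a store through a volatile object is assumed to make the compiler round the value to double at that point, as it does in practice on x87 targets.

```cpp
#include <cassert>
#include <cfloat>

// Reports whether intermediate results may carry more precision than their
// type.  FLT_EVAL_METHOD is 0 when operations are evaluated in the type's
// own precision; other values (for example 2 on x87) indicate otherwise.
bool excess_precision_possible() {
  return FLT_EVAL_METHOD != 0;
}

// Round X to double precision by forcing it through memory.
double round_to_double(double x) {
  volatile double stored = x;
  return stored;
}

// Accumulate with every partial sum rounded to double, so the result does
// not depend on whether the accumulator lives in an x87 register.
double sum_rounded(const double *values, int count) {
  double acc = 0.0;
  for (int i = 0; i < count; ++i)
    acc = round_to_double(acc + values[i]);
  return acc;
}
```

Compared with a global flag, this confines the cost of the extra stores to the computations whose results are compared exactly.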

RE: [PATCH] Fix PR ipa/61190, updated

2014-11-25 Thread Bernd Edlinger


Hi Honza,

On Mon, 24 Nov 2014 16:57:42 +0100, Jan Hubicka wrote:

 +cgraph_node::call_for_symbol_thunks_and_aliases_1 (bool (*callback)
 + (cgraph_node *, void *),
 + void *data,
 + bool include_overwritable,
 + bool exclude_virtual_thunks)

 Instead of adding a _1 variant to the public API, please just add an
 implicit argument
 bool exclude_virtual_thunks=false into
 +cgraph_node::call_for_symbol_thunks_and_aliases

Ok, done.

 Index: gcc/ipa-pure-const.c
 ===
 --- gcc/ipa-pure-const.c (revision 215888)
 +++ gcc/ipa-pure-const.c (working copy)
 @@ -744,6 +744,8 @@ analyze_function (struct cgraph_node *fn, bool ipa
 {
 /* Thunk gets propagated through, so nothing interesting happens. */
 gcc_assert (ipa);
 + if (fn->thunk.virtual_offset_p)
 + l->pure_const_state = IPA_NEITHER;
 return l;
 }


Hmm, I looked again at the above if statement, and I think now it should
better be if (fn->thunk.thunk_p && fn->thunk.virtual_offset_p), because
thunk.virtual_offset_p is probably not well defined if we come here because
of fn->alias == true.

 This makes the lattice be initialized correctly, but you also need the
 function_symbol calls, which skip thunks, to be replaced by
 something like function_or_non_virtual_thunk_symbol.


Oh, I see what you mean, thanks.

I created a new method function_or_virtual_thunk_symbol() for this.
And simplified the algorithm of both function_symbol variants a bit.
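
A toy sketch of the difference between the two lookups, as I understand it from this thread; the real cgraph_node methods may differ in detail. Fn, skip_to_function and skip_to_function_or_virtual_thunk are hypothetical names: the first helper skips aliases and every thunk, the second stops at the first virtual thunk so that pure/const state is not propagated through it.

```cpp
#include <cassert>

// Toy stand-in for a call graph node.
struct Fn {
  bool alias = false;           // node is an alias of `target`
  bool thunk = false;           // node is a thunk calling `target`
  bool virtual_offset = false;  // thunk adjusts `this` via a virtual offset
  Fn *target = nullptr;         // alias target or thunk callee
};

// Skip aliases and all thunks until the real function is reached.
Fn *skip_to_function(Fn *node) {
  while (node->target && (node->alias || node->thunk))
    node = node->target;
  return node;
}

// Skip aliases and non-virtual thunks, but stop at the first virtual thunk.
Fn *skip_to_function_or_virtual_thunk(Fn *node) {
  while (node->target &&
         (node->alias || (node->thunk && !node->virtual_offset)))
    node = node->target;
  return node;
}
```

A virtual thunk loads from the vtable to compute its adjustment, which is why it cannot simply inherit a const or pure state from the function behind it.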

Attached, you'll find my updated patch for review.

Boot-strapped and regression tested on x86_64-linux-gnu.
OK for trunk?


Thanks
Bernd.

 Can you, please, send the updated patch?
 Sorry for late review,
 Honza


  2014-11-25  Bernd Edlinger  bernd.edlin...@hotmail.de

PR ipa/61190
* cgraph.h (symtab_node::call_for_symbol_and_aliases): Fix comment.
(cgraph_node::function_or_virtual_thunk_symbol): New function.
(cgraph_node::call_for_symbol_and_aliases): Fix comment.
(cgraph_node::call_for_symbol_thunks_and_aliases): Adjust comment.
Add new optional parameter exclude_virtual_thunks.
* cgraph.c (cgraph_node::call_for_symbol_thunks_and_aliases): Add new
optional parameter exclude_virtual_thunks.
(cgraph_node::set_const_flag): Don't propagate to virtual thunks.
(cgraph_node::set_pure_flag): Likewise.
(cgraph_node::function_symbol): Simplified.
(cgraph_node::function_or_virtual_thunk_symbol): New function.
* ipa-pure-const.c (analyze_function): For virtual thunks set
pure_const_state to IPA_NEITHER.
(propagate_pure_const): Use function_or_virtual_thunk_symbol.

testsuite/ChangeLog:
2014-11-25  Bernd Edlinger  bernd.edlin...@hotmail.de

PR ipa/61190
* g++.old-deja/g++.mike/p4736b.C: Use -O2.



patch-pr61190.diff
Description: Binary data

Re: [PATCH, PR63995, CHKP] Use single static bounds var for varpool nodes sharing asm name

2014-11-25 Thread Richard Biener

On Tue, Nov 25, 2014 at 11:19 AM, Ilya Enkovich enkovich@gmail.com wrote:
 2014-11-25 12:43 GMT+03:00 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 25, 2014 at 9:45 AM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 Hi,

 This patch partly fixes PR bootstrap/63995 by avoiding duplicating static 
 bounds vars.  With this fix bootstrap still fails at stage 2 and 3 
 comparison.

 Bootstrapped and checked on x86_64-unknown-linux-gnu.  OK for trunk?

 Thanks,
 Ilya
 --
 gcc/

 2014-11-25  Ilya Enkovich  ilya.enkov...@intel.com

 PR bootstrap/63995
 * tree-chkp (chkp_make_static_bounds): Share bounds var
 between nodes sharing assembler name.

 gcc/testsuite

 2014-11-25  Ilya Enkovich  ilya.enkov...@intel.com

 PR bootstrap/63995
 * g++.dg/dg.exp: Add mpx-dg.exp.
 * g++.dg/pr63995-1.C: New.


 diff --git a/gcc/testsuite/g++.dg/dg.exp b/gcc/testsuite/g++.dg/dg.exp
 index 14beae1..44eab0c 100644
 --- a/gcc/testsuite/g++.dg/dg.exp
 +++ b/gcc/testsuite/g++.dg/dg.exp
 @@ -18,6 +18,7 @@

  # Load support procs.
  load_lib g++-dg.exp
 +load_lib mpx-dg.exp

  # If a testcase doesn't have special options, use these.
  global DEFAULT_CXXFLAGS
 diff --git a/gcc/testsuite/g++.dg/pr63995-1.C 
 b/gcc/testsuite/g++.dg/pr63995-1.C
 new file mode 100644
 index 000..82e7606
 --- /dev/null
 +++ b/gcc/testsuite/g++.dg/pr63995-1.C
 @@ -0,0 +1,16 @@
 +/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
 +/* { dg-require-effective-target mpx } */
 +/* { dg-options "-O2 -g -fcheck-pointer-bounds -mmpx" } */
 +
 +int test1 (int i)
 +{
 +  extern const int arr[10];
 +  return arr[i];
 +}
 +
 +extern const int arr[10];
 +
 +int test2 (int i)
 +{
 +  return arr[i];
 +}
 diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
 index 3e38691..d425084 100644
 --- a/gcc/tree-chkp.c
 +++ b/gcc/tree-chkp.c
 @@ -2727,9 +2727,29 @@ chkp_make_static_bounds (tree obj)
/* First check if we already have required var.  */
if (chkp_static_var_bounds)
  {
 -  slot = chkp_static_var_bounds->get (obj);
 -  if (slot)
 -   return *slot;
 +  /* If there is a symbol sharing assembler name with obj,
 +we may use its bounds.  */
 +  if (TREE_CODE (obj) == VAR_DECL)
 +   {
 + varpool_node *node = varpool_node::get_create (obj);
 +
 + while (node->previous_sharing_asm_name)
 +   node = (varpool_node *)node->previous_sharing_asm_name;
 +
 + while (node)
 +   {
 + slot = chkp_static_var_bounds->get (node->decl);
 + if (slot)
 +   return *slot;
 + node = (varpool_node *)node->next_sharing_asm_name;
 +   }

 Hum.  varpool_node::get returns the ultimate alias target thus the
 walking shouldn't be necessary.  Just

   node = varpool_node::get_create (obj);
   slot = chkp_static_var_bounds->get (node->decl);
   if (slot)
     return *slot;

 and then making sure to set the decl also for node->decl.  I suppose
 it really asks for making chkp_static_var_bounds->get based on
 a varpool node and not a decl so you consistently use the ultimate
 alias target.

 varpool_node::get just returns symtab_node::get, which returns
 decl->decl_with_vis.symtab_node, so no alias walk happens.  Also,
 neither of the two varpool_nodes is an alias.  The only connection between
 these nodes seems to be {next,previous}_sharing_asm_name.  Here is how
 these nodes look:

Ok, then it's get_for_asmname ().  That said - the above loops look
bogus to me.  Honza - any better ideas?

Richard.
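
The lookup under discussion can be modeled outside GCC roughly as follows. This is only a sketch under stated assumptions: `Node`, `find_shared_bounds` and the `std::map` cache are hypothetical stand-ins invented for illustration, not GCC's `varpool_node` or `chkp_static_var_bounds`. It rewinds to the head of the sharing chain and returns the first cached entry found on it.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a symbol-table node; not GCC's varpool_node.
struct Node {
    std::string decl;                           // stands in for node->decl
    Node *next_sharing_asm_name = nullptr;      // next node with the same asm name
    Node *previous_sharing_asm_name = nullptr;  // previous node with the same asm name
};

// Return a pointer to the cached value of any node on NODE's sharing chain,
// or nullptr if no node on the chain has an entry in CACHE.
const int *find_shared_bounds(Node *node, const std::map<std::string, int> &cache) {
    if (!node)
        return nullptr;
    // Rewind to the first node of the chain.
    while (node->previous_sharing_asm_name)
        node = node->previous_sharing_asm_name;
    // Walk forward and return the first cached entry.
    for (; node; node = node->next_sharing_asm_name) {
        auto it = cache.find(node->decl);
        if (it != cache.end())
            return &it->second;
    }
    return nullptr;
}
```

With two nodes linked through the sharing pointers, a value cached for either one is found when the lookup starts from the other.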

 (gdb) p *$2
 $3 = {symtab_node = {type = SYMTAB_VARIABLE, resolution =
 LDPR_UNKNOWN, definition = 0, alias = 0, weakref = 0,
 cpp_implicit_alias = 0, analyzed = 0, writeonly = 0,
 refuse_visibility_changes = 0, externally_visible = 0,
 no_reorder = 0, force_output = 0, forced_by_abi = 0, unique_name =
 0, implicit_section = 0, body_removed = 1, used_from_other_partition =
 0, in_other_partition = 0, address_taken = 0, in_init_priority_hash =
 0,
 need_lto_streaming = 0, offloadable = 0, order = 3, decl =
 0x77dd2cf0, next = 0x77f46200, previous = 0x77dd67a8,
 next_sharing_asm_name = 0x0, previous_sharing_asm_name =
 0x77f46200, same_comdat_group = 0x0,
 ref_list = {references = 0x0, referring = {m_vec = 0x23856b0}},
 alias_target = 0x0, lto_file_data = 0x0, aux = 0x0, x_comdat_group =
 0x0, x_section = 0x0}, output = 0, need_bounds_init = 0,
 dynamically_initialized = 0,
   tls_model = TLS_MODEL_NONE, used_by_single_function = 0}

 (gdb) p *$5
 $6 = {symtab_node = {type = SYMTAB_VARIABLE, resolution =
 LDPR_UNKNOWN, definition = 0, alias = 0, weakref = 0,
 cpp_implicit_alias = 0, analyzed = 0, writeonly = 0,
 refuse_visibility_changes = 0, externally_visible = 0,
 no_reorder = 0, force_output = 0, forced_by_abi = 0, unique_name =
 0, implicit_section = 0, body_removed = 1, used_from_other_partition =
 0, in_other_partition = 0, address_taken = 0, in_init_priority_hash =
 0,
 need_lto_streaming =

Re: [PATCH, 1/8] Expand oacc kernels after pass_build_ealias

2014-11-25 Thread Tom de Vries


On 24-11-14 11:56, Tom de Vries wrote:

On 15-11-14 18:19, Tom de Vries wrote:

On 15-11-14 13:14, Tom de Vries wrote:

Hi,

I'm submitting a patch series with initial support for the oacc kernels
directive.

The patch series uses pass_parallelize_loops to implement parallelization of
loops in the oacc kernels region.

The patch series consists of these 8 patches:
...
 1  Expand oacc kernels after pass_build_ealias
 2  Add pass_oacc_kernels
 3  Add pass_ch_oacc_kernels to pass_oacc_kernels
 4  Add pass_tree_loop_{init,done} to pass_oacc_kernels
 5  Add pass_loop_im to pass_oacc_kernels
 6  Add pass_ccp to pass_oacc_kernels
 7  Add pass_parloops_oacc_kernels to pass_oacc_kernels
 8  Do simple omp lowering for no address taken var
...


This patch moves omp expansion of the oacc kernels directive to after
pass_build_ealias.

The rationale is that in order to use pass_parallelize_loops for analysis and
transformation of an oacc kernels region, we postpone omp expansion of that
region until the earliest point in the pass list where enough information is
available to run pass_parallelize_loops, in other words, after pass_build_ealias.

The patch postpones expansion in expand_omp, and ensures expansion by adding
pass_expand_omp_ssa:
- after pass_build_ealias, and
- after pass_all_early_optimizations for the case we're not optimizing.

In order to make sure the oacc kernels region arrives at pass_expand_omp_ssa
in the same state in which it left expand_omp, the patch makes pass_ccp and
pass_forwprop aware of lowered omp code, to handle it conservatively.

The patch contains changes in expand_omp_target to deal with ssa-code, similar
to what is already present in expand_omp_taskreg.

Furthermore, the patch forces the .omp_data_sizes and .omp_data_kinds to not be
static for oacc kernels. It does this to get some references to .omp_data_sizes
and .omp_data_kinds in the ssa code.  Without these references, the definitions
will be removed. The reference of the variables in GIMPLE_OACC_KERNELS is not
enough to have them not removed. [ In vries/oacc-kernels, I used a BUILT_IN_USE
kludge for this purpose ].

Finally, at the end of pass_expand_omp_ssa we're left with SSA_NAMEs in the
original function of which the definition has been removed (as in moved to the
split off function). TODO_remove_unused_locals takes care of some of them, but
not the anonymous ones. So the patch iterates over all SSA_NAMEs to find these
dangling SSA_NAMEs and releases them.



Reposting with small update: I've replaced the use of the rather generic
gimple_stmt_omp_lowering_p with the more specific gimple_stmt_omp_data_i_init_p.

Bootstrapped and reg-tested in the same way as before.



I've moved pass_expand_omp_ssa one down in the pass list, past pass_fre.

This allows fre to unify references to the same omp variable before entering 
pass_oacc_kernels, which helps pass_lim in pass_oacc_kernels.


For instance, this reduction fragment:
...
  # VUSE <.MEM_8>
  # PT = { D.2282 }
  _67 = .omp_data_i_59->sumD.2270;
  # VUSE <.MEM_8>
  _68 = *_67;

  _70 = _66 + _68;

  # VUSE <.MEM_8>
  # PT = { D.2282 }
  _69 = .omp_data_i_59->sumD.2270;
  # .MEM_71 = VDEF <.MEM_8>
  *_69 = _70;
...

is transformed by fre into:
...
  # VUSE <.MEM_8>
  # PT = { D.2282 }
  _67 = .omp_data_i_59->sumD.2270;
  # VUSE <.MEM_8>
  _68 = *_67;

  _70 = _66 + _68;

  # .MEM_71 = VDEF <.MEM_8>
  *_67 = _70;
...
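
At source level, the effect on the fragment above corresponds roughly to the following sketch. The names `OmpData`, `add_before` and `add_after` are made up for illustration; only the shape of the loads and the store mirrors the GIMPLE dump.

```cpp
#include <cassert>

// Hypothetical stand-in for the record that .omp_data_i points to.
struct OmpData {
    int *sum;
};

// Before: the pointer field is loaded twice, like _67 and _69.
void add_before(OmpData *data, int value) {
    int *p1 = data->sum;        // _67
    int old = *p1;              // _68
    int updated = value + old;  // _70
    int *p2 = data->sum;        // _69: same load again, no store in between
    *p2 = updated;
}

// After: the second load is replaced by the result of the first one.
void add_after(OmpData *data, int value) {
    int *p = data->sum;         // _67
    int old = *p;               // _68
    *p = value + old;           // store through _67
}
```

Both functions leave the same value in `*data->sum`; the second one simply has one memory reference fewer for later passes to reason about.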

In order for pass_fre to respect the kernels region boundaries, I've added a 
change in tree-ssa-sccvn.c:visit_use to handle the .omp_data_i init conservatively.


Bootstrapped and reg-tested as before.

OK for trunk?

Thanks,
- Tom

[PATCH 1/7] Expand oacc kernels after pass_fre

2014-11-25  Tom de Vries  t...@codesourcery.com

	* function.h (struct function): Add contains_oacc_kernels field.
	* gimplify.c (gimplify_omp_workshare): Set contains_oacc_kernels.
	* omp-low.c: Include gimple-pretty-print.h.
	(release_first_vuse_in_edge_dest): New function.
	(expand_omp_target): Handle ssa-code.
	(expand_omp): Don't expand GIMPLE_OACC_KERNELS when not in ssa.
	(pass_data_expand_omp): Don't set PROP_gimple_eomp unconditionally in
	properties_provided field.
	(pass_expand_omp::execute): Set PROP_gimple_eomp in
	cfun->curr_properties only if cfun does not contain oacc kernels.
	(pass_data_expand_omp_ssa): Add TODO_remove_unused_locals to
	todo_flags_finish field.
	(pass_expand_omp_ssa::execute): Release dangling SSA_NAMEs after calling
	execute_expand_omp.
	(lower_omp_target): Add static_arrays variable, init to 1.  Don't use
	static arrays for kernels directive.  Use static_arrays variable.
	Handle case that .omp_data_kinds is not static.
	(gimple_stmt_ssa_operand_references_var_p)
	(gimple_stmt_omp_data_i_init_p): New function.
	* omp-low.h (gimple_stmt_omp_data_i_init_p): Declare.
	* passes.def: Add pass_expand_omp_ssa after pass_fre.  Add
	pass_expand_omp_ssa after pass_all_early_optimizations.
	* tree-ssa-ccp.c: Include omp-low.h.
	(surely_varying_stmt_p, ccp_visit_stmt): Handle

Re: [PATCH, 2/8] Add pass_oacc_kernels

2014-11-25 Thread Tom de Vries


On 15-11-14 18:20, Tom de Vries wrote:

On 15-11-14 13:14, Tom de Vries wrote:

Hi,

I'm submitting a patch series with initial support for the oacc kernels
directive.

The patch series uses pass_parallelize_loops to implement parallelization of
loops in the oacc kernels region.

The patch series consists of these 8 patches:
...
 1  Expand oacc kernels after pass_build_ealias
 2  Add pass_oacc_kernels
 3  Add pass_ch_oacc_kernels to pass_oacc_kernels
 4  Add pass_tree_loop_{init,done} to pass_oacc_kernels
 5  Add pass_loop_im to pass_oacc_kernels
 6  Add pass_ccp to pass_oacc_kernels
 7  Add pass_parloops_oacc_kernels to pass_oacc_kernels
 8  Do simple omp lowering for no address taken var
...


This patch adds a pass group pass_oacc_kernels.

The rationale is that we want a pass group to run oacc kernels region related
(optimization) passes in.
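
As a rough model of what a gated pass group means, consider the toy sketch below. It is not GCC's pass manager: `Function`, `PassGroup` and `run_group` are hypothetical names, and passes are represented only by their names. The point is that the sub-passes run only when the group's gate accepts the function.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-function state consulted by the gate.
struct Function {
    bool contains_oacc_kernels;
};

// A toy pass group: a gate plus an ordered list of sub-pass names.
struct PassGroup {
    std::function<bool(const Function &)> gate;
    std::vector<std::string> sub_passes;
};

// Return the names of the sub-passes that would run for FN.
// If the gate rejects FN, the whole group is skipped.
std::vector<std::string> run_group(const PassGroup &group, const Function &fn) {
    std::vector<std::string> executed;
    if (!group.gate(fn))
        return executed;
    for (const std::string &name : group.sub_passes)
        executed.push_back(name);
    return executed;
}
```

A function without a kernels region skips every sub-pass at the cost of one gate check.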



Updated for moving pass_oacc_kernels down past pass_fre in the pass list.

Bootstrapped and reg-tested as before.

OK for trunk?

Thanks,
  - Tom

[PATCH 2/7] Add pass_oacc_kernels

2014-11-25  Tom de Vries  t...@codesourcery.com

	* passes.def: Add pass group pass_oacc_kernels.
	* tree-pass.h (make_pass_oacc_kernels): Declare.
	* tree-ssa-loop.c (gate_oacc_kernels): New static function.
	(pass_data_oacc_kernels): New pass_data.
	(class pass_oacc_kernels): New pass.
	(make_pass_oacc_kernels): New function.
---
 gcc/passes.def  |  7 ++-
 gcc/tree-pass.h |  1 +
 gcc/tree-ssa-loop.c | 48 
 3 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/gcc/passes.def b/gcc/passes.def
index bf1cd34..efb3d8c 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -86,7 +86,12 @@ along with GCC; see the file COPYING3.  If not see
 	 execute TODO_rebuild_alias at this point.  */
 	  NEXT_PASS (pass_build_ealias);
 	  NEXT_PASS (pass_fre);
-	  NEXT_PASS (pass_expand_omp_ssa);
+	  /* Pass group that runs when there are oacc kernels in the
+	 function.  */
+	  NEXT_PASS (pass_oacc_kernels);
+	  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
+	  NEXT_PASS (pass_expand_omp_ssa);
+	  POP_INSERT_PASSES ()
 	  NEXT_PASS (pass_merge_phi);
 	  NEXT_PASS (pass_cd_dce);
 	  NEXT_PASS (pass_early_ipa_sra);
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 75f8aa5..d63ab2b 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -449,6 +449,7 @@ extern gimple_opt_pass *make_pass_strength_reduction (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vtable_verify (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_ubsan (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_sanopt (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_oacc_kernels (gcc::context *ctxt);
 
 /* IPA Passes */
 extern simple_ipa_opt_pass *make_pass_ipa_lower_emutls (gcc::context *ctxt);
diff --git a/gcc/tree-ssa-loop.c b/gcc/tree-ssa-loop.c
index 758b5fc..c29aa22 100644
--- a/gcc/tree-ssa-loop.c
+++ b/gcc/tree-ssa-loop.c
@@ -157,6 +157,54 @@ make_pass_tree_loop (gcc::context *ctxt)
   return new pass_tree_loop (ctxt);
 }
 
+/* Gate for oacc kernels pass group.  */
+
+static bool
+gate_oacc_kernels (function *fn)
+{
+  if (!flag_openacc)
+    return false;
+
+  return fn->contains_oacc_kernels;
+}
+
+/* The oacc kernels superpass.  */
+
+namespace {
+
+const pass_data pass_data_oacc_kernels =
+{
+  GIMPLE_PASS, /* type */
+  "oacc_kernels", /* name */
+  OPTGROUP_LOOP, /* optinfo_flags */
+  TV_TREE_LOOP, /* tv_id */
+  PROP_cfg, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  0, /* todo_flags_finish */
+};
+
+class pass_oacc_kernels : public gimple_opt_pass
+{
+public:
+  pass_oacc_kernels (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_oacc_kernels, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  virtual bool gate (function *fn) { return gate_oacc_kernels (fn); }
+
+}; // class pass_oacc_kernels
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_oacc_kernels (gcc::context *ctxt)
+{
+  return new pass_oacc_kernels (ctxt);
+}
+
 /* The no-loop superpass.  */
 
 namespace {
-- 
1.9.1

Re: [PATCH, 3/8] Add pass_ch_oacc_kernels to pass_oacc_kernels

2014-11-25 Thread Tom de Vries


On 15-11-14 18:21, Tom de Vries wrote:

On 15-11-14 13:14, Tom de Vries wrote:

Hi,

I'm submitting a patch series with initial support for the oacc kernels
directive.

The patch series uses pass_parallelize_loops to implement parallelization of
loops in the oacc kernels region.

The patch series consists of these 8 patches:
...
 1  Expand oacc kernels after pass_build_ealias
 2  Add pass_oacc_kernels
 3  Add pass_ch_oacc_kernels to pass_oacc_kernels
 4  Add pass_tree_loop_{init,done} to pass_oacc_kernels
 5  Add pass_loop_im to pass_oacc_kernels
 6  Add pass_ccp to pass_oacc_kernels
 7  Add pass_parloops_oacc_kernels to pass_oacc_kernels
 8  Do simple omp lowering for no address taken var
...


This patch adds a pass_ch_oacc_kernels to the pass group pass_oacc_kernels.

The idea is that pass_parallelize_loops only deals with loops for which the
header has been copied, so the easiest way to meet that requirement when running
pass_parallelize_loops in group pass_oacc_kernels is to run pass_ch as a part
of pass_oacc_kernels.

We define a separate pass pass_ch_oacc_kernels, to leave all loops that aren't
part of a kernels region alone.
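
For readers unfamiliar with loop header copying, the transformation can be pictured at source level as below. This is only an illustrative sketch with made-up function names, not output of pass_ch: the loop test is duplicated in front of the loop, so the loop itself becomes a do-while whose body runs at least once whenever it is entered.

```cpp
#include <cassert>

// Original form: the exit test sits in the loop header.
int sum_while(const int *a, int n) {
    int sum = 0;
    int i = 0;
    while (i < n) {
        sum += a[i];
        ++i;
    }
    return sum;
}

// After header copying: a copy of the test guards the loop,
// and the loop tests its condition at the bottom.
int sum_header_copied(const int *a, int n) {
    int sum = 0;
    int i = 0;
    if (i < n) {
        do {
            sum += a[i];
            ++i;
        } while (i < n);
    }
    return sum;
}
```

Both versions compute the same result, including for `n == 0`, where the guard skips the loop entirely.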



Updated for moving pass_oacc_kernels down past pass_fre in the pass list.

Bootstrapped and reg-tested as before.

OK for trunk?

Thanks,
  - Tom
[PATCH 3/7] Add pass_ch_oacc_kernels to pass_oacc_kernels

2014-11-25  Tom de Vries  t...@codesourcery.com

	* omp-low.c (loop_in_oacc_kernels_region_p): New function.
	* omp-low.h (loop_in_oacc_kernels_region_p): Declare.
	* passes.def: Add pass_ch_oacc_kernels to pass group pass_oacc_kernels.
	* tree-pass.h (make_pass_ch_oacc_kernels): Declare.
	* tree-ssa-loop-ch.c: Include omp-low.h.
	(pass_ch_execute): Declare.
	(pass_ch::execute): Factor out ...
	(pass_ch_execute): ... this new function.  If handling oacc kernels,
	skip loops that are not in oacc kernels region.
	(pass_ch_oacc_kernels::execute):
	(pass_data_ch_oacc_kernels): New pass_data.
	(class pass_ch_oacc_kernels): New pass.
	(pass_ch_oacc_kernels::execute, make_pass_ch_oacc_kernels): New
	function.
---
 gcc/omp-low.c  | 83 ++
 gcc/omp-low.h  |  2 ++
 gcc/passes.def |  1 +
 gcc/tree-pass.h|  1 +
 gcc/tree-ssa-loop-ch.c | 59 +--
 5 files changed, 144 insertions(+), 2 deletions(-)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 3ac546c..543dd48 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -13912,4 +13912,87 @@ gimple_stmt_omp_data_i_init_p (gimple stmt)
 		   SSA_OP_DEF);
 }
 
+/* Return true if LOOP is inside a kernels region.  */
+
+bool
+loop_in_oacc_kernels_region_p (struct loop *loop, basic_block *region_entry,
+			       basic_block *region_exit)
+{
+  bitmap excludes_bitmap = BITMAP_GGC_ALLOC ();
+  bitmap region_bitmap = BITMAP_GGC_ALLOC ();
+  bitmap_clear (region_bitmap);
+
+  if (region_entry != NULL)
+    *region_entry = NULL;
+  if (region_exit != NULL)
+    *region_exit = NULL;
+
+  basic_block bb;
+  gimple last;
+  FOR_EACH_BB_FN (bb, cfun)
+    {
+      if (bitmap_bit_p (region_bitmap, bb->index))
+	continue;
+
+      last = last_stmt (bb);
+      if (!last)
+	continue;
+
+      if (gimple_code (last) != GIMPLE_OACC_KERNELS)
+	continue;
+
+      bitmap_clear (excludes_bitmap);
+      bitmap_set_bit (excludes_bitmap, bb->index);
+
+      vec<basic_block> dominated
+	= get_all_dominated_blocks (CDI_DOMINATORS, bb);
+
+      unsigned di;
+      basic_block dom;
+
+      basic_block end_region = NULL;
+      FOR_EACH_VEC_ELT (dominated, di, dom)
+	{
+	  if (dom == bb)
+	    continue;
+
+	  last = last_stmt (dom);
+	  if (!last)
+	    continue;
+
+	  if (gimple_code (last) != GIMPLE_OMP_RETURN)
+	    continue;
+
+	  if (end_region == NULL
+	      || dominated_by_p (CDI_DOMINATORS, end_region, dom))
+	    end_region = dom;
+	}
+
+      vec<basic_block> excludes
+	= get_all_dominated_blocks (CDI_DOMINATORS, end_region);
+
+      unsigned di2;
+      basic_block exclude;
+
+      FOR_EACH_VEC_ELT (excludes, di2, exclude)
+	if (exclude != end_region)
+	  bitmap_set_bit (excludes_bitmap, exclude->index);
+
+      FOR_EACH_VEC_ELT (dominated, di, dom)
+	if (!bitmap_bit_p (excludes_bitmap, dom->index))
+	  bitmap_set_bit (region_bitmap, dom->index);
+
+      if (bitmap_bit_p (region_bitmap, loop->header->index))
+	{
+	  if (region_entry != NULL)
+	    *region_entry = bb;
+	  if (region_exit != NULL)
+	    *region_exit = end_region;
+	  return true;
+	}
+    }
+
+  return false;
+}
+
 #include "gt-omp-low.h"
diff --git a/gcc/omp-low.h b/gcc/omp-low.h
index 32076e4..30df867 100644
--- a/gcc/omp-low.h
+++ b/gcc/omp-low.h
@@ -29,6 +29,8 @@ extern tree omp_reduction_init (tree, tree);
 extern bool make_gimple_omp_edges (basic_block, struct omp_region **, int *);
 extern void omp_finish_file (void);
 extern bool gimple_stmt_omp_data_i_init_p (gimple);
+extern bool loop_in_oacc_kernels_region_p (struct loop *, basic_block *,
+

Re: [PATCH, 4/8] Add pass_tree_loop_{init,done} to pass_oacc_kernels

2014-11-25 Thread Tom de Vries


On 15-11-14 18:21, Tom de Vries wrote:

On 15-11-14 13:14, Tom de Vries wrote:

Hi,

I'm submitting a patch series with initial support for the oacc kernels
directive.

The patch series uses pass_parallelize_loops to implement parallelization of
loops in the oacc kernels region.

The patch series consists of these 8 patches:
...
 1  Expand oacc kernels after pass_build_ealias
 2  Add pass_oacc_kernels
 3  Add pass_ch_oacc_kernels to pass_oacc_kernels
 4  Add pass_tree_loop_{init,done} to pass_oacc_kernels
 5  Add pass_loop_im to pass_oacc_kernels
 6  Add pass_ccp to pass_oacc_kernels
 7  Add pass_parloops_oacc_kernels to pass_oacc_kernels
 8  Do simple omp lowering for no address taken var
...


This patch adds pass_tree_loop_init and pass_tree_loop_done to
pass_oacc_kernels.

Pass_parallelize_loops is run between these passes in the pass group
pass_tree_loop, since it requires loop information.  We do the same for
pass_oacc_kernels.



Updated for moving pass_oacc_kernels down past pass_fre in the pass list.

Bootstrapped and reg-tested as before.

OK for trunk?

Thanks,
  - Tom
[PATCH 4/7] Add pass_tree_loop_{init,done} to pass_oacc_kernels

2014-11-25  Tom de Vries  t...@codesourcery.com

	* passes.def: Run pass_tree_loop_init and pass_tree_loop_done in pass
	group pass_oacc_kernels.
	* tree-ssa-loop.c (pass_tree_loop_init::clone)
	(pass_tree_loop_done::clone): New function.
---
 gcc/passes.def  | 2 ++
 gcc/tree-ssa-loop.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/gcc/passes.def b/gcc/passes.def
index 01368bb..37e08a8 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -91,7 +91,9 @@ along with GCC; see the file COPYING3.  If not see
 	  NEXT_PASS (pass_oacc_kernels);
 	  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
 	  NEXT_PASS (pass_ch_oacc_kernels);
+	  NEXT_PASS (pass_tree_loop_init);
 	  NEXT_PASS (pass_expand_omp_ssa);
+	  NEXT_PASS (pass_tree_loop_done);
 	  POP_INSERT_PASSES ()
 	  NEXT_PASS (pass_merge_phi);
 	  NEXT_PASS (pass_cd_dce);
diff --git a/gcc/tree-ssa-loop.c b/gcc/tree-ssa-loop.c
index c29aa22..c78b013 100644
--- a/gcc/tree-ssa-loop.c
+++ b/gcc/tree-ssa-loop.c
@@ -269,6 +269,7 @@ public:
 
   /* opt_pass methods: */
   virtual unsigned int execute (function *);
+  opt_pass * clone () { return new pass_tree_loop_init (m_ctxt); }
 
 }; // class pass_tree_loop_init
 
@@ -563,6 +564,7 @@ public:
 
   /* opt_pass methods: */
   virtual unsigned int execute (function *) { return tree_ssa_loop_done (); }
+  opt_pass * clone () { return new pass_tree_loop_done (m_ctxt); }
 
 }; // class pass_tree_loop_done
 
-- 
1.9.1

Re: [PATCH, 5/8] Add pass_loop_im to pass_oacc_kernels

2014-11-25 Thread Tom de Vries


On 15-11-14 18:22, Tom de Vries wrote:

On 15-11-14 13:14, Tom de Vries wrote:

Hi,

I'm submitting a patch series with initial support for the oacc kernels
directive.

The patch series uses pass_parallelize_loops to implement parallelization of
loops in the oacc kernels region.

The patch series consists of these 8 patches:
...
 1  Expand oacc kernels after pass_build_ealias
 2  Add pass_oacc_kernels
 3  Add pass_ch_oacc_kernels to pass_oacc_kernels
 4  Add pass_tree_loop_{init,done} to pass_oacc_kernels
 5  Add pass_loop_im to pass_oacc_kernels
 6  Add pass_ccp to pass_oacc_kernels
 7  Add pass_parloops_oacc_kernels to pass_oacc_kernels
 8  Do simple omp lowering for no address taken var
...


This patch adds pass_loop_im to pass group pass_oacc_kernels.

We need this pass to simplify the loop body, and allow pass_parloops to detect
that loop iterations are independent.
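
As a reminder of what loop invariant motion does, here is a small source-level sketch. The function names are invented for illustration and the code is not taken from the patch: a computation that does not change across iterations is hoisted in front of the loop, which leaves a simpler loop body.

```cpp
#include <cassert>

// Before: a + b is recomputed in every iteration although it never changes.
void scale_before(int *out, const int *in, int n, int a, int b) {
    for (int i = 0; i < n; ++i)
        out[i] = in[i] * (a + b);
}

// After invariant motion: the invariant expression is computed once.
void scale_after(int *out, const int *in, int n, int a, int b) {
    const int factor = a + b;  // hoisted out of the loop
    for (int i = 0; i < n; ++i)
        out[i] = in[i] * factor;
}
```

The loop body in the second version only reads `in[i]` and writes `out[i]`, which is the kind of shape that makes independence between iterations easier to establish.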



Updated for moving pass_oacc_kernels down past pass_fre in the pass list.

Bootstrapped and reg-tested as before.

OK for trunk?

Thanks,
  - Tom
[PATCH 5/7] Add pass_loop_im to pass_oacc_kernels

2014-11-25  Tom de Vries  t...@codesourcery.com

	* passes.def: Add pass_lim in pass group pass_oacc_kernels.

	* c-c++-common/restrict-2.c: Update for new pass_lim.
	* c-c++-common/restrict-4.c: Same.
	* g++.dg/tree-ssa/pr33615.C:  Same.
	* g++.dg/tree-ssa/restrict1.C: Same.
	* gcc.dg/tm/pub-safety-1.c:  Same.
	* gcc.dg/tm/reg-promotion.c:  Same.
	* gcc.dg/tree-ssa/20050314-1.c:  Same.
	* gcc.dg/tree-ssa/loop-32.c: Same.
	* gcc.dg/tree-ssa/loop-33.c: Same.
	* gcc.dg/tree-ssa/loop-34.c: Same.
	* gcc.dg/tree-ssa/loop-35.c: Same.
	* gcc.dg/tree-ssa/loop-7.c: Same.
	* gcc.dg/tree-ssa/pr23109.c: Same.
	* gcc.dg/tree-ssa/restrict-3.c: Same.
	* gcc.dg/tree-ssa/ssa-lim-1.c: Same.
	* gcc.dg/tree-ssa/ssa-lim-10.c: Same.
	* gcc.dg/tree-ssa/ssa-lim-11.c: Same.
	* gcc.dg/tree-ssa/ssa-lim-12.c: Same.
	* gcc.dg/tree-ssa/ssa-lim-2.c: Same.
	* gcc.dg/tree-ssa/ssa-lim-3.c: Same.
	* gcc.dg/tree-ssa/ssa-lim-6.c: Same.
	* gcc.dg/tree-ssa/ssa-lim-7.c: Same.
	* gcc.dg/tree-ssa/ssa-lim-8.c: Same.
	* gcc.dg/tree-ssa/ssa-lim-9.c: Same.
	* gcc.dg/tree-ssa/structopt-1.c: Same.
	* gfortran.dg/pr32921.f: Same.
---
 gcc/passes.def  | 1 +
 gcc/testsuite/c-c++-common/restrict-2.c | 6 +++---
 gcc/testsuite/c-c++-common/restrict-4.c | 6 +++---
 gcc/testsuite/g++.dg/tree-ssa/pr33615.C | 6 +++---
 gcc/testsuite/g++.dg/tree-ssa/restrict1.C   | 6 +++---
 gcc/testsuite/gcc.dg/tm/pub-safety-1.c  | 6 +++---
 gcc/testsuite/gcc.dg/tm/reg-promotion.c | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/20050314-1.c  | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/loop-32.c | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/loop-33.c | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/loop-34.c | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/loop-35.c | 8 
 gcc/testsuite/gcc.dg/tree-ssa/loop-7.c  | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/pr23109.c | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/restrict-3.c  | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-1.c   | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-10.c  | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-11.c  | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-12.c  | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-2.c   | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-3.c   | 8 
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-6.c   | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-7.c   | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-8.c   | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-9.c   | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/structopt-1.c | 6 +++---
 gcc/testsuite/gfortran.dg/pr32921.f | 6 +++---
 27 files changed, 81 insertions(+), 80 deletions(-)

diff --git a/gcc/passes.def b/gcc/passes.def
index 37e08a8..438d292 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -92,6 +92,7 @@ along with GCC; see the file COPYING3.  If not see
 	  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
 	  NEXT_PASS (pass_ch_oacc_kernels);
 	  NEXT_PASS (pass_tree_loop_init);
+	  NEXT_PASS (pass_lim);
 	  NEXT_PASS (pass_expand_omp_ssa);
 	  NEXT_PASS (pass_tree_loop_done);
 	  POP_INSERT_PASSES ()
diff --git a/gcc/testsuite/c-c++-common/restrict-2.c b/gcc/testsuite/c-c++-common/restrict-2.c
index 3f71b77..f0b0e15a 100644
--- a/gcc/testsuite/c-c++-common/restrict-2.c
+++ b/gcc/testsuite/c-c++-common/restrict-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O -fno-strict-aliasing -fdump-tree-lim1-details" } */
+/* { dg-options "-O -fno-strict-aliasing -fdump-tree-lim2-details" } */
 
 void foo (float * __restrict__ a, float * __restrict__ b, int n, int j)
 {
@@ -10,5 +10,5 @@ void foo (float * __restrict__ a, float * __restrict__ b, int n, int j)
 
 /* We should move the RHS of the store out of the loop.  */
 
-/* { dg-final { scan-tree-dump-times "Moving statement" 11 "lim1" } } */
-/* { dg-final { cleanup-tree-dump "lim1" } } */

Re: [PATCH, 6/8] Add pass_ccp to pass_oacc_kernels

2014-11-25 Thread Tom de Vries


On 15-11-14 18:22, Tom de Vries wrote:

On 15-11-14 13:14, Tom de Vries wrote:

Hi,

I'm submitting a patch series with initial support for the oacc kernels
directive.

The patch series uses pass_parallelize_loops to implement parallelization of
loops in the oacc kernels region.

The patch series consists of these 8 patches:
...
 1  Expand oacc kernels after pass_build_ealias
 2  Add pass_oacc_kernels
 3  Add pass_ch_oacc_kernels to pass_oacc_kernels
 4  Add pass_tree_loop_{init,done} to pass_oacc_kernels
 5  Add pass_loop_im to pass_oacc_kernels
 6  Add pass_ccp to pass_oacc_kernels
 7  Add pass_parloops_oacc_kernels to pass_oacc_kernels
 8  Do simple omp lowering for no address taken var
...


This patch adds pass_ccp to pass group pass_oacc_kernels.

We need this pass to simplify the loop body, and allow pass_parloops to detect
that loop iterations are independent.



As suggested here ( https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02993.html ) 
I've replaced the pass_ccp with pass_copyprop, which performs trivial constant 
propagation in addition to copy propagation.
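
To make the difference concrete, the following source-level sketch shows the kind of cleanup meant here. The function names are hypothetical and the code does not come from the patch: plain copies are replaced by their sources, and a constant assigned to a variable is forwarded to its uses.

```cpp
#include <cassert>

// Before: chains of copies, plus a constant that is copied around.
int before_copy_prop(int x) {
    int a = x;  // copy of x
    int b = a;  // copy of a copy
    int c = 4;  // trivial constant
    int d = c;  // copy of a constant
    return b + d;
}

// After: every use refers directly to the original value or constant.
int after_copy_prop(int x) {
    return x + 4;
}
```

Neither function involves loads or stores, so this is the simple case in which propagation is always safe.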


Bootstrapped and reg-tested as before.

OK for trunk?

Thanks,
- Tom

[PATCH 6/7] Add pass_copy_prop in pass_oacc_kernels

2014-11-25  Tom de Vries  t...@codesourcery.com

	* passes.def: Add pass_copy_prop to pass group pass_oacc_kernels.
	* tree-ssa-copy.c (stmt_may_generate_copy): Handle .omp_data_i init
	conservatively.
---
 gcc/passes.def  | 1 +
 gcc/tree-ssa-copy.c | 4 
 2 files changed, 5 insertions(+)

diff --git a/gcc/passes.def b/gcc/passes.def
index 438d292..fb0d331 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -93,6 +93,7 @@ along with GCC; see the file COPYING3.  If not see
 	  NEXT_PASS (pass_ch_oacc_kernels);
 	  NEXT_PASS (pass_tree_loop_init);
 	  NEXT_PASS (pass_lim);
+	  NEXT_PASS (pass_copy_prop);
 	  NEXT_PASS (pass_expand_omp_ssa);
 	  NEXT_PASS (pass_tree_loop_done);
 	  POP_INSERT_PASSES ()
diff --git a/gcc/tree-ssa-copy.c b/gcc/tree-ssa-copy.c
index 7c22c5e..d6eb7a7 100644
--- a/gcc/tree-ssa-copy.c
+++ b/gcc/tree-ssa-copy.c
@@ -55,6 +55,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-scalar-evolution.h"
 #include "tree-ssa-dom.h"
 #include "tree-ssa-loop-niter.h"
+#include "omp-low.h"
 
 
 /* This file implements the copy propagation pass and provides a
@@ -110,6 +111,9 @@ stmt_may_generate_copy (gimple stmt)
   if (gimple_has_volatile_ops (stmt))
     return false;
 
+  if (gimple_stmt_omp_data_i_init_p (stmt))
+    return false;
+
   /* Statements with loads and/or stores will never generate a useful copy.  */
   if (gimple_vuse (stmt))
     return false;
-- 
1.9.1

Re: [PATCH, 7/8] Add pass_parloops_oacc_kernels to pass_oacc_kernels

2014-11-25 Thread Tom de Vries


On 15-11-14 18:23, Tom de Vries wrote:

On 15-11-14 13:14, Tom de Vries wrote:

Hi,

I'm submitting a patch series with initial support for the oacc kernels
directive.

The patch series uses pass_parallelize_loops to implement parallelization of
loops in the oacc kernels region.

The patch series consists of these 8 patches:
...
 1  Expand oacc kernels after pass_build_ealias
 2  Add pass_oacc_kernels
 3  Add pass_ch_oacc_kernels to pass_oacc_kernels
 4  Add pass_tree_loop_{init,done} to pass_oacc_kernels
 5  Add pass_loop_im to pass_oacc_kernels
 6  Add pass_ccp to pass_oacc_kernels
 7  Add pass_parloops_oacc_kernels to pass_oacc_kernels
 8  Do simple omp lowering for no address taken var
...


This patch adds:
- a specialized version of pass_parallelize_loops called
 pass_parloops_oacc_kernels to pass group pass_oacc_kernels, and
- relevant test-cases.

The pass only handles loops that are in a kernels region, and skips over bits of
pass_parallelize_loops that are already done for oacc kernels.

The pass reintroduces the use of omp_expand_local; I haven't managed to make it
work yet using the external pass pass_expand_omp_ssa.

An obvious limitation of the patch is the fact that we copy over the clauses
from the kernels directive to the generated parallel directive. We'll need to do
something more intelligent here, for instance setting vector_length based on the
parallelization factor.

Another limitation is that the pass still needs -ftree-parallelize-loops to
trigger.



Updated for using pass_copyprop instead of pass_ccp in pass_oacc_kernels.

Bootstrapped and reg-tested as before.

OK for trunk?

Thanks,
- Tom

[PATCH 7/7] Add pass_parloops_oacc_kernels to pass_oacc_kernels

2014-11-25  Tom de Vries  t...@codesourcery.com

	* passes.def: Add pass_parallelize_loops_oacc_kernels in pass group
	pass_oacc_kernels.  Move pass_expand_omp_ssa into pass group
	pass_oacc_kernels.
	* tree-parloops.c (create_parallel_loop): Add function parameters
	region_entry and bool oacc_kernels_p.  Handle oacc_kernels_p.
	(gen_parallel_loop): Same.  Use omp_expand_local if oacc_kernels_p.
	Call create_parallel_loop with additional args.
	(parallelize_loops): Add function parameter oacc_kernels_p.  Calculate
	dominance info.  Skip loops that are not in a kernels region. Call
	gen_parallel_loop with additional args.
	(pass_parallelize_loops::execute): Call parallelize_loops with false
	argument.
	(pass_data_parallelize_loops_oacc_kernels): New pass_data.
	(class pass_parallelize_loops_oacc_kernels): New pass.
	(pass_parallelize_loops_oacc_kernels::execute)
	(make_pass_parallelize_loops_oacc_kernels): New function.
	* tree-pass.h (make_pass_parallelize_loops_oacc_kernels): Declare.

	* testsuite/libgomp.oacc-c/oacc-kernels-2-run.c: New test.
	* testsuite/libgomp.oacc-c/oacc-kernels-run.c: New test.

	* gcc.dg/oacc-kernels-2.c: New test.
	* gcc.dg/oacc-kernels.c: New test.
---
 gcc/passes.def |   1 +
 gcc/testsuite/gcc.dg/oacc-kernels-2.c  |  79 +++
 gcc/testsuite/gcc.dg/oacc-kernels.c|  71 ++
 gcc/tree-parloops.c| 242 -
 gcc/tree-pass.h|   2 +
 .../testsuite/libgomp.oacc-c/oacc-kernels-2-run.c  |  65 ++
 .../testsuite/libgomp.oacc-c/oacc-kernels-run.c|  59 +
 7 files changed, 464 insertions(+), 55 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/oacc-kernels-2.c
 create mode 100644 gcc/testsuite/gcc.dg/oacc-kernels.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c/oacc-kernels-2-run.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c/oacc-kernels-run.c

diff --git a/gcc/passes.def b/gcc/passes.def
index fb0d331..d91283b 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -94,6 +94,7 @@ along with GCC; see the file COPYING3.  If not see
 	  NEXT_PASS (pass_tree_loop_init);
 	  NEXT_PASS (pass_lim);
 	  NEXT_PASS (pass_copy_prop);
+  	  NEXT_PASS (pass_parallelize_loops_oacc_kernels);
 	  NEXT_PASS (pass_expand_omp_ssa);
 	  NEXT_PASS (pass_tree_loop_done);
 	  POP_INSERT_PASSES ()
diff --git a/gcc/testsuite/gcc.dg/oacc-kernels-2.c b/gcc/testsuite/gcc.dg/oacc-kernels-2.c
new file mode 100644
index 0000000..1ff4bad
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/oacc-kernels-2.c
@@ -0,0 +1,79 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target fopenacc } */
+/* { dg-options "-fopenacc -ftree-parallelize-loops=32 -O2 -std=c99 -fdump-tree-parloops_oacc_kernels-all -fdump-tree-copyrename" } */
+
+#include <stdlib.h>
+#include <stdio.h>
+
+#define N (1024 * 512)
+#define N_REF 4293394432
+
+#if 1
+#define COUNTERTYPE unsigned int
+#else
+#define COUNTERTYPE int
+#endif
+
+int
+main (void)
+{
+  unsigned int i;
+
+  unsigned int *__restrict a;
+  unsigned int *__restrict b;
+  unsigned int *__restrict c;
+
+  a = malloc (N * sizeof (unsigned int));
+  b = malloc (N * sizeof (unsigned int));
+  c =

[C++ Patch] PR 63786

2014-11-25 Thread Paolo Carlini


Hi,

we crash on this kind of invalid code because we don't check the case
expressions early with check_for_bare_parameter_packs. Tested x86_64-linux.


Thanks,
Paolo.

//
/cp
2014-11-25  Paolo Carlini  paolo.carl...@oracle.com

PR c++/63786
* parser.c (cp_parser_label_for_labeled_statement): Check the case
with check_for_bare_parameter_packs.

/testsuite
2014-11-25  Paolo Carlini  paolo.carl...@oracle.com

PR c++/63786
* g++.dg/cpp0x/variadic163.C: New.
Index: cp/parser.c
===
--- cp/parser.c (revision 218039)
+++ cp/parser.c (working copy)
@@ -9820,6 +9820,8 @@ cp_parser_label_for_labeled_statement (cp_parser*
cp_lexer_consume_token (parser->lexer);
/* Parse the constant-expression.  */
expr = cp_parser_constant_expression (parser);
+   if (check_for_bare_parameter_packs (expr))
+ expr = error_mark_node;
 
ellipsis = cp_lexer_peek_token (parser->lexer);
if (ellipsis->type == CPP_ELLIPSIS)
@@ -9826,8 +9828,9 @@ cp_parser_label_for_labeled_statement (cp_parser*
  {
/* Consume the `...' token.  */
cp_lexer_consume_token (parser->lexer);
-   expr_hi =
- cp_parser_constant_expression (parser);
+   expr_hi = cp_parser_constant_expression (parser);
+   if (check_for_bare_parameter_packs (expr_hi))
+ expr_hi = error_mark_node;
 
/* We don't need to emit warnings here, as the common code
   will do this for us.  */
Index: testsuite/g++.dg/cpp0x/variadic163.C
===
--- testsuite/g++.dg/cpp0x/variadic163.C(revision 0)
+++ testsuite/g++.dg/cpp0x/variadic163.C(working copy)
@@ -0,0 +1,21 @@
+// PR c++/63786
+// { dg-do compile { target c++11 } }
+// { dg-options "" }
+
+template<int... Is>
+int f(int i) {
+switch (i) {
+case Is:   // { dg-error "not expanded" }
+return 0;
+}
+
+switch (i) {
+case 0 ...Is:  // { dg-error "not expanded" }
+return 0;
+}
+return 0;
+}
+
+int main() {
+f<1,2,3>(1);
+}

Re: [PATCH, PR63995, CHKP] Use single static bounds var for varpool nodes sharing asm name

2014-11-25 Thread Ilya Enkovich

2014-11-25 14:11 GMT+03:00 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 25, 2014 at 11:19 AM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2014-11-25 12:43 GMT+03:00 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 25, 2014 at 9:45 AM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 Hi,

 This patch partly fixes PR bootstrap/63995 by avoiding duplicating static 
 bounds vars.  With this fix bootstrap still fails at stage 2 and 3 
 comparison.

 Bootstrapped and checked on x86_64-unknown-linux-gnu.  OK for trunk?

 Thanks,
 Ilya
 --
 gcc/

 2014-11-25  Ilya Enkovich  ilya.enkov...@intel.com

 PR bootstrap/63995
 * tree-chkp (chkp_make_static_bounds): Share bounds var
 between nodes sharing assembler name.

 gcc/testsuite

 2014-11-25  Ilya Enkovich  ilya.enkov...@intel.com

 PR bootstrap/63995
 * g++.dg/dg.exp: Add mpx-dg.exp.
 * g++.dg/pr63995-1.C: New.


 diff --git a/gcc/testsuite/g++.dg/dg.exp b/gcc/testsuite/g++.dg/dg.exp
 index 14beae1..44eab0c 100644
 --- a/gcc/testsuite/g++.dg/dg.exp
 +++ b/gcc/testsuite/g++.dg/dg.exp
 @@ -18,6 +18,7 @@

  # Load support procs.
  load_lib g++-dg.exp
 +load_lib mpx-dg.exp

  # If a testcase doesn't have special options, use these.
  global DEFAULT_CXXFLAGS
 diff --git a/gcc/testsuite/g++.dg/pr63995-1.C 
 b/gcc/testsuite/g++.dg/pr63995-1.C
 new file mode 100644
 index 000..82e7606
 --- /dev/null
 +++ b/gcc/testsuite/g++.dg/pr63995-1.C
 @@ -0,0 +1,16 @@
 +/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
 +/* { dg-require-effective-target mpx } */
 +/* { dg-options "-O2 -g -fcheck-pointer-bounds -mmpx" } */
 +
 +int test1 (int i)
 +{
 +  extern const int arr[10];
 +  return arr[i];
 +}
 +
 +extern const int arr[10];
 +
 +int test2 (int i)
 +{
 +  return arr[i];
 +}
 diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
 index 3e38691..d425084 100644
 --- a/gcc/tree-chkp.c
 +++ b/gcc/tree-chkp.c
 @@ -2727,9 +2727,29 @@ chkp_make_static_bounds (tree obj)
/* First check if we already have required var.  */
if (chkp_static_var_bounds)
  {
 -  slot = chkp_static_var_bounds->get (obj);
 -  if (slot)
 -   return *slot;
 +  /* If there is a symbol sharing assembler name with obj,
 +we may use its bounds.  */
 +  if (TREE_CODE (obj) == VAR_DECL)
 +   {
 + varpool_node *node = varpool_node::get_create (obj);
 +
 + while (node->previous_sharing_asm_name)
 +   node = (varpool_node *)node->previous_sharing_asm_name;
 +
 + while (node)
 +   {
 + slot = chkp_static_var_bounds->get (node->decl);
 + if (slot)
 +   return *slot;
 + node = (varpool_node *)node->next_sharing_asm_name;
 +   }

 Hum.  varpool_node::get returns the ultimate alias target thus the
 walking shouldn't be necessary.  Just

   node = varpool_node::get_create (obj);
   slot = chkp_static_var_bounds->get (node->decl);
   if (slot)
 return *slot;

 and then making sure to set the decl also for node->decl.  I suppose
 it really asks for making chkp_static_var_bounds->get based on
 a varpool node and not a decl so you consistently use the ultimate
 alias target.

 varpool_node::get just returns symtab_node::get which returns
 decl->decl_with_vis.symtab_node - thus no aliases walkthrough. Also
 none of two varpool_nodes is an alias. The only connection between
 these nodes seems to be {next,previous}_sharing_asm_name. Here is how
 these nodes look:

 Ok, then it's get_for_asmname ().  That said - the above loops look
 bogus to me.  Honza - any better ideas?

get_for_asmname () returns the first element in a chain of nodes with
the same asm name.  May I rely on the order of nodes in this chain?
Or should I perhaps use ASSEMBLER_NAME as the key in the
chkp_static_var_bounds hash?
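
For illustration only, the idea of keying the bounds cache by assembler
name could look roughly like the sketch below.  Everything in it
(toy_decl, toy_bounds, bounds_cache) is hypothetical and uses plain
standard containers rather than GCC's hash_map or varpool API; it only
shows that two declarations sharing an asm name end up with one bounds
object.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for a declaration and its static bounds variable.
struct toy_decl { std::string asm_name; };
struct toy_bounds { int id; };

// Cache keyed by assembler name instead of by declaration, so that a
// block-scope "extern" declaration and the file-scope one share an entry.
class bounds_cache
{
  std::unordered_map<std::string, toy_bounds> by_asm_name_;
  int next_id_ = 0;

public:
  const toy_bounds &get_or_create (const toy_decl &decl)
  {
    auto it = by_asm_name_.find (decl.asm_name);
    if (it == by_asm_name_.end ())
      it = by_asm_name_.emplace (decl.asm_name, toy_bounds{next_id_++}).first;
    return it->second;
  }
};
```

With such a key the order of nodes in the sharing chain would no longer
matter for the lookup.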

Thanks,
Ilya


 Richard.

 (gdb) p *$2
 $3 = {symtab_node = {type = SYMTAB_VARIABLE, resolution =
 LDPR_UNKNOWN, definition = 0, alias = 0, weakref = 0,
 cpp_implicit_alias = 0, analyzed = 0, writeonly = 0,
 refuse_visibility_changes = 0, externally_visible = 0,
 no_reorder = 0, force_output = 0, forced_by_abi = 0, unique_name =
 0, implicit_section = 0, body_removed = 1, used_from_other_partition =
 0, in_other_partition = 0, address_taken = 0, in_init_priority_hash =
 0,
 need_lto_streaming = 0, offloadable = 0, order = 3, decl =
 0x77dd2cf0, next = 0x77f46200, previous = 0x77dd67a8,
 next_sharing_asm_name = 0x0, previous_sharing_asm_name =
 0x77f46200, same_comdat_group = 0x0,
 ref_list = {references = 0x0, referring = {m_vec = 0x23856b0}},
 alias_target = 0x0, lto_file_data = 0x0, aux = 0x0, x_comdat_group =
 0x0, x_section = 0x0}, output = 0, need_bounds_init = 0,
 dynamically_initialized = 0,
   tls_model = TLS_MODEL_NONE, used_by_single_function = 0}

 (gdb) p *$5
 $6 = {symtab_node = {type = SYMTAB_VARIABLE, resolution =
 LDPR_UNKNOWN, definition = 0, alias = 0, weakref = 0,
 cpp_implicit_alias = 0, analyzed = 0, writeonly = 0,

[PATCH] Remove unnecessary calls to strchr.

2014-11-25 Thread Ilya Tocar

Hi,

As proposed in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63853
this patch replaces some function calls with pointer arithmetic.
I didn't mention the PR in the ChangeLog, as the two are not actually related.
Ok for trunk?


gcc/
* gcc.c (handle_foffload_option): Remove unnecessary calls to strchr,
strlen, strncpy.
* lto-wrapper.c (append_offload_options): Likewise.

---
 gcc/gcc.c | 24 +---
 gcc/lto-wrapper.c |  2 +-
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/gcc/gcc.c b/gcc/gcc.c
index 653ca8d..4731eec 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -3384,11 +3384,11 @@ handle_foffload_option (const char *arg)
 {
   next = strchr (cur, ',');
   if (next == NULL)
-   next = strchr (cur, '\0');
+   next = end;
   next = (next > end) ? end : next;
 
   target = XNEWVEC (char, next - cur + 1);
-  strncpy (target, cur, next - cur);
+  memcpy (target, cur, next - cur);
   target[next - cur] = '\0';
 
   /* If 'disable' is passed to the option, stop parsing the option and 
clean
@@ -3408,8 +3408,7 @@ handle_foffload_option (const char *arg)
  if (n == NULL)
n = strchr (c, '\0');
 
- if (strlen (target) == (size_t) (n - c)
-  && strncmp (target, c, n - c) == 0)
+ if (next - cur == n - c && strncmp (target, c, n - c) == 0)
break;
 
  c = *n ? n + 1 : NULL;
@@ -3420,7 +3419,10 @@ handle_foffload_option (const char *arg)
 target);
 
   if (!offload_targets)
-   offload_targets = xstrdup (target);
+   {
+ offload_targets = target;
+ target = NULL;
+   }
   else
{
  /* Check that the target hasn't already presented in the list.  */
@@ -3431,8 +3433,7 @@ handle_foffload_option (const char *arg)
  if (n == NULL)
n = strchr (c, '\0');
 
- if (strlen (target) == (size_t) (n - c)
-  && strncmp (c, target, n - c) == 0)
+ if (next - cur == n - c && strncmp (c, target, n - c) == 0)
break;

Re: [PATCH] Remove unnecessary calls to strchr.

2014-11-25 Thread Jakub Jelinek

On Tue, Nov 25, 2014 at 03:15:04PM +0300, Ilya Tocar wrote:
 As proposed in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63853
 this patch replaces some function calls with pointer arithmetic.
 I didn't mention PR in Changelog, as they are not actually related.
 Ok for trunk?
 @@ -3408,8 +3408,7 @@ handle_foffload_option (const char *arg)
 if (n == NULL)
   n = strchr (c, '\0');
  
 -   if (strlen (target) == (size_t) (n - c)
 -   && strncmp (target, c, n - c) == 0)
 +   if (next - cur == n - c && strncmp (target, c, n - c) == 0)

I suppose you could use memcmp here, you know the string lengths.

 @@ -3431,8 +3433,7 @@ handle_foffload_option (const char *arg)
 if (n == NULL)
   n = strchr (c, '\0');
  
 -   if (strlen (target) == (size_t) (n - c)
 -   && strncmp (c, target, n - c) == 0)
 +   if (next - cur == n - c && strncmp (c, target, n - c) == 0)
   break;

And here too.

Ok with or without those changes.

Jakub
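
A minimal sketch of the memcmp suggestion, assuming a free-standing
helper rather than the actual handle_foffload_option code: the name
list_contains and its parameters are made up for illustration.  Because
both lengths are known, the lengths are compared first and the bytes
are then compared with memcmp, so no NUL-terminated copy of the token
is needed.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper, not the GCC function: returns true if the token
// [tok, tok + len) appears as one entry of the comma-separated LIST.
bool list_contains (const char *list, const char *tok, std::size_t len)
{
  const char *c = list;
  while (c != nullptr)
    {
      // Find the end of the current entry: next comma or the terminator.
      const char *n = std::strchr (c, ',');
      if (n == nullptr)
        n = c + std::strlen (c);

      // Lengths are known on both sides, so compare them first and then
      // compare the raw bytes.
      if (static_cast<std::size_t> (n - c) == len
          && std::memcmp (c, tok, len) == 0)
        return true;

      // Step past the comma, or stop at the end of the list.
      c = *n ? n + 1 : nullptr;
    }
  return false;
}
```

The length check is what keeps a prefix such as "nvptx" from matching
the entry "nvptx-none".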

[PATCH] sreal class fix for PR64050 and PR64060

2014-11-25 Thread Martin Liška


Hello.

The following patch fixes the sreal problems mentioned in PR64050 and PR64060.
I added a new GCC plugin test that exercises sreal arithmetic and number
comparison.

The patch bootstraps on ppc64-linux-pc and x86_64-linux-pc and passes the
regression tests.

Thanks,
Martin
gcc/ChangeLog:

2014-11-25  Martin Liska  Martin li...@suse.cz

PR bootstrap/64050
PR ipa/64060
* sreal.c (sreal::operator+): Addition fixed.
(sreal::signedless_plus): Negative numbers are
handled correctly.
(sreal::operator-): Subtraction is fixed.
(sreal::signedless_minus): Negative numbers are
handled correctly.
* sreal.h (sreal::operator<): Equal negative numbers
are compared correctly.
(sreal::shift): New checking asserts are introduced.
Operation is fixed.
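
To make the comparison and shift entries above concrete, here is a
rough sketch using a toy sign-magnitude type.  It is not the sreal
class: toy_sreal and its members are assumptions for illustration, and
values are assumed to be normalized so that exponents can be compared
directly.

```cpp
#include <cstdint>

// Toy sign-magnitude number; field names mirror sreal's, but this is
// not the GCC class.
struct toy_sreal
{
  std::uint64_t m_sig;   // magnitude (significand)
  int m_exp;             // binary exponent
  bool m_negative;       // sign flag

  bool operator== (const toy_sreal &o) const
  {
    return m_sig == o.m_sig && m_exp == o.m_exp && m_negative == o.m_negative;
  }

  bool operator< (const toy_sreal &o) const
  {
    // Without this early exit, two equal negative values would compare
    // as "less" once the magnitude result is negated below.
    if (*this == o)
      return false;

    if (m_negative != o.m_negative)
      return m_negative > o.m_negative;

    bool mag_less = m_exp < o.m_exp || (m_exp == o.m_exp && m_sig < o.m_sig);
    return m_negative ? !mag_less : mag_less;
  }

  // Scaling by 2^s adjusts the exponent and leaves the significand alone.
  toy_sreal shift (int s) const
  {
    toy_sreal tmp = *this;
    tmp.m_exp += s;
    return tmp;
  }
};
```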

gcc/testsuite/ChangeLog:

2014-11-25  Martin Liska  Martin li...@suse.cz

PR bootstrap/64050
PR ipa/64060
* gcc.dg/plugin/plugin.exp: New plugin.
* gcc.dg/plugin/sreal-test-1.c: New test.
* gcc.dg/plugin/sreal_plugin.c: New test.
diff --git a/gcc/sreal.c b/gcc/sreal.c
index 0337f9e..2b5e3ae 100644
--- a/gcc/sreal.c
+++ b/gcc/sreal.c
@@ -182,9 +182,9 @@ sreal::operator+ (const sreal &other) const
 {
   sreal tmp = -(*b_p);
   if (*a_p < tmp)
-	return signedless_minus (tmp, *a_p, false);
+	return signedless_minus (tmp, *a_p, true);
   else
-	return signedless_minus (*a_p, tmp, true);
+	return signedless_minus (*a_p, tmp, false);
 }
 
   gcc_checking_assert (a_p->m_negative == b_p->m_negative);
@@ -203,7 +203,7 @@ sreal::signedless_plus (const sreal &a, const sreal &b, bool negative)
   const sreal *a_p = &a;
   const sreal *b_p = &b;
 
-  if (*a_p < *b_p)
+  if (a_p->m_exp < b_p->m_exp)
 std::swap (a_p, b_p);
 
   dexp = a_p->m_exp - b_p->m_exp;
@@ -211,6 +211,7 @@ sreal::signedless_plus (const sreal &a, const sreal &b, bool negative)
   if (dexp > SREAL_BITS)
 {
   r.m_sig = a_p->m_sig;
+  r.m_negative = negative;
   return r;
 }
 
@@ -248,11 +249,11 @@ sreal::operator- (const sreal &other) const
   /* We want to substract a smaller number from bigger
 for nonegative numbers.  */
   if (!m_negative && *this < other)
-return -signedless_minus (other, *this, true);
+return signedless_minus (other, *this, true);
 
   /* Example: -2 - (-3) = 3 - 2 */
   if (m_negative && *this > other)
-return signedless_minus (-other, -(*this), true);
+return signedless_minus (-other, -(*this), false);
 
   sreal r = signedless_minus (*this, other, m_negative);
 
@@ -274,6 +275,7 @@ sreal::signedless_minus (const sreal &a, const sreal &b, bool negative)
   if (dexp > SREAL_BITS)
 {
   r.m_sig = a_p->m_sig;
+  r.m_negative = negative;
   return r;
 }
   if (dexp == 0)
diff --git a/gcc/sreal.h b/gcc/sreal.h
index 1362bf6..3938c6e 100644
--- a/gcc/sreal.h
+++ b/gcc/sreal.h
@@ -60,6 +60,11 @@ public:
 
   bool operator< (const sreal &other) const
   {
+/* We negate result in case of negative numbers and
+   it would return true for equal negative numbers.  */
+if (*this == other)
+  return false;
+
 if (m_negative != other.m_negative)
   return m_negative > other.m_negative;
 
@@ -86,10 +91,19 @@ public:
 return tmp;
   }
 
-  sreal shift (int sig) const
+  sreal shift (int s) const
   {
+gcc_checking_assert (s <= SREAL_BITS);
+gcc_checking_assert (s >= -SREAL_BITS);
+
+/* Exponent should never be so large because shift_right is used only by
+ sreal_add and sreal_sub ant thus the number cannot be shifted out from
+ exponent range.  */
+gcc_checking_assert (m_exp + s <= SREAL_MAX_EXP);
+gcc_checking_assert (m_exp + s >= -SREAL_MAX_EXP);
+
 sreal tmp = *this;
-tmp.m_sig += sig;
+tmp.m_exp += s;
 
 return tmp;
   }
diff --git a/gcc/testsuite/gcc.dg/plugin/plugin.exp b/gcc/testsuite/gcc.dg/plugin/plugin.exp
index e4b5f54..c12b3da 100644
--- a/gcc/testsuite/gcc.dg/plugin/plugin.exp
+++ b/gcc/testsuite/gcc.dg/plugin/plugin.exp
@@ -59,6 +59,7 @@ set plugin_test_list [list \
 { selfassign.c self-assign-test-1.c self-assign-test-2.c } \
 { ggcplug.c ggcplug-test-1.c } \
 { one_time_plugin.c one_time-test-1.c } \
+{ sreal_plugin.c sreal-test-1.c } \
 { start_unit_plugin.c start_unit-test-1.c } \
 { finish_unit_plugin.c finish_unit-test-1.c } \
 ]
diff --git a/gcc/testsuite/gcc.dg/plugin/sreal-test-1.c b/gcc/testsuite/gcc.dg/plugin/sreal-test-1.c
new file mode 100644
index 000..1bce2cc
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/sreal-test-1.c
@@ -0,0 +1,8 @@
+/* Test that pass is inserted and invoked once. */
+/* { dg-do compile } */
+/* { dg-options "-O" } */
+
+int main (int argc, char **argv)
+{
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/sreal_plugin.c b/gcc/testsuite/gcc.dg/plugin/sreal_plugin.c
new file mode 100644
index 000..f113816
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/sreal_plugin.c
@@ -0,0

PATCH: PR rtl-optimization/64037: Miscompilation with -Os and enum class : char parameter

2014-11-25 Thread H.J. Lu

Hi,

The enclosed testcase fails on x86 when compiled with -Os, since we pass
a byte parameter with a byte load in the caller and read it as an int in
the callee.  It only shows up with -Os because the x86 backend encodes
a byte load as an int load if -O isn't used.  When a byte load is
used, the upper 24 bits of the register have random values on
non-WORD_REGISTER_OPERATIONS targets.

It happens because setup_incoming_promotions in combine.c has

  /* The mode and signedness of the argument before any promotions happen
 (equal to the mode of the pseudo holding it at that stage).  */
  mode1 = TYPE_MODE (TREE_TYPE (arg));
  uns1 = TYPE_UNSIGNED (TREE_TYPE (arg));

  /* The mode and signedness of the argument after any source language and
 TARGET_PROMOTE_PROTOTYPES-driven promotions.  */
  mode2 = TYPE_MODE (DECL_ARG_TYPE (arg));
  uns3 = TYPE_UNSIGNED (DECL_ARG_TYPE (arg));

  /* The mode and signedness of the argument as it is actually passed,
 after any TARGET_PROMOTE_FUNCTION_ARGS-driven ABI promotions.  */
  mode3 = promote_function_mode (DECL_ARG_TYPE (arg), mode2, uns3,
 TREE_TYPE (cfun->decl), 0);

while they are actually passed in register by assign_parm_setup_reg in
function.c:

  /* Store the parm in a pseudoregister during the function, but we may
 need to do it in a wider mode.  Using 2 here makes the result
 consistent with promote_decl_mode and thus expand_expr_real_1.  */
  promoted_nominal_mode
= promote_function_mode (data->nominal_type, data->nominal_mode, unsignedp,
 TREE_TYPE (current_function_decl), 2);

where nominal_type and nominal_mode are set up with TREE_TYPE (parm)
and TYPE_MODE (nominal_type). TREE_TYPE here is

(gdb) call debug_tree (type)
 enumeral_type 0x719f85e8 X
type integer_type 0x718a93f0 unsigned char public unsigned
string-flag QI
size integer_cst 0x718a5fa8 constant 8
unit size integer_cst 0x718a5fc0 constant 1
align 8 symtab 0 alias set -1 canonical type 0x718a93f0
precision 8 min integer_cst 0x718a5fd8 0 max integer_cst
0x718a5f78 255
static unsigned type_5 QI size integer_cst 0x718a5fa8 8 unit
size integer_cst 0x718a5fc0 1
align 8 symtab 0 alias set -1 canonical type 0x719f85e8
precision 8 min integer_cst 0x718a5fd8 0 max integer_cst
0x718a5f78 255
values tree_list 0x719fb028
purpose identifier_node 0x719f6738 V
bindings (nil)
local bindings (nil)
value const_decl 0x718c21c0 V type enumeral_type
0x719f85e8 X
readonly constant used VOID file pr64037.ii line 2 col 3
align 1 context enumeral_type 0x719f85e8 X initial
integer_cst 0x719d8d08 2 context translation_unit_decl
0x77ff91e0 D.1
chain type_decl 0x719f5c78 X
(gdb) 

and DECL_ARG_TYPE is

(gdb) call debug_tree (type)
 integer_type 0x718a9690 int public SI
size integer_cst 0x718a5e70 type integer_type 0x718a9150
bitsizetype constant 32
unit size integer_cst 0x718a5e88 type integer_type
0x718a90a8 sizetype constant 4
align 32 symtab 0 alias set 1 canonical type 0x718a9690
precision 32 min integer_cst 0x718c60c0 -2147483648 max
integer_cst 0x718c60d8 2147483647
pointer_to_this pointer_type 0x718cb930
(gdb) 

This mismatch makes combine think a byte parameter is passed as int
in register and turns

(insn 9 6 10 2 (set (reg:SI 92 [ b ])
(zero_extend:SI (subreg:QI (reg:SI 91 [ b ]) 0))) pr64037.ii:9
138 {*zero_extendqisi2}
 (expr_list:REG_DEAD (reg:SI 91 [ b ])
(nil)))
(insn 10 9 0 2 (set (mem:SI (reg/v/f:SI 88 [ out ]) [1 *out_4(D)+0 S4
A32])
(reg:SI 92 [ b ])) pr64037.ii:9 90 {*movsi_internal}
 (expr_list:REG_DEAD (reg:SI 92 [ b ])
(expr_list:REG_DEAD (reg/v/f:SI 88 [ out ])
(nil

into

Trying 9 -> 10: 
Successfully matched this instruction:
(set (mem:SI (reg/v/f:SI 88 [ out ]) [1 *out_4(D)+0 S4 A32])
(reg:SI 91 [ b ])) 
allowing combination of insns 9 and 10
original costs 6 + 4 = 10
replacement cost 4
deferring deletion of insn with uid = 9.
modifying insn i310: [r88:SI]=r91:SI
  REG_DEAD r91:SI
  REG_DEAD r88:SI


This patch makes setup_incoming_promotions match assign_parm_setup_reg.
Tested on Linux/x86-64 without regressions.  OK for trunk and backport?
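
As a rough illustration of why dropping the zero_extend is wrong here,
the following sketch models the 32-bit register as a plain integer.
The function names are hypothetical, and the garbage in the upper
24 bits is an assumption standing in for whatever the register held
before the caller's byte load.

```cpp
#include <cstdint>

// What insn 9 does: zero_extend:SI of the QImode low part, so only the
// low 8 bits of the incoming register contribute to the result.
std::uint32_t callee_with_extend (std::uint32_t reg)
{
  return reg & 0xffu;
}

// What the combined insn does after assuming the caller already
// extended the value: the whole register is stored as is.
std::uint32_t callee_without_extend (std::uint32_t reg)
{
  return reg;
}
```

The two agree only when the upper bits of the register happen to be
zero, which a byte load in the caller does not guarantee.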

Thanks.


H.J.

diff --git a/gcc/combine.c b/gcc/combine.c
index 1808f97..a0449a2 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -1561,8 +1561,8 @@ setup_incoming_promotions (rtx_insn *first)
   uns3 = TYPE_UNSIGNED (DECL_ARG_TYPE (arg));
 
   /* The mode and signedness of the argument as it is actually passed,
- after any TARGET_PROMOTE_FUNCTION_ARGS-driven ABI promotions.  */
-  mode3 = promote_function_mode (DECL_ARG_TYPE (arg), mode2, uns3,
+ see assign_parm_setup_reg in function.c.  */
+

Re: [PATCH v2] gcc/c-family/c-cppbuiltin.c: Let buffer enough to print host wide integer value

2014-11-25 Thread Chen Gang

On 11/25/14 7:56, Joseph Myers wrote:
 On Sun, 23 Nov 2014, Chen Gang wrote:
 
 +  gcc_assert (wi::fits_to_tree_p(value, integer_type_node));
 
 Watch formatting: space before '(' in the wi::fits_to_tree_p call.  
 Applies elsewhere in this patch as well.
 

OK, thanks, I will watch for that next time.

 When making such an interface change, (a) you should update the comment on 
 builtin_define_with_int_value to explain the new interface, and (b) you 
 should check existing callers to make sure their values are indeed in 
 range, and describe the check you did.
 
 In fact, -fabi-version=0 results in __GXX_ABI_VERSION being defined to 
 99 using builtin_define_with_int_value.  That's out of range of int on 
 targets with 16-bit int.  So that indicates against requiring the value to 
 be within range of int.  It might however be OK to require the value to be 
 within range of target long.
 

In my view, we can let builtin_define_with_int_value() accept all kinds of
integer values, in which case the assert needs to be:

  gcc_assert (wi::fits_to_tree_p (value, char_type_node)
  || wi::fits_to_tree_p (value, short_integer_type_node)
  || wi::fits_to_tree_p (value, integer_type_node)
  || wi::fits_to_tree_p (value, long_integer_type_node)
  || wi::fits_to_tree_p (value, long_long_integer_type_node));

If it really can accept all kinds of integer values, then in my view the
related comment on builtin_define_with_int_value() need not be changed.

 +  if (value >= 0)
 +{
 +  sprintf (buf, "%s=" HOST_WIDE_INT_PRINT_DEC "%s",
 +   macro, value,
 +   value <= HOST_INT_MAX
 +   ? ""
 +   : value <= HOST_LONG_MAX
 + ? "L" : "LL");
 
 Limits on the host's int and long are completely irrelevant here.  The 
 question is the target's int and long, not the host's - and consistency 
 indicates checking with wi::fits_to_tree_p (value, integer_type_node) if 
 the assertion checked with long_integer_type_node.
 

OK, thanks. In my view, the related sprintf() should then be:

  sprintf (buf, "%s=%s" HOST_WIDE_INT_PRINT_DEC "%s%s",
   macro,
   value < 0 ? "(" : "",
   value,
   wi::fits_to_tree_p (value, char_type_node)
 || wi::fits_to_tree_p (value, short_integer_type_node)
 || wi::fits_to_tree_p (value, integer_type_node)
   ? ""
   : wi::fits_to_tree_p (value, long_integer_type_node)
 ? "L" : "LL",
   value < 0 ? ")" : "");
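
A sketch of the same suffix selection outside of GCC, under stated
assumptions: define_with_int_value is a made-up stand-in, the target's
int and long widths are passed in as plain bit counts instead of being
queried with wi::fits_to_tree_p, and std::string is used so that no
fixed-size buffer has to be sized at all.

```cpp
#include <climits>
#include <string>

// Hypothetical stand-in for builtin_define_with_int_value.  INT_BITS and
// LONG_BITS describe the *target* types and are assumptions of the sketch.
std::string define_with_int_value (const char *macro, long long value,
                                   int int_bits, int long_bits)
{
  // True if VALUE is representable in a signed type of BITS bits.
  auto fits = [value] (int bits) {
    long long max = (bits >= 64) ? LLONG_MAX : (1LL << (bits - 1)) - 1;
    long long min = -max - 1;
    return value >= min && value <= max;
  };

  const char *suffix = fits (int_bits) ? "" : fits (long_bits) ? "L" : "LL";

  std::string out = macro;
  out += '=';
  if (value < 0)
    out += '(';               // parenthesize negative values
  out += std::to_string (value);
  out += suffix;
  if (value < 0)
    out += ')';
  return out;
}
```

For a target with 16-bit int and 32-bit long, a value such as 999999
would get the "L" suffix under this scheme.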

Thanks.
-- 
Chen Gang

Open, share, and attitude like air, water, and life which God blessed

Re: [C++ Patch] PR 63786

2014-11-25 Thread Jason Merrill


OK.

Jason

[PATCH] Fix PR61927

2014-11-25 Thread Richard Biener


I am testing the following patch, which restores the GCC 4.8 order of
group and pattern analysis.  The order doesn't otherwise matter, but
this way pattern analysis does not need to know about groups; its
failure to do so causes the wrong-code in the PR.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Help with making the testcase in the PR suitable for the testsuite
is appreciated - my Fortran fu is limited.

Richard.

2014-11-25  Richard Biener  rguent...@suse.de

PR tree-optimization/61927
* tree-vect-loop.c (vect_analyze_loop_2): Revert ordering
of group and pattern analysis to the one in GCC 4.8.

Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c(revision 218019)
+++ gcc/tree-vect-loop.c(working copy)
@@ -1662,6 +1662,13 @@ vect_analyze_loop_2 (loop_vec_info loop_
   return false;
 }
 
+  /* Classify all cross-iteration scalar data-flow cycles.
+ Cross-iteration cycles caused by virtual phis are analyzed separately.  */
+
+  vect_analyze_scalar_cycles (loop_vinfo);
+
+  vect_pattern_recog (loop_vinfo, NULL);
+
   /* Analyze the access patterns of the data-refs in the loop (consecutive,
  complex, etc.). FORNOW: Only handle consecutive access pattern.  */
 
@@ -1674,13 +1681,6 @@ vect_analyze_loop_2 (loop_vec_info loop_
   return false;
 }
 
-  /* Classify all cross-iteration scalar data-flow cycles.
- Cross-iteration cycles caused by virtual phis are analyzed separately.  */
-
-  vect_analyze_scalar_cycles (loop_vinfo);
-
-  vect_pattern_recog (loop_vinfo, NULL);
-
   /* Data-flow analysis to detect stmts that do not need to be vectorized.  */
 
   ok = vect_mark_stmts_to_be_vectorized (loop_vinfo);

[PATCH] Fix PR64065(?)

2014-11-25 Thread Richard Biener


The following might fix PR64065; either way, what it fixes is certainly a bug.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2014-11-25  Richard Biener  rguent...@suse.de

PR lto/64065
* lto-streamer-out.c (output_struct_function_base): Stream
last_clique field.
* lto-streamer-in.c (input_struct_function_base): Likewise.
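
The point of touching both files is that writer and reader must pack
and unpack the same fields, with the same widths, in the same order.
A toy illustration follows; toy_bitpack, toy_fn, write_fn and read_fn
are invented for this sketch and are not the LTO streamer API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal bit packer: values are appended and read back bit by bit.
struct toy_bitpack
{
  std::vector<bool> bits;
  std::size_t pos = 0;

  void pack (std::uint64_t value, unsigned nbits)
  {
    for (unsigned i = 0; i < nbits; ++i)
      bits.push_back ((value >> i) & 1u);
  }

  std::uint64_t unpack (unsigned nbits)
  {
    std::uint64_t value = 0;
    for (unsigned i = 0; i < nbits; ++i)
      value |= static_cast<std::uint64_t> (bits.at (pos++)) << i;
    return value;
  }
};

// Hypothetical subset of the per-function fields being streamed.
struct toy_fn
{
  unsigned va_list_fpr_size;
  unsigned va_list_gpr_size;
  unsigned short last_clique;
};

const unsigned clique_bits = sizeof (unsigned short) * 8;

void write_fn (toy_bitpack &bp, const toy_fn &fn)
{
  bp.pack (fn.va_list_fpr_size, 8);
  bp.pack (fn.va_list_gpr_size, 8);
  bp.pack (fn.last_clique, clique_bits);
}

// Mirrors write_fn field for field; skipping one here would shift every
// later read by that field's width.
toy_fn read_fn (toy_bitpack &bp)
{
  toy_fn fn;
  fn.va_list_fpr_size = static_cast<unsigned> (bp.unpack (8));
  fn.va_list_gpr_size = static_cast<unsigned> (bp.unpack (8));
  fn.last_clique = static_cast<unsigned short> (bp.unpack (clique_bits));
  return fn;
}
```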

Index: gcc/lto-streamer-out.c
===
--- gcc/lto-streamer-out.c  (revision 218019)
+++ gcc/lto-streamer-out.c  (working copy)
@@ -1956,6 +1956,7 @@ output_struct_function_base (struct outp
   bp_pack_value (&bp, fn->has_simduid_loops, 1);
   bp_pack_value (&bp, fn->va_list_fpr_size, 8);
   bp_pack_value (&bp, fn->va_list_gpr_size, 8);
+  bp_pack_value (&bp, fn->last_clique, sizeof (short) * 8);
 
   /* Output the function start and end loci.  */
   stream_output_location (ob, &bp, fn->function_start_locus);
Index: gcc/lto-streamer-in.c
===
--- gcc/lto-streamer-in.c   (revision 218019)
+++ gcc/lto-streamer-in.c   (working copy)
@@ -903,6 +903,7 @@ input_struct_function_base (struct funct
   fn->has_simduid_loops = bp_unpack_value (&bp, 1);
   fn->va_list_fpr_size = bp_unpack_value (&bp, 8);
   fn->va_list_gpr_size = bp_unpack_value (&bp, 8);
+  fn->last_clique = bp_unpack_value (&bp, sizeof (short) * 8);
 
   /* Input the function start and end loci.  */
   fn->function_start_locus = stream_input_location (&bp, data_in);

Re: [PATCH] rs6000: Replace a stray addic with addi

2014-11-25 Thread David Edelsohn

On Mon, Nov 24, 2014 at 10:18 PM, Segher Boessenkool
seg...@kernel.crashing.org wrote:
 Tested as usual...  okay for trunk?


 Segher


 2014-11-24  Segher Boessenkool  seg...@kernel.crashing.org

 gcc/
 * config/rs6000/sysv4.h (ASM_OUTPUT_REG_POP): Use addi instead
 of addic.

Okay.

Thanks, David

Re: [PATCH] rs6000: Remove iorxor/IORXOR code attrs

2014-11-25 Thread David Edelsohn

On Mon, Nov 24, 2014 at 10:11 PM, Segher Boessenkool
seg...@kernel.crashing.org wrote:
 As Richard pointed out, those do nothing more than code/CODE.
 Tested etc.; okay for trunk?


 Segher


 2014-11-21  Segher Boessenkool  seg...@kernel.crashing.org

 gcc/
 * config/rs6000/rs6000.md (iorxor, IORXOR): Delete code_attrs.
 (rest of file): Replace those with code resp. CODE.

Okay.

thanks, David

Re: [PATCH] AIX: Filename-based shared library versioning for libgcc_s

2014-11-25 Thread David Edelsohn

On Tue, Nov 11, 2014 at 10:42 AM, Michael Haubenwallner
michael.haubenwall...@ssi-schaefer.com wrote:


 On 11/11/2014 04:02 PM, David Edelsohn wrote:
 Michael,

 Why does the configure change match with p*-*-aix... instead of power*
 or powerpc*?  Yes, it's unique and will match, but why make it as
 short as possible, which doesn't match other uses?

 Actually I did have powerpc* initially, but gmp-6.0.0 config.guess'ed
 power7-ibm-aix6.1.0.0 now. Then I've thought that one may use ppc as well,
 but now I see this config.sub's to powerpc anyway, so power* is fine.
 Patch updated.

 In your documentation, how are you distinguishing between Dynamic
 Linking and Runtime Linking?

 I've tried to use the same naming scheme as in the ld Command Reference
 and the dlopen Subroutine man pages.

 Actually, there is
 at linktime:
   Dynamic Linking: also known as Dynamic Mode or (more common)
Shared Linking: record a shared object's name into the created binary

 at runtime:
   Runtime Loading: load these shared objects at process startup
   Runtime Linking: resolve the symbols after loading shared objects
   Dynamic Loading: load shared objects by application logic with dlopen()

 I'm unsure how to make this as clear as possible though.

Now that things have calmed down with respect to breakage on AIX, the
patch for building libgcc_s is okay.

Thanks, David

[PATCH][AArch64]Fix ICE at -O0 on vld1_lane intrinsics

2014-11-25 Thread Alan Lawrence

vld1_lane intrinsics ICE at -O0 because they contain a call to the vset_lane 
intrinsics, through which the lane index is not constant-propagated. (They are 
fine at -O1 and higher!). This fixes the ICE by replacing said call by a macro.


Rather than defining many individual macros 
__aarch64_vset(q?)_lane_[uspf](8|16|32|64), instead this introduces a 
__AARCH64_NUM_LANES macro using sizeof(), such that a single 
__aarch64_vset_lane_any macro handles all variants (with bounds-checking and 
endianness-flipping). This reduces potential for error vs. writing the number of 
lanes for each variant by hand as previously.


Also factor the endianness-flipping out to a separate macro __aarch64_lane; I 
intend to use this for vget_lane too in another patch.
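
The lane-count and lane-flipping idea can be sketched without Neon
types.  In the sketch below toy_vec, num_lanes, lane and set_lane are
illustrative names, the endianness is a run-time flag rather than
__AARCH64EB__, and the bounds check done by the real macro is left
out.

```cpp
#include <cstddef>

// Toy fixed-size vector standing in for a Neon vector type.
template <typename T, std::size_t N>
struct toy_vec
{
  T e[N];
  T &operator[] (std::size_t i) { return e[i]; }
  const T &operator[] (std::size_t i) const { return e[i]; }
};

// Same trick as __AARCH64_NUM_LANES: derive the lane count from the
// vector's size instead of writing it out per type.
template <typename Vec>
std::size_t num_lanes (const Vec &v)
{
  return sizeof (v) / sizeof (v[0]);
}

// Map an architectural lane index to a storage index; on big-endian the
// order is reversed.
inline std::size_t lane (std::size_t lanes, std::size_t idx, bool big_endian)
{
  return big_endian ? lanes - 1 - idx : idx;
}

// One generic setter covers every element type and lane count.
template <typename Vec, typename Elem>
Vec set_lane (Vec v, std::size_t idx, Elem val, bool big_endian)
{
  v[lane (num_lanes (v), idx, big_endian)] = val;
  return v;
}
```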


Tested with check-gcc on aarch64-none-elf and aarch64_be-none-elf (including new 
test that FAILs without this patch).


Ok for trunk?


gcc/ChangeLog:

* config/aarch64/arm_neon.h (__AARCH64_NUM_LANES, __aarch64_lane *2):
New.
(aarch64_vset_lane_any): Redefine using previous, same for BE + LE.
(vset_lane_f32, vset_lane_f64, vset_lane_p8, vset_lane_p16,
vset_lane_s8, vset_lane_s16, vset_lane_s32, vset_lane_s64,
vset_lane_u8, vset_lane_u16, vset_lane_u32, vset_lane_u64): Remove
number of lanes.
(vld1_lane_f32, vld1_lane_f64, vld1_lane_p8, vld1_lane_p16,
vld1_lane_s8, vld1_lane_s16, vld1_lane_s32, vld1_lane_s64,
vld1_lane_u8, vld1_lane_u16, vld1_lane_u32, vld1_lane_u64): Call
__aarch64_vset_lane_any rather than vset_lane_xxx.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/vld1_lane-o0.c: New test.

diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 921a5db..1291a8d 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -604,173 +604,28 @@ typedef struct poly16x8x4_t
 #define __aarch64_vdupq_laneq_u64(__a, __b) \
__aarch64_vdup_lane_any (u64, q, q, __a, __b)
 
-/* vset_lane and vld1_lane internal macro.  */
+/* Internal macro for lane indices.  */
+
+#define __AARCH64_NUM_LANES(__v) (sizeof (__v) / sizeof (__v[0]))
 
-#ifdef __AARCH64EB__
 /* For big-endian, GCC's vector indices are the opposite way around
to the architectural lane indices used by Neon intrinsics.  */
-#define __aarch64_vset_lane_any(__vec, __index, __val, __lanes) \
-  __extension__			\
-  ({\
-__builtin_aarch64_im_lane_boundsi (__index, __lanes);	\
-__vec[__lanes - 1 - __index] = __val;			\
-__vec;			\
-  })
+#ifdef __AARCH64EB__
+#define __aarch64_lane(__vec, __idx) (__AARCH64_NUM_LANES (__vec) - 1 - __idx)
 #else
-#define __aarch64_vset_lane_any(__vec, __index, __val, __lanes) \
-  __extension__			\
-  ({\
-__builtin_aarch64_im_lane_boundsi (__index, __lanes);	\
-__vec[__index] = __val;	\
-__vec;			\
-  })
+#define __aarch64_lane(__vec, __idx) __idx
 #endif
 
-/* vset_lane  */
-
-__extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
-vset_lane_f32 (float32_t __elem, float32x2_t __vec, const int __index)
-{
-  return __aarch64_vset_lane_any (__vec, __index, __elem, 2);
-}
-
-__extension__ static __inline float64x1_t __attribute__ ((__always_inline__))
-vset_lane_f64 (float64_t __elem, float64x1_t __vec, const int __index)
-{
-  return __aarch64_vset_lane_any (__vec, __index, __elem, 1);
-}
-
-__extension__ static __inline poly8x8_t __attribute__ ((__always_inline__))
-vset_lane_p8 (poly8_t __elem, poly8x8_t __vec, const int __index)
-{
-  return __aarch64_vset_lane_any (__vec, __index, __elem, 8);
-}
-
-__extension__ static __inline poly16x4_t __attribute__ ((__always_inline__))
-vset_lane_p16 (poly16_t __elem, poly16x4_t __vec, const int __index)
-{
-  return __aarch64_vset_lane_any (__vec, __index, __elem, 4);
-}
-
-__extension__ static __inline int8x8_t __attribute__ ((__always_inline__))
-vset_lane_s8 (int8_t __elem, int8x8_t __vec, const int __index)
-{
-  return __aarch64_vset_lane_any (__vec, __index, __elem, 8);
-}
-
-__extension__ static __inline int16x4_t __attribute__ ((__always_inline__))
-vset_lane_s16 (int16_t __elem, int16x4_t __vec, const int __index)
-{
-  return __aarch64_vset_lane_any (__vec, __index, __elem, 4);
-}
-
-__extension__ static __inline int32x2_t __attribute__ ((__always_inline__))
-vset_lane_s32 (int32_t __elem, int32x2_t __vec, const int __index)
-{
-  return __aarch64_vset_lane_any (__vec, __index, __elem, 2);
-}
-
-__extension__ static __inline int64x1_t __attribute__ ((__always_inline__))
-vset_lane_s64 (int64_t __elem, int64x1_t __vec, const int __index)
-{
-  return __aarch64_vset_lane_any (__vec, __index, __elem, 1);
-}
-
-__extension__ static __inline uint8x8_t __attribute__ ((__always_inline__))
-vset_lane_u8 (uint8_t __elem, uint8x8_t __vec, const int __index)
-{
-  return __aarch64_vset_lane_any (__vec, __index, __elem, 8);
-}
-
-__extension__ static __inline uint16x4_t __attribute__ ((__always_inline__))

[PATCH] Fix PR62238

2014-11-25 Thread Richard Biener


I will test the following patch fixing a tree sharing issue in
PR62238 and plugging a SSA name leak.  The issue here is that
force_gimple_operand and friends modify trees in-place, injecting
SSA name uses into them.  If you end up not emitting their definitions
or end up re-using those trees in inappropriate places you'll
break things.

Fixed by unsharing the tree.

The following also plugs the SSA name leak which makes the SSA
verifier ICE become a segfault (a released SSA name leaked into
a tree used otherwise).
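
To illustrate the sharing hazard described above, here is a toy sketch in
plain C.  It is not GCC code: struct expr, lower_in_place and copy_expr are
hypothetical stand-ins for a tree node, for a force_gimple_operand-style
routine that rewrites its argument in place, and for unshare_expr.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy expression node (hypothetical, not a GCC tree).  Leaves have kind
   'c' (constant) or 't' (temporary); inner nodes have kind '+'.  */
struct expr
{
  char kind;
  int value;                    /* constant value or temporary number */
  struct expr *op0, *op1;
};

static struct expr *
make_expr (char kind, int value, struct expr *op0, struct expr *op1)
{
  struct expr *e = malloc (sizeof *e);
  if (e == NULL)
    abort ();
  e->kind = kind;
  e->value = value;
  e->op0 = op0;
  e->op1 = op1;
  return e;
}

/* Deep copy, playing the role of unshare_expr in this sketch.  */
static struct expr *
copy_expr (const struct expr *e)
{
  if (e == NULL)
    return NULL;
  return make_expr (e->kind, e->value, copy_expr (e->op0), copy_expr (e->op1));
}

/* Replace every nested '+' operand of E by a fresh temporary, modifying E
   in place.  Returns the number of temporaries created.  The replaced
   subtrees are leaked; this toy ignores memory management.  */
static int
lower_in_place (struct expr *e, int *next_temp)
{
  int created = 0;
  if (e == NULL || e->kind != '+')
    return 0;
  assert (e->op0 != NULL && e->op1 != NULL);
  if (e->op0->kind == '+')
    {
      created += lower_in_place (e->op0, next_temp) + 1;
      e->op0 = make_expr ('t', (*next_temp)++, NULL, NULL);
    }
  if (e->op1->kind == '+')
    {
      created += lower_in_place (e->op1, next_temp) + 1;
      e->op1 = make_expr ('t', (*next_temp)++, NULL, NULL);
    }
  return created;
}

/* Count the temporaries referenced by E.  */
static int
count_temps (const struct expr *e)
{
  if (e == NULL)
    return 0;
  return (e->kind == 't') + count_temps (e->op0) + count_temps (e->op1);
}
```

Lowering a copy_expr'd copy leaves the original free of temporaries, whereas
lowering the original directly changes what every other holder of that
pointer sees, which is the situation the patch avoids.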

Richard.

2014-11-25  Richard Biener  rguent...@suse.de

PR tree-optimization/62238
* tree-predcom.c (ref_at_iteration): Unshare the expression
before gimplifying it.
(prepare_initializers_chain): Discard unused seq.

* gcc.dg/torture/pr62238.c: New testcase.

Index: gcc/tree-predcom.c
===
--- gcc/tree-predcom.c  (revision 218019)
+++ gcc/tree-predcom.c  (working copy)
@@ -1402,8 +1402,8 @@ ref_at_iteration (data_reference_p dr, i
 off = size_binop (PLUS_EXPR, off,
  size_binop (MULT_EXPR, DR_STEP (dr), ssize_int (iter)));
   tree addr = fold_build_pointer_plus (DR_BASE_ADDRESS (dr), off);
-  addr = force_gimple_operand_1 (addr, stmts, is_gimple_mem_ref_addr,
-NULL_TREE);
+  addr = force_gimple_operand_1 (unshare_expr (addr), stmts,
+is_gimple_mem_ref_addr, NULL_TREE);
   tree alias_ptr = fold_convert (reference_alias_ptr_type (DR_REF (dr)), coff);
   /* While data-ref analysis punts on bit offsets it still handles
  bitfield accesses at byte boundaries.  Cope with that.  Note that
@@ -2354,7 +2354,6 @@ prepare_initializers_chain (struct loop
   unsigned i, n = (chain->type == CT_INVARIANT) ? 1 : chain->length;
   struct data_reference *dr = get_chain_root (chain)->ref;
   tree init;
-  gimple_seq stmts;
   dref laref;
   edge entry = loop_preheader_edge (loop);
 
@@ -2378,12 +2377,17 @@ prepare_initializers_chain (struct loop
 
   for (i = 0; i < n; i++)
 {
+  gimple_seq stmts = NULL;
+
   if (chain->inits[i] != NULL_TREE)
continue;
 
   init = ref_at_iteration (dr, (int) i - n, &stmts);
   if (!chain->all_always_accessed && tree_could_trap_p (init))
-   return false;
+   {
+ gimple_seq_discard (stmts);
+ return false;
+   }
 
   if (stmts)
gsi_insert_seq_on_edge_immediate (entry, stmts);
Index: gcc/testsuite/gcc.dg/torture/pr62238.c
===
--- gcc/testsuite/gcc.dg/torture/pr62238.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr62238.c  (working copy)
@@ -0,0 +1,30 @@
+/* { dg-do run } */
+
+int a[4], b, c, d; 
+
+int
+fn1 (int p)
+{
+  for (; d; d++)
+{
+  unsigned int h;
+  for (h = 0; h < 3; h++)
+   {
+ if (a[c+c+h])
+   {
+ if (p)
+   break;
+ return 0;
+   }
+ b = 0;
+   }
+}
+  return 0;
+}
+
+int
+main ()
+{
+  fn1 (0);
+  return 0;
+}

[PATCH] libgcc: Add CFI directives to the floating point support code for ARM.

2014-11-25 Thread Martin Galvan

This patch adds CFI directives to the floating point support code for ARM.

Previously, if we tried to do a backtrace from that code in a debug session we'd
get something like this:

(gdb) bt
#0  __nedf2 () at 
../../../../../../gcc-4.9.2/libgcc/config/arm/ieee754-df.S:1082
#1  0x0db6 in __aeabi_cdcmple () at 
../../../../../../gcc-4.9.2/libgcc/config/arm/ieee754-df.S:1158
#2  0xf5c28f5c in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Now we'll get something like this:

(gdb) bt
#0  __nedf2 () at 
../../../../../../gcc-4.9.2/libgcc/config/arm/ieee754-df.S:1156
#1  0x0db6 in __aeabi_cdcmple () at 
../../../../../../gcc-4.9.2/libgcc/config/arm/ieee754-df.S:1263
#2  0x0dc8 in __aeabi_dcmpeq () at 
../../../../../../gcc-4.9.2/libgcc/config/arm/ieee754-df.S:1285
#3  0x0504 in main ()
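
As a rough illustration of the bookkeeping that CFI directives of this kind
perform, here is a small C model.  It assumes the GNU as semantics of
.cfi_adjust_cfa_offset (add to the current CFA offset) and .cfi_rel_offset
(a save slot given relative to the CFA register, recorded relative to the
CFA).  The struct and function names are hypothetical and only mirror the
directive names.

```c
#include <assert.h>

/* Hypothetical model of the unwind state: the CFA is sp + cfa_offset,
   and each saved register has a slot at CFA + saved_at[reg].  */
struct cfi_state
{
  int cfa_offset;
  int saved_at[16];
  int is_saved[16];
};

/* Models .cfi_adjust_cfa_offset DELTA.  */
static void
cfi_adjust_cfa_offset (struct cfi_state *s, int delta)
{
  s->cfa_offset += delta;
}

/* Models .cfi_rel_offset REG, SP_OFFSET: the register lives at
   sp + SP_OFFSET, i.e. at CFA + (SP_OFFSET - cfa_offset).  */
static void
cfi_rel_offset (struct cfi_state *s, int reg, int sp_offset)
{
  assert (reg >= 0 && reg < 16);
  s->saved_at[reg] = sp_offset - s->cfa_offset;
  s->is_saved[reg] = 1;
}

/* Models the annotations that follow a push of {r4, r5, lr}, starting
   from a state where the CFA is sp + PREVIOUS_OFFSET.  */
static struct cfi_state
model_push_r4_r5_lr (int previous_offset)
{
  struct cfi_state s = { 0 };
  s.cfa_offset = previous_offset;
  cfi_adjust_cfa_offset (&s, 12);   /* three words pushed: sp -= 12 */
  cfi_rel_offset (&s, 4, 0);        /* r4 at sp + 0 */
  cfi_rel_offset (&s, 5, 4);        /* r5 at sp + 4 */
  cfi_rel_offset (&s, 14, 8);       /* lr (r14) at sp + 8 */
  return s;
}
```

With a previous offset of 0, the model puts the CFA at sp + 12 and lr at
CFA - 4, which is the information a debugger needs to find the return
address from inside the function.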

I don't have write access, so it'd be nice if someone could commit this one for 
me after reviewing.

Thanks a lot!

libgcc/ChangeLog:
2014-11-25 Martin Galvan martin.gal...@tallertechnologies.com

  * config/arm/lib1funcs.S (CFI_START_FUNCTION, CFI_END_FUNCTION):
  New macros.
  * config/arm/ieee754-df.S: Add CFI directives.
  * config/arm/ieee754-sf.S: Add CFI directives.

diff --git a/libgcc/config/arm/ieee754-df.S b/libgcc/config/arm/ieee754-df.S
index 1c45a39..5b34a04 100644
--- a/libgcc/config/arm/ieee754-df.S
+++ b/libgcc/config/arm/ieee754-df.S
@@ -33,8 +33,12 @@
  * Only the default rounding mode is intended for best performances.
  * Exceptions aren't supported yet, but that can be added quite easily
  * if necessary without impacting performances.
+ *
+ * In the CFI related comments, 'previousOffset' refers to the previous offset
+ * from sp used to compute the CFA.
  */

+.cfi_sections .debug_frame

 #ifndef __ARMEB__
 #define xl r0
@@ -53,11 +57,13 @@

 ARM_FUNC_START negdf2
 ARM_FUNC_ALIAS aeabi_dneg negdf2
+CFI_START_FUNCTION

 @ flip sign bit
 eor xh, xh, #0x8000
 RET

+CFI_END_FUNCTION
 FUNC_END aeabi_dneg
 FUNC_END negdf2

@@ -66,6 +72,7 @@ ARM_FUNC_ALIAS aeabi_dneg negdf2
 #ifdef L_arm_addsubdf3

 ARM_FUNC_START aeabi_drsub
+CFI_START_FUNCTION

 eor xh, xh, #0x8000 @ flip sign bit of first arg
 b   1f
@@ -81,7 +88,11 @@ ARM_FUNC_ALIAS aeabi_dsub subdf3
 ARM_FUNC_START adddf3
 ARM_FUNC_ALIAS aeabi_dadd adddf3

-1:  do_push {r4, r5, lr}
+1:  do_push {r4, r5, lr}@ sp -= 12
+.cfi_adjust_cfa_offset 12   @ CFA is now sp + previousOffset + 12
+.cfi_rel_offset r4, 0   @ Registers are saved from sp to sp + 8
+.cfi_rel_offset r5, 4
+.cfi_rel_offset lr, 8

 @ Look for zeroes, equal values, INF, or NAN.
 shift1  lsl, r4, xh, #1
@@ -148,6 +159,11 @@ ARM_FUNC_ALIAS aeabi_dadd adddf3
 @ Since this is not common case, rescale them off line.
 teq r4, r5
 beq LSYM(Lad_d)
+
+@ CFI note: we're lucky that the branches to Lad_* that appear after this function
+@ have a CFI state that's exactly the same as the one we're in at this
+@ point. Otherwise the CFI would change to a different state after the branch,
+@ which would be disastrous for backtracing.
 LSYM(Lad_x):

 @ Compensate for the exponent overlapping the mantissa MSB added later
@@ -413,6 +429,7 @@ LSYM(Lad_i):
 orrne   xh, xh, #0x0008 @ quiet NAN
 RETLDM  r4, r5

+CFI_END_FUNCTION
 FUNC_END aeabi_dsub
 FUNC_END subdf3
 FUNC_END aeabi_dadd
@@ -420,12 +437,19 @@ LSYM(Lad_i):

 ARM_FUNC_START floatunsidf
 ARM_FUNC_ALIAS aeabi_ui2d floatunsidf
+CFI_START_FUNCTION

 teq r0, #0
 do_it   eq, t
 moveq   r1, #0
 RETc(eq)
-do_push {r4, r5, lr}
+
+do_push {r4, r5, lr}@ sp -= 12
+.cfi_adjust_cfa_offset 12   @ CFA is now sp + previousOffset + 12
+.cfi_rel_offset r4, 0   @ Registers are saved from sp + 0 to sp + 8.
+.cfi_rel_offset r5, 4
+.cfi_rel_offset lr, 8
+
 mov r4, #0x400  @ initial exponent
 add r4, r4, #(52-1 - 1)
 mov r5, #0  @ sign bit is 0
@@ -435,17 +459,25 @@ ARM_FUNC_ALIAS aeabi_ui2d floatunsidf
 mov xh, #0
 b   LSYM(Lad_l)

+CFI_END_FUNCTION
 FUNC_END aeabi_ui2d
 FUNC_END floatunsidf

 ARM_FUNC_START floatsidf
 ARM_FUNC_ALIAS aeabi_i2d floatsidf
+CFI_START_FUNCTION

 teq r0, #0
 do_it   eq, t
 moveq   r1, #0
 RETc(eq)
-do_push {r4, r5, lr}
+
+do_push {r4, r5, lr}@ sp -= 12
+.cfi_adjust_cfa_offset 12   @ CFA is now sp + previousOffset + 12
+.cfi_rel_offset r4, 0   @ Registers are saved from sp + 0 to sp + 8.
+.cfi_rel_offset r5, 4
+.cfi_rel_offset lr, 8
+
 mov r4, #0x400  @ initial exponent
 add r4, r4, #(52-1 - 1)
 andsr5, r0, #0x8000 @ sign bit in r5
@@ -457,11 +489,13 @@ ARM_FUNC_ALIAS aeabi_i2d floatsidf
 mov xh, #0
 b   LSYM(Lad_l)

+CFI_END_FUNCTION
 FUNC_END aeabi_i2d
 FUNC_END floatsidf

 ARM_FUNC_START extendsfdf2
 ARM_FUNC_ALIAS aeabi_f2d extendsfdf2
+

Re: [PATCH] gcc parallel make check

2014-11-25 Thread Tom de Vries


On 15-09-14 18:05, Jakub Jelinek wrote:

libstdc++-v3/
* testsuite/Makefile.am (check_p_numbers0, check_p_numbers1,
check_p_numbers2, check_p_numbers3, check_p_numbers4,
check_p_numbers5, check_p_numbers6, check_p_numbers,
check_p_subdirs): New variables.
(check_DEJAGNU_normal_targets): Use check_p_subdirs.
(check-DEJAGNU): Rewritten so that for parallelized
testing each job runs all the *.exp files, with
GCC_RUNTEST_PARALLELIZE_DIR set in environment.
* testsuite/Makefile.in: Regenerated.
* testsuite/lib/libstdc++.exp (gcc_parallel_test_run_p,
gcc_parallel_test_enable): New procedures.  If
GCC_RUNTEST_PARALLELIZE_DIR is set in environment, override
runtest_file_p to invoke also gcc_parallel_test_run_p.
* testsuite/libstdc++-abi/abi.exp: Run all the tests serially
by the first parallel runtest encountering it.  Fix up path
of the extract_symvers script.
* testsuite/libstdc++-xmethods/xmethods.exp: Run all the tests
serially by the first parallel runtest encountering it.  Run
dg-finish even in case of error.


When comparing test results of patch builds with test results of reference 
builds, the only differences I'm seeing are random differences in the number of 
'UNSUPPORTED: prettyprinters.exp' messages.


This patch fixes that by ensuring that we print that unsupported message only 
once.

The resulting test result comparison diff is:
...
--- without/FAIL  2014-11-24 17:46:32.202673282 +0100
+++ with/FAIL 2014-11-25 13:45:15.636131571 +0100
 libstdc++-v3/testsuite/libstdc++.sum:UNSUPPORTED: prettyprinters.exp
-libstdc++-v3/testsuite/libstdc++.sum:UNSUPPORTED: prettyprinters.exp
-libstdc++-v3/testsuite/libstdc++.sum:UNSUPPORTED: prettyprinters.exp
-libstdc++-v3/testsuite/libstdc++.sum:UNSUPPORTED: prettyprinters.exp
-libstdc++-v3/testsuite/libstdc++.sum:UNSUPPORTED: prettyprinters.exp
 libstdc++-v3/testsuite/libstdc++.sum:UNSUPPORTED: xmethods.exp
...

Furthermore, the patch adds a dg-finish in case the prettyprinters.exp file is 
unsupported, which AFAIU is also required in that case.


Bootstrapped and reg-tested on x86_64.

OK for trunk/stage3?

Thanks,
- Tom


2014-11-25  Tom de Vries  t...@codesourcery.com

	* testsuite/libstdc++-prettyprinters/prettyprinters.exp: Add missing
	dg-finish.  Only print unsupported message once.
---
 libstdc++-v3/testsuite/libstdc++-prettyprinters/prettyprinters.exp | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/libstdc++-v3/testsuite/libstdc++-prettyprinters/prettyprinters.exp b/libstdc++-v3/testsuite/libstdc++-prettyprinters/prettyprinters.exp
index a57660f..e5be5b5 100644
--- a/libstdc++-v3/testsuite/libstdc++-prettyprinters/prettyprinters.exp
+++ b/libstdc++-v3/testsuite/libstdc++-prettyprinters/prettyprinters.exp
@@ -30,7 +30,14 @@ if ![info exists ::env(GUALITY_GDB_NAME)] {
 }
 
 if {! [gdb_version_check]} {
+dg-finish
+# Only print unsupported message in one instance.
+if ![gcc_parallel_test_run_p prettyprinters] {
+	return
+}
+gcc_parallel_test_enable 0
 unsupported prettyprinters.exp
+gcc_parallel_test_enable 1
 return
 }
 
-- 
1.9.1

C++ PATCH to lookup_template_variable

2014-11-25 Thread Jason Merrill

We need to use unknown_type_node for non-dependent arguments, too; we 
don't know what type the variable has until we look up the specialization.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit c348ed4ea7152054ff623a3efbca7fab49227a5f
Author: Jason Merrill ja...@redhat.com
Date:   Mon Nov 24 19:14:04 2014 -0500

	* pt.c (lookup_template_variable): Always unknown_type_node.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 1d6b916..29fb2e1 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -8026,19 +8026,14 @@ lookup_template_class (tree d1, tree arglist, tree in_decl, tree context,
   return ret;
 }
 
-/* Return a TEMPLATE_ID_EXPR for the given variable template and ARGLIST. 
-   If the ARGLIST refers to any template parameters, the type of the
-   expression is the unknown_type_node since the template-id could
-   refer to an explicit or partial specialization. */
+/* Return a TEMPLATE_ID_EXPR for the given variable template and ARGLIST.
+   The type of the expression is the unknown_type_node since the
+   template-id could refer to an explicit or partial specialization. */
 
 tree
 lookup_template_variable (tree templ, tree arglist)
 {
-  tree type;
-  if (uses_template_parms (arglist))
-type = unknown_type_node;
-  else
-type = TREE_TYPE (templ);
+  tree type = unknown_type_node;
   tsubst_flags_t complain = tf_warning_or_error;
   tree parms = INNERMOST_TEMPLATE_PARMS (DECL_TEMPLATE_PARMS (templ));
   arglist = coerce_template_parms (parms, arglist, templ, complain,
diff --git a/gcc/testsuite/g++.dg/cpp1y/var-templ17.C b/gcc/testsuite/g++.dg/cpp1y/var-templ17.C
new file mode 100644
index 000..c6d97eb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/var-templ17.C
@@ -0,0 +1,9 @@
+// DR 1727: a specialization doesn't need to have the same type
+// { dg-do compile { target c++14 } }
+
+template <class T> T t = 42;
+template<> void* t<int> = 0;
+
+template<class T, class U> struct same;
+template<class T> struct same<T,T> {};
+same<void*,decltype(t<int>)> s;

RE: [PATCH][MIPS] Fix P5600 memory cost

2014-11-25 Thread Matthew Fortune

Hi Prachi,
 
 OK with fixes to the changelog entry:
 
 latency not latency. Remember to tab in the changelog entry and split the
 line as it will exceed 80 chars. Also two spaces between the date/name and
 name/email. E.g.
 
 2014-11-05  Prachi Godbole  prachi.godb...@imgtec.com
 
   * config/mips/mips.c (mips_rtx_cost_data): Fix memory_latency cost
 for
   p5600.

I can't see this committed in svn trunk, did you find a problem with the patch?

Thanks,
Matthew

Re: [PATCH] sreal class fix for PR64050 and PR64060

2014-11-25 Thread Richard Biener

On Tue, Nov 25, 2014 at 1:55 PM, Martin Liška mli...@suse.cz wrote:
 Hello.

 The following patch fixes the sreal problems mentioned in PR64050 and
 PR64060.
 I added a new GCC plugin test where I test sreal arithmetic and number
 comparison.

 The patch bootstraps on ppc64-linux-pc and x86_64-linux-pc and passes
 regression tests.

Ok.

Thanks,
Richard.

 Thanks,
 Martin

Re: [PATCH] Add verify_sese

2014-11-25 Thread Tom de Vries


On 25-11-14 10:28, Richard Biener wrote:

On Tue, Nov 25, 2014 at 1:01 AM, Tom de Vries tom_devr...@mentor.com wrote:

Richard,

I ran into a problem with my oacc kernels directive patch series where
tail-merge added another entry into a region that was previously
single-entry-single-exit.

That resulted in hitting this assert in calc_dfs_tree:
...
   /* This aborts e.g. when there is _no_ path from ENTRY to EXIT at all.  */
   gcc_assert (di->nodes == (unsigned int) n_basic_blocks_for_fn (cfun) - 1);
...
during a call to move_sese_region_to_fn.

This patch makes sure that we abort earlier, with a clearer message of what
is actually wrong.

Bootstrapped and reg-tested on x86_64.

OK for trunk/stage3?


I believe someone made the function work for SEME regions and I believe
it is actually used to copy loops with multiple exits


This is the first part of the function comment for move_sese_region_to_fn:
...
/* Move a single-entry, single-exit region delimited by ENTRY_BB and
   EXIT_BB to function DEST_CFUN.  The whole region is replaced by a
   single basic block in the original CFG and the new basic block is
   returned.  DEST_CFUN must not have a CFG yet.

   Note that the region need not be a pure SESE region.  Blocks inside
   the region may contain calls to abort/exit.  The only restriction
   is that ENTRY_BB should be the only entry point and it must
   dominate EXIT_BB.
...

I'm guessing you're referring to the 'not pure SESE region' bit?

So in fact, it's not a single-entry-single-exit region, but more a 
single-entry-at-most-one-continuation region. [ Note that in the case of e.g. an 
infinite loop, we can also have single entry, no continuation. ]



so I don't see how the
patch can work in these cases?



The bbs with calls to abort/exit don't have any successor edges. verify_sese 
doesn't assert anything specific about such bbs.
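
To make the "single entry, at most one continuation" property concrete, here
is a toy check in C.  It is only a sketch and not the verify_sese from the
patch: struct toy_cfg and toy_region_ok are hypothetical, and the CFG is a
plain adjacency matrix rather than GCC's basic_block/edge structures.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BB 8

/* Hypothetical toy CFG: edge[src][dst] is true if there is an edge.  */
struct toy_cfg
{
  int n_blocks;
  bool edge[MAX_BB][MAX_BB];
};

/* Return true if every edge entering the region targets ENTRY_BB and
   every edge leaving the region originates from EXIT_BB.  Blocks without
   successors (e.g. ones ending in a call to abort) pass trivially.  */
static bool
toy_region_ok (const struct toy_cfg *cfg, const bool in_region[MAX_BB],
               int entry_bb, int exit_bb)
{
  assert (cfg->n_blocks <= MAX_BB);
  assert (in_region[entry_bb] && in_region[exit_bb]);
  for (int src = 0; src < cfg->n_blocks; src++)
    for (int dst = 0; dst < cfg->n_blocks; dst++)
      {
        if (!cfg->edge[src][dst])
          continue;
        /* An edge into the region must land on the entry block.  */
        if (!in_region[src] && in_region[dst] && dst != entry_bb)
          return false;
        /* An edge out of the region must start at the exit block.  */
        if (in_region[src] && !in_region[dst] && src != exit_bb)
          return false;
      }
  return true;
}
```

A region containing a successor-less block is accepted, while adding a
second edge from outside into the middle of the region (the kind of extra
entry described above for tail-merge) makes the check fail.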


Thanks,
- Tom

Re: PATCH: PR rtl-optimization/64037: Miscompilation with -Os and enum class : char parameter

2014-11-25 Thread Richard Biener

On Tue, Nov 25, 2014 at 1:57 PM, H.J. Lu hongjiu...@intel.com wrote:
 Hi,

 The enclosed testcase fails on x86 when compiled with -Os since we pass
 a byte parameter with a byte load in caller and read it as an int in
 callee.  The reason it only shows up with -Os is x86 backend encodes
 a byte load with an int load if -O isn't used.  When a byte load is
 used, the upper 24 bits of the register have random values on targets
 that do not define WORD_REGISTER_OPERATIONS.

 It happens because setup_incoming_promotions in combine.c has

   /* The mode and signedness of the argument before any promotions happen
  (equal to the mode of the pseudo holding it at that stage).  */
   mode1 = TYPE_MODE (TREE_TYPE (arg));
   uns1 = TYPE_UNSIGNED (TREE_TYPE (arg));

   /* The mode and signedness of the argument after any source language and
  TARGET_PROMOTE_PROTOTYPES-driven promotions.  */
   mode2 = TYPE_MODE (DECL_ARG_TYPE (arg));
   uns3 = TYPE_UNSIGNED (DECL_ARG_TYPE (arg));

   /* The mode and signedness of the argument as it is actually passed,
  after any TARGET_PROMOTE_FUNCTION_ARGS-driven ABI promotions.  */
   mode3 = promote_function_mode (DECL_ARG_TYPE (arg), mode2, uns3,
  TREE_TYPE (cfun->decl), 0);

 while they are actually passed in register by assign_parm_setup_reg in
 function.c:

   /* Store the parm in a pseudoregister during the function, but we may
  need to do it in a wider mode.  Using 2 here makes the result
  consistent with promote_decl_mode and thus expand_expr_real_1.  */
   promoted_nominal_mode
 = promote_function_mode (data->nominal_type, data->nominal_mode, 
 unsignedp,
  TREE_TYPE (current_function_decl), 2);

 where nominal_type and nominal_mode are set up with TREE_TYPE (parm)
 and TYPE_MODE (nominal_type). TREE_TYPE here is

I think the bug is here, not in combine.c.  Can you try going back in history
for both snippets and see if they matched at some point?

Thanks,
Richard.

 (gdb) call debug_tree (type)
  enumeral_type 0x719f85e8 X
 type integer_type 0x718a93f0 unsigned char public unsigned
 string-flag QI
 size integer_cst 0x718a5fa8 constant 8
 unit size integer_cst 0x718a5fc0 constant 1
 align 8 symtab 0 alias set -1 canonical type 0x718a93f0
 precision 8 min integer_cst 0x718a5fd8 0 max integer_cst
 0x718a5f78 255
 static unsigned type_5 QI size integer_cst 0x718a5fa8 8 unit
 size integer_cst 0x718a5fc0 1
 align 8 symtab 0 alias set -1 canonical type 0x719f85e8
 precision 8 min integer_cst 0x718a5fd8 0 max integer_cst
 0x718a5f78 255
 values tree_list 0x719fb028
 purpose identifier_node 0x719f6738 V
 bindings (nil)
 local bindings (nil)
 value const_decl 0x718c21c0 V type enumeral_type
 0x719f85e8 X
 readonly constant used VOID file pr64037.ii line 2 col 3
 align 1 context enumeral_type 0x719f85e8 X initial
 integer_cst 0x719d8d08 2 context translation_unit_decl
 0x77ff91e0 D.1
 chain type_decl 0x719f5c78 X
 (gdb)

 and DECL_ARG_TYPE is

 (gdb) call debug_tree (type)
  integer_type 0x718a9690 int public SI
 size integer_cst 0x718a5e70 type integer_type 0x718a9150
 bitsizetype constant 32
 unit size integer_cst 0x718a5e88 type integer_type
 0x718a90a8 sizetype constant 4
 align 32 symtab 0 alias set 1 canonical type 0x718a9690
 precision 32 min integer_cst 0x718c60c0 -2147483648 max
 integer_cst 0x718c60d8 2147483647
 pointer_to_this pointer_type 0x718cb930
 (gdb)

 This mismatch makes combine think a byte parameter is passed as int
 in register and turns

 (insn 9 6 10 2 (set (reg:SI 92 [ b ])
 (zero_extend:SI (subreg:QI (reg:SI 91 [ b ]) 0))) pr64037.ii:9
 138 {*zero_extendqisi2}
  (expr_list:REG_DEAD (reg:SI 91 [ b ])
 (nil)))
 (insn 10 9 0 2 (set (mem:SI (reg/v/f:SI 88 [ out ]) [1 *out_4(D)+0 S4
 A32])
 (reg:SI 92 [ b ])) pr64037.ii:9 90 {*movsi_internal}
  (expr_list:REG_DEAD (reg:SI 92 [ b ])
 (expr_list:REG_DEAD (reg/v/f:SI 88 [ out ])
 (nil

 into

 Trying 9 -> 10:
 Successfully matched this instruction:
 (set (mem:SI (reg/v/f:SI 88 [ out ]) [1 *out_4(D)+0 S4 A32])
 (reg:SI 91 [ b ]))
 allowing combination of insns 9 and 10
 original costs 6 + 4 = 10
 replacement cost 4
 deferring deletion of insn with uid = 9.
 modifying insn i310: [r88:SI]=r91:SI
   REG_DEAD r91:SI
   REG_DEAD r88:SI


 This patch makes setup_incoming_promotions to match assign_parm_setup_reg.
 Tested on Linux/x86-64 without regressions.  OK for trunk and backport?

 Thanks.


 H.J.
 
 diff --git a/gcc/combine.c b/gcc/combine.c
 index 1808f97..a0449a2 100644
 --- a/gcc/combine.c
 +++ b/gcc/combine.c
 @@ -1561,8 +1561,8 @@ setup_incoming_promotions (rtx_insn *first)

Re: PATCH: PR rtl-optimization/64037: Miscompilation with -Os and enum class : char parameter

2014-11-25 Thread Richard Biener

On Tue, Nov 25, 2014 at 4:01 PM, Richard Biener
richard.guent...@gmail.com wrote:
 On Tue, Nov 25, 2014 at 1:57 PM, H.J. Lu hongjiu...@intel.com wrote:
 Hi,

 The enclosed testcase fails on x86 when compiled with -Os since we pass
 a byte parameter with a byte load in caller and read it as an int in
 callee.  The reason it only shows up with -Os is x86 backend encodes
 a byte load with an int load if -O isn't used.  When a byte load is
 used, the upper 24 bits of the register have random values on targets
 that do not define WORD_REGISTER_OPERATIONS.

 It happens because setup_incoming_promotions in combine.c has

   /* The mode and signedness of the argument before any promotions happen
  (equal to the mode of the pseudo holding it at that stage).  */
   mode1 = TYPE_MODE (TREE_TYPE (arg));
   uns1 = TYPE_UNSIGNED (TREE_TYPE (arg));

   /* The mode and signedness of the argument after any source language 
 and
  TARGET_PROMOTE_PROTOTYPES-driven promotions.  */
   mode2 = TYPE_MODE (DECL_ARG_TYPE (arg));
   uns3 = TYPE_UNSIGNED (DECL_ARG_TYPE (arg));

   /* The mode and signedness of the argument as it is actually passed,
  after any TARGET_PROMOTE_FUNCTION_ARGS-driven ABI promotions.  */
   mode3 = promote_function_mode (DECL_ARG_TYPE (arg), mode2, uns3,
  TREE_TYPE (cfun->decl), 0);

 while they are actually passed in register by assign_parm_setup_reg in
 function.c:

   /* Store the parm in a pseudoregister during the function, but we may
  need to do it in a wider mode.  Using 2 here makes the result
  consistent with promote_decl_mode and thus expand_expr_real_1.  */
   promoted_nominal_mode
 = promote_function_mode (data->nominal_type, data->nominal_mode, 
 unsignedp,
  TREE_TYPE (current_function_decl), 2);

 where nominal_type and nominal_mode are set up with TREE_TYPE (parm)
 and TYPE_MODE (nominal_type). TREE_TYPE here is

 I think the bug is here, not in combine.c.  Can you try going back in history
 for both snippets and see if they matched at some point?

Oh, and note that I think DECL_ARG_TYPE is something dangerous - it's meant
to be a source language ABI kind-of-thing.  Or rather an optimization
hint.  For example in C, when integral promotions are applied to call arguments,
this can be used to optimize sign-/zero-extensions in the callee.  Unless
something else overrides this (like the target which specifies the real ABI).
IIRC.
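
A toy model of the hazard discussed in this thread, in plain C rather than
RTL: the caller does a byte load into a 32-bit register, and the callee
either trusts that the argument was promoted to int or zero-extends it
itself.  All function names here are hypothetical, and the "register" is
just a uint32_t.

```c
#include <assert.h>
#include <stdint.h>

/* Caller side: a byte load only replaces the low 8 bits of the register;
   the upper 24 bits keep whatever was there before.  */
static uint32_t
byte_load (uint32_t reg, uint8_t value)
{
  return (reg & 0xffffff00u) | value;
}

/* Callee that assumes the caller already promoted the argument to int,
   so it uses the full register as-is.  */
static uint32_t
callee_assumes_promoted (uint32_t reg)
{
  return reg;
}

/* Callee that does not trust the upper bits and zero-extends the byte.  */
static uint32_t
callee_zero_extends (uint32_t reg)
{
  return (uint8_t) reg;
}

/* Small usage example: the enumerator value 2 passed through a register
   that previously held unrelated data.  */
void
demo_byte_parameter (void)
{
  uint32_t reg = byte_load (0xdeadbeefu, 2);
  assert (callee_zero_extends (reg) == 2);
  assert (callee_assumes_promoted (reg) != 2);
}
```

Dropping the zero-extension in the callee is only safe when the caller is
guaranteed to have written the whole register, which is the assumption the
two code snippets quoted in this thread disagree about.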

Richard.

 Thanks,
 Richard.

 (gdb) call debug_tree (type)
  enumeral_type 0x719f85e8 X
 type integer_type 0x718a93f0 unsigned char public unsigned
 string-flag QI
 size integer_cst 0x718a5fa8 constant 8
 unit size integer_cst 0x718a5fc0 constant 1
 align 8 symtab 0 alias set -1 canonical type 0x718a93f0
 precision 8 min integer_cst 0x718a5fd8 0 max integer_cst
 0x718a5f78 255
 static unsigned type_5 QI size integer_cst 0x718a5fa8 8 unit
 size integer_cst 0x718a5fc0 1
 align 8 symtab 0 alias set -1 canonical type 0x719f85e8
 precision 8 min integer_cst 0x718a5fd8 0 max integer_cst
 0x718a5f78 255
 values tree_list 0x719fb028
 purpose identifier_node 0x719f6738 V
 bindings (nil)
 local bindings (nil)
 value const_decl 0x718c21c0 V type enumeral_type
 0x719f85e8 X
 readonly constant used VOID file pr64037.ii line 2 col 3
 align 1 context enumeral_type 0x719f85e8 X initial
 integer_cst 0x719d8d08 2 context translation_unit_decl
 0x77ff91e0 D.1
 chain type_decl 0x719f5c78 X
 (gdb)

 and DECL_ARG_TYPE is

 (gdb) call debug_tree (type)
  integer_type 0x718a9690 int public SI
 size integer_cst 0x718a5e70 type integer_type 0x718a9150
 bitsizetype constant 32
 unit size integer_cst 0x718a5e88 type integer_type
 0x718a90a8 sizetype constant 4
 align 32 symtab 0 alias set 1 canonical type 0x718a9690
 precision 32 min integer_cst 0x718c60c0 -2147483648 max
 integer_cst 0x718c60d8 2147483647
 pointer_to_this pointer_type 0x718cb930
 (gdb)

 This mismatch makes combine think a byte parameter is passed as int
 in register and turns

 (insn 9 6 10 2 (set (reg:SI 92 [ b ])
 (zero_extend:SI (subreg:QI (reg:SI 91 [ b ]) 0))) pr64037.ii:9
 138 {*zero_extendqisi2}
  (expr_list:REG_DEAD (reg:SI 91 [ b ])
 (nil)))
 (insn 10 9 0 2 (set (mem:SI (reg/v/f:SI 88 [ out ]) [1 *out_4(D)+0 S4
 A32])
 (reg:SI 92 [ b ])) pr64037.ii:9 90 {*movsi_internal}
  (expr_list:REG_DEAD (reg:SI 92 [ b ])
 (expr_list:REG_DEAD (reg/v/f:SI 88 [ out ])
 (nil

 into

 Trying 9 -> 10:
 Successfully matched this instruction:
 (set (mem:SI (reg/v/f:SI 88 [ out ]) [1 *out_4(D)+0 S4 A32])
 (reg:SI 91 [ b ]))
 allowing combination of insns 9 and 10
 original costs 6 + 4 = 10
 replacement cost 4
 deferring deletion of insn

[PATCH, MIPS, COMMITTED] Testsuite fixes for soft-float configurations

2014-11-25 Thread Matthew Fortune

Re: https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02879.html

The new FPXX tests now work correctly for soft-float configurations.
Tests should only need to specify one of the 5 floating-point options
and any other options are then inferred from that. The FPXX tests
were the first tests to really rely on -mfp* options which is why
we hadn't seen this issue before. Using a -mfp option implies that
the test is hard-float and double float. To create a test that is
single-float then only the -msingle-float option should be used
without specifying a -mfp option. I have not done anything to improve
single-float testsuite support in this patch though.

I committed this via a git-svn bridge so fingers crossed it went in
correctly! I also used the ChangeLog merging script from the link
below which is a marvellous invention if others don't know of it.

https://gcc.gnu.org/wiki/GitMirror#git-merge-changelog

Thanks,
Matthew

gcc/testsuite/

* gcc.target/mips/mips.exp: Add support for -msoft-float
and -mhard-float options.  Ensure that explicit -mfp*
options imply both -mhard-float and -mdouble-float.
* gcc.target/mips/call-clobbered-1.c: Add -mhard-float to the
compile options.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@218047 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/testsuite/ChangeLog  | 8 
 gcc/testsuite/gcc.target/mips/call-clobbered-1.c | 2 +-
 gcc/testsuite/gcc.target/mips/mips.exp   | 8 
 3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index b577824..7b9b365 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,11 @@
+2014-11-25  Matthew Fortune  matthew.fort...@imgtec.com
+
+   * gcc.target/mips/mips.exp: Add support for -msoft-float and
+   -mhard-float options.  Ensure that explicit -mfp* options imply
+   both -mhard-float and -mdouble-float.
+   * gcc.target/mips/call-clobbered-1.c: Add -mhard-float to the
+   compile options.
+
 2014-11-25  Paolo Carlini  paolo.carl...@oracle.com
 
PR c++/63786
diff --git a/gcc/testsuite/gcc.target/mips/call-clobbered-1.c 
b/gcc/testsuite/gcc.target/mips/call-clobbered-1.c
index ecb994f..77294aa 100644
--- a/gcc/testsuite/gcc.target/mips/call-clobbered-1.c
+++ b/gcc/testsuite/gcc.target/mips/call-clobbered-1.c
@@ -1,6 +1,6 @@
 /* Check that we handle call-clobbered FPRs correctly.  */
 /* { dg-skip-if code quality test { *-*-* } { -O0 } {  } } */
-/* { dg-options isa=2 -mabi=32 -ffixed-f0 -ffixed-f1 -ffixed-f2 -ffixed-f3 
-ffixed-f4 -ffixed-f5 -ffixed-f6 -ffixed-f7 -ffixed-f8 -ffixed-f9 -ffixed-f10 
-ffixed-f11 -ffixed-f12 -ffixed-f13 -ffixed-f14 -ffixed-f15 -ffixed-f16 
-ffixed-f17 -ffixed-f18 -ffixed-f19 } */
+/* { dg-options isa=2 -mabi=32 -mhard-float -ffixed-f0 -ffixed-f1 -ffixed-f2 
-ffixed-f3 -ffixed-f4 -ffixed-f5 -ffixed-f6 -ffixed-f7 -ffixed-f8 -ffixed-f9 
-ffixed-f10 -ffixed-f11 -ffixed-f12 -ffixed-f13 -ffixed-f14 -ffixed-f15 
-ffixed-f16 -ffixed-f17 -ffixed-f18 -ffixed-f19 } */
 
 void bar (void);
 double a;
diff --git a/gcc/testsuite/gcc.target/mips/mips.exp 
b/gcc/testsuite/gcc.target/mips/mips.exp
index a9beb27..6ae71ad 100644
--- a/gcc/testsuite/gcc.target/mips/mips.exp
+++ b/gcc/testsuite/gcc.target/mips/mips.exp
@@ -234,6 +234,7 @@ set mips_option_groups {
 dump_pattern -dp
 endianness -E(L|B)|-me(l|b)
 float -m(hard|soft)-float
+fpu -m(double|single)-float
 forbid_cpu forbid_cpu=.*
 fp -mfp(32|xx|64)
 gp -mgp(32|64)
@@ -858,6 +859,8 @@ proc mips-dg-finish {} {
 #|   |
 # -modd-spreg -mno-odd-spreg
 #|   |
+# -mdouble-float  -msingle-float
+#|   |
 # -mabs=2008/-mabs=legacy no option
 #|   |
 # -mhard-float-msoft-float
@@ -947,7 +950,12 @@ proc mips-dg-options { args } {
 mips_option_dependency options -mips3d -mpaired-single
 mips_option_dependency options -mpaired-single -mfp64
 mips_option_dependency options -mfp64 -mhard-float
+mips_option_dependency options -mfp32 -mhard-float
+mips_option_dependency options -mfpxx -mhard-float
 mips_option_dependency options -mfp64 -modd-spreg
+mips_option_dependency options -mfp64 -mdouble-float
+mips_option_dependency options -mfp32 -mdouble-float
+mips_option_dependency options -mfpxx -mdouble-float
 mips_option_dependency options -mabs=2008 -mhard-float
 mips_option_dependency options -mabs=legacy -mhard-float
 mips_option_dependency options -mrelax-pic-calls -mno-plt
-- 
1.9.4

Re: [PATCH 1/2] teach mklog to get name / email from git config when available

2014-11-25 Thread Diego Novillo




On 20/11/2014, 16:51 , Tom de Vries wrote:


OK for trunk?


This is fine. Thanks.


Diego.

Re: [PATCH] Add verify_sese

2014-11-25 Thread Richard Biener

On Tue, Nov 25, 2014 at 3:59 PM, Tom de Vries tom_devr...@mentor.com wrote:
 On 25-11-14 10:28, Richard Biener wrote:

 On Tue, Nov 25, 2014 at 1:01 AM, Tom de Vries tom_devr...@mentor.com
 wrote:

 Richard,

 I ran into a problem with my oacc kernels directive patch series where
 tail-merge added another entry into a region that was previously
 single-entry-single-exit.

 That resulted in hitting this assert in calc_dfs_tree:
 ...
/* This aborts e.g. when there is _no_ path from ENTRY to EXIT at all.
 */
gcc_assert (di->nodes == (unsigned int) n_basic_blocks_for_fn (cfun) -
 1);
 ...
 during a call to move_sese_region_to_fn.

 This patch makes sure that we abort earlier, with a clearer message of
 what
 is actually wrong.

 Bootstrapped and reg-tested on x86_64.

 OK for trunk/stage3?


 I believe someone made the function work for SEME regions and I believe
 it is actually used to copy loops with multiple exits


 This is the first part of the function comment for move_sese_region_to_fn:
 ...
 /* Move a single-entry, single-exit region delimited by ENTRY_BB and
EXIT_BB to function DEST_CFUN.  The whole region is replaced by a
single basic block in the original CFG and the new basic block is
returned.  DEST_CFUN must not have a CFG yet.

Note that the region need not be a pure SESE region.  Blocks inside
the region may contain calls to abort/exit.  The only restriction
is that ENTRY_BB should be the only entry point and it must
dominate EXIT_BB.
 ...

 I'm guessing you're referring to the 'not pure SESE region' bit?

 So in fact, it's not a single-entry-single-exit region, but more a
 single-entry-at-most-one-continuation region. [ Note that in the case of e.g. an
 infinite loop, we can also have single entry, no continuation. ]

 so I don't see how the
 patch can work in these cases?


 The bbs with calls to abort/exit don't have any successor edges. verify_sese
 doesn't assert anything specific about such bbs.

Ah, indeed.

Patch is ok then.

Thanks,
Richard.

 Thanks,
 - Tom

Re: PATCH: PR rtl-optimization/64037: Miscompilation with -Os and enum class : char parameter

2014-11-25 Thread H.J. Lu

On Tue, Nov 25, 2014 at 7:01 AM, Richard Biener
richard.guent...@gmail.com wrote:
 On Tue, Nov 25, 2014 at 1:57 PM, H.J. Lu hongjiu...@intel.com wrote:
 Hi,

 The enclosed testcase fails on x86 when compiled with -Os since we pass
 a byte parameter with a byte load in caller and read it as an int in
 callee.  The reason it only shows up with -Os is x86 backend encodes
 a byte load with an int load if -O isn't used.  When a byte load is
 used, the upper 24 bits of the register hold random values on
 non-WORD_REGISTER_OPERATIONS targets.

 It happens because setup_incoming_promotions in combine.c has

   /* The mode and signedness of the argument before any promotions happen
  (equal to the mode of the pseudo holding it at that stage).  */
   mode1 = TYPE_MODE (TREE_TYPE (arg));
   uns1 = TYPE_UNSIGNED (TREE_TYPE (arg));

   /* The mode and signedness of the argument after any source language 
 and
  TARGET_PROMOTE_PROTOTYPES-driven promotions.  */
   mode2 = TYPE_MODE (DECL_ARG_TYPE (arg));
   uns3 = TYPE_UNSIGNED (DECL_ARG_TYPE (arg));

   /* The mode and signedness of the argument as it is actually passed,
  after any TARGET_PROMOTE_FUNCTION_ARGS-driven ABI promotions.  */
   mode3 = promote_function_mode (DECL_ARG_TYPE (arg), mode2, uns3,
  TREE_TYPE (cfun->decl), 0);

 while they are actually passed in register by assign_parm_setup_reg in
 function.c:

   /* Store the parm in a pseudoregister during the function, but we may
  need to do it in a wider mode.  Using 2 here makes the result
  consistent with promote_decl_mode and thus expand_expr_real_1.  */
   promoted_nominal_mode
 = promote_function_mode (data->nominal_type, data->nominal_mode,
 unsignedp,
  TREE_TYPE (current_function_decl), 2);

 where nominal_type and nominal_mode are set up with TREE_TYPE (parm)
 and TYPE_MODE (nominal_type). TREE_TYPE here is

 I think the bug is here, not in combine.c.  Can you try going back in history
 for both snippets and see if they matched at some point?


The bug was introduced by

https://gcc.gnu.org/ml/gcc-cvs/2007-09/msg00613.html

commit 5d93234932c3d8617ce92b77b7013ef6bede9508
Author: shinwell shinwell@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Thu Sep 20 11:01:18 2007 +

  gcc/
  * combine.c: Include cgraph.h.
  (setup_incoming_promotions): Rework to allow more aggressive
  elimination of sign extensions when all call sites of the
  current function are known to lie within the current unit.


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@128618
138bc75d-0d04-0410-961f-82ee72b054a4

Before this commit, combine.c has

  enum machine_mode mode = TYPE_MODE (TREE_TYPE (arg));
  int uns = TYPE_UNSIGNED (TREE_TYPE (arg));

  mode = promote_mode (TREE_TYPE (arg), mode, uns, 1);
  if (mode == GET_MODE (reg) && mode != DECL_MODE (arg))
{
  rtx x;
  x = gen_rtx_CLOBBER (DECL_MODE (arg), const0_rtx);
  x = gen_rtx_fmt_e ((uns ? ZERO_EXTEND : SIGN_EXTEND), mode, x);
  record_value_for_reg (reg, first, x);
}

It matches function.c:

  /* This is not really promoting for a call.  However we need to be
 consistent with assign_parm_find_data_types and expand_expr_real_1.  */
  promoted_nominal_mode
= promote_mode (data->nominal_type, data->nominal_mode, unsignedp, 1);

r128618 changed

mode = promote_mode (TREE_TYPE (arg), mode, uns, 1);

to

mode3 = promote_mode (DECL_ARG_TYPE (arg), mode2, uns3, 1);

It breaks non-WORD_REGISTER_OPERATIONS targets.
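
As a rough illustration of the failure mode described at the top of this message, here is a minimal C++ sketch. It is not GCC code: it models a 32-bit register as a `uint32_t`, and the names `byte_load`, `callee_reads_int` and `callee_extends` are hypothetical helpers invented for the example.

```cpp
#include <cassert>   // for callers that want to check the results
#include <cstdint>

// Model of a byte load on a target where narrow loads leave the rest of
// the register alone: only the low 8 bits are written, the upper 24 bits
// keep whatever stale data was there before.
inline uint32_t byte_load (uint32_t stale_reg, uint8_t value)
{
  return (stale_reg & 0xffffff00u) | value;
}

// Callee that wrongly assumes the caller already extended the byte to a
// full word and reads the whole register.
inline int32_t callee_reads_int (uint32_t reg)
{
  return static_cast<int32_t> (reg);
}

// Callee that sign-extends the low byte itself before using it.
inline int32_t callee_extends (uint32_t reg)
{
  int32_t low = static_cast<int32_t> (reg & 0xffu);
  return (low ^ 0x80) - 0x80;   // portable sign extension of 8 bits
}
```

With stale upper bits in the register, only `callee_extends` recovers the value the caller meant to pass; `callee_reads_int` sees the garbage as well, which is the mismatch between caller and callee that shows up with -Os.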

-- 
H.J.

Re: [PATCH][AArch64] Remove crypto extension from default for cortex-a53, cortex-a57

2014-11-25 Thread Kyrill Tkachov



On 25/11/14 01:36, Gerald Pfeifer wrote:

On Tuesday 2014-11-18 09:38, Kyrill Tkachov wrote:

Here's what I propose.

+ <li>The cryptographic extensions to the ARMv8-A architecture are no
+   longer enabled by default when specifying the
+   <code>-mcpu=cortex-a53</code>, <code>-mcpu=cortex-a57</code> or
+   <code>-mcpu=cortex-a57.cortex-a53</code> options.  To enable these
+   extensions add the <code>+crypto</code> extension to your given
+   <code>-mcpu</code> or <code>-march</code> options' value.

option's?

Or better to the value of your...option(s)?


Ok, I've reworded it and added a small example to demonstrate.



The description talks about -mcpu and mentions -march only once.
Isn't this a bit confusing?


The change is to the behaviour of -mcpu, not -march. -march is only
mentioned as a way of getting the previous behaviour if the user so wishes.

How about this amendment?

Thanks for looking at it,
Kyrill



Gerald
Index: htdocs/gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.41
diff -U 3 -r1.41 changes.html
--- htdocs/gcc-5/changes.html	23 Nov 2014 14:42:28 -	1.41
+++ htdocs/gcc-5/changes.html	25 Nov 2014 16:05:01 -
@@ -376,8 +376,9 @@
are no longer enabled by default when specifying the
<code>-mcpu=cortex-a53</code>, <code>-mcpu=cortex-a57</code> or
<code>-mcpu=cortex-a57.cortex-a53</code> options.  To enable these
-   extensions add the <code>+crypto</code> extension to your given
-   <code>-mcpu</code> or <code>-march</code> options' value.
+   extensions add the <code>+crypto</code> extension to the value of
+   <code>-mcpu</code> or <code>-march</code> e.g.
+   <code>-mcpu=cortex-a53+crypto</code>.
  </li>
  <li>Support for the Cavium ThunderX processor is now available through the
 <code>-mcpu=thunderx</code> and <code>-mtune=thunderx</code> options.

Re: [Patch] Improving jump-thread pass for PR 54742

2014-11-25 Thread Jeff Law


On 11/24/14 21:55, Jeff Law wrote:

On 11/24/14 18:09, Sebastian Pop wrote:

Sebastian Pop wrote:

I removed the return -1 and started a bootstrap on powerpc64-linux.


Bootstrap passed on top of the 4 previous patches on powerpc64-linux.


I will report the valgrind output.


The output from valgrind looks closer to the output of master with no
other
patches: still 1M more instructions executed, and 300K more branches

Just ran my suite where we get ~25k more branches, which definitely puts
us in the noise.  (That's with all 4 patches + fixing the return value.)
I'm going to look a little closer at this stuff tomorrow, but I
think we've resolved the performance issue.  I'll dig deeper into the
implementation tomorrow as well.

I was running without your followup patches (must have used the wrong
bits from my git stash), so those results are bogus, but in a good way.


After fixing that goof, I'm seeing consistent improvements with your set 
of 4 patches and the fix for the wrong return code.  Across the suite, 
~140M fewer branches, not huge, but definitely not in the noise.


So, time to dig into the implementation :-)

Jeff

ps.  In case you're curious about the noise, it's primarily address hashing.

Re: [PATCH] crtstuff: Add missing semicolon

2014-11-25 Thread Jeff Law


On 11/24/14 20:44, Segher Boessenkool wrote:

I wonder how this survived so long, I must be building some strange
configs (it failed on an avr cross).  Okay for trunk?


Segher


2014-11-24  Segher Boessenkool  seg...@kernel.crashing.org

libgcc/
* crtstuff.c (__do_global_ctors_1): Add missing semicolon.

I think this falls under the obviously OK rule :-)

jeff

[PATCH] pr31397 - implement -Wsuggest-override

2014-11-25 Thread tsaunders

From: Trevor Saunders tsaund...@mozilla.com

Hi,

This is a new warning to find places where virtual functions are overridden
but not marked override.

The included test passes; I expect comments, so the regtest is pending and
the ChangeLog is omitted.

Trev

---
 gcc/c-family/c.opt|  5 +
 gcc/cp/class.c|  4 
 gcc/doc/invoke.texi   |  6 +-
 gcc/testsuite/g++.dg/warn/Wsuggest-override.C | 21 +
 4 files changed, 35 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wsuggest-override.C

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 85dcb98..259b520 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -574,6 +574,11 @@ Wsuggest-attribute=format
 C ObjC C++ ObjC++ Var(warn_suggest_attribute_format) Warning
 Warn about functions which might be candidates for format attributes
 
+Wsuggest-override
+C++ ObjC++ Var(warn_override) Warning
+Suggest that the override keyword be used when the declaration of a virtual
+function overrides another.
+
 Wswitch
 C ObjC C++ ObjC++ Var(warn_switch) Warning LangEnabledBy(C ObjC C++ 
ObjC++,Wall)
 Warn about enumerated switches, with no default, missing a case
diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index 16279df..515f33f 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -2777,6 +2777,10 @@ check_for_override (tree decl, tree ctype)
 {
   DECL_VINDEX (decl) = decl;
   overrides_found = true;
+  if (warn_override && DECL_VIRTUAL_P (decl) && !DECL_OVERRIDE_P (decl)
+  && !DECL_DESTRUCTOR_P (decl))
+   warning_at(DECL_SOURCE_LOCATION (decl), OPT_Wsuggest_override,
+  "%q+D can be marked override", decl);
 }
 
   if (DECL_VIRTUAL_P (decl))
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 89edddb..8741e8e 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -275,7 +275,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wstack-protector -Wstack-usage=@var{len} -Wstrict-aliasing @gol
 -Wstrict-aliasing=n @gol -Wstrict-overflow -Wstrict-overflow=@var{n} @gol
 -Wsuggest-attribute=@r{[}pure@r{|}const@r{|}noreturn@r{|}format@r{]} @gol
--Wsuggest-final-types @gol -Wsuggest-final-methods @gol
+-Wsuggest-final-types @gol -Wsuggest-final-methods @gol -Wsuggest-override @gol
 -Wmissing-format-attribute @gol
 -Wswitch  -Wswitch-default  -Wswitch-enum -Wswitch-bool -Wsync-nand @gol
 -Wsystem-headers  -Wtrampolines  -Wtrigraphs  -Wtype-limits  -Wundef @gol
@@ -4255,6 +4255,10 @@ effective with link time optimization, where the 
information about the class
 hiearchy graph is more complete. It is recommended to first consider suggestins
 of @option{-Wsuggest-final-types} and then rebuild with new annotations.
 
+@item -Wsuggest-override
+Warn about overriding virtual functions that are not marked with the override
+keyword.
+
 @item -Warray-bounds
 @opindex Wno-array-bounds
 @opindex Warray-bounds
diff --git a/gcc/testsuite/g++.dg/warn/Wsuggest-override.C 
b/gcc/testsuite/g++.dg/warn/Wsuggest-override.C
new file mode 100644
index 000..929d365
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wsuggest-override.C
@@ -0,0 +1,21 @@
+// { dg-do compile }
+// { dg-options "-std=c++11 -Wsuggest-override" }
+struct A
+{
+   A();
+   virtual ~A();
+   virtual void f();
+   virtual int bar();
+   operator int();
+   virtual operator float();
+};
+
+struct B : A
+{
+   B();
+   virtual ~B();
+   virtual void f(); // { dg-warning "can be marked override" }
+virtual int bar() override;
+operator int();
+virtual operator float(); // { dg-warning "can be marked override" }
+};
-- 
2.1.3
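
For readers unfamiliar with the keyword, the following is a small standalone C++11 sketch of the two cases the proposed warning distinguishes. The class names are made up for illustration and are unrelated to the test case in the patch; the comments describe what the warning is intended to do according to the description above, not verified compiler output.

```cpp
#include <cassert>   // for callers that want to check the results

struct Shape
{
  virtual ~Shape () {}
  virtual int sides () const { return 0; }
};

// Marked override: the compiler checks that a matching virtual function
// exists in a base class, and the proposed warning stays quiet.
struct Triangle : Shape
{
  int sides () const override { return 3; }
};

// Overrides Shape::sides but is not marked override: this is the kind of
// declaration -Wsuggest-override is meant to point out.
struct Pentagon : Shape
{
  virtual int sides () const { return 5; }
};

// Both derived classes dispatch identically; the keyword only affects
// what the compiler can verify, not the runtime behaviour.
inline int sides_via_base (const Shape &s)
{
  return s.sides ();
}
```

The destructor case is deliberately left out of the sketch, matching the `!DECL_DESTRUCTOR_P (decl)` condition in the patch.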

Re: [PATCH] Fix a bug in bt-load.c

2014-11-25 Thread Jeff Law


On 11/24/14 20:56, Segher Boessenkool wrote:

This caused ICEs on sh64.

`min_cost' and `def' here are supposed to refer to the same element;
removing it from the heap before asking the heap for the key doesn't
work (and at the end of the loop here we will ask for min_key on an
empty heap, which then does gcc_unreachable).

Bootstrapped and tested on powerpc64-linux, but I doubt it exercises
this code at all; only sh64 did ICE, and does not anymore.  Okay for
trunk?


Segher


2014-11-24  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
* bt-load.c (migrate_btr_defs): Get the key of a heap entry
before removing it, not after.

OK.

Did sh64 ICE during a build, or was it during testing or something else? 
 Trying to figure out if we need a distinct test in the suite or not.



jeff

Re: [PATCH] mn10300: Fix an ICE

2014-11-25 Thread Jeff Law


On 11/24/14 20:37, Segher Boessenkool wrote:

`lcc' is not an insn but just a pattern.  This caused a build error in
libgcc.
Tested with a cross compiler build (which fails without and succeeds
with the patch).  Not tested much more; this compiler really likes to
ICE, something with ipa-icf.

Is this okay for trunk?


Segher


2014-11-24  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
* config/mn10300/mn10300.c (mn10300_insert_setlb_lcc): Remove
PATTERN call.

OK.

A good example of a case that would have been caught if we get to a 
point where stuff in the insn chain are not RTX objects, but something 
else entirely.


jeff

Re: [PING][PATCH] Change contrib/test_installed for testing cross compilers

2014-11-25 Thread Jeff Law


On 11/24/14 09:51, Alan Lawrence wrote:

Having just been experimenting with testing of installed compilers - yes
something like this could be useful, however: to do cross-testing I
found I also (a) had to set my target_list; so either an extra flag for
that, or maybe just a generic 'extra_site_flags' parameter?
(b) I had to set up some boards...so maybe could have got there with the
--tmpdir flag, ok;
(c) lost all the parallelism provided by the Makefile in build/gcc. It
should be possible to use the (check-parallel-xxx rules from) Makefile
in conjunction with the site.exp from contrib/test_installed, haven't
got that far yet...

This does leave me wondering (1) whether a one-step $ test_installed
is feasible, or a two-stage setup and then run is inevitable; (2)
whether having all that parallelism expressed in the Makefile is the
best place for it. Not that I have an alternative proposal at this point...

It might be inevitable to have a two-stage setup.  Red Hat does
installed compiler testing and I'm sure increased parallelism for
install testing would be appreciated by the team doing that work.


As for the --target change itself, it seems reasonable.

Jeff

[PATCH] Enhance ASAN_CHECK optimization

2014-11-25 Thread Yury Gribov


Hi all,

This patch improves current optimization of ASAN_CHECKS performed by 
sanopt pass.  In addition to searching the sanitized pointer in 
asan_check_map, it also tries to search for definition of this pointer. 
 This allows more checks to be dropped when definition is not a gimple 
value (e.g. load from struct field) and thus cannot be propagated by 
forwprop.
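
To make the idea concrete, here is a toy C++ model of keying checks by the definition of the checked pointer rather than only by the pointer itself. It is only a sketch under strong assumptions: expressions are plain strings, each temporary has at most one definition, and dominance, intervening stores and freeing calls (which the real pass must respect) are ignored. `DefMap`, `canonical_key` and `count_redundant_checks` are hypothetical names, not part of sanopt.c.

```cpp
#include <cassert>   // for callers that want to check the results
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy IR: maps a temporary to the expression of its single definition
// "tmp = expr".  Temporaries without a known definition are absent.
typedef std::map<std::string, std::string> DefMap;

// Return the expression a temporary was loaded from, or the name itself
// when no single definition is known.
inline std::string canonical_key (const DefMap &defs, const std::string &ptr)
{
  DefMap::const_iterator it = defs.find (ptr);
  return it == defs.end () ? ptr : it->second;
}

// Count checks that repeat an earlier check on the same canonical key.
inline int count_redundant_checks (const DefMap &defs,
                                   const std::vector<std::string> &checked)
{
  std::set<std::string> seen;
  int redundant = 0;
  for (std::size_t i = 0; i < checked.size (); ++i)
    if (!seen.insert (canonical_key (defs, checked[i])).second)
      ++redundant;
  return redundant;
}
```

With `t1 = s->p` and `t2 = s->p` recorded in the map, a check on `t2` after a check on `t1` counts as redundant, whereas keyed by name alone the two would look unrelated.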


In my experiments this rarely managed to remove more than 0.5% of 
ASAN_CHECKs but some files got as much as 1% improvement e.g.

* gimple.c: 49 (out of 5293)
* varasm.c: 42 (out of 3678)

For a total it was able to remove 2657 checks in Asan-bootstrapped GCC 
(out of ~500K).


I've Asan-bootstrapped, bootstrapped and regtested on x64.

Is this ok for stage3?

Best regards,
Yury

From 85f65c403f132245e9efcc8a420269f8d631fae6 Mon Sep 17 00:00:00 2001
From: Yury Gribov y.gri...@samsung.com
Date: Tue, 25 Nov 2014 11:49:11 +0300
Subject: [PATCH] 2014-11-25  Yury Gribov  y.gri...@samsung.com

gcc/
	* sanopt.c (maybe_get_single_definition): New function.
	(struct tree_map_traits): New struct.
	(struct sanopt_ctx): Use custom traits for asan_check_map.
	(maybe_get_dominating_check): New function.
	(maybe_optimize_ubsan_null_ifn): Move code to
	maybe_get_dominating_check.
	(maybe_optimize_asan_check_ifn): Ditto. Take non-SSA expressions
	into account when optimizing.
	(sanopt_optimize_walker): Do not treat recoverable sanitization
	specially.
---
 gcc/sanopt.c |  194 +++---
 1 file changed, 116 insertions(+), 78 deletions(-)

diff --git a/gcc/sanopt.c b/gcc/sanopt.c
index e1d11e0..9fe87de 100644
--- a/gcc/sanopt.c
+++ b/gcc/sanopt.c
@@ -84,6 +84,35 @@ struct sanopt_info
   bool visited_p;
 };
 
+/* If T has a single definition of form T = T2, return T2.  */
+
+static tree
+maybe_get_single_definition (tree t)
+{
+  if (TREE_CODE (t) == SSA_NAME)
+{
+  gimple g = SSA_NAME_DEF_STMT (t);
+  if (gimple_assign_single_p (g))
+	return gimple_assign_rhs1 (g);
+}
+  return NULL_TREE;
+}
+
+/* Traits class for tree hash maps below.  */
+
+struct tree_map_traits : default_hashmap_traits
+{
+  static inline hashval_t hash (const_tree ref)
+{
+  return iterative_hash_expr (ref, 0);
+}
+
+  static inline bool equal_keys (const_tree ref1, const_tree ref2)
+{
+  return operand_equal_p (ref1, ref2, 0);
+}
+}; 
+
 /* This is used to carry various hash maps and variables used
in sanopt_optimize_walker.  */
 
@@ -95,7 +124,7 @@ struct sanopt_ctx
 
   /* This map maps a pointer (the second argument of ASAN_CHECK) to
  a vector of ASAN_CHECK call statements that check the access.  */
-  hash_map<tree, auto_vec<gimple> > asan_check_map;
+  hash_map<tree, auto_vec<gimple>, tree_map_traits> asan_check_map;
 
   /* Number of IFN_ASAN_CHECK statements.  */
   int asan_num_accesses;
@@ -197,6 +226,24 @@ imm_dom_path_with_freeing_call (basic_block bb, basic_block dom)
   return false;
 }
 
+/* Get the first dominating check from the list of stored checks.
+   Non-dominating checks are silently dropped.  */
+
+static gimple
+maybe_get_dominating_check (auto_vec<gimple> &v)
+{
+  for (; !v.is_empty (); v.pop ())
+{
+  gimple g = v.last ();
+  sanopt_info *si = (sanopt_info *) gimple_bb (g)->aux;
+  if (!si->visited_p)
+	/* At this point we shouldn't have any statements
+	   that aren't dominating the current BB.  */
+	return g;
+}
+  return NULL;
+}
+
 /* Optimize away redundant UBSAN_NULL calls.  */
 
 static bool
@@ -209,7 +256,8 @@ maybe_optimize_ubsan_null_ifn (struct sanopt_ctx *ctx, gimple stmt)
   bool remove = false;
 
   auto_vec<gimple> &v = ctx->null_check_map.get_or_insert (ptr);
-  if (v.is_empty ())
+  gimple g = maybe_get_dominating_check (v);
+  if (!g)
 {
   /* For this PTR we don't have any UBSAN_NULL stmts recorded, so there's
 	 nothing to optimize yet.  */
@@ -220,43 +268,30 @@ maybe_optimize_ubsan_null_ifn (struct sanopt_ctx *ctx, gimple stmt)
   /* We already have recorded a UBSAN_NULL check for this pointer. Perhaps we
  can drop this one.  But only if this check doesn't specify stricter
  alignment.  */
-  while (!v.is_empty ())
-{
-  gimple g = v.last ();
-  /* Remove statements for BBs that have been already processed.  */
-  sanopt_info *si = (sanopt_info *) gimple_bb (g)->aux;
-  if (si->visited_p)
-	{
-	  v.pop ();
-	  continue;
-	}
 
-  /* At this point we shouldn't have any statements that aren't dominating
-	 the current BB.  */
-  tree align = gimple_call_arg (g, 2);
-  int kind = tree_to_shwi (gimple_call_arg (g, 1));
-  /* If this is a NULL pointer check where we had segv anyway, we can
-	 remove it.  */
-  if (integer_zerop (align)
-	  && (kind == UBSAN_LOAD_OF
-	  || kind == UBSAN_STORE_OF
-	  || kind == UBSAN_MEMBER_ACCESS))
-	remove = true;
-  /* Otherwise remove the check in non-recovering mode, or if the
-	 stmts have same location.  */
-  else if (integer_zerop

Re: [PATCH] mn10300: Fix an ICE

2014-11-25 Thread Segher Boessenkool

On Tue, Nov 25, 2014 at 09:44:35AM -0700, Jeff Law wrote:
 On 11/24/14 20:37, Segher Boessenkool wrote:
 `lcc' is not an insn but just a pattern.  This caused a build error in
 libgcc.

 A good example of a case that would have been caught if we get to a 
 point where stuff in the insn chain are not RTX objects, but something 
 else entirely.

Hey, it already did ICE, easy to catch.  But you mean wouldn't even
compile I guess :-)


Segher

Re: [PATCH] mn10300: Fix an ICE

2014-11-25 Thread Jeff Law


On 11/25/14 10:14, Segher Boessenkool wrote:

On Tue, Nov 25, 2014 at 09:44:35AM -0700, Jeff Law wrote:

On 11/24/14 20:37, Segher Boessenkool wrote:

`lcc' is not an insn but just a pattern.  This caused a build error in
libgcc.



A good example of a case that would have been caught if we get to a
point where stuff in the insn chain are not RTX objects, but something
else entirely.


Hey, it already did ICE, easy to catch.  But you mean wouldn't even
compile I guess :-)

Exactly.  This kind of problem is something I want to catch at compile
time rather than at runtime.
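
A minimal C++ sketch of that idea, using made-up stand-in types rather than GCC's real classes: once an insn and its pattern are distinct types, handing one to code that expects the other is rejected by the compiler instead of producing an ICE at runtime.

```cpp
#include <cassert>   // for callers that want to check the results
#include <type_traits>

// Hypothetical stand-ins for illustration only.
struct Pattern { int code; };
struct Insn { Pattern pat; };

// Only a real Insn has a pattern to extract; calling this with a Pattern
// is a compile-time error because no conversion exists.
inline const Pattern &pattern_of (const Insn &insn)
{
  return insn.pat;
}

// The property the type split buys us, stated as a compile-time check.
static_assert (!std::is_convertible<Pattern, Insn>::value,
               "a pattern must not be usable where an insn is expected");
```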


jeff

Re: [PATCH] Fix a bug in bt-load.c

2014-11-25 Thread Segher Boessenkool

On Tue, Nov 25, 2014 at 09:41:40AM -0700, Jeff Law wrote:
 On 11/24/14 20:56, Segher Boessenkool wrote:
 This caused ICEs on sh64.
 
 `min_cost' and `def' here are supposed to refer to the same element;
 removing it from the heap before asking the heap for the key doesn't
 work (and at the end of the loop here we will ask for min_key on an
 empty heap, which then does gcc_unreachable).

 Did sh64 ICE during a build, or was it during testing or something else? 
  Trying to figure out if we need a distinct test in the suite or not.

During libgcc build, pretty much all files.

The libiberty fibheap code returns 0 for min_key on an empty heap; the
new fibonacci_heap code ICEs.  This bt-load code will always fail if
there is any work to do, so I don't think any other test is needed :-)
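
The ordering issue can be sketched with a standard-library heap. This is an illustration only: it uses `std::priority_queue`, not GCC's fibonacci_heap, and `Entry`, `MinHeap` and `extract_min_with_key` are names invented for the example.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

typedef std::pair<int, int> Entry;   // (key, payload)
typedef std::priority_queue<Entry, std::vector<Entry>,
                            std::greater<Entry> > MinHeap;

// Read the minimum key while the element is still in the heap, then
// remove it.  Asking the heap for its minimum key after the removal
// would describe the next element, or fail on an empty heap.
inline Entry extract_min_with_key (MinHeap &heap)
{
  assert (!heap.empty ());
  Entry min = heap.top ();   // key and payload of the current minimum
  heap.pop ();
  return min;
}
```

Draining the heap with this helper never queries an empty heap, which is the situation the last loop iteration ran into before the fix.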


Segher

Re: [PATCH] Fix a bug in bt-load.c

2014-11-25 Thread Jeff Law


On 11/25/14 10:26, Segher Boessenkool wrote:

On Tue, Nov 25, 2014 at 09:41:40AM -0700, Jeff Law wrote:

On 11/24/14 20:56, Segher Boessenkool wrote:

This caused ICEs on sh64.

`min_cost' and `def' here are supposed to refer to the same element;
removing it from the heap before asking the heap for the key doesn't
work (and at the end of the loop here we will ask for min_key on an
empty heap, which then does gcc_unreachable).



Did sh64 ICE during a build, or was it during testing or something else?
  Trying to figure out if we need a distinct test in the suite or not.


During libgcc build, pretty much all files.

The libiberty fibheap code returns 0 for min_key on an empty heap; the
new fibonacci_heap code ICEs.  This bt-load code will always fail if
there is any work to do, so I don't think any other test is needed :-)

Ok.  Thanks.

jeff

Re: [Patch] Improving jump-thread pass for PR 54742

2014-11-25 Thread Sebastian Pop

Jeff Law wrote:
 On 11/24/14 21:55, Jeff Law wrote:
 On 11/24/14 18:09, Sebastian Pop wrote:
 Sebastian Pop wrote:
 I removed the return -1 and started a bootstrap on powerpc64-linux.
 
 Bootstrap passed on top of the 4 previous patches on powerpc64-linux.
 
 I will report the valgrind output.
 
 The output from valgrind looks closer to the output of master with no
 other
 patches: still 1M more instructions executed, and 300K more branches
 Just ran my suite where we get ~25k more branches, which definitely puts
 us in the noise.  (That's with all 4 patches + fixing the return value.)
 I'm going to look a little closer at this stuff tomorrow, but I
 think we've resolved the performance issue.  I'll dig deeper into the
 implementation tomorrow as well.
 I was running without your followup patches (must have used the
 wrong bits from my git stash), so those results are bogus, but in a
 good way.
 
 After fixing that goof, I'm seeing consistent improvements with your
 set of 4 patches and the fix for the wrong return code.  Across the
 suite, ~140M fewer branches, not huge, but definitely not in the
 noise.

Thanks for your testing.

 
 So, time to dig into the implementation :-)
 

To ease the review, I squashed all the patches in a single one.

I will bootstrap and regression test this patch on x86_64-linux and
powerpc64-linux.  I will also run it on our internal benchmarks, coremark, and
the llvm test-suite.

I will also include a longer testcase that makes sure we do not regress on
coremark.

Sebastian
From db0f6817768920b497225484fab24a20e5ddf556 Mon Sep 17 00:00:00 2001
From: Sebastian Pop s@samsung.com
Date: Fri, 26 Sep 2014 14:54:20 -0500
Subject: [PATCH] extend jump thread for finite state automata PR 54742

Adapted from a patch from James Greenhalgh.

	* params.def (max-fsm-thread-path-insns, max-fsm-thread-length,
	max-fsm-thread-paths): New.

	* doc/invoke.texi (max-fsm-thread-path-insns, max-fsm-thread-length,
	max-fsm-thread-paths): Documented.

	* testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c: New.

	* tree-cfg.c (split_edge_bb_loc): Export.
	* tree-cfg.h (split_edge_bb_loc): Declared extern.

	* tree-ssa-threadedge.c (simplify_control_stmt_condition): Restore the
	original value of cond when simplification fails.
	(fsm_find_thread_path): New.
	(fsm_find_control_statement_thread_paths): New.
	(fsm_thread_through_normal_block): Call find_control_statement_thread_paths.

	* tree-ssa-threadupdate.c (dump_jump_thread_path): Pretty print
	EDGE_START_FSM_THREAD.
	(duplicate_seme_region): New.
	(thread_through_all_blocks): Generate code for EDGE_START_FSM_THREAD edges
	calling gimple_duplicate_sese_region.

	* tree-ssa-threadupdate.h (jump_thread_edge_type): Add EDGE_START_FSM_THREAD.
---
 gcc/doc/invoke.texi  |  12 ++
 gcc/params.def   |  15 ++
 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c |  38 
 gcc/tree-cfg.c   |   2 +-
 gcc/tree-cfg.h   |   1 +
 gcc/tree-ssa-threadedge.c| 215 ++-
 gcc/tree-ssa-threadupdate.c  | 198 -
 gcc/tree-ssa-threadupdate.h  |   1 +
 8 files changed, 479 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 89edddb..074183f 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -10624,6 +10624,18 @@ large and significantly increase compile time at optimization level
 @option{-O1} and higher.  This parameter is a maximum nubmer of statements
 in a single generated constructor.  Default value is 5000.
 
+@item max-fsm-thread-path-insns
+Maximum number of instructions to copy when duplicating blocks on a
+finite state automaton jump thread path.  The default is 100.
+
+@item max-fsm-thread-length
+Maximum number of basic blocks on a finite state automaton jump thread
+path.  The default is 10.
+
+@item max-fsm-thread-paths
+Maximum number of new jump thread paths to create for a finite state
+automaton.  The default is 50.
+
 @end table
 @end table
 
diff --git a/gcc/params.def b/gcc/params.def
index 9b21c07..edf3f53 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1140,6 +1140,21 @@ DEFPARAM (PARAM_CHKP_MAX_CTOR_SIZE,
 	  "Maximum number of statements to be included into a single static "
 	  "constructor generated by Pointer Bounds Checker",
 	  5000, 100, 0)
+
+DEFPARAM (PARAM_MAX_FSM_THREAD_PATH_INSNS,
+	  "max-fsm-thread-path-insns",
+	  "Maximum number of instructions to copy when duplicating blocks on a finite state automaton jump thread path",
+	  100, 1, 99)
+
+DEFPARAM (PARAM_MAX_FSM_THREAD_LENGTH,
+	  "max-fsm-thread-length",
+	  "Maximum number of basic blocks on a finite state automaton jump thread path",
+	  10, 1, 99)
+
+DEFPARAM (PARAM_MAX_FSM_THREAD_PATHS,
+	  "max-fsm-thread-paths",
+	  Maximum number of new

[PATCH, libgfortran]: Remove unused variable

2014-11-25 Thread Uros Bizjak

Hello!

2014-11-25  Uros Bizjak  ubiz...@gmail.com

* intrinsics/env.c (getenv): Remove unused variable res_len.

Bootstrapped on x86_64-linux-gnu.

Almost trivial, but ... OK for mainline?

Uros.

Index: intrinsics/env.c
===
--- intrinsics/env.c(revision 218056)
+++ intrinsics/env.c(working copy)
@@ -42,7 +42,6 @@ PREFIX(getenv) (char * name, char * value, gfc_cha
 {
   char *name_nt;
   char *res = NULL;
-  int res_len;

   if (name == NULL || value == NULL)
 runtime_error ("Both arguments to getenv are mandatory.");

Re: [PATCH, libgfortran]: Remove unused variable

2014-11-25 Thread Steve Kargl

On Tue, Nov 25, 2014 at 07:17:17PM +0100, Uros Bizjak wrote:
 
 2014-11-25  Uros Bizjak  ubiz...@gmail.com
 
 * intrinsics/env.c (getenv): Remove unused variable res_len.
 
 Bootstrapped on x86_64-linux-gnu.
 
 Almost trivial, but ... OK for mainline?
 

Yes.

-- 
Steve

Re: [PATCH 2/5] combine: handle I2 a parallel of two SETs

2014-11-25 Thread Jeff Law


On 11/14/14 12:19, Segher Boessenkool wrote:

If I2 is a PARALLEL of two SETs, split it into two instructions, I1
and I2.  If there already was an I1, rename it to I0.  If there
already was an I0, don't do anything.

This surprisingly simple patch is enough to let combine handle such
PARALLELs properly.

It's clever.




2014-11-14  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
* combine.c (try_combine): If I2 is a PARALLEL of two SETs,
split it into two insns.

So you're virtually serializing the PARALLEL to make combine happy, if
I'm reading this correctly.


The first thing I worry about is preserving the semantics of a PARALLEL.
 Specifically that all the inputs are evaluated, then all the side 
effects happen.  So I think one of the checks you need is that the 
destinations of the SETs are not used as source operands in the SETs.
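
To spell out why that check matters, here is a small C++ model of the two evaluation orders. It is a sketch, not combine.c code: registers are slots in a vector, a SET is a plain register copy, and `Set`, `apply_parallel`, `apply_sequential` and `safe_to_serialize` are hypothetical names. The safety test is deliberately conservative.

```cpp
#include <cassert>   // for callers that want to check the results
#include <cstddef>
#include <vector>

// Hypothetical model: a SET copies register SRC into register DEST.
struct Set { int dest; int src; };

// PARALLEL semantics: read every source first, then perform all writes.
inline void apply_parallel (std::vector<int> &regs,
                            const std::vector<Set> &sets)
{
  std::vector<int> values;
  for (std::size_t i = 0; i < sets.size (); ++i)
    values.push_back (regs[sets[i].src]);
  for (std::size_t i = 0; i < sets.size (); ++i)
    regs[sets[i].dest] = values[i];
}

// Serialized semantics: each SET sees the writes of the earlier ones.
inline void apply_sequential (std::vector<int> &regs,
                              const std::vector<Set> &sets)
{
  for (std::size_t i = 0; i < sets.size (); ++i)
    regs[sets[i].dest] = regs[sets[i].src];
}

// Conservative check: true when no destination is also read as a source
// by any SET, in which case both orders give the same result.
inline bool safe_to_serialize (const std::vector<Set> &sets)
{
  for (std::size_t i = 0; i < sets.size (); ++i)
    for (std::size_t j = 0; j < sets.size (); ++j)
      if (sets[i].dest == sets[j].src)
        return false;
  return true;
}
```

A register swap expressed as two SETs is the classic counterexample: evaluated in parallel it exchanges the values, evaluated one after the other it duplicates one of them.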


The second thing I worry about is the handling of match_dup operands.  But
presumably all the resulting insns must match in one way or another or 
the whole thing gets reset to its prior state.  So I suspect those are 
OK as well.


Related, I was worried about RTL structure sharing, but in the end I 
think those are a non-concern for the same basic reasons as match_dups 
aren't a real worry.





---
  gcc/combine.c | 31 +++
  1 file changed, 31 insertions(+)

diff --git a/gcc/combine.c b/gcc/combine.c
index f7797e7..c4d23e3 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -2780,6 +2780,37 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
rtx_insn *i0,
  SUBST_LINK (LOG_LINKS (i2), alloc_insn_link (i1, LOG_LINKS (i2)));
}
  }
+
+  /* If I2 is a PARALLEL of two SETs of REGs (and perhaps some CLOBBERs),
+ make those two SETs separate I1 and I2 insns, and make an I0 that is
+ the original I1.  */
+  if (i0 == 0

Test for NULL.



+  && GET_CODE (PATTERN (i2)) == PARALLEL
+  && XVECLEN (PATTERN (i2), 0) >= 2
+  && GET_CODE (XVECEXP (PATTERN (i2), 0, 0)) == SET
+  && GET_CODE (XVECEXP (PATTERN (i2), 0, 1)) == SET
+  && REG_P (SET_DEST (XVECEXP (PATTERN (i2), 0, 0)))
+  && REG_P (SET_DEST (XVECEXP (PATTERN (i2), 0, 1)))
+  && !reg_used_between_p (SET_DEST (XVECEXP (PATTERN (i2), 0, 0)), i2, i3)
+  && !reg_used_between_p (SET_DEST (XVECEXP (PATTERN (i2), 0, 1)), i2, i3)
+  && (XVECLEN (PATTERN (i2), 0) == 2
+ || GET_CODE (XVECEXP (PATTERN (i2), 0, 2)) == CLOBBER))

As noted above, I think you need to verify the set/clobbered operands do
not conflict with any of the source operands.  Otherwise you run the 
risk of changing the semantics when you rip apart the PARALLEL.


Ah, just saw that Bernd made the same observation.  Good.

And I think while convention has CLOBBERs at the end of insns, I don't 
think that's a hard requirement.  So I think you need a stronger check 
for elements 2 and beyond in the vector.


OK with the direction this is going, but I think another iteration is 
going to be necessary.


Jeff

Re: [PATCH 3/5] combine: add regno field to LOG_LINKS

2014-11-25 Thread Jeff Law


On 11/14/14 12:19, Segher Boessenkool wrote:

With this new field in place, we can have LOG_LINKS for insns that set
more than one register and distribute them properly in distribute_links.
This then allows many more PARALLELs to be combined.

Also split off new functions can_combine_{def,use}_p from the
create_log_links function.


2014-11-14  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
* combine.c (struct insn_link): New field `regno'.
(alloc_insn_link): New parameter `regno'.  Use it.
(find_single_use): Check the new field.
(can_combine_def_p, can_combine_use_p): New functions.  Split
off from ...
(create_log_links): ... here.  Correct data type of `regno'.
Adjust call to alloc_insn_link.
(adjust_for_new_dest): Find regno, use it in call to
alloc_insn_link.
(try_combine): Adjust call to alloc_insn_link.
(distribute_links): Check the new field.

Didn't you lose the check that avoids duplicated LOG_LINKs?   Or is the
claim that the check is no longer needed because there are no duplicates 
now that we include the register associated with the link?




+
+  rtx set = single_set (insn);
+  gcc_assert (set);
+
+  rtx reg = SET_DEST (set);
+
+  while (GET_CODE (reg) == ZERO_EXTRACT
+|| GET_CODE (reg) == STRICT_LOW_PART
+|| GET_CODE (reg) == SUBREG)
+reg = XEXP (reg, 0);
+  gcc_assert (REG_P (reg));

Can REG ever be a hard reg here?  If so, then the SUBREG case needs to
simplify the hard reg rather than just strip off the SUBREG.



Might be OK, depends on answers to questions above -- holding final 
approval pending those answers.


Jeff

Re: Document __builtin_*_overflow

2014-11-25 Thread Gerald Pfeifer

Hi Jakub,

On Wednesday 2014-11-12 14:13, Jakub Jelinek wrote:
 This patch mentions __builtin_*_overflow in gcc-5/changes.html.
 Ok for CVS?

I've fallen a bit behind with GCC patches, sorry.

What do you think about this follow-up patch on top of yours?

Gerald

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.41
diff -u -r1.41 changes.html
--- changes.html23 Nov 2014 14:42:28 -  1.41
+++ changes.html25 Nov 2014 18:49:02 -
@@ -157,14 +157,14 @@
These builtins have two integral arguments (which don't need to have
the same type), the arguments are extended to infinite precision
signed type, <code>+</code>, <code>-</code> or <code>*</code>
-   is performed on those and the result is stored into some integer
-   variable pointed by the last argument.  If the stored value is equal
-   to the infinite precision result, the built-in functions return
+   is performed on those, and the result is stored in an integer
+   variable pointed to by the last argument.  If the stored value is
+   equal to the infinite precision result, the built-in functions return
<code>false</code>, otherwise <code>true</code>.  The type of
the integer variable that will hold the result can be different from
-   the types of arguments.  The following snippet demonstrates how
-   this can be used in computing the size for the <code>calloc</code>
-   function:
+   the types of the first two arguments.  The following snippet
+   demonstrates how this can be used in computing the size for the
+   <code>calloc</code> function:
 <blockquote><pre>
 void *
 calloc (size_t x, size_t y)
@@ -177,8 +177,8 @@
   return ret;
 }
 </pre></blockquote>
-   On e.g. i?86 or x86-64 the above will result in <code>mul</code>
-   instruction followed by jump on overflow.
+   On e.g. i?86 or x86-64 the above will result in a <code>mul</code>
+   instruction followed by a jump on overflow.
 </li>
 <li>The option <code>-fextended-identifiers</code> is now enabled
by default for C++, and for C99 and later C versions.  Various

Gerald
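
For readers following the wording above, here is a minimal sketch of the kind of checked size computation the text describes, using __builtin_mul_overflow as documented for GCC 5.  The helper name checked_array_size is hypothetical, and this is not the calloc snippet from changes.html, whose body is elided in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from changes.html): compute nmemb * size.
   Returns 1 and stores the product in *out on success; returns 0 and
   leaves *out untouched if the infinite-precision product does not fit
   in size_t.  */
static int
checked_array_size (size_t nmemb, size_t size, size_t *out)
{
  size_t total;

  assert (out != NULL);

  /* The builtin returns true when the stored value differs from the
     infinite-precision result, i.e. on overflow.  */
  if (__builtin_mul_overflow (nmemb, size, &total))
    return 0;

  *out = total;
  return 1;
}
```

A caller would typically test the return value before passing *out to an allocator.  The sketch needs a compiler that provides the builtin (GCC 5 or later).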

[ping^4] [libgomp] make it possible to use OMP on both sides of a fork

2014-11-25 Thread Nathaniel Smith

Ping^4 for: https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00519.html

On Tue, Nov 18, 2014 at 12:53 AM, Nathaniel Smith n...@pobox.com wrote:
 Hello,

 Ping for https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00519.html

 Patches posted early enough during Stage 1 and not yet fully reviewed
 may still get in early in Stage 3.  Please make sure to ping them
 soon enough.

 This patch was initially posted before stage 1 opened... for 4.9. So
 hopefully that qualifies :-). It would be nice to get it in someday...

 -n

 --
 Nathaniel J. Smith
 Postdoctoral researcher - Informatics - University of Edinburgh
 http://vorpus.org



-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org

Re: [patch] Restore bootstrap on powerpc-apple-darwin

2014-11-25 Thread FX

 2014-11-24  Rohit  rohitarul...@freescale.com
 
  PR bootstrap/63703
  * config/rs6000/darwin.h (REGISTER_NAMES): Update based on 32 newly
  added GCC hard register numbers for SPE high registers.
 
 
 IMO, it's obvious, and as you say, doesn't touch any other target.

After further confirmations that it restores full bootstrap on 
powerpc-apple-darwin9, I’ve committed (r218058).
I’ll backport to the 4.9 branch shortly.

FX

Re: Document __builtin_*_overflow

2014-11-25 Thread Jakub Jelinek

On Tue, Nov 25, 2014 at 07:50:02PM +0100, Gerald Pfeifer wrote:
 Hi Jakub,
 
 On Wednesday 2014-11-12 14:13, Jakub Jelinek wrote:
  This patch mentions __builtin_*_overflow in gcc-5/changes.html.
  Ok for CVS?
 
 I've fallen a bit behind with GCC patches, sorry.
 
 What do you think about this follow-up patch on top of yours?

LGTM, thanks.

 --- changes.html  23 Nov 2014 14:42:28 -  1.41
 +++ changes.html  25 Nov 2014 18:49:02 -
 @@ -157,14 +157,14 @@
   These builtins have two integral arguments (which don't need to have
   the same type), the arguments are extended to infinite precision
   signed type, <code>+</code>, <code>-</code> or <code>*</code>
 - is performed on those and the result is stored into some integer
 - variable pointed by the last argument.  If the stored value is equal
 - to the infinite precision result, the built-in functions return
 + is performed on those, and the result is stored in an integer
 + variable pointed to by the last argument.  If the stored value is
 + equal to the infinite precision result, the built-in functions return
   <code>false</code>, otherwise <code>true</code>.  The type of
   the integer variable that will hold the result can be different from
 - the types of arguments.  The following snippet demonstrates how
 - this can be used in computing the size for the <code>calloc</code>
 - function:
 + the types of the first two arguments.  The following snippet
 + demonstrates how this can be used in computing the size for the
 + <code>calloc</code> function:
  <blockquote><pre>
  void *
  calloc (size_t x, size_t y)
 @@ -177,8 +177,8 @@
    return ret;
  }
  </pre></blockquote>
 - On e.g. i?86 or x86-64 the above will result in <code>mul</code>
 - instruction followed by jump on overflow.
 + On e.g. i?86 or x86-64 the above will result in a <code>mul</code>
 + instruction followed by a jump on overflow.
  </li>
  <li>The option <code>-fextended-identifiers</code> is now enabled
   by default for C++, and for C99 and later C versions.  Various
 
 Gerald

Jakub

[PATCH, libobjc]: Remove ‘...’ is static but used in inline function ‘...’ which is not static

2014-11-25 Thread Uros Bizjak

Hello!

Recently, gcc bootstrap started to emit the following warnings when
building libobjc:

libobjc/sendmsg.c:338:13: warning: ‘get_implementation’ is static but
used in inline function ‘get_imp’ which is not static
libobjc/sendmsg.c:335:15: warning: ‘sarray_get_safe’ is static but
used in inline function ‘get_imp’ which is not static
libobjc/sendmsg.c:143:21: warning: ‘__objc_word_forward’ is static but
used in inline function ‘__objc_get_forward_imp’ which is not static
libobjc/sendmsg.c:141:21: warning: ‘__objc_double_forward’ is static
but used in inline function ‘__objc_get_forward_imp’ which is not
static
libobjc/sendmsg.c:139:21: warning: ‘__objc_block_forward’ is static
but used in inline function ‘__objc_get_forward_imp’ which is not
static


2014-11-25  Uros Bizjak  ubiz...@gmail.com

* sendmsg.c (get_imp): Declare as static inline.
(__objc_get_forward_imp): Ditto.

Bootstrapped on x86_64-linux-gnu.

OK for mainline?

Uros.

Index: sendmsg.c
===
--- sendmsg.c   (revision 218056)
+++ sendmsg.c   (working copy)
@@ -105,7 +105,7 @@
 id nil_method (id, SEL);

 /* Given a selector, return the proper forwarding implementation.  */
-inline
+static inline
 IMP
 __objc_get_forward_imp (id rcv, SEL sel)
 {
@@ -320,7 +320,7 @@
   return res;
 }

-inline
+static inline
 IMP
 get_imp (Class class, SEL sel)
 {

Re: [PATCH 4/5] combine: distribute_log_links for PARALLELs of SETs

2014-11-25 Thread Jeff Law


On 11/14/14 12:19, Segher Boessenkool wrote:

Now that LOG_LINKS are per regno, we can distribute them on PARALLELs
just fine.  Do so.  This makes PARALLELs not lose their LOG_LINKS early
when e.g. a trivial reg-reg move is combined, so that they can be used
in more useful combinations as well.


2014-11-14  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
* combine.c (distribute_links): Handle multiple SETs.
So the code in distribute_links implies that we're not going to see hard 
register SUBREGs, so ignore my concerns with the prior patch in this 
series WRT hard register SUBREGs.



This is OK once prereqs are approved.

You might consider pushing the two LOG_LINKs related patches forward 
independently of the patch to rip apart the PARALLELs.  Though I think 
that all of the patches are pretty close to being approved.  Your call.


Jeff

Re: [patch c++]: Fix PR/53904

2014-11-25 Thread Jason Merrill


On 11/20/2014 02:48 PM, Kai Tietz wrote:

this patch fixes a type-overflow issue caused by trying to cast a UHWI
via tree_to_shwi.
As soon as the value gets larger than SHWI_MAX, we get an error for it.
So we need to convert it via tree_to_uhwi, and then cast it to the signed
variant.


The problem seems to be with zero-length arrays getting -1 from 
array_type_nelts.  Let's use array_type_nelts_top instead so we don't 
ever see negative values.


Jason

[PATCH v3] gcc/c-family/c-cppbuiltin.c: Let buffer enough to print host wide integer value

2014-11-25 Thread Chen Gang

The original length of 18 is not enough to print a HOST_WIDE_INT value; use
20 instead.

Additional bytes are also needed for the prefix and suffix that may be
printed, so account for those and add a related check.

It passes testsuite under fedora 20 x86_64-unknown-linux-gnu.


2014-11-26  Chen Gang gang.chen.5...@gmail.com

* c-family/c-cppbuiltin.c (builtin_define_with_int_value): Let
buffer enough to print host wide integer value.
---
 gcc/c-family/c-cppbuiltin.c | 30 ++
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index c571d1b..b1b96fb 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -1366,14 +1366,28 @@ static void
 builtin_define_with_int_value (const char *macro, HOST_WIDE_INT value)
 {
   char *buf;
-  size_t mlen = strlen (macro);
-  size_t vlen = 18;
-  size_t extra = 2; /* space for = and NUL.  */
-
-  buf = (char *) alloca (mlen + vlen + extra);
-  memcpy (buf, macro, mlen);
-  buf[mlen] = '=';
-  sprintf (buf + mlen + 1, HOST_WIDE_INT_PRINT_DEC, value);
+  size_t vlen = 20; /* maximize value length: -9223372036854775807 */
+  size_t extra = 6; /* space for =, NUL, (, ), and L L. */
+
+  gcc_assert (wi::fits_to_tree_p (value, char_type_node)
+ || wi::fits_to_tree_p (value, short_integer_type_node)
+ || wi::fits_to_tree_p (value, integer_type_node)
+ || wi::fits_to_tree_p (value, long_integer_type_node)
+ || wi::fits_to_tree_p (value, long_long_integer_type_node));
+
+  buf = (char *) alloca (strlen (macro) + vlen + extra);
+
+  sprintf (buf, "%s=%s" HOST_WIDE_INT_PRINT_DEC "%s%s",
+           macro,
+           value < 0 ? "(" : "",
+           value,
+           wi::fits_to_tree_p (value, char_type_node)
+             || wi::fits_to_tree_p (value, short_integer_type_node)
+             || wi::fits_to_tree_p (value, integer_type_node)
+           ? ""
+           : wi::fits_to_tree_p (value, long_integer_type_node)
+             ? "L" : "LL",
+           value < 0 ? ")" : "");
 
   cpp_define (parse_in, buf);
 }
-- 
1.9.3
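
The buffer arithmetic in the patch above can be sketched in isolation as follows.  This is only an illustration in plain C, not the GCC code: format_define and define_buffer_size are hypothetical names, long long stands in for HOST_WIDE_INT, and snprintf is swapped in for sprintf so that truncation is reported instead of overflowing the buffer.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Worst case for a 64-bit signed decimal, sign included:
   "-9223372036854775808" is 20 characters.  */
#define MAX_INT64_DEC_LEN 20

/* Hypothetical helper: bytes needed for NAME=(VALUE SUFFIX) plus NUL.  */
static size_t
define_buffer_size (const char *name, const char *suffix)
{
  assert (name != NULL && suffix != NULL);
  return strlen (name) + 1 /* '=' */ + 2 /* '(' and ')' */
         + MAX_INT64_DEC_LEN + strlen (suffix) + 1 /* NUL */;
}

/* Hypothetical helper: write NAME=VALUE with SUFFIX into BUF, wrapping
   negative values in parentheses.  Returns 0 on success and -1 if the
   text did not fit in BUFLEN bytes.  */
static int
format_define (char *buf, size_t buflen, const char *name,
               long long value, const char *suffix)
{
  assert (buf != NULL && name != NULL && suffix != NULL);

  int n = snprintf (buf, buflen, "%s=%s%lld%s%s",
                    name,
                    value < 0 ? "(" : "",
                    value,
                    suffix,
                    value < 0 ? ")" : "");

  return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}
```

Sizing the buffer from the worst-case digit count keeps the length check in one place, which is the point the patch description makes about 18 being too small.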

Re: [PATCH, libobjc]: Remove ‘...’ is static but used in inline function ‘...’ which is not static

2014-11-25 Thread Andrew Pinski

On Tue, Nov 25, 2014 at 11:09 AM, Uros Bizjak ubiz...@gmail.com wrote:
 Hello!

 Recently, gcc bootstrap started to emit following warnings when
 building libobjc:

 libobjc/sendmsg.c:338:13: warning: ‘get_implementation’ is static but
 used in inline function ‘get_imp’ which is not static
 libobjc/sendmsg.c:335:15: warning: ‘sarray_get_safe’ is static but
 used in inline function ‘get_imp’ which is not static
 libobjc/sendmsg.c:143:21: warning: ‘__objc_word_forward’ is static but
 used in inline function ‘__objc_get_forward_imp’ which is not static
 libobjc/sendmsg.c:141:21: warning: ‘__objc_double_forward’ is static
 but used in inline function ‘__objc_get_forward_imp’ which is not
 static
 libobjc/sendmsg.c:139:21: warning: ‘__objc_block_forward’ is static
 but used in inline function ‘__objc_get_forward_imp’ which is not
 static


This patch is incorrect as get_imp is exported from libobjc.so.  See
libobjc.def.  I would rather use -std=gnu90 to compile these source
files as you are changing the exported symbols.

This also fixes bug 63863.

Thanks,
Andrew Pinski
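
The trade-off under discussion can be sketched with a small stand-alone file.  All names here are hypothetical and unrelated to libobjc; the sketch only shows that a static inline function may use file-local helpers without the warning, at the cost of no longer being visible to other translation units, while an exported entry point keeps external linkage by not being declared inline.

```c
#include <assert.h>

/* File-local helper (hypothetical).  Using it from a non-static inline
   function is what triggers the "is static but used in inline function"
   warning in C99 and later modes.  */
static int
lookup_slot (int index)
{
  return index * 2;
}

/* A static inline wrapper may reference lookup_slot freely, but it is
   not an exported symbol.  */
static inline int
get_slot_local (int index)
{
  assert (index >= 0);
  return lookup_slot (index);
}

/* An exported function keeps external linkage simply by not being
   declared inline; the compiler may still inline calls to the static
   helpers above.  */
int
get_slot_exported (int index)
{
  return get_slot_local (index);
}
```

Whether a given function has to stay exported depends on the library's symbol list, which is the objection raised above for get_imp.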



 2014-11-25  Uros Bizjak  ubiz...@gmail.com

 * sendmsg.c (get_imp): Declare as static inline.
 (__objc_get_forward_imp): Ditto.

 Bootstrapped on x86_64-linux-gnu.

 OK for mainline?

 Uros.

 Index: sendmsg.c
 ===
 --- sendmsg.c   (revision 218056)
 +++ sendmsg.c   (working copy)
 @@ -105,7 +105,7 @@
  id nil_method (id, SEL);

  /* Given a selector, return the proper forwarding implementation.  */
 -inline
 +static inline
  IMP
  __objc_get_forward_imp (id rcv, SEL sel)
  {
 @@ -320,7 +320,7 @@
return res;
  }

 -inline
 +static inline
  IMP
  get_imp (Class class, SEL sel)
  {

Re: PATCH: PR rtl-optimization/64037: Miscompilation with -Os and enum class : char parameter

2014-11-25 Thread H.J. Lu

On Tue, Nov 25, 2014 at 7:04 AM, Richard Biener
richard.guent...@gmail.com wrote:
 On Tue, Nov 25, 2014 at 4:01 PM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Tue, Nov 25, 2014 at 1:57 PM, H.J. Lu hongjiu...@intel.com wrote:
 Hi,

 The enclosed testcase fails on x86 when compiled with -Os since we pass
 a byte parameter with a byte load in the caller and read it as an int in
 the callee.  The reason it only shows up with -Os is that the x86 backend
 encodes a byte load with an int load if -O isn't used.  When a byte load is
 used, the upper 24 bits of the register have random values on
 non-WORD_REGISTER_OPERATIONS targets.

 It happens because setup_incoming_promotions in combine.c has

   /* The mode and signedness of the argument before any promotions happen
      (equal to the mode of the pseudo holding it at that stage).  */
   mode1 = TYPE_MODE (TREE_TYPE (arg));
   uns1 = TYPE_UNSIGNED (TREE_TYPE (arg));

   /* The mode and signedness of the argument after any source language and
      TARGET_PROMOTE_PROTOTYPES-driven promotions.  */
   mode2 = TYPE_MODE (DECL_ARG_TYPE (arg));
   uns3 = TYPE_UNSIGNED (DECL_ARG_TYPE (arg));

   /* The mode and signedness of the argument as it is actually passed,
      after any TARGET_PROMOTE_FUNCTION_ARGS-driven ABI promotions.  */
   mode3 = promote_function_mode (DECL_ARG_TYPE (arg), mode2, &uns3,
                                  TREE_TYPE (cfun->decl), 0);

 while they are actually passed in register by assign_parm_setup_reg in
 function.c:

   /* Store the parm in a pseudoregister during the function, but we may
      need to do it in a wider mode.  Using 2 here makes the result
      consistent with promote_decl_mode and thus expand_expr_real_1.  */
   promoted_nominal_mode
     = promote_function_mode (data->nominal_type, data->nominal_mode,
                              &unsignedp,
                              TREE_TYPE (current_function_decl), 2);

 where nominal_type and nominal_mode are set up with TREE_TYPE (parm)
 and TYPE_MODE (nominal_type). TREE_TYPE here is

 I think the bug is here, not in combine.c.  Can you try going back in history
 for both snippets and see if they matched at some point?

 Oh, and note that I think DECL_ARG_TYPE is something dangerous - it's meant
 to be a source language ABI kind-of-thing.  Or rather an optimization
 hint.  For example in C when integral promotions happen to call arguments
 this can be used to optimize sign-/zero-extensions in the callee.  Unless
 something else overrides this (like the target which specifies the real ABI).
 IIRC.


PROMOTE_MODE is a performance hint, not an ABI requirement.
i386.h has

#define PROMOTE_MODE(MODE, UNSIGNEDP, TYPE)             \
do {                                                    \
  if (((MODE) == HImode && TARGET_PROMOTE_HI_REGS)      \
      || ((MODE) == QImode && TARGET_PROMOTE_QI_REGS))  \
    (MODE) = SImode;                                    \
} while (0)

We may promote QI/HI to SI, depending on optimization.

On the other hand, TARGET_PROMOTE_FUNCTION_MODE is
determined by psABI.
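
The failure mode described in this thread can be modelled in portable C.  This is only a sketch under stated assumptions: a uint32_t stands in for a 32-bit register, and all function names are hypothetical.  It shows that after a byte load only the low 8 bits are meaningful, so a reader that looks at the full register sees leftover upper bits unless it extends from the low byte itself.

```c
#include <assert.h>
#include <stdint.h>

/* Model a byte load into a 32-bit register: only the low 8 bits are
   replaced, the upper 24 bits keep whatever was there before.  */
static uint32_t
byte_load_into (uint32_t old_reg, uint8_t value)
{
  return (old_reg & 0xffffff00u) | value;
}

/* Reader that trusts the whole register (what goes wrong when a
   promotion is assumed that the caller never performed).  */
static uint32_t
read_whole_register (uint32_t reg)
{
  return reg;
}

/* Reader that zero-extends from the low byte, ignoring the upper bits.  */
static uint32_t
read_low_byte (uint32_t reg)
{
  return reg & 0xffu;
}

/* Small self-check of the model.  */
static void
check_byte_param_model (void)
{
  uint32_t reg = byte_load_into (0x12345600u, 0x7f);

  assert (read_low_byte (reg) == 0x7fu);
  assert (read_whole_register (reg) != 0x7fu);
}
```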

I am enclosing the missing ChangeLog entries.


-- 
H.J.
---
gcc/

PR rtl-optimization/64037
* combine.c (setup_incoming_promotions): Pass the argument
before any promotions happen to promote_function_mode.

gcc/testsuite/

PR rtl-optimization/64037
* g++.dg/pr64037.C: New test.

Re: [PATCH] gcc parallel make check

2014-11-25 Thread Jakub Jelinek

On Tue, Nov 25, 2014 at 03:27:40PM +0100, Tom de Vries wrote:
 This patch fixes that by ensuring that we print that unsupported message only 
 once.
 
 The resulting test result comparison diff is:
 2014-11-25  Tom de Vries  t...@codesourcery.com
 
   * testsuite/libstdc++-prettyprinters/prettyprinters.exp: Add missing
   dg-finish.  Only print unsupported message once.

LGTM.

 --- a/libstdc++-v3/testsuite/libstdc++-prettyprinters/prettyprinters.exp
 +++ b/libstdc++-v3/testsuite/libstdc++-prettyprinters/prettyprinters.exp
 @@ -30,7 +30,14 @@ if ![info exists ::env(GUALITY_GDB_NAME)] {
  }
  
  if {! [gdb_version_check]} {
 +dg-finish
 +# Only print unsupported message in one instance.
 +if ![gcc_parallel_test_run_p prettyprinters] {
 + return
 +}
 +gcc_parallel_test_enable 0
  unsupported prettyprinters.exp
 +gcc_parallel_test_enable 1
  return
  }
  
 -- 
 1.9.1
 


Jakub

patch to fix PR63527

2014-11-25 Thread Vladimir Makarov

The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63527

The patch was tested and bootstrapped on x86/x86-64.

Committed as rev. 218509.

Index: ira-lives.c
===
--- ira-lives.c	(revision 218058)
+++ ira-lives.c	(working copy)
@@ -1123,8 +1123,10 @@ process_bb_node_lives (ira_loop_tree_nod
 	 pessimistic, but it probably doesn't matter much in practice.  */
   FOR_BB_INSNS_REVERSE (bb, insn)
 	{
+	  int regno;
+	  ira_allocno_t a;
 	  df_ref def, use;
-	  bool call_p;
+	  bool call_p, clear_pic_use_conflict_p;
 
 	  if (!NONDEBUG_INSN_P (insn))
 	continue;
@@ -1134,6 +1136,21 @@ process_bb_node_lives (ira_loop_tree_nod
 		 INSN_UID (insn), loop_tree_node->parent->loop_num,
 		 curr_point);
 
+	  call_p = CALL_P (insn);
+	  clear_pic_use_conflict_p = false;
+	  /* Processing insn usage in call insn can create conflict
+	 with pic pseudo and pic hard reg and that is wrong.
+	 Check this situation and fix it at the end of the insn
+	 processing.  */
+	  if (call_p && pic_offset_table_rtx != NULL_RTX
+	      && (regno = REGNO (pic_offset_table_rtx)) >= FIRST_PSEUDO_REGISTER
+	      && (a = ira_curr_regno_allocno_map[regno]) != NULL)
+	    clear_pic_use_conflict_p
+		= (find_regno_fusage (insn, USE, REAL_PIC_OFFSET_TABLE_REGNUM)
+		   && ! TEST_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS
+					   (ALLOCNO_OBJECT (a, 0)),
+					   REAL_PIC_OFFSET_TABLE_REGNUM));
+
 	  /* Mark each defined value as live.  We need to do this for
 	 unused values because they still conflict with quantities
 	 that are live at the time of the definition.
@@ -1143,7 +1160,6 @@ process_bb_node_lives (ira_loop_tree_nod
 	 on a call-clobbered register.  Marking the register as
 	 live would stop us from allocating it to a call-crossing
 	 allocno.  */
-	  call_p = CALL_P (insn);
 	  FOR_EACH_INSN_DEF (def, insn)
 	if (!call_p || !DF_REF_FLAGS_IS_SET (def, DF_REF_MAY_CLOBBER))
 	  mark_ref_live (def);
@@ -1207,7 +1223,7 @@ process_bb_node_lives (ira_loop_tree_nod
 	  EXECUTE_IF_SET_IN_SPARSESET (objects_live, i)
 	{
 		  ira_object_t obj = ira_object_id_map[i];
-		  ira_allocno_t a = OBJECT_ALLOCNO (obj);
+		  a = OBJECT_ALLOCNO (obj);
 		  int num = ALLOCNO_NUM (a);
 		  HARD_REG_SET this_call_used_reg_set;
 
@@ -1257,7 +1273,7 @@ process_bb_node_lives (ira_loop_tree_nod
 	  make_early_clobber_and_input_conflicts ();
 
 	  curr_point++;
-
+	  
 	  /* Mark each used value as live.  */
 	  FOR_EACH_INSN_USE (use, insn)
 	mark_ref_live (use);
@@ -1286,6 +1302,17 @@ process_bb_node_lives (ira_loop_tree_nod
 		}
 	}
 
+	  if (clear_pic_use_conflict_p)
+	{
+	  regno = REGNO (pic_offset_table_rtx);
+	  a = ira_curr_regno_allocno_map[regno];
+	  CLEAR_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (ALLOCNO_OBJECT (a, 0)),
+  REAL_PIC_OFFSET_TABLE_REGNUM);
+	  CLEAR_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS
+  (ALLOCNO_OBJECT (a, 0)),
+  REAL_PIC_OFFSET_TABLE_REGNUM);
+	}
+
 	  curr_point++;
 	}

Re: [PATCH] mn10300: Fix an ICE

2014-11-25 Thread David Malcolm

On Tue, 2014-11-25 at 10:15 -0700, Jeff Law wrote:
 On 11/25/14 10:14, Segher Boessenkool wrote:
  On Tue, Nov 25, 2014 at 09:44:35AM -0700, Jeff Law wrote:
  On 11/24/14 20:37, Segher Boessenkool wrote:
  `lcc' is not an insn but just a pattern.  This caused a build error in
  libgcc.
 
  A good example of a case that would have been caught if we get to a
  point where stuff in the insn chain are not RTX objects, but something
  else entirely.
 
  Hey, it already did ICE, easy to catch.  But you mean wouldn't even
  compile I guess :-)
 Exactly.  This kind of problem is something I want to catch at compile 
 time rather than at runtime.

Right.  FWIW I have a set of patches that converts PATTERN() to
requiring a const rtx_insn * rather than a const_rtx, but so far they
only compile on x86_64.  Extending them to cover all archs would have
caught this at compile time, I guess, since lcc would have been just
an rtx.

Presumably something for the next stage1.

Re: [patch, build] Restore bootstrap in building libcc1 on darwin

2014-11-25 Thread Mike Stump

On Nov 23, 2014, at 4:06 PM, FX fxcoud...@gmail.com wrote:
 One question to build maintainers, and one patch submitted to top-level 
 configure.ac

So, not sure who wants to review this.  From the darwin perspective, Ok.

Re: patch to fix PR63527

2014-11-25 Thread H.J. Lu

On Tue, Nov 25, 2014 at 12:22 PM, Vladimir Makarov vmaka...@redhat.com wrote:
 The following patch fixes

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63527

 The patch was tested and bootstrapped on x86/x86-64.

 Committed as rev. 218509.

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63527

I checked in this testcase.

Thanks.

-- 
H.J.
---
Index: ChangeLog
===
--- ChangeLog (revision 218060)
+++ ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2014-11-25  H.J. Lu  hongjiu...@intel.com
+
+ PR target/63527
+ * gcc.target/i386/pr63527.c: New test.
+
 2014-11-25  Martin Liska  mli...@suse.cz

  PR bootstrap/64050
Index: gcc.target/i386/pr63527.c
===
--- gcc.target/i386/pr63527.c (revision 0)
+++ gcc.target/i386/pr63527.c (working copy)
@@ -0,0 +1,25 @@
+/* PR rtl-optimization/pr63527 */
+/* { dg-do compile { target { ia32 && fpic } } } */
+/* { dg-options "-O2 -fPIC" } */
+
+struct cache_file
+{
+  char magic[sizeof "ld.so-1.7.0" - 1];
+  unsigned int nlibs;
+};
+typedef unsigned int size_t;
+size_t cachesize __attribute__ ((visibility ("hidden")));
+struct cache_file *cache __attribute__ ((visibility ("hidden")));
+extern int __munmap (void *__addr, size_t __len);
+void
+_dl_unload_cache (void)
+{
+  if (cache != ((void *)0) && cache != (struct cache_file *) -1)
+    {
+      __munmap (cache, cachesize);
+      cache = ((void *)0) ;
+    }
+}
+
+/* We shouldn't load EBX again.  */
+/* { dg-final { scan-assembler-not "movl\[ \t\]%\[^,\]+, %ebx" } } */

Re: [PATCH] Fix PR ipa/61190, updated

2014-11-25 Thread Jan Hubicka

  Index: gcc/ipa-pure-const.c
  ===
  --- gcc/ipa-pure-const.c (revision 215888)
  +++ gcc/ipa-pure-const.c (working copy)
  @@ -744,6 +744,8 @@ analyze_function (struct cgraph_node *fn, bool ipa
  {
  /* Thunk gets propagated through, so nothing interesting happens. */
  gcc_assert (ipa);
  + if (fn->thunk.virtual_offset_p)
  + l->pure_const_state = IPA_NEITHER;
  return l;
  }
 
 
 Hmm, I looked again at the above if statement, and I think now it should
 better be if (fn->thunk.thunk_p && fn->thunk.virtual_offset_p), because
 thunk.virtual_offset_p is probably not well defined if we come here because
 of fn->alias == true.

Yes, that is right.  I plan to put the other thunk info off the structure 
anyway.
 
  This makes the lattice to be initialized correctly, but you also need the
  function_symbol calls that will skip thunks replaced by
  something like function_or_non_virtual_thunk_symbol.
 
 
 Oh, I see what you mean, thanks.
 
 I created a new method function_or_virtual_thunk_symbol() for this.
 And simplified the algorithm of both function_symbol variants a bit.
 
 Attached, you'll find my updated patch for review.
 
 Boot-strapped and regression tested on x86_64-linux-gnu.
 OK for trunk?
 
 
 Thanks
 Bernd.
 
  Can you, please, send the updated patch?
  Sorry for late review,
  Honza
 
 
 

 2014-11-25  Bernd Edlinger  bernd.edlin...@hotmail.de
 
   PR ipa/61190
   * cgraph.h (symtab_node::call_for_symbol_and_aliases): Fix comment.
   (cgraph_node::function_or_virtual_thunk_symbol): New function.
   (cgraph_node::call_for_symbol_and_aliases): Fix comment.
   (cgraph_node::call_for_symbol_thunks_and_aliases): Adjust comment.
   Add new optional parameter exclude_virtual_thunks.
   * cgraph.c (cgraph_node::call_for_symbol_thunks_and_aliases): Add new
   optional parameter exclude_virtual_thunks.
   (cgraph_node::set_const_flag): Don't propagate to virtual thunks.
   (cgraph_node::set_pure_flag): Likewise.
   (cgraph_node::function_symbol): Simplified.
   (cgraph_node::function_or_virtual_thunk_symbol): New function.
   * ipa-pure-const.c (analyze_function): For virtual thunks set
   pure_const_state to IPA_NEITHER.
   (propagate_pure_const): Use function_or_virtual_thunk_symbol.

OK,
Honza
 
 testsuite/ChangeLog:
 2014-11-25  Bernd Edlinger  bernd.edlin...@hotmail.de
 
   PR ipa/61190
   * g++.old-deja/g++.mike/p4736b.C: Use -O2.

Re: [patch, build] Restore bootstrap in building libcc1 on darwin

2014-11-25 Thread Phil Muldoon

On 25/11/14 20:37, Mike Stump wrote:
 On Nov 23, 2014, at 4:06 PM, FX fxcoud...@gmail.com wrote:
 One question to build maintainers, and one patch submitted to top-level 
 configure.ac

 So, not sure who wants to review this.  From the darwin perspective, Ok.

I mean from my limited viewpoint it looks fine. As long as the .so is
built, that's really our only goal from a GDB point of view.  But I am
not a maintainer, so I have refrained from commenting on this change,
as it seems fairly straightforward.  Though I am no expert on GCC
build systems.

Cheers

Phil

Re: [PATCH] pr61324 pr 63649 - fix crash in ipa_comdats

2014-11-25 Thread Jan Hubicka

 From: Trevor Saunders tsaund...@mozilla.com
 
 Hi,
 
 the interesting symbol in the test case for pr61324 is __GLOBAL__sub_I_s.  It
 refers to nothing, and is called by nothing, however it is kept (I believe
 because of -fkeep-inline-functions).  That means ipa_comdats never tries to 
 put

Aha, that explains why it is around.

 it in a comdat, and so it never ends up in the hash table.  It seems like the
 simplest solution is to just check if symbol is not in the map before trying 
 to
 get the comdat it should go in, but another approach might be to keep separate
 hash maps for comdat functions and functions that can't be in any comdat, and
 then iterate over only the functions that belong in a comdat.

Well, -fkeep-inline-functions promises you that you can call any inline
function from the debugger.  I suppose in this case you also want to be able
to call static functions.  The comdat pass may bundle the function into a
comdat that is later optimized away by the linker, so I would say we just want
to disable the whole comdat pass when -fkeep-inline-functions is used?

Patch for that is preapproved.
Honza
 
 bootstrapped + regtested x86_64-unknown-linux-gnu, ok?
 
 Trev
 
 gcc/
 
   * ipa-comdats.c (ipa_comdats): Check if map contains symbol before
   trying to put symbol in a comdat.
 
 diff --git a/gcc/ipa-comdats.c b/gcc/ipa-comdats.c
 index af2aef8..8695a7e 100644
 --- a/gcc/ipa-comdats.c
 +++ b/gcc/ipa-comdats.c
 @@ -327,18 +327,18 @@ ipa_comdats (void)
       && !symbol->alias
       && symbol->real_symbol_p ())
    {
 -    tree group = *map.get (symbol);
 +    tree *group = map.get (symbol);
  
 -    if (group == error_mark_node)
 +    if (!group || *group == error_mark_node)
        continue;
      if (dump_file)
        {
          fprintf (dump_file, "Localizing symbol\n");
          symbol->dump (dump_file);
 -        fprintf (dump_file, "To group: %s\n", IDENTIFIER_POINTER (group));
 +        fprintf (dump_file, "To group: %s\n", IDENTIFIER_POINTER (*group));
        }
      symbol->call_for_symbol_and_aliases (set_comdat_group,
 -                                         *comdat_head_map.get (group),
 +                                         *comdat_head_map.get (*group),
                                           true);
   }
  }
 diff --git a/gcc/testsuite/g++.dg/pr61324.C b/gcc/testsuite/g++.dg/pr61324.C
 new file mode 100644
 index 000..6102574
 --- /dev/null
 +++ b/gcc/testsuite/g++.dg/pr61324.C
 @@ -0,0 +1,13 @@
 +// { dg-do compile }
 +// { dg-options "-O -fkeep-inline-functions -fno-use-cxa-atexit" }
 +void foo ();
 +
 +struct S
 +{
 +  ~S ()
 +  {
 +foo ();
 +  }
 +};
 +
 +S s;
 -- 
 2.1.3
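
The shape of the fix in the patch above (look the key up, and check the returned pointer before dereferencing it) can be sketched in plain C.  This is only an illustration, not GCC's hash_map: the table type and the names lookup_group and group_or_default are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a symbol-to-group map entry.  */
struct entry
{
  const char *symbol;
  const char *group;   /* NULL models "no usable group".  */
};

/* Return a pointer to the group slot for SYMBOL, or NULL if SYMBOL was
   never entered into the table.  */
static const char *const *
lookup_group (const struct entry *table, size_t n, const char *symbol)
{
  assert (table != NULL && symbol != NULL);

  for (size_t i = 0; i < n; i++)
    if (strcmp (table[i].symbol, symbol) == 0)
      return &table[i].group;

  return NULL;
}

/* Return the group for SYMBOL, or FALLBACK when the symbol is missing
   or has no usable group.  */
static const char *
group_or_default (const struct entry *table, size_t n,
                  const char *symbol, const char *fallback)
{
  const char *const *group = lookup_group (table, n, symbol);

  /* Check the lookup result before dereferencing it.  */
  if (!group || *group == NULL)
    return fallback;

  return *group;
}
```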

Re: patch to fix PR63527

2014-11-25 Thread H.J. Lu

On Tue, Nov 25, 2014 at 12:54 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Tue, Nov 25, 2014 at 12:22 PM, Vladimir Makarov vmaka...@redhat.com 
 wrote:
 The following patch fixes

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63527

 The patch was tested and bootstrapped on x86/x86-64.

 Committed as rev. 218509.

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63527

 I checked in this testcase.

 Thanks.

 --
 H.J.
 ---
 Index: ChangeLog
 ===
 --- ChangeLog (revision 218060)
 +++ ChangeLog (working copy)
 @@ -1,3 +1,8 @@
 +2014-11-25  H.J. Lu  hongjiu...@intel.com
 +
 + PR target/63527
 + * gcc.target/i386/pr63527.c: New test.
 +
  2014-11-25  Martin Liska  mli...@suse.cz

Added another testcase from

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63534


-- 
H.J.
---
Index: ChangeLog
===
--- ChangeLog (revision 218061)
+++ ChangeLog (working copy)
@@ -1,5 +1,10 @@
 2014-11-25  H.J. Lu  hongjiu...@intel.com

+ PR target/63534
+ * gcc.target/i386/pr63534.c: New test.
+
+2014-11-25  H.J. Lu  hongjiu...@intel.com
+
  PR target/63527
  * gcc.target/i386/pr63527.c: New test.

Index: gcc.target/i386/pr63534.c
===
--- gcc.target/i386/pr63534.c (revision 0)
+++ gcc.target/i386/pr63534.c (working copy)
@@ -0,0 +1,15 @@
+/* PR target/pr63534 */
+/* { dg-do compile { target { ia32 && fpic } } } */
+/* { dg-options "-O2 -fPIC" } */
+
+extern void bar (void);
+
+void
+foo (void)
+{
+  bar ();
+  bar ();
+}
+
+/* We shouldn't load EBX again.  */
+/* { dg-final { scan-assembler-not "movl\[ \t\]%\[^,\]+, %ebx" } } */

Re: [Patch] Improving jump-thread pass for PR 54742

2014-11-25 Thread Sebastian Pop

Sebastian Pop wrote:
 I will bootstrap and regression test this patch on x86_64-linux and
 powerpc64-linux.  I will also run it on our internal benchmarks, coremark, and
 the llvm test-suite.
 
 I will also include a longer testcase that makes sure we do not regress on
 coremark.

Done all the above.  Attached is the new patch with a new testcase.  I have also
added verify_seme inspired by the recent patch adding verify_sese.

Sebastian
From ca222d5222fb976c7aa258d3e3c04e593f42f7a2 Mon Sep 17 00:00:00 2001
From: Sebastian Pop s@samsung.com
Date: Fri, 26 Sep 2014 14:54:20 -0500
Subject: [PATCH] extend jump thread for finite state automata PR 54742

Adapted from a patch from James Greenhalgh.

	* params.def (max-fsm-thread-path-insns, max-fsm-thread-length,
	max-fsm-thread-paths): New.

	* doc/invoke.texi (max-fsm-thread-path-insns, max-fsm-thread-length,
	max-fsm-thread-paths): Documented.

	* tree-cfg.c (split_edge_bb_loc): Export.
	* tree-cfg.h (split_edge_bb_loc): Declared extern.

	* tree-ssa-threadedge.c (simplify_control_stmt_condition): Restore the
	original value of cond when simplification fails.
	(fsm_find_thread_path): New.
	(fsm_find_control_statement_thread_paths): New.
	(fsm_thread_through_normal_block): Call find_control_statement_thread_paths.

	* tree-ssa-threadupdate.c (dump_jump_thread_path): Pretty print
	EDGE_START_FSM_THREAD.
	(verify_seme): New.
	(duplicate_seme_region): New.
	(thread_through_all_blocks): Generate code for EDGE_START_FSM_THREAD edges
	calling gimple_duplicate_sese_region.

	* tree-ssa-threadupdate.h (jump_thread_edge_type): Add EDGE_START_FSM_THREAD.

	* testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c: New.
	* testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c: New.
---
 gcc/doc/invoke.texi  |   12 ++
 gcc/params.def   |   15 ++
 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c |   43 +
 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c |  127 +
 gcc/tree-cfg.c   |2 +-
 gcc/tree-cfg.h   |1 +
 gcc/tree-ssa-threadedge.c|  215 +-
 gcc/tree-ssa-threadupdate.c  |  201 +++-
 gcc/tree-ssa-threadupdate.h  |1 +
 9 files changed, 614 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 89edddb..074183f 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -10624,6 +10624,18 @@ large and significantly increase compile time at optimization level
 @option{-O1} and higher.  This parameter is a maximum nubmer of statements
 in a single generated constructor.  Default value is 5000.
 
+@item max-fsm-thread-path-insns
+Maximum number of instructions to copy when duplicating blocks on a
+finite state automaton jump thread path.  The default is 100.
+
+@item max-fsm-thread-length
+Maximum number of basic blocks on a finite state automaton jump thread
+path.  The default is 10.
+
+@item max-fsm-thread-paths
+Maximum number of new jump thread paths to create for a finite state
+automaton.  The default is 50.
+
 @end table
 @end table
 
diff --git a/gcc/params.def b/gcc/params.def
index 9b21c07..edf3f53 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1140,6 +1140,21 @@ DEFPARAM (PARAM_CHKP_MAX_CTOR_SIZE,
 	  "Maximum number of statements to be included into a single static "
 	  "constructor generated by Pointer Bounds Checker",
 	  5000, 100, 0)
+
+DEFPARAM (PARAM_MAX_FSM_THREAD_PATH_INSNS,
+	  "max-fsm-thread-path-insns",
+	  "Maximum number of instructions to copy when duplicating blocks on a finite state automaton jump thread path",
+	  100, 1, 99)
+
+DEFPARAM (PARAM_MAX_FSM_THREAD_LENGTH,
+	  "max-fsm-thread-length",
+	  "Maximum number of basic blocks on a finite state automaton jump thread path",
+	  10, 1, 99)
+
+DEFPARAM (PARAM_MAX_FSM_THREAD_PATHS,
+	  "max-fsm-thread-paths",
+	  "Maximum number of new jump thread paths to create for a finite state automaton",
+	  50, 1, 99)
 /*
 
 Local variables:
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
new file mode 100644
index 0000000..bb34a74
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
@@ -0,0 +1,43 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dom1-details" } */
+/* { dg-final { scan-tree-dump-times "FSM" 6 "dom1" } } */
+/* { dg-final { cleanup-tree-dump dom1 } } */
+
+int sum0, sum1, sum2, sum3;
+int foo (char *s, char **ret)
+{
+  int state=0;
+  char c;
+
+  for (; *s && state != 4; s++)
+{
+  c = *s;
+  if (c == '*')
+	{
+	  s++;
+	  break;
+	}
+  switch (state)
+	{
+	case 0:
+	  if (c == '+')
+	state = 1;
+	  else if (c != '-')
+	sum0+=c;
+	  break;
+	case 1:
+	  if (c == '+')
+

Re: [Patch, Fortran] convert almost all {warning,error}_now to common diagnostic

2014-11-25 Thread FX

 (a) those majority which might need buffering (gfc_error, gfc_warning);

Is there a plan for those in the longer term?


 Bootstrapped and regtested on x86-64-gnu-linux.
 OK for the trunk?

OK

Re: [PATCH 3/5] combine: add regno field to LOG_LINKS

2014-11-25 Thread Segher Boessenkool

On Tue, Nov 25, 2014 at 11:46:52AM -0700, Jeff Law wrote:
 On 11/14/14 12:19, Segher Boessenkool wrote:
 With this new field in place, we can have LOG_LINKS for insns that set
 more than one register and distribute them properly in distribute_links.
 This then allows many more PARALLELs to be combined.
 
 Also split off new functions can_combine_{def,use}_p from the
 create_log_links function.
 
 
 2014-11-14  Segher Boessenkool  seg...@kernel.crashing.org
 
 gcc/
  * combine.c (struct insn_link): New field `regno'.
  (alloc_insn_link): New parameter `regno'.  Use it.
  (find_single_use): Check the new field.
  (can_combine_def_p, can_combine_use_p): New functions.  Split
  off from ...
  (create_log_links): ... here.  Correct data type of `regno'.
  Adjust call to alloc_insn_link.
  (adjust_for_new_dest): Find regno, use it in call to
  alloc_insn_link.
  (try_combine): Adjust call to alloc_insn_link.
  (distribute_links): Check the new field.

 Didn't you lose the check that avoids duplicated LOG_LINKs?

I don't think so; if I did, that's a bug.

 Or is the claim that the check is no longer needed because there are no
 duplicates now that we include the register associated with the link?

Are you talking about create_log_links?  Indeed, there can be no duplicates
there (anymore); a duplicate would require multiple defs of the same reg in
the same insn.

I did check all the places that look at links, and adjusted everything
that needed adjusting.  Could have missed something of course...


Segher
