Re: [wwwdocs, patch] gcc-4.9/changes.html: Add quip about #pragma GCC ivdep and update Fortran section

2013-10-25 Thread Tobias Burnus

Gerald Pfeifer wrote:

On Thu, 24 Oct 2013, Tobias Burnus wrote:

Any comments? Or is the patch OK?

thanks for doing this.


Thanks for looking at the patch. However, the patch has a link problem. 
The documentation is at

   http://gcc.gnu.org/onlinedocs/gcc/Loop_002dSpecific-Pragmas.html

That's also the link I use in the changes.html file. However, some 
script changes the link to:

   http://gcc.gnu.org/onlinedocs/gcc/Loop-Specific-Pragmas.html
which won't work. Try yourself at http://gcc.gnu.org/gcc-4.9/changes.html


Actually, a similar issue was reported at 
http://gcc.gnu.org/ml/gcc-help/2013-10/msg00132.html


Does anyone know which script modifies those links?

Tobias


Re: Aliasing: look through pointer's def stmt

2013-10-25 Thread Marc Glisse

On Thu, 24 Oct 2013, Jeff Law wrote:


On 10/24/13 23:23, Marc Glisse wrote:

Hello,

I noticed that in some cases we were failing to find aliasing
information because we were only looking at an SSA_NAME variable,
missing the fact that it was really an ADDR_EXPR. The attached patch
passes bootstrap+testsuite, does it make sense? (I am a bit afraid of
losing some type information for instance)

I didn't investigate the 2 tests where I had to remove dg-bogus, because
removing dg-bogus sounds like a bonus...

2013-10-25  Marc Glisse  marc.gli...@inria.fr

gcc/
 * tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Look for an
 ADDR_EXPR in the defining statement.

Shouldn't the ADDR_EXPR have been propagated into the use?


Maybe when the address is a constant, but here it comes from malloc.

--
Marc Glisse


Re: [RFC] Fix context-sensitiveness of peephole2 pass

2013-10-25 Thread Jakub Jelinek
On Thu, Oct 24, 2013 at 11:14:42PM -0600, Jeff Law wrote:
 On 10/24/13 15:10, Eric Botcazou wrote:
 As discovered by Richard B. under PR rtl-optimization/58831, the peephole2
 pass has been context-sensitive for a long time when scratch registers are
 needed, in the sense that the behaviour of the pass for a given function is
 dependent on what happened for the previously optimized function.
 
 Obvious patch attached, tested on x86_64-suse-linux.  Do we want to apply it
 on mainline only or on all the active branches?
 
 
 2013-10-24  Eric Botcazou  ebotca...@adacore.com
 
  * recog.c (search_ofs): New static variable moved from...
  (peep2_find_free_register): ...here.
  (peephole2_optimize): Initialize it.
 OK for trunk.  Release branch owners have the final call for their branches.

This is ok for the release branches too.

Jakub


Re: [Patch, C++] Add C++ FE support for #pragma ivdep

2013-10-25 Thread Tobias Burnus

Jason Merrill wrote:

On 10/10/2013 04:46 AM, Tobias Burnus wrote:

I considered to add the annotation also to C++11's range-based loops,
but as those are unlikely to vectorize, I didn't do so.


I would think that a range-based loop over an array should vectorize
nicely:

int ar[8];
for (int i: ar) { ... }


They do - but during my experiments they either also did without 
#pragma GCC ivdep – typically with static arrays like yours – or they 
didn't with vector.


However, I came now up with an example where #pragma GCC ivdep should 
have an effect:


int ar[100];

void foo(int *a) {
  for (auto i : ar) {
i *= *a;
  }
}

Therefore, I will send a follow up patch.

Tobias


RE: [PATCH, PR 57748] Check for out of bounds access, Part 2

2013-10-25 Thread Bernd Edlinger
Hi Richard,


 Did you just propose:

 --- stor-layout.c.orig 2013-10-22 10:46:49.233261818 +0200
 +++ stor-layout.c 2013-10-24 14:54:00.425259062 +0200
 @@ -471,27 +471,7 @@
 static enum machine_mode
 mode_for_array (tree elem_type, tree size)
 {
 - tree elem_size;
 - unsigned HOST_WIDE_INT int_size, int_elem_size;
 - bool limit_p;
 -
 - /* One-element arrays get the component type's mode. */
 - elem_size = TYPE_SIZE (elem_type);
 - if (simple_cst_equal (size, elem_size))
 - return TYPE_MODE (elem_type);
 -
 - limit_p = true;
 - if (host_integerp (size, 1)  host_integerp (elem_size, 1))
 - {
 - int_size = tree_low_cst (size, 1);
 - int_elem_size = tree_low_cst (elem_size, 1);
 - if (int_elem_size 0
 -  int_size % int_elem_size == 0
 -  targetm.array_mode_supported_p (TYPE_MODE (elem_type),
 - int_size / int_elem_size))
 - limit_p = false;
 - }
 - return mode_for_size_tree (size, MODE_INT, limit_p);
 + return BLKmode;
 }

 ???

 Yes. Does it work?

 Richard.


No.

PASS: gcc.target/i386/pr14289-1.c  (test for errors, line 10)
FAIL: gcc.target/i386/pr14289-1.c (test for excess errors)
Excess errors:
/home/ed/gnu/gcc-4.9-20131013/gcc/testsuite/gcc.target/i386/pr14289-1.c:5:14: 
error: data type of 'a' isn't suitable for a register

Does that look like an ABI-Change?

the test case uses:
register int a[2] asm(ebx);

Bernd.

Re: [Patch, C++] Add C++ FE support for #pragma ivdep

2013-10-25 Thread Jakub Jelinek
On Fri, Oct 25, 2013 at 08:22:22AM +0200, Tobias Burnus wrote:
 Jason Merrill wrote:
 On 10/10/2013 04:46 AM, Tobias Burnus wrote:
 I considered to add the annotation also to C++11's range-based loops,
 but as those are unlikely to vectorize, I didn't do so.
 
 I would think that a range-based loop over an array should vectorize
 nicely:
 
 int ar[8];
 for (int i: ar) { ... }
 
 They do - but during my experiments they either also did without
 #pragma GCC ivdep – typically with static arrays like yours – or
 they didn't with vector.
 
 However, I came now up with an example where #pragma GCC ivdep
 should have an effect:
 
 int ar[100];
 
 void foo(int *a) {
   for (auto i : ar) {
 i *= *a;
   }
 }
 
 Therefore, I will send a follow up patch.

What about
#pragma GCC ivdep
while (cond) { ... }
and
#pragma GCC ivdep
do { ... } while (cond);
(for both C and C++)?

Jakub


Today's MPX trunk patches (Rev. 204046): i386.md:18329:2: error: 'TARGET_MPX' was not declared in this scope

2013-10-25 Thread Tobias Burnus

Hi Kirill,

with the current trunk (newst git mirror version), bootstrapping fails 
here with the following error. Did you forget to commit one file?



g++ -c   -g -DIN_GCC-fno-exceptions -fno-rtti 
-fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings 
-Wcast-qual -Wmissing-format-attribute -pedantic -Wno-long-long 
-Wno-variadic-macros -Wno-overlength-strings -fno-common 
-DHAVE_CONFIG_H -DGENERATOR_FILE -I. -Ibuild -I../../gcc 
-I../../gcc/build -I../../gcc/../include -I../../gcc/../libcpp/include 
-I../../gcc/../libdecnumber -I../../gcc/../libdecnumber/bid 
-I../libdecnumber -I../../gcc/../libbacktrace -DCLOOG_INT_GMP\

-o build/gencondmd.o build/gencondmd.c
../../gcc/config/i386/i386.md:18329:2: error: 'TARGET_MPX' was not 
declared in this scope

   TARGET_MPX
  ^
../../gcc/config/i386/i386.md:18329:2: error: 'TARGET_MPX' was not 
declared in this scope

   TARGET_MPX
  ^
../../gcc/config/i386/i386.md:18329:2: error: 'TARGET_MPX' was not 
declared in this scope

   TARGET_MPX
  ^
../../gcc/config/i386/i386.md:18329:2: error: 'TARGET_MPX' was not 
declared in this scope

   TARGET_MPX
  ^

Tobias


Re: Today's MPX trunk patches (Rev. 204046): i386.md:18329:2: error: 'TARGET_MPX' was not declared in this scope

2013-10-25 Thread Michael Zolotukhin
Adding Ilya.

On 25 October 2013 10:48, Tobias Burnus bur...@net-b.de wrote:
 Hi Kirill,

 with the current trunk (newst git mirror version), bootstrapping fails
 here with the following error. Did you forget to commit one file?


 g++ -c   -g -DIN_GCC-fno-exceptions -fno-rtti
 -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
 -Wcast-qual -Wmissing-format-attribute -pedantic -Wno-long-long
 -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H
 -DGENERATOR_FILE -I. -Ibuild -I../../gcc -I../../gcc/build
 -I../../gcc/../include -I../../gcc/../libcpp/include
 -I../../gcc/../libdecnumber -I../../gcc/../libdecnumber/bid
 -I../libdecnumber -I../../gcc/../libbacktrace -DCLOOG_INT_GMP\
 -o build/gencondmd.o build/gencondmd.c
 ../../gcc/config/i386/i386.md:18329:2: error: 'TARGET_MPX' was not declared
 in this scope
TARGET_MPX
   ^
 ../../gcc/config/i386/i386.md:18329:2: error: 'TARGET_MPX' was not declared
 in this scope
TARGET_MPX
   ^
 ../../gcc/config/i386/i386.md:18329:2: error: 'TARGET_MPX' was not declared
 in this scope
TARGET_MPX
   ^
 ../../gcc/config/i386/i386.md:18329:2: error: 'TARGET_MPX' was not declared
 in this scope
TARGET_MPX
   ^

 Tobias



-- 
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.


[PATCH, i386]: Unbreak bootstrap

2013-10-25 Thread Uros Bizjak
Hello!

2013-10-25  Uros Bizjak  ubiz...@gmail.com

* config/i386/i386.h (TARGET_MPX): New define.
(TARGET_MPX_P): Ditto.

Tested on x86_64-pc-linux-gnu, committed.

Uros.
Index: i386.h
===
--- i386.h  (revision 204047)
+++ i386.h  (working copy)
@@ -128,6 +128,8 @@
 #define TARGET_XSAVE_P(x)  TARGET_ISA_XSAVE_P(x)
 #define TARGET_XSAVEOPTTARGET_ISA_XSAVEOPT
 #define TARGET_XSAVEOPT_P(x)   TARGET_ISA_XSAVEOPT_P(x)
+#define TARGET_MPX TARGET_ISA_MPX
+#define TARGET_MPX_P(x)TARGET_ISA_MPX_P(x)
 
 #define TARGET_LP64TARGET_ABI_64
 #define TARGET_LP64_P(x)   TARGET_ABI_64_P(x)


Re: wwwdocs / gomp: Update projects/gomp/index.html

2013-10-25 Thread Jakub Jelinek
On Fri, Oct 25, 2013 at 03:11:40AM +0200, Gerald Pfeifer wrote:
 Hi Tobias,
 
 On Thu, 24 Oct 2013, Tobias Burnus wrote:
 the attached patch updates the gomp project patch.
 
 (Besides adding an item for the OpenMP 4.0 merge, I removed the 95 from
 Fortran 95 as Fortran is backward compatible and gfortran is effectively a
 Fortran 66/77/90/95/2003/2008 compiler [minus bugs and some missing
 features].)
 
 OK?
 
 looks good to me, though please give Jakub a bit to chime in in
 case.
 
 One thing to consider is replacing SVN mainline by mainline GCC;
 that way it's easier to understand and we avoid referring to our
 revision control system of the day (well, decade ;-).

LGTM.

Jakub


Re: lto-plugin: mismatch between ld's architecture and GCC's configure --host

2013-10-25 Thread Thomas Schwinge
Hi!

Ping.  To sum it up, with these patches applied, there are no changes for
a regular build (not using the new configure options).  On the other
hand, configuring GCC as described, it is possible use the 32-bit x86
linker for/with a x86_64 build, and get the very same GCC test results as
when using a x86_64 linker.


On Mon, 14 Oct 2013 12:28:11 +0200, I wrote:
 On Sat, 12 Oct 2013 12:20:19 +0200, I wrote:
  This is a bit of a weird scenario -- but it is supposed to work fine in
  my opinion (but doesn't).
  
  I have a GNU toolchain as 32-bit x86 GNU/Linux executables, configured to
  to generate code for 32-bit x86 by default, and using -m64 for x86_64.
  
  This toolchain I'm using on a x86_64 system (which can execute 32-bit
  executables) to build a *native* GCC, that is I'm using the 32-bit
  toolchain to build a x86_64 GCC (configuring with CC='gcc -m64' CXX='g++
  -m64').  I intend to continue using the 32-bit toolchain's linker, which
  also is a 32-bit executable (GNU ld).  That one also defaults to x86
  code, but can handle the x86_64 case fine if passed -m elf_x86_64, which
  GCC does.
  
  That the linker is a 32-bit executable is an implementation detail that
  is not important generally: it's a separate process living in its own
  address space.  However it becomes relevant in the case of linker
  plugins: the native x86_64 GCC that I'm building also builds a x86_64
  lto-plugin, which the 32-bit ld cannot load:
  
  $ gcc/xgcc -B[...] [...]/gcc.c-torture/execute/ieee/2320-1.c [...] 
  -flto [...]
  [...]/ld: [...]/gcc/liblto_plugin.so: error loading plugin: 
  [...]/gcc/liblto_plugin.so: wrong ELF class: ELFCLASS64
  collect2: error: ld returned 1 exit status
  
  So, aside from building a 64-bit ld (which is the lame alternative), I
  now need to teach GCC's build system that the lto-plugin may need special
  configuration: CC='gcc -m32' -- and [...] its own build of libiberty,
  too [...]
 
  Instead of auto-detecting the linker's
  architecture (and then, what to do with that information?), I intend to
  make this a manual process (so, some new top-level configure
  argument(s)).  Adding yet another set of {...,CC,...}_FOR_[something] is
  probably overkill -- I'll try to find something simpler.
  
  Any comments on this scenario?
 
 Here are the patches.  Unless the new option is exercised, there are no
 effects on a native x86_64 GNU/Linux bootstrap build (the build trees'
 *.o files are identical, as are the test results).  OK to commit?

 Allow overriding the libiberty used for building the LTO plugin.
 
   lto-plugin/
   * configure.ac (--with-libiberty): New configure option.
   * configure: Regenerate.
   * Makefile.am (libiberty, libiberty_pic): New variables.
   (liblto_plugin_la_LIBADD, liblto_plugin_la_LDFLAGS)
   (liblto_plugin_la_DEPENDENCIES): Use them.
   * Makefile.in: Regenerate.
 ---
  lto-plugin/Makefile.am  | 20 +++-
  lto-plugin/Makefile.in  | 22 --
  lto-plugin/configure| 17 +++--
  lto-plugin/configure.ac |  5 +
  4 files changed, 43 insertions(+), 21 deletions(-)
 
 diff --git lto-plugin/Makefile.am lto-plugin/Makefile.am
 index b24015e..8b7bb54 100644
 --- lto-plugin/Makefile.am
 +++ lto-plugin/Makefile.am
 @@ -15,17 +15,19 @@ libexecsub_LTLIBRARIES = liblto_plugin.la
  gcc_build_dir = ../$(host_subdir)/gcc
  in_gcc_libs = $(foreach lib, $(libexecsub_LTLIBRARIES), 
 $(gcc_build_dir)/$(lib))
  
 -# Can be removed when libiberty becomes a normal convenience library
 -Wc=-Wc,
 -
  liblto_plugin_la_SOURCES = lto-plugin.c
 +# Note that we intentionally override the bindir supplied by 
 ACX_LT_HOST_FLAGS.
 +liblto_plugin_la_LDFLAGS = $(lt_host_flags) -module -bindir $(libexecsubdir)
 +# Can be simplified when libiberty becomes a normal convenience library.
 +libiberty=$(with_libiberty)/libiberty.a
 +libiberty_pic=$(with_libiberty)/pic/libiberty.a
 +Wc=-Wc,
  liblto_plugin_la_LIBADD = \
 - $(if $(wildcard 
 ../libiberty/pic/libiberty.a),$(Wc)../libiberty/pic/libiberty.a,)
 -# Note that we intentionally override the bindir supplied by 
 ACX_LT_HOST_FLAGS
 -liblto_plugin_la_LDFLAGS = $(lt_host_flags) -module -bindir $(libexecsubdir) 
 \
 - $(if $(wildcard 
 ../libiberty/pic/libiberty.a),,-Wc,../libiberty/libiberty.a)
 -liblto_plugin_la_DEPENDENCIES = $(if $(wildcard \
 - ../libiberty/pic/libiberty.a),../libiberty/pic/libiberty.a,)
 + $(if $(wildcard $(libiberty_pic)),$(Wc)$(libiberty_pic),)
 +liblto_plugin_la_LDFLAGS += \
 + $(if $(wildcard $(libiberty_pic)),,-Wc,$(libiberty))
 +liblto_plugin_la_DEPENDENCIES = \
 + $(if $(wildcard $(libiberty_pic)),$(libiberty_pic),)
  
  all-local: $(in_gcc_libs)
  
 diff --git lto-plugin/configure.ac lto-plugin/configure.ac
 index 9a418d2..b73fabb 100644
 --- lto-plugin/configure.ac
 +++ lto-plugin/configure.ac
 @@ -4,6 +4,11 @@ AC_CANONICAL_SYSTEM
  GCC_TOPLEV_SUBDIRS
  AM_INIT_AUTOMAKE([foreign no-dist])
  

Re: [Patch Preview/RFC] tree-optimization/57994: Constant folding of infinity

2013-10-25 Thread Richard Biener
On Thu, 24 Oct 2013, Paolo Carlini wrote:

 Hi,
 
 this is just a preview, but I decided to send it out early to understand if
 I'm on the right track or not. As you can see in the Bug, this started as a
 spin-off of a library issue with complex pow, which led us to:
 
 __builtin_exp(-__builtin_huge_val())
 
 not being folded to a constant at compile-time. The reason is simple: the
 various do_mpfr_arg*, etc, all check real_isfinite on the arguments. In the
 audit trail of the bug we came to the conclusion that allowing non-NANs could
 make sense (the mpfr facilities appear to be quite solid in this case - maybe
 for NANs too, at least in the non-complex case, but that's another story).
 However, when today I started fiddling with this kind of change, I noticed
 that isn't enough for the kind of code we really care about for the original
 issue, involving logs too, thus something like:
 
   long double num = __builtin_logl(0L);
   long double res = __builtin_expl(num);
 
 because the log isn't folded at all for zero: fold_builtin_logarithm calls
 do_mpfr_arg1 with true as last argument. We can make progress if, when it's
 safe - is !flag_trapping_math  !flag_errno_math enough? - we pass true
 instead. Then we also have to tweak do_mpfr_ckconv, because it checks
 mpfr_number_p (and real_isfinite) on the result of the folding, thus ruling
 out infinities. Then we are almost there: the latter can be actually fully
 folded if -O1 is used, -O0 is not Ok, because otherwise num isn't const
 propagated to the evaluation of res. Is this so far by and large Ok? I'm
 attaching a corresponding patchlet (it of course lacks testcases and testsuite
 adjustments)
 
 Note: in the patchlet I'm not trying to handle NaNs; neither complex numbers;
 neither bessel, remquo and lgamma (which could probably be added?) Also, I
 suppose we could discover other cases like fold_builtin_logarithm, where, in
 the context of folding infinities too, we may have to tweak a bit the way
 do_mpfr_* functions are called

Generally I think it's quite odd that do_mpfr_arg1 restricts operands
and results in any way.  mpfr should be able to return the correct
(exceptional) value for any inputs.  We have to restrict ourselves
with using the result only in the case we have to set errno (where
we can still constant propagate but have to leave the call around
for errno setting purposes) and if the function is supposed to trap on
exceptional values (same handling as errno).  But this logic should
be really in the callers.  Restricting what arguments we feed into
mpfr should never be necessary unless there are bugs in mpfr (which
should better be reported and fixed).

Not sure if we have to care about SNaN inputs in any way, but we'd
have -fsignalling-nans to control this anyway.

I have no opinion on the patch itself for now, we have to decide
first whether to trust mpfr enough to feed it with exceptional
values (where obviously the range test of do_mpfr_arg? doesn't work,
but AFAICS it's only to avoid errno cases which should be done
separately).

For example

@@ -8191,7 +8191,9 @@ fold_builtin_logarithm (location_t loc, tree fndec
   const enum built_in_function fcode = builtin_mathfn_code (arg);
 
   /* Calculate the result when the argument is a constant.  */
-  if ((res = do_mpfr_arg1 (arg, type, func, dconst0, NULL, false)))
+  if ((res = do_mpfr_arg1 (arg, type, func, dconst0, NULL,
+  !flag_trapping_math  !flag_errno_math
+  ? true : false)))
return res;
 
   /* Special case, optimize logN(expN(x)) = x.  */

we should be able to constant fold for all inputs, but if arg = 0
and flag_trapping_math || flag_errno_math then we have to fold to

 ( log (arg), constant-folding-result )

especially for the -ferrno-math case this allows to continue constant
folding with the result, removing otherwise dead code.

Not sure if this fixes the not constant folding (as the result
is a constant but with side-effects that still need to be executed
and represented by calls with unused result).

Richard.


[committed] Dump OpenMP version number listed for -fopenmp at doc/invoke.texi

2013-10-25 Thread Tobias Burnus
Dear all,

I have committed the patch below as obvious (Rev. 204049).
Seemingly, there are a lot of places which have to be updated for
a new OpenMP version ...

Tobias

Index: gcc/ChangeLog
===
--- gcc/ChangeLog   (Revision 204048)
+++ gcc/ChangeLog   (Arbeitskopie)
@@ -1,3 +1,8 @@
+2013-10-25  Tobias Burnus  bur...@net-b.de
+
+   * doc/invoke.texi (fopenmp): Change supported OpenMP version
+   to 4.0.
+
 2013-10-25  Uros Bizjak  ubiz...@gmail.com

* config/i386/i386.h (TARGET_MPX): New define.
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (Revision 204048)
+++ gcc/doc/invoke.texi (Arbeitskopie)
@@ -1825,7 +1825,7 @@ freestanding and hosted environments.
 Enable handling of OpenMP directives @code{#pragma omp} in C/C++ and
 @code{!$omp} in Fortran.  When @option{-fopenmp} is specified, the
 compiler generates parallel code according to the OpenMP Application
-Program Interface v3.0 @w{@uref{http://www.openmp.org/}}.  This option
+Program Interface v4.0 @w{@uref{http://www.openmp.org/}}.  This option
 implies @option{-pthread}, and thus is only supported on targets that
 have support for @option{-pthread}.



Re: [patch] Fix PR rtl-optimization/58831

2013-10-25 Thread Eric Botcazou
 Actually I think it's jfc's code; he had a major revamp of alias.c back
 in the early egcs days and getting it integrated was one of the
 project's early wins.  Sadly, jfc hasn't been involved in GCC work for a
 long long time.

OK, thanks for the clarification.

 Anyway, I think the issue here is that once we set
 new_reg_base_value[regno] to zero, when we do encounter insn 30, we
 assume that's the first set of (reg 4) and we use the RHS as a base value.
 
 The result is we end up with something in new_reg_base_value[4] that is
 not correct for the entire life.  ie, from function entry to the clobber
 (reg 4) has a value totally unrelated to what we've stored in
 new_reg_base_value[4]?  RIght?

Yes, that's exactly it.

 Seems reasonable to me, assuming my understandings noted above are correct.

Thanks.

-- 
Eric Botcazou


Re: [ARM][PATCH] Fix testsuite testcase neon-vcond-[ltgt,unordered].c

2013-10-25 Thread Kyrill Tkachov

On 24/10/13 20:03, Kugan wrote:


Hi Kyrill,

It happens for armv5te arm-none-linux-gnueabi. --with-mode=arm
--with-arch=armv5te --with-float=soft


Ah ok, I can reproduce it now. So, while I agree that we add a scan for vbit and 
vbif to these testcases, there seems to be something dodgy going on with the 
register allocation.


With -march=armv5te I'm getting the following snippet of code in the ltgt case:

.L12:
ldr r4, [ip]
ldr r5, [ip, #4]
ldr r6, [ip, #8]
ldr r7, [ip, #12]
vmovd20, r4, r5  @ v4sf
vmovd21, r6, r7
vcgt.f32q8, q10, q9
vcgt.f32q10, q9, q10
vorrq8, q8, q10
vmovd22, r4, r5  @ v4sf
vmovd23, r6, r7
vbitq11, q9, q8
vmovr4, r5, d22  @ v4sf
vmovr6, r7, d23

The second vcgt.f32 trashes q10, then recreates it in q11 with:
vmovd22, r4, r5  @ v4sf
vmovd23, r6, r7

so it can do the vbit. Surely there's something better that can be done?

In contrast, with -march=armv7-a we get:

.L12:
vld1.32 {q9}, [r4]!
vcgt.f32q8, q9, q10
vcgt.f32q11, q10, q9
vorrq8, q8, q11
vbslq8, q10, q9
vst1.32 {q8}, [lr]!



So, while I agree with the patch, there seems to be some funny business with the 
register allocation that could be worth looking into.


Thanks,
Kyrill



You can also find the logs here in
http://cbuild.validation.linaro.org/build/gcc-linaro-4.8-2013.10/logs/armv7l-precise-cbuild461-calxeda02_21_00_precise_armel-armv5r2/

I changed neon-vcond-gt.c too.

Thanks,
Kugan

2013-10-23  Kugan Vivekanandarajah  kug...@linaro.org

* gcc.target/arm/neon-vcond-gt.c: Scan for vbsl or vbit or vbif.
* gcc.target/arm/neon-vcond-ltgt.c: Scan for vbsl or vbit or vbif.
* gcc.target/arm/neon-vcond-unordered.c: Scan for vbsl or vbit or vbif.









Re: [PATCH] RFA: Remove mudflap

2013-10-25 Thread Richard Biener
On Thu, Oct 24, 2013 at 10:35 PM, Jeff Law l...@redhat.com wrote:
 On 10/24/13 07:40, Richard Biener wrote:

 we were supposed to remove mudflap for 4.9, no?


 Really?  I guess it hasn't been removed yet since the include is still
 there?  who is doing that?


 Yeah, nobody has done it yet appearantly :/

 Well, here we go...


 Bootstrapped and regression tested on x86_64-unknown-linux-gnu. Obviously
 the mudflap specific tests were deleted and not run and were manually
 ignored when comparing testsuite runs.

 OK for the trunk?

Ok with preserving options as dummy, see examples like

fargument-alias
Common Ignore
Does nothing. Preserved for backward compatibility.

I don't think we should sorry (), but we can certainly warn that
mudflap was replaced by -fsanitize=address(?).  Implementing
the warning properly isn't a requirement for the patch to go in
(but the options should be still handled, but ignored).  We can
followup with a patch implementing the warning.

Thanks,
Richard.

 jeff


RE: Fix scheduler ix86_issue_rate and ix86_adjust_cost for modern x86 chips

2013-10-25 Thread Gopalasubramanian, Ganesh
 OK, so it is about 2%.  Did you try if you need lookahead even in the early 
 pass (before reload)?  My guess would be so, but if not, it could cut the 
 cost to half.  For -Ofast/-O3 it looks resonable to me, but we will  need to 
 announce it on the ML.  For other settings I think we need to work on more 
 improvements or cut the expenses.

Yes, it is required before reload.  

I have another idea which can be pondered upon. Currently, can we enable 
lookahead with the value 4 (pre reload) for default? This will exponentially 
cut the cost of build time. 
I have done some measurements on the build time of some benchmarks (mentioned 
below) with lookahead value 4. The 2% increase in build time with value 8 is 
now almost gone.

   dfa4   no_lookahead
 
 perlbench   - 191s  193s
 bzip2   - 19s   19s
 gcc - 429s  429s
 mcf - 3s3s
 gobmk   - 116s  115s
 hmmer   - 60s   60s
 sjeng   - 18s   17s
 libquantum  - 6s6s
 h264ref - 107s  107s
 omnetpp - 128s  128s
 astar   - 7s7s
 bwaves  - 5s5s
 gamess  - 1964s 1957s
 milc- 18s   18s
 GemsFDTD- 273s  272s

Lookahead value 4 also helps because, the modified decoder model in bdver3.md 
is only two cycles deep (though in hardware it is actually 4 cycles deep). This 
means that we can look another two levels deep for better schedule.
GemsFDTD still retains the performance boost of around 6-7% with value 4.

Let me know your thoughts.

Regards
Ganesh

-Original Message-
From: Jan Hubicka [mailto:hubi...@ucw.cz] 
Sent: Thursday, October 24, 2013 6:48 PM
To: Gopalasubramanian, Ganesh
Cc: Jan Hubicka; gcc-patches@gcc.gnu.org; Uros Bizjak (ubiz...@gmail.com); H.J. 
Lu (hjl.to...@gmail.com)
Subject: Re: Fix scheduler ix86_issue_rate and ix86_adjust_cost for modern x86 
chips

 Hi,
 
  Is this with -fschedule-insns? Or only with default settings?  Did you test 
  the compile time implications of increasing the lookahead? (value of 8 is 
  very large, we may consider enbling it only for -Ofast, limiting for 
  postreload only or something similar).
 
 The improvement is seen with the options -fschedule-insns  -fschedule-insns2 
 -fsched-pressure
 
 Below are the build times of some of the SPEC benchmarks
 
   dfa8   no_lookahead
 
 perlbench   - 196s  193s
 bzip2   - 19s   19s
 gcc - 439s  429s
 mcf - 3s3s
 gobmk   - 119s  115s
 hmmer   - 62s   60s
 sjeng   - 18s   17s
 libquantum  - 6s6s
 h264ref - 110s  107s
 omnetpp - 132s  128s
 astar   - 7s7s
 bwaves  - 4s5s
 gamess  - 1996s 1957s
 milc- 18s   18s
 GemsFDTD- 276s  272s
 
 I think we can enable it by default rather than for -Ofast.
 Please let me know your inputs.

OK, so it is about 2%.  Did you try if you need lookahead even in the early 
pass (before reload)?  My guess would be so, but if not, it could cut the cost 
to half.  For -Ofast/-O3 it looks resonable to me, but we will need to announce 
it on the ML.  For other settings I think we need to work on more improvmeents 
or cut the expenses.

Honza
 
 Regards
 Ganesh
 
 -Original Message-
 From: Jan Hubicka [mailto:hubi...@ucw.cz]
 Sent: Thursday, October 24, 2013 2:54 PM
 To: Gopalasubramanian, Ganesh
 Cc: gcc-patches@gcc.gnu.org; Uros Bizjak (ubiz...@gmail.com); 
 hubi...@ucw.cz; H.J. Lu (hjl.to...@gmail.com)
 Subject: Re: Fix scheduler ix86_issue_rate and ix86_adjust_cost for 
 modern x86 chips
 
  Attached is the patch which does the following scheduler related changes.
  * re-models bdver3 decoder.
  * It enables lookahead with value 8 for all BD architectures. The patch 
  doesn't consider if reloading is completed or not (an area that needs to be 
  worked on).
  * The issue rate for BD architectures are set to 4.
  
  I see the following performance improvements on bdver3 machine.
  * GemsFDTD improves by 6-7% with lookahead value changed to 8.
  * Hmmer improves by 9% when issue rate when set to 4 .
 
 Is this with -fschedule-insns? Or only with default settings?  Did you test 
 the compile time implications of increasing the lookahead? (value of 8 is 
 very large, we may consider enbling it only for -Ofast, limiting for 
 postreload only or something similar).
 
  
  I have considered the following hardware details for the model.
  * There are four decoders inside a hardware decoder block.
  * These four independent decoders can execute in parallel.  (They can take 
  8B from four different instructions and decode).
  * These four decoders are pipelined 4 cycles deep and are non-stalling.
  

Re: [PATCH] Fix names of various macro parameters in tree.h

2013-10-25 Thread Richard Biener
On Fri, Oct 25, 2013 at 3:17 AM, David Malcolm dmalc...@redhat.com wrote:
 I noticed that some of the macros in tree.h that act on trees have
 parameters named CODE, rather NODE, which is confusing when in the
 presence of other macros that act on enum tree_code values.

 The attached patch renames such params for macros that I believe act on
 trees (mostly because they go ahead and lookup the TREE_CODE() of the
 param).

 Successfully bootstrappedregtested on x86_64-unknown-linux.

 OK for trunk?

Ok.

Thanks,
Richard.

 (fwiw the macros appear to have been this way since they were introduced
 in 87675 in 2004-09-17)

 gcc/
 * tree.h (EXCEPTIONAL_CLASS_P): Rename parameter from CODE
 to NODE, since this works on a tree, not an
 enum tree_code.
 (CONSTANT_CLASS_P): Likewise.
 (TYPE_P): Likewise.
 (DECL_P): Likewise.
 (INDIRECT_REF_P): Likewise.
 (REFERENCE_CLASS_P): Likewise.
 (COMPARISON_CLASS_P): Likewise.
 (UNARY_CLASS_P): Likewise.
 (BINARY_CLASS_P): Likewise.
 (STATEMENT_CLASS_P): Likewise.
 (VL_EXP_CLASS_P): Likewise.
 (EXPRESSION_CLASS_P): Likewise.
 (IS_TYPE_OR_DECL_P): Likewise.



[PATCH] Fix C++0x memory model for -fno-strict-volatile-bitfields on ARM

2013-10-25 Thread Bernd Edlinger
Hello,

this patch fixes the recently discovered data store race on arm-eabi-gcc with 
-fno-strict-volatile-bitfields
for structures like this:

#define test_type unsigned short

typedef struct s{
 unsigned char Prefix[1];
 test_type Type;
}__attribute((__packed__,__aligned__(4))) ss;

volatile ss v;

void __attribute__((noinline))
foo (test_type u)
{
  v.Type = u;
}

test_type __attribute__((noinline))
bar (void)
{
  return v.Type;
}


I've manually confirmed the correct code generation using variations of the
example above on an ARM cross-compiler for -fno-strict-volatile-bitfields.

Note, that this example is still causes ICE's for -fstrict-volatile-bitfields,
but I'd like to fix that separately.

Boot-strapped and regression-tested on x86_64-linux-gnu.

Ok for trunk?

Thanks
Bernd.2013-10-25  Bernd Edlinger  bernd.edlin...@hotmail.de

Fix C++0x memory model for unaligned fields in packed, aligned(4)
structures with -fno-strict-volatile-bitfields on STRICT_ALIGNMENT
targets like arm-none-eabi.
* expmed.c (store_bit_field): Handle unaligned fields like
bit regions.



patch-unaligned-data.diff
Description: Binary data


Re: [PATCH] fixing typo in expr.c to allow proper recognition of complex addresses in some arches.

2013-10-25 Thread Eric Botcazou
 Thanks.  Installed on the trunk.

Well, no, that will be problematic for some architectures.  The history of 
this piece of code is complicated and it's admittedly lacking a comment, but 
the purpose of the block is clear enough:

op0 = expand_expr (base, NULL_RTX, VOIDmode, EXPAND_SUM);
op0 = memory_address_addr_space (address_mode, op0, as);
if (!integer_zerop (TREE_OPERAND (exp, 1)))
  {
rtx off
  = immed_double_int_const (mem_ref_offset (exp), address_mode);
op0 = simplify_gen_binary (PLUS, address_mode, op0, off);
  }
op0 = memory_address_addr_space (mode, op0, as);

The offset computation is done in address_mode and then, only at the end, 
converted to mode.

-- 
Eric Botcazou


Re: [PATCH, i386]: Unbreak bootstrap

2013-10-25 Thread Ilya Enkovich
2013/10/25 Uros Bizjak ubiz...@gmail.com:
 Hello!

 2013-10-25  Uros Bizjak  ubiz...@gmail.com

 * config/i386/i386.h (TARGET_MPX): New define.
 (TARGET_MPX_P): Ditto.

 Tested on x86_64-pc-linux-gnu, committed.

Thanks for fixing it!


 Uros.


Re: Aliasing: look through pointer's def stmt

2013-10-25 Thread Richard Biener
On Fri, Oct 25, 2013 at 8:11 AM, Marc Glisse marc.gli...@inria.fr wrote:
 On Thu, 24 Oct 2013, Jeff Law wrote:

 On 10/24/13 23:23, Marc Glisse wrote:

 Hello,

 I noticed that in some cases we were failing to find aliasing
 information because we were only looking at an SSA_NAME variable,
 missing the fact that it was really an ADDR_EXPR. The attached patch
 passes bootstrap+testsuite, does it make sense? (I am a bit afraid of
 losing some type information for instance)

 I didn't investigate the 2 tests where I had to remove dg-bogus, because
 removing dg-bogus sounds like a bonus...

 2013-10-25  Marc Glisse  marc.gli...@inria.fr

 gcc/
  * tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Look for an
  ADDR_EXPR in the defining statement.

 Shouldn't the ADDR_EXPR have been propagated into the use?


 Maybe when the address is a constant, but here it comes from malloc.

points-to should have propagated the alias info, so no, looking at
def-stmts in random places like this isn't ok.  Where does alias info
get lost?

Thanks,
Richard.

 --
 Marc Glisse


[PATCH] Handle __builtin_unreachable () using assertions in VRP

2013-10-25 Thread Jakub Jelinek
Hi!

As discussed on IRC, this patch attempts to preserve VRP computed
range info for some simple __builtin_unreachable () using assertions.
If there are no immediate uses of some SSA_NAME except for those in
a condition guarding __builtin_unreachable () and in ASSERT_EXPR
in the following basic block, we can copy the range info from the assert
lhs (which is lost during remove_range_assertions) and thus preserve it.
Eventually we might then remove the __builtin_unreachable () in that case
and still benefit from user annotating the source.
In the testcase below in bar we have even without this patch the expected
range info on the SSA_NAME set to x + 1, but in foo there was no range
info preserved.  As we have just one user of get_range_info right now,
not creating a testcase.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

int
foo (int x)
{
  if (x  26 || x  37)
__builtin_unreachable ();
  return x;
}

int
bar (int x)
{
  if (x  26 || x  37)
__builtin_unreachable ();
  return x + 1;
}

2013-10-25  Jakub Jelinek  ja...@redhat.com

* tree-vrp.c (remove_range_assertions): If ASSERT_EXPR_VAR
has no other immediate uses but in the condition and ASSERT_EXPR
and the other successor of the predecessor bb is
__builtin_unreachable (), set_range_info of the ASSERT_EXPR_VAR
to the range info of the ASSERT_EXPR's lhs.

--- gcc/tree-vrp.c.jj   2013-10-24 10:19:21.0 +0200
+++ gcc/tree-vrp.c  2013-10-24 14:32:29.065878208 +0200
@@ -6488,12 +6488,16 @@ remove_range_assertions (void)
 {
   basic_block bb;
   gimple_stmt_iterator si;
+  /* 1 if looking at ASSERT_EXPRs immediately at the beginning of
+ a basic block preceeded by GIMPLE_COND branching to it and
+ __builtin_trap, -1 if not yet checked, 0 otherwise.  */
+  int is_unreachable;
 
   /* Note that the BSI iterator bump happens at the bottom of the
  loop and no bump is necessary if we're removing the statement
  referenced by the current BSI.  */
   FOR_EACH_BB (bb)
-for (si = gsi_start_bb (bb); !gsi_end_p (si);)
+for (si = gsi_after_labels (bb), is_unreachable = -1; !gsi_end_p (si);)
   {
gimple stmt = gsi_stmt (si);
gimple use_stmt;
@@ -6501,30 +6505,96 @@ remove_range_assertions (void)
if (is_gimple_assign (stmt)
 gimple_assign_rhs_code (stmt) == ASSERT_EXPR)
  {
+   tree lhs = gimple_assign_lhs (stmt);
tree rhs = gimple_assign_rhs1 (stmt);
tree var;
tree cond = fold (ASSERT_EXPR_COND (rhs));
-   use_operand_p use_p;
+   use_operand_p use_p, use2_p;
imm_use_iterator iter;
 
gcc_assert (cond != boolean_false_node);
 
-   /* Propagate the RHS into every use of the LHS.  */
var = ASSERT_EXPR_VAR (rhs);
-   FOR_EACH_IMM_USE_STMT (use_stmt, iter,
-  gimple_assign_lhs (stmt))
+   gcc_assert (TREE_CODE (var) == SSA_NAME);
+
+   if (!POINTER_TYPE_P (TREE_TYPE (lhs))
+SSA_NAME_RANGE_INFO (lhs))
+ {
+   if (is_unreachable == -1)
+ {
+   is_unreachable = 0;
+   if (single_pred_p (bb))
+ {
+   basic_block pred_bb = single_pred (bb);
+   gimple last = last_stmt (pred_bb);
+   if (last  gimple_code (last) == GIMPLE_COND)
+ {
+   basic_block other_bb
+ = EDGE_SUCC (pred_bb, 0)-dest;
+   if (other_bb == bb)
+ other_bb = EDGE_SUCC (pred_bb, 1)-dest;
+   if (EDGE_COUNT (other_bb-succs) == 0)
+ {
+   gimple_stmt_iterator gsi
+ = gsi_after_labels (other_bb);
+   if (!gsi_end_p (gsi)
+gimple_call_builtin_p
+   (gsi_stmt (gsi),
+BUILT_IN_UNREACHABLE))
+ is_unreachable = 1;
+ }
+ }
+ }
+ }
+   /* Handle
+  if (x_7 = 10  x_7  20)
+__builtin_unreachable ();
+  x_8 = ASSERT_EXPR x_7, ...;
+  if the only uses of x_7 are in the ASSERT_EXPR and
+  in the condition.  In that case, we can copy the
+  range info from x_8 computed in this pass also
+  for x_7.  */
+   if (is_unreachable)
+ {
+   bool ok = true;
+   FOR_EACH_IMM_USE_FAST (use_p, iter, var)
+ if (USE_STMT (use_p) != stmt)
+ 

Re: [Patch] Refactor regex executors

2013-10-25 Thread Paolo Carlini

On 10/25/2013 06:30 AM, Tim Shen wrote:

This patch optimizes BFS executor by removing quantifier tracking, but
precisely control the assignment order of the variable recording the
optimal solution.
Ok. Are there measurable performance changes at this point? I'm asking 
because we should take the chance and add performance testcases when we 
improve the performance: later, unless somebody files a specific Bug 
report, we become lazy about those, I know that ;)


Paolo.



Re: Debug functions review

2013-10-25 Thread Paolo Carlini

Hi,

On 10/24/2013 09:22 PM, François Dumont wrote:

Ok to commit ?

Looks great to me.

Thanks!
Paolo.


Re: Aliasing: look through pointer's def stmt

2013-10-25 Thread Richard Biener
On Fri, Oct 25, 2013 at 10:59 AM, Richard Biener
richard.guent...@gmail.com wrote:
 On Fri, Oct 25, 2013 at 8:11 AM, Marc Glisse marc.gli...@inria.fr wrote:
 On Thu, 24 Oct 2013, Jeff Law wrote:

 On 10/24/13 23:23, Marc Glisse wrote:

 Hello,

 I noticed that in some cases we were failing to find aliasing
 information because we were only looking at an SSA_NAME variable,
 missing the fact that it was really an ADDR_EXPR. The attached patch
 passes bootstrap+testsuite, does it make sense? (I am a bit afraid of
 losing some type information for instance)

 I didn't investigate the 2 tests where I had to remove dg-bogus, because
 removing dg-bogus sounds like a bonus...

 2013-10-25  Marc Glisse  marc.gli...@inria.fr

 gcc/
  * tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Look for an
  ADDR_EXPR in the defining statement.

 Shouldn't the ADDR_EXPR have been propagated into the use?


 Maybe when the address is a constant, but here it comes from malloc.

 points-to should have propagated the alias info, so no, looking at
 def-stmts in random places like this isn't ok.  Where does alias info
 get lost?

It doesn't get lost, but this is the missed optimization that malloc
results still alias with global pointers (here a function argument).
Taking into account the offset shouldn't change anything.

I suppose you are looking for the memcpy to be folded into an
assignment?  Which ao_ref_from_ptr_and_size is critical for
the transform - it doesn't seem to be the one in memory op folding.

Richard.

 Thanks,
 Richard.

 --
 Marc Glisse


Re: [PATCH, PR 57748] Check for out of bounds access, Part 2

2013-10-25 Thread Richard Biener
On Fri, Oct 25, 2013 at 8:29 AM, Bernd Edlinger
bernd.edlin...@hotmail.de wrote:
 Hi Richard,


 Did you just propose:

 --- stor-layout.c.orig 2013-10-22 10:46:49.233261818 +0200
 +++ stor-layout.c 2013-10-24 14:54:00.425259062 +0200
 @@ -471,27 +471,7 @@
 static enum machine_mode
 mode_for_array (tree elem_type, tree size)
 {
 - tree elem_size;
 - unsigned HOST_WIDE_INT int_size, int_elem_size;
 - bool limit_p;
 -
 - /* One-element arrays get the component type's mode. */
 - elem_size = TYPE_SIZE (elem_type);
 - if (simple_cst_equal (size, elem_size))
 - return TYPE_MODE (elem_type);
 -
 - limit_p = true;
 - if (host_integerp (size, 1)  host_integerp (elem_size, 1))
 - {
 - int_size = tree_low_cst (size, 1);
 - int_elem_size = tree_low_cst (elem_size, 1);
 - if (int_elem_size 0
 -  int_size % int_elem_size == 0
 -  targetm.array_mode_supported_p (TYPE_MODE (elem_type),
 - int_size / int_elem_size))
 - limit_p = false;
 - }
 - return mode_for_size_tree (size, MODE_INT, limit_p);
 + return BLKmode;
 }

 ???

 Yes. Does it work?

 Richard.


 No.

 PASS: gcc.target/i386/pr14289-1.c  (test for errors, line 10)
 FAIL: gcc.target/i386/pr14289-1.c (test for excess errors)
 Excess errors:
 /home/ed/gnu/gcc-4.9-20131013/gcc/testsuite/gcc.target/i386/pr14289-1.c:5:14: 
 error: data type of 'a' isn't suitable for a register

 Does that look like an ABI-Change?

 the test case uses:
 register int a[2] asm(ebx);

Eh ... even

register struct { int i; int a[0]; } asm (ebx);

works.  Also with int a[1] but not with a[2].  So just handling trailing
arrays makes this case regress as well.

Now I'd call this use questionable ... but we've likely supported that
for decades and cannot change that now.

Back to fixing everything in expand.

Richard.

 Bernd.


[PATCH] Introduce [sg]et_nonzero_bits

2013-10-25 Thread Jakub Jelinek
Hi!

tree-ssa-ccp.c already computes which bits are known to be zero, but
we preserve that info only for pointers and not for integers.
This patch changes SSA_NAME_RANGE_INFO, so we preserve that info even for
integers.  The bitmask is also computed from range info.
There are no users of this besides ccp itself right now, but I'm working
on using that info e.g. in the vectorizer.

I had to tweak one testcase, because it relied on one of the two warnings
being emitted during VRP1 and one during VRP2, but because of the range info
derived nonzero_bits we now manage to optimize away the second test during
CCP3, which doesn't have -Wtype-limits warnings.  If it would be too
important to keep warning even there, I guess it could check get_range_info
when folding a condition to 0/1 and warn if the rhs isn't in the lhs' range.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

During the development of the patch I had there code to not only
use get_range_info if otherwise returning VARYING, but also
if it was already constant, but that failed bootstrap miserably:

It was something along the lines of:

++/* If get_nonzero_bits tells us some bits
++   have to be zero and they aren't, something
++   went wrong.  Only bits in value where val.mask
++   has zeroes (i.e. valid bit from CCP), mask has ones
++   (don't care about bits outside of precision) and
++   has zeroes in nonzero_bits (guaranteed to be zero)
++   should be verified to be zero.  */
++double_int valv = tree_to_double_int (val.value);
++gcc_assert ((valv  ~val.mask
++  ~nonzero_bits  mask).is_zero ());
++if (!(valv  ~nonzero_bits  mask).is_zero ())
++  val.value = double_int_to_tree (TREE_TYPE (lhs),
++  valv  nonzero_bits);
++if (nonzero_bits.is_zero ())
++  val.mask = double_int_zero;
++else
++  val.mask = val.mask  (nonzero_bits | ~mask);

The problem was that during the iteration when only a subset of the edges
were executable it sometimes failed the assertion, dunno if that is
the result of the nonzero bits info in this case coming from completely
different algorithm (from VRP ranges) and that simply until the CCP
iterations settle it can be wrong, so just to be safe in that case I'm
adding the info only in ccp_finalize (and the new nonzero_bits mask with
the old one).  From what I could see, the range info on that testcase looked
just fine, the nonzero_bits derived from it too, if desirable, I can try to
cook up a small testcase showing that (reduce from libgcov.c that exhibited
that).

2013-10-24  Jakub Jelinek  ja...@redhat.com

* gimple-pretty-print.c (dump_ssaname_info): Print newline also
in case of VR_VARYING.  Print get_nonzero_bits if not all ones.
* tree-ssanames.h (struct range_info_def): Add nonzero_bits field.
(set_nonzero_bits, get_nonzero_bits): New prototypes.
* tree-ssa-ccp.c (get_default_value): Use get_range_info to see if
a default def isn't partially constant.
(ccp_finalize): If after IPA, set_range_info if integral SSA_NAME
is known to be partially zero.
(evaluate_stmt): If we'd return otherwise VARYING, use get_range_info
to see if a default def isn't partially constant.
* tree-ssanames.c (set_range_info): Initialize nonzero_bits upon
creation of a range, if VR_RANGE, try to improve nonzero_bits from
the range.
(set_nonzero_bits, get_nonzero_bits): New functions.

* g++.dg/warn/pr33738.C (main): Initialize a2 again to make sure
we warn about it already during VRP1 pass.

--- gcc/gimple-pretty-print.c.jj2013-10-23 14:43:12.0 +0200
+++ gcc/gimple-pretty-print.c   2013-10-24 17:26:59.650945232 +0200
@@ -1731,7 +1731,7 @@ dump_ssaname_info (pretty_printer *buffe
   if (!POINTER_TYPE_P (TREE_TYPE (node))
SSA_NAME_RANGE_INFO (node))
 {
-  double_int min, max;
+  double_int min, max, nonzero_bits;
   value_range_type range_type = get_range_info (node, min, max);
 
   if (range_type == VR_VARYING)
@@ -1744,8 +1744,20 @@ dump_ssaname_info (pretty_printer *buffe
  pp_printf (buffer, , );
  pp_double_int (buffer, max, TYPE_UNSIGNED (TREE_TYPE (node)));
  pp_printf (buffer, ]);
- newline_and_indent (buffer, spc);
}
+  nonzero_bits = get_nonzero_bits (node);
+  if (nonzero_bits != double_int_minus_one
+  (nonzero_bits
+ != double_int::mask (TYPE_PRECISION (TREE_TYPE (node)
+   {
+ pp_string (buffer,  NONZERO );
+ sprintf (pp_buffer (buffer)-digit_buffer,
+  HOST_WIDE_INT_PRINT_DOUBLE_HEX,
+  (unsigned HOST_WIDE_INT) nonzero_bits.high,
+  nonzero_bits.low);
+ 

Re: [PATCH] Fix C++0x memory model for -fno-strict-volatile-bitfields on ARM

2013-10-25 Thread Richard Biener
On Fri, Oct 25, 2013 at 10:40 AM, Bernd Edlinger
bernd.edlin...@hotmail.de wrote:
 Hello,

 this patch fixes the recently discovered data store race on arm-eabi-gcc with 
 -fno-strict-volatile-bitfields
 for structures like this:

 #define test_type unsigned short

 typedef struct s{
  unsigned char Prefix[1];
  test_type Type;
 }__attribute((__packed__,__aligned__(4))) ss;

 volatile ss v;

 void __attribute__((noinline))
 foo (test_type u)
 {
   v.Type = u;
 }

 test_type __attribute__((noinline))
 bar (void)
 {
   return v.Type;
 }


 I've manually confirmed the correct code generation using variations of the
 example above on an ARM cross-compiler for -fno-strict-volatile-bitfields.

 Note, that this example is still causes ICE's for -fstrict-volatile-bitfields,
 but I'd like to fix that separately.

 Boot-strapped and regression-tested on x86_64-linux-gnu.

 Ok for trunk?

Isn't it more appropriate to fix it here:

  if (TREE_CODE (to) == COMPONENT_REF
   DECL_BIT_FIELD_TYPE (TREE_OPERAND (to, 1)))
get_bit_range (bitregion_start, bitregion_end, to, bitpos, offset);

?

Btw, the C++ standard doesn't cover packed or aligned attributes so
we could declare this a non-issue.  Any opinion on that?

Thanks,
Richard.

 Thanks
 Bernd.


Re: Aliasing: look through pointer's def stmt

2013-10-25 Thread Marc Glisse

On Fri, 25 Oct 2013, Richard Biener wrote:


On Fri, Oct 25, 2013 at 10:59 AM, Richard Biener
richard.guent...@gmail.com wrote:

On Fri, Oct 25, 2013 at 8:11 AM, Marc Glisse marc.gli...@inria.fr wrote:

On Thu, 24 Oct 2013, Jeff Law wrote:


On 10/24/13 23:23, Marc Glisse wrote:


Hello,

I noticed that in some cases we were failing to find aliasing
information because we were only looking at an SSA_NAME variable,
missing the fact that it was really an ADDR_EXPR. The attached patch
passes bootstrap+testsuite, does it make sense? (I am a bit afraid of
losing some type information for instance)

I didn't investigate the 2 tests where I had to remove dg-bogus, because
removing dg-bogus sounds like a bonus...

2013-10-25  Marc Glisse  marc.gli...@inria.fr

gcc/
 * tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Look for an
 ADDR_EXPR in the defining statement.


Shouldn't the ADDR_EXPR have been propagated into the use?



Maybe when the address is a constant, but here it comes from malloc.


points-to should have propagated the alias info, so no, looking at
def-stmts in random places like this isn't ok.  Where does alias info
get lost?


By points-to, do you mean in pt_solution? That's very rough information,
and in particular here it can't tell me that 2 fields of the same struct
don't alias, can it?


It doesn't get lost, but this is the missed optimization that malloc
results still alias with global pointers (here a function argument).
Taking into account the offset shouldn't change anything.


I think this is a different issue (but if you say improving malloc would 
help, maybe).



I suppose you are looking for the memcpy to be folded into an
assignment?  Which ao_ref_from_ptr_and_size is critical for
the transform - it doesn't seem to be the one in memory op folding.


No, I only used memcpy as an example, my goal is for the compiler to 
notice that the call (memcpy here, possibly other functions with some new 
attribute later) doesn't clobber the counter (i) and thus the counter 
value is the same after the call as it was before.


If memcpy gets turned to an assignment (which alignment should prevent), 
my testcase becomes useless.


--
Marc Glisse


Re: Aliasing: look through pointer's def stmt

2013-10-25 Thread Richard Biener
On Fri, Oct 25, 2013 at 11:29 AM, Marc Glisse marc.gli...@inria.fr wrote:
 On Fri, 25 Oct 2013, Richard Biener wrote:

 On Fri, Oct 25, 2013 at 10:59 AM, Richard Biener
 richard.guent...@gmail.com wrote:

 On Fri, Oct 25, 2013 at 8:11 AM, Marc Glisse marc.gli...@inria.fr
 wrote:

 On Thu, 24 Oct 2013, Jeff Law wrote:

 On 10/24/13 23:23, Marc Glisse wrote:


 Hello,

 I noticed that in some cases we were failing to find aliasing
 information because we were only looking at an SSA_NAME variable,
 missing the fact that it was really an ADDR_EXPR. The attached patch
 passes bootstrap+testsuite, does it make sense? (I am a bit afraid of
 losing some type information for instance)

 I didn't investigate the 2 tests where I had to remove dg-bogus,
 because
 removing dg-bogus sounds like a bonus...

 2013-10-25  Marc Glisse  marc.gli...@inria.fr

 gcc/
  * tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Look for an
  ADDR_EXPR in the defining statement.


 Shouldn't the ADDR_EXPR have been propagated into the use?



 Maybe when the address is a constant, but here it comes from malloc.


 points-to should have propagated the alias info, so no, looking at
 def-stmts in random places like this isn't ok.  Where does alias info
 get lost?


 By points-to, do you mean in pt_solution? That's very rough information,
 and in particular here it can't tell me that 2 fields of the same struct
 don't alias, can it?

No, it cannot.

 It doesn't get lost, but this is the missed optimization that malloc
 results still alias with global pointers (here a function argument).
 Taking into account the offset shouldn't change anything.


 I think this is a different issue (but if you say improving malloc would
 help, maybe).


 I suppose you are looking for the memcpy to be folded into an
 assignment?  Which ao_ref_from_ptr_and_size is critical for
 the transform - it doesn't seem to be the one in memory op folding.


 No, I only used memcpy as an example, my goal is for the compiler to notice
 that the call (memcpy here, possibly other functions with some new attribute
 later) doesn't clobber the counter (i) and thus the counter value is the
 same after the call as it was before.

Ah, so you are looking at call_may_clobber_ref_p_1 and its pointer handling
when special-casing builtins?  Note that fields can only be disambiguated
if the size of the access is known (not sure what fancy attribute you
are going to invent here ...).  Generally the simple alias machinery
is written to be cheap, walking use-def chains isn't.  Users do not have
to expect use-def chains to be walked (we can change that rule of course,
but for example the RTL alias machinery also shares the core of the
gimple alias machinery.

But I can see that in this particular case the patch makes sense.  Still,

+  if (TREE_CODE (ptr) == SSA_NAME)
+{
+  gimple stmt = SSA_NAME_DEF_STMT (ptr);
+  if (gimple_assign_single_p (stmt)
+  !gimple_has_volatile_ops (stmt)
+  gimple_assign_rhs_code (stmt) == ADDR_EXPR)
+   ptr = gimple_assign_rhs1 (stmt);
+}

no need to look at gimple_has_volatile_ops (stmt).  Also you want to
handle

 p_2 = p_1 + CST;
 foo (p_2);

which has a related canonical form,

 p_2 = MEM[p_1, CST];

The patch is ok with the volatile check removed, you can followup
with handling POINTER_PLUS_EXPR if you like.

Thanks,
Richard.

 If memcpy gets turned to an assignment (which alignment should prevent), my
 testcase becomes useless.

 --
 Marc Glisse


Re: [PATCH, PR 57748] Check for out of bounds access

2013-10-25 Thread Eric Botcazou
 Doesn't that also apply to arithmetic on symbols when the offset is NULL?

This can presumably be retrofitted when the offset is null or constant.

 But yes, the codegen example posted shows this kind of difference
 (though it doesn't seem to save anything for that case).  I'd have expected
 a more explicit guarding of this case, like with MEM_P (to_rtx)
  SYMBOL_REF_P (XEXP (to_rtx, 0)) though, eventually even looking
 whether the resulting address is legitimate.  But ... doesn't forwprop
 fix this up later anyway?

Not clear IMO, it would need to reassociate.

 The comment before it is also odd and likely only applies to later
 added restrictions.  Just to quote it again:
 
if (offset != 0)
  {
 ...
   /* A constant address in TO_RTX can have VOIDmode, we must not try
 to call force_reg for that case.  Avoid that case.  */ if (MEM_P (to_rtx)
GET_MODE (to_rtx) == BLKmode
GET_MODE (XEXP (to_rtx, 0)) != VOIDmode
bitsize  0
(bitpos % bitsize) == 0
(bitsize % GET_MODE_ALIGNMENT (mode1)) == 0
MEM_ALIGN (to_rtx) == GET_MODE_ALIGNMENT (mode1))
 {
   to_rtx = adjust_address (to_rtx, mode1, bitpos /
 BITS_PER_UNIT); bitpos = 0;
 }
 
   to_rtx = offset_address (to_rtx, offset_rtx,
highest_pow2_factor_for_target (to,
offset));
 }

Yes, the comment is out-of-sync and not very informative.

-- 
Eric Botcazou


RE: [PATCH, PR 57748] Check for out of bounds access, Part 2

2013-10-25 Thread Bernd Edlinger
Hi,

 Eh ... even

 register struct { int i; int a[0]; } asm (ebx);

 works. Also with int a[1] but not with a[2]. So just handling trailing
 arrays makes this case regress as well.

 Now I'd call this use questionable ... but we've likely supported that
 for decades and cannot change that now.

 Back to fixing everything in expand.

 Richard.


Ok, finally you asked for it.

Here is my previous version of that patch again.

I have now added a new value EXPAND_REFERENCE to the expand_modifier
enumeration. It is almost like EXPAND_MEMORY but it does not interfere with
constant values.

I have done the same modification to VIEW_CONVERT_EXPR too, because
this is a possible inner reference, itself. It is however inherently hard to
test around this code.

To understand this patch it is good to know what type of object the
return value tem of get_inner_reference can be.

From the program logic at get_inner_reference it is clear that the
return value may *not* be BIT_FIELD_REF, COMPONENT_REF, ARRAY_REF,
ARRAY_RANGE_REF, REALPART_EXPR, IMAGPART_EXPR. The result may
be VIEW_CONVERT_EXPR only on a STRICT_ALIGNMENT target. This is probably
further restricted because exp is gimplified.

Usually the result will be a MEM_REF or a SSA_NAME of the memory where
the structure is to be found.

When you look at where EXPAND_MEMORY is handled you see it is special-cased
in TARGET_MEM_REF, MEM_REF, ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF,
ARRAY_RANGE_REF.

At TARGET_MEM_REF, MEM_REF, VIEW_CONVERT_EXPR, it should be the
same if EXPAND_MEMORY, EXPAND_WRITE or EXPAND_REFERENCE is given:
If it is an unaligned memory, we just return the unaligned reference.

This was missing for VIEW_CONVERT_EXPR, and unfortunately I have no test case,
because it is only a problem for STRICT_ALIGNMENT targets, and even there it 
will
certainly be really hard to find test cases that exercise this code.

In ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF, ARRAY_RANGE_REF
we do not have to touch the handling of the outer modifier. However we pass
EXPAND_REFERENCE to the inner object, which should not be a recursive
use of any ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF, ARRAY_RANGE_REF.

TARGET_MEM_REF, MEM_REF and VIEW_CONVERT_EXPR know how to handle
EXPAND_REFERENCE, anything else handles it like EXPAND_NORMAL.


Boot-strapped and regression-tested on x86_64-linux-gnu
OK for trunk?

Thanks
Bernd.2013-10-25  Bernd Edlinger  bernd.edlin...@hotmail.de

PR middle-end/57748
* expr.h (expand_modifier): New enum value EXPAND_REFERENCE.
* expr.c (expand_expr_real_1): Use EXAND_REFERENCE on base object.

testsuite:
2013-10-25  Bernd Edlinger  bernd.edlin...@hotmail.de

PR middle-end/57748
* gcc.dg/torture/pr57748-3.c: New test.
* gcc.dg/torture/pr57748-4.c: New test.



patch-pr57748-2.diff
Description: Binary data


Re: PR C++/58708 - string literal operator templates broken

2013-10-25 Thread Paolo Carlini

Hi,

On 10/18/2013 04:42 AM, Ed Smith-Rowland wrote:

--- testsuite/g++.dg/cpp1y/pr58708.C(revision 0)
+++ testsuite/g++.dg/cpp1y/pr58708.C(working copy)
@@ -0,0 +1,70 @@
+// { dg-options -std=c++1y }
+
+#include array
+#include vector
+#include type_traits
+#include testsuite_hooks.h
are you sure you want these includes in a C++ front-end testcase? Even 
testsuite_hooks.h instead of cassert? What about something more 
minimal, not including large headers like vector, for the C++ 
front-end + a library testcase (first blush, the above would be 
perfectly fine)


Thanks,
Paolo.


Re: C++ PATCH to resolve LWG issue 1265

2013-10-25 Thread Paolo Carlini

Hi,

On 10/23/2013 09:16 PM, Jason Merrill wrote:
The second hunk adds %X for printing an exception-specification in 
diagnostics.


In my experience these convenience %? that we have got easily trigger 
warnings during bootstrap/build, eg:


/scratch/Gcc/svn-dirs/trunk/gcc/cp/error.c:3302:53: warning: unknown 
conversion type character ‘R’ in format [-Wformat]
/scratch/Gcc/svn-dirs/trunk/gcc/cp/error.c:3302:53: warning: too many 
arguments for format [-Wformat-extra-args]


I suppose we could do something about that...

Paolo.


Re: Aliasing: look through pointer's def stmt

2013-10-25 Thread Marc Glisse

On Fri, 25 Oct 2013, Richard Biener wrote:


Ah, so you are looking at call_may_clobber_ref_p_1 and its pointer handling
when special-casing builtins?


Yes.


Note that fields can only be disambiguated
if the size of the access is known


TBAA could also help sometimes.


(not sure what fancy attribute you are going to invent here ...).


That will certainly require quite a bit of discussion...


Generally the simple alias machinery is written to be cheap,


I wouldn't mind an expensive version ;-)


walking use-def chains isn't.


Peeking at the defining statement shouldn't be very costly, as long as you 
don't do it recursively.



no need to look at gimple_has_volatile_ops (stmt).


Ok.


Also you want to handle

p_2 = p_1 + CST;
foo (p_2);

which has a related canonical form,

p_2 = MEM[p_1, CST];


This testcase seems relevant, I'll see if I can handle it:

void f (const char *c, int *i)
{
  *i = 42;
  __builtin_memcpy (i + 1, c, sizeof (int));
  if (*i != 42) __builtin_abort();
}


The patch is ok with the volatile check removed, you can followup
with handling POINTER_PLUS_EXPR if you like.


Thanks.

--
Marc Glisse


Re: C++ PATCH to resolve LWG issue 1265

2013-10-25 Thread Jakub Jelinek
On Fri, Oct 25, 2013 at 12:39:58PM +0200, Paolo Carlini wrote:
 On 10/23/2013 09:16 PM, Jason Merrill wrote:
 The second hunk adds %X for printing an exception-specification in
 diagnostics.
 
 In my experience these convenience %? that we have got easily
 trigger warnings during bootstrap/build, eg:
 
 /scratch/Gcc/svn-dirs/trunk/gcc/cp/error.c:3302:53: warning: unknown
 conversion type character ‘R’ in format [-Wformat]
 /scratch/Gcc/svn-dirs/trunk/gcc/cp/error.c:3302:53: warning: too
 many arguments for format [-Wformat-extra-args]
 
 I suppose we could do something about that...

That is because if you are starting bootstrap with an older compiler
that doesn't grok these, you get warnings during stage1, but those
should be ignored.  When you start bootstrapping with gcc 4.9
or in stage2 and later, they won't show up.

That said, we probably need to teach new version of gettext about those.

Jakub


Re: [PATCH, PR 57748] Check for out of bounds access, Part 2

2013-10-25 Thread Richard Biener
On Fri, Oct 25, 2013 at 12:02 PM, Bernd Edlinger
bernd.edlin...@hotmail.de wrote:
 Hi,

 Eh ... even

 register struct { int i; int a[0]; } asm (ebx);

 works. Also with int a[1] but not with a[2]. So just handling trailing
 arrays makes this case regress as well.

 Now I'd call this use questionable ... but we've likely supported that
 for decades and cannot change that now.

 Back to fixing everything in expand.

 Richard.


 Ok, finally you asked for it.

 Here is my previous version of that patch again.

 I have now added a new value EXPAND_REFERENCE to the expand_modifier
 enumeration. It is almost like EXPAND_MEMORY but it does not interfere with
 constant values.

 I have done the same modification to VIEW_CONVERT_EXPR too, because
 this is a possible inner reference, itself. It is however inherently hard to
 test around this code.

 To understand this patch it is good to know what type of object the
 return value tem of get_inner_reference can be.

 From the program logic at get_inner_reference it is clear that the
 return value may *not* be BIT_FIELD_REF, COMPONENT_REF, ARRAY_REF,
 ARRAY_RANGE_REF, REALPART_EXPR, IMAGPART_EXPR. The result may
 be VIEW_CONVERT_EXPR only on a STRICT_ALIGNMENT target. This is probably
 further restricted because exp is gimplified.

 Usually the result will be a MEM_REF or a SSA_NAME of the memory where
 the structure is to be found.

 When you look at where EXPAND_MEMORY is handled you see it is special-cased
 in TARGET_MEM_REF, MEM_REF, ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF,
 ARRAY_RANGE_REF.

 At TARGET_MEM_REF, MEM_REF, VIEW_CONVERT_EXPR, it should be the
 same if EXPAND_MEMORY, EXPAND_WRITE or EXPAND_REFERENCE is given:
 If it is an unaligned memory, we just return the unaligned reference.

 This was missing for VIEW_CONVERT_EXPR, and unfortunately I have no test case,
 because it is only a problem for STRICT_ALIGNMENT targets, and even there it 
 will
 certainly be really hard to find test cases that exercise this code.

 In ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF, ARRAY_RANGE_REF
 we do not have to touch the handling of the outer modifier. However we pass
 EXPAND_REFERENCE to the inner object, which should not be a recursive
 use of any ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF, ARRAY_RANGE_REF.

 TARGET_MEM_REF, MEM_REF and VIEW_CONVERT_EXPR know how to handle
 EXPAND_REFERENCE, anything else handles it like EXPAND_NORMAL.


 Boot-strapped and regression-tested on x86_64-linux-gnu
 OK for trunk?

You point to a weak spot in expansion - that it handles creating
the base MEM to offset with handled components by recursing
into the case that handles bare MEM_REFs.  This makes the
bare MEM_REF handling code somewhat awkward (it's the
one to assign mem-attrs which are later adjusted for example).

Maybe a better appropach than adding yet another expand
modifier would be to split out the base MEM expansion part
out of the bare MEM_REF handling code so we can call that
instead of recursing.

In this light - instead of a new expand modifier don't you want
an actual flag that specifies we are coming from a call that
wants to expand a base?  That is, allow EXPAND_SUM
but with the recursion flag set?

Finally I think the recursion into the VIEW_CONVERT_EXPR case
is only there because of the keep_aligning flag of get_inner_reference
which should be obsolete now that we properly handle its effects
in get_object_alignment.  So you wouldn't need to adjust this path
if we finally can get rid of that.

What do others think?

Thanks,
Richard.

 Thanks
 Bernd.


[C++ testcase, committed] PR 54812

2013-10-25 Thread Paolo Carlini

Hi,

I had a look to the testcases Jason added / tweaked in his recent patch 
about trivial but non-callable [cd]tors: I think a simple testcase 
directly from 54812 cannot hurt.


Thanks,
Paolo.


2013-10-25  Paolo Carlini  paolo.carl...@oracle.com

PR c++/54812
* g++.dg/cpp0x/defaulted47.C: New.
Index: g++.dg/cpp0x/defaulted47.C
===
--- g++.dg/cpp0x/defaulted47.C  (revision 0)
+++ g++.dg/cpp0x/defaulted47.C  (working copy)
@@ -0,0 +1,16 @@
+// PR c++/54812
+// { dg-do compile { target c++11 } }
+
+class A
+{
+  A() = default;   // { dg-error private }
+};
+
+A a;   // { dg-error context }
+
+class B
+{
+  ~B() = default;  // { dg-error private }
+};
+
+B b;   // { dg-error context }


[PATCH v4, nds32] Andes nds32 port, machine description 3/3

2013-10-25 Thread Chung-Ju Wu
Hi, all,

This is v4 patch for Andes nds32 port on machine description part 3,
which includes other rtl patterns and settings.

gcc/
2013-10-25  Chung-Ju Wu  jasonw...@gmail.com
Shiva Chen  shiva0...@gmail.com

* config/nds32/constants.md: New file.
* config/nds32/constraints.md: New file.
* config/nds32/iterators.md: New file.
* config/nds32/nds32-doubleword.md: New file.
* config/nds32/nds32-intrinsic.md: New file.
* config/nds32/nds32_intrinsic.h: New file.
* config/nds32/nds32-modes.def: New file.
* config/nds32/nds32-multiple.md: New file.
* config/nds32/nds32.opt: New file.
* config/nds32/nds32-opts.h: New file.
* config/nds32/nds32-protos.h: New file.
* config/nds32/nds32-peephole2.md: New file.
* config/nds32/pipelines.md: New file.
* config/nds32/predicates.md: New file.
* config/nds32/t-mlibs: New file.


Best regards,
jasonwucj


RFA: Andes nds32 port v4 patch

2013-10-25 Thread Chung-Ju Wu
Hi, GCC global reviewers,

I would like to thank Joseph Myers's preliminary review and
Richard Sandiford's help for further technical review in these
three months:
  http://gcc.gnu.org/ml/gcc-patches/2013-07/msg01138.html
  http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01180.html

Their comments clearly enhance nds32 port machine description design.
Now we have v4 patch for nds32 port in which all the issues have
been addressed:
  http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02153.html
  http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02154.html
  http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02156.html

Any advice or comment to improve the patch is much appreciated.
If there is no further comment, is this nds32 v4 patch OK for the trunk? :)


Best regards,
jasonwucj


Re: Aliasing: look through pointer's def stmt

2013-10-25 Thread Tobias Burnus

Marc Glisse wrote:

I noticed that in some cases we were failing to find aliasing
information because we were only looking at an SSA_NAME variable,
missing the fact that it was really an ADDR_EXPR. The attached patch
passes bootstrap+testsuite, does it make sense? (I am a bit afraid of
losing some type information for instance)

I didn't investigate the 2 tests where I had to remove dg-bogus, because
removing dg-bogus sounds like a bonus...


I wonder why you are seeing failures with those test cases. I tired your 
patch (w/o modifying the test cases) and they passed.


The idea of those test cases is to ensure that there is no output like
note: loop versioned for vectorization because of possible aliasing
Because that output would be a hint that #pragma GCC ivdep doesn't work.

What kind of output do you see? Seemingly not the one above as you kept 
the version dg-bogus.


Tobias


RE: [PATCH] Fix C++0x memory model for -fno-strict-volatile-bitfields on ARM

2013-10-25 Thread Bernd Edlinger
Hi,

On Fri, 25 Oct 2013 11:26:20, Richard Biener wrote:

 On Fri, Oct 25, 2013 at 10:40 AM, Bernd Edlinger
 bernd.edlin...@hotmail.de wrote:
 Hello,

 this patch fixes the recently discovered data store race on arm-eabi-gcc 
 with -fno-strict-volatile-bitfields
 for structures like this:

 #define test_type unsigned short

 typedef struct s{
 unsigned char Prefix[1];
 test_type Type;
 }__attribute((__packed__,__aligned__(4))) ss;

 volatile ss v;

 void __attribute__((noinline))
 foo (test_type u)
 {
 v.Type = u;
 }

 test_type __attribute__((noinline))
 bar (void)
 {
 return v.Type;
 }


 I've manually confirmed the correct code generation using variations of the
 example above on an ARM cross-compiler for -fno-strict-volatile-bitfields.

 Note, that this example is still causes ICE's for 
 -fstrict-volatile-bitfields,
 but I'd like to fix that separately.

 Boot-strapped and regression-tested on x86_64-linux-gnu.

 Ok for trunk?

 Isn't it more appropriate to fix it here:

 if (TREE_CODE (to) == COMPONENT_REF
  DECL_BIT_FIELD_TYPE (TREE_OPERAND (to, 1)))
 get_bit_range (bitregion_start, bitregion_end, to, bitpos, offset);

 ?


Honestly, I'd call this is a work-around, not a design.

Therefore I would not move that workaround to expr.c.

Also the block below is only a work-around IMHO.

  if (MEM_P (str_rtx)  bitregion_start 0)
    {
  enum machine_mode bestmode;
  HOST_WIDE_INT offset, size;

  gcc_assert ((bitregion_start % BITS_PER_UNIT) == 0);

  offset = bitregion_start / BITS_PER_UNIT;
  bitnum -= bitregion_start;
  size = (bitnum + bitsize + BITS_PER_UNIT - 1) / BITS_PER_UNIT;
  bitregion_end -= bitregion_start;
  bitregion_start = 0;
  bestmode = get_best_mode (bitsize, bitnum,
    bitregion_start, bitregion_end,
    MEM_ALIGN (str_rtx), VOIDmode,
    MEM_VOLATILE_P (str_rtx));
  str_rtx = adjust_bitfield_address_size (str_rtx, bestmode, offset, size);
    }

Here, if bitregion_start = 8, we have a 4 byte aligned memory context,
and whoops, now it is only 1 byte aligned.

this example:

struct s
{
  char a;
  int b:24;
};

struct s ss;

void foo(int b)
{
  ss.b = b;
}


gets compiled (at -O3) to:

foo:
    @ Function supports interworking.
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    @ link register save eliminated.
    ldr    r3, .L2
    mov    r1, r0, lsr #8
    mov    r2, r0, lsr #16
    strb    r1, [r3, #2]
    strb    r0, [r3, #1]
    strb    r2, [r3, #3]
    bx    lr

while...

struct s
{
  char a;
  int b:24;
};

struct s ss;

void foo(int b)
{
  ss.b = b;
}


gets compiled (at -O3) to

foo:
    @ Function supports interworking.
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    @ link register save eliminated.
    ldr    r3, .L2
    mov    r2, r0, lsr #16
    strb    r2, [r3, #2]
    strh    r0, [r3]    @ movhi
    bx    lr

which is more efficient, but only because the memory context is still
aligned in this case.

 Btw, the C++ standard doesn't cover packed or aligned attributes so
 we could declare this a non-issue. Any opinion on that?

 Thanks,
 Richard.

 Thanks
 Bernd. 

Re: PR C++/58708 - string literal operator templates broken

2013-10-25 Thread Ed Smith-Rowland

On 10/25/2013 06:08 AM, Paolo Carlini wrote:

Hi,

On 10/18/2013 04:42 AM, Ed Smith-Rowland wrote:

--- testsuite/g++.dg/cpp1y/pr58708.C (revision 0)
+++ testsuite/g++.dg/cpp1y/pr58708.C(working copy)
@@ -0,0 +1,70 @@
+// { dg-options -std=c++1y }
+
+#include array
+#include vector
+#include type_traits
+#include testsuite_hooks.h
are you sure you want these includes in a C++ front-end testcase? Even 
testsuite_hooks.h instead of cassert? What about something more 
minimal, not including large headers like vector, for the C++ 
front-end + a library testcase (first blush, the above would be 
perfectly fine)


Thanks,
Paolo.


You're right.  I'll send something slimmer later tonight.
The rest of the tests in the cpp1y directory use __builtin_abort.
So, you also want a library testcase?



Re: PR C++/58708 - string literal operator templates broken

2013-10-25 Thread Paolo Carlini

On 10/25/2013 01:38 PM, Ed Smith-Rowland wrote:

So, you also want a library testcase?
Up to you: if you feel it would test something that the c++ front-end 
testcase doesn't, why not.


Paolo.


[c++-concepts] Requires expr in non-templte

2013-10-25 Thread Andrew Sutton
This patch prevents the use of requires expressions in non-template
scopes. This restriction was relaxed in the most recent version of
concepts lite, but the implementation requires some thought. For now,
I am marking it an error to make it consistent with previous versions
of the spec.

2013-10-25  Andrew Sutton  andrew.n.sut...@gmail.com
* gcc/cp/parsre.c (cp_parser_requires_expression): Gracefully
fail when parsing a requires expr outside a template.

Andrew Sutton


template-requires.patch
Description: Binary data


Re: Fix scheduler ix86_issue_rate and ix86_adjust_cost for modern x86 chips

2013-10-25 Thread Jan Hubicka
  OK, so it is about 2%.  Did you try if you need lookahead even in the early 
  pass (before reload)?  My guess would be so, but if not, it could cut the 
  cost to half.  For -Ofast/-O3 it looks resonable to me, but we will  need 
  to announce it on the ML.  For other settings I think we need to work on 
  more improvements or cut the expenses.
 
 Yes, it is required before reload.  
 
 I have another idea which can be pondered upon. Currently, can we enable 
 lookahead with the value 4 (pre reload) for default? This will exponentially 
 cut the cost of build time. 
 I have done some measurements on the build time of some benchmarks (mentioned 
 below) with lookahead value 4. The 2% increase in build time with value 8 is 
 now almost gone.
 
dfa4   no_lookahead
  
  perlbench   - 191s  193s
  bzip2   - 19s   19s
  gcc - 429s  429s
  mcf - 3s3s
  gobmk   - 116s  115s
  hmmer   - 60s   60s
  sjeng   - 18s   17s
  libquantum  - 6s6s
  h264ref - 107s  107s
  omnetpp - 128s  128s
  astar   - 7s7s
  bwaves  - 5s5s
  gamess  - 1964s 1957s
  milc- 18s   18s
  GemsFDTD- 273s  272s
 
 Lookahead value 4 also helps because, the modified decoder model in bdver3.md 
 is only two cycles deep (though in hardware it is actually 4 cycles deep). 
 This means that we can look another two levels deep for better schedule.
 GemsFDTD still retains the performance boost of around 6-7% with value 4.
 
 Let me know your thoughts.

This seems resonable.  I would go for lookahead of 4 for now and 8 for -Ofast
and we can tune things based on the experience with this setting incrementally.
Uros, Richard, what do you think?

Honza


Re: Fix scheduler ix86_issue_rate and ix86_adjust_cost for modern x86 chips

2013-10-25 Thread Richard Biener
On Fri, 25 Oct 2013, Jan Hubicka wrote:

   OK, so it is about 2%.  Did you try if you need lookahead even in the 
   early pass (before reload)?  My guess would be so, but if not, it could 
   cut the cost to half.  For -Ofast/-O3 it looks resonable to me, but we 
   will  need to announce it on the ML.  For other settings I think we need 
   to work on more improvements or cut the expenses.
  
  Yes, it is required before reload.  
  
  I have another idea which can be pondered upon. Currently, can we enable 
  lookahead with the value 4 (pre reload) for default? This will 
  exponentially cut the cost of build time. 
  I have done some measurements on the build time of some benchmarks 
  (mentioned below) with lookahead value 4. The 2% increase in build time 
  with value 8 is now almost gone.
  
 dfa4   no_lookahead
   
   perlbench   - 191s  193s
   bzip2   - 19s   19s
   gcc - 429s  429s
   mcf - 3s3s
   gobmk   - 116s  115s
   hmmer   - 60s   60s
   sjeng   - 18s   17s
   libquantum  - 6s6s
   h264ref - 107s  107s
   omnetpp - 128s  128s
   astar   - 7s7s
   bwaves  - 5s5s
   gamess  - 1964s 1957s
   milc- 18s   18s
   GemsFDTD- 273s  272s
  
  Lookahead value 4 also helps because, the modified decoder model in 
  bdver3.md is only two cycles deep (though in hardware it is actually 4 
  cycles deep). This means that we can look another two levels deep for 
  better schedule.
  GemsFDTD still retains the performance boost of around 6-7% with value 4.
  
  Let me know your thoughts.
 
 This seems resonable.  I would go for lookahead of 4 for now and 8 for -Ofast
 and we can tune things based on the experience with this setting 
 incrementally.
 Uros, Richard, what do you think?

Well, certainly -O3 not -Ofast.

Richard.


Re: Aliasing: look through pointer's def stmt

2013-10-25 Thread Marc Glisse

On Fri, 25 Oct 2013, Tobias Burnus wrote:


Marc Glisse wrote:

I noticed that in some cases we were failing to find aliasing
information because we were only looking at an SSA_NAME variable,
missing the fact that it was really an ADDR_EXPR. The attached patch
passes bootstrap+testsuite, does it make sense? (I am a bit afraid of
losing some type information for instance)

I didn't investigate the 2 tests where I had to remove dg-bogus, because
removing dg-bogus sounds like a bonus...


I wonder why you are seeing failures with those test cases. I tired your 
patch (w/o modifying the test cases) and they passed.


The idea of those test cases is to ensure that there is no output like
note: loop versioned for vectorization because of possible aliasing
Because that output would be a hint that #pragma GCC ivdep doesn't work.

What kind of output do you see? Seemingly not the one above as you kept the 
version dg-bogus.


beats self on the head
My source directory includes the word alias in it. So I'll leave those 2 
dg-bogus alone, but please tighten the regexp a bit, include at least a 
space or something...


--
Marc Glisse


Re: question about register pairs

2013-10-25 Thread Joern Rennecke
On 25 October 2013 05:15, DJ Delorie d...@redhat.com wrote:

 Yup, my registers are smaller than Pmode.

 This is what I ended up with...

 Some notes: I lie to gcc and tell it that $fp (reg 22) is two bytes
 when it's really one.

Well, it's not really a lie if you map hardware registers 22 and 23 to
a single register for the purposes of gcc internals.  Although it
does make some other things more awkward, e.g. when you
copy fp, and this gets split so you have an insn that copies the
highpart of fp to another register.

 Index: reload.c
 ===
 --- reload.c(revision 203733)
 +++ reload.c(working copy)
 @@ -723,13 +723,15 @@ find_valid_class_1 (enum machine_mode ou
unsigned int best_size = 0;
int cost;

for (rclass = 1; rclass  N_REG_CLASSES; rclass++)
  {
int bad = 0;
 -  for (regno = 0; regno  FIRST_PSEUDO_REGISTER  !bad; regno++)
 +  for (regno = 0;
 +  regno  FIRST_PSEUDO_REGISTER  !bad;
 +  regno += HARD_REGNO_NREGS (regno, mode))
 {
   if (in_hard_reg_set_p (reg_class_contents[rclass], mode, regno)
!HARD_REGNO_MODE_OK (regno, mode))
 bad = 1;
 }

Seeing the patched code in its entirety like this, I notice that we
would use HARD_REGNO_NREGS
for a regno that's not ok for the mode.
That can be avoided if we put a break into the if.  And then the !bad
term in the loop condition
becomes redundant.
Although the HARD_REGNO_NREGS definition in tm.texi says that HARD_REGNO_NREGS
must never return zero.  That wouldn't be technically violated with an
assert, but I suppose the
spirit of the text is that the macro should return always a
non-positive value, so I suppose this
use of HARD_REGNO_MODE_OK is not actually a problem.


Re: [PATCH, PR 57748] Check for out of bounds access, Part 2

2013-10-25 Thread Martin Jambor
Hi,

On Fri, Oct 25, 2013 at 12:51:13PM +0200, Richard Biener wrote:
 On Fri, Oct 25, 2013 at 12:02 PM, Bernd Edlinger
 bernd.edlin...@hotmail.de wrote:
  Hi,
 
  Eh ... even
 
  register struct { int i; int a[0]; } asm (ebx);
 
  works. Also with int a[1] but not with a[2]. So just handling trailing
  arrays makes this case regress as well.
 
  Now I'd call this use questionable ... but we've likely supported that
  for decades and cannot change that now.
 
  Back to fixing everything in expand.
 
  Richard.
 
 
  Ok, finally you asked for it.
 
  Here is my previous version of that patch again.
 
  I have now added a new value EXPAND_REFERENCE to the expand_modifier
  enumeration. It is almost like EXPAND_MEMORY but it does not interfere with
  constant values.
 
  I have done the same modification to VIEW_CONVERT_EXPR too, because
  this is a possible inner reference, itself. It is however inherently hard to
  test around this code.
 
  To understand this patch it is good to know what type of object the
  return value tem of get_inner_reference can be.
 
  From the program logic at get_inner_reference it is clear that the
  return value may *not* be BIT_FIELD_REF, COMPONENT_REF, ARRAY_REF,
  ARRAY_RANGE_REF, REALPART_EXPR, IMAGPART_EXPR. The result may
  be VIEW_CONVERT_EXPR only on a STRICT_ALIGNMENT target. This is probably
  further restricted because exp is gimplified.
 
  Usually the result will be a MEM_REF or a SSA_NAME of the memory where
  the structure is to be found.
 
  When you look at where EXPAND_MEMORY is handled you see it is special-cased
  in TARGET_MEM_REF, MEM_REF, ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF,
  ARRAY_RANGE_REF.
 
  At TARGET_MEM_REF, MEM_REF, VIEW_CONVERT_EXPR, it should be the
  same if EXPAND_MEMORY, EXPAND_WRITE or EXPAND_REFERENCE is given:
  If it is an unaligned memory, we just return the unaligned reference.
 
  This was missing for VIEW_CONVERT_EXPR, and unfortunately I have no test 
  case,
  because it is only a problem for STRICT_ALIGNMENT targets, and even there 
  it will
  certainly be really hard to find test cases that exercise this code.
 
  In ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF, ARRAY_RANGE_REF
  we do not have to touch the handling of the outer modifier. However we pass
  EXPAND_REFERENCE to the inner object, which should not be a recursive
  use of any ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF, ARRAY_RANGE_REF.
 
  TARGET_MEM_REF, MEM_REF and VIEW_CONVERT_EXPR know how to handle
  EXPAND_REFERENCE, anything else handles it like EXPAND_NORMAL.
 
 
  Boot-strapped and regression-tested on x86_64-linux-gnu
  OK for trunk?
 
 You point to a weak spot in expansion - that it handles creating
 the base MEM to offset with handled components by recursing
 into the case that handles bare MEM_REFs.  This makes the
 bare MEM_REF handling code somewhat awkward (it's the
 one to assign mem-attrs which are later adjusted for example).
 
 Maybe a better appropach than adding yet another expand
 modifier would be to split out the base MEM expansion part
 out of the bare MEM_REF handling code so we can call that
 instead of recursing.

I think that we should seize every easy opportunity to break up
expand_expr* functions into smaller ones with nicer names and easier
to understand semantics.  So I think is a good idea.

Thanks,

Martin


 
 In this light - instead of a new expand modifier don't you want
 an actual flag that specifies we are coming from a call that
 wants to expand a base?  That is, allow EXPAND_SUM
 but with the recursion flag set?
 
 Finally I think the recursion into the VIEW_CONVERT_EXPR case
 is only there because of the keep_aligning flag of get_inner_reference
 which should be obsolete now that we properly handle its effects
 in get_object_alignment.  So you wouldn't need to adjust this path
 if we finally can get rid of that.
 
 What do others think?
 
 Thanks,
 Richard.
 
  Thanks
  Bernd.


Re: [PATCH] fixing typo in expr.c to allow proper recognition of complex addresses in some arches.

2013-10-25 Thread Igor Shevlyakov
But it's nothing to do with construction inside
memory_address_addr_space but with validity of an address construct.
first thing memory_address_addr_space does is deconstructing
address_mode from as parameter. But if one doesn't pass actual mode
there's not way to find it and call to legitimate_address_p hook will
fail.

On Fri, Oct 25, 2013 at 1:39 AM, Eric Botcazou ebotca...@adacore.com wrote:
 Thanks.  Installed on the trunk.

 Well, no, that will be problematic for some architectures.  The history of
 this piece of code is complicated and it's admittedly lacking a comment, but
 the purpose of the block is clear enough:

 op0 = expand_expr (base, NULL_RTX, VOIDmode, EXPAND_SUM);
 op0 = memory_address_addr_space (address_mode, op0, as);
 if (!integer_zerop (TREE_OPERAND (exp, 1)))
   {
 rtx off
   = immed_double_int_const (mem_ref_offset (exp), address_mode);
 op0 = simplify_gen_binary (PLUS, address_mode, op0, off);
   }
 op0 = memory_address_addr_space (mode, op0, as);

 The offset computation is done in address_mode and then, only at the end,
 converted to mode.

 --
 Eric Botcazou


Re: PR C++/58708 - string literal operator templates broken

2013-10-25 Thread Ed Smith-Rowland

On 10/25/2013 07:54 AM, Paolo Carlini wrote:

On 10/25/2013 01:38 PM, Ed Smith-Rowland wrote:

So, you also want a library testcase?
Up to you: if you feel it would test something that the c++ front-end 
testcase doesn't, why not.


Paolo.

I think this patch should be sufficient - no containers, it just follows 
the pattern of the other tests in the cpp1y directory.

I builds and tests of x86_64-linux.

OK?

Later we can think of library tools that could help authors of literals 
operators.

The bits/bits/parse_numbers.h is a start on this.

Ed

Index: cp/parser.c
===
--- cp/parser.c (revision 203997)
+++ cp/parser.c (working copy)
@@ -3793,22 +3793,39 @@
   tree charvec;
   tree argpack = make_node (NONTYPE_ARGUMENT_PACK);
   const char *str = TREE_STRING_POINTER (value);
-  int i, len = TREE_STRING_LENGTH (value) - 1;
+  int sz = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (value;
+  int len = TREE_STRING_LENGTH (value) / sz - 1;
   tree argvec = make_tree_vec (2);
 
-  tree string_char_type_node = TREE_TYPE (TREE_TYPE (value));
+  tree str_char_type_node = TREE_TYPE (TREE_TYPE (value));
+  str_char_type_node = TYPE_MAIN_VARIANT (str_char_type_node);
 
   /* First template parm is character type.  */
-  TREE_VEC_ELT (argvec, 0) = string_char_type_node;
+  TREE_VEC_ELT (argvec, 0) = str_char_type_node;
 
   /* Fill in CHARVEC with all of the parameters.  */
   charvec = make_tree_vec (len);
-  for (i = 0; i  len; ++i)
-TREE_VEC_ELT (charvec, i) = build_int_cst (string_char_type_node, str[i]);
+  if (sz == 1)
+{
+  for (int i = 0; i  len; ++i)
+   TREE_VEC_ELT (charvec, i) = build_int_cst (str_char_type_node, str[i]);
+}
+  else if (sz == 2)
+{
+  const uint16_t *num = (const uint16_t *)str;
+  for (int i = 0; i  len; ++i)
+   TREE_VEC_ELT (charvec, i) = build_int_cst (str_char_type_node, num[i]);
+}
+  else if (sz == 4)
+{
+  const uint32_t *num = (const uint32_t *)str;
+  for (int i = 0; i  len; ++i)
+   TREE_VEC_ELT (charvec, i) = build_int_cst (str_char_type_node, num[i]);
+}
 
   /* Build the argument packs.  */
   SET_ARGUMENT_PACK_ARGS (argpack, charvec);
-  TREE_TYPE (argpack) = string_char_type_node;
+  TREE_TYPE (argpack) = str_char_type_node;
 
   TREE_VEC_ELT (argvec, 1) = argpack;
 
Index: testsuite/g++.dg/cpp1y/pr58708.C
===
--- testsuite/g++.dg/cpp1y/pr58708.C(revision 0)
+++ testsuite/g++.dg/cpp1y/pr58708.C(working copy)
@@ -0,0 +1,68 @@
+// { dg-options -std=c++1y }
+// { dg-do run }
+
+#include type_traits
+
+struct Foo
+{
+  bool type[4];
+  bool const_type[4];
+  int chars[10];
+};
+
+templatetypename CharT, CharT... str
+Foo
+operator_foo()
+{
+  CharT arr[]{str...};
+
+  Foo foo;
+
+  foo.type[0] = std::is_sameCharT, char::value;
+  foo.type[1] = std::is_sameCharT, wchar_t::value;
+  foo.type[2] = std::is_sameCharT, char16_t::value;
+  foo.type[3] = std::is_sameCharT, char32_t::value;
+  foo.const_type[0] = std::is_sameCharT, const char::value;
+  foo.const_type[1] = std::is_sameCharT, const wchar_t::value;
+  foo.const_type[2] = std::is_sameCharT, const char16_t::value;
+  foo.const_type[3] = std::is_sameCharT, const char32_t::value;
+
+  for(int i; i  sizeof(arr)/sizeof(CharT) - 1; ++i)
+foo.chars[i] = (int)arr[i];
+
+  return foo;
+}
+
+int
+main()
+{
+  Foo foo;
+
+  foo = U\x1\x10001\x10002_foo;
+  if (foo.type[3] != true) __builtin_abort();
+  if (sizeof(foo.chars)/sizeof(char32_t)-1 != 3) __builtin_abort();
+  if (foo.chars[0] != 65536) __builtin_abort();
+  if (foo.chars[1] != 65537) __builtin_abort();
+  if (foo.chars[2] != 65538) __builtin_abort();
+
+  foo = \x61\x62\x63_foo;
+  if (foo.type[0] != true) __builtin_abort();
+  if (sizeof(foo.chars)/sizeof(char)-1 != 3) __builtin_abort();
+  if (foo.chars[0] != 97) __builtin_abort();
+  if (foo.chars[1] != 98) __builtin_abort();
+  if (foo.chars[2] != 99) __builtin_abort();
+
+  foo = L\x01020304\x05060708_foo;
+  if (foo.type[1] != true) __builtin_abort();
+  if (sizeof(foo.chars)/sizeof(wchar_t)-1 != 2) __builtin_abort();
+  if (foo.chars[0] != 16909060) __builtin_abort();
+  if (foo.chars[1] != 84281096) __builtin_abort();
+
+  foo = u\x0102\x0304\x0506\x0708_foo;
+  if (foo.type[2] != true) __builtin_abort();
+  if (sizeof(foo.chars)/sizeof(char16_t)-1 != 4) __builtin_abort();
+  if (foo.chars[0] != 258) __builtin_abort();
+  if (foo.chars[1] != 772) __builtin_abort();
+  if (foo.chars[2] != 1286) __builtin_abort();
+  if (foo.chars[3] != 1800) __builtin_abort();
+}

gcc/cp:

2013-10-25  Edward Smith-Rowland  3dw...@verizon.net

PR c++/58708
* parser.c (make_string_pack): Discover non-const type and size
of character and build parm pack with correct type and chars.

gcc/testsuite:

2013-10-25  Edward Smith-Rowland  3dw...@verizon.net

PR c++/58708

Re: PR C++/58708 - string literal operator templates broken

2013-10-25 Thread Paolo Carlini

On 10/25/2013 03:46 PM, Ed Smith-Rowland wrote:

On 10/25/2013 07:54 AM, Paolo Carlini wrote:

On 10/25/2013 01:38 PM, Ed Smith-Rowland wrote:

So, you also want a library testcase?
Up to you: if you feel it would test something that the c++ front-end 
testcase doesn't, why not.


Paolo.

I think this patch should be sufficient - no containers, it just 
follows the pattern of the other tests in the cpp1y directory.
Note, however, that type_traits isn't small at all, and apparently you 
are only using std::is_same which is just a one line template + a one 
line specialization (just grep in testsuite/g++.dg)


Paolo.


Re: [Patch] Refactor regex executors

2013-10-25 Thread Paolo Carlini

Hi,

On 10/25/2013 04:05 PM, Tim Shen wrote:

On Fri, Oct 25, 2013 at 5:07 AM, Paolo Carlini paolo.carl...@oracle.com wrote:

Ok. Are there measurable performance changes at this point? I'm asking
because we should take the chance and add performance testcases when we
improve the performance: later, unless somebody files a specific Bug report,
we become lazy about those, I know that ;)

I do use another test case, but it's from Stackoverflow. I tend not to
include that; Howeverm I make a bfs version of the previous split.cc
testcase and it indeed reflect the optimization(in my machine, running
time from 294u to 194u).
Ah good. Can't you just have split_bfs.cc defining the 
_GLIBCXX_REGEX_DFS_QUANTIFIERS_LIMIT macro and including the existing 
split.cc?


Paolo.


[PATCH, SH] Add support for inlined builtin_strncmp

2013-10-25 Thread Christian Bruel
Hello,

This patch implements the cmpstrnsi pattern to support the strncmp
builtin for constant lengths. The cmp/str instructions is used for size
= 8 bytes, else fall back to the byte-at-a-time check to favor small
strings.

I now also handle the cases where align is known for both cmpstr and
cmpstrn, so we can avoid the pointer check, and added a schedule
improvement to speculate the extu.b  r1,r1 instruction into the delay
slot, winning an additional instruction (we know that r1 is 0) when the
end of string is reached. The byte-at-a-time loop becomes:

mov.b   @r4+,r1
tst r1,r1
bt/s.L4
mov.b   @r3+,r0
cmp/eq  r1,r0
bt/s.L9
extu.b  r1,r1
.L4:
extu.b  r0,r0
rts
sub r1,r0

Enabled the existing execute/builtins/strncmp-2.c for functional check
and added 2 new target specific tests.

No regressions for -m2 and -m4 for sh-elf.
OK for trunk ?

Many thanks,

Christian





2013-10-27  Christian Bruel  christian.br...@st.com

	* gcc/config/sh/sh-mem.cc (sh_expand_cmpnstr): Moved here.
	(sh_expand_cmpstr): Handle known align and schedule improvements.
	* gcc/config/sh/sh-protos.h (sh_expand_cmpstrn): Declare.
	* gcc/config/sh/sh.md (cmpstrnsi): New pattern.

	* gcc.c-torture/execute/builtins/strncmp-2.c: Enable for SH.
	* gcc.target/sh/cmpstr.c: New test.
	* gcc.target/sh/cmpstrn.c: New test.

Index: config/sh/sh-mem.cc
===
--- config/sh/sh-mem.cc	(revision 204013)
+++ config/sh/sh-mem.cc	(working copy)
@@ -200,22 +200,25 @@ sh_expand_cmpstr (rtx *operands)
   rtx L_return = gen_label_rtx ();
   rtx L_loop_byte = gen_label_rtx ();
   rtx L_end_loop_byte = gen_label_rtx ();
-  rtx L_loop_long = gen_label_rtx ();
-  rtx L_end_loop_long = gen_label_rtx ();
 
   rtx jump, addr1, addr2;
   int prob_unlikely = REG_BR_PROB_BASE / 10;
   int prob_likely = REG_BR_PROB_BASE / 4;
 
-  emit_insn (gen_iorsi3 (tmp1, s1_addr, s2_addr));
-  emit_move_insn (tmp0, GEN_INT (3));
+  rtx L_loop_long = gen_label_rtx ();
+  rtx L_end_loop_long = gen_label_rtx ();
 
-  emit_insn (gen_tstsi_t (tmp0, tmp1));
+  int align = INTVAL (operands[3]);
 
   emit_move_insn (tmp0, const0_rtx);
 
-  jump = emit_jump_insn (gen_branch_false (L_loop_byte));
-  add_int_reg_note (jump, REG_BR_PROB, prob_likely);
+  if (align  4)
+{
+  emit_insn (gen_iorsi3 (tmp1, s1_addr, s2_addr));
+  emit_insn (gen_tstsi_t (GEN_INT (3), tmp1));
+  jump = emit_jump_insn (gen_branch_false (L_loop_byte));
+  add_int_reg_note (jump, REG_BR_PROB, prob_likely);
+}
 
   addr1 = adjust_automodify_address (s1, SImode, s1_addr, 0);
   addr2 = adjust_automodify_address (s2, SImode, s2_addr, 0);
@@ -250,7 +253,7 @@ sh_expand_cmpstr (rtx *operands)
   add_int_reg_note (jump, REG_BR_PROB, prob_likely);
   /* end loop.  */
 
-  /* Fallthu, check if one of the word is greater.  */
+  /* Fallthu, diff results r.  */
   if (TARGET_LITTLE_ENDIAN)
 {
   rtx low_1 = gen_lowpart (HImode, tmp1);
@@ -267,15 +270,15 @@ sh_expand_cmpstr (rtx *operands)
   jump = emit_jump_insn (gen_jump_compact (L_return));
   emit_barrier_after (jump);
 
-  /* start byte loop.  */
-  addr1 = adjust_automodify_address (s1, QImode, s1_addr, 0);
-  addr2 = adjust_automodify_address (s2, QImode, s2_addr, 0);
-
   emit_label (L_end_loop_long);
 
   emit_move_insn (s1_addr, plus_constant (Pmode, s1_addr, -4));
   emit_move_insn (s2_addr, plus_constant (Pmode, s2_addr, -4));
 
+  /* start byte loop.  */
+  addr1 = adjust_automodify_address (s1, QImode, s1_addr, 0);
+  addr2 = adjust_automodify_address (s2, QImode, s2_addr, 0);
+
   emit_label (L_loop_byte);
 
   emit_insn (gen_extendqisi2 (tmp2, addr2));
@@ -289,13 +292,16 @@ sh_expand_cmpstr (rtx *operands)
   add_int_reg_note (jump, REG_BR_PROB, prob_unlikely);
 
   emit_insn (gen_cmpeqsi_t (tmp1, tmp2));
-  emit_jump_insn (gen_branch_true (L_loop_byte));
+  if (flag_delayed_branch)
+emit_insn (gen_zero_extendqisi2 (tmp2, gen_lowpart (QImode, tmp2)));
+  jump = emit_jump_insn (gen_branch_true (L_loop_byte));
   add_int_reg_note (jump, REG_BR_PROB, prob_likely);
   /* end loop.  */
 
   emit_label (L_end_loop_byte);
 
-  emit_insn (gen_zero_extendqisi2 (tmp2, gen_lowpart (QImode, tmp2)));
+  if (! flag_delayed_branch)
+emit_insn (gen_zero_extendqisi2 (tmp2, gen_lowpart (QImode, tmp2)));
   emit_insn (gen_zero_extendqisi2 (tmp1, gen_lowpart (QImode, tmp1)));
 
   emit_label (L_return);
@@ -305,3 +311,166 @@ sh_expand_cmpstr (rtx *operands)
   return true;
 }
 
+/* Emit code to perform a strcmp.
+
+   OPERANDS[0] is the destination.
+   OPERANDS[1] is the first string.
+   OPERANDS[2] is the second string.
+   OPERANDS[3] is the length.
+   OPERANDS[4] is the align.  */
+bool
+sh_expand_cmpnstr (rtx *operands)
+{
+  rtx s1 = copy_rtx (operands[1]);
+  rtx s2 = copy_rtx (operands[2]);
+
+  rtx s1_addr = copy_addr_to_reg (XEXP (s1, 0));
+  rtx s2_addr = copy_addr_to_reg 

Re: [PATCH, SH] Add support for inlined builtin_strncmp

2013-10-25 Thread Christian Bruel
In the ChangeLog,  the entry

* gcc/config/sh/sh-mem.cc (sh_expand_cmpnstr): Moved here.

is instead

 * gcc/config/sh/sh-mem.cc (sh_expand_cmpnstr): New function.

Sorry for this,

Christian



Re: [PATCH, C++, PR58282] Handle noexcept on transactions with -fno-exceptions

2013-10-25 Thread Tom de Vries
On 07/09/13 18:54, Jason Merrill wrote:
 OK.
 

I've reproduced the same problem with the 4.7 and 4.8 branch, and checked that
applying the patch fixes the problem.

Committed to 4.7 and 4.8 branch as well.

Thanks,
- Tom


Re: [Patch] Refactor regex executors

2013-10-25 Thread Paolo Carlini

On 10/25/2013 04:22 PM, Tim Shen wrote:

On Fri, Oct 25, 2013 at 10:12 AM, Paolo Carlini
paolo.carl...@oracle.com wrote:

Ah good. Can't you just have split_bfs.cc defining the
_GLIBCXX_REGEX_DFS_QUANTIFIERS_LIMIT macro and including the existing
split.cc?

Here :) Hack the __FILE__ macro to let it report correctly.
Uhm, cute, didn't consider the naming issue, but I'm not sure we want to 
do that... What about having the code in a split.h and including it from 
split.cc and split_bfs.cc instead?


Paolo.


Re: [PATCH][buildrobot] libcpp/lex.c: Use enum properly

2013-10-25 Thread Tom Tromey
 Andrew == Andrew Pinski pins...@gmail.com writes:

 enum raw_str_phase phase = RAW_STR_PREFIX;

Andrew This is a good work around but please add a comment of why this is
Andrew needed (to work around a bug in XLC++).

I agree.
It's ok with this update.

thanks,
Tom


Re: PR C++/58708 - string literal operator templates broken

2013-10-25 Thread Ed Smith-Rowland

On 10/25/2013 10:01 AM, Paolo Carlini wrote:

On 10/25/2013 03:46 PM, Ed Smith-Rowland wrote:

On 10/25/2013 07:54 AM, Paolo Carlini wrote:

On 10/25/2013 01:38 PM, Ed Smith-Rowland wrote:

So, you also want a library testcase?
Up to you: if you feel it would test something that the c++ 
front-end testcase doesn't, why not.


Paolo.

I think this patch should be sufficient - no containers, it just 
follows the pattern of the other tests in the cpp1y directory.
Note, however, that type_traits isn't small at all, and apparently 
you are only using std::is_same which is just a one line template + a 
one line specialization (just grep in testsuite/g++.dg)


Paolo.


Here is one with necessary tools inlined.
Ok?


gcc/cp:

2013-10-25  Edward Smith-Rowland  3dw...@verizon.net

PR c++/58708
* parser.c (make_string_pack): Discover non-const type and size
of character and build parm pack with correct type and chars.

gcc/testsuite:

2013-10-25  Edward Smith-Rowland  3dw...@verizon.net

PR c++/58708
*g++.dg/cpp1y/pr58708.C : New.

Index: cp/parser.c
===
--- cp/parser.c (revision 203997)
+++ cp/parser.c (working copy)
@@ -3793,22 +3793,39 @@
   tree charvec;
   tree argpack = make_node (NONTYPE_ARGUMENT_PACK);
   const char *str = TREE_STRING_POINTER (value);
-  int i, len = TREE_STRING_LENGTH (value) - 1;
+  int sz = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (value;
+  int len = TREE_STRING_LENGTH (value) / sz - 1;
   tree argvec = make_tree_vec (2);
 
-  tree string_char_type_node = TREE_TYPE (TREE_TYPE (value));
+  tree str_char_type_node = TREE_TYPE (TREE_TYPE (value));
+  str_char_type_node = TYPE_MAIN_VARIANT (str_char_type_node);
 
   /* First template parm is character type.  */
-  TREE_VEC_ELT (argvec, 0) = string_char_type_node;
+  TREE_VEC_ELT (argvec, 0) = str_char_type_node;
 
   /* Fill in CHARVEC with all of the parameters.  */
   charvec = make_tree_vec (len);
-  for (i = 0; i  len; ++i)
-TREE_VEC_ELT (charvec, i) = build_int_cst (string_char_type_node, str[i]);
+  if (sz == 1)
+{
+  for (int i = 0; i  len; ++i)
+   TREE_VEC_ELT (charvec, i) = build_int_cst (str_char_type_node, str[i]);
+}
+  else if (sz == 2)
+{
+  const uint16_t *num = (const uint16_t *)str;
+  for (int i = 0; i  len; ++i)
+   TREE_VEC_ELT (charvec, i) = build_int_cst (str_char_type_node, num[i]);
+}
+  else if (sz == 4)
+{
+  const uint32_t *num = (const uint32_t *)str;
+  for (int i = 0; i  len; ++i)
+   TREE_VEC_ELT (charvec, i) = build_int_cst (str_char_type_node, num[i]);
+}
 
   /* Build the argument packs.  */
   SET_ARGUMENT_PACK_ARGS (argpack, charvec);
-  TREE_TYPE (argpack) = string_char_type_node;
+  TREE_TYPE (argpack) = str_char_type_node;
 
   TREE_VEC_ELT (argvec, 1) = argpack;
 
Index: testsuite/g++.dg/cpp1y/pr58708.C
===
--- testsuite/g++.dg/cpp1y/pr58708.C(revision 0)
+++ testsuite/g++.dg/cpp1y/pr58708.C(working copy)
@@ -0,0 +1,91 @@
+// { dg-options -std=c++1y }
+// { dg-do run }
+
+templatetypename _Tp, _Tp __v
+  struct integral_constant
+  {
+static constexpr _Tp  value = __v;
+typedef _Tp   value_type;
+typedef integral_constant_Tp, __v   type;
+constexpr operator value_type() const { return value; }
+constexpr value_type operator()() const { return value; }
+  };
+
+templatetypename _Tp, _Tp __v
+  constexpr _Tp integral_constant_Tp, __v::value;
+
+typedef integral_constantbool, true true_type;
+
+typedef integral_constantbool, falsefalse_type;
+
+templatetypename, typename
+  struct is_same
+  : public false_type { };
+
+templatetypename _Tp
+  struct is_same_Tp, _Tp
+  : public true_type { };
+
+struct Foo
+{
+  bool type[4];
+  bool const_type[4];
+  int chars[10];
+};
+
+templatetypename CharT, CharT... str
+Foo
+operator_foo()
+{
+  CharT arr[]{str...};
+
+  Foo foo;
+
+  foo.type[0] = is_sameCharT, char::value;
+  foo.type[1] = is_sameCharT, wchar_t::value;
+  foo.type[2] = is_sameCharT, char16_t::value;
+  foo.type[3] = is_sameCharT, char32_t::value;
+  foo.const_type[0] = is_sameCharT, const char::value;
+  foo.const_type[1] = is_sameCharT, const wchar_t::value;
+  foo.const_type[2] = is_sameCharT, const char16_t::value;
+  foo.const_type[3] = is_sameCharT, const char32_t::value;
+
+  for(int i; i  sizeof(arr)/sizeof(CharT) - 1; ++i)
+foo.chars[i] = (int)arr[i];
+
+  return foo;
+}
+
+int
+main()
+{
+  Foo foo;
+
+  foo = U\x1\x10001\x10002_foo;
+  if (foo.type[3] != true) __builtin_abort();
+  if (sizeof(foo.chars)/sizeof(char32_t)-1 != 3) __builtin_abort();
+  if (foo.chars[0] != 65536) __builtin_abort();
+  if (foo.chars[1] != 65537) __builtin_abort();
+  if (foo.chars[2] != 65538) __builtin_abort();
+
+  foo = \x61\x62\x63_foo;
+  if (foo.type[0] != true) 

Re: PR C++/58708 - string literal operator templates broken

2013-10-25 Thread Ed Smith-Rowland

On 10/25/2013 10:01 AM, Paolo Carlini wrote:

On 10/25/2013 03:46 PM, Ed Smith-Rowland wrote:

On 10/25/2013 07:54 AM, Paolo Carlini wrote:

On 10/25/2013 01:38 PM, Ed Smith-Rowland wrote:

So, you also want a library testcase?
Up to you: if you feel it would test something that the c++ 
front-end testcase doesn't, why not.


Paolo.

I think this patch should be sufficient - no containers, it just 
follows the pattern of the other tests in the cpp1y directory.
Note, however, that type_traits isn't small at all, and apparently 
you are only using std::is_same which is just a one line template + a 
one line specialization (just grep in testsuite/g++.dg)


Paolo.


Hold up.  something didn't work.
I'll be back later.
Ed



Re: PR C++/58708 - string literal operator templates broken

2013-10-25 Thread Paolo Carlini

On 10/25/2013 04:29 PM, Ed Smith-Rowland wrote:

Here is one with necessary tools inlined.

You still have a lot of redundant code... I'm not going to insist anyway.

Paolo.


Re: PR C++/58708 - string literal operator templates broken

2013-10-25 Thread Jakub Jelinek
On Fri, Oct 25, 2013 at 10:29:41AM -0400, Ed Smith-Rowland wrote:
 2013-10-25  Edward Smith-Rowland  3dw...@verizon.net
 
 PR c++/58708
   * parser.c (make_string_pack): Discover non-const type and size
   of character and build parm pack with correct type and chars.
 
 gcc/testsuite:
 
 2013-10-25  Edward Smith-Rowland  3dw...@verizon.net
 
 PR c++/58708
   *g++.dg/cpp1y/pr58708.C : New.

 --- testsuite/g++.dg/cpp1y/pr58708.C  (revision 0)
 +++ testsuite/g++.dg/cpp1y/pr58708.C  (working copy)
 @@ -0,0 +1,91 @@
 +// { dg-options -std=c++1y }
 +// { dg-do run }
 +
 +templatetypename _Tp, _Tp __v
 +  struct integral_constant
 +  {
 +static constexpr _Tp  value = __v;
 +typedef _Tp   value_type;
 +typedef integral_constant_Tp, __v   type;
 +constexpr operator value_type() const { return value; }
 +constexpr value_type operator()() const { return value; }
 +  };
 +
 +templatetypename _Tp, _Tp __v
 +  constexpr _Tp integral_constant_Tp, __v::value;
 +
 +typedef integral_constantbool, true true_type;
 +
 +typedef integral_constantbool, falsefalse_type;
 +
 +templatetypename, typename
 +  struct is_same
 +  : public false_type { };
 +
 +templatetypename _Tp
 +  struct is_same_Tp, _Tp
 +  : public true_type { };

Why not just the minimal:
template class T, class U 
struct is_same {
  static constexpr bool value = false;
};

template class T 
struct is_sameT, T {
  static constexpr bool value = true;
};
other tests are using?

Jakub


Re: [PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping

2013-10-25 Thread Martin Jambor
Hi,

On Thu, Oct 24, 2013 at 01:02:51AM +0200, Steven Bosscher wrote:
 On Wed, Oct 23, 2013 at 6:46 PM, Martin Jambor wrote:
 
   /* Perform the second half of the transformation started in
  @@ -4522,7 +4704,15 @@ ira (FILE *f)
allocation because of -O0 usage or because the function is too
big.  */
 if (ira_conflicts_p)
  -find_moveable_pseudos ();
  +{
  +  df_analyze ();
  +  calculate_dominance_info (CDI_DOMINATORS);
  +
  +  find_moveable_pseudos ();
  +  split_live_ranges_for_shrink_wrap ();
  +
  +  free_dominance_info (CDI_DOMINATORS);
  +}
 
 
 You probably want to add another df_analyze if
 split_live_ranges_for_shrink_wrap makes code transformations. AFAIU
 find_moveable_pseudos doesn't change global liveness but your
 transformation might. IRA/LRA need up-to-date DF_LR results to compute
 allocno live ranges.
 

OK, I have changed the patch to fo that (it is below, still bootstraps
and passes tests on x86_64 fine).  However, I have noticed that the
corresponding part in function ira now looks like:

  /* ... */
  if (delete_trivially_dead_insns (get_insns (), max_reg_num ()))
df_analyze ();

  /* It is not worth to do such improvement when we use a simple
 allocation because of -O0 usage or because the function is too
 big.  */
  if (ira_conflicts_p)
{
  df_analyze ();
  calculate_dominance_info (CDI_DOMINATORS);

  find_moveable_pseudos ();
  if (split_live_ranges_for_shrink_wrap ())
df_analyze ();

  free_dominance_info (CDI_DOMINATORS);
}
  /* ... */

So, that left me wondering whether the first call to df_analyze is
actually necessary, or whether perhaps the data are actually already
up to date.  What do you think?

Thanks for all the feedback,

Martin


2013-10-23  Martin Jambor  mjam...@suse.cz

PR rtl-optimization/10474
* ira.c (find_moveable_pseudos): Do not calculate dominance info
nor df analysis.
(interesting_dest_for_shprep): New function.
(split_live_ranges_for_shrink_wrap): Likewise.
(ira): Calculate dominance info and df analysis. Call
split_live_ranges_for_shrink_wrap.

testsuite/
* gcc.dg/pr10474.c: New testcase.
* gcc.dg/ira-shrinkwrap-prep-1.c: Likewise.
* gcc.dg/ira-shrinkwrap-prep-2.c: Likewise.

diff --git a/gcc/ira.c b/gcc/ira.c
index 203fbff..532db31 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -3989,9 +3989,6 @@ find_moveable_pseudos (void)
   pseudo_replaced_reg.release ();
   pseudo_replaced_reg.safe_grow_cleared (max_regs);
 
-  df_analyze ();
-  calculate_dominance_info (CDI_DOMINATORS);
-
   i = 0;
   bitmap_initialize (live, 0);
   bitmap_initialize (used, 0);
@@ -4311,7 +4308,196 @@ find_moveable_pseudos (void)
   regstat_free_ri ();
   regstat_init_n_sets_and_refs ();
   regstat_compute_ri ();
-  free_dominance_info (CDI_DOMINATORS);
+}
+
+
+/* If insn is interesting for parameter range-splitting shring-wrapping
+   preparation, i.e. it is a single set from a hard register to a pseudo, which
+   is live at CALL_DOM, return the destination.  Otherwise return NULL.  */
+
+static rtx
+interesting_dest_for_shprep (rtx insn, basic_block call_dom)
+{
+  rtx set = single_set (insn);
+  if (!set)
+return NULL;
+  rtx src = SET_SRC (set);
+  rtx dest = SET_DEST (set);
+  if (!REG_P (src) || !HARD_REGISTER_P (src)
+  || !REG_P (dest) || HARD_REGISTER_P (dest)
+  || (call_dom  !bitmap_bit_p (df_get_live_in (call_dom), REGNO (dest
+return NULL;
+  return dest;
+}
+
+/* Split live ranges of pseudos that are loaded from hard registers in the
+   first BB in a BB that dominates all non-sibling call if such a BB can be
+   found and is not in a loop.  Return true if the function has made any
+   changes.  */
+
+static bool
+split_live_ranges_for_shrink_wrap (void)
+{
+  basic_block bb, call_dom = NULL;
+  basic_block first = single_succ (ENTRY_BLOCK_PTR);
+  rtx insn, last_interesting_insn = NULL;
+  bitmap_head need_new, reachable;
+  vecbasic_block queue;
+
+  if (!flag_shrink_wrap)
+return false;
+
+  bitmap_initialize (need_new, 0);
+  bitmap_initialize (reachable, 0);
+  queue.create (n_basic_blocks);
+
+  FOR_EACH_BB (bb)
+FOR_BB_INSNS (bb, insn)
+  if (CALL_P (insn)  !SIBLING_CALL_P (insn))
+   {
+ if (bb == first)
+   {
+ bitmap_clear (need_new);
+ bitmap_clear (reachable);
+ queue.release ();
+ return false;
+   }
+
+ bitmap_set_bit (need_new, bb-index);
+ bitmap_set_bit (reachable, bb-index);
+ queue.quick_push (bb);
+ break;
+   }
+
+  if (queue.is_empty ())
+{
+  bitmap_clear (need_new);
+  bitmap_clear (reachable);
+  queue.release ();
+  return false;
+}
+
+  while (!queue.is_empty ())
+{
+  edge e;
+  edge_iterator ei;
+
+  bb = queue.pop ();
+  FOR_EACH_EDGE (e, ei, bb-succs)
+   if (e-dest != 

[C++ Patch] PR 58878

2013-10-25 Thread Paolo Carlini

Hi,

here the issue is that we fail to detect shadowing declarations in 
inline member function templates. The reason is the following check in 
check_template_shadow:


   if (decl == olddecl
-  || TEMPLATE_PARMS_FOR_INLINE (current_template_parms))
+  || (DECL_TEMPLATE_PARM_P (decl)
+   TEMPLATE_PARMS_FOR_INLINE (current_template_parms)))
 return true;

which, to avoid duplicate error messages involving template parameters 
(see g++.old-deja/g++.benjamin/tem0[34].C) ends up skipping VAR_DECLs 
etc too.


Tested x86_64-linux.

Thanks,
Paolo.

/
/cp
2013-10-25  Paolo Carlini  paolo.carl...@oracle.com

PR c++/58878
* pt.c (check_template_shadow): Don't skip declarations in inline
member templates.

/testsuite
2013-10-25  Paolo Carlini  paolo.carl...@oracle.com

PR c++/58878
* g++.dg/template/pr58878.C: New.
Index: cp/pt.c
===
--- cp/pt.c (revision 204056)
+++ cp/pt.c (working copy)
@@ -3511,7 +3511,8 @@ check_template_shadow (tree decl)
  name inside a class.  We check TPFI to avoid duplicate errors for
  inline member templates.  */
   if (decl == olddecl
-  || TEMPLATE_PARMS_FOR_INLINE (current_template_parms))
+  || (DECL_TEMPLATE_PARM_P (decl)
+  TEMPLATE_PARMS_FOR_INLINE (current_template_parms)))
 return true;
 
   error (declaration of %q+#D, decl);
Index: testsuite/g++.dg/template/pr58878.C
===
--- testsuite/g++.dg/template/pr58878.C (revision 0)
+++ testsuite/g++.dg/template/pr58878.C (working copy)
@@ -0,0 +1,62 @@
+// PR c++/58878
+
+// Template-members of non-template class
+struct A
+{
+template typename t// { dg-error shadows }
+void f()
+{
+int t = 1;   // { dg-error declaration }
+}
+
+template typename t
+void g();
+};
+
+template typename t// { dg-error shadows }
+void A::g()
+{
+int t = 2;   // { dg-error declaration }
+}
+
+// (Non-template) Members of template class
+template typename t// { dg-error shadows }
+struct B
+{
+void f()
+{
+int t = 3;   // { dg-error declaration }
+}
+
+void g();
+};
+
+template typename t// { dg-error shadows }
+void Bt::g()
+{
+int t = 4;   // { dg-error declaration }
+}
+
+
+// Template members of template class
+template typename t// { dg-error shadows }
+struct C
+{
+template typename s// { dg-error shadows }
+void f()
+{
+int t = 5;   // { dg-error declaration }
+int s = 6;   // { dg-error declaration }
+}
+
+template typename s
+void g();
+};
+
+template typename t// { dg-error shadows }
+template typename s// { dg-error shadows }
+void Ct::g()
+{
+int t = 7;   // { dg-error declaration }
+int s = 8;   // { dg-error declaration }
+}


Re: [C++ Patch] PR 58878

2013-10-25 Thread Jason Merrill

OK.

Jason


Re: [Patch, C++] Add C++ FE support for #pragma GCC ivdep

2013-10-25 Thread Jakub Jelinek
On Thu, Oct 24, 2013 at 09:17:55PM +0200, Tobias Burnus wrote:
 2013-08-24  Tobias Burnus  bur...@net-b.de
 
   PR other/33426
   * g++.dg/parse/ivdep.C: New.
   * g++.dg/vect/pr33426-ivdep.cc: New.

FYI, I'm seeing various new FAILs on i686-linux:
+FAIL: gcc.dg/vect/vect-ivdep-1.c  (test for warnings, line )
+FAIL: gcc.dg/vect/vect-ivdep-1.c -flto  (test for warnings, line )
+FAIL: gfortran.dg/vect/vect-do-concurrent-1.f90  -O   (test for warnings, line 
)
+FAIL: g++.dg/vect/pr33426-ivdep.cc -std=gnu++98  (test for warnings, line )
+FAIL: g++.dg/vect/pr33426-ivdep.cc -std=gnu++11  (test for warnings, line )

 --- /dev/null
 +++ b/gcc/testsuite/g++.dg/vect/pr33426-ivdep.cc
 @@ -0,0 +1,19 @@
 +/* { dg-do compile } */
 +/* { dg-require-effective-target vect_float } */
 +/* { dg-options -O3 -fopt-info-vec-optimized } */

The problem is likely the same in all cases and same as I've mailed about
recently, dg-options should never be used in *.dg/vect/* testcases,
instead just use dg-additional-options, because otherwise the
required target options to enable vectorization aren't passed in.

Jakub


Re: Aliasing: look through pointer's def stmt

2013-10-25 Thread Jeff Law

On 10/25/13 03:29, Marc Glisse wrote:



It doesn't get lost, but this is the missed optimization that malloc
results still alias with global pointers (here a function argument).
Taking into account the offset shouldn't change anything.


I think this is a different issue (but if you say improving malloc would
help, maybe).
Has the patent on Steensgaard's techniques expired yet?  IIRC it would 
handle this kind of stuff quite well.


Jeff


Re: [Patch] Fix gcc.dg/20050922-*.c

2013-10-25 Thread Mike Stump
On Oct 24, 2013, at 7:33 PM, Hans-Peter Nilsson h...@bitrange.com wrote:
 On Thu, 24 Oct 2013, Hans-Peter Nilsson wrote:
 On Mon, 21 Oct 2013, Mike Stump wrote:
 
 On Oct 21, 2013, at 3:28 AM, Vidya Praveen vidyaprav...@arm.com wrote:
 Tests gcc.dg/20050922-1.c and gcc.dg/20050922-2.c includes stdlib.h. This 
 can
 be a issue especially since they define uint32_t.
 
 OK for 4.7, 4.8?
 
 For release branches, you'd need to transition from the
 theoretical to the practical.  On which systems (software)
 does
 it fail?  If none, then no, a back port isn't necessary.  If
 it
 fails on a system (or software) on which real users use, then
 I'll approve it once you name the system (software) and let it
 bake on trunk for a week and see if anyone objects?
 
 I too would like to include this change on those branches, as
 recent generic newlib changes has caused these tests to break
 (regress) for my autotester for cris-elf testing those branches.
 
 Uhm, I'm on the fence, half-way wanting to retract my
 suggestion.  This seems a recent bug in newlib, in which e.g.
 #include stdlib.h causes uint32_t to be defined, bleeding from
 newlib-internal include of stdint.h.  On the other hand, testing
 that bleed isn't the purpose of these tests.

Ah, that's what I was interested in, recent change in newlib; that makes it 
even more reasonable to me.  Standard headers are supposed to include all 
headers they need, and if they need uint32_t (or any of the types the header 
that defines that type has in it) in an interface in any header that that 
header needs…  then it isn't a gratuitous stupidity.

Re: [PATCH] RFA: Remove mudflap

2013-10-25 Thread Joseph S. Myers
On Fri, 25 Oct 2013, Richard Biener wrote:

 Ok with preserving options as dummy, see examples like
 
 fargument-alias
 Common Ignore
 Does nothing. Preserved for backward compatibility.
 
 I don't think we should sorry (), but we can certainly warn that
 mudflap was replaced by -fsanitize=address(?).  Implementing
 the warning properly isn't a requirement for the patch to go in
 (but the options should be still handled, but ignored).  We can
 followup with a patch implementing the warning.

Note that warning is just a matter of putting

Warn(switch %qs is no longer supported)

alongside Ignore - see various options in c.opt for examples.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Fix C++0x memory model for -fno-strict-volatile-bitfields on ARM

2013-10-25 Thread Joseph S. Myers
On Fri, 25 Oct 2013, Richard Biener wrote:

 Btw, the C++ standard doesn't cover packed or aligned attributes so
 we could declare this a non-issue.  Any opinion on that?

I think the memory model naturally applies to packed structures (i.e., 
writes to fields in them should not write to any other fields except as 
part of a sequence of consecutive non-zero-width bit-fields, unless 
allow-store-data-races is set).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH][ubsan] Add VLA bound instrumentation

2013-10-25 Thread Marek Polacek
On Thu, Oct 24, 2013 at 03:57:17PM -0400, Jason Merrill wrote:
 On 09/25/2013 08:41 AM, Marek Polacek wrote:
 +  /* Do the instrumentation of VLAs if desired.  */
 +  if ((flag_sanitize  SANITIZE_VLA)
 +   size  !TREE_CONSTANT (size)
 +  /* From C++1y onwards, we throw an exception on a negative length size
 + of an array.  */
 +   cxx_dialect  cxx1y)
 
 This code is in a completely different place from the C++1y code in
 cp_finish_decl; they should be in the same place.  I'm also
 concerned that doing it here will mean adding sanitization code to
 template definitions, but I think we want to wait to add it until
 instantiation time.
 
I've tried to implement the instrumentation in cp_finish_decl.
However, the problem is with multidimensional arrays, e.g. for

int x = -1;
int a[1][x];

array_of_runtime_bound_p returns false, thus we don't instrument this
at all, nor throw an exception in c++1y mode...  I don't know what to
do with that.  Previous implementation in create_array_type_for_decl
handled this fine.

 +  /* Prevent bogus set-but-not-used warnings: we're definitely using
 + the variable.  */
 +  if (VAR_P (size))
 +DECL_READ_P (size) = 1;
 
 Use mark_rvalue_use for this.

Ah, thanks.  This is not needed anymore.

Both ubsan testsuite + bootstrap-ubsan pass.

2013-10-25  Marek Polacek  pola...@redhat.com

Implement -fsanitize=vla-bound.
* opts.c (common_handle_option): Handle vla-bound.
* sanitizer.def (BUILT_IN_UBSAN_HANDLE_VLA_BOUND_NOT_POSITIVE):
Define.
* flag-types.h (enum sanitize_code): Add SANITIZE_VLA.
* asan.c (initialize_sanitizer_builtins): Build BT_FN_VOID_PTR_PTR.
c-family/
* c-ubsan.c: Don't include hash-table.h.
(ubsan_instrument_vla): New function.
* c-ubsan.h: Declare it.
cp/
* decl.c (cp_finish_decl): Add VLA instrumentation.
c/
* c-decl.c (grokdeclarator): Add VLA instrumentation.
testsuite/
* g++.dg/ubsan/cxx1y-vla.C: New test.
* c-c++-common/ubsan/vla-3.c: New test.
* c-c++-common/ubsan/vla-2.c: New test.
* c-c++-common/ubsan/vla-4.c: New test.
* c-c++-common/ubsan/vla-1.c: New test.

--- gcc/opts.c.mp   2013-10-25 11:56:57.69722 +0200
+++ gcc/opts.c  2013-10-25 11:57:04.221224139 +0200
@@ -1445,6 +1445,7 @@ common_handle_option (struct gcc_options
  { undefined, SANITIZE_UNDEFINED, sizeof undefined - 1 },
  { unreachable, SANITIZE_UNREACHABLE,
sizeof unreachable - 1 },
+ { vla-bound, SANITIZE_VLA, sizeof vla-bound - 1 },
  { NULL, 0, 0 }
};
const char *comma;
--- gcc/c-family/c-ubsan.c.mp   2013-10-25 11:56:57.699200012 +0200
+++ gcc/c-family/c-ubsan.c  2013-10-25 11:57:04.223224148 +0200
@@ -25,7 +25,6 @@ along with GCC; see the file COPYING3.
 #include alloc-pool.h
 #include cgraph.h
 #include gimple.h
-#include hash-table.h
 #include output.h
 #include toplev.h
 #include ubsan.h
@@ -86,8 +85,7 @@ ubsan_instrument_division (location_t lo
   return t;
 }
 
-/* Instrument left and right shifts.  If not instrumenting, return
-   NULL_TREE.  */
+/* Instrument left and right shifts.  */
 
 tree
 ubsan_instrument_shift (location_t loc, enum tree_code code,
@@ -157,4 +155,23 @@ ubsan_instrument_shift (location_t loc,
   t = fold_build3 (COND_EXPR, void_type_node, t, tt, void_zero_node);
 
   return t;
+}
+
+/* Instrument variable length array bound.  */
+
+tree
+ubsan_instrument_vla (location_t loc, tree size)
+{
+  tree type = TREE_TYPE (size);
+  tree t, tt;
+
+  t = fold_build2 (LE_EXPR, boolean_type_node, size, build_int_cst (type, 0));
+  tree data = ubsan_create_data (__ubsan_vla_data,
+loc, ubsan_type_descriptor (type), NULL_TREE);
+  data = build_fold_addr_expr_loc (loc, data);
+  tt = builtin_decl_explicit (BUILT_IN_UBSAN_HANDLE_VLA_BOUND_NOT_POSITIVE);
+  tt = build_call_expr_loc (loc, tt, 2, data, ubsan_encode_value (size));
+  t = fold_build3 (COND_EXPR, void_type_node, t, tt, void_zero_node);
+
+  return t;
 }
--- gcc/c-family/c-ubsan.h.mp   2013-10-25 11:56:57.700200016 +0200
+++ gcc/c-family/c-ubsan.h  2013-10-25 11:57:04.223224148 +0200
@@ -23,5 +23,6 @@ along with GCC; see the file COPYING3.
 
 extern tree ubsan_instrument_division (location_t, tree, tree);
 extern tree ubsan_instrument_shift (location_t, enum tree_code, tree, tree);
+extern tree ubsan_instrument_vla (location_t, tree);
 
 #endif  /* GCC_C_UBSAN_H  */
--- gcc/sanitizer.def.mp2013-10-25 11:56:57.703200028 +0200
+++ gcc/sanitizer.def   2013-10-25 11:57:04.225224158 +0200
@@ -297,3 +297,7 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HAN
  __ubsan_handle_builtin_unreachable,
  BT_FN_VOID_PTR,
  ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_VLA_BOUND_NOT_POSITIVE,
+ 

Re: Aliasing: look through pointer's def stmt

2013-10-25 Thread Marc Glisse

On Fri, 25 Oct 2013, Richard Biener wrote:


you can followup with handling POINTER_PLUS_EXPR if you like.


Like this? (bootstrap+testsuite on x86_64-unknown-linux-gnu)

2013-10-26  Marc Glisse  marc.gli...@inria.fr

gcc/
* tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Look for a
POINTER_PLUS_EXPR in the defining statement.

gcc/testsuite/
* gcc.dg/tree-ssa/alias-24.c: New file.

--
Marc GlisseIndex: gcc/testsuite/gcc.dg/tree-ssa/alias-24.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/alias-24.c(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/alias-24.c(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -fdump-tree-optimized } */
+
+void f (const char *c, int *i)
+{
+  *i = 42;
+  __builtin_memcpy (i + 1, c, sizeof (int));
+  if (*i != 42) __builtin_abort();
+}
+
+/* { dg-final { scan-tree-dump-not abort optimized } } */
+/* { dg-final { cleanup-tree-dump optimized } } */
+

Property changes on: gcc/testsuite/gcc.dg/tree-ssa/alias-24.c
___
Added: svn:keywords
## -0,0 +1 ##
+Author Date Id Revision URL
\ No newline at end of property
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Index: gcc/tree-ssa-alias.c
===
--- gcc/tree-ssa-alias.c(revision 204071)
+++ gcc/tree-ssa-alias.c(working copy)
@@ -567,20 +567,28 @@ void
 ao_ref_init_from_ptr_and_size (ao_ref *ref, tree ptr, tree size)
 {
   HOST_WIDE_INT t1, t2;
   ref-ref = NULL_TREE;
   if (TREE_CODE (ptr) == SSA_NAME)
 {
   gimple stmt = SSA_NAME_DEF_STMT (ptr);
   if (gimple_assign_single_p (stmt)
   gimple_assign_rhs_code (stmt) == ADDR_EXPR)
ptr = gimple_assign_rhs1 (stmt);
+  else if (is_gimple_assign (stmt)
+   gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR
+   host_integerp (gimple_assign_rhs2 (stmt), 1))
+   {
+ ao_ref_init_from_ptr_and_size (ref, gimple_assign_rhs1 (stmt), size);
+ ref-offset += 8 * TREE_INT_CST_LOW (gimple_assign_rhs2 (stmt));
+ return;
+   }
 }
 
   if (TREE_CODE (ptr) == ADDR_EXPR)
 ref-base = get_ref_base_and_extent (TREE_OPERAND (ptr, 0),
 ref-offset, t1, t2);
   else
 {
   ref-base = build2 (MEM_REF, char_type_node,
  ptr, null_pointer_node);
   ref-offset = 0;


Re: [PATCH 1/2] Handwritten part of conversion of IPA pass hooks to virtual functions

2013-10-25 Thread David Malcolm
On Thu, 2013-10-10 at 09:56 +0200, Richard Biener wrote:
 As a general observation I don't like the
 
  if (pass-has_foo_member_function)
pass-foo_member_function ()
 
 style.  It is totally against the idea of having virtual functions.  Why
 not simply provide stub implementations in the base classes?

(sorry about the belated response)

The reason for the has_foo_member_function flags is that everywhere we
use them, the style is more like this:

if (pass_has_foo_member_function)
  {
 do_various_stuff ();
 pass-foo_member_function ();
 do_more_stuff ();
  }

where the precise stuff varies between the different vfuncs.  I don't
think there's a portable way to check if a vfunc has been overridden, so
having these flags seems the best way of handling this kind of logic.

It's very similar to what we now have for the gate and execute hooks
for the opt_pass base class.   (In the first version of that work that I
posted, I had vfuncs for the has_ logic, but simple bool flags let us
avoid that virtual call, hence I went with the latter approach).

The patches still apply cleanly to trunk; I'm rechecking the bootstrap
and regtest.  Are they OK for trunk if the testing passes?
[For context: http://gcc.gnu.org/ml/gcc-patches/2013-10/msg00639.html ]

Thanks
Dave

 On Thu, Oct 10, 2013 at 2:41 AM, David Malcolm dmalc...@redhat.com wrote:
  * passes.c (ipa_opt_pass_d::generate_summary): New.
  (ipa_opt_pass_d::write_summary): New.
  (ipa_opt_pass_d::read_summary): New.
  (ipa_opt_pass_d::write_optimization_summary): New.
  (ipa_opt_pass_d::read_optimization_summary): New.
  (ipa_opt_pass_d::stmt_fixup): New.
  (ipa_opt_pass_d::function_transform): New.
  (ipa_opt_pass_d::variable_transform): New.
  (execute_ipa_summary_passes): Use ipa_data_. to determine if
  pass has a generate_summary implemenation, since we can't check
  a vtable entry.
  (execute_one_ipa_transform_pass): Likewise for
  function_transform, and for the
  function_transform_todo_flags_start field.
  (ipa_write_summaries_2): Likewise for write_summary.
  (ipa_write_optimization_summaries_1): Likewise for
  write_optimization_summary.
  (ipa_read_summaries_1): Likewise for read_summary.
  (ipa_read_optimization_summaries_1): Likewise for
  read_optimization_summary.
  (execute_ipa_stmt_fixups): Likewise for stmt_fixup.
 
  * tree-pass.h (struct ipa_pass_data): New.
  (ipa_opt_pass_d::generate_summary): Convert to virtual function.
  (ipa_opt_pass_d::write_summary): Likewise.
  (ipa_opt_pass_d::read_summary): Likewise.
  (ipa_opt_pass_d::write_optimization_summary): Likewise.
  (ipa_opt_pass_d::read_optimization_summary): Likewise.
  (ipa_opt_pass_d::stmt_fixup): Likewise.
  (ipa_opt_pass_d::function_transform_todo_flags_start): Drop;
  this is now part of ipa_pass_data.
  (ipa_opt_pass_d::function_transform): Convert to virtual function.
  (ipa_opt_pass_d::variable_transform): Likewise.
  (ipa_opt_pass_d::ipa_data_): New.
  (ipa_opt_pass_d::ipa_opt_pass_d): Drop function pointer fields in
  favor of virtual functions and an ipa_pass_data.
  ---
   gcc/passes.c| 70 
  ++---
   gcc/tree-pass.h | 54 ++--
   2 files changed, 89 insertions(+), 35 deletions(-)
 
  diff --git a/gcc/passes.c b/gcc/passes.c
  index 1b2202e..625f1d3 100644
  --- a/gcc/passes.c
  +++ b/gcc/passes.c
  @@ -117,7 +117,59 @@ opt_pass::opt_pass (const pass_data data, context 
  *ctxt)
   {
   }
 
  +/* Default implementations of the various ipa_opt_pass_d hooks,
  +   provided to detect inconsistent pass metadata.
 
  +   These implementations should never be called: any pass that sets a
  +   has_ flag it its ipa_pass_data should provide a function overriding
  +   the corresponding default implementation.  */
  +void
  +ipa_opt_pass_d::generate_summary (void)
  +{
  +  internal_error (base generate_summary called for %s, name);
  +}
  +
  +void
  +ipa_opt_pass_d::write_summary (void)
  +{
  +  internal_error (base write_summary called for %s, name);
  +}
  +
  +void
  +ipa_opt_pass_d::read_summary (void)
  +{
  +  internal_error (base read_summary called for %s, name);
  +}
  +
  +void
  +ipa_opt_pass_d::write_optimization_summary (void)
  +{
  +  internal_error (base write_optimization_summary called for %s, name);
  +}
  +
  +void
  +ipa_opt_pass_d::read_optimization_summary (void)
  +{
  +  internal_error (base read_optimization_summary called for %s, name);
  +}
  +
  +void
  +ipa_opt_pass_d::stmt_fixup (struct cgraph_node *, gimple *)
  +{
  +  internal_error (base stmt_fixup called for %s, name);
  +}
  +
  +unsigned int
  +ipa_opt_pass_d::function_transform (struct cgraph_node *)
  +{
  +  

[PATCH] Use get_nonzero_bits to improve vectorization

2013-10-25 Thread Jakub Jelinek
Hi!

The following patch makes use of the computed nonzero_bits preserved
in the SSA_NAME_RANGE_INFO.
I chose to write a new routine instead of improving current
highest_pow2_factor, because that routine didn't care about overflows etc.
and by working on ctz numbers instead of powers of two in UHWI we can handle
even larger constants etc.  highest_pow2_factor could very well overflow to
zero etc.  So, the patch introduces a new tree_ctz routine and reimplements
highest_pow2_factor on top of that, plus uses tree_ctz also in
get_object_alignment_2 and in the vectorizer to determine if it can avoid
scalar loop for bound (and indirectly also in the analysis whether peeling
for alignment is needed).

With this patch, e.g.
int a[1024];

void
foo (int x, int y)
{
  int i;
  x = -32;
  y = -32;
  for (i = x + 32; i  y; i++)
a[i]++;
}
can be vectorized without any peeling for alignment or scalar loop
afterwards.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2013-10-25  Jakub Jelinek  ja...@redhat.com

* tree.c (tree_ctz): New function.
* tree.h (tree_ctz): New prototype.
* tree-ssanames.h (get_range_info, get_nonzero_bits): Change
first argument from tree to const_tree.
* tree-ssanames.c (get_range_info, get_nonzero_bits): Likewise.
* tree-vectorizer.h (vect_generate_tmps_on_preheader): New prototype.
* tree-vect-loop-manip.c (vect_generate_tmps_on_preheader): No longer
static.
* expr.c (highest_pow2_factor): Reimplemented using tree_ctz.
* tree-vect-loop.c (vect_analyze_loop_operations,
vect_transform_loop): Don't force scalar loop for bound just because
number of iterations is unknown, only do it if it is not known to be
a multiple of vectorization_factor.
* builtins.c (get_object_alignment_2): Use tree_ctz on offset.

--- gcc/tree.c.jj   2013-10-23 14:43:12.0 +0200
+++ gcc/tree.c  2013-10-25 15:00:55.296178794 +0200
@@ -2213,6 +2213,110 @@ tree_floor_log2 (const_tree expr)
  : floor_log2 (low));
 }
 
+/* Return number of known trailing zero bits in EXPR, or, if the value of
+   EXPR is known to be zero, the precision of it's type.  */
+
+int
+tree_ctz (const_tree expr)
+{
+  if (!INTEGRAL_TYPE_P (TREE_TYPE (expr))
+   !POINTER_TYPE_P (TREE_TYPE (expr)))
+return 0;
+
+  int ret1, ret2, prec = TYPE_PRECISION (TREE_TYPE (expr));
+  switch (TREE_CODE (expr))
+{
+case INTEGER_CST:
+  ret1 = tree_to_double_int (expr).trailing_zeros ();
+  return MIN (ret1, prec);
+case SSA_NAME:
+  ret1 = get_nonzero_bits (expr).trailing_zeros ();
+  return MIN (ret1, prec);
+case PLUS_EXPR:
+case MINUS_EXPR:
+case BIT_IOR_EXPR:
+case BIT_XOR_EXPR:
+case MIN_EXPR:
+case MAX_EXPR:
+  ret1 = tree_ctz (TREE_OPERAND (expr, 0));
+  ret2 = tree_ctz (TREE_OPERAND (expr, 1));
+  return MIN (ret1, ret2);
+case POINTER_PLUS_EXPR:
+  ret1 = tree_ctz (TREE_OPERAND (expr, 0));
+  ret2 = tree_ctz (TREE_OPERAND (expr, 1));
+  ret2 = MIN (ret2, prec);
+  return MIN (ret1, ret2);
+case BIT_AND_EXPR:
+  ret1 = tree_ctz (TREE_OPERAND (expr, 0));
+  ret2 = tree_ctz (TREE_OPERAND (expr, 1));
+  return MAX (ret1, ret2);
+case MULT_EXPR:
+  ret1 = tree_ctz (TREE_OPERAND (expr, 0));
+  ret2 = tree_ctz (TREE_OPERAND (expr, 1));
+  return MIN (ret1 + ret2, prec);
+case LSHIFT_EXPR:
+  ret1 = tree_ctz (TREE_OPERAND (expr, 0));
+  if (host_integerp (TREE_OPERAND (expr, 1), 1)
+  ((unsigned HOST_WIDE_INT) tree_low_cst (TREE_OPERAND (expr, 1), 1)
+  (unsigned HOST_WIDE_INT) prec))
+   {
+ ret2 = tree_low_cst (TREE_OPERAND (expr, 1), 1);
+ return MIN (ret1 + ret2, prec);
+   }
+  return ret1;
+case RSHIFT_EXPR:
+  if (host_integerp (TREE_OPERAND (expr, 1), 1)
+  ((unsigned HOST_WIDE_INT) tree_low_cst (TREE_OPERAND (expr, 1), 1)
+  (unsigned HOST_WIDE_INT) prec))
+   {
+ ret1 = tree_ctz (TREE_OPERAND (expr, 0));
+ ret2 = tree_low_cst (TREE_OPERAND (expr, 1), 1);
+ if (ret1  ret2)
+   return ret1 - ret2;
+   }
+  return 0;
+case TRUNC_DIV_EXPR:
+case CEIL_DIV_EXPR:
+case FLOOR_DIV_EXPR:
+case ROUND_DIV_EXPR:
+case EXACT_DIV_EXPR:
+  if (TREE_CODE (TREE_OPERAND (expr, 1)) == INTEGER_CST)
+   {
+ ret2 = tree_log2 (TREE_OPERAND (expr, 1));
+ if (ret2 = 0  tree_int_cst_sgn (TREE_OPERAND (expr, 1)) == 1)
+   {
+ ret1 = tree_ctz (TREE_OPERAND (expr, 0));
+ if (ret1  ret2)
+   return ret1 - ret2;
+   }
+   }
+  return 0;
+CASE_CONVERT:
+  ret1 = tree_ctz (TREE_OPERAND (expr, 0));
+  if (ret1  ret1 == TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (expr, 0
+   ret1 = prec;
+  return MIN (ret1, prec);
+case SAVE_EXPR:
+  return tree_ctz 

Re: C++ PATCH to deal with trivial but non-callable [cd]tors

2013-10-25 Thread Eric Botcazou
 Late in the C++11 process it was decided that a constructor or
 destructor can be trivial but not callable; as a result, everywhere that
 assumed that a call to a trivial function didn't need any processing
 needed to be updated.  This patch does that.

This has introduced a problem for the -fdump-ada-spec machinery, which boils 
down to the TYPE_METHODS field of the following structure:

 struct _outer {
   struct _inner {
 int x;
   } inner;
 } outer;

Previously it was empty, now it contains the following destructor:

 function_decl 0x76c6e300 _outer
type method_type 0x76c6d2a0
type void_type 0x76b0ebd0 void VOID
align 8 symtab 0 alias set -1 canonical type 0x76b0ebd0
pointer_to_this pointer_type 0x76b0ec78
QI
size integer_cst 0x76b04280 constant 8
unit size integer_cst 0x76b042a0 constant 1
align 8 symtab 0 alias set -1 canonical type 0x76c6d1f8 method 
basetype record_type 0x76c5c930 _outer
arg-types tree_list 0x76c52fa0 value pointer_type 
0x76c5cc78
chain tree_list 0x76c52f78 value integer_type 0x76b0e5e8 
int
chain tree_list 0x76b00b18 value void_type 
0x76b0ebd0 void
throws tree_list 0x76c527a8
public abstract external autoinline decl_3 QI file t5.h line 1 col 9 align 
16 context record_type 0x76c5c930 _outer
arguments parm_decl 0x76c6c480 this
[...]
   full-name _outer::~_outer() throw ()
not-really-extern chain function_decl 0x76c6e500 __base_dtor 


The destructor is created as a side effect of the call to type_build_dtor_call 
added to cxx_maybe_build_cleanup by the patch:

@@ -14296,7 +14300,7 @@ cxx_maybe_build_cleanup (tree decl, tsub
 }
   /* Handle ordinary C++ destructors.  */
   type = TREE_TYPE (decl);
-  if (TYPE_HAS_NONTRIVIAL_DESTRUCTOR (type))
+  if (type_build_dtor_call (type))
 {
   int flags = LOOKUP_NORMAL|LOOKUP_DESTRUCTOR;
   bool has_vbases = (TREE_CODE (type) == RECORD_TYPE

#0  implicitly_declare_fn (kind=sfk_destructor, type=0x76c5c930, 
const_p=false, inherited_ctor=0x0, inherited_parms=0x0)
at /home/eric/gnat/gnat-head/src/gcc/cp/method.c:1551
#1  0x00758ef5 in lazily_declare_fn (sfk=sfk_destructor, 
type=0x76c5c930) at /home/eric/gnat/gnat-head/src/gcc/cp/method.c:1950
#2  0x0075ee64 in lookup_fnfields_1 (type=0x76c5c930, 
name=0x76b0aec8) at /home/eric/gnat/gnat-head/src/gcc/cp/search.c:1471
#3  0x0075eea5 in lookup_fnfields_slot (type=0x76c5c930, 
name=0x76b0aec8) at /home/eric/gnat/gnat-head/src/gcc/cp/search.c:1483
#4  0x00687dc7 in type_build_dtor_call (t=0x76c5c930)
at /home/eric/gnat/gnat-head/src/gcc/cp/class.c:5193
#5  0x005ef967 in cxx_maybe_build_cleanup (decl=0x76c64390, 
complain=3) at /home/eric/gnat/gnat-head/src/gcc/cp/decl.c:14303
#6  0x005c99aa in expand_static_init (decl=0x76c64390, init=0x0)
at /home/eric/gnat/gnat-head/src/gcc/cp/decl.c:6902
#7  0x005c8d6c in cp_finish_decl (decl=0x76c64390, init=0x0, 
init_const_expr_p=false, asmspec_tree=0x0, flags=1)
at /home/eric/gnat/gnat-head/src/gcc/cp/decl.c:6504
#8  0x006e22b5 in cp_parser_init_declarator (parser=0x76c5b068, 
decl_specifiers=0x7fffda00, checks=0x0, 
function_definition_allowed_p=true, member_p=false, 
declares_class_or_enum=2, function_definition_p=0x7fffda8b, 
maybe_range_for_decl=0x0)
at /home/eric/gnat/gnat-head/src/gcc/cp/parser.c:16640


Is that expected and, consequently, should we adjust the machinery?

-- 
Eric Botcazou


Re: [PATCH, MPX, 2/X] Pointers Checker [2/25] Builtins

2013-10-25 Thread Ilya Enkovich
2013/10/25 Jeff Law l...@redhat.com:
 On 10/21/13 05:49, Ilya Enkovich wrote:

 Hi,

 This patch introduces built-in functions used by Pointers Checker and flag
 to enable Pointers Checker. Builtins available for user are expanded in
 expand_builtin. All other builtins are not allowed in expand until generic
 version of Pointers Cheker is implemented.

 Bootstrapped and tested on linux-x86_64.

 Thanks,
 Ilya
 --

 gcc/

 2013-10-04  Ilya Enkovich  ilya.enkov...@intel.com

 * builtin-types.def (BT_FN_VOID_CONST_PTR): New.
 (BT_FN_PTR_CONST_PTR): New.
 (BT_FN_CONST_PTR_CONST_PTR): New.
 (BT_FN_PTR_CONST_PTR_SIZE): New.
 (BT_FN_PTR_CONST_PTR_CONST_PTR): New.
 (BT_FN_VOID_PTRPTR_CONST_PTR): New.
 (BT_FN_VOID_CONST_PTR_SIZE): New.
 (BT_FN_PTR_CONST_PTR_CONST_PTR_SIZE): New.
 * chkp-builtins.def: New.
 * builtins.def: include chkp-builtins.def.
 (DEF_CHKP_BUILTIN): New.
 * builtins.c (expand_builtin): Support
 BUILT_IN_CHKP_INIT_PTR_BOUNDS,
 BUILT_IN_CHKP_NULL_PTR_BOUNDS, BUILT_IN_CHKP_COPY_PTR_BOUNDS,
 BUILT_IN_CHKP_CHECK_PTR_LBOUNDS, BUILT_IN_CHKP_CHECK_PTR_UBOUNDS,
 BUILT_IN_CHKP_CHECK_PTR_BOUNDS, BUILT_IN_CHKP_SET_PTR_BOUNDS,
 BUILT_IN_CHKP_NARROW_PTR_BOUNDS, BUILT_IN_CHKP_STORE_PTR_BOUNDS,
 BUILT_IN_CHKP_GET_PTR_LBOUND, BUILT_IN_CHKP_GET_PTR_UBOUND,
 BUILT_IN_CHKP_BNDMK, BUILT_IN_CHKP_BNDSTX, BUILT_IN_CHKP_BNDCL,
 BUILT_IN_CHKP_BNDCU, BUILT_IN_CHKP_BNDLDX, BUILT_IN_CHKP_BNDRET,
 BUILT_IN_CHKP_INTERSECT, BUILT_IN_CHKP_ARG_BND,
 BUILT_IN_CHKP_NARROW,
 BUILT_IN_CHKP_EXTRACT_LOWER, BUILT_IN_CHKP_EXTRACT_UPPER.
 * common.opt (fcheck-pointers): New.
 * toplev.c (process_options): Check Pointers Checker is supported.
 * doc/extend.texi: Document Pointers Checker built-in functions.

 Just a few minor comments.


 --- a/gcc/common.opt
 +++ b/gcc/common.opt
 @@ -874,6 +874,11 @@ fbounds-check
   Common Report Var(flag_bounds_check)
   Generate code to check bounds before indexing arrays

 +fcheck-pointers
 +Common Report Var(flag_check_pointers)
 +Add pointers checker instrumentation.  fchkp-* flags are used to
 +control instrumentation.  Currently available for C, C++ and ObjC.
 +

 I'd probably use pointer bounds checking rather than pointers checker.
 It's a nit, but most folks have heard the term pointer bounds checking,
 but few probabaly use pointers checker.

 I think you make several references to pointers checker that are probably
 best reworded slightly to use pointer bounds checker



 diff --git a/gcc/toplev.c b/gcc/toplev.c
 index feba051..285b36d 100644
 --- a/gcc/toplev.c
 +++ b/gcc/toplev.c
 @@ -1290,6 +1290,18 @@ process_options (void)
 if (flag_mudflap  flag_lto)
   sorry (mudflap cannot be used together with link-time
 optimization);

 +  if (flag_check_pointers)
 +{
 +  if (flag_lto)
 +   sorry (Pointers checker is not yet fully supported for link-time
 optimization);

 What was the final resolution of this?  Like jsm, this seems to me to be
 papering over a problem elsewhere.

 I'll pre-approve this patch with the terminology change and the flag_lto
 hack removed.

 jeff


There are currently two known issues with LTO. The first one is ICE in
LTO streamer when it reads instrumented code. The second one is
unitialized flag_check_pointers when code is compiled by lto1 (do you
know why it may happen BTW?). It also causes ICE beacause instrumented
code is met when not expected. Of course, I'll fix these problems
anyway, but I was going to allow '-fcheck-pointers -flto' only when
checker testsuite has 100% pass rate with lto.

Ilya


Re: [PATCH][ubsan] Add VLA bound instrumentation

2013-10-25 Thread Jason Merrill

On 10/25/2013 12:58 PM, Marek Polacek wrote:

I've tried to implement the instrumentation in cp_finish_decl.
However, the problem is with multidimensional arrays, e.g. for

int x = -1;
int a[1][x];

array_of_runtime_bound_p returns false, thus we don't instrument this
at all, nor throw an exception in c++1y mode...


Because the above is not valid under the proposed standard C++ VLA 
support; only the leftmost bound can be variable.


I think the right place to handle both ubsan and c++1y VLA checks is in 
compute_array_index_type, in the block where we're calling variable_size.


Jason



Re: C++ PATCH to deal with trivial but non-callable [cd]tors

2013-10-25 Thread Jason Merrill

On 10/25/2013 01:53 PM, Eric Botcazou wrote:

This has introduced a problem for the -fdump-ada-spec machinery, which boils
down to the TYPE_METHODS field of the following structure:

  struct _outer {
struct _inner {
  int x;
} inner;
  } outer;

Previously it was empty, now it contains the following destructor:
[snip]
Is that expected and, consequently, should we adjust the machinery?


In C++ all classes have destructors, but we try to defer building the 
implicit declaration.  My patch causes us to build those implicit 
declarations more often, which is probably a bit of a memory regression, 
but it would be good for your code to handle the dtor being declared.


Jason



Re: [PATCH][ubsan] Add VLA bound instrumentation

2013-10-25 Thread Marek Polacek
On Fri, Oct 25, 2013 at 02:17:48PM -0400, Jason Merrill wrote:
 On 10/25/2013 12:58 PM, Marek Polacek wrote:
 I've tried to implement the instrumentation in cp_finish_decl.
 However, the problem is with multidimensional arrays, e.g. for
 
 int x = -1;
 int a[1][x];
 
 array_of_runtime_bound_p returns false, thus we don't instrument this
 at all, nor throw an exception in c++1y mode...
 
 Because the above is not valid under the proposed standard C++ VLA
 support; only the leftmost bound can be variable.

I see.
 
 I think the right place to handle both ubsan and c++1y VLA checks is
 in compute_array_index_type, in the block where we're calling
 variable_size.

I'm sorry, you want me to move the c++1y VLA check into
compute_array_index_type, or just do the ubsan instrumentation in
there?  Thanks,

Marek


Re: [PATCH][ubsan] Add VLA bound instrumentation

2013-10-25 Thread Jason Merrill

On 10/25/2013 03:03 PM, Marek Polacek wrote:

On Fri, Oct 25, 2013 at 02:17:48PM -0400, Jason Merrill wrote:

I think the right place to handle both ubsan and c++1y VLA checks is
in compute_array_index_type, in the block where we're calling
variable_size.


I'm sorry, you want me to move the c++1y VLA check into
compute_array_index_type, or just do the ubsan instrumentation in
there?  Thanks,


Both.

Jason



Re: question about register pairs

2013-10-25 Thread DJ Delorie

  Some notes: I lie to gcc and tell it that $fp (reg 22) is two bytes
  when it's really one.
 
 Well, it's not really a lie if you map hardware registers 22 and 23 to
 a single register for the purposes of gcc internals.

Yeah, I'm basically making those two registers into a permanent bigger
register.

 Although it does make some other things more awkward, e.g. when you
 copy fp, and this gets split so you have an insn that copies the
 highpart of fp to another register.

Not a problem, the chip can copy the whole $fp in one insn.  Registers
are only 8 bits on this chip.

 Seeing the patched code in its entirety like this, I notice that we
 would use HARD_REGNO_NREGS for a regno that's not ok for the mode.

The core problem is that gcc has no way of dealing with register pairs
as real registers - the second halves are still registers, still need
to be in the reg class, still need _NREGS() values, etc.  There's no
way to tell gcc that a register is a set of (for example) N
registers starting at 3, 5, 9, 12, or 14.

 That can be avoided if we put a break into the if.  And then the
 !bad term in the loop condition becomes redundant.

The problem I had to solve was to re-synchronize the counter with the
even-numbered register.  The naive patch counted even registers up to
$fp, then checked odd (invalid) registers after that.

The only other way I can think to solve this problem is to add a
VALID_STARTING_REG(reg,mode) macro, but that would require changes all
over the core code.


Re: [Patch, C++] Add C++ FE support for #pragma GCC ivdep

2013-10-25 Thread Tobias Burnus

Jakub Jelinek wrote:

FYI, I'm seeing various new FAILs on i686-linux:
+FAIL: gcc.dg/vect/vect-ivdep-1.c  (test for warnings, line )
+FAIL: gcc.dg/vect/vect-ivdep-1.c -flto  (test for warnings, line )
+FAIL: gfortran.dg/vect/vect-do-concurrent-1.f90  -O   (test for warnings, line 
)
+FAIL: g++.dg/vect/pr33426-ivdep.cc -std=gnu++98  (test for warnings, line )
+FAIL: g++.dg/vect/pr33426-ivdep.cc -std=gnu++11  (test for warnings, line )


+/* { dg-options -O3 -fopt-info-vec-optimized } */

The problem is likely the same in all cases and same as I've mailed about
recently, dg-options should never be used in *.dg/vect/* testcases,
instead just use dg-additional-options, because otherwise the
required target options to enable vectorization aren't passed in.


I have now changed it to dg-additional-options. Additionally, I added a 
space to the regexp to avoid the issues mentioned at 
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02165.html


The current version successfully regtested with --target_board 
'unix{-m32,}' on my x86_64-linux – and I also confirmed that the 
dg-bogus still fails when commenting the pragma.


Committed as Rev. 204074.

Tobias
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog	(Revision 204073)
+++ gcc/testsuite/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,9 @@
+2013-10-25  Tobias Burnus  bur...@net-b.de
+
+	* g++.dg/vect/pr33426-ivdep.cc: Use dg-options.
+	* gfortran.dg/vect/vect-do-concurrent-1.f90: Ditto.
+	* testsuite/gcc.dg/vect/vect-ivdep-1.c: Ditto.
+
 2013-10-25  Yufeng Zhang  yufeng.zh...@arm.com
 
 	* gcc.dg/wmul-1.c: New test.
Index: gcc/testsuite/g++.dg/vect/pr33426-ivdep.cc
===
--- gcc/testsuite/g++.dg/vect/pr33426-ivdep.cc	(Revision 204073)
+++ gcc/testsuite/g++.dg/vect/pr33426-ivdep.cc	(Arbeitskopie)
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target vect_float } */
-/* { dg-options -O3 -fopt-info-vec-optimized } */
+/* { dg-additional-options -O3 -fopt-info-vec-optimized } */
 
 /* PR other/33426 */
 /* Testing whether #pragma ivdep is working.  */
@@ -14,6 +14,6 @@ void foo(int n, int *a, int *b, int *c, int *d, in
 }
 
 /* { dg-message loop vectorized  { target *-*-* } 0 } */
-/* { dg-bogus version  { target *-*-* } 0 } */
-/* { dg-bogus alias  { target *-*-* } 0 } */
+/* { dg-bogus  version  { target *-*-* } 0 } */
+/* { dg-bogus  alias  { target *-*-* } 0 } */
 /* { dg-final { cleanup-tree-dump vect } } */
Index: gcc/testsuite/gcc.dg/vect/vect-ivdep-1.c
===
--- gcc/testsuite/gcc.dg/vect/vect-ivdep-1.c	(Revision 204073)
+++ gcc/testsuite/gcc.dg/vect/vect-ivdep-1.c	(Arbeitskopie)
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target vect_float } */
-/* { dg-options -O3 -fopt-info-vec-optimized } */
+/* { dg-additional-options -O3 -fopt-info-vec-optimized } */
 
 /* PR other/33426 */
 /* Testing whether #pragma ivdep is working.  */
@@ -14,6 +14,6 @@ void foo(int n, int *a, int *b, int *c, int *d, in
 }
 
 /* { dg-message loop vectorized  { target *-*-* } 0 } */
-/* { dg-bogus version  { target *-*-* } 0 } */
-/* { dg-bogus alias  { target *-*-* } 0 } */
+/* { dg-bogus  version  { target *-*-* } 0 } */
+/* { dg-bogus  alias  { target *-*-* } 0 } */
 /* { dg-final { cleanup-tree-dump vect } } */
Index: gcc/testsuite/gfortran.dg/vect/vect-do-concurrent-1.f90
===
--- gcc/testsuite/gfortran.dg/vect/vect-do-concurrent-1.f90	(Revision 204073)
+++ gcc/testsuite/gfortran.dg/vect/vect-do-concurrent-1.f90	(Arbeitskopie)
@@ -1,6 +1,6 @@
 ! { dg-do compile }
 ! { dg-require-effective-target vect_float }
-! { dg-options -O3 -fopt-info-vec-optimized }
+! { dg-additional-options -O3 -fopt-info-vec-optimized }
 
 subroutine test(n, a, b, c)
   integer, value :: n
@@ -12,6 +12,6 @@ subroutine test(n, a, b, c)
 end subroutine test
 
 ! { dg-message loop vectorized  { target *-*-* } 0 }
-! { dg-bogus version  { target *-*-* } 0 }
-! { dg-bogus alias  { target *-*-* } 0 }
+! { dg-bogus  version  { target *-*-* } 0 }
+! { dg-bogus  alias  { target *-*-* } 0 }
 ! { dg-final { cleanup-tree-dump vect } }


Re: [PATCH] rewrite stack vectors

2013-10-25 Thread Diego Novillo

On 2013-10-10 14:07 , tsaund...@mozilla.com wrote:

This makes the implementation of stack vectors simpler and easier to use.  This 

   works by 
making the size of the on stack storage a template argument, so the
size is embedded in the type.  This allows you to implicitly convert a
stack_vecT, N to a vecT, va_heap *, and it will just work.  Because there's
no need to support stack vectors in unions we can make them be a more normal
c++ class with a constructor and destructor that are nontrivial.



Thanks.  This looks much simpler, indeed.  The patch is fine to commit.  
Just a couple of observations/questions:




@@ -638,7 +638,7 @@ propagate_threaded_block_debug_into (basic_block dest, 
basic_block src)
i++;
  }
  
-  vectree, va_stack fewvars = vNULL;

+  stack_vectree, alloc_count fewvars;
pointer_set_t *vars = NULL;


Hm, what will happen now if alloc_count == 0?  If I'm following the 
logic, this is tied to the presence of the 'vars' local, so it seems 
that we are fine.  Otherwise, the quick_push operation will run into 
trouble.
  
/* If we're already starting with 3/4 of alloc_count, go for a

@@ -646,8 +646,6 @@ propagate_threaded_block_debug_into (basic_block dest, 
basic_block src)
   VEC.  */
if (i * 4  alloc_count * 3)
  vars = pointer_set_create ();
-  else if (alloc_count)
-vec_stack_alloc (tree, fewvars, alloc_count);
  
/* Now go through the initial debug stmts in DEST again, this time

   actually inserting in VARS or FEWVARS.  Don't bother checking for
diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
index 574446a..4a14607 100644
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -107,7 +107,7 @@ typedef struct
 with a PHI DEF that would soon become non-dominant, and when we got
 to the suitable one, it wouldn't have anything to substitute any
 more.  */
-static vecadjust_info, va_stack adjust_vec;
+static vecadjust_info, va_heap adjust_vec;


A file global was declared as a stack vector?  Sigh.



Diego.


Re: [Patch] Fix gcc.dg/20050922-*.c

2013-10-25 Thread Hans-Peter Nilsson
On Fri, 25 Oct 2013, Mike Stump wrote:
 On Oct 24, 2013, at 7:33 PM, Hans-Peter Nilsson h...@bitrange.com wrote:
  On Thu, 24 Oct 2013, Hans-Peter Nilsson wrote:
  I too would like to include this change on those branches, as
  recent generic newlib changes has caused these tests to break
  (regress) for my autotester for cris-elf testing those branches.
 
  Uhm, I'm on the fence, half-way wanting to retract my
  suggestion.  This seems a recent bug in newlib, in which e.g.
  #include stdlib.h causes uint32_t to be defined, bleeding from
  newlib-internal include of stdint.h.  On the other hand, testing
  that bleed isn't the purpose of these tests.

 Ah, that's what I was interested in, recent change in newlib;
 that makes it even more reasonable to me.  Standard headers
 are supposed to include all headers they need,

But only as allowed by the standard.
No gratuitous leaking of identifiers allowed.

 and if they
 need uint32_t (or any of the types the header that defines that type
 has in it) in an interface in any header that that header
 needs?
 then it isn't a gratuitous stupidity.

Yeah, but in this case it was.  None of the standard contents in
stdlib.h need uint32_t, at face value.  A look in the C99
standard make me say
http://sourceware.org/ml/newlib/2013/msg00803.html.

brgds, H-P


Re: [PATCH] rewrite stack vectors

2013-10-25 Thread Trevor Saunders
On Fri, Oct 25, 2013 at 03:30:47PM -0400, Diego Novillo wrote:
 On 2013-10-10 14:07 , tsaund...@mozilla.com wrote:
 This makes the implementation of stack vectors simpler and easier to use.  
 This 

 works by making the size of the on stack storage a template argument, so the
 size is embedded in the type.  This allows you to implicitly convert a
 stack_vecT, N to a vecT, va_heap *, and it will just work.  Because 
 there's
 no need to support stack vectors in unions we can make them be a more normal
 c++ class with a constructor and destructor that are nontrivial.
 
 
 Thanks.  This looks much simpler, indeed.  The patch is fine to
 commit.  Just a couple of observations/questions:

I don't have commit access, so can someone check it in for me?  I
bootstrapped and got no changes in regression tests two weeks ago, but
haven't checked it since if that helps.

 @@ -638,7 +638,7 @@ propagate_threaded_block_debug_into (basic_block dest, 
 basic_block src)
 i++;
   }
 -  vectree, va_stack fewvars = vNULL;
 +  stack_vectree, alloc_count fewvars;
 pointer_set_t *vars = NULL;
 
 Hm, what will happen now if alloc_count == 0?  If I'm following the
 logic, this is tied to the presence of the 'vars' local, so it seems
 that we are fine.  Otherwise, the quick_push operation will run into
 trouble.

The first quick_push is ok because if we were going to add more than
alloc_count vars we would have created the pointer set, and the second
is safe because we explicitly check the vectors length is less than
alloc_count.

 /* If we're already starting with 3/4 of alloc_count, go for a
 @@ -646,8 +646,6 @@ propagate_threaded_block_debug_into (basic_block dest, 
 basic_block src)
VEC.  */
 if (i * 4  alloc_count * 3)
   vars = pointer_set_create ();
 -  else if (alloc_count)
 -vec_stack_alloc (tree, fewvars, alloc_count);
 /* Now go through the initial debug stmts in DEST again, this time
actually inserting in VARS or FEWVARS.  Don't bother checking for
 diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
 index 574446a..4a14607 100644
 --- a/gcc/tree-vect-loop-manip.c
 +++ b/gcc/tree-vect-loop-manip.c
 @@ -107,7 +107,7 @@ typedef struct
  with a PHI DEF that would soon become non-dominant, and when we got
  to the suitable one, it wouldn't have anything to substitute any
  more.  */
 -static vecadjust_info, va_stack adjust_vec;
 +static vecadjust_info, va_heap adjust_vec;
 
 A file global was declared as a stack vector?  Sigh.

indeed, istr checking and it could be moved to the stack, but involved
more refactering than seemed to make sense here.

Trev

 
 
 
 Diego.


wwwdocs: Broken links due to the preprocess script (was: Re: [wwwdocs, patch] gcc-4.9/changes.html: Add quip about #pragma GCC ivdep and update Fortran section)

2013-10-25 Thread Tobias Burnus

Tobias Burnus wrote:
Thanks for looking at the patch. However, the patch has a link 
problem. The documentation is at

http://gcc.gnu.org/onlinedocs/gcc/Loop_002dSpecific-Pragmas.html

That's also the link I use in the changes.html file. However, some 
script changes the link to:

   http://gcc.gnu.org/onlinedocs/gcc/Loop-Specific-Pragmas.html
which won't work. Try yourself at http://gcc.gnu.org/gcc-4.9/changes.html


Actually, a similar issue was reported at 
http://gcc.gnu.org/ml/gcc-help/2013-10/msg00132.html


The reason for the broken links are the following lines in the 
/www/bin/preprocess script: 
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/wwwdocs/bin/preprocess.diff?r1=1.38r2=1.39f=h


Gerald, do you still know why you added it 9 years ago? The commit 
comment is Use sed to work around makeinfo 4.7 brokenness.


I think makeinfo is still broken, but those pages do not seem to go 
through the preprocess script, which means that only links to that page 
will change to a hyphen, breaking the links.


Do you think it would be sensible to remove those lines again - or, 
alternatively, to run a similar script (e.g. perl -i -e 's/_002d/-/g' 
`find onlinedocs -name \*.html`) on the onlinedocs/.


I think the impact of the the former on links is smaller. (One still 
needs to re-run the script on those files to restore the links.)


Tobias


[Patch, C, C++] Accept GCC ivdep for 'do' and 'while', and for C++11's range-based loops

2013-10-25 Thread Tobias Burnus

Tobias Burnus wrote:

Jason Merrill wrote:

I would think that a range-based loop over an array should vectorize
nicely:

[...]
Therefore, I will send a follow up patch.


Attached is that patch [C++].

[C/C++:] Additionally, as Jakub proposed and other compilers also 
support,  the patch accepts #pragma GCC ivdep for 'do' and 'while' loops.


I did an successful all-language bootstrap followed by successful 
regstesting on x86-64-gnu-linux.

OK for committal?

Tobias
2013-10-25  Tobias Burnus  bur...@net-b.de

gcc/c/
	PR other/33426
	* c-parser.c (c_parser_while_statement, c_parser_while_statement,
	c_parser_pragma): Add GCC ivdep support to 'do' and 'while'.
	(c_parser_statement_after_labels): Update calls.

gcc/cp/
	PR other/33426
	* parser.c (cp_parser_for, cp_parser_range_for, cp_convert_range_for,
	cp_parser_iteration_statement, cp_parser_pragma): Add GCC ivdep
	support to 'do' and 'while' and to range-based loops.
	* cp-tree.h (cp_convert_range_for): Update prototype.
	* pt.c (tsubst_expr): Update call.

gcc/testsuite/
	PR other/33426
	* gcc.dg/vect/vect-ivdep-2.c: New.
	* g++.dg/vect/pr33426-ivdep-2.cc: New.
	* g++.dg/vect/pr33426-ivdep-3.cc: New.

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 4f25078..9ccae3b 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -1157,8 +1157,8 @@ static void c_parser_statement (c_parser *);
 static void c_parser_statement_after_labels (c_parser *);
 static void c_parser_if_statement (c_parser *);
 static void c_parser_switch_statement (c_parser *);
-static void c_parser_while_statement (c_parser *);
-static void c_parser_do_statement (c_parser *);
+static void c_parser_while_statement (c_parser *, bool);
+static void c_parser_do_statement (c_parser *, bool);
 static void c_parser_for_statement (c_parser *, bool);
 static tree c_parser_asm_statement (c_parser *);
 static tree c_parser_asm_operands (c_parser *);
@@ -4579,10 +4579,10 @@ c_parser_statement_after_labels (c_parser *parser)
 	  c_parser_switch_statement (parser);
 	  break;
 	case RID_WHILE:
-	  c_parser_while_statement (parser);
+	  c_parser_while_statement (parser, false);
 	  break;
 	case RID_DO:
-	  c_parser_do_statement (parser);
+	  c_parser_do_statement (parser, false);
 	  break;
 	case RID_FOR:
 	  c_parser_for_statement (parser, false);
@@ -4912,7 +4912,7 @@ c_parser_switch_statement (c_parser *parser)
 */
 
 static void
-c_parser_while_statement (c_parser *parser)
+c_parser_while_statement (c_parser *parser, bool ivdep)
 {
   tree block, cond, body, save_break, save_cont;
   location_t loc;
@@ -4927,6 +4927,11 @@ c_parser_while_statement (c_parser *parser)
 		statement);
   cond = error_mark_node;
 }
+
+  if (ivdep  cond != error_mark_node)
+cond = build2 (ANNOTATE_EXPR, TREE_TYPE (cond), cond,
+		   build_int_cst (integer_type_node,
+		   annot_expr_ivdep_kind));
   save_break = c_break_label;
   c_break_label = NULL_TREE;
   save_cont = c_cont_label;
@@ -4945,7 +4950,7 @@ c_parser_while_statement (c_parser *parser)
 */
 
 static void
-c_parser_do_statement (c_parser *parser)
+c_parser_do_statement (c_parser *parser, bool ivdep)
 {
   tree block, cond, body, save_break, save_cont, new_break, new_cont;
   location_t loc;
@@ -4974,7 +4979,10 @@ c_parser_do_statement (c_parser *parser)
 		do-while statement);
   cond = error_mark_node;
 }
-
+  if (ivdep  cond != error_mark_node)
+cond = build2 (ANNOTATE_EXPR, TREE_TYPE (cond), cond,
+		   build_int_cst (integer_type_node,
+		   annot_expr_ivdep_kind));
   if (!c_parser_require (parser, CPP_SEMICOLON, expected %;%))
 c_parser_skip_to_end_of_block_or_statement (parser);
   c_finish_loop (loc, cond, NULL, body, new_break, new_cont, false);
@@ -9102,12 +9110,19 @@ c_parser_pragma (c_parser *parser, enum pragma_context context)
 case PRAGMA_IVDEP:
   c_parser_consume_pragma (parser);
   c_parser_skip_to_pragma_eol (parser);
-  if (!c_parser_next_token_is_keyword (parser, RID_FOR))
+  if (!c_parser_next_token_is_keyword (parser, RID_FOR)
+	   !c_parser_next_token_is_keyword (parser, RID_WHILE)
+	   !c_parser_next_token_is_keyword (parser, RID_DO))
 	{
-	  c_parser_error (parser, for statement expected);
+	  c_parser_error (parser, for, while or do statement expected);
 	  return false;
 	}
-  c_parser_for_statement (parser, true);
+  if (c_parser_next_token_is_keyword (parser, RID_FOR))
+	c_parser_for_statement (parser, true);
+  else if (c_parser_next_token_is_keyword (parser, RID_WHILE))
+	c_parser_while_statement (parser, true);
+  else
+	c_parser_do_statement (parser, true);
   return false;
 
 case PRAGMA_GCC_PCH_PREPROCESS:
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 507b389..692d3cc 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -4321,7 +4321,7 @@ extern int comparing_specializations;
sizeof can be nested.  */
 
 extern int cp_unevaluated_operand;
-extern tree cp_convert_range_for (tree, tree, tree);
+extern tree cp_convert_range_for 

patch adding LRA usage for ppc

2013-10-25 Thread Vladimir Makarov
  I've committed the following patch to use LRA for ppc.  By default 
reload pass is used.  To use LRA, please add option -mlra.


  I also removed change in constraints for swap insn as it was 
requested by David Edelsohn.  It was necessary to fix one failure of new 
gcc tests.  I'll work on more safe fix for this later.


  The patch also fixes LRA crash on 32-bit SPEC2006 gamess reported by 
Mike Meissner.


  There is still LRA crash on SPEC2006 dealII in 32-bit mode with 
options used by Mike.  It is an issue with address constraints for 
TImode.  I'll work on this.  I guess address legitimize hook will solve 
the problem.  Currently LRA does not use this hook.  This hook is also 
needed to fix a LRA failures on ARM.


  Some tests for SDmode can fail for LRA as it generates a different 
code than reload.  If all toolchain is built with LRA, the failures will 
be gone.


  The patch was bootstrapped on ppc64 with LRA on/off.

  Committed as rev. 204079.

2013-10-25  Vladimir Makarov vmaka...@redhat.com

* config/rs6000/rs6000-protos.h
(rs6000_secondary_memory_needed_mode): New prototype.
* config/rs6000/rs6000.c: Include ira.h.
(TARGET_LRA_P): Redefine.
(rs6000_legitimate_offset_address_p): Call
legitimate_constant_pool_address_p in strict mode for LRA.
(rs6000_legitimate_address_p): Ditto.
(legitimate_lo_sum_address_p): Add code for LRA.
Use lra_in_progress.
(rs6000_emit_move): Add LRA version of code to generate load/store
of SDmode values.
(rs6000_secondary_memory_needed_mode): New.
(rs6000_alloc_sdmode_stack_slot): Do nothing for LRA.
(rs6000_secondary_reload_class): Return NO_REGS for LRA for
constants, memory, and FP registers.
(rs6000_lra_p): New.
* config/rs6000/rs6000.h (SECONDARY_MEMORY_NEEDED_MODE): New
macro.
* config/rs6000/rs6000.opt (mlra): New option.
* lra-spills.c (lra_final_code_change): Remove useless move insns.


Index: config/rs6000/rs6000-protos.h
===
--- config/rs6000/rs6000-protos.h   (revision 204075)
+++ config/rs6000/rs6000-protos.h   (working copy)
@@ -126,6 +126,8 @@ extern void rs6000_split_multireg_move (
 extern void rs6000_emit_le_vsx_move (rtx, rtx, enum machine_mode);
 extern void rs6000_emit_move (rtx, rtx, enum machine_mode);
 extern rtx rs6000_secondary_memory_needed_rtx (enum machine_mode);
+extern enum machine_mode rs6000_secondary_memory_needed_mode (enum
+ machine_mode);
 extern rtx (*rs6000_legitimize_reload_address_ptr) (rtx, enum machine_mode,
int, int, int, int *);
 extern bool rs6000_legitimate_offset_address_p (enum machine_mode, rtx,
Index: config/rs6000/rs6000.c
===
--- config/rs6000/rs6000.c  (revision 204075)
+++ config/rs6000/rs6000.c  (working copy)
@@ -55,6 +55,7 @@
 #include intl.h
 #include params.h
 #include tm-constrs.h
+#include ira.h
 #include opts.h
 #include tree-vectorizer.h
 #include dumpfile.h
@@ -1554,6 +1555,9 @@ static const struct attribute_spec rs600
 #undef TARGET_MODE_DEPENDENT_ADDRESS_P
 #define TARGET_MODE_DEPENDENT_ADDRESS_P rs6000_mode_dependent_address_p
 
+#undef TARGET_LRA_P
+#define TARGET_LRA_P rs6000_lra_p
+
 #undef TARGET_CAN_ELIMINATE
 #define TARGET_CAN_ELIMINATE rs6000_can_eliminate
 
@@ -6226,7 +6230,7 @@ rs6000_legitimate_offset_address_p (enum
 return false;
   if (!reg_offset_addressing_ok_p (mode))
 return virtual_stack_registers_memory_p (x);
-  if (legitimate_constant_pool_address_p (x, mode, strict))
+  if (legitimate_constant_pool_address_p (x, mode, strict || lra_in_progress))
 return true;
   if (GET_CODE (XEXP (x, 1)) != CONST_INT)
 return false;
@@ -6366,19 +6370,31 @@ legitimate_lo_sum_address_p (enum machin
 
   if (TARGET_ELF || TARGET_MACHO)
 {
+  bool large_toc_ok;
+
   if (DEFAULT_ABI != ABI_AIX  DEFAULT_ABI != ABI_DARWIN  flag_pic)
return false;
-  if (TARGET_TOC)
+  /* LRA don't use LEGITIMIZE_RELOAD_ADDRESS as it usually calls
+push_reload from reload pass code.  LEGITIMIZE_RELOAD_ADDRESS
+recognizes some LO_SUM addresses as valid although this
+function says opposite.  In most cases, LRA through different
+transformations can generate correct code for address reloads.
+It can not manage only some LO_SUM cases.  So we need to add
+code analogous to one in rs6000_legitimize_reload_address for
+LOW_SUM here saying that some addresses are still valid.  */
+  large_toc_ok = (lra_in_progress  TARGET_CMODEL != CMODEL_SMALL
+  small_toc_ref (x, VOIDmode));
+  if (TARGET_TOC  ! large_toc_ok)
return false;
   if (GET_MODE_NUNITS (mode) != 1)

Re: [PATCH] RFA: Remove mudflap

2013-10-25 Thread Jeff Law

On 10/25/13 10:26, Joseph S. Myers wrote:

On Fri, 25 Oct 2013, Richard Biener wrote:


Ok with preserving options as dummy, see examples like

fargument-alias
Common Ignore
Does nothing. Preserved for backward compatibility.

I don't think we should sorry (), but we can certainly warn that
mudflap was replaced by -fsanitize=address(?).  Implementing
the warning properly isn't a requirement for the patch to go in
(but the options should be still handled, but ignored).  We can
followup with a patch implementing the warning.


Note that warning is just a matter of putting

Warn(switch %qs is no longer supported)

alongside Ignore - see various options in c.opt for examples.

Thanks.  It turns out there's an example just after -fmudflapir :-)

jeff



patch to fix PR58759

2013-10-25 Thread Vladimir Makarov

  The following patch fixes

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58759

  The patch was successfully bootstrapped and tested on x86/x86-64.

  Committed as rev. 204080.

2013-10-25  Vladimir Makarov  vmaka...@redhat.com

PR rtl-optimization/58759
* lra-constraints.c (lra_constraints): Remove wrong condition to
remove insn setting up an equivalent pseudo.

2013-10-25  Vladimir Makarov  vmaka...@redhat.com

PR rtl-optimization/58759
* gcc.target/i386/pr58759.c: New.

Index: lra-constraints.c
===
--- lra-constraints.c   (revision 204069)
+++ lra-constraints.c   (working copy)
@@ -3975,18 +3975,6 @@ lra_constraints (bool first_p)
dest_reg = SUBREG_REG (dest_reg);
  if ((REG_P (dest_reg)
(x = get_equiv_substitution (dest_reg)) != dest_reg
-  /* Check that this is actually an insn setting up
- the equivalence.  */
-   (in_list_p (curr_insn,
- ira_reg_equiv
- [REGNO (dest_reg)].init_insns)
-  /* Init insns may contain not all insns setting
- up equivalence as we have live range
- splitting.  So here we use another condition
- to check insn setting up the equivalence
- which should be removed, e.g. in case when
- the equivalence is a constant.  */
-  || ! MEM_P (x))
   /* Remove insns which set up a pseudo whose value
  can not be changed.  Such insns might be not in
  init_insns because we don't update equiv data
@@ -3999,8 +3987,10 @@ lra_constraints (bool first_p)
  secondary memory movement.  Then the pseudo is
  spilled and we use the equiv constant.  In this
  case we should remove the additional insn and
- this insn is not init_insns list.  */
+ this insn is not init_insns list.  */
(! MEM_P (x) || MEM_READONLY_P (x)
+  /* Check that this is actually an insn setting
+ up the equivalence.  */
   || in_list_p (curr_insn,
 ira_reg_equiv
 [REGNO (dest_reg)].init_insns)))
Index: testsuite/gcc.target/i386/pr58759.c
===
--- testsuite/gcc.target/i386/pr58759.c (revision 0)
+++ testsuite/gcc.target/i386/pr58759.c (working copy)
@@ -0,0 +1,38 @@
+/* { dg-do run } */
+/* { dg-options -O2 } */
+
+
+int a, b, c, d, e, f, h, l, m, n, k, o;
+long long g;
+
+struct S
+{
+  int f1;
+  int f2;
+  int f3;
+  int f4;
+};
+
+static struct S i = {0,0,0,0}, j;
+
+void
+foo ()
+{
+  m = 1  d;
+  n = b + c;
+  o = k  1;
+  f = 0 == e;
+}
+
+int
+main ()
+{
+  for (; h  1; h++)
+{
+  g = 1 | (0  1 - a ? 0 : a);
+  foo ();
+  for (l = 0; l  3; l++)
+  j = i;
+}
+  return 0;
+}


[PATCH] make gengtype more robust against user error

2013-10-25 Thread David Malcolm
This patch addresses various forms of failure described in
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01974.html

It adds a default: gcc_unreachable(); to the autogenerated switch()
statement in the routines for a base class, converting various kinds of
tag errors from leading to silent lack-of-field traversal into giving
run-time assertion failures (addressing (F) and (G))

It also issues an error within gengtype itself for a missing desc
(failure B), turning this into an error message from gengtype.

I found another potential failure mode:

(H) If you had:

class GTY((desc(%0.kind), tag(0))) foo
{
public:
  int kind;
  tree p;
};

class GTY((tag(1))) bar : public foo
{
public:
  tree q;
};

static GTY(()) foo *dummy_foo;

and there are no explicit pointers to bar in the code, gengtype
treated it as unused, and silently omitted the case for it from
foo's marking routine.

I've updated set_gc_type_used so that it propagates usage down into
subclasses (which recurses), fixing this issue.

To do this efficiently we need to track subclasses in within gengtype,
so the patch also adds that, and uses it to eliminate an O(N^2).

Note that for error (G), if a class within the hierarchy omits a GTY
marker, gengtype doesn't parse it at all, and so doesn't parse the
inheritance information - so we can't issue a warning about this.
However, the lack of a tag will trigger a run-time assertion failure
due to hitting the default:  gcc_unreachable(); in the switch.
The patch also adds a paragraph to the docs, spelling out the need
for evary class in such a hierarchy to have a GTY marker.

I believe this addresses all of the silent-lack-of-field-traversal
issues, converting them to gengtype errors or runtime assertions.
It also adds a handler for (E), turning this from a failure to
compile bogus C to a specific error in gengtype.

I'm bootstrapping/regtesting now.
OK for trunk if that passes?

gcc/
* doc/gty.texi (Inheritance and GTY): Make it clear that
to use autogenerated markers for a class-hierarchy, every class
must have a GTY marker.
* gengtype.h (struct type): Add linked list of subclasses to
the s member of the union.
(add_subclass): New decl.
* gengtype-state.c (read_state_struct_type): Set up subclass
linked list.
* gengtype.c (get_ultimate_base_class): New.
(add_subclass): New.
(new_structure): Set up subclass linked list.
(set_gc_used_type): Propagate usage information to subclasses.
(output_mangled_typename): Use get_ultimate_base_class.
(walk_subclasses): Use the subclass linked list, avoiding an
O(N^2) when writing out all types.
(walk_type): Issue an error if the base class is missing a tag,
rather than generating bogus C code.  Add a gcc_unreachable
default case, in case people omit tags from concrete subclasses,
or get the values wrong.
(write_func_for_structure): Issue an error for subclasses for
which the base doesn't have a desc, since otherwise the
autogenerated routines for the base would silently fail to visit
any subclass fields.
(write_root): Use get_ultimate_base_class, tweaking constness of
tp to match that function's signature.
---
 gcc/doc/gty.texi |   7 +++
 gcc/gengtype-state.c |   2 +
 gcc/gengtype.c   | 119 ++-
 gcc/gengtype.h   |  10 +
 4 files changed, 109 insertions(+), 29 deletions(-)

diff --git a/gcc/doc/gty.texi b/gcc/doc/gty.texi
index 090f6a6..a64d110 100644
--- a/gcc/doc/gty.texi
+++ b/gcc/doc/gty.texi
@@ -497,6 +497,13 @@ The base class and its discriminator must be identified 
using the ``desc''
 option.  Each concrete subclass must use the ``tag'' option to identify
 which value of the discriminator it corresponds to.
 
+Every class in the hierarchy must have a @code{GTY(())} marker, as
+gengtype will only attempt to parse classes that have such a marker
+@footnote{Classes lacking such a marker will not be identified as being
+part of the hierarchy, and so the marking routines will not handle them,
+leading to a assertion failure within the marking routines due to an
+unknown tag value (assuming that assertions are enabled).}.
+
 @smallexample
 class GTY((desc(%h.kind), tag(0))) example_base
 @{
diff --git a/gcc/gengtype-state.c b/gcc/gengtype-state.c
index 1e9fade..fda473a 100644
--- a/gcc/gengtype-state.c
+++ b/gcc/gengtype-state.c
@@ -1615,6 +1615,8 @@ read_state_struct_type (type_p type)
   read_state_lang_bitmap ((type-u.s.bitmap));
   read_state_type ((type-u.s.lang_struct));
   read_state_type ((type-u.s.base_class));
+  if (type-u.s.base_class)
+   add_subclass (type-u.s.base_class, type);
 }
   else
 {
diff --git a/gcc/gengtype.c b/gcc/gengtype.c
index 31e0f99..f35952e 100644
--- a/gcc/gengtype.c
+++ b/gcc/gengtype.c
@@ -137,6 +137,16 @@ xasprintf (const char *format, ...)
   

Re: [Patch, C, C++] Accept GCC ivdep for 'do' and 'while', and for C++11's range-based loops

2013-10-25 Thread Joseph S. Myers
On Fri, 25 Oct 2013, Tobias Burnus wrote:

 Tobias Burnus wrote:
  Jason Merrill wrote:
   I would think that a range-based loop over an array should vectorize
   nicely:
  [...]
  Therefore, I will send a follow up patch.
 
 Attached is that patch [C++].
 
 [C/C++:] Additionally, as Jakub proposed and other compilers also support,
 the patch accepts #pragma GCC ivdep for 'do' and 'while' loops.

The C front-end changes are OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] Compute nonzero_bits from __builtin_unreachable assertions

2013-10-25 Thread Jakub Jelinek
Hi!

And here is a patch that allows vectorization without peeling for alignment
and scalar loop for bound even for fn2, fn3 and fn4 in the following
testcase, though as with the range __builtin_unreachable () notes, it is
quite fragile, because it only works if there are no immediate uses of the
tested SSA_NAME before the assertion.  Perhaps more reliable way would be to
convert those assertions info __builtin_assume_aligned, but that has the
disadvantage that it's first argument is a pointer and it returns a pointer,
so we'd need to cast integers to pointers and back, or add ASSUME_ALIGNED
internal function.

Bootstrapped/regtested on x86_64-linux and i686-linux.

int a[1024];

void
fn1 (int x, int y)
{
  int i;
  x = -32;
  y = -32;
  for (i = x + 32; i  y; i++)
a[i]++;
}

void
fn2 (int x, int y)
{
  int i;
  if (x  31)
__builtin_unreachable ();
  if (y  31)
__builtin_unreachable ();
  for (i = x + 32; i  x + y; i++)
a[i]++;
}

void
fn3 (int x, int y)
{
  int i;
  if (x % 32)
__builtin_unreachable ();
  if (y % 32)
__builtin_unreachable ();
  for (i = x + 32; i  x + y; i++)
a[i]++;
}

void
fn3 (int x, int y)
{
  int i;
  if ((x % 32) != 0)
__builtin_unreachable ();
  if ((y % 32) != 0)
__builtin_unreachable ();
  for (i = x + 32; i  x + y; i++)
a[i]++;
}

2013-10-25  Jakub Jelinek  ja...@redhat.com

* tree-vrp.c (maybe_set_nonzero_bits): New function.
(remove_range_assertions): Call it.

--- gcc/tree-vrp.c.jj   2013-10-24 14:32:29.0 +0200
+++ gcc/tree-vrp.c  2013-10-25 21:21:35.183092937 +0200
@@ -6459,6 +6459,60 @@ check_all_array_refs (void)
 }
 }
 
+/* Handle
+   _4 = x_3  31;
+   if (_4 != 0)
+ goto bb 6;
+   else
+ goto bb 7;
+   bb 6:
+   __builtin_unreachable ();
+   bb 7:
+   x_5 = ASSERT_EXPR x_3, ...;
+   If x_3 has no other immediate uses (checked by caller),
+   var is the x_3 var from ASSERT_EXPR, we can clear low 5 bits
+   from the non-zero bitmask.  */
+
+static void
+maybe_set_nonzero_bits (basic_block bb, tree var)
+{
+  edge e = single_pred_edge (bb);
+  basic_block cond_bb = e-src;
+  gimple stmt = last_stmt (cond_bb);
+  tree cst;
+
+  if (stmt == NULL
+  || gimple_code (stmt) != GIMPLE_COND
+  || gimple_cond_code (stmt) != ((e-flags  EDGE_TRUE_VALUE)
+? EQ_EXPR : NE_EXPR)
+  || TREE_CODE (gimple_cond_lhs (stmt)) != SSA_NAME
+  || !integer_zerop (gimple_cond_rhs (stmt)))
+return;
+
+  stmt = SSA_NAME_DEF_STMT (gimple_cond_lhs (stmt));
+  if (!is_gimple_assign (stmt)
+  || gimple_assign_rhs_code (stmt) != BIT_AND_EXPR
+  || TREE_CODE (gimple_assign_rhs2 (stmt)) != INTEGER_CST)
+return;
+  if (gimple_assign_rhs1 (stmt) != var)
+{
+  gimple stmt2;
+
+  if (TREE_CODE (gimple_assign_rhs1 (stmt)) != SSA_NAME)
+   return;
+  stmt2 = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (stmt));
+  if (!gimple_assign_cast_p (stmt2)
+ || gimple_assign_rhs1 (stmt2) != var
+ || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (stmt2))
+ || (TYPE_PRECISION (TREE_TYPE (gimple_assign_rhs1 (stmt)))
+ != TYPE_PRECISION (TREE_TYPE (var
+   return;
+}
+  cst = gimple_assign_rhs2 (stmt);
+  set_nonzero_bits (var, (get_nonzero_bits (var)
+  ~tree_to_double_int (cst)));
+}
+
 /* Convert range assertion expressions into the implied copies and
copy propagate away the copies.  Doing the trivial copy propagation
here avoids the need to run the full copy propagation pass after
@@ -6576,8 +6630,11 @@ remove_range_assertions (void)
}
}
if (ok)
- set_range_info (var, SSA_NAME_RANGE_INFO (lhs)-min,
- SSA_NAME_RANGE_INFO (lhs)-max);
+ {
+   set_range_info (var, SSA_NAME_RANGE_INFO (lhs)-min,
+   SSA_NAME_RANGE_INFO (lhs)-max);
+   maybe_set_nonzero_bits (bb, var);
+ }
  }
  }
 

Jakub


[google gcc-4_8] write ggc_memory to gcda file

2013-10-25 Thread Rong Xu
Hi,

This patch writes out ggc_memory to gcda files.

Google branches only. Test is ongoing. OK to check-in after test passes?

-Rong
2013-10-25  Rong Xu  x...@google.com

* contrib/profile_tool (ModuleInfo): write out ggc_memory to gcda file.
* gcc/gcov-io.c (gcov_read_module_info): Ditto.
* libgcc/dyn-ipa.c (gcov_write_module_info): Ditto.

Index: contrib/profile_tool
===
--- contrib/profile_tool(revision 204080)
+++ contrib/profile_tool(working copy)
@@ -674,6 +674,7 @@ class ModuleInfo(DataObject):
 self.module_id = reader.ReadWord()
 self.is_primary = reader.ReadWord()
 self.flags = reader.ReadWord()
+self.ggc_memory = reader.ReadWord()
 self.language = reader.ReadWord()
 self.num_quote_paths = reader.ReadWord()
 self.num_bracket_paths = reader.ReadWord()
@@ -710,6 +711,7 @@ class ModuleInfo(DataObject):
 writer.WriteWord(self.is_primary)
 writer.WriteWord(self.flags)
 writer.WriteWord(self.language)
+writer.WriteWord(self.ggc_memory)
 writer.WriteWord(self.num_quote_paths)
 writer.WriteWord(self.num_bracket_paths)
 writer.WriteWord(self.num_system_paths)
Index: gcc/gcov-io.c
===
--- gcc/gcov-io.c   (revision 204080)
+++ gcc/gcov-io.c   (working copy)
@@ -591,14 +591,15 @@ gcov_read_module_info (struct gcov_module_info *mo
   mod_info-ident = gcov_read_unsigned ();
   mod_info-is_primary = gcov_read_unsigned ();
   mod_info-flags = gcov_read_unsigned ();
-  mod_info-lang  = gcov_read_unsigned ();
+  mod_info-lang = gcov_read_unsigned ();
+  mod_info-ggc_memory = gcov_read_unsigned ();
   mod_info-num_quote_paths = gcov_read_unsigned ();
   mod_info-num_bracket_paths = gcov_read_unsigned ();
   mod_info-num_system_paths = gcov_read_unsigned ();
   mod_info-num_cpp_defines = gcov_read_unsigned ();
   mod_info-num_cpp_includes = gcov_read_unsigned ();
   mod_info-num_cl_args = gcov_read_unsigned ();
-  len -= 10;
+  len -= 11;
 
   filename_len = gcov_read_unsigned ();
   mod_info-da_filename = (char *) xmalloc (filename_len *
Index: libgcc/dyn-ipa.c
===
--- libgcc/dyn-ipa.c(revision 204080)
+++ libgcc/dyn-ipa.c(working copy)
@@ -2095,7 +2095,7 @@ gcov_write_module_info (const struct gcov_info *mo
   len += 1; /* Each string is lead by a length.  */
 }
 
-  len += 10; /* 9 more fields */
+  len += 11; /* 11 more fields */
 
   gcov_write_tag_length (GCOV_TAG_MODULE_INFO, len);
   gcov_write_unsigned (module_info-ident);
@@ -2104,6 +2104,7 @@ gcov_write_module_info (const struct gcov_info *mo
 SET_MODULE_INCLUDE_ALL_AUX (module_info);
   gcov_write_unsigned (module_info-flags);
   gcov_write_unsigned (module_info-lang);
+  gcov_write_unsigned (module_info-ggc_memory);
   gcov_write_unsigned (module_info-num_quote_paths);
   gcov_write_unsigned (module_info-num_bracket_paths);
   gcov_write_unsigned (module_info-num_system_paths);


[GOOGLE] Fix a bug when profile propagation handles infinite loop.

2013-10-25 Thread Dehao Chen
If the propagation finds an infinite look, if the in-edge count is
non-zero, then it will cause compiler go into infinite loop when
building with AutoFDO.

Bootstrapped and regression test on-going.

OK for google-4_8 branch?

Thanks,
Dehao

Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c (revision 204027)
+++ gcc/auto-profile.c (working copy)
@@ -1287,6 +1287,7 @@ afdo_propagate (void)
 {
   basic_block bb;
   bool changed = true;
+  int i = 0;

   FOR_ALL_BB (bb)
 {
@@ -1295,7 +1296,7 @@ afdo_propagate (void)
  bb-flags |= BB_ANNOTATED;
 }

-  while (changed)
+  while (changed  i++  PARAM_VALUE (PARAM_AUTOFDO_MAX_PROPAGATE_ITERATIONS))
 {
   changed = false;

Index: gcc/params.def
===
--- gcc/params.def (revision 204027)
+++ gcc/params.def (working copy)
@@ -460,6 +460,14 @@ DEFPARAM(PARAM_MAX_PREDICTED_ITERATIONS,
  The maximum number of loop iterations we predict statically,
  100, 0, 0)

+/* This parameter controls the maximum iterations that AutoFDO profile
+   prpagation algorithm will run for a specific CFG.  */
+
+DEFPARAM(PARAM_AUTOFDO_MAX_PROPAGATE_ITERATIONS,
+ max-autofdo-max-propagate-iterations,
+ The maximum number of AutoFDO profile propagation iterations,
+ 1000, 0, 0)
+
 /* This parameter controls the probability of builtin_expect. The default
value is 90%. This empirical value is obtained through the weighted
probability of FDO counters (with the FDO count value as the weight)


Re: [PATCH, SH] Add support for inlined builtin_strncmp

2013-10-25 Thread Kaz Kojima
Christian Bruel christian.br...@st.com wrote:
 No regressions for -m2 and -m4 for sh-elf.
 OK for trunk ?

OK with the change of ChangeLog entry suggested by your another mail.
Thanks!

Regards,
kaz


Re: [google gcc-4_8] write ggc_memory to gcda file

2013-10-25 Thread Xinliang David Li
Looks good.

David

On Fri, Oct 25, 2013 at 4:04 PM, Rong Xu x...@google.com wrote:
 Hi,

 This patch writes out ggc_memory to gcda files.

 Google branches only. Test is ongoing. OK to check-in after test passes?

 -Rong


Re: [GOOGLE] Fix a bug when profile propagation handles infinite loop.

2013-10-25 Thread Xinliang David Li
What is the usual number of iterations?

David

On Fri, Oct 25, 2013 at 4:10 PM, Dehao Chen de...@google.com wrote:
 If the propagation finds an infinite look, if the in-edge count is
 non-zero, then it will cause compiler go into infinite loop when
 building with AutoFDO.

 Bootstrapped and regression test on-going.

 OK for google-4_8 branch?

 Thanks,
 Dehao

 Index: gcc/auto-profile.c
 ===
 --- gcc/auto-profile.c (revision 204027)
 +++ gcc/auto-profile.c (working copy)
 @@ -1287,6 +1287,7 @@ afdo_propagate (void)
  {
basic_block bb;
bool changed = true;
 +  int i = 0;

FOR_ALL_BB (bb)
  {
 @@ -1295,7 +1296,7 @@ afdo_propagate (void)
   bb-flags |= BB_ANNOTATED;
  }

 -  while (changed)
 +  while (changed  i++  PARAM_VALUE 
 (PARAM_AUTOFDO_MAX_PROPAGATE_ITERATIONS))
  {
changed = false;

 Index: gcc/params.def
 ===
 --- gcc/params.def (revision 204027)
 +++ gcc/params.def (working copy)
 @@ -460,6 +460,14 @@ DEFPARAM(PARAM_MAX_PREDICTED_ITERATIONS,
   The maximum number of loop iterations we predict statically,
   100, 0, 0)

 +/* This parameter controls the maximum iterations that AutoFDO profile
 +   prpagation algorithm will run for a specific CFG.  */
 +
 +DEFPARAM(PARAM_AUTOFDO_MAX_PROPAGATE_ITERATIONS,
 + max-autofdo-max-propagate-iterations,
 + The maximum number of AutoFDO profile propagation iterations,
 + 1000, 0, 0)
 +
  /* This parameter controls the probability of builtin_expect. The default
 value is 90%. This empirical value is obtained through the weighted
 probability of FDO counters (with the FDO count value as the weight)


  1   2   >