Re: PR59723: fix LTO + fortran namelist ICEs

2014-01-28 Thread Richard Biener
On Tue, 21 Jan 2014, Aldy Hernandez wrote:

 Hi folks.
 
 The problem here is that while streaming the DECL_NAME with stream_read_tree()
 and consequently lto_output_tree(), we trigger an ICE because we are recursing
 on the DFS walk:
 
    else
      {
        /* This is the first time we see EXPR, write all reachable
           trees to OB.  */
        static bool in_dfs_walk;
 
        /* Protect against recursion which means disconnect between
           what tree edges we walk in the DFS walk and what edges
           we stream out.  */
        gcc_assert (!in_dfs_walk);
 
 In the namelist fortran testcases, this is the first time we see the
 DECL_NAMEs, so we ICE.  I fixed this by outputting the DECL_NAME's string with
 streamer_write_string() instead.
 
 [I honestly wondered why we don't stream the entire NAMELIST_DECL the same way
 we stream IMPORTED_DECL, all in one go, instead of piecemeal like the present
 code does.  But LTO is this complicated black box in my head that I try not to
 think about too much...the current patch touches as little as possible.]
 
 This change alone fixes the problems in the PR, but I also found another ICE
 now that streaming actually works: dwarf.  It turns out, that the way the
 CONSTRUCTOR elements in the NAMELIST_DECL are streamed, a PARM_DECL that has
 been previously seen (in the function's DECL_ARGUMENTS) will be streamed with
 different reference magic (or whatever streamed references or ids are called
 in LTO speak).  So when we read the CONSTRUCTOR elements in the LTO read
 phase, we end up with different pointers for a PARM_DECL in the constructor
 for the seemingly same PARM_DECL in the function's arguments (DECL_ARGUMENTS).
 
 I don't understand LTO very well, but I do see that using stream_read_tree
 (lto_output_tree) caches the entries, so it seems like a good fit to avoid
 writing two distinct items for the same PARM_DECL. And since the constructor
 elements have been previously seen, we won't hit the aforementioned DFS
 recursion ICE.
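
 As a rough illustration of the caching idea described above (not GCC's
 LTO streamer; Node, Writer and Reader are hypothetical names made up for
 this sketch), a writer that remembers which nodes it has already emitted
 can emit a back-reference the second time, and the reader then hands out
 the same object for both occurrences:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a tree node such as a PARM_DECL.
struct Node { std::string name; };

// Writer with a cache: the first visit emits a definition record,
// later visits emit a reference to the cached index.
struct Writer {
    std::unordered_map<const Node*, std::size_t> cache;
    std::vector<std::string> records;  // "def:<name>" or "ref:<index>"

    void write(const Node* n) {
        auto it = cache.find(n);
        if (it != cache.end()) {
            records.push_back("ref:" + std::to_string(it->second));
            return;
        }
        std::size_t idx = cache.size();
        cache.emplace(n, idx);
        records.push_back("def:" + n->name);
    }
};

// Reader mirrors the cache: definitions are materialized once and
// references resolve to the object created for that index.
struct Reader {
    std::vector<std::shared_ptr<Node>> seen;

    std::shared_ptr<Node> read(const std::string& rec) {
        if (rec.compare(0, 4, "ref:") == 0)
            return seen.at(std::stoul(rec.substr(4)));
        auto n = std::make_shared<Node>(Node{rec.substr(4)});
        seen.push_back(n);
        return n;
    }
};

// Usage: the same node written twice reads back as one shared object.
inline void demo_stream_cache() {
    Node parm{"x"};
    Writer w;
    w.write(&parm);   // e.g. via the function's argument list
    w.write(&parm);   // e.g. via a constructor element
    Reader r;
    auto first = r.read(w.records[0]);
    auto second = r.read(w.records[1]);
    assert(first == second);
}
```

 Without the cache the second write would produce a second definition
 record, and the reader would end up with two distinct objects for what
 was one node on the writer side.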
 
 It'd be great if the LTO gods could illuminate this abyss (that's you Mr.
 Biener ;-)), but the attached patch fixes all the LTO problems exhibited by:
 
 make check-fortran RUNTESTFLAGS="--target_board=unix'{-flto}' dg.exp=*namelist*"
 
 As an aside, we really should test some subset of the LTO tests while testing
 Fortran, or this oversight will surely repeat itself on any changes to the
 Fortran streamer bits.
 
 Does this help?  OK?

The patch doesn't make much sense to me.  I think the problem is that
NAMELIST_DECL is output in a ref section (LTO_namelist_decl_ref) but
the output routine writes other refs (via stream_write_tree and
outputting the constructor).  lto_output_tree_ref may not do this
kind of stuff.  Instead the contents of a NAMELIST_DECL need to be
output from the generic tree writing routines.

Where are NAMELIST_DECLs possibly referred from?

Richard.


Re: [gomp4] Initial support for OpenACC data clauses

2014-01-28 Thread Thomas Schwinge
Hi!

On Tue, 14 Jan 2014 07:09:33 -0800, I wrote:
 Here is a patch series that adds initial support for OpenACC data
 clauses.  It is not yet complete, but I thought I might as well already
 now strive to get this integrated upstream instead of hoarding the
 patches locally.

Committed to gomp-4_0-branch in r207173..8.


 Would it be a good idea to also commit to trunk the (portions of the)
 patches that don't directly relate to OpenACC stuff?  That way, trunk
 and gomp-4_0-branch would diverge a little less?  Or, would you first
 like to see all of this stabilize on gomp-4_0-branch?


Grüße,
 Thomas




Re: PR59723: fix LTO + fortran namelist ICEs

2014-01-28 Thread Jakub Jelinek
On Tue, Jan 28, 2014 at 10:40:51AM +0100, Richard Biener wrote:
 The patch doesn't make much sense to me.  I think the problem is that
 NAMELIST_DECL is output in a ref section (LTO_namelist_decl_ref) but
 the output routine writes other refs (via stream_write_tree and
 outputting the constructor).  lto_output_tree_ref may not do this
 kind of stuff.  Instead the contents of a NAMELIST_DECL need to be
 output from the generic tree writing routines.
 
 Where are NAMELIST_DECLs possibly referred from?

I think usually from BLOCK_VARS of some BLOCK.

Jakub


[patch] fix libstdc++/59656

2014-01-28 Thread Jonathan Wakely

This adds new (non-public) constructors to shared_ptr and
__shared_count so that weak_ptr::lock() can be implemented without
exceptions. This allows it to be used with -fno-exceptions and also
avoids the overhead of throwing and catching an exception when the
weak_ptr has expired.
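
The core of the change is an "add a reference only if the count is not
already zero" operation.  A minimal sketch of that idea using std::atomic
(UseCount and add_ref_lock_nothrow are illustrative names for this
example, not the libstdc++ internals) might look like:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a control block's use count.
struct UseCount {
    std::atomic<long> count{0};

    // Increment the count only if it is currently non-zero.
    // Returns false, without throwing, when the object has expired.
    bool add_ref_lock_nothrow() noexcept {
        long seen = count.load(std::memory_order_relaxed);
        do {
            if (seen == 0)
                return false;
            // On failure compare_exchange_weak reloads `seen`,
            // so the zero check is repeated with the fresh value.
        } while (!count.compare_exchange_weak(seen, seen + 1,
                                              std::memory_order_acq_rel,
                                              std::memory_order_relaxed));
        return true;
    }
};

// Usage: an expired count stays at zero, a live one is incremented.
inline void demo_add_if_not_zero() {
    UseCount expired;
    assert(!expired.add_ref_lock_nothrow());
    assert(expired.count.load() == 0);

    UseCount live;
    live.count.store(2);
    assert(live.add_ref_lock_nothrow());
    assert(live.count.load() == 3);
}
```

Because failure is reported through the return value, a caller such as
lock() can hand back an empty pointer without any try/catch.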

Tested x86_64-linux, committed to trunk.

2014-01-28  Jonathan Wakely  jwak...@redhat.com
Kyle Lippincott  spect...@google.com

	PR libstdc++/59656

* include/bits/shared_ptr.h (shared_ptr): Add new non-throwing
constructor and grant friendship to weak_ptr.
(weak_ptr::lock()): Use new constructor.
* include/bits/shared_ptr_base.h
(_Sp_counted_base::_M_add_ref_lock_nothrow()): Declare new function
and define specializations.
(__shared_count): Add new non-throwing constructor.
(__shared_ptr): Add new non-throwing constructor and grant friendship
to __weak_ptr.
(__weak_ptr::lock()): Use new constructor.
* testsuite/20_util/shared_ptr/cons/43820_neg.cc: Adjust dg-error.
* testsuite/20_util/shared_ptr/cons/void_neg.cc: Likewise.


commit bf7fef6f31608a1da2e2fbc79c2514c0cdecca1d
Author: Jonathan Wakely jwak...@redhat.com
Date:   Tue Jan 28 09:45:30 2014 +


diff --git a/libstdc++-v3/include/bits/shared_ptr.h b/libstdc++-v3/include/bits/shared_ptr.h
index 8fcf710..081d3bd 100644
--- a/libstdc++-v3/include/bits/shared_ptr.h
+++ b/libstdc++-v3/include/bits/shared_ptr.h
@@ -319,6 +319,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       template<typename _Tp1, typename _Alloc, typename... _Args>
         friend shared_ptr<_Tp1>
         allocate_shared(const _Alloc& __a, _Args&&... __args);
+
+      // This constructor is non-standard, it is used by weak_ptr::lock().
+      shared_ptr(const weak_ptr<_Tp>& __r, std::nothrow_t)
+      : __shared_ptr<_Tp>(__r, std::nothrow) { }
+
+      friend class weak_ptr<_Tp>;
     };
 
   // 20.7.2.2.7 shared_ptr comparisons
@@ -492,23 +498,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
       shared_ptr<_Tp>
       lock() const noexcept
-      {
-#ifdef __GTHREADS
-        if (this->expired())
-          return shared_ptr<_Tp>();
-
-        __try
-          {
-            return shared_ptr<_Tp>(*this);
-          }
-        __catch(const bad_weak_ptr&)
-          {
-            return shared_ptr<_Tp>();
-          }
-#else
-        return this->expired() ? shared_ptr<_Tp>() : shared_ptr<_Tp>(*this);
-#endif
-      }
+      { return shared_ptr<_Tp>(*this, std::nothrow); }
 };
 
   // 20.7.2.3.6 weak_ptr specialized algorithms.
diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h b/libstdc++-v3/include/bits/shared_ptr_base.h
index 1c3a47d..536df01 100644
--- a/libstdc++-v3/include/bits/shared_ptr_base.h
+++ b/libstdc++-v3/include/bits/shared_ptr_base.h
@@ -134,7 +134,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   
   void
   _M_add_ref_lock();
-  
+
+  bool
+  _M_add_ref_lock_nothrow();
+
   void
   _M_release() noexcept
   {
@@ -247,6 +250,51 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 }
 
   template<>
+    inline bool
+    _Sp_counted_base<_S_single>::
+    _M_add_ref_lock_nothrow()
+    {
+      if (_M_use_count == 0)
+        return false;
+      ++_M_use_count;
+      return true;
+    }
+
+  template<>
+    inline bool
+    _Sp_counted_base<_S_mutex>::
+    _M_add_ref_lock_nothrow()
+    {
+      __gnu_cxx::__scoped_lock sentry(*this);
+      if (__gnu_cxx::__exchange_and_add_dispatch(&_M_use_count, 1) == 0)
+        {
+          _M_use_count = 0;
+          return false;
+        }
+      return true;
+    }
+
+  template<>
+    inline bool
+    _Sp_counted_base<_S_atomic>::
+    _M_add_ref_lock_nothrow()
+    {
+      // Perform lock-free add-if-not-zero operation.
+      _Atomic_word __count = _M_get_use_count();
+      do
+        {
+          if (__count == 0)
+            return false;
+          // Replace the current counter value with the old value + 1, as
+          // long as it's not changed meanwhile.
+        }
+      while (!__atomic_compare_exchange_n(&_M_use_count, &__count, __count + 1,
+                                          true, __ATOMIC_ACQ_REL,
+                                          __ATOMIC_RELAXED));
+  

Re: Do not produce empty try-finally statements

2014-01-28 Thread Richard Biener
On Mon, Jan 20, 2014 at 5:22 PM, Jan Hubicka hubi...@ucw.cz wrote:

 Yes.  Say, this could be surrounded by some try/catch, if we do it the first
 way, a would be still considered live across the EH path to whatever catches
 it.

 The EH optimizations involving cleanups with only clobbers in them are that
 if at the end of the cleanup after only CLOBBER stmts you would rethrow the
 exception externally, then the clobber isn't needed and the whole cleanup
 can be removed.  And, if it rethrows somewhere internally, we can move the
 clobber stmts to the landing pad of wherever it would be caught.

 OK, I still do not see how ehcleanup1 can then safely remove them pre-inline
 given that the whole function body can be inlined into another containing the
 outer EH region.

That's true.

  If this is valid, why can we not just eliminate EH in those
 outer clobber try..finally as part of ehlowering earlier?

Probably we'd miss too many inlining cases from early inlining and
the ehcleanup1 time is just a heuristic that works good enough for us?

Jakub?  Micha?

Thanks,
Richard.

 Honza

   Jakub


Re: [RFC][gomp4] Offloading patches (2/3): Add tables generation

2014-01-28 Thread Bernd Schmidt

On 12/17/2013 12:39 PM, Michael V. Zolotukhin wrote:

Here is a patch 2/3: Add tables generation.

This patch is just a slightly modified patch sent a couple of weeks ago.  When
compiling with '-fopenmp' the compiler generates a special symbol, containing
addresses and sizes of globals/omp_fn-functions, and places it into a special
section.  Later, at linking, these sections are merged together and we get a
single table with all addresses/sizes for the entire binary.  Also, in this
patch we start to pass the '__OPENMP_TARGET__' symbol to GOMP_target calls.


I also have a question about the code in this patch. I can see how the 
table is constructed - what's not clear to me is how it is going to be 
used? How do you map from a function or variable you want to look up to 
an index in this table?



Bernd



[jit] Add set_options function to the testsuite

2014-01-28 Thread David Malcolm
Committed to branch dmalcolm/jit:

gcc/testsuite/
* jit.dg/harness.h (test_jit): Move the various calls to set up
options on the context into...
(set_options): ...this new function.
---
 gcc/testsuite/ChangeLog.jit|  6 ++
 gcc/testsuite/jit.dg/harness.h | 23 ++-
 2 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/gcc/testsuite/ChangeLog.jit b/gcc/testsuite/ChangeLog.jit
index 461c067..b150796 100644
--- a/gcc/testsuite/ChangeLog.jit
+++ b/gcc/testsuite/ChangeLog.jit
@@ -1,3 +1,9 @@
+2014-01-28  David Malcolm  dmalc...@redhat.com
+
+   * jit.dg/harness.h (test_jit): Move the various calls to set up
+   options on the context into...
+   (set_options): ...this new function.
+
 2014-01-27  David Malcolm  dmalc...@redhat.com
 
* jit.dg/test-error-call-with-not-enough-args.c: New test case.
diff --git a/gcc/testsuite/jit.dg/harness.h b/gcc/testsuite/jit.dg/harness.h
index c9b8a3d..71a5fcf 100644
--- a/gcc/testsuite/jit.dg/harness.h
+++ b/gcc/testsuite/jit.dg/harness.h
@@ -91,16 +91,8 @@ void check_string_value (const char *actual, const char *expected)
   pass (%s: actual: NULL == expected: NULL);
 }
 
-/* Run one iteration of the test.  */
-static void
-test_jit (const char *argv0, void *user_data)
+static void set_options (gcc_jit_context *ctxt, const char *argv0)
 {
-  gcc_jit_context *ctxt;
-  gcc_jit_result *result;
-
-  ctxt = gcc_jit_context_acquire ();
- /* FIXME: error-handling */
-
   /* Set up options.  */
   gcc_jit_context_set_str_option (
 ctxt,
@@ -130,6 +122,19 @@ test_jit (const char *argv0, void *user_data)
 ctxt,
 GCC_JIT_BOOL_OPTION_DUMP_SUMMARY,
 1);
+}
+
+/* Run one iteration of the test.  */
+static void
+test_jit (const char *argv0, void *user_data)
+{
+  gcc_jit_context *ctxt;
+  gcc_jit_result *result;
+
+  ctxt = gcc_jit_context_acquire ();
+ /* FIXME: error-handling */
+
+  set_options (ctxt, argv0);
 
   create_code (ctxt, user_data);
 
-- 
1.7.11.7



Re: Do not produce empty try-finally statements

2014-01-28 Thread Jakub Jelinek
On Tue, Jan 28, 2014 at 11:48:16AM +0100, Richard Biener wrote:
 On Mon, Jan 20, 2014 at 5:22 PM, Jan Hubicka hubi...@ucw.cz wrote:
 
  Yes.  Say, this could be surrounded by some try/catch, if we do it the
  first way, a would be still considered live across the EH path to
  whatever catches it.
 
  The EH optimizations involving cleanups with only clobbers in them are that
  if at the end of the cleanup after only CLOBBER stmts you would rethrow
  the exception externally, then the clobber isn't needed and the whole
  cleanup can be removed.  And, if it rethrows somewhere internally, we can
  move the clobber stmts to the landing pad of wherever it would be caught.
 
  OK, I still do not see how ehcleanup1 can then safely remove them pre-inline
  given that the whole function body can be inlined into another containing
  the outer EH region.
 
 That's true.

There are two kinds of clobbers, the direct ones, which surely can be safely
removed by ehcleanup1 if they are the only reason why there is a landing pad
which will be rethrown outside of the current function, and then the
indirect ones, meant primarily for C++ destructors on *this, which are just
heuristics and current state seems to be working good enough for them IMHO.

For the direct cleanups, the important thing is that they are necessarily
local variables in the current function, so when returning from that
function, whether normally or abnormally, the inliner still has all the info
about them.  If it wants, it can add the corresponding clobbers itself (not
sure if it does or doesn't bother, but if it doesn't bother right now, it
certainly could add those in the future if it proves to be important).
For the indirect clobbers, typically if you inline some destructor, you
either inline it into another destructor which will also have (a larger)
clobber for *this afterwards, or into a function that defines the var as
local and there will be a normal direct clobber for it.

Jakub


Re: [PATCH, ARM][PING] Reintroduce minipool ranges for zero-extension insn patterns

2014-01-28 Thread Ramana Radhakrishnan
On Thu, Jan 23, 2014 at 3:16 PM, Yury Gribov y.gri...@samsung.com wrote:
 Hi,

 Julian Brown has proposed a patch
 (http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01191.html) for the dreadful
 push_minipool_fix error (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49423)
 in June but it didn't seem to get enough attention.


That patch appears to be garbled with text that appears to be from a
testcase. It needs to be rebased for trunk to deal with the additional
changes to the attributes currently. Additionally I'm not really sure
why there is an additional load from the constant pool here - what is
the constant in this case? Given it is at most a 16-bit constant,
surely that should be loaded with a single mov(w) instruction in armv7
land.

regards
Ramana

 Can we submit it?

 --
 Best regards,
 Yury


Re: [PATCH, ARM] Suppress Redundant Flag Setting for Cortex-A15

2014-01-28 Thread Ramana Radhakrishnan
On Fri, Jan 24, 2014 at 5:16 PM, Ian Bolton ian.bol...@arm.com wrote:
 Hi there!

 An existing optimisation for Thumb-2 converts t32 encodings to
 t16 encodings to reduce codesize, at the expense of causing
 redundant flag setting for ADD, AND, etc.  This redundant flag
 setting can have negative performance impact on cortex-a15.

 This patch introduces two new tuning options so that the conversion
 from t32 to t16, which takes place in thumb2_reorg, can be suppressed
 for cortex-a15.

 To maintain some of the original benefit (reduced codesize), the
 suppression is only done where the enclosing basic block is deemed
 worthy of optimising for speed.

 This was tested with no regressions and performance has improved for
 the workloads tested on cortex-a15.  (It might be beneficial to
 other processors too, but that has not been investigated yet.)

 OK for stage 1?

This is OK for stage1.

Ramana


 Cheers,
 Ian


 2014-01-24  Ian Bolton  ian.bol...@arm.com

 gcc/
 * config/arm/arm-protos.h (tune_params): New struct members.
 * config/arm/arm.c: Initialise tune_params per processor.
 (thumb2_reorg): Suppress conversion from t32 to t16 when
 optimizing for speed, based on new tune_params.


Re: [ARM] add armv7ve support

2014-01-28 Thread Ramana Radhakrishnan
On Fri, Dec 20, 2013 at 6:50 PM, Renlin Li renlin...@arm.com wrote:
 Hi all,

 This patch will add armv7ve support to gcc. Armv7ve is basically an armv7-a
 architecture profile with Virtualization Extensions. Additional test cases
 are also added.

 With this patch, and to keep backward compatibility with old assemblers, the
 following asm header will be generated when the -march=armv7ve option is
 given.
 .arch armv7-a
 .arch_extension virt
 .arch_extension idiv
 .arch_extension sec
 .arch_extension mp

 This is an amendment to a previous patch:
 http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02365.html
 No new __ARM_ARCH_7VE__ is defined. Instead, __ARM_ARCH_7A__ is defined with
 additional extensions (e.g. __ARM_ARCH_EXT_IDIV__) when arch is set to
 armv7ve.


 Okay for trunk?

This is OK since this was submitted quite some time back.

Sorry about the delay - I've been on holiday.

Thanks,
Ramana


 Regards,
 Renlin Li


 gcc/ChangeLog:

 2013-12-20  Renlin Li  renlin...@arm.com

 * config.gcc:  Add armv7ve for --with-arch option.
 * config/arm/arm-arches.def (ARM_ARCH): Add armv7ve arch.
 * config/arm/arm.c (FL_FOR_ARCH7VE):  New.
 (arm_file_start): Generate correct asm header for armv7ve.
 * config/arm/bpabi.h:  Add multilib support for armv7ve.
 * config/arm/driver-arm.c: Change the architectures of cortex-a7
 and cortex-a15 to armv7ve.
 * config/arm/t-aprofile: Add multilib support for armv7ve.
 * doc/invoke.texi:  Document -march=armv7ve.

 gcc/testsuite/ChangeLog:

 2013-12-20  Renlin Li  renlin...@arm.com

 * gcc.target/arm/ftest-armv7ve-arm.c: New.
 * gcc.target/arm/ftest-armv7ve-thumb.c: New.
 * lib/target-supports.exp: New armfunc, armflag and armdef for
 armv7ve.



Re: [PATCH] Don't ignore write/write DDR_REVERSED_P dependencies (PR tree-optimization/59594)

2014-01-28 Thread Richard Biener
On Fri, 24 Jan 2014, Jakub Jelinek wrote:

 Hi!
 
 I admit I don't fully understand why exactly, but my experimentation so far
 showed that for read/write and write/read ddrs it is ok and desirable to
 ignore the dist > 0 && DDR_REVERSED_P (ddr) cases, but for write/write
 ddrs it is undesirable.  See the PR for further tests, perhaps I could
 turn them into further testcases.

Please.

 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
 
 2014-01-22  Jakub Jelinek  ja...@redhat.com
 
   PR tree-optimization/59594
   * tree-vect-data-refs.c (vect_analyze_data_ref_dependence): Don't
   ignore dist > 0 && DDR_REVERSED_P (ddr) if step is negative and both
   DRs are writes.
 
   * gcc.dg/vect/no-vfa-vect-depend-2.c: New test.
   * gcc.dg/vect/pr59594.c: New test.
 
 --- gcc/tree-vect-data-refs.c.jj  2014-01-16 20:54:59.0 +0100
 +++ gcc/tree-vect-data-refs.c 2014-01-22 13:13:49.751362484 +0100
 @@ -383,11 +383,13 @@ vect_analyze_data_ref_dependence (struct
 continue;
   }
  
 -          if (dist > 0 && DDR_REVERSED_P (ddr))
 +          if (dist > 0 && DDR_REVERSED_P (ddr)
 +              && (DR_IS_READ (dra) || DR_IS_READ (drb)))

I think that's not sufficient.  It depends
on the order of the stmts whether the dependence distance is really
negative - we are trying to catch write-after-read negative distance
here I think.  We can't rely on the DDRs being formed in stmt order
(anymore, at least since 4.9 where we started to arbitrarily re-order
the vector of DRs).
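
To make the write/write ordering concern concrete, here is a small
simulation of the loop from the pr59594.c testcase.  It is only a sketch
under my own assumptions: run_naive_chunks models a hypothetical
transformation that performs all first stores of a chunk before all
second stores, which is not necessarily what the vectorizer emits, and
the array is sized n + 2 so that b[i + 1] stays in bounds.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Scalar reference: the backwards loop with two stores per iteration.
std::vector<int> run_scalar(int n) {
    std::vector<int> b(n + 2);
    for (int i = 0; i <= n; ++i)
        b[i] = i;
    for (int i = n; i >= 0; --i) {
        b[i + 1] = b[i];   // first store
        b[i] = 1;          // second store
    }
    return b;
}

// Hypothetical chunked execution: for each group of `vf` iterations,
// do all loads, then all first stores, then all second stores.
std::vector<int> run_naive_chunks(int n, int vf) {
    assert(vf > 0);
    std::vector<int> b(n + 2);
    for (int i = 0; i <= n; ++i)
        b[i] = i;
    for (int i = n; i >= 0; i -= vf) {
        int lanes = std::min(vf, i + 1);
        std::vector<int> tmp(lanes);
        for (int k = 0; k < lanes; ++k) tmp[k] = b[i - k];
        for (int k = 0; k < lanes; ++k) b[i - k + 1] = tmp[k];
        for (int k = 0; k < lanes; ++k) b[i - k] = 1;
    }
    return b;
}

// Usage: with one lane the orders agree; with four lanes the second
// store of an earlier iteration overwrites the first store of a later one.
inline void demo_write_write() {
    std::vector<int> expected = run_scalar(15);
    assert(run_naive_chunks(15, 1) == expected);
    assert(run_naive_chunks(15, 4) != expected);
}
```

The element b[i] is written by the second store of iteration i and then
by the first store of iteration i - 1, so reordering those two writes
changes the final value even though no read is involved.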

             {
               /* If DDR_REVERSED_P the order of the data-refs in DDR was
                  reversed (to make distance vector positive), and the actual
 -                distance is negative.  */
 +                distance is negative.  If both DRs are writes, we can't ignore
 +                the DDR.  See PR59594.  */
               if (dump_enabled_p ())
                 dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                                  "dependence distance negative.\n");
 --- gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-2.c.jj   2014-01-22 13:28:47.100724091 +0100
 +++ gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-2.c  2014-01-22 13:41:38.736778586 +0100
 @@ -0,0 +1,55 @@
 +/* { dg-require-effective-target vect_int } */
 +
 +#include <stdarg.h>
 +#include "tree-vect.h"
 +
 +#define N 17
 +
 +int ia[N] = {48,45,42,39,36,33,30,27,24,21,18,15,12,9,6,3,0};
 +int ib[N] = {48,45,42,39,36,33,30,27,24,21,18,15,12,9,6,3,0};
 +int res[N] = {48,192,180,168,156,144,132,120,108,96,84,72,60,48,36,24,12};
 +
 +__attribute__ ((noinline))
 +int main1 ()
 +{
 +  int i;
 +
 +  /* Not vectorizable due to data dependence: dependence distance 1.  */ 
 +  for (i = N - 1; i >= 0; i--)
 +{
 +  ia[i] = ia[i+1] * 4;
 +}
 +
 +  /* check results:  */
 +  for (i = 0; i < N; i++)
 +{
 +  if (ia[i] != 0)
 +abort ();
 +} 
 +
 +  /* Vectorizable. Dependence distance -1.  */
 +  for (i = N - 1; i >= 0; i--)
 +{
 +  ib[i+1] = ib[i] * 4;
 +}
 +
 +  /* check results:  */
 +  for (i = 0; i < N; i++)
 +{
 +  if (ib[i] != res[i])
 +abort ();
 +}
 +
 +  return 0;
 +}
 +
 +int main (void)
 +{
 +  check_vect();
 +
 +  return main1 ();
 +}
 +
 +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" {xfail vect_no_align } } } */
 +/* { dg-final { scan-tree-dump-times "dependence distance negative" 1 "vect" } } */
 +/* { dg-final { cleanup-tree-dump "vect" } } */
 --- gcc/testsuite/gcc.dg/vect/pr59594.c.jj    2014-01-22 13:39:51.362322166 +0100
 +++ gcc/testsuite/gcc.dg/vect/pr59594.c   2014-01-22 13:39:43.0 +0100
 @@ -0,0 +1,31 @@
 +/* PR tree-optimization/59594 */
 +
 +#include "tree-vect.h"
 +
 +#define N 1024
 +int b[N + 1];
 +
 +int
 +main ()
 +{
 +  int i;
 +  check_vect ();
 +  for (i = 0; i < N + 1; i++)
 +    {
 +      b[i] = i;
 +      asm ("");
 +    }
 +  for (i = N; i >= 0; i--)
 +{
 +  b[i + 1] = b[i];
 +  b[i] = 1;
 +}
 +  if (b[0] != 1)
 +__builtin_abort ();
 +  for (i = 0; i < N; i++)
 +if (b[i + 1] != i)
 +  __builtin_abort ();
 +  return 0;
 +}
 +
 +/* { dg-final { cleanup-tree-dump "vect" } } */
 
   Jakub
 
 

-- 
Richard Biener rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer


Re: Do not produce empty try-finally statements

2014-01-28 Thread Michael Matz
Hi,

On Tue, 28 Jan 2014, Richard Biener wrote:

  The EH optimizations involving cleanups with only clobbers in them 
  are that if at the end of the cleanup after only CLOBBER stmts you 
  would rethrow the exception externally, then the clobber isn't needed 
  and the whole cleanup can be removed.  And, if it rethrows somewhere 
  internally, we can move the clobber stmts to the landing pad of 
  wherever it would be caught.
 
  OK, I still do not see how ehcleanup1 can then safely remove them 
  pre-inline given that the whole function body can be inlined into 
  another containing the outer EH region.
 
 That's true.

Yes, and I think they should be removed only after inlining.  
(Alternatively the inliner could be extended to add clobbers when changing 
an external-throw into an internal-throw, but well, ...)

   If this is valid, why we can not just eliminate EH in those outer 
  clobber try..finally as part of ehlowering earlier?
 
 Probably we'd miss too many inlining cases from early inlining and the 
 ehcleanup1 time is just a heuristic that works good enough for us?

No, removing even more clobbers would remove even more stack slot sharing.  
If anything we should remove _less_ regions (as in, the precondition for 
Honza's sentence above, "If this is valid, ...", simply is false).  AFAIU 
this all is just a problem with O0 code quality.  So, there's the obvious 
solution: run ehcleanup also for O0.


Ciao,
Michael.


Re: [PATCH, ARM][PING] Reintroduce minipool ranges for zero-extension insn patterns

2014-01-28 Thread Julian Brown
On Tue, 28 Jan 2014 12:09:27 +
Ramana Radhakrishnan ramana@googlemail.com wrote:

 On Thu, Jan 23, 2014 at 3:16 PM, Yury Gribov y.gri...@samsung.com
 wrote:
  Hi,
 
  Julian Brown has proposed a patch
  (http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01191.html) for the
  dreadful push_minipool_fix error
  (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49423) in June but it
  didn't seem to get enough attention.
 
 
 That patch appears to be garbled with text that appears to be from a
 testcase.

That's bizarre! I have no idea how that can have happened.

 It needs to be rebased for trunk to deal with the additional
 changes to the attributes currently. Additionally I'm not really sure
 why there is an additional load from the constant pool here - what is
 the constant in this case? Given it is at most a 16-bit constant,
 surely that should be loaded with a single mov(w) instruction in armv7
 land.

I've lost the context on this patch now, but I may be able to have a
look some time in the next few days and refresh my memory --
particularly if there's a testcase available that reproduces on current
mainline.

Thanks,

Julian


[PATCH] Remove unused ddr_is_anti_dependent, ddrs_have_anti_deps

2014-01-28 Thread Richard Biener

Committed as they are also wrong (no handling of DDR_REVERSED_P).

Richard.

2014-01-28  Richard Biener  rguent...@suse.de

* tree-data-ref.h (ddr_is_anti_dependent, ddrs_have_anti_deps):
Remove.

Index: gcc/tree-data-ref.h
===
*** gcc/tree-data-ref.h (revision 207178)
--- gcc/tree-data-ref.h (working copy)
*** same_access_functions (const struct data
*** 457,488 
return true;
  }
  
- /* Return true when DDR is an anti-dependence relation.  */
- 
- static inline bool
- ddr_is_anti_dependent (ddr_p ddr)
- {
-   return (DDR_ARE_DEPENDENT (ddr) == NULL_TREE
-           && DR_IS_READ (DDR_A (ddr))
-           && DR_IS_WRITE (DDR_B (ddr))
-           && !same_access_functions (ddr));
- }
- 
- /* Return true when DEPENDENCE_RELATIONS contains an anti-dependence.  */
- 
- static inline bool
- ddrs_have_anti_deps (vec<ddr_p> dependence_relations)
- {
-   unsigned i;
-   ddr_p ddr;
- 
-   for (i = 0; dependence_relations.iterate (i, &ddr); i++)
- if (ddr_is_anti_dependent (ddr))
-   return true;
- 
-   return false;
- }
- 
  /* Returns true when all the dependences are computable.  */
  
  inline bool
--- 457,462 


Re: Do not produce empty try-finally statements

2014-01-28 Thread Michael Matz
Hi,

On Tue, 28 Jan 2014, Jakub Jelinek wrote:

 There are two kinds of clobbers, the direct ones, which surely can be 
 safely removed by ehcleanup1 if they are the only reason why there is a 
 landing pad which will be rethrown outside of the current function,

You can only safely (as in, not introducing false conflicts for stack 
slots) remove them before inlining, _if_ the inliner would add them back 
in ...

 For the direct cleanups, the important thing is that they are necessarily
 local variables in the current function, so when returning from that
 function, whether normally or abnormally, the inliner still has all the info
 about them.  If it wants, it can add the corresponding clobbers itself

... which it doesn't.

 (not sure if it does or doesn't bother, but if it doesn't bother right 
 now, it certainly could add those in the future if it proves to be 
 important).


Ciao,
Michael.


Re: C vs. C++ breakage on 4.7 (was Re: [Patch, fortran] PR58007: unresolved fixup hell)

2014-01-28 Thread Richard Biener
On Tue, Jan 28, 2014 at 3:25 AM, Hans-Peter Nilsson h...@bitrange.com wrote:
 On Mon, 27 Jan 2014, Richard Biener wrote:
 Huh, so we have C for cross-builds and C++ for bootstraps.

 No, we use a C host compiler in both cases. Only stages 2 and 3 build with a 
 C++ compiler.

 Tomatos potatoes!  As fortran isn't built until then, it'll be
 built as C for cross-builds and C++ for native bootstraps.  Can
 we make stage 2 and 3 built with C on the 4.7 branch?

You can, with a configure option.  No, at this point we don't want to
change the defaults.

Richard.



 Richard.
  I
 wish we could retire that difference *also* on the 4.7 branch
 (using either C *or* C++ for *both* would be fine with me FWIW).
 I believe we're now experiencing more problems than benefits with
 that difference, now that the migration is over.
 
   Looks like you committed C++ code there, in module.c:
  Alright; can you try the attached patch?
 
 Sorry, not at the moment, but I see Janus took care of that
 (thanks) and it looks pretty obvious to me.  It'll be noticed
 when it's committed...
 
 brgds, H-P




Fwd: [RFC][gomp4] Offloading patches (2/3): Add tables generation

2014-01-28 Thread Ilya Verbin
Adding gcc-patches.


Hi Bernd,

The table is used in libgomp (see my patch [1]), as proposed by Jakub
(see [2]).  The idea is that the order of entries in the host and
target tables must be identical.  This makes it possible to set up a
one-to-one correspondence between host and target addresses.
In my patch libgomp calls device_init_func from the plugin, which loads
all libraries to the target and returns the joint table with
addresses for all libraries.  But it's probably better to have a
function like device_load that loads only one library image and
returns a table for that library.
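
To illustrate the lookup (this is only a sketch of the idea, not the
libgomp code; TableEntry and host_to_target are names invented for the
example), a host address can be translated by finding its index in the
host table and reading the entry at the same index in the target table:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical table entry: address and size of one global or function.
struct TableEntry {
    std::uintptr_t addr;
    std::size_t size;
};

// Translate a host address to the corresponding target address.
// Relies on both tables listing the same entities in the same order.
// Returns false if the tables differ in length or the address is unknown.
bool host_to_target(const std::vector<TableEntry>& host,
                    const std::vector<TableEntry>& target,
                    std::uintptr_t host_addr,
                    std::uintptr_t& target_addr) {
    if (host.size() != target.size())
        return false;
    for (std::size_t i = 0; i < host.size(); ++i) {
        if (host[i].addr == host_addr) {
            target_addr = target[i].addr;  // same index, other table
            return true;
        }
    }
    return false;
}

// Usage: the second host entry maps to the second target entry.
inline void demo_table_lookup() {
    std::vector<TableEntry> host = {{0x1000, 8}, {0x2000, 16}};
    std::vector<TableEntry> target = {{0xA000, 8}, {0xB000, 16}};
    std::uintptr_t out = 0;
    assert(host_to_target(host, target, 0x2000, out));
    assert(out == 0xB000);
}
```

A real implementation would probably build a sorted map once instead of
scanning linearly on every lookup; the linear scan just keeps the
example short.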

[1] http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01896.html
[2] http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01392.html

  -- Ilya


2014/1/28 Bernd Schmidt ber...@codesourcery.com

 On 12/17/2013 12:39 PM, Michael V. Zolotukhin wrote:

 Here is a patch 2/3: Add tables generation.

 This patch is just a slightly modified patch sent a couple of weeks ago.
 When compiling with '-fopenmp' the compiler generates a special symbol,
 containing addresses and sizes of globals/omp_fn-functions, and places it
 into a special section.  Later, at linking, these sections are merged
 together and we get a single table with all addresses/sizes for the entire
 binary.  Also, in this patch we start to pass the '__OPENMP_TARGET__'
 symbol to GOMP_target calls.


 I also have a question about the code in this patch. I can see how the table 
 is constructed - what's not clear to me is how it is going to be used? How do 
 you map from a function or variable you want to look up to an index in this 
 table?


 Bernd



Re: Do not produce empty try-finally statements

2014-01-28 Thread Jakub Jelinek
On Tue, Jan 28, 2014 at 01:48:21PM +0100, Michael Matz wrote:
 Hi,
 
 On Tue, 28 Jan 2014, Jakub Jelinek wrote:
 
  There are two kinds of clobbers, the direct ones, which surely can be 
  safely removed by ehcleanup1 if they are the only reason why there is a 
  landing pad which will be rethrown outside of the current function,
 
 You can only safely (as in, not introducing false conflicts for stack 
 slots) remove them before inlining, _if_ the inliner would add them back 
 in ...

So it is just an optimization thing, not correctness, right?
We've been doing it this way since 4.7.0.

  For the direct cleanups, the important thing is that they are necessarily
  local variables in the current function, so when returning from that
  function, whether normally or abnormally, the inliner still has all the info
  about them.  If it wants, it can add the corresponding clobbers itself
 
 ... which it doesn't.

So, if we want to improve it, we should do that in the inliner.
Because, not removing them in the ehcleanup1 means the inliner will see far
more complex functions than really necessary.

Jakub


Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals

2014-01-28 Thread Dodji Seketeli
Here is the patch I am committing right now.

gcc/ChangeLog

* input.c (location_get_source_line): Bail out when line number
is zero, and test the return value of
lookup_or_add_file_to_cache_tab.

gcc/testsuite/ChangeLog

* c-c++-common/cpp/warning-zero-location.c: New test.
* c-c++-common/cpp/warning-zero-location-2.c: Likewise.
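
To illustrate the two guards described in the ChangeLog entry, here is a minimal standalone sketch. It is not the GCC code: `file_cache`, `lookup_file` and `get_source_line` are hypothetical stand-ins for `fcache`, `lookup_or_add_file_to_cache_tab` and `location_get_source_line`, and the one-entry cache is an assumption made purely for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a cached source file (not GCC's fcache).  */
struct file_cache
{
  const char *path;
  const char *first_line;
};

static struct file_cache demo_cache = { "demo.c", "int main() { return 0; }" };

/* Hypothetical lookup: returns NULL when PATH cannot be cached,
   for instance because the file does not exist.  */
static struct file_cache *
lookup_file (const char *path)
{
  if (path == NULL || strcmp (path, demo_cache.path) != 0)
    return NULL;
  return &demo_cache;
}

/* Return the text of LINE in PATH, or NULL when there is nothing to show.  */
const char *
get_source_line (const char *path, int line)
{
  assert (line >= 0);

  /* Guard 1: line number zero does not name a real source line.  */
  if (line == 0)
    return NULL;

  /* Guard 2: the cache lookup itself may fail.  */
  struct file_cache *c = lookup_file (path);
  if (c == NULL)
    return NULL;

  /* The toy cache only knows about line 1.  */
  return line == 1 ? c->first_line : NULL;
}
```

A caller that prints a caret line would simply skip the source excerpt whenever this returns NULL.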

diff --git a/gcc/input.c b/gcc/input.c
index 547c177..63cd062 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -698,7 +698,13 @@ location_get_source_line (expanded_location xloc,
   static char *buffer;
   static ssize_t len;
 
-  fcache * c = lookup_or_add_file_to_cache_tab (xloc.file);
+  if (xloc.line == 0)
+return NULL;
+
+  fcache *c = lookup_or_add_file_to_cache_tab (xloc.file);
+  if (c == NULL)
+return NULL;
+
   bool read = read_line_num (c, xloc.line, buffer, len);
 
   if (read && line_len)
diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-location-2.c 
b/gcc/testsuite/c-c++-common/cpp/warning-zero-location-2.c
new file mode 100644
index 000..c0e0bf7
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/warning-zero-location-2.c
@@ -0,0 +1,10 @@
+/*
+   { dg-options "-D _GNU_SOURCE -fdiagnostics-show-caret" }
+   { dg-do compile }
+ */
+
+#line 4636 "configure"
+#include .h
+int main() { return 0; }
+
+/* { dg-error "No such file or directory" { target *-*-* } 4636 } */
diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-location.c 
b/gcc/testsuite/c-c++-common/cpp/warning-zero-location.c
new file mode 100644
index 000..ca2e102
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/warning-zero-location.c
@@ -0,0 +1,8 @@
+/*
+   { dg-options "-D _GNU_SOURCE -fdiagnostics-show-caret" }
+   { dg-do compile }
+ */
+
+#define _GNU_SOURCE	/* { dg-warning "redefined" } */
+
+/* { dg-message "" "#define _GNU_SOURCE" {target *-*-* } 0 }
-- 
Dodji


Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals

2014-01-28 Thread Dodji Seketeli
Dodji Seketeli do...@redhat.com writes:

 Here is the patch I am committing right now.

 gcc/ChangeLog

   * input.c (location_get_source_line): Bail out when line number
   is zero, and test the return value of
   lookup_or_add_file_to_cache_tab.

 gcc/testsuite/ChangeLog

   * c-c++-common/cpp/warning-zero-location.c: New test.
   * c-c++-common/cpp/warning-zero-location-2.c: Likewise.

I forgot to say that it passed bootstrap and test on
x86_64-unknown-linux-gnu against trunk.

Thanks.

-- 
Dodji


Re: [PATCH] Fix PR59890, improve var-tracking compile-time

2014-01-28 Thread H.J. Lu
On Mon, Jan 20, 2014 at 10:50 AM, Richard Biener rguent...@suse.de wrote:

 This improves var-tracking dataflow convergence by using post order
 on the inverted CFG - which is appropriate for forward dataflow
 problems.  This halves compile-time spent in var-tracking for PR45364
 (it also improves other testcases, but that one seems to be the best
 case).
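
As an illustration of why a post-order on the inverted CFG suits a forward dataflow problem, here is a minimal standalone sketch. It is not the var-tracking code: the diamond-shaped graph, the edge arrays and the function names are all assumptions made for the example. Walking predecessor edges from the exit block and appending each block after its predecessors yields an order in which, for acyclic regions, a block is processed only once its inputs are available.

```c
#include <assert.h>

#define N_BLOCKS 4
#define N_EDGES 4

/* Hypothetical diamond-shaped CFG: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3.  */
static const int edge_src[N_EDGES] = { 0, 0, 1, 2 };
static const int edge_dst[N_EDGES] = { 1, 2, 3, 3 };

static int visited[N_BLOCKS];
int order[N_BLOCKS];
static int n_order;

/* Depth-first walk over predecessor edges, i.e. over the inverted CFG.
   BLOCK is appended only after all of its predecessors were appended.  */
static void
walk_inverted (int block)
{
  visited[block] = 1;
  for (int e = 0; e < N_EDGES; e++)
    if (edge_dst[e] == block && !visited[edge_src[e]])
      walk_inverted (edge_src[e]);
  assert (n_order < N_BLOCKS);
  order[n_order++] = block;
}

/* Fill ORDER with a post-order of the inverted CFG, starting the walk
   at EXIT_BLOCK.  Returns the number of blocks reached.  */
int
inverted_post_order (int exit_block)
{
  n_order = 0;
  for (int i = 0; i < N_BLOCKS; i++)
    visited[i] = 0;
  walk_inverted (exit_block);
  return n_order;
}
```

For the diamond above, starting at block 3, the entry block comes first and the exit block last, so each block follows both of its predecessors.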

 For this to pass bootstrap I had to fix PR59890, two latent
 bugs with the recently added local_get_addr_cache.

 Bootstrapped and tested on x86_64-unknown-linux-gnu, ok for trunk?

 Thanks,
 Richard.

 2014-01-20  Richard Biener  rguent...@suse.de

 PR rtl-optimization/45364
 PR rtl-optimization/59890
 * var-tracking.c (local_get_addr_clear_given_value): Handle
 already cleared slot.
 (val_reset): Handle not allocated local_get_addr_cache.
 (vt_find_locations): Use post-order on the inverted CFG.


This caused:

FAIL: gcc.dg/guality/pr43051-1.c  -O1  line 40 v == 1
FAIL: gcc.dg/guality/pr43051-1.c  -O1  line 41 e == a[1]
FAIL: gcc.dg/guality/pr43051-1.c  -Os  line 40 v == 1
FAIL: gcc.dg/guality/pr43051-1.c  -Os  line 41 e == a[1]

on Linux/x86.

H.J.


Re: configure check for flex

2014-01-28 Thread Andreas Schwab
Hans-Peter Nilsson h...@bitrange.com writes:

 See is_release in that same configure.ac, that might be the only
 additional condition that's needed.

is_release only distinguishes a release from a snapshot, but does not
say anything about whether it's a tarball or a VC checkout (i.e. is_release=yes
is also possible in a VC checkout).

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.


Re: configure check for flex

2014-01-28 Thread Hans-Peter Nilsson
On Tue, 28 Jan 2014, Andreas Schwab wrote:
 Hans-Peter Nilsson h...@bitrange.com writes:

  See is_release in that same configure.ac, that might be the only
  additional condition that's needed.

 is_release only distinguishes a release from a snapshot, but does not
 say anything about whether it's a tarball or a VC checkout (i.e. is_release=yes
 is also possible in a VC checkout).

That difference seems unimportant, at least lesser than a simple
means to avoid the missing-flex gotcha and the we-wont-need-it
gotcha.  But anyhoo, I'll leave it to you to provide a better
alternative.

brgds, H-P


Re: [RFC][gomp4] Offloading patches (1/3): Add '-fopenmp_target' option

2014-01-28 Thread Ilya Verbin
Hi Bernd,

2014/1/27 Bernd Schmidt ber...@codesourcery.com:
 Once I worked around this by unsetting the environment variables around this
 compiler invocation here, the next problem is exposed - the code tries to
 link together files compiled for the target (created by the code quoted
 above) and the host (the _omp_descr file, I believe). Linker errors ensue.

Thanks, that's a bug.  Fortunately, it could be fixed easily.

 As mentioned before, I think all this target-specific code has no place in
 lto-wrapper to begin with. For ptx, we're going to require some quite
 different mechanisms, so I think it might be best to invoke a new tool,
 maybe called $target-gen-offload, which knows how to produce an image that
 can be linked into the host executable. Different offload targets can then
 use different strategies to produce such an image.

That's quite a viable way.  We added all this stuff to these patches
to allow other targets to reuse it as much as possible.  I.e. we
weren't aware whether other targets support objcopy et al., so we proposed
our way so that others could reuse it as-is if everything is
available.  It turned out, that the targets differ much in this place,
so as you suggested, it's better to move all this stuff to
target-specific utils.  Certainly, these patches don't pretend to be a
final version - they are just RFC, and we currently only want to show
what we need and find out what other targets need.

 Probably each such image
 should contain its own code to register itself with libgomp, so that we
 don't have to construct a table.

If other targets use another mapping scheme, then I think these tables
could easily be omitted from host/target executables (e.g. we could
add a corresponding flag to the target images descriptor).  But
personally I believe this part is general enough to satisfy all
targets.  Could you please describe how functions would be invoked on
PTX?

 Some other observations:
  * is OFFLOAD_TARGET_NAMES actually useful, or would any string
generated at link time suffice?

Yep, it might be redundant for now, because all we need is target
compilers.  Target names aren't necessary.

  * Is the user expected to set OFFLOAD_TARGET_COMPILERS, or should
this be done by the gcc driver, possibly based on command line
options (I'd much prefer that)?

It's supposed to be set by the gcc driver.  Initial work in this direction
can be found here:
http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01242.html

  * Do we actually need an -fopenmp-target option? The way I imagine it
(and which was somewhat present in the Makefile patches I posted
last year) is that an offload compiler is specially configured to
know that that's how it will be used, and to know what the host
architecture is. A $target-gen-offload could then be built with
knowledge of the host architecture and installed in the host
compiler's libexec install directory.

Our idea here was that a single x86 compiler could serve both as host
and as target compiler, depending on presence of this option.  If
these compilers are always different, then indeed this option isn't
needed.

 Bernd

  -- Ilya


[PATCH][ARM][committed] Remove useless statement in arm_new_rtx_costs

2014-01-28 Thread Kyrill Tkachov

Hi all,

I've committed the attached trivial patch to remove a useless statement as 
r207193.

Thanks,
Kyrill


2014-01-28  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* config/arm/arm.c (arm_new_rtx_costs): Remove useless statement
at const_int_cost.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index bf5d1b2..bcb64ee 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -10406,7 +10406,6 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
 const_int_cost:
   if (mode == SImode)
 	{
-	  *cost += 0;
 	  *cost += COSTS_N_INSNS (arm_gen_constant (outer_code, SImode, NULL,
 		INTVAL (x), NULL, NULL,
 		0, 0));

[patch] Fix gcc.target/arm/thumb-cbranchqi.c.

2014-01-28 Thread Kazu_Hirata
Hi,

Attached is a patch to fix gcc.target/arm/thumb-cbranchqi.c.

Without this patch, the testcase fails because these days gcc
generates:

ldrb:
ldrb r3, [r0, #8]
mov r0, #2
cmp r3, #127
bls .L2
mov r0, #5  
.L2:
@ sp needed  
bx  lr

Note that we see the bls instruction instead of the bhi
instruction that the testcase expects to see.

The patch fixes the problem by accepting bls also.

Note that this patch does not miss the point of the original issue,
namely PR target/40603:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40603

The point is to take advantage of the [Rn+imm5] addressing mode for
ldrb, which is not available for ldrsb, and an unsigned comparison
against 127.  Since we are loading an unsigned byte and performing an
unsigned comparison, we must have bls or bhi.
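
For readers without the testcase at hand, a function of roughly the following shape would produce the code quoted above. This is a sketch inferred from the assembly only (load of byte 8, compare against 127, return 2 or 5); the actual body of the test function is an assumption, and `ldrb_check` is a hypothetical helper added for illustration.

```c
#include <assert.h>

/* Assumed shape of the test function, inferred from the assembly.  */
int
ldrb (unsigned char *p)
{
  /* Unsigned byte load followed by an unsigned compare against 127,
     so the branch is bls or bhi depending on how the arms are laid out.  */
  if (p[8] <= 127)
    return 2;
  return 5;
}

/* Hypothetical self-check exercising both sides of the comparison.  */
void
ldrb_check (void)
{
  unsigned char buf[9] = { 0 };

  buf[8] = 127;
  assert (ldrb (buf) == 2);
  buf[8] = 128;
  assert (ldrb (buf) == 5);
}
```

Whether the compiler emits bls or bhi only reflects which return value it places on the fall-through path; both keep the unsigned load and compare that the PR was about.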

Tested on arm-none-eabi.  OK to apply?

Kazu Hirata

gcc/testsuite/
2014-01-27  Kazu Hirata  k...@codesourcery.com

* gcc.target/arm/thumb-cbranchqi.c: Accept bls also.

Index: gcc/testsuite/gcc.target/arm/thumb-cbranchqi.c
===
--- gcc/testsuite/gcc.target/arm/thumb-cbranchqi.c  (revision 207119)
+++ gcc/testsuite/gcc.target/arm/thumb-cbranchqi.c  (working copy)
@@ -12,4 +12,4 @@ int ldrb(unsigned char* p)
 
 
 /* { dg-final { scan-assembler "127" } } */
-/* { dg-final { scan-assembler "bhi" } } */
+/* { dg-final { scan-assembler "bhi|bls" } } */


[PATCH] Fix PR58742

2014-01-28 Thread Richard Biener

This implements simple pattern matching for folding of
pointer subtraction GIMPLE IL which is performed in an
integer type.  It handles the simple case, like the
others in associate_plusminus (cases involving cancellation).
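
At the source level, the pattern being folded looks roughly like the sketch below. The function names are hypothetical, and `uintptr_t` is used as the integer type T on the assumption of a flat address space, where converting a pointer to an integer preserves byte offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Before folding: (T)(P + A) - (T)P, with the subtraction done in the
   integer type T.  P + A must stay inside the object P points into.  */
uintptr_t
offset_unfolded (char *p, size_t a)
{
  assert (p != NULL);
  return (uintptr_t) (p + a) - (uintptr_t) p;
}

/* After folding: just (T)A, the pointer no longer matters.  */
uintptr_t
offset_folded (size_t a)
{
  return (uintptr_t) a;
}
```

On such targets both functions return the same value for any in-bounds offset, which is what lets the pass drop the two conversions and the subtraction.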

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2014-01-28  Richard Biener  rguent...@suse.de

PR tree-optimization/58742
* tree-ssa-forwprop.c (associate_plusminus): Handle
pointer subtraction of the form (T)(P + A) - (T)P.

Index: gcc/tree-ssa-forwprop.c
===
*** gcc/tree-ssa-forwprop.c (revision 207180)
--- gcc/tree-ssa-forwprop.c (working copy)
*** associate_plusminus (gimple_stmt_iterato
*** 2543,2548 
--- 2543,2549 
CST +- (CST +- A)  ->  CST +- A
CST +- (A +- CST)  ->  CST +- A
A + ~A ->  -1
+   (T)(P + A) - (T)P  -> (T)A
  
   via commutating the addition and contracting operations to zero
   by reassociation.  */
*** associate_plusminus (gimple_stmt_iterato
*** 2646,2651 
--- 2647,2701 
  gimple_set_modified (stmt, true);
}
}
+ else if (CONVERT_EXPR_CODE_P (def_code) && code == MINUS_EXPR
+   && TREE_CODE (rhs2) == SSA_NAME)
+   {
+ /* (T)(ptr + adj) - (T)ptr -> (T)adj.  */
+ gimple def_stmt2 = SSA_NAME_DEF_STMT (rhs2);
+ if (TREE_CODE (gimple_assign_rhs1 (def_stmt)) == SSA_NAME
+  && is_gimple_assign (def_stmt2)
+  && can_propagate_from (def_stmt2)
+  && CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def_stmt2))
+  && TREE_CODE (gimple_assign_rhs1 (def_stmt2)) == SSA_NAME)
+   {
+ /* Now we have (T)A - (T)ptr.  */
+ tree ptr = gimple_assign_rhs1 (def_stmt2);
+ def_stmt2 = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (def_stmt));
+ if (is_gimple_assign (def_stmt2)
+  && gimple_assign_rhs_code (def_stmt2) == POINTER_PLUS_EXPR
+  && gimple_assign_rhs1 (def_stmt2) == ptr)
+   {
+ /* And finally (T)(ptr + X) - (T)ptr.  */
+ tree adj = gimple_assign_rhs2 (def_stmt2);
+ /* If the conversion of the pointer adjustment to the
+final type requires a sign- or zero-extension we
+have to punt - it is not defined which one is
+correct.  */
+ if (TYPE_PRECISION (TREE_TYPE (rhs1))
+ <= TYPE_PRECISION (TREE_TYPE (adj))
+ || (TREE_CODE (adj) == INTEGER_CST
+  && tree_int_cst_sign_bit (adj) == 0))
+   {
+ if (useless_type_conversion_p (TREE_TYPE (rhs1),
+TREE_TYPE (adj)))
+   {
+ code = TREE_CODE (adj);
+ rhs1 = adj;
+   }
+ else
+   {
+ code = NOP_EXPR;
+ rhs1 = adj;
+   }
+ rhs2 = NULL_TREE;
+ gimple_assign_set_rhs_with_ops (gsi, code, rhs1,
+ NULL_TREE);
+ gcc_assert (gsi_stmt (*gsi) == stmt);
+ gimple_set_modified (stmt, true);
+   }
+   }
+   }
+   }
}
  }
  


Re: -Og bug? (was: [PATCH] libsanitizer demangling using cp-demangle.c)

2014-01-28 Thread Ian Lance Taylor
On Tue, Jan 28, 2014 at 6:36 AM, Thomas Schwinge
tho...@codesourcery.com wrote:
 Avoid 'dc' may be uninitialized warning.

 libiberty/
 * cp-demangle.c (d_demangle_callback): Put __builtin_unreachable
 in place, to help the compiler.

 --- libiberty/cp-demangle.c
 +++ libiberty/cp-demangle.c
 @@ -5824,6 +5824,8 @@ d_demangle_callback (const char *mangled, int options,
   NULL);
 d_advance (di, strlen (d_str (di)));
 break;
 +  default:
 +   __builtin_unreachable ();

You can't call __builtin_unreachable in this code, because libiberty
in stage 1 will be compiled by the host compiler and
__builtin_unreachable is specific to GCC.

This patch is OK if you call abort instead of __builtin_unreachable.
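
The portable form of this idiom can be sketched as follows. The enum, the function names and the returned values are hypothetical; only the `default: abort ();` arm mirrors what is being suggested for d_demangle_callback. Because `abort` does not return, the compiler can see that the variable is assigned on every path that reaches the use.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical enumeration with every value handled in the switch.  */
enum shape_kind { KIND_CIRCLE, KIND_SQUARE };

int
sides (enum shape_kind kind)
{
  int n;

  switch (kind)
    {
    case KIND_CIRCLE:
      n = 0;
      break;
    case KIND_SQUARE:
      n = 4;
      break;
    default:
      abort (); /* We have listed all the cases.  */
    }

  /* N is initialized on every path that gets here.  */
  return n;
}

/* Hypothetical self-check.  */
void
sides_check (void)
{
  assert (sides (KIND_CIRCLE) == 0);
  assert (sides (KIND_SQUARE) == 4);
}
```

Unlike __builtin_unreachable, this compiles with any ISO C host compiler, at the cost of a real call on the (never taken) default path.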

Thanks.

Ian


Re: PATCH: PR target/59672: Add -m16 support for x86

2014-01-28 Thread Uros Bizjak
On Mon, Jan 27, 2014 at 8:44 PM, H.J. Lu hongjiu...@intel.com wrote:

 The .code16gcc directive was added to binutils back in 1999:

 ---
'.code16gcc' provides experimental support for generating 16-bit code
 from gcc, and differs from '.code16' in that 'call', 'ret', 'enter',
 'leave', 'push', 'pop', 'pusha', 'popa', 'pushf', and 'popf'
 instructions default to 32-bit size.  This is so that the stack pointer
 is manipulated in the same way over function calls, allowing access to
 function parameters at the same stack offsets as in 32-bit mode.
 '.code16gcc' also automatically adds address size prefixes where
 necessary to use the 32-bit addressing modes that gcc generates.
 ---

 It encodes 32-bit assembly instructions generated by GCC in 16-bit format
  so that GCC can be used to generate 16-bit instructions.  To do that, the
  .code16gcc directive may be placed at the very beginning of the assembly
  code.  This patch adds -m16 to x86 backend by:

 1. Add -m16 and make it mutually exclusive with -m32, -m64 and -mx32.
 2. Treat -m16 like -m32 so that --32 is passed to assembler.
 3. Output .code16gcc at the very beginning of the assembly code.
 4. Turn off 64-bit ISA when -m16 is used.

 Tested on Linux/x86 and Linux/x86-64.  OK for trunk?

 Thanks.

 H.J.
 ---
 PR target/59672
 * config/i386/gnu-user64.h (SPEC_32): Add m16| to m32.
 (SPEC_X32): Likewise.
 (SPEC_64): Likewise.
 * config/i386/i386.c (ix86_option_override_internal): Turn off
 OPTION_MASK_ISA_64BIT, OPTION_MASK_ABI_X32 and OPTION_MASK_ABI_64
 for TARGET_16BIT.
 (x86_file_start): Output .code16gcc for TARGET_16BIT.
 * config/i386/i386.h (TARGET_16BIT): New macro.
 (TARGET_16BIT_P): Likewise.
 * config/i386/i386.opt: Add m16.
 * doc/invoke.texi: Document -m16.

OK for mainline, needs OK from RMs for a backport.

Please also add the entry to Changes.html, this is a user-visible change.

Thanks,
Uros.


Re: -Og bug?

2014-01-28 Thread Thomas Schwinge
Hi!

On Tue, 28 Jan 2014 06:52:30 -0800, Ian Lance Taylor i...@google.com wrote:
 On Tue, Jan 28, 2014 at 6:36 AM, Thomas Schwinge
 tho...@codesourcery.com wrote:
  Avoid 'dc' may be uninitialized warning.
 
  libiberty/
  * cp-demangle.c (d_demangle_callback): Put __builtin_unreachable
  in place, to help the compiler.

For my own education: why is this not considered a GCC trunk bug?  It is
xgcc/cc1 which is coming up with this (bogus?) warning, but only for -Og
and not for -O0, -O1, etc.?

  --- libiberty/cp-demangle.c
  +++ libiberty/cp-demangle.c
  @@ -5824,6 +5824,8 @@ d_demangle_callback (const char *mangled, int options,
NULL);
  d_advance (di, strlen (d_str (di)));
  break;
  +  default:
  +   __builtin_unreachable ();
 
 You can't call __builtin_unreachable in this code, because libiberty
 in stage 1 will be compiled by the host compiler and
 __builtin_unreachable is specific to GCC.

Right, thanks for catching that.

 This patch is OK if you call abort instead of __builtin_unreachable.

As soon as I'm clear that this is not in fact a GCC bug, I'll commit the
following.  stdlib.h already is being included.  Source code comment
snatched from regex.c.

Avoid 'dc' may be uninitialized warning.

libiberty/
* cp-demangle.c (d_demangle_callback): Put an abort call in
place, to help the compiler.

--- libiberty/cp-demangle.c
+++ libiberty/cp-demangle.c
@@ -5824,6 +5824,8 @@ d_demangle_callback (const char *mangled, int options,
  NULL);
d_advance (di, strlen (d_str (di)));
break;
+  default:
+   abort (); /* We have listed all the cases.  */
   }
 
 /* If DMGL_PARAMS is set, then if we didn't consume the entire


Grüße,
 Thomas




Re: [patch] Fix gcc.target/arm/thumb-cbranchqi.c.

2014-01-28 Thread Richard Earnshaw
On 28/01/14 14:16, kazu_hir...@mentor.com wrote:
 Hi,
 
 Attached is a patch to fix gcc.target/arm/thumb-cbranchqi.c.
 
 Without this patch, the testcase fails because these days gcc
 generates:
 
 ldrb:
 ldrb r3, [r0, #8]
 mov r0, #2
 cmp r3, #127
 bls .L2
 mov r0, #5  
 .L2:
 @ sp needed  
 bx  lr
 
 Note that we see the bls instruction instead of the bhi
 instruction that the testcase expects to see.
 
 The patch fixes the problem by accepting bls also.
 
 Note that this patch does not miss the point of the original issue,
 namely PR target/40603:
 
 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40603
 
 The point is to take advantage of the [Rn+imm5] addressing mode for
 ldrb, which is not available for ldrsb, and an unsigned comparison
 against 127.  Since we are loading an unsigned byte and performing an
 unsigned comparison, we must have bls or bhi.
 
 Tested on arm-none-eabi.  OK to apply?
 
 Kazu Hirata
 
 gcc/testsuite/
 2014-01-27  Kazu Hirata  k...@codesourcery.com
 
   * gcc.target/arm/thumb-cbranchqi.c: Accept bls also.
 
 Index: gcc/testsuite/gcc.target/arm/thumb-cbranchqi.c
 ===
 --- gcc/testsuite/gcc.target/arm/thumb-cbranchqi.c(revision 207119)
 +++ gcc/testsuite/gcc.target/arm/thumb-cbranchqi.c(working copy)
 @@ -12,4 +12,4 @@ int ldrb(unsigned char* p)
  
  
  /* { dg-final { scan-assembler "127" } } */
 -/* { dg-final { scan-assembler "bhi" } } */
 +/* { dg-final { scan-assembler "bhi|bls" } } */
 

OK.

R.




Re: PATCH: PR target/59672: Add -m16 support for x86

2014-01-28 Thread H.J. Lu
On Tue, Jan 28, 2014 at 8:30 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Tue, Jan 28, 2014 at 5:01 PM, Uros Bizjak ubiz...@gmail.com wrote:
 On Mon, Jan 27, 2014 at 8:44 PM, H.J. Lu hongjiu...@intel.com wrote:

 The .code16gcc directive was added to binutils back in 1999:

 ---
'.code16gcc' provides experimental support for generating 16-bit code
 from gcc, and differs from '.code16' in that 'call', 'ret', 'enter',
 'leave', 'push', 'pop', 'pusha', 'popa', 'pushf', and 'popf'
 instructions default to 32-bit size.  This is so that the stack pointer
 is manipulated in the same way over function calls, allowing access to
 function parameters at the same stack offsets as in 32-bit mode.
 '.code16gcc' also automatically adds address size prefixes where
 necessary to use the 32-bit addressing modes that gcc generates.
 ---

 It encodes 32-bit assembly instructions generated by GCC in 16-bit format
  so that GCC can be used to generate 16-bit instructions.  To do that, the
  .code16gcc directive may be placed at the very beginning of the assembly
  code.  This patch adds -m16 to x86 backend by:

 1. Add -m16 and make it mutually exclusive with -m32, -m64 and -mx32.
 2. Treat -m16 like -m32 so that --32 is passed to assembler.
 3. Output .code16gcc at the very beginning of the assembly code.
 4. Turn off 64-bit ISA when -m16 is used.

 Tested on Linux/x86 and Linux/x86-64.  OK for trunk?

 Thanks.

 H.J.
 ---
 PR target/59672
 * config/i386/gnu-user64.h (SPEC_32): Add m16| to m32.
 (SPEC_X32): Likewise.
 (SPEC_64): Likewise.
 * config/i386/i386.c (ix86_option_override_internal): Turn off
 OPTION_MASK_ISA_64BIT, OPTION_MASK_ABI_X32 and OPTION_MASK_ABI_64
 for TARGET_16BIT.
 (x86_file_start): Output .code16gcc for TARGET_16BIT.
 * config/i386/i386.h (TARGET_16BIT): New macro.
 (TARGET_16BIT_P): Likewise.
 * config/i386/i386.opt: Add m16.
 * doc/invoke.texi: Document -m16.

 OK for mainline, needs OK from RMs for a backport.

 Please also add the entry to Changes.html, this is a user-visible change.

 Oh, a short scan-asm testcase would be nice, too.


scan-asm testcase doesn't do anything useful.  The only
difference in assembly code between -m16 and -m32 is the
.code16gcc directive.  All magic is done in assembler.

Here is a run-time test.  It builds 16-bit image for BIOS and
loads it into qemu-system-i386.  OK to install?

Thanks.

-- 
H.J.
---
PR target/59672
* gcc.target/i386/m16/Makefile: New file.
* gcc.target/i386/m16/m16.exp: Likewise.
* gcc.target/i386/m16/include16/string.h: Likewise.
* gcc.target/i386/m16/include16/sys16.h: Likewise.
* gcc.target/i386/m16/lib16/conio.c: Likewise.
* gcc.target/i386/m16/lib16/crt0.c: Likewise.
* gcc.target/i386/m16/lib16/exit.c: Likewise.
* gcc.target/i386/m16/lib16/strlen.c: Likewise.
* gcc.target/i386/m16/run/run16.c: Likewise.
* gcc.target/i386/m16/run/wrap.S: Likewise.
* gcc.target/i386/m16/run/wwrap.S: Likewise.
* gcc.target/i386/m16/test16/test1.c: Likewise.
From 67f20cece3a500c26ace8992d868bbaeec53e4a4 Mon Sep 17 00:00:00 2001
From: H.J. Lu hjl.to...@gmail.com
Date: Thu, 23 Jan 2014 06:51:14 -0800
Subject: [PATCH 2/2] Initial -m16 test

	PR target/59672
	* gcc.target/i386/m16/Makefile: New file.
	* gcc.target/i386/m16/m16.exp: Likewise.
	* gcc.target/i386/m16/include16/string.h: Likewise.
	* gcc.target/i386/m16/include16/sys16.h: Likewise.
	* gcc.target/i386/m16/lib16/conio.c: Likewise.
	* gcc.target/i386/m16/lib16/crt0.c: Likewise.
	* gcc.target/i386/m16/lib16/exit.c: Likewise.
	* gcc.target/i386/m16/lib16/strlen.c: Likewise.
	* gcc.target/i386/m16/run/run16.c: Likewise.
	* gcc.target/i386/m16/run/wrap.S: Likewise.
	* gcc.target/i386/m16/run/wwrap.S: Likewise.
	* gcc.target/i386/m16/test16/test1.c: Likewise.
---
 gcc/ChangeLog.m16  |  19 ++
 gcc/testsuite/gcc.target/i386/m16/Makefile |  89 ++
 .../gcc.target/i386/m16/include16/string.h |   6 +
 .../gcc.target/i386/m16/include16/sys16.h  |  24 ++
 gcc/testsuite/gcc.target/i386/m16/lib16/conio.c|  19 ++
 gcc/testsuite/gcc.target/i386/m16/lib16/crt0.c |   9 +
 gcc/testsuite/gcc.target/i386/m16/lib16/exit.c |   9 +
 gcc/testsuite/gcc.target/i386/m16/lib16/strlen.c   |  11 +
 gcc/testsuite/gcc.target/i386/m16/m16.exp  |  55 
 gcc/testsuite/gcc.target/i386/m16/run/run16.c  | 312 +
 gcc/testsuite/gcc.target/i386/m16/run/wrap.S   | 180 
 gcc/testsuite/gcc.target/i386/m16/run/wwrap.S  |  13 +
 gcc/testsuite/gcc.target/i386/m16/test16/test1.c   |   7 +
 13 files changed, 753 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/m16/Makefile
 create mode 100644 gcc/testsuite/gcc.target/i386/m16/include16/string.h
 create mode 100644 gcc/testsuite/gcc.target/i386/m16/include16/sys16.h
 create mode 100644 gcc/testsuite/gcc.target/i386/m16/lib16/conio.c
 create mode 100644 

Re: PATCH: PR target/59672: Add -m16 support for x86

2014-01-28 Thread Uros Bizjak
On Tue, Jan 28, 2014 at 5:35 PM, H.J. Lu hjl.to...@gmail.com wrote:

 The .code16gcc directive was added to binutils back in 1999:

 scan-asm testcase doesn't do anything useful.  The only
 difference in assembly code between -m16 and -m32 is the
 .code16gcc directive.  All magic is done in assembler.

The test would just pass -m16 in dg-options and scan for the above
directive. It is a simple test that -m16 works as expected.
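
Such a test could look roughly like the sketch below. The function `add16` and the choice of file contents are hypothetical; `dg-do`, `dg-options`, `dg-final` and `scan-assembler` are the usual testsuite directives, and the pattern scans for the directive name without the leading dot to sidestep regexp quoting.

```c
/* { dg-do compile } */
/* { dg-options "-m16" } */

/* Any function will do; the test only cares about the file prologue.  */
int
add16 (int a, int b)
{
  return a + b;
}

/* { dg-final { scan-assembler "code16gcc" } } */
```

This only verifies that the option is accepted and that the directive is emitted; it says nothing about whether the assembled code runs in 16-bit mode.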

Uros.


Re: PATCH: PR target/59672: Add -m16 support for x86

2014-01-28 Thread H.J. Lu
On Tue, Jan 28, 2014 at 8:01 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Mon, Jan 27, 2014 at 8:44 PM, H.J. Lu hongjiu...@intel.com wrote:

 The .code16gcc directive was added to binutils back in 1999:

 ---
'.code16gcc' provides experimental support for generating 16-bit code
 from gcc, and differs from '.code16' in that 'call', 'ret', 'enter',
 'leave', 'push', 'pop', 'pusha', 'popa', 'pushf', and 'popf'
 instructions default to 32-bit size.  This is so that the stack pointer
 is manipulated in the same way over function calls, allowing access to
 function parameters at the same stack offsets as in 32-bit mode.
 '.code16gcc' also automatically adds address size prefixes where
 necessary to use the 32-bit addressing modes that gcc generates.
 ---

 It encodes 32-bit assembly instructions generated by GCC in 16-bit format
  so that GCC can be used to generate 16-bit instructions.  To do that, the
  .code16gcc directive may be placed at the very beginning of the assembly
  code.  This patch adds -m16 to x86 backend by:

 1. Add -m16 and make it mutually exclusive with -m32, -m64 and -mx32.
 2. Treat -m16 like -m32 so that --32 is passed to assembler.
 3. Output .code16gcc at the very beginning of the assembly code.
 4. Turn off 64-bit ISA when -m16 is used.

 Tested on Linux/x86 and Linux/x86-64.  OK for trunk?

 Thanks.

 H.J.
 ---
 PR target/59672
 * config/i386/gnu-user64.h (SPEC_32): Add m16| to m32.
 (SPEC_X32): Likewise.
 (SPEC_64): Likewise.
 * config/i386/i386.c (ix86_option_override_internal): Turn off
 OPTION_MASK_ISA_64BIT, OPTION_MASK_ABI_X32 and OPTION_MASK_ABI_64
 for TARGET_16BIT.
 (x86_file_start): Output .code16gcc for TARGET_16BIT.
 * config/i386/i386.h (TARGET_16BIT): New macro.
 (TARGET_16BIT_P): Likewise.
 * config/i386/i386.opt: Add m16.
 * doc/invoke.texi: Document -m16.

 OK for mainline, needs OK from RMs for a backport.

Checked in.

 Please also add the entry to Changes.html, this is a user-visible change.


Here is the patch for changes.html.  OK to install?

Thanks.

-- 
H.J.
---
Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.9/changes.html,v
retrieving revision 1.53
diff -u -p -r1.53 changes.html
--- changes.html 21 Jan 2014 08:34:06 - 1.53
+++ changes.html 28 Jan 2014 16:26:19 -
@@ -415,6 +415,9 @@ auto incr = [](auto x) { return x++; };
   well on the most current Intel processors, which are Haswell
   and Silvermont for GCC 4.9.
 </li>
+<li>Support to encode 32-bit assembly instructions in 16-bit format
+  is now available through the <code>-m16</code> option.
+</li>
 <li>Better inlining of <code>memcpy</code> and <code>memset</code>
  that is aware of value ranges and produces shorter alignment prologues.
 </li>


Re: PATCH: PR target/59672: Add -m16 support for x86

2014-01-28 Thread Uros Bizjak
On Tue, Jan 28, 2014 at 5:01 PM, Uros Bizjak ubiz...@gmail.com wrote:
 On Mon, Jan 27, 2014 at 8:44 PM, H.J. Lu hongjiu...@intel.com wrote:

 The .code16gcc directive was added to binutils back in 1999:

 ---
'.code16gcc' provides experimental support for generating 16-bit code
 from gcc, and differs from '.code16' in that 'call', 'ret', 'enter',
 'leave', 'push', 'pop', 'pusha', 'popa', 'pushf', and 'popf'
 instructions default to 32-bit size.  This is so that the stack pointer
 is manipulated in the same way over function calls, allowing access to
 function parameters at the same stack offsets as in 32-bit mode.
 '.code16gcc' also automatically adds address size prefixes where
 necessary to use the 32-bit addressing modes that gcc generates.
 ---

 It encodes 32-bit assembly instructions generated by GCC in 16-bit format
  so that GCC can be used to generate 16-bit instructions.  To do that, the
  .code16gcc directive may be placed at the very beginning of the assembly
  code.  This patch adds -m16 to x86 backend by:

 1. Add -m16 and make it mutually exclusive with -m32, -m64 and -mx32.
 2. Treat -m16 like -m32 so that --32 is passed to assembler.
 3. Output .code16gcc at the very beginning of the assembly code.
 4. Turn off 64-bit ISA when -m16 is used.

 Tested on Linux/x86 and Linux/x86-64.  OK for trunk?

 Thanks.

 H.J.
 ---
 PR target/59672
 * config/i386/gnu-user64.h (SPEC_32): Add m16| to m32.
 (SPEC_X32): Likewise.
 (SPEC_64): Likewise.
 * config/i386/i386.c (ix86_option_override_internal): Turn off
 OPTION_MASK_ISA_64BIT, OPTION_MASK_ABI_X32 and OPTION_MASK_ABI_64
 for TARGET_16BIT.
 (x86_file_start): Output .code16gcc for TARGET_16BIT.
 * config/i386/i386.h (TARGET_16BIT): New macro.
 (TARGET_16BIT_P): Likewise.
 * config/i386/i386.opt: Add m16.
 * doc/invoke.texi: Document -m16.

 OK for mainline, needs OK from RMs for a backport.

 Please also add the entry to Changes.html, this is a user-visible change.

Oh, a short scan-asm testcase would be nice, too.

Thanks,
Uros.


RE: [PING] [PATCH] _Cilk_for for C and C++

2014-01-28 Thread Iyer, Balaji V


 -Original Message-
 From: Iyer, Balaji V
 Sent: Monday, January 27, 2014 4:36 PM
 To: Jakub Jelinek
 Cc: Jason Merrill; 'Jeff Law'; 'Aldy Hernandez'; 'gcc-patches@gcc.gnu.org';
 'r...@redhat.com'
 Subject: RE: [PING] [PATCH] _Cilk_for for C and C++
 
  -Original Message-
  From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
  ow...@gcc.gnu.org] On Behalf Of Jakub Jelinek
  Sent: Monday, January 27, 2014 3:50 PM
  To: Iyer, Balaji V
  Cc: Jason Merrill; 'Jeff Law'; 'Aldy Hernandez';
  'gcc-patches@gcc.gnu.org'; 'r...@redhat.com'
  Subject: Re: [PING] [PATCH] _Cilk_for for C and C++
 
  On Mon, Jan 27, 2014 at 08:41:14PM +, Iyer, Balaji V wrote:
 Did you get a chance to look at this _Cilk_for patch?
 
  IMHO it is not as much work as you are fearing, at most a few hours of
  work to get it right, and well worth doing.  So, please at least try
  it out and if you get stuck with it, explain why.
 
 Hi Jakub,
   I tried it that way in the original patch submission for C
 (http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01369.html), but it hit a
 dead-end when I was trying to get STL iterators working for C++. This is why I
 re-structured things this way to get them both working.
 
 Thanks,
 
 Balaji V. Iyer.
 

Hi Jakub,
I thought about it a bit more, and the main issue here is that we need 
access to the _Cilk_for loop's components both inside the child function and 
the parent function.

So, at the moment, I have modelled the _Cilk_for as something like this:

#pragma omp for  schedule (runtime: grain)
_Cilk_for (vector<int>::iterator ii = array.begin (); ii != array.end (); ii++)
#pragma omp parallel 
{
body
}

From what I understand, you feel this is a bit ugly and you want this to be 
modelled something like this?

#pragma omp parallel for schedule (runtime: grain)
_Cilk_for (vector<int>::iterator ii = array.begin (); ii != array.end(); ii++)
{
body
}

Am I right?

As it stands, doing it the way you suggested did not work when we have
iterators, since the iterator expansion is pushed inside the child function and
its expanded variables are not accessible outside the child function by
gimplify_omp_for. That is, the expansion is put after #pragma omp parallel for,
and that is all pulled into the child function, and thus the information to
compute the count is lost for the parent function.

There is a hack that I think may get around this. This is a bit ugly and really 
is not the way I would think of _Cilk_fors. I am OK with trying this if you 
will accept it.
 
If I do something like this:
 
#pragma omp parallel for schedule (runtime:grain) if ((array.end() - 
array.begin ())/1)
_Cilk_for (vector<int>::iterator ii = array.begin (); ii != array.end (); ii++)
{
body
}

The new addition is the if clause, of the form if ((end - start) / step).

Then, in expand_parallel_task, I can extract the if (...) clause and 
pass its expression as the parameter for the loop-count. Yes, it's a bit ugly but 
if you are willing to accept it, I can try to implement this.
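
For illustration only, here is a minimal C++ sketch of the idea that the parent computes the trip count (end - start) / step before handing the body to a child function. The names loop_count, child_function and parent_function are hypothetical and are not part of GCC or the Cilk runtime; the sketch assumes random-access iterators and a positive step that evenly divides the distance.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: trip count of
//   for (it = begin; it != end; it += step)
// assuming step > 0 and step evenly divides (end - begin).
template <typename RandomIt>
std::ptrdiff_t loop_count(RandomIt begin, RandomIt end, std::ptrdiff_t step) {
  assert(step > 0);
  return (end - begin) / step;
}

// Hypothetical stand-in for the outlined child function: it only sees an
// index range [lo, hi) and rebuilds the iterator from the index.
template <typename RandomIt, typename Body>
void child_function(RandomIt begin, std::ptrdiff_t step,
                    std::ptrdiff_t lo, std::ptrdiff_t hi, Body body) {
  for (std::ptrdiff_t i = lo; i < hi; ++i)
    body(begin + i * step);
}

// Hypothetical stand-in for the parent: the count is computed here, so the
// iterator bounds must be visible outside the child function.
template <typename RandomIt, typename Body>
std::ptrdiff_t parent_function(RandomIt begin, RandomIt end,
                               std::ptrdiff_t step, Body body) {
  std::ptrdiff_t count = loop_count(begin, end, step);
  child_function(begin, step, 0, count, body);
  return count;
}
```

A real runtime would split [0, count) into grains and run them in parallel; the sketch runs a single chunk serially to keep the data flow visible.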

Please let me know.

Thanks,

Balaji V. Iyer.


Re: PATCH: PR target/59672: Add -m16 support for x86

2014-01-28 Thread Jakub Jelinek
On Tue, Jan 28, 2014 at 08:35:13AM -0800, H.J. Lu wrote:
 Here is a run-time test.  It builds 16-bit image for BIOS and
 loads it into qemu-system-i386.  OK to install?

Ugh, I'd say we don't want this kind of stuff in gcc testsuite.
A scan-assembler would be IMHO enough.

 PR target/59672
 * gcc.target/i386/m16/Makefile: New file.
 * gcc.target/i386/m16/m16.exp: Likewise.
 * gcc.target/i386/m16/include16/string.h: Likewise.
 * gcc.target/i386/m16/include16/sys16.h: Likewise.
 * gcc.target/i386/m16/lib16/conio.c: Likewise.
 * gcc.target/i386/m16/lib16/crt0.c: Likewise.
 * gcc.target/i386/m16/lib16/exit.c: Likewise.
 * gcc.target/i386/m16/lib16/strlen.c: Likewise.
 * gcc.target/i386/m16/run/run16.c: Likewise.
 * gcc.target/i386/m16/run/wrap.S: Likewise.
 * gcc.target/i386/m16/run/wwrap.S: Likewise.
 * gcc.target/i386/m16/test16/test1.c: Likewise.

Jakub


Re: C++ PATCH for c++/54652 (ICE with repeated typedef/attribute)

2014-01-28 Thread Jason Merrill
It occurred to me that we don't need to call merge_types at all if we're 
just going to throw away the result.


commit 00a4445cf80b647d14144c9b509cf06d052a888e
Author: Jason Merrill ja...@redhat.com
Date:   Tue Jan 28 08:50:23 2014 -0500

	* decl.c (duplicate_decls): Tweak.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index c93c783..aca96fc 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -1923,13 +1923,13 @@ duplicate_decls (tree newdecl, tree olddecl, bool newdecl_is_friend)
   if (TREE_CODE (newdecl) == FUNCTION_DECL)
 	maybe_instantiate_noexcept (olddecl);
 
-  /* Merge the data types specified in the two decls.  */
-  newtype = merge_types (TREE_TYPE (newdecl), TREE_TYPE (olddecl));
-
   /* For typedefs use the old type, as the new type's DECL_NAME points
 	 at newdecl, which will be ggc_freed.  */
   if (TREE_CODE (newdecl) == TYPE_DECL)
 	newtype = oldtype;
+  else
+	/* Merge the data types specified in the two decls.  */
+	newtype = merge_types (TREE_TYPE (newdecl), TREE_TYPE (olddecl));
 
   if (VAR_P (newdecl))
 	{


Re: PATCH: PR target/59672: Add -m16 support for x86

2014-01-28 Thread H.J. Lu
On Tue, Jan 28, 2014 at 8:42 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Tue, Jan 28, 2014 at 5:35 PM, H.J. Lu hjl.to...@gmail.com wrote:

 The .code16gcc directive was added to binutils back in 1999:

 scan-asm testcase doesn't do anything useful.  The only
 difference in assembly code between -m16 and -m32 is the
 .code16gcc directive.  All magic is done in assembler.

 The test would just pass -m16 in dg-options and scan for the above
 directive. It is a simple test that -m16 works as expected.


scan-asm doesn't work with -m32 on Linux/x86-64 with

RUNTESTFLAGS=--target_board='unix{-m32}'

since -m32 is appended after any dg-options:

Executing on host:
/export/build/gnu/gcc-m16/build-x86_64-linux/gcc/xgcc
-B/export/build/gnu/gcc-m16/build-x86_64-linux/gcc/
/export/gnu/import/git/gcc-misc/gcc/testsuite/gcc.target/i386/m16-1.c
-fno-diagnostics-show-caret -fdiagnostics-color=never  -ansi
-pedantic-errors -m16 -ffat-lto-objects -S  -m32 -o m16-1.s
(timeout = 300)

Is there a way to change it?

-- 
H.J.


Re: -Og bug?

2014-01-28 Thread Ian Lance Taylor
On Tue, Jan 28, 2014 at 8:11 AM, Thomas Schwinge
tho...@codesourcery.com wrote:
 On Tue, 28 Jan 2014 06:52:30 -0800, Ian Lance Taylor i...@google.com wrote:
 On Tue, Jan 28, 2014 at 6:36 AM, Thomas Schwinge
 tho...@codesourcery.com wrote:
  Avoid 'dc' may be uninitialized warning.
 
  libiberty/
  * cp-demangle.c (d_demangle_callback): Put __builtin_unreachable
  in place, to help the compiler.

 For my own education: why is this not considered a GCC trunk bug?  It is
 xgcc/cc1 which is coming up with this (bogus?) warning, but only for -Og
 and not for -O0, -O1, etc.?

I don't really have an opinion on whether this is a bug in GCC or
not.  Since libiberty is compiled by other compilers, I think your
cp-demangle.c patch is reasonable and appropriate either way.

In particular, it's not a bug for the compiler to consider the
possibility that type may take on a value not named in the enum.
C/C++ impose no restrictions on values of enum type.  It's valid to
write code that stores a value that is not an enum constant into a
variable of enum type, so it's reasonable for the compiler to consider
the possibility, even though we can clearly see that it can not
happen.
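
As a hedged illustration of that point (this is not the cp-demangle.c code; the enum and function names are made up), the following C++ sketch shows a switch that covers every named enumerator and still can be entered with a value that matches none of them. The sketch uses a default label with a sentinel instead of the __builtin_unreachable from the patch, so that it stays well defined when such a value is passed.

```cpp
#include <cassert>

// Hypothetical enum with three named enumerators (values 0, 1, 2).
enum Kind { KIND_NAME, KIND_TYPE, KIND_ENCODING };

// Every named enumerator is handled, but the compiler may still assume
// that k holds some other value, so dc needs a value on that path too.
int component_for(Kind k) {
  int dc;
  switch (k) {
    case KIND_NAME:     dc = 10; break;
    case KIND_TYPE:     dc = 20; break;
    case KIND_ENCODING: dc = 30; break;
    default:            dc = -1; break;  // value not named in the enum
  }
  return dc;
}
```

Without the default label, dc would be read uninitialized whenever k holds a value such as static_cast<Kind>(3), which is the path the warning is about.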

I don't know why the compiler reports a different warning for -O1 and
-Og.  I encourage you to reduce the code into a standalone test case
and file a bug report.

Ian


Re: Do not produce empty try-finally statements

2014-01-28 Thread Jan Hubicka
 Hi,
 
 On Tue, 28 Jan 2014, Richard Biener wrote:
 
   The EH optimizations involving cleanups with only clobbers in them 
   are that if at the end of the cleanup after only CLOBBER stmts you 
   would rethrow the exception externally, then the clobber isn't needed 
   and the whole cleanup can be removed.  And, if it rethrows somewhere 
   internally, we can move the clobber stmts to the landing pad of 
   wherever it would be caught.
  
   OK, I still do not see how ehcleanup1 can then safely remove them 
   pre-inline given that the whole function body can be inlined into 
   another containing the outer EH region.
  
  That's true.
 
 Yes, and I think they should be removed only after inlining.  

In that case I think I should look into detecting clobber-only EH in the inliner and
not account for it in the size/time estimates.  I always wondered why tramp3d
becomes so much harder for the inliner with EH enabled, I seem to get it now ;)

 (Alternatively the inliner could be extended to add clobbers when changing 
 an external-throw into an internal-throw, but well, ...)
 
If this is valid, why we can not just eliminate EH in those outer 
   clobber try..finally as part of ehlowering earlier?
  
  Probably we'd miss too many inlining cases from early inlining and the 
  ehcleanup1 time is just a heuristic that works good enough for us?
 
 No, removing even more clobbers would remove even more stack slot sharing.  
 If anything we should remove _less_ regions (as in the precondition for 
 Honzas sentence above, If this is valid, ... simply is false).  AFAIU 
 this all is just a problem with O0 code quality.  So, there's the obvious 
 solution: run ehcleanup also for O0.

It is also a problem of inliner decision quality and memory use/compile time.
The in-memory representation of unnecessary EH is quite big.

I am quite ignorant in this area, but for -O0 can't we simply disable all
clobbers?

Honza
 
 
 Ciao,
 Michael.


Ping Re: Fix IBM long double spurious overflows

2014-01-28 Thread Joseph S. Myers
On Mon, 6 Jan 2014, David Edelsohn wrote:

 On Sat, Jan 4, 2014 at 8:16 AM, Joseph S. Myers jos...@codesourcery.com 
 wrote:
  This patch fixes various cases of spurious overflow exceptions in the
  IBM long double support code.  The generic issue is that an initial
  approximation is computed by using the relevant arithmetic operation
  on the high parts of the operands - but this may overflow double in
  some cases where the final result is large but still a long way (up to
  around 2^53 ulp) from overflowing long double.  For division overflow
  could occur not just from the initial a / c division but also from the
  subsequent multiplication of the result by c (in some cases where a is
  DBL_MAX, say), when the final result of the division need not be large
  at all.
 
  __gcc_qadd already tried to handle such overflow cases, but detected
  them by examining the result of the addition of high parts - which
  leaves a spurious overflow exception raised even if it returns the
  correct non-overflowing value.  This patch instead checks the operands
  and does appropriate scaling, in all of __gcc_qadd, __gcc_qmul and
  __gcc_qdiv, to avoid spurious overflow exceptions arising as well as
  avoiding the bad results arising from such overflows.
 
  Tested with no regressions with cross to powerpc-linux-gnu (and also
  ran the glibc libm tests, which provide a rather more thorough test of
  floating-point arithmetic than the GCC testsuite; this patch fixes, at
  least, bad results from cbrtl (LDBL_MAX) that arose from the second
  division issue mentioned, as well as the specific cases shown in the
  tests added to the GCC testsuite).  OK to commit?
 
 Before we go farther down this path, IBM needs to internally decide on
 the end goal and the amount of language / library conformance that
 makes sense for the IBM long double format.

Ping.  Original patch: 
http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00157.html.
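
To make the quoted description concrete, here is a small sketch (not the libgcc code, and not the patch) of how adding only the high parts can raise a spurious overflow. It assumes IEEE 754 binary64 doubles, round-to-nearest, and a working <cfenv>; the helper name add_raises_overflow is hypothetical. Take a long double whose high part is DBL_MAX and add one whose high part is 2^970: the sum of the high parts rounds to infinity, although the full sum can still be finite when the low parts are negative. Halving both operands first, as a stand-in for the "appropriate scaling" mentioned above, keeps the intermediate in range.

```cpp
#include <cassert>
#include <cfenv>
#include <cfloat>
#include <cmath>

// Hypothetical helper: returns true if x + y raises FE_OVERFLOW.
// The volatile locals keep the compiler from folding the addition
// at compile time or moving it across the fenv calls.
bool add_raises_overflow(double x, double y) {
  volatile double vx = x;
  volatile double vy = y;
  std::feclearexcept(FE_ALL_EXCEPT);
  volatile double sum = vx + vy;
  (void)sum;
  return std::fetestexcept(FE_OVERFLOW) != 0;
}
```

With these assumptions, add_raises_overflow(DBL_MAX, std::ldexp(1.0, 970)) reports an overflow, while the halved operands 0.5 * DBL_MAX and std::ldexp(1.0, 969) add up to 2^1023 without raising it.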

-- 
Joseph S. Myers
jos...@codesourcery.com


C++ PATCH for c++/53756 (-g and C++1y auto)

2014-01-28 Thread Jason Merrill
My earlier work to support return type deduction omitted support for 
debugging information; this patch fixes that oversight.  It also 
corrects the mangled name of 'operator auto', which should reflect the 
'auto' rather than the deduced return type.
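
As a small, hedged illustration (C++14 or later; the free function to_int is hypothetical), this is the kind of conversion function the mangling change is about. The return type is deduced as int here, yet per the test added below the symbol is expected to encode 'auto' (the Da in _ZN1AcvDaEv) rather than int.

```cpp
#include <cassert>

struct A {
  // Conversion function with a deduced return type; deduced as int here.
  operator auto() { return 42; }
};

// Hypothetical helper that triggers the implicit conversion A -> int.
int to_int(A a) { return a; }
```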


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 2b1d597e10480dfc2aefe251990cacc2af11d7cc
Author: Jason Merrill ja...@redhat.com
Date:   Wed Nov 6 12:02:55 2013 -0500

	PR c++/53756
gcc/
	* dwarf2out.c (auto_die): New static.
	(gen_type_die_with_usage): Handle C++1y 'auto'.
	(gen_subprogram_die): If in-class DIE had 'auto', emit type again
	on definition.
gcc/cp/
	* mangle.c (write_unqualified_name): Handle operator auto.

diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c
index be3c698..add73cf 100644
--- a/gcc/cp/mangle.c
+++ b/gcc/cp/mangle.c
@@ -1231,6 +1231,9 @@ write_unqualified_name (const tree decl)
 	  fn_type = get_mostly_instantiated_function_type (decl);
 	  type = TREE_TYPE (fn_type);
 	}
+	  else if (FNDECL_USED_AUTO (decl))
+	type = (DECL_STRUCT_FUNCTION (decl)->language
+		->x_auto_return_pattern);
 	  else
 	type = DECL_CONV_FN_TYPE (decl);
 	  write_conversion_operator_name (type);
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 22282d8..f6efd1f 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -247,6 +247,9 @@ static GTY(()) bool cold_text_section_used = false;
 /* The default cold text section.  */
 static GTY(()) section *cold_text_section;
 
+/* The DIE for C++1y 'auto' in a function return type.  */
+static GTY(()) dw_die_ref auto_die;
+
 /* Forward declarations for functions defined in this file.  */
 
 static char *stripattributes (const char *);
@@ -17999,6 +18002,13 @@ gen_subprogram_die (tree decl, dw_die_ref context_die)
 	add_AT_file (subr_die, DW_AT_decl_file, file_index);
 	  if (get_AT_unsigned (old_die, DW_AT_decl_line) != (unsigned) s.line)
 	add_AT_unsigned (subr_die, DW_AT_decl_line, s.line);
+
+	  /* If the prototype had an 'auto' return type, emit the real
+	 type on the definition die.  */
+	  if (is_cxx() && debug_info_level > DINFO_LEVEL_TERSE
+	      && get_AT_ref (old_die, DW_AT_type) == auto_die)
+	add_type_attribute (subr_die, TREE_TYPE (TREE_TYPE (decl)),
+0, 0, context_die);
 	}
 }
   else
@@ -19820,6 +19830,25 @@ gen_type_die_with_usage (tree type, dw_die_ref context_die,
   break;
 
 default:
+  // A C++ function with deduced return type can have
+  // a TEMPLATE_TYPE_PARM named 'auto' in its type.
+  if (is_cxx ())
+	{
+	  tree name = TYPE_NAME (type);
+	  if (TREE_CODE (name) == TYPE_DECL)
+	name = DECL_NAME (name);
+	  if (name == get_identifier ("auto"))
+	{
+	  if (!auto_die)
+		{
+		  auto_die = new_die (DW_TAG_unspecified_type,
+  comp_unit_die (), NULL_TREE);
+		  add_name_attribute (auto_die, "auto");
+		}
+	  equate_type_number_to_die (type, auto_die);
+	  break;
+	}
+	}
   gcc_unreachable ();
 }
 
diff --git a/gcc/testsuite/g++.dg/cpp1y/auto-fn12.C b/gcc/testsuite/g++.dg/cpp1y/auto-fn12.C
index e4e58e8..ab4a1bb 100644
--- a/gcc/testsuite/g++.dg/cpp1y/auto-fn12.C
+++ b/gcc/testsuite/g++.dg/cpp1y/auto-fn12.C
@@ -1,5 +1,5 @@
 // { dg-options "-std=c++1y" }
-// { dg-final { scan-assembler "_ZN1AIiEcviEv" } }
+// { dg-final { scan-assembler "_ZN1AIiEcvDaEv" } }
 
 template <class T>
 struct A {
diff --git a/gcc/testsuite/g++.dg/cpp1y/auto-fn22.C b/gcc/testsuite/g++.dg/cpp1y/auto-fn22.C
new file mode 100644
index 000..f05cbb9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/auto-fn22.C
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++1y" }
+
+struct A
+{
+  operator auto();
+};
+
+// { dg-final { scan-assembler "_ZN1AcvDaEv" } }
+A::operator auto() { return 42; }
diff --git a/gcc/testsuite/g++.dg/debug/dwarf2/auto1.C b/gcc/testsuite/g++.dg/debug/dwarf2/auto1.C
new file mode 100644
index 000..188ca11
--- /dev/null
+++ b/gcc/testsuite/g++.dg/debug/dwarf2/auto1.C
@@ -0,0 +1,30 @@
+// PR c++/53756
+// { dg-options "-std=c++1y -g -dA -fno-debug-types-section" }
+// We're looking for something like
+
+// .uleb128 0x3# (DIE (0x33) DW_TAG_subprogram)
+// .ascii "a1\0"   # DW_AT_name
+// .long   0x4c# DW_AT_type
+//...
+// .uleb128 0x5# (DIE (0x4c) DW_TAG_unspecified_type)
+// .long   .LASF6  # DW_AT_name: "auto"
+//...
+// .uleb128 0x7# (DIE (0x57) DW_TAG_subprogram)
+// .long   0x33# DW_AT_specification
+// .long   0x87# DW_AT_type
+//...
+// .uleb128 0x9# (DIE (0x87) DW_TAG_base_type)
+// .ascii "int\0"  # DW_AT_name
+
+// { dg-final { scan-assembler "a1.*(0x\[0-9a-f\]+)\[ \t\]*# DW_AT_type.*\\1. DW_TAG_unspecified_type.*DW_AT_specification\[\n\r\]{1,2}\[^\n\r\]*(0x\[0-9a-f\]+)\[ \t\]*# DW_AT_type.*\\2. DW_TAG_base_type" } }
+
+struct A
+{
+  auto a1 () { return 42; }
+};
+
+int main()
+{
+  A a;
+  a.a1();
+}


Re: C++ PATCH for c++/53756 (-g and C++1y auto)

2014-01-28 Thread Paolo Carlini

On 01/28/2014 06:05 PM, Jason Merrill wrote:
My earlier work to support return type deduction omitted support for 
debugging information; this patch fixes that oversight.  It also 
corrects the mangled name of 'operator auto', which should reflect the 
'auto' rather than the deduced return type.


Tested x86_64-pc-linux-gnu, applying to trunk.
Ah! Then I guess that in order to fix c++/58561 only is_base_type needs 
tweaking: shall we change the default to just return 0?


Thanks,
Paolo.


Re: [C++ Patch] PR 51219

2014-01-28 Thread Paolo Carlini
... by the way, I don't understand why we are appending the constructor 
at all for the unnamed bit-field?!? Eg, what about the below?


Thanks,
Paolo.


Index: cp/typeck2.c
===
--- cp/typeck2.c(revision 207199)
+++ cp/typeck2.c(working copy)
@@ -1268,11 +1268,7 @@ process_init_constructor_record (tree type, tree i
   tree type;
 
   if (!DECL_NAME (field) && DECL_C_BIT_FIELD (field))
-   {
- flags |= picflag_from_initializer (integer_zero_node);
- CONSTRUCTOR_APPEND_ELT (v, field, integer_zero_node);
- continue;
-   }
+   continue;
 
   if (TREE_CODE (field) != FIELD_DECL || DECL_ARTIFICIAL (field))
continue;
Index: testsuite/g++.dg/init/bitfield5.C
===
--- testsuite/g++.dg/init/bitfield5.C   (revision 0)
+++ testsuite/g++.dg/init/bitfield5.C   (working copy)
@@ -0,0 +1,12 @@
+// PR c++/51219
+
+struct A
+{
+  int i;
+  int : 8;
+};
+
+void foo()
+{
+  A a = { 0 };
+}


Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals

2014-01-28 Thread H.J. Lu
On Tue, Jan 28, 2014 at 5:23 AM, Dodji Seketeli do...@redhat.com wrote:
 Dodji Seketeli do...@redhat.com writes:

 Here is the patch I am committing right now.

 gcc/ChangeLog

   * input.c (location_get_source_line): Bail out on when line number
   is zero, and test the return value of
   lookup_or_add_file_to_cache_tab.

 gcc/testsuite/ChangeLog

   * c-c++-common/cpp/warning-zero-location.c: New test.
   * c-c++-common/cpp/warning-zero-location-2.c: Likewise.

 I forgot to say that it passed bootstrap & test on
 x86_64-unknown-linux-gnu against trunk.


The new tests failed on Linux/x86:

ERROR: c-c++-common/cpp/warning-zero-location-2.c -std=gnu++11: syntax
error in target selector 4636 for  dg-error 10 No such file or
directory { target *-*-* } 4636 
ERROR: c-c++-common/cpp/warning-zero-location-2.c -std=gnu++98: syntax
error in target selector 4636 for  dg-error 10 No such file or
directory { target *-*-* } 4636 
ERROR: c-c++-common/cpp/warning-zero-location-2.c  -Wc++-compat :
syntax error in target selector 4636 for  dg-error 10 No such file
or directory { target *-*-* } 4636 




-- 
H.J.


C++ PATCH for c++/58632 (ICE with class shadowing template parm)

2014-01-28 Thread Jason Merrill
This ICE was introduced when I adjusted lookup_and_check_tag to find 
template template parameters so that they can be friends.  But in this 
case that meant we started to try to define the ttp as a class, leading 
to chaos.  In lookup_and_check_tag we can tell that we're in a class 
definition by looking at the scope, and then ignore ttp in that specific 
context.


Tested x86_64-pc-linux-gnu, applying to trunk and 4.8.
commit b071be05acf3f427a59be19e8ef327e4ff20106e
Author: Jason Merrill ja...@redhat.com
Date:   Mon Jan 27 13:20:37 2014 -0500

	PR c++/58632
	* decl.c (lookup_and_check_tag): Ignore template parameters if
	scope == ts_current.
	* pt.c (check_template_shadow): Don't complain about the injected
	class name.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index aca96fc..e14e401 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -11982,7 +11982,10 @@ lookup_and_check_tag (enum tag_types tag_code, tree name,
 
   if (decl
       && (DECL_CLASS_TEMPLATE_P (decl)
-	  || DECL_TEMPLATE_TEMPLATE_PARM_P (decl)))
+	  /* If scope is ts_current we're defining a class, so ignore a
+	 template template parameter.  */
+	  || (scope != ts_current
+	      && DECL_TEMPLATE_TEMPLATE_PARM_P (decl))))
 decl = DECL_TEMPLATE_RESULT (decl);
 
   if (decl && TREE_CODE (decl) == TYPE_DECL)
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 47d07db..6c68bae 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -3527,6 +3527,11 @@ check_template_shadow (tree decl)
 	   TEMPLATE_PARMS_FOR_INLINE (current_template_parms)))
 return true;
 
+  /* Don't complain about the injected class name, as we've already
+ complained about the class itself.  */
+  if (DECL_SELF_REFERENCE_P (decl))
+return false;
+
   error ("declaration of %q+#D", decl);
   error (" shadows template parm %q+#D", olddecl);
   return false;
diff --git a/gcc/testsuite/g++.dg/template/shadow1.C b/gcc/testsuite/g++.dg/template/shadow1.C
new file mode 100644
index 000..6eb30d0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/shadow1.C
@@ -0,0 +1,4 @@
+// PR c++/58632
+
+template<template<int I> class A> // { dg-message "shadows" }
+class A {};			// { dg-error "declaration" }


Re: Ping Re: Fix IBM long double spurious overflows

2014-01-28 Thread David Edelsohn
On Tue, Jan 28, 2014 at 12:51 PM, Joseph S. Myers
jos...@codesourcery.com wrote:
 On Mon, 6 Jan 2014, David Edelsohn wrote:

 On Sat, Jan 4, 2014 at 8:16 AM, Joseph S. Myers jos...@codesourcery.com 
 wrote:
  This patch fixes various cases of spurious overflow exceptions in the
  IBM long double support code.  The generic issue is that an initial
  approximation is computed by using the relevant arithmetic operation
  on the high parts of the operands - but this may overflow double in
  some cases where the final result is large but still a long way (up to
  around 2^53 ulp) from overflowing long double.  For division overflow
  could occur not just from the initial a / c division but also from the
  subsequent multiplication of the result by c (in some cases where a is
  DBL_MAX, say), when the final result of the division need not be large
  at all.
 
  __gcc_qadd already tried to handle such overflow cases, but detected
  them by examining the result of the addition of high parts - which
  leaves a spurious overflow exception raised even if it returns the
  correct non-overflowing value.  This patch instead checks the operands
  and does appropriate scaling, in all of __gcc_qadd, __gcc_qmul and
  __gcc_qdiv, to avoid spurious overflow exceptions arising as well as
  avoiding the bad results arising from such overflows.
 
  Tested with no regressions with cross to powerpc-linux-gnu (and also
  ran the glibc libm tests, which provide a rather more thorough test of
  floating-point arithmetic than the GCC testsuite; this patch fixes, at
  least, bad results from cbrtl (LDBL_MAX) that arose from the second
  division issue mentioned, as well as the specific cases shown in the
  tests added to the GCC testsuite).  OK to commit?

 Before we go farther down this path, IBM needs to internally decide on
 the end goal and the amount of language / library conformance that
 makes sense for the IBM long double format.

 Ping.  Original patch:
 http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00157.html.

Joseph,

Adding the various tests for overflow slows down some other code where
performance is important.  Explicitly changing rounding mode would be
even more invasive and have significant performance impact.

This long double format never was designed for the rounding features
of the current ISO C standard. The format has been used effectively
without the additional features and there have not been requests for
conformance from normal users of the long double format.

Without a clearer need, there is no urgency to make the format fully
conforming and implement all of the performance mitigation and
alternatives. We prefer to not pursue the fixes until circumstances
change.

Thanks, David


C++ PATCH for c++/58701 (ICE with NSDMI and static anonymous union)

2014-01-28 Thread Jason Merrill
When we're building up the constructor for the anonymous union type, 
build_anon_member_initialization was getting confused, assuming that 
anything it found with anonymous union type would be a COMPONENT_REF. 
But within the constructor, *this also has anonymous union type.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit c961c6b0c58e0790cac83e6e071b742a53373cfa
Author: Jason Merrill ja...@redhat.com
Date:   Tue Jan 28 13:42:45 2014 -0500

	PR c++/58701
	* semantics.c (build_anon_member_initialization): Stop walking
	when we run out of COMPONENT_REFs.

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 3a8daca..fd6466d 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -7515,7 +7515,8 @@ build_anon_member_initialization (tree member, tree init,
   fields.safe_push (TREE_OPERAND (member, 1));
   member = TREE_OPERAND (member, 0);
 }
-  while (ANON_AGGR_TYPE_P (TREE_TYPE (member)));
+  while (ANON_AGGR_TYPE_P (TREE_TYPE (member))
+	 && TREE_CODE (member) == COMPONENT_REF);
 
   /* VEC has the constructor elements vector for the context of FIELD.
  If FIELD is an anonymous aggregate, we will push inside it.  */
diff --git a/gcc/testsuite/g++.dg/cpp0x/nsdmi-union5.C b/gcc/testsuite/g++.dg/cpp0x/nsdmi-union5.C
new file mode 100644
index 000..57dfd59
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/nsdmi-union5.C
@@ -0,0 +1,11 @@
+// PR c++/58701
+// { dg-require-effective-target c++11 }
+// { dg-final { scan-assembler "7" } }
+
+static union
+{
+  union
+  {
+int i = 7;
+  };
+};


Re: C++ PATCH for c++/53756 (-g and C++1y auto)

2014-01-28 Thread Jason Merrill

On 01/28/2014 01:33 PM, Paolo Carlini wrote:

Ah! Then I guess that in order to fix c++/58561 only is_base_type needs
tweaking: shall we change the default to just return 0?


Makes sense.

Jason




Re: -Og bug?

2014-01-28 Thread Thomas Schwinge
Hi!

On Tue, 28 Jan 2014 09:12:44 -0800, Ian Lance Taylor i...@google.com wrote:
 On Tue, Jan 28, 2014 at 8:11 AM, Thomas Schwinge
 tho...@codesourcery.com wrote:
  On Tue, 28 Jan 2014 06:52:30 -0800, Ian Lance Taylor i...@google.com 
  wrote:
  On Tue, Jan 28, 2014 at 6:36 AM, Thomas Schwinge
  tho...@codesourcery.com wrote:
   Avoid 'dc' may be uninitialized warning.
  
   libiberty/
   * cp-demangle.c (d_demangle_callback): Put __builtin_unreachable
   in place, to help the compiler.
 
  For my own education: why is this not considered a GCC trunk bug?  It is
  xgcc/cc1 which is coming up with this (bogus?) warning, but only for -Og
  and not for -O0, -O1, etc.?
 
 I don't really have an opinion on whether this is a bug in GCC or
 not.  Since libiberty is compiled by other compilers, I think your
 cp-demangle.c patch is reasonable and appropriate either way.
 
 In particular, it's not a bug for the compiler to consider the
 possibility that type may take on a value not named in the enum.
 C/C++ impose no restrictions on values of enum type.  It's valid to
 write code that stores a value that is not an enum constant into a
 variable of enum type, so it's reasonable for the compiler to consider
 the possibility, even though we can clearly see that it can not
 happen.

OK, I agree to all of that, but I'd assume that if the compiler doesn't
do such value tracking to see whether all cases have been covered, it
also shouldn't emit such a possibly uninitialized warning, to not cause false
positive warnings.

 I don't know why the compiler reports a different warning for -O1 and
 -Og.  I encourage you to reduce the code into a standalone test case
 and file a bug report.

http://gcc.gnu.org/PR59970.  Will try to continue with that one, but at
low priority.


Grüße,
 Thomas




Re: Ping Re: Fix IBM long double spurious overflows

2014-01-28 Thread Joseph S. Myers
On Tue, 28 Jan 2014, David Edelsohn wrote:

 Joseph,
 
 Adding the various tests for overflow slows down some other code where
 performance is important.  Explicitly changing rounding mode would be
 even more invasive and have significant performance impact.
 
 This long double format never was designed for the rounding features
 of the current ISO C standard. The format has been used effectively
 without the additional features and there have not been requests for
 conformance from normal users of the long double format.
 
 Without a clearer need, there is no urgency to make the format fully
 conforming and implement all of the performance mitigation and
 alternatives. We prefer to not pursue the fixes until circumstances
 change.

The glibc libm testsuite has much more thorough coverage (hopefully soon 
to include running all tests in all rounding modes by default) than it did 
two years ago, and it's a pain to keep test results clean across all 
architectures when the basic arithmetic operations for IBM long double do 
not follow the normal conventions as regards permitted errors for most 
glibc libm functions (results within a few ulps, no spurious overflows or 
underflows except possibly for exact underflowing results, no missing 
underflows), or as regards working in all rounding modes, making it hard 
to distinguish between libgcc and glibc bugs.

Thus, if these issues are not to be fixed in libgcc, I think we need to 
seek FSF approval to use a copy of the current IBM long double libgcc code 
under LGPLv2.1+ in glibc, with a view to fixing the issues in that copy 
only and linking it directly into libc and libm (for their internal use 
rather than re-exporting symbols from it).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: -Og bug?

2014-01-28 Thread Ian Lance Taylor
On Tue, Jan 28, 2014 at 1:10 PM, Thomas Schwinge
tho...@codesourcery.com wrote:

 OK, I agree to all of that, but I'd assume that if the compiler doesn't
 do such value tracking to see whether all cases have been covered, it
 also shouldn't emit such a possibly uninitialized warning, to not cause false
 positive warnings.

The -Wuninitialized warning is full of false positives.

It is the canonical example of why warnings that are driven by
optimizations are difficult for users in practice.

Ian


Re: C++ PATCH for c++/53756 (-g and C++1y auto)

2014-01-28 Thread Paolo Carlini

Hi,

On 01/28/2014 10:02 PM, Jason Merrill wrote:

On 01/28/2014 01:33 PM, Paolo Carlini wrote:

Ah! Then I guess that in order to fix c++/58561 only is_base_type needs
tweaking: shall we change the default to just return 0?

Makes sense.

Good. Then I'm finishing testing the below.

Thanks,
Paolo.

/
/cp
2013-10-30  Paolo Carlini  paolo.carl...@oracle.com

PR c++/58581
* call.c (build_over_call): Check return value of mark_used.

/testsuite
2013-10-30  Paolo Carlini  paolo.carl...@oracle.com

PR c++/58581
* g++.dg/cpp0x/deleted1.C: New.
Index: dwarf2out.c
===
--- dwarf2out.c (revision 207199)
+++ dwarf2out.c (working copy)
@@ -10252,6 +10252,16 @@ is_base_type (tree type)
   return 0;
 
 default:
+  // A C++ function with deduced return type can have
+  // a TEMPLATE_TYPE_PARM named 'auto' in its type.
+  if (is_cxx ())
+   {
+ tree name = TYPE_NAME (type);
+ if (TREE_CODE (name) == TYPE_DECL)
+   name = DECL_NAME (name);
+ if (name == get_identifier ("auto"))
+   return 0;
+   }
   gcc_unreachable ();
 }
 
Index: testsuite/g++.dg/cpp1y/auto-fn23.C
===
--- testsuite/g++.dg/cpp1y/auto-fn23.C  (revision 0)
+++ testsuite/g++.dg/cpp1y/auto-fn23.C  (working copy)
@@ -0,0 +1,9 @@
+// PR c++/58561
+// { dg-options "-std=c++1y -g" }
+
+auto foo();
+
+namespace N
+{
+  using ::foo;
+}


Go patch committed: Put nointerface methods in unique sections

2014-01-28 Thread Ian Lance Taylor
This patch to the Go frontend puts nointerface methods in unique
sections.  A method marked nointerface may not be needed in the final
link, and putting it in a unique section lets the linker discard it
when it is unused.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r e6c55d1cd62b go/gogo.cc
--- a/go/gogo.cc	Fri Jan 24 14:47:58 2014 -0800
+++ b/go/gogo.cc	Tue Jan 28 13:42:48 2014 -0800
@@ -4094,12 +4094,19 @@
   // stack splitting for the thunk.
   bool disable_split_stack = this->is_recover_thunk_;
 
+  // This should go into a unique section if that has been
+  // requested elsewhere, or if this is a nointerface function.
+  // We want to put a nointerface function into a unique section
+  // because there is a good chance that the linker garbage
+  // collection can discard it.
+  bool in_unique_section = this->in_unique_section_ || this->nointerface_;
+
   Btype* functype = this->type_->get_backend_fntype(gogo);
   this->fndecl_ =
   gogo->backend()->function(functype, no->get_id(gogo), asm_name,
 is_visible, false, is_inlinable,
-disable_split_stack,
-this->in_unique_section_, this->location());
+disable_split_stack, in_unique_section,
+this->location());
 }
   return this->fndecl_;
 }


C++ PATCH for c++/59818 (wrong overload with PMF)

2014-01-28 Thread Jason Merrill
In this testcase, we deduce Bar const for T and substitute it into the 
second parameter, accidentally getting a pointer to const member 
function because we forgot to strip the const from Bar.
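
A hedged way to see the intended behaviour from user code, reusing the Identity/Foo/Bar shapes of the test below (the two constexpr names are hypothetical): after deducing T = const Bar from the first argument, the second parameter should be a pointer to a non-const member function, so a pointer to a const member function must not match.

```cpp
#include <cassert>
#include <type_traits>

template <class T>
struct Identity {
  typedef T type;
};

struct Foo {
  // T is deduced from the first argument only; the second parameter is a
  // non-deduced context that gets T substituted into it.
  template <typename T>
  Foo(T*, void (Identity<T>::type::*m)(void));
};

struct Bar {
  void Method(void) const;
};

// Expected to be false: with T = const Bar the second parameter is still
// a pointer to a non-const member function.
constexpr bool const_pmf_accepted =
    std::is_constructible<Foo, const Bar*, void (Bar::*)(void) const>::value;

// Expected to be true: T = Bar and a pointer to a non-const member function.
constexpr bool plain_pmf_accepted =
    std::is_constructible<Foo, Bar*, void (Bar::*)(void)>::value;
```

The traits are evaluated in an unevaluated context, so neither the constructor nor Bar::Method needs a definition.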


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 35f0fd76efae307b54f9fa11d39c988699eb4214
Author: Jason Merrill ja...@redhat.com
Date:   Tue Jan 28 16:24:28 2014 -0500

	PR c++/59818
	* pt.c (tsubst_function_type): Make sure we keep the same function
	quals.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 6c68bae..011db2c 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -11189,6 +11189,8 @@ tsubst_function_type (tree t,
   else
 {
   tree r = TREE_TYPE (TREE_VALUE (arg_types));
+  /* Don't pick up extra function qualifiers from the basetype.  */
+  r = cp_build_qualified_type_real (r, type_memfn_quals (t), complain);
   if (! MAYBE_CLASS_TYPE_P (r))
 	{
 	  /* [temp.deduct]
diff --git a/gcc/testsuite/g++.dg/template/ptrmem24.C b/gcc/testsuite/g++.dg/template/ptrmem24.C
new file mode 100644
index 000..a419410
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/ptrmem24.C
@@ -0,0 +1,20 @@
+// PR c++/59818
+
+template <class T>
+struct Identity {
+  typedef T type;
+};
+
+struct Foo {
+  template <typename T>
+  Foo(T*, void (Identity<T>::type::*m)(void));
+};
+
+struct Bar {
+  void Method(void) const;
+};
+
+void Bar::Method(void) const
+{
+  Foo foo(this, &Bar::Method);	// { dg-error "no match" }
+}


[jit] Implement nested jit-compilation contexts

2014-01-28 Thread David Malcolm
Committed to dmalcolm/jit:

As discussed in http://gcc.gnu.org/ml/jit/2014-q1/msg1.html,
experiments with adding libgccjit to GNU Octave revealed the absence of
a way for a user of libgccjit to separate one-time startup from
per-invocation activities.

For example, the existing GNU Octave JIT has a singleton class
jit_typeinfo which is created once per process, containing the JIT
representation of types, helper functions and other such global
declarations.   When a function or loop becomes hot, the JIT uses the
objects referenced by this singleton - for example, a type-inferencer
uses the instances of types owned by the jit_typeinfo singleton, giving
back an IR for the function in terms of these type object instances.

As of 96b218c9a1d5f39fb649e02c0e77586b180e8516, libgccjit's entities now
have lifetimes bounded by gcc_jit_context objects, rather than within
the activation frame of a callback.

This next commit adds a gcc_jit_context_new_child_context API entrypoint,
allowing client code to create nested contexts: one-time initialization
can be done in a parent context, and per-method JIT-compilation can be
done in a child context.  Child contexts can use entities from their
parent context (or, indeed, ancestor contexts), though not vice-versa.

Currently contexts can be arbitrarily nested, but I believe it will be
unlikely for client code to need a nesting structure more complex than
that of a singleton parent context with multiple child contexts.

Implementation-wise, this is all rather suboptimal, requiring repeated
playback of parent contexts, but fixing that would require deep surgery
to GCC proper e.g. being able to share GC heaps between in-process
invocations of the compiler.   I believe this API at least allows the
future possibility of such a refactoring internally without necessarily
having to break clients.
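
To make the replay order concrete, here is a toy model in C of "replay parent contexts before replaying the context itself".  It is only a sketch: struct toy_context and toy_replay_into are hypothetical names made up for illustration, not libgccjit types or functions, and the real replay_into does far more than record names.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a recording context; NOT libgccjit's real type.  */
struct toy_context
{
  const struct toy_context *parent; /* NULL for a top-level context.  */
  const char *name;
};

/* Append the names of CTXT's ancestors, outermost first, followed by
   CTXT's own name, to OUT, separated by single spaces.  OUT must hold a
   (possibly empty) NUL-terminated string and be large enough.  */
void
toy_replay_into (const struct toy_context *ctxt, char *out)
{
  /* Replay the parent chain first, so that entities owned by ancestors
     exist before the child refers to them.  */
  if (ctxt->parent)
    toy_replay_into (ctxt->parent, out);
  if (out[0] != '\0')
    strcat (out, " ");
  strcat (out, ctxt->name);
}
```

With a 3-deep chain (as in the new test case), replaying the innermost context visits the outermost one first, which is also why every compile of a child pays again for its ancestors.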

gcc/jit/
* libgccjit.h (gcc_jit_context_new_child_context): New function.

* libgccjit.map (gcc_jit_context_new_child_context): New function.

* libgccjit.c (gcc_jit_context): Make the constructor explicit,
with a parent context as a parameter.
(gcc_jit_context_acquire): Create context with a NULL parent.
(gcc_jit_context_new_child_context): New function, creating a
context with the given parent.

* internal-api.h (gcc::jit::recording::context::context): New
explicit constructor, taking a parent context as a parameter.
(gcc::jit::recording::context::m_parent_ctxt): New field.

* internal-api.c (gcc::jit::recording::context::context): New
explicit constructor, taking a parent context as a parameter.
(gcc::jit::recording::context::replay_into): Replay parent contexts
before replaying the context itself.

gcc/testsuite/
* jit.dg/harness.h (test_jit): Add the possibility of turning off
this function, if the newly-coined TEST_ESCHEWS_TEST_JIT is
defined, for use by...
* jit.dg/test-nested-contexts.c: New test case, adapting
test-quadratic.c, but splitting it into a 3-deep arrangement of
nested contexts, to test the implementation of child contexts.
---
 gcc/jit/ChangeLog.jit   |  21 +
 gcc/jit/internal-api.c  |  49 +++
 gcc/jit/internal-api.h  |   3 +
 gcc/jit/libgccjit.c |  11 +-
 gcc/jit/libgccjit.h |  34 ++
 gcc/jit/libgccjit.map   |   1 +
 gcc/testsuite/ChangeLog.jit |   9 +
 gcc/testsuite/jit.dg/harness.h  |   2 +
 gcc/testsuite/jit.dg/test-nested-contexts.c | 627 
 9 files changed, 756 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/jit.dg/test-nested-contexts.c

diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index ed0ffb7..67856f4 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,3 +1,24 @@
+2014-01-28  David Malcolm  dmalc...@redhat.com
+
+   * libgccjit.h (gcc_jit_context_new_child_context): New function.
+
+   * libgccjit.map (gcc_jit_context_new_child_context): New function.
+
+   * libgccjit.c (gcc_jit_context): Make the constructor explicit,
+   with a parent context as a parameter.
+   (gcc_jit_context_acquire): Create context with a NULL parent.
+   (gcc_jit_context_new_child_context): New function, creating a
+   context with the given parent.
+
+   * internal-api.h (gcc::jit::recording::context::context): New
+   explicit constructor, taking a parent context as a parameter.
+   (gcc::jit::recording::context::m_parent_ctxt): New field.
+
+   * internal-api.c (gcc::jit::recording::context::context): New
+   explicit constructor, taking a parent context as a parameter.
+   (gcc::jit::recording::context::replay_into): Replay parent contexts
+   before replaying the context itself.
+
 2014-01-27  David Malcolm  dmalc...@redhat.com
 

[PATCH] Fix up vectorizer DDR_REVERSED_P handling (PR tree-optimization/59594, take 2)

2014-01-28 Thread Jakub Jelinek
On Tue, Jan 28, 2014 at 01:14:32PM +0100, Richard Biener wrote:
  I admit I don't fully understand why exactly, but my experimentation so far
  showed that for read/write and write/read ddrs it is ok and desirable to
  ignore the dist > 0 && DDR_REVERSED_P (ddr) cases, but for write/write
  ddrs it is undesirable.  See the PR for further tests, perhaps I could
  turn them into further testcases.
 
 Please.

New testcase in the patch.

  -  if (dist > 0 && DDR_REVERSED_P (ddr))
  +  if (dist > 0 && DDR_REVERSED_P (ddr)
  +      && (DR_IS_READ (dra) || DR_IS_READ (drb)))
 
 I think that's not sufficient.  It depends
 on the order of the stmts whether the dependence distance is really
 negative - we are trying to catch write-after-read negative distance
 here I think.  We can't rely on the DDRs being formed in stmt order
 (anymore, at least since 4.9 where we start to arbitrarily re-order
 the vector of DRs).

As discussed on IRC, the actual bug was that vect_analyze_data_ref_accesses
reordered the data references before the DDRs were constructed, so DDR_A
wasn't necessarily before DDR_B lexically in the loop, and the
DDR_REVERSED_P bit therefore didn't really reflect whether the distance is
negative or positive.

Fixed by making sure the data refs aren't reordered (well, they are
reordered, but only on a copy of the vector, for the purposes of
vect_analyze_data_ref_accesses).  Bootstrapped/regtested on x86_64-linux
and i686-linux, ok for trunk?
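
The idea of the fix, sorting a copy so that later consumers still see the original (lexical) order, can be sketched in plain C.  This is a minimal illustration with ints standing in for data references; sorted_copy and cmp_int are made-up names, not GCC functions, and the patch itself uses vec::copy () and release ().

```c
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort: ascending ints.  */
int
cmp_int (const void *a, const void *b)
{
  int x = *(const int *) a, y = *(const int *) b;
  return (x > y) - (x < y);
}

/* Return a newly allocated, sorted copy of the N ints in ORIG, leaving
   ORIG itself untouched.  The caller frees the result.  Returns NULL on
   allocation failure.  */
int *
sorted_copy (const int *orig, size_t n)
{
  int *copy = malloc (n * sizeof *copy);
  if (!copy)
    return NULL;
  memcpy (copy, orig, n * sizeof *copy);
  qsort (copy, n, sizeof *copy, cmp_int);
  return copy;
}
```

As in the patch, the price is that every exit path has to release the copy.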

2014-01-28  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/59594
* tree-vect-data-refs.c (vect_analyze_data_ref_accesses): Sort
a copy of the datarefs vector rather than the vector itself.

* gcc.dg/vect/no-vfa-vect-depend-2.c: New test.
* gcc.dg/vect/no-vfa-vect-depend-3.c: New test.
* gcc.dg/vect/pr59594.c: New test.

--- gcc/tree-vect-data-refs.c.jj	2014-01-23 10:52:26.766346677 +0100
+++ gcc/tree-vect-data-refs.c   2014-01-28 16:11:34.371698307 +0100
@@ -2484,19 +2484,21 @@ vect_analyze_data_ref_accesses (loop_vec
 return true;
 
   /* Sort the array of datarefs to make building the interleaving chains
- linear.  */
-  qsort (datarefs.address (), datarefs.length (),
+ linear.  Don't modify the original vector's order, it is needed for
+ determining what dependencies are reversed.  */
+  vec<data_reference_p> datarefs_copy = datarefs.copy ();
+  qsort (datarefs_copy.address (), datarefs_copy.length (),
 sizeof (data_reference_p), dr_group_sort_cmp);
 
   /* Build the interleaving chains.  */
-  for (i = 0; i < datarefs.length () - 1;)
+  for (i = 0; i < datarefs_copy.length () - 1;)
 {
-  data_reference_p dra = datarefs[i];
+  data_reference_p dra = datarefs_copy[i];
   stmt_vec_info stmtinfo_a = vinfo_for_stmt (DR_STMT (dra));
   stmt_vec_info lastinfo = NULL;
-  for (i = i + 1; i < datarefs.length (); ++i)
+  for (i = i + 1; i < datarefs_copy.length (); ++i)
{
- data_reference_p drb = datarefs[i];
+ data_reference_p drb = datarefs_copy[i];
  stmt_vec_info stmtinfo_b = vinfo_for_stmt (DR_STMT (drb));
 
  /* ???  Imperfect sorting (non-compatible types, non-modulo
@@ -2573,7 +2575,7 @@ vect_analyze_data_ref_accesses (loop_vec
}
 }
 
-  FOR_EACH_VEC_ELT (datarefs, i, dr)
+  FOR_EACH_VEC_ELT (datarefs_copy, i, dr)
     if (STMT_VINFO_VECTORIZABLE (vinfo_for_stmt (DR_STMT (dr)))
         && !vect_analyze_data_ref_access (dr))
   {
@@ -2588,9 +2590,13 @@ vect_analyze_data_ref_accesses (loop_vec
 continue;
   }
 else
-  return false;
+ {
+   datarefs_copy.release ();
+   return false;
+ }
   }
 
+  datarefs_copy.release ();
   return true;
 }
 
--- gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-2.c.jj	2014-01-28 14:06:10.818303424 +0100
+++ gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-2.c	2014-01-28 14:06:10.818303424 +0100
@@ -0,0 +1,55 @@
+/* { dg-require-effective-target vect_int } */
+
+#include <stdarg.h>
+#include "tree-vect.h"
+
+#define N 17
+
+int ia[N] = {48,45,42,39,36,33,30,27,24,21,18,15,12,9,6,3,0};
+int ib[N] = {48,45,42,39,36,33,30,27,24,21,18,15,12,9,6,3,0};
+int res[N] = {48,192,180,168,156,144,132,120,108,96,84,72,60,48,36,24,12};
+
+__attribute__ ((noinline))
+int main1 ()
+{
+  int i;
+
+  /* Not vectorizable due to data dependence: dependence distance 1.  */ 
+  for (i = N - 1; i >= 0; i--)
+{
+  ia[i] = ia[i+1] * 4;
+}
+
+  /* check results:  */
+  for (i = 0; i < N; i++)
+{
+  if (ia[i] != 0)
+   abort ();
+} 
+
+  /* Vectorizable. Dependence distance -1.  */
+  for (i = N - 1; i >= 0; i--)
+{
+  ib[i+1] = ib[i] * 4;
+}
+
+  /* check results:  */
+  for (i = 0; i < N; i++)
+{
+  if (ib[i] != res[i])
+   abort ();
+}
+
+  return 0;
+}
+
+int main (void)
+{
+  check_vect ();
+
+  return main1 ();
+}
+
+/* { dg-final { scan-tree-dump-times vectorized 1 loops 1 vect {xfail 
vect_no_align 

[PATCH] Improve EDGE_ABNORMAL construction (PR middle-end/59917, tree-optimization/59920)

2014-01-28 Thread Jakub Jelinek
Hi!

As discussed in the PR, and similarly to what has been done previously
for __builtin_setjmp only, this patch attempts to decrease the number of
EDGE_ABNORMAL edges in functions with non-local gotos and/or setjmp or
other functions that return twice.
The reason is that if we have many non-local labels or calls returning
twice and many calls that could potentially longjmp or goto a non-local
label, and especially if we have many SSA_NAMEs live across those abnormal
edges, the number of needed PHI arguments is the number of non-local
labels/returns_twice calls times the number of (most of) calls times the
number of such SSA_NAMEs, which even on real-world testcases means gigabytes
of memory and hours of compilation time.
The patch changes this so that abnormal edges from calls that might longjmp
or do a non-local goto point to a special basic block containing
an artificial ABNORMAL_DISPATCHER internal call, and from that basic block
there are abnormal edges to each non-local
label/__builtin_setjmp_receiver/returns_twice call.
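
The arithmetic behind that claim can be written down as a small C sketch.  It is a back-of-the-envelope model only: the function names are made up for illustration, and it assumes that every call gets an abnormal edge and every live SSA_NAME needs one PHI argument per abnormal edge, which overstates the real numbers somewhat (the text above says "most of" the calls).

```c
/* Abnormal edges when every call that might longjmp or do a non-local goto
   gets its own edge to every receiver (non-local label or returns_twice
   call).  */
unsigned long
edges_without_dispatcher (unsigned long calls, unsigned long receivers)
{
  return calls * receivers;
}

/* Abnormal edges when the calls share a single dispatcher block: one edge
   from each call into the dispatcher, one edge from it to each receiver.  */
unsigned long
edges_with_dispatcher (unsigned long calls, unsigned long receivers)
{
  return calls + receivers;
}

/* Rough PHI argument count: one argument per abnormal edge for each
   SSA_NAME live across those edges.  */
unsigned long
phi_args (unsigned long edges, unsigned long live_names)
{
  return edges * live_names;
}
```

For example, with 2000 calls, 50 receivers and 300 live SSA_NAMEs, this model gives 30,000,000 PHI arguments without a dispatcher and 615,000 with one.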

The patch also fixes the OpenMP PR.  The abnormal edges, since their
introduction for setjmp in 4.9 (and for non-local gotos and computed gotos
since forever), prevent discovery of OpenMP regions, because dominance can't
be used for that.  As OpenMP SESE regions must not be entered abnormally or
left abnormally (exit allowed as an exception and we allow abort too) in a
valid program, we don't need to deal with longjmp jumping out of or into
an OpenMP region (explicitly disallowed in the standard), and similarly for
non-local gotos or computed gotos.  The patch therefore constructs the
abnormal dispatchers or computed goto factored blocks one per OpenMP SESE
region that needs it, which means fewer abnormal edges and, more importantly,
that the regions can be easily discovered and outlined into separate functions.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-01-27  Jakub Jelinek  ja...@redhat.com

PR middle-end/59917
PR tree-optimization/59920
* tree.c (build_common_builtin_nodes): Remove
__builtin_setjmp_dispatcher initialization.
* omp-low.h (make_gimple_omp_edges): Add a new int * argument.
* profile.c (branch_prob): Use gsi_start_nondebug_after_labels_bb
instead of gsi_after_labels + manually skipping debug stmts.
Don't ignore bbs with BUILT_IN_SETJMP_DISPATCHER, instead
ignore bbs with IFN_ABNORMAL_DISPATCHER.
* tree-inline.c (copy_edges_for_bb): Remove
can_make_abnormal_goto argument, instead add abnormal_goto_dest
argument.  Ignore computed_goto_p stmts.  Don't call
make_abnormal_goto_edges.  If a call might need abnormal edges
for non-local gotos, see if it already has an edge to
IFN_ABNORMAL_DISPATCHER or if it is IFN_ABNORMAL_DISPATCHER
with true argument, don't do anything then, otherwise add
EDGE_ABNORMAL from the call's bb to abnormal_goto_dest.
(copy_cfg_body): Compute abnormal_goto_dest, adjust copy_edges_for_bb
caller.
* gimple-low.c (struct lower_data): Remove calls_builtin_setjmp.
(lower_function_body): Don't emit __builtin_setjmp_dispatcher.
(lower_stmt): Don't set data->calls_builtin_setjmp.
(lower_builtin_setjmp): Adjust comment.
* builtins.def (BUILT_IN_SETJMP_DISPATCHER): Remove.
* tree-cfg.c (found_computed_goto): Remove.
(factor_computed_gotos): Remove.
(make_goto_expr_edges): Return bool, true for computed gotos.
Don't call make_abnormal_goto_edges.
(build_gimple_cfg): Don't set found_computed_goto, don't call
factor_computed_gotos.
(computed_goto_p): No longer static.
(make_blocks): Don't set found_computed_goto.
(handle_abnormal_edges): New function.
(make_edges): If make_goto_expr_edges returns true, push bb
into ab_edge_goto vector, for stmt_can_make_abnormal_goto calls
instead of calling make_abnormal_goto_edges push bb into ab_edge_call
vector.  Record mapping between bbs and OpenMP regions if there
are any, adjust make_gimple_omp_edges caller.  Call
handle_abnormal_edges.
(make_abnormal_goto_edges): Remove.
* tree-cfg.h (make_abnormal_goto_edges): Remove.
(computed_goto_p): New prototype.
* internal-fn.c (expand_ABNORMAL_DISPATCHER): New function.
* builtins.c (expand_builtin): Don't handle
BUILT_IN_SETJMP_DISPATCHER.
* internal-fn.def (ABNORMAL_DISPATCHER): New.
* omp-low.c (make_gimple_omp_edges): Add region_idx argument, when
filling *region also set *region_idx to (*region)->entry->index.

* gcc.dg/pr59920-1.c: New test.
* gcc.dg/pr59920-2.c: New test.
* gcc.dg/pr59920-3.c: New test.
* c-c++-common/gomp/pr59917-1.c: New test.
* c-c++-common/gomp/pr59917-2.c: New test.

--- gcc/tree.c.jj   2014-01-17 15:16:14.0 +0100
+++ gcc/tree.c  

Re: PATCH: PR target/59672: Add -m16 support for x86

2014-01-28 Thread Gerald Pfeifer
On Tue, 28 Jan 2014, H.J. Lu wrote:
 Here is the patch for changes.html.  OK to install?

Yes.  Just say command-line option please.

Thanks,
Gerald


Re: PATCH: PR target/59672: Add -m16 support for x86

2014-01-28 Thread H.J. Lu
On Tue, Jan 28, 2014 at 3:42 PM, Gerald Pfeifer ger...@pfeifer.com wrote:
 On Tue, 28 Jan 2014, H.J. Lu wrote:
 Here is the patch for changes.html.  OK to install?

 Yes.  Just say command-line option please.


This is what I checked in.

Thanks.

-- 
H.J.
---
Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.9/changes.html,v
retrieving revision 1.53
diff -u -p -r1.53 changes.html
--- changes.html 21 Jan 2014 08:34:06 - 1.53
+++ changes.html 28 Jan 2014 23:57:13 -
@@ -415,6 +415,9 @@ auto incr = [](auto x) { return x++; };
   well on the most current Intel processors, which are Haswell
   and Silvermont for GCC 4.9.
 </li>
+<li>Support to encode 32-bit assembly instructions in 16-bit format
+  is now available through the <code>-m16</code> command-line option.
+</li>
 <li>Better inlining of <code>memcpy</code> and <code>memset</code>
  that is aware of value ranges and produces shorter alignment prologues.
 </li>


Re: [PATCH, rs6000] Implement -maltivec=be for vec_mergeh and vec_mergel Altivec builtins

2014-01-28 Thread Bill Schmidt
Hi,

David suggested privately that I rework some of the pattern names to fit
in with existing practice.  I've done this, and the result is below.
Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no
regressions.  Is this ok for trunk?

Thanks,
Bill

On Thu, 2014-01-23 at 18:08 -0600, Bill Schmidt wrote:
 Hi,
 
 This patch continues the series of changes to the Altivec builtins to
 accommodate big-endian element order when targeting a little endian
 machine.  Here the focus is on the vector merge-high and merge-low
 operations.
 
 The primary change is in altivec.md.  As an example, look at the pattern
 altivec_vmrghw.  Previously, this was represented with a single
 define_insn.  Now it's been split into a define_expand to create the
 RTL, and a define_insn (*altivec_vmrghw_endian) to generate the hardware
 instruction.  This is because we need a different selection vector when
 using -maltivec=be and targeting LE.  (Normally LE and BE can use the
 same selection vector, and GCC takes care of interpreting the indices as
 left-to-right or right-to-left.)  The new define_insn also substitutes
 vmrglw with swapped operands for vmrghw for little-endian mode, since
 the hardware instruction has big-endian bias in the interpretation of
 high and low.
 
 Because -maltivec=be applies only to programmer-specified builtins, we
 need to adjust internal uses of altivec_vmrghw and friends.  Thus we
 have a new define_insn altivec_vmrghw_internal that generates the
 hardware instruction directly with none of the above transformations.
 New unspecs are needed for these internal forms.
 
 The VSX flavors of merge-high and merge-low are a little simpler (see
 vsx.md).  Here we already had a define_expand where the instructions are
 generated by separate xxpermdi patterns, and there are no internal uses
 to worry about.  So we only need to change the selection vector in the
 generated RTL.
 
 There are four new test cases that cover all of the supported data
 types.  Tests are divided between those that require only VMX
 instructions and those that require VSX instructions.  There are also
 variants for -maltivec and -maltivec=be.
 
 Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no
 regressions.  Ok for trunk?
 
 Thanks,
 Bill
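
The operand swap described above can be checked with a small C model.  This is a sketch under stated assumptions, not code from the patch: registers are modelled as arrays of four words in big-endian (left-to-right) order, and the model_* names are made up for illustration.

```c
#include <stdint.h>

/* Model of the vmrghw instruction: interleave the two leftmost words of
   A and B.  Index 0 is the leftmost (big-endian element 0) word.  */
void
model_vmrghw (const uint32_t a[4], const uint32_t b[4], uint32_t r[4])
{
  r[0] = a[0]; r[1] = b[0]; r[2] = a[1]; r[3] = b[1];
}

/* Model of the vmrglw instruction: interleave the two rightmost words.  */
void
model_vmrglw (const uint32_t a[4], const uint32_t b[4], uint32_t r[4])
{
  r[0] = a[2]; r[1] = b[2]; r[2] = a[3]; r[3] = b[3];
}

/* Merge-high as seen with little-endian element order, where element i
   of a vector lives in register word 3 - i.  The result elements 0..3
   are a0, b0, a1, b1 in that numbering.  */
void
model_mergeh_le_order (const uint32_t a[4], const uint32_t b[4],
                       uint32_t r[4])
{
  r[3 - 0] = a[3 - 0];
  r[3 - 1] = b[3 - 0];
  r[3 - 2] = a[3 - 1];
  r[3 - 3] = b[3 - 1];
}
```

In this model, model_mergeh_le_order (a, b) produces the same register contents as model_vmrglw (b, a), which matches the substitution of vmrglw with swapped operands for vmrghw in little-endian mode.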
 

gcc:

2014-01-28  Bill Schmidt  wschm...@linux.vnet.ibm.com

* config/rs6000/rs6000.c (altivec_expand_vec_perm_const):  Use
CODE_FOR_altivec_vmrg*_direct rather than CODE_FOR_altivec_vmrg*.
* config/rs6000/vsx.md (vsx_mergel_mode): Adjust for
-maltivec=be with LE targets.
(vsx_mergeh_mode): Likewise.
* config/rs6000/altivec.md (UNSPEC_VMRG[HL]_DIRECT): New
unspecs.
(mulv8hi3): Use gen_altivec_vmrg[hl]w_direct.
(altivec_vmrghb): Replace with define_expand and new
*altivec_vmrghb_internal insn; adjust for -maltivec=be with LE
targets.
(altivec_vmrghb_direct): New define_insn.
(altivec_vmrghh): Replace with define_expand and new
*altivec_vmrghh_internal insn; adjust for -maltivec=be with LE
targets.
(altivec_vmrghh_direct): New define_insn.
(altivec_vmrghw): Replace with define_expand and new
*altivec_vmrghw_internal insn; adjust for -maltivec=be with LE
targets.
(altivec_vmrghw_direct): New define_insn.
(*altivec_vmrghsf): Adjust for endianness.
(altivec_vmrglb): Replace with define_expand and new
*altivec_vmrglb_internal insn; adjust for -maltivec=be with LE
targets.
(altivec_vmrglb_direct): New define_insn.
(altivec_vmrglh): Replace with define_expand and new
*altivec_vmrglh_internal insn; adjust for -maltivec=be with LE
targets.
(altivec_vmrglh_direct): New define_insn.
(altivec_vmrglw): Replace with define_expand and new
*altivec_vmrglw_internal insn; adjust for -maltivec=be with LE
targets.
(altivec_vmrglw_direct): New define_insn.
(*altivec_vmrglsf): Adjust for endianness.
(vec_widen_umult_hi_v16qi): Use gen_altivec_vmrghh_direct.
(vec_widen_umult_lo_v16qi): Use gen_altivec_vmrglh_direct.
(vec_widen_smult_hi_v16qi): Use gen_altivec_vmrghh_direct.
(vec_widen_smult_lo_v16qi): Use gen_altivec_vmrglh_direct.
(vec_widen_umult_hi_v8hi): Use gen_altivec_vmrghw_direct.
(vec_widen_umult_lo_v8hi): Use gen_altivec_vmrglw_direct.
(vec_widen_smult_hi_v8hi): Use gen_altivec_vmrghw_direct.
(vec_widen_smult_lo_v8hi): Use gen_altivec_vmrglw_direct.

gcc/testsuite:

2014-01-28  Bill Schmidt  wschm...@linux.vnet.ibm.com

* gcc.dg/vmx/merge-be-order.c: New.
* gcc.dg/vmx/merge.c: New.
* gcc.dg/vmx/merge-vsx-be-order.c: New.
* gcc.dg/vmx/merge-vsx.c: New.


Index: gcc/testsuite/gcc.dg/vmx/merge-be-order.c
===
--- gcc/testsuite/gcc.dg/vmx/merge-be-order.c   

Re: [PATCH, rs6000] Implement -maltivec=be for vec_mergeh and vec_mergel Altivec builtins

2014-01-28 Thread David Edelsohn
On Tue, Jan 28, 2014 at 7:17 PM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:

 David suggested privately that I rework some of the pattern names to fit
 in with existing practice.  I've done this, and the result is below.
 Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no
 regressions.  Is this ok for trunk?

 gcc:

 2014-01-28  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * config/rs6000/rs6000.c (altivec_expand_vec_perm_const):  Use
 CODE_FOR_altivec_vmrg*_direct rather than CODE_FOR_altivec_vmrg*.
 * config/rs6000/vsx.md (vsx_mergel_mode): Adjust for
 -maltivec=be with LE targets.
 (vsx_mergeh_mode): Likewise.
 * config/rs6000/altivec.md (UNSPEC_VMRG[HL]_DIRECT): New
 unspecs.
 (mulv8hi3): Use gen_altivec_vmrg[hl]w_direct.
 (altivec_vmrghb): Replace with define_expand and new
 *altivec_vmrghb_internal insn; adjust for -maltivec=be with LE
 targets.
 (altivec_vmrghb_direct): New define_insn.
 (altivec_vmrghh): Replace with define_expand and new
 *altivec_vmrghh_internal insn; adjust for -maltivec=be with LE
 targets.
 (altivec_vmrghh_direct): New define_insn.
 (altivec_vmrghw): Replace with define_expand and new
 *altivec_vmrghw_internal insn; adjust for -maltivec=be with LE
 targets.
 (altivec_vmrghw_direct): New define_insn.
 (*altivec_vmrghsf): Adjust for endianness.
 (altivec_vmrglb): Replace with define_expand and new
 *altivec_vmrglb_internal insn; adjust for -maltivec=be with LE
 targets.
 (altivec_vmrglb_direct): New define_insn.
 (altivec_vmrglh): Replace with define_expand and new
 *altivec_vmrglh_internal insn; adjust for -maltivec=be with LE
 targets.
 (altivec_vmrglh_direct): New define_insn.
 (altivec_vmrglw): Replace with define_expand and new
 *altivec_vmrglw_internal insn; adjust for -maltivec=be with LE
 targets.
 (altivec_vmrglw_direct): New define_insn.
 (*altivec_vmrglsf): Adjust for endianness.
 (vec_widen_umult_hi_v16qi): Use gen_altivec_vmrghh_direct.
 (vec_widen_umult_lo_v16qi): Use gen_altivec_vmrglh_direct.
 (vec_widen_smult_hi_v16qi): Use gen_altivec_vmrghh_direct.
 (vec_widen_smult_lo_v16qi): Use gen_altivec_vmrglh_direct.
 (vec_widen_umult_hi_v8hi): Use gen_altivec_vmrghw_direct.
 (vec_widen_umult_lo_v8hi): Use gen_altivec_vmrglw_direct.
 (vec_widen_smult_hi_v8hi): Use gen_altivec_vmrghw_direct.
 (vec_widen_smult_lo_v8hi): Use gen_altivec_vmrglw_direct.

 gcc/testsuite:

 2014-01-28  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * gcc.dg/vmx/merge-be-order.c: New.
 * gcc.dg/vmx/merge.c: New.
 * gcc.dg/vmx/merge-vsx-be-order.c: New.
 * gcc.dg/vmx/merge-vsx.c: New.

Thanks for adjusting the patch to be more consistent with the pattern
names used in the rest of the port.  The patch is okay.

Thanks, David


Re: Ping Re: Fix IBM long double spurious overflows

2014-01-28 Thread David Edelsohn
On Tue, Jan 28, 2014 at 4:19 PM, Joseph S. Myers
jos...@codesourcery.com wrote:

 The glibc libm testsuite has much more thorough coverage (hopefully soon
 to include running all tests in all rounding modes by default) than it did
 two years ago, and it's a pain to keep test results clean across all
 architectures when the basic arithmetic operations for IBM long double do
 not follow the normal conventions as regards permitted errors for most
 glibc libm functions (results within a few ulps, no spurious overflows or
 underflows except possibly for exact underflowing results, no missing
 underflows), or as regards working in all rounding modes, making it hard
 to distinguish between libgcc and glibc bugs.

 Thus, if these issues are not to be fixed in libgcc, I think we need to
 seek FSF approval to use a copy of the current IBM long double libgcc code
 under LGPLv2.1+ in glibc, with a view to fixing the issues in that copy
 only and linking it directly into libc and libm (for their internal use
 rather than re-exporting symbols from it).

Joseph,

The testsuite can disable those tests or xfail them for IBM long double.

It is not appropriate for a GCC or GLIBC maintainer to impose behavior
or conformance on a specific target and port-specific code. I am sorry
that the failures bother you, but ports have the freedom to conform or
not conform with standards in target-specific code.

Thanks, David


Re: Ping Re: Fix IBM long double spurious overflows

2014-01-28 Thread Joseph S. Myers
On Tue, 28 Jan 2014, David Edelsohn wrote:

 On Tue, Jan 28, 2014 at 4:19 PM, Joseph S. Myers
 jos...@codesourcery.com wrote:
 
  The glibc libm testsuite has much more thorough coverage (hopefully soon
  to include running all tests in all rounding modes by default) than it did
  two years ago, and it's a pain to keep test results clean across all
  architectures when the basic arithmetic operations for IBM long double do
  not follow the normal conventions as regards permitted errors for most
  glibc libm functions (results within a few ulps, no spurious overflows or
  underflows except possibly for exact underflowing results, no missing
  underflows), or as regards working in all rounding modes, making it hard
  to distinguish between libgcc and glibc bugs.
 
  Thus, if these issues are not to be fixed in libgcc, I think we need to
  seek FSF approval to use a copy of the current IBM long double libgcc code
  under LGPLv2.1+ in glibc, with a view to fixing the issues in that copy
  only and linking it directly into libc and libm (for their internal use
  rather than re-exporting symbols from it).
 
 Joseph,
 
 The testsuite can disable those tests or xfail them for IBM long double.

Doing so requires significant investigation for each test to determine 
that it arises from a libgcc bug.  For tests in auto-libm-test-in it is 
also liable to disable more than necessary, because each source line 
represents tests with the specified inputs rounded up and down for each 
supported floating-point format - disabling tests like this was never 
intended to be permanent, only a temporary marking until the underlying 
bug is fixed.

 It is not appropriate for a GCC or GLIBC maintainer to impose behavior
 or conformance on a specific target and port-specific code. I am sorry
 that the failures bother you, but ports have the freedom to conform or
 not conform with standards in target-specific code.

It is appropriate for glibc maintainers to seek consensus in the glibc 
community on minimum requirements for glibc ports, just as on any other 
question about what is or is not supported in glibc.  For example, it is 
presently expected that a port uses ELF and supports TLS and PIC.  What 
floating-point formats are supported is part of that.

I'll seek consensus in the glibc community on minimum standards for the 
underlying floating-point arithmetic.

Note that some architectures use -mieee when building glibc in order to 
get floating point corresponding to glibc expectations (which may be 
slower than the GCC defaults on those architectures).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Ping Re: Fix IBM long double spurious overflows

2014-01-28 Thread Joseph S. Myers
On Wed, 29 Jan 2014, Joseph S. Myers wrote:

 It is appropriate for glibc maintainers to seek consensus in the glibc 
 community on minimum requirements for glibc ports, just as on any other 
 question about what is or is not supported in glibc.  For example, it is 
 presently expected that a port uses ELF and supports TLS and PIC.  What 
 floating-point formats are supported is part of that.
 
 I'll seek consensus in the glibc community on minimum standards for the 
 underlying floating-point arithmetic.

I have now made my proposal at 
https://sourceware.org/ml/libc-alpha/2014-01/msg00582.html.  This is 
similar to plenty of other proposals in glibc (such as those I made 
recently regarding principles for moving architectures from ports to libc, 
or regarding increasing the minimum Linux kernel version).

As the de facto maintainer for the soft-float powerpc support in glibc, I 
get to deal with these test failures when reviewing test results each 
release cycle, and they take a disproportionate amount of time, and my 
view is that fixing these issues in the __gcc_* functions, in copies local 
to glibc if necessary, is the best way to keep the ldbl-128ibm support in 
glibc maintainable.  (Although I'd certainly prefer to have suitable 
versions of the functions in libgcc, under different names and only 
enabled with options such as -frounding-math if necessary - although 
enabling correctness for exceptions with -frounding-math would seem rather 
odd.)

glibc used not to require TLS or atomic operations beyond test-and-set; 
the non-TLS support became unmaintainable, and now it does require TLS 
(which required kernel helpers to be implemented on various architectures, 
and did make things slower for some architectures on some workloads).  It 
used not to require a 2.6 kernel when used with the Linux kernel; now it 
requires 2.6.16 and we're considering requiring 2.6.32.  I think of 
requirements on the floating point used by glibc functions in much the 
same way: evaluating a new requirement that would help in maintaining and 
developing glibc.

-- 
Joseph S. Myers
jos...@codesourcery.com


C++ PATCH for c++/59791 (ICE with return type mentioning local variable)

2014-01-28 Thread Jason Merrill
We already deal with PARM_DECLs that aren't available for lookup in a 
late-specified return type, but this case needs the same treatment for a 
local variable.  During normal instantiation of the lambda we find the 
local variable fine, but later when we're doing dump_template_bindings 
from cxx_print_decl, the variable isn't in scope anymore, and we can 
deal with that in the same way we handle PARM_DECLs.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 6930b77188740b3e1979260ee805b38733b0f698
Author: Jason Merrill ja...@redhat.com
Date:   Tue Jan 28 16:56:42 2014 -0500

	PR c++/59791
	* pt.c (tsubst_decl) [VAR_DECL]: Allow in unevaluated context.
	(tsubst_copy): Use it if lookup fails.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 011db2c..7f1b6d5 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -10990,9 +10990,7 @@ tsubst_decl (tree t, tree args, tsubst_flags_t complain)
 	DECL_TEMPLATE_INFO (r) = build_template_info (tmpl, argvec);
 	SET_DECL_IMPLICIT_INSTANTIATION (r);
 	  }
-	else if (cp_unevaluated_operand)
-	  gcc_unreachable ();
-	else
+	else if (!cp_unevaluated_operand)
 	  register_local_specialization (r, t);
 
 	DECL_CHAIN (r) = NULL_TREE;
@@ -12481,6 +12479,11 @@ tsubst_copy (tree t, tree args, tsubst_flags_t complain, tree in_decl)
 		}
 	  else
 		{
+		  /* This can happen for a variable used in a late-specified
+		 return type of a local lambda.  Just make a dummy decl
+		 since it's only used for its type.  */
+		  if (cp_unevaluated_operand)
+		return tsubst_decl (t, args, complain);
 		  gcc_assert (errorcount || sorrycount);
 		  return error_mark_node;
 		}
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-decltype1.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-decltype1.C
new file mode 100644
index 000..0ab0cdd
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-decltype1.C
@@ -0,0 +1,21 @@
+// PR c++/59791
+// We force the gimple dump to trigger use of lang_decl_name.
+// { dg-options "-std=c++11 -fdump-tree-gimple" }
+// { dg-final { cleanup-tree-dump "gimple" } }
+
+template < class T > void
+f (T t)
+{
+  int i = t;
+  [](int)->decltype (i + t)
+  {
+return 0;
+  }
+  (0);
+}
+
+void
+foo ()
+{
+  f (0);
+}


C++ PATCH for c++/59315 (Wunused-3.C with -fno-use-cxa-atexit)

2014-01-28 Thread Jason Merrill
The old build_cleanup called mark_used; the new one that just calls 
cxx_maybe_build_cleanup didn't any more.  So let's add a call in the 
latter function.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit be2e25f452c30b0a7a00a803422cdbbd80c4e9c3
Author: Jason Merrill ja...@redhat.com
Date:   Tue Jan 28 22:30:31 2014 -0500

	PR c++/59315
	* decl.c (cxx_maybe_build_cleanup): Call mark_used.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index e14e401..e57cf07 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -14353,6 +14353,12 @@ cxx_maybe_build_cleanup (tree decl, tsubst_flags_t complain)
  destructor call instead.  */
   if (cleanup != NULL && EXPR_P (cleanup))
 SET_EXPR_LOCATION (cleanup, UNKNOWN_LOCATION);
+
+  if (cleanup)
+/* Treat all objects with destructors as used; the destructor may do
+   something substantive.  */
+mark_used (decl);
+
   return cleanup;
 }
 
diff --git a/gcc/testsuite/g++.dg/warn/Wunused-3.C b/gcc/testsuite/g++.dg/warn/Wunused-3.C
index 3100909..2d00dda 100644
--- a/gcc/testsuite/g++.dg/warn/Wunused-3.C
+++ b/gcc/testsuite/g++.dg/warn/Wunused-3.C
@@ -1,5 +1,5 @@
 // { dg-do compile }
-// { dg-options "-Wunused -O" }
+// { dg-options "-Wunused -O -fno-use-cxa-atexit" }
 
 void do_cleanups();
 


[Ping][Patch, trivial] PR 56653: Fix warning when verifying checksums from MD5SUMS file in tarballs

2014-01-28 Thread Dmitry Gorbachev
---------- Forwarded message ----------
From: Dmitry Gorbachev d.g.gorbac...@gmail.com
Date: Sat, 28 Dec 2013 02:33:18 +0400
Subject: [Patch, trivial] PR 56653: Fix warning when verifying
checksums from MD5SUMS file in tarballs
To: gcc-patches@gcc.gnu.org

This patch fixes the `md5sum: WARNING: 1 line is improperly formatted' warning.


2013-12-28  Dmitry Gorbachev  d.g.gorbac...@gmail.com

PR 56653
* gcc_release: Add an extra `#' character to the comment header of
MD5SUMS.


*** maintainer-scripts/gcc_release
--- maintainer-scripts/gcc_release
***************
*** 214,218 ****
  # Suggested usage:
  # md5sum -c MD5SUMS | grep -v \OK$\
!   MD5SUMS

find . -type f |
--- 214,218 ----
  # Suggested usage:
  # md5sum -c MD5SUMS | grep -v \OK$\
! #  MD5SUMS

find . -type f |


[PATCH][PING] Fix handling of context diff patches in mklog

2014-01-28 Thread Yury Gribov



-------- Original Message --------
Subject: [PATCH] Fix handling of context diff patches in mklog
Date: Wed, 22 Jan 2014 18:36:23 +0400
From: Yury Gribov y.gri...@samsung.com
To: GCC Patches gcc-patches@gcc.gnu.org,  Diego Novillo 
dnovi...@google.com
CC: Viacheslav Garbuzov v.garbu...@samsung.com,  Yuri Gribov 
tetra2...@gmail.com


Hi,

This patch improves support for context diffs in mklog and also fixes a
tiny bug that caused redundant colons to be generated in the output.
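
For context: in a context diff (diff -c) both the file headers and the per-hunk range markers begin with `***` or `---`, so the two have to be told apart. Below is a rough C++ sketch of that distinction, written only for illustration; it is not the Perl code used in mklog, the patterns are a simplification of the ones in the patch, and the names is_file_header and check_examples are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if LINE looks like a "*** file", "+++ file" or
// "--- file" header rather than a range marker such as "*** 214,218 ****".
bool is_file_header(const std::string& line) {
  static const std::regex header(R"(^(\*\*\*|\+\+\+|---) +\S+)");
  static const std::regex range_marker(R"((\*\*\*|---)$)");
  // Range markers (and the "***************" hunk separator) end in "***"
  // or "---"; file headers normally do not.
  if (std::regex_search(line, range_marker))
    return false;
  return std::regex_search(line, header);
}

// Hypothetical self-check using lines of the shapes seen in real patches.
void check_examples() {
  assert(is_file_header("*** maintainer-scripts/gcc_release"));
  assert(is_file_header("--- a/contrib/mklog"));
  assert(is_file_header("+++ b/contrib/mklog"));
  assert(!is_file_header("*** 214,218 ****"));
  assert(!is_file_header("--- 214,218 ----"));
  assert(!is_file_header("***************"));
}
```

Rejecting lines that end in the marker characters first keeps the header pattern itself simple, at the cost of misclassifying a file whose name happens to end in `***` or `---`.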

Verified against several real-world patches.

Ok to commit?

-Y



diff --git a/contrib/mklog b/contrib/mklog
index 16ce191..c6d89a9 100755
--- a/contrib/mklog
+++ b/contrib/mklog
@@ -80,18 +80,16 @@ sub remove_suffixes ($) {
 	return $filename;
 }
 
-# Check if line can be a function declaration:
-# First pattern cut extra symbols added by diff
-# second pattern checks that line is not a comment or brace
-sub is_function  {
+# Check if line is a top-level declaration.
+# TODO: ignore preprocessor directives except maybe #define ?
+sub is_top_level {
 	my ($function, $is_context_diff) = (@_);
 	if ($is_context_diff) {
 		$function =~ s/^..//;
 	} else {
 		$function =~ s/^.//;
 	}
-	return $function
-	&& ($function !~ /^[\s{}]/);
+	return $function && $function !~ /^[\s{}]/;
 }
 
 # For every file in the .diff print all the function names in ChangeLog
@@ -105,13 +103,14 @@ chomp (my @diff_lines = <DFILE>);
 close (DFILE);
 $line_idx = 0;
 foreach (@diff_lines) {
-# Stop processing functions if we found a new file
+# Stop processing functions if we found a new file.
 	# Remember both left and right names because one may be /dev/null.
-if (/^[+*][+*][+*] +(\S+)/) {
+# Don't be fooled by line markers in case of context diff.
+if (!/\*\*\*$/ && /^[+*][+*][+*] +(\S+)/) {
 		$left = remove_suffixes ($1);
 		$look_for_funs = 0;
 	}
-if (/^--- +(\S+)?/) {
+if (!/---$/ && /^--- +(\S+)?/) {
 		$right = remove_suffixes ($1);
 		$look_for_funs = 0;
 	}
@@ -120,7 +119,7 @@ foreach (@diff_lines) {
 	# We should now have both left and right name,
 	# so we can decide filename.
 
-if ($left && (/^\*{15}$/ || /^@@ /)) {
+if ($left && (/^\*{15}/ || /^@@ /)) {
 	# If we have not seen any function names in the previous file (ie,
 	# $change_msg is empty), we just write out a ':' before starting the next
 	# file.
@@ -145,9 +144,15 @@ foreach (@diff_lines) {
 	$look_for_funs = $filename =~ '\.(c|cpp|C|cc|h|inc|def)$';
 }
 
-# Remember the last line in a unified diff block that might start
+# Context diffs have extra whitespace after first char;
+# remove it to make matching easier.
+if ($is_context_diff) {
+  s/^([-+! ]) /\1/;
+}
+
+# Remember the last line in a diff block that might start
 # a new function.
-if (/^[-+ ]([a-zA-Z0-9_].*)/) {
+if (/^[-+! ]([a-zA-Z0-9_].*)/) {
 $save_fn = $1;
 }
 
@@ -169,9 +174,9 @@ foreach (@diff_lines) {
 
 # Mark if we met doubtfully changed function.
 $doubtfunc = 0;
-$is_context_diff = 0;
 if ($diff_lines[$line_idx] =~ /^@@ .* @@ ([a-zA-Z0-9_].*)/) {
 	$doubtfunc = 1;
+$is_context_diff = 0;
 }
 elsif ($diff_lines[$line_idx] =~ /^\*\*\*\*\*\** ([a-zA-Z0-9_].*)/) {
 	$doubtfunc = 1;
@@ -184,17 +189,16 @@ foreach (@diff_lines) {
 # Note that we don't try too hard to find good matches.  This should
 # return a superset of the actual set of functions in the .diff file.
 #
-# The first two patterns work with context diff files (diff -c). The
-# third pattern works with unified diff files (diff -u).
+# The first pattern works with context diff files (diff -c). The
+# second pattern works with unified diff files (diff -u).
 #
-# The fourth pattern looks for the starts of functions or classes
-# within a unified diff block.
+# The third pattern looks for the starts of functions or classes
+# within a diff block both for context and unified diff files.
 
 if ($look_for_funs
 && (/^\*\*\*\*\*\** ([a-zA-Z0-9_].*)/
-|| /^[\-\+\!] ([a-zA-Z0-9_]+)[ \t]*\(.*/
 	|| /^@@ .* @@ ([a-zA-Z0-9_].*)/
-	|| /^[-+ ](\{)/))
+	|| /^[-+! ](\{)/))
   {
 	$_ = $1;
 	my $fn;
@@ -219,12 +223,16 @@ foreach (@diff_lines) {
 	$no_real_change = 0;
 	if ($doubtfunc) {
 		$idx = $line_idx;
+	# Skip line info in context diffs.
+		while ($is_context_diff && $diff_lines[$idx + 1] =~ /^[-\*]{3} [0-9]/) {
+			++$idx;
+		}
 	# Check all lines till the first change
 	# for the presence of really changed function
 		do {
 			++$idx;
-			$no_real_change = is_function ($diff_lines[$idx], $is_context_diff);
-		} while (!$no_real_change && ($diff_lines[$idx] !~  /^[\+\-\!]/))
+			$no_real_change = is_top_level ($diff_lines[$idx], $is_context_diff);
+		} while (!$no_real_change && ($diff_lines[$idx] !~ /^[-+!]/))
 	}
 	if ($fn && !$seen_names{$fn} && !$no_real_change) {
 	# If this is the first function in the file, we display it next
@@ -246,7 +254,7 @@ foreach (@diff_lines) {
 # If we have