Re: std::regex: inserting std::wregex to std::vector loses some std::wregex values

2014-09-24 Thread Tim Shen
On Tue, Sep 16, 2014 at 5:28 PM, Tim Shen tims...@google.com wrote:
 So I'll change the patch to move _M_traits to _NFA, and add a new
 basic_regex::_M_loc member.

Here it is :). Bootstrapped and tested with debug flag.

 Should the abi compatible fix be another patch for branch 4.9? In
 which the move ctor is not noexcept and calls the copy ctor?

I'll make another patch for it.


-- 
Regards,
Tim Shen
commit 58b73dfbd04eefcfa4a1ff570e38de83b2f0daa9
Author: Tim Shen tims...@google.com
Date:   Sun Sep 21 16:23:13 2014 -0700

PR libstdc++/63199
* include/bits/regex.h (basic_regex::basic_regex, basic_regex::assign,
basic_regex::imbue, basic_regex::getloc, basic_regex::swap): Add
_M_loc for basic_regex.
* include/bits/regex_automaton.h: Add _M_traits for _NFA.
* include/bits/regex_compiler.h (_Compiler::_M_get_nfa, __compile_nfa):
Make _Compiler::_M_nfa heap allocated.
* include/bits/regex_compiler.tcc (_Compiler::_Compiler): Make
_Compiler::_M_nfa heap allocated.
* include/bits/regex_executor.h (_Executor::_M_is_word):
Fix accessing _M_traits.
* include/bits/regex_executor.tcc (_Executor::_M_dfs):
Fix accessing _M_traits.
* testsuite/28_regex/algorithms/regex_match/ecma/wchar_t/63199.cc:
New testcase.
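
For readers without the PR at hand, the symptom in the subject line can be exercised with a sketch along these lines. This is not the testcase added by the patch; the helper names build_patterns and count_self_matches are made up for illustration, and the sketch assumes purely alphanumeric patterns, which should match their own source text.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the patch): compile each source string
// and insert the temporary std::wregex into a vector.  No reserve() is
// done, so the vector relocates its elements while it grows.
std::vector<std::wregex>
build_patterns(const std::vector<std::wstring>& sources)
{
  std::vector<std::wregex> patterns;
  for (const auto& src : sources)
    patterns.push_back(std::wregex(src));
  return patterns;
}

// Count how many stored patterns still match their own source text.
// With a working library every alphanumeric pattern matches itself.
std::size_t
count_self_matches(const std::vector<std::wstring>& sources)
{
  const std::vector<std::wregex> patterns = build_patterns(sources);
  assert(patterns.size() == sources.size());
  std::size_t matched = 0;
  for (std::size_t i = 0; i < sources.size(); ++i)
    if (std::regex_match(sources[i], patterns[i]))
      ++matched;
  return matched;
}
```

Calling count_self_matches with a handful of wide strings should return the number of strings passed in; a smaller count would indicate that some regex objects lost their state on insertion.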

diff --git a/libstdc++-v3/include/bits/regex.h b/libstdc++-v3/include/bits/regex.h
index 5205089..4ec20d7 100644
--- a/libstdc++-v3/include/bits/regex.h
+++ b/libstdc++-v3/include/bits/regex.h
@@ -64,7 +64,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 inline std::shared_ptr<_NFA<_TraitsT>>
 __compile_nfa(const typename _TraitsT::char_type* __first,
  const typename _TraitsT::char_type* __last,
- const _TraitsT& __traits,
+ const typename _TraitsT::locale_type& __loc,
  regex_constants::syntax_option_type __flags);
 
 _GLIBCXX_END_NAMESPACE_VERSION
@@ -433,7 +433,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
* character sequence.
*/
   basic_regex()
-  : _M_flags(ECMAScript), _M_automaton(nullptr)
+  : _M_flags(ECMAScript), _M_loc(), _M_original_str(), _M_automaton(nullptr)
   { }
 
   /**
@@ -481,10 +481,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*
* @param __rhs A @p regex object.
*/
-  basic_regex(const basic_regex&& __rhs) noexcept
-  : _M_flags(__rhs._M_flags), _M_traits(__rhs._M_traits),
-   _M_automaton(std::move(__rhs._M_automaton))
-  { }
+  basic_regex(basic_regex&& __rhs) noexcept = default;
 
   /**
* @brief Constructs a basic regular expression from the string
@@ -520,12 +517,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
basic_regex(_FwdIter __first, _FwdIter __last,
flag_type __f = ECMAScript)
: _M_flags(__f),
+ _M_loc(),
  _M_original_str(__first, __last),
- _M_automaton(__detail::__compile_nfa(_M_original_str.c_str(),
-  _M_original_str.c_str()
-+ _M_original_str.size(),
-  _M_traits,
-  _M_flags))
+ _M_automaton(__detail::__compile_nfa<_Rx_traits>(
+   _M_original_str.c_str(),
+   _M_original_str.c_str() + _M_original_str.size(),
+   _M_loc,
+   _M_flags))
{ }
 
   /**
@@ -662,9 +660,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  _M_flags = __flags;
  _M_original_str.assign(__s.begin(), __s.end());
  auto __p = _M_original_str.c_str();
- _M_automaton = __detail::__compile_nfa(__p,
-__p + _M_original_str.size(),
-_M_traits, _M_flags);
+ _M_automaton = __detail::__compile_nfa<_Rx_traits>(
+   __p,
+   __p + _M_original_str.size(),
+   _M_loc,
+   _M_flags);
  return *this;
}
 
@@ -728,9 +728,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   locale_type
   imbue(locale_type __loc)
   {
-   auto __ret = _M_traits.imbue(__loc);
-   this->assign(_M_original_str, _M_flags);
-   return __ret;
+   std::swap(__loc, _M_loc);
+   if (_M_automaton != nullptr)
+ this->assign(_M_original_str, _M_flags);
+   return __loc;
   }
 
   /**
@@ -739,7 +740,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*/
   locale_type
   getloc() const
-  { return _M_traits.getloc(); }
+  { return _M_loc; }
 
   // [7.8.6] swap
   /**
@@ -751,7 +752,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   swap(basic_regex& __rhs)
   {
std::swap(_M_flags, __rhs._M_flags);
-   std::swap(_M_traits, __rhs._M_traits);
+   std::swap(_M_loc, __rhs._M_loc);
+   std::swap(_M_original_str, __rhs._M_original_str);
std::swap(_M_automaton, 

Re: [PATCH] Do not remove labels with LABEL_PRESERVE_P

2014-09-24 Thread Ilya Enkovich
2014-09-23 20:06 GMT+04:00 Jeff Law l...@redhat.com:
 On 09/23/14 10:01, Steven Bosscher wrote:

 On Fri, Sep 19, 2014 at 10:03 PM, Jeff Law l...@redhat.com wrote:

 On 09/19/14 13:36, Ilya Enkovich wrote:


 Hi,

 During my work on enabling the pseudo PIC register I've found that cfg
 cleanup may remove labels with LABEL_PRESERVE_P set to 1.  In my case
 I generated SET_RIP during the expand pass and cfg cleanup removed the
 label it used as an operand.  Below is a patch that fixes it.  It is
 not actually required for our latest PIC related patch but still seems
 to make sense.

 Bootstrapped and tested on linux-x86_64.

 Thanks,
 Ilya
 --
 2014-09-19  Ilya Enkovich  ilya.enkov...@intel.com

  * cfgcleanup.c (try_optimize_cfg): Do not remove label
  with LABEL_PRESERVE_P flag set.


 OK.  Please install.

 Note for those not following the x86 32 bit PIC register discussion, I
 asked Ilya to submit this separately.  It was something an earlier
 version of his patch triggered and it stood out as something that ought
 to be fixed regardless of the final form of the PIC register changes
 that are in progress.


 Jeff,

 Are you sure this patch is necessary, and is not just papering over
 another problem? In the past, all cases I've seen where labels were
 removed inadvertently were caused by incorrect reference counting or
 missing REG_LABEL_* notes.

The description of LABEL_PRESERVE_P says it marks a label that should
always be considered to be needed.  That means we shouldn't remove it
even if it currently has no uses.  Why couldn't some additional uses be
added later?


 Did the label use count drop to zero? Is there a REG_LABEL_TARGET note
 for the label operand?

In the current code of ix86_expand_prologue I don't see any note
generation for the set_rip_rex64 instruction, which actually uses the
label.  But IMO this is another potential issue and we still shouldn't
remove labels with LABEL_PRESERVE_P.


 The way it was described to me is, yes, the label count dropped to zero.  In
 simplest terms, it was a single use label that was marked with
 LABEL_PRESERVE_P.  The combiner removed the last reference, then cfgcleanup
 came along and *boom*.
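
The rule being argued for can be summarised with a toy model. This is only a sketch: ToyLabel and its fields are invented stand-ins for an RTL label, its use count and its LABEL_PRESERVE_P flag, and the code is not taken from cfgcleanup.c.

```cpp
#include <cassert>

// Toy stand-in for an RTL label; the names are illustrative only.
struct ToyLabel
{
  int  nuses;     // remaining references to the label
  bool preserve;  // models LABEL_PRESERVE_P: "always considered needed"
};

// Models another pass (e.g. the combiner) dropping one reference.
ToyLabel
drop_one_use(ToyLabel label)
{
  assert(label.nuses > 0);
  --label.nuses;
  return label;
}

// A cleanup pass may delete a label only if nothing refers to it
// AND it is not marked as preserved.
bool
may_delete_label(const ToyLabel& label)
{
  if (label.preserve)
    return false;
  return label.nuses == 0;
}
```

In this model a preserved single-use label survives after its last use is dropped, while an unpreserved one becomes deletable.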


There was also another case, on a 64-bit target with the large code
model, where I had a combiner-unrelated problem: a label was removed
while it was still used by an existing set_rip_rex64.

Ilya

 It was with some ongoing development work that's going in a slightly
 different direction, so we don't have a testcase to include.

 jeff


Re: Enable EBX for x86 in 32bits PIC code

2014-09-24 Thread Ilya Enkovich
2014-09-23 20:10 GMT+04:00 Jeff Law l...@redhat.com:
 On 09/23/14 10:03, Jakub Jelinek wrote:

 On Tue, Sep 23, 2014 at 10:00:00AM -0600, Jeff Law wrote:

 On 09/23/14 08:34, Jakub Jelinek wrote:

 On Tue, Sep 23, 2014 at 05:54:37PM +0400, Ilya Enkovich wrote:

 use fixed EBX at least until we make sure pseudo PIC doesn't harm debug
 info generation.  If we have such option then gcc.target/i386/pic-1.c
 and


 For debug info, it seems you are already handling this in
 delegitimize_address target hook, I'd suggest just building some very
 large shared library at -O2 -g -fpic on i?86 and either look at the
 sizes of .debug_info/.debug_loc sections with/without the patch,
 or use the locstat utility from elfutils (talk to Petr Machata if
 needed).

 Can't hurt, but I really don't see how changing from a fixed to an
 allocatable register is going to muck up debug info in any significant
 way.


 What matters is if the delegitimize_address target hook is as efficient
 in delegitimization as before.  E.g. if it previously matched only when
 seeing %ebx + gotoff or similar, and wouldn't match anything now, some
 vars could have debug locations including UNSPEC and be dropped on the
 floor.

 Ah, yea, that makes sense.

 jeff


After register allocation we have no idea where the GOT address is, and
therefore the delegitimize_address target hook becomes less efficient
and cannot remove UNSPECs. This is what I see now when building GCC with
the patch applied:

../../../../gcc/libgfortran/generated/sum_r4.c: In function 'msum_r4':
../../../../gcc/libgfortran/generated/sum_r4.c:195:1: note:
non-delegitimized UNSPEC UNSPEC_GOTOFF (1) found in variable location
 msum_r4 (gfc_array_r4 * const restrict retarray,
 ^
../../../../gcc/libgfortran/generated/sum_r4.c:195:1: note:
non-delegitimized UNSPEC UNSPEC_GOTOFF (1) found in variable location
../../../../gcc/libgfortran/generated/sum_r4.c:195:1: note:
non-delegitimized UNSPEC UNSPEC_GOTOFF (1) found in variable location
../../../../gcc/libgfortran/generated/sum_r4.c:195:1: note:
non-delegitimized UNSPEC UNSPEC_GOTOFF (1) found in variable location
../../../../gcc/libgfortran/generated/sum_r8.c: In function 'msum_r8':
../../../../gcc/libgfortran/generated/sum_r8.c:195:1: note:
non-delegitimized UNSPEC UNSPEC_GOTOFF (1) found in variable location
 msum_r8 (gfc_array_r8 * const restrict retarray,
 ^
../../../../gcc/libgfortran/generated/sum_r8.c:195:1: note:
non-delegitimized UNSPEC UNSPEC_GOTOFF (1) found in variable location
../../../../gcc/libgfortran/generated/sum_r8.c:195:1: note:
non-delegitimized UNSPEC UNSPEC_GOTOFF (1) found in variable location
../../../../gcc/libgfortran/generated/sum_r8.c:195:1: note:
non-delegitimized UNSPEC UNSPEC_GOTOFF (1) found in variable location


Ilya


Re: [PATCH, i386, Pointer Bounds Checker 33/x] MPX ABI

2014-09-24 Thread Ilya Enkovich
2014-09-23 22:01 GMT+04:00 Jeff Law l...@redhat.com:
 On 09/23/14 00:31, Ilya Enkovich wrote:


 I did this change a couple of years ago and don't remember exactly
 what problem was caused by PARALLEL.  But from my comment it seems
 the PARALLEL led to the values in BND0 and BND1 not actually being
 defined by the call from the DF point of view.  I'll try to reproduce
 the problem I had.

 Please do.  That would indicate a bug in the DF infrastructure.  I'm not
 real familiar with the DF implementation, but a quick glance at
 df_def_record_1 seems to indicate it's got support for a set destination
 being a PARALLEL.

 This kind of scheme also doesn't tend to play well with exception
 handling and scheduling because you can't guarantee the sets and the
 call are in the same block and scheduled as a single group.


 How can the sets and the call not be in the same block/group if all of
 them are parts of a single instruction?

 Obviously in the cases where we've had these problems in the past they were
 distinct instructions.  So EH interactions aren't going to be an issue for
 MPX.

 However, we've still got the problem that the RTL you've generated is
 ill-formed.  If I understand things correctly, the assignments are the
 result of the call, that should be modeled by having the destination be a
 PARALLEL as mentioned earlier.

OK. Will try it. BTW call_value_pop patterns have two sets: one for the
returned value and one for the stack register. How does that differ much
from what I do with bound regs?

Thanks,
Ilya




 Jeff


Re: [gomp4] OpenACC wait directive

2014-09-24 Thread Ilmir Usmanov

Hi Cesar!

Thank you for the patch!

On 24.09.2014 02:29, Cesar Philippidis wrote:

This patch adds support for the async clause in the wait directive in
Fortran. It should be pretty straightforward. The Fortran FE already
supports the wait directive, but the async clause was introduced to the
wait directive in OpenACC 2.0 and that was missing in gomp-4_0-branch.

Yes, I've mostly focused on spec. ver. 1.0.


Is this OK for gomp-4_0-branch?

No, it isn't. According to the spec and this presentation:
http://www.pgroup.com/lit/presentations/cea-3.pdf (See slide 1-35)
it is possible to write a construct like:
!$acc wait(1) async(2)
However, your patch doesn't support this. Also, don't forget to check
whether a queue waits on itself (for example, wait(1) async(1)).
In addition, it breaks the current support of the directive (for
example, wait(1)).
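
The self-wait check mentioned above could look roughly like this. It is only a sketch of the idea, not code from the Fortran FE: waits_on_itself is a hypothetical name, and the queue ids are assumed to be already resolved to integer constants.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical check: for "wait(ids...) async(async_id)", report
// whether the async queue is also listed among the queues waited on.
bool
waits_on_itself(const std::vector<int>& wait_ids, int async_id)
{
  return std::find(wait_ids.begin(), wait_ids.end(), async_id)
	 != wait_ids.end();
}
```

With this, wait(1) async(2) is accepted while wait(1) async(1) is flagged. A real implementation would also have to decide what to do when the arguments are not compile-time constants.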



Note that this patch doesn't actually
implement the async or wait clause in the middle end yet, because that
requires additional runtime support.

Thanks,
Cesar

--
Ilmir.


Re: [PATCH, Pointer Bounds Checker 22/x] Inline

2014-09-24 Thread Ilya Enkovich
2014-09-23 23:55 GMT+04:00 Jeff Law l...@redhat.com:
 On 08/18/14 09:35, Ilya Enkovich wrote:

 Here is an updated version.

 Thanks,
 Ilya
 --
 2014-08-15  Ilya Enkovich  ilya.enkov...@intel.com

 * ipa-inline.c (early_inliner): Check edge has summary allocated.
 * tree-inline.c: Include tree-chkp.h.
 (declare_return_variable): Add arg holding
 returned bounds slot.  Create and initialize returned bounds var.
 (remap_gimple_stmt): Handle returned bounds.
 Return sequence of statements instead of a single statement.
 (insert_init_stmt): Add declaration.
 (remap_gimple_seq): Adjust to new remap_gimple_stmt signature.
 (copy_bb): Adjust to changed return type of remap_gimple_stmt.
 (expand_call_inline): Handle returned bounds.  Add bounds copy
 for generated mem to mem assignments.
 * tree-inline.h (copy_body_data): Add fields retbnd and
 assign_stmts.
 * cgraph.c: Include tree-chkp.h.
 (cgraph_redirect_edge_call_stmt_to_callee): Support
 returned bounds.
 * value-prof.c: Include tree-chkp.h.
 (gimple_ic): Support returned bounds.

 OK for the trunk.

 FWIW, when building up gimple (or RTL if you were ever to do that one day),
 it's sometimes helpful to the reviewer to show what you're doing.  For
 example, it took me a bit of time to realize that you needed the output from
 the direct call as an argument to the duplicated RETBND statement.  It
 looked for quite a while like you'd simply made a mistake.

Got it.  Will try to give more useful descriptions for my patches in the future.


 I'm a bit curious why you removed the original RETBND statement in
 value-prof, only to reinsert it.  Is there some reason you needed to do
 that?

After the call transformation we have something like this:

if (condition)
  new_lhs = direct_call (...);
else
  old_lhs = call (...);
old_bnd = __builtin_retbnd (old_lhs);

Original retbnd statement removal + reinsertion is used to transform it into:

if (condition)
  new_lhs = direct_call (...);
else
{
  old_lhs = call (...);
  old_bnd = __builtin_retbnd (old_lhs);
}

The rest of code inserts bounds for new_lhs and creates phi node for
bounds similar to what is done for call return value.

Thanks,
Ilya


 Richi -- in response to your comment about working around a bug earlier in
 this thread.  As Ilya mentioned, he just cloned existing practice in that
 code for creating the copy of the call.


 Jeff


Re: [PATCH] Fix PR63266: Keep track of impact of sign extension in bswap

2014-09-24 Thread Richard Biener
On Tue, Sep 16, 2014 at 12:24 PM, Thomas Preud'homme
thomas.preudho...@arm.com wrote:
 Hi all,

 The fix for PR61306 disabled bswap when a sign extension is detected. However
 this led to a test case regression (and potential performance regression) in
 the case where a sign extension happens but its effect is canceled by other
 bit manipulation. This patch aims to fix that by having a special marker to
 track bytes whose value is unpredictable due to sign extension. If the final
 result of a bit manipulation doesn't contain any such marker then the bswap
 optimization can proceed.

Nice and simple idea.

Ok.

Thanks,
Richard.
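
For readers following along, the marker idea from the quoted description can be modelled in a few lines. This is a simplified sketch, not the code in tree-ssa-math-opts.c: the constant and function names are made up, shifts are restricted to whole bytes, and the encoding (one marker byte per value byte, 0 for a known-zero byte, 1..8 for a source byte index, 0xff for "unknown") is an assumption chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

constexpr int           kBitsPerMarker = 8;
constexpr std::uint64_t kMarkerMask    = 0xff;
constexpr std::uint64_t kByteUnknown   = kMarkerMask;  // assumed encoding

// Symbolic number of an untouched value: byte i carries marker i + 1.
std::uint64_t
identity(int size_bytes)
{
  std::uint64_t n = 0;
  for (int i = 0; i < size_bytes; ++i)
    n |= static_cast<std::uint64_t>(i + 1) << (i * kBitsPerMarker);
  return n;
}

// Arithmetic right shift by whole bytes of a signed size_bytes-wide
// value.  Vacated high bytes are copies of the sign bit, so unless the
// top byte is known to be zero they are marked as unknown.
std::uint64_t
arithmetic_shift_right(std::uint64_t n, int size_bytes, int shift_bytes)
{
  assert(size_bytes >= 1 && size_bytes <= 8);
  assert(shift_bytes >= 0 && shift_bytes < size_bytes);
  const int top = (size_bytes - 1) * kBitsPerMarker;
  const bool top_known_zero = ((n >> top) & kMarkerMask) == 0;
  n >>= shift_bytes * kBitsPerMarker;
  if (!top_known_zero)
    for (int i = size_bytes - shift_bytes; i < size_bytes; ++i)
      n |= kByteUnknown << (i * kBitsPerMarker);
  return n;
}

// Models an AND with a mask that keeps only the low `keep` bytes.
std::uint64_t
keep_low_bytes(std::uint64_t n, int keep)
{
  if (keep >= 8)
    return n;
  return n & ((std::uint64_t{1} << (keep * kBitsPerMarker)) - 1);
}

// The optimization may proceed only if no byte is marked unknown.
bool
has_unknown_byte(std::uint64_t n, int size_bytes)
{
  for (int i = 0; i < size_bytes; ++i)
    if (((n >> (i * kBitsPerMarker)) & kMarkerMask) == kByteUnknown)
      return true;
  return false;
}
```

Shifting a 4-byte value right by 3 bytes leaves three unknown bytes in this model; masking the result down to its low byte removes them again, which is the "effect is canceled" case the patch wants to keep optimizing.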

 *** gcc/ChangeLog ***

 2014-09-15  Thomas Preud'homme  thomas.preudho...@arm.com

 PR tree-optimization/63266
 * tree-ssa-math-opts.c (struct symbolic_number): Add comment about
 marker for unknown byte value.
 (MARKER_MASK): New macro.
 (MARKER_BYTE_UNKNOWN): New macro.
 (HEAD_MARKER): New macro.
 (do_shift_rotate): Mark bytes with unknown values due to sign
 extension when doing an arithmetic right shift. Replace hardcoded
 mask for marker by new MARKER_MASK macro.
 (find_bswap_or_nop_1): Likewise and adjust ORing of two symbolic
 numbers accordingly.

 *** gcc/testsuite/ChangeLog ***

 2014-09-15  Thomas Preud'homme  thomas.preudho...@arm.com

 PR tree-optimization/63266
 * gcc.dg/optimize-bswapsi-1.c (swap32_d): New bswap pass test.


 Testing:

 * Built an arm-none-eabi-gcc cross-compiler and used it to run the testsuite 
 on QEMU emulating Cortex-M3 without any regression
 * Bootstrapped on x86_64-linux-gnu target and testsuite was run without 
 regression


 Ok for trunk?


Re: [PATCH, testsuite]: PR 58757: Check for FP denormal values without triggering denormal exceptions

2014-09-24 Thread Uros Bizjak
On Tue, Sep 23, 2014 at 8:40 PM, Marc Glisse marc.gli...@inria.fr wrote:

 Attached patch avoids triggering denormal exceptions when FP insns are
 used to check for non-zero denormal values.


 But I thought the point of the test was to verify that the compiler's
 understanding of existence of subnormal values was consistent with the
 processor.  If the processor is in a mode supporting such values, the
 exceptions should be masked.  That is, the present test should pass
 unconditionally, if it doesn't pass that indicates a bug (which might be
 appropriate for XFAILing).


 Alpha needs special instruction mode to process denormals. Without
 this special mode the insn traps as soon as denormal value is
 processed.


 Yes, but I thought the point of that PR was that unless -mieee was given
 to support such values, *_TRUE_MIN should be the same as *_MIN, reflecting
 that they aren't supported.  And so the failure is showing that this bug
 is present (and so XFAILing with a comment referring to the bug is
 appropriate, rather than changing the test to pass).


 That's also my understanding, I am sorry Uros that I wasn't clear enough in
 the PR...

I see the intention now.

However, alpha *does* support all IEEE features; the only problem is
in its default model, which is for some reason High-Performance
IEEE-Format Arithmetic (please see alpha AHB [1], section 4.7.6.5).
"This model does not require the overhead of an operating system
completion handler and can be the fastest of the three IEEE models."
Unfortunately, this model also notifies applications of all
exceptional floating-point operations. Denormals are considered
non-finite IEEE values, so they trap.

When the target is in a certain high-speed mode, it is up to the user
to obey all the limitations, in this particular case, that only IEEE
finite numbers are provided. This is not the case with the original
testcase, so I'd say that the test is out of specs. It beats me why
-mieee is not the default on alpha, since the current default suits
-ffast-math more, but it looks like we have to live with this mess.

To avoid traps on denormals, -mieee has to be specified. This option
enables FP software completion that completes denormal handling, so
there is no need to notify the application. IMO, instead of XFAILing
the test, we should simply provide -mieee. __*_DENORM_MIN__ should
indeed apply to the underlying FP format, not to some target-dependent
model and its implementation details.
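
As an aside, one way to test for a non-zero denormal without doing FP arithmetic on it is to inspect the bit pattern. The sketch below is not the attached patch; it assumes double is IEEE binary64, and the function name is made up. Whether merely passing the value around avoids a trap on a given target is not something this sketch can promise.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Returns true if VALUE is a non-zero subnormal, judged purely from
// its bits: a zero exponent field with a non-zero fraction field.
bool
is_nonzero_subnormal_bits(const double& value)
{
  static_assert(sizeof(double) == sizeof(std::uint64_t),
		"sketch assumes a 64-bit double");
  std::uint64_t bits;
  std::memcpy(&bits, &value, sizeof bits);
  const std::uint64_t exponent = (bits >> 52) & 0x7ff;
  const std::uint64_t fraction = bits & ((std::uint64_t{1} << 52) - 1);
  return exponent == 0 && fraction != 0;
}
```

On a binary64 target this returns true for std::numeric_limits<double>::denorm_min() and false for 0.0 and for the smallest normal value.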

[1] http://www.compaq.com/cpq-alphaserver/technology/literature/alphaahb.pdf

Uros.


Re: [PATCH, Pointer Bounds Checker 19/x] Support bounds in expand

2014-09-24 Thread Ilya Enkovich
2014-09-24 0:58 GMT+04:00 Jeff Law l...@redhat.com:
 On 06/05/14 08:46, Ilya Enkovich wrote:

 2014-06-05  Ilya Enkovich  ilya.enkov...@intel.com

 * calls.c: Include tree-chkp.h, rtl-chkp.h, bitmap.h.
 (arg_data): Add fields special_slot, pointer_arg and
 pointer_offset.
 (store_bounds): New.
 (emit_call_1): Propagate instrumentation flag for CALL.
 (initialize_argument_information): Compute pointer_arg,
 pointer_offset and special_slot for pointer bounds arguments.
 (finalize_must_preallocate): Preallocate when storing bounds
 in bounds table.
 (compute_argument_addresses): Skip pointer bounds.
 (expand_call): Store bounds into tables separately.  Return
 result joined with resulting bounds.
 * cfgexpand.c: Include tree-chkp.h, rtl-chkp.h.
 (expand_call_stmt): Propagate bounds flag for CALL_EXPR.
 (expand_return): Add returned bounds arg.  Handle returned bounds.
 (expand_gimple_stmt_1): Adjust to new expand_return signature.
 (gimple_expand_cfg): Reset rtx bounds map.
 * expr.c: Include tree-chkp.h, rtl-chkp.h.
 (expand_assignment): Handle returned bounds.
 (store_expr_with_bounds): New.  Replaces store_expr with new
 bounds target argument.  Handle bounds returned by calls.
 (store_expr): Now wraps store_expr_with_bounds.
 * expr.h (store_expr_with_bounds): New.
 * function.c: Include tree-chkp.h, rtl-chkp.h.
 (bounds_parm_data): New.
 (use_register_for_decl): Do not registerize decls used for bounds
 stores and loads.
 (assign_parms_augmented_arg_list): Add bounds of the result
 structure pointer as the second argument.
 (assign_parm_find_entry_rtl): Mark bounds are never passed on
 the stack.
 (assign_parm_is_stack_parm): Likewise.
 (assign_parm_load_bounds): New.
 (assign_bounds): New.
 (assign_parms): Load bounds and determine a location for
 returned bounds.
 (diddle_return_value_1): New.
 (diddle_return_value): Handle returned bounds.
 * function.h (rtl_data): Add field for returned bounds.


 diff --git a/gcc/calls.c b/gcc/calls.c
 index e1dc8eb..5fbbe9f 100644
 --- a/gcc/calls.c
 +++ b/gcc/calls.c
 @@ -44,11 +44,14 @@ along with GCC; see the file COPYING3.  If not see
   #include "tm_p.h"
   #include "timevar.h"
   #include "sbitmap.h"
 +#include "bitmap.h"
   #include "langhooks.h"
   #include "target.h"
   #include "cgraph.h"
   #include "except.h"
   #include "dbgcnt.h"
 +#include "tree-chkp.h"
 +#include "rtl-chkp.h"

   /* Like PREFERRED_STACK_BOUNDARY but in units of bytes, not bits.  */
   #define STACK_BYTES (PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
 @@ -76,6 +79,15 @@ struct arg_data
 /* If REG is a PARALLEL, this is a copy of VALUE pulled into the correct
form for emit_group_move.  */
 rtx parallel_value;
 +  /* If value is passed in neither reg nor stack, this field holds a number
 + of a special slot to be used.  */
 +  rtx special_slot;

 I really dislike special_slot and the comment here.  The comment that it's
 neither a reg nor stack is just bogus.  What hardware resource does
 special_slot refer to?  It's a register, but one that we do not typically
 expose.  Let's at least clarify the comment and then we'll see if something
 other than special_slot as a name makes sense.  Yes, I realize this is a
 bit of bikeshedding, but when the comments/terminology is confusing, the
 code becomes even harder to understand.

Special slot is not a register.  When bounds are passed in a register
then everything works as if we pass any other argument in a register.
A special slot is used when we are out of bounds registers and pass
bounds for a pointer passed in a register.  It doesn't refer to any
hardware resource.  In the MPX ABI we state that special Bounds Table
entries (related to the stack pointer value (and lower) right before a
call) are used.  In a software implementation it may also be some other
place, like variables in TLS.


 I'm a bit concerned that this is exposing more details of the MPX
 implementation than is advisable to the front/middle end.  On the other
 hand, I'd expect any other implementation that seeks to work in a
 transparent manner is going to have many of the same implementation
 properties as we see with MPX, so perhaps it's not a major problem.

I'm trying not to introduce any hardware dependencies into the middle end.
Several months ago I created a simple prototype of generic target
support in Pointer Bounds Checker which used library calls instead of
MPX instructions, TLS for bounds passing etc.  I did it to check that
our design is not bound to MPX and allows such an implementation.  It
was very useful and showed some MPX details had leaked into the GIMPLE
part. E.g. the chkp_initialize_bounds and chkp_make_bounds_constant
hooks appeared during that work.
during that work.  Special slots mechanism worked well 

[PATCH i386 AVX512] [51/n] Add pd2dq and dq2pd converts.

2014-09-24 Thread Kirill Yukhin
Hello,
The patch at the bottom adds support for pd2dq and dq2pd
conversions.

Bootstrapped.
AVX-512* tests on top of patch-set all pass
under simulator.

Is it ok for trunk?

gcc/
* config/i386/i386.c
(avx512f_ufix_notruncv8dfv8si_mask_round): Rename to ...
(ufix_notruncv8dfv8si2_mask_round): this.
* config/i386/sse.md
(define_insn avx512f_cvtdq2pd512_2): Update TARGET check.
(define_insn avx_cvtdq2pd256_2): Add EVEX version.
(define_insn sse2_cvtdq2pdmask_name): Add masking.
(define_insn avx_cvtpd2dq256mask_name): Ditto.
(define_expand sse2_cvtpd2dq): Delete.
(define_insn sse2_cvtpd2dqmask_name): Add masking.
(define_insn avx512f_ufix_notruncv8dfv8simask_nameround_name):
Delete.
(define_mode_attr pd2udqsuff): New.
(define_insn
ufix_notruncmodesi2dfmodelower2mask_nameround_name): Ditto.
(define_insn ufix_notruncv2dfv2si2mask_name): Ditto.
(define_insn *avx_cvttpd2dq256_2): Delete.
(define_expand sse2_cvttpd2dq): Ditto.
(define_insn sse2_cvttpd2dqmask_name): Add masking.

--
Thanks, K


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index d70420d..1aec70f 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -30246,7 +30246,7 @@ static const struct builtin_description 
bdesc_round_args[] =
   { OPTION_MASK_ISA_AVX512F, CODE_FOR_floatv16siv16sf2_mask_round, 
__builtin_ia32_cvtdq2ps512_mask, IX86_BUILTIN_CVTDQ2PS512, UNKNOWN, (int) 
V16SF_FTYPE_V16SI_V16SF_HI_INT },
   { OPTION_MASK_ISA_AVX512F, CODE_FOR_avx512f_cvtpd2dq512_mask_round, 
__builtin_ia32_cvtpd2dq512_mask, IX86_BUILTIN_CVTPD2DQ512, UNKNOWN, (int) 
V8SI_FTYPE_V8DF_V8SI_QI_INT },
   { OPTION_MASK_ISA_AVX512F, CODE_FOR_avx512f_cvtpd2ps512_mask_round,  
__builtin_ia32_cvtpd2ps512_mask, IX86_BUILTIN_CVTPD2PS512, UNKNOWN, (int) 
V8SF_FTYPE_V8DF_V8SF_QI_INT },
-  { OPTION_MASK_ISA_AVX512F, CODE_FOR_avx512f_ufix_notruncv8dfv8si_mask_round, 
__builtin_ia32_cvtpd2udq512_mask, IX86_BUILTIN_CVTPD2UDQ512, UNKNOWN, (int) 
V8SI_FTYPE_V8DF_V8SI_QI_INT },
+  { OPTION_MASK_ISA_AVX512F, CODE_FOR_ufix_notruncv8dfv8si2_mask_round, 
__builtin_ia32_cvtpd2udq512_mask, IX86_BUILTIN_CVTPD2UDQ512, UNKNOWN, (int) 
V8SI_FTYPE_V8DF_V8SI_QI_INT },
   { OPTION_MASK_ISA_AVX512F, CODE_FOR_avx512f_vcvtph2ps512_mask_round,  
__builtin_ia32_vcvtph2ps512_mask, IX86_BUILTIN_CVTPH2PS512, UNKNOWN, (int) 
V16SF_FTYPE_V16HI_V16SF_HI_INT },
   { OPTION_MASK_ISA_AVX512F, 
CODE_FOR_avx512f_fix_notruncv16sfv16si_mask_round, 
__builtin_ia32_cvtps2dq512_mask, IX86_BUILTIN_CVTPS2DQ512, UNKNOWN, (int) 
V16SI_FTYPE_V16SF_V16SI_HI_INT },
   { OPTION_MASK_ISA_AVX512F, CODE_FOR_avx512f_cvtps2pd512_mask_round, 
__builtin_ia32_cvtps2pd512_mask, IX86_BUILTIN_CVTPS2PD512, UNKNOWN, (int) 
V8DF_FTYPE_V8SF_V8DF_QI_INT },
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 287fd11..b2e1d4f 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -4463,33 +4463,33 @@
   (const_int 2) (const_int 3)
   (const_int 4) (const_int 5)
   (const_int 6) (const_int 7)]]
-  TARGET_AVX
+  TARGET_AVX512F
   vcvtdq2pd\t{%t1, %0|%0, %t1}
   [(set_attr type ssecvt)
(set_attr prefix evex)
(set_attr mode V8DF)])
 
 (define_insn avx_cvtdq2pd256_2
-  [(set (match_operand:V4DF 0 register_operand =x)
+  [(set (match_operand:V4DF 0 register_operand =v)
(float:V4DF
  (vec_select:V4SI
-   (match_operand:V8SI 1 nonimmediate_operand xm)
+   (match_operand:V8SI 1 nonimmediate_operand vm)
(parallel [(const_int 0) (const_int 1)
   (const_int 2) (const_int 3)]]
   TARGET_AVX
   vcvtdq2pd\t{%x1, %0|%0, %x1}
   [(set_attr type ssecvt)
-   (set_attr prefix vex)
+   (set_attr prefix maybe_evex)
(set_attr mode V4DF)])
 
-(define_insn sse2_cvtdq2pd
-  [(set (match_operand:V2DF 0 register_operand =x)
+(define_insn sse2_cvtdq2pdmask_name
+  [(set (match_operand:V2DF 0 register_operand =v)
(float:V2DF
  (vec_select:V2SI
-   (match_operand:V4SI 1 nonimmediate_operand xm)
+   (match_operand:V4SI 1 nonimmediate_operand vm)
(parallel [(const_int 0) (const_int 1)]]
-  TARGET_SSE2
-  %vcvtdq2pd\t{%1, %0|%0, %q1}
+  TARGET_SSE2  mask_avx512vl_condition
+  %vcvtdq2pd\t{%1, %0mask_operand2|%0mask_operand2, %q1}
   [(set_attr type ssecvt)
(set_attr prefix maybe_vex)
(set_attr ssememalign 64)
@@ -4506,14 +4506,14 @@
(set_attr prefix evex)
(set_attr mode OI)])
 
-(define_insn avx_cvtpd2dq256
-  [(set (match_operand:V4SI 0 register_operand =x)
-   (unspec:V4SI [(match_operand:V4DF 1 nonimmediate_operand xm)]
+(define_insn avx_cvtpd2dq256mask_name
+  [(set (match_operand:V4SI 0 register_operand =v)
+   (unspec:V4SI [(match_operand:V4DF 1 nonimmediate_operand vm)]
 UNSPEC_FIX_NOTRUNC))]
-  TARGET_AVX
-  vcvtpd2dq{y}\t{%1, %0|%0, %1}

[PATCH i386 AVX512] [52/n] Add convert ps2pd and ps2dq.

2014-09-24 Thread Kirill Yukhin
Hello,
The patch at the bottom adds support for ps2dq and ps2pd
conversions.

Bootstrapped.
AVX-512* tests on top of patch-set all pass
under simulator.

Is it ok for trunk?

gcc/
* config/i386/sse.md
(define_c_enum unspec): Add UNSPEC_CVTINT2MASK.
(define_insn

fixsuffixfix_truncmodesselongvecmodelower2mask_nameround_saeonly_name):
New.
(define_insn fixsuffixfix_truncv2sfv2di2mask_name): Ditto.
(define_insn ufix_truncmodesseintvecmodelower2mask_name): Ditto.
(define_insn sse2_cvtss2sdround_saeonly_name): Change
nonimmediate_operand to round_saeonly_nimm_predicate.
(define_insn avx_cvtpd2ps256mask_name): Add masking.
(define_expand sse2_cvtpd2ps_mask): New.
(define_insn *sse2_cvtpd2psmask_name): Add masking.
(define_insn avx512_cvtssemodesuffix2maskmode): New.
(define_insn avx512_cvtmask2ssemodesuffixmode): Ditto.
(define_insn sse2_cvtps2pdmask_name): Add masking.

--
Thanks, K

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index b2e1d4f..c9d6e00 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -132,6 +132,7 @@
   ;; For AVX512BW support
   UNSPEC_PSHUFHW
   UNSPEC_PSHUFLW
+  UNSPEC_CVTINT2MASK
 
   ;; For AVX512DQ support
   UNSPEC_REDUCE
@@ -4659,6 +4660,38 @@
(set_attr prefix evex)
(set_attr mode sseintvecmode2)])
 
+(define_insn 
fixsuffixfix_truncmodesselongvecmodelower2mask_nameround_saeonly_name
+  [(set (match_operand:sselongvecmode 0 register_operand =v)
+   (any_fix:sselongvecmode
+ (match_operand:VF1_128_256VL 1 round_saeonly_nimm_predicate 
round_saeonly_constraint)))]
+  TARGET_AVX512DQ  round_saeonly_modev8sf_condition
+  vcvttps2fixsuffixqq\t{round_saeonly_mask_op2%1, 
%0mask_operand2|%0mask_operand2, %1round_saeonly_mask_op2}
+  [(set_attr type ssecvt)
+   (set_attr prefix evex)
+   (set_attr mode sseintvecmode3)])
+
+(define_insn fixsuffixfix_truncv2sfv2di2mask_name
+  [(set (match_operand:V2DI 0 register_operand =v)
+   (any_fix:V2DI
+ (vec_select:V2SF
+   (match_operand:V4SF 1 nonimmediate_operand vm)
+   (parallel [(const_int 0) (const_int 1)]]
+  TARGET_AVX512DQ  TARGET_AVX512VL
+  vcvttps2fixsuffixqq\t{%1, %0mask_operand2|%0mask_operand2, %1}
+  [(set_attr type ssecvt)
+   (set_attr prefix evex)
+   (set_attr mode TI)])
+
+(define_insn ufix_truncmodesseintvecmodelower2mask_name
+  [(set (match_operand:sseintvecmode 0 register_operand =v)
+   (unsigned_fix:sseintvecmode
+ (match_operand:VF1_128_256VL 1 nonimmediate_operand vm)))]
+  TARGET_AVX512VL
+  vcvttps2udq\t{%1, %0mask_operand2|%0mask_operand2, %1}
+  [(set_attr type ssecvt)
+   (set_attr prefix evex)
+   (set_attr mode sseintvecmode2)])
+
 (define_expand "avx_cvttpd2dq256_2"
   [(set (match_operand:V8SI 0 "register_operand")
        (vec_concat:V8SI
@@ -4713,7 +4746,7 @@
        (vec_merge:V2DF
          (float_extend:V2DF
            (vec_select:V2SF
-             (match_operand:V4SF 2 "nonimmediate_operand" "x,m,<round_saeonly_constraint>")
+             (match_operand:V4SF 2 "<round_saeonly_nimm_predicate>" "x,m,<round_saeonly_constraint>")
              (parallel [(const_int 0) (const_int 1)])))
          (match_operand:V2DF 1 "register_operand" "0,0,v")
          (const_int 1)))]
@@ -4741,14 +4774,14 @@
    (set_attr "prefix" "evex")
    (set_attr "mode" "V8SF")])
 
-(define_insn "avx_cvtpd2ps256"
-  [(set (match_operand:V4SF 0 "register_operand" "=x")
+(define_insn "avx_cvtpd2ps256<mask_name>"
+  [(set (match_operand:V4SF 0 "register_operand" "=v")
        (float_truncate:V4SF
-         (match_operand:V4DF 1 "nonimmediate_operand" "xm")))]
-  "TARGET_AVX"
-  "vcvtpd2ps{y}\t{%1, %0|%0, %1}"
+         (match_operand:V4DF 1 "nonimmediate_operand" "vm")))]
+  "TARGET_AVX && <mask_avx512vl_condition>"
+  "vcvtpd2ps{y}\t{%1, %0<mask_operand2>|%0<mask_operand2>, %1}"
   [(set_attr "type" "ssecvt")
-   (set_attr "prefix" "vex")
+   (set_attr "prefix" "maybe_evex")
    (set_attr "btver2_decode" "vector")
    (set_attr "mode" "V4SF")])
 
@@ -4761,16 +4794,28 @@
   "TARGET_SSE2"
   "operands[2] = CONST0_RTX (V2SFmode);")
 
-(define_insn "*sse2_cvtpd2ps"
-  [(set (match_operand:V4SF 0 "register_operand" "=x")
+(define_expand "sse2_cvtpd2ps_mask"
+  [(set (match_operand:V4SF 0 "register_operand")
+       (vec_merge:V4SF
+         (vec_concat:V4SF
+           (float_truncate:V2SF
+             (match_operand:V2DF 1 "nonimmediate_operand"))
+           (match_dup 4))
+         (match_operand:V4SF 2 "register_operand")
+         (match_operand:QI 3 "register_operand")))]
+  "TARGET_SSE2"
+  "operands[4] = CONST0_RTX (V2SFmode);")
+
+(define_insn "*sse2_cvtpd2ps<mask_name>"
+  [(set (match_operand:V4SF 0 "register_operand" "=v")
        (vec_concat:V4SF
          (float_truncate:V2SF
-           (match_operand:V2DF 1 "nonimmediate_operand" "xm"))
+           (match_operand:V2DF 1 "nonimmediate_operand" "vm"))
          (match_operand:V2SF 2 "const0_operand")))]
-  "TARGET_SSE2"
+  "TARGET_SSE2 && <mask_avx512vl_condition>"
 {
   if (TARGET_AVX)
-return 

[PATCH] Fix asan optimization for aligned accesses. (PR sanitizer/63316)

2014-09-24 Thread Jakub Jelinek
On Tue, Sep 02, 2014 at 07:09:50PM +0400, Marat Zakirov wrote:
 Here's a simple optimization patch for Asan. It stores alignment
 information into ASAN_CHECK which is then extracted by sanopt to reduce
 number of and 0x7 instructions for sufficiently aligned accesses. I
 checked it on linux kernel by comparing results of objdump -d -j .text
 vmlinux | grep and.*0x7, for optimized and regular cases. It eliminates
 12% of and 0x7's.
 
 No regressions. Sanitized GCC was successfully Asan-bootstrapped. No false
 positives were found in kernel.

Unfortunately it broke PR63316.  The problem is that you've just replaced
base_addr & 7 with base_addr in the
(base_addr & 7) + (real_size_in_bytes - 1) >= shadow
computation.  & 7 is of course not useless there, & ~7 would be.
For known sufficiently aligned base_addr, instead we know that
(base_addr & 7) is always 0 and thus can simplify the test
to (real_size_in_bytes - 1) >= shadow
where (real_size_in_bytes - 1) is a constant.
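
To make the arithmetic concrete, here is a minimal standalone C sketch of the three variants of the test. It is not the GCC code: the function names are made up for illustration, and the shadow byte is modelled as a signed 8-bit value (0 means the whole 8-byte granule is addressable, k in 1..7 means only the first k bytes are). The "without mask" variant is my reading of what the broken optimization computed, namely the truncated address itself instead of base_addr & 7.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* General slow-path test: report when the last accessed byte within the
   8-byte granule is at or beyond the number of addressable bytes.  */
static bool
asan_check_general (uintptr_t addr, int size, int8_t shadow)
{
  int8_t last = (int8_t) ((addr & 7) + (size - 1));
  return (shadow != 0) & (last >= shadow);
}

/* Address known to be at least 8-byte aligned: (addr & 7) is 0, so only
   the constant size - 1 is compared against the shadow byte.  */
static bool
asan_check_aligned (int size, int8_t shadow)
{
  return (shadow != 0) & ((int8_t) (size - 1) >= shadow);
}

/* Hypothetical model of the broken form: the mask is dropped, so the low
   byte of the address leaks into the comparison (the narrowing cast is
   implementation-defined for large values, which is fine for a demo).  */
static bool
asan_check_without_mask (uintptr_t addr, int size, int8_t shadow)
{
  int8_t last = (int8_t) (addr + (uintptr_t) (size - 1));
  return (shadow != 0) & (last >= shadow);
}
```

For a 4-byte object at a 16-byte aligned address such as 0x601010, the shadow byte is 4; the first two functions accept a 4-byte access, while the unmasked form compares 0x13 against 4 and reports a false positive.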

Fixed thusly, committed to trunk.

BTW, I've noticed that perhaps using BIT_AND_EXPR for the
(shadow != 0) & ((base_addr & 7) + (real_size_in_bytes - 1) >= shadow)
tests isn't best, maybe we could get better code if we expanded it as
(shadow != 0) && ((base_addr & 7) + (real_size_in_bytes - 1) >= shadow)
(i.e. an extra basic block containing the second half of the test
and fastpath for the shadow == 0 case if it is sufficiently common
(probably it is)).  Will try to code this up unless somebody beats me to
that, but if somebody volunteered to benchmark such a change, it would
be very much appreciated.

2014-09-24  Jakub Jelinek  ja...@redhat.com

PR sanitizer/63316
* asan.c (asan_expand_check_ifn): Fix up align >= 8 optimization.

* c-c++-common/asan/pr63316.c: New test.

--- gcc/asan.c.jj   2014-09-24 08:26:49.0 +0200
+++ gcc/asan.c  2014-09-24 11:00:59.380298362 +0200
@@ -2585,19 +2585,26 @@ asan_expand_check_ifn (gimple_stmt_itera
          gimple shadow_test = build_assign (NE_EXPR, shadow, 0);
          gimple_seq seq = NULL;
          gimple_seq_add_stmt (&seq, shadow_test);
-         /* Aligned (>= 8 bytes) access do not need & 7.  */
+         /* Aligned (>= 8 bytes) can test just
+            (real_size_in_bytes - 1 >= shadow), as base_addr & 7 is known
+            to be 0.  */
          if (align < 8)
-           gimple_seq_add_stmt (&seq, build_assign (BIT_AND_EXPR,
-                                                    base_addr, 7));
-         gimple_seq_add_stmt (&seq, build_type_cast (shadow_type,
-                                                     gimple_seq_last (seq)));
-         if (real_size_in_bytes > 1)
-           gimple_seq_add_stmt (&seq,
-                                build_assign (PLUS_EXPR, gimple_seq_last (seq),
-                                              real_size_in_bytes - 1));
-         gimple_seq_add_stmt (&seq, build_assign (GE_EXPR,
+           {
+             gimple_seq_add_stmt (&seq, build_assign (BIT_AND_EXPR,
+                                                      base_addr, 7));
+             gimple_seq_add_stmt (&seq,
+                                  build_type_cast (shadow_type,
+                                                   gimple_seq_last (seq)));
+             if (real_size_in_bytes > 1)
+               gimple_seq_add_stmt (&seq,
+                                    build_assign (PLUS_EXPR,
                                                   gimple_seq_last (seq),
-                                                  shadow));
+                                                  real_size_in_bytes - 1));
+             t = gimple_assign_lhs (gimple_seq_last_stmt (seq));
+           }
+         else
+           t = build_int_cst (shadow_type, real_size_in_bytes - 1);
+         gimple_seq_add_stmt (&seq, build_assign (GE_EXPR, t, shadow));
          gimple_seq_add_stmt (&seq, build_assign (BIT_AND_EXPR, shadow_test,
                                                   gimple_seq_last (seq)));
          t = gimple_assign_lhs (gimple_seq_last (seq));
--- gcc/testsuite/c-c++-common/asan/pr63316.c.jj	2014-09-24 10:57:21.879454411 +0200
+++ gcc/testsuite/c-c++-common/asan/pr63316.c	2014-09-24 11:04:16.773241665 +0200
@@ -0,0 +1,22 @@
+/* PR sanitizer/63316 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=address -O2" } */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+extern void *malloc (__SIZE_TYPE__);
+extern void free (void *);
+#ifdef __cplusplus
+}
+#endif
+
+int
+main ()
+{
+  int *p = (int *) malloc (sizeof (int));
+  *p = 3;
+  asm volatile ("" : : "r" (p) : "memory");
+  free (p);
+  return 0;
+}


Jakub


Re: libsanitizer merge from upstream r218156

2014-09-24 Thread Jakub Jelinek
On Tue, Sep 23, 2014 at 11:03:55AM -0700, Konstantin Serebryany wrote:
  OT, will you please look at the underaligned asan malloc etc.?  GCC assumes
  that even malloc (1) or malloc (7) is sizeof (void *) aligned on Linux
  (and can and will assume 2 * sizeof (void *) alignment hopefully soon).
 
 What's wrong here?
 I am pretty confident that asan's malloc always returns 16-aligned pointers.

Sorry, that was just my guess, I haven't really analyzed PR63316 before
writing this.  Analyzed it now and fixed.

Jakub


Re: [PATCH] Do not remove labels with LABEL_PRESERVE_P

2014-09-24 Thread Steven Bosscher
On Wed, Sep 24, 2014 at 8:41 AM, Ilya Enkovich wrote:
 2014-09-23 20:06 GMT+04:00 Jeff Law:
 On 09/23/14 10:01, Steven Bosscher wrote:
 Are you sure this patch is necessary, and is not just papering over
 another problem? In the past, all cases I've seen where labels were
 removed inadvertently were caused by incorrect reference counting or
 missing REG_LABEL_* notes.

 Description of LABEL_PRESERVE_P says "label that should always be
 considered to be needed".

It's more specific than that, really:

@item LABEL_PRESERVE_P (@var{x})
In a @code{code_label} or @code{note}, indicates that the label is referenced by
code or data not visible to the RTL of a given function.


The "not visible" part is important. If there are visible references
to a label, then they should never be removed (obviously) and that
should work through LABEL_NUSES. Unfortunately we are not very good at
keeping LABEL_NUSES up-to-date (this is why all the
rebuild_jump_labels() are still required).

What appears to be the case here, is that you have a label between two
basic blocks B1 and B2, and the label acts as a control flow barrier:
B1 and B2 cannot be merged. Then this should be expressed in the CFG.
Otherwise: What else prevents the merge_blocks CFG hooks from deleting
the label?



 That means even if we do not have any usages
 we shouldn't remove it.

Sorry, no.
Even a LABEL_PRESERVE_P label can be deleted: It will be replaced by a
NOTE_INSN_DELETED_LABEL. See cfgrtl.c:delete_insn().

If you really want to prevent a label from being deleted, then
LABEL_PRESERVE_P is not a sufficient condition.


  Why can't we add some additional usages
 later?

If you add the usages later, then you're lying to the compiler ;-)



 Did the label use count drop to zero? Is there a REG_LABEL_TARGET note
 for the label operand?

 In the current code of ix86_expand_prologue I don't see any notes
 generation for set_rip_rex64 instruction which actually uses label.
 But IMO this is another potential issue and we still shouldn't remove
 labels with LABEL_PRESERVE_P.

Notes are generated in jump.c:rebuild_jump_labels. They are
automatically added when a label is not

Ciao!
Steven


Re: [patch] Implement move semantics for iostreams

2014-09-24 Thread Jonathan Wakely

On 22/09/14 14:35 +0100, Jonathan Wakely wrote:

This adds move and swap functions to the iostream classes.


This fixes a silly typo.

Tested x86_64-linux, committed to trunk.

commit acaef9854dff5f37d86b80fc8236df5fd90b0ca5
Author: Jonathan Wakely jwak...@redhat.com
Date:   Wed Sep 24 10:10:28 2014 +0100

	PR libstdc++/63353
	* src/c++11/ios.cc (ios_base::_M_swap): Fix typo.

diff --git a/libstdc++-v3/src/c++11/ios.cc b/libstdc++-v3/src/c++11/ios.cc
index b5124ec..0e136d4 100644
--- a/libstdc++-v3/src/c++11/ios.cc
+++ b/libstdc++-v3/src/c++11/ios.cc
@@ -229,7 +229,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  std::swap(_M_local_word, __rhs._M_local_word); // array swap
 else
  {
-   if (!__lhs_local && !__lhs_local)
+   if (!__lhs_local && !__rhs_local)
 	 std::swap(_M_word, __rhs._M_word);
else
 	 {


Re: [PATCH 1/14][AArch64] Temporarily remove aarch64_gimple_fold_builtin code for reduction operations

2014-09-24 Thread Marcus Shawcroft
On 18 September 2014 12:45, Alan Lawrence alan.lawre...@arm.com wrote:
 The gimple folding ties the AArch64 backend to the tree representation of
 the midend via the neon intrinsics. This code enables constant folding of
 Neon intrinsics reduction ops, so improves performance, but is not necessary
 for correctness. By temporarily removing it (here), we can then change the
 midend representation independently of the AArch64 backend + intrinsics.

 However, I'm leaving the code in place, as a later patch will bring it all
 back in a very similar form (but enabled for bigendian).

 Bootstrapped on aarch64-none-linux; tested aarch64.exp on aarch64-none-elf
 and aarch64_be-none-elf. (The removed code was already disabled for
 bigendian; and this is solely a __builtin-folding mechanism, i.e. used only
 for Neon/ACLE intrinsics.)

 gcc/ChangeLog:
 * config/aarch64/aarch64.c (TARGET_GIMPLE_FOLD_BUILTIN): Comment
 out.
 * config/aarch64/aarch64-builtins.c (aarch64_gimple_fold_builtin):
 Remove using preprocessor directives.

OK /Marcus


Re: [PATCH 4/14][AArch64] Use new reduc_plus_scal optabs, inc. for __builtins

2014-09-24 Thread Marcus Shawcroft
On 18 September 2014 12:59, Alan Lawrence alan.lawre...@arm.com wrote:
 This migrates AArch64 over to the new optab for 'plus' reductions, i.e. so
 the define_expands produce scalars by generating a MOV to a GPR.
 Effectively, this moves the vget_lane inside every arm_neon.h intrinsic,
 into the inside of the define_expand.

 Tested: aarch64.exp vect.exp on aarch64-none-elf and aarch64_be-none-elf
 (full check-gcc on next patch for reduc_min/max)


+(define_expand "reduc_splus_<mode>"
+

Can't we just drop the define_expands for the old optabs altogether?

/Marcus


Re: [PATCH 5/14][AArch64] Use new reduc_[us](min|max)_scal optabs, inc. for builtins

2014-09-24 Thread Marcus Shawcroft
On 18 September 2014 13:02, Alan Lawrence alan.lawre...@arm.com wrote:
 Similarly to the previous patch (r/2205), this migrates AArch64 to the new
 reduce-to-scalar optabs for min and max. For consistency we apply the same
 treatment to the smax_nan and smin_nan patterns (used for __builtins), even
 though reduc_smin_nan_scal (etc.) is not a standard name.

 Tested: check-gcc on aarch64-none-elf and aarch64_be-none-elf.

 gcc/ChangeLog:

 * config/aarch64/aarch64-simd-builtins.def (reduc_smax_,
 reduc_smin_,
 reduc_umax_, reduc_umin_, reduc_smax_nan_, reduc_smin_nan_): Remove.
 (reduc_smax_scal_, reduc_smin_scal_, reduc_umax_scal_,
 reduc_umin_scal_, reduc_smax_nan_scal_, reduc_smin_nan_scal_): New.

 * config/aarch64/aarch64-simd.md
 (reduc_<maxmin_uns>_<mode>): Rename VDQV_S variant to...
 (reduc_<maxmin_uns>_internal<mode>): ...this.
 (reduc_<maxmin_uns>_<mode>): New (VDQ_BHSI).
 (reduc_<maxmin_uns>_scal_<mode>): New (*2).

 (reduc_<maxmin_uns>_v2si): Combine with below, renaming...
 (reduc_<maxmin_uns>_<mode>): Combine V2F with above, renaming...
 (reduc_<maxmin_uns>_internal_<mode>): ...to this (VDQF).

 * config/aarch64/arm_neon.h (vmaxv_f32, vmaxv_s8, vmaxv_s16,
 vmaxv_s32, vmaxv_u8, vmaxv_u16, vmaxv_u32, vmaxvq_f32, vmaxvq_f64,
 vmaxvq_s8, vmaxvq_s16, vmaxvq_s32, vmaxvq_u8, vmaxvq_u16,
 vmaxvq_u32,
 vmaxnmv_f32, vmaxnmvq_f32, vmaxnmvq_f64, vminv_f32, vminv_s8,
 vminv_s16, vminv_s32, vminv_u8, vminv_u16, vminv_u32, vminvq_f32,
 vminvq_f64, vminvq_s8, vminvq_s16, vminvq_s32, vminvq_u8,
 vminvq_u16,
 vminvq_u32, vminnmv_f32, vminnmvq_f32, vminnmvq_f64): Update to use
 __builtin_aarch64_reduc_..._scal; remove vget_lane wrapper.

If we don't need the old optabs, I think it would be better to drop those
define_expands, otherwise OK.
/Marcus


Re: [PATCH 6/14][AArch64] Restore gimple_folding of reduction intrinsics

2014-09-24 Thread Marcus Shawcroft
On 18 September 2014 13:05, Alan Lawrence alan.lawre...@arm.com wrote:
 This gives us back the constant-folding of the neon-intrinsics that was
 removed in the first patch, but is now OK for bigendian too.

 bootstrapped on aarch64-none-linux-gnu.
 check-gcc on aarch64-none-elf and aarch64_be-none-elf.

 gcc/ChangeLog:

 * config/aarch64/aarch64.c (TARGET_GIMPLE_FOLD_BUILTIN): Define
 again.
 * config/aarch64/aarch64-builtins.c (aarch64_gimple_fold_builtin):
 Restore, enable for bigendian, update to use __builtin..._scal...

OK /Marcus


Re: [PATCH] Do not remove labels with LABEL_PRESERVE_P

2014-09-24 Thread Ilya Enkovich
2014-09-24 13:30 GMT+04:00 Steven Bosscher stevenb@gmail.com:
 On Wed, Sep 24, 2014 at 8:41 AM, Ilya Enkovich wrote:
 2014-09-23 20:06 GMT+04:00 Jeff Law:
 On 09/23/14 10:01, Steven Bosscher wrote:
 Are you sure this patch is necessary, and is not just papering over
 another problem? In the past, all cases I've seen where labels were
 removed inadvertently were caused by incorrect reference counting or
 missing REG_LABEL_* notes.

 Description of LABEL_PRESERVE_P says "label that should always be
 considered to be needed".

 It's more specific than that, really:

 @item LABEL_PRESERVE_P (@var{x})
 In a @code{code_label} or @code{note}, indicates that the label is referenced 
 by
 code or data not visible to the RTL of a given function.

I read another description:
/* 1 if RTX is a code_label that should always be considered to be needed.  */
#define LABEL_PRESERVE_P(RTX)   \
  (RTL_FLAG_CHECK2 ("LABEL_PRESERVE_P", (RTX), CODE_LABEL, NOTE)->in_struct)



 The "not visible" part is important. If there are visible references
 to a label, then they should never be removed (obviously) and that
 should work through LABEL_NUSES. Unfortunately we are not very good at
 keeping LABEL_NUSES up-to-date (this is why all the
 rebuild_jump_labels() are still required).

Does rebuild handle all kinds of instructions including those which use UNSPEC?


 What appears to be the case here, is that you have a label between two
 basic blocks B1 and B2, and the label acts as a control flow barrier:
 B1 and B2 cannot be merged. Then this should be expressed in the CFG.
 Otherwise: What else prevents the merge_blocks CFG hooks from deleting
 the label?

Label acts as a barrier here but it is a side effect.  I don't care
about block merging.  I just don't want label with usages to be
removed.




 That means even if we do not have any usages
 we shouldn't remove it.

 Sorry, no.
 Even a LABEL_PRESERVE_P label can be deleted: It will be replaced by a
 NOTE_INSN_DELETED_LABEL. See cfgrtl.c:delete_insn().

According to description you quoted label marked by LABEL_PRESERVE_P
is used by some code or data.  Let this use be not visible to the RTL
of a given function.  It is still used, right? How can you remove it?

Ilya


 If you really want to prevent a label from being deleted, then
 LABEL_PRESERVE_P is not a sufficient condition.


  Why can't we add some additional usages
 later?

 If you add the usages later, then you're lying to the compiler ;-)



 Did the label use count drop to zero? Is there a REG_LABEL_TARGET note
 for the label operand?

 In the current code of ix86_expand_prologue I don't see any notes
 generation for set_rip_rex64 instruction which actually uses label.
 But IMO this is another potential issue and we still shouldn't remove
 labels with LABEL_PRESERVE_P.

 Notes are generated in jump.c:rebuild_jump_labels. They are
 automatically added when a label is not

 Ciao!
 Steven


Re: [patch] Implement move semantics for iostreams

2014-09-24 Thread Jakub Jelinek
On Wed, Sep 24, 2014 at 10:40:09AM +0100, Jonathan Wakely wrote:
 On 22/09/14 14:35 +0100, Jonathan Wakely wrote:
 This adds move and swap functions to the iostream classes.
 
 This fixes a silly typo.
 
 Tested x86_64-linux, committed to trunk.
 

 commit acaef9854dff5f37d86b80fc8236df5fd90b0ca5
 Author: Jonathan Wakely jwak...@redhat.com
 Date:   Wed Sep 24 10:10:28 2014 +0100
 
   PR libstdc++/63353
   * src/c++11/ios.cc (ios_base::_M_swap): Fix typo.
 
 diff --git a/libstdc++-v3/src/c++11/ios.cc b/libstdc++-v3/src/c++11/ios.cc
 index b5124ec..0e136d4 100644
 --- a/libstdc++-v3/src/c++11/ios.cc
 +++ b/libstdc++-v3/src/c++11/ios.cc
 @@ -229,7 +229,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   std::swap(_M_local_word, __rhs._M_local_word); // array swap
  else
   {
 -   if (!__lhs_local && !__lhs_local)
 +   if (!__lhs_local && !__rhs_local)
std::swap(_M_word, __rhs._M_word);
 else
{

Wouldn't this be something for a (non-Wall?) warning?
I mean if && or || contains the same conditions, perhaps we should
warn.
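
For reference, a reduced C sketch of the typo class under discussion. The function names are hypothetical stand-ins, not libstdc++ code; they only model the decision that the heap-allocated word arrays may be swapped by pointer when neither side uses its local buffer.

```c
#include <assert.h>
#include <stdbool.h>

/* Typo'd form: both operands of && are the same expression, so the
   right-hand state is never consulted.  */
static bool
can_swap_pointers_typo (bool lhs_local, bool rhs_local)
{
  (void) rhs_local;   /* silently ignored, which is the bug */
  return !lhs_local && !lhs_local;
}

/* Intended form: pointer swap only if neither side is local.  */
static bool
can_swap_pointers_fixed (bool lhs_local, bool rhs_local)
{
  return !lhs_local && !rhs_local;
}
```

The two functions disagree exactly when the left side is heap-allocated and the right side is local; that is the case a duplicated-operand warning would have flagged at compile time.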

Jakub


Re: Fix libgomp crash without TLS (PR42616)

2014-09-24 Thread Varvara Rainchik
*Ping*

2014-09-19 15:41 GMT+04:00 Varvara Rainchik varvara.s.rainc...@gmail.com:
 I've corrected my patch according to what you said. To differentiate
 second case in destructor I've added pthread_setspecific
 (gomp_tls_key, NULL) at the end of gomp_thread_start. So, destructor
 can simply skip the case when pthread_getspecific (gomp_tls_key)
 returns 0. I also think that it's better to set 0 in gomp_thread_start
 explicitly as thread data is initialized by a local variable in this
 function.

 But, I see that pthread_getspecific always returns 0 in the destructor
 because data pointer is implicitly set to 0 before destructor call in
 glibc:

 (pthread_create.c):

 /* Always clear the data. */
 level2[inner].data = NULL;

 /* Make sure the data corresponds to a valid
 key. This test fails if the key was
 deallocated and also if it was
 re-allocated. It is the user's
 responsibility to free the memory in this
 case. */
 if (level2[inner].seq
== __pthread_keys[idx].seq
/* It is not necessary to register a destructor
   function. */
    && __pthread_keys[idx].destr != NULL)
 /* Call the user-provided destructor. */
 __pthread_keys[idx].destr (data);

 I suppose it's not necessary if everything is cleaned up in
 gomp_thread_start  and destructor. What do you think?
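
The behaviour described above can be checked with a small standalone C program. This is a sketch independent of libgomp (all names are made up for the demo): POSIX requires the key's value to be reset to NULL before the destructor is called, and the previous value is handed to the destructor as its argument, so a destructor should work from `arg` rather than call pthread_getspecific.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

static pthread_key_t demo_key;
static int getspecific_was_null;   /* set inside the destructor */
static int arg_was_nonnull;        /* set inside the destructor */

static void
demo_destructor (void *arg)
{
  /* The slot has already been cleared by the time we get here.  */
  getspecific_was_null = (pthread_getspecific (demo_key) == NULL);
  arg_was_nonnull = (arg != NULL);
  free (arg);
}

static void *
demo_thread (void *unused)
{
  (void) unused;
  int *data = malloc (sizeof *data);
  if (data != NULL)
    {
      *data = 42;
      pthread_setspecific (demo_key, data);
    }
  return NULL;
}

/* Runs one thread to completion; returns 0 on success.  */
static int
run_demo (void)
{
  pthread_t tid;
  if (pthread_key_create (&demo_key, demo_destructor) != 0)
    return -1;
  if (pthread_create (&tid, NULL, demo_thread, NULL) != 0)
    return -1;
  pthread_join (tid, NULL);
  pthread_key_delete (demo_key);
  return 0;
}
```

After run_demo () returns, both flags should be set. Build with -pthread on systems where the thread functions are not part of libc.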


 Changes are bootstrapped and regtested on x86_64-linux.

 2014-09-19  Varvara Rainchik  varvara.rainc...@intel.com

 * libgomp.h (gomp_thread): For non TLS case create thread data.
 * team.c (non_tls_thread_data_destructor,
 create_non_tls_thread_data): New functions.


 ---
 diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
 index bcd5b34..2f33d99 100644
 --- a/libgomp/libgomp.h
 +++ b/libgomp/libgomp.h
 @@ -467,9 +467,15 @@ static inline struct gomp_thread *gomp_thread (void)
  }
  #else
  extern pthread_key_t gomp_tls_key;
 -static inline struct gomp_thread *gomp_thread (void)
 +extern struct gomp_thread *create_non_tls_thread_data (void);
 +static struct gomp_thread *gomp_thread (void)
  {
 -  return pthread_getspecific (gomp_tls_key);
 +  struct gomp_thread *thr = pthread_getspecific (gomp_tls_key);
 +  if (thr == NULL)
 +  {
 +thr = create_non_tls_thread_data ();
 +  }
 +  return thr;
  }
  #endif

 diff --git a/libgomp/team.c b/libgomp/team.c
 index e6a6d8f..a692df8 100644
 --- a/libgomp/team.c
 +++ b/libgomp/team.c
 @@ -41,6 +41,7 @@ pthread_key_t gomp_thread_destructor;
  __thread struct gomp_thread gomp_tls_data;
  #else
  pthread_key_t gomp_tls_key;
 +struct gomp_thread initial_thread_tls_data;
  #endif


 @@ -130,6 +131,7 @@ gomp_thread_start (void *xdata)
    gomp_sem_destroy (&thr->release);
    thr->thread_pool = NULL;
    thr->task = NULL;
 +  pthread_setspecific (gomp_tls_key, NULL);
return NULL;
  }

 @@ -222,8 +224,16 @@ gomp_free_pool_helper (void *thread_pool)
  void
  gomp_free_thread (void *arg __attribute__((unused)))
  {
 -  struct gomp_thread *thr = gomp_thread ();
 -  struct gomp_thread_pool *pool = thr->thread_pool;
 +  struct gomp_thread *thr;
 +  struct gomp_thread_pool *pool;
 +#ifdef HAVE_TLS
 +  thr = gomp_thread ();
 +#else
 +  thr = pthread_getspecific (gomp_tls_key);
 +  if (thr == NULL)
 +return;
 +#endif
 +  pool = thr->thread_pool;
    if (pool)
      {
        if (pool->threads_used > 0)
 @@ -910,6 +920,21 @@ gomp_team_end (void)
  }
  }

 +/* Destructor for data created in create_non_tls_thread_data.  */
 +
 +#ifndef HAVE_TLS
 +void
 +non_tls_thread_data_destructor (void *arg __attribute__((unused)))
 +{
 +  struct gomp_thread *thr = pthread_getspecific (gomp_tls_key);
 +  if (thr != NULL && thr != &initial_thread_tls_data)
 +  {
 +gomp_free_thread (arg);
 +free (thr);
 +pthread_setspecific (gomp_tls_key, NULL);
 +  }
 +}
 +#endif

  /* Constructors for this file.  */

 @@ -917,9 +942,7 @@ static void __attribute__((constructor))
  initialize_team (void)
  {
  #ifndef HAVE_TLS
 -  static struct gomp_thread initial_thread_tls_data;
 -
 -  pthread_key_create (&gomp_tls_key, NULL);
 +  pthread_key_create (&gomp_tls_key, non_tls_thread_data_destructor);
    pthread_setspecific (gomp_tls_key, &initial_thread_tls_data);
  #endif

 @@ -927,6 +950,19 @@ initialize_team (void)
      gomp_fatal ("could not create thread pool destructor.");
  }

 +/* Create data for thread created by pthread_create.  */
 +
 +#ifndef HAVE_TLS
 +struct gomp_thread *create_non_tls_thread_data (void)
 +{
 +  struct gomp_thread *thr = gomp_malloc_cleared (sizeof (struct gomp_thread));
 +  pthread_setspecific (gomp_tls_key, thr);
 +  gomp_sem_init (&thr->release, 0);
 +
 +  return thr;
 +}
 +#endif
 +
  static void __attribute__((destructor))
  team_destructor (void)
 {

 2014-09-02 14:36 GMT+04:00 Varvara Rainchik varvara.s.rainc...@gmail.com:
 May I use gomp_free_thread as a destructor for pthread_key_create?
 Then I'll make initial_thread_tls_data global for the first case, but
 how can I differentiate thread created by gomp_thread_start (second
 case)?

 2014-09-01 14:51 GMT+04:00 Jakub Jelinek 

[PATCH][match-and-simplify][2/2] Delay for lowering

2014-09-24 Thread Richard Biener

This delays for lowering by recording fors to apply in
simplify similar to how we record ifs.

Bootstrapped on x86_64-unknown-linux-gnu.

Richard.

2014-09-24  Richard Biener  rguent...@suse.de

* genmatch.c (id_base): Derive from typed_noop_remove.
(struct user_id): New id_base derivative.
(struct simplify): Add vector of fors.
(lower_commutative): Adjust.
(lower_opt_convert): Likewise.
(replace_id): Work with user_id / id_base pairs.
(lower_for): New function, split out from ...
(parse_for): ... here.  Maintain a stack of active fors,
record substitutes in user_id.
(everywhere): Adjust for simplify constructor change and
maintaining of the stack of active fors.
* match-bitwise.pd: Enable truth_valued_p for comparison
codes using for.

Index: gcc/genmatch.c
===
--- gcc/genmatch.c  (revision 215546)
+++ gcc/genmatch.c  (working copy)
@@ -153,7 +153,7 @@ END_BUILTINS
 /* Hashtable of known pattern operators.  This is pre-seeded from
all known tree codes and all known builtin function ids.  */
 
-struct id_base : typed_free_remove<id_base>
+struct id_base : typed_noop_remove<id_base>
 {
   enum id_kind { CODE, FN, PREDICATE, USER_DEFINED } kind;
 
@@ -221,6 +221,14 @@ struct predicate_id : public id_base
   int nargs;
 };
 
+struct user_id : public id_base
+{
+  user_id (const char *id_)
+: id_base (id_base::USER_DEFINED, id_), substitutes (vNULL), nargs(-1) {}
+  vec<id_base *> substitutes;
+  int nargs;
+};
+
 template<>
 template<>
 inline bool
@@ -439,16 +447,17 @@ struct if_or_with {
 
 struct simplify {
   simplify (operand *match_, source_location match_location_,
-   struct operand *result_, source_location result_location_, vec<if_or_with> ifexpr_vec_ = vNULL)
+   struct operand *result_, source_location result_location_, vec<if_or_with> ifexpr_vec_, vec<vec<user_id *> > for_vec_)
   : match (match_), match_location (match_location_),
   result (result_), result_location (result_location_),
-  ifexpr_vec (ifexpr_vec_) {}
+  ifexpr_vec (ifexpr_vec_), for_vec (for_vec_) {}
 
   operand *match; 
   source_location match_location;
   struct operand *result;
   source_location result_location;
   vec<if_or_with> ifexpr_vec;
+  vec<vec<user_id *> > for_vec;
 };
 
 struct dt_node
@@ -686,7 +695,8 @@ lower_commutative (simplify *s, vec<simp
   for (unsigned i = 0; i < matchers.length (); ++i)
     {
       simplify *ns = new simplify (matchers[i], s->match_location,
-                                  s->result, s->result_location, s->ifexpr_vec);
+                                  s->result, s->result_location, s->ifexpr_vec,
+                                  s->for_vec);
       simplifiers.safe_push (ns);
     }
 }
@@ -814,7 +824,8 @@ lower_opt_convert (simplify *s, vec<simp
   for (unsigned i = 0; i < matchers.length (); ++i)
     {
       simplify *ns = new simplify (matchers[i], s->match_location,
-                                  s->result, s->result_location, s->ifexpr_vec);
+                                  s->result, s->result_location, s->ifexpr_vec,
+                                  s->for_vec);
       simplifiers.safe_push (ns);
     }
 }
@@ -837,48 +848,105 @@ check_operator (id_base *op, unsigned n_
   else
     fatal ("%s expects %u operands, got %u operands", opr->id, opr->get_required_nargs (), n_ops);
 }
-
+
+/* In AST operand O replace operator ID with operator WITH.  */
+
 operand *
-replace_id (operand *o, const char *user_id, const char *oper)
+replace_id (operand *o, user_id *id, id_base *with)
 {
-  if (o->type == operand::OP_CAPTURE)
+  if (capture *c = dyn_cast<capture *> (o))
     {
-      capture *c = static_cast<capture *> (o);
       if (!c->what)
        return c;
-      capture *nc = new capture (c->where, replace_id (c->what, user_id, oper));
-      return nc;
+      return new capture (c->where, replace_id (c->what, id, with));
     }
 
+  /* For c_expr we simply record a string replacement table which is
+     applied at code-generation time.  */
   if (c_expr *ce = dyn_cast<c_expr *> (o))
     {
-      id_base *idb = get_operator (oper);
       vec<c_expr::id_tab> ids = ce->ids.copy ();
-      ids.safe_push (c_expr::id_tab (user_id, idb->id));
+      ids.safe_push (c_expr::id_tab (id->id, with->id));
       return new c_expr (ce->r, ce->code, ce->nr_stmts, ids);
     }
 
-  if (o->type != operand::OP_EXPR)
+  expr *e = dyn_cast<expr *> (o);
+  if (!e)
     return o;
 
-  expr *e = static_cast<expr *> (o);
   expr *ne;
-
-  if (e->operation->kind == id_base::USER_DEFINED
-      && strcmp (e->operation->id, user_id) == 0)
+  if (e->operation == id)
     {
-      ne = new expr (get_operator (oper), e->is_commutative);
+      ne = new expr (with, e->is_commutative);
       check_operator (ne->operation, e->ops.length ());
     }
   else
     ne = new expr (e->operation, e->is_commutative);
 
   for (unsigned i = 0; i < e->ops.length 

Re: [patch] Implement move semantics for iostreams

2014-09-24 Thread Marek Polacek
On Wed, Sep 24, 2014 at 12:01:13PM +0200, Jakub Jelinek wrote:
  -   if (!__lhs_local && !__lhs_local)
  +   if (!__lhs_local && !__rhs_local)
   std::swap(_M_word, __rhs._M_word);
  else
   {
 
 Wouldn't this be something for a (non-Wall?) warning?
 I mean if && or || contains the same conditions, perhaps we should
 warn.

Yeah, I think it'd make sense to warn.  I don't think we have an
option for this (-Wlogical-op does something little bit different).
Hence:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63357

Marek


Re: [PATCH] Do not remove labels with LABEL_PRESERVE_P

2014-09-24 Thread Steven Bosscher
On Wed, Sep 24, 2014 at 11:57 AM, Ilya Enkovich wrote:
 2014-09-24 13:30 GMT+04:00 Steven Bosscher :
 Description of LABEL_PRESERVE_P says "label that should always be
 considered to be needed".

 It's more specific than that, really:

 @item LABEL_PRESERVE_P (@var{x})
 In a @code{code_label} or @code{note}, indicates that the label is 
 referenced by
 code or data not visible to the RTL of a given function.

 I read another description:
 /* 1 if RTX is a code_label that should always be considered to be needed.  */
 #define LABEL_PRESERVE_P(RTX)   \
   (RTL_FLAG_CHECK2 ("LABEL_PRESERVE_P", (RTX), CODE_LABEL, NOTE)->in_struct)

Yes, from rtl.h.

I'd recommend to always read the descriptions in doc/ (in this case
doc/rtl.texi). The documentation in the header files is often not
very comprehensive.


 The "not visible" part is important. If there are visible references
 to a label, then they should never be removed (obviously) and that
 should work through LABEL_NUSES. Unfortunately we are not very good at
 keeping LABEL_NUSES up-to-date (this is why all the
 rebuild_jump_labels() are still required).

 Does rebuild handle all kinds of instructions including those which use 
 UNSPEC?

Yes. Patterns are walked (deep) and REG_LABEL notes are added for all
labels encountered that are not already the JUMP_LABEL of INSN. If the
label is reachable from XEXP(UNSPEC, 0) -- the 'E' operand -- then
that label is visible.


 What appears to be the case here, is that you have a label between two
 basic blocks B1 and B2, and the label acts as a control flow barrier:
 B1 and B2 cannot be merged. Then this should be expressed in the CFG.
 Otherwise: What else prevents the merge_blocks CFG hooks from deleting
 the label?

 Label acts as a barrier here but it is a side effect.  I don't care
 about block merging.  I just don't want label with usages to be
 removed.

Understood. Only, LABEL_PRESERVE_P is not the right means to achieve that.

So let's get back to basics and see what the usages look like. AFAIU
now, you emit the code label early, and add the references much later
(in machine reorg?). Does your UNSPEC have the code_label as an
operand? If so, what breaks if cfgcleanup removes the label? Is the
insn no longer recognized? Or does the label not end up in the
assembly output? Or ...? I can try to help figure out what breaks if
you have a test case.

FWIW, the LABEL_PRESERVE_P uses in config/i386/i386.c look suspect. It
probably only works because those labels are added late, and the code
paths that use (x86_64 large PIC code model) are not tested all that
well...


 That means even if we do not have any usages
 we shouldn't remove it.

 Sorry, no.
 Even a LABEL_PRESERVE_P label can be deleted: It will be replaced by a
 NOTE_INSN_DELETED_LABEL. See cfgrtl.c:delete_insn().

 According to description you quoted label marked by LABEL_PRESERVE_P
 is used by some code or data.  Let this use be not visible to the RTL
 of a given function.  It is still used, right? How can you remove it?

The code_label rtx is removed, but the label itself is still output to
the object file. The label number is retained in the CODE_LABEL_NUMBER
of the NOTE_INSN_DELETED_LABEL. Look for how NOTE_INSN_DELETED_LABEL
is handled in final.c. It's a hack IMHO, but that's how it has been
since day 0 (see https://gcc.gnu.org/r104).

Ciao!
Steven


[PATCH] Provide global var location info for asan

2014-09-24 Thread Jakub Jelinek
On Tue, Sep 23, 2014 at 11:03:55AM -0700, Konstantin Serebryany wrote:
  (asan_add_global): Ditto.
 
  I'll handle creation of location aggregates as follow-up.

Here it is, only lightly tested so far:

int a = 1;
int b = 2;
int c = 3;

int *
foo (int x)
{
  return x ? &b : &c;
}

int
main ()
{
  char *p = (char *) foo (1);
  int x = p[sizeof (int)];
  asm ("" : : "r" (x));
  return 0;
}

used to print:
0x00601104 is located 60 bytes to the left of global variable 'a' defined in 'aa.c' (0x601140) of size 4
0x00601104 is located 0 bytes to the right of global variable 'b' defined in 'aa.c' (0x601100) of size 4
but now does:
0x00601104 is located 60 bytes to the left of global variable 'a' defined in 'aa.c:1:5' (0x601140) of size 4
0x00601104 is located 0 bytes to the right of global variable 'b' defined in 'aa.c:2:5' (0x601100) of size 4
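
As a sanity check on the distances in that report, here is a small C sketch of the arithmetic. The helper names are made up and this is not the libsanitizer code; the addresses and sizes are the ones printed above.

```c
#include <assert.h>
#include <stdint.h>

/* Distance from ADDR up to the first byte of a global starting at START
   (ADDR lies below the global).  */
static uintptr_t
bytes_to_left (uintptr_t addr, uintptr_t start)
{
  return start - addr;
}

/* Distance from one past the end of a global (START + SIZE) up to ADDR
   (ADDR lies at or above the end of the global).  */
static uintptr_t
bytes_to_right (uintptr_t addr, uintptr_t start, uintptr_t size)
{
  return addr - (start + size);
}
```

With the values from the report, 0x601140 - 0x601104 is 0x3c, i.e. 60 bytes to the left of 'a', and 0x601104 is exactly one past the 4-byte 'b' at 0x601100, i.e. 0 bytes to its right, which matches p[sizeof (int)] with p pointing at b.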

I think this test is too fragile for the testsuite though, the order of the
vars in the data section can be arbitrary etc.
make -j16 -k check-gcc check-g++ check-gfortran RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} asan.exp'
passed.

For Marek: the patch also uses just one __ubsan_source_location
RECORD_TYPE everywhere, I've been really surprised we created a new type
node each time we needed it.
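
The caching idea can be sketched roughly as follows. This is only an
illustration under simplifying assumptions: struct type_node,
nodes_built and get_source_location_type are hypothetical stand-ins,
not GCC's tree machinery.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a GCC type node; not the real `tree'.  */
struct type_node
{
  const char *name;
};

/* Counts how many nodes were built, to make the effect visible.  */
static int nodes_built;

/* Cached get_source_location_type () return value.  */
static struct type_node *source_location_type;

/* Return the one shared node, building it on the first call only.  */
struct type_node *
get_source_location_type (void)
{
  if (source_location_type)
    return source_location_type;

  struct type_node *ret = malloc (sizeof *ret);
  assert (ret != NULL);
  ret->name = "__ubsan_source_location";
  nodes_built++;

  source_location_type = ret;
  return ret;
}
```

Because every caller now gets the same node, a plain pointer comparison
is enough to recognize the type, which is what the asan_protect_global
hunk relies on.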

Ok for trunk?

2014-09-24  Jakub Jelinek  ja...@redhat.com

* ubsan.h (ubsan_get_source_location_type): New prototype.
* ubsan.c (ubsan_source_location_type): New variable.
Function renamed to ...
(ubsan_get_source_location_type): ... this.  Cache
return value in ubsan_source_location_type variable.
(ubsan_source_location, ubsan_create_data): Use
ubsan_get_source_location_type instead of
ubsan_source_location_type.
* asan.c (asan_protect_global): Don't protect globals
with ubsan_get_source_location_type () type.
(asan_add_global): Provide global decl location info
if possible.

--- gcc/ubsan.h.jj  2014-09-24 08:26:49.635418299 +0200
+++ gcc/ubsan.h 2014-09-24 11:35:05.231330166 +0200
@@ -47,6 +47,6 @@ extern tree ubsan_encode_value (tree, bo
 extern bool is_ubsan_builtin_p (tree);
 extern tree ubsan_build_overflow_builtin (tree_code, location_t, tree, tree, 
tree);
 extern tree ubsan_instrument_float_cast (location_t, tree, tree);
+extern tree ubsan_get_source_location_type (void);
 
 #endif  /* GCC_UBSAN_H  */
-
--- gcc/ubsan.c.jj  2014-09-24 08:26:49.639418278 +0200
+++ gcc/ubsan.c 2014-09-24 11:35:56.662054997 +0200
@@ -197,6 +197,9 @@ ubsan_type_descriptor_type (void)
   return ret;
 }
 
+/* Cached ubsan_get_source_location_type () return value.  */
+static GTY(()) tree ubsan_source_location_type;
+
 /* Build
struct __ubsan_source_location
{
@@ -206,12 +209,15 @@ ubsan_type_descriptor_type (void)
}
type.  */
 
-static tree
-ubsan_source_location_type (void)
+tree
+ubsan_get_source_location_type (void)
 {
   static const char *field_names[3]
 = { "__filename", "__line", "__column" };
   tree fields[3], ret;
+  if (ubsan_source_location_type)
+return ubsan_source_location_type;
+
   tree const_char_type = build_qualified_type (char_type_node,
   TYPE_QUAL_CONST);
 
@@ -229,6 +235,7 @@ ubsan_source_location_type (void)
   TYPE_FIELDS (ret) = fields[0];
   TYPE_NAME (ret) = get_identifier ("__ubsan_source_location");
   layout_type (ret);
+  ubsan_source_location_type = ret;
   return ret;
 }
 
@@ -239,7 +246,7 @@ static tree
 ubsan_source_location (location_t loc)
 {
   expanded_location xloc;
-  tree type = ubsan_source_location_type ();
+  tree type = ubsan_get_source_location_type ();
 
   xloc = expand_location (loc);
   tree str;
@@ -484,7 +491,7 @@ ubsan_create_data (const char *name, int
 {
   gcc_checking_assert (i < 2);
   fields[i] = build_decl (UNKNOWN_LOCATION, FIELD_DECL, NULL_TREE,
- ubsan_source_location_type ());
+ ubsan_get_source_location_type ());
   DECL_CONTEXT (fields[i]) = ret;
   if (i)
DECL_CHAIN (fields[i - 1]) = fields[i];
--- gcc/asan.c.jj   2014-09-24 11:13:43.548211574 +0200
+++ gcc/asan.c  2014-09-24 12:06:13.122500445 +0200
@@ -1316,7 +1316,8 @@ asan_protect_global (tree decl)
   || DECL_SIZE (decl) == 0
   || ASAN_RED_ZONE_SIZE * BITS_PER_UNIT > MAX_OFILE_ALIGNMENT
   || !valid_constant_size_p (DECL_SIZE_UNIT (decl))
-  || DECL_ALIGN_UNIT (decl) > 2 * ASAN_RED_ZONE_SIZE)
+  || DECL_ALIGN_UNIT (decl) > 2 * ASAN_RED_ZONE_SIZE
+  || TREE_TYPE (decl) == ubsan_get_source_location_type ())
 return false;
 
   rtl = DECL_RTL (decl);
@@ -2224,8 +2225,38 @@ asan_add_global (tree decl, tree type, v
   int has_dynamic_init = vnode ? vnode->dynamically_initialized : 0;
   CONSTRUCTOR_APPEND_ELT (vinner, NULL_TREE,
  build_int_cst (uptr, has_dynamic_init));
-  CONSTRUCTOR_APPEND_ELT (vinner, NULL_TREE,
- build_int_cst 

[PATCH][match-and-simplify] Cleanup operator arity related diagnostics

2014-09-24 Thread Richard Biener

The following removes a bunch of code dealing with late verifying
of operator presence and matching arity.  This can now all be
verified at parsing, giving proper locations and operator names.
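
As a rough illustration of checking arity while parsing (not the
genmatch code itself), the sketch below uses a hypothetical op_id
record and check_arity helper; nargs == -1 means "arity not known
yet", mirroring the `int = -1' default in the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, simplified operator record; nargs == -1 means the
   arity is not known yet.  */
struct op_id
{
  const char *id;
  int nargs;
};

/* Check N_OPS against OP while parsing.  Return 0 on success and -1
   after printing a diagnostic that names the operator and LINE.  */
int
check_arity (const struct op_id *op, int n_ops, int line)
{
  assert (op != NULL);
  if (op->nargs == -1 || op->nargs == n_ops)
    return 0;
  fprintf (stderr, "line %d: %s expects %d operands, got %d operands\n",
	   line, op->id, op->nargs, n_ops);
  return -1;
}
```

Doing the check at this point means the source location is still at
hand, so no separate late verification pass is needed.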

Bootstrap pending.

Richard.

2014-09-24  Richard Biener  rguent...@suse.de

* genmatch.c (struct id_base): Move nargs member here.
(check_operator): Remove.
(check_no_user_id): Likewise.
(parse_operation): Fix error locations, handle convert0/2
properly.
(parse_expr): Error on non-matching arity.
(parse_for): Compute arity of user-ids and complain for
inconsistent substitutions.

Index: gcc/genmatch.c
===
--- gcc/genmatch.c  (revision 215550)
+++ gcc/genmatch.c  (working copy)
@@ -157,9 +157,10 @@ struct id_base : typed_noop_remove<id_ba
 {
   enum id_kind { CODE, FN, PREDICATE, USER_DEFINED } kind;
 
-  id_base (id_kind, const char *);
+  id_base (id_kind, const char *, int = -1);
 
   hashval_t hashval;
+  int nargs;
   const char *id;
 
   /* hash_table support.  */
@@ -185,10 +186,11 @@ id_base::equal (const value_type *op1,
 
 static hash_table<id_base> *operators;
 
-id_base::id_base (id_kind kind_, const char *id_)
+id_base::id_base (id_kind kind_, const char *id_, int nargs_)
 {
   kind = kind_;
   id = id_;
+  nargs = nargs_;
   hashval = htab_hash_string (id);
 }
 
@@ -196,11 +198,8 @@ struct operator_id : public id_base
 {
   operator_id (enum tree_code code_, const char *id_, unsigned nargs_,
   const char *tcc_)
-  : id_base (id_base::CODE, id_),
-  code (code_), nargs (nargs_), tcc (tcc_) {}
-  unsigned get_required_nargs () const { return nargs; }
+  : id_base (id_base::CODE, id_, nargs_), code (code_), tcc (tcc_) {}
   enum tree_code code;
-  unsigned nargs;
   const char *tcc;
 };
 
@@ -216,17 +215,15 @@ struct simplify;
 struct predicate_id : public id_base
 {
   predicate_id (const char *id_)
-: id_base (id_base::PREDICATE, id_), matchers (vNULL), nargs(-1) {}
+: id_base (id_base::PREDICATE, id_), matchers (vNULL) {}
   vec<simplify *> matchers;
-  int nargs;
 };
 
 struct user_id : public id_base
 {
   user_id (const char *id_)
-: id_base (id_base::USER_DEFINED, id_), substitutes (vNULL), nargs(-1) {}
+: id_base (id_base::USER_DEFINED, id_), substitutes (vNULL) {}
   vec<id_base *> substitutes;
-  int nargs;
 };
 
 template
@@ -830,36 +827,27 @@ lower_opt_convert (simplify *s, vec<simp
 }
 }
 
-void
-check_operator (id_base *op, unsigned n_ops, const cpp_token *token = 0)
-{
-  if (!op)
-return;
-
-  if (op->kind != id_base::CODE)
-return;
-
-  operator_id *opr = static_cast<operator_id *> (op);
-  if (opr->get_required_nargs () == n_ops)
-return;
-
-  if (token)
-fatal_at (token, "%s expects %u operands, got %u operands", opr->id, opr->get_required_nargs (), n_ops);
-  else
-fatal ("%s expects %u operands, got %u operands", opr->id, opr->get_required_nargs (), n_ops);
-}
-
 /* In AST operand O replace operator ID with operator WITH.  */
 
 operand *
 replace_id (operand *o, user_id *id, id_base *with)
 {
+  /* Deep-copy captures and expressions, replacing operations as
+ needed.  */
   if (capture *c = dyn_cast<capture *> (o))
 {
   if (!c->what)
return c;
   return new capture (c->where, replace_id (c->what, id, with));
 }
+  else if (expr *e = dyn_cast<expr *> (o))
+{
+  expr *ne = new expr (e->operation == id ? with : e->operation,
+  e->is_commutative);
+  for (unsigned i = 0; i < e->ops.length (); ++i)
+   ne->append_op (replace_id (e->ops[i], id, with));
+  return ne;
+}
 
   /* For c_expr we simply record a string replacement table which is
  applied at code-generation time.  */
@@ -870,23 +858,7 @@ replace_id (operand *o, user_id *id, id_
   return new c_expr (ce->r, ce->code, ce->nr_stmts, ids);
 }
 
-  expr *e = dyn_cast<expr *> (o);
-  if (!e)
-return o;
-
-  expr *ne;
-  if (e->operation == id)
-{
-  ne = new expr (with, e->is_commutative);
-  check_operator (ne->operation, e->ops.length ());
-}
-  else
-ne = new expr (e->operation, e->is_commutative);
-
-  for (unsigned i = 0; i < e->ops.length (); ++i)
-ne->append_op (replace_id (e->ops[i], id, with));
-
-  return ne;
+  return o;
 }
 
 /* Lower recorded fors for SIN and output to SIMPLIFIERS.  */
@@ -947,40 +919,6 @@ lower_for (simplify *sin, vec<simplify *
 simplifiers.safe_push (worklist[i]);
 }
 
-void
-check_no_user_id (operand *o)
-{
-  if (o->type == operand::OP_CAPTURE)
-{
-  capture *c = static_cast<capture *> (o);
-  if (c->what && c->what->type == operand::OP_EXPR)
-   {
- o = c->what;
- goto check_expr;
-   }
-  return;
-}
-
-  if (o->type != operand::OP_EXPR)
-return;
-
-check_expr:
-  expr *e = static_cast<expr *> (o);
-  if (e->operation->kind == id_base::USER_DEFINED)
-fatal ("%s is not defined in for", 

[PATCH][match-and-simplify] Remove outlining of C exprs

2014-09-24 Thread Richard Biener

It no longer works in the face of (with {  } which would
need to pass down all named temporaries.  Instead we can
simply inline all C exprs now that we pass 'output' to all
gen_transform calls.

Bootstrap pending on x86_64-unknown-linux-gnu.

Richard.

2014-09-24  Richard Biener  rguent...@suse.de

* genmatch.c (c_expr::output_code): Remove and inline into ...
(c_expr::gen_transform): ... here.
(outline_c_exprs): Remove.
(main): Do not call outline_c_exprs.

Index: gcc/genmatch.c
===
--- gcc/genmatch.c  (revision 215551)
+++ gcc/genmatch.c  (working copy)
@@ -355,7 +355,6 @@ struct c_expr : public operand
   vec<id_tab> ids;
 
   virtual void gen_transform (FILE *f, const char *, bool, int, const char *, 
dt_operand **);
-  void output_code (FILE *f, bool);
 };
 
 struct capture : public operand
@@ -1051,9 +1050,17 @@ expr::gen_transform (FILE *f, const char
   fprintf (f, "}\n");
 }
 
+/* Generate code for a c_expr which is either the expression inside
+   an if statement or a sequence of statements which computes a
+   result to be stored to DEST.  */
+
 void
-c_expr::output_code (FILE *f, bool for_fn)
+c_expr::gen_transform (FILE *f, const char *dest,
+  bool, int, const char *, dt_operand **)
 {
+  if (dest && nr_stmts == 1)
+fprintf (f, "%s = ", dest);
+
   unsigned stmt_nr = 1;
   for (unsigned i = 0; i < code.length (); ++i)
 {
@@ -1100,35 +1107,14 @@ c_expr::output_code (FILE *f, bool for_f
   if (token->type == CPP_SEMICOLON)
{
  stmt_nr++;
- if (for_fn && stmt_nr == nr_stmts)
-   fputs ("\n return ", f);
+ if (dest && stmt_nr == nr_stmts)
+   fprintf (f, "\n %s = ", dest);
  else
fputc ('\n', f);
}
 }
 }
 
-
-void
-c_expr::gen_transform (FILE *f, const char *dest, bool, int, const char *, 
dt_operand **)
-{
-  /* If this expression has an outlined function variant, call it.  */
-  if (fname)
-{
-  fprintf (f, "%s = %s (type, captures);\n", dest, fname);
-  return;
-}
-
-  /* All multi-stmt expressions should have been outlined.  Expressions
- with nr_stmts == 0 are used for if-expressions.  */
-  gcc_assert (nr_stmts <= 1);
-
-  if (nr_stmts == 1)
-fprintf (f, "%s = ", dest);
-
-  output_code (f, false);
-}
-
 void
 capture::gen_transform (FILE *f, const char *dest, bool gimple, int depth, 
const char *in_type, dt_operand **indexes)
 {
@@ -2172,40 +2158,6 @@ write_predicate (FILE *f, predicate_id *
 
 
 static void
-outline_c_exprs (FILE *f, struct operand *op)
-{
-  if (op->type == operand::OP_C_EXPR)
-{
-  c_expr *e = static_cast <c_expr *>(op);
-  static unsigned fnnr = 1;
-  if (e->nr_stmts > 1
-  && !e->fname)
-   {
- e->fname = (char *)xmalloc (sizeof ("cexprfn") + 4);
- sprintf (e->fname, "cexprfn%d", fnnr);
- fprintf (f, "\nstatic tree\ncexprfn%d (tree type, tree *captures)\n",
-  fnnr);
- fprintf (f, "{\n");
- e->output_code (f, true);
- fprintf (f, "}\n");
- fnnr++;
-   }
-}
-  else if (op->type == operand::OP_CAPTURE)
-{
-  capture *c = static_cast <capture *>(op);
-  if (c->what)
-   outline_c_exprs (f, c->what);
-}
-  else if (op->type == operand::OP_EXPR)
-{
-  expr *e = static_cast <expr *>(op);
-  for (unsigned i = 0; i < e->ops.length (); ++i)
-   outline_c_exprs (f, e->ops[i]);
-}
-}
-
-static void
 write_header (FILE *f, const char *head)
 {
   fprintf (f, "/* Generated automatically by the program `genmatch' from\n");
@@ -3001,10 +2953,6 @@ add_operator (CONVERT2, "CONVERT2", "tcc
   if (verbose)
 dt.print (stderr);
 
-  /* Outline complex C expressions to helper functions.  */
-  for (unsigned i = 0; i < out_simplifiers.length (); ++i)
-outline_c_exprs (stdout, out_simplifiers[i]->result);
-
   if (gimple)
 dt.gen_gimple (stdout);
   else


RE: RFA: another patch to fix PR61360

2014-09-24 Thread Gopalasubramanian, Ganesh
The r-x alternative results in vector decoding on amdfam10. This is 
AMD-speak for microcoded instructions, and AMD optimization manual strongly 
recommends avoiding them. I have CC'd Ganesh, maybe he can provide more 
relevant data on the performance impact.

Thanks Uros!

Yes, the AMD SWOG recommends precisely what Uros mentions.
<snip from SWOG for BD>
When moving data from a GPR to an XMM register, use separate store and load
instructions to move the data first from the source register to a temporary
location in memory and then from memory into the destination register.
</snip>

This is listed as an optimization too. This holds good for all amdfam10 and BD
family processors.
I have to dig through the performance numbers; I will try to get them.

Regards
Ganesh


Re: [PATCH IRA] update_equiv_regs fails to set EQUIV reg-note for pseudo with more than one definition

2014-09-24 Thread Felix Yang
Hi Jeff,

Thanks for the comments. I updated the patch, adding some enhancements.
Bootstrapped on x86_64-suse-linux. Please apply this patch if OK for trunk.

Three points:
1. A multiple-set register no longer qualifies for an equiv note once
it has been marked by no_equiv. The patch is updated with this
consideration.
2. Regarding the new rtx_insn_list interface, I noticed that the old
style XEXP accessor macros are still used in function no_equiv.
I chose to keep the old style macros in this patch and will come up
with another patch to fix this issue, OK?
3. Regarding an insn on the init_insns list that does not have a
note: I reconsidered this and found that it can never happen, so I
replaced the check with a gcc assertion.
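
The rule for multiple-set registers can be modeled roughly as below.
This is a simplified sketch, not the ira.c code: struct pseudo and
may_have_equiv are hypothetical, and each definition is reduced to a
plain integer standing in for the value its REG_EQUAL note describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one pseudo register.  */
struct pseudo
{
  bool no_equiv;       /* Set once the register was marked by no_equiv.  */
  size_t n_defs;       /* Number of definitions recorded so far.  */
  long def_value[8];   /* Value of each recorded definition.  */
};

/* Return true if a new definition with VALUE still allows REG to
   carry an equivalence: the register must not have been marked as
   having no equivalence, and every earlier definition must agree.  */
bool
may_have_equiv (const struct pseudo *reg, long value)
{
  assert (reg->n_defs <= 8);
  if (reg->no_equiv)
    return false;
  for (size_t i = 0; i < reg->n_defs; i++)
    if (reg->def_value[i] != value)
      return false;
  return true;
}
```

Once no_equiv is set it stays set, so a later definition that happens
to match cannot bring the equivalence back.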


Index: gcc/ChangeLog
===
--- gcc/ChangeLog(revision 215550)
+++ gcc/ChangeLog(working copy)
@@ -1,3 +1,11 @@
+2014-09-24  Felix Yang  felix.y...@huawei.com
+
+* ira.c (struct equivalence): Add no_equiv member.
+(no_equiv): Set no_equiv of struct equivalence if register is marked
+as having no known equivalence.
+(update_equiv_regs): Check all definitions for a multiple-set
+register to make sure that the RHS have the same value.
+
 2014-09-24  Jakub Jelinek  ja...@redhat.com

 PR sanitizer/63316
Index: gcc/ira.c
===
--- gcc/ira.c(revision 215550)
+++ gcc/ira.c(working copy)
@@ -2900,6 +2900,8 @@ struct equivalence
   /* Set when an attempt should be made to replace a register
  with the associated src_p entry.  */
   char replace;
+  /* Set if this register has no known equivalence.  */
+  char no_equiv;
 };

 /* reg_equiv[N] (where N is a pseudo reg number) is the equivalence
@@ -3247,6 +3249,7 @@ no_equiv (rtx reg, const_rtx store ATTRIBUTE_UNUSE
   if (!REG_P (reg))
 return;
   regno = REGNO (reg);
+  reg_equiv[regno].no_equiv = 1;
   list = reg_equiv[regno].init_insns;
   if (list == const0_rtx)
 return;
@@ -3258,7 +3261,7 @@ no_equiv (rtx reg, const_rtx store ATTRIBUTE_UNUSE
 return;
   ira_reg_equiv[regno].defined_p = false;
   ira_reg_equiv[regno].init_insns = NULL;
-  for (; list; list =  XEXP (list, 1))
+  for (; list; list = XEXP (list, 1))
 {
   rtx insn = XEXP (list, 0);
   remove_note (insn, find_reg_note (insn, REG_EQUIV, NULL_RTX));
@@ -3373,7 +3376,7 @@ update_equiv_regs (void)

   /* If this insn contains more (or less) than a single SET,
  only mark all destinations as having no known equivalence.  */
-  if (set == 0)
+  if (set == NULL_RTX)
 {
   note_stores (PATTERN (insn), no_equiv, NULL);
   continue;
@@ -3467,16 +3470,48 @@ update_equiv_regs (void)
   if (note && GET_CODE (XEXP (note, 0)) == EXPR_LIST)
 note = NULL_RTX;

-  if (DF_REG_DEF_COUNT (regno) != 1
-   && (! note
+  if (DF_REG_DEF_COUNT (regno) != 1)
+{
+  rtx list;
+  bool equal_p = true;
+
+  /* Check if it is possible that this multiple-set register has
+ a known equivalence.  */
+  if (reg_equiv[regno].no_equiv)
+continue;
+
+  if (! note
   || rtx_varies_p (XEXP (note, 0), 0)
   || (reg_equiv[regno].replacement
 && ! rtx_equal_p (XEXP (note, 0),
-reg_equiv[regno].replacement))))
-{
-  no_equiv (dest, set, NULL);
-  continue;
+reg_equiv[regno].replacement)))
+{
+  no_equiv (dest, set, NULL);
+  continue;
+}
+
+  list = reg_equiv[regno].init_insns;
+  for (; list; list = XEXP (list, 1))
+{
+  rtx note_tmp, insn_tmp;
+
+  insn_tmp = XEXP (list, 0);
+  note_tmp = find_reg_note (insn_tmp, REG_EQUAL, NULL_RTX);
+  gcc_assert (note_tmp);
+  if (! rtx_equal_p (XEXP (note, 0), XEXP (note_tmp, 0)))
+{
+  equal_p = false;
+  break;
+}
+}
+
+  if (! equal_p)
+{
+  no_equiv (dest, set, NULL);
+  continue;
+}
 }
+
   /* Record this insn as initializing this register.  */
   reg_equiv[regno].init_insns
 = gen_rtx_INSN_LIST (VOIDmode, insn, reg_equiv[regno].init_insns);
@@ -3505,10 +3540,9 @@ update_equiv_regs (void)
  a register used only in one basic block from a MEM.  If so, and the
  MEM remains unchanged for the life of the register, add a REG_EQUIV
  note.  */
-
   note = find_reg_note (insn, REG_EQUIV, NULL_RTX);

-  if (note == 0 && REG_BASIC_BLOCK (regno) >= NUM_FIXED_BLOCKS
+  if (note == NULL_RTX && REG_BASIC_BLOCK (regno) >= NUM_FIXED_BLOCKS
 && MEM_P (SET_SRC (set))
 && validate_equiv_mem (insn, dest, SET_SRC (set)))
 note = set_unique_reg_note (insn, REG_EQUIV, 

[PATCH, bootstrap PR63235] Fix bootstrap.

2014-09-24 Thread Kirill Yukhin
Hello,
Patch in the bottom fixes bootstrap
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63235)

gcc/
* varpool.c (varpool_node::add): Pass decl attributes
to lookup_attribute.

Is it ok for trunk?

--
Thanks, K

diff --git a/gcc/varpool.c b/gcc/varpool.c
index 8001c93..3761f14 100644
--- a/gcc/varpool.c
+++ b/gcc/varpool.c
@@ -449,7 +449,7 @@ varpool_node::add (tree decl)
   symtab->call_varpool_insertion_hooks (node);
   if (node->externally_visible_p ())
 node->externally_visible = true;
-  if (lookup_attribute ("no_reorder", decl))
+  if (lookup_attribute ("no_reorder", DECL_ATTRIBUTES (decl)))
 node->no_reorder = 1;
 }


Re: [PATCH, bootstrap PR63235] Fix bootstrap.

2014-09-24 Thread Jakub Jelinek
On Wed, Sep 24, 2014 at 04:16:50PM +0400, Kirill Yukhin wrote:
 Hello,
 Patch in the bottom fixes bootstrap
 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63235)
 
 gcc/
   * varpool.c (varpool_node::add): Pass decl attributes
   to lookup_attribute.
 
 Is it ok for trunk?

Ok, thanks.

 diff --git a/gcc/varpool.c b/gcc/varpool.c
 index 8001c93..3761f14 100644
 --- a/gcc/varpool.c
 +++ b/gcc/varpool.c
 @@ -449,7 +449,7 @@ varpool_node::add (tree decl)
symtab->call_varpool_insertion_hooks (node);
if (node->externally_visible_p ())
  node->externally_visible = true;
 -  if (lookup_attribute ("no_reorder", decl))
 +  if (lookup_attribute ("no_reorder", DECL_ATTRIBUTES (decl)))
  node->no_reorder = 1;
  }

Jakub


Re: [PATCH, testsuite]: PR 58757: Check for FP denormal values without triggering denormal exceptions

2014-09-24 Thread Marc Glisse

On Wed, 24 Sep 2014, Uros Bizjak wrote:


However, alpha *does* support all IEEE features, the only problem is
in its default model, which is for some reason "High-Performance
IEEE-Format Arithmetic" (please see alpha AHB [1], section 4.7.6.5).
"This model does not require the overhead of an operating system
completion handler and can be the fastest of the three IEEE models."
Unfortunately, this model also notifies applications of all
exceptional floating-point operations. Denormals are considered
non-finite IEEE values, so they trap.

When the target is in certain high-speed mode, it is up to the user
to obey all the limitations, in this particular case, that only IEEE
finite numbers are provided. This is not the case with the original
testcase, so I'd say that the test is out of specs. It beats me, why
-mieee is not the default on alpha, since current default suits
-ffast-math more, but it looks that we have to live with this mess.


(I believe -mieee is the default on some alpha platforms, maybe debian 
or bsd?)



To avoid traps on denormals, -mieee has to be specified. This option
enables FP software completion that completes denormal handling, so
there is no need to notify the application. IMO, instead of XFAILing
the test, we should simply provide -mieee. __*_DENORM_MIN__ should
indeed apply to the underlying FP format, not to some target-dependent
model and its implementation details.

[1] http://www.compaq.com/cpq-alphaserver/technology/literature/alphaahb.pdf


In 4.7.6.5, I see: "Underflow results are set to zero." so this is a
functional model without denormals. According to the C11 standard, this 
means DBL_HAS_SUBNORM should be 0 and DBL_TRUE_MIN should be the same as 
DBL_MIN. The same is probably true on x86 with -ffast-math.


Giving DBL_TRUE_MIN an unusable value (zero or trapping) is not very 
useful, while providing the real usable minimum lets users do something 
meaningful with it.
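
For illustration, a program that wants "the real usable minimum"
could select it along these lines. This is only a sketch:
smallest_usable_double is a hypothetical helper, and it assumes a
<float.h> that may or may not provide the C11 macros DBL_HAS_SUBNORM
and DBL_TRUE_MIN.

```c
#include <float.h>

/* Return the smallest positive double the program may rely on.
   Subnormals are used only when the implementation says they are
   present (DBL_HAS_SUBNORM == 1); otherwise fall back to DBL_MIN.  */
double
smallest_usable_double (void)
{
#if defined (DBL_HAS_SUBNORM) && defined (DBL_TRUE_MIN)
  if (DBL_HAS_SUBNORM == 1)
    return DBL_TRUE_MIN;
#endif
  return DBL_MIN;
}
```

On a target whose model flushes or traps on denormals, the expectation
argued above is that the headers report DBL_HAS_SUBNORM as 0, so the
helper never hands out a value the hardware cannot use.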


The main issue is using incompatible flags in different objects or at 
link time...


--
Marc Glisse


Re: [PATCH][AArch64] Use __aarch64_vget_lane* macros for getting the lane in some lane multiply intrinsics

2014-09-24 Thread Kyrill Tkachov

Must have slipped through the cracks.
https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00586.html
Ping?

Thanks,
Kyrill


On 08/09/14 11:29, Kyrill Tkachov wrote:

Hi all,

The included testcase currently ICEs at -O0 because vget_lane_f64 is a 
function, so if it's properly called with a constant argument but without 
constant propagation it will not be recognised as constant, causing an ICE.
This patch changes it to use the macro version directly.

I think there is work being done to fix this issue up as part of a more general 
rework, but until that comes this patch implements the concerned intrinsics 
using the __aarch64_vget_lane* macros like the other lane intrinsics around 
them.

Tested aarch64-none-elf.

Ok for trunk?

Thanks,
Kyrill

2014-09-08  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* config/aarch64/arm_neon.h (vmuld_lane_f64): Use macro for getting
the lane.
(vmuld_laneq_f64): Likewise.
(vmuls_lane_f32): Likewise.
(vmuls_laneq_f32): Likewise.

2014-09-08  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* gcc.target/aarch64/simd/vmul_lane_const_lane_1.c: New test.





Re: [PATCH] Do not remove labels with LABEL_PRESERVE_P

2014-09-24 Thread Ilya Enkovich
2014-09-24 14:35 GMT+04:00 Steven Bosscher stevenb@gmail.com:
 On Wed, Sep 24, 2014 at 11:57 AM, Ilya Enkovich wrote:
 2014-09-24 13:30 GMT+04:00 Steven Bosscher :
 Description of LABEL_PRESERVE_P says "label that should always be
 considered to be needed".

 It's more specific than that, really:

 @item LABEL_PRESERVE_P (@var{x})
 In a @code{code_label} or @code{note}, indicates that the label is 
 referenced by
 code or data not visible to the RTL of a given function.

 I read another description:
 /* 1 if RTX is a code_label that should always be considered to be needed.  */
 #define LABEL_PRESERVE_P(RTX)   \
   (RTL_FLAG_CHECK2 ("LABEL_PRESERVE_P", (RTX), CODE_LABEL, NOTE)->in_struct)

 Yes, from rtl.h.

 I'd recommend to always read the descriptions in doc/ (in this case
 doc/rtl.texi). The documentation in the header files is often not
 very comprehensive.


 The "not visible" part is important. If there are visible references
 to a label, then they should never be removed (obviously) and that
 should work through LABEL_NUSES. Unfortunately we are not very good at
 keeping LABEL_NUSES up-to-date (this is why all the
 rebuild_jump_labels() are still required).

 Does rebuild handle all kinds of instructions including those which use 
 UNSPEC?

 Yes. Patterns are walked (deep) and REG_LABEL notes are added for all
 labels encountered that are not already the JUMP_LABEL of INSN. If the
 label is reachable from XEXP(UNSPEC, 0) -- the 'E' operand -- then
 that label is visible.


 What appears to be the case here, is that you have a label between two
 basic blocks B1 and B2, and the label acts as a control flow barrier:
 B1 and B2 cannot be merged. Then this should be expressed in the CFG.
 Otherwise: What else prevents the merge_blocks CFG hooks from deleting
 the label?

 Label acts as a barrier here but it is a side effect.  I don't care
 about block merging.  I just don't want label with usages to be
 removed.

 Understood. Only, LABEL_PRESERVE_P is not the right means to achieve that.

 So let's get back to basics and see what the usages look like. AFAIU
 now, you emit the code label early, and add the references much later
 (in machine reorg?). Does your UNSPEC have the code_label as an
 operand? If so, what breaks if cfgcleanup removes the label? Is the
 insn no longer recognized? Or does the label not end up in the
 assembly output? Or ...? I can try to help figure out what breaks if
 you have a test case.

 FWIW, the LABEL_PRESERVE_P uses in config/i386/i386.c look suspect. It
 probably only works because those labels are added late, and the code
 paths that use (x86_64 large PIC code model) are not tested all that
 well...

I didn't generate references separately from the label.  Now I found an
old patch and a test where this problem appeared.  In this patch I
moved the set_rip generation currently performed in ix86_expand_prologue
into the expand pass, and I got the following code in the expand dump for
the testsuite/gcc.target/i386/pr55154.c test:

(note 7 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(note/s 2 7 3 2  NOTE_INSN_DELETED_LABEL 2)
(insn 3 2 4 2 (set (reg:DI 85)
(unspec:DI [
(label_ref [2 deleted])
] UNSPEC_SET_RIP))
/export/users/ienkovic/issues/4161/gcc/gcc/testsuite/gcc.target/i386/pr55154.c:9
-1
 (insn_list:REG_LABEL_OPERAND 2 (nil)))

A REG_LABEL_OPERAND note is generated, but the label is still removed.

Ilya



 That means even if we do not have any usages
 we shouldn't remove it.

 Sorry, no.
 Even a LABEL_PRESERVE_P label can be deleted: It will be replaced by a
 NOTE_INSN_DELETED_LABEL. See cfgrtl.c:delete_insn().

 According to description you quoted label marked by LABEL_PRESERVE_P
 is used by some code or data.  Let this use be not visible to the RTL
 of a given function.  It is still used, right? How can you remove it?

 The code_label rtx is removed, but the label itself is still output to
 the object file. The label number is retained in the CODE_LABEL_NUMBER
 of the NOTE_INSN_DELETED_LABEL. Look for how NOTE_INSN_DELETED_LABEL
 is handled in final.c. It's a hack IMHO, but that's how it has been
 since day 0 (see https://gcc.gnu.org/r104).

 Ciao!
 Steven


Re: [PATCH 0/10] OpenACC 2.0 support for libgomp

2014-09-24 Thread Jakub Jelinek
On Tue, Sep 23, 2014 at 07:17:25PM +0100, Julian Brown wrote:
 The upcoming patch series constitutes our current (still in-progress)
 implementation of run-time support for OpenACC 2.0 in libgomp. We've
 tried to build on top of the (also currently WIP) support for OpenMP
 4.0's target construct, sharing code where possible: because of this,
 I've also prepared versions of (a fairly minimal, hopefully correct set
 of) prerequisite patches that apply to current mainline (and were
 previously on the gomp 4.0 branch), although in many cases we weren't
 the original authors of those.
 
 Other parts of the OpenACC support for GCC are being sent upstream
 concurrently with this runtime support (and are co-dependent with it),
 so unfortunately, though the main part of the implementation (part 7/10)
 works on our internal branch, I haven't yet been able to convincingly
 test the series I'm about to post upstream. However this code will be
 useful to others who are posting their bits of OpenACC support
 upstream, so perhaps it'd be useful to commit it anyway (we have to
 start somewhere!).
 
 I've tried to retain proper attribution for all the forthcoming patches,
 but I may have made mistakes. Please let me know if so!

Just random comments about all the 10 patches:

--- libgomp/Makefile.am (revision 215546)
+++ libgomp/Makefile.am (working copy)
@@ -14,13 +14,35 @@ libsubincludedir = $(libdir)/gcc/$(targe
 
 vpath % $(strip $(search_path))
 
-AM_CPPFLAGS = $(addprefix -I, $(search_path))
+AM_CPPFLAGS = $(addprefix -I, $(search_path)) \
+   $(addprefix -I, $(search_path)/../include)
 AM_CFLAGS = $(XCFLAGS)
 AM_LDFLAGS = $(XLDFLAGS) $(SECTION_LDFLAGS) $(OPT_LDFLAGS)

This looks wrong, search_path is typically something like:
$(top_srcdir)/config/linux/x86 $(top_srcdir)/config/linux \
$(top_srcdir)/config/posix $(top_srcdir)
so $(search_path)/../include means you duplicate all the
*/config/* paths again.  Just add -I$(top_srcdir)/../include
to AM_CPPFLAGS.

As for plugins, my preference would be to move their sources
to a libgomp/plugins/ subdirectory and build them in that subdirectory
(for mic, which builds its plugin inside of libmicoffload it
could copy it there).

# TODO: not for OpenACC?
libgomp really needs to be built against libpthread, so if you don't
want that, you'd need to move the openacc bits to a separate shared library.

In general, I'd prefer if the stuff that gets committed to trunk
contains as few TODO: and FIXME: comments as possible, keep them
on the branch if you really need them.

 static void
+goacc_parse_device_num (void)
+{

Any reason why you don't want to use parse_int for this?
Does the standard require you parse and don't reject negative
numbers?

oacc-init.c:__thread  void *ACC_handle;
oacc-init.c:static __thread int handle_num = -1;
oacc-init.c:static __thread struct gomp_device_descr const *saved_bound_dev;
oacc-mem.c:__thread struct memmap_t *ACC_memmap;
oacc-parallel.c:static __thread struct devgeom devgeom = { 1, 1, 1 };
oacc-parallel.c:static __thread struct target_mem_desc *mapped_data = NULL;

Do you really need all those __thread vars?  As libgomp uses IE model
for performance reasons, growing the total size too much might very well
mean that the dynamic linker will refuse to dlopen it.  Couldn't you e.g.
use just a single __thread pointer to a struct that will contain all of
this?  Also, note that libgomp must be supported also for the
!HAVE_TLS case, where you shouldn't use __thread at all, use
pthread_getspecific etc. instead (so it would really help if you'd just
use a single pointer).
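
A minimal sketch of the single-pointer approach for the !HAVE_TLS
case, assuming POSIX threads. struct goacc_thread, its fields and
goacc_get_thread are hypothetical names for illustration, not the
ones from the patch series.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical aggregate of the per-thread OpenACC state.  */
struct goacc_thread
{
  void *handle;
  int handle_num;
  void *memmap;
};

static pthread_key_t goacc_key;
static pthread_once_t goacc_key_once = PTHREAD_ONCE_INIT;

static void
goacc_make_key (void)
{
  /* free () is run on each thread's struct when the thread exits.  */
  pthread_key_create (&goacc_key, free);
}

/* Return the calling thread's state, allocating it on first use.
   Return NULL if the allocation fails.  */
struct goacc_thread *
goacc_get_thread (void)
{
  pthread_once (&goacc_key_once, goacc_make_key);

  struct goacc_thread *thr = pthread_getspecific (goacc_key);
  if (thr == NULL)
    {
      thr = calloc (1, sizeof *thr);
      if (thr == NULL)
	return NULL;
      thr->handle_num = -1;
      if (pthread_setspecific (goacc_key, thr) != 0)
	{
	  free (thr);
	  return NULL;
	}
    }
  return thr;
}
```

With HAVE_TLS the same struct could hang off one __thread pointer
instead, so only a single pointer's worth of TLS is consumed either
way.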

+void
+gomp_notify(const char *msg, ...)

Formatting, missing space before (.

   char bind_var;
+  int acc_notify_var;
   /* Internal ICV.  */
   struct target_mem_desc *target_data;

This is again in TLS, and duplicated/copied on any OpenMP parallel/task
etc., so it also affects performance of #pragma omp parallel/task.
Why do you need to put ACC stuff in there?  Can't it live in
target_data or elsewhere?

+   gomp_plugin_malloc;
+   gomp_plugin_malloc_cleared;
...
Please use GOMP_PLUGIN_ instead.  Also, please make sure the entrypoints
libgomp looks for in the plugins have similar/same prefix.

+__attribute__((used)) static void
+dump_mappings (FILE *f, splay_tree_node node)
+{

IMHO this should be guarded by some define, while it can be useful for
debugging the library, it is unneeded for production libgomp.

+  if (device->get_caps_func () & TARGET_CAP_OPENMP_400)
+DLSYM (device_run);
+  if (device->get_caps_func () & TARGET_CAP_OPENACC_200)

Cache the return value?  Also, I must say I'm not particularly excited
about different plugins not supporting both OpenMP 4.0 and OpenACC 2.0
offloading.  Why is that needed?
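
Caching the capability query could look roughly like this. It is a
sketch only: struct device, the CAP_* bits, sample_get_caps and
load_plugin_caps are hypothetical and merely mirror the shape of the
quoted hunk.

```c
/* Hypothetical capability bits.  */
enum { CAP_OPENMP_400 = 1, CAP_OPENACC_200 = 2 };

/* Hypothetical device descriptor with a capability hook.  */
struct device
{
  unsigned (*get_caps_func) (void);
  int has_openmp;
  int has_openacc;
};

/* Sample hook that counts how often it is called.  */
static int sample_calls;

static unsigned
sample_get_caps (void)
{
  sample_calls++;
  return CAP_OPENACC_200;
}

/* Query the plugin once and test the cached value for each feature.  */
void
load_plugin_caps (struct device *device)
{
  unsigned caps = device->get_caps_func ();
  device->has_openmp = (caps & CAP_OPENMP_400) != 0;
  device->has_openacc = (caps & CAP_OPENACC_200) != 0;
}
```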

+  /* Make sure all the CUDA functions are there if any of them are.  */
+  if (optional_present && optional_present != optional_total)
+   {
+ err = "plugin missing OpenACC CUDA handler function";
+ goto out;
+   }

So, any plugin 

Re: [PATCH][AArch64] Use __aarch64_vget_lane* macros for getting the lane in some lane multiply intrinsics

2014-09-24 Thread Marcus Shawcroft
On 8 September 2014 11:29, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:
 Hi all,

 The included testcase currently ICEs at -O0 because vget_lane_f64 is a
 function, so if it's properly called with a constant argument but without
 constant propagation it will not be recognised as constant, causing an ICE.
 This patch changes it to use the macro version directly.

 I think there is work being done to fix this issue up as part of a more
 general rework, but until that comes this patch implements the concerned
 intrinsics using the __aarch64_vget_lane* macros like the other lane
 intrinsics around them.

 Tested aarch64-none-elf.

 Ok for trunk?

 Thanks,
 Kyrill

 2014-09-08  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * config/aarch64/arm_neon.h (vmuld_lane_f64): Use macro for getting
 the lane.
 (vmuld_laneq_f64): Likewise.
 (vmuls_lane_f32): Likewise.
 (vmuls_laneq_f32): Likewise.

 2014-09-08  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * gcc.target/aarch64/simd/vmul_lane_const_lane_1.c: New test.

OK /Marcus


Re: [PATCH] PR63300 'const volatile' sometimes stripped in debug info.

2014-09-24 Thread Andreas Arnez
On Tue, Sep 23 2014, Mark Wielaard wrote:

 This certainly looks nicer than how I wrote it. It took me a while
 (again) to realize why this works. We rely on the fact that earlier in
 the function a match would have been found if there was already a fully
 qualified type available. So here we know some subset will be found and
 at least one qualifier we need will not be in the result returned by
 get_nearest_type_subqualifiers. Maybe add a comment saying that to the
 code?

 Could you add the testcases I wrote for my variant of the fix to your
 patch and make sure they PASS?

OK, here's the adjusted patch with a comment added and your testcases
included.  I changed the patch a bit further, to reduce unnecessary
iterations and recursions, and tested it again.

A few style aspects I'm not sure about:

* Is it OK to use __builtin_popcount in tree.c?

* Is it acceptable to add such a specialized function as
  get_nearest_type_subqualifiers to the tree interface?  Or would it be
  preferable to move it as a static function to dwarf2out.c, since
  that's the only user right now?

-- 8 --
Subject: [PATCH v3] PR63300 'const volatile' sometimes stripped in debug info.

When adding DW_TAG_restrict_type the handling of multiple modifiers
was adjusted incorrectly.  This patch fixes it with the help of a new
tree function get_nearest_type_subqualifiers.  The old tests didn't
catch this case because there always was an existing sub-qualified
type already.  The new guality testcase fails before and succeeds
after this patch.  The new dwarf2 testcases make sure the optimization
works and doesn't introduce unnecessary type tags.
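
A rough sketch of the selection idea behind get_nearest_type_subqualifiers,
assuming the candidate qualifier sets are given as a plain array of bit
masks.  The Q_* bits and both toy_* functions are hypothetical; the real
function walks the type's variants inside GCC and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical qualifier bits standing in for TYPE_QUAL_*.  */
enum { Q_CONST = 1, Q_VOLATILE = 2, Q_RESTRICT = 4 };

/* Portable bit count, used only to rank candidates.  */
static int
toy_popcount (unsigned int x)
{
  int n = 0;
  for (; x != 0; x &= x - 1)
    n++;
  return n;
}

/* Among the qualifier sets in AVAILABLE, return the largest proper
   subset of WANTED, or 0 (unqualified) if there is none.  */
static int
toy_nearest_subqualifiers (const int *available, size_t n, int wanted)
{
  int best = 0, best_rank = 0;

  for (size_t i = 0; i < n; i++)
    {
      int q = available[i];

      if ((q & wanted) == q && q != wanted
          && toy_popcount ((unsigned int) q) > best_rank)
        {
          best = q;
          best_rank = toy_popcount ((unsigned int) q);
        }
    }
  return best;
}
```

In this toy model the qualifiers still to be added on top of the result
are `wanted & ~best`.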

gcc/ChangeLog

* tree.c (check_base_type): New.
(check_qualified_type): Exploit new helper function above.
(get_nearest_type_subqualifiers): New.
* tree.h (get_nearest_type_subqualifiers): New prototype.
* dwarf2out.c (modified_type_die): Fix handling for qualifiers.
Next qualifier to peel off is now determined with the help of
get_nearest_type_subqualifiers.

gcc/testsuite/ChangeLog

* gcc.dg/debug/dwarf2/stacked-qualified-types-1.c: New testcase.
* gcc.dg/debug/dwarf2/stacked-qualified-types-2.c: Likewise.
* gcc.dg/guality/pr63300-const-volatile.c: New testcase.
---
 gcc/dwarf2out.c| 62 +++---
 .../debug/dwarf2/stacked-qualified-types-1.c   | 18 +++
 .../debug/dwarf2/stacked-qualified-types-2.c   | 19 +++
 .../gcc.dg/guality/pr63300-const-volatile.c| 12 +
 gcc/tree.c | 52 --
 gcc/tree.h |  7 +++
 6 files changed, 135 insertions(+), 35 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.dg/debug/dwarf2/stacked-qualified-types-1.c
 create mode 100644 
gcc/testsuite/gcc.dg/debug/dwarf2/stacked-qualified-types-2.c
 create mode 100644 gcc/testsuite/gcc.dg/guality/pr63300-const-volatile.c

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index e87ade2..abd9df9 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -10474,12 +10474,14 @@ modified_type_die (tree type, int cv_quals, 
dw_die_ref context_die)
   tree qualified_type;
   tree name, low, high;
   dw_die_ref mod_scope;
+  /* Only these cv-qualifiers are currently handled.  */
+  const int cv_qual_mask = (TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE
+   | TYPE_QUAL_RESTRICT);
 
   if (code == ERROR_MARK)
 return NULL;
 
-  /* Only these cv-qualifiers are currently handled.  */
-  cv_quals &= (TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE | TYPE_QUAL_RESTRICT);
+  cv_quals &= cv_qual_mask;
 
   /* Don't emit DW_TAG_restrict_type for DWARFv2, since it is a type
  tag modifier (and not an attribute) old consumers won't be able
@@ -10530,7 +10532,7 @@ modified_type_die (tree type, int cv_quals, dw_die_ref 
context_die)
   else
{
  int dquals = TYPE_QUALS_NO_ADDR_SPACE (dtype);
- dquals &= (TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE | TYPE_QUAL_RESTRICT);
+ dquals &= cv_qual_mask;
  if ((dquals & ~cv_quals) != TYPE_UNQUALIFIED
  || (cv_quals == dquals && DECL_ORIGINAL_TYPE (name) != type))
/* cv-unqualified version of named type.  Just use
@@ -10543,33 +10545,33 @@ modified_type_die (tree type, int cv_quals, 
dw_die_ref context_die)
 
   mod_scope = scope_die_for (type, context_die);
 
-  if ((cv_quals & TYPE_QUAL_CONST)
-  /* If there are multiple type modifiers, prefer a path which
-leads to a qualified type.  */
-  && (((cv_quals & ~TYPE_QUAL_CONST) == TYPE_UNQUALIFIED)
- || get_qualified_type (type, cv_quals) == NULL_TREE
- || (get_qualified_type (type, cv_quals & ~TYPE_QUAL_CONST)
- != NULL_TREE)))
-{
-  mod_type_die = new_die (DW_TAG_const_type, mod_scope, type);
-  sub_die = modified_type_die (type, cv_quals & ~TYPE_QUAL_CONST,
-  context_die);

[PATCH, rs6000] Teach analyze_swaps to avoid vec_ste

2014-09-24 Thread Bill Schmidt
Hi,

The analyze_swaps pass performs special handling on certain non-swapping
loads and stores so that computations involving them can still be
optimized.  However, the intent was to avoid this for lvx, stvx, lve*,
and stve*.  The existing logic excludes these by looking for a PARALLEL
as the rtx code for the insn body.  It turns out this works for lvx,
stvx, and lve*, but stve* was implemented slightly differently, so this
check doesn't catch it.  This patch fixes the problem by looking for the
pattern that matches stve* as well; we now exclude any store with an
UNSPEC as its SET_SRC.
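
To illustrate the new condition, here is a toy model that reduces each
store to the code of its insn body and the code of its SET source.  The
enum, struct and function names are hypothetical; only the shape of the
test (a SET body whose source is not an UNSPEC) is taken from the
description above.

```c
#include <assert.h>

/* Toy rtx codes standing in for GCC's SET, PARALLEL, UNSPEC and any
   ordinary source expression.  */
enum toy_code { TOY_SET, TOY_PARALLEL, TOY_UNSPEC, TOY_REG };

struct toy_store
{
  enum toy_code body;  /* Code of the insn body.  */
  enum toy_code src;   /* Code of the SET source; only meaningful
                          when body is TOY_SET.  */
};

/* Return 1 if the non-permuting store may get the special fix-up.  */
static int
toy_store_gets_special_handling (const struct toy_store *s)
{
  return s->body == TOY_SET && s->src != TOY_UNSPEC;
}
```

In this model a PARALLEL body (the stvx-like case) and a SET with an
UNSPEC source (the stve*-like case) are both rejected, while an ordinary
SET store is accepted.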

I've added a new compile-time test case to verify the fix.  The test
ICEs on existing trunk but passes with the new changes.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
regressions.  Is this ok for trunk?

Thanks,
Bill


[gcc]

2014-09-24  Bill Schmidt  wschm...@linux.vnet.ibm.com

* config/rs6000/rs6000.c (insn_is_swappable_p): Don't provide
special handling for stores whose SET_SRC is an UNSPEC (such as
UNSPEC_STVE).


[gcc/testsuite]

2014-09-24  Bill Schmidt  wschm...@linux.vnet.ibm.com

* gcc.target/powerpc/swaps-p8-17.c: New test.


Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 215486)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -33793,9 +33793,10 @@ insn_is_swappable_p (swap_web_entry *insn_entry, r
 return 0;
 
   /* Loads and stores seen here are not permuting, but we can still
- fix them up by converting them to permuting ones.  Exception:
- UNSPEC_LVX and UNSPEC_STVX, which have a PARALLEL body instead
- of a SET.  */
+ fix them up by converting them to permuting ones.  Exceptions:
+ UNSPEC_LVE, UNSPEC_LVX, and UNSPEC_STVX, which have a PARALLEL
+ body instead of a SET; and UNSPEC_STVE, which has an UNSPEC
+ for the SET source.  */
   rtx body = PATTERN (insn);
   int i = INSN_UID (insn);
 
@@ -33812,7 +33813,7 @@ insn_is_swappable_p (swap_web_entry *insn_entry, r
 
   if (insn_entry[i].is_store)
 {
-  if (GET_CODE (body) == SET)
+  if (GET_CODE (body) == SET && GET_CODE (SET_SRC (body)) != UNSPEC)
{
  *special = SH_NOSWAP_ST;
  return 1;
Index: gcc/testsuite/gcc.target/powerpc/swaps-p8-17.c
===
--- gcc/testsuite/gcc.target/powerpc/swaps-p8-17.c  (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/swaps-p8-17.c  (working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile { target { powerpc64le-*-* } } } */
+/* { dg-options "-mcpu=power8 -O1" } */
+/* { dg-final { scan-assembler "lxvd2x" } } */
+/* { dg-final { scan-assembler "xxpermdi" } } */
+
+/* Verify that we don't try to do permute removal in the presence of
+   vec_ste.  This used to ICE.  */
+#include <altivec.h>
+
+void f (void *p)
+{
+  vector unsigned int u32 = vec_vsx_ld (1, (const unsigned int *)p);
+  vec_ste (u32, 1, (unsigned int *)p);
+}




Re: [PATCH] Do not remove labels with LABEL_PRESERVE_P

2014-09-24 Thread Steven Bosscher
On Wed, Sep 24, 2014 at 2:30 PM, Ilya Enkovich wrote:
 I didn't generate references separately from label.  Now I found an
 old patch and a test where this problem appeared.  In this patch I
 moved set_rip generation currently performed in ix86_expand_prologue
 into expand pass.  And I got following code in expand dump for
 testsuite/gcc.target/i386/pr55154.c test:

 (note 7 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
 (note/s 2 7 3 2  NOTE_INSN_DELETED_LABEL 2)
 (insn 3 2 4 2 (set (reg:DI 85)
 (unspec:DI [
 (label_ref [2 deleted])
 ] UNSPEC_SET_RIP))
 /export/users/ienkovic/issues/4161/gcc/gcc/testsuite/gcc.target/i386/pr55154.c:9
 -1
  (insn_list:REG_LABEL_OPERAND 2 (nil)))

 There is a REG_LABEL_OPERAND generated but label is still removed.

Because it should be a REG_LABEL_TARGET?

AFAUI this is a control flow insn so I'd expect it to be a jump_insn
(and the note will be a TARGET note). But it's not a PC-set insn and a
jump target the compiler will interpret as an infinite loop (if the
insns are really in the order as above) which is clearly not what you
want. So if you emit it as a jump_insn I'm not sure what will
happen...

Is it necessary to emit the label into a basic block?

Ciao!
Steven


[PATCH i386 AVX512] [53/n] Update vec_setmode_0 pattern constraints.

2014-09-24 Thread Kirill Yukhin
Hello,
The patch at the bottom extends the constraints of the
vec_setmode_0 insn pattern to cover EVEX.

Bootstrapped.
AVX-512* tests on top of patch-set all pass
under simulator.

Is it ok for trunk?

gcc/
* config/i386/sse.md
(define_insn vec_setmode_0): Add EVEX version.

--
Thanks, K

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index c9d6e00..5f2fe5b 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -6259,13 +6259,13 @@
 ;; see comment above inline_secondary_memory_needed function in i386.c
 (define_insn vec_setmode_0
   [(set (match_operand:VI4F_128 0 nonimmediate_operand
- =x,x,x ,x,x,x,x  ,x  ,m ,m   ,m)
+ =v,v,v ,x,x,v,x  ,x  ,m ,m   ,m)
(vec_merge:VI4F_128
  (vec_duplicate:VI4F_128
(match_operand:ssescalarmode 2 general_operand
-  x,m,*r,m,x,x,*rm,*rm,!x,!*re,!*fF))
+  v,m,*r,m,x,v,*rm,*rm,!x,!*re,!*fF))
  (match_operand:VI4F_128 1 vector_move_operand
-  C,C,C ,C,0,x,0  ,x  ,0 ,0   ,0)
+  C,C,C ,C,0,v,0  ,x  ,0 ,0   ,0)
  (const_int 1)))]
   TARGET_SSE
   @


Re: [PATCH] Do not remove labels with LABEL_PRESERVE_P

2014-09-24 Thread Ilya Enkovich
2014-09-24 16:47 GMT+04:00 Steven Bosscher stevenb@gmail.com:
 On Wed, Sep 24, 2014 at 2:30 PM, Ilya Enkovich wrote:
 I didn't generate references separately from label.  Now I found an
 old patch and a test where this problem appeared.  In this patch I
 moved set_rip generation currently performed in ix86_expand_prologue
 into expand pass.  And I got following code in expand dump for
 testsuite/gcc.target/i386/pr55154.c test:

 (note 7 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
 (note/s 2 7 3 2  NOTE_INSN_DELETED_LABEL 2)
 (insn 3 2 4 2 (set (reg:DI 85)
 (unspec:DI [
 (label_ref [2 deleted])
 ] UNSPEC_SET_RIP))
 /export/users/ienkovic/issues/4161/gcc/gcc/testsuite/gcc.target/i386/pr55154.c:9
 -1
  (insn_list:REG_LABEL_OPERAND 2 (nil)))

 There is a REG_LABEL_OPERAND generated but label is still removed.

 Because it should be a REG_LABEL_TARGET?

 AFAUI this is a control flow insn so I'd expect it to be a jump_insn
 (and the note will be a TARGET note). But it's not a PC-set insn and a
 jump target the compiler will interpret as an infinite loop (if the
 insns are really in the order as above) which is clearly not what you
 want. So if you emit it as a jump_insn I'm not sure what will
 happen...

 Is it necessary to emit the label into a basic block?

It is not a control flow instruction. It copies value of instruction
pointer into a general purpose register.  Therefore REG_LABEL_OPERAND
seems to be correct.

Ilya


 Ciao!
 Steven


[PATCH i386 AVX512] [54/n] Add mov[dlh]dup insns support.

2014-09-24 Thread Kirill Yukhin
Hello,
The patch at the bottom adds support for the
vmov[dlh]dup insns.

Bootstrapped.
AVX-512* tests on top of patch-set all pass
under simulator.

Is it ok for trunk?

gcc/
* config/i386/sse.md
(define_insn avx_movshdup256mask_name): Add masking.
(define_insn sse3_movshdupmask_name): Ditto.
(define_insn avx_movsldup256mask_name): Ditto.
(define_insn sse3_movsldupmask_name): Ditto.
(define_insn vec_dupv2dfmask_name): Ditto.
(define_insn *vec_concatv2df): Add EVEX version.

--
Thanks, K

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 5f2fe5b..862c280 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -5776,34 +5776,34 @@
 
 ;; These are modeled with the same vec_concat as the others so that we
 ;; capture users of shufps that can use the new instructions
-(define_insn avx_movshdup256
-  [(set (match_operand:V8SF 0 register_operand =x)
+(define_insn avx_movshdup256mask_name
+  [(set (match_operand:V8SF 0 register_operand =v)
(vec_select:V8SF
  (vec_concat:V16SF
-   (match_operand:V8SF 1 nonimmediate_operand xm)
+   (match_operand:V8SF 1 nonimmediate_operand vm)
(match_dup 1))
  (parallel [(const_int 1) (const_int 1)
 (const_int 3) (const_int 3)
 (const_int 5) (const_int 5)
 (const_int 7) (const_int 7)])))]
-  TARGET_AVX
-  vmovshdup\t{%1, %0|%0, %1}
+  TARGET_AVX  mask_avx512vl_condition
+  vmovshdup\t{%1, %0mask_operand2|%0mask_operand2, %1}
   [(set_attr type sse)
(set_attr prefix vex)
(set_attr mode V8SF)])
 
-(define_insn sse3_movshdup
-  [(set (match_operand:V4SF 0 register_operand =x)
+(define_insn sse3_movshdupmask_name
+  [(set (match_operand:V4SF 0 register_operand =v)
(vec_select:V4SF
  (vec_concat:V8SF
-   (match_operand:V4SF 1 nonimmediate_operand xm)
+   (match_operand:V4SF 1 nonimmediate_operand vm)
(match_dup 1))
  (parallel [(const_int 1)
 (const_int 1)
 (const_int 7)
 (const_int 7)])))]
-  TARGET_SSE3
-  %vmovshdup\t{%1, %0|%0, %1}
+  TARGET_SSE3  mask_avx512vl_condition
+  %vmovshdup\t{%1, %0mask_operand2|%0mask_operand2, %1}
   [(set_attr type sse)
(set_attr prefix_rep 1)
(set_attr prefix maybe_vex)
@@ -5829,34 +5829,34 @@
(set_attr prefix evex)
(set_attr mode V16SF)])
 
-(define_insn avx_movsldup256
-  [(set (match_operand:V8SF 0 register_operand =x)
+(define_insn avx_movsldup256mask_name
+  [(set (match_operand:V8SF 0 register_operand =v)
(vec_select:V8SF
  (vec_concat:V16SF
-   (match_operand:V8SF 1 nonimmediate_operand xm)
+   (match_operand:V8SF 1 nonimmediate_operand vm)
(match_dup 1))
  (parallel [(const_int 0) (const_int 0)
 (const_int 2) (const_int 2)
 (const_int 4) (const_int 4)
 (const_int 6) (const_int 6)])))]
-  TARGET_AVX
-  vmovsldup\t{%1, %0|%0, %1}
+  TARGET_AVX  mask_avx512vl_condition
+  vmovsldup\t{%1, %0mask_operand2|%0mask_operand2, %1}
   [(set_attr type sse)
(set_attr prefix vex)
(set_attr mode V8SF)])
 
-(define_insn sse3_movsldup
-  [(set (match_operand:V4SF 0 register_operand =x)
+(define_insn sse3_movsldupmask_name
+  [(set (match_operand:V4SF 0 register_operand =v)
(vec_select:V4SF
  (vec_concat:V8SF
-   (match_operand:V4SF 1 nonimmediate_operand xm)
+   (match_operand:V4SF 1 nonimmediate_operand vm)
(match_dup 1))
  (parallel [(const_int 0)
 (const_int 0)
 (const_int 6)
 (const_int 6)])))]
-  TARGET_SSE3
-  %vmovsldup\t{%1, %0|%0, %1}
+  TARGET_SSE3  mask_avx512vl_condition
+  %vmovsldup\t{%1, %0mask_operand2|%0mask_operand2, %1}
   [(set_attr type sse)
(set_attr prefix_rep 1)
(set_attr prefix maybe_vex)
@@ -8342,24 +8342,24 @@
(set_attr prefix orig,vex,orig,vex,maybe_vex,orig,orig,vex,maybe_vex)
(set_attr mode DF,DF,V1DF,V1DF,V1DF,V2DF,V1DF,V1DF,V1DF)])
 
-(define_insn vec_dupv2df
-  [(set (match_operand:V2DF 0 register_operand =x,x)
+(define_insn vec_dupv2dfmask_name
+  [(set (match_operand:V2DF 0 register_operand =x,v)
(vec_duplicate:V2DF
- (match_operand:DF 1 nonimmediate_operand  0,xm)))]
-  TARGET_SSE2
+ (match_operand:DF 1 nonimmediate_operand  0,vm)))]
+  TARGET_SSE2  mask_avx512vl_condition
   @
unpcklpd\t%0, %0
-   %vmovddup\t{%1, %0|%0, %1}
+   %vmovddup\t{%1, %0mask_operand2|%0mask_operand2, %1}
   [(set_attr isa noavx,sse3)
(set_attr type sselog1)
(set_attr prefix orig,maybe_vex)
(set_attr mode V2DF,DF)])
 
 (define_insn *vec_concatv2df
-  [(set (match_operand:V2DF 0 register_operand =x,x,x,x,x,x,x,x)
+  [(set (match_operand:V2DF 0 register_operand =x,v,v,x,x,v,x,x)
(vec_concat:V2DF
- 

[PATCH i386 AVX512] [55/n] Extend `perm' insn patterns.

2014-09-24 Thread Kirill Yukhin
Hello,
The patch at the bottom extends the `perm' insn
patterns.

Bootstrapped.
AVX-512* tests on top of patch-set all pass
under simulator.

Is it ok for trunk?

gcc/
* config/i386/sse.md
(define_expand avx2_avx512f_permmode): Rename to ...
(define_expand avx2_avx512bw_permmode): this.
(define_expand avx512_permmode_mask): Add 128/256-bit wide
version.
(define_insn avx2_avx512f_permmode_1mask_name): Rename to ...
(define_insn avx2_avx512bw_permmode_1mask_name): this.

--
Thanks, K

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 862c280..7c02629 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -15962,14 +15962,14 @@
(set_attr prefix mask_prefix2)
(set_attr mode sseinsnmode)])
 
-(define_expand avx2_avx512f_permmode
+(define_expand avx2_avx512_permmode
   [(match_operand:VI8F_256_512 0 register_operand)
(match_operand:VI8F_256_512 1 nonimmediate_operand)
(match_operand:SI 2 const_0_to_255_operand)]
   TARGET_AVX2
 {
   int mask = INTVAL (operands[2]);
-  emit_insn (gen_avx2_avx512f_permmode_1 (operands[0], operands[1],
+  emit_insn (gen_avx2_avx512_permmode_1 (operands[0], operands[1],
  GEN_INT ((mask >> 0) & 3),
  GEN_INT ((mask >> 2) & 3),
  GEN_INT ((mask >> 4) & 3),
@@ -15977,16 +15977,16 @@
   DONE;
 })
 
-(define_expand avx512f_permmode_mask
-  [(match_operand:V8FI 0 register_operand)
-   (match_operand:V8FI 1 nonimmediate_operand)
+(define_expand avx512_permmode_mask
+  [(match_operand:VI8F_256_512 0 register_operand)
+   (match_operand:VI8F_256_512 1 nonimmediate_operand)
(match_operand:SI 2 const_0_to_255_operand)
-   (match_operand:V8FI 3 vector_move_operand)
+   (match_operand:VI8F_256_512 3 vector_move_operand)
(match_operand:avx512fmaskmode 4 register_operand)]
   TARGET_AVX512F
 {
   int mask = INTVAL (operands[2]);
-  emit_insn (gen_avx2_avx512f_permmode_1_mask (operands[0], operands[1],
+  emit_insn (gen_avx2_avx512_permmode_1_mask (operands[0], operands[1],
   GEN_INT ((mask >> 0) & 3),
   GEN_INT ((mask >> 2) & 3),
   GEN_INT ((mask >> 4) & 3),
@@ -15995,7 +15995,7 @@
   DONE;
 })
 
-(define_insn avx2_avx512f_permmode_1mask_name
+(define_insn avx2_avx512_permmode_1mask_name
   [(set (match_operand:VI8F_256_512 0 register_operand =v)
(vec_select:VI8F_256_512
  (match_operand:VI8F_256_512 1 nonimmediate_operand vm)


[wwwdocs] Update C++1y status page now that C++14 is finished.

2014-09-24 Thread Jonathan Wakely

C++14 is no longer the next standard, it's here, so update the project
page.

Committed to cvs.
Index: projects/cxx1y.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/cxx1y.html,v
retrieving revision 1.15
retrieving revision 1.16
diff -u -u -r1.15 -r1.16
--- projects/cxx1y.html 23 Aug 2014 06:56:45 -  1.15
+++ projects/cxx1y.html 24 Sep 2014 12:46:05 -  1.16
@@ -13,28 +13,28 @@
 body
   h1C++1y/C++14 Support in GCC/h1
 
-  pGCC is beginning to introduce support for the next revision of the C++
-  standard, which is expected to be published in 2014./p
+  pGCC is beginning to introduce support for the latest revision of the C++
+  standard, which was published in 2014./p
 
-  pC++1y features are available as part of the mainline GCC
+  pC++14 features are available as part of the mainline GCC
 compiler in the trunk of
 a href=../svn.htmlGCC's Subversion
-  repository/a and in GCC 4.8 and later. To enable C++1y
-  support, add the command-line parameter code-std=c++1y/code
+  repository/a and in GCC 4.8 and later. To enable C++14
+  support, add the command-line parameter code-std=c++14/code
   to your codeg++/code command line. Or, to enable GNU
   extensions in addition to C++0x extensions,
-  add code-std=gnu++1y/code to your codeg++/code command
+  add code-std=gnu++14/code to your codeg++/code command
   line./p
 
-  pstrongImportant/strong: Because the ISO C++14 draft is still
-  evolving, GCC's support is strongexperimental/strong.  No attempt
+  pstrongImportant/strong: Because the final ISO C++14 standard was only
+  recently published, GCC's support is strongexperimental/strong.  No 
attempt
   will be made to maintain backward compatibility with implementations of
-  C++1y features that do not reflect the final standard./p
+  C++14 features that do not reflect the final standard./p
 
-h2C++1y Language Features/h2
+h2C++14 Language Features/h2
 
-  pThe following table lists new language features that have been
-  accepted into the C++1y standard. The Proposal column
+  pThe following table lists new language features that are part of
+  the C++14 standard. The Proposal column
   provides a link to the ISO C++ committee proposal that describes the
   feature, while the Available in GCC? column indicates the first
   version of GCC that contains an implementation of this feature (if
@@ -127,11 +127,11 @@
 /tr
   /table
 
-  !-- h2C++11 Library Features/h2 --
+  h2C++14 Library Features/h2
 
-  !-- pThe status of the library implementation can be tracked in this --
-  !--   a 
href=https://gcc.gnu.org/onlinedocs/libstdc++/manual/status.html#status.iso.200x;table/a
 --
-  !-- /p --
+  pThe status of the library implementation can be tracked in this
+  a 
href=https://gcc.gnu.org/onlinedocs/libstdc++/manual/status.html#status.iso.2014;table/a
+  /p
 
   h2Development Branches/h2
 


[wwwdocs] Update libstdc++ section of gcc-5/changes.html

2014-09-24 Thread Jonathan Wakely

I forgot to send this patch a couple of days ago.

Document recent libstdc++ changes on trunk. Also tweak the 4.9 notes
slightly for consistency.

Committed to cvs.
Index: gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.9
retrieving revision 1.10
diff -u -r1.9 -r1.10
--- gcc-5/changes.html  5 Sep 2014 08:25:46 -   1.9
+++ gcc-5/changes.html  22 Sep 2014 15:31:58 -  1.10
@@ -84,12 +84,16 @@
 lia 
href=https://gcc.gnu.org/onlinedocs/libstdc++/manual/status.html#status.iso.2011;
   Improved support for C++11/a, including:
   ul
+li codestd::deque/code meets the allocator-aware container 
requirements;/li
+li movable and swappable iostream classes;/li
 li support for codestd::aligned_union/code;/li
   /ul
 /li
 liAn implementation of codestd::experimental::any/code./li
 liNew random number distributions codelogistic_distribution/code and
   codeuniform_on_sphere_distribution/code as extensions./li
+lia 
href=https://sourceware.org/gdb/current/onlinedocs/gdb/Xmethods-In-Python.html;GDB
+  Xmethods/a for codestd::vector/code and 
codestd::unique_ptr/code;/li
   /ul
 
 h3 id=fortranFortran/h3
Index: gcc-4.9/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.9/changes.html,v
retrieving revision 1.79
retrieving revision 1.80
diff -u -r1.79 -r1.80
--- gcc-4.9/changes.html23 Aug 2014 15:58:55 -  1.79
+++ gcc-4.9/changes.html22 Sep 2014 15:31:57 -  1.80
@@ -344,7 +344,7 @@
 li implemention of codestd::make_unique/code; /li
 li implemention of codestd::shared_lock/code; /li
 li making codestd::result_of/code SFINAE-friendly; /li
-li adding codeoperator()/code to codeintegral_constant/code; 
/li
+li adding codeoperator()/code to 
codestd::integral_constant/code; /li
 li adding user-defined literals for standard library types
  codestd::basic_string/code, 
codestd::chrono::duration/code,
  and codestd::complex/code; /li


Re: [PATCH] PR63300 'const volatile' sometimes stripped in debug info.

2014-09-24 Thread Jakub Jelinek
On Wed, Sep 24, 2014 at 02:40:14PM +0200, Andreas Arnez wrote:
 A few style aspects I'm not sure about:
 
 * Is it OK to use __builtin_popcount in tree.c?

Definitely not; you can use popcount_hwi instead, which for a GCC
host compiler (>= 3.4) will use __builtin_popcount*, and otherwise
falls back to a library function.
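
A sketch of the kind of fallback described here, assuming a standalone
helper.  toy_popcount_ul is a hypothetical name and is not GCC's
popcount_hwi; it only shows the pattern of using the builtin when the
host compiler is a recent enough GCC and a portable loop otherwise.

```c
#include <assert.h>

/* Hypothetical helper, not GCC's popcount_hwi.  */
static int
toy_popcount_ul (unsigned long x)
{
#if defined (__GNUC__) \
    && (__GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 4))
  /* Host compiler provides the builtin.  */
  return __builtin_popcountl (x);
#else
  /* Portable fallback: clear the lowest set bit until none remain.  */
  int n = 0;
  for (; x != 0; x &= x - 1)
    n++;
  return n;
#endif
}
```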

 * Is it acceptable to add such a specialized function as
   get_nearest_type_subqualifiers to the tree interface?  Or would it be
   preferable to move it as a static function to dwarf2out.c, since
   that's the only user right now?

I agree it should be kept in dwarf2out.c, it is too specialized.

Jakub


Re: [PATCH, rs6000] Teach analyze_swaps to avoid vec_ste

2014-09-24 Thread David Edelsohn
On Wed, Sep 24, 2014 at 8:46 AM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 The analyze_swaps pass performs special handling on certain non-swapping
 loads and stores so that computations involving them can still be
 optimized.  However, the intent was to avoid this for lvx, stvx, lve*,
 and stve*.  The existing logic excludes these by looking for a PARALLEL
 as the rtx code for the insn body.  It turns out this works for lvx,
 stvx, and lve*, but stve* was implemented slightly differently, so this
 check doesn't catch it.  This patch fixes the problem by looking for the
 pattern that matches stve* as well; we now exclude any store with an
 UNSPEC as its SET_SRC.

 I've added a new compile-time test case to verify the fix.  The test
 ICEs on existing trunk but passes with the new changes.

 Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
 regressions.  Is this ok for trunk?

 Thanks,
 Bill


 [gcc]

 2014-09-24  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * config/rs6000/rs6000.c (insn_is_swappable_p): Don't provide
 special handling for stores whose SET_SRC is an UNSPEC (such as
 UNSPEC_STVE).


 [gcc/testsuite]

 2014-09-24  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * gcc.target/powerpc/swaps-p8-17.c: New test.

Okay.

Thanks, David


Re: [PATCH, i386, Pointer Bounds Checker 33/x] MPX ABI

2014-09-24 Thread Ilya Enkovich
2014-09-24 11:05 GMT+04:00 Ilya Enkovich enkovich@gmail.com:
 2014-09-23 22:01 GMT+04:00 Jeff Law l...@redhat.com:
 On 09/23/14 00:31, Ilya Enkovich wrote:


 I did this change a couple of years ago and don't remember exactly
 what problem was caused by PARALLEL.  But from my comment it seems
 PARALLEL led to the values in BND0 and BND1 not actually being defined by
 the call from the DF point of view.  I'll try to reproduce the problem I had.

 Please do.  That would indicate a bug in the DF infrastructure.  I'm not
 real familiar with the DF implementation, but a quick glance at
 df_def_record_1 seems to indicate it's got support for a set destination
 being a PARALLEL.

 This kind of scheme also doesn't tend to play well with exception
 handling and scheduling because you can't guarantee the sets and the call
 are in the same block and scheduled as a single group.


 How can the sets and the call not be in the same block/group if all of
 them are parts of a single instruction?

 Obviously in the cases where we've had these problems in the past they were
 distinct instructions.  So EH interactions isn't going to be an issue for
 MPX.

 However, we've still got the problem that the RTL you've generated is
 ill-formed.  If I understand things correctly, the assignments are the
 result of the call, that should be modeled by having the destination be a
 PARALLEL as mentioned earlier.

 OK, I will try it.  BTW, the call_value_pop patterns have two sets: one for
 the returned value and one for the stack register.  How does that differ
 from what I do with bound regs?

 Thanks,
 Ilya




 Jeff

I tried to generate PARALLEL with all regs set by call.  Here is a
memset call I got:

(call_insn 23 22 24 2 (set (parallel [
(expr_list:REG_DEP_TRUE (reg:DI 0 ax)
(const_int 0 [0]))
(expr_list:REG_DEP_TRUE (reg:BND64 77 bnd0)
(const_int 0 [0]))
(expr_list:REG_DEP_TRUE (reg:BND64 78 bnd1)
(const_int 0 [0]))
])
(call/j (mem:QI (symbol_ref:DI (memset) [flags 0x41]
function_decl 0x77f79400 memset.chkp) [0 __builtin_memset S1
A8])
(const_int 0 [0])))
/export/users/ienkovic/mpx/tests/own-tests/255/test-255.c:11 652
{*call_value}
 (expr_list:REG_RETURNED (reg/f:DI 100)
(expr_list:REG_DEAD (reg:DI 5 di)
(expr_list:REG_DEAD (reg:SI 4 si)
(expr_list:REG_DEAD (reg:DI 1 dx)
(expr_list:REG_UNUSED (reg:BND64 78 bnd1)
(expr_list:REG_UNUSED (reg:BND64 77 bnd0)
(expr_list:REG_UNUSED (reg:DI 0 ax)
(expr_list:REG_CALL_DECL
(symbol_ref:DI (memset) [flags 0x41] function_decl 0x77f79400
memset.chkp)
(expr_list:REG_EH_REGION (const_int 0 [0])
(nil))
(expr_list:DI (set (reg:DI 0 ax)
(reg:DI 5 di))
(expr_list:DI (use (reg:DI 5 di))
(expr_list:BND64 (use (reg:BND64 77 bnd0))
(expr_list:SI (use (reg:SI 4 si))
(expr_list:DI (use (reg:DI 1 dx))
(nil)))

During register allocation LRA generated a weird move instruction:

(insn 63 0 0 (set (reg/f:DI 100)
(parallel [
(expr_list:REG_DEP_TRUE (reg:DI 0 ax)
(const_int 0 [0]))
(expr_list:REG_DEP_TRUE (reg:BND64 77 bnd0)
(const_int 0 [0]))
(expr_list:REG_DEP_TRUE (reg:BND64 78 bnd1)
(const_int 0 [0]))
])) -1
 (nil))

Which caused ICE later in LRA.  This move happens because of
REG_RETURNED (reg/f:DI 100) (see condition in inherit_in_ebb at
lra-constraints.c:5312).  Thus this code in LRA doesn't accept
PARALLEL dest for calls.

So my question is: should I go to the trouble of enabling a PARALLEL
call destination, or are the current sets OK, given that we would
still have multiple sets due to the call_value_pop patterns?

Thanks,
Ilya


[GOOGLE] Fix new tests

2014-09-24 Thread Teresa Johnson
The new tests added for -mpatch-functions-for-instrumentation did not
correctly restrict themselves to x86_64 since tree-prof.exp doesn't
support dg-do. Work around this by using target selectors on the
dg-options. I apply the -mpatch and related options only if it is
x86_64, otherwise it simply does splitting.

Ok for google branches?

Teresa

2014-09-24  Teresa Johnson  tejohn...@google.com

* testsuite/gcc.dg/tree-prof/cold_partition_patch.c:
* testsuite/g++.dg/tree-prof/partition_patch.C:

Index: testsuite/gcc.dg/tree-prof/cold_partition_patch.c
===
--- testsuite/gcc.dg/tree-prof/cold_partition_patch.c   (revision 215525)
+++ testsuite/gcc.dg/tree-prof/cold_partition_patch.c   (working copy)
@@ -1,8 +1,7 @@
 /* Check if patching works with function splitting. */
-/* { dg-do compile { target x86_64-*-* } } */
 /* { dg-require-effective-target freorder } */
-/* { dg-options -O2 -freorder-blocks-and-partition -save-temps
-mpatch-functions-for-instrumentation -fno-optimize-sibling-calls  }
*/
-
+/* { dg-options -O2 -freorder-blocks-and-partition -save-temps  {
target { ! x86_64-*-* } } }
+/* { dg-options -O2 -freorder-blocks-and-partition -save-temps
-mpatch-functions-for-instrumentation -fno-optimize-sibling-calls  {
target x86_64-*-* } } */
 #define SIZE 1

 const char *sarr[SIZE];
Index: testsuite/g++.dg/tree-prof/partition_patch.C
===
--- testsuite/g++.dg/tree-prof/partition_patch.C(revision 215525)
+++ testsuite/g++.dg/tree-prof/partition_patch.C(working copy)
@@ -1,7 +1,7 @@
 // Check if patching works with function splitting.
-// { dg-do compile { target x86_64-*-* } }
 // { dg-require-effective-target freorder }
-// { dg-options -O2 -fnon-call-exceptions
-freorder-blocks-and-partition -mpatch-functions-for-instrumentation
-fno-optimize-sibling-calls  }
+// { dg-options -O2 -fnon-call-exceptions
-freorder-blocks-and-partition  { target { ! x86_64-*-* } } }
+// { dg-options -O2 -fnon-call-exceptions
-freorder-blocks-and-partition -mpatch-functions-for-instrumentation
-fno-optimize-sibling-calls  { target x86_64-*-* } }

 int k;


-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


Re: [patch] Implement move semantics for iostreams

2014-09-24 Thread Jonathan Wakely

On 23/09/14 15:58 +0200, Rainer Orth wrote:

This patch broke Solaris bootstrap with Sun ld: when linking
libstdc++.so, ld complains

ld: fatal: libstdc++-symbols.ver-sun: 4520: symbol 'std::basic_ios<char, std::char_traits<char> 
>::move(std::basic_ios<char, std::char_traits<char> >&&)': symbol version conflict

and many more.  In that case, I find that this symbol is matched by
both the GLIBCXX_3.4 and GLIBCXX_3.4.21 patterns:

   GLIBCXX_3.4
   ##std::basic_i[g-r]* (cxx)
   _ZNSt9basic_iosIcSt11char_traitsIcEE4moveEOS2_;

   GLIBCXX_3.4.21
   ##_ZNSt9basic_iosI[cw]St11char_traitsI[cw]EE4moveE[OR]S2_ (glob)
   _ZNSt9basic_iosIcSt11char_traitsIcEE4moveEOS2_;


Rainer, I think this patch should fix it, could you test it please?

(I tried installing Solaris in a VM but couldn't get it to work, maybe
I should use the VirtualBox image instead of trying qemu/kvm.)
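
To make the conflict concrete, here is a minimal sketch using POSIX fnmatch() as a stand-in for the linker's glob matching. This is an assumption: Sun ld and GNU ld have their own matchers, and the helper and variable names below are hypothetical. It only shows that the old demangled GLIBCXX_3.4 pattern and the new mangled GLIBCXX_3.4.21 pattern both claim the move symbol, while the stricter constructor-only pattern from the patch does not.

```c
#include <fnmatch.h>
#include <stdbool.h>

/* Hypothetical helper: true if NAME matches the shell-style PATTERN.
   fnmatch() only approximates what a linker does with version scripts.  */
static bool
glob_matches (const char *pattern, const char *name)
{
  return fnmatch (pattern, name, 0) == 0;
}

/* The symbol from the ld error, in demangled and mangled form.  */
static const char demangled_move[] =
  "std::basic_ios<char, std::char_traits<char> >"
  "::move(std::basic_ios<char, std::char_traits<char> >&&)";
static const char mangled_move[] =
  "_ZNSt9basic_iosIcSt11char_traitsIcEE4moveEOS2_";

/* Old GLIBCXX_3.4 pattern, matched against demangled names.  */
static const char old_3_4_pattern[] = "std::basic_i[g-r]*";

/* New GLIBCXX_3.4.21 pattern, matched against mangled names.  */
static const char new_3_4_21_pattern[] =
  "_ZNSt9basic_iosI[cw]St11char_traitsI[cw]EE4moveE[OR]S2_";

/* Stricter GLIBCXX_3.4 pattern from the patch: constructors only.  */
static const char strict_3_4_pattern[] =
  "_ZNSt9basic_iosI[cw]St11char_traitsI[cw]EEC*";

/* True when both version nodes would claim the same symbol.  */
static bool
symbol_version_conflict (void)
{
  return glob_matches (old_3_4_pattern, demangled_move)
	 && glob_matches (new_3_4_21_pattern, mangled_move);
}
```

With the broad std::basic_i[g-r]* pattern gone, only the GLIBCXX_3.4.21 node matches the symbol, which is what the stricter patterns are meant to achieve.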


commit 61937e94b69fb848efd7925364fbb965ade8a444
Author: Jonathan Wakely jwak...@redhat.com
Date:   Wed Sep 24 14:24:38 2014 +0100

	* config/abi/pre/gnu.ver: Make GLIBCXX_3.4 patterns stricter so the
	new GLIBCXX_3.4.21 symbols don't match them.

diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b/libstdc++-v3/config/abi/pre/gnu.ver
index 58c90d6..f736240 100644
--- a/libstdc++-v3/config/abi/pre/gnu.ver
+++ b/libstdc++-v3/config/abi/pre/gnu.ver
@@ -39,10 +39,11 @@ GLIBCXX_3.4 {
   std::basic_[g-h]*;
   std::basic_i[a-e]*;
 # std::basic_ifstream;
-  std::basic_i[g-r]*;
+# std::basic_ios;
+# std::basic_iostream;
   std::basic_istr[a-d]*;
 # std::basic_istream;
-  std::basic_istr[f-z]*;
+# std::basic_istringstream;
   std::basic_i[t-z]*;
   std::basic_[j-n]*;
   std::basic_o[a-e]*;
@@ -50,12 +51,12 @@ GLIBCXX_3.4 {
 # std::basic_o[g-z]*;
   std::basic_o[g-r]*;
   std::basic_ostr[a-d]*;
-  std::basic_ostr[f-z]*;
+# std::basic_ostringstream;
   std::basic_[p-r]*;
 # std::basic_streambuf
 # std::basic_string
 # std::basic_stringbuf
-  std::basic_stringstream*;
+# std::basic_stringstream;
   std::basic_[t-z]*;
   std::ba[t-z]*;
   std::b[b-z]*;
@@ -94,7 +95,7 @@ GLIBCXX_3.4 {
   std::i[p-r]*;
 # std::istream
 # std::istreambuf_iterator
-  std::istringstream*;
+# std::istringstream*;
   std::istrstream*;
   std::i[t-z]*;
   std::[A-Zj-k]*;
@@ -306,12 +307,14 @@ GLIBCXX_3.4 {
 # std::basic_streambuf
 _ZNSt15basic_streambufI[cw]St11char_traitsI[cw]EE[CD]*;
 _ZNKSt15basic_streambufI[cw]St11char_traitsI[cw]EE[0-9]*;
-_ZNSt15basic_streambufI[cw]St11char_traitsI[cw]EE[0-9][a-z][^t]*;
+_ZNSt15basic_streambufI[cw]St11char_traitsI[cw]EE4set[gp]*;
+_ZNSt15basic_streambufI[cw]St11char_traitsI[cw]EE4sync*;
+_ZNSt15basic_streambufI[cw]St11char_traitsI[cw]EE[5-9][a-z][^t]*;
 _ZNSt15basic_streambufI[cw]St11char_traitsI[cw]EE[0-9][0-9][a-z][^t]*;
 _ZNSt15basic_streambufI[cw]St11char_traitsI[cw]EEaSERKS2_;
 
 # std::basic_stringbuf
-_ZNSt15basic_stringbufI[cw]St11char_traitsI[cw]ESaI[cw]EEC*;
+_ZNSt15basic_stringbufI[cw]St11char_traitsI[cw]ESaI[cw]EEC[12]E[RS]*;
 _ZNSt15basic_stringbufI[cw]St11char_traitsI[cw]ESaI[cw]EED[^2]*;
 _ZNSt15basic_stringbufI[cw]St11char_traitsI[cw]ESaI[cw]EE[0-9][a-r]*;
 _ZNSt15basic_stringbufI[cw]St11char_traitsI[cw]ESaI[cw]EE[0-9]seek*;
@@ -325,12 +328,46 @@ GLIBCXX_3.4 {
 _ZNSt15basic_stringbufI[cw]St11char_traitsI[cw]ESaI[cw]EE[0-9]_M_[q-z]*;
 _ZNSt15basic_stringbufI[cw]St11char_traitsI[cw]ESaI[cw]EE[0-9][0-9]_M_[a-z]*;
 
-# std::basic_iostream constructors, destructors
-_ZNSdC*;
+# std::basic_istringstream
+_ZNSt19basic_istringstreamI[cw]St11char_traitsI[cw]ESaI[cw]EEC[12]E[RS]*;
+_ZNSt19basic_istringstreamI[cw]St11char_traitsI[cw]ESaI[cw]EED*;
+_ZNSt19basic_istringstreamI[cw]St11char_traitsI[cw]ESaI[cw]EE3str*;
+_ZNKSt19basic_istringstream*;
+
+# std::basic_ostringstream
+_ZNSt19basic_ostringstreamI[cw]St11char_traitsI[cw]ESaI[cw]EEC[12]E[RS]*;
+_ZNSt19basic_ostringstreamI[cw]St11char_traitsI[cw]ESaI[cw]EED*;
+_ZNSt19basic_ostringstreamI[cw]St11char_traitsI[cw]ESaI[cw]EE3str*;
+_ZNKSt19basic_ostringstream*;
+
+# std::basic_stringstream
+_ZNSt18basic_stringstreamI[cw]St11char_traitsI[cw]ESaI[cw]EEC[12]E[RS]*;
+_ZNSt18basic_stringstreamI[cw]St11char_traitsI[cw]ESaI[cw]EED*;
+_ZNSt18basic_stringstreamI[cw]St11char_traitsI[cw]ESaI[cw]EE3str*;
+_ZNKSt18basic_stringstream*;
+
+# std::basic_iostream constructors (except move), destructors
+_ZNSdC[12]Ev;
+_ZNSdC[12]EP*;
 _ZNSdD*;
+_ZNSt14basic_iostreamIwSt11char_traitsIwEEC[12]Ev;
+_ZNSt14basic_iostreamIwSt11char_traitsIwEEC[12]EP*;
+_ZNSt14basic_iostreamIwSt11char_traitsIwEED*;
+
+# std::basic_ios constructors, destructors
+_ZNSt9basic_iosI[cw]St11char_traitsI[cw]EEC*;
+_ZNSt9basic_iosI[cw]St11char_traitsI[cw]EED*;
+
+# std::basic_ios members (except move, swap, set_rdbuf)
+

Re: [PATCH] PR63300 'const volatile' sometimes stripped in debug info.

2014-09-24 Thread Andreas Arnez
On Wed, Sep 24 2014, Jakub Jelinek wrote:

 On Wed, Sep 24, 2014 at 02:40:14PM +0200, Andreas Arnez wrote:
 A few style aspects I'm not sure about:
 
 * Is it OK to use __builtin_popcount in tree.c?

 Definitely not, you can use popcount_hwi instead, which for GCC
 host compiler (>= 3.4) will use __builtin_popcount*, otherwise
 fallback to a library function.

 * Is it acceptable to add such a specialized function as
   get_nearest_type_subqualifiers to the tree interface?  Or would it be
   preferable to move it as a static function to dwarf2out.c, since
   that's the only user right now?

 I agree it should be kept in dwarf2out.c, it is too specialized.

   Jakub

OK, I'm using popcount_hwi now and moved get_nearest_type_subqualifiers
to dwarf2out.c.  Does this look OK?
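
For readers who want the selection logic in isolation, here is a self-contained sketch of the "largest strict subset of qualifiers" search. It is only an illustration: the plain array of qualifier sets stands in for the TYPE_NEXT_VARIANT chain, the QUAL_* constants and function names are hypothetical, and the check_base_type test from the real patch is omitted.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the TYPE_QUAL_* bits.  */
enum { QUAL_CONST = 1, QUAL_VOLATILE = 2, QUAL_RESTRICT = 4 };

/* Portable population count, playing the role of popcount_hwi.  */
static int
toy_popcount (unsigned int x)
{
  int n = 0;
  for (; x != 0; x &= x - 1)
    n++;
  return n;
}

/* Among the N qualifier sets in VARIANTS, return the one that is the
   largest strict subset of TYPE_QUALS.  Qualifiers outside QUAL_MASK
   are ignored.  Return 0 (unqualified) if no variant qualifies.  */
static int
nearest_subqualifiers (const int *variants, size_t n,
		       int type_quals, int qual_mask)
{
  int best_rank = 0, best_qual = 0, max_rank;

  type_quals &= qual_mask;
  max_rank = toy_popcount ((unsigned int) type_quals) - 1;

  /* Stop early once a subset with one qualifier fewer is found.  */
  for (size_t i = 0; i < n && best_rank < max_rank; i++)
    {
      int q = variants[i] & qual_mask;

      /* Q must be a subset of TYPE_QUALS, but not equal to it.  */
      if ((q & type_quals) == q && q != type_quals)
	{
	  int rank = toy_popcount ((unsigned int) q);

	  if (rank > best_rank)
	    {
	      best_rank = rank;
	      best_qual = q;
	    }
	}
    }

  return best_qual;
}
```

For example, asking for const volatile restrict when const and const volatile variants already exist picks const volatile, so only one extra modifier needs to be chained on top.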

-- >8 --
Subject: [PATCH v4] PR63300 'const volatile' sometimes stripped in debug info.

When adding DW_TAG_restrict_type the handling of multiple modifiers
was adjusted incorrectly.  This patch fixes it with the help of a new
tree function get_nearest_type_subqualifiers.  The old tests didn't
catch this case because there always was an existing sub-qualified
type already.  The new guality testcase fails before and succeeds
after this patch.  The new dwarf2 testcases make sure the optimization
works and doesn't introduce unnecessary type tags.

gcc/ChangeLog

* tree.c (check_base_type): New.
(check_qualified_type): Exploit new helper function above.
* tree.h (check_base_type): New prototype.
* dwarf2out.c (get_nearest_type_subqualifiers): New.
(modified_type_die): Fix handling for qualifiers.  Qualifiers to
peel off are now determined using get_nearest_type_subqualifiers.

gcc/testsuite/ChangeLog

* gcc.dg/debug/dwarf2/stacked-qualified-types-1.c: New testcase.
* gcc.dg/debug/dwarf2/stacked-qualified-types-2.c: Likewise.
* gcc.dg/guality/pr63300-const-volatile.c: New testcase.
---
 gcc/dwarf2out.c| 96 +++---
 .../debug/dwarf2/stacked-qualified-types-1.c   | 18 
 .../debug/dwarf2/stacked-qualified-types-2.c   | 19 +
 .../gcc.dg/guality/pr63300-const-volatile.c| 12 +++
 gcc/tree.c | 16 +++-
 gcc/tree.h |  4 +
 6 files changed, 131 insertions(+), 34 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.dg/debug/dwarf2/stacked-qualified-types-1.c
 create mode 100644 
gcc/testsuite/gcc.dg/debug/dwarf2/stacked-qualified-types-2.c
 create mode 100644 gcc/testsuite/gcc.dg/guality/pr63300-const-volatile.c

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index e87ade2..e15b42b 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -10461,6 +10461,40 @@ decl_quals (const_tree decl)
 ? TYPE_QUAL_VOLATILE : TYPE_UNQUALIFIED));
 }
 
+/* Determine the TYPE whose qualifiers match the largest strict subset
+   of the given TYPE_QUALS, and return its qualifiers.  Ignore all
+   qualifiers outside QUAL_MASK.  */
+
+static int
+get_nearest_type_subqualifiers (tree type, int type_quals, int qual_mask)
+{
+  tree t;
+  int best_rank = 0, best_qual = 0, max_rank;
+
+  type_quals &= qual_mask;
+  max_rank = popcount_hwi (type_quals) - 1;
+
+  for (t = TYPE_MAIN_VARIANT (type); t && best_rank < max_rank;
+   t = TYPE_NEXT_VARIANT (t))
+{
+  int q = TYPE_QUALS (t) & qual_mask;
+
+  if ((q & type_quals) == q && q != type_quals
+  && check_base_type (t, type))
+   {
+ int rank = popcount_hwi (q);
+
+ if (rank > best_rank)
+   {
+ best_rank = rank;
+ best_qual = q;
+   }
+   }
+}
+
+  return best_qual;
+}
+
 /* Given a pointer to an arbitrary ..._TYPE tree node, return a debugging
entry that chains various modifiers in front of the given type.  */
 
@@ -10474,12 +10508,14 @@ modified_type_die (tree type, int cv_quals, 
dw_die_ref context_die)
   tree qualified_type;
   tree name, low, high;
   dw_die_ref mod_scope;
+  /* Only these cv-qualifiers are currently handled.  */
+  const int cv_qual_mask = (TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE
+   | TYPE_QUAL_RESTRICT);
 
   if (code == ERROR_MARK)
 return NULL;
 
-  /* Only these cv-qualifiers are currently handled.  */
-  cv_quals &= (TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE | TYPE_QUAL_RESTRICT);
+  cv_quals &= cv_qual_mask;
 
   /* Don't emit DW_TAG_restrict_type for DWARFv2, since it is a type
  tag modifier (and not an attribute) old consumers won't be able
@@ -10530,7 +10566,7 @@ modified_type_die (tree type, int cv_quals, dw_die_ref 
context_die)
   else
{
  int dquals = TYPE_QUALS_NO_ADDR_SPACE (dtype);
- dquals &= (TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE | TYPE_QUAL_RESTRICT);
+ dquals &= cv_qual_mask;
  if ((dquals & ~cv_quals) != TYPE_UNQUALIFIED
  || (cv_quals == dquals && DECL_ORIGINAL_TYPE (name) != type))

Re: std::regex: inserting std::wregex to std::vector loses some std::wregex values

2014-09-24 Thread Jonathan Wakely

On 23/09/14 23:11 -0700, Tim Shen wrote:

So I'll change the patch to move _M_traits to _NFA, and add a new
basic_regex::_M_loc member.


Here it is :). Bootstrapped and tested with debug flag.


OK for trunk - thanks.



Re: [PATCH] Fix asan optimization for aligned accesses. (PR sanitizer/63316)

2014-09-24 Thread ygribov
 BTW, I've noticed that perhaps using BIT_AND_EXPR for the 
 (shadow != 0) & ((base_addr & 7) + (real_size_in_bytes - 1) >= shadow) 
 tests isn't best, maybe we could get better code if we expanded it as 
 (shadow != 0) && ((base_addr & 7) + (real_size_in_bytes - 1) >= shadow) 
 (i.e. an extra basic block containing the second half of the test 
 and fastpath for the shadow == 0 case if it is sufficiently common 
 (probably it is)).

BIT_AND_EXPR allows efficient branchless implementation on platforms which
allow chained conditional compares (e.g. ARM). You can't repro this on
current trunk though because I'm still waiting for ccmp patches from
Zhenqiang Chen to be approved :(
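
To spell out the two alternatives being compared, here is a small C sketch of the same predicate written both ways. The function names are hypothetical and this is not the code asan actually emits. It only shows that the two forms compute the same result and differ in whether the second half is evaluated unconditionally.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative only: these names are not GCC's.  SHADOW is the signed
   shadow byte, BASE_ADDR the accessed address, SIZE the access size
   in bytes.  */

/* Branchless form: both halves are always evaluated and combined with
   a bitwise AND, which is what BIT_AND_EXPR expresses.  */
static bool
slow_path_bit_and (int8_t shadow, uintptr_t base_addr, int size)
{
  return (shadow != 0) & ((int) (base_addr & 7) + (size - 1) >= shadow);
}

/* Short-circuit form: the second half is evaluated only when the
   shadow byte is non-zero, giving a fast path for shadow == 0.  */
static bool
slow_path_short_circuit (int8_t shadow, uintptr_t base_addr, int size)
{
  return shadow != 0 && (int) (base_addr & 7) + (size - 1) >= shadow;
}
```

Which form wins depends on the target: the first suits chained conditional compares, the second adds a branch but skips work when shadow == 0 is the common case.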

 Will try to code this up unless somebody beats me to 
 that, but if somebody volunteered to benchmark such a change, it would 
 be very much appreciated.

AFAIK LLVM team recently got some 1% on SPEC from this.

-Y



--
View this message in context: 
http://gcc.1065356.n5.nabble.com/Re-please-verify-my-mail-to-community-tp1066917p1073370.html
Sent from the gcc - patches mailing list archive at Nabble.com.


Re: [PATCH] Do not remove labels with LABEL_PRESERVE_P

2014-09-24 Thread Steven Bosscher
On Wed, Sep 24, 2014 at 2:51 PM, Ilya Enkovich wrote:
 2014-09-24 16:47 GMT+04:00 Steven Bosscher :
 It is not a control flow instruction. It copies value of instruction
 pointer into a general purpose register.  Therefore REG_LABEL_OPERAND
 seems to be correct.

OK - sorry for being a bit slow on the up-take, I got confused by the
asm syntax :-)

So I'm going to speculate a bit more... What you want to have is:

foo:
insns...
L2:
leaq L2(%rip), rXX


What happens is that L2 is deleted, which is to say converted to a
NOTE_INSN_DELETED_LABEL. Then the notes are re-ordered
(NOTE_INSN_DELETED_LABEL notes are not tied to anything in the insns
stream and can end up anywhere) so you end up with something like,

foo:
L2: # (was deleted)
insns...
leaq L2(%rip),rXX

I bet you'd find that in the failing test case the label is output to
the assembly file but it's simply in the wrong place.  For the large
code model, we get away with it because the prologue is output late
and the order of the insns is not adjusted (a few passes later, the
CFG doesn't even exist anymore so you don't go through cfgcleanup).
But if you emit the label early and let it go through the entire RTL
pipeline then anything can happen.

If the above makes sense, then you'll want to emit the label late, or
not at all, to the insns stream.

If you emit the label late into the insns stream, you'd rewrite the
set_rip as a define_insn_and_split that emits the label as part of the
last splitting pass. But there is no splitting pass late enough to
guarantee that the label and insns won't get separated.

If you don't emit the label to the insns stream, you would write
ix86_output_set_rip() and call that from the define_insns for set_rip.
You'd not emit the label in the expander. You'd create it and make it
an operand, but not emit it. Your ix86_output_set_rip() would write
the label and the set_rip instruction. This is probably the only way
to make 100% sure that the label is always exactly at the set_rip
instruction.

Something like below (completely untested, etc...).

Hope this helps,

Ciao!
Steven


Index: config/i386/i386-protos.h
===
--- config/i386/i386-protos.h   (revision 215483)
+++ config/i386/i386-protos.h   (working copy)
@@ -303,6 +303,7 @@ extern enum attr_cpu ix86_schedule;
 #endif

 extern const char * ix86_output_call_insn (rtx_insn *insn, rtx call_op);
+extern const char * ix86_output_set_rip_insn (rtx *operands);

 #ifdef RTX_CODE
 /* Target data for multipass lookahead scheduling.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 215483)
+++ config/i386/i386.c  (working copy)
@@ -11225,8 +11225,6 @@ ix86_expand_prologue (void)

  gcc_assert (Pmode == DImode);
  label = gen_label_rtx ();
- emit_label (label);
- LABEL_PRESERVE_P (label) = 1;
  tmp_reg = gen_rtx_REG (Pmode, R11_REG);
  gcc_assert (REGNO (pic_offset_table_rtx) != REGNO (tmp_reg));
  insn = emit_insn (gen_set_rip_rex64 (pic_offset_table_rtx,
@@ -12034,8 +12032,6 @@ ix86_expand_split_stack_prologue (void)
  rtx x;

  label = gen_label_rtx ();
- emit_label (label);
- LABEL_PRESERVE_P (label) = 1;
  emit_insn (gen_set_rip_rex64 (reg10, label));
  emit_insn (gen_set_got_offset_rex64 (reg11, label));
  emit_insn (ix86_gen_add3 (reg10, reg10, reg11));
@@ -25016,6 +25012,17 @@ ix86_output_call_insn (rtx_insn *insn, rtx call_op

   return "";
 }
+
+/* Output the assembly for a SET_RIP instruction.  We do so with this output
+   function to ensure that the label and %rip load instruction are together. */
+
+const char *
+ix86_output_set_rip_insn (rtx *operands)
+{
+  output_asm_label (operands[1]);
+  output_asm_insn ("lea{q}\t{%l1(%%rip), %0|%0, %l1[rip]}", operands);
+  return "";
+}


 /* Clear stack slot assignments remembered from previous functions.
This is called from INIT_EXPANDERS once before RTL is emitted for each
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 215483)
+++ config/i386/i386.md (working copy)
@@ -12010,7 +12010,7 @@
   [(set (match_operand:DI 0 "register_operand" "=r")
 (unspec:DI [(label_ref (match_operand 1))] UNSPEC_SET_RIP))]
   "TARGET_64BIT"
-  "lea{q}\t{%l1(%%rip), %0|%0, %l1[rip]}"
+  "* return ix86_output_set_rip_insn (operands);"
   [(set_attr "type" "lea")
 (set_attr "length_address" "4")
 (set_attr "mode" "DI")])


Re: [PATCH] Fix asan optimization for aligned accesses. (PR sanitizer/63316)

2014-09-24 Thread ygribov
 AFAIK LLVM team recently got some 1% on SPEC from this. 

On x64 that is.





Re: Updated no_reorder patchkit

2014-09-24 Thread Martin Liška

On 09/16/2014 05:15 AM, Andi Kleen wrote:

This version addresses earlier comments and has an updated testsuite
(still no LTO tests however). The assembler statements also
now stay in order with ordered statements.

It doesn't disable sorting of partitions with ordered symbols. I think
that's an existing bug and is best addressed separately.

Passed LTO boot strap and test on x86_64-linux, plus build
of a large project that needs LTO order.

-Andi



Hello.

I've just merged trunk to my branch and observed regression connected to this 
patchset:

../../../libcilkrts/runtime/config/x86/os-unix-sysdep.c:114:5: internal 
compiler error: tree check: expected tree_list, have var_decl in 
get_attribute_name, at attribs.c:679
 if (__builtin_cpu_supports("sse"))
 ^
0xc757a4 tree_check_failed(tree_node const*, char const*, int, char const*, ...)
../../gcc/tree.c:9167
0x566a35 tree_check
../../gcc/tree.h:2967
0x566a35 get_attribute_name(tree_node const*)
../../gcc/attribs.c:679
0xc788b5 private_lookup_attribute(char const*, unsigned long, tree_node*)
../../gcc/tree.c:5753
0xcd0468 lookup_attribute
../../gcc/tree.h:3773
0xcd0468 varpool_node::add(tree_node*)
../../gcc/varpool.c:452
0xced982 fold_builtin_cpu
../../gcc/config/i386/i386.c:32480
0x6826ef fold_builtin_call_array(unsigned int, tree_node*, tree_node*, int, 
tree_node**)
../../gcc/builtins.c:10565
0x59ec54 build_function_call_vec(unsigned int, vec<unsigned int, va_heap, vl_ptr>, 
tree_node*, vec<tree_node*, va_gc, vl_embed>*, vec<tree_node*, va_gc, vl_embed>*)
../../gcc/c/c-typeck.c:2958
0x5c659e c_parser_postfix_expression_after_primary
../../gcc/c/c-parser.c:7770
0x5b97bb c_parser_postfix_expression
../../gcc/c/c-parser.c:7590
0x5bbe6a c_parser_unary_expression
../../gcc/c/c-parser.c:6517
0x5c1ff6 c_parser_cast_expression
../../gcc/c/c-parser.c:6355
0x5c2235 c_parser_binary_expression
../../gcc/c/c-parser.c:6170
0x5c2de5 c_parser_conditional_expression
../../gcc/c/c-parser.c:5946
0x5c3420 c_parser_expr_no_commas
../../gcc/c/c-parser.c:5864
0x5c3ae6 c_parser_expression
../../gcc/c/c-parser.c:7897
0x5c45a9 c_parser_expression_conv
../../gcc/c/c-parser.c:7930
0x5c4622 c_parser_condition
../../gcc/c/c-parser.c:5050
0x5c46b7 c_parser_paren_condition
../../gcc/c/c-parser.c:5069


There's missing DECL_ATTRIBUTES in varpool.c in lookup_attribute call.
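
As a toy model of why the ICE says "expected tree_list, have var_decl", consider the sketch below. All names are hypothetical and this is not GCC's tree API. The lookup walks a chain of list nodes, so handing it the declaration itself rather than the declaration's attribute list trips the node-kind check on the first node.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-ins for tree codes.  */
enum toy_code { TOY_VAR_DECL, TOY_TREE_LIST };

struct toy_node
{
  enum toy_code code;
  const char *name;             /* TOY_TREE_LIST: attribute name.  */
  struct toy_node *chain;       /* TOY_TREE_LIST: next attribute.  */
  struct toy_node *attributes;  /* TOY_VAR_DECL: attribute list.  */
};

/* Return 1 if NAME is found in LIST, 0 if it is not, and -1 if a node
   of the wrong kind is encountered.  The -1 case models the
   "tree check" failure; the real compiler aborts instead.  */
static int
toy_lookup_attribute (const char *name, const struct toy_node *list)
{
  for (; list != NULL; list = list->chain)
    {
      if (list->code != TOY_TREE_LIST)
	return -1;
      if (strcmp (list->name, name) == 0)
	return 1;
    }
  return 0;
}
```

Passing the declaration yields the error case, while passing its attributes field (the analogue of DECL_ATTRIBUTES (decl)) finds the attribute.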

Ready for trunk?

Martin
diff --git a/gcc/varpool.c b/gcc/varpool.c
index 8001c93..3761f14 100644
--- a/gcc/varpool.c
+++ b/gcc/varpool.c
@@ -449,7 +449,7 @@ varpool_node::add (tree decl)
   symtab->call_varpool_insertion_hooks (node);
   if (node->externally_visible_p ())
     node->externally_visible = true;
-  if (lookup_attribute ("no_reorder", decl))
+  if (lookup_attribute ("no_reorder", DECL_ATTRIBUTES (decl)))
     node->no_reorder = 1;
 }
 


Re: Updated no_reorder patchkit

2014-09-24 Thread Jan Hubicka
 There's missing DECL_ATTRIBUTES in varpool.c in lookup_attribute call.
 
 Ready for trunk?
OK,
thanks
Honza
 
 Martin

 diff --git a/gcc/varpool.c b/gcc/varpool.c
 index 8001c93..3761f14 100644
 --- a/gcc/varpool.c
 +++ b/gcc/varpool.c
 @@ -449,7 +449,7 @@ varpool_node::add (tree decl)
    symtab->call_varpool_insertion_hooks (node);
    if (node->externally_visible_p ())
      node->externally_visible = true;
 -  if (lookup_attribute ("no_reorder", decl))
 +  if (lookup_attribute ("no_reorder", DECL_ATTRIBUTES (decl)))
      node->no_reorder = 1;
  }
  



Re: Updated no_reorder patchkit

2014-09-24 Thread Jakub Jelinek
On Wed, Sep 24, 2014 at 04:16:44PM +0200, Martin Liška wrote:
 On 09/16/2014 05:15 AM, Andi Kleen wrote:
 This version addresses earlier comments and has an updated testsuite
 (still no LTO tests however). The assembler statements also
 now stay in order with ordered statements.
 
 It doesn't disable sorting of partitions with ordered symbols. I think
 that's an existing bug and is best addressed separately.
 
 Passed LTO boot strap and test on x86_64-linux, plus build
 of a large project that needs LTO order.
 
 -Andi
 
 
 Hello.
 
 I've just merged trunk to my branch and observed regression connected to this 
 patchset:

This is already fixed, see r215552.

Jakub


Re: Updated no_reorder patchkit

2014-09-24 Thread Martin Liška

On 09/24/2014 04:17 PM, Jan Hubicka wrote:

There's missing DECL_ATTRIBUTES in varpool.c in lookup_attribute call.

Ready for trunk?

OK,
thanks
Honza


Ah, it has been fixed in r215552.

Martin



Martin



diff --git a/gcc/varpool.c b/gcc/varpool.c
index 8001c93..3761f14 100644
--- a/gcc/varpool.c
+++ b/gcc/varpool.c
@@ -449,7 +449,7 @@ varpool_node::add (tree decl)
    symtab->call_varpool_insertion_hooks (node);
    if (node->externally_visible_p ())
      node->externally_visible = true;
-  if (lookup_attribute ("no_reorder", decl))
+  if (lookup_attribute ("no_reorder", DECL_ATTRIBUTES (decl)))
      node->no_reorder = 1;
  }







Re: [PATCH 2/5] Existing call graph infrastructure enhancement

2014-09-24 Thread Martin Liška

On 06/13/2014 12:26 PM, mliska wrote:

Hi,
 this small patch prepares remaining needed infrastructure for the new pass.

Changelog:

2014-06-13  Martin Liska  mli...@suse.cz
Honza Hubicka  hubi...@ucw.cz

* ipa-utils.h (polymorphic_type_binfo_p): Function marked external
instead of static.
* ipa-devirt.c (polymorphic_type_binfo_p): Likewise.
* ipa-prop.h (count_formal_params): Likewise.
* ipa-prop.c (count_formal_params): Likewise.
* ipa-utils.c (ipa_merge_profiles): Be more tolerant if we merge
profiles for semantically equivalent functions.
* passes.c (do_per_function): If we load body of a function during WPA,
this condition should behave same.
* varpool.c (ctor_for_folding): More tolerant assert for variable
aliases created during WPA.

diff --git a/gcc/ipa-devirt.c b/gcc/ipa-devirt.c
index d733461..18592d7 100644
--- a/gcc/ipa-devirt.c
+++ b/gcc/ipa-devirt.c
@@ -176,7 +176,7 @@ struct GTY(()) odr_type_d
 inheritance (because vtables are shared).  Look up the BINFO of type
 and check presence of its vtable.  */

-static inline bool
+bool
  polymorphic_type_binfo_p (tree binfo)
  {
/* See if BINFO's type has an virtual table associtated with it.  */
diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index b67deed..60bda71 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -210,7 +210,7 @@ ipa_populate_param_decls (struct cgraph_node *node,

  /* Return how many formal parameters FNDECL has.  */

-static inline int
+int
  count_formal_params (tree fndecl)
  {
tree parm;
diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
index cb23698..87573ff 100644
--- a/gcc/ipa-prop.h
+++ b/gcc/ipa-prop.h
@@ -529,6 +529,7 @@ void ipa_free_all_edge_args (void);
  void ipa_free_all_structures_after_ipa_cp (void);
  void ipa_free_all_structures_after_iinln (void);
  void ipa_register_cgraph_hooks (void);
+int count_formal_params (tree fndecl);

  /* This function ensures the array of node param infos is big enough to
 accommodate a structure for all nodes and reallocates it if not.  */
diff --git a/gcc/ipa-utils.c b/gcc/ipa-utils.c
index 8e7c7cb..bc2b958 100644
--- a/gcc/ipa-utils.c
+++ b/gcc/ipa-utils.c
@@ -660,13 +660,8 @@ ipa_merge_profiles (struct cgraph_node *dst,
    if (dst->tp_first_run > src->tp_first_run && src->tp_first_run)
      dst->tp_first_run = src->tp_first_run;

-  if (src->profile_id)
-    {
-      if (!dst->profile_id)
-       dst->profile_id = src->profile_id;
-      else
-       gcc_assert (src->profile_id == dst->profile_id);
-    }
+  if (src->profile_id && !dst->profile_id)
+    dst->profile_id = src->profile_id;

    if (!dst->count)
  return;
diff --git a/gcc/ipa-utils.h b/gcc/ipa-utils.h
index a2c985a..996249a 100644
--- a/gcc/ipa-utils.h
+++ b/gcc/ipa-utils.h
@@ -72,6 +72,8 @@ struct odr_type_d;
  typedef odr_type_d *odr_type;
  void build_type_inheritance_graph (void);
  void update_type_inheritance_graph (void);
+bool polymorphic_type_binfo_p (tree binfo);
+
  vec <cgraph_node *>
  possible_polymorphic_call_targets (tree, HOST_WIDE_INT,
   ipa_polymorphic_call_context,
diff --git a/gcc/passes.c b/gcc/passes.c
index 4366251..9fdfe51 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -1506,7 +1506,7 @@ do_per_function (void (*callback) (function *, void 
*data), void *data)
  {
struct cgraph_node *node;
FOR_EACH_DEFINED_FUNCTION (node)
-   if (node->analyzed && gimple_has_body_p (node->decl)
+   if (node->analyzed && (gimple_has_body_p (node->decl) && !in_lto_p)
        && (!node->clone_of || node->decl != node->clone_of->decl))
      callback (DECL_STRUCT_FUNCTION (node->decl), data);
  }
diff --git a/gcc/varpool.c b/gcc/varpool.c
index ff67127..5cc558e 100644
--- a/gcc/varpool.c
+++ b/gcc/varpool.c
@@ -293,6 +293,7 @@ ctor_for_folding (tree decl)
if (decl != real_decl)
  {
gcc_assert (!DECL_INITIAL (decl)
+ || (node->alias && varpool_alias_target (node) == real_node)
  || DECL_INITIAL (decl) == error_mark_node);
    if (lookup_attribute ("weakref", DECL_ATTRIBUTES (decl)))
{



Hi.

The following patch enhances API functions so that they are ready for the
main patch of this patchset.

Ready for trunk?

Thank you,
Martin
gcc/ChangeLog:

2014-09-21  Martin Liška  mli...@suse.cz

* cgraph.c (cgraph_node::release_body): New argument keep_arguments
introduced.
* cgraph.h: Likewise.
* cgraphunit.c (cgraph_node::create_wrapper): Usage of new argument 
introduced.
* ipa-devirt.c (polymorphic_type_binfo_p): Safe check for binfos 
created by Java.
* tree-ssa-alias.c (ao_ref_base_alias_set): Static function transformed 
to global.
* tree-ssa-alias.h: Likewise.
diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 8f04284..d40a2922 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -1637,13 +1637,15 @@ release_function_body (tree decl)
are free'd in final.c via 

Re: [PATCH] microblaze: microblaze.md: Use 'SI' instead of 'VOID' for operand 1 of 'call_value_intern'

2014-09-24 Thread Chen Gang
Hello Michael:

Firstly, thank you very much for always providing your aid to me for
microblaze.

At present, after running the testsuite, the result is much better than my
original attempt.  Please help check the result: is it good enough for our
microblaze testsuite (can we say it passes)?

  Current result:

# of expected passes            65987
# of unexpected failures        82
# of unexpected successes       1
# of expected failures          97
# of unresolved testcases       16378
# of unsupported tests          1810

  Original result:

# of expected passes            48408
# of unexpected failures        17253
# of unexpected successes       1
# of expected failures          97
# of unresolved testcases       16570
# of unsupported tests          1854

After checking the current result log, I find many messages related to
remote target testing.  Do we have to handle them?

  e.g. Download to microblaze-xilinx-gdb failed, couldn't execute rcp: no
  such file or directory.



 And I guess, it is a glibc bug: which still add root directory (e.g.
 /upstream/release) in 'libc.so' when already has --with-sysroot for
 configure.

 
 Oh, sorry, glibc should also need --with-sysroot. I shall try it today,
 hope it will let all things OK.
 

After adding --with-sysroot for glibc, this issue still exists, so I
removed the redundant directory manually from libc.so and libpthread.so.

If our microblaze testsuite is OK, I will skip this issue (since I do not
have enough time for glibc at present).


Thanks.
-- 
Chen Gang

Open share and attitude like air water and life which God blessed


[jit] Update the various *_c_finalize functions

2014-09-24 Thread David Malcolm
Joseph - thanks for looking through the jit diff.

I plan to fix the issues you raise as a series of separate patches.

Here's the first:

On Tue, 2014-09-23 at 23:27 +, Joseph S. Myers wrote:

 Various *_finalize functions are missing comments explaining their 
 semantics.  Also the return type should be on the line before the function 
 name.

I've committed the following fix to branch dmalcolm/jit:

Five of the *_c_finalize functions were empty, since their files
contain no state [1][2].  Delete them.

Fix up the formatting of the remaining *_c_finalize functions, and
ensure they have descriptive leading comments.

[1] Most of these lost their state when the symbol_table class was
introduced, in r214422.

[2] predict.c has state in the form of these variables:

  static sreal real_zero, real_one, real_almost_one, real_br_prob_base,
   real_inv_br_prob_base, real_one_half, real_bb_freq_max;

and, within function estimate_bb_frequencies:

static int real_values_initialized = 0;

but it seems to me that this state doesn't need to be reset between
repeated in-process invocations.
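
As a minimal illustration of what a *_c_finalize function is for, here is a toy sketch. All names are hypothetical and nothing here is taken from GCC. A file with static state leaks that state into the next in-process run unless a finalize hook resets it. A file without such state needs no hook, which is why the empty ones could be deleted.

```c
/* Hypothetical file-level state, analogous to a static in a GCC
   source file.  */
static int toy_symbols_seen;

/* Pretend to compile a unit declaring N_SYMBOLS symbols and return
   the running total recorded in this file's state.  */
int
toy_compile (int n_symbols)
{
  toy_symbols_seen += n_symbols;
  return toy_symbols_seen;
}

/* Reset all state within this toy file so that toy_compile can be
   rerun within the same process.  */
void
toy_c_finalize (void)
{
  toy_symbols_seen = 0;
}
```

A second toy_compile call without the reset would see the first run's total, which is the kind of stale state the finalize functions exist to clear.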

gcc/ChangeLog.jit:
* cgraph.h (cgraphbuild_c_finalize): Delete prototype of empty
function.
(ipa_c_finalize): Likewise.
(predict_c_finalize): Likewise.
(symtab_c_finalize): Likewise.
(varpool_c_finalize): Likewise.

* cgraph.c (cgraph_c_finalize): Add leading comment.  Put return
type on line before function name.
* cgraphunit.c (cgraphunit_c_finalize): Likewise.
* dwarf2out.c (dwarf2out_c_finalize): Likewise.
* gcse.c (gcse_c_finalize): Likewise.
* ipa-cp.c (ipa_cp_c_finalize): Likewise.
* ipa-reference.c (ipa_reference_c_finalize): Likewise.

* params.c (params_c_finalize): Update leading comment to match
format of the others mentioned above.

* cgraphbuild.c (cgraphbuild_c_finalize): Delete empty function.
* ipa.c (ipa_c_finalize): Likewise.
* predict.c (predict_c_finalize): Likewise.
* symtab.c (symtab_c_finalize): Likewise.
* varpool.c (varpool_c_finalize): Likewise.

* toplev.c (toplev::finalize): Remove calls to empty functions
cgraphbuild_c_finalize, ipa_c_finalize, predict_c_finalize,
symtab_c_finalize, varpool_c_finalize.
---
 gcc/cgraph.c| 6 +-
 gcc/cgraph.h| 9 -
 gcc/cgraphbuild.c   | 4 
 gcc/cgraphunit.c| 6 +-
 gcc/dwarf2out.c | 6 +-
 gcc/gcse.c  | 6 +-
 gcc/ipa-cp.c| 3 +++
 gcc/ipa-reference.c | 6 +-
 gcc/ipa.c   | 4 
 gcc/params.c| 3 ++-
 gcc/predict.c   | 4 
 gcc/symtab.c| 4 
 gcc/toplev.c| 5 -
 gcc/varpool.c   | 4 
 14 files changed, 30 insertions(+), 40 deletions(-)

diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 736dd73..1721634 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -3078,7 +3078,11 @@ gimple_check_call_matching_types (gimple call_stmt, tree 
callee,
   return true;
 }
 
-void cgraph_c_finalize (void)
+/* Reset all state within cgraph.c so that we can rerun the compiler
+   within the same process.  For use by toplev::finalize.  */
+
+void
+cgraph_c_finalize (void)
 {
   symtab = NULL;
 
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index c407a3b..fd45e01 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -1958,25 +1958,16 @@ void tree_function_versioning (tree, tree, 
vec<ipa_replace_map *, va_gc> *,
 /* In cgraphbuild.c  */
 int compute_call_stmt_bb_frequency (tree, basic_block bb);
 void record_references_in_initializer (tree, bool);
-void cgraphbuild_c_finalize (void);
 
 /* In ipa.c  */
 void cgraph_build_static_cdtor (char which, tree body, int priority);
 void ipa_discover_readonly_nonaddressable_vars (void);
-void ipa_c_finalize (void);
 
 /* In ipa-cp.c  */
 void ipa_cp_c_finalize (void);
 
-/* In predict.c  */
-void predict_c_finalize (void);
-
-/* In symtab.c  */
-void symtab_c_finalize (void);
-
 /* In varpool.c  */
 tree ctor_for_folding (tree);
-void varpool_c_finalize (void);
 
 /* Return true when the symbol is real symbol, i.e. it is not inline clone
or abstract function kept for debug info purposes only.  */
diff --git a/gcc/cgraphbuild.c b/gcc/cgraphbuild.c
index 5610064..96d7015 100644
--- a/gcc/cgraphbuild.c
+++ b/gcc/cgraphbuild.c
@@ -576,7 +576,3 @@ make_pass_remove_cgraph_callee_edges (gcc::context *ctxt)
 {
   return new pass_remove_cgraph_callee_edges (ctxt);
 }
-
-void cgraphbuild_c_finalize (void)
-{
-}
diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index 1f52d35..9a3834a 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -2288,7 +2288,11 @@ symbol_table::finalize_compilation_unit (void)
   timevar_pop (TV_CGRAPH);
 }
 
-void cgraphunit_c_finalize (void)
+/* Reset all state within cgraphunit.c so that we can rerun the compiler
+   within the same process.  For use by toplev::finalize.  */
+
+void
+cgraphunit_c_finalize (void)
 {
   

Re: [PATCH] Do not remove labels with LABEL_PRESERVE_P

2014-09-24 Thread Ilya Enkovich
2014-09-24 17:50 GMT+04:00 Steven Bosscher stevenb@gmail.com:
 On Wed, Sep 24, 2014 at 2:51 PM, Ilya Enkovich wrote:
 2014-09-24 16:47 GMT+04:00 Steven Bosscher :
 It is not a control flow instruction. It copies value of instruction
 pointer into a general purpose register.  Therefore REG_LABEL_OPERAND
 seems to be correct.

 OK - sorry for being a bit slow on the up-take, I got confused by the
 asm syntax :-)

 So I'm going to speculate a bit more... What you want to have is:

 foo:
 insns...
 L2:
 leaq L2(%rip), rXX


 What happens is that L2 is deleted, which is to say converted to a
 NOTE_INSN_DELETED_LABEL. Then the notes are re-ordered
 (NOTE_INSN_DELETED_LABEL notes are not tied to anything in the insns
 stream and can end up anywhere) so you end up with something like,

 foo:
 L2: # (was deleted)
 insns...
 leaq L2(%rip),rXX

 I bet you'd find that in the failing test case the label is output to
 the assembly file but it's simply in the wrong place.  For the large
 code model, we get away with it because the prologue is output late
 and the order of the insns is not adjusted (a few passes later, the
 CFG doesn't even exist anymore so you don't go through cfgcleanup).
 But if you emit the label early and let it go through the entire RTL
 pipeline then anything can happen.

Actually label removal causes ICE later in CSE, so there is no output
to examine.

Now that I have restored a patch that lets me reproduce the problem, I can
finally answer your initial questions :)

This should not be necessary, you're probably papering over another 
problem. Did the label use count drop to zero? Is there a reg note for the 
label operand?

Here is a dump of the basic block I got in the debugger right before the label is removed:

(code_label/s 2 1 7 2 2  [3 uses])
(note 7 2 3 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(insn 3 7 4 2 (set (reg:DI 85)
        (unspec:DI [
                (label_ref 2)
            ] UNSPEC_SET_RIP)) /gnumnt/msticlxl7_users/ienkovic/issues/4161/gcc/gcc/testsuite/gcc.target/i386/pr55154.c:9 -1
     (insn_list:REG_LABEL_OPERAND 2 (nil)))

So we have a nonzero use count and an appropriate reg note, but the label
is still removed.

BTW, in my patch I should check LABEL_NUSES instead of
LABEL_PRESERVE_P.  I assumed it was possible to have LABEL_PRESERVE_P
and zero uses, but now I see that init_label_info, called by
rebuild_jump_labels, sets uses to 1 for all LABEL_PRESERVE_P labels.
The condition I modified doesn't care about the label's use count at all.
It just checks that the BB's only predecessor doesn't jump to it.  Thus all
label uses by non-jump instructions are ignored.

So I propose a new patch (not tested):

diff --git a/gcc/cfgcleanup.c b/gcc/cfgcleanup.c
index 9325ea0..fe2a444 100644
--- a/gcc/cfgcleanup.c
+++ b/gcc/cfgcleanup.c
@@ -2701,17 +2701,7 @@ try_optimize_cfg (int mode)
   && (single_pred_edge (b)->flags & EDGE_FALLTHRU)
   && !(single_pred_edge (b)->flags & EDGE_COMPLEX)
   && LABEL_P (BB_HEAD (b))
-  && !LABEL_PRESERVE_P (BB_HEAD (b))
-  /* If the previous block ends with a branch to this
-     block, we can't delete the label.  Normally this
-     is a condjump that is yet to be simplified, but
-     if CASE_DROPS_THRU, this can be a tablejump with
-     some element going to the same place as the
-     default (fallthru).  */
-  && (single_pred (b) == ENTRY_BLOCK_PTR_FOR_FN (cfun)
-      || !JUMP_P (BB_END (single_pred (b)))
-      || ! label_is_jump_target_p (BB_HEAD (b),
-                                   BB_END (single_pred (b)))))
+  && !LABEL_NUSES (BB_HEAD (b)))
{
  delete_insn (BB_HEAD (b));
  if (dump_file)


 If the above makes sense, then you'll want to emit the label late, or
 not at all, to the insns stream.

 If you emit the label late into the insns stream, you'd rewrite the
 set_rip as a define_insn_and_split that emits the label as part of the
 last splitting pass. But there is no splitting pass late enough to
 guarantee that the label and insns won't get separated.

 If you don't emit the label to the insns stream, you would write
 ix86_output_set_rip() and call that from the define_insns for set_rip.
 You'd not emit the label in the expander. You'd create it and make it
 an operand, but not emit it. Your ix86_output_set_rip() would write
 the label and the set_rip instruction. This is probably the only way
 to make 100% sure that the label is always exactly at the set_rip
 instruction.

 Something like below (completely untested, etc...).

 Hope this helps,

Your point about the misplaced label is quite reasonable.  I didn't see
such problems but agree that they might happen.  Thank you for the proposed
patch!  I think we should try the changes you propose so that we can safely
insert set_rip instructions whenever we want.


Re: [patch] Implement move semantics for iostreams

2014-09-24 Thread Rainer Orth
Hi Jonathan,

 On 23/09/14 15:58 +0200, Rainer Orth wrote:
This patch broke Solaris bootstrap with Sun ld: when linking
libstdc++.so, ld complains

ld: fatal: libstdc++-symbols.ver-sun: 4520: symbol 'std::basic_ios<char,
 std::char_traits<char> >::move(std::basic_ios<char,
 std::char_traits<char> >&&)': symbol version conflict

and many more.  In that case, I find that this symbol is matched by
both the GLIBCXX_3.4 and GLIBCXX_3.4.21 patterns:

GLIBCXX_3.4
##std::basic_i[g-r]* (cxx)
_ZNSt9basic_iosIcSt11char_traitsIcEE4moveEOS2_;

GLIBCXX_3.4.21
##_ZNSt9basic_iosI[cw]St11char_traitsI[cw]EE4moveE[OR]S2_ (glob)
_ZNSt9basic_iosIcSt11char_traitsIcEE4moveEOS2_;

 Rainer, I think this patch should fix it, could you test it please?

almost there: now I only get

ld: fatal: libstdc++-symbols.ver-sun: 4622: symbol 'std::basic_ostream<wchar_t, 
std::char_traits<wchar_t> >::basic_ostream(std::basic_iostream<wchar_t, 
std::char_traits<wchar_t> >&)': symbol version conflict
ld: fatal: libstdc++-symbols.ver-sun: 4623: symbol 'std::basic_ostream<wchar_t, 
std::char_traits<wchar_t> >::basic_ostream(std::basic_iostream<wchar_t, 
std::char_traits<wchar_t> >&)': symbol version conflict

from

  GLIBCXX_3.4:

##_ZNSt13basic_ostreamIwSt11char_traitsIwEEC[12]E[RP]* (glob)
_ZNSt13basic_ostreamIwSt11char_traitsIwEEC1ERSt14basic_iostreamIwS1_E;
_ZNSt13basic_ostreamIwSt11char_traitsIwEEC2ERSt14basic_iostreamIwS1_E;

  GLIBCXX_3.4.21:

##_ZNSt13basic_ostreamIwSt11char_traitsIwEEC[12]ERSt14basic_iostreamIwS1_E (glob)
_ZNSt13basic_ostreamIwSt11char_traitsIwEEC1ERSt14basic_iostreamIwS1_E;
_ZNSt13basic_ostreamIwSt11char_traitsIwEEC2ERSt14basic_iostreamIwS1_E;

The glob in the 3.4 version also matches

_ZNSt13basic_ostreamIwSt11char_traitsIwEEC1EPSt15basic_streambufIwS1_E;
_ZNSt13basic_ostreamIwSt11char_traitsIwEEC2EPSt15basic_streambufIwS1_E;

 (I tried installing Solaris in a VM but couldn't get it to work, maybe
 I should use the VirtualBox image instead of trying qemu/kvm.)

VirtualBox works for me in principle, but I often found bootstrapping
gcc inside some VM almost intolerably slow...  There's been some talk on
getting Solaris up and running in the compile farm.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


PING: Re: [patch] rename DECL_ABSTRACT to DECL_ABSTRACT_P

2014-09-24 Thread Aldy Hernandez

On 09/18/14 12:39, Aldy Hernandez wrote:



Yeah, sure, either way it's a good cleanup ;).

No strong opinions.  Though I think true/false are the way we want folks
to write new code.  Given that's the long term direction, might as well
fix that nit for DECL_ABSTRACT_P.


Alright... fixed.

OK?


Ping.


Re: [PATCH, i386, Pointer Bounds Checker 32/x] Pointer Bounds Checker hooks for i386 target

2014-09-24 Thread Ilya Enkovich
2014-09-23 21:17 GMT+04:00 Jeff Law l...@redhat.com:
 On 09/23/14 08:10, Ilya Enkovich wrote:

 Please use fold_convert (size_ptr, build_fold_addr_expr (var)).

 Is 'var' always accessed via a size_t effective type?  Watch out
 for TBAA issues if not.  (if it is, why is 'var' not of type size_t
 or size_t[]?)


 var has pointer bounds type.  I have to initialize it by parts and
 thus access it as a couple of integers having size of a pointer (I
 use integer instead of pointer because non-pointer arithmetic is
 used).  Size type is not the best for this purpose and therefore I
 replace it with pointer_sized_int_node.

 So I have accesses of var's parts as integers and accesses of whole
 var as bounds.  Should I expect some problems from TBAA here?  How
 can I avoid problems with TBAA if any exists?

 In general, anytime you access a hunk of memory using two different types,
 then you run the risk of problems with TBAA.  In the case of bounds, we
 aren't exposing them to usercode, so you just have to worry about the
 refs/sets that you create.

 I think you could create an alias set for the bounds and attach it to every
 load/store if you aren't type safe for all the loads/stores.  That will
 create a dependency between all the bounds loads/stores, but not with
 unrelated loads/stores.   Alternately ensure all the loads/stores are in
 alias set 0, but that will likely have performance implications.

I access parts of bounds using pointer_sized_int_node only in
constructors which initialize static bound variables.  These
constructors do not have other usages of these vars and all other
usages of these vars in other functions use bounds type for access.
That should make me safe from a TBAA point of view.

Ilya


 Jeff


Re: [PATCH] PR63300 'const volatile' sometimes stripped in debug info.

2014-09-24 Thread Jason Merrill

On 09/23/2014 06:53 PM, Mark Wielaard wrote:

And for the default case (gcc doesn't
create type sections by default) the optimization is useful.


I'm skeptical.  These DIEs are very small, and I wouldn't expect a hole 
in the qualifier space like this to come up that often.


Jason



Re: [patch] Implement move semantics for iostreams

2014-09-24 Thread Jonathan Wakely

On 24/09/14 16:38 +0200, Rainer Orth wrote:

Hi Jonathan,


On 23/09/14 15:58 +0200, Rainer Orth wrote:

This patch broke Solaris bootstrap with Sun ld: when linking
libstdc++.so, ld complains

ld: fatal: libstdc++-symbols.ver-sun: 4520: symbol 'std::basic_ios<char,
std::char_traits<char> >::move(std::basic_ios<char,
std::char_traits<char> >&&)': symbol version conflict

and many more.  In that case, I find that this symbol is matched by
both the GLIBCXX_3.4 and GLIBCXX_3.4.21 patterns:

   GLIBCXX_3.4
   ##std::basic_i[g-r]* (cxx)
   _ZNSt9basic_iosIcSt11char_traitsIcEE4moveEOS2_;

   GLIBCXX_3.4.21
   ##_ZNSt9basic_iosI[cw]St11char_traitsI[cw]EE4moveE[OR]S2_ (glob)
   _ZNSt9basic_iosIcSt11char_traitsIcEE4moveEOS2_;


Rainer, I think this patch should fix it, could you test it please?


almost there: now I only get

ld: fatal: libstdc++-symbols.ver-sun: 4622: symbol 'std::basic_ostream<wchar_t, 
std::char_traits<wchar_t> >::basic_ostream(std::basic_iostream<wchar_t, 
std::char_traits<wchar_t> >&)': symbol version conflict
ld: fatal: libstdc++-symbols.ver-sun: 4623: symbol 'std::basic_ostream<wchar_t, 
std::char_traits<wchar_t> >::basic_ostream(std::basic_iostream<wchar_t, 
std::char_traits<wchar_t> >&)': symbol version conflict

from

 GLIBCXX_3.4:

   ##_ZNSt13basic_ostreamIwSt11char_traitsIwEEC[12]E[RP]* (glob)
   _ZNSt13basic_ostreamIwSt11char_traitsIwEEC1ERSt14basic_iostreamIwS1_E;
   _ZNSt13basic_ostreamIwSt11char_traitsIwEEC2ERSt14basic_iostreamIwS1_E;

 GLIBCXX_3.4.21:

   ##_ZNSt13basic_ostreamIwSt11char_traitsIwEEC[12]ERSt14basic_iostreamIwS1_E (glob)
   _ZNSt13basic_ostreamIwSt11char_traitsIwEEC1ERSt14basic_iostreamIwS1_E;
   _ZNSt13basic_ostreamIwSt11char_traitsIwEEC2ERSt14basic_iostreamIwS1_E;


Doh, yes, this additional tweak should solve that:

index f736240..95fc3c7 100644
--- a/libstdc++-v3/config/abi/pre/gnu.ver
+++ b/libstdc++-v3/config/abi/pre/gnu.ver
@@ -460,7 +460,7 @@ GLIBCXX_3.4 {

# std::basic_ostream<wchar_t>
_ZNSt13basic_ostreamIwSt11char_traitsIwEEC[12]Ev;
-_ZNSt13basic_ostreamIwSt11char_traitsIwEEC[12]E[RP]*;
+_ZNSt13basic_ostreamIwSt11char_traitsIwEEC[12]EP*;
_ZNSt13basic_ostreamIwSt11char_traitsIwEED*;
_ZNKSt13basic_ostreamIwSt11char_traitsIwEE[0-9][a-z]*;
_ZNSt13basic_ostreamIwSt11char_traitsIwEE3putEw;


The glob in the 3.4 version also matches

   _ZNSt13basic_ostreamIwSt11char_traitsIwEEC1EPSt15basic_streambufIwS1_E;
   _ZNSt13basic_ostreamIwSt11char_traitsIwEEC2EPSt15basic_streambufIwS1_E;


Yes, that's all it needs to match, so changing [RP] to just P should work.
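
The overlap described above can be checked with a small sketch. It uses Python's fnmatch as a stand-in for the linker's glob matching, which is an assumption (the two are similar for `*` and bracket classes but not guaranteed identical). The constant and function names are hypothetical; the patterns and symbols are the ones quoted in this thread.

```python
from fnmatch import fnmatchcase

# Patterns quoted in the thread.
OLD_3_4_GLOB = "_ZNSt13basic_ostreamIwSt11char_traitsIwEEC[12]E[RP]*"
NEW_3_4_GLOB = "_ZNSt13basic_ostreamIwSt11char_traitsIwEEC[12]EP*"
GLOB_3_4_21 = "_ZNSt13basic_ostreamIwSt11char_traitsIwEEC[12]ERSt14basic_iostreamIwS1_E"

# Symbols quoted in the thread: two iostream& constructors, two streambuf* ones.
SYMBOLS = [
    "_ZNSt13basic_ostreamIwSt11char_traitsIwEEC1ERSt14basic_iostreamIwS1_E",
    "_ZNSt13basic_ostreamIwSt11char_traitsIwEEC2ERSt14basic_iostreamIwS1_E",
    "_ZNSt13basic_ostreamIwSt11char_traitsIwEEC1EPSt15basic_streambufIwS1_E",
    "_ZNSt13basic_ostreamIwSt11char_traitsIwEEC2EPSt15basic_streambufIwS1_E",
]


def conflicts(glob_3_4):
    """Return the symbols claimed by both the 3.4 glob and the 3.4.21 glob."""
    return [s for s in SYMBOLS
            if fnmatchcase(s, glob_3_4) and fnmatchcase(s, GLOB_3_4_21)]


print(conflicts(OLD_3_4_GLOB))  # the two iostream& constructors
print(conflicts(NEW_3_4_GLOB))  # empty list
```

With `[RP]` narrowed to `P`, the 3.4 glob still covers the two streambuf* constructors and no longer overlaps the 3.4.21 entry.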


(I tried installing Solaris in a VM but couldn't get it to work, maybe
I should use the VirtualBox image instead of trying qemu/kvm.)


VirtualBox works for me in principle, but I often found bootstrapping
gcc inside some VM almost intolerably slow...  There's been some talk on
getting Solaris up and running in the compile farm.


That would be very useful.

Thanks for the quick testing and analysis.


Re: parallel check output changes?

2014-09-24 Thread Andrew MacLeod

On 09/23/2014 11:33 AM, Richard Sandiford wrote:

Segher Boessenkool seg...@kernel.crashing.org writes:

On Thu, Sep 18, 2014 at 01:44:55PM -0500, Segher Boessenkool wrote:

I am testing a patch that is just


diff --git a/contrib/dg-extract-results.py b/contrib/dg-extract-results.py
index cccbfd3..3781423 100644
--- a/contrib/dg-extract-results.py
+++ b/contrib/dg-extract-results.py
@@ -117,7 +117,7 @@ class Prog:
  self.tool_re = re.compile (r'^\t\t=== (.*) tests ===$')
  self.result_re = re.compile (r'^(PASS|XPASS|FAIL|XFAIL|UNRESOLVED'
   r'|WARNING|ERROR|UNSUPPORTED|UNTESTED'
- r'|KFAIL):\s*(\S+)')
+ r'|KFAIL):\s*(.+)')
  self.completed_re = re.compile (r'.* completed at (.*)')
  # Pieces of text to write at the head of the output.
  # start_line is a pair in which the first element is a datetime

Tested that with four runs on powerpc64-linux, four configs each time;
test-summary shows the same in all cases.  Many lines have moved compared
to without the patch, but that cannot be helped.  Okay for mainline?


2014-09-19  Segher Boessenkool  seg...@kernel.crashing.org

contrib/
* dg-extract-results.py (Prog.result_re): Include options in test name.

FWIW, the \S+ thing was deliberate.  When one test is run multiple times
with different options, those options aren't necessarily tried in
alphabetical order.  The old sh/awk script therefore used just the test
name as the key and kept tests with the same name in the order that
they were encountered:

/^(PASS|XPASS|FAIL|XFAIL|UNRESOLVED|WARNING|ERROR|UNSUPPORTED|UNTESTED|KFAIL):/ {
   testname=\$2
   # Ugly hack for gfortran.dg/dg.exp
   if ("$TOOL" == "gfortran" && testname ~ /^gfortran.dg\/g77\//)
     testname="h"testname
}

(note the $2).  This means that the output of the script is in the same
order as it would be for non-parallel runs.  I was following (or trying
to follow) that behaviour in the python script.

Your patch instead sorts based on the full test name, including options,
which means that the output no longer matches what you'd get from a
non-parallel run.  AFAICT, it also no longer matches what you'd get from
the .sh version.  That might be OK, just thought I'd mention it.
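
To make the difference concrete, here is a minimal sketch of what the two regex variants capture from a result line. The prefix is copied from the patch; the names `name_only_re`, `full_line_re` and `captured_key` are hypothetical and not part of dg-extract-results.py.

```python
import re

PREFIX = (r'^(PASS|XPASS|FAIL|XFAIL|UNRESOLVED'
          r'|WARNING|ERROR|UNSUPPORTED|UNTESTED'
          r'|KFAIL):\s*')

name_only_re = re.compile(PREFIX + r'(\S+)')  # before the patch
full_line_re = re.compile(PREFIX + r'(.+)')   # after the patch


def captured_key(regex, line):
    """Return the second group, i.e. the text the script would use as the test name."""
    return regex.match(line).group(2)


line = 'PASS: gcc.dg/tls/diag-2.c  (test for errors, line 4)'
print(captured_key(name_only_re, line))  # gcc.dg/tls/diag-2.c
print(captured_key(full_line_re, line))  # gcc.dg/tls/diag-2.c  (test for errors, line 4)
```

With `\S+` every result for one source file shares a single key, so a stable sort leaves them in the order encountered; with `.+` each result line gets its own key and is ordered by its full text.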

Thanks,
Richard

Is this supposed to be resolved now?  I'm still seeing some issues with a 
branch cut from mainline yesterday.  This is from the following 
sequence:


check out revision 215511, build, make -j16 check, make -j16 check, 
then compare all the .sum files:


PASS: gcc.dg/tls/asm-1.c  (test for errors, line 7)
PASS: gcc.dg/tls/asm-1.c (test for excess errors)
PASS: gcc.dg/tls/debug-1.c (test for excess errors)
PASS: gcc.dg/tls/diag-1.c (test for excess errors)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 4)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 5)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 6)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 7)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 11)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 12)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 13)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 14)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 17)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 18)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 19)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 20)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 22)

and then
PASS: gcc.dg/tls/asm-1.c  (test for errors, line 7)
PASS: gcc.dg/tls/asm-1.c (test for excess errors)
PASS: gcc.dg/tls/debug-1.c (test for excess errors)
PASS: gcc.dg/tls/diag-1.c (test for excess errors)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 11)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 12)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 13)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 14)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 17)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 18)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 19)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 20)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 22)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 4)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 5)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 6)
PASS: gcc.dg/tls/diag-2.c  (test for errors, line 7)

It looks like the first run sorted by line number numerically (or just 
happened to preserve the run order), while the second run sorted 
alphabetically...
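
A small sketch reproducing the second ordering: a plain string sort of these result lines puts "line 11" ahead of "line 4", whereas sorting on the integer line number restores the first ordering. The helper `line_number` is hypothetical and only for illustration.

```python
import re

LINE_NUMBERS = (4, 5, 6, 7, 11, 12, 13, 14, 17, 18, 19, 20, 22)
results = ["PASS: gcc.dg/tls/diag-2.c  (test for errors, line %d)" % n
           for n in LINE_NUMBERS]


def line_number(result):
    """Extract the trailing 'line N' value as an integer."""
    return int(re.search(r'line (\d+)\)$', result).group(1))


alphabetical = sorted(results)               # compares character by character
numerical = sorted(results, key=line_number)  # compares the parsed integers

print([line_number(r) for r in alphabetical])
print([line_number(r) for r in numerical])
```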


Andrew



Re: [PATCH 2/5] Existing call graph infrastructure enhancement

2014-09-24 Thread Jan Hubicka
 Hi.
 
 Following patch enhances API functions to be ready for main patch of this 
 patchset.
 
 Ready for trunk?
 
 Thank you,
 Martin

 gcc/ChangeLog:
 
 2014-09-21  Martin Liška  mli...@suse.cz
 
   * cgraph.c (cgraph_node::release_body): New argument keep_arguments
   introduced.
   * cgraph.h: Likewise.
   * cgraphunit.c (cgraph_node::create_wrapper): Usage of new argument 
 introduced.
   * ipa-devirt.c (polymorphic_type_binfo_p): Safe check for binfos 
 created by Java.
   * tree-ssa-alias.c (ao_ref_base_alias_set): Static function transformed 
 to global.
   * tree-ssa-alias.h: Likewise.

 diff --git a/gcc/cgraph.c b/gcc/cgraph.c
 index 8f04284..d40a2922 100644
 --- a/gcc/cgraph.c
 +++ b/gcc/cgraph.c
 @@ -1637,13 +1637,15 @@ release_function_body (tree decl)
 are free'd in final.c via free_after_compilation().  */
  
  void
 -cgraph_node::release_body (void)
 +cgraph_node::release_body (bool keep_arguments)
  {
ipa_transforms_to_apply.release ();
if (!used_as_abstract_origin && symtab->state != PARSING)
  {
DECL_RESULT (decl) = NULL;
 -  DECL_ARGUMENTS (decl) = NULL;
 +
 +  if (!keep_arguments)
 + DECL_ARGUMENTS (decl) = NULL;
  }
/* If the node is abstract and needed, then do not clear DECL_INITIAL
   of its associated function function declaration because it's
 diff --git a/gcc/cgraph.h b/gcc/cgraph.h
 index a316e40..19ce3b8 100644
 --- a/gcc/cgraph.h
 +++ b/gcc/cgraph.h
 @@ -915,7 +915,7 @@ public:
   Use this only for functions that are released before being translated to
   target code (i.e. RTL).  Functions that are compiled to RTL and beyond
   are free'd in final.c via free_after_compilation().  */
 -  void release_body (void);
 +  void release_body (bool keep_arguments = false);

Please add documentation for KEEP_ARGUMENTS explaining that it is useful only
if you want to rebuild the body as a thunk.
  
/* cgraph_node is no longer nested function; update cgraph accordingly.  */
void unnest (void);
 diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
 index 3e3b8d2..c4597e2 100644
 --- a/gcc/cgraphunit.c
 +++ b/gcc/cgraphunit.c
 @@ -2300,7 +2300,7 @@ cgraph_node::create_wrapper (cgraph_node *target)
  tree decl_result = DECL_RESULT (decl);
  
  /* Remove the function's body.  */
I would say "Remove the function's body but keep arguments to be reused for
thunk."
 -release_body ();
 +release_body (true);
  reset ();
  
  DECL_RESULT (decl) = decl_result;
 diff --git a/gcc/ipa-devirt.c b/gcc/ipa-devirt.c
 index af42c6d..f374933 100644
 --- a/gcc/ipa-devirt.c
 +++ b/gcc/ipa-devirt.c
 @@ -225,7 +225,7 @@ static inline bool
  polymorphic_type_binfo_p (tree binfo)
  {
/* See if BINFO's type has an virtual table associtated with it.  */
 -  return BINFO_VTABLE (TYPE_BINFO (BINFO_TYPE (binfo)));
 +  return BINFO_TYPE (binfo) && BINFO_VTABLE (TYPE_BINFO (BINFO_TYPE (binfo)));

Aha, this change was for Java, right? Please add a comment that Java produces
BINFOs without BINFO_TYPE set.
  }
  
  /* Return TRUE if all derived types of T are known and thus
 diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
 index 442112a..1bf88e2 100644
 --- a/gcc/tree-ssa-alias.c
 +++ b/gcc/tree-ssa-alias.c
 @@ -559,7 +559,7 @@ ao_ref_base (ao_ref *ref)
  
  /* Returns the base object alias set of the memory reference *REF.  */
  
 -static alias_set_type
 +alias_set_type
  ao_ref_base_alias_set (ao_ref *ref)
  {
tree base_ref;
 diff --git a/gcc/tree-ssa-alias.h b/gcc/tree-ssa-alias.h
 index 436381a..0d35283 100644
 --- a/gcc/tree-ssa-alias.h
 +++ b/gcc/tree-ssa-alias.h
 @@ -98,6 +98,7 @@ extern void ao_ref_init (ao_ref *, tree);
  extern void ao_ref_init_from_ptr_and_size (ao_ref *, tree, tree);
  extern tree ao_ref_base (ao_ref *);
  extern alias_set_type ao_ref_alias_set (ao_ref *);
 +extern alias_set_type ao_ref_base_alias_set (ao_ref *);

I cannot approve this change, but I suppose it is what Richard suggested?

Patch is OK except for the tree-ssa-alias bits.
Honza
  extern bool ptr_deref_may_alias_global_p (tree);
  extern bool ptr_derefs_may_alias_p (tree, tree);
  extern bool ref_may_alias_global_p (tree);



Re: [PATCH 2/14][Vectorizer] Make REDUC_xxx_EXPR tree codes produce a scalar result

2014-09-24 Thread Alan Lawrence
So it looks like patches 1-6 (reduc_foo) are relatively close to final, and 
given these fix PR/61114, I'm gonna try to land these while working on a respin 
of the second half (vec_shr)...(summary: yes I like the vec_perm idea too, but 
the devil is in the detail!)


However my CompileFarm account is still pending, so to that end, if you were 
able to test patch 2/14 (attached inc. Richie's s/VIEW_CONVERT_EXPR/NOP_EXPR/) 
on the CompileFarm PowerPC machine, that'd be great, many thanks indeed. It 
should apply on its own without patch 1. I'll aim to get an alternative patch 3 
back to the list shortly, and follow up with .md updates to the various backends.


Cheers, Alan


Richard Biener wrote:

On Thu, Sep 18, 2014 at 1:50 PM, Alan Lawrence alan.lawre...@arm.com wrote:

This fixes PR/61114 by redefining the REDUC_{MIN,MAX,PLUS}_EXPR tree codes.

These are presently documented as producing a vector with the result in
element 0, and this is inconsistent with their use in tree-vect-loop.c
(which on bigendian targets pulls the bits out of the wrong end of the
vector result). This leads to bugs on bigendian targets - see also
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61114.

I discounted fixing the vectorizer (to read from element 0) and then
making bigendian targets (whose architectural insn produces the result in
lane N-1) permute the result vector, as optimization of vectors in RTL seems
unlikely to remove such a permute and would lead to a performance
regression.

Instead it seems more natural for the tree code to produce a scalar result
(producing a vector with the result in lane 0 has already caused confusion,
e.g. https://gcc.gnu.org/ml/gcc-patches/2012-10/msg01100.html).

However, this patch preserves the meaning of the optab (producing a result
in lane 0 on little-endian architectures or N-1 on bigendian), thus
generally avoiding the need to change backends. Thus, expr.c extracts an
endianness-dependent element from the optab result to give the result
expected for the tree code.
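
As a rough illustration of the lane selection just described, the sketch below computes which element holds the scalar result and its bit offset within the vector. It is written in Python purely for illustration; the function name and parameters are hypothetical, standing in for BYTES_BIG_ENDIAN, GET_MODE_NUNITS and the element bit size used in the patch to expr.c.

```python
def reduction_result_lane(nunits, elem_bits, big_endian):
    """Return (lane index, bit offset) of the scalar result in the optab's vector.

    Little-endian targets leave the result in lane 0; big-endian targets
    leave it in lane nunits - 1.
    """
    index = nunits - 1 if big_endian else 0
    return index, elem_bits * index


# A vector of four 32-bit elements:
print(reduction_result_lane(4, 32, big_endian=False))  # (0, 0)
print(reduction_result_lane(4, 32, big_endian=True))   # (3, 96)
```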

Previously posted as an RFC
https://gcc.gnu.org/ml/gcc-patches/2014-08/msg00041.html , now with an extra
VIEW_CONVERT_EXPR if the types of the reduction/result do not match.


Huh.  Does that ever happen?  Please use a NOP_EXPR instead of
a VIEW_CONVERT_EXPR.

Ok with that change.

Thanks,
Richard.


Testing:
x86_64-none-linux-gnu: bootstrap, check-gcc, check-g++
aarch64-none-linux-gnu: bootstrap
aarch64-none-elf:  check-gcc, check-g++
arm-none-eabi: check-gcc

aarch64_be-none-elf: check-gcc, showing
FAIL->PASS: gcc.dg/vect/no-scevccp-outer-7.c execution test
FAIL->PASS: gcc.dg/vect/no-scevccp-outer-13.c execution test
Passes the (previously-failing) reduced testcase on
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61114

Have also assembler/stage-1 tested that testcase on PowerPC, also
fixed.



gcc/ChangeLog:

* expr.c (expand_expr_real_2): For REDUC_{MIN,MAX,PLUS}_EXPR, add
extract_bit_field around optab result.

* fold-const.c (fold_unary_loc): For REDUC_{MIN,MAX,PLUS}_EXPR,
produce scalar not vector.

* tree-cfg.c (verify_gimple_assign_unary): Check result vs operand
type for REDUC_{MIN,MAX,PLUS}_EXPR.

* tree-vect-loop.c (vect_analyze_loop): Update comment.
(vect_create_epilog_for_reduction): For direct vector reduction, use
result of tree code directly without extract_bit_field.

* tree.def (REDUC_MAX_EXPR, REDUC_MIN_EXPR, REDUC_PLUS_EXPR): Update
comment.


commit a7b173d5efc6f08589b04fffeec9b3942b6282a0
Author: Alan Lawrence alan.lawre...@arm.com
Date:   Tue Jul 29 11:46:01 2014 +0100

Make tree codes produce scalar, with NOP_EXPRs. (tree-vect-loop.c mess)

diff --git a/gcc/expr.c b/gcc/expr.c
index a6233f3..c792028 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9044,7 +9044,17 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
   {
 op0 = expand_normal (treeop0);
 this_optab = optab_for_tree_code (code, type, optab_default);
-temp = expand_unop (mode, this_optab, op0, target, unsignedp);
+enum machine_mode vec_mode = TYPE_MODE (TREE_TYPE (treeop0));
+temp = expand_unop (vec_mode, this_optab, op0, NULL_RTX, unsignedp);
+gcc_assert (temp);
+/* The tree code produces a scalar result, but (somewhat by convention)
+   the optab produces a vector with the result in element 0 if
+   little-endian, or element N-1 if big-endian.  So pull the scalar
+   result out of that element.  */
+int index = BYTES_BIG_ENDIAN ? GET_MODE_NUNITS (vec_mode) - 1 : 0;
+int bitsize = GET_MODE_BITSIZE (GET_MODE_INNER (vec_mode));
+temp = extract_bit_field (temp, bitsize, bitsize * index, unsignedp,
+  target, mode, mode);
 gcc_assert (temp);
 return temp;
   }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 

[AArch64] Wire up vqdmullh_laneq_s16 and vqdmullh_laneq_s32

2014-09-24 Thread James Greenhalgh

Hi,

As per the subject line this patch adds support for two arm_neon.h
intrinsics that we had missed.

We also need to fix the signature of vqdmulls_lane_s32, which is an
obvious extension to this patch while we are in the area.

Tested for simd.exp and aarch64.exp with no issues.

OK?

Thanks,
James

---
gcc/

2014-09-24  James Greenhalgh  james.greenha...@arm.com

* config/aarch64/aarch64-simd-builtins.def (sqdmull_laneq): Expand
iterator.
* config/aarch64/aarch64-simd.md
(aarch64_sqdmull_laneq<mode>): Expand iterator.
* config/aarch64/arm_neon.h (vqdmullh_laneq_s16): New.
(vqdmulls_lane_s32): Fix return type.
(vqdmulls_laneq_s32): New.

gcc/testsuite/

2014-09-24  James Greenhalgh  james.greenha...@arm.com

* gcc.target/aarch64/simd/vqdmullh_laneq_s16.c: New.
* gcc.target/aarch64/simd/vqdmulls_laneq_s32.c: Likewise.
* gcc.target/aarch64/simd/vqdmulls_lane_s32.c: Fix return type.
* gcc.target/aarch64/scalar_intrinsics.c (test_vqdmulls_s32):  Fix
return type.
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index de264c4..2367436 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -155,7 +155,7 @@
 
   BUILTIN_VSD_HSI (BINOP, sqdmull, 0)
   BUILTIN_VSD_HSI (TERNOP, sqdmull_lane, 0)
-  BUILTIN_VD_HSI (TERNOP, sqdmull_laneq, 0)
+  BUILTIN_VSD_HSI (TERNOP, sqdmull_laneq, 0)
   BUILTIN_VD_HSI (BINOP, sqdmull_n, 0)
   BUILTIN_VQ_HSI (BINOP, sqdmull2, 0)
   BUILTIN_VQ_HSI (TERNOP, sqdmull2_lane, 0)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 493e88628c2a7ef2c4f87031d86d1a5edcbca06b..45ea9d7895e93d4c4b137de1c01f6a1e93942d11 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3398,7 +3398,7 @@ (define_expand aarch64_sqdmull_lanemod
 
 (define_expand "aarch64_sqdmull_laneq<mode>"
   [(match_operand:<VWIDE> 0 "register_operand" "=w")
-   (match_operand:VD_HSI 1 "register_operand" "w")
+   (match_operand:VSD_HSI 1 "register_operand" "w")
    (match_operand:<VCONQ> 2 "register_operand" "<vwx>")
    (match_operand:SI 3 "immediate_operand" "i")]
   "TARGET_SIMD"
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index feca00e..9b1873f 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -19420,16 +19420,28 @@ vqdmullh_lane_s16 (int16_t __a, int16x4_t __b, const int __c)
   return __builtin_aarch64_sqdmull_lanehi (__a, __b, __c);
 }
 
+__extension__ static __inline int32_t __attribute__ ((__always_inline__))
+vqdmullh_laneq_s16 (int16_t __a, int16x8_t __b, const int __c)
+{
+  return __builtin_aarch64_sqdmull_laneqhi (__a, __b, __c);
+}
+
 __extension__ static __inline int64_t __attribute__ ((__always_inline__))
 vqdmulls_s32 (int32_t __a, int32_t __b)
 {
   return __builtin_aarch64_sqdmullsi (__a, __b);
 }
 
-__extension__ static __inline int64x1_t __attribute__ ((__always_inline__))
+__extension__ static __inline int64_t __attribute__ ((__always_inline__))
 vqdmulls_lane_s32 (int32_t __a, int32x2_t __b, const int __c)
 {
-  return (int64x1_t) {__builtin_aarch64_sqdmull_lanesi (__a, __b, __c)};
+  return __builtin_aarch64_sqdmull_lanesi (__a, __b, __c);
+}
+
+__extension__ static __inline int64_t __attribute__ ((__always_inline__))
+vqdmulls_laneq_s32 (int32_t __a, int32x4_t __b, const int __c)
+{
+  return __builtin_aarch64_sqdmull_laneqsi (__a, __b, __c);
 }
 
 /* vqmovn */
diff --git a/gcc/testsuite/gcc.target/aarch64/scalar_intrinsics.c b/gcc/testsuite/gcc.target/aarch64/scalar_intrinsics.c
index c07c94c..ea29066 100644
--- a/gcc/testsuite/gcc.target/aarch64/scalar_intrinsics.c
+++ b/gcc/testsuite/gcc.target/aarch64/scalar_intrinsics.c
@@ -501,7 +501,7 @@ test_vqdmulls_s32 (int32_t a, int32_t b)
 
 /* { dg-final { scan-assembler-times "\\tsqdmull\\td\[0-9\]+, s\[0-9\]+, v" 1 } } */
 
-int64x1_t
+int64_t
 test_vqdmulls_lane_s32 (int32_t a, int32x2_t b)
 {
   return vqdmulls_lane_s32 (a, b, 1);
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmullh_laneq_s16.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmullh_laneq_s16.c
new file mode 100644
index 000..947ebf4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/vqdmullh_laneq_s16.c
@@ -0,0 +1,15 @@
+/* Test the vqdmullh_laneq_s16 AArch64 SIMD intrinsic.  */
+
+/* { dg-do compile } */
+/* { dg-options "-save-temps -O3 -fno-inline" } */
+
+#include "arm_neon.h"
+
+int32_t
+t_vqdmullh_laneq_s16 (int16_t a, int16x8_t b)
+{
+  return vqdmullh_laneq_s16 (a, b, 0);
+}
+
+/* { dg-final { scan-assembler-times "sqdmull\[ \t\]+\[sS\]\[0-9\]+, ?\[hH\]\[0-9\]+, ?\[vV\]\[0-9\]+\.\[hH\]\\\[0\\\]\n" 1 } } */
+/* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vqdmulls_lane_s32.c b/gcc/testsuite/gcc.target/aarch64/simd/vqdmulls_lane_s32.c
index 6ed8e3a..24daaab 100644
--- 

Re: [PATCH] Fix PR 58867: asan and ubsan tests not run for installed testing.

2014-09-24 Thread Maxim Ostapenko

Hi Andrew!

I tried to run the ASan and UBSan tests on an installed toolchain, but this 
failed because current GCC doesn't support it.


I see that you fixed this issue 
(http://patchwork.ozlabs.org/patch/286866/), but the patch was never 
applied to GCC. So, I wonder if you are going to commit it.


-Maxim


Re: PING: Re: [patch] rename DECL_ABSTRACT to DECL_ABSTRACT_P

2014-09-24 Thread Jeff Law

On 09/24/14 08:40, Aldy Hernandez wrote:

On 09/18/14 12:39, Aldy Hernandez wrote:



Yeah, sure, either way it's a good cleanup ;).

No strong opinions.  Though I think true/false are the way we want folks
to write new code.  Given that's the long term direction, might as well
fix that nit for DECL_ABSTRACT_P.


Alright... fixed.

OK?


Ping.

OK for the trunk.  Sorry I didn't pre-approve the trivial update.

Jeff



Re: [PATCH, Pointer Bounds Checker 22/x] Inline

2014-09-24 Thread Jeff Law

On 09/24/14 01:28, Ilya Enkovich wrote:




I'm a bit curious why you removed the original RETBND statement in
value-prof, only to reinsert it.  Is there some reason you needed to do
that?


After the call transformation we have something like this:

if (condition)
   new_lhs = direct_call (...);
else
   old_lhs = call (...);
old_bnd = __builtin_retbnd (old_lhs);

Removing and reinserting the original retbnd statement transforms it into:

if (condition)
   new_lhs = direct_call (...);
else
{
   old_lhs = call (...);
   old_bnd = __builtin_retbnd (old_lhs);
}

The rest of the code inserts bounds for new_lhs and creates a phi node for
bounds, similar to what is done for the call return value.

Oh yea, makes perfect sense, the earlier code inserted the conditional, 
but left the bounds setting bits in their prior (now the merge point) 
location.
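
As a rough illustration of the shape described above (this is not GCC code), the sketch below models the transformed control flow in plain C. The names struct bounds, retbnd_of, direct_call, indirect_call and promoted_call are hypothetical stand-ins, with retbnd_of playing the role of __builtin_retbnd. Each arm computes its own bounds right after its own call, and the join selects one value/bounds pair, which is the job a phi node does in SSA form.

```c
#include <stddef.h>

/* Hypothetical stand-in for pointer bounds; not the real MPX layout. */
struct bounds {
    const char *lower;
    const char *upper;   /* one past the last valid byte */
};

static char direct_buf[16];
static char indirect_buf[64];

/* Hypothetical callees standing in for direct_call (...) and call (...). */
static char *direct_call(void)   { return direct_buf; }
static char *indirect_call(void) { return indirect_buf; }

/* Hypothetical stand-in for __builtin_retbnd: bounds of a returned pointer. */
static struct bounds retbnd_of(const char *p)
{
    struct bounds b;
    size_t size = (p == direct_buf) ? sizeof direct_buf : sizeof indirect_buf;

    b.lower = p;
    b.upper = p + size;
    return b;
}

/* Both the value and its bounds are produced inside each arm; the
   assignments to lhs and bnd model the phi nodes at the merge point. */
char *promoted_call(int condition, struct bounds *bnd_out)
{
    char *lhs;
    struct bounds bnd;

    if (condition) {
        char *new_lhs = direct_call();
        struct bounds new_bnd = retbnd_of(new_lhs);
        lhs = new_lhs;
        bnd = new_bnd;
    } else {
        char *old_lhs = indirect_call();
        struct bounds old_bnd = retbnd_of(old_lhs);
        lhs = old_lhs;
        bnd = old_bnd;
    }

    *bnd_out = bnd;
    return lhs;
}
```

If the bounds were still computed only after the join, they would refer to a value that exists on just one path, which is the situation the removal and reinsertion avoids.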


Thanks,
Jeff



Re: [GOOGLE] Fix new tests

2014-09-24 Thread Xinliang David Li
not sure if there is a better way, but ok.

David

On Wed, Sep 24, 2014 at 6:20 AM, Teresa Johnson tejohn...@google.com wrote:
 The new tests added for -mpatch-functions-for-instrumentation did not
 correctly restrict themselves to x86_64 since tree-prof.exp doesn't
 support dg-do. Work around this by using target selectors on the
 dg-options. I apply the -mpatch and related options only if it is
 x86_64, otherwise it simply does splitting.

 Ok for google branches?

 Teresa

 2014-09-24  Teresa Johnson  tejohn...@google.com

 * testsuite/gcc.dg/tree-prof/cold_partition_patch.c:
 * testsuite/g++.dg/tree-prof/partition_patch.C:

 Index: testsuite/gcc.dg/tree-prof/cold_partition_patch.c
 ===
 --- testsuite/gcc.dg/tree-prof/cold_partition_patch.c   (revision 215525)
 +++ testsuite/gcc.dg/tree-prof/cold_partition_patch.c   (working copy)
 @@ -1,8 +1,7 @@
  /* Check if patching works with function splitting. */
 -/* { dg-do compile { target x86_64-*-* } } */
  /* { dg-require-effective-target freorder } */
 -/* { dg-options -O2 -freorder-blocks-and-partition -save-temps
 -mpatch-functions-for-instrumentation -fno-optimize-sibling-calls  }
 */
 -
 +/* { dg-options -O2 -freorder-blocks-and-partition -save-temps  {
 target { ! x86_64-*-* } } }
 +/* { dg-options -O2 -freorder-blocks-and-partition -save-temps
 -mpatch-functions-for-instrumentation -fno-optimize-sibling-calls  {
 target x86_64-*-* } } */
  #define SIZE 1

  const char *sarr[SIZE];
 Index: testsuite/g++.dg/tree-prof/partition_patch.C
 ===
 --- testsuite/g++.dg/tree-prof/partition_patch.C(revision 215525)
 +++ testsuite/g++.dg/tree-prof/partition_patch.C(working copy)
 @@ -1,7 +1,7 @@
  // Check if patching works with function splitting.
 -// { dg-do compile { target x86_64-*-* } }
  // { dg-require-effective-target freorder }
 -// { dg-options -O2 -fnon-call-exceptions
 -freorder-blocks-and-partition -mpatch-functions-for-instrumentation
 -fno-optimize-sibling-calls  }
 +// { dg-options -O2 -fnon-call-exceptions
 -freorder-blocks-and-partition  { target { ! x86_64-*-* } } }
 +// { dg-options -O2 -fnon-call-exceptions
 -freorder-blocks-and-partition -mpatch-functions-for-instrumentation
 -fno-optimize-sibling-calls  { target x86_64-*-* } }

  int k;


 --
 Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


[Patch] Fix PR61889 for the w64-mingw32 case

2014-09-24 Thread Rainer Emrich
The following patch fixes PR61889 for x86_64-w64-mingw32. Details can be found
on https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61889

The patch was bootstrapped on x86_64-w64-mingw32.

If the patch is OK, Kai, would you apply it, please?

Rainer

2014-09-24  Rainer Emrich  rai...@emrich-ebersheim.de

PR gcov-profile/61889
* gcc/gcov-tool.c: Remove wrong #if !defined(_WIN32)
* libgcc/libgcov-driver-system.c: undefine clashing macro for mkdir


Index: gcc/gcov-tool.c
===
--- gcc/gcov-tool.c (revision 215554)
+++ gcc/gcov-tool.c (working copy)
@@ -89,11 +89,7 @@ gcov_output_files (const char *out, stru
   /* Try to make directory if it doesn't already exist.  */
   if (access (out, F_OK) == -1)
 {
-#if !defined(_WIN32)
   if (mkdir (out, S_IRWXU | S_IRWXG | S_IRWXO) == -1 && errno != EEXIST)
-#else
-  if (mkdir (out) == -1 && errno != EEXIST)
-#endif
 fatal_error ("Cannot make directory %s", out);
 } else
   unlink_profile_dir (out);
Index: libgcc/libgcov-driver-system.c
===
--- libgcc/libgcov-driver-system.c  (revision 215554)
+++ libgcc/libgcov-driver-system.c  (working copy)
@@ -66,6 +66,9 @@ create_file_directory (char *filename)
 #ifdef TARGET_POSIX_IO
  mkdir (filename, 0755) == -1
 #else
+#ifdef mkdir
+#undef mkdir
+#endif
  mkdir (filename) == -1
 #endif
 /* The directory might have been made by another process.  */
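
For readers unfamiliar with the underlying portability problem: POSIX mkdir takes a path and a mode, while the Windows C runtime's _mkdir takes only a path. Below is a minimal sketch of hiding that difference behind one helper. The function name make_directory is hypothetical, and this is not what the patch does (the patch keeps calling mkdir directly and only adjusts the preprocessor logic around it).

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#ifdef _WIN32
#include <direct.h>
#endif

/* Create DIR.  Returns 0 if the call succeeded or failed with EEXIST
   (for example because another process created the directory first),
   and -1 for any other failure, with errno left as set by the C library.
   Note that EEXIST is also reported when DIR names an existing
   non-directory; this sketch does not distinguish that case.  */
int make_directory(const char *dir)
{
#ifdef _WIN32
    int rc = _mkdir(dir);                               /* one argument */
#else
    int rc = mkdir(dir, S_IRWXU | S_IRWXG | S_IRWXO);   /* path and mode */
#endif

    if (rc == -1 && errno != EEXIST)
        return -1;
    return 0;
}
```

Calling the helper twice on the same path is expected to succeed both times, which matches the "might have been made by another process" comment in the libgcov code.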


Re: [wwwdocs] Update C++1y status page now that C++14 is finished.

2014-09-24 Thread Mike Stump
On Sep 24, 2014, at 5:54 AM, Jonathan Wakely jwak...@redhat.com wrote:
 C++14 is no longer the next standard, it's here, so update the project
 page.

Can we have a web doc person update the name of the page (projects/cxx1y.html 
-> projects/cxx14.html) and add a redirect as necessary?

[wwwdocs] IPA/LTO/FDO updates for gcc-5/changes.html

2014-09-24 Thread Jan Hubicka
Hi,
this patch adds a list of changes to IPA/LTO/FDO before I forget about them ;)

Honza

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.9
diff -c -p -r1.9 changes.html
*** changes.html5 Sep 2014 08:25:46 -   1.9
--- changes.html24 Sep 2014 15:23:35 -
***
*** 17,22 
--- 17,67 
  <h2 id="general">General Optimizer Improvements</h2>
  
    <ul>
+ <li>Inter-procedural optimization improvements:
+ <ul>
+  <li>Devirtualization pass was significantly improved by adding
+better support for speculative devirtualization and dynamic type
+detection. About 50% of virtual calls in Firefox are speculatively
+devirtualized during link-time optimization.</li>
+  <li>New comdat localization pass lets the linker eliminate more dead code
+in presence of C++ inline functions.</li>
+  <li>Virtual tables are now optimized. Local aliases are used to reduce
+dynamic linking time of C++ virtual tables on ELF targets and
+data alignment has been reduced to limit data segment bloat.</li>
+  <li>New <code>-fno-semantic-interposition</code> flag can be used
+to improve code quality of shared libraries where interposition of
+exported symbols is not allowed.</li>
+  <li>Write-only variables are now detected and optimized out.</li>
+  <li>With profile feedback the function inliner can now bypass
+<code>--param inline-insns-auto</code> and <code>--param
+inline-insns-single</code> limits for hot calls.</li>
+  <li>IPA reference pass was significantly sped up making it feasible
+to enable <code>-fipa-reference</code> with
+<code>-fprofile-generate</code>. This also solves a bottleneck
+seen when optimizing Chromium with link time optimization.</li>
+  <li>Symbol table and call-graph API was reworked to C++ and
+simplified.</li>
+ </ul></li>
+ <li>Link-time optimization improvements:
+ <ul>
+   <li>New One Definition Rule based merging of C++ types implemented.
+ Type merging enables better devirtualization and alias analysis.
+ Streaming extra information needed to merge types adds about 2-6% of
+ memory size and object size increase. This can be controlled by
+ <code>-flto-odr-type-merging</code>.</li>
+   <li>GCC bootstrap now uses slim LTO object files.</li>
+   <li>Memory usage and link times were improved.  Tree merging was sped up,
+ memory usage of GIMPLE declarations and types was reduced, and
+ support for on-demand streaming of variable constructors was added.</li>
+ </ul></li>
+ <li>Feedback directed optimization improvements:
+ <ul>
+   <li>Profile precision was improved in presence of C++ inline and extern
+ inline functions.</li>
+   <li>New <code>gcov-tool</code> to manipulate profiles.</li>
+   <li>Profile is now more tolerant to source file changes (this can be
+ controlled by <code>--param profile-func-internal-id</code>).</li>
+ </ul></li>
  <li>UndefinedBehaviorSanitizer gained a few new sanitization options:
  <ul>
<li><code>-fsanitize=float-divide-by-zero</code>: detect floating-point
***
*** 54,59 
--- 99,107 
  <li>Full support for <a href="https://www.cilkplus.org/">Cilk Plus</a>
has been added to the GCC compiler. Cilk Plus is an extension to
the C and C++ languages to support data and task parallelism.</li>
+ <li>New attribute <code>no_reorder</code> prevents reordering of selected symbols.
+   This makes it possible to link-time optimize the Linux kernel without the need to use
+   <code>-fno-toplevel-reorder</code>, which disables several optimizations.</li>
</ul>
  
  <h3 id="c">C</h3>
***
*** 90,95 
--- 138,152 
  <li>An implementation of <code>std::experimental::any</code>.</li>
  <li>New random number distributions <code>logistic_distribution</code> and
<code>uniform_on_sphere_distribution</code> as extensions.</li>
+ <li>New One Definition Rule violation warning (controlled by <code>-Wodr</code>)
+ detects mismatches in type definitions and virtual table contents
+   during link-time optimization.</li>
+ <li>New warnings <code>-Wsuggest-final-types</code> and
+   <code>-Wsuggest-final-methods</code> help developers
+   to annotate programs with <code>final</code> specifiers (or anonymous
+   namespaces) in the cases where code generation improves.
+   These warnings can be used at compile time, but they are more
+   useful in combination with link-time optimization.</li>
</ul>
  
  <h3 id="fortran">Fortran</h3>


Re: Enable EBX for x86 in 32bits PIC code

2014-09-24 Thread Jeff Law

On 09/24/14 00:56, Ilya Enkovich wrote:

2014-09-23 20:10 GMT+04:00 Jeff Law l...@redhat.com:

On 09/23/14 10:03, Jakub Jelinek wrote:


On Tue, Sep 23, 2014 at 10:00:00AM -0600, Jeff Law wrote:


On 09/23/14 08:34, Jakub Jelinek wrote:


On Tue, Sep 23, 2014 at 05:54:37PM +0400, Ilya Enkovich wrote:


use fixed EBX at least until we make sure pseudo PIC doesn't harm debug
info generation.  If we have such option then gcc.target/i386/pic-1.c
and



For debug info, it seems you are already handling this in
delegitimize_address target hook, I'd suggest just building some very
large
shared library at -O2 -g -fpic on i?86 and either look at the
sizes of .debug_info/.debug_loc sections with/without the patch,
or use the locstat utility from elfutils (talk to Petr Machata if
needed).


Can't hurt, but I really don't see how changing from a fixed to an
allocatable register is going to muck up debug info in any significant
way.



What matters is if the delegitimize_address target hook is as efficient in
delegitimization as before.  E.g. if it previously matched only when
seeing
%ebx + gotoff or similar, and wouldn't match anything now, some vars could
have debug locations including UNSPEC and be dropped on the floor.


Ah, yea, that makes sense.

jeff



After register allocation we have no idea where the GOT address is, and
therefore the delegitimize_address target hook becomes less efficient and
cannot remove UNSPECs. That's what I see now when I build GCC with the patch
applied:

In theory this shouldn't be too hard to fix.

I haven't looked at the code, but it might be something looking 
explicitly for ebx by register #, or something similar.  Which case 
within delegitimize_address isn't firing as it should after your changes?


jeff



Re: [GOOGLE] Fix new tests

2014-09-24 Thread Teresa Johnson
On Wed, Sep 24, 2014 at 8:23 AM, Xinliang David Li davi...@google.com wrote:
 not sure if there is a better way, but ok.

I looked through the documentation and other tests last night, but
couldn't come up with a better way unfortunately.

Teresa



-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


Re: [PATCH] microblaze: microblaze.md: Use 'SI' instead of 'VOID' for operand 1 of 'call_value_intern'

2014-09-24 Thread Michael Eager

On 09/24/14 07:31, Chen Gang wrote:

Hello Michael:

Firstly, thank you very much for always helping me with microblaze.

At present, after running the testsuite, the result is much better than in
my original attempt. Please help check the result: is it good enough for our
microblaze testsuite (can we say it passes)?

   Current result:

 # of expected passes            65987
 # of unexpected failures        82
 # of unexpected successes       1
 # of expected failures          97
 # of unresolved testcases       16378
 # of unsupported tests          1810


This is good.



   Original result:

 # of expected passes            48408
 # of unexpected failures        17253
 # of unexpected successes       1
 # of expected failures          97
 # of unresolved testcases       16570
 # of unsupported tests          1854

After checking the current result log, I find many messages related to
remote target testing. Do we have to handle them?

   e.g. Download to microblaze-xilinx-gdb failed, couldn't execute rcp: no such file or directory.


The test suite uses rcp to transfer files to or from the target,
either to provide input to a test case or to check the output.
Most Linux systems do not install rcp, since it is a security risk.


And I guess it is a glibc bug: it still adds the root directory (e.g.
/upstream/release) to 'libc.so' even when configure is already given
--with-sysroot.



Oh, sorry, glibc should also need --with-sysroot. I shall try it today,
and hope it will make everything OK.



After adding --with-sysroot for glibc, this issue still exists, so I
removed the redundant directory manually from libc.so and libpthread.so.

If our microblaze testsuite is OK, I will skip this issue (since I do not
have enough time for glibc at present).


OK with me.


--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077


Re: [PATCH] microblaze: microblaze.md: Use 'SI' instead of 'VOID' for operand 1 of 'call_value_intern'

2014-09-24 Thread Mike Stump
On Sep 24, 2014, at 8:28 AM, Michael Eager ea...@eagerm.com wrote:
 After check the current result log, I find many remote target test
 related sentences, do we have to process it?
 
   e.g. Download to microblaze-xilinx-gdb failed, couldn't execute rcp: no 
 such file or directory.
 
 The test suite uses rcp to transfer files to or from the target,
 either to provide input to a test case or to check the output.
 Most Linux systems do not install rcp, since it is a security risk.

To clarify:

if {[board_info $desthost exists rcp_prog]} {
set RCP [board_info $desthost rcp_prog]
} else {
set RCP rcp
}

So, if you set rcp_prog to something else, you should be able to avoid rsh if 
you want.  Most people use ssh now-a-days.  You will want it set up to not 
require a password for testing.



Re: [PATCH i386 AVX512] [51/n] Add pd2dq and dq2pd converts.

2014-09-24 Thread Uros Bizjak
On Wed, Sep 24, 2014 at 10:49 AM, Kirill Yukhin kirill.yuk...@gmail.com wrote:
 Hello,
 Patch in the bottom adds support for pd2dq and dq2pd
 conversions.

 Bootstrapped.
 AVX-512* tests on top of patch-set all pass
 under simulator.

 Is it ok for trunk?

 gcc/
 * config/i386/i386.c
 (avx512f_ufix_notruncv8dfv8si_mask_round): Rename to ...
 (ufix_notruncv8dfv8si2_mask_round): this.

(ufix_notruncv8dfv8si2_mask_round): ... this.

 * config/i386/sse.md
 (define_insn avx512f_cvtdq2pd512_2): Update TARGET check.
 (define_insn avx_cvtdq2pd256_2): Add EVEX version.
 (define_insn sse2_cvtdq2pdmask_name): Add masking.
 (define_insn avx_cvtpd2dq256mask_name): Ditto.
 (define_expand sse2_cvtpd2dq): Delete.
 (define_insn sse2_cvtpd2dqmask_name): Add masking.
 (define_insn avx512f_ufix_notruncv8dfv8simask_nameround_name):
 Delete.
 (define_mode_attr pd2udqsuff): New.
 (define_insn
 ufix_notruncmodesi2dfmodelower2mask_nameround_name): Ditto.
 (define_insn ufix_notruncv2dfv2si2mask_name): Ditto.
 (define_insn *avx_cvttpd2dq256_2): Delete.
 (define_expand sse2_cvttpd2dq): Ditto.
 (define_insn sse2_cvttpd2dqmask_name): Add masking.

You didn't mention following no-op change (in two places):

 - (match_operand:V2SI 2 const0_operand)))]
 + (const_vector:V2SI [(const_int 0) (const_int 0)])))]

OK with the updated ChangeLog.

Thanks,
Uros.


Re: [PATCH, i386, Pointer Bounds Checker 33/x] MPX ABI

2014-09-24 Thread Jeff Law

On 09/24/14 01:05, Ilya Enkovich wrote:

However, we've still got the problem that the RTL you've generated is
ill-formed.  If I understand things correctly, the assignments are the
result of the call, that should be modeled by having the destination be a
PARALLEL as mentioned earlier.


OK. Will try it. BTW call_value_pop patterns have two sets. One for
returned value and one for stack register. How comes it differs much
from what I do with bound regs?

The semantics of a PARALLEL are that all the values used in the 
expressions are evaluated, then all the side effects are performed.  So:


(define_insn "*call_pop"
  [(call (mem:QI (match_operand:SI 0 "call_insn_operand" "lmBz"))
 (match_operand 1))
   (set (reg:SI SP_REG)
(plus:SI (reg:SI SP_REG)
 (match_operand:SI 2 "immediate_operand" "i")))]
  "!TARGET_64BIT && !SIBLING_CALL_P (insn)"
  "* return ix86_output_call_insn (insn, operands[0]);"
  [(set_attr "type" "call")])

The semantics of a PARALLEL indicate that the reference to SP_REG on 
the RHS of the 2nd assignment expression takes the value of SP_REG 
*prior to the call*.  And those are the semantics we depend on.



So in your case the RHS references to BND0_REG and BND1_REG use the 
values *before* the call -- and I don't think that's the semantics you 
want.  You might get away with it because of the UNSPEC wrapping, but 
IMHO, it's still ill-formed RTL.
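
The two-phase rule described above (evaluate every source first, then perform every assignment) can be modelled outside of GCC. The following is a minimal sketch in plain C, not GCC code: struct set_expr, apply_parallel, read_r0, read_r1 and MAX_SETS are hypothetical names, and registers are modelled as a simple array of long.

```c
#include <stddef.h>

#define MAX_SETS 8   /* arbitrary limit for this sketch */

/* Toy model of one (set DEST SRC): SRC is computed from the register file. */
struct set_expr {
    int dest;                        /* index of the register written */
    long (*src)(const long *regs);   /* computes the value to store */
};

/* Example sources used to demonstrate a swap. */
long read_r0(const long *regs) { return regs[0]; }
long read_r1(const long *regs) { return regs[1]; }

/* Apply N sets with PARALLEL-like semantics.  Returns 0 on success,
   -1 if N exceeds MAX_SETS.  */
int apply_parallel(long *regs, const struct set_expr *sets, size_t n)
{
    long values[MAX_SETS];
    size_t i;

    if (n > MAX_SETS)
        return -1;

    /* Phase 1: every source sees the state from before the insn. */
    for (i = 0; i < n; i++)
        values[i] = sets[i].src(regs);

    /* Phase 2: only now are the destinations written. */
    for (i = 0; i < n; i++)
        regs[sets[i].dest] = values[i];

    return 0;
}
```

With the two sets r0 = r1 and r1 = r0, this model swaps the registers, whereas executing the same two assignments one after the other would leave both holding the old r1. The same reasoning is why a source operand inside the PARALLEL reads the value from before the call.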


jeff




Re: parallel check output changes?

2014-09-24 Thread Segher Boessenkool
On Wed, Sep 24, 2014 at 10:54:57AM -0400, Andrew MacLeod wrote:
 On 09/23/2014 11:33 AM, Richard Sandiford wrote:
 Your patch instead sorts based on the full test name, including options,
 which means that the output no longer matches what you'd get from a
 non-parallel run.  AFAICT, it also no longer matches what you'd get from
 the .sh version.  That might be OK, just thought I'd mention it.

With the parallelisation changes the output was in pretty random order.  My
patch made that a fixed order again, albeit a different one from before.

 Is this supposed to be resolved now?  I'm still seeing some issues with a 
 branch cut from mainline from yesterday.   This is from the following 
 sequence:
 
 check out revision 215511 , build, make -j16 check, make -j16 check, 
 then compare all the .sum files:

I don't understand what exactly you did; you have left out some steps
I think?


Segher


Re: [PATCH] microblaze: microblaze.md: Use 'SI' instead of 'VOID' for operand 1 of 'call_value_intern'

2014-09-24 Thread Chen Gang
On 09/24/2014 11:28 PM, Michael Eager wrote:
 On 09/24/14 07:31, Chen Gang wrote:
 Hello Michael:

 Firstly, thank you very much for always providing your aid to me for
 microblaze.

 At present, after try testsuite, the result is much better than my
 original trying, please help check the result: is it enough for our
 microblaze testsuite (can we say it pass checking)?

Current result:

  # of expected passes            65987
  # of unexpected failures        82
  # of unexpected successes       1
  # of expected failures          97
  # of unresolved testcases       16378
  # of unsupported tests          1810
 
 This is good.
 

OK, thanks. I shall send a patch fixing ((void (*)(void))0)() tomorrow.
It passes the testsuite (old and new get the same result), and the new
one fixes the ((void (*)(void))0)() issue. So I guess this fix is valid. :-)


Thanks.
-- 
Chen Gang

Open share and attitude like air water and life which God blessed


Re: [PATCH i386 AVX512] [52/n] Add convert ps2pd and ps2dq.

2014-09-24 Thread Uros Bizjak
On Wed, Sep 24, 2014 at 10:54 AM, Kirill Yukhin kirill.yuk...@gmail.com wrote:
 Hello,
 Patch in the bottom adds support for ps2dq and ps2pd
 conversions.

 Bootstrapped.
 AVX-512* tests on top of patch-set all pass
 under simulator.

 Is it ok for trunk?

 gcc/
 * config/i386/sse.md
 (define_c_enum unspec): Add UNSPEC_CVTINT2MASK.
 (define_insn
 
 fixsuffixfix_truncmodesselongvecmodelower2mask_nameround_saeonly_name):
 New.
 (define_insn fixsuffixfix_truncv2sfv2di2mask_name): Ditto.
 (define_insn ufix_truncmodesseintvecmodelower2mask_name): 
 Ditto.
 (define_insn sse2_cvtss2sdround_saeonly_name): Change
 nonimmediate_operand to round_saeonly_nimm_predicate.
 (define_insn avx_cvtpd2ps256mask_name): Add masking.
 (define_expand sse2_cvtpd2ps_mask): New.
 (define_insn *sse2_cvtpd2psmask_name): Add masking.
 (define_insn avx512_cvtssemodesuffix2maskmode): New.
 (define_insn avx512_cvtmask2ssemodesuffixmode): Ditto.
 (define_insn sse2_cvtps2pdmask_name): Add masking.

OK, modulo UNSPEC_CVTINT2MASK stuff. Please split out and repost
UNSPEC_CVTINT2MASK part of the patch, as it doesn't belong in this
one. Also, please see the question in the patch.

Thanks,
Uros.


 --
 Thanks, K

 diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
 index b2e1d4f..c9d6e00 100644
 --- a/gcc/config/i386/sse.md
 +++ b/gcc/config/i386/sse.md
 @@ -132,6 +132,7 @@
;; For AVX512BW support
UNSPEC_PSHUFHW
UNSPEC_PSHUFLW
 +  UNSPEC_CVTINT2MASK

What was the reason to go with an unspec? The pattern that uses
generic vector RTX is preferred, unless that kind of pattern is too
complex.

;; For AVX512DQ support
UNSPEC_REDUCE
 @@ -4659,6 +4660,38 @@
 (set_attr prefix evex)
 (set_attr mode sseintvecmode2)])

 +(define_insn 
 fixsuffixfix_truncmodesselongvecmodelower2mask_nameround_saeonly_name
 +  [(set (match_operand:sselongvecmode 0 register_operand =v)
 +   (any_fix:sselongvecmode
 + (match_operand:VF1_128_256VL 1 round_saeonly_nimm_predicate 
 round_saeonly_constraint)))]
 +  TARGET_AVX512DQ  round_saeonly_modev8sf_condition
 +  vcvttps2fixsuffixqq\t{round_saeonly_mask_op2%1, 
 %0mask_operand2|%0mask_operand2, %1round_saeonly_mask_op2}
 +  [(set_attr type ssecvt)
 +   (set_attr prefix evex)
 +   (set_attr mode sseintvecmode3)])
 +
 +(define_insn fixsuffixfix_truncv2sfv2di2mask_name
 +  [(set (match_operand:V2DI 0 register_operand =v)
 +   (any_fix:V2DI
 + (vec_select:V2SF
 +   (match_operand:V4SF 1 nonimmediate_operand vm)
 +   (parallel [(const_int 0) (const_int 1)]]
 +  TARGET_AVX512DQ  TARGET_AVX512VL
 +  vcvttps2fixsuffixqq\t{%1, %0mask_operand2|%0mask_operand2, %1}
 +  [(set_attr type ssecvt)
 +   (set_attr prefix evex)
 +   (set_attr mode TI)])
 +
 +(define_insn ufix_truncmodesseintvecmodelower2mask_name
 +  [(set (match_operand:sseintvecmode 0 register_operand =v)
 +   (unsigned_fix:sseintvecmode
 + (match_operand:VF1_128_256VL 1 nonimmediate_operand vm)))]
 +  TARGET_AVX512VL
 +  vcvttps2udq\t{%1, %0mask_operand2|%0mask_operand2, %1}
 +  [(set_attr type ssecvt)
 +   (set_attr prefix evex)
 +   (set_attr mode sseintvecmode2)])
 +
  (define_expand avx_cvttpd2dq256_2
[(set (match_operand:V8SI 0 register_operand)
 (vec_concat:V8SI
 @@ -4713,7 +4746,7 @@
 (vec_merge:V2DF
   (float_extend:V2DF
 (vec_select:V2SF
 - (match_operand:V4SF 2 nonimmediate_operand 
 x,m,round_saeonly_constraint)
 + (match_operand:V4SF 2 round_saeonly_nimm_predicate 
 x,m,round_saeonly_constraint)
   (parallel [(const_int 0) (const_int 1)])))
   (match_operand:V2DF 1 register_operand 0,0,v)
   (const_int 1)))]
 @@ -4741,14 +4774,14 @@
 (set_attr prefix evex)
 (set_attr mode V8SF)])

 -(define_insn avx_cvtpd2ps256
 -  [(set (match_operand:V4SF 0 register_operand =x)
 +(define_insn avx_cvtpd2ps256mask_name
 +  [(set (match_operand:V4SF 0 register_operand =v)
 (float_truncate:V4SF
 - (match_operand:V4DF 1 nonimmediate_operand xm)))]
 -  TARGET_AVX
 -  vcvtpd2ps{y}\t{%1, %0|%0, %1}
 + (match_operand:V4DF 1 nonimmediate_operand vm)))]
 +  TARGET_AVX  mask_avx512vl_condition
 +  vcvtpd2ps{y}\t{%1, %0mask_operand2|%0mask_operand2, %1}
[(set_attr type ssecvt)
 -   (set_attr prefix vex)
 +   (set_attr prefix maybe_evex)
 (set_attr btver2_decode vector)
 (set_attr mode V4SF)])

 @@ -4761,16 +4794,28 @@
TARGET_SSE2
operands[2] = CONST0_RTX (V2SFmode);)

 -(define_insn *sse2_cvtpd2ps
 -  [(set (match_operand:V4SF 0 register_operand =x)
 +(define_expand sse2_cvtpd2ps_mask
 +  [(set (match_operand:V4SF 0 register_operand)
 +   (vec_merge:V4SF
 + (vec_concat:V4SF
 +   (float_truncate:V2SF
 + (match_operand:V2DF 1 nonimmediate_operand))
 +   (match_dup 4))
 + (match_operand:V4SF 2 

[jit] Add copyright and license headers and footers

2014-09-24 Thread David Malcolm
On Tue, 2014-09-23 at 23:27 +, Joseph S. Myers wrote:
[...]
  diff --git a/gcc/jit/internal-api.c b/gcc/jit/internal-api.c
 
 Should start with standard copyright and license header.  This applies to 
 all sources in gcc/jit/.
[...]

I've committed the following to the dmalcolm/jit branch:

ChangeLog.jit:
* ChangeLog.jit: Add copyright footer.

contrib/ChangeLog.jit:
* ChangeLog.jit: Add copyright footer.

gcc/ChangeLog.jit:
* ChangeLog.jit: Add copyright footer.

gcc/java/ChangeLog.jit:
* ChangeLog.jit: Add copyright footer.

gcc/jit/ChangeLog.jit:
* ChangeLog.jit: Add copyright footer.
* Make-lang.in: Update copyright.
* config-lang.in: Update copyright.
* docs/examples/install-hello-world.c: Add copyright header.
* docs/examples/tut01-square.c: Likewise.
* docs/examples/tut02-sum-of-squares.c: Likewise.
* docs/examples/tut03-toyvm/toyvm.c: Likewise.
* internal-api.c: Likewise.
* internal-api.h: Likewise.
* libgccjit++.h: Likewise.
* libgccjit.c: Likewise.
* libgccjit.h: Likewise.
* libgccjit.map: Likewise.

gcc/testsuite/ChangeLog.jit:
* ChangeLog.jit: Add copyright footer.

libbacktrace/ChangeLog.jit:
* ChangeLog.jit: Add copyright footer.

libcpp/ChangeLog.jit:
* ChangeLog.jit: Add copyright footer.

libdecnumber/ChangeLog.jit:
* ChangeLog.jit: Add copyright footer.

libiberty/ChangeLog.jit:
* ChangeLog.jit: Add copyright footer.

zlib/ChangeLog.jit:
* ChangeLog.jit: Add copyright footer.
---
 ChangeLog.jit                                | 10 ++
 contrib/ChangeLog.jit                        | 10 ++
 gcc/ChangeLog.jit                            | 10 ++
 gcc/java/ChangeLog.jit                       | 10 ++
 gcc/jit/ChangeLog.jit                        | 22 ++
 gcc/jit/Make-lang.in                         |  2 +-
 gcc/jit/config-lang.in                       |  2 +-
 gcc/jit/docs/examples/install-hello-world.c  | 19 +++
 gcc/jit/docs/examples/tut01-square.c         | 19 +++
 gcc/jit/docs/examples/tut02-sum-of-squares.c | 19 +++
 gcc/jit/docs/examples/tut03-toyvm/toyvm.c    | 19 ++-
 gcc/jit/internal-api.c                       | 20 
 gcc/jit/internal-api.h                       | 20 
 gcc/jit/libgccjit++.h                        | 19 ++-
 gcc/jit/libgccjit.c                          | 21 -
 gcc/jit/libgccjit.h                          | 22 +++---
 gcc/jit/libgccjit.map                        | 18 ++
 gcc/testsuite/ChangeLog.jit                  | 10 ++
 libbacktrace/ChangeLog.jit                   | 10 ++
 libcpp/ChangeLog.jit                         | 10 ++
 libdecnumber/ChangeLog.jit                   | 10 ++
 libiberty/ChangeLog.jit                      | 10 ++
 zlib/ChangeLog.jit                           | 10 ++
 23 files changed, 314 insertions(+), 8 deletions(-)

diff --git a/ChangeLog.jit b/ChangeLog.jit
index 5d2db3f..d2c3941 100644
--- a/ChangeLog.jit
+++ b/ChangeLog.jit
@@ -1,3 +1,7 @@
+2014-09-24  David Malcolm  dmalc...@redhat.com
+
+   * ChangeLog.jit: Add copyright footer.
+
 2014-09-11  David Malcolm  dmalc...@redhat.com
 
* MAINTAINERS (Various Maintainers): Add myself as jit maintainer.
@@ -6,3 +10,9 @@
 
* configure.ac: Add --enable-host-shared
* configure: Regenerate.
+
+Copyright (C) 2013-2014 Free Software Foundation, Inc.
+
+Copying and distribution of this file, with or without modification,
+are permitted in any medium without royalty provided the copyright
+notice and this notice are preserved.
diff --git a/contrib/ChangeLog.jit b/contrib/ChangeLog.jit
index 79be84d..38a315a 100644
--- a/contrib/ChangeLog.jit
+++ b/contrib/ChangeLog.jit
@@ -1,4 +1,14 @@
+2014-09-24  David Malcolm  dmalc...@redhat.com
+
+   * ChangeLog.jit: Add copyright footer.
+
 2014-01-23  David Malcolm  dmalc...@redhat.com
 
* jit-coverage-report.py: New file: a script to print crude
code-coverage information for the libgccjit API.
+
+Copyright (C) 2014 Free Software Foundation, Inc.
+
+Copying and distribution of this file, with or without modification,
+are permitted in any medium without royalty provided the copyright
+notice and this notice are preserved.
diff --git a/gcc/ChangeLog.jit b/gcc/ChangeLog.jit
index 9771913..29307b1 100644
--- a/gcc/ChangeLog.jit
+++ b/gcc/ChangeLog.jit
@@ -1,5 +1,9 @@
 2014-09-24  David Malcolm  dmalc...@redhat.com
 
+   * ChangeLog.jit: Add copyright footer.
+
+2014-09-24  David Malcolm  dmalc...@redhat.com
+
* cgraph.h (cgraphbuild_c_finalize): Delete prototype of empty
function.
(ipa_c_finalize): Likewise.
@@ -280,3 +284,9 @@

Re: [PATCH] microblaze: microblaze.md: Use 'SI' instead of 'VOID' for operand 1 of 'call_value_intern'

2014-09-24 Thread Chen Gang
On 09/24/2014 11:37 PM, Mike Stump wrote:
 On Sep 24, 2014, at 8:28 AM, Michael Eager ea...@eagerm.com wrote:
 After check the current result log, I find many remote target test
 related sentences, do we have to process it?

   e.g. Download to microblaze-xilinx-gdb failed, couldn't execute rcp: 
 no such file or directory.

 The test suite uses rcp to transfer files to or from the target,
 either to provide input to a test case or to check the output.
 Most Linux systems do not install rcp, since it is a security risk.
 
 To clarify:
 
 if {[board_info $desthost exists rcp_prog]} {
 set RCP [board_info $desthost rcp_prog]
 } else {
 set RCP rcp
 }
 
 So, if you set rcp_prog to something else, you should be able to avoid rsh if 
 you want.  Most people use ssh now-a-days.  You will want it set up to not 
 require a password for testing.
 

OK, thank you for your information.

One simple workaround under Fedora is 'yum install rsh', but then I get
another issue:

  Download to microblaze-xilinx-gdb failed, microblaze-xilinx-gdb: Unknown host

So I guess the root cause is that I only have a cross-compiling environment
under Fedora x86_64, with no real or virtual target to test on.


Thanks.
-- 
Chen Gang

Open share and attitude like air water and life which God blessed


Re: [PATCH i386 AVX512] [54/n] Add mov[dlh]dup insns support.

2014-09-24 Thread Uros Bizjak
On Wed, Sep 24, 2014 at 2:51 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote:
 Hello,
 patch in the bottom introduces support for
 vmov[dlh]dup insns.

 Bootstrapped.
 AVX-512* tests on top of patch-set all pass
 under simulator.

 Is it ok for trunk?

 gcc/
 * config/i386/sse.md
 (define_insn avx_movshdup256mask_name): Add masking.
 (define_insn sse3_movshdupmask_name): Ditto.
 (define_insn avx_movsldup256mask_name): Ditto.
 (define_insn sse3_movsldupmask_name): Ditto.
 (define_insn vec_dupv2dfmask_name): Ditto.
 (define_insn *vec_concatv2df): Add EVEX version.

OK.

Thanks,
Uros.


Re: parallel check output changes?

2014-09-24 Thread Andrew MacLeod

On 09/24/2014 12:10 PM, Segher Boessenkool wrote:

On Wed, Sep 24, 2014 at 10:54:57AM -0400, Andrew MacLeod wrote:

On 09/23/2014 11:33 AM, Richard Sandiford wrote:

Your patch instead sorts based on the full test name, including options,
which means that the output no longer matches what you'd get from a
non-parallel run.  AFAICT, it also no longer matches what you'd get from
the .sh version.  That might be OK, just thought I'd mention it.

With the parallelisation changes the output was pretty random order.  My
patch made that a fixed order again, albeit a different one from before.


Is this supposed to be resolved now?  I'm still seeing some issues with a
branch cut from mainline yesterday.  This is from the following
sequence:

check out revision 215511, build, make -j16 check, make -j16 check,
then compare all the .sum files:

I don't understand what exactly you did; you have left out some steps
I think?

What?  No... like what?  Check out a tree, do a basic configure and build from
scratch (./configure --verbose, make -j16 all), and then run make check
twice in a row: literally make -j16 -i check, with nothing in between.  So
the compiler and toolchain are exactly the same, and yet the results differ.
It's the same way I've done it forever, except I am still getting some different
results from run to run.  The target is a normal build-x86_64-unknown-linux-gnu.


What I'm saying is that something still isn't getting sorted all the time
(maybe if a section wasn't split up, it doesn't get sorted?), or not all the
patches to fix it are in, or something else is still amiss.
Notice it isn't the options that are the problem this time; it's the trailing
line number of the test case warning.  One is in numerical order, the
other is in alphabetical order.
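
To make the two orderings concrete, here is a minimal sketch in C.  It is
not taken from the testsuite scripts: the function names and the sample
values are made up for illustration.  It sorts the same trailing line
numbers once as text and once as numbers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Compare two strings as text, so "10" sorts before "9".  */
static int
cmp_alpha (const void *a, const void *b)
{
  return strcmp (*(const char *const *) a, *(const char *const *) b);
}

/* Compare two strings by numeric value, so "9" sorts before "10".  */
static int
cmp_numeric (const void *a, const void *b)
{
  long x = strtol (*(const char *const *) a, NULL, 10);
  long y = strtol (*(const char *const *) b, NULL, 10);
  return (x > y) - (x < y);
}

/* Sort the N line-number strings in V as plain text.  */
static void
sort_alphabetically (const char **v, size_t n)
{
  assert (v != NULL);
  qsort (v, n, sizeof (v[0]), cmp_alpha);
}

/* Sort the N line-number strings in V by their numeric value.  */
static void
sort_numerically (const char **v, size_t n)
{
  assert (v != NULL);
  qsort (v, n, sizeof (v[0]), cmp_numeric);
}
```

For the line numbers 9, 10, 100 and 21, the text sort gives 10, 100, 21, 9,
while the numeric sort gives 9, 10, 21, 100.  A .sum file sorted one way
will not compare cleanly against one sorted the other way, even when the
test results themselves are identical.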


I'm running it a third time now.. we'll see if it's different from both
the others or not.


Andrew


Re: [PATCH i386 AVX512] [53/n] Update vec_setmode_0 pattern constraints.

2014-09-24 Thread Uros Bizjak
On Wed, Sep 24, 2014 at 2:48 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote:
 Hello,
 Patch in the bottom extends to EVEX constraints
 of vec_setmode_0 insn pattern.

 Bootstrapped.
 AVX-512* tests on top of patch-set all pass
 under simulator.

 Is it ok for trunk?

 gcc/
 * config/i386/sse.md
 (define_insn vec_setmode_0): Add EVEX version.

OK.

Thanks,
Uros.


Re: [patch] libstdc++/29988 Rb_Tree reuse allocated nodes

2014-09-24 Thread Jonathan Wakely

On 23/09/14 21:58 +0200, François Dumont wrote:

On 23/09/2014 13:22, Jonathan Wakely wrote:

On 22/09/14 23:51 +0200, François Dumont wrote:

New patch in a couple of days then.


OK, thanks.

It was faster than I thought, here is the fixed patch tested under 
Linux x86_64.

[snip]

Ok to commit ?


Yes, it looks good - thanks!

You can close the PR after you commit it (and set Target Milestone to
5.0).




Re: [PATCH] PR63300 'const volatile' sometimes stripped in debug info.

2014-09-24 Thread Mark Wielaard
Hi Andreas,

On Wed, 2014-09-24 at 14:40 +0200, Andreas Arnez wrote:
 I changed the patch a bit further, to reduce unnecessary
 iterations and recursions, and tested it again.

Thanks for adding the tests and the testing. I think in general it is a
nicer and cleaner fix than I did. I do have a question about the removal
of the recursion of modified_type_die while stripping/adding qualifiers
though:

 +  /* Determine a lesser qualified type that most closely matches
 +  this one.  Then generate DW_TAG_* entries for the remaining
 +  qualifiers.  */
 +  sub_quals = get_nearest_type_subqualifiers (type, cv_quals,
 +   cv_qual_mask);
 +  mod_type_die = modified_type_die (type, sub_quals, context_die);
 +
 +  for (i = 0; i < sizeof (qual_info) / sizeof (qual_info[0]); i++)
 + if (qual_info[i].q & cv_quals & ~sub_quals)
 +   {
 + dw_die_ref d = new_die (qual_info[i].t, mod_scope, type);
 + if (mod_type_die)
 +   add_AT_die_ref (d, DW_AT_type, mod_type_die);
 + mod_type_die = d;
 +   }

Are you sure this is completely equivalent to the previous code that
recursed into modified_type_die again for each qualifier added?

At the top of modified_type_die we check whether there is already a
qualified type and if there is then we try to get the DIE for that one
with lookup_type_die. If there is no such DIE yet, then at the end of
modified_type_die we associate that type with the DIE with a call to
equate_type_number_to_die.

In your patch we skip that association in case we need to add more than
one qualifier. Is it guaranteed that for these in-between qualified
type DIEs there is no associated real type that get_qualified_type would
have been able to find?

O. Yes, of course that is guaranteed. If there was such a type then
get_nearest_type_subqualifiers would have returned it. Doh. OK.
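
The argument can be checked on a toy model.  The sketch below is not the
dwarf2out.c code: the struct, the Q_* bits and both function names are
hypothetical stand-ins, and it assumes that the nearest-subqualifier lookup
returns the largest existing subset of the wanted qualifiers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical qualifier bits; these are not GCC's TYPE_QUAL_* values.  */
enum { Q_CONST = 1, Q_VOLATILE = 2, Q_RESTRICT = 4 };

/* Toy stand-in for a DIE: the qualifier it adds and the DIE it modifies.  */
struct toy_die
{
  int qual;
  struct toy_die *type;
};

/* Count the bits set in Q.  */
static int
count_bits (int q)
{
  int n = 0;
  for (; q != 0; q &= q - 1)
    n++;
  return n;
}

/* Among the N qualifier sets in EXISTING (sets for which a real type
   already exists), return the largest one that is a subset of WANTED.
   The unqualified set 0 is assumed to exist always.  */
static int
nearest_subqualifiers (const int *existing, size_t n, int wanted)
{
  int best = 0, best_count = 0;
  size_t i;

  assert (existing != NULL);
  for (i = 0; i < n; i++)
    {
      int q = existing[i];
      if ((q & ~wanted) != 0)
	continue;		/* Not a subset of WANTED.  */
      if (count_bits (q) > best_count)
	{
	  best = q;
	  best_count = count_bits (q);
	}
    }
  return best;
}

/* Wrap BASE in one new DIE per qualifier that WANTED has and SUB_QUALS
   lacks, taking the new DIEs from POOL (room for three).  Return the
   outermost DIE.  */
static struct toy_die *
add_remaining_qualifiers (struct toy_die *base, int wanted, int sub_quals,
			  struct toy_die *pool)
{
  static const int order[] = { Q_CONST, Q_VOLATILE, Q_RESTRICT };
  struct toy_die *die = base;
  size_t i, used = 0;

  assert (pool != NULL);
  for (i = 0; i < sizeof (order) / sizeof (order[0]); i++)
    if (order[i] & wanted & ~sub_quals)
      {
	struct toy_die *d = &pool[used++];
	d->qual = order[i];
	d->type = die;
	die = d;
      }
  return die;
}
```

With only the unqualified and const variants existing and const volatile
restrict wanted, the lookup returns const and the loop adds volatile, then
restrict.  The in-between set const volatile cannot have a real type in
this model, because it is a larger subset of the wanted qualifiers and the
lookup would have returned it instead of const.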

Now do I delete this whole email?
Or will I just say: Looks good to me after thinking a bit about it. :)

Thanks,

Mark


Re: [PATCH i386 AVX512] [55/n] Extend `perm' insn patterns.

2014-09-24 Thread Uros Bizjak
On Wed, Sep 24, 2014 at 2:53 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote:
 Hello,
 Patch in the bottom extends `perm' insn
 patterns.

 Bootstrapped.
 AVX-512* tests on top of patch-set all pass
 under simulator.

 Is it ok for trunk?

 gcc/
 * config/i386/sse.md
 (define_expand avx2_avx512f_permmode): Rename to ...
 (define_expand avx2_avx512bw_permmode): this.

This is not consistent with the patch below. You are renaming to
avx2_avx512... Also, please use an ellipsis before this.

 (define_expand avx512_permmode_mask): Add 128/256-bit wide
 version.

Also mention the rename, and use the VI8F_256_512 mode iterator.

 (define_insn avx2_avx512f_permmode_1mask_name): Rename to ...
 (define_insn avx2_avx512bw_permmode_1mask_name): this.

Ellipsis before this.

OK with updated ChangeLog.

Thanks,
Uros.


Re: [PATCH 1/4] [AARCH64,NEON] Add patterns + builtins for vld[234](q?)_lane_* intrinsics

2014-09-24 Thread Charles Baylis
Kyril, Tejas,

Thanks for the review. I agree with all points and will respin v2 accordingly.

Charles


[jit] Use standard initial includes

2014-09-24 Thread David Malcolm
On Tue, 2014-09-23 at 23:27 +, Joseph S. Myers wrote:
[...]
  +#include "config.h"
  +#include "system.h"
  +#include "ansidecl.h"
  +#include "coretypes.h"
 
 The standard initial includes are config.h, system.h, coretypes.h.  
 system.h includes libiberty.h which includes ansidecl.h, so direct 
 ansidecl.h includes shouldn't be needed anywhere.
[...]

I've committed the following fix for the above to branch dmalcolm/jit:

gcc/jit/ChangeLog.jit:

* dummy-frontend.c: Update copyright year.  Follow standard for
initial includes by removing redundant include of ansidecl.h.
* internal-api.c: Follow standard for initial includes by removing
redundant include of ansidecl.h.
* jit-builtins.c: Likewise.
* libgccjit.c: Likewise.
---
 gcc/jit/ChangeLog.jit| 9 +
 gcc/jit/dummy-frontend.c | 3 +--
 gcc/jit/internal-api.c   | 1 -
 gcc/jit/jit-builtins.c   | 1 -
 gcc/jit/libgccjit.c  | 1 -
 5 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index f451771..4ddd3cb 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,5 +1,14 @@
 2014-09-24  David Malcolm  dmalc...@redhat.com
 
+   * dummy-frontend.c: Update copyright year.  Follow standard for
+   initial includes by removing redundant include of ansidecl.h.
+   * internal-api.c: Follow standard for initial includes by removing
+   redundant include of ansidecl.h.
+   * jit-builtins.c: Likewise.
+   * libgccjit.c: Likewise.
+
+2014-09-24  David Malcolm  dmalc...@redhat.com
+
* ChangeLog.jit: Add copyright footer.
* Make-lang.in: Update copyright.
* config-lang.in: Update copyright.
diff --git a/gcc/jit/dummy-frontend.c b/gcc/jit/dummy-frontend.c
index 1b96c91..1d178f9 100644
--- a/gcc/jit/dummy-frontend.c
+++ b/gcc/jit/dummy-frontend.c
@@ -1,5 +1,5 @@
 /* jit.c -- Dummy frontend for use during JIT-compilation.
-   Copyright (C) 2013 Free Software Foundation, Inc.
+   Copyright (C) 2013-2014 Free Software Foundation, Inc.
 
 This file is part of GCC.
 
@@ -19,7 +19,6 @@ along with GCC; see the file COPYING3.  If not see
 
 #include "config.h"
 #include "system.h"
-#include "ansidecl.h"
 #include "coretypes.h"
 #include "opts.h"
 #include "signop.h"
diff --git a/gcc/jit/internal-api.c b/gcc/jit/internal-api.c
index 9e59d92..76ada70 100644
--- a/gcc/jit/internal-api.c
+++ b/gcc/jit/internal-api.c
@@ -20,7 +20,6 @@ along with GCC; see the file COPYING3.  If not see
 
 #include "config.h"
 #include "system.h"
-#include "ansidecl.h"
 #include "coretypes.h"
 #include "opts.h"
 #include "tree.h"
diff --git a/gcc/jit/jit-builtins.c b/gcc/jit/jit-builtins.c
index 160ef20..c4b0f59 100644
--- a/gcc/jit/jit-builtins.c
+++ b/gcc/jit/jit-builtins.c
@@ -19,7 +19,6 @@ along with GCC; see the file COPYING3.  If not see
 
 #include "config.h"
 #include "system.h"
-#include "ansidecl.h"
 #include "coretypes.h"
 #include "opts.h"
 #include "tree.h"
diff --git a/gcc/jit/libgccjit.c b/gcc/jit/libgccjit.c
index 510ed86..cb8321c 100644
--- a/gcc/jit/libgccjit.c
+++ b/gcc/jit/libgccjit.c
@@ -20,7 +20,6 @@ along with GCC; see the file COPYING3.  If not see
 
 #include "config.h"
 #include "system.h"
-#include "ansidecl.h"
 #include "coretypes.h"
 #include "opts.h"
 
-- 
1.7.11.7



Re: [PATCH] microblaze: microblaze.md: Use 'SI' instead of 'VOID' for operand 1 of 'call_value_intern'

2014-09-24 Thread Mike Stump
On Sep 24, 2014, at 9:23 AM, Chen Gang gang.chen.5...@gmail.com wrote:
 One simple workaround under Fedora is yum install rsh, but then I hit
 another issue:
 
   Download to microblaze-xilinx-gdb failed, microblaze-xilinx-gdb: Unknown host
 
 So I guess the root cause is that I only have a cross-compiling environment
 under Fedora x86_64, with no real or virtual target to test on.

Yes, if you want to test on a target, you will need a target.  You can either 
have a simulator (see binutils and sim/* for an example of how to write one) or 
target hardware in some form.
