Re: [PING] [PATCH] [MSP430] PR79242: Implement Complex Partial Integers

2018-02-13 Thread Jeff Law
On 02/08/2018 09:54 AM, Jozef Lawrynowicz wrote:
> ping x1
> 
> Complex Partial Integers are unimplemented, resulting in an ICE when
> attempting to use them. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79242
> This results in GCC7/8 for msp430-elf failing to build.
> 
> typedef _Complex __int20 C;
> 
> C
> foo (C x, C y)
> {
>   return x + y;
> }
> 
> (Thanks Jakub - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79242#c2)
> 
> ../../gcc/testsuite/gcc.target/msp430/pr79242.c: In function 'foo':
> ../../gcc/testsuite/gcc.target/msp430/pr79242.c:8:1: internal compiler
> error: in make_decl_rtl, at varasm.c:1304
>  foo (C x, C y)
>  ^~~
> 0xc07b29 make_decl_rtl(tree_node*)
> ../../gcc/varasm.c:1303
> 0x67523c set_parm_rtl(tree_node*, rtx_def*)
> ../../gcc/cfgexpand.c:1274
> 0x79ffb9 expand_function_start(tree_node*)
> ../../gcc/function.c:5166
> 0x6800e1 execute
> ../../gcc/cfgexpand.c:6250
> 
> The attached patch defines a new complex mode for PARTIAL_INT.
> You may notice that genmodes.c:complex_class returns MODE_COMPLEX_INT for
> MODE_PARTIAL_INT rather than MODE_COMPLEX_PARTIAL_INT. I reviewed the uses of
> MODE_COMPLEX_INT and it doesn't looked like a Complex Partial Int requires any
> different behaviour to MODE_COMPLEX_INT.
> 
> msp430_hard_regno_nregs now returns 2 for CPSImode, but I feel like this may 
> be
> better handled in the front-end. PSImode is already defined to only use 1
> register, so for a CPSI shouldn't the front-end should be able to work out 
> that
> double the amount of registers are required? Thoughts?
> Without the definition for CPSI in msp430_hard_regno_nregs,
> rtlanal.c:subreg_get_info thinks that a CPSI requires 4 registers of size 2,
> instead of 2 registers of size 4.
> 
> Successfully bootstrapped and tested for c,c++,fortran,lto,objc on
> x86_64-pc-linux-gnu with no regressions on gcc-7-branch.
> 
> With this patch gcc-7-branch now builds for msp430-elf. A further bug prevents
> trunk from building for msp430-elf.
> 
> If the attached patch is acceptable, I would appreciate if someone would 
> commit
> it for me (to trunk and gcc-7-branch), as I do not have write access.
> 
> 
> 0001-Add-support-for-Complex-Partial-Integers-CPSImode.patch
> 
> 
> From 31d8554ebb6afeb2d8f235cf3d3c262236aa5e32 Mon Sep 17 00:00:00 2001
> From: Jozef Lawrynowicz 
> Date: Fri, 12 Jan 2018 13:23:40 +
> Subject: [PATCH] Add support for Complex Partial Integers - CPSImode
> 
> 2018-01-XX  Jozef Lawrynowicz 
> 
> gcc/
>   PR target/79242
>   * machmode.def: Define a complex mode for PARTIAL_INT.
>   * genmodes.c (complex_class): Return MODE_COMPLEX_INT for
> MODE_PARTIAL_INT.
>   * doc/rtl.texi: Document CSPImode.
>   * config/msp430/msp430.c (msp430_hard_regno_nregs): Add CPSImode
> handling.
> (msp430_hard_regno_nregs_with_padding): Likewise.
> 
> gcc/testsuite/
>   PR target/79242
>   * gcc.target/msp430/pr79242.c: New test.
Thanks.  Installed.

jeff


Re: [PATCH] avoid warning for members declared both aligned and packed (PR 84108)

2018-02-13 Thread Jeff Law
On 02/02/2018 11:58 AM, Martin Sebor wrote:
> The design of the attribute exclusion framework includes
> support for different exclusions applying to different
> kinds of declarations (functions, types, and variables
> or fields), but the support is incomplete -- the logic
> to consider these differences is missing.  This is
> because the differences are apparently rare.  However,
> as the bug below points out, they do exist.
> 
> PR middle-end/84108 - incorrect -Wattributes warning for
> packed/aligned conflict on struct members, shows that while
> declaring a non-member variable aligned is enough to reduce
> the its alignment and declaring it both aligned and packed
> triggers a -Wattributes warning:
> 
>   int a __attribute__((packed, aligned (2)));   // -Wattributes
> 
> a struct member must be declared both aligned and packed in
> order to have its alignment reduced.  (Declaring a member
> just aligned has no effect and doesn't cause a warning).
> 
>   struct S {
> int b __attribute__((packed, aligned (2)));
> int c __attribute__((aligned (2)));   // no effect
>   };
> 
> As a result of the incomplete logic GCC 8 issues a -Wattributes
> for the declaration of b in the struct.
> 
> By adding the missing logic the attached patch lets GCC avoid
> the spurious warning.
> 
> I considered adding support for detecting the ineffective
> attribute aligned on the declaration of the member c at
> the same time but since that's not a regression I decided
> to defer that until GCC 9.  I opened bug 84185 to track it.
> 
> Tested on x86_64-linux with no regressions.
> 
> Martin
> 
> gcc-84108.diff
> 
> 
> PR middle-end/84108 - incorrect -Wattributes warning for packed/aligned 
> conflict on struct members
> 
> gcc/ChangeLog:
> 
>   PR c/84108
>   * attribs.c (diag_attr_exclusions): Consider the exclusion(s)
>   that correspond to the kind of a declaration.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR c/84108
>   * gcc.dg/Wattributes-8.c: New test.
OK.  Sorry for the wait.

Thanks.

Jeff


Re: PR84239, Reimplement CET intrinsics for rdssp/incssp insn

2018-02-13 Thread Jeff Law
On 02/12/2018 07:16 AM, Tsimbalist, Igor V wrote:
>> -Original Message-
>> From: Sandra Loosemore [mailto:san...@codesourcery.com]
>> Sent: Friday, February 9, 2018 7:42 PM
>> To: Tsimbalist, Igor V ; gcc-
>> patc...@gcc.gnu.org
>> Cc: Uros Bizjak 
>> Subject: Re: PR84239, Reimplement CET intrinsics for rdssp/incssp insn
>>
>> On 02/09/2018 05:50 AM, Tsimbalist, Igor V wrote:
>>> Introduce a couple of new CET intrinsics for reading and updating a
>> shadow stack
>>> pointer (_get_ssp and _inc_ssp), which are more user friendly. They replace
>> the existing
>>> _rdssp[d|q] and _incssp[d|q] instrinsics. The _get_ssp intrinsic has more
>> deterministic
>>> semantic: it returns a value of the shadow stack pointer if HW is CET
>> capable and
>>> 0 otherwise.
>>>
>>> Ok for trunk?
>> Just reviewing the documentation part:
>>
>>> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
>>> index cb9df97..9f25dd9 100644
>>> --- a/gcc/doc/extend.texi
>>> +++ b/gcc/doc/extend.texi
>>> @@ -12461,6 +12461,7 @@ instructions, but allow the compiler to
>> schedule those calls.
>>>  * TILEPro Built-in Functions::
>>>  * x86 Built-in Functions::
>>>  * x86 transactional memory intrinsics::
>>> +* x86 control-flow protection intrinsics::
>>>  @end menu
>>>
>>>  @node AArch64 Built-in Functions
>>> @@ -21772,13 +21773,17 @@ void __builtin_ia32_wrpkru (unsigned int)
>>>  unsigned int __builtin_ia32_rdpkru ()
>>>  @end smallexample
>>>
>>> -The following built-in functions are available when @option{-mcet} is
>> used.
>>> -They are used to support Intel Control-flow Enforcment Technology (CET).
>>> -Each built-in function generates the  machine instruction that is part of
>> the
>>> -function's name.
>>> +The following built-in functions are available when @option{-mcet} or
>>> +@option{-mshstk} option is used.  They support shadow stack
>>> +machine instructions from Intel Control-flow Enforcment Technology
>> (CET).
>>
>> s/Enforcment/Enforcement/
>>
>>> +Each built-in function generates the  machine instruction that is part
>>> +of the function's name.  These are the internal low level functions.
>> s/low level/low-level/
>>
>>> +Normally the functions in @ref{x86 control-flow protection intrinsics}
>>> +should be used instead.
>>> +
>>>  @smallexample
>>> -unsigned int __builtin_ia32_rdsspd (unsigned int)
>>> -unsigned long long __builtin_ia32_rdsspq (unsigned long long)
>>> +unsigned int __builtin_ia32_rdsspd (void)
>>> +unsigned long long __builtin_ia32_rdsspq (void)
>>>  void __builtin_ia32_incsspd (unsigned int)
>>>  void __builtin_ia32_incsspq (unsigned long long)
>>>  void __builtin_ia32_saveprevssp(void);
>>> @@ -21885,6 +21890,51 @@ else
>>>  Note that, in most cases, the transactional and non-transactional code
>>>  must synchronize together to ensure consistency.
>>>
>>> +@node x86 control-flow protection intrinsics
>>> +@subsection x86 Control-Flow Protection Intrinsics
>>> +
>>> +@deftypefn {CET Function} {ret_type} _get_ssp (void)
>>> +The @code{ret_type} is @code{unsigned long long} for x86-64 platform
>>> +and @code{unsigned int} for x86 pltform.
>> I'd prefer the sentence about the return type be placed after the
>> description of what the function does.  And please fix typos:
>> s/x86-64 platform/64-bit targets/
>> s/x86 pltform/32-bit targets/
>>
>>> +Get the current value of shadow stack pointer if shadow stack support
>>> +from Intel CET is enabled in the HW or @code{0} otherwise.
>> s/HW/hardware,/
>>
>>> +@end deftypefn
>>> +
>>> +@deftypefn {CET Function} void _inc_ssp (unsigned int)
>>> +Increment the current shadow stack pointer by the size specified by the
>>> +function argument.  For security reason only unsigned byte value is used
>>> +from the argument.  Therefore for the size greater than @code{255} the
>>> +function should be called several times.
>> How about rephrasing the last two sentences:
>>
>> The argument is masked to a byte value for security reasons, so to
>> increment by more than 255 bytes you must call the function multiple times.
>>
>>> +@end deftypefn
>>> +
>>> +The shadow stack unwind code looks like:
>>> +
>>> +@smallexample
>>> +#include 
>>> +
>>> +/* Unwind the shadow stack for EH.  */
>>> +#define _Unwind_Frames_Extra(x)\
>>> +  do   \
>>> +@{ \
>>> +  _Unwind_Word ssp = _get_ssp ();  \
>>> +  if (ssp != 0)\
>>> +   @{  \
>>> + _Unwind_Word tmp = (x);   \
>>> + while (tmp > 255) \
>>> +   @{  \
>>> + _inc_ssp (tmp);   \
>>> + tmp -= 255;   \
>>> +   @}  \
>>> + _inc_ssp (tmp);   \
>>> +   @}  \
>>> +@} \
>>> +while (0)
>>> +@end smallexample
>> Tabs in Texinfo input 

Re: [PATCH] correct -Wrestrict handling of arrays of arrays (PR 84095)

2018-02-13 Thread Jeff Law
On 02/01/2018 04:45 PM, Martin Sebor wrote:
> The previous patch didn't resolve all the false positives
> in the Linux kernel.  The attached is an update that fixes
> the remaining one having to do with multidimensional array
> members:
> 
>   struct S { char a[2][4]; };
> 
>   void f (struct S *p, int i)
>   {
> strcpy (p->a[0], "012");
> strcpy (p->a[i] + 1, p->a[0]);   // false positive here
>   }
> 
> In the process of fixing this I also made a couple of minor
> restructuring changes to the builtin_memref constructor to
> in order to make the code easier to follow: I broke it out
> into a couple of helper functions and called those.
> 
> As with the first revision of the patch, this one is also
> meant to be applied on top of
> 
>   https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01488.html
> 
> Sorry about the late churn.  Even though I tested the original
> implementation with the Linux kernel the bugs were only exposed
> non-default configurations that I didn't build.
> 
> Jakub, you had concerns about the code in the constructor
> and about interpreting the offsets in the diagnostics.
> I tried to address those in the patch.  Please review
> the changes and let me know if you have any further comments.
> 
> Thanks
> Martin
> 
> On 01/30/2018 04:19 PM, Martin Sebor wrote:
>> Testing GCC 8 with recent Linux kernel sources has uncovered
>> a bug in the handling of arrays of arrays by the -Wrestrict
>> checker where it fails to take references to different array
>> elements into consideration, issuing false positives.
>>
>> The attached patch corrects this mistake.
>>
>> In addition, to make warnings involving excessive offset bounds
>> more meaningful (less confusing), I've made a cosmetic change
>> to constrain them to the bounds of the accessed object.  I've
>> done this in response to multiple comments indicating that
>> the warnings are hard to interpret.  This change is meant to
>> be applied on top of the patch for bug 83698 (submitted mainly
>> to improve the readability of the offsets):
>>
>>   https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01488.html
>>
>> Martin
> 
> 
> gcc-84095.diff
> 
> 
> PR middle-end/84095 - false-positive -Wrestrict warnings for memcpy within 
> array
> 
> gcc/ChangeLog:
> 
>   PR middle-end/84095
>   * gimple-ssa-warn-restrict.c (builtin_memref::extend_offset_range): New.
>   (builtin_memref::set_base_and_offset): Same.  Handle inner references.
>   (builtin_memref::builtin_memref): Factor out parts into
>   set_base_and_offset and call it.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR middle-end/84095
>   * c-c++-common/Warray-bounds-3.c: Adjust text of expected warnings.
>   * c-c++-common/Wrestrict.c: Same.
>   * gcc.dg/Wrestrict-6.c: Same.
>   * gcc.dg/Warray-bounds-27.c: New test.
>   * gcc.dg/Wrestrict-8.c: New test.
>   * gcc.dg/Wrestrict-9.c: New test.
>   * gcc.dg/pr84095.c: New test.
> 
> diff --git a/gcc/gimple-ssa-warn-restrict.c b/gcc/gimple-ssa-warn-restrict.c
> index 528eb5b..367e05f 100644
> --- a/gcc/gimple-ssa-warn-restrict.c
> +++ b/gcc/gimple-ssa-warn-restrict.c

> +  else if (gimple_nop_p (stmt))
> + expr = SSA_NAME_VAR (expr);
> +  else
> + {
> +   base = expr;
> +   return;
>   }
This looks odd.  Can you explain what you're trying to do here?

I'm not offhand why you'd ever want to extract SSA_NAME_VAR.  In general
it's primary use is for dumps and debugging info.  I won't quite go so
far as to say using it for anything else is wrong, but it's certainly
something you ought to explain.


The rest looks fairly reasonable.  It's a bit hard to follow, but I
don't think we should do another round of refactoring at this stage.

jeff


Re: [PATCH] make -Wrestrict for strcat more meaningful (PR 83698)

2018-02-13 Thread Jeff Law
On 01/16/2018 01:36 PM, Martin Sebor wrote:
> PR 83698 - bogus offset in -Wrestrict messages for strcat of
> unknown strings, points out that the offsets printed by
> -Wrestrict for possibly overlapping strcat calls with unknown
> strings don't look meaningful in some cases.  The root cause
> of the bogus values is wrapping during the conversion from
> offset_int in which the pass tracks numerical values to
> HOST_WIDE_INT for printing.  (The problem will go away once
> GCC's pretty-printer supports wide int formatting.)  For
> instance, the following:
> 
>   extern char *d;
>   strcat (d + 3, d + 5);
> 
> results in
> 
>   warning: ‘strcat’ accessing 0 or more bytes at offsets 3 and 5 may
> overlap 1 byte at offset [3, -9223372036854775806]
> 
> which, besides printing the bogus negative offset on LP64
> targets, isn't correct because strcat always accesses at least
> one byte (the nul) and there can be no overlap at offset 3.
> To be more accurate, the warning should say something like:
> 
>   warning: ‘strcat’ accessing 3 or more bytes at offsets 3 and 5 may
> overlap 1 byte at offset 5 [-Wrestrict]
> 
> because the function must access at least 3 bytes in order to
> cause an overlap, and when it does, the overlap starts at the
> higher of the two offsets, i.e., 5.  (Though it's virtually
> impossible to have a single sentence and a singled set of
> numbers cover all the cases with perfect accuracy.)
> 
> The attached patch fixes these issues to make the printed values
> make more sense.  (It doesn't affect when diagnostics are printed.)
> 
> Although this isn't strictly a regression, it has an impact on
> the readability of the warnings.  If left unchanged, the original
> messages are likely to confuse users and lead to bug reports.
> 
> Martin
> 
> gcc-83698.diff
> 
> 
> PR tree-optimization/83698 - bogus offset in -Wrestrict messages for strcat 
> of unknown strings
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/83698
>   * gimple-ssa-warn-restrict.c (builtin_memref::builtin_memref): For
>   arrays constrain the offset range to their bounds.
>   (builtin_access::strcat_overlap): Adjust the bounds of overlap offset.
>   (builtin_access::overlap): Avoid setting the size of overlap if it's
>   already been set.
>   (maybe_diag_overlap): Also consider arrays when deciding what values
>   of offsets to include in diagnostics.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/83698
>   * gcc.dg/Wrestrict-7.c: New test.
>   * c-c++-common/Wrestrict.c: Adjust expected values for strcat.
>   * gcc.target/i386/chkp-stropt-17.c: Same.
> 
> Index: gcc/gimple-ssa-warn-restrict.c
> ===
> --- gcc/gimple-ssa-warn-restrict.c(revision 256752)
> +++ gcc/gimple-ssa-warn-restrict.c(working copy)
> @@ -384,6 +384,12 @@ builtin_memref::builtin_memref (tree expr, tree si
> base = SSA_NAME_VAR (base);
>}
>  
> +  if (DECL_P (base) && TREE_CODE (TREE_TYPE (base)) == ARRAY_TYPE)
> +{
> +  if (offrange[0] < 0 && offrange[1] > 0)
> + offrange[0] = 0;
> +}
> +
Comment before this code so that it's more obvious to the reader that
you're narrowing the range of the low bound because it's an array
(presumably Ada's virtual origins aren't a concern here).

Note some of the preceding context has changed, but I don't think it
materially affects your patch, only the ability to apply it cleanly.

Jakub had a nit with that hunk (unnecessarily nested conditional).  I'm
going to assume you'll address that.

I concur that the changes to strcat_overlap Jakub was concerned about
merely affect the diagnostic output, not whether or not we emit a
diagnostic.  I'm going to assume that even though we may drop bits that
the net is better with this patch rather than without.

OK for the trunk with the comment and nit fixed.

jeff


Re: [PATCH] make -Wrestrict for strcat more meaningful (PR 83698)

2018-02-13 Thread Jeff Law
On 01/16/2018 05:35 PM, Martin Sebor wrote:
> On 01/16/2018 02:32 PM, Jakub Jelinek wrote:
>> On Tue, Jan 16, 2018 at 01:36:26PM -0700, Martin Sebor wrote:
>>> --- gcc/gimple-ssa-warn-restrict.c    (revision 256752)
>>> +++ gcc/gimple-ssa-warn-restrict.c    (working copy)
>>> @@ -384,6 +384,12 @@ builtin_memref::builtin_memref (tree expr, tree si
>>>    base = SSA_NAME_VAR (base);
>>>    }
>>>
>>> +  if (DECL_P (base) && TREE_CODE (TREE_TYPE (base)) == ARRAY_TYPE)
>>> +    {
>>> +  if (offrange[0] < 0 && offrange[1] > 0)
>>> +    offrange[0] = 0;
>>> +    }
>>
>> Why the 2 nested ifs?
> 
> No particular reason.  There may have been more code in there
> that I ended up removing.  Or a comment.  I can remove the
> extra braces when the patch is approved.
> 
>>
>>> @@ -1079,14 +1085,35 @@ builtin_access::strcat_overlap ()
>>>  return false;
>>>
>>>    /* When strcat overlap is certain it is always a single byte:
>>> - the terminatinn NUL, regardless of offsets and sizes.  When
>>> + the terminating NUL, regardless of offsets and sizes.  When
>>>   overlap is only possible its range is [0, 1].  */
>>>    acs.ovlsiz[0] = dstref->sizrange[0] == dstref->sizrange[1] ? 1 : 0;
>>>    acs.ovlsiz[1] = 1;
>>> -  acs.ovloff[0] = (dstref->sizrange[0] +
>>> dstref->offrange[0]).to_shwi ();
>>> -  acs.ovloff[1] = (dstref->sizrange[1] +
>>> dstref->offrange[1]).to_shwi ();
>>
>> You use to_shwi many times in the patch, do the callers or something
>> earlier
>> in this function guarantee that you aren't throwing away any bits (unlike
>> tree_to_shwi, to_shwi method doesn't ICE, just throws away upper bits).
>> Especially when you perform additions like here, even if both
>> wide_ints fit
>> into a shwi, the result might not.
> 
> No, I'm not sure.  In fact, it wouldn't surprise me if it did
> happen.  It doesn't cause false positives or negatives but it
> can make the offsets less than meaningful in cases where they
> are within valid bounds.  There are also cases where they are
> meaningless to begin with and there is little the pass can do
> about that.
I was kind of expecting an update to try and address some of these
issues.  Though after re-reading your response the consequence of
throwing away bits here is just the diagnostic is not as precise as it
could be, right?  ie, it doesn't change when we issue a diagnostic, just
the contents of the diagnostic.

I filed this into my gcc9 bucket because it doesn't fix a regression,
but it appears that a regression fix does depend on this stuff to some
degree (84095).  So I'll try to take a look at this shortly so that we
can unblock 84095.


Jeff


Re: [PATCH] diagnose specializations of deprecated templates (PR c++/84318)

2018-02-13 Thread Jason Merrill
On Tue, Feb 13, 2018 at 5:20 PM, Martin Sebor  wrote:
> On 02/13/2018 01:09 PM, Jason Merrill wrote:
>>
>> On Tue, Feb 13, 2018 at 2:59 PM, Martin Sebor  wrote:
>>>
>>> On 02/13/2018 12:15 PM, Jason Merrill wrote:

 On Tue, Feb 13, 2018 at 1:31 PM, Martin Sebor  wrote:
>
> On 02/13/2018 09:24 AM, Martin Sebor wrote:
>>
>>
>>
>> On 02/13/2018 08:35 AM, Martin Sebor wrote:
>>>
>>>
>>>
>>> On 02/13/2018 07:40 AM, Jason Merrill wrote:



 On Mon, Feb 12, 2018 at 6:32 PM, Martin Sebor 
 wrote:
>
>
>
> While testing my fix for 83871 (handling attributes on explicit
> specializations) I noticed another old regression: while GCC 4.4
> would diagnose declarations of explicit specializations of all
> primary templates declared deprecated, GCC 4.5 and later only
> diagnose declarations of explicit specializations of class
> templates but not those of function or variable templates.




 Hmm, the discussion on the core reflector seemed to be agreeing that
 we want to be able to define non-deprecated specializations of a
 deprecated primary template.
>>>
>>>
>>>
>>>
>>> Yes, that's what Richard wanted to do.  The only way to do it
>>> within the existing constraints(*) is to define a non-deprecated
>>> primary, and a deprecated partial specialization.  This is in line
>>> with that approach and supported by Clang and all other compilers
>>> I tested (including Clang).
>>
>>
>>
>>
>> To clarify, this approach works for class templates (e.g., like
>> std::numeric_limits that was mentioned in the core discussion)
>> and for variable templates.  Functions have no partial
>> specilizations so they have to be overloaded to achieve the same
>> effect.
>>
>> Implementations don't treat the deprecated attribute on partial
>> specializations consistently.
>>
>> EDG accepts and honors it on class template partial specializations
>> but rejects it with an error on those of variables.
>>
>> Clang accepts but silently ignores it on class template partial
>> specializations and rejects with an error it on variables.
>>
>> MSVC accepts and honors it on variables but silently ignores it
>> on class template partial specializations.
>>
>> GCC ignores it silently on class partial specializations and
>> with a warning on variables (I opened bug 84347 to track this
>> and to have GCC honor is everywhere).
>>
>> This is clearly a mess, which isn't surprising given how poorly
>> specified this is in the standard.  But from the test cases and
>> from the core discussion it seems clear that deprecating
>> a template, including its partial specializations (as opposed
>> to just a single explicit specialization) is desirable and
>> already supported, and that the wording in the standard just
>> needs to be adjusted to reflect that.
>>
>>>
>>> Martin
>>>
>>> [*] Except (as Richard noted) that the standard doesn't seem to
>>> allow a template to be deprecated.  I think that's a bug in the
>>> spec because all implementations allow it to some degree.
>
>
>
>
> One other note.  While thinking about this problem during
> the core discussion, another approach to deprecating a primary
> template without also deprecating all of its specializations
> occurred to me.
>
> 1) First declare the primary template without [[deprecated]].
> 2) Next declare its non-deprecated specializations (partial
>or explicit).
> 3) Finally declare the primary again, this time [[deprecated]].
>
> Like this:
>
>   template  structS;
>   template  structS { };
>   template  struct [[deprecated]] S { };
>   template  struct [[deprecated]] S { };
>
>   S si; // warning
>   S sci;  // no warning
>   S svi;   // warning
>
> This works as expected with Intel ICC.  All other compilers
> diagnose all three variables.  I'd say for [[deprecated]] it
> should work the way ICC does.  (For [[noreturn]] the first
> declaration must be [[noreturn]], so there this solution
> wouldn't work also because of that, in addition to function
> templates not being partially-specializable.)



 My understanding of the reflector discussion, and Richard's comment in
 particular, was that [[deprecated]] should apply to the instances, not
 the template itself, so that declaring the primary template
 [[deprecated]] doesn't affect explicit specializations.  Your last
 example should work as you expect in this model, but you can 

Re: [PATCH] Fix PR84101, account for function ABI details in vectorization costs

2018-02-13 Thread Jeff Law
On 01/30/2018 02:59 AM, Richard Biener wrote:
> 
> This patch tries to deal with the "easy" part of a function ABI,
> the return value location, in vectorization costing.  The testcase
> shows that if we vectorize the returned value but the function
> doesn't return in memory or in a vector register but as in this
> case in an integer register pair (reg:TI ax) (bah, ABI details
> exposed late?  why's this not a parallel?) we end up spilling
> badly.
PARALLEL is used when the ABI mandates a value be returned in multiple
places.  Typically that happens when the value is returned in different
types of registers (integer, floating point, vector).

Presumably it's not a PARALLEL in this case because the value is only
returned in %eax.

> 
> The idea is to account for such spilling so if vectorization
> benefits outweight the spilling we'll vectorize anyway.
That's a pretty serious bleed of the target into the vectorizer.  But
we've already deemed that the vectorizer is going to have these target
dependencies.  So I won't object on those grounds.


> 
> I think the particular testcase could be fixed in the subreg
> pass basically undoing the vectorization but I realize that
> generally this is a too hard problem and avoiding vectorization
> is better.  Still this patch is somewhat fragile in that it
> depends on us "seeing" that the stored to decl is returned
> (see cfun_returns).
> 
> Bootstrap & regtest running on x86_64-unknown-linux-gnu.
> 
> I'd like to hear opinions on my use of hard_function_value
> and also from other target maintainers.  I'm not sure we
> have sufficient testsuite coverage of _profitable_ vectorization
> of a return value.  Feel free to add to this for your
> target.
Well, it's the right way to get at the information.  I'm not aware of
any other way to get what you want.  We could possibly hide the RTL bits
to avoid GET_MODE and friends within the vectorizer -- your call.

I'm not sure the bits in vect_mode_store_cost are right though.  ISTM
you want to penalize if and only if the return value is not stored in a
vector-capable location.  If it's a PARALLEL and any element is a
suitable vector register, then do not penalize -- the spills are
unavoidable in that case.

So I think you have to iterate over elements in the PARALLEL case to
verify none of them are suitable for holding the vector result.

I'm not entirely sure what to do with CONCAT.  I wasn't immediately
aware it could show up in that context.


Or am I missing something here?


> 
> Ok for trunk?
> 
> Thanks,
> Richard.
> 
> 2018-01-30  Richard Biener  
> 
>   PR tree-optimization/84101
>   * tree-vect-stmts.c: Include explow.h for hard_function_value.
>   (cfun_returns): New helper.
>   (vect_model_store_cost): When vectorizing a store to a decl
>   we return and the function ABI returns via a non-vector
>   register account for the possible spilling that will happen.
> 
>   * gcc.target/i386/pr84101.c: New testcase.


Jeff


Re: [PR81611] improve auto-inc

2018-02-13 Thread Jeff Law
On 01/23/2018 08:42 PM, Alexandre Oliva wrote:
> These two patches fix PR81611.
> 
> The first one improves forwprop so that we avoid adding SSA conflicting
> by forwpropping the iv increment, which may cause both the incremented
> and the original value to be live, even when the iv is copied between
> the PHI node and the increment.  We already handled the case in which
> there aren't any such copies.
> 
> Alas, this is not enough to address the problem on avr, even though it
> fixes it on e.g. ppc.  The reason is that avr rejects var+offset
> addresses, and this prevents the memory access in a post-inc code
> sequence from being adjusted to an address that auto-inc-dec can
> recognize.
> 
> The second patch adjusts auto-inc-dec to recognize a code sequence in
> which the original, unincremented pseudo is used in an address after
> it's incremented into another pseudo, and turn that into a post-inc
> address, leaving the copying for subsequent passes to eliminate.
> 
> Regstrapped on x86_64-linux-gnu, i686-linux-gnu, ppc64-linux-gnu and
> aarch64-linux-gnu.  Ok to install?
> 
> 
> I'd appreciate suggestions on how to turn the submitted testcase into a
> regression test; I suppose an avr-specific test that requires the
> auto-inc transformation is a possibility, but that feels a bit too
> limited, doesn't it?  Thoughts?  Thanks in advance,
> 
> 

[ snip the DOM change which was approved and installed ]

> [PR81611] turn inc-and-use-of-dead-orig into auto-inc
> 
> When the addressing modes available on the machine don't allow offsets
> in addresses, odds are that post-increments will be represented in
> trees and RTL as:
> 
>   y <= x + 1
>   ... *(x) ...
>   x <= y
> 
> so deal with this form so as to create auto-inc addresses that we'd
> otherwise miss.
> 
> 
> for  gcc/ChangeLog
> 
>   PR rtl-optimization/81611
>   * auto-inc-dec.c (attempt_change): Move dead note from
>   mem_insn if it's the next use of regno
>   (find_address): Take address use of reg holding
>   non-incremented value.
>   (merge_in_block): Attempt to use a mem insn that is the next
>   use of the original regno.
> ---
>  gcc/auto-inc-dec.c |   46 --
>  1 file changed, 44 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/auto-inc-dec.c b/gcc/auto-inc-dec.c
> index d02fa9d081c7..4ffbcf56a456 100644
> --- a/gcc/auto-inc-dec.c
> +++ b/gcc/auto-inc-dec.c
> @@ -508,7 +508,11 @@ attempt_change (rtx new_addr, rtx inc_reg)
>before the memory reference.  */
>gcc_assert (mov_insn);
>emit_insn_before (mov_insn, inc_insn.insn);
> -  move_dead_notes (mov_insn, inc_insn.insn, inc_insn.reg0);
> +  regno = REGNO (inc_insn.reg0);
> +  if (reg_next_use[regno] == mem_insn.insn)
> + move_dead_notes (mov_insn, mem_insn.insn, inc_insn.reg0);
> +  else
> + move_dead_notes (mov_insn, inc_insn.insn, inc_insn.reg0);
>  
>regno = REGNO (inc_insn.reg_res);
>reg_next_def[regno] = mov_insn;

> @@ -1370,6 +1385,33 @@ merge_in_block (int max_reg, basic_block bb)
> insn_is_add_or_inc = false;
>   }
>   }
> +
> +   if (ok && insn_is_add_or_inc
> +   && inc_insn.reg0 != inc_insn.reg_res)
> + {
> +   regno = REGNO (inc_insn.reg0);
> +   int luid = DF_INSN_LUID (mem_insn.insn);
> +   mem_insn.insn = get_next_ref (regno, bb, reg_next_use);
So I think a comment is warranted  right as we enter the TRUE arm.

At that point INC_INSN is an inc/dec.  But MEM_INSN is not necessarily a
memory reference.  It could be a memory reference, it could be a copy,
it could be something completely different (it's just the next insn that
references the result of the increment).  In the case we care about we
want it to be a copy of INC_INSN's REG_RES back to REG0.

ISTM that verifying MEM_INSN is a reg->reg copy (reg_res -> reg0) before
we call get_next_ref for reg0 is advisable and probably good from a
compile-time standpoint by avoiding calls into find_address.

After we call get_next_ref to get the next reference of the source of
the increment, then we're hoping to find a memory reference that uses
REG0.  But it's not guaranteed it's a memory reference insn.

I was having an awful time understanding how this code could work from
the comments until I put it under a debugger and got a sense of the
state as we entered that IF block.  Then it was much clearer :-)

I believe Georg had other testcases in subsequent comments in the BZ,
but I don't believe they were flagged as regressions.  So I think once
you check in your patch we note in the BZ that the original testcase
(and thus the regression) is fixed, but that the later tests are not.
We then remove the regression marker.

If Georg (or anyone else) does the analysis to show those later tests
are also regressions, then we'll add the regression marker 

PING^2: [PATCH] i386: Add __x86_indirect_thunk_nt_reg for -fcf-protection -mcet

2018-02-13 Thread H.J. Lu
On Thu, Feb 8, 2018 at 10:09 AM, H.J. Lu  wrote:
> On Fri, Feb 2, 2018 at 8:54 AM, H.J. Lu  wrote:
>> nocf_check attribute can be used with -fcf-protection -mcet to disable
>> control-flow check by adding NOTRACK prefix before indirect branch.
>> When -mindirect-branch=thunk-extern -mindirect-branch-register is added,
>> indirect branch via register, "notrack call/jmp reg", is converted to
>>
>> call/jmp __x86_indirect_thunk_nt_reg
>>
>> When running on machines with CET enabled, __x86_indirect_thunk_nt_reg
>> can be updated to
>>
>> notrack jmp reg
>>
>> at run-time to restore NOTRACK prefix in the original indirect branch.
>>
>> Since we don't support -mindirect-branch=thunk-extern, CET and MPX at
>> the same time, -mindirect-branch=thunk-extern is disallowed with
>> -fcf-protection=branch and -fcheck-pointer-bounds.
>>
>> Tested on i686 and x86-64.  OK for trunk?
>>
>> Thanks.
>>
>> H.J.
>> ---
>> gcc/
>>
>> PR target/84176
>> * config/i386/i386.c (ix86_set_indirect_branch_type): Issue an
>> error when -mindirect-branch=thunk-extern, -fcf-protection=branch
>> and -fcheck-pointer-bounds are used together.
>> (indirect_thunk_prefix): New enum.
>> (indirect_thunk_need_prefix): New function.
>> (indirect_thunk_name): Replace need_bnd_p with need_prefix.  Use
>> "_nt" instead of "_bnd" for NOTRACK prefix.
>> (output_indirect_thunk): Replace need_bnd_p with need_prefix.
>> (output_indirect_thunk_function): Likewise.
>> (): Likewise.
>> (ix86_code_end): Update output_indirect_thunk_function calls.
>> (ix86_output_indirect_branch_via_reg): Replace
>> ix86_bnd_prefixed_insn_p with indirect_thunk_need_prefix.
>> (ix86_output_indirect_branch_via_push): Likewise.
>> (ix86_output_function_return): Likewise.
>> * doc/invoke.texi: Document -mindirect-branch=thunk-extern is
>> incompatible with -fcf-protection=branch and
>> -fcheck-pointer-bounds.
>>
>> gcc/testsuite/
>>
>> PR target/84176
>> * gcc.target/i386/indirect-thunk-11.c: New test.
>> * gcc.target/i386/indirect-thunk-12.c: Likewise.
>> * gcc.target/i386/indirect-thunk-attr-12.c: Likewise.
>> * gcc.target/i386/indirect-thunk-attr-13.c: Likewise.
>> * gcc.target/i386/indirect-thunk-attr-14.c: Likewise.
>> * gcc.target/i386/indirect-thunk-attr-15.c: Likewise.
>> * gcc.target/i386/indirect-thunk-attr-16.c: Likewise.
>> * gcc.target/i386/indirect-thunk-extern-10.c: Likewise.
>> * gcc.target/i386/indirect-thunk-extern-8.c: Likewise.
>> * gcc.target/i386/indirect-thunk-extern-9.c: Likewise.
>
> Hi Jan,
>
> Can you review it:
>
> https://gcc.gnu.org/ml/gcc-patches/2018-02/msg00113.html
>

Hi Jan,

Can you take a look?  I have more retpoline patches which
depend on this one.

Thanks.

-- 
H.J.


PING^2: [PATCH] i386: Add TARGET_INDIRECT_BRANCH_REGISTER

2018-02-13 Thread H.J. Lu
On Thu, Feb 8, 2018 at 10:05 AM, H.J. Lu  wrote:
> On Sun, Jan 28, 2018 at 11:56 AM, H.J. Lu  wrote:
>> On Sat, Jan 27, 2018 at 2:12 PM, H.J. Lu  wrote:
>>> For
>>>
>>> ---
>>> struct C {
>>>   virtual ~C();
>>>   virtual void f();
>>> };
>>>
>>> void
>>> f (C *p)
>>> {
>>>   p->f();
>>>   p->f();
>>> }
>>> ---
>>>
>>> -mindirect-branch=thunk-extern -O2 on x86-64 GNU/Linux generates:
>>>
>>> _Z1fP1C:
>>> .LFB0:
>>> .cfi_startproc
>>> pushq   %rbx
>>> .cfi_def_cfa_offset 16
>>> .cfi_offset 3, -16
>>> movq(%rdi), %rax
>>> movq%rdi, %rbx
>>> jmp .LIND1
>>> .LIND0:
>>> pushq   16(%rax)
>>> jmp __x86_indirect_thunk
>>> .LIND1:
>>> call.LIND0
>>> movq(%rbx), %rax
>>> movq%rbx, %rdi
>>> popq%rbx
>>> .cfi_def_cfa_offset 8
>>> movq16(%rax), %rax
>>> jmp __x86_indirect_thunk_rax
>>> .cfi_endproc
>>>
>>> x86-64 is supposed to have asynchronous unwind tables by default, but
>>> there is nothing that reflects the change in the (relative) frame
>>> address after .LIND0.  That region really has to be moved outside of
>>> the .cfi_startproc/.cfi_endproc bracket.
>>>
>>> This patch adds TARGET_INDIRECT_BRANCH_REGISTER to force indirect
>>> branch via register when -mindirect-branch=thunk-extern is used.  Now,
>>> -mindirect-branch=thunk-extern -O2 on x86-64 GNU/Linux generates:
>>>
>>> _Z1fP1C:
>>> .LFB0:
>>> .cfi_startproc
>>> pushq   %rbx
>>> .cfi_def_cfa_offset 16
>>> .cfi_offset 3, -16
>>> movq(%rdi), %rax
>>> movq%rdi, %rbx
>>> movq16(%rax), %rax
>>> call__x86_indirect_thunk_rax
>>> movq(%rbx), %rax
>>> movq%rbx, %rdi
>>> popq%rbx
>>> .cfi_def_cfa_offset 8
>>> movq16(%rax), %rax
>>> jmp __x86_indirect_thunk_rax
>>> .cfi_endproc
>>>
>>> Now "-mindirect-branch=thunk-extern" is equivalent to
>>> "-mindirect-branch=thunk-extern -mindirect-branch-register", which is
>>> used by Linux kernel.
>>>
>>> Tested on i686 and x86-64.  OK for trunk?
>>>
>>> Thanks.
>>>
>>> H.J.
>>> 
>>> gcc/
>>>
>>> PR target/84039
>>> * config/i386/constraints.md (Bs): Replace
>>> ix86_indirect_branch_register with
>>> TARGET_INDIRECT_BRANCH_REGISTER.
>>> (Bw): Likewise.
>>> * config/i386/i386.md (indirect_jump): Likewise.
>>> (tablejump): Likewise.
>>> (*sibcall_memory): Likewise.
>>> (*sibcall_value_memory): Likewise.
>>> Peepholes of indirect call and jump via memory: Likewise.
>>> * config/i386/i386.opt: Likewise.
>>> * config/i386/predicates.md (indirect_branch_operand): Likewise.
>>> (GOT_memory_operand): Likewise.
>>> (call_insn_operand): Likewise.
>>> (sibcall_insn_operand): Likewise.
>>> (GOT32_symbol_operand): Likewise.
>>> * config/i386/i386.h (TARGET_INDIRECT_BRANCH_REGISTER): New.
>>>
>>
>> Here is the updated patch  to disallow *sibcall_GOT_32 and 
>> *sibcall_value_GOT_32
>> for TARGET_INDIRECT_BRANCH_REGISTER.
>>
>> Tested on i686 and x86-64.  OK for trunk?
>>
>
> Hi Jan,
>
> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg02233.html
>
> Is OK for trunk?
>

Hi Jan,

Can you take a look?  I have more retpoline patches which
depend on this one.

Thanks.


-- 
H.J.


Re: [PATCH] FIx endless match.pd recursion on cst1 + cst2 + cst3 (PR tree-optimization/84334)

2018-02-13 Thread Jakub Jelinek
On Tue, Feb 13, 2018 at 03:28:25PM -0400, Marc Glisse wrote:
> On Tue, 13 Feb 2018, Richard Biener wrote:
> 
> > On February 13, 2018 6:51:29 PM GMT+01:00, Jakub Jelinek  
> > wrote:
> > > Hi!
> > > 
> > > On the following testcase, we recurse infinitely, because
> > > we have float re-association enabled, but also rounding-math, so
> > > we try to optimize (cst1 + cst2) + cst3 as (cst2 + cst3) + cst1
> > > but (cst2 + cst3) doesn't simplify and we try again and optimize
> > > it as (cst3 + cst1) + cst2 and then (cst1 + cst2) + cst3 and so on
> > > forever.  If @0 is not a CONSTANT_CLASS_P, there is not a problem,
> > > if it is, the code just checks if we can actually simplify the
> > > operation between cst2 and cst3 into a constant.
> > 
> > Is there a reason to try simplifying at all for constant @0?
> 
> Yes. cst2+cst3 might simplify (the operation happens to be exact and not
> require rounding), which leaves us with only one addition instead of 2.

Yeah, exactly, e.g.

/* { dg-do compile } */
/* { dg-options "-Ofast -frounding-math" } */

float
foo (void)
{
  float a = 9.99974752427078783512115478515625e-7f;
  float b = 1024.0f;
  float c = 2048.0f;
  return a + b + c;
}

would no longer be optimized into a single addition rather than 2.

Jakub


Re: [PATCH, rs6000] Fix PR84279, powerpc64le ICE on cvc4

2018-02-13 Thread Peter Bergner
On 2/13/18 5:51 PM, Segher Boessenkool wrote:
> On Tue, Feb 13, 2018 at 05:07:26PM -0600, Peter Bergner wrote:
>> It's up to you whether you want the backport because you don't trust
>> me being able to create a failing test case. :-)
> 
> We can backport without having a failing testcase.  Let's do that for 7
> at least?

Ok, I'll test the back port to GCC 7 and commit if no regressions.

Peter





Re: [PATCH, rs6000] Fix PR84279, powerpc64le ICE on cvc4

2018-02-13 Thread Segher Boessenkool
On Tue, Feb 13, 2018 at 05:07:26PM -0600, Peter Bergner wrote:
> On 2/13/18 4:34 PM, Segher Boessenkool wrote:
> >> This patch passed bootstrap and retesting on powerpc64le-linux with
> >> no regressions.  Ok for mainline?
> > 
> > Okay, thanks!  Does this need backports?
> 
> Committed with your suggested change below.  Thanks!
> 
> It'd be easy to backport and should be fairly harmless.  That said, I was
> never able to create a simpler test case, using __builtin_altivec_lvx()
> that would end up being re-recog'd as vsx_movv4si_64, so the only way I
> know we can hit this is with Kelvin's optimization that replaces aligned
> vsx loads/stores with altivec loads/stores and that optimization is only
> on trunk.
> 
> It's up to you whether you want the backport because you don't trust
> me being able to create a failing test case. :-)

We can backport without having a failing testcase.  Let's do that for 7
at least?


Segher


[PATCH, rs6000] (v2) PR84220 remove RS6000_BTI_NOT_OPAQUE refs from builtins table

2018-02-13 Thread Will Schmidt
Hi,
This is (v2), which is notably more comprehensive than (v1).
  Some of our builtin definitions were allowing invalid parameters, and a
subsequent ICE (on invalid code) were the result.  This is due to the use of
RS6000_BTI_NOT_OPAQUE (which allowed vector arguments), where a
RS6000_BTI_INTSI appears to be a more appropriate choice.
This change adjusts the definitions for the VEC_SLD, VEC_SLDW, vec_XXSLDWI
and VEC_XXPERMDI entries.

Testcases have been added to ensure we generate the 'invalid intrinsic'
message as is appropriate, instead of ICEing.
Giving credit, this was found by Peter Bergner while working another issue.

Sniff-tests passed on P8.  Doing larger reg-test across power systems now.
OK for trunk?
Thanks,
-Will

[gcc]

2018-02-13  Will Schmidt  

PR target/84220
* config/rs6000/rs6000-c.c: Update definitions for
ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VEC_SLDW,
VSX_BUILTIN_VEC_XXSLDWI, and ALTIVEC_BUILTIN_VEC_XXPERMDI builtins.

[testsuite]

2018-02-13  Will Schmidt  

PR target/84220
* gcc.target/powerpc/pr84220-sld.c: New test.
* gcc.target/powerpc/pr84220-sld2.c: New test.
* gcc.target/powerpc/pr84220-sldw.c: New test.
* gcc.target/powerpc/pr84220-xxperm.c: New test.
* gcc.target/powerpc/pr84220-xxsld.c: New test.

diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index a68be51..843a375 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -3654,64 +3654,65 @@ const struct altivec_builtin_types 
altivec_overloaded_builtins[] = {
   { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_16QI,
 RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, 
RS6000_BTI_bool_V16QI },
   { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_16QI,
 RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, 
RS6000_BTI_unsigned_V16QI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_4SF,
-RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_NOT_OPAQUE },
+RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_INTSI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_4SI,
-RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_NOT_OPAQUE },
+RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_INTSI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_4SI,
-RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, 
RS6000_BTI_NOT_OPAQUE },
+RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, 
RS6000_BTI_INTSI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_4SI,
-RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 
RS6000_BTI_unsigned_V4SI, RS6000_BTI_NOT_OPAQUE },
+RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 
RS6000_BTI_unsigned_V4SI, RS6000_BTI_INTSI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_8HI,
-RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_NOT_OPAQUE },
+RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_INTSI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_8HI,
-RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, 
RS6000_BTI_unsigned_V8HI, RS6000_BTI_NOT_OPAQUE },
+RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, 
RS6000_BTI_unsigned_V8HI, RS6000_BTI_INTSI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_8HI,
-RS6000_BTI_bool_V8HI, RS6000_BTI_bool_V8HI, RS6000_BTI_bool_V8HI, 
RS6000_BTI_NOT_OPAQUE },
+RS6000_BTI_bool_V8HI, RS6000_BTI_bool_V8HI, RS6000_BTI_bool_V8HI, 
RS6000_BTI_INTSI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_8HI,
-RS6000_BTI_pixel_V8HI, RS6000_BTI_pixel_V8HI, RS6000_BTI_pixel_V8HI, 
RS6000_BTI_NOT_OPAQUE },
+RS6000_BTI_pixel_V8HI, RS6000_BTI_pixel_V8HI, RS6000_BTI_pixel_V8HI, 
RS6000_BTI_INTSI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_16QI,
-RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_V16QI, 
RS6000_BTI_NOT_OPAQUE },
+RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_INTSI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_16QI,
-RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 
RS6000_BTI_unsigned_V16QI, RS6000_BTI_NOT_OPAQUE },
+RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 
RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_16QI,
-RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, 
RS6000_BTI_NOT_OPAQUE },
+RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, 
RS6000_BTI_INTSI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_2DF,
-RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_NOT_OPAQUE },
+RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_INTSI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_2DI,
-RS6000_BTI_bool_V2DI, 

Re: [PATCH, rs6000] Begin deprecation of -maltivec=be

2018-02-13 Thread Segher Boessenkool
On Tue, Feb 13, 2018 at 04:11:36PM -0600, Kelvin Nilsen wrote:
> PR 78303 was recently marked RESOLVED, WONTFIX.  The resolution was to
> deprecate the troublesome command-line option.
> 
> This patch begins the process of deprecation by issuing a warning
> message when this command-line option is specified.  The patch has
> bootstrapped and tested without regressions on
> powerpc64le-unknown-linux.  Is this ok for trunk?

Is -w appropriate for all these testcases, i.e. would we want to see
any other warnings that are generated for them?  Or will we remember
to remove -w when we remove -maltivec=be :-)

Will you write a wwwdocs patch as well, please?  Is the plan to remove
this in GCC 9 already?

> --- gcc/config/rs6000/rs6000.c(revision 257395)
> +++ gcc/config/rs6000/rs6000.c(working copy)
> @@ -4028,6 +4028,13 @@ rs6000_option_override_internal (bool global_init_
>if (global_init_p)
>  rs6000_isa_flags_explicit = global_options_set.x_rs6000_isa_flags;
>  
> +  /* We plan to deprecate the -maltivec=be option.  For now, just
> + issue a warning message.  */
> +  if (global_init_p
> +  && (rs6000_altivec_element_order == 2))

Please remove the useless parens.

Otherwise okay for trunk.  Thanks!


Segher


Re: [PATCH] adjust warning_n() to take uhwi (PR 84207)

2018-02-13 Thread Martin Sebor

On 02/13/2018 04:05 PM, David Malcolm wrote:

On Tue, 2018-02-13 at 15:37 -0700, Martin Sebor wrote:

On 02/13/2018 01:59 PM, Manuel López-Ibáñez wrote:



On 13 Feb 2018 5:58 pm, "Martin Sebor" > wrote:


I wanted to make the _n() functions like warning_n() more
robust by letting them accept a tree argument (as well as
offset_int and wide_int) in addition to HOST_WIDE_INT but
I can't do it if they can't work with these types.


There must be a tree-diagnostics.c where you can add those
functions and
then call the general diagnostic functions. Same for RTL.


I don't see how to do that.

Here's a sketch of what I tried to do:

   struct IntegerConverter
   {
 union {
   tree t;
   unsigned HOST_WIDE_INT hwi;
   // buffer for offset_int, wide_int, etc.
 } value;

 IntegerConverter (tree t)
 {
   value.t = t;
 }

 IntegerConverter (unsigned HOST_WIDE_INT x)
 {
   value.x = x;
 }

 ...
   };

   void error_n (int, const IntegerConverter &, const char*, ...);
   ...

With that, the call

   error_n (loc, t, "%E thing", "%E things", t);

works when t is a tree, and the call to the same function

   error_n (loc, i, "%wu thing", "%wu things", i);

also works when i is a HOST_WIDE_INT.  I chose this approach
to avoid introducing additional overloads of the error_n()
functions.


Why can't you simply introduce a tree-diagnostic.h and tree-
diagnostic.c and put overloads there?

In diagnostic.c the various _n all use diagnostic_n_impl, which
ultimately calls out to
  ngettext (singular_gmsgid, plural_gmsgid, n)
so there would presumably be an analogous "diagnostic_n_hwi_impl" (or
somesuch) function there to do the same thing.


The analogous function would, in fact, do the exact same
thing.



That would keep the tree stuff separated from the rest of the
diagnostics subsystem.

Or am I missing something?


(Sorry if I'm being overly verbose below.  I'm just trying
to make things clear.)

I wasn't introducing overloads.  I was replacing each of
the existing error_n() and related functions with one that
takes an IntegerConverter argument instead of the int n
argument.  The IntegerConverter constructor is overloaded
on all the types I want to be able to call error_n() with
(tree, offset_int, wide_int, and perhaps also the various
flavors of poly_int).  That has the effect if overloading
each of the functions on each of the types
the IntegerConverter class constructor is overloaded on.

Introducing new overloads of error_n() et al. and keeping
the ones in diagnostic-core.h would lead to ambiguities.
E.g., the call to

  error_n (loc, i, "%wu thing", "%wu things", i);

would be ambiguous if i were signed long (converting it
to unsigned long is no better than converting it to int).

I could change the existing error_n() to take HOST_WIDE_INT
instead of int as an argument and also add the overload(s)
as you suggest.  But the last time this issue came up Jakub
had a strong preference for not adding overloads of these
functions.  It also didn't occur to me to duplicate all
these functions.  Does that seem like a good approach to
you? (I mean, to have two parallel sets of functions that
do the same thing and only differ in the type of one of
their arguments.  If duplicating the signatures is
thought to be the right solution to achieve modularity
then I would think that ultimately one should call
the other.)

Martin



Hope this is constructive
Dave


Note that pretty-printer.c works in the same way.


It works that way for arguments passed through the ellipsis.
I was trying to use tree (indirectly) in the signature of
the _n() functions and for that I need "tree.h" etc. in
diagnostic-core.h and be able to call functions from
the header in diagnostic-core.c.  I suppose I could make
the calls indirectly somehow but I can't avoid including
GCC headers in diagnostic-core.h.

Martin





[PATCH, rs6000, committed] Fix PR84372: test case gcc.target/powerpc/lvsl-lvsr.c fails on power9

2018-02-13 Thread Peter Bergner
Same problem as PR84365 and same fix.  Committed as obvious.

Peter

PR target/84372
* gcc.target/powerpc/lvsl-lvsr.c: Also match lxv when compiling
with -mcpu=power9.
 
Index: gcc/testsuite/gcc.target/powerpc/lvsl-lvsr.c
===
--- gcc/testsuite/gcc.target/powerpc/lvsl-lvsr.c(revision 257647)
+++ gcc/testsuite/gcc.target/powerpc/lvsl-lvsr.c(working copy)
@@ -6,7 +6,7 @@
 /* { dg-options "-O0 -Wno-deprecated" } */
 /* { dg-final { scan-assembler-times "lvsl" 2 } } */
 /* { dg-final { scan-assembler-times "lvsr" 2 } } */
-/* { dg-final { scan-assembler-times "lxvd2x" 2 } } */
+/* { dg-final { scan-assembler-times {\mlxvd2x\M|\mlxv\M} 2 } } */
 /* { dg-final { scan-assembler-times "vperm" 2 } } */



Re: [PATCH, rs6000] PR84220 fix altivec_vec_sld and vec_sldw intrinsic definitions

2018-02-13 Thread Will Schmidt
On Thu, 2018-02-08 at 17:48 -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Feb 07, 2018 at 09:14:59AM -0600, Will Schmidt wrote:
> >   Our VEC_SLD definitions were mistakenly allowing the third argument to be
> > of an invalid type, triggering an ICE (on invalid code) later in the build
> > process.  This fixes those definitions.  The nearby VEC_SLDW definitions 
> > have
> > the same issue, those have been fixed as part of this patch too.
> > Testcases have been added to ensure we generate the 'invalid intrinsic'
> > message as is appropriate, instead of ICEing.
> > Giving proper credit, this was found by Peter Bergner while working a
> > different issue. :-)
> > 
> > Sniff-tests passed on P8.  Doing larger reg-test across power systems now.
> > OK for trunk?
> > And,.. do we want this one backported too?
> 
> > diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
> > index a68be51..26f9990 100644
> > --- a/gcc/config/rs6000/rs6000-c.c
> > +++ b/gcc/config/rs6000/rs6000-c.c
> > @@ -3654,39 +3654,39 @@ const struct altivec_builtin_types 
> > altivec_overloaded_builtins[] = {
> >{ ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_16QI,
> >  RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, 
> > RS6000_BTI_bool_V16QI },
> >{ ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_16QI,
> >  RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, 
> > RS6000_BTI_unsigned_V16QI },
> >{ ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_4SF,
> > -RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 
> > RS6000_BTI_NOT_OPAQUE },
> > +RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_INTSI },
> 
> It isn't clear to me what RS6000_BTI_NOT_OPAQUE means...  rs6000-c.c says:
> 
> /* For arguments after the last, we have RS6000_BTI_NOT_OPAQUE in
>the opX fields.  */
> 
> (whatever that means!), and the following code seems to allow anything in
> such args?  If you understand it, please update some comments somewhere?

I dug in a bit more to try to understand the history and context.

The RS6000_BTI_NOT_OPAQUE entry was added as part of the (large) AltiVec
rewrite (By Paolo Bonzini) in Apr 2005.
The ALTIVEC_BUILTIN_VEC_SLD entries, and the "for arguments after the
last" code chunk in altivec_resolve_overloaded_builtin() was part of
that addition, and pretty much un-touched since that time.

> >{ VSX_BUILTIN_VEC_XXPERMDI, VSX_BUILTIN_XXPERMDI_2DF,
> 
> XXPERMDI is the only other builtin that uses NOT_OPAQUE, does that suffer
> from the same problem?  If so, you can completely delete NOT_OPAQUE it
> seems?

Yes and no.   I've generated a few more tests that show the problem also
included vec_xxperms.   
SO,.. I've updated the patch to fix those references too.

With that change, all references to NOT_OPAQUE in the builtins table are
removed.

(I'll be posting that momentarily while regtests run overnight..)

So then with the idea of cleaning up all remaining references to
_NOT_OPAQUE..  I got stuck.  :-)

The _NOT_OPAQUE definition is the first entry in the (rs6000.h: enum
rs6000_builtin_type_index)

enum rs6000_builtin_type_index
{
-  RS6000_BTI_NOT_OPAQUE,
+  RS6000_BTI_unset,
   RS6000_BTI_opaque_V2SI,


And the only other reference is in this chunk of code in rs6000-c.c:
altivec_resolve_overloaded_builtin() 

  /* For arguments after the last, we have RS6000_BTI_NOT_OPAQUE in
   the opX fields.  */
for (; desc->code == fcode; desc++)
  {
   if ((desc->op1 == RS6000_BTI_NOT_OPAQUE
|| rs6000_builtin_type_compatible (types[0], desc->op1))
   && (desc->op2 == RS6000_BTI_NOT_OPAQUE
   || rs6000_builtin_type_compatible (types[1], desc->op2))
   && (desc->op3 == RS6000_BTI_NOT_OPAQUE
   || rs6000_builtin_type_compatible (types[2], desc->op3)))


So there should no longer be any matches to ...NOT_OPAQUE, but if I
comment out the snippets "== ..NOT_OPAQUE || ", lots of ICE's show up.

which makes me wonder if the check here is more of a "if desc->op1 was
not explicitly set,... " thing.  But it's not clear to me. 

So I'm deliberately not touching this chunk of code at this time. :-)

Thanks
-Will


> 
> So what is/was it for, that is what I wonder.
> 
> Your patch looks fine if you can clear that up :-)
> 
> 
> Segher
> 




Re: [PATCH, rs6000] Fix PR84279, powerpc64le ICE on cvc4

2018-02-13 Thread Peter Bergner
On 2/13/18 4:34 PM, Segher Boessenkool wrote:
>> This patch passed bootstrap and retesting on powerpc64le-linux with
>> no regressions.  Ok for mainline?
> 
> Okay, thanks!  Does this need backports?

Committed with your suggested change below.  Thanks!

It'd be easy to backport and should be fairly harmless.  That said, I was
never able to create a simpler test case, using __builtin_altivec_lvx()
that would end up being re-recog'd as vsx_movv4si_64, so the only way I
know we can hit this is with Kelvin's optimization that replaces aligned
vsx loads/stores with altivec loads/stores and that optimization is only
on trunk.

It's up to you whether you want the backport because you don't trust
me being able to create a failing test case. :-)


>> --- gcc/testsuite/g++.dg/pr84279.C   (nonexistent)
>> +++ gcc/testsuite/g++.dg/pr84279.C   (working copy)
>> @@ -0,0 +1,35 @@
>> +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
> 
> I don't think this needs lp64?

Yeah, I think you're right.  I'll remove it.  Thanks.

Peter



Re: [PATCH] adjust warning_n() to take uhwi (PR 84207)

2018-02-13 Thread David Malcolm
On Tue, 2018-02-13 at 15:37 -0700, Martin Sebor wrote:
> On 02/13/2018 01:59 PM, Manuel López-Ibáñez wrote:
> > 
> > 
> > On 13 Feb 2018 5:58 pm, "Martin Sebor"  > > wrote:
> > 
> > 
> > I wanted to make the _n() functions like warning_n() more
> > robust by letting them accept a tree argument (as well as
> > offset_int and wide_int) in addition to HOST_WIDE_INT but
> > I can't do it if they can't work with these types.
> > 
> > 
> > There must be a tree-diagnostics.c where you can add those
> > functions and
> > then call the general diagnostic functions. Same for RTL.
> 
> I don't see how to do that.
> 
> Here's a sketch of what I tried to do:
> 
>struct IntegerConverter
>{
>  union {
>tree t;
>unsigned HOST_WIDE_INT hwi;
>// buffer for offset_int, wide_int, etc.
>  } value;
> 
>  IntegerConverter (tree t)
>  {
>value.t = t;
>  }
> 
>  IntegerConverter (unsigned HOST_WIDE_INT x)
>  {
>value.x = x;
>  }
> 
>  ...
>};
> 
>void error_n (int, const IntegerConverter &, const char*, ...);
>...
> 
> With that, the call
> 
>error_n (loc, t, "%E thing", "%E things", t);
> 
> works when t is a tree, and the call to the same function
> 
>error_n (loc, i, "%wu thing", "%wu things", i);
> 
> also works when i is a HOST_WIDE_INT.  I chose this approach
> to avoid introducing additional overloads of the error_n()
> functions.

Why can't you simply introduce a tree-diagnostic.h and tree-
diagnostic.c and put overloads there?

In diagnostic.c the various _n all use diagnostic_n_impl, which
ultimately calls out to 
  ngettext (singular_gmsgid, plural_gmsgid, n)
so there would presumably be an analogous "diagnostic_n_hwi_impl" (or
somesuch) function there to do the same thing.

That would keep the tree stuff separated from the rest of the
diagnostics subsystem.

Or am I missing something?

Hope this is constructive
Dave

> > Note that pretty-printer.c works in the same way.
> 
> It works that way for arguments passed through the ellipsis.
> I was trying to use tree (indirectly) in the signature of
> the _n() functions and for that I need "tree.h" etc. in
> diagnostic-core.h and be able to call functions from
> the header in diagnostic-core.c.  I suppose I could make
> the calls indirectly somehow but I can't avoid including
> GCC headers in diagnostic-core.h.
> 
> Martin
> 


Re: [PR tree-optimization/84047] missing -Warray-bounds on an out-of-bounds index

2018-02-13 Thread Jeff Law
On 02/08/2018 03:38 AM, Richard Biener wrote:
> On Thu, Feb 1, 2018 at 6:42 PM, Aldy Hernandez  wrote:
>> Since my patch isn't the easy one liner I wanted it to be, perhaps we
>> should concentrate on Martin's patch, which is more robust, and has
>> testcases to boot!  His patch from last week also fixes a couple other
>> PRs.
>>
>> Richard, would this be acceptable?  That is, could you or Jakub review
>> Martin's all-encompassing patch?  If so, I'll drop mine.
> 
> Sorry, no - this one looks way too complicated.
> 
>> Also, could someone pontificate on whether we want to fix
>> -Warray-bounds regressions for this release cycle?
> 
> Remove bogus ones?  Yes.  Add "missing ones"?  No.
Seems reasonable.  I'll retarget the missed warning stuff for gcc-9 and
we'll consider those out-of-scope for gcc-8.

Still in scope would be bogus warnings.

Jeff


Re: [RFC][AARCH64] Machine reorg pass for aarch64/Falkor to handle prefetcher tag collision

2018-02-13 Thread Kugan Vivekanandarajah
Hi Kyrill,

On 13 February 2018 at 20:47, Kyrill  Tkachov
 wrote:
> Hi Kugan,
>
> On 12/02/18 23:58, Kugan Vivekanandarajah wrote:
>>
>> Implements a machine reorg pass for aarch64/Falkor to handle
>> prefetcher tag collision. This is strictly not part of the loop
>> unroller but for Falkor, unrolling can make h/w prefetcher performing
>> badly if there are too much tag collisions based on the discussions in
>> https://gcc.gnu.org/ml/gcc/2017-10/msg00178.html.
>>
>
> Could you expand a bit more on what transformation exactly this pass does?

This is similar to what LLVM does in https://reviews.llvm.org/D35366.

Falkor hardware prefetcher works well when signature of the prefetches
(or tags as computed in the patch - similar to LLVM) are different for
different memory streams. If different memory streams  have the same
signature, it can result in bad performance. This machine reorg pass
tries to change the signature of memory loads by changing the base
register with a free register.

> From my understanding the loads that use the same base
> register and offset and have the same destination register
> are considered part of the same stream by the hardware prefetcher, so for
> example:
> ldr x0, [x1, 16] (load1)
> ... (set x1 to something else)
> ldr x0, [x1, 16] (load2)
>
> will cause the prefetcher to think that both loads are part of the same
> stream,
> so this pass tries to rewrite the sequence into:
> ldr x0, [x1, 16]
> ... (set x1 to something else)
> mov tmp, x1
> ldr x0, [tmp, 16]
>
> Where the tag/signature is the combination of destination x0, base x1 and
> offset 16.
> Is this a fair description?

This is precisely what is happening.

>
> I've got some comments on the patch itself
>
>> gcc/ChangeLog:
>>
>> 2018-02-12  Kugan Vivekanandarajah 
>>
>> * config/aarch64/aarch64.c (iv_p): New.
>> (strided_load_p): Likwise.
>> (make_tag): Likesie.
>> (get_load_info): Likewise.
>> (aarch64_reorg): Likewise.
>> (TARGET_MACHINE_DEPENDENT_REORG): Implement new target hook.
>
>
> New functions need function comments describing the arguments at least.
> Functions like make_tag, get_load_info etc can get tricky to maintain
> without
> some documentation on what they are supposed to accept and return.

I wil add the comments.

>
> I think the pass should be enabled at certain optimisation levels, say -O2?
> I don't think it would be desirable at -Os since it creates extra moves that
> increase code size.

Ok, I will change this.

>
> That being said, I would recommend you implement this as an aarch64-specific
> pass,
> in a similar way to cortex-a57-fma-steering.c. That way you can register it
> in
> aarch64-passes.def and have flexibility as to when exactly the pass gets to
> run
> (i.e. you wouldn't be limited by when machine_reorg gets run).
>
> Also, I suggest you don't use the "if (aarch64_tune != falkor) return;" way
> of
> gating this pass. Do it in a similar way to the FMA steering pass that is,
> define a new flag in aarch64-tuning-flags.def and use it in the tune_flags
> field
> of the falkor tuning struct.

Ok, I will revise the patch.


Thanks,
Kugan

>
> Hope this helps,
> Kyrill


Re: [PATCH] adjust warning_n() to take uhwi (PR 84207)

2018-02-13 Thread Martin Sebor

On 02/13/2018 01:59 PM, Manuel López-Ibáñez wrote:



On 13 Feb 2018 5:58 pm, "Martin Sebor" > wrote:


I wanted to make the _n() functions like warning_n() more
robust by letting them accept a tree argument (as well as
offset_int and wide_int) in addition to HOST_WIDE_INT but
I can't do it if they can't work with these types.


There must be a tree-diagnostics.c where you can add those functions and
then call the general diagnostic functions. Same for RTL.


I don't see how to do that.

Here's a sketch of what I tried to do:

  struct IntegerConverter
  {
union {
  tree t;
  unsigned HOST_WIDE_INT hwi;
  // buffer for offset_int, wide_int, etc.
} value;

IntegerConverter (tree t)
{
  value.t = t;
}

IntegerConverter (unsigned HOST_WIDE_INT x)
{
  value.x = x;
}

...
  };

  void error_n (int, const IntegerConverter &, const char*, ...);
  ...

With that, the call

  error_n (loc, t, "%E thing", "%E things", t);

works when t is a tree, and the call to the same function

  error_n (loc, i, "%wu thing", "%wu things", i);

also works when i is a HOST_WIDE_INT.  I chose this approach
to avoid introducing additional overloads of the error_n()
functions.


Note that pretty-printer.c works in the same way.


It works that way for arguments passed through the ellipsis.
I was trying to use tree (indirectly) in the signature of
the _n() functions and for that I need "tree.h" etc. in
diagnostic-core.h and be able to call functions from
the header in diagnostic-core.c.  I suppose I could make
the calls indirectly somehow but I can't avoid including
GCC headers in diagnostic-core.h.

Martin



Re: [PATCH, rs6000] Fix PR84279, powerpc64le ICE on cvc4

2018-02-13 Thread Segher Boessenkool
Hi!

On Mon, Feb 12, 2018 at 09:34:30PM -0600, Peter Bergner wrote:
> PR84279 is a similar problem to PR83399, in that we generate an altivec
> load/store through an explicit call to the altivec_{l,st}vx_v4si_2op
> pattern and then due to spilling, we end up calling recog() and we match
> an earlier pattern, in this case vsx_movv4si_64bit.  That is ok, since
> this pattern can generate the lvx/stvx insns the altivec patterm can.
> However, due to a constraint bug, we end up using the wrong alternative.

> Now we _should_ match using the second to last alternative, but we end up
> matching the 8th alternative ("??Y" and "r").  The 8th alternative is used for
> storing a GPR, which we have, but the mem we're trying to store to does not
> have a valid address for a GPR store.  The "bug" is that the "Y" constraint
> code, which is implemented by  mem_operand_gpr() allows our altivec address
> when it should not.  The following patch which fixes the ICE adds code to
> mem_operand_gpr() which disallows such addresses.
> 
> This patch passed bootstrap and retesting on powerpc64le-linux with
> no regressions.  Ok for mainline?

Okay, thanks!  Does this need backports?

> --- gcc/testsuite/g++.dg/pr84279.C(nonexistent)
> +++ gcc/testsuite/g++.dg/pr84279.C(working copy)
> @@ -0,0 +1,35 @@
> +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */

I don't think this needs lp64?


Segher


[PATCH, rs6000, committed] Fix PR80370: Invalid option used in test case gcc.target/powerpc/builtins-3-p9-runnable.c

2018-02-13 Thread Peter Bergner
This patch removes an undocumented option that has since been deprecated and
removed and was causing this test case to FAIL.  The obvious patch below was
approved by Segher offline.  Committed.

Peter

PR target/84370
* gcc.target/powerpc/builtins-3-p9-runnable.c: Remove deprecated option.

Index: gcc/testsuite/gcc.target/powerpc/builtins-3-p9-runnable.c
===
--- gcc/testsuite/gcc.target/powerpc/builtins-3-p9-runnable.c   (revision 
257641)
+++ gcc/testsuite/gcc.target/powerpc/builtins-3-p9-runnable.c   (revision 
257642)
@@ -1,6 +1,6 @@
 /* { dg-do run { target { powerpc64*-*-* && { lp64 && p9vector_hw } } } } */
 /* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
"-mcpu=power9" } } */
-/* { dg-options "-mcpu=power9 -O2 -mupper-regs-di" } */
+/* { dg-options "-mcpu=power9 -O2" } */
 
 #include  // vector
 



Re: [RFC] Adds a target hook

2018-02-13 Thread Kugan Vivekanandarajah
Hi Kyrill,

Thanks for the review.

On 13 February 2018 at 20:58, Kyrill  Tkachov
 wrote:
> Hi Kugan,
>
> On 12/02/18 23:53, Kugan Vivekanandarajah wrote:
>>
>> Adds a target hook TARGET_HW_MAX_MEM_READ_STREAMS. Loop unroller, if
>> defined, will try to limit the unrolling factor based on this.
>>
>
> Could you elaborate a bit on this, in particular how is this different
> from the PARAM_SIMULTANEOUS_PREFETCHES param that describes
> "the number of prefetches that can run at the same time".
> The descriptions seem very similar to me...

You are right that they are similar. I wanted to keep it separate
because not all the micro-architectures might prefer limiting unroll
factor this way. If we keep this separate, we will have the option to
disable this without affecting the rest.

> Incidentally, since this is expected to always be an integer, maybe
> make it into a param so it is consistent with the other prefetch-related
> tuning numbers?

Ok, I will change it into param in the next iteration.

Thanks,
Kugan
>
> Thanks,
> Kyrill
>
>
>>
>> gcc/ChangeLog:
>>
>> 2018-02-12  Kugan Vivekanandarajah 
>>
>> * doc/tm.texi.in (TARGET_HW_MAX_MEM_READ_STREAMS): Dcoument.
>> * doc/tm.texi: Regenerate.
>> * target.def (hw_max_mem_read_streams): New target hook.
>
>


Re: [PATCH] diagnose specializations of deprecated templates (PR c++/84318)

2018-02-13 Thread Martin Sebor

On 02/13/2018 01:09 PM, Jason Merrill wrote:

On Tue, Feb 13, 2018 at 2:59 PM, Martin Sebor  wrote:

On 02/13/2018 12:15 PM, Jason Merrill wrote:

On Tue, Feb 13, 2018 at 1:31 PM, Martin Sebor  wrote:

On 02/13/2018 09:24 AM, Martin Sebor wrote:



On 02/13/2018 08:35 AM, Martin Sebor wrote:



On 02/13/2018 07:40 AM, Jason Merrill wrote:



On Mon, Feb 12, 2018 at 6:32 PM, Martin Sebor 
wrote:



While testing my fix for 83871 (handling attributes on explicit
specializations) I noticed another old regression: while GCC 4.4
would diagnose declarations of explicit specializations of all
primary templates declared deprecated, GCC 4.5 and later only
diagnose declarations of explicit specializations of class
templates but not those of function or variable templates.




Hmm, the discussion on the core reflector seemed to be agreeing that
we want to be able to define non-deprecated specializations of a
deprecated primary template.




Yes, that's what Richard wanted to do.  The only way to do it
within the existing constraints(*) is to define a non-deprecated
primary, and a deprecated partial specialization.  This is in line
with that approach and supported by Clang and all other compilers
I tested (including Clang).




To clarify, this approach works for class templates (e.g., like
std::numeric_limits that was mentioned in the core discussion)
and for variable templates.  Functions have no partial
specilizations so they have to be overloaded to achieve the same
effect.

Implementations don't treat the deprecated attribute on partial
specializations consistently.

EDG accepts and honors it on class template partial specializations
but rejects it with an error on those of variables.

Clang accepts but silently ignores it on class template partial
specializations and rejects with an error it on variables.

MSVC accepts and honors it on variables but silently ignores it
on class template partial specializations.

GCC ignores it silently on class partial specializations and
with a warning on variables (I opened bug 84347 to track this
and to have GCC honor is everywhere).

This is clearly a mess, which isn't surprising given how poorly
specified this is in the standard.  But from the test cases and
from the core discussion it seems clear that deprecating
a template, including its partial specializations (as opposed
to just a single explicit specialization) is desirable and
already supported, and that the wording in the standard just
needs to be adjusted to reflect that.



Martin

[*] Except (as Richard noted) that the standard doesn't seem to
allow a template to be deprecated.  I think that's a bug in the
spec because all implementations allow it to some degree.




One other note.  While thinking about this problem during
the core discussion, another approach to deprecating a primary
template without also deprecating all of its specializations
occurred to me.

1) First declare the primary template without [[deprecated]].
2) Next declare its non-deprecated specializations (partial
   or explicit).
3) Finally declare the primary again, this time [[deprecated]].

Like this:

  template  structS;
  template  structS { };
  template  struct [[deprecated]] S { };
  template  struct [[deprecated]] S { };

  S si; // warning
  S sci;  // no warning
  S svi;   // warning

This works as expected with Intel ICC.  All other compilers
diagnose all three variables.  I'd say for [[deprecated]] it
should work the way ICC does.  (For [[noreturn]] the first
declaration must be [[noreturn]], so there this solution
wouldn't work also because of that, in addition to function
templates not being partially-specializable.)



My understanding of the reflector discussion, and Richard's comment in
particular, was that [[deprecated]] should apply to the instances, not
the template itself, so that declaring the primary template
[[deprecated]] doesn't affect explicit specializations.  Your last
example should work as you expect in this model, but you can also
write the simpler

template  struct [[deprecated]] S { };
template  struct S { }; // no warning



With this approach there would be no way to deprecate all of
a template's specializations) because it would always be
possible for a user to get around deprecation by defining
their own specialization, partial or explicit.


Yep.  And so he suggested that we might want to add a new way to write
attributes that do apply to the template name.


[[deprecated]] was introduced in part to make it possible for
C++ standard library implementers to add warnings for stuff
the committee has deprecated.  Most C++ deprecated features
are templates.  Declaring that [[deprecated]] isn't meant to
serve its purpose for templates and that some new form of
it is needed would make the attribute useless for standard
library implementers in the meantime, until that new form
is invented, as well as for all 

Re: [PATCH] combine: Update links correctly for new I2 (PR84169)

2018-02-13 Thread Segher Boessenkool
On Mon, Feb 12, 2018 at 05:12:20PM +0100, Jakub Jelinek wrote:
> On Mon, Feb 12, 2018 at 03:59:05PM +, Segher Boessenkool wrote:
> > If there is a LOG_LINK between two insns, this means those two insns
> > can be combined, as far as dataflow is concerned.  There never should
> > be a LOG_LINK between two unrelated insns.  If there is one, combine
> > will try to combine the insns without doing all the needed checks if
> > the earlier destination is used before the later insn, etc.
> > 
> > Unfortunately we do not update the LOG_LINKs correctly in some cases.
> > This patch fixes at least some of those cases.
> > 
> > This fixes the PR's testcase on aarch64.  Also tested on 30+ cross
> > compiler, and on powerpc64-linux {-m32,-m64}.  Will test on x86_64
> > as well before committing.
> 
> Will you check in the testcase too?
> 
> My preference would be something like following, so that it can
> be torture-tested on all targets.

Works flawlessly everywhere.  I've committed this now, thanks again!


Segher


[PATCH, rs6000] Begin deprecation of -maltivec=be

2018-02-13 Thread Kelvin Nilsen

PR 78303 was recently marked RESOLVED, WONTFIX.  The resolution was to
deprecate the troublesome command-line option.

This patch begins the process of deprecation by issuing a warning
message when this command-line option is specified.  The patch has
bootstrapped and tested without regressions on
powerpc64le-unknown-linux.  Is this ok for trunk?

gcc/ChangeLog:

2018-02-13  Kelvin Nilsen  

* config/rs6000/rs6000.c (rs6000_option_override_internal): Issue
warning message if user requests -maltivec=be.

gcc/testsuite/ChangeLog:

2018-02-13  Kelvin Nilsen  

* gcc.dg/vmx/extract-be-order.c: Disable -maltivec=be warning so
this test case still works ok.
* gcc.dg/vmx/extract-vsx-be-order.c: Likewise.
* gcc.dg/vmx/insert-be-order.c: Likewise.
* gcc.dg/vmx/insert-vsx-be-order.c: Likewise.
* gcc.dg/vmx/ld-be-order.c: Likewise.
* gcc.dg/vmx/ld-vsx-be-order.c: Likewise.
* gcc.dg/vmx/lde-be-order.c: Likewise.
* gcc.dg/vmx/ldl-be-order.c: Likewise.
* gcc.dg/vmx/ldl-vsx-be-order.c: Likewise.
* gcc.dg/vmx/merge-be-order.c: Likewise.
* gcc.dg/vmx/merge-vsx-be-order.c: Likewise.
* gcc.dg/vmx/mult-even-odd-be-order.c: Likewise.
* gcc.dg/vmx/pack-be-order.c: Likewise.
* gcc.dg/vmx/perm-be-order.c: Likewise.
* gcc.dg/vmx/splat-be-order.c: Likewise.
* gcc.dg/vmx/splat-vsx-be-order.c: Likewise.
* gcc.dg/vmx/st-be-order.c: Likewise.
* gcc.dg/vmx/st-vsx-be-order.c: Likewise.
* gcc.dg/vmx/ste-be-order.c: Likewise.
* gcc.dg/vmx/stl-be-order.c: Likewise.
* gcc.dg/vmx/stl-vsx-be-order.c: Likewise.
* gcc.dg/vmx/sum2s-be-order.c: Likewise.
* gcc.dg/vmx/unpack-be-order.c: Likewise.
* gcc.dg/vmx/vsums-be-order.c: Likewise.
* gcc.target/powerpc/vec-setup-be-long.c: Likewise.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 257395)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -4028,6 +4028,13 @@ rs6000_option_override_internal (bool global_init_
   if (global_init_p)
 rs6000_isa_flags_explicit = global_options_set.x_rs6000_isa_flags;
 
+  /* We plan to deprecate the -maltivec=be option.  For now, just
+ issue a warning message.  */
+  if (global_init_p
+  && (rs6000_altivec_element_order == 2))
+warning (0, "%qs command-line option is deprecated",
+"-maltivec=be");
+
   /* On 64-bit Darwin, power alignment is ABI-incompatible with some C
  library functions, so warn about it. The flag may be useful for
  performance studies from time to time though, so don't disable it
Index: gcc/testsuite/gcc.dg/vmx/extract-be-order.c
===
--- gcc/testsuite/gcc.dg/vmx/extract-be-order.c (revision 257395)
+++ gcc/testsuite/gcc.dg/vmx/extract-be-order.c (working copy)
@@ -1,4 +1,4 @@
-/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mno-vsx" } */
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mno-vsx -w" } */
 
 #include "harness.h"
 
Index: gcc/testsuite/gcc.dg/vmx/extract-vsx-be-order.c
===
--- gcc/testsuite/gcc.dg/vmx/extract-vsx-be-order.c (revision 257395)
+++ gcc/testsuite/gcc.dg/vmx/extract-vsx-be-order.c (working copy)
@@ -1,6 +1,6 @@
 /* { dg-skip-if "" { powerpc*-*-darwin* } } */
 /* { dg-require-effective-target powerpc_vsx_ok } */
-/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mvsx" } */
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mvsx -w" } */
 
 #include "harness.h"
 
Index: gcc/testsuite/gcc.dg/vmx/insert-be-order.c
===
--- gcc/testsuite/gcc.dg/vmx/insert-be-order.c  (revision 257395)
+++ gcc/testsuite/gcc.dg/vmx/insert-be-order.c  (working copy)
@@ -1,4 +1,4 @@
-/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mno-vsx" } */
+/* { dg-options "-w -maltivec=be -mabi=altivec -std=gnu99 -mno-vsx" } */
 
 #include "harness.h"
 
Index: gcc/testsuite/gcc.dg/vmx/insert-vsx-be-order.c
===
--- gcc/testsuite/gcc.dg/vmx/insert-vsx-be-order.c  (revision 257395)
+++ gcc/testsuite/gcc.dg/vmx/insert-vsx-be-order.c  (working copy)
@@ -1,6 +1,6 @@
 /* { dg-skip-if "" { powerpc*-*-darwin* } } */
 /* { dg-require-effective-target powerpc_vsx_ok } */
-/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mvsx" } */
+/* { dg-options "-w -maltivec=be -mabi=altivec -std=gnu99 -mvsx" } */
 
 #include "harness.h"
 
Index: gcc/testsuite/gcc.dg/vmx/ld-be-order.c
===
--- gcc/testsuite/gcc.dg/vmx/ld-be-order.c  (revision 257395)
+++ gcc/testsuite/gcc.dg/vmx/ld-be-order.c  (working copy)
@@ 

[PATCH, rs6000, committed] Fix PR84365: gcc.target/powerpc/altivec-7-le.c fails on power9

2018-02-13 Thread Peter Bergner
The following patch was approved offline by Segher to fix PR84365.
The problem is when compiling with -mcpu=power9, we don't generate
the LE unfriendly lxvd2x and instead emit an LE friendly lxv insn.

Peter

PR target/84365
* gcc.target/powerpc/altivec-7-le.c: Also match lxv when compiling
with -mcpu=power9.

Index: gcc/testsuite/gcc.target/powerpc/altivec-7-le.c
===
--- gcc/testsuite/gcc.target/powerpc/altivec-7-le.c (revision 257640)
+++ gcc/testsuite/gcc.target/powerpc/altivec-7-le.c (revision 257641)
@@ -22,7 +22,7 @@
 /* { dg-final { scan-assembler-times "vpkpx" 2 } } */
 /* { dg-final { scan-assembler-times "vmulesb" 1 } } */
 /* { dg-final { scan-assembler-times "vmulosb" 1 } } */
-/* { dg-final { scan-assembler-times "lxvd2x" 36 } } */
+/* { dg-final { scan-assembler-times {\mlxvd2x\M|\mlxv\M} 36 } } */
 /* { dg-final { scan-assembler-times "lvewx" 2 } } */
 /* { dg-final { scan-assembler-times "lvxl" 1 } } */
 /* { dg-final { scan-assembler-times "vupklsh" 1 } } */



Re: Merge from trunk to gccgo branch

2018-02-13 Thread Ian Lance Taylor
I merged trunk revision 257637 to the gccgo branch.

Ian


[C++ PATCH] Fix -Weffc++ warning on return in assignment op in templates (PR c++/84364)

2018-02-13 Thread Jakub Jelinek
Hi!

Before r253599 check_return_expr would:
  if (WILDCARD_TYPE_P (TREE_TYPE (DECL_RESULT (current_function_decl)))
  || (retval != NULL_TREE
  && type_dependent_expression_p (retval)))
return retval;
when processing_template_decl, and thus not issue the warning, but now even
type dependent expressions can make it into the assignment operator -Weffc++
warning checks.

There are multiple problems, e.g. in
template 
struct A {
  T =(T ) {
return *this;   // { dg-bogus "should return a reference to" }
  }
};
already the:
  if (TREE_CODE (valtype) == REFERENCE_TYPE
  && same_type_ignoring_top_level_qualifiers_p
  (TREE_TYPE (valtype), TREE_TYPE (current_class_ref)))
test will fail, even when one can instantiate it with T = A.
The bigger problem is:
  /* Returning '*this' is obviously OK.  */
  if (retval == current_class_ref)
warn = false;
when retval is type dependent expression, it will be usually? built
with build_min and thus == comparison will not work.

The following patch just does what we used to do, not warn with -Weffc++
for type dependent retval expressions, defer that to instantiation.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-02-13  Jakub Jelinek  

PR c++/84364
* typeck.c (check_return_expr): Don't emit -Weffc++ warning
about return other than *this in assignment operators if
retval is type dependent expression.

* g++.dg/warn/effc4.C: New test.

--- gcc/cp/typeck.c.jj  2018-02-10 00:15:36.702163326 +0100
+++ gcc/cp/typeck.c 2018-02-13 16:51:04.570204418 +0100
@@ -9232,7 +9232,8 @@ check_return_expr (tree retval, bool *no
 
   /* Effective C++ rule 15.  See also start_function.  */
   if (warn_ecpp
-  && DECL_NAME (current_function_decl) == assign_op_identifier)
+  && DECL_NAME (current_function_decl) == assign_op_identifier
+  && !type_dependent_expression_p (retval))
 {
   bool warn = true;
 
--- gcc/testsuite/g++.dg/warn/effc4.C.jj2018-02-13 16:53:16.399220269 
+0100
+++ gcc/testsuite/g++.dg/warn/effc4.C   2018-02-13 16:55:10.472233986 +0100
@@ -0,0 +1,10 @@
+// PR c++/84364
+// { dg-do compile }
+// { dg-options "-Weffc++" }
+
+template 
+struct A {
+  A =(A& f) {
+return *this;  // { dg-bogus "should return a reference to" }
+  }
+};

Jakub


[PATCH] Fix a brown paper bag bug in my recent match.pd change (PR middle-end/84309)

2018-02-13 Thread Jakub Jelinek
Hi!

When changing my recent match.pd fix to defer the -> exp2 (log2 (C) * x)
optimization until later, I've swapped the two patterns and changed
the first one to use exps and logs, but forgot to change the second one
to use exp2s and log2s.  For the testcase in the patch it actually didn't
make a difference, it was enough that we deferred optimizing it (because
we shortly afterwards propagated constants into the pow and constant folded
it).

This patch adds a testcase that verifies it is done properly.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-02-13  Jakub Jelinek  

PR middle-end/84309
* match.pd (pow(C,x) -> exp(log(C)*x)): Use exp2s and log2s instead
of exps and logs in the use_exp2 case.

* gcc.dg/pr84309-2.c: New test.

--- gcc/match.pd.jj 2018-02-13 12:14:08.108314686 +0100
+++ gcc/match.pd2018-02-13 12:28:59.958328523 +0100
@@ -4032,7 +4032,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   /* As libmvec doesn't have a vectorized exp2, defer optimizing
 this until after vectorization.  */
   (if (canonicalize_math_after_vectorization_p ())
-   (exps (mult (logs @0) @1
+   (exp2s (mult (log2s @0) @1
 
  (for sqrts (SQRT)
   cbrts (CBRT)
--- gcc/testsuite/gcc.dg/pr84309-2.c.jj 2018-02-13 12:30:25.208304990 +0100
+++ gcc/testsuite/gcc.dg/pr84309-2.c2018-02-13 12:31:35.959285452 +0100
@@ -0,0 +1,11 @@
+/* PR middle-end/84309 */
+/* { dg-do compile } */
+/* { dg-options "-Ofast -fdump-tree-optimized" } */
+
+double
+foo (double x)
+{
+  return __builtin_pow (2.0, x);
+}
+
+/* { dg-final { scan-tree-dump "__builtin_exp2 " "optimized" { target 
*-*-linux* *-*-gnu* } } } */

Jakub


Re: [C++ PATCH] Fix -Weffc++ warning on return in assignment op in templates (PR c++/84364)

2018-02-13 Thread Jason Merrill

OK.


Re: [PATCH] jit: fix link on OS X and Solaris (PR jit/64089 and PR jit/84288)

2018-02-13 Thread Rainer Orth
Hi David,

> libgccjit fails to link on OS X and Solaris due to jit/Make-lang.in,
> due to the assumption there that the linker is GNU ld.  Specifically,
> jit/Make-lang.in hardcodes the use of two options: --version-script
> and -soname.
>
> * on Darwin, --version-script doesn't seem to exist in the linker, and
>   it uses -install_name rather than -soname.
>
> * on Solaris, ld doesn't support --version-script.  However, the version
>   script used for libgccjit.so doesn't use any gld extensions, so one can
>   just use -M instead.
>
> This patch fixes these issues by using variables emitted by gcc's "configure"
> rather than hardcoding the options in jit/Make-lang.in.
>
> It's based on the first part of Rainer's patch for PR jit/84288, but I
> made the following changes:
> * the GNU ld case in configure.ac wasn't setting ld_version_script_option.
>   I set it to "--version-script" for that case.

That's weird: the configure.ac part starts with

ld_version_script_option='--version-script'

only overriding it in the Solaris case to keep things for other hosts
as they were.  Besides, I'm pretty sure I tested the patch on Solaris
with gas/gld to make certain that combo continues to work as is...

> * I moved the:
> LD_VERSION_SCRIPT_OPTION = @ld_version_script_option@
>   from gcc/jit/Make-lang.in to gcc/Makefile.in, as the Make-lang.in files
>   aren't substituted, only the gcc/Makefile.in.
>   Rainer: how did this work for you?

It didn't: what I'd attached to the PR was an initial version of the
patch in the middle between manually hacking gcc/jit/Make-lang.in and
properly autoconfiguring things.

> * added LD_SONAME_OPTION, done in the same way
> * conditionalized the usage of the options in Make-lang.in to cope with
>   empty LD_VERSION_SCRIPT_OPTION (as is presumably the case on OS X).
>   I used ($if condition,then-part[,else-part]) for this.
>   I had to add a $(COMMA) since the "then-part" contains commas, which
>   need to be treated as part of the "then-part", rather than separators
>   for the "else-part".
>   Hopefully this is compatible with every "make" implementation that we
>   support.
>
> Successfully bootstrapped on x86_64-pc-linux-gnu.
>
> I lightly tested the not-recognized case by hacking up the configure.ac
> (on x86_64-pc-linux-gnu) and verifying that it links, and that a
> smoketest of jit.dg/test-factorial works.
>
> Does this fix the jit linker issues on OS X and Solaris?

I'll give it a whirl tomorrow, including the jit-recording.c part of my
patch to allow the build to complete.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [Patch, Fortran, F08] PR 84313: reject procedure pointers in COMMON blocks

2018-02-13 Thread Janus Weil
2018-02-13 20:55 GMT+01:00 Richard Biener :
> On February 13, 2018 8:14:33 PM GMT+01:00, Steve Kargl 
>  wrote:
>>On Tue, Feb 13, 2018 at 07:24:35PM +0100, Janus Weil wrote:
>>>
>>> As my last submission, this fixes fallout from
>>>
>>https://groups.google.com/forum/?fromgroups#!topic/comp.lang.fortran/AIHRQ2kJv3c.
>>> As the last one, it is a very simple fix for an accepts-invalid
>>> problem (which is not a regression), so I hope this one will also
>>> still be suitable for trunk (if not, I hope the release managers, in
>>> CC, will stop me).
>>>
>>> It does regtest cleanly on x86_64-linux-gnu. Ok for trunk?
>>>
>>
>>Ok.

Thanks, Steve. Committed as r257636.


>>I think you can skip waiting RM approval.  To me, this patch
>>falls into the "too simply and obvious" fix category.
>>
>>PS: Given the response I got yesterday on the #gcc IRC channel
>>when I asked about the current status of trunk, I'm inclined to
>>ignore the "regression and doc fixes only" status.
>
> Heh! Fortran is not release critical so if it's broken we'll release anyway. 
> Which means it's up to frontend maintainers to decide what they accept at 
> this stage.

This is how it was handled for earlier releases also.

Nevertheless there is also quite a number of Fortran regression we
should take care of. Will try to have a look on the weekend.

Cheers,
Janus


Go patch committed: Don't export function descriptors for unexported names

2018-02-13 Thread Ian Lance Taylor
The patch changes the Go frontend to not export function descriptors
for unexported names.  They aren't needed, and could potentially cause
unlikely symbol name collisions.  Also, the runtime package's
reference to main could cause the runtime package to define
main.main..f, which could also be defined in the main package if it
does something like fmt.Print(main).  That will normally work but will
fail with a multiple symbol definition error when using -static-libgo.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 257600)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-7998e29eec43ede1cee925d87eef0b09da67d90b
+5d5ea2fd05dbf369ccc53c93d4846623cdea0c47
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 257527)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -1330,9 +1330,24 @@ Func_descriptor_expression::do_get_backe
   else
 {
   Location bloc = Linemap::predeclared_location();
+
+  // The runtime package has hash/equality functions that are
+  // referenced by type descriptors outside of the runtime, so the
+  // function descriptors must be visible even though they are not
+  // exported.
+  bool is_exported_runtime = false;
+  if (gogo->compiling_runtime()
+ && gogo->package_name() == "runtime"
+ && (no->name().find("hash") != std::string::npos
+ || no->name().find("equal") != std::string::npos))
+   is_exported_runtime = true;
+
   bool is_hidden = ((no->is_function()
 && no->func_value()->enclosing() != NULL)
+   || (Gogo::is_hidden_name(no->name())
+   && !is_exported_runtime)
|| Gogo::is_thunk(no));
+
   bvar = context->backend()->immutable_struct(var_name, asm_name,
   is_hidden, false,
  btype, bloc);


[PATCH] jit: fix link on OS X and Solaris (PR jit/64089 and PR jit/84288)

2018-02-13 Thread David Malcolm
libgccjit fails to link on OS X and Solaris due to jit/Make-lang.in,
due to the assumption there that the linker is GNU ld.  Specifically,
jit/Make-lang.in hardcodes the use of two options: --version-script
and -soname.

* on Darwin, --version-script doesn't seem to exist in the linker, and
  it uses -install_name rather than -soname.

* on Solaris, ld doesn't support --version-script.  However, the version
  script used for libgccjit.so doesn't use any gld extensions, so one can
  just use -M instead.

This patch fixes these issues by using variables emitted by gcc's "configure"
rather than hardcoding the options in jit/Make-lang.in.

It's based on the first part of Rainer's patch for PR jit/84288, but I
made the following changes:
* the GNU ld case in configure.ac wasn't setting ld_version_script_option.
  I set it to "--version-script" for that case.
* I moved the:
LD_VERSION_SCRIPT_OPTION = @ld_version_script_option@
  from gcc/jit/Make-lang.in to gcc/Makefile.in, as the Make-lang.in files
  aren't substituted, only the gcc/Makefile.in.
  Rainer: how did this work for you?
* added LD_SONAME_OPTION, done in the same way
* conditionalized the usage of the options in Make-lang.in to cope with
  empty LD_VERSION_SCRIPT_OPTION (as is presumably the case on OS X).
  I used ($if condition,then-part[,else-part]) for this.
  I had to add a $(COMMA) since the "then-part" contains commas, which
  need to be treated as part of the "then-part", rather than separators
  for the "else-part".
  Hopefully this is compatible with every "make" implementation that we
  support.

Successfully bootstrapped on x86_64-pc-linux-gnu.

I lightly tested the not-recognized case by hacking up the configure.ac
(on x86_64-pc-linux-gnu) and verifying that it links, and that a
smoketest of jit.dg/test-factorial works.

Does this fix the jit linker issues on OS X and Solaris?
(I didn't include the autogenerate configure changes)

gcc/ChangeLog:
PR jit/64089
PR jit/84288
* Makefile.in (LD_VERSION_SCRIPT_OPTION, LD_SONAME_OPTION): New.
* configure: Regenerate.
* configure.ac ("linker --version-script option"): New.
("linker soname option"): New.

gcc/jit/ChangeLog:
PR jit/64089
PR jit/84288
* Make-lang.in (COMMA): New.
(LIBGCCJIT_VERSION_SCRIPT_OPTION): New.
(LIBGCCJIT_SONAME_OPTION): New.
(jit): Move --version-script and -soname linker options to the
above.
---
 gcc/Makefile.in  |  4 
 gcc/configure.ac | 38 ++
 gcc/jit/Make-lang.in | 17 +++--
 3 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 6c37e46..903da58 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1116,6 +1116,10 @@ endif
 
 LANG_MAKEFRAGS = @all_lang_makefrags@
 
+# Used by gcc/jit/Make-lang.in
+LD_VERSION_SCRIPT_OPTION = @ld_version_script_option@
+LD_SONAME_OPTION = @ld_soname_option@
+
 # Flags to pass to recursive makes.
 # CC is set by configure.
 # ??? The choices here will need some experimenting with.
diff --git a/gcc/configure.ac b/gcc/configure.ac
index b7f9728..265920c 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -3640,6 +3640,44 @@ if test x"$gcc_cv_ld_static_dynamic" = xyes; then
 fi
 AC_MSG_RESULT($gcc_cv_ld_static_dynamic)
 
+AC_MSG_CHECKING(linker --version-script option)
+gcc_cv_ld_version_script=no
+ld_version_script_option='--version-script'
+if test $in_tree_ld = yes || test x"$gnu_ld" = xyes; then
+  gcc_cv_ld_version_script=yes
+  ld_version_script_option='--version-script'
+elif test x$gcc_cv_ld != x; then
+  case "$target" in
+# Solaris 2 ld always supports -M.  It also supports a subset of
+# --version-script since Solaris 11.4, but requires
+# -z gnu-version-script-compat to activate.
+*-*-solaris2*)
+  gcc_cv_ld_version_script=yes
+  ld_version_script_option='-M'
+  ;;
+  esac
+fi
+# Don't AC_DEFINE result, only used in jit/Make-lang.in so far.
+AC_MSG_RESULT($gcc_cv_ld_version_script)
+AC_SUBST(ld_version_script_option)
+
+AC_MSG_CHECKING(linker soname option)
+gcc_cv_ld_soname=no
+if test $in_tree_ld = yes || test x"$gnu_ld" = xyes; then
+  gcc_cv_ld_soname=yes
+  ld_soname_option='-soname'
+elif test x$gcc_cv_ld != x; then
+  case "$target" in
+*-*-darwin*)
+  gcc_cv_ld_soname=yes
+  ld_soname_option='-install_name'
+  ;;
+  esac
+fi
+# Don't AC_DEFINE result, only used in jit/Make-lang.in so far.
+AC_MSG_RESULT($gcc_cv_ld_soname)
+AC_SUBST(ld_soname_option)
+
 if test x"$demangler_in_ld" = xyes; then
   AC_MSG_CHECKING(linker --demangle support)
   gcc_cv_ld_demangle=no
diff --git a/gcc/jit/Make-lang.in b/gcc/jit/Make-lang.in
index d4362a9..ba78f8e 100644
--- a/gcc/jit/Make-lang.in
+++ b/gcc/jit/Make-lang.in
@@ -51,6 +51,19 @@ LIBGCCJIT_FILENAME = \
 LIBGCCJIT_LINKER_NAME_SYMLINK = $(LIBGCCJIT_LINKER_NAME)
 LIBGCCJIT_SONAME_SYMLINK = $(LIBGCCJIT_SONAME)
 
+# 

Re: [PATCH] adjust warning_n() to take uhwi (PR 84207)

2018-02-13 Thread Joseph Myers
On Mon, 12 Feb 2018, Martin Sebor wrote:

> Bug 84207 - Hard coded plural in gimple-fold.c points out one
> of a number of warning_at() calls where warning_n() should have
> been used.  The attached patch both replaces the calls and also
> changes the signatures of the warning_n(), error_n(), and
> inform_n() functions to take an unsigned HOST_WIDE_INT argument
> instead of int.  I also changed the implementation of
> diagnostic_n_impl() to deal with unsigned HOST_WIDE_INT values
> in excess of ULONG_MAX (the maximum value ngettext handles) so
> callers don't need to.

Saturating to ULONG_MAX is not correct for languages where the plural form 
depends on n%10 or n%100 (see the various Plural-Forms entries in the .po 
files).  If n is too large you want something like n % 100 + 100 
instead to get the correct plural form in all cases.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] adjust warning_n() to take uhwi (PR 84207)

2018-02-13 Thread Manuel López-Ibáñez
On 13 Feb 2018 5:58 pm, "Martin Sebor"  wrote:


I wanted to make the _n() functions like warning_n() more
robust by letting them accept a tree argument (as well as
offset_int and wide_int) in addition to HOST_WIDE_INT but
I can't do it if they can't work with these types.


There must be a tree-diagnostics.c where you can add those functions and
then call the general diagnostic functions. Same for RTL.

Note that pretty-printer.c works in the same way.

Cheers,

Manuel.


Re: [PATCH] diagnose specializations of deprecated templates (PR c++/84318)

2018-02-13 Thread Jason Merrill
On Tue, Feb 13, 2018 at 2:59 PM, Martin Sebor  wrote:
> On 02/13/2018 12:15 PM, Jason Merrill wrote:
>> On Tue, Feb 13, 2018 at 1:31 PM, Martin Sebor  wrote:
>>> On 02/13/2018 09:24 AM, Martin Sebor wrote:


 On 02/13/2018 08:35 AM, Martin Sebor wrote:
>
>
> On 02/13/2018 07:40 AM, Jason Merrill wrote:
>>
>>
>> On Mon, Feb 12, 2018 at 6:32 PM, Martin Sebor 
>> wrote:
>>>
>>>
>>> While testing my fix for 83871 (handling attributes on explicit
>>> specializations) I noticed another old regression: while GCC 4.4
>>> would diagnose declarations of explicit specializations of all
>>> primary templates declared deprecated, GCC 4.5 and later only
>>> diagnose declarations of explicit specializations of class
>>> templates but not those of function or variable templates.
>>
>>
>>
>> Hmm, the discussion on the core reflector seemed to be agreeing that
>> we want to be able to define non-deprecated specializations of a
>> deprecated primary template.
>
>
>
> Yes, that's what Richard wanted to do.  The only way to do it
> within the existing constraints(*) is to define a non-deprecated
> primary, and a deprecated partial specialization.  This is in line
> with that approach and supported by Clang and all other compilers
> I tested (including Clang).



 To clarify, this approach works for class templates (e.g., like
 std::numeric_limits that was mentioned in the core discussion)
 and for variable templates.  Functions have no partial
 specilizations so they have to be overloaded to achieve the same
 effect.

 Implementations don't treat the deprecated attribute on partial
 specializations consistently.

 EDG accepts and honors it on class template partial specializations
 but rejects it with an error on those of variables.

 Clang accepts but silently ignores it on class template partial
 specializations and rejects with an error it on variables.

 MSVC accepts and honors it on variables but silently ignores it
 on class template partial specializations.

 GCC ignores it silently on class partial specializations and
 with a warning on variables (I opened bug 84347 to track this
 and to have GCC honor is everywhere).

 This is clearly a mess, which isn't surprising given how poorly
 specified this is in the standard.  But from the test cases and
 from the core discussion it seems clear that deprecating
 a template, including its partial specializations (as opposed
 to just a single explicit specialization) is desirable and
 already supported, and that the wording in the standard just
 needs to be adjusted to reflect that.

>
> Martin
>
> [*] Except (as Richard noted) that the standard doesn't seem to
> allow a template to be deprecated.  I think that's a bug in the
> spec because all implementations allow it to some degree.
>>>
>>>
>>>
>>> One other note.  While thinking about this problem during
>>> the core discussion, another approach to deprecating a primary
>>> template without also deprecating all of its specializations
>>> occurred to me.
>>>
>>> 1) First declare the primary template without [[deprecated]].
>>> 2) Next declare its non-deprecated specializations (partial
>>>or explicit).
>>> 3) Finally declare the primary again, this time [[deprecated]].
>>>
>>> Like this:
>>>
>>>   template  structS;
>>>   template  structS { };
>>>   template  struct [[deprecated]] S { };
>>>   template  struct [[deprecated]] S { };
>>>
>>>   S si; // warning
>>>   S sci;  // no warning
>>>   S svi;   // warning
>>>
>>> This works as expected with Intel ICC.  All other compilers
>>> diagnose all three variables.  I'd say for [[deprecated]] it
>>> should work the way ICC does.  (For [[noreturn]] the first
>>> declaration must be [[noreturn]], so there this solution
>>> wouldn't work also because of that, in addition to function
>>> templates not being partially-specializable.)
>>
>>
>> My understanding of the reflector discussion, and Richard's comment in
>> particular, was that [[deprecated]] should apply to the instances, not
>> the template itself, so that declaring the primary template
>> [[deprecated]] doesn't affect explicit specializations.  Your last
>> example should work as you expect in this model, but you can also
>> write the simpler
>>
>> template  struct [[deprecated]] S { };
>> template  struct S { }; // no warning
>
>
> With this approach there would be no way to deprecate all of
> a template's specializations) because it would always be
> possible for a user to get around deprecation by defining
> their own specialization, partial or explicit.

Yep.  And so he suggested that we might want to add a new way to write
attributes 

Re: [PATCH] diagnose specializations of deprecated templates (PR c++/84318)

2018-02-13 Thread Martin Sebor

On 02/13/2018 12:15 PM, Jason Merrill wrote:

On Tue, Feb 13, 2018 at 1:31 PM, Martin Sebor  wrote:

On 02/13/2018 09:24 AM, Martin Sebor wrote:


On 02/13/2018 08:35 AM, Martin Sebor wrote:


On 02/13/2018 07:40 AM, Jason Merrill wrote:


On Mon, Feb 12, 2018 at 6:32 PM, Martin Sebor  wrote:


While testing my fix for 83871 (handling attributes on explicit
specializations) I noticed another old regression: while GCC 4.4
would diagnose declarations of explicit specializations of all
primary templates declared deprecated, GCC 4.5 and later only
diagnose declarations of explicit specializations of class
templates but not those of function or variable templates.



Hmm, the discussion on the core reflector seemed to be agreeing that
we want to be able to define non-deprecated specializations of a
deprecated primary template.



Yes, that's what Richard wanted to do.  The only way to do it
within the existing constraints(*) is to define a non-deprecated
primary, and a deprecated partial specialization.  This is in line
with that approach and supported by Clang and all other compilers
I tested (including Clang).



To clarify, this approach works for class templates (e.g., like
std::numeric_limits that was mentioned in the core discussion)
and for variable templates.  Functions have no partial
specilizations so they have to be overloaded to achieve the same
effect.

Implementations don't treat the deprecated attribute on partial
specializations consistently.

EDG accepts and honors it on class template partial specializations
but rejects it with an error on those of variables.

Clang accepts but silently ignores it on class template partial
specializations and rejects with an error it on variables.

MSVC accepts and honors it on variables but silently ignores it
on class template partial specializations.

GCC ignores it silently on class partial specializations and
with a warning on variables (I opened bug 84347 to track this
and to have GCC honor is everywhere).

This is clearly a mess, which isn't surprising given how poorly
specified this is in the standard.  But from the test cases and
from the core discussion it seems clear that deprecating
a template, including its partial specializations (as opposed
to just a single explicit specialization) is desirable and
already supported, and that the wording in the standard just
needs to be adjusted to reflect that.



Martin

[*] Except (as Richard noted) that the standard doesn't seem to
allow a template to be deprecated.  I think that's a bug in the
spec because all implementations allow it to some degree.



One other note.  While thinking about this problem during
the core discussion, another approach to deprecating a primary
template without also deprecating all of its specializations
occurred to me.

1) First declare the primary template without [[deprecated]].
2) Next declare its non-deprecated specializations (partial
   or explicit).
3) Finally declare the primary again, this time [[deprecated]].

Like this:

  template  structS;
  template  structS { };
  template  struct [[deprecated]] S { };
  template  struct [[deprecated]] S { };

  S si; // warning
  S sci;  // no warning
  S svi;   // warning

This works as expected with Intel ICC.  All other compilers
diagnose all three variables.  I'd say for [[deprecated]] it
should work the way ICC does.  (For [[noreturn]] the first
declaration must be [[noreturn]], so there this solution
wouldn't work also because of that, in addition to function
templates not being partially-specializable.)


My understanding of the reflector discussion, and Richard's comment in
particular, was that [[deprecated]] should apply to the instances, not
the template itself, so that declaring the primary template
[[deprecated]] doesn't affect explicit specializations.  Your last
example should work as you expect in this model, but you can also
write the simpler

template  struct [[deprecated]] S { };
template  struct S { }; // no warning


With this approach there would be no way to deprecate all of
a template's specializations) because it would always be
possible for a user to get around deprecation by defining
their own specialization, partial or explicit.

I think we need to give users the choice of being able to do
one without the other (in addition to both).  I.e., either of

1) Deprecate a primary and all its uses (including partial and
   explicit specializations).

2) Deprecate just a subset of specializations of a template
   without also deprecating the rest.

An example of (1) is std::auto_ptr or the std::is_literal_type
type trait.  The intent is to remove them from namespace std
someday and providing any specializations for them will then
become an error.

An example of (2) is the std::numeric_limits primary template
that Richard brought up.  That was my understanding of what
he wanted to do but even if that's not what he meant it's
a 

Re: [Patch, Fortran, F08] PR 84313: reject procedure pointers in COMMON blocks

2018-02-13 Thread Richard Biener
On February 13, 2018 8:14:33 PM GMT+01:00, Steve Kargl 
 wrote:
>On Tue, Feb 13, 2018 at 07:24:35PM +0100, Janus Weil wrote:
>> 
>> As my last submission, this fixes fallout from
>>
>https://groups.google.com/forum/?fromgroups#!topic/comp.lang.fortran/AIHRQ2kJv3c.
>> As the last one, it is a very simple fix for an accepts-invalid
>> problem (which is not a regression), so I hope this one will also
>> still be suitable for trunk (if not, I hope the release managers, in
>> CC, will stop me).
>> 
>> It does regtest cleanly on x86_64-linux-gnu. Ok for trunk?
>> 
>
>Ok.  I think you can skip waiting RM approval.  To me, this patch
>falls into the "too simply and obvious" fix category.
>
>PS: Given the response I got yesterday on the #gcc IRC channel
>when I asked about the current status of trunk, I'm inclined to
>ignore the "regression and doc fixes only" status.

Heh! Fortran is not release critical so if it's broken we'll release anyway. 
Which means it's up to frontend maintainers to decide what they accept at this 
stage. 

Extrapolating from last years I would expect GCC 8 to be released mid April. 

Richard. 


No


New French PO file for 'gcc' (version 8.1-b20180128)

2018-02-13 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the French team of translators.  The file is available at:

http://translationproject.org/latest/gcc/fr.po

(This file, 'gcc-8.1-b20180128.fr.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: [PATCH] FIx endless match.pd recursion on cst1 + cst2 + cst3 (PR tree-optimization/84334)

2018-02-13 Thread Marc Glisse

On Tue, 13 Feb 2018, Richard Biener wrote:


On February 13, 2018 6:51:29 PM GMT+01:00, Jakub Jelinek  
wrote:

Hi!

On the following testcase, we recurse infinitely, because
we have float re-association enabled, but also rounding-math, so
we try to optimize (cst1 + cst2) + cst3 as (cst2 + cst3) + cst1
but (cst2 + cst3) doesn't simplify and we try again and optimize
it as (cst3 + cst1) + cst2 and then (cst1 + cst2) + cst3 and so on
forever.  If @0 is not a CONSTANT_CLASS_P, there is not a problem,
if it is, the code just checks if we can actually simplify the
operation between cst2 and cst3 into a constant.


Is there a reason to try simplifying at all for constant @0?


Yes. cst2+cst3 might simplify (the operation happens to be exact and not 
require rounding), which leaves us with only one addition instead of 2.


On the other hand, mixing -frounding-math with reassociation seems strange 
to me, and likely not worth optimizing for.


--
Marc Glisse


Re: [PATCH] diagnose specializations of deprecated templates (PR c++/84318)

2018-02-13 Thread Jason Merrill
On Tue, Feb 13, 2018 at 1:31 PM, Martin Sebor  wrote:
> On 02/13/2018 09:24 AM, Martin Sebor wrote:
>>
>> On 02/13/2018 08:35 AM, Martin Sebor wrote:
>>>
>>> On 02/13/2018 07:40 AM, Jason Merrill wrote:

 On Mon, Feb 12, 2018 at 6:32 PM, Martin Sebor  wrote:
>
> While testing my fix for 83871 (handling attributes on explicit
> specializations) I noticed another old regression: while GCC 4.4
> would diagnose declarations of explicit specializations of all
> primary templates declared deprecated, GCC 4.5 and later only
> diagnose declarations of explicit specializations of class
> templates but not those of function or variable templates.


 Hmm, the discussion on the core reflector seemed to be agreeing that
 we want to be able to define non-deprecated specializations of a
 deprecated primary template.
>>>
>>>
>>> Yes, that's what Richard wanted to do.  The only way to do it
>>> within the existing constraints(*) is to define a non-deprecated
>>> primary, and a deprecated partial specialization.  This is in line
>>> with that approach and supported by Clang and all other compilers
>>> I tested (including Clang).
>>
>>
>> To clarify, this approach works for class templates (e.g., like
>> std::numeric_limits that was mentioned in the core discussion)
>> and for variable templates.  Functions have no partial
>> specilizations so they have to be overloaded to achieve the same
>> effect.
>>
>> Implementations don't treat the deprecated attribute on partial
>> specializations consistently.
>>
>> EDG accepts and honors it on class template partial specializations
>> but rejects it with an error on those of variables.
>>
>> Clang accepts but silently ignores it on class template partial
>> specializations and rejects with an error it on variables.
>>
>> MSVC accepts and honors it on variables but silently ignores it
>> on class template partial specializations.
>>
>> GCC ignores it silently on class partial specializations and
>> with a warning on variables (I opened bug 84347 to track this
>> and to have GCC honor is everywhere).
>>
>> This is clearly a mess, which isn't surprising given how poorly
>> specified this is in the standard.  But from the test cases and
>> from the core discussion it seems clear that deprecating
>> a template, including its partial specializations (as opposed
>> to just a single explicit specialization) is desirable and
>> already supported, and that the wording in the standard just
>> needs to be adjusted to reflect that.
>>
>>>
>>> Martin
>>>
>>> [*] Except (as Richard noted) that the standard doesn't seem to
>>> allow a template to be deprecated.  I think that's a bug in the
>>> spec because all implementations allow it to some degree.
>
>
> One other note.  While thinking about this problem during
> the core discussion, another approach to deprecating a primary
> template without also deprecating all of its specializations
> occurred to me.
>
> 1) First declare the primary template without [[deprecated]].
> 2) Next declare its non-deprecated specializations (partial
>or explicit).
> 3) Finally declare the primary again, this time [[deprecated]].
>
> Like this:
>
>   template  structS;
>   template  structS { };
>   template  struct [[deprecated]] S { };
>   template  struct [[deprecated]] S { };
>
>   S si; // warning
>   S sci;  // no warning
>   S svi;   // warning
>
> This works as expected with Intel ICC.  All other compilers
> diagnose all three variables.  I'd say for [[deprecated]] it
> should work the way ICC does.  (For [[noreturn]] the first
> declaration must be [[noreturn]], so there this solution
> wouldn't work also because of that, in addition to function
> templates not being partially-specializable.)

My understanding of the reflector discussion, and Richard's comment in
particular, was that [[deprecated]] should apply to the instances, not
the template itself, so that declaring the primary template
[[deprecated]] doesn't affect explicit specializations.  Your last
example should work as you expect in this model, but you can also
write the simpler

template  struct [[deprecated]] S { };
template  struct S { }; // no warning

Jason


Re: [Patch, Fortran, F08] PR 84313: reject procedure pointers in COMMON blocks

2018-02-13 Thread Steve Kargl
On Tue, Feb 13, 2018 at 07:24:35PM +0100, Janus Weil wrote:
> 
> As my last submission, this fixes fallout from
> https://groups.google.com/forum/?fromgroups#!topic/comp.lang.fortran/AIHRQ2kJv3c.
> As the last one, it is a very simple fix for an accepts-invalid
> problem (which is not a regression), so I hope this one will also
> still be suitable for trunk (if not, I hope the release managers, in
> CC, will stop me).
> 
> It does regtest cleanly on x86_64-linux-gnu. Ok for trunk?
> 

Ok.  I think you can skip waiting RM approval.  To me, this patch
falls into the "too simply and obvious" fix category.

PS: Given the response I got yesterday on the #gcc IRC channel
when I asked about the current status of trunk, I'm inclined to
ignore the "regression and doc fixes only" status.
 
-- 
steve 


Re: [PATCH] avoid warning for members declared both aligned and packed (PR 84108)

2018-02-13 Thread Martin Sebor

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-02/msg00125.html

On 02/02/2018 11:58 AM, Martin Sebor wrote:

The design of the attribute exclusion framework includes
support for different exclusions applying to different
kinds of declarations (functions, types, and variables
or fields), but the support is incomplete -- the logic
to consider these differences is missing.  This is
because the differences are apparently rare.  However,
as the bug below points out, they do exist.

PR middle-end/84108 - incorrect -Wattributes warning for
packed/aligned conflict on struct members, shows that while
declaring a non-member variable aligned is enough to reduce
the its alignment and declaring it both aligned and packed
triggers a -Wattributes warning:

  int a __attribute__((packed, aligned (2)));   // -Wattributes

a struct member must be declared both aligned and packed in
order to have its alignment reduced.  (Declaring a member
just aligned has no effect and doesn't cause a warning).

  struct S {
int b __attribute__((packed, aligned (2)));
int c __attribute__((aligned (2)));   // no effect
  };

As a result of the incomplete logic GCC 8 issues a -Wattributes
for the declaration of b in the struct.

By adding the missing logic the attached patch lets GCC avoid
the spurious warning.

I considered adding support for detecting the ineffective
attribute aligned on the declaration of the member c at
the same time but since that's not a regression I decided
to defer that until GCC 9.  I opened bug 84185 to track it.

Tested on x86_64-linux with no regressions.

Martin




Re: [PATCH] Improve dead code elimination with -fsanitize=address (PR84307)

2018-02-13 Thread Jakub Jelinek
On Fri, Feb 09, 2018 at 05:08:24PM +0100, Paolo Bonzini wrote:
>   PR sanitizer/84307
>   * gcc.dg/asan/pr84307.c: New test.

BTW, your testcase shows a more severe problem, that we actually don't
handle compound literals correctly.

C99 says that:
"If the compound literal occurs outside the body of a function, the object
has static storage duration; otherwise, it has automatic storage duration
associated with the enclosing block."
but if we create an object with automatic storage duration, we don't
actually put that object into the scope of the enclosing block, but of the
enclosing function, which explains the weird ASAN_MARK UNPOISON present, but
corresponding ASAN_MARK POISON not present.  The following testcase should
IMHO FAIL with -fsanitize=address on the second bar call, but doesn't, even
at -O0 without any DSE.  When optimizing we because of this don't emit
CLOBBER stmts when the compound literal object goes out of scope, and with
-fsanitize=address -fsanitize-address-use-after-scope we don't emit the
POISON.

struct S { int s; } *p;

static inline void
foo (struct S *x)
{
  p = x;
}

static inline void
bar (void)
{
  p->s = 5;
}

int
main ()
{
  {
foo (&(struct S) { 1 });
bar ();
  }
  {
foo (&(struct S) { 2 });
  }
  bar ();
  return 0;
}

The following untested patch seems to cure thatm will see how much it will
break.

2018-02-13  Jakub Jelinek  

PR sanitizer/84340
* c-decl.c (build_compound_literal): Call pushdecl (decl) even when
it is not TREE_STATIC.

--- gcc/c/c-decl.c.jj   2018-01-03 10:20:20.114537949 +0100
+++ gcc/c/c-decl.c  2018-02-13 15:17:47.091186077 +0100
@@ -5348,6 +5348,8 @@ build_compound_literal (location_t loc,
   pushdecl (decl);
   rest_of_decl_compilation (decl, 1, 0);
 }
+  else
+pushdecl (decl);
 
   if (non_const)
 {


Jakub


Re: [PATCH] Fix handling of variable length fields in structures (PR c/82210)

2018-02-13 Thread Jeff Law
On 02/08/2018 11:22 PM, Jakub Jelinek wrote:
> Hi!
> 
> When placing a variable length field into a structure, we need to update
> rli->offset_align for the next field.  We do:
> rli->offset_align = MIN (rli->offset_align, desired_align);
> which updates it according to the start of that VLA field, the problem is
> that if the field doesn't have a size that is a multiple of this alignment
> rli->offset_align will not reflect properly the alignment of the end of that
> field.  E.g. on the testcase, we have a VLA array aligned as a whole (the
> field itself) to 16 bytes / 128 bits, so rli->offset_align remains 128.
> The array has element size 2 bytes / 16 bits, times function argument,
> so the end of the field is worst case aligned just to 16 bits; if we keep
> rli->offset_align as 128 for the next field, then DECL_OFFSET_ALIGN is too
> large. DECL_FIELD_OFFSET documented as:
> /* In a FIELD_DECL, this is the field position, counting in bytes, of the
>DECL_OFFSET_ALIGN-bit-sized word containing the bit closest to the 
> beginning
>of the structure.  */
> and when gimplifying COMPONENT_REFs with that field we:
>   tree offset = unshare_expr (component_ref_field_offset (t));
>   tree field = TREE_OPERAND (t, 1);
>   tree factor
> = size_int (DECL_OFFSET_ALIGN (field) / BITS_PER_UNIT);
>
>   /* Divide the offset by its alignment.  */
>   offset = size_binop_loc (loc, EXACT_DIV_EXPR, offset, factor);
> and later on multiply it again by DECL_OFFSET_ALIGN.  The EXACT_DIV_EXPR
> isn't exact.
> 
> Fixed by lowering the rli->offset_align if the size isn't a multiple of
> the align.  We don't have a multiple_of_p variant that would compute
> highest power of two number the expression is known to be a multiple of,
> so I'm just checking the most common case, where the size is a multiple
> of the starting alignment, and otherwise just compute it very
> conservatively.  This will be lower than necessary say for
>   __attribute__((aligned (16))) short field[2 * size];
> - just 16 bits instead of 32.  In theory we could do a binary search
> on power of two numbers in between that high initial rli->offset_align
> for which the first multiple_of_p failed, and the conservative guess
> we do to improve it.  If you think it is worth it, I can code it up.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2018-02-09  Jakub Jelinek  
> 
>   PR c/82210
>   * stor-layout.c (place_field): For variable length fields, adjust
>   offset_align afterwards not just based on the field's alignment,
>   but also on the size.
> 
>   * gcc.c-torture/execute/pr82210.c: New test.
OK.
jeff


[committed][PATCH] Fix another minor rl78 build failure

2018-02-13 Thread Jeff Law


This is enough to get it to build cc1.  Installed on the trunk.

Jeff
* config/rl/rl78.c (rl78_attribute_table): Fix terminator and
entry for "vector".

diff --git a/gcc/config/rl78/rl78.c b/gcc/config/rl78/rl78.c
index 5158e83c364..8346c9c31e0 100644
--- a/gcc/config/rl78/rl78.c
+++ b/gcc/config/rl78/rl78.c
@@ -905,8 +905,8 @@ const struct attribute_spec rl78_attribute_table[] =
 rl78_handle_naked_attribute, NULL },
   { "saddr",  0, 0, true, false, false, false,
 rl78_handle_saddr_attribute, NULL },
-  { "vector", 1, -1, true, false, false, 
-   rl78_handle_vector_attribute, false },
+  { "vector", 1, -1, true, false, false, false,
+   rl78_handle_vector_attribute, NULL },
   { NULL, 0, 0, false, false, false, false, NULL, NULL }
 };
 


Re: [RFC PATCH] avoid applying attributes to explicit specializations (PR 83871)

2018-02-13 Thread Jason Merrill
On Tue, Feb 13, 2018 at 1:00 PM, Martin Sebor  wrote:
> On 02/13/2018 07:47 AM, Jason Merrill wrote:
>>
>> On Mon, Feb 12, 2018 at 6:39 PM, Martin Sebor  wrote:
>>>
>>> I really don't think it's helpful to try to force noreturn
>>> to match between the primary and its specializations.
>>
>> I continue to disagree.
>
> Can you explain what use case you are concerned about that isn't
> already handled by the warning about noreturn function returning?

A specialization that forgot [[noreturn]] and therefore doesn't get
the desired warning.

> For reference, the cases I consider important are:
>
> 1) "Unimplemented" primary template declared noreturn that throws
>or exits but whose specializations return a useful value and
>make use of attribute malloc (or one of the other blacklisted
>attributes).
>
> 2) The converse of (1).  A primary that returns a useful malloc
>like value and some of whose specializations are not/cannot
>be meaningfully implemented and are declared noreturn.

Right, but I still disagree with this use of noreturn, and therefore
don't consider these cases important.

> Defining template specializations that differ from the primary
> template in their implementation is idiomatic (analogous to
> defining overloads or overridden virtual functions).

> In any event, I am mainly interested in fixing the two bugs
> (one a P1 regression).   If you consider changing the warning
> aspect of the patch a condition of accepting the fix please let
> me know.  Removing the noreturn keyword from the whitelist is
> trivial.

Please do.

Jason


Re: [PATCH] diagnose specializations of deprecated templates (PR c++/84318)

2018-02-13 Thread Martin Sebor

On 02/13/2018 09:24 AM, Martin Sebor wrote:

On 02/13/2018 08:35 AM, Martin Sebor wrote:

On 02/13/2018 07:40 AM, Jason Merrill wrote:

On Mon, Feb 12, 2018 at 6:32 PM, Martin Sebor  wrote:

While testing my fix for 83871 (handling attributes on explicit
specializations) I noticed another old regression: while GCC 4.4
would diagnose declarations of explicit specializations of all
primary templates declared deprecated, GCC 4.5 and later only
diagnose declarations of explicit specializations of class
templates but not those of function or variable templates.


Hmm, the discussion on the core reflector seemed to be agreeing that
we want to be able to define non-deprecated specializations of a
deprecated primary template.


Yes, that's what Richard wanted to do.  The only way to do it
within the existing constraints(*) is to define a non-deprecated
primary, and a deprecated partial specialization.  This is in line
with that approach and supported by Clang and all other compilers
I tested (including Clang).


To clarify, this approach works for class templates (e.g., like
std::numeric_limits that was mentioned in the core discussion)
and for variable templates.  Functions have no partial
specilizations so they have to be overloaded to achieve the same
effect.

Implementations don't treat the deprecated attribute on partial
specializations consistently.

EDG accepts and honors it on class template partial specializations
but rejects it with an error on those of variables.

Clang accepts but silently ignores it on class template partial
specializations and rejects with an error it on variables.

MSVC accepts and honors it on variables but silently ignores it
on class template partial specializations.

GCC ignores it silently on class partial specializations and
with a warning on variables (I opened bug 84347 to track this
and to have GCC honor is everywhere).

This is clearly a mess, which isn't surprising given how poorly
specified this is in the standard.  But from the test cases and
from the core discussion it seems clear that deprecating
a template, including its partial specializations (as opposed
to just a single explicit specialization) is desirable and
already supported, and that the wording in the standard just
needs to be adjusted to reflect that.



Martin

[*] Except (as Richard noted) that the standard doesn't seem to
allow a template to be deprecated.  I think that's a bug in the
spec because all implementations allow it to some degree.


One other note.  While thinking about this problem during
the core discussion, another approach to deprecating a primary
template without also deprecating all of its specializations
occurred to me.

1) First declare the primary template without [[deprecated]].
2) Next declare its non-deprecated specializations (partial
   or explicit).
3) Finally declare the primary again, this time [[deprecated]].

Like this:

  template  structS;
  template  structS { };
  template  struct [[deprecated]] S { };
  template  struct [[deprecated]] S { };

  S si; // warning
  S sci;  // no warning
  S svi;   // warning

This works as expected with Intel ICC.  All other compilers
diagnose all three variables.  I'd say for [[deprecated]] it
should work the way ICC does.  (For [[noreturn]] the first
declaration must be [[noreturn]], so there this solution
wouldn't work also because of that, in addition to function
templates not being partially-specializable.)

Martin


[Patch, Fortran, F08] PR 84313: reject procedure pointers in COMMON blocks

2018-02-13 Thread Janus Weil
Hi all,

as the subject line says, the attached patch rejects procedure
pointers in COMMON blocks (which is forbidden in F08). Since it's
apparently legal in F03, I'm still accepting it with -std=f2003 and
add that flag to a test case where this 'feature' is used. In another
one, I'm adding the error message that one gets with -std=f2008.

As my last submission, this fixes fallout from
https://groups.google.com/forum/?fromgroups#!topic/comp.lang.fortran/AIHRQ2kJv3c.
As the last one, it is a very simple fix for an accepts-invalid
problem (which is not a regression), so I hope this one will also
still be suitable for trunk (if not, I hope the release managers, in
CC, will stop me).

It does regtest cleanly on x86_64-linux-gnu. Ok for trunk?

Cheers,
Janus


2018-02-13  Janus Weil  

PR fortran/84313
* symbol.c (check_conflict): Reject procedure pointers in common blocks.


2018-02-13  Janus Weil  

PR fortran/84313
* gfortran.dg/proc_ptr_common_1.f90: Fix invalid test case,
add necessary compiler options.
* gfortran.dg/proc_ptr_common_2.f90: Add missing error message.
Index: gcc/fortran/symbol.c
===
--- gcc/fortran/symbol.c(revision 257589)
+++ gcc/fortran/symbol.c(working copy)
@@ -809,7 +809,9 @@ check_conflict (symbol_attribute *attr, const char
conf2 (threadprivate);
}
 
-  if (!attr->proc_pointer)
+  /* Procedure pointers in COMMON blocks are allowed in F03,
+   * but forbidden per F08:C5100.  */
+  if (!attr->proc_pointer || (gfc_option.allow_std & GFC_STD_F2008))
conf2 (in_common);
 
   conf2 (omp_declare_target_link);
Index: gcc/testsuite/gfortran.dg/proc_ptr_common_1.f90
===
--- gcc/testsuite/gfortran.dg/proc_ptr_common_1.f90 (revision 257589)
+++ gcc/testsuite/gfortran.dg/proc_ptr_common_1.f90 (working copy)
@@ -1,16 +1,18 @@
 ! { dg-do run }
-
+! { dg-options "-std=f2003 -fall-intrinsics" }
+!
 ! PR fortran/36592
 !
 ! Procedure Pointers inside COMMON blocks.
+! (Allowed in F03, but forbidden in F08.)
 !
 ! Contributed by Janus Weil .
 
 subroutine one()
   implicit none
-  common /com/ p1,p2,a,b
   procedure(real), pointer :: p1,p2
   integer :: a,b
+  common /com/ p1,p2,a,b
   if (a/=5 .or. b/=-9 .or. p1(0.0)/=1.0 .or. p2(0.0)/=0.0) call abort()
 end subroutine one
 
Index: gcc/testsuite/gfortran.dg/proc_ptr_common_2.f90
===
--- gcc/testsuite/gfortran.dg/proc_ptr_common_2.f90 (revision 257589)
+++ gcc/testsuite/gfortran.dg/proc_ptr_common_2.f90 (working copy)
@@ -12,7 +12,7 @@ abstract interface
 end interface
 
 procedure(foo), pointer, bind(C) :: proc
-common /com/ proc,r
+common /com/ proc,r  ! { dg-error "PROCEDURE attribute conflicts with COMMON 
attribute" }
 
 common s
 call s()  ! { dg-error "PROCEDURE attribute conflicts with COMMON attribute" }


[PATCH][committed] Fix trivial rl78 failure

2018-02-13 Thread Jeff Law

I don't think this is sufficient to bring the rl78 back to a working
state, but it's a necessary step.

ARGS is unused in rl78_handle_func_attribute, but wasn't marked
appropriately.

Fixed in the obvious way.  Committed to the trunk.

Jeff
commit 5c38c127433adad901f70bac0b9cff258ae831cd
Author: law 
Date:   Tue Feb 13 18:17:23 2018 +

* config/rl78/rl78.c (rl78_handle_func_attribute): Mark
ARGS as unused.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@257632 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index f2a89589995..a7b43651115 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2018-02-13  Jeff Law  
+
+   * config/rl78/rl78.c (rl78_handle_func_attribute): Mark
+   ARGS as unused.
+
 2018-02-13  Alexandre Oliva 
 
* common.opt (gas-loc-support, gas-locview-support): New.
diff --git a/gcc/config/rl78/rl78.c b/gcc/config/rl78/rl78.c
index 7b2be23577b..5158e83c364 100644
--- a/gcc/config/rl78/rl78.c
+++ b/gcc/config/rl78/rl78.c
@@ -804,7 +804,7 @@ is_brk_interrupt_func (const_tree decl)
 static tree
 rl78_handle_func_attribute (tree * node,
tree   name,
-   tree   args,
+   tree   args ATTRIBUTE_UNUSED,
intflags ATTRIBUTE_UNUSED,
bool * no_add_attrs)
 {


Re: Reduce inline limits a bit to compensate changes in inlining metrics

2018-02-13 Thread Jan Hubicka
> Hi.
> 
> I see quite SPEC 2006 benchmark changes for -Ofast -march=native:
> (note that size changes are not presented all)

Thanks! It is interesting that I do not see similar observations on Czerny 
which also
runs spec2006. In partiuclar both AMD machines agrees on soplex milc and bwaves.
They are flat in Czerny graph, so it may be again dependency of AMD machines on 
code lyaout
we was seeing recently.
https://gcc.opensuse.org/gcc-old/SPEC/CFP/sb-czerny-head-64-2006/410_bwaves_recent_big.png
https://gcc.opensuse.org/gcc-old/SPEC/CFP/sb-czerny-head-64-2006/433_milc_recent_big.png
https://gcc.opensuse.org/gcc-old/SPEC/CFP/sb-czerny-head-64-2006/450_soplex_recent_big.png

We sould take a look at spoplex. Megrez will pick up the patch tonight so we 
will see if the
regression reproduces on both bulldozer machines.

Honza

> 
> 1) gillan (AMD bulldozer):
> 
> +--+---+--+--+---+---+---+
> | Performance Regressions - Execution Time |   Δ   | Previous | Current  | σ 
> | Δ (B) | σ (B) |
> +--+---+--+--+---+---+---+
> | SPEC/SPEC2006/FP/453.povray  | 6.06% | 180.9933 | 191.9582 | - 
> | 3.48% | - |
> | SPEC/SPEC2006/FP/410.bwaves  | 5.13% | 307.0029 | 322.7495 | - 
> | 8.11% | - |
> | SPEC/SPEC2006/FP/433.milc| 4.80% | 310.4705 | 325.3810 | - 
> | 4.01% | - |
> | SPEC/SPEC2006/FP/450.soplex  | 3.50% | 449.4463 | 465.1627 | - 
> | 3.09% | - |
> | SPEC/SPEC2006/INT/471.omnetpp| 1.41% | 562.2769 | 570.1829 | - 
> | 1.65% | - |
> | SPEC/SPEC2006/FP/447.dealII  | 1.40% | 359.1753 | 364.1933 | - 
> | 1.66% | - |
> | SPEC/SPEC2006/INT/483.xalancbmk  | 1.00% | 430.9722 | 435.2821 | - 
> | 7.59% | - |
> +--+---+--+--+---+---+---+
> 
> +---++--+--+---++---+
> | Performance Improvements - Execution Time |   Δ| Previous | Current  | 
> σ | Δ (B)  | σ (B) |
> +---++--+--+---++---+
> | SPEC/SPEC2006/INT/429.mcf | -7.93% | 606.7478 | 558.6419 | 
> - | -7.42% | - |
> | SPEC/SPEC2006/INT/473.astar   | -2.07% | 577.5786 | 565.6328 | 
> - | -6.16% | - |
> | SPEC/SPEC2006/FP/435.gromacs  | -2.01% | 329.9626 | 323.3386 | 
> - | -0.34% | - |
> | SPEC/SPEC2006/INT/458.sjeng   | -1.71% | 667.0133 | 655.5964 | 
> - | -1.54% | - |
> | SPEC/SPEC2006/FP/436.cactusADM| -1.07% | 379.5225 | 375.4560 | 
> - | -4.12% | - |
> +---++--+--+---++---+
> 
> +---+-+--+--+---+-+---+
> |  Performance Improvements - size  |Δ|   Previous   
> |   Current| σ |  Δ (B)  | σ (B) |
> +---+-+--+--+---+-+---+
> | SPEC/SPEC2006/FP/482.sphinx3/elf/sections/text| -15.92% |  244450. 
> |  205522. | - | -9.45%  | - |
> | SPEC/SPEC2006/FP/482.sphinx3/elf  | -12.16% |  335640. 
> |  294816. | - | -6.46%  | - |
> | SPEC/SPEC2006/INT/456.hmmer/elf/sections/text | -12.14% |  395778. 
> |  347746. | - | -9.75%  | - |
> | SPEC/SPEC2006/INT/483.xalancbmk/elf/sections/text | -10.43% | 3282082. 
> | 2939698. | - | -11.33% | - |
> | SPEC/SPEC2006/FP/453.povray/elf/sections/text | -9.48%  | 1004658. 
> |  909426. | - | -12.68% | - |
> | SPEC/SPEC2006/INT/400.perlbench/elf/sections/text | -9.42%  | 1141458. 
> | 1033890. | - | -9.75%  | - |
> | SPEC/SPEC2006/INT/456.hmmer/elf   | -8.77%  |  508624. 
> |  464024. | - | -7.28%  | - |
> +---+-+--+--+---+-+---+
> 
> 2) AMD Ryzen 5 machine:
> 
> +--+---+--+--+---+---+---+
> | Performance Regressions - Execution Time |   Δ   | Previous | Current  | σ 
> | Δ (B) | σ (B) |
> +--+---+--+--+---+---+---+
> | SPEC/SPEC2006/FP/450.soplex  | 6.28% | 204.6552 | 217.5132 | - 
> | 8.06% | - |
> | SPEC/SPEC2006/FP/433.milc| 6.11% | 200.5912 | 212.8397 | - 
> | 5.81% | - |
> | SPEC/SPEC2006/FP/410.bwaves  | 3.28% | 145.9785 | 150.7609 | - 
> | 5.37% | - |
> | SPEC/SPEC2006/INT/462.libquantum | 1.98% | 306.7858 | 312.8472 | - 
> | 2.14% | - |
> 

Re: [PATCH] FIx endless match.pd recursion on cst1 + cst2 + cst3 (PR tree-optimization/84334)

2018-02-13 Thread Richard Biener
On February 13, 2018 6:51:29 PM GMT+01:00, Jakub Jelinek  
wrote:
>Hi!
>
>On the following testcase, we recurse infinitely, because
>we have float re-association enabled, but also rounding-math, so
>we try to optimize (cst1 + cst2) + cst3 as (cst2 + cst3) + cst1
>but (cst2 + cst3) doesn't simplify and we try again and optimize
>it as (cst3 + cst1) + cst2 and then (cst1 + cst2) + cst3 and so on
>forever.  If @0 is not a CONSTANT_CLASS_P, there is not a problem,
>if it is, the code just checks if we can actually simplify the
>operation between cst2 and cst3 into a constant.

Is there a reason to try simplifying at all for constant @0?  I'd rather not 
try to avoid all the complex code. 

Richard. 

>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
>2018-02-13  Jakub Jelinek  
>
>   PR tree-optimization/84334
>   * match.pd ((A +- CST1) +- CST2 -> A + CST3): If A is
>   also a CONSTANT_CLASS_P, only optimize if we can fold the
>   operation between CST1 and CST2 into a constant.
>
>   * gcc.dg/pr84334.c: New test.
>
>--- gcc/match.pd.jj2018-02-13 09:33:31.0 +0100
>+++ gcc/match.pd   2018-02-13 12:14:08.108314686 +0100
>@@ -1733,9 +1733,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  CONSTANT_CLASS_P@2)
>  /* If one of the types wraps, use that one.  */
>  (if (!ANY_INTEGRAL_TYPE_P (type) || TYPE_OVERFLOW_WRAPS (type))
>-  (if (outer_op == PLUS_EXPR)
>-   (plus (view_convert @0) (inner_op @2 (view_convert @1)))
>-   (minus (view_convert @0) (neg_inner_op @2 (view_convert @1
>+  /* If all 3 captures are CONSTANT_CLASS_P, only optimize if we
>+   can simplify @2 with @1 into a constant, otherwise we might recurse
>+   forever.  */
>+  (if (CONSTANT_CLASS_P (@0))
>+   (with { tree cst = fold_unary (VIEW_CONVERT_EXPR, type, @1);
>+ if (cst && CONSTANT_CLASS_P (cst))
>+   cst = const_binop (outer_op == PLUS_EXPR
>+  ? inner_op : neg_inner_op, type,
>+  @2, cst); }
>+  (if (cst)
>+   (outer_op (view_convert @0) { cst; })))
>+   (if (outer_op == PLUS_EXPR)
>+  (plus (view_convert @0) (inner_op @2 (view_convert @1)))
>+  (minus (view_convert @0) (neg_inner_op @2 (view_convert @1)
>   (if (!ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
>  || TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))
>(if (outer_op == PLUS_EXPR)
>--- gcc/testsuite/gcc.dg/pr84334.c.jj  2018-02-13 12:18:12.765463667
>+0100
>+++ gcc/testsuite/gcc.dg/pr84334.c 2018-02-13 11:36:56.019632428 +0100
>@@ -0,0 +1,12 @@
>+/* PR tree-optimization/84334 */
>+/* { dg-do compile } */
>+/* { dg-options "-Ofast -frounding-math" } */
>+
>+float
>+foo (void)
>+{
>+  float a = 9.99974752427078783512115478515625e-7f;
>+  float b = 1.4950485415756702423095703125e-6f;
>+  float c = 4.99873689375817775726318359375e-6f;
>+  return a + b + c;
>+}
>
>   Jakub



Re: [RFC PATCH] avoid applying attributes to explicit specializations (PR 83871)

2018-02-13 Thread Martin Sebor

On 02/13/2018 07:47 AM, Jason Merrill wrote:

On Mon, Feb 12, 2018 at 6:39 PM, Martin Sebor  wrote:

I really don't think it's helpful to try to force noreturn
to match between the primary and its specializations.


I continue to disagree.


Can you explain what use case you are concerned about that isn't
already handled by the warning about noreturn function returning?

For reference, the cases I consider important are:

1) "Unimplemented" primary template declared noreturn that throws
   or exits but whose specializations return a useful value and
   make use of attribute malloc (or one of the other blacklisted
   attributes).

2) The converse of (1).  A primary that returns a useful malloc
   like value and some of whose specializations are not/cannot
   be meaningfully implemented and are declared noreturn.

Defining template specializations that differ from the primary
template in their implementation is idiomatic (analogous to
defining overloads or overridden virtual functions).

In any event, I am mainly interested in fixing the two bugs
(one a P1 regression).   If you consider changing the warning
aspect of the patch a condition of accepting the fix please let
me know.  Removing the noreturn keyword from the whitelist is
trivial.

Martin


Re: [PATCH] Fix a brown paper bag bug in my recent match.pd change (PR middle-end/84309)

2018-02-13 Thread Richard Biener
On February 13, 2018 6:54:36 PM GMT+01:00, Jakub Jelinek  
wrote:
>Hi!
>
>When changing my recent match.pd fix to defer the -> exp2 (log2 (C) *
>x)
>optimization until later, I've swapped the two patterns and changed
>the first one to use exps and logs, but forgot to change the second one
>to use exp2s and log2s.  For the testcase in the patch it actually
>didn't
>make a difference, it was enough that we deferred optimizing it
>(because
>we shortly afterwards propagated constants into the pow and constant
>folded
>it).
>
>This patch adds a testcase that verifies it is done properly.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK. 
Richard. 
>
>2018-02-13  Jakub Jelinek  
>
>   PR middle-end/84309
>   * match.pd (pow(C,x) -> exp(log(C)*x)): Use exp2s and log2s instead
>   of exps and logs in the use_exp2 case.
>
>   * gcc.dg/pr84309-2.c: New test.
>
>--- gcc/match.pd.jj2018-02-13 12:14:08.108314686 +0100
>+++ gcc/match.pd   2018-02-13 12:28:59.958328523 +0100
>@@ -4032,7 +4032,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   /* As libmvec doesn't have a vectorized exp2, defer optimizing
>this until after vectorization.  */
>   (if (canonicalize_math_after_vectorization_p ())
>-  (exps (mult (logs @0) @1
>+  (exp2s (mult (log2s @0) @1
> 
>  (for sqrts (SQRT)
>   cbrts (CBRT)
>--- gcc/testsuite/gcc.dg/pr84309-2.c.jj2018-02-13 12:30:25.208304990
>+0100
>+++ gcc/testsuite/gcc.dg/pr84309-2.c   2018-02-13 12:31:35.959285452
>+0100
>@@ -0,0 +1,11 @@
>+/* PR middle-end/84309 */
>+/* { dg-do compile } */
>+/* { dg-options "-Ofast -fdump-tree-optimized" } */
>+
>+double
>+foo (double x)
>+{
>+  return __builtin_pow (2.0, x);
>+}
>+
>+/* { dg-final { scan-tree-dump "__builtin_exp2 " "optimized" { target
>*-*-linux* *-*-gnu* } } } */
>
>   Jakub



Re: [PATCH] adjust warning_n() to take uhwi (PR 84207)

2018-02-13 Thread Martin Sebor

On 02/13/2018 10:07 AM, Manuel López-Ibáñez wrote:

On 13/02/18 03:10, Martin Sebor wrote:

PS Is there any reason why diagnostic-core.h and diagnostic.c
does not/should not include tree.h and other GCC headers?


The short reason is that we want to keep the core diagnostics as
independent of the rest of the compiler as possible so that using it
doesn't bring undesired dependencies. It took some effort a few years
ago to clean-up headers, in particular, the mess of toplev.h, inclusion
of rtl in FE files, inclusion of tree.h in back-end files, etc. I'm not
sure what is the status now but I'm afraid that without any definition
of actual internal/external interfaces and without actual physical
separation of files and compilation steps (read: libraries), the headers
may have crept back in.


Thanks for the background.  It makes sense to keep header
dependencies to a minimum.  Here it seems like more than
that because adding calls to (for instance) tree_to_uhwi()
to diagnostic.c causes:

/src/gcc/svn/gcc/diagnostic.c:56: undefined reference to 
`tree_to_uhwi(tree_node const*)'

collect2: error: ld returned 1 exit status
Makefile:2929: recipe for target 'gcov' failed
make[2]: *** [gcov] Error 1

I wanted to make the _n() functions like warning_n() more
robust by letting them accept a tree argument (as well as
offset_int and wide_int) in addition to HOST_WIDE_INT but
I can't do it if they can't work with these types.


Long time ago there was a plan of making GCC more modular, perhaps
moving to a library framework like LLVM/Clang. I even had a patch that
moved all core diagnostics and line-map stuff to its own libdiagnostic
that was used by libcpp and the rest of the compiler (instead of
overriding call-backs in libcpp like we do now and forcing libcpp into
every executable that wishes to use diagnostics or line-maps).
Unfortunately, it never worked properly because of the build machinery.

More background (probably very much outdated and forgotten):

https://docs.google.com/document/pub?id=1ZfyfkB62EFaR4_g4JKm4--guz3vxm9pciOBziMHTnK4


https://gcc.gnu.org/wiki/rearch

https://gcc.gnu.org/wiki/ModularGCC



Yes, better/greater modularity would be useful.

Thanks again for the background.

Martin


[PATCH] FIx endless match.pd recursion on cst1 + cst2 + cst3 (PR tree-optimization/84334)

2018-02-13 Thread Jakub Jelinek
Hi!

On the following testcase, we recurse infinitely, because
we have float re-association enabled, but also rounding-math, so
we try to optimize (cst1 + cst2) + cst3 as (cst2 + cst3) + cst1
but (cst2 + cst3) doesn't simplify and we try again and optimize
it as (cst3 + cst1) + cst2 and then (cst1 + cst2) + cst3 and so on
forever.  If @0 is not a CONSTANT_CLASS_P, there is not a problem,
if it is, the code just checks if we can actually simplify the
operation between cst2 and cst3 into a constant.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-02-13  Jakub Jelinek  

PR tree-optimization/84334
* match.pd ((A +- CST1) +- CST2 -> A + CST3): If A is
also a CONSTANT_CLASS_P, only optimize if we can fold the
operation between CST1 and CST2 into a constant.

* gcc.dg/pr84334.c: New test.

--- gcc/match.pd.jj 2018-02-13 09:33:31.0 +0100
+++ gcc/match.pd2018-02-13 12:14:08.108314686 +0100
@@ -1733,9 +1733,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   CONSTANT_CLASS_P@2)
  /* If one of the types wraps, use that one.  */
  (if (!ANY_INTEGRAL_TYPE_P (type) || TYPE_OVERFLOW_WRAPS (type))
-  (if (outer_op == PLUS_EXPR)
-   (plus (view_convert @0) (inner_op @2 (view_convert @1)))
-   (minus (view_convert @0) (neg_inner_op @2 (view_convert @1
+  /* If all 3 captures are CONSTANT_CLASS_P, only optimize if we
+can simplify @2 with @1 into a constant, otherwise we might recurse
+forever.  */
+  (if (CONSTANT_CLASS_P (@0))
+   (with { tree cst = fold_unary (VIEW_CONVERT_EXPR, type, @1);
+  if (cst && CONSTANT_CLASS_P (cst))
+cst = const_binop (outer_op == PLUS_EXPR
+   ? inner_op : neg_inner_op, type,
+   @2, cst); }
+   (if (cst)
+(outer_op (view_convert @0) { cst; })))
+   (if (outer_op == PLUS_EXPR)
+   (plus (view_convert @0) (inner_op @2 (view_convert @1)))
+   (minus (view_convert @0) (neg_inner_op @2 (view_convert @1)
   (if (!ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
   || TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))
(if (outer_op == PLUS_EXPR)
--- gcc/testsuite/gcc.dg/pr84334.c.jj   2018-02-13 12:18:12.765463667 +0100
+++ gcc/testsuite/gcc.dg/pr84334.c  2018-02-13 11:36:56.019632428 +0100
@@ -0,0 +1,12 @@
+/* PR tree-optimization/84334 */
+/* { dg-do compile } */
+/* { dg-options "-Ofast -frounding-math" } */
+
+float
+foo (void)
+{
+  float a = 9.99974752427078783512115478515625e-7f;
+  float b = 1.4950485415756702423095703125e-6f;
+  float c = 4.99873689375817775726318359375e-6f;
+  return a + b + c;
+}

Jakub


Re: [RFC PATCH] avoid applying attributes to explicit specializations (PR 83871)

2018-02-13 Thread Martin Sebor

On 02/13/2018 08:59 AM, Michael Matz wrote:

Hi,

On Mon, 12 Feb 2018, Martin Sebor wrote:


Removing noreturn from the whitelist means having to prevent
the attribute from causing conflicts with the attributes on
the blacklist.  E.g., in this:

  template  [[malloc]] void* allocate (int);

  template <> [[noreturn]] void* allocate (int);


Marking a function having a return type as noreturn doesn't make sense.


No, that's certainly not so in general(*).  It makes sense
when a function cannot meaningfully be implemented to honor
its broader API contract (this applies to overloads, virtual
functions, and also template specializations).  In the example
above, the author of the allocate template may wish to have it
allocate and initialize an array of T which can only be
implemented for non-void types so they define allocate like
this:

  template 
  [[gnu::malloc]]
  T*
  allocate (int n)
  {
return new T[n];
  }

  template <>
  [[noreturn]]
  void*
  allocate (int)
  {
throw "cannot allocate void arrays";
  }

Defining a specialization that differs from a primary or vice
versa is a common idiom that I don't want to force users to
abandon.

As I already mentioned, this idiom has a parallel in object
oriented code (as opposed to in generic code) where a "pure"
virtual functions are declared to return a non-void type and
defined to exit/throw/abort, or not defined at all (i.e.,
the same as abort).  In these case, warning would on calls
to such functions would be helpful the same that warning
on noreturn functions


So a warning in this case is actually a good thing.


A warning is only useful if it detects a bug or suggests
a potential improvement.  The bug Jason is concerned about
having the warning detect involves changing the allocate
specialization in the future to return a value without
removing the noreturn atttribute.  But that bug doesn't
exist with the implementation above and is already diagnosed
with the changed implementation, so there is no reason to
warn for it now.

No other use case/concern where a warning on the above might
be useful has been brought up so, AFAICS, the only instances
of the warning will be false positives.


And changing the
return type to void (so that noreturn makes sense) makes it not a
specialization anymore (or alternatively if the primary is also changed to
void then malloc doesn't make sense anymore).


Exactly.  The only way to suppress the warning is to either
remove the attributes from the primary, or duplicate them
on the specialization.  Neither approach would be correct
because neither would reflect the property of the declaration
it's applied to.

Martin

[*] There are a number of practical examples that show where
noreturn is useful with non-void return types.  The common ones
involve overloads of APIs with an expected signature that are
not/cannot be implemented to return a value.  See the cfe-dev
discussion at
http://lists.llvm.org/pipermail/cfe-dev/2011-May/014969.html
for one such example.  For another one see N4226 where applying
noreturn to main was considered sufficiently useful to justify
a C++ proposal for an enhancement.  (AFAIK, the proposal hasn't
yet been accepted.


Re: [PATCH] adjust warning_n() to take uhwi (PR 84207)

2018-02-13 Thread Manuel López-Ibáñez

On 13/02/18 03:10, Martin Sebor wrote:

PS Is there any reason why diagnostic-core.h and diagnostic.c
does not/should not include tree.h and other GCC headers?


The short reason is that we want to keep the core diagnostics as independent of 
the rest of the compiler as possible so that using it doesn't bring undesired 
dependencies. It took some effort a few years ago to clean-up headers, in 
particular, the mess of toplev.h, inclusion of rtl in FE files, inclusion of 
tree.h in back-end files, etc. I'm not sure what is the status now but I'm 
afraid that without any definition of actual internal/external interfaces and 
without actual physical separation of files and compilation steps (read: 
libraries), the headers may have crept back in.


Long time ago there was a plan of making GCC more modular, perhaps moving to a 
library framework like LLVM/Clang. I even had a patch that moved all core 
diagnostics and line-map stuff to its own libdiagnostic that was used by libcpp 
and the rest of the compiler (instead of overriding call-backs in libcpp like 
we do now and forcing libcpp into every executable that wishes to use 
diagnostics or line-maps). Unfortunately, it never worked properly because of 
the build machinery.


More background (probably very much outdated and forgotten):

https://docs.google.com/document/pub?id=1ZfyfkB62EFaR4_g4JKm4--guz3vxm9pciOBziMHTnK4

https://gcc.gnu.org/wiki/rearch

https://gcc.gnu.org/wiki/ModularGCC

Cheers,

Manuel.


Re: [RX] Fix PR 83831 -- Unused bclr, bnot, bset insns

2018-02-13 Thread Nick Clifton
Hi Oleg,

> OK for trunk?

> gcc/ChangeLog:
> 
>   PR target/83831
>   * config/rx/rx-protos.h (rx_reg_dead_or_unused_after_insn,
>   rx_copy_reg_dead_or_unused_notes, rx_fuse_in_memory_bitop): New
>   declarations.
>   (set_of_reg): New struct.
>   (rx_find_set_of_reg, rx_find_use_of_reg): New functions.
>   * config/rx/rx.c (rx_reg_dead_or_unused_after_insn,
>   rx_copy_reg_dead_or_unused_notes, rx_fuse_in_memory_bitop): New
>   functions.
>   * config/rx/rx.md (andsi3, iorsi3, xorsi3): Convert to insn_and_split.
>   Split into bitclr, bitset, bitinvert patterns if appropriate.
>   (*bitset, *bitinvert, *bitclr): Convert to named insn_and_split and
>   use rx_fuse_in_memory_bitop.
>   (*bitset_in_memory, *bitinvert_in_memory, *bitclr_in_memory): Convert
>   to named insn, correct maximum insn length.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR target/83831
>   * gcc.target/rx/pr83831.c: New tests.

Approved - please apply - and thanks very much for doing this!

Cheers
  Nick




Re: Fallout: PR84340

2018-02-13 Thread Martin Liška

On 02/13/2018 04:22 PM, Paolo Bonzini wrote:

On 13/02/2018 14:35, Jakub Jelinek wrote:

On Tue, Feb 13, 2018 at 12:21:55PM +0100, Jakub Jelinek wrote:

On Tue, Feb 13, 2018 at 12:15:36PM +0100, Paolo Bonzini wrote:

The issue is that the ASAN_CHECK doesn't exist at early DSE time, and
thus causes the store to disappear.


If it was DSE removing the stores before asan pass, then it would FAIL
before as well.


Sorry, while ASAN_CHECK is introduced late, ASAN_MARK is present there
already from the gimplification.


Yeah, and it keeps everything alive.


For use after scope, I guess a lot of the stores after end of scope
(rather than reads) are something DSE could consider removing.
So, shall we just disable DSE on vars where their address "escapes"
through ASAN_MARK when -fsanitize-address-use-after-scope?


But the stores _are_ dead; it's only the ASAN_CHECK that isn't.  Hence
the idea of doing the entire instrumentation very early.


Generally, dead stores could be eliminable when stored before the
corresponding ASAN_MARK poison (but even ASAN_MARK with "..W.." will
prevent those) and uneliminable when stored after ASAN_MARK poison.

For the "fn spec" for now, I'd just go with "..R.." for ASAN_CHECK and
NULL for ASAN_MARK for now.


I'm a bit scared of that even, :) especially in stage4.  If you think
it's safe enough, I can give it a shot, but honestly I wouldn't have
much time to deal with the fallout now (hence the quick revert).


Let me do it tomorrow.

Martin



Paolo





Re: [PATCH][i386][3/3] PR target/84164: Make *cmpqi_ext_ patterns accept more zero_extract modes

2018-02-13 Thread Jeff Law
On 02/09/2018 07:50 AM, Kyrill Tkachov wrote:
> Hi Uros,
> 
> On 08/02/18 22:54, Uros Bizjak wrote:
>> On Thu, Feb 8, 2018 at 6:11 PM, Kyrill  Tkachov
>>  wrote:
>>> Hi all,
>>>
>>> This patch fixes some fallout in the i386 testsuite that occurs after
>>> the
>>> simplification in patch [1/3] [1].
>>> The gcc.target/i386/extract-2.c FAILs because it expects to match:
>>> (set (reg:CC 17 flags)
>>>  (compare:CC (subreg:QI (zero_extract:SI (reg:HI 98)
>>>  (const_int 8 [0x8])
>>>  (const_int 8 [0x8])) 0)
>>>  (const_int 4 [0x4])))
>>>
>>> which is the *cmpqi_ext_2 pattern in i386.md but with the new
>>> simplification
>>> the combine/simplify-rtx
>>> machinery produces:
>>> (set (reg:CC 17 flags)
>>>  (compare:CC (subreg:QI (zero_extract:HI (reg:HI 98)
>>>  (const_int 8 [0x8])
>>>  (const_int 8 [0x8])) 0)
>>>  (const_int 4 [0x4])))
>>>
>>> Notice that the zero_extract now has HImode like the register source
>>> rather
>>> than SImode.
>>> The existing *cmpqi_ext_ patterns however explicitly demand an
>>> SImode on
>>> the zero_extract.
>>> I'm not overly familiar with the i386 port but I think that's too
>>> restrictive.
>>> The RTL documentation says:
>>> For (zero_extract:m loc size pos) "The mode m is the same as the mode
>>> that
>>> would be used for loc if it were a register."
>>> I'm not sure if that means that the mode of the zero_extract and the
>>> source
>>> register must always match (as is the
>>> case after patch [1/3]) but in any case it shouldn't matter semantically
>>> since we're taking a QImode subreg of the whole
>>> thing anyway.
>>>
>>> So the proposed solution in this patch is to allow HI, SI and DImode
>>> zero_extracts in these patterns as these are the
>>> modes that the ext_register_operand predicate accepts, so that the
>>> patterns
>>> can match the new form above.
>>>
>>> With this patch the aforementioned test passes again and bootstrap and
>>> testing on x86_64-unknown-linux-gnu shows
>>> no regressions.
>>>
>>> Is this ok for trunk if the first patch is accepted?
>> Huh, there are many other zero-extract patterns besides cmpqi_ext_*
>> with QImode subreg of SImode zero_extract in i386.md, used to access
>> high QImode register of HImode pair. A quick grep shows these that
>> have _ext_ in their name:
>>
>> (define_insn "*cmpqi_ext_1"
>> (define_insn "*cmpqi_ext_2"
>> (define_expand "cmpqi_ext_3"
>> (define_insn "*cmpqi_ext_3"
>> (define_insn "*cmpqi_ext_4"
>> (define_insn "addqi_ext_1"
>> (define_insn "*addqi_ext_2"
>> (define_expand "testqi_ext_1_ccno"
>> (define_insn "*testqi_ext_1"
>> (define_insn "*testqi_ext_2"
>> (define_insn_and_split "*testqi_ext_3"
>> (define_insn "andqi_ext_1"
>> (define_insn "*andqi_ext_1_cc"
>> (define_insn "*andqi_ext_2"
>> (define_insn "*qi_ext_1"
>> (define_insn "*qi_ext_2"
>> (define_expand "xorqi_ext_1_cc"
>> (define_insn "*xorqi_ext_1_cc"
>>
>> There are also relevant splitters and peephole2 patterns.
> 
> I see. Another approach I've looked at is removing the mode specifier from
> the zero_extract in these patterns. This means that they can be of any mode
> so they will match all of these modes without creating new patterns through
> iterators. That also works for the testcase and passes bootstrap and
> testing
> however there is the snag that the define_insns that don't start with a "*"
> are used to generate RTL through the gen_* mechanism and in that context
> the absence of a mode on the zero_extract would mean a VOIDmode
> zero_extract
> would be created, which I'm fairly sure is not good. So in my
> experiments I left
> those patterns alone (with an explicit SI on the zero_extract).
> 
>> IIRC, SImode zero_extract was enough to catch all high-register uses.
>> There will be a pattern explosion if we want to handle all other
>> integer modes here. However, I'm not a RTL expert, so someone will
>> have to say what is the correct RTX form here.
> 
> Jeff, Richard, could you please give us some guidance on this issue?
> Sorry for the trouble.
> 
I don't think any of the patterns above are known to the generic code.
So you just have to check the x86 backend to see their precise uses in a
generator (ie gen_cmpqi_ext_3) and verify those do not allow a VOIDmode
(or any other undesirable mode) to slip through.

Jeff


Re: [PATCH v6] aarch64: Add split-stack support

2018-02-13 Thread Wilco Dijkstra
Hi Adhemerval,

A few comments on the assembly code:


+# This function is called with non-standard calling convention: on entry
+# x10 is the requested stack pointer, x11 is previous stack pointer (if
+# functions has stacked arguments which needs to be restored), and x12 is
+# the caller link register on function entry (which will be restored by
+# morestack when returning to caller).  The split-stack prologue is in
+# the form:
+#
+# function:
+#  mrsx9, tpidr_el0
+#  ldur   x9, [x9, #-8]
+#  movx10, 
+#  movk   x10, #0x0, lsl #16
+#  subx10, sp, x10
+#  movx11, sp  # if function has stacked arguments

+#  movx12, x30

This should go...

+#  cmpx9, x10
+#  bcc.LX
+# main_fn_entry:
+#  [function body]
+# LX:

...here:

movx12, x30

+#  bl  __morestack
+#  b   main_fn_entry

Note using a bl here rather than a branch means you'll kill the return stack.

+# The N bit is also restored to indicate that the function is called
+# (so the prologue addition can set up the argument pointer correctly).
+
+ENTRY(__morestack)
+.LFB1:
+   .cfi_startproc
+
+#ifdef __PIC__
+   .cfi_personality 0x9b,DW.ref.__gcc_personality_v0
+   .cfi_lsda 0x1b,.LLSDA1
+#else
+   .cfi_personality 0x3,__gcc_personality_v0
+   .cfi_lsda 0x3,.LLSDA1
+#endif
+   # Calculate requested stack size.
+   sub x10, sp, x10
+
+   # Save parameters
+   stp x29, x12, [sp, -MORESTACK_FRAMESIZE]!
+   .cfi_def_cfa_offset MORESTACK_FRAMESIZE
+   .cfi_offset 29, -MORESTACK_FRAMESIZE
+   .cfi_offset 30, -MORESTACK_FRAMESIZE+8
+   add x29, sp, 0
+   .cfi_def_cfa_register 29
+   # Adjust the requested stack size for the frame pointer save.
+   stp x0, x1, [x29, 16]
+   stp x2, x3, [x29, 32]
+   add x10, x10, BACKOFF
+   stp x4, x5, [x29, 48]
+   stp x6, x7, [x29, 64]
+   stp x8, x30, [x29, 80]
+   str x10, [x29, 96]

Probably best to use NEWSTACK_SAVE here for clarity.

+   # void __morestack_block_signals (void)
+   bl  __morestack_block_signals
+
+   # void *__generic_morestack (size_t *pframe_size,
+   #void *old_stack,
+   #size_t param_size)
+   # pframe_size: is the size of the required stack frame (the function
+   #  amount of space remaining on the allocated stack).
+   # old_stack: points at the parameters the old stack
+   # param_size: size in bytes of parameters to copy to the new stack.
+   add x0, x29, NEWSTACK_SAVE
+   add x1, x29, MORESTACK_FRAMESIZE
+   mov x2, 0
+   bl  __generic_morestack
+
+   # Start using new stack
+   mov sp, x0
+
+   # Set __private_ss stack guard for the new stack.
+   ldr x9, [x29, NEWSTACK_SAVE]
+   add x0, x0, BACKOFF
+   sub x0, x0, x9
+.LEHB0:
+   mrs x1, tpidr_el0
+   str x0, [x1, SPLITSTACK_PTR_TP]

Is the label for unwinding exactly at the right place?

+   # void __morestack_unblock_signals (void)
+   bl  __morestack_unblock_signals
+
+   # Set up for a call to the target function.
+   ldp x0, x1, [x29, 16]
+   ldp x2, x3, [x29, 32]
+   ldp x4, x5, [x29, 48]
+   ldp x6, x7, [x29, 64]
+   ldp x8, x12, [x29, 80]
+   add x11, x29, MORESTACK_FRAMESIZE

+   ldr x30, [x29, 8]
+   # Indicate __morestack was called.
+   cmp x12, 0

No idea what this is for but the ldr/cmp are completely redundant.

+   blr x12
+
+   stp x0, x1, [x29, 16]

+   stp x2, x3, [x29, 32]
+   stp x4, x5, [x29, 48]
+   stp x6, x7, [x29, 64]

These 3 STPs are dead.

+   bl  __morestack_block_signals
+
+   # void *__generic_releasestack (size_t *pavailable)
+   add x0, x29, NEWSTACK_SAVE
+   bl  __generic_releasestack
+
+   # Reset __private_ss stack guard to value for old stack
+   ldr x9, [x29, NEWSTACK_SAVE]
+   add x0, x0, BACKOFF
+   sub x0, x0, x9
+
+   # Update TCB split stack field
+.LEHE0:
+   mrs x1, tpidr_el0
+   str x0, [x1, SPLITSTACK_PTR_TP]
+
+   bl __morestack_unblock_signals
+
+   # Use old stack again.
+   add sp, x29, MORESTACK_FRAMESIZE
+
+   ldp x0, x1, [x29, 16]

+   ldp x2, x3, [x29, 32]
+   ldp x4, x5, [x29, 48]
+   ldp x6, x7, [x29, 64]

These 3 LDPs are dead.

+   ldp x29, x30, [x29]
+
+   .cfi_remember_state
+   .cfi_restore 30
+   .cfi_restore 29
+   .cfi_def_cfa 31, 0
+
+   ret

[RX] Fix PR 83831 -- Unused bclr, bnot, bset insns

2018-02-13 Thread Oleg Endo
Hi,

The attached patch fixes the deficits mentioned in PR 83831 which is
about unused bclr, bnot and bset instructions.

For some simple cases, the combine pass can successfully fuse a load-
modify-store insn sequence into an insn that operates on a memory
directly.  However, in some cases where it thinks it's too complex, it
will not try to combine the insns.

What I'm doing here is similar to what I've been doing on SH in the
split1 pass after combine -- manually walking the insns up/down
(limited to the BB) to find the def/use and fuse 3 insns into 1, if
it's possible to do so.

For that I've copy-pasted some of the RTL utility functions from SH.  I
will propose folding and moving those into rtl.h / rtlanal.c during
next stage 1.

With that patch, I get a code size decrease of about 1 KByte on a
larger application.

The attached patch is the version for GCC 8 (trunk).  I've posted
versions for GCC 6 and GCC 7 in the PR.  All 3 patches have been tested
with 
   "make -k check" on rx-sim for c and c++

with no new failures.

OK for trunk?

Cheers,
Oleg

gcc/ChangeLog:

PR target/83831
* config/rx/rx-protos.h (rx_reg_dead_or_unused_after_insn,
rx_copy_reg_dead_or_unused_notes, rx_fuse_in_memory_bitop): New
declarations.
(set_of_reg): New struct.
(rx_find_set_of_reg, rx_find_use_of_reg): New functions.
* config/rx/rx.c (rx_reg_dead_or_unused_after_insn,
rx_copy_reg_dead_or_unused_notes, rx_fuse_in_memory_bitop): New
functions.
* config/rx/rx.md (andsi3, iorsi3, xorsi3): Convert to insn_and_split.
Split into bitclr, bitset, bitinvert patterns if appropriate.
(*bitset, *bitinvert, *bitclr): Convert to named insn_and_split and
use rx_fuse_in_memory_bitop.
(*bitset_in_memory, *bitinvert_in_memory, *bitclr_in_memory): Convert
to named insn, correct maximum insn length.

gcc/testsuite/ChangeLog:

PR target/83831
* gcc.target/rx/pr83831.c: New tests.Index: gcc/config/rx/rx-protos.h
===
--- gcc/config/rx/rx-protos.h	(revision 257549)
+++ gcc/config/rx/rx-protos.h	(working copy)
@@ -63,6 +63,112 @@
 extern void		rx_split_cbranch (machine_mode, enum rtx_code,
 	  rtx, rtx, rtx);
 extern machine_mode	rx_select_cc_mode (enum rtx_code, rtx, rtx);
+
+extern bool rx_reg_dead_or_unused_after_insn (const rtx_insn* i, int regno);
+extern void rx_copy_reg_dead_or_unused_notes (rtx reg, const rtx_insn* src,
+	  rtx_insn* dst);
+
+extern bool rx_fuse_in_memory_bitop (rtx* operands, rtx_insn* curr_insn,
+ rtx (*gen_insn)(rtx, rtx));
+
+/* Result value of rx_find_set_of_reg.  */
+struct set_of_reg
+{
+  /* The insn where sh_find_set_of_reg stopped looking.
+ Can be NULL_RTX if the end of the insn list was reached.  */
+  rtx_insn* insn;
+
+  /* The set rtx of the specified reg if found, NULL_RTX otherwise.  */
+  const_rtx set_rtx;
+
+  /* The set source rtx of the specified reg if found, NULL_RTX otherwise.
+ Usually, this is the most interesting return value.  */
+  rtx set_src;
+};
+
+/* FIXME: Copy-pasta from SH.  Move to rtl.h.
+   Given a reg rtx and a start insn, try to find the insn that sets
+   the specified reg by using the specified insn stepping function,
+   such as 'prev_nonnote_nondebug_insn_bb'.  When the insn is found,
+   try to extract the rtx of the reg set.  */
+template  inline set_of_reg
+rx_find_set_of_reg (rtx reg, rtx_insn* insn, F stepfunc,
+		bool ignore_reg_reg_copies = false)
+{
+  set_of_reg result;
+  result.insn = insn;
+  result.set_rtx = NULL_RTX;
+  result.set_src = NULL_RTX;
+
+  if (!REG_P (reg) || insn == NULL_RTX)
+return result;
+
+  for (rtx_insn* i = stepfunc (insn); i != NULL_RTX; i = stepfunc (i))
+{
+  if (BARRIER_P (i))
+	break;
+  if (!INSN_P (i) || DEBUG_INSN_P (i))
+	  continue;
+  if (reg_set_p (reg, i))
+	{
+	  if (CALL_P (i))
+	break;
+
+	  result.insn = i;
+	  result.set_rtx = set_of (reg, i);
+
+	  if (result.set_rtx == NULL_RTX || GET_CODE (result.set_rtx) != SET)
+	break;
+
+	  result.set_src = XEXP (result.set_rtx, 1);
+
+	  if (ignore_reg_reg_copies && REG_P (result.set_src))
+	{
+	  reg = result.set_src;
+	  continue;
+	}
+	  if (ignore_reg_reg_copies && SUBREG_P (result.set_src)
+	  && REG_P (SUBREG_REG (result.set_src)))
+	{
+	  reg = SUBREG_REG (result.set_src);
+	  continue;
+	}
+
+	  break;
+	}
+}
+
+  /* If the searched reg is found inside a (mem (post_inc:SI (reg))), set_of
+ will return NULL and set_rtx will be NULL.
+ In this case report a 'not found'.  result.insn will always be non-null
+ at this point, so no need to check it.  */
+  if (result.set_src != NULL && result.set_rtx == NULL)
+result.set_src = NULL;
+
+  return result;
+}
+
+/* FIXME: Move to rtlh.h.  */
+template  inline rtx_insn*
+rx_find_use_of_reg (rtx reg, rtx_insn* insn, F 

Re: [PATCH] diagnose specializations of deprecated templates (PR c++/84318)

2018-02-13 Thread Martin Sebor

On 02/13/2018 08:35 AM, Martin Sebor wrote:

On 02/13/2018 07:40 AM, Jason Merrill wrote:

On Mon, Feb 12, 2018 at 6:32 PM, Martin Sebor  wrote:

While testing my fix for 83871 (handling attributes on explicit
specializations) I noticed another old regression: while GCC 4.4
would diagnose declarations of explicit specializations of all
primary templates declared deprecated, GCC 4.5 and later only
diagnose declarations of explicit specializations of class
templates but not those of function or variable templates.


Hmm, the discussion on the core reflector seemed to be agreeing that
we want to be able to define non-deprecated specializations of a
deprecated primary template.


Yes, that's what Richard wanted to do.  The only way to do it
within the existing constraints(*) is to define a non-deprecated
primary, and a deprecated partial specialization.  This is in line
with that approach and supported by Clang and all other compilers
I tested (including Clang).


To clarify, this approach works for class templates (e.g., like
std::numeric_limits that was mentioned in the core discussion)
and for variable templates.  Functions have no partial
specilizations so they have to be overloaded to achieve the same
effect.

Implementations don't treat the deprecated attribute on partial
specializations consistently.

EDG accepts and honors it on class template partial specializations
but rejects it with an error on those of variables.

Clang accepts but silently ignores it on class template partial
specializations and rejects with an error it on variables.

MSVC accepts and honors it on variables but silently ignores it
on class template partial specializations.

GCC ignores it silently on class partial specializations and
with a warning on variables (I opened bug 84347 to track this
and to have GCC honor is everywhere).

This is clearly a mess, which isn't surprising given how poorly
specified this is in the standard.  But from the test cases and
from the core discussion it seems clear that deprecating
a template, including its partial specializations (as opposed
to just a single explicit specialization) is desirable and
already supported, and that the wording in the standard just
needs to be adjusted to reflect that.



Martin

[*] Except (as Richard noted) that the standard doesn't seem to
allow a template to be deprecated.  I think that's a bug in the
spec because all implementations allow it to some degree.




Re: [SFN+LVU+IEPM v4 9/9] [IEPM] Introduce inline entry point markers

2018-02-13 Thread Jeff Law
On 02/13/2018 06:43 AM, Alexandre Oliva wrote:
> On Feb 12, 2018, Alexandre Oliva  wrote:
> 
>> This patch supersedes the previous one.  Testing underway...  Ok if it
>> succeeds?
> 
> I failed to update the patch I posted after making a correct to symbol
> poisoning, that had caused builds to fail right away, sorry.  Thanks,
> Rainer, for catching the error.
> 
> Here's the patch that actually passed regstrap on native i686 and
> x86_64-linux-gnu, and fixed numerous regressions on cross builds.
> Ok to install?
> 
> 
> [LVU, IEPM] several new controlling options
> 
> Given that the minimum insn length is not generally reliable to tell
> whether an insn actually advances PC, this patch disables the locview
> list optimizations that can only be done when can tell it.
> 
> The preexisting logic is retained, however, and can be enabled with
> the newly-introduced -ginternal-reset-location-view.  This is now
> enabled by default only if the target defines a hook that may override
> or defer to the preexisting logic.  The negated command line option
> can then be used should errors still be encountered.
> 
> 
> We also introduce options to control whether to assume .loc and view
> support in the assembler, and to control whether to output inline
> entry points (and views) from markers.
> 
> 
> This patch also fixes a number of documentation formatting errors,
> namely using @item rather than @itemx for all but the first of several
> options before a description.
> 
> for  gcc/ChangeLog
> 
>   * common.opt (gas-loc-support, gas-locview-support): New.
>   (ginline-points, ginternal-reset-location-views): New.
>   * doc/invoke.texi: Document them.  Use @itemx where intended.
>   (gvariable-location-views): Adjust.
>   * target.def (reset_location_view): New.
>   * doc/tm.texi.in (DWARF2_ASM_VIEW_DEBUG_INFO): New.
>   (TARGET_RESET_LOCATION_VIEW): New.
>   * doc/tm.texi: Rebuilt.
>   * dwarf2out.c (dwarf2out_default_as_loc_support): New.
>   (dwarf2out_default_as_locview_support): New.
>   (output_asm_line_debug_info): Use option variables.
>   (dwarf2out_maybe_output_loclist_view_pair): Likewise.
>   (output_loc_list): Likewise.
>   (add_high_low_attributes): Check option variables.
>   Don't output entry view attribute in strict mode.
>   (gen_inlined_subroutine_die): Check option variables.
>   (dwarf2out_inline_entry): Likewise.
>   (init_sections_and_labels): Likewise.
>   (dwarf2out_early_finish): Likewise.
>   (maybe_reset_location_view): New, from...
>   (dwarf2out_var_location): ... here.  Call it.
>   * debug.h (dwarf2out_default_as_loc_support): Declare.
>   (dwarf2out_default_as_locview_support): Declare.
>   * hooks.c (hook_int_rtx_insn_0): New.
>   * hooks.h (hook_int_rtx_insn_0): Declare.
>   * toplev.c (process_options): Take -gas-loc-support and
>   -gas-locview-support from dwarf2out.  Enable
>   -gvariable-location-views by default only with locview
>   assembler support.  Enable -ginternal-reset-location-views by
>   default only if the target defines the corresponding hook.
>   Enable -ginline-points by default if location views are
>   enabled; force it disabled if statement frontiers are
>   disabled.
>   * tree-inline.c (expand_call_inline): Check option variables.
>   * tree-ssa-live.c (remove_unused_scope_block_p): Likewise.
OK.  I probably would have gone with the initial version or something
closer to that -- this seems to have expanded in scope a bit.  But it's
not enough to object to since the scope creep is really just giving
options to turn on/off the various bits for testing purposes.

I plan to commit it momentarily given the number of targets that are
failing in my tester due to this issue.

jeff


Re: [C++ Patch] PR 84333 ("[6/7/8 Regression] ICE with ternary operator in template function")

2018-02-13 Thread Jason Merrill
On Tue, Feb 13, 2018 at 10:43 AM, Paolo Carlini
 wrote:
> On 13/02/2018 15:35, Jason Merrill wrote:
> Thanks. Unfortunately, I don't think I sent a complete solution for these
> issues. In fact we would still ICE on:
>
> template int foo()
> {
>   return sizeof(T) > 1 ? : 1;
> }
>
> because type_contains_placeholder_p doesn't know how to handle a
> TEMPLATE_TYPE_PARM. In fact, cp_save_expr teaches me something:
>
> tree
> cp_save_expr (tree expr)
> {
>   /* There is no reason to create a SAVE_EXPR within a template; if
>  needed, we can create the SAVE_EXPR when instantiating the
>  template.  Furthermore, the middle-end cannot handle C++-specific
>  tree codes.  */
>   if (processing_template_decl)
> return expr;
>   return save_expr (expr);
> }
>
> would it be correct to just use it? Or we have to do something more complex?
> Note the issue only affects the GNU-extension with omitted middle operand,
> thus I believe we have *some* leeway...

Hmm, yes, that should work.  In a template we're just figuring out the
result type, at instantiation time we'll see the missing middle op
again and can do the right thing.  OK.

Jason


Re: [PATCH] RX TARGET_RTX_COSTS function

2018-02-13 Thread Oleg Endo
Hi,

On Tue, 2018-02-13 at 15:54 +, Sebastian Perta wrote:
> 
> The patch required some changes (the prototype, second param more
> exactly, has changed) in order to compile in the trunk.

Could you please send the patch as an attachment?  The formatting looks
a bit off (tabs spaces etc).

> So I updated this and I also same a further change to the patch: I
> disabled the replacement of the division with multiplication of
> reciprocals on -Os because it increases code size for example for the
> following division:

Do you happen to have any other numbers on the resulting code
size/speed?  Looking at the new costs that the patch introduces, I'd
expect there to be some more changes than just the 1/x...

Cheers,
Oleg


Re: [RFC PATCH] avoid applying attributes to explicit specializations (PR 83871)

2018-02-13 Thread Michael Matz
Hi,

On Mon, 12 Feb 2018, Martin Sebor wrote:

> >> Removing noreturn from the whitelist means having to prevent
> >> the attribute from causing conflicts with the attributes on
> >> the blacklist.  E.g., in this:
> >>
> >>   template  [[malloc]] void* allocate (int);
> >>
> >>   template <> [[noreturn]] void* allocate (int);

Marking a function having a return type as noreturn doesn't make sense.  
So a warning in this case is actually a good thing.  And changing the 
return type to void (so that noreturn makes sense) makes it not a 
specialization anymore (or alternatively if the primary is also changed to 
void then malloc doesn't make sense anymore).


Ciao,
Michael.


Re: [C++ Patch] PR 84333 ("[6/7/8 Regression] ICE with ternary operator in template function")

2018-02-13 Thread Paolo Carlini
.. I should honestly add that now I don't really believe - as stated by 
Volker - that the issue, in any substantive way, is a recent, post 
6.1.0, regression: for the GNU extension build_conditional_expr was 
definitely copying arg2 = arg1 before that release and using save_expr 
on arg1 and the comment in cp_save_expr was already true.


Paolo.


C++ PATCH for c++/84080, ICE with return type deduction and specialization

2018-02-13 Thread Jason Merrill
In this testcase, we were deducing T to be auto, which is nonsensical.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit e8579febf0ce4e02b6b9c447c0f3adec56c66abc
Author: Jason Merrill 
Date:   Tue Feb 13 10:09:38 2018 -0500

PR c++/84080 - ICE with return type deduction and specialization.

* pt.c (determine_specialization): Check uses_template_parms.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 02d448e99b6..222084df4cb 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -2203,6 +2203,11 @@ determine_specialization (tree template_id,
   specialize TMPL will produce DECL.  */
continue;
 
+ if (uses_template_parms (targs))
+   /* We deduced something involving 'auto', which isn't a valid
+  template argument.  */
+   continue;
+
   /* Remove, from the set of candidates, all those functions
  whose constraints are not satisfied. */
   if (flag_concepts && !constraints_satisfied_p (fn, targs))
diff --git a/gcc/testsuite/g++.dg/cpp1y/auto-fn47.C 
b/gcc/testsuite/g++.dg/cpp1y/auto-fn47.C
new file mode 100644
index 000..7de2d9f9e5c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/auto-fn47.C
@@ -0,0 +1,6 @@
+// PR c++/84080
+// { dg-do compile { target c++14 } }
+
+template  T foo();
+
+template <> auto foo<0>() { return 42; } // { dg-error "does not match" }


[PATCH] RX TARGET_RTX_COSTS function

2018-02-13 Thread Sebastian Perta
Hello,

DJ has posted a patch a few years ago which implements TARGET_RTX_COSTS for
RX:
https://gcc.gnu.org/ml/gcc-patches/2012-05/msg8.html

Nick has accepted the patch:
https://gcc.gnu.org/ml/gcc-patches/2012-05/msg00012.html

But it was never made it into the trunk.

The patch required some changes (the prototype, second param more exactly,
has changed) in order to compile in the trunk.
So I updated this and I also same a further change to the patch: I disabled
the replacement of the division with multiplication of reciprocals on -Os
because it increases code size for example for the following division:
int a;
void foo()
{
  a = a / 24;
}

The output code without this patch or with my updated patch it looks like:
_foo:
mov.L   #_a, r4
mov.L   [r4], r14
div #24, r14
mov.L   r14, [r4]
rts

While with DJ's original patch:

_foo:
pushm   r7-r8
mov.L   #_a, r3
mov.L   [r3], r14
mov.L   #0x2aab, r7
emulr14, r7
shar#2, r8, r4
shar#31, r14
sub r14, r4, r14
mov.L   r14, [r3]
rtsd#8, r7-r8

Regression test is OK, I tested with the following command:
"make -k check-gcc RUNTESTFLAGS=--target_board=rx-sim"

Please let me know if this OK to check-in.

Best Regards,
Sebastian

Index: ChangeLog
===
--- ChangeLog   (revision 257628)
+++ ChangeLog   (working copy)
@@ -1,3 +1,8 @@
+2018-02-13  Sebastian Perta  
+
+   * config/rx/rx.c (TARGET_RTX_COSTS): Define.
+   * config/rx/rx.c (rx_rtx_costs): New function.
+
 2018-02-13  Paolo Bonzini 
 
PR sanitizer/84340



Index: rx.c
===
--- rx.c(revision 257412)
+++ rx.c(working copy)
@@ -2975,7 +2975,73 @@
   return COSTS_N_INSNS (1);
 }
 
+#undef  TARGET_RTX_COSTS
+#define TARGET_RTX_COSTS rx_rtx_costs
+
 static bool
+rx_rtx_costs (rtx  x,
+   machine_mode mode,
+   int  outer_code ATTRIBUTE_UNUSED,
+   int  opno ATTRIBUTE_UNUSED,
+   int *total,
+   bool speed)
+{
+  int code = GET_CODE (x);
+
+  if((GET_CODE (x) == CONST_INT) && (INTVAL (x) == 0))
+   {
+   *total = 0;
+   return true;
+   }
+  switch (code)
+{
+case MULT:
+  if (mode == DImode)
+   {
+ *total = COSTS_N_INSNS (2);
+ return true;
+   }
+  /* fall through */
+case PLUS:
+case MINUS:
+case AND:
+case COMPARE:
+case IOR:
+case XOR:
+  if (GET_CODE (XEXP (x, 0)) == MEM
+ || GET_CODE (XEXP (x, 1)) == MEM)
+   *total = COSTS_N_INSNS (3);
+  else
+   *total = COSTS_N_INSNS (1);
+  return true;
+
+case DIV:
+  if(speed)
+ /* This is worst case.  */
+ *total = COSTS_N_INSNS (20);
+  else
+ *total = COSTS_N_INSNS (3);
+  return true;
+
+case UDIV:
+  if(speed)
+ /* This is worst case.  */
+ *total = COSTS_N_INSNS (18);
+  else
+ *total = COSTS_N_INSNS (3);
+  return true;
+
+case IF_THEN_ELSE:
+  *total = COSTS_N_INSNS (3);
+  return true;
+
+default:
+  break;
+}
+  return false;
+}
+
+static bool
 rx_can_eliminate (const int from ATTRIBUTE_UNUSED, const int to)
 {
   /* We can always eliminate to the frame pointer.



Re: Fix VR_ANTI_RANGE handling in intersect_range_with_nonzero_bits (PR 84321)

2018-02-13 Thread Jeff Law
On 02/13/2018 08:26 AM, Richard Biener wrote:
> On Mon, Feb 12, 2018 at 4:29 PM, Richard Sandiford
>  wrote:
>> VR_ANTI_RANGE is basically a union of two ranges, and although
>> intersect_range_with_nonzero_bits had code to deal with the upper
>> one being empty, it didn't handle the lower one being empty.
>> There were also some off-by-one errors.
>>
>> This patch rewrites the code in a hopefully clearer way.
>>
>> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
> 
> Ok.
And committed.

jeff


Re: [C++ Patch] PR 84333 ("[6/7/8 Regression] ICE with ternary operator in template function")

2018-02-13 Thread Paolo Carlini

Hi Jason,

On 13/02/2018 15:35, Jason Merrill wrote:

OK.
Thanks. Unfortunately, I don't think I sent a complete solution for 
these issues. In fact we would still ICE on:


template int foo()
{
  return sizeof(T) > 1 ? : 1;
}

because type_contains_placeholder_p doesn't know how to handle a 
TEMPLATE_TYPE_PARM. In fact, cp_save_expr teaches me something:


tree
cp_save_expr (tree expr)
{
  /* There is no reason to create a SAVE_EXPR within a template; if
 needed, we can create the SAVE_EXPR when instantiating the
 template.  Furthermore, the middle-end cannot handle C++-specific
 tree codes.  */
  if (processing_template_decl)
    return expr;
  return save_expr (expr);
}

would it be correct to just use it? Or we have to do something more 
complex? Note the issue only affects the GNU-extension with omitted 
middle operand, thus I believe we have *some* leeway...


Thanks!
Paolo.

/

Index: cp/call.c
===
--- cp/call.c   (revision 257627)
+++ cp/call.c   (working copy)
@@ -4805,7 +4805,7 @@ build_conditional_expr_1 (location_t loc, tree arg
   if (lvalue_p (arg1))
arg2 = arg1 = cp_stabilize_reference (arg1);
   else
-   arg2 = arg1 = save_expr (arg1);
+   arg2 = arg1 = cp_save_expr (arg1);
 }
 
   /* If something has already gone wrong, just pass that fact up the
Index: testsuite/g++.dg/template/sizeof16.C
===
--- testsuite/g++.dg/template/sizeof16.C(nonexistent)
+++ testsuite/g++.dg/template/sizeof16.C(working copy)
@@ -0,0 +1,7 @@
+// PR c++/84333
+// { dg-options -Wno-pedantic }
+
+template int foo()
+{
+  return sizeof(int) > 1 ? : 1;
+}
Index: testsuite/g++.dg/template/sizeof17.C
===
--- testsuite/g++.dg/template/sizeof17.C(nonexistent)
+++ testsuite/g++.dg/template/sizeof17.C(working copy)
@@ -0,0 +1,7 @@
+// PR c++/84333
+// { dg-options -Wno-pedantic }
+
+template int foo()
+{
+  return sizeof(T) > 1 ? : 1;
+}


Re: [PATCH] diagnose specializations of deprecated templates (PR c++/84318)

2018-02-13 Thread Martin Sebor

On 02/13/2018 07:40 AM, Jason Merrill wrote:

On Mon, Feb 12, 2018 at 6:32 PM, Martin Sebor  wrote:

While testing my fix for 83871 (handling attributes on explicit
specializations) I noticed another old regression: while GCC 4.4
would diagnose declarations of explicit specializations of all
primary templates declared deprecated, GCC 4.5 and later only
diagnose declarations of explicit specializations of class
templates but not those of function or variable templates.


Hmm, the discussion on the core reflector seemed to be agreeing that
we want to be able to define non-deprecated specializations of a
deprecated primary template.


Yes, that's what Richard wanted to do.  The only way to do it
within the existing constraints(*) is to define a non-deprecated
primary, and a deprecated partial specialization.  This is in line
with that approach and supported by Clang and all other compilers
I tested (including Clang).

Martin

[*] Except (as Richard noted) that the standard doesn't seem to
allow a template to be deprecated.  I think that's a bug in the
spec because all implementations allow it to some degree.


Re: Fix VR_ANTI_RANGE handling in intersect_range_with_nonzero_bits (PR 84321)

2018-02-13 Thread Richard Biener
On Mon, Feb 12, 2018 at 4:29 PM, Richard Sandiford
 wrote:
> VR_ANTI_RANGE is basically a union of two ranges, and although
> intersect_range_with_nonzero_bits had code to deal with the upper
> one being empty, it didn't handle the lower one being empty.
> There were also some off-by-one errors.
>
> This patch rewrites the code in a hopefully clearer way.
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Ok.

Richard.

> Richard
>
>
> 2018-02-12  Richard Sandiford  
>
> gcc/
> PR tree-optimization/84321
> * tree-vrp.c (intersect_range_with_nonzero_bits): Fix VR_ANTI_RANGE
> handling.  Also check whether the anti-range contains any values
> that satisfy the mask; switch to a VR_RANGE if not.
>
> gcc/testsuite/
> PR tree-optimization/84321
> * gcc.dg/pr84321.c: New test.
>
> Index: gcc/tree-vrp.c
> ===
> *** gcc/tree-vrp.c  2018-02-08 15:16:21.784407397 +
> --- gcc/tree-vrp.c  2018-02-12 15:26:13.703500747 +
> *** intersect_range_with_nonzero_bits (enum
> *** 184,220 
>const wide_int _bits,
>signop sgn)
>   {
> !   if (vr_type == VR_RANGE)
>   {
> !   *max = wi::round_down_for_mask (*max, nonzero_bits);
>
> !   /* Check that the range contains at least one valid value.  */
> !   if (wi::gt_p (*min, *max, sgn))
> !   return VR_UNDEFINED;
>
> !   *min = wi::round_up_for_mask (*min, nonzero_bits);
> !   gcc_checking_assert (wi::le_p (*min, *max, sgn));
> ! }
> !   if (vr_type == VR_ANTI_RANGE)
> ! {
> !   *max = wi::round_up_for_mask (*max, nonzero_bits);
>
> !   /* If the calculation wrapped, we now have a VR_RANGE whose
> !lower bound is *MAX and whose upper bound is *MIN.  */
> !   if (wi::gt_p (*min, *max, sgn))
> {
> ! std::swap (*min, *max);
> ! *max = wi::round_down_for_mask (*max, nonzero_bits);
>   gcc_checking_assert (wi::le_p (*min, *max, sgn));
>   return VR_RANGE;
> }
>
> !   *min = wi::round_down_for_mask (*min, nonzero_bits);
> gcc_checking_assert (wi::le_p (*min, *max, sgn));
>
> !   /* Check whether we now have an empty set of values.  */
> !   if (*min - 1 == *max)
> return VR_UNDEFINED;
>   }
> return vr_type;
>   }
> --- 184,244 
>const wide_int _bits,
>signop sgn)
>   {
> !   if (vr_type == VR_ANTI_RANGE)
>   {
> !   /* The VR_ANTI_RANGE is equivalent to the union of the ranges
> !A: [-INF, *MIN) and B: (*MAX, +INF].  First use NONZERO_BITS
> !to create an inclusive upper bound for A and an inclusive lower
> !bound for B.  */
> !   wide_int a_max = wi::round_down_for_mask (*min - 1, nonzero_bits);
> !   wide_int b_min = wi::round_up_for_mask (*max + 1, nonzero_bits);
>
> !   /* If the calculation of A_MAX wrapped, A is effectively empty
> !and A_MAX is the highest value that satisfies NONZERO_BITS.
> !Likewise if the calculation of B_MIN wrapped, B is effectively
> !empty and B_MIN is the lowest value that satisfies NONZERO_BITS.  */
> !   bool a_empty = wi::ge_p (a_max, *min, sgn);
> !   bool b_empty = wi::le_p (b_min, *max, sgn);
>
> !   /* If both A and B are empty, there are no valid values.  */
> !   if (a_empty && b_empty)
> !   return VR_UNDEFINED;
>
> !   /* If exactly one of A or B is empty, return a VR_RANGE for the
> !other one.  */
> !   if (a_empty || b_empty)
> {
> ! *min = b_min;
> ! *max = a_max;
>   gcc_checking_assert (wi::le_p (*min, *max, sgn));
>   return VR_RANGE;
> }
>
> !   /* Update the VR_ANTI_RANGE bounds.  */
> !   *min = a_max + 1;
> !   *max = b_min - 1;
> gcc_checking_assert (wi::le_p (*min, *max, sgn));
>
> !   /* Now check whether the excluded range includes any values that
> !satisfy NONZERO_BITS.  If not, switch to a full VR_RANGE.  */
> !   if (wi::round_up_for_mask (*min, nonzero_bits) == b_min)
> !   {
> ! unsigned int precision = min->get_precision ();
> ! *min = wi::min_value (precision, sgn);
> ! *max = wi::max_value (precision, sgn);
> ! vr_type = VR_RANGE;
> !   }
> ! }
> !   if (vr_type == VR_RANGE)
> ! {
> !   *max = wi::round_down_for_mask (*max, nonzero_bits);
> !
> !   /* Check that the range contains at least one valid value.  */
> !   if (wi::gt_p (*min, *max, sgn))
> return VR_UNDEFINED;
> +
> +   *min = wi::round_up_for_mask (*min, nonzero_bits);
> +   gcc_checking_assert (wi::le_p (*min, *max, sgn));
>   }
> return vr_type;
>   }
> Index: 

Re: Fallout: PR84340

2018-02-13 Thread Paolo Bonzini
On 13/02/2018 14:35, Jakub Jelinek wrote:
> On Tue, Feb 13, 2018 at 12:21:55PM +0100, Jakub Jelinek wrote:
>> On Tue, Feb 13, 2018 at 12:15:36PM +0100, Paolo Bonzini wrote:
>>> The issue is that the ASAN_CHECK doesn't exist at early DSE time, and
>>> thus causes the store to disappear.
>>
>> If it was DSE removing the stores before asan pass, then it would FAIL
>> before as well.
> 
> Sorry, while ASAN_CHECK is introduced late, ASAN_MARK is present there
> already from the gimplification.

Yeah, and it keeps everything alive.

> For use after scope, I guess a lot of the stores after end of scope
> (rather than reads) are something DSE could consider removing.
> So, shall we just disable DSE on vars where their address "escapes"
> through ASAN_MARK when -fsanitize-address-use-after-scope?

But the stores _are_ dead; it's only the ASAN_CHECK that isn't.  Hence
the idea of doing the entire instrumentation very early.

> Generally, dead stores could be eliminable when stored before the
> corresponding ASAN_MARK poison (but even ASAN_MARK with "..W.." will
> prevent those) and uneliminable when stored after ASAN_MARK poison.
> 
> For the "fn spec" for now, I'd just go with "..R.." for ASAN_CHECK and
> NULL for ASAN_MARK for now.

I'm a bit scared of that even, :) especially in stage4.  If you think
it's safe enough, I can give it a shot, but honestly I wouldn't have
much time to deal with the fallout now (hence the quick revert).

Paolo


Re: [PATCH v6] aarch64: Add split-stack support

2018-02-13 Thread Szabolcs Nagy

On 07/02/18 18:07, Adhemerval Zanella wrote:
 5. The TCB support on GLIBC is meant to be included in version 2.28.



...

+/* -fsplit-stack uses a TCB field available on glibc-2.27.  GLIBC also
+   exports symbol, __tcb_private_ss, to signal it has the field available
+   on TCB bloc.  This aims to prevent binaries linked against newer
+   GLIBC to run on non-supported ones.  */



i suspect this needs to be updated since the glibc patch
is not committed yet.

(i'll review the glibc patch, if it looks ok then it can
be committed after the gcc side is accepted.)


+
+static bool
+aarch64_supports_split_stack (bool report ATTRIBUTE_UNUSED,
+ struct gcc_options *opts ATTRIBUTE_UNUSED)
+{
+#ifndef TARGET_GLIBC_MAJOR
+#define TARGET_GLIBC_MAJOR 0
+#endif
+#ifndef TARGET_GLIBC_MINOR
+#define TARGET_GLIBC_MINOR 0
+#endif
+  /* Note: Can't test DEFAULT_ABI here, it isn't set until later.  */
+  if (TARGET_GLIBC_MAJOR * 1000 + TARGET_GLIBC_MINOR >= 2026)
+return true;
+
+  if (report)
+error ("%<-fsplit-stack%> currently only supported on AArch64 GNU/Linux with 
glibc-2.27 or later");
+  return false;
+}
+
+#undef TARGET_SUPPORTS_SPLIT_STACK
+#define TARGET_SUPPORTS_SPLIT_STACK aarch64_supports_split_stack
+


patch for PR84359

2018-02-13 Thread Vladimir Makarov

  The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84359

  Committed as rev. 257628.

Index: testsuite/ChangeLog
===
--- testsuite/ChangeLog	(revision 257627)
+++ testsuite/ChangeLog	(working copy)
@@ -1,3 +1,8 @@
+2018-02-13  Vladimir Makarov  
+
+	PR target/84359
+	* gcc.target/i386/57193.c: Add -march=x86-64.
+
 2018-02-13  Paolo Bonzini  
 
 	PR sanitizer/84340
Index: testsuite/gcc.target/i386/pr57193.c
===
--- testsuite/gcc.target/i386/pr57193.c	(revision 257537)
+++ testsuite/gcc.target/i386/pr57193.c	(working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { ! ia32 } } } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -march=x86-64" } */
 /* { dg-final { scan-assembler-times "movdqa" 2 } } */
 
 #include 


Re: Fallout: PR84340

2018-02-13 Thread Jakub Jelinek
On Tue, Feb 13, 2018 at 12:21:55PM +0100, Jakub Jelinek wrote:
> On Tue, Feb 13, 2018 at 12:15:36PM +0100, Paolo Bonzini wrote:
> > On 13/02/2018 10:32, Martin Liška wrote:
> > > Hello.
> > > 
> > > It caused PR84340, I'm suggesting following fix:
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84340#c3
> > 
> > I don't think EAF_DIRECT is the issue.  You could think of ASAN_MARK as
> > writing a global variable, which it can do because it's not const.
> > 
> > The issue is that the ASAN_CHECK doesn't exist at early DSE time, and
> > thus causes the store to disappear.
> 
> That doesn't make sense to me, because the testcases regressed with the change
> of "fn spec" attribute on ASAN_{CHECK,MARK}.
> If it was DSE removing the stores before asan pass, then it would FAIL
> before as well.

Sorry, while ASAN_CHECK is introduced late, ASAN_MARK is present there
already from the gimplification.

For use after scope, I guess a lot of the stores after end of scope
(rather than reads) are something DSE could consider removing.
So, shall we just disable DSE on vars where their address "escapes"
through ASAN_MARK when -fsanitize-address-use-after-scope?
Generally, dead stores could be eliminable when stored before the
corresponding ASAN_MARK poison (but even ASAN_MARK with "..W.." will
prevent those) and uneliminable when stored after ASAN_MARK poison.

For the "fn spec" for now, I'd just go with "..R.." for ASAN_CHECK and
NULL for ASAN_MARK for now.

Jakub


Re: [RFC PATCH] avoid applying attributes to explicit specializations (PR 83871)

2018-02-13 Thread Jason Merrill
On Mon, Feb 12, 2018 at 6:39 PM, Martin Sebor  wrote:
> I really don't think it's helpful to try to force noreturn
> to match between the primary and its specializations.

I continue to disagree.

Jason


Re: [PATCH] diagnose specializations of deprecated templates (PR c++/84318)

2018-02-13 Thread Jason Merrill
On Mon, Feb 12, 2018 at 6:32 PM, Martin Sebor  wrote:
> While testing my fix for 83871 (handling attributes on explicit
> specializations) I noticed another old regression: while GCC 4.4
> would diagnose declarations of explicit specializations of all
> primary templates declared deprecated, GCC 4.5 and later only
> diagnose declarations of explicit specializations of class
> templates but not those of function or variable templates.

Hmm, the discussion on the core reflector seemed to be agreeing that
we want to be able to define non-deprecated specializations of a
deprecated primary template.

Jason


Re: [C++ Patch] PR 84333 ("[6/7/8 Regression] ICE with ternary operator in template function")

2018-02-13 Thread Jason Merrill
OK.

On Mon, Feb 12, 2018 at 2:18 PM, Paolo Carlini  wrote:
> Hi,
>
> this ICE on valid happens only with checking enabled - that explains why we
> didn't notice it so far - but I think points to a minor but substantive
> correctness issue. In short, we ICE when build_conditional_expr calls
> save_expr, which in turn calls contain_placeholder_p, which doesn't handle
> correctly the sizeof(int), and tries to use TREE_CONSTANT on the
> INTEGER_TYPE. I think that in general we simply have to explicitly handle
> both kinds of sizeof in contains_placeholder_p: even for a type as simple as
> INTEGER_TYPE the result may not be trivial, ie, type_contains_placeholder_1
> checks the bounds:
>
>case INTEGER_TYPE:
> case REAL_TYPE:
> case FIXED_POINT_TYPE:
>   /* Here we just check the bounds.  */
>   return (CONTAINS_PLACEHOLDER_P (TYPE_MIN_VALUE (type))
>   || CONTAINS_PLACEHOLDER_P (TYPE_MAX_VALUE (type)));
>
> I'm finishing testing the below on x86_64-linux, all good so far.
>
> Thanks, Paolo.
>
> //
>


Re: [PATCH] Improve dead code elimination with -fsanitize=address (PR84307)

2018-02-13 Thread Paolo Bonzini
On 13/02/2018 14:00, Jakub Jelinek wrote:
>> Certainly, for now I'll revert.
> Reversion is not the right thing, the "fn spec" attributes were clearly
> incorrect. So, we should change them to something more conservative that
> will work.

That would only be "all dots", that is no fnspec at all.  Martin
suggested removing EAF_DIRECT, but I don't think I agree with his
reasoning.  Besides, aliasing doesn't see the shadow memory at all (see
call_may_clobber_ref_p_1), so it's okay to ignore it for the sake of
fnspecs.

>> But can you expand on why it's too early?  Indeed I suppose it may
>> affect inlining decisions, on the other hand it seems dangerous to apply
>> instrumentation after pretty much any optimization pass.
>
> It will prevent pretty much all optimizations.  We don't want -O2
> -fsanitize=address to be unusably slow, if people want to catch everything,
> they can always use -O0 -fsanitize=address.  The current placement of the
> passes has been a result of long discussions if I remember well.

I'm not sure it will be that bad, together with the fnspec.  Consider
that PR84340 is latent in current GCC; the testcases work because GCC
thinks that the  pointer escaped, and thus treated the stores as not
dead.  In other words, -fsanitize=address -O2 _currently_ lacks an awful
lot of aliasing-based optimizations such as DSE, because all variables
are marked as escaping after the initial ASAN_MARK(UNPOISON, , sz).

With some luck (that we can ascertain between now and stage 1) the
negative effects of pass placement balance with the positive effects of
the fnspec.  But I agree that it requires some discussion and benchmarking.

Paolo


C++ PATCH for more variadic capture issues

2018-02-13 Thread Jason Merrill
This patch fixes two issues:

1) We were failing to capture a parameter pack used by a pack
expansion which needs to use PACK_EXPANSION_EXTRA_ARGS, because in
that case we defer actual substitution until we have all the arguments
we need.  But by that time we're inside an instantiation of the lambda
op(), and it's too late to capture anything.  So we need to force
lambda capture while we're packing up the arguments we'll need for
that later substitution.

2) While looking at that, I noticed that although we recently started
to support implicit capture within a pack expansion, we didn't support
explicit capture.

Tested x86_64-pc-linux-gnu, applying to trunk.  After this patch, the
Boost.Hana testsuite also passes again.
commit 6383f88febfbe13aef71a9da2f678ecae344d1f2
Author: Jason Merrill 
Date:   Mon Feb 12 10:26:53 2018 -0500

Fix more variadic capture issues.

* pt.c (find_parameter_packs_r): Also look at explicit captures.
(check_for_bare_parameter_packs): Check current_class_type for
lambda context.
(extract_locals_r): Handle seeing a full instantiation of a pack.
(tsubst_pack_expansion): Likewise.  Force lambda capture.
* parser.c (cp_parser_lambda_introducer): Don't
check_for_bare_parameter_packs.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 9a05e4fc812..81c6f0128e6 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -10412,8 +10412,6 @@ cp_parser_lambda_introducer (cp_parser* parser, tree 
lambda_expr)
  cp_lexer_consume_token (parser->lexer);
  capture_init_expr = make_pack_expansion (capture_init_expr);
}
- else
-   check_for_bare_parameter_packs (capture_init_expr);
}
 
   if (LAMBDA_EXPR_DEFAULT_CAPTURE_MODE (lambda_expr) != CPLD_NONE
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index a83b7073d20..02d448e99b6 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -3587,7 +3587,6 @@ find_parameter_packs_r (tree *tp, int *walk_subtrees, 
void* data)
 }
   break;
 
-  /* Look through a lambda capture proxy to the field pack.  */
 case VAR_DECL:
   if (DECL_PACK_P (t))
 {
@@ -3707,6 +3706,12 @@ find_parameter_packs_r (tree *tp, int *walk_subtrees, 
void* data)
 
 case LAMBDA_EXPR:
   {
+   /* Look at explicit captures.  */
+   for (tree cap = LAMBDA_EXPR_CAPTURE_LIST (t);
+cap; cap = TREE_CHAIN (cap))
+ cp_walk_tree (_VALUE (cap), _parameter_packs_r, ppd,
+   ppd->visited);
+   /* Since we defer implicit capture, look in the body as well.  */
tree fn = lambda_function (t);
cp_walk_tree (_SAVED_TREE (fn), _parameter_packs_r, ppd,
  ppd->visited);
@@ -3907,7 +3912,7 @@ check_for_bare_parameter_packs (tree t)
 return false;
 
   /* A lambda might use a parameter pack from the containing context.  */
-  if (current_function_decl && LAMBDA_FUNCTION_P (current_function_decl))
+  if (current_class_type && LAMBDA_TYPE_P (current_class_type))
 return false;
 
   if (TREE_CODE (t) == TYPE_DECL)
@@ -11410,30 +11415,72 @@ tsubst_binary_right_fold (tree t, tree args, 
tsubst_flags_t complain,
 /* Walk through the pattern of a pack expansion, adding everything in
local_specializations to a list.  */
 
+struct el_data
+{
+  tree extra;
+  tsubst_flags_t complain;
+};
 static tree
-extract_locals_r (tree *tp, int */*walk_subtrees*/, void *data)
+extract_locals_r (tree *tp, int */*walk_subtrees*/, void *data_)
 {
-  tree *extra = reinterpret_cast(data);
+  el_data  = *reinterpret_cast(data_);
+  tree *extra = 
+  tsubst_flags_t complain = data.complain;
   if (tree spec = retrieve_local_specialization (*tp))
 {
   if (TREE_CODE (spec) == NONTYPE_ARGUMENT_PACK)
{
- /* Pull out the actual PARM_DECL for the partial instantiation.  */
+ /* Maybe pull out the PARM_DECL for a partial instantiation.  */
  tree args = ARGUMENT_PACK_ARGS (spec);
- gcc_assert (TREE_VEC_LENGTH (args) == 1);
- tree arg = TREE_VEC_ELT (args, 0);
- spec = PACK_EXPANSION_PATTERN (arg);
+ if (TREE_VEC_LENGTH (args) == 1)
+   {
+ tree elt = TREE_VEC_ELT (args, 0);
+ if (PACK_EXPANSION_P (elt))
+   elt = PACK_EXPANSION_PATTERN (elt);
+ if (DECL_PACK_P (elt))
+   spec = elt;
+   }
+ if (TREE_CODE (spec) == NONTYPE_ARGUMENT_PACK)
+   {
+ /* Handle lambda capture here, since we aren't doing any
+substitution now, and so tsubst_copy won't call
+process_outer_var_ref.  */
+ tree args = ARGUMENT_PACK_ARGS (spec);
+ int len = TREE_VEC_LENGTH (args);
+ for (int i = 0; i < len; ++i)
+   {
+ tree arg = TREE_VEC_ELT (args, i);
+ tree carg = arg;
+  

C++ PATCH for c++/84338, wrong variadic sizeof

2018-02-13 Thread Jason Merrill
Here sizeof... was improperly being resolved early on an argument pack
consisting of a single pack expansion, when the sizeof... is still
dependent.  This turned out to be because of our handling of partial
instantiation of a function parameter pack: I had previously changed
extract_fnparm_pack to wrap such a partially instantiated pack in an
EXPR_PACK_EXPANSION, but later reverted that because it was breaking
things.  That breakage turns out to have been because the uses of
ARGUMENT_PACK_EXTRACT_ARG for PARM_DECL and VAR_DECL were missing the
code to strip a pack expansion that we had for template parameters;
moving all of that into a function rather than a macro fixed that, and
so now we can wrap the parameter pack again, which fixes the testcase.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 0978de34113258d973e3a78dad466c3d602a9130
Author: Jason Merrill 
Date:   Mon Feb 12 17:04:56 2018 -0500

PR c++/84338 - wrong variadic sizeof.

* pt.c (argument_pack_select_arg): Like the macro, but look through
a pack expansion.
(tsubst, tsubst_copy, dependent_template_arg_p): Use it.
(extract_fnparm_pack): Do make_pack_expansion.
(extract_locals_r): Do strip a pack expansion.
* cp-tree.h (ARGUMENT_PACK_SELECT_ARG): Remove.

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index a6c75aed33d..9a9e9f0bbcb 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -3558,12 +3558,6 @@ extern void decl_shadowed_for_var_insert (tree, tree);
 #define ARGUMENT_PACK_SELECT_INDEX(NODE)   \
   (((struct tree_argument_pack_select *)ARGUMENT_PACK_SELECT_CHECK 
(NODE))->index)
   
-/* In an ARGUMENT_PACK_SELECT, the actual underlying argument that the
-   ARGUMENT_PACK_SELECT represents. */
-#define ARGUMENT_PACK_SELECT_ARG(NODE) \
-  TREE_VEC_ELT (ARGUMENT_PACK_ARGS (ARGUMENT_PACK_SELECT_FROM_PACK (NODE)), \
-   ARGUMENT_PACK_SELECT_INDEX (NODE))
-
 #define FOLD_EXPR_CHECK(NODE)  \
   TREE_CHECK4 (NODE, UNARY_LEFT_FOLD_EXPR, UNARY_RIGHT_FOLD_EXPR,  \
   BINARY_LEFT_FOLD_EXPR, BINARY_RIGHT_FOLD_EXPR)
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index b58c60f0dcb..a83b7073d20 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -3394,6 +3394,34 @@ get_template_argument_pack_elems (const_tree t)
   return ARGUMENT_PACK_ARGS (t);
 }
 
+/* In an ARGUMENT_PACK_SELECT, the actual underlying argument that the
+   ARGUMENT_PACK_SELECT represents. */
+
+static tree
+argument_pack_select_arg (tree t)
+{
+  tree args = ARGUMENT_PACK_ARGS (ARGUMENT_PACK_SELECT_FROM_PACK (t));
+  tree arg = TREE_VEC_ELT (args, ARGUMENT_PACK_SELECT_INDEX (t));
+
+  /* If the selected argument is an expansion E, that most likely means we were
+ called from gen_elem_of_pack_expansion_instantiation during the
+ substituting of an argument pack (of which the Ith element is a pack
+ expansion, where I is ARGUMENT_PACK_SELECT_INDEX) into a pack expansion.
+ In this case, the Ith element resulting from this substituting is going to
+ be a pack expansion, which pattern is the pattern of E.  Let's return the
+ pattern of E, and gen_elem_of_pack_expansion_instantiation will build the
+ resulting pack expansion from it.  */
+  if (PACK_EXPANSION_P (arg))
+{
+  /* Make sure we aren't throwing away arg info.  */
+  gcc_assert (!PACK_EXPANSION_EXTRA_ARGS (arg));
+  arg = PACK_EXPANSION_PATTERN (arg);
+}
+
+  return arg;
+}
+
+
 /* True iff FN is a function representing a built-in variadic parameter
pack.  */
 
@@ -10933,7 +10961,12 @@ extract_fnparm_pack (tree tmpl_parm, tree *spec_p)
   parmvec = make_tree_vec (len);
   spec_parm = *spec_p;
   for (i = 0; i < len; i++, spec_parm = DECL_CHAIN (spec_parm))
-TREE_VEC_ELT (parmvec, i) = spec_parm;
+{
+  tree elt = spec_parm;
+  if (DECL_PACK_P (elt))
+   elt = make_pack_expansion (elt);
+  TREE_VEC_ELT (parmvec, i) = elt;
+}
 
   /* Build the argument packs.  */
   SET_ARGUMENT_PACK_ARGS (argpack, parmvec);
@@ -11388,7 +11421,8 @@ extract_locals_r (tree *tp, int */*walk_subtrees*/, 
void *data)
  /* Pull out the actual PARM_DECL for the partial instantiation.  */
  tree args = ARGUMENT_PACK_ARGS (spec);
  gcc_assert (TREE_VEC_LENGTH (args) == 1);
- spec = TREE_VEC_ELT (args, 0);
+ tree arg = TREE_VEC_ELT (args, 0);
+ spec = PACK_EXPANSION_PATTERN (arg);
}
   *extra = tree_cons (*tp, spec, *extra);
 }
@@ -13747,29 +13781,9 @@ tsubst (tree t, tree args, tsubst_flags_t complain, 
tree in_decl)
  {
arg = TMPL_ARG (args, level, idx);
 
+   /* See through ARGUMENT_PACK_SELECT arguments. */
if (arg && TREE_CODE (arg) == ARGUMENT_PACK_SELECT)
- {
-   /* See through ARGUMENT_PACK_SELECT arguments. */
-   

Re: [SFN+LVU+IEPM v4 9/9] [IEPM] Introduce inline entry point markers

2018-02-13 Thread Alexandre Oliva
On Feb 12, 2018, Alexandre Oliva  wrote:

> This patch supersedes the previous one.  Testing underway...  Ok if it
> succeeds?

I failed to update the patch I posted after making a correct to symbol
poisoning, that had caused builds to fail right away, sorry.  Thanks,
Rainer, for catching the error.

Here's the patch that actually passed regstrap on native i686 and
x86_64-linux-gnu, and fixed numerous regressions on cross builds.
Ok to install?


[LVU, IEPM] several new controlling options

Given that the minimum insn length is not generally reliable to tell
whether an insn actually advances PC, this patch disables the locview
list optimizations that can only be done when can tell it.

The preexisting logic is retained, however, and can be enabled with
the newly-introduced -ginternal-reset-location-view.  This is now
enabled by default only if the target defines a hook that may override
or defer to the preexisting logic.  The negated command line option
can then be used should errors still be encountered.


We also introduce options to control whether to assume .loc and view
support in the assembler, and to control whether to output inline
entry points (and views) from markers.


This patch also fixes a number of documentation formatting errors,
namely using @item rather than @itemx for all but the first of several
options before a description.

for  gcc/ChangeLog

* common.opt (gas-loc-support, gas-locview-support): New.
(ginline-points, ginternal-reset-location-views): New.
* doc/invoke.texi: Document them.  Use @itemx where intended.
(gvariable-location-views): Adjust.
* target.def (reset_location_view): New.
* doc/tm.texi.in (DWARF2_ASM_VIEW_DEBUG_INFO): New.
(TARGET_RESET_LOCATION_VIEW): New.
* doc/tm.texi: Rebuilt.
* dwarf2out.c (dwarf2out_default_as_loc_support): New.
(dwarf2out_default_as_locview_support): New.
(output_asm_line_debug_info): Use option variables.
(dwarf2out_maybe_output_loclist_view_pair): Likewise.
(output_loc_list): Likewise.
(add_high_low_attributes): Check option variables.
Don't output entry view attribute in strict mode.
(gen_inlined_subroutine_die): Check option variables.
(dwarf2out_inline_entry): Likewise.
(init_sections_and_labels): Likewise.
(dwarf2out_early_finish): Likewise.
(maybe_reset_location_view): New, from...
(dwarf2out_var_location): ... here.  Call it.
* debug.h (dwarf2out_default_as_loc_support): Declare.
(dwarf2out_default_as_locview_support): Declare.
* hooks.c (hook_int_rtx_insn_0): New.
* hooks.h (hook_int_rtx_insn_0): Declare.
* toplev.c (process_options): Take -gas-loc-support and
-gas-locview-support from dwarf2out.  Enable
-gvariable-location-views by default only with locview
assembler support.  Enable -ginternal-reset-location-views by
default only if the target defines the corresponding hook.
Enable -ginline-points by default if location views are
enabled; force it disabled if statement frontiers are
disabled.
* tree-inline.c (expand_call_inline): Check option variables.
* tree-ssa-live.c (remove_unused_scope_block_p): Likewise.
---
 gcc/common.opt  |   16 
 gcc/debug.h |2 +
 gcc/doc/invoke.texi |  106 +++--
 gcc/doc/tm.texi |   23 ++
 gcc/doc/tm.texi.in  |9 ++
 gcc/dwarf2out.c |  188 ++-
 gcc/hooks.c |6 ++
 gcc/hooks.h |1 
 gcc/target.def  |   17 +
 gcc/toplev.c|   45 +++-
 gcc/tree-inline.c   |2 -
 gcc/tree-ssa-live.c |4 +
 12 files changed, 326 insertions(+), 93 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 40ec0088c57e..e0bc4d1bb18d 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2880,6 +2880,14 @@ g
 Common Driver RejectNegative JoinedOrMissing
 Generate debug information in default format.
 
+gas-loc-support
+Common Driver Var(dwarf2out_as_loc_support) Init(2)
+Assume assembler support for (DWARF2+) .loc directives
+
+gas-locview-support
+Common Driver Var(dwarf2out_as_locview_support) Init(2)
+Assume assembler support for view in (DWARF2+) .loc directives
+
 gcoff
 Common Driver Ignore Warn(switch %qs no longer supported)
 Does nothing.  Preserved for backward compatibility.
@@ -2912,6 +2920,14 @@ ggdb
 Common Driver JoinedOrMissing
 Generate debug information in default extended format.
 
+ginline-points
+Common Driver Var(debug_inline_points) Init(2)
+Generate extended entry point information for inlined functions
+
+ginternal-reset-location-views
+Common Driver Var(debug_internal_reset_location_views) Init(2)
+Compute locview reset points based on insn length estimates
+
 gno-
 RejectNegative Joined Undocumented
 ; Catch the gno- prefix, so it doesn't backtrack 

Re: Fallout: PR84340

2018-02-13 Thread Jakub Jelinek
On Tue, Feb 13, 2018 at 12:15:36PM +0100, Paolo Bonzini wrote:
> On 13/02/2018 10:32, Martin Liška wrote:
> > Hello.
> > 
> > It caused PR84340, I'm suggesting following fix:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84340#c3
> 
> I don't think EAF_DIRECT is the issue.  You could think of ASAN_MARK as
> writing a global variable, which it can do because it's not const.
> 
> The issue is that the ASAN_CHECK doesn't exist at early DSE time, and
> thus causes the store to disappear.

That doesn't make sense to me, because the testcases regressed with the change
of "fn spec" attribute on ASAN_{CHECK,MARK}.
If it was DSE removing the stores before asan pass, then it would FAIL
before as well.

Jakub


Re: [PATCH] Improve dead code elimination with -fsanitize=address (PR84307)

2018-02-13 Thread Jakub Jelinek
On Tue, Feb 13, 2018 at 01:55:23PM +0100, Paolo Bonzini wrote:
> On 13/02/2018 00:49, Jakub Jelinek wrote:
> > On Tue, Feb 13, 2018 at 12:41:28AM +0100, Paolo Bonzini wrote:
> >> On 09/02/2018 19:07, Jakub Jelinek wrote:
> >>> On Fri, Feb 09, 2018 at 07:01:08PM +0100, Richard Biener wrote:
> > which indeed fixes the testcase and seems not to break asan.exp.
> 
>  Huh. Need to double check why that makes sense ;) 
> >>>
> >>> I think it does, for both ASAN_CHECK and ASAN_MARK the pointer argument
> >>> is the second one, the first one is an integer argument with flags.
> >>> And ASAN_MARK, both poison and unpoison, works kind like a clobber on the
> >>> referenced variable, before unpoison it is generally inaccessible and 
> >>> after
> >>> poison too.
> >>
> >> This was too optimistic. :(
> >>
> >> In use-after-scope-types-1.C, after the patch FRE+DSE are able to
> >> optimize away the problematic read.  In general it seems to me that the
> >> sanitizer passes should be before DSE if we want ASAN builtins to have
> >> precise info, otherwise some reads or stores might not be
> >> instrumented---GCC was being lucky here.
> >>
> >> The obvious change here is:
> > 
> > That is way too early, I don't think this is a good idea.
> > Certainly not in stage4.
> 
> Certainly, for now I'll revert.

Reversion is not the right thing, the "fn spec" attributes were clearly
incorrect. So, we should change them to something more conservative that
will work.

> But can you expand on why it's too early?  Indeed I suppose it may
> affect inlining decisions, on the other hand it seems dangerous to apply
> instrumentation after pretty much any optimization pass.

It will prevent pretty much all optimizations.  We don't want -O2
-fsanitize=address to be unusably slow, if people want to catch everything,
they can always use -O0 -fsanitize=address.  The current placement of the
passes has been a result of long discussions if I remember well.

Jakub


Re: [PATCH] WebAssembly: Disable subdirectory configuration for unsupported LD

2018-02-13 Thread Maciej W. Rozycki
On Sat, 3 Feb 2018, Maciej W. Rozycki wrote:

>   * configure.ac  (noconfigdirs): Add `ld'.
>   * configure: Regenerate.

 I have now committed this change to both repos.

  Maciej


Re: Reduce inline limits a bit to compensate changes in inlining metrics

2018-02-13 Thread Martin Liška
Hi.

I see quite SPEC 2006 benchmark changes for -Ofast -march=native:
(note that size changes are not presented all)

1) gillan (AMD bulldozer):

+--+---+--+--+---+---+---+
| Performance Regressions - Execution Time |   Δ   | Previous | Current  | σ | 
Δ (B) | σ (B) |
+--+---+--+--+---+---+---+
| SPEC/SPEC2006/FP/453.povray  | 6.06% | 180.9933 | 191.9582 | - | 
3.48% | - |
| SPEC/SPEC2006/FP/410.bwaves  | 5.13% | 307.0029 | 322.7495 | - | 
8.11% | - |
| SPEC/SPEC2006/FP/433.milc| 4.80% | 310.4705 | 325.3810 | - | 
4.01% | - |
| SPEC/SPEC2006/FP/450.soplex  | 3.50% | 449.4463 | 465.1627 | - | 
3.09% | - |
| SPEC/SPEC2006/INT/471.omnetpp| 1.41% | 562.2769 | 570.1829 | - | 
1.65% | - |
| SPEC/SPEC2006/FP/447.dealII  | 1.40% | 359.1753 | 364.1933 | - | 
1.66% | - |
| SPEC/SPEC2006/INT/483.xalancbmk  | 1.00% | 430.9722 | 435.2821 | - | 
7.59% | - |
+--+---+--+--+---+---+---+

+---++--+--+---++---+
| Performance Improvements - Execution Time |   Δ| Previous | Current  | σ 
| Δ (B)  | σ (B) |
+---++--+--+---++---+
| SPEC/SPEC2006/INT/429.mcf | -7.93% | 606.7478 | 558.6419 | - 
| -7.42% | - |
| SPEC/SPEC2006/INT/473.astar   | -2.07% | 577.5786 | 565.6328 | - 
| -6.16% | - |
| SPEC/SPEC2006/FP/435.gromacs  | -2.01% | 329.9626 | 323.3386 | - 
| -0.34% | - |
| SPEC/SPEC2006/INT/458.sjeng   | -1.71% | 667.0133 | 655.5964 | - 
| -1.54% | - |
| SPEC/SPEC2006/FP/436.cactusADM| -1.07% | 379.5225 | 375.4560 | - 
| -4.12% | - |
+---++--+--+---++---+

+---+-+--+--+---+-+---+
|  Performance Improvements - size  |Δ|   Previous   |  
 Current| σ |  Δ (B)  | σ (B) |
+---+-+--+--+---+-+---+
| SPEC/SPEC2006/FP/482.sphinx3/elf/sections/text| -15.92% |  244450. |  
205522. | - | -9.45%  | - |
| SPEC/SPEC2006/FP/482.sphinx3/elf  | -12.16% |  335640. |  
294816. | - | -6.46%  | - |
| SPEC/SPEC2006/INT/456.hmmer/elf/sections/text | -12.14% |  395778. |  
347746. | - | -9.75%  | - |
| SPEC/SPEC2006/INT/483.xalancbmk/elf/sections/text | -10.43% | 3282082. | 
2939698. | - | -11.33% | - |
| SPEC/SPEC2006/FP/453.povray/elf/sections/text | -9.48%  | 1004658. |  
909426. | - | -12.68% | - |
| SPEC/SPEC2006/INT/400.perlbench/elf/sections/text | -9.42%  | 1141458. | 
1033890. | - | -9.75%  | - |
| SPEC/SPEC2006/INT/456.hmmer/elf   | -8.77%  |  508624. |  
464024. | - | -7.28%  | - |
+---+-+--+--+---+-+---+

2) AMD Ryzen 5 machine:

+--+---+--+--+---+---+---+
| Performance Regressions - Execution Time |   Δ   | Previous | Current  | σ | 
Δ (B) | σ (B) |
+--+---+--+--+---+---+---+
| SPEC/SPEC2006/FP/450.soplex  | 6.28% | 204.6552 | 217.5132 | - | 
8.06% | - |
| SPEC/SPEC2006/FP/433.milc| 6.11% | 200.5912 | 212.8397 | - | 
5.81% | - |
| SPEC/SPEC2006/FP/410.bwaves  | 3.28% | 145.9785 | 150.7609 | - | 
5.37% | - |
| SPEC/SPEC2006/INT/462.libquantum | 1.98% | 306.7858 | 312.8472 | - | 
2.14% | - |
+--+---+--+--+---+---+---+

+---++--+--+---++---+
| Performance Improvements - Execution Time |   Δ| Previous | Current  | σ 
| Δ (B)  | σ (B) |
+---++--+--+---++---+
| SPEC/SPEC2006/INT/429.mcf | -4.71% | 255.8058 | 243.7667 | - 
| -1.18% | - |
| SPEC/SPEC2006/INT/473.astar   | -3.33% | 361.8033 | 349.7520 | - 
| 0.67%  | - |
| SPEC/SPEC2006/FP/482.sphinx3  | -1.28% | 301.7175 | 297.8409 | - 
| 5.49%  | - |
+---++--+--+---++---+

+---+-+--+--+---+-+---+
|  Performance 

Re: [RFC] Tree loop unroller pass

2018-02-13 Thread Wilco Dijkstra
Hi Kugan,

> Based on the previous discussions, I tried to implement a tree loop
> unroller for partial unrolling. I would like to queue this RFC patches
> for next stage1 review.

This is a great plan - GCC urgently requires a good unroller!

> * Cost-model for selecting the loop uses the same params used
> elsewhere in related optimizations. I was told that keeping this same
> would allow better tuning for all the optimizations.

I'd advise against using the existing params as is. Unrolling by 8x by default 
is
way too aggressive and counterproductive. It was perhaps OK for in-order cores
20 years ago, but not today. The goal of unrolling is to create more ILP in 
small
loops, not to generate huge blocks of repeated code which definitely won't fit 
in
micro-op caches and loop buffers...

Also we need to enable this by default, at least with -O3, maybe even for small
(or rather tiny) loops in -O2 like LLVM does.

> * I have also implemented an option to limit loops based on memory
> streams. i.e., some micro-architectures where limiting the resulting
> memory streams is preferred and used  to limit unrolling factor.

I'm not convinced this is needed once you tune the parameters for unrolling.
If you have say 4 read streams you must have > 10 instructions already so
you may want to unroll this 2x in -O3, but definitely not 8x. So I see the 
streams
issue as a problem caused by too aggressive unroll settings. I think if you
address that first, you're unlikely going to have an issue with too many 
streams.

> * I expect that there will be some cost-model changes might be needed
> to handle (or provide ability to handle) various loop preferences of
> the micro-architectures. I am sending this patch for review early to
> get feedbacks on this.

Yes it should be feasible to have settings based on backend preference
and optimization level (so O3/Ofast will unroll more than O2).

> * Position of the pass in passes.def can also be changed. Example,
> unrolling before SLP.

As long as it runs before IVOpt so we get base+immediate addressing modes.

Wilco


Re: [PATCH] Improve dead code elimination with -fsanitize=address (PR84307)

2018-02-13 Thread Paolo Bonzini
On 13/02/2018 00:49, Jakub Jelinek wrote:
> On Tue, Feb 13, 2018 at 12:41:28AM +0100, Paolo Bonzini wrote:
>> On 09/02/2018 19:07, Jakub Jelinek wrote:
>>> On Fri, Feb 09, 2018 at 07:01:08PM +0100, Richard Biener wrote:
> which indeed fixes the testcase and seems not to break asan.exp.

 Huh. Need to double check why that makes sense ;) 
>>>
>>> I think it does, for both ASAN_CHECK and ASAN_MARK the pointer argument
>>> is the second one, the first one is an integer argument with flags.
>>> And ASAN_MARK, both poison and unpoison, works kind like a clobber on the
>>> referenced variable, before unpoison it is generally inaccessible and after
>>> poison too.
>>
>> This was too optimistic. :(
>>
>> In use-after-scope-types-1.C, after the patch FRE+DSE are able to
>> optimize away the problematic read.  In general it seems to me that the
>> sanitizer passes should be before DSE if we want ASAN builtins to have
>> precise info, otherwise some reads or stores might not be
>> instrumented---GCC was being lucky here.
>>
>> The obvious change here is:
> 
> That is way too early, I don't think this is a good idea.
> Certainly not in stage4.

Certainly, for now I'll revert.

But can you expand on why it's too early?  Indeed I suppose it may
affect inlining decisions, on the other hand it seems dangerous to apply
instrumentation after pretty much any optimization pass.

Thanks,

Paolo


Re: [PATCH] Improve dead code elimination with -fsanitize=address (PR84307)

2018-02-13 Thread Jakub Jelinek
On Tue, Feb 13, 2018 at 12:41:28AM +0100, Paolo Bonzini wrote:
> On 09/02/2018 19:07, Jakub Jelinek wrote:
> > On Fri, Feb 09, 2018 at 07:01:08PM +0100, Richard Biener wrote:
> >>> which indeed fixes the testcase and seems not to break asan.exp.
> >>
> >> Huh. Need to double check why that makes sense ;) 
> > 
> > I think it does, for both ASAN_CHECK and ASAN_MARK the pointer argument
> > is the second one, the first one is an integer argument with flags.
> > And ASAN_MARK, both poison and unpoison, works kind like a clobber on the
> > referenced variable, before unpoison it is generally inaccessible and after
> > poison too.
> 
> This was too optimistic. :(
> 
> In use-after-scope-types-1.C, after the patch FRE+DSE are able to
> optimize away the problematic read.  In general it seems to me that the
> sanitizer passes should be before DSE if we want ASAN builtins to have
> precise info, otherwise some reads or stores might not be
> instrumented---GCC was being lucky here.
> 
> The obvious change here is:

That is way too early, I don't think this is a good idea.
Certainly not in stage4.

Jakub


Re: Fallout: PR84340

2018-02-13 Thread Paolo Bonzini
On 13/02/2018 10:32, Martin Liška wrote:
> Hello.
> 
> It caused PR84340, I'm suggesting following fix:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84340#c3

I don't think EAF_DIRECT is the issue.  You could think of ASAN_MARK as
writing a global variable, which it can do because it's not const.

The issue is that the ASAN_CHECK doesn't exist at early DSE time, and
thus causes the store to disappear.

Paolo


  1   2   >