date:20180207

Re: Extend aligned_membuf<> usage

2018-02-07 Thread François Dumont


On 06/02/2018 20:16, François Dumont wrote:

On 05/02/2018 18:16, Jonathan Wakely wrote:

Wouldn't it make more sense to simply make __aligned_buffer identical
to __aligned_membuf for the versioned-namespace? Then at least the
conditional code is only in one place.

Yes, __aligned_buffer is indeed not used, we could do that. I'll 
propose a patch to do so.



So here it is, ok to commit ?

François

diff --git a/libstdc++-v3/include/ext/aligned_buffer.h b/libstdc++-v3/include/ext/aligned_buffer.h
index 94a2ff0..81fb797 100644
--- a/libstdc++-v3/include/ext/aligned_buffer.h
+++ b/libstdc++-v3/include/ext/aligned_buffer.h
@@ -75,6 +75,10 @@ namespace __gnu_cxx
   { return static_cast(_M_addr()); }
 };
 
+#if _GLIBCXX_INLINE_VERSION
+  template
+using __aligned_buffer = __aligned_membuf<_Tp>;
+#else
   // Similar to __aligned_membuf but aligned for complete objects, not members.
   // This type is used in , , 
   // and , but ideally they would use __aligned_membuf
@@ -113,6 +117,7 @@ namespace __gnu_cxx
   _M_ptr() const noexcept
   { return static_cast(_M_addr()); }
 };
+#endif
 
 } // namespace

[PATCH] Fix -fcompare-debug failure on pr84146.c (PR target/84146)

2018-02-07 Thread Jakub Jelinek

Hi!

Unfortunately, seems my rest_of_insert_endbranch fix doesn't fix
-fcompare-debug on the testcase, when adding the endbr after the setjmp
call with no note in between, we add it into the same bb as the setjmp call,
while when adding it with -g with NOTE_INSN_CALL_ARG_LOCATION, which is
already outside of the bb, we add it outside of bb.

This patch fixes it by removing lots of code:
 22 files changed, 50 insertions(+), 205 deletions(-)
instead of sticking the call arg location info into a separate note that
is required to be adjacent to the call and thus requires lots of special
cases everywhere we emit it as a REG_CALL_ARG_LOCATION note in REG_NOTES
directly on the call.
All we need to ensure is that we remove that reg note before emitting
-fcompare-debug final insns dump, and need to unshare the rtl in there
(apparently rtl sharing verification ignores
NOTE_INSN_{CALL_ARG,VAR}_LOCATION notes, but of course not REG_NOTES).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-02-07  Jakub Jelinek  

PR target/84146
* reg-notes.def (REG_CALL_ARG_LOCATION): New reg note.
* insn-notes.def (NOTE_INSN_CALL_ARG_LOCATION): Remove.
* var-tracking.c (emit_note_insn_var_location): Remove all references
to NOTE_INSN_CALL_ARG_LOCATION.
(emit_notes_in_bb): Emit arguments as REG_CALL_ARG_LOCATION note on
the CALL_INSN rather than separate NOTE_INSN_CALL_ARG_LOCATION note.
Use copy_rtx_if_shared.
* dwarf2out.c (gen_subprogram_die): Use XEXP with 0 instead of
NOTE_VAR_LOCATION on ca_loc->call_arg_loc_note.
(dwarf2out_var_location): Remove handling of
NOTE_INSN_CALL_ARG_LOCATION, instead handle REG_CALL_ARG_LOCATION note
on call_insn.
* final.c (final_scan_insn): Remove all references to
NOTE_INSN_CALL_ARG_LOCATION.
(rest_of_clean_state): Likewise.  Remove REG_CALL_ARG_LOCATION notes
before dumping final insns.
* except.c (emit_note_eh_region_end): Remove all references to
NOTE_INSN_CALL_ARG_LOCATION.
* config/alpha/alpha.c (alpha_pad_function_end): Likewise.
* config/c6x/c6x.c (c6x_gen_bundles): Likewise.
* config/arc/arc.c (hwloop_optimize): Likewise.
* config/arm/arm.c (create_fix_barrier): Likewise.
* config/s390/s390.c (s390_chunkify_start): Likewise.
* config/sh/sh.c (find_barrier): Likewise.
* config/i386/i386.c (rest_of_insert_endbranch,
ix86_seh_fixup_eh_fallthru): Likewise.
* config/xtensa/xtensa.c (hwloop_optimize): Likewise.
* config/iq2000/iq2000.c (final_prescan_insn): Likewise.
* config/frv/frv.c (frv_function_prologue): Likewise.
* emit-rtl.c (try_split): Likewise.  Copy over REG_CALL_ARG_LOCATION
reg note.
(note_outside_basic_block_p): Remove all references to
NOTE_INSN_CALL_ARG_LOCATION.
* gengtype.c (adjust_field_rtx_def): Likewise.
* print-rtl.c (rtx_writer::print_rtx_operand_code_0, print_insn):
Likewise.
* jump.c (cleanup_barriers, delete_related_insns): Likewise.
* cfgrtl.c (force_nonfallthru_and_redirect): Likewise.

* gcc.target/i386/pr84146.c: Add -fcompare-debug to dg-options.

--- gcc/reg-notes.def.jj2018-01-03 10:19:55.239533971 +0100
+++ gcc/reg-notes.def   2018-02-07 16:40:03.800915206 +0100
@@ -239,3 +239,6 @@ REG_NOTE (CALL_DECL)
when a called function has a 'notrack' attribute.  This note is used by the
compiler when the option -fcf-protection=branch is specified.  */
 REG_NOTE (CALL_NOCF_CHECK)
+
+/* The values passed to callee, for debuginfo purposes.  */
+REG_NOTE (CALL_ARG_LOCATION)
--- gcc/insn-notes.def.jj   2018-01-03 10:19:55.669534040 +0100
+++ gcc/insn-notes.def  2018-02-07 16:40:03.800915206 +0100
@@ -65,9 +65,6 @@ INSN_NOTE (EH_REGION_END)
 /* The location of a variable.  */
 INSN_NOTE (VAR_LOCATION)
 
-/* The values passed to callee.  */
-INSN_NOTE (CALL_ARG_LOCATION)
-
 /* The beginning of a statement.  */
 INSN_NOTE (BEGIN_STMT)
 
--- gcc/var-tracking.c.jj   2018-02-07 13:11:26.950985542 +0100
+++ gcc/var-tracking.c  2018-02-07 16:44:10.810478942 +0100
@@ -8860,14 +8860,12 @@ emit_note_insn_var_location (variable **
   /* Make sure that the call related notes come first.  */
   while (NEXT_INSN (insn)
 && NOTE_P (insn)
-&& ((NOTE_KIND (insn) == NOTE_INSN_VAR_LOCATION
- && NOTE_DURING_CALL_P (insn))
-|| NOTE_KIND (insn) == NOTE_INSN_CALL_ARG_LOCATION))
+&& NOTE_KIND (insn) == NOTE_INSN_VAR_LOCATION
+&& NOTE_DURING_CALL_P (insn))
insn = NEXT_INSN (insn);
   if (NOTE_P (insn)
- && ((NOTE_KIND (insn) == NOTE_INSN_VAR_LOCATION
-  && NOTE_DURING_CALL_P (insn))
- || NOTE_KIND (insn) == NOTE_INSN_CALL_ARG_LOCATION))
+ && NOTE_KIND (insn) == NOTE_INSN_VAR_LOCATION

Re: PATCH to fix bogus warning with -Wstringop-truncation -g (PR tree-optimization/84228)

2018-02-07 Thread Jeff Law

On 02/06/2018 08:59 AM, Martin Sebor wrote:
> On 02/06/2018 06:23 AM, Marek Polacek wrote:
>> On Tue, Feb 06, 2018 at 01:57:36PM +0100, Jakub Jelinek wrote:
>>> On Tue, Feb 06, 2018 at 01:46:21PM +0100, Marek Polacek wrote:
 --- gcc/testsuite/c-c++-common/Wstringop-truncation-3.c
 +++ gcc/testsuite/c-c++-common/Wstringop-truncation-3.c
 @@ -0,0 +1,20 @@
 +/* PR tree-optimization/84228 */
 +/* { dg-do compile } */
 +/* { dg-options "-Wstringop-truncation -O2 -g" } */
 +
 +char *strncpy (char *, const char *, __SIZE_TYPE__);
 +struct S
 +{
 +  char arr[64];
 +};
 +
 +int
 +foo (struct S *p1, const char *a)
 +{
 +  if (a)
 +    goto err;
 +  strncpy (p1->arr, a, sizeof p1->arr); /* { dg-bogus "specified
 bound" } */
>>>
>>> Just curious, what debug stmt is in between those?
>>
>> Just
>>
>>   strncpy (_1, 0B, 64);
>>   # DEBUG BEGIN_STMT
>>   p1_5(D)->arr[3] = 0;
>>
>>> Wouldn't it be better to force them a couple of debug stmts?
>>> Say
>>>   int b = 5, c = 6, d = 7;
>>> at the start of the function and
>>>   b = 8; c = 9; d = 10;
>>> in between strncpy and the '\0' store?
>>
>> Yep.  I tweaked the testcase.  Now we have
>>
>>   strncpy (_1, 0B, 64);
>>   # DEBUG BEGIN_STMT
>>   # DEBUG b => 8
>>   # DEBUG BEGIN_STMT
>>   # DEBUG c => 9
>>   # DEBUG BEGIN_STMT
>>   # DEBUG d => 10
>>   # DEBUG BEGIN_STMT
>>   p1_5(D)->arr[3] = 0;
>>
 +  p1->arr[3] = '\0';
 +err:
 +  return 0;
 +}
 diff --git gcc/tree-ssa-strlen.c gcc/tree-ssa-strlen.c
 index c3cf432a921..f0f6535017b 100644
 --- gcc/tree-ssa-strlen.c
 +++ gcc/tree-ssa-strlen.c
 @@ -1849,7 +1849,7 @@ maybe_diag_stxncpy_trunc (gimple_stmt_iterator
 gsi, tree src, tree cnt)

    /* Look for dst[i] = '\0'; after the stxncpy() call and if found
   avoid the truncation warning.  */
 -  gsi_next ();
 +  gsi_next_nondebug ();
    gimple *next_stmt = gsi_stmt (gsi);

    if (!gsi_end_p (gsi) && is_gimple_assign (next_stmt))
>>>
>>> Ok for trunk, though generally looking at just next stmt is very
>>> fragile, might be
>>> better to look at the strncpy's vuse immediate uses if they are
>>> within the
>>> same basic block and either don't alias with it, or are the store it is
>>> looking for, or something similar.
>>
>> I guess some
>> FOR_EACH_IMM_USE_FAST (...)
>>   {
>>     if (is_gimple_debug (USE_STMT (use_p)))
>>   continue;
>> ...
>> would be better.
> 
> Jeff and I had a fairly extensive discussion about how best to
> handle debug statements.  I had prototyped his suggestion along
> these lines but ran into false positives and other difficulties.
> The details are here:
> 
>   https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00805.html
> 
> I thought I ended up just using gsi_next_nondebug() and added
> a test for it but I guess I must be misremembering.  The patch
> review spanned a couple of months so it's even possible I forgot
> to make the change.
Right.  I remember that being the ultimate decision (gsi_next_nondebug).
 Perhaps the patch got lost in the long cycle times.

Jeff

Re: PATCH to fix bogus warning with -Wstringop-truncation -g (PR tree-optimization/84228)

2018-02-07 Thread Jeff Law

On 02/06/2018 05:57 AM, Jakub Jelinek wrote:
> On Tue, Feb 06, 2018 at 01:46:21PM +0100, Marek Polacek wrote:
>> --- gcc/testsuite/c-c++-common/Wstringop-truncation-3.c
>> +++ gcc/testsuite/c-c++-common/Wstringop-truncation-3.c
>> @@ -0,0 +1,20 @@
>> +/* PR tree-optimization/84228 */
>> +/* { dg-do compile } */
>> +/* { dg-options "-Wstringop-truncation -O2 -g" } */
>> +
>> +char *strncpy (char *, const char *, __SIZE_TYPE__);
>> +struct S
>> +{
>> +  char arr[64];
>> +};
>> +
>> +int
>> +foo (struct S *p1, const char *a)
>> +{
>> +  if (a)
>> +goto err;
>> +  strncpy (p1->arr, a, sizeof p1->arr); /* { dg-bogus "specified bound" } */
> 
> Just curious, what debug stmt is in between those?
> Wouldn't it be better to force them a couple of debug stmts?
> Say
>   int b = 5, c = 6, d = 7;
> at the start of the function and
>   b = 8; c = 9; d = 10;
> in between strncpy and the '\0' store?
> 
>> +  p1->arr[3] = '\0';
>> +err:
>> +  return 0;
>> +}
>> diff --git gcc/tree-ssa-strlen.c gcc/tree-ssa-strlen.c
>> index c3cf432a921..f0f6535017b 100644
>> --- gcc/tree-ssa-strlen.c
>> +++ gcc/tree-ssa-strlen.c
>> @@ -1849,7 +1849,7 @@ maybe_diag_stxncpy_trunc (gimple_stmt_iterator gsi, 
>> tree src, tree cnt)
>>  
>>/* Look for dst[i] = '\0'; after the stxncpy() call and if found
>>   avoid the truncation warning.  */
>> -  gsi_next ();
>> +  gsi_next_nondebug ();
>>gimple *next_stmt = gsi_stmt (gsi);
>>  
>>if (!gsi_end_p (gsi) && is_gimple_assign (next_stmt))
> 
> Ok for trunk, though generally looking at just next stmt is very fragile, 
> might be
> better to look at the strncpy's vuse immediate uses if they are within the
> same basic block and either don't alias with it, or are the store it is
> looking for, or something similar.
Martin and I wandered down this approach a bit and ultimately decided
against it.  While yes it could avoid a false positive by looking at the
immediate uses, but I'm not sure avoiding the false positive in those
cases is actually good!

THe larger the separation between the strcpy and the truncation the more
likely it is that the code is wrong or at least poorly written and
deserves a developer looksie.

jeff

Re: PATCH to fix bogus warning with -Wstringop-truncation -g (PR tree-optimization/84228)

2018-02-07 Thread Jeff Law

On 02/06/2018 05:46 AM, Marek Polacek wrote:
> When -Wstringop-truncation sees a strncpy call where the specified bound
> is equal to the size of the destination, it looks at the next statement
> to see if it's dst[i] = '\0';, and if it is, it doesn't warn.  But it
> needs to look at the next nondebug statement, otherwise we can get a
> false positive with -g, as this testcase shows.
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
> 
> 2018-02-06  Marek Polacek  
> 
>   PR tree-optimization/84228
>   * tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Skip debug statements.
> 
>   * c-c++-common/Wstringop-truncation-3.c: New test.
OK.
jeff

Re: [PATCH/RFC] Fix ICE in find_taken_edge_computed_goto (PR 84136)

2018-02-07 Thread Jeff Law

On 02/02/2018 02:35 PM, David Malcolm wrote:
> On Thu, 2018-02-01 at 12:05 +0100, Richard Biener wrote:
>> On Wed, Jan 31, 2018 at 4:39 PM, David Malcolm 
>> wrote:
>>> PR 84136 reports an ICE within sccvn_dom_walker when handling a
>>> C/C++ source file that overuses the labels-as-values extension.
>>> The code in question stores a jump label into a global, and then
>>> jumps to it from another function, which ICEs after inlining:
>>>
>>> void* a;
>>>
>>> void foo() {
>>>   if ((a = &))
>>>   return;
>>>
>>>   l:;
>>> }
>>>
>>> int main() {
>>>   foo();
>>>   goto *a;
>>>
>>>   return 0;
>>> }
>>>
>>> This appears to be far beyond what we claim to support in this
>>> extension - but we shouldn't ICE.
>>>
>>> What's happening is that, after inlining, we have usage of a *copy*
>>> of the label, which optimizes away the if-return logic, turning it
>>> into an infinite loop.
>>>
>>> On entry to the sccvn_dom_walker we have this gimple:
>>>
>>> main ()
>>> {
>>>   void * a.0_1;
>>>
>>>[count: 0]:
>>>   a = 
>>>
>>>[count: 0]:
>>> l:
>>>   a.0_1 = a;
>>>   goto a.0_1;
>>> }
>>>
>>> and:
>>>   edge taken = find_taken_edge (bb, vn_valueize (val));
>>> reasonably valueizes the:
>>>   goto a.0_1;
>>> after the:
>>>   a = 
>>>   a.0_1 = a;
>>> as if it were:
>>>   goto *
>>>
>>> find_taken_edge_computed_goto then has:
>>>
>>> 2380  dest = label_to_block (val);
>>> 2381  if (dest)
>>> 2382{
>>> 2383  e = find_edge (bb, dest);
>>> 2384  gcc_assert (e != NULL);
>>> 2385}
>>>
>>> which locates dest as a self-jump from block 3 back to itself.
>>>
>>> However, the find_edge call returns NULL - it has a predecessor
>>> edge
>>> from block 2, but no successor edges.
>>>
>>> Hence the assertion fails and we ICE.
>>>
>>> A successor edge from the computed goto could have been created by
>>> make_edges if the label stmt had been in the function, but
>>> make_edges
>>> only looks in the current function when handling computed gotos,
>>> and
>>> the label only appeared after inlining.
>>>
>>> The following patch removes the assertion, fixing the ICE.
>>>
>>> Successfully bootstrapped on x86_64-pc-linux-gnu.
>>>
>>> If that's option (a), there could be some other approaches:
>>>
>>> (b) convert the assertion into a warning/error/sorry, on the
>>> assumption that if we don't detect such an edge then the code
>>> is
>>> presumably abusing the labels-as-values feature
>>> (c) have make_edges detect such a problematic computed goto (maybe
>>> converting make_edges_bb's return value to an enum and adding a
>>> 4th
>>> value - though it's not clear what to do then with it)
>>> (d) detect this case on inlining and handle it somehow (e.g. adding
>>> edges for labels that have appeared since make_edges originally
>>> ran, for computed gotos that have no out-edges)
>>> (e) do nothing, keeping the assertion, and accept that this is
>>> going
>>> to fail on a non-release build
>>> (f) something else?
>>>
>>> Of the above, (d) seems to me to be the most robust solution, but I
>>> don't know how far we want to go "down the rabbit hole" of handling
>>> such uses of labels-as-values (beyond not ICE-ing on them).
>>>
>>> Thoughts?
>>
>> I think you can preserve the assert for ! DECL_NONLOCAL (val) thus
>>
>> gcc_assert (e != NULL || DECL_NONLOCAL (val));
>>
>> does the label in this case properly have DECL_NONLOCAL
>> set?  Probably
>> not given we shouldn't have duplicated it in this case.  
> 
> Indeed, the inlined copy of the label doesn't have DECL_NONLOCAL set:
> 
> (gdb) p val->decl_common.nonlocal_flag 
> $5 = 0
> 
>> So the issue is really
>> that the FE doesn't set this bit for "escaped" labels... but I'm not
>> sure how
>> to easily constrain the extension here.
>>
>> The label should be FORCED_LABEL though so that's maybe a weaker
>> check.
> 
> It does have FORCED_LABEL set:
> 
> (gdb) p val->base.side_effects_flag 
> $6 = 1
> 
> ...though presumably that's going to be set for just about any label
> that a computed goto jumps to?  Hence this is presumably of little
> benefit for adjusting the assertion.
Agreed.

jeff

Re: [PATCH/RFC] Fix ICE in find_taken_edge_computed_goto (PR 84136)

2018-02-07 Thread Jeff Law

On 02/01/2018 04:05 AM, Richard Biener wrote:
> On Wed, Jan 31, 2018 at 4:39 PM, David Malcolm  wrote:
>> PR 84136 reports an ICE within sccvn_dom_walker when handling a
>> C/C++ source file that overuses the labels-as-values extension.
>> The code in question stores a jump label into a global, and then
>> jumps to it from another function, which ICEs after inlining:
>>
>> void* a;
>>
>> void foo() {
>>   if ((a = &))
>>   return;
>>
>>   l:;
>> }
>>
>> int main() {
>>   foo();
>>   goto *a;
>>
>>   return 0;
>> }
>>
>> This appears to be far beyond what we claim to support in this
>> extension - but we shouldn't ICE.
>>
>> What's happening is that, after inlining, we have usage of a *copy*
>> of the label, which optimizes away the if-return logic, turning it
>> into an infinite loop.
>>
>> On entry to the sccvn_dom_walker we have this gimple:
>>
>> main ()
>> {
>>   void * a.0_1;
>>
>>[count: 0]:
>>   a = 
>>
>>[count: 0]:
>> l:
>>   a.0_1 = a;
>>   goto a.0_1;
>> }
>>
>> and:
>>   edge taken = find_taken_edge (bb, vn_valueize (val));
>> reasonably valueizes the:
>>   goto a.0_1;
>> after the:
>>   a = 
>>   a.0_1 = a;
>> as if it were:
>>   goto *
>>
>> find_taken_edge_computed_goto then has:
>>
>> 2380  dest = label_to_block (val);
>> 2381  if (dest)
>> 2382{
>> 2383  e = find_edge (bb, dest);
>> 2384  gcc_assert (e != NULL);
>> 2385}
>>
>> which locates dest as a self-jump from block 3 back to itself.
>>
>> However, the find_edge call returns NULL - it has a predecessor edge
>> from block 2, but no successor edges.
>>
>> Hence the assertion fails and we ICE.
>>
>> A successor edge from the computed goto could have been created by
>> make_edges if the label stmt had been in the function, but make_edges
>> only looks in the current function when handling computed gotos, and
>> the label only appeared after inlining.
>>
>> The following patch removes the assertion, fixing the ICE.
>>
>> Successfully bootstrapped on x86_64-pc-linux-gnu.
>>
>> If that's option (a), there could be some other approaches:
>>
>> (b) convert the assertion into a warning/error/sorry, on the
>> assumption that if we don't detect such an edge then the code is
>> presumably abusing the labels-as-values feature
>> (c) have make_edges detect such a problematic computed goto (maybe
>> converting make_edges_bb's return value to an enum and adding a 4th
>> value - though it's not clear what to do then with it)
>> (d) detect this case on inlining and handle it somehow (e.g. adding
>> edges for labels that have appeared since make_edges originally
>> ran, for computed gotos that have no out-edges)
>> (e) do nothing, keeping the assertion, and accept that this is going
>> to fail on a non-release build
>> (f) something else?
>>
>> Of the above, (d) seems to me to be the most robust solution, but I
>> don't know how far we want to go "down the rabbit hole" of handling
>> such uses of labels-as-values (beyond not ICE-ing on them).
>>
>> Thoughts?
> 
> I think you can preserve the assert for ! DECL_NONLOCAL (val) thus
> 
> gcc_assert (e != NULL || DECL_NONLOCAL (val));
> 
> does the label in this case properly have DECL_NONLOCAL set?  Probably
> not given we shouldn't have duplicated it in this case.  So the issue is 
> really
> that the FE doesn't set this bit for "escaped" labels... but I'm not sure how
> to easily constrain the extension here.
> 
> The label should be FORCED_LABEL though so that's maybe a weaker
> check.
As David mentioned, I don't think that checking FORCED_LABEL is going to
be useful here.

Ideally we'd tighten the extension's language so that we could issue an
error out of the front-end.


Jeff

Re: [PATCH/RFC] Fix ICE in find_taken_edge_computed_goto (PR 84136)

2018-02-07 Thread Jeff Law

On 01/31/2018 08:39 AM, David Malcolm wrote:
> PR 84136 reports an ICE within sccvn_dom_walker when handling a
> C/C++ source file that overuses the labels-as-values extension.
> The code in question stores a jump label into a global, and then
> jumps to it from another function, which ICEs after inlining:
That's not "overuse", that's simply invalid.

The docs are pretty clear.  Quoting from extend.texi:

--

You may not use this mechanism to jump to code in a different function.
If you do that, totally unpredictable things happen.  The best way to
avoid this is to store the label address only in automatic variables and
never pass it as an argument.

--

So at the source level this is bogus.  THe fact that inlining brings the
context of foo into main is beside the point.

> 
> This appears to be far beyond what we claim to support in this
> extension - but we shouldn't ICE.
Agreed.  An ICE is bad.  One could make the argument that we should warn
on an escaped label value, but I doubt it's worth the headache.

> 
> What's happening is that, after inlining, we have usage of a *copy*
> of the label, which optimizes away the if-return logic, turning it
> into an infinite loop.
> 
> On entry to the sccvn_dom_walker we have this gimple:
> 
> main ()
> {
>   void * a.0_1;
> 
>[count: 0]:
>   a = 
> 
>[count: 0]:
> l:
>   a.0_1 = a;
>   goto a.0_1;
> }

ACK.

> 
> and:
>   edge taken = find_taken_edge (bb, vn_valueize (val));
> reasonably valueizes the:
>   goto a.0_1;
> after the:
>   a = 
>   a.0_1 = a;
> as if it were:
>   goto *
> 
> find_taken_edge_computed_goto then has:
> 
> 2380dest = label_to_block (val);
> 2381if (dest)
> 2382  {
> 2383e = find_edge (bb, dest);
> 2384gcc_assert (e != NULL);
> 2385  }
> 
> which locates dest as a self-jump from block 3 back to itself.
> 
> However, the find_edge call returns NULL - it has a predecessor edge
> from block 2, but no successor edges.
> 
> Hence the assertion fails and we ICE.
> 
> A successor edge from the computed goto could have been created by
> make_edges if the label stmt had been in the function, but make_edges
> only looks in the current function when handling computed gotos, and
> the label only appeared after inlining.
Which is the core problem.  Invalid input which breaks assumptions later.

I do not believe make_edges should be looking outside the current function.


> 
> The following patch removes the assertion, fixing the ICE.
> 
> Successfully bootstrapped on x86_64-pc-linux-gnu.
> 
> If that's option (a), there could be some other approaches:
> 
> (b) convert the assertion into a warning/error/sorry, on the
> assumption that if we don't detect such an edge then the code is
> presumably abusing the labels-as-values feature
> (c) have make_edges detect such a problematic computed goto (maybe
> converting make_edges_bb's return value to an enum and adding a 4th
> value - though it's not clear what to do then with it)
> (d) detect this case on inlining and handle it somehow (e.g. adding
> edges for labels that have appeared since make_edges originally
> ran, for computed gotos that have no out-edges)
> (e) do nothing, keeping the assertion, and accept that this is going
> to fail on a non-release build
> (f) something else?
> 
> Of the above, (d) seems to me to be the most robust solution, but I
> don't know how far we want to go "down the rabbit hole" of handling
> such uses of labels-as-values (beyond not ICE-ing on them).
(b) and (d) seem like the best choices to me.

The problem with (d) is we don't have any kind of reasonable escape
analysis for the label.  Even if you can determine the label escapes,
does the mere escape trigger an error?  I could argue either way.


So I'd lean towards (b).  Though it seems awful late in the pipeline to
diagnose this kind of error.

Jeff

Re: [PR tree-optimization/84224] do not ICE on malformed allocas

2018-02-07 Thread Jeff Law

On 02/06/2018 02:38 AM, Aldy Hernandez wrote:
> The -Walloca pass can receive a malformed alloca, courtesy of someone
> providing a faulty prototype.  This was causing an ICE because we
> assumed alloca calls had at least one argument, which the testcase does
> not:
> 
> +void *alloca ();
> +__typeof__(alloca ()) a () { return alloca (); }
> 
> I don't believe it should be the responsibility of the
> -Walloca-larger-than=* pass to warn against such things, so I propose we
> just ignore this.
> 
> I also think we should handle this testcase, regardless of the target
> having an alloca builtin, since the testcase includes its own
> prototype.  Thus, the missing "{dg-require-effect-target alloca}".
Wouldn't it make more sense to put the argument check into
gimple_alloca_call_p that way all callers are insulated?

After all if the alloca doesn't have an argument it probably doesn't
have the semantics of alloca that we expect -- like returning nonzero.

And if we look at vr-values.c:gimple_stmt_nonzero_p we see that it calls
gimple_alloca_call_p and if it returns true, then we assume the call
always returns a nonzero value.   Fixing gimple_alloca_call_p would take
care of both problems at once.

OK with that change.

jeff

Re: [PATCH] Fix floating point handling in dom (PR tree-optimization/84235)

2018-02-07 Thread Jeff Law

On 02/07/2018 12:06 AM, Richard Biener wrote:
> On February 6, 2018 9:40:37 PM GMT+01:00, Jakub Jelinek  
> wrote:
>> Hi!
>>
>> As the following testcase shows, dom2 miscompiles floating point x - x
>> into 0.0 even when x could be infinity and x - x then a NaN.
>> The corresponding match.pd optimization is:
>> /* Simplify x - x.
>>   This is unsafe for certain floats even in non-IEEE formats.
>>   In IEEE, it is unsafe because it does wrong for NaNs.
>>   Also note that operand_equal_p is always false if an operand
>>   is volatile.  */
>> (simplify
>> (minus @0 @0)
>> (if (!FLOAT_TYPE_P (type) || !HONOR_NANS (type))
>>  { build_zero_cst (type); }))
>> The patch makes it match what match.pd does.
>> We also have:
>> /* X / X is one.  */
>> (simplify
>>  (div @0 @0)
>> /* But not for 0 / 0 so that we can get the proper warnings and errors.
>> And not for _Fract types where we can't build 1.  */
>>  (if (!integer_zerop (@0) && !ALL_FRACT_MODE_P (TYPE_MODE (type)))
>>   { build_one_cst (type); }))
>> We can ignore the 0 / 0 case, we have both operands SSA_NAMEs and
>> match.pd
>> only avoids optimizing away literal 0 / 0, but the rest is valid, some
>> fract
>> types don't have a way to express 1.
>>
>> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> OK. 
> 
> I wonder why we have to re-implement all this in DOM. there's enough of match 
> and simplify interfaces to make it use that? 
What's going on with this code is we have some kind of binary
expression.  We do not know the value of either operand.

However, we have recorded that the two operands are equal to each other
via a conditional equivalence.  ie, we have 1 = (A == B) in the
expression hash table.

For many binary operations knowing the operands are equal allows us to
eliminate the binary operation.

This was part of avoiding regressing when we stopped recording
conditional equivalences in the cprop tables.

What I'm concerned about here is how did we get into this code for a
floating point object  That should have been filtered out earlier!


Jeff

Re: [PATCH] add -Wstringop-overflow to LTO options (PR 84212)

2018-02-07 Thread Martin Sebor


I went ahead and changed all the options on the list below
to include LTO and tested the attached patch by configuring
with --with-build-config=bootstrap-lto --disable-werror and
making profiledbootstrap.  Attached, besides the patch, is
also the breakdown of warnings.  The interesting column is
the one labeled Unique.  It gives the number of distinct
instances of each warning.

This was with all languages but Go and Brig.  Those two fail
what seem like unrelated reasons.  The Brig error is some
unsatisfied reference (I lost the log).  Go fails with
a bunch of errors looking like this, one for each errno
constant:

/opt/notnfs/msebor/src/gcc/84212/libgo/go/syscall/env_unix.go:96:10: 
error: reference to undefined name ‘EINVAL’


On 02/07/2018 12:30 PM, Martin Sebor wrote:

In PR 84212 the reporter asks why -Wno-stringop-overflow has
no effect during LTO linking.  It turns out that the reason
is the same as in bug 78768: the specification in the c.opt
file is missing LTO among the languages.

The attached patch adds LTO to it and to -Wstringop-truncation.

Bootstrapped and regtested on x86_64-linux.

There are other middle-end options in the c.opt file that do
not mention LTO that arguably should (*).  I didn't change
those in this patch, in part because I don't have test cases
showing where it matters, and in part because I don't think
that having to remember to include LTO in these options (and,
ideally, also include a test in the test suite for each) is
a good approach.

It seems that including LTO implicitly for all options would
obviate this manual step and eliminate the risk of missing
them.  Is there a reason not to do that?

If implicitly including LTO for every option is not feasible,
then it might be worthwhile to write a small script to verify
that every middle-end option does mention LTO, and have make
run the script as part of the self-test step.

Is there support for either of these changes?  If not, are
there any other ideas for how to avoid these kind of bugs?

Martin

[*] Here are a few examples.  I'm fine with adding LTO to
any or all of these as well as any others that I may have
missed for GCC 8 if that sounds like a good idea.

  -Walloc-size-larger-than
  -Warray-bounds
  -Wformat-truncation=
  -Wmaybe-uninitialized
  -Wnonnull
  -Wrestrict
  -Wstrict-overflow
  -Wsuggest-attribute
  -Wuninitialized


PR lto/84212 -  -Wno-* does not disable warnings from -flto link stage

gcc/c-family/ChangeLog:

	PR lto/84212
	* c.opt (-Wstringop-overflow, -Warray-bounds): Add LTO.
	(-Walloc-size-larger-than, -Wformat-truncation=): Same.
	(-Wmaybe-uninitialized, -Wnonnull, -Wrestrict): Same.
	(-Wstrict-overflow, -Wsuggest-attribute): Same.
	(-Wuninitialized): Same.

gcc/testsuite/ChangeLog:

	PR lto/84212
	* gcc.dg/lto/pr84212_0.c: New test file.
	* gcc.dg/lto/pr84212_1.c: Same.

Index: gcc/c-family/c.opt
===
--- gcc/c-family/c.opt	(revision 257453)
+++ gcc/c-family/c.opt	(working copy)
@@ -304,7 +304,7 @@ C ObjC C++ ObjC++ Var(warn_alloca) Warning
 Warn on any use of alloca.
 
 Walloc-size-larger-than=
-C ObjC C++ ObjC++ Var(warn_alloc_size_limit) Warning Joined LangEnabledBy(C ObjC C++ ObjC++,Wall)
+C ObjC C++ ObjC++ LTO Var(warn_alloc_size_limit) Warning Joined LangEnabledBy(C ObjC C++ ObjC++ LTO,Wall)
 -Walloc-size-larger-than= Warn for calls to allocation functions that
 attempt to allocate objects larger than the specified number of bytes.
 
@@ -319,11 +319,11 @@ alloca, and on bounded uses of alloca whose bound
  bytes.
 
 Warray-bounds
-LangEnabledBy(C ObjC C++ ObjC++,Wall)
+LangEnabledBy(C ObjC C++ ObjC++ LTO,Wall)
 ; in common.opt
 
 Warray-bounds=
-LangEnabledBy(C ObjC C++ ObjC++,Wall,1,0)
+LangEnabledBy(C ObjC C++ ObjC++ LTO,Wall,1,0)
 ; in common.opt
 
 Wassign-intercept
@@ -575,12 +575,12 @@ C ObjC C++ ObjC++ Joined RejectNegative UInteger V
 Warn about printf/scanf/strftime/strfmon format string anomalies.
 
 Wformat-overflow=
-C ObjC C++ LTO ObjC++ Joined RejectNegative UInteger Var(warn_format_overflow) Warning LangEnabledBy(C ObjC C++ ObjC++,Wformat=, warn_format >= 1, 0) IntegerRange(0, 2)
+C ObjC C++ LTO ObjC++ Joined RejectNegative UInteger Var(warn_format_overflow) Warning LangEnabledBy(C ObjC C++ ObjC++ LTO,Wformat=, warn_format >= 1, 0) IntegerRange(0, 2)
 Warn about function calls with format strings that write past the end
 of the destination region.
 
 Wformat-truncation=
-C ObjC C++ ObjC++ Joined RejectNegative UInteger Var(warn_format_trunc) Warning LangEnabledBy(C ObjC C++ ObjC++,Wformat=, warn_format >= 1, 0) IntegerRange(0, 2)
+C ObjC C++ ObjC++ LTO Joined RejectNegative UInteger Var(warn_format_trunc) Warning LangEnabledBy(C ObjC C++ ObjC++ LTO,Wformat=, warn_format >= 1, 0) IntegerRange(0, 2)
 Warn about calls to snprintf and similar functions that truncate output.
 
 Wif-not-aligned
@@ -739,17 +739,17 @@ C ObjC C++ ObjC++ Var(warn_sizeof_array_argument)
 Warn when sizeof is applied on a parameter

[PATCH] -Wformat: fix nonsensical "wide character" message (PR c/84258)

2018-02-07 Thread David Malcolm

PR c/84258 reports that we issue this:

 warning: format is a wide character string [-Wformat=]

on this code:

  const unsigned char cuc[] = "%i";
  sprintf(buf, (char *)cuc, 1);

despite the absence of wide characters.

This wording dates back 17.5 years to r36586:

 2000-09-24  Joseph S. Myers  

  * c-common.c (check_format_info): Warn for a wide character string
   used as a non-wide format argument.

   * gcc.dg/c90-printf-1.c: Add test for wide string format.

which handled this case:

  printf ((const char *)L"foo"); /* { dg-warning "wide" "wide string" } */

The code in question is currently just checking for something that isn't
char_type_node.

digest_init converts the string literal to the type of the array, converting
it from array of char to array of unsigned char, leading to the condition
failing.

This patch fixes the nonsensical message by checking for the other string
types that lex_string can emit - char16_array_type_node,
char32_array_type_node, and wchar_array_type_node - and only referring
to "wide character string" if it's one of them.  Otherwise it uses a new,
more generic wording.

Successfully bootstrapped on x86_64-pc-linux-gnu.

This isn't a regression, but is fairly low risk.

OK for trunk, either now, or for next stage 1?

gcc/c-family/ChangeLog:
PR c/84258
* c-format.c (struct format_check_results): Add field
"number_non_char".
(check_format_info): Initialize it, and warn if encountered.
(check_format_arg): Distinguish between wide char and
everything else when detecting arrays of non-char.

gcc/testsuite/ChangeLog:
PR c/84258
* c-c++-common/Wformat-pr84258.c: New test.
---
 gcc/c-family/c-format.c  | 18 --
 gcc/testsuite/c-c++-common/Wformat-pr84258.c | 19 +++
 2 files changed, 35 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/Wformat-pr84258.c

diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
index 7a69d5a..3e5b0c7 100644
--- a/gcc/c-family/c-format.c
+++ b/gcc/c-family/c-format.c
@@ -974,6 +974,8 @@ struct format_check_results
   /* Number of leaves of the format argument that were wide string
  literals.  */
   int number_wide;
+  /* Number of leaves of the format argument that are not array of "char".  */
+  int number_non_char;
   /* Number of leaves of the format argument that were empty strings.  */
   int number_empty;
   /* Number of leaves of the format argument that were unterminated
@@ -1433,6 +1435,7 @@ check_format_info (function_format_info *info, tree 
params,
   res.extra_arg_loc = UNKNOWN_LOCATION;
   res.number_dollar_extra_args = 0;
   res.number_wide = 0;
+  res.number_non_char = 0;
   res.number_empty = 0;
   res.number_unterminated = 0;
   res.number_other = 0;
@@ -1509,6 +1512,10 @@ check_format_info (function_format_info *info, tree 
params,
   if (res.number_wide > 0)
 warning_at (loc, OPT_Wformat_, "format is a wide character string");
 
+  if (res.number_non_char > 0)
+warning_at (loc, OPT_Wformat_,
+   "format string is not an array of type %qs", "char");
+
   if (res.number_unterminated > 0)
 warning_at (loc, OPT_Wformat_, "unterminated format string");
 }
@@ -1654,9 +1661,16 @@ check_format_arg (void *ctx, tree format_tree,
   res->number_non_literal++;
   return;
 }
-  if (TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (format_tree))) != 
char_type_node)
+  tree underlying_type
+= TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (format_tree)));
+  if (underlying_type != char_type_node)
 {
-  res->number_wide++;
+  if (underlying_type == char16_type_node
+ || underlying_type == char32_type_node
+ || underlying_type == wchar_type_node)
+   res->number_wide++;
+  else
+   res->number_non_char++;
   return;
 }
   format_chars = TREE_STRING_POINTER (format_tree);
diff --git a/gcc/testsuite/c-c++-common/Wformat-pr84258.c 
b/gcc/testsuite/c-c++-common/Wformat-pr84258.c
new file mode 100644
index 000..d2870a8
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wformat-pr84258.c
@@ -0,0 +1,19 @@
+/* { dg-options "-Wformat" } */
+
+int main (int argc, char **argv)
+{
+  char buf[10];
+
+  char c[] = "%i";
+  unsigned char uc[] = "%i";
+  const char cc[] = "%i";
+  const unsigned char cuc[] = "%i";
+
+  __builtin_sprintf(buf, (char *)c, 1);
+  __builtin_sprintf(buf, (char *)uc, 1);
+  __builtin_sprintf(buf, (char *)cc, 1);
+  __builtin_sprintf(buf, (char *)cuc, 1); /* { dg-warning "format string is 
not an array of type 'char'" } */
+  __builtin_sprintf(buf, (const char *)L"foo"); /* { dg-warning "format is a 
wide character string" } */
+
+  return 0;
+}
-- 
1.8.5.3

Re: [RFC][PR82479] missing popcount builtin detection

2018-02-07 Thread Kugan Vivekanandarajah

Hi Richard,

On 1 February 2018 at 23:21, Richard Biener  wrote:
> On Thu, Feb 1, 2018 at 5:07 AM, Kugan Vivekanandarajah
>  wrote:
>> Hi Richard,
>>
>> On 31 January 2018 at 21:39, Richard Biener  
>> wrote:
>>> On Wed, Jan 31, 2018 at 11:28 AM, Kugan Vivekanandarajah
>>>  wrote:
 Hi Richard,

 Thanks for the review.
 On 25 January 2018 at 20:04, Richard Biener  
 wrote:
> On Wed, Jan 24, 2018 at 10:56 PM, Kugan Vivekanandarajah
>  wrote:
>> Hi All,
>>
>> Here is a patch for popcount builtin detection similar to LLVM. I
>> would like to queue this for review for next stage 1.
>>
>> 1. This is done part of loop-distribution and effective for -O3 and 
>> above.
>> 2. This does not distribute loop to detect popcount (like
>> memcpy/memmove). I dont think that happens in practice. Please correct
>> me if I am wrong.
>
> But then it has no business inside loop distribution but instead is
> doing final value
> replacement, right?  You are pattern-matching the whole loop after all.  
> I think
> final value replacement would already do the correct thing if you
> teached number of
> iteration analysis that niter for
>
>[local count: 955630224]:
>   # b_11 = PHI 
>   _1 = b_11 + -1;
>   b_8 = _1 & b_11;
>   if (b_8 != 0)
> goto ; [89.00%]
>   else
> goto ; [11.00%]
>
>[local count: 850510900]:
>   goto ; [100.00%]

 I am looking into this approach. What should be the scalar evolution
 for b_8 (i.e. b & (b -1) in a loop) should be? This is not clear to me
 and can this be represented with the scev?
>>>
>>> No, it's not affine and thus cannot be represented.  You only need the
>>> scalar evolution of the counting IV which is already handled and
>>> the number of iteration analysis needs to handle the above IV - this
>>> is the missing part.
>> Thanks for the clarification. I am now matching this loop pattern in
>> number_of_iterations_exit when number_of_iterations_exit_assumptions
>> fails. If the pattern matches, I am inserting the _builtin_popcount in
>> the loop preheater and setting the loop niter with this. This will be
>> used by the final value replacement. Is this what you wanted?
>
> No, you shouldn't insert a popcount stmt but instead the niter
> GENERIC tree should be a CALL_EXPR to popcount with the
> appropriate argument.

Thats what I tried earlier but ran into some ICEs. I wasn't sure if
niter in tree_niter_desc can take such.

Attached patch now does this. Also had to add support for CALL_EXPR in
few places to handle niter with CALL_EXPR. Does this look OK?

Thanks,
Kugan


gcc/ChangeLog:

2018-02-08  Kugan Vivekanandarajah  

* gimple-expr.c (extract_ops_from_tree): Handle CALL_EXPR.
* tree-chrec.c (chrec_convert_1): Likewise.
* tree-scalar-evolution.c (expression_expensive_p): Likewise.
* tree-ssa-loop-ivopts.c (contains_abnormal_ssa_name_p): Likewise.
* tree-ssa-loop-niter.c (check_popcount_pattern): New.
(number_of_iterations_exit): Record niter for popcount patern.
* tree-ssa.c (verify_ssa): Check stmt to be non NULL.

gcc/testsuite/ChangeLog:

2018-02-08  Kugan Vivekanandarajah  

* gcc.dg/tree-ssa/popcount.c: New test.


>
>> In general, there is also a condition gating the loop like
>>
>> if (b_4 != 0)
>>   goto loop;
>> else
>>   end:
>>
>> This of course will not be removed by final value replacement. Since
>> popcount (0) is defined, this is redundant and could be removed but
>> not removed.
>
> Yeah, that's probably sth for another pass though.  I suppose the
> end: case just uses zero in which case you'll have a PHI and you
> can optimize
>
>   if (b != 0)
> return popcount (b);
>   return 0;
>
> in phiopt.
>
> Richard.
>
>> Thanks,
>> Kugan
>>
>>>
>>> Richard.
>>>
 Thanks,
 Kugan
>
> is related to popcount (b_5).
>
> Richard.
>
>> Bootstrapped and regression tested on aarch64-linux-gnu with no new 
>> regressions.
>>
>> Thanks,
>> Kugan
>>
>> gcc/ChangeLog:
>>
>> 2018-01-25  Kugan Vivekanandarajah  
>>
>> PR middle-end/82479
>> * tree-loop-distribution.c (handle_popcount): New.
>> (pass_loop_distribution::execute): Use handle_popcount.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2018-01-25  Kugan Vivekanandarajah  
>>
>> PR middle-end/82479
>> * gcc.dg/tree-ssa/popcount.c: New test.
From 296efa2c09cdd797e3295f0c29a5a943dc87e9f2 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah 
Date: Thu, 8 Feb 2018 09:28:57 +1100
Subject: [PATCH] popcount

Re: [PATCH], Document the PowerPC --with-long-double-format={ibm,ieee} configuration options

2018-02-07 Thread Michael Meissner

On Wed, Feb 07, 2018 at 05:26:49PM -0600, Segher Boessenkool wrote:
> On Wed, Feb 07, 2018 at 05:48:26PM -0500, Michael Meissner wrote:
> > This patch udpates the GCC installation documentation to document the
> > configuration options to set the long double type on the PowerPC: I verified
> > that the documentation builds.  Can I install this on to the trunk?
> 
> Yes please, thanks!  A few minor issues:
> 
> > 2018-02-07  Michael Meissner  
> > 
> > * doc/install.texi (Configuration): Document the
> > --with-long-double-format={ibm,eee} PowerPC configuration
> > options.
> 
> Typo ("ieee").
> 
> > +@item --with-long-double-format=ibm
> > +@itemx --with-long-double-format-ieee
> 
> Typo ("=ieee").
> 
> > +systems.  If you are building multilibs, you will need to configure
> > +the compiler using the @option{--with-system-zlib} option.

Thanks.

> Why is that?  Is that a bug?  On BE we build multilibs all the time,
> and we do not need that flag.

No idea.  I imagine we will probably see the issue if/when we add IEEE
multilibs to big endian.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Re: Please accept this commit for the trunk

2018-02-07 Thread Mike Stump

On Feb 5, 2018, at 8:42 AM, Douglas Mencken  wrote:
> 
> I’m about
> 
> “ [PATCH 2/4] [Darwin,PPC] Remove uses of LR in
> restore_world ” https://gcc.gnu.org/bugzilla/attachment.cgi?id=42304
> 
> look at bug #84113 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84113 for
> more info
> 
> “ One important question ’s yet: Why this patch has been ignored despite
> it’s been made just in time? ”

I dusted the pointed to patch off and check it in.  Let us know how it goes.

Does this resolve all of PR84113?  If so, I can push the bug along.

What PR was the attachment url from?

Thanks for your help.

2018-02-07  Iain Sandoe  

* config/rs6000/altivec.md (*restore_world): Remove LR use.
* config/rs6000/predicates.md (restore_world_operation): Adjust op
count, remove one USE.

Index: gcc/config/rs6000/altivec.md
===
--- gcc/config/rs6000/altivec.md(revision 257471)
+++ gcc/config/rs6000/altivec.md(working copy)
@@ -419,7 +419,6 @@
 (define_insn "*restore_world"
  [(match_parallel 0 "restore_world_operation"
   [(return)
-  (use (reg:SI LR_REGNO))
(use (match_operand:SI 1 "call_operand" "s"))
(clobber (match_operand:SI 2 "gpc_reg_operand" "=r"))])]
  "TARGET_MACHO && (DEFAULT_ABI == ABI_DARWIN) && TARGET_32BIT"
Index: gcc/config/rs6000/predicates.md
===
--- gcc/config/rs6000/predicates.md (revision 257471)
+++ gcc/config/rs6000/predicates.md (working copy)
@@ -1295,13 +1295,12 @@
   rtx elt;
   int count = XVECLEN (op, 0);
 
-  if (count != 59)
+  if (count != 58)
 return 0;
 
   index = 0;
   if (GET_CODE (XVECEXP (op, 0, index++)) != RETURN
   || GET_CODE (XVECEXP (op, 0, index++)) != USE
-  || GET_CODE (XVECEXP (op, 0, index++)) != USE
   || GET_CODE (XVECEXP (op, 0, index++)) != CLOBBER)
 return 0;

Re: [PATCH, rs6000] Update vsx-vector-6-le.c tests for p9 target

2018-02-07 Thread Segher Boessenkool

On Wed, Feb 07, 2018 at 05:23:31PM -0600, Will Schmidt wrote:
> > >  /* { dg-do compile { target { powerpc64le-*-* && lp64 } } } */
> > >  /* { dg-skip-if "" { powerpc*-*-darwin* } } */
> > >  /* { dg-require-effective-target powerpc_vsx_ok } */
> > > -/* { dg-options "-mvsx -O2" } */
> > > +/* { dg-options "-mvsx -O2 -mcpu=power8" } */
> > 
> > Why not -mcpu=power7?  And you'll need
> 
> My default answer on that one is "power7 + le = nope". :-)
> I can be convinced otherwise if there are strong feelings, etc. 

No really strong feelings no.  But previously we ran this test on power7
(but the powerpc64le-* prevents that in reality) and now we don't anymore.


Segher

Re: [PATCH], Document the PowerPC --with-long-double-format={ibm,ieee} configuration options

2018-02-07 Thread Segher Boessenkool

On Wed, Feb 07, 2018 at 05:48:26PM -0500, Michael Meissner wrote:
> This patch udpates the GCC installation documentation to document the
> configuration options to set the long double type on the PowerPC: I verified
> that the documentation builds.  Can I install this on to the trunk?

Yes please, thanks!  A few minor issues:

> 2018-02-07  Michael Meissner  
> 
>   * doc/install.texi (Configuration): Document the
>   --with-long-double-format={ibm,eee} PowerPC configuration
>   options.

Typo ("ieee").

> +@item --with-long-double-format=ibm
> +@itemx --with-long-double-format-ieee

Typo ("=ieee").

> +systems.  If you are building multilibs, you will need to configure
> +the compiler using the @option{--with-system-zlib} option.

Why is that?  Is that a bug?  On BE we build multilibs all the time,
and we do not need that flag.


Segher

Re: [PATCH, rs6000] Update vsx-vector-6-le.c tests for p9 target

2018-02-07 Thread Will Schmidt

On Wed, 2018-02-07 at 12:28 -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Feb 07, 2018 at 11:16:12AM -0600, Will Schmidt wrote:
> > Noted during review of test results on P9.   Due to changes and 
> > improvements,
> > our codegen is different for this test on power9.
> > Modified the existing test to target P8, and added a P9 variant with updated
> > counts.
> 
> > diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.c 
> > b/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.c
> > index ddb0089..7fe691b 100644
> > --- a/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.c
> > @@ -1,11 +1,11 @@
> >  /* { dg-do compile { target { powerpc64le-*-* && lp64 } } } */
> >  /* { dg-skip-if "" { powerpc*-*-darwin* } } */
> >  /* { dg-require-effective-target powerpc_vsx_ok } */
> > -/* { dg-options "-mvsx -O2" } */
> > +/* { dg-options "-mvsx -O2 -mcpu=power8" } */
> 
> Why not -mcpu=power7?  And you'll need

My default answer on that one is "power7 + le = nope". :-)
I can be convinced otherwise if there are strong feelings, etc. 

> /* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
> "-mcpu=power7" } } */
> 
> You could also do instead
> 
> /* { dg-skip-if "this is not for p9" { powerpc_p9vector_ok } } */
> 
> or something like that; a bit neater.

ok.  I'll poke at this a bit, and will post (v2) for this once it's
ready.

> 
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.p9.c
> > @@ -0,0 +1,32 @@
> > +/* { dg-do compile { target { powerpc64le-*-* && lp64 } } } */
> > +/* { dg-skip-if "" { powerpc*-*-darwin* } } */
> > +/* { dg-require-effective-target powerpc_p9vector_ok } */
> > +/* { dg-options "-mvsx -O2 -mcpu=power9" } */
> 
> This needs the "do not override -mcpu" thing as well.

ok.

Thanks,
-Will

> 
> Segher
>

Re: [PATCH, rs6000] fix vsxcopy test for power9

2018-02-07 Thread Will Schmidt

On Wed, 2018-02-07 at 12:02 -0600, Segher Boessenkool wrote:
> Hi,
> 
> On Wed, Feb 07, 2018 at 10:59:46AM -0600, Will Schmidt wrote:
> > Noted during review of test results on P9.  The vsxcopy.c test is looking
> > for lxvd2x, stxvd2x instructions in generated code.  For P9 targets, we will
> > instead generate lxv, stxv instructions.
> > Thus, update the test to handle that codegen as well.
> > Sniff-tested on P9.
> > OK for trunk?
> 
> There are many insns that start with "lxv" (or "stxv"); you may want to
> use \m and \M, like:
> 
> > -/* { dg-final { scan-assembler "lxvd2x" } } */
> > -/* { dg-final { scan-assembler "stxvd2x" } } */
> > +/* { dg-final { scan-assembler "lxvd2x|lxv" } } */
> > +/* { dg-final { scan-assembler "stxvd2x|stxv" } } */
> 
> /* { dg-final { scan-assembler {\m(lxvd2x|lxv)\M} } } */
> /* { dg-final { scan-assembler {\m(stxvd2x|stxv)\M} } } */
> 
> ("lxv" without anything more will already also match "lxvd2x", etc.)

> Okay with such a change (if it works ;-) )

updated sniff-test suggests to me that it does.  :-) 
I've used the \m\M bits before, but never surrounding an (a|b)
construct.  something learned.  :-)

Thanks,
-Will



> 
> 
> Segher
>

[PATCH], Document the PowerPC --with-long-double-format={ibm,ieee} configuration options

2018-02-07 Thread Michael Meissner

This patch udpates the GCC installation documentation to document the
configuration options to set the long double type on the PowerPC: I verified
that the documentation builds.  Can I install this on to the trunk?

2018-02-07  Michael Meissner  

* doc/install.texi (Configuration): Document the
--with-long-double-format={ibm,eee} PowerPC configuration
options.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/doc/install.texi
===
--- gcc/doc/install.texi(revision 257455)
+++ gcc/doc/install.texi(working copy)
@@ -1877,6 +1877,35 @@ When neither of these configure options 
 128-bit @code{long double} when built against GNU C Library 2.4 and later,
 64-bit @code{long double} otherwise.
 
+@item --with-long-double-format=ibm
+@itemx --with-long-double-format-ieee
+Specify whether @code{long double} uses the IBM extended double format
+or the IEEE 128-bit floating point format on PowerPC Linux systems.
+This configuration switch will only work on little endian PowerPC
+Linux systems and on big endian 64-bit systems where the default cpu
+is at least power7 (i.e. @option{--with-cpu=power7},
+@option{--with-cpu=power8}, or @option{--with-cpu=power9} is used).
+
+If you use the @option{--with-long-double-64} configuration option,
+the @option{--with-long-double-format=ibm} and
+@option{--with-long-double-format=ieee} options are ignored.
+
+The default @code{long double} format is to use IBM extended double.
+Until all of the libraries are converted to use IEEE 128-bit floating
+point, it is not recommended to use
+@option{--with-long-double-format=ieee}.
+
+On little endian PowerPC Linux systems, if you explicitly set the
+@code{long double} type, it will build multilibs to allow you to
+select either @code{long double} format, unless you disable multilibs
+with the @code{--disable-multilib} option.  At present,
+@code{long double} multilibs are not built on big endian PowerPC Linux
+systems.  If you are building multilibs, you will need to configure
+the compiler using the @option{--with-system-zlib} option.
+
+If you do not set the @code{long double} type explicitly, no multilibs
+will be generated.
+
 @item --enable-fdpic
 On SH Linux systems, generate ELF FDPIC code.

Re: C++ PATCH to fix ICE with vector expr folding (PR c++/83659)

2018-02-07 Thread Jakub Jelinek

On Wed, Feb 07, 2018 at 03:52:39PM -0500, Jason Merrill wrote:
> > E.g. the constexpr function uses same_type_ignoring_top_level_qualifiers_p
> > instead of == type comparisons, the COMPONENT_REF stuff, ...
> 
> > For poly_* stuff, I think Richard S. wants to introduce it into the FEs at
> > some point, but I could be wrong; certainly it hasn't been done yet and
> > generally, poly*int seems to be a nightmare to deal with.
> 
> Yes, I understand how we got to this point, but having the functions
> diverge because of this guideline seems like a mistake.  And there
> seem to be two ways to avoid the divergence: make an exception to the
> guideline, or move the function.

Functionally, I think the following patch should turn fold_indirect_ref_1
to be equivalent to the patched constexpr.c version (with the known
documented differences), so if this is the obstackle for the acceptance
of the patch, I can test this.

Otherwise, I must say I have no idea how to share the code,
same_type_ignoring_qualifiers is only a C++ FE function, so the middle-end
can't use it even conditionally, and similarly with the TBAA issues.

--- gcc/fold-const.c.jj 2018-01-26 18:41:34.316410713 +0100
+++ gcc/fold-const.c2018-02-07 22:10:54.958628078 +0100
@@ -14172,7 +14172,16 @@ fold_indirect_ref_1 (location_t loc, tre
 
  /* ((foo*))[1] => BIT_FIELD_REF */
  if (TREE_CODE (op00type) == VECTOR_TYPE
- && type == TREE_TYPE (op00type))
+ && type == TREE_TYPE (op00type)
+ /* POINTER_PLUS_EXPR second operand is sizetype, unsigned,
+but we want to treat offsets with MSB set as negative.
+For the code below negative offsets are invalid and
+TYPE_SIZE of the element is something unsigned, so
+check whether op01 fits into poly_int64, which implies
+it is from 0 to INTTYPE_MAXIMUM (HOST_WIDE_INT), and
+then just use poly_uint64 because we want to treat the
+value as unsigned.  */
+ && tree_fits_poly_int64_p (op01))
{
  tree part_width = TYPE_SIZE (type);
  poly_uint64 max_offset


Jakub

[PATCH] Fix var-tracking ICE introduced in poly_int conversion (PR debug/84252)

2018-02-07 Thread Jakub Jelinek

Hi!

As mentioned in the PR, the vt_get_decl_and_offset function verifies
incoming PARALLEL is usable for tracking, but if it fails, we retry
vt_get_decl_and_offset on DECL_RTL and there we check only that a memory
isn't larger than 16 bytes (to make sure it doesn't have more than 16
parts), but if DECL_RTL is a REG e.g. with OImode as in the testcase on
aarch64, we just go on.  For the OImode argument passed in V4SImode v0
and V4SImode v1 parts we are lucky and used to get away with it, as it only
had 2 parts (and only dwarf2out threw it away as we can't really do vector
debug info well right now), but generally if vt_get_decl_and_offset doesn't
check the PARALLEL incoming, it is better to punt right away, we don't know
if all PARALLEL operands are REGs, have REG_ATTRS, the same underlying decl
and usable offsets, which we all rely on and only vt_get_decl_and_offset
verifies.

Bootstrapped/regtested on x86_64-linux and i686-linux and Richard S. tested
it on aarch64-linux, ok for trunk?

2018-02-07  Jakub Jelinek  

PR debug/84252
* var-tracking.c (vt_add_function_parameter): Punt for non-onepart
PARALLEL incoming that failed vt_get_decl_and_offset check.

* gcc.target/aarch64/pr84252.c: New test.

--- gcc/var-tracking.c.jj   2018-01-04 00:43:15.007702432 +0100
+++ gcc/var-tracking.c  2018-02-07 12:47:09.735980882 +0100
@@ -9653,6 +9653,7 @@ vt_add_function_parameter (tree parm)
   poly_int64 offset;
   dataflow_set *out;
   decl_or_value dv;
+  bool incoming_ok = true;
 
   if (TREE_CODE (parm) != PARM_DECL)
 return;
@@ -9743,6 +9744,7 @@ vt_add_function_parameter (tree parm)
 
   if (!vt_get_decl_and_offset (incoming, , ))
 {
+  incoming_ok = false;
   if (MEM_P (incoming))
{
  /* This means argument is passed by invisible reference.  */
@@ -9861,6 +9863,10 @@ vt_add_function_parameter (tree parm)
 {
   int i;
 
+  /* The following code relies on vt_get_decl_and_offset returning true for
+incoming, which might not be always the case.  */
+  if (!incoming_ok)
+   return;
   for (i = 0; i < XVECLEN (incoming, 0); i++)
{
  rtx reg = XEXP (XVECEXP (incoming, 0, i), 0);
--- gcc/testsuite/gcc.target/aarch64/pr84252.c.jj   2018-02-07 
14:01:27.987436160 +0100
+++ gcc/testsuite/gcc.target/aarch64/pr84252.c  2018-02-07 13:59:47.736644299 
+0100
@@ -0,0 +1,10 @@
+/* PR debug/84252 */
+/* { dg-do compile } */
+/* { dg-options "-g -O2" } */
+
+struct S { __Int32x4_t b[2]; };
+
+void
+foo (struct S x)
+{
+}

Jakub

Re: [PATCH] PowerPC PR target/84154, fix floating point to small integer conversion regression

2018-02-07 Thread Segher Boessenkool

Hi Mike,

On Tue, Feb 06, 2018 at 04:34:08PM -0500, Michael Meissner wrote:
> Here is the patch reworked.  It bootstraps on both little/big endian power8,
> and all of the tests run.  Can I install this into trunk now, and into GCC 7
> after a soak period (along with the previous patch)?

> +;; If have ISA 3.0, QI/HImode values can go in both VSX registers and GPR

"If we have"?

> +  [(set (match_operand:QHSI 0 "memory_operand" "=Z")
> + (any_fix:QHSI (match_operand:SFDF 1 "gpc_reg_operand" "wa")))
> +   (clobber (match_scratch:SI 2 "=wa"))]
> +  "((mode == SImode && TARGET_P8_VECTOR)
> +|| (mode != SImode && TARGET_P9_VECTOR))"

This is the same as

  "(mode == SImode && TARGET_P8_VECTOR) || TARGET_P9_VECTOR"

which is a bit easier to understand I think.

Okay (for trunk as well as 7) with those trivialities improved.  Thanks!


Segher

[Committed] PR fortran/82994 -- Check for a NULL pointer

2018-02-07 Thread Steve Kargl

Committed as obvious.

2018-02-07  Steven G. Kargl  

PR fortran/82994
* match.c (gfc_match_deallocate): Check for NULL pointer.

2018-02-07  Steven G. Kargl  

PR fortran/82994
* gfortran.dg/deallocate_error_3.f90: New test.
* gfortran.dg/deallocate_error_4.f90: New test.

-- 
Steve
Index: gcc/fortran/match.c
===
--- gcc/fortran/match.c	(revision 257459)
+++ gcc/fortran/match.c	(working copy)
@@ -4632,8 +4632,8 @@ gfc_match_deallocate (void)
 	   && (tail->expr->ref->type == REF_COMPONENT
 	   || tail->expr->ref->type == REF_ARRAY));
   if (sym && sym->ts.type == BT_CLASS)
-	b2 = !(CLASS_DATA (sym)->attr.allocatable
-	   || CLASS_DATA (sym)->attr.class_pointer);
+	b2 = !(CLASS_DATA (sym) && (CLASS_DATA (sym)->attr.allocatable
+	   || CLASS_DATA (sym)->attr.class_pointer));
   else
 	b2 = sym && !(sym->attr.allocatable || sym->attr.pointer
 		  || sym->attr.proc_pointer);
Index: gcc/testsuite/gfortran.dg/deallocate_error_3.f90
===
--- gcc/testsuite/gfortran.dg/deallocate_error_3.f90	(nonexistent)
+++ gcc/testsuite/gfortran.dg/deallocate_error_3.f90	(working copy)
@@ -0,0 +1,9 @@
+! { dg-do compile }
+! PR fortran/82994
+! Code contributed by Gerhard Steinmetz
+program p
+   type t
+   end type
+   class(t) :: x  ! { dg-error "must be dummy, allocatable or pointer" }
+   deallocate (x) ! { dg-error "not a nonprocedure pointer nor an allocatable" }
+end
Index: gcc/testsuite/gfortran.dg/deallocate_error_4.f90
===
--- gcc/testsuite/gfortran.dg/deallocate_error_4.f90	(nonexistent)
+++ gcc/testsuite/gfortran.dg/deallocate_error_4.f90	(working copy)
@@ -0,0 +1,10 @@
+! { dg-do compile }
+! PR fortran/82994
+! Code contributed by Gerhard Steinmetz
+program p
+   type t
+   end type
+   class(t) :: x  ! { dg-error "must be dummy, allocatable or pointer" }
+   allocate (x)   ! { dg-error "neither a data pointer nor an allocatable" }
+   deallocate (x) ! { dg-error "not a nonprocedure pointer nor an allocatable" }
+end

New Spanish PO file for 'gcc' (version 8.1-b20180128)

2018-02-07 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Spanish team of translators.  The file is available at:

http://translationproject.org/latest/gcc/es.po

(This file, 'gcc-8.1-b20180128.es.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

Re: [PATCH] libgcc: xtensa: fix build with -mtext-section-literals

2018-02-07 Thread Max Filippov

On Thu, Feb 1, 2018 at 9:12 AM, Max Filippov  wrote:
> On Wed, Jan 31, 2018 at 11:17 AM, Max Filippov  wrote:
>> On Wed, Jan 31, 2018 at 11:11 AM, augustine.sterl...@gmail.com 
>>  wrote:
>>> On Tue, Jan 30, 2018 at 8:02 PM, Max Filippov  wrote:

 libgcc/
 2018-01-31  Max Filippov  

 * config/xtensa/ieee754-df.S (__adddf3_aux): Add
 .literal_position directive.
 * config/xtensa/ieee754-sf.S (__addsf3_aux): Likewise.
>>>
>>> This is fine, but when did it stop working, and why isn't there a test
>>> that would catch it?
>>
>> I broke it with the recent softfloat NaN fix. Haven't noticed that because
>> I usually build without any flags or with -mauto-litpools.
>> Let me add the test, thanks for the suggestion.
>
> I've taken a look at gcc testsuites and I don't see how to add a test that 
> would
> check that libgcc builds with different compiler options. Any suggestions?

If there are no suggestions I will check in the fix as is and will add building
with -mtext-section-literals to my testing scripts.

-- 
Thanks.
-- Max

libgo patch committed: Don't call funcPC from a function

2018-02-07 Thread Ian Lance Taylor

This patch changes the libgo runtime package to not call funcPC from a
function.  The escape analysis support is not yet good enough to avoid
escaping the argument to funcPC.  This causes unnecessary and often
harmful memory allocation.  E.g., (*cpuProfile).addExtra can be called
from a signal handler, and it must not allocate memory.  This patch
moves the calls to funcPC to use variables instead.  This was done in
the original migration to using funcPC, but was not done for newer
code.  In one case, in signal handling code, use getSigtramp.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 257436)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-5fe998e4a18cc1dbbd4869be5c8202bda55adb33
+cdc28627b7abfd73f5d552813db8eb4293b823b0
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/runtime/cpuprof.go
===
--- libgo/go/runtime/cpuprof.go (revision 257415)
+++ libgo/go/runtime/cpuprof.go (working copy)
@@ -156,8 +156,8 @@ func (p *cpuProfile) addExtra() {
if p.lostExtra > 0 {
hdr := [1]uint64{p.lostExtra}
lostStk := [2]uintptr{
-   funcPC(_LostExternalCode) + sys.PCQuantum,
-   funcPC(_ExternalCode) + sys.PCQuantum,
+   _LostExternalCodePC + sys.PCQuantum,
+   _ExternalCodePC + sys.PCQuantum,
}
cpuprof.log.write(nil, 0, hdr[:], lostStk[:])
p.lostExtra = 0
@@ -167,8 +167,8 @@ func (p *cpuProfile) addExtra() {
 func (p *cpuProfile) addLostAtomic64(count uint64) {
hdr := [1]uint64{count}
lostStk := [2]uintptr{
-   funcPC(_LostSIGPROFDuringAtomic64) + sys.PCQuantum,
-   funcPC(_System) + sys.PCQuantum,
+   _LostSIGPROFDuringAtomic64PC + sys.PCQuantum,
+   _SystemPC + sys.PCQuantum,
}
cpuprof.log.write(nil, 0, hdr[:], lostStk[:])
 }
Index: libgo/go/runtime/proc.go
===
--- libgo/go/runtime/proc.go(revision 257415)
+++ libgo/go/runtime/proc.go(working copy)
@@ -3369,7 +3369,9 @@ var lostAtomic64Count uint64
 
 var _SystemPC = funcPC(_System)
 var _ExternalCodePC = funcPC(_ExternalCode)
+var _LostExternalCodePC = funcPC(_LostExternalCode)
 var _GCPC = funcPC(_GC)
+var _LostSIGPROFDuringAtomic64PC = funcPC(_LostSIGPROFDuringAtomic64)
 
 // Called if we receive a SIGPROF signal.
 // Called by the signal handler, may run during STW.
@@ -3469,7 +3471,7 @@ func sigprofNonGoPC(pc uintptr) {
if prof.hz != 0 {
stk := []uintptr{
pc,
-   funcPC(_ExternalCode) + sys.PCQuantum,
+   _ExternalCodePC + sys.PCQuantum,
}
cpuprof.addNonGo(stk)
}
Index: libgo/go/runtime/signal_unix.go
===
--- libgo/go/runtime/signal_unix.go (revision 257415)
+++ libgo/go/runtime/signal_unix.go (working copy)
@@ -245,7 +245,7 @@ func setProcessCPUProfiler(hz int32) {
// Enable the Go signal handler if not enabled.
if atomic.Cas([_SIGPROF], 0, 1) {
atomic.Storeuintptr([_SIGPROF], getsig(_SIGPROF))
-   setsig(_SIGPROF, funcPC(sighandler))
+   setsig(_SIGPROF, getSigtramp())
}
} else {
// If the Go signal handler should be disabled by default,

Re: C++ PATCH to fix ICE with vector expr folding (PR c++/83659)

2018-02-07 Thread Jakub Jelinek

On Wed, Feb 07, 2018 at 08:36:31PM +0100, Marek Polacek wrote:
> > > That was my first patch, but it was rejected:
> > > https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00271.html
> > 
> > Then should we update fold_indirect_ref_1 to use the new code?  Is
> > there a reason for them to stay out of sync?
> 
> One of the reasons is that middle end uses poly_uint64 type but the front ends
> shouldn't use them.  So some of these functions will unfortunately differ.

Yeah.  Part of the patch makes the two implementations slightly more
similar, but I have e.g. no idea how to test for poly_uint64 that fits
also in poly_int64 and the poly_int* stuff makes the two substantially
different in any case.

Jakub

Re: [C++ Patch/RFC] PR 83204 ("[6/7/8 Regression] c++ -std=c++14 ICE in maybe_undo_parenthesized_ref, at cp/semantics.c:1694")

2018-02-07 Thread Jason Merrill

OK.

On Wed, Feb 7, 2018 at 4:16 PM, Paolo Carlini  wrote:
> Hi,
>
> this is essentially an RFC, I'm still fully analyzing the issue, but since I
> already have something prima facie reasonable and passing the testsuite I
> decided to send immediately out what I have, looking for further feedback /
> help. As fully analyzed by Jakub in the audit trail, for the testcase - it's
> important detail being the parentheses around x in the for(int y = 0, yend =
> (x); y < yend; ++y) part of the lambda body - we ICE in
> maybe_undo_parenthesized_ref when we hit the assert:
>
>   gcc_assert (TREE_CODE (t) == ADDR_EXPR
>   || TREE_CODE (t) == STATIC_CAST_EXPR);
>
> because in tsubst_copy_and_build for INDIRECT_REF we do:
>
> -if (TREE_CODE (r) == INDIRECT_REF)
> -  REF_PARENTHESIZED_P (r) = REF_PARENTHESIZED_P (t);
> +if (REF_PARENTHESIZED_P (t))
> +  r = force_paren_expr (r);
>
> irrespective of the details of r beyond it being an INDIRECT_REF. Today
> something I immediately noticed, beyond all the ideas provided by Jakub in
> the trail, is that in all the other places in pt.c, besides maybe one case
> in tsubst_qualified_id, where we adjust REF_PARENTHESIZED_P either we use
> indeed force_paren_expr which is guaranteed to produce something consistent
> with maybe_undo_parenthesized_ref, or we simply do things like:
>
> if (TREE_CODE (r) == COMPONENT_REF)
>   REF_PARENTHESIZED_P (r) = REF_PARENTHESIZED_P (t);
>
> where in fact force_paren_expr would simply handle a COMPONENT_REF in the
> same way. Thus the idea of the draft patch. Jason, if you see something
> deeper going on which you would rather prefer to handle yourself, please let
> me know and I'll adjust the bug owner appropriately.
>
> Thanks! Paolo.
>
> 
>

Re: C++ PATCH to fix ICE with vector expr folding (PR c++/83659)

2018-02-07 Thread Jason Merrill

On Wed, Feb 7, 2018 at 4:14 PM, Jakub Jelinek  wrote:
> On Wed, Feb 07, 2018 at 03:52:39PM -0500, Jason Merrill wrote:
>> > E.g. the constexpr function uses same_type_ignoring_top_level_qualifiers_p
>> > instead of == type comparisons, the COMPONENT_REF stuff, ...
>>
>> > For poly_* stuff, I think Richard S. wants to introduce it into the FEs at
>> > some point, but I could be wrong; certainly it hasn't been done yet and
>> > generally, poly*int seems to be a nightmare to deal with.
>>
>> Yes, I understand how we got to this point, but having the functions
>> diverge because of this guideline seems like a mistake.  And there
>> seem to be two ways to avoid the divergence: make an exception to the
>> guideline, or move the function.
>
> Functionally, I think the following patch should turn fold_indirect_ref_1
> to be equivalent to the patched constexpr.c version (with the known
> documented differences), so if this is the obstackle for the acceptance
> of the patch, I can test this.
>
> Otherwise, I must say I have no idea how to share the code,
> same_type_ignoring_qualifiers is only a C++ FE function, so the middle-end
> can't use it even conditionally, and similarly with the TBAA issues.

Again, can we make an exception and use poly_int in this function
because it's mirroring a middle-end function?

Jason

[C++ Patch/RFC] PR 83204 ("[6/7/8 Regression] c++ -std=c++14 ICE in maybe_undo_parenthesized_ref, at cp/semantics.c:1694")

2018-02-07 Thread Paolo Carlini


Hi,

this is essentially an RFC, I'm still fully analyzing the issue, but 
since I already have something prima facie reasonable and passing the 
testsuite I decided to send immediately out what I have, looking for 
further feedback / help. As fully analyzed by Jakub in the audit trail, 
for the testcase - it's important detail being the parentheses around x 
in the for(int y = 0, yend = (x); y < yend; ++y) part of the lambda body 
- we ICE in maybe_undo_parenthesized_ref when we hit the assert:


  gcc_assert (TREE_CODE (t) == ADDR_EXPR
          || TREE_CODE (t) == STATIC_CAST_EXPR);

because in tsubst_copy_and_build for INDIRECT_REF we do:

-    if (TREE_CODE (r) == INDIRECT_REF)
-      REF_PARENTHESIZED_P (r) = REF_PARENTHESIZED_P (t);
+    if (REF_PARENTHESIZED_P (t))
+      r = force_paren_expr (r);

irrespective of the details of r beyond it being an INDIRECT_REF. Today 
something I immediately noticed, beyond all the ideas provided by Jakub 
in the trail, is that in all the other places in pt.c, besides maybe one 
case in tsubst_qualified_id, where we adjust REF_PARENTHESIZED_P either 
we use indeed force_paren_expr which is guaranteed to produce something 
consistent with maybe_undo_parenthesized_ref, or we simply do things like:


        if (TREE_CODE (r) == COMPONENT_REF)
      REF_PARENTHESIZED_P (r) = REF_PARENTHESIZED_P (t);

where in fact force_paren_expr would simply handle a COMPONENT_REF in 
the same way. Thus the idea of the draft patch. Jason, if you see 
something deeper going on which you would rather prefer to handle 
yourself, please let me know and I'll adjust the bug owner appropriately.


Thanks! Paolo.



Index: cp/pt.c
===
--- cp/pt.c (revision 257458)
+++ cp/pt.c (working copy)
@@ -17224,8 +17224,8 @@ tsubst_copy_and_build (tree t,
  r = build_x_indirect_ref (input_location, r, RO_UNARY_STAR,
complain|decltype_flag);
 
-   if (TREE_CODE (r) == INDIRECT_REF)
- REF_PARENTHESIZED_P (r) = REF_PARENTHESIZED_P (t);
+   if (REF_PARENTHESIZED_P (t))
+ r = force_paren_expr (r);
 
RETURN (r);
   }
Index: testsuite/g++.dg/cpp0x/lambda/lambda-ice25.C
===
--- testsuite/g++.dg/cpp0x/lambda/lambda-ice25.C(nonexistent)
+++ testsuite/g++.dg/cpp0x/lambda/lambda-ice25.C(working copy)
@@ -0,0 +1,27 @@
+// PR c++/83204
+// { dg-do compile { target c++11 } }
+
+int rand();
+
+template
+struct s
+{
+int count() { return rand(); }
+};
+
+template
+void f(s a)
+{
+int const x = a.count();
+int r = 0;
+auto l = [&](int& r)
+{
+for(int y = 0, yend = (x); y < yend; ++y)
+{
+r += y;
+}
+};
+l(r);
+}
+
+template void f(s);

Re: [PATCH] PR fortran/82049 -- resolve a charlen if possible

2018-02-07 Thread Steve Kargl

Thanks.  A version of the patch has been commit to 6-branch,
7-branch, and trunk.  One regression down, many more to go.

-- 
steve

On Wed, Feb 07, 2018 at 08:10:41AM +, Paul Richard Thomas wrote:
> Hi Steve,
> 
> That's OK for trunk and, if you are possessed of the intestinal
> fortitude, 6- and 7-branches.
> 
> Thanks
> 
> Paul
> 
> 
> On 7 February 2018 at 02:17, Steve Kargl
>  wrote:
> > The attached patch fixes PR fortran/82049.  Prior to this
> > patch, gfortran was fouling up the resolution of the charlen
> > expression in the testcase.  The immediately tries to resolve
> > the length while parsing the type-spec.
> >
> > While here, I've introduced an optimization that causes
> > gfc_match_type_spec() to return early if the gfortran isn't
> > going to be getting a type-spec.  This is done be peeking
> > at the next character, if it is in [a-z], the we don't have
> > a type spec.  OK to commit?
> >
> >
> > 2018-02-06  Steven G. Kargl  
> >
> > PR fortran/82049
> > * match.c (gfc_match_type_spec): If the charlen is non-NULL, then
> > try to resolve it.  While here return early if possible.
> >
> > 2018-02-06  Steven G. Kargl  
> >
> > PR fortran/82049
> > * gfortran.dg/assumed_charlen_parameter.f90: New test.
> >
> >
> > --
> > Steve
> 
> 
> 
> -- 
> "If you can't explain it simply, you don't understand it well enough"
> - Albert Einstein

-- 
Steve
20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4
20161221 https://www.youtube.com/watch?v=IbCHE-hONow

Re: C++ PATCH to fix ICE with vector expr folding (PR c++/83659)

2018-02-07 Thread Jason Merrill

On Wed, Feb 7, 2018 at 3:32 PM, Jakub Jelinek  wrote:
> On Wed, Feb 07, 2018 at 03:23:25PM -0500, Jason Merrill wrote:
>> On Wed, Feb 7, 2018 at 2:48 PM, Jakub Jelinek  wrote:
>> > On Wed, Feb 07, 2018 at 08:36:31PM +0100, Marek Polacek wrote:
>> >> > > That was my first patch, but it was rejected:
>> >> > > https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00271.html
>> >> >
>> >> > Then should we update fold_indirect_ref_1 to use the new code?  Is
>> >> > there a reason for them to stay out of sync?
>> >>
>> >> One of the reasons is that middle end uses poly_uint64 type but the front 
>> >> ends
>> >> shouldn't use them.  So some of these functions will unfortunately differ.
>> >
>> > Yeah.  Part of the patch makes the two implementations slightly more
>> > similar, but I have e.g. no idea how to test for poly_uint64 that fits
>> > also in poly_int64 and the poly_int* stuff makes the two substantially
>> > different in any case.
>>
>> Hmm.  Well, that seems rather unfortunate.  Why shouldn't the front
>> ends use them?  Can we make an exception for this function because
>> it's supposed to mirror a middle-end function?
>> Should we try to push this function back into the middle end?
>
> The function comment seems to explain the reasons:
> /* A less strict version of fold_indirect_ref_1, which requires cv-quals to
>match.  We want to be less strict for simple *& folding; if we have a
>non-const temporary that we access through a const pointer, that should
>work.  We handle this here rather than change fold_indirect_ref_1
>because we're dealing with things like ADDR_EXPR of INTEGER_CST which
>don't really make sense outside of constant expression evaluation.  Also
>we want to allow folding to COMPONENT_REF, which could cause trouble
>with TBAA in fold_indirect_ref_1.
>
>Try to keep this function synced with fold_indirect_ref_1.  */
>
> E.g. the constexpr function uses same_type_ignoring_top_level_qualifiers_p
> instead of == type comparisons, the COMPONENT_REF stuff, ...

> For poly_* stuff, I think Richard S. wants to introduce it into the FEs at
> some point, but I could be wrong; certainly it hasn't been done yet and
> generally, poly*int seems to be a nightmare to deal with.

Yes, I understand how we got to this point, but having the functions
diverge because of this guideline seems like a mistake.  And there
seem to be two ways to avoid the divergence: make an exception to the
guideline, or move the function.

Jason

Re: [patch, fortran] Fix PR 68560

2018-02-07 Thread Steve Kargl

On Wed, Feb 07, 2018 at 09:42:04PM +0100, Thomas Koenig wrote:
> Here's an update on the patch - I realized that it is not necessary
> to check for the actual argument, it is always present.
> 
> OK for trunk?
> 

Yes.

-- 
Steve

Re: C++ PATCH to fix ICE with vector expr folding (PR c++/83659)

2018-02-07 Thread Jakub Jelinek

On Wed, Feb 07, 2018 at 03:23:25PM -0500, Jason Merrill wrote:
> On Wed, Feb 7, 2018 at 2:48 PM, Jakub Jelinek  wrote:
> > On Wed, Feb 07, 2018 at 08:36:31PM +0100, Marek Polacek wrote:
> >> > > That was my first patch, but it was rejected:
> >> > > https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00271.html
> >> >
> >> > Then should we update fold_indirect_ref_1 to use the new code?  Is
> >> > there a reason for them to stay out of sync?
> >>
> >> One of the reasons is that middle end uses poly_uint64 type but the front 
> >> ends
> >> shouldn't use them.  So some of these functions will unfortunately differ.
> >
> > Yeah.  Part of the patch makes the two implementations slightly more
> > similar, but I have e.g. no idea how to test for poly_uint64 that fits
> > also in poly_int64 and the poly_int* stuff makes the two substantially
> > different in any case.
> 
> Hmm.  Well, that seems rather unfortunate.  Why shouldn't the front
> ends use them?  Can we make an exception for this function because
> it's supposed to mirror a middle-end function?
> Should we try to push this function back into the middle end?

The function comment seems to explain the reasons:
/* A less strict version of fold_indirect_ref_1, which requires cv-quals to
   match.  We want to be less strict for simple *& folding; if we have a
   non-const temporary that we access through a const pointer, that should
   work.  We handle this here rather than change fold_indirect_ref_1
   because we're dealing with things like ADDR_EXPR of INTEGER_CST which
   don't really make sense outside of constant expression evaluation.  Also
   we want to allow folding to COMPONENT_REF, which could cause trouble
   with TBAA in fold_indirect_ref_1.

   Try to keep this function synced with fold_indirect_ref_1.  */

E.g. the constexpr function uses same_type_ignoring_top_level_qualifiers_p
instead of == type comparisons, the COMPONENT_REF stuff, ...

For poly_* stuff, I think Richard S. wants to introduce it into the FEs at
some point, but I could be wrong; certainly it hasn't been done yet and
generally, poly*int seems to be a nightmare to deal with.

Jakub

Re: [patch, fortran] Fix PR 68560

2018-02-07 Thread Thomas Koenig


Here's an update on the patch - I realized that it is not necessary
to check for the actual argument, it is always present.

OK for trunk?

Regards

Thomas



2018-02-01  Thomas Koenig  

 PR fortran/68560
 * trans-intrinsic.c (gfc_conv_intrinsic_shape): New function.
 (gfc_conv_intrinsic_function): Call it.

2018-02-01  Thomas Koenig  

 PR fortran/68560
 * gfortran.dg/shape_9.f90: New test.


Index: trans-intrinsic.c
===
--- trans-intrinsic.c	(Revision 257347)
+++ trans-intrinsic.c	(Arbeitskopie)
@@ -5593,6 +5593,22 @@ gfc_conv_intrinsic_ibits (gfc_se * se, gfc_expr *
 }
 
 static void
+gfc_conv_intrinsic_shape (gfc_se *se, gfc_expr *expr)
+{
+  gfc_actual_arglist *s, *k;
+  gfc_expr *e;
+
+  /* Remove the KIND argument, if present. */
+  s = expr->value.function.actual;
+  k = s->next;
+  e = k->expr;
+  gfc_free_expr (e);
+  k->expr = NULL;
+
+  gfc_conv_intrinsic_funcall (se, expr);
+}
+
+static void
 gfc_conv_intrinsic_shift (gfc_se * se, gfc_expr * expr, bool right_shift,
 			  bool arithmetic)
 {
@@ -8718,6 +8734,10 @@ gfc_conv_intrinsic_function (gfc_se * se, gfc_expr
 	  gfc_conv_intrinsic_minmaxloc (se, expr, GT_EXPR);
 	  break;
 
+	case GFC_ISYM_SHAPE:
+	  gfc_conv_intrinsic_shape (se, expr);
+	  break;
+
 	default:
 	  gfc_conv_intrinsic_funcall (se, expr);
 	  break;
! { dg-do  run }
! { dg-require-effective-target lto }
! { dg-options "-flto" }
! Check that there are no warnings with LTO for a KIND argument.
!
program test
   implicit none
   real, allocatable :: x(:,:)

   allocate(x(2,5))
   if (any(shape(x) /= [ 2, 5 ])) call abort
   if (any(shape(x,kind=1) /= [ 2, 5 ])) call abort
   if (any(shape(x,kind=2) /= [ 2, 5 ])) call abort
   if (any(shape(x,kind=4) /= [ 2, 5 ])) call abort
   if (any(shape(x,kind=8) /= [ 2, 5 ])) call abort
 end program test

Re: C++ PATCH to fix ICE with vector expr folding (PR c++/83659)

2018-02-07 Thread Jason Merrill

On Wed, Feb 7, 2018 at 2:48 PM, Jakub Jelinek  wrote:
> On Wed, Feb 07, 2018 at 08:36:31PM +0100, Marek Polacek wrote:
>> > > That was my first patch, but it was rejected:
>> > > https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00271.html
>> >
>> > Then should we update fold_indirect_ref_1 to use the new code?  Is
>> > there a reason for them to stay out of sync?
>>
>> One of the reasons is that middle end uses poly_uint64 type but the front 
>> ends
>> shouldn't use them.  So some of these functions will unfortunately differ.
>
> Yeah.  Part of the patch makes the two implementations slightly more
> similar, but I have e.g. no idea how to test for poly_uint64 that fits
> also in poly_int64 and the poly_int* stuff makes the two substantially
> different in any case.

Hmm.  Well, that seems rather unfortunate.  Why shouldn't the front
ends use them?  Can we make an exception for this function because
it's supposed to mirror a middle-end function?
Should we try to push this function back into the middle end?

Jason

Re: [openacc, committed] Fix diff_type in expand_oacc_collapse_init

2018-02-07 Thread Tom de Vries


On 02/07/2018 08:37 PM, Rainer Orth wrote:

Hi Tom,


this patch fixes an 8 regression in an openacc testcase.

The regression was introduced by r250925, a fix for PR78266, a bug in the
handling of a loop with iteration variable type range smaller than the size
of the parallel dimension the loop is assigned to.

The fix for the regression is to apply the r250925 fix (in expand_oacc_for)
to expand_oacc_collapse_init as well.

Build and reg-tested on x86_64 with nvptx accelerator.

Committed to stage4 trunk.

[...]

* testsuite/libgomp.oacc-c-c++-common/pr84217.c: New test.


this test FAILs when compiled as C++:

+FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/pr84217.c 
-DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1  -O2  (test for excess errors)
+UNRESOLVED: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/pr84217.c -DACC_DEVIC
E_TYPE_host=1 -DACC_MEM_SHARED=1  -O2  compilation failed to produce executable

Excess errors:
Undefined   first referenced
  symbol in file
abort() /var/tmp//ccSsAnFc.o

(seen on i386-pc-solaris2.11 and sparc-sun-solaris2.11).

Fixed as follows, tested with the appropriate runtest invocations,
installed on mainline.

Rainer



Hi Rainer,

thanks for fixing this oversight of mine.

- Tom



libgomp-pr84217-c++.patch


# HG changeset patch
# Parent  e5c3e710215181503b228de2f9277e543392df2a
Fix libgomp.oacc-c-c++-common/pr84217.c for C++

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84217.c 
b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84217.c
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84217.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84217.c
@@ -1,4 +1,8 @@
-extern void abort (void);
+extern
+#ifdef __cplusplus
+"C"
+#endif
+void abort (void);
  
  #define N 10

Re: [RFC PATCH] avoid applying attributes to explicit specializations (PR 83871)

2018-02-07 Thread Jason Merrill


On 02/06/2018 11:01 PM, Martin Sebor wrote:

On 02/05/2018 02:52 PM, Jason Merrill wrote:

On 02/04/2018 07:07 PM, Martin Sebor wrote:

To resolve the underlying root cause of the P1 bug c++/83503
- bogus -Wattributes for const and pure on function template
specialization, that we discussed last week, I've taken a stab
at making the change to avoid applying primary template's
attributes to its explicit specializations.  (The bug tracking
the underlying root cause is 83871 - wrong code for attribute
const and pure on distinct template specializations).

The attached patch is what I have so far.  It's incomplete
and not all the tests pass for a couple of reasons:

1) it only handles function templates (not class templates),
    and I have no tests for those yet,


Class templates may already work the way you expect; at least aligned
does, though that doesn't involve TYPE_ATTRIBUTES.

Hmm, it seems that we currently don't propagate unused even to implicit
instantiations, a bug in the other direction:

template  struct [[gnu::unused]] A { };

int main()
{
  A a; // shouldn't warn
}


I opened bug 84221 to track it.  It's a regression WRT 4.7.

For types, it's not completely clear to me what should be
expected for attribute deprecated.  Not inheriting the
attribute means that users would be able to explicitly
specialize a deprecated primary template which is in most
cases contrary to the intent of the attribute.

On the other hand, inheriting it means that there would be
no good way to deprecate the primary without also deprecating
its explicit specializations (because declaring the explicit
specializations would trigger the warning).  The use case
for this was mentioned by Richard in the core discussion
(deprecating the std::numeric_limits primary).

I can't think of any way to make it work.  The only solution
that comes to mind is to use the name of the source file (or
header) in which the primary is defined and allow explicit
specializations to be defined in it while issuing the warning
for those defined in other files.  But this definitely seems
like GCC 9 material.


Yes, we definitely don't need to mess with this now.


2) it isn't effective for the nonnull and returns_nonnull
    attributes because they are treated as type attributes,
3) the change to deal with attributes on function arguments
    may be unnecessary (I couldn't come up with any that would
    be propagated from the primary -- are there some?).


Yes, I think this is unnecessary.


Okay, thanks for confirming that.


Before I proceed with it I first want to make sure that it should
be fixed for GCC 8,


Some of it, I think.  Probably not the whole issue.


that duplicate_decls is the right place for these changes


I think so.


and that (2) should be fixed by treating those
and other such attributes by applying them to function decls.


This seems out of bounds for GCC 8.  It would also mean that we couldn't
use such attributes on pointers to functions.


+  TREE_NOTHROW (newdecl) |= TREE_NOTHROW (olddecl);


TREE_NOTHROW is mostly a non-attribute property, so I'd leave it out of
this.


__attribute__ ((nothrow))?  The patch includes a test case with
wrong-code due to inheriting the attribute.  With exception
specifications having to match between the primary and its
specializations it's the only way to make them different.
I've left this unchanged but let me know if I'm missing
something.


Yeah, I think you're right.  But I notice that the existing code (and 
thus your patch) touches TREE_NOTHROW in two places, and the first one 
seems wrong; we only want it inside the FUNCTION_DECL section.



+  DECL_IS_MALLOC (olddecl) = DECL_IS_MALLOC (newdecl);


If a template is declared to be malloc, IMO we should really warn if a
specialization is missing that attribute, it seems certain to be a 
mistake.


I tend to agree that it's likely a mistake.  Though warning
in the front-end will lead to false positives if the function
isn't malloc.  Ideally, this would be detected in the middle-
end (where -Wsuggest-attribute=malloc is handled) but I suspect
it's too late for that.  I've added a simple warning for it.


In general, I think we should (optionally) warn if a template has
attributes and a specialization doesn't, as a user might have been
relying on the current behavior.


I've added a new option, -Wmissing-attribute.  In bug 81824
Joseph asked for such a warning for C (for function resolvers
and aliases) and so I'll use the same option for both (I expect
it's too late to handle 81824 in GCC 8 but I'll finish it in
GCC 9).  Adding the warning required passing some additional
attributes around and so more churn.


Rather than pass them down into register_specialization and 
duplicate_decls, check_explicit_specialization could compare the 
attribute list to the attributes on the template itself.



+  if (!merge_attr)
+    {
+  /* Remove the two function attributes that are, in fact,
+ treated as (quasi) type

Re: [openacc, committed] Fix diff_type in expand_oacc_collapse_init

2018-02-07 Thread Rainer Orth

Hi Tom,

> this patch fixes an 8 regression in an openacc testcase.
>
> The regression was introduced by r250925, a fix for PR78266, a bug in the
> handling of a loop with iteration variable type range smaller than the size
> of the parallel dimension the loop is assigned to.
>
> The fix for the regression is to apply the r250925 fix (in expand_oacc_for)
> to expand_oacc_collapse_init as well.
>
> Build and reg-tested on x86_64 with nvptx accelerator.
>
> Committed to stage4 trunk.
[...]
>   * testsuite/libgomp.oacc-c-c++-common/pr84217.c: New test.

this test FAILs when compiled as C++:

+FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/pr84217.c 
-DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1  -O2  (test for excess errors)
+UNRESOLVED: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/pr84217.c -DACC_DEVIC
E_TYPE_host=1 -DACC_MEM_SHARED=1  -O2  compilation failed to produce executable

Excess errors:
Undefined   first referenced
 symbol in file
abort() /var/tmp//ccSsAnFc.o

(seen on i386-pc-solaris2.11 and sparc-sun-solaris2.11).

Fixed as follows, tested with the appropriate runtest invocations,
installed on mainline.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2018-02-07  Rainer Orth  

* testsuite/libgomp.oacc-c-c++-common/pr84217.c (abort)
[__cplusplus]: Declare extern "C".

# HG changeset patch
# Parent  e5c3e710215181503b228de2f9277e543392df2a
Fix libgomp.oacc-c-c++-common/pr84217.c for C++

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84217.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84217.c
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84217.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84217.c
@@ -1,4 +1,8 @@
-extern void abort (void);
+extern
+#ifdef __cplusplus
+"C"
+#endif
+void abort (void);
 
 #define N 10

Re: C++ PATCH to fix ICE with vector expr folding (PR c++/83659)

2018-02-07 Thread Marek Polacek

On Wed, Feb 07, 2018 at 01:23:49PM -0500, Jason Merrill wrote:
> On Wed, Feb 7, 2018 at 12:54 PM, Marek Polacek  wrote:
> > On Wed, Feb 07, 2018 at 12:26:04PM -0500, Jason Merrill wrote:
> >> On Fri, Jan 26, 2018 at 6:22 PM, Jakub Jelinek  wrote:
> >> > On Fri, Jan 26, 2018 at 02:11:19PM +0100, Richard Biener wrote:
> >> >> >> POINTER_PLUS_EXPR offets are to be interpreted as signed (ptrdiff_t)
> >> >> >> so using uhwi and then performing an unsigned division is wrong code.
> >> >> >> See mem_ref_offset how to deal with this (ugh - poly-ints...).  
> >> >> >> Basically
> >> >> >> you have to force the thing to signed.  Like just use
> >> >> >>
> >> >> >>   HOST_WIDE_INT offset = TREE_INT_CST_LOW (op01);
> >> >> >
> >> >> > Does it really matter here though?  Any negative offsets there are 
> >> >> > UB, we
> >> >> > should punt on them rather than try to optimize them.
> >> >> > As we known op01 is unsigned, if we check that it fits into shwi_p, 
> >> >> > it means
> >> >> > it will be 0 to shwi max and then we can handle it in uhwi too.
> >> >>
> >> >> Ah, of course.  Didn't look up enough context to see what this code
> >> >> does in the end ;)
> >> >>
> >> >> >   /* ((foo*))[1] => BIT_FIELD_REF */
> >> >> >   if (VECTOR_TYPE_P (op00type)
> >> >> >   && (same_type_ignoring_top_level_qualifiers_p
> >> >> > -(type, TREE_TYPE (op00type
> >> >> > +(type, TREE_TYPE (op00type)))
> >> >> > + && tree_fits_shwi_p (op01))
> >> >>
> >> >> nevertheless this appearant "mismatch" deserves a comment (checking
> >> >> shwi and using uhwi).
> >> >
> >> > So like this?
> >> >
> >> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> >>
> >> Why not use the same code as fold_indirect_ref_1 here?
> >
> > That was my first patch, but it was rejected:
> > https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00271.html
> 
> Then should we update fold_indirect_ref_1 to use the new code?  Is
> there a reason for them to stay out of sync?

One of the reasons is that middle end uses poly_uint64 type but the front ends
shouldn't use them.  So some of these functions will unfortunately differ.

Marek

[PATCH] add -Wstringop-overflow to LTO options (PR 84212)

2018-02-07 Thread Martin Sebor


In PR 84212 the reporter asks why -Wno-stringop-overflow has
no effect during LTO linking.  It turns out that the reason
is the same as in bug 78768: the specification in the c.opt
file is missing LTO among the languages.

The attached patch adds LTO to it and to -Wstringop-truncation.

Bootstrapped and regtested on x86_64-linux.

There are other middle-end options in the c.opt file that do
not mention LTO that arguably should (*).  I didn't change
those in this patch, in part because I don't have test cases
showing where it matters, and in part because I don't think
that having to remember to include LTO in these options (and,
ideally, also include a test in the test suite for each) is
a good approach.

It seems that including LTO implicitly for all options would
obviate this manual step and eliminate the risk of missing
them.  Is there a reason not to do that?

If implicitly including LTO for every option is not feasible,
then it might be worthwhile to write a small script to verify
that every middle-end option does mention LTO, and have make
run the script as part of the self-test step.

Is there support for either of these changes?  If not, are
there any other ideas for how to avoid these kind of bugs?

Martin

[*] Here are a few examples.  I'm fine with adding LTO to
any or all of these as well as any others that I may have
missed for GCC 8 if that sounds like a good idea.

  -Walloc-size-larger-than
  -Warray-bounds
  -Wformat-truncation=
  -Wmaybe-uninitialized
  -Wnonnull
  -Wrestrict
  -Wstrict-overflow
  -Wsuggest-attribute
  -Wuninitialized
PR lto/84212 -  -Wno-* does not disable warnings from -flto link stage

gcc/c-family/ChangeLog:

	PR lto/84212
	* c.opt (-Wstringop-overflow): Add LTO.

gcc/testsuite/ChangeLog:

	PR lto/84212
	* gcc.dg/lto/pr84212_0.c: New test file.
	* gcc.dg/lto/pr84212_1.c: Same.

Index: gcc/c-family/c.opt
===
--- gcc/c-family/c.opt	(revision 257453)
+++ gcc/c-family/c.opt	(working copy)
@@ -739,17 +739,17 @@ C ObjC C++ ObjC++ Var(warn_sizeof_array_argument)
 Warn when sizeof is applied on a parameter declared as an array.
 
 Wstringop-overflow
-C ObjC C++ ObjC++ Warning Alias(Wstringop-overflow=, 2, 0)
+C ObjC C++ LTO ObjC++ Warning Alias(Wstringop-overflow=, 2, 0)
 Warn about buffer overflow in string manipulation functions like memcpy
 and strcpy.
 
 Wstringop-overflow=
-C ObjC C++ ObjC++ Joined RejectNegative UInteger Var(warn_stringop_overflow) Init(2) Warning LangEnabledBy(C ObjC C++ ObjC++, Wall, 2, 0) IntegerRange(0, 4)
+C ObjC C++ LTO ObjC++ Joined RejectNegative UInteger Var(warn_stringop_overflow) Init(2) Warning LangEnabledBy(C ObjC C++ ObjC++ LTO, Wall, 2, 0) IntegerRange(0, 4)
 Under the control of Object Size type, warn about buffer overflow in string
 manipulation functions like memcpy and strcpy.
 
 Wstringop-truncation
-C ObjC C++ ObjC++ Var(warn_stringop_truncation) Warning Init (1) LangEnabledBy(C ObjC C++ ObjC++, Wall)
+C ObjC C++ LTO ObjC++ Var(warn_stringop_truncation) Warning Init (1) LangEnabledBy(C ObjC C++ ObjC++ LTO, Wall)
 Warn about truncation in string manipulation functions like strncat and strncpy.
 
 Wsuggest-attribute=format
Index: gcc/testsuite/gcc.dg/lto/pr84212_0.c
===
--- gcc/testsuite/gcc.dg/lto/pr84212_0.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/lto/pr84212_0.c	(working copy)
@@ -0,0 +1,11 @@
+/* PR lto/84212 - -Wno-stringop-verflow does not disable warnings from
+   -flto link stage
+   { dg-lto-do link }
+   { dg-lto-options { { -O2 -Werror -Wno-stringop-overflow -flto } } }  */
+
+#include 
+
+void clear (char *p, unsigned n)
+{
+  memset (p, 0, n);
+}
Index: gcc/testsuite/gcc.dg/lto/pr84212_1.c
===
--- gcc/testsuite/gcc.dg/lto/pr84212_1.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/lto/pr84212_1.c	(working copy)
@@ -0,0 +1,11 @@
+/* PR lto/84212 - -Wno-stringop-verflow does not disable warnings from
+   -flto link stage  */
+
+extern void clear (char*, unsigned);
+
+int main (void)
+{
+  char x[3];
+
+  clear (x, 4);
+}

Re: [SFN+LVU+IEPM v4 7/9] [LVU] Introduce location views

2018-02-07 Thread Jason Merrill


On 02/06/2018 11:02 PM, Alexandre Oliva wrote:

On Feb  6, 2018, Jason Merrill  wrote:

On 12/11/2017 09:52 PM, Alexandre Oliva wrote:



Why do we need to use a non-zero view identifier for a zero view?  Why
can't we always use 0 instead of the bitmap?


We assign view ids before we can determine whether the assigned view id
will be a zero view.  That's because we scan insns forward, and debug
binds take effect at the next .loc directive, which might be hundreds of
insns after the first reference (lots of intervening debug binds before
the insn that will take the next view), and insns that would cause the > .loc 
directive to be at a different address from that of the previous
.loc might be anywhere between those two loc-generating insns, before or
after the bind.  So, by the time we have to assign an id to the view, we
don't know whether we'll find an insn that sets RESETTING_VIEW_P before
we reach the loc-emitting insn that will take that view id.

It made more sense to me to assign the ids uniformly, and then mark
those that we find to be zero when we reach them, than to scan forward
(potentially O(n^2)) to tell in advance.  This also reduces differences
in view id tracking between gcc-internal and asm view assignment.


Makes sense, thanks.


DW_LNS_fixed_advance_pc is the only opcode that may change the
address without resetting the view.  It is available for compilers
to use when an address change is uncertain, e.g., when advancing
past opcodes that may expand to an empty sequence,
e.g. possibly-empty alignment sequences, optional no-operation
opcodes or the like.



+  if (debug_variable_location_views && table->view)
+   push_dw_line_info_entry (table, LI_adv_address, label_num);



It looks like you'll always use DW_LNS_fixed_advance_pc when we don't
have .loc support.  Does that mean that the view never gets reset?


No, table->view will be zero if we have crossed an insn that
RESET_NEXT_VIEW, and then we'll use a LI_set_address.


I'm uncomfortable with the special handling of this opcode; it isn't
special in DWARF5 except as a fallback if more compact encodings
aren't usable.  Currently GCC is even more conservative than this, and
always use a relocation (DW_LNS_set_address); if we can use D_L_f_a_p
instead, I would expect that to help with link times.  It seems wrong
to use it only in the context of view support.


We didn't use it before, but if we use, that's ok.  We don't *have* to
reset the view number to zero at a PC change.  It would be all right to
never reset it, as long as the compiler and the debug info consumer
agrees about it.  Since in some cases the compiler has to assign views
itself (when the assembler doesn't support views), but it can't tell
whether there will be a PC change (e.g. at an align), we need some
mechanism that predictably refrains from resetting the view number,
otherwise the compiler might pick a view number based on a possibly
incorrect assumption about whether the align changes PC or not, and then
it might not match what the consumer, that knows whether the PC
advanced, would compute.  To make it concrete:

# ...
.L352:
# .loc 1 533 view 3 (internal compiler tracking, no assembler support)
mov r12 <- whatever
# DEBUG x => r12
.balign 32
.L353:
# .loc 1 480 view 

Consider you're the compiler and you're generating a view-augmented
loclist for variable x.  You have to indicate the view number at .L353
for the range that starts there, and possibly for the range that ends
there.

But is the view number 4 or 0?  Namely, does .balign advance PC or not?
The compiler can't possibly know.  So if it guesses .balign does advance
PC and assign a view number zero at .L353, but it guesses wrong, that
will be indistinguishable from view number zero at say .L350, because
the PC is the same.  The debug info consumer will see no PC change and
assign view number 4 to the line number table entry correspoding to the
.loc directive with view , but it won't find any loclist referencing
that view number.  Conversely, if the compiler guessed .balign did not
advance PC, it would assign view number 4 to .L353, but then, with your
suggestion, the debug info consumer would instead assign it view number
zero, and we'd be out of sync.

That's why we need an opcode that enables the compiler and the debug
info consumer to remain in sync, disregarding a potential PC change that
would cause a view reset, because the compiler can't predict whether or
not there will be an actual PC change.


Would it make sense to say for *all* opcodes that the view is reset if
and only if the address actually changes?


I don't see how to keep compiler and consumer in sync with this
arrangement.  That's why I introduced the one exceptional opcode.


OK, that makes sense.  But I'm still uncomfortable with choosing an 
existing opcode for that purpose, which previously would have been 
chosen just for reasons of encoding complexity and size.



How are we coordinating the line number

Re: RFA: Sanitize deprecation messages (PR 84195)

2018-02-07 Thread David Malcolm

On Wed, 2018-02-07 at 17:26 +, Nick Clifton wrote:
> Hi Martin, Hi David,
> 
> OK - attached is a new patch that:
> 
>   * Replaces control characters with their escape equivalents.
>   * Includes a testcase.
> 
> I was not sure what to do about the inconsistencies between the
> behaviour of #warning and #pragma GCC warning, (and the error
> equivalents), so I decided to leave it for now.  Fixing them
> will require mucking about in the C preprocessor library, which
> I did not fancy doing at the moment.
> 
> So, is this enhanced patch OK now ?
> 
> Cheers
>   Nick
> 
> gcc/ChangeLog
> 2018-02-07  Nick Clifton  
> 
>   PR 84195
>   * tree.c (warn_deprecated_use): Replace control characters in
>   deprecation messages with escape sequences.
> 
> gcc/testsuite/ChangeLog
> 2018-02-07  Nick Clifton  
> 
>   * gcc.c-torture/compile/pr84195.c: New test.

> Index: gcc/tree.c
> ===
> --- gcc/tree.c(revision 257451)
> +++ gcc/tree.c(working copy)
> @@ -12431,8 +12431,66 @@
>if (attr)
>  attr = lookup_attribute ("deprecated", attr);
>  
> +  char * new_msg = NULL;
> +  unsigned int new_i = 0;
> +
>if (attr)
> -msg = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr)));
> +{
> +  msg = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr)));
> +
> +  if (msg)
> + {
> +   /* PR 84195: Replace control characters in the message with their
> +  escaped equivalents.  Allow newlines if -fmessage-length has
> +  been set to a non-zero value.  

I'm not quite sure why we allow newlines in this case, sorry.

> This is done here, rather than
> +  where the attribute is recorded as the message length can
> +  change between these two locations.  */
> +
> +   /* FIXME: Do we need to check to see if msg is longer than 2^32 ?  */
> +   unsigned int i;

Please can you split this out into a subroutine, and please add selftests
for it, so that we can unit-test it directly.

Try grepping for e.g. tree_c_tests and other users of selftest.h to see
the idea.

This would be in addition to the DejaGnu-based test (I prefer a "belt and
braces" approach here).

Suggested unittest coverage:

- the empty string
- a non-empty string in which nothing is escaped
- a non-empty string in which everything is escaped
- all of the cases in the switch
- strings with and without newlines, since you're special-casing that

Ownership of the string seems fiddly, so given that gcc is implemented
in C++98, maybe instead of a subroutine, make it a class, with
responsibility for a string that we may or may not own; something like:

  class escaped_string
  {
  public:
 escaped_string (const char *unescaped);
 ~escaped_string () { if (m_owned) free (m_str); }
 operator const char *() const { return m_str; }
  private:
 char *m_str;
 bool m_owned;
  };

so you can (I hope) simply do:

  ASSERT_STREQ ("foo\\nbar", escaped_string ("foo\nbar"));

and similar for the various cases, without needing an explicit "free"
that can get lost if the function ever gains an early return.

(Or, better, have an escape_string function return an instance of a
class that has responsibility for the "if (ownership) free" dtor, but I
can't think of a good name for such a class right now).

Note that "make selftest-valgrind" will run the selftests under
valgrind (sadly, there are a few pre-existing leaks, even if you do
configure gcc with --enable-valgrind-annotations).

> +
> +   for (i = 0; i < strlen (msg); i++)

Just get strlen once, rather than each time through the loop?
"size_t unescaped_len", maybe?

> + {
> +   char c = msg[i];
> +
> +   if (! ISCNTRL (c))
> + {
> +   if (new_msg)
> + new_msg[new_i++] = c;
> +   continue;
> + }
> +
> +   if (c != '\n' || ! pp_is_wrapping_line (global_dc->printer))

Could a "bool needs_escaping_p (char)" subroutine clarify the logic
here?

> + {

Maybe a comment here noting this is the first char needing escaping?

> +   if (new_msg == NULL)
> + {
> +   new_msg = (char *) xmalloc (strlen (msg) * 2);

Off by one, I think - shouldn't this be
  (strlen (msg) * 2) + 1
to allow for every char potentially being escaped, *plus* the NIL-
terminator?

Hence worth having a unittest for the "non-empty and everything is
escaped" case.

(and reuse the one-time strlen from above).

> +   strncpy (new_msg, msg, i);
> +   new_i = i;
> + }
> +   new_msg [new_i++] = '\\';
> +   switch (c)
> + {
> + case  7: new_msg[new_i++] = 'a'; break;
> + case  8: new_msg[new_i++] = 'b'; break;
> + case 27: new_msg[new_i++] = 'e'; break;
> + case

[PATCH][GCC][ARM] Silence more re-definition warnings, make test case failure case more explicit.

2018-02-07 Thread Tamar Christina

Hi All,

The previous testcase would fail on a system where the initial mode is thumb 
and later
switches to an arm mode. This would again cause some warnings to be emitted.

This patch visits all builtins defined with builtin_define_with_int_value and 
undefines
them if they could possibly change by changing the architecture.

The testcase has also been updated to make it fail more easily on such cases.

Bootstrapped and regtested on arm-none-Linux-gnueabihf and no issues.

Ok for trunk?

Thanks,
Tamar


gcc/
2018-02-07  Tamar Christina  

PR target/82641
* config/arm/arm-c.c (arm_cpu_builtins): Un-define __ARM_FEATURE_LDREX,
__ARM_ARCH_PROFILE, __ARM_ARCH_ISA_THUMB, __ARM_FP and __ARM_NEON_FP.

gcc/testsuite
2018-02-07  Tamar Christina  

PR target/82641
* gcc.target/arm/pragma_arch_switch_2.c: Use armv6 and armv5t.

-- 
diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c
index 9a16172f1d8af1a2fb3f24b758650e16ff6e810a..7c741e9fe66a0e086556272a46c4cd709996ce36 100644
--- a/gcc/config/arm/arm-c.c
+++ b/gcc/config/arm/arm-c.c
@@ -87,11 +87,10 @@ arm_cpu_builtins (struct cpp_reader* pfile)
 	builtin_define ("__ARM_FEATURE_CMSE");
 }
 
+  cpp_undef (pfile, "__ARM_FEATURE_LDREX");
   if (TARGET_ARM_FEATURE_LDREX)
 builtin_define_with_int_value ("__ARM_FEATURE_LDREX",
    TARGET_ARM_FEATURE_LDREX);
-  else
-cpp_undef (pfile, "__ARM_FEATURE_LDREX");
 
   def_or_undef_macro (pfile, "__ARM_FEATURE_CLZ",
 		  ((TARGET_ARM_ARCH >= 5 && !TARGET_THUMB)
@@ -105,6 +104,8 @@ arm_cpu_builtins (struct cpp_reader* pfile)
   builtin_define_with_int_value ("__ARM_SIZEOF_MINIMAL_ENUM",
  flag_short_enums ? 1 : 4);
   builtin_define_type_sizeof ("__ARM_SIZEOF_WCHAR_T", wchar_type_node);
+
+  cpp_undef (pfile, "__ARM_ARCH_PROFILE");
   if (TARGET_ARM_ARCH_PROFILE)
 builtin_define_with_int_value ("__ARM_ARCH_PROFILE",
    TARGET_ARM_ARCH_PROFILE);
@@ -128,6 +129,7 @@ arm_cpu_builtins (struct cpp_reader* pfile)
   else
 def_or_undef_macro (pfile, "__THUMBEL__", TARGET_THUMB);
 
+  cpp_undef (pfile, "__ARM_ARCH_ISA_THUMB");
   if (TARGET_ARM_ARCH_ISA_THUMB)
 builtin_define_with_int_value ("__ARM_ARCH_ISA_THUMB",
    TARGET_ARM_ARCH_ISA_THUMB);
@@ -147,10 +149,9 @@ arm_cpu_builtins (struct cpp_reader* pfile)
 
   builtin_define ("__VFP_FP__");
 
+  cpp_undef (pfile, "__ARM_FP");
   if (TARGET_ARM_FP)
 builtin_define_with_int_value ("__ARM_FP", TARGET_ARM_FP);
-  else
-cpp_undef (pfile, "__ARM_FP");
 
   def_or_undef_macro (pfile, "__ARM_FP16_FORMAT_IEEE",
 		  arm_fp16_format == ARM_FP16_FORMAT_IEEE);
@@ -169,10 +170,9 @@ arm_cpu_builtins (struct cpp_reader* pfile)
   def_or_undef_macro (pfile, "__ARM_NEON__", TARGET_NEON);
   def_or_undef_macro (pfile, "__ARM_NEON", TARGET_NEON);
 
+  cpp_undef (pfile, "__ARM_NEON_FP");
   if (TARGET_NEON_FP)
 builtin_define_with_int_value ("__ARM_NEON_FP", TARGET_NEON_FP);
-  else
-cpp_undef (pfile, "__ARM_NEON_FP");
 
   /* Add a define for interworking. Needed when building libgcc.a.  */
   if (arm_cpp_interwork)
diff --git a/gcc/testsuite/gcc.target/arm/pragma_arch_switch_2.c b/gcc/testsuite/gcc.target/arm/pragma_arch_switch_2.c
index fe52191c32c037fe4096c1884e1f6397318bd6a3..7f297557d555fd139a3b804d354117239a78ae62 100644
--- a/gcc/testsuite/gcc.target/arm/pragma_arch_switch_2.c
+++ b/gcc/testsuite/gcc.target/arm/pragma_arch_switch_2.c
@@ -2,16 +2,16 @@
 /* { dg-skip-if "instruction not valid on thumb" { *-*-* } { "-mthumb" } { "" } } */
 /* { dg-do assemble } */
 /* { dg-require-effective-target arm_arm_ok } */
-/* { dg-additional-options "-Wall -O2 -march=armv4t -std=gnu99 -marm" } */
+/* { dg-additional-options "-Wall -O2 -march=armv5t -std=gnu99 -marm" } */
 
-#pragma GCC target ("arch=armv5te")
-void cpu_has_iwmmxt (void)
+#pragma GCC target ("arch=armv6")
+int test_assembly (int hi, int lo)
 {
-   int lo;
-   int hi;
+   int res;
__asm__ __volatile__ (
-  "mcrr   p0, 0, %2, %3, c0\n"
-  : "=r" (lo), "=r" (hi)
-  : "r" (0), "r" (0x100));
+  "uxtah   %0, %1, %2\n"
+  : "=r" (res)
+  : "r" (hi),  "r" (lo));
+   return res;
 }

Re: [PATCH, rs6000] Update vsx-vector-6-le.c tests for p9 target

2018-02-07 Thread Segher Boessenkool

Hi!

On Wed, Feb 07, 2018 at 11:16:12AM -0600, Will Schmidt wrote:
> Noted during review of test results on P9.   Due to changes and improvements,
> our codegen is different for this test on power9.
> Modified the existing test to target P8, and added a P9 variant with updated
> counts.

> diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.c 
> b/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.c
> index ddb0089..7fe691b 100644
> --- a/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.c
> +++ b/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.c
> @@ -1,11 +1,11 @@
>  /* { dg-do compile { target { powerpc64le-*-* && lp64 } } } */
>  /* { dg-skip-if "" { powerpc*-*-darwin* } } */
>  /* { dg-require-effective-target powerpc_vsx_ok } */
> -/* { dg-options "-mvsx -O2" } */
> +/* { dg-options "-mvsx -O2 -mcpu=power8" } */

Why not -mcpu=power7?  And you'll need

/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
"-mcpu=power7" } } */

You could also do instead

/* { dg-skip-if "this is not for p9" { powerpc_p9vector_ok } } */

or something like that; a bit neater.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.p9.c
> @@ -0,0 +1,32 @@
> +/* { dg-do compile { target { powerpc64le-*-* && lp64 } } } */
> +/* { dg-skip-if "" { powerpc*-*-darwin* } } */
> +/* { dg-require-effective-target powerpc_p9vector_ok } */
> +/* { dg-options "-mvsx -O2 -mcpu=power9" } */

This needs the "do not override -mcpu" thing as well.


Segher

Re: [PATCH] C++: avoid most reserved words as misspelling suggestions (PR c++/81610 and PR c++/80567)

2018-02-07 Thread David Malcolm

On Wed, 2018-02-07 at 13:22 -0500, Jason Merrill wrote:
> On Wed, Feb 7, 2018 at 1:12 PM, David Malcolm 
> wrote:
> > On Wed, 2018-02-07 at 12:21 -0500, Jason Merrill wrote:
> > > On Fri, Jan 26, 2018 at 1:12 PM, David Malcolm  > > om>
> > > wrote:
> > > > On Mon, 2017-12-11 at 17:24 -0500, Jason Merrill wrote:
> > > > > On Wed, Nov 22, 2017 at 10:36 AM, David Malcolm  > > > > hat.
> > > > > com>
> > > > > wrote:
> > > > 
> > > > Original post:
> > > >   https://gcc.gnu.org/ml/gcc-patches/2017-11/msg02048.html
> > > > 
> > > > > > PR c++/81610 and PR c++/80567 report problems where the C++
> > > > > > frontend
> > > > > > suggested "if", "for" and "else" as corrections for
> > > > > > misspelled
> > > > > > variable
> > > > > > names.
> > > > 
> > > > I've now marked these PRs as regressions: the nonsensical
> > > > suggestions
> > > > are only offered by trunk, not by gcc 7 and earlier.
> > > > 
> > > > > Hmm, what about cases where people are actually misspelling
> > > > > keywords?
> > > > > Don't we want to handle that?
> > > > > 
> > > > > fi (true) { }
> > > > > retrun 42;
> > > > 
> > > > I'd prefer not to.
> > > > 
> > > > gcc 7 and earlier don't attempt to correct the spelling of the
> > > > "fi"
> > > > and
> > > > "retrun" above.
> > > > 
> > > > trunk currently does offer "return" as a suggestion, but it was
> > > > by
> > > > accident, and I'm wary of attempting to support these
> > > > corrections:
> > > > is
> > > > "fi" meant to be an "if", or a function call that's missing its
> > > > decl,
> > > > or a name lookup issue?  ...etc
> > > > 
> > > > > In the PRs you mention, the actual identifiers are 1) missing
> > > > > includes, which we should check first, and 2) pretty far from
> > > > > the
> > > > > suggested keywords.
> > > > 
> > > > The C++ FE is missing a suggestion about which #include to use
> > > > for
> > > > "memset", but I'd prefer to treat that as a follow-up patch
> > > > (and
> > > > probably for next stage 1).
> > > > 
> > > > In the meantime, is this patch OK for trunk? (as a regression
> > > > fix)
> > > 
> > > Yes.
> > 
> > Thanks; committed (r257456).
> > 
> > FWIW, I've filed PR c++/84269 so I remember to fix the missing
> > suggestion for "memset" (in gcc 9 stage1).
> 
> Did you have a reaction to my comment about the suggested keyword
> being pretty far from the actual identifier?  Do we want to lower the
> cutoff for suggestions at all?

I've played around with tweaking how the cutoff works, [1] in response
to e.g. PR c/82967 ('"did you mean" suggestions are way too
suggestive'), but I've not yet come up with a version I prefer to the
current implementation.

Re: C++ PATCH to fix ICE with vector expr folding (PR c++/83659)

2018-02-07 Thread Jason Merrill

On Wed, Feb 7, 2018 at 12:54 PM, Marek Polacek  wrote:
> On Wed, Feb 07, 2018 at 12:26:04PM -0500, Jason Merrill wrote:
>> On Fri, Jan 26, 2018 at 6:22 PM, Jakub Jelinek  wrote:
>> > On Fri, Jan 26, 2018 at 02:11:19PM +0100, Richard Biener wrote:
>> >> >> POINTER_PLUS_EXPR offets are to be interpreted as signed (ptrdiff_t)
>> >> >> so using uhwi and then performing an unsigned division is wrong code.
>> >> >> See mem_ref_offset how to deal with this (ugh - poly-ints...).  
>> >> >> Basically
>> >> >> you have to force the thing to signed.  Like just use
>> >> >>
>> >> >>   HOST_WIDE_INT offset = TREE_INT_CST_LOW (op01);
>> >> >
>> >> > Does it really matter here though?  Any negative offsets there are UB, 
>> >> > we
>> >> > should punt on them rather than try to optimize them.
>> >> > As we known op01 is unsigned, if we check that it fits into shwi_p, it 
>> >> > means
>> >> > it will be 0 to shwi max and then we can handle it in uhwi too.
>> >>
>> >> Ah, of course.  Didn't look up enough context to see what this code
>> >> does in the end ;)
>> >>
>> >> >   /* ((foo*))[1] => BIT_FIELD_REF */
>> >> >   if (VECTOR_TYPE_P (op00type)
>> >> >   && (same_type_ignoring_top_level_qualifiers_p
>> >> > -(type, TREE_TYPE (op00type
>> >> > +(type, TREE_TYPE (op00type)))
>> >> > + && tree_fits_shwi_p (op01))
>> >>
>> >> nevertheless this appearant "mismatch" deserves a comment (checking
>> >> shwi and using uhwi).
>> >
>> > So like this?
>> >
>> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>>
>> Why not use the same code as fold_indirect_ref_1 here?
>
> That was my first patch, but it was rejected:
> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00271.html

Then should we update fold_indirect_ref_1 to use the new code?  Is
there a reason for them to stay out of sync?

Jason

Re: [PATCH] C++: avoid most reserved words as misspelling suggestions (PR c++/81610 and PR c++/80567)

2018-02-07 Thread Jason Merrill

On Wed, Feb 7, 2018 at 1:12 PM, David Malcolm  wrote:
> On Wed, 2018-02-07 at 12:21 -0500, Jason Merrill wrote:
>> On Fri, Jan 26, 2018 at 1:12 PM, David Malcolm 
>> wrote:
>> > On Mon, 2017-12-11 at 17:24 -0500, Jason Merrill wrote:
>> > > On Wed, Nov 22, 2017 at 10:36 AM, David Malcolm > > > com>
>> > > wrote:
>> >
>> > Original post:
>> >   https://gcc.gnu.org/ml/gcc-patches/2017-11/msg02048.html
>> >
>> > > > PR c++/81610 and PR c++/80567 report problems where the C++
>> > > > frontend
>> > > > suggested "if", "for" and "else" as corrections for misspelled
>> > > > variable
>> > > > names.
>> >
>> > I've now marked these PRs as regressions: the nonsensical
>> > suggestions
>> > are only offered by trunk, not by gcc 7 and earlier.
>> >
>> > > Hmm, what about cases where people are actually misspelling
>> > > keywords?
>> > > Don't we want to handle that?
>> > >
>> > > fi (true) { }
>> > > retrun 42;
>> >
>> > I'd prefer not to.
>> >
>> > gcc 7 and earlier don't attempt to correct the spelling of the "fi"
>> > and
>> > "retrun" above.
>> >
>> > trunk currently does offer "return" as a suggestion, but it was by
>> > accident, and I'm wary of attempting to support these corrections:
>> > is
>> > "fi" meant to be an "if", or a function call that's missing its
>> > decl,
>> > or a name lookup issue?  ...etc
>> >
>> > > In the PRs you mention, the actual identifiers are 1) missing
>> > > includes, which we should check first, and 2) pretty far from the
>> > > suggested keywords.
>> >
>> > The C++ FE is missing a suggestion about which #include to use for
>> > "memset", but I'd prefer to treat that as a follow-up patch (and
>> > probably for next stage 1).
>> >
>> > In the meantime, is this patch OK for trunk? (as a regression fix)
>>
>> Yes.
>
> Thanks; committed (r257456).
>
> FWIW, I've filed PR c++/84269 so I remember to fix the missing
> suggestion for "memset" (in gcc 9 stage1).

Did you have a reaction to my comment about the suggested keyword
being pretty far from the actual identifier?  Do we want to lower the
cutoff for suggestions at all?

Jason

Re: [PATCH] C++: avoid most reserved words as misspelling suggestions (PR c++/81610 and PR c++/80567)

2018-02-07 Thread David Malcolm

On Wed, 2018-02-07 at 12:21 -0500, Jason Merrill wrote:
> On Fri, Jan 26, 2018 at 1:12 PM, David Malcolm 
> wrote:
> > On Mon, 2017-12-11 at 17:24 -0500, Jason Merrill wrote:
> > > On Wed, Nov 22, 2017 at 10:36 AM, David Malcolm  > > com>
> > > wrote:
> > 
> > Original post:
> >   https://gcc.gnu.org/ml/gcc-patches/2017-11/msg02048.html
> > 
> > > > PR c++/81610 and PR c++/80567 report problems where the C++
> > > > frontend
> > > > suggested "if", "for" and "else" as corrections for misspelled
> > > > variable
> > > > names.
> > 
> > I've now marked these PRs as regressions: the nonsensical
> > suggestions
> > are only offered by trunk, not by gcc 7 and earlier.
> > 
> > > Hmm, what about cases where people are actually misspelling
> > > keywords?
> > > Don't we want to handle that?
> > > 
> > > fi (true) { }
> > > retrun 42;
> > 
> > I'd prefer not to.
> > 
> > gcc 7 and earlier don't attempt to correct the spelling of the "fi"
> > and
> > "retrun" above.
> > 
> > trunk currently does offer "return" as a suggestion, but it was by
> > accident, and I'm wary of attempting to support these corrections:
> > is
> > "fi" meant to be an "if", or a function call that's missing its
> > decl,
> > or a name lookup issue?  ...etc
> > 
> > > In the PRs you mention, the actual identifiers are 1) missing
> > > includes, which we should check first, and 2) pretty far from the
> > > suggested keywords.
> > 
> > The C++ FE is missing a suggestion about which #include to use for
> > "memset", but I'd prefer to treat that as a follow-up patch (and
> > probably for next stage 1).
> > 
> > In the meantime, is this patch OK for trunk? (as a regression fix)
> 
> Yes.

Thanks; committed (r257456).

FWIW, I've filed PR c++/84269 so I remember to fix the missing
suggestion for "memset" (in gcc 9 stage1).

Dave

[PATCH v6] aarch64: Add split-stack support

2018-02-07 Thread Adhemerval Zanella

Changes from previous version:

  - Changed the wait to call __morestack to use use a branch with link
instead of a simple branch.  This allows use a call instruction and
avoid possible issues with later optimization passes which might
see a branch outside the instruction block (as noticed in previous
iterations while building a more complex workload as speccpu2006).

  - Change the return address to use the branch with link value and
set x12 to save x30.  This simplifies the required instructions
to setup/save the return address.

--

This patch adds the split-stack support on aarch64 (PR #67877).  As for
other ports this patch should be used along with glibc and gold support.

The support is done similar to other architectures: a split-stack field
is allocated before TCB by glibc, a target-specific __morestack implementation
and helper functions are added in libgcc and compiler supported in adjusted
(split-stack prologue, va_start for argument handling).  I also plan to
send the gold support to adjust stack allocation acrosss split-stack
and default code calls.

Current approach is to set the final stack adjustments using a 2 instructions
at most (mov/movk) which limits stack allocation to upper limit of 4GB.
The morestack call is non standard with x10 hollding the requested stack
pointer, x11 the argument pointer (if required), and x12 to return
continuation address.  Unwinding is handled by a personality routine that
knows how to find stack segments.

Split-stack prologue on function entry is as follow (this goes before the
usual function prologue):

function:
mrsx9, tpidr_el0
ldur   x9, [x9, -8]
movx10, 
movk   x10, #0x0, lsl #16
subx10, sp, x10
movx11, sp  # if function has stacked arguments
cmpx9, x10
bcc.LX
main_fn_entry:
[function prologue]
LX:
bl __morestack
b  main_fn_entry

Notes:

1. Even if a function does not allocate a stack frame, a split-stack prologue
   is created.  It is to avoid issues with tail call for external symbols
   which might require linker adjustment (libgo/runtime/go-varargs.c).

2. Basic-block reordering (enabled with -O2) will move split-stack TCB ldur
   to after the required stack calculation.

3. Similar to powerpc, When the linker detects a call from split-stack to
   non-split-stack code, it adds 16k (or more) to the value found in "allocate"
   instructions (so non-split-stack code gets a larger stack).  The amount is
   tunable by a linker option.  This feature is only implemented in the GNU
   gold linker.

4. AArch64 does not handle >4G stack initially and although it is possible
   to implement it, limiting to 4G allows to materize the allocation with
   only 2 instructions (mov + movk) and thus simplifying the linker
   adjustments required.  Supporting multiple threads each requiring more
   than 4G of stack is probably not that important, and likely to OOM at
   run time.

5. The TCB support on GLIBC is meant to be included in version 2.28.

6. Besides a regression tests I also checked with a SPECcpu2006 run with
   -fsplit-stack additional option.  I saw no regression besides 416.gamess
   which fails on trunk as well (not sure if some misconfiguration in my
   environment).

libgcc/ChangeLog:

* libgcc/config.host: Use t-stack and t-statck-aarch64 for
aarch64*-*-linux.
* libgcc/config/aarch64/morestack-c.c: New file.
* libgcc/config/aarch64/morestack.S: Likewise.
* libgcc/config/aarch64/t-stack-aarch64: Likewise.
* libgcc/generic-morestack.c (__splitstack_find): Add aarch64-specific
code.

gcc/ChangeLog:

* common/config/aarch64/aarch64-common.c
(aarch64_supports_split_stack): New function.
(TARGET_SUPPORTS_SPLIT_STACK): New macro.
* gcc/config/aarch64/aarch64-linux.h (TARGET_ASM_FILE_END): Remove
macro.
* gcc/config/aarch64/aarch64-protos.h: Add
aarch64_expand_split_stack_prologue and
aarch64_split_stack_space_check.
* gcc/config/aarch64/aarch64.c (aarch64_expand_builtin_va_start): Use
internal argument pointer instead of virtual_incoming_args_rtx.
(morestack_ref): New symbol.
(aarch64_load_split_stack_value): New function.
(aarch64_expand_split_stack_prologue): Likewise.
(aarch64_internal_arg_pointer): Likewise.
(aarch64_file_end): Emit the split-stack note sections.
(aarch64_split_stack_space_check): Likewise.
(TARGET_ASM_FILE_END): New macro.
(TARGET_INTERNAL_ARG_POINTER): Likewise.
* gcc/config/aarch64/aarch64.h (aarch64_frame): Add
split_stack_arg_pointer to setup the argument pointer when using
split-stack.
* gcc/config/aarch64/aarch64.md
(UNSPECV_STACK_CHECK): New define.
(split_stack_prologue): New expand.
(split_stack_space_check): Likewise.
---

Re: [PATCH, rs6000] fix vsxcopy test for power9

2018-02-07 Thread Segher Boessenkool

Hi,

On Wed, Feb 07, 2018 at 10:59:46AM -0600, Will Schmidt wrote:
> Noted during review of test results on P9.  The vsxcopy.c test is looking
> for lxvd2x, stxvd2x instructions in generated code.  For P9 targets, we will
> instead generate lxv, stxv instructions.
> Thus, update the test to handle that codegen as well.
> Sniff-tested on P9.
> OK for trunk?

There are many insns that start with "lxv" (or "stxv"); you may want to
use \m and \M, like:

> -/* { dg-final { scan-assembler "lxvd2x" } } */
> -/* { dg-final { scan-assembler "stxvd2x" } } */
> +/* { dg-final { scan-assembler "lxvd2x|lxv" } } */
> +/* { dg-final { scan-assembler "stxvd2x|stxv" } } */

/* { dg-final { scan-assembler {\m(lxvd2x|lxv)\M} } } */
/* { dg-final { scan-assembler {\m(stxvd2x|stxv)\M} } } */

("lxv" without anything more will already also match "lxvd2x", etc.)

Okay with such a change (if it works ;-) )


Segher

Re: C++ PATCH to fix ICE with vector expr folding (PR c++/83659)

2018-02-07 Thread Marek Polacek

On Wed, Feb 07, 2018 at 12:26:04PM -0500, Jason Merrill wrote:
> On Fri, Jan 26, 2018 at 6:22 PM, Jakub Jelinek  wrote:
> > On Fri, Jan 26, 2018 at 02:11:19PM +0100, Richard Biener wrote:
> >> >> POINTER_PLUS_EXPR offets are to be interpreted as signed (ptrdiff_t)
> >> >> so using uhwi and then performing an unsigned division is wrong code.
> >> >> See mem_ref_offset how to deal with this (ugh - poly-ints...).  
> >> >> Basically
> >> >> you have to force the thing to signed.  Like just use
> >> >>
> >> >>   HOST_WIDE_INT offset = TREE_INT_CST_LOW (op01);
> >> >
> >> > Does it really matter here though?  Any negative offsets there are UB, we
> >> > should punt on them rather than try to optimize them.
> >> > As we known op01 is unsigned, if we check that it fits into shwi_p, it 
> >> > means
> >> > it will be 0 to shwi max and then we can handle it in uhwi too.
> >>
> >> Ah, of course.  Didn't look up enough context to see what this code
> >> does in the end ;)
> >>
> >> >   /* ((foo*))[1] => BIT_FIELD_REF */
> >> >   if (VECTOR_TYPE_P (op00type)
> >> >   && (same_type_ignoring_top_level_qualifiers_p
> >> > -(type, TREE_TYPE (op00type
> >> > +(type, TREE_TYPE (op00type)))
> >> > + && tree_fits_shwi_p (op01))
> >>
> >> nevertheless this appearant "mismatch" deserves a comment (checking
> >> shwi and using uhwi).
> >
> > So like this?
> >
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> Why not use the same code as fold_indirect_ref_1 here?

That was my first patch, but it was rejected:
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00271.html

Marek

Re: RFA: Sanitize deprecation messages (PR 84195)

2018-02-07 Thread Nick Clifton

Hi Martin, Hi David,

OK - attached is a new patch that:

  * Replaces control characters with their escape equivalents.
  * Includes a testcase.

I was not sure what to do about the inconsistencies between the
behaviour of #warning and #pragma GCC warning, (and the error
equivalents), so I decided to leave it for now.  Fixing them
will require mucking about in the C preprocessor library, which
I did not fancy doing at the moment.

So, is this enhanced patch OK now ?

Cheers
  Nick

gcc/ChangeLog
2018-02-07  Nick Clifton  

PR 84195
* tree.c (warn_deprecated_use): Replace control characters in
deprecation messages with escape sequences.

gcc/testsuite/ChangeLog
2018-02-07  Nick Clifton  

* gcc.c-torture/compile/pr84195.c: New test.


pr84195.patch.2
Description: Unix manual page

Re: C++ PATCH to fix ICE with vector expr folding (PR c++/83659)

2018-02-07 Thread Jason Merrill

On Fri, Jan 26, 2018 at 6:22 PM, Jakub Jelinek  wrote:
> On Fri, Jan 26, 2018 at 02:11:19PM +0100, Richard Biener wrote:
>> >> POINTER_PLUS_EXPR offets are to be interpreted as signed (ptrdiff_t)
>> >> so using uhwi and then performing an unsigned division is wrong code.
>> >> See mem_ref_offset how to deal with this (ugh - poly-ints...).  Basically
>> >> you have to force the thing to signed.  Like just use
>> >>
>> >>   HOST_WIDE_INT offset = TREE_INT_CST_LOW (op01);
>> >
>> > Does it really matter here though?  Any negative offsets there are UB, we
>> > should punt on them rather than try to optimize them.
>> > As we known op01 is unsigned, if we check that it fits into shwi_p, it 
>> > means
>> > it will be 0 to shwi max and then we can handle it in uhwi too.
>>
>> Ah, of course.  Didn't look up enough context to see what this code
>> does in the end ;)
>>
>> >   /* ((foo*))[1] => BIT_FIELD_REF */
>> >   if (VECTOR_TYPE_P (op00type)
>> >   && (same_type_ignoring_top_level_qualifiers_p
>> > -(type, TREE_TYPE (op00type
>> > +(type, TREE_TYPE (op00type)))
>> > + && tree_fits_shwi_p (op01))
>>
>> nevertheless this appearant "mismatch" deserves a comment (checking
>> shwi and using uhwi).
>
> So like this?
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Why not use the same code as fold_indirect_ref_1 here?

Jason

Re: [PATCH] C++: avoid most reserved words as misspelling suggestions (PR c++/81610 and PR c++/80567)

2018-02-07 Thread Jason Merrill

On Fri, Jan 26, 2018 at 1:12 PM, David Malcolm  wrote:
> On Mon, 2017-12-11 at 17:24 -0500, Jason Merrill wrote:
>> On Wed, Nov 22, 2017 at 10:36 AM, David Malcolm 
>> wrote:
>
> Original post:
>   https://gcc.gnu.org/ml/gcc-patches/2017-11/msg02048.html
>
>> > PR c++/81610 and PR c++/80567 report problems where the C++
>> > frontend
>> > suggested "if", "for" and "else" as corrections for misspelled
>> > variable
>> > names.
>
> I've now marked these PRs as regressions: the nonsensical suggestions
> are only offered by trunk, not by gcc 7 and earlier.
>
>> Hmm, what about cases where people are actually misspelling keywords?
>> Don't we want to handle that?
>>
>> fi (true) { }
>> retrun 42;
>
> I'd prefer not to.
>
> gcc 7 and earlier don't attempt to correct the spelling of the "fi" and
> "retrun" above.
>
> trunk currently does offer "return" as a suggestion, but it was by
> accident, and I'm wary of attempting to support these corrections: is
> "fi" meant to be an "if", or a function call that's missing its decl,
> or a name lookup issue?  ...etc
>
>> In the PRs you mention, the actual identifiers are 1) missing
>> includes, which we should check first, and 2) pretty far from the
>> suggested keywords.
>
> The C++ FE is missing a suggestion about which #include to use for
> "memset", but I'd prefer to treat that as a follow-up patch (and
> probably for next stage 1).
>
> In the meantime, is this patch OK for trunk? (as a regression fix)

Yes.

Jason

[PATCH, rs6000] Update vsx-vector-6-le.c tests for p9 target

2018-02-07 Thread Will Schmidt

Hi,
Noted during review of test results on P9.   Due to changes and improvements,
our codegen is different for this test on power9.
Modified the existing test to target P8, and added a P9 variant with updated
counts.

Sniff-tested, now runs clean on P9.  OK for trunk?

Thanks,
-Will

[testsuite]

2018-02-07  Will Schmidt  

* gcc.target/powerpc/vsx-vector-6-le.c:  Update CPU target.
* gcc.target/powerpc/vsx-vector-6-le.p9.c:  New.

diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.c 
b/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.c
index ddb0089..7fe691b 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.c
@@ -1,11 +1,11 @@
 /* { dg-do compile { target { powerpc64le-*-* && lp64 } } } */
 /* { dg-skip-if "" { powerpc*-*-darwin* } } */
 /* { dg-require-effective-target powerpc_vsx_ok } */
-/* { dg-options "-mvsx -O2" } */
+/* { dg-options "-mvsx -O2 -mcpu=power8" } */
 
-/* Expected instruction counts for Little Endian */
+/* Expected instruction counts for Little Endian targeting Power8. */
 
 /* { dg-final { scan-assembler-times "xvabsdp" 1 } } */
 /* { dg-final { scan-assembler-times "xvadddp" 1 } } */
 /* { dg-final { scan-assembler-times "xxlnor" 8 } } */
 /* { dg-final { scan-assembler-times "xxlor" 30 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.p9.c 
b/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.p9.c
new file mode 100644
index 000..450cd52
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.p9.c
@@ -0,0 +1,32 @@
+/* { dg-do compile { target { powerpc64le-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mvsx -O2 -mcpu=power9" } */
+
+/* Expected instruction counts for Little Endian targeting Power9. */
+
+/* { dg-final { scan-assembler-times "xvabsdp" 1 } } */
+/* { dg-final { scan-assembler-times "xvadddp" 1 } } */
+/* { dg-final { scan-assembler-times "xxlnor" 7 } } */
+/* { dg-final { scan-assembler-times "xxlor" 20 } } */
+/* { dg-final { scan-assembler-times "xvcmpeqdp" 5 } } */
+/* { dg-final { scan-assembler-times "xvcmpgtdp" 8 } } */
+/* { dg-final { scan-assembler-times "xvcmpgedp" 8 } } */
+/* { dg-final { scan-assembler-times "xvrdpim" 1 } } */
+/* { dg-final { scan-assembler-times "xvmaddadp" 1 } } */
+/* { dg-final { scan-assembler-times "xvmsubadp" 1 } } */
+/* { dg-final { scan-assembler-times "xvsubdp" 1 } } */
+/* { dg-final { scan-assembler-times "xvmaxdp" 1 } } */
+/* { dg-final { scan-assembler-times "xvmindp" 1 } } */
+/* { dg-final { scan-assembler-times "xvmuldp" 1 } } */
+/* { dg-final { scan-assembler-times "vperm" 1 } } */
+/* { dg-final { scan-assembler-times "xvrdpic" 1 } } */
+/* { dg-final { scan-assembler-times "xvsqrtdp" 1 } } */
+/* { dg-final { scan-assembler-times "xvrdpiz" 1 } } */
+/* { dg-final { scan-assembler-times "xvmsubasp" 1 } } */
+/* { dg-final { scan-assembler-times "xvnmaddasp" 1 } } */
+/* { dg-final { scan-assembler-times "vmsumshs" 1 } } */
+/* { dg-final { scan-assembler-times "xxland" 13 } } */
+
+/* Source code for the test in vsx-vector-6.h */
+#include "vsx-vector-6.h"

Re: [PATCH, rs6000] fix requires for mergew-mergeow.c test

2018-02-07 Thread Segher Boessenkool

Hi!

On Wed, Feb 07, 2018 at 10:30:05AM -0600, Will Schmidt wrote:
> Noted during review of test results.  The builtins-mergew-mergeow.c testcase
> uses the vec_mergeo() intrinsics that did not exist before power8.  Thus,
> update the -requires stanza so we don't try to build/run this on earlier 
> systems.
> 
> Sniff-tested OK on P6,P7,P8,P9.  OK for trunk?

Yes please.  Thanks!


Segher


> 2018-02-07  Will Schmidt  
> 
>   * gcc.target/powerpc/builtins-mergew-mergeow.c:  Update dg-requires.

Re: [PATCH] PR84068: Fix sort order of SCHED_PRESSURE_MODEL

2018-02-07 Thread Alexander Monakov

> On Wed, 31 Jan 2018, Wilco Dijkstra wrote:
> 
> > The comparison order for SCHED_PRESSURE_MODEL is incorrect.  If either
> > instruction is not in target_bb, the ordering is not well defined.  To fix
> > this, give all instructions in target_bb the highest priority and sort all
> > other instructions behind it.  This way instructions in target_bb will be
> > sorted using the pressure model, and instructions outside it will use
> > RFS_DEP_COUNT and/or RFS_TIE for their order.
> 
> This appears to be the same issue as PR 83459; please add 
> rtl-optimization/83459
> to ChangeLog if approved.
> 
> > PR rlt-optimization/84068
> > * haifa-sched.c (rank_for_schedule): Fix SCHED_PRESSURE_MODEL sorting.
> > 
> > PR rlt-optimization/84068
> 
> Note typos in category name (s/rlt/rtl).

This and the previous point applies to v2 as well.

Thanks.
Alexander

[PATCH, rs6000] fix vsxcopy test for power9

2018-02-07 Thread Will Schmidt


Hi,
Noted during review of test results on P9.  The vsxcopy.c test is looking
for lxvd2x, stxvd2x instructions in generated code.  For P9 targets, we will
instead generate lxv, stxv instructions.
Thus, update the test to handle that codegen as well.
Sniff-tested on P9.
OK for trunk?

Thanks,
-Will

[testsuite]

2018-02-07  Will Schmidt  

* gcc.target/powerpc/vsxcopy.c: Update scan-assembler stanzas.

diff --git a/gcc/testsuite/gcc.target/powerpc/vsxcopy.c 
b/gcc/testsuite/gcc.target/powerpc/vsxcopy.c
index fbe3c67..f877471 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsxcopy.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsxcopy.c
@@ -1,10 +1,10 @@
 /* { dg-do compile { target { powerpc64*-*-* } } } */
 /* { dg-require-effective-target powerpc_vsx_ok } */
 /* { dg-options "-O1 -mvsx" } */
-/* { dg-final { scan-assembler "lxvd2x" } } */
-/* { dg-final { scan-assembler "stxvd2x" } } */
+/* { dg-final { scan-assembler "lxvd2x|lxv" } } */
+/* { dg-final { scan-assembler "stxvd2x|stxv" } } */
 /* { dg-final { scan-assembler-not "xxpermdi" } } */
 
 typedef float vecf __attribute__ ((vector_size (16)));
 extern vecf j, k;

[RFC][PATCH] Stabilize a few qsort comparison functions

2018-02-07 Thread Franz Sirl


Hi,

this is the result of an attempt to minimize the differences between the
compile results of a Linux-based and a Cygwin64-based powerpc-eabi cross
toolchain.
The method used was:

    - find the -fverbose-asm assembler files that differ
    - compile that file again on both platforms with
   -O2 -g3 -fdump-tree-all-all -fdump-rtl-all -fdump-noaddr
    - look for the first dump file with differences and check that pass
  for qsort's
    - stabilize the compare functions

With some help on IRC to better understand the passes and some serious
debugging of GCC I came up with this patch. On the tested codebase the
differences in the assembler sources are now down to 0.
If the various pass maintainers have better ideas on how to stabilize
the compare functions, I'll be happy to verify them on the codebase.
For the SRA patch I already have an alternate version with an additional
ID member.

Comments?

Bootstrapped on linux-x86_64, no testsuite regressions.

Franz Sirl


2018-02-07  Franz Sirl 

    * ira-build.c (object_range_compare_func): Stabilize sort.
    * tree-sra.c (compare_access_positions): Likewise.
    * varasm.c (output_object_block_compare): Likewise.
    * tree-ssa-loop-ivopts.c (group_compare_offset): Likewise.
    (struct iv_common_cand): New member.
    (record_common_cand): Initialize new member.
    (common_cand_cmp): Use new member to stabilize sort.
    * tree-vrp.c (struct assert_locus): New member.
    (register_new_assert_for): Initialize new member.
    (compare_assert_loc): Use new member to stabilize sort.


Index: gcc/ira-build.c
===
--- gcc/ira-build.c (revision 257438)
+++ gcc/ira-build.c (working copy)
@@ -2787,8 +2787,10 @@ object_range_compare_func (const void *v1p, const
   if ((diff = OBJECT_MIN (obj1) - OBJECT_MIN (obj2)) != 0)
 return diff;
   if ((diff = OBJECT_MAX (obj1) - OBJECT_MAX (obj2)) != 0)
- return diff;
-  return ALLOCNO_NUM (a1) - ALLOCNO_NUM (a2);
+return diff;
+  if ((diff = ALLOCNO_NUM (a1) - ALLOCNO_NUM (a2)) != 0)
+return diff;
+  return OBJECT_SUBWORD (obj1) - OBJECT_SUBWORD (obj2);
 }
 
 /* Sort ira_object_id_map and set up conflict id of allocnos.  */
Index: gcc/tree-sra.c
===
--- gcc/tree-sra.c  (revision 257438)
+++ gcc/tree-sra.c  (working copy)
@@ -1567,7 +1567,13 @@ compare_access_positions (const void *a, const voi
   if (f1->size == f2->size)
 {
   if (f1->type == f2->type)
-   return 0;
+   {
+ if (f1->stmt == f2->stmt)
+   return 0;
+ if (f1->stmt && f2->stmt)
+   return gimple_uid (f1->stmt) - gimple_uid (f2->stmt);
+ return f1->stmt ? -1 : 1;
+   }
   /* Put any non-aggregate type before any aggregate type.  */
   else if (!is_gimple_reg_type (f1->type)
  && is_gimple_reg_type (f2->type))
Index: gcc/tree-ssa-loop-ivopts.c
===
--- gcc/tree-ssa-loop-ivopts.c  (revision 257438)
+++ gcc/tree-ssa-loop-ivopts.c  (working copy)
@@ -443,6 +443,7 @@ struct iv_common_cand
   /* IV uses from which this common candidate is derived.  */
   auto_vec uses;
   hashval_t hash;
+  unsigned id;
 };
 
 /* Hashtable helpers.  */
@@ -2617,8 +2618,11 @@ group_compare_offset (const void *a, const void *b
 {
   const struct iv_use *const *u1 = (const struct iv_use *const *) a;
   const struct iv_use *const *u2 = (const struct iv_use *const *) b;
+  int diff;
 
-  return compare_sizes_for_sort ((*u1)->addr_offset, (*u2)->addr_offset);
+  if ((diff = compare_sizes_for_sort ((*u1)->addr_offset, (*u2)->addr_offset)) 
!= 0)
+return diff;
+  return (*u1)->id - (*u2)->id;
 }
 
 /* Check if small groups should be split.  Return true if no group
@@ -3378,6 +3382,7 @@ record_common_cand (struct ivopts_data *data, tree
   struct iv_common_cand ent;
   struct iv_common_cand **slot;
 
+  ent.id = data->iv_common_cands.length ();
   ent.base = base;
   ent.step = step;
   ent.hash = iterative_hash_expr (base, 0);
@@ -3387,6 +3392,7 @@ record_common_cand (struct ivopts_data *data, tree
   if (*slot == NULL)
 {
   *slot = new iv_common_cand ();
+  (*slot)->id = ent.id;
   (*slot)->base = base;
   (*slot)->step = step;
   (*slot)->uses.create (8);
@@ -3412,6 +3418,8 @@ common_cand_cmp (const void *p1, const void *p2)
 
   n1 = (*ccand1)->uses.length ();
   n2 = (*ccand2)->uses.length ();
+  if (n1 == n2)
+return (*ccand1)->id - (*ccand2)->id;
   return n2 - n1;
 }
 
Index: gcc/tree-vrp.c
===
--- gcc/tree-vrp.c  (revision 257438)
+++ gcc/tree-vrp.c  (working copy)
@@ -101,6 +101,8 @@ struct assert_locus
   /* Predicate code for the ASSERT_EXPR.  Must be COMPARISON_CLASS_P.  */
   enum tree_code comp_code;
 
+  unsigned id;
+
   /* Value being

Re: [PATCH] PR84068: Fix sort order of SCHED_PRESSURE_MODEL

2018-02-07 Thread Maxim Kuvyrkov

> On Feb 2, 2018, at 7:40 PM, Wilco Dijkstra  wrote:
> 
> Right, so here is version 2 which ends up much simpler:
> 
> The comparison function for SCHED_PRESSURE_MODEL is incorrect.  If either
> instruction is not in target_bb, the ordering is not well defined.  
> Since all instructions outside the target_bb get the highest model_index,
> all we need to do is sort on model_index.  If the model_index is the same
> we defer to RFS_DEP_COUNT and/or RFS_TIE.
> 
> Bootstrap OK, OK for commit?

Looks good to me.

--
Maxim Kuvyrkov
www.linaro.org

> 
> ChangeLog:
> 2018-02-02  Wilco Dijkstra  
> 
>   PR rlt-optimization/84068
>   * haifa-sched.c (rank_for_schedule): Fix SCHED_PRESSURE_MODEL sorting.
> 
>   PR rlt-optimization/84068
>   * gcc.dg/pr84068.c: New test.
> --
> 
> diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
> index 
> ebdec46bf04f1ba07e8b70607602a3bc9e7ec7de..4a899b56173dd7d67db084ed86d24ad865ccadcd
>  100644
> --- a/gcc/haifa-sched.c
> +++ b/gcc/haifa-sched.c
> @@ -2783,12 +2783,11 @@ rank_for_schedule (const void *x, const void *y)
> }
> 
>   /* Prefer instructions that occur earlier in the model schedule.  */
> -  if (sched_pressure == SCHED_PRESSURE_MODEL
> -  && INSN_BB (tmp) == target_bb && INSN_BB (tmp2) == target_bb)
> +  if (sched_pressure == SCHED_PRESSURE_MODEL)
> {
>   diff = model_index (tmp) - model_index (tmp2);
> -  gcc_assert (diff != 0);
> -  return rfs_result (RFS_PRESSURE_INDEX, diff, tmp, tmp2);
> +  if (diff != 0)
> + return rfs_result (RFS_PRESSURE_INDEX, diff, tmp, tmp2);
> }
> 
>   /* Prefer the insn which has more later insns that depend on it.
> diff --git a/gcc/testsuite/gcc.dg/pr84068.c b/gcc/testsuite/gcc.dg/pr84068.c
> new file mode 100644
> index 
> ..13110d84455f20edfc50f09efe4074721bd6a7d0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr84068.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fno-sched-critical-path-heuristic 
> -fno-sched-rank-heuristic --param=max-sched-extend-regions-iters=5 --param 
> sched-pressure-algorithm=2" } */
> +
> +#ifdef __SIZEOF_INT128__
> +typedef __int128 largeint;
> +#else
> +typedef long long largeint;
> +#endif
> +
> +largeint a;
> +int b;
> +
> +largeint
> +foo (char d, short e, int f)
> +{
> +  b = __builtin_sub_overflow_p (b, 1, (unsigned long)0);
> +  return a + f;
> +}

[PATCH, rs6000] fix requires for mergew-mergeow.c test

2018-02-07 Thread Will Schmidt

Hi, 

Noted during review of test results.  The builtins-mergew-mergeow.c testcase
uses the vec_mergeo() intrinsics that did not exist before power8.  Thus,
update the -requires stanza so we don't try to build/run this on earlier 
systems.

Sniff-tested OK on P6,P7,P8,P9.  OK for trunk?
Thanks
-Will

[testsuite]

2018-02-07  Will Schmidt  

* gcc.target/powerpc/builtins-mergew-mergeow.c:  Update dg-requires.

diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-mergew-mergow.c 
b/gcc/testsuite/gcc.target/powerpc/builtins-mergew-mergow.c
index 24df2cf..51c8701 100644
--- a/gcc/testsuite/gcc.target/powerpc/builtins-mergew-mergow.c
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-mergew-mergow.c
@@ -1,8 +1,8 @@
 /* { dg-do run } */
-/* { dg-require-effective-target vsx_hw } */
-/* { dg-options "-maltivec -mvsx" } */
+/* { dg-require-effective-target p8vector_hw } */
+/* { dg-options "-mpower8-vector" } */
 
 #include  // vector
 #include 
 
 #ifdef DEBUG

Re: [PATCH 2/7] Support >26 operands in generation code.

2018-02-07 Thread Alan Hayward



> On 6 Feb 2018, at 17:03, Richard Sandiford  
> wrote:
> 
> Alan Hayward  writes:
>> This patch adds support for CLOBBER_HIGH in the generation code.
>> 
>> An aarch64 will require 31 clobber high expressions, plus two
>> clobbers.
>> 
>> The exisiting gen code restricts to 26 vector operands by virtue
>> of using the operators [a-z]. This patch extends this to 52 by
>> supporting [a-zA-Z].
> 
> Although the main CLOBBER_HIGH patch will obviously need to wait
> until GCC 9 now, we can work around it for GCC 8 by making the
> tlsdesc pattern clobber all vector and predicate registers for SVE.
> We'll still need the support for more than 26 operands in order
> to do that though.
> 
>> 2017-11-16  Alan Hayward  
>> 
>> [...]
>>  * genextract.c (push_pathstr_operand): New function to
>>  support [a-zA-Z].
>>  (walk_rtx): Call push_pathstr_operand.
>>  (print_path): Support [a-zA-Z].
>> [...]
>> diff --git a/gcc/genextract.c b/gcc/genextract.c
>> index 
>> 258d234d2729bf16b152b90bb1833a37a6eb0bdc..e1fb716e459b9bd219e89cf36c30556d520305a2
>>  100644
>> --- a/gcc/genextract.c
>> +++ b/gcc/genextract.c
>> @@ -33,9 +33,10 @@ along with GCC; see the file COPYING3.  If not see
>> 
>>The string for each operand describes that path to the operand and
>>contains `0' through `9' when going into an expression and `a' through
>> -   `z' when going into a vector.  We assume here that only the first operand
>> -   of an rtl expression is a vector.  genrecog.c makes the same assumption
>> -   (and uses the same representation) and it is currently true.  */
>> +   `z' then 'A' through to 'Z' when going into a vector.  We assume here 
>> that
>> +   only the first operand of an rtl expression is a vector.  genrecog.c 
>> makes
>> +   the same assumption (and uses the same representation) and it is 
>> currently
>> +   true.  */
>> 
>> typedef char *locstr;
>> 
>> @@ -80,6 +81,22 @@ struct accum_extract
>> /* Forward declarations.  */
>> static void walk_rtx (md_rtx_info *, rtx, struct accum_extract *);
>> 
>> +#define UPPER_OFFSET ('A' - ('z' - 'a' + 1))
>> +
>> +/* Convert OPERAND into a character - either into [a-zA-Z] for vector 
>> operands
>> +   or [0-9] for integer operands - and push onto the end of the path ACC.  
>> */
>> +static void
>> +push_pathstr_operand (int operand, bool is_vector,
>> + struct accum_extract *acc)
>> +{
>> +  if (is_vector && 'a' + operand > 'z')
>> +acc->pathstr.safe_push (operand + UPPER_OFFSET);
>> +  else if (is_vector)
>> +acc->pathstr.safe_push (operand + 'a');
>> +  else
>> +acc->pathstr.safe_push (operand + '0');
>> +}
>> +
>> static void
>> gen_insn (md_rtx_info *info)
>> {
>> @@ -98,7 +115,7 @@ gen_insn (md_rtx_info *info)
>>   else
>> for (i = XVECLEN (insn, 1) - 1; i >= 0; i--)
>>   {
>> -acc.pathstr.safe_push ('a' + i);
>> +push_pathstr_operand (i, true, );
>>  walk_rtx (info, XVECEXP (insn, 1, i), );
>>  acc.pathstr.pop ();
>>   }
>> @@ -208,7 +225,7 @@ static void
>> walk_rtx (md_rtx_info *info, rtx x, struct accum_extract *acc)
>> {
>>   RTX_CODE code;
>> -  int i, len, base;
>> +  int i, len;
>>   const char *fmt;
>> 
>>   if (x == 0)
>> @@ -234,10 +251,9 @@ walk_rtx (md_rtx_info *info, rtx x, struct 
>> accum_extract *acc)
>>   VEC_safe_set_locstr (info, >oplocs, XINT (x, 0),
>> VEC_char_to_string (acc->pathstr));
>> 
>> -  base = (code == MATCH_OPERATOR ? '0' : 'a');
>>   for (i = XVECLEN (x, 2) - 1; i >= 0; i--)
>>  {
>> -  acc->pathstr.safe_push (base + i);
>> +  push_pathstr_operand (i, code != MATCH_OPERATOR, acc);
>>walk_rtx (info, XVECEXP (x, 2, i), acc);
>>acc->pathstr.pop ();
>> }
>> @@ -252,10 +268,9 @@ walk_rtx (md_rtx_info *info, rtx x, struct 
>> accum_extract *acc)
>>   if (code == MATCH_DUP)
>>  break;
>> 
>> -  base = (code == MATCH_OP_DUP ? '0' : 'a');
>>   for (i = XVECLEN (x, 1) - 1; i >= 0; i--)
>> {
>> -  acc->pathstr.safe_push (base + i);
>> +  push_pathstr_operand (i, code != MATCH_OP_DUP, acc);
>>walk_rtx (info, XVECEXP (x, 1, i), acc);
>>acc->pathstr.pop ();
>> }
>> @@ -271,7 +286,7 @@ walk_rtx (md_rtx_info *info, rtx x, struct accum_extract 
>> *acc)
>> {
>>   if (fmt[i] == 'e' || fmt[i] == 'u')
>>  {
>> -  acc->pathstr.safe_push ('0' + i);
>> +  push_pathstr_operand (i, false, acc);
>>walk_rtx (info, XEXP (x, i), acc);
>>acc->pathstr.pop ();
>>  }
>> @@ -280,7 +295,7 @@ walk_rtx (md_rtx_info *info, rtx x, struct accum_extract 
>> *acc)
>>int j;
>>for (j = XVECLEN (x, i) - 1; j >= 0; j--)
>>  {
>> -  acc->pathstr.safe_push ('a' + j);
>> +  push_pathstr_operand (j, true, acc);
>>walk_rtx (info, XVECEXP (x, i, j), acc);
>>acc->pathstr.pop ();
>>  }
>> @@ -311,7

C++ PATCH for c++/84182, ICE with captured lambda

2018-02-07 Thread Jason Merrill

My patch for 84126 didn't go far enough in handling the case of things
in local_specializations causing trouble with partial pack expansions;
it isn't enough to handle function parameters, we need to remember
everything that comes from local_specializations.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit e75f4ac70a69667cc1bb6bcc203e77972e477275
Author: Jason Merrill 
Date:   Tue Feb 6 16:32:00 2018 -0500

PR c++/84182 - ICE with captured lambda

PR c++/84181
* pt.c (extract_locals_r, extract_local_specs): New.
(tsubst_pack_expansion): Use them.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index ca73bb18d97..0d9e153b5ee 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -11415,6 +11415,34 @@ tsubst_binary_right_fold (tree t, tree args, 
tsubst_flags_t complain,
   return expand_right_fold (t, vec, complain);
 }
 
+/* Walk through the pattern of a pack expansion, adding everything in
+   local_specializations to a list.  */
+
+static tree
+extract_locals_r (tree *tp, int */*walk_subtrees*/, void *data)
+{
+  tree *extra = reinterpret_cast(data);
+  if (tree spec = retrieve_local_specialization (*tp))
+{
+  if (TREE_CODE (spec) == NONTYPE_ARGUMENT_PACK)
+   {
+ /* Pull out the actual PARM_DECL for the partial instantiation.  */
+ tree args = ARGUMENT_PACK_ARGS (spec);
+ gcc_assert (TREE_VEC_LENGTH (args) == 1);
+ tree arg = TREE_VEC_ELT (args, 0);
+ spec = PACK_EXPANSION_PATTERN (arg);
+   }
+  *extra = tree_cons (*tp, spec, *extra);
+}
+  return NULL_TREE;
+}
+static tree
+extract_local_specs (tree pattern)
+{
+  tree extra = NULL_TREE;
+  cp_walk_tree_without_duplicates (, extract_locals_r, );
+  return extra;
+}
 
 /* Substitute ARGS into T, which is an pack expansion
(i.e. TYPE_PACK_EXPANSION or EXPR_PACK_EXPANSION). Returns a
@@ -11442,14 +11470,17 @@ tsubst_pack_expansion (tree t, tree args, 
tsubst_flags_t complain,
   tree extra = PACK_EXPANSION_EXTRA_ARGS (t);
   if (extra && TREE_CODE (extra) == TREE_LIST)
 {
-  /* The partial instantiation involved function parameter packs; map
- from the general template to our current context.  */
-  for (tree fns = TREE_CHAIN (extra); fns; fns = TREE_CHAIN (fns))
+  for (tree elt = TREE_CHAIN (extra); elt; elt = TREE_CHAIN (elt))
{
- tree fn = TREE_PURPOSE (fns);
- tree inst = enclosing_instantiation_of (fn);
- register_parameter_specializations (fn, inst);
+ /* The partial instantiation involved local declarations collected in
+extract_local_specs; map from the general template to our local
+context.  */
+ tree gen = TREE_PURPOSE (elt);
+ tree partial = TREE_VALUE (elt);
+ tree inst = retrieve_local_specialization (partial);
+ register_local_specialization (inst, gen);
}
+  gcc_assert (!TREE_PURPOSE (extra));
   extra = TREE_VALUE (extra);
 }
   args = add_to_template_args (extra, args);
@@ -11625,25 +11656,8 @@ tsubst_pack_expansion (tree t, tree args, 
tsubst_flags_t complain,
   t = make_pack_expansion (pattern, complain);
   tree extra = args;
   if (unsubstituted_fn_pack)
-   {
- /* For function parameter packs it's more complicated; we need to
-remember which enclosing function(s) provided them to this pack
-expansion so we can map their parameters to the parameters of a
-later full instantiation.  */
- tree fns = NULL_TREE;
- for (tree p = packs; p; p = TREE_CHAIN (p))
-   {
- tree parm = TREE_PURPOSE (p);
- if (TREE_CODE (parm) != PARM_DECL)
-   continue;
- parm = DECL_CONTEXT (parm);
- if (purpose_member (parm, fns))
-   continue;
- fns = tree_cons (parm, NULL_TREE, fns);
-   }
- if (fns)
-   extra = tree_cons (NULL_TREE, extra, fns);
-   }
+   if (tree locals = extract_local_specs (pattern))
+ extra = tree_cons (NULL_TREE, extra, locals);
   PACK_EXPANSION_EXTRA_ARGS (t) = extra;
   return t;
 }
diff --git a/gcc/testsuite/g++.dg/cpp1y/lambda-generic-targ2.C 
b/gcc/testsuite/g++.dg/cpp1y/lambda-generic-targ2.C
new file mode 100644
index 000..b4d63da5758
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/lambda-generic-targ2.C
@@ -0,0 +1,14 @@
+// PR c++/84181
+// { dg-do compile { target c++14 } }
+
+template < int ... I >
+struct A{};
+
+template < typename T >
+auto var = [](auto ... i){
+return A< decltype(i)::x ... >{};
+};
+
+int main(){
+var< int >();
+}
diff --git a/gcc/testsuite/g++.dg/cpp1y/lambda-generic-variadic10.C 
b/gcc/testsuite/g++.dg/cpp1y/lambda-generic-variadic10.C
new file mode 100644
index 000..ab790cf8c9c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/lambda-generic-variadic10.C
@@ -0,0 +1,17 @@
+// PR

[PATCH][Middle-end][version 3]2nd patch of PR78809 and PR83026

2018-02-07 Thread Qing Zhao

Hi, this is the 3rd version for this patch.

the main change compared with 2nd version are:
1. do not use “compute_objsize” to get the minimum object size per Jeff 
and Richard’s
comment. Instead, add a new function “determine_min_objsize” for this purpose. 
This new
function calls “compute_builtin_object_size” to get the minimum objsize, 
however, due to 
the fact that “compute_builtin_object_size” does nothing for SSA_NAME when 
optimize > 0 (don’t
know the exactly reason for this), inside “determine_min_objsize”, I have to 
add  more code
to catch up some simple SSA_NAME cases.

2. in gimple-fold.c and tree-ssa-structalias.c, add the handling of the 
new 
BUILT_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ in the same places where 
BUILT_IN_STRCMP and BUILT_IN_STRNCMP is checked.

3. some  format change and comments change per Jeff’s comment. 

let me know if you have any comments.

thanks a lot.

Qing

*

2nd Patch for PR78009 
Patch for PR83026

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
Inline strcmp with small constant strings

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026
missing strlen optimization for strcmp of unequal strings

The design doc for PR78809 is at:
https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html

this patch is for the second part of change of PR78809 and PR83026:

B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0

 B.1. (PR83026) When the lengths of both arguments are constant and
  it's a strcmp:
* if the lengths are NOT equal, we can safely fold the call
  to a non-zero value.
* otherwise, do nothing now.

 B.2. (PR78809) When the length of one argument is constant, try to replace
 the call with a __builtin_str(n)cmp_eq call where possible, i.e:

 strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a
 string with constant length, C is a constant.
   if (C <= strlen(STR) && sizeof_array(s) > C)
 {
   replace this call with
   __builtin_strncmp_eq (s, STR, C) (!)= 0
 }
   if (C > strlen(STR)
 {
   it can be safely treated as a call to strcmp (s, STR) (!)= 0
   can handled by the following strcmp.
 }

 strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a
 string with constant length.
   if  (sizeof_array(s) > strlen(STR))
 {
   replace this call with
   __builtin_strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
 }

 later when expanding the new __builtin_str(n)cmp_eq calls, first expand them
 as __builtin_memcmp_eq, if the expansion does not succeed, change them back
 to call to __builtin_str(n)cmp.

adding test case strcmpopt_2.c and strcmpopt_4.c into gcc.dg for part B of
PR78809 adding test case strcmpopt_3.c into gcc.dg for PR83026

bootstraped and tested on both X86 and Aarch64. no regression.


gcc/ChangeLog

+2018-02-02  
+   
+   PR middle-end/78809
+   PR middle-end/83026
+   * builtins.c (expand_builtin): Add the handling of BUILT_IN_STRCMP_EQ
+   and BUILT_IN_STRNCMP_EQ.
+   * builtins.def: Add new builtins BUILT_IN_STRCMP_EQ and
+   BUILT_IN_STRNCMP_EQ.
+   * gimple-fold.c (gimple_fold_builtin_string_compare): Add the 
+   handling of BUILTIN_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ.
+   (gimple_fold_builtin): Likewise.
+   * tree-ssa-strlen.c (compute_string_length): New function.
+   (determine_min_obsize): New function.
+   (handle_builtin_string_cmp): New function to handle calls to
+   string compare functions.
+   (strlen_optimize_stmt): Add handling to builtin string compare
+   calls. 
+   * tree-ssa-structalias.c (find_func_aliases_for_builtin_call):
+   Add the handling of BUILT_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ.
+   * tree.c (build_common_builtin_nodes): Add new defines of
+   BUILT_IN_STRNCMP_EQ and BUILT_IN_STRCMP_EQ.
+

gcc/testsuite/ChangeLog

+2018-02-02  
+
+   PR middle-end/78809
+   * gcc.dg/strcmpopt_2.c: New testcase.
+   * gcc.dg/strcmpopt_3.c: New testcase.
+
+   PR middle-end/83026
+   * gcc.dg/strcmpopt_3.c: New testcase.



PR78809_B.patch
Description: Binary data


> 
>> 
>> We're now in stage4 so please queue this patch and ping it during
>> next stage1.
>> 
> 
> I will update my patch based on Jeff and your comments, and send it during 
> next stage 1.
> 
> thanks.
> 
> Qing
>

[PATCH, rs6000] PR84220 fix altivec_vec_sld and vec_sldw intrinsic definitions

2018-02-07 Thread Will Schmidt

Hi,
  Our VEC_SLD definitions were mistakenly allowing the third argument to be
of an invalid type, triggering an ICE (on invalid code) later in the build
process.  This fixes those definitions.  The nearby VEC_SLDW definitions have
the same issue, those have been fixed as part of this patch too.
Testcases have been added to ensure we generate the 'invalid intrinsic'
message as is appropriate, instead of ICEing.
Giving proper credit, this was found by Peter Bergner while working a
different issue. :-)

Sniff-tests passed on P8.  Doing larger reg-test across power systems now.
OK for trunk?
And,.. do we want this one backported too?

Thanks,
-Will

[gcc]

2018-02-07  Will Schmidt  

PR target/84220
* config/rs6000/rs6000-c.c: Update definitions for
ALTIVEC_BUILTIN_VEC_SLD and ALTIVEC_BUILTIN_VEC_SLDW.

[testsuite]

2018-02-07  Will Schmidt  

PR target/84220
* gcc.target/powerpc/pr84220-1.c: New test.
* gcc.target/powerpc/pr84220-2.c: New test.
* gcc.target/powerpc/pr84220-sldw.c: New test.

diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index a68be51..26f9990 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -3654,39 +3654,39 @@ const struct altivec_builtin_types 
altivec_overloaded_builtins[] = {
   { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_16QI,
 RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, 
RS6000_BTI_bool_V16QI },
   { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_16QI,
 RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, 
RS6000_BTI_unsigned_V16QI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_4SF,
-RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_NOT_OPAQUE },
+RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_INTSI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_4SI,
-RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_NOT_OPAQUE },
+RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_INTSI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_4SI,
-RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, 
RS6000_BTI_NOT_OPAQUE },
+RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, 
RS6000_BTI_INTSI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_4SI,
-RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 
RS6000_BTI_unsigned_V4SI, RS6000_BTI_NOT_OPAQUE },
+RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 
RS6000_BTI_unsigned_V4SI, RS6000_BTI_INTSI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_8HI,
-RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_NOT_OPAQUE },
+RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_INTSI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_8HI,
-RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, 
RS6000_BTI_unsigned_V8HI, RS6000_BTI_NOT_OPAQUE },
+RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, 
RS6000_BTI_unsigned_V8HI, RS6000_BTI_INTSI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_8HI,
-RS6000_BTI_bool_V8HI, RS6000_BTI_bool_V8HI, RS6000_BTI_bool_V8HI, 
RS6000_BTI_NOT_OPAQUE },
+RS6000_BTI_bool_V8HI, RS6000_BTI_bool_V8HI, RS6000_BTI_bool_V8HI, 
RS6000_BTI_INTSI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_8HI,
-RS6000_BTI_pixel_V8HI, RS6000_BTI_pixel_V8HI, RS6000_BTI_pixel_V8HI, 
RS6000_BTI_NOT_OPAQUE },
+RS6000_BTI_pixel_V8HI, RS6000_BTI_pixel_V8HI, RS6000_BTI_pixel_V8HI, 
RS6000_BTI_INTSI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_16QI,
-RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_V16QI, 
RS6000_BTI_NOT_OPAQUE },
+RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_INTSI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_16QI,
-RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 
RS6000_BTI_unsigned_V16QI, RS6000_BTI_NOT_OPAQUE },
+RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 
RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_16QI,
-RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, 
RS6000_BTI_NOT_OPAQUE },
+RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, 
RS6000_BTI_INTSI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_2DF,
-RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_NOT_OPAQUE },
+RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_INTSI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_2DI,
-RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V2DI, 
RS6000_BTI_NOT_OPAQUE },
+RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V2DI, 
RS6000_BTI_INTSI },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_2DI,
-RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI,

[PATCH] Improve # IVs generated by the vectorizer for strided loads/stores (PR84037)

2018-02-07 Thread Richard Biener


One of the issues with PR84037 is we spill IVs a lot because the
vectorizer blows up the number of IVs from 7 needed for the scalar
loop (after IVO) to 22.  The following patch brings this down
to 11 (and 7 with -mprefer-avx128 -- need to still investigate that 
difference).

The vect_loop_versioning hunk fixes a latent issue where we gimplify
some data-ref fields in-place because we forget unsharing (this also
then makes us not recognize equivalences).

Bootstrapped and tested on x86_64-unknown-linux-gnu, re-testing after
a minor change.

This improves the runtime of capacita by 10% on my machine (with AVX256).
Still 18% to go... (we still spill, with avx128 we no longer do!).

Richard.

2018-02-07  Richard Biener  

PR tree-optimization/84037
* tree-vectorizer.h (struct _loop_vec_info): Add ivexpr_map member.
(cse_and_gimplify_to_preheader): Declare.
(vect_get_place_in_interleaving_chain): Likewise.
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize
ivexpr_map.
(_loop_vec_info::~_loop_vec_info): Delete it.
(cse_and_gimplify_to_preheader): New function.
* tree-vect-slp.c (vect_get_place_in_interleaving_chain): Export.
* tree-vect-stmts.c (vectorizable_store): CSE base and steps.
(vectorizable_load): Likewise.  For grouped stores always base
the IV on the first element.
* tree-vect-loop-manip.c (vect_loop_versioning): Unshare versioning
condition before gimplifying.

Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c(revision 257382)
+++ gcc/tree-vect-loop.c(working copy)
@@ -1128,6 +1128,7 @@ _loop_vec_info::_loop_vec_info (struct l
 unaligned_dr (NULL),
 peeling_for_alignment (0),
 ptr_mask (0),
+ivexpr_map (NULL),
 slp_unrolling_factor (1),
 single_scalar_iteration_cost (0),
 vectorizable (false),
@@ -1251,10 +1252,38 @@ _loop_vec_info::~_loop_vec_info ()
   free (bbs);
 
   release_vec_loop_masks ();
+  delete ivexpr_map;
 
   loop->aux = NULL;
 }
 
+/* Return an invariant or register for EXPR and emit necessary
+   computations in the LOOP_VINFO loop preheader.  */
+
+tree
+cse_and_gimplify_to_preheader (loop_vec_info loop_vinfo, tree expr)
+{
+  if (is_gimple_reg (expr)
+  || is_gimple_min_invariant (expr))
+return expr;
+
+  if (! loop_vinfo->ivexpr_map)
+loop_vinfo->ivexpr_map = new hash_map;
+  tree  = loop_vinfo->ivexpr_map->get_or_insert (expr);
+  if (! cached)
+{
+  gimple_seq stmts = NULL;
+  cached = force_gimple_operand (unshare_expr (expr),
+, true, NULL_TREE);
+  if (stmts)
+   {
+ edge e = loop_preheader_edge (LOOP_VINFO_LOOP (loop_vinfo));
+ gsi_insert_seq_on_edge_immediate (e, stmts);
+   }
+}
+  return cached;
+}
+
 /* Return true if we can use CMP_TYPE as the comparison type to produce
all masks required to mask LOOP_VINFO.  */
 
Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   (revision 257382)
+++ gcc/tree-vectorizer.h   (working copy)
@@ -440,6 +440,9 @@ typedef struct _loop_vec_info : public v
   /* Cost vector for a single scalar iteration.  */
   auto_vec scalar_cost_vec;
 
+  /* Map of IV base/step expressions to inserted name in the preheader.  */
+  hash_map *ivexpr_map;
+
   /* The unrolling factor needed to SLP the loop. In case of that pure SLP is
  applied to the loop, i.e., no unrolling is needed, this is 1.  */
   poly_uint64 slp_unrolling_factor;
@@ -1544,6 +1552,7 @@ extern int vect_get_known_peeling_cost (
stmt_vector_for_cost *,
stmt_vector_for_cost *,
stmt_vector_for_cost *);
+extern tree cse_and_gimplify_to_preheader (loop_vec_info, tree);
 
 /* In tree-vect-slp.c.  */
 extern void vect_free_slp_instance (slp_instance);
@@ -1564,6 +1573,7 @@ extern bool can_duplicate_and_interleave
tree * = NULL, tree * = NULL);
 extern void duplicate_and_interleave (gimple_seq *, tree, vec,
  unsigned int, vec &);
+extern int vect_get_place_in_interleaving_chain (gimple *, gimple *);
 
 /* In tree-vect-patterns.c.  */
 /* Pattern recognition functions.
Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c (revision 257382)
+++ gcc/tree-vect-slp.c (working copy)
@@ -188,7 +188,7 @@ vect_free_oprnd_info (vec *v = NULL;
-  gimple_seq stmts = NULL;
   tree stride_base, stride_step, alias_off;
   /* Checked by get_load_store_type.  */
   unsigned int const_nunits = nunits.to_constant ();
+  unsigned HOST_WIDE_INT

Re: PING [PATCH] -mjsr option bug fix

2018-02-07 Thread Nick Clifton

Hi Sebastian,

> +2018-01-05  Sebastian Perta  
> +
> + * config/rx/constraints.md: added new constraint
> CALL_OP_SYMBOL_REF
> + to allow or block "symbol_ref" depending on value of TARGET_JSR
> + * config/rx/rx.md: use CALL_OP_SYMBOL_REF in call_internal and
> + call_value_internal insns
> +

Approved - please apply.

Note for the future - it is usually best to provide changelog entries
as plain text, rather than a context diff, as they almost never apply
cleanly...

Cheers
  Nick

Re: PING [PATCH] RX movsicc degrade fix

2018-02-07 Thread Nick Clifton

Hi Sebastian,

  Sorry for missing this one.  If it helps in the future, feel free to ping me 
directly.

> +2018-01-09  Sebastian Perta  
> +
> + *config/rx.md: updated "movsicc" expand to be matched by GCC
> + *testsuite/gcc.target/rx/movsicc.c: new test case

Approved - please apply.

Cheers
  Nick

Re: [PATCH, i386] Fix ix86_multiplication_cost for SKX

2018-02-07 Thread Uros Bizjak

On Wed, Feb 7, 2018 at 2:02 PM, Shalnov, Sergey
 wrote:
> Hi,
> This patch is one of the set of patches to fix SKX costs.

Please post the whole series for review.

Thanks,
Uros.

> I think multiplication costs calculation algorithm needs to be adjusted in 
> gcc/config/i386/i386.c ix86_multiplication_cost() function.
> For TARGET_AVX512DQ emulation is not used and single vpmullq instruction 
> emitted.
> I think we have to align costs calculation algorithm with sequence of emitted 
> instructions.
>
> I don't see visible regressions for this change. This patch helps to avoid a 
> regressions with further changes.
>
> Could you please merge the patch to the main trunk?
>
> Thank you
> Sergey
>
> 2018-02-06  Sergey Shalnov  
> gcc/
> * config/i386/i386.c (ix86_multiplication_cost): Fix multiplication
> cost for TARGET_AVX512DQ since it uses vpmullq instead
> emulated sequence
>
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 
> 70b6775..6f4ec5d 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -40393,6 +40393,10 @@ ix86_multiplication_cost (const struct 
> processor_costs *cost,
>? cost->mulsd : cost->mulss, true);
>else if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
>  {
> +  /* vpmullq is used in this case. No emulation is needed.  */
> +  if (TARGET_AVX512DQ)
> +   return ix86_vec_cost (mode, cost->mulss, true);
> +
>/* V*QImode is emulated with 7-13 insns.  */
>if (mode == V16QImode || mode == V32QImode)
> {
>

[PATCH, i386] Fix ix86_multiplication_cost for SKX

2018-02-07 Thread Shalnov, Sergey

Hi,
This patch is one of the set of patches to fix SKX costs.

I think multiplication costs calculation algorithm needs to be adjusted in 
gcc/config/i386/i386.c ix86_multiplication_cost() function.
For TARGET_AVX512DQ emulation is not used and single vpmullq instruction 
emitted.
I think we have to align costs calculation algorithm with sequence of emitted 
instructions.

I don't see visible regressions for this change. This patch helps to avoid a 
regressions with further changes.

Could you please merge the patch to the main trunk?

Thank you
Sergey

2018-02-06  Sergey Shalnov  
gcc/
* config/i386/i386.c (ix86_multiplication_cost): Fix multiplication
cost for TARGET_AVX512DQ since it uses vpmullq instead
emulated sequence


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 
70b6775..6f4ec5d 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -40393,6 +40393,10 @@ ix86_multiplication_cost (const struct processor_costs 
*cost,
   ? cost->mulsd : cost->mulss, true);
   else if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
 {
+  /* vpmullq is used in this case. No emulation is needed.  */
+  if (TARGET_AVX512DQ)
+   return ix86_vec_cost (mode, cost->mulss, true);
+
   /* V*QImode is emulated with 7-13 insns.  */
   if (mode == V16QImode || mode == V32QImode)
{

Re: [PATCH] Revert behavior to r251316.

2018-02-07 Thread Nathan Sidwell


On 02/07/2018 07:03 AM, Martin Liška wrote:

Hello.

The refactoring in r251316 changed the logic when 
cgraph_node::record_function_versions
function is called. Suggested patch reverts the logic prior to the revision.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
And make check RUNTESTFLAGS="i386.exp" on x86_64 works.

Ready to be installed?
Martin

gcc/cp/ChangeLog:

2018-01-26  Martin Liska  

PR c++/84059.
* class.c (add_method): Append argument value.
* cp-tree.h (maybe_version_functions): Add new argument.
* decl.c (decls_match): Call it if a declaration does not
have DECL_FUNCTION_VERSIONED.
(maybe_version_functions): record argument is added.


ok


--
Nathan Sidwell

[PATCH] Revert behavior to r251316.

2018-02-07 Thread Martin Liška

Hello.

The refactoring in r251316 changed the logic when 
cgraph_node::record_function_versions
function is called. Suggested patch reverts the logic prior to the revision.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
And make check RUNTESTFLAGS="i386.exp" on x86_64 works.

Ready to be installed?
Martin

gcc/cp/ChangeLog:

2018-01-26  Martin Liska  

PR c++/84059.
* class.c (add_method): Append argument value.
* cp-tree.h (maybe_version_functions): Add new argument.
* decl.c (decls_match): Call it if a declaration does not
have DECL_FUNCTION_VERSIONED.
(maybe_version_functions): record argument is added.

gcc/testsuite/ChangeLog:

2018-01-26  Martin Liska  

PR c++/84059.
* g++.dg/ext/mv26.C: New test.
---
 gcc/cp/class.c  |  2 +-
 gcc/cp/cp-tree.h|  2 +-
 gcc/cp/decl.c   | 14 ++
 gcc/testsuite/g++.dg/ext/mv26.C | 15 +++
 4 files changed, 23 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/mv26.C


diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index a4098ac872e..57a2e69ada6 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -1099,7 +1099,7 @@ add_method (tree type, tree method, bool via_using)
 	  /* If these are versions of the same function, process and
 	 move on.  */
 	  if (TREE_CODE (fn) == FUNCTION_DECL
-	  && maybe_version_functions (method, fn))
+	  && maybe_version_functions (method, fn, true))
 	continue;
 
 	  if (DECL_INHERITED_CTOR (method))
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index a53f4fd9c03..b636333b93f 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6136,7 +6136,7 @@ extern bool note_iteration_stmt_body_start	(void);
 extern void note_iteration_stmt_body_end	(bool);
 extern tree make_lambda_name			(void);
 extern int decls_match(tree, tree, bool = true);
-extern bool maybe_version_functions		(tree, tree);
+extern bool maybe_version_functions		(tree, tree, bool);
 extern tree duplicate_decls			(tree, tree, bool);
 extern tree declare_local_label			(tree);
 extern tree define_label			(location_t, tree);
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 01ce9fb6d69..3ccea9e6a45 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -1087,7 +1087,9 @@ decls_match (tree newdecl, tree olddecl, bool record_versions /* = true */)
 	  && !DECL_EXTERN_C_P (newdecl)
 	  && !DECL_EXTERN_C_P (olddecl)
 	  && record_versions
-	  && maybe_version_functions (newdecl, olddecl))
+	  && maybe_version_functions (newdecl, olddecl,
+  (!DECL_FUNCTION_VERSIONED (newdecl)
+   || !DECL_FUNCTION_VERSIONED (olddecl
 	return 0;
 }
   else if (TREE_CODE (newdecl) == TEMPLATE_DECL)
@@ -1145,19 +1147,17 @@ decls_match (tree newdecl, tree olddecl, bool record_versions /* = true */)
 }
 
 /* NEWDECL and OLDDECL have identical signatures.  If they are
-   different versions adjust them and return true.  */
+   different versions adjust them and return true.
+   If RECORD is set to true, record function versions.  */
 
 bool
-maybe_version_functions (tree newdecl, tree olddecl)
+maybe_version_functions (tree newdecl, tree olddecl, bool record)
 {
   if (!targetm.target_option.function_versions (newdecl, olddecl))
 return false;
 
-  bool record = false;
-
   if (!DECL_FUNCTION_VERSIONED (olddecl))
 {
-  record = true;
   DECL_FUNCTION_VERSIONED (olddecl) = 1;
   if (DECL_ASSEMBLER_NAME_SET_P (olddecl))
 	mangle_decl (olddecl);
@@ -1165,13 +1165,11 @@ maybe_version_functions (tree newdecl, tree olddecl)
 
   if (!DECL_FUNCTION_VERSIONED (newdecl))
 {
-  record = true;
   DECL_FUNCTION_VERSIONED (newdecl) = 1;
   if (DECL_ASSEMBLER_NAME_SET_P (newdecl))
 	mangle_decl (newdecl);
 }
 
-  /* Only record if at least one was not already versions.  */
   if (record)
 cgraph_node::record_function_versions (olddecl, newdecl);
 
diff --git a/gcc/testsuite/g++.dg/ext/mv26.C b/gcc/testsuite/g++.dg/ext/mv26.C
new file mode 100644
index 000..1b455130e46
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/mv26.C
@@ -0,0 +1,15 @@
+// PR c++/84059
+// { dg-do compile { target i?86-*-* x86_64-*-* } }
+// { dg-require-ifunc "" }
+
+template  struct a
+{
+  int __attribute__ ((target ("arch=ivybridge"))) c (int) {return 1;}
+  int __attribute__ ((target ("default"))) c (int) { return 2; }
+};
+void
+d ()
+{
+  a b;
+  b.c (2);
+}

[PATCH] S/390: Disable prediction of indirect branches

2018-02-07 Thread Andreas Krebbel

This patch implements GCC support for mitigating vulnerability
CVE-2017-5715 known as Spectre #2 on IBM Z.

In order to disable prediction of indirect branches the implementation
makes use of an IBM Z specific feature - the execute instruction.
Performing an indirect branch via execute prevents the branch from
being subject to dynamic branch prediction.

The implementation tries to stay close to the x86 solution regarding
user interface.

x86 style options supported (without thunk-inline):

-mindirect-branch=(keep|thunk|thunk-extern)
-mfunction-return=(keep|thunk|thunk-extern)

IBM Z specific options:

-mindirect-branch-jump=(keep|thunk|thunk-extern|thunk-inline)
-mindirect-branch-call=(keep|thunk|thunk-extern)
-mfunction-return-reg=(keep|thunk|thunk-extern)
-mfunction-return-mem=(keep|thunk|thunk-extern)

These options allow us to enable/disable the branch conversion at a
finer granularity.

-mindirect-branch sets the value of -mindirect-branch-jump and
 -mindirect-branch-call.

-mfunction-return sets the value of -mfunction-return-reg and
 -mfunction-return-mem.

All these options are supported on GCC command line as well as
function attributes.

'thunk' triggers the generation of out of line thunks (expolines) and
replaces the formerly indirect branch with a direct branch to the
thunk.  Depending on the -march= setting two different types of thunks
are generated.  With -march=z10 or higher exrl (execute relative long)
is being used while targeting older machines makes use of larl/ex
instead.  From a security perspective the exrl variant is preferable.

'thunk-extern' does the branch replacement like 'thunk' but does not
emit the thunks.

'thunk-inline' is only available for indirect jumps.  It should be used
in environments where correct CFI is important - known as user space.

Additionally the patch introduces the -mindirect-branch-table option
which generates tables pointing to the locations which have been
modified.  This is supposed to allow reverting the changes without
re-compilation in situations where it isn't required. The sections are
split up into one section per option.

I plan to commit the patch tomorrow.

gcc/ChangeLog:

2018-02-07  Andreas Krebbel  

* config/s390/s390-opts.h (enum indirect_branch): Define.
* config/s390/s390-protos.h (s390_return_addr_from_memory)
(s390_indirect_branch_via_thunk)
(s390_indirect_branch_via_inline_thunk): Add function prototypes.
(enum s390_indirect_branch_type): Define.
* config/s390/s390.c (struct s390_frame_layout, struct
machine_function): Remove.
(indirect_branch_prez10thunk_mask, indirect_branch_z10thunk_mask)
(indirect_branch_table_label_no, indirect_branch_table_name):
Define variables.
(INDIRECT_BRANCH_NUM_OPTIONS): Define macro.
(enum s390_indirect_branch_option): Define.
(s390_return_addr_from_memory): New function.
(s390_handle_string_attribute): New function.
(s390_attribute_table): Add new attribute handler.
(s390_execute_label): Handle UNSPEC_EXECUTE_JUMP patterns.
(s390_indirect_branch_via_thunk): New function.
(s390_indirect_branch_via_inline_thunk): New function.
(s390_function_ok_for_sibcall): When jumping via thunk disallow
sibling call optimization for non z10 compiles.
(s390_emit_call): Force indirect branch target to be a single
register.  Add r1 clobber for non-z10 compiles.
(s390_emit_epilogue): Emit return jump via return_use expander.
(s390_reorg): Handle JUMP_INSNs as execute targets.
(s390_option_override_internal): Perform validity checks for the
new command line options.
(s390_indirect_branch_attrvalue): New function.
(s390_indirect_branch_settings): New function.
(s390_set_current_function): Invoke s390_indirect_branch_settings.
(s390_output_indirect_thunk_function):  New function.
(s390_code_end): Implement target hook.
(s390_case_values_threshold): Implement target hook.
(TARGET_ASM_CODE_END, TARGET_CASE_VALUES_THRESHOLD): Define target
macros.
* config/s390/s390.h (struct s390_frame_layout)
(struct machine_function): Move here from s390.c.
(TARGET_INDIRECT_BRANCH_NOBP_RET)
(TARGET_INDIRECT_BRANCH_NOBP_JUMP)
(TARGET_INDIRECT_BRANCH_NOBP_JUMP_THUNK)
(TARGET_INDIRECT_BRANCH_NOBP_JUMP_INLINE_THUNK)
(TARGET_INDIRECT_BRANCH_NOBP_CALL)
(TARGET_DEFAULT_INDIRECT_BRANCH_TABLE)
(TARGET_INDIRECT_BRANCH_THUNK_NAME_EXRL)
(TARGET_INDIRECT_BRANCH_THUNK_NAME_EX)
(TARGET_INDIRECT_BRANCH_TABLE): Define macros.
* config/s390/s390.md (UNSPEC_EXECUTE_JUMP)
(INDIRECT_BRANCH_THUNK_REGNUM): Define constants.
(mnemonic attribute): Add values which aren't recognized
automatically.
("*cjump_long", "*icjump_long", "*basr",

[testsuite, committed] Require alloca in gcc.dg/pr83844.c

2018-02-07 Thread Tom de Vries


Hi,

this patch requires effective target alloca in gcc.dg/pr83844.c.

Committed.

Thanks,
- Tom
[testsuite] Require alloca in gcc.dg/pr83844.c

2018-02-07  Tom de Vries  

	* gcc.dg/pr83844.c: Require effective target alloca.

---
 gcc/testsuite/gcc.dg/pr83844.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/gcc.dg/pr83844.c b/gcc/testsuite/gcc.dg/pr83844.c
index fdbc191..0d26b55 100644
--- a/gcc/testsuite/gcc.dg/pr83844.c
+++ b/gcc/testsuite/gcc.dg/pr83844.c
@@ -1,6 +1,7 @@
 /* PR c/83844 */
 /* { dg-do compile { target int32plus } } */
 /* { dg-options "-O0 -Wall" } */
+/* { dg-require-effective-target alloca } */
 
 typedef unsigned long long __u64 __attribute__((aligned(4),warn_if_not_aligned(8)));
 void bar (void *, void *, void *);

[testsuite, committed] Require global_constructor in gcc.dg/torture/pr83055.c

2018-02-07 Thread Tom de Vries


Hi,

this patch requires effective target global_constructor in 
gcc.dg/torture/pr83055.c.


Committed.

Thanks,
- Tom
[testsuite] Require global_constructor in gcc.dg/torture/pr83055.c

2018-02-07  Tom de Vries  

	* gcc.dg/torture/pr83055.c: Require effective target global_constructor.

---
 gcc/testsuite/gcc.dg/torture/pr83055.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/gcc.dg/torture/pr83055.c b/gcc/testsuite/gcc.dg/torture/pr83055.c
index 9bc71c6..93f54b7 100644
--- a/gcc/testsuite/gcc.dg/torture/pr83055.c
+++ b/gcc/testsuite/gcc.dg/torture/pr83055.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-additional-options "-fprofile-generate" } */
+/* { dg-require-effective-target global_constructor } */
 
 void __attribute__ ((__cold__)) a (void);
 void b (void);

RE: [PATCH][GCC][ARM] Silence ARM_ARCH redefinition warning and keep better track of architecturs already emitted.

2018-02-07 Thread Tamar Christina

The testcase seems to be failing on configurations where the default thumb 
version differs from the one it's setting it to,
Causing a warning on TARGET_ARM_ARCH_ISA_THUMB.

I will submit a patch soon to undefine all relevant built-ins using 
builtin_define_with_int_value before setting them again.

Thanks,
Tamar

> -Original Message-
> From: Kyrill Tkachov [mailto:kyrylo.tkac...@foss.arm.com]
> Sent: Tuesday, February 6, 2018 10:38
> To: Tamar Christina ; gcc-patches@gcc.gnu.org
> Cc: nd ; Ramana Radhakrishnan
> ; Richard Earnshaw
> ; ni...@redhat.com
> Subject: Re: [PATCH][GCC][ARM] Silence ARM_ARCH redefinition warning
> and keep better track of architecturs already emitted.
> 
> 
> On 05/02/18 11:52, Tamar Christina wrote:
> > Hi All,
> >
> > This patch silences the warnings that were emitted when certain ACLE
> > macros were being redefined when using the new target pragma or
> attribute to change architecture level.
> >
> > It also keeps better track of which arch directives have already been
> > emitted. This allows special cases that existed before to keep working as
> they did before.
> >
> > Regtested on arm-none-eabi and no regressions.
> > Bootstrapped and regtested on arm-none-Linux-gnueabihf and no issues.
> >
> > Ok for trunk and backport to the GCC-8 branch?
> >
> 
> Thanks Tamar, this is in line with what we do for aarch64 as well.
> Ok for trunk.
> There is no GCC 8 branch, and this is GCC 8 functionality so this shouldn't
> need to be backported anywhere.
> 
> Kyrill
> 
> > Thanks,
> > Tamar
> >
> >
> > gcc/
> > 2018-02-05  Tamar Christina  
> >
> > PR target/82641
> > * config/arm/arm.c (arm_print_asm_arch_directives): Record already
> > emitted arch directives.
> > * config/arm/arm-c.c (arm_cpu_builtins): Undefine __ARM_ARCH and
> > __ARM_FEATURE_COPROC before changing architectures.
> >
> > gcc/testsuite
> > 2018-02-05  Tamar Christina  
> >
> > PR target/82641
> > * gcc.target/arm/pragma_arch_switch_2.c: New.
> >
> > --

[openacc, committed] Fix diff_type in expand_oacc_collapse_init

2018-02-07 Thread Tom de Vries


Hi,

this patch fixes an 8 regression in an openacc testcase.

The regression was introduced by r250925, a fix for PR78266, a bug in 
the handling of a loop with iteration variable type range smaller than 
the size of the parallel dimension the loop is assigned to.


The fix for the regression is to apply the r250925 fix (in 
expand_oacc_for) to expand_oacc_collapse_init as well.


Build and reg-tested on x86_64 with nvptx accelerator.

Committed to stage4 trunk.

Thanks,
- Tom
[openacc] Fix diff_type in expand_oacc_collapse_init

2018-02-07  Tom de Vries  

	PR libgomp/84217
	* omp-expand.c (expand_oacc_collapse_init): Ensure diff_type is large
	enough.

	* c-c++-common/goacc/pr84217.c: New test.
	* gfortran.dg/goacc/pr84217.f90: New test.

	* testsuite/libgomp.oacc-c-c++-common/pr84217.c: New test.

---
 gcc/omp-expand.c   |  2 ++
 gcc/testsuite/c-c++-common/goacc/pr84217.c |  8 
 gcc/testsuite/gfortran.dg/goacc/pr84217.f90|  9 +
 .../testsuite/libgomp.oacc-c-c++-common/pr84217.c  | 22 ++
 4 files changed, 41 insertions(+)

diff --git a/gcc/omp-expand.c b/gcc/omp-expand.c
index 90e0631..bb20490 100644
--- a/gcc/omp-expand.c
+++ b/gcc/omp-expand.c
@@ -1433,6 +1433,8 @@ expand_oacc_collapse_init (const struct omp_for_data *fd,
 	plus_type = sizetype;
   if (POINTER_TYPE_P (diff_type) || TYPE_UNSIGNED (diff_type))
 	diff_type = signed_type_for (diff_type);
+  if (TYPE_PRECISION (diff_type) < TYPE_PRECISION (integer_type_node))
+	diff_type = integer_type_node;
 
   if (tiling)
 	{
diff --git a/gcc/testsuite/c-c++-common/goacc/pr84217.c b/gcc/testsuite/c-c++-common/goacc/pr84217.c
new file mode 100644
index 000..c4f2920
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/pr84217.c
@@ -0,0 +1,8 @@
+void
+foo (void)
+{
+#pragma acc parallel loop tile (2, 3)
+  for (short i = 0; i < 10; ++i)
+for (short j = 0; j < 10; ++j)
+  ;
+}
diff --git a/gcc/testsuite/gfortran.dg/goacc/pr84217.f90 b/gcc/testsuite/gfortran.dg/goacc/pr84217.f90
new file mode 100644
index 000..cee49e9
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/pr84217.f90
@@ -0,0 +1,9 @@
+subroutine foo
+  integer(2) :: i, j
+  !$acc parallel loop tile(2,3)
+  do i = 1, 10
+ do j = 1, 10
+ end do
+  end do
+  !$acc end parallel loop
+end
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84217.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84217.c
new file mode 100644
index 000..18ff66a
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84217.c
@@ -0,0 +1,22 @@
+extern void abort (void);
+
+#define N 10
+
+int
+main (void)
+{
+  int a[N];
+
+  for (short i = 0; i < N; ++i)
+a[i] = -1;
+
+#pragma acc parallel loop tile (2)
+  for (short i = 0; i < N; ++i)
+a[i] = i;
+
+  for (short i = 0; i < N; ++i)
+if (a[i] != i)
+  abort ();
+
+  return 0;
+}

Re: [PATCH] Fix floating point handling in dom (PR tree-optimization/84235)

2018-02-07 Thread Richard Biener

On Wed, 7 Feb 2018, Jakub Jelinek wrote:

> On Wed, Feb 07, 2018 at 08:06:53AM +0100, Richard Biener wrote:
> > >Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> > 
> > OK. 
> 
> Committed, thanks.
> 
> > I wonder why we have to re-implement all this in DOM.  there's enough of
> > match and simplify interfaces to make it use that?
> 
> It would certainly be nicer to be able to just use match.pd here.
> I guess the intent is only do it if it folds into a constant.  While we
> could use a valueize hook that would return the same SSA_NAME for both
> values, wonder if gimplify-match.c has some API to create stmts in a
> sequence only instead of replacing the stmt(s) in the IL that we could use,
> test if it is a constant and use it, otherwise throw away.

Yes it does.  SCCVN for example uses it - not 100% "nice" but you
can look at vn_nary_build_or_lookup_1 for example.  The interface
is using gimple_resimplify[123], passing a NULL gimple_seq (never
allow any extra stmts as simplification) and expanded ops/code.
The simplification result is in the same expanded form and you
can use gimple_simplified_result_is_gimple_val to check if it
is a singleton and then is_gimple_min_invariant if it is constant.
But maybe we can also allow SSA names as result, like from
a - b + b or so.

There's of course the gimple_fold_stmt_to_constant[_1] API that
you can also use on existing gimple stmts which will return
a is_gimple_val as well (and will valueize things in the _1 variant
with the appropriate callbacks).  Not sure if everything DOM
wants to simplify is readily available in the IL though.  If not,
the above interface is the correct thing to use.

Well - or simply use fold_{unary,binary} ...

Richard.

Re: [PATCH] Small -ftrapv improvement

2018-02-07 Thread Richard Biener

On Tue, 6 Feb 2018, Jakub Jelinek wrote:

> Hi!
> 
> As mentioned on IRC, operation_could_trap_helper_p returns true
> for division or modulo with -ftrapv; the operations could trap in certain
> cases (e.g. division by -1 of minimum signed value, but we don't have any
> library functions for division/modulo for -ftrapv nor instrument it in any
> way).  The following just makes it match what we implement.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

> 2018-02-06  Jakub Jelinek  
> 
>   * tree-eh.c (operation_could_trap_helper_p): Ignore honor_trapv for
>   *DIV_EXPR and *MOD_EXPR.
> 
> --- gcc/tree-eh.c.jj  2018-02-01 11:07:24.0 +0100
> +++ gcc/tree-eh.c 2018-02-03 17:52:40.57329 +0100
> @@ -2436,7 +2436,7 @@ operation_could_trap_helper_p (enum tree
>  case ROUND_MOD_EXPR:
>  case TRUNC_MOD_EXPR:
>  case RDIV_EXPR:
> -  if (honor_snans || honor_trapv)
> +  if (honor_snans)
>   return true;
>if (fp_operation)
>   return flag_trapping_math;
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

[PATCH] Fix PRs 84204, 84205 and 84223

2018-02-07 Thread Richard Biener


Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2018-02-07  Richard Biener  

PR tree-optimization/84204
* tree-chrec.c (chrec_fold_plus_1): Remove size limiting in
this place.

* gcc.dg/graphite/pr84204.c: New testcase.

Index: gcc/tree-chrec.c
===
--- gcc/tree-chrec.c(revision 257382)
+++ gcc/tree-chrec.c(working copy)
@@ -375,12 +375,10 @@ chrec_fold_plus_1 (enum tree_code code,
 
default:
  {
-   int size = 0;
-   if ((tree_contains_chrecs (op0, )
-|| tree_contains_chrecs (op1, ))
-   && size < PARAM_VALUE (PARAM_SCEV_MAX_EXPR_SIZE))
+   if (tree_contains_chrecs (op0, NULL)
+   || tree_contains_chrecs (op1, NULL))
  return build2 (code, type, op0, op1);
-   else if (size < PARAM_VALUE (PARAM_SCEV_MAX_EXPR_SIZE))
+   else
  {
if (code == POINTER_PLUS_EXPR)
  return fold_build_pointer_plus (fold_convert (type, op0),
@@ -390,8 +388,6 @@ chrec_fold_plus_1 (enum tree_code code,
  fold_convert (type, op0),
  fold_convert (type, op1));
  }
-   else
- return chrec_dont_know;
  }
}
 }
Index: gcc/testsuite/gcc.dg/graphite/pr84204.c
===
--- gcc/testsuite/gcc.dg/graphite/pr84204.c (nonexistent)
+++ gcc/testsuite/gcc.dg/graphite/pr84204.c (working copy)
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -floop-parallelize-all -fno-tree-loop-im --param 
scev-max-expr-size=3" } */
+
+int oc;
+
+void
+mo (int xd)
+{
+  while (xd < 1)
+{
+  for (oc = 0; oc < 2; ++oc)
+   {
+   }
+
+  ++xd;
+}
+}

2018-02-07  Richard Biener  

PR tree-optimization/84205
* graphite-isl-ast-to-gimple.c (binary_op_to_tree): Also
special-case isl_ast_op_zdiv_r.

* gcc.dg/graphite/pr84205.c: New testcase.

Index: gcc/graphite-isl-ast-to-gimple.c
===
--- gcc/graphite-isl-ast-to-gimple.c(revision 257382)
+++ gcc/graphite-isl-ast-to-gimple.c(working copy)
@@ -327,6 +327,7 @@ binary_op_to_tree (tree type, __isl_take
  we cannot represent explicitely but that are no-ops for TYPE.
  Elide those.  */
   if ((expr_type == isl_ast_op_pdiv_r
+   || expr_type == isl_ast_op_zdiv_r
|| expr_type == isl_ast_op_add)
   && isl_ast_expr_get_type (arg_expr) == isl_ast_expr_int
   && (wi::exact_log2 (widest_int_from_isl_expr_int (arg_expr))
Index: gcc/testsuite/gcc.dg/graphite/pr84205.c
===
--- gcc/testsuite/gcc.dg/graphite/pr84205.c (nonexistent)
+++ gcc/testsuite/gcc.dg/graphite/pr84205.c (working copy)
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O -floop-nest-optimize -ftree-pre -fno-tree-loop-im" } */
+
+long long unsigned int od;
+int zj, fk, ea;
+
+void
+ke (void)
+{
+  if (od != 0 && zj != 0)
+{
+  for (fk = 0; fk < 2; ++fk)
+   {
+   }
+
+  if (od == (long long unsigned int) zj)
+   zj = 0;
+
+  for (ea = 0; ea < 2; ++ea)
+   {
+   }
+}
+}

2018-02-07  Richard Biener  

PR tree-optimization/84223
* graphite-scop-detection.c (gather_bbs::before_dom_children):
Only add conditions from within the region.
(gather_bbs::after_dom_children): Adjust.

* gfortran.dg/graphite/pr84223.f90: New testcase.

Index: gcc/graphite-scop-detection.c
===
--- gcc/graphite-scop-detection.c   (revision 257382)
+++ gcc/graphite-scop-detection.c   (working copy)
@@ -1453,18 +1453,19 @@ gather_bbs::before_dom_children (basic_b
}
 }
 
-  gcond *stmt = single_pred_cond_non_loop_exit (bb);
-
-  if (stmt)
+  if (gcond *stmt = single_pred_cond_non_loop_exit (bb))
 {
   edge e = single_pred_edge (bb);
-
-  conditions.safe_push (stmt);
-
-  if (e->flags & EDGE_TRUE_VALUE)
-   cases.safe_push (stmt);
-  else
-   cases.safe_push (NULL);
+  /* Make sure the condition is in the region and thus was verified
+ to be handled.  */
+  if (e != region->region.entry)
+   {
+ conditions.safe_push (stmt);
+ if (e->flags & EDGE_TRUE_VALUE)
+   cases.safe_push (stmt);
+ else
+   cases.safe_push (NULL);
+   }
 }
 
   scop->scop_info->bbs.safe_push (bb);
@@ -1509,8 +1510,12 @@ gather_bbs::after_dom_children (basic_bl
 
   if (single_pred_cond_non_loop_exit (bb))
 {
-  conditions.pop ();
-  cases.pop ();
+  edge e = single_pred_edge (bb);
+

[PATCH,avr] Fix PR84209 code issue due to post-reload split of SP into 2 QI regs

2018-02-07 Thread Denis Chertykov

Committed.

2018-01-09  Georg-Johann Lay  

PR target/84209
* config/avr/avr.h (GENERAL_REGNO_P, GENERAL_REG_P): New macros.
* config/avr/avr.md: Only post-reload split REG-REG moves if
either register is REGERAL_REG_P.



--- config/avr/avr.h (revision 257384)
+++ config/avr/avr.h (working copy)
@@ -153,6 +153,9 @@ These two properties are reflected by bu

 #define FIRST_PSEUDO_REGISTER 36

+#define GENERAL_REGNO_P(N) IN_RANGE (N, 2, 31)
+#define GENERAL_REG_P(X) (REG_P (X) && GENERAL_REGNO_P (REGNO (X)))
+
 #define FIXED_REGISTERS {\
   1,1,/* r0 r1 */\
   0,0,/* r2 r3 */\
--- config/avr/avr.md (revision 257384)
+++ config/avr/avr.md (working copy)
@@ -3362,6 +3362,8 @@ (define_split
 (match_operand:HI 1 "reg_or_0_operand"))]
   "optimize
&& reload_completed
+   && GENERAL_REG_P (operands[0])
+   && (operands[1] == const0_rtx || GENERAL_REG_P (operands[1]))
&& (!AVR_HAVE_MOVW
|| const0_rtx == operands[1])"
   [(set (match_dup 2) (match_dup 3))

Patch ping

2018-02-07 Thread Jakub Jelinek

Hi!

I'd like to ping 2 patches:

PR83659 cxx_fold_indirect_ref ICE fix
  http://gcc.gnu.org/ml/gcc-patches/2018-01/msg02211.html

PR83708 __VA_OPT__ assorted fixes
  http://gcc.gnu.org/ml/gcc-patches/2018-01/msg00727.html

Jakub

Re: PR 84154: Fix checking -mibt and -mshstk options for control flow protection

2018-02-07 Thread Jakub Jelinek

On Tue, Feb 06, 2018 at 11:14:08AM +, Tsimbalist, Igor V wrote:
> Coincidentally, I have worked on the same patch. Please look at the patch, I 
> uploaded it to the bug. The main differences are
> 
> - updated the output messages to be more informative;
> - updated  the tests and add couple of new tests to check the messages;
> - fixed a typo in the doc file related to fcf-protection;
> 
> I am ok with the changes in i386.c but would like to update the messages. 
> Could you incorporate my changes and proceed? Or would you like me to finish 
> the fix?

The r257414 change broke bootstrap on both x86_64-linux and i686-linux,
in both the bootstrap ICEs during libitm build of aatree.cc:
/.../gcc/obj70/./gcc/xg++ -B/.../gcc/obj70/./gcc/ -nostdinc++ -nostdinc++ 
-I/.../gcc/obj70/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu 
-I/.../gcc/obj70/x86_64-pc-linux-gnu/libstdc++-v3/include 
-I/.../gcc/libstdc++-v3/libsupc++ -I/.../gcc/libstdc++-v3/include/backward 
-I/.../gcc/libstdc++-v3/testsuite/util 
-L/.../gcc/obj70/x86_64-pc-linux-gnu/libstdc++-v3/src 
-L/.../gcc/obj70/x86_64-pc-linux-gnu/libstdc++-v3/src/.libs 
-L/.../gcc/obj70/x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs 
-B/.../gcc/obj70/x86_64-pc-linux-gnu/libstdc++-v3/src/.libs 
-B/.../gcc/obj70/x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs 
-B/usr/local/x86_64-pc-linux-gnu/bin/ -B/usr/local/x86_64-pc-linux-gnu/lib/ 
-isystem /usr/local/x86_64-pc-linux-gnu/include -isystem 
/usr/local/x86_64-pc-linux-gnu/sys-include -DHAVE_CONFIG_H -I. 
-I../../../libitm -I../../../libitm/config/linux/x86 
-I../../../libitm/config/linux -I../../../libitm/config/x86 
-I../../../libitm/config/posix -I../../../libitm/config/generic 
-I../../../libitm -mrtm -Wall -pthread -Werror -fcf-protection -mcet 
-std=gnu++0x -funwind-tables -fno-exceptions -fno-rtti -fabi-version=4 -g -O2 
-D_GNU_SOURCE -MT aatree.lo -MD -MP -MF .deps/aatree.Tpo -c 
../../../libitm/aatree.cc  -fPIC -DPIC -o .libs/aatree.o
/.../gcc/obj70/gcc/include/ia32intrin.h:56:28: internal compiler error: in 
ix86_option_override_internal, at config/i386/i386.c:4948

The problem is that the enum has
{CF_NONE, CF_BRANCH, CF_RETURN, CF_FULL, CF_SET}
values and the hook does:
  opts->x_flag_cf_protection =
(cf_protection_level) (opts->x_flag_cf_protection | CF_SET);
after the switch (note, bad formatting), and if called again, we end up with
e.g. CF_FULL | CF_SET value which the switch doesn't handle.

It isn't clear what you want to do, either ignore the CF_SET | ...
values in the switch instead of gcc_unreachable on them, or
switch (flag_cf_protection & ~CF_SET)
or something similar.

Jakub

Re: [PATCH] Use -fcf-protection=return in cet-intrin-4.c

2018-02-07 Thread Uros Bizjak

On Tue, Feb 6, 2018 at 10:14 PM, Tsimbalist, Igor V
 wrote:
>> -Original Message-
>> From: Lu, Hongjiu
>> Sent: Tuesday, February 6, 2018 10:03 PM
>> To: gcc-patches@gcc.gnu.org
>> Cc: Uros Bizjak ; Tsimbalist, Igor V
>> 
>> Subject: [PATCH] Use -fcf-protection=return in cet-intrin-4.c
>>
>> Since -fcf-protection requires both -mshstk and -mibt, use
>> -fcf-protection=return with -mshstk in cet-intrin-4.c.
>>
>> OK for trunk?
>
> Ok from CET viewpoint.

OK.

Thanks,
Uros.

> Igor
>
>> H.J.
>> --
>>   PR target/84243
>>   * gcc.target/i386/cet-intrin-4.c (dg-options): Use
>>   -fcf-protection=return.
>> ---
>>  gcc/testsuite/gcc.target/i386/cet-intrin-4.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/gcc/testsuite/gcc.target/i386/cet-intrin-4.c
>> b/gcc/testsuite/gcc.target/i386/cet-intrin-4.c
>> index 76ec160543f..437a4cd690c 100644
>> --- a/gcc/testsuite/gcc.target/i386/cet-intrin-4.c
>> +++ b/gcc/testsuite/gcc.target/i386/cet-intrin-4.c
>> @@ -1,5 +1,5 @@
>>  /* { dg-do compile } */
>> -/* { dg-options "-O -fcf-protection -mshstk" } */
>> +/* { dg-options "-O -fcf-protection=return -mshstk" } */
>>  /* { dg-final { scan-assembler "rdsspd|incsspd\[ \t]+(%|)eax" { target ia32 
>> } } }
>> */
>>  /* { dg-final { scan-assembler "rdssp\[dq]\[ \t]+(%|)\[re]ax"  { target { ! 
>> ia32 } }
>> } } */
>>  /* { dg-final { scan-assembler "incssp\[dq]\[ \t]+(%|)\[re]di" { target { ! 
>> ia32 } }
>> } } */
>> --
>> 2.14.3
>

Re: [PATCH] i386: Mask out the CF_SET bit for -fcf-protection check

2018-02-07 Thread Uros Bizjak

On Tue, Feb 6, 2018 at 10:20 PM, Tsimbalist, Igor V
 wrote:
>> -Original Message-
>> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
>> ow...@gcc.gnu.org] On Behalf Of H.J. Lu
>> Sent: Tuesday, February 6, 2018 10:09 PM
>> To: gcc-patches@gcc.gnu.org
>> Cc: Uros Bizjak ; Tsimbalist, Igor V
>> 
>> Subject: [PATCH] i386: Mask out the CF_SET bit for -fcf-protection check
>>
>> Since ix86_option_override_internal sets the CF_SET bit in
>> flag_cf_protection and it can be called more than once via pragma,
>> we need to mask out the CF_SET bit when checking flag_cf_protection.
>>
>> OK for trunk if there is no regression?
>
> Ok from CET viewpoint.

OK.

Thanks,
Uros.

> Thanks,
> Igor
>
>> H.J.
>> ---
>>   PR target/84248
>>   * config/i386/i386.c (ix86_option_override_internal): Mask out
>>   the CF_SET bit when checking -fcf-protection.
>> ---
>>  gcc/config/i386/i386.c | 10 +-
>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>> index 6c612c77987..ef7ff89bcbb 100644
>> --- a/gcc/config/i386/i386.c
>> +++ b/gcc/config/i386/i386.c
>> @@ -4913,12 +4913,12 @@ ix86_option_override_internal (bool
>> main_args_p,
>>= build_target_option_node (opts);
>>
>>/* Do not support control flow instrumentation if CET is not enabled.  */
>> -  if (opts->x_flag_cf_protection != CF_NONE)
>> +  cf_protection_level cf_protection
>> += (cf_protection_level) (opts->x_flag_cf_protection & ~CF_SET);
>> +  if (cf_protection != CF_NONE)
>>  {
>> -  switch (flag_cf_protection)
>> +  switch (cf_protection)
>>   {
>> - case CF_NONE:
>> -   break;
>>   case CF_BRANCH:
>> if (! TARGET_IBT_P (opts->x_ix86_isa_flags2))
>>   {
>> @@ -4953,7 +4953,7 @@ ix86_option_override_internal (bool main_args_p,
>>   }
>>
>>opts->x_flag_cf_protection =
>> - (cf_protection_level) (opts->x_flag_cf_protection | CF_SET);
>> + (cf_protection_level) (cf_protection | CF_SET);
>>  }
>>
>>if (ix86_tune_features [X86_TUNE_AVOID_128FMA_CHAINS])
>> --
>> 2.14.3
>

Re: [PATCH] Fix floating point handling in dom (PR tree-optimization/84235)

2018-02-07 Thread Jakub Jelinek

On Wed, Feb 07, 2018 at 08:06:53AM +0100, Richard Biener wrote:
> >Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> OK. 

Committed, thanks.

> I wonder why we have to re-implement all this in DOM.  there's enough of
> match and simplify interfaces to make it use that?

It would certainly be nicer to be able to just use match.pd here.
I guess the intent is only do it if it folds into a constant.  While we
could use a valueize hook that would return the same SSA_NAME for both
values, wonder if gimplify-match.c has some API to create stmts in a
sequence only instead of replacing the stmt(s) in the IL that we could use,
test if it is a constant and use it, otherwise throw away.

Jakub

Re: [PATCH] PR fortran/82049 -- resolve a charlen if possible

2018-02-07 Thread Paul Richard Thomas

Hi Steve,

That's OK for trunk and, if you are possessed of the intestinal
fortitude, 6- and 7-branches.

Thanks

Paul


On 7 February 2018 at 02:17, Steve Kargl
 wrote:
> The attached patch fixes PR fortran/82049.  Prior to this
> patch, gfortran was fouling up the resolution of the charlen
> expression in the testcase.  The immediately tries to resolve
> the length while parsing the type-spec.
>
> While here, I've introduced an optimization that causes
> gfc_match_type_spec() to return early if the gfortran isn't
> going to be getting a type-spec.  This is done be peeking
> at the next character, if it is in [a-z], the we don't have
> a type spec.  OK to commit?
>
>
> 2018-02-06  Steven G. Kargl  
>
> PR fortran/82049
> * match.c (gfc_match_type_spec): If the charlen is non-NULL, then
> try to resolve it.  While here return early if possible.
>
> 2018-02-06  Steven G. Kargl  
>
> PR fortran/82049
> * gfortran.dg/assumed_charlen_parameter.f90: New test.
>
>
> --
> Steve



-- 
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein

89 matches

Mail list logo