date:20211129

On Mon, 29 Nov 2021, Martin Sebor wrote:

> On 11/29/21 11:53 AM, Martin Sebor wrote:
> > On 11/29/21 6:09 AM, Richard Biener via Gcc-patches wrote:
> >> This removes more cases that appear when bootstrap with
> >> -Wunreachable-code-return progresses.
> >>
> > ...
> >> diff --git a/gcc/sel-sched-ir.h b/gcc/sel-sched-ir.h
> >> index 8ee0529d5a8..18e03c4cb96 100644
> >> --- a/gcc/sel-sched-ir.h
> >> +++ b/gcc/sel-sched-ir.h
> >> @@ -1493,8 +1493,6 @@ bb_next_bb (basic_block bb)
> >>   default:
> >>     return bb->next_bb;
> >>   }
> >> -
> >> -  gcc_unreachable ();
> >>   }
> > 
> > Just skiming the changes out of curiosity, this one makes me
> > wonder if the warning shouldn't be taught to avoid triggering
> > on calls to __builtin_unreachable().  They can help make code
> > more readable (e.g., after a case and switch statement that
> > handles all values).
> 
> I see someone else raised the same question in a patch I hadn't
> gotten to yet:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585431.html
> 
> If you do end up removing the gcc_unreachable() calls, I would
> suggest to replace them with a comment so as not to lose
> the readability benefit.
> 
> But I still wonder if it might make sense to teach the warning
> not just about __builtin_unreachable() but also about noreturn
> calls like abort() that (as you explained in the thread above)
> gcc_unreachable() expands to.  Is there a benefit to warning
> on such calls?

I'm not sure.  I've chosen to eliminate only the "obvious"
cases, like above where there's a default: that returns immediately
visible (not always in the patch context).  I've left some in
the code base where that's not so obvious.

IMHO making the flow obvious without a unreachable marker is
superior to obfuscating it and clearing that up with one.

Yes, I thought about not diagnosing things like

   return 1;
   return 1;

but then what about

   return 1;
   return 0;

?  I've seen cases like

   gcc_unreachable ();
   return 0;

was that meant to be

   return 0;
   gcc_unreachable ();

?  So it's not entirely clear.  I think that if there was a way
to denote definitive 'code should not reach here' function
(a new attribute?) then it would make sense to not warn about
control flow not reaching that.  But then it would make sense
to warn about stmts following such annotation.

Richard.

Re: [PATCH] tree-optimization/103440 - Always track arguments, even when ignoring equiv params.

On Mon, Nov 29, 2021 at 6:20 PM Andrew MacLeod via Gcc-patches
 wrote:
>
> I need to adjust the original patch, I shouldn't have continued the loop
> when dealing with equivalences.An equivalence is not the same as an
> undefined value...  we might be able to ignore the range from it for
> calculation purposes, but we cannot ignore the fact that it is a
> different SSA_NAME and may contain a different value that we do care about.
>
> There are other checks in then loop which will allow us to assign an
> equivalence between the DEF and an argument if we are ignoring the other
> ssa_names..  such as when there are undefined values.
>
> a_3 = PHI 
>
> if a_2 is undefined, we can create an equivalence between a_3 and a_1 as
> the value of a_2 is irrelevant and can be whatever we want it to be.
>
> if a_2 is instead an equivalence with a_3, we do not want to create an
> equivalence between a_3 and a_1 in this block as we may then turn it
> into a copy.. we'd only be able to do this if there was an equivalence
> between a_1 and a_2, and we are not checking that.
>
> Although we are may not be adding the range for a_2 into the cumulative
> knowledge of a_3's range, we do need to keep the edge to retain the copy
> as its value is important and could be different than the other
> argument... and we need to retain the copy when we go out of ssa.
>
> This fixes that oversight, bootstrapped on x86_64-pc-linux-gnu with no
> regressions.  OK?

OK.

> Caveat..  the test case has an infinite loop without this fix, but
> degagnu doesn't seem to kill it, and my test suite runs go forever.. if
> there something I am missing?   The gcc.log claims that it timeouts
> after 300 second, but it doesn't kill the executable. I set the
> dg-timeout field, but that appears to be a compile time timeout, not
> runtime anyway.

Might be an old issue with your dejagnu, it should work.

Richard.

>
> Andrew

Re: [PATCH 1/4] Canonicalize argument order for commutative functions

On Mon, Nov 29, 2021 at 4:40 PM Richard Sandiford
 wrote:
>
> Sorry for the slow response, was away last week.
>
> Richard Biener  writes:
> > On Wed, Nov 10, 2021 at 1:50 PM Richard Sandiford via Gcc-patches
> >  wrote:
> >>
> >> This patch uses information about internal functions to canonicalize
> >> the argument order of calls.
> >>
> >> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
> >
> > OK.  Note the gimple_resimplifyN functions also canonicalize operand
> > order, currently for is_tree_code only:
> >
> >   /* Canonicalize operand order.  */
> >   bool canonicalized = false;
> >   if (res_op->code.is_tree_code ()
> >   && (TREE_CODE_CLASS ((enum tree_code) res_op->code) == tcc_comparison
> >   || commutative_tree_code (res_op->code))
> >   && tree_swap_operands_p (res_op->ops[0], res_op->ops[1]))
> > {
> >   std::swap (res_op->ops[0], res_op->ops[1]);
> >   if (TREE_CODE_CLASS ((enum tree_code) res_op->code) == tcc_comparison)
> > res_op->code = swap_tree_comparison (res_op->code);
> >   canonicalized = true;
> > }
> >
> > that's maybe not the best place.  The function assumes the operands
> > are already valueized,
> > so it maybe should be valueization that does the canonicalization -
> > but I think doing it
> > elsewhere made operand order unreliable (we do end up with
> > non-canonical order in
> > the IL sometimes).
> >
> > So maybe you should amend the code in resimplifyN as well.
>
> Hmm, yeah, thanks for the heads up.  Does this updated version look OK?
> Tested as before.

Yes - OK.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/
> * gimple-fold.c: Include internal-fn.h.
> (fold_stmt_1): If a function maps to an internal one, use
> first_commutative_argument to canonicalize the order of
> commutative arguments.
> * gimple-match-head.c (gimple_resimplify2, gimple_resimplify3)
> (gimple_resimplify4, gimple_resimplify5): Extend commutativity
> checks to functions.
>
> gcc/testsuite/
> * gcc.dg/fmax-fmin-1.c: New test.
> ---
>  gcc/gimple-fold.c  | 25 --
>  gcc/gimple-match-head.c| 52 --
>  gcc/testsuite/gcc.dg/fmax-fmin-1.c | 18 +++
>  3 files changed, 75 insertions(+), 20 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/fmax-fmin-1.c
>
> diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
> index 44fba12e150..1d8fd74f72c 100644
> --- a/gcc/gimple-fold.c
> +++ b/gcc/gimple-fold.c
> @@ -69,6 +69,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "varasm.h"
>  #include "memmodel.h"
>  #include "optabs.h"
> +#include "internal-fn.h"
>
>  enum strlen_range_kind {
>/* Compute the exact constant string length.  */
> @@ -6109,18 +6110,36 @@ fold_stmt_1 (gimple_stmt_iterator *gsi, bool inplace, 
> tree (*valueize) (tree))
>break;
>  case GIMPLE_CALL:
>{
> -   for (i = 0; i < gimple_call_num_args (stmt); ++i)
> +   gcall *call = as_a (stmt);
> +   for (i = 0; i < gimple_call_num_args (call); ++i)
>   {
> -   tree *arg = gimple_call_arg_ptr (stmt, i);
> +   tree *arg = gimple_call_arg_ptr (call, i);
> if (REFERENCE_CLASS_P (*arg)
> && maybe_canonicalize_mem_ref_addr (arg))
>   changed = true;
>   }
> -   tree *lhs = gimple_call_lhs_ptr (stmt);
> +   tree *lhs = gimple_call_lhs_ptr (call);
> if (*lhs
> && REFERENCE_CLASS_P (*lhs)
> && maybe_canonicalize_mem_ref_addr (lhs))
>   changed = true;
> +   if (*lhs)
> + {
> +   combined_fn cfn = gimple_call_combined_fn (call);
> +   internal_fn ifn = associated_internal_fn (cfn, TREE_TYPE (*lhs));
> +   int opno = first_commutative_argument (ifn);
> +   if (opno >= 0)
> + {
> +   tree arg1 = gimple_call_arg (call, opno);
> +   tree arg2 = gimple_call_arg (call, opno + 1);
> +   if (tree_swap_operands_p (arg1, arg2))
> + {
> +   gimple_call_set_arg (call, opno, arg2);
> +   gimple_call_set_arg (call, opno + 1, arg1);
> +   changed = true;
> + }
> + }
> + }
> break;
>}
>  case GIMPLE_ASM:
> diff --git a/gcc/gimple-match-head.c b/gcc/gimple-match-head.c
> index c481a625581..2d9364ca5de 100644
> --- a/gcc/gimple-match-head.c
> +++ b/gcc/gimple-match-head.c
> @@ -294,18 +294,16 @@ gimple_resimplify2 (gimple_seq *seq, gimple_match_op 
> *res_op,
>
>/* Canonicalize operand order.  */
>bool canonicalized = false;
> -  if (res_op->code.is_tree_code ())
> +  bool is_comparison
> += (res_op->code.is_tree_code ()
> +   && TREE_CODE_CLASS (tree_code (res_op->code)) == tcc_comparison);
> +  if ((is_comparison || commutative_binary_op_p (res_op->code, res_op->type))
> +

Re: [PATCH] introduce predicate analysis class

On Mon, Nov 29, 2021 at 4:40 PM Martin Sebor  wrote:
>
> On 11/25/21 3:18 AM, Richard Biener wrote:
> > On Mon, Aug 30, 2021 at 10:06 PM Martin Sebor via Gcc-patches
> >  wrote:
> >>
> >> The predicate analysis subset of the tree-ssa-uninit pass isn't
> >> necessarily specific to the detection of uninitialized reads.
> >> Suitably parameterized, the same core logic could be used in
> >> other warning passes to improve their S/N ratio, or issue more
> >> nuanced diagnostics (e.g., when an invalid access cannot be
> >> ruled out but also need not in reality be unavoidable, issue
> >> a "may be invalid" type of warning rather than "is invalid").
> >>
> >> Separating the predicate analysis logic from the uninitialized
> >> pass and exposing a narrow API should also make it easier to
> >> understand and evolve each part independently of the other,
> >> or replace one with a better implementation without modifying
> >> the other.(*)
> >>
> >> As the first step in this direction, the attached patch extracts
> >> the predicate analysis logic out of the pass, turns the interface
> >> into public class members, and hides the internals in either
> >> private members or static functions defined in a new source file.
> >> (**)
> >>
> >> The changes should have no externally observable effect (i.e.,
> >> should cause no changes in warnings), except on the contents of
> >> the uninitialized dump.  While making the changes I enhanced
> >> the dumps to help me follow the logic.  Turning some previously
> >> free-standing functions into members involved changing their
> >> signatures and adjusting their callers.  While making these
> >> changes I also renamed some of them as well some variables for
> >> improved clarity.  Finally, I moved declarations of locals
> >> closer to their point of initialization.
> >>
> >> Tested on x86_64-linux.  Besides the usual bootstrap/regtest
> >> I also tentatively verified the generality of the new class
> >> interfaces by making use of it in -Warray-bounds.  Besides there,
> >> I'd like to make use of it in the new gimple-ssa-warn-access pass
> >> and, longer term, any other flow-sensitive warnings that might
> >> benefit from it.
> >
> > This changed can_chain_union_be_invalidated_p from
> >
> >for (size_t i = 0; i < uninit_pred.length (); ++i)
> >  {
> >pred_chain c = uninit_pred[i];
> >size_t j;
> >for (j = 0; j < c.length (); ++j)
> >  if (can_one_predicate_be_invalidated_p (c[j], use_guard))
> >break;
> >
> >/* If we were unable to invalidate any predicate in C, then there
> >   is a viable path from entry to the PHI where the PHI takes
> >   an uninitialized value and continues to a use of the PHI.  */
> >if (j == c.length ())
> >  return false;
> >  }
> >return true;
> >
> > to
> >
> >for (unsigned i = 0; i < preds.length (); ++i)
> >  {
> >const pred_chain  = preds[i];
> >for (unsigned j = 0; j < chain.length (); ++j)
> >  if (can_be_invalidated_p (chain[j], guard))
> >return true;
> >
> >/* If we were unable to invalidate any predicate in C, then there
> >   is a viable path from entry to the PHI where the PHI takes
> >   an interesting value and continues to a use of the PHI.  */
> >return false;
> >  }
> >return true;
> >
> > which isn't semantically equivalent (it also uses overloading to
> > confuse me).  In particular the old code checked whether an
> > invalidation can happen for _each_ predicate in 'preds' while
> > the new one just checks preds[0], so the loop is pointless.
> >
> > Catched by -Wunreachable-code complaining about unreachable
> > ++i
> >
> > Martin, was that change intended?
>
> I didn't mean to make semantic changes (or to confuse you) so
> the removal of the test is almost certainly an accident.  It's
> interesting that the change hasn't caused any other trouble.
> We should look into that.

Jeff meanwhile approved restoring the original.  But yes, it's
odd that this did not have any visible effect (maybe fixing
the mistake resolved some uninit regressions - but IIRC none
were bisected to that rev.)

Richard.

>
> Martin
>
> >
> > Richard.
> >
> >>
> >> Martin
> >>
> >> [*] A review of open -Wuninitialized bugs I did while working
> >> on this project made me aware of a number of opportunities to
> >> improve the analyzer to reduce the number of false positives
> >> -Wmaybe-uninitiailzed suffers from.
> >>
> >> [**] The class isn't fully general and, like the uninit pass,
> >> only works with PHI nodes.  I plan to generalize it to compute
> >> the set of predicates between any two basic blocks.
>

Re: [PATCH] Remove can_throw_non_call_exceptions special case from operator_div::wi_fold.

On Mon, Nov 29, 2021 at 4:24 PM Aldy Hernandez  wrote:
>
> On Mon, Nov 29, 2021 at 3:48 PM Richard Biener
>  wrote:
> >
> > On Mon, Nov 29, 2021 at 3:39 PM Jeff Law  wrote:
> > >
> > >
> > >
> > > On 11/29/2021 7:00 AM, Aldy Hernandez via Gcc-patches wrote:
> > > > As discussed in the PR.  The code makes no difference, so whatever test
> > > > we added this special case for has been fixed or is being papered over.
> > > > I think we should fix any fall out upstream.
> > > >
> > > > [Unless Andrew can remember why we added this and it still applies.]
> > > >
> > > > Tested on x86-64 Linux.
> > > >
> > > > OK for trunk?
> > > >
> > > >   PR 103451
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > >   * range-op.cc (operator_div::wi_fold): Remove
> > > >   can_throw_non_call_exceptions special case.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > >   * gcc.dg/pr103451.c: New test.
> > > I'll defer to Andrew, but it seems wrong to me.  The whole point is to
> > > set the result to varying so that we don't know the result and never
> > > remove the division which is critical for -fnon-call-exceptions.
> >
> > But that has nothing to do with computing the value range for
> > the result which is only accessible when the stmt does _not_ throw ...
> >
> > That is, if we compute non-VARYING here and because of that
> > remove the stmt then _that's_ the place to fix (IMO)
>
> Ughh, I think you're both right.
>
> We should fix this upstream AND we should test for the presence of the
> division by 0 in the optimized dump.
>
> Of course doing both opens a can of worms.  The division by zero can
> be cleaned up by (at least) DCE, DSE, and the code sinking passes.
> I've fixed all 3 in the attached (untested) patch.  Dunno what y'all
> want to do at this point.

I think you need to add -fno-delete-dead-exceptions to the testcase.
The sinking
bug looks real, but just

 && (cfun->can_delete_dead_exceptions
|| !stmt_could_throw_p (cfun, stmt))

is needed there.  That change is OK.

Thanks,
Richard.

>
> Aldy

Re: [PATCH v2] rs6000: Modify the way for extra penalized cost

2021-11-29 Thread Kewen.Lin via Gcc-patches

Hi Segher,

on 2021/11/30 上午6:06, Segher Boessenkool wrote:
> Hi!
> 
> On Tue, Sep 28, 2021 at 04:16:04PM +0800, Kewen.Lin wrote:
>> This patch follows the discussions here[1][2], where Segher
>> pointed out the existing way to guard the extra penalized
>> cost for strided/elementwise loads with a magic bound does
>> not scale.
>>
>> The way with nunits * stmt_cost can get one much
>> exaggerated penalized cost, such as: for V16QI on P8, it's
>> 16 * 20 = 320, that's why we need one bound.  To make it
>> better and more readable, the penalized cost is simplified
>> as:
>>
>> unsigned adjusted_cost = (nunits == 2) ? 2 : 1;
>> unsigned extra_cost = nunits * adjusted_cost;
> 
>> For V2DI/V2DF, it uses 2 penalized cost for each scalar load
>> while for the other modes, it uses 1.
> 
> So for V2D[IF] we get 4, for V4S[IF] we get 4, for V8HI it's 8, and
> for V16QI it is 16?  Pretty terrible as well, heh (I would expect all
> vector ops to be similar cost).
> 

But for different vector units it has different number of loads, it seems
reasonable to have more costs when it has more loads to be fed into those
limited number of load/store units.

>> It's mainly concluded
>> from the performance evaluations.  One thing might be
>> related is that: More units vector gets constructed, more
>> instructions are used.
> 
> Yes, but how often does that happen, compared to actual vector ops?
> 
> This also suggests we should cost vector construction separately, which
> would pretty obviously be a good thing anyway (it happens often, it has
> a quite different cost structure).
> 

vectorizer does model vector construction separately, there is an enum
vect_cost_for_stmt *vec_construct*, normally it works well.  But for this
bwaves hotspot, it requires us to do some more penalization as evaluated,
so we put the penalized cost onto this special vector construction when
some heuristic thresholds are met.

>> It has more chances to schedule them
>> better (even run in parallelly when enough available units
>> at that time), so it seems reasonable not to penalize more
>> for them.
> 
> Yes.
> 
>> +  /* Don't expect strided/elementwise loads for just 1 nunit.  */
> 
> "We don't expect" etc.
> 

Fixed.

> Okay for trunk.  Thanks!  This probably isn't the last word in this
> story, but it is an improvement in any case :-)
> 
> 

Thanks for the review, rebased/retested and committed as r12-5589.
BR,
Kewen

Re: [PATCH] rs6000: Remove builtin mask check from builtin_decl [PR102347]

2021-11-29 Thread Kewen.Lin via Gcc-patches

Hi Segher,

on 2021/11/30 上午8:16, Segher Boessenkool wrote:
> Hi!
> 
> On Tue, Sep 28, 2021 at 04:13:40PM +0800, Kewen.Lin wrote:
>>  PR target/102347
>>  * config/rs6000/rs6000-call.c (rs6000_builtin_decl): Remove builtin
>>  mask check.
> 
> (Don't wrap lines early please).
> 

Fixed.

> Okay for trunk and all backports.  Thanks!
> 
> 

Thanks for the review, re-based/re-tested and committed as r12-5590.

Will backport it in a week or so.

BR,
Kewen

Re: [PATCH] c++: Allow indeterminate unsigned char or std::byte in bit_cast - P1272R4


On 11/29/21 11:31, Jakub Jelinek wrote:

Hi!

P1272R4 has added to the std::byteswap new stuff to me quite unrelated
clarification for std::bit_cast.
std::bit_cast is already in C++20 though and std::byteswap is a new
C++23 feature, it is unclear to me if that bit_cast addition in there
is really meant for C++23 and later only (e.g. related to only
C++23 actually allowing non-literal vars and thus other spots where
indeterminate values can appear during constant evaluation), or if
it was meant as a DR and should apply to C++20 too.


It's a DR.  Really, it was intended to be part of C++20; at the Cologne 
meeting in 2019 CWG thought byteswap was going to make C++20, so this 
bugfix could go in as part of that paper.


Also, allowing indeterminate values that are never read was in C++20 
(P1331).



The following so far only lightly tested patch (dg.exp=bit-cast*.C
testing) implements it as applying to C++23 only.

Also, it is unclear to me how bitfields should behave, whether having
indeterminate bits in
struct S { unsigned char a : 8, b : 5, c : 3, d, e; unsigned int f : 8, g : 24; 
};
struct T1 { unsigned char a : 1, : 7, b : 5, c : 3, d, e; unsigned int f : 8, g 
: 24; };
struct T2 { unsigned char a : 8, b : 1, : 4, c : 3, d, e; unsigned int f : 8, g 
: 24; };
struct T3 { unsigned char a : 8, b : 5, c : 1, : 2, d, e; unsigned int f : 8, g 
: 24; };
struct T4 { unsigned char a : 8, b : 5, c : 3, d, e; unsigned int f : 1, : 7, g 
: 24; };
constexpr bool f1 () { T1 t = { 0, 0, 0, 0, 0, 0, 0 }; S s = __builtin_bit_cast 
(S, t); return true; }
constexpr bool f2 () { T2 t = { 0, 0, 0, 0, 0, 0, 0 }; S s = __builtin_bit_cast 
(S, t); return true; }
constexpr bool f3 () { T3 t = { 0, 0, 0, 0, 0, 0, 0 }; S s = __builtin_bit_cast 
(S, t); return true; }
constexpr bool f4 () { T4 t = { 0, 0, 0, 0, 0, 0, 0 }; S s = __builtin_bit_cast 
(S, t); return true; }
constexpr bool a = f1 ();
constexpr bool b = f2 ();
constexpr bool c = f3 ();
constexpr bool d = f4 ();
The patch currently uses TREE_TYPE and doesn't check DECL_BIT_FIELD,
so it accepts f1 and f4 cases where the TREE_TYPE is unsigned char
and doesn't accept f2 and f3 where the type is some 5 or 3 bit precision
unsigned type.  But possibly either it should look at the declared
type of the bitfield and don't allow f4 and allow f1-f3, or don't allow
any of the f1-f4 cases.


I think in all of them the result of the cast has (some) indeterminate 
value.  So f1-3 are OK because the indeterminate value has unsigned char 
type and is never used; f4() is non-constant because S::f has 
non-byte-access type and so the new wording says it's undefined.



2021-11-29  Jakub Jelinek 

* constexpr.c (clear_uchar_or_std_byte_in_mask): New function.
(cxx_eval_bit_cast): Don't error about padding bits if target
type is unsigned char or std::byte, instead return no clearing
ctor.  Use clear_uchar_or_std_byte_in_mask.

* g++.dg/cpp2a/bit-cast11.C: New test.
* g++.dg/cpp2a/bit-cast12.C: New test.
* g++.dg/cpp2a/bit-cast13.C: New test.

--- gcc/cp/constexpr.c.jj   2021-11-22 10:13:17.156882931 +0100
+++ gcc/cp/constexpr.c  2021-11-29 15:41:54.611361669 +0100
@@ -4268,6 +4268,59 @@ check_bit_cast_type (const constexpr_ctx
return false;
  }
  
+/* Helper function for cxx_eval_bit_cast.  For unsigned char or

+   std::byte members of CONSTRUCTOR (recursively) if they contain
+   some indeterminate bits (as set in MASK), remove the ctor elts,
+   mark the CONSTRUCTOR as CONSTRUCTOR_NO_CLEARING and clear the
+   bits in MASK.  */
+
+static void
+clear_uchar_or_std_byte_in_mask (tree t, unsigned char *mask)
+{
+  if (TREE_CODE (t) != CONSTRUCTOR)
+return;
+
+  unsigned i, j = 0;
+  tree index, value;
+  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (t), i, index, value)
+{
+  tree type = TREE_TYPE (value);
+  if (is_byte_access_type (type)
+ && TYPE_MAIN_VARIANT (type) != char_type_node)
+   {
+ HOST_WIDE_INT pos;
+ if (TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE)
+   pos = tree_to_shwi (index);
+ else
+   pos = int_byte_position (index);
+ if (mask[pos])
+   {
+ CONSTRUCTOR_NO_CLEARING (t) = 1;
+ mask[pos] = 0;
+ continue;
+   }
+   }
+  if (TREE_CODE (value) == CONSTRUCTOR)
+   {
+ HOST_WIDE_INT pos;
+ if (TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE)
+   pos = tree_to_shwi (index)
+ * tree_to_shwi (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (t;
+ else
+   pos = int_byte_position (index);
+ clear_uchar_or_std_byte_in_mask (value, mask + pos);
+   }
+  if (i != j)
+   {
+ CONSTRUCTOR_ELT (t, j)->index = index;
+ CONSTRUCTOR_ELT (t, j)->value = value;
+   }
+  ++j;
+}
+  if (CONSTRUCTOR_NELTS (t) != j)
+vec_safe_truncate (CONSTRUCTOR_ELTS (t), j);
+}
+
  /* Subroutine of

Re: [PATCH] Fix regression introduced by r12-5536.

2021-11-29 Thread Hongtao Liu via Gcc-patches

On Tue, Nov 30, 2021 at 5:21 AM Uros Bizjak  wrote:
>
> On Mon, Nov 29, 2021 at 10:48 AM Hongtao Liu  wrote:
> >
> > On Mon, Nov 29, 2021 at 3:53 PM Uros Bizjak  wrote:
> > >
> > > On Mon, Nov 29, 2021 at 2:32 AM liuhongt  wrote:
> > > >
> > > > There're several failures reported in [1]:
> > > > 1.  unsupported instruction `pextrw` for "pextrw $0, %xmm31, 16(%rax)"
> > > > %vpextrw should be used in output templates.
> > > > 2. ICE in get_attr_memory for movhi_internal since some alternatives
> > > > are marked as TYPE_SSELOG.
> > > > Explicitly set memory_attr for those alternatives.
> > > >
> > > > Also this patch fixs a typo and some latent bugs which are related to
> > > > moving HImode from/to sse register w/o TARGET_AVX512FP16.
>
> Here are some more fixes:
>
thanks.
> i386: Fix and improve movhi_internal and movhf_internal some more.
>
> An (*v,C) alternative can be added to movhi_internal to directly load
> HImode constant 0 to xmm register. Also, V4SFmode moves can be used
> for xmm->xmm moves instead of TImode moves when optimizing for size.
> Fix invalid %vpinsrw insn template, which needs to duplicate %xmm
> register for AVX targets.
>
> Optimize GPR moves in movhf_internal in the same way as in movhi_internal.
> Fix pinsrw and pextrw templates for AVX targets. Use sselog1
> instead of sselog type.  Also, handle TARGET_SSE_PARTIAL_REG_DEPENDENCY
> and TARGET_SSE_SPLIT_REGS targets.
>
> 2021-11-29  Uroš Bizjak  
>
> gcc/ChangeLog:
>
> PR target/102811
> * config/i386/i386.md (*movhi_internal): Introduce (*v,C) alternative.
> Do not allocate non-GPR registers.  Optimize xmm->xmm moves when
> optimizing for size.  Fix vpinsrw insn template.
> (*movhf_internal): Fix pinsrw and pextrw insn templates for
> AVX targets. Use sselog1 type instead of sselog.  Optimize GPR moves.
> Optimize xmm->xmm moves for TARGET_SSE_PARTIAL_REG_DEPENDENCY
> and TARGET_SSE_SPLIT_REGS targets.
>
> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32} w/ and
> w/o -mf16c.
>
> Pushed to master.
>
> Uros.



-- 
BR,
Hongtao

Re: [RFC][PATCH] c++/46476 - implement -Wunreachable-code-return


On 11/26/21 5:18 AM, Richard Biener via Gcc-patches wrote:

This implements a subset of -Wunreachable-code, unreachable code
after a return stmt.  Contrary to the previous attemt at CFG
construction time this implements the bits during GIMPLE lowering
where there are still all GIMPLE return stmts in the IL.

The lowering phase keeps track of whether stmts can fallthru
which is used to determine if the following stmt is reachable.
The implementation only considers labels here.

The fallthru flag is transparently extended to allow tracking
a reason for non-fallthruness which is used to mark returns.

This patch runs in to the same stray return/gcc_unreachable as the
previous one and thus requires cleanup across the GCC code base
which seems controversical.  So I'm putting this on hold unless
I receive some OK for cleanup in any way, meaning this isn't
going to make stage3.


This isn't meant as an objection to the patch per se, just as
data points suggesting there's room for improvement.  I do think
at least some of those should be considered for GCC 12 if the patch
goes in.  I see just one trivial test which seems a bit light.
I would recommend beefing it up to exercise some the cases below.

I tested the patch with Glibc (no warnings) and Binutils/GDB.
The latter shows over 900 warnings (600 unique ones) in 46
files, so it might be a useful test bed.  Lots of those, maybe
most, are for a break after a return, suggesting that it might
be worthwhile to treat such inoccuous case specially (e.g., only
warn for a break after a return at level 2).

Some other instances suggest other possible improvements.
For example:

/src/binutils-gdb/libiberty/lrealpath.c: In function ‘lrealpath’:
/src/binutils-gdb/libiberty/lrealpath.c:113:3: warning: statement after 
return is not reachable [-Wunreachable-code-return]

  113 |   {
  |   ^
/src/binutils-gdb/libiberty/lrealpath.c:115:5: warning: statement after 
return is not reachable [-Wunreachable-code-return]

  115 | long path_max = pathconf ("/", _PC_PATH_MAX);
  | ^~~~
/src/binutils-gdb/libiberty/lrealpath.c:115:21: warning: statement after 
return is not reachable [-Wunreachable-code-return]

  115 | long path_max = pathconf ("/", _PC_PATH_MAX);
  | ^~~~

I think one of them is a true positive but the others all look
like noise.  None of the locations is very useful.  It might be
something to look into.  I would suggest to point the warning
to the first unreachable statement (other instances point to
it already) and add a note pointing to the statement that makes
the former unreachable.

Another example below shows that the warning triggers more than
once for the same statement, suggestiung it's missing some
suppression (e.g., a call to suppress_warning()).

/src/binutils-gdb/bfd/bfdio.c:167:3: warning: statement after return is 
not reachable [-Wunreachable-code-return]

  167 |   return close_on_exec (fopen (filename, modes));
  |   ^~
/src/binutils-gdb/bfd/bfdio.c:167:10: warning: statement after return is 
not reachable [-Wunreachable-code-return]

  167 |   return close_on_exec (fopen (filename, modes));
  |  ^~~

There are some other "unusual" cases worth a look, such as missing
context of any kind except for like and column:

elfnn-riscv.c:3346:7: warning: statement after return is not reachable 
[-Wunreachable-code-return]
elfnn-riscv.c:3349:7: warning: statement after return is not reachable 
[-Wunreachable-code-return]
elfnn-riscv.c:3352:7: warning: statement after return is not reachable 
[-Wunreachable-code-return]
elfnn-riscv.c:3355:7: warning: statement after return is not reachable 
[-Wunreachable-code-return]


I also tried a few test cases of my own that might be worth
handling at some point (not necessarily in the first iteration).

struct __jmp_buf_tag { };
typedef struct __jmp_buf_tag jmp_buf[1];

void f ();

void test_return ()
{
  return;
  f ();   // warning here (good)
}

extern __attribute__ ((noreturn)) void fnoret ();

void test_noreturn ()
{
  fnoret ();
  f ();   // missing warning
}

void test_throw ()
{
  throw "";
  f ();   // missing warning
}

jmp_buf jmpbuf;

void test_longjmp ()
{
  __builtin_longjmp (jmpbuf, 1);
  f ();   // missing warning
}

Please see a few more comments on the code changes inline.



Sorry.

Richard.

2021-11-26  Richard Biener  

PR c++/46476
gcc/cp/
* decl.c (finish_function): Set input_location to
BUILTINS_LOCATION around the code building the return 0
for main().
* cp-gimplify.c (genericize_if_stmt): Avoid optimizing if (true)
and if (false) when -Wunreachable-code-return is in effect.

gcc/
* common.opt (Wunreachable-code): Re-enable.
(Wunreachable-code-return): New diagnostic, enabled by
-Wextra and -Wunreachable-code.
* doc/invoke.texi (Wunreachable-code): Document.
(Wunreachable-code-return):

[COMMITTED] tree-optimization/103467 - Don't reuse reference after potential resize.

2021-11-29 Thread Andrew MacLeod via Gcc-patches

After making sure the vector is large enough, we use a reference to the 
object through the rest of the function.  One path however requests the 
def chain for a dependant ssa-name, and if that request caused a resize, 
then our reference is no longer valid. On this path, simply use the 
object directly.


Bootstrapped on x86_64-pc-linux-gnu with no regressions.  Pushed as obvious.

Andrew
commit ab202b659dbdfd3a1f45ffe7a5052f35b5e8fa6d
Author: Andrew MacLeod 
Date:   Mon Nov 29 19:53:50 2021 -0500

Don't reuse reference after potential resize.

When a new def chain is requested, any existing reference may no longer
be valid, so just use the object directly.

PR tree-optimization/103467
* gimple-range-gori.cc (range_def_chain::register_dependency): Don't
use an object reference after a potential resize.

diff --git a/gcc/gimple-range-gori.cc b/gcc/gimple-range-gori.cc
index 911d7ac4ec8..0dba34b58c5 100644
--- a/gcc/gimple-range-gori.cc
+++ b/gcc/gimple-range-gori.cc
@@ -278,11 +278,12 @@ range_def_chain::register_dependency (tree name, tree dep, basic_block bb)
 {
   // Get the def chain for the operand.
   b = get_def_chain (dep);
-  // If there was one, copy it into result.
+  // If there was one, copy it into result.  Access def_chain directly
+  // as the get_def_chain request above could reallocate the vector.
   if (b)
-	bitmap_ior_into (src.bm, b);
+	bitmap_ior_into (m_def_chain[v].bm, b);
   // And copy the import list.
-  set_import (src, NULL_TREE, get_imports (dep));
+  set_import (m_def_chain[v], NULL_TREE, get_imports (dep));
 }
   else
 // Originated outside the block, so it is an import.

Re: [PATCH] libcpp: Fix up #__VA_OPT__ handling [PR103415]


On 11/26/21 04:33, Jakub Jelinek wrote:

Hi!

stringify_arg uses pfile->u_buff to create the string literal.
Unfortunately, paste_tokens -> _cpp_lex_direct -> lex_number -> 
_cpp_unaligned_alloc
can in some cases use pfile->u_buff too, which results in losing everything
prepared for the string literal until the token pasting.

The following patch fixes that by not calling paste_token during the
construction of the string literal, but doing that before.  All the tokens
we are processing have been pushed into a token buffer using
tokens_buff_add_token so it is fine if we paste some of them in that buffer
(successful pasting creates a new token in that buffer), move following
tokens if any to make it contiguous, pop (throw away) the extra tokens at
the end and then do stringify_arg.


Please add some of this explanation to the "paste any tokens" comment in 
the code.



Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-11-26  Jakub Jelinek  

PR preprocessor/103415
libcpp/
* macro.c (stringify_arg): Remove va_opt argument and va_opt handling.
(replace_args): Adjust callers.  For #__VA_OPT__, perform token
pasting in a separate loop before stringify_arg call.
gcc/testsuite/
* c-c++-common/cpp/va-opt-8.c: New test.

--- libcpp/macro.c.jj   2021-11-23 10:26:30.595792246 +0100
+++ libcpp/macro.c  2021-11-25 17:55:39.149217299 +0100
@@ -295,7 +295,7 @@ static cpp_context *next_context (cpp_re
  static const cpp_token *padding_token (cpp_reader *, const cpp_token *);
  static const cpp_token *new_string_token (cpp_reader *, uchar *, unsigned 
int);
  static const cpp_token *stringify_arg (cpp_reader *, const cpp_token **,
-  unsigned int, bool);
+  unsigned int);
  static void paste_all_tokens (cpp_reader *, const cpp_token *);
  static bool paste_tokens (cpp_reader *, location_t,
  const cpp_token **, const cpp_token *);
@@ -834,8 +834,7 @@ cpp_quote_string (uchar *dest, const uch
  /* Convert a token sequence FIRST to FIRST+COUNT-1 to a single string token
 according to the rules of the ISO C #-operator.  */
  static const cpp_token *
-stringify_arg (cpp_reader *pfile, const cpp_token **first, unsigned int count,
-  bool va_opt)
+stringify_arg (cpp_reader *pfile, const cpp_token **first, unsigned int count)
  {
unsigned char *dest;
unsigned int i, escape_it, backslash_count = 0;
@@ -852,24 +851,6 @@ stringify_arg (cpp_reader *pfile, const
  {
const cpp_token *token = first[i];
  
-  if (va_opt && (token->flags & PASTE_LEFT))

-   {
- location_t virt_loc = pfile->invocation_location;
- const cpp_token *rhs;
- do
-   {
- if (i == count)
-   abort ();
- rhs = first[++i];
- if (!paste_tokens (pfile, virt_loc, , rhs))
-   {
- --i;
- break;
-   }
-   }
- while (rhs->flags & PASTE_LEFT);
-   }
-
if (token->type == CPP_PADDING)
{
  if (source == NULL
@@ -1945,8 +1926,7 @@ replace_args (cpp_reader *pfile, cpp_has
if (src->flags & STRINGIFY_ARG)
  {
if (!arg->stringified)
- arg->stringified = stringify_arg (pfile, arg->first, arg->count,
-   false);
+ arg->stringified = stringify_arg (pfile, arg->first, arg->count);
  }
else if ((src->flags & PASTE_LEFT)
 || (src != macro->exp.tokens && (src[-1].flags & PASTE_LEFT)))
@@ -2066,11 +2046,49 @@ replace_args (cpp_reader *pfile, cpp_has
{
  unsigned int count
= start ? paste_flag - start : tokens_buff_count (buff);
- const cpp_token *t
-   = stringify_arg (pfile,
-start ? start + 1
-: (const cpp_token **) (buff->base),
-count, true);
+ const cpp_token **first
+   = start ? start + 1
+   : (const cpp_token **) (buff->base);
+ unsigned int i, j;
+
+ /* Paste any tokens that need to be pasted.  */
+ for (i = 0, j = 0; i < count; i++, j++)
+   {
+ const cpp_token *token = first[i];
+
+ if (token->flags & PASTE_LEFT)
+   {
+ location_t virt_loc = pfile->invocation_location;
+ const cpp_token *rhs;
+ unsigned short flags = token->flags;
+ do
+   {
+ if (i == count)
+   abort ();
+ rhs = first[++i];
+

Re: [PATCH] rs6000: Remove builtin mask check from builtin_decl [PR102347]

Hi!

On Tue, Sep 28, 2021 at 04:13:40PM +0800, Kewen.Lin wrote:
>   PR target/102347
>   * config/rs6000/rs6000-call.c (rs6000_builtin_decl): Remove builtin
>   mask check.

(Don't wrap lines early please).

Okay for trunk and all backports.  Thanks!


Segher

Re: [PATCH] rs6000: Remove builtin mask check from builtin_decl [PR102347]

Hi!

On Mon, Oct 11, 2021 at 02:30:42PM +0800, Kewen.Lin wrote:
> > If we do need a band-aid for 10 and 11 (and we do as far as I can see),
> > I'd like to see one for just MMA there, and let all other badness fade
> > into history.  Unless you can convince me (in the patch / commit
> > message) that this is safe :-)
> 
> Just to fix for MMA seems incomplete to me since we can simply
> construct one non-MMA but failed case.  I questioned in the other
> thread, is there any possibility for one invalid target specific
> bif to escape from the function rs6000_expand_builtin?  (note that
> folding won't handle invalid bifs, so invalid bifs won't get folded
> early.)  If no, I think it would be safe.

It is much safer and simpler to not backport anything we do not need to.
You need to come up with a reason why it would be a good idea to do
backport something, especially if it isn't obviously super safe.


Segher

[committed] analyzer: further false leak fixes due to overzealous state merging [PR103217]

2021-11-29 Thread David Malcolm via Gcc-patches

Commit r12-5424-gf573d35147ca8433c102e1721d8c99fc432cb44b fixed a false
positive from -Wanalyzer-malloc-leak due to overzealous state merging,
erroneously merging two different svalues bound to a particular part
of the store when one has sm-state.

A further case was discovered by the reporter of PR analyzer/103217,
which this patch fixes.  In this variant, different states have set
different fields of a struct, and on attempting to merge them, the
states have a different set of binding keys, leading to one state
having an svalue with sm-state, and its peer state having a NULL value
for that binding key.  The state merger code was erroneously treating
them as mergeable to "UNKNOWN".  This followup patch fixes things by
rejecting such mergers if the non-NULL svalue is not mergeable with
"UNKNOWN".

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r12-5585-g132902177138c09803d639e12b1daebf2b9edddc.

gcc/analyzer/ChangeLog:
PR analyzer/103217
* store.cc (binding_cluster::can_merge_p): For the "key is bound"
vs "key is not bound" merger case, check that the bound svalue
is mergeable before merging it to "unknown", rejecting the merger
otherwise.

gcc/testsuite/ChangeLog:
PR analyzer/103217
* gcc.dg/analyzer/pr103217-2.c: New test.
* gcc.dg/analyzer/pr103217-3.c: New test.
* gcc.dg/analyzer/pr103217-4.c: New test.
* gcc.dg/analyzer/pr103217-5.c: New test.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/store.cc  | 14 +-
 gcc/testsuite/gcc.dg/analyzer/pr103217-2.c | 52 ++
 gcc/testsuite/gcc.dg/analyzer/pr103217-3.c | 52 ++
 gcc/testsuite/gcc.dg/analyzer/pr103217-4.c | 52 ++
 gcc/testsuite/gcc.dg/analyzer/pr103217-5.c | 47 +++
 5 files changed, 215 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr103217-2.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr103217-3.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr103217-4.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr103217-5.c

diff --git a/gcc/analyzer/store.cc b/gcc/analyzer/store.cc
index 3760858c26d..5eecbe6cf18 100644
--- a/gcc/analyzer/store.cc
+++ b/gcc/analyzer/store.cc
@@ -1729,6 +1729,7 @@ binding_cluster::can_merge_p (const binding_cluster 
*cluster_a,
   for (hash_set::iterator iter = keys.begin ();
iter != keys.end (); ++iter)
 {
+  region_model_manager *sval_mgr = mgr->get_svalue_manager ();
   const binding_key *key = *iter;
   const svalue *sval_a = cluster_a->get_any_value (key);
   const svalue *sval_b = cluster_b->get_any_value (key);
@@ -1746,7 +1747,6 @@ binding_cluster::can_merge_p (const binding_cluster 
*cluster_a,
}
   else if (sval_a && sval_b)
{
- region_model_manager *sval_mgr = mgr->get_svalue_manager ();
  if (const svalue *merged_sval
  = sval_a->can_merge_p (sval_b, sval_mgr, merger))
{
@@ -1760,9 +1760,19 @@ binding_cluster::can_merge_p (const binding_cluster 
*cluster_a,
   /* If we get here, then one cluster binds this key and the other
 doesn't; merge them as "UNKNOWN".  */
   gcc_assert (sval_a || sval_b);
-  tree type = sval_a ? sval_a->get_type () : sval_b->get_type ();
+
+  const svalue *bound_sval = sval_a ? sval_a : sval_b;
+  tree type = bound_sval->get_type ();
   const svalue *unknown_sval
= mgr->get_svalue_manager ()->get_or_create_unknown_svalue (type);
+
+  /* ...but reject the merger if this sval shouldn't be mergeable
+(e.g. reject merging svalues that have non-purgable sm-state,
+to avoid falsely reporting memory leaks by merging them
+with something else).  */
+  if (!bound_sval->can_merge_p (unknown_sval, sval_mgr, merger))
+   return false;
+
   out_cluster->m_map.put (key, unknown_sval);
 }
 
diff --git a/gcc/testsuite/gcc.dg/analyzer/pr103217-2.c 
b/gcc/testsuite/gcc.dg/analyzer/pr103217-2.c
new file mode 100644
index 000..3a9c4145848
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/pr103217-2.c
@@ -0,0 +1,52 @@
+typedef __SIZE_TYPE__ size_t;
+
+extern void *calloc (size_t __nmemb, size_t __size)
+  __attribute__ ((__nothrow__ , __leaf__, __malloc__, __alloc_size__ (1, 2)));
+
+extern char *strdup (const char *__s)
+  __attribute__ ((__nothrow__ , __leaf__, __malloc__, __nonnull__ (1)));
+
+extern void abort (void)
+  __attribute__ ((__nothrow__ , __leaf__, __noreturn__));
+
+extern int getopt (int ___argc, char *const *___argv, const char *__shortopts)
+  __attribute__ ((__nothrow__ , __leaf__, __nonnull__ (2, 3)));
+extern char *optarg;
+
+extern void free (void *__ptr)
+  __attribute__ ((__nothrow__ , __leaf__));
+
+char *xstrdup(const char *src) {
+   char *val = strdup(src);
+   if (!val)
+   abort();
+   return val;
+}
+
+struct test {
+

Re: [EXTERNAL] Re: [PATCH][WIP] PR tree-optimization/101808 Boolean comparison simplification

2021-11-29 Thread Navid Rahimi via Gcc-patches

Jeff,

Sorry for bringing back this thread.

Quick question, you mentioned checking the TYPE_PRECISION to make sure the type 
is a canonical Boolean type (and not a fancy signed/unsigned Boolean type from 
Ada Andrew mentioned).
But I noticed that truth_valued_p does already check for:
(if (INTEGRAL_TYPE_P (type) && TYPE_PRECISION (type) == 1))

So in this case, there should not any other Boolean type fall into 
truth_valued_p type [1]? Is that right?

1) https://github.com/gcc-mirror/gcc/blob/master/gcc/match.pd#L1717

Best wishes,
Navid.

From: Jeff Law 
Sent: Tuesday, November 23, 2021 12:03
To: Navid Rahimi; Andrew Pinski
Cc: Navid Rahimi via Gcc-patches
Subject: Re: [EXTERNAL] Re: [PATCH][WIP] PR tree-optimization/101808 Boolean 
comparison simplification

On 11/23/2021 12:55 PM, Navid Rahimi wrote:
>> Did you test Ada with this patch as that is where the "odd" boolean
>> types show up?
> No I haven't tested Ada yet. Since it is work in progress still [WIP]. Quick 
> question, to prevent applying this optimization to those odd Boolean types in 
> Ada, there should be a check to check whether it is canonical boolean type or 
> signed/unsigned, which should prevent messing with odd Boolean types in Ada.
IIRC, you should check the type's precision.  THere should be examples
you can find in one or more of the gimple optimizers.

jeff

Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]

2021-11-29 Thread Qing Zhao via Gcc-patches

Peter,

Thanks a lot for the patch.

Richard, how do you think of the patch?

(The major concern for me is:

With the current patch proposed by Peter, we will generate the call to 
.DEFERRED_INIT for a variable with OPAQUE_TYPE during gimplification phase,
 However, if this variable is in register, then the call to 
.DEFERRED_INIT will NOT be expanded during RTL expansion phase.  This 
unexpanded call to .DEFERRED_INIT might cause some potential IR issue later?

 If the above is a real issue, should we skip initialization for all 
OPAQUE_TYPE variables even when they are in memory and can be initialized with 
memset?
then we should update “is_var_need_auto_init” in gimplify.c 
instead.   However, the issue with this approach is, we might miss the 
opportunity to initialize an OPAQUE_TYPE variable if it will be in memory?
).

Thanks.

Qing

> On Nov 29, 2021, at 3:56 PM, Peter Bergner  wrote:
> 
> Sorry for dropping the ball on testing the patch from the bugzilla!
> 
> The following patch fixes the ICE reported in the bugzilla on the pre-existing
> gcc testsuite test case, bootstraps and shows no testsuite regressions
> on powerpc64le-linux.  Ok for trunk?
> 
> Peter
> 
> 
> For -ftrivial-auto-var-init=*, skip initializing the register variable if it
> is an opaque type, because CONST0_RTX(mode) is not defined for opaque modes.
> 
> gcc/
>   PR middle-end/103127
>   * internal-fn.c (expand_DEFERRED_INIT): Skip if VAR_TYPE is opaque.
> 
> diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
> index 0cba95411a6..7cc0e9d5293 100644
> --- a/gcc/internal-fn.c
> +++ b/gcc/internal-fn.c
> @@ -3070,6 +3070,10 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt)
> }
>   else
> {
> +  /* Skip variables of opaque types that are in a register.  */
> +  if (OPAQUE_TYPE_P (var_type))
> + return;
> +
>   /* If this variable is in a register use expand_assignment.
>For boolean scalars force zero-init.  */
>   tree init;

Re: [PATCH] libcpp: Enable P1949R7 for C++11 and up as it was a DR [PR100977]


On 11/29/21 06:24, Jakub Jelinek wrote:

Hi!

Jonathan mentioned on IRC that:
"Accept P1949R7 (C++ Identifier Syntax using Unicode Standard Annex 31) as
a Defect Report and apply the changes therein to the C++ working paper."
while I've actually implemented it only for -std={gnu,c}++{23,2b}.
As the C++98 rules were significantly different, I'm not trying to change
anything for C++98.


I'm inclined to go ahead and change C++98 as well; I doubt anyone is 
relying on the particular C++98 extended character set rules, and we 
already accept the union of the different sets when not pedantic.



So far lightly tested, ok for trunk if it passes full bootstrap/regtest
on x86_64-linux/i686-linux?


OK.


2021-11-29  Jakub Jelinek  

PR c++/100977
* init.c (lang_defaults): Enable cxx23_identifiers for
-std={gnu,c}++{11,14,17,20} too.

* c-c++-common/cpp/ucnid-2011-1-utf8.c: Expect errors in C++.
* c-c++-common/cpp/ucnid-2011-1.c: Likewise.
* g++.dg/cpp/ucnid-4-utf8.C: Add missing space to dg-options.
* g++.dg/cpp23/normalize3.C: Enable for c++11 rather than just c++23.
* g++.dg/cpp23/normalize4.C: Likewise.
* g++.dg/cpp23/normalize5.C: Likewise.
* g++.dg/cpp23/normalize7.C: Expect errors rather than just warnings
for c++11 and up rather than just c++23.
* g++.dg/cpp23/ucnid-2-utf8.C: Expect errors even for c++11 .. c++20.

--- libcpp/init.c.jj2021-11-17 20:08:18.359724792 +0100
+++ libcpp/init.c   2021-11-29 11:40:05.989432952 +0100
@@ -114,14 +114,14 @@ static const struct lang_flags lang_defa
/* STDC2X   */  { 1,  0,  1,  1,  1,  0,1,  1,   1,   0,   0,1, 
1, 1,   1,  0,   1, 1,   0,   1 },
/* GNUCXX   */  { 0,  1,  1,  1,  0,  0,0,  1,   0,   0,   0,0, 
0, 0,   0,  1,   1, 0,   0,   0 },
/* CXX98*/  { 0,  1,  0,  1,  0,  0,1,  1,   0,   0,   0,0, 
0, 1,   0,  0,   1, 0,   0,   0 },
-  /* GNUCXX11 */  { 1,  1,  1,  1,  1,  0,0,  1,   1,   1,   1,0, 
0, 0,   0,  1,   1, 0,   0,   0 },
-  /* CXX11*/  { 1,  1,  0,  1,  1,  0,1,  1,   1,   1,   1,0, 
0, 1,   0,  0,   1, 0,   0,   0 },
-  /* GNUCXX14 */  { 1,  1,  1,  1,  1,  0,0,  1,   1,   1,   1,1, 
1, 0,   0,  1,   1, 0,   0,   0 },
-  /* CXX14*/  { 1,  1,  0,  1,  1,  0,1,  1,   1,   1,   1,1, 
1, 1,   0,  0,   1, 0,   0,   0 },
-  /* GNUCXX17 */  { 1,  1,  1,  1,  1,  0,0,  1,   1,   1,   1,1, 
1, 0,   1,  1,   1, 0,   0,   0 },
-  /* CXX17*/  { 1,  1,  1,  1,  1,  0,1,  1,   1,   1,   1,1, 
1, 0,   1,  0,   1, 0,   0,   0 },
-  /* GNUCXX20 */  { 1,  1,  1,  1,  1,  0,0,  1,   1,   1,   1,1, 
1, 0,   1,  1,   1, 0,   0,   0 },
-  /* CXX20*/  { 1,  1,  1,  1,  1,  0,1,  1,   1,   1,   1,1, 
1, 0,   1,  1,   1, 0,   0,   0 },
+  /* GNUCXX11 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,0, 
0, 0,   0,  1,   1, 0,   0,   0 },
+  /* CXX11*/  { 1,  1,  0,  1,  1,  1,1,  1,   1,   1,   1,0, 
0, 1,   0,  0,   1, 0,   0,   0 },
+  /* GNUCXX14 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,1, 
1, 0,   0,  1,   1, 0,   0,   0 },
+  /* CXX14*/  { 1,  1,  0,  1,  1,  1,1,  1,   1,   1,   1,1, 
1, 1,   0,  0,   1, 0,   0,   0 },
+  /* GNUCXX17 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,1, 
1, 0,   1,  1,   1, 0,   0,   0 },
+  /* CXX17*/  { 1,  1,  1,  1,  1,  1,1,  1,   1,   1,   1,1, 
1, 0,   1,  0,   1, 0,   0,   0 },
+  /* GNUCXX20 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,1, 
1, 0,   1,  1,   1, 0,   0,   0 },
+  /* CXX20*/  { 1,  1,  1,  1,  1,  1,1,  1,   1,   1,   1,1, 
1, 0,   1,  1,   1, 0,   0,   0 },
/* GNUCXX23 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,1, 
1, 0,   1,  1,   1, 0,   1,   1 },
/* CXX23*/  { 1,  1,  1,  1,  1,  1,1,  1,   1,   1,   1,1, 
1, 0,   1,  1,   1, 0,   1,   1 },
/* ASM  */  { 0,  0,  1,  0,  0,  0,0,  0,   0,   0,   0,0, 
0, 0,   0,  0,   0, 0,   0,   0 }
--- gcc/testsuite/c-c++-common/cpp/ucnid-2011-1-utf8.c.jj   2020-01-14 
20:02:46.649611841 +0100
+++ gcc/testsuite/c-c++-common/cpp/ucnid-2011-1-utf8.c  2021-11-29 
12:11:51.640324720 +0100
@@ -2,7 +2,7 @@
  /* { dg-options "-std=c11 -pedantic" { target c } } */
  /* { dg-options "-std=c++11 -pedantic" { target c++ } } */
  
-¨

+¨ /* { dg-error "is not valid in an identifier" "" { target c++ } } */
  
  B̀
  
@@ -11,5 +11,5 @@ B̀

  À /* { dg-warning "not in NFC" } */
  
  

-�
-ሴ
+�  /* { dg-error "is not valid in an identifier" "" { target c++ } } */
+ሴ  /* {

Re: [PATCH] c++: Small incremental tweak to source_location::current() folding


On 11/27/21 03:52, Jakub Jelinek wrote:

On Wed, Nov 24, 2021 at 11:42:28PM +0100, Jakub Jelinek via Gcc-patches wrote:

I'm surprised the source_location::current handling would be needed; why do
calls to that function live long enough for us to walk into the ADDR_EXPR
here?  Maybe we should fold it in cp_fold instead of cp_genericize_r.


I've already committed the patch, but perhaps we shouldn't do it in cp_fold
where it will be folded even for warnings etc. and the locations might not
be the final yet.  This patch moves it to cp_fold_r so that it is done just
once for each function and just once for each static initializer.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


OK.


2021-11-27  Jakub Jelinek  

* cp-gimplify.c (cp_fold_r): Perform folding of
std::source_location::current() calls here...
(cp_fold): ... rather than here.

--- gcc/cp/cp-gimplify.c.jj 2021-11-26 10:10:53.227121384 +0100
+++ gcc/cp/cp-gimplify.c2021-11-26 11:51:50.945204127 +0100
@@ -930,6 +930,13 @@ cp_fold_r (tree *stmt_p, int *walk_subtr
}
break;
  
+case CALL_EXPR:

+  if (tree fndecl = cp_get_callee_fndecl_nofold (stmt))
+   if (DECL_IMMEDIATE_FUNCTION_P (fndecl)
+   && source_location_current_p (fndecl))
+ *stmt_p = stmt = cxx_constant_value (stmt);
+  break;
+
  default:
break;
  }
@@ -2672,14 +2679,6 @@ cp_fold (tree x)
int sv = optimize, nw = sv;
tree callee = get_callee_fndecl (x);
  
-	if (tree fndecl = cp_get_callee_fndecl_nofold (x))

- if (DECL_IMMEDIATE_FUNCTION_P (fndecl)
- && source_location_current_p (fndecl))
-   {
- x = cxx_constant_value (x);
- break;
-   }
-
/* Some built-in function calls will be evaluated at compile-time in
   fold ().  Set optimize to 1 when folding __builtin_constant_p inside
   a constexpr function so that fold_builtin_1 doesn't fold it to 0.  */


Jakub

Re: [PATCH 3/3] c++: P1997 array-copy extensions: Assignment, return, etc. [PR103238]


On 11/21/21 21:51, Will Wray via Gcc-patches wrote:

This second patch completes the work of the first 'array-copy' patch to
provide first-cut implementations of all P1997 features. It adds:

  * Assignments to arrays from array values,a = b;
  * Placeholder auto in array declarations, auto cp[] = a;
  * Array as a return type from functions WIP,  auto f() -> T[N];
  * Parsing of array pseudo-destructors a.~A()
(only parsing for now, untested)


Let's not accept the pseudo-destructor and silently ignore it.  Either 
we should delay the parsing change until the semantics are implemented, 
or parse it and give a sorry().



Assignments a = b were easily allowed by changing branch conditions.
Assignments a = {e...} were trickier (a case not mentioned in P1997):

 int a[16]; a = {0,1,1,2}; a = {}; // assignments from init-lists

The semantics is the same as for struct aggregates:
(1) Aggregate initialization of an rhs array of the lhs type
 (so trailing elements with no initializer are value initialized)
(2) Copy-initialization of the lhs from the rhs.

The special case of an optionally-braced array value is allowed so that
a = b and a = {b} are generally equivalent for same type arrays a and b.
However, the now special-special case of assignment from a braced string-
literal currently only supports exact-match (same as for other arrays):

 char a[4]; a={"c++"} /* OK */; a={"c"} /* FAILs but should work */;

Array return from function is work in progress. The tests show what works.
I'm stuck in unfamiliar territory so it's best to submit what I have to be
reviewed for hints on how to progress.


Let's split out array return into a separate patch for now.  It also has 
ABI implications.



Please try the patch; play, stress it, and report the FAILS.

PR c++/103238

gcc/c/ChangeLog:

* c-decl.c (grokdeclarator): Don't complain of array returns.

gcc/cp/ChangeLog:

* call.c (can_convert_array): Extend to include array inits.
(standard_conversion): No decay for same-type array. Call build_conv.
(implicit_conversion_1): Call reshape_init for arrays too.
* decl.c (grokdeclarator): Don't complain of array returns.
* parser.c (cp_parser_postfix_dot_deref_expression): parse array ~A().
* pt.c (tsubst_function_type): Array type return is not a failure.
(do_auto_deduction): Placeholder auto deduction of array element type.
* tree.c (lvalue_kind): clk_class should include array (I think?).
* typeck.c (cp_build_modify_expr): Call reshape init to strip optional
braces. Allow NOP_EXPR for array assignment.
(convert_for_assignment): New if-block for same-type array convert,
strips optional braces, but rejects STRING_CST rhs shorter than lhs.

gcc/testsuite/ChangeLog:

* g++.dg/init/array-copy10.C: New test. auto[] deduce 'after' PASSes
* g++.dg/init/array-copy11.C: New test. Array return 'before' XFAILs
* g++.dg/init/array-copy12.C: New test. Array return 'after' PASSes
* g++.dg/init/array-copy7.C: New test. Array assign 'before' XFAILs
* g++.dg/init/array-copy8.C: New test. Array assign 'after' PASSes
* g++.dg/init/array-copy9.C: New test. auto[] deduce 'before' XFAILs
---
  gcc/c/c-decl.c   |  2 +-
  gcc/cp/call.c| 43 +++--
  gcc/cp/decl.c|  2 +-
  gcc/cp/parser.c  |  4 +-
  gcc/cp/pt.c  | 13 +-
  gcc/cp/tree.c|  3 +-
  gcc/cp/typeck.c  | 26 +--
  gcc/testsuite/g++.dg/init/array-copy10.C | 57 +++
  gcc/testsuite/g++.dg/init/array-copy11.C | 13 ++
  gcc/testsuite/g++.dg/init/array-copy12.C | 79 
  gcc/testsuite/g++.dg/init/array-copy7.C  | 40 
  gcc/testsuite/g++.dg/init/array-copy8.C  | 56 ++
  gcc/testsuite/g++.dg/init/array-copy9.C  | 57 +++
  13 files changed, 372 insertions(+), 23 deletions(-)

diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 3e28a038095..031c43d189f 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -7055,7 +7055,7 @@ grokdeclarator (const struct c_declarator *declarator,
"returning a function");
type = integer_type_node;
  }
-   if (TREE_CODE (type) == ARRAY_TYPE)
+   if (TREE_CODE (type) == ARRAY_TYPE && !flag_array_copy)
  {
if (name)
  error_at (loc, "%qE declared as function returning an array",
diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 4ee21c7bdbd..c73fb73d86e 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -908,29 +908,34 @@ static bool
  can_convert_array (tree atype, tree from, int flags, tsubst_flags_t complain)
  {
tree elttype = TREE_TYPE (atype);
-

Re: [PATCH v2] combine: Tweak the condition of last_set invalidation

Hi!

On Fri, Jun 11, 2021 at 09:16:21PM +0800, Kewen.Lin wrote:
> >> +/* Should pick up the lowest luid if the references
> >> +   are in the same block.  */
> >> +if (label_tick == rsp->last_set_table_tick
> >> +&& rsp->last_set_table_luid > insn_luid)
> >> +  rsp->last_set_table_luid = insn_luid;
> > 
> > Why?  Is it conservative for the check you will do later?  Please spell
> > this out, it is crucial!
> 
> Since later the combinations involving this insn probably make the
> register be used in one insn sitting ahead (which has smaller luid than
> the one which was recorded before).  Yes, it's very conservative, this
> ensure that we always use the luid of the insn which is the first insn
> using this register in the block.

Why would that be correct?!

> The last_set invalidation is going
> to catch the case like:
> 
>... regX  // avoid the set used here ...
>regX = ...
>...
> 
> Once we have the smallest luid one of all insns which use register X,
> any unsafe regX sets should be caught.

Yes, you invalidate more, but because you put lies in the table :-(

>   * combine.c (struct reg_stat_type): New member
>   last_set_table_luid.

This fits on one line.

>   (update_table_tick): Add one argument for insn luid and
>   set last_set_table_luid with it, remove its declaration.
>   (record_value_for_reg): Adjust the condition to set
>   last_set_invalid nonzero.

These lines break earlier than they should as well.

> +  /* Record the luid of the insn which uses register n, the insn should
> + be the first one using register n in that block of the insn which
> + last_set_table_tick was set for.  */
> +
> +  intlast_set_table_luid;

I'm not sure what this variable is for.  The comment says something
else than the variable name does, and now I don't know what to
believe :-)

The name says it is for a SET, the explanation says it is for a USE.


Segher

Re: [PATCH v2] rs6000: Modify the way for extra penalized cost

Hi!

On Tue, Sep 28, 2021 at 04:16:04PM +0800, Kewen.Lin wrote:
> This patch follows the discussions here[1][2], where Segher
> pointed out the existing way to guard the extra penalized
> cost for strided/elementwise loads with a magic bound does
> not scale.
> 
> The way with nunits * stmt_cost can get one much
> exaggerated penalized cost, such as: for V16QI on P8, it's
> 16 * 20 = 320, that's why we need one bound.  To make it
> better and more readable, the penalized cost is simplified
> as:
> 
> unsigned adjusted_cost = (nunits == 2) ? 2 : 1;
> unsigned extra_cost = nunits * adjusted_cost;

> For V2DI/V2DF, it uses 2 penalized cost for each scalar load
> while for the other modes, it uses 1.

So for V2D[IF] we get 4, for V4S[IF] we get 4, for V8HI it's 8, and
for V16QI it is 16?  Pretty terrible as well, heh (I would expect all
vector ops to be similar cost).

> It's mainly concluded
> from the performance evaluations.  One thing might be
> related is that: More units vector gets constructed, more
> instructions are used.

Yes, but how often does that happen, compared to actual vector ops?

This also suggests we should cost vector construction separately, which
would pretty obviously be a good thing anyway (it happens often, it has
a quite different cost structure).

> It has more chances to schedule them
> better (even run in parallelly when enough available units
> at that time), so it seems reasonable not to penalize more
> for them.

Yes.

> +   /* Don't expect strided/elementwise loads for just 1 nunit.  */

"We don't expect" etc.

Okay for trunk.  Thanks!  This probably isn't the last word in this
story, but it is an improvement in any case :-)

Segher

[PATCH] PR fortran/103473 - [11/12 Regression] ICE in simplify_minmaxloc_nodim, at fortran/simplify.c:5287

Dear all,

another trivial and obvious one, discovered by Gerhard.

We can have a NULL pointer dereference simplifying MINLOC/MAXLOC
on an array that was not properly declared.

OK for mainline / affected 11-branch after regtesting completes?

Thanks,
Harald

From 6bdecd3805eb0d55722992ccb517d08b9bafe605 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Mon, 29 Nov 2021 22:56:30 +0100
Subject: [PATCH] Fortran: error recovery when simplifying MINLOC/MAXLOC

gcc/fortran/ChangeLog:

	PR fortran/103473
	* simplify.c (simplify_minmaxloc_nodim): Avoid NULL pointer
	dereference when shape is not set.

gcc/testsuite/ChangeLog:

	PR fortran/103473
	* gfortran.dg/minmaxloc_15.f90: New test.
---
 gcc/fortran/simplify.c |  3 +++
 gcc/testsuite/gfortran.dg/minmaxloc_15.f90 | 11 +++
 2 files changed, 14 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/minmaxloc_15.f90

diff --git a/gcc/fortran/simplify.c b/gcc/fortran/simplify.c
index c9e13b59da9..fb7b7814603 100644
--- a/gcc/fortran/simplify.c
+++ b/gcc/fortran/simplify.c
@@ -5280,6 +5280,9 @@ simplify_minmaxloc_nodim (gfc_expr *result, gfc_expr *extremum,
   && !mask->value.logical)
 goto finish;

+  if (array->shape == NULL)
+goto finish;
+
   for (i = 0; i < array->rank; i++)
 {
   count[i] = 0;
diff --git a/gcc/testsuite/gfortran.dg/minmaxloc_15.f90 b/gcc/testsuite/gfortran.dg/minmaxloc_15.f90
new file mode 100644
index 000..e4eba3501d5
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/minmaxloc_15.f90
@@ -0,0 +1,11 @@
+! { dg-do compile }
+! PR fortran/103473 - ICE in simplify_minmaxloc_nodim
+! Test case by Gerhard Steinmetz.
+
+subroutine s
+  implicit none
+  integer, parameter :: a(+'1') = [1] ! { dg-error "unary numeric operator" }
+  print *, minloc (a)
+end
+
+! { dg-prune-output "Parameter array" }
--
2.26.2

[PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]

2021-11-29 Thread Peter Bergner via Gcc-patches

Sorry for dropping the ball on testing the patch from the bugzilla!

The following patch fixes the ICE reported in the bugzilla on the pre-existing
gcc testsuite test case, bootstraps and shows no testsuite regressions
on powerpc64le-linux.  Ok for trunk?

Peter


For -ftrivial-auto-var-init=*, skip initializing the register variable if it
is an opaque type, because CONST0_RTX(mode) is not defined for opaque modes.

gcc/
PR middle-end/103127
* internal-fn.c (expand_DEFERRED_INIT): Skip if VAR_TYPE is opaque.

diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index 0cba95411a6..7cc0e9d5293 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -3070,6 +3070,10 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt)
 }
   else
 {
+  /* Skip variables of opaque types that are in a register.  */
+  if (OPAQUE_TYPE_P (var_type))
+   return;
+
   /* If this variable is in a register use expand_assignment.
 For boolean scalars force zero-init.  */
   tree init;

[PATCH] PR fortran/101565 - ICE in gfc_simplify_image_index, at fortran/simplify.c:8234

Dear all,

a trivial one: we need to check the type of the SUB argument
to the coarray IMAGE_INDEX intrinsic.  It has to be an array
of type integer.

Patch by Steve Kargl.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald

From 58140e3cf97aedc5c9f3b3e6519027334cdc3213 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Mon, 29 Nov 2021 22:23:02 +0100
Subject: [PATCH] Fortran: check type of SUB argument to IMAGE_INDEX

gcc/fortran/ChangeLog:

	PR fortran/101565
	* check.c (gfc_check_image_index): Verify that SUB argument to
	IMAGE_INDEX is of type integer.

gcc/testsuite/ChangeLog:

	PR fortran/101565
	* gfortran.dg/coarray_49.f90: New test.
---
 gcc/fortran/check.c  | 7 +++
 gcc/testsuite/gfortran.dg/coarray_49.f90 | 9 +
 2 files changed, 16 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/coarray_49.f90

diff --git a/gcc/fortran/check.c b/gcc/fortran/check.c
index 3e65f3d8b1f..ee3a51ee253 100644
--- a/gcc/fortran/check.c
+++ b/gcc/fortran/check.c
@@ -5955,6 +5955,13 @@ gfc_check_image_index (gfc_expr *coarray, gfc_expr *sub)
   return false;
 }

+  if (sub->ts.type != BT_INTEGER)
+{
+  gfc_error ("Type of %s argument of IMAGE_INDEX at %L shall be INTEGER",
+		 gfc_current_intrinsic_arg[1]->name, >where);
+  return false;
+}
+
   if (gfc_array_size (sub, ))
 {
   int corank = gfc_get_corank (coarray);
diff --git a/gcc/testsuite/gfortran.dg/coarray_49.f90 b/gcc/testsuite/gfortran.dg/coarray_49.f90
new file mode 100644
index 000..370e3fd5847
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/coarray_49.f90
@@ -0,0 +1,9 @@
+! { dg-do compile }
+! { dg-options "-fcoarray=lib" }
+! PR fortran/101565 - ICE in gfc_simplify_image_index
+! Contributed by G. Steinmetz
+
+program p
+  integer :: x[*]
+  print *, image_index (x, [1.0]) ! { dg-error "shall be INTEGER" }
+end
--
2.26.2

Re: [PATCH] Fix regression introduced by r12-5536.

2021-11-29 Thread Uros Bizjak via Gcc-patches

On Mon, Nov 29, 2021 at 10:48 AM Hongtao Liu  wrote:
>
> On Mon, Nov 29, 2021 at 3:53 PM Uros Bizjak  wrote:
> >
> > On Mon, Nov 29, 2021 at 2:32 AM liuhongt  wrote:
> > >
> > > There're several failures reported in [1]:
> > > 1.  unsupported instruction `pextrw` for "pextrw $0, %xmm31, 16(%rax)"
> > > %vpextrw should be used in output templates.
> > > 2. ICE in get_attr_memory for movhi_internal since some alternatives
> > > are marked as TYPE_SSELOG.
> > > Explicitly set memory_attr for those alternatives.
> > >
> > > Also this patch fixs a typo and some latent bugs which are related to
> > > moving HImode from/to sse register w/o TARGET_AVX512FP16.

Here are some more fixes:

i386: Fix and improve movhi_internal and movhf_internal some more.

An (*v,C) alternative can be added to movhi_internal to directly load
HImode constant 0 to xmm register. Also, V4SFmode moves can be used
for xmm->xmm moves instead of TImode moves when optimizing for size.
Fix invalid %vpinsrw insn template, which needs to duplicate %xmm
register for AVX targets.

Optimize GPR moves in movhf_internal in the same way as in movhi_internal.
Fix pinsrw and pextrw templates for AVX targets. Use sselog1
instead of sselog type.  Also, handle TARGET_SSE_PARTIAL_REG_DEPENDENCY
and TARGET_SSE_SPLIT_REGS targets.

2021-11-29  Uroš Bizjak  

gcc/ChangeLog:

PR target/102811
* config/i386/i386.md (*movhi_internal): Introduce (*v,C) alternative.
Do not allocate non-GPR registers.  Optimize xmm->xmm moves when
optimizing for size.  Fix vpinsrw insn template.
(*movhf_internal): Fix pinsrw and pextrw insn templates for
AVX targets. Use sselog1 type instead of sselog.  Optimize GPR moves.
Optimize xmm->xmm moves for TARGET_SSE_PARTIAL_REG_DEPENDENCY
and TARGET_SSE_SPLIT_REGS targets.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32} w/ and
w/o -mf16c.

Pushed to master.

Uros.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index a384dae23e2..c88374c9d2b 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2495,11 +2495,12 @@
   (symbol_ref "true")))])
 
 (define_insn "*movhi_internal"
-  [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r ,r ,m ,*k,*k 
,*r,*m,*k,?r,?v,*v,*v,*m")
-   (match_operand:HI 1 "general_operand"  "r 
,rn,rm,rn,*r,*km,*k,*k,CBC,v, r, v, m, v"))]
+  [(set (match_operand:HI 0 "nonimmediate_operand"
+"=r,r,r,m ,*k,*k ,r ,m ,*k ,?r,?*v,*v,*v,*v,m")
+   (match_operand:HI 1 "general_operand"
+"r ,n,m,rn,r ,*km,*k,*k,CBC,*v,r  ,C ,*v,m ,*v"))]
   "!(MEM_P (operands[0]) && MEM_P (operands[1]))
&& ix86_hardreg_mov_ok (operands[0], operands[1])"
-
 {
   switch (get_attr_type (insn))
 {
@@ -2526,10 +2527,13 @@
   return ix86_output_ssemov (insn, operands);
 
 case TYPE_SSELOG1:
+  if (satisfies_constraint_C (operands[1]))
+   return standard_sse_constant_opcode (insn, operands);
+
   if (SSE_REG_P (operands[0]))
return MEM_P (operands[1])
- ? "%vpinsrw\t{$0, %1, %0|%0, %1, 0}"
- : "%vpinsrw\t{$0, %k1, %0|%0, %k1, 0}";
+ ? "%vpinsrw\t{$0, %1, %d0|%d0, %1, 0}"
+ : "%vpinsrw\t{$0, %k1, %d0|%d0, %k1, 0}";
   else
return MEM_P (operands[0])
  ? "%vpextrw\t{$0, %1, %0|%0, %1, 0}"
@@ -2550,23 +2554,25 @@
 }
 }
   [(set (attr "isa")
-   (cond [(eq_attr "alternative" "9,10,11,12")
+   (cond [(eq_attr "alternative" "9,10,11,12,13")
  (const_string "sse2")
-  (eq_attr "alternative" "13")
+  (eq_attr "alternative" "14")
  (const_string "sse4")
   ]
   (const_string "*")))
(set (attr "type")
- (cond [(eq_attr "alternative" "9,10,12,13")
+ (cond [(eq_attr "alternative" "4,5,6,7")
+ (const_string "mskmov")
+   (eq_attr "alternative" "8")
+ (const_string "msklog")
+   (eq_attr "alternative" "9,10,13,14")
  (if_then_else (match_test "TARGET_AVX512FP16")
(const_string "ssemov")
(const_string "sselog1"))
-   (eq_attr "alternative" "4,5,6,7")
- (const_string "mskmov")
(eq_attr "alternative" "11")
+ (const_string "sselog1")
+   (eq_attr "alternative" "12")
  (const_string "ssemov")
-   (eq_attr "alternative" "8")
- (const_string "msklog")
(match_test "optimize_function_for_size_p (cfun)")
  (const_string "imov")
(and (eq_attr "alternative" "0")
@@ -2581,33 +2587,54 @@
  (const_string "imovx")
   ]
   (const_string "imov")))
-(set (attr "prefix")
-(cond [(eq_attr "alternative" "9,10,11,12,13")
- (const_string "maybe_evex")
-   (eq_attr "alternative" "4,5,6,7,8")
- (const_string "vex")
-  ]
-  (const_string "orig")))
-(set (attr "mode")
-  (cond

[wwwdocs] Add spanstream and constexpr std::string to libstdc++ changes

2021-11-29 Thread Jonathan Wakely via Gcc-patches

Pushed to wwwdocs.


---
 htdocs/gcc-12/changes.html | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index b8524f47..10ac025f 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -289,9 +289,9 @@ a work-in-progress.
 
 Improved experimental C++20 support, including:
   
-  std::vector, std::optional,
-  and std::variant can be used in constexpr
-  functions.
+  std::vector, std::basic_string,
+  std::optional, and std::variant
+  can be used in constexpr functions.
   Layout-compatibility and pointer-interconvertibility traits.
   
 
@@ -299,6 +299,7 @@ a work-in-progress.
   
   Monadic operations for std::optional.
   std::move_only_function
+  spanstream
   std::invoke_r
   std::basic_string::resize_and_overwrite
   
-- 
2.31.1

Re: [PATCH, Fortran] Fix setting of array lower bound for named arrays

Hi Tobias, all,

Am 29.11.21 um 21:56 schrieb Tobias Burnus:

The problem is that the standard does not really state what the bounds
are :-(

I sort of expected that comment...

Usually, it ends up referring to LBOUND (with wordings like "each lower
bound equal to the corresponding element of LBOUND (expr)") – albeit not
in this case.

Assuming that holds, the question is what's lbound(h(3))? gfortran gives
[1] for that one. What does NAG yield?

For lbound(h(3),dim=1), the standard has:

"If DIM is present, ARRAY is a whole array, and either ARRAY is an
assumed-size array of rank DIM or dimension DIM of ARRAY has nonzero
extent, the result has a value equal to the lower bound for subscript
DIM of ARRAY. Otherwise, if DIM is present, the result value is 1."

Thus, the question is whether "h(3)" is a 'whole array' or not. That reads:

F2018: 9.5.2 Whole arrays

"A whole array is a named array or a structure component whose final
part-ref is an array component name; no subscript list is appended."

I think in "h(3)" there is not really a named array – thus I read it as
if the "Otherwise ... result value is 1" applies.

If you read on in the standard:

"The appearance of a whole array variable in an executable construct
specifies all the elements of the array ..."

which might make you/makes me think that the sentence before that one
could need an official interpretation...

I've submitted a reduced example to the Intel Fortran Forum:

https://community.intel.com/t5/Intel-Fortran-Compiler/Allocate-with-SOURCE-and-bounds/m-p/1339992#M158535

There are good chances that Steve Lionel reads and comments on it.

You're invited to add to that thread!

Harald

Tobias

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201,
80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer:
Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München;
Registergericht München, HRB 106955

Re: [PATCH 01/12] LoongArch Port: gcc build

2021-11-29 Thread Joseph Myers

On Sat, 27 Nov 2021, chenglulu wrote:

> + if (p_arch_native)
> +   fatal_error (UNKNOWN_LOCATION,
> +"Unknown FPU type %<0x%x%>, "
> +"%<-m" OPTSTR_ARCH "=" STR_CPU_NATIVE "%> failed",

Diagnostics should not start with an uppercase letter (unless the start 
would have a capital letter even in the middle of a sentence - thus, the 
diagnostic you have elsewhere starting "ABI" is fine as-is).

Diagnostics can't use string concatenation with macros, because that means 
only the part up to the first macro name gets extracted into gcc.pot for 
translation - in this case, "Unknown FPU type %<0x%x%>, %<-m" - so the 
translations never match at runtime.  The whole format string for a 
diagnostic function needs to appear directly in the call to the function 
(or to macros such as G_), without any part of that string itself coming 
from macro expansion, for translation to work properly.

> + inform (UNKNOWN_LOCATION, "Unknown processor ID %<0x%x%>, "
> + "some tuning parameters will fall back to default",

Diagnostics should not start with an uppercase letter.

> + inform (UNKNOWN_LOCATION,
> + "%<-m%s%> overrides %<-m" OPTSTR_ABI_BASE "=%s%>",

Same comment as above about not concatenating with macros in diagnostics.

> +  if (t.cpu_arch == CPU_NATIVE)
> +fatal_error (UNKNOWN_LOCATION,
> +  "%<-m" OPTSTR_ARCH "=" STR_CPU_NATIVE "%> "
> +  "does not work on a cross compiler");
> +
> +  else if (t.cpu_tune == CPU_NATIVE)
> +fatal_error (UNKNOWN_LOCATION,
> +  "%<-m" OPTSTR_TUNE "=" STR_CPU_NATIVE "%> "
> +  "does not work on a cross compiler");

Likewise.

> +  warning (0, "%s CPU architecture (%qs) does not support %qs ABI, "
> +"falling back to %<-m" OPTSTR_ARCH "=%s%>",

Likewise.

> +(t.cpu_arch == CPU_NATIVE ? "your native" : "default"),

Also, "your native" and "default" look like they are intended as English 
that should be translated, not e.g. literal option arguments that should 
stay as-is.  So they needs to appear directly in the diagnostic sentence 
for proper translation, e.g.

  if (t.cpu_arch == CPU_NATIVE)
warning (...);
  else
warning (...);

(if you use a conditional expression in the second argument to warning to 
avoid having separate calls like that, both format strings then need to 
have G_ () around them to ensure they are properly extracted for 
translation).

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH, Fortran] Fix setting of array lower bound for named arrays

2021-11-29 Thread Tobias Burnus


Hi Harald, hi Chung-Lin,

On 29.11.21 21:21, Harald Anlauf wrote:

I think you need to check the following:

 allocate(c, source=h(3))
 write(*,*) lbound(c,1), ubound(c,1) ! prints 1 3
...
 pure function h(i) result(r)
  integer, value, intent(in) :: i
  integer, allocatable :: r(:)
  allocate(r(3:5))
  r = [1,2,3]
 end function h

This used to print 3 5, which is also what e.g. NAG, Nvidia, flang do.
Intel prints 1 3, so it agrees with you.

Hmm, usually NAG is right ...

The Fortran standard has:
9.7.1.2  Execution of an ALLOCATE statement

(6) When an ALLOCATE statement is executed for an array with no
allocate-shape-spec-list, the bounds of source-expr determine the
bounds of the array. Subsequent changes to the bounds of source-expr
do not affect the array bounds.


The problem is that the standard does not really state what the bounds
are :-(

Usually, it ends up referring to LBOUND (with wordings like "each lower
bound equal to the corresponding element of LBOUND (expr)") – albeit not
in this case.

Assuming that holds, the question is what's lbound(h(3))? gfortran gives
[1] for that one. What does NAG yield?

For lbound(h(3),dim=1), the standard has:

"If DIM is present, ARRAY is a whole array, and either ARRAY is an
assumed-size array of rank DIM or dimension DIM of ARRAY has nonzero
extent, the result has a value equal to the lower bound for subscript
DIM of ARRAY. Otherwise, if DIM is present, the result value is 1."

Thus, the question is whether "h(3)" is a 'whole array' or not. That reads:

"A whole array is a named array or a structure component whose final
part-ref is an array component name; no subscript list is appended."

I think in "h(3)" there is not really a named array – thus I read it as
if the "Otherwise ... result value is 1" applies.

Tobias

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955

Re: [PATCH take #2] x86_64: PR target/100711: Splitters for pandn

2021-11-29 Thread Uros Bizjak via Gcc-patches

On Mon, Nov 29, 2021 at 7:14 PM Roger Sayle  wrote:
>
>
> Hi Uros,
> Many thanks for the review.  Here's a revised version of the patch
> incorporating all of your suggestions.  This has been (re)tested on
> x86_64-pc-linux-gnu with make bootstrap and make -k check,
> both with and without --target_board=unix{-m32}, with no new
> failures.  Ok for mainline?
>
>
> 2021-11-29  Roger Sayle  
> Uroš Bizjak  
>
> gcc/ChangeLog
> PR target/100711
> * config/i386/sse.md (define_split): New splitters to simplify
> not;vec_duplicate;and as vec_duplicate;andn.
>
> gcc/testsuite/ChangeLog
> PR target/100711
> * gcc.target/i386/pr100711-1.c: New test case.
> * gcc.target/i386/pr100711-2.c: New test case.

OK.

Thanks,
Uros.

Re: [PATCH, Fortran] Fix setting of array lower bound for named arrays


Hi Chung-Lin,

Am 29.11.21 um 15:25 schrieb Chung-Lin Tang:

This patch by Tobias, fixes a case of setting array low-bounds, found
for particular uses of SOURCE=/MOLD=.

For example:
program A_M
   implicit none
   real, dimension (:), allocatable :: A, B
   allocate (A(0:5))
   call Init (A)
contains
   subroutine Init ( A )
     real, dimension ( 0 : ), intent ( in ) :: A
     integer, dimension ( 1 ) :: lb_B

     allocate (B, mold = A)
     ...
     lb_B = lbound (B, dim=1)   ! Error: lb_B assigned 1, instead of 0
like lower-bound of A.

Referencing the Fortran standard:

"16.9.109 LBOUND (ARRAY [, DIM, KIND])"
states:
"If DIM is present, ARRAY is a whole array, and either ARRAY is
  an assumed-size array of rank DIM or dimension DIM of ARRAY has
  nonzero extent, the result has a value equal to the lower bound
  for subscript DIM of ARRAY. Otherwise, if DIM is present, the
  result value is 1."

And on what is a "whole array":

"9.5.2 Whole arrays"
"A whole array is a named array or a structure component ..."

The attached patch adjusts the relevant part in gfc_trans_allocate() to
only set
e3_has_nodescriptor only for non-named arrays.

Tobias has tested this once, and I've tested this patch as well on our
complete set of
testsuites (which usually serves for OpenMP related stuff). Everything
appears well with no regressions.

Is this okay for trunk?


I think you need to check the following:

 allocate(c, source=h(3))
 write(*,*) lbound(c,1), ubound(c,1) ! prints 1 3

...

 pure function h(i) result(r)
  integer, value, intent(in) :: i
  integer, allocatable :: r(:)
  allocate(r(3:5))
  r = [1,2,3]
 end function h

This used to print 3 5, which is also what e.g. NAG, Nvidia, flang do.
Intel prints 1 3, so it agrees with you.

The Fortran standard has:

9.7.1.2  Execution of an ALLOCATE statement

(6) When an ALLOCATE statement is executed for an array with no
allocate-shape-spec-list, the bounds of source-expr determine the
bounds of the array. Subsequent changes to the bounds of source-expr
do not affect the array bounds.

Please re-check with Tobias.

Thanks,
Harald


Thanks,
Chung-Lin

2021-11-29  Tobias Burnus  

gcc/fortran/ChangeLog:

 * trans-stmt.c (gfc_trans_allocate): Set e3_has_nodescriptor to true
 only for non-named arrays.

gcc/testsuite/ChangeLog:

 * gfortran.dg/allocate_with_source_26.f90: Adjust testcase.
 * gfortran.dg/allocate_with_mold_4.f90: New testcase.

Re: [PATCH] Only return after resetting type_param_spec_list


Am 29.11.21 um 12:28 schrieb Richard Biener via Fortran:

This fixes an appearant mistake in gfc_insert_parameter_exprs.


Yes, that looks pretty much like a missed cleanup and fix during
development.  CC'ing Paul.


Bootstrap / regtest pending on x86_64-unknown-linux-gnu.

OK?


LGTM if it regtests.

Thanks,
Harald


Thanks,
Richard.

2021-11-29  Richard Biener  

gcc/fortran/
* decl.c (gfc_insert_parameter_exprs): Only return after
resetting type_param_spec_list.
---
  gcc/fortran/decl.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index c0fec90e3e0..4971638f9b6 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -3733,9 +3733,9 @@ gfc_insert_parameter_exprs (gfc_expr *e, 
gfc_actual_arglist *param_list)
  {
gfc_actual_arglist *old_param_spec_list = type_param_spec_list;
type_param_spec_list = param_list;
-  return gfc_traverse_expr (e, NULL, _parameter_exprs, 1);
-  type_param_spec_list = NULL;
+  bool res = gfc_traverse_expr (e, NULL, _parameter_exprs, 1);
type_param_spec_list = old_param_spec_list;
+  return res;
  }

  /* Determines the instance of a parameterized derived type to be used by

Re: [PATCH] elf: Add __libc_get_static_tls_bounds [BZ #16291]

2021-11-29 Thread Florian Weimer via Gcc-patches

* Fāng-ruì Sòng:

> PING^3

I think the core issue with this patch is like this:

* I do not want to commit glibc to a public API that disallows future
  changes to the way we allocate static TLS.  While static TLS objects
  cannot move in memory, the extent of the static TLS area (minimum and
  maximum address) is not fixed by ABI necessities.

* Joseph does not want to add a GLIBC_PRIVATE ABI that is exclusively
  used externally.

I have tried repeatly to wrap my head around how the sanitizers use the
static TLS boundary information.  Based on what I can tell, there are
several applications:

1. Thead Sanitizer needs to know application-accessible thread-specific
   memory that is carried via the glibc thread (stack) cache from one
   thread to another one, seemingly without synchronization.  Since the
   synchronization happens internally within glibc, without that extra
   knowledge, Thread Sanitizer would report a false positive.  This
   covers only data to which the application has direct access, internal
   access by glibc does not count because it is not going to be
   instrumented.

2. Address Sanitizer needs TLS boundary information for bounds checking.
   Again this only applies to accesses that can be instrumented, so only
   objects whose addresses leak to application code count.  (Maybe this
   is a fringe use case, though: it seems to apply only to “extern
   __thread int a[];“ and similar declarations, where the declared type
   is not enough.)

3. Leak Sanitizer needs to find all per-thread pointers pointing into
   the malloc heap, within memory not itself allocated by malloc.  This
   includes the DTV, pthread_getspecific metadata, and perhaps also user
   data set by pthread_getspecific, and of course static TLS variables.
   This goes someone beyond what people would usually consider static
   TLS: the DTV and struct pthread are not really part of static TLS as
   such.  And struct pthread has several members that contain malloc'ed
   pointers.

Is this a complete list of uses?

I *think* you can get what you need via existing GLIBC_PRIVATE
interfaces.  But in order to describe how to caox the data out of glibc,
I need to know what you need.

(Cc:ing a few people from a recent GCC thread.)

Thanks,
Florian

Re: [PATCH] Remove more stray returns and gcc_unreachable ()s


On 11/29/21 11:53 AM, Martin Sebor wrote:

On 11/29/21 6:09 AM, Richard Biener via Gcc-patches wrote:

This removes more cases that appear when bootstrap with
-Wunreachable-code-return progresses.


...

diff --git a/gcc/sel-sched-ir.h b/gcc/sel-sched-ir.h
index 8ee0529d5a8..18e03c4cb96 100644
--- a/gcc/sel-sched-ir.h
+++ b/gcc/sel-sched-ir.h
@@ -1493,8 +1493,6 @@ bb_next_bb (basic_block bb)
  default:
    return bb->next_bb;
  }
-
-  gcc_unreachable ();
  }


Just skiming the changes out of curiosity, this one makes me
wonder if the warning shouldn't be taught to avoid triggering
on calls to __builtin_unreachable().  They can help make code
more readable (e.g., after a case and switch statement that
handles all values).


I see someone else raised the same question in a patch I hadn't
gotten to yet:

https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585431.html

If you do end up removing the gcc_unreachable() calls, I would
suggest to replace them with a comment so as not to lose
the readability benefit.

But I still wonder if it might make sense to teach the warning
not just about __builtin_unreachable() but also about noreturn
calls like abort() that (as you explained in the thread above)
gcc_unreachable() expands to.  Is there a benefit to warning
on such calls?



Martin

Re: [PATCH] Remove more stray returns and gcc_unreachable ()s


On 11/29/21 6:09 AM, Richard Biener via Gcc-patches wrote:

This removes more cases that appear when bootstrap with
-Wunreachable-code-return progresses.


...

diff --git a/gcc/sel-sched-ir.h b/gcc/sel-sched-ir.h
index 8ee0529d5a8..18e03c4cb96 100644
--- a/gcc/sel-sched-ir.h
+++ b/gcc/sel-sched-ir.h
@@ -1493,8 +1493,6 @@ bb_next_bb (basic_block bb)
  default:
return bb->next_bb;
  }
-
-  gcc_unreachable ();
  }


Just skiming the changes out of curiosity, this one makes me
wonder if the warning shouldn't be taught to avoid triggering
on calls to __builtin_unreachable().  They can help make code
more readable (e.g., after a case and switch statement that
handles all values).

Martin

Re: [PATCH] ipa: Careful processing ANCESTOR jump functions and NULL pointers (PR 103083)

2021-11-29 Thread Martin Jambor

Hi,

On Sat, Nov 27 2021, Jan Hubicka wrote:
>> Hi,
>> 
>> IPA_JF_ANCESTOR jump functions are constructed also when the formal
>> parameter of the caller is first checked whether it is NULL and left
>> as it is if it is NULL, to accommodate C++ casts to an ancestor class.
>> 
>> The jump function type was invented for devirtualization and IPA-CP
>> propagation of tree constants is also careful to apply it only to
>> existing DECLs(*) but as PR 103083 shows, the part propagating "known
>> bits" was not careful about this, which can lead to miscompilations.
>> 
>> This patch introduces a flag to the ancestor jump functions which
>> tells whether a NULL-check was elided when creating it and makes the
>> bits propagation behave accordingly.
>> 
>> (*) There still may remain problems when a DECL resides on address
>> zero (with -fno-delete-null-pointer-checks ...I hope it cannot happen
>> otherwise).  I am looking into that now but I think it will be easier
>> for everyone if I do so in a follow-up patch.
>
> I guess so.  Thinking about it bit more after our discussion yesterday
> I think it really matters where the ADDR_EXPR was created (since in
> funciton with -fno-delete-null-pointer-checks it might be equal to 0)
> and we need to somehow keep track of it.
>
> One observation is that it is unsafe to constant propagate
> ADDR_EXPR with -fno-delete-null-pointer-checks to function with
> -fdelete-null-pointer-checks, so we may just simply stop propagating
> known ADDR_EXPRs on when caller is -fno... and call is -f

Makes sense.

>> +/* Return true if any of the known bits are non-zero.  */
>> +bool
>> +ipcp_bits_lattice::known_nonzero_p () const
>> +{
>> +  if (!constant_p ())
>> +return false;
>> +  return !wi::eq_p (wi::bit_and (wi::bit_not (m_mask), m_value), 0);
> There is also ne_p

Changed.

>> @@ -2374,6 +2384,7 @@ propagate_bits_across_jump_function (cgraph_edge *cs, 
>> int idx,
>>tree operand = NULL_TREE;
>>enum tree_code code;
>>unsigned src_idx;
>> +  bool only_for_nonzero = false;
>>  
>>if (jfunc->type == IPA_JF_PASS_THROUGH)
>>  {
>> @@ -2386,7 +2397,9 @@ propagate_bits_across_jump_function (cgraph_edge *cs, 
>> int idx,
>>  {
>>code = POINTER_PLUS_EXPR;
>>src_idx = ipa_get_jf_ancestor_formal_id (jfunc);
>> -  unsigned HOST_WIDE_INT offset = ipa_get_jf_ancestor_offset (jfunc) / 
>> BITS_PER_UNIT;
>> +  unsigned HOST_WIDE_INT offset
>> += ipa_get_jf_ancestor_offset (jfunc) / BITS_PER_UNIT;
> I still think the offset should be in bytes (and probably poly_int64).
> There is get_addr_base_and_unit_offset

I agree about bytes, not sure about poly_ints, but could be persuaded.
But probably not as a part of this patch?

>
>> +  only_for_nonzero = (ipa_get_jf_ancestor_keep_null (jfunc) || !offset);
>>operand = build_int_cstu (size_type_node, offset);
>>  }
>>  
>> @@ -2404,16 +2417,18 @@ propagate_bits_across_jump_function (cgraph_edge 
>> *cs, int idx,
>>   and we store it in jump function during analysis stage.  */
>>  
>>if (src_lats->bits_lattice.bottom_p ()
>> -  && jfunc->bits)
>> -return dest_lattice->meet_with (jfunc->bits->value, jfunc->bits->mask,
>> -precision);
>> +  || (only_for_nonzero && !src_lats->bits_lattice.known_nonzero_p ()))
>> +{
>> +  if (jfunc->bits)
>> +return dest_lattice->meet_with (jfunc->bits->value,
>> +jfunc->bits->mask, precision);
>> +  else
>> +return dest_lattice->set_to_bottom ();
>> +}
> I do not think you need to set to bottom here. For pointers, we
> primarily track alignment and NULL is aligned - all you need to do is to
> make sure that we do not believe that some bits are 1.

I am not sure I understand, the testcase you wrote has all bits as zeros
and still miscompiles?  It is primarily used for alignment but not only
for that.

>> @@ -3327,7 +3332,8 @@ update_jump_functions_after_inlining (struct 
>> cgraph_edge *cs,
>>  ipa_set_ancestor_jf (dst,
>>   ipa_get_jf_ancestor_offset (src),
>>   ipa_get_jf_ancestor_formal_id (src),
>> - agg_p);
>> + agg_p,
>> + ipa_get_jf_ancestor_keep_null (src));
>>  break;
> Somewhere here you need to consider the case where you have two accessor
> jump functions to combine together and merge the keep_null flags...

Oops, I missed that, fixed.

>
> Also with the flags, I think you want to modify ipa-cp so NULL pointers
> gets propagated acroess the ancestor jump functions.

OK, added, along with a new testcase.

> Moreover ipa-modref is using ancestor jf to update its summary.  Here I
> think with -fno-delete-null-pointer-checks it is safe to pdate also with
> the flag set, since we know the

Re: [PATCH] Make the path to etags used in the build system configurable [PR103021]

2021-11-29 Thread Eric Gallager via Gcc-patches

On Mon, Nov 29, 2021 at 9:48 AM Jeff Law  wrote:
>
>
>
> On 11/28/2021 6:34 PM, Eric Gallager via Gcc-patches wrote:
> > The attached patch allows users to specify a path to their `etags`
> > executable for use when doing `make tags`, which is meant to close PR
> > other/103021: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103021
> > I based this patch off of this one from upstream automake:
> > https://git.savannah.gnu.org/cgit/automake.git/commit/m4?id=d2ccbd7eb38d6a4277d6f42b994eb5a29b1edf29
> > This means that I just supplied variables that the user can override
> > for the tags programs, rather than having the configure scripts
> > actually check for them. I handle etags and ctags separately because
> > the intl subdirectory has separate targets for them. Tested with `make
> > tags`; the changes I made work successfully, but some of the
> > subdirectories still have broken tags targets, so I had to switch to
> > `make -k tags` part way through. This isn't because of anything I did,
> > though; the `-k` flag is only necessary because of errors that were
> > already there before I touched anything. Also note that this patch
> > only affects the subdirectories that use handwritten Makefiles; the
> > ones that use automake will have to wait until we update the version
> > of automake used to be 1.16.4 or newer before they'll be fixed.
> OK for the trunk.
> jeff

OK thanks; committed as g:909b30a:
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=909b30a17e71253772d2cb174d0dae6d0b8c9401

Re: [PATCH] ipa-param-manip: Be careful about a reallocating hash_map (PR 103449)

2021-11-29 Thread Jan Hubicka via Gcc-patches

> Hi,
> 
> PR 103449 revealed that when I was storing result of one hash_map
> lookup into another entry in the hash_map, I was still accessing the
> entry in the table, which meanwhile could get reallocated, making the
> accesses invalid-after-free.
> 
> Fixed with the following, which also simplifies the return statement
> which must have been true even now.
> 
> Bootstrapped and tested on x86_64-linux.  OK for master?
> 
> Thanks,
> 
> Martin
> 
> 
> gcc/ChangeLog:
> 
> 2021-11-29  Martin Liska  
>   Martin Jambor  
> 
>   PR ipa/103449
>   * ipa-param-manipulation.c
>   (ipa_param_body_adjustments::prepare_debug_expressions): Be
>   careful about hash_map reallocating itself.  Simpify a return
>   which always returns true.
OK, thanks!
Honza

[PATCH] ipa-param-manip: Be careful about a reallocating hash_map (PR 103449)

2021-11-29 Thread Martin Jambor

Hi,

PR 103449 revealed that when I was storing result of one hash_map
lookup into another entry in the hash_map, I was still accessing the
entry in the table, which meanwhile could get reallocated, making the
accesses invalid-after-free.

Fixed with the following, which also simplifies the return statement
which must have been true even now.

Bootstrapped and tested on x86_64-linux.  OK for master?

Thanks,

Martin


gcc/ChangeLog:

2021-11-29  Martin Liska  
Martin Jambor  

PR ipa/103449
* ipa-param-manipulation.c
(ipa_param_body_adjustments::prepare_debug_expressions): Be
careful about hash_map reallocating itself.  Simpify a return
which always returns true.
---
 gcc/ipa-param-manipulation.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/gcc/ipa-param-manipulation.c b/gcc/ipa-param-manipulation.c
index 479c20b3871..163af94cde0 100644
--- a/gcc/ipa-param-manipulation.c
+++ b/gcc/ipa-param-manipulation.c
@@ -1279,9 +1279,10 @@ ipa_param_body_adjustments::prepare_debug_expressions 
(tree dead_ssa)
   if (gimple_assign_copy_p (def)
  && TREE_CODE (gimple_assign_rhs1 (def)) == SSA_NAME)
{
- tree *d = m_dead_ssa_debug_equiv.get (gimple_assign_rhs1 (def));
- m_dead_ssa_debug_equiv.put (dead_ssa, *d);
- return (*d != NULL_TREE);
+ tree d = *m_dead_ssa_debug_equiv.get (gimple_assign_rhs1 (def));
+ gcc_assert (d);
+ m_dead_ssa_debug_equiv.put (dead_ssa, d);
+ return true;
}
 
   tree val
-- 
2.33.1

[PATCH take #2] x86_64: PR target/100711: Splitters for pandn

2021-11-29 Thread Roger Sayle


Hi Uros,
Many thanks for the review.  Here's a revised version of the patch
incorporating all of your suggestions.  This has been (re)tested on
x86_64-pc-linux-gnu with make bootstrap and make -k check,
both with and without --target_board=unix{-m32}, with no new
failures.  Ok for mainline?


2021-11-29  Roger Sayle  
Uroš Bizjak  

gcc/ChangeLog
PR target/100711
* config/i386/sse.md (define_split): New splitters to simplify
not;vec_duplicate;and as vec_duplicate;andn.

gcc/testsuite/ChangeLog
PR target/100711
* gcc.target/i386/pr100711-1.c: New test case.
* gcc.target/i386/pr100711-2.c: New test case.


Thanks in advance,
Roger
--

> -Original Message-
> From: Uros Bizjak 
> Sent: 28 November 2021 19:45
> To: Roger Sayle 
> Cc: GCC Patches 
> Subject: Re: [PATCH] x86_64: PR target/100711: Splitters for pandn
> 
> On Sun, Nov 28, 2021 at 2:25 PM Roger Sayle 
> wrote:
> >
> >
> > This patch addresses PR target/100711 by introducing define_split
> > patterns so that not/broadcast/pand may be simplified (by combine) to
> > broadcast/pandn.  This introduces two splitters one for optimizing
> > pandn on TARGET_SSE for V4SI and V2DI, and another for vpandn on
> > TARGET_AVX2 for V16QI, V8HI, V32QI, V16HI and V8SI.  Each splitter has
> > its own new testcase.
> >
> > I've also confirmed that not/broadcast/pandn is already getting
> > simplified to broadcast/pand by the middle-end optimizers.
> >
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > and make -k check with no new failures.  Ok for mainline?
> >
> >
> > 2021-11-28  Roger Sayle  
> >
> > gcc/ChangeLog
> > PR target/100711
> > * config/i386/sse.md (define_split): New splitters to simplify
> > not;vec_duplicate;and as vec_duplicate;andn.
> >
> > gcc/testsuite/ChangeLog
> > PR target/100711
> > * gcc.target/i386/pr100711-1.c: New test case.
> > * gcc.target/i386/pr100711-2.c: New test case.
> 
> 
> +;; PR target/100711: Split notl; vpbroadcastd; vpand as vpbroadcastd;
> +vpandn (define_split
> +  [(set (match_operand:VI48_128 0 "register_operand")
> + (and:VI48_128
> +  (vec_duplicate:VI48_128
> +(not:
> +  (match_operand: 1 "register_operand")))
> +  (match_operand:VI48_128 2 "register_operand")))]
> 
> You can use "vector_operand" here, the resulting PANDN can handle these.
> 
> +  "TARGET_SSE && can_create_pseudo_p ()"
> 
> This is a combine splitter, so can_create_pseudo_p () is not needed, because 
> it
> runs only during the combine phase.
> 
> FYI, the combine splitter is somehow different than normal splitter, the
> important part from the documentation is, that the insn is *not* matched by
> some define_insn pattern, and the split results in exactly two patterns:
> 
> The insn combiner phase also splits putative insns.  If three insns are 
> merged into
> one insn with a complex expression that cannot be matched by some
> 'define_insn' pattern, the combiner phase attempts to split the complex 
> pattern
> into two insns that are recognized.  Usually it can break the complex pattern 
> into
> two patterns by splitting out some subexpression.  However, in some other
> cases, such as performing an addition of a large constant in two insns on a 
> RISC
> machine, the way to split the addition into two insns is machine-dependent.
> 
> +  [(set (match_dup 3)
> + (vec_duplicate:VI48_128 (match_dup 1)))
> +   (set (match_dup 0)
> + (and:VI48_128 (not:VI48_128 (match_dup 3))
> +  (match_dup 2)))]
> +  "operands[3] = gen_reg_rtx (mode);")
> +
> +;; PR target/100711: Split notl; vpbroadcastd; vpand as vpbroadcastd;
> +vpandn (define_split
> +  [(set (match_operand:VI124_AVX2 0 "register_operand")
> + (and:VI124_AVX2
> +  (vec_duplicate:VI124_AVX2
> +(not:
> +  (match_operand: 1 "register_operand")))
> +  (match_operand:VI124_AVX2 2 "register_operand")))]
> +  "TARGET_AVX2 && can_create_pseudo_p ()"
> +  [(set (match_dup 3)
> + (vec_duplicate:VI124_AVX2 (match_dup 1)))
> +   (set (match_dup 0)
> + (and:VI124_AVX2 (not:VI124_AVX2 (match_dup 3))  (match_dup 2)))]
> +  "operands[3] = gen_reg_rtx (mode);")
> 
> Same here as above.
> 
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> 
> Please add -msse2 here, 32bit targets do not enable SSE by default, and please
> check if they handle DImode long long at all.
> 
> Also, please run tests for x86_64 and i386 targets. The testsuite should be 
> ran
> with:
> 
> make -k check RUNTESTFLAGS="--target_board=unix\{,-m32\}"
> 
> (Eventually, you can use check-gcc instead of check and/or add i386.exp after 
> --
> target-board.)
> 
> Uros.
> 
> +typedef int v4si __attribute__((vector_size (16))); typedef long long
> +v2di __attribute__((vector_size (16)));
> +
> +v4si foo (int a, v4si b)
> +{
> +return (__extension__ (v4si) {~a, ~a, ~a, ~a}) & b; }
> +
> +v2di bar (long long a, v2di b)
> +{
> +return (__extension__ (v2di) {~a, ~a}) & b; }
> 
>

Re: [PATCH 4/4] libgcc: vxcrtstuff.c: add a few undefs

2021-11-29 Thread Olivier Hainque via Gcc-patches

Hi Rasmus,

> On 1 Nov 2021, at 10:34, Rasmus Villemoes  wrote:
> 
> libgcc/
>   * config/vxcrtstuff.c: Undefine caddr_t, pid_t, rlim_t,
>   ssize_t and vfork after including auto-host.h.

Ok, thanks;

Re: [PATCH][RFC] middle-end/46476 - resurrect -Wunreachable-code


On 11/25/21 12:57 AM, Richard Biener wrote:

On Wed, 24 Nov 2021, Martin Sebor wrote:


On 11/24/21 8:21 AM, Richard Biener via Gcc-patches wrote:

This resurrects -Wunreachable-code and implements a warning for
trivially unreachable code as of CFG construction.  Most problematic
with this is the C/C++ frontend added 'return 0;' stmt in main
which the patch handles for C++ like the C frontend already does
by using BUILTINS_LOCATION.

Another problem for future enhancement is that after CFG construction
we no longer can point to the stmt making a stmt unreachable, so
this implementation tries to warn on the first unreachable
statement of a region.  It might be possible to retain a pointer
to the stmt that triggered creation of a basic-block but I'm not
sure how reliable that would be.

So this is really a simple attempt for now, triggered by myself
running into such a coding error.  As always, the perfect is the
enemy of the good.

It does not pass bootstrap (which enables -Wextra), because of the
situation in g++.dg/Wunreachable-code-5.C where the C++ frontend
prematurely elides conditions like if (! GATHER_STATISTICS) that
evaluate to true - oddly enough it does _not_ do this for
conditions evaluating to false ... (one of the
c-c++-common/Wunreachable-code-2.c cases).


I'm very much in favor of reviving the warning, even in its
current simplistic form.  I especially welcome the suggestion
to enhance it in the future, including adjusting its schedule
among other passes (or adding other, later invocations).  It
would be overly constraining to consider this placement ideal
or set in stone.

Among possible enhancements worth considering is handling
constant conditionals like:

int f (void)
{
   if (1)
 return 0;
   else
 return 1;   <<< warn
}

int g (void)
{
   if (1)
 return 0;
   return 1; <<< warn also in C (not just in C++)
}


I think both cases are undesirable to warn on, but both would be
possible to implement during parsing and only there they would
possibly make sense (you have to consider the if (1) resulting
from macro expansion or template instantiation which are the
undesirable to warn about cases - not to mention disabled code
that wants to remain syntactically correct, something that cannot
be achieved with #if 0)


The example I gave above (with the constant) was intentionally
contrived to illustrate the inconsistency between issuing
the warning in C++ but not in C.  I would view it as a bug.
Not necessarily one that would make me object to your patch
as is but certainly one to fix at some point.

Using heuristics like "is it the result of macro expansion?" is
one way to ameliorate the problem of issuing valid diagnostics
for deliberate instances of the construct the warning is meant
to detect.  But few heuristics are ever perfect.  For example,
using an early return from a function is an alternate way of
disabling the code that follows without actually removing it:

  void f (void)
  {
...
return;
... intentionally disabled code...
return;   // warning here
  }

instead of

  void f (void)
  {
...
if (1)
  return;

... intentionally disabled code...
return;   // no warning here?
  }

Why is diagnosing the former desirable and not the latter?
(This is a rhetorical question.)

In my view, in both cases, when intentional, such code is best
documented.  One way to document it is by adding a #pragma
GCC diagnostic.  Another is to recognize some special word
in a comment, analogous to /* Fallthrough */ as an alternate
spelling of the snynonymous attribute.


By the way, a related feature that would be useful and that's
been requested in the past is warning for stores with no effect,
as in:

   int i;
   i = 1;
   i = 2;   <<< warn here

The detection of the simple cases like the one above can also
be almost trivially implemented.


Likewise the above can be done during parsing where it's more
appearant whether the stmts are the same.


You mean the variables are the same.  The front end has to work
harder to determine that two expressions are equivalent (e.g.,
memset (, 0, sizeof x) and x = (struct X){ ... } both assign
a value to x).  It doesn't benefit from any of the simplifications
done by the gimple transformations.  As with any flow-sensitive
warnings, where to implement this one is a balancing act between
false positives and negatives.

Martin



Richard.


Martin



Richard.

2021-11-24  Richard Biener  

  PR middle-end/46476
  * common.opt (Wunreachable-code): No longer ignored,
  add warn_unreachable_code variable, enable with -Wextra.
  * doc/invoke.texi (Wunreachable-code): Document.
  (Wextra): Amend.
  * tree-cfg.c (build_gimple_cfg): Move case label grouping...
  (execute_build_cfg): ... here after new -Wunreachable-code
  warnings.
  (warn_unreachable_code_post_cfg_build): New function.
  (mark_forward_reachable_blocks): Likewise.
  (reverse_guess_deadend): Likewise.

gcc/cp/
  * decl.c (finish_function): Set input_location to

[PATCH] tree-optimization/103440 - Always track arguments, even when ignoring equiv params.

2021-11-29 Thread Andrew MacLeod via Gcc-patches

I need to adjust the original patch, I shouldn't have continued the loop 
when dealing with equivalences.    An equivalence is not the same as an 
undefined value...  we might be able to ignore the range from it for 
calculation purposes, but we cannot ignore the fact that it is a 
different SSA_NAME and may contain a different value that we do care about.


There are other checks in then loop which will allow us to assign an 
equivalence between the DEF and an argument if we are ignoring the other 
ssa_names..  such as when there are undefined values.


a_3 = PHI 

if a_2 is undefined, we can create an equivalence between a_3 and a_1 as 
the value of a_2 is irrelevant and can be whatever we want it to be.


if a_2 is instead an equivalence with a_3, we do not want to create an 
equivalence between a_3 and a_1 in this block as we may then turn it 
into a copy.. we'd only be able to do this if there was an equivalence 
between a_1 and a_2, and we are not checking that.


Although we are may not be adding the range for a_2 into the cumulative 
knowledge of a_3's range, we do need to keep the edge to retain the copy 
as its value is important and could be different than the other 
argument... and we need to retain the copy when we go out of ssa.


This fixes that oversight, bootstrapped on x86_64-pc-linux-gnu with no 
regressions.  OK?


Caveat..  the test case has an infinite loop without this fix, but 
degagnu doesn't seem to kill it, and my test suite runs go forever.. if 
there something I am missing?   The gcc.log claims that it timeouts 
after 300 second, but it doesn't kill the executable. I set the 
dg-timeout field, but that appears to be a compile time timeout, not 
runtime anyway.


Andrew
From 2991aa06aaaf87874df8ed1c93a4650dc85c79ec Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Mon, 29 Nov 2021 09:19:34 -0500
Subject: [PATCH 1/2] Always track arguments, even when ignoring equiv params.

To "ignore" ranges from equivalences, we should track the range separately,
but still do the other name processing which determiens if there is a single
name or not for equivalence.  Otherwise we mistakently think we can introduce
an equivalences.

	gcc/
	PR tree-optimization/103440
	* gimple-range-fold.cc (fold_using_range::range_of_phi): Continue
	normal param processing for equiv params.

	gcc/testsuite/
	* gcc.dg/pr103440.c: New.
---
 gcc/gimple-range-fold.cc| 21 +
 gcc/testsuite/gcc.dg/pr103440.c | 24 
 2 files changed, 33 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr103440.c

diff --git a/gcc/gimple-range-fold.cc b/gcc/gimple-range-fold.cc
index d66ada5bb7c..58122297c0b 100644
--- a/gcc/gimple-range-fold.cc
+++ b/gcc/gimple-range-fold.cc
@@ -795,20 +795,17 @@ fold_using_range::range_of_phi (irange , gphi *phi, fur_source )
   // Get the range of the argument on its edge.
   src.get_phi_operand (arg_range, arg, e);
 
-  // Likewise, if the incoming PHI argument is equivalent to this
-  // PHI definition, it provides no new info.  Accumulate these ranges
-  // in case all arguments are equivalences.
-  if (src.query ()->query_relation (e, arg, phi_def, false) == EQ_EXPR)
-	{
-	  single_arg = arg;
-	  equiv_range.union_(arg_range);
-	  continue;
-	}
-
   if (!arg_range.undefined_p ())
 	{
 	  // Register potential dependencies for stale value tracking.
-	  r.union_ (arg_range);
+	  // Likewise, if the incoming PHI argument is equivalent to this
+	  // PHI definition, it provides no new info.  Accumulate these ranges
+	  // in case all arguments are equivalences.
+	  if (src.query ()->query_relation (e, arg, phi_def, false) == EQ_EXPR)
+	equiv_range.union_(arg_range);
+	  else
+	r.union_ (arg_range);
+
 	  if (gimple_range_ssa_p (arg) && src.gori ())
 	src.gori ()->register_dependency (phi_def, arg);
 
@@ -829,7 +826,7 @@ fold_using_range::range_of_phi (irange , gphi *phi, fur_source )
 
 // If all arguments were equivalences, use the equivalence ranges as no
 // arguments were processed.
-if (!seen_arg)
+if (r.undefined_p () && !equiv_range.undefined_p ())
   r = equiv_range;
 
 // If the PHI boils down to a single effective argument, look at it.
diff --git a/gcc/testsuite/gcc.dg/pr103440.c b/gcc/testsuite/gcc.dg/pr103440.c
new file mode 100644
index 000..b97f45cd3ed
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr103440.c
@@ -0,0 +1,24 @@
+/* { dg-do run } */
+/* { dg-options "-Os" } */
+/* { dg-timeout 10 } */
+
+int a, b, c, d, e;
+int main() {
+  int f = 2, g = 1, h = -3;
+L1:
+  c = b ^ 1;
+  if (!f)
+goto L3;
+  if (d)
+g = e;
+  f = h;
+  if (!c)
+goto L1;
+L2:
+  if (g)
+a = 0;
+L3:
+  if (d == g)
+goto L2;
+  return 0;
+}
-- 
2.17.2

Re: [PATCH] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]

Hi!

On Wed, Sep 01, 2021 at 02:55:51PM +0800, Kewen.Lin wrote:
> This patch is to fix the inconsistent behaviors for non-LTO mode
> and LTO mode.  As Martin pointed out, currently the function
> rs6000_can_inline_p simply makes it inlinable if callee_tree is
> NULL, but it's wrong, we should use the command line options
> from target_option_default_node as default.

This is not documented.

> It also replaces
> rs6000_isa_flags with the one from target_option_default_node
> when caller_tree is NULL as rs6000_isa_flags could probably
> change since initialization.

Is this enough then?  Are there no other places that use
rs6000_isa_flags?  Is it correct for us to have that variable at all
anymore?  Etc.

> +  bool always_inline =
> +(DECL_DISREGARD_INLINE_LIMITS (callee)
> + && lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee)));

The "=" should start a line, not end it.  And please don't use
unnecessary parens.

> +  /* Some flags such as fusion can be tolerated for always inlines.  */
> +  unsigned HOST_WIDE_INT always_inline_safe_mask =
> +(MASK_P8_FUSION | MASK_P10_FUSION | OPTION_MASK_SAVE_TOC_INDIRECT
> + | OPTION_MASK_P8_FUSION_SIGN | OPTION_MASK_P10_FUSION_LD_CMPI
> + | OPTION_MASK_P10_FUSION_2LOGICAL | OPTION_MASK_P10_FUSION_LOGADD
> + | OPTION_MASK_P10_FUSION_ADDLOG | OPTION_MASK_P10_FUSION_2ADD
> + | OPTION_MASK_PCREL_OPT);

Same.

The fusion ones are obvious.  The other two are not.  Please put those
two more obviously separate (not somewhere in the sea of fusion
options), and some comment wouldn't hurt either.  You can make it
separate statements even, make it really stand out.

Why are there OPTION_MASKs for separate P10 fusion types here, as well as
MASK_P10_FUSION?

> +
> +  if (always_inline) {
> +caller_isa &= ~always_inline_safe_mask;
> +callee_isa &= ~always_inline_safe_mask;
> +  }

"{" starts a new line, indented.

Segher

[PATCH] c++: Allow indeterminate unsigned char or std::byte in bit_cast - P1272R4

2021-11-29 Thread Jakub Jelinek via Gcc-patches

Hi!

P1272R4 has added to the std::byteswap new stuff to me quite unrelated
clarification for std::bit_cast.
std::bit_cast is already in C++20 though and std::byteswap is a new
C++23 feature, it is unclear to me if that bit_cast addition in there
is really meant for C++23 and later only (e.g. related to only
C++23 actually allowing non-literal vars and thus other spots where
indeterminate values can appear during constant evaluation), or if
it was meant as a DR and should apply to C++20 too.

The following so far only lightly tested patch (dg.exp=bit-cast*.C
testing) implements it as applying to C++23 only.

Also, it is unclear to me how bitfields should behave, whether having
indeterminate bits in
struct S { unsigned char a : 8, b : 5, c : 3, d, e; unsigned int f : 8, g : 24; 
};
struct T1 { unsigned char a : 1, : 7, b : 5, c : 3, d, e; unsigned int f : 8, g 
: 24; };
struct T2 { unsigned char a : 8, b : 1, : 4, c : 3, d, e; unsigned int f : 8, g 
: 24; };
struct T3 { unsigned char a : 8, b : 5, c : 1, : 2, d, e; unsigned int f : 8, g 
: 24; };
struct T4 { unsigned char a : 8, b : 5, c : 3, d, e; unsigned int f : 1, : 7, g 
: 24; };
constexpr bool f1 () { T1 t = { 0, 0, 0, 0, 0, 0, 0 }; S s = __builtin_bit_cast 
(S, t); return true; }
constexpr bool f2 () { T2 t = { 0, 0, 0, 0, 0, 0, 0 }; S s = __builtin_bit_cast 
(S, t); return true; }
constexpr bool f3 () { T3 t = { 0, 0, 0, 0, 0, 0, 0 }; S s = __builtin_bit_cast 
(S, t); return true; }
constexpr bool f4 () { T4 t = { 0, 0, 0, 0, 0, 0, 0 }; S s = __builtin_bit_cast 
(S, t); return true; }
constexpr bool a = f1 ();
constexpr bool b = f2 ();
constexpr bool c = f3 ();
constexpr bool d = f4 ();
The patch currently uses TREE_TYPE and doesn't check DECL_BIT_FIELD,
so it accepts f1 and f4 cases where the TREE_TYPE is unsigned char
and doesn't accept f2 and f3 where the type is some 5 or 3 bit precision
unsigned type.  But possibly either it should look at the declared
type of the bitfield and don't allow f4 and allow f1-f3, or don't allow
any of the f1-f4 cases.

2021-11-29  Jakub Jelinek 

* constexpr.c (clear_uchar_or_std_byte_in_mask): New function.
(cxx_eval_bit_cast): Don't error about padding bits if target
type is unsigned char or std::byte, instead return no clearing
ctor.  Use clear_uchar_or_std_byte_in_mask.

* g++.dg/cpp2a/bit-cast11.C: New test.
* g++.dg/cpp2a/bit-cast12.C: New test.
* g++.dg/cpp2a/bit-cast13.C: New test.

--- gcc/cp/constexpr.c.jj   2021-11-22 10:13:17.156882931 +0100
+++ gcc/cp/constexpr.c  2021-11-29 15:41:54.611361669 +0100
@@ -4268,6 +4268,59 @@ check_bit_cast_type (const constexpr_ctx
   return false;
 }
 
+/* Helper function for cxx_eval_bit_cast.  For unsigned char or
+   std::byte members of CONSTRUCTOR (recursively) if they contain
+   some indeterminate bits (as set in MASK), remove the ctor elts,
+   mark the CONSTRUCTOR as CONSTRUCTOR_NO_CLEARING and clear the
+   bits in MASK.  */
+
+static void
+clear_uchar_or_std_byte_in_mask (tree t, unsigned char *mask)
+{
+  if (TREE_CODE (t) != CONSTRUCTOR)
+return;
+
+  unsigned i, j = 0;
+  tree index, value;
+  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (t), i, index, value)
+{
+  tree type = TREE_TYPE (value);
+  if (is_byte_access_type (type)
+ && TYPE_MAIN_VARIANT (type) != char_type_node)
+   {
+ HOST_WIDE_INT pos;
+ if (TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE)
+   pos = tree_to_shwi (index);
+ else
+   pos = int_byte_position (index);
+ if (mask[pos])
+   {
+ CONSTRUCTOR_NO_CLEARING (t) = 1;
+ mask[pos] = 0;
+ continue;
+   }
+   }
+  if (TREE_CODE (value) == CONSTRUCTOR)
+   {
+ HOST_WIDE_INT pos;
+ if (TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE)
+   pos = tree_to_shwi (index)
+ * tree_to_shwi (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (t;
+ else
+   pos = int_byte_position (index);
+ clear_uchar_or_std_byte_in_mask (value, mask + pos);
+   }
+  if (i != j)
+   {
+ CONSTRUCTOR_ELT (t, j)->index = index;
+ CONSTRUCTOR_ELT (t, j)->value = value;
+   }
+  ++j;
+}
+  if (CONSTRUCTOR_NELTS (t) != j)
+vec_safe_truncate (CONSTRUCTOR_ELTS (t), j);
+}
+
 /* Subroutine of cxx_eval_constant_expression.
Attempt to evaluate a BIT_CAST_EXPR.  */
 
@@ -4344,12 +4397,30 @@ cxx_eval_bit_cast (const constexpr_ctx *
 
   tree r = NULL_TREE;
   if (can_native_interpret_type_p (TREE_TYPE (t)))
-r = native_interpret_expr (TREE_TYPE (t), ptr, len);
+{
+  r = native_interpret_expr (TREE_TYPE (t), ptr, len);
+  if (cxx_dialect >= cxx23
+ && is_byte_access_type (TREE_TYPE (t))
+ && TYPE_MAIN_VARIANT (TREE_TYPE (t)) != char_type_node)
+   {
+ gcc_assert (len == 1);
+ if (mask[0])
+   {
+ memset (mask, 0, len);
+

[PATCH] path solver: Use only one ssa_global_cache.

2021-11-29 Thread Aldy Hernandez via Gcc-patches

We're using a temporary range cache while computing ranges for PHIs to
make sure the real cache doesn't get set until all PHIs are computed.
With the ltrans beast in LTO mode this causes undue overhead.

Since we already have a bitmap to indicate whether there's a cache
entry, we can avoid the extra cache object by clearing it while PHIs
are being calculated.

Tested on x86-64 Linux.

OK for trunk?

gcc/ChangeLog:

PR tree-optimization/103409
* gimple-range-path.cc (path_range_query::compute_ranges_in_phis):
Do all the work with just one ssa_global_cache.
* gimple-range-path.h: Remove m_tmp_phi_cache.
---
 gcc/gimple-range-path.cc | 23 +++
 gcc/gimple-range-path.h  |  2 --
 2 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/gcc/gimple-range-path.cc b/gcc/gimple-range-path.cc
index b9c71226c1c..15ef72fd492 100644
--- a/gcc/gimple-range-path.cc
+++ b/gcc/gimple-range-path.cc
@@ -375,30 +375,29 @@ void
 path_range_query::compute_ranges_in_phis (basic_block bb)
 {
   int_range_max r;
-  gphi_iterator iter;
+  auto_bitmap phi_set;
 
   // PHIs must be resolved simultaneously on entry to the block
   // because any dependencies must be satistifed with values on entry.
   // Thus, we calculate all PHIs first, and then update the cache at
   // the end.
 
-  m_tmp_phi_cache.clear ();
-  for (iter = gsi_start_phis (bb); !gsi_end_p (iter); gsi_next ())
+  for (auto iter = gsi_start_phis (bb); !gsi_end_p (iter); gsi_next ())
 {
   gphi *phi = iter.phi ();
   tree name = gimple_phi_result (phi);
 
   if (import_p (name) && range_defined_in_block (r, name, bb))
-   m_tmp_phi_cache.set_global_range (name, r);
-}
-  for (iter = gsi_start_phis (bb); !gsi_end_p (iter); gsi_next ())
-{
-  gphi *phi = iter.phi ();
-  tree name = gimple_phi_result (phi);
-
-  if (m_tmp_phi_cache.get_global_range (r, name))
-   set_cache (r, name);
+   {
+ unsigned v = SSA_NAME_VERSION (name);
+ set_cache (r, name);
+ bitmap_set_bit (phi_set, v);
+ // Pretend we don't have a cache entry for this name until
+ // we're done with all PHIs.
+ bitmap_clear_bit (m_has_cache_entry, v);
+   }
 }
+  bitmap_ior_into (m_has_cache_entry, phi_set);
 }
 
 // Compute ranges defined in the current block, or exported to the
diff --git a/gcc/gimple-range-path.h b/gcc/gimple-range-path.h
index 57a9ae9bdcd..77c92c07bc1 100644
--- a/gcc/gimple-range-path.h
+++ b/gcc/gimple-range-path.h
@@ -82,8 +82,6 @@ private:
   // Range cache for SSA names.
   ssa_global_cache *m_cache;
 
-  ssa_global_cache m_tmp_phi_cache;
-
   // Set for each SSA that has an active entry in the cache.
   bitmap m_has_cache_entry;
 
-- 
2.31.1

PING [PATCH] correct handling of offsets in PHI expressions [PR103215]


Ping:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585180.html

On 11/22/21 4:54 PM, Martin Sebor wrote:

In an effort to avoid false positives while still detecting
certain out-of-bounds accesses the warning code that handles
PHI nodes chooses the operand with the most space remaining
as the one representative of the PHI.  That's not right when
the offsets into the operands are unequal, because it overly
constrains the range of offsets that can be substracted from
the pointer.

The attached change corrects the logic here to not only use
the size of the largest operand but also to extend the range
of offsets into it to reflect all operand.  Unfortunately,
as a result of the more conservative offset computation,
the fix leads to a fair number of false negatives.  I tried
to avoid those but couldn't come up with a clean solution
that didn't require design changes, so I defer those to GCC
13.

The diff is relative to the "cleanup" patch submitted below:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583735.html

Tested on x86_64-linux and by building Glibc and confirming
no new warnings.

Martin

Re: [PATCH] Avoid some -Wunreachable-code-ctrl





On 11/29/2021 8:03 AM, Richard Biener via Gcc-patches wrote:

This cleans up unreachable code diagnosed by -Wunreachable-code-ctrl.
It largely follows the previous series but discovers a few extra
cases, namely dead code after break or continue or loops without
exits.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2021-11-29  Richard Biener  

gcc/c/
* gimple-parser.c (c_parser_gimple_postfix_expression):
avoid unreachable code after break.

gcc/
* cfgrtl.c (skip_insns_after_block): Refactor code to
be more easily readable.
* expr.c (op_by_pieces_d::run): Remove unreachable
assert.
* sched-deps.c (sched_analyze): Remove unreachable
gcc_unreachable.
* sel-sched-ir.c (in_same_ebb_p): Likewise.
* tree-ssa-alias.c (nonoverlapping_refs_since_match_p):
Remove unreachable code.
* tree-vect-slp.c (vectorize_slp_instance_root_stmt):
Refactor to avoid unreachable loop iteration.
* tree.c (walk_tree_1): Remove unreachable break.
* vec-perm-indices.c (vec_perm_indices::series_p): Remove
unreachable return.

gcc/cp/
* parser.c (cp_parser_postfix_expression): Remove
unreachable code.
* pt.c (tsubst_expr): Remove unreachable breaks.

gcc/fortran/
* frontend-passes.c (gfc_expr_walker): Remove unreachable
break.
* scanner.c (skip_fixed_comments): Remove unreachable
gcc_unreachable.
* trans-expr.c (gfc_expr_is_variable): Refactor to make
control flow more obvious.

OK
jeff

PING 3 [PATCH] fix up compute_objsize (including PR 103143)


Ping:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583735.html

On 11/22/21 9:41 AM, Martin Sebor wrote:

Ping:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583735.html

On 11/15/21 9:49 AM, Martin Sebor wrote:

Ping for the following cleanup patch:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583735.html

On 11/8/21 7:34 PM, Martin Sebor wrote:

The pointer-query code that implements compute_objsize() that's
in turn used by most middle end access warnings now has a few
warts in it and (at least) one bug.  With the exception of
the bug the warts aren't behind any user-visible bugs that
I know of but they do cause problems in new code I've been
implementing on top of it.  Besides fixing the one bug (just
a typo) the attached patch cleans up these latent issues:

1) It moves the bndrng member from the access_ref class to
    access_data.  As a FIXME in the code notes, the member never
    did belong in the former and only takes up space in the cache.

2) The compute_objsize_r() function is big, unwieldy, and tedious
    to step through because of all the if statements that are better
    coded as one switch statement.  This change factors out more
    of its code into smaller handler functions as has been suggested
    and done a few times before.

3) (2) exposed a few places where I fail to pass the current
    GIMPLE statement down to ranger.  This leads to worse quality
    range info, including possible false positives and negatives.
    I just spotted these problems in code review but I haven't
    taken the time to come up with test cases.  This change fixes
    these oversights as well.

4) The handling of PHI statements is also in one big, hard-to-
    follow function.  This change moves the handling of each PHI
    argument into its own handler which merges it into the previous
    argument.  This makes the code easier to work with and opens it
    to reuse also for MIN_EXPR and MAX_EXPR.  (This is primarily
    used to print informational notes after warnings.)

5) Finally, the patch factors code to dump each access_ref
    cached by the pointer_query cache out of pointer_query::dump
    and into access_ref::dump.  This helps with debugging.

These changes should have no user-visible effect and other than
a regression test for the typo (PR 103143) come with no tests.
They've been tested on x86_64-linux.

Martin

Re: [PATCH 1/4] Canonicalize argument order for commutative functions

2021-11-29 Thread Richard Sandiford via Gcc-patches

Sorry for the slow response, was away last week.

Richard Biener  writes:
> On Wed, Nov 10, 2021 at 1:50 PM Richard Sandiford via Gcc-patches
>  wrote:
>>
>> This patch uses information about internal functions to canonicalize
>> the argument order of calls.
>>
>> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>
> OK.  Note the gimple_resimplifyN functions also canonicalize operand
> order, currently for is_tree_code only:
>
>   /* Canonicalize operand order.  */
>   bool canonicalized = false;
>   if (res_op->code.is_tree_code ()
>   && (TREE_CODE_CLASS ((enum tree_code) res_op->code) == tcc_comparison
>   || commutative_tree_code (res_op->code))
>   && tree_swap_operands_p (res_op->ops[0], res_op->ops[1]))
> {
>   std::swap (res_op->ops[0], res_op->ops[1]);
>   if (TREE_CODE_CLASS ((enum tree_code) res_op->code) == tcc_comparison)
> res_op->code = swap_tree_comparison (res_op->code);
>   canonicalized = true;
> }
>
> that's maybe not the best place.  The function assumes the operands
> are already valueized,
> so it maybe should be valueization that does the canonicalization -
> but I think doing it
> elsewhere made operand order unreliable (we do end up with
> non-canonical order in
> the IL sometimes).
>
> So maybe you should amend the code in resimplifyN as well.

Hmm, yeah, thanks for the heads up.  Does this updated version look OK?
Tested as before.

Thanks,
Richard


gcc/
* gimple-fold.c: Include internal-fn.h.
(fold_stmt_1): If a function maps to an internal one, use
first_commutative_argument to canonicalize the order of
commutative arguments.
* gimple-match-head.c (gimple_resimplify2, gimple_resimplify3)
(gimple_resimplify4, gimple_resimplify5): Extend commutativity
checks to functions.

gcc/testsuite/
* gcc.dg/fmax-fmin-1.c: New test.
---
 gcc/gimple-fold.c  | 25 --
 gcc/gimple-match-head.c| 52 --
 gcc/testsuite/gcc.dg/fmax-fmin-1.c | 18 +++
 3 files changed, 75 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/fmax-fmin-1.c

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 44fba12e150..1d8fd74f72c 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -69,6 +69,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "varasm.h"
 #include "memmodel.h"
 #include "optabs.h"
+#include "internal-fn.h"
 
 enum strlen_range_kind {
   /* Compute the exact constant string length.  */
@@ -6109,18 +6110,36 @@ fold_stmt_1 (gimple_stmt_iterator *gsi, bool inplace, 
tree (*valueize) (tree))
   break;
 case GIMPLE_CALL:
   {
-   for (i = 0; i < gimple_call_num_args (stmt); ++i)
+   gcall *call = as_a (stmt);
+   for (i = 0; i < gimple_call_num_args (call); ++i)
  {
-   tree *arg = gimple_call_arg_ptr (stmt, i);
+   tree *arg = gimple_call_arg_ptr (call, i);
if (REFERENCE_CLASS_P (*arg)
&& maybe_canonicalize_mem_ref_addr (arg))
  changed = true;
  }
-   tree *lhs = gimple_call_lhs_ptr (stmt);
+   tree *lhs = gimple_call_lhs_ptr (call);
if (*lhs
&& REFERENCE_CLASS_P (*lhs)
&& maybe_canonicalize_mem_ref_addr (lhs))
  changed = true;
+   if (*lhs)
+ {
+   combined_fn cfn = gimple_call_combined_fn (call);
+   internal_fn ifn = associated_internal_fn (cfn, TREE_TYPE (*lhs));
+   int opno = first_commutative_argument (ifn);
+   if (opno >= 0)
+ {
+   tree arg1 = gimple_call_arg (call, opno);
+   tree arg2 = gimple_call_arg (call, opno + 1);
+   if (tree_swap_operands_p (arg1, arg2))
+ {
+   gimple_call_set_arg (call, opno, arg2);
+   gimple_call_set_arg (call, opno + 1, arg1);
+   changed = true;
+ }
+ }
+ }
break;
   }
 case GIMPLE_ASM:
diff --git a/gcc/gimple-match-head.c b/gcc/gimple-match-head.c
index c481a625581..2d9364ca5de 100644
--- a/gcc/gimple-match-head.c
+++ b/gcc/gimple-match-head.c
@@ -294,18 +294,16 @@ gimple_resimplify2 (gimple_seq *seq, gimple_match_op 
*res_op,
 
   /* Canonicalize operand order.  */
   bool canonicalized = false;
-  if (res_op->code.is_tree_code ())
+  bool is_comparison
+= (res_op->code.is_tree_code ()
+   && TREE_CODE_CLASS (tree_code (res_op->code)) == tcc_comparison);
+  if ((is_comparison || commutative_binary_op_p (res_op->code, res_op->type))
+  && tree_swap_operands_p (res_op->ops[0], res_op->ops[1]))
 {
-  auto code = tree_code (res_op->code);
-  if ((TREE_CODE_CLASS (code) == tcc_comparison
-  || commutative_tree_code (code))
- && tree_swap_operands_p (res_op->ops[0], res_op->ops[1]))
-   {
- std::swap (res_op->ops[0],

Re: [PATCH] introduce predicate analysis class


On 11/25/21 3:18 AM, Richard Biener wrote:

On Mon, Aug 30, 2021 at 10:06 PM Martin Sebor via Gcc-patches
 wrote:


The predicate analysis subset of the tree-ssa-uninit pass isn't
necessarily specific to the detection of uninitialized reads.
Suitably parameterized, the same core logic could be used in
other warning passes to improve their S/N ratio, or issue more
nuanced diagnostics (e.g., when an invalid access cannot be
ruled out but also need not in reality be unavoidable, issue
a "may be invalid" type of warning rather than "is invalid").

Separating the predicate analysis logic from the uninitialized
pass and exposing a narrow API should also make it easier to
understand and evolve each part independently of the other,
or replace one with a better implementation without modifying
the other.(*)

As the first step in this direction, the attached patch extracts
the predicate analysis logic out of the pass, turns the interface
into public class members, and hides the internals in either
private members or static functions defined in a new source file.
(**)

The changes should have no externally observable effect (i.e.,
should cause no changes in warnings), except on the contents of
the uninitialized dump.  While making the changes I enhanced
the dumps to help me follow the logic.  Turning some previously
free-standing functions into members involved changing their
signatures and adjusting their callers.  While making these
changes I also renamed some of them as well some variables for
improved clarity.  Finally, I moved declarations of locals
closer to their point of initialization.

Tested on x86_64-linux.  Besides the usual bootstrap/regtest
I also tentatively verified the generality of the new class
interfaces by making use of it in -Warray-bounds.  Besides there,
I'd like to make use of it in the new gimple-ssa-warn-access pass
and, longer term, any other flow-sensitive warnings that might
benefit from it.


This changed can_chain_union_be_invalidated_p from

   for (size_t i = 0; i < uninit_pred.length (); ++i)
 {
   pred_chain c = uninit_pred[i];
   size_t j;
   for (j = 0; j < c.length (); ++j)
 if (can_one_predicate_be_invalidated_p (c[j], use_guard))
   break;

   /* If we were unable to invalidate any predicate in C, then there
  is a viable path from entry to the PHI where the PHI takes
  an uninitialized value and continues to a use of the PHI.  */
   if (j == c.length ())
 return false;
 }
   return true;

to

   for (unsigned i = 0; i < preds.length (); ++i)
 {
   const pred_chain  = preds[i];
   for (unsigned j = 0; j < chain.length (); ++j)
 if (can_be_invalidated_p (chain[j], guard))
   return true;

   /* If we were unable to invalidate any predicate in C, then there
  is a viable path from entry to the PHI where the PHI takes
  an interesting value and continues to a use of the PHI.  */
   return false;
 }
   return true;

which isn't semantically equivalent (it also uses overloading to
confuse me).  In particular the old code checked whether an
invalidation can happen for _each_ predicate in 'preds' while
the new one just checks preds[0], so the loop is pointless.

Catched by -Wunreachable-code complaining about unreachable
++i

Martin, was that change intended?


I didn't mean to make semantic changes (or to confuse you) so
the removal of the test is almost certainly an accident.  It's
interesting that the change hasn't caused any other trouble.
We should look into that.

Martin



Richard.



Martin

[*] A review of open -Wuninitialized bugs I did while working
on this project made me aware of a number of opportunities to
improve the analyzer to reduce the number of false positives
-Wmaybe-uninitiailzed suffers from.

[**] The class isn't fully general and, like the uninit pass,
only works with PHI nodes.  I plan to generalize it to compute
the set of predicates between any two basic blocks.

Re: [PATCH 4/5] vect: Make reduction code handle calls

2021-11-29 Thread Richard Sandiford via Gcc-patches

Richard Biener via Gcc-patches  writes:
> On Tue, Nov 16, 2021 at 5:24 PM Richard Sandiford
>  wrote:
>>
>> Richard Biener via Gcc-patches  writes:
>> > On Wed, Nov 10, 2021 at 1:48 PM Richard Sandiford via Gcc-patches
>> >  wrote:
>> >>
>> >> This patch extends the reduction code to handle calls.  So far
>> >> it's a structural change only; a later patch adds support for
>> >> specific function reductions.
>> >>
>> >> Most of the patch consists of using code_helper and gimple_match_op
>> >> to describe the reduction operations.  The other main change is that
>> >> vectorizable_call now needs to handle fully-predicated reductions.
>> >>
>> >> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>> >>
>> >> Richard
>> >>
>> >>
>> >> gcc/
>> >> * builtins.h (associated_internal_fn): Declare overload that
>> >> takes a (combined_cfn, return type) pair.
>> >> * builtins.c (associated_internal_fn): Split new overload out
>> >> of original fndecl version.  Also provide an overload that takes
>> >> a (combined_cfn, return type) pair.
>> >> * internal-fn.h (commutative_binary_fn_p): Declare.
>> >> (associative_binary_fn_p): Likewise.
>> >> * internal-fn.c (commutative_binary_fn_p): New function,
>> >> split out from...
>> >> (first_commutative_argument): ...here.
>> >> (associative_binary_fn_p): New function.
>> >> * gimple-match.h (code_helper): Add a constructor that takes
>> >> internal functions.
>> >> (commutative_binary_op_p): Declare.
>> >> (associative_binary_op_p): Likewise.
>> >> (canonicalize_code): Likewise.
>> >> (directly_supported_p): Likewise.
>> >> (get_conditional_internal_fn): Likewise.
>> >> (gimple_build): New overload that takes a code_helper.
>> >> * gimple-fold.c (gimple_build): Likewise.
>> >> * gimple-match-head.c (commutative_binary_op_p): New function.
>> >> (associative_binary_op_p): Likewise.
>> >> (canonicalize_code): Likewise.
>> >> (directly_supported_p): Likewise.
>> >> (get_conditional_internal_fn): Likewise.
>> >> * tree-vectorizer.h: Include gimple-match.h.
>> >> (neutral_op_for_reduction): Take a code_helper instead of a 
>> >> tree_code.
>> >> (needs_fold_left_reduction_p): Likewise.
>> >> (reduction_fn_for_scalar_code): Likewise.
>> >> (vect_can_vectorize_without_simd_p): Declare a nNew overload that 
>> >> takes
>> >> a code_helper.
>> >> * tree-vect-loop.c: Include case-cfn-macros.h.
>> >> (fold_left_reduction_fn): Take a code_helper instead of a 
>> >> tree_code.
>> >> (reduction_fn_for_scalar_code): Likewise.
>> >> (neutral_op_for_reduction): Likewise.
>> >> (needs_fold_left_reduction_p): Likewise.
>> >> (use_mask_by_cond_expr_p): Likewise.
>> >> (build_vect_cond_expr): Likewise.
>> >> (vect_create_partial_epilog): Likewise.  Use gimple_build rather
>> >> than gimple_build_assign.
>> >> (check_reduction_path): Handle calls and operate on code_helpers
>> >> rather than tree_codes.
>> >> (vect_is_simple_reduction): Likewise.
>> >> (vect_model_reduction_cost): Likewise.
>> >> (vect_find_reusable_accumulator): Likewise.
>> >> (vect_create_epilog_for_reduction): Likewise.
>> >> (vect_transform_cycle_phi): Likewise.
>> >> (vectorizable_reduction): Likewise.  Make more use of
>> >> lane_reduc_code_p.
>> >> (vect_transform_reduction): Use gimple_extract_op but expect
>> >> a tree_code for now.
>> >> (vect_can_vectorize_without_simd_p): New overload that takes
>> >> a code_helper.
>> >> * tree-vect-stmts.c (vectorizable_call): Handle reductions in
>> >> fully-masked loops.
>> >> * tree-vect-patterns.c (vect_mark_pattern_stmts): Use
>> >> gimple_extract_op when updating STMT_VINFO_REDUC_IDX.
>> >> ---
>> >>  gcc/builtins.c   |  46 -
>> >>  gcc/builtins.h   |   1 +
>> >>  gcc/gimple-fold.c|   9 +
>> >>  gcc/gimple-match-head.c  |  70 +++
>> >>  gcc/gimple-match.h   |  20 ++
>> >>  gcc/internal-fn.c|  46 -
>> >>  gcc/internal-fn.h|   2 +
>> >>  gcc/tree-vect-loop.c | 420 +++
>> >>  gcc/tree-vect-patterns.c |  23 ++-
>> >>  gcc/tree-vect-stmts.c|  66 --
>> >>  gcc/tree-vectorizer.h|  10 +-
>> >>  11 files changed, 455 insertions(+), 258 deletions(-)
>> >>
>> >> diff --git a/gcc/builtins.c b/gcc/builtins.c
>> >> index 384864bfb3a..03829c03a5a 100644
>> >> --- a/gcc/builtins.c
>> >> +++ b/gcc/builtins.c
>> >> @@ -2139,17 +2139,17 @@ mathfn_built_in_type (combined_fn fn)
>> >>  #undef SEQ_OF_CASE_MATHFN
>> >>  }
>> >>
>> >> -/* If BUILT_IN_NORMAL function FNDECL has an associated internal 
>> >> function,
>> >> -

Re: [PATCH] Remove can_throw_non_call_exceptions special case from operator_div::wi_fold.

2021-11-29 Thread Aldy Hernandez via Gcc-patches

On Mon, Nov 29, 2021 at 3:48 PM Richard Biener
 wrote:
>
> On Mon, Nov 29, 2021 at 3:39 PM Jeff Law  wrote:
> >
> >
> >
> > On 11/29/2021 7:00 AM, Aldy Hernandez via Gcc-patches wrote:
> > > As discussed in the PR.  The code makes no difference, so whatever test
> > > we added this special case for has been fixed or is being papered over.
> > > I think we should fix any fall out upstream.
> > >
> > > [Unless Andrew can remember why we added this and it still applies.]
> > >
> > > Tested on x86-64 Linux.
> > >
> > > OK for trunk?
> > >
> > >   PR 103451
> > >
> > > gcc/ChangeLog:
> > >
> > >   * range-op.cc (operator_div::wi_fold): Remove
> > >   can_throw_non_call_exceptions special case.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >   * gcc.dg/pr103451.c: New test.
> > I'll defer to Andrew, but it seems wrong to me.  The whole point is to
> > set the result to varying so that we don't know the result and never
> > remove the division which is critical for -fnon-call-exceptions.
>
> But that has nothing to do with computing the value range for
> the result which is only accessible when the stmt does _not_ throw ...
>
> That is, if we compute non-VARYING here and because of that
> remove the stmt then _that's_ the place to fix (IMO)

Ughh, I think you're both right.

We should fix this upstream AND we should test for the presence of the
division by 0 in the optimized dump.

Of course doing both opens a can of worms.  The division by zero can
be cleaned up by (at least) DCE, DSE, and the code sinking passes.
I've fixed all 3 in the attached (untested) patch.  Dunno what y'all
want to do at this point.

Aldy
From c521bd22d4a7360c7b01f864392eb5cf68cfc6f0 Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Mon, 29 Nov 2021 12:52:45 +0100
Subject: [PATCH] Remove can_throw_non_call_exceptions special case from
 operator_div::wi_fold.

	PR 103451

gcc/ChangeLog:

	* range-op.cc (operator_div::wi_fold): Remove
	can_throw_non_call_exceptions special case.
	* tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Check for
	can_throw_non_call_exceptions.
	* tree-ssa-dse.c (pass_dse::execute): Same.
	* tree-ssa-sink.c (sink_code_in_bb): Same.

gcc/testsuite/ChangeLog:

	* gcc.dg/pr103451.c: New test.
---
 gcc/range-op.cc |  7 ---
 gcc/testsuite/gcc.dg/pr103451.c | 19 +++
 gcc/tree-ssa-dce.c  |  3 ++-
 gcc/tree-ssa-dse.c  |  3 ++-
 gcc/tree-ssa-sink.c |  4 +++-
 5 files changed, 26 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr103451.c

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index bbf2924f815..6fe5f1cb4e0 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -1832,13 +1832,6 @@ operator_div::wi_fold (irange , tree type,
   return;
 }
 
-  // If flag_non_call_exceptions, we must not eliminate a division by zero.
-  if (cfun->can_throw_non_call_exceptions)
-{
-  r.set_varying (type);
-  return;
-}
-
   // If we're definitely dividing by zero, there's nothing to do.
   if (wi_zero_p (type, divisor_min, divisor_max))
 {
diff --git a/gcc/testsuite/gcc.dg/pr103451.c b/gcc/testsuite/gcc.dg/pr103451.c
new file mode 100644
index 000..90a584490ae
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr103451.c
@@ -0,0 +1,19 @@
+// { dg-do compile }
+// { dg-options "-O2 -w -fnon-call-exceptions -fdump-tree-optimized" }
+
+int func_10_ptr_12;
+
+void func_10(long li_8) 
+{
+  long *ptr_9 = _8;
+  li_8 &= *ptr_9 / 0 ?: li_8;
+  for (;;)
+func_10_ptr_12 &= 4 ? *ptr_9 : 4;
+}
+
+void func_9_s_8() 
+{ 
+  func_10(func_9_s_8); 
+}
+
+// { dg-final { scan-tree-dump " / 0" "optimized" } }
diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index 1f817b95fab..1c1a5cc0811 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -304,7 +304,8 @@ mark_stmt_if_obviously_necessary (gimple *stmt, bool aggressive)
   /* If a statement could throw, it can be deemed necessary unless we
  are allowed to remove dead EH.  Test this after checking for
  new/delete operators since we always elide their EH.  */
-  if (!cfun->can_delete_dead_exceptions
+  if ((!cfun->can_delete_dead_exceptions
+   || cfun->can_throw_non_call_exceptions)
   && stmt_could_throw_p (cfun, stmt))
 {
   mark_stmt_necessary (stmt, true);
diff --git a/gcc/tree-ssa-dse.c b/gcc/tree-ssa-dse.c
index 8717d654e5a..aa16565b429 100644
--- a/gcc/tree-ssa-dse.c
+++ b/gcc/tree-ssa-dse.c
@@ -1446,7 +1446,8 @@ pass_dse::execute (function *fun)
 		  && !gimple_has_side_effects (stmt)
 		  && !is_ctrl_altering_stmt (stmt)
 		  && (!stmt_could_throw_p (fun, stmt)
-		  || fun->can_delete_dead_exceptions))
+		  || (fun->can_delete_dead_exceptions
+			  && !cfun->can_throw_non_call_exceptions)))
 		{
 		  if (dump_file && (dump_flags & TDF_DETAILS))
 		{
diff --git a/gcc/tree-ssa-sink.c b/gcc/tree-ssa-sink.c
index 92f444ec1c8..29299129cfa 100644
--- a/gcc/tree-ssa-sink.c
+++ b/gcc/tree-ssa-sink.c
@@ -696,7

[PATCH] Fix --help -Q output

2021-11-29 Thread Martin Liška


There are cases where a default option value is -1 and
auto-detection happens in e.g. target.

Do not print these options. Leads to the following diff:

-  -fdelete-null-pointer-checks [enabled]
+  -fdelete-null-pointer-checks 
@@ -332 +332 @@
-  -fleading-underscore [enabled]
+  -fleading-underscore 
@@ -393 +393 @@
-  -fprefetch-loop-arrays   [enabled]
+  -fprefetch-loop-arrays   
@@ -502 +502 @@
-  -fstrict-volatile-bitfields  [enabled]
+  -fstrict-volatile-bitfields  
@@ -533 +533 @@
-  -ftree-loop-if-convert   [enabled]
+  -ftree-loop-if-convert   

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

PR middle-end/103438

gcc/ChangeLog:

* opts-common.c (option_enabled): Return flag_var for BOOLEAN
types (the can contain an unknown value -1).
---
 gcc/opts-common.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/gcc/opts-common.c b/gcc/opts-common.c
index 9d1914ff2ff..c4a19b9a0b6 100644
--- a/gcc/opts-common.c
+++ b/gcc/opts-common.c
@@ -1586,7 +1586,8 @@ option_flag_var (int opt_index, struct gcc_options *opts)
 }
 
 /* Return 1 if option OPT_IDX is enabled in OPTS, 0 if it is disabled,

-   or -1 if it isn't a simple on-off switch.  */
+   or -1 if it isn't a simple on-off switch (or if the value is unknown,
+   typically set later in target).  */
 
 int

 option_enabled (int opt_idx, unsigned lang_mask, void *opts)
@@ -1608,9 +1609,9 @@ option_enabled (int opt_idx, unsigned lang_mask, void 
*opts)
   {
   case CLVC_BOOLEAN:
if (option->cl_host_wide_int)
- return *(HOST_WIDE_INT *) flag_var != 0;
+ return *(HOST_WIDE_INT *) flag_var;
else
- return *(int *) flag_var != 0;
+ return *(int *) flag_var;
 
   case CLVC_EQUAL:

if (option->cl_host_wide_int)
--
2.34.0

[PATCH] Avoid some -Wunreachable-code-ctrl

This cleans up unreachable code diagnosed by -Wunreachable-code-ctrl.
It largely follows the previous series but discovers a few extra
cases, namely dead code after break or continue or loops without
exits.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2021-11-29  Richard Biener  

gcc/c/
* gimple-parser.c (c_parser_gimple_postfix_expression):
avoid unreachable code after break.

gcc/
* cfgrtl.c (skip_insns_after_block): Refactor code to
be more easily readable.
* expr.c (op_by_pieces_d::run): Remove unreachable
assert.
* sched-deps.c (sched_analyze): Remove unreachable
gcc_unreachable.
* sel-sched-ir.c (in_same_ebb_p): Likewise.
* tree-ssa-alias.c (nonoverlapping_refs_since_match_p):
Remove unreachable code.
* tree-vect-slp.c (vectorize_slp_instance_root_stmt):
Refactor to avoid unreachable loop iteration.
* tree.c (walk_tree_1): Remove unreachable break.
* vec-perm-indices.c (vec_perm_indices::series_p): Remove
unreachable return.

gcc/cp/
* parser.c (cp_parser_postfix_expression): Remove
unreachable code.
* pt.c (tsubst_expr): Remove unreachable breaks.

gcc/fortran/
* frontend-passes.c (gfc_expr_walker): Remove unreachable
break.
* scanner.c (skip_fixed_comments): Remove unreachable
gcc_unreachable.
* trans-expr.c (gfc_expr_is_variable): Refactor to make
control flow more obvious.
---
 gcc/c/gimple-parser.c |  8 +---
 gcc/cfgrtl.c  | 10 ++
 gcc/cp/parser.c   |  4 
 gcc/cp/pt.c   |  2 --
 gcc/expr.c|  3 ---
 gcc/fortran/frontend-passes.c |  1 -
 gcc/fortran/scanner.c |  1 -
 gcc/fortran/trans-expr.c  | 11 +++
 gcc/sched-deps.c  |  2 --
 gcc/sel-sched-ir.c|  3 ---
 gcc/tree-ssa-alias.c  |  3 ---
 gcc/tree-vect-slp.c   | 22 --
 gcc/tree.c|  2 --
 gcc/vec-perm-indices.c|  1 -
 14 files changed, 14 insertions(+), 59 deletions(-)

diff --git a/gcc/c/gimple-parser.c b/gcc/c/gimple-parser.c
index 32f22dbb8a7..f594a8ccb31 100644
--- a/gcc/c/gimple-parser.c
+++ b/gcc/c/gimple-parser.c
@@ -1698,13 +1698,7 @@ c_parser_gimple_postfix_expression (gimple_parser 
)
}
  break;
}
-  else
-   {
- c_parser_error (parser, "expected expression");
- expr.set_error ();
- break;
-   }
-  break;
+  /* Fallthru.  */
 default:
   c_parser_error (parser, "expected expression");
   expr.set_error ();
diff --git a/gcc/cfgrtl.c b/gcc/cfgrtl.c
index 3744adcc2ba..287a3db643a 100644
--- a/gcc/cfgrtl.c
+++ b/gcc/cfgrtl.c
@@ -3539,14 +3539,8 @@ skip_insns_after_block (basic_block bb)
  continue;
 
case NOTE:
- switch (NOTE_KIND (insn))
-   {
-   case NOTE_INSN_BLOCK_END:
- gcc_unreachable ();
-   default:
- continue;
-   }
- break;
+ gcc_assert (NOTE_KIND (insn) != NOTE_INSN_BLOCK_END);
+ continue;
 
case CODE_LABEL:
  if (NEXT_INSN (insn)
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 0bd58525726..cc88a36dd39 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -7892,10 +7892,6 @@ cp_parser_postfix_expression (cp_parser *parser, bool 
address_p, bool cast_p,
 return postfix_expression;
}
 }
-
-  /* We should never get here.  */
-  gcc_unreachable ();
-  return error_mark_node;
 }
 
 /* Helper function for cp_parser_parenthesized_expression_list and
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 31ed773e145..f4b9d9673fb 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -18242,13 +18242,11 @@ tsubst_expr (tree t, tree args, tsubst_flags_t 
complain, tree in_decl,
   stmt = finish_co_yield_expr (input_location,
   RECUR (TREE_OPERAND (t, 0)));
   RETURN (stmt);
-  break;
 
 case CO_AWAIT_EXPR:
   stmt = finish_co_await_expr (input_location,
   RECUR (TREE_OPERAND (t, 0)));
   RETURN (stmt);
-  break;
 
 case EXPR_STMT:
   tmp = RECUR (EXPR_STMT_EXPR (t));
diff --git a/gcc/expr.c b/gcc/expr.c
index 5673902b1fc..b2815257509 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -1342,9 +1342,6 @@ op_by_pieces_d::run ()
}
 }
   while (1);
-
-  /* The code above should have handled everything.  */
-  gcc_assert (!length);
 }
 
 /* Derived class from op_by_pieces_d, providing support for block move
diff --git a/gcc/fortran/frontend-passes.c b/gcc/fortran/frontend-passes.c
index f5ba7cecd54..16ee2afc9c0 100644
--- a/gcc/fortran/frontend-passes.c
+++ b/gcc/fortran/frontend-passes.c
@@ -5229,7 +5229,6 @@ gfc_expr_walker (gfc_expr **e, walk_expr_fn_t exprfn, 
void *data)
  case EXPR_OP:

Re: [PATCH] Remove can_throw_non_call_exceptions special case from operator_div::wi_fold.

On Mon, Nov 29, 2021 at 3:39 PM Jeff Law  wrote:
>
>
>
> On 11/29/2021 7:00 AM, Aldy Hernandez via Gcc-patches wrote:
> > As discussed in the PR.  The code makes no difference, so whatever test
> > we added this special case for has been fixed or is being papered over.
> > I think we should fix any fall out upstream.
> >
> > [Unless Andrew can remember why we added this and it still applies.]
> >
> > Tested on x86-64 Linux.
> >
> > OK for trunk?
> >
> >   PR 103451
> >
> > gcc/ChangeLog:
> >
> >   * range-op.cc (operator_div::wi_fold): Remove
> >   can_throw_non_call_exceptions special case.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * gcc.dg/pr103451.c: New test.
> I'll defer to Andrew, but it seems wrong to me.  The whole point is to
> set the result to varying so that we don't know the result and never
> remove the division which is critical for -fnon-call-exceptions.

But that has nothing to do with computing the value range for
the result which is only accessible when the stmt does _not_ throw ...

That is, if we compute non-VARYING here and because of that
remove the stmt then _that's_ the place to fix (IMO)

>
> > ---
> >   gcc/range-op.cc |  7 ---
> >   gcc/testsuite/gcc.dg/pr103451.c | 17 +
> >   2 files changed, 17 insertions(+), 7 deletions(-)
> >   create mode 100644 gcc/testsuite/gcc.dg/pr103451.c
> >
> > diff --git a/gcc/range-op.cc b/gcc/range-op.cc
> > index bbf2924f815..6fe5f1cb4e0 100644
> > --- a/gcc/range-op.cc
> > +++ b/gcc/range-op.cc
> > @@ -1832,13 +1832,6 @@ operator_div::wi_fold (irange , tree type,
> > return;
> >   }
> >
> > -  // If flag_non_call_exceptions, we must not eliminate a division by zero.
> > -  if (cfun->can_throw_non_call_exceptions)
> > -{
> > -  r.set_varying (type);
> > -  return;
> > -}
> > -
> > // If we're definitely dividing by zero, there's nothing to do.
> > if (wi_zero_p (type, divisor_min, divisor_max))
> >   {
> > diff --git a/gcc/testsuite/gcc.dg/pr103451.c 
> > b/gcc/testsuite/gcc.dg/pr103451.c
> > new file mode 100644
> > index 000..b83646d0b83
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/pr103451.c
> > @@ -0,0 +1,17 @@
> > +// { dg-do compile }
> > +// { dg-options "-O2 -w" }
> ISTM that what you want to test for is that the division by zero remains
> in the IL for -fnon-call-exceptions.
>
> jeff
>

Re: [PATCH] Make the path to etags used in the build system configurable [PR103021]





On 11/28/2021 6:34 PM, Eric Gallager via Gcc-patches wrote:

The attached patch allows users to specify a path to their `etags`
executable for use when doing `make tags`, which is meant to close PR
other/103021: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103021
I based this patch off of this one from upstream automake:
https://git.savannah.gnu.org/cgit/automake.git/commit/m4?id=d2ccbd7eb38d6a4277d6f42b994eb5a29b1edf29
This means that I just supplied variables that the user can override
for the tags programs, rather than having the configure scripts
actually check for them. I handle etags and ctags separately because
the intl subdirectory has separate targets for them. Tested with `make
tags`; the changes I made work successfully, but some of the
subdirectories still have broken tags targets, so I had to switch to
`make -k tags` part way through. This isn't because of anything I did,
though; the `-k` flag is only necessary because of errors that were
already there before I touched anything. Also note that this patch
only affects the subdirectories that use handwritten Makefiles; the
ones that use automake will have to wait until we update the version
of automake used to be 1.16.4 or newer before they'll be fixed.

OK for the trunk.
jeff

Re: [PATCH] Remove can_throw_non_call_exceptions special case from operator_div::wi_fold.





On 11/29/2021 7:00 AM, Aldy Hernandez via Gcc-patches wrote:

As discussed in the PR.  The code makes no difference, so whatever test
we added this special case for has been fixed or is being papered over.
I think we should fix any fall out upstream.

[Unless Andrew can remember why we added this and it still applies.]

Tested on x86-64 Linux.

OK for trunk?

PR 103451

gcc/ChangeLog:

* range-op.cc (operator_div::wi_fold): Remove
can_throw_non_call_exceptions special case.

gcc/testsuite/ChangeLog:

* gcc.dg/pr103451.c: New test.
I'll defer to Andrew, but it seems wrong to me.  The whole point is to 
set the result to varying so that we don't know the result and never 
remove the division which is critical for -fnon-call-exceptions.




---
  gcc/range-op.cc |  7 ---
  gcc/testsuite/gcc.dg/pr103451.c | 17 +
  2 files changed, 17 insertions(+), 7 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/pr103451.c

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index bbf2924f815..6fe5f1cb4e0 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -1832,13 +1832,6 @@ operator_div::wi_fold (irange , tree type,
return;
  }
  
-  // If flag_non_call_exceptions, we must not eliminate a division by zero.

-  if (cfun->can_throw_non_call_exceptions)
-{
-  r.set_varying (type);
-  return;
-}
-
// If we're definitely dividing by zero, there's nothing to do.
if (wi_zero_p (type, divisor_min, divisor_max))
  {
diff --git a/gcc/testsuite/gcc.dg/pr103451.c b/gcc/testsuite/gcc.dg/pr103451.c
new file mode 100644
index 000..b83646d0b83
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr103451.c
@@ -0,0 +1,17 @@
+// { dg-do compile }
+// { dg-options "-O2 -w" }
ISTM that what you want to test for is that the division by zero remains 
in the IL for -fnon-call-exceptions.


jeff

Re: [PATCH] Fix RTL FE issue with premature return





On 11/29/2021 4:26 AM, Richard Biener via Gcc-patches wrote:

This fixes an issue discovered by -Wunreachable-code-return

Bootstrap / regtest pending on x86_64-unknown-linux-gnu, OK?

2021-11-29  Richard Biener  

* read-rtl-function.c (function_reader::read_rtx_operand):
Return only after resetting m_in_call_function_usage.

OK.
jeff

[PATCH, Fortran] Fix setting of array lower bound for named arrays

2021-11-29 Thread Chung-Lin Tang


This patch by Tobias, fixes a case of setting array low-bounds, found
for particular uses of SOURCE=/MOLD=.

For example:
program A_M
  implicit none
  real, dimension (:), allocatable :: A, B
  allocate (A(0:5))
  call Init (A)
contains
  subroutine Init ( A )
real, dimension ( 0 : ), intent ( in ) :: A
integer, dimension ( 1 ) :: lb_B

allocate (B, mold = A)
...
lb_B = lbound (B, dim=1)   ! Error: lb_B assigned 1, instead of 0 like 
lower-bound of A.

Referencing the Fortran standard:

"16.9.109 LBOUND (ARRAY [, DIM, KIND])"
states:
"If DIM is present, ARRAY is a whole array, and either ARRAY is
 an assumed-size array of rank DIM or dimension DIM of ARRAY has
 nonzero extent, the result has a value equal to the lower bound
 for subscript DIM of ARRAY. Otherwise, if DIM is present, the
 result value is 1."

And on what is a "whole array":

"9.5.2 Whole arrays"
"A whole array is a named array or a structure component ..."

The attached patch adjusts the relevant part in gfc_trans_allocate() to only set
e3_has_nodescriptor only for non-named arrays.

Tobias has tested this once, and I've tested this patch as well on our complete 
set of
testsuites (which usually serves for OpenMP related stuff). Everything appears 
well with no regressions.

Is this okay for trunk?

Thanks,
Chung-Lin

2021-11-29  Tobias Burnus  

gcc/fortran/ChangeLog:

* trans-stmt.c (gfc_trans_allocate): Set e3_has_nodescriptor to true
only for non-named arrays.

gcc/testsuite/ChangeLog:

* gfortran.dg/allocate_with_source_26.f90: Adjust testcase.
* gfortran.dg/allocate_with_mold_4.f90: New testcase.diff --git a/gcc/fortran/trans-stmt.c b/gcc/fortran/trans-stmt.c
index bdf7957..982e1e0 100644
--- a/gcc/fortran/trans-stmt.c
+++ b/gcc/fortran/trans-stmt.c
@@ -6660,16 +6660,13 @@ gfc_trans_allocate (gfc_code * code)
   else
e3rhs = gfc_copy_expr (code->expr3);
 
-  // We need to propagate the bounds of the expr3 for source=/mold=;
-  // however, for nondescriptor arrays, we use internally a lower bound
-  // of zero instead of one, which needs to be corrected for the allocate 
obj
-  if (e3_is == E3_DESC)
-   {
- symbol_attribute attr = gfc_expr_attr (code->expr3);
- if (code->expr3->expr_type == EXPR_ARRAY ||
- (!attr.allocatable && !attr.pointer))
-   e3_has_nodescriptor = true;
-   }
+  // We need to propagate the bounds of the expr3 for source=/mold=.
+  // However, for non-named arrays, the lbound has to be 1 and neither the
+  // bound used inside the called function even when returning an
+  // allocatable/pointer nor the zero used internally.
+  if (e3_is == E3_DESC
+ && code->expr3->expr_type != EXPR_VARIABLE)
+   e3_has_nodescriptor = true;
 }
 
   /* Loop over all objects to allocate.  */
diff --git a/gcc/testsuite/gfortran.dg/allocate_with_mold_4.f90 
b/gcc/testsuite/gfortran.dg/allocate_with_mold_4.f90
new file mode 100644
index 000..d545fe1
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/allocate_with_mold_4.f90
@@ -0,0 +1,24 @@
+program A_M
+  implicit none
+  real, parameter :: C(5:10) = 5.0
+  real, dimension (:), allocatable :: A, B
+  allocate (A(6))
+  call Init (A)
+contains
+  subroutine Init ( A )
+real, dimension ( -1 : ), intent ( in ) :: A
+integer, dimension ( 1 ) :: lb_B
+
+allocate (B, mold = A)
+if (any (lbound (B) /= lbound (A))) stop 1
+if (any (ubound (B) /= ubound (A))) stop 2
+if (any (shape (B) /= shape (A))) stop 3
+if (size (B) /= size (A)) stop 4
+deallocate (B)
+allocate (B, mold = C)
+if (any (lbound (B) /= lbound (C))) stop 5
+if (any (ubound (B) /= ubound (C))) stop 6
+if (any (shape (B) /= shape (C))) stop 7
+if (size (B) /= size (C)) stop 8
+end
+end 
diff --git a/gcc/testsuite/gfortran.dg/allocate_with_source_26.f90 
b/gcc/testsuite/gfortran.dg/allocate_with_source_26.f90
index 28f24fc..323c8a3 100644
--- a/gcc/testsuite/gfortran.dg/allocate_with_source_26.f90
+++ b/gcc/testsuite/gfortran.dg/allocate_with_source_26.f90
@@ -34,23 +34,23 @@ program p
  if (lbound(p1, 1) /= 3 .or. ubound(p1, 1) /= 4 &
  .or. lbound(p2, 1) /= 3 .or. ubound(p2, 1) /= 4 &
  .or. lbound(p3, 1) /= 1 .or. ubound(p3, 1) /= 2 &
- .or. lbound(p4, 1) /= 7 .or. ubound(p4, 1) /= 8 &
+ .or. lbound(p4, 1) /= 1 .or. ubound(p4, 1) /= 2 &
  .or. p1(3)%i /= 43 .or. p1(4)%i /= 56 &
  .or. p2(3)%i /= 43 .or. p2(4)%i /= 56 &
  .or. p3(1)%i /= 43 .or. p3(2)%i /= 56 &
- .or. p4(7)%i /= 11 .or. p4(8)%i /= 12) then
+ .or. p4(1)%i /= 11 .or. p4(2)%i /= 12) then
call abort()
  endif
 
  !write(*,*) lbound(a,1), ubound(a,1) ! prints 1 3
  !write(*,*) lbound(b,1), ubound(b,1) ! prints 1 3
- !write(*,*) lbound(c,1), ubound(c,1) ! prints 3 5
+ !write(*,*) lbound(c,1), ubound(c,1) ! prints 1 3
  !write(*,*) lbound(d,1), ubound(d,1) ! prints 1 5
  !write(*,*) lbound(e,1), ubound(e,1) ! prints 1 6

[PATCH] Remove can_throw_non_call_exceptions special case from operator_div::wi_fold.

2021-11-29 Thread Aldy Hernandez via Gcc-patches

As discussed in the PR.  The code makes no difference, so whatever test
we added this special case for has been fixed or is being papered over.
I think we should fix any fall out upstream.

[Unless Andrew can remember why we added this and it still applies.]

Tested on x86-64 Linux.

OK for trunk?

PR 103451

gcc/ChangeLog:

* range-op.cc (operator_div::wi_fold): Remove
can_throw_non_call_exceptions special case.

gcc/testsuite/ChangeLog:

* gcc.dg/pr103451.c: New test.
---
 gcc/range-op.cc |  7 ---
 gcc/testsuite/gcc.dg/pr103451.c | 17 +
 2 files changed, 17 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr103451.c

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index bbf2924f815..6fe5f1cb4e0 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -1832,13 +1832,6 @@ operator_div::wi_fold (irange , tree type,
   return;
 }
 
-  // If flag_non_call_exceptions, we must not eliminate a division by zero.
-  if (cfun->can_throw_non_call_exceptions)
-{
-  r.set_varying (type);
-  return;
-}
-
   // If we're definitely dividing by zero, there's nothing to do.
   if (wi_zero_p (type, divisor_min, divisor_max))
 {
diff --git a/gcc/testsuite/gcc.dg/pr103451.c b/gcc/testsuite/gcc.dg/pr103451.c
new file mode 100644
index 000..b83646d0b83
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr103451.c
@@ -0,0 +1,17 @@
+// { dg-do compile }
+// { dg-options "-O2 -w" }
+
+int func_10_ptr_12;
+
+void func_10(long li_8) 
+{
+  long *ptr_9 = _8;
+  li_8 &= *ptr_9 / 0 ?: li_8;
+  for (;;)
+func_10_ptr_12 &= 4 ? *ptr_9 : 4;
+}
+
+void func_9_s_8() 
+{ 
+  func_10(func_9_s_8); 
+}
-- 
2.31.1

[PATCH] Place stray return inside if

This avoids a -Wunreachable-code-return diagnostic about
two consecutive returns by placing the return in the appropriate
conditional block.

OK?

Thanks,
Richard.

2021-11-29  Richard Biener  

libstd++-v3/
* src/c++17/fs_ops.cc (fs::equivalent): Move return stmt inside #else.
---
 libstdc++-v3/src/c++17/fs_ops.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/src/c++17/fs_ops.cc b/libstdc++-v3/src/c++17/fs_ops.cc
index 5b7f7edd1c9..cc8e4683900 100644
--- a/libstdc++-v3/src/c++17/fs_ops.cc
+++ b/libstdc++-v3/src/c++17/fs_ops.cc
@@ -898,8 +898,8 @@ fs::equivalent(const path& p1, const path& p2, error_code& 
ec) noexcept
   return false;
 #else
   ec = std::make_error_code(std::errc::function_not_supported);
-#endif
   return false;
+#endif
 }
 
 std::uintmax_t
-- 
2.31.1

[PATCH] Remove more stray returns and gcc_unreachable ()s

This removes more cases that appear when bootstrap with
-Wunreachable-code-return progresses.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2021-11-29  Richard Biener  

* config/i386/i386.c (ix86_shift_rotate_cost): Remove
unreachable return.
* tree-chrec.c (evolution_function_is_invariant_rec_p):
Likewise.
* tree-if-conv.c (if_convertible_stmt_p): Likewise.
* tree-ssa-pre.c (fully_constant_expression): Likewise.
* tree-vrp.c (operand_less_p): Likewise.
* reload.c (reg_overlap_mentioned_for_reload_p): Remove
unreachable gcc_unreachable ().
* sel-sched-ir.c (bb_next_bb): Likewise.
* varasm.c (compare_constant): Likewise.

gcc/cp/
* logic.cc (cnf_size_r): Remove unreachable and inconsistently
placed gcc_unreachable ()s.
* pt.c (iterative_hash_template_arg): Remove unreachable
gcc_unreachable and return.

gcc/fortran/
* target-memory.c (gfc_element_size): Remove unreachable return.

gcc/objc/
* objc-act.c (objc_build_setter_call): Remove unreachable
return.

libcpp/
* charset.c (convert_escape): Remove unreachable break.
---
 gcc/config/i386/i386.c  | 1 -
 gcc/cp/logic.cc | 2 --
 gcc/cp/pt.c | 3 ---
 gcc/fortran/target-memory.c | 1 -
 gcc/objc/objc-act.c | 3 ---
 gcc/reload.c| 7 +++
 gcc/sel-sched-ir.h  | 2 --
 gcc/tree-chrec.c| 2 --
 gcc/tree-if-conv.c  | 2 --
 gcc/tree-ssa-pre.c  | 1 -
 gcc/tree-vrp.c  | 2 --
 gcc/varasm.c| 2 --
 libcpp/charset.c| 1 -
 13 files changed, 3 insertions(+), 26 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 2657e7817ae..523cb55e279 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -20364,7 +20364,6 @@ ix86_shift_rotate_cost (const struct processor_costs 
*cost,
   else
return cost->shift_var;
 }
-  return cost->shift_const;
 }
 
 /* Compute a (partial) cost for rtx X.  Return true if the complete
diff --git a/gcc/cp/logic.cc b/gcc/cp/logic.cc
index 9d892b1473b..f31ae8c58ae 100644
--- a/gcc/cp/logic.cc
+++ b/gcc/cp/logic.cc
@@ -495,7 +495,6 @@ cnf_size_r (tree t)
  else
/* Neither LHS nor RHS is a conjunction.  */
return std::make_pair (0, false);
- gcc_unreachable ();
}
   if (conjunction_p (lhs))
{
@@ -536,7 +535,6 @@ cnf_size_r (tree t)
  else
/* Neither LHS nor RHS is a conjunction.  */
return std::make_pair (2, false);
- gcc_unreachable ();
}
   if (conjunction_p (lhs))
{
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 74323701a7d..31ed773e145 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -1969,9 +1969,6 @@ iterative_hash_template_arg (tree arg, hashval_t val)
val = iterative_hash_template_arg (TREE_OPERAND (arg, i), val);
   return val;
 }
-
-  gcc_unreachable ();
-  return 0;
 }
 
 /* Unregister the specialization SPEC as a specialization of TMPL.
diff --git a/gcc/fortran/target-memory.c b/gcc/fortran/target-memory.c
index 7b21a9e04e8..ab4665c6782 100644
--- a/gcc/fortran/target-memory.c
+++ b/gcc/fortran/target-memory.c
@@ -138,7 +138,6 @@ gfc_element_size (gfc_expr *e, size_t *siz)
   *siz = 0;
   return false;
 }
-  return true;
 }
 
 
diff --git a/gcc/objc/objc-act.c b/gcc/objc/objc-act.c
index 9baa46d2243..89f46294123 100644
--- a/gcc/objc/objc-act.c
+++ b/gcc/objc/objc-act.c
@@ -1904,9 +1904,6 @@ objc_build_setter_call (tree lhs, tree rhs)
 setter_argument, NULL);
   return setter;
 }
-
-  /* Unreachable, but the compiler may not realize.  */
-  return error_mark_node;
 }
 
 /* This hook routine is called when a MODIFY_EXPR is being built.  We
diff --git a/gcc/reload.c b/gcc/reload.c
index 190db6ac47f..9ee3439709b 100644
--- a/gcc/reload.c
+++ b/gcc/reload.c
@@ -6602,11 +6602,10 @@ reg_overlap_mentioned_for_reload_p (rtx x, rtx in)
return (rtx_equal_p (x, in)
|| reg_overlap_mentioned_for_reload_p (x, XEXP (in, 0))
|| reg_overlap_mentioned_for_reload_p (x, XEXP (in, 1)));
-  else return (reg_overlap_mentioned_for_reload_p (XEXP (x, 0), in)
-  || reg_overlap_mentioned_for_reload_p (XEXP (x, 1), in));
+  else
+   return (reg_overlap_mentioned_for_reload_p (XEXP (x, 0), in)
+   || reg_overlap_mentioned_for_reload_p (XEXP (x, 1), in));
 }
-
-  gcc_unreachable ();
 }
 
 /* Return nonzero if anything in X contains a MEM.  Look also for pseudo
diff --git a/gcc/sel-sched-ir.h b/gcc/sel-sched-ir.h
index 8ee0529d5a8..18e03c4cb96 100644
--- a/gcc/sel-sched-ir.h
+++ b/gcc/sel-sched-ir.h
@@ -1493,8 +1493,6 @@ bb_next_bb (basic_block bb)
 default:
   return bb->next_bb;
 }
-
-  gcc_unreachable ();
 }
 
 
diff --git

[PATCH] RISC-V: jal cannot refer to a default visibility symbol for shared object.

2021-11-29 Thread Nelson Chu

This is the original binutils bugzilla report,
https://sourceware.org/bugzilla/show_bug.cgi?id=28509

And this is the first version of the proposed binutils patch,
https://sourceware.org/pipermail/binutils/2021-November/118398.html

After applying the binutils patch, I get the the unexpected error when
building libgcc,

/scratch/nelsonc/riscv-gnu-toolchain/riscv-gcc/libgcc/config/riscv/div.S:42:
/scratch/nelsonc/build-upstream/rv64gc-linux/build-install/riscv64-unknown-linux-gnu/bin/ld:
 relocation R_RISCV_JAL against `__udivdi3' which may bind externally can not 
be used when making a shared object; recompile with -fPIC

Therefore, this patch add an extra hidden alias symbol for __udivdi3, and
then use HIDDEN_JUMPTARGET to target a non-preemptible symbol instead.
The solution is similar to glibc as follows,
https://sourceware.org/git/?p=glibc.git;a=commit;h=68389203832ab39dd0dbaabbc4059e7fff51c29b

libgcc/ChangeLog:

* config/riscv/div.S: Add the hidden alias symbol for __udivdi3, and
then use HIDDEN_JUMPTARGET to target it since it is non-preemptible.
* config/riscv/riscv-asm.h: Added new macros HIDDEN_JUMPTARGET and
HIDDEN_DEF.
---
 libgcc/config/riscv/div.S   | 15 ---
 libgcc/config/riscv/riscv-asm.h |  6 ++
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/libgcc/config/riscv/div.S b/libgcc/config/riscv/div.S
index c9bd787..723c3b8 100644
--- a/libgcc/config/riscv/div.S
+++ b/libgcc/config/riscv/div.S
@@ -40,7 +40,7 @@ FUNC_BEGIN (__udivsi3)
   slla0, a0, 32
   slla1, a1, 32
   move   t0, ra
-  jal__udivdi3
+  jalHIDDEN_JUMPTARGET(__udivdi3)
   sext.w a0, a0
   jr t0
 FUNC_END (__udivsi3)
@@ -52,7 +52,7 @@ FUNC_BEGIN (__umodsi3)
   srla0, a0, 32
   srla1, a1, 32
   move   t0, ra
-  jal__udivdi3
+  jalHIDDEN_JUMPTARGET(__udivdi3)
   sext.w a0, a1
   jr t0
 FUNC_END (__umodsi3)
@@ -95,11 +95,12 @@ FUNC_BEGIN (__udivdi3)
 .L5:
   ret
 FUNC_END (__udivdi3)
+HIDDEN_DEF (__udivdi3)
 
 FUNC_BEGIN (__umoddi3)
   /* Call __udivdi3(a0, a1), then return the remainder, which is in a1.  */
   move  t0, ra
-  jal   __udivdi3
+  jal   HIDDEN_JUMPTARGET(__udivdi3)
   move  a0, a1
   jrt0
 FUNC_END (__umoddi3)
@@ -111,12 +112,12 @@ FUNC_END (__umoddi3)
   bgtz  a1, .L12 /* Compute __udivdi3(-a0, a1), then negate the result.  */
 
   neg   a1, a1
-  j __udivdi3/* Compute __udivdi3(-a0, -a1).  */
+  j HIDDEN_JUMPTARGET(__udivdi3) /* Compute __udivdi3(-a0, -a1).  */
 .L11:/* Compute __udivdi3(a0, -a1), then negate the result.  */
   neg   a1, a1
 .L12:
   move  t0, ra
-  jal   __udivdi3
+  jal   HIDDEN_JUMPTARGET(__udivdi3)
   neg   a0, a0
   jrt0
 FUNC_END (__divdi3)
@@ -126,7 +127,7 @@ FUNC_BEGIN (__moddi3)
   bltz   a1, .L31
   bltz   a0, .L32
 .L30:
-  jal__udivdi3/* The dividend is not negative.  */
+  jalHIDDEN_JUMPTARGET(__udivdi3)/* The dividend is not negative.  */
   move   a0, a1
   jr t0
 .L31:
@@ -134,7 +135,7 @@ FUNC_BEGIN (__moddi3)
   bgez   a0, .L30
 .L32:
   nega0, a0
-  jal__udivdi3/* The dividend is hella negative.  */
+  jalHIDDEN_JUMPTARGET(__udivdi3)/* The dividend is hella negative.  */
   nega0, a1
   jr t0
 FUNC_END (__moddi3)
diff --git a/libgcc/config/riscv/riscv-asm.h b/libgcc/config/riscv/riscv-asm.h
index 8550707..96dd85b 100644
--- a/libgcc/config/riscv/riscv-asm.h
+++ b/libgcc/config/riscv/riscv-asm.h
@@ -33,3 +33,9 @@ X:
 #define FUNC_ALIAS(X,Y)\
.globl X;   \
X = Y
+
+#define CONCAT1(a, b)  CONCAT2(a, b)
+#define CONCAT2(a, b)  a ## b
+#define HIDDEN_JUMPTARGET(X)   CONCAT1(__hidden_, X)
+#define HIDDEN_DEF(X)  FUNC_ALIAS(HIDDEN_JUMPTARGET(X), X); \
+   .hidden HIDDEN_JUMPTARGET(X)
-- 
2.7.4

Re: [PATCH] Loop unswitching: support gswitch statements.

2021-11-29 Thread Martin Liška


On 11/26/21 09:12, Richard Biener wrote:

On Wed, Nov 24, 2021 at 3:32 PM Martin Liška  wrote:


On 11/24/21 15:14, Martin Liška wrote:

It likely miscompiles gcc.dg/loop-unswitch-5.c, working on that..


Fixed that in the updated version.


Function level comments need updating it seems.


I've done that.



+static unsigned
+evaluate_insns (class loop *loop,  basic_block *bbs,
+   predicate_vector _path,
+   auto_bb_flag _flag)
+{
+  auto_vec worklist (loop->num_nodes);
+  worklist.quick_push (bbs[0]);
...

so when adding gswitch support the easiest way to make

+  FOR_EACH_EDGE (e, ei, bb->succs)
+   {
...
+   {
+ worklist.safe_push (dest);
+ dest->flags |= reachable_flag;

work is when the gcond/gswitch simplification would mark
outgoing edges as (non-)executable.  For gswitch this
could be achieved by iterating over the case labels and
intersecting that with the range while for gcond it's a
matter of setting an edge flag instead of returning true/false.


Exactly, it can be quite naturally added to the current patch.


I'd call the common function evaluate_control_stmt_using_entry_checks
or so and invoke it on the last stmt of a block with >= 2 outgoing
edges.


Yes, I'll do it for the gswitch support patch.



We still seem to do the simplification work twice, once for costing
and once for transform, but that's OK for now I guess.

I think you want to clear_aux_for_blocks at the end of the pass.


Called that.



Otherwise I like it - it seems you have some TODO around cost
modeling.  Did you try to do gswitch support ontop as I suggested
to see if the general structure keeps working?


I vanished and tested the patch. No, I don't have the gswitch support patch
as the current patch was reworked a few times.

Can we please progress and have installed the suggested patch?

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin



Thanks,
Richard.



Martin
From cda58fc6552d4f55ae079e137bcb805fd8ebbc74 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Mon, 22 Nov 2021 13:54:20 +0100
Subject: [PATCH] Loop unswitching: refactoring & costing improvement

gcc/ChangeLog:

	* dbgcnt.def (DEBUG_COUNTER): Add loop_unswitch debug counter.
	* tree-ssa-loop-unswitch.c (struct unswitch_predicate): New.
	(tree_unswitch_single_loop): Rework it by iterating all
	candidates.
	(tree_may_unswitch_on): Support Ranger.
	(find_unswitching_predicates_for_bb): New.
	(get_predicates_for_bb): Likewise.
	(set_predicates_for_bb): Likewise.
	(init_loop_unswitch_info): Likewise.
	(tree_ssa_unswitch_loops): Initialize info before calling
	init_loop_unswitch_info.
	(combine_range): New.
	(simplify_using_entry_checks): Rename to ...
	(evaluate_control_stmt_using_entry_checks): ... this.
	(simplify_loop_version): Support predicate_path as argument.
	(evaluate_insns): New.
	(evaluate_loop_insns_for_predicate): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.dg/loop-unswitch-8.c: New test.
	* gcc.dg/loop-unswitch-9.c: New test.
---
 gcc/dbgcnt.def |   1 +
 gcc/testsuite/gcc.dg/loop-unswitch-8.c |  31 ++
 gcc/testsuite/gcc.dg/loop-unswitch-9.c |  27 ++
 gcc/tree-ssa-loop-unswitch.c   | 592 +
 4 files changed, 463 insertions(+), 188 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/loop-unswitch-8.c
 create mode 100644 gcc/testsuite/gcc.dg/loop-unswitch-9.c

diff --git a/gcc/dbgcnt.def b/gcc/dbgcnt.def
index f8a15f3d1d1..278fb1112b3 100644
--- a/gcc/dbgcnt.def
+++ b/gcc/dbgcnt.def
@@ -187,6 +187,7 @@ DEBUG_COUNTER (ira_move)
 DEBUG_COUNTER (ivopts_loop)
 DEBUG_COUNTER (lim)
 DEBUG_COUNTER (local_alloc_for_sched)
+DEBUG_COUNTER (loop_unswitch)
 DEBUG_COUNTER (match)
 DEBUG_COUNTER (merged_ipa_icf)
 DEBUG_COUNTER (phiopt_edge_range)
diff --git a/gcc/testsuite/gcc.dg/loop-unswitch-8.c b/gcc/testsuite/gcc.dg/loop-unswitch-8.c
new file mode 100644
index 000..ae5f8f300e9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/loop-unswitch-8.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details" } */
+
+int
+foo(double *a, double *b, double *c, double *d, double *r, int size, int order)
+{
+  for (int i = 0; i < size; i++)
+  {
+double tmp;
+
+if (order < 3)
+  tmp = -8 * a[i];
+else
+  tmp = -4 * b[i];
+
+double x = 3 * tmp + d[i] + tmp;
+
+if (5 > order)
+  x += 2;
+
+if (order == 12345)
+  x *= 5;
+
+double y = 3.4f * tmp + d[i];
+r[i] = x + y;
+  }
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times ";; Unswitching loop on condition: order" 3 "unswitch" } } */
diff --git a/gcc/testsuite/gcc.dg/loop-unswitch-9.c b/gcc/testsuite/gcc.dg/loop-unswitch-9.c
new file mode 100644
index 000..9dd6023d49d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/loop-unswitch-9.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -funswitch-loops

Re: [PATCH]middle-end: move bitmask match.pd pattern and update tests

On Mon, 29 Nov 2021, Tamar Christina wrote:

> Hi All,
> 
> Following the previous bugfix this addresses the cosmetic and test issues.
> 
> The vector tests are moved to vect and the scalar are left where they are.
> 
> Bootstrapped Regtested x86_64-pc-linux-gnu and no regressions.
> Tested with -m32/-mno-sse and no issues.
> 
> Ok for master?

OK.

Richard.

> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * match.pd: Move below pattern that rewrites to EQ, NE.
>   * tree.c (bitmask_inv_cst_vector_p): Correct do .. while indentation.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/bic-bitmask-10.c: Moved to gcc.dg/vect/vect-bic-bitmask-10.c.
>   * gcc.dg/bic-bitmask-11.c: Moved to gcc.dg/vect/vect-bic-bitmask-11.c.
>   * gcc.dg/bic-bitmask-12.c: Moved to gcc.dg/vect/vect-bic-bitmask-12.c.
>   * gcc.dg/bic-bitmask-3.c: Moved to gcc.dg/vect/vect-bic-bitmask-3.c.
>   * gcc.dg/bic-bitmask-23.c: Moved to gcc.dg/vect/vect-bic-bitmask-23.c.
>   * gcc.dg/bic-bitmask-2.c: Moved to gcc.dg/vect/vect-bic-bitmask-2.c.
>   * gcc.dg/bic-bitmask-4.c: Moved to gcc.dg/vect/vect-bic-bitmask-4.c.
>   * gcc.dg/bic-bitmask-5.c: Moved to gcc.dg/vect/vect-bic-bitmask-5.c.
>   * gcc.dg/bic-bitmask-6.c: Moved to gcc.dg/vect/vect-bic-bitmask-6.c.
>   * gcc.dg/bic-bitmask-8.c: Moved to gcc.dg/vect/vect-bic-bitmask-8.c.
>   * gcc.dg/bic-bitmask-9.c: Moved to gcc.dg/vect/vect-bic-bitmask-9.c.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 
> e14f97ee1cda805f4e416a236de0d1770e9c933d..10a47e941529fdb25898f421b780a035dd8f8eff
>  100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -5213,20 +5213,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>(eqcmp (bit_and @1 { wide_int_to_tree (ty, mask - rhs); })
>{ build_zero_cst (ty); }))
>  
> -/* Transform comparisons of the form (X & Y) CMP 0 to X CMP2 Z
> -   where ~Y + 1 == pow2 and Z = ~Y.  */
> -(for cst (VECTOR_CST INTEGER_CST)
> - (for cmp (eq ne)
> -  icmp (le gt)
> -  (simplify
> -   (cmp (bit_and:c@2 @0 cst@1) integer_zerop)
> -(with { tree csts = bitmask_inv_cst_vector_p (@1); }
> - (if (csts && (VECTOR_TYPE_P (TREE_TYPE (@1)) || single_use (@2)))
> -  (if (TYPE_UNSIGNED (TREE_TYPE (@1)))
> -   (icmp @0 { csts; })
> -   (with { tree utype = unsigned_type_for (TREE_TYPE (@1)); }
> -  (icmp (convert:utype @0) { csts; }
> -
>  /* -A CMP -B -> B CMP A.  */
>  (for cmp (tcc_comparison)
>   scmp (swapped_tcc_comparison)
> @@ -5713,6 +5699,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> replace if (x == 0) with tem = ~x; if (tem != 0) which is
> clearly less optimal and which we'll transform again in forwprop.  */
>  
> +/* Transform comparisons of the form (X & Y) CMP 0 to X CMP2 Z
> +   where ~Y + 1 == pow2 and Z = ~Y.  */
> +(for cst (VECTOR_CST INTEGER_CST)
> + (for cmp (eq ne)
> +  icmp (le gt)
> +  (simplify
> +   (cmp (bit_and:c@2 @0 cst@1) integer_zerop)
> +(with { tree csts = bitmask_inv_cst_vector_p (@1); }
> + (if (csts && (VECTOR_TYPE_P (TREE_TYPE (@1)) || single_use (@2)))
> +  (if (TYPE_UNSIGNED (TREE_TYPE (@1)))
> +   (icmp @0 { csts; })
> +   (with { tree utype = unsigned_type_for (TREE_TYPE (@1)); }
> +  (icmp (convert:utype @0) { csts; }
> +
>  /* When one argument is a constant, overflow detection can be simplified.
> Currently restricted to single use so as not to interfere too much with
> ADD_OVERFLOW detection in tree-ssa-math-opts.c.
> diff --git a/gcc/testsuite/gcc.dg/bic-bitmask-10.c 
> b/gcc/testsuite/gcc.dg/vect/vect-bic-bitmask-10.c
> similarity index 86%
> rename from gcc/testsuite/gcc.dg/bic-bitmask-10.c
> rename to gcc/testsuite/gcc.dg/vect/vect-bic-bitmask-10.c
> index 
> 0d0416028ebe5d5d16c03cfec357b3aad31703c7..fe4f677b64dc96862683faf503eb4900a01e7407
>  100644
> --- a/gcc/testsuite/gcc.dg/bic-bitmask-10.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-bic-bitmask-10.c
> @@ -1,5 +1,5 @@
>  /* { dg-do run } */
> -/* { dg-options "-O3 -save-temps -fdump-tree-dce" } */
> +/* { dg-additional-options "-O3 -save-temps -fdump-tree-dce -w" } */
>  
>  #include 
>  
> @@ -18,7 +18,7 @@ void fun2(int32_t *x, int n)
>  }
>  
>  #define TYPE int32_t
> -#include "bic-bitmask.h"
> +#include "../bic-bitmask.h"
>  
>  /* { dg-final { scan-tree-dump {<=\s*.+\{ 255,.+\}} dce7 { target vect_int } 
> } } */
>  /* { dg-final { scan-tree-dump-not {&\s*.+\{ 4294967290,.+\}} dce7 { target 
> vect_int } } } */
> diff --git a/gcc/testsuite/gcc.dg/bic-bitmask-11.c 
> b/gcc/testsuite/gcc.dg/vect/vect-bic-bitmask-11.c
> similarity index 86%
> rename from gcc/testsuite/gcc.dg/bic-bitmask-11.c
> rename to gcc/testsuite/gcc.dg/vect/vect-bic-bitmask-11.c
> index 
> 0e589c96290286f02cddc27f33f25f0f7b3bb028..b77f4d42450fe6496d277a4429f0e051f5178781
>  100644
> --- a/gcc/testsuite/gcc.dg/bic-bitmask-11.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-bic-bitmask-11.c
> @@ -1,5 +1,5 @@
>  /* { dg-do run } */
> -/* {

[PATCH]middle-end: move bitmask match.pd pattern and update tests

2021-11-29 Thread Tamar Christina via Gcc-patches

Hi All,

Following the previous bugfix this addresses the cosmetic and test issues.

The vector tests are moved to vect and the scalar are left where they are.

Bootstrapped Regtested x86_64-pc-linux-gnu and no regressions.
Tested with -m32/-mno-sse and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

* match.pd: Move below pattern that rewrites to EQ, NE.
* tree.c (bitmask_inv_cst_vector_p): Correct do .. while indentation.

gcc/testsuite/ChangeLog:

* gcc.dg/bic-bitmask-10.c: Moved to gcc.dg/vect/vect-bic-bitmask-10.c.
* gcc.dg/bic-bitmask-11.c: Moved to gcc.dg/vect/vect-bic-bitmask-11.c.
* gcc.dg/bic-bitmask-12.c: Moved to gcc.dg/vect/vect-bic-bitmask-12.c.
* gcc.dg/bic-bitmask-3.c: Moved to gcc.dg/vect/vect-bic-bitmask-3.c.
* gcc.dg/bic-bitmask-23.c: Moved to gcc.dg/vect/vect-bic-bitmask-23.c.
* gcc.dg/bic-bitmask-2.c: Moved to gcc.dg/vect/vect-bic-bitmask-2.c.
* gcc.dg/bic-bitmask-4.c: Moved to gcc.dg/vect/vect-bic-bitmask-4.c.
* gcc.dg/bic-bitmask-5.c: Moved to gcc.dg/vect/vect-bic-bitmask-5.c.
* gcc.dg/bic-bitmask-6.c: Moved to gcc.dg/vect/vect-bic-bitmask-6.c.
* gcc.dg/bic-bitmask-8.c: Moved to gcc.dg/vect/vect-bic-bitmask-8.c.
* gcc.dg/bic-bitmask-9.c: Moved to gcc.dg/vect/vect-bic-bitmask-9.c.

--- inline copy of patch -- 
diff --git a/gcc/match.pd b/gcc/match.pd
index 
e14f97ee1cda805f4e416a236de0d1770e9c933d..10a47e941529fdb25898f421b780a035dd8f8eff
 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -5213,20 +5213,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (eqcmp (bit_and @1 { wide_int_to_tree (ty, mask - rhs); })
 { build_zero_cst (ty); }))
 
-/* Transform comparisons of the form (X & Y) CMP 0 to X CMP2 Z
-   where ~Y + 1 == pow2 and Z = ~Y.  */
-(for cst (VECTOR_CST INTEGER_CST)
- (for cmp (eq ne)
-  icmp (le gt)
-  (simplify
-   (cmp (bit_and:c@2 @0 cst@1) integer_zerop)
-(with { tree csts = bitmask_inv_cst_vector_p (@1); }
- (if (csts && (VECTOR_TYPE_P (TREE_TYPE (@1)) || single_use (@2)))
-  (if (TYPE_UNSIGNED (TREE_TYPE (@1)))
-   (icmp @0 { csts; })
-   (with { tree utype = unsigned_type_for (TREE_TYPE (@1)); }
-(icmp (convert:utype @0) { csts; }
-
 /* -A CMP -B -> B CMP A.  */
 (for cmp (tcc_comparison)
  scmp (swapped_tcc_comparison)
@@ -5713,6 +5699,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
replace if (x == 0) with tem = ~x; if (tem != 0) which is
clearly less optimal and which we'll transform again in forwprop.  */
 
+/* Transform comparisons of the form (X & Y) CMP 0 to X CMP2 Z
+   where ~Y + 1 == pow2 and Z = ~Y.  */
+(for cst (VECTOR_CST INTEGER_CST)
+ (for cmp (eq ne)
+  icmp (le gt)
+  (simplify
+   (cmp (bit_and:c@2 @0 cst@1) integer_zerop)
+(with { tree csts = bitmask_inv_cst_vector_p (@1); }
+ (if (csts && (VECTOR_TYPE_P (TREE_TYPE (@1)) || single_use (@2)))
+  (if (TYPE_UNSIGNED (TREE_TYPE (@1)))
+   (icmp @0 { csts; })
+   (with { tree utype = unsigned_type_for (TREE_TYPE (@1)); }
+(icmp (convert:utype @0) { csts; }
+
 /* When one argument is a constant, overflow detection can be simplified.
Currently restricted to single use so as not to interfere too much with
ADD_OVERFLOW detection in tree-ssa-math-opts.c.
diff --git a/gcc/testsuite/gcc.dg/bic-bitmask-10.c 
b/gcc/testsuite/gcc.dg/vect/vect-bic-bitmask-10.c
similarity index 86%
rename from gcc/testsuite/gcc.dg/bic-bitmask-10.c
rename to gcc/testsuite/gcc.dg/vect/vect-bic-bitmask-10.c
index 
0d0416028ebe5d5d16c03cfec357b3aad31703c7..fe4f677b64dc96862683faf503eb4900a01e7407
 100644
--- a/gcc/testsuite/gcc.dg/bic-bitmask-10.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-bic-bitmask-10.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps -fdump-tree-dce" } */
+/* { dg-additional-options "-O3 -save-temps -fdump-tree-dce -w" } */
 
 #include 
 
@@ -18,7 +18,7 @@ void fun2(int32_t *x, int n)
 }
 
 #define TYPE int32_t
-#include "bic-bitmask.h"
+#include "../bic-bitmask.h"
 
 /* { dg-final { scan-tree-dump {<=\s*.+\{ 255,.+\}} dce7 { target vect_int } } 
} */
 /* { dg-final { scan-tree-dump-not {&\s*.+\{ 4294967290,.+\}} dce7 { target 
vect_int } } } */
diff --git a/gcc/testsuite/gcc.dg/bic-bitmask-11.c 
b/gcc/testsuite/gcc.dg/vect/vect-bic-bitmask-11.c
similarity index 86%
rename from gcc/testsuite/gcc.dg/bic-bitmask-11.c
rename to gcc/testsuite/gcc.dg/vect/vect-bic-bitmask-11.c
index 
0e589c96290286f02cddc27f33f25f0f7b3bb028..b77f4d42450fe6496d277a4429f0e051f5178781
 100644
--- a/gcc/testsuite/gcc.dg/bic-bitmask-11.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-bic-bitmask-11.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps -fdump-tree-dce" } */
+/* { dg-additional-options "-O3 -save-temps -fdump-tree-dce -w" } */
 
 #include 
 
@@ -17,7 +17,7 @@ void fun2(uint32_t *x, int n)
   x[i] = (x[i]&(~255)) != 0;
 }
 
-#include "bic-bitmask.h"
+#include "../bic-bitmask.h"

Re: Patch ping (Re: [PATCH] x86_64: Issue -Wpsabi warning about C++ zero width bitfield ABI changes [PR102024])

2021-11-29 Thread H.J. Lu via Gcc-patches

On Mon, Nov 29, 2021 at 3:24 AM Jakub Jelinek  wrote:
>
> Hi!
>
> On Tue, Aug 31, 2021 at 10:05:58AM +0200, Jakub Jelinek via Gcc-patches wrote:
> > This is an incremental patch to
> > https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578447.html
> > for x86_64 ABI.
> > For zero-width bitfields current GCC classify_argument does:
> >   if (DECL_BIT_FIELD (field))
> > {
> >   for (i = (int_bit_position (field)
> > + (bit_offset % 64)) / 8 / 8;
> >i < ((int_bit_position (field) + (bit_offset % 
> > 64))
> > + tree_to_shwi (DECL_SIZE (field))
> > + 63) / 8 / 8; i++)
> > classes[i]
> >   = merge_classes (X86_64_INTEGER_CLASS, 
> > classes[i]);
> > }
> > which I think means that if the zero-width bitfields are at bit-positions
> > (in the toplevel aggregate) which are multiples of 64 bits doesn't do
> > anything, (int_bit_position (field) + (bit_offset % 64)) / 64 and
> > (int_bit_position (field) + (bit_offset % 64) + 63) / 64 should be equal.
> > But for zero-width bitfields at other bit positions it will call
> > merge_classes once.  Now, the typical case is that the zero width bitfield
> > is surrounded by some bitfields and in that case, it doesn't change
> > anything, but it can be sandwitched in between floats too as the testcases
> > show.
> > In C we had this behavior, in C++ previously the FE was removing the
> > zero-width bitfields and therefore they were ignored.
> > LLVM and ICC seems to ignore those bitfields both in C and C++ (== passing
> > struct S in SSE register rather than in GPR).
>
> I'd like to ping this patch, but perhaps first it would be nice to discuss
> it in the x86-64 psABI group.
> The current psABI doesn't seem to mention zero sized bitfields at all
> explicitly, so perhaps theoretically they should be treated as INTEGER class,
> but if they are at positions multiple of 64 bits, then it is unclear into
> which eightbyte they should be considered, whether the previous one if any
> or the next one if any.  I guess similar problem is for zero sized
> structures, but those should according to algorithm have NO_CLASS and so it
> doesn't really make a difference.  And, no compiler I'm aware of treats
> the zero sized bitfields at 64 bit boundaries as INTEGER class, LLVM/ICC are
> ignoring such bitfields everywhere, GCC ignores them at those boundaries
> (and used to ignore them in C++ everywhere).  I guess my preferred solution
> would be to say explicitly that zero sized bitfields are NO_CLASS.
> I'm not a member of the google x86-64 psABI group, can somebody please raise
> it there?

https://groups.google.com/g/x86-64-abi/c/OYeWs14WHQ4

> > The following patch assumes the current GCC C behavior of not ignoring them
> > when not at bitpositions divisible by 64 is right (though I'm really not
> > sure about that) and implements a -Wpsabi warning for that case.
> > The psABI IMHO should be clarified in any case.
> > The other option is to start ignoring them always (and issue -Wpsabi warning
> > if !DECL_FIELD_ABI_IGNORED on zero-width bitfield and it would change the
> > outcome).
> >
> > I think libffi doesn't need changing, as I think it doesn't support
> > bitfields.
> >
> > Bootstrapped/regtested on x86_64-linux and i686-linux.
> >
> > 2021-08-31  Jakub Jelinek  
> >
> >   PR target/102024
> >   * config/i386/i386.c (classify_argument): Add cxx_bitfields argument,
> >   when seeing DECL_FIELD_ABI_IGNORED bitfields either set it to 1 or
> >   if equal to 2 ignore it.  Pass it to recursive calls.  Add wrapper
> >   with old arguments and diagnose ABI differences for C++ structures
> >   with zero width bitfields.  Formatting fixes.
> >
> >   * gcc.target/i386/pr102024.c: New test.
> >   * g++.target/i386/pr102024.C: New test.
>
> Jakub
>


-- 
H.J.

[PATCH] Only return after resetting type_param_spec_list

This fixes an appearant mistake in gfc_insert_parameter_exprs.

Bootstrap / regtest pending on x86_64-unknown-linux-gnu.

OK?

Thanks,
Richard.

2021-11-29  Richard Biener  

gcc/fortran/
* decl.c (gfc_insert_parameter_exprs): Only return after
resetting type_param_spec_list.
---
 gcc/fortran/decl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index c0fec90e3e0..4971638f9b6 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -3733,9 +3733,9 @@ gfc_insert_parameter_exprs (gfc_expr *e, 
gfc_actual_arglist *param_list)
 {
   gfc_actual_arglist *old_param_spec_list = type_param_spec_list;
   type_param_spec_list = param_list;
-  return gfc_traverse_expr (e, NULL, _parameter_exprs, 1);
-  type_param_spec_list = NULL;
+  bool res = gfc_traverse_expr (e, NULL, _parameter_exprs, 1);
   type_param_spec_list = old_param_spec_list;
+  return res;
 }
 
 /* Determines the instance of a parameterized derived type to be used by
-- 
2.31.1

[PATCH] Fix RTL FE issue with premature return

This fixes an issue discovered by -Wunreachable-code-return

Bootstrap / regtest pending on x86_64-unknown-linux-gnu, OK?

2021-11-29  Richard Biener  

* read-rtl-function.c (function_reader::read_rtx_operand):
Return only after resetting m_in_call_function_usage.
---
 gcc/read-rtl-function.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/read-rtl-function.c b/gcc/read-rtl-function.c
index 5699f574a37..81d7531ce47 100644
--- a/gcc/read-rtl-function.c
+++ b/gcc/read-rtl-function.c
@@ -886,8 +886,9 @@ function_reader::read_rtx_operand (rtx x, int idx)
   if (idx == 7 && CALL_P (x))
{
  m_in_call_function_usage = true;
- return rtx_reader::read_rtx_operand (x, idx);
+ rtx tem = rtx_reader::read_rtx_operand (x, idx);
  m_in_call_function_usage = false;
+ return tem;
}
   else
return rtx_reader::read_rtx_operand (x, idx);
-- 
2.31.1

[PATCH] libcpp: Enable P1949R7 for C++11 and up as it was a DR [PR100977]

2021-11-29 Thread Jakub Jelinek via Gcc-patches

Hi!

Jonathan mentioned on IRC that:
"Accept P1949R7 (C++ Identifier Syntax using Unicode Standard Annex 31) as
a Defect Report and apply the changes therein to the C++ working paper."
while I've actually implemented it only for -std={gnu,c}++{23,2b}.
As the C++98 rules were significantly different, I'm not trying to change
anything for C++98.

So far lightly tested, ok for trunk if it passes full bootstrap/regtest
on x86_64-linux/i686-linux?

2021-11-29  Jakub Jelinek  

PR c++/100977
* init.c (lang_defaults): Enable cxx23_identifiers for
-std={gnu,c}++{11,14,17,20} too.

* c-c++-common/cpp/ucnid-2011-1-utf8.c: Expect errors in C++.
* c-c++-common/cpp/ucnid-2011-1.c: Likewise.
* g++.dg/cpp/ucnid-4-utf8.C: Add missing space to dg-options.
* g++.dg/cpp23/normalize3.C: Enable for c++11 rather than just c++23.
* g++.dg/cpp23/normalize4.C: Likewise.
* g++.dg/cpp23/normalize5.C: Likewise.
* g++.dg/cpp23/normalize7.C: Expect errors rather than just warnings
for c++11 and up rather than just c++23.
* g++.dg/cpp23/ucnid-2-utf8.C: Expect errors even for c++11 .. c++20.

--- libcpp/init.c.jj2021-11-17 20:08:18.359724792 +0100
+++ libcpp/init.c   2021-11-29 11:40:05.989432952 +0100
@@ -114,14 +114,14 @@ static const struct lang_flags lang_defa
   /* STDC2X   */  { 1,  0,  1,  1,  1,  0,1,  1,   1,   0,   0,1, 
1, 1,   1,  0,   1, 1,   0,   1 },
   /* GNUCXX   */  { 0,  1,  1,  1,  0,  0,0,  1,   0,   0,   0,0, 
0, 0,   0,  1,   1, 0,   0,   0 },
   /* CXX98*/  { 0,  1,  0,  1,  0,  0,1,  1,   0,   0,   0,0, 
0, 1,   0,  0,   1, 0,   0,   0 },
-  /* GNUCXX11 */  { 1,  1,  1,  1,  1,  0,0,  1,   1,   1,   1,0, 
0, 0,   0,  1,   1, 0,   0,   0 },
-  /* CXX11*/  { 1,  1,  0,  1,  1,  0,1,  1,   1,   1,   1,0, 
0, 1,   0,  0,   1, 0,   0,   0 },
-  /* GNUCXX14 */  { 1,  1,  1,  1,  1,  0,0,  1,   1,   1,   1,1, 
1, 0,   0,  1,   1, 0,   0,   0 },
-  /* CXX14*/  { 1,  1,  0,  1,  1,  0,1,  1,   1,   1,   1,1, 
1, 1,   0,  0,   1, 0,   0,   0 },
-  /* GNUCXX17 */  { 1,  1,  1,  1,  1,  0,0,  1,   1,   1,   1,1, 
1, 0,   1,  1,   1, 0,   0,   0 },
-  /* CXX17*/  { 1,  1,  1,  1,  1,  0,1,  1,   1,   1,   1,1, 
1, 0,   1,  0,   1, 0,   0,   0 },
-  /* GNUCXX20 */  { 1,  1,  1,  1,  1,  0,0,  1,   1,   1,   1,1, 
1, 0,   1,  1,   1, 0,   0,   0 },
-  /* CXX20*/  { 1,  1,  1,  1,  1,  0,1,  1,   1,   1,   1,1, 
1, 0,   1,  1,   1, 0,   0,   0 },
+  /* GNUCXX11 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,0, 
0, 0,   0,  1,   1, 0,   0,   0 },
+  /* CXX11*/  { 1,  1,  0,  1,  1,  1,1,  1,   1,   1,   1,0, 
0, 1,   0,  0,   1, 0,   0,   0 },
+  /* GNUCXX14 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,1, 
1, 0,   0,  1,   1, 0,   0,   0 },
+  /* CXX14*/  { 1,  1,  0,  1,  1,  1,1,  1,   1,   1,   1,1, 
1, 1,   0,  0,   1, 0,   0,   0 },
+  /* GNUCXX17 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,1, 
1, 0,   1,  1,   1, 0,   0,   0 },
+  /* CXX17*/  { 1,  1,  1,  1,  1,  1,1,  1,   1,   1,   1,1, 
1, 0,   1,  0,   1, 0,   0,   0 },
+  /* GNUCXX20 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,1, 
1, 0,   1,  1,   1, 0,   0,   0 },
+  /* CXX20*/  { 1,  1,  1,  1,  1,  1,1,  1,   1,   1,   1,1, 
1, 0,   1,  1,   1, 0,   0,   0 },
   /* GNUCXX23 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,1, 
1, 0,   1,  1,   1, 0,   1,   1 },
   /* CXX23*/  { 1,  1,  1,  1,  1,  1,1,  1,   1,   1,   1,1, 
1, 0,   1,  1,   1, 0,   1,   1 },
   /* ASM  */  { 0,  0,  1,  0,  0,  0,0,  0,   0,   0,   0,0, 
0, 0,   0,  0,   0, 0,   0,   0 }
--- gcc/testsuite/c-c++-common/cpp/ucnid-2011-1-utf8.c.jj   2020-01-14 
20:02:46.649611841 +0100
+++ gcc/testsuite/c-c++-common/cpp/ucnid-2011-1-utf8.c  2021-11-29 
12:11:51.640324720 +0100
@@ -2,7 +2,7 @@
 /* { dg-options "-std=c11 -pedantic" { target c } } */
 /* { dg-options "-std=c++11 -pedantic" { target c++ } } */
 
-¨
+¨ /* { dg-error "is not valid in an identifier" "" { target c++ } } */
 
 B̀
 
@@ -11,5 +11,5 @@ B̀
 À /* { dg-warning "not in NFC" } */
 
 
-�
-ሴ
+�  /* { dg-error "is not valid in an identifier" "" { target c++ } } */
+ሴ  /* { dg-error "is not valid in an identifier" "" { target c++ } } */
--- gcc/testsuite/c-c++-common/cpp/ucnid-2011-1.c.jj2020-01-14 
20:02:46.649611841 +0100
+++ gcc/testsuite/c-c++-common/cpp/ucnid-2011-1.c   2021-11-29 
12:12:02.230174227 +0100
@@ -2,7 +2,7 @@
 /* {

Re: [AArch64] Enable generation of FRINTNZ instructions

2021-11-29 Thread Andre Vieira (lists) via Gcc-patches




On 18/11/2021 11:05, Richard Biener wrote:


+ (if (!flag_trapping_math
+ && direct_internal_fn_supported_p (IFN_TRUNC, type,
+OPTIMIZE_FOR_BOTH))
+  (IFN_TRUNC @0)
  #endif

does IFN_FTRUNC_INT preserve the same exceptions as doing
explicit intermediate float->int conversions?  I think I'd
prefer to have !flag_trapping_math on both cases.
I realized I never responded to this. The AArch64 instructions mimic the 
behaviour you'd see if you were doing explicit conversions, so I'll be 
defining the new IFN and optab to require the same, such that these can 
be used by the compiler when flag_trapping_math. In the patch I sent 
last I added some likes to the md.texi description of the optab to that 
intent.

Re: [PATCH] Remove unreachable gcc_unreachable () at the end of functions

On Sun, 28 Nov 2021, Jeff Law wrote:

> 
> 
> On 11/25/2021 6:33 AM, Richard Biener via Gcc-patches wrote:
> > It seems to be a style to place gcc_unreachable () after a
> > switch that handles all cases with every case returning.
> > Those are unreachable (well, yes!), so they will be elided
> > at CFG construction time and the middle-end will place
> > another __builtin_unreachable "after" them to note the
> > path doesn't lead to a return when the function is not declared
> > void.
> >
> > So IMHO those explicit gcc_unreachable () serve no purpose,
> > if they could be replaced by a comment.  But since all cases
> > cover switches not handling a case or not returning will
> > likely cause some diagnostic to be emitted which is better
> > than running into an ICE only at runtime.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu - any
> > comments?
> >
> > Thanks,
> > Richard.
> >
> > 2021-11-24  Richard Biener  
> >
> >  * tree.h (reverse_storage_order_for_component_p): Remove
> >  spurious gcc_unreachable.
> >  * cfganal.c (dfs_find_deadend): Likewise.
> >  * fold-const-call.c (fold_const_logb): Likewise.
> >  (fold_const_significand): Likewise.
> >  * gimple-ssa-store-merging.c (lhs_valid_for_store_merging_p):
> >  Likewise.
> >
> > gcc/c-family/
> >  * c-format.c (check_format_string): Remove spurious
> >  gcc_unreachable.
> They would be a check if someone added a case to the switch that didn't
> return.  But we'd get a return-value warning if that happened.  So I don't see
> that they serve much purpose.

I've pushed the change.

Richard.

Patch ping (Re: [PATCH] x86_64: Issue -Wpsabi warning about C++ zero width bitfield ABI changes [PR102024])

2021-11-29 Thread Jakub Jelinek via Gcc-patches

Hi!

On Tue, Aug 31, 2021 at 10:05:58AM +0200, Jakub Jelinek via Gcc-patches wrote:
> This is an incremental patch to
> https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578447.html
> for x86_64 ABI.
> For zero-width bitfields current GCC classify_argument does:
>   if (DECL_BIT_FIELD (field))
> {
>   for (i = (int_bit_position (field)
> + (bit_offset % 64)) / 8 / 8;
>i < ((int_bit_position (field) + (bit_offset % 64))
> + tree_to_shwi (DECL_SIZE (field))
> + 63) / 8 / 8; i++)
> classes[i]
>   = merge_classes (X86_64_INTEGER_CLASS, classes[i]);
> }
> which I think means that if the zero-width bitfields are at bit-positions
> (in the toplevel aggregate) which are multiples of 64 bits doesn't do
> anything, (int_bit_position (field) + (bit_offset % 64)) / 64 and
> (int_bit_position (field) + (bit_offset % 64) + 63) / 64 should be equal.
> But for zero-width bitfields at other bit positions it will call
> merge_classes once.  Now, the typical case is that the zero width bitfield
> is surrounded by some bitfields and in that case, it doesn't change
> anything, but it can be sandwitched in between floats too as the testcases
> show.
> In C we had this behavior, in C++ previously the FE was removing the
> zero-width bitfields and therefore they were ignored.
> LLVM and ICC seems to ignore those bitfields both in C and C++ (== passing
> struct S in SSE register rather than in GPR).

I'd like to ping this patch, but perhaps first it would be nice to discuss
it in the x86-64 psABI group.
The current psABI doesn't seem to mention zero sized bitfields at all
explicitly, so perhaps theoretically they should be treated as INTEGER class,
but if they are at positions multiple of 64 bits, then it is unclear into
which eightbyte they should be considered, whether the previous one if any
or the next one if any.  I guess similar problem is for zero sized
structures, but those should according to algorithm have NO_CLASS and so it
doesn't really make a difference.  And, no compiler I'm aware of treats
the zero sized bitfields at 64 bit boundaries as INTEGER class, LLVM/ICC are
ignoring such bitfields everywhere, GCC ignores them at those boundaries
(and used to ignore them in C++ everywhere).  I guess my preferred solution
would be to say explicitly that zero sized bitfields are NO_CLASS.
I'm not a member of the google x86-64 psABI group, can somebody please raise
it there?

> The following patch assumes the current GCC C behavior of not ignoring them
> when not at bitpositions divisible by 64 is right (though I'm really not
> sure about that) and implements a -Wpsabi warning for that case.
> The psABI IMHO should be clarified in any case.
> The other option is to start ignoring them always (and issue -Wpsabi warning
> if !DECL_FIELD_ABI_IGNORED on zero-width bitfield and it would change the
> outcome).
> 
> I think libffi doesn't need changing, as I think it doesn't support
> bitfields.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux.
> 
> 2021-08-31  Jakub Jelinek  
> 
>   PR target/102024
>   * config/i386/i386.c (classify_argument): Add cxx_bitfields argument,
>   when seeing DECL_FIELD_ABI_IGNORED bitfields either set it to 1 or
>   if equal to 2 ignore it.  Pass it to recursive calls.  Add wrapper
>   with old arguments and diagnose ABI differences for C++ structures
>   with zero width bitfields.  Formatting fixes.
> 
>   * gcc.target/i386/pr102024.c: New test.
>   * g++.target/i386/pr102024.C: New test.

Jakub

Re: [PATCH] Remove unreachable returns

On Sun, 28 Nov 2021, Jeff Law wrote:

> 
> 
> On 11/25/2021 7:16 AM, Richard Biener via Gcc-patches wrote:
> > This removes unreachable return statements as diagnosed by
> > the -Wunreachable-code patch.  Some cases are more obviously
> > an improvement than others - in fact some may get you the idea
> > to replace them with gcc_unreachable () instead, leading to
> > cases of the 'Remove unreachable gcc_unreachable () at the end
> > of functions' patch.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> >
> > OK?  Comments?  Feel free to approve select cases only.
> >
> > Thanks,
> > Richard.
> >
> > 2021-11-25  Richard Biener  
> >
> >  * vec.c (qsort_chk): Do not return the void return value
> >  from the noreturn qsort_chk_error.
> >  * ccmp.c (expand_ccmp_expr_1): Remove unreachable return.
> >  * df-scan.c (df_ref_equal_p): Likewise.
> >  * dwarf2out.c (is_base_type): Likewise.
> >  (add_const_value_attribute): Likewise.
> >  * fixed-value.c (fixed_arithmetic): Likewise.
> >  * gimple-fold.c (gimple_fold_builtin_fputs): Likewise.
> >  * gimple-ssa-strength-reduction.c (stmt_cost): Likewise.
> >  * graphite-isl-ast-to-gimple.c
> >  (gcc_expression_from_isl_expr_op): Likewise.
> >  (gcc_expression_from_isl_expression): Likewise.
> >  * ipa-fnsummary.c (will_be_nonconstant_expr_predicate):
> >  Likewise.
> >  * lto-streamer-in.c (lto_input_mode_table): Likewise.
> >
> > gcc/c-family/
> >  * c-opts.c (c_common_post_options): Remove unreachable return.
> >  * c-pragma.c (handle_pragma_target): Likewise.
> >  (handle_pragma_optimize): Likewise.
> >
> > gcc/c/
> >  * c-typeck.c (c_tree_equal): Remove unreachable return.
> >  * c-parser.c (get_matching_symbol): Likewise.
> >
> > libgomp/
> >  * oacc-plugin.c (GOMP_PLUGIN_acc_default_dim): Remove unreachable
> >  return.
> I'd commit the whole set.

I have pushed it.

Richard.

Re: [PATCH] Fix regression introduced by r12-5536.

2021-11-29 Thread Hongtao Liu via Gcc-patches

On Mon, Nov 29, 2021 at 3:53 PM Uros Bizjak  wrote:
>
> On Mon, Nov 29, 2021 at 2:32 AM liuhongt  wrote:
> >
> > There're several failures reported in [1]:
> > 1.  unsupported instruction `pextrw` for "pextrw $0, %xmm31, 16(%rax)"
> > %vpextrw should be used in output templates.
> > 2. ICE in get_attr_memory for movhi_internal since some alternatives
> > are marked as TYPE_SSELOG.
> > Explicitly set memory_attr for those alternatives.
> >
> > Also this patch fixs a typo and some latent bugs which are related to
> > moving HImode from/to sse register w/o TARGET_AVX512FP16.
> >
> > For optimization issues discussed in PR102811, I'll create another patch for
> > it.
> > [1] https://gcc.gnu.org/pipermail/gcc-regression/2021-November/075893.html
> >
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,} and
> > x86_64-pc-linux-gnu{-m32\ -march=cascadelake,\ -march=cascadelake}
> > Ok for trunk?
> >
> > gcc/ChangeLog:
> >
> > * config/i386/i386.c (ix86_secondary_reload): Without
> > TARGET_SSE4_1, General register is needed to move HImode from
> > sse register to memory.
> > * config/i386/sse.md (*vec_extrachf): Use %vpextrw instead of
> > pextrw in output templates.
> > * config/i386/i386.md (movhi_internal): Ditto, also fix typo of
> > MEM_P (operands[1]) and adjust memory/mode/prefix/type
> > attribute for alternatives related to sse register.
>
> OK, but please use sselog1 type instead so you don't need to introduce
> the memory attribute.
Changed, committed as r12-5573
thanks for the review.
>
> Thanks,
> Uros.
>
> > ---
> >  gcc/config/i386/i386.c  |  2 +-
> >  gcc/config/i386/i386.md | 44 ++---
> >  gcc/config/i386/sse.md  |  6 +++---
> >  3 files changed, 36 insertions(+), 16 deletions(-)
> >
> > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> > index 3dedf522c42..7cf599f57f7 100644
> > --- a/gcc/config/i386/i386.c
> > +++ b/gcc/config/i386/i386.c
> > @@ -19277,7 +19277,7 @@ ix86_secondary_reload (bool in_p, rtx x, 
> > reg_class_t rclass,
> >  }
> >
> >/* Require movement to gpr, and then store to memory.  */
> > -  if (mode == HFmode
> > +  if ((mode == HFmode || mode == HImode)
> >&& !TARGET_SSE4_1
> >&& SSE_CLASS_P (rclass)
> >&& !in_p && MEM_P (x))
> > diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> > index 68606e57e60..2cb3e727588 100644
> > --- a/gcc/config/i386/i386.md
> > +++ b/gcc/config/i386/i386.md
> > @@ -2528,12 +2528,12 @@ (define_insn "*movhi_internal"
> >  case TYPE_SSELOG:
> >if (SSE_REG_P (operands[0]))
> > return MEM_P (operands[1])
> > - ? "pinsrw\t{$0, %1, %0|%0, %1, 0}"
> > - : "pinsrw\t{$0, %k1, %0|%0, %k1, 0}";
> > + ? "%vpinsrw\t{$0, %1, %0|%0, %1, 0}"
> > + : "%vpinsrw\t{$0, %k1, %0|%0, %k1, 0}";
> >else
> > -   return MEM_P (operands[1])
> > - ? "pextrw\t{$0, %1, %0|%0, %1, 0}"
> > - : "pextrw\t{$0, %1, %k0|%k0, %k1, 0}";
> > +   return MEM_P (operands[0])
> > + ? "%vpextrw\t{$0, %1, %0|%0, %1, 0}"
> > + : "%vpextrw\t{$0, %1, %k0|%k0, %1, 0}";
> >
> >  case TYPE_MSKLOG:
> >if (operands[1] == const0_rtx)
> > @@ -2557,12 +2557,14 @@ (define_insn "*movhi_internal"
> >]
> >(const_string "*")))
> > (set (attr "type")
> > - (cond [(eq_attr "alternative" "9,10,11,12,13")
> > + (cond [(eq_attr "alternative" "9,10,12,13")
> >   (if_then_else (match_test "TARGET_AVX512FP16")
> > (const_string "ssemov")
> > (const_string "sselog"))
> > (eq_attr "alternative" "4,5,6,7")
> >   (const_string "mskmov")
> > +   (eq_attr "alternative" "11")
> > + (const_string "ssemov")
> > (eq_attr "alternative" "8")
> >   (const_string "msklog")
> > (match_test "optimize_function_for_size_p (cfun)")
> > @@ -2579,15 +2581,33 @@ (define_insn "*movhi_internal"
> >   (const_string "imovx")
> >]
> >(const_string "imov")))
> > +(set (attr "memory")
> > +(cond [(eq_attr "alternative" "9,10")
> > + (const_string "none")
> > +   (eq_attr "alternative" "12")
> > + (const_string "load")
> > +   (eq_attr "alternative" "13")
> > + (const_string "store")
> > +   ]
> > +   (const_string "*")))
>
> Please use sselog1 type instead, and the memory attribute will be
> calculated correctly.
>
> >  (set (attr "prefix")
> > -  (if_then_else (eq_attr "alternative" "4,5,6,7,8")
> > -   (const_string "vex")
> > -   (const_string "orig")))
> > +(cond [(eq_attr "alternative" "9,10,11,12,13")
> > + (const_string "maybe_evex")
> > +   (eq_attr "alternative" "4,5,6,7,8")
> > +

RE: [PATCH]middle-end cse: Make sure duplicate elements are not entered into the equivalence set [PR103404]

2021-11-29 Thread Tamar Christina via Gcc-patches

> -Original Message-
> From: Richard Biener 
> Sent: Monday, November 29, 2021 9:02 AM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; j...@tachyum.com;
> Richard Sandiford 
> Subject: Re: [PATCH]middle-end cse: Make sure duplicate elements are not
> entered into the equivalence set [PR103404]
> 
> On Mon, 29 Nov 2021, Tamar Christina wrote:
> 
> > Hi All,
> >
> > CSE uses equivalence classes to keep track of expressions that all
> > have the same values at the current point in the program.
> >
> > Normal equivalences through SETs only insert and perform lookups in
> > this set but equivalence determined from comparisons, e.g.
> >
> > (insn 46 44 47 7 (set (reg:CCZ 17 flags)
> > (compare:CCZ (reg:SI 105 [ iD.2893 ])
> > (const_int 0 [0]))) "cse.c":18:22 7 {*cmpsi_ccno_1}
> >  (expr_list:REG_DEAD (reg:SI 105 [ iD.2893 ])
> > (nil)))
> >
> > creates the equivalence EQ on (reg:SI 105 [ iD.2893 ]) and (const_int 0 
> > [0]).
> >
> > This causes a merge to happen between the two equivalence sets denoted
> > by (const_int 0 [0]) and (reg:SI 105 [ iD.2893 ]) respectively.
> >
> > The operation happens through merge_equiv_classes however this
> > function has an invariant that the classes to be merge not contain any
> > duplicates.  This is because it frees entries before merging.
> >
> > The given testcase when using the supplied flags trigger an ICE due to
> > the equivalence set being
> >
> > (rr) p dump_class (class1)
> > Equivalence chain for (reg:SI 105 [ iD.2893 ]):
> > (reg:SI 105 [ iD.2893 ])
> > $3 = void
> >
> > (rr) p dump_class (class2)
> > Equivalence chain for (const_int 0 [0]):
> > (const_int 0 [0])
> > (reg:SI 97 [ _10 ])
> > (reg:SI 97 [ _10 ])
> > $4 = void
> >
> > This happens because the original INSN being recorded is
> >
> > (insn 18 17 24 2 (set (subreg:V1SI (reg:SI 97 [ _10 ]) 0)
> > (const_vector:V1SI [
> > (const_int 0 [0])
> > ])) "cse.c":11:9 1363 {*movv1si_internal}
> >  (expr_list:REG_UNUSED (reg:SI 97 [ _10 ])
> > (nil)))
> >
> > and we end up generating two equivalences. the first one is simply
> > that reg:SI 97 is 0.  The second one is that 0 can be extracted from
> > the V1SI, so subreg (subreg:V1SI (reg:SI 97) 0) 0 == 0.  This nested
> > subreg gets folded away to just reg:SI 97 and we re-insert the same
> equivalence.
> >
> > This patch changes it so that once we figure out the bucket to insert
> > into we check if the equivalence set already contains the entry and if
> > so just return the existing entry and exit.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
> > and no regressions.
> >
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > PR rtl-optimization/103404
> > * cse.c (insert_with_costs): Check if item exists already before
> adding
> > a new entry in the equivalence class.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR rtl-optimization/103404
> > * gcc.target/i386/pr103404.c: New test.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/cse.c b/gcc/cse.c
> > index
> >
> c1c7d0ca27b73c4b944b4719f95fece74e0358d5..08295246c594109e947276051c
> 67
> > 76e4cabca4ec 100644
> > --- a/gcc/cse.c
> > +++ b/gcc/cse.c
> > @@ -1537,6 +1537,17 @@ insert_with_costs (rtx x, struct table_elt *classp,
> unsigned int hash,
> >if (REG_P (x) && REGNO (x) < FIRST_PSEUDO_REGISTER)
> >  add_to_hard_reg_set (_regs_in_table, GET_MODE (x), REGNO
> > (x));
> >
> > +  /* We cannot allow a duplicate to be entered into the equivalence sets
> > + and so we should perform a check before we do any allocations or
> > + change the buckets.  */
> > +  if (classp)
> > +{
> > +  struct table_elt *p;
> > +  for (p = classp; p; p = p->next_same_value)
> > +   if (exp_equiv_p (p->exp, x, 1, false))
> > + return p;
> 
> not really a review, leaving that to who approved the original change, but
> these things always look bad - this linear list walk makes insert_with_costs
> quadratic.  Is there any mitigation (like limiting the number of entries?), is
> that already an existing problem elsewhere in CSE?

Hmm I believe insert_with_costs is already quadratic as it walks the list after 
allocations
in a similar manner. 

The problem is that the function assumes it can't ever fail, so it starts off 
by allocating
memory and modifying the table itself.  There are also two ways it can allocate 
memory,
so by the time it starts walking the classp itself and we notice any duplicates 
we did quite
a lot of work already and can't easily undo all of it. Ideally the function 
would identify
the insertion point before it does any changes.  But there are multiple ways to 
determine
where to insert.

But.. I think I can use the result of the first walk to seed the second walk 
which becomes O(1).
That way there's no additional work being done.  I'll update the patch.

Regards,
Tamar
> 
> > +}
> > +
> >/* Put

[PATCH] tree-optimization/103458 - avoid creating new loops in CD-DCE

When creating forwarders in CD-DCE we have to avoid creating loops
where we formerly did not consider those because of abnormal
predecessors.  At this point simply excuse us when there are any
abnormal predecessors.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-11-29  Richard Biener  

PR tree-optimization/103458
* tree-ssa-dce.c (make_forwarders_with_degenerate_phis): Do not
create forwarders for blocks with abnormal predecessors.

* gcc.dg/torture/pr103458.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr103458.c | 21 +
 gcc/tree-ssa-dce.c  |  8 ++--
 2 files changed, 27 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr103458.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr103458.c 
b/gcc/testsuite/gcc.dg/torture/pr103458.c
new file mode 100644
index 000..3fd3b5fd2ff
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr103458.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-Wno-div-by-zero" } */
+
+__attribute__ ((returns_twice)) int
+bar (void);
+
+void
+foo (int *p, int x)
+{
+  *p = 0;
+  while (*p < 1)
+{
+  x = 0;
+  while (x < 1)
+bar ();
+
+  x /= 0;
+}
+
+  foo (p, x);
+}
diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index e3e6f0955b7..1f817b95fab 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -1650,8 +1650,12 @@ make_forwarders_with_degenerate_phis (function *fn)
   /* Only PHIs with three or more arguments have opportunities.  */
   if (EDGE_COUNT (bb->preds) < 3)
continue;
-  /* Do not touch loop headers.  */
-  if (bb->loop_father->header == bb)
+  /* Do not touch loop headers or blocks with abnormal predecessors.
+???  This is to avoid creating valid loops here, see PR103458.
+We might want to improve things to either explicitely add those
+loops or at least consider blocks with no backedges.  */
+  if (bb->loop_father->header == bb
+ || bb_has_abnormal_pred (bb))
continue;
 
   /* Take one PHI node as template to look for identical
-- 
2.31.1

Re: Improve -fprofile-report

2021-11-29 Thread Martin Liška


On 11/27/21 16:56, Jan Hubicka via Gcc-patches wrote:

Hi,
Profile-report was never properly updated after switch to new profile
representation.  This patch fixes the way profile mismatches are
calculated: we used to collect separately count and freq mismatches,
while now we have only counts & probabilities.  So we verify

  - in count: that total count of incomming edges is close to acutal count of
the BB
  - out prob: that total sum of outgoing edge edge probabilities is close
to 1 (except for BB containing noreturn calls or EH).


Hello.

Can you please CC me when you mention me in an email?



Moreover I added dumping of absolute data which is useful to plot them:
with Martin Liska we plan to setup regular testing so we keep optimizers
profie updates bit under control.

Finally I added both static and dynamic stats about mismatches - static
one is simply number of inconsistencies in the cfg while dynamic is
scaled by the profile - I think in order to keep eye on optimizers the
first number is quite relevant. WHile when tracking why code quality
regressed the second number matters more.

The output on exchange2 benchmark with FDO is currently:

Profile consistency report:


The version you send is different to what was install :)

Pass dump id and name|static mismatch|dynamic mismatch  
   |overall 
  |
 |in count |out prob |in count  
|out prob  |size   |time
  |
 15t cfg |  0  |  0  |0 
|0 |   165834  |   495010   
  |
 17t ompexp  |  0  |  0  |0 
|0 |   165834  |   495010   
  |
 18t walloca |  0  |  0  |0 
|0 |   165834  |   495010   
  |
 19i visibility  |  0  |  0  |0 
|0 |   165834  |   495010   
  |
 20i build_ssa_passes|  0  |  0  |0 
|0 |   165834  |   495010   
  |
...

Can you please rename it to the same format we use for dump files, e.g. 
018t.walloca1 ?
It would be easier for people finding the corresponding dump file.



Pass name|static mismatch|dynamic mismatch  
   |overall 
  |
  |in count |out prob |in count 
 |out prob  |size   |time   
   |
cp   |  9+9| 52   +52|222697491   
+222697491|0 |19336  |  86295742108 
|
inline   |  6-3| 52  |224325864 
+1628373|0 |26811+38.7%|  80149710330
-7.1%|
fixup_cfg| 19   +13| 57+5| 65581029   
-158744835|0 |34292+27.9%|  73900655012
-7.8%|
adjust_alignment | 19  | 57  | 65581029 
|0 |34292  |  73900655012   
  |
ccp  | 19  | 57  | 65581029 
|0 |29929-12.7%|  72799142820
-1.5%|
objsz|216  +197| 46   -11|161980247
+96399218|0 |25566-14.6%|  71697630628
-1.5%|
cunrolli |216  | 46  |161980247 
|0 |26184 +2.4%|  69645569278
-2.9%|
backprop |151   -65| 46  |137177274
-24802973|0 |26802 +2.4%|  67593507928
-2.9%|
phiprop  |151  | 46  |137177274 
|0 |26802  |  67593507928   
  |
forwprop |151  | 46  |137177274 
|0 |26801 -0.0%|  67593507371
-0.0%|
alias|151  | 46  |137177274 
|0 |26800 -0.0%|  67593506814
-0.0%|
retslot  |151  | 46  |137177274 
|0 |26800  |  67593506814   
  |
fre  |151  | 46  |137177274

[PATCH] Final value replacement improvements for until-wrap loops.

2021-11-29 Thread Roger Sayle


This middle-end patch is inspired by Richard Biener's until-wrap
loop example in PR tree-optimization/101145.

unsigned foo(unsigned val, unsigned start)
{
  unsigned cnt = 0;
  for (unsigned i = start; i > val; ++i)
cnt++;
  return cnt;
}

For this loop, the tree optimizers currently generate:

unsigned int foo (unsigned int val, unsigned int start)
{
  unsigned int cnt;
  unsigned int _1;
  unsigned int _5;

   [local count: 118111600]:
  if (start_3(D) > val_4(D))
goto ; [89.00%]
  else
goto ; [11.00%]

   [local count: 105119324]:
  _1 = start_3(D) + 1;
  _5 = -start_3(D);
  cnt_2 = _1 > val_4(D) ? _5 : 1;

   [local count: 118111600]:
  # cnt_11 = PHI 
  return cnt_11;
}

or perhaps slightly easier to read:

  if (start > val) {
cnt = (start+1) > val ? -start : 1;
  } else cnt = 0;

In this snippet, if we know start > val, then (start+1) > val
unless start+1 overflows, i.e. (start+1) == 0 and start == ~0.
We can use this (loop header) context to simplify the ternary
expression to "(start != -1) ? -start : 1", which with a little
help from match.pd can be folded to -start.  Hence the optimal
final value replacement should be:

  cnt = (start > val) ? -start : 0;

Or as now generated by this patch:

unsigned int foo (unsigned int val, unsigned int start)
{
  unsigned int cnt;

   [local count: 118111600]:
  if (start_3(D) > val_4(D))
goto ; [89.00%]
  else
goto ; [11.00%]

   [local count: 105119324]:
  cnt_2 = -start_3(D);

   [local count: 118111600]:
  # cnt_11 = PHI 
  return cnt_11;
}


We can also improve until-wrap loops that don't have a (suitable) loop
header, as determined by simplify_using_initial_conditions.

unsigned bar(unsigned val, unsigned start)
{
  unsigned cnt = 0;
  unsigned i = start;
  do {
cnt++;
i++;
  } while (i > val);
  return cnt;
}

which is currently optimized to:

unsigned int foo (unsigned int val, unsigned int start)
{
  unsigned int cnt;
  unsigned int _9;
  unsigned int _10;

   [local count: 118111600]:
  _9 = start_4(D) + 1;
  _10 = -start_4(D);
  cnt_3 = val_7(D) < _9 ? _10 : 1;
  return cnt_3;
}

Here we have "val < (start+1) ? -start : 1", which again with the
help of match.pd can be slightly simplified to "val <= start ? -start : 1"
when dealing with unsigned types, because at the complicating value where
start == ~0, we fortunately have -start == 1, hence it doesn't matter
whether the second or third operand of the ternary operator is returned.

To summarize, this patch (in addition to tweaking may_be_zero in
number_of_iterations_until_wrap) adds three new constant folding
transforms to match.pd.

X != C1 ? -X : C2 simplifies to -X when -C1 == C2.
which is the generalized form of the simplification above.

X != C1 ? ~X : C2 simplifies to ~X when ~C1 == C2.
which is the BIT_NOT_EXPR analog of the NEGATE_EXPR case.

and the "until-wrap final value replacement without context":

(X + 1) > Y ? -X : 1 simplifies to X >= Y ? -X : 1 when
X is unsigned, as when X + 1 overflows, X is -1, so -X == 1.


This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check with no new failures.  Ok for mainline?


2021-11-29  Roger Sayle  

gcc/ChangeLog
* tree-ssa-loop-niter.c (number_of_iterations_until_wrap):
Check if simplify_using_initial_conditions allows us to
simplify the expression for may_be_zero.
* match.pd (X != C ? -X : -C -> -X): New transform.
(X != C ? ~X : ~C -> ~X): Likewise.
((X+1) > Y ? -X : 1 -> X >= Y ? -X : 1): Likewise.

gcc/testsuite/ChangeLog
* gcc.dg/fold-condneg-1.c: New test case.
* gcc.dg/fold-condneg-2.c: New test case.
* gcc.dg/fold-condnot-1.c: New test case.
* gcc.dg/pr101145-1.c: New test case.
* gcc.dg/pr101145-2.c: New test case.


Thanks in advance,
Roger
--

diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 7510940..06954e4 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -1478,7 +1478,7 @@ assert_loop_rolls_lt (tree type, affine_iv *iv0, 
affine_iv *iv1,
The number of iterations is stored to NITER.  */
 
 static bool
-number_of_iterations_until_wrap (class loop *, tree type, affine_iv *iv0,
+number_of_iterations_until_wrap (class loop *loop, tree type, affine_iv *iv0,
 affine_iv *iv1, class tree_niter_desc *niter)
 {
   tree niter_type = unsigned_type_for (type);
@@ -1506,6 +1506,23 @@ number_of_iterations_until_wrap (class loop *, tree 
type, affine_iv *iv0,
 
   num = fold_build2 (MINUS_EXPR, niter_type, wide_int_to_tree (type, max),
 iv1->base);
+
+  /* When base has the form iv + 1, if we know iv >= n, then iv + 1 < n
+only when iv + 1 overflows, i.e. when iv == TYPE_VALUE_MAX.  */
+  if (sgn == UNSIGNED
+ && integer_onep (step)
+ && TREE_CODE (iv1->base) == PLUS_EXPR
+ && integer_onep (TREE_OPERAND (iv1->base, 1)))
+   {
+ tree

Re: [PATCH]middle-end cse: Make sure duplicate elements are not entered into the equivalence set [PR103404]

On Mon, 29 Nov 2021, Tamar Christina wrote:

> Hi All,
> 
> CSE uses equivalence classes to keep track of expressions that all have the 
> same
> values at the current point in the program.
> 
> Normal equivalences through SETs only insert and perform lookups in this set 
> but
> equivalence determined from comparisons, e.g.
> 
> (insn 46 44 47 7 (set (reg:CCZ 17 flags)
> (compare:CCZ (reg:SI 105 [ iD.2893 ])
> (const_int 0 [0]))) "cse.c":18:22 7 {*cmpsi_ccno_1}
>  (expr_list:REG_DEAD (reg:SI 105 [ iD.2893 ])
> (nil)))
> 
> creates the equivalence EQ on (reg:SI 105 [ iD.2893 ]) and (const_int 0 [0]).
> 
> This causes a merge to happen between the two equivalence sets denoted by
> (const_int 0 [0]) and (reg:SI 105 [ iD.2893 ]) respectively.
> 
> The operation happens through merge_equiv_classes however this function has an
> invariant that the classes to be merge not contain any duplicates.  This is
> because it frees entries before merging.
> 
> The given testcase when using the supplied flags trigger an ICE due to the
> equivalence set being
> 
> (rr) p dump_class (class1)
> Equivalence chain for (reg:SI 105 [ iD.2893 ]):
> (reg:SI 105 [ iD.2893 ])
> $3 = void
> 
> (rr) p dump_class (class2)
> Equivalence chain for (const_int 0 [0]):
> (const_int 0 [0])
> (reg:SI 97 [ _10 ])
> (reg:SI 97 [ _10 ])
> $4 = void
> 
> This happens because the original INSN being recorded is
> 
> (insn 18 17 24 2 (set (subreg:V1SI (reg:SI 97 [ _10 ]) 0)
> (const_vector:V1SI [
> (const_int 0 [0])
> ])) "cse.c":11:9 1363 {*movv1si_internal}
>  (expr_list:REG_UNUSED (reg:SI 97 [ _10 ])
> (nil)))
> 
> and we end up generating two equivalences. the first one is simply that
> reg:SI 97 is 0.  The second one is that 0 can be extracted from the V1SI, so
> subreg (subreg:V1SI (reg:SI 97) 0) 0 == 0.  This nested subreg gets folded 
> away
> to just reg:SI 97 and we re-insert the same equivalence.
> 
> This patch changes it so that once we figure out the bucket to insert into we
> check if the equivalence set already contains the entry and if so just return
> the existing entry and exit.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu,
> x86_64-pc-linux-gnu and no regressions.
> 
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   PR rtl-optimization/103404
>   * cse.c (insert_with_costs): Check if item exists already before adding
>   a new entry in the equivalence class.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR rtl-optimization/103404
>   * gcc.target/i386/pr103404.c: New test.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/cse.c b/gcc/cse.c
> index 
> c1c7d0ca27b73c4b944b4719f95fece74e0358d5..08295246c594109e947276051c6776e4cabca4ec
>  100644
> --- a/gcc/cse.c
> +++ b/gcc/cse.c
> @@ -1537,6 +1537,17 @@ insert_with_costs (rtx x, struct table_elt *classp, 
> unsigned int hash,
>if (REG_P (x) && REGNO (x) < FIRST_PSEUDO_REGISTER)
>  add_to_hard_reg_set (_regs_in_table, GET_MODE (x), REGNO (x));
>  
> +  /* We cannot allow a duplicate to be entered into the equivalence sets
> + and so we should perform a check before we do any allocations or
> + change the buckets.  */
> +  if (classp)
> +{
> +  struct table_elt *p;
> +  for (p = classp; p; p = p->next_same_value)
> + if (exp_equiv_p (p->exp, x, 1, false))
> +   return p;

not really a review, leaving that to who approved the original change,
but these things always look bad - this linear list walk makes
insert_with_costs quadratic.  Is there any mitigation (like limiting
the number of entries?), is that already an existing problem elsewhere
in CSE?

> +}
> +
>/* Put an element for X into the right hash bucket.  */
>  
>elt = free_element_chain;
> diff --git a/gcc/testsuite/gcc.target/i386/pr103404.c 
> b/gcc/testsuite/gcc.target/i386/pr103404.c
> new file mode 100644
> index 
> ..66f33645301db09503fc0977fd0f061a19e56ea5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr103404.c
> @@ -0,0 +1,32 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-Og -fcse-follow-jumps -fno-dce 
> -fno-early-inlining -fgcse -fharden-conditional-branches 
> -frerun-cse-after-loop -fno-tree-ccp -mavx5124fmaps -std=c99 -w" } */
> +
> +typedef unsigned __attribute__((__vector_size__ (4))) U;
> +typedef unsigned __attribute__((__vector_size__ (16))) V;
> +typedef unsigned __attribute__((__vector_size__ (64))) W;
> +
> +int x, y;
> +
> +V v;
> +W w;
> +
> +inline
> +int bar (U a)
> +{
> +  a |= x;
> +  W k =
> +__builtin_shufflevector (v, 5 / a,
> +  2, 4, 0, 2, 4, 1, 0, 1,
> +  1, 2, 1, 3, 0, 4, 4, 0);
> +  w = k;
> +  y = 0;
> +}
> +
> +int
> +foo ()
> +{
> +  bar ((U){0x});
> +  for (unsigned i; i < sizeof (foo);)
> +;
> +}
> +
> 
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409

Re: [PATCH] tree-optimization: [PR101540] Simplify CONSTRUCTOR for vector(1) to be VCE

On Mon, Nov 29, 2021 at 1:57 AM Andrew Pinski via Gcc-patches
 wrote:
>
> On Sun, Nov 28, 2021 at 12:25 PM Jeff Law via Gcc-patches
>  wrote:
> >
> >
> >
> > On 11/28/2021 10:56 AM, apinski--- via Gcc-patches wrote:
> > > From: Andrew Pinski 
> > >
> > > This just adds a simplification to simplify_vector_constructor for
> > > vector of 1 element to be VCE which should reduce memory usage in
> > > the compiler and maybe allow for some more optimizations.
> > >
> > > OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> > >
> > >   PR tree-optimization/101540
> > >
> > > gcc/ChangeLog:
> > >
> > >   * tree-ssa-forwprop.c (simplify_vector_constructor):
> > >   Simplify constructor of vector of 1 element to just
> > >   be a VIEW_CONVERT_EXPR.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >   * gcc.dg/tree-ssa/pr101540-1.c: New test.
> > So why generate a VCE here if the type conversion is useless?  Why not
> > just a NOP_EXPR?  Is there something special about converting between
> > the element type and the outer vector type that requires VCE rather than
> > NOP_EXR?  Neither an ACK or NAK, just trying to understand it a bit better.
>
>
> Because right now tree-cfg.c has this check for vector types for NOP_EXPR:
> /* Allow conversions between vectors with the same number of elements,
>provided that the conversion is OK for the element types too.  */
> if (VECTOR_TYPE_P (lhs_type)
> && VECTOR_TYPE_P (rhs1_type)
> && known_eq (TYPE_VECTOR_SUBPARTS (lhs_type),
>  TYPE_VECTOR_SUBPARTS (rhs1_type)))
>   {
> lhs_type = TREE_TYPE (lhs_type);
> rhs1_type = TREE_TYPE (rhs1_type);
>   }
> else if (VECTOR_TYPE_P (lhs_type) || VECTOR_TYPE_P (rhs1_type))
>   {
> error ("invalid vector types in nop conversion");
> debug_generic_expr (lhs_type);
> debug_generic_expr (rhs1_type);
> return true;
>   }
>
> We can change this check here for NOP_EXPR and vector types but VCE is
> still a nop in most cases and handled as such really. But I wonder if
> the rest of the compiler is ready for it though.

It's definitely not a NOP, I think the original patch is OK.

Thanks,
Richard.

>
> Thanks,
> Andrew Pinski
>
> >
> > Jeff
> >
> >

Re: [PATCH] Fix PR 19089: Environment variable TMP may yield gcc: abort

On Mon, Nov 29, 2021 at 9:53 AM Richard Biener
 wrote:
>
> On Mon, Nov 29, 2021 at 1:42 AM Andrew Pinski via Gcc-patches
>  wrote:
> >
> > On Sun, Nov 28, 2021 at 12:14 PM Jeff Law via Gcc-patches
> >  wrote:
> > >
> > >
> > >
> > > On 11/27/2021 7:49 PM, apinski--- via Gcc-patches wrote:
> > > > From: Andrew Pinski 
> > > >
> > > > Even though I cannot reproduce the ICE any more, this is still
> > > > a bug. We check already to see if we can access the directory
> > > > but never check to see if the path is actually a directory.
> > > >
> > > > This adds the check and now we reject the file as not usable
> > > > as a tmp directory.
> > > >
> > > > OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> > > >
> > > > libiberty/ChangeLog:
> > > >
> > > >   * make-temp-file.c (try_dir): Check to see if the dir
> > > >   is actually a directory.
> > > > ---
> > > >   libiberty/make-temp-file.c | 16 +++-
> > > >   1 file changed, 15 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/libiberty/make-temp-file.c b/libiberty/make-temp-file.c
> > > > index 31f87fbcfde..11eb03d12ec 100644
> > > > --- a/libiberty/make-temp-file.c
> > > > +++ b/libiberty/make-temp-file.c
> > > > @@ -39,6 +39,10 @@ Boston, MA 02110-1301, USA.  */
> > > >   #if defined(_WIN32) && !defined(__CYGWIN__)
> > > >   #include 
> > > >   #endif
> > > > +#if HAVE_SYS_STAT_H
> > > > +#include 
> > > > +#endif
> > > > +
> > > >
> > > >   #ifndef R_OK
> > > >   #define R_OK 4
> > > > @@ -76,7 +80,17 @@ try_dir (const char *dir, const char *base)
> > > >   return base;
> > > > if (dir != 0
> > > > && access (dir, R_OK | W_OK | X_OK) == 0)
> > > > -return dir;
> > > > +{
> > > > +  /* Check to make sure dir is actually a directory. */
> > > > +#ifdef S_ISDIR
> > > > +  struct stat s;
> > > > +  if (stat(dir, ))
>
> I wonder if we can instead do access (dir "/.") or so where access
> should complain with ENOTDIR since 'dir' isn't a directory?

On Linux it is enough to add DIR_SEPARATOR to 'dir' passed to access.
One would have to check other OSes (if they maybe fail to correctly
handle trailing DIR_SEPARATOR at all).

>
> Richard.
>
> > > Formatting nit, missing whitespace between stat and open paren.
> > >
> > > Presumably this doesn't fix the problem in the case where S_ISDIR is not
> > > defined.  But it's still an improvement.  OK with the nit fixed.
> >
> > Correct, though I don't know of any host where S_ISDIR is not defined.
> > Mingw has them defined. So does cygwin. glibc (and all libc on Linux)
> > has them defined, Solaris and AIX has them defined. So Does Mac OS X.
> >
> >
> > MSVC does not define them but we don't support MSVC to compile GCC so
> > that should not be an issue.
> >
> > Thanks,
> > Andrew
> >
> > >
> > > jeff
> > >

Re: [PATCH] Fix PR 19089: Environment variable TMP may yield gcc: abort