Re: [PATCH] consider MIN_EXPR in get_size_range() (PR 85888)

Martin Sebor Fri, 25 May 2018 10:32:08 -0700

On 05/25/2018 01:02 AM, Richard Biener wrote:

On Thu, May 24, 2018 at 11:22 PM Martin Sebor <mse...@gmail.com> wrote:

On 05/24/2018 11:15 AM, Richard Biener wrote:

On May 24, 2018 7:02:17 PM GMT+02:00, Martin Sebor <mse...@gmail.com>

wrote:

On 05/24/2018 03:39 AM, Richard Biener wrote:

On Thu, May 24, 2018 at 12:50 AM Martin Sebor <mse...@gmail.com>

wrote:

The attached patch enhances the get_size_range() function to
extract more accurate ranges out of MIN_EXPR and MAX_EXPR nodes
with SSA_NAME operand(s).  This helps -Wstringop-overflow avoid
false positives on some targets in cases like some of those
reported in bug 85623 - strncmp() warns about attribute nonstring
incorrectly in -Wstringop-overflow

When first trying to expand calls to __builtin_strncmp, most back
ends succeed even when the expansion results in a call to the

library

function.  The powerpc64 back-end, however, fails and and allows
the generic expansion to take place.  There is a subtle difference
between the two cases that shows up when examining the bound of
the strncmp call expression (the third argument): in the original
call expression (in expand_builtin_strncmp) the bound is or can be
an SSA_NAME.  But when the early expansion fails,
expand_builtin_strncmp transforms the original call expression into
a new one, substituting MIN_EXPR (bound, CST) for the bound where
CST is the constant result of strlen (STR), with STR being either
the first or second strncmp argument.  When the bound is an SSA_NAME
the subsequent attempt to determine its range from the MIN_EXPR

fails.


Hmm, wouldn't sth extracted from the attached random patch I have

lying

around be more useful in general?  That is, refactor the GENERIC
expression walk in determine_value_range_1 to sth that can be used
in the current get_range_info () interface?  Or if we don't want to

change

that to accept non-SSA names add a corresponding get_expr_range_info

()

interface.


Thanks.  It is certainly more general.  I'm just not sure how much
use the generality can be put to because...

If you do that please cut out the dominator/loop condition handling

and

refrain from the idea of looking at SSA def stmts.


...I've done that but I'm having trouble exercising the code except
for this bug.  (I've run the C testsuite but the only tests that end
up in it call it with uninteresting arguments like VAR_DECL).  Do
you have any suggestions?


Well, what builds the MIN_EXPR for your testcase? I orinigally used it

for niters analysis which deals with complex GENERIC expressions. If you
just changed get_range_info then of course callers are guarded with SSA
name checks.

In pr85888 it's built by expand_builtin_strncmp (so very late).
It doesn't seem like a good use case because of this code:

    /* Expand the library call ourselves using a stabilized argument
       list to avoid re-evaluating the function's arguments twice.  */
    tree fn = build_call_nofold_loc (loc, fndecl, 3, arg1, arg2, len);
    gcc_assert (TREE_CODE (fn) == CALL_EXPR);
    CALL_EXPR_TAILCALL (fn) = CALL_EXPR_TAILCALL (exp);
    return expand_call (fn, target, target == const0_rtx);

AFAICS, the new call expression is the same as the old one, the only
difference is the LEN argument which is the MIN_EXPR.

The code was added in r72380 where the stabilization referred to
calling save_expr() on the arguments.  That code has been gone
since r242556 so I think a better/more targeted solution to fix
pr85888 than either of our approaches might be to simply call
expand_call() on the original CALL_EXPR with the original
arguments, including ARG3.

I don't fully understand the point of the save_expr() calls there
(and they're still in builtin_expand_strcmp), so if they need to
be restored then they would presumably include ARG3.  I.e.,
expand_call() would be called using ARG3 in the original SSA_NAME
form rather than the MIN_EXPR wrapper created by the function.

Attached is a patch that implements the first approach above.

If the SAVE_EXPRs are needed for some reason, it would be good
to have a test case to verify it.  I haven't been able to come
up with one (or find an existing test that fails due to their
removal) but that could very well be because my limited
understanding of what they're for (and poor testsuite coverage).

I'm also up for taking your patch and trying to make use of it
for other trees where get_range_info() is currently being called
for SSA_NAMEs only.  But it feels like a separate project from
this bug fix.


But your patch makes the whole 'len' computation after

   /* If we don't have a constant length for the first, use the length
      of the second, if we know it.  If neither string is constant length,
      use the given length argument.  We don't require a constant for
      this case; some cost analysis could be done if both are available
      but neither is constant.  For now, assume they're equally cheap,
      unless one has side effects.  If both strings have constant lengths,
      use the smaller.  */

   if (!len1 && !len2)
     len = len3;
   else if (!len1)
...
   /* If we are not using the given length, we must incorporate it here.
      The actual new length parameter will be MIN(len,arg3) in this case.  */
   if (len != len3)
     len = fold_build2_loc (loc, MIN_EXPR, TREE_TYPE (len), len, len3);

dead if expand_cmpstrn_or_cmpmem doesn't expand things inline.
I believe the idea is to possibly reduce 'len' for expansion of the
libcall.

OTOH all this "optimization" should have been done as folding
and not at RTL expansion time.  Thus if the issue is getting
a warning from the recursive re-expansion of the adjusted
call then removing this "postmature" optimization and making
sure it happens earlier as folding might be indeed a good thing.

But your patch as-is looks like it loses some optimization
(and I'm not caring about "folding" at -O0).


On the one affected target (as far as I know) it increases
the code size by adding a branch.  I wouldn't have thought
of that as an optimization.  Since it bloats code, at
a minimum I'd expect the decision to be gated for -Os.  But
I'm not sure the branch is necessarily a win either -- it
will certainly depend on whether strncpy makes use of it.

At the same time, if you believe this code is deliberately
crafted this way I wouldn't want to undo it just to suppress
a warning.  So I'm back to the question: is my first patch
good enough for now to suppress the warning or do you want
me to use your patch?

I don't have a problem using yours but in the first pass
I wouldn't want to do more than just fix the warning; with
it fixed, I could the look into removing some of the guards
for SSA_NAME around calls to get_range_info().

Martin

Richard.

Martin

Martin

PS I was able to create a test that at -O0 could call it with
a more interesting argument such as:

  const __SIZE_TYPE__ i = 3, j = 5;

  int f (void)
  {
    return __builtin_strncmp ("1234", "123", i < j ? i : j);
  }

Here, the first call to gimple_fold_builtin_string_compare() that
I picked as a test case sees:

  j.0_1 = 5;
  i.1_2 = 3;
  _3 = MIN_EXPR <j.0_1, i.1_2>;
  D.1963 = __builtin_strncmp ("1234", "123", _3);

Above, _3's range info is not available yet at this point so
the expression could be evaluated but only by looking at def stmts.
Why is it not a good idea?


Because it raises complexity concerns unless you also populate

intermediate SSA range Infos and stop at present ones. And you can't deal
with conditional info as the patch tries to do (but I wouldn't keep that
either initially).


Richard.


Martin

Re: [PATCH] consider MIN_EXPR in get_size_range() (PR 85888)

Reply via email to