[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value

2016-12-01 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586

--- Comment #16 from Jakub Jelinek  ---
Author: jakub
Date: Thu Dec  1 23:15:57 2016
New Revision: 243145

URL: https://gcc.gnu.org/viewcvs?rev=243145=gcc=rev
Log:
PR tree-optimization/78586
* gimple-ssa-sprintf.c (format_integer): Don't handle NOP_EXPR,
CONVERT_EXPR or COMPONENT_REF here.  Formatting fix.  For
SSA_NAME_DEF_STMT with NOP_EXPR only change argtype if the rhs1's
type is INTEGER_TYPE or POINTER_TYPE.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/gimple-ssa-sprintf.c

[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value

2016-11-30 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #15 from Jakub Jelinek  ---
Fixed.

[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value

2016-11-30 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586

--- Comment #14 from Martin Sebor  ---
(In reply to Jakub Jelinek from comment #13)
> (In reply to Martin Sebor from comment #11)
> > PR tree-optimization/78586
> > * gcc.dg/tree-ssa/builtin-sprintf-2.c: New test cases.
> > 
> > 
> > Modified:
> > trunk/gcc/testsuite/ChangeLog
> > trunk/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-2.c
> 
>  RNG (4,  4, 32, "%lu", (unsigned)u)
> 
> Isn't this UB?  Did you mean (unsigned long)u) ?

You're right, it is undefined.  It shouldn't matter though because the purpose
of the test is to verify that undefined or unspecified or otherwise unsuitable
calls to sprintf (including those with unknown arguments) are not subject to
either return value constant propagation (the EQL tests) or the range
propagation optimization (the RNG tests).

> 
> As for closing this PR, I have no problem with that, just remember that
> PR78622 doesn't cover all the issues discussed here and still unresolved.

I'll go through the rest of your comments and open new bugs for any distinct
issues.  Since you fixed this bug I'll let you close it unless you want me to. 
If you find new issues please open new bugs.

[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value

2016-11-30 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586

--- Comment #13 from Jakub Jelinek  ---
(In reply to Martin Sebor from comment #11)
>   PR tree-optimization/78586
>   * gcc.dg/tree-ssa/builtin-sprintf-2.c: New test cases.
> 
> 
> Modified:
> trunk/gcc/testsuite/ChangeLog
> trunk/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-2.c

 RNG (4,  4, 32, "%lu", (unsigned)u)

Isn't this UB?  Did you mean (unsigned long)u) ?

As for closing this PR, I have no problem with that, just remember that PR78622
doesn't cover all the issues discussed here and still unresolved.

[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value

2016-11-30 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586

Martin Sebor  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=78622

--- Comment #12 from Martin Sebor  ---
I'd like to track this and any other new problem separately from this bug so
I've opened bug 78622 for the issue described in comment #8.

[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value

2016-11-30 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586

--- Comment #11 from Martin Sebor  ---
Author: msebor
Date: Wed Nov 30 19:16:03 2016
New Revision: 243081

URL: https://gcc.gnu.org/viewcvs?rev=243081=gcc=rev
Log:
PR tree-optimization/78586 - [7 Regression] Wrong code caused by
printf-return-value

gcc/testsuite/ChangeLog:

PR tree-optimization/78586
* gcc.dg/tree-ssa/builtin-sprintf-2.c: New test cases.


Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-2.c

[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value

2016-11-30 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586

--- Comment #10 from Martin Sebor  ---
(In reply to Jakub Jelinek from comment #8)
> The far more important thing is that the handling of different argtype and
> dirtype really doesn't work.  Consider:
> 
> __attribute__((noinline, noclone)) int
> foo (int x)
> {
>   if (x < 4096 + 8 || x >= 4096 + 256 + 8)
> return -1;
>   char buf[5];
>   return __builtin_sprintf (buf, "%hhd", x + 1);
> }
> 
> int
> main ()
> {
>   if (__SCHAR_MAX__ != 127 || __CHAR_BIT__ != 8 || __SIZEOF_INT__ != 4)
> return 0;
>   if (foo (4095 + 9) != 1
>   || foo (4095 + 32) != 2
>   || foo (4095 + 127) != 3
>   || foo (4095 + 128) != 4
>   || foo (4095 + 240) != 3
>   || foo (4095 + 248) != 2
>   || foo (4095 + 255) != 2
>   || foo (4095 + 256) != 1)
> __builtin_abort ();
>   return 0;
> }
> 
> We have VR_RANGE for the SSA_NAME:
>   # RANGE [4105, 4360] NONZERO 4607
>   _3 = x_6(D) + 1;
> and because both 4105 and 4360 are positive, we use argmin 4105 and argmax
> 4360 and recurse with those, and the recursive call for INTEGER_CST folds
> the integer to dirtype (signed char here), which yields 9 and 8 and thus we
> assume that it must be exactly a single char.  But that is not true, in this
> case the range of (signed char) _3 is actually [ -128, 127 ], so all values
> of dirtype.  In other cases it could be a subset of the values, but still
> the boundary values wouldn't be the boundary values of the range in the
> larger type.

Yes, this is a bug.  Thanks for the test case!  Let me look into fixing it.

> Another thing is, for the non-VR_RANGE case, it sets res.argmin and
> res.argmax always to the argmin and argmax values.  But those values aren't
> necessarily the boundaries of actual value range, those are just some values
> where we assume it will emit minimum or maximum number of characters.  So, I
> think if typeprec <= argprec, we shouldn't set res.arg{min,max} at all,
> otherwise we should always set it to the minimum and maximum value of the
> argtype.

I'll have to spend some time on this to understand what the problem is so I can
test it.

[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value

2016-11-30 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586

--- Comment #9 from Jakub Jelinek  ---
Even for TYPE_PRECISION (dirtype) == TYPE_PRECISION (argtype), if one is signed
and the other is unsigned, are you sure that for VR_RANGE we do the right
thing?
Again, should be backed out by sufficient testsuite coverage.
E.g. consider:
__attribute__((noinline, noclone)) int
foo (unsigned int x)
{
  if (x < 64 || x > 2U * __INT_MAX__ - 10)
return -1;
  char buf[4];
  return __builtin_sprintf (buf, "%d", x + 1);
}
Here we don't warn with -O2 -W -Wall, but if I change buf[4] to buf[1], I get:
warning: ā€˜%dā€™ directive writing between 2 and 3 bytes into a region of size 1
and
note: directive argument in the range [65u, 4294967285u]
But the directive is not writing in between 2 and 3 bytes, it is a number in
the range [65, INT_MAX] or [INT_MIN, -11], i.e. ~[-10, 64], thus it needs in
between 2 and 11 bytes.  For the warning that is a missed-warning case, but it
wouldn't surprise me if you couldn't construct a similar testcase perhaps with
sign -> unsigned or something similar (and octal or decimal, etc., many
options) where you'd be warning even on something that is safe.

[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value

2016-11-30 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586

--- Comment #8 from Jakub Jelinek  ---
Created attachment 40201
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40201=edit
gcc7-pr78586-2.patch

Ok, attaching a patch for the minor nits listed above.

The far more important thing is that the handling of different argtype and
dirtype really doesn't work.  Consider:

__attribute__((noinline, noclone)) int
foo (int x)
{
  if (x < 4096 + 8 || x >= 4096 + 256 + 8)
return -1;
  char buf[5];
  return __builtin_sprintf (buf, "%hhd", x + 1);
}

int
main ()
{
  if (__SCHAR_MAX__ != 127 || __CHAR_BIT__ != 8 || __SIZEOF_INT__ != 4)
return 0;
  if (foo (4095 + 9) != 1
  || foo (4095 + 32) != 2
  || foo (4095 + 127) != 3
  || foo (4095 + 128) != 4
  || foo (4095 + 240) != 3
  || foo (4095 + 248) != 2
  || foo (4095 + 255) != 2
  || foo (4095 + 256) != 1)
__builtin_abort ();
  return 0;
}

We have VR_RANGE for the SSA_NAME:
  # RANGE [4105, 4360] NONZERO 4607
  _3 = x_6(D) + 1;
and because both 4105 and 4360 are positive, we use argmin 4105 and argmax 4360
and recurse with those, and the recursive call for INTEGER_CST folds the
integer to dirtype (signed char here), which yields 9 and 8 and thus we assume
that it must be exactly a single char.  But that is not true, in this case the
range of (signed char) _3 is actually [ -128, 127 ], so all values of dirtype. 
In other cases it could be a subset of the values, but still the boundary
values wouldn't be the boundary values of the range in the larger type.

Another thing is, for the non-VR_RANGE case, it sets res.argmin and res.argmax
always to the argmin and argmax values.  But those values aren't necessarily
the boundaries of actual value range, those are just some values where we
assume it will emit minimum or maximum number of characters.  So, I think if
typeprec <= argprec, we shouldn't set res.arg{min,max} at all, otherwise we
should always set it to the minimum and maximum value of the argtype.

[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value

2016-11-30 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586

--- Comment #7 from Jakub Jelinek  ---
Author: jakub
Date: Wed Nov 30 08:01:47 2016
New Revision: 242998

URL: https://gcc.gnu.org/viewcvs?rev=242998=gcc=rev
Log:
PR tree-optimization/78586
* gimple-ssa-sprintf.c (format_integer): Use TYPE_MAX_VALUE or
TYPE_MIN_VALUE or build_all_ones_cst instead of folding LSHIFT_EXPR.
Don't build_int_cst min/max twice.  Formatting fix.

* gcc.c-torture/execute/pr78586.c: New test.

Added:
trunk/gcc/testsuite/gcc.c-torture/execute/pr78586.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/gimple-ssa-sprintf.c
trunk/gcc/testsuite/ChangeLog

[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value

2016-11-29 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586

--- Comment #6 from Martin Sebor  ---
(In reply to Jakub Jelinek from comment #5)
> There are still various weird spots in format_integer.
> E.g.
>   else if (TREE_CODE (TREE_TYPE (arg)) == INTEGER_TYPE
>|| TREE_CODE (TREE_TYPE (arg)) == POINTER_TYPE)
> {
>   /* Determine the type of the provided non-constant argument.  */
>   if (TREE_CODE (arg) == NOP_EXPR)
> arg = TREE_OPERAND (arg, 0);
>   else if (TREE_CODE (arg) == CONVERT_EXPR)
> arg = TREE_OPERAND (arg, 0);
>   if (TREE_CODE (arg) == COMPONENT_REF)
> arg = TREE_OPERAND (arg, 1);
>   
>   argtype = TREE_TYPE (arg);
> }
> When would this actually happen?

I checked by running the builtin-sprintf*.c tests and none of the conditions is
true for any of them so it looks like it might be dead code from an earlier
versions of the pass that I never took out.

> 2) for SSA_NAMEs with VR_RANGE, you only look at the argument's ranges,
> while for something VR_VARYING and similar you consider both dirtype and
> argtype's precision.  Isn't it always UB if the precision is different? 

For %hhi and %hi the argument type can be char, short, or int, and considering
the type of the directive's "formal" argument avoids warning on cases where the
actual argument would be too big to fit if it were formatted using a directive
for its own type.  Like this:

  char d[4];
  return __builtin_sprintf (d, "%hhu", 123456);


> Corner case might be C bitfields, not sure what exact type one gets when
> passing that to varargs.  For UB wouldn't it be better to punt (return don't
> know value)?

Yes, the pass shouldn't try to make any assumptions when it detects undefined
behavior.  I didn't intentionally encode any such assumptions in it but I may
have done that inadvertently.  Those would be bugs.

> 3) "When precision is specified but unknown, use zero as the minimum since
> it results in no bytes on output (unless width is specified to be greater
> than 0)."
> When is 0 printed as nothing?  I thought it is printed as 0 (or 0x0 etc.).

I thought the same until last week.  Apparently we both thought wrong. 
printf("%.0i", 0) is required to print nothing.  The text is under the  d, i,
o, u, and x/X conversion specifiers: "The result of converting zero with an
explicit precision of zero shall be no characters."

> 4)   if (code == NOP_EXPR)
> argtype = TREE_TYPE (gimple_assign_rhs1 (def));
> Similar case to 1), except it can happen, but argtype could be anything,
> boolean type, perhaps fixed point type, enum, whatever.

I'm not quite sure I follow what you are trying to say here.  This code avoids
false positives for unknown arguments of wider types that are cast to narrower
types, as in: 'sprintf ((char[4]){ }, "%hhi", (unsigned char)i)' where i is an
int with an unknown value.  It strips the NOP_EXPR that promotes the argument
to int and lets the function use the range of the type of the argument before
the promotion.  It could be that I'm missing something so a test case would be
helpful.

[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value

2016-11-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586

Jakub Jelinek  changed:

   What|Removed |Added

 CC||law at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek  ---
There are still various weird spots in format_integer.
E.g.
  else if (TREE_CODE (TREE_TYPE (arg)) == INTEGER_TYPE
   || TREE_CODE (TREE_TYPE (arg)) == POINTER_TYPE)
{
  /* Determine the type of the provided non-constant argument.  */
  if (TREE_CODE (arg) == NOP_EXPR)
arg = TREE_OPERAND (arg, 0);
  else if (TREE_CODE (arg) == CONVERT_EXPR)
arg = TREE_OPERAND (arg, 0);
  if (TREE_CODE (arg) == COMPONENT_REF)
arg = TREE_OPERAND (arg, 1);

  argtype = TREE_TYPE (arg);
}
When would this actually happen?  The pass is done in SSA GIMPLE, and
integral/pointer types are gimple types, so the arguments should be always
is_gimple_val.  So, INTEGER_CST, SSA_NAME, or say ADDR_EXPR of something etc.,
but should not be NOP_EXPR, CONVERT_EXPR (we have a macro for those two btw,
CONVERT_EXPR_P), nor COMPONENT_REF.  Even if there would be NOPs etc., you
aren't checking if they convert from INTEGER/POINTER type or something
different.
2) for SSA_NAMEs with VR_RANGE, you only look at the argument's ranges, while
for something VR_VARYING and similar you consider both dirtype and argtype's
precision.  Isn't it always UB if the precision is different?  Corner case
might be C bitfields, not sure what exact type one gets when passing that to
varargs.  For UB wouldn't it be better to punt (return don't know value)?
I mean, if you have "%lld%d" format and pass in int and long long, assuming
what the result will be is strange.  For the same precision, it is quite common
that there is a mismatch between sign of what dirtype expects and what argtype
has (or one can be pointer and the other not).  That case we need to handle
properly.  But is e.g. for unsigned printing the minimum value of a signed type
the largest one?  I'd think it is -1.  E.g. for smaller precision signed (e.g.
bitfield) into larger precision unsigned.
3) "When precision is specified but unknown, use zero as the minimum since it
results in no bytes on output (unless width is specified to be greater than
0)."
When is 0 printed as nothing?  I thought it is printed as 0 (or 0x0 etc.).
4)   if (code == NOP_EXPR)
argtype = TREE_TYPE (gimple_assign_rhs1 (def));
Similar case to 1), except it can happen, but argtype could be anything,
boolean type, perhaps fixed point type, enum, whatever.

[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value

2016-11-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586

--- Comment #4 from Jakub Jelinek  ---
Created attachment 40190
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40190=edit
gcc7-pr78586.patch

Untested fix.

[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value

2016-11-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586

--- Comment #3 from Jakub Jelinek  ---
--- gcc/gimple-ssa-sprintf.c.jj 2016-11-28 23:50:20.0 +0100
+++ gcc/gimple-ssa-sprintf.c2016-11-29 15:03:29.201488577 +0100
@@ -1159,16 +1159,15 @@ format_integer (const conversion_spec 
  if (TYPE_UNSIGNED (argtype))
argmax = build_all_ones_cst (argtype);
  else
-   argmax = fold_build2 (LSHIFT_EXPR, argtype, integer_one_node,
+   argmax = fold_build2 (LSHIFT_EXPR, argtype,
+ build_int_cst (argtype, 1),
  build_int_cst (integer_type_node,
 argprec - 1));
}
   else
-   {
- argmax = fold_build2 (LSHIFT_EXPR, dirtype, integer_one_node,
-   build_int_cst (integer_type_node,
-  typeprec - 1));
-   }
+   argmax = fold_build2 (LSHIFT_EXPR, dirtype,
+ build_int_cst (dirtype, 1),
+ build_int_cst (integer_type_node, typeprec - 1));
   res.argmin = argmin;
   res.argmax = argmax;
 }

fixes it, that is a rather obvious bug.  But, I want to think about whether we
can't/shouldn't just use TYPE_{MIN,MAX}_VALUE (TYPE_DOMAIN (...)) instead.

[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value

2016-11-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
Simplified testcase:
void
foo (unsigned long x)
{
  char a[30];
  unsigned long b = __builtin_sprintf (a, "%lu", x);
  if (b != 4)
__builtin_abort ();
}

int
main ()
{
  foo (1000);
  return 0;
}

(note, the original one has UB, %d used to print size_t value).

[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value

2016-11-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P1
 Status|UNCONFIRMED |NEW
Version|unknown |7.0
   Target Milestone|--- |7.0
Summary|Wrong code caused by|[7 Regression] Wrong code
   |printf-return-value |caused by
   ||printf-return-value
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
Confirmed.