[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586 --- Comment #16 from Jakub Jelinek --- Author: jakub Date: Thu Dec 1 23:15:57 2016 New Revision: 243145 URL: https://gcc.gnu.org/viewcvs?rev=243145&root=gcc&view=rev Log: PR tree-optimization/78586 * gimple-ssa-sprintf.c (format_integer): Don't handle NOP_EXPR, CONVERT_EXPR or COMPONENT_REF here. Formatting fix. For SSA_NAME_DEF_STMT with NOP_EXPR only change argtype if the rhs1's type is INTEGER_TYPE or POINTER_TYPE. Modified: trunk/gcc/ChangeLog trunk/gcc/gimple-ssa-sprintf.c
[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586 Jakub Jelinek changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #15 from Jakub Jelinek --- Fixed.
[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586 --- Comment #14 from Martin Sebor --- (In reply to Jakub Jelinek from comment #13) > (In reply to Martin Sebor from comment #11) > > PR tree-optimization/78586 > > * gcc.dg/tree-ssa/builtin-sprintf-2.c: New test cases. > > > > > > Modified: > > trunk/gcc/testsuite/ChangeLog > > trunk/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-2.c > > RNG (4, 4, 32, "%lu", (unsigned)u) > > Isn't this UB? Did you mean (unsigned long)u) ? You're right, it is undefined. It shouldn't matter though because the purpose of the test is to verify that undefined or unspecified or otherwise unsuitable calls to sprintf (including those with unknown arguments) are not subject to either return value constant propagation (the EQL tests) or the range propagation optimization (the RNG tests). > > As for closing this PR, I have no problem with that, just remember that > PR78622 doesn't cover all the issues discussed here and still unresolved. I'll go through the rest of your comments and open new bugs for any distinct issues. Since you fixed this bug I'll let you close it unless you want me to. If you find new issues please open new bugs.
[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586 --- Comment #13 from Jakub Jelinek --- (In reply to Martin Sebor from comment #11) > PR tree-optimization/78586 > * gcc.dg/tree-ssa/builtin-sprintf-2.c: New test cases. > > > Modified: > trunk/gcc/testsuite/ChangeLog > trunk/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-2.c RNG (4, 4, 32, "%lu", (unsigned)u) Isn't this UB? Did you mean (unsigned long)u) ? As for closing this PR, I have no problem with that, just remember that PR78622 doesn't cover all the issues discussed here and still unresolved.
[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586 Martin Sebor changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=78622 --- Comment #12 from Martin Sebor --- I'd like to track this and any other new problem separately from this bug so I've opened bug 78622 for the issue described in comment #8.
[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586 --- Comment #11 from Martin Sebor --- Author: msebor Date: Wed Nov 30 19:16:03 2016 New Revision: 243081 URL: https://gcc.gnu.org/viewcvs?rev=243081&root=gcc&view=rev Log: PR tree-optimization/78586 - [7 Regression] Wrong code caused by printf-return-value gcc/testsuite/ChangeLog: PR tree-optimization/78586 * gcc.dg/tree-ssa/builtin-sprintf-2.c: New test cases. Modified: trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-2.c
[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586 --- Comment #10 from Martin Sebor --- (In reply to Jakub Jelinek from comment #8) > The far more important thing is that the handling of different argtype and > dirtype really doesn't work. Consider: > > __attribute__((noinline, noclone)) int > foo (int x) > { > if (x < 4096 + 8 || x >= 4096 + 256 + 8) > return -1; > char buf[5]; > return __builtin_sprintf (buf, "%hhd", x + 1); > } > > int > main () > { > if (__SCHAR_MAX__ != 127 || __CHAR_BIT__ != 8 || __SIZEOF_INT__ != 4) > return 0; > if (foo (4095 + 9) != 1 > || foo (4095 + 32) != 2 > || foo (4095 + 127) != 3 > || foo (4095 + 128) != 4 > || foo (4095 + 240) != 3 > || foo (4095 + 248) != 2 > || foo (4095 + 255) != 2 > || foo (4095 + 256) != 1) > __builtin_abort (); > return 0; > } > > We have VR_RANGE for the SSA_NAME: > # RANGE [4105, 4360] NONZERO 4607 > _3 = x_6(D) + 1; > and because both 4105 and 4360 are positive, we use argmin 4105 and argmax > 4360 and recurse with those, and the recursive call for INTEGER_CST folds > the integer to dirtype (signed char here), which yields 9 and 8 and thus we > assume that it must be exactly a single char. But that is not true, in this > case the range of (signed char) _3 is actually [ -128, 127 ], so all values > of dirtype. In other cases it could be a subset of the values, but still > the boundary values wouldn't be the boundary values of the range in the > larger type. Yes, this is a bug. Thanks for the test case! Let me look into fixing it. > Another thing is, for the non-VR_RANGE case, it sets res.argmin and > res.argmax always to the argmin and argmax values. But those values aren't > necessarily the boundaries of actual value range, those are just some values > where we assume it will emit minimum or maximum number of characters. So, I > think if typeprec <= argprec, we shouldn't set res.arg{min,max} at all, > otherwise we should always set it to the minimum and maximum value of the > argtype. I'll have to spend some time on this to understand what the problem is so I can test it.
[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586 --- Comment #9 from Jakub Jelinek --- Even for TYPE_PRECISION (dirtype) == TYPE_PRECISION (argtype), if one is signed and the other is unsigned, are you sure that for VR_RANGE we do the right thing? Again, should be backed out by sufficient testsuite coverage. E.g. consider: __attribute__((noinline, noclone)) int foo (unsigned int x) { if (x < 64 || x > 2U * __INT_MAX__ - 10) return -1; char buf[4]; return __builtin_sprintf (buf, "%d", x + 1); } Here we don't warn with -O2 -W -Wall, but if I change buf[4] to buf[1], I get: warning: ā%dā directive writing between 2 and 3 bytes into a region of size 1 and note: directive argument in the range [65u, 4294967285u] But the directive is not writing in between 2 and 3 bytes, it is a number in the range [65, INT_MAX] or [INT_MIN, -11], i.e. ~[-10, 64], thus it needs in between 2 and 11 bytes. For the warning that is a missed-warning case, but it wouldn't surprise me if you couldn't construct a similar testcase perhaps with sign -> unsigned or something similar (and octal or decimal, etc., many options) where you'd be warning even on something that is safe.
[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586 --- Comment #8 from Jakub Jelinek --- Created attachment 40201 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40201&action=edit gcc7-pr78586-2.patch Ok, attaching a patch for the minor nits listed above. The far more important thing is that the handling of different argtype and dirtype really doesn't work. Consider: __attribute__((noinline, noclone)) int foo (int x) { if (x < 4096 + 8 || x >= 4096 + 256 + 8) return -1; char buf[5]; return __builtin_sprintf (buf, "%hhd", x + 1); } int main () { if (__SCHAR_MAX__ != 127 || __CHAR_BIT__ != 8 || __SIZEOF_INT__ != 4) return 0; if (foo (4095 + 9) != 1 || foo (4095 + 32) != 2 || foo (4095 + 127) != 3 || foo (4095 + 128) != 4 || foo (4095 + 240) != 3 || foo (4095 + 248) != 2 || foo (4095 + 255) != 2 || foo (4095 + 256) != 1) __builtin_abort (); return 0; } We have VR_RANGE for the SSA_NAME: # RANGE [4105, 4360] NONZERO 4607 _3 = x_6(D) + 1; and because both 4105 and 4360 are positive, we use argmin 4105 and argmax 4360 and recurse with those, and the recursive call for INTEGER_CST folds the integer to dirtype (signed char here), which yields 9 and 8 and thus we assume that it must be exactly a single char. But that is not true, in this case the range of (signed char) _3 is actually [ -128, 127 ], so all values of dirtype. In other cases it could be a subset of the values, but still the boundary values wouldn't be the boundary values of the range in the larger type. Another thing is, for the non-VR_RANGE case, it sets res.argmin and res.argmax always to the argmin and argmax values. But those values aren't necessarily the boundaries of actual value range, those are just some values where we assume it will emit minimum or maximum number of characters. So, I think if typeprec <= argprec, we shouldn't set res.arg{min,max} at all, otherwise we should always set it to the minimum and maximum value of the argtype.
[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586 --- Comment #7 from Jakub Jelinek --- Author: jakub Date: Wed Nov 30 08:01:47 2016 New Revision: 242998 URL: https://gcc.gnu.org/viewcvs?rev=242998&root=gcc&view=rev Log: PR tree-optimization/78586 * gimple-ssa-sprintf.c (format_integer): Use TYPE_MAX_VALUE or TYPE_MIN_VALUE or build_all_ones_cst instead of folding LSHIFT_EXPR. Don't build_int_cst min/max twice. Formatting fix. * gcc.c-torture/execute/pr78586.c: New test. Added: trunk/gcc/testsuite/gcc.c-torture/execute/pr78586.c Modified: trunk/gcc/ChangeLog trunk/gcc/gimple-ssa-sprintf.c trunk/gcc/testsuite/ChangeLog
[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586 --- Comment #6 from Martin Sebor --- (In reply to Jakub Jelinek from comment #5) > There are still various weird spots in format_integer. > E.g. > else if (TREE_CODE (TREE_TYPE (arg)) == INTEGER_TYPE >|| TREE_CODE (TREE_TYPE (arg)) == POINTER_TYPE) > { > /* Determine the type of the provided non-constant argument. */ > if (TREE_CODE (arg) == NOP_EXPR) > arg = TREE_OPERAND (arg, 0); > else if (TREE_CODE (arg) == CONVERT_EXPR) > arg = TREE_OPERAND (arg, 0); > if (TREE_CODE (arg) == COMPONENT_REF) > arg = TREE_OPERAND (arg, 1); > > argtype = TREE_TYPE (arg); > } > When would this actually happen? I checked by running the builtin-sprintf*.c tests and none of the conditions is true for any of them so it looks like it might be dead code from an earlier versions of the pass that I never took out. > 2) for SSA_NAMEs with VR_RANGE, you only look at the argument's ranges, > while for something VR_VARYING and similar you consider both dirtype and > argtype's precision. Isn't it always UB if the precision is different? For %hhi and %hi the argument type can be char, short, or int, and considering the type of the directive's "formal" argument avoids warning on cases where the actual argument would be too big to fit if it were formatted using a directive for its own type. Like this: char d[4]; return __builtin_sprintf (d, "%hhu", 123456); > Corner case might be C bitfields, not sure what exact type one gets when > passing that to varargs. For UB wouldn't it be better to punt (return don't > know value)? Yes, the pass shouldn't try to make any assumptions when it detects undefined behavior. I didn't intentionally encode any such assumptions in it but I may have done that inadvertently. Those would be bugs. > 3) "When precision is specified but unknown, use zero as the minimum since > it results in no bytes on output (unless width is specified to be greater > than 0)." > When is 0 printed as nothing? I thought it is printed as 0 (or 0x0 etc.). I thought the same until last week. Apparently we both thought wrong. printf("%.0i", 0) is required to print nothing. The text is under the d, i, o, u, and x/X conversion specifiers: "The result of converting zero with an explicit precision of zero shall be no characters." > 4) if (code == NOP_EXPR) > argtype = TREE_TYPE (gimple_assign_rhs1 (def)); > Similar case to 1), except it can happen, but argtype could be anything, > boolean type, perhaps fixed point type, enum, whatever. I'm not quite sure I follow what you are trying to say here. This code avoids false positives for unknown arguments of wider types that are cast to narrower types, as in: 'sprintf ((char[4]){ }, "%hhi", (unsigned char)i)' where i is an int with an unknown value. It strips the NOP_EXPR that promotes the argument to int and lets the function use the range of the type of the argument before the promotion. It could be that I'm missing something so a test case would be helpful.
[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586 Jakub Jelinek changed: What|Removed |Added CC||law at gcc dot gnu.org --- Comment #5 from Jakub Jelinek --- There are still various weird spots in format_integer. E.g. else if (TREE_CODE (TREE_TYPE (arg)) == INTEGER_TYPE || TREE_CODE (TREE_TYPE (arg)) == POINTER_TYPE) { /* Determine the type of the provided non-constant argument. */ if (TREE_CODE (arg) == NOP_EXPR) arg = TREE_OPERAND (arg, 0); else if (TREE_CODE (arg) == CONVERT_EXPR) arg = TREE_OPERAND (arg, 0); if (TREE_CODE (arg) == COMPONENT_REF) arg = TREE_OPERAND (arg, 1); argtype = TREE_TYPE (arg); } When would this actually happen? The pass is done in SSA GIMPLE, and integral/pointer types are gimple types, so the arguments should be always is_gimple_val. So, INTEGER_CST, SSA_NAME, or say ADDR_EXPR of something etc., but should not be NOP_EXPR, CONVERT_EXPR (we have a macro for those two btw, CONVERT_EXPR_P), nor COMPONENT_REF. Even if there would be NOPs etc., you aren't checking if they convert from INTEGER/POINTER type or something different. 2) for SSA_NAMEs with VR_RANGE, you only look at the argument's ranges, while for something VR_VARYING and similar you consider both dirtype and argtype's precision. Isn't it always UB if the precision is different? Corner case might be C bitfields, not sure what exact type one gets when passing that to varargs. For UB wouldn't it be better to punt (return don't know value)? I mean, if you have "%lld%d" format and pass in int and long long, assuming what the result will be is strange. For the same precision, it is quite common that there is a mismatch between sign of what dirtype expects and what argtype has (or one can be pointer and the other not). That case we need to handle properly. But is e.g. for unsigned printing the minimum value of a signed type the largest one? I'd think it is -1. E.g. for smaller precision signed (e.g. bitfield) into larger precision unsigned. 3) "When precision is specified but unknown, use zero as the minimum since it results in no bytes on output (unless width is specified to be greater than 0)." When is 0 printed as nothing? I thought it is printed as 0 (or 0x0 etc.). 4) if (code == NOP_EXPR) argtype = TREE_TYPE (gimple_assign_rhs1 (def)); Similar case to 1), except it can happen, but argtype could be anything, boolean type, perhaps fixed point type, enum, whatever.
[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586 --- Comment #4 from Jakub Jelinek --- Created attachment 40190 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40190&action=edit gcc7-pr78586.patch Untested fix.
[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586 --- Comment #3 from Jakub Jelinek --- --- gcc/gimple-ssa-sprintf.c.jj 2016-11-28 23:50:20.0 +0100 +++ gcc/gimple-ssa-sprintf.c2016-11-29 15:03:29.201488577 +0100 @@ -1159,16 +1159,15 @@ format_integer (const conversion_spec &s if (TYPE_UNSIGNED (argtype)) argmax = build_all_ones_cst (argtype); else - argmax = fold_build2 (LSHIFT_EXPR, argtype, integer_one_node, + argmax = fold_build2 (LSHIFT_EXPR, argtype, + build_int_cst (argtype, 1), build_int_cst (integer_type_node, argprec - 1)); } else - { - argmax = fold_build2 (LSHIFT_EXPR, dirtype, integer_one_node, - build_int_cst (integer_type_node, - typeprec - 1)); - } + argmax = fold_build2 (LSHIFT_EXPR, dirtype, + build_int_cst (dirtype, 1), + build_int_cst (integer_type_node, typeprec - 1)); res.argmin = argmin; res.argmax = argmax; } fixes it, that is a rather obvious bug. But, I want to think about whether we can't/shouldn't just use TYPE_{MIN,MAX}_VALUE (TYPE_DOMAIN (...)) instead.
[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #2 from Jakub Jelinek --- Simplified testcase: void foo (unsigned long x) { char a[30]; unsigned long b = __builtin_sprintf (a, "%lu", x); if (b != 4) __builtin_abort (); } int main () { foo (1000); return 0; } (note, the original one has UB, %d used to print size_t value).
[Bug tree-optimization/78586] [7 Regression] Wrong code caused by printf-return-value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78586 Richard Biener changed: What|Removed |Added Priority|P3 |P1 Status|UNCONFIRMED |NEW Version|unknown |7.0 Target Milestone|--- |7.0 Summary|Wrong code caused by|[7 Regression] Wrong code |printf-return-value |caused by ||printf-return-value Ever confirmed|0 |1 --- Comment #1 from Richard Biener --- Confirmed.