[Bug middle-end/71625] missing strlen optimization on different array initialization style
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71625

--- Comment #10 from Marc Glisse ---
(In reply to Jakub Jelinek from comment #6)
> > > char array[] = "abc";
> > > return __builtin_strlen (array);
>
> Well, this actually is optimized.

Oops, I failed my copy-paste, I meant to copy:
char array[] = {'a', 'b', 'c', '\0'};

> > Or we could do like clang and improve alias analysis. We should know that
> > array doesn't escape and thus that hallo() cannot write to it.
>
> The strlen pass uses the alias oracle, so the question is why it thinks the
> call might affect the array.

The alias machinery thinks that array escapes (and we are flow-insensitive
there). It is thus normal that hallo can write to it. I think array doesn't
escape, but it isn't obvious to me where the aliasing code decides that it
does.

(In reply to Jakub Jelinek from comment #7)
> int baz ()
> {
>   char a[4];
>   a[0] = 'a';
>   a[1] = 'b';
>   a[2] = 'c';
>   a[3] = '\0';
>   return __builtin_strlen (a);
> }
> still won't be optimized.

And we don't "vectorize" it either (llvm doesn't optimize strlen in this
case, but at least the write to a is a single movl $6513249, 4(%rsp) instead
of 4 movb).
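The immediate in that movl can be checked by hand: it should be the four bytes 'a', 'b', 'c', '\0' packed into one 32-bit little-endian word. A minimal sketch, assuming an ASCII execution character set; the function name pack_abc_little_endian is made up for illustration and does not come from the bug report:

```c
#include <stdint.h>

/* Pack 'a', 'b', 'c', '\0' into one 32-bit value the way a little-endian
   target such as x86-64 lays them out in memory: the byte at the lowest
   address ends up in the least significant bits.  Hypothetical helper,
   ASCII assumed.  */
uint32_t pack_abc_little_endian (void)
{
  const unsigned char bytes[4] = { 'a', 'b', 'c', '\0' };
  uint32_t word = 0;

  for (int i = 0; i < 4; i++)
    word |= (uint32_t) bytes[i] << (8 * i);

  return word;   /* 0x00636261 with ASCII */
}
```

With ASCII, 0x00636261 is 6513249 in decimal, which matches the constant in the quoted instruction.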
[Bug middle-end/71625] missing strlen optimization on different array initialization style
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71625

--- Comment #9 from Renlin Li ---
(In reply to nsz from comment #8)
> (In reply to Jakub Jelinek from comment #6)
> > (In reply to Marc Glisse from comment #1)
> > > Or we could do like clang and improve alias analysis. We should know that
> > > array doesn't escape and thus that hallo() cannot write to it.
> >
> > The strlen pass uses the alias oracle, so the question is why it thinks the
> > call might affect the array.
>
> the optimization fails with
>
> const char array[] = "abc";
>
> too (which is why i thought it was about pure strlen depending on global
> state
> other than the argument.. static const array works though).

char *array = "abc"; works, however, this generates string literals in
read-only section.
[Bug middle-end/71625] missing strlen optimization on different array initialization style
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71625

--- Comment #8 from nsz at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #6)
> (In reply to Marc Glisse from comment #1)
> > Or we could do like clang and improve alias analysis. We should know that
> > array doesn't escape and thus that hallo() cannot write to it.
>
> The strlen pass uses the alias oracle, so the question is why it thinks the
> call might affect the array.

the optimization fails with

const char array[] = "abc";

too (which is why i thought it was about pure strlen depending on global
state other than the argument.. static const array works though).
[Bug middle-end/71625] missing strlen optimization on different array initialization style
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71625

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #7 from Jakub Jelinek ---
(In reply to Martin Sebor from comment #4)
> I'm working on a patch that among other things transforms the CONSTRUCTOR
> node created for an array initializer to STRING_CST, as suggested by Richard
> in bug 71303 that this one is a duplicate of, to help fold strlen results
> for constant arguments (this also affects C++ constexpr where it would be
> useful to be able to call strlen). Let me confirm this one since it has
> more discussion and close the other as a dupe.

That will be certainly useful not just for this, but also for the generated
code quality if it can't/shouldn't be optimized into constant (the usual
case).

But,
int baz ()
{
  char a[4];
  a[0] = 'a';
  a[1] = 'b';
  a[2] = 'c';
  a[3] = '\0';
  return __builtin_strlen (a);
}
still won't be optimized. Not sure if it is worth to do something about it
though. It can as well have the form
int baz2 ()
{
  char a[30];
  __builtin_memcpy (a, "1234567", 7);
  __builtin_memcpy (a + 7, "89abcdefg", 9);
  __builtin_memcpy (a + 16, "h", 2);
  return __builtin_strlen (a);
}
which isn't optimized either and would also need the notion of length > N.
Surprisingly
int baz3 ()
{
  char a[30];
  __builtin_memcpy (a, "1234567", 8);
  __builtin_memcpy (a + 7, "89abcdefg", 10);
  __builtin_memcpy (a + 16, "h", 2);
  return __builtin_strlen (a);
}
isn't optimized either.
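For reference, both baz2 and baz3 build the same 17-character string, so an ideal compiler could fold each call to the constant 17. The sketch below restates them with the standard memcpy and strlen from <string.h> instead of the __builtin_ spellings (an assumption made for portability; the behaviour should be the same):

```c
#include <string.h>

/* baz2 from comment #7: the first two copies leave out the terminating
   NUL, so only the final copy terminates the string.  */
int baz2 (void)
{
  char a[30];
  memcpy (a, "1234567", 7);
  memcpy (a + 7, "89abcdefg", 9);
  memcpy (a + 16, "h", 2);
  return (int) strlen (a);
}

/* baz3 from comment #7: each copy includes its NUL, and the next copy
   overwrites that NUL with its own first character.  */
int baz3 (void)
{
  char a[30];
  memcpy (a, "1234567", 8);
  memcpy (a + 7, "89abcdefg", 10);
  memcpy (a + 16, "h", 2);
  return (int) strlen (a);
}
```

In both functions a ends up holding "123456789abcdefgh", with the only NUL at a[17].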
[Bug middle-end/71625] missing strlen optimization on different array initialization style
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71625

--- Comment #6 from Jakub Jelinek ---
(In reply to Marc Glisse from comment #1)
> (In reply to Renlin Li from comment #0)
> > char array[] = "abc";
> > return __builtin_strlen (array);
>
> There are DUPs for this part.

Well, this actually is optimized. GCC is able to fold strlen of a string
literal very early, but the rest is done in the strlen pass. For foo, there
is:
  array = "abc";
  _1 = __builtin_strlen ();
where we can easily compute the string length of the rhs and copy it over to
the lhs (array).
In bar, we have instead:
  array[0] = 97;
  array[1] = 98;
  array[2] = 99;
  array[3] = 0;
  _1 = __builtin_strlen ();
and right now, the strlen pass only handles the various string/memory
builtins plus stores of zero. For this to work, we'd need to also instrument
scalar stores of non-zero values, and record some transitive string length
> N rather than string length equals to N, so for the array[0] = 97; store
we'd record string length at [0] is > 0, for array[1] = 98; update that
record to say > 1, then > 2.

> Here it depends if we produce:
>
> size_t tmp1=hallo();
> size_t tmp2=strlen(array);
> return tmp1+tmp2;
>
> or use the reverse order for tmp1 and tmp2. Currently we evaluate a before b
> in a+b, this example seems to suggest that when one sub-expression is pure
> and not the other, it would make sense to evaluate the pure one first
> (assuming we can determine that information early enough). It also depends
> where the C++ proposal about order of evaluation is going...
>
> Or we could do like clang and improve alias analysis. We should know that
> array doesn't escape and thus that hallo() cannot write to it.

The strlen pass uses the alias oracle, so the question is why it thinks the
call might affect the array.
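The "length > N" bookkeeping described above can be modelled outside the compiler. What follows is only a toy sketch: struct strlen_fact, note_store and replay_bar are hypothetical names, none of this is GCC's strlen pass, and it only handles constant byte stores into a single array.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record for one array; not a GCC data structure.
   Bytes [0, nonzero_prefix) are known to be non-zero, so the string
   length is >= nonzero_prefix.  Once a terminating zero has been seen at
   index nonzero_prefix, exact is set and the length equals
   nonzero_prefix.  */
struct strlen_fact
{
  size_t nonzero_prefix;
  bool exact;
};

/* Update the record for a constant byte store array[offset] = value.  */
void note_store (struct strlen_fact *f, size_t offset, unsigned char value)
{
  if (value == 0)
    {
      /* A zero inside or right after the known non-zero prefix fixes the
         length.  A zero further out says nothing, because the bytes in
         between are unknown.  */
      if (offset <= f->nonzero_prefix)
        {
          f->nonzero_prefix = offset;
          f->exact = true;
        }
    }
  else if (offset == f->nonzero_prefix)
    {
      /* A non-zero byte directly after the prefix extends it, and
         overwrites the terminator if one was recorded there.  */
      f->nonzero_prefix++;
      f->exact = false;
    }
  /* Other non-zero stores leave the record valid, so nothing changes.  */
}

/* Replay the four stores from bar; returns the exact length if it is
   known, or (size_t) -1 otherwise.  */
size_t replay_bar (void)
{
  struct strlen_fact f = { 0, false };

  note_store (&f, 0, 97);   /* length > 0 */
  note_store (&f, 1, 98);   /* length > 1 */
  note_store (&f, 2, 99);   /* length > 2 */
  note_store (&f, 3, 0);    /* length == 3 */

  return f.exact ? f.nonzero_prefix : (size_t) -1;
}
```

After the first three stores the record only holds a lower bound; it is the zero store at index 3 that turns the bound into an exact length.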
[Bug middle-end/71625] missing strlen optimization on different array initialization style
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71625

--- Comment #5 from Martin Sebor ---
*** Bug 71303 has been marked as a duplicate of this bug. ***
[Bug middle-end/71625] missing strlen optimization on different array initialization style
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71625

Martin Sebor changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Keywords|                              |missed-optimization
             Status|UNCONFIRMED                   |ASSIGNED
   Last reconfirmed|                              |2016-06-23
           Assignee|unassigned at gcc dot gnu.org |msebor at gcc dot gnu.org
     Ever confirmed|0                             |1
      Known to fail|                              |4.9.3, 5.3.0, 6.1.0, 7.0

--- Comment #4 from Martin Sebor ---
I'm working on a patch that among other things transforms the CONSTRUCTOR
node created for an array initializer to STRING_CST, as suggested by Richard
in bug 71303 that this one is a duplicate of, to help fold strlen results
for constant arguments (this also affects C++ constexpr where it would be
useful to be able to call strlen). Let me confirm this one since it has
more discussion and close the other as a dupe.
[Bug middle-end/71625] missing strlen optimization on different array initialization style
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71625

--- Comment #3 from joseph at codesourcery dot com ---
strlen is pure, not const, since its result depends on memory pointed to by
its argument, not just the value of the pointer itself. See extend.texi:
"Note that a function that has pointer arguments and examines the data
pointed to must @emph{not} be declared @code{const}.".
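A small illustration of the pure/const distinction, assuming GCC-style __attribute__ syntax. The names my_strlen, square and length_before_and_after_store are invented for this sketch and are not part of GCC:

```c
#include <stddef.h>

/* pure: no side effects, but the result may depend on memory that the
   arguments point to.  */
size_t my_strlen (const char *s) __attribute__ ((pure));

/* const: the result depends only on the argument values themselves.  */
int square (int x) __attribute__ ((const));

size_t my_strlen (const char *s)
{
  size_t n = 0;
  while (s[n] != '\0')
    n++;
  return n;
}

int square (int x)
{
  return x * x;
}

/* The same pointer value is passed twice, yet the results differ because
   the pointed-to memory changed in between.  If my_strlen were declared
   const, the compiler would be allowed to reuse the first result.  */
size_t length_before_and_after_store (void)
{
  char buf[4] = "abc";
  size_t before = my_strlen (buf);   /* 3 */

  buf[1] = '\0';
  size_t after = my_strlen (buf);    /* 1 */

  return before * 10 + after;
}
```

The second call has to be evaluated again after the store to buf[1], which is the kind of dependence on pointed-to data that the quoted extend.texi sentence rules out for const.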
[Bug middle-end/71625] missing strlen optimization on different array initialization style
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71625

nsz at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nsz at gcc dot gnu.org

--- Comment #2 from nsz at gcc dot gnu.org ---
int hallo ();
int dummy ()
{
  char array[] = "abc";
  hallo ();
  return __builtin_strlen (array);
}

if strlen is pure then it cannot be evaluated at compile time because hallo
might modify global state, but strlen should be modeled with attribute
const, since the result cannot depend on global state.

i.e. in gcc/builtins.def

DEF_LIB_BUILTIN_CHKP (BUILT_IN_STRLEN, "strlen", BT_FN_SIZE_CONST_STRING, ATTR_PURE_NOTHROW_NONNULL_LEAF)

the PURE should be CONST.
[Bug middle-end/71625] missing strlen optimization on different array initialization style
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71625

--- Comment #1 from Marc Glisse ---
(In reply to Renlin Li from comment #0)
> char array[] = "abc";
> return __builtin_strlen (array);

There are DUPs for this part.

> int hallo ();
> int dummy ()
> {
>   char array[] = "abc";
>   return hallo () + __builtin_strlen (array);
> }
>
> the __builtin_strlen is not fold into a const as in foo () above. Presumably,
> gcc is too conservative about what hallo () function can do. By adding a
> pure attribute to hallo (), gcc will generate optimal code.

Here it depends if we produce:

size_t tmp1=hallo();
size_t tmp2=strlen(array);
return tmp1+tmp2;

or use the reverse order for tmp1 and tmp2. Currently we evaluate a before b
in a+b, this example seems to suggest that when one sub-expression is pure
and not the other, it would make sense to evaluate the pure one first
(assuming we can determine that information early enough). It also depends
where the C++ proposal about order of evaluation is going...

Or we could do like clang and improve alias analysis. We should know that
array doesn't escape and thus that hallo() cannot write to it.
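The two evaluation orders can be written out by hand to show that they are interchangeable for this test case. This is only a sketch: the body of hallo below is a stand-in invented for illustration (the bug report only declares it), and it deliberately touches global state to mimic an opaque call.

```c
#include <stddef.h>
#include <string.h>

int hallo_calls;   /* global state the stand-in is allowed to modify */

/* Stand-in for the report's external hallo (); the body is an
   assumption.  */
int hallo (void)
{
  hallo_calls++;
  return 39;
}

/* Call first, strlen second: the order discussed above as the current
   one.  */
size_t dummy_call_first (void)
{
  char array[] = "abc";
  size_t tmp1 = (size_t) hallo ();
  size_t tmp2 = strlen (array);
  return tmp1 + tmp2;
}

/* strlen first, call second: the pure sub-expression is evaluated before
   the call that might clobber memory.  */
size_t dummy_strlen_first (void)
{
  char array[] = "abc";
  size_t tmp2 = strlen (array);
  size_t tmp1 = (size_t) hallo ();
  return tmp1 + tmp2;
}
```

Both functions return the same value because array is a local whose address is never handed to hallo, so no conforming hallo can write to it. That is the non-escape fact the alias analysis would need to establish in order to fold strlen in the call-first order as well.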