[Bug middle-end/71625] missing strlen optimization on different array initialization style

2016-06-25 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71625

--- Comment #10 from Marc Glisse  ---
(In reply to Jakub Jelinek from comment #6)
> > >   char array[] = "abc";
> > >   return __builtin_strlen (array);
> 
> Well, this actually is optimized.

Oops, I botched my copy-paste; I meant to copy:
  char array[] = {'a', 'b', 'c', '\0'};

> > Or we could do like clang and improve alias analysis. We should know that
> > array doesn't escape and thus that hallo() cannot write to it.
> 
> The strlen pass uses the alias oracle, so the question is why it thinks the
> call might affect the array.

The alias machinery thinks that array escapes (and we are flow-insensitive
there), so it is expected that hallo is assumed to be able to write to it. I
think array doesn't escape, but it isn't obvious to me where the aliasing code
decides that it does.

(In reply to Jakub Jelinek from comment #7)
> int baz ()
> {
>   char a[4];
>   a[0] = 'a';
>   a[1] = 'b';
>   a[2] = 'c';
>   a[3] = '\0';
>   return __builtin_strlen (a); 
> }
> still won't be optimized.

And we don't "vectorize" the stores either (llvm doesn't optimize strlen in
this case, but at least it writes a with a single movl $6513249, 4(%rsp)
instead of 4 separate movb instructions).
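
The immediate 6513249 in that movl is just the four initializer bytes packed
into one little-endian 32-bit word (0x00636261). A minimal sketch of the
arithmetic, assuming ASCII; pack_le32 and check_merged_store are names made up
for this illustration, not anything from GCC or LLVM:

```c
#include <assert.h>
#include <stdint.h>

/* Pack four bytes the way a little-endian target lays them out in a
   32-bit word: bytes[0] ends up in the least significant byte. */
uint32_t pack_le32(const char bytes[4])
{
    return (uint32_t)(unsigned char)bytes[0]
         | (uint32_t)(unsigned char)bytes[1] << 8
         | (uint32_t)(unsigned char)bytes[2] << 16
         | (uint32_t)(unsigned char)bytes[3] << 24;
}

/* Illustrative self-check (assumes ASCII: 'a' == 97, 'b' == 98, 'c' == 99). */
void check_merged_store(void)
{
    const char abc[4] = {'a', 'b', 'c', '\0'};
    assert(pack_le32(abc) == 6513249u);   /* 0x00636261 */
}
```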

[Bug middle-end/71625] missing strlen optimization on different array initialization style

2016-06-24 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71625

--- Comment #9 from Renlin Li  ---
(In reply to nsz from comment #8)
> (In reply to Jakub Jelinek from comment #6)
> > (In reply to Marc Glisse from comment #1)
> > > Or we could do like clang and improve alias analysis. We should know that
> > > array doesn't escape and thus that hallo() cannot write to it.
> > 
> > The strlen pass uses the alias oracle, so the question is why it thinks the
> > call might affect the array.
> 
> the optimization fails with
> 
>  const char array[] = "abc";
> 
> too (which is why i thought it was about pure strlen depending on global state
> other than the argument.. static const array works though).

char *array = "abc";

works; however, this puts the string literal in a read-only section.
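
The difference between the two declarations can be sketched as follows: the
array form creates a writable 4-byte copy in the function's frame, while the
pointer form refers to the literal itself, which must not be written through.
The function name is hypothetical and only serves the illustration:

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if modifying the local array leaves the literal untouched. */
int array_is_private_copy(void)
{
    char array[] = "abc";      /* writable copy owned by this function */
    const char *ptr = "abc";   /* points at the literal; writing via it is undefined */

    assert(sizeof array == 4); /* three characters plus the terminator */
    array[0] = 'x';            /* only the local copy changes */
    return strcmp(array, "xbc") == 0 && strcmp(ptr, "abc") == 0;
}
```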

[Bug middle-end/71625] missing strlen optimization on different array initialization style

2016-06-24 Thread nsz at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71625

--- Comment #8 from nsz at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #6)
> (In reply to Marc Glisse from comment #1)
> > Or we could do like clang and improve alias analysis. We should know that
> > array doesn't escape and thus that hallo() cannot write to it.
> 
> The strlen pass uses the alias oracle, so the question is why it thinks the
> call might affect the array.

the optimization fails with

 const char array[] = "abc";

too (which is why i thought it was about pure strlen depending on global state
other than the argument.. static const array works though).

[Bug middle-end/71625] missing strlen optimization on different array initialization style

2016-06-23 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71625

Jakub Jelinek  changed:

               What|Removed                       |Added
----------------------------------------------------------------------------
                 CC|                              |jakub at gcc dot gnu.org

--- Comment #7 from Jakub Jelinek  ---
(In reply to Martin Sebor from comment #4)
> I'm working on a patch that among other things transforms the CONSTRUCTOR
> node created for an array initializer to STRING_CST, as suggested by Richard
> in bug 71303 that this one is a duplicate of, to help fold strlen results
> for constant arguments (this also affects C++ constexpr where it would be
> useful to be able to call strlen).  Let me confirm this one since it has
> more discussion and close the other as a dupe.

That will certainly be useful not just for this, but also for the quality of
the generated code when it can't/shouldn't be optimized into a constant (the
usual case).

But,
int baz ()
{
  char a[4];
  a[0] = 'a';
  a[1] = 'b';
  a[2] = 'c';
  a[3] = '\0';
  return __builtin_strlen (a); 
}
still won't be optimized.  Not sure if it is worth doing something about it,
though.  It could just as well have the form
int baz2 ()
{
  char a[30];
  __builtin_memcpy (a, "1234567", 7);
  __builtin_memcpy (a + 7, "89abcdefg", 9);
  __builtin_memcpy (a + 16, "h", 2);
  return __builtin_strlen (a);
}
which isn't optimized either and would also need the notion of length > N.
Surprisingly
int baz3 ()
{
  char a[30];
  __builtin_memcpy (a, "1234567", 8);
  __builtin_memcpy (a + 7, "89abcdefg", 10);
  __builtin_memcpy (a + 16, "h", 2);
  return __builtin_strlen (a);
}
isn't optimized either.
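
For reference, both functions compute 17 at run time, which is the constant a
fully capable strlen pass could fold them to. A small check, using the standard
library names instead of the __builtin_ spellings; the *_len and check_ names
are invented here:

```c
#include <assert.h>
#include <string.h>

/* baz2: the chunks are copied without their terminators, except the last. */
size_t baz2_len(void)
{
    char a[30];
    memcpy(a, "1234567", 7);
    memcpy(a + 7, "89abcdefg", 9);
    memcpy(a + 16, "h", 2);
    return strlen(a);
}

/* baz3: each copy includes its terminator, which the next copy overwrites. */
size_t baz3_len(void)
{
    char a[30];
    memcpy(a, "1234567", 8);
    memcpy(a + 7, "89abcdefg", 10);
    memcpy(a + 16, "h", 2);
    return strlen(a);
}

void check_baz_lengths(void)
{
    assert(baz2_len() == 17);   /* 7 + 9 + 1 characters */
    assert(baz3_len() == 17);
}
```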

[Bug middle-end/71625] missing strlen optimization on different array initialization style

2016-06-23 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71625

--- Comment #6 from Jakub Jelinek  ---
(In reply to Marc Glisse from comment #1)
> (In reply to Renlin Li from comment #0)
> >   char array[] = "abc";
> >   return __builtin_strlen (array);
> 
> There are DUPs for this part.

Well, this actually is optimized.

GCC is able to fold strlen of a string literal very early, but the rest is done
in the strlen pass.  For foo, there is:
  array = "abc";
  _1 = __builtin_strlen (&array);
where we can easily compute the string length of the rhs and copy it over to
the lhs (array).
In bar, we have instead:
  array[0] = 97;
  array[1] = 98;
  array[2] = 99;
  array[3] = 0;
  _1 = __builtin_strlen (&array);
and right now, the strlen pass only handles the various string/memory builtins
plus stores of zero.  For this to work, we'd also need to instrument scalar
stores of non-zero values, and record a transitive "string length > N" fact
rather than "string length equal to N": for the array[0] = 97; store we'd
record that the string length at [0] is > 0, for array[1] = 98; update that
record to say > 1, then > 2.
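
The bookkeeping described above can be sketched as a toy model. This is only an
illustration of the "length > N" idea under simplifying assumptions (one array,
constant indices, stores seen in program order); struct len_info, record_store
and replay_bar are hypothetical names and do not correspond to the data
structures of the actual strlen pass:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-array record. */
struct len_info {
    size_t nonzero_prefix;  /* bytes [0, nonzero_prefix) are known non-zero,
                               so the string length is >= nonzero_prefix */
    bool exact;             /* byte [nonzero_prefix] is known to be zero */
};

/* Update the record for the scalar store array[index] = value. */
void record_store(struct len_info *info, size_t index, char value)
{
    if (index > info->nonzero_prefix)
        return;                         /* says nothing about the known prefix */
    if (value == 0) {
        info->nonzero_prefix = index;   /* the string now ends exactly here */
        info->exact = true;
    } else if (index == info->nonzero_prefix) {
        info->nonzero_prefix++;         /* length > index */
        info->exact = false;            /* a terminator here was overwritten */
    }
    /* A non-zero store inside the known prefix changes nothing. */
}

/* Replay the four stores from bar; returns the tracked length. */
size_t replay_bar(bool *exact)
{
    struct len_info info = {0, false};
    record_store(&info, 0, 97);   /* length > 0 */
    record_store(&info, 1, 98);   /* length > 1 */
    record_store(&info, 2, 99);   /* length > 2 */
    assert(!info.exact);          /* still only a lower bound */
    record_store(&info, 3, 0);    /* length == 3 */
    *exact = info.exact;
    return info.nonzero_prefix;
}
```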

> Here it depends if we produce:
> 
> size_t tmp1=hallo();
> size_t tmp2=strlen(array);
> return tmp1+tmp2;
> 
> or use the reverse order for tmp1 and tmp2. Currently we evaluate a before b
> in a+b; this example seems to suggest that when one sub-expression is pure
> and the other is not, it would make sense to evaluate the pure one first
> (assuming we can determine that information early enough). It also depends
> on where the C++ proposal about order of evaluation is going...
> 
> Or we could do like clang and improve alias analysis. We should know that
> array doesn't escape and thus that hallo() cannot write to it.

The strlen pass uses the alias oracle, so the question is why it thinks the
call might affect the array.

[Bug middle-end/71625] missing strlen optimization on different array initialization style

2016-06-23 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71625

--- Comment #5 from Martin Sebor  ---
*** Bug 71303 has been marked as a duplicate of this bug. ***

[Bug middle-end/71625] missing strlen optimization on different array initialization style

2016-06-23 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71625

Martin Sebor  changed:

               What|Removed                       |Added
----------------------------------------------------------------------------
           Keywords|                              |missed-optimization
             Status|UNCONFIRMED                   |ASSIGNED
   Last reconfirmed|                              |2016-06-23
           Assignee|unassigned at gcc dot gnu.org |msebor at gcc dot gnu.org
     Ever confirmed|0                             |1
      Known to fail|                              |4.9.3, 5.3.0, 6.1.0, 7.0

--- Comment #4 from Martin Sebor  ---
I'm working on a patch that, among other things, transforms the CONSTRUCTOR
node created for an array initializer into a STRING_CST, as suggested by
Richard in bug 71303 (which this one duplicates), to help fold strlen results
for constant arguments (this also affects C++ constexpr, where it would be
useful to be able to call strlen).  Let me confirm this one since it has more
discussion and close the other as a dupe.

[Bug middle-end/71625] missing strlen optimization on different array initialization style

2016-06-23 Thread joseph at codesourcery dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71625

--- Comment #3 from joseph at codesourcery dot com  ---
strlen is pure, not const, since its result depends on memory pointed to 
by its argument, not just the value of the pointer itself.  See 
extend.texi: "Note that a function that has pointer arguments and examines 
the data pointed to must @emph{not} be declared @code{const}.".
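
A small illustration of why the pointed-to memory matters: the same pointer
value yields two different results once the bytes behind it change, which is
what rules out const under the extend.texi wording quoted above. The function
names are hypothetical:

```c
#include <assert.h>
#include <string.h>

/* Calls strlen twice with an identical pointer value; only the memory
   behind the pointer changes in between. */
size_t strlen_after_store(size_t *before)
{
    char buf[8] = "abc";
    const char *p = buf;     /* the pointer value never changes below */

    *before = strlen(p);     /* 3 */
    buf[1] = '\0';           /* modify only the pointed-to memory */
    return strlen(p);        /* 1 */
}

void check_strlen_depends_on_memory(void)
{
    size_t before;
    size_t after = strlen_after_store(&before);
    assert(before == 3 && after == 1);
}
```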

[Bug middle-end/71625] missing strlen optimization on different array initialization style

2016-06-23 Thread nsz at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71625

nsz at gcc dot gnu.org changed:

               What|Removed                       |Added
----------------------------------------------------------------------------
                 CC|                              |nsz at gcc dot gnu.org

--- Comment #2 from nsz at gcc dot gnu.org ---
int hallo ();
int dummy ()
{
  char array[] = "abc";
  hallo ();
  return __builtin_strlen (array);
}

if strlen is pure then it cannot be evaluated at compile time because hallo
might modify global state, but strlen should be modeled with attribute const,
since the result cannot depend on global state.

i.e. in gcc/builtins.def

DEF_LIB_BUILTIN_CHKP   (BUILT_IN_STRLEN, "strlen", BT_FN_SIZE_CONST_STRING,
ATTR_PURE_NOTHROW_NONNULL_LEAF)

the PURE should be CONST.

[Bug middle-end/71625] missing strlen optimization on different array initialization style

2016-06-22 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71625

--- Comment #1 from Marc Glisse  ---
(In reply to Renlin Li from comment #0)
>   char array[] = "abc";
>   return __builtin_strlen (array);

There are DUPs for this part.

> int hallo ();
> int dummy ()
> {
>   char array[] = "abc";
>   return hallo () + __builtin_strlen (array);
> }
> 
> the __builtin_strlen is not folded into a constant as in foo () above.
> Presumably, gcc is too conservative about what the hallo () function can do.
> With a pure attribute added to hallo (), gcc generates optimal code.

Here it depends if we produce:

size_t tmp1=hallo();
size_t tmp2=strlen(array);
return tmp1+tmp2;

or use the reverse order for tmp1 and tmp2. Currently we evaluate a before b in
a+b; this example seems to suggest that when one sub-expression is pure and the
other is not, it would make sense to evaluate the pure one first (assuming we
can determine that information early enough). It also depends on where the C++
proposal about order of evaluation is going...

Or we could do like clang and improve alias analysis. We should know that array
doesn't escape and thus that hallo() cannot write to it.
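
To illustrate the last point: as long as the address of array is never handed
to hallo or stored anywhere it can reach, a separately defined hallo has no way
to modify it, so both evaluation orders produce the same sum and the strlen
could be folded to 3. A minimal sketch, with hallo_stub as a made-up stand-in
for the external hallo:

```c
#include <assert.h>
#include <string.h>

static int calls;   /* global state the stand-in is free to modify */

/* Hypothetical stand-in for hallo (): it touches global state but has no
   access to a caller's non-escaping locals. */
int hallo_stub(void)
{
    ++calls;
    return 7;
}

/* The order the comment says is produced today: call first, strlen second. */
size_t call_first(void)
{
    char array[] = "abc";
    size_t tmp1 = (size_t)hallo_stub();
    size_t tmp2 = strlen(array);
    return tmp1 + tmp2;
}

/* The reverse order: the pure sub-expression is evaluated first. */
size_t strlen_first(void)
{
    char array[] = "abc";
    size_t tmp2 = strlen(array);
    size_t tmp1 = (size_t)hallo_stub();
    return tmp1 + tmp2;
}

void check_orders_agree(void)
{
    assert(call_first() == strlen_first());
}
```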