[Bug tree-optimization/82224] Strict-aliasing not noticing valid aliasing of two unions with active members

2021-09-10 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82224

--- Comment #14 from Richard Biener  ---
Oh, and related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82224

[Bug tree-optimization/82224] Strict-aliasing not noticing valid aliasing of two unions with active members

2020-01-04 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82224

--- Comment #13 from Andrew Pinski  ---
See

http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_236.htm
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1409.htm

[Bug tree-optimization/82224] Strict-aliasing not noticing valid aliasing of two unions with active members

2017-10-30 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82224

--- Comment #12 from Richard Biener  ---
(In reply to Richard Biener from comment #9)
> One complication when deciding whether the downstream uses are fine is
> that we assign the same alias set to union accesses u->x and u->y.
> 
> That said, the union read/write pairs we remove essentially act as
> an optimization barrier - they "union" the alias sets of all components
> of the unions (not only of those accessed).  I think that's what other
> readings of the standard ask for (there's a bug about this as well).
> A mere declaration of a union needs to union the alias-sets and the
> following should be valid:
> 
> union { long l; long long ll; };
> 
> long foo (long *p)
> {
>   *p = 0;
>   *(long long *)p = 1;
>   return *p;
> }
> 
> this means (from an implementation perspective) that iff we do not
> want to go this far then accessing a union member via a non-union
> type isn't valid.
> 
> Anyway ;)  I don't see an easy way to fix the testcases in this PR
> without removing the optimization entirely.  It's currently guarded
> like
> 
>   /* If the TBAA state isn't the same for downstream reads
>  we cannot value-number the VDEFs the same.  */
>   alias_set_type set = get_alias_set (lhs);
>   if (! vnresult
>   || vnresult->set == set
>   || alias_set_subset_of (set, vnresult->set))
> 
> but we don't know whether vnresult is from a load or a store so we
> don't know whether it fully specifies the "TBAA state".  So even
> dumbing that down to
> 
>   if (vnresult->set != set
>   || set != get_alias_set (TREE_TYPE (lhs)))
> resultsame = false;
> 
> where the latter test sees if the alias set for the access and for an
> access using a pointer is the same might keep some holes open.
> 
> Some of the more interesting cases _did_ involve unions and the above
> would likely make them not optimized again (I remember sth with vector
> intrinsics and initialization / copying via unions).  Oh, even more
> trivial union->x = union->x; will stop from being optimized.
> 
> Index: gcc/tree-ssa-sccvn.c
> ===
> --- gcc/tree-ssa-sccvn.c(revision 254135)
> +++ gcc/tree-ssa-sccvn.c(working copy)
> @@ -3800,11 +3800,11 @@ visit_reference_op_store (tree lhs, tree
>resultsame = expressions_equal_p (result, op);
>if (resultsame)
> {
> - /* If the TBAA state isn't compatible for downstream reads
> + /* If the TBAA state isn't the same for downstream reads
>  we cannot value-number the VDEFs the same.  */
>   alias_set_type set = get_alias_set (lhs);
>   if (vnresult->set != set
> - && ! alias_set_subset_of (set, vnresult->set))
> + || set != get_alias_set (TREE_TYPE (lhs)))
> resultsame = false;
> }
>  }
> @@ -5628,9 +5628,9 @@ eliminate_dom_walker::before_dom_childre
>  at least all accesses the later one does or if the store
>  was to readonly memory storing the same value.  */
>   alias_set_type set = get_alias_set (lhs);
> - if (! vnresult
> - || vnresult->set == set
> - || alias_set_subset_of (set, vnresult->set))
> + if (vnresult
> + && vnresult->set == set
> + && set == get_alias_set (TREE_TYPE (lhs)))
> {
>   if (dump_file && (dump_flags & TDF_DETAILS))
> {
> Index: gcc/tree-ssa-dom.c
> ===
> --- gcc/tree-ssa-dom.c  (revision 254135)
> +++ gcc/tree-ssa-dom.c  (working copy)
> @@ -46,6 +46,7 @@ along with GCC; see the file COPYING3.
>  #include "gimplify.h"
>  #include "tree-cfgcleanup.h"
>  #include "dbgcnt.h"
> +#include "alias.h"
>  
>  /* This file implements optimizations on the dominator tree.  */
>  
> @@ -1953,7 +1954,8 @@ dom_opt_dom_walker::optimize_stmt (basic
>   gimple_set_vuse (new_stmt, gimple_vuse (stmt));
>   cached_lhs = m_avail_exprs_stack->lookup_avail_expr (new_stmt,
> false,
>false);
> - if (cached_lhs && operand_equal_p (rhs, cached_lhs, 0))
> + if (cached_lhs && operand_equal_p (rhs, cached_lhs, 0)
> + && get_alias_set (lhs) == get_alias_set (TREE_TYPE (lhs)))
> {
>   basic_block bb = gimple_bb (stmt);
>   unlink_stmt_vdef (stmt);
> 
> FAILs at least gcc.dg/tree-ssa/ssa-pre-26.c

And

FAIL: gnat.dg/opt39.adb scan-tree-dump-times optimized "MEM" 1 (found 9 times)

otherwise clean on x86_64-unknown-linux-gnu.

> as said I'm not 100% convinced the more strict handling avoids all of the
> cases.

[Bug tree-optimization/82224] Strict-aliasing not noticing valid aliasing of two unions with active members

2017-10-28 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82224

--- Comment #11 from Alexander Cherepanov  ---
When I was converting the original testcase to malloc I started with a 
simple assignment with a cast. Clang optimized it as well (and I posted 
it to the clang bug) bug gcc didn't. Essentially this example:

void f(long long *p)
{
   *(long *)p = *p;
}

tree optimization turns to

f (long long int * p)
{
   long long int _1;

[100.00%] [count: INV]:
   _1 = *p_3(D);
   MEM[(long int *)p_3(D)] = _1;
   return;

}

The same happens with "new (p) long (*p);". So the question: why is 
that? If this result is intended then perhaps memcpys and assignment of 
union members from this bug report could be converted by gcc to the same 
form? It would solve the problem.

[Bug tree-optimization/82224] Strict-aliasing not noticing valid aliasing of two unions with active members

2017-10-27 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82224

--- Comment #10 from Richard Biener  ---
(In reply to Alexander Cherepanov from comment #8)
> On 2017-10-27 12:41, rguenth at gcc dot gnu.org wrote:
> >> And with allocated memory (C; add placement new's for C++):
> >>
> >> --
> >> #include 
> >> #include 
> >> #include 
> >>
> >> static long test(long *px, long long *py, void *pu)
> >> {
> >>*px = 0;
> >>*py = 1;
> >>
> >>// change effective type from long long to long
> >>long tmp;
> >>memcpy(, pu, sizeof(tmp));
> >>memcpy(pu, , sizeof(tmp));
> > 
> > I believe this one is invalid - memcpy transfers the dynamic
> > type and *pu is currently 'long long'.  So it's either not
> > changing the dynamic type because, well, the type transfers
> > through 'tmp' or you are accessing 'tmp' with declared type
> > long as 'long long'.
> AIUI memcpy is always valid if there is enough space, no matter what 
> types its source and destination have. It accesses both of them as 
> arrays of chars -- C11, 7.24.1p1: "The header  declares one 
> type and several functions, and defines one  macro useful for 
> manipulating arrays of character type and other objects treated as 
> arrays of character type."
> 
> OTOH memcpy transfers effective type from the source to the destination 
> but only if the destination has no declared type -- C11, 6.5p6: "If a 
> value is copied into an object having no declared type using memcpy or 
> memmove, or is copied as an array of character type, then the effective 
> type of the modified object for that access and for subsequent accesses 
> that do not modify the value is the effective type of the object from 
> which the value is copied, if it has one."
> Placement new in C++ can change dynamic types of objects with declared 
> type but I don't think there are such facilities in C.
> 
> So in this example the first memcpy copies the value of *pu to tmp but 
> drops its effective type (long long) and the second memcpy copies the 
> value and passes the effective type of 'tmp' (long) along.

Hmm, ok.  For GCC internals the memcpy changes the dynamic type to
"anything" (it uses alias-set zero reads/writes) so for GCC internals
the testcase is valid in that the later read needs to consider the
memcpy store as aliasing.

[Bug tree-optimization/82224] Strict-aliasing not noticing valid aliasing of two unions with active members

2017-10-27 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82224

--- Comment #9 from Richard Biener  ---
One complication when deciding whether the downstream uses are fine is
that we assign the same alias set to union accesses u->x and u->y.

That said, the union read/write pairs we remove essentially act as
an optimization barrier - they "union" the alias sets of all components
of the unions (not only of those accessed).  I think that's what other
readings of the standard ask for (there's a bug about this as well).
A mere declaration of a union needs to union the alias-sets and the
following should be valid:

union { long l; long long ll; };

long foo (long *p)
{
  *p = 0;
  *(long long *)p = 1;
  return *p;
}

this means (from an implementation perspective) that iff we do not
want to go this far then accessing a union member via a non-union
type isn't valid.

Anyway ;)  I don't see an easy way to fix the testcases in this PR
without removing the optimization entirely.  It's currently guarded
like

  /* If the TBAA state isn't the same for downstream reads
 we cannot value-number the VDEFs the same.  */
  alias_set_type set = get_alias_set (lhs);
  if (! vnresult
  || vnresult->set == set
  || alias_set_subset_of (set, vnresult->set))

but we don't know whether vnresult is from a load or a store so we
don't know whether it fully specifies the "TBAA state".  So even
dumbing that down to

  if (vnresult->set != set
  || set != get_alias_set (TREE_TYPE (lhs)))
resultsame = false;

where the latter test sees if the alias set for the access and for an
access using a pointer is the same might keep some holes open.

Some of the more interesting cases _did_ involve unions and the above
would likely make them not optimized again (I remember sth with vector
intrinsics and initialization / copying via unions).  Oh, even more
trivial union->x = union->x; will stop from being optimized.

Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c(revision 254135)
+++ gcc/tree-ssa-sccvn.c(working copy)
@@ -3800,11 +3800,11 @@ visit_reference_op_store (tree lhs, tree
   resultsame = expressions_equal_p (result, op);
   if (resultsame)
{
- /* If the TBAA state isn't compatible for downstream reads
+ /* If the TBAA state isn't the same for downstream reads
 we cannot value-number the VDEFs the same.  */
  alias_set_type set = get_alias_set (lhs);
  if (vnresult->set != set
- && ! alias_set_subset_of (set, vnresult->set))
+ || set != get_alias_set (TREE_TYPE (lhs)))
resultsame = false;
}
 }
@@ -5628,9 +5628,9 @@ eliminate_dom_walker::before_dom_childre
 at least all accesses the later one does or if the store
 was to readonly memory storing the same value.  */
  alias_set_type set = get_alias_set (lhs);
- if (! vnresult
- || vnresult->set == set
- || alias_set_subset_of (set, vnresult->set))
+ if (vnresult
+ && vnresult->set == set
+ && set == get_alias_set (TREE_TYPE (lhs)))
{
  if (dump_file && (dump_flags & TDF_DETAILS))
{
Index: gcc/tree-ssa-dom.c
===
--- gcc/tree-ssa-dom.c  (revision 254135)
+++ gcc/tree-ssa-dom.c  (working copy)
@@ -46,6 +46,7 @@ along with GCC; see the file COPYING3.
 #include "gimplify.h"
 #include "tree-cfgcleanup.h"
 #include "dbgcnt.h"
+#include "alias.h"

 /* This file implements optimizations on the dominator tree.  */

@@ -1953,7 +1954,8 @@ dom_opt_dom_walker::optimize_stmt (basic
  gimple_set_vuse (new_stmt, gimple_vuse (stmt));
  cached_lhs = m_avail_exprs_stack->lookup_avail_expr (new_stmt, false,
   false);
- if (cached_lhs && operand_equal_p (rhs, cached_lhs, 0))
+ if (cached_lhs && operand_equal_p (rhs, cached_lhs, 0)
+ && get_alias_set (lhs) == get_alias_set (TREE_TYPE (lhs)))
{
  basic_block bb = gimple_bb (stmt);
  unlink_stmt_vdef (stmt);

FAILs at least gcc.dg/tree-ssa/ssa-pre-26.c

as said I'm not 100% convinced the more strict handling avoids all of the
cases.

[Bug tree-optimization/82224] Strict-aliasing not noticing valid aliasing of two unions with active members

2017-10-27 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82224

--- Comment #8 from Alexander Cherepanov  ---
On 2017-10-27 12:41, rguenth at gcc dot gnu.org wrote:
>> And with allocated memory (C; add placement new's for C++):
>>
>> --
>> #include 
>> #include 
>> #include 
>>
>> static long test(long *px, long long *py, void *pu)
>> {
>>*px = 0;
>>*py = 1;
>>
>>// change effective type from long long to long
>>long tmp;
>>memcpy(, pu, sizeof(tmp));
>>memcpy(pu, , sizeof(tmp));
> 
> I believe this one is invalid - memcpy transfers the dynamic
> type and *pu is currently 'long long'.  So it's either not
> changing the dynamic type because, well, the type transfers
> through 'tmp' or you are accessing 'tmp' with declared type
> long as 'long long'.
AIUI memcpy is always valid if there is enough space, no matter what 
types its source and destination have. It accesses both of them as 
arrays of chars -- C11, 7.24.1p1: "The header  declares one 
type and several functions, and defines one  macro useful for 
manipulating arrays of character type and other objects treated as 
arrays of character type."

OTOH memcpy transfers effective type from the source to the destination 
but only if the destination has no declared type -- C11, 6.5p6: "If a 
value is copied into an object having no declared type using memcpy or 
memmove, or is copied as an array of character type, then the effective 
type of the modified object for that access and for subsequent accesses 
that do not modify the value is the effective type of the object from 
which the value is copied, if it has one."
Placement new in C++ can change dynamic types of objects with declared 
type but I don't think there are such facilities in C.

So in this example the first memcpy copies the value of *pu to tmp but 
drops its effective type (long long) and the second memcpy copies the 
value and passes the effective type of 'tmp' (long) along.

[Bug tree-optimization/82224] Strict-aliasing not noticing valid aliasing of two unions with active members

2017-10-27 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82224

--- Comment #7 from Richard Biener  ---
(In reply to Alexander Cherepanov from comment #6) 
> And with allocated memory (C; add placement new's for C++):
> 
> --
> #include 
> #include 
> #include 
> 
> static long test(long *px, long long *py, void *pu)
> {
>   *px = 0;
>   *py = 1;
> 
>   // change effective type from long long to long
>   long tmp;
>   memcpy(, pu, sizeof(tmp));
>   memcpy(pu, , sizeof(tmp));

I believe this one is invalid - memcpy transfers the dynamic
type and *pu is currently 'long long'.  So it's either not
changing the dynamic type because, well, the type transfers
through 'tmp' or you are accessing 'tmp' with declared type
long as 'long long'.

>   return *px;
> }
> 
> int main(void)
> {
>   void *p = malloc(10);
> 
>   printf("%ld\n", test(p, p, p));
> }
> --
> 
> Results:
> 
> --
> $ gcc -std=c11 -pedantic -Wall -Wextra test.c && ./a.out
> 1
> $ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
> 0
> --
> 
> gcc version: gcc (GCC) 8.0.0 20171023 (experimental)

[Bug tree-optimization/82224] Strict-aliasing not noticing valid aliasing of two unions with active members

2017-10-23 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82224

Alexander Cherepanov  changed:

   What|Removed |Added

 CC||ch3root at openwall dot com

--- Comment #6 from Alexander Cherepanov  ---
Here are simplified testcases. With a union (C and C++):

--
#include 

union u {
  long x;
  long long y;
};

static long test(long *px, long long *py, union u *pu)
{
  pu->x = 0;// make .x active member (for C++)
  *px = 0;  // access .x via pointer

  pu->y = pu->x;// switch active member to .y (for C++)
  *py = 1;  // access .y via pointer

  pu->x = pu->y;// switch active member back to .x
  return *px;   // access .x via pointer
}

int main(void)
{
  union u u;

  printf("%ld\n", test(, , ));
}
--

Results:

--
$ gcc -std=c11 -pedantic -Wall -Wextra test.c && ./a.out
1
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
0
--

And with allocated memory (C; add placement new's for C++):

--
#include 
#include 
#include 

static long test(long *px, long long *py, void *pu)
{
  *px = 0;
  *py = 1;

  // change effective type from long long to long
  long tmp;
  memcpy(, pu, sizeof(tmp));
  memcpy(pu, , sizeof(tmp));

  return *px;
}

int main(void)
{
  void *p = malloc(10);

  printf("%ld\n", test(p, p, p));
}
--

Results:

--
$ gcc -std=c11 -pedantic -Wall -Wextra test.c && ./a.out
1
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
0
--

gcc version: gcc (GCC) 8.0.0 20171023 (experimental)

[Bug tree-optimization/82224] Strict-aliasing not noticing valid aliasing of two unions with active members

2017-09-25 Thread myriachan at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82224

--- Comment #5 from Melissa  ---
This originated from a Stack Overflow post "supercat" made (I'm the "Myria"
there).

https://stackoverflow.com/questions/46205744/is-this-use-of-unions-strictly-conforming/

[Bug tree-optimization/82224] Strict-aliasing not noticing valid aliasing of two unions with active members

2017-09-18 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82224

Richard Biener  changed:

   What|Removed |Added

   Keywords||alias
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2017-09-18
  Component|c   |tree-optimization
 Ever confirmed|0   |1

--- Comment #4 from Richard Biener  ---
It's more related to PR77745.  With the testcase we are still removing the
stores that change the active union member.  They do have the same alias set
unfortunately given they use union s1s2 * in their access path.

The IL that we then up optimizing is

   [100.00%] [count: INV]:
  q[0].v1.x = 4321;
...
  _3 =  + _2;
  _14 = MEM[(struct s1 *)].x;
...
  if (_15 != 0)
goto ; [33.00%] [count: INV]
  else
goto ; [67.00%] [count: INV]

   [33.00%] [count: INV]:
  MEM[(struct s2 *)_3].x = 1234;

   [100.00%] [count: INV]:
  _17 = MEM[(struct s1 *)].x;
...

where you can see all loads/stores being done through s1/s2 types which do
_not_ alias.

So we have

 q->v1.x = 4321; // make v1 active
 if ((>v1)->x) // read via pointer to active member, supposedly valid
   {
 q->v2.x = q->v1.x; // change v2 to be active, same value, optimized away
 (>v2)->x = 1234; // write via pointer to active member
 q->v1.x = q->v2.x; // change v1 to be active, same value, optimized away
   }
 if ((>v1)->x != 1234)
   abort ();

if we do not DSE the active member changing stores everything works as
desired.

Note that for GCC it isn't necessary to go through the union type when
changing the active type.  Note it definitely is desirable to elide
the load/store changing the active member in the end.

It's going to be tricky to find a solution that doesn't pessimize optimization
when unions are involved.  DOM is similarly affected, in the end it needs
sufficiently obfuscated IL to trick our "handle must-aliases conservatively"
code enough.