Re: [PATCH] adjust object size computation for union accesses and PHIs (PR 92765)

2020-02-06 Thread Jeff Law
On Wed, 2020-02-05 at 16:57 -0700, Martin Sebor wrote:
> 
> It passes thanks to the TREE_CODE (arg) == PARM_DECL test added
> in the patch to get_range_strlen (the test was missing before
> and so while it handled ordinary objects (local or global) it
> unnecessarily excluded function arguments).
Oh yea, duh.  I recall noting you added the PARM_DECL handling and
thinking it might allow us to salvage some of the tests.  Then promptly
forgot.

jeff
> 



Re: [PATCH] adjust object size computation for union accesses and PHIs (PR 92765)

2020-02-05 Thread Martin Sebor

On 2/4/20 7:35 AM, Richard Biener wrote:



> ...
>
> > Jakub/Richi, comments on this hunk?
> >
> > > + tree ref = TREE_OPERAND (TREE_OPERAND (arg, 0), 0);
> > > + tree off = TREE_OPERAND (arg, 1);
> > > + if ((TREE_CODE (ref) == PARM_DECL || VAR_P (ref))
> > > + && (!DECL_EXTERNAL (ref)
> > > + || !array_at_struct_end_p (arg)))
> > > +   {
>
> I think you'd want decl_binds_to_current_def_p (ref) instead of !DECL_EXTERNAL.


I've made the change.


> Since 'arg' is originally a pointer, array_at_struct_end_p is meaningless
> here since that's about the structure of a reference while the pointer
> is just a value.


array_at_struct_end_p handles MEM_REF by looking at the argument
(i.e., at the DECL when it is one), so its use here avoids DECLs
with flexible arrays but allows others.  In other words, I don't
want to exclude MEM_REFs to a in:

  extern char a[4];

just because a is extern.


> So if you're concerned the object's size might not be as it looks like then
> you have to rely on decl_binds_to_current_def_p only.


I'm only concerned about sizes of extern objects of struct types
with flexible array members.  I believe others are handled fine.
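
To make the distinction concrete, here is a minimal sketch (not taken from the patch) of the two kinds of objects being discussed.  The names a, f, and the helper functions are only illustrative, and both objects are defined here rather than declared extern so the example links on its own.  It relies on the GNU C extension that allows static initialization of a flexible array member.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A fixed-size array: the declared type gives the full object size,
   no matter which translation unit defines it.  */
char a[4] = "abc";

/* A struct ending in a flexible array member.  */
struct flex { int n; char data[]; };

/* GNU C extension: the defining translation unit may statically
   initialize the flexible member, making the object larger than
   sizeof (struct flex).  A user that only sees an extern declaration
   cannot tell how large it is.  */
struct flex f = { 12, "hello, world" };

/* Bound that can be trusted from the declaration alone.  */
size_t
array_bound (void)
{
  return sizeof a;
}

/* Bytes the struct type itself reserves for the trailing array.  */
size_t
flex_declared_bytes (void)
{
  return sizeof (struct flex) - offsetof (struct flex, data);
}

/* Length of the string actually stored in the flexible member.  */
size_t
flex_actual_length (void)
{
  assert ((size_t) f.n == strlen (f.data));
  return strlen (f.data);
}
```

With these definitions array_bound () is a safe upper bound for a, whereas flex_declared_bytes () is smaller than the string stored in f.data, which is why the type alone cannot be used for such an object.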


> You also shouldn't use 'off' natively in the code below but use
> mem_ref_offset to access the embedded offset which is to be interpreted
> as signed integer (it's a pointer as you use it).  You compare it against
> an unsigned size...


I've changed it in the latest revision of the patch.
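
For readers following along, the hazard of mixing a signed offset with an unsigned size can be sketched in plain C as below.  This is not the code in the patch; bytes_remaining and naive_remaining are hypothetical helpers that only illustrate why the offset has to be read as a signed value before it is compared with or subtracted from a size.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes left in an object of SIZE bytes starting at signed OFFSET,
   or -1 if OFFSET lies outside the object.  */
int64_t
bytes_remaining (uint64_t size, int64_t offset)
{
  assert (size <= INT64_MAX);
  if (offset < 0 || (uint64_t) offset >= size)
    return -1;
  return (int64_t) (size - (uint64_t) offset);
}

/* The naive variant: the raw offset bits are treated as unsigned and
   subtracted without a prior check.  A negative offset wraps around
   and yields a result larger than the object itself.  */
uint64_t
naive_remaining (uint64_t size, uint64_t raw_offset)
{
  return size - raw_offset;
}
```

For a 4-byte object and an offset of -1, the signed version reports the access as out of bounds while the naive version wraps around to 5.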

Martin


Re: [PATCH] adjust object size computation for union accesses and PHIs (PR 92765)

2020-02-05 Thread Martin Sebor

On 2/3/20 11:44 AM, Jeff Law wrote:

On Fri, 2020-01-31 at 12:04 -0700, Martin Sebor wrote:

Attached is a reworked patch since the first one didn't go far
enough to solve the major problems.  The new solution relies on
get_range_strlen_dynamic the same way as the sprintf optimization,
and does away with the determine_min_objsize function and calling
compute_builtin_object_size.

To minimize testsuite fallout I extended get_range_strlen to handle
a couple more kinds of expressions(*), but I still had to xfail and
disable a few tests that were relying on being able to use the type
of the destination object as the upper bound on the string length.

Tested on x86_64-linux.

Martin

[*] With all the issues around MEM_REFs and types this change needs
extra scrutiny.  I'm still not sure I fully understand what can and
what cannot be safely relied on at this level.

On 1/15/20 6:18 AM, Martin Sebor wrote:

The strcmp optimization newly introduced in GCC 10 relies on
the size of the smallest referenced array object to determine
whether the function can return zero.  When the size of
the object is smaller than the length of the other string
argument the optimization folds the equality to false.

The bug report has identified a couple of problems here:
1) when the access to the array object is via a pointer to
a (possibly indirect) member of a union, in GIMPLE the pointer
may actually point to a different member than the one in
the original source code.  Thus the size of the array may
appear to be smaller than in the source code which can then
result in the optimization being invalid.
2) when the pointer in the access may point to two or more
arrays of different size (i.e., it's the result of a PHI),
assuming it points to the smallest of them can also lead
to an incorrect result when the optimization is applied.

The attached patch adjusts the optimization to 1) avoid making
any assumptions about the sizes of objects accessed via union
types, and 2) use the size of the largest object in PHI nodes.

Tested on x86_64-linux.

Martin



PR tree-optimization/92765 - wrong code for strcmp of a union member

gcc/ChangeLog:

 PR tree-optimization/92765
 * gimple-fold.c (get_range_strlen_tree): Handle MEM_REF and PARM_DECL.
 * tree-ssa-strlen.c (compute_string_length): Remove.
 (determine_min_objsize): Remove.
 (get_len_or_size): Add an argument.  Call get_range_strlen_dynamic.
 Avoid using type size as the upper bound on string length.
 (handle_builtin_string_cmp): Add an argument.  Adjust.
 (strlen_check_and_optimize_call): Pass additional argument to
 handle_builtin_string_cmp.

gcc/testsuite/ChangeLog:

 PR tree-optimization/92765
 * g++.dg/tree-ssa/strlenopt-1.C: New test.
 * g++.dg/tree-ssa/strlenopt-2.C: New test.
 * gcc.dg/Warray-bounds-58.c: New test.
 * gcc.dg/Wrestrict-20.c: Avoid a valid -Wformat-overflow.
 * gcc.dg/Wstring-compare.c: Xfail a test.
 * gcc.dg/strcmpopt_2.c: Disable tests.
 * gcc.dg/strcmpopt_4.c: Adjust tests.
 * gcc.dg/strcmpopt_10.c: New test.
 * gcc.dg/strlenopt-69.c: Disable tests.
 * gcc.dg/strlenopt-92.c: New test.
 * gcc.dg/strlenopt-93.c: New test.
 * gcc.dg/strlenopt.h: Declare calloc.
 * gcc.dg/tree-ssa/pr92056.c: Xfail tests until pr93518 is resolved.
 * gcc.dg/tree-ssa/builtin-sprintf-warn-23.c: Correct test (pr93517).

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index ed225922269..d70ac67e1ca 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -1280,7 +1280,7 @@ get_range_strlen_tree (tree arg, bitmap *visited, 
strlen_range_kind rkind,
c_strlen_data *pdata, unsigned eltsize)
  {
gcc_assert (TREE_CODE (arg) != SSA_NAME);
-
+
/* The length computed by this invocation of the function.  */
tree val = NULL_TREE;
  
@@ -1422,7 +1422,42 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,

  type about the length here.  */
   tight_bound = true;
 }
-  else if (VAR_P (arg))
+  else if (TREE_CODE (arg) == MEM_REF
+  && TREE_CODE (TREE_TYPE (arg)) == ARRAY_TYPE
+  && TREE_CODE (TREE_TYPE (TREE_TYPE (arg))) == INTEGER_TYPE
+  && TREE_CODE (TREE_OPERAND (arg, 0)) == ADDR_EXPR)
+   {
+ /* Handle a MEM_REF into a DECL accessing an array of integers,
+being conservative about references to extern structures with
+flexible array members that can be initialized to arbitrary
+numbers of elements as an extension (static structs are okay).
+FIXME: Make this less conservative -- see
+component_ref_size in tree.c.  */

I think it's generally been agreed that we can look at sizes of _DECL
nodes and this code doesn't look like this walks backwards through
casts or anything like that.  So the worry would be 

Re: [PATCH] adjust object size computation for union accesses and PHIs (PR 92765)

2020-02-04 Thread Richard Biener
On Mon, Feb 3, 2020 at 7:45 PM Jeff Law  wrote:
>
> On Fri, 2020-01-31 at 12:04 -0700, Martin Sebor wrote:
> > Attached is a reworked patch since the first one didn't go far
> > enough to solve the major problems.  The new solution relies on
> > get_range_strlen_dynamic the same way as the sprintf optimization,
> > and does away with the determine_min_objsize function and calling
> > compute_builtin_object_size.
> >
> > To minimize testsuite fallout I extended get_range_strlen to handle
> > a couple more kinds of expressions(*), but I still had to xfail and
> > disable a few tests that were relying on being able to use the type
> > of the destination object as the upper bound on the string length.
> >
> > Tested on x86_64-linux.
> >
> > Martin
> >
> > [*] With all the issues around MEM_REFs and types this change needs
> > extra scrutiny.  I'm still not sure I fully understand what can and
> > what cannot be safely relied on at this level.
> >
> > On 1/15/20 6:18 AM, Martin Sebor wrote:
> > > The strcmp optimization newly introduced in GCC 10 relies on
> > > the size of the smallest referenced array object to determine
> > > whether the function can return zero.  When the size of
> > > the object is smaller than the length of the other string
> > > argument the optimization folds the equality to false.
> > >
> > > The bug report has identified a couple of problems here:
> > > 1) when the access to the array object is via a pointer to
> > > a (possibly indirect) member of a union, in GIMPLE the pointer
> > > may actually point to a different member than the one in
> > > the original source code.  Thus the size of the array may
> > > appear to be smaller than in the source code which can then
> > > result in the optimization being invalid.
> > > 2) when the pointer in the access may point to two or more
> > > arrays of different size (i.e., it's the result of a PHI),
> > > assuming it points to the smallest of them can also lead
> > > to an incorrect result when the optimization is applied.
> > >
> > > The attached patch adjusts the optimization to 1) avoid making
> > > any assumptions about the sizes of objects accessed via union
> > > types, and 2) use the size of the largest object in PHI nodes.
> > >
> > > Tested on x86_64-linux.
> > >
> > > Martin
> >
> >
> > PR tree-optimization/92765 - wrong code for strcmp of a union member
> >
> > gcc/ChangeLog:
> >
> > PR tree-optimization/92765
> > * gimple-fold.c (get_range_strlen_tree): Handle MEM_REF and 
> > PARM_DECL.
> > * tree-ssa-strlen.c (compute_string_length): Remove.
> > (determine_min_objsize): Remove.
> > (get_len_or_size): Add an argument.  Call get_range_strlen_dynamic.
> > Avoid using type size as the upper bound on string length.
> > (handle_builtin_string_cmp): Add an argument.  Adjust.
> > (strlen_check_and_optimize_call): Pass additional argument to
> > handle_builtin_string_cmp.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR tree-optimization/92765
> > * g++.dg/tree-ssa/strlenopt-1.C: New test.
> > * g++.dg/tree-ssa/strlenopt-2.C: New test.
> > * gcc.dg/Warray-bounds-58.c: New test.
> > * gcc.dg/Wrestrict-20.c: Avoid a valid -Wformat-overflow.
> > * gcc.dg/Wstring-compare.c: Xfail a test.
> > * gcc.dg/strcmpopt_2.c: Disable tests.
> > * gcc.dg/strcmpopt_4.c: Adjust tests.
> > * gcc.dg/strcmpopt_10.c: New test.
> > * gcc.dg/strlenopt-69.c: Disable tests.
> > * gcc.dg/strlenopt-92.c: New test.
> > * gcc.dg/strlenopt-93.c: New test.
> > * gcc.dg/strlenopt.h: Declare calloc.
> > * gcc.dg/tree-ssa/pr92056.c: Xfail tests until pr93518 is resolved.
> > * gcc.dg/tree-ssa/builtin-sprintf-warn-23.c: Correct test (pr93517).
> >
> > diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
> > index ed225922269..d70ac67e1ca 100644
> > --- a/gcc/gimple-fold.c
> > +++ b/gcc/gimple-fold.c
> > @@ -1280,7 +1280,7 @@ get_range_strlen_tree (tree arg, bitmap *visited, 
> > strlen_range_kind rkind,
> >c_strlen_data *pdata, unsigned eltsize)
> >  {
> >gcc_assert (TREE_CODE (arg) != SSA_NAME);
> > -
> > +
> >/* The length computed by this invocation of the function.  */
> >tree val = NULL_TREE;
> >
> > @@ -1422,7 +1422,42 @@ get_range_strlen_tree (tree arg, bitmap *visited, 
> > strlen_range_kind rkind,
> >  type about the length here.  */
> >   tight_bound = true;
> > }
> > -  else if (VAR_P (arg))
> > +  else if (TREE_CODE (arg) == MEM_REF
> > +  && TREE_CODE (TREE_TYPE (arg)) == ARRAY_TYPE
> > +  && TREE_CODE (TREE_TYPE (TREE_TYPE (arg))) == INTEGER_TYPE
> > +  && TREE_CODE (TREE_OPERAND (arg, 0)) == ADDR_EXPR)
> > +   {
> > + /* Handle a MEM_REF into a DECL accessing an array of integers,
> > +being conservative about references to extern structures with
> 

Re: [PATCH] adjust object size computation for union accesses and PHIs (PR 92765)

2020-02-03 Thread Jeff Law
On Fri, 2020-01-31 at 12:04 -0700, Martin Sebor wrote:
> Attached is a reworked patch since the first one didn't go far
> enough to solve the major problems.  The new solution relies on
> get_range_strlen_dynamic the same way as the sprintf optimization,
> and does away with the determine_min_objsize function and calling
> compute_builtin_object_size.
> 
> To minimize testsuite fallout I extended get_range_strlen to handle
> a couple more kinds of expressions(*), but I still had to xfail and
> disable a few tests that were relying on being able to use the type
> of the destination object as the upper bound on the string length.
> 
> Tested on x86_64-linux.
> 
> Martin
> 
> [*] With all the issues around MEM_REFs and types this change needs
> extra scrutiny.  I'm still not sure I fully understand what can and
> what cannot be safely relied on at this level.
> 
> On 1/15/20 6:18 AM, Martin Sebor wrote:
> > The strcmp optimization newly introduced in GCC 10 relies on
> > the size of the smallest referenced array object to determine
> > whether the function can return zero.  When the size of
> > the object is smaller than the length of the other string
> > argument the optimization folds the equality to false.
> > 
> > The bug report has identified a couple of problems here:
> > 1) when the access to the array object is via a pointer to
> > a (possibly indirect) member of a union, in GIMPLE the pointer
> > may actually point to a different member than the one in
> > the original source code.  Thus the size of the array may
> > appear to be smaller than in the source code which can then
> > result in the optimization being invalid.
> > 2) when the pointer in the access may point to two or more
> > arrays of different size (i.e., it's the result of a PHI),
> > assuming it points to the smallest of them can also lead
> > to an incorrect result when the optimization is applied.
> > 
> > The attached patch adjusts the optimization to 1) avoid making
> > any assumptions about the sizes of objects accessed via union
> > types, and 2) use the size of the largest object in PHI nodes.
> > 
> > Tested on x86_64-linux.
> > 
> > Martin
> 
> 
> PR tree-optimization/92765 - wrong code for strcmp of a union member
> 
> gcc/ChangeLog:
> 
> PR tree-optimization/92765
> * gimple-fold.c (get_range_strlen_tree): Handle MEM_REF and PARM_DECL.
> * tree-ssa-strlen.c (compute_string_length): Remove.
> (determine_min_objsize): Remove.
> (get_len_or_size): Add an argument.  Call get_range_strlen_dynamic.
> Avoid using type size as the upper bound on string length.
> (handle_builtin_string_cmp): Add an argument.  Adjust.
> (strlen_check_and_optimize_call): Pass additional argument to
> handle_builtin_string_cmp.
> 
> gcc/testsuite/ChangeLog:
> 
> PR tree-optimization/92765
> * g++.dg/tree-ssa/strlenopt-1.C: New test.
> * g++.dg/tree-ssa/strlenopt-2.C: New test.
> * gcc.dg/Warray-bounds-58.c: New test.
> * gcc.dg/Wrestrict-20.c: Avoid a valid -Wformat-overflow.
> * gcc.dg/Wstring-compare.c: Xfail a test.
> * gcc.dg/strcmpopt_2.c: Disable tests.
> * gcc.dg/strcmpopt_4.c: Adjust tests.
> * gcc.dg/strcmpopt_10.c: New test.
> * gcc.dg/strlenopt-69.c: Disable tests.
> * gcc.dg/strlenopt-92.c: New test.
> * gcc.dg/strlenopt-93.c: New test.
> * gcc.dg/strlenopt.h: Declare calloc.
> * gcc.dg/tree-ssa/pr92056.c: Xfail tests until pr93518 is resolved.
> * gcc.dg/tree-ssa/builtin-sprintf-warn-23.c: Correct test (pr93517).
> 
> diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
> index ed225922269..d70ac67e1ca 100644
> --- a/gcc/gimple-fold.c
> +++ b/gcc/gimple-fold.c
> @@ -1280,7 +1280,7 @@ get_range_strlen_tree (tree arg, bitmap *visited, 
> strlen_range_kind rkind,
>c_strlen_data *pdata, unsigned eltsize)
>  {
>gcc_assert (TREE_CODE (arg) != SSA_NAME);
> - 
> +
>/* The length computed by this invocation of the function.  */
>tree val = NULL_TREE;
>  
> @@ -1422,7 +1422,42 @@ get_range_strlen_tree (tree arg, bitmap *visited, 
> strlen_range_kind rkind,
>  type about the length here.  */
>   tight_bound = true;
> }
> -  else if (VAR_P (arg))
> +  else if (TREE_CODE (arg) == MEM_REF
> +  && TREE_CODE (TREE_TYPE (arg)) == ARRAY_TYPE
> +  && TREE_CODE (TREE_TYPE (TREE_TYPE (arg))) == INTEGER_TYPE
> +  && TREE_CODE (TREE_OPERAND (arg, 0)) == ADDR_EXPR)
> +   {
> + /* Handle a MEM_REF into a DECL accessing an array of integers,
> +being conservative about references to extern structures with
> +flexible array members that can be initialized to arbitrary
> +numbers of elements as an extension (static structs are okay).
> +FIXME: Make this less conservative -- see
> +component_ref_size in 

Re: [PATCH] adjust object size computation for union accesses and PHIs (PR 92765)

2020-01-31 Thread Martin Sebor

Attached is a reworked patch since the first one didn't go far
enough to solve the major problems.  The new solution relies on
get_range_strlen_dynamic the same way as the sprintf optimization,
and does away with the determine_min_objsize function and calling
compute_builtin_object_size.

To minimize testsuite fallout I extended get_range_strlen to handle
a couple more kinds of expressions(*), but I still had to xfail and
disable a few tests that were relying on being able to use the type
of the destination object as the upper bound on the string length.

Tested on x86_64-linux.

Martin

[*] With all the issues around MEM_REFs and types this change needs
extra scrutiny.  I'm still not sure I fully understand what can and
what cannot be safely relied on at this level.

On 1/15/20 6:18 AM, Martin Sebor wrote:

The strcmp optimization newly introduced in GCC 10 relies on
the size of the smallest referenced array object to determine
whether the function can return zero.  When the size of
the object is smaller than the length of the other string
argument the optimization folds the equality to false.

The bug report has identified a couple of problems here:
1) when the access to the array object is via a pointer to
a (possibly indirect) member of a union, in GIMPLE the pointer
may actually point to a different member than the one in
the original source code.  Thus the size of the array may
appear to be smaller than in the source code which can then
result in the optimization being invalid.
2) when the pointer in the access may point to two or more
arrays of different size (i.e., it's the result of a PHI),
assuming it points to the smallest of them can also lead
to an incorrect result when the optimization is applied.

The attached patch adjusts the optimization to 1) avoid making
any assumptions about the sizes of objects accessed via union
types, and 2) use the size of the largest object in PHI nodes.

Tested on x86_64-linux.

Martin



PR tree-optimization/92765 - wrong code for strcmp of a union member

gcc/ChangeLog:

	PR tree-optimization/92765
	* gimple-fold.c (get_range_strlen_tree): Handle MEM_REF and PARM_DECL.
	* tree-ssa-strlen.c (compute_string_length): Remove.
	(determine_min_objsize): Remove.
	(get_len_or_size): Add an argument.  Call get_range_strlen_dynamic.
	Avoid using type size as the upper bound on string length.
	(handle_builtin_string_cmp): Add an argument.  Adjust.
	(strlen_check_and_optimize_call): Pass additional argument to
	handle_builtin_string_cmp.

gcc/testsuite/ChangeLog:

	PR tree-optimization/92765
	* g++.dg/tree-ssa/strlenopt-1.C: New test.
	* g++.dg/tree-ssa/strlenopt-2.C: New test.
	* gcc.dg/Warray-bounds-58.c: New test.
	* gcc.dg/Wrestrict-20.c: Avoid a valid -Wformat-overflow.
	* gcc.dg/Wstring-compare.c: Xfail a test.
	* gcc.dg/strcmpopt_2.c: Disable tests.
	* gcc.dg/strcmpopt_4.c: Adjust tests.
	* gcc.dg/strcmpopt_10.c: New test.
	* gcc.dg/strlenopt-69.c: Disable tests.
	* gcc.dg/strlenopt-92.c: New test.
	* gcc.dg/strlenopt-93.c: New test.
	* gcc.dg/strlenopt.h: Declare calloc.
	* gcc.dg/tree-ssa/pr92056.c: Xfail tests until pr93518 is resolved.
	* gcc.dg/tree-ssa/builtin-sprintf-warn-23.c: Correct test (pr93517).

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index ed225922269..d70ac67e1ca 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -1280,7 +1280,7 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
 		   c_strlen_data *pdata, unsigned eltsize)
 {
   gcc_assert (TREE_CODE (arg) != SSA_NAME);
- 
+
   /* The length computed by this invocation of the function.  */
   tree val = NULL_TREE;
 
@@ -1422,7 +1422,42 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
 	 type about the length here.  */
 	  tight_bound = true;
 	}
-  else if (VAR_P (arg))
+  else if (TREE_CODE (arg) == MEM_REF
+	   && TREE_CODE (TREE_TYPE (arg)) == ARRAY_TYPE
+	   && TREE_CODE (TREE_TYPE (TREE_TYPE (arg))) == INTEGER_TYPE
+	   && TREE_CODE (TREE_OPERAND (arg, 0)) == ADDR_EXPR)
+	{
+	  /* Handle a MEM_REF into a DECL accessing an array of integers,
+	 being conservative about references to extern structures with
+	 flexible array members that can be initialized to arbitrary
+	 numbers of elements as an extension (static structs are okay).
+	 FIXME: Make this less conservative -- see
+	 component_ref_size in tree.c.  */
+	  tree ref = TREE_OPERAND (TREE_OPERAND (arg, 0), 0);
+	  tree off = TREE_OPERAND (arg, 1);
+	  if ((TREE_CODE (ref) == PARM_DECL || VAR_P (ref))
+	  && (!DECL_EXTERNAL (ref)
+		  || !array_at_struct_end_p (arg)))
+	{
+	  /* Fail if the offset is out of bounds.  Such accesses
+		 should be diagnosed at some point.  */
+	  val = DECL_SIZE_UNIT (ref);
+	  if (!val
+		  || integer_zerop (val)
+		  || tree_int_cst_le (val, off))
+		return false;
+
+	  pdata->minlen = ssize_int (0);
+
+	  /* Subtract the offset and one for the terminating nul.  

Re: [PATCH] adjust object size computation for union accesses and PHIs (PR 92765)

2020-01-16 Thread Jakub Jelinek
On Wed, Jan 15, 2020 at 09:52:57PM +0100, Jakub Jelinek wrote:
> This looks wrong.  For one, this function is used for two purposes now and
> you tweak it for one, but more importantly, whether the initial stmt
> you see is a PHI or not can't make a difference, how is that case e.g.
> different from _1 = PHI <_3, _4>; _2 = _1 + 1; and asking about _2?
> For _1, you'd use (correctly) the maximum, but if called on _2, you'd ask
> (wrongly) for minimum instead of maximum.

And now with testcases.  strlenopt-95.c shows the above.

> This also looks like a hack to shut up the particular testcases instead of
> really playing with what the IL provides.  Instead of the unions, consider
> e.g. C++ placement new, have a pointer to a buffer into which you placement
> new one structure, take address of some member in it, pass it to something,
> if it doesn't have a destructor do a C++ placement new into the same buffer
> but with different structure, take address of a different member with the
> same address as the first member, do the str*cmp on it that invokes this
> stuff.  SCCVN will (likely) find out that the values of those two pointers
> are the same and just use the former pointer in the latter case.

And strlenopt-93.C shows the latter.  strlenopt-94.C is similar, just to
show that it breaks equally badly with non-PODs that will be constructed by
placement new and destructed later.

Jakub
__attribute__((noipa)) int
barrier_copy (char *x, int y)
{
  asm volatile ("" : : "g" (x), "g" (y) : "memory");
  if (y == 0)
    __builtin_strcpy (x, "abcd");
  return y;
}

__attribute__((noipa)) char *
test_2 (int x)
{
  char *p;
  if (x)
    p = __builtin_malloc (4);
  else
    p = __builtin_calloc (16, 1);
  char *q = p + 2;
  if (barrier_copy (q, x))
    return p;
  if (__builtin_strcmp (q, "abcd") != 0)
    __builtin_abort ();
  return p;
}

int
main ()
{
  __builtin_free (test_2 (0));
  __builtin_free (test_2 (1));
  return 0;
}

#include <new>

struct S1 { char a[2]; char b[2]; char c[2]; };
struct S2 { char d[6]; };

__attribute__((noipa)) void
foo (char *b)
{
  b[0] = 1;
  b[1] = 2;
  asm volatile ("" : : "g" (b) : "memory");
}

__attribute__((noipa)) void
bar (char *d)
{
  __builtin_memcpy (d, "cde", 4);
  asm volatile ("" : : "g" (d) : "memory");
}

__attribute__((noipa)) void
baz (char *buf)
{
  S1 *s1 = new (buf) S1;
  char *p = (char *) &s1->b;
  foo (p);
  S2 *s2 = new (buf) S2;
  char *q = (char *) &s2->d[2];
  bar (q);
  if (__builtin_strcmp (q, "cde"))
    __builtin_abort ();
}

int
main ()
{
  union U { S1 s1; S2 s2; char buf[sizeof (S1) > sizeof (S2) ? sizeof (S1) : sizeof (S2)]; } u;
  baz (u.buf);
  return 0;
}

#include <new>

struct S1 { char a[2]; char b[2]; char c[2]; S1 () { a[0] = 0; b[0] = 0; c[0] = 0; }; ~S1 () {} };
struct S2 { char d[6]; S2 () { d[0] = 0; d[2] = 0; } ~S2 () {} };

__attribute__((noipa)) void
foo (char *b)
{
  b[0] = 1;
  b[1] = 2;
  asm volatile ("" : : "g" (b) : "memory");
}

__attribute__((noipa)) void
bar (char *d)
{
  __builtin_memcpy (d, "cde", 4);
  asm volatile ("" : : "g" (d) : "memory");
}

__attribute__((noipa)) void
baz (char *buf)
{
  S1 *s1 = new (buf) S1 ();
  char *p = (char *) &s1->b;
  foo (p);
  s1->~S1 ();
  S2 *s2 = new (buf) S2 ();
  char *q = (char *) &s2->d[2];
  bar (q);
  if (__builtin_strcmp (q, "cde"))
    __builtin_abort ();
  s2->~S2 ();
}

int
main ()
{
  char buf[sizeof (S1) > sizeof (S2) ? sizeof (S1) : sizeof (S2)];
  baz (buf);
  return 0;
}


Re: [PATCH] adjust object size computation for union accesses and PHIs (PR 92765)

2020-01-15 Thread Jakub Jelinek
On Wed, Jan 15, 2020 at 01:18:54PM +0000, Martin Sebor wrote:
> @@ -4099,14 +4122,18 @@ determine_min_objsize (tree dest)
>  
>init_object_sizes ();
>  
> -  if (compute_builtin_object_size (dest, 2, &size))
> -return size;
> -
>/* Try to determine the size of the object through the RHS
>   of the assign statement.  */
>if (TREE_CODE (dest) == SSA_NAME)
>  {
>gimple *stmt = SSA_NAME_DEF_STMT (dest);
> +
> +  /* Determine the size of the largest object when DEST refers
> +  to two or more via a PHI, otherwise the smallest.  */
> +  int ostype = gimple_code (stmt) == GIMPLE_PHI ? 0 : 2;
> +  if (compute_builtin_object_size (dest, ostype, &size))
> + return size;
> +
>if (!is_gimple_assign (stmt))
>   return HOST_WIDE_INT_M1U;
>  
> @@ -4118,6 +4145,10 @@ determine_min_objsize (tree dest)
>return determine_min_objsize (dest);
>  }
>  
> +  /* Try to determine the size of the referenced object itself.  */
> +  if (compute_builtin_object_size (dest, 2, &size))
> +return size;
> +

This looks wrong.  For one, this function is used for two purposes now and
you tweak it for one, but more importantly, whether the initial stmt
you see is a PHI or not can't make a difference, how is that case e.g.
different from _1 = PHI <_3, _4>; _2 = _1 + 1; and asking about _2?
For _1, you'd use (correctly) the maximum, but if called on _2, you'd ask
(wrongly) for minimum instead of maximum.
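
The inconsistency can be sketched outside of GCC with a toy value graph.  Everything below is hypothetical (the struct value type, avail, and avail_flawed are not GCC interfaces); it only models the _1 = PHI <_3, _4>; _2 = _1 + 1 example, assuming _3 and _4 point to objects of 4 and 16 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* A toy value graph loosely modeled on SSA names.  */
enum kind { OBJECT, PHI, PLUS };

struct value
{
  enum kind kind;
  size_t size;                 /* OBJECT: size of the object in bytes.  */
  const struct value *a, *b;   /* PHI: both arguments; PLUS: A is the base.  */
  size_t offset;               /* PLUS: constant byte offset.  */
};

/* Bytes available at V, taking the largest (WANT_MAX nonzero) or the
   smallest candidate at every PHI reached while walking the graph.  */
size_t
avail (const struct value *v, int want_max)
{
  assert (v != NULL);
  switch (v->kind)
    {
    case OBJECT:
      return v->size;
    case PHI:
      {
        size_t x = avail (v->a, want_max), y = avail (v->b, want_max);
        return (x > y) == (want_max != 0) ? x : y;
      }
    case PLUS:
      {
        size_t s = avail (v->a, want_max);
        return s > v->offset ? s - v->offset : 0;
      }
    }
  return 0;
}

/* The scheme being criticized: ask for the maximum only when the
   queried value is itself defined by a PHI.  */
size_t
avail_flawed (const struct value *v)
{
  return avail (v, v->kind == PHI);
}

static const struct value v3 = { OBJECT, 4, NULL, NULL, 0 };
static const struct value v4 = { OBJECT, 16, NULL, NULL, 0 };
static const struct value v1 = { PHI, 0, &v3, &v4, 0 };    /* _1 */
static const struct value v2 = { PLUS, 0, &v1, NULL, 1 };  /* _2 = _1 + 1 */

size_t flawed_on_phi (void) { return avail_flawed (&v1); }
size_t flawed_on_plus (void) { return avail_flawed (&v2); }
size_t consistent_on_plus (void) { return avail (&v2, 1); }
```

Queried on _1 the flawed scheme yields the maximum (16), but queried on _2 it falls back to the minimum (3) even though up to 15 bytes may be available, which is the mismatch described above.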

>/* The size of a flexible array cannot be determined.  Otherwise,
> - for arrays with more than one element, return the size of its
> - type.  GCC itself misuses arrays of both zero and one elements
> - as flexible array members so they are excluded as well.  */
> + unless the reference involves a union, for arrays with more than
> + one element, return the size of its type.  GCC itself misuses
> + arrays of both zero and one elements as flexible array members
> + so they are excluded as well.  */
>if (TREE_CODE (type) != ARRAY_TYPE
> -  || !array_at_struct_end_p (dest))
> +  || (!component_ref_via_union_p (dest)
> +   && !array_at_struct_end_p (dest)))
>  {
>tree type_size = TYPE_SIZE_UNIT (type);
>if (type_size && TREE_CODE (type_size) == INTEGER_CST

This also looks like a hack to shut up the particular testcases instead of
really playing with what the IL provides.  Instead of the unions, consider
e.g. C++ placement new, have a pointer to a buffer into which you placement
new one structure, take address of some member in it, pass it to something,
if it doesn't have a destructor do a C++ placement new into the same buffer
but with different structure, take address of a different member with the
same address as the first member, do the str*cmp on it that invokes this
stuff.  SCCVN will (likely) find out that the values of those two pointers
are the same and just use the former pointer in the latter case.

Jakub



Re: [PATCH] adjust object size computation for union accesses and PHIs (PR 92765)

2020-01-15 Thread Jeff Law
On Wed, 2020-01-15 at 13:18 +0000, Martin Sebor wrote:
> The strcmp optimization newly introduced in GCC 10 relies on
> the size of the smallest referenced array object to determine
> whether the function can return zero.  When the size of
> the object is smaller than the length of the other string
> argument the optimization folds the equality to false.
> 
> The bug report has identified a couple of problems here:
> 1) when the access to the array object is via a pointer to
> a (possibly indirect) member of a union, in GIMPLE the pointer
> may actually point to a different member than the one in
> the original source code.  Thus the size of the array may
> appear to be smaller than in the source code which can then
> result in the optimization being invalid.
> 2) when the pointer in the access may point to two or more
> arrays of different size (i.e., it's the result of a PHI),
> assuming it points to the smallest of them can also lead
> to an incorrect result when the optimization is applied.
> 
> The attached patch adjusts the optimization to 1) avoid making
> any assumptions about the sizes of objects accessed via union
> types, and 2) use the size of the largest object in PHI nodes.
> 
> Tested on x86_64-linux.
> 
> Martin
> PR tree-optimization/92765 - wrong code for strcmp of a union member
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/92765
>   * gcc.dg/strlenopt-92.c: New test.
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/92765
>   * tree-ssa-strlen.c (component_ref_via_union_p): New function.
>   (determine_min_objsize): Call it.  Use the maximum object size
>   for PHI arguments.
OK
jeff



[PATCH] adjust object size computation for union accesses and PHIs (PR 92765)

2020-01-15 Thread Martin Sebor

The strcmp optimization newly introduced in GCC 10 relies on
the size of the smallest referenced array object to determine
whether the function can return zero.  When the size of
the object is smaller than the length of the other string
argument the optimization folds the equality to false.

The bug report has identified a couple of problems here:
1) when the access to the array object is via a pointer to
a (possibly indirect) member of a union, in GIMPLE the pointer
may actually point to a different member than the one in
the original source code.  Thus the size of the array may
appear to be smaller than in the source code which can then
result in the optimization being invalid.
2) when the pointer in the access may point to two or more
arrays of different size (i.e., it's the result of a PHI),
assuming it points to the smallest of them can also lead
to an incorrect result when the optimization is applied.
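
As a rough illustration of the first problem, the folding condition and the union case from the bug report can be modeled as follows.  This is a sketch, not the implementation in tree-ssa-strlen.c: strcmp_known_nonzero is a hypothetical helper, and it assumes the fold applies when the array is too small to hold the other string together with its terminating nul.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct S0 { char a[2]; };
union U0 { char b[4]; struct S0 s; };

static union U0 u0;

/* True if a string stored in an array of ARRAY_SIZE bytes can never
   equal a string of length OTHER_LEN, i.e. the array cannot hold
   OTHER_LEN characters plus the terminating nul.  */
bool
strcmp_known_nonzero (size_t array_size, size_t other_len)
{
  assert (array_size > 0);
  return array_size <= other_len;
}

/* Decision when the access appears to go through the 2-byte member.  */
bool
fold_if_sized_as_a (void)
{
  return strcmp_known_nonzero (sizeof u0.s.a, strlen ("ab"));
}

/* Decision when the access is seen through the 4-byte member.  */
bool
fold_if_sized_as_b (void)
{
  return strcmp_known_nonzero (sizeof u0.b, strlen ("ab"));
}

/* What actually happens at run time.  */
bool
strings_really_equal (void)
{
  memcpy (u0.b, "ab", 3);
  return strcmp (u0.b, "ab") == 0;
}
```

Sized by the member s.a the comparison would be folded to "not equal", while the string stored through b really does compare equal, so the fold is only safe when the size used is that of the object actually accessed.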

The attached patch adjusts the optimization to 1) avoid making
any assumptions about the sizes of objects accessed via union
types, and 2) use the size of the largest object in PHI nodes.

Tested on x86_64-linux.

Martin
PR tree-optimization/92765 - wrong code for strcmp of a union member

gcc/testsuite/ChangeLog:

	PR tree-optimization/92765
	* gcc.dg/strlenopt-92.c: New test.

gcc/ChangeLog:

	PR tree-optimization/92765
	* tree-ssa-strlen.c (component_ref_via_union_p): New function.
	(determine_min_objsize): Call it.  Use the maximum object size
	for PHI arguments.

diff --git a/gcc/testsuite/gcc.dg/strlenopt-92.c b/gcc/testsuite/gcc.dg/strlenopt-92.c
new file mode 100644
index 00000000000..44ad5b88854
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strlenopt-92.c
@@ -0,0 +1,94 @@
+/* PR tree-optimization/92765 - wrong code for strcmp of a union member
+   { dg-do run }
+   { dg-options "-O2 -Wall" } */
+
+#include "strlenopt.h"
+
+extern void free (void*);
+extern void* calloc (size_t, size_t);
+extern void* malloc (size_t);
+
+
+/* Test from comment #0.  */
+struct S0 {
+  char a[2];
+};
+
+union U0 {
+  char b[4];
+  struct S0 s;
+};
+
+union U0 u0;
+union U0 *p0 = &u0;
+
+int test_0 ()
+{
+  u0.b[0] = 'a';
+  u0.b[1] = 'b';
+  u0.b[2] = '\0';
+
+  int x = memcmp (p0->s.a, "x", 2);
+
+  if (strcmp (p0->b, "ab"))
+    abort ();
+
+  return x;
+}
+
+
+/* Test from comment #6.  */
+union U1 { struct S1 { char a[2]; char b[2]; char c[2]; } s; char d[6]; } u1;
+
+__attribute__((noipa)) void
+barrier (char *p)
+{
+  asm volatile ("" : : "g" (p) : "memory");
+}
+
+__attribute__((noipa)) void
+test_1 (union U1 *x)
+{
+  char *p = (char *) &x->s.b;
+  barrier (p);
+  if (strcmp (&x->d[2], "cde"))
+    abort ();
+}
+
+/* Test from comment #7.  */
+
+__attribute__((noipa)) int
+barrier_copy (char *x, int y)
+{
+  asm volatile ("" : : "g" (x), "g" (y) : "memory");
+  if (y == 0)
+    strcpy (x, "abcd");
+  return y;
+}
+
+__attribute__((noipa)) char *
+test_2 (int x)
+{
+  char *p;
+  if (x)
+    p = malloc (2);
+  else
+    p = calloc (16, 1);
+  if (barrier_copy (p, x))
+    return p;
+  if (strcmp (p, "abcd") != 0)
+    abort ();
+  return p;
+}
+
+
+int main (void)
+{
+  test_0 ();
+
+  strcpy (u1.d, "abcde");
+  test_1 (&u1);
+
+  free (test_2 (0));
+  free (test_2 (1));
+}
diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index ad9e98973b1..f4b6aadae47 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -4087,6 +4087,29 @@ compute_string_length (int idx)
   return string_leni;
 }
 
+/* Returns true if REF is a reference to a member of a union type,
+   or a member of such a type (traversing all references along
+   the path).  Used to avoid making assumptions about accesses
+   to members that could also be accessed by other members of
+   incompatible types.  */
+
+static bool
+component_ref_via_union_p (tree ref)
+{
+  if (TREE_CODE (ref) == ADDR_EXPR)
+    ref = TREE_OPERAND (ref, 0);
+
+  while (TREE_CODE (ref) == MEM_REF || handled_component_p (ref))
+    {
+      tree type = TREE_TYPE (ref);
+      if (TREE_CODE (type) == UNION_TYPE)
+	return true;
+      ref = TREE_OPERAND (ref, 0);
+    }
+
+  return false;
+}
+
 /* Determine the minimum size of the object referenced by DEST expression
which must have a pointer type.
Return the minimum size of the object if successful or HWI_M1U when
@@ -4099,14 +4122,18 @@ determine_min_objsize (tree dest)
 
   init_object_sizes ();
 
-  if (compute_builtin_object_size (dest, 2, &size))
-    return size;
-
   /* Try to determine the size of the object through the RHS
      of the assign statement.  */
   if (TREE_CODE (dest) == SSA_NAME)
     {
       gimple *stmt = SSA_NAME_DEF_STMT (dest);
+
+      /* Determine the size of the largest object when DEST refers
+	 to two or more via a PHI, otherwise the smallest.  */
+      int ostype = gimple_code (stmt) == GIMPLE_PHI ? 0 : 2;
+      if (compute_builtin_object_size (dest, ostype, &size))
+	return size;
+
       if (!is_gimple_assign (stmt))
 	return HOST_WIDE_INT_M1U;
 
@@ -4118,6 +4145,10 @@