Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-11-16 Thread Richard Biener
On Thu, Nov 16, 2017 at 4:08 AM, Martin Sebor  wrote:
> On 11/15/2017 03:51 AM, Richard Biener wrote:
>>
>> On Tue, Nov 14, 2017 at 6:45 PM, Martin Sebor  wrote:
>>>
>>> On 11/14/2017 05:28 AM, Richard Biener wrote:


 On Mon, Nov 13, 2017 at 6:37 PM, Martin Sebor  wrote:
>
>
> Richard, this thread may have been conflated with the one Re:
> [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets
> (PR 82455) They are about different things.
>
> I'm still looking for approval of:
>
>   https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01208.html
>>>
>>>
>>>
>>> Sorry, I pointed to an outdated version.  This is the latest
>>> version:
>>>
>>>   https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html
>>>
>>> My bad...
>>>
>>>

 +  tree maxbound
 + = build_int_cst (sizetype, ~(1LLU << (TYPE_PRECISION (sizetype) -
 1)));

 this looks possibly bogus.  Can you instead use

   up_bound_p1
 = wide_int_to_tree (sizetype, wi::div_trunc (wi::max_value
 (TYPE_PRECISION (sizetype), SIGNED), wi::to_wide (eltsize)));

 please?  Note you are _not_ computing the proper upper bound here
 because
 that
 is what you compute plus low_bound.

 +  up_bound_p1 = int_const_binop (TRUNC_DIV_EXPR, maxbound,
 eltsize);

 +
 +  tree arg = TREE_OPERAND (ref, 0);
 +  tree_code code = TREE_CODE (arg);
 +  if (code == COMPONENT_REF)
 + {
 +  HOST_WIDE_INT off;
 +  if (tree base = get_addr_base_and_unit_offset (ref, ))
 +{
 +  tree size = TYPE_SIZE_UNIT (TREE_TYPE (base));
 +  if (TREE_CODE (size) == INTEGER_CST)
 + up_bound_p1 = int_const_binop (MINUS_EXPR, up_bound_p1, size);

 I think I asked this multiple times now but given 'ref' is the
 variable array-ref
 a.b.c[i] when you call get_addr_base_and_unit_offset (ref, ) you
 always
 get a NULL_TREE return value.

 So I asked you to pass it 'arg' instead ... which gets you the offset of
 a.b.c, which looks like what you intended to get anyway.

 I also wonder what you compute here - you are looking at the size of
 'base'
 but that is the size of 'a'.  You don't even use the computed offset!
 Which
 means you could have used get_base_address instead!?  Also the type
 of 'base' may be completely off given MEM[ + 8].b.c[i] would return
 blk
 as base which might be an array of chars and not in any way related to
 the type of the innermost structure we access with COMPONENT_REFs.

 Why are you only looking at COMPONENT_REF args anyways?  You
 don't want to handle a.b[3][i]?

 That is, I'd have expected you do

if (get_addr_base_and_unit_offset (ref, ))
  up_bound_p1 = wide_int_to_tree (sizetype, wi::sub (wi::to_wide
 (up_bound_p1), off));
>>
>>
>> ^
>
>
> Please see the attached update.

Ok.

Thanks,
Richard.

> Martin


Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-11-15 Thread Martin Sebor

On 11/15/2017 03:51 AM, Richard Biener wrote:

On Tue, Nov 14, 2017 at 6:45 PM, Martin Sebor  wrote:

On 11/14/2017 05:28 AM, Richard Biener wrote:


On Mon, Nov 13, 2017 at 6:37 PM, Martin Sebor  wrote:


Richard, this thread may have been conflated with the one Re:
[PATCH] enhance -Warray-bounds to detect out-of-bounds offsets
(PR 82455) They are about different things.

I'm still looking for approval of:

  https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01208.html



Sorry, I pointed to an outdated version.  This is the latest
version:

  https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html

My bad...




+  tree maxbound
+ = build_int_cst (sizetype, ~(1LLU << (TYPE_PRECISION (sizetype) - 1)));

this looks possibly bogus.  Can you instead use

  up_bound_p1
= wide_int_to_tree (sizetype, wi::div_trunc (wi::max_value
(TYPE_PRECISION (sizetype), SIGNED), wi::to_wide (eltsize)));

please?  Note you are _not_ computing the proper upper bound here because
that
is what you compute plus low_bound.

+  up_bound_p1 = int_const_binop (TRUNC_DIV_EXPR, maxbound, eltsize);

+
+  tree arg = TREE_OPERAND (ref, 0);
+  tree_code code = TREE_CODE (arg);
+  if (code == COMPONENT_REF)
+ {
+  HOST_WIDE_INT off;
+  if (tree base = get_addr_base_and_unit_offset (ref, ))
+{
+  tree size = TYPE_SIZE_UNIT (TREE_TYPE (base));
+  if (TREE_CODE (size) == INTEGER_CST)
+ up_bound_p1 = int_const_binop (MINUS_EXPR, up_bound_p1, size);

I think I asked this multiple times now but given 'ref' is the
variable array-ref
a.b.c[i] when you call get_addr_base_and_unit_offset (ref, ) you
always
get a NULL_TREE return value.

So I asked you to pass it 'arg' instead ... which gets you the offset of
a.b.c, which looks like what you intended to get anyway.

I also wonder what you compute here - you are looking at the size of
'base'
but that is the size of 'a'.  You don't even use the computed offset!
Which
means you could have used get_base_address instead!?  Also the type
of 'base' may be completely off given MEM[ + 8].b.c[i] would return
blk
as base which might be an array of chars and not in any way related to
the type of the innermost structure we access with COMPONENT_REFs.

Why are you only looking at COMPONENT_REF args anyways?  You
don't want to handle a.b[3][i]?

That is, I'd have expected you do

   if (get_addr_base_and_unit_offset (ref, ))
 up_bound_p1 = wide_int_to_tree (sizetype, wi::sub (wi::to_wide
(up_bound_p1), off));


^


Please see the attached update.

Martin
PR tree-optimization/82588 - missing -Warray-bounds on a excessively large index
PR tree-optimization/82583 - missing -Warray-bounds on out-of-bounds inner indic

gcc/ChangeLog:
	PR tree-optimization/82588
	PR tree-optimization/82583
	* tree-vrp.c (check_array_ref): Handle flexible array members,
	string literals, and inner indices.
	(search_for_addr_array): Add detail to diagnostics.

gcc/testsuite/ChangeLog:
	PR tree-optimization/82588
	PR tree-optimization/82583	
	* c-c++-common/Warray-bounds.c: New test.
	* gcc.dg/Warray-bounds-11.c: Adjust.
	* gcc.dg/Warray-bounds-22.c: New test.

diff --git a/gcc/testsuite/c-c++-common/Warray-bounds.c b/gcc/testsuite/c-c++-common/Warray-bounds.c
new file mode 100644
index 000..bea36fb
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Warray-bounds.c
@@ -0,0 +1,259 @@
+/* PR tree-optimization/82588 - missing -Warray-bounds on an excessively
+   large index
+   { dg-do compile }
+   { dg-require-effective-target alloca }
+   { dg-options "-O2 -Warray-bounds -ftrack-macro-expansion=0" }  */
+
+#define SIZE_MAX  __SIZE_MAX__
+#define DIFF_MAX  __PTRDIFF_MAX__
+#define DIFF_MIN  (-DIFF_MAX - 1)
+
+#define offsetof(T, m)   __builtin_offsetof (T, m)
+
+typedef __PTRDIFF_TYPE__ ssize_t;
+typedef __SIZE_TYPE__size_t;
+
+extern ssize_t signed_value (void)
+{
+  extern volatile ssize_t signed_value_source;
+  return signed_value_source;
+}
+
+extern size_t unsigned_value (void)
+{
+  extern volatile size_t unsigned_value_source;
+  return unsigned_value_source;
+}
+
+ssize_t signed_range (ssize_t min, ssize_t max)
+{
+  ssize_t val = signed_value ();
+  return val < min || max < val ? min : val;
+}
+
+typedef struct AX { int n; char ax[]; } AX;
+
+typedef struct A1 { int i; char a1[1]; } A1;
+typedef struct B { int i; struct A1 a1x[]; } B;
+
+void sink (int, ...);
+
+#define R(min, max) signed_range (min, max)
+#define T(expr) sink (0, expr)
+
+struct __attribute__ ((packed)) S16 { unsigned i: 16; };
+
+void farr_char (void)
+{
+  extern char ac[];
+
+  T (ac[DIFF_MIN]);   /* { dg-warning "array subscript -\[0-9\]+ is below array bounds of .char *\\\[]." } */
+  T (ac[-1]); /* { dg-warning "array subscript -1 is below array bounds" } */
+  T (ac[0]);
+
+  T (ac[DIFF_MAX - 1]);
+  T (ac[DIFF_MAX]);   /* { dg-warning "array subscript \[0-9\]+ is above array bounds" } */
+  T 

Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-11-15 Thread Martin Sebor

On 11/15/2017 03:51 AM, Richard Biener wrote:

On Tue, Nov 14, 2017 at 6:45 PM, Martin Sebor  wrote:

On 11/14/2017 05:28 AM, Richard Biener wrote:


On Mon, Nov 13, 2017 at 6:37 PM, Martin Sebor  wrote:


Richard, this thread may have been conflated with the one Re:
[PATCH] enhance -Warray-bounds to detect out-of-bounds offsets
(PR 82455) They are about different things.

I'm still looking for approval of:

  https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01208.html



Sorry, I pointed to an outdated version.  This is the latest
version:

  https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html

My bad...




+  tree maxbound
+ = build_int_cst (sizetype, ~(1LLU << (TYPE_PRECISION (sizetype) - 1)));

this looks possibly bogus.  Can you instead use

  up_bound_p1
= wide_int_to_tree (sizetype, wi::div_trunc (wi::max_value
(TYPE_PRECISION (sizetype), SIGNED), wi::to_wide (eltsize)));

please?  Note you are _not_ computing the proper upper bound here because
that
is what you compute plus low_bound.

+  up_bound_p1 = int_const_binop (TRUNC_DIV_EXPR, maxbound, eltsize);

+
+  tree arg = TREE_OPERAND (ref, 0);
+  tree_code code = TREE_CODE (arg);
+  if (code == COMPONENT_REF)
+ {
+  HOST_WIDE_INT off;
+  if (tree base = get_addr_base_and_unit_offset (ref, ))
+{
+  tree size = TYPE_SIZE_UNIT (TREE_TYPE (base));
+  if (TREE_CODE (size) == INTEGER_CST)
+ up_bound_p1 = int_const_binop (MINUS_EXPR, up_bound_p1, size);

I think I asked this multiple times now but given 'ref' is the
variable array-ref
a.b.c[i] when you call get_addr_base_and_unit_offset (ref, ) you
always
get a NULL_TREE return value.

So I asked you to pass it 'arg' instead ... which gets you the offset of
a.b.c, which looks like what you intended to get anyway.

I also wonder what you compute here - you are looking at the size of
'base'
but that is the size of 'a'.  You don't even use the computed offset!
Which
means you could have used get_base_address instead!?  Also the type
of 'base' may be completely off given MEM[ + 8].b.c[i] would return
blk
as base which might be an array of chars and not in any way related to
the type of the innermost structure we access with COMPONENT_REFs.

Why are you only looking at COMPONENT_REF args anyways?  You
don't want to handle a.b[3][i]?

That is, I'd have expected you do

   if (get_addr_base_and_unit_offset (ref, ))
 up_bound_p1 = wide_int_to_tree (sizetype, wi::sub (wi::to_wide
(up_bound_p1), off));


^





Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-11-15 Thread Richard Biener
On Tue, Nov 14, 2017 at 6:45 PM, Martin Sebor  wrote:
> On 11/14/2017 05:28 AM, Richard Biener wrote:
>>
>> On Mon, Nov 13, 2017 at 6:37 PM, Martin Sebor  wrote:
>>>
>>> Richard, this thread may have been conflated with the one Re:
>>> [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets
>>> (PR 82455) They are about different things.
>>>
>>> I'm still looking for approval of:
>>>
>>>   https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01208.html
>
>
> Sorry, I pointed to an outdated version.  This is the latest
> version:
>
>   https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html
>
> My bad...
>
>
>>
>> +  tree maxbound
>> + = build_int_cst (sizetype, ~(1LLU << (TYPE_PRECISION (sizetype) - 1)));
>>
>> this looks possibly bogus.  Can you instead use
>>
>>   up_bound_p1
>> = wide_int_to_tree (sizetype, wi::div_trunc (wi::max_value
>> (TYPE_PRECISION (sizetype), SIGNED), wi::to_wide (eltsize)));
>>
>> please?  Note you are _not_ computing the proper upper bound here because
>> that
>> is what you compute plus low_bound.
>>
>> +  up_bound_p1 = int_const_binop (TRUNC_DIV_EXPR, maxbound, eltsize);
>>
>> +
>> +  tree arg = TREE_OPERAND (ref, 0);
>> +  tree_code code = TREE_CODE (arg);
>> +  if (code == COMPONENT_REF)
>> + {
>> +  HOST_WIDE_INT off;
>> +  if (tree base = get_addr_base_and_unit_offset (ref, ))
>> +{
>> +  tree size = TYPE_SIZE_UNIT (TREE_TYPE (base));
>> +  if (TREE_CODE (size) == INTEGER_CST)
>> + up_bound_p1 = int_const_binop (MINUS_EXPR, up_bound_p1, size);
>>
>> I think I asked this multiple times now but given 'ref' is the
>> variable array-ref
>> a.b.c[i] when you call get_addr_base_and_unit_offset (ref, ) you
>> always
>> get a NULL_TREE return value.
>>
>> So I asked you to pass it 'arg' instead ... which gets you the offset of
>> a.b.c, which looks like what you intended to get anyway.
>>
>> I also wonder what you compute here - you are looking at the size of
>> 'base'
>> but that is the size of 'a'.  You don't even use the computed offset!
>> Which
>> means you could have used get_base_address instead!?  Also the type
>> of 'base' may be completely off given MEM[ + 8].b.c[i] would return
>> blk
>> as base which might be an array of chars and not in any way related to
>> the type of the innermost structure we access with COMPONENT_REFs.
>>
>> Why are you only looking at COMPONENT_REF args anyways?  You
>> don't want to handle a.b[3][i]?
>>
>> That is, I'd have expected you do
>>
>>if (get_addr_base_and_unit_offset (ref, ))
>>  up_bound_p1 = wide_int_to_tree (sizetype, wi::sub (wi::to_wide
>> (up_bound_p1), off));

^

Richard.

>> Richard.
>>
>>> Thanks
>>> Martin
>>>
>>>
> The difficulty with a testcase like
>
> struct { struct A { int b[1]; } a[5]; } x;
>
>  x.a[i].b[j]
>
> is that b is not considered an array at struct end since one of my
> recent changes to array_at_struct_end (basically it disallows having
> a flex array as the last member of an array).
>
> It would still stand for non-array components with variable offset
> but you can't create C testcases for that.
>
> So yes, for the specific case within the array_at_struct_end_p
> condition
> get_addr_base_and_unit_offset is enough.  IIRC the conditon was
> a bit more than just get_addr_base_and_unit_offset.  up_bound !=
> INTEGER_CST for example.  So make the above
>
> void foo (int n, int i)
> {
>  struct { struct A { int b[n]; } a[5]; } x;
>  return x.a[i].b[PTRDIFF_MAX/2];
> }
>
> with appropriately adjusted constant.  Does that give you the testcase
> you want?



 Thank you for the test case.  It is diagnosed the same way
 irrespective of which of the two functions is used so it serves
 to confirm my understanding that the only difference between
 the two functions is bits vs bytes.

 Unless you have another test case that does demonstrate that
 get_ref_base_and_extent is necessary/helpful, is the last patch
 okay to commit?

 (Again, to be clear, I'm happy to change or enhance the patch if
 I can verify that the change handles cases that the current patch
 misses.)

>
> As of "it works, catches corner-cases, ..." - yes, it does, but it
> adds code that needs to be maintained, may contain bugs, is
> executed even for valid code.



 Understood.  I don't claim the enhancement is free of any cost
 whatsoever.  But it is teeny by most standards and it doesn't
 detect just excessively large indices but also negative indices
 into last member arrays (bug 68325) and out-of-bounds indices
 (bug 82583).  The "excessively large" part does come largely
 for free with the other checks.

 Martin
>>>
>>>
>>>
>


Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-11-14 Thread Martin Sebor

On 11/14/2017 05:28 AM, Richard Biener wrote:

On Mon, Nov 13, 2017 at 6:37 PM, Martin Sebor  wrote:

Richard, this thread may have been conflated with the one Re:
[PATCH] enhance -Warray-bounds to detect out-of-bounds offsets
(PR 82455) They are about different things.

I'm still looking for approval of:

  https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01208.html


Sorry, I pointed to an outdated version.  This is the latest
version:

  https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html

My bad...



+  tree maxbound
+ = build_int_cst (sizetype, ~(1LLU << (TYPE_PRECISION (sizetype) - 1)));

this looks possibly bogus.  Can you instead use

  up_bound_p1
= wide_int_to_tree (sizetype, wi::div_trunc (wi::max_value
(TYPE_PRECISION (sizetype), SIGNED), wi::to_wide (eltsize)));

please?  Note you are _not_ computing the proper upper bound here because that
is what you compute plus low_bound.

+  up_bound_p1 = int_const_binop (TRUNC_DIV_EXPR, maxbound, eltsize);

+
+  tree arg = TREE_OPERAND (ref, 0);
+  tree_code code = TREE_CODE (arg);
+  if (code == COMPONENT_REF)
+ {
+  HOST_WIDE_INT off;
+  if (tree base = get_addr_base_and_unit_offset (ref, ))
+{
+  tree size = TYPE_SIZE_UNIT (TREE_TYPE (base));
+  if (TREE_CODE (size) == INTEGER_CST)
+ up_bound_p1 = int_const_binop (MINUS_EXPR, up_bound_p1, size);

I think I asked this multiple times now but given 'ref' is the
variable array-ref
a.b.c[i] when you call get_addr_base_and_unit_offset (ref, ) you always
get a NULL_TREE return value.

So I asked you to pass it 'arg' instead ... which gets you the offset of
a.b.c, which looks like what you intended to get anyway.

I also wonder what you compute here - you are looking at the size of 'base'
but that is the size of 'a'.  You don't even use the computed offset!  Which
means you could have used get_base_address instead!?  Also the type
of 'base' may be completely off given MEM[ + 8].b.c[i] would return blk
as base which might be an array of chars and not in any way related to
the type of the innermost structure we access with COMPONENT_REFs.

Why are you only looking at COMPONENT_REF args anyways?  You
don't want to handle a.b[3][i]?

That is, I'd have expected you do

   if (get_addr_base_and_unit_offset (ref, ))
 up_bound_p1 = wide_int_to_tree (sizetype, wi::sub (wi::to_wide
(up_bound_p1), off));

Richard.


Thanks
Martin



The difficulty with a testcase like

struct { struct A { int b[1]; } a[5]; } x;

 x.a[i].b[j]

is that b is not considered an array at struct end since one of my
recent changes to array_at_struct_end (basically it disallows having
a flex array as the last member of an array).

It would still stand for non-array components with variable offset
but you can't create C testcases for that.

So yes, for the specific case within the array_at_struct_end_p condition
get_addr_base_and_unit_offset is enough.  IIRC the conditon was
a bit more than just get_addr_base_and_unit_offset.  up_bound !=
INTEGER_CST for example.  So make the above

void foo (int n, int i)
{
 struct { struct A { int b[n]; } a[5]; } x;
 return x.a[i].b[PTRDIFF_MAX/2];
}

with appropriately adjusted constant.  Does that give you the testcase
you want?



Thank you for the test case.  It is diagnosed the same way
irrespective of which of the two functions is used so it serves
to confirm my understanding that the only difference between
the two functions is bits vs bytes.

Unless you have another test case that does demonstrate that
get_ref_base_and_extent is necessary/helpful, is the last patch
okay to commit?

(Again, to be clear, I'm happy to change or enhance the patch if
I can verify that the change handles cases that the current patch
misses.)



As of "it works, catches corner-cases, ..." - yes, it does, but it
adds code that needs to be maintained, may contain bugs, is
executed even for valid code.



Understood.  I don't claim the enhancement is free of any cost
whatsoever.  But it is teeny by most standards and it doesn't
detect just excessively large indices but also negative indices
into last member arrays (bug 68325) and out-of-bounds indices
(bug 82583).  The "excessively large" part does come largely
for free with the other checks.

Martin







Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-11-14 Thread Richard Biener
On Mon, Nov 13, 2017 at 6:37 PM, Martin Sebor  wrote:
> Richard, this thread may have been conflated with the one Re:
> [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets
> (PR 82455) They are about different things.
>
> I'm still looking for approval of:
>
>   https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01208.html

+  tree maxbound
+ = build_int_cst (sizetype, ~(1LLU << (TYPE_PRECISION (sizetype) - 1)));

this looks possibly bogus.  Can you instead use

  up_bound_p1
= wide_int_to_tree (sizetype, wi::div_trunc (wi::max_value
(TYPE_PRECISION (sizetype), SIGNED), wi::to_wide (eltsize)));

please?  Note you are _not_ computing the proper upper bound here because that
is what you compute plus low_bound.

+  up_bound_p1 = int_const_binop (TRUNC_DIV_EXPR, maxbound, eltsize);

+
+  tree arg = TREE_OPERAND (ref, 0);
+  tree_code code = TREE_CODE (arg);
+  if (code == COMPONENT_REF)
+ {
+  HOST_WIDE_INT off;
+  if (tree base = get_addr_base_and_unit_offset (ref, ))
+{
+  tree size = TYPE_SIZE_UNIT (TREE_TYPE (base));
+  if (TREE_CODE (size) == INTEGER_CST)
+ up_bound_p1 = int_const_binop (MINUS_EXPR, up_bound_p1, size);

I think I asked this multiple times now but given 'ref' is the
variable array-ref
a.b.c[i] when you call get_addr_base_and_unit_offset (ref, ) you always
get a NULL_TREE return value.

So I asked you to pass it 'arg' instead ... which gets you the offset of
a.b.c, which looks like what you intended to get anyway.

I also wonder what you compute here - you are looking at the size of 'base'
but that is the size of 'a'.  You don't even use the computed offset!  Which
means you could have used get_base_address instead!?  Also the type
of 'base' may be completely off given MEM[ + 8].b.c[i] would return blk
as base which might be an array of chars and not in any way related to
the type of the innermost structure we access with COMPONENT_REFs.

Why are you only looking at COMPONENT_REF args anyways?  You
don't want to handle a.b[3][i]?

That is, I'd have expected you do

   if (get_addr_base_and_unit_offset (ref, ))
 up_bound_p1 = wide_int_to_tree (sizetype, wi::sub (wi::to_wide
(up_bound_p1), off));

Richard.

> Thanks
> Martin
>
>
>>> The difficulty with a testcase like
>>>
>>> struct { struct A { int b[1]; } a[5]; } x;
>>>
>>>  x.a[i].b[j]
>>>
>>> is that b is not considered an array at struct end since one of my
>>> recent changes to array_at_struct_end (basically it disallows having
>>> a flex array as the last member of an array).
>>>
>>> It would still stand for non-array components with variable offset
>>> but you can't create C testcases for that.
>>>
>>> So yes, for the specific case within the array_at_struct_end_p condition
>>> get_addr_base_and_unit_offset is enough.  IIRC the conditon was
>>> a bit more than just get_addr_base_and_unit_offset.  up_bound !=
>>> INTEGER_CST for example.  So make the above
>>>
>>> void foo (int n, int i)
>>> {
>>>  struct { struct A { int b[n]; } a[5]; } x;
>>>  return x.a[i].b[PTRDIFF_MAX/2];
>>> }
>>>
>>> with appropriately adjusted constant.  Does that give you the testcase
>>> you want?
>>
>>
>> Thank you for the test case.  It is diagnosed the same way
>> irrespective of which of the two functions is used so it serves
>> to confirm my understanding that the only difference between
>> the two functions is bits vs bytes.
>>
>> Unless you have another test case that does demonstrate that
>> get_ref_base_and_extent is necessary/helpful, is the last patch
>> okay to commit?
>>
>> (Again, to be clear, I'm happy to change or enhance the patch if
>> I can verify that the change handles cases that the current patch
>> misses.)
>>
>>>
>>> As of "it works, catches corner-cases, ..." - yes, it does, but it
>>> adds code that needs to be maintained, may contain bugs, is
>>> executed even for valid code.
>>
>>
>> Understood.  I don't claim the enhancement is free of any cost
>> whatsoever.  But it is teeny by most standards and it doesn't
>> detect just excessively large indices but also negative indices
>> into last member arrays (bug 68325) and out-of-bounds indices
>> (bug 82583).  The "excessively large" part does come largely
>> for free with the other checks.
>>
>> Martin
>
>


Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-11-13 Thread Martin Sebor

Richard, this thread may have been conflated with the one Re:
[PATCH] enhance -Warray-bounds to detect out-of-bounds offsets
(PR 82455) They are about different things.

I'm still looking for approval of:

  https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01208.html

Thanks
Martin


The difficulty with a testcase like

struct { struct A { int b[1]; } a[5]; } x;

 x.a[i].b[j]

is that b is not considered an array at struct end since one of my
recent changes to array_at_struct_end (basically it disallows having
a flex array as the last member of an array).

It would still stand for non-array components with variable offset
but you can't create C testcases for that.

So yes, for the specific case within the array_at_struct_end_p condition
get_addr_base_and_unit_offset is enough.  IIRC the conditon was
a bit more than just get_addr_base_and_unit_offset.  up_bound !=
INTEGER_CST for example.  So make the above

void foo (int n, int i)
{
 struct { struct A { int b[n]; } a[5]; } x;
 return x.a[i].b[PTRDIFF_MAX/2];
}

with appropriately adjusted constant.  Does that give you the testcase
you want?


Thank you for the test case.  It is diagnosed the same way
irrespective of which of the two functions is used so it serves
to confirm my understanding that the only difference between
the two functions is bits vs bytes.

Unless you have another test case that does demonstrate that
get_ref_base_and_extent is necessary/helpful, is the last patch
okay to commit?

(Again, to be clear, I'm happy to change or enhance the patch if
I can verify that the change handles cases that the current patch
misses.)



As of "it works, catches corner-cases, ..." - yes, it does, but it
adds code that needs to be maintained, may contain bugs, is
executed even for valid code.


Understood.  I don't claim the enhancement is free of any cost
whatsoever.  But it is teeny by most standards and it doesn't
detect just excessively large indices but also negative indices
into last member arrays (bug 68325) and out-of-bounds indices
(bug 82583).  The "excessively large" part does come largely
for free with the other checks.

Martin




Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-11-07 Thread Martin Sebor

On 11/07/2017 03:18 AM, Richard Biener wrote:

On Tue, Nov 7, 2017 at 4:23 AM, Martin Sebor  wrote:

On 11/06/2017 11:41 AM, Jeff Law wrote:


On 10/29/2017 10:15 AM, Martin Sebor wrote:


Ping -- please see my reply below.

On 10/20/2017 09:57 AM, Richard Biener wrote:


get_addr_base_and_unit_offset will return NULL if there's any


variable


component in 'ref'.  So as written it seems to be dead code (you
want to pass 'arg'?)




Sorry, I'm not sure I understand what you mean.  What do you think
is dead code?  The call to get_addr_base_and_unit_offset() is also
made for an array of unspecified bound (up_bound is null) and for
an array at the end of a struct.  For those the function returns
non-null, and for the others (arrays of runtime bound) it returns
null.  (I passed arg instead of ref but I see no difference in
my tests.)



If you pass a.b.c[i] it will return NULL, if you pass a.b.c ('arg')


it will


return the offset of 'c'.  If you pass a.b[j].c it will still return


NULL.


You could use get_ref_base_and_extent which will return the offset
of a.b[0].c in this case and sets max_size != size - but you are only
interested in offset.  The disadvantage of get_ref_base_and_extent
is it returns offset in bits thus if the offset is too large for a


HWI


you'll instead get offset == 0 and max_size == -1.

Thus I'm saying this is dead code for variable array accesses
(even for the array you are warning about).  Yes, for constant index
and at-struct-end you'll get sth, but the warning is in VRP because
of variable indexes.

So I suggest to pass 'arg' and use get_ref_base_and_extent
for some extra precision (and possible lossage for very very large
structures).



Computing bit offsets defeats the out-of-bounds flexible array
index detection because the computation overflows (the function
sets the offset to zero).



It shouldn't if you pass arg rather than ref.



I suspect you missed my reply in IRC where I confirmed that this
approach doesn't work for the reason you yourself mentioned above:


The disadvantage of get_ref_base_and_extent
is it returns offset in bits thus if the offset is too large
for a HWI you'll instead get offset == 0 and max_size == -1.



This means that using the function you suggest would either prevent
detecting the out-of-bounds indices that overflow due to the bit
offset, thus largely defeating the purpose of the patch (to detect
excessively large indices), or not printing the value of the out-of
bounds index in the warning in this case, which would in turn mean
further changes to the rest of the logic.


I think Richi's point is that it's more important (in his opinion) to
handle the varying objects within an access like a.b.c[i] than to handle
cases that would overflow when converted to bits.  Thus he'd prefer to
use get_ref_base_and_extent over get_addr_base_and_unit_offset.

I wonder if a hybrid approach here would work.

ie, use get_ref_base_and_extent per Richi's request.  When that returns
an max_size of -1, then call get_addr_base_and_unit_offset and if it
returns that the offset is huge, then warn without any deeper analysis.

Yea, this could trip a false positive if someone makes an array that is
half the address space (give or take), but that's probably not at all
common.  Whereas the additional cases handled by get_ref_base_and_extent
are perhaps more useful.

Thoughts?



I'm always open to suggestions for improvements.  I just haven't
been able to construct a test case to demonstrate the improvement
gained by using get_ref_base_and_extent.  The only test cases I've
come up with show false positives.  It could very well be because
of my lack of experience with the APIs so if there is a trade-off
here that could be compensated for by the approach you suggest I'm
happy to incorporate it into the patch.  I just need a test case
to verify that the solution works as intended.


The difficulty with a testcase like

struct { struct A { int b[1]; } a[5]; } x;

 x.a[i].b[j]

is that b is not considered an array at struct end since one of my
recent changes to array_at_struct_end (basically it disallows having
a flex array as the last member of an array).

It would still stand for non-array components with variable offset
but you can't create C testcases for that.

So yes, for the specific case within the array_at_struct_end_p condition
get_addr_base_and_unit_offset is enough.  IIRC the conditon was
a bit more than just get_addr_base_and_unit_offset.  up_bound !=
INTEGER_CST for example.  So make the above

void foo (int n, int i)
{
 struct { struct A { int b[n]; } a[5]; } x;
 return x.a[i].b[PTRDIFF_MAX/2];
}

with appropriately adjusted constant.  Does that give you the testcase
you want?


Thank you for the test case.  It is diagnosed the same way
irrespective of which of the two functions is used so it serves
to confirm my understanding that the only difference between
the two functions is bits vs bytes.

Unless you have another test case that 

Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-11-07 Thread Richard Biener
On Tue, Nov 7, 2017 at 4:23 AM, Martin Sebor  wrote:
> On 11/06/2017 11:41 AM, Jeff Law wrote:
>>
>> On 10/29/2017 10:15 AM, Martin Sebor wrote:
>>>
>>> Ping -- please see my reply below.
>>>
>>> On 10/20/2017 09:57 AM, Richard Biener wrote:

 get_addr_base_and_unit_offset will return NULL if there's any
>
> variable

 component in 'ref'.  So as written it seems to be dead code (you
 want to pass 'arg'?)
>>>
>>>
>>>
>>> Sorry, I'm not sure I understand what you mean.  What do you think
>>> is dead code?  The call to get_addr_base_and_unit_offset() is also
>>> made for an array of unspecified bound (up_bound is null) and for
>>> an array at the end of a struct.  For those the function returns
>>> non-null, and for the others (arrays of runtime bound) it returns
>>> null.  (I passed arg instead of ref but I see no difference in
>>> my tests.)
>>
>>
>> If you pass a.b.c[i] it will return NULL, if you pass a.b.c ('arg')
>
> it will
>>
>> return the offset of 'c'.  If you pass a.b[j].c it will still return
>
> NULL.
>>
>> You could use get_ref_base_and_extent which will return the offset
>> of a.b[0].c in this case and sets max_size != size - but you are only
>> interested in offset.  The disadvantage of get_ref_base_and_extent
>> is it returns offset in bits thus if the offset is too large for a
>
> HWI
>>
>> you'll instead get offset == 0 and max_size == -1.
>>
>> Thus I'm saying this is dead code for variable array accesses
>> (even for the array you are warning about).  Yes, for constant index
>> and at-struct-end you'll get sth, but the warning is in VRP because
>> of variable indexes.
>>
>> So I suggest to pass 'arg' and use get_ref_base_and_extent
>> for some extra precision (and possible lossage for very very large
>> structures).
>
>
> Computing bit offsets defeats the out-of-bounds flexible array
> index detection because the computation overflows (the function
> sets the offset to zero).


 It shouldn't if you pass arg rather than ref.
>>>
>>>
>>> I suspect you missed my reply in IRC where I confirmed that this
>>> approach doesn't work for the reason you yourself mentioned above:
>>>
>> The disadvantage of get_ref_base_and_extent
>> is it returns offset in bits thus if the offset is too large
>> for a HWI you'll instead get offset == 0 and max_size == -1.
>>>
>>>
>>> This means that using the function you suggest would either prevent
>>> detecting the out-of-bounds indices that overflow due to the bit
>>> offset, thus largely defeating the purpose of the patch (to detect
>>> excessively large indices), or not printing the value of the out-of
>>> bounds index in the warning in this case, which would in turn mean
>>> further changes to the rest of the logic.
>>
>> I think Richi's point is that it's more important (in his opinion) to
>> handle the varying objects within an access like a.b.c[i] than to handle
>> cases that would overflow when converted to bits.  Thus he'd prefer to
>> use get_ref_base_and_extent over get_addr_base_and_unit_offset.
>>
>> I wonder if a hybrid approach here would work.
>>
>> ie, use get_ref_base_and_extent per Richi's request.  When that returns
>> an max_size of -1, then call get_addr_base_and_unit_offset and if it
>> returns that the offset is huge, then warn without any deeper analysis.
>>
>> Yea, this could trip a false positive if someone makes an array that is
>> half the address space (give or take), but that's probably not at all
>> common.  Whereas the additional cases handled by get_ref_base_and_extent
>> are perhaps more useful.
>>
>> Thoughts?
>
>
> I'm always open to suggestions for improvements.  I just haven't
> been able to construct a test case to demonstrate the improvement
> gained by using get_ref_base_and_extent.  The only test cases I've
> come up with show false positives.  It could very well be because
> of my lack of experience with the APIs so if there is a trade-off
> here that could be compensated for by the approach you suggest I'm
> happy to incorporate it into the patch.  I just need a test case
> to verify that the solution works as intended.

The difficulty with a testcase like

struct { struct A { int b[1]; } a[5]; } x;

 x.a[i].b[j]

is that b is not considered an array at struct end since one of my
recent changes to array_at_struct_end (basically it disallows having
a flex array as the last member of an array).

It would still stand for non-array components with variable offset
but you can't create C testcases for that.

So yes, for the specific case within the array_at_struct_end_p condition
get_addr_base_and_unit_offset is enough.  IIRC the conditon was
a bit more than just get_addr_base_and_unit_offset.  up_bound !=
INTEGER_CST for example.  So make the above

void foo (int n, int 

Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-11-06 Thread Martin Sebor

On 11/06/2017 11:41 AM, Jeff Law wrote:

On 10/29/2017 10:15 AM, Martin Sebor wrote:

Ping -- please see my reply below.

On 10/20/2017 09:57 AM, Richard Biener wrote:

get_addr_base_and_unit_offset will return NULL if there's any

variable

component in 'ref'.  So as written it seems to be dead code (you
want to pass 'arg'?)



Sorry, I'm not sure I understand what you mean.  What do you think
is dead code?  The call to get_addr_base_and_unit_offset() is also
made for an array of unspecified bound (up_bound is null) and for
an array at the end of a struct.  For those the function returns
non-null, and for the others (arrays of runtime bound) it returns
null.  (I passed arg instead of ref but I see no difference in
my tests.)


If you pass a.b.c[i] it will return NULL, if you pass a.b.c ('arg')

it will

return the offset of 'c'.  If you pass a.b[j].c it will still return

NULL.

You could use get_ref_base_and_extent which will return the offset
of a.b[0].c in this case and sets max_size != size - but you are only
interested in offset.  The disadvantage of get_ref_base_and_extent
is it returns offset in bits thus if the offset is too large for a

HWI

you'll instead get offset == 0 and max_size == -1.

Thus I'm saying this is dead code for variable array accesses
(even for the array you are warning about).  Yes, for constant index
and at-struct-end you'll get sth, but the warning is in VRP because
of variable indexes.

So I suggest to pass 'arg' and use get_ref_base_and_extent
for some extra precision (and possible lossage for very very large
structures).


Computing bit offsets defeats the out-of-bounds flexible array
index detection because the computation overflows (the function
sets the offset to zero).


It shouldn't if you pass arg rather than ref.


I suspect you missed my reply in IRC where I confirmed that this
approach doesn't work for the reason you yourself mentioned above:


The disadvantage of get_ref_base_and_extent
is it returns offset in bits thus if the offset is too large
for a HWI you'll instead get offset == 0 and max_size == -1.


This means that using the function you suggest would either prevent
detecting the out-of-bounds indices that overflow due to the bit
offset, thus largely defeating the purpose of the patch (to detect
excessively large indices), or not printing the value of the out-of
bounds index in the warning in this case, which would in turn mean
further changes to the rest of the logic.

I think Richi's point is that it's more important (in his opinion) to
handle the varying objects within an access like a.b.c[i] than to handle
cases that would overflow when converted to bits.  Thus he'd prefer to
use get_ref_base_and_extent over get_addr_base_and_unit_offset.

I wonder if a hybrid approach here would work.

ie, use get_ref_base_and_extent per Richi's request.  When that returns
an max_size of -1, then call get_addr_base_and_unit_offset and if it
returns that the offset is huge, then warn without any deeper analysis.

Yea, this could trip a false positive if someone makes an array that is
half the address space (give or take), but that's probably not at all
common.  Whereas the additional cases handled by get_ref_base_and_extent
are perhaps more useful.

Thoughts?


I'm always open to suggestions for improvements.  I just haven't
been able to construct a test case to demonstrate the improvement
gained by using get_ref_base_and_extent.  The only test cases I've
come up with show false positives.  It could very well be because
of my lack of experience with the APIs so if there is a trade-off
here that could be compensated for by the approach you suggest I'm
happy to incorporate it into the patch.  I just need a test case
to verify that the solution works as intended.

Martin


Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-11-06 Thread Martin Sebor

On 10/30/2017 06:02 AM, Richard Biener wrote:

On Sun, Oct 29, 2017 at 5:15 PM, Martin Sebor  wrote:

Ping -- please see my reply below.


On 10/20/2017 09:57 AM, Richard Biener wrote:


get_addr_base_and_unit_offset will return NULL if there's any


variable


component in 'ref'.  So as written it seems to be dead code (you
want to pass 'arg'?)




Sorry, I'm not sure I understand what you mean.  What do you think
is dead code?  The call to get_addr_base_and_unit_offset() is also
made for an array of unspecified bound (up_bound is null) and for
an array at the end of a struct.  For those the function returns
non-null, and for the others (arrays of runtime bound) it returns
null.  (I passed arg instead of ref but I see no difference in
my tests.)



If you pass a.b.c[i] it will return NULL, if you pass a.b.c ('arg')


it will


return the offset of 'c'.  If you pass a.b[j].c it will still return


NULL.


You could use get_ref_base_and_extent which will return the offset
of a.b[0].c in this case and sets max_size != size - but you are only
interested in offset.  The disadvantage of get_ref_base_and_extent
is it returns offset in bits thus if the offset is too large for a


HWI


you'll instead get offset == 0 and max_size == -1.

Thus I'm saying this is dead code for variable array accesses
(even for the array you are warning about).  Yes, for constant index
and at-struct-end you'll get sth, but the warning is in VRP because
of variable indexes.

So I suggest to pass 'arg' and use get_ref_base_and_extent
for some extra precision (and possible lossage for very very large
structures).



Computing bit offsets defeats the out-of-bounds flexible array
index detection because the computation overflows (the function
sets the offset to zero).



It shouldn't if you pass arg rather than ref.



I suspect you missed my reply in IRC where I confirmed that this
approach doesn't work for the reason you yourself mentioned above:


The disadvantage of get_ref_base_and_extent
is it returns offset in bits thus if the offset is too large
for a HWI you'll instead get offset == 0 and max_size == -1.


This means that using the function you suggest would either prevent
detecting the out-of-bounds indices that overflow due to the bit
offset, thus largely defeating the purpose of the patch (to detect
excessively large indices), or not printing the value of the out-of
bounds index in the warning in this case, which would in turn mean
further changes to the rest of the logic.


Well, it would not handle the case of

a.b[offset-bringing-you-bitoffset-overflow].c[i]

but it would handle (compared to your version of the patch)

a.b[j].c[i]

with i being the unreasonably large offset.  That's beause we
look at the bit offset of a.b[j].c thus the _start_ of the array.


AFAICS, using get_ref_base_and_extent only results in missing a set
of cases that get_addr_base_and_unit_offset doesn't.  It's entirely
possible I'm missing something but I can find no other difference
or construct a test case where an out-of-bounds index is detected
with the former that isn't with the latter.  Here's an example:

  #define MAX (__PTRDIFF_MAX__ - 256)

  struct A { char a[256]; char ax[]; };

  int f (struct A *p, unsigned long i)
  {
if (i < MAX + 1)
  i = MAX + 1; // bug: sizeof *p + i > PTRDIFF_MAX

return p->ax[i];   // missing -Warray-bounds
  }

This is because the offset of the array returned by
get_ref_base_and_extent when the bit offset computation overflows
is reset to zero.  So the function seems strictly worse in this
context that the other.   Am I missing something that makes this
a worthwhile tradeoff?  (If so, can you please show a test case?)


I still find the half-address-space case totally pointless to warn about
(even more to get "precise" here by subtracting the offset of the array).
There's not a single testcase that looks motivating to me (like an
error with some signed->unsigned conversion and then GCC magically
figuring out a range you'd warn for)


-Warray-bounds is documented to

  ...warn about subscripts to arrays that are always out of bounds.

Excessively large indices that would imply an object in excess of
PTRDIFF_MAX bytes are always out of bounds, so expecting them to
be diagnosed consistently, even in corner cases, is in line with
the documentation (and presumably also the intent) of the warning.

Other compilers (Clang and Intel ICC) already detect more of these
kinds of problems than GCC does.  In addition, C++ requires them
to be diagnosed in constexpr contexts.  Improving GCC to match and
even exceed these other compilers is simple and inexpensive, and
makes GCC a better implementation.  I welcome suggestions to
improve the patch (or submit a follow up one) and I'm open to
arguments that one solution is superior than another but I don't
understand your objection to a working solution on the grounds
that it handles corner cases.  All bugs are corner cases that
someone didn't get 

Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-11-06 Thread Jeff Law
On 10/29/2017 10:15 AM, Martin Sebor wrote:
> Ping -- please see my reply below.
> 
> On 10/20/2017 09:57 AM, Richard Biener wrote:
>> get_addr_base_and_unit_offset will return NULL if there's any
>>> variable
>> component in 'ref'.  So as written it seems to be dead code (you
>> want to pass 'arg'?)
>
>
> Sorry, I'm not sure I understand what you mean.  What do you think
> is dead code?  The call to get_addr_base_and_unit_offset() is also
> made for an array of unspecified bound (up_bound is null) and for
> an array at the end of a struct.  For those the function returns
> non-null, and for the others (arrays of runtime bound) it returns
> null.  (I passed arg instead of ref but I see no difference in
> my tests.)

 If you pass a.b.c[i] it will return NULL, if you pass a.b.c ('arg')
>>> it will
 return the offset of 'c'.  If you pass a.b[j].c it will still return
>>> NULL.
 You could use get_ref_base_and_extent which will return the offset
 of a.b[0].c in this case and sets max_size != size - but you are only
 interested in offset.  The disadvantage of get_ref_base_and_extent
 is it returns offset in bits thus if the offset is too large for a
>>> HWI
 you'll instead get offset == 0 and max_size == -1.

 Thus I'm saying this is dead code for variable array accesses
 (even for the array you are warning about).  Yes, for constant index
 and at-struct-end you'll get sth, but the warning is in VRP because
 of variable indexes.

 So I suggest to pass 'arg' and use get_ref_base_and_extent
 for some extra precision (and possible lossage for very very large
 structures).
>>>
>>> Computing bit offsets defeats the out-of-bounds flexible array
>>> index detection because the computation overflows (the function
>>> sets the offset to zero).
>>
>> It shouldn't if you pass arg rather than ref.
> 
> I suspect you missed my reply in IRC where I confirmed that this
> approach doesn't work for the reason you yourself mentioned above:
> 
 The disadvantage of get_ref_base_and_extent
 is it returns offset in bits thus if the offset is too large
 for a HWI you'll instead get offset == 0 and max_size == -1.
> 
> This means that using the function you suggest would either prevent
> detecting the out-of-bounds indices that overflow due to the bit
> offset, thus largely defeating the purpose of the patch (to detect
> excessively large indices), or not printing the value of the out-of
> bounds index in the warning in this case, which would in turn mean
> further changes to the rest of the logic.
I think Richi's point is that it's more important (in his opinion) to
handle the varying objects within an access like a.b.c[i] than to handle
cases that would overflow when converted to bits.  Thus he'd prefer to
use get_ref_base_and_extent over get_addr_base_and_unit_offset.

I wonder if a hybrid approach here would work.

ie, use get_ref_base_and_extent per Richi's request.  When that returns
an max_size of -1, then call get_addr_base_and_unit_offset and if it
returns that the offset is huge, then warn without any deeper analysis.

Yea, this could trip a false positive if someone makes an array that is
half the address space (give or take), but that's probably not at all
common.  Whereas the additional cases handled by get_ref_base_and_extent
are perhaps more useful.

Thoughts?

jeff


Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-10-30 Thread Richard Biener
On Sun, Oct 29, 2017 at 5:15 PM, Martin Sebor  wrote:
> Ping -- please see my reply below.
>
>
> On 10/20/2017 09:57 AM, Richard Biener wrote:
>>
>> get_addr_base_and_unit_offset will return NULL if there's any
>>>
>>> variable
>>
>> component in 'ref'.  So as written it seems to be dead code (you
>> want to pass 'arg'?)
>
>
>
> Sorry, I'm not sure I understand what you mean.  What do you think
> is dead code?  The call to get_addr_base_and_unit_offset() is also
> made for an array of unspecified bound (up_bound is null) and for
> an array at the end of a struct.  For those the function returns
> non-null, and for the others (arrays of runtime bound) it returns
> null.  (I passed arg instead of ref but I see no difference in
> my tests.)


 If you pass a.b.c[i] it will return NULL, if you pass a.b.c ('arg')
>>>
>>> it will

 return the offset of 'c'.  If you pass a.b[j].c it will still return
>>>
>>> NULL.

 You could use get_ref_base_and_extent which will return the offset
 of a.b[0].c in this case and sets max_size != size - but you are only
 interested in offset.  The disadvantage of get_ref_base_and_extent
 is it returns offset in bits thus if the offset is too large for a
>>>
>>> HWI

 you'll instead get offset == 0 and max_size == -1.

 Thus I'm saying this is dead code for variable array accesses
 (even for the array you are warning about).  Yes, for constant index
 and at-struct-end you'll get sth, but the warning is in VRP because
 of variable indexes.

 So I suggest to pass 'arg' and use get_ref_base_and_extent
 for some extra precision (and possible lossage for very very large
 structures).
>>>
>>>
>>> Computing bit offsets defeats the out-of-bounds flexible array
>>> index detection because the computation overflows (the function
>>> sets the offset to zero).
>>
>>
>> It shouldn't if you pass arg rather than ref.
>
>
> I suspect you missed my reply in IRC where I confirmed that this
> approach doesn't work for the reason you yourself mentioned above:
>
 The disadvantage of get_ref_base_and_extent
 is it returns offset in bits thus if the offset is too large
 for a HWI you'll instead get offset == 0 and max_size == -1.
>
> This means that using the function you suggest would either prevent
> detecting the out-of-bounds indices that overflow due to the bit
> offset, thus largely defeating the purpose of the patch (to detect
> excessively large indices), or not printing the value of the out-of
> bounds index in the warning in this case, which would in turn mean
> further changes to the rest of the logic.

Well, it would not handle the case of

a.b[offset-bringing-you-bitoffset-overflow].c[i]

but it would handle (compared to your version of the patch)

a.b[j].c[i]

with i being the unreasonably large offset.  That's beause we
look at the bit offset of a.b[j].c thus the _start_ of the array.

I still find the half-address-space case totally pointless to warn about
(even more to get "precise" here by subtracting the offset of the array).
There's not a single testcase that looks motivating to me (like an
error with some signed->unsigned conversion and then GCC magically
figuring out a range you'd warn for)

The diagnostic wording changes like

@@ -6718,7 +6750,8 @@ check_array_ref (location_t location, tree ref,
bool ignore_off_by_one)
   && tree_int_cst_le (low_sub, low_bound))
 {
   warning_at (location, OPT_Warray_bounds,
-  "array subscript is outside array bounds");
+  "array subscript [%E, %E] is outside array bounds of %qT",
+  low_sub, up_sub, artype);
   TREE_NO_WARNING (ref) = 1;
 }
 }

are ok for trunk.

Thanks,
Richard.


>>   I'll go ahead with the first version
>>>
>>> unless you have a different suggestion.
>
>
> But before I commit the patch as is I want to double-check to make
> sure you don't have further suggestions.
>
> Martin
>
> PS For reference the last version of the patch is here:
>
>   https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html


Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-10-29 Thread Martin Sebor

Ping -- please see my reply below.

On 10/20/2017 09:57 AM, Richard Biener wrote:

get_addr_base_and_unit_offset will return NULL if there's any

variable

component in 'ref'.  So as written it seems to be dead code (you
want to pass 'arg'?)



Sorry, I'm not sure I understand what you mean.  What do you think
is dead code?  The call to get_addr_base_and_unit_offset() is also
made for an array of unspecified bound (up_bound is null) and for
an array at the end of a struct.  For those the function returns
non-null, and for the others (arrays of runtime bound) it returns
null.  (I passed arg instead of ref but I see no difference in
my tests.)


If you pass a.b.c[i] it will return NULL, if you pass a.b.c ('arg')

it will

return the offset of 'c'.  If you pass a.b[j].c it will still return

NULL.

You could use get_ref_base_and_extent which will return the offset
of a.b[0].c in this case and sets max_size != size - but you are only
interested in offset.  The disadvantage of get_ref_base_and_extent
is it returns offset in bits thus if the offset is too large for a

HWI

you'll instead get offset == 0 and max_size == -1.

Thus I'm saying this is dead code for variable array accesses
(even for the array you are warning about).  Yes, for constant index
and at-struct-end you'll get sth, but the warning is in VRP because
of variable indexes.

So I suggest to pass 'arg' and use get_ref_base_and_extent
for some extra precision (and possible lossage for very very large
structures).


Computing bit offsets defeats the out-of-bounds flexible array
index detection because the computation overflows (the function
sets the offset to zero).


It shouldn't if you pass arg rather than ref.


I suspect you missed my reply in IRC where I confirmed that this
approach doesn't work for the reason you yourself mentioned above:

>>> The disadvantage of get_ref_base_and_extent
>>> is it returns offset in bits thus if the offset is too large
>>> for a HWI you'll instead get offset == 0 and max_size == -1.

This means that using the function you suggest would either prevent
detecting the out-of-bounds indices that overflow due to the bit
offset, thus largely defeating the purpose of the patch (to detect
excessively large indices), or not printing the value of the out-of
bounds index in the warning in this case, which would in turn mean
further changes to the rest of the logic.


  I'll go ahead with the first version

unless you have a different suggestion.


But before I commit the patch as is I want to double-check to make
sure you don't have further suggestions.

Martin

PS For reference the last version of the patch is here:

  https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html


Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-10-20 Thread Richard Biener
On October 20, 2017 5:43:40 PM GMT+02:00, Martin Sebor  wrote:
>On 10/20/2017 02:08 AM, Richard Biener wrote:
>> On Fri, Oct 20, 2017 at 1:00 AM, Martin Sebor 
>wrote:
>>> On 10/19/2017 02:34 AM, Richard Biener wrote:

 On Thu, Oct 19, 2017 at 1:19 AM, Martin Sebor 
>wrote:
>
> On 10/18/2017 04:48 AM, Richard Biener wrote:
>>
>>
>> On Wed, Oct 18, 2017 at 5:34 AM, Martin Sebor 
>wrote:
>>>
>>>
>>> While testing my latest -Wrestrict changes I noticed a number of
>>> opportunities to improve the -Warray-bounds warning.  Attached
>>> is a patch that implements a solution for the following subset
>>> of these:
>>>
>>> PR tree-optimization/82596 - missing -Warray-bounds on an out-of
>>>   bounds index into string literal
>>> PR tree-optimization/82588 - missing -Warray-bounds on an
>excessively
>>>   large index
>>> PR tree-optimization/82583 - missing -Warray-bounds on
>out-of-bounds
>>>   inner indices
>>>
>>>
 I meant to use size_type_node (size_t), not sizetype.  But
 I just checked that ptrdiff_type_node is initialized in
 build_common_tree_nodes and thus always available.
>>>
>>>
>>> I see.  Using ptrdiff_type_node is preferable for the targets
>>> where ptrdiff_t has a greater precision than size_t (e.g., VMS).
>>> It makes sense now.  I should remember to change all the other
>>> places where I introduced ssizetype to use ptrdiff_type_node.
>>>

> As an aside, at some point I would like to get away from a type
> based limit in all these warnings and instead use one that can
> be controlled by an option so that a user can impose a lower limit
> on the maximum size of an object and have all size-related
>warnings
> (and perhaps even optimizations) enforce it and benefit from it.


 You could add a --param that is initialized from ptrdiff_type_node.
>>>
>>>
>>> Yes, that's an option to consider.  Thanks.
>>>
>>>

>> +  tree arg = TREE_OPERAND (ref, 0);
>> +  tree_code code = TREE_CODE (arg);
>> +  if (code == COMPONENT_REF)
>> +   {
>> + HOST_WIDE_INT off;
>> + if (tree base = get_addr_base_and_unit_offset (ref,
>))
>> +   up_bound_p1 = fold_build2 (MINUS_EXPR, ssizetype,
>> up_bound_p1,
>> +  TYPE_SIZE_UNIT (TREE_TYPE
>> (base)));
>> + else
>> +   return;
>>
>> so this gives up on a.b[i].c.d[k] (ok, array_at_struct_end_p will
>be
>> false).
>> simply not subtracting anyhing instead of returning would be
>> conservatively
>> correct, no?  Likewise subtracting the offset of the array for
>all
>> "previous"
>> variably indexed components with assuming the lowest value for
>the
>> index.
>> But as above I think compensating for the offset of the array
>within the
>> object
>> is academic ... ;)
>
>
>
> I was going to say yes (it gives up) but on second thought I don't
> think it does.  Only the major index can be unbounded and the code
> does consider the size of the sub-array when checking the major
> index.  So, IIUC, I think this works correctly as is (*).  What
> doesn't work is VLAs but those are a separate problem.  Let me
> know if I misunderstood your question.


 get_addr_base_and_unit_offset will return NULL if there's any
>variable
 component in 'ref'.  So as written it seems to be dead code (you
 want to pass 'arg'?)
>>>
>>>
>>> Sorry, I'm not sure I understand what you mean.  What do you think
>>> is dead code?  The call to get_addr_base_and_unit_offset() is also
>>> made for an array of unspecified bound (up_bound is null) and for
>>> an array at the end of a struct.  For those the function returns
>>> non-null, and for the others (arrays of runtime bound) it returns
>>> null.  (I passed arg instead of ref but I see no difference in
>>> my tests.)
>>
>> If you pass a.b.c[i] it will return NULL, if you pass a.b.c ('arg')
>it will
>> return the offset of 'c'.  If you pass a.b[j].c it will still return
>NULL.
>> You could use get_ref_base_and_extent which will return the offset
>> of a.b[0].c in this case and sets max_size != size - but you are only
>> interested in offset.  The disadvantage of get_ref_base_and_extent
>> is it returns offset in bits thus if the offset is too large for a
>HWI
>> you'll instead get offset == 0 and max_size == -1.
>>
>> Thus I'm saying this is dead code for variable array accesses
>> (even for the array you are warning about).  Yes, for constant index
>> and at-struct-end you'll get sth, but the warning is in VRP because
>> of variable indexes.
>>
>> So I suggest to pass 'arg' and use get_ref_base_and_extent
>> for some extra precision (and possible lossage for very very large
>> structures).
>
>Computing bit offsets defeats 

Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-10-20 Thread Martin Sebor

On 10/20/2017 02:08 AM, Richard Biener wrote:

On Fri, Oct 20, 2017 at 1:00 AM, Martin Sebor  wrote:

On 10/19/2017 02:34 AM, Richard Biener wrote:


On Thu, Oct 19, 2017 at 1:19 AM, Martin Sebor  wrote:


On 10/18/2017 04:48 AM, Richard Biener wrote:



On Wed, Oct 18, 2017 at 5:34 AM, Martin Sebor  wrote:



While testing my latest -Wrestrict changes I noticed a number of
opportunities to improve the -Warray-bounds warning.  Attached
is a patch that implements a solution for the following subset
of these:

PR tree-optimization/82596 - missing -Warray-bounds on an out-of
  bounds index into string literal
PR tree-optimization/82588 - missing -Warray-bounds on an excessively
  large index
PR tree-optimization/82583 - missing -Warray-bounds on out-of-bounds
  inner indices




I meant to use size_type_node (size_t), not sizetype.  But
I just checked that ptrdiff_type_node is initialized in
build_common_tree_nodes and thus always available.



I see.  Using ptrdiff_type_node is preferable for the targets
where ptrdiff_t has a greater precision than size_t (e.g., VMS).
It makes sense now.  I should remember to change all the other
places where I introduced ssizetype to use ptrdiff_type_node.




As an aside, at some point I would like to get away from a type
based limit in all these warnings and instead use one that can
be controlled by an option so that a user can impose a lower limit
on the maximum size of an object and have all size-related warnings
(and perhaps even optimizations) enforce it and benefit from it.



You could add a --param that is initialized from ptrdiff_type_node.



Yes, that's an option to consider.  Thanks.





+  tree arg = TREE_OPERAND (ref, 0);
+  tree_code code = TREE_CODE (arg);
+  if (code == COMPONENT_REF)
+   {
+ HOST_WIDE_INT off;
+ if (tree base = get_addr_base_and_unit_offset (ref, ))
+   up_bound_p1 = fold_build2 (MINUS_EXPR, ssizetype,
up_bound_p1,
+  TYPE_SIZE_UNIT (TREE_TYPE
(base)));
+ else
+   return;

so this gives up on a.b[i].c.d[k] (ok, array_at_struct_end_p will be
false).
simply not subtracting anyhing instead of returning would be
conservatively
correct, no?  Likewise subtracting the offset of the array for all
"previous"
variably indexed components with assuming the lowest value for the
index.
But as above I think compensating for the offset of the array within the
object
is academic ... ;)




I was going to say yes (it gives up) but on second thought I don't
think it does.  Only the major index can be unbounded and the code
does consider the size of the sub-array when checking the major
index.  So, IIUC, I think this works correctly as is (*).  What
doesn't work is VLAs but those are a separate problem.  Let me
know if I misunderstood your question.



get_addr_base_and_unit_offset will return NULL if there's any variable
component in 'ref'.  So as written it seems to be dead code (you
want to pass 'arg'?)



Sorry, I'm not sure I understand what you mean.  What do you think
is dead code?  The call to get_addr_base_and_unit_offset() is also
made for an array of unspecified bound (up_bound is null) and for
an array at the end of a struct.  For those the function returns
non-null, and for the others (arrays of runtime bound) it returns
null.  (I passed arg instead of ref but I see no difference in
my tests.)


If you pass a.b.c[i] it will return NULL, if you pass a.b.c ('arg') it will
return the offset of 'c'.  If you pass a.b[j].c it will still return NULL.
You could use get_ref_base_and_extent which will return the offset
of a.b[0].c in this case and sets max_size != size - but you are only
interested in offset.  The disadvantage of get_ref_base_and_extent
is it returns offset in bits thus if the offset is too large for a HWI
you'll instead get offset == 0 and max_size == -1.

Thus I'm saying this is dead code for variable array accesses
(even for the array you are warning about).  Yes, for constant index
and at-struct-end you'll get sth, but the warning is in VRP because
of variable indexes.

So I suggest to pass 'arg' and use get_ref_base_and_extent
for some extra precision (and possible lossage for very very large
structures).


Computing bit offsets defeats the out-of-bounds flexible array
index detection because the computation overflows (the function
sets the offset to zero).  I'll go ahead with the first version
unless you have a different suggestion.

Thanks
Martin



Thus instead of

+  tree maxbound = TYPE_MAX_VALUE (ptrdiff_type_node);
+
+  up_bound_p1 = int_const_binop (TRUNC_DIV_EXPR, maxbound, eltsize);
+
+  tree arg = TREE_OPERAND (ref, 0);
+  tree_code code = TREE_CODE (arg);
+  if (code == COMPONENT_REF)
+   {
+ HOST_WIDE_INT off;
+ if (tree base = get_addr_base_and_unit_offset (arg, ))
+   {
+ tree size = TYPE_SIZE_UNIT (TREE_TYPE 

Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-10-20 Thread Richard Biener
On Fri, Oct 20, 2017 at 1:00 AM, Martin Sebor  wrote:
> On 10/19/2017 02:34 AM, Richard Biener wrote:
>>
>> On Thu, Oct 19, 2017 at 1:19 AM, Martin Sebor  wrote:
>>>
>>> On 10/18/2017 04:48 AM, Richard Biener wrote:


 On Wed, Oct 18, 2017 at 5:34 AM, Martin Sebor  wrote:
>
>
> While testing my latest -Wrestrict changes I noticed a number of
> opportunities to improve the -Warray-bounds warning.  Attached
> is a patch that implements a solution for the following subset
> of these:
>
> PR tree-optimization/82596 - missing -Warray-bounds on an out-of
>   bounds index into string literal
> PR tree-optimization/82588 - missing -Warray-bounds on an excessively
>   large index
> PR tree-optimization/82583 - missing -Warray-bounds on out-of-bounds
>   inner indices
>
>
>> I meant to use size_type_node (size_t), not sizetype.  But
>> I just checked that ptrdiff_type_node is initialized in
>> build_common_tree_nodes and thus always available.
>
>
> I see.  Using ptrdiff_type_node is preferable for the targets
> where ptrdiff_t has a greater precision than size_t (e.g., VMS).
> It makes sense now.  I should remember to change all the other
> places where I introduced ssizetype to use ptrdiff_type_node.
>
>>
>>> As an aside, at some point I would like to get away from a type
>>> based limit in all these warnings and instead use one that can
>>> be controlled by an option so that a user can impose a lower limit
>>> on the maximum size of an object and have all size-related warnings
>>> (and perhaps even optimizations) enforce it and benefit from it.
>>
>>
>> You could add a --param that is initialized from ptrdiff_type_node.
>
>
> Yes, that's an option to consider.  Thanks.
>
>
>>
 +  tree arg = TREE_OPERAND (ref, 0);
 +  tree_code code = TREE_CODE (arg);
 +  if (code == COMPONENT_REF)
 +   {
 + HOST_WIDE_INT off;
 + if (tree base = get_addr_base_and_unit_offset (ref, ))
 +   up_bound_p1 = fold_build2 (MINUS_EXPR, ssizetype,
 up_bound_p1,
 +  TYPE_SIZE_UNIT (TREE_TYPE
 (base)));
 + else
 +   return;

 so this gives up on a.b[i].c.d[k] (ok, array_at_struct_end_p will be
 false).
 simply not subtracting anyhing instead of returning would be
 conservatively
 correct, no?  Likewise subtracting the offset of the array for all
 "previous"
 variably indexed components with assuming the lowest value for the
 index.
 But as above I think compensating for the offset of the array within the
 object
 is academic ... ;)
>>>
>>>
>>>
>>> I was going to say yes (it gives up) but on second thought I don't
>>> think it does.  Only the major index can be unbounded and the code
>>> does consider the size of the sub-array when checking the major
>>> index.  So, IIUC, I think this works correctly as is (*).  What
>>> doesn't work is VLAs but those are a separate problem.  Let me
>>> know if I misunderstood your question.
>>
>>
>> get_addr_base_and_unit_offset will return NULL if there's any variable
>> component in 'ref'.  So as written it seems to be dead code (you
>> want to pass 'arg'?)
>
>
> Sorry, I'm not sure I understand what you mean.  What do you think
> is dead code?  The call to get_addr_base_and_unit_offset() is also
> made for an array of unspecified bound (up_bound is null) and for
> an array at the end of a struct.  For those the function returns
> non-null, and for the others (arrays of runtime bound) it returns
> null.  (I passed arg instead of ref but I see no difference in
> my tests.)

If you pass a.b.c[i] it will return NULL, if you pass a.b.c ('arg') it will
return the offset of 'c'.  If you pass a.b[j].c it will still return NULL.
You could use get_ref_base_and_extent which will return the offset
of a.b[0].c in this case and sets max_size != size - but you are only
interested in offset.  The disadvantage of get_ref_base_and_extent
is it returns offset in bits thus if the offset is too large for a HWI
you'll instead get offset == 0 and max_size == -1.

Thus I'm saying this is dead code for variable array accesses
(even for the array you are warning about).  Yes, for constant index
and at-struct-end you'll get sth, but the warning is in VRP because
of variable indexes.

So I suggest to pass 'arg' and use get_ref_base_and_extent
for some extra precision (and possible lossage for very very large
structures).

Thus instead of

+  tree maxbound = TYPE_MAX_VALUE (ptrdiff_type_node);
+
+  up_bound_p1 = int_const_binop (TRUNC_DIV_EXPR, maxbound, eltsize);
+
+  tree arg = TREE_OPERAND (ref, 0);
+  tree_code code = TREE_CODE (arg);
+  if (code == COMPONENT_REF)
+   {
+ HOST_WIDE_INT off;
+ if (tree base = get_addr_base_and_unit_offset (arg, ))
+   {
+ tree size = 

Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-10-19 Thread Martin Sebor

On 10/19/2017 02:34 AM, Richard Biener wrote:

On Thu, Oct 19, 2017 at 1:19 AM, Martin Sebor  wrote:

On 10/18/2017 04:48 AM, Richard Biener wrote:


On Wed, Oct 18, 2017 at 5:34 AM, Martin Sebor  wrote:


While testing my latest -Wrestrict changes I noticed a number of
opportunities to improve the -Warray-bounds warning.  Attached
is a patch that implements a solution for the following subset
of these:

PR tree-optimization/82596 - missing -Warray-bounds on an out-of
  bounds index into string literal
PR tree-optimization/82588 - missing -Warray-bounds on an excessively
  large index
PR tree-optimization/82583 - missing -Warray-bounds on out-of-bounds
  inner indices



I meant to use size_type_node (size_t), not sizetype.  But
I just checked that ptrdiff_type_node is initialized in
build_common_tree_nodes and thus always available.


I see.  Using ptrdiff_type_node is preferable for the targets
where ptrdiff_t has a greater precision than size_t (e.g., VMS).
It makes sense now.  I should remember to change all the other
places where I introduced ssizetype to use ptrdiff_type_node.




As an aside, at some point I would like to get away from a type
based limit in all these warnings and instead use one that can
be controlled by an option so that a user can impose a lower limit
on the maximum size of an object and have all size-related warnings
(and perhaps even optimizations) enforce it and benefit from it.


You could add a --param that is initialized from ptrdiff_type_node.


Yes, that's an option to consider.  Thanks.




+  tree arg = TREE_OPERAND (ref, 0);
+  tree_code code = TREE_CODE (arg);
+  if (code == COMPONENT_REF)
+   {
+ HOST_WIDE_INT off;
+ if (tree base = get_addr_base_and_unit_offset (ref, ))
+   up_bound_p1 = fold_build2 (MINUS_EXPR, ssizetype, up_bound_p1,
+  TYPE_SIZE_UNIT (TREE_TYPE (base)));
+ else
+   return;

so this gives up on a.b[i].c.d[k] (ok, array_at_struct_end_p will be
false).
simply not subtracting anyhing instead of returning would be
conservatively
correct, no?  Likewise subtracting the offset of the array for all
"previous"
variably indexed components with assuming the lowest value for the index.
But as above I think compensating for the offset of the array within the
object
is academic ... ;)



I was going to say yes (it gives up) but on second thought I don't
think it does.  Only the major index can be unbounded and the code
does consider the size of the sub-array when checking the major
index.  So, IIUC, I think this works correctly as is (*).  What
doesn't work is VLAs but those are a separate problem.  Let me
know if I misunderstood your question.


get_addr_base_and_unit_offset will return NULL if there's any variable
component in 'ref'.  So as written it seems to be dead code (you
want to pass 'arg'?)


Sorry, I'm not sure I understand what you mean.  What do you think
is dead code?  The call to get_addr_base_and_unit_offset() is also
made for an array of unspecified bound (up_bound is null) and for
an array at the end of a struct.  For those the function returns
non-null, and for the others (arrays of runtime bound) it returns
null.  (I passed arg instead of ref but I see no difference in
my tests.)



I was asking you to remove the 'else return' because w/o subtracting
the upper bound is just more conservative.


Sure, that sounds good.

Attached is an updated patch.

Thanks
Martin
PR tree-optimization/82588 - missing -Warray-bounds on a excessively large index
PR tree-optimization/82583 - missing -Warray-bounds on out-of-bounds inner indic

gcc/ChangeLog:
	PR tree-optimization/82588
	PR tree-optimization/82583
	* tree-vrp.c (check_array_ref): Handle flexible array members,
	string literals, and inner indices.
	(search_for_addr_array): Add detail to diagnostics.

gcc/testsuite/ChangeLog:
	PR tree-optimization/82588
	PR tree-optimization/82583	
	* c-c++-common/Warray-bounds.c: New test.
	* gcc.dg/Warray-bounds-11.c: Adjust.
	* gcc.dg/Warray-bounds-22.c: New test.

diff --git a/gcc/testsuite/c-c++-common/Warray-bounds.c b/gcc/testsuite/c-c++-common/Warray-bounds.c
new file mode 100644
index 000..2207999
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Warray-bounds.c
@@ -0,0 +1,240 @@
+/* PR tree-optimization/82588 - missing -Warray-bounds on an excessively
+   large index
+   { dg-do compile }
+   { dg-options "-O2 -Warray-bounds -ftrack-macro-expansion=0" }  */
+
+#define SIZE_MAX  __SIZE_MAX__
+#define SSIZE_MAX __PTRDIFF_MAX__
+#define SSIZE_MIN (-SSIZE_MAX - 1)
+
+typedef __PTRDIFF_TYPE__ ssize_t;
+typedef __SIZE_TYPE__size_t;
+
+extern ssize_t signed_value (void)
+{
+  extern volatile ssize_t signed_value_source;
+  return signed_value_source;
+}
+
+extern size_t unsigned_value (void)
+{
+  extern volatile size_t unsigned_value_source;
+  return unsigned_value_source;
+}
+
+ssize_t signed_range (ssize_t min, ssize_t max)
+{

Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-10-19 Thread Martin Sebor

Good question!  STRING_CST does have a domain.  The problem is
that array_at_struct_end_p() returns true for STRING_CST.  I've
added the handling to the function and removed the block above
from the latest patch.


Can you split out the STRING_CST handling and commit that separately
(split the testcase)?  That part looks ok.


I've committed r253902.  It turns out, however, that this subset
of the patch doesn't fix the whole problem.  What's still missing
is the handling of:

  int g (int i)
  {
return (i < 0 ? ABC : DEF)[7];   // missing -Warray-bounds
  }

Surprisingly, this happens to work in C++ but not in C (which
is in contrast to bug 82609 where it's the other way around).
The root cause is that the expression is represented as
a MEM_REF(ADDR_EXPR (STRING_CST)) and check_array_bounds()
only considers ARRAY_REF and ADDR_EXPR.

I will submit a separate patch for that, perhaps along with
one for bug 82612 that you commented on this morning (missing
-Warray-bounds on a non-zero offset from the address of a non-
array object).

Martin


Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-10-19 Thread Richard Biener
On Thu, Oct 19, 2017 at 1:19 AM, Martin Sebor  wrote:
> On 10/18/2017 04:48 AM, Richard Biener wrote:
>>
>> On Wed, Oct 18, 2017 at 5:34 AM, Martin Sebor  wrote:
>>>
>>> While testing my latest -Wrestrict changes I noticed a number of
>>> opportunities to improve the -Warray-bounds warning.  Attached
>>> is a patch that implements a solution for the following subset
>>> of these:
>>>
>>> PR tree-optimization/82596 - missing -Warray-bounds on an out-of
>>>   bounds index into string literal
>>> PR tree-optimization/82588 - missing -Warray-bounds on an excessively
>>>   large index
>>> PR tree-optimization/82583 - missing -Warray-bounds on out-of-bounds
>>>   inner indices
>>>
>>> The patch also adds more detail to the -Warray-bounds diagnostics
>>> to make it easier to see the cause of the problem.
>>>
>>> Richard, since the changes touch tree-vrp.c, I look in particular
>>> for your comments.
>>
>>
>> +  /* Accesses to trailing arrays via pointers may access storage
>> +beyond the types array bounds.  For such arrays, or for flexible
>> +array members as well as for other arrays of an unknown size,
>> +replace the upper bound with a more permissive one that assumes
>> +the size of the largest object is SSIZE_MAX.  */
>>
>> I believe handling those cases are somewhat academic, but ...
>
>
> Thanks for the quick review!
>
> I agree the SSIZE_MAX tests handle corner cases and there are
> arguably more important gaps here to plug (e.g., VLAs).  Then
> again, most bugs tend to lurk in corner cases of one kind or
> another and these seemed like a good way for me to come up to
> speed on the implementation before tackling those.  If you
> have suggestions for which to dive into next I'm happy to take
> them.
>
>> +  tree eltype = TREE_TYPE (ref);
>> +  tree eltsize = TYPE_SIZE_UNIT (eltype);
>>
>> needs to use array_ref_element_size.  Note that this size can be
>> non-constant in which case you now ICE so you have to check
>> this is an INTEGER_CST.
>
>
> Thanks.  I did reproduce a few ICEs due to VLAs.  I've fixed
> the problems and added tests for them.  One-dimensional VLAs
> are now handled the same way arrays of unknown bound are.
> Handling them more intelligently (i.e., taking into account
> the ranges their bounds are in) and especially handling
> multidimensional VLAs will take some effort.
>
>>
>> +  tree maxbound = TYPE_MAX_VALUE (ssizetype);
>>
>> please don't use ssizetype - sizetype can be of different precision
>> than pointers (and thus objects).  size_type_node would come
>> close (maps to size_t), eventually ptrdiff_type_node is also
>> defined by all frontends.
>
>
> Okay, I've changed it to sizetype (it would be nice if there
> were a cleaner way of doing it rather than by bit-twiddling.)
>
> ptrdiff_type would have been my first choice but it's only defined
> by the C/C++ front ends and not available in the middle-end, so
> the other warnings that consider object sizes deal with ssizetype
> (e.g., -Walloc-size-larger-than, -Wstringop- overflow, and
> -Wformat-overflow).
>
> That said, I'm not sure I understand under what conditions
> ssizetype is not the signed equivalent of sizetype.  Can you
> clarify?

I meant to use size_type_node (size_t), not sizetype.  But
I just checked that ptrdiff_type_node is initialized in
build_common_tree_nodes and thus always available.

> As an aside, at some point I would like to get away from a type
> based limit in all these warnings and instead use one that can
> be controlled by an option so that a user can impose a lower limit
> on the maximum size of an object and have all size-related warnings
> (and perhaps even optimizations) enforce it and benefit from it.

You could add a --param that is initialized from ptrdiff_type_node.

>> +  up_bound_p1 = fold_build2 (TRUNC_DIV_EXPR, ssizetype, maxbound,
>> eltsize);
>> +
>>
>> int_const_binop if you insist on using trees...
>
>
> Sure.  (I think offset_int would be more convenient but the rest
> of the function uses trees so I just stuck to those to avoid
> converting things back and forth or disrupting more of the code
> than I had to.)

Understood.

>>
>> +  tree arg = TREE_OPERAND (ref, 0);
>> +  tree_code code = TREE_CODE (arg);
>> +  if (code == COMPONENT_REF)
>> +   {
>> + HOST_WIDE_INT off;
>> + if (tree base = get_addr_base_and_unit_offset (ref, ))
>> +   up_bound_p1 = fold_build2 (MINUS_EXPR, ssizetype, up_bound_p1,
>> +  TYPE_SIZE_UNIT (TREE_TYPE (base)));
>> + else
>> +   return;
>>
>> so this gives up on a.b[i].c.d[k] (ok, array_at_struct_end_p will be
>> false).
>> simply not subtracting anyhing instead of returning would be
>> conservatively
>> correct, no?  Likewise subtracting the offset of the array for all
>> "previous"
>> variably indexed components with assuming the lowest value for the index.
>> But as above I 

Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-10-18 Thread Martin Sebor

On 10/18/2017 04:48 AM, Richard Biener wrote:

On Wed, Oct 18, 2017 at 5:34 AM, Martin Sebor  wrote:

While testing my latest -Wrestrict changes I noticed a number of
opportunities to improve the -Warray-bounds warning.  Attached
is a patch that implements a solution for the following subset
of these:

PR tree-optimization/82596 - missing -Warray-bounds on an out-of
  bounds index into string literal
PR tree-optimization/82588 - missing -Warray-bounds on an excessively
  large index
PR tree-optimization/82583 - missing -Warray-bounds on out-of-bounds
  inner indices

The patch also adds more detail to the -Warray-bounds diagnostics
to make it easier to see the cause of the problem.

Richard, since the changes touch tree-vrp.c, I look in particular
for your comments.


+  /* Accesses to trailing arrays via pointers may access storage
+beyond the types array bounds.  For such arrays, or for flexible
+array members as well as for other arrays of an unknown size,
+replace the upper bound with a more permissive one that assumes
+the size of the largest object is SSIZE_MAX.  */

I believe handling those cases are somewhat academic, but ...


Thanks for the quick review!

I agree the SSIZE_MAX tests handle corner cases and there are
arguably more important gaps here to plug (e.g., VLAs).  Then
again, most bugs tend to lurk in corner cases of one kind or
another and these seemed like a good way for me to come up to
speed on the implementation before tackling those.  If you
have suggestions for which to dive into next I'm happy to take
them.


+  tree eltype = TREE_TYPE (ref);
+  tree eltsize = TYPE_SIZE_UNIT (eltype);

needs to use array_ref_element_size.  Note that this size can be
non-constant in which case you now ICE so you have to check
this is an INTEGER_CST.


Thanks.  I did reproduce a few ICEs due to VLAs.  I've fixed
the problems and added tests for them.  One-dimensional VLAs
are now handled the same way arrays of unknown bound are.
Handling them more intelligently (i.e., taking into account
the ranges their bounds are in) and especially handling
multidimensional VLAs will take some effort.



+  tree maxbound = TYPE_MAX_VALUE (ssizetype);

please don't use ssizetype - sizetype can be of different precision
than pointers (and thus objects).  size_type_node would come
close (maps to size_t), eventually ptrdiff_type_node is also
defined by all frontends.


Okay, I've changed it to sizetype (it would be nice if there
were a cleaner way of doing it rather than by bit-twiddling.)

ptrdiff_type would have been my first choice but it's only defined
by the C/C++ front ends and not available in the middle-end, so
the other warnings that consider object sizes deal with ssizetype
(e.g., -Walloc-size-larger-than, -Wstringop- overflow, and
-Wformat-overflow).

That said, I'm not sure I understand under what conditions
ssizetype is not the signed equivalent of sizetype.  Can you
clarify?

As an aside, at some point I would like to get away from a type
based limit in all these warnings and instead use one that can
be controlled by an option so that a user can impose a lower limit
on the maximum size of an object and have all size-related warnings
(and perhaps even optimizations) enforce it and benefit from it.


+  up_bound_p1 = fold_build2 (TRUNC_DIV_EXPR, ssizetype, maxbound, eltsize);
+

int_const_binop if you insist on using trees...


Sure.  (I think offset_int would be more convenient but the rest
of the function uses trees so I just stuck to those to avoid
converting things back and forth or disrupting more of the code
than I had to.)



+  tree arg = TREE_OPERAND (ref, 0);
+  tree_code code = TREE_CODE (arg);
+  if (code == COMPONENT_REF)
+   {
+ HOST_WIDE_INT off;
+ if (tree base = get_addr_base_and_unit_offset (ref, ))
+   up_bound_p1 = fold_build2 (MINUS_EXPR, ssizetype, up_bound_p1,
+  TYPE_SIZE_UNIT (TREE_TYPE (base)));
+ else
+   return;

so this gives up on a.b[i].c.d[k] (ok, array_at_struct_end_p will be false).
simply not subtracting anyhing instead of returning would be conservatively
correct, no?  Likewise subtracting the offset of the array for all "previous"
variably indexed components with assuming the lowest value for the index.
But as above I think compensating for the offset of the array within the object
is academic ... ;)


I was going to say yes (it gives up) but on second thought I don't
think it does.  Only the major index can be unbounded and the code
does consider the size of the sub-array when checking the major
index.  So, IIUC, I think this works correctly as is (*).  What
doesn't work is VLAs but those are a separate problem.  Let me
know if I misunderstood your question.


+  else if (code == STRING_CST)
+   up_bound_p1 = build_int_cst (ssizetype, TREE_STRING_LENGTH (arg));

that one is more interesting -- why's 

Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-10-18 Thread Richard Biener
On Wed, Oct 18, 2017 at 5:34 AM, Martin Sebor  wrote:
> While testing my latest -Wrestrict changes I noticed a number of
> opportunities to improve the -Warray-bounds warning.  Attached
> is a patch that implements a solution for the following subset
> of these:
>
> PR tree-optimization/82596 - missing -Warray-bounds on an out-of
>   bounds index into string literal
> PR tree-optimization/82588 - missing -Warray-bounds on an excessively
>   large index
> PR tree-optimization/82583 - missing -Warray-bounds on out-of-bounds
>   inner indices
>
> The patch also adds more detail to the -Warray-bounds diagnostics
> to make it easier to see the cause of the problem.
>
> Richard, since the changes touch tree-vrp.c, I look in particular
> for your comments.

+  /* Accesses to trailing arrays via pointers may access storage
+beyond the types array bounds.  For such arrays, or for flexible
+array members as well as for other arrays of an unknown size,
+replace the upper bound with a more permissive one that assumes
+the size of the largest object is SSIZE_MAX.  */

I believe handling those cases are somewhat academic, but ...

+  tree eltype = TREE_TYPE (ref);
+  tree eltsize = TYPE_SIZE_UNIT (eltype);

needs to use array_ref_element_size.  Note that this size can be
non-constant in which case you now ICE so you have to check
this is an INTEGER_CST.

+  tree maxbound = TYPE_MAX_VALUE (ssizetype);

please don't use ssizetype - sizetype can be of different precision
than pointers (and thus objects).  size_type_node would come
close (maps to size_t), eventually ptrdiff_type_node is also
defined by all frontends.

+  up_bound_p1 = fold_build2 (TRUNC_DIV_EXPR, ssizetype, maxbound, eltsize);
+

int_const_binop if you insist on using trees...

+  tree arg = TREE_OPERAND (ref, 0);
+  tree_code code = TREE_CODE (arg);
+  if (code == COMPONENT_REF)
+   {
+ HOST_WIDE_INT off;
+ if (tree base = get_addr_base_and_unit_offset (ref, ))
+   up_bound_p1 = fold_build2 (MINUS_EXPR, ssizetype, up_bound_p1,
+  TYPE_SIZE_UNIT (TREE_TYPE (base)));
+ else
+   return;

so this gives up on a.b[i].c.d[k] (ok, array_at_struct_end_p will be false).
simply not subtracting anyhing instead of returning would be conservatively
correct, no?  Likewise subtracting the offset of the array for all "previous"
variably indexed components with assuming the lowest value for the index.
But as above I think compensating for the offset of the array within the object
is academic ... ;)

+  else if (code == STRING_CST)
+   up_bound_p1 = build_int_cst (ssizetype, TREE_STRING_LENGTH (arg));

that one is more interesting -- why's the TYPE_DOMAIN of the STRING_CST lacking
a max value?  Iff we use build_string_literal it should have the proper type.

Thanks,
Richard.

>
> Thanks
> Martin


[PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-10-17 Thread Martin Sebor

While testing my latest -Wrestrict changes I noticed a number of
opportunities to improve the -Warray-bounds warning.  Attached
is a patch that implements a solution for the following subset
of these:

PR tree-optimization/82596 - missing -Warray-bounds on an out-of
  bounds index into string literal
PR tree-optimization/82588 - missing -Warray-bounds on an excessively
  large index
PR tree-optimization/82583 - missing -Warray-bounds on out-of-bounds
  inner indices

The patch also adds more detail to the -Warray-bounds diagnostics
to make it easier to see the cause of the problem.

Richard, since the changes touch tree-vrp.c, I look in particular
for your comments.

Thanks
Martin
PR tree-optimization/82596 - missing -Warray-bounds on an out-of-bounds index into string literal
PR tree-optimization/82588 - missing -Warray-bounds on a excessively large index
PR tree-optimization/82583 - missing -Warray-bounds on out-of-bounds inner indic

gcc/ChangeLog:
	PR tree-optimization/82596
	PR tree-optimization/82588
	PR tree-optimization/82583	
	* tree-vrp.c (check_array_ref): Handle flexible array members,
	string literals, and inner indices.
	(search_for_addr_array): Add detail to diagnostics.

gcc/testsuite/ChangeLog:

	PR tree-optimization/82596
	PR tree-optimization/82588
	PR tree-optimization/82583	
	* c-c++-common/Warray-bounds.c: New test.
	* gcc.dg/Warray-bounds-11.c: Adjust.

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 2c86b8e..88cce15 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -42,6 +42,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-iterator.h"
 #include "gimple-walk.h"
 #include "tree-cfg.h"
+#include "tree-dfa.h"
 #include "tree-ssa-loop-manip.h"
 #include "tree-ssa-loop-niter.h"
 #include "tree-ssa-loop.h"
@@ -64,6 +65,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-cfgcleanup.h"
 #include "stringpool.h"
 #include "attribs.h"
+#include "builtins.h"
 
 #define VR_INITIALIZER { VR_UNDEFINED, NULL_TREE, NULL_TREE, NULL }
 
@@ -6675,26 +6677,51 @@ check_array_ref (location_t location, tree ref, bool ignore_off_by_one)
   low_sub = up_sub = TREE_OPERAND (ref, 1);
   up_bound = array_ref_up_bound (ref);
 
-  /* Can not check flexible arrays.  */
   if (!up_bound
-  || TREE_CODE (up_bound) != INTEGER_CST)
-return;
+  || (warn_array_bounds < 2
+	  && array_at_struct_end_p (ref)))
+{
+  /* Accesses to trailing arrays via pointers may access storage
+	 beyond the types array bounds.  For such arrays, or for flexible
+	 array members as well as for other arrays of an unknown size,
+	 replace the upper bound with a more permissive one that assumes
+	 the size of the largest object is SSIZE_MAX.  */
+  tree eltype = TREE_TYPE (ref);
+  tree eltsize = TYPE_SIZE_UNIT (eltype);
+  tree maxbound = TYPE_MAX_VALUE (ssizetype);
+  up_bound_p1 = fold_build2 (TRUNC_DIV_EXPR, ssizetype, maxbound, eltsize);
+
+  tree arg = TREE_OPERAND (ref, 0);
+  tree_code code = TREE_CODE (arg);
+  if (code == COMPONENT_REF)
+	{
+	  HOST_WIDE_INT off;
+	  if (tree base = get_addr_base_and_unit_offset (ref, ))
+	up_bound_p1 = fold_build2 (MINUS_EXPR, ssizetype, up_bound_p1,
+   TYPE_SIZE_UNIT (TREE_TYPE (base)));
+	  else
+	return;
+	}
+  else if (code == STRING_CST)
+	up_bound_p1 = build_int_cst (ssizetype, TREE_STRING_LENGTH (arg));
 
-  /* Accesses to trailing arrays via pointers may access storage
- beyond the types array bounds.  */
-  if (warn_array_bounds < 2
-  && array_at_struct_end_p (ref))
-return;
+  up_bound = int_const_binop (MINUS_EXPR, up_bound_p1,
+  build_int_cst (ssizetype, 1));
+}
+  else
+up_bound_p1 = int_const_binop (PLUS_EXPR, up_bound,
+   build_int_cst (TREE_TYPE (up_bound), 1));
 
   low_bound = array_ref_low_bound (ref);
-  up_bound_p1 = int_const_binop (PLUS_EXPR, up_bound,
- build_int_cst (TREE_TYPE (up_bound), 1));
+
+  tree artype = TREE_TYPE (TREE_OPERAND (ref, 0));
 
   /* Empty array.  */
   if (tree_int_cst_equal (low_bound, up_bound_p1))
 {
   warning_at (location, OPT_Warray_bounds,
-		  "array subscript is above array bounds");
+		  "array subscript %E is above array bounds of %qT",
+		  low_bound, artype);
   TREE_NO_WARNING (ref) = 1;
 }
 
@@ -6718,7 +6745,8 @@ check_array_ref (location_t location, tree ref, bool ignore_off_by_one)
   && tree_int_cst_le (low_sub, low_bound))
 {
   warning_at (location, OPT_Warray_bounds,
-		  "array subscript is outside array bounds");
+		  "array subscript [%E, %E] is outside array bounds of %qT",
+		  low_sub, up_sub, artype);
   TREE_NO_WARNING (ref) = 1;
 }
 }
@@ -6734,7 +6762,8 @@ check_array_ref (location_t location, tree ref, bool ignore_off_by_one)
 	  fprintf (dump_file, "\n");
 	}
   warning_at (location, OPT_Warray_bounds,
-		  "array subscript is above array bounds");
+		  "array subscript %E is above array