Re: [PATCH] analyzer: consider empty ranges and zero byte accesses [PR106845]

2022-09-11 Thread Tim Lange
> ...it took me a moment to realize that the analyzer "sees" that this is
> "main", and thus buf_size is 0.
> 
> Interestingly, if I rename it to not be "main" (and thus buf_size could
> be non-zero), we still don't complain:
>   https://godbolt.org/z/PezfTo9Mz
> Presumably this is a known limitation of the symbolic bounds checking?

Yeah. I do only try structural equality for binaryop_svalues.  The
example does result in a call to eval_condition_without_cm with two
  unaryop_svalue(NOP_EXPR, initial_svalue ('buf_size'))
that have different types ('unsigned int' and 'sizetype').  Thus,
lhs == rhs is false and eval_condition_without_cm does return UNKNOWN.

Changing the type of buf_size to size_t removes the UNARYOP wrapping and
thus, emits a warning: https://godbolt.org/z/4sh7TM4v1 [0]

Otherwise, we could also do a call to structural_equality for
unaryop_svalue inside eval_condition_without_cm and ignore a type
mismatch for unaryop_svalues.  That way, the analyzer would complain about
your example.  Not 100% sure but I think it is okay to ignore the type
here for unaryop_svalues as long as the leafs match up.  If you agree,
I can prepare a patch [1].

[0] I've seen you pushed a patch that displays the capacity as a new
event at region_creation.  My patches did that by overwriting whats
printed using describe_region_creation_event.  Should I remove all
those now unneccessary describe_region_creation_event overloads?
[1] Below is how that would probably look like.

---
 gcc/analyzer/region-model.cc | 22 +++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 82006405373..4a9f0ff1e86 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -4190,6 +4190,24 @@ region_model::eval_condition_without_cm (const svalue 
*lhs,
}
 }
 
+  if (lhs->get_kind () == SK_UNARYOP)
+{
+  switch (op)
+   {
+   default:
+ break;
+   case EQ_EXPR:
+   case LE_EXPR:
+   case GE_EXPR:
+ {
+   tristate res = structural_equality (lhs, rhs);
+   if (res.is_true ())
+ return res;
+ }
+ break;
+   }
+}
+
   return tristate::TS_UNKNOWN;
 }
 
@@ -4307,9 +4325,7 @@ region_model::structural_equality (const svalue *a, const 
svalue *b) const
   {
const unaryop_svalue *un_a = as_a  (a);
if (const unaryop_svalue *un_b = dyn_cast  (b))
- return tristate (pending_diagnostic::same_tree_p (un_a->get_type (),
-   un_b->get_type ())
-  && un_a->get_op () == un_b->get_op ()
+ return tristate (un_a->get_op () == un_b->get_op ()
   && structural_equality (un_a->get_arg (),
   un_b->get_arg ()));
   }
-- 
2.37.3



Re: [PATCH] analyzer: consider empty ranges and zero byte accesses [PR106845]

2022-09-11 Thread David Malcolm via Gcc-patches
On Sun, 2022-09-11 at 10:21 +0200, Bernhard Reutner-Fischer wrote:
> On 11 September 2022 10:04:51 CEST, David Malcolm via Gcc-patches
>  wrote:
> 
> > > +++ b/gcc/testsuite/gcc.dg/analyzer/pr106845.c
> > > @@ -0,0 +1,11 @@
> > > +int buf_size;
> > > +
> > > +int
> > > +main (void)
> > > +{
> > > +  char buf[buf_size];
> > > +
> > > +  __builtin_memset (&buf[1], 0, buf_size);
> > > +
> > > +  return 0;
> > > +}
> > 
> > ...it took me a moment to realize that the analyzer "sees" that
> > this is
> > "main", and thus buf_size is 0.
> 
> Is this a valid assumption?

Not always, but often.

I suppose we could add an option for this.

> 
> What if I have a lib (preloaded maybe) that sets it to 42?

...or, say, a C++ ctor for a global object that runs before main, that
has side-effects (see e.g. PR analyzer/97115).

> 
> BTW, do we handle -Wl,-init,youre_toast
> where main isn't the entry point?

The analyzer currently has no knowledge of this; it blithely assumes
that no code runs before "main".  It also doesn't report about "leaks"
that happen when returning from main, whereas in theory someone could,
say, implement the guts of their program in an atexit handler.

I'm making assumptions in order to try to be more useful for the common
cases, potentially at the expense of the less common ones.

I'm not particularly familiar with the pre-main startup and post-main
shutdown of a process; feel free to file a bug if you want -fanalyzer
to be able to handle this kind of thing (links to pertinent docs would
be helpful!)

Thanks
Dave

> 
> Just curious..
> thanks,
> 
> > 
> > Interestingly, if I rename it to not be "main" (and thus buf_size
> > could
> > be non-zero), we still don't complain:
> >  https://godbolt.org/z/PezfTo9Mz
> > Presumably this is a known limitation of the symbolic bounds
> > checking?
> > 
> > Thanks
> > Dave
> > 
> 



Re: [PATCH] analyzer: consider empty ranges and zero byte accesses [PR106845]

2022-09-11 Thread Bernhard Reutner-Fischer via Gcc-patches
On 11 September 2022 10:04:51 CEST, David Malcolm via Gcc-patches 
 wrote:

>> +++ b/gcc/testsuite/gcc.dg/analyzer/pr106845.c
>> @@ -0,0 +1,11 @@
>> +int buf_size;
>> +
>> +int
>> +main (void)
>> +{
>> +  char buf[buf_size];
>> +
>> +  __builtin_memset (&buf[1], 0, buf_size);
>> +
>> +  return 0;
>> +}
>
>...it took me a moment to realize that the analyzer "sees" that this is
>"main", and thus buf_size is 0.

Is this a valid assumption?

What if I have a lib (preloaded maybe) that sets it to 42?

BTW, do we handle -Wl,-init,youre_toast
where main isn't the entry point?

Just curious..
thanks,

>
>Interestingly, if I rename it to not be "main" (and thus buf_size could
>be non-zero), we still don't complain:
>  https://godbolt.org/z/PezfTo9Mz
>Presumably this is a known limitation of the symbolic bounds checking?
>
>Thanks
>Dave
>



Re: [PATCH] analyzer: consider empty ranges and zero byte accesses [PR106845]

2022-09-11 Thread David Malcolm via Gcc-patches
On Sun, 2022-09-11 at 00:19 +0200, Tim Lange wrote:
> Hi,
> 
> see my patch below for a fix of pr106845.  I decided to allow
> bit_ranges
> and byte_ranges to have a size of zero and rather only add an
> assertion
> to the functions that assume a non-zero size.  That way is more
> elegant in
> the caller than restricting byte_range to only represent non-empty
> ranges.

Agreed.

> 
> - Tim
> 
> This patch adds handling of empty ranges in bit_range and byte_range
> and
> adds an assertion to member functions that assume a positive size.
> Further, the patch fixes an ICE caused by an empty byte_range passed
> to
> byte_range::exceeds_p.
> 
> Regression-tested on Linux x86_64.
> 

Thanks - the patch is OK for trunk, though...

[...snip...]
> 

> +++ b/gcc/testsuite/gcc.dg/analyzer/pr106845.c
> @@ -0,0 +1,11 @@
> +int buf_size;
> +
> +int
> +main (void)
> +{
> +  char buf[buf_size];
> +
> +  __builtin_memset (&buf[1], 0, buf_size);
> +
> +  return 0;
> +}

...it took me a moment to realize that the analyzer "sees" that this is
"main", and thus buf_size is 0.

Interestingly, if I rename it to not be "main" (and thus buf_size could
be non-zero), we still don't complain:
  https://godbolt.org/z/PezfTo9Mz
Presumably this is a known limitation of the symbolic bounds checking?

Thanks
Dave



[PATCH] analyzer: consider empty ranges and zero byte accesses [PR106845]

2022-09-10 Thread Tim Lange
Hi,

see my patch below for a fix of pr106845.  I decided to allow bit_ranges
and byte_ranges to have a size of zero and rather only add an assertion
to the functions that assume a non-zero size.  That way is more elegant in
the caller than restricting byte_range to only represent non-empty ranges.

- Tim

This patch adds handling of empty ranges in bit_range and byte_range and
adds an assertion to member functions that assume a positive size.
Further, the patch fixes an ICE caused by an empty byte_range passed to
byte_range::exceeds_p.

Regression-tested on Linux x86_64.

2022-09-10  Tim Lange  

gcc/analyzer/ChangeLog:

PR analyzer/106845
* region-model.cc (region_model::check_region_bounds):
Bail out if 0 bytes were accessed.
* store.cc (byte_range::dump_to_pp):
Add special case for empty ranges.
(byte_range::exceeds_p): Restrict to non-empty ranges.
(byte_range::falls_short_of_p): Restrict to non-empty ranges.
* store.h (bit_range::empty_p): New function.
(bit_range::get_last_byte_offset): Restrict to non-empty ranges.
(byte_range::empty_p): New function.
(byte_range::get_last_byte_offset): Restrict to non-empty ranges.

gcc/testsuite/ChangeLog:

PR analyzer/106845
* gcc.dg/analyzer/out-of-bounds-zero.c: New test.
* gcc.dg/analyzer/pr106845.c: New test.

---
 gcc/analyzer/region-model.cc  |  3 +
 gcc/analyzer/store.cc | 12 +++-
 gcc/analyzer/store.h  | 12 
 .../gcc.dg/analyzer/out-of-bounds-zero.c  | 67 +++
 gcc/testsuite/gcc.dg/analyzer/pr106845.c  | 11 +++
 5 files changed, 103 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-zero.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr106845.c

diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index d321e5b6479..82006405373 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -1826,6 +1826,9 @@ region_model::check_region_bounds (const region *reg,
   /* Find out how many bytes were accessed.  */
   const svalue *num_bytes_sval = reg->get_byte_size_sval (m_mgr);
   tree num_bytes_tree = maybe_get_integer_cst_tree (num_bytes_sval);
+  /* Bail out if 0 bytes are accessed.  */
+  if (num_bytes_tree && zerop (num_bytes_tree))
+return;
 
   /* Get the capacity of the buffer.  */
   const svalue *capacity = get_capacity (base_reg);
diff --git a/gcc/analyzer/store.cc b/gcc/analyzer/store.cc
index ec5232cb055..1857d95f0b6 100644
--- a/gcc/analyzer/store.cc
+++ b/gcc/analyzer/store.cc
@@ -380,7 +380,11 @@ bit_range::as_byte_range (byte_range *out) const
 void
 byte_range::dump_to_pp (pretty_printer *pp) const
 {
-  if (m_size_in_bytes == 1)
+  if (m_size_in_bytes == 0)
+{
+  pp_string (pp, "empty");
+}
+  else if (m_size_in_bytes == 1)
 {
   pp_string (pp, "byte ");
   pp_wide_int (pp, m_start_byte_offset, SIGNED);
@@ -455,7 +459,9 @@ bool
 byte_range::exceeds_p (const byte_range &other,
   byte_range *out_overhanging_byte_range) const
 {
-  if (other.get_last_byte_offset () < get_last_byte_offset ())
+  gcc_assert (!empty_p ());
+
+  if (other.get_next_byte_offset () < get_next_byte_offset ())
 {
   /* THIS definitely exceeds OTHER.  */
   byte_offset_t start = MAX (get_start_byte_offset (),
@@ -477,6 +483,8 @@ bool
 byte_range::falls_short_of_p (byte_offset_t offset,
  byte_range *out_fall_short_bytes) const
 {
+  gcc_assert (!empty_p ());
+
   if (get_start_byte_offset () < offset)
 {
   /* THIS falls short of OFFSET.  */
diff --git a/gcc/analyzer/store.h b/gcc/analyzer/store.h
index ac8b6853f4b..d172ee756c8 100644
--- a/gcc/analyzer/store.h
+++ b/gcc/analyzer/store.h
@@ -237,6 +237,11 @@ struct bit_range
   void dump_to_pp (pretty_printer *pp) const;
   void dump () const;
 
+  bool empty_p () const
+  {
+return m_size_in_bits == 0;
+  }
+
   bit_offset_t get_start_bit_offset () const
   {
 return m_start_bit_offset;
@@ -247,6 +252,7 @@ struct bit_range
   }
   bit_offset_t get_last_bit_offset () const
   {
+gcc_assert (!empty_p ());
 return get_next_bit_offset () - 1;
   }
 
@@ -297,6 +303,11 @@ struct byte_range
   void dump_to_pp (pretty_printer *pp) const;
   void dump () const;
 
+  bool empty_p () const
+  {
+return m_size_in_bytes == 0;
+  }
+
   bool contains_p (byte_offset_t offset) const
   {
 return (offset >= get_start_byte_offset ()
@@ -329,6 +340,7 @@ struct byte_range
   }
   byte_offset_t get_last_byte_offset () const
   {
+gcc_assert (!empty_p ());
 return m_start_byte_offset + m_size_in_bytes - 1;
   }
 
diff --git a/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-zero.c 
b/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-zero.c
new file mode 100644
index 000..201ca00ebdb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/