Re: [PATCH] analyzer: Add support of placement new and improved operator new [PR105948,PR94355]
Hi Benjanmin, On Fri, 1 Sept 2023 at 17:45, David Malcolm via Gcc-patches < gcc-patches@gcc.gnu.org> wrote: > On Fri, 2023-09-01 at 16:48 +0200, Benjamin Priour wrote: > > Patch has been updated as per your suggestions and successfully > > regstrapped > > on x86_64-linux-gnu. > > The new testcase placement-new-size.C fails on aarch64: placement-new-size.C:10:3: error: 'int8_t' was not declared in this scope; did you mean 'wint_t'? placement-new-size.C:11:3: error: 'int64_t' was not declared in this scope placement-new-size.C:11:12: error: 'lp' was not declared in this scope placement-new-size.C:11:23: error: 's' was not declared in this scope placement-new-size.C:11:26: error: 'int64_t' does not name a type placement-new-size.C:34:3: error: 'int32_t' was not declared in this scope placement-new-size.C:34:12: error: 'i' was not declared in this scope placement-new-size.C:34:30: error: 'int32_t' does not name a type I suspect you should include (instead of stdlib.h) Thanks, Christophe > > call_details::maybe_get_arg_region is now > > /* If argument IDX's svalue at the callsite is of pointer type, > > return the region it points to. > > Otherwise return NULL. */ > > > > const region * > > call_details::deref_ptr_arg (unsigned idx) const > > { > >const svalue *ptr_sval = get_arg_svalue (idx); > >return m_model->deref_rvalue (ptr_sval, get_arg_tree (idx), > > m_ctxt); > > } > > > > > > New test is > > > > + > > +void test_binop () > > +{ > > + char *p = (char *) malloc (4); > > + if (!p) > > +return; > > + int32_t *i = ::new (p + 1) int32_t; /* { dg-warning "heap-based > > buffer > > overflow" } */ > > + *i = 42; /* { dg-warning "heap-based buffer overflow" } */ > > + free (p); > > +} > > > > Is it OK for trunk ? > > I didn't resend the whole patch as it otherwise was OK. > > Yes, thanks. > > Dave > >
Re: [PATCH] analyzer: Add support of placement new and improved operator new [PR105948,PR94355]
On Fri, 2023-09-01 at 16:48 +0200, Benjamin Priour wrote: > Patch has been updated as per your suggestions and successfully > regstrapped > on x86_64-linux-gnu. > > call_details::maybe_get_arg_region is now > /* If argument IDX's svalue at the callsite is of pointer type, > return the region it points to. > Otherwise return NULL. */ > > const region * > call_details::deref_ptr_arg (unsigned idx) const > { > const svalue *ptr_sval = get_arg_svalue (idx); > return m_model->deref_rvalue (ptr_sval, get_arg_tree (idx), > m_ctxt); > } > > > New test is > > + > +void test_binop () > +{ > + char *p = (char *) malloc (4); > + if (!p) > + return; > + int32_t *i = ::new (p + 1) int32_t; /* { dg-warning "heap-based > buffer > overflow" } */ > + *i = 42; /* { dg-warning "heap-based buffer overflow" } */ > + free (p); > +} > > Is it OK for trunk ? > I didn't resend the whole patch as it otherwise was OK. Yes, thanks. Dave
Re: [PATCH] analyzer: Add support of placement new and improved operator new [PR105948,PR94355]
Patch has been updated as per your suggestions and successfully regstrapped on x86_64-linux-gnu. call_details::maybe_get_arg_region is now /* If argument IDX's svalue at the callsite is of pointer type, return the region it points to. Otherwise return NULL. */ const region * call_details::deref_ptr_arg (unsigned idx) const { const svalue *ptr_sval = get_arg_svalue (idx); return m_model->deref_rvalue (ptr_sval, get_arg_tree (idx), m_ctxt); } New test is + +void test_binop () +{ + char *p = (char *) malloc (4); + if (!p) +return; + int32_t *i = ::new (p + 1) int32_t; /* { dg-warning "heap-based buffer overflow" } */ + *i = 42; /* { dg-warning "heap-based buffer overflow" } */ + free (p); +} Is it OK for trunk ? I didn't resend the whole patch as it otherwise was OK. Thanks, Benjamin. On Fri, Sep 1, 2023 at 12:07 PM Benjamin Priour wrote: > Hi David, > > On Fri, Sep 1, 2023 at 1:59 AM David Malcolm wrote: > >> On Fri, 2023-09-01 at 00:04 +0200, priour...@gmail.com wrote: >> >> > [..snip..] > > >> ...which will only fire if arg 1 is a region_svalue. This won't >> trigger if you have e.g. a binop_svalue for pointer arithmetic. >> >> What happens e.g. for this one-off-the-end bug: >> >> void *p = malloc (4); >> if (!p) >> return; >> int32_t *i = ::new (p + 1) int32_t; >> *i = 42; >> >> So maybe call_details::maybe_get_arg_region should instead be: >> >> /* Return the region that argument IDX points to. */ >> >> const region * >> call_details::deref_ptr_arg (unsigned idx) const >> { >> const svalue *ptr_sval = get_arg_svalue (idx); >> return m_model->deref_rvalue (ptr_sval, get_arg_tree (idx), m_ctxt); >> } >> >> (caveat: I didn't test this) >> >> > + const region *base_reg = ptr_reg->get_base_region (); >> > + const svalue *num_bytes_sval = cd.get_arg_svalue (0); >> > + const region *sized_new_reg >> > + = mgr->get_sized_region (base_reg, >> > + cd.get_lhs_type (), >> > + num_bytes_sval); >> >> Why do you use the base_reg here, rather than just ptr_reg? >> >> In the example above, the *(p + 1) has base region >> heap_allocated_region, but the ptr_reg is one byte higher; hence >> check_region_for_write of 4 bytes ought to detect a problem with >> writing 4 bytes to *(p + 1), but wouldn't complain about the write to >> *p. >> >> ...assuming that I'm reading this code correctly. >> >> > + model->check_region_for_write (sized_new_reg, >> > +nullptr, >> > +ctxt); >> > + const svalue *ptr_sval >> > + = mgr->get_ptr_svalue (cd.get_lhs_type (), sized_new_reg); >> > + cd.maybe_set_lhs (ptr_sval); >> > + } >> > + } >> >> [...snip...] >> >> The patch is OK for trunk as is; but please can you look into the >> above. >> >> > Thanks for the test case David, it exposed a missing heap-based over write > when on the placement new statement. > I've updated the code as per your suggestions, and it now works properly. > > >> If the above is a problem, you can either do another version of the >> patch, or do it as a followup patch (whichever you're more comfortable >> with, but it might be best to get the patch into trunk as-is, given >> that the GSoC period is nearly over). >> >> Thanks >> Dave >> >> > I will update the patch and regstrap it, so that it is done at once. > I've compared the new test case to a "C" version of it, resulting in > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111266 > > I will attempt to fix it while I'm regstrapping everything else, > I still have 4 patches in queue. > It will give me a brief break from transitioning the tests :) > > Thanks for the review, > Benjamin. >
Re: [PATCH] analyzer: Add support of placement new and improved operator new [PR105948,PR94355]
Hi David, On Fri, Sep 1, 2023 at 1:59 AM David Malcolm wrote: > On Fri, 2023-09-01 at 00:04 +0200, priour...@gmail.com wrote: > > [..snip..] > ...which will only fire if arg 1 is a region_svalue. This won't > trigger if you have e.g. a binop_svalue for pointer arithmetic. > > What happens e.g. for this one-off-the-end bug: > > void *p = malloc (4); > if (!p) > return; > int32_t *i = ::new (p + 1) int32_t; > *i = 42; > > So maybe call_details::maybe_get_arg_region should instead be: > > /* Return the region that argument IDX points to. */ > > const region * > call_details::deref_ptr_arg (unsigned idx) const > { > const svalue *ptr_sval = get_arg_svalue (idx); > return m_model->deref_rvalue (ptr_sval, get_arg_tree (idx), m_ctxt); > } > > (caveat: I didn't test this) > > > + const region *base_reg = ptr_reg->get_base_region (); > > + const svalue *num_bytes_sval = cd.get_arg_svalue (0); > > + const region *sized_new_reg > > + = mgr->get_sized_region (base_reg, > > + cd.get_lhs_type (), > > + num_bytes_sval); > > Why do you use the base_reg here, rather than just ptr_reg? > > In the example above, the *(p + 1) has base region > heap_allocated_region, but the ptr_reg is one byte higher; hence > check_region_for_write of 4 bytes ought to detect a problem with > writing 4 bytes to *(p + 1), but wouldn't complain about the write to > *p. > > ...assuming that I'm reading this code correctly. > > > + model->check_region_for_write (sized_new_reg, > > +nullptr, > > +ctxt); > > + const svalue *ptr_sval > > + = mgr->get_ptr_svalue (cd.get_lhs_type (), sized_new_reg); > > + cd.maybe_set_lhs (ptr_sval); > > + } > > + } > > [...snip...] > > The patch is OK for trunk as is; but please can you look into the > above. > > Thanks for the test case David, it exposed a missing heap-based over write when on the placement new statement. I've updated the code as per your suggestions, and it now works properly. > If the above is a problem, you can either do another version of the > patch, or do it as a followup patch (whichever you're more comfortable > with, but it might be best to get the patch into trunk as-is, given > that the GSoC period is nearly over). > > Thanks > Dave > > I will update the patch and regstrap it, so that it is done at once. I've compared the new test case to a "C" version of it, resulting in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111266 I will attempt to fix it while I'm regstrapping everything else, I still have 4 patches in queue. It will give me a brief break from transitioning the tests :) Thanks for the review, Benjamin.
Re: [PATCH] analyzer: Add support of placement new and improved operator new [PR105948,PR94355]
On Fri, 2023-09-01 at 00:04 +0200, priour...@gmail.com wrote: > Hi, > > Succesfully regstrapped off trunk 7f2ed06ddc825e8a4e0edfd1d66b5156e6dc1d34 > on x86_64-linux-gnu. > > Is it OK for trunk ? Hi Benjamin. Thanks for the patch. It's OK as-is, but it doesn't cover every case... [...snip...] > diff --git a/gcc/analyzer/call-details.cc b/gcc/analyzer/call-details.cc > index 66fb0fe871e..8d60e928b15 100644 > --- a/gcc/analyzer/call-details.cc > +++ b/gcc/analyzer/call-details.cc > @@ -295,6 +295,17 @@ call_details::get_arg_svalue (unsigned idx) const >return m_model->get_rvalue (arg, m_ctxt); > } > > +/* If argument IDX's svalue at the callsite is a region_svalue, > + return the region it points to. > + Otherwise return NULL. */ > + > +const region * > +call_details::maybe_get_arg_region (unsigned idx) const > +{ > + const svalue *sval = get_arg_svalue (idx); > + return sval->maybe_get_region (); > +} > + Is this the correct thing to be doing? It's used in the following... [...snip...] > diff --git a/gcc/analyzer/kf-lang-cp.cc b/gcc/analyzer/kf-lang-cp.cc > index 393b4f25e79..4450892dfa2 100644 > --- a/gcc/analyzer/kf-lang-cp.cc > +++ b/gcc/analyzer/kf-lang-cp.cc [...snip...] > @@ -54,28 +90,75 @@ public: > region_model *model = cd.get_model (); > region_model_manager *mgr = cd.get_manager (); > const svalue *size_sval = cd.get_arg_svalue (0); > -const region *new_reg > - = model->get_or_create_region_for_heap_alloc (size_sval, cd.get_ctxt > ()); > -if (cd.get_lhs_type ()) > +region_model_context *ctxt = cd.get_ctxt (); > +const gcall *call = cd.get_call_stmt (); > + > +/* If the call was actually a placement new, check that accessing > + the buffer lhs is placed into does not result in out-of-bounds. */ > +if (is_placement_new_p (call)) >{ > - const svalue *ptr_sval > - = mgr->get_ptr_svalue (cd.get_lhs_type (), new_reg); > - cd.maybe_set_lhs (ptr_sval); > + const region *ptr_reg = cd.maybe_get_arg_region (1); > + if (ptr_reg && cd.get_lhs_type ()) > + { ...which will only fire if arg 1 is a region_svalue. This won't trigger if you have e.g. a binop_svalue for pointer arithmetic. What happens e.g. for this one-off-the-end bug: void *p = malloc (4); if (!p) return; int32_t *i = ::new (p + 1) int32_t; *i = 42; So maybe call_details::maybe_get_arg_region should instead be: /* Return the region that argument IDX points to. */ const region * call_details::deref_ptr_arg (unsigned idx) const { const svalue *ptr_sval = get_arg_svalue (idx); return m_model->deref_rvalue (ptr_sval, get_arg_tree (idx), m_ctxt); } (caveat: I didn't test this) > + const region *base_reg = ptr_reg->get_base_region (); > + const svalue *num_bytes_sval = cd.get_arg_svalue (0); > + const region *sized_new_reg > + = mgr->get_sized_region (base_reg, > + cd.get_lhs_type (), > + num_bytes_sval); Why do you use the base_reg here, rather than just ptr_reg? In the example above, the *(p + 1) has base region heap_allocated_region, but the ptr_reg is one byte higher; hence check_region_for_write of 4 bytes ought to detect a problem with writing 4 bytes to *(p + 1), but wouldn't complain about the write to *p. ...assuming that I'm reading this code correctly. > + model->check_region_for_write (sized_new_reg, > +nullptr, > +ctxt); > + const svalue *ptr_sval > + = mgr->get_ptr_svalue (cd.get_lhs_type (), sized_new_reg); > + cd.maybe_set_lhs (ptr_sval); > + } > + } [...snip...] The patch is OK for trunk as is; but please can you look into the above. If the above is a problem, you can either do another version of the patch, or do it as a followup patch (whichever you're more comfortable with, but it might be best to get the patch into trunk as-is, given that the GSoC period is nearly over). Thanks Dave
[PATCH] analyzer: Add support of placement new and improved operator new [PR105948, PR94355]
From: benjamin priour Hi, Succesfully regstrapped off trunk 7f2ed06ddc825e8a4e0edfd1d66b5156e6dc1d34 on x86_64-linux-gnu. Is it OK for trunk ? Thanks, Benjamin. Patch below. --- Fixed spurious possibly-NULL warning always tagging along throwing operator new despite it never returning NULL. Now operator new is correctly recognized as possibly returning NULL if and only if it is non-throwing or exceptions have been disabled. Different standard signatures of operator new are now properly recognized. Added support of placement new, so that it is now properly recognized, and a 'heap_allocated' region is no longer created for it. Placement new size is also checked and a 'Wanalyzer-allocation-size' is emitted when relevant, as well as always a 'Wanalyzer-out-of-bounds'. 'operator new' non-throwing variants are detected y checking the types of the parameters. Indeed, in a call to new (std::nothrow) () the chosen overload has signature 'operator new (void*, std::nothrow_t&)', where the second parameter is a reference. In a placement new, the second parameter will always be a void pointer. Prior to this patch, some buffers first allocated with 'new', then deleted an thereafter used would result in a 'Wanalyzer-user-after-free' warning. However the wording was "use after 'free'" instead of the expected "use after 'delete'". This patch fixes this by introducing a new kind of poisoned value, namely POISON_KIND_DELETED. Due to how the analyzer sees calls to non-throwing variants of operator new, dereferencing a pointer freshly allocated in this fashion caused both a 'Wanalyzer-use-of-uninitialized-value' and a 'Wanalyzer-null-dereference' to be emitted, while only the latter was relevant. As a result, 'null-dereference' now supersedes 'use-of-uninitialized'. Signed-off-by: benjamin priour gcc/analyzer/ChangeLog: PR analyzer/105948 PR analyzer/94355 * analyzer.h (is_placement_new_p): New declaration. * call-details.cc (call_details::maybe_get_arg_region): New function. Returns the region of the argument at given index if possible. * call-details.h: Declaration of the above function. * kf-lang-cp.cc (is_placement_new_p): Returns true if the gcall is recognized as a placement new. (kf_operator_delete::impl_call_post): Unbinding a region and its descendents now poisons with POISON_KIND_DELETED. (register_known_functions_lang_cp): Known function "operator delete" is now registered only once independently of its number of arguments. * region-model.cc (region_model::eval_condition): Now recursively calls itself if any of the operand is wrapped in a cast. * sm-malloc.cc (malloc_state_machine::on_stmt): Add placement new recognition. * svalue.cc (poison_kind_to_str): Wording for the new PK. * svalue.h (enum poison_kind): Add value POISON_KIND_DELETED. gcc/testsuite/ChangeLog: PR analyzer/105948 PR analyzer/94355 * g++.dg/analyzer/out-of-bounds-placement-new.C: Added a directive. * g++.dg/analyzer/placement-new.C: Added tests. * g++.dg/analyzer/new-2.C: New test. * g++.dg/analyzer/noexcept-new.C: New test. * g++.dg/analyzer/placement-new-size.C: New test. --- gcc/analyzer/analyzer.h | 1 + gcc/analyzer/call-details.cc | 11 ++ gcc/analyzer/call-details.h | 1 + gcc/analyzer/kf-lang-cp.cc| 117 +++--- gcc/analyzer/region-model.cc | 36 ++ gcc/analyzer/sm-malloc.cc | 37 -- gcc/analyzer/svalue.cc| 2 + gcc/analyzer/svalue.h | 3 + gcc/testsuite/g++.dg/analyzer/new-2.C | 70 +++ gcc/testsuite/g++.dg/analyzer/noexcept-new.C | 48 +++ .../analyzer/out-of-bounds-placement-new.C| 2 +- .../g++.dg/analyzer/placement-new-size.C | 27 gcc/testsuite/g++.dg/analyzer/placement-new.C | 90 +- 13 files changed, 417 insertions(+), 28 deletions(-) create mode 100644 gcc/testsuite/g++.dg/analyzer/new-2.C create mode 100644 gcc/testsuite/g++.dg/analyzer/noexcept-new.C create mode 100644 gcc/testsuite/g++.dg/analyzer/placement-new-size.C diff --git a/gcc/analyzer/analyzer.h b/gcc/analyzer/analyzer.h index 9b351b5ed56..208b85026fc 100644 --- a/gcc/analyzer/analyzer.h +++ b/gcc/analyzer/analyzer.h @@ -423,6 +423,7 @@ extern bool is_std_named_call_p (const_tree fndecl, const char *funcname, const gcall *call, unsigned int num_args); extern bool is_setjmp_call_p (const gcall *call); extern bool is_longjmp_call_p (const gcall *call); +extern bool is_placement_new_p (const gcall *call); extern const char *get_user_facing_name (const gcall *call); diff --git a/gcc/analyzer/call-details.cc b/gcc/analyzer/call-details.cc index