Re: [PATCH] analyzer: implement symbolic value support for CPython plugin's refcnt checker [PR107646]

2023-09-10 Thread Eric Feng via Gcc-patches
On Thu, Sep 7, 2023 at 1:28 PM David Malcolm  wrote:

> On Mon, 2023-09-04 at 22:13 -0400, Eric Feng wrote:
>
> > Hi Dave,
>
> Hi Eric, thanks for the patch.
>
> >
> > Recently I've been working on symbolic value support for the reference
> > count checker. I've attached a patch for it below; let me know it looks
> > OK for trunk. Thanks!
> >
> > Best,
> > Eric
> >
> > ---
> >
> > This patch enhances the reference count checker in the CPython plugin by
> > adding support for symbolic values. Whereas previously we were only able
> > to check the reference count of PyObject* objects created in the scope
> > of the function; we are now able to emit diagnostics on reference count
> > mismatch of objects that were, for example, passed in as a function
> > parameter.
> >
> > rc6.c:6:10: warning: expected ‘obj’ to have reference count: N + ‘1’ but
> ob_refcnt field is N + ‘2’
> > 6 |   return obj;
> >   |  ^~~
>
> [...snip...]
>
> >  create mode 100644
> gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-refcnt.c
> >
> > diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> > index bf1982e79c3..d7ecd7fce09 100644
> > --- a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> > +++ b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> > @@ -314,17 +314,20 @@ public:
> >{
> >  diagnostic_metadata m;
> >  bool warned;
> > -// just assuming constants for now
> > -auto actual_refcnt
> > - = m_actual_refcnt->dyn_cast_constant_svalue ()->get_constant ();
> > -auto ob_refcnt = m_ob_refcnt->dyn_cast_constant_svalue
> ()->get_constant ();
> > -warned = warning_meta (rich_loc, m, get_controlling_option (),
> > -"expected %qE to have "
> > -"reference count: %qE but ob_refcnt field is:
> %qE",
> > -m_reg_tree, actual_refcnt, ob_refcnt);
> > -
> > -// location_t loc = rich_loc->get_loc ();
> > -// foo (loc);
> > +
> > +const auto *actual_refcnt_constant
> > + = m_actual_refcnt->dyn_cast_constant_svalue ();
> > +const auto *ob_refcnt_constant =
> m_ob_refcnt->dyn_cast_constant_svalue ();
> > +if (!actual_refcnt_constant || !ob_refcnt_constant)
> > +  return false;
> > +
> > +auto actual_refcnt = actual_refcnt_constant->get_constant ();
> > +auto ob_refcnt = ob_refcnt_constant->get_constant ();
> > +warned = warning_meta (
> > + rich_loc, m, get_controlling_option (),
> > + "expected %qE to have "
> > + "reference count: N + %qE but ob_refcnt field is N + %qE",
> > + m_reg_tree, actual_refcnt, ob_refcnt);
> >  return warned;
>
> I know you're emulating the old behavior I implemented way back in
> cpychecker, but I don't like that behavior :(
>
> Specifically, although the patch improves the behavior for symbolic
> values, it regresses the precision of wording for the concrete values
> case.  If we have e.g. a concrete ob_refcnt of 2, whereas we only have
> 1 pointer, then it's more readable to say:
>
>   warning: expected ‘obj’ to have reference count: ‘1’ but ob_refcnt
> field is ‘2’
>
> than:
>
>   warning: expected ‘obj’ to have reference count: N + ‘1’ but ob_refcnt
> field is N + ‘2’
>
> ...and we shouldn't quote concrete numbers, the message should be:
>
>   warning: expected ‘obj’ to have reference count of 1 but ob_refcnt field
> is 2


> or better:
>
>   warning: ‘*obj’ is pointed to by 1 pointer but 'ob_refcnt' field is 2
>
>
> Can you move the unwrapping of the svalue from the tests below into the
> emit vfunc?  That way the m_actual_refcnt doesn't have to be a
> constant_svalue; you could have logic in the emit vfunc to print
> readable messages based on what kind of svalue it is.
>
> Rather than 'N', it might be better to say 'initial'; how about:
>
>   warning: ‘*obj’ is pointed to by 0 additional pointers but 'ob_refcnt'
> field has increased by 1
>   warning: ‘*obj’ is pointed to by 1 additional pointer but 'ob_refcnt'
> field has increased by 2
>   warning: ‘*obj’ is pointed to by 1 additional pointer but 'ob_refcnt'
> field is unchanged
>   warning: ‘*obj’ is pointed to by 2 additional pointers but 'ob_refcnt'
> field has decreased by 1
>   warning: ‘*obj’ is pointed to by 1 fewer pointers but 'ob_refcnt' field
> is unchanged
>
> and similar?
>

That makes sense to me as well (indeed I was just emulating the old
behavior)! Will experiment and keep you posted on a revised patch with this
in mind.  This is somewhat of a minor detail but can we emit ‘*obj’ as
bolded text in the diagnostic message? Currently, I can emit this
(including the asterisk) like so: '*%E'. But unlike using %qE, it doesn't
bold the body of the single quotations. Is this possible?

>
> Maybe have a flag that tracks whether we're talking about a concrete
> value that's absolute versus a concrete value that's relative to the
> initial value?
>
>
> [...snip...]
>
>
> > @@ -369,6 +368,19 @@ 

[PATCH] analyzer: implement symbolic value support for CPython plugin's refcnt checker [PR107646]

2023-09-04 Thread Eric Feng via Gcc-patches
Hi Dave,

Recently I've been working on symbolic value support for the reference
count checker. I've attached a patch for it below; let me know it looks
OK for trunk. Thanks!

Best,
Eric

---

This patch enhances the reference count checker in the CPython plugin by
adding support for symbolic values. Whereas previously we were only able
to check the reference count of PyObject* objects created in the scope
of the function; we are now able to emit diagnostics on reference count
mismatch of objects that were, for example, passed in as a function
parameter.

rc6.c:6:10: warning: expected ‘obj’ to have reference count: N + ‘1’ but 
ob_refcnt field is N + ‘2’
6 |   return obj;
  |  ^~~
  ‘create_py_object2’: event 1
|
|6 |   return obj;
|  |  ^~~
|  |  |
|  |  (1) here
|


gcc/testsuite/ChangeLog:
PR analyzer/107646
* gcc.dg/plugin/analyzer_cpython_plugin.c: Support reference count 
checking
of symbolic values.
* gcc.dg/plugin/cpython-plugin-test-PyList_Append.c: New test.
* gcc.dg/plugin/plugin.exp: New test.
* gcc.dg/plugin/cpython-plugin-test-refcnt.c: New test.

Signed-off-by: Eric Feng 

---
 .../gcc.dg/plugin/analyzer_cpython_plugin.c   | 133 +++---
 .../cpython-plugin-test-PyList_Append.c   |  21 ++-
 .../plugin/cpython-plugin-test-refcnt.c   |  18 +++
 gcc/testsuite/gcc.dg/plugin/plugin.exp|   1 +
 4 files changed, 118 insertions(+), 55 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-refcnt.c

diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c 
b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
index bf1982e79c3..d7ecd7fce09 100644
--- a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
+++ b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
@@ -314,17 +314,20 @@ public:
   {
 diagnostic_metadata m;
 bool warned;
-// just assuming constants for now
-auto actual_refcnt
-   = m_actual_refcnt->dyn_cast_constant_svalue ()->get_constant ();
-auto ob_refcnt = m_ob_refcnt->dyn_cast_constant_svalue ()->get_constant ();
-warned = warning_meta (rich_loc, m, get_controlling_option (),
-  "expected %qE to have "
-  "reference count: %qE but ob_refcnt field is: %qE",
-  m_reg_tree, actual_refcnt, ob_refcnt);
-
-// location_t loc = rich_loc->get_loc ();
-// foo (loc);
+
+const auto *actual_refcnt_constant
+   = m_actual_refcnt->dyn_cast_constant_svalue ();
+const auto *ob_refcnt_constant = m_ob_refcnt->dyn_cast_constant_svalue ();
+if (!actual_refcnt_constant || !ob_refcnt_constant)
+  return false;
+
+auto actual_refcnt = actual_refcnt_constant->get_constant ();
+auto ob_refcnt = ob_refcnt_constant->get_constant ();
+warned = warning_meta (
+   rich_loc, m, get_controlling_option (),
+   "expected %qE to have "
+   "reference count: N + %qE but ob_refcnt field is N + %qE",
+   m_reg_tree, actual_refcnt, ob_refcnt);
 return warned;
   }
 
@@ -336,10 +339,6 @@ public:
 
 private:
 
-  void foo(location_t loc) const 
-  {
-inform(loc, "something is up right here");
-  }
   const region *m_base_region;
   const svalue *m_ob_refcnt;
   const svalue *m_actual_refcnt;
@@ -369,6 +368,19 @@ increment_region_refcnt (hash_map 
, const region *key)
   refcnt = existed ? refcnt + 1 : 1;
 }
 
+static const region *
+get_region_from_svalue (const svalue *sval, region_model_manager *mgr)
+{
+  const auto *region_sval = sval->dyn_cast_region_svalue ();
+  if (region_sval)
+return region_sval->get_pointee ();
+
+  const auto *initial_sval = sval->dyn_cast_initial_svalue ();
+  if (initial_sval)
+return mgr->get_symbolic_region (initial_sval);
+
+  return nullptr;
+}
 
 /* Recursively fills in region_to_refcnt with the references owned by
pyobj_ptr_sval.  */
@@ -381,20 +393,9 @@ count_pyobj_references (const region_model *model,
   if (!pyobj_ptr_sval)
 return;
 
-  const auto *pyobj_region_sval = pyobj_ptr_sval->dyn_cast_region_svalue ();
-  const auto *pyobj_initial_sval = pyobj_ptr_sval->dyn_cast_initial_svalue ();
-  if (!pyobj_region_sval && !pyobj_initial_sval)
-return;
-
-  // todo: support initial sval (e.g passed in as parameter)
-  if (pyobj_initial_sval)
-{
-  // increment_region_refcnt (region_to_refcnt,
-  //  pyobj_initial_sval->get_region ());
-  return;
-}
+  region_model_manager *mgr = model->get_manager ();
 
-  const region *pyobj_region = pyobj_region_sval->get_pointee ();
+  const region *pyobj_region = get_region_from_svalue (pyobj_ptr_sval, mgr);
   if (!pyobj_region || seen.contains (pyobj_region))
 return;
 
@@ -409,49 +410,75 @@ count_pyobj_references (const region_model *model,
 return;
 
   const auto _binding_map = retval_cluster->get_map ();
-
   for (const 

Re: [PATCH] analyzer: implement reference count checking for CPython plugin [PR107646]

2023-09-01 Thread Eric Feng via Gcc-patches
Thank you for the patch!

On Fri, Sep 1, 2023 at 10:51 AM David Malcolm  wrote:
>
> On Fri, 2023-09-01 at 04:49 +0200, Hans-Peter Nilsson wrote:
> > (Looks like this was committed as r14-3580-g597b9ec69bca8a)
> >
> > > Cc: g...@gcc.gnu.org, gcc-patches@gcc.gnu.org, Eric Feng
> > > 
> > > From: Eric Feng via Gcc 
> >
> > > gcc/testsuite/ChangeLog:
> > >   PR analyzer/107646
> > > * gcc.dg/plugin/analyzer_cpython_plugin.c: Implements
> > > reference count
> > >   * checking for PyObjects.
> > > * gcc.dg/plugin/cpython-plugin-test-2.c: Moved to...
> > > * gcc.dg/plugin/cpython-plugin-test-PyList_Append.c:
> > > ...here (and
> > >   * added more tests).
> > > * gcc.dg/plugin/cpython-plugin-test-1.c: Moved to...
> > > * gcc.dg/plugin/cpython-plugin-test-no-plugin.c: ...here
> > > (and added
> > >   * more tests).
> > > * gcc.dg/plugin/plugin.exp: New tests.
> > > * gcc.dg/plugin/cpython-plugin-test-PyList_New.c: New test.
> > > * gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c: New
> > > test.
> > > * gcc.dg/plugin/cpython-plugin-test-refcnt-checking.c: New
> > > test.
> >
> > It seems this was more or less a rewrite, but that said,
> > it's generally preferable to always *add* tests, never *modify* them.
> >
> > >  .../gcc.dg/plugin/analyzer_cpython_plugin.c   | 376
> > > +-
> >
> > ^^^ Ouch!  Was it not within reason to keep that test as it
> > was, and just add another test?
Thanks for the feedback. To clarify, 'analyzer_cpython_plugin.c' is
not a test itself but rather a plugin that currently lives within the
testsuite. The core of the test cases were also not modified, rather I
renamed certain filenames containing them for clarity (unless this is
what you meant in terms of modification, in which case noted) and
added to them. However, I understand the preference and will keep that
in mind.
> >
> > Anyway, the test after rewrite fails, and for some targets
> > like cris-elf and apparently m68k-linux, yields an error.
> > I see a PR was already opened.
> >
> > Also, mostly for future reference, several files in the
> > patch miss a final newline, as seen by a "\ No newline at
> > end of file"-marker.
Noted.
> >
> > I think I found the problem; a mismatch between default C++
> > language standard between host-gcc and target-gcc.
> >
> > (It's actually *not* as simple as "auto var = typeofvar()"
> > not being recognized in C++11 --or else there'd be an error
> > for the hash_set declaration too, which I just changed for
> > consistency-- but it's close enough for me.)
> >
> > With this, retesting plugin.exp for cris-elf works.
Sounds good, thanks again! I was also curious about why hash_map had
an issue here with that syntax whilst hash_set did not, so I tried to
investigate a bit further. I believe the issue was due to the compiler
having trouble disambiguating between the hash_map constructors in
C++11.

>From the error message we received:

test/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c:480:58:
error: no matching function for call to 'hash_map::hash_map(hash_map)'
   auto region_to_refcnt = hash_map ();

I think the compiler is mistakenly interpreting the call here as we
would like to create a new hash_map object using its copy constructor,
but we "forgot" to provide the object to be copied, rather than our
intention of using the default constructor.

Looking at hash_map.h and hash_set.h seems to support this hypothesis,
as hash_map has two constructors, one of which resembles a copy
constructor with additional arguments:
https://github.com/gcc-mirror/gcc/blob/master/gcc/hash-map.h#L147.
Perhaps the default arguments here further complicated the ambiguity
as to which constructor to use in the presence of the empty
parenthesis.

On the other hand, hash_set has only the default constructor with
default parameters, and thus there is no ambiguity:
https://github.com/gcc-mirror/gcc/blob/master/gcc/hash-set.h#L40.

I assume this ambiguity was cleared up by later versions, and thus we
observed no problems in C++17. However, I am certainly still a
relative newbie of C++, so please anyone feel free to correct my
reasoning and chime in!
> >
> > Ok to commit?
>
> Sorry about the failing tests.
>
> Thanks for the patch; please go ahead and commit.
>
> Dave
>
> >
> > -- >8 --
> > From: Hans-Peter Nilsson 
> > Date: Fri, 1 Sep 2023 04:36:03 +0200
> > Subject: [PATCH] testsuite: Fix analyzer_cpython_plugin.c
> > declarations, PR testsuite/111264
> >
> > Also, add missing newline at end of file.
> >
> > PR testsuite/111264
> > * gcc.dg/plugin/analyzer_cpython_plugin.c: Make declarations
> > C++11-compatible.
> > ---
> >  gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c | 8 
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> > b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> > index 

Re: [PATCH] analyzer: implement reference count checking for CPython plugin [PR107646]

2023-08-30 Thread Eric Feng via Gcc-patches
On Tue, Aug 29, 2023 at 5:14 PM David Malcolm  wrote:
>
> On Tue, 2023-08-29 at 13:28 -0400, Eric Feng wrote:
> > Additionally, by using the old model and the pointer per your
> > suggestion,
> > we are able to find the representative tree and emit a more accurate
> > diagnostic!
> >
> > rc3.c:23:10: warning: expected ‘item’ to have reference count: ‘1’
> > but ob_refcnt field is: ‘2’
> >23 |   return list;
> >   |  ^~~~
> >   ‘create_py_object’: events 1-4
> > |
> > |4 |   PyObject* item = PyLong_FromLong(3);
> > |  |^~
> > |  ||
> > |  |(1) when ‘PyLong_FromLong’ succeeds
> > |5 |   PyObject* list = PyList_New(1);
> > |  |~
> > |  ||
> > |  |(2) when ‘PyList_New’ succeeds
> > |..
> > |   14 |   PyList_Append(list, item);
> > |  |   ~
> > |  |   |
> > |  |   (3) when ‘PyList_Append’ succeeds, moving buffer
> > |..
> > |   23 |   return list;
> > |  |  
> > |  |  |
> > |  |  (4) here
> > |
>
> Excellent, that's a big improvement.
>
> >
> > If a representative tree is not found, I decided we should just bail
> > out
> > of emitting a diagnostic for now, to avoid confusing the user on what
> > the problem is.
>
> Fair enough.
>
> >
> > I've attached the patch for this (on top of the previous one) below.
> > If
> > it also looks good, I can merge it with the last patch and push it in
> > at
> > the same time.
>
> I don't mind either way, but please can you update the tests so that we
> have some automated test coverage that the correct name is being
> printed in the warning.
>
> Thanks
> Dave
>

Sorry — forgot to hit 'reply all' in the previous e-mail. Resending to
preserve our chain on the list:

---

Thanks; pushed to trunk with nits fixed:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=597b9ec69bca8acb7a3d65641c0a730de8b27ed4.

Incidentally, I updated my formatting settings in VSCode, which I've
previously mentioned in passing. In case anyone is interested:

"C_Cpp.clang_format_style": "{ BasedOnStyle: GNU, UseTab: Always,
TabWidth: 8, IndentWidth: 2, BinPackParameters: false,
AlignAfterOpenBracket: Align,
AllowAllParametersOfDeclarationOnNextLine: true }",

This fixes some issues with the indent width and also ensures function
parameters of appropriate length are aligned properly and on a new
line each (like the rest of the analyzer code).


[PATCH] analyzer: implement reference count checking for CPython plugin [PR107646]

2023-08-29 Thread Eric Feng via Gcc-patches
Additionally, by using the old model and the pointer per your suggestion,
we are able to find the representative tree and emit a more accurate diagnostic!

rc3.c:23:10: warning: expected ‘item’ to have reference count: ‘1’ but 
ob_refcnt field is: ‘2’
   23 |   return list;
  |  ^~~~
  ‘create_py_object’: events 1-4
|
|4 |   PyObject* item = PyLong_FromLong(3);
|  |^~
|  ||
|  |(1) when ‘PyLong_FromLong’ succeeds
|5 |   PyObject* list = PyList_New(1);
|  |~
|  ||
|  |(2) when ‘PyList_New’ succeeds
|..
|   14 |   PyList_Append(list, item);
|  |   ~
|  |   |
|  |   (3) when ‘PyList_Append’ succeeds, moving buffer
|..
|   23 |   return list;
|  |  
|  |  |
|  |  (4) here
|

If a representative tree is not found, I decided we should just bail out
of emitting a diagnostic for now, to avoid confusing the user on what
the problem is.

I've attached the patch for this (on top of the previous one) below. If
it also looks good, I can merge it with the last patch and push it in at
the same time.

Best,
Eric

---
 gcc/analyzer/region-model.cc  |  3 +-
 gcc/analyzer/region-model.h   |  7 ++--
 .../gcc.dg/plugin/analyzer_cpython_plugin.c   | 35 +++
 3 files changed, 27 insertions(+), 18 deletions(-)

diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index eb4f976b83a..c1d266d351b 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -5391,6 +5391,7 @@ region_model::pop_frame (tree result_lvalue,
 {
   gcc_assert (m_current_frame);
 
+  const region_model pre_popped_model = *this;
   const frame_region *frame_reg = m_current_frame;
 
   /* Notify state machines.  */
@@ -5424,7 +5425,7 @@ region_model::pop_frame (tree result_lvalue,
 }
 
   unbind_region_and_descendents (frame_reg,POISON_KIND_POPPED_STACK);
-  notify_on_pop_frame (this, retval, ctxt);
+  notify_on_pop_frame (this, _popped_model, retval, ctxt);
 }
 
 /* Get the number of frames in this region_model's stack.  */
diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
index 440ea6d828d..b89c6f6c649 100644
--- a/gcc/analyzer/region-model.h
+++ b/gcc/analyzer/region-model.h
@@ -237,6 +237,7 @@ public:
 struct append_regions_cb_data;
 
 typedef void (*pop_frame_callback) (const region_model *model,
+   const region_model *prev_model,
const svalue *retval,
region_model_context *ctxt);
 
@@ -543,11 +544,13 @@ class region_model
   }
 
   static void
-  notify_on_pop_frame (const region_model *model, const svalue *retval,
+  notify_on_pop_frame (const region_model *model,
+  const region_model *prev_model,
+  const svalue *retval,
   region_model_context *ctxt)
   {
 for (auto  : pop_frame_callbacks)
-   callback (model, retval, ctxt);
+   callback (model, prev_model, retval, ctxt);
   }
 
 private:
diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c 
b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
index b2caed8fc1b..6f0a355fe30 100644
--- a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
+++ b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
@@ -318,11 +318,10 @@ public:
 auto actual_refcnt
= m_actual_refcnt->dyn_cast_constant_svalue ()->get_constant ();
 auto ob_refcnt = m_ob_refcnt->dyn_cast_constant_svalue ()->get_constant ();
-warned = warning_meta (
-   rich_loc, m, get_controlling_option (),
-   "expected  to have "
-   "reference count: %qE but ob_refcnt field is: %qE",
-   actual_refcnt, ob_refcnt);
+warned = warning_meta (rich_loc, m, get_controlling_option (),
+  "expected %qE to have "
+  "reference count: %qE but ob_refcnt field is: %qE",
+  m_reg_tree, actual_refcnt, ob_refcnt);
 
 // location_t loc = rich_loc->get_loc ();
 // foo (loc);
@@ -425,7 +424,8 @@ count_expected_pyobj_references (const region_model *model,
 
 /* Compare ob_refcnt field vs the actual reference count of a region */
 static void
-check_refcnt (const region_model *model, region_model_context *ctxt,
+check_refcnt (const region_model *model, const region_model *old_model,
+ region_model_context *ctxt,
  const hash_map::iterator::reference_pair region_refcnt)
 {
@@ -438,8 +438,11 @@ check_refcnt (const region_model *model, 
region_model_context *ctxt,
 
   if (ob_refcnt_sval != actual_refcnt_sval)
   {
-// todo: fix this
-tree reg_tree = model->get_representative_tree 

Re: [PATCH] analyzer: implement reference count checking for CPython plugin [PR107646]

2023-08-28 Thread Eric Feng via Gcc-patches
On Tue, Aug 29, 2023 at 12:32 AM Eric Feng  wrote:
>
> Hi Dave,
>
> Thanks for the feedback. I've addressed the changes you mentioned in
> addition to adding more test cases. I've also taken this chance to
> split the test files according to known function subclasses, as you previously
> suggested. Since there were also some changes to the core analyzer, I've done 
> a
> bootstrap and regtested the patch as well. Does it look OK for trunk?
Apologies — I forgot to mention that bootstrap and regtest was done on
aarch64-unknown-linux-gnu.
>
> Best,
> Eric
>
> ---
>
> This patch introduces initial support for reference count checking of
> PyObjects in relation to the Python/C API for the CPython plugin.
> Additionally, the core analyzer underwent several modifications to
> accommodate this feature. These include:
>
> - Introducing support for callbacks at the end of
>   region_model::pop_frame. This is our current point of validation for
>   the reference count of PyObjects.
> - An added optional custom stmt_finder parameter to
>   region_model_context::warn. This aids in emitting a diagnostic
>   concerning the reference count, especially when the stmt_finder is
>   NULL, which is currently the case during region_model::pop_frame.
>
> The current diagnostic we emit relating to the reference count
> appears as follows:
>
> rc3.c:23:10: warning: expected  to 
> have reference count: ‘1’ but ob_refcnt field is: ‘2’
>23 |   return list;
>   |  ^~~~
>   ‘create_py_object’: events 1-4
> |
> |4 |   PyObject* item = PyLong_FromLong(3);
> |  |^~
> |  ||
> |  |(1) when ‘PyLong_FromLong’ succeeds
> |5 |   PyObject* list = PyList_New(1);
> |  |~
> |  ||
> |  |(2) when ‘PyList_New’ succeeds
> |..
> |   14 |   PyList_Append(list, item);
> |  |   ~
> |  |   |
> |  |   (3) when ‘PyList_Append’ succeeds, moving buffer
> |..
> |   23 |   return list;
> |  |  
> |  |  |
> |  |  (4) here
> |
>
> This is a WIP in several ways:
> - Enhancing the diagnostic for better clarity. For instance, users should
>   expect to see the variable name 'item' instead of the placeholder in the
>   diagnostic above.
> - Currently, functions returning PyObject * are assumed to always produce
>   a new reference.
> - The validation of reference count is only for PyObjects created within a
>   function body. Verifying reference counts for PyObjects passed as
>   parameters is not supported in this patch.
>
> gcc/analyzer/ChangeLog:
>   PR analyzer/107646
> * engine.cc (impl_region_model_context::warn): New optional parameter.
> * exploded-graph.h (class impl_region_model_context): Likewise.
> * region-model.cc (region_model::pop_frame): New callback feature for
>   * region_model::pop_frame.
> * region-model.h (struct append_regions_cb_data): Likewise.
> (class region_model): Likewise.
> (class region_model_context): New optional parameter.
> (class region_model_context_decorator): Likewise.
>
> gcc/testsuite/ChangeLog:
>   PR analyzer/107646
> * gcc.dg/plugin/analyzer_cpython_plugin.c: Implements reference count
>   * checking for PyObjects.
> * gcc.dg/plugin/cpython-plugin-test-2.c: Moved to...
> * gcc.dg/plugin/cpython-plugin-test-PyList_Append.c: ...here (and
>   * added more tests).
> * gcc.dg/plugin/cpython-plugin-test-1.c: Moved to...
> * gcc.dg/plugin/cpython-plugin-test-no-plugin.c: ...here (and added
>   * more tests).
> * gcc.dg/plugin/plugin.exp: New tests.
> * gcc.dg/plugin/cpython-plugin-test-PyList_New.c: New test.
> * gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c: New test.
> * gcc.dg/plugin/cpython-plugin-test-refcnt-checking.c: New test.
>
> Signed-off-by: Eric Feng 
>
> ---
>  gcc/analyzer/engine.cc|   8 +-
>  gcc/analyzer/exploded-graph.h |   4 +-
>  gcc/analyzer/region-model.cc  |   3 +
>  gcc/analyzer/region-model.h   |  48 ++-
>  .../gcc.dg/plugin/analyzer_cpython_plugin.c   | 376 +-
>  c => cpython-plugin-test-PyList_Append.c} |  56 +--
>  .../plugin/cpython-plugin-test-PyList_New.c   |  38 ++
>  .../cpython-plugin-test-PyLong_FromLong.c |  38 ++
>  ...st-1.c => cpython-plugin-test-no-plugin.c} |   0
>  .../cpython-plugin-test-refcnt-checking.c |  78 
>  gcc/testsuite/gcc.dg/plugin/plugin.exp|   5 +-
>  11 files changed, 612 insertions(+), 42 deletions(-)
>  rename gcc/testsuite/gcc.dg/plugin/{cpython-plugin-test-2.c => 
> cpython-plugin-test-PyList_Append.c} (64%)
>  create mode 100644 
> 

[PATCH] analyzer: implement reference count checking for CPython plugin [PR107646]

2023-08-28 Thread Eric Feng via Gcc-patches
Hi Dave,

Thanks for the feedback. I've addressed the changes you mentioned in
addition to adding more test cases. I've also taken this chance to 
split the test files according to known function subclasses, as you previously 
suggested. Since there were also some changes to the core analyzer, I've done a
bootstrap and regtested the patch as well. Does it look OK for trunk?

Best,
Eric

---

This patch introduces initial support for reference count checking of
PyObjects in relation to the Python/C API for the CPython plugin.
Additionally, the core analyzer underwent several modifications to
accommodate this feature. These include:

- Introducing support for callbacks at the end of
  region_model::pop_frame. This is our current point of validation for
  the reference count of PyObjects.
- An added optional custom stmt_finder parameter to
  region_model_context::warn. This aids in emitting a diagnostic
  concerning the reference count, especially when the stmt_finder is
  NULL, which is currently the case during region_model::pop_frame.

The current diagnostic we emit relating to the reference count
appears as follows:

rc3.c:23:10: warning: expected  to 
have reference count: ‘1’ but ob_refcnt field is: ‘2’
   23 |   return list;
  |  ^~~~
  ‘create_py_object’: events 1-4
|
|4 |   PyObject* item = PyLong_FromLong(3);
|  |^~
|  ||
|  |(1) when ‘PyLong_FromLong’ succeeds
|5 |   PyObject* list = PyList_New(1);
|  |~
|  ||
|  |(2) when ‘PyList_New’ succeeds
|..
|   14 |   PyList_Append(list, item);
|  |   ~
|  |   |
|  |   (3) when ‘PyList_Append’ succeeds, moving buffer
|..
|   23 |   return list;
|  |  
|  |  |
|  |  (4) here
|

This is a WIP in several ways:
- Enhancing the diagnostic for better clarity. For instance, users should
  expect to see the variable name 'item' instead of the placeholder in the
  diagnostic above.
- Currently, functions returning PyObject * are assumed to always produce
  a new reference.
- The validation of reference count is only for PyObjects created within a
  function body. Verifying reference counts for PyObjects passed as
  parameters is not supported in this patch.

gcc/analyzer/ChangeLog:
  PR analyzer/107646
* engine.cc (impl_region_model_context::warn): New optional parameter.
* exploded-graph.h (class impl_region_model_context): Likewise.
* region-model.cc (region_model::pop_frame): New callback feature for
  * region_model::pop_frame.
* region-model.h (struct append_regions_cb_data): Likewise.
(class region_model): Likewise.
(class region_model_context): New optional parameter.
(class region_model_context_decorator): Likewise.

gcc/testsuite/ChangeLog:
  PR analyzer/107646
* gcc.dg/plugin/analyzer_cpython_plugin.c: Implements reference count
  * checking for PyObjects.
* gcc.dg/plugin/cpython-plugin-test-2.c: Moved to...
* gcc.dg/plugin/cpython-plugin-test-PyList_Append.c: ...here (and
  * added more tests).
* gcc.dg/plugin/cpython-plugin-test-1.c: Moved to...
* gcc.dg/plugin/cpython-plugin-test-no-plugin.c: ...here (and added
  * more tests).
* gcc.dg/plugin/plugin.exp: New tests.
* gcc.dg/plugin/cpython-plugin-test-PyList_New.c: New test.
* gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c: New test.
* gcc.dg/plugin/cpython-plugin-test-refcnt-checking.c: New test.

Signed-off-by: Eric Feng 

---
 gcc/analyzer/engine.cc|   8 +-
 gcc/analyzer/exploded-graph.h |   4 +-
 gcc/analyzer/region-model.cc  |   3 +
 gcc/analyzer/region-model.h   |  48 ++-
 .../gcc.dg/plugin/analyzer_cpython_plugin.c   | 376 +-
 c => cpython-plugin-test-PyList_Append.c} |  56 +--
 .../plugin/cpython-plugin-test-PyList_New.c   |  38 ++
 .../cpython-plugin-test-PyLong_FromLong.c |  38 ++
 ...st-1.c => cpython-plugin-test-no-plugin.c} |   0
 .../cpython-plugin-test-refcnt-checking.c |  78 
 gcc/testsuite/gcc.dg/plugin/plugin.exp|   5 +-
 11 files changed, 612 insertions(+), 42 deletions(-)
 rename gcc/testsuite/gcc.dg/plugin/{cpython-plugin-test-2.c => 
cpython-plugin-test-PyList_Append.c} (64%)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_New.c
 create mode 100644 
gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c
 rename gcc/testsuite/gcc.dg/plugin/{cpython-plugin-test-1.c => 
cpython-plugin-test-no-plugin.c} (100%)
 create mode 100644 
gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-refcnt-checking.c

diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc

Re: [PATCH] testsuite: Improve test in dg-require-python-h

2023-08-18 Thread Eric Feng via Gcc-patches
Thanks for the patch, Thiago. I've pushed it to trunk:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=6785917c9103e18bba0d718ac3b65a386d9a14f7.

Best,
Eric

On Fri, Aug 18, 2023 at 2:11 PM David Malcolm  wrote:
>
> On Thu, 2023-08-17 at 23:30 -0300, Thiago Jung Bauermann wrote:
> > If GCC is tested with a sysroot which doesn't contain a Python
> > installation (e.g., with a command such as
> > "make check-gcc-c FLAGS_UNDER_TEST="--sysroot=/some/path"), but
> > there's
> > a python3-config in $PATH, then the testsuite will pick up the host's
> > Python.h which can't actually be used:
> >
> > Executing on host: python3-config --includes(timeout = 300)
> > spawn -ignore SIGHUP python3-config --includes
> > -I/usr/include/python3.10 -I/usr/include/python3.10
> > Executing on host: /some/sysroot/bin/aarch64-unknown-linux-gnu-gcc --
> > sysroot=/some/sysroot/libc -Wl,-dynamic-
> > linker=/some/sysroot/libc/lib/ld-linux-aarch64.so.1 -Wl,-
> > rpath=/some/sysroot/libc/lib
> > /some/src/gcc.git/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-
> > 2.c-fdiagnostics-plain-output  -
> > fplugin=./analyzer_cpython_plugin.so -fanalyzer -
> > I/usr/include/python3.10 -I/usr/include/python3.10 -S -o cpython-
> > plugin-test-2.s(timeout = 600)
> > spawn -ignore SIGHUP /some/sysroot/bin/aarch64-unknown-linux-gnu-gcc
> > --sysroot=/some/sysroot/libc -Wl,-dynamic-
> > linker=/some/sysroot/libc/lib/ld-linux-aarch64.so.1 -Wl,-
> > rpath=/some/sysroot/libc/lib
> > /some/src/gcc.git/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c
> > -fdiagnostics-plain-output -fplugin=./analyzer_cpython_plugin.so -
> > fanalyzer -I/usr/include/python3.10 -I/usr/include/python3.10 -S -o
> > cpython-plugin-test-2.s
> > In file included from /usr/include/python3.10/Python.h:8,
> >  from
> > /some/src/gcc.git/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-
> > 2.c:8:
> > /usr/include/python3.10/pyconfig.h:9:12: fatal error: aarch64-linux-
> > gnu/python3.10/pyconfig.h: No such file or directory
> > compilation terminated.
> > compiler exited with status 1
> >
> > This problem causes these testsuite failures:
> >
> > FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -
> > fplugin=./analyzer_cpython_plugin.so  (test for warnings, line 17)
> > FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -
> > fplugin=./analyzer_cpython_plugin.so  (test for warnings, line 18)
> > FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -
> > fplugin=./analyzer_cpython_plugin.so  (test for warnings, line 21)
> > FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -
> > fplugin=./analyzer_cpython_plugin.so  (test for warnings, line 31)
> > FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -
> > fplugin=./analyzer_cpython_plugin.so  (test for warnings, line 32)
> > FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -
> > fplugin=./analyzer_cpython_plugin.so  (test for warnings, line 35)
> > FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -
> > fplugin=./analyzer_cpython_plugin.so  (test for warnings, line 45)
> > FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -
> > fplugin=./analyzer_cpython_plugin.so  (test for warnings, line 55)
> > FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -
> > fplugin=./analyzer_cpython_plugin.so  (test for warnings, line 63)
> > FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -
> > fplugin=./analyzer_cpython_plugin.so  (test for warnings, line 66)
> > FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -
> > fplugin=./analyzer_cpython_plugin.so  (test for warnings, line 68)
> > FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -
> > fplugin=./analyzer_cpython_plugin.so  (test for warnings, line 69)
> > FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -
> > fplugin=./analyzer_cpython_plugin.so (test for excess errors)
> > Excess errors:
> > /usr/include/python3.10/pyconfig.h:9:12: fatal error: aarch64-linux-
> > gnu/python3.10/pyconfig.h: No such file or directory
> > compilation terminated.
> >
> > So try to compile a test file so that the testcase can be marked as
> > unsupported instead.
> >
> > gcc/testsuite/ChangeLog:
> > * gcc/testsuite/lib/target-supports.exp (dg-require-python-
> > h): Test
> > whether Python.h can really be used.
> > ---
> >  gcc/testsuite/lib/target-supports.exp | 14 --
> >  1 file changed, 12 insertions(+), 2 deletions(-)
> >
> > diff --git a/gcc/testsuite/lib/target-supports.exp
> > b/gcc/testsuite/lib/target-supports.exp
> > index 92b6f69730e9..5b5f86551844 100644
> > --- a/gcc/testsuite/lib/target-supports.exp
> > +++ b/gcc/testsuite/lib/target-supports.exp
> > @@ -12570,11 +12570,21 @@ proc dg-require-python-h { args } {
> >
> >  verbose "ENTER dg-require-python-h" 2
> >
> > +set supported 0
> >  set result [remote_exec host "python3-config --includes"]
> >  set status [lindex $result 0]
> >  if { $status == 0 } {
> > -set python_flags [lindex $result 1]
> > -} else {
> > +   # Remove trailing newline from python3-config output.
> > +   set python_flags [string trim [lindex $result 1]]
> 

Re: [COMMITTED] analyzer: More features for CPython analyzer plugin [PR107646]

2023-08-11 Thread Eric Feng via Gcc-patches
I've noticed there were still some strange indentations in the last
patch ... however, I think I've finally figured out a sane formatting
solution for me (fingers crossed). I will address them in the
follow-up patch at the same time as adding more test coverage.

---

In case, anyone else using VSCode has been having issues with
formatting according to GNU/GCC conventions, these are the relevant
formatting settings that I've found work for me. Assuming the C/C++
extension is installed, then in settings.json:

"C_Cpp.clang_format_style": "{ BasedOnStyle: GNU, UseTab: Always,
TabWidth: 8, IndentWidth: 8 }"

Just setting the base style to GNU formats everything correctly except
for the fact that indentation defaults to spaces (which is what I've
been struggling with fixing manually in the last few patches). The
rest of the settings are for replacing blocks of 8 spaces with tabs
(which is a requirement in check_GNU_style). In combination, this
works for everything except for header files for some reason, but I'll
defer that battle to another day.

On Fri, Aug 11, 2023 at 1:47 PM Eric Feng  wrote:
>
> Thanks for the feedback! I've incorporated the changes (aside from
> expanding test coverage, which I plan on releasing in a follow-up),
> rebased, and performed a bootstrap and regtest on
> aarch64-unknown-linux-gnu. Since you mentioned that it is good for trunk
> with nits fixed and no problems after rebase, the patch has now been pushed.
>
> Best,
> Eric
>
> ---
>
> This patch adds known function subclasses for Python/C API functions
> PyList_New, PyLong_FromLong, and PyList_Append. It also adds new
> optional parameters for
> region_model::get_or_create_region_for_heap_alloc, allowing for the
> newly allocated region to immediately transition from the start state to
> the assumed non-null state in the malloc state machine if desired.
> Finally, it adds a new procedure, dg-require-python-h, intended as a
> directive in Python-related analyzer tests, to append necessary Python
> flags during the tests' build process.
>
> The main warnings we gain in this patch with respect to the known function
> subclasses mentioned are leak related. For example:
>
> rc3.c: In function ‘create_py_object’:
> │
> rc3.c:21:10: warning: leak of ‘item’ [CWE-401] [-Wanalyzer-malloc-leak]
> │
>21 |   return list;
>   │
>   |  ^~~~
> │
>   ‘create_py_object’: events 1-4
> │
> |
> │
> |4 |   PyObject* item = PyLong_FromLong(10);
> │
> |  |^~~
> │
> |  ||
> │
> |  |(1) allocated here
> │
> |  |(2) when ‘PyLong_FromLong’ succeeds
> │
> |5 |   PyObject* list = PyList_New(2);
> │
> |  |~
> │
> |  ||
> │
> |  |(3) when ‘PyList_New’ fails
> │
> |..
> │
> |   21 |   return list;
> │
> |  |  
> │
> |  |  |
> │
> |  |  (4) ‘item’ leaks here; was allocated at (1)
> │
>
> Some concessions were made to
> simplify the analysis process when comparing kf_PyList_Append with the
> real implementation. In particular, PyList_Append performs some
> optimization internally to try and avoid calls to realloc if
> possible. For simplicity, we assume that realloc is called every time.
> Also, we grow the size by just 1 (to ensure enough space for adding a
> new element) rather than abide by the heuristics that the actual 
> implementation
> follows.
>
> gcc/analyzer/ChangeLog:
> PR analyzer/107646
> * call-details.h: New function.
> * region-model.cc (region_model::get_or_create_region_for_heap_alloc):
> New optional parameters.
> * region-model.h (class region_model): New optional parameters.
> * sm-malloc.cc (on_realloc_with_move): New function.
> (region_model::transition_ptr_sval_non_null): New function.
>
> gcc/testsuite/ChangeLog:
> PR analyzer/107646
> * gcc.dg/plugin/analyzer_cpython_plugin.c: Analyzer support for
> PyList_New, PyList_Append, PyLong_FromLong
> * gcc.dg/plugin/plugin.exp: New test.
> * lib/target-supports.exp: New procedure.
> * gcc.dg/plugin/cpython-plugin-test-2.c: New test.
>
> Signed-off-by: Eric Feng 
> ---
>  gcc/analyzer/call-details.h   |   4 +
>  gcc/analyzer/region-model.cc  |  17 +-
>  gcc/analyzer/region-model.h   |  14 +-
>  gcc/analyzer/sm-malloc.cc |  42 +
>  .../gcc.dg/plugin/analyzer_cpython_plugin.c   | 722 ++
>  .../gcc.dg/plugin/cpython-plugin-test-2.c |  78 ++
>  gcc/testsuite/gcc.dg/plugin/plugin.exp|   3 +-
>  gcc/testsuite/lib/target-supports.exp |  25 +
>  8 files changed, 899 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c
>
> 

[COMMITTED] analyzer: More features for CPython analyzer plugin [PR107646]

2023-08-11 Thread Eric Feng via Gcc-patches
Thanks for the feedback! I've incorporated the changes (aside from
expanding test coverage, which I plan on releasing in a follow-up),
rebased, and performed a bootstrap and regtest on
aarch64-unknown-linux-gnu. Since you mentioned that it is good for trunk
with nits fixed and no problems after rebase, the patch has now been pushed. 

Best,
Eric

---

This patch adds known function subclasses for Python/C API functions
PyList_New, PyLong_FromLong, and PyList_Append. It also adds new
optional parameters for
region_model::get_or_create_region_for_heap_alloc, allowing for the
newly allocated region to immediately transition from the start state to
the assumed non-null state in the malloc state machine if desired.
Finally, it adds a new procedure, dg-require-python-h, intended as a
directive in Python-related analyzer tests, to append necessary Python
flags during the tests' build process.

The main warnings we gain in this patch with respect to the known function
subclasses mentioned are leak related. For example:

rc3.c: In function ‘create_py_object’:
│
rc3.c:21:10: warning: leak of ‘item’ [CWE-401] [-Wanalyzer-malloc-leak]
│
   21 |   return list;
  │
  |  ^~~~
│
  ‘create_py_object’: events 1-4
│
|
│
|4 |   PyObject* item = PyLong_FromLong(10);
│
|  |^~~
│
|  ||
│
|  |(1) allocated here
│
|  |(2) when ‘PyLong_FromLong’ succeeds
│
|5 |   PyObject* list = PyList_New(2);
│
|  |~
│
|  ||
│
|  |(3) when ‘PyList_New’ fails
│
|..
│
|   21 |   return list;
│
|  |  
│
|  |  |
│
|  |  (4) ‘item’ leaks here; was allocated at (1)
│

Some concessions were made to
simplify the analysis process when comparing kf_PyList_Append with the
real implementation. In particular, PyList_Append performs some
optimization internally to try and avoid calls to realloc if
possible. For simplicity, we assume that realloc is called every time.
Also, we grow the size by just 1 (to ensure enough space for adding a
new element) rather than abide by the heuristics that the actual implementation
follows.

gcc/analyzer/ChangeLog:
PR analyzer/107646
* call-details.h: New function.
* region-model.cc (region_model::get_or_create_region_for_heap_alloc):
New optional parameters.
* region-model.h (class region_model): New optional parameters.
* sm-malloc.cc (on_realloc_with_move): New function.
(region_model::transition_ptr_sval_non_null): New function.

gcc/testsuite/ChangeLog:
PR analyzer/107646
* gcc.dg/plugin/analyzer_cpython_plugin.c: Analyzer support for
PyList_New, PyList_Append, PyLong_FromLong
* gcc.dg/plugin/plugin.exp: New test.
* lib/target-supports.exp: New procedure.
* gcc.dg/plugin/cpython-plugin-test-2.c: New test.

Signed-off-by: Eric Feng 
---
 gcc/analyzer/call-details.h   |   4 +
 gcc/analyzer/region-model.cc  |  17 +-
 gcc/analyzer/region-model.h   |  14 +-
 gcc/analyzer/sm-malloc.cc |  42 +
 .../gcc.dg/plugin/analyzer_cpython_plugin.c   | 722 ++
 .../gcc.dg/plugin/cpython-plugin-test-2.c |  78 ++
 gcc/testsuite/gcc.dg/plugin/plugin.exp|   3 +-
 gcc/testsuite/lib/target-supports.exp |  25 +
 8 files changed, 899 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c

diff --git a/gcc/analyzer/call-details.h b/gcc/analyzer/call-details.h
index 24be2247e63..bf2601151ea 100644
--- a/gcc/analyzer/call-details.h
+++ b/gcc/analyzer/call-details.h
@@ -49,6 +49,10 @@ public:
 return POINTER_TYPE_P (get_arg_type (idx));
   }
   bool arg_is_size_p (unsigned idx) const;
+  bool arg_is_integral_p (unsigned idx) const
+  {
+return INTEGRAL_TYPE_P (get_arg_type (idx));
+  }
 
   const gcall *get_call_stmt () const { return m_call; }
   location_t get_location () const;
diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 094b7af3dbc..aa9fe008b9d 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -4991,11 +4991,16 @@ region_model::check_dynamic_size_for_floats (const 
svalue *size_in_bytes,
Use CTXT to complain about tainted sizes.
 
Reuse an existing heap_allocated_region if it's not being referenced by
-   this region_model; otherwise create a new one.  */
+   this region_model; otherwise create a new one.
+
+   Optionally (update_state_machine) transitions the pointer pointing to the
+   heap_allocated_region from start to assumed non-null.  */
 
 const region *
 region_model::get_or_create_region_for_heap_alloc (const svalue *size_in_bytes,
-  

[COMMITTED] MAINTAINERS: Add myself to write after approval

2023-08-11 Thread Eric Feng via Gcc-patches
ChangeLog:

* MAINTAINERS: Add myself.

Signed-off-by: Eric Feng 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 1e54844c905..7a3ad68bc42 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -411,6 +411,7 @@ Chris Fairles   

 Alessandro Fanfarillo  
 Changpeng Fang 
 Sam Feifer 
+Eric Feng  
 Li Feng
 Thomas Fitzsimmons 
 Alexander Fomin

-- 
2.30.2



[PATCH v2] analyzer: More features for CPython analyzer plugin [PR107646]

2023-08-09 Thread Eric Feng via Gcc-patches
Thank you for your help in getting dg-require-python-h working! I can
confirm that the FAILs are related to differences between the --cflags
affecting the gimple seen by the analyzer. For this reason, I have
changed it to --includes for now. To be sure, I tested on Python 3.8 as
well and it works as expected. I have also addressed the following
comments on the WIP patch as you described.

-- Update Changelog entry to list new functions being simulated.
-- Update region_model::get_or_create_region_for_heap_alloc leading
comment.
-- Change register_alloc to update_state_machine.
-- Change move_ptr_sval_non_null to transition_ptr_sval_non_null.
-- Static helper functions for:
-- Initializing ob_refcnt field.
-- Setting ob_type field.
-- Getting ob_base field.
-- Initializing heap allocated region for PyObjects.
-- Incrementing a field by one.
-- Change arg_is_long_p to arg_is_integral_p.
-- Extract common failure scenario for reusability.

The initial WIP patch using 

/* { dg-options "-fanalyzer -I/usr/include/python3.9" }. */

have been bootstrapped and regtested on aarch64-unknown-linux-gnu. Since
we did not change any core logic in the revision and the only changes
within the analyzer core are changing variable names, is it OK for
trunk. In the mean time, the revised patch is currently going through
bootstrap and regtest process.

Best,
Eric

---
This patch adds known function subclasses for Python/C API functions
PyList_New, PyLong_FromLong, and PyList_Append. It also adds new
optional parameters for
region_model::get_or_create_region_for_heap_alloc, allowing for the
newly allocated region to immediately transition from the start state to
the assumed non-null state in the malloc state machine if desired.
Finally, it adds a new procedure, dg-require-python-h, intended as a
directive in Python-related analyzer tests, to append necessary Python
flags during the tests' build process.

The main warnings we gain in this patch with respect to the known function
subclasses mentioned are leak related. For example:

rc3.c: In function ‘create_py_object’:
│
rc3.c:21:10: warning: leak of ‘item’ [CWE-401] [-Wanalyzer-malloc-leak]
│
   21 |   return list;
  │
  |  ^~~~
│
  ‘create_py_object’: events 1-4
│
|
│
|4 |   PyObject* item = PyLong_FromLong(10);
│
|  |^~~
│
|  ||
│
|  |(1) allocated here
│
|  |(2) when ‘PyLong_FromLong’ succeeds
│
|5 |   PyObject* list = PyList_New(2);
│
|  |~
│
|  ||
│
|  |(3) when ‘PyList_New’ fails
│
|..
│
|   21 |   return list;
│
|  |  
│
|  |  |
│
|  |  (4) ‘item’ leaks here; was allocated at (1)
│

Some concessions were made to
simplify the analysis process when comparing kf_PyList_Append with the
real implementation. In particular, PyList_Append performs some
optimization internally to try and avoid calls to realloc if
possible. For simplicity, we assume that realloc is called every time.
Also, we grow the size by just 1 (to ensure enough space for adding a
new element) rather than abide by the heuristics that the actual implementation
follows.

gcc/analyzer/ChangeLog:
PR analyzer/107646
* region-model.cc (region_model::get_or_create_region_for_heap_alloc):
New optional parameters.
* region-model.h (class region_model): New optional parameters.
* sm-malloc.cc (on_realloc_with_move): New function.
(region_model::transition_ptr_sval_non_null): New function.

gcc/testsuite/ChangeLog:
PR analyzer/107646
* gcc.dg/plugin/analyzer_cpython_plugin.c: Analyzer support for
PyList_New, PyList_Append, PyLong_FromLong
* gcc.dg/plugin/plugin.exp: New test.
* lib/target-supports.exp: New procedure.
* gcc.dg/plugin/cpython-plugin-test-2.c: New test.

Signed-off-by: Eric Feng 
---
 gcc/analyzer/region-model.cc  |  20 +-
 gcc/analyzer/region-model.h   |  10 +-
 gcc/analyzer/sm-malloc.cc |  40 +
 .../gcc.dg/plugin/analyzer_cpython_plugin.c   | 711 ++
 .../gcc.dg/plugin/cpython-plugin-test-2.c |  78 ++
 gcc/testsuite/gcc.dg/plugin/plugin.exp|   3 +-
 gcc/testsuite/lib/target-supports.exp |  25 +
 7 files changed, 881 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c

diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index e92b3f7b074..c338f045d92 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -5127,11 +5127,16 @@ region_model::check_dynamic_size_for_floats (const 
svalue *size_in_bytes,
Use CTXT to complain about tainted sizes.
 
Reuse an existing 

Re: [PATCH v2] analyzer: stash values for CPython plugin [PR107646]

2023-08-03 Thread Eric Feng via Gcc-patches
On Wed, Aug 2, 2023 at 5:09 PM David Malcolm  wrote:
>
> On Wed, 2023-08-02 at 14:46 -0400, Eric Feng wrote:
> > On Wed, Aug 2, 2023 at 1:20 PM Marek Polacek 
> > wrote:
> > >
> > > On Wed, Aug 02, 2023 at 12:59:28PM -0400, David Malcolm wrote:
> > > > On Wed, 2023-08-02 at 12:20 -0400, Eric Feng wrote:
> > > >
>
> [Dropping Joseph and Marek from the CC]
>
> [...snip...]
>
> >
> >
> > Thank you, everyone. I've submitted a new patch with the described
> > changes.
>
> Thanks.
>
> > As I do not yet have write access, could someone please help
> > me commit it?
>
> I've pushed the v3 trunk to patch, as r14-2933-gfafe2d18f791c6; you can
> see it at [1], so you're now officially a GCC contributor,
> congratulation!
>
> FWIW I had to do a little whitespace fixing on the ChangeLog entries
> before the server-side hooks.commit-extra-checker would pass, as they
> were indented with spaces, rather than tabs, so it complained thusly:
>
> remote: *** The following commit was rejected by your 
> hooks.commit-extra-checker script (status: 1)
> remote: *** commit: 0a4a2dc7dad1dfe22be0b48fe0d8c50d216c8349
> remote: *** ChangeLog format failed:
> remote: *** ERR: line should start with a tab: "PR analyzer/107646"
> remote: *** ERR: line should start with a tab: "* 
> analyzer-language.cc (run_callbacks): New function."
> remote: *** ERR: line should start with a tab: "
> (on_finish_translation_unit): New function."
> remote: *** ERR: line should start with a tab: "* analyzer-language.h 
> (GCC_ANALYZER_LANGUAGE_H): New include."
> remote: *** ERR: line should start with a tab: "(class 
> translation_unit): New vfuncs."
> remote: *** ERR: line should start with a tab: "PR analyzer/107646"
> remote: *** ERR: line should start with a tab: "* c-parser.cc: New 
> functions on stashing values for the"
> remote: *** ERR: line should start with a tab: "  analyzer."
> remote: *** ERR: line should start with a tab: "PR analyzer/107646"
> remote: *** ERR: line should start with a tab: "* 
> gcc.dg/plugin/plugin.exp: Add new plugin and test."
> remote: *** ERR: line should start with a tab: "* 
> gcc.dg/plugin/analyzer_cpython_plugin.c: New plugin."
> remote: *** ERR: line should start with a tab: "* 
> gcc.dg/plugin/cpython-plugin-test-1.c: New test."
> remote: *** ERR: PR 107646 in subject but not in changelog: "analyzer: stash 
> values for CPython plugin [PR107646]"
> remote: ***
> remote: *** Please see: https://gcc.gnu.org/codingconventions.html#ChangeLogs
> remote: ***
> remote: error: hook declined to update refs/heads/master
> To git+ssh://gcc.gnu.org/git/gcc.git
>  ! [remote rejected] master -> master (hook declined)
> error: failed to push some refs to 
> 'git+ssh://dmalc...@gcc.gnu.org/git/gcc.git'
>
> ...but this was a trivial fix.  You can test that patches are properly
> formatted by running:
>
>   ./contrib/gcc-changelog/git_check_commit.py HEAD
>
> locally.
Sorry about that — will do. Thanks!
>
>
> >  Otherwise, please let me know if I should request write
> > access first (the GettingStarted page suggested requesting someone
> > commit the patch for the first few patches before requesting write
> > access).
>
> Please go ahead and request write access now; we should have done this
> in the "community bonding" phase of GSoC; sorry for not catching this.
Sounds good.
>
> Thanks again for the patch.  How's the followup work?  Are you close to
> being able to post one or more of the simpler known_function
> subclasses?
Yes, I will submit another patch for review very soon. Thank you for
helping me push this one!

Best,
Eric
>
> Dave
>
> [1]
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=fafe2d18f791c6b97b49af7c84b1b5703681c3af
>


Re: [PATCH v2] analyzer: stash values for CPython plugin [PR107646]

2023-08-02 Thread Eric Feng via Gcc-patches
On Wed, Aug 2, 2023 at 1:20 PM Marek Polacek  wrote:
>
> On Wed, Aug 02, 2023 at 12:59:28PM -0400, David Malcolm wrote:
> > On Wed, 2023-08-02 at 12:20 -0400, Eric Feng wrote:
> >
> > Hi Eric, thanks for the updated patch.
> >
> > Overall, looks good to me, although I'd drop the "Exited." from the
> > "sorry" message (and thus from the dg-message directive), since the
> > compiler is not exiting, it's just the particular plugin that's giving
> > up (but let's not hold up the patch with a "bikeshed" discussion on the
> > precise wording).
> >
> > If Joseph or Marek approves the C parts of the patch, this will be OK
> > to push to trunk.
>
Sounds good. Revised.
>
> > > index cf82b0306d1..617111b0f0a 100644
> > > --- a/gcc/c/c-parser.cc
> > > +++ b/gcc/c/c-parser.cc
> > > @@ -1695,6 +1695,32 @@ public:
> > >  return NULL_TREE;
> > >}
> > >
> > > +  tree
> > > +  lookup_type_by_id (tree id) const final override
> > > +  {
> > > +if (tree type_decl = lookup_name (id))
> > > +  {
> > > +   if (TREE_CODE (type_decl) == TYPE_DECL)
> > > + {
> > > +   tree record_type = TREE_TYPE (type_decl);
> > > +   if (TREE_CODE (record_type) == RECORD_TYPE)
> > > + return record_type;
> > > + }
> > > +  }
>
> I'd drop this set of { }, like below.  OK with that adjusted, thanks.
Sounds good — fixed.
>
> > > +
> > > +return NULL_TREE;
> > > +  }
> > > +
> > > +  tree
> > > +  lookup_global_var_by_id (tree id) const final override
> > > +  {
> > > +if (tree var_decl = lookup_name (id))
> > > +  if (TREE_CODE (var_decl) == VAR_DECL)
> > > +   return var_decl;
> > > +
> > > +return NULL_TREE;
> > > +  }
> > > +
> > >  private:
> > >/* Attempt to get an INTEGER_CST from MACRO.
> > >   Only handle the simplest cases: where MACRO's definition is a
>
> Marek
>

Thank you, everyone. I've submitted a new patch with the described
changes. As I do not yet have write access, could someone please help
me commit it? Otherwise, please let me know if I should request write
access first (the GettingStarted page suggested requesting someone
commit the patch for the first few patches before requesting write
access).

Best,
Eric


[PATCH v3] analyzer: stash values for CPython plugin [PR107646]

2023-08-02 Thread Eric Feng via Gcc-patches
Revised:
-- Remove superfluous { }
-- Reword diagnostic

---

This patch adds a hook to the end of ana::on_finish_translation_unit
which calls relevant stashing-related callbacks registered during plugin
initialization. This feature is used to stash named types and global
variables for a CPython analyzer plugin [PR107646].

gcc/analyzer/ChangeLog:
PR analyzer/107646
* analyzer-language.cc (run_callbacks): New function.
(on_finish_translation_unit): New function.
* analyzer-language.h (GCC_ANALYZER_LANGUAGE_H): New include.
(class translation_unit): New vfuncs.

gcc/c/ChangeLog:
PR analyzer/107646
* c-parser.cc: New functions on stashing values for the
  analyzer.

gcc/testsuite/ChangeLog:
PR analyzer/107646
* gcc.dg/plugin/plugin.exp: Add new plugin and test.
* gcc.dg/plugin/analyzer_cpython_plugin.c: New plugin.
* gcc.dg/plugin/cpython-plugin-test-1.c: New test.

Signed-off-by: Eric Feng 
---
 gcc/analyzer/analyzer-language.cc |  22 ++
 gcc/analyzer/analyzer-language.h  |   9 +
 gcc/c/c-parser.cc |  24 ++
 .../gcc.dg/plugin/analyzer_cpython_plugin.c   | 230 ++
 .../gcc.dg/plugin/cpython-plugin-test-1.c |   8 +
 gcc/testsuite/gcc.dg/plugin/plugin.exp|   2 +
 6 files changed, 295 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-1.c

diff --git a/gcc/analyzer/analyzer-language.cc 
b/gcc/analyzer/analyzer-language.cc
index 2c8910906ee..85400288a93 100644
--- a/gcc/analyzer/analyzer-language.cc
+++ b/gcc/analyzer/analyzer-language.cc
@@ -35,6 +35,26 @@ static GTY (()) hash_map  
*analyzer_stashed_constants;
 #if ENABLE_ANALYZER
 
 namespace ana {
+static vec
+*finish_translation_unit_callbacks;
+
+void
+register_finish_translation_unit_callback (
+finish_translation_unit_callback callback)
+{
+  if (!finish_translation_unit_callbacks)
+vec_alloc (finish_translation_unit_callbacks, 1);
+  finish_translation_unit_callbacks->safe_push (callback);
+}
+
+static void
+run_callbacks (logger *logger, const translation_unit )
+{
+  for (auto const  : finish_translation_unit_callbacks)
+{
+  cb (logger, tu);
+}
+}
 
 /* Call into TU to try to find a value for NAME.
If found, stash its value within analyzer_stashed_constants.  */
@@ -102,6 +122,8 @@ on_finish_translation_unit (const translation_unit )
 the_logger.set_logger (new logger (logfile, 0, 0,
   *global_dc->printer));
   stash_named_constants (the_logger.get_logger (), tu);
+
+  run_callbacks (the_logger.get_logger (), tu);
 }
 
 /* Lookup NAME in the named constants stashed when the frontend TU finished.
diff --git a/gcc/analyzer/analyzer-language.h b/gcc/analyzer/analyzer-language.h
index 00f85aba041..8deea52d627 100644
--- a/gcc/analyzer/analyzer-language.h
+++ b/gcc/analyzer/analyzer-language.h
@@ -21,6 +21,8 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_ANALYZER_LANGUAGE_H
 #define GCC_ANALYZER_LANGUAGE_H
 
+#include "analyzer/analyzer-logging.h"
+
 #if ENABLE_ANALYZER
 
 namespace ana {
@@ -35,8 +37,15 @@ class translation_unit
  have been seen).  If it is defined and an integer (e.g. either as a
  macro or enum), return the INTEGER_CST value, otherwise return NULL.  */
   virtual tree lookup_constant_by_id (tree id) const = 0;
+  virtual tree lookup_type_by_id (tree id) const = 0;
+  virtual tree lookup_global_var_by_id (tree id) const = 0;
 };
 
+typedef void (*finish_translation_unit_callback)
+   (logger *, const translation_unit &);
+void register_finish_translation_unit_callback (
+finish_translation_unit_callback callback);
+
 /* Analyzer hook for frontends to call at the end of the TU.  */
 
 void on_finish_translation_unit (const translation_unit );
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index cf82b0306d1..a3f216d90f8 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -1695,6 +1695,30 @@ public:
 return NULL_TREE;
   }
 
+  tree
+  lookup_type_by_id (tree id) const final override
+  {
+if (tree type_decl = lookup_name (id))
+   if (TREE_CODE (type_decl) == TYPE_DECL)
+ {
+   tree record_type = TREE_TYPE (type_decl);
+   if (TREE_CODE (record_type) == RECORD_TYPE)
+ return record_type;
+ }
+
+return NULL_TREE;
+  }
+
+  tree
+  lookup_global_var_by_id (tree id) const final override
+  {
+if (tree var_decl = lookup_name (id))
+  if (TREE_CODE (var_decl) == VAR_DECL)
+   return var_decl;
+
+return NULL_TREE;
+  }
+
 private:
   /* Attempt to get an INTEGER_CST from MACRO.
  Only handle the simplest cases: where MACRO's definition is a single
diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c 

Re: [PATCH] analyzer: stash values for CPython plugin [PR107646]

2023-08-02 Thread Eric Feng via Gcc-patches
Hi Dave,

Thank you for the feedback! I've incorporated the changes and sent a
revised version of the patch.

On Tue, Aug 1, 2023 at 1:02 PM David Malcolm  wrote:
>
> On Tue, 2023-08-01 at 09:52 -0400, Eric Feng wrote:
> > Hi all,
> >
> > This patch adds a hook to the end of ana::on_finish_translation_unit
> > which calls relevant stashing-related callbacks registered during
> > plugin
> > initialization. This feature is used to stash named types and global
> > variables for a CPython analyzer plugin [PR107646].
> >
> > Bootstrapped and tested on aarch64-unknown-linux-gnu. Does it look
> > okay?
>
> Hi Eric, thanks for the patch.
>
> The patch touches the C frontend, so those parts would need approval
> from the C FE maintainers/reviewers; I've CCed them.
>
> Overall, I like the patch, but it's not ready for trunk yet; various
> comments inline below...
>
> >
> > ---
> >
> > gcc/analyzer/ChangeLog:
>
> You could add: PR analyzer/107646 to these ChangeLog entries; have a
> look at how other ChangeLog entries refer to such bugzilla entries.
>
> >
> > * analyzer-language.cc (run_callbacks): New function.
> > (on_finish_translation_unit): New function.
> > * analyzer-language.h (GCC_ANALYZER_LANGUAGE_H): New include.
> > (class translation_unit): New vfuncs.
> >
> > gcc/c/ChangeLog:
> >
> > * c-parser.cc: New functions.
>
> I think this ChangeLog entry needs more detail.
Added in revised version of the patch.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.dg/plugin/analyzer_cpython_plugin.c: New test.
> >
> > Signed-off-by: Eric Feng 
> > ---
> >  gcc/analyzer/analyzer-language.cc |  22 ++
> >  gcc/analyzer/analyzer-language.h  |   9 +
> >  gcc/c/c-parser.cc |  26 ++
> >  .../gcc.dg/plugin/analyzer_cpython_plugin.c   | 224
> > ++
> >  4 files changed, 281 insertions(+)
> >  create mode 100644
> > gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> >
> > diff --git a/gcc/analyzer/analyzer-language.cc
> > b/gcc/analyzer/analyzer-language.cc
> > index 2c8910906ee..fc41b9c17b8 100644
> > --- a/gcc/analyzer/analyzer-language.cc
> > +++ b/gcc/analyzer/analyzer-language.cc
> > @@ -35,6 +35,26 @@ static GTY (()) hash_map 
> > *analyzer_stashed_constants;
> >  #if ENABLE_ANALYZER
> >
> >  namespace ana {
> > +static vec
> > +*finish_translation_unit_callbacks;
> > +
> > +void
> > +register_finish_translation_unit_callback (
> > +finish_translation_unit_callback callback)
> > +{
> > +  if (!finish_translation_unit_callbacks)
> > +vec_alloc (finish_translation_unit_callbacks, 1);
> > +  finish_translation_unit_callbacks->safe_push (callback);
> > +}
> > +
> > +void
> > +run_callbacks (logger *logger, const translation_unit )
>
> This function could be "static" since it's not needed outside of
> analyzer-language.cc
>
> > +{
> > +  for (auto const  : finish_translation_unit_callbacks)
> > +{
> > +  cb (logger, tu);
> > +}
> > +}
> >
> >  /* Call into TU to try to find a value for NAME.
> > If found, stash its value within analyzer_stashed_constants.  */
> > @@ -102,6 +122,8 @@ on_finish_translation_unit (const
> > translation_unit )
> >  the_logger.set_logger (new logger (logfile, 0, 0,
> >  *global_dc->printer));
> >stash_named_constants (the_logger.get_logger (), tu);
> > +
> > +  run_callbacks (the_logger.get_logger (), tu);
> >  }
> >
> >  /* Lookup NAME in the named constants stashed when the frontend TU
> > finished.
> > diff --git a/gcc/analyzer/analyzer-language.h
> > b/gcc/analyzer/analyzer-language.h
> > index 00f85aba041..8deea52d627 100644
> > --- a/gcc/analyzer/analyzer-language.h
> > +++ b/gcc/analyzer/analyzer-language.h
> > @@ -21,6 +21,8 @@ along with GCC; see the file COPYING3.  If not see
> >  #ifndef GCC_ANALYZER_LANGUAGE_H
> >  #define GCC_ANALYZER_LANGUAGE_H
> >
> > +#include "analyzer/analyzer-logging.h"
> > +
> >  #if ENABLE_ANALYZER
> >
> >  namespace ana {
> > @@ -35,8 +37,15 @@ class translation_unit
> >   have been seen).  If it is defined and an integer (e.g. either
> > as a
> >   macro or enum), return the INTEGER_CST value, otherwise return
> > NULL.  */
> >virtual tree lookup_constant_by_id (tree id) const = 0;
> > +  virtual tree lookup_type_by_id (tree id) const = 0;
> > +  virtual tree lookup_global_var_by_id (tree id) const = 0;
> >  };
> >
> > +typedef void (*finish_translation_unit_callback)
> > +   (logger *, const translation_unit &);
> > +void register_finish_translation_unit_callback (
> > +finish_translation_unit_callback callback);
> > +
> >  /* Analyzer hook for frontends to call at the end of the TU.  */
> >
> >  void on_finish_translation_unit (const translation_unit );
> > diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
> > index 80920b31f83..f0ee55e416b 100644
> > --- a/gcc/c/c-parser.cc
> > +++ b/gcc/c/c-parser.cc
> > @@ -1695,6 +1695,32 @@ public:
> >  return NULL_TREE;
> >}
> >
> > 

[PATCH v2] analyzer: stash values for CPython plugin [PR107646]

2023-08-02 Thread Eric Feng via Gcc-patches
Revised:
-- Fix indentation problems
-- Add more detail to Changelog
-- Add new test on handling non-CPython code case
-- Turn off debugging inform by default
-- Make on_finish_translation_unit() static
-- Remove superfluous null checks in init_py_structs()

Changes have been bootstrapped and tested against trunk on 
aarch64-unknown-linux-gnu.

---
This patch adds a hook to the end of ana::on_finish_translation_unit
which calls relevant stashing-related callbacks registered during plugin
initialization. This feature is used to stash named types and global
variables for a CPython analyzer plugin [PR107646].

gcc/analyzer/ChangeLog:
PR analyzer/107646
* analyzer-language.cc (run_callbacks): New function.
(on_finish_translation_unit): New function.
* analyzer-language.h (GCC_ANALYZER_LANGUAGE_H): New include.
(class translation_unit): New vfuncs.

gcc/c/ChangeLog:
PR analyzer/107646
* c-parser.cc: New functions on stashing values for the
  analyzer.

gcc/testsuite/ChangeLog:
PR analyzer/107646
* gcc.dg/plugin/plugin.exp: Add new plugin and test.
* gcc.dg/plugin/analyzer_cpython_plugin.c: New plugin.
* gcc.dg/plugin/cpython-plugin-test-1.c: New test.

Signed-off-by: Eric Feng 
---
 gcc/analyzer/analyzer-language.cc |  22 ++
 gcc/analyzer/analyzer-language.h  |   9 +
 gcc/c/c-parser.cc |  26 ++
 .../gcc.dg/plugin/analyzer_cpython_plugin.c   | 230 ++
 .../gcc.dg/plugin/cpython-plugin-test-1.c |   8 +
 gcc/testsuite/gcc.dg/plugin/plugin.exp|   2 +
 6 files changed, 297 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-1.c

diff --git a/gcc/analyzer/analyzer-language.cc 
b/gcc/analyzer/analyzer-language.cc
index 2c8910906ee..85400288a93 100644
--- a/gcc/analyzer/analyzer-language.cc
+++ b/gcc/analyzer/analyzer-language.cc
@@ -35,6 +35,26 @@ static GTY (()) hash_map  
*analyzer_stashed_constants;
 #if ENABLE_ANALYZER
 
 namespace ana {
+static vec
+*finish_translation_unit_callbacks;
+
+void
+register_finish_translation_unit_callback (
+finish_translation_unit_callback callback)
+{
+  if (!finish_translation_unit_callbacks)
+vec_alloc (finish_translation_unit_callbacks, 1);
+  finish_translation_unit_callbacks->safe_push (callback);
+}
+
+static void
+run_callbacks (logger *logger, const translation_unit )
+{
+  for (auto const  : finish_translation_unit_callbacks)
+{
+  cb (logger, tu);
+}
+}
 
 /* Call into TU to try to find a value for NAME.
If found, stash its value within analyzer_stashed_constants.  */
@@ -102,6 +122,8 @@ on_finish_translation_unit (const translation_unit )
 the_logger.set_logger (new logger (logfile, 0, 0,
   *global_dc->printer));
   stash_named_constants (the_logger.get_logger (), tu);
+
+  run_callbacks (the_logger.get_logger (), tu);
 }
 
 /* Lookup NAME in the named constants stashed when the frontend TU finished.
diff --git a/gcc/analyzer/analyzer-language.h b/gcc/analyzer/analyzer-language.h
index 00f85aba041..8deea52d627 100644
--- a/gcc/analyzer/analyzer-language.h
+++ b/gcc/analyzer/analyzer-language.h
@@ -21,6 +21,8 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_ANALYZER_LANGUAGE_H
 #define GCC_ANALYZER_LANGUAGE_H
 
+#include "analyzer/analyzer-logging.h"
+
 #if ENABLE_ANALYZER
 
 namespace ana {
@@ -35,8 +37,15 @@ class translation_unit
  have been seen).  If it is defined and an integer (e.g. either as a
  macro or enum), return the INTEGER_CST value, otherwise return NULL.  */
   virtual tree lookup_constant_by_id (tree id) const = 0;
+  virtual tree lookup_type_by_id (tree id) const = 0;
+  virtual tree lookup_global_var_by_id (tree id) const = 0;
 };
 
+typedef void (*finish_translation_unit_callback)
+   (logger *, const translation_unit &);
+void register_finish_translation_unit_callback (
+finish_translation_unit_callback callback);
+
 /* Analyzer hook for frontends to call at the end of the TU.  */
 
 void on_finish_translation_unit (const translation_unit );
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index cf82b0306d1..617111b0f0a 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -1695,6 +1695,32 @@ public:
 return NULL_TREE;
   }
 
+  tree
+  lookup_type_by_id (tree id) const final override
+  {
+if (tree type_decl = lookup_name (id))
+  {
+   if (TREE_CODE (type_decl) == TYPE_DECL)
+ {
+   tree record_type = TREE_TYPE (type_decl);
+   if (TREE_CODE (record_type) == RECORD_TYPE)
+ return record_type;
+ }
+  }
+
+return NULL_TREE;
+  }
+
+  tree
+  lookup_global_var_by_id (tree id) const final override
+  {
+if (tree var_decl = lookup_name (id))
+  if (TREE_CODE (var_decl) == VAR_DECL)
+   return 

[PATCH] analyzer: stash values for CPython plugin [PR107646]

2023-08-01 Thread Eric Feng via Gcc-patches
Hi all,

This patch adds a hook to the end of ana::on_finish_translation_unit
which calls relevant stashing-related callbacks registered during plugin
initialization. This feature is used to stash named types and global
variables for a CPython analyzer plugin [PR107646].

Bootstrapped and tested on aarch64-unknown-linux-gnu. Does it look okay?

---

gcc/analyzer/ChangeLog:

* analyzer-language.cc (run_callbacks): New function.
(on_finish_translation_unit): New function.
* analyzer-language.h (GCC_ANALYZER_LANGUAGE_H): New include.
(class translation_unit): New vfuncs.

gcc/c/ChangeLog:

* c-parser.cc: New functions.

gcc/testsuite/ChangeLog:

* gcc.dg/plugin/analyzer_cpython_plugin.c: New test.

Signed-off-by: Eric Feng 
---
 gcc/analyzer/analyzer-language.cc |  22 ++
 gcc/analyzer/analyzer-language.h  |   9 +
 gcc/c/c-parser.cc |  26 ++
 .../gcc.dg/plugin/analyzer_cpython_plugin.c   | 224 ++
 4 files changed, 281 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c

diff --git a/gcc/analyzer/analyzer-language.cc
b/gcc/analyzer/analyzer-language.cc
index 2c8910906ee..fc41b9c17b8 100644
--- a/gcc/analyzer/analyzer-language.cc
+++ b/gcc/analyzer/analyzer-language.cc
@@ -35,6 +35,26 @@ static GTY (()) hash_map 
*analyzer_stashed_constants;
 #if ENABLE_ANALYZER

 namespace ana {
+static vec
+*finish_translation_unit_callbacks;
+
+void
+register_finish_translation_unit_callback (
+finish_translation_unit_callback callback)
+{
+  if (!finish_translation_unit_callbacks)
+vec_alloc (finish_translation_unit_callbacks, 1);
+  finish_translation_unit_callbacks->safe_push (callback);
+}
+
+void
+run_callbacks (logger *logger, const translation_unit )
+{
+  for (auto const  : finish_translation_unit_callbacks)
+{
+  cb (logger, tu);
+}
+}

 /* Call into TU to try to find a value for NAME.
If found, stash its value within analyzer_stashed_constants.  */
@@ -102,6 +122,8 @@ on_finish_translation_unit (const translation_unit )
 the_logger.set_logger (new logger (logfile, 0, 0,
 *global_dc->printer));
   stash_named_constants (the_logger.get_logger (), tu);
+
+  run_callbacks (the_logger.get_logger (), tu);
 }

 /* Lookup NAME in the named constants stashed when the frontend TU finished.
diff --git a/gcc/analyzer/analyzer-language.h b/gcc/analyzer/analyzer-language.h
index 00f85aba041..8deea52d627 100644
--- a/gcc/analyzer/analyzer-language.h
+++ b/gcc/analyzer/analyzer-language.h
@@ -21,6 +21,8 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_ANALYZER_LANGUAGE_H
 #define GCC_ANALYZER_LANGUAGE_H

+#include "analyzer/analyzer-logging.h"
+
 #if ENABLE_ANALYZER

 namespace ana {
@@ -35,8 +37,15 @@ class translation_unit
  have been seen).  If it is defined and an integer (e.g. either as a
  macro or enum), return the INTEGER_CST value, otherwise return NULL.  */
   virtual tree lookup_constant_by_id (tree id) const = 0;
+  virtual tree lookup_type_by_id (tree id) const = 0;
+  virtual tree lookup_global_var_by_id (tree id) const = 0;
 };

+typedef void (*finish_translation_unit_callback)
+   (logger *, const translation_unit &);
+void register_finish_translation_unit_callback (
+finish_translation_unit_callback callback);
+
 /* Analyzer hook for frontends to call at the end of the TU.  */

 void on_finish_translation_unit (const translation_unit );
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 80920b31f83..f0ee55e416b 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -1695,6 +1695,32 @@ public:
 return NULL_TREE;
   }

+  tree
+  lookup_type_by_id (tree id) const final override
+  {
+if (tree type_decl = lookup_name (id))
+  {
+ if (TREE_CODE (type_decl) == TYPE_DECL)
+ {
+ tree record_type = TREE_TYPE (type_decl);
+ if (TREE_CODE (record_type) == RECORD_TYPE)
+ return record_type;
+ }
+  }
+
+return NULL_TREE;
+  }
+
+  tree
+  lookup_global_var_by_id (tree id) const final override
+  {
+if (tree var_decl = lookup_name (id))
+  if (TREE_CODE (var_decl) == VAR_DECL)
+ return var_decl;
+
+return NULL_TREE;
+  }
+
 private:
   /* Attempt to get an INTEGER_CST from MACRO.
  Only handle the simplest cases: where MACRO's definition is a single
diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
new file mode 100644
index 000..285da102edb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
@@ -0,0 +1,224 @@
+/* -fanalyzer plugin for CPython extension modules  */
+/* { dg-options "-g" } */
+
+#define INCLUDE_MEMORY
+#include "gcc-plugin.h"
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "function.h"
+#include "basic-block.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "diagnostic-core.h"
+#include