Re: Update and Questions on CPython Extension Module -fanalyzer plugin development

2023-08-07 Thread David Malcolm via Gcc
On Mon, 2023-08-07 at 14:31 -0400, Eric Feng wrote:
> On Fri, Aug 4, 2023 at 6:46 PM David Malcolm 
> wrote:
> > 
> > On Fri, 2023-08-04 at 18:42 -0400, David Malcolm wrote:
> > > On Fri, 2023-08-04 at 16:48 -0400, Eric Feng wrote:
> > > > On Fri, Aug 4, 2023 at 11:39 AM David Malcolm
> > > > 
> > > > wrote:
> > > > > 
> > > > > On Fri, 2023-08-04 at 11:02 -0400, Eric Feng wrote:
> > > > > > Hi Dave,
> > > > > > 
> > > > > > Tests related to our plugin which depend on Python-specific
> > > > > > definitions have been run by including /* { dg-options "-
> > > > > > fanalyzer
> > > > > > -I/usr/include/python3.9" } */. This is undoubtedly not
> > > > > > ideal;
> > > > > > is
> > > > > > it
> > > > > > best to approach this problem by adapting a subset of
> > > > > > relevant
> > > > > > definitions like in gil.h?
> > > > > 
> > > > > That might be acceptable in the very short-term, but to
> > > > > create a
> > > > > plugin
> > > > > that's useful to end-user (authors of CPython extension
> > > > > modules)
> > > > > we
> > > > > want to be testing against real Python headers.
> > > > > 
> > > > > As I understand it, https://peps.python.org/pep-0394/ allows
> > > > > for
> > > > > distributors of Python to symlink "python3-config" in the
> > > > > PATH to
> > > > > a
> > > > > python3.X-config script (for some X).
> > > > > 
> > > > > So on such systems running:
> > > > >   python3-config --includes
> > > > > should emit the correct -I option.  On my box it emits:
> > > > > 
> > > > > -I/usr/include/python3.8 -I/usr/include/python3.8
> > > > > 
> > > > > 
> > > > > It's more complicated, but I believe:
> > > > >   python3-config --cflags
> > > > > should emit the build flags that C/C++ extensions ought to
> > > > > use
> > > > > when
> > > > > building.  On my box this emits:
> > > > > 
> > > > > -I/usr/include/python3.8 -I/usr/include/python3.8  -Wno-
> > > > > unused-
> > > > > result -
> > > > > Wsign-compare  -O2 -g -pipe -Wall -Werror=format-security -
> > > > > Wp,-
> > > > > D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -
> > > > > fstack-
> > > > > protector-strong -grecord-gcc-switches   -m64 -mtune=generic
> > > > > -
> > > > > fasynchronous-unwind-tables -fstack-clash-protection -fcf-
> > > > > protection -
> > > > > D_GNU_SOURCE -fPIC -fwrapv  -DDYNAMIC_ANNOTATIONS_ENABLED=1 -
> > > > > DNDEBUG  -
> > > > > O2 -g -pipe -Wall -Werror=format-security -Wp,-
> > > > > D_FORTIFY_SOURCE=2
> > > > > -
> > > > > Wp,-
> > > > > D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -
> > > > > grecord-
> > > > > gcc-switches   -m64 -mtune=generic -fasynchronous-unwind-
> > > > > tables -
> > > > > fstack-clash-protection -fcf-protection -D_GNU_SOURCE -fPIC -
> > > > > fwrapv
> > > > > 
> > > > > and it's likely going to vary from distribution to
> > > > > distribution.
> > > > > Some
> > > > > of those options *are* going to affect the gimple that -
> > > > > fanalyzer
> > > > > "sees".
> > > > > 
> > > > > Does your installation of Python have such a script?
> > > > > 
> > > > > So in the short term you could hack in a minimal subset of
> > > > > the
> > > > > decls/defns from Python.h, but I'd prefer it if target-
> > > > > supports.exp
> > > > > gained a DejaGnu directive that invokes python3-config,
> > > > > captures
> > > > > the
> > > > > result (or fails with UNSUPPORTED for systems without python3
> > > > > development headers), and then adds the result to the build
> > > > > flags
> > > > > of
> > > > > the file being tested.  The .exp files are implemented in
> > > > > Tcl,
> > > > > alas;
> > > > > let me know if you want help with that.
> > > > > 
> > > > > Dave
> > > > Sounds good; thanks! Following existing examples in
> > > > target-supports.exp, the following works as expected in terms
> > > > of
> > > > extracting the build flags we are interested in.
> > > > 
> > > > In target-supports.exp:
> > > > proc check_python_flags { } {
> > > >     set result [remote_exec host "python3-config --cflags"]
> > > >     set status [lindex $result 0]
> > > >     if { $status == 0 } {
> > > >     return [lindex $result 1]
> > > >     } else {
> > > >     return "UNSUPPORTED"
> > > >     }
> > > > }
> > > > 
> > > > However, I'm having some trouble figuring out the specifics as
> > > > to
> > > > how
> > > > we may add the build flags to our test cases. My intuition
> > > > looks
> > > > like
> > > > something like the following:
> > > > 
> > > > In plugin.exp:
> > > > foreach plugin_test $plugin_test_list {
> > > >     if {[lindex $plugin_test 0] eq "analyzer_cpython_plugin.c"}
> > > > {
> > > >     set python_flags [check_python_flags]
> > > >     if { $python_flags ne "UNSUPPORTED" } {
> > > >    // append $python_flags to build flags here
> > > >     }
> > > >     }
> > > > 
> > > > }
> > > > 
> > > > How might we do so?
> > > 
> > > Good question.
> > > 
> > > Looking at plugin.exp I see it uses plugin-test-execute, which is
> > > defined in 

Re: Update and Questions on CPython Extension Module -fanalyzer plugin development

2023-08-07 Thread Eric Feng via Gcc
On Fri, Aug 4, 2023 at 6:46 PM David Malcolm  wrote:
>
> On Fri, 2023-08-04 at 18:42 -0400, David Malcolm wrote:
> > On Fri, 2023-08-04 at 16:48 -0400, Eric Feng wrote:
> > > On Fri, Aug 4, 2023 at 11:39 AM David Malcolm 
> > > wrote:
> > > >
> > > > On Fri, 2023-08-04 at 11:02 -0400, Eric Feng wrote:
> > > > > Hi Dave,
> > > > >
> > > > > Tests related to our plugin which depend on Python-specific
> > > > > definitions have been run by including /* { dg-options "-
> > > > > fanalyzer
> > > > > -I/usr/include/python3.9" } */. This is undoubtedly not ideal;
> > > > > is
> > > > > it
> > > > > best to approach this problem by adapting a subset of relevant
> > > > > definitions like in gil.h?
> > > >
> > > > That might be acceptable in the very short-term, but to create a
> > > > plugin
> > > > that's useful to end-user (authors of CPython extension modules)
> > > > we
> > > > want to be testing against real Python headers.
> > > >
> > > > As I understand it, https://peps.python.org/pep-0394/ allows for
> > > > distributors of Python to symlink "python3-config" in the PATH to
> > > > a
> > > > python3.X-config script (for some X).
> > > >
> > > > So on such systems running:
> > > >   python3-config --includes
> > > > should emit the correct -I option.  On my box it emits:
> > > >
> > > > -I/usr/include/python3.8 -I/usr/include/python3.8
> > > >
> > > >
> > > > It's more complicated, but I believe:
> > > >   python3-config --cflags
> > > > should emit the build flags that C/C++ extensions ought to use
> > > > when
> > > > building.  On my box this emits:
> > > >
> > > > -I/usr/include/python3.8 -I/usr/include/python3.8  -Wno-unused-
> > > > result -
> > > > Wsign-compare  -O2 -g -pipe -Wall -Werror=format-security -Wp,-
> > > > D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -
> > > > fstack-
> > > > protector-strong -grecord-gcc-switches   -m64 -mtune=generic -
> > > > fasynchronous-unwind-tables -fstack-clash-protection -fcf-
> > > > protection -
> > > > D_GNU_SOURCE -fPIC -fwrapv  -DDYNAMIC_ANNOTATIONS_ENABLED=1 -
> > > > DNDEBUG  -
> > > > O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2
> > > > -
> > > > Wp,-
> > > > D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -
> > > > grecord-
> > > > gcc-switches   -m64 -mtune=generic -fasynchronous-unwind-tables -
> > > > fstack-clash-protection -fcf-protection -D_GNU_SOURCE -fPIC -
> > > > fwrapv
> > > >
> > > > and it's likely going to vary from distribution to distribution.
> > > > Some
> > > > of those options *are* going to affect the gimple that -fanalyzer
> > > > "sees".
> > > >
> > > > Does your installation of Python have such a script?
> > > >
> > > > So in the short term you could hack in a minimal subset of the
> > > > decls/defns from Python.h, but I'd prefer it if target-
> > > > supports.exp
> > > > gained a DejaGnu directive that invokes python3-config, captures
> > > > the
> > > > result (or fails with UNSUPPORTED for systems without python3
> > > > development headers), and then adds the result to the build flags
> > > > of
> > > > the file being tested.  The .exp files are implemented in Tcl,
> > > > alas;
> > > > let me know if you want help with that.
> > > >
> > > > Dave
> > > Sounds good; thanks! Following existing examples in
> > > target-supports.exp, the following works as expected in terms of
> > > extracting the build flags we are interested in.
> > >
> > > In target-supports.exp:
> > > proc check_python_flags { } {
> > > set result [remote_exec host "python3-config --cflags"]
> > > set status [lindex $result 0]
> > > if { $status == 0 } {
> > > return [lindex $result 1]
> > > } else {
> > > return "UNSUPPORTED"
> > > }
> > > }
> > >
> > > However, I'm having some trouble figuring out the specifics as to
> > > how
> > > we may add the build flags to our test cases. My intuition looks
> > > like
> > > something like the following:
> > >
> > > In plugin.exp:
> > > foreach plugin_test $plugin_test_list {
> > > if {[lindex $plugin_test 0] eq "analyzer_cpython_plugin.c"} {
> > > set python_flags [check_python_flags]
> > > if { $python_flags ne "UNSUPPORTED" } {
> > >// append $python_flags to build flags here
> > > }
> > > }
> > > 
> > > }
> > >
> > > How might we do so?
> >
> > Good question.
> >
> > Looking at plugin.exp I see it uses plugin-test-execute, which is
> > defined in gcc/testsuite/lib/plugin-support.exp.
> >
> > Looking there, I see it attempts to build the plugin, and then if it
> > succeeds, it calls
> >   dg-runtest $plugin_tests $plugin_enabling_flags $default_flags
> > where $plugin_tests is the list of source files to be compiled using
> > the plugin.  So one way to do this would be to modify that code from
> > plugin.exp to pass in a different value, rather than $default_flags.
> > Though it seems hackish to special-case this.
>
> Sorry, I think I misspoke here; that line that uses $default_flags is

Re: Update and Questions on CPython Extension Module -fanalyzer plugin development

2023-08-04 Thread David Malcolm via Gcc
On Fri, 2023-08-04 at 18:42 -0400, David Malcolm wrote:
> On Fri, 2023-08-04 at 16:48 -0400, Eric Feng wrote:
> > On Fri, Aug 4, 2023 at 11:39 AM David Malcolm 
> > wrote:
> > > 
> > > On Fri, 2023-08-04 at 11:02 -0400, Eric Feng wrote:
> > > > Hi Dave,
> > > > 
> > > > Tests related to our plugin which depend on Python-specific
> > > > definitions have been run by including /* { dg-options "-
> > > > fanalyzer
> > > > -I/usr/include/python3.9" } */. This is undoubtedly not ideal;
> > > > is
> > > > it
> > > > best to approach this problem by adapting a subset of relevant
> > > > definitions like in gil.h?
> > > 
> > > That might be acceptable in the very short-term, but to create a
> > > plugin
> > > that's useful to end-user (authors of CPython extension modules)
> > > we
> > > want to be testing against real Python headers.
> > > 
> > > As I understand it, https://peps.python.org/pep-0394/ allows for
> > > distributors of Python to symlink "python3-config" in the PATH to
> > > a
> > > python3.X-config script (for some X).
> > > 
> > > So on such systems running:
> > >   python3-config --includes
> > > should emit the correct -I option.  On my box it emits:
> > > 
> > > -I/usr/include/python3.8 -I/usr/include/python3.8
> > > 
> > > 
> > > It's more complicated, but I believe:
> > >   python3-config --cflags
> > > should emit the build flags that C/C++ extensions ought to use
> > > when
> > > building.  On my box this emits:
> > > 
> > > -I/usr/include/python3.8 -I/usr/include/python3.8  -Wno-unused-
> > > result -
> > > Wsign-compare  -O2 -g -pipe -Wall -Werror=format-security -Wp,-
> > > D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -
> > > fstack-
> > > protector-strong -grecord-gcc-switches   -m64 -mtune=generic -
> > > fasynchronous-unwind-tables -fstack-clash-protection -fcf-
> > > protection -
> > > D_GNU_SOURCE -fPIC -fwrapv  -DDYNAMIC_ANNOTATIONS_ENABLED=1 -
> > > DNDEBUG  -
> > > O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2
> > > -
> > > Wp,-
> > > D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -
> > > grecord-
> > > gcc-switches   -m64 -mtune=generic -fasynchronous-unwind-tables -
> > > fstack-clash-protection -fcf-protection -D_GNU_SOURCE -fPIC -
> > > fwrapv
> > > 
> > > and it's likely going to vary from distribution to distribution. 
> > > Some
> > > of those options *are* going to affect the gimple that -fanalyzer
> > > "sees".
> > > 
> > > Does your installation of Python have such a script?
> > > 
> > > So in the short term you could hack in a minimal subset of the
> > > decls/defns from Python.h, but I'd prefer it if target-
> > > supports.exp
> > > gained a DejaGnu directive that invokes python3-config, captures
> > > the
> > > result (or fails with UNSUPPORTED for systems without python3
> > > development headers), and then adds the result to the build flags
> > > of
> > > the file being tested.  The .exp files are implemented in Tcl,
> > > alas;
> > > let me know if you want help with that.
> > > 
> > > Dave
> > Sounds good; thanks! Following existing examples in
> > target-supports.exp, the following works as expected in terms of
> > extracting the build flags we are interested in.
> > 
> > In target-supports.exp:
> > proc check_python_flags { } {
> >     set result [remote_exec host "python3-config --cflags"]
> >     set status [lindex $result 0]
> >     if { $status == 0 } {
> >     return [lindex $result 1]
> >     } else {
> >     return "UNSUPPORTED"
> >     }
> > }
> > 
> > However, I'm having some trouble figuring out the specifics as to
> > how
> > we may add the build flags to our test cases. My intuition looks
> > like
> > something like the following:
> > 
> > In plugin.exp:
> > foreach plugin_test $plugin_test_list {
> >     if {[lindex $plugin_test 0] eq "analyzer_cpython_plugin.c"} {
> >     set python_flags [check_python_flags]
> >     if { $python_flags ne "UNSUPPORTED" } {
> >    // append $python_flags to build flags here
> >     }
> >     }
> > 
> > }
> > 
> > How might we do so?
> 
> Good question.
> 
> Looking at plugin.exp I see it uses plugin-test-execute, which is
> defined in gcc/testsuite/lib/plugin-support.exp.
> 
> Looking there, I see it attempts to build the plugin, and then if it
> succeeds, it calls 
>   dg-runtest $plugin_tests $plugin_enabling_flags $default_flags
> where $plugin_tests is the list of source files to be compiled using
> the plugin.  So one way to do this would be to modify that code from
> plugin.exp to pass in a different value, rather than $default_flags. 
> Though it seems hackish to special-case this.

Sorry, I think I misspoke here; that line that uses $default_flags is
from plugin-support.exp, not from plugin.exp; $default_flags is a
global variable.

So I think my 2nd approach below may be the one to try:

> 
> As another way, that avoids adding special-casing to plugin.exp,
> there's an existing directive:
>    dg-additional-options
> implemented in 

Re: Update and Questions on CPython Extension Module -fanalyzer plugin development

2023-08-04 Thread David Malcolm via Gcc
On Fri, 2023-08-04 at 16:48 -0400, Eric Feng wrote:
> On Fri, Aug 4, 2023 at 11:39 AM David Malcolm 
> wrote:
> > 
> > On Fri, 2023-08-04 at 11:02 -0400, Eric Feng wrote:
> > > Hi Dave,
> > > 
> > > Tests related to our plugin which depend on Python-specific
> > > definitions have been run by including /* { dg-options "-
> > > fanalyzer
> > > -I/usr/include/python3.9" } */. This is undoubtedly not ideal; is
> > > it
> > > best to approach this problem by adapting a subset of relevant
> > > definitions like in gil.h?
> > 
> > That might be acceptable in the very short-term, but to create a
> > plugin
> > that's useful to end-user (authors of CPython extension modules) we
> > want to be testing against real Python headers.
> > 
> > As I understand it, https://peps.python.org/pep-0394/ allows for
> > distributors of Python to symlink "python3-config" in the PATH to a
> > python3.X-config script (for some X).
> > 
> > So on such systems running:
> >   python3-config --includes
> > should emit the correct -I option.  On my box it emits:
> > 
> > -I/usr/include/python3.8 -I/usr/include/python3.8
> > 
> > 
> > It's more complicated, but I believe:
> >   python3-config --cflags
> > should emit the build flags that C/C++ extensions ought to use when
> > building.  On my box this emits:
> > 
> > -I/usr/include/python3.8 -I/usr/include/python3.8  -Wno-unused-
> > result -
> > Wsign-compare  -O2 -g -pipe -Wall -Werror=format-security -Wp,-
> > D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-
> > protector-strong -grecord-gcc-switches   -m64 -mtune=generic -
> > fasynchronous-unwind-tables -fstack-clash-protection -fcf-
> > protection -
> > D_GNU_SOURCE -fPIC -fwrapv  -DDYNAMIC_ANNOTATIONS_ENABLED=1 -
> > DNDEBUG  -
> > O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -
> > Wp,-
> > D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -
> > grecord-
> > gcc-switches   -m64 -mtune=generic -fasynchronous-unwind-tables -
> > fstack-clash-protection -fcf-protection -D_GNU_SOURCE -fPIC -fwrapv
> > 
> > and it's likely going to vary from distribution to distribution. 
> > Some
> > of those options *are* going to affect the gimple that -fanalyzer
> > "sees".
> > 
> > Does your installation of Python have such a script?
> > 
> > So in the short term you could hack in a minimal subset of the
> > decls/defns from Python.h, but I'd prefer it if target-supports.exp
> > gained a DejaGnu directive that invokes python3-config, captures
> > the
> > result (or fails with UNSUPPORTED for systems without python3
> > development headers), and then adds the result to the build flags
> > of
> > the file being tested.  The .exp files are implemented in Tcl,
> > alas;
> > let me know if you want help with that.
> > 
> > Dave
> Sounds good; thanks! Following existing examples in
> target-supports.exp, the following works as expected in terms of
> extracting the build flags we are interested in.
> 
> In target-supports.exp:
> proc check_python_flags { } {
>     set result [remote_exec host "python3-config --cflags"]
>     set status [lindex $result 0]
>     if { $status == 0 } {
>     return [lindex $result 1]
>     } else {
>     return "UNSUPPORTED"
>     }
> }
> 
> However, I'm having some trouble figuring out the specifics as to how
> we may add the build flags to our test cases. My intuition looks like
> something like the following:
> 
> In plugin.exp:
> foreach plugin_test $plugin_test_list {
>     if {[lindex $plugin_test 0] eq "analyzer_cpython_plugin.c"} {
>     set python_flags [check_python_flags]
>     if { $python_flags ne "UNSUPPORTED" } {
>    // append $python_flags to build flags here
>     }
>     }
> 
> }
> 
> How might we do so?

Good question.

Looking at plugin.exp I see it uses plugin-test-execute, which is
defined in gcc/testsuite/lib/plugin-support.exp.

Looking there, I see it attempts to build the plugin, and then if it
succeeds, it calls 
  dg-runtest $plugin_tests $plugin_enabling_flags $default_flags
where $plugin_tests is the list of source files to be compiled using
the plugin.  So one way to do this would be to modify that code from
plugin.exp to pass in a different value, rather than $default_flags. 
Though it seems hackish to special-case this.

As another way, that avoids adding special-casing to plugin.exp,
there's an existing directive:
   dg-additional-options
implemented in gcc/testsuite/lib/gcc-defs.exp which appends options to
the default options.  Unfortunately, it works via:
upvar dg-extra-tool-flags extra-tool-flags
which is a nasty Tcl hack meaning access the local variable named "dg-
extra-tool-flags" in *the frame above*, referring to it as "extra-tool-
flags".  (this is why I don't like Tcl)

So I think what could be done is to invoke your "check_python_flags"
test as a directive from the test case, so that in target-supports.exp
you'd have something like:

  proc dg-require-python-h {} {

which could do the 

Re: Update and Questions on CPython Extension Module -fanalyzer plugin development

2023-08-04 Thread Eric Feng via Gcc
On Fri, Aug 4, 2023 at 11:39 AM David Malcolm  wrote:
>
> On Fri, 2023-08-04 at 11:02 -0400, Eric Feng wrote:
> > Hi Dave,
> >
> > Tests related to our plugin which depend on Python-specific
> > definitions have been run by including /* { dg-options "-fanalyzer
> > -I/usr/include/python3.9" } */. This is undoubtedly not ideal; is it
> > best to approach this problem by adapting a subset of relevant
> > definitions like in gil.h?
>
> That might be acceptable in the very short-term, but to create a plugin
> that's useful to end-user (authors of CPython extension modules) we
> want to be testing against real Python headers.
>
> As I understand it, https://peps.python.org/pep-0394/ allows for
> distributors of Python to symlink "python3-config" in the PATH to a
> python3.X-config script (for some X).
>
> So on such systems running:
>   python3-config --includes
> should emit the correct -I option.  On my box it emits:
>
> -I/usr/include/python3.8 -I/usr/include/python3.8
>
>
> It's more complicated, but I believe:
>   python3-config --cflags
> should emit the build flags that C/C++ extensions ought to use when
> building.  On my box this emits:
>
> -I/usr/include/python3.8 -I/usr/include/python3.8  -Wno-unused-result -
> Wsign-compare  -O2 -g -pipe -Wall -Werror=format-security -Wp,-
> D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-
> protector-strong -grecord-gcc-switches   -m64 -mtune=generic -
> fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -
> D_GNU_SOURCE -fPIC -fwrapv  -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG  -
> O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-
> D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-
> gcc-switches   -m64 -mtune=generic -fasynchronous-unwind-tables -
> fstack-clash-protection -fcf-protection -D_GNU_SOURCE -fPIC -fwrapv
>
> and it's likely going to vary from distribution to distribution.  Some
> of those options *are* going to affect the gimple that -fanalyzer
> "sees".
>
> Does your installation of Python have such a script?
>
> So in the short term you could hack in a minimal subset of the
> decls/defns from Python.h, but I'd prefer it if target-supports.exp
> gained a DejaGnu directive that invokes python3-config, captures the
> result (or fails with UNSUPPORTED for systems without python3
> development headers), and then adds the result to the build flags of
> the file being tested.  The .exp files are implemented in Tcl, alas;
> let me know if you want help with that.
>
> Dave
Sounds good; thanks! Following existing examples in
target-supports.exp, the following works as expected in terms of
extracting the build flags we are interested in.

In target-supports.exp:
proc check_python_flags { } {
set result [remote_exec host "python3-config --cflags"]
set status [lindex $result 0]
if { $status == 0 } {
return [lindex $result 1]
} else {
return "UNSUPPORTED"
}
}

However, I'm having some trouble figuring out the specifics as to how
we may add the build flags to our test cases. My intuition looks like
something like the following:

In plugin.exp:
foreach plugin_test $plugin_test_list {
if {[lindex $plugin_test 0] eq "analyzer_cpython_plugin.c"} {
set python_flags [check_python_flags]
if { $python_flags ne "UNSUPPORTED" } {
   // append $python_flags to build flags here
}
}

}

How might we do so?
>
>
> >
> > Best,
> > Eric
> >
> > On Tue, Aug 1, 2023 at 1:06 PM David Malcolm 
> > wrote:
> > >
> > > On Tue, 2023-08-01 at 09:57 -0400, Eric Feng wrote:
> > > > >
> > > > > My guess is that you were trying to do it from the
> > > > > PLUGIN_ANALYZER_INIT
> > > > > hook rather than from the plugin_init function, but it's hard
> > > > > to be
> > > > > sure without seeing the code.
> > > > >
> > > >
> > > > Thanks Dave, you are entirely right — I made the mistake of
> > > > trying to
> > > > do it from PLUGIN_ANALYZER_INIT hook and not from the plugin_init
> > > > function. After following your suggestion, the callbacks are
> > > > getting
> > > > registered as expected.
> > >
> > > Ah, good.
> > >
> > > > I submitted a patch to review for this feature
> > > > on gcc-patches; please let me know if it looks OK.
> > >
> > > Thanks Eric; I've posted a reply to your email there, so let's
> > > discuss
> > > the details there.
> > >
> > > Dave
> > >
> >
>


Re: Update and Questions on CPython Extension Module -fanalyzer plugin development

2023-08-04 Thread David Malcolm via Gcc
On Fri, 2023-08-04 at 11:02 -0400, Eric Feng wrote:
> Hi Dave,
> 
> Tests related to our plugin which depend on Python-specific
> definitions have been run by including /* { dg-options "-fanalyzer
> -I/usr/include/python3.9" } */. This is undoubtedly not ideal; is it
> best to approach this problem by adapting a subset of relevant
> definitions like in gil.h?

That might be acceptable in the very short-term, but to create a plugin
that's useful to end-user (authors of CPython extension modules) we
want to be testing against real Python headers.

As I understand it, https://peps.python.org/pep-0394/ allows for
distributors of Python to symlink "python3-config" in the PATH to a
python3.X-config script (for some X).

So on such systems running:
  python3-config --includes
should emit the correct -I option.  On my box it emits:

-I/usr/include/python3.8 -I/usr/include/python3.8


It's more complicated, but I believe:
  python3-config --cflags
should emit the build flags that C/C++ extensions ought to use when
building.  On my box this emits:

-I/usr/include/python3.8 -I/usr/include/python3.8  -Wno-unused-result -
Wsign-compare  -O2 -g -pipe -Wall -Werror=format-security -Wp,-
D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-
protector-strong -grecord-gcc-switches   -m64 -mtune=generic -
fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -
D_GNU_SOURCE -fPIC -fwrapv  -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG  -
O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-
D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-
gcc-switches   -m64 -mtune=generic -fasynchronous-unwind-tables -
fstack-clash-protection -fcf-protection -D_GNU_SOURCE -fPIC -fwrapv

and it's likely going to vary from distribution to distribution.  Some
of those options *are* going to affect the gimple that -fanalyzer
"sees".

Does your installation of Python have such a script?

So in the short term you could hack in a minimal subset of the
decls/defns from Python.h, but I'd prefer it if target-supports.exp
gained a DejaGnu directive that invokes python3-config, captures the
result (or fails with UNSUPPORTED for systems without python3
development headers), and then adds the result to the build flags of
the file being tested.  The .exp files are implemented in Tcl, alas;
let me know if you want help with that.

Dave


> 
> Best,
> Eric
> 
> On Tue, Aug 1, 2023 at 1:06 PM David Malcolm 
> wrote:
> > 
> > On Tue, 2023-08-01 at 09:57 -0400, Eric Feng wrote:
> > > > 
> > > > My guess is that you were trying to do it from the
> > > > PLUGIN_ANALYZER_INIT
> > > > hook rather than from the plugin_init function, but it's hard
> > > > to be
> > > > sure without seeing the code.
> > > > 
> > > 
> > > Thanks Dave, you are entirely right — I made the mistake of
> > > trying to
> > > do it from PLUGIN_ANALYZER_INIT hook and not from the plugin_init
> > > function. After following your suggestion, the callbacks are
> > > getting
> > > registered as expected.
> > 
> > Ah, good.
> > 
> > > I submitted a patch to review for this feature
> > > on gcc-patches; please let me know if it looks OK.
> > 
> > Thanks Eric; I've posted a reply to your email there, so let's
> > discuss
> > the details there.
> > 
> > Dave
> > 
> 



Re: Update and Questions on CPython Extension Module -fanalyzer plugin development

2023-08-04 Thread Eric Feng via Gcc
Hi Dave,

Tests related to our plugin which depend on Python-specific
definitions have been run by including /* { dg-options "-fanalyzer
-I/usr/include/python3.9" } */. This is undoubtedly not ideal; is it
best to approach this problem by adapting a subset of relevant
definitions like in gil.h?

Best,
Eric

On Tue, Aug 1, 2023 at 1:06 PM David Malcolm  wrote:
>
> On Tue, 2023-08-01 at 09:57 -0400, Eric Feng wrote:
> > >
> > > My guess is that you were trying to do it from the
> > > PLUGIN_ANALYZER_INIT
> > > hook rather than from the plugin_init function, but it's hard to be
> > > sure without seeing the code.
> > >
> >
> > Thanks Dave, you are entirely right — I made the mistake of trying to
> > do it from PLUGIN_ANALYZER_INIT hook and not from the plugin_init
> > function. After following your suggestion, the callbacks are getting
> > registered as expected.
>
> Ah, good.
>
> > I submitted a patch to review for this feature
> > on gcc-patches; please let me know if it looks OK.
>
> Thanks Eric; I've posted a reply to your email there, so let's discuss
> the details there.
>
> Dave
>


Re: Update and Questions on CPython Extension Module -fanalyzer plugin development

2023-08-01 Thread David Malcolm via Gcc
On Tue, 2023-08-01 at 09:57 -0400, Eric Feng wrote:
> > 
> > My guess is that you were trying to do it from the
> > PLUGIN_ANALYZER_INIT
> > hook rather than from the plugin_init function, but it's hard to be
> > sure without seeing the code.
> > 
> 
> Thanks Dave, you are entirely right — I made the mistake of trying to
> do it from PLUGIN_ANALYZER_INIT hook and not from the plugin_init
> function. After following your suggestion, the callbacks are getting
> registered as expected. 

Ah, good.

> I submitted a patch to review for this feature
> on gcc-patches; please let me know if it looks OK.

Thanks Eric; I've posted a reply to your email there, so let's discuss
the details there.

Dave



Re: Update and Questions on CPython Extension Module -fanalyzer plugin development

2023-08-01 Thread Eric Feng via Gcc
>
> My guess is that you were trying to do it from the PLUGIN_ANALYZER_INIT
> hook rather than from the plugin_init function, but it's hard to be
> sure without seeing the code.
>

Thanks Dave, you are entirely right — I made the mistake of trying to
do it from PLUGIN_ANALYZER_INIT hook and not from the plugin_init
function. After following your suggestion, the callbacks are getting
registered as expected. I submitted a patch to review for this feature
on gcc-patches; please let me know if it looks OK.

Best,
Eric


Re: Update and Questions on CPython Extension Module -fanalyzer plugin development

2023-07-30 Thread David Malcolm via Gcc
On Sun, 2023-07-30 at 13:52 -0400, Eric Feng wrote:
> [...]
> > As noted in our chat earlier, I don't think we can easily make
> > these
> > work.  Looking at CPython's implementation: PyList_Type's
> > initializer
> > here:
> > https://github.com/python/cpython/blob/main/Objects/listobject.c#L3101
> > initializes tp_flags with the flags, but:
> > (a) we don't see that code when compiling a user's extension module
> > (b) even if we did, PyList_Type is non-const, so the analyzer has
> > to
> > assume that tp_flags could have been written to since it was
> > initialized
> > 
> > In theory we could specialcase such lookups, so that, say, a plugin
> > could register assumptions into the analyzer about the value of
> > bits
> > within (PyList_Type.tp_flags).
> > 
> > However, this seems like a future feature.
> 
> I agree that it is more appropriate as a future feature.
> 
> Recently, in preparation for a patch, I have been focusing on
> migrating as much of our plugin-specific functionality as possible,
> which is currently scattered across core analyzer files for
> convenience, into the plugin itself. Specifically, I am currently
> trying to transfer the code related to stashing Python-specific types
> and global variables into analyzer_cpython_plugin.c. This approach
> has
> three main benefits, among which some I believe we have previously
> discussed:
> 
> 1) We only need to search for these values when initializing our
> plugin, instead of every time the analyzer is enabled.
> 2) We can extend the values that we stash by modifying only our
> plugin, avoiding changes to core analyzer files such as
> analyzer-language.cc, which seems a safer and more resilient
> approach.
> 3) Future analyzer plugins will have an easier time stashing values
> relevant to their respective projects.

Sounds good, though I don't mind if the initial version of your patch
adds CPython-specific stuff to the core, if there are unexpected
hurdles in converting things to be more purely plugin based.

> 
> Let me know if my concerns or reasons appear unfounded.
> 
> My initial approach involved adding a hook to the end of
> ana::on_finish_translation_unit which calls the relevant
> stashing-related callbacks registered during plugin initialization.
> Here's a rough sketch:
> 
> void
> on_finish_translation_unit (const translation_unit )
> {
>   // ... existing code
>   stash_named_constants (the_logger.get_logger (), tu);
> 
>   do_finish_translation_unit_callbacks(the_logger.get_logger (), tu);
> }
> 
> Inside do_finish_translation_unit_callbacks we have a loop like so:
> 
> for (auto& callback : finish_translation_unit_callbacks)
> {
>     callback(logger, tu);
> }
> 
> Where finish_translation_unit_callbacks is a vector defined as
> follows:
> typedef void (*finish_translation_unit_callback) (logger *, const
> translation_unit &);
> vec
> *finish_translation_unit_callbacks;

Seems reasonable.

> 
> To register a callback, we use:
> 
> void
> register_finish_translation_unit_callback (
>     finish_translation_unit_callback callback)
> {
>   if (!finish_translation_unit_callbacks)
>     vec_alloc (finish_translation_unit_callbacks, 1);
>   finish_translation_unit_callbacks->safe_push (callback);
> }
> 
> And finally, from our plugin (or any other plugin), we can register
> callbacks like so:
> ana::register_finish_translation_unit_callback (_named_types);
> ana::register_finish_translation_unit_callback (_global_vars);
> 
> However, on_finish_translation_unit runs before plugin initialization
> occurs, so, unfortunately, we would be registering our callbacks
> after
> on_finish_translation_unit with this method.

Really?   I thought the plugin_init callback is called from
initialize_plugins, which is called from toplev::main fairly early on;
I though on_finish_translation_unit is called from deep within
do_compile, which is called later on from toplev::main.

What happens if you put breakpoints on both the plugin_init hook and on
on_finish_translation_unit, and have a look at the backtrace at each?

Note that this is the "plugin_init" code, not the PLUGIN_ANALYZER_INIT
callback.  The latter *is* called after on_finish_translation_unit,
when the analyzer runs.  You'll need to put your code in the former.


>  As a workaround, I tried
> saving the translation unit like this:
> 
> void
> on_finish_translation_unit (const translation_unit )
> {
>   // ... existing code
>   stash_named_constants (the_logger.get_logger (), tu);
> 
>   saved_tu = 
> }

That's not going to work; the "tu" is a reference to an on-stack
object, i.e. essentially a pointer to a temporary on the stack.  If
saved_tu is a pointer, then it's going to be pointing at garbage when
the function returns; if it's an object, then it's going to take a copy
of just the base class, which isn't going to be usable either ("object
slicing").

> 
> Then in our plugin:
> ana::register_finish_translation_unit_callback (_named_types);
> 

Re: Update and Questions on CPython Extension Module -fanalyzer plugin development

2023-07-30 Thread Eric Feng via Gcc
[...]
> As noted in our chat earlier, I don't think we can easily make these
> work.  Looking at CPython's implementation: PyList_Type's initializer
> here:
> https://github.com/python/cpython/blob/main/Objects/listobject.c#L3101
> initializes tp_flags with the flags, but:
> (a) we don't see that code when compiling a user's extension module
> (b) even if we did, PyList_Type is non-const, so the analyzer has to
> assume that tp_flags could have been written to since it was
> initialized
>
> In theory we could specialcase such lookups, so that, say, a plugin
> could register assumptions into the analyzer about the value of bits
> within (PyList_Type.tp_flags).
>
> However, this seems like a future feature.

I agree that it is more appropriate as a future feature.

Recently, in preparation for a patch, I have been focusing on
migrating as much of our plugin-specific functionality as possible,
which is currently scattered across core analyzer files for
convenience, into the plugin itself. Specifically, I am currently
trying to transfer the code related to stashing Python-specific types
and global variables into analyzer_cpython_plugin.c. This approach has
three main benefits, among which some I believe we have previously
discussed:

1) We only need to search for these values when initializing our
plugin, instead of every time the analyzer is enabled.
2) We can extend the values that we stash by modifying only our
plugin, avoiding changes to core analyzer files such as
analyzer-language.cc, which seems a safer and more resilient approach.
3) Future analyzer plugins will have an easier time stashing values
relevant to their respective projects.

Let me know if my concerns or reasons appear unfounded.

My initial approach involved adding a hook to the end of
ana::on_finish_translation_unit which calls the relevant
stashing-related callbacks registered during plugin initialization.
Here's a rough sketch:

void
on_finish_translation_unit (const translation_unit )
{
  // ... existing code
  stash_named_constants (the_logger.get_logger (), tu);

  do_finish_translation_unit_callbacks(the_logger.get_logger (), tu);
}

Inside do_finish_translation_unit_callbacks we have a loop like so:

for (auto& callback : finish_translation_unit_callbacks)
{
callback(logger, tu);
}

Where finish_translation_unit_callbacks is a vector defined as follows:
typedef void (*finish_translation_unit_callback) (logger *, const
translation_unit &);
vec *finish_translation_unit_callbacks;

To register a callback, we use:

void
register_finish_translation_unit_callback (
finish_translation_unit_callback callback)
{
  if (!finish_translation_unit_callbacks)
vec_alloc (finish_translation_unit_callbacks, 1);
  finish_translation_unit_callbacks->safe_push (callback);
}

And finally, from our plugin (or any other plugin), we can register
callbacks like so:
ana::register_finish_translation_unit_callback (_named_types);
ana::register_finish_translation_unit_callback (_global_vars);

However, on_finish_translation_unit runs before plugin initialization
occurs, so, unfortunately, we would be registering our callbacks after
on_finish_translation_unit with this method. As a workaround, I tried
saving the translation unit like this:

void
on_finish_translation_unit (const translation_unit )
{
  // ... existing code
  stash_named_constants (the_logger.get_logger (), tu);

  saved_tu = 
}

Then in our plugin:
ana::register_finish_translation_unit_callback (_named_types);
ana::register_finish_translation_unit_callback (_global_vars);
ana:: do_finish_translation_unit_callbacks();

With do_finish_translation_units passing the stored_tu to the callbacks.

Unfortunately, with this method, it seems like we encounter a
segmentation fault when trying to call the lookup functions within
translation_unit at the time of plugin initialization, even though the
translation unit is stored correctly. So it seems like the solution
may not be quite so simple.

I'm currently investigating this issue, but if there's an obvious
solution that I might be missing or any general suggestions, please
let me know!

Thanks as always,
Eric


Re: Update and Questions on CPython Extension Module -fanalyzer plugin development

2023-07-27 Thread David Malcolm via Gcc
On Thu, 2023-07-27 at 18:13 -0400, Eric Feng wrote:
> Hi Dave,
> 
> Thanks for the comments!
> 
> [...]
> > Do you have any DejaGnu tests for this functionality?  For example,
> > given PyList_New
> >   https://docs.python.org/3/c-api/list.html#c.PyList_New
> > there could be a test like:
> > 
> > /* { dg-require-effective-target python_h } */
> > 
> > #define PY_SSIZE_T_CLEAN
> > #include 
> > #include "analyzer-decls.h"
> > 
> > PyObject *
> > test_PyList_New (Py_ssize_t len)
> > {
> >   PyObject *obj = PyList_New (len);
> >   if (obj)
> >     {
> >  __analyzer_eval (obj->ob_refcnt == 1); /* { dg-warning "TRUE"
> > } */
> >  __analyzer_eval (PyList_Check (obj)); /* { dg-warning "TRUE" }
> > */
> >  __analyzer_eval (PyList_CheckExact (obj)); /* { dg-warning
> > "TRUE" } */
> >     }
> >   else
> >     __analyzer_dump_path (); /* { dg-warning "path" } */
> >   return obj;
> > }
> > 
> > ...or similar, to verify that we simulate that the call can both
> > succeed and fail, and to verify properties of the store along the
> > "success" path.  Caveat: I didn't look at exactly what properties
> > you're simulating, so the above tests might need adjusting.
> > 
> 
> I am currently in the process of developing more tests. Specific to
> the test you provided as an example, we are passing all cases except
> for PyList_Check. PyList_Check does not pass because I have not yet
> added support for the various definitions of tp_flags.

As noted in our chat earlier, I don't think we can easily make these
work.  Looking at CPython's implementation: PyList_Type's initializer
here:
https://github.com/python/cpython/blob/main/Objects/listobject.c#L3101
initializes tp_flags with the flags, but:
(a) we don't see that code when compiling a user's extension module
(b) even if we did, PyList_Type is non-const, so the analyzer has to
assume that tp_flags could have been written to since it was
initialized

In theory we could specialcase such lookups, so that, say, a plugin
could register assumptions into the analyzer about the value of bits
within (PyList_Type.tp_flags).

However, this seems like a future feature.

>  I also
> encountered a minor hiccup where PyList_CheckExact appeared to give
> "UNKNOWN" rather than "TRUE", but this has since been fixed. The
> problem was caused by accidentally using the tree representation of
> struct PyList_Type as opposed to struct PyList_Type * when creating a
> pointer sval to the region for Pylist_Type.

Ah, good.

> 
> [...]
> > 
> > > Let's consider the following example which lacks error checking:
> > > 
> > > PyObject* foo() {
> > >     PyObject item = PyLong_FromLong(10);
> > >     PyObject list = PyList_New(5);
> > >     return list;
> > > }
> > > 
> > > The states for when PyLong_FromLong fails and when
> > > PyLong_FromLong
> > > succeeds are merged before the call to PyObject* list =
> > > PyList_New(5).
> > 
> > Ideally we would emit a leak warning about the "success" case of
> > PyLong_FromLong here.  I think you're running into the problem of
> > the
> > "store" part of the program_state being separate from the "malloc"
> > state machine part of program_state - I'm guessing that you're
> > creating
> > a heap_allocated_region for the new python object, but the "malloc"
> > state machine isn't transitioning the pointer from "start" to
> > "assumed-
> > non-null".  Such state machine states inhibit state-merging, and so
> > this might solve your state-merging problem.
> > 
> > I think we need a way to call
> > malloc_state_machine::on_allocator_call
> > from outside of sm-malloc.cc.  See
> > region_model::on_realloc_with_move
> > for an example of how to do something similar.
> > 
> 
> Thank you for the suggestion — this worked great and has solved the
> issue!

Excellent!

Thanks for the update
Dave



Re: Update and Questions on CPython Extension Module -fanalyzer plugin development

2023-07-27 Thread Eric Feng via Gcc
Hi Dave,

Thanks for the comments!

[...]
> Do you have any DejaGnu tests for this functionality?  For example,
> given PyList_New
>   https://docs.python.org/3/c-api/list.html#c.PyList_New
> there could be a test like:
>
> /* { dg-require-effective-target python_h } */
>
> #define PY_SSIZE_T_CLEAN
> #include 
> #include "analyzer-decls.h"
>
> PyObject *
> test_PyList_New (Py_ssize_t len)
> {
>   PyObject *obj = PyList_New (len);
>   if (obj)
> {
>  __analyzer_eval (obj->ob_refcnt == 1); /* { dg-warning "TRUE" } */
>  __analyzer_eval (PyList_Check (obj)); /* { dg-warning "TRUE" } */
>  __analyzer_eval (PyList_CheckExact (obj)); /* { dg-warning "TRUE" } */
> }
>   else
> __analyzer_dump_path (); /* { dg-warning "path" } */
>   return obj;
> }
>
> ...or similar, to verify that we simulate that the call can both
> succeed and fail, and to verify properties of the store along the
> "success" path.  Caveat: I didn't look at exactly what properties
> you're simulating, so the above tests might need adjusting.
>

I am currently in the process of developing more tests. Specific to
the test you provided as an example, we are passing all cases except
for PyList_Check. PyList_Check does not pass because I have not yet
added support for the various definitions of tp_flags. I also
encountered a minor hiccup where PyList_CheckExact appeared to give
"UNKNOWN" rather than "TRUE", but this has since been fixed. The
problem was caused by accidentally using the tree representation of
struct PyList_Type as opposed to struct PyList_Type * when creating a
pointer sval to the region for Pylist_Type.

[...]
>
> > Let's consider the following example which lacks error checking:
> >
> > PyObject* foo() {
> > PyObject item = PyLong_FromLong(10);
> > PyObject list = PyList_New(5);
> > return list;
> > }
> >
> > The states for when PyLong_FromLong fails and when PyLong_FromLong
> > succeeds are merged before the call to PyObject* list =
> > PyList_New(5).
>
> Ideally we would emit a leak warning about the "success" case of
> PyLong_FromLong here.  I think you're running into the problem of the
> "store" part of the program_state being separate from the "malloc"
> state machine part of program_state - I'm guessing that you're creating
> a heap_allocated_region for the new python object, but the "malloc"
> state machine isn't transitioning the pointer from "start" to "assumed-
> non-null".  Such state machine states inhibit state-merging, and so
> this might solve your state-merging problem.
>
> I think we need a way to call malloc_state_machine::on_allocator_call
> from outside of sm-malloc.cc.  See region_model::on_realloc_with_move
> for an example of how to do something similar.
>

Thank you for the suggestion — this worked great and has solved the issue!

Best,
Eric


Re: Update and Questions on CPython Extension Module -fanalyzer plugin development

2023-07-25 Thread David Malcolm via Gcc
On Tue, 2023-07-25 at 00:49 -0400, Eric Feng wrote:
> Hi all,

Hi Eric, thanks for the update.

Various comments inline below...

> 
> I would like to update everyone on the progress of the static
> analyzer
> plugin for CPython extension module code.
>  Since the last update, I
> have implemented known function subclasses for PyList_New and
> PyList_Append. The existing known function subclasses have also been
> enhanced to provide more information. For instance, we are now
> simulating object type specific fields in addition to just ob_refcnt
> and ob_type, which are shared by all PyObjects.

Do you have any DejaGnu tests for this functionality?  For example,
given PyList_New
  https://docs.python.org/3/c-api/list.html#c.PyList_New
there could be a test like:

/* { dg-require-effective-target python_h } */

#define PY_SSIZE_T_CLEAN
#include 
#include "analyzer-decls.h"

PyObject *
test_PyList_New (Py_ssize_t len)
{
  PyObject *obj = PyList_New (len);
  if (obj)
{
 __analyzer_eval (obj->ob_refcnt == 1); /* { dg-warning "TRUE" } */
 __analyzer_eval (PyList_Check (obj)); /* { dg-warning "TRUE" } */
 __analyzer_eval (PyList_CheckExact (obj)); /* { dg-warning "TRUE" } */
}
  else
__analyzer_dump_path (); /* { dg-warning "path" } */
  return obj;
}

...or similar, to verify that we simulate that the call can both
succeed and fail, and to verify properties of the store along the
"success" path.  Caveat: I didn't look at exactly what properties
you're simulating, so the above tests might need adjusting.

The:
  /* { dg-require-effective-target python_h } */
allows the testcase to bail out with "UNSUPPORTED" on hosts that don't
have a suitable Python.h header file installed, so that all this
Python-specific functionality is optional.  You could implement this
via a new function "check_effective_target_python_h" in
gcc/testsuite/lib/target-supports.exp, similar to the existing
check_effective_target_ functions in that Tcl file.


> 
> Regarding reference count checking, I have implemented a naive
> traversal of the store to count the actual reference count of
> PyObjects, allowing us to compare it against the ob_refcnt fields of
> the same PyObjects. Although we can compare the actual reference count
> and the ob_refcnt field, I am still working on implementing a
> diagnostic to alert about this issue.

Sounds promising.

> 
> In addition to the progress update, I have some implementation
> related
> questions and would appreciate any input. The current moment at which
> we run the algorithm for reference count checking, and thereby also
> the moment at which we may want to issue
> impl_region_model_context::warn, is within region_model::pop_frame.
> However, it appears that m_stmt and m_stmt_finder are NULL at the
> time
> of region_model::pop_frame, which results in the diagnostic for the
> reference count getting rejected. I am having trouble finding a
> workaround for this issue, so any ideas would be welcome.

FWIW I've run into a similar issue, and so did Tim last year.  The
whole stmt vs stmt_finder thing is rather ugly, and I have a local
branch with a part-written reworking of it that I hope will solve these
issues (adding a "class pending_location"), but it's very messy and far
from being ready.  Sorry about this.

In theory you can provide an instance of your own custom stmt_finder
subclass when you save the pending_diagnostic.  There's an example of
doing this in engine.cc: impl_region_model_context::on_state_leak,
where the leak is going to be reported at the end of a function
(similar to your diagnostic); it uses a leak_stmt_finder class, which
simply scans backwards once it has an exploded_path to find the last
stmt that had a useful location_t value.  This is a nasty hack, but
probably does what you need.

> 
> I am also currently examining some issues related to state merging.

Note that you can use -fno-analyzer-state-merge to turn off the state
merging code, which can be very useful when debugging this kind of
thing.

> Let's consider the following example which lacks error checking:
> 
> PyObject* foo() {
>     PyObject item = PyLong_FromLong(10);
>     PyObject list = PyList_New(5);
>     return list;
> }
> 
> The states for when PyLong_FromLong fails and when PyLong_FromLong
> succeeds are merged before the call to PyObject* list =
> PyList_New(5).

Ideally we would emit a leak warning about the "success" case of
PyLong_FromLong here.  I think you're running into the problem of the
"store" part of the program_state being separate from the "malloc"
state machine part of program_state - I'm guessing that you're creating
a heap_allocated_region for the new python object, but the "malloc"
state machine isn't transitioning the pointer from "start" to "assumed-
non-null".  Such state machine states inhibit state-merging, and so
this might solve your state-merging problem.

I think we need a way to call malloc_state_machine::on_allocator_call
from outside of sm-malloc.cc.  

Update and Questions on CPython Extension Module -fanalyzer plugin development

2023-07-24 Thread Eric Feng via Gcc
Hi all,

I would like to update everyone on the progress of the static analyzer
plugin for CPython extension module code. Since the last update, I
have implemented known function subclasses for PyList_New and
PyList_Append. The existing known function subclasses have also been
enhanced to provide more information. For instance, we are now
simulating object type specific fields in addition to just ob_refcnt
and ob_type, which are shared by all PyObjects.

Regarding reference count checking, I have implemented a naive
traversal of the store to count the actual reference count of
PyObjects, allowing us to compare it against the ob_refcnt fields of
the same PyObjects. Although we can compare the actual reference count
and the ob_refcnt field, I am still working on implementing a
diagnostic to alert about this issue.

In addition to the progress update, I have some implementation related
questions and would appreciate any input. The current moment at which
we run the algorithm for reference count checking, and thereby also
the moment at which we may want to issue
impl_region_model_context::warn, is within region_model::pop_frame.
However, it appears that m_stmt and m_stmt_finder are NULL at the time
of region_model::pop_frame, which results in the diagnostic for the
reference count getting rejected. I am having trouble finding a
workaround for this issue, so any ideas would be welcome.

I am also currently examining some issues related to state merging.
Let's consider the following example which lacks error checking:

PyObject* foo() {
PyObject item = PyLong_FromLong(10);
PyObject list = PyList_New(5);
return list;
}

The states for when PyLong_FromLong fails and when PyLong_FromLong
succeeds are merged before the call to PyObject* list = PyList_New(5).
I suspect this may be related to me not correctly handling behavior
that arises due to the analyzer deterministically selecting the IDs
for heap allocations. Since there is a heap allocation for PyList_New
following PyLong_FromLong, the success and fail cases for
PyLong_FromLong are merged. I believe this is so that in the scenario
where PyLong_FromLong fails and PyList_New succeeds, the ID for the
region allocated for PyList_New wouldn't be the same as the
PyLong_FromLong success case. Whatever the cause, due to this state
merge, the heap allocated region representing PyObject *item has all
its fields set to UNKNOWN, making it impossible to perform the
reference count checking functionality. I attempted to fix this by
wrapping the svalue representing PyLongObject with
get_or_create_unmergeable, but it didn't seem to help. However, this
issue doesn't occur in all situations. For instance:

PyObject* foo() {
PyObject item = PyLong_FromLong(10);
PyObject list = PyList_New(5);
PyList_Append(list, item);
return list;
}

The above scenario is simulated as expected. I will continue to search
for a solution, but any suggestions would be highly appreciated. Thank
you!

Best,
Eric