Re: [PATCH v2] analyzer: new warnings -Wanalyzer-mkstemp-missing-suffix and, -Wanalyzer-mkstemp-of-string-literal [PR105890]

2026-03-15 Thread David Malcolm
On Sun, 2026-03-15 at 10:28 +0100, Tomás Ortín Fernández wrote:
> Hi David,
> 
> > > I was going to ask first if there's anything that would be
> > > particularly
> > > useful for the analyzer.
> > 
> > The big area of missing functionality in -fanalyzer is proper C++
> > support, but the problems there are difficult (e.g. reworking it to
> > be
> > more scalable).  Adding known_function subclasses is a less
> > ambitious
> > project, but also helpful, and has a much gentler learning curve
> > and
> > thus likely to be a more successful project.
> 
> I agree with this.  In that case, would it make the most sense to
> make 
> it be a 175 hours project, given that it will be of moderate
> difficulty? 
>   Do you or the GCC project have a preference of this?  I'm open to 
> either 175h or 350h.

I have a preference for 175 hours for this project; there seem to be a
lot of strong candidates for -fanalyzer projects, so I'd prefer to have
a smaller projects.

> 
> 
> > >   If there's not anything in particular, adding
> > > more known functions to cover more of both the C standard and
> > > POSIX
> > > would be interesting to me, now that I have some understanding of
> > > how
> > > it
> > > works.  I'd also be interested in adding support for more SEI
> > > CERT
> > > checks, I imagine some of those may be more challenging.
> > 
> > That sounds useful.  Do you have any specifically in mind?  We can
> > brainstorm about how to go about implementing them.
> 
> Not any interesting ones yet, as I wasn't familiar with CERT before 
> (other than it existed).  There are some low-hanging fruits that can 
> easily be implemented while working on adding more known functions,
> such 
> as "MSC24-C. Do not use deprecated or obsolescent functions".  I'm 
> reading the SEI CERT guidelines and taking notes of some potentially 
> interesting ones, I'll let you know when I have some interesting
> candidates.

(nods; thanks)

> 
> > > What I'm not sure is how go about clearly delimiting the scope in
> > > the
> > > case of a project of that kind.  A predefined list of functions
> > > and
> > > checks could easily end up being either too little or too
> > > ambitious,
> > > as
> > > it may be hard for me to estimate how difficult some type of
> > > check
> > > may
> > > be before having solved a similar one.  In this case, what would
> > > you
> > > recommend?
> > 
> > How about a flexible approach where you come up with a list of
> > functions and checks (a "backlog" in agile parlance, I believe);
> > each
> > item is relatively small and well-contained, and each week you try
> > and
> > implement some from the list, and if any particular item proves
> > harder
> > than expected, we put it back in the backlog and move on to
> > something
> > easier.  Maybe as the summer goes on you could try the harder
> > problems.
> > 
> > That way you'd be generating a series of non-trivial patches
> > similar to
> > the one you've already done; the analyzer would gain new warnings
> > and
> > improvements to analysis precision; and hopefully by the end of the
> > summer you'd have an impressive set of patches to your credit in
> > gcc
> > trunk, and have more familiarity with the internals of the
> > analyzer/gcc.
> > 
> > The expectation would be that you get plenty of patches in; but we
> > wouldn't expect the full list completed during the summer (I'd
> > prefer
> > to capture a more complete list of areas for improvement than to
> > achieve 100% on an incomplete list, if that makes sense).
> 
> I like this approach.  The list could be comprised of groups of
> related 
> functions (if that's the case for a given function) that require
> similar 
> checks, instead of individual functions.  I could first tackle easy 
> cases such as the `mkstemp` one, and cases where a very similar check
> is 
> already present for a different function in kf.cc.  And then move on
> to 
> tackle more complex and novel cases, and maybe some interesting CERT
> rules.
> 
> > One possible early task might be populating our bugzilla with more
> > RFE
> > bugs to cover possible POSIX entrypoints and CERT checks of
> > interest.
> 
> It makes a lot of sense for this to be the first stage of the
> project.
> 
> I agree with your proposed approach, and will write the GSOC proposal
> based on it.  A problem I see for that is that the proposal should
> have 
> a clear scope.
> 
> I could compile a backlog now and use it in the proposal, making
> clear 
> that I intend to get most of it implemented, but it's likely that
> this 
> backlog will need to be modified once the list of areas to improve is
> further expanded during the first stage.
> 
> Another option could be to state a number of additional C and POSIX 
> functions to be covered (e.g. "adding static analysis checks for 100 
> currently uncovered functions in the C or POSIX standards"), although
> that doesn't mean much: some functions and some checks are trivial,
> some 
> other not so much.

I don't like the 

Re: [PATCH v2] analyzer: new warnings -Wanalyzer-mkstemp-missing-suffix and, -Wanalyzer-mkstemp-of-string-literal [PR105890]

2026-03-15 Thread Tomás Ortín Fernández

Hi David,


I was going to ask first if there's anything that would be
particularly
useful for the analyzer.


The big area of missing functionality in -fanalyzer is proper C++
support, but the problems there are difficult (e.g. reworking it to be
more scalable).  Adding known_function subclasses is a less ambitious
project, but also helpful, and has a much gentler learning curve and
thus likely to be a more successful project.


I agree with this.  In that case, would it make the most sense to make 
it be a 175 hours project, given that it will be of moderate difficulty? 
 Do you or the GCC project have a preference of this?  I'm open to 
either 175h or 350h.




  If there's not anything in particular, adding
more known functions to cover more of both the C standard and POSIX
would be interesting to me, now that I have some understanding of how
it
works.  I'd also be interested in adding support for more SEI CERT
checks, I imagine some of those may be more challenging.


That sounds useful.  Do you have any specifically in mind?  We can
brainstorm about how to go about implementing them.


Not any interesting ones yet, as I wasn't familiar with CERT before 
(other than it existed).  There are some low-hanging fruits that can 
easily be implemented while working on adding more known functions, such 
as "MSC24-C. Do not use deprecated or obsolescent functions".  I'm 
reading the SEI CERT guidelines and taking notes of some potentially 
interesting ones, I'll let you know when I have some interesting candidates.



What I'm not sure is how go about clearly delimiting the scope in the
case of a project of that kind.  A predefined list of functions and
checks could easily end up being either too little or too ambitious,
as
it may be hard for me to estimate how difficult some type of check
may
be before having solved a similar one.  In this case, what would you
recommend?


How about a flexible approach where you come up with a list of
functions and checks (a "backlog" in agile parlance, I believe); each
item is relatively small and well-contained, and each week you try and
implement some from the list, and if any particular item proves harder
than expected, we put it back in the backlog and move on to something
easier.  Maybe as the summer goes on you could try the harder problems.

That way you'd be generating a series of non-trivial patches similar to
the one you've already done; the analyzer would gain new warnings and
improvements to analysis precision; and hopefully by the end of the
summer you'd have an impressive set of patches to your credit in gcc
trunk, and have more familiarity with the internals of the
analyzer/gcc.

The expectation would be that you get plenty of patches in; but we
wouldn't expect the full list completed during the summer (I'd prefer
to capture a more complete list of areas for improvement than to
achieve 100% on an incomplete list, if that makes sense).


I like this approach.  The list could be comprised of groups of related 
functions (if that's the case for a given function) that require similar 
checks, instead of individual functions.  I could first tackle easy 
cases such as the `mkstemp` one, and cases where a very similar check is 
already present for a different function in kf.cc.  And then move on to 
tackle more complex and novel cases, and maybe some interesting CERT rules.



One possible early task might be populating our bugzilla with more RFE
bugs to cover possible POSIX entrypoints and CERT checks of interest.


It makes a lot of sense for this to be the first stage of the project.

I agree with your proposed approach, and will write the GSOC proposal 
based on it.  A problem I see for that is that the proposal should have 
a clear scope.


I could compile a backlog now and use it in the proposal, making clear 
that I intend to get most of it implemented, but it's likely that this 
backlog will need to be modified once the list of areas to improve is 
further expanded during the first stage.


Another option could be to state a number of additional C and POSIX 
functions to be covered (e.g. "adding static analysis checks for 100 
currently uncovered functions in the C or POSIX standards"), although 
that doesn't mean much: some functions and some checks are trivial, some 
other not so much.


Lastly, it could be something much less specific such as "significantly 
expand the coverage of the POSIX API", but I'm not sure if this would be 
acceptable, as it's entirely subjective.


What approach would you recommend? It's likely that I'm missing a better 
way to delimit the scope of the API coverage part of the project.



In summary, I'm open to working on whatever may be more helpful right
now to improve the analyzer.  Regardless of what I end up choosing
for
the project proposal, I will now work on an additional patch covering
functions similar to `mkstemp`, as that's a very low-hanging fruit.


Excellent; thanks.


About this: all these functions share an almos

Re: [PATCH v2] analyzer: new warnings -Wanalyzer-mkstemp-missing-suffix and, -Wanalyzer-mkstemp-of-string-literal [PR105890]

2026-03-13 Thread David Malcolm
On Thu, 2026-03-12 at 20:15 +0100, Tomás Ortín Fernández wrote:
> > Thanks for the updated patch.  It looks good, and I've added it to
> > my
> > queue of patches to push to gcc 17 once we reopen trunk for new
> > features.
> 
> Thanks!
> 
> > Do you have a sense of what you'd like to look at next?  You
> > mentioned
> > a possible followup to cover similar functions to mkstemp.  If
> > you're
> > planning to apply to GSoC, I'm wondering what subproject you had in
> > mind.  Note that adding a bunch of known functions to more fully
> > handle
> > various POSIX APIs in -fanalyzer sounds like a reasonable GSoC
> > project
> > idea to me; it would be useful and (I hope) relatively easy.
> 
> I was going to ask first if there's anything that would be
> particularly 
> useful for the analyzer. 

The big area of missing functionality in -fanalyzer is proper C++
support, but the problems there are difficult (e.g. reworking it to be
more scalable).  Adding known_function subclasses is a less ambitious
project, but also helpful, and has a much gentler learning curve and
thus likely to be a more successful project.

>  If there's not anything in particular, adding 
> more known functions to cover more of both the C standard and POSIX 
> would be interesting to me, now that I have some understanding of how
> it 
> works.  I'd also be interested in adding support for more SEI CERT 
> checks, I imagine some of those may be more challenging.

That sounds useful.  Do you have any specifically in mind?  We can
brainstorm about how to go about implementing them.

> 
> What I'm not sure is how go about clearly delimiting the scope in the
> case of a project of that kind.  A predefined list of functions and 
> checks could easily end up being either too little or too ambitious,
> as 
> it may be hard for me to estimate how difficult some type of check
> may 
> be before having solved a similar one.  In this case, what would you 
> recommend?

How about a flexible approach where you come up with a list of
functions and checks (a "backlog" in agile parlance, I believe); each
item is relatively small and well-contained, and each week you try and
implement some from the list, and if any particular item proves harder
than expected, we put it back in the backlog and move on to something
easier.  Maybe as the summer goes on you could try the harder problems.

That way you'd be generating a series of non-trivial patches similar to
the one you've already done; the analyzer would gain new warnings and
improvements to analysis precision; and hopefully by the end of the
summer you'd have an impressive set of patches to your credit in gcc
trunk, and have more familiarity with the internals of the
analyzer/gcc.

The expectation would be that you get plenty of patches in; but we
wouldn't expect the full list completed during the summer (I'd prefer
to capture a more complete list of areas for improvement than to
achieve 100% on an incomplete list, if that makes sense).

One possible early task might be populating our bugzilla with more RFE
bugs to cover possible POSIX entrypoints and CERT checks of interest.

> 
> In summary, I'm open to working on whatever may be more helpful right
> now to improve the analyzer.  Regardless of what I end up choosing
> for 
> the project proposal, I will now work on an additional patch covering
> functions similar to `mkstemp`, as that's a very low-hanging fruit.

Excellent; thanks.

> 
> By the way, thank you for your welcoming and helpful attitude.  I was
> quite intimidated at the idea of approaching a big and complex
> project 
> like GCC, and your willingness to help and provide useful pointers
> were 
> motivating and lessened my worries.  My first patch was a very good 
> experience, and now I want to continue contributing to GCC.

Thanks, that's good to hear.

Let me know if you have further questions
Dave



Re: [PATCH v2] analyzer: new warnings -Wanalyzer-mkstemp-missing-suffix and, -Wanalyzer-mkstemp-of-string-literal [PR105890]

2026-03-12 Thread Tomás Ortín Fernández

Thanks for the updated patch.  It looks good, and I've added it to my
queue of patches to push to gcc 17 once we reopen trunk for new
features.


Thanks!


Do you have a sense of what you'd like to look at next?  You mentioned
a possible followup to cover similar functions to mkstemp.  If you're
planning to apply to GSoC, I'm wondering what subproject you had in
mind.  Note that adding a bunch of known functions to more fully handle
various POSIX APIs in -fanalyzer sounds like a reasonable GSoC project
idea to me; it would be useful and (I hope) relatively easy.


I was going to ask first if there's anything that would be particularly 
useful for the analyzer.  If there's not anything in particular, adding 
more known functions to cover more of both the C standard and POSIX 
would be interesting to me, now that I have some understanding of how it 
works.  I'd also be interested in adding support for more SEI CERT 
checks, I imagine some of those may be more challenging.


What I'm not sure is how go about clearly delimiting the scope in the 
case of a project of that kind.  A predefined list of functions and 
checks could easily end up being either too little or too ambitious, as 
it may be hard for me to estimate how difficult some type of check may 
be before having solved a similar one.  In this case, what would you 
recommend?


In summary, I'm open to working on whatever may be more helpful right 
now to improve the analyzer.  Regardless of what I end up choosing for 
the project proposal, I will now work on an additional patch covering 
functions similar to `mkstemp`, as that's a very low-hanging fruit.


By the way, thank you for your welcoming and helpful attitude.  I was 
quite intimidated at the idea of approaching a big and complex project 
like GCC, and your willingness to help and provide useful pointers were 
motivating and lessened my worries.  My first patch was a very good 
experience, and now I want to continue contributing to GCC.


Tomás


Re: [PATCH v2] analyzer: new warnings -Wanalyzer-mkstemp-missing-suffix and, -Wanalyzer-mkstemp-of-string-literal [PR105890]

2026-03-12 Thread David Malcolm
On Thu, 2026-03-12 at 11:09 -0400, David Malcolm wrote:
> On Thu, 2026-03-12 at 08:45 +0100, Tomás Ortín Fernández wrote:

[..snip...]

> 
> 
> Thanks for the updated patch.  It looks good, and I've added it to my
> queue of patches to push to gcc 17 once we reopen trunk for new
> features.

I was referring to the v3 patch here, just to be clear.

Dave



Re: [PATCH v2] analyzer: new warnings -Wanalyzer-mkstemp-missing-suffix and, -Wanalyzer-mkstemp-of-string-literal [PR105890]

2026-03-12 Thread David Malcolm
On Thu, 2026-03-12 at 08:45 +0100, Tomás Ortín Fernández wrote:
> Thank you for the review.
> 
> > > +class kf_mkstemp : public known_function
> > > +{
> > > +public:
> > > +  bool
> > > +  matches_call_types_p (const call_details &cd) const final
> > > override
> > > +  {
> > > +    return (cd.num_args () == 1 && cd.arg_is_pointer_p (0));
> > > +  }
> > > +
> > > +  void
> > > +  impl_call_pre (const call_details &cd) const final override
> > > +  {
> > > +    region_model_context *ctxt = cd.get_ctxt ();
> > > +    /* If there's no context we can't store warnings, so we
> > > return
> > > early. */
> > > +    if (!ctxt)
> > > +  {
> > > + cd.set_any_lhs_with_defaults ();
> > > + return;
> > > +  }
> > 
> > Hmmm, I don't like the way the patch has
> >   cd.set_any_lhs_with_defaults ();
> > above and then repeats it below at the end of the function (DRY
> > principle).  How about introducing a subroutine within the class to
> > do
> > the checks, so it looks like:
> > 
> >     if (ctxt)
> >   if (!check_argument (cd))
> >     return
> >     cd.set_any_lhs_with_defaults ();
> > 
> > or somesuch?  Or does that make things messier?
> 
> While thinking about this, I realized that calling
> `ctxt>terminate_path 
> ()` when the argument to `mkstemp` isn't a null-terminated string may
> be 
> too aggressive.  My original code followed the pattern used by 
> `kf_strcat`, but `mkstemp` is more similar to `fopen`. `kf_fopen`
> sets 
> the LHS to defaults regardless whether 
> `check_for_null_terminated_string_arg` returns NULL.  I have updated
> the 
> code to do the same.  Does this seem correct?

Looks good to me.

> 
> If correct, it incidentally means now it's not necessary to return
> early 
> from `impl_call_pre`, as I have extracted the template string check
> into 
> a method, as suggested.

(nods)

> 
> > > diff --git a/gcc/testsuite/gcc.dg/analyzer/mkstemp-1.c
> > > b/gcc/testsuite/gcc.dg/analyzer/mkstemp-1.c
> > > new file mode 100644
> > > index 000..d8e0e51c39f
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/analyzer/mkstemp-1.c
> > > @@ -0,0 +1,96 @@
> > > +/* { dg-additional-options "-Wno-analyzer-null-argument" } */
> > > +/* { dg-skip-if "has no putenv" { "avr-*-*" } } */
> > 
> > The above skip-if line looks like it was a copy&paste from a test
> > of
> > "putenv"; it means that the test should be skipped on targets
> > matching
> > avr-*-* since they don't have "putenv".
> > 
> > There might be targets that don't have mkstemp, but until we
> > discover
> > them, best to delete that line for this test.
> 
> Ah yes, sorry, as you can see I based the `mkstemp` tests on the 
> `putenv` tests and wasn't careful enough when updating it.
> 
> > Other than those nitpicks, the patch looks great; thanks

Thanks for the updated patch.  It looks good, and I've added it to my
queue of patches to push to gcc 17 once we reopen trunk for new
features.

Do you have a sense of what you'd like to look at next?  You mentioned
a possible followup to cover similar functions to mkstemp.  If you're
planning to apply to GSoC, I'm wondering what subproject you had in
mind.  Note that adding a bunch of known functions to more fully handle
various POSIX APIs in -fanalyzer sounds like a reasonable GSoC project
idea to me; it would be useful and (I hope) relatively easy.
> 
> Thank you. I have another question: I have noticed that generally any
> non-trivial method is defined out-of-line throughout the codebase. 
> Would it be better to define `kf_mkstemp`'s private methods out-of-
> line 
> as well?

I don't have a strong preference here

Dave



Re: [PATCH v2] analyzer: new warnings -Wanalyzer-mkstemp-missing-suffix and, -Wanalyzer-mkstemp-of-string-literal [PR105890]

2026-03-12 Thread Tomás Ortín Fernández

Thank you for the review.


+class kf_mkstemp : public known_function
+{
+public:
+  bool
+  matches_call_types_p (const call_details &cd) const final override
+  {
+    return (cd.num_args () == 1 && cd.arg_is_pointer_p (0));
+  }
+
+  void
+  impl_call_pre (const call_details &cd) const final override
+  {
+    region_model_context *ctxt = cd.get_ctxt ();
+    /* If there's no context we can't store warnings, so we return
early. */
+    if (!ctxt)
+  {
+   cd.set_any_lhs_with_defaults ();
+   return;
+  }


Hmmm, I don't like the way the patch has
  cd.set_any_lhs_with_defaults ();
above and then repeats it below at the end of the function (DRY
principle).  How about introducing a subroutine within the class to do
the checks, so it looks like:

if (ctxt)
  if (!check_argument (cd))
return
cd.set_any_lhs_with_defaults ();

or somesuch?  Or does that make things messier?


While thinking about this, I realized that calling `ctxt>terminate_path 
()` when the argument to `mkstemp` isn't a null-terminated string may be 
too aggressive.  My original code followed the pattern used by 
`kf_strcat`, but `mkstemp` is more similar to `fopen`. `kf_fopen` sets 
the LHS to defaults regardless whether 
`check_for_null_terminated_string_arg` returns NULL.  I have updated the 
code to do the same.  Does this seem correct?


If correct, it incidentally means now it's not necessary to return early 
from `impl_call_pre`, as I have extracted the template string check into 
a method, as suggested.



diff --git a/gcc/testsuite/gcc.dg/analyzer/mkstemp-1.c
b/gcc/testsuite/gcc.dg/analyzer/mkstemp-1.c
new file mode 100644
index 000..d8e0e51c39f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/mkstemp-1.c
@@ -0,0 +1,96 @@
+/* { dg-additional-options "-Wno-analyzer-null-argument" } */
+/* { dg-skip-if "has no putenv" { "avr-*-*" } } */


The above skip-if line looks like it was a copy&paste from a test of
"putenv"; it means that the test should be skipped on targets matching
avr-*-* since they don't have "putenv".

There might be targets that don't have mkstemp, but until we discover
them, best to delete that line for this test.


Ah yes, sorry, as you can see I based the `mkstemp` tests on the 
`putenv` tests and wasn't careful enough when updating it.



Other than those nitpicks, the patch looks great; thanks


Thank you. I have another question: I have noticed that generally any 
non-trivial method is defined out-of-line throughout the codebase. 
Would it be better to define `kf_mkstemp`'s private methods out-of-line 
as well?


Tomás



Re: [PATCH v2] analyzer: new warnings -Wanalyzer-mkstemp-missing-suffix and, -Wanalyzer-mkstemp-of-string-literal [PR105890]

2026-03-11 Thread David Malcolm
On Wed, 2026-03-11 at 14:02 +0100, Tomás Ortín Fernández (quanrong) wrote:
> This patch adds two new analyzer warnings for misuse of mkstemp(3):
> 

Thanks for the updated patch.


[...snip...]

> +private:
> +  const gimple &m_call_stmt; // non-NULL

Getting nitpicky, but a reference captures "non-nullness" in the C++
type system so the above comment is redundant.

> +  tree m_fndecl;  // non-NULL

whereas "tree" is a typedef to a pointer so this makes sense.



> +class kf_mkstemp : public known_function
> +{
> +public:
> +  bool
> +  matches_call_types_p (const call_details &cd) const final override
> +  {
> +    return (cd.num_args () == 1 && cd.arg_is_pointer_p (0));
> +  }
> +
> +  void
> +  impl_call_pre (const call_details &cd) const final override
> +  {
> +    region_model_context *ctxt = cd.get_ctxt ();
> +    /* If there's no context we can't store warnings, so we return 
> early. */
> +    if (!ctxt)
> +  {
> + cd.set_any_lhs_with_defaults ();
> + return;
> +  }

Hmmm, I don't like the way the patch has
 cd.set_any_lhs_with_defaults ();
above and then repeats it below at the end of the function (DRY
principle).  How about introducing a subroutine within the class to do
the checks, so it looks like:

   if (ctxt)
 if (!check_argument (cd))
   return
   cd.set_any_lhs_with_defaults ();

or somesuch?  Or does that make things messier?

> +
> +    const svalue *ptr_sval = cd.get_arg_svalue (0);
> +    region_model *model = cd.get_model ();
> +
> +    const svalue *strlen_sval
> +  = model->check_for_null_terminated_string_arg (cd, 0, false, 
> nullptr);
> +    if (!strlen_sval)
> +  {
> + ctxt->terminate_path ();
> + return;
> +  }
> +
> +    if (cd.get_arg_string_literal (0))
> +  {
> + ctxt->warn (std::make_unique (cd));
> +  }
> +    else if (check_suffix (cd, ptr_sval, strlen_sval).is_false ())
> +  {
> + ctxt->warn (std::make_unique (cd));
> +  }
> +
> +    cd.set_any_lhs_with_defaults ();
> +  }
> +

[...snip...]

> diff --git a/gcc/testsuite/gcc.dg/analyzer/mkstemp-1.c 
> b/gcc/testsuite/gcc.dg/analyzer/mkstemp-1.c
> new file mode 100644
> index 000..d8e0e51c39f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/mkstemp-1.c
> @@ -0,0 +1,96 @@
> +/* { dg-additional-options "-Wno-analyzer-null-argument" } */
> +/* { dg-skip-if "has no putenv" { "avr-*-*" } } */

The above skip-if line looks like it was a copy&paste from a test of
"putenv"; it means that the test should be skipped on targets matching
avr-*-* since they don't have "putenv".

There might be targets that don't have mkstemp, but until we discover
them, best to delete that line for this test.


Other than those nitpicks, the patch looks great; thanks
Dave