On Fri, 2026-03-13 at 12:23 +0530, Ridham Khurana wrote:
> Hi Dave,
>
> Thanks for the confirmation about the expected type argument, I will
> add it in the shared layer.
>
> While going through the current analyzer implementation, I noticed
> that arguments to function calls are retrieved through
> *call_details::get_arg_svalue()* and then handled as const svalue*,
> rather than *tree* nodes as in the frontend and GIMPLE passes. From
> what I can understand, the behaviour of library calls is modelled
> through *known_function* handlers interacting with the *region_model*
> (for example through *impl_call_pre in kf_ handlers*), and the
> existing checks for functions like printf are mostly driven by the
> format attribute and validation of the format string argument (for
> example using *check_for_null_terminated_string_arg()*), rather than
> by interpreting the individual directives.
That's correct.

> But one thing that I am not sure about is where the shared string-
> parser should be integrated on the analyzer side. Maybe it should be
> triggered through the attribute-based path, or it is better to use it
> inside the individual kf_* handlers for printf-style functions.

I'm not sure. I think we want a subroutine inside the analyzer that
can be called from either place, and then see how well each approach
works.

On the subject of known_function handlers, some other GSoC candidates
have had success in making patches that add new known_function
subclasses for specific POSIX/C stdlib entrypoints. This is a
relatively easy and self-contained way to improve -fanalyzer, and it's
a good way to demonstrate technical prowess, and to shake out any
problems that a candidate might run into building/debugging gcc on
their hardware. It overlaps with the format-string support, so it
would be a useful learning experience - but you'd have to choose a
simpler API entrypoint (obviously we don't have the format-string
parsing in convenient modular form yet).

> Also, before starting to draft the official proposal, I wanted to
> confirm the expected size of this project. From my current
> understanding, it would be 350 hours,

I think 350 hours is the better choice; this is a rather ambitious
project.

> dividing this project into 2 major phases, the first phase of the
> project to unify the parsing logic among all 3 subsystems

(it would be the *2* subsystems at this time, since the analyzer
doesn't yet support format strings)

> and the second phase to be the actual work on the analyzer part.
> Please let me know if it matches your expectations or would you
> prefer the 175 hour scope?

FWIW I'm always a bit sceptical of timetables that rigidly divide
projects into phases - it feels too much like the "waterfall" model of
development.
But yes, splitting out the parsing logic from the other 2 subsystems
is a prerequisite before using it in -fanalyzer (I suppose you could
have a proof-of-concept that recognizes hardcoded strings and provides
the analyzer with the (hardcoded) action list, but that's probably
wasted effort compared to simply doing the refactoring work).

A useful exercise would be to get familiar with running gcc's full
test suite, and verifying that a patch doesn't regress anything, since
that's very important during the refactoring of the existing code.

Hope this is helpful and makes sense; let me know if you have any
questions.

Dave
