On Sat, 2 Feb 2019, Martin Sebor wrote:

> I don't feel too strongly about whether -Wmaybe-uninitialized should
> be in -Wall or in -Wextra, and I might even be somewhat more inclined
> to expect to find it in the latter.  But since you sound like you are
> gearing up to propose other changes in the same vein down the line,
> where I might have stronger opinions, I should comment.

> [It's a bit of a long-winded response because I've been thinking about
> this topic a lot lately.]

Thank you for taking the time to write this down.

> In general, I think a discussion of warning groupings is worth having
> (even needed) at the right time, but stage 4 doesn't strike me as
> the most opportune occasion.

> Specifically for -Wmaybe-uninitialized, the option has been in -Wall
> for many releases, and no major changes to it have been made recently
> that I know of.  So what I'm missing in this proposal is: why now?
> What has changed to make this a pressing issue now?  Has its rate of
> false positives gone up significantly?  If so, by how much and why?

I've been wanting to post this for years, but I only got motivated
enough recently, after seeing the negative effects of
-Wmaybe-uninitialized in two projects within a few days. I don't think
the rate of false positives has gone up significantly in gcc-9; they
just randomly appear or disappear from one release to the next. In
some sense, for projects that support multiple compiler versions, each
new gcc release makes the number of false positives (on at least one
version) go up, but that doesn't count as the warning getting worse.

The discussion and/or the change don't need to happen now, but if I didn't post this patch while motivated, it might be years before I actually do it, so I ignored stage 4.

> Please be clear about what you mean by false positives.  Is it that
> the warning triggers contrary to the documentation ("a path exists
> where the variable is uninitialized along with one where it is
> initialized"), or that the path to the use of the variable does exist
> but we (though not GCC) can tell from the context that it cannot be
> reached?

The latter.

> (The latter wouldn't qualify as a false positive as the term is
> defined in literature; i.e., the warning works as designed, we just
> don't like or agree with it.)

Indeed, the definition of a false positive is important. In some
sense, no warning is ever a false positive when its definition is "the
compiler decided to warn", but that definition makes the term useless.
-Wmaybe-uninitialized does not say there is an uninitialized use; it
says the control flow is too complicated or the compiler not clever
enough (or there is indeed an uninitialized use for some input, and
only this subcase is a bug). Another warning, -Wstrict-overflow (after
it moved to the middle-end), was more of an optimization note than a
true warning: "warning: assuming your code is valid when generating
code". Duh! Was I supposed to make it invalid to quiet the warning?
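
To make this concrete, here is a minimal sketch (made up for this
message, not taken from either project) of the pattern behind most of
the reports I see. Depending on the GCC version, the optimization
level and the surrounding code, GCC may or may not warn that 'x' may
be used uninitialized, even though the use is guarded by the same
condition as the initialization:

    int f(int cond, int v)
    {
        int x;              /* deliberately left uninitialized */
        if (cond)
            x = v;
        /* ... enough intervening code here and the optimizers lose
           track of the fact that both branches test 'cond' ... */
        if (cond)
            return x;       /* only reached when x was assigned */
        return 0;
    }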

> In practice, false positives (and negatives) of both kinds, whether
> they fit the formal definition or the informal one, are the nature
> of virtually all non-trivial static diagnostics, certainly all those
> that depend on control or data flow analysis.  Some are due to bugs
> or limitations in the implementation of the warning.  Others are
> inherent in the technology.

Yes, and I argue that such warnings belong in a different "level" than
the trivial ones.

> Is this warning more prone to one kind than others?  If so, is it
> because it's implemented poorly, or that its implementation hasn't
> kept up with the improvements to the optimizers, or has the
> infrastructure it depends on become ill-suited for it to avoid some
> of the false positives (as formally defined), or is it something
> else?

I am mostly looking at what I see in practice. I regularly see false
positives from -Wstrict-overflow (fewer now that I neutered parts of
it) and -Wmaybe-uninitialized. And when I have strange crashes that
later turn out to be caused by uninitialized memory uses, it is very
seldom that -Wmaybe-uninitialized helps detect them. IIRC, last time,
I had slightly more success with -Wreturn-local-addr -O3
-fkeep-inline-functions -fkeep-static-functions.
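
For the record, the kind of bug that combination catches looks roughly
like this (a simplified sketch, not the actual code); the -fkeep-*
flags matter because without them a small helper can be inlined or
discarded before the warning gets a chance to fire:

    int *dangling(void)
    {
        int local = 42;
        return &local;  /* -Wreturn-local-addr: address of a local escapes */
    }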

I think several of the new middle-end warnings you added are about string manipulation. I don't do any of that, which may explain why I am not seeing them. Also, we are not using gcc-9 much yet.

>> These false positives are not easy to avoid, as required to be part
>> of -Wall. Avoiding them, when it is possible at all, requires not
>> just a syntactic tweak, like adding parentheses, but a semantic
>> change that can make the code worse. Initializing something that
>> does not need it is extra code (increases code size and running
>> time). It also prevents better tools from detecting true
>> uninitialized uses, either static analyzers or runtime checkers
>> (sanitizer, valgrind).

> I don't find the argument very compelling that diagnosing potential
> bugs should be avoided because the fix (or the suppression in
> the case of a false positive) could be wrong.  The risk exists with
> all diagnostics.

Yes, but in practice I haven't seen people get the parentheses wrong after a warning from -Wparentheses.

> I also take issue with the suggestion that dynamic analysis is
> necessarily a superior mechanism for detecting bugs.  They each have
> their strengths and weaknesses.  They are not an either-or
> proposition but rather complementary solutions.

I didn't claim it was necessarily superior, and I started my
enumeration of potentially better tools with static analyzers, i.e.,
tools like gcc's -Wmaybe-uninitialized but really focused on the
analysis rather than on optimization. Dynamic analysis has the
advantage that it avoids dead branches, and thus a whole class of
false positives, but it is indeed also a weakness that it misses the
many possible paths not covered during testing. I agree 100% that they
are complementary.

> Lastly, in the case of uninitialized variables, the usual solution
> of initializing them is trivial and always safe (some coding styles
> even require it).

Here it shows that we don't work with the same type of code at all. If
I am using a boost::optional, i.e. a class containing a buffer and a
boolean that says whether the buffer is initialized, how do I
initialize the (private) buffer? Or should boost itself zero out the
buffer whenever the boolean is set to false? The variables can easily
be hidden behind dozens of levels of abstraction that make them hard
to initialize, and there may be nothing meaningful to initialize them
with. In gcc, "uninitialized" also covers clobbered variables (gone
out of scope, for instance), where it isn't clear what you are
supposed to initialize to quiet the warning.
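
To illustrate, here is a rough sketch of the layout I have in mind
(not the real boost implementation): there is no meaningful value to
put in the buffer while the flag is false, and zeroing it would cost
cycles and hide real bugs from the runtime checkers.

    #include <new>
    #include <utility>

    template <class T>
    class optional_sketch {
        alignas(T) unsigned char buf_[sizeof(T)]; // deliberately uninitialized
        bool engaged_ = false;                    // does buf_ hold a T?
    public:
        template <class... Args>
        void emplace(Args&&... args) {
            ::new (static_cast<void *>(buf_)) T(std::forward<Args>(args)...);
            engaged_ = true;
        }
        // Only valid when engaged_ is true; there is nothing sensible
        // to "initialize" buf_ with while it is false.
        T &value() { return *reinterpret_cast<T *>(buf_); }
        ~optional_sketch() { if (engaged_) value().~T(); }
    };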

By the way, we usually try to avoid imposing a coding style, although that's not always true.

> Initializing scalars has a negligible performance
> impact in practice, and, if it's thought to be necessary, can easily
> be done conditionally (as in, based on some macro) so that
> the uninitialized accesses can still be detected by dynamic analysis.

I'd rather avoid increasing yet again the use of macros, but yes, when
initializing to quiet what has been identified as a false positive,
that seems desirable. And the variables can be arrays instead of
scalars.
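
Something like the following is what I understand you to mean (the
macro name is made up for the example): builds that want a quiet
-Werror define the macro, while other builds leave the variable
uninitialized so that valgrind or MemorySanitizer can still catch a
real bug.

    /* hypothetical macro, for illustration only */
    #ifdef QUIET_FALSE_POSITIVES
    # define MAYBE_UNINIT = 0
    #else
    # define MAYBE_UNINIT
    #endif

    int g(int cond, int v)
    {
        int x MAYBE_UNINIT; /* initialized only to silence the warning */
        if (cond)
            x = v;
        return cond ? x : 0;
    }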

>> This message concentrates on the negatives, but that doesn't mean I
>> consider -Wmaybe-uninitialized as useless. It can find true
>> uninitialized uses. And even some false positives can point at
>> places where we can help the compiler generate better code (say
>> with a default __builtin_unreachable case in a switch). I did in
>> the past contribute patches to make it warn more often, and I might
>> do so again in the future.
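
(To expand on the __builtin_unreachable parenthetical above with a
concrete sketch: if e is known to take only the values 0 to 2, saying
so both quiets the warning on x and lets gcc generate better code for
the switch.)

    int h(int e)   /* caller guarantees e is 0, 1 or 2 */
    {
        int x;
        switch (e) {
        case 0: x = 10; break;
        case 1: x = 20; break;
        case 2: x = 30; break;
        default: __builtin_unreachable(); /* tell gcc this cannot happen */
        }
        return x;
    }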

> This paragraph makes me think you equate false positives from this
> warning with those from -Wuninitialized.

Uh? No, I don't know what gave you this impression.

"May be unitialized" doesn't preclude the possibility that the variable is not used uninitialized.

But if the variable can never be used uninitialized and gcc is simply unable to prove it, the warning is bad. It may not be a "false positive" depending on your definition, but it is undesirable.

>> My opinion is that -Wmaybe-uninitialized would serve its purpose
>> better as part of -Wextra. People tend to use -Wall with -Werror
>> (either explicitly or implicitly by modifying the code until all
>> warnings are gone). What I see currently in projects where I
>> participate is that -Wmaybe-uninitialized is making things worse.
>> People don't investigate deeply the cause of the warning, they just
>> go for whatever "quick-fix" makes the compiler shut up. Quite
>> often, this involves extra code that is less readable and performs
>> worse, while it didn't even "fix" what caused the warning, it just
>> randomly ended up with code where the compiler doesn't warn
>> (possibly because the function got bigger, which changed inlining
>> decisions...).

> I agree that we need to be mindful of warnings leading to unnecessary
> and perhaps inappropriate code changes (as the author of a few
> warnings with this potential, it has been on my mind a lot).

> At the same time, not taking the time to properly analyze diagnostics
> and come up with the appropriate solutions seems like a problem
> that's orthogonal to the warnings themselves.  Disabling warnings in
> hopes of solving what's really a problem with the development process
> or a culture of a subset of projects doesn't seem like an optimal
> solution.

I agree with that. But if I complain about the mindless quickfixes, the reply is that customers ask for a library that does not warn, and the developers do not have infinite time, so they do the minimum to avoid warnings. This is strongly linked to the warning levels.

One could recommend that customers include the library headers with
-isystem to disable the warnings coming from inside the library, but
that's fragile (though it has gotten quite good), and most importantly
it also disables some warnings that are relevant to the user.
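
(I mean something like the following, with a made-up install path:)

    g++ -isystem /opt/thelib/include -Wall -Werror app.cpp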

If the warning is not enabled by default, but only turned on when
people are ready to investigate every warning thoroughly, the
quick-fix mentality is less likely to be present. People using
-Wmaybe-uninitialized need to be willing to ignore false positives, or
to disable them with pragmas.
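
The pragma route looks like this (a minimal sketch; the hard part in
real code is scoping it narrowly enough that it doesn't hide other
warnings):

    int k(int cond, int v)
    {
        int x;
        if (cond)
            x = v;
    #pragma GCC diagnostic push
    #pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
        int r = cond ? x : 0; /* analysed: x is only read when it was set */
    #pragma GCC diagnostic pop
        return r;
    }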

Note that similar arguments may apply to some other warnings that somehow made their way into -Wall when they shouldn't have, but for now I am only proposing to move -Wmaybe-uninitialized. Some people tend to consider that if a warning is not part of -Wall, it might as well not exist. Obviously I disagree with that.

> The description of -Wall is vague enough that arguments about which
> warnings should or shouldn't be enabled by it are unavoidably subject
> to individual biases and unlikely to lead to a consensus, either
> among ourselves or, even less likely, among users.

True. To me, one of the key documented properties of -Wall is being
"easy to avoid". -Wall -Werror should be a usable combination.
Anything not easy to avoid belongs in a different class, say -Wextra.
We can rename that to W1/W2/... if -Wall is too charged with
historical meaning.

One issue is that, especially in a large, long-lived code base, people
want to know when a change (or a new compiler) introduces a new
warning. However, they do not want to have to go through hundreds of
pre-existing, already analysed warnings. The easiest way to achieve
that is to make sure there are no warnings at all (-Werror). For
speculative warnings, that's really not convenient. I don't think
there is a good solution to this (marking the known false positives
with a pragma may be hard when instantiating some class somewhere
causes a warning in some almost unrelated place, and you want the
marking to be narrow enough that it won't also disable other
warnings), and it is independent of whether the warning is in -Wall or
-Wextra. I just have a feeling that splitting the easy and the
not-easy warnings may help developers handle them differently.

> A shared definition of a false positive should be one of the very
> first steps to coming closer to a consensus. Real-world (as opposed
> to anecdotal) data on the actual rates of false positives and
> negatives vs. true positives would also be most helpful, as would
> some consensus on the severity of the bugs the true positives
> expose, as well as some objective measure of the ease of
> suppression. There probably are others, but these would be a start.

This data is going to be super hard to get. Most projects have been
compiling for years and tweaking their code to avoid some warnings. We
do not get to see the code that people originally write; we can only
see what they commit.

--
Marc Glisse
