On Thu, Mar 1, 2012 at 11:12 PM, Richard Smith <[email protected]> wrote: > On Thu, Mar 1, 2012 at 9:26 PM, Nico Weber <[email protected]> wrote: >> >> >> > On Tue, Feb 21, 2012 at 7:17 PM, Nico Weber <[email protected]> >> >> > wrote: >> >> >> >> >> >> the attached patch adds -Wstring-plus-int (I'm open to more creative >> >> >> names :-) ), which warns on code like |"error code: " + err|. The >> >> >> warning found 4 bugs in 2 files in chromium. The bugs it found look >> >> >> like this: >> >> >> >> >> >> File 1: >> >> >> if (!read_node.InitByIdLookup(it->id)) { >> >> >> error_handler()->OnUnrecoverableError(FROM_HERE, "Failed to >> >> >> look >> >> >> up " >> >> >> " data for received change with id " + it->id); >> >> >> return; >> >> >> } >> >> >> >> >> >> File 2: >> >> >> ResetStream(stream_id, spdy::INVALID_ASSOCIATED_STREAM, >> >> >> "Received OnSyn with inactive associated stream " + >> >> >> associated_stream_id); >> >> >> (Fix: >> >> >> >> >> >> http://codereview.chromium.org/9372076/diff/1/net/spdy/spdy_session.cc >> >> >> ) >> >> >> >> >> >> (A coworker found the bug in the one file, which prompted me writing >> >> >> the warning. The warning then found the bug in the other file.) >> >> >> >> >> >> >> >> >> When building all of chromium and its many dependencies, the warning >> >> >> did also find 3 false positives, but they all look alike: Ffmepg >> >> >> contains three different functions that all look like this: >> >> > >> >> > >> >> > 4 bugs and 3 false positives doesn't sound great to me. >> >> >> >> True. All of the the 3 false positives are really the same snippet >> >> though. (On the other hand, 3 of the 4 bugs are very similar as well.) >> >> The bugs this did find are crash bugs, and they were in rarely run >> >> error-logging code, so it does find interesting bugs. Maybe you guys >> >> can run it on your codebase after it's in, and we can move it out of >> >> -Wmost if you consider it too noisy in practice? >> > >> > >> > Sounds reasonable. >> > >> >> >> >> > Have you considered >> >> > relaxing the warning in cases where the integral summand is a >> >> > constant >> >> > expression and is in-bounds? Would this pattern have matched any of >> >> > your >> >> > real bugs? >> >> >> >> That's a good idea (for my bugs, the int was always a declrefexpr, >> >> never a literal). The ffmpeg example roughly looks like this: >> >> >> >> #define A "foo" >> >> #define B "bar" >> >> consume(A B + sizeof(A) - 1); >> >> >> >> The RHS is just "sizeof(A)" without the "- 1", but since A B has a >> >> length of 6, this still makes this warning go away. I added this to >> >> the patch. With this change, it's 4 bugs / 0 false positives. >> >> >> >> Note that this suppresses the warning for most enums which start at 0. >> >> Or did you mean to do this only for non-enum constant expressions? >> > >> > >> > I could imagine code legitimately wanting to do something like this: >> > >> > template</*...*/> >> > struct S { >> > enum { Offset = /* ... */ }; >> > static const char *str = "some string" + Offset; >> > }; >> > >> > That said, I'm happy for us to warn on such code. Whatever you prefer >> > seems >> > fine to me; we can refine this later, if a need arises. >> >> Enums are handled like ints: They warn if the offset is out of bounds, >> but don't if it's they are in bounds. >> >> >> > Onto the patch... >> >> > >> >> > Index: include/clang/Basic/DiagnosticGroups.td >> >> > =================================================================== >> >> > --- include/clang/Basic/DiagnosticGroups.td (revision 150418) >> >> > +++ include/clang/Basic/DiagnosticGroups.td (working copy) >> >> > @@ -320,6 +321,7 @@ >> >> > ReturnType, >> >> > SelfAssignment, >> >> > SizeofArrayArgument, >> >> > + StringPlusInt, >> >> > Trigraphs, >> >> > Uninitialized, >> >> > UnknownPragmas, >> >> > >> >> > Given the current level of false positives, I'm not completely >> >> > convinced >> >> > this should go into -Wmost. I imagine we'll know more once we've run >> >> > this on >> >> > more code. >> >> > >> >> > Index: include/clang/Basic/DiagnosticSemaKinds.td >> >> > =================================================================== >> >> > --- include/clang/Basic/DiagnosticSemaKinds.td (revision 150418) >> >> > +++ include/clang/Basic/DiagnosticSemaKinds.td (working copy) >> >> > @@ -3414,6 +3414,12 @@ >> >> > "explicitly assigning a variable of type %0 to itself">, >> >> > InGroup<SelfAssignment>, DefaultIgnore; >> >> > >> >> > +def warn_string_plus_int : Warning< >> >> > + "Adding an integral to a string does not append to the string">, >> >> > >> >> > Diagnostics outside the static analyzer should not be capitalized. >> >> >> >> Done. >> >> >> >> > + InGroup<StringPlusInt>; >> >> > +def note_string_plus_int_silence : Note< >> >> > + "use &(str[index]) if pointer arithmetic is intended">; >> >> > >> >> > The parentheses here are redundant. I think I'd prefer for the >> >> > suggestion to >> >> > be provided via fixits (and to reword this as something like "use >> >> > array >> >> > indexing to silence this warning"). >> >> >> >> Clang tends to suggest very explicit parentheses (see >> >> -Wparentheses-logical-op for example). >> > >> > >> > Usually only in cases where there is a potential bug. If we saw >> > &"foo"[2], >> > we wouldn't be likely to suspect that the user might have really meant >> > (&"foo")[2], so my (weak) preference is to suggest the simpler fix. >> >> Removed the parens. >> >> > (Incidentally, -Warray-bounds should be taught to complain about >> > (&"foo")[2]!) >> >> Patches accepted! :-D >> >> >> I did change it to a fixit and >> >> reworded, it now looks like so: >> >> >> >> test/SemaCXX/string-plus-int.cpp:12:17: warning: adding an integral to >> >> a string does not append to the string [-Wstring-plus-int] >> >> consume("foo" + 5); >> >> ~~~~~~^~~ >> >> test/SemaCXX/string-plus-int.cpp:12:17: note: use array indexing to >> >> silence this warning >> >> consume("foo" + 5); >> >> ^ >> >> &( )[ ] >> > >> > >> > This looks nice ... though if we're going to have parens in the fixit, >> > they >> > should contain the [] :-) >> > >> >> >> >> > Index: lib/Sema/SemaExpr.cpp >> >> > =================================================================== >> >> > --- lib/Sema/SemaExpr.cpp (revision 150418) >> >> > +++ lib/Sema/SemaExpr.cpp (working copy) >> >> > @@ -7798,6 +7798,20 @@ >> >> > ParensRange); >> >> > } >> >> > >> >> > +/// DiagnoseStringPlusInt - Emit a warning when adding an integer to >> >> > a >> >> > string >> >> > +/// literal. >> >> > +static void DiagnoseStringPlusInt(Sema &Self, SourceLocation OpLoc, >> >> > + Expr *LHSExpr, Expr *RHSExpr) { >> >> > + bool IsStringPlusInt = dyn_cast<StringLiteral>(LHSExpr) && >> >> > + RHSExpr->getType()->isIntegralType(Self.getASTContext()); >> >> > >> >> > We should probably also check for (unscoped) enumeration types, and >> >> > for >> >> > class types with implicit conversion to int. Doing so will require >> >> > moving >> >> > this check to checkArithmeticOpPointerOperand (after we've checked >> >> > for >> >> > an >> >> > overloaded operator+). >> >> >> >> Added a check for (unscoped) enumeration types. class types with >> >> implicit conversion to int feel like overkill to me. >> > >> > >> > OK. You should still move the check to after we check for an overloaded >> > operator+, since operator+ can be overloaded for enums. (Once that's >> > done, >> > you may find it's easier to warn on such implicitly-convertible types >> > than >> > to not do so.) >> >> Done, added an enum-overloaded operator+ to the test. >> >> >> > + if (!IsStringPlusInt) >> >> > + return; >> >> > + >> >> > + SourceRange DiagRange(LHSExpr->getLocStart(), >> >> > RHSExpr->getLocEnd()); >> >> > + Self.Diag(OpLoc, diag::warn_string_plus_int) << DiagRange; >> >> > + Self.Diag(OpLoc, diag::note_string_plus_int_silence); >> >> > +} >> >> > >> >> > We should also catch the case where the integer is on the LHS (and >> >> > skip >> >> > the >> >> > fixit in that case). >> >> >> >> I tried that. It found neither additional bugs nor additional false >> >> positives, so it's probably not worth it? >> > >> > >> > Fine, there's no point adding this if it doesn't actually catch any >> > bugs. I >> > have only empirical evidence that it might -- I once saw[*] the moral >> > equivalent of: >> > >> > int n = /*...*/; >> > std::string s = n + " widgets found"; >> > >> > [*] a long time ago, in a codebase far away... >> >> Ok, I added the warning for the int on the left as well (without fixit >> in that case, like you suggested upthread). > > > Just a few more things: > > +def warn_string_plus_int : Warning< > + "adding an integral to a string does not append to the string">, > + InGroup<StringPlusInt>; > > Rather than "an integral", can we say something like: > > adding %0 to a string literal does not append to the string > > (where %0 is the type)?
Done.
> + if (IndexExpr->EvaluateAsInt(index, Self.getASTContext())) {
> + if (index.isNonNegative() &&
> + index < llvm::APSInt(llvm::APInt(index.getBitWidth(),
> + StrExpr->getByteLength()),
>
> This should be <=, not <, I think: we want to allow "foo"+4 (a pointer past
> the null) but warn on "foo"+5 (which has undefined behavior). And
> getByteLength() should be getLength(), to properly handle wide string
> literals. We'll get duplicate warnings for the same issue if
> -Warray-bounds-pointer-arithmetic is enabled and we carry on here, but such
> code is rare enough that I don't think that's a pressing concern.
Done (-ish: getLength() didn't include the trailing \0, so I'm doing
<= getLength() + 1), added a test.
> Other than that, a test for the fix-it would be great (see test/FixIt for
> examples).
Wouldn't that require the fixit to be on the warning instead of the
note? (If so, maybe there's no need for the test yet?)
Nico
clang-str-plus-int-v4.patch
Description: Binary data
_______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
