On Thu, Mar 1, 2012 at 9:26 PM, Nico Weber <[email protected]> wrote:
> >> > On Tue, Feb 21, 2012 at 7:17 PM, Nico Weber <[email protected]> > wrote: > >> >> > >> >> the attached patch adds -Wstring-plus-int (I'm open to more creative > >> >> names :-) ), which warns on code like |"error code: " + err|. The > >> >> warning found 4 bugs in 2 files in chromium. The bugs it found look > >> >> like this: > >> >> > >> >> File 1: > >> >> if (!read_node.InitByIdLookup(it->id)) { > >> >> error_handler()->OnUnrecoverableError(FROM_HERE, "Failed to look > >> >> up " > >> >> " data for received change with id " + it->id); > >> >> return; > >> >> } > >> >> > >> >> File 2: > >> >> ResetStream(stream_id, spdy::INVALID_ASSOCIATED_STREAM, > >> >> "Received OnSyn with inactive associated stream " + > >> >> associated_stream_id); > >> >> (Fix: > >> >> > http://codereview.chromium.org/9372076/diff/1/net/spdy/spdy_session.cc > >> >> ) > >> >> > >> >> (A coworker found the bug in the one file, which prompted me writing > >> >> the warning. The warning then found the bug in the other file.) > >> >> > >> >> > >> >> When building all of chromium and its many dependencies, the warning > >> >> did also find 3 false positives, but they all look alike: Ffmepg > >> >> contains three different functions that all look like this: > >> > > >> > > >> > 4 bugs and 3 false positives doesn't sound great to me. > >> > >> True. All of the the 3 false positives are really the same snippet > >> though. (On the other hand, 3 of the 4 bugs are very similar as well.) > >> The bugs this did find are crash bugs, and they were in rarely run > >> error-logging code, so it does find interesting bugs. Maybe you guys > >> can run it on your codebase after it's in, and we can move it out of > >> -Wmost if you consider it too noisy in practice? > > > > > > Sounds reasonable. > > > >> > >> > Have you considered > >> > relaxing the warning in cases where the integral summand is a constant > >> > expression and is in-bounds? Would this pattern have matched any of > your > >> > real bugs? > >> > >> That's a good idea (for my bugs, the int was always a declrefexpr, > >> never a literal). The ffmpeg example roughly looks like this: > >> > >> #define A "foo" > >> #define B "bar" > >> consume(A B + sizeof(A) - 1); > >> > >> The RHS is just "sizeof(A)" without the "- 1", but since A B has a > >> length of 6, this still makes this warning go away. I added this to > >> the patch. With this change, it's 4 bugs / 0 false positives. > >> > >> Note that this suppresses the warning for most enums which start at 0. > >> Or did you mean to do this only for non-enum constant expressions? > > > > > > I could imagine code legitimately wanting to do something like this: > > > > template</*...*/> > > struct S { > > enum { Offset = /* ... */ }; > > static const char *str = "some string" + Offset; > > }; > > > > That said, I'm happy for us to warn on such code. Whatever you prefer > seems > > fine to me; we can refine this later, if a need arises. > > Enums are handled like ints: They warn if the offset is out of bounds, > but don't if it's they are in bounds. > > >> > Onto the patch... > >> > > >> > Index: include/clang/Basic/DiagnosticGroups.td > >> > =================================================================== > >> > --- include/clang/Basic/DiagnosticGroups.td (revision 150418) > >> > +++ include/clang/Basic/DiagnosticGroups.td (working copy) > >> > @@ -320,6 +321,7 @@ > >> > ReturnType, > >> > SelfAssignment, > >> > SizeofArrayArgument, > >> > + StringPlusInt, > >> > Trigraphs, > >> > Uninitialized, > >> > UnknownPragmas, > >> > > >> > Given the current level of false positives, I'm not completely > convinced > >> > this should go into -Wmost. I imagine we'll know more once we've run > >> > this on > >> > more code. > >> > > >> > Index: include/clang/Basic/DiagnosticSemaKinds.td > >> > =================================================================== > >> > --- include/clang/Basic/DiagnosticSemaKinds.td (revision 150418) > >> > +++ include/clang/Basic/DiagnosticSemaKinds.td (working copy) > >> > @@ -3414,6 +3414,12 @@ > >> > "explicitly assigning a variable of type %0 to itself">, > >> > InGroup<SelfAssignment>, DefaultIgnore; > >> > > >> > +def warn_string_plus_int : Warning< > >> > + "Adding an integral to a string does not append to the string">, > >> > > >> > Diagnostics outside the static analyzer should not be capitalized. > >> > >> Done. > >> > >> > + InGroup<StringPlusInt>; > >> > +def note_string_plus_int_silence : Note< > >> > + "use &(str[index]) if pointer arithmetic is intended">; > >> > > >> > The parentheses here are redundant. I think I'd prefer for the > >> > suggestion to > >> > be provided via fixits (and to reword this as something like "use > array > >> > indexing to silence this warning"). > >> > >> Clang tends to suggest very explicit parentheses (see > >> -Wparentheses-logical-op for example). > > > > > > Usually only in cases where there is a potential bug. If we saw > &"foo"[2], > > we wouldn't be likely to suspect that the user might have really meant > > (&"foo")[2], so my (weak) preference is to suggest the simpler fix. > > Removed the parens. > > > (Incidentally, -Warray-bounds should be taught to complain about > > (&"foo")[2]!) > > Patches accepted! :-D > > >> I did change it to a fixit and > >> reworded, it now looks like so: > >> > >> test/SemaCXX/string-plus-int.cpp:12:17: warning: adding an integral to > >> a string does not append to the string [-Wstring-plus-int] > >> consume("foo" + 5); > >> ~~~~~~^~~ > >> test/SemaCXX/string-plus-int.cpp:12:17: note: use array indexing to > >> silence this warning > >> consume("foo" + 5); > >> ^ > >> &( )[ ] > > > > > > This looks nice ... though if we're going to have parens in the fixit, > they > > should contain the [] :-) > > > >> > >> > Index: lib/Sema/SemaExpr.cpp > >> > =================================================================== > >> > --- lib/Sema/SemaExpr.cpp (revision 150418) > >> > +++ lib/Sema/SemaExpr.cpp (working copy) > >> > @@ -7798,6 +7798,20 @@ > >> > ParensRange); > >> > } > >> > > >> > +/// DiagnoseStringPlusInt - Emit a warning when adding an integer to > a > >> > string > >> > +/// literal. > >> > +static void DiagnoseStringPlusInt(Sema &Self, SourceLocation OpLoc, > >> > + Expr *LHSExpr, Expr *RHSExpr) { > >> > + bool IsStringPlusInt = dyn_cast<StringLiteral>(LHSExpr) && > >> > + RHSExpr->getType()->isIntegralType(Self.getASTContext()); > >> > > >> > We should probably also check for (unscoped) enumeration types, and > for > >> > class types with implicit conversion to int. Doing so will require > >> > moving > >> > this check to checkArithmeticOpPointerOperand (after we've checked for > >> > an > >> > overloaded operator+). > >> > >> Added a check for (unscoped) enumeration types. class types with > >> implicit conversion to int feel like overkill to me. > > > > > > OK. You should still move the check to after we check for an overloaded > > operator+, since operator+ can be overloaded for enums. (Once that's > done, > > you may find it's easier to warn on such implicitly-convertible types > than > > to not do so.) > > Done, added an enum-overloaded operator+ to the test. > > >> > + if (!IsStringPlusInt) > >> > + return; > >> > + > >> > + SourceRange DiagRange(LHSExpr->getLocStart(), > RHSExpr->getLocEnd()); > >> > + Self.Diag(OpLoc, diag::warn_string_plus_int) << DiagRange; > >> > + Self.Diag(OpLoc, diag::note_string_plus_int_silence); > >> > +} > >> > > >> > We should also catch the case where the integer is on the LHS (and > skip > >> > the > >> > fixit in that case). > >> > >> I tried that. It found neither additional bugs nor additional false > >> positives, so it's probably not worth it? > > > > > > Fine, there's no point adding this if it doesn't actually catch any > bugs. I > > have only empirical evidence that it might -- I once saw[*] the moral > > equivalent of: > > > > int n = /*...*/; > > std::string s = n + " widgets found"; > > > > [*] a long time ago, in a codebase far away... > > Ok, I added the warning for the int on the left as well (without fixit > in that case, like you suggested upthread). Just a few more things: +def warn_string_plus_int : Warning< + "adding an integral to a string does not append to the string">, + InGroup<StringPlusInt>; Rather than "an integral", can we say something like: adding %0 to a string literal does not append to the string (where %0 is the type)? + if (IndexExpr->EvaluateAsInt(index, Self.getASTContext())) { + if (index.isNonNegative() && + index < llvm::APSInt(llvm::APInt(index.getBitWidth(), + StrExpr->getByteLength()), This should be <=, not <, I think: we want to allow "foo"+4 (a pointer past the null) but warn on "foo"+5 (which has undefined behavior). And getByteLength() should be getLength(), to properly handle wide string literals. We'll get duplicate warnings for the same issue if -Warray-bounds-pointer-arithmetic is enabled and we carry on here, but such code is rare enough that I don't think that's a pressing concern. Other than that, a test for the fix-it would be great (see test/FixIt for examples). Thanks! Richard
_______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
