On 23.02.2013 6:47, Jordan Rose wrote:
On Feb 22, 2013, at 7:24 , Anton Yartsev <[email protected]
<mailto:[email protected]>> wrote:
On 22.02.2013 9:30, Anna Zaks wrote:
On Feb 21, 2013, at 6:26 PM, Anna Zaks <[email protected]
<mailto:[email protected]>> wrote:
On Feb 21, 2013, at 6:00 PM, Anton Yartsev <[email protected]
<mailto:[email protected]>> wrote:
On Feb 19, 2013, at 10:18 PM, Anton Yartsev
<[email protected] <mailto:[email protected]>> wrote:
Hi, Jordan. Thanx for the review!
Attached is the new version of the patch with all the comments
addressed. Also added support for directly called operator
new()/new[]() and operator delete()
There is currently one problem with handling of operator
delete(). The following test
void testDeleteOp1() {
int *p = (int *)malloc(sizeof(int));
operator delete(p); // FIXME: should complain "Argument to
operator delete() was allocated by malloc(), not operator new"
}
produce no warnings as attached RefState seem to be missing at
the point when checkPostStmt(const CallExpr *CE, CheckerContext
&C) callback is called for operator delete(p).
I haven't investigated the problem deeply yet, intend to address
it parallel with the review.
+ if (NE->getNumPlacementArgs())
+ return;
+ // skip placement new operators as they may not allocate memory
Two comments here:
- Please make sure all comments are complete, capitalized, and
punctuated sentences. (This has the important one,
"complete"....just missing capitalization and punctuation.)
I'll try. Unfortunately I am not as good in English as I want to
be, so sorry for my grammar, syntax, and punctuation.
--
Anton
<MallocChecker_v2.patch>
Hi Anna. Thanks for your comments! Attached is the new patch.
Just adding another kind as extra enumeration values does not
seem right. Another option is to make Kind be a pointer to a
static array, which contains objects recording all necessary info
about each kind (MacOSKeychainAPIChecker uses this approach).
This is probably an overkill for now, but is another option.
Not sure that I have got an idea.
The memory and deallocator kind are both set for a RefState. Do
you mean creating a static array with 'memory kinds' *
'deallocator kind' elements for all possible combinations? Also
there is no necessary info other then the kind itself.
Left for now.
I think of ToBeReleasedWith* as being different types of Allocate;
I don't think they should be separate values in the same enum. It's
also unfortunate to have to copy the constant values in both places
- DeallocatorKind and RefState::Kind. Maybe you could restructure
this similarly to how this is done in SVals.h?
+ const FunctionDecl *getCalleeDecl() const { return CalleeDecl; }
Do we only store the name of the allocator declaration here?
If the Decl is always an allocator Decl, we should change the name
of the method to be more descriptive.
Do we need to store this in the state? (Growing states implies
memory overhead.) Can't this be easily implied from the kind?
Kind can't give us information about the name of an allocator that
can be malloc(), realloc(), a user function with ownership_takes
attribute, etc.
One solution to avoid keeping a CalleeDecl in RefState is to
rollback to CallExpr::getDirectCallee() from
CheckerContext::getCalleeDec() and to print "malloc()" in case of
indirect calls.
Ok, I see.
After thinking about it some more, I do not think we should add an
extra pointer to the state to differentiate between few allocator
functions. In the case when we do not have ownership attributes, we
only have few possible allocators, whose names we know ahead of
time. In case we support ownership attributes, we are likely to have
few allocator functions whose names we could just store in the
checker state on the first encounter (like we store the
IdentifierInfo).
In addition, we could change the ownership attributes in such a way
that each allocator would have a corresponding deallocator; for
example, if we wanted to check matching allocators and deallocators.
Annotated deallocators won't necessarily be one of the functions you
know at compile time, so the DeallocatorKind enum would not cover
it. I think, it's best if we kept a table on a side that would store
this info (allocation function name, deallocator) and refer to the
entries in the table from the state. (Take a look at
MacOSKeychainAPIChecker - it's very similar to malloc, but it
handles different allocator/deallocator pairs. I think a similar
solution could work in this case as well. Other solutions that
address these issues are welcome as well!)
Attached is the patch that uses approach with a dynamic table that
holds both allocator name and expected deallocator kind. This
approach allows to keep any allocator names, either known or new
ones. The table could be easily extended to hold additional data such
as info about special deallocators, etc.
Please do not do this. The analyzer is not thread-safe today, but
that's no reason why we should make it harder to make it thread-safe
tomorrow.
Stepping back, there are many problems with this, the main one being
that just keeping track of the function name isn't really good enough.
MacOSKeychainAPIChecker can get away with it because its functions are
Could you, please, send the remainder of the sentence? Tried to guess,
but failed.
I think all Anna meant is to have a /static/ table, like
MacOSKeychainAPIChecker. If/when we support allocator families defined
at runtime, we can do this properly, with a mutable table as a field
of the checker, but for now this is both overkill and error-prone.
Separately, I don't understand why you chose to /inherit/ from
SmallVector, rather than just using a SmallVector member.
AllocDeallocDataVec isn't conceptually a vector, it's a type that maps
family IDs to names and DeallocatorKinds.
Other comments:
+enum DeallocatorKind {
+ D_free,
+ D_delete,
+ D_deleteArray,
+ D_unknown
+};
Nitpicking: single-character enum prefixes feel strange to me. DK_?
- enum Kind { // Reference to allocated memory.
- Allocated,
- // Reference to released/freed memory.
- Released,
- // The responsibility for freeing resources has
transfered from
- // this reference. A relinquished symbol should not be
freed.
- Relinquished } K;
+ // First two bits of Kind represent memory kind
+ static const unsigned K_MASK = 0x3;
+ // The rest bits keep an index to the AllocDeallocData
+ // that hold information about the allocator and deallocator functions
+ static const unsigned I_BASE = 2;
+ static const unsigned I_MASK = ~0 << I_BASE;
+ // A storage for memory kind and index to the AllocDeallocData
+ enum KindPlusData { // Reference to allocated memory.
+ Allocated,
+ // Reference to released/freed memory.
+ Released,
+ // The responsibility for freeing resources has
transfered
+ // from this reference. A relinquished symbol
should not be
+ // freed.
+ Relinquished
+ } K;
const Stmt *S;
This should be implemented using bitfields, and the bitfields should
go after the Stmt pointer so that the total object size is smaller.
+ /// Auxiliary functions that return kind and print names of
+ /// allocators/deallocators
+ DeallocatorKind GetDeallocKind(CheckerContext &C, const Expr *E) const;
+ void PrintAllocDeallocName(raw_ostream &os, CheckerContext &C,
+ const Expr *E) const;
+ void PrintAllocDeallocName(raw_ostream &os, CheckerContext &C,
+ const RefState *RS) const;
+ void PrintExpectedAllocName(raw_ostream &os, CheckerContext &C,
+ const Expr *DeallocExpr) const;
Hm. Doxygen will attach the comment to the first declaration and
ignore the others. You can use groups
<http://www.stack.nl/%7Edimitri/doxygen/manual/grouping.html#memgroup> to
sort of get the effect you want, but it might be worth just being
clearer anyway.
Also, the convention for function/method names in LLVM is now
lowerCamelCase.
Thank you for switching to raw_ostream. :-)
+ bool isDefaultNonPtrPlacementNewDelete(const FunctionDecl *FD,
+ CheckerContext &C) const;
Very confusing method name. isStandardNewDelete is probably fine.
There's nothing about "pointers" or "placement" that's strange here --
what you want to know is that this is the global declaration of
operator new, and that it's not provided by the user.
+ if (FD->isReservedGlobalPlacementOperator())
+ return false;
This is /still/ the wrong check. You need to check if there are
/any/ placement arguments, and you need to check that the Decl is not
part of a class. Both are easy:
This check is done to throw out standard placement new/deletes with an
additional pointer parameter but to handle standard placement nothrow
new/deletes.
As an option we can detect nothrow new/deletes and skip all other
placement versions.
if (FD->getNumParams() != 1 || FD->isVariadic())
return false;
if (isa<CXXMethodDecl>(FD))
return false;
+ // Process direct calls to operator new/new[]/delete/delete[]
functions
+ // as distinct from new/new[]/delete/delete[] expressions that are
+ // processed by "checkPostStmt(const CXXNewExpr *" and
+ // "checkPostStmt(const CXXDeleteExpr *" respectively
Missing a period, and also looks rather odd with the unbalanced
parens. Maybe something like "...processed by the checkPostStmt
callbacks for CXXNewExpr and CXXDeleteExpr"? Also, you can't really
use "respectively" here because you're matching two items up with four
items, although it is pretty clear what you mean.
Good thought here, though—I wouldn't have remembered to do this.
Hopefully someday it will be unimportant anyway when we use CallEvent
to model the allocator and deallocator calls inside CXXNewExpr and
CXXDeleteExpr.
+ assert(0 && "not a new/delete oparator");
This is better written llvm_unreachable("not a new/delete operator");
+ // the return value from operator new is already bound and we don't
want to
+ // break this binding so we call MallocUpdateRefState instead of
MallocMemAux
Capitalization, period.
+ StringRef Name = "";
+ if (const CallExpr *CE = dyn_cast<CallExpr>(E)) {
+ if (const FunctionDecl *FD = C.getCalleeDecl(CE)) {
+ if (FD->getDeclName().isIdentifier()) {
+ Name = FD->getName();
+ }
+ }
+ }
+
+ unsigned Idx = AllocDeallocData.GetOrInsert(Name, dKind);
// Set the symbol's state to Allocated.
- return state->set<RegionState>(Sym, RefState::getAllocated(CE));
-
+ return state->set<RegionState>(Sym, RefState::getAllocated(E, Idx));
Yeah, honestly it makes a lot more sense to me to just store the dKind
in the RefState, and lazily derive the name when we need to print it
at the end. (And not do it so specifically.)
+ if (const CXXDeleteExpr *DE = dyn_cast_or_null<CXXDeleteExpr>(E))
+ return DE->isArrayForm() ? D_deleteArray : D_delete;
You already tested for null, so you can just use dyn_cast here.
+ if (isa<ObjCMessageExpr>(E))
+ return D_free;
There are no Objective-C messages that free memory immediately; just
take this out.
Should be fixed somehow for situations when an ObjCMessageExpr is passed
as a deallocator to FreeMemAux(). Just removing the above code will
cause GetDeallocKind() to return D_unknown for an ObjCMessageExpr and we
end up with assert(0 && "unknown or unhandled DeallocatorKind");
triggering in PrintExpectedAllocName(). malloc.mm test contain
testcases illustrating this.
+ // get the exact name of an allocation function
Capitalization, spelling.
+ if (FD->getKind() == Decl::Function) {
Please don't do this; it rules out C++ methods. Actually, how about
this, for the whole method?
bool MallocChecker::PrintAllocDeallocName(raw_ostream &os,
CheckerContext &C,
const Expr *E) const {
if (const CallExpr *CE = dyn_cast<CallExpr>(E)) {
// FIXME: This doesn't handle indirect calls.
const FunctionDecl *FD = CE->getDirectCallee();
if (!FD)
return false;
os << *FD;
if (!FD->isOverloadedOperator())
os << "()";
return true;
}
if (const ObjCMessageExpr *Msg = dyn_cast<ObjCMessageExpr>(E)) {
if (Msg->isInstanceMessage())
os << "-";
else
os << "+";
os << Msg->getSelector().getAsString();
return true;
}
if (const CXXNewExpr *NE = dyn_cast<CXXNewExpr>(E)) {
os << *NE->getOperatorNew();
return true;
}
if (const CXXDeleteExpr *DE = dyn_cast<CXXDeleteExpr>(E)) {
os << *DE->getOperatorDelete();
return true;
}
return false;
}
This actually seems generally useful and could even go on
CheckerContext as a convenience helper. Note the bool return, so that
rather than come up with some placeholder text like "unknown means",
we can just rephrase the message to not mention the allocator.
Great!
+ switch(dKind) {
+ case D_free: os << "malloc()"; return;
+ case D_delete: os << "operator new"; return;
+ case D_deleteArray: os << "operator new[]"; return;
+ case D_unknown:
+ default: assert(0 && "unknown or unhandled DeallocatorKind");
+ }
LLVM style is to not include default cases for an enum-covered switch
statement. Also llvm_unreachable again.
Thanks for your patience in iterating on this!
Jordan
So we rollback to the approach with a single DeallocatorKind implicitly
stored inside RefState::Kind as subkinds are stored inside
SVal::BaseKind in Sval.h, right?
--
Anton
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits