Okay, starting over. Anna went and explained to me that we really do want to be 
planning for the future case, where we allow people to specify their own custom 
allocation families, and that we should start doing that now. (Sorry for 
sending you in the wrong direction.) The design we hashed out is similar to 
what you came up with, but with a change in focus towards what we're calling 
"families", which may have multiple allocators and deallocators.

class RefState {
  const Stmt *S;
  unsigned State : 2; // Kind enum, but stored as a bitfield
  unsigned Family : 30; // Rest of 32-bit word, currently just an ID

  // ...
};

enum AllocationFamilies {
  AF_None,
  AF_Malloc,
  AF_CXXNew,
  AF_CXXNewArray,
  AF_FirstUserFamily
};

That's all for now, but if/when we support user annotations for 
allocations/deallocations, we can add a table to MallocChecker (not globally):

mutable llvm::SmallDenseMap<IdentifierInfo *, > FamilyIDTable;
mutable unsigned NextUserFamily = AF_FirstUserFamily;

and prepopulate it with any families we want, such as, say, "malloc". (See the 
old malloc-annotations test case.) When we come across a family we haven't seen 
before, we add it to the table and increment the "NextUserFamily" counter. (We 
don't need to design most of this now.)

This does have the downside that you pointed out: in the error message at the 
end, we can't say "memory was allocated by malloc()" if malloc is called 
indirectly. Anna convinced me that this is acceptable, especially if we do 
rephrase the diagnostic when we don't have a good message to print, because 
we'll still have the path note that shows where the allocation happened. We can 
still rederive the allocator name from the source statement.

We'll also have trouble printing out an archetypical allocator or deallocator 
in the "user-defined family" case, but we can design that later. Anna and I 
tossed around a few ideas but didn't settle on anything.

On the other hand, this keeps RefState a little more compact, and avoids 
string-based lookups for function names when many functions don't have names.


Other issues: I now understand why you decided to check new/delete within 
classes; excluding all cases with placement args is probably a good balance. 
(LLVM uses placement-arg-based allocators for POD types on a BumpPtrAllocator, 
for example.) Sorry for not getting it sooner.

>> Stepping back, there are many problems with this, the main one being that 
>> just keeping track of the function name isn't really good enough. 
>> MacOSKeychainAPIChecker can get away with it because its functions are 
> 
> Could you, please, send the remainder of the sentence? Tried to guess, but 
> failed.


Since I have been convinced otherwise, this is less relevant, but I think I was 
saying that MacOSKeychainAPIChecker can get away with this because its 
functions are all simple C functions in the global namespace, whereas a general 
alloc/dealloc checker needs to handle functions in namespaces, C++ instance 
methods, ObjC methods...all sorts of things where a simple name can't 
disambiguate.

> Returning DK_free for Objective-C messages just mean that they belong to the 
> 'malloc/free' family. Consider the following example:
> void test() {
>   int *p = (int *)malloc(sizeof(int));
>   [NSData dataWithBytesNoCopy:bytes length:sizeof(int)];
> }

Ha, I forgot about free-when-done; carry on. Not entirely happy with assuming 
all of ObjC belongs to malloc, but we can revisit this when we have other 
interesting families.


I'm deliberately not going to comment on your latest patch because I'd rather 
have your feedback on this design. What do you think?
Jordan
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

Reply via email to