BTW, I wrote this two days ago.  For some reason my mail client didn't send it 
out until now.  My apologies for the delay.

On Aug 15, 2012, at 10:11 PM, Ted Kremenek <[email protected]> wrote:

> On Aug 15, 2012, at 6:12 PM, Richard Trieu <[email protected]> wrote:
> 
>> On Tue, Aug 14, 2012 at 9:48 PM, Ted Kremenek <[email protected]> wrote:
>> On Aug 14, 2012, at 2:32 PM, Richard Trieu <[email protected]> wrote:
>> 
>>> At a high level, I honestly find this logic to be more complicated than I 
>>> would have expected.  The sorting seems unnecessary, and will report 
>>> diagnostics in an unnatural order (first based on enum constant value, then 
>>> on declaration order).  A straight linear pass seems more natural to me, 
>>> and DenseMap is very efficient.
>>> Is there a comparison between the different containers in LLVM and the STL 
>>> containers?
>> 
>> This is a reasonable place to start:
>> 
>>   http://llvm.org/docs/ProgrammersManual.html#ds_map
>> 
>> The key with DenseMap is that it is a probed hashtable.  There is one big 
>> allocation for the entire table, instead of a bunch of buckets.  When 
>> applicable, it can be very fast, and feels like the right data structure to 
>> use here.
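>> Roughly, the usage pattern is just this (a minimal illustrative sketch, not 
>> code from the patch):
>> 
>>   #include "llvm/ADT/DenseMap.h"
>> 
>>   llvm::DenseMap<int64_t, unsigned> Counts; // one contiguous table, no per-entry buckets
>>   Counts[42] += 1;              // operator[] default-constructs a missing entry in place
>>   bool Seen = Counts.count(42); // probing lookup, no allocation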
>> 
>> Duplicate enum detection, now with DenseMap.  The DenseMap maps an int64_t to 
>> a vector pointer.  0 and 1 were special keys for the DenseMap, so two 
>> separate pointers are special-cased for them.  The vector pointers are stored 
>> in another vector in declaration order.  One pass is made over the enums to 
>> find ones without initializers.  These are used to create vectors.  A second 
>> pass through the enums populates the vectors.  Finally, a pass over the 
>> vector of vectors is used to generate all the warnings and notes.
>> 
>> Run time is fairly consistent with the sorted vector implementation, at most 
>> a 3% difference against the control.
>> <duplicate-enum-densemap.patch>
> 
> Thanks for working on this.  My main concern is this patch now has a lot of 
> unnecessary malloc() traffic, which will certainly slow it down.  Comments 
> inline:
> 
>> +
>> +static int64_t GetInt64(const llvm::APSInt& Val) {
>> +  return  Val.isSigned() ? Val.getSExtValue() : Val.getZExtValue();
>> +}
>> +
>> +struct DenseMapInfoint64_t {
>> +  static int64_t getEmptyKey() { return 0; }
>> +  static int64_t getTombstoneKey() { return 1; }
>> +  static unsigned getHashValue(const int64_t Val) {
>> +    return (unsigned)(Val * 37);
>> +  }
>> +  static bool isEqual(const int64_t& LHS, const int64_t& RHS) {
>> +    return LHS == RHS;
>> +  }
>> +};
> 
> This trait class doesn't look like it was actually used.  The DenseMap below 
> just uses the default trait for int64_t.
> 
> I also still think we can do something a bit smarter here.  What I think we 
> need to distinguish between is whether or not a constant has appeared more 
> than once.  We're saving a bit of memory on the keys, but spending that 
> savings elsewhere when we allocate the vectors unconditionally for each 
> constant.
> 
>> +
>> +// Emits a warning when an element is implicitly set to a value that
>> +// a previous element has already been set to.
>> +static void CheckForDuplicateEnumValues(Sema &S, Decl **Elements,
>> +                                        unsigned NumElements, EnumDecl *Enum,
>> +                                        QualType EnumType) {
>> +  if (S.Diags.getDiagnosticLevel(diag::warn_duplicate_enum_values,
>> +                                 Enum->getLocation()) ==
>> +      DiagnosticsEngine::Ignored)
>> +    return;
>> +  // Avoid anonymous enums
>> +  if (!Enum->getIdentifier())
>> +    return;
>> +
>> +  // Only check for small enums.
>> +  if (Enum->getNumPositiveBits() > 63 || Enum->getNumNegativeBits() > 64)
>> +    return;
>> +
>> +  typedef llvm::SmallVector<EnumConstantDecl*, 4> SameValueVector;
>> +  typedef llvm::DenseMap<int64_t, SameValueVector*> ValueToVectorMap;
>> +  typedef llvm::SmallVector<SameValueVector*, 10> DoubleVector;
>> +  ValueToVectorMap EnumMap;
>> +  DoubleVector EnumVector;
>> +  SameValueVector *ZeroVector = 0, *OneVector = 0;
> 
> It took me a while to understand what this was doing, so I feel it could 
> really benefit from a comment.  This also appears to result in a ton of 
> malloc traffic below.  Here's my suggestion:
> 
>   typedef llvm::SmallVector<EnumConstantDecl*, 3> ECDVector;
>   typedef llvm::SmallVector<ECDVector *, 3> DuplicatesVector;
> 
>   typedef llvm::PointerUnion<EnumConstantDecl*, ECDVector *> DeclOrVector;
>   typedef llvm::DenseMap<int64_t, DeclOrVector> ValueToVectorMap;
> 
>   DuplicatesVector DupVector;
>   ValueToVectorMap EnumMap;
> 
> The trick here is that the DenseMap maps from a constant to the first 
> EnumConstantDecl it encounters.  Only if we encounter a second 
> EnumConstantDecl with the same enum value do we pay the cost of allocating 
> another vector.  This is a big win in the common case, as 
> calling malloc() is really slow.  Right now the code appears to be doing a 
> malloc() for every enum constant, which is going to really penalize us here.
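> For reference, PointerUnion keeps its discriminator in the spare low bits of 
> the pointer, so DeclOrVector stays pointer-sized.  A minimal sketch of the 
> pattern (ECD as above; the allocation here is purely illustrative):
> 
>    DeclOrVector Entry;                        // default-constructed: null
>    Entry = ECD;                               // first sighting: just the decl, no allocation
>    bool Single = Entry.is<EnumConstantDecl*>();
>    Entry = new ECDVector();                   // second sighting: switch to a vector
>    ECDVector *Vec = Entry.get<ECDVector*>();  // asserts it really holds a vector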
> 
>> +
>> +  for (unsigned i = 0; i < NumElements; ++i) {
>> +    EnumConstantDecl *ECD = cast<EnumConstantDecl>(Elements[i]);
>> +    if (!ECD) {
>> +      for (DoubleVector::iterator I = EnumVector.begin(), E = EnumVector.end();
>> +           I != E; ++I)
>> +        delete *I;
>> +      return;
>> +    }
> 
> I don't quite understand this loop through DoubleVector here, but it looks 
> like logic in case we want to return early and clean up.  Is there a case 
> where the EnumConstantDecl can be null?
> 
>> +
>> +    if (ECD->getInitExpr())
>> +      continue;
>> +
>> +    int64_t Val = GetInt64(ECD->getInitVal());
>> +
> 
> Looks good.
> 
>> +    if (Val == 0) {
>> +      if (ZeroVector) continue;
>> +      ZeroVector = new SameValueVector();
>> +      ZeroVector->push_back(ECD);
>> +      EnumVector.push_back(ZeroVector);
>> +    } else if (Val == 1) {
>> +      if (OneVector) continue;
>> +      OneVector = new SameValueVector();
>> +      OneVector->push_back(ECD);
>> +      EnumVector.push_back(OneVector);
>> +    } else {
>> +      if (EnumMap.find(Val) != EnumMap.end())
>> +        continue;
>> +      SameValueVector *ValueVector = new SameValueVector();
>> +      ValueVector->push_back(ECD);
>> +      EnumVector.push_back(ValueVector);
>> +      EnumMap.insert(std::make_pair(Val, ValueVector));
> 
> The "find()" followed by the "insert()" is wasteful.  It results in two 
> lookups to the hash table when we could have just used one.  More on that 
> later.
> 
>> +    }
>> +  }
> 
> IMO, this looks like a lot of complexity just to handle the fact that 0 and 1 
> are special values for the DenseMap.  I don't really see this as the right 
> tradeoff; the code is more complicated with marginal impact on memory usage 
> or performance.
> 
> If you'll humor me for a bit, consider using something else for the key, e.g.:
> 
> struct DupKey {
>   int64_t val;
>   bool isTombstoneOrEmptyKey;
> };
> 
> The idea is that if 'isTombstoneOrEmptyKey' is true, we can use val = 0 or val = 1 
> to represent empty keys or tombstone entries.  Otherwise, it's an int64_t, 
> with the full range of values.  We can define a DenseMap trait to do the 
> right thing.  Yes, this costs a tiny bit more in storage, but it allows the 
> data structure to handle the complete set of values in your domain, instead 
> of resorting to complicating the core algorithm.  What I see here now is 
> essentially the same code duplicated across the 0, 1, and general cases, which 
> makes it harder to read and more error-prone.
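> Concretely, the trait could look something like this (a rough sketch, reusing 
> the hash and equality shape from the trait in your patch):
> 
>   struct DenseMapInfoDupKey {
>     static DupKey getEmptyKey() { DupKey K = { 0, true }; return K; }
>     static DupKey getTombstoneKey() { DupKey K = { 1, true }; return K; }
>     static unsigned getHashValue(const DupKey Key) {
>       return (unsigned)(Key.val * 37);
>     }
>     static bool isEqual(const DupKey &LHS, const DupKey &RHS) {
>       return LHS.isTombstoneOrEmptyKey == RHS.isTombstoneOrEmptyKey &&
>              LHS.val == RHS.val;
>     }
>   };
> 
>   typedef llvm::DenseMap<DupKey, DeclOrVector, DenseMapInfoDupKey>
>     ValueToVectorMap;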
> 
> If we use DupKey as our key for the DenseMap, we can instead do something 
> like this:
> 
>    DeclOrVector &entry = EnumMap[Val];  // Use default construction of 'entry'.
>    // Is this the first time we encountered this constant?
>    if (entry.isNull()) {
>      entry = ECD;
>      continue;
>    }
>    // Is this the second time we encountered this constant?  If so,
>    // push the previous decl encountered and the one just encountered
>    // to a vector of duplicates.
>    if (EnumConstantDecl *D = entry.dyn_cast<EnumConstantDecl*>()) {
>      ECDVector *Vec = new ECDVector();
>      Vec->push_back(D);
>      Vec->push_back(ECD);
>      
>      // Update the entry to refer to the duplicates.
>      entry = Vec;
> 
>      // Store the duplicates in a vector we can consult later for
>      // quick emission of diagnostics.
>      DupVector.push_back(Vec);
> 
>      // On to the next constant.
>      continue;
>    }
>    // Is this the third (or greater) time we encountered the constant?  If so,
>    // continue to add it to the existing vector.
>    ECDVector *Vec = entry.get<ECDVector*>();
>    Vec->push_back(ECD);
> 
> 
> With this code, we only allocate memory (beyond the DenseMap) when we 
> encounter a duplicate that would be worth reporting.  In the common case, 
> this savings in malloc traffic should be noticeable.
> 
> Notice also that I used:
> 
>      DeclOrVector &entry = EnumMap[Val];  // Use default construction of 'entry'.
> 
> This results in a single lookup in the hashtable.  Since we plan on adding a 
> value for a key no matter what, by using this idiom we allow the DenseMap to 
> default construct an entry if it doesn't exist.  This results in a single 
> hashtable lookup, from which we can modify the value in place.  This is 
> obviously faster than doing a hashtable lookup twice. 
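> In other words, rather than the find()-then-insert() pattern from the patch:
> 
>    if (EnumMap.find(Val) != EnumMap.end())           // hashtable lookup #1
>      continue;
>    EnumMap.insert(std::make_pair(Val, ValueVector)); // hashtable lookup #2
> 
> the subscript form probes the table once and hands back a reference we can 
> test and update in place.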
> 
>> +
>> +  for (unsigned i = 0; i < NumElements; ++i) {
>> +    EnumConstantDecl *ECD = cast<EnumConstantDecl>(Elements[i]);
>> +    if (!ValidDuplicateEnum(ECD, Enum))
>> +      continue;
>> +
>> +    int64_t Val = GetInt64(ECD->getInitVal());
>> +
>> +    if (Val == 0) {
>> +      if (!ZeroVector || *ZeroVector->begin() == ECD)
>> +        continue;
>> +      ZeroVector->push_back(ECD);
>> +    } else if (Val == 1) {
>> +      if (!OneVector || *OneVector->begin() == ECD)
>> +        continue;
>> +      OneVector->push_back(ECD);
>> +    } else {
>> +      ValueToVectorMap::iterator I = EnumMap.find(Val);
>> +      if (I == EnumMap.end())
>> +        continue;
>> +      SameValueVector *V = I->second;
>> +      if (*V->begin() == ECD)
>> +        continue;
>> +      V->push_back(ECD);
>> +    }
>> +  }
> 
> This second loop looks unnecessary.  I think we can do everything we need to 
> count duplicates with one loop.  Of course the ValidDuplicateEnum() check would 
> need to be hoisted into the first loop.
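> Roughly (a sketch, reusing the loop from the patch):
> 
>    for (unsigned i = 0; i < NumElements; ++i) {
>      EnumConstantDecl *ECD = cast<EnumConstantDecl>(Elements[i]);
>      if (!ValidDuplicateEnum(ECD, Enum))
>        continue;
>      // ... single-lookup DenseMap logic from above ...
>    }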
> 
>> +
>> +  for (DoubleVector::iterator DoubleVectorIter = EnumVector.begin(),
>> +                              DoubleVectorEnd = EnumVector.end();
>> +       DoubleVectorIter != DoubleVectorEnd; ++DoubleVectorIter) {
>> +    SameValueVector *V = *DoubleVectorIter;
>> +    if (V->size() == 1)
>> +      continue;
>> +
>> +    SameValueVector::iterator I = V->begin();
>> +    S.Diag((*I)->getLocation(), diag::warn_duplicate_enum_values)
>> +      << (*I)->getName() << (*I)->getInitVal().toString(10)
>> +      << (*I)->getSourceRange();
>> +    ++I;
>> +    for (SameValueVector::iterator E = V->end(); I != E; ++I)
>> +      S.Diag((*I)->getLocation(), diag::note_duplicate_element)
>> +        << (*I)->getName() << (*I)->getInitVal().toString(10)
>> +        << (*I)->getSourceRange();
>> +    delete V;
>> +  }
> 
> 
> This is more or less the same; essentially it becomes:
> 
> for (DuplicatesVector::iterator I = DupVector.begin(), E = DupVector.end();
>      I != E; ++I) {
>    ECDVector *Vec = *I;
>    // do the diagnostic logic ...
>    delete *I;
> }
> 
> Note that with my suggestions the vector has a size on the order of the number 
> of duplicate constants, not the total number of constants.  If there are no 
> duplicates, no work is required (including freeing memory).
> 
>> +}
>> +
>>  void Sema::ActOnEnumBody(SourceLocation EnumLoc, SourceLocation LBraceLoc,
>>                           SourceLocation RBraceLoc, Decl *EnumDeclX,
>>                           Decl **Elements, unsigned NumElements,
>> @@ -10709,6 +10868,7 @@
>>      DeclsInPrototypeScope.push_back(Enum);
>>  
>>    CheckForUniqueEnumValues(*this, Elements, NumElements, Enum, EnumType);
>> +  CheckForDuplicateEnumValues(*this, Elements, NumElements, Enum, EnumType);
>>  }
>>  
>>  Decl *Sema::ActOnFileScopeAsmDecl(Expr *expr,
> 
> I know this may all be nit-picky, but I really think trying to reduce the 
> malloc() traffic is worth looking at to get a real understanding of the 
> performance improvement that can be found here.
> 
> Thanks for forging ahead on this.
> _______________________________________________
> cfe-commits mailing list
> [email protected]
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
