On Aug 15, 2012, at 6:12 PM, Richard Trieu <[email protected]> wrote:

> On Tue, Aug 14, 2012 at 9:48 PM, Ted Kremenek <[email protected]> wrote:
> On Aug 14, 2012, at 2:32 PM, Richard Trieu <[email protected]> wrote:
> 
>> At a high level, I honestly find this logic to be more complicated than I 
>> would have expected.  The sorting seems unnecessary, and will report 
>> diagnostics in an unnatural order (first based on enum constant value, then 
>> on declaration order).  A straight linear pass seems more natural to me, 
>> and DenseMap is very efficient.
>> Is there a comparison between the different containers in LLVM and the STL 
>> containers?
> 
> This is a reasonable place to start:
> 
>   http://llvm.org/docs/ProgrammersManual.html#ds_map
> 
> The key with DenseMap is that it is a probed hashtable.  There is one big 
> allocation for the entire table, instead of a bunch of buckets.  When 
> applicable, it can be very fast, and feels like the right data structure to 
> use here.
> 
> Duplicate enum detection, now with DenseMap.  The DenseMap maps an int64_t to 
> a vector pointer.  0 and 1 are special keys for the DenseMap, so two separate 
> pointers are special-cased for them.  The vector pointers are stored in 
> another vector in declaration order.  One pass is made over the enum 
> constants to find the ones without initializers.  These are used to create 
> vectors.  A second pass through the enum constants populates the vectors.  
> Finally, a pass over the vector of vectors is used to generate all the 
> warnings and notes.
> 
> Run time is fairly consistent with the sorted vector implementation, at most 
> a 3% difference against control.
> <duplicate-enum-densemap.patch>

Thanks for working on this.  My main concern is that this patch now has a lot of 
unnecessary malloc() traffic, which will certainly slow it down.  Comments 
inline:

> +
> +static int64_t GetInt64(const llvm::APSInt& Val) {
> +  return  Val.isSigned() ? Val.getSExtValue() : Val.getZExtValue();
> +}
> +
> +struct DenseMapInfoint64_t {
> +  static int64_t getEmptyKey() { return 0; }
> +  static int64_t getTombstoneKey() { return 1; }
> +  static unsigned getHashValue(const int64_t Val) {
> +    return (unsigned)(Val * 37);
> +  }
> +  static bool isEqual(const int64_t& LHS, const int64_t& RHS) {
> +    return LHS == RHS;
> +  }
> +};

This trait class doesn't look like it was actually used.  The DenseMap below 
just uses the default trait for int64_t.

I also still think we can do something a bit smarter here.  What I think we 
need to distinguish between is whether or not a constant has appeared more than 
once.  We're saving a bit of memory on the keys, but spending that savings 
elsewhere when we allocate the vectors unconditionally for each constant.

> +
> +// Emits a warning when an element is implicitly set to a value that
> +// a previous element has already been set to.
> +static void CheckForDuplicateEnumValues(Sema &S, Decl **Elements,
> +                                        unsigned NumElements, EnumDecl *Enum,
> +                                        QualType EnumType) {
> +  if (S.Diags.getDiagnosticLevel(diag::warn_duplicate_enum_values,
> +                                 Enum->getLocation()) ==
> +      DiagnosticsEngine::Ignored)
> +    return;
> +  // Avoid anonymous enums
> +  if (!Enum->getIdentifier())
> +    return;
> +
> +  // Only check for small enums.
> +  if (Enum->getNumPositiveBits() > 63 || Enum->getNumNegativeBits() > 64)
> +    return;
> +
> +  typedef llvm::SmallVector<EnumConstantDecl*, 4> SameValueVector;
> +  typedef llvm::DenseMap<int64_t, SameValueVector*> ValueToVectorMap;
> +  typedef llvm::SmallVector<SameValueVector*, 10> DoubleVector;
> +  ValueToVectorMap EnumMap;
> +  DoubleVector EnumVector;
> +  SameValueVector *ZeroVector = 0, *OneVector = 0;

It took me a while to understand what this was doing, so I feel it could really 
benefit from a comment.  This also appears to result in a ton of malloc traffic 
below.  Here's my suggestion:

  typedef llvm::SmallVector<EnumConstantDecl*, 3> ECDVector;
  typedef llvm::SmallVector<ECDVector *, 3> DuplicatesVector;

  typedef llvm::PointerUnion<EnumConstantDecl*, ECDVector *> DeclOrVector;
  typedef llvm::DenseMap<int64_t, DeclOrVector> ValueToVectorMap;

  DuplicatesVector DupVector;
  ValueToVectorMap EnumMap;

The trick here is that the DenseMap maps from a constant to the first 
EnumConstantDecl it encounters.  Only if we encounter a second EnumConstantDecl 
with the same enum value do we pay the cost of allocating another vector.  This 
will drastically optimize the common case, as calling malloc() is really 
slow.  Right now the code appears to be doing a malloc() for every enum 
constant, which is going to really penalize us here.
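
(In case PointerUnion is new to you: it is a tagged pointer packed into a single 
word, so the map value stays small until we actually need a vector.  A minimal 
sketch of the operations the code below relies on -- purely illustrative, using 
the typedefs above and an EnumConstantDecl 'ECD' already in hand:)

   llvm::PointerUnion<EnumConstantDecl*, ECDVector*> entry;  // llvm/ADT/PointerUnion.h
   // Default-constructed: holds neither pointer.
   assert(entry.isNull());
   // Store a decl; no heap allocation involved.
   entry = ECD;
   // dyn_cast<> yields the decl while one is stored, null once a vector is stored.
   EnumConstantDecl *D = entry.dyn_cast<EnumConstantDecl*>();
   // Overwrite with a vector; get<> asserts that member is now the active one.
   entry = new ECDVector();
   ECDVector *Vec = entry.get<ECDVector*>();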

> +
> +  for (unsigned i = 0; i < NumElements; ++i) {
> +    EnumConstantDecl *ECD = cast<EnumConstantDecl>(Elements[i]);
> +    if (!ECD) {
> +      for (DoubleVector::iterator I = EnumVector.begin(), E = EnumVector.end();
> +           I != E; ++I)
> +        delete *I;
> +      return;
> +    }

I don't quite understand this loop through DoubleVector here, but it looks like 
logic in case we want to return early and clean up.  Is there a case where the 
EnumConstantDecl can be null?

> +
> +    if (ECD->getInitExpr())
> +      continue;
> +
> +    int64_t Val = GetInt64(ECD->getInitVal());
> +

Looks good.

> +    if (Val == 0) {
> +      if (ZeroVector) continue;
> +      ZeroVector = new SameValueVector();
> +      ZeroVector->push_back(ECD);
> +      EnumVector.push_back(ZeroVector);
> +    } else if (Val == 1) {
> +      if (OneVector) continue;
> +      OneVector = new SameValueVector();
> +      OneVector->push_back(ECD);
> +      EnumVector.push_back(OneVector);
> +    } else {
> +      if (EnumMap.find(Val) != EnumMap.end())
> +        continue;
> +      SameValueVector *ValueVector = new SameValueVector();
> +      ValueVector->push_back(ECD);
> +      EnumVector.push_back(ValueVector);
> +      EnumMap.insert(std::make_pair(Val, ValueVector));

The "find()" followed by the "insert()" is wasteful.  It results in two lookups 
to the hash table when we could have just used one.  More on that later.

> +    }
> +  }

IMO, this looks like a lot of complexity just to handle the fact that 0 and 1 
are special values for the DenseMap.  I don't really see this as the right 
tradeoff; the code is more complicated with marginal impact on memory usage or 
performance.

If you humor me for a bit, consider using something else for the key, e.g.:

struct DupKey {
  int64_t val;
  bool isTombstoneOrEmptyKey;
};

The idea is that if 'isTombstoneOrEmptyKey' is true, we can use val = 0 or val = 1 
to represent empty keys or tombstone entries.  Otherwise, it's an int64_t, with 
the full range of values.  We can define a DenseMap trait to do the right 
thing.  Yes, this costs a tiny bit more in storage, but it allows the data 
structure to handle the complete set of values in your domain, instead of 
resorting to complicating the core algorithm.  What I see here now is the same 
code essentially duplicated twice, which makes it harder to read and more error 
prone.
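
For concreteness, the trait could look something like this -- just a sketch; the 
exact names and hash function are up to you:

   struct DupKey {
     int64_t val;
     bool isTombstoneOrEmptyKey;
     DupKey(int64_t val, bool isTombstoneOrEmptyKey)
       : val(val), isTombstoneOrEmptyKey(isTombstoneOrEmptyKey) {}
   };

   struct DenseMapInfoDupKey {
     static DupKey getEmptyKey() { return DupKey(0, true); }
     static DupKey getTombstoneKey() { return DupKey(1, true); }
     static unsigned getHashValue(const DupKey Key) {
       return (unsigned)(Key.val * 37);
     }
     static bool isEqual(const DupKey &LHS, const DupKey &RHS) {
       return LHS.isTombstoneOrEmptyKey == RHS.isTombstoneOrEmptyKey &&
              LHS.val == RHS.val;
     }
   };

   // The map then names the trait explicitly as its third template argument:
   //   llvm::DenseMap<DupKey, DeclOrVector, DenseMapInfoDupKey> EnumMap;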

If we use DupKey as our key for the DenseMap, we can instead do something like 
this:

   DeclOrVector &entry = EnumMap[Val];  // Use default construction of 'entry'.
   // Is this the first time we encountered this constant?
   if (entry.isNull()) {
     entry = ECD;
     continue;
   }
   // Is this the second time we encountered this constant?  If so,
   // push the previous decl encountered and the one just encountered
   // to a vector of duplicates.
   if (EnumConstantDecl *D = entry.dyn_cast<EnumConstantDecl*>()) {
     ECDVector *Vec = new ECDVector();
     Vec->push_back(D);
     Vec->push_back(ECD);
     
     // Update the entry to refer to the duplicates.
     entry = Vec;

     // Store the duplicates in a vector we can consult later for
     // quick emission of diagnostics.
     DupVector.push_back(Vec);

     // On to the next constant.
     continue;
   }
   // Is this the third (or greater) time we encountered the constant?  If so,
   // continue to add it to the existing vector.
   ECDVector *Vec = entry.get<ECDVector*>();
   Vec->push_back(ECD);


With this code, we only allocate memory (beyond the DenseMap) when we encounter 
a duplicate that would be worth reporting.  In the common case, this savings in 
malloc traffic should be noticeable.

Notice also that I used:

     DeclOrVector &entry = EnumMap[Val];  // Use default construction of 'entry'.

This results in a single lookup in the hashtable.  Since we plan on adding a 
value for the key no matter what, this idiom lets the DenseMap default-construct 
an entry if one doesn't exist, and we then modify the value in place through the 
returned reference.  That is obviously faster than probing the hashtable twice.
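
To make the difference concrete, here are the two patterns side by side (sketch 
only, using the typedefs from above):

   // Two probes of the table: one for find(), one for insert().
   if (EnumMap.find(Val) == EnumMap.end())
     EnumMap.insert(std::make_pair(Val, DeclOrVector(ECD)));

   // One probe: operator[] default-constructs the mapped value if the key is
   // missing, and we fill it in through the returned reference.
   DeclOrVector &entry = EnumMap[Val];
   if (entry.isNull())
     entry = ECD;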

> +
> +  for (unsigned i = 0; i < NumElements; ++i) {
> +    EnumConstantDecl *ECD = cast<EnumConstantDecl>(Elements[i]);
> +    if (!ValidDuplicateEnum(ECD, Enum))
> +      continue;
> +
> +    int64_t Val = GetInt64(ECD->getInitVal());
> +
> +    if (Val == 0) {
> +      if (!ZeroVector || *ZeroVector->begin() == ECD)
> +        continue;
> +      ZeroVector->push_back(ECD);
> +    } else if (Val == 1) {
> +      if (!OneVector || *OneVector->begin() == ECD)
> +        continue;
> +      OneVector->push_back(ECD);
> +    } else {
> +      ValueToVectorMap::iterator I = EnumMap.find(Val);
> +      if (I == EnumMap.end())
> +        continue;
> +      SameValueVector *V = I->second;
> +      if (*V->begin() == ECD)
> +        continue;
> +      V->push_back(ECD);
> +    }
> +  }

This second loop looks unnecessary.  I think we can do everything we need to 
count duplicates with one loop.  Of course, ValidDuplicateEnum() would need to 
be hoisted into the first loop.
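
Concretely, the merged loop could open with the filtering hoisted up front, 
roughly like this (sketch only, assuming the DupKey/DeclOrVector pieces above):

   for (unsigned i = 0; i != NumElements; ++i) {
     EnumConstantDecl *ECD = cast<EnumConstantDecl>(Elements[i]);
     if (!ValidDuplicateEnum(ECD, Enum))
       continue;

     DupKey Key(GetInt64(ECD->getInitVal()), false);
     DeclOrVector &entry = EnumMap[Key];
     // ... the isNull()/dyn_cast()/get() handling from above ...
   }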

> +
> +  for (DoubleVector::iterator DoubleVectorIter = EnumVector.begin(),
> +                              DoubleVectorEnd = EnumVector.end();
> +       DoubleVectorIter != DoubleVectorEnd; ++DoubleVectorIter) {
> +    SameValueVector *V = *DoubleVectorIter;
> +    if (V->size() == 1)
> +      continue;
> +
> +    SameValueVector::iterator I = V->begin();
> +    S.Diag((*I)->getLocation(), diag::warn_duplicate_enum_values)
> +      << (*I)->getName() << (*I)->getInitVal().toString(10)
> +      << (*I)->getSourceRange();
> +    ++I;
> +    for (SameValueVector::iterator E = V->end(); I != E; ++I)
> +      S.Diag((*I)->getLocation(), diag::note_duplicate_element)
> +        << (*I)->getName() << (*I)->getInitVal().toString(10)
> +        << (*I)->getSourceRange();
> +    delete V;
> +  }


This is more or less the same; essentially it becomes:

   for (DuplicatesVector::iterator I = DupVector.begin(), E = DupVector.end();
        I != E; ++I) {
     ECDVector *Vec = *I;
     // do the diagnostic logic ...
     delete *I;
   }

Note that with my suggestions the vector has size on the order of the number of 
duplicate constants, not the total number of constants.  If there are no 
duplicates, no extra work is required (including freeing memory).
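
Spliced together with the diagnostic logic you already have in the patch, the 
reporting step would look roughly like this (again, just a sketch):

   for (DuplicatesVector::iterator I = DupVector.begin(), E = DupVector.end();
        I != E; ++I) {
     ECDVector *Vec = *I;
     assert(Vec->size() > 1 && "ECDVectors are only created for duplicates");

     // Emit the warning on the first decl, then a note for each duplicate.
     ECDVector::iterator J = Vec->begin();
     S.Diag((*J)->getLocation(), diag::warn_duplicate_enum_values)
       << (*J)->getName() << (*J)->getInitVal().toString(10)
       << (*J)->getSourceRange();
     for (++J; J != Vec->end(); ++J)
       S.Diag((*J)->getLocation(), diag::note_duplicate_element)
         << (*J)->getName() << (*J)->getInitVal().toString(10)
         << (*J)->getSourceRange();

     delete Vec;
   }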

> +}
> +
>  void Sema::ActOnEnumBody(SourceLocation EnumLoc, SourceLocation LBraceLoc,
>                           SourceLocation RBraceLoc, Decl *EnumDeclX,
>                           Decl **Elements, unsigned NumElements,
> @@ -10709,6 +10868,7 @@
>      DeclsInPrototypeScope.push_back(Enum);
>  
>    CheckForUniqueEnumValues(*this, Elements, NumElements, Enum, EnumType);
> +  CheckForDuplicateEnumValues(*this, Elements, NumElements, Enum, EnumType);
>  }
>  
>  Decl *Sema::ActOnFileScopeAsmDecl(Expr *expr,

I know this may all be nit-picky, but I really think reducing the malloc() 
traffic is worth looking into to get a real understanding of the performance 
improvement that can be had here.

Thanks for forging ahead on this.
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
