BTW, I wrote this two days ago. For some reason my mail client didn't send it out until now. My apologies for the delay.
On Aug 15, 2012, at 10:11 PM, Ted Kremenek <[email protected]> wrote:

> On Aug 15, 2012, at 6:12 PM, Richard Trieu <[email protected]> wrote:
>
>> On Tue, Aug 14, 2012 at 9:48 PM, Ted Kremenek <[email protected]> wrote:
>> On Aug 14, 2012, at 2:32 PM, Richard Trieu <[email protected]> wrote:
>>
>>> At a high level, I honestly find this logic to be more complicated than I
>>> would have expected. The sorting seems unnecessary, and will report
>>> diagnostics in an unnatural order (first based on enum constant value, then
>>> on declaration order). A straight linear pass seems more natural to me,
>>> and DenseMap is very efficient.
>>>
>>> Is there a comparison between the different containers in LLVM and the STL
>>> containers?
>>
>> This is a reasonable place to start:
>>
>> http://llvm.org/docs/ProgrammersManual.html#ds_map
>>
>> The key with DenseMap is that it is a probed hashtable. There is one big
>> allocation for the entire table, instead of a bunch of buckets. When
>> applicable, it can be very fast, and feels like the right data structure to
>> use here.
>>
>> Duplicate enum detection, now with DenseMap. The DenseMap maps an int64_t to
>> a vector pointer. 0 and 1 were special keys for the DenseMap, so two
>> separate pointers are special-cased for them. The vector pointers are stored
>> in another vector in declaration order. One pass is made over the enums to
>> find ones without initializers. These are used to create vectors. A second
>> pass through the enums populates the vectors. Finally, a pass over the
>> vector of vectors is used to generate all the warnings and notes.
>>
>> Run time is fairly consistent with the sorted vector implementation, with at
>> most a 3% difference against the control.
>> <duplicate-enum-densemap.patch>
>
> Thanks for working on this. My main concern is this patch now has a lot of
> unnecessary malloc() traffic, which will certainly slow it down. Comments
> inline:
>
>> +
>> +static int64_t GetInt64(const llvm::APSInt& Val) {
>> +  return Val.isSigned() ? Val.getSExtValue() : Val.getZExtValue();
>> +}
>> +
>> +struct DenseMapInfoint64_t {
>> +  static int64_t getEmptyKey() { return 0; }
>> +  static int64_t getTombstoneKey() { return 1; }
>> +  static unsigned getHashValue(const int64_t Val) {
>> +    return (unsigned)(Val * 37);
>> +  }
>> +  static bool isEqual(const int64_t& LHS, const int64_t& RHS) {
>> +    return LHS == RHS;
>> +  }
>> +};
>
> This trait class doesn't look like it was actually used. The DenseMap below
> just uses the default trait for int64_t.
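A side note on the trait: a custom key-info struct only takes effect when it is
named as DenseMap's third template argument; with the two-argument form used in
the patch, DenseMap falls back to its default traits for int64_t. Wiring the
struct in would look something like the following (sketch only, reusing the
names from the quoted patch):

  // Sketch: the third template argument selects the key-info traits.
  // Without it, DenseMap<int64_t, SameValueVector*> ignores the struct above.
  typedef llvm::DenseMap<int64_t, SameValueVector*, DenseMapInfoint64_t>
      ValueToVectorMap;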
> I also still think we can do something a bit smarter here. What I think we
> need to distinguish between is whether or not a constant has appeared more
> than once. We're saving a bit of memory on the keys, but spending that
> savings elsewhere when we allocate the vectors unconditionally for each
> constant.
>
>> +
>> +// Emits a warning when an element is implicitly set to a value that
>> +// a previous element has already been set to.
>> +static void CheckForDuplicateEnumValues(Sema &S, Decl **Elements,
>> +                                        unsigned NumElements,
>> +                                        EnumDecl *Enum, QualType EnumType) {
>> +  if (S.Diags.getDiagnosticLevel(diag::warn_duplicate_enum_values,
>> +                                 Enum->getLocation()) ==
>> +      DiagnosticsEngine::Ignored)
>> +    return;
>> +  // Avoid anonymous enums
>> +  if (!Enum->getIdentifier())
>> +    return;
>> +
>> +  // Only check for small enums.
>> +  if (Enum->getNumPositiveBits() > 63 || Enum->getNumNegativeBits() > 64)
>> +    return;
>> +
>> +  typedef llvm::SmallVector<EnumConstantDecl*, 4> SameValueVector;
>> +  typedef llvm::DenseMap<int64_t, SameValueVector*> ValueToVectorMap;
>> +  typedef llvm::SmallVector<SameValueVector*, 10> DoubleVector;
>> +  ValueToVectorMap EnumMap;
>> +  DoubleVector EnumVector;
>> +  SameValueVector *ZeroVector = 0, *OneVector = 0;
>
> It took me a while to understand what this was doing, so I feel it could
> really benefit from a comment. This also appears to result in a ton of
> malloc traffic below. Here's my suggestion:
>
>   typedef llvm::SmallVector<EnumConstantDecl*, 3> ECDVector;
>   typedef llvm::SmallVector<ECDVector *, 3> DuplicatesVector;
>
>   typedef llvm::PointerUnion<EnumConstantDecl*, ECDVector *> DeclOrVector;
>   typedef llvm::DenseMap<int64_t, DeclOrVector> ValueToVectorMap;
>
>   DuplicatesVector DupVector;
>   ValueToVectorMap EnumMap;
>
> The trick here is that the DenseMap maps from a constant to the first
> EnumConstantDecl it encounters. Only if we encounter a second
> EnumConstantDecl with the same enum value do we pay the cost of allocating
> another vector. This will drastically optimize the common case, as calling
> malloc() is really slow. Right now the code appears to be doing a malloc()
> for every enum constant, which is going to really penalize us here.
>
>> +
>> +  for (unsigned i = 0; i < NumElements; ++i) {
>> +    EnumConstantDecl *ECD = cast<EnumConstantDecl>(Elements[i]);
>> +    if (!ECD) {
>> +      for (DoubleVector::iterator I = EnumVector.begin(),
>> +                                  E = EnumVector.end();
>> +           I != E; ++I)
>> +        delete *I;
>> +      return;
>> +    }
>
> I don't quite understand this loop through DoubleVector here, but it looks
> like logic in case we want to return early and clean up. Is there a case
> where the EnumConstantDecl can be null?
>
>> +
>> +    if (ECD->getInitExpr())
>> +      continue;
>> +
>> +    int64_t Val = GetInt64(ECD->getInitVal());
>> +
>
> Looks good.
>
>> +    if (Val == 0) {
>> +      if (ZeroVector) continue;
>> +      ZeroVector = new SameValueVector();
>> +      ZeroVector->push_back(ECD);
>> +      EnumVector.push_back(ZeroVector);
>> +    } else if (Val == 1) {
>> +      if (OneVector) continue;
>> +      OneVector = new SameValueVector();
>> +      OneVector->push_back(ECD);
>> +      EnumVector.push_back(OneVector);
>> +    } else {
>> +      if (EnumMap.find(Val) != EnumMap.end())
>> +        continue;
>> +      SameValueVector *ValueVector = new SameValueVector();
>> +      ValueVector->push_back(ECD);
>> +      EnumVector.push_back(ValueVector);
>> +      EnumMap.insert(std::make_pair(Val, ValueVector));
>
> The "find()" followed by the "insert()" is wasteful. It results in two
> lookups to the hash table when we could have just used one. More on that
> later.
>
>> +    }
>> +  }
>
> IMO, this looks like a lot of complexity just to handle the fact that 0 and 1
> are special values for the DenseMap. I don't really see this as the right
> tradeoff; the code is more complicated, with marginal impact on memory usage
> or performance.
>
> If you humor me for a bit, consider using something else for the key, e.g.:
>
>   struct DupKey {
>     int64_t val;
>     bool isTombstoneOrEmptyKey;
>   };
>
> The idea is that if 'isTombstoneOrEmptyKey' is true, we can use val = 0 or
> val = 1 to represent empty keys or tombstone entries. Otherwise, it's an
> int64_t with the full range of values. We can define a DenseMap trait to do
> the right thing. Yes, this costs a tiny bit more in storage, but it allows
> the data structure to handle the complete set of values in your domain,
> instead of resorting to complicating the core algorithm. What I see here now
> is the same code essentially duplicated twice, which makes it harder to read
> and more error-prone.
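Here is roughly what such a trait could look like -- untested sketch, the trait
name is made up, and it assumes the DupKey struct and the DeclOrVector typedef
above:

  struct DenseMapInfoDupKey {
    // val = 0 / val = 1 only mean "empty" / "tombstone" when the flag is set,
    // so real enum values of 0 and 1 need no special casing.
    static DupKey getEmptyKey() {
      DupKey K = { 0, true };
      return K;
    }
    static DupKey getTombstoneKey() {
      DupKey K = { 1, true };
      return K;
    }
    static unsigned getHashValue(const DupKey &Key) {
      return (unsigned)(Key.val * 37);
    }
    static bool isEqual(const DupKey &LHS, const DupKey &RHS) {
      return LHS.isTombstoneOrEmptyKey == RHS.isTombstoneOrEmptyKey &&
             LHS.val == RHS.val;
    }
  };

  typedef llvm::DenseMap<DupKey, DeclOrVector, DenseMapInfoDupKey>
      ValueToVectorMap;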
> If we use DupKey as our key for the DenseMap, we can instead do something
> like this:
>
>   DeclOrVector &entry = EnumMap[Val];  // Use default construction of 'entry'.
>
>   // Is this the first time we encountered this constant?
>   if (entry.isNull()) {
>     entry = ECD;
>     continue;
>   }
>
>   // Is this the second time we encountered this constant? If so,
>   // push the previous decl encountered and the one just encountered
>   // to a vector of duplicates.
>   if (EnumConstantDecl *D = entry.dyn_cast<EnumConstantDecl*>()) {
>     ECDVector *Vec = new ECDVector();
>     Vec->push_back(D);
>     Vec->push_back(ECD);
>
>     // Update the entry to refer to the duplicates.
>     entry = Vec;
>
>     // Store the duplicates in a vector we can consult later for
>     // quick emission of diagnostics.
>     DupVector.push_back(Vec);
>
>     // On to the next constant.
>     continue;
>   }
>
>   // Is this the third (or greater) time we encountered the constant? If so,
>   // continue to add it to the existing vector.
>   ECDVector *Vec = entry.get<ECDVector*>();
>   Vec->push_back(ECD);
>
> With this code, we only allocate memory (beyond the DenseMap) when we
> encounter a duplicate that would be worth reporting. In the common case,
> this savings in malloc traffic should be noticeable.
>
> Notice also that I used:
>
>   DeclOrVector &entry = EnumMap[Val];  // Use default construction of 'entry'.
>
> This results in a single lookup in the hashtable. Since we plan on adding a
> value for the key no matter what, this idiom lets the DenseMap default
> construct an entry if it doesn't exist, and we can then modify the value in
> place. This is obviously faster than doing the hashtable lookup twice.
>
>> +
>> +  for (unsigned i = 0; i < NumElements; ++i) {
>> +    EnumConstantDecl *ECD = cast<EnumConstantDecl>(Elements[i]);
>> +    if (!ValidDuplicateEnum(ECD, Enum))
>> +      continue;
>> +
>> +    int64_t Val = GetInt64(ECD->getInitVal());
>> +
>> +    if (Val == 0) {
>> +      if (!ZeroVector || *ZeroVector->begin() == ECD)
>> +        continue;
>> +      ZeroVector->push_back(ECD);
>> +    } else if (Val == 1) {
>> +      if (!OneVector || *OneVector->begin() == ECD)
>> +        continue;
>> +      OneVector->push_back(ECD);
>> +    } else {
>> +      ValueToVectorMap::iterator I = EnumMap.find(Val);
>> +      if (I == EnumMap.end())
>> +        continue;
>> +      SameValueVector *V = I->second;
>> +      if (*V->begin() == ECD)
>> +        continue;
>> +      V->push_back(ECD);
>> +    }
>> +  }
>
> This second loop looks unnecessary. I think we can do everything we need to
> count duplicates with one loop. Of course, ValidDuplicateEnum() would need
> to be hoisted to the first loop.
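For concreteness, the single merged pass might look roughly like this -- an
untested sketch building on the DupKey/DeclOrVector suggestion above; the
getInitExpr() handling and the early-return cleanup from the original first
pass would still need to be folded in:

  for (unsigned i = 0; i < NumElements; ++i) {
    EnumConstantDecl *ECD = cast_or_null<EnumConstantDecl>(Elements[i]);
    if (!ECD || !ValidDuplicateEnum(ECD, Enum))
      continue;

    DupKey Key = { GetInt64(ECD->getInitVal()), false };
    DeclOrVector &Entry = EnumMap[Key];  // single hashtable lookup

    if (Entry.isNull()) {
      // First constant seen with this value.
      Entry = ECD;
      continue;
    }
    if (EnumConstantDecl *First = Entry.dyn_cast<EnumConstantDecl*>()) {
      // Second constant with this value: only now pay for a vector.
      ECDVector *Vec = new ECDVector();
      Vec->push_back(First);
      Vec->push_back(ECD);
      Entry = Vec;
      DupVector.push_back(Vec);
      continue;
    }
    // Third or later duplicate.
    Entry.get<ECDVector*>()->push_back(ECD);
  }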
>
>> +
>> +  for (DoubleVector::iterator DoubleVectorIter = EnumVector.begin(),
>> +                              DoubleVectorEnd = EnumVector.end();
>> +       DoubleVectorIter != DoubleVectorEnd; ++DoubleVectorIter) {
>> +    SameValueVector *V = *DoubleVectorIter;
>> +    if (V->size() == 1)
>> +      continue;
>> +
>> +    SameValueVector::iterator I = V->begin();
>> +    S.Diag((*I)->getLocation(), diag::warn_duplicate_enum_values)
>> +        << (*I)->getName() << (*I)->getInitVal().toString(10)
>> +        << (*I)->getSourceRange();
>> +    ++I;
>> +    for (SameValueVector::iterator E = V->end(); I != E; ++I)
>> +      S.Diag((*I)->getLocation(), diag::note_duplicate_element)
>> +          << (*I)->getName() << (*I)->getInitVal().toString(10)
>> +          << (*I)->getSourceRange();
>> +    delete V;
>> +  }
>
> This is more or less the same; essentially it becomes:
>
>   for (DuplicatesVector::iterator I = DupVector.begin(), E = DupVector.end();
>        I != E; ++I) {
>     ECDVector *Vec = *I;
>     // do the diagnostic logic ...
>     delete *I;
>   }
>
> Note that with my suggestions the vector has size on the order of the number
> of duplicate constants, not the number of total constants. If there are no
> duplicates, no work is required (including freeing memory).
>
>> +}
>> +
>>  void Sema::ActOnEnumBody(SourceLocation EnumLoc, SourceLocation LBraceLoc,
>>                           SourceLocation RBraceLoc, Decl *EnumDeclX,
>>                           Decl **Elements, unsigned NumElements,
>> @@ -10709,6 +10868,7 @@
>>    DeclsInPrototypeScope.push_back(Enum);
>>
>>    CheckForUniqueEnumValues(*this, Elements, NumElements, Enum, EnumType);
>> +  CheckForDuplicateEnumValues(*this, Elements, NumElements, Enum, EnumType);
>>  }
>>
>>  Decl *Sema::ActOnFileScopeAsmDecl(Expr *expr,
>
> I know this may all be nit-picky, but I really think trying to reduce the
> malloc() traffic is worth looking at to get a real understanding of the
> performance improvement that can be found here.
>
> Thanks for forging ahead on this.
>
> _______________________________________________
> cfe-commits mailing list
> [email protected]
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
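For reference, this is the shape of code the check is aimed at (illustrative
example, not taken from the patch or its tests):

  enum Color {
    Red,           // implicitly 0
    Green = 0,     // same value as Red
    Blue = 2,
    Cyan,          // implicitly 3
    Magenta = 3    // same value as Cyan
  };

With the patch as posted, the warning would be expected to land on the
constants without initializers (Red, Cyan), with notes pointing at the
explicit duplicates (Green, Magenta).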
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
