Hi Richard,

Thanks for making great progress on this!  I agree with Richard Smith's 
suggestion about not using an std::map, but is there a reason to not just use a 
DenseMap to see if there are any enum collisions?  Fundamentally your algorithm 
is O(n*log n), not O(n).  Since you are converted everything to an int64_t, 
could you not just use a DenseMap<int64_t, EnumConstantDecl*>?  You could then 
iterate over the EnumConstantDecls in one linear pass, converting them to 
int64_t's on the fly for comparison purposes.  A few advantages I see are:

(1) No sorting, which is O(n log n)

(2) Diagnostics are based on declaration order without having to result to a 
stable_sort(), which is even more costly.  If you want to report all duplicates 
as notes, the mapped value could easily be made into a PointerIntPair, where 
the "int" corresponds to an index into another vector where you collect the 
duplicates.  Since the number of cases where duplicates will exist is few, 
you're optimizing for the common case.

With the current approach, I'm not certain, but something about the following 
looks curious:

> +    // Find the first duplicate enum.
> +    llvm::SmallVector<EnumConstantDecl*, 10>::iterator Iter = I + 1;
> +    for (; Iter != E && llvm::APSInt::isSameValue(Value, 
> (*Iter)->getInitVal());
> +         ++Iter) {
> +      if (ValidDuplicateEnum(*Iter, Enum))
> +        break;
> +    }
> +
> 


If the array is sorted by constant value, I don't believe there is any need to 
keep marching through the rest of the items if the next item isn't a match.  I 
may be mistaken, but I think this leaves your implementation still being 
quadratic in the worst case.  Further, the following:

> +
> +    if (Iter == E)
> +      break;
> +

appears to break out of the entire comparison entirely.  Isn't it possible for 
Iter == E if no match was found in the preceding loop?  For example, from your 
test cases, I'd expect to see a warning here:

> +
> +enum {
> +  I1 = -1,
> +  I2,
> +  I3,
> +  I4 = -2,
> +  I5,
> +  I6
> +};

Is there a reason why I1 and I5 are not reported as duplicates?  (or I3 and I6)?

At a high level, I honestly find this logic to be more complicated than I would 
have expected.  The sorting seems unnecessary, and will report diagnostics in 
an unnatural order (first based on enum constant value, then on declaration 
order).  A straight linear pass seems more naturally to me, and DenseMap is 
very efficient.  If you are worried about the malloc() traffic, you could 
always keep a DenseMap around in Sema just for this warning, and clear it out 
whenever entering CheckForDuplicateEnumValues().  You then pay the malloc() 
cost when you need a bigger map, which will quickly amortize the cost.

I honestly could be missing something here, so I apologize if I just am not 
reading the code correctly.

Cheers,
Ted

On Aug 13, 2012, at 7:05 PM, Richard Trieu <[email protected]> wrote:

> Several changes to this patch.  Richard Smith's suggestions has improved 
> performance.  Those changes are, only check values 64-bits or smaller and 
> directly compare the values instead of going through the functions and using 
> a sorted vector instead of a map to store the values.  Other changes were to 
> exclude a few more cases, and move the test from SemaCXX to Sema and then 
> have the RUN line test in both C and C++.
> 
> The improved performance has dropped the difference down to under 4% 
> difference for files purely made of a single enum from a previous ~10%. For 
> preprocessed Clang files, the difference was under the noise on my system.
> 
> On Fri, Jul 27, 2012 at 3:51 PM, Richard Trieu <[email protected]> wrote:
> Running Clang over itself, using preprocessed files and -fsyntax-only
> 28.86 to 29.08 seconds - Control
> 28.89 to 29.49 seconds - Clang with -Wduplicate-enum
> 
> This was run without Richard Smith's suggestions.  I will test those out to 
> see if there's any impact from them.
> 
> 
> On Wed, Jul 25, 2012 at 2:37 PM, Ted Kremenek <[email protected]> wrote:
> Ok.  That's still a scary number.  Do you have numbers for realistic 
> examples?  For example, we know Clang has some particularly large enums.  
> This micro benchmark is useful, but it may be overly pessimistic.
> 
> On Jul 25, 2012, at 2:34 PM, Richard Trieu <[email protected]> wrote:
> 
>> Yes, that is the slowdown for the entire -fsyntax-only time for a source 
>> file with only an enum in it.
>> 
>> On Wed, Jul 25, 2012 at 1:48 PM, Ted Kremenek <[email protected]> wrote:
>> Hi Richard,
>> 
>> If I am reading that right, the 6-10% slowdown is for the entire 
>> -fsyntax-only time?  If so, that's definitely cost prohibitive.
>> 
>> Ted
>> 
>> On Jul 19, 2012, at 8:25 PM, Richard Trieu <[email protected]> wrote:
>> 
>>> On Wed, Jul 18, 2012 at 9:14 PM, Ted Kremenek <[email protected]> wrote:
>>> On Jul 18, 2012, at 6:34 PM, Richard Trieu <[email protected]> wrote:
>>> 
>>>> A set could work for detecting the values, but both EnumConstantDecls are 
>>>> needed for the diagnostic, not just the values.  Possibly a map from 
>>>> APSInt->EnumConstantDecl* would work.  But either way, I would be dealing 
>>>> with getting APSInts to play nice with each other.
>>> 
>>> That seems reasonable to me.  The primary performance issue I see is the 
>>> quadratic algorithmic complexity.  If the APSInt comparisons are an issue, 
>>> we can see if we can find ways to optimize that further.
>>> 
>>> I created two more variations on and measured some timings.  Both used a 
>>> map, one with a custom compare function and one that extended the APSInt 
>>> value before insertion.  The APSInt extension had the better time, so I'll 
>>> be giving the number for that one.
>>> 
>>> At 10,000 elements, there was a 6-10% slow down.  This amounts to .01-.03 
>>> seconds difference on .13-.27 second runtime.
>>> 
>>> At 100,000 elements, 8-12% slow down.  .2-.3 seconds on 1.34 to 2.66 second 
>>> run time.
>>> 
>>> At 1,000,000 elements, 7-14% slow down.  Around 2 second difference for 
>>> runs of 13.6 to 26.7 seconds.
>>> 
>>> A new patch has been attached which has the APSInt bit extension before 
>>> adding to the map.
>>> <duplicate-enum-bit-extension.patch>
>> 
>> 
> 
> 
> 
> <duplicate-enum-vector.patch>

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

Reply via email to