On 21.05.2013, at 10:58, Camillo Bruni <[email protected]> wrote:

> my random pessimistic comments
> 
> a) your benchmarking script sucks! => use SMark!! 
> http://smalltalkhub.com/#!/~StefanMarr/SMark

We're fully aware of that :)

> b) check my master thesis for a more thorough testing of dictionary, 
> including a cleaner implementation that defaults to something like a small 
> dictionary without a special class: 
> http://scg.unibe.ch/archive/masters/Brun11a.pdf
> 
> the optimistic version:
> 
> 
> On 2013-05-21, at 09:47, Max Leske <[email protected]> wrote:
> 
>> My coworker performed some interesting benchmarks on the various dictionary 
>> types in Pharo. If you look at the attachments you'll see four things:
>> 1. SmallDictionary and SmallIdentityDictionary are totally useless when it 
>> comes to performance. Unless their existence is somehow justified by saving 
>> space (which might be important for embedded systems) I don't see why we 
>> should keep them around.
> 
> yes, their usage is not always justified, but be careful what you measure: as 
> far as I can see you only measure addition, no? (sorry again, but that benchmark 
> brutally sucks, it's so damn hard to read... :P). Iteration and retrieval 
> have to be measured as well, and also how the dictionaries behave under key hash 
> collisions (though I am definitely not defending the small dicts.. :).
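A minimal sketch of what measuring the three operations separately could look like. It does not use SMark; the key count (1000), the repeat count (100) and all variable names are assumptions for illustration, not part of the original benchmark.

```smalltalk
"Hypothetical sketch: time insertion, lookup and iteration separately.
 The key count and repeat count are arbitrary assumptions."
| keys results |
keys := (1 to: 1000) collect: [ :i | i printString ].
results := Dictionary new.
{ Dictionary. IdentityDictionary } do: [ :class |
	| dict addTime lookupTime iterateTime |
	dict := class new.
	"Insertion: the same keys are stored repeatedly."
	addTime := [ 100 timesRepeat: [
		keys doWithIndex: [ :key :i | dict at: key put: i ] ] ] timeToRun.
	"Retrieval: every key is looked up again."
	lookupTime := [ 100 timesRepeat: [
		keys do: [ :key | dict at: key ifAbsent: [ nil ] ] ] ] timeToRun.
	"Iteration: walk all associations without doing any work."
	iterateTime := [ 100 timesRepeat: [
		dict keysAndValuesDo: [ :key :value | value ] ] ] timeToRun.
	results at: class name put: { addTime. lookupTime. iterateTime } ].
results
```

Inspecting `results` gives one triple of timings per class. The lookups reuse the very same string objects as keys, so IdentityDictionary finds them as well.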
> 
>> 2. The special Fuel collections for large numbers of elements could be a 
>> valuable addition to the basic collections
> 
> There is the pluggable key dictionary for that reason already, no? The main 
> problem is the limited hashes, which result in quite a few hash collisions for 
> very large dictionaries.
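A rough way to see the effect of limited hashes, as a sketch only: count how many distinct #identityHash values a batch of fresh objects gets. The batch size of 10000 is an arbitrary assumption, and the result depends on the image and VM, so no number is claimed here.

```smalltalk
"Hypothetical sketch: how many distinct identity hashes do 10000 objects get?
 The objects are kept in a collection so that all of them stay alive."
| objects distinctHashes |
objects := (1 to: 10000) collect: [ :i | Object new ].
distinctHashes := (objects collect: [ :each | each identityHash ]) asSet size.
distinctHashes
```

If the answer is well below 10000, an identity-based dictionary of that size has to resolve the difference through collision handling.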
> 
>> 3. SmallInteger>>identityHash is really slow. Using SmallIntegers as keys in 
>> an IdentityDictionary is significantly slower than using them in a regular 
>> Dictionary because the identity check is done via #identityHash (#hash 
>> simply answers self).
> 
> that is really strange :/
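To isolate point 3 from the string-key benchmark, something like the following could be run. It is a sketch under assumptions: the key range (1 to 1000) and the repeat count (1000) are arbitrary, and no particular timings are implied.

```smalltalk
"Hypothetical sketch: identical SmallInteger keys, two dictionary classes.
 Answers an Array of (class name -> time to run) associations."
| keys timings |
keys := (1 to: 1000) asArray.
timings := { Dictionary. IdentityDictionary } collect: [ :class |
	| dict |
	dict := class new.
	class name -> [ 1000 timesRepeat: [
		keys do: [ :key | dict at: key put: key ] ] ] timeToRun ].
timings
```

Since both runs store the same keys in the same order, the comparison is dominated by the cost of #hash versus #identityHash plus the respective lookup.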
> 
>> 4. Arrays are cool :)
>> 
>> Should I open issues on FogBugz for any of these points? Especially point 3 
>> bothers me because I feel that using a "primitive" type should be fast under 
>> all circumstances.
>> 
>> You'll find below the benchmark data (collected in 1.4, verified in 2.0) and 
>> the code to run it.
>> 
>> Cheers,
>> Max
>> 
>> <performance.xlsx>
>> <performance.xls>
>>> 
>>> --------------------------------------------------------------
>>> 
>>> (comments are variations)
>>> 
>>> sizes := #( 10 20 30 40 50 60 70 80 90 100 120 130 140 150 160 170 180 190 
>>> 200 300 400 500 600 700 800 900 1000 ).
>>> "sizes := #( 10 20 30 40 50 60 70 80 90 100 120 130 140 150 160 170 180 190 
>>> 200 300 400 500 600 700 800 900 1000 1500 2000 2500 3000 3500 4000 4500 
>>> 5000 6000 7000 8000 9000 10000 )."
>>> "batches := sizes collect: [ :size |
>>>     size -> ((10000000 / size) asFloat // 10) ]."
>>> batches := sizes collect: [ :size | size -> 10000 ].
>>> alphabet := Character alphabet.
>>> 
>>> batches do: [ :a | Transcript tab; show: a key asString ]. Transcript cr.
>>> batches do: [ :a | Transcript tab; show: a value asString ]. Transcript cr.
>>> "{ Dictionary. SmallDictionary. IdentityDictionary. 
>>> SmallIdentityDictionary. FLLargeIdentityDictionary } "
>>> { Dictionary. IdentityDictionary. FLLargeIdentityDictionary } 
>>> "{ Array }"
>>>     do: [ :class | 
>>>             Smalltalk garbageCollect.
>>>             Transcript show: class name.
>>>             batches
>>>                     do: [ :a |
>>>                             | struct strings iterations |
>>>                             "strings := (1 to: a key) collect: [ :i | i asString ]."
>>>                             strings := (1 to: a key) collect: [ :i |
>>>                                     String streamContents: [ :s |
>>>                                             20 atRandom timesRepeat: [
>>>                                                     s nextPut: alphabet atRandom ] ] ].
>>>                             struct := class new.
>>>                             "struct := (class = FLLargeIdentityDictionary
>>>                                     ifTrue: [ class new ]
>>>                                     ifFalse: [ class new: a key ])."
>>>                             Transcript tab; show: ([
>>>                                     a value timesRepeat: [
>>>                                             1 to: a key do: [ :i |
>>>                                                     struct at: (strings at: i) "i" put: i ] ] ] timeToRun).
>>>                             strings := nil ]
>>>                     displayingProgress: class name asString.
>>>             Transcript cr. ]
>>>     displayingProgress: 'Processing ...'
>>>     
>> 
>> _______________________________________________
>> Pharo-fuel mailing list
>> [email protected]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-fuel
> 

