On 16.12.2011 03:26, Levente Uzonyi wrote:
How about my numbers? :)
"Preallocate objects, so we won't count gc time."
n := 1000000.
objects := Array new: n streamContents: [ :stream |
n timesRepeat: [ stream nextPut: Object new ] ].
set := IdentitySet new: n.
Smalltalk garbageCollect.
[1 to: n do: [ :i | set add: (objects at: i) ] ] timeToRun. "4949"
set := LargeIdentitySet new.
Smalltalk garbageCollect.
[1 to: n do: [ :i | set add: (objects at: i) ] ] timeToRun. "331"
set := (PluggableSet new: n)
    hashBlock: [ :object |
        object identityHash * 4096 + object class identityHash * 64 ]; "Change this to #basicIdentityHash in Pharo"
    equalBlock: [ :a :b | a == b ];
    yourself.
Smalltalk garbageCollect.
[1 to: n do: [ :i | set add: (objects at: i) ] ] timeToRun. "5511"
I also have a LargeIdentityDictionary, which is relatively fast, but
not as fast as LargeIdentitySet, because (for some unknown reason) we
don't have a primitive that could support it. If we had a primitive
like primitive 132 which would return the index of the element if
found or 0 if not, then we could have a really fast
LargeIdentityDictionary.
Levente
Hehe yes, if writing a version that fully exploits the limited identity
hash range, that's probably the approach I would go for as well.
(Assuming it's the version at
http://leves.web.elte.hu/squeak/LargeIdentitySet.st)
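For illustration, here is a rough sketch of the dictionary lookup such an
index-returning primitive would speed up. It assumes a class holding two
parallel 4096-slot arrays, keyBuckets and valueBuckets; those names and the
bucket layout are made up for this sketch and are not the actual
LargeIdentityDictionary code. #basicIdentityHash is the Pharo selector; in
Squeak it would be plain #identityHash.

at: key ifAbsent: aBlock
    | keys values |
    keys := keyBuckets at: key basicIdentityHash + 1.
    keys ifNil: [ ^aBlock value ].
    values := valueBuckets at: key basicIdentityHash + 1.
    "Without an index-returning primitive this scan runs as plain Smalltalk;
    a primitive in the spirit of primitive 132 that answered the index of
    key in keys (or 0 when absent) would replace the whole loop with one send."
    1 to: keys size do: [ :index |
        (keys at: index) == key ifTrue: [ ^values at: index ] ].
    ^aBlock value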
Mariano commented in the version at
http://www.squeaksource.com/FuelExperiments that it's slow for them,
which I guess is due to not adapting the #identityHash calls to
#basicIdentityHash for Pharo:
((0 to: 4095) collect: [:each | each << 22 \\ 4096 ]) asSet size -> 1
So it basically uses 1 bucket instead of 4096... Whoops. :)
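For reference, the Pharo variant of the PluggableSet setup from the
benchmark above, with just the mechanical #basicIdentityHash substitution
that the comment there suggests (a sketch, not retimed here):

set := (PluggableSet new: n)
    hashBlock: [ :object |
        object basicIdentityHash * 4096 + object class basicIdentityHash * 64 ];
    equalBlock: [ :a :b | a == b ];
    yourself.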
I uploaded a new version, adapted for Pharo, to the MC repository. On the
same machine my numbers were taken from, it runs the same test as above
in 871 ms (181 ms with preallocation).
Cheers,
Henry