On Fri, 16 Dec 2011, Henrik Sperre Johansen wrote:

On 16.12.2011 03:26, Levente Uzonyi wrote:

How about my numbers? :)

"Preallocate objects, so we won't count gc time."
n := 1000000.
objects := Array new: n streamContents: [ :stream |
    n timesRepeat: [ stream nextPut: Object new ] ].

set := IdentitySet new: n.
Smalltalk garbageCollect.
[1 to: n do: [ :i | set add: (objects at: i) ] ] timeToRun. "4949"

set := LargeIdentitySet new.
Smalltalk garbageCollect.
[1 to: n do: [ :i | set add: (objects at: i) ] ] timeToRun. "331"

set := (PluggableSet new: n)
hashBlock: [ :object | object identityHash * 4096 + object class identityHash * 64 ]; "Change this to #basicIdentityHash in Pharo"
    equalBlock: [ :a :b | a == b ];
    yourself.
Smalltalk garbageCollect.
[1 to: n do: [ :i | set add: (objects at: i) ] ] timeToRun. "5511"


I also have a LargeIdentityDictionary, which is relatively fast, but not as fast as LargeIdentitySet, because (for some unknown reason) we don't have a primitive that could support it. If we had a primitive like primitive 132 which would return the index of the element if found or 0 if not, then we could have a really fast LargeIdentityDictionary.


Levente
Hehe yes, if writing a version fully exploiting the limited range, that's probably the approach I would go for as well. (IAssuming it's the version at http://leves.web.elte.hu/squeak/LargeIdentitySet.st)

Mariano commented in the version at http://www.squeaksource.com/FuelExperiments that it's slow for them, which I guess is due to not adopting #identityHash calls to #basicIdentityHash calls for Pharo:
((0 to: 4095) collect: [:each | each << 22 \\ 4096 ]) asSet size -> 1
So it basically uses 1 bucket instead of 4096... Whoops. :)

Uploaded a new version to the MC repository which is adapted for Pharo, on the same machine my numbers were taken from, it does the same test as I used above in 871 ms. (181 with preallocation).

Cool. One more thing: in Squeak the method using primitive 132 directly was renamed to #instVarsInclude:, so now #pointsTo: works as expected. If this was also added to Pharo, then the #pointsTo: sends should be changed to #instVarsInclude:, otherwise Array can be reported as included even if it wasn't added. I'll upload my LargeIdentityDictionary implementation to the same place this evening, since it's still 2-3 factor faster than other solutionts and there seem to be demand for it.


Levente


Cheers,
Henry



Reply via email to