> All I can say is that I am impressed by the numbers it is really much >> faster. >> I still don't understand why I send this email with a subject say >> IdentitySet because what I really need is a fast/large IdentityDictionary >> :( Anyway, there's a place where we can use this LargeIdentitySet in Fuel >> I think). >> >> So Levente, you say this is not possible to adapt this for dictionary? >> can >> we contact Eliot to provide such a primitive? >> > > As promised, I uploaded my LargeIdentityDictionary implementation to > http://leves.web.elte.hu/**squeak/**LargeIdentityDictionary.st<http://leves.web.elte.hu/squeak/LargeIdentityDictionary.st>. > The numbers will be a bit worse compared to LargeIdentitySet, because of > the lack of the primitive, but it's still 2-3x faster than other solutions > (IdentityDictionary, PluggableIdentityDictionary, subclassing, etc). I'm > about to propose this primitive with other improvements on the vm-dev list. > > Hi Eliot/Levente. What is the status of this? Do we have already the new primitive? If true, how can we adapt LargeIdentitySet to use such new primitive?
Thanks! > > Levente > > >> thanks >> >> On Fri, Dec 16, 2011 at 3:28 PM, Levente Uzonyi <[email protected]> wrote: >> >> On Fri, 16 Dec 2011, Henrik Sperre Johansen wrote: >>> >>> On 16.12.2011 03:26, Levente Uzonyi wrote: >>> >>>> >>>> >>>>> How about my numbers? :) >>>>> >>>>> "Preallocate objects, so we won't count gc time." >>>>> n := 1000000. >>>>> objects := Array new: n streamContents: [ :stream | >>>>> n timesRepeat: [ stream nextPut: Object new ] ]. >>>>> >>>>> set := IdentitySet new: n. >>>>> Smalltalk garbageCollect. >>>>> [1 to: n do: [ :i | set add: (objects at: i) ] ] timeToRun. "4949" >>>>> >>>>> set := LargeIdentitySet new. >>>>> Smalltalk garbageCollect. >>>>> [1 to: n do: [ :i | set add: (objects at: i) ] ] timeToRun. "331" >>>>> >>>>> set := (PluggableSet new: n) >>>>> hashBlock: [ :object | object identityHash * 4096 + object class >>>>> identityHash * 64 ]; "Change this to #basicIdentityHash in Pharo" >>>>> equalBlock: [ :a :b | a == b ]; >>>>> yourself. >>>>> Smalltalk garbageCollect. >>>>> [1 to: n do: [ :i | set add: (objects at: i) ] ] timeToRun. "5511" >>>>> >>>>> >>>>> I also have a LargeIdentityDictionary, which is relatively fast, but >>>>> not >>>>> as fast as LargeIdentitySet, because (for some unknown reason) we don't >>>>> have a primitive that could support it. If we had a primitive like >>>>> primitive 132 which would return the index of the element if found or >>>>> 0 if >>>>> not, then we could have a really fast LargeIdentityDictionary. >>>>> >>>>> >>>>> Levente >>>>> >>>>> Hehe yes, if writing a version fully exploiting the limited range, >>>> that's >>>> probably the approach I would go for as well. >>>> (IAssuming it's the version at http://leves.web.elte.hu/** >>>> squeak/LargeIdentitySet.st<htt**p://leves.web.elte.hu/squeak/** >>>> LargeIdentitySet.st<http://leves.web.elte.hu/squeak/LargeIdentitySet.st> >>>> > >>>> ) >>>> >>>> Mariano commented in the version at http://www.squeaksource.com/** >>>> FuelExperiments >>>> <http://www.squeaksource.com/**FuelExperiments<http://www.squeaksource.com/FuelExperiments>> >>>> that it's >>>> >>>> slow for them, which I guess is due to not adopting #identityHash calls >>>> to >>>> #basicIdentityHash calls for Pharo: >>>> ((0 to: 4095) collect: [:each | each << 22 \\ 4096 ]) asSet size -> 1 >>>> So it basically uses 1 bucket instead of 4096... Whoops. :) >>>> >>>> Uploaded a new version to the MC repository which is adapted for Pharo, >>>> on the same machine my numbers were taken from, it does the same test >>>> as I >>>> used above in 871 ms. (181 with preallocation). >>>> >>>> >>> Cool. One more thing: in Squeak the method using primitive 132 directly >>> was renamed to #instVarsInclude:, so now #pointsTo: works as expected. If >>> this was also added to Pharo, then the #pointsTo: sends should be changed >>> to #instVarsInclude:, otherwise Array can be reported as included even if >>> it wasn't added. >>> I'll upload my LargeIdentityDictionary implementation to the same place >>> this evening, since it's still 2-3 factor faster than other solutionts >>> and >>> there seem to be demand for it. >>> >>> >>> Levente >>> >>> >>> Cheers, >>>> Henry >>>> >>>> >>>> >>>> >>> >> >> -- >> Mariano >> http://marianopeck.wordpress.**com <http://marianopeck.wordpress.com> >> >> > -- Mariano http://marianopeck.wordpress.com
