Re: [Pharo-project] Do we have the new primitive?? [WAS] Re: IdentitySet but using #hash rather than #identityHash ?

Levente Uzonyi Mon, 27 Feb 2012 06:17:18 -0800

On Sat, 25 Feb 2012, Mariano Martinez Peck wrote:


            All I can say is that I am impressed by the numbers; it is really much
            faster.
            I still don't understand why I sent this email with a subject saying
            IdentitySet, because what I really need is a fast/large IdentityDictionary
            :(  Anyway, there's a place where we can use this LargeIdentitySet in Fuel
            (I think).

            So Levente, you say it is not possible to adapt this for a dictionary?  Can
            we contact Eliot to provide such a primitive?


As promised, I uploaded my LargeIdentityDictionary implementation to 
http://leves.web.elte.hu/squeak/LargeIdentityDictionary.st .
The numbers will be a bit worse compared to LargeIdentitySet, because of the 
lack of the primitive, but it's still 2-3x faster than other solutions
(IdentityDictionary, PluggableIdentityDictionary, subclassing, etc). I'm about 
to propose this primitive with other improvements on the vm-dev
list.


My proposals are still on the way. :)



Hi Eliot/Levente. What is the status of this? Do we already have the new
primitive? If so, how can we adapt LargeIdentitySet to use it?

AFAIK the new primitive is not implemented yet. Adding the primitive to
the interpreter VM is very easy, but it seems to be a lot more complicated
(to me) to add it to Cog, because the receiver can be a MethodContext
which needs special handling.
I'll rewrite both LargeIdentitySet and LargeIdentityDictionary when the
primitive is ready.



Levente


Thanks!






 

      Levente


            thanks

            On Fri, Dec 16, 2011 at 3:28 PM, Levente Uzonyi <[email protected]> 
wrote:

      On Fri, 16 Dec 2011, Henrik Sperre Johansen wrote:

       On 16.12.2011 03:26, Levente Uzonyi wrote:


            How about my numbers? :)

            "Preallocate objects, so we won't count gc time."
            n := 1000000.
            objects := Array new: n streamContents: [ :stream |
              n timesRepeat: [ stream nextPut: Object new ] ].

            set := IdentitySet new: n.
            Smalltalk garbageCollect.
            [1 to: n do: [ :i | set add: (objects at: i) ] ] timeToRun. "4949"

            set := LargeIdentitySet new.
            Smalltalk garbageCollect.
            [1 to: n do: [ :i | set add: (objects at: i) ] ] timeToRun. "331"

            set := (PluggableSet new: n)
              "Change #identityHash to #basicIdentityHash in Pharo."
              hashBlock: [ :object |
                object identityHash * 4096 + object class identityHash * 64 ];
              equalBlock: [ :a :b | a == b ];
              yourself.
            Smalltalk garbageCollect.
            [1 to: n do: [ :i | set add: (objects at: i) ] ] timeToRun. "5511"


            I also have a LargeIdentityDictionary, which is relatively fast, but not
            as fast as LargeIdentitySet, because (for some unknown reason) we don't
            have a primitive that could support it. If we had a primitive like
            primitive 132 which would return the index of the element if found or 0 if
            not, then we could have a really fast LargeIdentityDictionary.
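
To make the request concrete, here is a rough workspace sketch of the lookup such a primitive would perform, written in plain Smalltalk. The block name indexOfIdentical and the two-slot bucket are made up for illustration; the real thing would be a VM primitive, and this is not the actual LargeIdentityDictionary code.

```smalltalk
"Hypothetical Smalltalk-level equivalent of the wished-for primitive:
 answer the index of the identical element, or 0 if it is absent."
| a b bucket indexOfIdentical |
a := Object new.
b := Object new.
bucket := Array with: a with: nil.
indexOfIdentical := [ :array :object |
	| found |
	found := 0.
	1 to: array size do: [ :i |
		"Keep the first match only; compare by identity, not equality."
		(found = 0 and: [ (array at: i) == object ])
			ifTrue: [ found := i ] ].
	found ].
indexOfIdentical value: bucket value: a. "1"
indexOfIdentical value: bucket value: b. "0"
```

With the index in hand, a dictionary can read the value from a parallel array at the same position, which a plain true/false inclusion test does not allow.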


            Levente

      Hehe yes, if writing a version fully exploiting the limited range, that's
      probably the approach I would go for as well.
(Assuming it's the version at http://leves.web.elte.hu/squeak/LargeIdentitySet.st)

Mariano commented in the version at http://www.squeaksource.com/FuelExperiments
that it's slow for them, which I guess is due to not adapting the
#identityHash calls to #basicIdentityHash calls for Pharo:
((0 to: 4095) collect: [:each | each << 22 \\ 4096 ]) asSet size -> 1
So it basically uses 1 bucket instead of 4096... Whoops. :)
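
The collapse is easy to reproduce in a workspace. The sketch below assumes, as the expression above does, that the scaled hash is a 12-bit value shifted left by 22 bits; the variable names are only for illustration.

```smalltalk
"Compare bucket spread for shifted hashes versus raw 12-bit hashes."
| shifted raw |
"Every shifted value is a multiple of 4096, so the remainder is always 0."
shifted := ((0 to: 4095) collect: [ :each | (each << 22) \\ 4096 ]) asSet.
"The unshifted 12-bit values keep their distinct remainders."
raw := ((0 to: 4095) collect: [ :each | each \\ 4096 ]) asSet.
shifted size. "1"
raw size. "4096"
```

Any shift of 12 bits or more gives the same result, since the low 12 bits are then all zero.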

Uploaded a new version to the MC repository which is adapted for Pharo.
On the same machine my numbers were taken from, it does the same test as I
used above in 871 ms (181 with preallocation).


Cool. One more thing: in Squeak the method using primitive 132 directly
was renamed to #instVarsInclude:, so now #pointsTo: works as expected. If
this was also added to Pharo, then the #pointsTo: sends should be changed
to #instVarsInclude:, otherwise Array can be reported as included even if
it wasn't added.
I'll upload my LargeIdentityDictionary implementation to the same place
this evening, since it's still 2-3 times faster than other solutions and
there seems to be demand for it.


Levente


      Cheers,
      Henry






--
Mariano
http://marianopeck.wordpress.com





