OK, here's a filein that improves Pharo hashed collection performance quite a bit. Large collections are much faster, and small ones are pretty much the same speed as before. There are basically two fairly simple changes; the basic structure and algorithms of the collections is unchanged. The changes:

1. Spread identity hash values.
2. Make table sizes prime.


File it into PharoCore-1.0-10491rc1.image. It'll take a minute or two since it has to rehash the world halfway through. I don't know how to make another kind of packaging that can do that, so I'll leave that to someone else.

After the filein, there are some test failures, most of which do not seem to be *directly* related. I'm hoping someone that knows the affected tests can take a look and comment:


Unexpectedly pass ObjectFinalizerTests>>#testFinalizationOfEquals
  Not clear why, but this does not seem to be a problem :-)

Fails HostWindowTests>>#testOne
  But this test fails in the core image on Linux; HostWindows do not
  seem to be implemented for Linux.

Error on FontTest>>#testMultistringFont
  Japanese StrikeFonts have nil characterToGlyphMap,
  #createCharacterToGlyphMap answers nil,
  not immediately clear how this is supposed to be initialized for fonts
  with codepoints > 255.

PackageInfoTest>>testKernelPackage
  because some method in kernel package for Object is not in Object.
  I've changed #hash, so that's probably it.

Regards,

-Martin
'From Pharo1.0rc1 of 19 October 2009 [Latest update: #10491] on 23 October 2009 
at 5:54:55 pm'!

!ProtoObject methodsFor: 'comparing' stamp: 'MartinMcClure 10/22/2009 22:30'!
identityHashTEMP
        "Answer a SmallInteger whose value is related to the receiver's 
identity.
        This method must not be overridden, except by SmallInteger.
        Primitive. Fails if the receiver is a SmallInteger. Essential.
        See Object documentation whatIsAPrimitive.

        Do not override."

        ^self primIdentityHash bitShift: 18! !


!Object methodsFor: 'comparing' stamp: 'MartinMcClure 10/22/2009 22:46'!
hashTEMP
        "Answer a SmallInteger whose value is related to the receiver's 
identity.
        May be overridden, and should be overridden in any classes that define 
= "

        ^ self primIdentityHash bitShift: 18! !


!ProtoObject methodsFor: 'comparing' stamp: 'MartinMcClure 10/22/2009 21:13'!
primIdentityHash
        "Answer a SmallInteger whose value is related to the receiver's 
identity.
        This method must not be overridden, except by SmallInteger.
        Primitive. Fails if the receiver is a SmallInteger. Essential.
        See Object documentation whatIsAPrimitive.

        Do not override."

        <primitive: 75>
        self primitiveFailed! !


!SmallInteger methodsFor: 'comparing' stamp: 'MartinMcClure 10/23/2009 17:53'!
primIdentityHash
        "Senders of primIdentityHash do it because they expect to get an answer 
from 1-4095.
        So they should not send this to SmallIntegers, but should use 
#identityHash"
        
        "^self shouldNotImplement"
        
        "...OK, only here for the sake of FixedIdentitySet, which may not need 
it since it is probably not used for Integers.
        And FixedIdentitySet itself may not be needed now that IdentitySets are 
faster."
        
        ^self! !


!IdentityDictionary methodsFor: 'private' stamp: 'MartinMcClure 10/23/2009 
14:55'!
scanFor: anObject
        "Scan the key array for the first slot containing either a nil 
(indicating an empty slot) or an element that matches anObject. Answer the 
index of that slot or zero if no slot is found. This method will be overridden 
in various subclasses that have different interpretations for matching 
elements."
        | finish hash start element |
        finish := array size.
        start := (anObject identityHash \\ finish) + 1.

        "Search from (hash mod size) to the end."
        start to: finish do:
                [:index | ((element := array at: index) == nil or: [element key 
== anObject])
                        ifTrue: [^ index ]].

        "Search from 1 to where we started."
        1 to: start-1 do:
                [:index | ((element := array at: index) == nil or: [element key 
== anObject])
                        ifTrue: [^ index ]].

        ^ 0  "No match AND no empty slot"! !


!IdentitySet methodsFor: 'private' stamp: 'MartinMcClure 10/23/2009 14:57'!
scanFor: anObject
        "Scan the key array for the first slot containing either a nil 
(indicating an empty slot) or an element that matches anObject. Answer the 
index of that slot or zero if no slot is found. This method will be overridden 
in various subclasses that have different interpretations for matching 
elements."
        | finish hash start element |
        finish := array size.
        start := (anObject identityHash \\ finish) + 1.

        "Search from (hash mod size) to the end."
        start to: finish do:
                [:index | ((element := array at: index) == nil or: [element == 
anObject])
                        ifTrue: [^ index ]].

        "Search from 1 to where we started."
        1 to: start-1 do:
                [:index | ((element := array at: index) == nil or: [element == 
anObject])
                        ifTrue: [^ index ]].

        ^ 0  "No match AND no empty slot"! !


!MethodDictionary methodsFor: 'private' stamp: 'MartinMcClure 10/22/2009 21:24'!
scanFor: anObject
        "Scan the key array for the first slot containing either a nil 
(indicating an empty slot) or an element that matches anObject. Answer the 
index of that slot or zero if no slot is found. This method will be overridden 
in various subclasses that have different interpretations for matching 
elements."
        | element start finish |
        finish := array size.
        start := (anObject primIdentityHash \\ finish) + 1.

        "Search from (hash mod size) to the end."
        start to: finish do:
                [:index | ((element := self basicAt: index) == nil or: [element 
== anObject])
                        ifTrue: [^ index ]].

        "Search from 1 to where we started."
        1 to: start-1 do:
                [:index | ((element := self basicAt: index) == nil or: [element 
== anObject])
                        ifTrue: [^ index ]].

        ^ 0  "No match AND no empty slot"! !


!WeakIdentityKeyDictionary methodsFor: 'private' stamp: 'MartinMcClure 
10/23/2009 14:54'!
scanFor: anObject
        "ar 10/21/2000: The method has been copied to this location to indicate 
that whenever #scanFor: changes #scanForNil: must be changed in the receiver as 
well."
        "Scan the key array for the first slot containing either a nil 
(indicating an empty slot) or an element that matches anObject. Answer the 
index of that slot or zero if no slot is found. This method will be overridden 
in various subclasses that have different interpretations for matching 
elements."
        | element start finish hash |
        finish := array size.
        start := (anObject identityHash \\ finish) + 1.

        "Search from (hash mod size) to the end."
        start to: finish do:
                [:index | ((element := array at: index) == nil or: [element key 
== anObject])
                        ifTrue: [^ index ]].

        "Search from 1 to where we started."
        1 to: start-1 do:
                [:index | ((element := array at: index) == nil or: [element key 
== anObject])
                        ifTrue: [^ index ]].

        ^ 0  "No match AND no empty slot"! !

!WeakIdentityKeyDictionary methodsFor: 'private' stamp: 'MartinMcClure 
10/23/2009 14:54'!
scanForNil: anObject
        "Private. Scan the key array for the first slot containing nil 
(indicating an empty slot). Answer the index of that slot."
        | start finish hash |
        finish := array size.
        start := (anObject identityHash \\ array size) + 1.

        "Search from (hash mod size) to the end."
        start to: finish do:
                [:index | (array at: index) == nil ifTrue: [^ index ]].

        "Search from 1 to where we started."
        1 to: start-1 do:
                [:index | (array at: index) == nil ifTrue: [^ index ]].

        ^ 0  "No match AND no empty slot"! !







"---------------------Do surgery and rehash before 
continuing--------------------------"!

| dict method |
dict := ProtoObject methodDictionary.
method := dict at: #identityHashTEMP.
dict at: #identityHash put: method.
dict := Object methodDictionary.
method := dict at: #hashTEMP.
dict at: #identityHash put: method.

Set rehashAllSets.!

"---------- Life should be... better now :-) -----------------------------"!



!ProtoObject methodsFor: 'comparing' stamp: 'MartinMcClure 10/22/2009 22:30'!
identityHash
        "Answer a SmallInteger whose value is related to the receiver's 
identity.
        This method must not be overridden, except by SmallInteger.
        Primitive. Fails if the receiver is a SmallInteger. Essential.
        See Object documentation whatIsAPrimitive.

        Do not override."

        ^self primIdentityHash bitShift: 18! !


!Object methodsFor: 'comparing' stamp: 'MartinMcClure 10/22/2009 22:46'!
hash
        "Answer a SmallInteger whose value is related to the receiver's 
identity.
        May be overridden, and should be overridden in any classes that define 
= "

        ^ self primIdentityHash bitShift: 18! !


!FixedIdentitySet methodsFor: 'private' stamp: 'MartinMcClure 10/23/2009 14:49'!
indexOf: anObject
        anObject isNil ifTrue: [self error: 'This class collection cannot 
handle nil as an element'].
        ^ (anObject primIdentityHash bitAnd: self basicSize - 1) + 1! !


!Set class methodsFor: 'sizing' stamp: 'MartinMcClure 10/23/2009 09:44'!
goodPrimes
        "Answer a sorted array of prime numbers less than one hundred million 
        that make good hash table sizes. Should be expanded to more numbers if 
folks
        want to make larger collections.
        Need to check with Andres' book when I get back to work to see if I 
remembered
        it right :-)
        
        Generated with this code:
        
        | prevPrime primes goodPrimes |
        goodPrimes := OrderedCollection new.
        primes := Integer largePrimesUpTo: 100000000.
        goodPrimes add: 5.
        prevPrime := 5.
        primes do: 
                [:prime | prime > (prevPrime * 4 // 3) ifTrue: 
                        [| lowByte | lowByte := prime bitAnd: 16rFF.
                        (lowByte > 10 and: [lowByte < 245]) ifTrue:
                                [goodPrimes add: prime.
                                prevPrime := prime]]].
        ^goodPrimes asArray printString"
        
        ^#(5 11 17 23 31 43 59 79 107 149 199 269 359 479 641 857 1151 1549 
2069 2767 3691 4931 6577 8779 11717 15629 20849 27799 37087 49451 65951 87943 
117259 156347 208463 277961 370619 494167 658897 878539 1171393 1561883 2082527 
2776727 3702313 4936423 6581909 8775947 11701267 15601723 20802317 27736427 
36981911 49309219 65745677 87660917)!
]style[(10 311 405 337)f2b,f2,f1,f2! !

!Set class methodsFor: 'sizing' stamp: 'MartinMcClure 10/23/2009 10:14'!
goodPrimeAtLeast: lowerLimit
        "Answer the next good prime >= lowerlimit.
        If lowerLimit is larger than the largest known good prime,
        just make it odd."
        
        | primes low mid high prime |
        primes := self goodPrimes.
        low := 1.
        high := primes size.
        lowerLimit > (primes at: high) ifTrue:
                [^lowerLimit even 
                        ifTrue: [lowerLimit + 1]
                        ifFalse: [lowerLimit]].
        [mid := high - low // 2 + low.
                prime := primes at: mid.
                prime < lowerLimit
                        ifTrue: [low := mid]
                        ifFalse: [high := mid].
                high - low <= 1 ifTrue:
                        [^primes at: high].
                prime == lowerLimit ifTrue:
                        [^prime]] repeat
                
        !
]style[(28 158 411)f2b,f2,f1! !

!Set methodsFor: 'private' stamp: 'MartinMcClure 10/23/2009 10:25'!
growSize
        "Answer what my next higher table size should be"
        ^ self class goodPrimeAtLeast: array size * 2! !


!Set methodsFor: 'private' stamp: 'MartinMcClure 10/23/2009 10:25'!
grow
        "Grow the elements array and reinsert the old elements"
        | oldElements |
        oldElements := array.
        array := Array new: self growSize.
        tally := 0.
        oldElements do:
                [:each | each == nil ifFalse: [self noCheckAdd: each]]! !


!Set class methodsFor: 'instance creation' stamp: 'MartinMcClure 10/23/2009 
10:19'!
sizeFor: nElements
        "Large enough size to hold nElements with some slop (see fullCheck)"
        nElements <= 0 ifTrue: [^ 5].
        ^ self goodPrimeAtLeast: (nElements+1*4//3)! !



!WeakSet methodsFor: 'private' stamp: 'MartinMcClure 10/23/2009 10:26'!
grow
        "Grow the elements array if needed.
        Since WeakSets just nil their slots, alot of the occupied (in the eyes 
of the set) slots are usually    empty. Doubling size if unneeded can lead to 
BAD performance, therefore we see if reassigning   the <live> elements to a Set 
of similiar size leads to a sufficiently (50% used here) empty set first.
        and reinsert the old elements"
        |oldTally|
        oldTally := tally.
        self growTo: array size.
        oldTally >> 1 < tally ifTrue: [
        self growTo: self growSize]! !


!MethodPragmaTest methodsFor: 'testing-primitives' stamp: 'MartinMcClure 
10/23/2009 12:37'!
testPrimitiveIndexed2
        "This test useses the #identityHash primitive."

        self compile: '<primitive: 75> ^ #idHash' selector: #idHash.
        self assert: self idHash = self primIdentityHash.! !


!SmallInteger reorganize!
('arithmetic' * + - / // \\ gcd: quo:)
('bit manipulation' bitAnd: bitOr: bitShift: bitXor: hashMultiply highBit 
highBitOfMagnitude lowBit)
('comparing' < <= = > >= hash identityHash primIdentityHash ~=)
('converting' as31BitSmallInt asFloat)
('copying' clone deepCopy shallowCopy veryDeepCopyWith:)
('printing' decimalDigitLength destinationBuffer: numberOfDigitsInBase: 
printOn:base: printOn:base:nDigits: printString printStringBase: 
printStringBase:nDigits: threeDigitName)
('system primitives' asOop digitAt: digitAt:put: digitLength instVarAt: 
nextInstance nextObject)
('testing' even isLarge odd)
('private' fromString:radix: highBitOfPositiveReceiver)
!


!Set class reorganize!
('initialization' quickRehashAllSets rehashAllSets)
('sizing' goodPrimeAtLeast: goodPrimes)
('instance creation' new new: newFrom: sizeFor:)
!

ProtoObject removeSelector: #identityHashTEMP!
Object removeSelector: #hashTEMP!

!ProtoObject reorganize!
('apply primitives' tryNamedPrimitive tryNamedPrimitive: 
tryNamedPrimitive:with: tryNamedPrimitive:with:with: 
tryNamedPrimitive:with:with:with: tryNamedPrimitive:with:with:with:with: 
tryNamedPrimitive:with:with:with:with:with: 
tryNamedPrimitive:with:with:with:with:with:with: 
tryNamedPrimitive:with:with:with:with:with:with:with: tryPrimitive:withArgs:)
('closure-prims' privGetInstVar: privRemoteReturnTo: privSetInHolder: 
privSetInstVar:put: privStoreIn:instVar:)
('comparing' == identityHash primIdentityHash ~~)
('debugging' doOnlyOnce: flag: rearmOneShot withArgs:executeMethod:)
('initialize-release' initialize)
('method execution' executeMethod: with:executeMethod: with:with:executeMethod: 
with:with:with:executeMethod: with:with:with:with:executeMethod:)
('objects from disk' rehash)
('system primitives' become: cannotInterpret: doesNotUnderstand: nextInstance 
nextObject)
('testing' ifNil: ifNil:ifNotNil: ifNotNil: ifNotNil:ifNil: isInMemory isNil 
pointsTo:)
!

_______________________________________________
Pharo-project mailing list
[email protected]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project

Reply via email to