OK, here's a filein that improves Pharo hashed collection performance
quite a bit. Large collections are much faster, and small ones are
pretty much the same speed as before. There are basically two fairly
simple changes; the basic structure and algorithms of the collections is
unchanged. The changes:
1. Spread identity hash values.
2. Make table sizes prime.
File it into PharoCore-1.0-10491rc1.image. It'll take a minute or two
since it has to rehash the world halfway through. I don't know how to
make another kind of packaging that can do that, so I'll leave that to
someone else.
After the filein, there are some test failures, most of which do not
seem to be *directly* related. I'm hoping someone that knows the
affected tests can take a look and comment:
Unexpectedly pass ObjectFinalizerTests>>#testFinalizationOfEquals
Not clear why, but this does not seem to be a problem :-)
Fails HostWindowTests>>#testOne
But this test fails in the core image on Linux; HostWindows do not
seem to be implemented for Linux.
Error on FontTest>>#testMultistringFont
Japanese StrikeFonts have nil characterToGlyphMap,
#createCharacterToGlyphMap answers nil,
not immediately clear how this is supposed to be initialized for fonts
with codepoints > 255.
PackageInfoTest>>testKernelPackage
because some method in kernel package for Object is not in Object.
I've changed #hash, so that's probably it.
Regards,
-Martin
'From Pharo1.0rc1 of 19 October 2009 [Latest update: #10491] on 23 October 2009
at 5:54:55 pm'!
!ProtoObject methodsFor: 'comparing' stamp: 'MartinMcClure 10/22/2009 22:30'!
identityHashTEMP
"Answer a SmallInteger whose value is related to the receiver's
identity.
This method must not be overridden, except by SmallInteger.
Primitive. Fails if the receiver is a SmallInteger. Essential.
See Object documentation whatIsAPrimitive.
Do not override."
^self primIdentityHash bitShift: 18! !
!Object methodsFor: 'comparing' stamp: 'MartinMcClure 10/22/2009 22:46'!
hashTEMP
"Answer a SmallInteger whose value is related to the receiver's
identity.
May be overridden, and should be overridden in any classes that define
= "
^ self primIdentityHash bitShift: 18! !
!ProtoObject methodsFor: 'comparing' stamp: 'MartinMcClure 10/22/2009 21:13'!
primIdentityHash
"Answer a SmallInteger whose value is related to the receiver's
identity.
This method must not be overridden, except by SmallInteger.
Primitive. Fails if the receiver is a SmallInteger. Essential.
See Object documentation whatIsAPrimitive.
Do not override."
<primitive: 75>
self primitiveFailed! !
!SmallInteger methodsFor: 'comparing' stamp: 'MartinMcClure 10/23/2009 17:53'!
primIdentityHash
"Senders of primIdentityHash do it because they expect to get an answer
from 1-4095.
So they should not send this to SmallIntegers, but should use
#identityHash"
"^self shouldNotImplement"
"...OK, only here for the sake of FixedIdentitySet, which may not need
it since it is probably not used for Integers.
And FixedIdentitySet itself may not be needed now that IdentitySets are
faster."
^self! !
!IdentityDictionary methodsFor: 'private' stamp: 'MartinMcClure 10/23/2009
14:55'!
scanFor: anObject
"Scan the key array for the first slot containing either a nil
(indicating an empty slot) or an element that matches anObject. Answer the
index of that slot or zero if no slot is found. This method will be overridden
in various subclasses that have different interpretations for matching
elements."
| finish hash start element |
finish := array size.
start := (anObject identityHash \\ finish) + 1.
"Search from (hash mod size) to the end."
start to: finish do:
[:index | ((element := array at: index) == nil or: [element key
== anObject])
ifTrue: [^ index ]].
"Search from 1 to where we started."
1 to: start-1 do:
[:index | ((element := array at: index) == nil or: [element key
== anObject])
ifTrue: [^ index ]].
^ 0 "No match AND no empty slot"! !
!IdentitySet methodsFor: 'private' stamp: 'MartinMcClure 10/23/2009 14:57'!
scanFor: anObject
"Scan the key array for the first slot containing either a nil
(indicating an empty slot) or an element that matches anObject. Answer the
index of that slot or zero if no slot is found. This method will be overridden
in various subclasses that have different interpretations for matching
elements."
| finish hash start element |
finish := array size.
start := (anObject identityHash \\ finish) + 1.
"Search from (hash mod size) to the end."
start to: finish do:
[:index | ((element := array at: index) == nil or: [element ==
anObject])
ifTrue: [^ index ]].
"Search from 1 to where we started."
1 to: start-1 do:
[:index | ((element := array at: index) == nil or: [element ==
anObject])
ifTrue: [^ index ]].
^ 0 "No match AND no empty slot"! !
!MethodDictionary methodsFor: 'private' stamp: 'MartinMcClure 10/22/2009 21:24'!
scanFor: anObject
"Scan the key array for the first slot containing either a nil
(indicating an empty slot) or an element that matches anObject. Answer the
index of that slot or zero if no slot is found. This method will be overridden
in various subclasses that have different interpretations for matching
elements."
| element start finish |
finish := array size.
start := (anObject primIdentityHash \\ finish) + 1.
"Search from (hash mod size) to the end."
start to: finish do:
[:index | ((element := self basicAt: index) == nil or: [element
== anObject])
ifTrue: [^ index ]].
"Search from 1 to where we started."
1 to: start-1 do:
[:index | ((element := self basicAt: index) == nil or: [element
== anObject])
ifTrue: [^ index ]].
^ 0 "No match AND no empty slot"! !
!WeakIdentityKeyDictionary methodsFor: 'private' stamp: 'MartinMcClure
10/23/2009 14:54'!
scanFor: anObject
"ar 10/21/2000: The method has been copied to this location to indicate
that whenever #scanFor: changes #scanForNil: must be changed in the receiver as
well."
"Scan the key array for the first slot containing either a nil
(indicating an empty slot) or an element that matches anObject. Answer the
index of that slot or zero if no slot is found. This method will be overridden
in various subclasses that have different interpretations for matching
elements."
| element start finish hash |
finish := array size.
start := (anObject identityHash \\ finish) + 1.
"Search from (hash mod size) to the end."
start to: finish do:
[:index | ((element := array at: index) == nil or: [element key
== anObject])
ifTrue: [^ index ]].
"Search from 1 to where we started."
1 to: start-1 do:
[:index | ((element := array at: index) == nil or: [element key
== anObject])
ifTrue: [^ index ]].
^ 0 "No match AND no empty slot"! !
!WeakIdentityKeyDictionary methodsFor: 'private' stamp: 'MartinMcClure
10/23/2009 14:54'!
scanForNil: anObject
"Private. Scan the key array for the first slot containing nil
(indicating an empty slot). Answer the index of that slot."
| start finish hash |
finish := array size.
start := (anObject identityHash \\ array size) + 1.
"Search from (hash mod size) to the end."
start to: finish do:
[:index | (array at: index) == nil ifTrue: [^ index ]].
"Search from 1 to where we started."
1 to: start-1 do:
[:index | (array at: index) == nil ifTrue: [^ index ]].
^ 0 "No match AND no empty slot"! !
"---------------------Do surgery and rehash before
continuing--------------------------"!
| dict method |
dict := ProtoObject methodDictionary.
method := dict at: #identityHashTEMP.
dict at: #identityHash put: method.
dict := Object methodDictionary.
method := dict at: #hashTEMP.
dict at: #identityHash put: method.
Set rehashAllSets.!
"---------- Life should be... better now :-) -----------------------------"!
!ProtoObject methodsFor: 'comparing' stamp: 'MartinMcClure 10/22/2009 22:30'!
identityHash
"Answer a SmallInteger whose value is related to the receiver's
identity.
This method must not be overridden, except by SmallInteger.
Primitive. Fails if the receiver is a SmallInteger. Essential.
See Object documentation whatIsAPrimitive.
Do not override."
^self primIdentityHash bitShift: 18! !
!Object methodsFor: 'comparing' stamp: 'MartinMcClure 10/22/2009 22:46'!
hash
"Answer a SmallInteger whose value is related to the receiver's
identity.
May be overridden, and should be overridden in any classes that define
= "
^ self primIdentityHash bitShift: 18! !
!FixedIdentitySet methodsFor: 'private' stamp: 'MartinMcClure 10/23/2009 14:49'!
indexOf: anObject
anObject isNil ifTrue: [self error: 'This class collection cannot
handle nil as an element'].
^ (anObject primIdentityHash bitAnd: self basicSize - 1) + 1! !
!Set class methodsFor: 'sizing' stamp: 'MartinMcClure 10/23/2009 09:44'!
goodPrimes
"Answer a sorted array of prime numbers less than one hundred million
that make good hash table sizes. Should be expanded to more numbers if
folks
want to make larger collections.
Need to check with Andres' book when I get back to work to see if I
remembered
it right :-)
Generated with this code:
| prevPrime primes goodPrimes |
goodPrimes := OrderedCollection new.
primes := Integer largePrimesUpTo: 100000000.
goodPrimes add: 5.
prevPrime := 5.
primes do:
[:prime | prime > (prevPrime * 4 // 3) ifTrue:
[| lowByte | lowByte := prime bitAnd: 16rFF.
(lowByte > 10 and: [lowByte < 245]) ifTrue:
[goodPrimes add: prime.
prevPrime := prime]]].
^goodPrimes asArray printString"
^#(5 11 17 23 31 43 59 79 107 149 199 269 359 479 641 857 1151 1549
2069 2767 3691 4931 6577 8779 11717 15629 20849 27799 37087 49451 65951 87943
117259 156347 208463 277961 370619 494167 658897 878539 1171393 1561883 2082527
2776727 3702313 4936423 6581909 8775947 11701267 15601723 20802317 27736427
36981911 49309219 65745677 87660917)!
]style[(10 311 405 337)f2b,f2,f1,f2! !
!Set class methodsFor: 'sizing' stamp: 'MartinMcClure 10/23/2009 10:14'!
goodPrimeAtLeast: lowerLimit
"Answer the next good prime >= lowerlimit.
If lowerLimit is larger than the largest known good prime,
just make it odd."
| primes low mid high prime |
primes := self goodPrimes.
low := 1.
high := primes size.
lowerLimit > (primes at: high) ifTrue:
[^lowerLimit even
ifTrue: [lowerLimit + 1]
ifFalse: [lowerLimit]].
[mid := high - low // 2 + low.
prime := primes at: mid.
prime < lowerLimit
ifTrue: [low := mid]
ifFalse: [high := mid].
high - low <= 1 ifTrue:
[^primes at: high].
prime == lowerLimit ifTrue:
[^prime]] repeat
!
]style[(28 158 411)f2b,f2,f1! !
!Set methodsFor: 'private' stamp: 'MartinMcClure 10/23/2009 10:25'!
growSize
"Answer what my next higher table size should be"
^ self class goodPrimeAtLeast: array size * 2! !
!Set methodsFor: 'private' stamp: 'MartinMcClure 10/23/2009 10:25'!
grow
"Grow the elements array and reinsert the old elements"
| oldElements |
oldElements := array.
array := Array new: self growSize.
tally := 0.
oldElements do:
[:each | each == nil ifFalse: [self noCheckAdd: each]]! !
!Set class methodsFor: 'instance creation' stamp: 'MartinMcClure 10/23/2009
10:19'!
sizeFor: nElements
"Large enough size to hold nElements with some slop (see fullCheck)"
nElements <= 0 ifTrue: [^ 5].
^ self goodPrimeAtLeast: (nElements+1*4//3)! !
!WeakSet methodsFor: 'private' stamp: 'MartinMcClure 10/23/2009 10:26'!
grow
"Grow the elements array if needed.
Since WeakSets just nil their slots, alot of the occupied (in the eyes
of the set) slots are usually empty. Doubling size if unneeded can lead to
BAD performance, therefore we see if reassigning the <live> elements to a Set
of similiar size leads to a sufficiently (50% used here) empty set first.
and reinsert the old elements"
|oldTally|
oldTally := tally.
self growTo: array size.
oldTally >> 1 < tally ifTrue: [
self growTo: self growSize]! !
!MethodPragmaTest methodsFor: 'testing-primitives' stamp: 'MartinMcClure
10/23/2009 12:37'!
testPrimitiveIndexed2
"This test useses the #identityHash primitive."
self compile: '<primitive: 75> ^ #idHash' selector: #idHash.
self assert: self idHash = self primIdentityHash.! !
!SmallInteger reorganize!
('arithmetic' * + - / // \\ gcd: quo:)
('bit manipulation' bitAnd: bitOr: bitShift: bitXor: hashMultiply highBit
highBitOfMagnitude lowBit)
('comparing' < <= = > >= hash identityHash primIdentityHash ~=)
('converting' as31BitSmallInt asFloat)
('copying' clone deepCopy shallowCopy veryDeepCopyWith:)
('printing' decimalDigitLength destinationBuffer: numberOfDigitsInBase:
printOn:base: printOn:base:nDigits: printString printStringBase:
printStringBase:nDigits: threeDigitName)
('system primitives' asOop digitAt: digitAt:put: digitLength instVarAt:
nextInstance nextObject)
('testing' even isLarge odd)
('private' fromString:radix: highBitOfPositiveReceiver)
!
!Set class reorganize!
('initialization' quickRehashAllSets rehashAllSets)
('sizing' goodPrimeAtLeast: goodPrimes)
('instance creation' new new: newFrom: sizeFor:)
!
ProtoObject removeSelector: #identityHashTEMP!
Object removeSelector: #hashTEMP!
!ProtoObject reorganize!
('apply primitives' tryNamedPrimitive tryNamedPrimitive:
tryNamedPrimitive:with: tryNamedPrimitive:with:with:
tryNamedPrimitive:with:with:with: tryNamedPrimitive:with:with:with:with:
tryNamedPrimitive:with:with:with:with:with:
tryNamedPrimitive:with:with:with:with:with:with:
tryNamedPrimitive:with:with:with:with:with:with:with: tryPrimitive:withArgs:)
('closure-prims' privGetInstVar: privRemoteReturnTo: privSetInHolder:
privSetInstVar:put: privStoreIn:instVar:)
('comparing' == identityHash primIdentityHash ~~)
('debugging' doOnlyOnce: flag: rearmOneShot withArgs:executeMethod:)
('initialize-release' initialize)
('method execution' executeMethod: with:executeMethod: with:with:executeMethod:
with:with:with:executeMethod: with:with:with:with:executeMethod:)
('objects from disk' rehash)
('system primitives' become: cannotInterpret: doesNotUnderstand: nextInstance
nextObject)
('testing' ifNil: ifNil:ifNotNil: ifNotNil: ifNotNil:ifNil: isInMemory isNil
pointsTo:)
!
_______________________________________________
Pharo-project mailing list
[email protected]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project