I am currently reading through the Hashing in Smalltalk book (
http://www.lulu.com/shop/andres-valloud/hashing-in-smalltalk-theory-and-practice/paperback/product-3788892.html)
and, my head hurting notwithstanding, there are indeed a ton of gems in
this system. As he mentions, doing the exercises brings a lot of extra :-)

When going to 64-bit, and with the new ObjectMemory scheme, I guess a
couple of identity hashing functions will come under scrutiny.

e.g.

SmallInteger>>hashMultiply
 | low |

low := self bitAnd: 16383.
^(16r260D * low + ((16r260D * (self bitShift: -14) + (16r0065 * low)
bitAnd: 16383) * 16384))
 bitAnd: 16r0FFFFFFF


which will need some more bits.

I had a look at how it was done in VisualWorks;

hashMultiply
"Multiply the receiver by 16r0019660D mod 2^28
without using large integer arithmetic for speed.
 The constant is a generator of the multiplicative
subgroup of Z_2^30, see Knuth's TAOCP vol 2."
 <primitive: 1747>
| low14Bits |
low14Bits := self bitAnd: 16r3FFF.
 ^16384
* (16r260D * (self bitShift: -14) + (16r0065 * low14Bits) bitAnd: 16r3FFF)
+ (16r260D * low14Bits) bitAnd: 16rFFFFFFF

The hashing book version has:

multiplication
"Computes self times 1664525 mod 2^38 while avoiding overflow into a large
integer by making the multiplication into two 14 bits chunks. Do not use
any division or modulo operation."
 | lowBits highBits|

lowBits := self bitAnd: 16r3FFF.
 highBits := self bitShift: -14.
 ^(lowBits * 16r260D)
   + (((lowBits * 16r0065) bitAnd: 16r3FFF) bitShift: 14)
  + (((highBits * 16r260D) bitAnd: 16r3FFF) bitShift: 14)
 bitAnd: 16rFFFFFFF

So, 16384 * is the same as bitShift: 14 and it looks like done once, which
may be better.

As a side note, how is one debugging such methods? Looks like the debugger
(in 2.0) doesn't like SmallIntegers (I can understand why due to the
specific nature of the object but still. Ok,

Also VW marks it as a primitive, which Pharo does not. Would we gain some
speed doing that? hashMultiply is used a lof for identity hashes.

Bytecode has quite some work to do:

37 <70> self
38 <20> pushConstant: 16383
39 <BE> send: bitAnd:
40 <68> popIntoTemp: 0
41 <21> pushConstant: 9741
42 <10> pushTemp: 0
43 <B8> send: *
44 <21> pushConstant: 9741
45 <70> self
46 <22> pushConstant: -14
47 <BC> send: bitShift:
48 <B8> send: *
49 <23> pushConstant: 101
50 <10> pushTemp: 0
51 <B8> send: *
52 <B0> send: +
53 <20> pushConstant: 16383
54 <BE> send: bitAnd:
55 <24> pushConstant: 16384
56 <B8> send: *
57 <B0> send: +
58 <25> pushConstant: 268435455
59 <BE> send: bitAnd:
60 <7C> returnTop


I ran some experiments timing things.

It looks like that replacing 16384 * by bitShift:14 leads to a small gain,
bitShift (primitive 17) being faster than * (primitive 9)

The bytecode is identical, except send: bitShift instead of send: *

multiplication3
| low |

low := self bitAnd: 16383.
 ^(16r260D * low + ((16r260D * (self bitShift: -14) + (16r0065 * low)
bitAnd: 16383) bitShift: 14))
bitAnd: 16r0FFFFFFF



[500000 timesRepeat: [ 15000 hashMultiply ]] timeToRun 12
[500000 timesRepeat: [ 15000 multiplication ]] timeToRun 41  (worse)
[500000 timesRepeat: [ 15000 multiplication3 ]] timeToRun 10 (better)

It looks like correct for SmallInteger minVal to: SmallInteger maxVal

Now, VW gives: [500000 timesRepeat: [ 15000 hashMultiply ]] timeToRun 1.149
milliseconds

Definitely worth investigating the primitive thing, or some NB Asm as this
is used about everywhere (Collections etc).

Toughts?

Phil

Reply via email to