On Wed, 17 Mar 2010, Chris Muller wrote:

Hmm, well, that only moves the goal posts to ~320K.

I just ran a bigger test in 3.9.  Not only are the numbers comparable
at the beginning of the test with a small IdentitySet, they only
degrade to 10X slower at 1M elements.

The trunk IdentitySet seems to have much more rapid degradation...


Since there are only 4096 different keys, you're practically growing 4096 lists in an array. The cause of the problem is that the gap between the slots which are the head of the lists are less than the length of the list, so clustering will occur. Most primes above 100000 are affected by this issue in the #goodPrimes array. We can solve this issue by using:
(1) different primes in general
(2) different primes for identity hash based collections
(3) a different method for #scaledIdentityHash calculation
(4) a combination of the above

I found 5918 primes between 5 and 1000000 which are expected to behave optimally (and 59115 which are close to optimal) with #scaledIdentityHash, so probably (1) or (2) is the ideal solution.


Levente

On Wed, Mar 17, 2010 at 9:40 PM, Henrik Sperre Johansen
<[email protected]> wrote:
 On 18.03.2010 02:08, Chris Muller wrote:

Thank you very much for the links.  Ok, so if you take Lukas' test
code from the first link, and try it in trunk.  Here it is, but
tweaked for an IdentitySet and increased to simulate a modestly-sized
real-world use-case, 250K elements:

| test ord |
test := IdentitySet new.
[ test size>= 250000 ] whileFalse: [
    ord := OrderedCollection new: 100.
    1000 timesRepeat: [
        test add: ( ord add: Object new ) ].
    Transcript
        show: test size asString;
        tab.
    Transcript show:
        [
            10 timesRepeat: [
                ord do: [ :each | test includes: each ] ]
        ] timeToRun asString.
    Transcript cr ]

3.9 can run this entire thing and be done with it in under one minute.
 In trunk, all looks pretty fine until 136K elements, at which point
it completely seizes up; worse by a factor of>  120X.

  - Chris

Simplistic fix:
- remove 282311 from HashedCollection class>>goodPrimes.

The table should probably be remade for squeak, keeping in mind the
scaledIdentityHash implementation.

Cheers,
Henry

_______________________________________________
Pharo-project mailing list
[email protected]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project


_______________________________________________
Pharo-project mailing list
[email protected]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
_______________________________________________
Pharo-project mailing list
[email protected]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project

Reply via email to