On Thu, Sep 11, 2008 at 08:51:53PM -0600, Alex Hunsaker wrote:
> On Thu, Sep 11, 2008 at 9:24 AM, Kenneth Marshall <[EMAIL PROTECTED]> wrote:
> > Alex,
> >
> > I meant to check the performance with increasing numbers of collisions,
> > not increasing size of the hashed item. In other words, something like
> > this:
> >
> > for ($coll=500; $coll<=1000000; $coll=$coll*2) {
> >   for ($i=0; $i<=1000000; $i++) {
> >     hash(int8 $i);
> >   }
> >   # add the appropriate number of collisions, distributed evenly to
> >   # minimize the packing overrun problem
> >   for ($dup=0; $dup<=$coll; $dup++) {
> >     hash(int8 MAX_INT + $dup * 1000000/$coll);
> >   }
> > }
> >
> > Ken
>
> *doh* right something like this...
>
> create or replace function create_test_hash() returns bool as $$
> declare
>     coll integer default 500;
>     -- tweak this to where create index gets really slow
>     max_coll integer default 1000000;
> begin
>     loop
>         execute 'create table test_hash_'|| coll ||'(num int8);';
>         execute 'insert into test_hash_'|| coll ||' (num) select n
>             from generate_series(0, '|| max_coll ||') as n;';
>         execute 'insert into test_hash_'|| coll ||' (num) select
>             (n+4294967296) * '|| max_coll ||'/'|| coll ||'::int from
>             generate_series(0, '|| coll ||') as n;';
>
>         coll := coll * 2;
>
>         exit when coll >= max_coll;
>     end loop;
>     return true;
> end;
> $$ language 'plpgsql';
>
> And then benchmark each table, and for extra credit cluster the table
> on the index and benchmark that.
>
> Also obviously with the hashint8 which just ignores the top 32 bits.
>
> Right?
>

Yes, that is exactly right.
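
For what it is worth, the key construction in the second insert can be sanity-checked outside the database. Below is a minimal sketch in Python (not PL/pgSQL), assuming a toy hash that keeps only the low 32 bits of the int8, i.e. the "ignores the top 32 bits" variant mentioned above. The names low32_hash, extra_keys and count_collisions are made up for illustration and are not PostgreSQL functions; this is not the real hashint8 code.

```python
# Toy model only: NOT PostgreSQL's hashint8 implementation.
MASK32 = 0xFFFFFFFF


def low32_hash(value):
    """Hypothetical hash that simply ignores the top 32 bits of an int8."""
    return value & MASK32


def extra_keys(coll, max_coll=1000000):
    """Keys generated by the second insert in create_test_hash():
    (n + 4294967296) * max_coll / coll for n in 0..coll inclusive,
    using integer division as SQL does for bigint operands."""
    return [((n + 4294967296) * max_coll) // coll for n in range(coll + 1)]


def count_collisions(coll, max_coll=1000000):
    """Count extra keys whose toy hash equals that of a base key 0..max_coll."""
    # Base keys are all below 2^32, so each base key hashes to itself;
    # an extra key collides exactly when its low 32 bits fall in 0..max_coll.
    return sum(1 for k in extra_keys(coll, max_coll)
               if low32_hash(k) <= max_coll)


if __name__ == "__main__":
    for coll in (500, 1000, 2000, 16000):
        print(coll, count_collisions(coll))
```

Under this toy hash, every one of the extra keys for coll = 500 collides with an existing key, because max_coll/coll is a whole number and the 2^32 term vanishes from the low 32 bits. For coll = 16000 the ratio is 62.5, the low 32 bits pick up an offset of 2^31, and the model reports no collisions with the base range, so it may be worth checking which coll values actually exercise the collision path before benchmarking.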
Ken

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches