On Thu, Sep 11, 2008 at 08:51:53PM -0600, Alex Hunsaker wrote:
> On Thu, Sep 11, 2008 at 9:24 AM, Kenneth Marshall <[EMAIL PROTECTED]> wrote:
> > Alex,
> >
> > I meant to check the performance with increasing numbers of collisions,
> > not increasing size of the hashed item. In other words, something like
> > this:
> >
> > for ($coll=500; $coll<=1000000; $coll=$coll*2) {
> >  for ($i=0; $i<=1000000; $i++) {
> >    hash(int8 $i);
> >  }
> >  # add the appropriate number of collisions, distributed evenly to
> >  # minimize the packing overrun problem
> >  for ($dup=0; $dup<=$coll; $dup++) {
> >    hash(int8 MAX_INT + $dup * 1000000/$coll);
> >  }
> > }
> >
> > Ken
> 
> *doh* right something like this...
> 
> create or replace function create_test_hash() returns bool as $$
> declare
>     coll integer default 500;
>     -- tweak this to where create index gets really slow
>     max_coll integer default 1000000;
> begin
>     loop
>         execute 'create table test_hash_'|| coll ||'(num int8);';
>         execute 'insert into test_hash_'|| coll ||' (num) select n
> from generate_series(0, '|| max_coll ||') as n;';
>         execute 'insert into test_hash_'|| coll ||' (num) select
> (n+4294967296) * '|| max_coll ||'/'|| coll ||'::int from
> generate_series(0, '|| coll ||') as n;';
> 
>         coll := coll * 2;
> 
>         exit when coll >= max_coll;
>     end loop;
>     return true;
> end;
> $$ language 'plpgsql';
> 
> And then benchmark each table, and for extra credit cluster the table
> on the index and benchmark that.
> 
> Also obviously with the hashint8 which just ignores the top 32 bits.
> 
> Right?
> 
Yes, that is exactly right.
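
For anyone following along, here is a minimal Python sketch of why the
extra rows collide. It assumes a hash that simply drops the top 32 bits
of an int8 key; low32_hash, build_keys and count_collisions are
hypothetical names made up for illustration, not PostgreSQL functions,
and the key layout follows the additive form of the pseudocode
(offset + dup * max/coll) rather than the exact SQL expression.

```python
# Sketch only: low32_hash is a hypothetical stand-in for a hash that
# ignores the top 32 bits of an int8 key; it is not PostgreSQL's hashint8.
OFFSET = 1 << 32  # 4294967296, the constant that appears in the SQL above


def low32_hash(key: int) -> int:
    """Keep only the low 32 bits of a 64-bit key."""
    return key & 0xFFFFFFFF


def build_keys(max_coll: int, coll: int) -> list[int]:
    """Base keys 0..max_coll plus coll+1 keys meant to collide with them."""
    base = list(range(max_coll + 1))
    step = max_coll // coll  # spread the duplicates evenly over the range
    dups = [OFFSET + dup * step for dup in range(coll + 1)]
    return base + dups


def count_collisions(keys: list[int]) -> int:
    """Count keys whose hash value was already produced by an earlier key."""
    seen = set()
    collisions = 0
    for key in keys:
        h = low32_hash(key)
        if h in seen:
            collisions += 1
        seen.add(h)
    return collisions


if __name__ == "__main__":
    # Small sizes so this runs instantly; the real test uses 1000000 rows.
    for coll in (500, 1000, 2000):
        print(coll, count_collisions(build_keys(10000, coll)))
```

Every key is distinct as an int8, so any collision counted here comes
purely from discarding the high bits, which is the case the benchmark
is meant to stress.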

Ken

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches
