Re: [HACKERS] a few crazy ideas about hash joins

Greg Stark Fri, 03 Apr 2009 10:03:20 -0700

On Fri, Apr 3, 2009 at 5:41 PM, Simon Riggs <[email protected]> wrote:
>
> I would be especially interested in using a shared memory hash table
> that *all* backends can use - if the table is mostly read-only, as
> dimension tables often are in data warehouse applications. That would
> give zero startup cost and significantly reduced memory.


I think that's a non-starter due to visibility issues and handling
inserts and updates. Even just reusing a hash from one execution in a
later execution of the same plan would be tricky since we would have
to expire it if the snapshot changes.
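
To make the expiry point concrete, here is a minimal sketch of a cache that reuses a built hash only while the snapshot is unchanged. The names (HashCache, plan_id, snapshot_id, build) are hypothetical and chosen for illustration; this is a toy model of the idea, not how the executor works, and it ignores everything else a real visibility check involves.

```python
from typing import Callable, Dict, Tuple


class HashCache:
    """Toy cache: one hash table per plan, valid for one snapshot only."""

    def __init__(self) -> None:
        # plan_id -> (snapshot_id the table was built under, the table)
        self._entries: Dict[str, Tuple[int, dict]] = {}
        self.builds = 0  # counts rebuilds, for demonstration only

    def get(self, plan_id: str, snapshot_id: int,
            build: Callable[[], dict]) -> dict:
        cached = self._entries.get(plan_id)
        if cached is not None and cached[0] == snapshot_id:
            # Same snapshot: the rows visible to us cannot have changed.
            return cached[1]
        # New plan or different snapshot: expire and rebuild.
        table = build()
        self.builds += 1
        self._entries[plan_id] = (snapshot_id, table)
        return table


cache = HashCache()
first = cache.get("plan-1", 100, lambda: {1: "a"})
again = cache.get("plan-1", 100, lambda: {1: "stale"})
fresh = cache.get("plan-1", 101, lambda: {1: "a", 2: "b"})
```

In this model a table is only ever reused by executions that see exactly the same snapshot, which is why sharing it across backends looks so hard.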

Alternatively, you could say that what you describe is addressed by hash
indexes. The fact that they're not great performers compared to
in-memory hashes comes down to dealing with updates and vacuum, which
is pretty much the same issue.

Hm.  I wonder if we need a whole class of index algorithms to deal
specifically with read-only tables. A hash table on a read-only table
could spend a lot of time to generate a perfect or near-perfect hash
function and then pack the hash table very densely without any bucket
chains. That might make it a big winner over a dynamic structure which
has to deal with handling inserts and so on. I'm assuming that if you
mark the table read-write it just marks the index invalid and you have
to rebuild it later once you've marked the table read-only again.
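
As a rough illustration of the dense, chain-free layout, here is a small sketch that brute-forces a seed until every key of a fixed key set lands in its own slot. Everything in it (slot, build_perfect_hash, lookup, the use of BLAKE2 as the seeded hash) is an assumption made for the example; real perfect-hash constructions are far smarter than trying seeds one by one, and this only scales to small key sets.

```python
import hashlib
from typing import List, Optional, Tuple


def slot(key: str, seed: int, size: int) -> int:
    """Seeded hash of key, reduced to a slot number in [0, size)."""
    digest = hashlib.blake2b(
        key.encode(), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little") % size


def build_perfect_hash(rows: dict, max_seeds: int = 1_000_000):
    """Search for a seed that maps every key to a distinct slot.

    The table has exactly len(rows) slots, so it is fully packed and
    needs no bucket chains. Build time is spent up front, which is only
    reasonable because the key set is assumed never to change.
    """
    keys = list(rows)
    size = len(keys)
    for seed in range(max_seeds):
        if len({slot(k, seed, size) for k in keys}) == size:
            table: List[Optional[Tuple[str, object]]] = [None] * size
            for k in keys:
                table[slot(k, seed, size)] = (k, rows[k])
            return seed, table
    raise RuntimeError("no collision-free seed found")


def lookup(table, seed: int, key: str):
    """One probe, one comparison; returns None for absent keys."""
    if not table:
        return None
    entry = table[slot(key, seed, len(table))]
    if entry is not None and entry[0] == key:
        return entry[1]
    return None


dims = {"US": "United States", "DE": "Germany", "JP": "Japan",
        "BR": "Brazil", "IN": "India", "FR": "France"}
seed, table = build_perfect_hash(dims)
```

Any insert would invalidate the seed, which matches the idea of marking the index invalid and rebuilding it once the table is read-only again.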

-- 
greg

-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
