> -----Original Message-----
> > Minor question on this patch. AFAICS there is another patch that seems
> > to be aiming at exactly the same use case. Jonah's Bloom filter patch.
> >
> > Shouldn't we have a dust off to see which one is best? Or at least a
> > discussion to test whether they overlap? Perhaps you already did that
> > and I missed it because I'm not very tuned in on this thread.
> >
> > --
> >  Simon Riggs           www.2ndQuadrant.com
> >  PostgreSQL Training, Services and Support
> 
> We haven't had that discussion AFAIK, and definitely should. First
> glance suggests they could coexist peacefully, with proper coaxing. If
> I understand things properly, Jonah's patch filters tuples early in
> the join process, and this patch tries to ensure that hash join
> batches are kept in RAM when they're most likely to be used. So
> they're orthogonal in purpose, and the patches actually apply *almost*
> cleanly together. Jonah, any comments? If I continue to have some time
> to devote, and get through all I think I can do to review this patch,
> I'll gladly look at Jonah's too, FWIW.
> 
> - Josh

The skew patch and bloom filter patch are orthogonal and can both be
applied.  The bloom filter patch is a great idea, and it is used in many
other database systems.  You can use the TPC-H data set to demonstrate
that the bloom filter patch will significantly improve performance of
multi-batch joins (with or without data skew).

Any query that filters the build table before joining it with the probe
table will show improvements with a bloom filter.  For example:

select * from customer, orders where customer.c_nationkey = 10 and
customer.c_custkey = orders.o_custkey

The bloom filter on customer would allow us to avoid probing with orders
tuples that cannot possibly find a match due to the selection criteria.
This is especially beneficial for multi-batch joins where an orders
tuple must be written to disk if its corresponding customer batch is not
the in-memory batch.
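
To make the idea concrete, here is a minimal Python sketch of the
filtering step.  It is not code from Jonah's patch: the BloomFilter
class, the filtered_probe function, the filter size, and the hashing
scheme are all assumptions made up for this illustration.  Customers
are (c_custkey, c_nationkey) pairs and orders are (o_orderkey,
o_custkey) pairs.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter; the size and hash scheme are illustrative only."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive the bit positions from one SHA-256 digest (double hashing).
        digest = hashlib.sha256(repr(key).encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(key))


def filtered_probe(customers, orders, nationkey):
    """Hypothetical stand-in for the build and probe phases of a hash join."""
    bloom = BloomFilter()
    build = {}
    # Build side: only customers that pass the selection enter the filter.
    for c_custkey, c_nationkey in customers:
        if c_nationkey == nationkey:
            bloom.add(c_custkey)
            build[c_custkey] = c_nationkey
    # Probe side: orders rejected here never need to be probed or spilled.
    survivors = [o for o in orders if bloom.might_contain(o[1])]
    # False positives are possible, so the real join condition still applies.
    joined = [o for o in survivors if o[1] in build]
    return survivors, joined


customers = [(1, 10), (2, 5), (3, 10)]
orders = [(100, 1), (101, 2), (102, 3), (103, 4)]
survivors, joined = filtered_probe(customers, orders, nationkey=10)
```

A Bloom filter never produces false negatives, so every matching orders
tuple survives the filter; a small number of non-matching tuples may
also survive, which is why the hash table lookup is still needed.  In a
multi-batch join, the tuples rejected by the filter are the ones that
would otherwise have been written to disk for a later batch.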

I have no experience reviewing patches, but I would be happy to help
contribute/review the bloom filter patch as best I can.

--
Dr. Ramon Lawrence
Assistant Professor, Department of Computer Science, University of
British Columbia Okanagan
E-mail: [EMAIL PROTECTED]

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
