Dilip kumar <dilip.ku...@huawei.com> writes:
> On 09 April 2014 13:31, Nicolas Barbier Wrote
>> Do you have a real-world example use case of such joins, to offset the
>> extra planner time that will likely have to be paid (even for queries
>> for which the functionality ends up not being used)?

> I think this will be more useful when both relations are big and of
> almost equal size.

I'm not following how this would help much.  Consider "a.x < b.y" where
we know that the inputs are sorted by x/y.  I think you are saying that
for a given b row, we could scan forward from the beginning of a, but
stop as soon as we find a row with a.x >= b.y.  Great ... but instead of
comparing M*N row pairs, we are comparing about M*N/2 on average.  So it's
still O(M*N), just with a smaller constant factor.  Is this going to be
worth the
added time to sort the inputs?
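
To make that arithmetic concrete, here is a rough standalone C sketch of
the loop being described (made-up arrays and names, a toy rather than
executor code): with both inputs sorted and values spread uniformly, the
early exit cuts the comparison count roughly in half, but it still grows
as M*N.

#include <stdio.h>

/* Toy sketch: join on a.x < b.y with both inputs already sorted.
 * For each b row we scan a from the start and stop at the first
 * a.x >= b.y.  We only count comparisons, to show the work is
 * still proportional to M*N. */
static long
sorted_lt_join(const int *ax, int m, const int *by, int n)
{
    long comparisons = 0;

    for (int j = 0; j < n; j++)
    {
        for (int i = 0; i < m; i++)
        {
            comparisons++;
            if (ax[i] >= by[j])
                break;          /* early exit: later a rows can't match */
            /* else: (a[i], b[j]) would be emitted as a join result */
        }
    }
    return comparisons;
}

int
main(void)
{
    int a[1000], b[1000];

    for (int i = 0; i < 1000; i++)
        a[i] = b[i] = i;        /* sorted, uniformly spread */

    /* Expect roughly 1000*1000/2 comparisons, i.e. still quadratic. */
    printf("comparisons: %ld\n", sorted_lt_join(a, 1000, b, 1000));
    return 0;
}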

It's slightly more promising for range constraints (that is, "a.x
between b.y and b.z") but even then you have to assume that the ranges
are narrow, at which point a nestloop join using an inner indexscan on
a.x might do about as well.
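
For completeness, a sketch of the band-join variant (again a toy, not
proposed executor code), assuming a is sorted on x, b is sorted on y, and
each b row has y <= z: the start cursor only moves forward, so the cost is
roughly O(M + N + matches) when the ranges are narrow, and degrades back
toward M*N when they are wide.

/* Toy band-join sketch for "a.x BETWEEN b.y AND b.z", assuming a is
 * sorted on x and b is sorted on y.  Because b.y is nondecreasing,
 * the start cursor never moves backward. */
typedef struct { int y; int z; } brow;

static long
band_join(const int *ax, int m, const brow *b, int n)
{
    long matches = 0;
    int  start = 0;             /* first a row with ax >= current b.y */

    for (int j = 0; j < n; j++)
    {
        while (start < m && ax[start] < b[j].y)
            start++;            /* monotone advance, since b is sorted on y */

        for (int i = start; i < m && ax[i] <= b[j].z; i++)
            matches++;          /* emit (a[i], b[j]) here */
    }
    return matches;
}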

So personally, I suspect there's a reason why this isn't a standard
join algorithm already.  If you want to try it, I'd suggest that you
focus on getting to where you can do some performance testing ASAP,
without doing much code polishing or worrying about teaching the
planner how to cost it.

                        regards, tom lane

