Am 21.12.2015 um 15:42 schrieb Tom Lane: > Viktor Leis <l...@in.tum.de> writes: >> I think it would be a good start to distinguish between nested loop >> joins with and without a index. > > We do. > >> In my opinion, the latter should simply NEVER be chosen. > > So, if you're given a query with a non-equality join condition > that doesn't match any index on either table, you think the planner > should just fail? This is not realistic. Certainly nestloop with > inner seqscan is likely to be slow, but that's already reflected > in the cost estimates. You are right that for non-equality joins there is no alternative, so nested loop is obviously the right choice. However, that does not make the selection of nested loop join in cases where a hash join would be possible a good choice.
Please have a look at Figure 6 (page 6) in http://www.vldb.org/pvldb/vol9/p204-leis.pdf Disabling nested loop joins without index scan (going from (a) to (b)) results in great improvements across the board. And even more importantly, it avoids most of the cases where queries took unreasonably long and timed out. Basically this amounts to the being able to run the query on PostgreSQL or not. The cost model does not save you because the estimated cardinality is close to 1. Also note that a hash join is fast too if the estimate is correct. Picking nested loop join in these situations is very risky but there is practically no upside for this decision. -- Viktor Leis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers