Gunther wrote: > But there > is one thing that keeps bothering me both with Oracle and PgSQL. And > that is the preference for Nested Loops.
[...] > But the issue is bulk searches, reports, and any analytic queries > scenarios. In those queries Nested Loops are almost always a bad choice, > even if there is an index. In over 20 years of working with RDBMs this > has been my unfailing heuristics. A report runs slow? Look at plan, is > there a Nested Loop? Yes? Squash it! And the report runs 10x faster > instantaneously. [...] > If you can set enable_nestloop off and the Hash Join is chosen and the > performance goes from 1 hour of 100% CPU to 10 seconds completion time, > then something is deadly wrong. [...] > The point is that Nested Loops should never be chosen except in index > lookup situations or may be memory constraints. > > How can I prevent it on a query by query scope? I cannot set > enable_nestloop = off because one query will be for a full report, wile > another one might have indexed constraints running in the same session, > and I don't want to manage side effects and remember to set > enable_nestloop parameter on and off. > > There must be a way to tell the optimizer to penalize nested loops to > make them the last resort. In Oracle there are those infamous hints, but > they don't always work either (or it is easy to make mistakes that you > get no feedback about). > > Is there any chance PgSQL can get something like a hint feature? PostgreSQL doesn't have a way to tell if a query is an OLAP query running against a star schema or a regular OLTP query, it will treat both in the same fashion. I also have had to deal with wrongly chosen nested loop joins, and testing a query with "enable_nestloop=off" is one of the first things to try in my experience. However, it is not true that PostgreSQL "perfers nested loops". Sometimes a nested loop join is the only sane and efficient way to process a query, and removing that capability would be just as bad a disaster as you are experiencing with your OLAP queries. Bad choices are almost always caused by bad estimates. Granted, there is no way that estimates can ever be perfect. So what could be done? One pragmatic solution would be to wrap every query that you know to be an OLAP query with BEGIN; SET LOCAL enable_nestloop=off; SELECT ... COMMIT; Looking deeper, I would say that wrongly chosen nested loop joins often come from an underestimate that is close to zero. PostgreSQL already clamps row count estimates to 1, that is, it will choose an estimate of 1 whenever it thinks fewer rows will be returned. Perhaps using a higher clamp like 2 would get rid of many of your problems, but it is a difficult gamble as it will also prevent some nested loop joins that would have been the best solution. Finally, even though the official line of PostgreSQL is to *not* have query hints, and for a number of good reasons, this is far from being an unanimous decision. The scales may tip at some point, though I personally hope that this point is not too close. Yours, Laurenz Albe -- Sent via pgsql-performance mailing list (firstname.lastname@example.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance