Hi, this is Gunther, have been with PgSQL for decades, on an off this list. Haven't been on for a long time making my way just fine. But there is one thing that keeps bothering me both with Oracle and PgSQL. And that is the preference for Nested Loops.

Over the years the archives have questions about Nested Loops being chosen over Hash Joins. But the responses seem too specific to the people's queries, ask many questions, make them post the query plans, and often end up frustrating with suggestions to change the data model or to add an index and stuff like that.

One should not have to go into that personal detail.

There are some clear boundaries that a smart database should just never cross.

Especially with OLAP queries. Think a database that is fine for OLTP, has indexes and the index based accesses for a few records joined with a dozen other tables all with indexes is no problem. If you fall into a Seq Scan scenario or unwanted Hash Join, you usually forgot to add an index or forgot to put index columns into your join or other constraints. Such are novice questions and we should be beyond that.

But the issue is bulk searches, reports, and any analytic queries scenarios. In those queries Nested Loops are almost always a bad choice, even if there is an index. In over 20 years of working with RDBMs this has been my unfailing heuristics. A report runs slow? Look at plan, is there a Nested Loop? Yes? Squash it! And the report runs 10x faster instantaneously.

So, all the more troublesome is if any database system (here PgSQL) would ever fall into a Nested Loop trap with CPU spinning at 100% for several minutes, with a Nested Loop body of anything from a Seq Scan or worse with a cardinality of anything over 10 or 100. Nested Loops of Nested Loops or Nested Loops of other complex query plan fragments should be a no-no and chosen only as an absolute last resort when the system cannot find enough memory, even then disk based merge sort should be better, i.e., Nested Loops should never be chosen. Period.

If you can set enable_nestloop off and the Hash Join is chosen and the performance goes from 1 hour of 100% CPU to 10 seconds completion time, then something is deadly wrong. And it doesn't matter to me if I should have re-written my query in some funny ways or tweaked my data model, these are all unacceptable options when you have a complex system with hybrid OLTP/OLAP uses. Don't tell me to de-normalize. I know I can materialize joins in tables which I can then use again in joins to save time. But that is not the point here.

And I don't think tweaking optimizer statistics is the solution either. Because optimizer statistics quickly become worthless when your criteria get more complex.

The point is that Nested Loops should never be chosen except in index lookup situations or may be memory constraints.

How can I prevent it on a query by query scope? I cannot set enable_nestloop = off because one query will be for a full report, wile another one might have indexed constraints running in the same session, and I don't want to manage side effects and remember to set enable_nestloop parameter on and off.

There must be a way to tell the optimizer to penalize nested loops to make them the last resort. In Oracle there are those infamous hints, but they don't always work either (or it is easy to make mistakes that you get no feedback about).

Is there any chance PgSQL can get something like a hint feature? Or is there a way to use postgresql.conf to penalize nested loops so that they would only ever be chosen in the most straight-forward situations as with query parameters that are indexed? I know I need to have sufficient work_mem, but if you can set enable_nestloop = off and you get the desired Hash Join, there is obviously sufficient work_mem, so that isn't the answer either.

Thanks for listening to my rant.


Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:

Reply via email to