[PERFORM] Ramifications of turning off Nested Loops for slow queries

2008-03-04 Thread Chris Kratz

Hello Everyone,

I had posted an issue previously that we've been unable to resolve.   
An early mis-estimation in one or more subqueries causes the remainder  
of the query to choose nested loops instead of a more efficient method  
and runs very slowly (CPU Bound).  I don't think there is any way to  
suggest to the planner it not do what it's doing, so we are starting  
to think about turning off nested loops entirely.


Here is the history so far:

http://archives.postgresql.org/pgsql-performance/2008-02/msg00205.php

At the suggestion of the list, we upgraded to 8.2.6 and are still  
experiencing the same problem.  I'm now installing 8.3 on my  
workstation  to see if it chooses a better plan, but it will take some  
time to get it compiled, a db loaded, etc.


We have a number of very long running reports that will run in seconds  
if nested loops are turned off.  The other alternative we are  
exploring is programmatically turning off nested loops just for the  
problematic reports.  But with the speedups we are seeing, others are  
getting gun shy about having them on at all.


So, I've now been asked to ping the list as to whether turning off  
nested loops system wide is a bad idea, and why or why not.


Any other thoughts or suggestions?

Thanks,

-Chris

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your Subscription:
http://mail.postgresql.org/mj/mj_wwwusr?domain=postgresql.orgextra=pgsql-performance


Re: [PERFORM] Ramifications of turning off Nested Loops for slow queries

2008-03-04 Thread Kevin Grittner
 On Tue, Mar 4, 2008 at  8:42 AM, in message
[EMAIL PROTECTED], Chris Kratz
[EMAIL PROTECTED] wrote: 
 
 So, I've now been asked to ping the list as to whether turning off  
 nested loops system wide is a bad idea, and why or why not.
 
In our environment, the fastest plan for a lot of queries involve
nested loops.  Of course, it's possible that these never provide the
fasted plan in your environment, but it seems very unlikely --
you're just not noticing the queries where it's doing fine.
 
 Any other thoughts or suggestions?
 
Make sure your effective_cache_size is properly configured.
 
Increase random_page_cost and/or decrease seq_page_cost.
You can play with the cost settings on a connection, using EXPLAIN
on the query, to see what plan you get with each configuration
before putting it into the postgresql.conf file.
 
-Kevin
 



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your Subscription:
http://mail.postgresql.org/mj/mj_wwwusr?domain=postgresql.orgextra=pgsql-performance


Re: [PERFORM] Ramifications of turning off Nested Loops for slow queries

2008-03-04 Thread Tom Lane
Kevin Grittner [EMAIL PROTECTED] writes:
 On Tue, Mar 4, 2008 at  8:42 AM, in message
 [EMAIL PROTECTED], Chris Kratz
 [EMAIL PROTECTED] wrote: 
 So, I've now been asked to ping the list as to whether turning off  
 nested loops system wide is a bad idea, and why or why not.
 
 In our environment, the fastest plan for a lot of queries involve
 nested loops.  Of course, it's possible that these never provide the
 fasted plan in your environment, but it seems very unlikely --
 you're just not noticing the queries where it's doing fine.

Yeah, I seem to recall similar queries from other people who were
considering the opposite, ie disabling the other join types :-(

The rule of thumb is that nestloop with an inner indexscan will beat
anything else for pulling a few rows out of a large table.  But on
the other hand it loses big for selecting lots of rows.  I don't think
that a global disable in either direction would be a smart move, unless
you run only a very small number of query types and have checked them
all.

regards, tom lane

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your Subscription:
http://mail.postgresql.org/mj/mj_wwwusr?domain=postgresql.orgextra=pgsql-performance


Re: [PERFORM] Ramifications of turning off Nested Loops for slow queries

2008-03-04 Thread Chris Kratz
On 3/4/08, Kevin Grittner [EMAIL PROTECTED] wrote:

  On Tue, Mar 4, 2008 at  8:42 AM, in message
  Any other thoughts or suggestions?


 Make sure your effective_cache_size is properly configured.

 Increase random_page_cost and/or decrease seq_page_cost.
 You can play with the cost settings on a connection, using EXPLAIN
 on the query, to see what plan you get with each configuration
 before putting it into the postgresql.conf file.


 -Kevin


That was a good idea.  I hadn't tried playing with those settings in a
session.  This is a 8G box, and we've dedicated half of that (4G) to the
file system cache.  So, 4G is what effective_cache_size is set to.  Our
seq_page_cost is set to 1 and our random_page_cost is set to 1.75 in the
postgresql.conf.

In testing this one particular slow query in a session, I changed these
settings alternating in increments of 0.25.  The random_page_cost up to 4
and the seq_page_cost down to 0.25.  This made perhaps a second difference,
but at the end, we were back to to the 37s.  Doing a set enable_nestloop=off
in the session reduced the runtime to 1.2s with the other settings back to
our normal day to day settings.

So, for now I think we are going to have to modify the code to prepend the
problematic queries with this setting and hope the estimator is able to
better estimate this particular query in 8.3.

Thanks for the suggestions,

-Chris


Re: [PERFORM] Ramifications of turning off Nested Loops for slow queries

2008-03-04 Thread Chris Kratz
On 3/4/08, Tom Lane [EMAIL PROTECTED] wrote:

 Kevin Grittner [EMAIL PROTECTED] writes:
  On Tue, Mar 4, 2008 at  8:42 AM, in message
  [EMAIL PROTECTED], Chris Kratz
  [EMAIL PROTECTED] wrote:
  So, I've now been asked to ping the list as to whether turning off
  nested loops system wide is a bad idea, and why or why not.

  In our environment, the fastest plan for a lot of queries involve
  nested loops.  Of course, it's possible that these never provide the
  fasted plan in your environment, but it seems very unlikely --
  you're just not noticing the queries where it's doing fine.


 Yeah, I seem to recall similar queries from other people who were
 considering the opposite, ie disabling the other join types :-(

 The rule of thumb is that nestloop with an inner indexscan will beat
 anything else for pulling a few rows out of a large table.  But on
 the other hand it loses big for selecting lots of rows.  I don't think
 that a global disable in either direction would be a smart move, unless
 you run only a very small number of query types and have checked them
 all.

 regards, tom lane


So, if we can't find another way to solve the problem, probably our best bet
is to turn off nested loops on particularly bad queries by prepending them
w/ set enable_nested_loop=off?  But, leave them on for the remainder of the
system?

Do you think it's worth testing on 8.3 to see if the estimator is able to
make a better estimate?

-Chris