Re: [PERFORM] [SQL] ORDER BY Optimization

2005-05-07 Thread Tom Lane
[ cc list limited to -performance ]

Derek Buttineau|Compu-SOLVE <[EMAIL PROTECTED]> writes:
> It seems to me a merge join might be more appropriate here than a
> nestloop.

After some experimentation, I don't seem to be able to get the planner
to generate a mergejoin based on a backwards index scan.  I suspect
it's not considering the idea of a merge using descending order at all.
Might be a good enhancement, although we'd need to figure out how to
keep this from just uselessly doubling the number of mergejoin paths
considered :-(

In the meantime, the nestloop is the only hope for avoiding a
full-scan-and-sort.

> I'm not entirely sure what's up with the row-count estimates, the tables
> are updated quite frequently (and VACUUM is also run quite frequently),

They're probably not as bad as they look.  The estimates for the lower
nodes are made without regard to the LIMIT, but the actuals of course
reflect the fact that the LIMIT stopped execution of the plan early.

The problem with this query is that the fast plan depends on the
assumption that as we scan in backwards m.msg_date order, it won't take
very long to find 10 rows that join to mr rows with mr.subscription=89.
If that's true then the plan wins, if it's not true then the plan can
lose big.  That requires a pretty good density of rows with
subscription=89, and in particular a good density near the end of the
msg_date order.  The planner effectively assumes that the proportion of
rows with subscription=89 isn't changing over time, but perhaps it is
--- if there's been a lot recently that could explain why the fast
plan is fast.  In any case I suppose that the reason the larger server
doesn't want to try that plan is that its statistics show a much lower
density of rows with subscription=89, and so the plan doesn't look
promising compared to something that wins if there are few rows with
subscription=89 ... which the other plan does.
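
If you want to sanity-check that, a quick count along these lines would show
how dense subscription = 89 actually is among recent messages.  (Untested,
and the table behind the "m" alias and the join column are guesses from your
plan; substitute the real names.)

   SELECT count(*) AS recent_total,
          sum(CASE WHEN mr.subscription = 89 THEN 1 ELSE 0 END) AS recent_sub89
     FROM maillog m                                     -- assumed name for the "m" table
     JOIN maillog_received mr ON mr.msg_id = m.msg_id   -- assumed join column
    WHERE m.msg_date > now() - interval '7 days';       -- any "recent" window will do

If recent_sub89 is a healthy fraction of recent_total, the fast plan should
stay fast; if it's tiny, the nestloop has to crawl back through a lot of
msg_date order before it finds 10 matching rows.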

You could probably get your larger server to try the no-sort plan if
you said "SET enable_sort = 0" first.  It would be interesting to
compare the EXPLAIN ANALYZE results for that case with the other
server.
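
Concretely, that experiment would look about like this, with your original
query in place of the ellipsis:

   SET enable_sort = 0;     -- discourage sort-based plans for this session only
   EXPLAIN ANALYZE ... ;    -- your original query here
   RESET enable_sort;       -- put the setting back afterwards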

The contents of the pg_stats row for mr.subscription in each server
would be informative, too.  One rowcount estimate that does look
wrong is

   ->  Index Scan using maillog_received_subscription_idx on maillog_received mr
       (cost=0.00..17789.73 rows=4479 width=43)
       (actual time=0.030..33554.061 rows=65508 loops=1)
         Index Cond: (subscription = 89)

so the stats row is suggesting there are only 4479 rows with
subscription = 89 when really there are 65508.  (The preceding
discussion hopefully makes it clear why this is a potentially critical
mistake.)  This suggests that you may need to raise your statistics
targets.
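
For reference, pulling out the interesting bits of pg_stats and raising the
target would look about like this (200 is just an example value):

   -- what the planner currently believes about mr.subscription
   SELECT null_frac, n_distinct, most_common_vals, most_common_freqs
     FROM pg_stats
    WHERE tablename = 'maillog_received'
      AND attname = 'subscription';

   -- raise the per-column statistics target, then re-gather stats
   ALTER TABLE maillog_received ALTER COLUMN subscription SET STATISTICS 200;
   ANALYZE maillog_received;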

regards, tom lane



Re: [PERFORM] [SQL] ORDER BY Optimization

2005-05-06 Thread Rosser Schwarz
while you weren't looking, Derek Buttineau|Compu-SOLVE wrote:

> I'm hoping this is the right place to send this.

The PostgreSQL Performance list, pgsql-performance@postgresql.org
would be more appropriate. I'm copying my followup there, as well.

As for your query, almost all the time is actually spent in the
nestloop, not the sort.  Compare:

   ->  Sort  (cost=31402.85..31405.06 rows=886 width=306)
       (actual time=87454.187..87454.240 rows=10 loops=1)

vs.

   ->  Nested Loop  (cost=0.00..31359.47 rows=886 width=306)
       (actual time=4.740..86430.468 rows=26308 loops=1)

That's 50-ish ms versus 80-odd seconds.

It seems to me a merge join might be more appropriate here than a
nestloop. What's your work_mem set at?  Off-the-cuff numbers show the
dataset weighing in the sub-ten mbyte range.
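
You can check what it's set to at the moment with:

   SHOW work_mem;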

Provided it's not already at least that big, and you don't want to up
it permanently, try saying:

SET work_mem = 10240; -- 10 mbytes

immediately before running this query (uncached, of course) and see
what happens.

Also, your row-count estimates look pretty off-base.  When were these
tables last VACUUMed or ANALYZEd?
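
If it's been a while, a manual pass along these lines would freshen both the
dead-space picture and the statistics (the second table name is a guess at
whatever is behind the "m" alias in your plan):

   VACUUM ANALYZE maillog_received;
   VACUUM ANALYZE maillog;   -- assumed name; use the table aliased "m"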

/rls

-- 
:wq



Re: [PERFORM] [SQL] ORDER BY Optimization

2005-05-06 Thread Derek Buttineau|Compu-SOLVE
Thanks for the response :)
> That's 50-ish ms versus 80-odd seconds.
>
> It seems to me a merge join might be more appropriate here than a
> nestloop. What's your work_mem set at?  Off-the-cuff numbers show the
> dataset weighing in the sub-ten mbyte range.
>
> Provided it's not already at least that big, and you don't want to up
> it permanently, try saying:
>
> SET work_mem = 10240; -- 10 mbytes

It's currently set at 16MB; I've also tried upping sort_mem as well,
without any noticeable impact on the uncached query. :(

> immediately before running this query (uncached, of course) and see
> what happens.
>
> Also, your row-count estimates look pretty off-base.  When were these
> tables last VACUUMed or ANALYZEd?

I'm not entirely sure what's up with the row-count estimates; the tables
are updated quite frequently (and VACUUM is also run quite frequently).
However, I had just run a VACUUM ANALYZE on both databases before running
the EXPLAIN.

I'm also still baffled by the differences in the plans between the two
servers. On the one that uses the index to sort, I get for comparison a
nestloop of:

Nested Loop  (cost=0.00..1175943.99 rows=1814 width=311) (actual 
time=25.337..26.867 rows=10 loops=1)

The plan that the live server is using seems fairly inefficient.
Derek