Re: [PERFORM] speed up query with max() and odd estimates

Matthew Nuzum Tue, 26 Apr 2005 15:34:59 -0700

On 4/26/05, Steinar H. Gunderson <[EMAIL PROTECTED]> wrote:
> On Tue, Apr 26, 2005 at 03:16:57PM -0500, Matthew Nuzum wrote:
> >  Seq Scan on usage_access  (cost=0.00..1183396.40 rows=12713851
> > width=116) (actual time=481796.22..481839.43 rows=3343 loops=1)
> 
> That's a gross misestimation -- four orders of magnitude off!
> 
> Have you considering doing this in two steps, first getting out whatever
> comes from the subquery and then doing the query?


Well, I don't know if the estimates are correct now or not, but I
found that your suggestion of doing it in two steps helped a lot.

For the archives, here's what made a drastic improvement:

This batch program had an overhead of 25 min to build hash tables
using the sql queries. It is now down to about 47 seconds.

The biggest improvements (bringing it down to 9 min) were to get rid
of all instances of `select max(field) from ...` and replacing them
with `select field from ... order by field desc limit 1`

Then, to get it down to the final 47 seconds I changed this query:
SELECT client,max(atime) as atime from usage_access where atime >=
(select atime - '1 hour'::interval from usage_access order by atime
desc limit 1) group by client;

To these three queries:
SELECT atime - '1 hour'::interval from usage_access order by atime desc limit 1;
SELECT client, atime into temporary table recent_sessions from
usage_access where atime >= '%s';
SELECT client, max(atime) as atime from recent_sessions group by client;

Thanks for the help.
-- 
Matthew Nuzum
www.bearfruit.org

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq

Re: [PERFORM] speed up query with max() and odd estimates

Reply via email to