[PERFORM] Yet another slow nested loop

2009-06-16 Thread Alexander Staubo
Here's the query:

select photos.*
from photos
inner join event_participations on
  event_participations.user_id = photos.creator_id and
  event_participations.attend = true
inner join event_instances on
  event_instances.id = event_participations.event_instance_id
where (
  (event_instances.venue_id = 1290) and
  (photos.taken_at  (event_instances.time + interval '-3600 seconds')) and
  (photos.taken_at  (event_instances.time + interval '25200 seconds'))
)
order by taken_at desc
limit 20

It occasionally takes four minutes to run:


 QUERY PLAN
--
 Limit  (cost=0.00..10997.65 rows=20 width=116) (actual
time=262614.474..262614.474 rows=0 loops=1)
   -  Nested Loop  (cost=0.00..5729774.95 rows=10420 width=116)
(actual time=262614.470..262614.470 rows=0 loops=1)
 Join Filter: ((photos.taken_at  (event_instances.time +
'-01:00:00'::interval)) AND (photos.taken_at  (event_instances.time
+ '07:00:00'::interval)))
 -  Nested Loop  (cost=0.00..2055574.35 rows=11869630
width=120) (actual time=21.750..121838.012 rows=14013998 loops=1)
   -  Index Scan Backward using photos_taken_at on photos
 (cost=0.00..40924.34 rows=544171 width=116) (actual
time=14.997..1357.724 rows=544171 loops=1)
   -  Index Scan using event_participations_user_id_index
on event_participations  (cost=0.00..2.95 rows=60 width=8) (actual
time=0.007..0.159 rows=26 loops=544171)
 Index Cond: (event_participations.user_id =
photos.creator_id)
 Filter: event_participations.attend
 -  Index Scan using event_instances_pkey on event_instances
(cost=0.00..0.29 rows=1 width=12) (actual time=0.008..0.008 rows=0
loops=14013998)
   Index Cond: (event_instances.id =
event_participations.event_instance_id)
   Filter: (event_instances.venue_id = 1290)
 Total runtime: 262614.585 ms

With enable_nestloop to false, it takes about 1 second to run.

Database is freshly analyzed and vacuumed. Default statistics target
is 100. I have tried increasing the stats on
event_participations.user_id, event_participations.event_instance_id
and photos.taken_at to 1000, but no improvement.

This is PostgreSQL 8.3.3.

A.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Yet another slow nested loop

2009-06-16 Thread Dave Dutcher
 -Original Message-
 From: Alexander Staubo
 
-  Nested Loop  (cost=0.00..5729774.95 rows=10420 width=116)
 (actual time=262614.470..262614.470 rows=0 loops=1)
  Join Filter: ((photos.taken_at  (event_instances.time +
 '-01:00:00'::interval)) AND (photos.taken_at  (event_instances.time
 + '07:00:00'::interval)))
  -  Nested Loop  (cost=0.00..2055574.35 rows=11869630
 width=120) (actual time=21.750..121838.012 rows=14013998 loops=1)


Do you have any of the other enable_* options set to false?  What do you
have random_page_cost set to?  I ask because I'm surprised to see postgres
choose to loop when it knows it will have to loop 11 million times.

Dave



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Yet another slow nested loop

2009-06-16 Thread Alexander Staubo
On Tue, Jun 16, 2009 at 3:56 PM, Dave Dutcherd...@tridecap.com wrote:
 -Original Message-
 From: Alexander Staubo

    -  Nested Loop  (cost=0.00..5729774.95 rows=10420 width=116)
 (actual time=262614.470..262614.470 rows=0 loops=1)
          Join Filter: ((photos.taken_at  (event_instances.time +
 '-01:00:00'::interval)) AND (photos.taken_at  (event_instances.time
 + '07:00:00'::interval)))
          -  Nested Loop  (cost=0.00..2055574.35 rows=11869630
 width=120) (actual time=21.750..121838.012 rows=14013998 loops=1)


 Do you have any of the other enable_* options set to false?

No.

 What do you
 have random_page_cost set to?  I ask because I'm surprised to see postgres
 choose to loop when it knows it will have to loop 11 million times.

The default, ie. 4.0.

A.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Yet another slow nested loop

2009-06-16 Thread Tom Lane
Alexander Staubo a...@bengler.no writes:
 Here's the query:
 select photos.*
 from photos
 inner join event_participations on
   event_participations.user_id = photos.creator_id and
   event_participations.attend = true
 inner join event_instances on
   event_instances.id = event_participations.event_instance_id
 where (
   (event_instances.venue_id = 1290) and
   (photos.taken_at  (event_instances.time + interval '-3600 seconds')) and
   (photos.taken_at  (event_instances.time + interval '25200 seconds'))
 )
 order by taken_at desc
 limit 20

 It occasionally takes four minutes to run:

Actually the easiest way to fix that is to get rid of the LIMIT.
(Maybe use a cursor instead, and fetch only twenty rows.)  LIMIT
magnifies the risks from any estimation error, and you've got a lot
of that here ...

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Yet another slow nested loop

2009-06-16 Thread Alexander Staubo
On Tue, Jun 16, 2009 at 4:36 PM, Tom Lanet...@sss.pgh.pa.us wrote:
 Actually the easiest way to fix that is to get rid of the LIMIT.
 (Maybe use a cursor instead, and fetch only twenty rows.)  LIMIT
 magnifies the risks from any estimation error, and you've got a lot
 of that here ...

There's no cursor support in ActiveRecord, the ORM library we use, and
I'm not going to write it. Anyway, I would prefer not to gloss over
the underlying problem with something that requires a TODO next to
it. What can be done to fix the underlying problem? Nothing?

A.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Yet another slow nested loop

2009-06-16 Thread Robert Haas
On Tue, Jun 16, 2009 at 11:16 AM, Alexander Stauboa...@bengler.no wrote:
 On Tue, Jun 16, 2009 at 4:36 PM, Tom Lanet...@sss.pgh.pa.us wrote:
 Actually the easiest way to fix that is to get rid of the LIMIT.
 (Maybe use a cursor instead, and fetch only twenty rows.)  LIMIT
 magnifies the risks from any estimation error, and you've got a lot
 of that here ...

 There's no cursor support in ActiveRecord, the ORM library we use, and
 I'm not going to write it. Anyway, I would prefer not to gloss over
 the underlying problem with something that requires a TODO next to
 it. What can be done to fix the underlying problem? Nothing?

Basically, we need a system that can accurately estimate multi-column
selectivity, or else some kind of planner hints.

http://archives.postgresql.org/pgsql-performance/2009-06/msg00023.php
http://archives.postgresql.org/pgsql-performance/2009-06/msg00119.php

(with apologies for linking to my own posts, but you can go back and
read the whole thread if you're interested)

...Robert

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance