Re: [PERFORM] Index Scan taking long time

2009-06-18 Thread Bryce Ewing

Hi Tom,

We have managed to improve significantly on the speed of this query.  
The way that we did this was through clustering the table based on the 
domain index which significantly reduced the page reads that were 
required in order to perform the query.


Also to find this we turned on log_statement_stats to see what it was doing.

This was on a table of roughly 600MB where the domains were randomly 
dispersed.


Cheers
Bryce

Tom Lane wrote:

Bryce Ewing br...@smx.co.nz writes:
  
So it seems to me that once the index is in memory everything is fine 
with the world, but the loading of the index into memory is horrendous.



So it would seem.  What's the disk hardware on this machine?

It's possible that part of the problem is table bloat, leading to the
indexscan having to fetch many more pages than it would if the table
were more compact.

regards, tom lane

  



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Index Scan taking long time

2009-06-17 Thread Bryce Ewing
The nested loops (which are due to the joins) don't seem to be part of 
the problem at all.  The main time that is taken (actual time that is) 
is in this part:

   Index Scan using event_20090526_domain_idx on event_20090526 e
   (cost=0.00..10694.13 rows=3606 width=1276) (actual 
time=50.233..14305.211 rows=3453 loops=1)

   Index Cond: (e.domain_id = d.id)

Which is the leaf node in the query plan, the total time for the query 
being: Total runtime: 14380.000 ms


And as I said once that query is run once it then does the same query 
plan and has this output for the same leaf node above:

   Index Scan using event_20090526_domain_idx on event_20090526 e
   (cost=0.00..10694.13 rows=3606 width=1276) (actual time=0.027..7.510 
rows=3453 loops=1)

   Index Cond: (e.domain_id = d.id)

So it seems to me that once the index is in memory everything is fine 
with the world, but the loading of the index into memory is horrendous.



Tom Lane wrote:

Scott Marlowe scott.marl...@gmail.com writes:
  

Without looking at the explain just yet, it seems to me that you are
constraining the order of joins to insist that the left joins be done
first, then the regular joins second, because of your mix of explicit
and implicit join syntax.  The query planner is constrained to run
explicit joins first, then implicit if I remember correctly.



That isn't true as of recent releases (8.2 and up, I think).  It is true
that there are semantic constraints that prevent certain combinations
of inner and outer joins from being rearranged ... but if that applies
here, it would also prevent manual rearrangement, unless the OP decides
that this query doesn't express quite what he meant.

regards, tom lane
  


--

*Bryce Ewing *| Platform Architect
*DDI:* +64 9 950 2195 *Fax:* +64 9 302 0518
*Mobile:* +64 21 432 293 *Freephone:* 0800 SMX SMX (769 769)
Level 11, 290 Queen Street, Auckland, New Zealand | SMX Ltd | smx.co.nz 
http://smx.co.nz

SMX | Business Email Specialists
The information contained in this email and any attachments is 
confidential. If you are not
the intended recipient then you must not use, disseminate, distribute or 
copy any information
contained in this email or any attachments. If you have received this 
email in error or you
are not the originally intended recipient please contact SMX immediately 
and destroy this email.



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Index Scan taking long time

2009-06-16 Thread Bryce Ewing

Hi,

I have been trying to fix a performance issue that we have which I have 
tracked down to index scans being done on a particular table (or set of 
tables):


The following query:
explain analyze select *
FROM inbound.event_20090526 e
LEFT OUTER JOIN inbound.internal_host i ON (e.mta_host_id = i.id)
LEFT OUTER JOIN inbound.internal_host iaa ON (e.aamta_host_id = iaa.id)
LEFT OUTER JOIN inbound.event_status es ON (e.event_status_id = es.id)
LEFT OUTER JOIN inbound.threat t ON (e.threat_id = t.id), inbound.domain 
d, inbound.event_type et

WHERE e.domain_id = d.id
AND e.event_type_id = et.id
AND d.name IN (
   'testdomain.com'
);


Does this:
  
QUERY 
PLAN


--
Nested Loop Left Join  (cost=0.00..10887.03 rows=8 width=2021) (actual 
time=50.352..14378.603 rows=3453 loops=1)
  -  Nested Loop Left Join  (cost=0.00..10882.23 rows=8 width=1985) 
(actual time=50.346..14372.820 rows=3453 loops=1)
-  Nested Loop Left Join  (cost=0.00..10877.43 rows=8 
width=1949) (actual time=50.336..14358.101 rows=3453 loops=1)
  -  Nested Loop Left Join  (cost=0.00..10872.63 rows=8 
width=1801) (actual time=50.321..14344.603 rows=3453 loops=1)
-  Nested Loop  (cost=0.00..10867.83 rows=8 
width=1764) (actual time=50.315..14336.979 rows=3453 loops=1)
  -  Nested Loop  (cost=0.00..10863.03 rows=8 
width=1728) (actual time=50.288..14308.368 rows=3453 loops=1)
-  Index Scan using domain_name_idx on 
domain d  (cost=0.00..6.63 rows=1 width=452) (actual time=0.049..0.080 
rows=1 loops=1)
  Index Cond: ((name)::text = 
'testdomain.com'::text)
-  Index Scan using 
event_20090526_domain_idx on event_20090526 e  (cost=0.00..10694.13 
rows=3606 width=1276) (actual time=50.233..14305.211 rows=3453 loops=1)

  Index Cond: (e.domain_id = d.id)
  -  Index Scan using event_type_pkey on 
event_type et  (cost=0.00..0.56 rows=1 width=36) (actual 
time=0.006..0.006 rows=1 loops=3453)

Index Cond: (et.id = e.event_type_id)
-  Index Scan using threat_pkey on threat t  
(cost=0.00..0.56 rows=1 width=37) (actual time=0.000..0.000 rows=0 
loops=3453)

  Index Cond: (e.threat_id = t.id)
  -  Index Scan using event_status_pkey on event_status 
es  (cost=0.00..0.56 rows=1 width=148) (actual time=0.002..0.002 rows=1 
loops=3453)

Index Cond: (e.event_status_id = es.id)
-  Index Scan using internal_host_pkey on internal_host iaa  
(cost=0.00..0.56 rows=1 width=36) (actual time=0.002..0.003 rows=1 
loops=3453)

  Index Cond: (e.aamta_host_id = iaa.id)
  -  Index Scan using internal_host_pkey on internal_host i  
(cost=0.00..0.56 rows=1 width=36) (actual time=0.000..0.000 rows=0 
loops=3453)

Index Cond: (e.mta_host_id = i.id)
Total runtime: 14380.000 ms

If the same query is done straight away again we get:

QUERY 
PLAN  


-
Nested Loop Left Join  (cost=0.00..10887.03 rows=8 width=2021) (actual 
time=0.165..67.388 rows=3453 loops=1)
  -  Nested Loop Left Join  (cost=0.00..10882.23 rows=8 width=1985) 
(actual time=0.162..61.973 rows=3453 loops=1)
-  Nested Loop Left Join  (cost=0.00..10877.43 rows=8 
width=1949) (actual time=0.156..49.756 rows=3453 loops=1)
  -  Nested Loop Left Join  (cost=0.00..10872.63 rows=8 
width=1801) (actual time=0.148..37.522 rows=3453 loops=1)
-  Nested Loop  (cost=0.00..10867.83 rows=8 
width=1764) (actual time=0.146..31.920 rows=3453 loops=1)
  -  Nested Loop  (cost=0.00..10863.03 rows=8 
width=1728) (actual time=0.129..10.325 rows=3453 loops=1)
-  Index Scan using domain_name_idx on 
domain d  (cost=0.00..6.63 rows=1 width=452) (actual time=0.099..0.139 
rows=1 loops=1)
  Index Cond: ((name)::text = 
'rhe.com.au'::text)
-  Index Scan using 
event_20090526_domain_idx on event_20090526 e  (cost=0.00..10694.13 
rows=3606 width=1276) (actual time=0.027..7.510 rows=3453 loops=1)