Re: [PERFORM] Query performing very bad and sometimes good

2014-08-06 Thread Kevin Grittner
Andreas Joseph Krogh andr...@visena.com wrote:

 Sometimes it performs much better (but still not well)

As has already been suggested, that difference is almost certainly 
due to differences in how much of the necessary data is cached or 
what the query is competing with.
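
For what it's worth, one way to see how much caching is involved would be to
run the same prepared statement with EXPLAIN (ANALYZE, BUFFERS), which
reports shared-buffer hits versus reads for each plan node (the parameter
values below are simply the ones from the original post):

   on= EXPLAIN (ANALYZE, BUFFERS)
       EXECUTE list_un_done_in_folder_q (3, 3, 3, 44961, 101, 0);

A slow run dominated by "read" and a fast run dominated by "hit" would point
at caching; setting track_io_timing = on additionally reports time spent on
I/O.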

 Does anyone see anything obvious or have any hints what to
 investigate further?

We need more information to be able to say much.  Please review
this page:

http://wiki.postgresql.org/wiki/SlowQueryQuestions

Knowing more about the hardware and the tables (including all
indexes) would help a lot, as well as all non-default configuration
settings.  In particular, I'm curious whether there is an index on
the message_id column of origo_email_delivery.
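
For reference, a quick way to check is to list the table's indexes (psql's
\d origo_email_delivery, or the pg_indexes view); if nothing covers
message_id, an index along these lines might be worth trying (the index name
below is made up, and this is only a sketch):

   SELECT indexname, indexdef
     FROM pg_indexes
    WHERE tablename = 'origo_email_delivery';

   -- hypothetical name; CONCURRENTLY avoids blocking writes while it builds
   CREATE INDEX CONCURRENTLY origo_email_delivery_message_id_idx
       ON origo_email_delivery (message_id);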

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Query performing very bad and sometimes good

2014-08-05 Thread Andreas Joseph Krogh
Hi all.

Running version:

on= select version();
                                                  version
------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.3.2 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit

I have this query:

prepare list_un_done_in_folder_q AS
 SELECT em.entity_id, substr(em.plain_text_content, 1, 101) as plain_text_content, del.entity_id as delivery_id, del.subject,
        coalesce(prop.is_seen, false) AS is_seen,
        coalesce(prop.is_done, false) AS is_done,
        del.received_timestamp, del.sent_timestamp, ef.name as from_name, ef.address as from_address
        , ARRAY(select a.name from origo_email_address a inner join origo_email_address_owner o ON o.address_id = a.entity_id AND o.recipient_type = 'TO' AND o.message_id = em.entity_id ORDER BY o.address_index ASC) as recipient_to_name
        , ARRAY(select a.address from origo_email_address a inner join origo_email_address_owner o ON o.address_id = a.entity_id AND o.recipient_type = 'TO' AND o.message_id = em.entity_id ORDER BY o.address_index ASC) as recipient_to_address
        , prop.followup_id, prop.is_forwarded as is_forwarded, prop.is_replied as is_replied, fm.folder_id
        , (SELECT person_fav.priority
             FROM origo_favourite_owner person_fav
            WHERE person_fav.favourite_for = $1
              AND person_fav.favourite_item = pers.entity_id) AS person_favourite_priority
        , (SELECT company_fav.priority
             FROM origo_favourite_owner company_fav
            WHERE company_fav.favourite_for = $2
              AND company_fav.favourite_item = comp.entity_id) AS company_favourite_priority
        , pers.entity_id as from_person_entity_id, pers.id as from_person_id, pers.onp_user_id, pers.firstname as from_firstname, pers.lastname as from_lastname
        , comp.entity_id as from_company_entity_id, comp.companyname as from_company_name
        , em.attachment_size
   FROM origo_email_delivery del
   JOIN origo_email_message em ON (del.message_id = em.entity_id)
   LEFT OUTER JOIN onp_crm_person pers ON (em.from_entity_id = pers.entity_id)
   LEFT OUTER JOIN onp_crm_relation comp ON (comp.entity_id = pers.relation_id)
   JOIN origo_email_message_property prop ON (em.entity_id = prop.message_id AND prop.owner_id = $3)
   LEFT OUTER JOIN origo_email_address ef ON em.from_id = ef.entity_id
   LEFT OUTER JOIN origo_email_folder_message fm ON fm.delivery_id = del.entity_id
  WHERE 1 = 1
    AND prop.is_done = FALSE
    AND fm.folder_id = $4
  ORDER BY del.received_timestamp DESC
  LIMIT $5 OFFSET $6;

Which sometimes performs really badly, although all indexes are being used.

Here is the explain plan:

on= explain analyze execute list_un_done_in_folder_q (3, 3, 3, 44961, 101, 0);

                                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=2.53..52101.30 rows=101 width=641) (actual time=0.343..311765.063 rows=75 loops=1)
   ->  Nested Loop Left Join  (cost=2.53..1365402.92 rows=2647 width=641) (actual time=0.342..311765.012 rows=75 loops=1)
         ->  Nested Loop Left Join  (cost=2.10..1248967.97 rows=2647 width=607) (actual time=0.202..311215.044 rows=75 loops=1)
               ->  Nested Loop Left Join  (cost=1.83..1248018.82 rows=2647 width=592) (actual time=0.201..311214.888 rows=75 loops=1)
                     ->  Nested Loop  (cost=1.55..1247217.37 rows=2647 width=565) (actual time=0.199..311214.695 rows=75 loops=1)
                           ->  Nested Loop  (cost=1.13..1240583.78 rows=2647 width=126) (actual time=0.194..311213.727 rows=75 loops=1)
                                 ->  Nested Loop  (cost=0.71..1230153.64 rows=20567 width=118) (actual time=0.029..311153.648 rows=12866 loops=1)
                                       ->  Index Scan Backward using origo_email_delivery_received_idx on origo_email_delivery del  (cost=0.42..1102717.48 rows=354038 width=98) (actual time=0.017..309196.670 rows=354296 loops=1)
                                       ->  Index Scan using origo_email_prop_owner_message_not_done_idx on origo_email_message_property prop  (cost=0.29..0.35 rows=1 width=20) (actual time=0.004..0.004 rows=0 loops=354296)
                                             Index Cond: ((owner_id = 3::bigint) AND (message_id = del.message_id))
                                 ->  Index Only Scan using origo_email_folder_message_delivery_id_folder_id_key on origo_email_folder_message

Re: [PERFORM] Query performing very bad and sometimes good

2014-08-05 Thread David G Johnston
Andreas Joseph Krogh wrote:
 Hi all.   Running version:
 on= select version();
  PostgreSQL 9.3.2 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit

9.3.2 is not a release worth staying on; you are several 9.3.x bug-fix
releases behind.


 Bad:
 Index Scan Backward using origo_email_delivery_received_idx on origo_email_delivery del  (cost=0.42..1102717.48 rows=354038 width=98) (actual time=0.017..309196.670 rows=354296 loops=1)

Add 4 new records

 Good (-ish):
 Index Scan Backward using origo_email_delivery_received_idx on origo_email_delivery del  (cost=0.42..1102717.48 rows=354038 width=98) (actual time=0.019..2431.773 rows=354300 loops=1)

The plans appear to be basically identical - and the queries/data as well
aside from the addition of 4 more unmatched records.

The difference between the two is likely attributable to system load
variations combined with the effect of caching after running the query the
first (slow) time.

Doing OFFSET/LIMIT pagination can be problematic, so I'd be curious what
would happen if you got rid of it.  In this specific case the result set is
only 75 rows against a LIMIT of 101, so the paging isn't doing anything here
anyway.
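
If pagination is still needed once the rest is sorted out, keyset pagination
is one alternative that avoids re-scanning the skipped rows.  A rough sketch
only, reusing the columns from the posted query ($7 and $8 are hypothetical
parameters holding the received_timestamp and entity_id of the last row on
the previous page):

   ...
   WHERE fm.folder_id = $4
     AND prop.is_done = FALSE
     AND (del.received_timestamp, del.entity_id) < ($7, $8)
   ORDER BY del.received_timestamp DESC, del.entity_id DESC
   LIMIT $5;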

The left joins seem to contribute only marginally to the cost, so I'd set
those aside and focus on optimizing the inner joins and, more likely, the
correlated subqueries in the select list.  You need to avoid nested-looping
over 300,000+ records somehow, though I'm not going to be that helpful on
the actual how...
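
As one illustration of that (a sketch only, and it assumes
origo_favourite_owner has at most one row per favourite_for/favourite_item
pair), the two favourite-priority subselects could be folded into the join
tree instead of running as correlated subqueries:

   LEFT OUTER JOIN origo_favourite_owner person_fav
          ON person_fav.favourite_for = $1
         AND person_fav.favourite_item = pers.entity_id
   LEFT OUTER JOIN origo_favourite_owner company_fav
          ON company_fav.favourite_for = $2
         AND company_fav.favourite_item = comp.entity_id
   -- then select person_fav.priority AS person_favourite_priority,
   --             company_fav.priority AS company_favourite_priority

The two ARRAY(...) recipient subselects could get similar treatment, e.g.
with a grouped LEFT JOIN LATERAL, which 9.3 supports.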

Note that in the inner-most loop the actual time per iteration for the
cached run is half that of the non-cached run.  While both are quite small
(0.002 ms vs 0.004 ms), the 300,000+ loops do add up.  The same likely
applies to the other plan nodes, but I didn't dig that deep.
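
Rough arithmetic from the numbers quoted above: 0.004 ms per loop times
roughly 354,000 loops is about 1.4 seconds spent in that one node during the
slow run, versus about 0.7 seconds at 0.002 ms per loop in the cached run.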

David J.





