[ 
https://issues.apache.org/jira/browse/CONNECTORS-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129922#comment-14129922
 ] 

Paul Boichat commented on CONNECTORS-1027:
------------------------------------------

Quert Plans: - Note that I only see this performance difference when a job is 
running (possibly as a result of documents being in a P or G status
 
 Original Query:
 
 explain analyze SELECT 
t0.id,t0.jobid,t0.dochash,t0.docid,t0.status,t0.failtime,t0.failcount, 
t0.priorityset FROM jobqueue t0 WHERE t0.status IN ('P','G')  AND 
t0.checkaction='R' AND t0.checktime <= 1407246846166 AND EXISTS (SELECT 'x' 
FROM jobs t1 WHERE t1.status  IN ('A','a')  AND t1.id=t0.jobid  AND 
t1.priority=5) AND NOT EXISTS (   SELECT 'x' FROM jobqueue t2   WHERE 
t2.dochash=t0.dochash AND t2.status IN ('A','F','a','f','D','d') AND 
t2.jobid!=t0.jobid ) AND NOT EXISTS (   SELECT 'x' FROM prereqevents t3,events 
t4   WHERE t0.id=t3.owner AND t3.eventname=t4.name ) ORDER BY t0.docpriority 
ASC LIMIT 480;
                                                                                
QUERY PLAN                             
                                                    
-----------------------------------------------------------------------------------------------------------------------
----------------------------------------------------
 Limit  (cost=499234.08..499235.28 rows=480 width=179) (actual 
time=23830.242..23830.496 rows=480 loops=1)
   ->  Sort  (cost=499234.08..499858.32 rows=249697 width=179) (actual 
time=23830.238..23830.332 rows=480 loops=1)
         Sort Key: t0.docpriority
         Sort Method: top-N heapsort  Memory: 152kB
         ->  Nested Loop Anti Join  (cost=38636.70..486865.47 rows=249697 
width=179) (actual time=1694.431..23772.466 r
ows=64682 loops=1)
               Join Filter: ((t2.jobid <> t0.jobid) AND ((t2.dochash)::text = 
(t0.dochash)::text))
               Rows Removed by Join Filter: 40361568
               ->  Hash Anti Join  (cost=38636.70..482464.67 rows=249697 
width=179) (actual time=1692.868..4870.315 row
s=64682 loops=1)
                     Hash Cond: (t0.id = t3.owner)
                     ->  Hash Join  (cost=38608.92..479284.46 rows=249700 
width=179) (actual time=1692.766..4832.385 ro
ws=64682 loops=1)
                           Hash Cond: (t0.jobid = t1.id)
                           ->  Bitmap Heap Scan on jobqueue t0  
(cost=38452.39..473170.01 rows=1373350 width=179) (actu
al time=1340.245..3427.182 rows=4817107 loops=1)
                                 Recheck Cond: ((status = ANY 
('{P,G}'::bpchar[])) AND (checkaction = 'R'::bpchar) AND 
(checktime <= 1407246846166::bigint))
                                 ->  Bitmap Index Scan on i1392985450174  
(cost=0.00..38109.05 rows=1373350 width=0) (a
ctual time=1281.749..1281.749 rows=4826138 loops=1)
                                       Index Cond: ((status = ANY 
('{P,G}'::bpchar[])) AND (checkaction = 'R'::bpchar) 
AND (checktime <= 1407246846166::bigint))
                           ->  Hash  (cost=156.51..156.51 rows=2 width=8) 
(actual time=2.025..2.025 rows=1 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 1kB
                                 ->  Seq Scan on jobs t1  (cost=0.00..156.51 
rows=2 width=8) (actual time=0.041..2.021 
rows=1 loops=1)
                                       Filter: ((status = ANY 
('{A,a}'::bpchar[])) AND (priority = 5))
                                       Rows Removed by Filter: 33
                     ->  Hash  (cost=26.15..26.15 rows=130 width=8) (actual 
time=0.005..0.005 rows=0 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 0kB
                           ->  Hash Join  (cost=12.93..26.15 rows=130 width=8) 
(actual time=0.004..0.004 rows=0 loops=1
)
                                 Hash Cond: ((t3.eventname)::text = 
(t4.name)::text)
                                 ->  Seq Scan on prereqevents t3  
(cost=0.00..11.40 rows=140 width=524) (actual time=0.
001..0.001 rows=0 loops=1)
                                 ->  Hash  (cost=11.30..11.30 rows=130 
width=516) (never executed)
                                       ->  Seq Scan on events t4  
(cost=0.00..11.30 rows=130 width=516) (never executed
)
               ->  Materialize  (cost=0.00..31.11 rows=1 width=49) (actual 
time=0.000..0.126 rows=624 loops=64682)
                     ->  Index Scan using i1392985450173 on jobqueue t2  
(cost=0.00..31.10 rows=1 width=49) (actual tim
e=0.069..0.739 rows=624 loops=1)
                           Index Cond: (status = ANY 
('{A,F,a,f,D,d}'::bpchar[]))
 Total runtime: 23832.087 ms
(31 rows)

 
 Revised Query:
 
 explain analyze SELECT                                                         
                                 
t0.id,t0.jobid,t0.dochash,t0.docid,t0.status,t0.failtime,t0.failcount,
 t0.priorityset FROM jobqueue t0                                                
                                        WHERE (t0.status = 'P' OR t0.status = 
'G')  AND t0.checkaction='R' AND
 t0.checktime <= 1407246846166                                                  
                                       AND EXISTS (
   SELECT 'x' FROM jobs t1
   WHERE (t1.status = 'A' OR t1.status = 'a')  AND t1.id=t0.jobid  AND
 t1.priority=5
 )
 AND NOT EXISTS (
   SELECT 'x' FROM jobqueue t2
   WHERE t2.dochash=t0.dochash AND (t2.status = 'A' OR t2.status = 'F'
 OR t2.status = 'a' OR t2.status = 'f' OR t2.status = 'D' OR t2.status
 = 'd')
   AND t2.jobid!=t0.jobid
 )
 AND NOT EXISTS (
   SELECT 'x' FROM prereqevents t3,events t4
   WHERE t0.id=t3.owner AND t3.eventname=t4.name
 )
 ORDER BY t0.docpriority ASC
 LIMIT 480;
                                                                                
          QUERY PLAN                   
                                                                       
-----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------
 Limit  (cost=35.17..2868.75 rows=480 width=179) (actual time=2.004..136.828 
rows=480 loops=1)
   ->  Nested Loop Anti Join  (cost=35.17..1395679.74 rows=236418 width=179) 
(actual time=2.002..136.601 rows=480 loops
=1)
         Join Filter: ((t2.jobid <> t0.jobid) AND ((t2.dochash)::text = 
(t0.dochash)::text))
         Rows Removed by Join Filter: 304320
         ->  Nested Loop Anti Join  (cost=0.00..1391503.23 rows=236418 
width=179) (actual time=0.819..3.286 rows=480 lo
ops=1)
               ->  Nested Loop Semi Join  (cost=0.00..1185367.86 rows=236421 
width=179) (actual time=0.808..2.009 rows=
480 loops=1)
                     Join Filter: (t0.jobid = t1.id)
                     ->  Index Scan using i1392985450172 on jobqueue t0  
(cost=0.00..1146201.81 rows=1300315 width=179)
 (actual time=0.746..1.385 rows=480 loops=1)
                           Index Cond: ((checkaction = 'R'::bpchar) AND 
(checktime <= 1407246846166::bigint))
                           Filter: ((status = 'P'::bpchar) OR (status = 
'G'::bpchar))
                           Rows Removed by Filter: 633
                     ->  Materialize  (cost=0.00..156.60 rows=2 width=8) 
(actual time=0.000..0.000 rows=1 loops=480)
                           ->  Seq Scan on jobs t1  (cost=0.00..156.59 rows=2 
width=8) (actual time=0.045..0.045 rows=1
 loops=1)
                                 Filter: ((priority = 5) AND ((status = 
'A'::bpchar) OR (status = 'a'::bpchar)))
                                 Rows Removed by Filter: 18
               ->  Nested Loop  (cost=0.00..0.86 rows=1 width=8) (actual 
time=0.002..0.002 rows=0 loops=480)
                     ->  Index Scan using i1392985450178 on prereqevents t3  
(cost=0.00..0.27 rows=1 width=524) (actual
 time=0.001..0.001 rows=0 loops=480)
                           Index Cond: (t0.id = owner)
                     ->  Index Only Scan using events_pkey on events t4  
(cost=0.00..0.58 rows=1 width=516) (never exec
uted)
                           Index Cond: (name = (t3.eventname)::text)
                           Heap Fetches: 0
         ->  Materialize  (cost=35.17..39.20 rows=1 width=49) (actual 
time=0.001..0.124 rows=634 loops=480)
               ->  Bitmap Heap Scan on jobqueue t2  (cost=35.17..39.19 rows=1 
width=49) (actual time=0.261..0.699 rows=
634 loops=1)
                     Recheck Cond: ((status = 'A'::bpchar) OR (status = 
'F'::bpchar) OR (status = 'a'::bpchar) OR (stat
us = 'f'::bpchar) OR (status = 'D'::bpchar) OR (status = 'd'::bpchar))
                     ->  BitmapOr  (cost=35.17..35.17 rows=1 width=0) (actual 
time=0.249..0.249 rows=0 loops=1)
                           ->  Bitmap Index Scan on i1392985450173  
(cost=0.00..5.86 rows=1 width=0) (actual time=0.215
..0.215 rows=1000 loops=1)
                                 Index Cond: (status = 'A'::bpchar)
                           ->  Bitmap Index Scan on i1392985450173  
(cost=0.00..5.86 rows=1 width=0) (actual time=0.012
..0.012 rows=2 loops=1)
                                 Index Cond: (status = 'F'::bpchar)
                           ->  Bitmap Index Scan on i1392985450173  
(cost=0.00..5.86 rows=1 width=0) (actual time=0.005
..0.005 rows=0 loops=1)
                                 Index Cond: (status = 'a'::bpchar)
                           ->  Bitmap Index Scan on i1392985450173  
(cost=0.00..5.86 rows=1 width=0) (actual time=0.004
..0.004 rows=0 loops=1)
                                 Index Cond: (status = 'f'::bpchar)
                           ->  Bitmap Index Scan on i1392985450173  
(cost=0.00..5.86 rows=1 width=0) (actual time=0.004
..0.004 rows=0 loops=1)
                                 Index Cond: (status = 'D'::bpchar)
                           ->  Bitmap Index Scan on i1392985450173  
(cost=0.00..5.86 rows=1 width=0) (actual time=0.004
..0.004 rows=0 loops=1)
                                 Index Cond: (status = 'd'::bpchar)
 Total runtime: 137.195 ms
(38 rows)


> Database performance optimization
> ---------------------------------
>
>                 Key: CONNECTORS-1027
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1027
>             Project: ManifoldCF
>          Issue Type: Task
>          Components: Framework core
>    Affects Versions: ManifoldCF 1.7
>            Reporter: Karl Wright
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 2.0
>
>
> Some database queries seem to perform better against postgresql in different 
> form.  For example:
> {code}
>  SELECT
>  t0.id,t0.jobid,t0.dochash,t0.docid,t0.status,t0.failtime,t0.failcount,
>  t0.priorityset FROM jobqueue t0
>  WHERE t0.status IN ('P','G')  AND t0.checkaction='R' AND t0.checktime
>  <= 1407246846166
>  AND EXISTS (
>    SELECT 'x' FROM jobs t1
>    WHERE t1.status  IN ('A','a')  AND t1.id=t0.jobid  AND t1.priority=5
>  )
>  AND NOT EXISTS (
>    SELECT 'x' FROM jobqueue t2
>    WHERE t2.dochash=t0.dochash AND t2.status IN
>  ('A','F','a','f','D','d') AND t2.jobid!=t0.jobid
>  )
>  AND NOT EXISTS (
>    SELECT 'x' FROM prereqevents t3,events t4
>    WHERE t0.id=t3.owner AND t3.eventname=t4.name
>  )
>  ORDER BY t0.docpriority ASC
>  LIMIT 480;
> {code}
> ...apparently performs better on some versions of postgresql when written 
> like this:
> {code}
>  SELECT
>  t0.id,t0.jobid,t0.dochash,t0.docid,t0.status,t0.failtime,t0.failcount,
>  t0.priorityset FROM jobqueue t0
>  WHERE (t0.status = 'P' OR t0.status = 'G')  AND t0.checkaction='R' AND
>  t0.checktime <= 1407246846166
> AND EXISTS (
>    SELECT 'x' FROM jobs t1
>    WHERE (t1.status = 'A' OR t1.status = 'a')  AND t1.id=t0.jobid  AND
>  t1.priority=5
>  )
>  AND NOT EXISTS (
>    SELECT 'x' FROM jobqueue t2
>    WHERE t2.dochash=t0.dochash AND (t2.status = 'A' OR t2.status = 'F'
>  OR t2.status = 'a' OR t2.status = 'f' OR t2.status = 'D' OR t2.status
>  = 'd')
>    AND t2.jobid!=t0.jobid
>  )
>  AND NOT EXISTS (
>    SELECT 'x' FROM prereqevents t3,events t4
>    WHERE t0.id=t3.owner AND t3.eventname=t4.name
>  )
>  ORDER BY t0.docpriority ASC
>  LIMIT 480;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to