[ 
https://issues.apache.org/jira/browse/CONNECTORS-1090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14192026#comment-14192026
 ] 

Karl Wright edited comment on CONNECTORS-1090 at 10/31/14 4:39 PM:
-------------------------------------------------------------------

I'm not seeing any sign from postgresql that it is using the correct index and 
scanning it in index order for the stuffer query.

Here's what the 8.3 documentation says:

{code}
The planner will consider satisfying an ORDER BY specification either by 
scanning any available index that matches the specification, or by scanning the 
table in physical order and doing an explicit sort. For a query that requires 
scanning a large fraction of the table, the explicit sort is likely to be 
faster because it requires less disk I/O due to a better-ordered access 
pattern. Indexes are more useful when only a few rows need be fetched. An 
important special case is ORDER BY in combination with LIMIT n: an explicit 
sort will have to process all the data to identify the first n rows, but if 
there is an index matching the ORDER BY then the first n rows can be retrieved 
directly, without scanning the remainder at all.
{code}

The intent was to combine ORDER BY with LIMIT, so it matches the case in 
question perfectly.   I don't understand why this isn't reading from the index 
in almost all cases.

Instead, with 9.3, we're getting plans that look like this:

{code}
 WARN 2014-10-31 12:24:07,944 (Stuffer thread) -  Plan: Limit  
(cost=1.13..633.06 rows=240 width=192) (actual time=0.307..4.133 rows=240 
loops=1)
 WARN 2014-10-31 12:24:07,944 (Stuffer thread) -  Plan:   ->  Nested Loop Semi 
Join  (cost=1.13..74547.95 rows=28312 width=192) (actual time=0.305..4.117 
rows=240 loops=1)
 WARN 2014-10-31 12:24:07,944 (Stuffer thread) -  Plan:         Join Filter: 
(t0.jobid = t1.id)
 WARN 2014-10-31 12:24:07,944 (Stuffer thread) -  Plan:         ->  Nested Loop 
Anti Join  (cost=1.13..74111.87 rows=28312 width=192) (actual time=0.294..3.976 
rows=240 loops=1)
 WARN 2014-10-31 12:24:07,944 (Stuffer thread) -  Plan:               ->  
Nested Loop Anti Join  (cost=0.71..39180.11 rows=28348 width=192) (actual 
time=0.268..0.815 rows=240 loops=1)
 WARN 2014-10-31 12:24:07,944 (Stuffer thread) -  Plan:                     ->  
Index Scan using i1414771440111 on jobqueue t0  (cost=0.42..20351.99 rows=28384 
width=192) (actual time=0.236..0.399 rows=240 loops=1)
 WARN 2014-10-31 12:24:07,944 (Stuffer thread) -  Plan:                         
  Index Cond: ((checkaction = 'R'::bpchar) AND (checktime <= 
1414772647826::bigint))
 WARN 2014-10-31 12:24:07,944 (Stuffer thread) -  Plan:                         
  Filter: ((status = 'P'::bpchar) OR (status = 'G'::bpchar))
 WARN 2014-10-31 12:24:07,944 (Stuffer thread) -  Plan:                         
  Rows Removed by Filter: 121
 WARN 2014-10-31 12:24:07,944 (Stuffer thread) -  Plan:                     ->  
Nested Loop  (cost=0.29..0.65 rows=1 width=8) (actual time=0.001..0.001 rows=0 
loops=240)
 WARN 2014-10-31 12:24:07,944 (Stuffer thread) -  Plan:                         
  ->  Index Scan using i1414771440117 on prereqevents t3  (cost=0.14..0.16 
rows=1 width=524) (actual time=0.001..0.001 rows=0 loops=240)
 WARN 2014-10-31 12:24:07,944 (Stuffer thread) -  Plan:                         
        Index Cond: (t0.id = owner)
 WARN 2014-10-31 12:24:07,944 (Stuffer thread) -  Plan:                         
  ->  Index Only Scan using events_pkey on events t4  (cost=0.14..0.48 rows=1 
width=516) (never executed)
 WARN 2014-10-31 12:24:07,944 (Stuffer thread) -  Plan:                         
        Index Cond: (name = (t3.eventname)::text)
 WARN 2014-10-31 12:24:07,944 (Stuffer thread) -  Plan:                         
        Heap Fetches: 0
 WARN 2014-10-31 12:24:07,944 (Stuffer thread) -  Plan:               ->  Index 
Scan using i1414771440110 on jobqueue t2  (cost=0.42..1.23 rows=1 width=49) 
(actual time=0.012..0.012 rows=0 loops=240)
 WARN 2014-10-31 12:24:07,944 (Stuffer thread) -  Plan:                     
Index Cond: (((dochash)::text = (t0.dochash)::text) AND (jobid = t0.jobid))
 WARN 2014-10-31 12:24:07,944 (Stuffer thread) -  Plan:                     
Filter: ((status = 'A'::bpchar) OR (status = 'F'::bpchar) OR (status = 
'a'::bpchar) OR (status = 'f'::bpchar) OR (status = 'D'::bpchar) OR (status = 
'd'::bpchar))
 WARN 2014-10-31 12:24:07,944 (Stuffer thread) -  Plan:                     
Rows Removed by Filter: 1
 WARN 2014-10-31 12:24:07,944 (Stuffer thread) -  Plan:         ->  Materialize 
 (cost=0.00..11.41 rows=1 width=8) (actual time=0.000..0.000 rows=1 loops=240)
 WARN 2014-10-31 12:24:07,944 (Stuffer thread) -  Plan:               ->  Seq 
Scan on jobs t1  (cost=0.00..11.40 rows=1 width=8) (actual time=0.005..0.005 
rows=1 loops=1)
 WARN 2014-10-31 12:24:07,944 (Stuffer thread) -  Plan:                     
Filter: ((priority = 5::bigint) AND ((status = 'A'::bpchar) OR (status = 
'a'::bpchar)))
 WARN 2014-10-31 12:24:07,944 (Stuffer thread) -  Plan: Total runtime: 4.474 ms
{code}



was (Author: [email protected]):
I'm not seeing any sign from postgresql that it is using the correct index and 
scanning it in index order for the stuffer query.

Here's what the 8.3 documentation says:

{code}
The planner will consider satisfying an ORDER BY specification either by 
scanning any available index that matches the specification, or by scanning the 
table in physical order and doing an explicit sort. For a query that requires 
scanning a large fraction of the table, the explicit sort is likely to be 
faster because it requires less disk I/O due to a better-ordered access 
pattern. Indexes are more useful when only a few rows need be fetched. An 
important special case is ORDER BY in combination with LIMIT n: an explicit 
sort will have to process all the data to identify the first n rows, but if 
there is an index matching the ORDER BY then the first n rows can be retrieved 
directly, without scanning the remainder at all.
{code}

The intent was to combine ORDER BY with LIMIT, so it matches the case in 
question perfectly.   I don't understand why this isn't reading from the index 
in almost all cases.


> More PostgreSQL performance improvements
> ----------------------------------------
>
>                 Key: CONNECTORS-1090
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1090
>             Project: ManifoldCF
>          Issue Type: Improvement
>          Components: Framework core
>    Affects Versions: Manifold 1.7.1
>            Reporter: Karl Wright
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 1.8, ManifoldCF 2.0
>
>         Attachments: CONNECTORS-1090.patch
>
>
> More performance improvements may be possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to