On Fri, 27 Mar 2009, Віталій Тимчишин wrote:
...an index on (objectid, start) would help...

Definitely.
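
For reference, the plans below use two composite indexes; their names appear in the EXPLAIN output, and the definitions would be along these lines (treat the exact column lists as a sketch following the (objectid, start) suggestion):

-- Assumed index definitions; names as shown in the plan below.
CREATE INDEX location_test_obj_start ON location (objectid, start);
CREATE INDEX location_test_obj_end ON location (objectid, "end");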

You could try adding "AND l2.start > l1.start" to the first query. This will drop the symmetric half of the intersections (the ones that remain are l2 inside or to the left of l1), but you can rebuild the full result with id1,id2 UNION ALL id2,id1, and it may allow the start index to be used for the "between" condition.

That's remarkably clever, and should have been obvious. Thanks.

Adding the constraint as you suggest does indeed make the query fast. However, there seems to be a problem with the planner: it does not distinguish between the costs of two alternative plans that have vastly different real costs. Consider the following:

SELECT
    l1.id AS id1,
    l2.id AS id2
FROM
    location l1,
    location l2
WHERE
        l1.objectid = 228000093
    AND l2.objectid = 228000093
    AND l1.id <> l2.id
    AND l1.start < l2.end
    AND l1.end > l2.start
    AND l1.start < l2.start
UNION ALL SELECT
    l1.id AS id1, l2.id AS id2
FROM
    location l1,
    location l2
WHERE
        l1.objectid = 228000093
    AND l2.objectid = 228000093
    AND l1.id <> l2.id
    AND l1.start < l2.end
    AND l1.end > l2.start
    AND l1.start >= l2.start;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
 Append  (cost=0.00..20479179.74 rows=138732882 width=8)
   ->  Nested Loop  (cost=0.00..9545925.46 rows=69366441 width=8)
         Join Filter: ((l1.id <> l2.id) AND (l1.start < l2.end))
         ->  Index Scan using location_test_obj_end on location l1  (cost=0.00..55966.07 rows=43277 width=12)
               Index Cond: (objectid = 228000093)
         ->  Index Scan using location_test_obj_start on location l2  (cost=0.00..123.10 rows=4809 width=12)
               Index Cond: ((l2.objectid = 228000093) AND (l1.end > l2.start) AND (l1.start < l2.start))
   ->  Nested Loop  (cost=0.00..9545925.46 rows=69366441 width=8)
         Join Filter: ((l1.id <> l2.id) AND (l1.start < l2.end))
         ->  Index Scan using location_test_obj_end on location l1  (cost=0.00..55966.07 rows=43277 width=12)
               Index Cond: (objectid = 228000093)
         ->  Index Scan using location_test_obj_start on location l2  (cost=0.00..123.10 rows=4809 width=12)
               Index Cond: ((l2.objectid = 228000093) AND (l1.end > l2.start) AND (l1.start >= l2.start))
(13 rows)


Notice the two different index conditions:
    (l1.end > l2.start) AND (l1.start < l2.start)  - "between"
    (l1.end > l2.start) AND (l1.start >= l2.start) - open-ended
Both are given exactly the same estimate: (cost=0.00..123.10 rows=4809 width=12).

Postgres estimates these two index scans to be equivalent in cost, when in reality their costs are vastly different. Shouldn't Postgres favour a "between" index scan over an open-ended one?
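
Running the two halves separately under EXPLAIN ANALYZE should make the gap between estimated and actual work obvious; a sketch (the same query simply split in two, nothing else changed):

-- The "between" half on its own.
EXPLAIN ANALYZE SELECT
    l1.id AS id1,
    l2.id AS id2
FROM
    location l1,
    location l2
WHERE
        l1.objectid = 228000093
    AND l2.objectid = 228000093
    AND l1.id <> l2.id
    AND l1.start < l2.end
    AND l1.end > l2.start
    AND l1.start < l2.start;

-- The open-ended half: the planner costs both inner index scans at
-- 0.00..123.10, but the actual rows fetched per loop should differ hugely.
EXPLAIN ANALYZE SELECT
    l1.id AS id1,
    l2.id AS id2
FROM
    location l1,
    location l2
WHERE
        l1.objectid = 228000093
    AND l2.objectid = 228000093
    AND l1.id <> l2.id
    AND l1.start < l2.end
    AND l1.end > l2.start
    AND l1.start >= l2.start;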

Matthew

--
[About NP-completeness] These are the problems that make efficient use of
the Fairy Godmother.                    -- Computer Science Lecturer