[PERFORM] Inner join on two OR conditions dont use index

2005-05-25 Thread Jocelyn Turcotte
Hi all
i dont know if this is normal, but if yes i would like to know why and
how I could do it another way other than using unions.

(I tried on postgresql 7.4 and 8.0.3, made my vacuum analyse just before)
Here is my simple query:

select * 
from rt_node n, rt_edge e
where node_id = 2 
and e.start_node_id = n.node_id;

which give me the following query plan:

 Nested Loop  (cost=0.00..79.46 rows=24 width=60)
   -  Index Scan using rt_node_pkey on rt_node n  (cost=0.00..5.94
rows=1 width=36)
 Index Cond: (node_id = 2)
   -  Index Scan using rt_edge_start_node on rt_edge e 
(cost=0.00..73.28 rows=24 width=24)
 Index Cond: (start_node_id = 2)


But if I plug another condition with a OR like this:
select * 
from rt_node n, rt_edge e
where node_id = 2 
and (e.start_node_id = n.node_id or e.end_node_id = n.node_id);

I get this plan, it stop using the index!:

 Nested Loop  (cost=0.00..158.94 rows=4 width=60)
   Join Filter: ((inner.start_node_id = outer.node_id) OR
(inner.end_node_id = outer.node_id))
   -  Index Scan using rt_node_pkey on rt_node n  (cost=0.00..5.94
rows=1 width=36)
 Index Cond: (node_id = 2)
   -  Seq Scan on rt_edge e  (cost=0.00..81.60 rows=4760 width=24)

I tried SET enable_seqscan = OFF and it give me this (woah) :

 Nested Loop  (cost=1.00..10158.94 rows=4 width=60)
   Join Filter: ((inner.start_node_id = outer.node_id) OR
(inner.end_node_id = outer.node_id))
   -  Index Scan using rt_node_pkey on rt_node n  (cost=0.00..5.94
rows=1 width=36)
 Index Cond: (node_id = 2)
   -  Seq Scan on rt_edge e  (cost=1.00..10081.60
rows=4760 width=24)

These are my tables definitions:
CREATE TABLE rt_node (
node_id INTEGER PRIMARY KEY
);

CREATE TABLE rt_edge (
edge_id INTEGER PRIMARY KEY,
start_node_id   INTEGER NOT NULL,
end_node_id INTEGER NOT NULL,
CONSTRAINT start_node_ref FOREIGN KEY (start_node_id) REFERENCES
rt_node(node_id),
CONSTRAINT end_node_ref FOREIGN KEY (end_node_id) REFERENCES
rt_node(node_id)
);
   
CREATE INDEX rt_edge_start_node ON rt_edge(start_node_id);
CREATE INDEX rt_edge_end_node ON rt_edge(end_node_id);


I cant figure why it cant use my index
I know I can use a UNION instead on two query like the first one only
different on start_node_id/end_node_id, and it works,
but this is a part of a bigger query which is already ugly and I hate
using 5 lines for something I could in 5 words.

thank you!

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Inner join on two OR conditions dont use index

2005-05-25 Thread Jocelyn Turcotte
Thanks John

it dont seems to work, but in my context I only needed data from the
rt_node table so I tried this:

select *
from rt_node n
where node_id = 2
and exists (select edge_id from rt_edge where start_node_id =
n.node_id or end_node_id = n.node_id)

and it gave me this plan (even if I remove the stupid node_id = 2 condition):

 Index Scan using rt_node_pkey on rt_node n  (cost=0.00..6.15 rows=1 width=25)
   Index Cond: (node_id = 2)
   Filter: (subplan)
   SubPlan
 -  Index Scan using rt_edge_start_node, rt_edge_end_node on
rt_edge  (cost=0.00..12.56 rows=4 width=4)
   Index Cond: ((start_node_id = $0) OR (end_node_id = $0))


this time it use my two indexes, maybe because he know that the same
value is compared in the two condition... I should ask my mother if
she got an idea, mothers know a lot of stuff!

On 5/25/05, John A Meinel [EMAIL PROTECTED] wrote:
 Jocelyn Turcotte wrote:
 
 Hi all
 i dont know if this is normal, but if yes i would like to know why and
 how I could do it another way other than using unions.
 
 
 
 The only thing that *might* work is if you used an index on both keys.
 So if you did:
 
 CREATE INDEX rt_edge_start_end_node ON rt_edge(start_node_id,end_node_id);
 
 The reason is that in an OR construct, you have to check both for being 
 true. So in the general case where you don't know the correlation between the 
 columns, you have to check all of the entries, because even if you know the 
 status of one side of the OR, you don't know the other.
 
 Another possibility would be to try this index:
 
 CREATE INDEX rt_edge_stare_or_end ON rt_edge(start_node_id OR end_node_id);
 
 I'm not sure how smart the planner can be, though.
 
 John
 =:-
 
 
 
 


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly