Re: [GENERAL] Nested Loop Left Join always shows rows=1

2008-12-02 Thread Scara Maccai
 Yeah, this is a bug: it's effectively double-counting the 
 selectivity of the index clause. The error isn't enormously critical, 
 since the join size estimate is right; but it could perhaps lead to 
 choosing a plain indexscan when a bitmap scan would be better. I've 
 applied a patch.

Thank you. I'll try that.

I was able to change the PgBackendStatus struct to hold a percentage of 
completion field, which shows up when calling the pg_stat_get_activity 
function.

As I said in a previous mail, the progress indicator gives very good estimates 
for the simple queries I need. If I can come up with something that is good in 
general I'll post it. I'm reading some papers about the argument (mostly the 
ones listed in http://wiki.postgresql.org/wiki/Query_progress_indication).






-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Nested Loop Left Join always shows rows=1

2008-12-01 Thread Tom Lane
Scara Maccai [EMAIL PROTECTED] writes:
 I ran ANALYZE on both tables, but the plan shows rows=1 in the inner 
 table results.

 If I change the left outer join into an inner join the bitmap index 
 scan of the inner table of the nested loop shows a correct number of 
 rows (not the exact same thing as actual rows, but very close).

Yeah, this is a bug: it's effectively double-counting the selectivity of
the index clause.  The error isn't enormously critical, since the join
size estimate is right; but it could perhaps lead to choosing a plain
indexscan when a bitmap scan would be better.  I've applied a patch.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Nested Loop Left Join always shows rows=1

2008-11-27 Thread Scara Maccai

Query is:

explain analyze  select sum(A), count(*) from tab1 left outer join 
tab_outer on id=out_id where id=10;


output:

 Aggregate  (cost=31.91..31.92 rows=1 width=4) (actual 
time=14.185..14.185 rows=1 loops=1)
   -  Nested Loop Left Join  (cost=0.00..17.11 rows=2959 width=4) 
(actual time=8.608..13.400 rows=2953 loops=1)
 -  Index Scan using id_idx on tab1  (cost=0.00..8.27 rows=1 
width=4) (actual time=0.010..0.011 rows=1 loops=1)

   Index Cond: (id = 10)
 -  Index Scan using out_id_idx on tab_outer  (cost=0.00..8.83 
rows=1 width=8) (actual time=8.590..11.924 rows=2953 loops=1)
   Index Cond: ((tab_outer.out_id = 10) AND (tab1.id = 
tab_outer.out_id))




http://explain-analyze.info/query_plans/2519-query-plan-1240

I ran ANALYZE on both tables, but the plan shows rows=1 in the inner 
table results.


If I change the left outer join into an inner join the bitmap index 
scan of the inner table of the nested loop shows a correct number of 
rows (not the exact same thing as actual rows, but very close).


Am I reading the explain output in the wrong way?
I'm not complaining about performance issues, I'm just trying to learn 
how to read plans.






--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Nested Loop Left Join always shows rows=1

2008-11-27 Thread Tom Lane
Scara Maccai [EMAIL PROTECTED] writes:
   Aggregate  (cost=31.91..31.92 rows=1 width=4) (actual 
 time=14.185..14.185 rows=1 loops=1)
 -  Nested Loop Left Join  (cost=0.00..17.11 rows=2959 width=4) 
 (actual time=8.608..13.400 rows=2953 loops=1)
   -  Index Scan using id_idx on tab1  (cost=0.00..8.27 rows=1 
 width=4) (actual time=0.010..0.011 rows=1 loops=1)
 Index Cond: (id = 10)
   -  Index Scan using out_id_idx on tab_outer  (cost=0.00..8.83 
 rows=1 width=8) (actual time=8.590..11.924 rows=2953 loops=1)
 Index Cond: ((tab_outer.out_id = 10) AND (tab1.id = 
 tab_outer.out_id))

This seems a bit broken :-( ... ideally it shouldn't be generating the
redundant index condition, either.  I think the redundant condition is
probably indirectly responsible for the low rowcount estimate, too.
It doesn't really hurt anything, since the join output estimate is
correct, but it'd be nice to make it look better.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Nested Loop Left Join always shows rows=1

2008-11-27 Thread Scara Maccai

Tom Lane wrote:

Scara Maccai [EMAIL PROTECTED] writes:
  -  Index Scan using id_idx on tab1  (cost=0.00..8.27 rows=1 
width=4) (actual time=0.010..0.011 rows=1 loops=1)

Index Cond: (id = 10)
  -  Index Scan using out_id_idx on tab_outer  (cost=0.00..8.83 
rows=1 width=8) (actual time=8.590..11.924 rows=2953 loops=1)
Index Cond: ((tab_outer.out_id = 10) AND (tab1.id = 
tab_outer.out_id))


This seems a bit broken :-( ... ideally it shouldn't be generating the
redundant index condition, either.  


Why is the index condition redundant? I guess the redundant part is

AND (tab1.id = tab_outer.out_id) but I'm not sure, and I would really 
appreciate if you could explain a little bit more...

The reason I'm asking is because I'm looking at the implementation of some VERY basic 
form of progress indicator based on estimated rows vs rows already processed.
I guess this would work for our DB because the planner is usually very good at 
guessing the number of rows for each step of the plan, since our distributions 
are pretty simple.

But this left outer join thing I'm afraid would invalidate the method, since 
the inner table row count is always off...



Thank you for the reply.

(I added hackers too, since this seems a place where some 
fixing/development would be necessary. I hope it's ok.)





--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general