Re: [PERFORM] difference in plan between 8.0 and 8.1?

2005-08-26 Thread Michael Fuhr
On Fri, Aug 26, 2005 at 10:45:07AM -0400, Alan Stange wrote:
 -  Bitmap Heap Scan on tbltimeseries (cost=48714.09..1331000.42 
 rows=10854026 width=8)
   Recheck Cond: (timeseriesid  0)
   -  Bitmap Index Scan on idx_timeseris (cost=0.00..48714.09 
 rows=10854026 width=0)
 Index Cond: (timeseriesid  0)
 
 I'm hoping someone can explain the new query plan (as I'm not sure I 
 understand what it is doing).

Search for bitmap in the 8.1 Release Notes:

http://developer.postgresql.org/docs/postgres/release.html#RELEASE-8-1

You could probably find more detailed discussion in the pgsql-hackers
archives.

-- 
Michael Fuhr

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] difference in plan between 8.0 and 8.1?

2005-08-26 Thread Tom Lane
Alan Stange [EMAIL PROTECTED] writes:
  Unique  (cost=2717137.08..2771407.21 rows=10854026 width=8)
-  Sort  (cost=2717137.08..2744272.14 rows=10854026 width=8)
  Sort Key: timeseriesid
  -  Bitmap Heap Scan on tbltimeseries  
 (cost=48714.09..1331000.42 rows=10854026 width=8)
Recheck Cond: (timeseriesid  0)
-  Bitmap Index Scan on idx_timeseris  
 (cost=0.00..48714.09 rows=10854026 width=0)
  Index Cond: (timeseriesid  0)
 (7 rows)

 I'm hoping someone can explain the new query plan (as I'm not sure I 
 understand what it is doing).

The index scan is reading the index to find out which heap tuple IDs
(TIDs) the index says meet the condition.  It returns a bitmap of the
tuple locations (actually, an array of per-page bitmaps).  The heap
scan goes and fetches the tuples from the table, working in TID order
to avoid re-reading the same page many times, as can happen for ordinary
index scans.  Since the result isn't sorted, we have to do a sort to get
it into the correct order for the Unique step.

Because it avoids random access to the heap, this plan can be a lot
faster than a regular index scan.  I'm not sure at all that 8.1 is
doing good relative cost estimation yet, though.  It would be
interesting to see EXPLAIN ANALYZE results for both ways.  (You can
use enable_bitmapscan and enable_indexscan to force the planner to pick
the plan it thinks is slower.)

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] difference in plan between 8.0 and 8.1?

2005-08-26 Thread Merlin Moncure
 Hello all,
 
 I was hoping someone could explain the plan for a statement.
 
 We have a table with a column of longs being used as an index.  The
 query plan in 8.0 was like this:
 
 # explain select distinct timeseriesid from tbltimeseries where
 timeseriesid  0 order by timeseriesid;

I had the same problem.  You probably already have seq scan turned off,
or the server would be using that.  You may have to turn bitmap off or
rework you query such that the server will use the index.  (between?).

Anyways, distinct is code word for 'bad performance' :).  Consider
laying out tables such that it not necessary, for example set up table
with RI link.  Then you can do this in zero time.

Good luck!

Merlin

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] difference in plan between 8.0 and 8.1?

2005-08-26 Thread Alan Stange

Tom Lane wrote:

Alan Stange [EMAIL PROTECTED] writes:
  

 Unique  (cost=2717137.08..2771407.21 rows=10854026 width=8)
   -  Sort  (cost=2717137.08..2744272.14 rows=10854026 width=8)
 Sort Key: timeseriesid
 -  Bitmap Heap Scan on tbltimeseries  
(cost=48714.09..1331000.42 rows=10854026 width=8)

   Recheck Cond: (timeseriesid  0)
   -  Bitmap Index Scan on idx_timeseris  
(cost=0.00..48714.09 rows=10854026 width=0)

 Index Cond: (timeseriesid  0)
(7 rows)



  
I'm hoping someone can explain the new query plan (as I'm not sure I 
understand what it is doing).



The index scan is reading the index to find out which heap tuple IDs
(TIDs) the index says meet the condition.  It returns a bitmap of the
tuple locations (actually, an array of per-page bitmaps).  The heap
scan goes and fetches the tuples from the table, working in TID order
to avoid re-reading the same page many times, as can happen for ordinary
index scans.  Since the result isn't sorted, we have to do a sort to get
it into the correct order for the Unique step.

Because it avoids random access to the heap, this plan can be a lot
faster than a regular index scan.  I'm not sure at all that 8.1 is
doing good relative cost estimation yet, though.  It would be
interesting to see EXPLAIN ANALYZE results for both ways.  (You can
use enable_bitmapscan and enable_indexscan to force the planner to pick
the plan it thinks is slower.)
Just to be clear.  The index is on the timeseriesid column.   Also, We 
usually have the where clause with some non-zero number.


Anyway, here's the basic query, with variations added on belowe:

fiasco=# explain analyze select timeseriesid from tbltimeseries where 
timeseriesid  0;
  QUERY 
PLAN


Bitmap Heap Scan on tbltimeseries  (cost=48906.82..1332935.19 
rows=10905949 width=8) (actual time=16476.337..787480.979 rows=10907853 
loops=1)

  Recheck Cond: (timeseriesid  0)
  -  Bitmap Index Scan on idx_timeseris  (cost=0.00..48906.82 
rows=10905949 width=0) (actual time=16443.585..16443.585 rows=10907853 
loops=1)

Index Cond: (timeseriesid  0)
Total runtime: 791340.341 ms
(5 rows)



Now add the order:

fiasco=# explain analyze select timeseriesid from tbltimeseries where 
timeseriesid  0 order by timeseriesid;
 
QUERY PLAN

--
Sort  (cost=2726087.93..2753352.81 rows=10905949 width=8) (actual 
time=821090.666..826353.054 rows=10913868 loops=1)

  Sort Key: timeseriesid
  -  Bitmap Heap Scan on tbltimeseries  (cost=48912.82..1332941.19 
rows=10905949 width=8) (actual time=16353.921..757075.349 rows=10913868 
loops=1)

Recheck Cond: (timeseriesid  0)
-  Bitmap Index Scan on idx_timeseris  (cost=0.00..48912.82 
rows=10905949 width=0) (actual time=16335.239..16335.239 rows=10913868 
loops=1)

  Index Cond: (timeseriesid  0)
Total runtime: 830829.145 ms
(7 rows)




and the distinct:

fiasco=# explain analyze select distinct timeseriesid from tbltimeseries 
where timeseriesid  0 order by timeseriesid;

QUERY PLAN


Unique  (cost=2726087.93..2780617.68 rows=10905949 width=8) (actual 
time=816938.970..831119.423 rows=10913868 loops=1)
  -  Sort  (cost=2726087.93..2753352.81 rows=10905949 width=8) (actual 
time=816938.967..822298.802 rows=10913868 loops=1)

Sort Key: timeseriesid
-  Bitmap Heap Scan on tbltimeseries  
(cost=48912.82..1332941.19 rows=10905949 width=8) (actual 
time=15866.736..752851.006 rows=10913868 loops=1)

  Recheck Cond: (timeseriesid  0)
  -  Bitmap Index Scan on idx_timeseris  
(cost=0.00..48912.82 rows=10905949 width=0) (actual 
time=15852.652..15852.652 rows=10913868 loops=1)

Index Cond: (timeseriesid  0)
Total runtime: 835558.312 ms
(8 rows)




Now the usual query from 8.0:

fiasco=# set enable_bitmapscan=false; explain analyze select distinct 
timeseriesid from tbltimeseries where timeseriesid  0 order by 
timeseriesid;

SET
 
QUERY PLAN

--
Unique  (cost=0.00..14971276.10 rows=10905949 width=8) (actual 
time=24.930..999645.638 rows=10913868 loops=1)
  -  Index Scan using