On Sun, 2 Sep 2012, Pavel Stehule wrote:
statistics on data_objid_idx table are absolutly out - so planner
cannot find optimal plan
That doesn't have anything to do with the problem, AFAIU.
First, the data table is static and was analysed.
Second, the query in question is the join, and afaik the estimation of the
number of rows is known to be incorrect, in the case of column
correlation.
Third, according at least to my understanding in the fully cached regime
bitmap scan should not take two orders of magnitude more CPU time than
index scan.
Sergey
Regard
Pavel Stehule
Index Cond: (objid = t.objid)
Total runtime: 66622.026 ms
(11 rows)
Here is the output when bitmap scans are disabled:
QUERY PLAN
QUERY
PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..329631941.65 rows=10000 width=68) (actual
time=0.082..906.876 rows=10000 loops=1)
-> Nested Loop (cost=0.00..4979486036.95 rows=151062 width=68) (actual
time=0.081..905.683 rows=10000 loops=1)
Join Filter: (t.mjd = d1.mjd)
-> Seq Scan on test t (cost=0.00..2632.77 rows=151677 width=28)
(actual time=0.009..1.679 rows=11456 loops=1)
-> Index Scan using data_objid_idx on data d1
(cost=0.00..26603.32 rows=415080 width=40) (actual time=0.010..0.050
rows=248 loops=11456)
Index Cond: (objid = t.objid)
Total runtime: 907.462 ms
When the bitmap scans are enabled the "prof" of postgres shows
47.10% postmaster postgres [.] hash_search_with_hash_value
|
--- hash_search_with_hash_value
11.06% postmaster postgres [.] hash_seq_search
|
--- hash_seq_search
6.95% postmaster postgres [.] hash_any
|
--- hash_any
5.17% postmaster postgres [.] _bt_checkkeys
|
--- _bt_checkkeys
4.07% postmaster postgres [.] tbm_add_tuples
|
--- tbm_add_tuples
3.41% postmaster postgres [.] hash_search
|
--- hash_search
And the last note is that the crts.data table which is being bitmap scanned
is a 1.1Tb table with ~ 20e9 rows. My feeling is that the bitmap index scan
code
is somehow unprepared to combine two bitmaps for such a big table, and this
leads to the terrible performance.
Regards,
Sergey
PS Here are the schemas of the tables, just in case:
wsdb=> \d test
Table "koposov.test"
Column | Type | Modifiers
---------+------------------+-----------
mjd | double precision |
fieldid | bigint |
intmag | integer |
objid | bigint |
wsdb=> \d crts.data
Table "crts.data"
Column | Type | Modifiers
--------+------------------+-----------
objid | bigint |
mjd | double precision |
mag | real |
emag | real |
ra | double precision |
dec | double precision |
Indexes:
"data_mjd_idx" btree (mjd) WITH (fillfactor=100)
"data_objid_idx" btree (objid) WITH (fillfactor=100)
"data_q3c_ang2ipix_idx" btree (q3c_ang2ipix(ra, "dec")) WITH
(fillfactor=100)
PPS shared_buffers=10GB, work_mem=1GB
All the test shown here were don in fully cached regime.
PPS I can believe that what I'm seeing is a feature, not a bug of bitmap
scans,
and I can live with disabling them, but I still thought it's worth
reporting.
*****************************************************
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
*****************************************************
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers