Paul,
Well I re-read the PostGIS documentation - thank you very much -
then I
tried the following:
gisdb=# SET enable_seqscan TO off;
SET
gisdb=# explain analyze SELECT
cfcc::text,fename::text,asbinary(force_collection(force_2d
(wkb_geometry)),'NDR'),ogc_fid::text
from completechain WHERE wkb_geometry &&
setSRID('BOX3D(-90.6004589271402 38.4732965951534,-89.9998214028184
38.8267482378407)'::BOX3D,find_srid
('','completechain','wkb_geometry') );
QUERY
PLAN
--------------------------------------------------------------------
--------------------------------------------------------------------
--------------------------------------------------------------------
------------------------------------
Index Scan using idx_completechain_wkb_geometry on completechain
(cost=0.00..364816.25 rows=90839 width=186) (actual
time=37.049..12532.636 rows=103159 loops=1)
Index Cond: (wkb_geometry &&
'0103000020FF7F0000010000000500000023AA47EB6DA656C0FABE9AFB943C43402
3AA47EB6DA656C095C6E1E2D269434089BCE812FD7F56C095C6E1E2D269434089BCE
812FD7F56C0FABE9AFB943C434023AA47EB6DA656C0FABE9AFB943C4340'::geomet
ry)
Total runtime: 12891.420 ms
(3 rows)
gisdb=# explain analyze SELECT
cfcc::text,fename::text,asbinary(force_collection(force_2d
(wkb_geometry)),'NDR'),ogc_fid::text
from completechain WHERE wkb_geometry &&
setSRID('BOX3D(-90.6004589271402 38.4732965951534,-89.9998214028184
38.8267482378407)'::BOX3D,find_srid
('','completechain','wkb_geometry') );
QUERY
PLAN
--------------------------------------------------------------------
--------------------------------------------------------------------
--------------------------------------------------------------------
------------------------------------
Index Scan using idx_completechain_wkb_geometry on completechain
(cost=0.00..364816.25 rows=90839 width=186) (actual
time=11.988..5204.924 rows=103159 loops=1)
Index Cond: (wkb_geometry &&
'0103000020FF7F0000010000000500000023AA47EB6DA656C0FABE9AFB943C43402
3AA47EB6DA656C095C6E1E2D269434089BCE812FD7F56C095C6E1E2D269434089BCE
812FD7F56C0FABE9AFB943C434023AA47EB6DA656C0FABE9AFB943C4340'::geomet
ry)
Total runtime: 5497.049 ms
(3 rows)
gisdb=# explain analyze SELECT
cfcc::text,fename::text,asbinary(force_collection(force_2d
(wkb_geometry)),'NDR'),ogc_fid::text
from completechain WHERE wkb_geometry &&
setSRID('BOX3D(-90.6004589271402 38.4732965951534,-89.9998214028184
38.8267482378407)'::BOX3D,find_srid
('','completechain','wkb_geometry') );
QUERY
PLAN
--------------------------------------------------------------------
--------------------------------------------------------------------
--------------------------------------------------------------------
------------------------------------
Index Scan using idx_completechain_wkb_geometry on completechain
(cost=0.00..364816.25 rows=90839 width=186) (actual
time=0.088..1915.497
rows=103159 loops=1)
Index Cond: (wkb_geometry &&
'0103000020FF7F0000010000000500000023AA47EB6DA656C0FABE9AFB943C43402
3AA47EB6DA656C095C6E1E2D269434089BCE812FD7F56C095C6E1E2D269434089BCE
812FD7F56C0FABE9AFB943C434023AA47EB6DA656C0FABE9AFB943C4340'::geomet
ry)
Total runtime: 2205.876 ms
(3 rows)
Not only was the first run 1/5 the time of the sequential scan
run but
the runs kept getting better. No matter how many I run with
'enable_seqscan' on the performance never improves. This must be
something I don't understand about PostgreSQL caching.
Notice that the cost estimate was actually higher on the first
one, yet
the performance was much better. I'm still working my way through
this
PostgreSQL stuff so maybe I'm missing something. But what I see
leads me
to believe that the planner didn't do a very good job on this
one. This
box covers the St. Louis metropolitan area in a database that is
loaded
with everything in Missouri and Illinois. So I've got to think that
using an index would be better than looking at the whole table.
Finally, you are right that my real concern is mapserver. It doesn't
have to be lightning fast, but I would sure like to see it better
than
it is. And the reality is that I am only looking for major
highways. So
maybe there is a way to get mapserver to build a query that narrows
things down before it starts doing the geometry stuff.
I'm also thinking that PostgreSQL loads entire rows when it reads
pages
despite the fact that you may only want 1 or 2 columns. So
refactoring
the database might make a big difference in disk io. This is
Tiger data
and I only care about a small portion of it.
The table row count, by the way, is 4,203,356. I'm very
optimistic about
the possible performance given the results. I just have to find a
way to
get mapserver to capitalize on it.
Patrick
You are returning 103159 rows, which is a lot. Unless you have
100million records in your table, sequence scanning will probably
work faster then index scanning, for retrieving that many rows.
Try a
bounding box that only encompasses a few hundred rows. You
should see
index scans turn on then.
By "query" performance, do you mean SQL query, or mapserver "query
mode". Because the latter has some quirks that need some care to
avoid.
P
On 30-Aug-06, at 1:00 PM, P. S. Brannan wrote:
I just imported a bunch of tiger data into postgres. I am using
mapserver as the frontend. Everything is working fine except for
the performance. The query performance is terrible and I can't
seem
to get postgresql to use the index i created. Here's what's
going on:
Index Creation:
create index idx_completechain_wkb_geometry on completechain using
gist(wkb_geometry gist_geometry_ops);
vacuum analyze;
Vacuum Analyze:
explain analyze SELECT cfcc::text,fename::text,asbinary
(force_collection(force_2d(wkb_geometry)),'NDR'),ogc_fid::text
from
completechain
WHERE wkb_geometry && setSRID('BOX3D(-90.6004589271402
38.4732965951534,-89.9998214028184
38.8267482378407)'::BOX3D,find_srid
('','completechain','wkb_geometry') );
QUERY PLAN
------------------------------------------------------------------
----
------------------------------------------------------------------
----
------------------------------------------------------------------
----
--------------------------
Seq Scan on completechain (cost=0.00..275502.54 rows=90839
width=186) (actual time=10718.879..55686.238 rows=103159 loops=1)
Filter: (wkb_geometry &&
'0103000020FF7F0000010000000500000023AA47EB6DA656C0FABE9AFB943C434
023A
A47EB6DA656C095C6E1E2D269434089BCE812FD7F56C095C6E1E2D269434089BCE
812F
D7F56C0FABE9AFB943C434023AA47EB6DA656C0FABE9AFB943C4340'::geometry
)
Total runtime: 55973.870 ms
(3 rows)
Can anyone tell me what I need to do to get postgres to use the
index? I'm sure that I'm missing something obvious.
Thanks,
Patrick Brannan