Ben --

Hopefully the windage below is not too inaccurate and sheds some light. You 
might also ask this on PostgreSQL's own performance or general mailing lists.

I'm not a fan of fetishes, so my quick&dirty answer is: Yes.

Unpacked a bit:

There is overhead to maintaining an index, especially when the indexed values 
are long -- text strings, for example, or decimals with lots of digits.

When PostgreSQL does any form of lookup -- basically any time there is a WHERE 
clause -- it will try to use an index if one exists and if the planner thinks 
the number of rows likely to be returned is small. If there are lots of tuples 
to be processed, a sequential scan will be used instead, since it is faster 
than the potential two reads for each indexed item (one to the index and one 
to fetch the data).
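
A quick way to see which choice the planner makes is EXPLAIN; the table and 
column names here are invented for illustration:

    -- hypothetical table "parcels" with an index on owner_id
    EXPLAIN SELECT * FROM parcels WHERE owner_id = 42;
    -- -> Index Scan using parcels_owner_id_idx on parcels ...
    EXPLAIN SELECT * FROM parcels WHERE owner_id > 0;
    -- -> Seq Scan on parcels ... (most rows match, so the index loses)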

Remember that FKs do not need to have indexes, but it almost always helps to 
have one: a delete of a single row from the parent table would otherwise need 
to scan all rows of the referencing table to make sure that the delete was 
allowed. With an index that check is likely to be much faster.
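
Something along these lines (table and column names are placeholders):

    -- child.parent_id references parent(id); PostgreSQL does not
    -- index the referencing column automatically
    CREATE INDEX child_parent_id_idx ON child (parent_id);
    -- DELETE FROM parent WHERE id = 1; can now verify via the
    -- index that no child rows still reference the deleted row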

The GiST indexes that PostGIS uses do take a while to build, especially on 
large tables, but they can pare spatial queries down to manageable numbers of 
rows; without them even trivial operations would take forever.
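
For a geometry column that looks something like this (again, illustrative 
names, assuming the column is stored in SRID 4326):

    -- build a GiST index on the geometry column of a "roads" table
    CREATE INDEX roads_geom_gist ON roads USING GIST (geom);
    -- bounding-box searches (&&) can then use the index:
    SELECT count(*) FROM roads
     WHERE geom && ST_GeomFromText('POLYGON((0 0,0 10,10 10,10 0,0 0))', 4326);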

You can monitor index usage via the system statistics views; indexes that 
aren't used are excellent candidates for elimination.
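
For example, pg_stat_user_indexes keeps a scan count per index:

    -- indexes never scanned since statistics were last reset
    SELECT schemaname, relname, indexrelname, idx_scan
      FROM pg_stat_user_indexes
     WHERE idx_scan = 0
     ORDER BY relname;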


I don't have much experience with them, but I have heard that some data 
warehousing schemes don't do much indexing, since most operations are going to 
wade through massive numbers of rows anyway. OLTP applications, OTOH, almost 
always benefit from indexing.

As a practical matter, it can be faster to drop indexes, do massive data 
loads/deletes, and then rebuild the indexes -- this has an impact on user 
access, but it can lead to more efficient indexes if the data is presorted, 
and they'll tend to be more clustered on disk, so more of them are likely to 
be found in RAM. I've seen occasional slowdowns from doing lots of single-row 
transactions, which result in the periodic need to rebalance a B-tree index in 
Informix, for example.
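
A rough sketch of that pattern (index, table, and file names are placeholders):

    DROP INDEX roads_geom_gist;            -- out of the way during the load
    COPY roads FROM '/tmp/roads.csv' CSV;  -- bulk load, no index maintenance
    CREATE INDEX roads_geom_gist ON roads USING GIST (geom);
    ANALYZE roads;                         -- refresh planner statistics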

Indexes do require extra disk/RAM space and CPU time to create and maintain, 
so they shouldn't just be thrown on willy-nilly. Think about the application 
and its data -- 1,000,000 rows with 20 distinct values is a terrible candidate 
for indexing; a million rows with a hundred thousand distinct values would 
probably be a win, if the values are used in queries. Ceteris paribus.
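
You can check the planner's distinct-value estimate for a column before 
deciding (table name is hypothetical):

    -- run ANALYZE first so the estimates are fresh;
    -- negative n_distinct means a fraction of the row count
    SELECT attname, n_distinct
      FROM pg_stats
     WHERE tablename = 'parcels';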

HTH,

Greg Williamson


----- Original Message ----
From: Ben Brehmer <[email protected]>
To: [email protected]
Sent: Friday, March 20, 2009 2:51:19 PM
Subject: [postgis-users] Is Index fetish bad?

Hi All,

In what kind of scenarios should one not create an index on a table?

As I understand it, indexes can only be beneficial for SELECT statements -- 
unless of course the query planner inadvertently creates an inefficient query 
plan. I'm wondering how INSERTs, DELETEs, and UPDATEs are affected by the 
overhead incurred by index updates. Maybe the updating of the index is 
negligible in comparison to the actual INSERT, UPDATE or DELETE?

Is there any performance difference if an index is being updated for a thousand 
row table versus a 20 million row table?

Just wondering,

Ben
_______________________________________________
postgis-users mailing list
[email protected]
http://postgis.refractions.net/mailman/listinfo/postgis-users



      
