I am currently trying to separate two environments that share one database. Essentially I need to break that single database into two, with a portion of the data going to each new database. My plan is to duplicate the database and then strip out of each copy the data it does not need. I have started by trying to delete data from a set of 28 related tables, but performance is terrible. I am deleting from a table called document, which cascades down to 27 tables beneath it, linked by various cascading foreign key constraints. Some of these subsidiary tables have as many as a couple of million records.

Before executing the DELETE on document I tried setting all constraints to deferred within a transaction, but this does not seem to have helped.
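For reference, what I tried was roughly the following (the table and column names here are simplified stand-ins; my real schema has 28 tables, and the `env_id` column is just an illustration of how I pick the rows to remove):

```sql
BEGIN;
-- Only constraints declared DEFERRABLE are affected by this.
SET CONSTRAINTS ALL DEFERRED;
DELETE FROM document WHERE env_id = 2;
COMMIT;
```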

I can't work out whether the indexes on these tables are a help or a hindrance. Presumably any index on the foreign key columns should help, as long as PostgreSQL will actually use it; but given that large numbers of records are being deleted, the planner may decide to do a sequential scan instead. EXPLAIN doesn't show me what happens past the DELETE FROM document itself, i.e. whether indexes are used when the delete cascades. The downside of the indexes is that they have to be maintained, which could be a lot of work during a large-scale deletion.
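If missing indexes on the referencing side turn out to be the problem, I assume fixing that would look something like this (child table and column names are illustrative, one statement per child table):

```sql
-- PostgreSQL does not automatically index the referencing column
-- of a foreign key; without one, each cascaded delete from
-- document may force a scan of the child table.
CREATE INDEX idx_child_table_document_id ON child_table (document_id);
ANALYZE child_table;
```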

What I fear is that for every row deleted from document, the database visits every subsidiary table to delete the data related to that one row before returning to document for the next row. That would mean each table is visited many times, and if that is how it works, the large tables are going to be a real problem. The most efficient approach would be to delete all the document records first, then, with that list of documents in hand, move on to the next table and delete all related records there, so that each table is visited only once. I was hoping that setting constraints deferred would achieve this.
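In other words, the plan I would like the database to follow is something I could also do by hand, along these lines (names again illustrative; the real schema has 27 child tables, so the pattern repeats):

```sql
BEGIN;
-- One pass per child table, then the parent, so each table
-- is touched exactly once.
DELETE FROM child_table_1
 WHERE document_id IN (SELECT id FROM document WHERE env_id = 2);
DELETE FROM child_table_2
 WHERE document_id IN (SELECT id FROM document WHERE env_id = 2);
-- ... repeat for the remaining child tables ...
DELETE FROM document WHERE env_id = 2;
COMMIT;
```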

Can anyone advise me on how PostgreSQL (v8.0.3 on Mac OS X 10.3) processes a DELETE statement and what strategy it uses to remove the data?
Can I specify something like "UNRECOVERABLE" so that it doesn't write WAL (redo)?
Are there any indicators I can use to tell which part of the delete is taking so much time?
And can anyone suggest anything else I can do to speed things up?
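The only way of timing this I have thought of so far is to run the delete with EXPLAIN ANALYZE inside a transaction and then roll it back, e.g.:

```sql
BEGIN;
-- env_id is an illustrative column; the ROLLBACK discards the
-- deletions once the timings have been read.
EXPLAIN ANALYZE DELETE FROM document WHERE env_id = 2;
ROLLBACK;
```

But I don't know whether, on 8.0, the output will attribute any of the time to the cascaded foreign key triggers, or just to the top-level delete.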

Or perhaps it simply is a lot of work and there is no way around it. My fallback option is to SELECT the data I do need rather than DELETE the data I don't, but that route means I cannot make use of the cascading foreign keys.
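What I mean by the SELECT fallback is something along these lines (again with illustrative names):

```sql
-- Copy only the rows to keep into fresh tables, then drop the
-- originals and rename.  The foreign keys do none of the work here:
-- every child table needs its own join, and all constraints and
-- indexes have to be re-created by hand afterwards.
CREATE TABLE document_keep AS
  SELECT * FROM document WHERE env_id = 1;
CREATE TABLE child_table_keep AS
  SELECT c.* FROM child_table c
  JOIN document d ON d.id = c.document_id
  WHERE d.env_id = 1;
-- ... repeat for each child table ...
```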

Regards,
Alex Stanier.
