On Wednesday 23 July 2003 01:40, Reece Hart wrote: > I'm trying to update a table but it's taking a very long time. I would > appreciate any tips folks may have about ways to speed it up. [snip] > paprospect2 contains ~40M rows. The goal now is to migrate the data to > the supertable-inherited column with > > update paprospect2 set run_id_new=run_id; > > > The update's been running for 5 hours (unloaded dual 2.4 GHz Xeon w/2GB > RAM, SCSI160 10K drive). There are no other jobs running. Load is ~1.2 > and the update's using ~3-5% of the CPU. [snip] > This suggests that the update is I/O bound (duh) and vmstat supports > this: [snip] > Presumably the large number of blocks written (bo) versus blocks read > (bi) reflects an enormous amount of bookkeeping that has to be done for > MVCC, logging, perhaps rewriting a row for the new definition (a guess > -- I don't know how this is handled), indicies, etc. There's no swapping > and no processes are waiting. In short, it seems that this is ENTIRELY > an I/O issue. Obviously, faster drives will help (but probably only by > small factor). > > Any ideas how I might speed this up? Presumably this is all getting > wrapped in a transaction -- does that hurt me for such a large update?
Well, it needs to keep enought bookkeeping to be able to rollback the whole transaction if it encounters a problem, or 40M rows in your case. Looks like you're right and it's an I/O issue. I must admit, I'm a bit puzzled that your CPU is quite so low, but I suppose you've got two fast CPUs so it shouldn't be high. [note the following is more speculation than experience] What might be happening is that the drive is spending all its time seeking between the WAL, index and table as it updates. I would also tend to be suspicious of the foreign keys - PG might be re-checking these, and obviously that would take time too. What you might want to try in future: 1. begin transaction 2. drop indexes, foreign keys 3. update table 4. vacuum it 5. recreate indexes, foreign keys etc 6. commit Now that's just moving the index updating/fk stuff to the end of the task, but it does seem to help sometimes. HTH -- Richard Huxton Archonet Ltd ---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html