Sorry I haven't had a chance to reply to this sooner.
The vacuum delay stuff that you're working on may help, but I can't really believe it's your salvation if this is happening after only a few minutes. No matter how much you're doing inside those functions, you surely can't be causing so many dead tuples that a vacuum is necessary that soon. Did you try not vacuuming for a little while to see if it helps?
I discussed it later in the thread, but we're adding about 400K rows per hour and deleting most of them after processing (note this is a commercial app, written and maintained by another department -- I can recommend changes, but this late into their release cycle they are very reluctant to change the app). This is 7 x 24 data collection from equipment, so there is no "slow" time to use as a maintenance window.
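As a rough back-of-the-envelope sketch of what that load means in dead tuples: the 400K rows per hour is the figure above, but the 90% delete fraction is purely an assumed stand-in for "most", and the function name is hypothetical.

```python
ROWS_PER_HOUR = 400_000   # insert rate quoted above
DELETE_FRACTION = 0.9     # ASSUMPTION: placeholder for "most"; not a measured value


def dead_tuples(minutes, rows_per_hour=ROWS_PER_HOUR,
                delete_fraction=DELETE_FRACTION):
    """Estimate dead tuples created over `minutes` of steady insert/delete load."""
    return rows_per_hour / 60 * minutes * delete_fraction


rows_per_minute = ROWS_PER_HOUR / 60      # rows inserted each minute
per_five_minutes = dead_tuples(5)         # dead tuples after five minutes
per_day = dead_tuples(24 * 60)            # dead tuples after a full day
```

Under that assumed fraction, this comes to roughly 30,000 dead tuples every five minutes and several million per day, with no quiet period in which to clean them up.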
But since the server in question is a test machine, I was able to shut everything off long enough to do a full vacuum -- it took about 12 hours.
I didn't see it anywhere in this thread, but are you quite sure that you're not swapping? Note that vmstat on multiprocessor Solaris machines is notoriously not very useful. You may want to have a look at what the example stuff in the SE Toolkit tells you, or what you get from sar. I believe you have to use a special kernel setting on Solaris to mark shared memory as being ineligible for swap.
I'm (reasonably) sure there is no swapping. Minimum free memory (from top) is about 800 MB, and "vmstat -S" shows no swap-in or swap-out.
I've been playing with a version of Jan's performance patch in the past few hours. Based on my simulations, it appears that a 1 ms delay every 10 pages is just about right. The performance hit is negligible (based on overall test time, and cpu % used by the vacuum process). I still have a bit more analysis to do, but this is looking pretty good. More later...
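To make the delay scheme concrete, here is a minimal sketch of "sleep 1 ms after every 10 pages". It is not Jan's patch (which is C code inside the backend); the function name, the page iterable, and the injectable sleep parameter are all hypothetical and exist only for illustration.

```python
import time


def throttled_scan(pages, process_page, pages_per_delay=10,
                   delay_s=0.001, sleep=time.sleep):
    """Process each page, pausing delay_s seconds after every
    pages_per_delay pages. Returns the number of pauses taken."""
    naps = 0
    for count, page in enumerate(pages, start=1):
        process_page(page)
        # Yield the disk and CPU to other backends at a fixed page interval.
        if count % pages_per_delay == 0:
            sleep(delay_s)
            naps += 1
    return naps
```

With these settings the added sleep time is at least (pages / 10) ms, so a 1,000,000-page table would add at least about 100 seconds to the vacuum. Actual sleeps are often longer than requested because of timer granularity, so the real overhead needs to be measured rather than computed.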