On 12 Sep 2005 23:07:49 -0400, Greg Stark <[EMAIL PROTECTED]> wrote:

> The WAL parameters like commit_delay and commit_siblings are a bit of a
> mystery. Nobody has done any extensive testing of them. It would be quite
> helpful if you find anything conclusive and post it. It would also be
> surprising if they had a very large effect. They almost got chopped recently
> because they weren't believed to be useful.
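For reference, the two parameters under discussion live in postgresql.conf. The values below are purely illustrative, not recommendations:

```
# postgresql.conf -- illustrative values only, not recommendations
commit_delay = 100        # microseconds to wait before flushing WAL at commit
commit_siblings = 5       # apply the delay only if at least this many other
                          # transactions are currently active
```

Note that commit_delay is specified in microseconds, so small-looking values are very short delays.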

> You might also ponder whether you really need to be issuing a commit for
> every datum. If you only need to commit periodically you can get much better
> throughput. I suppose that's the same as commit_siblings. It would be
> interesting to know if you can get those parameters to perform as well as
> batching up records.

Ideally I'd like to commit the data separately, since the data can contain errors that abort the transaction, but it may come down to batching it and coding things such that I can catch the error, discard the offending row, and retry the transaction when it fails (which I'd hope is fairly rare).  I was hoping that the commit_delay/commit_siblings stuff would let me maintain simple per-row transaction failure isolation while giving some of the benefits of batching, as you've said.  I have seen performance gains with it set at 100ms and 3-6 siblings with 8 backends running, but I haven't been able to tune these values extensively; they were mostly random guesses that seemed to work.

My cycles of performance testing take a while, at least a day or two per change being tested, and even then the differences can be hard to see due to variability in the testing load (it's not really a static test load, but a window on reality).  On top of that, given the time it takes, I've succumbed more than once to the temptation of tweaking more than one thing per performance run, which really muddies the results.
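The fallback strategy described above (batch the inserts, and when a batch aborts, replay it row by row to isolate and discard the offending row) can be sketched generically. This is a minimal illustration, not tied to any particular driver: the `insert`, `commit`, and `rollback` callables are hypothetical stand-ins for whatever the database layer actually provides.

```python
def load_batched(rows, insert, commit, rollback, batch_size=100):
    """Insert rows in batches; when a batch fails, retry its rows one at a
    time so a single bad row is discarded instead of aborting the whole load.
    The insert/commit/rollback callables are hypothetical stand-ins for
    real database calls. Returns the list of rows that could not be loaded."""
    failed = []

    def flush(batch):
        try:
            for row in batch:
                insert(row)
            commit()
        except Exception:
            rollback()
            # Replay the batch row by row to isolate the offending row(s).
            for row in batch:
                try:
                    insert(row)
                    commit()
                except Exception:
                    rollback()
                    failed.append(row)

    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            flush(batch)
            batch = []
    if batch:
        flush(batch)
    return failed
```

The trade-off is that a clean batch costs one commit for `batch_size` rows, while a batch containing a bad row degenerates to one commit per row for that batch only, which is acceptable when failures are rare.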

>> Increasing shared_buffers seems to always help, even out to half of the dev
>> box's ram (2G).

> Half should be a pessimal setting. It means virtually everything is buffered
> twice, once in the kernel and once in Postgres, effectively halving your
> memory. If that's really helping, try raising it even further, to something
> like 90% of your memory. But the conventional dogma is that shared buffers
> should never be more than about 10k no matter how much memory you have. Let
> the kernel do the bulk of the buffering.

> That said, it's even more mysterious in your situation. Why would a large
> shared buffers help insert-only performance much at all? My guess is that
> it's letting the vacuums complete quicker. Perhaps try raising work_mem?
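For concreteness, the knobs involved look roughly like this in postgresql.conf on 8.x-era servers, where shared_buffers is counted in 8 kB pages and work_mem in kB. The values are illustrative only, not recommendations:

```
# postgresql.conf -- illustrative values only, not recommendations
shared_buffers = 262144   # number of 8 kB buffers (~2 GB here)
work_mem = 16384          # per-sort/hash memory in kB (~16 MB)
```

Note the units: a bare number for shared_buffers is buffers, not bytes, so it is easy to set this far smaller (or larger) than intended.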

I find it odd as well.  After reading the standard advice on shared_buffers, I had only intended to raise it slightly.  But seeing ever-increasing performance gains, I just kept tuning it upwards all the way to the 2G limit, and saw noticeable gains every time.  During at least some of the test cycles there was no deleting or vacuuming going on, just insert traffic.  My guess was that shared_buffers' caching strategy must be much more effective than the OS cache at something or other, but I don't know what exactly.  The only important read traffic that comes to mind is the index, which is both constantly updated and constantly checked for primary-key uniqueness violations.

All of these replies here on the list (and a private one or two) have given me a long list of things to try, and I feel confident that at least some of them will gain me enough performance to comfortably deploy this application in the end on somewhat reasonable hardware.  Thanks to everyone here on the list for all the suggestions; they have been very helpful in pointing me in directions I hadn't thought of before.

When I finally get all of this sorted out and working reasonably optimally, I'll be sure to come back and report what techniques/settings did and didn't work for this workload.

-- Brandon
