Re: [PERFORM] Performance considerations for very heavy INSERT traffic

2005-09-21 Thread Vivek Khera
On Sep 12, 2005, at 6:02 PM, Brandon Black wrote:

 - using COPY instead of INSERT ?
 (should be easy to do from the aggregators)

 Possibly, although it would kill the current design of returning the
 database transaction status for a single client packet back to the client
 on transaction success/failure.  The aggregator could put several clients'
 data into a series of delayed multi-row copy statements.

Buffer through the file system on your aggregator.  Once you "commit" to a
local disk file, return to your client that you got the data.  Then insert
into the actual Postgres DB in large batches of inserts inside a single
Postgres transaction.

We have our web server log certain tracking requests to a local file.  With
file locks and append mode, it is extremely quick and has little contention
delay.  Then every so often, we lock the file, rename it, release the lock,
and process it at our leisure to do the inserts to Pg in one big transaction.

Vivek Khera, Ph.D.
+1-301-869-4449 x806
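As a minimal sketch of the batch step, assuming a hypothetical raw_data table
and a spool file the aggregator has just renamed, everything goes in under a
single COMMIT:

    BEGIN;
    -- one INSERT per line read back from the renamed spool file
    INSERT INTO raw_data (client_id, ts, val1) VALUES (101, '2005-09-12 18:02:07', 0.5);
    INSERT INTO raw_data (client_id, ts, val1) VALUES (102, '2005-09-12 18:02:11', 1.25);
    -- ... many more rows ...
    COMMIT;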

Re: [PERFORM] Performance considerations for very heavy INSERT traffic

2005-09-21 Thread Vivek Khera
On Sep 12, 2005, at 6:02 PM, Brandon Black wrote:

 - splitting the xlog and the data on distinct physical drives or arrays

 That would almost definitely help, I haven't tried it yet.  Speaking of the
 xlog, anyone know anything specific about the WAL tuning parameters for
 heavy concurrent write traffic?  What little I could dig up on WAL tuning
 was contradictory, and testing some random changes to the parameters hasn't
 been very conclusive yet.  I would imagine the WAL buffers stuff could
 potentially have a large effect for us.

You will want to make your pg_xlog RAID volume BIG, and then tell Postgres to
use that space: bump up checkpoint_segments (and suitably the checkpoint
timeout).  I run with 256 segments and a timeout of 5 minutes.  The timeout
reflects your expected crash recovery time, so adjust it wisely.

Also, you should consider how you split your drives across your RAID data
channels on your test machine: I put each pair of the RAID10 mirrors on
opposite channels, so both channels of my RAID controller are pretty evenly
loaded during writes.

Vivek Khera, Ph.D.
+1-301-869-4449 x806
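In postgresql.conf terms, Vivek's numbers would look roughly like this (the
wal_buffers line is just an additional guess worth testing for this kind of
write load, not something he specified):

    checkpoint_segments = 256   # pg_xlog can grow to ~(2 * 256 + 1) 16MB segments, ~8GB
    checkpoint_timeout  = 300   # seconds (5 minutes); reflects acceptable crash-recovery time
    #wal_buffers = 64           # untested guess; the default is quite small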

Re: [PERFORM] Performance considerations for very heavy INSERT traffic

2005-09-13 Thread Stephen Frost
* Brandon Black ([EMAIL PROTECTED]) wrote:
 Ideally I'd like to commit the data separately, as the data could contain 
 errors which abort the transaction, but it may come down to batching it and 
 coding things such that I can catch and discard the offending row and retry 
 the transaction if it fails (which should be fairly rare I would hope). I 

Don't really know if it'd be better, or worse, or what, but another
thought to throw out there is to perhaps use savepoints instead of full
transactions?  Not sure if that's more expensive or cheaper than doing
full commits but it might be something to consider.
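A minimal sketch of that, assuming a hypothetical raw_data table; each row
gets its own savepoint so one bad row can be discarded without aborting the
whole batch:

    BEGIN;
    SAVEPOINT row_a;
    INSERT INTO raw_data (client_id, ts, val1) VALUES (42, now(), 1.5);
    RELEASE SAVEPOINT row_a;
    SAVEPOINT row_b;
    INSERT INTO raw_data (client_id, ts, val1) VALUES (43, now(), 2.5);
    -- if this INSERT fails, issue ROLLBACK TO SAVEPOINT row_b and carry on;
    -- the rest of the batch still commits
    RELEASE SAVEPOINT row_b;
    COMMIT;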

 When I finally get all of this sorted out and working reasonably optimally, 
 I'll be sure to come back and report what techniques/settings did and didn't 
 work for this workload.

That'd be great, many thanks,

Stephen




Re: [PERFORM] Performance considerations for very heavy INSERT traffic

2005-09-13 Thread Brandon Black
On 9/12/05, Christopher Petrilli [EMAIL PROTECTED] wrote:
3) Use 8.1 and strongly look at Bizgres. The data partitioning is critical.

I started looking closer at my options for partitioning (inheritance,
union all), and at Bizgres today. Bizgres partitioning appears to be
basically the same kind of inheritance partitioning one can do in
mainline PostgreSQL. Am I correct in thinking that the main difference
is that they've coded support for enable_constraint_exclusion=true so
that the query planner can be more effective at taking advantage of the
partitioning when you've specified CHECK constraints on the child
tables? I may go for 8.1 instead in that case, as the main win I'm
looking for is that with inheritance I'll be doing inserts into smaller
tables instead of ones that grow to unmanageable sizes (and that I can
drop old child tables instead of delete/vacuum).
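For concreteness, a minimal sketch of that inheritance scheme (table, column,
and child names are invented), with the 8.1 constraint_exclusion setting
turned on so the planner can prune children by their CHECK constraints:

    CREATE TABLE raw_data (
        client_id  integer     NOT NULL,
        ts         timestamptz NOT NULL,
        val1       float8,
        PRIMARY KEY (client_id, ts)
    );

    -- one child per day; the aggregators insert into the current child
    CREATE TABLE raw_data_20050913 (
        CHECK (ts >= '2005-09-13' AND ts < '2005-09-14')
    ) INHERITS (raw_data);

    SET constraint_exclusion = on;   -- 8.1: planner skips children that can't match

    -- expiring old data becomes a cheap DROP instead of DELETE + VACUUM
    DROP TABLE raw_data_20050901;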




[PERFORM] Performance considerations for very heavy INSERT traffic

2005-09-12 Thread Brandon Black

I'm in the process of developing an application which uses PostgreSQL
for data storage. Our database traffic is very atypical, and as a
result it has been rather challenging to figure out how to best tune
PostgreSQL on what development hardware we have, as well as to figure
out exactly what we should be evaluating and eventually buying for
production hardware.

The vast, overwhelming majority of our database traffic is pretty much
a non-stop stream of INSERTs filling up tables. It is akin to
data acquisition. Several thousand clients are sending
once-per-minute updates full of timestamped numerical data at our
central server, which in turn performs INSERTs into several distinct
tables as part of the transaction for that client. We're talking
on the order of ~100 transactions per second, each containing INSERTs
to multiple tables (which contain only integer and floating point
columns and a timestamp column - the primary key (and only index) is on
a unique integer ID for the client and the timestamp). The
transaction load is spread evenly over time by having the clients send
their per-minute updates at random times rather than on the exact
minute mark.
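For illustration, one client's per-minute transaction might look something
like this (table and column names are invented; the real schema has only
integer, floating point, and timestamp columns as described):

    BEGIN;
    INSERT INTO metrics_a (client_id, ts, val1, val2) VALUES (1234, '2005-09-12 18:02:07', 0.5, 7);
    INSERT INTO metrics_b (client_id, ts, val1)       VALUES (1234, '2005-09-12 18:02:07', 42.0);
    COMMIT;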

There will of course be users using a web-based GUI to extract data
from these tables and display them in graphs and whatnot, but the
SELECT query traffic will always be considerably less frequent and
intensive than the incessant INSERTs, and it's not that big a deal if
the large queries take a little while to run.

This data also expires - rows with timestamps older than X days will be
DELETEd periodically (once an hour or faster), such that the tables
will reach a relatively stable size (pg_autovacuum is handling
vacuuming for now, but considering our case, we're thinking of killing
pg_autovacuum in favor of having the periodic DELETE process also do a
vacuum of affected tables right after the DELETE, and then have it
vacuum the other low traffic tables once a day while it's at it).
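The expiry pass being considered would amount to something like the following
per table (a made-up 30-day window and table name; note VACUUM has to run
outside the transaction block):

    BEGIN;
    DELETE FROM metrics_a WHERE ts < now() - interval '30 days';
    COMMIT;
    VACUUM ANALYZE metrics_a;   -- reclaim the dead rows right away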

There is an aggregation layer in place which proxies the inbound data
from the clients into a small(er) number of persistent postgresql
backend processes. Right now we're doing one aggregator per 128
clients (so instead of 128 separate database connections over the
course of a minute for a small transaction each, there is a single
database backend that is constantly committing transactions at a rate
of ~ 2/second). At a test load of ~1,000 clients, we would have 8
aggregators running and 8 postgresql backends. Testing has seemed
to indicate we should aggregate even harder - the planned production
load is ~5,000 clients initially, but will grow to almost double that
in the not-too-distant future, and that would mean ~40 backends at 128
clients each initially. Even on 8 cpus, I'm betting 40 concurrent
backends doing 2 tps is much worse off than 10 backends doing 8 tps.


Test hardware right now is a dual Opteron with 4G of ram, which we've
barely gotten 1,000 clients running against. Current disk hardware in
testing is whatever we could scrape together (4x 3-ware PCI hardware
RAID controllers, with 8 SATA drives in a RAID10 array off of each -
aggregated up in a 4-way stripe with linux md driver and then formatted
as ext3 with an appropriate stride parameter and data=writeback).
Production will hopefully be a 4-8-way Opteron, 16 or more G of RAM,
and a fiberchannel hardware raid array or two (~ 1TB available RAID10 storage) with 15krpm disks and
battery-backed write cache.

I know I haven't provided a whole lot of application-level detail here,
but does anyone have any general advice on tweaking postgresql to deal
with a very heavy load of concurrent and almost exclusively write-only
transactions? Increasing shared_buffers seems to always help,
even out to half of the dev box's ram (2G). A 100ms commit_delay
seemed to help, but tuning it (and _siblings) has been difficult.
We're using 8.0 with the default 8k blocksize, but are strongly
considering both developing against 8.1 (seems it might handle the
heavy concurrency better), and re-compiling with 32k blocksize since
our storage arrays will inevitably be using fairly wide stripes.
Any advice on any of this (other than drop the project while you're
still a little bit sane)?

--Brandon



Re: [PERFORM] Performance considerations for very heavy INSERT traffic

2005-09-12 Thread PFC




I know I haven't provided a whole lot of application-level detail here,


You did !

What about :

- using COPY instead of INSERT ?
(should be easy to do from the aggregators)

- using Bizgres ?   
(which was designed for your use case)

- splitting the xlog and the data on distinct physical drives or arrays

- benchmarking something else than ext3
(xfs ? reiser3 ?)



Re: [PERFORM] Performance considerations for very heavy INSERT traffic

2005-09-12 Thread Qingqing Zhou





  "Brandon Black" [EMAIL PROTECTED] wrote ...
  Increasing shared_buffers seems to always help, even out to half of 
  the dev box's ram (2G). 
  
  Though officially PG does not prefer huge 
  shared_buffers size, I did see several times thatperformancewas 
  boosted in case IO is the bottleneck. Also, if you want to use big bufferpool 
  setting, make sure your version has Tom's split BufMgrLock patch
  (http://archives.postgresql.org/pgsql-committers/2005-03/msg00025.php), 
  which might already in 8.0.x somewhere. And if you want to use bufferpool 
  bigger than 2G on 64-bit machine, you may need 8.1 (http://archives.postgresql.org/pgsql-committers/2005-08/msg00221.php).
  
  Regards,
  Qingqing


Re: [PERFORM] Performance considerations for very heavy INSERT traffic

2005-09-12 Thread Alex Turner
Split your system into multiple partitions of RAID 10s.  For max
performance, a ten-drive RAID 10 for pg_xlog (this will max out a
PCI-X bus) on Bus A, and multiple 4/6-drive RAID 10s for tablespaces on Bus
B.  I would recommend using one RAID 10 for raw data tables, one for
aggregate tables, and one for indexes.  More RAM will only help you with
queries against your data; if you are pre-aggregating, then you may not need
all that much RAM.
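A sketch of how that split could be expressed with 8.0 tablespaces (mount
points and names are invented; pg_xlog itself is usually relocated with a
symlink rather than a tablespace):

    -- one tablespace per RAID 10 array
    CREATE TABLESPACE raw_ts LOCATION '/arrays/raid10_raw';
    CREATE TABLESPACE agg_ts LOCATION '/arrays/raid10_agg';
    CREATE TABLESPACE idx_ts LOCATION '/arrays/raid10_idx';

    CREATE TABLE raw_data (
        client_id integer     NOT NULL,
        ts        timestamptz NOT NULL,
        val1      float8
    ) TABLESPACE raw_ts;

    CREATE INDEX raw_data_idx ON raw_data (client_id, ts) TABLESPACE idx_ts;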

You can easily get 100 transactions per second with even less hardware with a little data partitioning.

Choose your controller carefully as many don't co-operate with linux well.
Alex Turner
NetEconomist

Re: [PERFORM] Performance considerations for very heavy INSERT traffic

2005-09-12 Thread Brandon Black
On 9/12/05, PFC [EMAIL PROTECTED] wrote:

 I know I haven't provided a whole lot of application-level detail here,

You did !

What about :

- using COPY instead of INSERT ?
(should be easy to do from the aggregators)

Possibly, although it would kill the current design of returning the
database transaction status for a single client packet back to the
client on transaction success/failure.  The aggregator could put
several clients' data into a series of delayed multi-row copy
statements.
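A delayed multi-row batch from the aggregator might look roughly like this
(hypothetical table and data; the trade-off is that per-row status is no
longer reported back per client packet):

    COPY raw_data (client_id, ts, val1) FROM STDIN WITH DELIMITER '|';
    101|2005-09-12 18:02:07|0.5
    102|2005-09-12 18:02:11|1.25
    103|2005-09-12 18:02:13|2.0
    \.
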
- using Bizgres ?
(which was designed for your use case)

I only briefly scanned their About page, but they didn't sound
particularly suited to my case at the time (it sounded kinda buzzwordy
actually, which I suppose is great for business apps people :) ).
We're more of a sciency shop. I'll go look at the project in more
detail tonight in light of your recommendation.
- splitting the xlog and the data on distinct physical drives or arrays

That would almost definitely help, I haven't tried it yet.
Speaking of the xlog, anyone know anything specific about the WAL
tuning parameters for heavy concurrent write traffic? What little
I could dig up on WAL tuning was contradictory, and testing some random
changes to the parameters hasn't been very conclusive yet. I
would imagine the WAL buffers stuff could potentially have a large
effect for us.
- benchmarking something else than ext3 (xfs ? reiser3 ?)

We've had bad experiences under extreme and/or strange workloads with
XFS here in general, although this is the first major postgresql
project - the rest were with other applications writing to XFS.
Bad experiences like XFS filesystems detecting internal
inconsistencies at runtime and unmounting themselves from within the
kernel module (much to the dismay of applications with open files on
the filesystem), on machines with validated good hardware. It has
made me leery of using anything other than ext3 for fear of stability
problems. Reiser3 might be worth taking a look at though.

Thanks for the ideas,
-- Brandon


Re: [PERFORM] Performance considerations for very heavy INSERT traffic

2005-09-12 Thread Alan Stange

Brandon Black wrote:




On 9/12/05, *PFC* [EMAIL PROTECTED] wrote:




- benchmarking something else than ext3
(xfs ? reiser3 ?)


We've had bad experiences under extreme and/or strange workloads with 
XFS here in general, although this is the first major postgresql 
project - the rest were with other applications writing to XFS.  Bad 
experiences like XFS filesystems detecting internal inconsistencies 
at runtime and unmounting themselves from within the kernel module 
(much to the dismay of applications with open files on the 
filesystem), on machines with validated good hardware.  It has made me 
leery of using anything other than ext3 for fear of stability 
problems.  Reiser3 might be worth taking a look at though.


Just one tidbit.   We tried XFS on a very active system similar to what 
you describe.   Dual opterons, 8GB memory, fiber channel drives, 2.6 
kernel, etc.   And the reliability was awful.   We spent a lot of time 
making changes one at a time to try and isolate the cause; when we 
switched out from XFS to ReiserFS our stability problems went away.


It may be the case that the XFS problems have all been corrected in 
newer kernels, but I'm not going to put too much effort into trying that 
again.



I recently built a postgres with 32KB block sizes and have been doing 
some testing.  For our particular workloads it has been a win.


-- Alan



Re: [PERFORM] Performance considerations for very heavy INSERT traffic

2005-09-12 Thread Greg Stark

Brandon Black [EMAIL PROTECTED] writes:

 The vast, overwhelming majority of our database traffic is pretty much a 
 non-stop stream of INSERTs filling up tables. 

That part Postgres should handle pretty well. It should be pretty much limited
by your I/O bandwidth so big raid 1+0 arrays are ideal. Putting the WAL on a
dedicated array would also be critical.

The WAL parameters like commit_delay and commit_siblings are a bit of a
mystery. Nobody has done any extensive testing of them. It would be quite
helpful if you find anything conclusive and post it. It would also be
surprising if they had a very large effect. They almost got chopped recently
because they weren't believed to be useful.

You might also ponder whether you need to be issuing a commit for every datum.
If you only need to commit periodically you can get much better throughput. I
suppose that's the same as commit_siblings. It would be interesting to know if
you can get those parameters to perform as well as batching up records
yourself.

 There will of course be users using a web-based GUI to extract data from 
 these tables and display them in graphs and whatnot, but the SELECT query 
 traffic will always be considerably less frequent and intensive than the 
 incessant INSERTs, and it's not that big a deal if the large queries take a 
 little while to run.

I do fear these queries. Even if they mostly aren't terribly intensive, if
you're pushing the edges of your write I/O bandwidth then a single seek to
satisfy one of these selects could really hurt your throughput. 

That said, as long as your WAL is on a dedicated drive Postgres's architecture
should in theory be ideal and allow you do run these things with impunity. The
WAL is purely write-only and it's the only thing your inserts will be blocking
on.

 This data also expires - rows with timestamps older than X days will be 
 DELETEd periodically (once an hour or faster), such that the tables will 
 reach a relatively stable size (pg_autovacuum is handling vacuuming for now, 
 but considering our case, we're thinking of killing pg_autovacuum in favor 
 of having the periodic DELETE process also do a vacuum of affected tables 
 right after the DELETE, and then have it vacuum the other low traffic tables 
 once a day while it's at it).

Ay, there's the rub.

Taking this approach means you have vacuums running which have to scan your
entire table and your inserts are being sprayed all over the disk. 

An alternative you may consider is using partitioned tables. Then when you
want to expire old records you simply drop the oldest partition. Or in your
case you might rotate through the partitions, so you just truncate the oldest
one and start inserting into it.

Unfortunately there's no built-in support for partitioned tables in Postgres.
You get to roll your own using UNION ALL or using inherited tables. Various
people have described success with each option though they both have
downsides too.

Using partitioned tables you would never need to delete any records except for
when you delete all of them. So you would never need to run vacuum except on
newly empty partitions. That avoids having to scan through all those pages
that you know will never have holes. If you use TRUNCATE (or VACUUM FULL or
CLUSTER) that would mean your inserts are always sequential too (though it
also means lots of block allocations along the way, possibly not an
advantage).

This may be a lot more work to set up and maintain but I think it would be a
big performance win. It would directly speed up the WAL writes by avoiding
those big full page dumps. And it would also cut out all the I/O traffic
vacuum generates.


 Increasing shared_buffers seems to always help, even out to half of the dev
 box's ram (2G).

Half should be a pessimal setting. It means virtually everything is buffered
twice. Once in the kernel and once in Postgres. Effectively halving your
memory. If that's really helping try raising it even further, to something
like 90% of your memory. But the conventional dogma is that shared buffers
should never be more than about 10k no matter how much memory you have. Let
the kernel do the bulk of the buffering.

That said it's even more mysterious in your situation. Why would a large
shared buffers help insert-only performance much at all? My guess is that it's
letting the vacuums complete quicker. Perhaps try raising work_mem?

-- 
greg




Re: [PERFORM] Performance considerations for very heavy INSERT traffic

2005-09-12 Thread Christopher Petrilli
On 9/12/05, Brandon Black [EMAIL PROTECTED] wrote:
 
  I'm in the process of developing an application which uses PostgreSQL for
 data storage.  Our database traffic is very atypical, and as a result it has
 been rather challenging to figure out how to best tune PostgreSQL on what
 development hardware we have, as well as to figure out exactly what we
 should be evaluating and eventually buying for production hardware.

A few suggestions...

1) Switch to COPY if you can, it's anywhere from 10-100x faster than
INSERT, but it does not necessarily fit your idea of updating multiple
tables.  In that case, try and enlarge the transaction's scope and do
multiple INSERTs in the same transaction.  Perhaps batching once per
second, or 5 seconds, and returning the aggregate result to the
clients.

2) Tune ext3.  The default configuration wrecks high-write situations.
 Look into data=writeback for mounting, turning off atime updates (I hope
you've done this already), and also modifying the scheduler to
the elevator model.  This is poorly documented in Linux (like just
about everything), but it's critical.

3) Use 8.1 and strongly look at Bizgres. The data partitioning is critical.

4) Make sure you are not touching more data than you need, and don't
have any extraneous indexes.  Use the planner to make sure every index
is actually used, as each index substantially increases the write load.

I've worked on a few similar applications, and this is a hard thing in
any database, even Oracle.

Chris 

-- 
| Christopher Petrilli
| [EMAIL PROTECTED]



Re: [PERFORM] Performance considerations for very heavy INSERT traffic

2005-09-12 Thread Brandon Black
On 12 Sep 2005 23:07:49 -0400, Greg Stark [EMAIL PROTECTED] wrote:

 The WAL parameters like commit_delay and commit_siblings are a bit of a
 mystery. Nobody has done any extensive testing of them. It would be quite
 helpful if you find anything conclusive and post it. It would also be
 surprising if they had a very large effect. They almost got chopped
 recently because they weren't believed to be useful.

 You might also ponder whether you need to be issuing a commit for every
 datum.  If you only need to commit periodically you can get much better
 throughput.  I suppose that's the same as commit_siblings.  It would be
 interesting to know if you can get those parameters to perform as well as
 batching up records yourself.

Ideally I'd like to commit the data separately, as the data could
contain errors which abort the transaction, but it may come down to
batching it and coding things such that I can catch and discard the
offending row and retry the transaction if it fails (which should be
fairly rare I would hope). I was hoping that the
commit_delay/commit_siblings stuff would allow me to maintain
simplistic transaction failure isolation while giving some of the
benefits of batching things up, as you've said. I have seen
performance gains with it set at 100ms and a 3-6 siblings with 8
backends running, but I haven't been able to extensively tune these
values, they were mostly random guesses that seemed to work. My
cycles of performance testing take a while, at least a day or two per
change being tested, and the differences can even then be hard to see
due to variability in the testing load (as it's not really a static
test load, but a window on reality). On top of that, with the
time it takes, I've succumbed more than once to the temptation of
tweaking more than one thing per performance run, which really muddies
the results.
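For reference, the values being described correspond to something like this
in postgresql.conf (commit_delay is specified in microseconds, so the 100ms
above is 100000; the siblings number is just one of the guesses mentioned):

    commit_delay    = 100000   # microseconds (100ms)
    commit_siblings = 5        # somewhere in the 3-6 range tried above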

 Increasing shared_buffers seems to always help, even out to half of the
 dev box's ram (2G).

 Half should be a pessimal setting. It means virtually everything is
 buffered twice. Once in the kernel and once in Postgres. Effectively
 halving your memory. If that's really helping try raising it even further,
 to something like 90% of your memory. But the conventional dogma is that
 shared buffers should never be more than about 10k no matter how much
 memory you have. Let the kernel do the bulk of the buffering.

 That said it's even more mysterious in your situation. Why would a large
 shared buffers help insert-only performance much at all? My guess is that
 it's letting the vacuums complete quicker. Perhaps try raising work_mem?

I find it odd as well. After reading the standard advice on
shared_buffers, I had only intended on raising it slightly. But
seeing ever-increasing performance gains, I just kept tuning it upwards
all the way to the 2G limit, and saw noticeable gains every time.
During at least some of the test cycles, there was no deleting or
vacuuming going on, just insert traffic. I guessed that
shared_buffers' caching strategy must have been much more effective
than the OS cache at something or other, but I don't know what
exactly. The only important read traffic that comes to mind is
the index which is both being constantly updated and constantly checked
for primary key uniqueness violations.

All of these replies here on the list (and a private one or two) have
given me a long list of things to try, and I feel confident that at
least some of them will gain me enough performance to comfortably
deploy this application in the end on somewhat reasonable
hardware. Thanks to everyone here on the list for all the
suggestions, it has been very helpful in giving me directions to go
with this that I hadn't thought of before.

When I finally get all of this sorted out and working reasonably
optimally, I'll be sure to come back and report what
techniques/settings did and didn't work for this workload.

-- Brandon



Re: [PERFORM] Performance considerations for very heavy INSERT traffic

2005-09-12 Thread Brandon Black
On 9/12/05, Christopher Petrilli [EMAIL PROTECTED] wrote:

2) Tune ext3.  The default configuration wrecks high-write situations.
Look into data=writeback for mounting, turning off atime updates (I hope
you've done this already), and also modifying the scheduler to the
elevator model.  This is poorly documented in Linux (like just about
everything), but it's critical.

I'm using noatime and data=writeback already.  I changed my
scheduler from the default anticipatory to deadline and saw an
improvement, but I haven't yet delved into playing with specific
elevator tunable values per-device.
3) Use 8.1 and strongly look at Bizgres. The data partitioning is critical.

I've just started down the path of getting 8.1 running with a larger
block size, and tommorow I'm going to look at Bizgres's partitioning as
opposed to some manual schemes. Will the current Bizgres have a
lot of the performance enhancements of 8.1 already (or will 8.1 or 8.2
eventually get Bizgres's partitioning)?
-- Brandon