I know what I would choose. I'd get the mega server w/ a ton of RAM and skip all the trickiness of partitioning a DB over multiple servers. Yes, your data will grow to a point where even the XXGB can't cache everything. On the other hand, memory prices drop just as fast. By that time, you can eBay your original 16/32GB and get 64/128GB.
a) What do you do when your calculations show you need 256GB of RAM? [Yes, such machines exist, but you're no longer in the realm of simply "add more RAM". Administering such machines is nigh as complex as clustering.]
If you need that much memory, you've got enough customers paying you cash to pay for anything. :) Technology always improves -- 8X Opterons would double your memory capacity, higher capacity DIMMs, etc.
b) What do you do when you find you need multiple machines anyway to divide up the CPU, I/O, or network load? Now you need n big beefy servers when n servers 1/nth as large would really have sufficed. This is a big difference when you're talking about colocating 16 1U boxen with 4GB of RAM vs 16 4U Opterons with 64GB of RAM...
All that said, yes, speaking as a user, I think the path of least resistance is to build n complete slaves using Slony and then just divide the workload. That's the route I picture taking when I get to that point.
Replication is good for uptime and read-heavy systems. The problem is that if your system has a high volume of writes and you need near-realtime data syncing, clusters don't get you anything. A write on one server means a write on every server. Spreading the damage over multiple machines doesn't help a bit.
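To put rough numbers on the point above, here is a minimal sketch, assuming every replica holds a full copy, reads are split evenly, and every write is applied on every replica. The function name and the workload figures are made up for illustration; they are not measurements from any real system.

```python
def per_server_load(reads_per_sec, writes_per_sec, n_servers):
    """Operations per second each server handles under full replication.

    Assumption: reads are divided evenly across the replicas, while
    every write has to be applied on every replica.
    """
    if n_servers < 1:
        raise ValueError("n_servers must be at least 1")
    return reads_per_sec / n_servers + writes_per_sec


# Hypothetical workloads, both totalling 10,000 ops/sec.
for n in (1, 2, 4, 8):
    read_heavy = per_server_load(9000, 1000, n)
    write_heavy = per_server_load(1000, 9000, n)
    print(f"{n} servers: read-heavy {read_heavy:.0f}/s, write-heavy {write_heavy:.0f}/s")
```

Under these assumptions the read-heavy load per box keeps dropping as servers are added, while the write-heavy load barely moves, because the write term never gets divided.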
Plus the fact that we don't have multi-master replication yet is quite a bugaboo. That requires writing quite extensive code if you can't afford to have 1 server be your single point of failure. We wrote our own multi-master replication code at the client app level, and it's quite a chore making sure the replication acts logically. Every table needs to have separate logic to parse situations like "voucher was posted on server 1 but voided after on server 2, what's the correct action here?" So I've got a slew of complicated if-then-else statements that have to take into account not only the type of update being made but also the sequence.
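For a feel of what that per-table logic looks like, here is a minimal sketch of the voucher case in Python. It is not our actual code: the Change record, the state names, and the rule that a voided voucher stays voided are all assumptions picked for illustration, and it assumes the server clocks are synced well enough to order changes by timestamp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One replicated update to a voucher (hypothetical record layout)."""
    server: str
    action: str       # "post" or "void"
    timestamp: float  # assumed comparable across servers


def resolve_voucher(changes):
    """Merge changes from all servers and return the final voucher state."""
    state = "new"
    # Replay in time order; ties are broken by server name so that every
    # server reaches the same answer from the same set of changes.
    for change in sorted(changes, key=lambda c: (c.timestamp, c.server)):
        if change.action == "post" and state == "new":
            state = "posted"
        elif change.action == "void" and state in ("new", "posted"):
            state = "voided"
        # Anything else (e.g. a post arriving after a void) is ignored:
        # under this assumed rule a voided voucher stays voided.
    return state


changes = [
    Change(server="server2", action="void", timestamp=105.0),
    Change(server="server1", action="post", timestamp=100.0),
]
print(resolve_voucher(changes))
```

Even this toy version has to look at both the type of update and the order it happened in, and every other table would need its own set of rules.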
And yes, I tried doing realtime locks over a VPN link between our servers in SF and VA. Ugh...latency was absolutely horrible and made transactions run 1000X slower.
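A back-of-the-envelope sketch of why a slowdown of that order is plausible: each remote lock costs at least one network round trip, and that dwarfs the local work. The function name and the numbers below (0.2 ms of local work, 3 round trips, 70 ms coast-to-coast round-trip time) are illustrative assumptions, not measurements from our setup.

```python
def slowdown(local_ms, round_trips, rtt_ms):
    """Factor by which a transaction slows down when it must wait on
    `round_trips` network round trips of `rtt_ms` each, compared with
    doing only `local_ms` of local work."""
    if local_ms <= 0:
        raise ValueError("local_ms must be positive")
    return (local_ms + round_trips * rtt_ms) / local_ms


# Hypothetical figures: 0.2 ms local transaction, 3 lock round trips, 70 ms RTT.
print(f"about {slowdown(0.2, 3, 70.0):.0f}X slower")
```

With those assumed figures the factor lands in the neighborhood of 1000X, and faster servers don't help since the wait is all network.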