[PERFORM] hardware priority for an SSD database?

2009-12-23 Thread Ben Chobot
We're looking to upgrade our database hardware so that it can sustain
us while we re-architect some of the more fundamental issues with our
applications. The first thing to spend money on is usually disks, but
our database currently lives almost entirely on flash storage, so
that's already nice and fast. My question is: what should we spend
money on next?


With most data stored in flash, does it still make sense to buy as
much RAM as possible? RAM is still faster than flash, but while it's
cheap, it isn't free, and our database is a couple hundred GB in size.


We also have several hundred active sessions. Does it make sense to
sacrifice some memory speed and go with four 6-core Istanbul processors?
Or does it make more sense to limit ourselves to two 4-core Nehalem
sockets and get Intel's 1333 MHz DDR3 memory and faster cores?


Our queries are mostly simple, but we have a lot of them, and their  
locality tends to be low. FWIW, about half are selects.


Does anybody have any experience with these kinds of tradeoffs in the
absence of spinning media? Any insight would be much appreciated. From
the information I have right now, trying to figure out how to
optimally spend our budget feels like a shot in the dark.


Thanks!



Re: [PERFORM] hardware priority for an SSD database?

2009-12-23 Thread Greg Smith

Ben Chobot wrote:
> With most data stored in flash, does it still make sense to buy as
> much RAM as possible? RAM is still faster than flash, but while it's
> cheap, it isn't free, and our database is a couple hundred GB in size.


Depends on the actual working set of data you run into on a regular
basis.  If adding more RAM makes it possible to fit that where it didn't
before, that can be a huge win even over SSD.  RAM is still around an
order of magnitude faster than flash (2500 MB/s vs. 200 MB/s
typically).  I'll normally look at what's in the buffer cache to get an
idea of what the app likes to cache most, to better estimate the size
of the working set.
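
For example, with the pg_buffercache contrib module installed, a query
along these lines (a sketch of one way to do it, not a precise
methodology) shows which relations occupy the most of shared_buffers in
the current database:

    -- Sketch: assumes the pg_buffercache contrib module is installed.
    -- Counts shared_buffers pages held by each relation.
    SELECT c.relname, count(*) AS buffers
    FROM pg_buffercache b
      JOIN pg_class c ON b.relfilenode = c.relfilenode
    WHERE b.reldatabase IN (0, (SELECT oid FROM pg_database
                                WHERE datname = current_database()))
    GROUP BY c.relname
    ORDER BY buffers DESC
    LIMIT 20;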


> We also have several hundred active sessions. Does it make sense to
> sacrifice some memory speed and go with four 6-core Istanbul processors?
> Or does it make more sense to limit ourselves to two 4-core Nehalem
> sockets and get Intel's 1333 MHz DDR3 memory and faster cores?


This is hard to say, particularly when you mix in the cost difference
between the two solutions.  Yours is one of the situations where AMD's
hardware might work out very well on a bang-per-buck basis, though;
it's certainly not one of the ones where Intel is a clear win (which I
do see sometimes).



> Does anybody have any experience with these kinds of tradeoffs in the
> absence of spinning media? Any insight would be much appreciated. From
> the information I have right now, trying to figure out how to
> optimally spend our budget feels like a shot in the dark.


There are no easy answers or general guidelines here.  There are only 
two ways I've ever found to get useful results in this area:


1) Try some eval hardware (vendor loan, friendly borrowing, etc.) and
benchmark with your app.


2) Cripple an existing system to get more sensitivity analysis points.
For example, if you have a 16GB server, you might do some benchmarking,
reduce it to 8GB, and see how much that changed things, to get an idea
how sensitive your app is to memory size changes.  You can do similar
tests underclocking/disabling CPUs, underclocking RAM, and lowering the
speed of the drives.  For example, if you reduce the amount of RAM but
performance doesn't change much, while decreasing the RAM clock drops
it a lot, that's pretty good evidence you'd be better off spending on
faster RAM than on more of it.
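
As a rough sketch of what one such benchmarking pass could look like
(pgbench is only a stand-in here; replaying your real workload is more
representative), run an identical test at each hardware configuration
and compare the reported transaction rates:

    # Sketch only: repeat the same run after halving RAM, disabling
    # cores, etc., and compare the transactions-per-second numbers.
    pgbench -i -s 100 testdb        # initialize a scale-100 test database
    pgbench -c 16 -t 5000 testdb    # 16 clients, 5000 transactions each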


--
Greg Smith    2ndQuadrant    Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com    www.2ndQuadrant.com




Re: [PERFORM] FSM - per database or per installation?

2009-12-23 Thread Craig James

Heikki Linnakangas wrote:
> Craig James wrote:
>> Are the FSM parameters for each database, or the entire Postgres
>> system?  In other words, if I have 100 databases, do I need to increase
>> max_fsm_pages and max_fsm_relations by a factor of 100, or keep them the
>> same as if I just have one database?
>>
>> I suspect they're per-database, i.e. as I add databases, I don't have to
>> increase the FSM parameters, but the documentation isn't 100% clear on
>> this point.
>
> It's per cluster, ie *not* per-database.


Hmmm ... it seems I have an impossible problem.  I have ~250 databases each with about 
2500 relations (as in select count(1) from pg_class where relname not like 
'pg_%').  That makes roughly 625,000 relations.

But ... for max_fsm_pages, the Postgres manual says, "This setting must be at least
16 * max_fsm_relations. The default is chosen by initdb depending on the amount of
available memory, and can range from 20k to 200k pages."

So max_fsm_pages should be 16*625000, or 10,000,000 ... except that the limit 
is 200,000.  Or is it only the *default* that can be 200,000 max, but you can 
override and set it to any number you like?

It appears that Postgres 8.3 and earlier can't do garbage collection on a 
configuration like mine.  Do I misunderstand something?


> The parameter is gone in 8.4, BTW.


Both max_fsm_relations and max_fsm_pages?

Thanks,
Craig




Re: [PERFORM] FSM - per database or per installation?

2009-12-23 Thread Scott Marlowe
On Wed, Dec 23, 2009 at 6:38 PM, Craig James craig_ja...@emolecules.com wrote:
> Heikki Linnakangas wrote:
>> Craig James wrote:
>>> Are the FSM parameters for each database, or the entire Postgres
>>> system?  In other words, if I have 100 databases, do I need to increase
>>> max_fsm_pages and max_fsm_relations by a factor of 100, or keep them the
>>> same as if I just have one database?
>>>
>>> I suspect they're per-database, i.e. as I add databases, I don't have to
>>> increase the FSM parameters, but the documentation isn't 100% clear on
>>> this point.
>>
>> It's per cluster, ie *not* per-database.
>
> Hmmm ... it seems I have an impossible problem.  I have ~250 databases each
> with about 2500 relations (as in select count(1) from pg_class where
> relname not like 'pg_%').  That makes roughly 625,000 relations.
>
> But ... for max_fsm_pages, the Postgres manual says, "This setting must be
> at least 16 * max_fsm_relations. The default is chosen by initdb depending
> on the amount of available memory, and can range from 20k to 200k pages."
>
> So max_fsm_pages should be 16*625000, or 10,000,000 ... except that the
> limit is 200,000.  Or is it only the *default* that can be 200,000 max, but
> you can override and set it to any number you like?

No!  That's not the max (if it were, I would be in serious trouble).
That's just the maximum value initdb will choose when it creates the
cluster; you can set it higher yourself.

We run 10M fsm pages on our servers, and use about 2.5M of that.
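
For reference, the relevant postgresql.conf lines for a cluster like
Craig's might look something like this (values illustrative only, sized
from his ~625,000-relation estimate; both settings require a server
restart on 8.3 and earlier):

    # Illustrative sketch -- size these from your own VACUUM VERBOSE output.
    max_fsm_relations = 700000      # headroom over ~625,000 relations
    max_fsm_pages     = 12000000    # must be at least 16 * max_fsm_relations

A database-wide VACUUM VERBOSE run as superuser prints a summary at the
end showing how many page slots and relations the free space map is
actually tracking, which is where usage numbers like the ones above
come from.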



Re: [PERFORM] FSM - per database or per installation?

2009-12-23 Thread Alvaro Herrera
Craig James wrote:
> Heikki Linnakangas wrote:
>> The parameter is gone in 8.4, BTW.
>
> Both max_fsm_relations and max_fsm_pages?

Yes, both are gone.

-- 
Alvaro Herrera    http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
