[PERFORM] hardware priority for an SSD database?
We're looking to upgrade our database hardware so that it can sustain us while we re-architect some of the more fundamental issues with our applications. The first thing to spend money on is usually disks, but our database currently lives almost entirely on flash storage, so that's already nice and fast. My question is, what should we spend money on next?

With most data stored in flash, does it still make sense to buy as much RAM as possible? RAM is still faster than flash, but while it's cheap, it isn't free, and our database is a couple hundred GB in size. We also have several hundred active sessions. Does it make sense to sacrifice some memory speed and go with 4 6-core Istanbul processors? Or does it make more sense to limit ourselves to 2 4-core Nehalem sockets and get Intel's 1333 MHz DDR3 memory and faster cores? Our queries are mostly simple, but we have a lot of them, and their locality tends to be low. FWIW, about half are selects.

Does anybody have any experience with these kinds of tradeoffs in the absence of spinning media? Any insight would be much appreciated. From the information I have right now, trying to figure out how to optimally spend our budget feels like a shot in the dark.

Thanks!

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] hardware priority for an SSD database?
Ben Chobot wrote:
> With most data stored in flash, does it still make sense to buy as much
> RAM as possible? RAM is still faster than flash, but while it's cheap, it
> isn't free, and our database is a couple hundred GB in size.

Depends on the actual working set of data you run into on a regular basis. If adding more RAM makes it possible to fit that where it didn't before, that can be a huge win even over SSD. RAM is still around an order of magnitude faster than flash (2500MB/s vs. 200MB/s typically). I'll normally stare at what's in the buffer cache to get an idea what the app likes to cache most, to try and estimate the size of the working set better.

> We also have several hundred active sessions. Does it make sense to
> sacrifice some memory speed and go with 4 6-core Istanbul processors? Or
> does it make more sense to limit ourselves to 2 4-core Nehalem sockets
> and get Intel's 1333 MHz DDR3 memory and faster cores?

This is hard to say, particularly when you mix in the cost difference between the two solutions. Yours is one of the situations where AMD's stuff might work out very well for you on a bang-per-buck basis, though; it's certainly not one of the ones where it's a clear win for Intel (which I do see sometimes).

> Does anybody have any experience with these kinds of tradeoffs in the
> absence of spinning media? Any insight would be much appreciated. From
> the information I have right now, trying to figure out how to optimally
> spend our budget feels like a shot in the dark.

There are no easy answers or general guidelines here. There are only two ways I've ever found to get useful results in this area:

1) Try some eval hardware (vendor loan, friendly borrowing, etc.) and benchmark with your app.

2) Cripple an existing system to get more sensitivity analysis points. For example, if you have a 16GB server, you might do some benchmarking, reduce to 8GB, and see how much that changed things, to get an idea how sensitive your app is to memory size changes.
You can do similar tests underclocking/disabling CPUs, underclocking RAM, and lowering the speed of the drives. For example, if you reduce the amount of RAM but performance doesn't change much, while decreasing the RAM clock drops it a lot, that's pretty good evidence you'd prefer spending on faster RAM rather than more of it.

--
Greg Smith    2ndQuadrant    Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com    www.2ndQuadrant.com
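[Editor's note: the buffer-cache inspection Greg describes can be done with a query along these lines. This is a sketch, not from the original thread; it assumes the contrib pg_buffercache module is installed in the database being examined, and it joins on relfilenode in the style that works on 8.x servers.]

```sql
-- Top 20 relations by number of shared buffers they occupy,
-- as a rough proxy for the hot working set of this database.
-- Requires the contrib/pg_buffercache module.
SELECT c.relname,
       count(*) AS buffers
FROM pg_buffercache b
     JOIN pg_class c ON b.relfilenode = c.relfilenode
     JOIN pg_database d ON b.reldatabase = d.oid
                       AND d.datname = current_database()
GROUP BY c.relname
ORDER BY buffers DESC
LIMIT 20;
```

If the same handful of relations dominate the cache across repeated samples, their combined size is a reasonable first estimate of the working set to size RAM against.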
Re: [PERFORM] FSM - per database or per installation?
Heikki Linnakangas wrote:
> Craig James wrote:
>> Are the FSM parameters for each database, or the entire Postgres system?
>> In other words, if I have 100 databases, do I need to increase
>> max_fsm_pages and max_fsm_relations by a factor of 100, or keep them the
>> same as if I just have one database? I suspect they're per-database,
>> i.e. as I add databases, I don't have to increase the FSM parameters,
>> but the documentation isn't 100% clear on this point.
>
> It's per cluster, ie *not* per-database.

Hmmm ... it seems I have an impossible problem. I have ~250 databases, each with about 2500 relations (as in "select count(1) from pg_class where relname not like 'pg_%'"). That makes roughly 625,000 relations.

But ... for max_fsm_pages, the Postgres manual says, "This setting must be at least 16 * max_fsm_relations. The default is chosen by initdb depending on the amount of available memory, and can range from 20k to 200k pages."

So max_fsm_pages should be 16 * 625000, or 10,000,000 ... except that the limit is 200,000. Or is it only the *default* that can be 200,000 max, but you can override and set it to any number you like? It appears that Postgres 8.3 and earlier can't do garbage collection on a configuration like mine. Do I misunderstand something?

> The parameter is gone in 8.4, BTW.

Both max_fsm_relations and max_fsm_pages?

Thanks,
Craig
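[Editor's note: the sizing arithmetic in the message above, written out as a minimal check using the counts Craig gives.]

```python
# FSM sizing arithmetic from the message above (PostgreSQL 8.3 and earlier).
databases = 250          # ~250 databases in the cluster
relations_per_db = 2500  # ~2500 user relations in each

# FSM settings are per cluster, so every database's relations count.
max_fsm_relations = databases * relations_per_db

# The manual's rule: max_fsm_pages must be at least 16 * max_fsm_relations.
min_fsm_pages = 16 * max_fsm_relations

print(max_fsm_relations)  # 625000
print(min_fsm_pages)      # 10000000
```

Fifty times the 200,000-page figure quoted from the manual, which is what prompts the question of whether that figure is a hard limit or just the largest default initdb will pick.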
Re: [PERFORM] FSM - per database or per installation?
On Wed, Dec 23, 2009 at 6:38 PM, Craig James craig_ja...@emolecules.com wrote:
> So max_fsm_pages should be 16 * 625000, or 10,000,000 ... except that the
> limit is 200,000. Or is it only the *default* that can be 200,000 max, but
> you can override and set it to any number you like?

NO! That's not the max (if it was, I would be in serious trouble). That's the max that you'll see done by initdb when creating the cluster. We run 10M fsm pages on our servers, and use about 2.5M of that.
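[Editor's note: a sketch of the postgresql.conf override Scott describes. The values below are illustrative for a cluster of this size, not taken from the thread; on 8.3 and earlier these settings require a server restart to take effect.]

```
# postgresql.conf (PostgreSQL 8.3 and earlier)
# Illustrative values for a cluster with roughly 625,000 relations.
max_fsm_relations = 700000      # headroom above the counted ~625,000
max_fsm_pages     = 11200000    # must be >= 16 * max_fsm_relations
```

After a restart, running `VACUUM VERBOSE` on a database reports at the end how many FSM page slots and relations are actually in use, which is the number to watch when deciding whether these settings are large enough.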
Re: [PERFORM] FSM - per database or per installation?
Craig James wrote:
> Heikki Linnakangas wrote:
>> The parameter is gone in 8.4, BTW.
>
> Both max_fsm_relations and max_fsm_pages?

Yes, both are gone.

--
Alvaro Herrera    http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support