On 1/5/12 5:04 AM, Benedikt Grundmann wrote:
> I have a question about how to benchmark hardware to determine
> the appropriate ratio of seq_page_cost vs. random_page_cost.
>
> Emails in this mailing list's archive seem to indicate that
> 1.0 vs. 3.0 - 4.0 are appropriate values on modern hardware.
>
> That surprised me a bit, as I had thought that on actual
> hard drives (ignoring SSDs) random_page_cost would be higher.
> I guess the number tries to reflect caching of the relevant
> pages in memory, and on modern hardware you have more of that?

That sort of thing is one reason why all attempts so far to set random_page_cost based on physical characteristics haven't gone anywhere useful. The setting is overloaded right now: it's a fuzzy mix of true random seek cost blended with some notion of what percentage of the database is cached. Trying to derive it from measurements is less effective than what people actually do here: monitor the profile of query execution, change the value, and see what happens. Use that as feedback on which direction to keep going; repeat until you're just spinning with no further improvement.
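In practice that loop can be as simple as the following; the table and query here are made-up placeholders, not anything from a real system:

  -- Check the plan and actual execution profile of a representative query:
  EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE customer_id = 42;

  -- Try a different value for this session only:
  SET random_page_cost = 2.0;

  -- Re-run, compare the plan choice and runtime, then repeat with other
  -- values until the improvements stop:
  EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE customer_id = 42;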

It's easy to measure the actual read times and set the value based on that instead. But that doesn't actually work out so well. There are at least three problems in that area:

-Timing information is sometimes very expensive to collect. I expect that, as a 9.2 feature, we'll at least be able to document and usefully quantify when that's the case.

-Basing query execution decisions on what is already in the cache leads to all sorts of nasty feedback situations, where you optimize for the short term (for example, using an index that's already in cache) while never reading in what would be a superior long-term choice, because it looks too expensive. See the sketch after this list.

-Making a major adjustment to the query planning model like this would require a large performance regression testing framework to evaluate the results in.
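To make the second of those concrete, here's roughly what the caching feedback looks like, again with a made-up table, and with illustrative rather than measured buffer counts:

  -- Cold run: most blocks come from disk; expect something like
  -- "Buffers: shared hit=12 read=3400" and a long runtime.
  EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE customer_id = 42;

  -- Warm run: the same blocks are now in shared_buffers, so you see
  -- something like "shared hit=3412 read=0" and a runtime that makes
  -- random access look nearly free.  Derive random_page_cost from that
  -- measurement and you've tuned for whatever happened to be cached.
  EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE customer_id = 42;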

> We are not sure whether the database used to choose differently
> before the move and the new hardware is performing worse for
> random seeks, or whether the planner is now making different
> choices.

I don't recommend ever deploying new hardware without first doing some low-level benchmarks to validate its performance. Once stuff goes into production, you can't do that anymore. See http://www.2ndquadrant.com/en/talks/ for my hardware benchmarking talks if you'd like some ideas on what to collect.

--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
