On 1/5/12 5:04 AM, Benedikt Grundmann wrote:
> I have a question about how to benchmark hardware to determine
> the appropriate ratio of seq_page_cost vs. random_page_cost.
>
> Emails in this mailing list's archive seem to indicate that
> 1.0 vs. 3.0 - 4.0 are appropriate values on modern hardware.
>
> That surprised me a bit, as I had thought that on actual
> hard drives (ignoring SSDs) random_page_cost would be higher.
> I guess the number tries to reflect caching of the relevant
> pages in memory, and on modern hardware you have more of that?

That sort of thing is one reason why all attempts so far to set random_page_cost based on physical characteristics haven't gone anywhere useful. The setting is overloaded right now: it's a fuzzy mix of true random seek cost blended with some notion of what percentage of the database is cached. Trying to derive it from measurements is less effective than what people actually do here: monitor the profile of query execution, change the value, and see what happens. Use that as feedback on which direction to keep going; repeat until you're just spinning with no further improvement.
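In practice that loop can be as simple as the following; the table and query here are made-up placeholders, not anything from a real system:

  -- Check the plan and actual execution profile of a representative query:
  EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE customer_id = 42;

  -- Try a different value for this session only:
  SET random_page_cost = 2.0;

  -- Re-run, compare the plan choice and runtime, then repeat with other
  -- values until the improvements stop:
  EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE customer_id = 42;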

It's easy to measure the actual read times and set the value based on that instead. But that doesn't actually work out so well. There are at least three problems in that area:

-Timing information is sometimes very expensive to collect. I expect that, as a 9.2 feature, we'll at least be able to document and usefully quantify when that's the case.

-Basing query execution decisions on what is already in the cache leads to all sorts of nasty feedback situations, where you optimize for the short term (for example, using an index that's already in cache) while never reading in what would be a superior long-term choice, because it looks too expensive. See the sketch after this list.

-Making a major adjustment to the query planning model like this would require a large performance regression testing framework to evaluate the results in.
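To make the second of those concrete, here's roughly what the caching feedback looks like, again with a made-up table, and with illustrative rather than measured buffer counts:

  -- Cold run: most blocks come from disk; expect something like
  -- "Buffers: shared hit=12 read=3400" and a long runtime.
  EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE customer_id = 42;

  -- Warm run: the same blocks are now in shared_buffers, so you see
  -- something like "shared hit=3412 read=0" and a runtime that makes
  -- random access look nearly free.  Derive random_page_cost from that
  -- measurement and you've tuned for whatever happened to be cached.
  EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE customer_id = 42;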

> We are not sure whether the database used to choose differently
> before the move and the new hardware is performing worse for
> random seeks, or whether the planner is now making different
> choices.

I don't recommend ever deploying new hardware without first doing some low-level benchmarks to validate its performance. Once stuff goes into production, you can't do that anymore. See http://www.2ndquadrant.com/en/talks/ for my hardware benchmarking talks if you'd like some ideas on what to collect.

--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
