On 3/24/13 8:11 AM, Greg Smith wrote:
On 3/22/13 8:45 AM, Ants Aasma wrote:
However, I think the main issue isn't finding new algorithms that are
better in some specific circumstances. The hard part is figuring out
whether their performance is better in general.
Right. The current page replacement method works as expected. Many frequently
accessed pages accumulate a usage count of 5 before the clock sweep hits them.
Pages that are accessed once and not again before the clock sweep are evicted.
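
For readers who have not looked at the clock sweep, the behavior described above can be sketched roughly as follows. This is a simplified model, not PostgreSQL's actual buffer manager code: the names `Buffer`, `ClockSweepPool`, `access` and `_find_victim` are made up for illustration, and pinning, locking and the free list are all ignored. The cap of 5 comes from the description above; starting a newly loaded page at a count of 1 is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

MAX_USAGE_COUNT = 5  # cap on the per-buffer usage count, as described above


@dataclass
class Buffer:
    page: Optional[str] = None  # None means the buffer is empty
    usage_count: int = 0


class ClockSweepPool:
    """Toy buffer pool using a clock sweep (hypothetical names throughout)."""

    def __init__(self, size: int) -> None:
        self.buffers: List[Buffer] = [Buffer() for _ in range(size)]
        self.hand = 0  # position of the clock hand

    def access(self, page: str) -> None:
        # Hit: bump the usage count, saturating at the cap.
        for buf in self.buffers:
            if buf.page == page:
                buf.usage_count = min(buf.usage_count + 1, MAX_USAGE_COUNT)
                return
        # Miss: sweep for a victim and load the page into it.
        victim = self._find_victim()
        victim.page = page
        victim.usage_count = 1  # assumption: new pages start at 1

    def _find_victim(self) -> Buffer:
        # Advance the hand, decrementing counts, until a buffer reaches zero.
        while True:
            buf = self.buffers[self.hand]
            self.hand = (self.hand + 1) % len(self.buffers)
            if buf.usage_count == 0:
                return buf
            buf.usage_count -= 1

    def pages(self) -> Set[str]:
        return {b.page for b in self.buffers if b.page is not None}
```

In this model a page that is touched repeatedly survives several passes of the hand, while a page touched once is evicted the first time the hand comes back around to it without an intervening access.
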
There are several theoretically better ways to approach this. Anyone who
hasn't already been working on this for a few years is very unlikely to come up
with a brand new idea, one that hasn't already been tested in the academic
research.
But the real blocker here isn't ideas; it's creating benchmark workloads to validate any
change. Right now I see the most promising work that could lead toward the
"performance farm" idea as all of the Jenkins based testing that's been going
on recently. Craig Ringer has been using that for 2ndQuadrant work here, Peter Eisentraut has
been working with it:
http://petereisentraut.blogspot.com/2013/01/postgresql-and-jenkins.html and the PostGIS
project uses it too. There's some good momentum brewing there.
When we have regular performance testing with a mix of workloads--I have about
10 in mind to start--at that point we could start testing performance
changes to buffer replacement. Until then this whole area is hard to touch
usefully. You have to assume that any tuning you do for one type of workload
might accidentally slow another. Starting with a lot of baseline workloads is
the only way to move usefully forward when facing that problem.
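
As a rough illustration of why a broad baseline matters, here is a minimal sketch of the check such a performance farm might run: compare a candidate build against the baseline on every workload and report any that got slower. The function name, the workload names and the 2% tolerance are all hypothetical, and the numbers in the usage example are invented purely to show the shape of the check; they are not measurements.

```python
from typing import Dict, List


def find_regressions(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    tolerance: float = 0.02,
) -> List[str]:
    """Return workloads whose throughput dropped by more than `tolerance`.

    Both dicts map a workload name to a throughput figure (for example
    transactions per second). The 2% default tolerance is an arbitrary
    placeholder for run-to-run noise, not a recommended value.
    """
    regressions = []
    for workload, base_tps in baseline.items():
        new_tps = candidate[workload]
        if new_tps < base_tps * (1.0 - tolerance):
            regressions.append(workload)
    return sorted(regressions)


# Invented numbers, for illustration only.
baseline = {"read_only": 1000.0, "bulk_load": 500.0}
candidate = {"read_only": 1100.0, "bulk_load": 450.0}
print(find_regressions(baseline, candidate))
```

A change that looks like a clear win on one workload can still show up in the returned list for another, which is the situation described above.
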
The other thing I think would be tremendously useful would be the ability to
get performance data from systems in the field *without having to install extra
stuff or do a special build*. The last point is critical because there are so
many places where deviating from a standard package takes an act of Congress.
In this case, if I could run some queries to get stats about clock sweep waits
and whatnot, then I could get our shared buffer size changed on some hosts and
see how those changes affect the numbers. But doing this with a non-standard
build is pretty much a non-starter.
I know there's been some improvement in this area, but I suspect there's still
more to go.
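
As a sketch of what is possible with a stock build: the pg_stat_bgwriter view can be sampled with a plain query such as "SELECT buffers_alloc, buffers_backend, buffers_clean, maxwritten_clean FROM pg_stat_bgwriter;" and two samples can be turned into per-second rates. The column names are those of the 9.x-era view and may differ in other releases. These counters do not measure clock sweep waits directly, so this only approximates the kind of data asked for above. The helper name `counter_rates` and the idea of passing snapshots as dicts are assumptions of this sketch.

```python
from typing import Dict

# Columns of pg_stat_bgwriter used here (9.x-era names; an assumption for
# other releases).
COUNTERS = ("buffers_alloc", "buffers_backend", "buffers_clean", "maxwritten_clean")


def counter_rates(
    before: Dict[str, int],
    after: Dict[str, int],
    elapsed_seconds: float,
) -> Dict[str, float]:
    """Convert two cumulative snapshots into per-second rates.

    `before` and `after` are rows from pg_stat_bgwriter, fetched by whatever
    client is available, keyed by column name.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return {
        name: (after[name] - before[name]) / elapsed_seconds
        for name in COUNTERS
    }
```

Sampling before and after a shared_buffers change on the same host gives a crude before/after comparison without installing anything extra.
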
--
Jim C. Nasby, Data Architect j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net