Hi folks, 
 
I've always been fascinated by genetic algorithms. Having implemented one 
once before to solve a real-life problem, I know they can be brilliant at 
searching for good solutions in a multidimensional space.
 
Thinking about just postgresql.conf and the number of possible options there 
to satisfy performance needs, I thought this sounded like a good example of a 
problem that could be solved using a genetic algorithm.
 
So I sat down after work for a few days and came up with a simple proof of 
concept.
It generates a random population of PostgreSQL configuration files and runs a 
simple pgbench test on each one of them. The average TPS over 3 consecutive 
runs is the score assigned to each 'guy'.
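
To make the scoring step concrete, here is a minimal sketch in Python. It is 
not the code from the repository. The setting names and ranges in 
SEARCH_SPACE, the database name "bench" and the 60-second run length are all 
assumptions made up for illustration, and writing each configuration out and 
restarting the server is left out entirely. The only thing it relies on from 
pgbench is that its report contains a line starting with "tps = ".

```python
import random
import re
import statistics
import subprocess

# Hypothetical search space: illustrative keys and (low, high) bounds,
# not real postgresql.conf names and not the project's actual option list.
SEARCH_SPACE = {
    "shared_buffers_mb": (16, 1024),
    "work_mem_kb": (64, 65536),
    "checkpoint_timeout_s": (30, 3600),
}

def random_config(rng):
    """Draw one random individual: a value for every setting."""
    return {name: rng.randint(lo, hi) for name, (lo, hi) in SEARCH_SPACE.items()}

def random_population(size, rng):
    """Build the initial population of random configurations."""
    return [random_config(rng) for _ in range(size)]

# pgbench reports throughput on lines that start with "tps = <number>".
TPS_RE = re.compile(r"^tps = ([0-9.]+)", re.MULTILINE)

def parse_tps(pgbench_output):
    """Return the first TPS figure found in pgbench's report."""
    match = TPS_RE.search(pgbench_output)
    if match is None:
        raise ValueError("no 'tps = ...' line found in pgbench output")
    return float(match.group(1))

def run_pgbench(dbname="bench", seconds=60):
    """Run one timed pgbench pass (assumes the database is already initialized)."""
    result = subprocess.run(
        ["pgbench", "-T", str(seconds), dbname],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

def score(run=run_pgbench, runs=3):
    """Average TPS over several consecutive runs; this is the individual's score."""
    return statistics.mean(parse_tps(run()) for _ in range(runs))
```

Passing the runner in as an argument makes it possible to try the scoring 
logic with canned output, without a running server.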
 
Then I run a typical (I suppose) crossover operation and a slight mutation of 
each new chromosome to come up with a new population, and so on and so forth.
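
One way such a step can look is sketched below in Python. The operators here 
are assumptions chosen for illustration (uniform crossover, a small bounded 
random nudge as mutation, the top half of the population as parents and two 
elites carried over unchanged); the proof of concept may well do it 
differently. A chromosome is assumed to be a dict of numeric settings, and 
bounds maps each setting to a hypothetical (low, high) range.

```python
import random

def crossover(parent_a, parent_b, rng):
    """Uniform crossover: each setting is copied from one parent at random."""
    return {key: (parent_a[key] if rng.random() < 0.5 else parent_b[key])
            for key in parent_a}

def mutate(chromosome, bounds, rng, rate=0.1, scale=0.1):
    """With probability `rate`, nudge a setting by up to `scale` of its range."""
    child = dict(chromosome)
    for key, (lo, hi) in bounds.items():
        if rng.random() < rate:
            step = (hi - lo) * scale
            # Clamp so the mutated value stays inside the allowed range.
            child[key] = min(hi, max(lo, child[key] + rng.uniform(-step, step)))
    return child

def next_generation(population, scores, bounds, rng, elite=2):
    """Breed a new population of the same size from the best-scoring half."""
    ranked = [c for _, c in sorted(zip(scores, population),
                                   key=lambda pair: pair[0], reverse=True)]
    parents = ranked[:max(2, len(ranked) // 2)]
    # Carry the best individuals over unchanged so good scores are not lost.
    children = [dict(c) for c in ranked[:elite]]
    while len(children) < len(population):
        a, b = rng.sample(parents, 2)
        children.append(mutate(crossover(a, b, rng), bounds, rng))
    return children
```

Note that mutated values come out as floats here; integer settings would 
need rounding before being written to a configuration file.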
 
After running this for 2 days, I came to the conclusion that it does indeed 
seem to work, although the default pgbench 'test cases' do not really stress 
the database enough for it to generate diverse enough populations each time.
 
Also, ideally this sort of thing should be run on two or more different hosts: 
one (master) that just generates new configurations and saves, restores and 
manages the whole operation, and 'slave' host(s) that run the actual tests.
 
One benefit of that would be the fact that genetic algorithms are highly 
parallelizable: each configuration in a population can be scored 
independently of the others.
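
A rough Python sketch of how the coordinating host could farm out a population 
is shown below. Nothing like this exists in the proof of concept; 
evaluate_on_host is a hypothetical callable that would push one configuration 
to one test host, run the benchmark there and return the score. Each host 
gets only one configuration at a time, so benchmarks never compete for the 
same machine.

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Queue

def evaluate_population(population, hosts, evaluate_on_host):
    """Score every configuration, using each test host for one run at a time.

    `evaluate_on_host(host, config)` is a hypothetical callable supplied by
    the caller; it must return the score for `config` measured on `host`.
    """
    free_hosts = Queue()
    for host in hosts:
        free_hosts.put(host)

    def task(config):
        host = free_hosts.get()       # wait until some test host is idle
        try:
            return evaluate_on_host(host, config)
        finally:
            free_hosts.put(host)      # hand the host back for the next run

    # One thread per host; map() returns scores in population order.
    with ThreadPoolExecutor(max_workers=len(hosts)) as pool:
        return list(pool.map(task, population))
```

Threads are enough on the coordinating side because it only waits for remote 
results; the real work happens on the test hosts.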
 
I did reboot my machines a couple of times after the tests, to re-test the 
configuration files and see if the results were in fact repeatable (as much as 
they can be), and I have to say, to my surprise, they were. I.e. the 
configuration files with poor results were still obviously slower than the 
best ones.
 
I have included my sample results for everyone to see, including a nice 
spreadsheet with graphs (everyone loves graphs) showing the scores across all 
populations.
The tests were run on my Mac laptops (I don't have access to a bunch of 
servers that I can test things like that on for a couple of days, sorry).
 
The project, including a readme file, is available to look at: 
https://github.com/waniek/genpostgresql.conf
 
 
Things I know so far:
* I probably need to take more configuration options into account;
* pgbench with its default test case is not the right characterization suite 
for this exercise; I need something more lifelike. I suppose we all have some 
sort of characterization suite that could be used here;
* The code needs a lot of work if this were to be used professionally;
* Just restarting PostgreSQL with a different configuration file isn't really 
a fully proper way to test new configuration files, but it seems to work.
 
 
I don't expect much out of this; after all, this is just a proof of concept. 
But if there are people out there who think this could be in any way useful, 
please give us a shout. 
Also, if you know more about genetic algorithms than I do and can suggest 
improvements, let me know.
 
Lastly, I'm looking for some more sophisticated pgbench test cases that I 
could throw at it. I think in general pgbench as a project could use some more 
sophisticated benchmarks, included with the project for everyone to see. 
Perhaps they could even be used to run automated regression tests against git 
head.
 
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
