Andres Freund wrote on 2018-03-01:
> I think the patch probably doesn't apply anymore, due to other changes
> to pg_stat_statements since its posting. Could you refresh?

pgss_plans_v02.patch applies cleanly to master; there have been no changes to pg_stat_statements since the copyright updates at the beginning of January. (pgss_plans_v02.patch is attached to message 1bd396a9-4573-55ad-7ce8-fe7adffa1...@uni-muenster.de and can be found in the current commitfest as well.)

> I've not done any sort of review. Scrolling through I noticed //
> comments which aren't pg coding style.

I'll fix that along with any other problems that might be found in a review.

> I'd like to see a small benchmark showing the overhead of the feature.
> Both in runtime and storage size.

I've tried to gather some meaningful results; however, either my testing methodology was flawed (the variance between my pgbench passes was rather high) or the takeaway is that the feature generates only little overhead.

This is what I ran on my workstation, which has a Ryzen 1700, 16GB of RAM and an old Samsung 840 Evo as boot drive, which also held the database:

The database used for the tests was dropped and pgbench initialized anew for each test (pgss off, pgss on, pgss on with plan collection) using a scaling of 16437704*0.003~=50 (roughly what the phoronix test suite uses for a buffer test). Also similar to the phoronix test suite, I used 8 jobs and 32 connections for a normal multithreaded load.

I then ran 10 passes, each for 60 seconds, with a 30 second pause between them, as well as another test which ran for 10 minutes.

With pg_stat_statements on, the latter test (10 minutes) resulted in 1833 tps, while the patched version resulted in 1700 tps, so a little over 7% overhead? Well, the "control run" without pg_stat_statements delivered only 1806 tps, so variance seems to be quite high.

The results of the ten successive tests, each running 60 seconds and then waiting for 30 seconds, are displayed in the attached plot.

I've tinkered with different pgbench settings for quite some time now, and all I can come up with are runs with high variance between them.
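
For anyone who wants to redo the arithmetic on the 10-minute figures quoted above, here is a minimal Python sketch. It only restates the calculation; the helper name `relative_drop` is made up for illustration, and the three tps values are simply the ones reported in this mail.

```python
def relative_drop(baseline_tps: float, candidate_tps: float) -> float:
    """Throughput drop of candidate relative to baseline, in percent.

    A negative result means the candidate was faster than the baseline.
    (Hypothetical helper, not part of pgbench or pg_stat_statements.)
    """
    return (baseline_tps - candidate_tps) / baseline_tps * 100.0


# tps figures from the 10-minute runs reported above
control_tps = 1806.0   # pg_stat_statements not loaded
pgss_tps = 1833.0      # unmodified pg_stat_statements
patched_tps = 1700.0   # pg_stat_statements with plan collection

print(f"patched vs. pgss:    {relative_drop(pgss_tps, patched_tps):.2f}%")
print(f"patched vs. control: {relative_drop(control_tps, patched_tps):.2f}%")
print(f"pgss vs. control:    {relative_drop(control_tps, pgss_tps):.2f}%")
```

The last comparison comes out negative, i.e. the run without pg_stat_statements was slower than the run with it, which is another way of seeing that single runs are dominated by noise here.
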
If anybody has any recommendations for a setup that generates less variance, I'll try this again.

Finally, the more interesting metric regarding this patch is the size of the pg_stat_statements.stat file, which stores all the metrics while the database is shut down. I reckon that the size of pgss_query_texts.stat (which holds only the query strings and plan strings while the database is running) will be similar; however, it might fluctuate more, as new strings are simply appended to the file until the garbage collector decides that it has to be cleaned up.

After running the aforementioned tests, the file was 8566 bytes in size for pgss in its unmodified form, while the tests resulted in 32607 bytes for the pgss that collects plans as well. This seems reasonable, as plan strings are usually longer than the statements from which they result. Worst case, pg_stat_statements.stat holds two plans for each type of statement.

I've not tested the size of the file with different encodings, such as JSON, YAML, or XML; however, I do not expect any hugely different results.

Greetings
Julian