On Sat, Dec 7, 2013 at 9:26 AM, Fujii Masao <masao.fu...@gmail.com> wrote: > The patch doesn't apply cleanly against the master due to recent change of > pg_stat_statements. Could you update the patch?
Revision is attached, including changes recently discussed on other thread [1] to allow avoiding sending query text where that's not desirable and increase in default pg_stat_statements.max to 5000. I've found myself tweaking things a bit more here and there. I've added additional clarification around why we do an in-place garbage collection (i.e. why we don't just write a new file and atomically rename -- grep "over-engineer" to find what I mean). I also fully documented the pg_stat_statements function. If we want external tool authors to use it to avoid sending query texts, they have to know about it in the first place. I now use the pg_stat_tmp directory for my temp file (so pg_stat_statements behaves more like the statistics collector). The stats_temp_directory GUC is not used, for reasons that a patch comment explains. I've fuzz-tested this throughout with the "make installcheck-parallel" regression tests. I found it useful to run that in one terminal, while running pgbench with multiple clients in another. The pgbench script looks something like: """ select * from pg_stat_statements; select pg_stat_statements_reset(); """ pg_stat_statements_reset() was temporarily rigged to perform a garbage collection (and not a reset) for the benefit of this stress-test. The function pg_stat_statements(), the garbage collection function and pgss_store() will complain loudly if they notice anything out of the ordinary. I was not able to demonstrate any problems through this technique (though I think it focused my thinking on the right areas during development). Clearly, missed subtleties around managing the external file while locking in order to ensure correct behavior (that no session can observe something that it should or should not) will be something that a committer will want to pay particular attention to. I go to some lengths to to avoid doing this with only a shared lock. FYI, without hacking things up, it's quite hard to make external query text garbage collection occur - need_gc_qtexts() will almost always say no. It will only automatically happen once with pg_stat_statements.max set to 1000 when the standard regression tests are run. This is a really extreme workload in terms of causing pg_stat_statements churn, which shows just how unlikely garbage collection is in the real world. Still, it ought to be totally robust when it happens. In passing, I did this: /* ! * Rename file into place, to atomically commit to serializing. */ Because at this juncture, there ought to be no file to replace (not since Magnus had pg_stat_statements not keep the main global file around for as long as the server is running, in the style of the statistics collector). Thanks [1] http://www.postgresql.org/message-id/cam3swzsvff-vbnc8sc-0cpwzxvqh49reybchb95t95fcrgs...@mail.gmail.com -- Peter Geoghegan
pg_stat_statements_ext_text.v5.2013_12_09.patch.gz
Description: GNU Zip compressed data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers