On Fri, Sep 25, 2015 at 8:51 AM, Marti Raudsepp <ma...@juffo.org> wrote:
> I've also been seeing lots of log messages saying "LOG: out of
> memory" on a server that's hosting development databases. I put off
> debugging this until now because it didn't seem to have any adverse
> effects on the system.
That's unfortunate.

> It's very frustrating when debugging aides cause further problems on a
> system. If the in-line compaction doesn't materialize (or it's decided
> not to backport it), I would propose instead to add a test to
> pgss_store() to avoid growing the file beyond MaxAlloc (or perhaps
> even a lower limit). Surely dropping some statistics is better than
> this pathology.

I heard a customer complaint today that seems similar. A Heroku
customer attempted a migration from MySQL to PostgreSQL in this
manner:

    mysqldump | psql

This at least worked well enough to cause problems for
pg_stat_statements (some queries were not rejected by the parser, I
suppose).

While I'm opposed to arbitrary limits for tools like
pg_stat_statements, I think the following defensive measure might be
useful on top of better OOM defenses: test the query text length
within pgss_store() when a pgssJumbleState is passed by the caller
(i.e. by the post-parse-analysis hook, not by the executor hook, which
has query costs to store). If the text is longer than, say, 10 *
Max(ASSUMED_MEDIAN_INIT, pgss->cur_median_usage), don't bother to
normalize the query text, or store anything at all; simply return.
(A rough sketch appears at the end of this mail.)

Any entry created at that point would only be a sticky entry (pending
the query actually completing execution without the entry being
evicted first), and it doesn't seem worth worrying about normalizing
very large query texts, which tend to be qualitatively similar to
utility statements from the user's perspective anyway. Besides, query
text normalization has always been done on a best-effort basis; it's
not uncommon for pg_stat_statements to fail to normalize query texts
today for obscure reasons.

This would avoid storing very large query texts again and again when
a very large DML statement repeatedly fails (e.g. a data integration
task that keeps running into duplicate violations) and is repeatedly
rerun with tweaks. Maybe there is a substantively distinct table in
each case, because a CREATE TABLE is performed as part of the same
transaction, so each failed query gets a new pg_stat_statements entry
and a new, large query text.

I should probably also assume that sticky entries have a generic
length (the existing pgss->mean_query_len) for the purposes of
accumulating query text length within entry_dealloc(). They should
not get to contribute to the mean query length, which throttles
garbage collection to prevent thrashing. (That accounting change is
also sketched below.) Anyone have an opinion on that?

--
Peter Geoghegan
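A rough sketch of the pgss_store() check referred to above, for
illustration only (not a patch): it assumes the 9.4-era
pg_stat_statements.c context inside pgss_store(), i.e. the pgss
pointer, the query and jstate arguments, and the Max() macro, with
the cutoff written exactly as suggested in the text:

    /*
     * Hypothetical early exit near the top of pgss_store(), applying only
     * when we were called from the post-parse-analysis hook (jstate is
     * passed): skip normalization, and skip creating any entry, for texts
     * far larger than a typical one.  The cutoff is the one suggested
     * above, purely for illustration.
     */
    if (jstate)
    {
        Size        query_len = strlen(query);

        if (query_len > 10 * Max(ASSUMED_MEDIAN_INIT,
                                 pgss->cur_median_usage))
            return;     /* don't normalize, don't store anything at all */
    }

The executor-hook call (where jstate is NULL) is unaffected, so such a
text would still be stored, unnormalized, if the statement actually
completes execution.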
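And a similarly rough sketch of the entry_dealloc() accounting idea,
assuming that function's existing scan loop (the usage-decay pass
that also accumulates a totlen running total used to set
pgss->mean_query_len); again an illustration rather than a patch:

    /*
     * Hypothetical tweak to entry_dealloc()'s scan over the hash table:
     * sticky entries (calls == 0) contribute the current generic length,
     * pgss->mean_query_len, to the running total instead of their actual
     * text length, so one giant never-executed text cannot drag the mean
     * upward and postpone garbage collection of the query text file.
     */
    while ((entry = hash_seq_search(&hash_seq)) != NULL)
    {
        entries[i++] = entry;

        if (entry->counters.calls == 0)
        {
            /* sticky entry: steeper usage decay, assumed-typical length */
            entry->counters.usage *= STICKY_DECREASE_FACTOR;
            totlen += pgss->mean_query_len + 1;
        }
        else
        {
            entry->counters.usage *= USAGE_DECREASE_FACTOR;
            totlen += entry->query_len + 1;
        }
    }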