The pending contrib/pg_stat_statements patch has an interesting method for dealing with the limited size of its shared-memory hash table (which has one entry per unique combination of statement text, userid, and database id, so it's not exactly likely to stay small). When the hash table fills up, it enumerates all the hash entries, sorts them by "usage", decrements the usage counts, and then removes the entries with the lowest usage values. It does all this while holding an exclusive lock on the hash table.

This means that every so often, your entire database just freezes up for N milliseconds while that cleanup runs --- no query will be able to get through ExecutorEnd until the work is done. I don't think this is a property that will go over well with the sort of DBA who would have a use for this module; unpredictable response times are exactly what he's hoping to avoid.
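For scale, here is a minimal standalone sketch of that eviction pass (this is not the patch's actual code --- the entry layout, the comparator, and the evict-half fraction are assumptions for illustration; the point is simply that it is O(n log n) work performed while every other backend is locked out of the table):

#include <stdint.h>
#include <stdlib.h>

typedef struct Entry
{
    int64_t usage;              /* bumped on each hit */
    /* ... statement key and counters would live here ... */
} Entry;

static int
usage_cmp_desc(const void *a, const void *b)
{
    int64_t ua = ((const Entry *) a)->usage;
    int64_t ub = ((const Entry *) b)->usage;

    return (ua < ub) - (ua > ub);   /* descending: most-used first */
}

/* Caller is assumed to hold the table's exclusive lock throughout. */
static size_t
evict_least_used(Entry *entries, size_t n)
{
    size_t  keep = n - n / 2;

    qsort(entries, n, sizeof(Entry), usage_cmp_desc);   /* O(n log n) */
    for (size_t i = 0; i < keep; i++)
    {
        if (entries[i].usage > 0)
            entries[i].usage--;     /* decay the survivors' counts */
    }
    return keep;                    /* entries[keep..n-1] are the victims */
}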
A couple of other possibilities that seem a bit saner:

1. Use a self-organizing list: any time an entry is referenced, move it to the front, and when you need a new entry, take the oldest one off the back. I don't see a way to do that without a global lock protecting the list links, but the lock could be a spinlock that's held only long enough to manipulate the links.

2. Use a clock sweep algorithm similar to bufmgr's.

Either of these trades off accuracy in deciding which existing cache entries are "least interesting" in order to reduce the maintenance overhead --- but it doesn't appear to me that the code implements usage counts in a way that would justify treating them as sacrosanct indicators of relative usefulness anyhow. The first option seems attractively simple and predictable in performance --- all operations are O(1). Rough sketches of both ideas follow.
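A minimal sketch of the first idea (illustrative C only, not proposed patch code: a pthread mutex stands in for the spinlock, and the Entry/LruList names are invented for the example):

#include <pthread.h>
#include <stddef.h>

typedef struct Entry
{
    struct Entry *prev;
    struct Entry *next;
    /* ... hash key and counters would live here ... */
} Entry;

typedef struct LruList
{
    Entry          *head;       /* most recently used */
    Entry          *tail;       /* least recently used: next victim */
    pthread_mutex_t lock;       /* stand-in for the spinlock */
} LruList;

static void
unlink_entry(LruList *list, Entry *e)
{
    if (e->prev)
        e->prev->next = e->next;
    else
        list->head = e->next;
    if (e->next)
        e->next->prev = e->prev;
    else
        list->tail = e->prev;
}

/* On every reference: O(1) move-to-front under a brief lock. */
static void
touch_entry(LruList *list, Entry *e)
{
    pthread_mutex_lock(&list->lock);
    if (list->head != e)
    {
        unlink_entry(list, e);
        e->prev = NULL;
        e->next = list->head;
        list->head->prev = e;
        list->head = e;
    }
    pthread_mutex_unlock(&list->lock);
}

/* When a new entry is needed: O(1) pop of the coldest entry. */
static Entry *
take_victim(LruList *list)
{
    Entry  *victim;

    pthread_mutex_lock(&list->lock);
    victim = list->tail;
    if (victim)
        unlink_entry(list, victim);
    pthread_mutex_unlock(&list->lock);
    return victim;              /* NULL only if the list is empty */
}

And a correspondingly rough sketch of the clock sweep (again illustrative, not patch code; the usage cap of 5 mirrors bufmgr's BM_MAX_USAGE_COUNT but is just an assumption here, and locking is omitted for brevity):

#define MAX_USAGE 5

typedef struct ClockEntry
{
    int     usage;              /* saturating count bumped on each hit */
    /* ... hash key and counters would live here ... */
} ClockEntry;

static int  clock_hand = 0;

/* On every reference: just a counter bump, no list manipulation. */
static void
clock_touch(ClockEntry *e)
{
    if (e->usage < MAX_USAGE)
        e->usage++;
}

/* Advance the hand, decaying counts, until a cold entry turns up. */
static int
clock_victim(ClockEntry *entries, int n)
{
    for (;;)
    {
        ClockEntry *e = &entries[clock_hand];

        clock_hand = (clock_hand + 1) % n;
        if (e->usage == 0)
            return (int) (e - entries); /* found a victim */
        e->usage--;                     /* give it another lap */
    }
}

Comments?

			regards, tom lane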