On Sat, Nov 25, 2017 at 10:34 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> If we could get rid of the copy-to-a-temporary-file technology for
> transferring the stats collector's data to backends, then this problem
> would probably vanish or at least get a lot less severe. But that seems
> like a nontrivial project. With the infrastructure we have today, we
> could probably keep the stats tables in a DSM segment; but how would
> a backend get a consistent snapshot of them?
I suppose the obvious approach is to have a big lock around the
statistics data proper; this could be taken in shared mode to take a
snapshot or in exclusive mode to update statistics. In addition, create
one or more queues where statistics messages can be enqueued in lieu of
updating the main statistics data directly.

If that doesn't perform well enough, you could keep two copies of the
statistics, A and B. At any given time, one copy is quiescent and the
other copy is being updated. Periodically, at a time when we know that
nobody is taking a snapshot of the statistics, they reverse roles.

Of course, the other obvious question is whether we really need a
consistent snapshot, because that's bound to be pretty expensive even if
you eliminate the I/O cost. Taking a consistent snapshot across all
100,000 tables in the database even if we're only ever going to access 5
of those tables doesn't seem like a good or scalable design.
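
For what it's worth, here is a minimal sketch of what the two-copy
scheme might look like. It is illustrative only: it uses a POSIX rwlock
and plain structs where a real patch would use LWLocks and a DSM
segment, and all of the names (StatsShared, stats_snapshot, and so on)
are made up for the example rather than anything that exists in the
tree. Readers snapshot the quiescent copy under the lock in shared
mode; the collector applies queued updates to the other copy without
the lock, then takes the lock exclusively to reverse the roles at a
moment when no snapshot is in progress.

#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define N_TABLES 8				/* stand-in for the per-table stats array */

typedef struct TableStats
{
	uint64_t	n_tuples_inserted;
	uint64_t	n_tuples_updated;
} TableStats;

typedef struct StatsShared
{
	pthread_rwlock_t lock;		/* guards "active" and the active copy */
	int			active;			/* which copy readers may snapshot: 0 or 1 */
	TableStats	copies[2][N_TABLES];
} StatsShared;

static void
stats_init(StatsShared *ss)
{
	memset(ss, 0, sizeof(*ss));
	pthread_rwlock_init(&ss->lock, NULL);
}

/* Backend: copy out a consistent snapshot of the quiescent copy. */
static void
stats_snapshot(StatsShared *ss, TableStats out[N_TABLES])
{
	pthread_rwlock_rdlock(&ss->lock);
	memcpy(out, ss->copies[ss->active], sizeof(ss->copies[0]));
	pthread_rwlock_unlock(&ss->lock);
}

/*
 * Collector: apply one queued message to the inactive copy.  No lock is
 * needed here as long as a single collector process is the only writer
 * of the inactive copy and the only one who ever flips "active".
 */
static void
stats_apply(StatsShared *ss, int table, uint64_t ins, uint64_t upd)
{
	TableStats *t = &ss->copies[1 - ss->active][table];

	t->n_tuples_inserted += ins;
	t->n_tuples_updated += upd;
}

/* Collector: publish the freshly updated copy, then resync the other. */
static void
stats_swap(StatsShared *ss)
{
	pthread_rwlock_wrlock(&ss->lock);
	ss->active = 1 - ss->active;

	/*
	 * Bring the now-inactive copy up to date so future updates apply to
	 * current data; a real implementation might replay the message queue
	 * instead of copying wholesale.
	 */
	memcpy(ss->copies[1 - ss->active], ss->copies[ss->active],
		   sizeof(ss->copies[0]));
	pthread_rwlock_unlock(&ss->lock);
}

Note that the exclusive acquisition in stats_swap is exactly the "time
when we know that nobody is taking a snapshot": any backend mid-snapshot
holds the lock in shared mode, so the swap simply waits for it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company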