2013/11/24 Peter Geoghegan <p...@heroku.com>

> On Sun, Nov 24, 2013 at 1:18 AM, Pavel Stehule <pavel.steh...@gmail.com>
> wrote:
> > I got a compilation warning:
>
> I'll look into it.
>
> > * I tried do some basic benchmark, and I didn't see any negative on
> > performance related to implemented feature
>
> You're never going to see any performance impact with something like a
> regular pgbench workload. That's because you'll only ever write query
> texts out when a shared lock is held. With only extreme exceptions (a
> garbage collection), exceptions will never be exercised by what you're
> doing, you will only block whole new entries from being added -
> existing stat counters are just protected by a shared lock + spinlock.
>
> You might consider rigging pg_stat_statements to create a new hash
> value randomly (consisting of a random integer for a queryid "hash")
> maybe 1% - 10% of the time. That would simulate the cache being filled
> quickly, I guess, where that shared lock will conflict with the
> exclusive lock, potentially showing where what I've done can hurt. I
> recommend in the interest of fairness not letting it get so far as to
> put the cache under continual pressure.
>
> Now, this isn't that important a case, because having a larger hash
> table makes exclusive locking/new entries less necessary, and this
> work enables people to have larger hash tables. But it does matter to
> some degree.

How is the size of the hash table related to exclusive locks in
pgss_store? I'm not worried about the performance of
pg_stat_statements() itself; I'm worried about the performance of
creating a new entry and appending its text to the file.
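Peter's suggested experiment - randomizing the queryid "hash" for some
percentage of executions - can be modeled outside PostgreSQL. The sketch
below is not pg_stat_statements code; all names are hypothetical, and a
plain array stands in for the shared hash table. It only illustrates the
point that a steady workload creates almost no new entries (each of which
would need the exclusive lock), while even 10% random queryids create
many:

```c
/* Hypothetical model of hash-table churn in a pg_stat_statements-like
 * cache.  HASH_SIZE and the workload shape are made-up numbers; the
 * point is only to count "new entry" events, each of which would need
 * an exclusive lock in the real extension. */
#include <stdlib.h>

#define HASH_SIZE 100

/* Run n simulated query executions; churn_pct percent of them get a
 * random queryid (as in Peter's rigging suggestion), the rest cycle
 * through a small steady workload of 10 distinct queries.  Returns the
 * number of new-entry events. */
static int
count_new_entries(int n, int churn_pct, unsigned int seed)
{
    char seen[HASH_SIZE] = {0};
    int  new_entries = 0;

    srand(seed);
    for (int i = 0; i < n; i++)
    {
        int queryid = (rand() % 100 < churn_pct)
            ? rand() % HASH_SIZE    /* simulated random queryid "hash" */
            : i % 10;               /* steady pgbench-like workload */

        if (!seen[queryid])
        {
            seen[queryid] = 1;
            new_entries++;          /* would take the exclusive lock */
        }
    }
    return new_entries;
}
```

With 0% churn the steady workload only ever creates 10 entries, so the
exclusive lock is essentially never taken after warm-up; with 10% churn
the table fills quickly, which is exactly the regime where the external
query-text file could hurt.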
I didn't expect a slowdown - the benchmark was "only" a verification
that there is no hidden cost. It would not be difficult to write a
synthetic benchmark that shows some difference, but the result of such
a benchmark would probably be useless, since it would bear little
relation to reality.

> > * We would to this patch - a idea of proposal is right - a shared
> > memory can be used better than storage of possibly extra long queries
>
> Right. Plus the amount of storage used is pretty modest, even compared
> to previous shared memory use. Each entry doesn't need to have
> track_activity_query_size bytes of storage, only what it needs (though
> garbage can accrue, which is a concern).
>
> > Peter does some warning about performance in feature proposal
> > http://www.postgresql.org/message-id/cam3swzryynfwxi3r3edakwboytaf1_pwgjxtayddnsbjafd...@mail.gmail.com
>
> I'm mostly talking about the cost of the shared lock for *reading*
> here, when pg_stat_statements() is called. If that was happening at
> the same time as I/O for reading the query texts from file, that could
> be a concern. Not enough of a concern to worry about humans doing
> this, I think, but maybe for scripts.
>
> Maybe the trick would be to copy the stats into shared memory, and
> only then read texts from disk (with the shared lock released). We'd
> have to be concerned about a garbage collection occurring, so we'd
> need to re-acquire a shared lock again (plus a spinlock) to check that
> didn't happen (which is generally very unlikely). Only when we know it
> didn't happen could we display costs to the user, and that might mean
> keeping all the query texts in memory, and that might matter in
> extreme situations. The reason I haven't done all this already is
> because it's far from clear that it's worth the trouble.

I don't expect pg_stat_statements to be on an application's critical
path, so I prefer the mostly simple design.

Peter mentioned a race condition under high load. Is it fixed?
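The copy-then-read-then-recheck idea Peter describes above amounts to an
optimistic read with a generation counter. The following is only a
single-threaded sketch of that pattern, not the extension's actual code:
gc_count is a hypothetical stand-in for "a garbage collection happened",
the lock calls appear only as comments, and the injected gc simulates a
concurrent writer:

```c
/* Sketch of "copy the stats under the shared lock, read texts from
 * disk with the lock released, then re-acquire and verify no garbage
 * collection invalidated what we copied".  All names are hypothetical. */
#include <string.h>

static int gc_count = 0;        /* bumped by each (simulated) gc */

static void
simulate_gc(void)
{
    gc_count++;
}

/* Read query texts without holding the shared lock during file I/O,
 * retrying if a gc ran in between.  gc_after_copy injects one
 * "concurrent" gc for testing.  Returns the number of retries. */
static int
read_texts(char *buf, size_t buflen, int gc_after_copy)
{
    int retries = -1;
    int before;

    do
    {
        retries++;
        /* LWLockAcquire(shared); SpinLockAcquire(mutex); */
        before = gc_count;      /* remember the gc generation */
        /* ...copy the fixed-size counters here... */
        /* SpinLockRelease(mutex); LWLockRelease(shared); */

        /* Lock released: do the possibly slow file I/O. */
        strncpy(buf, "SELECT ?", buflen - 1);
        buf[buflen - 1] = '\0';

        if (gc_after_copy)      /* inject one concurrent gc */
        {
            simulate_gc();
            gc_after_copy = 0;
        }

        /* Re-acquire shared lock + spinlock and recheck. */
    } while (gc_count != before);

    return retries;
}
```

A retry is very unlikely in practice, as the quoted text notes - garbage
collections are rare - so the common case pays only the extra lock
acquisition, not a second pass over the file.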
> Yes, the version you mentioned had the fix.

ok

Regards

Pavel

> --
> Peter Geoghegan