On Tue, Oct 22, 2013 at 2:56 AM, Dimitri Fontaine <dimi...@2ndquadrant.fr> wrote:
> Tom Lane <t...@sss.pgh.pa.us> writes:
>> Hm. It's been a long time since college statistics, but doesn't the
>> entire concept of standard deviation depend on the assumption that the
>> underlying distribution is more-or-less normal (Gaussian)? Is there a
>
> I just had a quick chat with a statistician friends of mine on that
> topic, and it seems that the only way to make sense of an average is if
> you know already the distribution.
>
> In our case, what I keep experiencing with tuning queries is that we
> have like 99% of them running under acceptable threshold and 1% of them
> taking more and more time.

Agreed. In a lot of Heroku's performance work, Perc99 and Perc95 have provided a lot more value than stddev, although stddev is a lot better than nothing and probably easier to implement. There are apparently high-quality statistical approximations of these percentiles that are not expensive to compute and are small in memory representation.

That said, I'd take stddev over nothing for sure.

Handily for stddev, given snapshots of count(x), sum(x), sum(x**2) (which I understand to be the components of stddev), I think one can compute stddevs across different time spans using auxiliary tools that sample this triplet on occasion. That's a handy property that I'm not sure percN approximations can provide as easily.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
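
To illustrate the count(x), sum(x), sum(x**2) idea from the message above, here is a minimal Python sketch of how an external tool might turn two sampled snapshots into a stddev for the interval between them. The names Snapshot and stddev_between are hypothetical, made up for this illustration; nothing here is an existing PostgreSQL interface. It assumes the three counters are cumulative and only ever grow, and it computes the population standard deviation.

```python
import math
from typing import NamedTuple, Optional


class Snapshot(NamedTuple):
    """Cumulative counters sampled at one point in time (hypothetical layout)."""
    count: int       # count(x): number of observations so far
    total: float     # sum(x): running sum of the observed values
    total_sq: float  # sum(x**2): running sum of the squared values


def stddev_between(earlier: Snapshot, later: Snapshot) -> Optional[float]:
    """Population stddev of the values observed between two snapshots.

    Returns None if no new observations arrived in the interval.
    """
    n = later.count - earlier.count
    if n <= 0:
        return None
    s = later.total - earlier.total
    sq = later.total_sq - earlier.total_sq
    mean = s / n
    # E[x^2] - E[x]^2; clamp at zero because floating-point rounding
    # can push the difference slightly negative.
    variance = max(sq / n - mean * mean, 0.0)
    return math.sqrt(variance)


# Example: four values (1, 2, 3, 4) were seen before the first sample,
# then three more (10, 20, 30) before the second sample.
first = Snapshot(count=4, total=10.0, total_sq=30.0)
second = Snapshot(count=7, total=70.0, total_sq=1430.0)
interval_stddev = stddev_between(first, second)  # stddev of 10, 20, 30 only
```

Because each counter is a plain running sum, subtracting two samples isolates exactly the observations made in between, which is what makes the triplet composable across arbitrary time spans. One caveat: the sum-of-squares formula can lose precision when the mean is large relative to the spread, so a real implementation may want to guard against that.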