Gavin Flower wrote:
> The standard deviation (sd) is proportional to the square root of
> the number in the sample in a Normal Distribution.
>
> In a Normal Distribution, about 2/3 the values will be within plus
> or minus one sd of the mean.
>
> There seems to be an implicit assumption that the distribution of
> values follows the Normal Distribution - has this been verified?

The whole problem here is precisely to determine what the data distribution is. One side of it is how to represent it for the planner (which we do by storing the number of distinct values, a list of most common values (MCVs) with their respective frequencies, and a histogram representing the values not in the MCV list); the other side is how to figure out what data to put in the MCV list and histogram (i.e. what to compute during ANALYZE). If we knew the distribution was normal, we wouldn't need any of these things -- we'd just store the mean and standard deviation.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
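
To make the representation described above concrete, here is a rough Python sketch of a distribution-free column summary: a distinct-value count, an MCV list with frequencies, and equi-depth histogram bounds over the values left out of the MCV list. This is only an illustration and not what ANALYZE actually does; the function name `summarize_column`, its parameters, and the "appears more than once" rule for picking MCVs are all assumptions made up for the example.

```python
from collections import Counter


def summarize_column(sample, num_mcv=3, num_buckets=4):
    """Toy summary of a sampled column (hypothetical helper, not PostgreSQL code).

    Returns the number of distinct values, a list of (value, frequency)
    pairs for the most common values, and equi-depth histogram bounds
    computed over the values that are not in the MCV list.
    """
    n = len(sample)
    counts = Counter(sample)
    n_distinct = len(counts)

    # Assumed rule for this sketch: a value qualifies as "most common"
    # only if it occurs more than once in the sample.
    mcv = [(v, c / n) for v, c in counts.most_common(num_mcv) if c > 1]
    mcv_values = {v for v, _ in mcv}

    # The histogram covers only the values excluded from the MCV list.
    rest = sorted(v for v in sample if v not in mcv_values)
    bounds = []
    if len(rest) >= 2:
        # Pick bounds at evenly spaced ranks, so each bucket holds
        # roughly the same number of sampled values.
        last = len(rest) - 1
        bounds = [rest[i * last // num_buckets] for i in range(num_buckets + 1)]

    return {"n_distinct": n_distinct, "mcv": mcv, "histogram_bounds": bounds}


# A skewed sample: two heavy values plus a thin tail of unique ones.
sample = [1] * 50 + [2] * 30 + list(range(10, 30))
stats = summarize_column(sample)
print(stats)
```

Nothing in this summary assumes a particular shape for the data, which is the point: a skewed or multi-modal column is described just as well as a bell-shaped one, whereas a mean and standard deviation would only be enough if the column were known to be normal.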