On Sat, Jan 24, 2015 at 9:14 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> AlexK987 <alex.cue....@gmail.com> writes:
> > The documentation states that "The extent of analysis can be
> > controlled by adjusting the default_statistics_target configuration
> > variable". It looks like I can tell Postgres to create more
> > histograms with more bins, and more distinct values. This implicitly
> > means that Postgres will use a larger random subset to calculate
> > statistics.
>
> > However, this is not what I want. My data may be quite skewed, and I
> > want full control over the size of the sample. I want to explicitly
> > tell Postgres to analyze the whole table. How can I accomplish that?
>
> You can't, and you wouldn't want to if you could, because that would
> result in slurping the entire table into backend local memory. All the
> rows constituting the "random sample" are held in memory while doing
> the statistical calculations.
>
> In practice, the only stat that would be materially improved by taking
> enormously large samples would be the number-of-distinct-values
> estimate. There's already a way you can override ANALYZE's estimate of
> that number if you need to.

The accuracy of the list of most common values could also be improved a
lot by increasing the sample.

Cheers,

Jeff
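
P.S. For anyone looking for the concrete knobs, the per-column overrides
being discussed look roughly like the following (a minimal sketch;
"my_table", "my_col", and the numeric values are only placeholders):

    -- Raise the statistics target (sample detail) for one column,
    -- instead of changing default_statistics_target globally.
    ALTER TABLE my_table ALTER COLUMN my_col SET STATISTICS 1000;

    -- Override ANALYZE's number-of-distinct-values estimate for the
    -- column. A positive value is an absolute count; a value between
    -- -1 and 0 is a fraction of the row count (-1 means all rows are
    -- distinct).
    ALTER TABLE my_table ALTER COLUMN my_col SET (n_distinct = 5000);

    -- Re-analyze so the new settings are reflected in pg_stats.
    ANALYZE my_table;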