On Sat, Jan 24, 2015 at 9:14 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:

> AlexK987 <alex.cue....@gmail.com> writes:
> > The documentation states that "The extent of analysis can be controlled
> > by adjusting the default_statistics_target configuration variable". It
> > looks like I can tell Postgres to create more histograms with more bins,
> > and more distinct values. This implicitly means that Postgres will use a
> > larger random subset to calculate statistics.
>
> > However, this is not what I want. My data may be quite skewed, and I want
> > full control over the size of the sample. I want to explicitly tell
> > Postgres to analyze the whole table. How can I accomplish that?
>
> You can't, and you wouldn't want to if you could, because that would
> result in slurping the entire table into backend local memory.  All
> the rows constituting the "random sample" are held in memory while
> doing the statistical calculations.
>
> In practice, the only stat that would be materially improved by taking
> enormously large samples would be the number-of-distinct-values estimate.
> There's already a way you can override ANALYZE's estimate of that number
> if you need to.
>
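
For the archives: I believe the override Tom is referring to is the
per-column n_distinct attribute option. A minimal sketch, with hypothetical
table and column names:

    -- Make ANALYZE record exactly 50000 distinct values for my_table.my_col,
    -- regardless of what it would estimate from its sample.
    ALTER TABLE my_table ALTER COLUMN my_col SET (n_distinct = 50000);

    -- A negative value is a fraction of the row count, so -0.2 means
    -- "distinct values = 20% of the rows"; it scales as the table grows.
    ALTER TABLE my_table ALTER COLUMN my_col SET (n_distinct = -0.2);

    ANALYZE my_table;  -- the override takes effect at the next ANALYZE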

The accuracy of the most-common-values list could also be improved a lot
by increasing the sample size.
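
If you want to go that route without raising default_statistics_target
globally, you can bump the statistics target on just the skewed column; the
number below is only illustrative (10000 is the current maximum):

    -- Allow up to 10000 MCV entries / histogram buckets for this column.
    -- ANALYZE samples roughly 300 * target rows, so this also enlarges
    -- the sample those statistics are computed from.
    ALTER TABLE my_table ALTER COLUMN my_col SET STATISTICS 10000;
    ANALYZE my_table;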

Cheers,

Jeff
