On Wed, Mar 25, 2015 at 1:00 PM, Feike Steenbergen <
feikesteenber...@gmail.com> wrote:

> On 25 March 2015 at 19:07, Jeff Janes <jeff.ja...@gmail.com> wrote:
>
> > Also, I doubt that that is the problem in the first place.  If you
> collect a
> > sample of 30,000 (which the default target size of 100 does), and the
> > frequency of the second most common is really 0.00307333 at the time you
> > sampled it, you would expect to find it 92 times in the sample. The
> chances
> > against actually finding 1 instead of around 92 due to sampling error are
> > astronomical.
>
> It may be that the distribution of values is very volatile; we hope
> the increased statistics target (from the default of 100 to 1000 for
> this column) and frequent autovacuum and autoanalyze runs help keep
> the estimates correct.
>
> It seems that it did find some other records (<> 'PRINTED'), as is
> demonstrated in the stats where there was only one value in the MCV
> list: the frequency was 0.996567 and the fraction of nulls was 0,
> therefore leaving 0.003+ for other values. But because none of them
> appeared in the MCV and MCF lists, they were all treated as equally
> frequent. They are certainly not.
>
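
As a quick sanity check of the sampling arithmetic quoted above (a sketch;
the figures come from the earlier message, and the second line uses a
Poisson approximation to the binomial tail):

    SELECT 30000 * 0.00307333 AS expected_hits;       -- ~92.2 expected in the sample
    SELECT exp(-92.2) * (1 + 92.2) AS p_at_most_one;  -- ~1e-38, astronomically small

So sampling error alone cannot plausibly turn ~92 expected occurrences
into 1.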

Now that I look back at the first post you made, it certainly looks like
the statistics target was set to 1 when that column was analyzed, not to
100.  But the reported statistics don't look quite consistent with that,
either.
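
If you want to confirm and bump the per-column target, something along
these lines should work (the column name "status" is a guess; substitute
the real one):

    -- attstattarget of -1 means "use default_statistics_target"
    SELECT attname, attstattarget
    FROM pg_attribute
    WHERE attrelid = 'print_list'::regclass AND attname = 'status';

    ALTER TABLE print_list ALTER COLUMN status SET STATISTICS 1000;
    ANALYZE print_list;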

What version of PostgreSQL are you running?  'select version();'

What do you get when you do "analyze verbose print_list"?

How can the avg_width be 4 when the vast majority of entries are 7
characters long?
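
It would also help to see the raw stats row for that column, e.g.
(again assuming the column is called "status"):

    SELECT attname, null_frac, avg_width, n_distinct,
           most_common_vals, most_common_freqs
    FROM pg_stats
    WHERE tablename = 'print_list' AND attname = 'status';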

Cheers,

Jeff
