On 01.03.2018 06:23, David Gould wrote:
> In theory the sample pages analyze uses should represent the whole table
> fairly well. We rely on this to generate pg_statistic and it is a key
> input to the planner. Why should we not believe in it as much only for
> reltuples? If the analyze sampling does not work, the fix would be to improve
> that, not to disregard it piecemeal.

Well, that sounds reasonable. But the problem with the moving average calculation remains. Suppose you run vacuum but not analyze. If the updates are random enough, vacuum won't be able to reclaim all the pages, so the number of pages will grow. Again we end up in the same situation: the number of pages grows, the real number of live tuples stays constant, and the estimated reltuples grows after each vacuum run.
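
To put rough numbers on it (assuming the new density is weighted by the fraction of pages scanned): say the table has 1,000,000 live tuples in 10,000 pages, bloat grows it to 10,500 pages, and vacuum scans 20% of them. The scanned pages show the true density of about 95 tuples per page, but the average keeps 80% of the old density of 100, so we end up with about 99 tuples per page, and 99 times 10,500 pages is roughly 1,040,000 tuples. The next run then starts from that inflated figure and an even larger page count.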

I did some more calculations on paper to try to understand this. If we average reltuples directly, instead of averaging tuple density, it converges like it should. The error in the density calculation seems to be that we effectively multiply the old density by the new number of pages. I'm not sure why we work with tuple density at all. We could just estimate the number of tuples from analyze/vacuum and then apply a moving average to that. The calculations would be shorter, too.
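
To illustrate, here is a toy C simulation of the vacuum-only scenario above, continuing with the same made-up numbers (one million live tuples, 5% page growth per cycle, 20% of pages scanned). The density formula is my simplified reading of vac_estimate_reltuples; the "direct" variant applies the same moving average to the tuple count instead of the density:

/*
 * Toy simulation: live tuples stay constant at 1,000,000 while the table
 * bloats by 5% of its pages per cycle, and each vacuum scans 20% of the
 * pages (all numbers made up).  "density-based" follows a simplified
 * reading of vac_estimate_reltuples; "direct" applies the same moving
 * average to the tuple count itself.
 */
#include <stdio.h>
#include <math.h>

int
main(void)
{
	double		live_tuples = 1000000.0;	/* true count, never changes */
	double		total_pages = 10000.0;
	double		est_density_based = live_tuples;
	double		est_direct = live_tuples;
	int			run;

	for (run = 1; run <= 10; run++)
	{
		double		old_pages = total_pages;
		double		scanned_pages,
					scanned_tuples,
					multiplier,
					old_density,
					new_density,
					updated_density;

		total_pages *= 1.05;	/* bloat: pages grow, live tuples don't */
		scanned_pages = 0.2 * total_pages;
		scanned_tuples = live_tuples / total_pages * scanned_pages;
		multiplier = scanned_pages / total_pages;

		/* current approach: moving average over tuple density */
		old_density = est_density_based / old_pages;
		new_density = scanned_tuples / scanned_pages;
		updated_density = old_density + (new_density - old_density) * multiplier;
		est_density_based = floor(updated_density * total_pages + 0.5);

		/* alternative: moving average over the tuple count itself */
		est_direct += (new_density * total_pages - est_direct) * multiplier;

		printf("run %2d: pages %6.0f  density-based %8.0f  direct %8.0f\n",
			   run, total_pages, est_density_based, est_direct);
	}

	return 0;
}

In this run the density-based estimate climbs past 1.2 million after ten cycles, even though the table never holds more than a million live tuples, while the direct average stays at the true count because each scan here sees the true density.

What do you think?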

--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

