On Mon, Sep 25, 2017 at 4:39 AM, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:
> > > On 09/06/2017 09:45 AM, Haribabu Kommi wrote: > > > > > > On Tue, Jul 25, 2017 at 9:33 PM, Tomas Vondra > > <tomas.von...@2ndquadrant.com <mailto:tomas.von...@2ndquadrant.com>> > wrote: > > > > On 7/25/17 12:55 AM, Tom Lane wrote: > > > > Tomas Vondra <tomas.von...@2ndquadrant.com > > <mailto:tomas.von...@2ndquadrant.com>> writes: > > > > It seems to me that VACUUM and ANALYZE somewhat disagree on > what > > exactly reltuples means. VACUUM seems to be thinking that > > reltuples > > = live + dead while ANALYZE apparently believes that > reltuples = > > live > > > > > > The question is - which of the reltuples definitions is the > > right > > one? I've always assumed that "reltuples = live + dead" but > > perhaps > > not? > > > > > > I think the planner basically assumes that reltuples is the live > > tuple count, so maybe we'd better change VACUUM to get in step. > > > > > > Attached is a patch that (I think) does just that. The disagreement > > was caused by VACUUM treating recently dead tuples as live, while > > ANALYZE treats both of those as dead. > > > > At first I was worried that this will negatively affect plans in the > > long-running transaction, as it will get underestimates (due to > > reltuples not including rows it can see). But that's a problem we > > already have anyway, you just need to run ANALYZE in the other > session. > > > > > > Thanks for the patch. > > From the mail, I understand that this patch tries to improve the > > reltuples value update in the catalog table by the vacuum command > > to consider the proper visible tuples similar like analyze command. > > > > -num_tuples); > > +num_tuples - nkeep); > > > > With the above correction, there is a problem in reporting the number > > of live tuples to the stats. > > > > postgres=# select reltuples, n_live_tup, n_dead_tup > > from pg_stat_user_tables join pg_class using (relname) > > where relname = 't'; > > reltuples | n_live_tup | n_dead_tup > > -----------+------------+------------ > > 899818 | 799636 | 100182 > > (1 row) > > > > > > The live tuples data value is again decremented with dead tuples > > value before sending them to stats in function lazy_vacuum_rel(), > > > > /* report results to the stats collector, too */ > > new_live_tuples = new_rel_tuples - vacrelstats->new_dead_tuples; > > > > The fix needs a correction here also. Or change the correction in > > lazy_vacuum_rel() function itself before updating catalog table similar > > like stats. > > > > Ah, haven't noticed that for some reason - you're right, we estimate the > reltuples based on (num_tuples - nkeep), so it doesn't make much sense > to subtract nkeep again. Attached is v2 of the fix. > > I've removed the subtraction from lazy_vacuum_rel(), leaving just > > new_live_tuples = new_rel_tuples; > > and now it behaves as expected (no second subtraction). That means we > can get rid of new_live_tuples altogether (and the protection against > negative values), and use new_rel_tuples directly. > > Which pretty much means that in this case > > (pg_class.reltuples == pg_stat_user_tables.n_live_tup) > > but I guess that's fine, based on the initial discussion in this thread. The changes are fine and now it reports proper live tuples in both pg_class and stats. The other issue of continuous vacuum operation leading to decrease of number of live tuples is not related to this patch and that can be handled separately. I changed the patch status as ready for committer. Regards, Hari Babu Fujitsu Australia