Re: [HACKERS] Stats sender and 2pc minor problem

2016-10-14 Thread Alvaro Herrera
Tom Lane wrote:
> Stas Kelvich writes:
> > The statistics sender logic for an ordinary commit and for a two-phase
> > commit does not strictly match, which leads to delta_live_tuples being
> > added to n_live_tup when a truncate happens in a two-phase commit.
> 
> Yeah, that code says it's supposed to match AtEOXact_PgStat, but it
> doesn't.

Hmm, oops.

> I pushed this, but without the regression test case, which would have
> failed outright in any test run with max_prepared_transactions = 0.

I agree that that was the right approach.  Thanks for taking care of it!

> I wonder if we could make that better by making the stats collector
> track stats by relfilenode rather than table OID.  It'd be a pretty
> major logic change, though, to serve a corner case.

Hm, that's an idea.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Stats sender and 2pc minor problem

2016-10-13 Thread Tom Lane
Stas Kelvich writes:
> The statistics sender logic for an ordinary commit and for a two-phase
> commit does not strictly match, which leads to delta_live_tuples being
> added to n_live_tup when a truncate happens in a two-phase commit.

Yeah, that code says it's supposed to match AtEOXact_PgStat, but it
doesn't.

I pushed this, but without the regression test case, which would have
failed outright in any test run with max_prepared_transactions = 0.
Timing sensitivity is another problem.  In the commit that created this
discrepancy, d42358efb, Alvaro had tried to add regression coverage for
this area, but we ended up backing it out because it failed too often
in the buildfarm.

TBH, now that I look at it, I think that d42358efb was fundamentally
wrong and this patch is just continuing down the same wrong path.
Having the stats collector respond to a TRUNCATE like this cannot
work reliably, because the "it got truncated" flag will arrive at
the stats collector asynchronously, perhaps quite some time later
than the truncate occurred.  When that happens, we may throw away
live/dead tuple count updates from transactions that actually happened
after the truncate but chanced to report first.
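
To make the ordering hazard concrete, here is a toy model in Python. It is
not PostgreSQL code: the class, the message fields and the OID value are
all made up for illustration. The only thing it takes from the description
above is that the collector keeps one counter per table OID and resets it
whenever a message carrying the "it got truncated" flag arrives.

```python
# Toy model, not PostgreSQL code: one live-tuple counter per table OID,
# with messages applied strictly in arrival order.

class ToyCollector:
    def __init__(self):
        self.n_live_tup = {}

    def receive(self, table_oid, delta_live, truncated=False):
        if truncated:
            # "It got truncated": forget whatever was counted so far.
            self.n_live_tup[table_oid] = 0
        self.n_live_tup[table_oid] = self.n_live_tup.get(table_oid, 0) + delta_live


def replay(messages):
    """Feed messages to a fresh collector and return its counters."""
    collector = ToyCollector()
    for msg in messages:
        collector.receive(**msg)
    return collector.n_live_tup


TABLE = 16384  # hypothetical table OID

load = {"table_oid": TABLE, "delta_live": 100}            # rows present before the truncate
truncate_report = {"table_oid": TABLE, "delta_live": 0, "truncated": True}
insert_report = {"table_oid": TABLE, "delta_live": 5}     # rows inserted after the truncate

in_order = replay([load, truncate_report, insert_report])
late_flag = replay([load, insert_report, truncate_report])

print(in_order[TABLE])   # 5: matches the table's real contents
print(late_flag[TABLE])  # 0: the 5 post-truncate rows were thrown away
```

In the second replay the insert really happened after the truncate, but its
report reached the collector first, so the late-arriving flag wiped it out.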

I wonder if we could make that better by making the stats collector
track stats by relfilenode rather than table OID.  It'd be a pretty
major logic change, though, to serve a corner case.
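
As a rough sketch of why keying by relfilenode might help, consider the toy
model below. It is one possible reading of the idea, not a design from this
thread: the class and the relfilenode numbers are hypothetical, and it
assumes only that TRUNCATE gives the table a new relfilenode, so reports
from before and after the truncate land in different slots and no reset
flag is needed.

```python
# Toy model, not PostgreSQL code: counters keyed by relfilenode instead
# of table OID. A truncate is modelled simply as a switch to a new key.
from collections import defaultdict


class RelfilenodeCollector:
    def __init__(self):
        self.n_live_tup = defaultdict(int)

    def receive(self, relfilenode, delta_live):
        # No "truncated" flag: each report only touches its own relfilenode.
        self.n_live_tup[relfilenode] += delta_live


def replay_by_node(messages):
    """Feed (relfilenode, delta_live) pairs to a fresh collector."""
    collector = RelfilenodeCollector()
    for relfilenode, delta_live in messages:
        collector.receive(relfilenode, delta_live)
    return collector.n_live_tup


OLD_NODE = 16384  # hypothetical relfilenode before the truncate
NEW_NODE = 16390  # hypothetical relfilenode assigned by the truncate

pre_truncate_report = (OLD_NODE, 100)  # rows counted before the truncate
post_truncate_report = (NEW_NODE, 5)   # rows inserted after the truncate

arrived_in_order = replay_by_node([pre_truncate_report, post_truncate_report])
arrived_reversed = replay_by_node([post_truncate_report, pre_truncate_report])

# Either arrival order yields the same count for the current relfilenode.
print(arrived_in_order[NEW_NODE], arrived_reversed[NEW_NODE])  # 5 5
```

The sketch leaves out everything that would make this a major logic change
in practice, such as mapping a table to its current relfilenode and
discarding counters for relfilenodes that no longer exist.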

regards, tom lane

