Re: [HACKERS] Rework the way multixact truncations work

Robert Haas Wed, 09 Dec 2015 08:22:08 -0800

On Wed, Dec 9, 2015 at 10:41 AM, Andres Freund <[email protected]> wrote:
>> (I am glad you talked the author out of back-patching; otherwise,
>> 9.4.5 and 9.3.10 would have introduced a data loss bug.)
>
> Isn't that a bug in a, as far as we know, impossible scenario? Unless I
> miss something there's no known case where it's "expected" that
> find_multixact_start() fails after initially succeeding? Sure, it sucks
> that the bug survived review and that it was written in the first
> place. But it not showing up during testing isn't meaningful, given it's
> a should-never-happen scenario.


If I correctly understand the scenario that you are describing, that
does happen - not for the same MXID, but for different ones.  At least
the last time I checked, and I'm not sure if we've fixed this, it
could happen because the SLRU page that contains the multixact wasn't
flushed out of the SLRU buffers yet.  But apart from that, it could
happen any time there's a gap in the sequence of files, and that sure
doesn't seem like a can't-happen situation.  We know that, on 9.3,
there's definitely a sequence of events that leads to a 0000 file
followed by a gap followed by the series of files that are still live.
Given the number of other bugs we've fixed in this area, I would not
like to bet on that being the only scenario where this crops up.  It
*shouldn't* happen, and as far as we know, if you start and end on a
version newer than 4f627f8 and aa29c1c, it won't.  Older branches,
though, I wouldn't like to bet on.
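
To make the "0000 file followed by a gap" layout concrete, here is a
minimal sketch of how one could spot such a gap in a listing of SLRU
segment file names. The function name and the sample listing are
hypothetical, and this is not how find_multixact_start() works. It only
assumes that segment files are named with hexadecimal numbers, and it
ignores wraparound of the segment number space.

```python
def find_segment_gaps(names):
    """Return (prev, next) pairs of adjacent segment numbers that are not consecutive.

    Assumption: each name is a hexadecimal segment number such as "0014".
    Wraparound of the segment number space is deliberately not handled.
    """
    segs = sorted(int(n, 16) for n in names)
    return [(a, b) for a, b in zip(segs, segs[1:]) if b - a > 1]


# Hypothetical directory listing: a leftover 0000 file, then a gap,
# then the series of segments that are still live.
listing = ["0000", "0014", "0015", "0016"]
gaps = find_segment_gaps(listing)
```

Any non-empty result means a lookup for a multixact that falls inside
the missing range would find no file to read, even though lookups on
either side of the gap succeed.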

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


