On Tue, Oct 17, 2017 at 3:02 AM, Alvaro Herrera <alvhe...@alvh.no-ip.org> wrote:
> Yeah, me too.  If you see another way to fix the problem, let's discuss
> it.

I doubt that there is a better way.

> I think a possible way is to avoid considering that the relfrozenxid
> value computed by the caller is final.

While that alternative seems possible, it also seems riskier.

> One thing I didn't quite investigate is why this bug only shows up with
> multixacts so far.  Is it just because multixacts provide an easy way to
> reproduce it, and that there are other, more difficult ways to cause
> the same problem without involving multixacts?  If so, then the problem
> is likely present in 9.2 as well.

The obvious explanation (although not necessarily the correct one) is
that freezing didn't have a MultiXactIdGetUpdateXid() call in 9.2. The
way we pass both cutoff_xid and cutoff_multi down to
FreezeMultiXactId() might be involved in the data corruption that we
saw (the incorrect pruning / "failed to find parent tuple" problem).
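
To make the two-cutoff concern concrete, here is a toy sketch. It is
not PostgreSQL source and not what FreezeMultiXactId() actually does:
ToyMulti, toy_precedes() and toy_classify() are hypothetical names, and
the only thing borrowed from the real code is the wraparound-aware
comparison idiom. It only shows that a multi can be newer than
cutoff_multi while its update XID is already older than cutoff_xid, so
the two checks cannot stand in for each other.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: these names are hypothetical, not PostgreSQL's. */
typedef struct ToyMulti
{
    uint32_t multi_id;    /* the multixact ID stored in xmax */
    uint32_t update_xid;  /* updater member's XID, 0 if lockers only */
} ToyMulti;

typedef enum ToyVerdict
{
    TOY_KEEP,               /* neither cutoff applies */
    TOY_MULTI_TOO_OLD,      /* multi itself precedes cutoff_multi */
    TOY_UPDATE_XID_TOO_OLD  /* multi is recent, but its updater is not */
} ToyVerdict;

/* Wraparound-aware "a is older than b" on a 32-bit circular counter. */
static bool
toy_precedes(uint32_t a, uint32_t b)
{
    return (int32_t) (a - b) < 0;
}

/* Apply the two cutoffs independently, multi cutoff first. */
static ToyVerdict
toy_classify(ToyMulti m, uint32_t cutoff_xid, uint32_t cutoff_multi)
{
    if (toy_precedes(m.multi_id, cutoff_multi))
        return TOY_MULTI_TOO_OLD;
    if (m.update_xid != 0 && toy_precedes(m.update_xid, cutoff_xid))
        return TOY_UPDATE_XID_TOO_OLD;
    return TOY_KEEP;
}
```

With cutoff_xid = 1000 and cutoff_multi = 40, a multi with ID 50 and
update XID 900 lands in the third category: the multi survives the
multi cutoff, but its updater does not survive the XID cutoff.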

I might spend some time figuring this out later in the week. It's hard
to pin down, and I've only really started to learn about MultiXacts in
the past few months.

Peter Geoghegan

