Re: [HACKERS] Rework the way multixact truncations work

Andres Freund Thu, 03 Dec 2015 10:05:49 -0800

On 2015-12-03 04:38:45 -0500, Noah Misch wrote:
> On Wed, Dec 02, 2015 at 04:46:26PM +0100, Andres Freund wrote:
> > Especially if reverting and redoing includes conflicts that mainly
> > increases the chance of accidental bugs.
> 
> True.  (That doesn't apply to these patches.)


Uh, it does. You had conflicts in your process, and it's hard to verify
that the re-applied patch is actually functionally identical given the
volume of changes. It's much easier to see what actually changes by
looking at iterative commits forward from the current state.


Sorry, but I really just want to see these changes as iterative patches
ontop of 9.5/HEAD instead of this process. I won't revert the reversion
if you push it anyway, but I think it's a rather bad idea.


> > > git remote add community git://git.postgresql.org/git/postgresql.git
> > > git remote add nmisch_github https://github.com/nmisch/postgresql.git
> > > git fetch --multiple community nmisch_github
> > > git diff community/master...nmisch_github/mxt4-rm-legacy
> > 
> > That's a nearly useless diff, because it includes a lot of other changes
> > (218 files changed, 2828 insertions(+), 8742 deletions(-)) made since
> > you published the changes.
> 
> Perhaps you used "git diff a..b", not "git diff a...b".

Ah yes. Neat, didn't know that one.

> > > @@ -2190,7 +2194,8 @@ MultiXactSetNextMXact(MultiXactId nextMulti,
> > > - /*
> > > -  * Computing the actual limits is only possible once the data directory 
> > > is
> > > -  * in a consistent state. There's no need to compute the limits while
> > > -  * still replaying WAL - no decisions about new multis are made even
> > > -  * though multixact creations might be replayed. So we'll only do 
> > > further
> > > -  * checks after TrimMultiXact() has been called.
> > > -  */
> > > + /* Before the TrimMultiXact() call at end of recovery, skip the rest. */
> > >   if (!MultiXactState->finishedStartup)
> > >           return;
> > > -
> > >   Assert(!InRecovery);
> > > 
> > > - /* Set limits for offset vacuum. */
> > > + /*
> > > +  * Setting MultiXactState->oldestOffset entails a find_multixact_start()
> > > +  * call, which is only possible once the data directory is in a 
> > > consistent
> > > +  * state. There's no need for an offset limit while still replaying WAL;
> > > +  * no decisions about new multis are made even though multixact 
> > > creations
> > > +  * might be replayed.
> > > +  */
> > >   needs_offset_vacuum = SetOffsetVacuumLimit();
> > 
> > I don't really see the benefit of this change. The previous location of
> > the comment is where we return early, so it seems appropriate to
> > document the reason there?
> 
> I made that low-importance change for two reasons.  First, returning at that
> point skips more than just the setting a limit; it also skips autovacuum
> signalling and wraparound warnings.  Second, the function has just computed
> mxid "actual limits", so branch mxt2-cosmetic made the comment specify that we
> defer an offset limit, not any and all limits.

My question was more about the comment being after the "early return"
than about the content change, should have made that clearer. Can we
just move your comment up?

> > >  static bool
> > >  SetOffsetVacuumLimit(void)
> > >  {
> > > - MultiXactId oldestMultiXactId;
> > > + MultiXactId     oldestMultiXactId;
> > >   MultiXactId nextMXact;
> > > - MultiXactOffset oldestOffset = 0;       /* placate compiler */
> > > - MultiXactOffset prevOldestOffset;
> > > + MultiXactOffset oldestOffset = 0;               /* placate compiler */
> > >   MultiXactOffset nextOffset;
> > >   bool            oldestOffsetKnown = false;
> > > + MultiXactOffset prevOldestOffset;
> > >   bool            prevOldestOffsetKnown;
> > > - MultiXactOffset offsetStopLimit = 0;
> > 
> > I don't see the benefit of the order changes here.
> 
> I reacted the same way.  Commit 4f627f8 reordered some declarations, which I
> reverted when I refinished that commit as branch mxt3-main.

But the other changes are there, and in the history anyway. As the new
order isn't more meaningful than the current one...

> > > -         if (oldestOffsetKnown)
> > > -                 ereport(DEBUG1,
> > > -                                 (errmsg("oldest MultiXactId member is 
> > > at offset %u",
> > > -                                                 oldestOffset)));
> > 
> > That's imo a rather useful debug message.
> 
> The branches emit that message at the same times 4f627f8^ and earlier emit it.

During testing I found it rather helpful if it was emitted regularly.

> > >   LWLockRelease(MultiXactTruncationLock);
> > > 
> > >   /*
> > > -  * If we can, compute limits (and install them MultiXactState) to 
> > > prevent
> > > -  * overrun of old data in the members SLRU area. We can only do so if 
> > > the
> > > -  * oldest offset is known though.
> > > +  * There's no need to update anything if we don't know the oldest offset
> > > +  * or if it hasn't changed.
> > >    */
> > 
> > Is that really a worthwhile optimization?
> 
> I would neither remove that longstanding optimization nor add it from scratch
> today.  Branch commit 06c9979 restored it as part of a larger restoration to
> the pre-4f627f8 structure of SetOffsetVacuumLimit().

There DetermineSafeOldestOffset() did it unconditionally.

> > > -SlruScanDirCbFindEarliest(SlruCtl ctl, char *filename, int segpage, void 
> > > *data)
> > 
> > That really seems like an independent change, deserving its own commit +
> > explanation.
> 
> Indeed.  I explained that change at length in
> http://www.postgresql.org/message-id/20151108085407.ga1097...@tornado.leadboat.com,
> including that it's alone on a branch (mxt1-disk-independent), to become its
> own PostgreSQL commit.

The comment there doesn't include the explanation...

> > > [branch commit 89a7232]
> > 
> > I don't think that's really a good idea - this way we restart after
> > every single page write, whereas currently we only restart after passing
> > through all pages once. In nearly all cases we'll only ever have to
> > retry once in total, be because such old pages aren't usually going to
> > be reread/redirtied.
> 
> Your improvement sounds fine, then.  Would both SimpleLruTruncate() and
> SlruDeleteSegment() benefit from it?

It probably makes sense to do it in SimpleLruTruncate too - but it does
additional checks as part of the restarts which aren't applicable for
DeleteSegment(), which is IIRC why I didn't also change it.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Rework the way multixact truncations work

Reply via email to