On 6 April 2016 at 13:27, Andres Freund <[email protected]> wrote:
> On 2016-04-06 13:11:40 +0100, Simon Riggs wrote:
> > On 6 April 2016 at 10:09, Andres Freund <[email protected]> wrote:
> > > On 2016-04-06 10:04:42 +0100, Simon Riggs wrote:
> > > The issue there is that we continue to issue checkpoints if the only
> > > activity since the last checkpoint was emitting a standby
> > > snapshot. That's because:
> > >
> >
> > I agree this is the current situation in 9.4 and 9.5, hence the bug
> report.
> >
> > This no longer occurs with the patches I have proposed. The snapshot is
> > skipped, so a further checkpoint never triggers.
>
> Not if there's a longrunning/idle transaction.
>
> Note that skipping the snapshot is actually a *problem* in some
> cases. As I've brought up upthread, to which you never replied. A
> xl_running_xacts->xcnt == 0/!overflowed snapshot can be very important
> for hot standby, because it allows ProcArrayApplyRecoveryInfo() to
> switch to INITIALIZED state:
>
I replied by posting a patch to address your concern, how is that non-reply?
> > > > What issue is that? Previously you said it must not skip it at all
> for
> > > > logical.
> > >
> > > It's fine to skip the records iff nothing important has happened since
> > > the last time a snapshot has been logged. Again, the proposed approach
> > > allowed to detect that.
>
> > Agreed, both proposals do that.
>
> No, the currently committed patch, even including your proposed followup
> patch, doesn't do that. As you just commented about as quoted above. The
> code currently reads like:
>
> if (wal_level < WAL_LEVEL_LOGICAL)
> {
> LWLockRelease(ProcArrayLock);
>
> /*
> * Don't bother to log anything if nothing is happening,
> if we are
> * using archive_timeout > 0 and we didn't overflow
> snapshot last time.
> *
> * This ensures that we don't issue an empty WAL record,
> which can
> * be annoying when used in conjunction with archive
> timeout.
> */
> if (running->xcnt == 0 &&
> nlocks == 0 &&
> XLogArchiveTimeout > 0 &&
> !last_snapshot_overflowed)
> {
> LWLockRelease(XidGenLock);
> return InvalidXLogRecPtr;
> }
>
> last_snapshot_overflowed = running->subxid_overflow;
> }
>
> This obviously doesn't apply to WAL_LEVEL_LOGICAL as is (the if). And it
> also obviously repeats to log the same snapshot in a system where the
> state hasn't changed, but where running->xcnt != 0 or nlocks != 0.
My understanding from your previous comments was that it would be incorrect
to do that.
> > > > > We now log more WAL with
> > > > > XLogArchiveTimeout > 0 than without.
> > >
> > > > And the problem with that is what?
> > >
> > > That an idle system unnecessarily produces WAL? Waking up disks and
> > > everything?
> > >
> >
> > The OP discussed a problem with archive_timeout > 0. Are you saying we
> > should put in a fix that applies whatever the setting of archive_timeout?
>
> Yes. We should strive to fix the full scope of an issue; not just the
> part complained about.
>
I believe that's what I did. You didn't mention that point earlier, nor do
I now think it important.
> You seem to ignore valid points made against your approach, apparently
> because you dismiss them as "emotive language". Stop.
>
Not true. I have listened to everything you've said and been patient with
the high number of mistakes in your replies.
Calling something a "bandaid" is not a "valid point" against it.
--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services