On 12/23/19, 6:09 PM, "Kyotaro Horiguchi" <horikyota....@gmail.com> wrote: > You're right. That doesn't seem to work. Another thing I had in my > mind was that XLogWrite had an additional flag so that > AdvanceXLInsertBuffer can tell not to mark .ready. The function is > called while it *is* writing a complete record. So even if > AdvanceXLInsertBuffer inhibit marking .ready the succeeding bytes > comes soon and we can mark the old segment as .ready at the time. > > .. > + * If record_write == false, we don't mark the last segment as .ready > + * if the caller requested to write up to segment boundary. > .. > static void > - XLogWrite(XLogwrtRqst WriteRqst, bool flexible) > + XLogWrite(XLogwrtRqst WriteRqst, bool flexible, bool record_write) > > When XLogWrite is called with record_write = false, we don't mark > .ready and don't advance lastSegSwitchTime/LSN. At the next time > XLogWrite is called with record_write=true, if lastSegSwitchLSN is > behind the latest segment boundary before or equal to > LogwrtResult.Write, mark the skipped segments as .ready and update > lastSegSwitchTime/LSN.
Thanks for the suggestion. I explored this proposal a bit today. It looks like there are three places where XLogWrite() is called: AdvanceXLInsertBuffer(), XLogFlush(), and XLogBackgroundFlush(). IIUC while XLogFlush() generally seems to be used to write complete records to disk, this might not be true for XLogBackgroundFlush(), and we're reasonably sure we cannot make such an assumption for AdvanceXLInsertBuffer(). Therefore, we would likely only set record_write to true for XLogFlush() and for certain calls to XLogBackgroundFlush (e.g. flushing asynchronous commits). I'm worried that this approach could be fragile and that we could end up waiting an arbitrarily long time before marking segments as ready for archival. Even if we pay very close attention to the latest flushed LSN, it seems possible that a non-record_write call to XLogWrite() advances things such that we avoid ever calling it with record_write = true. For example, XLogBackgroundFlush() may have flushed the completed blocks, which we cannot assume are complete records. Then, XLogFlush() would skip calling XLogWrite() if LogwrtResult.Flush is sufficiently far ahead. In this scenario, I don't think we would mark any eligible segments as ".ready" until the next call to XLogWrite() with record_write = true, which may never happen. The next approach I'm going to try is having the callers of XLogWrite() manage marking segments ready. That might make it easier to mitigate some of my concerns above, but I'm not tremendously optimistic that this approach will fare any better. Nathan