Re: [HACKERS] Broken O(n^2) avoidance in wal segment recycling.
On Fri, Jun 23, 2017 at 3:25 PM, Michael Paquier wrote: > On Thu, Jun 22, 2017 at 6:10 AM, Andres Freund wrote: >> We've not heard any complaints about this afaik, but it's not something >> that's easily diagnosable as being a problem. Therefore I suspect we >> should fix and backpatch this? > > Agreed. I have just poked at this problem and have finished with the > attached. Logically it is not complicated at the argument values used > by the callers of RemoveXlogFile() are never updated when scanning > pg_wal. Surely this patch needs an extra pair of eyes. Andres has pointed me out offline that a bad copy-paste re-introduced fb886c15. So it is important to properly track the segment name of the switch point as well as the the segment from where the recycling begins. There was as well a second mistake in RemoveOldXlogFiles() causing unnecessary segments to be kept caused by the same copy-pasto. -- Michael diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 0a6314a642..fd35cd311c 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -878,7 +878,8 @@ static int emode_for_corrupt_record(int emode, XLogRecPtr RecPtr); static void XLogFileClose(void); static void PreallocXlogFiles(XLogRecPtr endptr); static void RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr PriorRedoPtr, XLogRecPtr endptr); -static void RemoveXlogFile(const char *segname, XLogRecPtr PriorRedoPtr, XLogRecPtr endptr); +static void RemoveXlogFile(const char *segname, XLogSegNo *endlogSegNo, + XLogSegNo recycleSegNo); static void UpdateLastRemovedPtr(char *filename); static void ValidateXLOGDirectoryStructure(void); static void CleanupBackupHistory(void); @@ -3829,10 +3830,18 @@ UpdateLastRemovedPtr(char *filename) static void RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr PriorRedoPtr, XLogRecPtr endptr) { + XLogSegNo endlogSegNo; + XLogSegNo recycleSegNo; DIR *xldir; struct dirent *xlde; char lastoff[MAXFNAMELEN]; + /* + * Initialize info about where to try to recycle to. + */ + XLByteToSeg(endptr, endlogSegNo); + recycleSegNo = XLOGfileslop(PriorRedoPtr); + xldir = AllocateDir(XLOGDIR); if (xldir == NULL) ereport(ERROR, @@ -3875,7 +3884,7 @@ RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr PriorRedoPtr, XLogRecPtr endptr) /* Update the last removed location in shared memory first */ UpdateLastRemovedPtr(xlde->d_name); -RemoveXlogFile(xlde->d_name, PriorRedoPtr, endptr); +RemoveXlogFile(xlde->d_name, &endlogSegNo, recycleSegNo); } } } @@ -3905,8 +3914,16 @@ RemoveNonParentXlogFiles(XLogRecPtr switchpoint, TimeLineID newTLI) struct dirent *xlde; char switchseg[MAXFNAMELEN]; XLogSegNo endLogSegNo; + XLogSegNo switchLogSegNo; + XLogSegNo recycleSegNo; - XLByteToPrevSeg(switchpoint, endLogSegNo); + /* + * Initialize info about where to begin the work. This will recycle + * arbitrarily 10 segments. + */ + XLByteToPrevSeg(switchpoint, switchLogSegNo); + XLByteToSeg(switchpoint, endLogSegNo); + recycleSegNo = endLogSegNo + 10; xldir = AllocateDir(XLOGDIR); if (xldir == NULL) @@ -3918,7 +3935,7 @@ RemoveNonParentXlogFiles(XLogRecPtr switchpoint, TimeLineID newTLI) /* * Construct a filename of the last segment to be kept. */ - XLogFileName(switchseg, newTLI, endLogSegNo); + XLogFileName(switchseg, newTLI, switchLogSegNo); elog(DEBUG2, "attempting to remove WAL segments newer than log file %s", switchseg); @@ -3944,7 +3961,7 @@ RemoveNonParentXlogFiles(XLogRecPtr switchpoint, TimeLineID newTLI) * - but seems safer to let them be archived and removed later. */ if (!XLogArchiveIsReady(xlde->d_name)) -RemoveXlogFile(xlde->d_name, InvalidXLogRecPtr, switchpoint); +RemoveXlogFile(xlde->d_name, &endLogSegNo, recycleSegNo); } } @@ -3954,31 +3971,22 @@ RemoveNonParentXlogFiles(XLogRecPtr switchpoint, TimeLineID newTLI) /* * Recycle or remove a log file that's no longer needed. * - * endptr is current (or recent) end of xlog, and PriorRedoRecPtr is the - * redo pointer of the previous checkpoint. These are used to determine - * whether we want to recycle rather than delete no-longer-wanted log files. - * If PriorRedoRecPtr is not known, pass invalid, and the function will - * recycle, somewhat arbitrarily, 10 future segments. + * segname is the name of the segment to recycle or remove. endlogSegNo + * is the segment number of the current (or recent) end of WAL. recycleSegNo + * is the segment number to recycle up to. + * + * endlogSegNo gets incremented if the segment is recycled so as it is not + * checked again with future callers of this function. */ static void -RemoveXlogFile(const char *segname, XLogRecPtr PriorRedoPtr, XLogRecPtr endptr) +RemoveXlogFile(const char *segname, XLogSegNo *endlogSegNo, + XLogSegNo recycleSegNo) { char path[MAXPGPATH]; #ifdef WIN32 char newpath[MAXPGPATH]; #endif struct stat statbuf; - XLogSegNo endlogS
Re: [HACKERS] Broken O(n^2) avoidance in wal segment recycling.
On Thu, Jun 22, 2017 at 6:10 AM, Andres Freund wrote: > We've not heard any complaints about this afaik, but it's not something > that's easily diagnosable as being a problem. Therefore I suspect we > should fix and backpatch this? Agreed. I have just poked at this problem and have finished with the attached. Logically it is not complicated at the argument values used by the callers of RemoveXlogFile() are never updated when scanning pg_wal. Surely this patch needs an extra pair of eyes. -- Michael diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 0a6314a642..d594401b6e 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -878,7 +878,8 @@ static int emode_for_corrupt_record(int emode, XLogRecPtr RecPtr); static void XLogFileClose(void); static void PreallocXlogFiles(XLogRecPtr endptr); static void RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr PriorRedoPtr, XLogRecPtr endptr); -static void RemoveXlogFile(const char *segname, XLogRecPtr PriorRedoPtr, XLogRecPtr endptr); +static void RemoveXlogFile(const char *segname, XLogSegNo *endlogSegNo, + XLogSegNo recycleSegNo); static void UpdateLastRemovedPtr(char *filename); static void ValidateXLOGDirectoryStructure(void); static void CleanupBackupHistory(void); @@ -3829,10 +3830,18 @@ UpdateLastRemovedPtr(char *filename) static void RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr PriorRedoPtr, XLogRecPtr endptr) { + XLogSegNo endlogSegNo; + XLogSegNo recycleSegNo; DIR *xldir; struct dirent *xlde; char lastoff[MAXFNAMELEN]; + /* + * Initialize info about where to try to recycle to. + */ + XLByteToPrevSeg(endptr, endlogSegNo); + recycleSegNo = XLOGfileslop(PriorRedoPtr); + xldir = AllocateDir(XLOGDIR); if (xldir == NULL) ereport(ERROR, @@ -3875,7 +3884,7 @@ RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr PriorRedoPtr, XLogRecPtr endptr) /* Update the last removed location in shared memory first */ UpdateLastRemovedPtr(xlde->d_name); -RemoveXlogFile(xlde->d_name, PriorRedoPtr, endptr); +RemoveXlogFile(xlde->d_name, &endlogSegNo, recycleSegNo); } } } @@ -3905,8 +3914,14 @@ RemoveNonParentXlogFiles(XLogRecPtr switchpoint, TimeLineID newTLI) struct dirent *xlde; char switchseg[MAXFNAMELEN]; XLogSegNo endLogSegNo; + XLogSegNo recycleSegNo; + /* + * Initialize info about where to begin the work. This will recycle + * arbitrarily 10 segments. + */ XLByteToPrevSeg(switchpoint, endLogSegNo); + recycleSegNo = endLogSegNo + 10; xldir = AllocateDir(XLOGDIR); if (xldir == NULL) @@ -3944,7 +3959,7 @@ RemoveNonParentXlogFiles(XLogRecPtr switchpoint, TimeLineID newTLI) * - but seems safer to let them be archived and removed later. */ if (!XLogArchiveIsReady(xlde->d_name)) -RemoveXlogFile(xlde->d_name, InvalidXLogRecPtr, switchpoint); +RemoveXlogFile(xlde->d_name, &endLogSegNo, recycleSegNo); } } @@ -3954,31 +3969,22 @@ RemoveNonParentXlogFiles(XLogRecPtr switchpoint, TimeLineID newTLI) /* * Recycle or remove a log file that's no longer needed. * - * endptr is current (or recent) end of xlog, and PriorRedoRecPtr is the - * redo pointer of the previous checkpoint. These are used to determine - * whether we want to recycle rather than delete no-longer-wanted log files. - * If PriorRedoRecPtr is not known, pass invalid, and the function will - * recycle, somewhat arbitrarily, 10 future segments. + * segname is the name of the segment to recycle or remove. endlogSegNo + * is the segment number of the current (or recent) end of WAL. recycleSegNo + * is the segment number to recycle up to. + * + * endlogSegNo gets incremented if the segment is recycled so as it is not + * checked again with future callers of this function. */ static void -RemoveXlogFile(const char *segname, XLogRecPtr PriorRedoPtr, XLogRecPtr endptr) +RemoveXlogFile(const char *segname, XLogSegNo *endlogSegNo, + XLogSegNo recycleSegNo) { char path[MAXPGPATH]; #ifdef WIN32 char newpath[MAXPGPATH]; #endif struct stat statbuf; - XLogSegNo endlogSegNo; - XLogSegNo recycleSegNo; - - /* - * Initialize info about where to try to recycle to. - */ - XLByteToSeg(endptr, endlogSegNo); - if (PriorRedoPtr == InvalidXLogRecPtr) - recycleSegNo = endlogSegNo + 10; - else - recycleSegNo = XLOGfileslop(PriorRedoPtr); snprintf(path, MAXPGPATH, XLOGDIR "/%s", segname); @@ -3987,9 +3993,9 @@ RemoveXlogFile(const char *segname, XLogRecPtr PriorRedoPtr, XLogRecPtr endptr) * segment. Only recycle normal files, pg_standby for example can create * symbolic links pointing to a separate archive directory. */ - if (endlogSegNo <= recycleSegNo && + if (*endlogSegNo <= recycleSegNo && lstat(path, &statbuf) == 0 && S_ISREG(statbuf.st_mode) && - InstallXLogFileSegment(&endlogSegNo, path, + InstallXLogFileSegment(endlogSegNo, path, true, recycleSegNo, true)) { ereport(DEBUG2, @@ -3997,7 +4003,7 @@ Re
[HACKERS] Broken O(n^2) avoidance in wal segment recycling.
Hi, Author: Heikki Linnakangas Branch: master Release: REL9_5_BR [b2a5545bd] 2015-04-13 16:53:49 +0300 Branch: REL9_4_STABLE Release: REL9_4_2 [d72792d02] 2015-04-13 17:22:21 +0300 Branch: REL9_3_STABLE Release: REL9_3_7 [a800267e4] 2015-04-13 17:22:35 +0300 Branch: REL9_2_STABLE Release: REL9_2_11 [cc2939f44] 2015-04-13 17:26:59 +0300 Branch: REL9_1_STABLE Release: REL9_1_16 [ad2925e20] 2015-04-13 17:26:49 +0300 Branch: REL9_0_STABLE Release: REL9_0_20 [5b6938186] 2015-04-13 17:26:35 +0300 Don't archive bogus recycled or preallocated files after timeline switch. Moved xlog file deletion from RemoveOldXlogFiles() into its own RemoveXlogFile() routine, because it introduced a new function also deleting/recycling segments. It did so moving + /* Needn't recheck that slot on future iterations */ + endlogSegNo++; into the new routine. But it's useless there, because it's just a stack variable, which is going to be freshly computed with + XLByteToPrevSeg(endptr, endlogSegNo); on the next call. That logic was introduced in commit 61b861421b0b849a0dffe36238b8e504624831c1 Author: Tom Lane Date: 2005-04-15 18:48:10 + Modify MoveOfflineLogs/InstallXLogFileSegment to avoid O(N^2) behavior when recycling a large number of xlog segments during checkpoint. The former behavior searched from the same start point each time, requiring O(checkpoint_segments^2) stat() calls to relocate all the segments. Instead keep track of where we stopped last time through. but was neutered by the commit above. We've not heard any complaints about this afaik, but it's not something that's easily diagnosable as being a problem. Therefore I suspect we should fix and backpatch this? Heikki? Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers