I'd like confirmation on how WAL files are named. I'm trying to write a tool that can tell me when a WAL file is missing from the sequence. I initially thought that the file names were monotonically incrementing hexadecimal numbers. This doesn't appear to be the case.

    00000001000001B7000000FD
    00000001000001B7000000FE
    (there seem to be a whole bunch of missing filenames in the sequence here)
    00000001000001B800000000
    00000001000001B800000001

This pattern repeats. I hunted through the code and discovered the following in src/include/access/xlog_internal.h.

    #define XLogFilePath(path, tli, log, seg) \
        snprintf(path, MAXPGPATH, XLOGDIR "/%08X%08X%08X", tli, log, seg)

So, the names are not a single hexadecimal number, but instead three of them concatenated together. This macro is used eight times in src/backend/access/xlog.c. It seems clear that the first number, tli, is a TimeLineID. I wasn't completely clear on the behavior of log and seg until I found the following, also in xlog_internal.h.

    #define NextLogSeg(logId, logSeg) \
        do { \
            if ((logSeg) >= XLogSegsPerFile-1) \
            { \
                (logId)++; \
                (logSeg) = 0; \
            } \
            else \
                (logSeg)++; \
        } while (0)

So, clearly log simply increments, and seg increments until it reaches XLogSegsPerFile - 1, at which point seg wraps to 0 and log is incremented. Again, xlog_internal.h knows what XLogSegsPerFile is.

    /*
     * We break each logical log file (xlogid value) into segment files of the
     * size indicated by XLOG_SEG_SIZE. One possible segment at the end of each
     * log file is wasted, to ensure that we don't have problems representing
     * last-byte-position-plus-1.
     */
    #define XLogSegSize ((uint32) XLOG_SEG_SIZE)
    #define XLogSegsPerFile (((uint32) 0xffffffff) / XLogSegSize)

In src/include/pg_config.h.in, I see

    /* XLOG_SEG_SIZE is the size of a single WAL file. This must be a power of 2
       and larger than XLOG_BLCKSZ (preferably, a great deal larger than
       XLOG_BLCKSZ). Changing XLOG_SEG_SIZE requires an initdb. */
    #undef XLOG_SEG_SIZE

Then configure tells me the following

    # Check whether --with-wal-segsize was given.
    if test "${with_wal_segsize+set}" = set; then
      withval=$with_wal_segsize;
      case $withval in
        yes)
          { { echo "$as_me:$LINENO: error: argument required for --with-wal-segsize
    echo "$as_me: error: argument required for --with-wal-segsize option" >&2;}
       { (exit 1); exit 1; }; }
          ;;
        no)
          { { echo "$as_me:$LINENO: error: argument required for --with-wal-segsize
    echo "$as_me: error: argument required for --with-wal-segsize option" >&2;}
       { (exit 1); exit 1; }; }
          ;;
        *)
          wal_segsize=$withval
          ;;
      esac

    else
      wal_segsize=16
    fi

    case ${wal_segsize} in
      1) ;;
      2) ;;
      4) ;;
      8) ;;
      16) ;;
      32) ;;
      64) ;;
      *) { { echo "$as_me:$LINENO: error: Invalid WAL segment size. Allowed values a
    echo "$as_me: error: Invalid WAL segment size. Allowed values are 1,2,4,8,16,32,
       { (exit 1); exit 1; }; }
    esac

    { echo "$as_me:$LINENO: result: ${wal_segsize}MB" >&5
    echo "${ECHO_T}${wal_segsize}MB" >&6; }

    cat >>confdefs.h <<_ACEOF
    #define XLOG_SEG_SIZE (${wal_segsize} * 1024 * 1024)
    _ACEOF

Since I didn't specify a wal_segsize at compile time, it seems that my XLogSegsPerFile should be

    0xffffffff / (16 * 1024 * 1024) = 255    (integer division)

Which matches nicely with what I'm observing: seg runs from 00 through FE (255 values) before log increments.

So, and this is where I want the double-check, a tool which verifies there are no missing WAL files (based on names alone) in a series of WAL files needs to know the following.

1) Timeline history (although perhaps not, it could simply verify all existing timelines)
2) What, if any, wal_segsize was specified for the database which is generating the WAL files

Am I missing anything?

The format of .backup files seems pretty simple to me. So I intend to do the following.

1) find the most recent .backup file
2) verify that all the files required for that .backup exist
3) see if there are any newer files, and
4) if there are newer files, warn if any are missing from the sequence

Would this be reasonable, and is there any community interest in open-sourcing the tool that I'm building?

Andrew
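
For concreteness, the name-based check described above could be sketched roughly as follows. This is only an illustration under the naming rule quoted from xlog_internal.h, not a finished tool: the function names (segs_per_log, parse_wal_name, next_log_seg, missing_segments) are hypothetical, the segment size is assumed to be given in MB as with --with-wal-segsize, and each timeline is checked independently without consulting any timeline history. Names that are not exactly 24 hexadecimal digits (for example .backup or .history files) are ignored.

```python
import re

# A plain WAL segment name is exactly 24 upper-case hex digits:
# 8 for the timeline, 8 for the log id, 8 for the segment number.
WAL_NAME = re.compile(r"[0-9A-F]{24}")


def segs_per_log(wal_segsize_mb=16):
    """Mirror XLogSegsPerFile: 0xffffffff / XLOG_SEG_SIZE (integer division)."""
    return 0xFFFFFFFF // (wal_segsize_mb * 1024 * 1024)


def parse_wal_name(name):
    """Split a WAL file name into (tli, log, seg) integers."""
    if not WAL_NAME.fullmatch(name):
        raise ValueError("not a WAL segment name: %r" % (name,))
    return int(name[0:8], 16), int(name[8:16], 16), int(name[16:24], 16)


def format_wal_name(tli, log, seg):
    """Inverse of parse_wal_name, same format string as XLogFilePath."""
    return "%08X%08X%08X" % (tli, log, seg)


def next_log_seg(log, seg, per_log):
    """Same stepping rule as the NextLogSeg macro."""
    if seg >= per_log - 1:
        return log + 1, 0
    return log, seg + 1


def missing_segments(names, wal_segsize_mb=16):
    """List names absent between the first and last segment of each timeline."""
    per_log = segs_per_log(wal_segsize_mb)
    by_tli = {}
    for name in names:
        if not WAL_NAME.fullmatch(name):
            continue  # skip .backup, .history and anything else
        tli, log, seg = parse_wal_name(name)
        by_tli.setdefault(tli, set()).add((log, seg))

    missing = []
    for tli in sorted(by_tli):
        present = by_tli[tli]
        log, seg = min(present)
        last = max(present)
        # Walk from the oldest to the newest segment seen on this timeline.
        while (log, seg) < last:
            if (log, seg) not in present:
                missing.append(format_wal_name(tli, log, seg))
            log, seg = next_log_seg(log, seg, per_log)
    return missing
```

Calling missing_segments() with the four file names listed at the top of this message and the default 16MB segment size should report nothing missing, since the step from ...B7000000FE to ...B800000000 is the expected wrap. Restricting the walk to start at the segment named by the most recent .backup file would be a small change to the starting point.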