On Tue, Aug 15, 2017 at 3:32 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Chris Travers <chris.trav...@adjust.com> writes:
>> I wonder about a different solution.  Would it be possible to special case
>> vacuum to check for and remove (or just move to where they can be removed)
>> files when vacuuming pg_class?  At the point we are vacuuming pg_class, we
>> ought to be able to know that a relfilenode shouldn't be used anymore,
>> right?
>
> I don't think so.  It's not clear to me whether you have in mind "scan
> pg_class, collect relfilenodes from all live tuples, then zap all files
> not in that set" or "when removing a dead tuple, zap the relfilenode
> it mentions".  But neither one works.  The first case has a race condition
> against new pg_class entries.  As for the second, the existence of a dead
> tuple bearing relfilenode N isn't evidence that some other live tuple
> can't have relfilenode N.

Ah, because if the file never made it onto disk, the number could be reused.

> Another problem for the second solution is that in the case you're worried
> about (ie, PANIC due to out-of-WAL-space during relation's creating
> transaction), there's no very good reason to expect that the relation's
> pg_class tuple ever made it to disk at all.
>
> A traditional low-tech answer to this has been to keep the WAL on a
> separate volume from the main data store, so that it's protected from
> out-of-space conditions in the main store and temp areas.  The space
> needs for WAL proper are generally much more predictable than the main
> store, so it's easier to keep the dedicated space from overflowing.
> (Stalled replication/archiving processes can be hazardous to your
> health in this scenario, though, if they prevent prompt recycling of
> WAL files.)

Yeah, most of our DBs here have WAL on a separate volume, but not this system.
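Of course a dedicated WAL volume only helps if someone notices before it fills up too. As a minimal sketch (the function name, path, and threshold below are my own illustrations, not anything from this thread), a cron-able check might look like:

```shell
#!/bin/sh
# Sketch: warn when the volume holding the WAL directory is nearly full.
# check_wal_space, the example path, and the 90% threshold are all
# hypothetical choices for illustration.
check_wal_space() {
    waldir=$1
    threshold=$2
    # df -P guarantees one line per filesystem; field 5 is "Use%".
    used=$(df -P "$waldir" | awk 'NR==2 { sub("%", "", $5); print $5 }')
    if [ "$used" -ge "$threshold" ]; then
        echo "WARNING: $waldir is ${used}% full"
        return 1
    fi
    echo "OK: $waldir is ${used}% full"
}

check_wal_space /tmp 90
```

Wiring the WARNING branch into whatever alerting you already have is left to taste; the point is just that out-of-space on the WAL volume is predictable enough to catch early.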
This system is also unusual in that disk usage varies wildly (and I am not 100% sure this is the only case that causes the problem, though I can reproduce it consistently when the WAL writer runs out of disk space, with symptoms exactly matching what I found).  So for now that leaves my fallback approach as a way to fix it when I see it.  I have written a shell script which does the following:

1. Starts Postgres in single-user mode against a data directory, or dies (it won't run if Postgres seems to be running already).
2. Gets the oid of the current database.
3. Lists all files consisting of only digits in the base/[dboid] directory.
4. Asks Postgres (in single-user mode again) for all relfilenodes and oids of tables.  (In my testing both were required, because there were cases where relfilenode was not set for some system catalogs.)
5. Loops through the file nodes gathered, checks them against the relfilenode entries, and zaps $f, ${f}_*, and ${f}.*.

Currently, for testing, "zaps" means moving the files to a lostnfound folder for inspection after the script runs.  The logic here is not perfect and is very slightly under-inclusive, but better that than the other way.  Then we can start Postgres again.  I cannot find a better way to avoid race conditions, I guess.

At any rate, it sounds like preventing the problem more generally may be beyond what I would feel comfortable trying to do as a patch at my current level of familiarity with the source code.  The full script is included inline below my signature, in case it is of interest to anyone on the list.
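For what it's worth, the "digits only, no segment suffix" filter from step 3 can be exercised against a scratch directory before pointing it at a real base/[dboid]. The file names below are hypothetical examples of what lives in such a directory:

```shell
#!/bin/sh
# Scratch-directory demonstration of the step-3 filter; the file names
# are made-up examples of what appears under base/<dboid>.
dir=$(mktemp -d)
touch "$dir/16384" "$dir/16384.1" "$dir/16384_fsm" "$dir/16384_vm" "$dir/pg_internal.init"

# Same pipeline as the script: names that start and end with a digit,
# excluding segment files like "16384.1"; fork/suffix files such as
# 16384_fsm drop out of the glob because they don't end in a digit.
filenodes=$( (cd "$dir" && ls [0-9]*[0-9] | grep -v '\.' | sort -n) )
echo "$filenodes"    # prints: 16384

rm -rf "$dir"
```

Only the base file name survives the filter, which is why the zap step later has to sweep up ${f}_* and ${f}.* explicitly.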
> regards, tom lane

--
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin

---

#!/bin/bash

datadir=$1
database=$2

pg_ctl -D "$datadir" stop

dboidfile="$PWD/cleanupdb.oid"
reloidfile="$PWD/relfilenodes.list"

# Get the oid of the target database via single-user mode.
echo "COPY (select oid from pg_database where datname = current_database()) TO '$dboidfile'" \
    | postgres --single -D "$datadir" "$database" > /dev/null
if (($?)); then
    echo "FATAL: Could not start Postgres in single user mode"
    exit 1
fi

dboid=$(cat "$dboidfile")

# All-digit file names in the database directory, excluding segment files.
filenodes=$( (cd "$datadir/base/$dboid"; ls [0-9]*[0-9] | grep -v '\.' | sort -n) )
#echo $filenodes

# Both relfilenode and oid are needed; relfilenode is not set for some
# system catalogs.
echo "COPY (select relfilenode from pg_class union select oid as relfilenode from pg_class) TO '$reloidfile'" \
    | postgres --single -D "$datadir" "$database" > /dev/null
relfilenodes=$(cat "$reloidfile")
#echo $relfilenodes
if [[ -z $relfilenodes ]]; then
    echo "FATAL: did not get any relfilenodes"
    exit 2
fi

mkdir lostnfound
for f in $filenodes
do
    if [[ -z $(echo $relfilenodes | grep -w $f) ]]; then
        echo "moving $f to lostnfound"
        mv "$datadir/base/$dboid/$f" lostnfound
        mv "$datadir/base/$dboid/${f}_"* lostnfound 2> /dev/null
        mv "$datadir/base/$dboid/${f}."* lostnfound 2> /dev/null
    fi
done

rm "$dboidfile" "$reloidfile"
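One subtlety in the final loop is the grep -w membership test: -w matches whole words only, so a filenode that happens to be a numeric prefix of a known one is still correctly treated as an orphan. The filenode values below are made up for illustration:

```shell
#!/bin/sh
# Hypothetical filenode values; 1638 is a prefix of 16384, but the -w
# (word match) flag keeps them distinct.
relfilenodes="16384 16385 2619"
for f in 16384 1638; do
    if [ -z "$(echo "$relfilenodes" | grep -w "$f")" ]; then
        echo "$f: orphan, would be moved to lostnfound"
    else
        echo "$f: present in pg_class, kept"
    fi
done
```

Without -w, 1638 would match inside 16384 and an orphaned file could silently survive the sweep.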