On Fri, Jun 14, 2002 at 09:15:15PM -0400, Greg A. Woods wrote: > [ On Saturday, June 15, 2002 at 00:45:12 (+0200), Ragnar Kj�rstad wrote: ] > > Subject: Re: Backing up PostgreSQL? > > > > snapshots (LVM snapshots) are not "supposedly nearly instantaneous", but > > "instantaneous". All write-access to the device is _locked_ while the > > snapshot is in progress (the process takes a few milliseconds, maybe a > > second on a big system), and there are _no_ races. That's the whole > > point of the snapshot! > > That's irrelevant from PostgreSQL's point of view. There's no sure way > to tell the postgresql process(es) to make the on-disk database image > consistent before you create the snapshot. The race condition is > between the user-level process and the filesystem. The only sure way to > guarantee a self-consistent backup is to shut down the process so as to > ensure all the files it had open are now closed. PostgreSQL makes no > claims that all data necessary to present a continually consistent view > of the DB will be written in a single system call. In fact this is > impossible since the average database consistes of many files and you > can only write to one file at a time through the UNIX system interface.
Yes it does, and no it's not impossible. see http://www.postgresql.org/idocs/index.php?wal.html > Yes there are other ways to recover consistency using other protection > mechanisms maintained by PostgreSQL, but you don't want to be relying on > those when you're doing backups -- you barely want to rely on those when > you're doing crash recovery! There is certainly a tradeoff. It's always a good idea to check the validity of ones backups, and this is even more important in cases like this were the process is relatively complicated. > If doing a snapshot really is that fast then there's almost no excuse > _not_ to stop the DB -- just do it! Your DB downtime will not be > noticable. Stopping the database means closing all the connections, and if you have multiple applications doing long overlapping transactions that don't recover well from shutting down the database, then you have a problem. > > To postgresql (or any other application) the rollback of an snapshot > > (or the backup of a snapshot) will be exactly like recovering from a > > crash. Database-servers need to write the data to disk (and fsync) > > before the transaction is completed. In practise they don't actually > > write it to the database-files but to a log, but that doesn't change the > > point. > > > > So, the only advantage of shutting down the database is that it doesn't > > have to recover like from a crash. > > Newer releases of PostgreSQL don't always use fsync(). I wouldn't trust > recovery to be consistent without any loss implicitly. The newest release of postgresql always use fsync (on it's log) unless you specificly turn it off. You shouldn't do that if you care about your data. > Because > PostgreSQL uses the filesystem (and not raw disk) the only way to be > 100% certain that what's written to disk is a consistent view of the DB > is to close all the open DB files. The only requirement on the filesystem is that it is journaling as well, so it's always kept in a consistant state like the postgresql-database is.. > You don't want the state of your backups to appear as if the system had > crashed -- you want them to be fully self-consistent. They _are_ fully self-consistant. I don't disagree that ideally you would want a clean shutdown, but it's a tradeoff. > At least that's > what you want if you care about your data _and_ your application, and > you care about getting a restored system back online ASAP. Restoring a snapshot is the fastest possible way of getting the system back online, and even a tape-backup of the snapshot will be faster than importing the database. -- Ragnar Kj�rstad Big Storage
