On May 3, 2006, at 5:59 AM, Karsten Hilbert wrote:
>> and would these WALs be easily (or only with great difficulty)
>> usable to recover data that had not made it into the secondary
>> (backup) server, between the time of any last primary database
>> backup (dump) and the time of a primary server crash?
> Easily no. With great difficulty no. There's PITR for that
> (point-in-time recovery - WAL log shipping). Not sure off the top of
> my head where the advantage over direct replication (Slony) is.
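
For reference, my reading of the PostgreSQL docs is that WAL archiving
for PITR is set up roughly as below; the archive path is only a
placeholder:

    # postgresql.conf: copy each finished WAL segment to an archive
    archive_command = 'cp %p /mnt/archive/%f'

    -- take a base backup while the server keeps running
    SELECT pg_start_backup('nightly');
    -- ... copy the data directory with tar/rsync ...
    SELECT pg_stop_backup();

    # recovery.conf (placed in the data directory before restarting
    # the server to recover):
    restore_command = 'cp /mnt/archive/%f %p'
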
I had thought (maybe incorrectly) that there can exist a "lag"
between information that has been written on the primary (master)
server and information that has been conveyed/written/replicated to
the slave. So if a master table should get corrupted, the WAL could
serve as a source of the information added since the last dump. Do
Postgres tables never get corrupted? Maybe there is no advantage over
Slony, but not everyone may deploy a slave server, or the slave might
have gone offline.
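
If the approximate time of such a corruption were known, I gather the
archived WAL could be replayed only up to just before that point,
e.g. in recovery.conf (the timestamp is hypothetical):

    restore_command = 'cp /mnt/archive/%f %p'
    # stop replay just before the suspected corruption
    recovery_target_time = '2006-05-03 04:30:00'
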
>> - is it recommended or required that the database be fully logged
>> off, so that it is in an unambiguous or "clean" state (no
>> unprocessed transactions) at the time of the dump?
> Neither. It simply does not matter.
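
(As I understand the docs, this is because pg_dump runs inside a
single serializable transaction, so a command like the following sees
one consistent snapshot even while clients keep writing; the database
name is only an example.)

    pg_dump -Fc -f /backup/gnumed.dump gnumed
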
Is Postgres different from (say) MySQL in this regard? In reference
to backing up, articles say:

    One of the difficulties with a large and active MySQL database is
    making clean backups without having to bring the server down.
    Otherwise, a backup may slow down the system and there may be
    inconsistency with data, since related tables may be changed
    while another is being backed up. Taking the server down will
    ensure consistency of data, but it means interruption of service
    to users. Sometimes this is necessary and unavoidable, but daily
    server outages for backing up data may be unacceptable. A simple
    alternative method to ensure reliable backups without having to
    shut down the server daily is to set up replication.
> If PG would allow a dump to contain "unprocessed transactions" we
> would not consider it for driving our backend.
I didn't mean "clean" in terms of data quality ("non-crap"), I just
meant complete and unambiguous, i.e. no tables being modified at the
time of the backup, so that the "state of completeness" of the dump
could be better known. How would the dump process choose where to
begin and end if there is concurrent activity? Might it not be
helpful to know that a brief work stoppage at a certain date/time
assures that all records up to that point are "intact" in the backup?
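
If I have the MVCC behaviour right, the dump effectively "begins" at
the moment its transaction takes its snapshot, and anything committed
after that is simply not visible to it. A sketch with two psql
sessions (table and column names are hypothetical):

    -- session A (stands in for pg_dump)
    BEGIN;
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    SELECT count(*) FROM allergy;   -- say this returns 100

    -- session B, meanwhile
    INSERT INTO allergy (substance) VALUES ('penicillin');
    -- commits immediately (autocommit)

    -- session A again
    SELECT count(*) FROM allergy;   -- still 100: the new row is invisible
    COMMIT;
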
_______________________________________________
Gnumed-devel mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/gnumed-devel