Thanks Bryan, but I am a little dense on this one and just want to understand
the proper procedure for using
"SYSCS_UTIL.SYSCS_BACKUP_DATABASE_AND_ENABLE_LOG_ARCHIVE_MODE" so that I can
decide whether or not to use it.
On one system that uses ZFS, we will probably use
SYSCS_UTIL.SYSCS_FREEZE_DATABASE to freeze the database, perform a ZFS snapshot,
and then SYSCS_UTIL.SYSCS_UNFREEZE_DATABASE to unfreeze it. I am concerned about
a failure in the code between freezing the database and unfreezing it, as I
believe that if the connection is lost, the database will remain frozen with no
way to unfreeze. I need to test that and maybe patch Derby to perform an
unfreeze if the connection is lost. I have had this happen when I used a script
that called IJ to do the freeze, the shell to do the ZFS snapshot, and IJ again
to unfreeze. The last IJ run could not get a connection and so could not
unfreeze, and the database was locked up and could not even be shut down
cleanly.
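To guard against that, I am thinking of doing the freeze, snapshot, and
unfreeze from a single JDBC connection so the unfreeze runs in a finally block
even if the snapshot step fails. A minimal, untested sketch of what I have in
mind (the database name "mydb" and the "zfs snapshot tank/derby@backup" command
are just placeholders for my setup):

    // FreezeSnapshot.java -- untested sketch; adjust the database name,
    // dataset, and snapshot name for the real environment.
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class FreezeSnapshot {
        public static void main(String[] args) throws Exception {
            Connection conn = DriverManager.getConnection("jdbc:derby:mydb");
            Statement s = conn.createStatement();
            s.executeUpdate("CALL SYSCS_UTIL.SYSCS_FREEZE_DATABASE()");
            try {
                // Take the filesystem snapshot while the database is frozen.
                Process p = new ProcessBuilder(
                        "zfs", "snapshot", "tank/derby@backup").inheritIO().start();
                if (p.waitFor() != 0) {
                    throw new RuntimeException("zfs snapshot failed");
                }
            } finally {
                // Unfreeze on the same connection whether or not the snapshot
                // succeeded, so the database is never left frozen.
                s.executeUpdate("CALL SYSCS_UTIL.SYSCS_UNFREEZE_DATABASE()");
                s.close();
                conn.close();
            }
        }
    }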
Another system is an older system that is running UFS so this will not be
possible.
Back to "SYSCS_UTIL.SYSCS_BACKUP_DATABASE_AND_ENABLE_LOG_ARCHIVE_MODE". So the
procedure would be:
- use this procedure to create a backup of the database and enable log
archiving. Presumably the database would be created on some other media.
- if the log directory has not been moved with the from
"<database>/log", what does one do with this? Periodically copy the file in
there to the other media? How does one determine the "archive" logs from the
"active" logs used for crash recovery or is even necessary.
- time goes on and another full backup is done. Does one use
SYSCS_UTIL.SYSCS_BACKUP_DATABASE_AND_ENABLE_LOG_ARCHIVE_MODE again? I assume
that the new archive logs need to be kept with this backup now, right?
- I guess I confused by the archive logs and where they are located,
what they look like, and what to do with them
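For reference, here is how I understand the backup call would be made (again
only a sketch; the backup path "/backup/mydb" is a placeholder, and I am
assuming the second argument controls whether older archived logs are deleted
once the new backup completes):

    // BackupWithLogArchiving.java -- untested sketch.
    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.DriverManager;

    public class BackupWithLogArchiving {
        public static void main(String[] args) throws Exception {
            Connection conn = DriverManager.getConnection("jdbc:derby:mydb");
            CallableStatement cs = conn.prepareCall(
                "CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE_AND_ENABLE_LOG_ARCHIVE_MODE(?, ?)");
            cs.setString(1, "/backup/mydb"); // where the backup copy is written
            cs.setInt(2, 1);                 // nonzero: delete archived logs from before this backup
            cs.execute();
            cs.close();
            conn.close();
        }
    }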
Now suppose the disk that contains the database crashes; a new disk is put in
place, formatted, etc., and is available to the OS. So the procedure to recover
is to issue a connect like:
- connect 'jdbc:derby:<database>;rollForwardRecoveryFrom=<path to last
full backup with logs>';
Is that correct? Is that any different from physically copying the database
and all of the logs to the new disk and simply connecting like:
- connect 'jdbc:derby:<database>'
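To check that I am reading this right, the JDBC equivalent I would try is
something like the following (again only a sketch; "mydb" and the backup path
are placeholders, and my understanding is that the connection recreates the
database from the backup copy and then replays the archived and active logs):

    // RollForwardRestore.java -- untested sketch.
    import java.sql.Connection;
    import java.sql.DriverManager;

    public class RollForwardRestore {
        public static void main(String[] args) throws Exception {
            // Path points at the backed-up database directory (placeholder).
            Connection conn = DriverManager.getConnection(
                "jdbc:derby:mydb;rollForwardRecoveryFrom=/backup/mydb/mydb");
            conn.close();
        }
    }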
Any light you can shed on this will be helpful. I think the Administration
guide should go through this. It starts to, but the part about handling the
"archive logs" is not really presented.
-----Original Message-----
From: Bryan Pendleton [mailto:[email protected]]
Sent: Thursday, March 29, 2012 10:31 AM
To: Derby Discussion
Subject: Re: Can someone explain the use of logged archive backup?
> confused here on the archived logs and the active logs.
In general, there can be multiple logs covering the time between one backup and
the next backup, and those logs must be applied, serially, in the correct
order, to recover the database fully.
Once you take that "next" backup, you no longer need the previous backup/logs,
though it's probably wise to establish a reasonable holding period depending on
your resources (e.g., back up weekly, keep 3 months of backups and logs,
destroy the oldest set when you complete the most recent backup, etc.)
> I need to protect against a media crash and it is not so important to go back
> to specific periods of time for the database.
Perhaps you can entirely use lower-level mechanisms, then, such as RAID or
other redundant storage hardware, or a modern filesystem which automatically
replicates the underlying data against the failure of the storage, such as ZFS
(http://en.wikipedia.org/wiki/ZFS)
I think it's still wise to have an application-level backup strategy, because
sometimes logical recovery is necessary (e.g., to recover from an application
bug or an administrative mistake), so I think that the exercise you're going
through about documenting your backup and recovery strategies is an excellent
one.
And don't forget to test those backup/restore practices, since an untested
restore is no better than no restore at all.
I've found that one useful technique is to provision a secondary machine, which
can be MUCH smaller in terms of CPU, memory, networking, etc., and just has to
have enough disk space, and automate things so that every time I take a backup,
my scripts automatically copy the backup to this spare machine, restore the
backup and apply all the logs, and then run a few queries to satisfy myself
that the database is correctly recovered.
thanks,
bryan