Thanks for that explanation - it pretty much confirms what I expected:
booting the secondary means the database gets changed, so further
logs can no longer be loaded.
Duncan
On 08/02/2008, at 11:33 PM, Jørgen Løland wrote:
Hi Duncan,
First of all, the scenario you describe seems (to me) to be solved
by the new replication functionality. However, I think it can be
done the hard way with a plan similar to what you describe. Here
goes :)
Log files can be found in <database_dir>/log. When you enable log
archive mode, the log files will not be deleted. Hence, you do not
need to perform a backup on days 2 and 3 - you may simply copy the log
files from the <database_dir>/log directory.
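For example, enabling log archive mode while taking the day-1 backup
could look like this minimal, untested JDBC sketch (the database name
and backup path are just placeholders):

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;

public class EnableLogArchiving {
    public static void main(String[] args) throws Exception {
        // Load the embedded driver (needed on older JVMs).
        Class.forName("org.apache.derby.jdbc.EmbeddedDriver");
        Connection conn = DriverManager.getConnection("jdbc:derby:primaryDB");
        // Take a full backup and switch on log archive mode, so that old
        // log files are kept in <database_dir>/log instead of being
        // deleted. The second argument (0) tells Derby not to delete
        // the log files archived by earlier backups.
        CallableStatement cs = conn.prepareCall(
            "CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE_AND_ENABLE_LOG_ARCHIVE_MODE(?, ?)");
        cs.setString(1, "/backups/primaryDB");
        cs.setInt(2, 0);
        cs.execute();
        cs.close();
        conn.close();
    }
}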
So, ideally, the steps would be like this:
Day 1: make a backup, copy it to the secondary location. Boot the
secondary db and check that it is all ok
Day 2: copy the log files generated since the backup was made
Day 3: copy the log files generated since the backup was made
Day 4: boot the secondary db, which is now in the same state as the
primary was in when the logs were copied on day 3 (see the sketch
after this list).
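The day-4 boot would use Derby's roll-forward recovery. A minimal,
untested sketch, assuming the day-1 backup lives in /backups/primaryDB
and the log files copied on days 2 and 3 have been put where
roll-forward recovery will find them (see the admin guide link further
down):

import java.sql.Connection;
import java.sql.DriverManager;

public class BootSecondary {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.derby.jdbc.EmbeddedDriver");
        // Restore from the day-1 backup and replay the archived log
        // files, bringing the secondary up to the primary's state as
        // of the last log copy. Paths and names are placeholders.
        Connection conn = DriverManager.getConnection(
            "jdbc:derby:secondaryDB;rollForwardRecoveryFrom=/backups/primaryDB");
        System.out.println("Secondary booted and rolled forward.");
        conn.close();
    }
}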
With a few modifications, this should work just fine:
Problem, day 1: Assuming that users are allowed access to the
primary database when you make the first backup (as indicated by
your scenario), the data pages and log files will contain
information from uncommitted transactions. When you boot the
secondary to check that everything is ok, Derby will go through the
same steps as when doing crash recovery. That means going through a
redo phase (redoing operations in the log that are not reflected in
the data pages) and an undo phase (basically aborting transactions
that were active at the time the backup was made). The undo phase is
key here because Derby performs operations on the data pages of the
secondary that were never done on the primary. This is fine if you
want to use the secondary, but not if you want to keep sending it
log files.
Solution: Don't allow any active transactions when you make the
initial backup, or (probably better in your scenario) don't boot the
secondary database to check if it is ok. Wait until the primary has
failed before booting the secondary.
Problem, day 2 and 3: The highest-numbered log file copied on
day 1 (say logN.dat) may have been modified since you copied it.
Solution: Overwrite the secondary log file logN.dat with logN.dat
from the primary database.
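In code, overwriting that one file might be as simple as this sketch
(both paths are placeholders):

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class OverwriteLastLog {
    public static void main(String[] args) throws Exception {
        // Replace the possibly stale copy of the highest-numbered log
        // file on the secondary with the current version from the primary.
        Path src = Paths.get("/primary/primaryDB/log/logN.dat");
        Path dst = Paths.get("/secondary/primaryDB/log/logN.dat");
        Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
    }
}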
I think that should do it, but if you do not require this NOW, I
would rather wait for replication in 10.4.
Good luck,
Jørgen
Duncan Groenewald wrote:
I still don't know if I really understand the Derby model as it
seems the transaction logs are archived when a database backup is
run. So here is a scenario:
Day 1: Back up the Primary Derby DB (enabling log archiving), copy the
backup database to the secondary server and boot the secondary server
to check it is all OK.
Day 2: Backup Primary Derby DB and copy archived log files to
secondary server.
Day 3: Backup Primary Derby DB and copy new archived log files to
secondary server.
Day 4: Boot the secondary Derby DB to check it's OK... In theory the
boot process will then replay all the log files, and the database
should be in the same state as the Primary was on Day 3?
Somehow I don't think this would actually work - but I will give
it a try...
Here is the scenario I am trying to cater for:
24x7 realtime system needs to be relocated to another site (or
needs to have a warm standby system that can be enabled in 15
minutes or less).
Basic approach is to have two databases running and logs from the
primary are loaded on the secondary within a couple of minutes of
them being written.
Transaction dumps on the primary database are written to timestamped
files, and each file is renamed to something like
TRXDUMP20080206091545212_DONE.DAT once the dump write process has
completed. A script checks for the presence of *_DONE.DAT files every
30 seconds and copies each file to the remote server's file system
(or this gets done by the dump process as well). A script on the
remote server checks for the presence of *_DONE.DAT files every 30
seconds and runs a Transaction Load process on the remote database to
load the dump files. At any given point in time the remote site is
always within a few minutes of the primary site.
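Something like this rough sketch for the remote side (in Java, though
in practice it would probably be a shell script; the directories and
the load step are made up):

import java.io.File;
import java.io.FilenameFilter;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

public class DumpWatcher {
    public static void main(String[] args) throws Exception {
        File inbox = new File("/dumps/incoming");  // where *_DONE.DAT files arrive
        File loaded = new File("/dumps/loaded");   // moved here after loading
        while (true) {
            File[] done = inbox.listFiles(new FilenameFilter() {
                public boolean accept(File dir, String name) {
                    return name.endsWith("_DONE.DAT");
                }
            });
            if (done != null) {
                for (File f : done) {
                    loadDump(f);
                    Files.move(f.toPath(),
                               new File(loaded, f.getName()).toPath(),
                               StandardCopyOption.REPLACE_EXISTING);
                }
            }
            Thread.sleep(30000);                   // poll every 30 seconds
        }
    }

    private static void loadDump(File f) {
        // Placeholder: Derby has no transaction-log load command, so this
        // is where a Sybase-style "load transaction" would have to go.
        System.out.println("would load " + f.getName());
    }
}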
It seems unlikely one could do this with Derby because there are
no commands to periodically dump the transaction logs or to load
the transaction logs.
Cheers
On 08/02/2008, at 7:05 PM, Knut Anders Hatlen wrote:
Duncan Groenewald <[EMAIL PROTECTED]> writes:
Thanks - the specification looks like it's close to what I would like.
The model I work from is one used by Sybase (and possibly others)
where you can specify a database dump and a separate transaction log
dump at defined intervals using a script or some other programmatic
method. From what I can tell it's not possible to do this with Derby,
since you can only dump the database and not the logs. It's also
unclear how you would load a log file on its own.
What I would like to see is two additional commands: one to dump the
transaction logs to a specified directory or file name, and another
to load a transaction log file from a specified location/file.
Ideally, loading a transaction log file should function much the same
way a normal user session does, allowing concurrent user access while
the log file is being loaded.
Not exactly what you want (it won't allow concurrent user access
while loading the transaction log), but you may achieve something
similar with log archiving and roll-forward recovery, combined with
some creative scripts. I haven't tried it myself, but you may get
some ideas here:
http://db.apache.org/derby/docs/dev/adminguide/cadminrollforward.html
--
Knut Anders
--
Jørgen Løland
Duncan Groenewald
mobile: +61406291205
email: [EMAIL PROTECTED]