Thanks for that explanation - it pretty much confirms what I expected - which was that booting the secondary means that things get changed and that logs can no longer be loaded.

Duncan

On 08/02/2008, at 11:33 PM, Jørgen Løland wrote:

Hi Duncan,

First of all, the scenario you describe seems (to me) to be solved by the new replication functionality. However, I think it can be done the hard way with a plan similar to what you describe. Here goes :)

Log files can be found in <database_dir>/log. When you enable log archive mode, the log files will not be deleted. Hence, you do not need to perform a backup on days 2 and 3; you can simply copy the log files from the <database_dir>/log directory.

So, ideally, the steps would be like this:

Day 1: make a backup, copy it to the secondary location. Boot the secondary db and check that it is all ok
Day 2: copy the log files generated since the backup was made
Day 3: copy the log files generated since the backup was made
Day 4: boot secondary db, which now is in the same state as the primary was in when the log was copied on day 3.

With a few modifications, this should work just fine:

Problem, day 1: Assuming that users are allowed access to the primary database when you make the first backup (as indicated by your scenario), the data pages and log files will contain information from uncommitted transactions. When you boot the secondary to check that everything is ok, Derby will go through the same steps as when doing crash recovery. That means going through a redo phase (redoing operations in the log that are not reflected in the data pages) and an undo phase (basically aborting transactions that were active at the time the backup was made). The undo phase is key here because Derby performs operations on the data pages of the secondary that were not done on the primary. This is fine if you want to use the secondary, but not if you want to keep sending it log files.

Solution: Don't allow any active transactions when you make the initial backup or (probably better in your scenario) don't boot the secondary database to check if it is ok. Wait until the primary has failed before booting it.

Problem, day 2 and 3: The log file with the highest number copied on day 1 (say logN.dat) may have been modified since you copied it.

Solution: Overwrite the secondary log file logN.dat with logN.dat from the primary database.
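
A minimal Python sketch of that copy step, assuming the secondary has not been booted and that the log files are named logN.dat as in the example above. The function name sync_log_files and its directory arguments are hypothetical, and the sketch is untested against a real Derby database; it only copies files and does nothing Derby-specific. Files that do not match logN.dat are ignored.

```python
import re
import shutil
from pathlib import Path

# Assumption: log files are named log<N>.dat, as in "logN.dat" above.
LOG_NAME = re.compile(r"^log(\d+)\.dat$")


def log_number(path):
    """Return N for a file named logN.dat, or None for any other file."""
    match = LOG_NAME.match(path.name)
    return int(match.group(1)) if match else None


def sync_log_files(primary_log_dir, secondary_log_dir):
    """Copy log files from the primary to the (unbooted) secondary.

    The highest-numbered log file already on the secondary is copied
    again and overwritten, because the primary may have appended to it
    since the previous copy. Every newer log file is copied as well.
    Returns the names of the files copied, in log-number order.
    """
    primary = Path(primary_log_dir)
    secondary = Path(secondary_log_dir)
    secondary.mkdir(parents=True, exist_ok=True)

    # Highest log number already present on the secondary (0 if none).
    highest = 0
    for path in secondary.iterdir():
        number = log_number(path)
        if number is not None and number > highest:
            highest = number

    # Collect the primary's numbered log files, ordered by number.
    numbered = []
    for path in primary.iterdir():
        number = log_number(path)
        if number is not None:
            numbered.append((number, path))
    numbered.sort(key=lambda pair: pair[0])

    copied = []
    for number, path in numbered:
        if number >= highest:  # ">=" so that logN.dat itself is overwritten
            shutil.copy2(path, secondary / path.name)
            copied.append(path.name)
    return copied
```

Run on day 2 and day 3 with the two <database_dir>/log directories as arguments, this re-copies the last log file shipped previously plus anything newer.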

I think that should do it, but if you do not require this NOW, I would rather wait for replication in 10.4.

Good luck,
Jørgen


Duncan Groenewald wrote:
I still don't know if I really understand the Derby model, as it seems the transaction logs are archived when a database backup is run. So here is a scenario:

Day 1: Back up the primary Derby DB (enabling logging), copy the backup database to the secondary server and boot the secondary server to check it is all OK.
Day 2: Back up the primary Derby DB and copy the archived log files to the secondary server.
Day 3: Back up the primary Derby DB and copy the new archived log files to the secondary server.
Day 4: Boot the secondary Derby DB to check it's OK.

In theory the boot process will then replay all the log files, and the database should be in the same state as the primary was on Day 3? Somehow I don't think this would actually work, but I will give it a try...

Here is the scenario I am trying to cater for:
A 24x7 realtime system needs to be relocated to another site (or needs to have a warm standby system that can be enabled in 15 minutes or less). The basic approach is to have two databases running, with logs from the primary loaded on the secondary within a couple of minutes of being written. Transaction dumps on the primary database are written to timestamped files, and each file is renamed to the form TRXDUMP20080206091545212_DONE.DAT once the dump write process has completed. A script checks for the presence of *_DONE.DAT files every 30 seconds and copies each file to the remote server's file system (or this gets done by the dump process as well). A script on the remote server checks for the presence of *_DONE.DAT files every 30 seconds and runs a Transaction Load process on the remote database to load the dump files. At any given point in time the remote site is always within a few minutes of the primary site. It seems unlikely one could do this with Derby, because there are no commands to periodically dump the transaction logs or to load them.
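
The shipping half of that scheme could look roughly like the Python sketch below. It is only an illustration of the *_DONE.DAT polling idea: the function name ship_done_files, the ".part" suffix and the directory arguments are hypothetical, the "remote" directory is assumed to be reachable as an ordinary path (for example a network mount), and the dump and load processes themselves are not shown.

```python
import os
import shutil
from pathlib import Path


def ship_done_files(dump_dir, remote_dir, shipped):
    """Do one polling pass: copy completed dump files to remote_dir.

    Only files matching *_DONE.DAT are considered, so dumps that are
    still being written are skipped. `shipped` is a set of file names
    already copied; it is updated in place. Returns the names copied
    in this pass, in name order (which is timestamp order for
    fixed-width timestamped names).
    """
    dump_dir = Path(dump_dir)
    remote_dir = Path(remote_dir)
    remote_dir.mkdir(parents=True, exist_ok=True)

    newly_shipped = []
    for src in sorted(dump_dir.glob("*_DONE.DAT")):
        if src.name in shipped:
            continue
        # Copy under a temporary name, then rename, so the script on the
        # receiving side never sees a half-copied *_DONE.DAT file.
        tmp = remote_dir / (src.name + ".part")
        shutil.copy2(src, tmp)
        os.replace(tmp, remote_dir / src.name)
        shipped.add(src.name)
        newly_shipped.append(src.name)
    return newly_shipped
```

A driver would call ship_done_files in a loop with a 30 second sleep between passes, keeping the same shipped set across calls.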
Cheers
On 08/02/2008, at 7:05 PM, Knut Anders Hatlen wrote:
Duncan Groenewald <[EMAIL PROTECTED]> writes:

Thanks - the specification looks like it's close to what I would like.
The model I work from is one used by Sybase (and possibly others),
where you can specify a database dump and a separate transaction log
dump at defined intervals using a script or some other programmatic
method. From what I can tell it's not possible to do this with Derby,
since you can only dump the database and not the logs. It's also
unclear how you would load a log file on its own.

What I would like to see is two additional commands: one to dump
transaction logs to a specified directory or file name, and another
to load a transaction log file from a specified location/file.
Ideally a transaction log file load should function much the same
way a normal user does, to allow concurrent user access while
loading a transaction log file.

Not exactly what you want (it won't allow concurrent user access while loading the transaction log), but you may achieve something similar with
log archiving and roll-forward recovery, combined with some creative
scripts. I haven't tried it myself, but you may get some ideas here:
http://db.apache.org/derby/docs/dev/adminguide/cadminrollforward.html

--
Knut Anders
Duncan Groenewald
mobile: +61406291205
email: [EMAIL PROTECTED]


--
Jørgen Løland
