[jira] Commented: (DERBY-2872) Add Replication functionality to Derby

JIRA Mon, 23 Jul 2007 06:47:55 -0700

    [ https://issues.apache.org/jira/browse/DERBY-2872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12514630 ]


Jørgen Løland commented on DERBY-2872:
--------------------------------------

>> Regarding 3) - Processing on the master must be paused while the
>> following happens: 
>> * buffered log records and buffered data pages are forced to
>>   disk. Actually, forcing the data pages is not strictly
>>   necessary, but we might as well ship all write operations that
>>   have been performed to the slave.
>> * the entire database directory is sent to the slave (or copied
>>   to a backup location, from where it can be sent to the slave)
>> * the replication log buffer has been started and the logFactory
>>   has been informed to append log records to the buffer as well
>>   as to disk.
> I would think the online backup mechanism has already solved some of these 
> issues.  Have you considered using an online backup to get a copy of the 
> database and existing log?

I had a brief look at backup a few weeks ago. If I remember
correctly, it does more or less the same as described above: It
pauses the database while flushing data and log to disk, and
copies the entire database directory to a backup location.
Processing is not resumed until this copying has completed.
The functionality to pause (freeze) the database is likely to be
reused, and our strategy follows the same steps as backup.

From the adminguide, topic "Online backups":
"The SYSCS_UTIL.SYSCS_BACKUP_DATABASE() procedure puts the database into a 
state in which it can be safely copied, then copies the entire original 
database directory (including data files, online transaction log files, and jar 
files) to the specified backup directory. Files that are not within the 
original database directory (for example, derby.properties) are not copied."
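For reference, calling that procedure from JDBC looks roughly like
the sketch below. The procedure name is the one quoted from the
adminguide; the class name OnlineBackup and the backup directory are
made up for illustration, and the caller is assumed to already hold
an open connection to the database.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;

public class OnlineBackup {
    // The system procedure named in the adminguide quote above.
    static final String BACKUP_CALL = "CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE(?)";

    /**
     * Asks Derby to copy the database directory to backupDir.
     * The call returns when the copy has completed.
     */
    public static void backup(Connection conn, String backupDir) throws SQLException {
        try (CallableStatement cs = conn.prepareCall(BACKUP_CALL)) {
            cs.setString(1, backupDir); // hypothetical path, e.g. "/backups/mydb"
            cs.execute();
        }
    }
}
```

A caller would obtain the connection the usual way (for example via
DriverManager with a jdbc:derby: URL) and pass it in.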

If this is the mechanism you are referring to, we can use backup
to do the "or copied to a backup location, from where it can be
sent to the slave" part. That strategy will freeze the database
for a shorter time iff the disk is faster than the network. On
the other hand, it will require twice the disk space and
potentially a lot of memory, because log records will accumulate
while the backed-up database is being sent to the slave. 
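To make the tradeoff concrete, here is a back-of-the-envelope
sketch. All numbers (database size, disk, network and log rates) are
invented for illustration and are not measurements; the class and
method names are hypothetical too.

```java
public class FreezeEstimate {
    /** Seconds needed to move the given number of bytes at the given rate. */
    public static double transferSeconds(long bytes, double bytesPerSecond) {
        return bytes / bytesPerSecond;
    }

    /**
     * Log bytes that pile up on the master while the backed-up copy is
     * shipped over the network with the master already running again.
     */
    public static double bufferedLogBytes(long dbBytes, double netRate, double logRate) {
        return logRate * transferSeconds(dbBytes, netRate);
    }

    public static void main(String[] args) {
        long dbBytes = 10_000_000_000L; // assumed database size
        double diskRate = 100e6;        // assumed local copy rate, bytes/s
        double netRate = 50e6;          // assumed network rate, bytes/s
        double logRate = 1e6;           // assumed log generation rate, bytes/s

        // Freeze lasts for the local copy if we back up first,
        // or for the whole network transfer if we send directly.
        System.out.printf("freeze via backup:  %.1f s%n", transferSeconds(dbBytes, diskRate));
        System.out.printf("freeze via network: %.1f s%n", transferSeconds(dbBytes, netRate));
        System.out.printf("log buffered while shipping backup: %.0f bytes%n",
                bufferedLogBytes(dbBytes, netRate, logRate));
    }
}
```

With a disk faster than the network, the backup route shortens the
freeze but pays for it with the extra copy on disk and the buffered
log.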

If there is another, nonblocking backup mechanism that I don't know 
of, please refer me to it. If so, we may have to rework our plans.

>> The reason for this is that the slave requires a copy of the
>> database that is exactly equal to that on the master when log
>> shipment starts. When we start sending log records to the slave,
>> we need to know that the slave has a database that includes all
>> log records up to a LogInstant 'i'. The first log record that is
>> sent to the slave must be the one immediately following 'i'.
>> Hence the pause.

> I do not understand why it needs to be exactly the same database.  Recovery 
> already handles redo of log records that are already reflected in the 
> database.  What harm would it make if you sent log records with LogInstant 
> less than 'i'?

The problem is caused by us writing the log record to the slave
log file *before* recovering it.

Unfortunately (in this case), the LSN in Derby (LogInstant) is
the byte position where the log record starts in the log file.
Since undo operations seem to identify their respective do
operations using the LogInstant (seems to me to be "hidden"
inside an undo log record's byte[] data), all log records must be
found at exactly the same place in the master and slave log
files. Hence, duplicates of log records cannot exist on file
without invalidating the LSNs. 
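
The effect can be shown with a toy model. This is only a sketch: it
assumes a single log file where a record's instant is simply its
starting byte offset, and the class OffsetLog and the record lengths
are invented; it is not Derby's actual log code.

```java
import java.util.ArrayList;
import java.util.List;

public class OffsetLog {
    private final List<Long> instants = new ArrayList<>();
    private long nextOffset = 0;

    /** Appends a record of the given length; its instant is its start offset. */
    public long append(int recordLength) {
        long instant = nextOffset;
        instants.add(instant);
        nextOffset += recordLength;
        return instant;
    }

    public List<Long> instants() {
        return instants;
    }

    public static void main(String[] args) {
        OffsetLog master = new OffsetLog();
        master.append(40);
        master.append(60);
        long thirdOnMaster = master.append(25); // starts at offset 100

        // The slave writes the second record twice (a duplicate shipment).
        OffsetLog slave = new OffsetLog();
        slave.append(40);
        slave.append(60);
        slave.append(60);                       // duplicate, occupies offset 100
        long thirdOnSlave = slave.append(25);   // pushed back to offset 160

        // An undo record that refers to instant 100 finds the third record
        // on the master but the duplicate on the slave.
        System.out.println("master: " + thirdOnMaster + ", slave: " + thirdOnSlave);
    }
}
```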
 
We could, of course, start sending log records < i, and let the
slave ignore these. Even if we decide to send a backup of the 
database, it would still be simple to start log shipping at exactly 
'i', however. I see no reason for not using exactly 'i'...
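
If we did let the slave ignore early records, the filter could be as
simple as the sketch below. It assumes the copied database already
reflects every record up to and including 'i', so only records with
a larger instant are appended; SlaveLogFilter and recordsToApply are
made-up names, and instants are modelled as plain longs.

```java
import java.util.ArrayList;
import java.util.List;

public class SlaveLogFilter {
    /**
     * Returns the instants the slave should append to its log, dropping
     * every record that the copied database already contains (instant <= i).
     */
    public static List<Long> recordsToApply(long[] shippedInstants, long i) {
        List<Long> toApply = new ArrayList<>();
        for (long instant : shippedInstants) {
            if (instant > i) {
                toApply.add(instant);
            }
        }
        return toApply;
    }

    public static void main(String[] args) {
        long[] shipped = {0L, 40L, 100L, 160L};
        // The database copy includes everything up to instant 40.
        System.out.println(recordsToApply(shipped, 40L));
    }
}
```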

>>> 4. What will happen if the failover command is executed while the
>>>   master is alive and doing replication?  
>> 
>> I can think of at least three alternatives: 1) stop the master,
>> and make the old slave a normal Derby instance for the database.
>> 2) not allow failover to be executed on a slave when the master
>> is alive. 3) perform a "switch", i.e., make the old slave the new
>> master, and the old master the new slave.
>> 
>> For now, I think 1) is the best alternative to keep the amount of
>> work down while alternative 3) would make a good extension
>> candidate to the functionality.

> I assume the failover command is sent to the slave.  Both 1) and 3) will then 
> require some mechanism where the slave sends commands to the master.  If you 
> want to keep the work down, another alternative could be
> 4) take down the connection to the master and perform failover. But maybe that 
> creates too high a risk of inconsistencies, since you may end up with two 
> masters that both will serve clients.

> I think an important use-case is to be able to switch to another master during 
> planned maintenance.  

<snip>

Good point. This scenario should be added to the funcspec.

> Add Replication functionality to Derby
> --------------------------------------
>
>                 Key: DERBY-2872
>                 URL: https://issues.apache.org/jira/browse/DERBY-2872
>             Project: Derby
>          Issue Type: New Feature
>          Components: Miscellaneous
>    Affects Versions: 10.4.0.0
>            Reporter: Jørgen Løland
>            Assignee: Jørgen Løland
>         Attachments: proof_of_concept_master.diff, 
> proof_of_concept_master.stat, proof_of_concept_slave.diff, 
> proof_of_concept_slave.stat, replication_funcspec.html, 
> replication_funcspec_v2.html, replication_funcspec_v3.html, 
> replication_script.txt
>
>
> It would be nice to have replication functionality to Derby; many potential 
> Derby users seem to want this. The attached functional specification lists 
> some initial thoughts for how this feature may work.
> Dag Wanvik had a look at this functionality some months ago. He wrote a proof 
> of concept patch that enables replication by copying (using file system copy) 
> and redoing the existing Derby transaction log to the slave (unfortunately, I 
> can not find the mail thread now).
> DERBY-2852 contains a patch that enables replication by sending dedicated 
> logical log records to the slave through a network connection and redoing 
> these.
> Replication has been requested and discussed previously in multiple threads, 
> including these:
> http://mail-archives.apache.org/mod_mbox/db-derby-user/200504.mbox/[EMAIL 
> PROTECTED]
> http://www.nabble.com/Does-Derby-support-Transaction-Logging---t2626667.html

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
