[
https://issues.apache.org/jira/browse/DERBY-2872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12540400
]
Jørgen Løland commented on DERBY-2872:
--------------------------------------
Dan,
Thanks for showing interest in replication. I'll answer your questions inline,
and will update the func spec with the results of the discussion later.
> The startslave command's syntax does not include a -slavehost option, but the
> comments seem to indicate one is available.
You are right; will fix.
> How do startmaster and stopmaster connect to the master database?
In the current prototype implementation, all commands are processed in
NetworkServerCommandImpl by calling Monitor.startPersistentService(dbname, ...)
and Monitor.findService(dbname, ...). The plan is to change this to connection
URL attributes later, e.g. 'jdbc:derby://host/db;startMaster=true'. Note that
since startslave blocks, a connection call with 'jdbc:...;startslave=true'
will hang.
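For illustration, here is a minimal sketch of what the planned URL attributes
might look like from client code. The attribute names follow the examples
above; the slaveHost attribute and the exact syntax are assumptions, since
the URL form is not final:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;

    public class ReplicationUrlSketch {
        public static void main(String[] args) throws SQLException {
            // Planned: start the master role via a connection URL attribute
            // (slaveHost is a hypothetical attribute name).
            Connection c = DriverManager.getConnection(
                "jdbc:derby://masterhost:1527/db;startMaster=true;slaveHost=slavehost");
            c.close();

            // A corresponding startslave connection would hang, because
            // startslave blocks in recovery until stopslave/failover:
            // DriverManager.getConnection(
            //     "jdbc:derby://slavehost:1527/db;startslave=true");
        }
    }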
> Do stopslave and startfailover need options to define the slavehost and port,
> otherwise how do they communicate with the slave?
Since "startslave" is blocked during LogToFile.recover,
Monitor.startPersistentService does not complete for this command. Calling
Monitor.findService on the slave database does therefore not work.
A way around this is to let the thread that receives log from the master and
writes it to the log file check for a flag value every X seconds. A Hashtable
could, for example, be added to Monitor with setFlag(dbname, flagvalue) and
getFlag(dbname) methods (sketched below). The stopslave/failover commands
would then call Monitor.setFlag(slaveDBName, "failover"/"stopslave").
A potential problem with this is authenticating the caller of the command,
since the AuthenticationService of the slave database is not reachable. I
think the best solution would be to accept failover/stopslave flags only if
the connection with the master is down. Otherwise, while the connection is
working, stop and failover commands should be accepted only from the master.
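A rough sketch of the flag table idea; setFlag/getFlag are hypothetical
names, not existing Monitor methods:

    import java.util.Hashtable;

    // Hypothetical flag table that the slave's log-receiving thread polls
    // every X seconds while startslave blocks in recovery.
    public class ReplicationFlags {
        private static final Hashtable<String, String> flags =
                new Hashtable<String, String>();

        public static void setFlag(String dbname, String flagValue) {
            flags.put(dbname, flagValue);
        }

        public static String getFlag(String dbname) {
            return flags.get(dbname);
        }
    }

The polling thread would then act on the flag roughly like this, honoring
the rule that flags are accepted only when the master connection is down:

    // String flag = ReplicationFlags.getFlag(slaveDbName);
    // if (masterConnectionIsDown) {
    //     if ("failover".equals(flag))  { /* perform failover */ }
    //     if ("stopslave".equals(flag)) { /* stop the slave   */ }
    // }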
> It's unclear exactly what the startmaster and stopmaster do, especially wrt
> to the state of the database. Can a database be booted and active when
> startmaster is called, or does startmaster boot the database? Similar for
> stopmaster, does it shutdown the database?
The "startmaster" command can only be run against an existing database 'X'. If
'X' has already been booted by the Derby instance that will have the master
role, "startmaster" will connect to it and:
1) copy the files of 'X' to the slave (other transactions will be blocked
during this step in the first version of replication; this may be improved
later by exploiting online backup)
2) create a replication log buffer and make sure all log records are added to
this buffer
3) start a log shipment thread that sends the log asynchronously.
If 'X' has not already been booted, "startmaster" will boot it and then do the
above.
The "stopmaster" command will
1) stop log records from being appended to the replication log buffer
2) stop the log shipper thread from sending more log to the slave
3) send a message to the slave that replication for database 'X' has been
stopped.
4) close down all replication related functionality without shutting down 'X'
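In outline, the two commands could look roughly like this. All class and
method names are illustrative stubs, not actual Derby code:

    // Illustrative outline with no-op stubs so it compiles; the real
    // logic would live in the store and network server layers.
    public class MasterLifecycleSketch {

        public void startMaster(String db, String slaveHost, int slavePort) {
            bootIfNeeded(db);                 // boot 'X' if not already booted
            // 1) copy database files; other transactions blocked meanwhile
            blockTransactions(db);
            copyFilesToSlave(db, slaveHost, slavePort);
            unblockTransactions(db);
            // 2) create the replication log buffer and make sure every
            //    log record is also appended to it
            createReplicationLogBuffer(db);
            // 3) start the thread that ships buffered log asynchronously
            startLogShipper(db, slaveHost, slavePort);
        }

        public void stopMaster(String db) {
            stopBufferAppends(db);      // 1) no more appends to the buffer
            stopLogShipper(db);         // 2) shipper sends no more log
            notifySlaveStopped(db);     // 3) tell the slave replication ended
            tearDownReplication(db);    // 4) 'X' itself stays booted
        }

        private void bootIfNeeded(String db) {}
        private void blockTransactions(String db) {}
        private void copyFilesToSlave(String db, String host, int port) {}
        private void unblockTransactions(String db) {}
        private void createReplicationLogBuffer(String db) {}
        private void startLogShipper(String db, String host, int port) {}
        private void stopBufferAppends(String db) {}
        private void stopLogShipper(String db) {}
        private void notifySlaveStopped(String db) {}
        private void tearDownReplication(String db) {}
    }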
> Is there any reason to put these replications commands on the class/command
> used to control the network server? They don't fit naturally there, why not a
> replication specific class/command? From the functional spec I can't see any
> requirement that the master or slave are running the network server, so I
> assume I can have replication working with embedded only systems.
Implementing this in the network server means that the blocking startslave
command will run in a thread in the same VM as the server.
> How big is this main-memory log buffer, can it be configured?
In the initial version we use 10 buffers of 32KB each. 32KB was chosen
because it is the size of the LogAccessFileBuffer.buffer byte[], which is the
unit copied to the replication buffer. We need multiple buffers so that log
can still be appended while the log shipper is sleeping or busy shipping an
older chunk of log. The number of buffers will probably be adjusted once we
get a chance to actually test the functionality. It will be configurable.
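For concreteness, the planned defaults expressed as constants; the property
name used for configuration below is an assumption, since none has been
decided yet:

    // Hypothetical constants mirroring the planned defaults.
    public class ReplicationBufferConfig {
        // 32KB matches LogAccessFileBuffer.buffer, the unit copied into
        // the replication buffer.
        public static final int BUFFER_SIZE_BYTES = 32 * 1024;

        // 10 buffers so log can be appended while the shipper sleeps or is
        // busy shipping an older chunk. The property name is illustrative:
        public static final int BUFFER_COUNT = Integer.getInteger(
                "derby.replication.logBufferCount", 10);
    }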
> extract - "the response time of transactions may increase for as long as log
> shipment has trouble keeping up with the amount of generated log records."
> Could you explain this more, I don't see the connection between the log
> buffer filling up and response times of other transactions. The spec says the
> replication is asynchronous, so won't user transactions still be only limited
> by the speed at which the transaction log is written to disk?
In the current design, log records that need to be shipped to the slave are
appended to the replication log buffer at the same time they are written to
disk. If the replication log buffer is full, the transaction requesting the
disk write has to wait for a chunk of log to be shipped before its log
records can be added to the buffer. I realize that it would be possible to
read the log back from disk if the buffer overflows. This is a planned
improvement, but it is delayed for now due to limited developer resources.
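The effect on response time can be seen in a small sketch of the append
path. This is an assumed design using a bounded queue, not the actual code:

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Assumed design: the transaction thread hands a filled 32KB chunk to a
    // bounded queue of buffers. When all buffers are full, put() blocks
    // until the log shipper drains one, which is what slows the transaction.
    public class ReplicationLogBufferSketch {
        private final BlockingQueue<byte[]> full =
                new ArrayBlockingQueue<byte[]>(10);

        // Called by the transaction thread at the same time the chunk is
        // written to the local log file.
        public void append(byte[] logChunk) throws InterruptedException {
            full.put(logChunk); // blocks if the shipper cannot keep up
        }

        // Called by the log shipper thread.
        public byte[] nextChunkToShip() throws InterruptedException {
            return full.take();
        }
    }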
> The spec seems to imply that the slave can connect with the master, but the
> startmaster command doesn't specify its own hostname or portnumber so how is
> this connection made?
The connection between the master/slave pair will be set up as follows:
1) the slave sets up a ServerSocket
2) the master connects to the socket at the specified slave
location (i.e. host:port)
3) the socket connection can be used to send messages in both
directions
Thus, the slave does not contact the master; it only sends messages over the
existing connection.
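In code, the setup would look roughly like this; hosts, ports, and class
names are examples only:

    import java.io.IOException;
    import java.net.ServerSocket;
    import java.net.Socket;

    public class ReplicationConnectionSketch {

        // 1) Slave side: listen at the specified slave location.
        static Socket slaveAccept(int slavePort) throws IOException {
            ServerSocket server = new ServerSocket(slavePort);
            try {
                return server.accept(); // wait for the master to connect
            } finally {
                server.close();         // accepted socket stays open
            }
        }

        // 2) Master side: connect to the host:port given to startmaster.
        static Socket masterConnect(String slaveHost, int slavePort)
                throws IOException {
            return new Socket(slaveHost, slavePort);
        }

        // 3) The resulting socket carries messages in both directions, so
        //    the slave never opens a connection to the master itself.
    }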
> Why if the master loses its connection to the slave will the replication
> stop, while if the slave loses its connection to the master it keeps
> retrying? It seems that any temporary glitch in the network connectivity has
> a huge chance of rendering the replication useless. I can't see the logic
> behind this, what's stopping the master from keeping retrying. The log buffer
> being full shouldn't matter should it, the log records are still on disk, or
> is it that this scheme never reads the transaction log from disk, only from
> memory as log records are created?
See the answer about response times above. Reading log from disk in case of
replication buffer overflow is definitely an improvement, but we are delaying
it for now. It will be high priority on the improvement todo-list.
> From reading between the lines, I think this scheme requires that the master
> database stay booted while replicating, if so I think that's a key piece of
> information that should be clearly stated in the functional spec. If not,
> then I think that the how to shutdown a master database and restart
> replication(without the initial copy) should be documented.
Again, a correct observation. The master database has to stay booted while
it is being replicated.
> Add Replication functionality to Derby
> --------------------------------------
>
> Key: DERBY-2872
> URL: https://issues.apache.org/jira/browse/DERBY-2872
> Project: Derby
> Issue Type: New Feature
> Components: Miscellaneous
> Affects Versions: 10.4.0.0
> Reporter: Jørgen Løland
> Assignee: Jørgen Løland
> Attachments: master_classes_1.pdf, poc_master_v2.diff,
> poc_master_v2.stat, poc_master_v2b.diff, poc_slave_v2.diff,
> poc_slave_v2.stat, poc_slave_v2b.diff, poc_slave_v2c.diff,
> proof-of-concept_v2b-howto.txt, proof_of_concept_master.diff,
> proof_of_concept_master.stat, proof_of_concept_slave.diff,
> proof_of_concept_slave.stat, replication_funcspec.html,
> replication_funcspec_v2.html, replication_funcspec_v3.html,
> replication_funcspec_v4.html, replication_funcspec_v5.html,
> replication_funcspec_v6.html, replication_script.txt, slave_classes_1.pdf
>
>
> It would be nice to have replication functionality in Derby; many potential
> Derby users seem to want this. The attached functional specification lists
> some initial thoughts on how this feature may work.
> Dag Wanvik had a look at this functionality some months ago. He wrote a proof
> of concept patch that enables replication by copying (using file system copy)
> and redoing the existing Derby transaction log to the slave (unfortunately, I
> cannot find the mail thread now).
> DERBY-2852 contains a patch that enables replication by sending dedicated
> logical log records to the slave through a network connection and redoing
> these.
> Replication has been requested and discussed previously in multiple threads,
> including these:
> http://mail-archives.apache.org/mod_mbox/db-derby-user/200504.mbox/[EMAIL PROTECTED]
> http://www.nabble.com/Does-Derby-support-Transaction-Logging---t2626667.html
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.