[
https://issues.apache.org/jira/browse/DERBY-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582949#action_12582949
]
Jørgen Løland commented on DERBY-3562:
--------------------------------------
Mike Matrigali writes:
> I didn't quite follow all of this, and admit I am not up on replication.
> It would be nice if this process used the exact same code as the normal
> checkpoint processing. So a checkpoint would be triggered and then,
> after it had done its work, it would do the appropriate cleanup. If you
> do the cleanup too soon then redo recovery of the slave won't work - is
> that expected to work, or at that point do you just restart from scratch
> from the master?
> The existing code that replays multiple checkpoints may be weird, as it
> may assume that this is recovery of a backed-up database that is meant
> to keep all of its log files. Make sure not to break that.
> Is there a concept of a "fully" recoverable slave, i.e. one that is
> supposed to keep all of its log files so that it is recoverable in
> case of a data crash? As I said, this may not be necessary as there is
> always the master. Just good to know what is expected.
Mike,
Thank you for expressing your concerns. I'll do my best to explain why I think
the proposed solution will work.
The patch adds functionality to the checkpoint processing used during recovery
(LogToFile#checkpointInRFR). During recovery, the dirty data pages are flushed
to disk, and the log.ctrl file is updated to point to the new checkpoint
currently being processed.
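In case a sketch helps, here is roughly what those two recovery-time steps amount to. This is not the actual LogToFile code; every name in the sketch is a placeholder I made up for the illustration:

    // Conceptual sketch of a recovery-time checkpoint -- not Derby's actual code.
    // All names below are placeholders invented for this illustration.
    class RecoveryCheckpointSketch {

        /**
         * @param checkpointInstant log instant of the checkpoint record
         *        currently being processed by recovery.
         */
        void checkpointDuringRecovery(long checkpointInstant) {
            // Step 1: force every dirty data page to disk, so that no page in
            // seg0 depends on log records older than this checkpoint.
            flushDirtyDataPages();

            // Step 2: point log.ctrl at this checkpoint, so that a later
            // recovery starts from here rather than from an older checkpoint.
            updateLogControlFile(checkpointInstant);
        }

        void flushDirtyDataPages()              { /* stand-in for the page cache flush */ }
        void updateLogControlFile(long instant) { /* stand-in for rewriting log.ctrl   */ }
    }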
With the patch [1], the log files that are older than the currently processed
checkpoint's Undo Low Water Mark (undo LWM) are then deleted. The undo LWM
points to the earliest log record that may be required for recovery [2].
Since the log files are processed sequentially and the dirty data pages have
been flushed, the undo LWM in the checkpoint is just as valid during recovery
(i.e., in slave replication mode) as during normal transaction processing.
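To illustrate the cleanup itself, the sketch below derives the first log file that is still needed from the undo LWM and deletes the older ones. The instant-to-file-number decoding and the log file naming are assumptions I made for the example (LogCounter does the real decoding in Derby); this is not the patch code:

    import java.io.File;

    // Illustrative sketch of undo-LWM based log truncation -- not the actual patch.
    public class LogTruncationSketch {

        // Assumed encoding: log file number in the upper half of the instant.
        static long logFileNumber(long logInstant) {
            return logInstant >>> 32;
        }

        /**
         * Delete every log file that precedes the file containing the undo LWM.
         * Those files can never be needed again: neither redo nor undo has to go
         * further back than the undo LWM, and every data page dirtied by their
         * records has already been flushed by the checkpoint.
         */
        static void deleteObsoleteLogFiles(File logDir, long undoLWM) {
            long firstNeeded = logFileNumber(undoLWM);
            File[] candidates = logDir.listFiles();
            if (candidates == null) {
                return;
            }
            for (File f : candidates) {
                String name = f.getName();
                // Assumed naming: numbered log files look like "log<number>.dat";
                // log.ctrl, logmirror.ctrl etc. fall through the filter untouched.
                if (!name.startsWith("log") || !name.endsWith(".dat")) {
                    continue;
                }
                long number;
                try {
                    number = Long.parseLong(name.substring(3, name.length() - 4));
                } catch (NumberFormatException nfe) {
                    continue;   // not a numbered log file
                }
                if (number < firstNeeded) {
                    // A failed delete is harmless: the file is never read again,
                    // it merely keeps occupying disk space until the next attempt.
                    f.delete();
                }
            }
        }
    }

The essential point, which the two sketches together try to show, is that the deletion only happens after the dirty pages have been flushed and log.ctrl has been updated to the new checkpoint.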
Once replication has successfully started, the slave database will always be
recoverable [3], but not in the case of corrupted data blocks [4]. You may at
any time crash the Derby instance serving the slave database and then reboot
it. The former slave database will then recover to a transaction-consistent
state that includes the modifications made by every transaction whose commit
log record was written to disk on the slave before the crash.
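To make that concrete with a (hypothetical) usage example: assume the slave database is named 'slaveDB' and the slave JVM has just been killed. Booting the database with an ordinary embedded connection URL runs normal crash recovery, after which the tables contain exactly the changes from the transactions whose commit log record had reached the slave's disk before the crash:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    // Hypothetical example -- database and table names are made up.
    public class BootFormerSlave {
        public static void main(String[] args) throws Exception {
            // Load the embedded driver (not needed on JDBC 4.0 and later).
            Class.forName("org.apache.derby.jdbc.EmbeddedDriver");

            // An ordinary boot of the former slave database triggers crash
            // recovery: redo from the latest checkpoint, then undo of any
            // transactions that had not committed on the slave.
            Connection conn = DriverManager.getConnection("jdbc:derby:slaveDB");

            // The database is now transaction consistent.
            Statement s = conn.createStatement();
            ResultSet rs = s.executeQuery("SELECT COUNT(*) FROM replicated_table");
            if (rs.next()) {
                System.out.println("Rows visible after recovery: " + rs.getLong(1));
            }
            rs.close();
            s.close();
            conn.close();
        }
    }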
Please follow up if you think I have misunderstood anything or have not
answered your questions well enough.
[1] The patch only applies to slave replication mode. Backup is not affected,
so as not to break the "fully recoverable" feature for backups.
[2] The first log record of the oldest transaction in the checkpoint's
transaction table.
[3] If "fully" recoverable means recovering in the presence of corrupted data
blocks, this is currently not supported for replication.
[4] Not including jar files, as explained in DERBY-3552.
> Number of log files (and log dir size) on the slave increases continuously
> --------------------------------------------------------------------------
>
> Key: DERBY-3562
> URL: https://issues.apache.org/jira/browse/DERBY-3562
> Project: Derby
> Issue Type: Bug
> Components: Replication
> Affects Versions: 10.4.0.0, 10.5.0.0
> Environment: -
> Reporter: Ole Solberg
> Assignee: Jørgen Løland
> Attachments: derby-3562-1a.diff, derby-3562-1a.stat,
> master_slave-db_size-6.jpg
>
>
> I did a simple test inserting tuples in a table during replication:
> The attached file 'master_slave-db_size-6.jpg' shows that
> the size of the log directory (and number of files in the log directory)
> increases continuously during replication, while on the master the size
> (and number of files) never exceeds ~12Mb (12 files?) in this scenario.
> The seg0 directory on the slave stays at the same size as the master
> seg0 directory.