[ 
https://issues.apache.org/jira/browse/OAK-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-2480:
-----------------------------
    Attachment: oak-2480.incremental.partial.patch

Attaching a patch that implements the approach outlined in the issue 
description: the diff is no longer based on the checkpoint (which points to the 
origin) but on the backup-head.
The downside is that this apparently requires a scan of the entire backup, so 
it is not very efficient and thus likely not the final solution.
At least, with this 'fix' the test case succeeds.
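
To illustrate the trade-off of diffing against the backup-head, here is a 
minimal sketch. It is a toy model, not the attached patch and not Oak code: a 
store head is reduced to a flat map from path to value, and the names 
HeadDiffSketch, syncToOriginHead and demo are made up for this example. It 
assumes that a diff across two different stores cannot skip unchanged 
subtrees, so every backup entry is read even though only the delta is written.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: a store head is a flat map from path to value.
// None of these names are Oak APIs.
class HeadDiffSketch {

    /**
     * Brings backupHead in line with originHead by comparing the two heads.
     * Returns {number of backup entries read, number of entries written}.
     */
    static int[] syncToOriginHead(Map<String, String> originHead,
                                  Map<String, String> backupHead) {
        int read = 0;
        int written = 0;

        // Pass 1: every entry of the backup is read and compared.
        Iterator<Map.Entry<String, String>> it = backupHead.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, String> entry = it.next();
            read++;
            String current = originHead.get(entry.getKey());
            if (current == null) {
                it.remove();                 // deleted in the origin
            } else if (!current.equals(entry.getValue())) {
                entry.setValue(current);     // changed in the origin
                written++;
            }
        }

        // Pass 2: add entries that exist only in the origin.
        for (Map.Entry<String, String> entry : originHead.entrySet()) {
            if (!backupHead.containsKey(entry.getKey())) {
                backupHead.put(entry.getKey(), entry.getValue());
                written++;
            }
        }
        return new int[] {read, written};
    }

    /** Three nodes backed up, then one changed and one added in the origin. */
    static int[] demo() {
        Map<String, String> origin = new TreeMap<>();
        origin.put("/a", "1");
        origin.put("/b", "2");
        origin.put("/c", "3");
        Map<String, String> backup = new TreeMap<>(origin); // state after first backup

        origin.put("/b", "2-changed");
        origin.put("/d", "4");
        return syncToOriginHead(origin, backup);
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println("backup entries read: " + result[0]
                + ", entries written: " + result[1]);
    }
}
```

In this model only the two changed entries are written, but all three backup 
entries are read, which is the inefficiency mentioned above.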

> Incremental (FileStore)Backup copies the entire source instead of just the 
> delta
> --------------------------------------------------------------------------------
>
>                 Key: OAK-2480
>                 URL: https://issues.apache.org/jira/browse/OAK-2480
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: run
>    Affects Versions: 1.1.5
>            Reporter: Stefan Egli
>         Attachments: IncrementalBackupTest.java, 
> oak-2480.incremental.partial.patch
>
>
> Running the FileStoreBackup (in oak-run) repeatedly should correspond to an 
> incremental backup. The expectation is that an incremental backup is 
> resource-friendly, i.e. that it only adds the delta/diff that changed since 
> the last backup. Instead, what can be seen at the moment is that it copies 
> the entire source-store again on each 'incremental' backup.
> Tested with the latest trunk snapshot.
> I suspect the problem is as follows: on the first backup the 
> FileStoreBackup stores a checkpoint created in the source-store and adds it 
> as a property "checkpoint" to the backup root node, next to the actual 
> backup, which is stored in '/root'. 
> On subsequent incremental runs, the backup retrieves said "checkpoint" 
> property from the backup and uses it in the compactor as the base of the 
> diff.
> The problem seems to be that Compactor.compact calls process(), which does a 
> writer.writeNode(before) (where before is the checkpoint in the origin store 
> but writer is a writer of the backup store). 
> SegmentWriter.writeNode() then fails to find the 'before' segment, and thus 
> traverses the entire tree and copies it from the origin to the backup.
> So the problem appears to be the assumption that this 'checkpoint-before' 
> can be found in the backup, which is not the case.
> One solution would be to diff not between the checkpoint and the current 
> origin-head, but between the backup-head and the origin-head instead. 
> Apparently this was not the intention, though, as it would mean reading 
> through the entire backup to do the diff, which would be inefficient.
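
The suspected cause in the description can be sketched with a toy model. This 
is an illustration under stated assumptions, not Oak's SegmentWriter or 
Compactor: the classes CheckpointLookupSketch, Node and BackupStore are 
hypothetical, and the model assumes that a store recognizes a node only by a 
record id that it issued itself. Under that assumption, a 'before' state that 
lives in the origin store is never found in the backup, so each run copies the 
whole tree.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only; none of these classes are Oak APIs.
class CheckpointLookupSketch {

    /** A node identified by the record id of the store that holds it. */
    static final class Node {
        final String recordId;
        final List<Node> children = new ArrayList<>();

        Node(String recordId, Node... kids) {
            this.recordId = recordId;
            for (Node kid : kids) {
                children.add(kid);
            }
        }
    }

    /** A store that only recognizes record ids it issued itself. */
    static final class BackupStore {
        private final Set<String> recordIds = new HashSet<>();
        private int nextId = 0;

        /** Writes the node unless it is already known; returns nodes copied. */
        int writeNode(Node node) {
            if (recordIds.contains(node.recordId)) {
                return 0; // already held by this store, nothing to copy
            }
            int copied = 1;
            for (Node child : node.children) {
                copied += writeNode(child);
            }
            // The copy gets a backup-local id; the origin id is not retained.
            recordIds.add("backup:" + nextId++);
            return copied;
        }
    }

    /** A small origin tree with four nodes. */
    static Node originTree() {
        return new Node("origin:root",
                new Node("origin:a"),
                new Node("origin:b", new Node("origin:c")));
    }

    static int copiedOnFirstRun() {
        return new BackupStore().writeNode(originTree());
    }

    static int copiedOnSecondRun() {
        BackupStore backup = new BackupStore();
        Node checkpoint = originTree();
        backup.writeNode(checkpoint);        // first backup
        return backup.writeNode(checkpoint); // 'before' still carries origin ids
    }

    public static void main(String[] args) {
        System.out.println("first run copied:  " + copiedOnFirstRun());
        System.out.println("second run copied: " + copiedOnSecondRun());
    }
}
```

In this model both runs copy all four nodes, even though nothing changed 
between them.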



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
