[
https://issues.apache.org/jira/browse/OAK-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stefan Egli updated OAK-2480:
-----------------------------
Fix Version/s: 1.4
tentatively marking as fix version 1.4 (/cc [~alexparvulescu])
> Incremental (FileStore)Backup copies the entire source instead of just the
> delta
> --------------------------------------------------------------------------------
>
> Key: OAK-2480
> URL: https://issues.apache.org/jira/browse/OAK-2480
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: run
> Affects Versions: 1.1.5
> Reporter: Stefan Egli
> Fix For: 1.4
>
> Attachments: IncrementalBackupTest.java,
> oak-2480.incremental.partial.patch
>
>
> Running the FileStoreBackup (in oak-run) sequentially should correspond to an
> incremental backup. This implies the expectation that the incremental backup
> is very resource-friendly, i.e. that it only adds the delta/diff that changed
> since the last backup. Instead, what can be seen at the moment is that it
> copies the entire source-store again on each 'incremental' backup.
> Tested with the latest trunk snapshot.
> I suspect the problem is as follows: on the first backup, the
> FileStoreBackup stores a checkpoint created in the source-store and adds it
> as a property "checkpoint" to the backup root node, next to the actual backup,
> which is stored in '/root'.
> On subsequent incremental runs, the backup tries to retrieve said property
> "checkpoint" from the backup and uses it in the compactor as the base of the
> diff.
> Now the problem seems to be that Compactor.compact calls process(), which
> does a writer.writeNode(before) (where 'before' is the checkpoint in the
> origin store, but 'writer' is a writer of the backup store).
> In this SegmentWriter.writeNode() call it fails to find the 'before' segment,
> and thus traverses the entire tree and copies it from the origin to the
> backup.
> So the problem looks to be in the area where it assumes it will find this
> 'checkpoint-before' in the backup, but that is not the case.
> One solution would be to do the diff not between the checkpoint and the
> current origin-head, but between the backup-head and the origin-head
> instead. Apparently this was not the intention, though, as it would mean
> reading through the entire backup to do the diff, which would be
> inefficient.
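
The suspected failure mode described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Oak code: the classes Node and Store, the writeNode method shown here, and the "origin:" / "backup:" record id prefixes are all hypothetical stand-ins for the real segment store. The sketch assumes that a writer skips a subtree only when the target store already holds a record with the same id, and that the first backup rewrote the tree into records owned by the backup store.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class IncrementalBackupSketch {

    /** Hypothetical toy node: a record id plus named children. */
    static final class Node {
        final String recordId;
        final Map<String, Node> children = new LinkedHashMap<>();

        Node(String recordId) {
            this.recordId = recordId;
        }

        Node child(String name, Node node) {
            children.put(name, node);
            return this;
        }
    }

    /** Hypothetical toy store: only tracks which record ids it holds. */
    static final class Store {
        final Set<String> records = new HashSet<>();

        /** Copies the node unless its record id is already present; returns the number of records copied. */
        int writeNode(Node node) {
            if (records.contains(node.recordId)) {
                return 0; // already known: the whole subtree can be referenced
            }
            records.add(node.recordId);
            int copied = 1;
            for (Node c : node.children.values()) {
                copied += writeNode(c);
            }
            return copied;
        }
    }

    /** Builds the same four-node tree, with record ids owned by the given store prefix. */
    static Node sampleTree(String prefix) {
        return new Node(prefix + "root")
                .child("a", new Node(prefix + "a"))
                .child("b", new Node(prefix + "b")
                        .child("c", new Node(prefix + "c")));
    }

    public static void main(String[] args) {
        Store backup = new Store();
        // First backup: content is rewritten into backup-owned records.
        backup.writeNode(sampleTree("backup:"));

        // Incremental run: 'before' is the checkpoint state, identified by origin record ids.
        Node before = sampleTree("origin:");
        int copied = backup.writeNode(before);

        // None of the origin ids are known to the backup, so the full tree is copied again.
        System.out.println("records copied for 'before': " + copied);
    }
}
```

In this model the content of 'before' is identical to what the backup already holds, yet all four records are copied because the lookup is by record id rather than by content. Whether the real SegmentWriter behaves this way is exactly the suspicion raised in the description.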