[ 
https://issues.apache.org/jira/browse/OAK-1159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838794#comment-13838794
 ] 

Jukka Zitting commented on OAK-1159:
------------------------------------

I added a "backup" mode to {{oak-run}} in revision 1547751. It currently only 
works for the TarMK backend.

An interesting issue came up when testing it on a vanilla CQ installation:

{noformat}
$ java -jar oak-run-0.13-SNAPSHOT.jar tarmk repository
Apache Jackrabbit Oak 0.13-SNAPSHOT
TarMK ../cq-quickstart/target/crx-quickstart/repository
Total size:
    75MB in    368 data segments
   791MB in   6436 bulk segments
Available for garbage collection:
     2kB in      2 data segments
     0MB in      0 bulk segments

$ java -jar oak-run-0.13-SNAPSHOT.jar backup repository backup
Apache Jackrabbit Oak 0.13-SNAPSHOT

$ java -jar oak-run-0.13-SNAPSHOT.jar tarmk backup
Apache Jackrabbit Oak 0.13-SNAPSHOT
TarMK target/backup
Total size:
   128MB in    533 data segments
  1521MB in  10880 bulk segments
Available for garbage collection:
     0kB in      0 data segments
     0MB in      0 bulk segments
{noformat}

Interestingly, the backup repository is close to twice the size of the source 
repository (128MB + 1521MB = 1649MB versus 75MB + 791MB = 866MB). I suspect 
this is because the backup currently can't detect cases where parts of the 
content (binaries, subtrees, etc.) are shared across different locations. For 
example, a copied or checked-in binary would be stored just once in the source 
repository, but would currently get duplicated in the backup.
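
To illustrate the suspected effect, here is a minimal sketch in plain Java. It 
is a toy model, not the actual backup code: the {{Blob}} and {{Node}} classes, 
the method names and the sizes are all hypothetical and do not correspond to 
Oak's TarMK classes. It compares a copy that writes a binary once per 
reference with one that remembers which binaries it has already written.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model; none of these types are part of Oak.
class BackupSizeSketch {

    /** A binary value; the same instance may be referenced from several nodes. */
    static final class Blob {
        final long size;
        Blob(long size) { this.size = size; }
    }

    /** A tree node with optional binary content and child nodes. */
    static final class Node {
        final Blob blob; // may be null
        final List<Node> children = new ArrayList<>();
        Node(Blob blob) { this.blob = blob; }
        Node add(Node child) { children.add(child); return this; }
    }

    /** Writes a blob for every reference, so a shared blob is counted repeatedly. */
    static long naiveCopySize(Node node) {
        long bytes = node.blob == null ? 0 : node.blob.size;
        for (Node child : node.children) {
            bytes += naiveCopySize(child);
        }
        return bytes;
    }

    /** Tracks blobs already written, so each distinct blob is counted once. */
    static long dedupCopySize(Node node, Map<Blob, Boolean> written) {
        long bytes = 0;
        if (node.blob != null && written.put(node.blob, Boolean.TRUE) == null) {
            bytes += node.blob.size;
        }
        for (Node child : node.children) {
            bytes += dedupCopySize(child, written);
        }
        return bytes;
    }

    /** Two nodes share one 100-byte blob; a third node has its own 50-byte blob. */
    static Node sampleTree() {
        Blob shared = new Blob(100);
        return new Node(null)
                .add(new Node(shared))        // original location
                .add(new Node(shared))        // copy or checked-in version
                .add(new Node(new Blob(50))); // unrelated binary
    }

    public static void main(String[] args) {
        Node root = sampleTree();
        System.out.println("naive copy: " + naiveCopySize(root) + " bytes");
        System.out.println("dedup copy: "
                + dedupCopySize(root, new IdentityHashMap<>()) + " bytes");
    }
}
```

In this toy tree the naive copy writes 250 bytes and the deduplicating copy 
150 bytes. The sketch tracks sharing by object identity; a real backup would 
presumably need some other way to recognise already-copied content, and 
whether that fully explains the size difference above is still a guess.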

> Backup and restore
> ------------------
>
>                 Key: OAK-1159
>                 URL: https://issues.apache.org/jira/browse/OAK-1159
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: core, mk
>            Reporter: Michael Marth
>            Assignee: Alex Parvulescu
>         Attachments: OAK-1159-v2.patch, OAK-1159.patch
>
>
> We need a way to back up and restore a repository. I was thinking that the MK 
> implementation could expose an interface for this, as the actual 
> implementation would differ quite a bit between e.g. TarMK and MongoMK.
> Also, I think we could leverage the MVCC nature of the MKs and mark a 
> specific revision as "the revision to back up" (regardless of ongoing writes). 
> That would let us avoid the ugly situation in JR2, where we need to stop 
> writes for a while to produce a consistent backup.
> The restore in such a scenario would discard revisions that happened after 
> said marker (but still made it into the backup).



--
This message was sent by Atlassian JIRA
(v6.1#6144)
