[
https://issues.apache.org/jira/browse/OAK-7672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16574758#comment-16574758
]
Michael Dürig commented on OAK-7672:
------------------------------------
[~dulceanu] without looking into too much detail, the patch looks very good to
me. One thing I'm not sure about is whether all non-segment entries in the tar
files are handled properly (e.g. binary index, graph files etc.). An
interesting way to check might be to implement a test that copies the segments
from tar to Azure and back to tar and then ensures the resulting binaries are
the same (binary diff).
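A minimal sketch of the binary-diff half of such a test, assuming the original and the round-tripped tar files end up in two local directories. The class and method names ({{BinaryDiff}}, {{differingFiles}}) are hypothetical and not part of the patch; the copy to Azure and back is left out and would happen before this check.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Oak: compares two directories file by file.
final class BinaryDiff {

    /**
     * Returns the names of files that exist in only one of the two directories
     * or whose contents differ byte for byte. An empty list means both
     * directories hold identical files.
     */
    static List<String> differingFiles(Path original, Path roundTripped) throws IOException {
        Set<String> names = new TreeSet<>(fileNames(original));
        names.addAll(fileNames(roundTripped));

        List<String> differences = new ArrayList<>();
        for (String name : names) {
            Path left = original.resolve(name);
            Path right = roundTripped.resolve(name);
            if (!Files.isRegularFile(left) || !Files.isRegularFile(right)) {
                differences.add(name); // present on one side only
                continue;
            }
            // Reads whole files into memory: acceptable for small test
            // fixtures, not meant for production-sized tar files.
            if (!Arrays.equals(Files.readAllBytes(left), Files.readAllBytes(right))) {
                differences.add(name);
            }
        }
        return differences;
    }

    // Names of the regular files directly inside the given directory.
    private static Set<String> fileNames(Path dir) throws IOException {
        try (Stream<Path> files = Files.list(dir)) {
            return files.filter(Files::isRegularFile)
                    .map(p -> p.getFileName().toString())
                    .collect(Collectors.toCollection(TreeSet::new));
        }
    }
}
```

Because the comparison covers whole tar files, a difference in a non-segment entry such as the index or graph would show up as well, although the helper cannot tell which entry caused it.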
Regarding the documentation:
* the new segment-copy is missing from the table of contents.
* I prefer the following wording (bold): {{includes __all previous *revisions*
persisted in the Segment Store__ and therefore *retaining the entire history*.}}
> Introduce oak-run segment-copy for moving around segments in different
> storages
> -------------------------------------------------------------------------------
>
> Key: OAK-7672
> URL: https://issues.apache.org/jira/browse/OAK-7672
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: oak-run, segment-tar
> Reporter: Andrei Dulceanu
> Assignee: Andrei Dulceanu
> Priority: Major
> Labels: tooling
> Fix For: 1.10, 1.9.7
>
> Attachments: OAK-7672.patch
>
>
> Often there's the need to transform a type of {{SegmentStore}} (e.g. local
> TarMK) into *the exact same* counterpart, using another persistence type
> (e.g. Azure Segment Store). While {{oak-upgrade}} partially solves this
> through sidegrades (see OAK-7623), there's a gap in the final content because
> of the level at which {{oak-upgrade}} operates (node store level). Therefore,
> the resulting sidegraded repository doesn't contain all the (possibly stale,
> unreferenced) data from the original repository, but only the latest head
> state. A side effect of this is that the resulting repository is always
> compacted.
> Introducing a new command in {{oak-run}}, namely {{segment-copy}}, would
> allow us to operate at a lower level (i.e. segment persistence), dealing only
> with constructs from {{org.apache.jackrabbit.oak.segment.spi.persistence}}:
> journal file, archives and archive entries. This way the only focus of this
> process would be to "translate" a segment between two persistence formats,
> without caring about the node logic stored inside (referenced/unreferenced
> node/property).
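The persistence-level copy described above could be sketched roughly as follows. The {{ArchiveStore}} interface and the {{InMemoryStore}} class are hypothetical, simplified stand-ins invented for this illustration; they are not the real interfaces from {{org.apache.jackrabbit.oak.segment.spi.persistence}}, whose signatures differ. The point is only that every archive entry and the journal are copied verbatim, with no check of whether a segment is still referenced.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified persistence abstraction (NOT the Oak SPI).
interface ArchiveStore {
    List<String> listArchives();
    Map<String, byte[]> readEntries(String archiveName); // entry id -> raw bytes, in stored order
    void writeArchive(String archiveName, Map<String, byte[]> entries);
    List<String> readJournal();
    void writeJournal(List<String> lines);
}

// Hypothetical in-memory implementation, used here for both source and target.
final class InMemoryStore implements ArchiveStore {
    private final Map<String, Map<String, byte[]>> archives = new LinkedHashMap<>();
    private List<String> journal = new ArrayList<>();

    public List<String> listArchives() {
        return new ArrayList<>(archives.keySet());
    }

    public Map<String, byte[]> readEntries(String archiveName) {
        return archives.get(archiveName);
    }

    public void writeArchive(String archiveName, Map<String, byte[]> entries) {
        archives.put(archiveName, new LinkedHashMap<>(entries));
    }

    public List<String> readJournal() {
        return journal;
    }

    public void writeJournal(List<String> lines) {
        journal = new ArrayList<>(lines);
    }
}

final class SegmentCopySketch {

    // Copies every archive entry and the journal as-is. Nothing is interpreted
    // at node level, so stale or unreferenced entries are carried over too.
    static void copy(ArchiveStore source, ArchiveStore target) {
        for (String archiveName : source.listArchives()) {
            Map<String, byte[]> copied = new LinkedHashMap<>();
            for (Map.Entry<String, byte[]> entry : source.readEntries(archiveName).entrySet()) {
                copied.put(entry.getKey(), entry.getValue().clone());
            }
            target.writeArchive(archiveName, copied);
        }
        target.writeJournal(new ArrayList<>(source.readJournal()));
    }
}
```

In contrast to a node-store-level sidegrade, this loop never decides which data is live, so the target is not implicitly compacted.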