[
https://issues.apache.org/jira/browse/OAK-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548013#comment-14548013
]
Julian Sedding commented on OAK-2882:
-------------------------------------
[~chetanm] I like the approach as it avoids relying on implementation details
as does OAK-2626 while providing the same or possibly better benefit.
I would suggest allowing the id-to-length mapping to be generated lazily, i.e.
start without the mapping file, fill the in-memory map during the first
migration run, and serialize the map to the file when the DataStore is closed.
This would take care of incremental/repeated upgrades transparently. Users
would also not need to be concerned with how to generate the file, while
advanced users could still supply a pre-computed file.
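To illustrate the lazy variant, here is a minimal sketch in Java. It is not taken from the attached patch: the class name `LazyLengthMap`, the `<id>|<length>` line format, and the use of a plain `ToLongFunction` as a stand-in for the real DataStore lookup are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.ToLongFunction;

/** Hypothetical sketch: an id-to-length map that is filled lazily and saved on close. */
class LazyLengthMap implements AutoCloseable {
    private final Path mappingFile;                 // may not exist on the first run
    private final ToLongFunction<String> delegate;  // stand-in for the real DataStore lookup
    private final Map<String, Long> lengths = new ConcurrentHashMap<>();

    LazyLengthMap(Path mappingFile, ToLongFunction<String> delegate) throws IOException {
        this.mappingFile = mappingFile;
        this.delegate = delegate;
        if (Files.exists(mappingFile)) {
            // Assumed line format: <id>|<length>
            for (String line : Files.readAllLines(mappingFile, StandardCharsets.UTF_8)) {
                int sep = line.lastIndexOf('|');
                if (sep > 0) {
                    lengths.put(line.substring(0, sep),
                                Long.parseLong(line.substring(sep + 1).trim()));
                }
            }
        }
    }

    /** Returns the cached length, asking the delegate only for ids not seen before. */
    long getLength(String id) {
        return lengths.computeIfAbsent(id, delegate::applyAsLong);
    }

    /** Writes every known entry back to the mapping file for the next run. */
    @Override
    public void close() throws IOException {
        List<String> lines = new ArrayList<>();
        lengths.forEach((id, len) -> lines.add(id + "|" + len));
        Files.write(mappingFile, lines, StandardCharsets.UTF_8);
    }
}
```

With this shape, the first migration run pays for the DataStore lookups once, and a repeated run only touches the DataStore for ids that were added in the meantime. A pre-computed file dropped at the same path would be picked up by the constructor without any other change.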
> Support migration without access to DataStore
> ---------------------------------------------
>
> Key: OAK-2882
> URL: https://issues.apache.org/jira/browse/OAK-2882
> Project: Jackrabbit Oak
> Issue Type: New Feature
> Components: upgrade
> Reporter: Chetan Mehrotra
> Assignee: Chetan Mehrotra
> Fix For: 1.3.0, 1.0.15
>
> Attachments: OAK-2882.patch, build_datastore_list.sh
>
>
> Migration currently requires access to the DataStore, as it is configured as
> part of repository.xml. However, in a complete migration the actual binary
> content in the DataStore is not accessed, and the migration logic only makes
> use of
> * DataIdentifier = id of the files
> * Length = as it gets encoded as part of the blobId (OAK-1667)
> It would be faster and beneficial to allow migration without actual access to
> the DataStore. This would have two benefits:
> # Allows one to test out migration on a local setup by just copying the TarPM
> files. For example, one can zip only the following files to get going with
> repository startup, if we can somehow avoid having direct access to the
> DataStore
> {noformat}
> >crx-quickstart# tar -zcvf repo-2.tar.gz repository
> >--exclude=repository/repository/datastore
> >--exclude=repository/repository/index
> >--exclude=repository/workspaces/crx.default/index
> >--exclude=repository/tarJournal
> {noformat}
> # Provides faster (repeatable) migration, as access to the DataStore can be
> avoided, which in cases like S3 might be slow. This assumes we solve how to
> get the length.
> *Proposal*
> Have a DataStore implementation which can be provided with a mapping file
> having entries for blobId and length. This file would be used to answer
> queries regarding the length and existence of a blob, and thus would avoid
> actual access to the DataStore.
> Going further, this DataStore can be configured with a delegate which can be
> used as a fallback in case the required details are not present in the
> pre-computed data set (maybe due to a change in content after that data was
> computed).
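The lookup-with-fallback behaviour described in the quoted proposal could look roughly like the following Java sketch. The interface `BlobLengthLookup`, the class `PrecomputedLengthLookup`, and the `<blobId>|<length>` entry format are hypothetical names and assumptions for illustration, not Oak or Jackrabbit APIs.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.OptionalLong;

/** Hypothetical lookup interface; not an Oak or Jackrabbit API. */
interface BlobLengthLookup {
    /** Returns the blob length, or empty if the blob is unknown. */
    OptionalLong lengthOf(String blobId);
}

/** Answers length and existence queries from pre-computed entries, with an optional fallback. */
class PrecomputedLengthLookup implements BlobLengthLookup {
    private final Map<String, Long> precomputed = new HashMap<>();
    private final BlobLengthLookup delegate; // null when no DataStore is reachable

    PrecomputedLengthLookup(List<String> mappingLines, BlobLengthLookup delegate) {
        this.delegate = delegate;
        for (String line : mappingLines) {
            // Assumed entry format: <blobId>|<length>
            int sep = line.lastIndexOf('|');
            if (sep > 0) {
                precomputed.put(line.substring(0, sep),
                                Long.parseLong(line.substring(sep + 1).trim()));
            }
        }
    }

    @Override
    public OptionalLong lengthOf(String blobId) {
        Long len = precomputed.get(blobId);
        if (len != null) {
            return OptionalLong.of(len);          // served without touching the DataStore
        }
        // Entry missing from the pre-computed set: ask the delegate if there is one.
        return delegate != null ? delegate.lengthOf(blobId) : OptionalLong.empty();
    }

    /** A blob exists if either the pre-computed set or the delegate knows its length. */
    boolean exists(String blobId) {
        return lengthOf(blobId).isPresent();
    }
}
```

Passing `null` as the delegate models the local-test case where only the TarPM files were copied; passing a real lookup covers content that changed after the mapping was computed.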
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)