Chetan Mehrotra created OAK-2882:
------------------------------------
Summary: Support migration without access to DataStore
Key: OAK-2882
URL: https://issues.apache.org/jira/browse/OAK-2882
Project: Jackrabbit Oak
Issue Type: New Feature
Components: upgrade
Reporter: Chetan Mehrotra
Assignee: Chetan Mehrotra
Fix For: 1.3.0, 1.0.15
Migration currently requires access to the DataStore, as it is configured as
part of repository.xml. However, during a complete migration the actual binary
content in the DataStore is never read; the migration logic only makes use of
* DataIdentifier = id of the file
* Length = as it gets encoded as part of the blobId (OAK-1667)
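To illustrate why only these two values are needed, here is a minimal sketch of a blobId that carries the length alongside the identifier. The class name and the {{<id>#<length>}} layout are assumptions made for illustration; the actual encoding is whatever OAK-1667 defines.

```java
// Hypothetical helper: the "<id>#<length>" layout is an assumption,
// not necessarily the exact encoding introduced by OAK-1667.
final class LengthEncodedBlobId {
    private static final char SEP = '#';

    final String dataIdentifier;
    final long length;

    LengthEncodedBlobId(String dataIdentifier, long length) {
        this.dataIdentifier = dataIdentifier;
        this.length = length;
    }

    /** Builds the encoded form, e.g. identifier + '#' + length. */
    String encode() {
        return dataIdentifier + SEP + length;
    }

    /** Splits an encoded blobId back into identifier and length. */
    static LengthEncodedBlobId parse(String encoded) {
        int idx = encoded.lastIndexOf(SEP);
        if (idx < 0) {
            throw new IllegalArgumentException("No length suffix in " + encoded);
        }
        String id = encoded.substring(0, idx);
        long len = Long.parseLong(encoded.substring(idx + 1));
        return new LengthEncodedBlobId(id, len);
    }
}
```

With such an encoding, producing the blobId requires the identifier and the length but never the binary content itself.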
It would be faster and more convenient to allow migration without actual
access to the DataStore. This would have two benefits
# Allows one to test out a migration on a local setup by just copying the TarPM
files. For example, if direct access to the DataStore can be avoided, zipping
only the following files is enough to get going with repository startup
{noformat}
>crx-quickstart# tar -zcvf repo-2.tar.gz repository
>--exclude=repository/repository/datastore
>--exclude=repository/repository/index
>--exclude=repository/workspaces/crx.default/index
>--exclude=repository/tarJournal
{noformat}
# Provides faster (repeatable) migration, as access to the DataStore, which in
cases like S3 might be slow, can be avoided. This assumes we solve how to
obtain the length
*Proposal*
Have a DataStore implementation which can be provided with a mapping file
containing entries for blobId and length. This file would be used to answer
queries about the length and existence of a blob, and would thus avoid actual
access to the DataStore.
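A minimal sketch of the lookup side of this idea follows. It is not tied to the Jackrabbit DataStore API: the class name {{BlobLengthMapping}} and the one-entry-per-line {{blobId|length}} file format are assumptions chosen for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical class; the "blobId|length" line format is an assumption.
final class BlobLengthMapping {
    private final Map<String, Long> lengths = new HashMap<>();

    /** Reads all "blobId|length" entries from the given source. */
    static BlobLengthMapping load(Reader source) {
        BlobLengthMapping mapping = new BlobLengthMapping();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue; // tolerate blank lines
                }
                int sep = line.lastIndexOf('|');
                if (sep < 0) {
                    throw new IllegalArgumentException("Malformed entry: " + line);
                }
                String blobId = line.substring(0, sep);
                long length = Long.parseLong(line.substring(sep + 1));
                mapping.lengths.put(blobId, length);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return mapping;
    }

    /** Answers existence checks purely from the precomputed data. */
    boolean exists(String blobId) {
        return lengths.containsKey(blobId);
    }

    /** Returns the recorded length, or empty if the blob is not listed. */
    OptionalLong length(String blobId) {
        Long value = lengths.get(blobId);
        return value == null ? OptionalLong.empty() : OptionalLong.of(value);
    }
}
```

Both queries are served from memory, so a migration run backed by such a mapping would not need to contact the real DataStore for blobs listed in the file.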
Going further, this DataStore can be configured with a delegate which is used
as a fallback in case the required details are not present in the precomputed
data set (for example because content changed after that data was computed)
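The fallback behaviour could look roughly like the sketch below. The {{BlobLengthSource}} interface and {{FallbackLengthSource}} class are hypothetical stand-ins, not the Jackrabbit DataStore API; a real implementation would wrap an actual DataStore as the delegate.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal contract standing in for the real DataStore.
interface BlobLengthSource {
    /** Returns the blob length, or -1 if the blob is unknown. */
    long getLength(String blobId);
}

// Hypothetical wrapper: precomputed data first, delegate second.
final class FallbackLengthSource implements BlobLengthSource {
    private final Map<String, Long> precomputed;
    private final BlobLengthSource delegate; // may be null: no fallback

    FallbackLengthSource(Map<String, Long> precomputed, BlobLengthSource delegate) {
        this.precomputed = new HashMap<>(precomputed);
        this.delegate = delegate;
    }

    @Override
    public long getLength(String blobId) {
        Long known = precomputed.get(blobId);
        if (known != null) {
            return known; // answered without touching the delegate
        }
        // Not in the precomputed set, e.g. content added after the
        // mapping was generated: ask the (possibly slow) delegate.
        return delegate != null ? delegate.getLength(blobId) : -1L;
    }
}
```

Checking the precomputed data first keeps the common case cheap, while the optional delegate covers blobs the mapping does not know about.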
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)