[ 
https://issues.apache.org/jira/browse/OAK-3148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716524#comment-14716524
 ] 

Tomek Rękawek edited comment on OAK-3148 at 8/27/15 11:58 AM:
--------------------------------------------------------------

I prepared a [pull request|https://github.com/apache/jackrabbit-oak/pull/36] 
with this feature. I tested it with a recent AEM version and it took 10 minutes 
to migrate the blobs from {{FileDataStore}} to {{FileBlobStore}}. More 
information can be found in the following migration scenario for an OSGi-based 
installation:

*Requirements*
* An OSGi-based Oak installation (e.g. Sling or AEM).
* The node store should be configured to use an external blob store 
({{customBlobStore=true}}).

*1. Enabling SplitBlobStore*
Steps:
# Add the {{split.blobstore=old}} OSGi property to the source blob store.
# Configure the destination blob store and add the {{split.blobstore=new}} 
property to its OSGi configuration.
# Create a configuration for the 
{{org.apache.jackrabbit.oak.spi.blob.osgi.SplitBlobStoreService}}.
#* It may be empty or contain just one parameter:
{code}
repository.home=crx-quickstart/repository
{code}
#* The directory is used to save the {{migrated_blobs.txt}} file.
# (optional) Restart the instance.

After the instance starts, the {{SplitBlobStoreService}} waits until the blob 
stores with the {{split.blobstore}} property (the {{old}} and the {{new}} one) 
are available. They are then bound and the {{SplitBlobStore}} is registered in 
OSGi. The {{NodeStoreService}}, in turn, ignores blob stores configured with 
the {{split.blobstore}} property and waits until the {{SplitBlobStore}} is 
available.

From this point on, all new blobs will be saved in the new blob store. 
Binaries from the old blob store will remain available for reading.
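
The routing described above (and spelled out in the acceptance criteria of the issue description quoted below) can be illustrated with a minimal Java sketch. It is only an illustration under stated assumptions: {{SimpleBlobStore}}, {{MemoryStore}} and {{SplitStore}} are hypothetical names made up for this example, not Oak's actual {{BlobStore}} API and not the code from the pull request.

```java
import java.util.HashMap;
import java.util.Map;

class SplitBlobStoreSketch {

    /** Hypothetical minimal store interface; Oak's real BlobStore API is richer. */
    interface SimpleBlobStore {
        byte[] read(String blobId);              // returns null when the blob is absent
        void write(String blobId, byte[] data);
    }

    /** In-memory store, present only to make the sketch runnable. */
    static class MemoryStore implements SimpleBlobStore {
        private final Map<String, byte[]> blobs = new HashMap<>();

        public byte[] read(String blobId) {
            return blobs.get(blobId);
        }

        public void write(String blobId, byte[] data) {
            blobs.put(blobId, data);
        }
    }

    /** Writes go to the new store; reads try the new store first, then the old one. */
    static class SplitStore implements SimpleBlobStore {
        private final SimpleBlobStore oldStore;
        private final SimpleBlobStore newStore;

        SplitStore(SimpleBlobStore oldStore, SimpleBlobStore newStore) {
            this.oldStore = oldStore;
            this.newStore = newStore;
        }

        public byte[] read(String blobId) {
            byte[] data = newStore.read(blobId);
            return data != null ? data : oldStore.read(blobId);
        }

        public void write(String blobId, byte[] data) {
            newStore.write(blobId, data);
        }
    }

    public static void main(String[] args) {
        MemoryStore oldStore = new MemoryStore();
        MemoryStore newStore = new MemoryStore();
        oldStore.write("legacy", "old data".getBytes());

        SplitStore split = new SplitStore(oldStore, newStore);
        split.write("fresh", "new data".getBytes());

        // The legacy blob is still readable, the fresh one lives only in the new store.
        System.out.println(new String(split.read("legacy")));
        System.out.println(oldStore.read("fresh") == null);
    }
}
```

The old store is never written to in this sketch, so it only shrinks in importance as blobs are copied over.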

The {{split.blobstore}} property support was added to {{FileBlobStore}}, 
{{AbstractDataStoreService}} (handling all Jackrabbit data stores), 
{{DocumentNodeStoreService}} and {{SegmentNodeStoreService}}.

*2. Migration*
Steps:
# Find the {{BlobMigration}} JMX bean in the Felix console.
# Run the {{startBlobMigration(false)}} operation.

The migration can be stopped using {{stopBlobMigration()}} and then resumed 
with {{startBlobMigration(true)}}. The current stats are available via JMX as 
well:

* last processed path,
* number of migrated nodes.
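
The stop and resume behaviour can be pictured with a small Java sketch. It rests on assumptions that are not confirmed by the comment: nodes are visited in a stable order, and a resumed run continues right after the last processed path. The class {{BlobMigrationSketch}} and its getters are hypothetical; only the operation names {{startBlobMigration}} and {{stopBlobMigration}} come from the description above, and this is not the implementation from the pull request.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

class BlobMigrationSketch {

    private final List<String> paths;          // node paths in a stable order (assumption)
    private final Consumer<String> copyBlobs;  // stands in for copying one node's binaries
    private String lastProcessedPath;          // stat: last processed path
    private int migratedNodes;                 // stat: number of migrated nodes
    private volatile boolean stopRequested;

    BlobMigrationSketch(List<String> paths, Consumer<String> copyBlobs) {
        this.paths = paths;
        this.copyBlobs = copyBlobs;
    }

    /** resume=false restarts from the first path; resume=true continues after the last one. */
    void startBlobMigration(boolean resume) {
        stopRequested = false;
        if (!resume) {
            lastProcessedPath = null;
            migratedNodes = 0;
        }
        boolean skipping = lastProcessedPath != null;
        for (String path : paths) {
            if (skipping) {
                // Skip everything up to and including the last processed path.
                if (path.equals(lastProcessedPath)) {
                    skipping = false;
                }
                continue;
            }
            if (stopRequested) {
                return;
            }
            copyBlobs.accept(path);
            lastProcessedPath = path;
            migratedNodes++;
        }
    }

    void stopBlobMigration() {
        stopRequested = true;
    }

    String getLastProcessedPath() {
        return lastProcessedPath;
    }

    int getMigratedNodes() {
        return migratedNodes;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList("/content/a", "/content/b", "/content/c");
        BlobMigrationSketch[] holder = new BlobMigrationSketch[1];
        // Simulate an administrator stopping the run while "/content/a" is processed.
        holder[0] = new BlobMigrationSketch(paths, path -> {
            if (path.equals("/content/a")) {
                holder[0].stopBlobMigration();
            }
        });
        holder[0].startBlobMigration(false);
        System.out.println(holder[0].getLastProcessedPath() + " " + holder[0].getMigratedNodes());
        holder[0].startBlobMigration(true);
        System.out.println(holder[0].getLastProcessedPath() + " " + holder[0].getMigratedNodes());
    }
}
```

A stop request takes effect between nodes, so the node being processed when the request arrives is still counted as migrated.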

*3. Switching to the new blob store*
When the migration is finished, it's possible to completely switch to the new 
blob store:
# Remove the configuration for the old blob store.
# Remove the configuration for the {{SplitBlobStoreService}}.
# Remove the {{split.blobstore=new}} OSGi property from the new blob store, so 
it can be found by the {{NodeStoreService}}.
# (optional) Restart the instance, so there are no JCR sessions bound to the 
old {{NodeState}}.

Migration is complete!


> Online migration process for the binaries
> -----------------------------------------
>
>                 Key: OAK-3148
>                 URL: https://issues.apache.org/jira/browse/OAK-3148
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: blob, upgrade
>            Reporter: Tomek Rękawek
>            Priority: Minor
>
> For clients that want to migrate their blob stores, let's add a new feature 
> that allows copying them in the background.
> AC:
> # SplitBlobStore
> ## Administrator can configure Oak to use the {{SplitBlobStore}} that 
> references the source (old) and the destination (new) blob store.
> ## Data stores can be used as well via the {{DataStoreBlobStore}}.
> ## On the read operation, if the requested blob exists on the new store, 
> SplitBlobStore will return it.
> ## Otherwise, SplitBlobStore will try to read the blob from the old store.
> ## All write requests will be directed to the new blob store.
> # Copy process
> ## Administrator can start, stop and resume the copy process using a JMX 
> command.
> ## Administrator can see the progress in JMX and logs.
> ## The process will read the {{SplitBlobStore}} configuration and copy the 
> binaries from source to destination
> ## Once a binary is moved, its reference in the {{NodeStore}} is updated and 
> committed.
> ## Only the head revision has to be updated.
> The idea is that after all binaries are copied, the old revisions will be 
> gradually removed by the compaction mechanisms and then the binaries will be 
> removed from the source store by the blob garbage collector. Future 
> improvements are possible, e.g. invoking the compaction and GC manually.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
