[ 
https://issues.apache.org/jira/browse/OAK-10418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Wadhwa updated OAK-10418:
--------------------------------
    Description: 
The Content Transfer Tool is a tool developed by Adobe that you can use to 
initiate the migration of existing content from a source AEM instance 
(on-premise or AMS) to the target AEM Cloud Service instance.

For migrating inline blobs, we use the Oak API, which uploads this data to 
an Azure container. In the current architecture, it spawns 20 threads, with 
each thread uploading one blob at a time. [Repo 
link|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-blob-cloud-azure/src/main/java/org/apache/jackrabbit/oak/blob/cloud/azure/blobstorage/AzureBlobStoreBackend.java#L290]

However, uploading one blob at a time incurs network latency on every 
call, which increases the overall migration time.
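
As a rough illustration of the pattern described above (a minimal sketch, not 
the actual AzureBlobStoreBackend code; BlobUploader and PerBlobUploadSketch are 
hypothetical names introduced only for this example), a fixed pool of threads 
issues one upload call per blob, so every blob pays its own network round trip:

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a single-blob upload call; not an Oak interface. */
interface BlobUploader {
    void upload(String blobId) throws Exception;
}

class PerBlobUploadSketch {
    /**
     * Submits one task per blob to a fixed-size pool. Each task performs
     * exactly one upload call. Returns the number of successful uploads.
     */
    static int uploadAll(List<String> blobIds, BlobUploader uploader, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger uploaded = new AtomicInteger();
        for (String id : blobIds) {
            pool.submit(() -> {
                try {
                    uploader.upload(id);          // one network call per blob
                    uploaded.incrementAndGet();
                } catch (Exception e) {
                    // Sketch only: a failed upload is simply not counted.
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
        return uploaded.get();
    }
}
```

Calling PerBlobUploadSketch.uploadAll(ids, uploader, 20) would mirror the 
20-thread setup mentioned above; the per-call latency is paid once per blob no 
matter how the pool is sized.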

Proposed solution: Rather than uploading the blobs one at a time, we would 
first stage them in temporary storage. Once that step is complete, we can use 
the publicly available AzCopy tool to transfer them, which significantly 
accelerates the data migration process.
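
The proposed flow could look roughly like the following sketch. It is not the 
POC code: StagedAzCopySketch, its method names, and the container URL are 
assumptions made for illustration. It stages each blob as a file in a local 
directory and then builds an "azcopy copy <source> <destination> --recursive" 
command line for that directory:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Map;

class StagedAzCopySketch {
    /** Writes each blob to its own file under stagingDir (file name = blob id). */
    static void stage(Map<String, byte[]> blobs, Path stagingDir) throws IOException {
        Files.createDirectories(stagingDir);
        for (Map.Entry<String, byte[]> e : blobs.entrySet()) {
            Files.write(stagingDir.resolve(e.getKey()), e.getValue());
        }
    }

    /**
     * Builds the command that uploads the whole staging directory in one
     * AzCopy run. containerSasUrl is a placeholder for a container URL
     * that carries a SAS token; it is an assumption of this sketch.
     */
    static List<String> azCopyCommand(Path stagingDir, String containerSasUrl) {
        return List.of("azcopy", "copy", stagingDir.toString(), containerSasUrl, "--recursive");
    }
}
```

The resulting list can be handed to java.lang.ProcessBuilder to run AzCopy once 
for the whole directory, instead of issuing one upload call per blob.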

I conducted a Proof of Concept (POC) locally and observed a 15% improvement 
with this approach.

  was:
The Content Transfer Tool is a tool developed by Adobe that you can use to 
initiate the migration of existing content from a source AEM instance 
(on-premise or AMS) to the target AEM Cloud Service instance.

For migrating inline blobs, we use the Oak API, which uploads this data to 
an Azure container. In the current architecture, it spawns 20 threads, with 
each thread uploading one blob at a time. [Repo 
link|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-blob-cloud-azure/src/main/java/org/apache/jackrabbit/oak/blob/cloud/azure/blobstorage/AzureBlobStoreBackend.java#L290]

However, uploading one blob at a time incurs network latency on every 
call, which increases the overall migration time.

Proposed solution: Rather than uploading the blobs one at a time, we would 
first stage them in temporary storage. Once that step is complete, we can use 
the publicly available AzCopy tool to transfer them, which significantly 
accelerates the data migration process.


> Query : Faster binaries migration to azure cloud while using OAK repo
> ---------------------------------------------------------------------
>
>                 Key: OAK-10418
>                 URL: https://issues.apache.org/jira/browse/OAK-10418
>             Project: Jackrabbit Oak
>          Issue Type: Documentation
>          Components: jackrabbit-api
>            Reporter: Bharat Wadhwa
>            Assignee: Thomas Mueller
>            Priority: Minor
>              Labels: jackrabbit
>
> The Content Transfer Tool is a tool developed by Adobe that you can use to 
> initiate the migration of existing content from a source AEM instance 
> (on-premise or AMS) to the target AEM Cloud Service instance.
> For migrating inline blobs, we use the Oak API, which uploads this data 
> to an Azure container. In the current architecture, it spawns 20 threads, 
> with each thread uploading one blob at a time. [Repo 
> link|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-blob-cloud-azure/src/main/java/org/apache/jackrabbit/oak/blob/cloud/azure/blobstorage/AzureBlobStoreBackend.java#L290]
> However, uploading one blob at a time incurs network latency on every 
> call, which increases the overall migration time.
> Proposed solution: Rather than uploading the blobs one at a time, we would 
> first stage them in temporary storage. Once that step is complete, we can 
> use the publicly available AzCopy tool to transfer them, which significantly 
> accelerates the data migration process.
> I conducted a Proof of Concept (POC) locally and observed a 15% 
> improvement with this approach.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
