[jira] [Updated] (OAK-4751) Improve the checkpoint migration performance

JIRA Fri, 16 Dec 2016 03:18:54 -0800

     [ 
https://issues.apache.org/jira/browse/OAK-4751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Tomek Rękawek updated OAK-4751:
-------------------------------
    Fix Version/s: 1.4.12

> Improve the checkpoint migration performance
> --------------------------------------------
>
>                 Key: OAK-4751
>                 URL: https://issues.apache.org/jira/browse/OAK-4751
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: segment-tar, upgrade
>            Reporter: Tomek Rękawek
>            Assignee: Tomek Rękawek
>             Fix For: 1.5.10, 1.4.12, 1.6
>
>         Attachments: OAK-4751.patch
>
>
> (based on [~alex.parvulescu] input):
> During the segment->segment-tar migration, a fair amount of time is taken
> by the deduplication process. The repository ingests large amounts of
> content (a checkpoint is the equivalent of a full repo state), and only
> once it deduplicates the data does it find that the data is already
> available in the destination repository.
> This happens because the diff mechanism cannot be efficient across
> repositories.
> For example: on the source repo we have the root state r0 and a checkpoint
> cp0 very close to r0. diff(r0, cp0) is extremely cheap, measured in
> milliseconds. But the sidegrade copies r0 to the destination repository
> (r0 -> rx1) and then runs diff(rx1, cp0), which becomes very expensive: the
> 2 node states don't originate from the same repository, so diffing falls
> back to a slow content-equals comparison. Since the content is almost
> equal, a huge number of cycles is wasted deduplicating data across the 2
> repositories.
> I have no easy solution here other than providing a diff mechanism that
> compares the 2 local states, diff(r0, cp0), BUT applies the delta to the
> destination repository (on rx1). I'm not sure how easy this will turn out
> to be, or whether it's worth the effort.
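
The idea in the description (diff the two local states, apply the delta
remotely) can be illustrated with a minimal sketch. This is a toy model, not
Oak code and not the attached patch: Node, Builder, applyLocalDiff and
sameContent are hypothetical names, and reference identity (==) stands in for
"same record in the same repository", which is an assumption made only to
show why the local diff is cheap.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

class CheckpointDiffSketch {

    /** Toy immutable node state (hypothetical, not an Oak type). */
    static final class Node {
        final String value;
        final Map<String, Node> children;

        Node(String value, Map<String, Node> children) {
            this.value = value;
            this.children = Collections.unmodifiableMap(new TreeMap<>(children));
        }
    }

    /** Toy mutable builder living in the destination repository. */
    static final class Builder {
        String value;
        final Map<String, Builder> children = new TreeMap<>();

        /** Deep-copies a source node, as the initial r0 -> rx1 copy would. */
        static Builder copyOf(Node source) {
            Builder b = new Builder();
            b.value = source.value;
            source.children.forEach((name, child) -> b.children.put(name, copyOf(child)));
            return b;
        }

        Node freeze() {
            Map<String, Node> frozen = new TreeMap<>();
            children.forEach((name, child) -> frozen.put(name, child.freeze()));
            return new Node(value, frozen);
        }
    }

    /** Convenience factory: node("v", "name1", child1, "name2", child2, ...). */
    static Node node(String value, Object... namedChildren) {
        Map<String, Node> children = new TreeMap<>();
        for (int i = 0; i < namedChildren.length; i += 2) {
            children.put((String) namedChildren[i], (Node) namedChildren[i + 1]);
        }
        return new Node(value, children);
    }

    /**
     * Diffs two states of the SOURCE repository and replays the delta on a
     * destination builder that mirrors 'before'. Returns the number of nodes
     * that actually had to be compared.
     */
    static int applyLocalDiff(Node before, Node after, Builder target) {
        if (before == after) {
            return 0; // shared subtree: nothing to compare or copy
        }
        int compared = 1;
        target.value = after.value;
        for (String name : before.children.keySet()) {
            if (!after.children.containsKey(name)) {
                target.children.remove(name); // child deleted
            }
        }
        for (Map.Entry<String, Node> e : after.children.entrySet()) {
            Node old = before.children.get(e.getKey());
            if (old == null) {
                target.children.put(e.getKey(), Builder.copyOf(e.getValue())); // child added
            } else {
                compared += applyLocalDiff(old, e.getValue(), target.children.get(e.getKey()));
            }
        }
        return compared;
    }

    /** Slow structural comparison, the fallback when identities cannot match. */
    static boolean sameContent(Node a, Node b) {
        if (!a.value.equals(b.value) || !a.children.keySet().equals(b.children.keySet())) {
            return false;
        }
        for (String name : a.children.keySet()) {
            if (!sameContent(a.children.get(name), b.children.get(name))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Node shared = node("big", "x", node("1"), "y", node("2"));
        Node r0 = node("root", "content", shared, "meta", node("v1"));
        Node cp0 = node("root", "content", shared, "meta", node("v0"));
        Builder rx1 = Builder.copyOf(r0);            // r0 -> rx1
        int compared = applyLocalDiff(r0, cp0, rx1); // diff(r0, cp0) applied on rx1
        System.out.println("compared=" + compared
                + " equal=" + sameContent(rx1.freeze(), cp0));
    }
}
```

In this sketch the shared "content" subtree is skipped by the identity check,
so only the changed path is walked. Calling sameContent on the copied state
and the checkpoint would have to walk every node instead, which is the kind of
cost the description attributes to diff(rx1, cp0).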



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
