[ 
https://issues.apache.org/jira/browse/OAK-2654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14371596#comment-14371596
 ] 

Alex Parvulescu commented on OAK-2654:
--------------------------------------

FYI: I tested this patch against a fresh 1.0 build by migrating a repository 
with 13M nodes, and the editors part (mainly indexing) went from *55 mins* to 
*35 mins* on my local machine. I also used the _mmap_ flag ('java -Xmx8g 
-XX:MaxPermSize=512M -jar crx2oak-*.jar --mmap old-repo new-repo').
Maybe using a counter can push this further, but I no longer see #refresh 
listed as a bottleneck.
cc [~chetanm]

> SegmentIdTable too eager to refresh
> -----------------------------------
>
>                 Key: OAK-2654
>                 URL: https://issues.apache.org/jira/browse/OAK-2654
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: segmentmk
>            Reporter: Alex Parvulescu
>            Assignee: Alex Parvulescu
>             Fix For: 1.1.8, 1.2
>
>         Attachments: OAK-2654.patch
>
>
> Calling SegmentIdTable#getSegmentId might trigger a reference table refresh 
> if a certain condition is met. I think this condition is too eager to 
> trigger the refresh, and in high-write scenarios this results in long pauses 
> because the method is synchronized.
> The current condition resembles a cache miss (_index != first_), which means 
> that when looking up a segment id by its _lsb_, one of two things happens: 
>  - the id is not there, so it needs to be added (no refresh on this branch)
>  - or there is an overlap on lsb values (actually on the value returned by 
> _getIndex(lsb)_), in which case a refresh will be triggered.
> In high-write scenarios the second case happens a lot more frequently, so a 
> refresh is triggered even if it might not be needed. A refresh makes sense 
> when there are null references that could be collected; otherwise we're 
> just creating clones of the same map over and over again.
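
The idea in the description above can be illustrated with a minimal Java 
sketch. This is not the Oak code and not the attached patch: SketchIdTable, 
its Id type, the linear probing and the 75% load limit are all assumptions 
made for the example. It only shows a lookup that rebuilds the table when it 
has actually seen a cleared (null) weak reference, instead of on every 
collision (_index != first_).

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical, simplified id table; NOT the Oak SegmentIdTable or the patch. */
final class SketchIdTable {

    /** Hypothetical id type, identified here only by its lsb. */
    static final class Id {
        final long lsb;
        Id(long lsb) { this.lsb = lsb; }
    }

    private List<WeakReference<Id>> slots;
    private int used = 0;          // occupied slots, including cleared ones
    private int refreshCount = 0;  // how often the table was rebuilt

    public SketchIdTable(int size) {
        slots = emptySlots(Math.max(1, size));
    }

    public synchronized Id getId(long lsb) {
        if ((used + 1) * 4 > slots.size() * 3) {
            refresh();  // assumed load limit reached: grow, dropping cleared entries
        }
        int index = indexOf(lsb, slots.size());
        boolean sawCleared = false;
        WeakReference<Id> ref = slots.get(index);
        while (ref != null) {
            Id id = ref.get();
            if (id == null) {
                sawCleared = true;  // referent was garbage collected
            } else if (id.lsb == lsb) {
                return id;          // hit, possibly after a collision
            }
            index = (index + 1) % slots.size();
            ref = slots.get(index);
        }
        Id id = new Id(lsb);
        slots.set(index, new WeakReference<>(id));
        used++;
        if (sawCleared) {
            refresh();  // only now is there something to clean up
        }
        return id;
    }

    /** Rebuilds the table, keeping only entries whose referent is still alive. */
    private void refresh() {
        refreshCount++;
        List<Id> live = new ArrayList<>();
        for (WeakReference<Id> ref : slots) {
            Id id = (ref == null) ? null : ref.get();
            if (id != null) {
                live.add(id);
            }
        }
        int size = slots.size();
        while ((live.size() + 1) * 4 > size * 3) {
            size *= 2;
        }
        List<WeakReference<Id>> fresh = emptySlots(size);
        for (Id id : live) {
            int i = indexOf(id.lsb, size);
            while (fresh.get(i) != null) {
                i = (i + 1) % size;
            }
            fresh.set(i, new WeakReference<>(id));
        }
        slots = fresh;
        used = live.size();
    }

    private static int indexOf(long lsb, int size) {
        return (int) Math.floorMod(lsb, (long) size);
    }

    private static List<WeakReference<Id>> emptySlots(int size) {
        return new ArrayList<>(Collections.<WeakReference<Id>>nCopies(size, null));
    }

    public synchronized int refreshCount() {
        return refreshCount;
    }
}
```

In this sketch, looking up lsb 1 and then lsb 9 in a table of 8 slots 
collides on the same index, yet no rebuild happens as long as both ids are 
still strongly referenced, because no null reference was seen on the way.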



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
