[ https://issues.apache.org/jira/browse/OAK-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Davide Giannella closed OAK-2359.
---------------------------------

Bulk close for 1.1.4

> read is inefficient when there are many split documents
> -------------------------------------------------------
>
>                 Key: OAK-2359
>                 URL: https://issues.apache.org/jira/browse/OAK-2359
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.0.8
>         Environment: 1.0.8.r1644758
>            Reporter: Stefan Egli
>            Assignee: Marcel Reutegger
>            Priority: Critical
>             Fix For: 1.1.4, 1.0.10
>
>         Attachments: oak2359patch.diff
>
>
> As reported in OAK-2358, there is a potential problem with revisionGC not 
> cleaning up split documents properly (in 1.0.8.r1644758 at least). 
> As a side effect, having many garbage revisions makes the diffImpl 
> algorithm very slow: normally it takes only a few milliseconds, but 
> with nodes that have many split documents I can see diffImpl take hundreds of 
> milliseconds, sometimes up to a few seconds. This causes observation dequeuing 
> to be slower than the rate at which observation events are enqueued, so the 
> observation queue is never drained and event handling is 
> delayed more and more.
> Adding some logging showed that diffImpl often reads many 
> split documents, which supports the assumption that the diffImpl slowness 
> is a side effect of revisionGC not cleaning up revisions. Having said 
> that, diffImpl should probably still be able to run fast, since all the 
> revisions it needs to look at should be in the main document, not in split 
> documents.
> Unfortunately, I don't have a test case handy for this at the moment. If more 
> comes up, I'll add more details here.
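
The queue growth described in the report can be illustrated with a small, self-contained sketch. It is not Oak code: the class name, the method name, and all timings are hypothetical, and it assumes a single consumer that handles events one at a time while events arrive at a fixed interval. It only shows why a per-event cost above the arrival interval leaves a growing backlog.

```java
public class ObservationBacklogSketch {

    /**
     * Hypothetical model: events arrive every enqueueIntervalMillis and a single
     * consumer needs processMillis per event (standing in for the diff cost).
     * Returns how many events are not yet fully processed at the moment the
     * last event arrives.
     */
    static int backlogAtLastArrival(int events, long enqueueIntervalMillis, long processMillis) {
        long consumerFreeAt = 0;
        long lastArrival = (long) (events - 1) * enqueueIntervalMillis;
        int pending = 0;
        for (int i = 0; i < events; i++) {
            long arrival = (long) i * enqueueIntervalMillis;
            // The consumer starts an event when it has arrived and the previous one is done.
            long start = Math.max(arrival, consumerFreeAt);
            consumerFreeAt = start + processMillis;
            // Count the event as backlog if it finishes after the last arrival.
            if (consumerFreeAt > lastArrival) {
                pending++;
            }
        }
        return pending;
    }

    public static void main(String[] args) {
        // Assumed numbers: 10 events, one every 100 ms.
        System.out.println("fast diff (5 ms):   " + backlogAtLastArrival(10, 100, 5));
        System.out.println("slow diff (250 ms): " + backlogAtLastArrival(10, 100, 250));
    }
}
```

With the assumed numbers, the fast case leaves only the event that has just arrived, while the slow case leaves most of the events still waiting, and the gap widens as more events are enqueued.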



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
