[ 
https://issues.apache.org/jira/browse/OAK-2587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dürig updated OAK-2587:
-------------------------------
    Labels: observation  (was: )

> observation processing too eager/unfair under load
> --------------------------------------------------
>
>                 Key: OAK-2587
>                 URL: https://issues.apache.org/jira/browse/OAK-2587
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.0.12
>            Reporter: Stefan Egli
>            Assignee: Michael Dürig
>            Priority: Critical
>              Labels: observation
>             Fix For: 1.1.8
>
>         Attachments: OAK-2587.patch
>
>
> The current implementation of Oak's observation event processing is too eager 
> and thus unfair under load. 
> Consider having many (e.g. 200) EventListeners but only a relatively small 
> thread pool (e.g. 5, the default in Sling) backing them. When processing 
> changes for a particular BackgroundObserver, that observer (in 
> BackgroundObserver.completionHandler.call) currently processes *all changes 
> irrespective of how many there are* - i.e. it is *eager*. Only once that 
> BackgroundObserver has processed all changes will it let go and 'pass the 
> thread' to the next BackgroundObserver. Now if for some reason changes (i.e. commits) 
> are coming in while a BackgroundObserver is busy processing an earlier 
> change, this will lengthen that while loop. As a result the remaining (eg 
> 195) *EventListeners will have to wait for a potentially long time* until 
> it's their turn - thus *unfair*.
> Now combine the above pattern with a scenario where MongoDB is used as the 
> underlying store. In that case, to remain highly performant, it is 
> important that the diffs (for compareAgainstBaseState) are served from the 
> MongoDiffCache in as many cases as possible, to avoid a round trip to 
> mongod. The unfairness in the BackgroundObservers can now result in a large 
> delay between the 'first' observers getting the event and the 'last' one (of 
> those 200). When this delay increases due to a burst in the load, there is a 
> risk that the diffs are no longer in the cache - those last observers are 
> basically kicked out of the (diff) cache. Once this happens, *the situation 
> gets even worse*, since now there are yet more new commits coming in and old 
> changes still waiting to be processed - all of which are processed 
> in 'stripes of 5 listeners' before the next one gets a chance. This 
> at some point results in totally inefficient cache behavior, or in other 
> words, at some point all diffs have to be read from mongod.
> To avoid this there are probably a number of options - a few that come to 
> mind:
> * increase the thread pool size to match or be closer to the number of 
> listeners (but this has other disadvantages, e.g. the cost of thread switching)
> * make BackgroundObservers fairer by limiting the number of changes they 
> process before they give others a chance to be served by the pool.
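
The second option above can be sketched roughly as follows. This is a minimal illustration, not Oak's BackgroundObserver and not the attached patch: the class FairObserver, the maxPerTurn limit, the log list and the simulate helper are all hypothetical names introduced here. Each observer handles at most maxPerTurn queued changes per turn and then resubmits itself to the executor, so it lines up again behind the other waiting observers.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.Executor;

/** Hypothetical sketch of a "fair" observer; not Oak's actual implementation. */
class FairObserver implements Runnable {
    private final String name;
    private final int maxPerTurn;      // assumed fairness limit per turn
    private final Executor executor;   // shared, small pool in a real setup
    private final List<String> log;    // stand-in for event delivery
    private final Queue<Integer> pending = new ArrayDeque<>();
    private boolean scheduled;         // guarded by "this"

    FairObserver(String name, int maxPerTurn, Executor executor, List<String> log) {
        this.name = name;
        this.maxPerTurn = maxPerTurn;
        this.executor = executor;
        this.log = log;
    }

    /** Queue a change and make sure exactly one processing turn is scheduled. */
    synchronized void contentChanged(int change) {
        pending.add(change);
        if (!scheduled) {
            scheduled = true;
            executor.execute(this);
        }
    }

    @Override
    public void run() {
        for (int i = 0; i < maxPerTurn; i++) {
            Integer change;
            synchronized (this) {
                change = pending.poll();
                if (change == null) {  // queue drained, stop until the next change
                    scheduled = false;
                    return;
                }
            }
            // Pass a thread-safe list here when using a real thread pool.
            log.add(name + ":" + change);
        }
        synchronized (this) {
            if (pending.isEmpty()) {
                scheduled = false;
                return;
            }
        }
        // Budget used up but work remains: go to the back of the executor queue.
        executor.execute(this);
    }

    /** Two observers, three changes each, run on a single-threaded task queue. */
    static List<String> simulate(int maxPerTurn) {
        Queue<Runnable> tasks = new ArrayDeque<>();
        List<String> log = new ArrayList<>();
        FairObserver a = new FairObserver("A", maxPerTurn, tasks::add, log);
        FairObserver b = new FairObserver("B", maxPerTurn, tasks::add, log);
        for (int c = 1; c <= 3; c++) {
            a.contentChanged(c);
            b.contentChanged(c);
        }
        while (!tasks.isEmpty()) {
            tasks.poll().run();
        }
        return log;
    }
}
```

With an effectively unlimited budget, simulate(Integer.MAX_VALUE) reproduces the eager order A:1, A:2, A:3, B:1, B:2, B:3, whereas simulate(1) interleaves the two observers as A:1, B:1, A:2, B:2, A:3, B:3. A small limit trades some per-observer throughput for a shorter gap between the first and the last observer seeing a given change, which is the gap the diff cache concern above is about.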



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
