[
https://issues.apache.org/jira/browse/OAK-2587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492415#comment-14492415
]
Chetan Mehrotra commented on OAK-2587:
--------------------------------------
[~mduerig] Would it be fine to merge this to the 1.0 branch now, given that
this change has been in trunk for quite a while?
> observation processing too eager/unfair under load
> --------------------------------------------------
>
> Key: OAK-2587
> URL: https://issues.apache.org/jira/browse/OAK-2587
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: core
> Affects Versions: 1.0.12
> Reporter: Stefan Egli
> Assignee: Michael Dürig
> Priority: Critical
> Labels: observation
> Fix For: 1.1.8
>
> Attachments: OAK-2587.patch
>
>
> The current implementation of Oak's observation event processing is too eager
> and thus unfair under load.
> Consider having many (e.g. 200) EventListeners but only a relatively small
> thread pool (e.g. 5, as is the default in Sling) backing them. When processing
> changes for a particular BackgroundObserver, that one (in
> BackgroundObserver.completionHandler.call) currently processes *all changes
> irrespective of how many there are* - i.e. it is *eager*. Only once that
> BackgroundObserver has processed all changes will it let go and 'pass the
> thread' to the next BackgroundObserver. Now if for some reason changes (i.e.
> commits) are coming in while a BackgroundObserver is busy processing an earlier
> change, this will lengthen that while loop. As a result the remaining (e.g.
> 195) *EventListeners will have to wait for a potentially long time* until
> it's their turn - thus *unfair*.
> Now combine the above pattern with a scenario where mongo is used as the
> underlying store. In that case, in order to remain highly performant, it is
> important that the diffs (for compareAgainstBaseState) are served from the
> MongoDiffCache in as many cases as possible to avoid a round-trip to
> mongod. The unfairness in the BackgroundObservers can now result in a large
> delay between the 'first' observers getting the event and the 'last' one (of
> those 200). When this delay increases due to a burst in the load, there is a
> risk that the diffs are no longer in the cache - those last observers are
> basically kicked out of the (diff) cache. Once this happens, *the situation
> gets even worse*, since now you have yet more new commits coming in and old
> changes still having to be processed - all of which are processed in
> 'stripes of 5 listeners' before the next one gets a chance. This at some
> point results in totally inefficient cache behavior, or in other words, at
> some point all diffs have to be read from mongod.
> To avoid this there are probably a number of options - a few that come to
> mind:
> * increase the thread pool to match or be closer to the number of listeners
> (but this has other disadvantages, e.g. cost of thread-switching)
> * make BackgroundObservers fairer by limiting the number of changes they
> process before they give others a chance to be served by the pool.
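
The second option above can be sketched roughly as follows. This is only an
illustration of the idea, not Oak's BackgroundObserver and not the attached
patch: the class name FairObserver, the parameter maxPerRun and the method
names are all hypothetical. Each observer handles at most maxPerRun queued
changes per run and then resubmits itself to the shared executor, so it goes
to the back of the pool's queue instead of holding a thread until its own
queue is empty.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

/** Hypothetical sketch of a "fair" observer; not the actual Oak class. */
final class FairObserver<T> {
    private final Queue<T> queue = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean scheduled = new AtomicBoolean(false);
    private final Executor executor;
    private final Consumer<T> listener;
    private final int maxPerRun; // assumed knob: changes handled per run

    FairObserver(Executor executor, Consumer<T> listener, int maxPerRun) {
        if (maxPerRun < 1) {
            throw new IllegalArgumentException("maxPerRun must be >= 1");
        }
        this.executor = executor;
        this.listener = listener;
        this.maxPerRun = maxPerRun;
    }

    /** Called for every new change (i.e. commit). */
    void contentChanged(T change) {
        queue.add(change);
        schedule();
    }

    /** Submit at most one pending run of this observer to the pool. */
    private void schedule() {
        if (scheduled.compareAndSet(false, true)) {
            executor.execute(this::runBatch);
        }
    }

    /** Handle a bounded batch, then give the thread back to the pool. */
    private void runBatch() {
        try {
            for (int i = 0; i < maxPerRun; i++) {
                T change = queue.poll();
                if (change == null) {
                    break; // nothing left for this observer
                }
                listener.accept(change);
            }
        } finally {
            scheduled.set(false);
        }
        // Changes may remain or may have arrived meanwhile: queue up again
        // behind the other observers rather than looping here.
        if (!queue.isEmpty()) {
            schedule();
        }
    }
}
```

With maxPerRun set to 2, an observer with five queued changes would take
three turns on the pool, and other observers submitted in the meantime get
served in between. Choosing the limit is a trade-off: a small value is fairer
but causes more task submissions.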