[ https://issues.apache.org/jira/browse/OAK-2587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Marth updated OAK-2587:
-------------------------------
    Fix Version/s:     (was: 1.0.13)
                       (was: 1.2)
                   1.1.8

> observation processing too eager/unfair under load
> --------------------------------------------------
>
>                 Key: OAK-2587
>                 URL: https://issues.apache.org/jira/browse/OAK-2587
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.0.12
>            Reporter: Stefan Egli
>            Priority: Critical
>             Fix For: 1.1.8
>
>         Attachments: OAK-2587.patch
>
>
> The current implementation of Oak's observation event processing is too eager
> and thus unfair under load scenarios.
>
> Consider having many (eg 200) EventListeners but only a relatively small
> thread pool (eg 5, as is the default in Sling) backing them. When processing
> changes for a particular BackgroundObserver, that one (in
> BackgroundObserver.completionHandler.call) currently processes *all changes
> irrespective of how many there are* - ie it is *eager*. Only once that
> BackgroundObserver has processed all changes will it let go and 'pass the
> thread' to the next BackgroundObserver. Now if for some reason changes (ie
> commits) are coming in while a BackgroundObserver is busy processing an
> earlier change, this will lengthen that while loop. As a result the remaining
> (eg 195) *EventListeners will have to wait for a potentially long time* until
> it's their turn - thus *unfair*.
>
> Now combine the above pattern with a scenario where mongo is used as the
> underlying store. In that case, in order to remain highly performant, it is
> important that the diffs (for compareAgainstBaseState) are served from the
> MongoDiffCache in as many cases as possible, to avoid doing a round-trip to
> mongoD. The unfairness in the BackgroundObservers can now result in a large
> delay between the 'first' observers getting the event and the 'last' one (of
> those 200). When this delay increases due to a burst in the load, there is a
> risk that the diffs are no longer in the cache - those last observers are
> basically kicked out of the (diff) cache. Once this happens, *the situation
> gets even worse*, since now you have new commits coming in and old changes
> still having to be processed - all of which are being processed in 'stripes
> of 5 listeners' before the next one gets a chance. At some point this results
> in totally inefficient cache behavior, or in other words, at some point all
> diffs have to be read from mongoD.
>
> To avoid this there are probably a number of options - a few that come to
> mind:
> * increase the thread pool to match or be closer to the number of listeners
>   (but this has other disadvantages, eg cost of thread-switching)
> * make BackgroundObservers fairer by limiting the number of changes they
>   process before they give others a chance to be served by the pool.
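
The second option listed in the issue (limit the number of changes processed per turn) could look roughly like the sketch below. This is only an illustration under stated assumptions: it is not Oak's BackgroundObserver and not the attached OAK-2587.patch. The class name FairObserver, the maxPerTurn parameter and the demo() method are hypothetical names made up for this example. The idea is that an observer drains at most maxPerTurn queued changes, then re-submits itself to the shared pool so that it queues up behind the other waiting observers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical sketch, NOT Oak's BackgroundObserver and NOT OAK-2587.patch.
class FairObserver<T> {
    private final Queue<T> pending = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean scheduled = new AtomicBoolean(false);
    private final ExecutorService pool;
    private final Consumer<T> listener;
    private final int maxPerTurn; // assumed knob: changes handled per turn

    FairObserver(ExecutorService pool, Consumer<T> listener, int maxPerTurn) {
        this.pool = pool;
        this.listener = listener;
        this.maxPerTurn = maxPerTurn;
    }

    // Called by the producer (eg on commit): queue the change, ask for a turn.
    void contentChanged(T change) {
        pending.add(change);
        schedule();
    }

    // At most one turn per observer is queued or running at any time.
    private void schedule() {
        if (scheduled.compareAndSet(false, true)) {
            pool.execute(this::runTurn);
        }
    }

    // Process a bounded batch, then give the thread back to the pool.
    private void runTurn() {
        try {
            for (int i = 0; i < maxPerTurn; i++) {
                T change = pending.poll();
                if (change == null) {
                    break;
                }
                listener.accept(change);
            }
        } finally {
            scheduled.set(false);
            if (!pending.isEmpty()) {
                schedule(); // re-queue behind the other waiting observers
            }
        }
    }

    // Demo: two observers, one pool thread, three changes each, batch size 2.
    static List<String> demo() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        List<String> order = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch gate = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(6);
        Consumer<String> listener = c -> { order.add(c); done.countDown(); };
        FairObserver<String> a = new FairObserver<>(pool, listener, 2);
        FairObserver<String> b = new FairObserver<>(pool, listener, 2);
        try {
            // Block the only pool thread until all changes are queued,
            // so the resulting order is deterministic.
            pool.execute(() -> {
                try {
                    gate.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            for (int i = 1; i <= 3; i++) a.contentChanged("a" + i);
            for (int i = 1; i <= 3; i++) b.contentChanged("b" + i);
            gate.countDown();
            if (!done.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return new ArrayList<>(order);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [a1, a2, b1, b2, a3, b3]
    }
}
```

In this demo, a batch size of 3 or more would reproduce the eager behavior described in the issue: observer b would not see anything until a had drained its whole queue (a1, a2, a3, b1, b2, b3). What a sensible per-turn limit would be for a real deployment is an open question that this sketch does not try to answer.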