[
https://issues.apache.org/jira/browse/OAK-4522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15374972#comment-15374972
]
Stefan Egli commented on OAK-4522:
----------------------------------
There is one more aspect to take into account here: it's fine that the
CommitRateLimiter tries not to block any listener that is doing a commit.
However, that is not enough: there is a [synchronized in
SegmentNodeStore.merge()|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-segment/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentNodeStore.java#L294]
which applies to every session.save(), in any session. Now if a listener does a
session.save() while the CommitRateLimiter is delaying another commit, the
listener would still be prevented from continuing, because the delayed commit
holds that lock. Unless doing any session.save() within a listener is
prohibited (which I doubt it is, nor can be), the only solution is to *not*
apply that synchronized to listeners either. If we don't - and we implemented a
scheme where the CommitRateLimiter would *indefinitely block* until the queue
comes back below a low-water mark, for example - we would have a *hard to find
deadlock*.
At that point, applying the CommitRateLimiter as part of a CommitHook looks
like a problematic design choice.
(Based on tests it looks like this type of 'hard to find deadlock' situation
only applies to SegmentNodeStore, but still.)
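The lock interaction described above can be sketched roughly as follows. This
is a simplified illustration, not Oak code: MergeLockSketch, queueLength and
waitForQueueToDrain are hypothetical stand-ins, and the limiter uses a timeout
instead of blocking indefinitely so that the demo terminates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a node store whose merge() is synchronized. */
class MergeLockSketch {
    // Pending observation events; the listener removes one after its save.
    private final AtomicInteger queueLength = new AtomicInteger(1);
    // Released once the writer is inside merge() and holds the monitor.
    private final CountDownLatch writerInsideMerge = new CountDownLatch(1);

    /** Assumed shape of the store's merge: one commit at a time, hook runs under the lock. */
    synchronized void merge(Runnable commitHook) {
        commitHook.run();
    }

    /** Limiter-style hook: waits until the queue is empty or the timeout expires. */
    private boolean waitForQueueToDrain(long timeoutMillis) {
        writerInsideMerge.countDown();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (queueLength.get() > 0) {
            if (System.nanoTime() >= deadline) {
                return false; // gave up; with an indefinite wait this would hang
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    /** Returns true if the queue drained while the limiter was waiting inside merge(). */
    static boolean queueDrainsWhileLimiterBlocks(long timeoutMillis) {
        MergeLockSketch store = new MergeLockSketch();
        boolean[] drained = new boolean[1];

        // Ordinary commit: the limiter hook waits while holding the merge lock.
        Thread writer = new Thread(
                () -> store.merge(() -> drained[0] = store.waitForQueueToDrain(timeoutMillis)));

        // Listener: handling the event involves its own save, which needs the same lock.
        Thread listener = new Thread(() -> {
            try {
                store.writerInsideMerge.await();
            } catch (InterruptedException e) {
                return;
            }
            store.merge(() -> { });              // blocks until the writer leaves merge()
            store.queueLength.decrementAndGet(); // only now does the queue shrink
        });

        writer.start();
        listener.start();
        try {
            writer.join();
            listener.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return drained[0];
    }

    public static void main(String[] args) {
        System.out.println("queue drained while limiter blocked: "
                + queueDrainsWhileLimiterBlocks(300));
    }
}
```

In this sketch the call returns false: the queue can only shrink after the
listener's own merge(), and that merge() cannot start until the waiting commit
releases the lock. Replacing the timeout with an unbounded wait would turn this
into the deadlock described above.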
/cc [~chetanm], [~catholicon], [~mduerig], [~mreutegg]
> Improve CommitRateLimiter to optionally block some commits
> ----------------------------------------------------------
>
> Key: OAK-4522
> URL: https://issues.apache.org/jira/browse/OAK-4522
> Project: Jackrabbit Oak
> Issue Type: New Feature
> Reporter: Thomas Mueller
> Assignee: Thomas Mueller
>
> The CommitRateLimiter of OAK-1659 can delay commits, but doesn't currently
> block them, and delays even those commits that are part of handling events.
> Because of that, the queue can still get full, and possibly delaying commits
> while handling events can make the situation even worse.
> In Jackrabbit 2.x, we had a similar feature: JCR-2402. Also related is
> JCR-2746.