improvements for the clustered oak setup

Tomek Rekawek Wed, 08 Jun 2016 05:52:56 -0700

Hello,

During an Adobe-internal Oak coordination call I presented two improvements for
the clustered Oak setup that I’m working on: OAK-3865 and OAK-4412. The
presentation is available at [1]; a summary of the discussion follows.

OAK-3865 optimises the secondary-read strategy. It tracks all the revisions
affected or read by the Oak instance and also fetches the _lastRevs from all
the secondary Mongo instances to decide whether it’s safe to use the
“preferSecondary” or “nearest” read preference. Chetan suggested using the
node document cache to get even better results - he added his comment to JIRA
[2] as well. Julian thinks we should use the find(…, maxAge) method more often,
maybe even exposing the maxAge in JCR. The first step, however, is to finish
OAK-3865.
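
To make the safety check above concrete, here is a minimal sketch. It is not
the code from the OAK-3865 patch: the class and method names are hypothetical,
and revisions are simplified to plain timestamps. The assumption is that a
secondary is safe to read from only if its _lastRev is at least as new as the
newest revision this instance has already seen.

```java
import java.util.Collection;

/** Hypothetical sketch, not Oak API. Revisions are modelled as timestamps. */
final class SecondaryReadCheck {

    enum ReadPreference { PRIMARY, SECONDARY_PREFERRED }

    // Newest revision timestamp this instance has read or written so far.
    private long newestLocalRevision = Long.MIN_VALUE;

    /** Records a revision that this instance has read or written. */
    synchronized void track(long revisionTimestamp) {
        newestLocalRevision = Math.max(newestLocalRevision, revisionTimestamp);
    }

    /**
     * Chooses a read preference.
     *
     * @param secondaryLastRevs the _lastRev timestamp reported by each secondary
     */
    synchronized ReadPreference choose(Collection<Long> secondaryLastRevs) {
        if (secondaryLastRevs.isEmpty()) {
            // No information about the secondaries: stay on the primary.
            return ReadPreference.PRIMARY;
        }
        for (long lastRev : secondaryLastRevs) {
            if (lastRev < newestLocalRevision) {
                // At least one secondary lags behind what we have already seen.
                return ReadPreference.PRIMARY;
            }
        }
        return ReadPreference.SECONDARY_PREFERRED;
    }
}
```

Requiring every secondary to have caught up is the conservative choice here,
because the driver, not the caller, picks which secondary serves a read.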

OAK-4412 introduces a new “hybrid” Lucene index that consists of an
asynchronous, shared part and a volatile, local part that is updated
synchronously. This way we can have an index that is updated immediately (or
almost immediately), like the property index, but without requiring expensive
repository writes.

The main open question is how we should update the volatile part of the index:
with a commit hook or with an observer. A commit hook makes the changes
visible immediately, but it may also slow down all commits. An observer, on
the other hand, introduces a small delay between the commit and the
modifications being indexed, but it won’t burden the commit process. Stefan’s
idea is to use an observer and modify the query logic so that it checks
whether there are pending changes. If so, the query can wait until the recent
changes are indexed [3].
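
The waiting logic could look roughly like the sketch below. It is only an
illustration of the idea, not a proposed implementation: the class, the
sequence numbers and the method names are all hypothetical and do not exist
in Oak. The commit path records a sequence number, the observer reports what
it has indexed, and a query blocks (with a timeout) until the local index has
caught up.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch, not Oak API. */
final class PendingIndexChanges {

    private long lastCommitted; // sequence number of the newest commit
    private long lastIndexed;   // newest commit already in the local index

    /** Called on the commit path; returns the sequence number of the commit. */
    synchronized long committed() {
        return ++lastCommitted;
    }

    /** Called by the observer once it has indexed the given commit. */
    synchronized void indexed(long sequence) {
        lastIndexed = Math.max(lastIndexed, sequence);
        notifyAll(); // wake up queries waiting in awaitIndexed()
    }

    /**
     * Called at query time. Blocks until everything committed before the call
     * is indexed, or until the timeout expires.
     *
     * @return true if the index caught up, false on timeout or interruption
     */
    synchronized boolean awaitIndexed(long timeout, TimeUnit unit) {
        long target = lastCommitted;
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        while (lastIndexed < target) {
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                return false;
            }
            try {
                TimeUnit.NANOSECONDS.timedWait(this, remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

A query would call awaitIndexed(...) before consulting the local part of the
index; what to do on a timeout (fail, or run against a slightly stale index)
is left open here.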

With regard to future work, I’m eager to commit the first patch (OAK-3865).
If the Oak community agrees on the approach and the implementation (the patch
is attached to the JIRA issue [4]), I’ll merge it next week, on Wednesday.
Chetan’s idea is very good, but I’d like to move it to a separate issue, as
the patch is already quite big.

For OAK-4412, I’ll try to implement Stefan’s idea of waiting for the index
update at query time and keep you posted.

Best regards,
Tomek

[1]
https://issues.apache.org/jira/secure/attachment/12808917/clustered-oak-setup-improvements.pdf
[2]
https://issues.apache.org/jira/browse/OAK-3865?focusedCommentId=15320292#comment-15320292
[3]
https://issues.apache.org/jira/browse/OAK-4412?focusedCommentId=15320275#comment-15320275
[4] https://issues.apache.org/jira/browse/OAK-3865

--
Tomek Rękawek | Adobe Research | www.adobe.com
