Re: Lucene index speed

Chetan Mehrotra Mon, 07 Dec 2015 07:56:56 -0800

On Mon, Dec 7, 2015 at 9:06 PM, Jim.Tully <[email protected]> wrote:
> When running locally with similar data, the indexing is nearly
> instantaneous.


Okay, that's what I was expecting. The problem here is that the
AsyncIndexer job is meant to run as a singleton in a cluster. This is
done at [1]. It is an undocumented dependency on the Sling way of
scheduling things (SLING-2979), which allows one to schedule jobs as
singletons in a cluster.

The default scheduler used by Oak (outside of Sling) does not honor
this contract, so the job is executed concurrently on each cluster
node, which causes conflicts, retries, etc. So in a way Oak is
outsourcing job execution in a cluster to the embedding application.
It would be good to document this aspect (if you can open an issue,
that would be helpful).

Given the recent work on DocumentDiscoveryLiteService, it might be
possible for Oak to manage such things on its own (@Stefan, thoughts?).
But as of now this is not possible, so the only way out currently is
to provide your own Whiteboard implementation that can handle this
kind of singleton scheduled job. Doing this is certainly non-trivial!
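
To illustrate the idea only, here is a minimal sketch of the check such a
custom Whiteboard could apply before running a registered job. It does
not use any Oak or Sling API: SingletonJobGuard and the isElectedNode
callback are hypothetical names, and the "scheduler.runOn" / "SINGLE"
property pair is assumed to be what the code at [1] sets for singleton
jobs. How a node learns that it is the elected one (the hard part) is
left to the embedding application.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

/** Hypothetical helper, not part of Oak or Sling. */
final class SingletonJobGuard {
    // Assumed property key/value used to mark a job as a cluster singleton.
    static final String RUN_ON = "scheduler.runOn";
    static final String RUN_ON_SINGLE = "SINGLE";

    private SingletonJobGuard() {}

    /**
     * Wraps a job so that, if it is marked as a singleton, it only runs
     * while isElectedNode reports true for this cluster node.
     */
    static Runnable wrap(Runnable job, Map<String, ?> props,
                         BooleanSupplier isElectedNode) {
        boolean singleton = RUN_ON_SINGLE.equals(props.get(RUN_ON));
        if (!singleton) {
            return job; // ordinary job: runs on every node
        }
        return () -> {
            // Re-check on every run, since the elected node may change.
            if (isElectedNode.getAsBoolean()) {
                job.run();
            }
        };
    }

    public static void main(String[] args) {
        AtomicInteger runs = new AtomicInteger();
        Map<String, Object> props = Map.of(RUN_ON, RUN_ON_SINGLE);
        wrap(runs::incrementAndGet, props, () -> false).run(); // skipped
        wrap(runs::incrementAndGet, props, () -> true).run();  // executed
        System.out.println("runs = " + runs.get());
    }
}
```

A real implementation would hand the wrapped Runnable to its own
scheduler (for example a ScheduledExecutorService) and back
isElectedNode with whatever cluster coordination is available.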

Chetan Mehrotra
[1] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/spi/whiteboard/WhiteboardUtils.java#L59
