Lucene indexing in Oak is done by a single thread in the cluster. The AsyncIndex job, which performs the Lucene indexing, is scheduled to run as a singleton in the cluster [1]. For this, Oak depends on the embedding application. For example, when running in Apache Sling, the job is registered with the following properties:
* scheduler.period=5
* scheduler.runOn=SINGLE

The Sling scheduler component then ensures that this job (i.e. indexing) runs on only one node in the cluster. Oak also ensures internally that only one indexing run is active at a time, via state kept in the hidden :async node.

So, by design, we enforce that no concurrent indexing happens, hence the use of NoLockFactory.

Chetan Mehrotra

On Tue, Aug 4, 2015 at 5:40 PM, Shinichiro Abe <shinichiro.ab...@gmail.com> wrote:
> Hello,
>
> If I understand correctly, with OakDirectory a Lucene index is stored as a
> Blob in the storage.
> In an Oak cluster, each Oak instance has its own LuceneIndexEditor, in which
> an IndexWriter is working.
> In that case, when multiple IndexWriters update documents in one
> OakDirectory at the same time,
> I think it will lead to index corruption, even though OakDirectory controls
> locking with NoLockFactory.
> How do you avoid this? Does Oak have an OakDirectory per cluster id? Does the
> Oak implementation block other writers during an update?
> I'd like to understand the oak-lucene design. Which code should I look at?
>
> Background: I'd like to get a hint from the Oak team for CONNECTORS-1219,
> which I'm working on.
> I have a problem in a cluster (multiprocess) setup. Currently I hit index
> corruption when multiple IndexWriters write to an index.
>
> Thanks in advance.
> Shinichiro Abe
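
As a footnote to the reply above, the single-writer idea can be sketched in plain Java. This is only an illustration under assumptions, not Oak's actual code: the class `SingleIndexerSketch`, the method `runIndexCycle`, and the in-memory `leaseHolder` are hypothetical stand-ins for the state that Oak keeps under the hidden :async node. The point it shows is that once a run can start only by atomically claiming shared state, a second writer never overlaps with the first, so the index writer itself needs no lock.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch of a single-indexer guard; NOT Oak's implementation. */
public class SingleIndexerSketch {
    // Assumption: stands in for the state Oak keeps under the hidden :async node.
    // null means no indexing run is in progress.
    private final AtomicReference<String> leaseHolder = new AtomicReference<>(null);

    /** Runs the work only if no other run is active; returns whether it ran. */
    public boolean runIndexCycle(String clusterNodeId, Runnable indexingWork) {
        // Only one caller can move the state from "free" (null) to "held".
        if (!leaseHolder.compareAndSet(null, clusterNodeId)) {
            return false; // another run is active, so skip this cycle
        }
        try {
            indexingWork.run(); // sole writer here, so no write lock is needed
            return true;
        } finally {
            leaseHolder.set(null); // release so the next scheduled cycle can run
        }
    }

    public static void main(String[] args) {
        SingleIndexerSketch sketch = new SingleIndexerSketch();
        boolean[] nested = new boolean[1];
        // node-2 attempts to index while node-1's run is still in progress.
        boolean outer = sketch.runIndexCycle("node-1", () -> {
            nested[0] = sketch.runIndexCycle("node-2", () -> { });
        });
        System.out.println("node-1 ran: " + outer);
        System.out.println("node-2 ran concurrently: " + nested[0]);
    }
}
```

In a real cluster the claimed state has to live in the shared repository rather than in one JVM's memory, and the scheduler setting (scheduler.runOn=SINGLE) already keeps the job on one node, so a check of this kind serves as a second line of defence.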