Re: Reindex and external indexes - Possibility of stale index data

Thomas Mueller Tue, 21 Oct 2014 01:26:33 -0700

Hi,

The node doesn't need to be moved, even after multiple reindex operations.
Please note that index creation is no different from a reindex: in both
cases, a new index data node is created. So, if an index definition is
created:


    /oak:index/lucene

Then the index is being built:

    /oak:index/lucene/:data_12345

The index is done building (a):

    /oak:index/lucene/:data_12345/@ready=true

Reindexing is started (b):

    /oak:index/lucene/@reindex=true
    /oak:index/lucene/:data_12345/@ready=true


While reindex is in progress:

    /oak:index/lucene/@reindex=true
    /oak:index/lucene/:data_12345/@ready=true
    /oak:index/lucene/:data_14444


When reindex is done (matches a):

    /oak:index/lucene/:data_14444/@ready=true

Reindexing again just restarts from (b).
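
The states above can be sketched as a small state model. This is only an
illustration, not Oak code: the dict layout and the function names
(new_index, start_build, request_reindex, finish_build, active_data_node)
are hypothetical, and it assumes that finishing a build removes the old
data node and clears the reindex flag, as implied by "matches a".

```python
# Hypothetical in-memory model of /oak:index/lucene; this is not the Oak API.

def new_index():
    """State right after the index definition node is created."""
    # "data" maps a data node name to its ready flag.
    return {"reindex": False, "data": {}}

def start_build(index, suffix):
    """Create a fresh, not-yet-ready data node such as ':data_12345'."""
    name = f":data_{suffix}"
    index["data"][name] = False
    return name

def request_reindex(index):
    """Step (b): flag the definition; the old data node stays untouched."""
    index["reindex"] = True

def finish_build(index, name):
    """Mark the new node ready, drop older data nodes, clear the flag.

    Dropping the old node here is an assumption based on the final state
    shown in the walkthrough.
    """
    index["data"] = {name: True}
    index["reindex"] = False

def active_data_node(index):
    """Return the data node queries should use, or None if none is ready."""
    ready = [n for n, ok in index["data"].items() if ok]
    return ready[0] if ready else None

if __name__ == "__main__":
    idx = new_index()
    first = start_build(idx, 12345)
    finish_build(idx, first)             # state (a)
    request_reindex(idx)                 # state (b)
    second = start_build(idx, 14444)     # reindex in progress
    print(active_data_node(idx))         # old node is still the ready one
    finish_build(idx, second)            # matches (a) again
    print(active_data_node(idx))
```

In this model the old data node is never moved or renamed: readers keep
using it until the new node is marked ready.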

Regards,
Thomas

On 21/10/14 10:00, "Chetan Mehrotra" <chetan.mehro...@gmail.com> wrote:

>On Tue, Oct 21, 2014 at 1:18 PM, Thomas Mueller <muel...@adobe.com> wrote:
>> What we need is a distinction between the old and the new index *data*.
>
>Yes and that can be done by storing the index creation time.
>
>In the approach you suggested, two different nodes are used and later
>renamed, which lets the logic determine that it's a reindex. Renaming
>the node would be fine when the actual data is stored on the
>filesystem, but if the node contains the actual data then such a move
>might be costly. E.g. in the copy-on-read case the index data would be
>stored in the NodeStore and also on the file system. Further, this is
>something which each such index implementation would need to follow.
>
>Instead, if we just record the creation time in the index definition
>node and then allow index impls to make use of that info to
>distinguish between a reindex and an incremental index, then that
>would serve the same purpose.
>
>
>Chetan Mehrotra
