OAK-2203

On Tue, Oct 21, 2014 at 8:37 AM, Marcel Reutegger <[email protected]>
wrote:

> Hi,
>
> this is the output when I run it on my machine within IntelliJ:
>
> 17:13:10.035 [main] INFO  o.a.j.oak.plugins.index.IndexUpdate - Reindexing
> will be performed for following indexes: [/oak:index/lucene]
> 17:13:10.172 [main] DEBUG o.a.j.o.p.i.lucene.LuceneIndexEditor - Indexed 1
> nodes, done.
> ================
> _0.cfs - 621
> _0.cfe - 194
> segments.gen - 20
> segments_1 - 81
> _0.si - 252
> 17:13:10.187 [main] INFO  o.a.j.oak.plugins.index.IndexUpdate - Reindexing
> will be performed for following indexes: [/oak:index/lucene]
> 17:13:10.200 [main] DEBUG o.a.j.o.p.i.lucene.LuceneIndexEditor - Indexed 2
> nodes, done.
> ================
> _0.cfs - 789
> _0.cfe - 194
> segments.gen - 20
> segments_1 - 81
> _0.si - 252
> 17:13:10.204 [main] INFO  o.a.j.oak.plugins.index.IndexUpdate - Reindexing
> will be performed for following indexes: [/oak:index/lucene]
> 17:13:10.220 [main] DEBUG o.a.j.o.p.i.lucene.LuceneIndexEditor - Indexed 3
> nodes, done.
> ================
> _0.cfs - 952
> _0.cfe - 194
> segments.gen - 20
> segments_1 - 81
> _0.si - 252
> 17:13:10.223 [main] INFO  o.a.j.oak.plugins.index.IndexUpdate - Reindexing
> will be performed for following indexes: [/oak:index/lucene]
> 17:13:10.238 [main] DEBUG o.a.j.o.p.i.lucene.LuceneIndexEditor - Indexed 2
> nodes, done.
> ================
> _0.cfs - 789
> _0.cfe - 194
> segments.gen - 20
> segments_1 - 81
> _0.si - 252
> 17:13:10.241 [main] INFO  o.a.j.oak.plugins.index.IndexUpdate - Reindexing
> will be performed for following indexes: [/oak:index/lucene]
> 17:13:10.256 [main] DEBUG o.a.j.o.p.i.lucene.LuceneIndexEditor - Indexed 3
> nodes, done.
> ================
> _0.cfs - 955
> _0.cfe - 194
> segments.gen - 20
> segments_1 - 81
> _0.si - 252
>
> The index indeed gets rebuilt. In IndexUpdate.collectIndexEditors() the
> provider does not return any editors and the following code is executed:
>
> Editor editor = provider.getIndexEditor(type, definition, root,
>         updateCallback);
> if (editor == null) {
>     // trigger reindexing when an indexer becomes available
>     definition.setProperty(REINDEX_PROPERTY_NAME, true);
> } else ...
>
>
> We need to detect a re-index and clear the Lucene replica on the local
> disk. As we can see, Lucene will start with generation zero again and
> increment it with every modification. This will eventually lead to a
> collision with the replica on the local disk. In this extreme case, it
> even happens with every modification ;)
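
A minimal sketch of the collision check described above, in plain Java with no Oak or Lucene types. It assumes the caller can obtain a name-to-length listing for both the index stored in Oak and the local replica; the names `ReplicaCollisionCheck` and `collidingFiles` are hypothetical. Comparing lengths only catches some collisions, since a rewritten file of identical length would slip through, so a real fix would likely need a stronger signal that changes on every re-index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class ReplicaCollisionCheck {

    /**
     * Returns the names present on both sides whose lengths differ.
     * A non-empty result means a supposedly write-once file was rewritten,
     * so the local copy can no longer be trusted and should be cleared.
     */
    static List<String> collidingFiles(Map<String, Long> remote, Map<String, Long> local) {
        List<String> collisions = new ArrayList<>();
        for (Map.Entry<String, Long> entry : remote.entrySet()) {
            Long localLength = local.get(entry.getKey());
            // Files that exist on only one side are not collisions.
            if (localLength != null && !localLength.equals(entry.getValue())) {
                collisions.add(entry.getKey());
            }
        }
        return collisions;
    }
}
```

With the listings from the first two runs above, only `_0.cfs` (621 vs. 789) would be reported; the other files have the same length on both sides.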
>
> Regards
>  Marcel
>
> On 20/10/14 14:24, "Chetan Mehrotra" <[email protected]> wrote:
>
> >Hi Marcel,
> >
> >> in my experience .cfs files are written once
> >> and never modified
> >
> >I have checked in a test case with [1]. If you run it, you will see the
> >following output, which indicates that the same file is getting updated.
> >
> >----
> >================
> >_0.cfs - 621
> >_0.cfe - 194
> >segments.gen - 20
> >segments_1 - 81
> >_0.si - 266
> >================
> >_0.cfs - 789
> >_0.cfe - 194
> >segments.gen - 20
> >segments_1 - 81
> >_0.si - 266
> >================
> >_0.cfs - 952
> >_0.cfe - 194
> >segments.gen - 20
> >segments_1 - 81
> >_0.si - 266
> >================
> >_0.cfs - 789
> >_0.cfe - 194
> >segments.gen - 20
> >segments_1 - 81
> >_0.si - 266
> >================
> >_0.cfs - 955
> >_0.cfe - 194
> >segments.gen - 20
> >segments_1 - 81
> >_0.si - 266
> >---------
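
A listing in the "name - size" form shown above can be produced for a directory on the local disk with a few lines of plain Java. This is only a sketch and not the test case from [1]: it reads the local filesystem rather than the index stored in Oak, and the names `DirectoryListing`, `snapshot` and `format` are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

final class DirectoryListing {

    /** Maps each regular file directly under dir to its length in bytes, sorted by name. */
    static Map<String, Long> snapshot(Path dir) {
        Map<String, Long> sizes = new TreeMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                if (Files.isRegularFile(file)) {
                    sizes.put(file.getFileName().toString(), Files.size(file));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sizes;
    }

    /** Formats a snapshot as "name - size" lines, one per file. */
    static String format(Map<String, Long> sizes) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, Long> entry : sizes.entrySet()) {
            out.append(entry.getKey()).append(" - ").append(entry.getValue()).append('\n');
        }
        return out.toString();
    }
}
```

Taking a snapshot before and after each commit and comparing the two maps shows whether a file with an existing name changed size.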
> >
> >Chetan Mehrotra
> >[1] http://svn.apache.org/r1633123
> >
> >
> >On Mon, Oct 20, 2014 at 5:34 PM, Thomas Mueller <[email protected]>
> >wrote:
> >> Hi,
> >>
> >> This blog post is interesting: they are using a physical switch (similar
> >> to a Christmas light timer) to test that a Lucene index doesn't get
> >> corrupted on power failure. It would be nice if we could do something
> >> similar with the Segment storage at some point.
> >>
> >> Regards,
> >> Thomas
> >>
> >>
> >>
> >> On 20/10/14 13:36, "Marcel Reutegger" <[email protected]> wrote:
> >>
> >>>Hi,
> >>>
> >>>this is very strange. in my experience .cfs files are written once
> >>>and never modified. this write-once pattern is actually used for
> >>>almost all files, except the segments.gen file you mentioned. E.g.
> >>>see [0] by Mike McCandless when he talks about LUCENE-5574.
> >>>
> >>>is it possible the entire lucene index is replaced by oak?
> >>>
> >>>regards
> >>> marcel
> >>>
> >>>[0]
> >>>http://blog.mikemccandless.com/2014/04/testing-lucenes-index-durability-after.html
> >>>
> >>>On 20/10/14 11:59, "Chetan Mehrotra" <[email protected]> wrote:
> >>>
> >>>>While working on copy-on-read directory support (OAK-1724), I was
> >>>>checking how Lucene manages the index files. The following
> >>>>observations can be made from various test runs:
> >>>>
> >>>>A - Small index uses the compound file format
> >>>>------------------
> >>>>
> >>>>If the index contains few entries, it seems to use the compound file
> >>>>format, as the directory listing shows only the following files
> >>>>(filename - size):
> >>>>
> >>>>_0.cfs - 621
> >>>>_0.cfe - 194
> >>>>segments.gen - 20
> >>>>segments_1 - 81
> >>>>_0.si - 266
> >>>>
> >>>>If the index gets updated, the _0.cfs file size changes and the
> >>>>others remain the same.
> >>>>
> >>>>B - Large index stores index files separately
> >>>>--------------------
> >>>>
> >>>>For a large index (not sure of the threshold), Lucene seems to store
> >>>>the various index files separately. There, the files probably do not
> >>>>get modified and only new files get created.
> >>>>
> >>>>Questions
> >>>>-------------
> >>>>1. Is the switch from the cfs format to storing separate files
> >>>>automatic and done by Lucene after the index reaches a certain size?
> >>>>Or is this something done specifically in Oak?
> >>>>2. Lucene would not modify an existing file in a directory, with the
> >>>>following caveats:
> >>>>  a. In compound storage the cfs file would get modified. Would the
> >>>>modification be append-only there?
> >>>>  b. segments.gen - This would get modified every time
> >>>>  c. If separate files are used then no file would ever be modified
> >>>>and only new files would be created
> >>>>
> >>>>Chetan Mehrotra
> >>>>PS: The question is probably more appropriate for the Lucene DL, but
> >>>>I am checking here first to see if something in Oak differs from the
> >>>>default.
> >>>
> >>
>
>
