OAK-2203

On Tue, Oct 21, 2014 at 8:37 AM, Marcel Reutegger <[email protected]> wrote:

> Hi,
>
> this is the output when I run it on my machine within IntelliJ:
>
> 17:13:10.035 [main] INFO o.a.j.oak.plugins.index.IndexUpdate - Reindexing will be performed for following indexes: [/oak:index/lucene]
> 17:13:10.172 [main] DEBUG o.a.j.o.p.i.lucene.LuceneIndexEditor - Indexed 1 nodes, done.
> ================
> _0.cfs - 621
> _0.cfe - 194
> segments.gen - 20
> segments_1 - 81
> _0.si - 252
> 17:13:10.187 [main] INFO o.a.j.oak.plugins.index.IndexUpdate - Reindexing will be performed for following indexes: [/oak:index/lucene]
> 17:13:10.200 [main] DEBUG o.a.j.o.p.i.lucene.LuceneIndexEditor - Indexed 2 nodes, done.
> ================
> _0.cfs - 789
> _0.cfe - 194
> segments.gen - 20
> segments_1 - 81
> _0.si - 252
> 17:13:10.204 [main] INFO o.a.j.oak.plugins.index.IndexUpdate - Reindexing will be performed for following indexes: [/oak:index/lucene]
> 17:13:10.220 [main] DEBUG o.a.j.o.p.i.lucene.LuceneIndexEditor - Indexed 3 nodes, done.
> ================
> _0.cfs - 952
> _0.cfe - 194
> segments.gen - 20
> segments_1 - 81
> _0.si - 252
> 17:13:10.223 [main] INFO o.a.j.oak.plugins.index.IndexUpdate - Reindexing will be performed for following indexes: [/oak:index/lucene]
> 17:13:10.238 [main] DEBUG o.a.j.o.p.i.lucene.LuceneIndexEditor - Indexed 2 nodes, done.
> ================
> _0.cfs - 789
> _0.cfe - 194
> segments.gen - 20
> segments_1 - 81
> _0.si - 252
> 17:13:10.241 [main] INFO o.a.j.oak.plugins.index.IndexUpdate - Reindexing will be performed for following indexes: [/oak:index/lucene]
> 17:13:10.256 [main] DEBUG o.a.j.o.p.i.lucene.LuceneIndexEditor - Indexed 3 nodes, done.
> ================
> _0.cfs - 955
> _0.cfe - 194
> segments.gen - 20
> segments_1 - 81
> _0.si - 252
>
> The index indeed gets rebuilt.
> In IndexUpdate.collectIndexEditors() the provider does not return any
> editors and the following code is executed:
>
>     Editor editor = provider.getIndexEditor(type, definition, root,
>             updateCallback);
>     if (editor == null) {
>         // trigger reindexing when an indexer becomes available
>         definition.setProperty(REINDEX_PROPERTY_NAME, true);
>     } else ...
>
> We need to detect a re-index and clear the lucene replica on the local
> disk. As we can see, lucene will start with generation zero again and
> increment it with every modification. This will eventually lead to a
> collision with the replica on the local disk. In this extreme case, it
> even happens with every modification ;)
>
> Regards
>  Marcel
>
> On 20/10/14 14:24, "Chetan Mehrotra" <[email protected]> wrote:
>
> >Hi Marcel,
> >
> >> in my experience .cfs files are written once and never modified
> >
> >I have checked in a testcase with [1] and if you run that you would
> >see following output which indicate that same file is getting updated.
> >
> >----
> >================
> >_0.cfs - 621
> >_0.cfe - 194
> >segments.gen - 20
> >segments_1 - 81
> >_0.si - 266
> >================
> >_0.cfs - 789
> >_0.cfe - 194
> >segments.gen - 20
> >segments_1 - 81
> >_0.si - 266
> >================
> >_0.cfs - 952
> >_0.cfe - 194
> >segments.gen - 20
> >segments_1 - 81
> >_0.si - 266
> >================
> >_0.cfs - 789
> >_0.cfe - 194
> >segments.gen - 20
> >segments_1 - 81
> >_0.si - 266
> >================
> >_0.cfs - 955
> >_0.cfe - 194
> >segments.gen - 20
> >segments_1 - 81
> >_0.si - 266
> >---------
> >
> >Chetan Mehrotra
> >[1] http://svn.apache.org/r1633123
> >
> >
> >On Mon, Oct 20, 2014 at 5:34 PM, Thomas Mueller <[email protected]> wrote:
> >> Hi,
> >>
> >> This blog post is interesting: they are using a physical switch (similar
> >> to a christmas light timer) to test a Lucene index doesn't get corrupt on
> >> power failure.
> >> It would be nice if we can do something similar with the
> >> Segment storage at some point.
> >>
> >> Regards,
> >> Thomas
> >>
> >> On 20/10/14 13:36, "Marcel Reutegger" <[email protected]> wrote:
> >>
> >>>Hi,
> >>>
> >>>this is very strange. in my experience .cfs files are written once
> >>>and never modified. this write-once pattern is actually used for
> >>>almost all files, except the segments.gen file you mentioned. E.g.
> >>>see [0] by Mike McCandless when he talks about LUCENE-5574.
> >>>
> >>>is it possible the entire lucene index is replaced by oak?
> >>>
> >>>regards
> >>> marcel
> >>>
> >>>[0] http://blog.mikemccandless.com/2014/04/testing-lucenes-index-durability-after.html
> >>>
> >>>On 20/10/14 11:59, "Chetan Mehrotra" <[email protected]> wrote:
> >>>
> >>>>While working on copy-on-read directory support (OAK-1724), I was
> >>>>checking how Lucene manages the index files. The following
> >>>>observations can be made from various test runs.
> >>>>
> >>>>A - Small index uses the compound file format
> >>>>------------------
> >>>>
> >>>>If the index contains few entries, it seems to use the compound file
> >>>>format, as the directory listing shows only the following files
> >>>>(filename - size)
> >>>>
> >>>>_0.cfs - 621
> >>>>_0.cfe - 194
> >>>>segments.gen - 20
> >>>>segments_1 - 81
> >>>>_0.si - 266
> >>>>
> >>>>If the index gets updated, the _0.cfs file size changes and the
> >>>>others remain the same.
> >>>>
> >>>>B - Large index stores index files separately
> >>>>--------------------
> >>>>
> >>>>For a large index (not sure of the threshold) Lucene seems to store
> >>>>the various index files separately, and there the files probably do
> >>>>not get modified and only new files get created.
> >>>>
> >>>>Question
> >>>>-------------
> >>>>1. Is this switch from the cfs format to storing in separate files
> >>>>automatic and done by Lucene after the index reaches a certain size?
> >>>>Or is this something done specifically in Oak?
> >>>>2. Lucene would not modify an existing file in a directory, except:
> >>>>   a. In compound storage the cfs file would get modified. Would that
> >>>>modification also be append only?
> >>>>   b. segments.gen - This would get modified every time
> >>>>   c. If separate files are used, then no file would ever be modified
> >>>>and only new files would be created
> >>>>
> >>>>Chetan Mehrotra
> >>>>PS: Probably the question is more appropriate for the Lucene DL but
> >>>>checking here first to see if something in Oak is different from
> >>>>default
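
The listings quoted in this thread show the collision directly: after a reindex the same file name (_0.cfs) comes back with a different size. As a rough illustration only, the sketch below compares two "filename - size" listings and reports names that exist in both with different sizes. The class and method names (ReplicaCheck, changedFiles, listing) are hypothetical and are not part of Oak or Lucene; the sizes are the ones from the listings above. Comparing sizes alone would miss a rewrite that happens to produce the same size, so this is not a complete staleness check.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not an Oak or Lucene API.
class ReplicaCheck {

    // Returns the names present in both listings whose sizes differ.
    // A non-empty result means a file was rewritten under the same name,
    // so a local copy made from the first listing can no longer be trusted.
    static List<String> changedFiles(Map<String, Long> local, Map<String, Long> remote) {
        List<String> changed = new ArrayList<>();
        for (Map.Entry<String, Long> entry : remote.entrySet()) {
            Long localSize = local.get(entry.getKey());
            if (localSize != null && !localSize.equals(entry.getValue())) {
                changed.add(entry.getKey());
            }
        }
        return changed;
    }

    // Builds a listing like the ones in the thread; only the _0.cfs size varies.
    static Map<String, Long> listing(long cfsSize) {
        Map<String, Long> files = new LinkedHashMap<>();
        files.put("_0.cfs", cfsSize);
        files.put("_0.cfe", 194L);
        files.put("segments.gen", 20L);
        files.put("segments_1", 81L);
        files.put("_0.si", 252L);
        return files;
    }

    public static void main(String[] args) {
        Map<String, Long> afterFirstRun = listing(621);
        Map<String, Long> afterSecondRun = listing(789);
        // prints [_0.cfs]
        System.out.println(changedFiles(afterFirstRun, afterSecondRun));
    }
}
```

A check of this kind only flags the symptom; the approach proposed in the thread is to detect the reindex itself and clear the local replica.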
