Hi Marcel,

> in my experience .cfs files are written once
> and never modified

I have checked in a test case with [1]. If you run it, you will see the
following output, which indicates that the same file is getting updated.

----
================
_0.cfs - 621
_0.cfe - 194
segments.gen - 20
segments_1 - 81
_0.si - 266
================
_0.cfs - 789
_0.cfe - 194
segments.gen - 20
segments_1 - 81
_0.si - 266
================
_0.cfs - 952
_0.cfe - 194
segments.gen - 20
segments_1 - 81
_0.si - 266
================
_0.cfs - 789
_0.cfe - 194
segments.gen - 20
segments_1 - 81
_0.si - 266
================
_0.cfs - 955
_0.cfe - 194
segments.gen - 20
segments_1 - 81
_0.si - 266
---------
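
For reference, a listing in the "filename - size" form above could be
produced with a small helper along the following lines. This is only a
minimal sketch and not the test case checked in with [1]: it assumes the
index lives in a plain filesystem directory, and the class and method
names (IndexDirSnapshot, snapshot, resized) are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of Oak or Lucene.
public class IndexDirSnapshot {

    /** Maps each regular file in dir to its current size in bytes. */
    public static Map<String, Long> snapshot(Path dir) throws IOException {
        Map<String, Long> sizes = new TreeMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path f : files) {
                if (Files.isRegularFile(f)) {
                    sizes.put(f.getFileName().toString(), Files.size(f));
                }
            }
        }
        return sizes;
    }

    /** Names present in both snapshots whose size differs. */
    public static Set<String> resized(Map<String, Long> before,
                                      Map<String, Long> after) {
        Set<String> names = new TreeSet<>();
        for (Map.Entry<String, Long> e : after.entrySet()) {
            Long old = before.get(e.getKey());
            if (old != null && !old.equals(e.getValue())) {
                names.add(e.getKey());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Directory to inspect; defaults to the current directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println("================");
        for (Map.Entry<String, Long> e : snapshot(dir).entrySet()) {
            System.out.println(e.getKey() + " - " + e.getValue());
        }
    }
}
```

Taking a snapshot after each commit and passing consecutive snapshots to
resized() would flag _0.cfs in the runs above. Note that a size change
under the same name does not by itself distinguish an in-place update
from the file being deleted and written again, and a rewrite that keeps
the size unchanged would go unnoticed.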

Chetan Mehrotra
[1] http://svn.apache.org/r1633123


On Mon, Oct 20, 2014 at 5:34 PM, Thomas Mueller <[email protected]> wrote:
> Hi,
>
> This blog post is interesting: they are using a physical switch (similar
> to a Christmas light timer) to test that a Lucene index doesn't get
> corrupted on power failure. It would be nice if we could do something
> similar with the Segment storage at some point.
>
> Regards,
> Thomas
>
>
>
> On 20/10/14 13:36, "Marcel Reutegger" <[email protected]> wrote:
>
>>Hi,
>>
>>this is very strange. in my experience .cfs files are written once
>>and never modified. this write-once pattern is actually used for
>>almost all files, except the segments.gen file you mentioned. E.g.
>>see [0] by Mike McCandless when he talks about LUCENE-5574.
>>
>>is it possible the entire lucene index is replaced by oak?
>>
>>regards
>> marcel
>>
>>[0]
>>http://blog.mikemccandless.com/2014/04/testing-lucenes-index-durability-after.html
>>
>>On 20/10/14 11:59, "Chetan Mehrotra" <[email protected]> wrote:
>>
>>>While working on copy-on-read directory support (OAK-1724), I was
>>>checking how Lucene manages the index files. The following observations
>>>can be made from various test runs:
>>>
>>>A - Small index uses compound file format
>>>------------------
>>>
>>>If the index contains few entries, it seems to use the compound file
>>>format, as the directory listing shows only the following files
>>>(filename - size)
>>>
>>>_0.cfs - 621
>>>_0.cfe - 194
>>>segments.gen - 20
>>>segments_1 - 81
>>>_0.si - 266
>>>
>>>If the index gets updated, the _0.cfs file size changes and the others
>>>remain the same
>>>
>>>B - Large index stores index files separately
>>>--------------------
>>>
>>>For a large index (not sure of the threshold), Lucene seems to store
>>>the various index files separately; there, files probably do not get
>>>modified and only new files get created
>>>
>>>Question
>>>-------------
>>>1. Is this switch from the cfs format to storing in separate files
>>>automatic, done by Lucene after the index reaches a certain size? Or is
>>>this something done specifically in Oak?
>>>2. Lucene would not modify an existing file in a directory, except:
>>>  a. With compound storage, the cfs file would get modified. Would that
>>>modification be append only?
>>>  b. segments.gen - This would get modified every time
>>>  c. If separate files are used, then no file would ever be modified
>>>and only new files would be created
>>>
>>>Chetan Mehrotra
>>>PS: Probably the question is more appropriate for Lucene DL but
>>>checking here first to see if something in Oak is different from
>>>default
>>
>
