I don't know. This is the situation when I create and optimize the index
with Lucene.Net:

segments    28 bytes
_i5.cfs     543 kB
deletable   12 bytes
_bd.cfs     317 kB

Once the index is opened with Luke, only segments and _i5.cfs remain,
both unchanged. So the only difference is that _bd.cfs and deletable are
removed. Well, deletable looked like a good candidate for removal, but
what about _bd.cfs? It looks like it wasn't needed after all.
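
To pin down exactly which files change, the directory could be listed
before and after opening the index with Luke. This is only a rough sketch
in Python using the standard library; snapshot and diff_snapshots are
names I made up for illustration, they are not part of Lucene.Net or Luke,
and the script only looks at file names and sizes, not at index contents.

```python
import os


def snapshot(index_dir):
    """Map each regular file in index_dir to its size in bytes."""
    return {
        name: os.path.getsize(os.path.join(index_dir, name))
        for name in sorted(os.listdir(index_dir))
        if os.path.isfile(os.path.join(index_dir, name))
    }


def diff_snapshots(before, after):
    """Return (removed, added, changed) file names between two snapshots."""
    removed = sorted(set(before) - set(after))
    added = sorted(set(after) - set(before))
    # A file counts as changed if it exists in both snapshots with a different size.
    changed = sorted(
        name for name in set(before) & set(after) if before[name] != after[name]
    )
    return removed, added, changed
```

Taking one snapshot right after IndexWriter.Optimize() and another after
closing Luke, then passing both to diff_snapshots, would show whether
anything besides _bd.cfs and deletable is affected.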

Simone

Jokin Cuadrado wrote:
> I'm just wondering, but could it be an issue with the text encoding
> used? If the difference is just 50%, maybe Lucene.Net is by default
> using an encoding that needs 2 bytes for each character, and Luke is
> using one that needs only 1 byte.
>
> Regarding the number of files, maybe Luke doesn't take the "deletable"
> file into account. It lists the files that are no longer used and
> may be deleted, because it doesn't delete files. But I think that's not
> relevant to the other question.
>
> jokin.
>
> On 7/17/07, Simone Busoli <[EMAIL PROTECTED]> wrote:
>>
>>  Hi Jokin,
>>
>>  actually I found some information about it. As far as I've discovered,
>> compression can be applied to the fields of documents before adding them
>> to the index, even if Lucene.Net doesn't supply it out of the box. But the
>> issue I reported doesn't have to do with this, because the index size
>> reduction seems to be applied at a higher level by Luke, I mean, to an
>> index already containing documents with uncompressed fields. In fact, when
>> reopening the index with Lucene.Net after it's been opened (and, you see,
>> optimized) by Luke, I am still able to read it, even though I didn't
>> configure support for compression. This means that Luke didn't compress
>> the contents of the documents contained in the index (it would be weird
>> behavior after all), but instead did something like optimizing the format
>> of the index files. Another detail is that when I write my index with
>> Lucene.Net I end up with at least 3 files, while when I open it with Luke
>> I always get only 2 files. And yes, I am calling IndexWriter.Optimize()
>> when finished indexing. Am I missing something maybe?
>>
>>  Simone
>
