Oops, you're right: term listings and counts for deleted docs are
adjusted during merges. I had the impression that optimize had some
special powers here that an ordinary merge lacks.

Thank you for bringing expungeDeletes to my attention.

On Sat, Nov 21, 2009 at 7:46 AM, Yonik Seeley
<yo...@lucidimagination.com> wrote:
> On Sat, Nov 21, 2009 at 12:33 AM, Lance Norskog <goks...@gmail.com> wrote:
>> And, terms whose documents have been deleted are not purged. So, you
>> can merge all you like and the index will not shrink back completely.
>
> Under what conditions?  Certainly not all, since I just tried a simple
> test and a merge removed the terms that were no longer in any
> documents just fine.
>
>> This is important because the orphan terms affect relevance
>> calculations.
>
> Marking a document as deleted doesn't affect any term statistics (which
> idf uses) until the document is actually removed (which can happen via
> a merge, optimize, or expungeDeletes).  That's a Lucene limitation
> unrelated to how many of a term's documents have been deleted.  But
> perhaps I don't understand how you're using the term "orphan terms".
>
> -Yonik
> http://www.lucidimagination.com
>
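
For anyone reading this in the archive, here is a toy sketch of the
behavior described above. It is not Lucene code: the ToyIndex class, its
method names, and the idf formula are all assumptions made up for
illustration. It only models the idea that a delete is a mark, and that
term statistics change once a merge rewrites the segments.

```python
import math


class ToyIndex:
    """Toy model of segment-based deletes; not Lucene's actual data structures."""

    def __init__(self):
        self.segments = []    # list of {doc_id: set_of_terms}
        self.deleted = set()  # doc ids marked deleted but not yet removed

    def add_segment(self, docs):
        self.segments.append({doc_id: set(terms) for doc_id, terms in docs.items()})

    def delete(self, doc_id):
        # Only mark the document; the stored postings are untouched.
        self.deleted.add(doc_id)

    def max_doc(self):
        # Counts every stored document, including ones marked deleted.
        return sum(len(seg) for seg in self.segments)

    def doc_freq(self, term):
        # Counts every stored document containing the term, deleted or not.
        return sum(1 for seg in self.segments for terms in seg.values() if term in terms)

    def idf(self, term):
        # Classic-style idf, chosen here only for illustration (an assumption).
        return 1.0 + math.log(self.max_doc() / (self.doc_freq(term) + 1))

    def merge(self):
        # Rewrite all segments into one, dropping documents marked deleted.
        merged = {}
        for seg in self.segments:
            for doc_id, terms in seg.items():
                if doc_id not in self.deleted:
                    merged[doc_id] = terms
        self.segments = [merged]
        self.deleted.clear()


index = ToyIndex()
index.add_segment({1: ["solr", "lucene"], 2: ["solr"]})
index.add_segment({3: ["orphan"], 4: ["lucene"]})

index.delete(3)
df_marked = index.doc_freq("orphan")  # still counted: the delete is only a mark
index.merge()
df_merged = index.doc_freq("orphan")  # the document and its only term are gone
```

In this model the "orphan" term keeps its document frequency until
merge() runs, which is the window in which idf would still see it.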



-- 
Lance Norskog
goks...@gmail.com
