I haven't tried it, but according to
http://lucene.apache.org/java/docs/fileformats.html, each segment is a
complete sub-index. I
_wonder_ if you couldn't manage your own merges by using
IndexWriter.addIndexes() where you load each segment in separately
(this may mean copying the segments to other directories, but I am
not sure). Another option would be to modify Lucene to expose the
merge functionality.
This is pure speculation at this point, but I know the capabilities
exist (as all optimize does is merge segments until there is one
segment) so it seems like it should be possible.
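
To make the idea concrete, here is a toy sketch in plain Java. It does not
use Lucene at all: Segment, compact() and purgeDeletes() are hypothetical
names I made up for illustration, and the 0.25 threshold is arbitrary. It
only models the behaviour being asked for, which is rewriting the segments
that carry many deleted docs while leaving the segment count unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

public class SegmentCompaction {

    /** Toy stand-in for an index segment: stored docs plus a deletion bitmap. */
    static final class Segment {
        final List<String> docs;
        final BitSet deleted;

        Segment(List<String> docs) {
            this.docs = new ArrayList<>(docs);
            this.deleted = new BitSet(docs.size());
        }

        /** Marks a doc as deleted; the slot stays allocated (a "hole"). */
        void delete(int docId) { deleted.set(docId); }

        /** Number of slots, including holes. */
        int maxDoc() { return docs.size(); }

        /** Fraction of slots that are holes. */
        double deletedRatio() {
            return maxDoc() == 0 ? 0.0 : (double) deleted.cardinality() / maxDoc();
        }
    }

    /** Rewrites a single segment, copying only the docs that are still live. */
    static Segment compact(Segment s) {
        List<String> live = new ArrayList<>();
        for (int i = 0; i < s.maxDoc(); i++) {
            if (!s.deleted.get(i)) {
                live.add(s.docs.get(i));
            }
        }
        return new Segment(live);
    }

    /**
     * Compacts only the segments whose deleted ratio exceeds the threshold.
     * Segments are never merged with each other, so the count is preserved.
     */
    static List<Segment> purgeDeletes(List<Segment> index, double threshold) {
        List<Segment> result = new ArrayList<>();
        for (Segment s : index) {
            result.add(s.deletedRatio() > threshold ? compact(s) : s);
        }
        return result;
    }

    public static void main(String[] args) {
        Segment full = new Segment(Arrays.asList("a", "b", "c", "d"));
        full.delete(1);
        full.delete(2);
        Segment clean = new Segment(Arrays.asList("e", "f", "g", "h"));

        List<Segment> purged = purgeDeletes(Arrays.asList(full, clean), 0.25);
        System.out.println("segments: " + purged.size());
        for (Segment s : purged) {
            System.out.println("maxDoc=" + s.maxDoc() + " deletedRatio=" + s.deletedRatio());
        }
    }
}
```

Whether Lucene lets you do the equivalent per segment through
addIndexes() is exactly the part I have not verified.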
-Grant
On Dec 1, 2006, at 8:11 AM, Stanislav Jordanov wrote:
Guys,
I've already asked this question but nobody answered:
Suppose we have a relatively big index which is continuously
updated - i.e. new docs get added while some of the old docs get
deleted.
For pragmatic reasons we have a restriction on maxMergeDocs so that
segment files don't get enormously big.
Consider now a segment of max size (i.e. containing maxMergeDocs
docs, hence not eligible for a merge).
It is possible that (as time passes) this segment will have more
and more of its docs deleted.
But as it is not merge-able, it will remain the same size, with
lots of "holes" in it, which is bad for performance.
The only way that I am aware of to correct this problem is to
invoke index optimization, which has several drawbacks:
1. it takes a while to optimize a big index.
2. the optimization process always produces an index consisting of a
single (extremely) large segment.
We can live with 1.
But 2 is undesirable.
Is there a way to "optimize" (in terms of purging its deleted docs)
an index or a single segment
without ending up with a single segment index?
Best,
Stanislav
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
--------------------------
Grant Ingersoll
Center for Natural Language Processing
http://www.cnlp.org
Read the Lucene Java FAQ at
http://wiki.apache.org/jakarta-lucene/LuceneFAQ