I'll take a stab at it.... When are you opening/closing your searcher? When
you open a searcher, you get a snapshot of the index at that instant, and
subsequent modifications aren't visible until you open a new searcher (at
least I think I've got this right).

And I'm sure this also interacts with the writer merge settings
"interestingly".

Personally, I'd worry about this a lot more if it happened after I'd closed
my writer and opened a new reader <G>...
Of course, my app has an index that is updated rarely (every two weeks), so
I haven't dug into too many details in this area...


Best
Erick

On 8/8/06, Marcus Falck <[EMAIL PROTECTED]> wrote:

I have noticed some strange behavior when searching my lucene index.



I'm adding 500.000 docs to an index.



MergeFactor = 10

MinMerge = 5000



When 49999 have been added ( just before the first 10 * 5000 merge ) the
hits.length() is reporting around 1000 hits for a keyword (which by the
way is around the same count as with 5000 docs added). After the 10*5000
merge the hits.length() returns around 8000 hits, which seems to be a
lot more reasonable. Since I'm adding content in date order ( oldest
first ) I have also tried to sort the hits (newest date first) and
display the top 10 hits.



According to that output it seems that the documents are added
correctly.



I'm using a multisearcher on top of a RAMDir and an FSDir. Using
Lucene1.4.3



Anybody that has any idea about why the hit count is so misleading?



/

Regards

Marcus







Reply via email to