Chandramohan wrote:

perform such a cull again, you might make several
distinct indexes (one per day, per week, per whatever) during that reindexing so the next time will be much easier.

How would you search and consolidate the results
across multiple indexes?  Hits from each index will
have independent scoring.

Frankly, I ignore the scores in my application. The data itself isn't English prose, so the TF/IDF calculations are a stretch, at best, as a measure of relevance. I presort the documents into "relevance" order (a popularity metric), then specify index ordering for the results.
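
To make the presorting idea concrete, here is a rough sketch in plain Python rather than Lucene. The function names, the "popularity" field, and the sample data are all made up for illustration; the only point is that if documents are added in popularity order, returning matches in index order gives a popularity ranking without consulting any score.

```python
# Hypothetical illustration: names and data are assumptions, not the real application.

def build_index(docs):
    """Return docs sorted by descending popularity; list position acts as the doc id."""
    return sorted(docs, key=lambda d: d["popularity"], reverse=True)

def search(index, term):
    """Return matching docs in index order, ignoring any relevance score."""
    return [d for d in index if term in d["text"]]

docs = [
    {"text": "red widget", "popularity": 3},
    {"text": "blue widget", "popularity": 9},
    {"text": "red gadget", "popularity": 5},
]
index = build_index(docs)
results = search(index, "widget")
```

The most popular match comes back first simply because it was indexed first.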

If that wouldn't work for your application, it seems to me that large-enough sub-indexes *would* produce equivalent scores, since the term statistics behind TF/IDF should come out about the same in each. That is, if the sub-indexes were big enough, one could directly compare scores, so a simple merge would work. If the total document corpus is small, then the need for sub-indexes isn't there anyhow.
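
By "simple merge" I mean something along these lines, again sketched in plain Python rather than Lucene. It assumes each sub-index returns its hits as (score, doc id) pairs already sorted by descending score, and that the scores really are comparable across sub-indexes; the index names and values are invented.

```python
import heapq

def merge_hits(*hit_lists):
    """Merge per-index hit lists, each already sorted by descending score."""
    # heapq.merge with reverse=True expects every input in descending order.
    return list(heapq.merge(*hit_lists, key=lambda hit: hit[0], reverse=True))

# Hypothetical hits from two sub-indexes, e.g. one per week.
week1 = [(0.92, "doc-a"), (0.40, "doc-b")]
week2 = [(0.75, "doc-c"), (0.10, "doc-d")]
merged = merge_hits(week1, week2)
```

If the scores are not comparable, this merge still runs but the interleaving is meaningless, which is the concern raised above.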

--MDC
