Doug Cutting wrote:
So term-number-based vectors would be small and fast to use if all you're using is a single, optimized index, but very slow to use with unoptimized indexes and multiple indexes. That seems like a bad situation, so, unless someone figures out another way, we're stuck with the current approach. Vectors are bigger and slower than optimal, but they're consistently so.
I'm very familiar with this particular issue :). One solution that worked for my application was to treat terms from different segments / indexes as always being distinct, even when they actually had the same text. Later, during results processing, once the number of terms under consideration had been greatly reduced, I could do the lookups and consolidate those terms that turned out to be identical. I'm not sure whether this is a good general solution, but it has worked reasonably well for me.
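To illustrate the idea, here is a minimal sketch in plain Java (not actual Lucene code; all names are hypothetical). Each term is identified only by a (segment, local ordinal) pair while scoring, so no cross-segment dictionary lookups are needed; once the candidate set is small, the surviving terms are resolved to their text and merged:

```java
import java.util.*;

// Hypothetical sketch: keep per-segment terms distinct during scoring,
// then consolidate by term text once the candidate set is small.
public class SegmentTermConsolidation {

    // A term identified only by (segment, local ordinal) -- cheap to
    // compare, valid without any cross-segment term numbering.
    public record SegmentTerm(int segmentId, int localTermId) {}

    // Late consolidation: resolve each surviving term to its text via the
    // per-segment dictionaries (ordinal -> text) and merge the weights of
    // terms that turn out to be identical.
    public static Map<String, Double> consolidate(
            List<String[]> segmentDictionaries,
            Map<SegmentTerm, Double> weights) {
        Map<String, Double> merged = new HashMap<>();
        for (Map.Entry<SegmentTerm, Double> e : weights.entrySet()) {
            SegmentTerm t = e.getKey();
            String text = segmentDictionaries.get(t.segmentId())[t.localTermId()];
            merged.merge(text, e.getValue(), Double::sum);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Two segments whose dictionaries overlap on "lucene".
        List<String[]> dicts = List.of(
            new String[] {"apache", "lucene"},
            new String[] {"lucene", "search"});

        Map<SegmentTerm, Double> weights = new HashMap<>();
        weights.put(new SegmentTerm(0, 1), 0.4); // "lucene" in segment 0
        weights.put(new SegmentTerm(1, 0), 0.6); // "lucene" in segment 1
        weights.put(new SegmentTerm(1, 1), 1.0); // "search" in segment 1

        // The two "lucene" entries collapse into one after consolidation.
        System.out.println(consolidate(dicts, weights));
    }
}
```

The trade-off is that duplicate terms carry a small extra cost through the middle of the pipeline, but the expensive text comparisons happen only once, on a much smaller set.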
Dmitry.