Doug Cutting wrote on 06/09/2006 08:00 AM:
> Chuck Williams wrote:
>> one simple and substantial optimization is
>> to support a token filter for term vectors, i.e. pass tokens through an
>> additional filter for addition to term vectors.
>
> Why not instead add the rotated and/or reversed tokens to a different
> field that does not store vectors?
>

I'm running into issues with the separate field approach. It would seem to require either rereading the content or storing all of the reversed/rotated tokens in a data structure so they can be generated later for the second field. Both are performance problems, and in my app rereading is not even practical. Some fields are entire large documents, and requirements prohibit any truncation. The content is streamed to the indexer through SOAP, which makes rereading even more of a problem.
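The alternative, a filter applied only to the tokens bound for the term vector, can be sketched roughly as follows. This is plain Java, not Lucene code: `TermVectorFilterSketch`, `Tok`, `Result`, and `indexOnce` are hypothetical names, and reversing each token merely stands in for whatever rotation or reversal the analyzer performs. The point it tries to show is that the stream is consumed exactly once and nothing is buffered for a second field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: none of these types are Lucene classes.
final class TermVectorFilterSketch {

    // A token plus a flag saying whether it was derived (here: reversed)
    // from an original token.
    static final class Tok {
        final String text;
        final boolean derived;

        Tok(String text, boolean derived) {
            this.text = text;
            this.derived = derived;
        }
    }

    // What a single pass over the stream produced.
    static final class Result {
        final List<String> indexed = new ArrayList<>();
        final List<String> termVector = new ArrayList<>();
    }

    // Consumes the token stream exactly once. Every token and its reversed
    // form go to the index; only tokens accepted by termVectorFilter are
    // also recorded in the term vector.
    static Result indexOnce(Iterator<String> stream, Predicate<Tok> termVectorFilter) {
        Result r = new Result();
        while (stream.hasNext()) {
            String text = stream.next();
            Tok original = new Tok(text, false);
            Tok reversed = new Tok(new StringBuilder(text).reverse().toString(), true);
            for (Tok t : Arrays.asList(original, reversed)) {
                r.indexed.add(t.text);
                if (termVectorFilter.test(t)) {
                    r.termVector.add(t.text);
                }
            }
        }
        return r;
    }

    public static void main(String[] args) {
        // Keep only original (non-derived) tokens in the term vector.
        Result r = indexOnce(Arrays.asList("lucene", "index").iterator(), t -> !t.derived);
        System.out.println("indexed:     " + r.indexed);
        System.out.println("term vector: " + r.termVector);
    }
}
```

With the filter `t -> !t.derived`, the index receives both the original and the reversed tokens, while the term vector receives only the originals, all from one pass over a source that cannot be reread.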
It seems easiest and most efficient to have an additional filter on the tokens that go into a term vector. Am I missing an easier way to set up a separate field? I understand the desire not to add facilities to Lucene when there is an existing method to achieve the same end, but it is not clear that using an additional field is a practical approach. It also seems that in general the tokens useful in a term vector are only a subset of those useful in the index -- at least this is the case for my app.

Thanks for any guidance,

Chuck