Re: scalability w/ number of fields

Doug Cutting Wed, 06 Apr 2005 09:28:27 -0700

Yonik Seeley wrote:

> They are all indexed (and they all need to be under the current design).

As I mentioned before, Lucene will not perform well with a large number of indexed fields. If these are not tokenized fields, then a simple way to reduce the number of indexed fields is to move the field name into the value. Instead of adding <fieldX, valueY> and <fieldZ, valueA>, add <generic, fieldX-valueY> and <generic, fieldZ-valueA>. This should perform quite well. You'll also need to rewrite queries accordingly, so that a query for valueY in fieldX becomes a query for fieldX-valueY in the generic field.
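A minimal sketch of that mapping, in plain Java with no Lucene dependency. The class FieldFolding, its method names and the choice of "-" as separator are hypothetical and only illustrate the idea; the one real constraint is that the separator must never occur inside a field name, or two different fields could produce the same term.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: folds many untokenized fields
// into term text for a single indexed field.
final class FieldFolding {
    // Assumed name of the single catch-all field.
    static final String GENERIC_FIELD = "generic";
    // Assumed separator; it must never occur inside a field name.
    static final char SEPARATOR = '-';

    // Turns <fieldX, valueY> into the term text "fieldX-valueY".
    static String fold(String field, String value) {
        if (field.indexOf(SEPARATOR) >= 0) {
            throw new IllegalArgumentException("field name contains separator: " + field);
        }
        return field + SEPARATOR + value;
    }

    // Folds every field/value pair of one document into generic-field terms,
    // in the iteration order of the given map.
    static List<String> foldDocument(Map<String, String> fields) {
        List<String> terms = new ArrayList<>();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            terms.add(fold(e.getKey(), e.getValue()));
        }
        return terms;
    }

    // Shows the query-side rewrite: a clause on fieldX:valueY becomes
    // a clause on generic:fieldX-valueY.
    static String rewriteClause(String field, String value) {
        return GENERIC_FIELD + ":" + fold(field, value);
    }
}
```

Each folded string would be added to the document as one untokenized value of the generic field, and each rewritten clause would become a term query against that field. The string returned by rewriteClause is only a readable illustration of the mapping, not something to hand to a query parser unchanged.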

A similar method can work for tokenized fields. Simply write a TokenFilter that prepends the field name to each token. Query terms must be given the same prefix.
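The shape of such a filter can be sketched without Lucene on the classpath. The class below is a hypothetical stand-in, not a real TokenFilter: it models the token stream as an Iterator<String> and assumes the same "-" separator as the untokenized case. An actual filter would extend Lucene's TokenFilter and rewrite the text of each token it passes through.

```java
import java.util.Iterator;

// Hypothetical stand-in for a Lucene TokenFilter: wraps a token stream
// (modelled here as an Iterator<String>) and prefixes every token with
// the name of the field it came from.
final class FieldPrefixingTokens implements Iterator<String> {
    private final Iterator<String> input;
    private final String prefix;

    FieldPrefixingTokens(Iterator<String> input, String field) {
        this.input = input;
        this.prefix = field + "-"; // assumed separator
    }

    @Override
    public boolean hasNext() {
        return input.hasNext();
    }

    // Returns the next upstream token with the field name prepended.
    @Override
    public String next() {
        return prefix + input.next();
    }
}
```

For example, wrapping the tokens "quick" and "fox" from a field named title yields "title-quick" and "title-fox", which can then all be indexed under the one generic field.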

Yes, this is an ugly hack, but it can make a huge performance difference. The problem is that Lucene stores the norm values for each indexed field in an array with an entry for every document, when, in cases like yours, a sparse data structure might be more sensible.
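To get a feel for the cost, here is a rough back-of-the-envelope sketch. It assumes one norm byte per document for every indexed field, whether or not the document has a value for that field; the class name and the document and field counts are made up for illustration.

```java
// Hypothetical estimate of norm storage, assuming one byte per document
// for every indexed field (dense arrays, not sparse).
final class NormsEstimate {
    static long normBytes(long numDocs, long numIndexedFields) {
        // multiplyExact throws ArithmeticException on overflow
        // instead of silently wrapping around.
        return Math.multiplyExact(numDocs, numIndexedFields);
    }
}
```

Under that assumption, 1,000,000 documents with 1,000 indexed fields need about 1,000,000,000 bytes of norms, while the same documents folded into a single generic field need about 1,000,000 bytes.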

Doug
