On 18-May-07, at 1:01 PM, charlie w wrote:
So now I have the idea to invert the field name and value thusly:
foo=tag     ^2
bar=tag     ^1.2
foobar=tag    ^1.8
and search "foo:tag".

Intuitively, I would expect Lucene to be optimized for searching the values of fields, and not really the names of fields. In a somewhat large index, say 10 million documents, will Lucene search performance continue to be
acceptable if I load up documents with many fields like this?

Perhaps not. Storing a field with norms occupies O(N) space, where N is the number of documents in the index, regardless of how many documents have non-zero norms for that field. With many such fields, there might be too much data for the OS to cache and for Lucene to process efficiently.
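
To put a rough number on that, here is a back-of-the-envelope sketch. It assumes one norm byte per document for every field indexed with norms; the count of 100 distinct tag fields is a made-up figure for illustration, and the class and method names are hypothetical.

```java
public class NormsEstimate {
    // Assumption: one norm byte per document for each field indexed with norms,
    // even for documents that do not contain that field.
    static final long BYTES_PER_NORM = 1L;

    // Estimated total norms size in bytes for the whole index.
    static long normsBytes(long numDocs, int fieldsWithNorms) {
        return numDocs * fieldsWithNorms * BYTES_PER_NORM;
    }

    public static void main(String[] args) {
        long numDocs = 10_000_000L;  // index size mentioned in the question
        int fieldsWithNorms = 100;   // hypothetical number of distinct tag fields
        long bytes = normsBytes(numDocs, fieldsWithNorms);
        System.out.println(bytes + " bytes of norms");
    }
}
```

Under those assumptions the norms alone come to 10^9 bytes, about 1 GB, and the figure grows linearly with every additional field name.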

Is there an upper limit on the number of fields comprising a document, and
if so what is it?

There is not. Fields are relatively cheap if omitNorms=true, though omitting norms also discards index-time field boosts.

Or, is there some way to make my original approach work after all?

The experimental payloads feature allows an optional boost to be stored along with each term position. This is its intended use case.

-Mike
