Re: Similarity with schema-driven lengthNorm

Karsten Sperling Tue, 11 Mar 2008 17:07:34 -0700

Chris Hostetter wrote:
[...]
> is because of the same questions you raise: in an ideal world you wouldn't 
> put a lot of configuration in the <similarity/> section, you'd put it in 
> the field/fieldType sections .... but what would that look like? 
[...]
I think a very simple and generic way to handle this would be to have
the FieldType hold a LengthNorm class in the same way it currently holds
the Tokenizer / Analyzers.  We can then provide e.g. a DefaultLengthNorm
(same as DefaultSimilarity) and NoLengthNorm (lengthnorm = 1), and users
can create their own subclasses if they want to.
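
A minimal sketch of that idea, written as standalone Java rather than
against the real Solr classes. The LengthNorm interface, FieldTypeSketch
and SchemaNorms are hypothetical names that do not exist in Lucene or
Solr; DefaultLengthNorm assumes the 1/sqrt(numTokens) formula used by
DefaultSimilarity.

```java
import java.util.HashMap;
import java.util.Map;

final class LengthNormSketch {

    /** Hypothetical per-field strategy; not an existing Lucene/Solr type. */
    interface LengthNorm {
        float lengthNorm(int numTokens);
    }

    /** Assumed to mirror DefaultSimilarity: 1 / sqrt(numTokens). */
    static final class DefaultLengthNorm implements LengthNorm {
        public float lengthNorm(int numTokens) {
            return (float) (1.0 / Math.sqrt(numTokens));
        }
    }

    /** Disables length normalization: every field length scores the same. */
    static final class NoLengthNorm implements LengthNorm {
        public float lengthNorm(int numTokens) {
            return 1.0f;
        }
    }

    /** Stand-in for a field type that holds its norm like it holds its analyzer. */
    static final class FieldTypeSketch {
        final String typeName;
        final LengthNorm lengthNorm;

        FieldTypeSketch(String typeName, LengthNorm lengthNorm) {
            this.typeName = typeName;
            this.lengthNorm = lengthNorm;
        }
    }

    /** Similarity-like dispatcher that looks up the norm by field name. */
    static final class SchemaNorms {
        private final Map<String, FieldTypeSketch> fields =
                new HashMap<String, FieldTypeSketch>();

        void register(String fieldName, FieldTypeSketch type) {
            fields.put(fieldName, type);
        }

        float lengthNorm(String fieldName, int numTokens) {
            FieldTypeSketch type = fields.get(fieldName);
            // Unknown fields fall back to the default behaviour.
            LengthNorm norm = (type == null) ? new DefaultLengthNorm() : type.lengthNorm;
            return norm.lengthNorm(numTokens);
        }
    }
}
```

A schema-aware Similarity would then only need to delegate its
lengthNorm(fieldName, numTokens) call to something like SchemaNorms.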


> It's really hard to design general solutions to these types of questions 
> without some solid use cases.
My specific use case is a product search engine for which I don't want a
length norm at all on most fields, and where I do want it I want longer
fields to only get a minimally smaller boost, e.g. 0.8 for a "long"
value (whatever long is exactly) compared to 1.0 for a "short" value.
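
As an illustration of the kind of norm meant here, a sketch that
interpolates linearly from 1.0 down to 0.8. The thresholds of 5 and 50
tokens are made-up placeholders for "short" and "long", and the class
name is hypothetical.

```java
final class GentleLengthNorm {
    // Placeholder thresholds; what counts as "short" or "long" is an assumption.
    static final int SHORT_LEN = 5;
    static final int LONG_LEN = 50;
    // Smallest norm a field can receive, taken from the 0.8 example above.
    static final float MIN_NORM = 0.8f;

    /** Returns 1.0 for short fields, 0.8 for long ones, linear in between. */
    static float lengthNorm(int numTokens) {
        if (numTokens <= SHORT_LEN) {
            return 1.0f;
        }
        if (numTokens >= LONG_LEN) {
            return MIN_NORM;
        }
        // Fraction of the way from SHORT_LEN to LONG_LEN, in (0, 1).
        float t = (numTokens - SHORT_LEN) / (float) (LONG_LEN - SHORT_LEN);
        return 1.0f - t * (1.0f - MIN_NORM);
    }
}
```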

The main problem I'm running into with this is that I can't override
encodeNorm/decodeNorm because they're static, and the current encoding
(while providing a huge range of values) doesn't have much precision
around any individual value.
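
To make the precision issue concrete, a standalone sketch of a one-byte
float encoding that truncates the float to its top bits. It is modelled
on my understanding of how encodeNorm stores norms, so treat the exact
constants as an assumption; the class and method names are hypothetical.

```java
final class NormPrecisionSketch {
    // Offset of the smallest representable exponent, shifted into position.
    private static final int ZERO = (63 - 15) << 3;

    /** Packs a float into one byte by keeping only its top bits. */
    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> (24 - 3);
        if (small <= ZERO) {
            // Zero and negatives map to 0, underflow to the smallest non-zero byte.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= ZERO + 0x100) {
            return (byte) -1; // overflow maps to the largest value
        }
        return (byte) (small - ZERO);
    }

    /** Expands the byte back to a float; the dropped bits come back as zeros. */
    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    static float roundTrip(float f) {
        return decode(encode(f));
    }
}
```

In this sketch roundTrip(1.0f) is 1.0, roundTrip(0.9f) is 0.875 and
roundTrip(0.8f) is 0.75, so the whole range from 0.8 to 1.0 collapses
onto three stored values.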

Karsten
