Chris Hostetter wrote:
[...]
> is because of the same questions you raise: in an ideal world you would
> put a lot of configuration in the <similarity/> section, you'd put it in
> the field/fieldType sections .... but what would that look like?
[...]

I think a very simple and generic way to handle this would be to have the FieldType hold a LengthNorm class in the same way it currently holds the Tokenizer / Analyzers. We can then provide e.g. a DefaultLengthNorm (same as DefaultSimilarity) and NoLengthNorm (lengthnorm = 1), and users can create their own subclasses if they want to.
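A minimal sketch of that proposal, to make the shape concrete. All of the types below (LengthNorm, DefaultLengthNorm, NoLengthNorm, SketchFieldType) are hypothetical placeholders and not existing Solr or Lucene API. The only part taken from Lucene is the 1/sqrt(numTerms) formula that DefaultSimilarity uses for its length norm.

```java
// Hypothetical sketch: none of these types are existing Solr or Lucene API.

/** Pluggable per-field length normalization (assumed interface). */
interface LengthNorm {
    /** Returns the norm factor for a field value with numTerms tokens. */
    float lengthNorm(int numTerms);
}

/** Same formula as DefaultSimilarity's length norm: 1 / sqrt(numTerms). */
final class DefaultLengthNorm implements LengthNorm {
    public float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }
}

/** Disables length normalization: every field length gets a factor of 1. */
final class NoLengthNorm implements LengthNorm {
    public float lengthNorm(int numTerms) {
        return 1.0f;
    }
}

/** Stand-in for a FieldType that holds a LengthNorm next to its analyzer. */
final class SketchFieldType {
    private final String name;
    private final LengthNorm lengthNorm;

    SketchFieldType(String name, LengthNorm lengthNorm) {
        this.name = name;
        this.lengthNorm = lengthNorm;
    }

    String getName() {
        return name;
    }

    LengthNorm getLengthNorm() {
        return lengthNorm;
    }
}

class LengthNormSketch {
    public static void main(String[] args) {
        SketchFieldType title = new SketchFieldType("title", new NoLengthNorm());
        SketchFieldType body = new SketchFieldType("body", new DefaultLengthNorm());
        // Print the factor each field type would apply to a 4-term value.
        System.out.println(title.getName() + ": " + title.getLengthNorm().lengthNorm(4));
        System.out.println(body.getName() + ": " + body.getLengthNorm().lengthNorm(4));
    }
}
```

How the implementation would be named in the schema (for example as a class attribute on the fieldType) is left open here, since that is exactly the configuration question raised above.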
> It's really hard to design general solutions to these types of questions
> without some solid use cases.

My specific use case is a product search engine for which I don't want a length norm at all on most fields. Where I do want one, I want longer fields to get only a minimally smaller boost, e.g. 0.8 for a "long" value (whatever long is exactly) compared to 1.0 for a "short" value. The main problem I'm running into with this is that I can't override encodeNorm/decodeNorm because they're static, and the current encoding (while providing a huge range of values) doesn't have much precision around any individual value.

Karsten
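The precision problem can be illustrated with a small round-trip experiment. The sketch below is a standalone reimplementation, assuming the norm byte uses the single-byte float layout of Lucene's SmallFloat.floatToByte315; it is not the Lucene source itself, and the class and method names are made up for this example.

```java
// Sketch only: assumes the norm byte follows the layout of Lucene's
// SmallFloat.floatToByte315. Class and method names are hypothetical.
class NormPrecisionSketch {

    /** Packs a float into one byte by truncating it to its top bits. */
    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> 21;           // keep sign, exponent, top 2 mantissa bits
        int zero = (63 - 15) << 3;        // offset of the smallest encodable exponent
        if (small <= zero) {
            // Zero and negatives map to 0, tiny positives to the smallest value.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= zero + 0x100) {
            return (byte) -1;             // overflow: clamp to the largest value
        }
        return (byte) (small - zero);
    }

    /** Expands a byte produced by encode back into a float. */
    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << 21;
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    /** Value a norm would have after being stored and read back. */
    static float roundTrip(float f) {
        return decode(encode(f));
    }

    public static void main(String[] args) {
        float[] wanted = {1.0f, 0.95f, 0.9f, 0.85f, 0.8f};
        for (float w : wanted) {
            System.out.println(w + " -> " + roundTrip(w));
        }
        // Prints:
        // 1.0 -> 1.0
        // 0.95 -> 0.875
        // 0.9 -> 0.875
        // 0.85 -> 0.75
        // 0.8 -> 0.75
    }
}
```

Under that assumption only 0.75, 0.875 and 1.0 are representable between 0.75 and 1.0, so a requested norm of 0.8 is stored as 0.75 and is indistinguishable from 0.85.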
