Re: boosting fields

karl wettin Thu, 27 Apr 2006 00:47:31 -0700


On 26 Apr 2006, at 19:18, Doug Cutting wrote:

karl wettin wrote:
How about refactoring fields to something like:
[Document](fieldName)<#>---- {0..1} ->[Field +boost]<#>----{0..*} -> [FieldValue +store +index +termVector]
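One possible reading of the diagram above, as a minimal Java sketch: a document maps each field name to at most one field carrying the boost, and each field holds any number of values carrying the store, index and term vector flags. All class and member names here (DocumentSketch, FieldSketch, FieldValueSketch) are hypothetical and are not part of the Lucene API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical types illustrating the proposed structure; not Lucene classes.

/** One value of a field, carrying the per-value flags from the diagram. */
final class FieldValueSketch {
    final String text;
    final boolean store;
    final boolean index;
    final boolean termVector;

    FieldValueSketch(String text, boolean store, boolean index, boolean termVector) {
        this.text = text;
        this.store = store;
        this.index = index;
        this.termVector = termVector;
    }
}

/** A named field: one boost shared by zero or more values. */
final class FieldSketch {
    float boost = 1.0f;
    final List<FieldValueSketch> values = new ArrayList<>();
}

/** A document: each field name maps to at most one FieldSketch. */
final class DocumentSketch {
    private final Map<String, FieldSketch> fields = new LinkedHashMap<>();

    /** Returns the field with this name, creating it on first use. */
    FieldSketch field(String name) {
        return fields.computeIfAbsent(name, n -> new FieldSketch());
    }

    /** Returns the field with this name, or null if the document has none. */
    FieldSketch get(String name) {
        return fields.get(name);
    }
}
```

In this reading the boost is set once per field name, while store, index and termVector can differ between values of the same field.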
If you think you have a simple, back-compatible way to do this, please submit a patch. Perhaps it is simpler than I imagined.
Long-term, an API which supports per-token boosting will probably prove useful, as a part of #11 on http://wiki.apache.org/jakarta-lucene/Lucene2Whiteboard.
I've wanted that feature a few times. Let me know if there is something I can do to help when the time is right.
The time will be right as soon as someone decides they want to implement this! Ideally every part of the index would be pluggable, but the most important is postings, so probably we should start there.
My idea is that the logic of DocumentWriter

I would prefer to leave persistence and deprecation out of the discussion until the rest is solved, as I spend all my spare brain time on the InstanciatedIndex-thingy.

and also probably a no-positions version, a no-freqs version and a weight-per-position version. TermFreqs and TermPositions should be replaced with a generic Postings API. Applications can then downcast the Postings instance based on the FieldInfo.


This is much more interesting from my point of view. Let's start here.

I might be wrong, and I really don't know why it is so bad, but I think casting based on FieldInfo would break the Liskov substitution principle in a big way.
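To make the pattern under discussion concrete, here is a rough Java sketch of what downcasting a generic Postings instance based on FieldInfo might look like. Every name in it (Postings, WeightedPostings, FieldInfoSketch, PostingsConsumer) is hypothetical, since no such API exists yet. It only illustrates that the caller has to trust the field metadata to agree with the runtime type of the postings object.

```java
// Hypothetical types for illustration only; not an existing Lucene API.

/** Generic postings: the least every variant can offer. */
interface Postings {
    int doc();
}

/** A richer variant that also exposes one weight per position. */
interface WeightedPostings extends Postings {
    float[] weights();
}

/** Per-field metadata telling the application which variant to expect. */
final class FieldInfoSketch {
    final String name;
    final boolean hasPositionWeights;

    FieldInfoSketch(String name, boolean hasPositionWeights) {
        this.name = name;
        this.hasPositionWeights = hasPositionWeights;
    }
}

final class PostingsConsumer {
    /**
     * Sums the per-position weights, or returns 0 when the field metadata
     * says there are none. The cast is only safe while the metadata and the
     * runtime type agree; otherwise it throws ClassCastException.
     */
    static float totalWeight(FieldInfoSketch info, Postings postings) {
        if (!info.hasPositionWeights) {
            return 0f;
        }
        WeightedPostings weighted = (WeightedPostings) postings;
        float sum = 0f;
        for (float w : weighted.weights()) {
            sum += w;
        }
        return sum;
    }
}
```

The compiler cannot check the link between FieldInfoSketch and the concrete postings type here, so a mismatch shows up only at run time.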

My own immediate thought is to compromise by allowing a boost per term in each document. Simply remove the norms methods from the IndexReader, add a new one to the TermEnum, and fall back on the field boost. How would the value be picked up by the scorer?
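The fallback part of this idea can be sketched in a few lines of Java. This is only an in-memory illustration for a single document, with hypothetical names (TermBoostSketch, setTermBoost, setFieldBoost); it does not touch IndexReader or TermEnum, it assumes a default boost of 1.0, and it leaves the scorer question open.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch of boost lookup for one document: a boost set for a
 * specific term wins, otherwise the boost of the enclosing field is used.
 */
final class TermBoostSketch {
    private static final float DEFAULT_BOOST = 1.0f; // assumed default

    private final Map<String, Float> fieldBoosts = new HashMap<>();
    private final Map<String, Float> termBoosts = new HashMap<>();

    void setFieldBoost(String field, float boost) {
        fieldBoosts.put(field, boost);
    }

    void setTermBoost(String field, String term, float boost) {
        termBoosts.put(key(field, term), boost);
    }

    /** Returns the term boost if present, else the field boost, else the default. */
    float boost(String field, String term) {
        Float termBoost = termBoosts.get(key(field, term));
        if (termBoost != null) {
            return termBoost;
        }
        return fieldBoosts.getOrDefault(field, DEFAULT_BOOST);
    }

    // Field and term are joined with a NUL separator to form one map key.
    private static String key(String field, String term) {
        return field + '\u0000' + term;
    }
}
```

For example, with a field boost of 2.0 on "title" and a term boost of 3.5 on the term "lucene" in that field, a lookup for "lucene" gives the term boost and any other term in "title" gives the field boost.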


Boost per position, etc., sounds very expensive.

