Take a look at the org.apache.lucene.search.function package in SVN.  It
provides an API that allows you to define "function" classes that can
compute a score for each document using whatever means you want.  The
overall FunctionQuery can then be wrapped in a BooleanQuery along with
whateer other search critera you have.

5 basic functions have been included that can be composed in all sorts of
interesting ways to compute scores based on document values for a
particular field (or the relative ordinal positions in the FieldCache for
that field)


: Date: Tue, 24 Jan 2006 17:42:06 -0000
: From: Nick Vincent <[EMAIL PROTECTED]>
: Reply-To: java-user@lucene.apache.org
: To: java-user@lucene.apache.org
: Subject: Sorting by calculated custom score at search time
:
: Hi,
:
: I am trying to find a way to create scores with a custom formula based
: on the initial score from Lucene and field values from each document,
: e.g. for each document:
:
:  finalScore = searchScore * (popularity) * (userRating)
:
: The customer requires this functionality as I have to replace an
: existing system that works like this.  User rating and popularity are
: already available and will be stored in Lucene.  I've looked through LIA
: and the approaches there don't seem to fit the requirement:
:
: 5.1.6 Sorting by multiple fields: only sorts by one field, then the
: next, I need to combine the scores
: 6.1 Using a custom sort method: does not take into account the
: document's original score
:
: >From an earlier thread discussing a calculated score based on the hit
: score and the age of document I gather that TSS regenerate their indexes
: to alter the document boost based on date.  I need to be able to sort by
: either relevance or "popularity rated relevance" depending on user
: input, so I don't think adding a precalculated document boost at index
: time is an option.
:
: In the worst case scenario I'll need to iterate through the hits and
: then sort them in memory myself, but I'm looking to be indexing around
: 500,000 documents, and in this particular application there are a lot of
: common keywords, so a large number of hits for a basic query is common.
: I'm trying to avoid this as it's an untidy solution which is likely to
: be (relatively) slow.
:
: I notice Erik has commented that "I've not come across a really clean
: way to do this sort of age-based
: boosting other than how TSS does it".  I was wondering if anyone has any
: experience with dirtier approaches they could share with me?
:
: Any help is really appreciated,
:
: Thanks,
:
: Nick
:
:
: ---------------------------------------------------------------------
: To unsubscribe, e-mail: [EMAIL PROTECTED]
: For additional commands, e-mail: [EMAIL PROTECTED]
:



-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to