Custom sorting based on external (database) data

2011-05-05 Thread Sujit Pal
Hi,

Sorry for the possible double post, I wrote this up but had the
incorrect sender address, so I am guessing that my previous one is going
to be rejected by the list moderation daemon.

I am trying to figure out options for the following problem. I am on
Solr 1.4.1 (Lucene 2.9.1).

I have search results which are going to be rated by users (using a
thumbs up/down), and the ratings translate to a score between -1 and +1.

This data is stored in a database table (unique_id, thumbs_up,
thumbs_down, num_calls) as the thumbs up/down component is clicked.

We want to be able to sort the results by the following score:
score = (thumbs_up - thumbs_down) / num_calls. The unique_id column
refers to the field declared as the uniqueKey in schema.xml.
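
To make the formula concrete, here is a minimal Python sketch of the
score computation. The function name rank_score is hypothetical, and
treating a document with num_calls = 0 as neutral (0.0) is an assumption,
since the post does not say how unrated documents should be handled.

```python
def rank_score(thumbs_up: int, thumbs_down: int, num_calls: int) -> float:
    """Return (thumbs_up - thumbs_down) / num_calls.

    Assumption: a document that has never been rated (num_calls == 0)
    gets a neutral score of 0.0 instead of raising ZeroDivisionError.
    """
    if num_calls == 0:
        return 0.0
    return (thumbs_up - thumbs_down) / num_calls


print(rank_score(3, 1, 4))  # 0.5
print(rank_score(0, 5, 5))  # -1.0
```

The result stays between -1 and +1 only as long as
thumbs_up + thumbs_down never exceeds num_calls.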

Based on the following conversation:
http://www.mail-archive.com/solr-user@lucene.apache.org/msg06322.html 

...my understanding is that I need to:

1) subclass FieldType to create my own RankFieldType. 
2) In this class I override the getSortField() method to return my
custom FieldSortComparatorSource object.
3) Build the custom FieldSortComparatorSource object which returns a
custom FieldSortComparator object in newComparator().
4) Configure a field type (rank_t) of class RankFieldType, and a field
(called rank) of type rank_t, in schema.xml.
5) use sort=rank+desc to do the sort.

My question is: is there a simpler/more performant way? The number of
database lookups seems like it's going to be pretty high with this
approach. And it's hard to believe that my problem is new, so I am
guessing this is either part of some Solr configuration I am missing, or
there is some other (possibly simpler) approach I am overlooking.

Pointers to documentation or code (or even keywords I could google)
would be much appreciated.

TIA for all your help,

Sujit




Re: Custom sorting based on external (database) data

2011-05-05 Thread Ahmet Arslan


Looks like it can be done with 
http://lucene.apache.org/solr/api/org/apache/solr/schema/ExternalFileField.html 
and 
http://wiki.apache.org/solr/FunctionQuery

You can dump your table into three text files, one per column
(thumbs_up, thumbs_down, num_calls), each keyed by unique_id. Issue a
commit to load the changes.

Note that sorting by a function query is only available from Solr 3.1
onward.
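
As a rough illustration of the sort-by-function idea, the sketch below
builds a request URL whose sort parameter computes the score from the
three external fields. It assumes Solr 3.1 or later, that thumbs_up,
thumbs_down and num_calls are defined as ExternalFileField fields under
exactly those names, and that Solr answers at the hypothetical URL used
here. The helper name build_sort_query is also made up for the example.

```python
from urllib.parse import urlencode

# Function-query expression for (thumbs_up - thumbs_down) / num_calls,
# using Solr's sub() and div() functions. The field names are assumptions.
SORT_EXPR = "div(sub(thumbs_up,thumbs_down),num_calls) desc"


def build_sort_query(base_url: str, q: str) -> str:
    """Return a select URL that sorts the results by the computed score."""
    params = {"q": q, "sort": SORT_EXPR}
    return base_url + "?" + urlencode(params)


# Hypothetical host and query term, for illustration only.
url = build_sort_query("http://localhost:8983/solr/select", "ipod")
print(url)
```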


Re: Custom sorting based on external (database) data

2011-05-05 Thread Sujit Pal
Thank you Ahmet, looks like we could use this. Basically we would do
periodic dumps of the (unique_id|computed_score) pairs, sorted by score,
and write them out to this file, followed by a commit.
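
A minimal sketch of such a periodic dump, in Python, might look like the
following. It is only an illustration under several assumptions: the
table is called ratings, an in-memory SQLite database stands in for the
real one, the field is named rank (so the file is called external_rank),
and unrated rows get a score of 0.0. It writes one key=value line per
document, which is the format ExternalFileField reads. The rows are
ordered by unique_id rather than by score because, as far as I recall,
the ExternalFileField javadoc suggests a file sorted by key loads faster;
please verify that against your Solr version.

```python
import os
import sqlite3
import tempfile


def dump_scores(conn: sqlite3.Connection, out_path: str) -> int:
    """Write one 'unique_id=score' line per row; return the number of lines."""
    rows = conn.execute(
        "SELECT unique_id, thumbs_up, thumbs_down, num_calls "
        "FROM ratings ORDER BY unique_id"
    )
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for unique_id, up, down, calls in rows:
            # Assumption: rows with no calls yet get a neutral score.
            score = (up - down) / calls if calls else 0.0
            out.write(f"{unique_id}={score:.4f}\n")
            count += 1
    return count


# Demo with an in-memory database standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ratings (unique_id TEXT PRIMARY KEY, "
    "thumbs_up INTEGER, thumbs_down INTEGER, num_calls INTEGER)"
)
conn.executemany(
    "INSERT INTO ratings VALUES (?, ?, ?, ?)",
    [("doc1", 3, 1, 4), ("doc2", 0, 5, 5), ("doc3", 0, 0, 0)],
)
with tempfile.TemporaryDirectory() as data_dir:
    # In a real setup this would be the Solr index data directory.
    path = os.path.join(data_dir, "external_rank")
    written = dump_scores(conn, path)
    with open(path, encoding="utf-8") as f:
        lines = f.read().splitlines()
print(written, lines)
```

After the file is in place, a commit makes Solr pick up the new values,
as described above.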

Found some more info here, for the benefit of others looking for
something similar:
http://dev.tailsweep.com/solr-external-scoring/ 
