Custom sorting based on external (database) data
Hi,

Sorry for the possible double post. I wrote this up but had the incorrect sender address, so I am guessing that my previous one is going to be rejected by the list moderation daemon.

I am trying to figure out options for the following problem. I am on Solr 1.4.1 (Lucene 2.9.1). I have search results which are going to be ranked by the user (using a thumbs up/down), and the ranking would translate to a score between -1 and +1.

This data is stored in a database table (unique_id, thumbs_up, thumbs_down, num_calls) and updated as the thumbs up/down component is clicked. We want to be able to sort the results by the following:

score = (thumbs_up - thumbs_down) / num_calls

The unique_id column refers to the field declared as the uniqueKey in schema.xml.

Based on the following conversation:
http://www.mail-archive.com/solr-user@lucene.apache.org/msg06322.html

...my understanding is that I need to:

1) Subclass FieldType to create my own RankFieldType.
2) In this class, override the getSortField() method to return my custom FieldSortComparatorSource object.
3) Build the custom FieldSortComparatorSource object, which returns a custom FieldSortComparator object in newComparator().
4) In schema.xml, configure a field type rank_t of class RankFieldType, and a field called rank of type rank_t.
5) Use sort=rank+desc to do the sort.

My question is: is there a simpler/more performant way? The number of database lookups seems like it's going to be pretty high with this approach. And it's hard to believe that my problem is new, so I am guessing this is either part of some Solr configuration I am missing, or there is some other (possibly simpler) approach I am overlooking.

Pointers to documentation or code (or even keywords I could google) would be much appreciated.

TIA for all your help,
Sujit
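For illustration, the score formula in the post could be computed along these lines. This is only a minimal sketch: the function name rank_score is hypothetical, and returning 0.0 for a document with no calls is an assumption, since the post does not say how unrated documents should be treated.

```python
def rank_score(thumbs_up: int, thumbs_down: int, num_calls: int) -> float:
    """Return (thumbs_up - thumbs_down) / num_calls.

    The result lies in [-1, +1] as long as
    thumbs_up + thumbs_down <= num_calls.
    """
    if num_calls <= 0:
        # Assumption: an unrated document gets a neutral score.
        return 0.0
    return (thumbs_up - thumbs_down) / num_calls


if __name__ == "__main__":
    # 3 up, 1 down, 4 calls
    print(rank_score(3, 1, 4))
```

Whatever default is chosen for unrated documents decides where they land relative to rated ones in a descending sort, so it is worth picking deliberately.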
Re: Custom sorting based on external (database) data
--- On Thu, 5/5/11, Sujit Pal sujit@comcast.net wrote:

From: Sujit Pal sujit@comcast.net
Subject: Custom sorting based on external (database) data
To: solr-user solr-user@lucene.apache.org
Date: Thursday, May 5, 2011, 11:03 PM

[...] My question is: is there a simpler/more performant way? The number of database lookups seems like its going to be pretty high with this approach. [...] Pointers to documentation or code (or even keywords I could google) would be much appreciated.

Looks like it can be done with ExternalFileField
http://lucene.apache.org/solr/api/org/apache/solr/schema/ExternalFileField.html
and function queries
http://wiki.apache.org/solr/FunctionQuery

You can dump your table into three text files. Issue a commit to load these changes. Note, though, that sorting by function query is only available as of Solr 3.1.
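A rough sketch of what the three-file suggestion might look like follows. It assumes one ExternalFileField per database column, with the field names thumbs_up, thumbs_down and num_calls chosen here to mirror the table (the thread does not name them), and it relies on ExternalFileField reading a file called external_<fieldname> from the index data directory, with one key=value line per document. The helper names and the data directory argument are hypothetical.

```python
import os
from urllib.parse import urlencode

# Assumed field names, one ExternalFileField per database column.
COLUMNS = ("thumbs_up", "thumbs_down", "num_calls")


def write_external_files(rows, data_dir):
    """Write one external_<field> file per column.

    rows: iterable of (unique_id, thumbs_up, thumbs_down, num_calls).
    Returns the list of paths written, in COLUMNS order.
    """
    rows = sorted(rows)  # order by unique_id for stable output
    paths = []
    for i, col in enumerate(COLUMNS, start=1):
        path = os.path.join(data_dir, "external_" + col)
        with open(path, "w", encoding="utf-8") as f:
            for row in rows:
                # ExternalFileField line format: <key>=<float value>
                f.write(f"{row[0]}={row[i]}\n")
        paths.append(path)
    return paths


def sort_param():
    """URL-encoded sort parameter computing the score at query time."""
    expr = "div(sub(thumbs_up,thumbs_down),num_calls) desc"
    return urlencode({"sort": expr})
```

With this layout the score is computed by Solr on each query, so the raw counts can be re-dumped without recomputing anything. Documents whose num_calls is 0 would divide by zero in the function query, so they probably need a default value or a guard.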
Re: Custom sorting based on external (database) data
Thank you Ahmet, looks like we could use this. Basically, we would periodically dump (unique_id, computed_score) pairs, sorted by score, to this file and then issue a commit.

Found some more info here, for the benefit of others looking for something similar:
http://dev.tailsweep.com/solr-external-scoring/

On Thu, 2011-05-05 at 13:12 -0700, Ahmet Arslan wrote:

Looks like it can be done with http://lucene.apache.org/solr/api/org/apache/solr/schema/ExternalFileField.html and http://wiki.apache.org/solr/FunctionQuery You can dump your table into three text files. Issue a commit to load these changes. Sort by function query is available in Solr3.1 though.
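The periodic dump described in this reply could be sketched roughly as below. Several things here are assumptions rather than details from the thread: the field name rank (borrowed from the original post), the update URL, the neutral 0.0 score for documents with no calls, and ordering lines by key rather than by score to keep the output deterministic. The file is written to a temporary name and then renamed, so Solr should never see a half-written file.

```python
import os
import tempfile
import urllib.request


def write_score_file(rows, data_dir, field="rank"):
    """Write external_<field> with one unique_id=score line per document.

    rows: iterable of (unique_id, thumbs_up, thumbs_down, num_calls).
    """
    target = os.path.join(data_dir, "external_" + field)
    # Temp file in the same directory so the rename stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=data_dir, prefix=".external_tmp_")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        for uid, up, down, calls in sorted(rows):
            # Assumption: unrated documents get a neutral score.
            score = (up - down) / calls if calls > 0 else 0.0
            f.write(f"{uid}={score:.4f}\n")
    os.replace(tmp, target)  # atomic swap into place
    return target


def commit(update_url="http://localhost:8983/solr/update"):
    """POST <commit/> so Solr reloads the file (URL is a placeholder)."""
    req = urllib.request.Request(
        update_url,
        data=b"<commit/>",
        headers={"Content-Type": "text/xml"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

A cron job or similar scheduler could call write_score_file() with fresh database rows and then commit(). Since the score is precomputed, sorting only needs the single external field.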