Question on Dismax plugin

2011-04-11 Thread Nemani, Raj
All,

I have a question on the Dismax plugin for the search handler.  I have
two test instances of Solr.  In one I am using the default search
handler.  In this case, the fields that I am working with (slug and
story) are indexed via the all_text field and the searches are done on
the all_text field.

For the other one I have configured a search handler using the dismax
plugin as shown below.

 

<requestHandler name="mydismax" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="echoParams">explicit</str>
    <float name="tie">0.01</float>
    <str name="qf">
      story^3.0 slug^0.2
    </str>
    <int name="ps">100</int>
    <str name="q.alt">*:*</str>
  </lst>
</requestHandler>

 

To make testing easier, I have only the same 4 documents in both indexes,
with the word Obama appearing in them as described below.

 

File 1: The word Obama appears zero times in the slug field and four times in the story field
File 2: The word Obama appears zero times in the slug field and three times in the story field
File 3: The word Obama appears zero times in the slug field and two times in the story field
File 4: The word Obama appears one time in the slug field and one time in the story field

 

 

Here are the documents in order of decreasing score from the search
results:

 

Dismax search handler (steadily decreasing scores):

* File 1: Obama appears zero times in slug and four times in story
* File 4: Obama appears one time in slug and one time in story
* File 2: Obama appears zero times in slug and three times in story
* File 3: Obama appears zero times in slug and two times in story

Standard search handler:

* File 1: Obama appears zero times in slug and four times in story
* File 2: Obama appears zero times in slug and three times in story (same score as File 4 below)
* File 4: Obama appears one time in slug and one time in story (same score as File 2 above)
* File 3: Obama appears zero times in slug and two times in story

 

 

My question: why does dismax rank File 4 (Obama appears one time in the
slug field and one time in the story field) ahead of File 2 (Obama appears
zero times in the slug field and three times in the story field), given
that I have boosted these fields as shown below?

<str name="qf">
  story^3.0 slug^0.2
</str>

I would have thought that File 4 would have gone all the way down in the
result list.

 

Any help is appreciated

Thanks much in advance

Raj

Re: Question on Dismax plugin

2011-04-11 Thread Otis Gospodnetic
Hi Raj,

I'm guessing your slug field is much shorter, so a match in that field carries
more weight than a match in the much longer story field. If you omit norms for
those fields in the schema (and reindex), I believe you will see File 4 drop to
position #4.
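
As a rough sketch of what that change could look like in schema.xml: the
field type name "text" and the indexed/stored settings below are assumptions,
since the actual schema was not posted. Only the omitNorms="true" attribute is
the point here.

```xml
<!-- Hypothetical field definitions: the "text" type and the indexed/stored
     settings are assumed, not taken from the poster's schema. -->
<!-- omitNorms="true" stops Solr from storing length normalization for the
     field, so a match in a short slug no longer outweighs one in a long story. -->
<field name="slug"  type="text" indexed="true" stored="true" omitNorms="true"/>
<field name="story" type="text" indexed="true" stored="true" omitNorms="true"/>
```

Omitting norms also drops any index-time boosts on those fields, and the
change only takes effect after a full reindex. Adding debugQuery=on to the
request before and after should show whether the fieldNorm factor is what
pushes File 4 up.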

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



----- Original Message -----
 From: Nemani, Raj raj.nem...@turner.com
 To: solr-user@lucene.apache.org
 Sent: Mon, April 11, 2011 4:12:52 PM
 Subject: Question on Dismax plugin
 