Re: Multiple Indexes and relevance ranking question

Lance Norskog Fri, 01 Oct 2010 20:26:14 -0700

The score of a document has no absolute scale: it only has meaning relative to the other scores in the same query.

Solr does not rank these documents correctly. Without sharing the TF/DF information across the shards, it cannot.

If the shards each have "a lot" of the same kind of document, this problem averages out. That is, the "statistical fingerprint" across the shards is similar enough that each index gives the same numerical range. Yes, this is hand-wavy, and we don't have a measuring tool that verifies this assertion.
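
To make the "statistical fingerprint" point concrete, here is a rough sketch, not what Solr itself runs. It assumes an IDF of the form 1 + ln(numDocs / (docFreq + 1)), as in Lucene's classic similarity, and the shard names and counts are made up for illustration. When the shards have similar term statistics, each computes the same IDF, so the merged scores are comparable. When they are skewed, each shard's IDF drifts away from the value a single combined index would use.

```python
import math


def classic_idf(doc_freq, num_docs):
    """IDF in the style of Lucene's classic similarity: 1 + ln(N / (df + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical per-shard statistics for one term:
# (number of docs containing the term, total docs in the shard).
similar_shards = {"shard1": (2, 2), "shard2": (2, 2)}
skewed_shards = {"shard1": (2, 2), "shard2": (1, 100)}


def per_shard_idf(shards):
    """IDF as each shard computes it, seeing only its own documents."""
    return {name: classic_idf(df, n) for name, (df, n) in shards.items()}


def global_idf(shards):
    """IDF a single combined index would compute from pooled statistics."""
    total_df = sum(df for df, _ in shards.values())
    total_docs = sum(n for _, n in shards.values())
    return classic_idf(total_df, total_docs)


for label, shards in (("similar", similar_shards), ("skewed", skewed_shards)):
    print(label, per_shard_idf(shards), "global:", global_idf(shards))
```

In the "similar" case both shards agree with each other, even though neither matches the pooled value, so the relative order after merging is unaffected by IDF. In the "skewed" case the same term is weighted very differently on each shard.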


Lance

Valli Indraganti wrote:

I am new to Solr and to search technologies. I am playing around with
multiple indexes. I configured Solr for Tomcat and created two Tomcat fragments
so that two Solr webapps listen on port 8080 in Tomcat. I have created two
separate indexes using each webapp successfully.

My documents are very primitive. Below is the structure. I have four such
documents with different doc ids and an increasing number of occurrences of
the word "Hello" corresponding to the name of the document (this is only to
make my analysis of the results easier). Documents One and Two are in shard 1,
and Three and Four are in shard 2. Obviously, document Two is ranked higher
when queried against that index (for the word Hello), and document Four is
ranked higher when queried against the second index. When using the shards
parameter, the scores remain unaltered.
My question is, if the distributed search does not consider IDF, how is it
able to rank these documents correctly? Or do I not have the indexes truly
distributed? Is something wrong with my term distribution?

<add>
  <doc>
    <field name="id">Valli1</field>
    <field name="name">One</field>
    <field name="text">Hello!This is a test document testing relevancy scores.</field>
  </doc>
</add>
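
For reference, a distributed query of the kind described above might be built as in the sketch below. The webapp paths `solr1` and `solr2` are hypothetical, since the message does not name the two webapps; only the `shards` request parameter, which takes a comma-separated list of `host:port/path` entries, is taken from the setup described. The sketch only builds the URL and does not send the request.

```python
from urllib.parse import urlencode


def build_distributed_query(base_url, shards, term, field="text"):
    """Build a Solr select URL that fans the query out to the given shards."""
    params = {
        "q": f"{field}:{term}",
        # Comma-separated shard list, each entry as host:port/path.
        "shards": ",".join(shards),
        # Ask for the score so the merged ranking can be inspected.
        "fl": "id,name,score",
    }
    # Keep ':', '/' and ',' unescaped so the URL stays readable.
    return f"{base_url}/select?{urlencode(params, safe=':/,')}"


# Hypothetical webapp names; substitute the paths of the two Tomcat fragments.
url = build_distributed_query(
    "http://localhost:8080/solr1",
    ["localhost:8080/solr1", "localhost:8080/solr2"],
    "Hello",
)
print(url)
```

Comparing the `score` values returned here with those from each index queried on its own shows whether the per-shard scores are passed through unchanged.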
