The reason for the issue you are seeing is the IDF component in the
score. IDF = inverse document frequency.

The document frequency is the number of documents in the index that
contain a given term. The higher the document frequency, the more common
the term and thus the less relevant it is. The document frequency is
inverted to give a higher number for rarer, more relevant terms.

Solr does not yet support distributed IDF, so each shard computes IDF
from its own local statistics. Therefore the document frequency in a 3m
shard will be higher (as a proportion of that shard's index) compared to
your 30m shard, thus it will score lower.
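
To make that concrete, here is a rough sketch in Python. It assumes the
classic Lucene TF-IDF formula, idf = 1 + ln(numDocs / (docFreq + 1)),
which I believe is what the default similarity uses in this version. The
shard names and the document frequency of 3000 are made-up numbers for
illustration only, and the "global" figure just shows what summing the
statistics across shards would give, not how any particular patch
implements it.

```python
import math


def classic_idf(doc_freq: int, num_docs: int) -> float:
    """IDF as in Lucene's classic TF-IDF similarity (assumed here)."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical statistics: the same term matches 3000 documents on every
# shard, but the shards differ in size (docFreq, numDocs).
shards = {
    "shard1": (3000, 30_000_000),
    "shard2": (3000, 30_000_000),
    "shard3": (3000, 3_000_000),
}

# Without distributed IDF, each shard uses only its local statistics.
per_shard_idf = {
    name: classic_idf(doc_freq, num_docs)
    for name, (doc_freq, num_docs) in shards.items()
}

# With global statistics, every shard would use the same summed values.
global_doc_freq = sum(doc_freq for doc_freq, _ in shards.values())
global_num_docs = sum(num_docs for _, num_docs in shards.values())
global_idf = classic_idf(global_doc_freq, global_num_docs)

for name, idf in per_shard_idf.items():
    print(f"{name}: local idf = {idf:.3f}")
print(f"global idf = {global_idf:.3f}")
```

With these numbers the term is proportionally ten times more common in
the small shard, so its local IDF comes out lower there and otherwise
identical documents score lower than on the large shards.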

I am not aware of a multiplier you can use to fix this. There is a
distributed IDF ticket in JIRA; it may be mature enough by now to help
you.

Upayavira

On Thu, Jun 20, 2013, at 01:56 AM, Learner wrote:
> Hi,
> 
> Sorry if it's a very basic question, but I am pretty new to SolrCloud and
> I am trying to understand the underlying mechanism for calculating
> relevancy.
> 
> Currently we are using SOLR 3.6.X and we use shards to perform
> distributed searching. Our shards are not of equal size, hence sometimes
> the results are not as we expected.
> 
> For example: Shard 1 has 30 million documents, Shard 2 has 30 million
> documents and Shard 3 has just 3 million documents (push indexing via
> message queue).
> 
> When we do a search using shards, documents from shard 1 and shard 2 get
> higher priority compared to documents in shard 3 (since it's smaller).
> Currently we add an index-time boost when adding documents to shard 3 so
> that the documents from shard 3 also come up (higher) in search results.
> 
> Now when using SolrCloud, say for example one shard has a person name
> repeated 5 times (with different unique ids) and we have one more
> document with the same person name in shard 2 (with a different id).
> When we do a search, how does SOLR calculate the score? Does it do
> something like constant scoring across the various shards in order to
> bring up the search results across them? How does the score get
> calculated? Do all 6 documents get the same score (5 from shard 1 and 1
> from shard 2, if all the fields have the same value except for the
> unique id)?
> 
> Thanks,
> BB 
> 
> 
> 
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/SolrCloud-Score-calculation-tp4071805.html
> Sent from the Solr - User mailing list archive at Nabble.com.
