Re: how does Solr/Lucene index multi-value fields

Ian Holsman Tue, 31 May 2011 09:17:25 -0700

On May 31, 2011, at 12:11 PM, Erick Erickson wrote:

> Can you explain the use-case a bit more here? Especially the post-query
> processing and how you expect the multiple documents to help here.
>

We have a collection of related stories. When a user searches for something, we 
might not want to display the story that is most relevant according to Solr, 
but the one picked by other home-grown rules. By combining all the possibilities 
in one SolrDocument, we can avoid a DB hit to get the related stories.

> But TF/IDF is calculated over all the values in the field. There's really no
> difference between a multi-valued field and storing all the data in a
> single field as far as relevance calculations are concerned.
> 

So... it will suck regardless. I thought we had per-field relevance in the 
current trunk. :-(
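
To make Erick's point concrete, here is a toy sketch in Python. It is not 
Lucene's scoring code; the whitespace tokenizer and the sample stories are 
made up for illustration. It only shows that the two inputs TF/IDF cares about 
here (term counts and field length) come out the same whether the stories are 
kept as separate values or joined into one string.

```python
from collections import Counter


def tokenize(text):
    # Hypothetical stand-in for an analyzer: lowercase and split on whitespace.
    return text.lower().split()


def field_stats(values):
    """Return (term counts, total token count) for a field given as a list of values."""
    counts = Counter()
    length = 0
    for value in values:
        tokens = tokenize(value)
        counts.update(tokens)
        length += len(tokens)
    return counts, length


# Made-up sample data: three "stories" stored in one multi-valued field.
stories = ["solr indexes fields", "lucene scores fields", "fields everywhere"]

multi_counts, multi_len = field_stats(stories)                # multi-valued field
single_counts, single_len = field_stats([" ".join(stories)])  # one big value

# Same term frequencies and same field length either way.
print(multi_counts == single_counts, multi_len == single_len)
```

So a story that matches weakly still inherits the term counts of its siblings, 
which is why the score can't tell me which value in the list was the good one.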

> Best
> Erick
> 
> On Tue, May 31, 2011 at 11:02 AM, Ian Holsman <had...@holsman.net> wrote:
>> Hi.
>> 
>> I want to store a list of documents (say each being 30-60k of text) into a 
>> single SolrDocument. (to speed up post-retrieval querying)
>> 
>> In order to do this, I need to know whether Lucene calculates the TF/IDF 
>> score over the entire field or treats each value in the list as a unique 
>> field.
>> 
>> If I can't store it as a multi-value, I could create a schema where I put 
>> each document into a unique field, but I'm not sure how to create the query 
>> to search all the fields.
>> 
>> 
>> Regards
>> Ian
>> 
>>
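
On the one-field-per-story schema from the original question: one way to 
search all the fields at once is the dismax query parser, which takes a list 
of fields in its qf parameter. The sketch below only builds the request 
parameters. The field names (story_1, story_2, ...) and the /select path are 
assumptions for illustration, not something taken from a real schema.

```python
from urllib.parse import urlencode


def build_multi_field_query(user_query, fields):
    """Build a Solr select URL path that searches several fields via dismax."""
    params = {
        "q": user_query,
        "defType": "dismax",      # use the dismax query parser
        "qf": " ".join(fields),   # query fields, space-separated
    }
    return "/select?" + urlencode(params)


# Hypothetical field names, one per story in the document.
story_fields = ["story_%d" % i for i in range(1, 4)]
url = build_multi_field_query("multi-value fields", story_fields)
print(url)
```

With separate fields, each one carries its own term frequencies and length 
norm, so this avoids the pooled scoring of a single multi-valued field, at the 
cost of fixing the maximum number of stories per document in the schema.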
