Hmmm, I may have misled you. Re-reading my text, it
wasn't very well written...

TF/IDF calculations are, indeed, per-field. I was trying
to say that, as far as relevance calculations are concerned,
there is no difference between storing all the data for an
individual field as a single long string of text in a
single-valued field or as several shorter strings in
a multi-valued field.
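
To make that concrete, here is a rough sketch in Python. It is a toy
model, not Lucene's actual scoring code: the tokenizer, the function
names (tokenize, field_stats, toy_score) and the sample text are all
made up for illustration, and the formula only loosely mimics the
sqrt(tf) and 1/sqrt(field length) factors. It also ignores the position
gap between values of a multi-valued field, which I'd expect to matter
for phrase/proximity matching rather than for term counts.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical stand-in for an analyzer: lowercase and split on whitespace.
    return text.lower().split()


def field_stats(values):
    # A field is modeled as a list of values. Tokens from every value are
    # pooled, so the counts describe the field as a whole, not each value.
    tokens = [tok for value in values for tok in tokenize(value)]
    return Counter(tokens), len(tokens)


def toy_score(term, values):
    # Toy per-field score: sqrt(term frequency) scaled by 1/sqrt(field length).
    counts, length = field_stats(values)
    if length == 0:
        return 0.0
    return math.sqrt(counts[term]) / math.sqrt(length)


# The same text stored as several values vs. as one long string.
multi_valued = [
    "solr scores per field",
    "lucene scores each field",
    "solr and lucene",
]
single_valued = [" ".join(multi_valued)]

multi_score = toy_score("solr", multi_valued)
single_score = toy_score("solr", single_valued)
print(multi_score == single_score)
```

Under these assumptions both layouts yield the same term counts and the
same field length, so the toy score is identical either way. That is the
point I was trying to make: splitting the text into multiple values does
not, by itself, give you a separate score per value.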

Best
Erick

On Tue, May 31, 2011 at 12:16 PM, Ian Holsman <had...@holsman.net> wrote:
>
> On May 31, 2011, at 12:11 PM, Erick Erickson wrote:
>
>> Can you explain the use-case a bit more here? Especially the post-query
>> processing and how you expect the multiple documents to help here.
>>
>
> We have a collection of related stories. When a user searches for something,
> we might not want to display the story that is most relevant according to
> Solr, but the one that best fits other home-grown rules. By combining all the
> possibilities in one SolrDocument, we can avoid a DB hit to get the related
> stories.
>
>
>> But TF/IDF is calculated over all the values in the field. There's really no
>> difference between a multi-valued field and storing all the data in a
>> single field
>> as far as relevance calculations are concerned.
>>
>
> so.. it will suck regardless.. I thought we had per-field relevance in the 
> current trunk. :-(
>
>
>> Best
>> Erick
>>
>> On Tue, May 31, 2011 at 11:02 AM, Ian Holsman <had...@holsman.net> wrote:
>>> Hi.
>>>
>>> I want to store a list of documents (say each being 30-60k of text) into a 
>>> single SolrDocument. (to speed up post-retrieval querying)
>>>
>>> In order to do this, I need to know if lucene calculates the TF/IDF score 
>>> over the entire field or does it treat each value in the list as a unique 
>>> field?
>>>
>>> If I can't store it as a multi-value, I could create a schema where I put 
>>> each document into a unique field, but I'm not sure how to create the query 
>>> to search all the fields.
>>>
>>>
>>> Regards
>>> Ian
>>>
>>>
>
>
