Re: multicore shards and relevancy score

Jason Rutherglen Tue, 15 Sep 2009 10:19:30 -0700

You can query multiple cores using MultiEmbeddedSearchHandler in
SOLR-1431.  Then the facet counts will be merged just like the current
distributed requests.
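
For comparison, the "current distributed requests" mentioned above are the ones issued with the shards parameter. Below is a minimal sketch of building such a request URL. The host, port, core names, query and the facet field "genre" are made-up placeholders, and build_sharded_query is a hypothetical helper, not part of Solr or of SOLR-1431.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_sharded_query(base_url, shards, query, facet_field=None, rows=10):
    """Build a Solr select URL that fans the query out over several shards.

    base_url    -- the core that receives the request and merges the results
    shards      -- list of "host:port/path" entries, one per core to search
    facet_field -- optional field to facet on (placeholder name in the demo)
    """
    params = [
        ("q", query),
        ("shards", ",".join(shards)),   # comma-separated shard list
        ("rows", rows),
        ("fl", "*,score"),              # return the score to inspect ordering
    ]
    if facet_field:
        params += [("facet", "true"), ("facet.field", facet_field)]
    return base_url + "/select?" + urlencode(params)

# Hypothetical two-core setup on one machine.
url = build_sharded_query(
    "http://localhost:8983/solr/core0",
    ["localhost:8983/solr/core0", "localhost:8983/solr/core1"],
    "title:lucene",
    facet_field="genre",
)
print(url)
print(parse_qs(urlsplit(url).query)["shards"])
```

Asking for the score in fl makes it easy to see whether the merged list is really ordered by score across both cores.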


On Tue, Sep 15, 2009 at 7:41 AM, Paul Rosen <p...@performantsoftware.com> wrote:
> Shalin Shekhar Mangar wrote:
>>
>> On Tue, Sep 15, 2009 at 2:39 AM, Paul Rosen
>> <p...@performantsoftware.com> wrote:
>>
>>> I've done a few experiments with searching two cores with the same schema
>>> using the shard syntax. (using solr 1.3)
>>>
>>> My use case is that I want to have multiple cores because a few different
>>> people will be managing the indexing, and that will happen at different
>>> times. The data, however, is homogeneous.
>>>
>>>
>> Multiple cores were not built for distributed search. It is inefficient as
>> compared to a single index. But if you want to use them that way, that's
>> your choice.
>
> Well, I'm experimenting with them because it will simplify index maintenance
> greatly. I am beginning to think that it won't work in my case, though.
>
>>
>>> I've noticed in my tests that the results are not interwoven, but it might
>>> just be my test data. In other words, all the results from one core appear,
>>> then all the results from the other core.
>>>
>>> In thinking about it, it would make sense if the relevancy scores for each
>>> core were completely independent of each other. And that would mean that
>>> there is no way to compare the relevancy scores between the cores.
>>>
>>> In other words, I'd like the following results:
>>>
>>> - really relevant hit from core0
>>> - pretty relevant hit from core1
>>> - kind of relevant hit from core0
>>> - not so relevant hit from core1
>>>
>>> but I get:
>>>
>>> - really relevant hit from core0
>>> - kind of relevant hit from core0
>>> - pretty relevant hit from core1
>>> - not so relevant hit from core1
>>>
>>> So, are the results supposed to be interwoven, and I need to study my data
>>> more, or is this just not something that is possible?
>>>
>>>
>> The only difference wrt relevancy between a distributed search and a
>> single-node search is that there is no distributed IDF and therefore a
>> distributed search assumes a random distribution of terms among shards. I'm
>> not sure if that is what you are seeing.
>>
>>
>>> Also, if this is insurmountable, I've discovered two show stoppers that
>>> will prevent using multicore in my project (counting the lack of support for
>>> faceting in multicore). Are these issues addressed in solr 1.4?
>>>
>>>
>> Can you give more details on what these two issues are?
>>
>
> The first issue is detailed above, where the results from a search over two
> shards don't appear to be returned in relevancy order.
>
> The second issue was detailed in an email last week "shards and facet
> count". The facet information is lost when doing a search over two shards,
> so if I use multicore, I can no longer have facets.
>
>
>
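
To illustrate the "no distributed IDF" point quoted above, here is a rough sketch of how per-shard IDF can produce the non-interwoven ordering described in the thread. The idf function follows the classic Lucene formula, 1 + ln(numDocs / (docFreq + 1)). Everything else is an assumption for illustration: the score is simplified to sqrt(tf) * idf (length norms, coord and queryNorm are ignored), and the document counts and term frequencies are invented.

```python
import math

def idf(doc_freq, num_docs):
    # Classic Lucene-style IDF: 1 + ln(numDocs / (docFreq + 1))
    return 1.0 + math.log(num_docs / (doc_freq + 1))

def score(tf, term_idf):
    # Simplified score (assumption): sqrt(tf) * idf, no norms or queryNorm
    return math.sqrt(tf) * term_idf

def ranked_cores(hits, idf_for_core):
    """Score each (core, tf) hit and return the core names in score order."""
    scored = [(score(tf, idf_for_core[core]), core) for core, tf in hits]
    return [core for _, core in sorted(scored, reverse=True)]

# Invented data: the term is rare in core0 (5 of 1000 docs)
# and common in core1 (400 of 1000 docs).
stats = {"core0": (5, 1000), "core1": (400, 1000)}
hits = [("core0", 9), ("core0", 2), ("core1", 4), ("core1", 1)]

# Each shard scores with its own statistics, as in a distributed search.
local_idf = {core: idf(df, n) for core, (df, n) in stats.items()}

# A single combined index would use one IDF for everything.
total_df = sum(df for df, _ in stats.values())
total_docs = sum(n for _, n in stats.values())
global_idf = {core: idf(total_df, total_docs) for core in stats}

per_shard_order = ranked_cores(hits, local_idf)
single_index_order = ranked_cores(hits, global_idf)
print("per-shard IDF:", per_shard_order)
print("global IDF:   ", single_index_order)
```

With these invented numbers the per-shard ordering lists both core0 hits before either core1 hit, while the single-index ordering alternates between the cores. This only shows that skewed term distribution can cause the effect; it does not establish that this is what happened with the data in this thread.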
