Re: [lucy-user] Lucy questions wrt production, ranking, etc

goran kent Thu, 08 Sep 2011 23:48:00 -0700

Thanks for the quick response Nathan!  See question below.

On Thu, Sep 8, 2011 at 9:58 PM, Nathan Kurz <[email protected]> wrote:
>
>> The environment is distributed search across a cluster with the intent
>> of keeping search-time sub-second - 3s at most (folks are spoilt by
>> the elephant in the industry, so they lose interest if the page does
>> not return in that time).
>>
>> I see from the docs that distributed search is supported, else it
>> would be a non-starter.
>
> This excites me too, but I don't know that anyone is pushing its
> limits yet.  But architecturally, I think it's well designed to allow
> really fast clusters of in-ram search.  Talking about 3 seconds makes
> it sound like you're willing to hit disk:  you might need some intense
> tuning here, depending on how you deal with really common stopwords.
>  Also, there are some limitations with custom sort ordering and the
> like:  clusters are going to deal better with floating point than with
> alphabetical, for example, and


> ... excerpts might be a little clunky to
> retrieve.  Currently it's just a DocID and a score that get returned
> efficiently.

Just to clarify - is obtaining excerpts from a distributed search a
problem?  One would think irrespective of whether you're performing a
local or distributed search the modus operandi would be the same
(without coding gymnastics required to glue things together to work as
expected).
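
For illustration only, here is a minimal Python sketch of the two-phase pattern under discussion: each node returns just (doc ID, score) pairs, the caller merges them, and excerpts need a second round trip to whichever node owns each winning document. It does not use Lucy's API. Every name in it (Hit, Shard, search, fetch_excerpt, distributed_search) is hypothetical, and the term-count scoring is a placeholder, not Lucy's ranking.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Hit:
    shard_id: int
    doc_id: int
    score: float


class Shard:
    """Hypothetical stand-in for one search node; not a Lucy class."""

    def __init__(self, shard_id, docs):
        self.shard_id = shard_id
        self.docs = docs  # maps doc_id -> document text

    def search(self, term, k):
        # Phase 1: return only (doc_id, score) pairs, which are cheap to ship.
        # Placeholder scoring: number of exact word matches for the term.
        hits = []
        for doc_id, text in self.docs.items():
            score = float(text.lower().split().count(term.lower()))
            if score > 0:
                hits.append(Hit(self.shard_id, doc_id, score))
        return heapq.nlargest(k, hits, key=lambda h: h.score)

    def fetch_excerpt(self, doc_id, term, width=20):
        # Phase 2: the extra round trip needed to build an excerpt.
        text = self.docs[doc_id]
        pos = text.lower().find(term.lower())
        if pos < 0:
            return text[: 2 * width]
        return text[max(0, pos - width): pos + len(term) + width]


def distributed_search(shards, term, k):
    by_id = {s.shard_id: s for s in shards}
    # Merge the per-shard top-k lists into a global top-k by score.
    merged = heapq.nlargest(
        k,
        (hit for s in shards for hit in s.search(term, k)),
        key=lambda h: h.score,
    )
    # Excerpts are fetched only for the merged winners, one call per hit,
    # from the shard that holds the document.
    return [(h, by_id[h.shard_id].fetch_excerpt(h.doc_id, term)) for h in merged]
```

In this sketch a doc ID is only meaningful together with its shard ID, so the merge step has to carry both along. That bookkeeping is the kind of glue the question above is asking about.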
