Thanks for the quick response, Nathan! See my question below.

On Thu, Sep 8, 2011 at 9:58 PM, Nathan Kurz <[email protected]> wrote:
>
>> The environment is distributed search across a cluster with the intent
>> of keeping search-time sub-second - 3s at most (folks are spoilt by
>> the elephant in the industry, so they lose interest if the page does
>> not return in that time).
>>
>> I see from the docs that distributed search is supported, else it
>> would be a non-starter.
>
> This excites me too, but I don't know that anyone is pushing its
> limits yet. But architecturally, I think it's well designed to allow
> really fast clusters of in-RAM search. Talking about 3 seconds makes
> it sound like you're willing to hit disk: you might need some intense
> tuning here, depending on how you deal with really common stopwords.
> Also, there are some limitations with custom sort ordering and the
> like: clusters are going to deal better with floating point than with
> alphabetical, for example, and
> ... excerpts might be a little clunky to
> retrieve. Currently it's just a DocID and a score that get returned
> efficiently.

Just to clarify: is obtaining excerpts from a distributed search a
problem? One would think that, irrespective of whether you're performing
a local or a distributed search, the modus operandi would be the same,
with no coding gymnastics required to glue things together so that they
work as expected.
