Doug Cutting wrote:
>Yes, there is some duplication between MultiSearcher and SegmentsReader.
>The reason for keeping these separate was to support distributed searching.
>
I see.
>
>Thus the Searcher API is designed to have only small bits of data pass
>through it. I never actually implemented distributed searching, so this
>design is somewhat half-baked.
>
>The general idea is that query terms must be passed to the searcher first to
>weight the query, then, once the query is weighted, it can be sent to a set
>of searchers in parallel.
>
>To implement this, we would need to do something like:
>
>1. Move the abstract Searcher methods to an interface:
>   public interface Searchable {
>     int docFreq(Term term) throws IOException;
>     int maxDoc() throws IOException;
>     TopDocs search(Query query, Filter filter, int n) throws IOException;
>     Document doc(int i) throws IOException;
>   }
>
>2. Implement a RemoteSearcher using RMI.
>
>3. Change MultiSearcher.search() to search each sub-index in a separate
>thread.
>
>The low-level search API doesn't really fit in here too well.
>
>Note that, except for the search() method, the Searchable interface is a
>subset of IndexReader, so it still might make sense to somehow combine the
>notions of Searcher and IndexReader. But we should keep distributed
>searching in mind when this is done. If you are interested in drafting such
>a re-design, I'd love to see it.
>
Perhaps. I will have to revisit this when we get to the next level of
scaling our product. It's good to know there is a plan, though. :)
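
As a rough illustration of the plan quoted above (gather index-wide
statistics to weight the query, then search each sub-index in its own
thread and merge the results), here is a minimal Java sketch. It does not
use the real Lucene classes: SubIndex and Hit are hypothetical stand-ins
for Searchable and the entries of TopDocs, the query is a plain string,
and the merge assumes document numbers are offset by the maxDoc() of the
preceding sub-indexes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelSearchSketch {

    /** Hypothetical stand-in for one TopDocs entry: document number plus score. */
    public static final class Hit {
        public final int doc;
        public final float score;
        public Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    /** Hypothetical, simplified stand-in for the Searchable interface (assumption). */
    public interface SubIndex {
        int docFreq(String term) throws Exception;
        int maxDoc() throws Exception;
        List<Hit> search(String weightedQuery, int n) throws Exception;
    }

    /** Step one: sum docFreq over all sub-indexes so the query can be
     *  weighted with the same statistics everywhere. */
    public static int globalDocFreq(List<SubIndex> subs, String term) throws Exception {
        int total = 0;
        for (SubIndex s : subs) {
            total += s.docFreq(term);
        }
        return total;
    }

    /** Step two: run the weighted query on every sub-index in parallel,
     *  then merge the per-index hits into one top-n list. */
    public static List<Hit> search(List<SubIndex> subs, String weightedQuery, int n)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, subs.size()));
        try {
            // Submit one task per sub-index; each runs on its own thread.
            List<Future<List<Hit>>> pending = new ArrayList<>();
            for (SubIndex s : subs) {
                pending.add(pool.submit(() -> s.search(weightedQuery, n)));
            }
            List<Hit> merged = new ArrayList<>();
            int offset = 0; // first merged document number of the current sub-index
            for (int i = 0; i < subs.size(); i++) {
                for (Hit h : pending.get(i).get()) {
                    merged.add(new Hit(h.doc + offset, h.score));
                }
                offset += subs.get(i).maxDoc();
            }
            // Highest score first, then keep at most n hits.
            merged.sort(Comparator.comparingDouble((Hit h) -> -h.score));
            return new ArrayList<>(merged.subList(0, Math.min(n, merged.size())));
        } finally {
            pool.shutdown();
        }
    }
}
```

A remote sub-index (step 2 of the quoted plan) would only need to
implement SubIndex on top of RMI; the merge code would not change. Only
term statistics, the query, and n hits per sub-index cross the interface,
which matches the goal of passing small bits of data through it.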
>