Re: multithreading in SegmentsReader

Dmitry Serebrennikov Thu, 11 Oct 2001 12:07:57 -0700

Doug Cutting wrote:

>Yes, there is some duplication between MultiSearcher and SegmentsReader.
>The reason for keeping these separate was to support distributed searching.
>
I see.

>
>Thus the Searcher API is designed to have only small bits of data pass
>through it.  I never actually implemented distributed searching, so this
>design is somewhat half-baked.
>
>The general idea is that query terms must be passed to the searcher first to
>weight the query, then, once the query is weighted, it can be sent to a set
>of searchers in parallel.
>
>To implement this, we would need to do something like:
>
>1. Move the abstract Searcher methods to an interface:
>  public interface Searchable {
>    int docFreq(Term term) throws IOException;
>    int maxDoc() throws IOException;
>    TopDocs search(Query query, Filter filter, int n) throws IOException;
>    Document doc(int i) throws IOException;
>  }
>
>2. Implement a RemoteSearcher using RMI.
>
>3. Change MultiSearcher.search() to search each sub-index in a separate
>thread.
>
>The low-level search API doesn't really fit in here too well.
>
>Note that, except for the search() method, the Searchable interface is a
>subset of IndexReader, so it still might make sense to somehow combine the
>notions of Searcher and IndexReader.  But we should keep distributed
>searching in mind when this is done.  If you are interested in drafting such
>a re-design, I'd love to see it.
>
Perhaps. I will have to revisit this when we get to the next level of 
scaling our product. It's good to know there is a plan, though. :)
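
For anyone reading this thread later, step 3 of the quoted plan (search each
sub-index in its own thread, then merge) could look roughly like the sketch
below. Everything in it is an assumption made for illustration: Hit,
SubSearcher and ParallelMultiSearch are hypothetical stand-ins, not Lucene
classes; the query is a plain String rather than a Query; Filter is left out;
and it uses java.util.concurrent and lambdas, which only exist in later Java
versions. It also assumes the query has already been weighted globally, as
described above, so that scores from different sub-indexes are comparable.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical scored hit; stands in for whatever TopDocs would carry.
final class Hit {
    final int doc;
    final float score;
    Hit(int doc, float score) { this.doc = doc; this.score = score; }
}

// Hypothetical, trimmed-down take on the Searchable idea from the quoted plan.
interface SubSearcher {
    int maxDoc() throws IOException;
    List<Hit> search(String query, int n) throws IOException;
}

final class ParallelMultiSearch {
    // Runs the query against every sub-searcher in parallel and returns the
    // top n hits overall, with document numbers shifted into one global space.
    static List<Hit> search(List<SubSearcher> searchers, String query, int n)
            throws IOException, InterruptedException {
        if (searchers.isEmpty()) {
            return new ArrayList<>();
        }
        // Offset of each sub-index in the combined document numbering.
        int[] starts = new int[searchers.size()];
        int base = 0;
        for (int i = 0; i < searchers.size(); i++) {
            starts[i] = base;
            base += searchers.get(i).maxDoc();
        }
        ExecutorService pool = Executors.newFixedThreadPool(searchers.size());
        try {
            // One task per sub-index; each asks for its own top n.
            List<Future<List<Hit>>> futures = new ArrayList<>();
            for (SubSearcher s : searchers) {
                futures.add(pool.submit(() -> s.search(query, n)));
            }
            List<Hit> merged = new ArrayList<>();
            for (int i = 0; i < futures.size(); i++) {
                try {
                    for (Hit h : futures.get(i).get()) {
                        merged.add(new Hit(h.doc + starts[i], h.score));
                    }
                } catch (ExecutionException e) {
                    throw new IOException("sub-search " + i + " failed", e.getCause());
                }
            }
            // Highest score first; the sort is stable, so ties keep sub-index order.
            merged.sort((a, b) -> Float.compare(b.score, a.score));
            return new ArrayList<>(merged.subList(0, Math.min(n, merged.size())));
        } finally {
            pool.shutdown();
        }
    }
}
```

Only the query string and a handful of (doc, score) pairs cross the
SubSearcher boundary, which is the "small bits of data" property the quoted
design is after. A remote implementation of the same interface would slot in
without changing the merge logic.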
