> From: Mark Harwood [mailto:[EMAIL PROTECTED]]
> 
> I have written up some of my experiences with creating a distributed
> system with Lucene here:
>
> http://home.clara.net/markharwood/lucene/
>
> It includes some UML interaction diagrams that I found useful in
> understanding the Lucene codebase.

Mark,

It's great to see someone experimenting with this.  I originally had
distributed searching in mind when I wrote Lucene, but never quite got to
adding it.  A message that mentions some of these intentions is at:
  http://www.mail-archive.com/[email protected]/msg00252.html

A less "chatty" interface than the one mentioned there might be:

  public interface Searchable {
    // Statistics for a batch of terms, returned in a single call.
    public class TermStatistics implements Serializable {
      public int[] docFreqs;
      public int maxDoc;
    }
    TermStatistics getTermStatistics(Term[] terms) throws IOException;
    TopDocs search(Query query, Filter filter, int n) throws IOException;
    Document[] getDocs(int[] i) throws IOException;
  }

With these three phases (collect term statistics, get doc id scores, get
docs) the results should be identical to searching the indexes locally with
MultiSearcher.  It sounded like your experiments skipped the first phase.
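
A rough sketch of why the first phase matters, under stated assumptions: if each index scores with only its own document frequencies, the same term gets a different weight in each index, and the merged ranking no longer matches a local search. Summing the per-index statistics first gives every index the same global numbers. The class below is a stand-in and not Lucene code; the names merge and idf are hypothetical, and the idf formula is an assumed log-based one used only for illustration.

```java
import java.io.Serializable;

final class TermStatsMergeSketch {
    /** Mirrors the TermStatistics holder in the interface above. */
    static final class TermStatistics implements Serializable {
        private static final long serialVersionUID = 1L;
        final int[] docFreqs;
        final int maxDoc;

        TermStatistics(int[] docFreqs, int maxDoc) {
            this.docFreqs = docFreqs;
            this.maxDoc = maxDoc;
        }
    }

    /** Sums per-index statistics; assumes every index was asked about the same terms in the same order. */
    static TermStatistics merge(TermStatistics[] perIndex) {
        int termCount = perIndex.length == 0 ? 0 : perIndex[0].docFreqs.length;
        int[] docFreqs = new int[termCount];
        int maxDoc = 0;
        for (TermStatistics stats : perIndex) {
            for (int i = 0; i < termCount; i++) {
                docFreqs[i] += stats.docFreqs[i];
            }
            maxDoc += stats.maxDoc;
        }
        return new TermStatistics(docFreqs, maxDoc);
    }

    /** Assumed log-based idf, for illustration only; the real formula is whatever the scoring code uses. */
    static double idf(int docFreq, int maxDoc) {
        return Math.log(maxDoc / (double) (docFreq + 1)) + 1.0;
    }

    public static void main(String[] args) {
        // One term: rare in index A, common in index B.
        TermStatistics a = new TermStatistics(new int[] {3}, 100);
        TermStatistics b = new TermStatistics(new int[] {40}, 50);
        TermStatistics all = merge(new TermStatistics[] {a, b});
        System.out.println("idf in index A alone:  " + idf(a.docFreqs[0], a.maxDoc));
        System.out.println("idf in index B alone:  " + idf(b.docFreqs[0], b.maxDoc));
        System.out.println("idf from merged stats: " + idf(all.docFreqs[0], all.maxDoc));
    }
}
```

The three printed weights differ, which is the discrepancy that skipping the first phase would introduce into the merged scores.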

Probably it would be worth writing a MultiThreadSearcher that spawns a
thread for each sub-search, then waits for all to finish before merging the
results.
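
The thread-per-sub-search idea could be sketched roughly as follows. This is a minimal illustration and not an implementation against the Lucene API: Hit and SubSearch are hypothetical stand-ins for a scored document and a sub-index search, and the sketch assumes each sub-search already returns document ids in the merged numbering.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class MultiThreadSearchSketch {
    /** Hypothetical stand-in for a scored document. */
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    /** Hypothetical stand-in for searching one sub-index for its top n hits. */
    interface SubSearch {
        List<Hit> search(int n) throws Exception;
    }

    /** Runs one sub-search on its own thread and keeps the result or the failure. */
    private static final class Worker extends Thread {
        private final SubSearch sub;
        private final int n;
        List<Hit> hits;
        Exception error;

        Worker(SubSearch sub, int n) {
            this.sub = sub;
            this.n = n;
        }

        @Override
        public void run() {
            try {
                hits = sub.search(n);
            } catch (Exception e) {
                error = e;
            }
        }
    }

    /** Spawns a thread per sub-search, waits for all of them, then merges the top n by score. */
    static List<Hit> searchAll(List<SubSearch> subs, int n) throws Exception {
        List<Worker> workers = new ArrayList<>();
        for (SubSearch sub : subs) {
            Worker worker = new Worker(sub, n);
            workers.add(worker);
            worker.start();
        }
        // join() also makes each worker's fields safely visible to this thread.
        for (Worker worker : workers) {
            worker.join();
        }
        List<Hit> merged = new ArrayList<>();
        for (Worker worker : workers) {
            if (worker.error != null) {
                throw worker.error;
            }
            merged.addAll(worker.hits);
        }
        merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return new ArrayList<>(merged.subList(0, Math.min(n, merged.size())));
    }
}
```

Each sub-search is asked for n hits because, in the worst case, all of the overall top n come from a single index.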

So, if you are able to work on this more, it would be great to figure out
what it would take to make Query serializable, to convert the Searcher
implementations to use the above interface in place of the existing similar
abstract methods, and finally to implement an RMI-based RemoteSearcher.
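
To make the serialization requirement concrete, here is a small sketch under stated assumptions. RMI marshals non-remote arguments with Java serialization, so a query passed to a remote searcher has to survive a round trip like the one below. SimpleTermQuery and RemoteSearchable are hypothetical stand-ins invented for this illustration; they are not Lucene classes and do not reflect how Lucene's Query is actually structured.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.rmi.Remote;
import java.rmi.RemoteException;

final class RemoteSearchSketch {
    /** Hypothetical query on a single field/term pair; every field it holds must itself be serializable. */
    static final class SimpleTermQuery implements Serializable {
        private static final long serialVersionUID = 1L;
        final String field;
        final String text;

        SimpleTermQuery(String field, String text) {
            this.field = field;
            this.text = text;
        }
    }

    /** Hypothetical remote view of a searcher; RMI requires each remote method to declare RemoteException. */
    interface RemoteSearchable extends Remote {
        int[] search(SimpleTermQuery query, int n) throws RemoteException;
    }

    /** Copies a value through Java serialization, the same mechanism RMI uses for non-remote arguments. */
    static <T extends Serializable> T roundTrip(T value) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }
}
```

A round-trip check of this kind is a cheap way to find non-serializable fields in a query class before any remote call is attempted.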

Doug
