> From: Mark Harwood [mailto:[EMAIL PROTECTED]]
>
> I have written up some of my experiences with creating a
> distributed system with Lucene here:
>
> http://home.clara.net/markharwood/lucene/
>
> It includes some UML interaction diagrams that I found useful
> in understanding the Lucene codebase.

Mark,

It's great to see someone experimenting with this. I originally had distributed searching in mind when I wrote Lucene, but never quite got to adding it. A message that mentions some of these intentions is at:

  http://www.mail-archive.com/[email protected]/msg00252.html

A less "chatty" interface than the one mentioned there might be:

  public interface Searchable {
    public class TermStatistics implements Serializable {
      public int[] docFreqs;
      public int maxDoc;
    }

    // Phase 1: collect term statistics.
    TermStatistics getTermStatistics(Term[] terms) throws IOException;

    // Phase 2: get the ids and scores of the top n docs.
    TopDocs search(Query query, Filter filter, int n) throws IOException;

    // Phase 3: get the docs themselves.
    Document[] getDocs(int[] i) throws IOException;
  }

With these three phases (collect term statistics, get doc id scores, get docs) the results should be identical to searching the indexes locally with MultiSearcher. It sounded like your experiments skipped the first phase.

Probably it would be worth writing a MultiThreadSearcher that spawns a thread for each sub-search, then waits for all to finish before merging the results.

So, if you are able to work on this more, it would be great to figure out what it would take to make Query serializable, to convert the Searcher implementations to use the above interface in place of the existing similar abstract methods, and finally to implement an RMI-based RemoteSearcher.

Doug
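
For illustration, the MultiThreadSearcher idea above (one thread per sub-search, wait for all, then merge) could be sketched roughly as follows. This is only a sketch under stated assumptions, not Lucene code: the class MultiThreadSearcherSketch and its Hit and SubSearcher types are hypothetical stand-ins, the query is a plain String rather than a Query, and the scores are assumed to be comparable across sub-indexes, which only holds if the term-statistics phase has already been applied globally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: none of these types are Lucene's real classes.
final class MultiThreadSearcherSketch {

    /** One scored hit: the sub-index it came from, its local doc id, its score. */
    static final class Hit {
        final int searcher;
        final int doc;
        final float score;

        Hit(int searcher, int doc, float score) {
            this.searcher = searcher;
            this.doc = doc;
            this.score = score;
        }
    }

    /** Stand-in for one sub-search (the "get doc id scores" phase). */
    interface SubSearcher {
        List<Hit> search(String query, int n) throws Exception;
    }

    /** Runs every sub-search on its own thread, then merges the top n hits. */
    static List<Hit> search(List<SubSearcher> subs, String query, int n) throws Exception {
        if (subs.isEmpty()) {
            return new ArrayList<>();
        }
        ExecutorService pool = Executors.newFixedThreadPool(subs.size());
        try {
            List<Callable<List<Hit>>> tasks = new ArrayList<>();
            for (SubSearcher sub : subs) {
                tasks.add(() -> sub.search(query, n));
            }
            List<Hit> merged = new ArrayList<>();
            // invokeAll blocks until every sub-search has completed.
            for (Future<List<Hit>> result : pool.invokeAll(tasks)) {
                // get() rethrows a failed sub-search as an ExecutionException.
                merged.addAll(result.get());
            }
            // Highest score first; the sort is stable, so ties keep sub-searcher order.
            merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
            return new ArrayList<>(merged.subList(0, Math.min(n, merged.size())));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Two fake sub-indexes returning canned hits (they ignore the query and n).
        List<SubSearcher> subs = new ArrayList<>();
        subs.add((q, n) -> Arrays.asList(new Hit(0, 3, 0.9f), new Hit(0, 7, 0.4f)));
        subs.add((q, n) -> Arrays.asList(new Hit(1, 1, 0.7f), new Hit(1, 5, 0.2f)));

        for (Hit h : search(subs, "lucene", 3)) {
            System.out.println("searcher=" + h.searcher + " doc=" + h.doc + " score=" + h.score);
        }
    }
}
```

A remote variant would presumably keep the same shape, with each SubSearcher forwarding its call over RMI; the merge step would not need to change.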
