Hello Mark,

> I have written up some of my experiences with creating a distributed system
> with Lucene here:
> http://home.clara.net/markharwood/lucene/
> It includes some UML interaction diagrams that I found useful in understanding
> the Lucene codebase.

Very interesting work! Some quick comments:

* First, a very minor detail: your diagrams are quite big. For teaching
  purposes, I'd suggest removing the stub/skel instances (RMI is supposed to
  make them transparent to the developer anyway), and the anonymous classes
  as well.

* I am not sure RMI is the best way of making a distributed search engine
  scalable. What about a messaging service such as JMS, to cope with the
  scalability and bottleneck problems that come with the index
  readers/writers?

* I think I roughly understand the issues around a distributed search engine,
  but it is not clear to me what exactly is distributed in it. I mean, how is
  the data partitioned across the distributed system? Should it be randomly
  distributed (which implies sending the query to all the nodes that host an
  index), or is it possible to distribute it according to some rule, for
  example per field or per class of document (in which case the query is sent
  to only a subset of the indexing nodes)?

I wish there were such a sequence diagram for the index package as well :-)

Rodrigo
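
P.S. To make the partitioning question a bit more concrete, here is a minimal sketch of the two routing options I have in mind. It is plain Java and does not use any Lucene or RMI API; the class name, the method names and the node names are all hypothetical, made up for illustration only. I am not claiming this is how your system works.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: none of these names come from Lucene.
public class PartitionRouting {

    // Random partitioning: any node may hold a matching document,
    // so the query has to be sent to every node that hosts an index.
    static List<String> nodesForRandomPartitioning(List<String> allNodes) {
        return new ArrayList<>(allNodes);
    }

    // Rule-based partitioning: each class of document is assigned to known
    // nodes, so only that subset is queried. A class with no rule falls back
    // to querying all nodes (an assumption of this sketch).
    static List<String> nodesForDocumentClass(Map<String, List<String>> nodesByClass,
                                              List<String> allNodes,
                                              String documentClass) {
        List<String> nodes = nodesByClass.get(documentClass);
        return new ArrayList<>(nodes != null ? nodes : allNodes);
    }

    public static void main(String[] args) {
        List<String> allNodes = Arrays.asList("node1", "node2", "node3");

        Map<String, List<String>> nodesByClass = new LinkedHashMap<>();
        nodesByClass.put("news", Arrays.asList("node1"));
        nodesByClass.put("manuals", Arrays.asList("node2", "node3"));

        System.out.println("random partitioning: "
                + nodesForRandomPartitioning(allNodes));
        System.out.println("class 'news':        "
                + nodesForDocumentClass(nodesByClass, allNodes, "news"));
        System.out.println("class with no rule:  "
                + nodesForDocumentClass(nodesByClass, allNodes, "patents"));
    }
}
```

With the first option every query costs one request per node; with the second, the cost depends on how well the rule matches the queries people actually send.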
