On Sat, Nov 5, 2011 at 11:23 AM, Marvin Humphrey <[email protected]> wrote:
> The most straightforward solution is to eliminate PolySearcher from the
> equation and to create a class that combines the functionality of PolySearcher
> and SearchClient. Fortunately, neither of them is particularly large or
> complex, so the task is very doable.
>
> I propose that we name this new class LucyX::Remote::ClusterSearcher.

For Goran, who is trying to get something working quickly in Perl, this seems
like a great solution. Get to it!

For Lucy as a whole, I think there are some meta-questions that should be
resolved before we go down this path.

1) How core is this to Lucy's functionality?
2) How much should we depend on outside libraries?
3) How independent should the Searcher and the Clients be?
4) How future-proof and scalable do we want this solution to be?

My position would be that while search clusters are essential to Lucy, our
core competency is fast search rather than reliable networking, and thus we
should use well-tested external libraries rather than expanding our scope. I
think the remote Clients and the central Searcher should be essentially
independent of each other and of this networking layer. And I think that we
should aim to make it scale to the moon.

Fleshing this out a little bit, I think we should prefer libev in C over
IO::Select in Perl, and that we should prefer something high-level like ZeroMQ
over dealing with libev directly. I think we should have a well-defined query
and response format using something like Google's Protocol Buffers rather than
serializing objects directly. I think a good goal would be allowing Lucene
with a wrapper to act as a Client.

Marvin: could you offer a high-level overview of how cluster search would
ideally work, with particular emphasis on what gets passed over the wire and
what out-of-band coordination is needed between Searcher and Clients?

--nate
