On 9/21/07, Mike Klaas <[EMAIL PROTECTED]> wrote:
> On 21-Sep-07, at 11:08 AM, Yonik Seeley wrote:
>
> > I wanted to take a step back for a second and think about if HTTP was
> > really the right choice for the transport for distributed search.
> >
> > I think the high-level approach in SOLR-303 is the right way to go
> > about it, but I'm unsure if HTTP is the right transport.
>
> I don't know anything about RMI, but is it possible to do 100's of
> simultaneous asynchronous requests cheaply?

Good question... probably only important for really big clusters (like
yours), but it would be nice.

Even if we go HTTP, I'm not sure it will be async at first - does
HTTPClient even support async?

I assume when you say async that you mean getting rid of the
thread-per-connection via NIO.  Some protocols do "async" by handing
off the request to another thread to wait on the response and then do
a callback to the original thread - this is async with respect to the
original calling thread, but still requires a thread-per-connection.
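
To make the distinction concrete, here is a minimal sketch (Python only for brevity; the fake requests, shard names, and function names are all made up for illustration and have nothing to do with Solr or HTTPClient). The first function does hand-off "async": the caller gets callbacks, but each outstanding request still parks a worker thread. The second multiplexes every outstanding request on one thread through an event loop, which is roughly what NIO would buy us.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor


def fan_out_with_threads(num_shards):
    """Hand-off style: return (sorted results, number of threads used)."""
    # The barrier only releases once every request is in flight at once,
    # standing in for shards that all take a while to answer.
    all_in_flight = threading.Barrier(num_shards)
    workers = set()
    results = []
    lock = threading.Lock()

    def blocking_request(shard):
        with lock:
            workers.add(threading.get_ident())
        all_in_flight.wait(timeout=5)  # blocks the thread, like a socket read
        return f"shard-{shard}: ok"

    def on_response(future):
        # Callback: the submitting thread never blocked on this request.
        with lock:
            results.append(future.result())

    with ThreadPoolExecutor(max_workers=num_shards) as pool:
        for shard in range(num_shards):
            pool.submit(blocking_request, shard).add_done_callback(on_response)
    return sorted(results), len(workers)


def fan_out_with_event_loop(num_shards):
    """Non-blocking style: return (sorted results, number of threads used)."""
    workers = set()

    async def fake_request(shard):
        workers.add(threading.get_ident())
        await asyncio.sleep(0.01)  # yields to the loop instead of blocking
        return f"shard-{shard}: ok"

    async def run_all():
        return await asyncio.gather(*(fake_request(s) for s in range(num_shards)))

    return sorted(asyncio.run(run_all())), len(workers)
```

With 8 fake shards the first version ends up using 8 threads and the second uses 1, even though both look asynchronous from the caller's side.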

Of course HTTP has some issues too - you effectively need a separate
connection per outstanding request.  Pipelining won't work well
because responses have to come back in request order, so one slow
response holds up everything queued behind it.  I'm not sure if RMI
has this limitation as well.
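
A rough sketch of why in-order delivery hurts, with made-up numbers: assume three requests are pipelined on one connection, the server works on all of them concurrently (a generous assumption), and each entry is the time at which that response is ready. The function name and the timings are hypothetical.

```python
from itertools import accumulate


def pipelined_delivery_times(ready_times):
    """Time each response can actually be delivered on one pipelined connection.

    Responses must go out in request order, so a response cannot be
    delivered before any response ahead of it in the queue.
    """
    return list(accumulate(ready_times, max))


# One slow request (5 time units) followed by two fast ones (1 unit each).
ready = [5, 1, 1]
print(pipelined_delivery_times(ready))  # [5, 5, 5]
```

With a separate connection per request the same responses would arrive at 5, 1, and 1, which is why each outstanding request effectively wants its own connection.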

> FWIW, our distributed search uses http over 120+ shards... and is
> written in python.

That would be an awesome test case if you were able to use what Solr
is going to provide out-of-the-box.  Any unusual requirements?

-Yonik
