[ https://issues.apache.org/jira/browse/SOLR-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678242#action_12678242 ]
noble.paul edited comment on SOLR-1044 at 3/3/09 12:30 AM:
-----------------------------------------------------------

bq. Is our use of HTTP really a bottleneck?

We are limited by the servlet engine's ability to serve requests. I would guess it peaks out at 600-800 req/sec, whereas an NIO-based system can serve far more with lower latency (http://www.jboss.org/netty/performance.html). For a request served out of the cache (no Lucene search involved), the only overhead is that of HTTP, plus the overhead of the servlet engine itself. Moreover, HTTP is not very efficient for a large volume of small requests.

bq. My feeling has been that if we go to a call mechanism, it should be based on something more standard that will have many off the shelf bindings - perl, python, php, C, etc.

I agree. Hadoop looked like a simple RPC mechanism, but we can choose any (Thrift, Etch, Grizzly etc.). We would rely on these for the transport alone; the payload would have to be our own, say xml/json/javabin etc. None of them yet supports a flexible format.

bq. That can also be a potential weakness though I think... a slow reader or writer for one request/response hangs up all the others.

The requests on the server are served by multiple handlers (each one is a thread). One request will not block another if there are enough handlers/threads.

> Use Hadoop RPC for inter Solr communication
> -------------------------------------------
>
>                 Key: SOLR-1044
>                 URL: https://issues.apache.org/jira/browse/SOLR-1044
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>            Reporter: Noble Paul
>
> Solr uses HTTP for distributed search. We can make it a whole lot faster if
> we use an RPC mechanism which is more lightweight/efficient.
> Hadoop RPC looks like a good candidate for this.
> The implementation should just have one protocol. It should follow Solr's
> idiom of making remote calls: a uri + params + [optional stream(s)]. The
> response can be a stream of bytes.
> To make this work we must make the SolrServer implementation pluggable in
> distributed search. Users should be able to choose between the current
> CommonsHttpSolrServer and a HadoopRpcSolrServer.

--
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
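The call idiom proposed above — a uri + params + [optional stream(s)] in, a stream of bytes out, behind a pluggable SolrServer — can be sketched as a small Java interface with a pool of handler threads, matching the comment's claim that one slow request does not block the others while free handlers remain. The names `SolrTransport` and `LocalTransport` below are hypothetical illustrations, not Solr's actual API; a real `HadoopRpcSolrServer` would marshal the same triple over the wire instead of echoing in-process.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical pluggable transport, following the issue's idiom:
// uri + params + optional stream in, a stream of bytes out.
interface SolrTransport {
    byte[] request(String uri, Map<String, String> params, byte[] body) throws Exception;
}

// In-JVM stand-in for the CommonsHttpSolrServer / HadoopRpcSolrServer choice.
class LocalTransport implements SolrTransport {
    // A fixed pool of handler threads: one slow request does not block
    // another as long as free handlers remain. Daemon threads let the
    // JVM exit without an explicit shutdown.
    private final ExecutorService handlers = Executors.newFixedThreadPool(4, r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });

    public byte[] request(String uri, Map<String, String> params, byte[] body) throws Exception {
        Future<byte[]> f = handlers.submit(() -> {
            // Echo the call back; a real server would dispatch on the uri
            // and return the response payload (xml/json/javabin etc.).
            String reply = uri + "?" + params;
            return reply.getBytes(StandardCharsets.UTF_8);
        });
        return f.get(5, TimeUnit.SECONDS);
    }
}

public class TransportDemo {
    public static void main(String[] args) throws Exception {
        SolrTransport server = new LocalTransport(); // pluggable: swap the implementation here
        byte[] out = server.request("/select", Map.of("q", "solr"), null);
        System.out.println(new String(out, StandardCharsets.UTF_8)); // prints /select?{q=solr}
    }
}
```

Because callers only see the `SolrTransport` interface, distributed search code would not change when the transport is swapped — which is exactly the pluggability the issue asks for.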