Ophir, this sounds a bit strange:
> CommonsHttpSolrServer.java, line 416 takes about 95% of the application's
> total search time

Is this only for heavy load?

Some other things:

 * with Lucene you accessed the indices with MultiSearcher in a LAN, right?
 * did you look into the logs of the servers, is there something
   wrong/delayed?
 * did you enable gzip compression for your servers or even the binary
   writer/parser for your Solr clients?

    CommonsHttpSolrServer server = ...
    server.setRequestWriter(new BinaryRequestWriter());
    server.setParser(new BinaryResponseParser());

Regards,
Peter.

> [posted this yesterday in lucene-user mailing list, and was advised to
> post it here instead. Excuse me for spamming]
>
> Hi,
>
> I'm currently involved in a project of migrating from Lucene 2.9.1 to Solr
> 1.4.0.
> During stress testing, I encountered this performance problem:
> While actual search times in our shards (which are now running Solr) have
> not changed, the total time it takes for a query has increased dramatically.
> During this performance test, we of course do not modify the indexes.
> Our application is sending Solr select queries concurrently to the 8 shards,
> using CommonsHttpSolrServer.
> I added some timing debug messages, and found that
> CommonsHttpSolrServer.java, line 416 takes about 95% of the application's
> total search time:
>     int statusCode = _httpClient.executeMethod(method);
>
> Just to clarify: looking at access logs of the Solr shards, TTLB for a query
> might be around 5 ms (on all shards), but httpClient.executeMethod() for
> this query can be much higher - say, 50 ms.
> On average, queries take 12 ms under light load and around 22 ms under
> heavy load.
>
> Another route we tried to pursue is to add the "shards=shard1,shard2,…"
> parameter to the query instead of doing this ourselves, but this doesn't
> seem to work due to an NPE caused by QueryComponent.returnFields(), line
> 553:
>     if (returnScores && sdoc.score != null) {
>
> where sdoc is null.
> I saw there is a null check on trunk, but since we're
> currently using Solr 1.4.0's ready-made WAR file, I didn't see an easy way
> around this.
> Note: we're using a custom query component which extends QueryComponent, but
> debugging this, I saw nothing wrong with the results at this point in the
> code.
>
> Our previous code used HTTP in a different manner:
> For each request, we created a new
> sun.net.www.protocol.http.HttpURLConnection, and called its getInputStream()
> method.
> Under the same load as the new application, the old application does not
> encounter the delays mentioned above.
>
> Our current code is initializing CommonsHttpSolrServer for each shard this
> way:
>     MultiThreadedHttpConnectionManager httpConnectionManager = new MultiThreadedHttpConnectionManager();
>     httpConnectionManager.getParams().setTcpNoDelay(true);
>     httpConnectionManager.getParams().setMaxTotalConnections(1024);
>     httpConnectionManager.getParams().setStaleCheckingEnabled(false);
>     HttpClient httpClient = new HttpClient();
>     HttpClientParams params = new HttpClientParams();
>     params.setCookiePolicy(CookiePolicy.IGNORE_COOKIES);
>     params.setAuthenticationPreemptive(false);
>     params.setContentCharset(StringConstants.UTF8);
>     httpClient.setParams(params);
>     httpClient.setHttpConnectionManager(httpConnectionManager);
>
> and passing the new HttpClient to the Solr Server:
>     solrServer = new CommonsHttpSolrServer(coreUrl, httpClient);
>
> We tried two different ways - one with a single
> MultiThreadedHttpConnectionManager and HttpClient for all the SolrServer
> instances, and the other with a new MultiThreadedHttpConnectionManager and
> HttpClient for each SolrServer.
> Both attempts yielded similar performance results.
> We also tried giving setMaxTotalConnections() a much higher number of
> connections (1,000,000) - it didn't have an effect.
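
To make the client-side numbers in the quoted message easier to compare with the shard access logs, the measurement could be sketched roughly as below. This is only an illustration using plain JDK classes (Java 8 or later): the class name ShardTimingSketch, the helpers timeMillis and fanOut, and the 5 ms sleep standing in for a shard request are all hypothetical and do not come from the application or from SolrJ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of SolrJ or of the application discussed above.
class ShardTimingSketch {

    /** Wall-clock milliseconds spent inside one call, as seen by the client. */
    static long timeMillis(Callable<?> call) throws Exception {
        long start = System.nanoTime();
        call.call();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    /** Runs all shard calls concurrently; returns the client-side time per call, in input order. */
    static List<Long> fanOut(List<Callable<?>> shardCalls) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(shardCalls.size());
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (Callable<?> call : shardCalls) {
                Callable<Long> timed = () -> timeMillis(call);
                futures.add(pool.submit(timed));
            }
            List<Long> timings = new ArrayList<>();
            for (Future<Long> future : futures) {
                timings.add(future.get()); // blocks until that shard call has finished
            }
            return timings;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Callable<?>> calls = new ArrayList<>();
        for (int shard = 0; shard < 8; shard++) {
            // Assumption: a 5 ms sleep stands in for the real request to one shard.
            calls.add(() -> { Thread.sleep(5); return null; });
        }
        List<Long> timings = fanOut(calls);
        for (int shard = 0; shard < timings.size(); shard++) {
            System.out.println("shard" + (shard + 1) + ": " + timings.get(shard) + " ms");
        }
    }
}
```

In a real test the sleeping stand-in would be replaced by the actual query call, and each per-shard figure set against the TTLB from that shard's access log; the difference is the time spent outside the shard itself.
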
>
> One last thing - to answer Lance's question about this being an "apples to
> apples" comparison (in lucene-user thread) - yes, our main goal in this
> project is to do things as close to the previous version as possible.
> This way we can monitor that behavior (both quality and performance) remains
> similar, release this version, and then move forward to improve things.
> Of course, there are some changes, but I believe we are indeed measuring the
> complete flow on both apps, and that both apps are returning the same fields
> via HTTP.
>
> Would love to hear what you think about this. TIA,
> Ophir
>

-- 
http://karussell.wordpress.com/