Ophir,

this sounds a bit strange:

> CommonsHttpSolrServer.java, line 416 takes about 95% of the application's 
> total search time

Is this only for heavy load?

Some other things:

 * With Lucene you accessed the indices with MultiSearcher over a LAN, right?
 * Did you look at the server logs? Is anything wrong or delayed there?
 * Did you enable gzip compression on your servers, or even the binary
   request writer and response parser for your Solr clients?

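CommonsHttpSolrServer server = ...
// send requests in the binary (javabin) format instead of XML
server.setRequestWriter(new BinaryRequestWriter());
// parse responses in the binary (javabin) format
server.setParser(new BinaryResponseParser());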
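
To get a rough feel for what gzip could save on the wire, here is a minimal
stand-alone sketch (plain JDK, no Solr involved). The class name GzipSketch
and the XML built by sampleResponse() are made up for illustration only; real
Solr responses will compress differently, and how compression is switched on
depends on your servlet container.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipSketch {

    // Builds a made-up, Solr-like XML response with `docs` result entries.
    // This is NOT real Solr output, just repetitive markup of a similar shape.
    static String sampleResponse(int docs) {
        StringBuilder sb = new StringBuilder("<response><result name=\"response\">");
        for (int i = 0; i < docs; i++) {
            sb.append("<doc><str name=\"id\">").append(i)
              .append("</str><float name=\"score\">1.0</float></doc>");
        }
        return sb.append("</result></response>").toString();
    }

    // Compresses a byte array with gzip and returns the compressed bytes.
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The gzip stream is closed here, so the trailer has been written.
        return out.toByteArray();
    }

    // Decompresses gzip data back into the original bytes.
    static byte[] gunzip(byte[] packed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = sampleResponse(200).getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(raw);
        System.out.println("raw bytes:     " + raw.length);
        System.out.println("gzipped bytes: " + packed.length);
    }
}
```

Note that compression trades CPU time for bandwidth, so on a fast LAN it may
not help much; measuring both variants under your stress test is the only way
to know.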

Regards,
Peter.

> [I posted this yesterday on the lucene-user mailing list and was advised to
> post it here instead. Excuse me for spamming.]
>
> Hi,
>
> I'm currently involved in a project of migrating from Lucene 2.9.1 to Solr
> 1.4.0.
> During stress testing, I encountered this performance problem:
> While actual search times in our shards (which are now running Solr) have
> not changed, the total time it takes for a query has increased dramatically.
> During this performance test, we of course do not modify the indexes.
> Our application is sending Solr select queries concurrently to the 8 shards,
> using CommonsHttpSolrServer.
> I added some timing debug messages, and found that
> CommonsHttpSolrServer.java, line 416 takes about 95% of the application's
> total search time:
> int statusCode = _httpClient.executeMethod(method);
>
> Just to clarify: looking at access logs of the Solr shards, TTLB for a query
> might be around 5 ms. (on all shards), but httpClient.executeMethod() for
> this query can be much higher - say, 50 ms.
> On average, queries take 12 ms. under light load, while under heavy
> load they take around 22 ms.
>
> Another route we tried to pursue is adding the "shards=shard1,shard2,…"
> parameter to the query instead of doing this ourselves, but this doesn't
> seem to work due to an NPE caused by QueryComponent.returnFields(), line
> 553:
> if (returnScores && sdoc.score != null) {
>
> where sdoc is null. I saw there is a null check on trunk, but since we're
> currently using Solr 1.4.0's ready-made WAR file, I didn't see an easy way
> around this.
> Note: we're using a custom query component which extends QueryComponent, but
> debugging this, I saw nothing wrong with the results at this point in the
> code.
>
> Our previous code used HTTP in a different manner:
> For each request, we created a new
> sun.net.www.protocol.http.HttpURLConnection, and called its getInputStream()
> method.
> Under the same load as the new application, the old application does not
> encounter the delays mentioned above.
>
> Our current code is initializing CommonsHttpSolrServer for each shard this
> way:
>     MultiThreadedHttpConnectionManager httpConnectionManager = new MultiThreadedHttpConnectionManager();
>     httpConnectionManager.getParams().setTcpNoDelay(true);
>     httpConnectionManager.getParams().setMaxTotalConnections(1024);
>     httpConnectionManager.getParams().setStaleCheckingEnabled(false);
>     HttpClient httpClient = new HttpClient();
>     HttpClientParams params = new HttpClientParams();
>     params.setCookiePolicy(CookiePolicy.IGNORE_COOKIES);
>     params.setAuthenticationPreemptive(false);
>     params.setContentCharset(StringConstants.UTF8);
>     httpClient.setParams(params);
>     httpClient.setHttpConnectionManager(httpConnectionManager);
>
> and passing the new HttpClient to the Solr Server:
> solrServer = new CommonsHttpSolrServer(coreUrl, httpClient);
>
> We tried two different ways - one with a single
> MultiThreadedHttpConnectionManager and HttpClient for all the SolrServer's,
> and the other with a new MultiThreadedHttpConnectionManager and HttpClient
> for each SolrServer.
> Both tries yielded similar performance results.
> We also tried giving setMaxTotalConnections() a much higher value
> (1,000,000), which had no effect.
>
> One last thing - to answer Lance's question about this being an "apples to
> apples" comparison (in lucene-user thread) - yes, our main goal in this
> project is to do things as close to the previous version as possible.
> This way we can monitor that behavior (both quality and performance) remains
> similar, release this version, and then move forward to improve things.
> Of course, there are some changes, but I believe we are indeed measuring the
> complete flow on both apps, and that both apps are returning the same fields
> via HTTP.
>
> Would love to hear what you think about this. TIA,
> Ophir
>
>   


-- 
http://karussell.wordpress.com/
