I have submitted a patch for the ticket at https://issues.apache.org/jira/browse/SOLR-6832

The patch adds an option, *preferLocalShards*, that can be set in solrconfig.xml or as a query request parameter (the request parameter takes precedence). If the option is set, HttpShardHandler.preferCurrentHostForDistributedReq() tries to find a local URL and puts it first in the list of URLs sent to LBHttpSolrServer. This ensures that the current host's cores are given preference for distributed queries.

The current host's URL is found by ResponseBuilder.findCurrentHostAddress(), which searches for the current core's name in the list of shards.

The option defaults to 'false' so that normal behavior is unchanged.

Before putting more effort into writing test cases, I would like to get some comments on this patch so that I know I am heading in the right direction.

Thanks
Sachin

On Wed, Dec 10, 2014 at 4:30 PM, Shawn Heisey <[email protected]> wrote:
> On 12/9/2014 10:55 PM, S G wrote:
> > For a distributed query, the request is always sent to all the shards
> > even if the originating SolrCore (handling the original distributed
> > query) is a replica of one of the shards.
> > If the original Solr-Core can check itself before sending http
> > requests for any shard, we can probably save some network hopping and
> > gain some performance.
>
> I have to agree with the other replies you've gotten.
>
> Consider a SolrCloud that is handling 5000 requests per second with a
> replicationFactor of 20 or 30. This could be one shard or multiple
> shards. Currently, those requests will be load balanced to the entire
> cluster. If this option is implemented, suddenly EVERY request will
> have at least one part handled locally ... and unless the index is very
> tiny or 99 percent of the queries hit a Solr cache, one index core
> simply won't be able to handle 5000 queries per second. Getting a
> single machine capable of handling that load MIGHT be possible, but it
> would likely be *VERY* expensive.
>
> This would be great as an *OPTION* that can be enabled when the index
> composition and query patterns dictate it will be beneficial ... but it
> definitely should not be default behavior.
>
> Thanks,
> Shawn
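
For anyone reviewing the idea without opening the patch, the behavior described at the top of this message can be sketched roughly as below. This is only an illustration and is not the code in the patch: the class PreferLocalSketch, the methods resolvePreferLocal() and preferLocal(), and the host and core names are all hypothetical, and matching a replica URL by its trailing core name is a simplifying assumption about how the local core would be recognized.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from the SOLR-6832 patch.
class PreferLocalSketch {

    // The request parameter, when present, overrides the configured value.
    // With neither set, the option is off, which keeps the normal behavior.
    static boolean resolvePreferLocal(String requestParam, Boolean configValue) {
        if (requestParam != null) {
            return Boolean.parseBoolean(requestParam);
        }
        return configValue != null && configValue;
    }

    // Returns a copy of replicaUrls in which the first URL ending in
    // "/<localCoreName>" is moved to the front; the other URLs keep their order.
    // Assumes URLs have no trailing slash.
    static List<String> preferLocal(List<String> replicaUrls, String localCoreName) {
        List<String> ordered = new ArrayList<>(replicaUrls);
        String suffix = "/" + localCoreName;
        for (int i = 0; i < ordered.size(); i++) {
            if (ordered.get(i).endsWith(suffix)) {
                // remove(i) runs first and returns the URL, which is then re-added at index 0
                ordered.add(0, ordered.remove(i));
                break;
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        // Made-up replica URLs for a single shard.
        List<String> replicas = Arrays.asList(
                "http://host1:8983/solr/coll_shard1_replica1",
                "http://host2:8983/solr/coll_shard1_replica2");

        // Request says true, config says false: the request wins.
        if (resolvePreferLocal("true", Boolean.FALSE)) {
            replicas = preferLocal(replicas, "coll_shard1_replica2");
        }
        System.out.println(replicas);
    }
}
```

If no URL matches the local core name, the list comes back in its original order, so the load balancer would behave as it does today.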
