[ https://issues.apache.org/jira/browse/SOLR-10695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16018455#comment-16018455 ]

Chris Troullis commented on SOLR-10695:
---------------------------------------

I found the code in SolrJ that would need to change to address this: CloudSolrClient.sendRequest. As Shalin said originally, it currently uses the _route_ parameter only to resolve down to the node level. It could be changed so that, if the _route_ param is specified and points to a single shard, it uses the specific core URL of that shard's replica on each node. The request would then go directly to a replica on a node instead of going to a random replica first and being redirected or going down the distributed path. This doesn't handle the case of more than one replica of the same shard on a single node (it would always pick the first one, so requests wouldn't be load balanced), but I'm not sure how realistic that scenario is.

I also noticed that in this code path, if the "collection" request parameter is set, it does use the full core URL of the replica. Technically this can be used to replicate the behavior of the change proposed above, but I'm not sure that is what the "collection" parameter is meant for. Does anyone know what its purpose is, and why we resolve down to the full core URL when it is passed in?

I would be happy to submit a patch for any changes; I just want to see whether people think it is worth doing before I change anything.

> Optimize implicit routing for nodes containing multiple shards
> --------------------------------------------------------------
>
>                 Key: SOLR-10695
>                 URL: https://issues.apache.org/jira/browse/SOLR-10695
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public)
>    Affects Versions: 6.5.1
>            Reporter: Chris Troullis
>
> I asked a question on the Solr mailing list about some odd behavior I was
> seeing when using implicit routing.
> Here is a snippet from my question:
> "I created a collection using the implicit router, created 10 shards, named
> shard1, shard2, etc. I indexed 3000 documents to each shard, routed by
> setting the _route_ field on the documents in my schema. All works fine, I
> verified there are 3000 documents in each shard.
> The odd behavior I am seeing is when I try to route a query to a specific
> shard. I submitted a simple query to shard1 using the request parameter
> _route_=shard1. The query comes back fine, but when I looked in the logs, it
> looked like it was issuing 3 separate requests:
> 1. The original query to shard1
> 2. A 2nd query to shard1 with the parameter ids=a bunch of document ids
> 3. The original query to a random shard (changes every time I run the query)"
> [~shalinmangar] said that the behavior I was seeing was due to the fact that
> a node has more than 1 shard from the same collection, and upon being routed
> to such a node, the original shard is selected randomly, not taking the
> _route_ parameter into account. To quote:
> "So to recap, this is happening because you have more than one shard1
> hosted on a node. Easy workaround is to have each shard hosted on a
> unique node. But we can improve things on the solr side as well by 1)
> having SolrJ resolve requests down to node name and core name, 2)
> having the collection name to core name resolution take _route_ param
> into account. Both 1 and 2 can solve the problem."
> Shalin asked me to log a JIRA for this, wasn't sure if I should log as a bug
> or enhancement. He suggested 2 potential solutions (above). I am up for
> attempting to implement one of these solutions. Does anyone have any more
> input, or a preference as to how this is addressed? It seems to me like 2
> would be the more robust solution.
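
Option 1 in Shalin's quote above (SolrJ resolving requests down to node name and core name) could be sketched roughly as follows. This is only an illustration and does not use SolrJ: the Replica class, the resolveUrls method, the host URLs, and the core names are all hypothetical stand-ins for cluster state, and it assumes the shard names targeted by _route_ have already been worked out. When exactly one shard is routed, it returns one core URL per node (the first replica found on a node wins, so several replicas of one shard on the same node would not be load balanced); otherwise it falls back to node-level collection URLs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model for illustration only; these are not SolrJ classes.
final class RouteSketch {

    /** One replica of a shard: the node hosting it and its core name. */
    static final class Replica {
        final String shard;
        final String nodeUrl;   // assumed form, e.g. "http://host1:8983/solr"
        final String coreName;  // assumed form, e.g. "coll_shard1_replica1"

        Replica(String shard, String nodeUrl, String coreName) {
            this.shard = shard;
            this.nodeUrl = nodeUrl;
            this.coreName = coreName;
        }
    }

    /**
     * Returns at most one URL per node.
     * - Exactly one routed shard: the core URL of the first matching replica on each node.
     * - Otherwise: the node-level collection URL for each node hosting a routed shard
     *   (or every node, when no shard is routed).
     */
    static List<String> resolveUrls(List<String> routedShards,
                                    String collection,
                                    List<Replica> replicas) {
        boolean singleShard = routedShards.size() == 1;
        // LinkedHashMap keeps node order stable and lets the first replica per node win.
        Map<String, String> urlByNode = new LinkedHashMap<>();
        for (Replica r : replicas) {
            boolean wanted = routedShards.isEmpty() || routedShards.contains(r.shard);
            if (!wanted) {
                continue;
            }
            String url = singleShard
                    ? r.nodeUrl + "/" + r.coreName   // straight to the core
                    : r.nodeUrl + "/" + collection;  // node level; node picks a core
            urlByNode.putIfAbsent(r.nodeUrl, url);
        }
        return new ArrayList<>(urlByNode.values());
    }
}
```

With a node that hosts both shard1 and shard2, routing to shard1 alone yields that node's shard1 core URL, so the node never has to pick a core at random.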