Chris Troullis created SOLR-10695:
-------------------------------------

             Summary: Optimize implicit routing for nodes containing multiple shards
                 Key: SOLR-10695
                 URL: https://issues.apache.org/jira/browse/SOLR-10695
             Project: Solr
          Issue Type: Improvement
      Security Level: Public (Default Security Level. Issues are Public)
    Affects Versions: 6.5.1
            Reporter: Chris Troullis

I asked a question on the Solr mailing list about some odd behavior I was seeing when using implicit routing. Here is a snippet from my question:

"I created a collection using the implicit router, created 10 shards, named shard1, shard2, etc. I indexed 3000 documents to each shard, routed by setting the _route_ field on the documents in my schema. All works fine, I verified there are 3000 documents in each shard.

The odd behavior I am seeing is when I try to route a query to a specific shard. I submitted a simple query to shard1 using the request parameter _route_=shard1. The query comes back fine, but when I looked in the logs, it looked like it was issuing 3 separate requests:

1. The original query to shard1
2. A 2nd query to shard1 with the parameter ids=a bunch of document ids
3. The original query to a random shard (changes every time I run the query)"

[~shalinmangar] said that the behavior I was seeing occurs because a node hosts more than one shard from the same collection: when a request is routed to such a node, the original shard is selected randomly, without taking the _route_ parameter into account. To quote:

"So to recap, this is happening because you have more than one shard1 hosted on a node. Easy workaround is to have each shard hosted on a unique node. But we can improve things on the solr side as well by 1) having SolrJ resolve requests down to node name and core name, 2) having the collection name to core name resolution take _route_ param into account. Both 1 and 2 can solve the problem."

Shalin asked me to log a JIRA for this; I wasn't sure whether to log it as a bug or an enhancement. He suggested two potential solutions (above), and I am up for attempting to implement one of them. Does anyone have any more input, or a preference as to how this is addressed? It seems to me that option 2 would be the more robust solution.
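
To make option 2 concrete, here is a minimal Python sketch of the idea, not Solr's actual code. The `Core` type, the `pick_local_core` function, and the core names are all hypothetical and exist only for illustration. It assumes a node knows which shard each of its local cores belongs to, and that `_route_` names a single shard, as in the query above.

```python
import random
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Core:
    """A core hosted on one node. Hypothetical type, not Solr's data model."""
    name: str   # illustrative core name, e.g. "coll_shard1_replica1"
    shard: str  # shard this core belongs to, e.g. "shard1"


def pick_local_core(cores: Sequence[Core],
                    route: Optional[str] = None,
                    rng: Optional[random.Random] = None) -> Core:
    """Pick which local core of a collection should receive a request.

    Without `route`, any local core may be chosen at random, which is the
    behavior described in this issue. With `route`, cores belonging to the
    named shard are preferred, so the request lands on the intended shard.
    """
    if not cores:
        raise ValueError("node hosts no cores for this collection")
    rng = rng or random.Random()
    candidates = list(cores)
    if route:
        matching = [c for c in cores if c.shard == route]
        # Assumption: fall back to any core if the routed shard is not
        # hosted on this node, rather than failing the request.
        if matching:
            candidates = matching
    return rng.choice(candidates)


if __name__ == "__main__":
    local_cores = [
        Core("coll_shard1_replica1", "shard1"),
        Core("coll_shard2_replica1", "shard2"),
        Core("coll_shard3_replica1", "shard3"),
    ]
    # With the route supplied, the chosen core always belongs to shard1.
    print(pick_local_core(local_cores, route="shard1").name)
```

The fallback to a random core when no local core matches is a design assumption of this sketch; a real implementation might instead forward the request to a node that hosts the requested shard.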