[ 
https://issues.apache.org/jira/browse/SOLR-10695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16012550#comment-16012550
 ] 

Chris Troullis commented on SOLR-10695:
---------------------------------------

I am running all of this on my local machine, with only 1 node, so I don't 
think clock synchronization should be an issue (but I can still try if you want).

By the way, I am doing all of this through a plain REST client, to eliminate 
as many layers as possible, but I see the same behavior through SolrJ as well.

With &distrib=false added to the request, only the 3rd query shows up in the 
logs (the query to the random shard), and I get no results back (since the 
data I am requesting is not on that shard). This seems in line with Shalin's 
theory that a random shard is picked initially, which then goes into 
distributed mode and issues the request to the other (correct) shard 
(producing the other 2 queries).
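
For anyone who wants to reproduce this, here is a rough sketch of the two 
request variants being compared. The host, port, collection name, and the 
helper name select_url are placeholders for illustration and are not taken 
from my actual setup; only the _route_ and distrib parameters come from the 
discussion above.

```python
from urllib.parse import urlencode

# Placeholder values (assumptions); substitute your own node and collection.
SOLR_BASE = "http://localhost:8983/solr"
COLLECTION = "mycollection"


def select_url(query, route=None, distrib=None):
    """Build a /select URL, optionally adding _route_ and distrib."""
    params = {"q": query}
    if route is not None:
        # Ask Solr to route the query to the named shard.
        params["_route_"] = route
    if distrib is not None:
        # distrib=false keeps the request on the core that receives it.
        params["distrib"] = "true" if distrib else "false"
    return f"{SOLR_BASE}/{COLLECTION}/select?{urlencode(params)}"


# Variant 1: routed query, distributed mode left at its default.
routed = select_url("*:*", route="shard1")

# Variant 2: same query, but with the distributed fan-out disabled.
local_only = select_url("*:*", route="shard1", distrib=False)

print(routed)
print(local_only)
```

Sending the first URL is what produced the 3 logged requests for me; sending 
the second left only the request to the randomly chosen shard.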

Going by the timestamps, the requests are logged in the same order in which 
they are listed in the description.

> Optimize implicit routing for nodes containing multiple shards
> --------------------------------------------------------------
>
>                 Key: SOLR-10695
>                 URL: https://issues.apache.org/jira/browse/SOLR-10695
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 6.5.1
>            Reporter: Chris Troullis
>
> I asked a question on the Solr mailing list about some odd behavior I was 
> seeing when using implicit routing. Here is a snippet from my question:
> "I created a collection using the implicit router, created 10 shards, named 
> shard1, shard2, etc. I indexed 3000 documents to each shard, routed by 
> setting the _route_ field on the documents in my schema. All works fine, I 
> verified there are 3000 documents in each shard. 
> The odd behavior I am seeing is when I try to route a query to a specific 
> shard. I submitted a simple query to shard1 using the request parameter 
> _route_=shard1. The query comes back fine, but when I looked in the logs, it 
> looked like it was issuing 3 separate requests:
> 1. The original query to shard1
> 2. A 2nd query to shard1 with the parameter ids=a bunch of document ids
> 3. The original query to a random shard (changes every time I run the query)"
> [~shalinmangar] said that the behavior I was seeing was due to the fact that 
> a node has more than 1 shard from the same collection, and upon being routed 
> to such a node, the original shard is selected randomly, not taking the 
> _route_ parameter into account. To quote:
> "So to recap, this is happening because you have more than one shard1
> hosted on a node. Easy workaround is to have each shard hosted on a
> unique node. But we can improve things on the solr side as well by 1)
> having SolrJ resolve requests down to node name and core name, 2)
> having the collection name to core name resolution take _route_ param
> into account. Both 1 and 2 can solve the problem."
> Shalin asked me to log a JIRA for this, wasn't sure if I should log as a bug 
> or enhancement. He suggested 2 potential solutions (above). I am up for 
> attempting to implement one of these solutions. Does anyone have any more 
> input, or a preference as to how this is addressed? It seems to me like 2 
> would be the more robust solution.



