[ 
https://issues.apache.org/jira/browse/SOLR-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-6379:
---------------------------

    Attachment: SOLR-6379.pristine_collection.test.patch

bq. This looks like it's simply one of those corner case bugs that manifests 
when you have a collection that has core names that match another collection 
name. 

FWIW: I wanted to prove to myself that this was really the only problem - so I 
mangled TestQueriesWhileReplicasComeOnline.java into a 
TestQueriesWhileReplicasComeOnlineOfPristineCollection.java with the following 
changes:
* creates a new collection from scratch with a randomly generated name
* skips the initial batch of queries against the static index
* uses CLUSTERSTATUS to get the list of shards & replicas (since the test 
framework plumbing for this is all built around collection1)
* includes the random DELETEREPLICA logic that was missing from the previous 
test.
* loops until a min number of replica add/delete commands have been sent 
(async) instead of a fixed number of times
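
A rough sketch of the CLUSTERSTATUS step, for anyone who wants to do the same thing outside the test framework. It is not the code from the attached patch. It assumes the response has already been parsed into nested maps shaped like cluster -> collections -> <collection> -> shards -> <shard> -> replicas; the class name, helper names, and the sample collection/core names used with it are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Solr or of the attached patch.
public class ClusterStatusSketch {

    // Builds a Collections API URL asking for the state of one collection.
    // solrBaseUrl is assumed to look like http://host:8983/solr
    static String clusterStatusUrl(String solrBaseUrl, String collection) {
        return solrBaseUrl + "/admin/collections?action=CLUSTERSTATUS&collection="
                + URLEncoder.encode(collection, StandardCharsets.UTF_8) + "&wt=json";
    }

    // Returns the nested map stored under key, or an empty map if it is absent.
    @SuppressWarnings("unchecked")
    static Map<String, Object> child(Map<String, Object> parent, String key) {
        Object value = parent.get(key);
        if (value instanceof Map) {
            return (Map<String, Object>) value;
        }
        return Collections.emptyMap();
    }

    // Walks the (assumed) response layout and returns one sorted label per
    // replica: "<shard>/<replica> core=<core> state=<state>".
    static List<String> listReplicas(Map<String, Object> response, String collection) {
        Map<String, Object> shards =
                child(child(child(child(response, "cluster"), "collections"), collection), "shards");
        List<String> found = new ArrayList<>();
        for (String shard : shards.keySet()) {
            Map<String, Object> replicas = child(child(shards, shard), "replicas");
            for (String replica : replicas.keySet()) {
                Map<String, Object> props = child(replicas, replica);
                found.add(shard + "/" + replica
                        + " core=" + props.get("core")
                        + " state=" + props.get("state"));
            }
        }
        Collections.sort(found); // deterministic order regardless of map type
        return found;
    }
}
```

Because the lookup starts from the randomly generated collection name, nothing here depends on collection1 or on the core names happening to differ from any collection name.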

Even w/o Anshum's change to the core vs collection name resolution, this new 
test sort of passes for me -- by which I mean it doesn't fail any assertions 
when comparing the results while it's randomly adding/removing replicas - but 
it does die horribly with tons of zombie ZK threads (why? I have no idea)


> CloudSolrServer can query the wrong replica if a collection has a SolrCore 
> name that matches a collection name.
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-6379
>                 URL: https://issues.apache.org/jira/browse/SOLR-6379
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>            Reporter: Hoss Man
>            Assignee: Anshum Gupta
>            Priority: Minor
>             Fix For: 5.0, 4.10
>
>         Attachments: SOLR-6379.patch, SOLR-6379.patch, SOLR-6379.patch, 
> SOLR-6379.patch, SOLR-6379.patch, SOLR-6379.pristine_collection.test.patch
>
>
> spin off of SOLR-2894 where sarowe & miller were getting failures from 
> TestCloudPivot that seemed unrelated to any of the distrib pivot logic itself.
> in particular: adding a call to "waitForThingsToLevelOut" at the start of the 
> test, even before indexing any docs, seemed to work around the problem -- but 
> even if all replicas aren't yet up when the test starts, we should either get 
> a failure when adding docs (ie: no replica hosting the target shard) or 
> queries should only be routed to the replicas that are up and fully caught up 
> with the rest of the collection.
> (NOTE: we're specifically talking about a situation where the set of docs in 
> the collection is static during the query request)



--
This message was sent by Atlassian JIRA
(v6.2#6252)

