[
https://issues.apache.org/jira/browse/SOLR-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13949498#comment-13949498
]
Hoss Man commented on SOLR-5919:
--------------------------------
Hmmm... ok: I definitely see that I was being stupid in thinking that every
node in the cluster would be involved in hosting "collection1", and thus: the
overseer may not be running on one of the nodes hosting the collection. But...
It still seems really weird to me that the control server is part of the
cluster (and can act as the overseer):
1) that means the control server and the control_collection are running in "zk
mode" so all of the test scaffolding that executes parallel queries against the
cloudClient and the controlClient and then compares the results isn't actually
comparing a cloud mode response with a standalone mode response -- it's
comparing a cloud mode multi-shard response with a cloud mode single-shard
response.
2) can't this lead to inconsistencies in chaos tests if the chaos monkey shuts
down the server running the control_collection?
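To make point 1 concrete, here is a rough sketch of the shape of that comparison. It is not the actual AbstractFullDistribZkTestBase code: the class name, the sameResults helper, and the use of plain Function objects in place of real Solr clients are all hypothetical, chosen only to keep the example self-contained.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for the comparison scaffolding; no real Solr classes are used.
public class ControlComparisonSketch {

    // Runs the same query against both clients and reports whether the results match.
    public static <Q, R> boolean sameResults(Function<Q, R> cloudClient,
                                             Function<Q, R> controlClient,
                                             Q query) {
        R cloudResult = cloudClient.apply(query);
        R controlResult = controlClient.apply(query);
        return cloudResult.equals(controlResult);
    }

    public static void main(String[] args) {
        // Stand-ins for the multi-shard cloud client and the single-shard control client.
        Function<String, List<String>> cloudClient = q -> List.of("doc1", "doc2");
        Function<String, List<String>> controlClient = q -> List.of("doc1", "doc2");
        System.out.println(sameResults(cloudClient, controlClient, "*:*"));
    }
}
```

A check of this shape only says the two sides agree. If both sides are running in zk mode, agreement says nothing about how a standalone server would have answered.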
> AbstractFullDistribZkTestBase: control server thinks it's part of the cloud,
> takes overseer role
> ------------------------------------------------------------------------------------------------
>
> Key: SOLR-5919
> URL: https://issues.apache.org/jira/browse/SOLR-5919
> Project: Solr
> Issue Type: Bug
> Reporter: Hoss Man
>
> I was banging my head trying to figure out why SOLR-5795 combined with
> SOLR-5823 wasn't working when I noticed something interesting as a result of
> some gratuitous logging:
> * the control server thinks it's running in cloud mode, in a cluster
> consisting solely of itself, and acts as overseer
> * none of the nodes in the actual cluster being tested think they are the
> overseer
> ...I haven't dug in very deep, but I suspect that some combination of the
> control server starting up first and thinking it's part of zk is leading to
> it becoming the overseer, even though it evidently never thinks it's one of
> the leaders/replicas of the cloud cluster.
> It's hard to see this problem w/o SOLR-5823 -- I'll update the patch there
> with a test showing the problem, but I wanted to make sure it got tracked in
> its own bug.