[
https://issues.apache.org/jira/browse/SOLR-8256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16026449#comment-16026449
]
Varun Thacker edited comment on SOLR-8256 at 5/26/17 4:06 PM:
--------------------------------------------------------------
Here are two scenarios I've seen at customers over the last couple of years:
1. In a large cluster with lots of shards (100k+ across all collections),
someone restarts all the nodes at once instead of in a rolling manner. The
overseer queue gets so backed up that "rmr /overseer/queue" stops working. We
ended up deleting everything from ZooKeeper, uploading configs, and then
starting nodes one by one. Today core.properties will go and register itself in
state.json (actually under clusterstate.json), so we were able to bring back
the cluster.
2. New users either didn't back up ZooKeeper data and the disk crashed, or they
deleted the data by mistake, etc.
I'm not opposed to us moving towards zk as truth. I just want to be careful,
because there could be pitfalls if we do so.
For 1> I don't have any good ideas to prevent this from happening.
For 2> our zoo.cfg could have different "dataDir" and "dataLogDir" values, and
we would document which directory a user should back up on a regular basis (
https://zookeeper.apache.org/doc/r3.3.6/zookeeperAdmin.html )
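To make the suggestion for 2> a bit more concrete, here is a minimal sketch in
Python of how one could read a zoo.cfg and list the directories that hold
ZooKeeper state and so would need backing up. The helper names (parse_zoo_cfg,
backup_dirs) and the example paths are hypothetical and only for illustration;
"dataDir" and "dataLogDir" are the real zoo.cfg keys mentioned above. It
assumes that when "dataLogDir" is unset, transaction logs live under "dataDir".

```python
def parse_zoo_cfg(text):
    """Parse key=value lines from zoo.cfg-style text, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def backup_dirs(settings):
    """Return the directories holding ZooKeeper state, without duplicates."""
    data_dir = settings.get("dataDir")
    # Assumption: without dataLogDir, transaction logs are written under dataDir.
    log_dir = settings.get("dataLogDir", data_dir)
    dirs = []
    for d in (data_dir, log_dir):
        if d and d not in dirs:
            dirs.append(d)
    return dirs


# Hypothetical zoo.cfg contents; the paths are placeholders, not Solr defaults.
EXAMPLE_CFG = """
# snapshots
dataDir=/var/lib/zookeeper/data
# transaction logs
dataLogDir=/var/lib/zookeeper/txnlog
clientPort=2181
"""

print(backup_dirs(parse_zoo_cfg(EXAMPLE_CFG)))
```

With separate directories configured, both show up in the list, which is the
point a user-facing note would need to make: backing up only one of them may
not be enough to restore the cluster state.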
was (Author: varunthacker):
Here are two scenarios I've seen at customers over the last couple of years:
1. In a large cluster with lots of shards (100k+ across all collections),
someone restarts all the nodes in one go by mistake. The overseer queue gets
so backed up that "rmr /overseer/queue" stops working. So we ended up deleting
everything from ZooKeeper, uploading configs, and then starting nodes one by
one. Today core.properties will go and register itself in state.json, so we
were able to bring back the cluster.
2. New users either didn't back up ZooKeeper data and the disk crashed, or they
deleted the data by mistake, etc.
I'm not opposed to us moving towards zk as truth. I just want to be careful,
because there could be pitfalls if we do so.
For 1> I don't have any good ideas to prevent this from happening.
For 2> our zoo.cfg could have different "dataDir" and "dataLogDir" values, and
we would document which directory a user should back up on a regular basis (
https://zookeeper.apache.org/doc/r3.3.6/zookeeperAdmin.html )
> Remove cluster setting 'legacy cloud' in 6x.
> --------------------------------------------
>
> Key: SOLR-8256
> URL: https://issues.apache.org/jira/browse/SOLR-8256
> Project: Solr
> Issue Type: Improvement
> Reporter: Mark Miller
> Priority: Blocker
> Fix For: master (7.0)
>
> Attachments: SOLR-8256.patch, SOLR-8256.patch
>
>
> We don't have the old back compat concerns anymore. It's time to remove this
> mostly unknown setting and start defaulting to this behavior that starts us
> down the path of zk=truth.