[ 
https://issues.apache.org/jira/browse/SOLR-8256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16026449#comment-16026449
 ] 

Varun Thacker edited comment on SOLR-8256 at 5/26/17 4:06 PM:
--------------------------------------------------------------

Here are two scenarios I've seen at customers over the last couple of years:

1. In a large cluster with lots of shards (100k+ across all collections), 
someone restarts all the nodes at once instead of in a rolling manner. The 
overseer queue gets so backed up that "rmr /overseer/queue" stops working. We 
ended up deleting everything from ZooKeeper, uploading the configs, and then 
starting the nodes one by one. Today core.properties will go and register 
itself in state.json (actually under clusterstate.json), so we were able to 
bring the cluster back.

2. New users either didn't back up their ZooKeeper data and the disk crashed, 
or they deleted the data by mistake, etc.

I'm not opposed to us moving towards ZK as truth. I just want us to be careful, 
since there could be pitfalls if we do so.

For 1> I don't have any good ideas to prevent this from happening. 
For 2> our zoo.cfg could have different "dataDir" and "dataLogDir" settings, 
and we would document which directory a user should back up on a regular basis 
( https://zookeeper.apache.org/doc/r3.3.6/zookeeperAdmin.html )
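
As a rough sketch of what the documentation for 2> could point users at, the 
helper below reads a zoo.cfg in the usual key=value form and reports the 
directories named by "dataDir" and "dataLogDir". The function name 
zk_backup_dirs and the example paths are made up for illustration; this is not 
something Solr or ZooKeeper ships. It relies on the ZooKeeper behavior that 
transaction logs go to dataDir when dataLogDir is not set.

```python
from pathlib import Path


def zk_backup_dirs(zoo_cfg_text):
    """Return the directories named by dataDir and dataLogDir in zoo.cfg text.

    Hypothetical helper for illustration only.
    """
    settings = {}
    for raw in zoo_cfg_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()

    data_dir = settings.get("dataDir")
    if data_dir is None:
        raise ValueError("zoo.cfg has no dataDir entry")

    # ZooKeeper writes transaction logs to dataDir when dataLogDir is unset.
    log_dir = settings.get("dataLogDir", data_dir)
    return {"dataDir": Path(data_dir), "dataLogDir": Path(log_dir)}


# Hypothetical zoo.cfg contents; the paths are illustrative only.
EXAMPLE_CFG = """\
tickTime=2000
clientPort=2181
dataDir=/var/solr/zk/data
dataLogDir=/var/solr/zk/txnlog
"""

if __name__ == "__main__":
    for name, path in zk_backup_dirs(EXAMPLE_CFG).items():
        print(f"{name}: {path}")
```

Snapshots live under dataDir and transaction logs under dataLogDir, so a 
backup that is meant to be restorable would need to cover both.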



was (Author: varunthacker):
Here are two scenarios I've seen at customers over the last couple of years:

1. In a large cluster with lots of shards (100k+ across all collections), 
someone restarts all the nodes in one go by mistake. The overseer queue gets 
so backed up that "rmr /overseer/queue" stops working. So we ended up deleting 
everything from ZooKeeper, uploading the configs, and then starting the nodes 
one by one. Today core.properties will go and register itself in state.json, 
so we were able to bring the cluster back.

2. New users either didn't back up their ZooKeeper data and the disk crashed, 
or they deleted the data by mistake, etc.

I'm not opposed to us moving towards ZK as truth. I just want us to be careful, 
since there could be pitfalls if we do so.

For 1> I don't have any good ideas to prevent this from happening. 
For 2> our zoo.cfg could have different "dataDir" and "dataLogDir" settings, 
and we would document which directory a user should back up on a regular basis 
( https://zookeeper.apache.org/doc/r3.3.6/zookeeperAdmin.html )


> Remove cluster setting 'legacy cloud' in 6x.
> --------------------------------------------
>
>                 Key: SOLR-8256
>                 URL: https://issues.apache.org/jira/browse/SOLR-8256
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Mark Miller
>            Priority: Blocker
>             Fix For: master (7.0)
>
>         Attachments: SOLR-8256.patch, SOLR-8256.patch
>
>
> We don't have the old back compat concerns anymore. It's time to remove this 
> mostly unknown setting and start defaulting to this behavior that starts us 
> down the path of zk=truth.



