> The zookeeper list isn't really the right place for most of this. The
> residents of this list will have zero knowledge of how Solr uses
> zookeeper. I'm on both lists -- and I'm a lot more familiar with Solr
> than Zookeeper.
I think I see that now. I was under the impression that this might be more
of a ZooKeeper question, since clusterstate.json is where I think I will
run into issues (a sketch of how I'm inspecting it is below, after my sig).
But I guess that JSON structure is more of a Solr design, and ZooKeeper
doesn't really have any say in it?

> On second thought, DON'T TRY THIS.
>
> I wouldn't want to take the chance that the DELETE would actually try to
> contact the mentioned servers and truly delete the collection.

I agree. The only part of the scenario I'm particularly worried about is
the API calls through Solr (they look like they might try to do extra
things I don't want). I will definitely be testing this before actually
doing it, but I posted the questions to see if anyone has other ideas.

NOTE: The example I gave was minimal for the sake of making the idea easier
to understand. It confuses me sometimes too. My real ZK cluster is about 18
nodes, spans multiple physical locations, and manages over 200 collections.
I'm not worried about ZooKeeper performance. Part of the reason for needing
to do this is that the current setup causes problems when trying to create
a new collection with the same name in a different physical location with a
different schema. Also, adding/removing ZooKeeper nodes can be problematic
to manage over a large cluster (partly because 3.4.6 doesn't support live
config changes, but 3.5.0+ does).

> If you have a collection that has replicas on all four Solr servers,
> then your four solr servers are *one* SolrCloud cluster, not two. If
> they were separate clusters, it would not be possible to have one
> collection with shards/replicas on all four servers.

I agree about the current *one* SolrCloud. I technically do have a single
massive SolrCloud because of this fact (which also poses some potential
issues). All the dynamic collections have always been logically separated,
and only the static collections span the whole SolrCloud cluster.

> The first thing you need to do is rearrange the static collection so it
> only lives on two of the Solr servers. To do this, you can use
> ADDREPLICA if additional replicas are required, then DELETEREPLICA to
> remove it from two of the servers.

Unfortunately, the final goal is for each resulting SolrCloud cluster to
have knowledge of every static collection, while each local ZooKeeper
cluster has no knowledge of the nodes in the other clusters (effectively
duplicating the collection in each cluster). So there is no rearranging to
do, only removing the "extra" nodes after splitting the ZooKeeper cluster
(example Collections API calls are sketched below as well). This may sound
counterproductive, but the static collections are managed outside of Solr.
In the event that I do need to update the content in one, I can reload the
collection on a per-location basis for a less risky deployment. It's a bit
scary when you need to reload a large static collection across 20+ Solr
servers.

-- Eric
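P.S. In case it helps anyone following along, this is roughly how I've been
inspecting the state Solr keeps in ZooKeeper. It's a minimal sketch using
the kazoo Python client; the connect string and chroot are made up, and
note that older SolrCloud versions keep everything in a single
/clusterstate.json while newer ones use per-collection state.json znodes.

    from kazoo.client import KazooClient

    # Hypothetical ensemble and /solr chroot -- substitute your own.
    zk = KazooClient(hosts='zk1:2181,zk2:2181,zk3:2181/solr')
    zk.start()

    # On older SolrCloud releases the whole cluster state is one znode;
    # newer releases keep /collections/<name>/state.json per collection.
    data, stat = zk.get('/clusterstate.json')
    print(data.decode('utf-8'))

    zk.stop()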
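And for the ADDREPLICA/DELETEREPLICA route, these are the kinds of
Collections API calls I would script. Just a sketch: the Solr URL,
collection, shard, and core_node names are made up, and I'd dry run it
against a test cluster first.

    import requests

    collections_api = 'http://solr1:8983/solr/admin/collections'

    # Add a replica of shard1 on a specific node (node name as Solr sees it).
    r = requests.get(collections_api, params={
        'action': 'ADDREPLICA',
        'collection': 'static_example',
        'shard': 'shard1',
        'node': 'solr3:8983_solr',
    })
    r.raise_for_status()

    # Then drop the replica that should no longer exist; 'replica' is the
    # core_node name shown in the cluster state.
    r = requests.get(collections_api, params={
        'action': 'DELETEREPLICA',
        'collection': 'static_example',
        'shard': 'shard1',
        'replica': 'core_node2',
    })
    r.raise_for_status()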
