We've been trying to figure out ways to "migrate" existing SolrClouds to another ZK ensemble which will be built on different infrastructure than the current ensemble. Also, ZK will be upgraded from 3.4.13 (old ensemble) to 3.6.3 (new ensemble). We're running Solr 8.10.1.
One option we are experimenting with is to transfer the ZK snapshot and tlogs from the old ensemble to the new one, modify ZK_HOST to point to the new ensemble, then restart Solr. (We use chroots to keep each SolrCloud separated.) We are NOT using dynamic reconfiguration (zoo.cfg has reconfigEnabled=false).

Transferring the snapshot and tlogs seemingly worked in ZK: there were no errors, and poking around ZK shows all the data is current. The Solr part seemed to work as well. Looking at the Cloud view in the UI, all the nodes, replicas, collections, etc. are there, the clusterstatus is valid, and we can query and index new content.

But the ZK Status page is just strange. It shows this:

    Errors:
      Your ZK connection string (3 hosts) is different from the dynamic ensemble config (3 hosts). Solr does not currently support dynamic reconfiguration and will only be able to connect to the zk hosts in your connection string.
      Failed talking to Zookeeper localhost:2181
      Failed talking to Zookeeper localhost:2182
      Failed talking to Zookeeper localhost:2183

    ZK connection string: 10.xx.xx.xx:2181,10.xx.xx.xx:2182,10.xx.xx.xx:2183/solr8
    Ensemble size: 3
    Ensemble mode:
    Dynamic reconfig enabled: true

The little table under all that shows the following column headers:

    localhost:2181    localhost:2182    localhost:2183

...of course with ok=false for all 3, because there is no ZK running on localhost. And as stated before, zoo.cfg has reconfigEnabled=false.

So for all the important bits, Solr seems to be using the ZK hosts in the connection string. But I don't understand what's going on in the UI, and I'm not sure how to stop Solr from looking for ZKs on localhost. Is it somehow related to SOLR-13801 (https://issues.apache.org/jira/browse/SOLR-13801)? That issue links to various other bugs/improvements, but we're not doing anything special here: we have whitelisted the ZK 4lw commands, we've disabled ACLs, and we're not using TLS.

Anyone have any ideas?

Thanks
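
For anyone who wants to look at the same thing from outside Solr, below is a minimal Python sketch of one way to ask each ZK server in the connection string what it reports about itself, using the whitelisted `conf` 4lw. The function names (four_letter_word, parse_key_values, split_zk_host) are made up for illustration, and the script assumes the ZK_HOST value is passed as the first command-line argument. The exact keys in the `conf` reply vary by ZK version, so the parser simply keeps every key=value line it sees.

```python
import socket
import sys


def four_letter_word(host, port, cmd="conf", timeout=5.0):
    """Send a ZooKeeper four-letter-word command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_key_values(text):
    """Keep every 'key=value' line of a reply; lines without '=' are skipped."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            result[key.strip()] = value.strip()
    return result


def split_zk_host(zk_host):
    """Split 'h1:p1,h2:p2/chroot' into ([(host, port), ...], chroot)."""
    hosts_part, slash, chroot = zk_host.partition("/")
    servers = []
    for entry in hosts_part.split(","):
        host, sep, port = entry.strip().rpartition(":")
        if not sep:  # no port given: fall back to ZK's default client port
            host, port = entry.strip(), "2181"
        servers.append((host, int(port)))
    return servers, (slash + chroot) if slash else ""


if __name__ == "__main__":
    # Assumption: the ZK_HOST string is passed as the first argument.
    servers, chroot = split_zk_host(sys.argv[1])
    print("chroot:", chroot or "(none)")
    for host, port in servers:
        try:
            reply = parse_key_values(four_letter_word(host, port, "conf"))
        except OSError as exc:
            print(f"{host}:{port} unreachable: {exc}")
            continue
        for key, value in sorted(reply.items()):
            print(f"{host}:{port} {key}={value}")
```

If any of the replies mention localhost, that would at least suggest the addresses on the ZK Status page come from what the ensemble reports about itself rather than from the connection string, but that is only a guess.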