Hi zookeepers,
I'm seeing some confusing behaviour in Solr/ZooKeeper and hope you can shed some light on what's happening and how I can correct it. (I've also asked roughly the same question on the Solr mailing list, but I think it's a ZK issue.)

We have two physical servers running automated builds of RedHat 6.4 and Solr 4.4.0 that host two separate Solr services. We also have two separate ZK ensembles, a production version and a development version, both running 3.4.5 built via automation.

The first Solr server (called ld01) has 24 shards and hosts a collection called 'ukdomain'; the second Solr server (ld02) also has 24 shards and hosts a different collection called 'ldwa01'. It may be important to note that both of these physical servers previously provided the 'ukdomain' collection, but the second server (ld02) has since been rebuilt for the new 'ldwa01' collection.

ld01/ukdomain connects to the production ZK. Which ensemble ld02/ldwa01 connects to varies, and that is where the problem lies.

When I start the ldwa01 Solr nodes with their ZooKeeper configuration (defined in /etc/sysconfig/solrnode* and with collection.configName as 'ldwa01cfg') pointing to the development ZooKeeper ensemble, all nodes initially become shard leaders and then replicas, as I'd expect.

But if I change the ldwa01 Solr nodes to point to the ZooKeeper ensemble also used for the ukdomain collection, all ldwa01 Solr nodes start on the same shard (that is, the first ldwa01 Solr node becomes the shard leader, then every other Solr node becomes a replica for this shard). The significant point here is that no other ldwa01 shards gain leaders (or replicas).

The ukdomain collection uses a ZooKeeper collection.configName of 'ukdomaincfg', and the collection.configName 'ldwa01cfg' had never been used before this ldwa01 service was created. So I'm confused as to why the ldwa01 service would behave differently when the only difference is which ZooKeeper ensemble is used.

If anyone can explain why this is happening and how I can get the ldwa01 services to start correctly using the non-development ZooKeeper ensemble, I'd be very grateful! If more information or explanation is needed, just ask.

Thanks,
Gil

Gil Hoggarth
Web Archiving Technical Services Engineer
The British Library, Boston Spa, West Yorkshire, LS23 7BQ
