Hello Solr Users,

I am starting on a project to transition a very large number of Solr 6 servers (master/slave) to Solr 8.x SolrCloud. The current query approach uses distributed index search, specifying the appropriate Solr 6 shards for each query.

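For illustration, here is a minimal Python sketch of how a distributed query with an explicit shards parameter might be assembled. The helper name build_distributed_query is hypothetical, and the URL is the local Solr 6 test core from this post; the parameters mirror the ones used in the test query.

```python
from urllib.parse import urlencode


def build_distributed_query(base_core_url, shard_urls, q="*:*", fl="id", start=0, rows=100):
    """Build a Solr select URL that fans out to an explicit list of shards.

    base_core_url: the core that receives the request and coordinates the search.
    shard_urls: core URLs to query, joined into Solr's comma-separated `shards` parameter.
    """
    params = {
        "shards": ",".join(shard_urls),
        "q": q,
        "fl": fl,
        "start": start,
        "rows": rows,
        "wt": "json",
    }
    # Keep ':', '/', ',' and '*' unescaped so the URL stays readable.
    return f"{base_core_url}/select?{urlencode(params, safe=':/,*')}"


# Assumed local test core (Solr 6 on port 8642).
solr6_shards = ["http://localhost:8642/solr/testcore"]
url = build_distributed_query("http://localhost:8642/solr/testcore", solr6_shards)
print(url)
```

Adding Solr 8 replica cores to the list is then just a matter of extending shard_urls, which is what the mixed-version idea relies on.
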
One thought for maintaining the current query strategy while allowing a zero-downtime transition from Solr 6 cores to Solr 8 collections is to keep using distributed index search, but specify shards from both the Solr 6 deployment and the Solr 8 SolrCloud collections. Once the Solr 8 collections are fully populated, the Solr 6 shards could be dropped from the query. (One drawback is that queries will eventually need to change from specifying shards to simply specifying collections.)

Based on testing, this approach works if the query is made against the Solr 6 server. This query works, with 8642 being Solr 6 and 8983 being Solr 8 SolrCloud:

http://localhost:8642/solr/testcore/select?shards=http://localhost:8983/solr/sample_techproducts_shard2_replica_n2,http://localhost:8983/solr/sample_techproducts_shard1_replica_n1,http://localhost:8642/solr/testcore&indent=on&q=*:*&wt=json&fl=id&start=0&rows=100

However, after disabling the shard whitelist, an IO error occurs when the query is made against the Solr 8 SolrCloud servers. My guess is that SolrCloud, at least by default, requires all shards/cores to be known by ZooKeeper.

I am not sold on this approach, but as part of my research I want to narrow in on the best solution. A couple of questions:

1. Is there a way to configure searches against Solr 8 SolrCloud to include shards not known by ZooKeeper? (I am guessing not.)

2. While I am working on some ideas, I am curious about other seed ideas for a zero-downtime transition from Solr 6 (master/slave) to Solr 8.x SolrCloud over a very, very large data set.

Thanks,
Matt