Hello Solr Users,

I am starting a project to migrate a very large number of Solr 6 servers
(master/slave) to Solr 8.x SolrCloud.  The current query approach uses
distributed search, specifying the appropriate Solr 6 shards for each
query.

One idea for keeping the current query strategy while allowing a
zero-downtime transition from Solr 6 cores to Solr 8 collections is to
continue using distributed search, but to specify shards from both the
Solr 6 deployment and the Solr 8 SolrCloud collections.  Once the Solr 8
collections are fully populated, the Solr 6 shards could be dropped from
the query.  (One drawback is that queries will eventually need to change
from specifying shards to simply specifying collections.)  Based on
testing, this approach works if the query is made against the Solr 6
server.

For example, this query works, with port 8642 being Solr 6 and port 8983
being Solr 8 SolrCloud:

http://localhost:8642/solr/testcore/select?shards=http://localhost:8983/solr/sample_techproducts_shard2_replica_n2,http://localhost:8983/solr/sample_techproducts_shard1_replica_n1,http://localhost:8642/solr/testcore&indent=on&q=*:*&wt=json&fl=id&start=0&rows=100
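
For anyone who wants to experiment with the mixed shard list, here is a
minimal Python sketch of how such a query URL could be assembled.  The
core URLs and query parameters are copied from the query above; the
function name build_select_url and the two list names are hypothetical,
chosen only for illustration, and this is not meant as the definitive way
to build these requests.

```python
from urllib.parse import urlencode

# Core URLs copied from the example query: 8642 is Solr 6, 8983 is Solr 8.
SOLR6_SHARDS = ["http://localhost:8642/solr/testcore"]
SOLR8_SHARDS = [
    "http://localhost:8983/solr/sample_techproducts_shard2_replica_n2",
    "http://localhost:8983/solr/sample_techproducts_shard1_replica_n1",
]


def build_select_url(entry_core, shards, q="*:*", fl="id", start=0, rows=100):
    """Build a /select URL that fans out to an explicit list of shards.

    entry_core is the core that receives the request; shards is the list
    of core URLs passed in the shards parameter.
    """
    params = {
        "shards": ",".join(shards),
        "indent": "on",
        "q": q,
        "wt": "json",
        "fl": fl,
        "start": start,
        "rows": rows,
    }
    # Leave ':', '/', ',' and '*' unescaped so the URL stays readable and
    # matches the hand-written form of the query.
    return entry_core + "/select?" + urlencode(params, safe=":/,*")


# During the transition: Solr 8 and Solr 6 shards in one query.
mixed_url = build_select_url(SOLR6_SHARDS[0], SOLR8_SHARDS + SOLR6_SHARDS)

# After the Solr 8 collections are fully populated: Solr 8 shards only.
solr8_only_url = build_select_url(SOLR6_SHARDS[0], SOLR8_SHARDS)
```

Keeping the two shard lists separate means that dropping the Solr 6
shards later is a change to one argument, not to every query string.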

However, after disabling the shard whitelist, an IO error occurs when the
query is made to the Solr 8 SolrCloud servers.  My guess is that
SolrCloud, at least by default, requires that all shards/cores be known to
ZooKeeper.

I am not sold on this approach, but as part of my research I want to
narrow in on the best solution.

A couple of questions:

1. Is there a way to configure searches against Solr 8 SolrCloud to include
shards not known to ZooKeeper (I am guessing not)?

2. While I am working on some ideas, I am curious about other approaches
to a zero-downtime transition from Solr 6 (master/slave) to Solr 8.x
SolrCloud over a very, very large data set.

Thanks,
Matt
