On 8/3/2020 12:04 PM, Mathew Mathew wrote:
Have been looking for architectural guidance on correctly configuring SolrCloud
on Public Cloud (eg Azure/AWS)
In particular the zookeeper based autoscaling seems to overlap with the auto
scaling capabilities of cloud platforms.
I have the following questions.
1. Should the ZooKeeper ensemble be put in an autoscaling group? This
seems to be a no, since the SolrNodes need to register against a static
list of ZooKeeper IPs.
Correct. There are features in ZK 3.5 for dynamic server membership,
but in general it is better to have a static list. The client must be
upgraded as well for that feature to work. The ZK client was upgraded
to a 3.5 version in Solr 8.2.0. I don't think we have done any testing
of the dynamic membership feature.
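To make the "static list" concrete, here is a sketch of how the ensemble
is usually wired into Solr via solr.in.sh (the hostnames and the /solr
chroot are hypothetical placeholders, not anything from this thread):

```shell
# solr.in.sh (sketch) -- hostnames below are placeholders.
# Every Solr node points at the same static ensemble string; the optional
# /solr suffix is a ZooKeeper chroot that keeps Solr's data in its own subtree.
ZK_HOST="zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181/solr"
```

Because every node carries the same string, replacing a failed ZK server
is easiest when the new server reuses the old address, which is another
reason an autoscaling group is an awkward fit.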
ZK is generally best set up with either 3 or 5 servers, depending on the
level of redundancy desired, and left alone unless there's a problem.
With 3 servers, the ensemble can survive the failure of 1 server. With
5, it can survive the failure of 2. As far as I know, getting back to
full redundancy is best handled as a manual process, even if running
version 3.5.
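The redundancy numbers above are just majority quorum: an ensemble of N
servers stays available while floor(N/2)+1 members are up, so it
tolerates floor((N-1)/2) failures. A quick sketch:

```shell
# Majority quorum: an ensemble of N ZooKeeper servers needs floor(N/2)+1
# members up, so it survives floor((N-1)/2) failures.
for n in 3 5; do
  echo "servers=$n quorum=$(( n / 2 + 1 )) tolerated_failures=$(( (n - 1) / 2 ))"
done
# prints:
# servers=3 quorum=2 tolerated_failures=1
# servers=5 quorum=3 tolerated_failures=2
```

This is also why even-sized ensembles buy nothing: 4 servers tolerate the
same single failure as 3, while adding one more machine that can fail.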
2. Should the SolrNodes be put in an AutoScaling group? Or should we just
launch/register SolrNodes using a lambda function/Azure function?
That really depends on what you're doing. There is no "one size fits
most" configuration.
I personally would avoid setting things up in a way that results in Solr
nodes automatically being added or removed. Adding a node will
generally result in a LOT of data being copied, and that can impact
performance in a major way, so adding nodes should be scheduled to
minimize impact. If it's automatic in response to high load, adding a
node can make performance a lot worse before it gets better. When a
node disappears, manual action is required for SolrCloud to forget the node.
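That manual cleanup is normally done through the Collections API. A
hedged sketch using the DELETENODE action, which drops every replica
that lived on the vanished node (host, port, and node name below are
placeholders you would adjust):

```shell
# Placeholders: point at any live Solr node, and set the dead node's
# name in Solr's host:port_solr format.
DEAD_NODE="10.0.1.23:8983_solr"
URL="http://localhost:8983/solr/admin/collections?action=DELETENODE&node=${DEAD_NODE}"
echo "$URL"        # inspect the request before sending it
# curl -s "$URL"   # uncomment to actually tell SolrCloud to forget the node
```

Until something like this is run, SolrCloud keeps expecting the node to
return, which is part of why fully automatic node removal is risky.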
3. Should the SolrNodes be associated with local storage or should they be
attached to shared storage volumes.
Lucene (which provides most of Solr's functionality) generally does not
like to work with shared storage. In addition to potential latency
issues for storage connected via a network, Lucene works extremely hard
to ensure that only one process can open an index. Using shared storage
will encourage attempts to share the index directory between multiple
processes, which almost always fails to work.
Things work best with locally attached storage using a fast connection
method (like SATA or SCSI) and a local filesystem. Lucene uses some
pretty involved file-locking mechanisms, which often do not work well on
remote or shared filesystems.
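For reference, the lock implementation is configurable per core in
solrconfig.xml; a sketch of the relevant fragment ("native" OS-level
locking is the usual default, and it is exactly the kind of lock that
tends to misbehave on NFS and other shared filesystems):

```xml
<indexConfig>
  <!-- "native" = OS-level file locking. Often unreliable on shared or
       remote filesystems, which is one reason local storage is preferred. -->
  <lockType>${solr.lock.type:native}</lockType>
</indexConfig>
```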
---
We (the developers that build this software) generally have a very
near-sighted view of things, not really caring about details like the
hardware deployment. That probably needs to change a little bit,
particularly when it comes to documentation.
Thanks,
Shawn