Hi, I'm looking for some advice on improving the performance of our Solr setup, in particular about the trade-offs between using fewer, larger machines versus more, smaller machines. Our full index has just over 100 million docs, and almost all of our searches use filter queries (fq, with q=*:*) plus facets. We are using Solr 8.3.
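A typical request looks roughly like the one below (the host, collection, and field names here are made up for illustration; real queries usually carry several fq's and a couple of facet fields):

  curl 'http://solr-host:8983/solr/mycollection/select' \
    --data-urlencode 'q=*:*' \
    --data-urlencode 'fq=category:books' \
    --data-urlencode 'fq=price:[10 TO 100]' \
    --data-urlencode 'facet=true' \
    --data-urlencode 'facet.field=author' \
    --data-urlencode 'rows=20'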
Currently I have a SolrCloud setup with 2 physical machines (let's call them A and B). The index is divided into 2 shards, each with 2 replicas, so each machine holds a full copy of the index. The nodes and replicas are laid out as follows:

  Machine A: core_node3 / shard1_replica_n1, core_node7 / shard2_replica_n4
  Machine B: core_node5 / shard1_replica_n2, core_node8 / shard2_replica_n6

My ZooKeeper ensemble has 3 instances. For most of the searches we do, results come back from both shards in the same search. My experiments indicate that our setup is CPU-bound.

Due to cost constraints, I could either double the CPUs in each of the 2 existing machines, or move to a 4-machine setup (machines of the current size) with 2 shards and 4 replicas (or 4 shards with 4 replicas). I assume that keeping a full copy of the index on every machine will allow searches to be distributed evenly across all of them. Does anyone have any insights on which option would be better for maximizing throughput when many searches run at the same time?

thanks!
Reinaldo
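P.S. To make the two layouts I'm comparing concrete, here is roughly how I would create each with the Collections API (collection name and host are placeholders):

  # current layout: 2 nodes, 2 shards, 2 replicas per shard
  curl 'http://solr-host:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=2&replicationFactor=2&maxShardsPerNode=2'

  # 4-node option: 2 shards, 4 replicas per shard, so every node still holds a full copy
  curl 'http://solr-host:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=2&replicationFactor=4&maxShardsPerNode=2'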