bq: should each replica get its own instance

By "instance" here I'm assuming you mean a JVM, i.e. running multiple JVMs on a single physical node (host).

"It Depends"(tm), of course. Each JVM has some overhead. What I've usually found is that a better question is "how much heap do I need to allocate?" The most common performance issue I see is GC-related, especially the "Solr runs fine, except occasionally we see long pauses" variety, which can result from stop-the-world GC pauses. Those pauses can lead to all sorts of issues, like followers going into recovery. The number of CPUs etc. on each node is also a consideration.

So here's my rule of thumb on where to start: start with, say, a 16G heap and run as many JVMs per node as it takes to accommodate the number of replicas you need. Say you have 4 replicas/node and they all run fine in one 16G heap: use one JVM. OTOH, if you discover that you need 64G to run all 4, consider 2 or even 4 JVMs. All of this is bounded by how beefy your machines are. If you have 256G of RAM, then 4 JVMs is reasonable. If you have 16G of physical RAM and 2 CPUs, well, you'd better only count on 1 JVM ;)

How much heap do you need? Nobody knows until you stress test. Here's a blog in case you haven't seen it before:
https://lucidworks.com/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

Finally, I use 16G as a starting point 'cause you have to start somewhere. I've seen heaps range from 4G to 80G (this latter with Azul Zing). If you have a test setup you can see where your sweet spot is.

Oh, one more thing. If you want to precisely control where each replica lands, the Collections API CREATE command can take createNodeSet=EMPTY, which sets up the collection state in ZooKeeper but does not add _any_ replicas. You then place each one with ADDREPLICA and the "node" parameter. That said, unless you're hosting a bunch of different collections it's usually just fine to let Solr place the replicas where it wants; it tries to distribute them evenly. And then there's the replica placement rules you can specify...
Best,
Erick

On Fri, Apr 20, 2018 at 4:38 AM, Bernd Fehling <bernd.fehl...@uni-bielefeld.de> wrote:
> Thanks Alessandro for the info.
>
> I am currently in the phase of finding the right setup with shards,
> nodes, replicas and so on.
> I have decided to begin with 5 hosts and want to set up 1 collection with 5
> shards.
> And start with 2 replicas per shard.
>
> But the next design question is, should each replica get its own instance?
>
> What will give better performance, all replicas in one Java instance or
> having one instance for each replica?
>
> What is your opinion?
>
> Regards
> Bernd
>
>
> On 20.04.2018 at 12:17, Alessandro Benedetti wrote:
>> Unless you use recent Solr 7.x features where replicas can have different
>> properties[1], each replica is functionally the same at Solr level.
>> Zookeeper will elect a leader among them (so temporarily a replica will have
>> more responsibilities) but (R1-R2-R3) does not really exist at Solr level.
>> It will just be Shard1 (ReplicaHost1, ReplicaHost2, ReplicaHost3).
>>
>> So you can't really shuffle anything at this level.
>>
>>
>> -----
>> ---------------
>> Alessandro Benedetti
>> Search Consultant, R&D Software Engineer, Director
>> Sease Ltd. - www.sease.io
>> --
>> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>>