bq: should each replica get its own instance

By "instance" here I'm assuming you mean a JVM, i.e. running multiple
JVMs on a single physical node (host).

"It Depends"(tm) of course. Each JVM has some overhead. What I've usually
found is that a better question is "how much heap do I need to allocate?"
The most common performance issue I see is GC-related, especially of the
"Solr runs fine, except occasionally we see long pauses" variety, which
can result from stop-the-world GC pauses. This can lead to all sorts of
issues, like followers going into recovery and the like.

There's also some consideration for how many CPUs etc. are on each node.
So here's my rule of thumb on where to start: start with, say, a 16G heap
and run as many JVMs per node as it takes to accommodate the number of
replicas you need. So say you have 4 replicas/node and they all run fine
in one 16G heap. Use one JVM.

OTOH, say you discover that you need 64G to run all 4. Consider 2 or even 4 JVMs.

All bounded by how beefy your machines are. If you have 256G RAM then 4 JVMs
is reasonable. If you have 16G of physical RAM and 2 CPUs, well, you'd better
only count on 1 JVM ;)
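
To make the rule of thumb above concrete, here is a rough Python sketch. The
function name, its parameters, and especially the os_cache_fraction default
are illustrative assumptions and not anything Solr provides. It only
reproduces the arithmetic of the examples above; it is not a sizing tool.

```python
import math


def jvms_per_node(total_heap_gb, heap_per_jvm_gb=16,
                  physical_ram_gb=None, os_cache_fraction=0.5):
    """Rough starting point for how many JVMs to run on one node.

    total_heap_gb:     heap that all replicas on the node need together
                       (only a stress test can tell you this number).
    heap_per_jvm_gb:   heap you are willing to give a single JVM.
    physical_ram_gb:   RAM on the node, or None to skip the RAM check.
    os_cache_fraction: ASSUMPTION, not from the thread: share of RAM left
                       outside the Java heaps for the OS and its file cache.
    """
    # Enough JVMs that no single heap exceeds heap_per_jvm_gb.
    wanted = max(1, math.ceil(total_heap_gb / heap_per_jvm_gb))
    if physical_ram_gb is None:
        return wanted
    # Never plan more heaps than the machine can hold; always at least 1 JVM.
    heap_budget_gb = physical_ram_gb * (1 - os_cache_fraction)
    affordable = max(1, int(heap_budget_gb // heap_per_jvm_gb))
    return min(wanted, affordable)


if __name__ == "__main__":
    print(jvms_per_node(16))                      # 4 replicas fit in one 16G heap
    print(jvms_per_node(64, physical_ram_gb=256))  # 64G needed on a 256G box
    print(jvms_per_node(64, physical_ram_gb=16))   # small box: stay with one JVM
```

Treat the result as a place to begin testing, nothing more. The CPU count is
deliberately left out because I have no number to offer for it.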

How much heap do you need? Nobody knows until you stress test. Here's a blog
in case you haven't seen it before:
https://lucidworks.com/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

Finally, I use 16G as a starting point, 'cause you have to start somewhere.
I've seen heaps range from 4G to 80G (this latter with Azul Zing). If you
have a test setup you can see where your sweet spot is.

Oh, one more thing. If you want to precisely control where each replica
lands, the Collections API CREATE command can take createNodeSet=EMPTY,
which sets up the collection state in ZooKeeper but does not add _any_
replicas. You then place each one with ADDREPLICA and the "node"
parameter. That said, unless you're hosting a bunch of different
collections it's usually just fine to let Solr place the replicas where
it wants; it tries to distribute them evenly. And then there are the
replica placement rules you can specify...
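
As a sketch of the manual placement described above, the following Python
only builds the Collections API URLs and sends nothing. The base URL, the
collection name "mycoll", the host names, and the round-robin layout are all
hypothetical. The node names assume the usual host:port_solr form and the
shard names assume the default shard1..shardN naming. Parameters such as the
configset are left out.

```python
from urllib.parse import urlencode

SOLR = "http://host1:8983/solr"  # hypothetical base URL of any node


def create_empty_collection_url(base, name, num_shards):
    """CREATE with createNodeSet=EMPTY: cluster state only, no replicas."""
    params = {"action": "CREATE", "name": name,
              "numShards": num_shards, "createNodeSet": "EMPTY"}
    return f"{base}/admin/collections?{urlencode(params)}"


def add_replica_url(base, collection, shard, node):
    """ADDREPLICA pinned to one node via the "node" parameter."""
    params = {"action": "ADDREPLICA", "collection": collection,
              "shard": shard, "node": node}
    return f"{base}/admin/collections?{urlencode(params)}"


def plan_placement(nodes, num_shards, replicas_per_shard):
    """Hypothetical layout: replica r of shard i goes to node (i + r) mod N."""
    plan = []
    for i in range(num_shards):
        for r in range(replicas_per_shard):
            plan.append((f"shard{i + 1}", nodes[(i + r) % len(nodes)]))
    return plan


if __name__ == "__main__":
    # 5 hosts, 5 shards, 2 replicas per shard, as in the question below.
    nodes = [f"host{n}:8983_solr" for n in range(1, 6)]
    print(create_empty_collection_url(SOLR, "mycoll", 5))
    for shard, node in plan_placement(nodes, 5, 2):
        print(add_replica_url(SOLR, "mycoll", shard, node))
```

You'd issue the CREATE call first and then each ADDREPLICA call with
whatever HTTP client you like. With this particular layout every host ends
up with two replicas from two different shards.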

Best,
Erick

On Fri, Apr 20, 2018 at 4:38 AM, Bernd Fehling
<bernd.fehl...@uni-bielefeld.de> wrote:
> Thanks Alessandro for the info.
>
> I am currently trying to find the right setup with shards,
> nodes, replicas and so on.
> I have decided to begin with 5 hosts and want to set up 1 collection
> with 5 shards, starting with 2 replicas per shard.
>
> But the next design question is, should each replica get its own instance?
>
> What will give better performance, all replicas in one java instance or
> having one instance for each replica?
>
> What is your opinion?
>
> Regards
> Bernd
>
>
> Am 20.04.2018 um 12:17 schrieb Alessandro Benedetti:
>> Unless you use recent Solr 7.x features where replicas can have different
>> properties[1], each replica is functionally the same at Solr level.
>> ZooKeeper will elect a leader among them (so one replica will temporarily
>> have more responsibilities) but (R1-R2-R3) does not really exist at Solr level.
>> It will just be Shard1 (ReplicaHost1, ReplicaHost2, ReplicaHost3).
>>
>> So you can't really shuffle anything at this level.
>>
>> -----
>> ---------------
>> Alessandro Benedetti
>> Search Consultant, R&D Software Engineer, Director
>> Sease Ltd. - www.sease.io
