Steve,

No doubt I confused you.  I'm confused myself...

When I said replica set, what I was referring to was one of the three replicas 
of the data, each of which needs to be in a different AZ.

> What is a "replica set”?  And why does each instance of Solr (referred to in 
> the reference guide as a “node”, BTW) running on a server need to be in the 
> same “replica set”?

What I should have said is that each node on the server (there are three nodes 
per server) needs to be in the same AZ.

The bottom line is that I'm confused by the documentation and its references to 
replicas. Normally, when the documentation refers to replicas, it means the 
number of times you want the data replicated, as in replication factor.  That's 
where the confusion was for me.

So let me ask this simple question.

If I want to create a rule that ensures my replication factor of three 
correctly places the data across three AZs, so that if I were to lose one or 
even two AZs in AWS, Solr would still have one or two copies of the data, how 
would that rule work?



On 9/25/18, 10:17 AM, "Steve Rowe" <sar...@gmail.com> wrote:

    Hi Chuck, see my replies inline below:
    
    > On Sep 25, 2018, at 11:21 AM, Chuck Reynolds <creyno...@ancestry.com> 
wrote:
    > 
    > So we have 90 servers in AWS, 30 servers per AZ.
    > 90 shards for the cluster.
    > Each server has three instances of Solr running on it so every instance 
on the server has to be in the same replica set.
    
    You lost me here.  What is a "replica set”?  And why does each instance of 
Solr (referred to in the reference guide as a “node”, BTW) running on a server 
need to be in the same “replica set”?
    
    (I’m guessing you have theorized that “replica:3” is a way of referring to 
"replica set #3”, but that’s incorrect; “replica:3” means that exactly 3 
replicas must be placed on the bucket of nodes you specify in the rule; more 
info below.)
    
    > So for example shard 1 will have three replicas and each replica needs to 
be in a separate AZ.
    
    Okay, I understand this part, but I fail to see how this is an example of 
your “replica set” assertion above.
    
    > So does the rule of replica:>2 work?
    
    I assume you did not mean ^^ literally, since you wrote “>” where I wrote 
“<“ in my previous response. 
    
    I checked offline with Noble Paul, who wrote the rule-based replica 
placement feature, and he corrected a misunderstanding of mine:
    
    > On 9/25/18, 9:08 AM, "Steve Rowe" <sar...@gmail.com> wrote:
    
    > So you could specify “replica:<2”, which means that no node can host more 
than one replica, but it's acceptable for a node to host zero replicas.
    
    But ^^ is incorrect. 
    
    “replica:<2” means that either zero or one replica of each shard of the 
collection to be created may be hosted on the bucket of *all* of the nodes that 
have the specified AWSAZ sysprop value.  That is, when placing replicas, Solr 
will put either zero or one replica on one of the nodes in the bucket.  And 
AFAICT that’s not exactly what you want, since zero replicas of a shard on an AZ 
is not acceptable. 
    
    > I just need all of the servers in an AZ to be in the same replica.  Does 
that make sense?
    
    I’m not sure?  This sounds like something different from your above 
example: "shard 1 will have three replicas and each replica needs to be in a 
separate AZ.”
    
    If you mean “exactly one Solr instance in an AZ must host exactly one 
replica of each shard of the collection”, then yes, that makes sense :).
    
    Okay, one more try :) - here are the rules that should do the trick for you 
(i.e., what I wrote in the previous sentence):
    
    -----
     rule=shard:*,replica:1,sysprop.AWSAZ:AZ1
    &rule=shard:*,replica:1,sysprop.AWSAZ:AZ2
    &rule=shard:*,replica:1,sysprop.AWSAZ:AZ3
    -----
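    To make the mechanics concrete, here is a sketch in Python of how those 
three rules would be assembled into a single Collections API CREATE request.  
The collection name, shard count, and host below are placeholders I made up; 
note that each rule goes in its own "rule" parameter, and each rule value must 
be URL-encoded because it contains ':' and ',':

```python
from urllib.parse import urlencode

# Placeholders: "myColl", numShards=90, and the host are assumptions;
# substitute your own values.
params = [
    ("action", "CREATE"),
    ("name", "myColl"),
    ("numShards", "90"),
    ("replicationFactor", "3"),
    # One rule per AZ: every shard gets exactly one replica in each AZ.
    ("rule", "shard:*,replica:1,sysprop.AWSAZ:AZ1"),
    ("rule", "shard:*,replica:1,sysprop.AWSAZ:AZ2"),
    ("rule", "shard:*,replica:1,sysprop.AWSAZ:AZ3"),
]
# urlencode() percent-encodes the ':' and ',' characters in each rule value
# and repeats the "rule" key once per rule.
url = "http://localhost:8983/solr/admin/collections?" + urlencode(params)
print(url)
```

    (A curl call would work the same way: pass each rule as its own rule= 
parameter, URL-encoded.)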
    
    --
    Steve
    
http://www.lucidworks.com
    
    > On 9/25/18, 9:08 AM, "Steve Rowe" <sar...@gmail.com> wrote:
    > 
    >    Chuck,
    > 
    >    The default Snitch is the one that’s used if you don’t specify one in 
a rule.  The sysprop.* tag is provided by the default Snitch.
    > 
    >    The only thing that seems wrong to me in your rules is “replica:1”, 
“replica:2”, and “replica:3” - these say that exactly one, two, and three 
replicas of each shard, respectively, must be on each of the nodes that has the 
respective sysprop value.
    > 
    >    Since these rules will apply to all nodes that match the sysprop 
value, you have to allow for the possibility that some nodes will have *zero* 
replicas of a shard.  So you could specify “replica:<2”, which means that no 
node can host more than one replica, but it's acceptable for a node to host 
zero replicas.
    > 
    >    Did you set system property AWSAZ on each Solr node with an 
appropriate value?
    > 
    >    --
    >    Steve
    >    
http://www.lucidworks.com
    > 
    >> On Sep 25, 2018, at 10:39 AM, Chuck Reynolds <creyno...@ancestry.com> 
wrote:
    >> 
    >> Steve,
    >> 
    >> I wasn't able to get the sysprop to work.  I think maybe there is a 
disconnect on my part.
    >> 
    >> From the documentation it looks like I can only use the sysprop tag if 
I'm using a Snitch.  Is that correct?
    >> 
    >> I can't find any example of anyone using the default Snitch.
    >> 
    >> Here is what I have for my rule:
    >> 
rule=shard:*,replica:1,sysprop.AWSAZ:AZ1&rule=shard:*,replica:2,sysprop.AWSAZ:AZ2&rule=shard:*,replica:3,sysprop.AWSAZ:AZ3
    >> 
    >> I'm not specifying a snitch.  Is that my problem or is there a problem 
with my rule?
    >> 
    >> Thanks for your help.
    >> On 9/21/18, 2:40 PM, "Steve Rowe" <sar...@gmail.com> wrote:
    >> 
    >>   Hi Chuck,
    >> 
    >>   One way to do it is to set a system property on the JVM running each 
Solr node, corresponding to the AWS availability zone on which the node is 
hosted.
    >> 
    >>   For example, you could use sysprop “AWSAZ”, then use rules like:
    >> 
    >>      replica:<2,sysprop.AWSAZ:us-east-1
    >>      replica:<2,sysprop.AWSAZ:us-west-1
    >>      replica:<2,sysprop.AWSAZ:ca-central-1
    >> 
    >>   --
    >>   Steve
    >>   
http://www.lucidworks.com
    >> 
    >>> On Sep 21, 2018, at 4:07 PM, Chuck Reynolds <creyno...@ancestry.com> 
wrote:
    >>> 
    >>> I'm using Solr 6.6 and I want to create a 90 node cluster with a 
replication
    >>> factor of three.  I'm using AWS EC2 instances and I have a requirement 
to
    >>> replicate the data into 3 AWS availability zones.  
    >>> 
    >>> So 30 servers in each zone and I don't see a create collection rule that
    >>> will put one replica in each of the three zones.
    >>> 
    >>> What am I missing?
    >>> 
    >>> 
    >>> 
    >>> --
    >>> Sent from: 
http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
    
    
