Re: SolrCloud increase replication factor

2016-05-23 Thread Hendrik Haddorp
Hi Tom,

The pointer to rule-based placement was indeed what I was missing! I
simply had to add the rule "shard:*,replica:<2,node:*", as documented,
and my replicas now get distributed as expected :-)

thanks,
Hendrik
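
For readers following along: the rule above can be attached to an existing
collection through MODIFYCOLLECTION, as mentioned in the reply quoted in this
thread. Below is a minimal sketch that only builds the request URL and sends
nothing. The host "localhost:8983" and the collection name "mycollection" are
placeholders, not values from this thread.

```python
from urllib.parse import urlencode


def modify_rule_url(base_url, collection, rule):
    """Build a Collections API URL that sets a placement rule on a collection."""
    params = {
        "action": "MODIFYCOLLECTION",
        "collection": collection,
        "rule": rule,
        "wt": "json",
    }
    # urlencode percent-encodes ':', '*', ',' and '<' in the rule string.
    return base_url.rstrip("/") + "/admin/collections?" + urlencode(params)


# Hypothetical host and collection name, for illustration only.
url = modify_rule_url(
    "http://localhost:8983/solr", "mycollection", "shard:*,replica:<2,node:*"
)
print(url)
```

The resulting URL can be requested with any HTTP client. The rule only
influences where replicas are placed later; it does not move existing ones.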




Re: SolrCloud increase replication factor

2016-05-23 Thread Erick Erickson
About (1), bq: The Solr Admin UI showed that my replication factor
changed but otherwise nothing happened.

This is as designed AFAIK. There's nothing built into Solr to
_automatically_ add replicas when this property is changed. My guess
is that the MODIFYCOLLECTION code was written to help with editing the
ZK nodes, i.e. make it unnecessary to hand-edit the ZK nodes to change
things like replication factor without recreating a collection. I've
modified the ref guide page to make this more explicit.
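
Since changing the property does not create replicas by itself, a script has
to work out what is still missing. Here is a small sketch, assuming a dict
shaped like the per-collection part of a CLUSTERSTATUS response (cluster ->
collections -> <name>). The sample data is invented for illustration.

```python
def missing_replicas(collection_state, target_rf=None):
    """Return {shard: number of replicas still needed to reach the target}."""
    if target_rf is None:
        # Assumption: the state reports replicationFactor as a string or an int.
        target_rf = int(collection_state["replicationFactor"])
    deficits = {}
    for shard, info in collection_state["shards"].items():
        have = len(info["replicas"])
        if have < target_rf:
            deficits[shard] = target_rf - have
    return deficits


# Invented sample state: shard1 has one replica, shard2 already has two.
state = {
    "replicationFactor": "2",
    "shards": {
        "shard1": {"replicas": {"core_node1": {"node_name": "host1:8983_solr"}}},
        "shard2": {"replicas": {
            "core_node2": {"node_name": "host2:8983_solr"},
            "core_node3": {"node_name": "host1:8983_solr"},
        }},
    },
}
print(missing_replicas(state))  # {'shard1': 1}
```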

About (2)... I agree. If you can reliably repeat this (or even better,
come up with a test case) it would be worth a JIRA I think.

Best,
Erick
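
As a possible aid for building such a test case, the check below reports
shards that have more than one replica on the same node. It is only a sketch:
it assumes the same CLUSTERSTATUS-like dict layout (shards -> replicas ->
node_name), and the sample data is made up.

```python
from collections import Counter


def colocated_shards(collection_state):
    """Return {shard: [nodes hosting more than one replica of that shard]}."""
    problems = {}
    for shard, info in collection_state["shards"].items():
        counts = Counter(r["node_name"] for r in info["replicas"].values())
        crowded = sorted(node for node, n in counts.items() if n > 1)
        if crowded:
            problems[shard] = crowded
    return problems


# Invented sample: both replicas of shard1 sit on host1, shard2 is spread out.
sample = {
    "shards": {
        "shard1": {"replicas": {
            "core_node1": {"node_name": "host1:8983_solr"},
            "core_node5": {"node_name": "host1:8983_solr"},
        }},
        "shard2": {"replicas": {
            "core_node2": {"node_name": "host2:8983_solr"},
            "core_node6": {"node_name": "host1:8983_solr"},
        }},
    },
}
print(colocated_shards(sample))  # {'shard1': ['host1:8983_solr']}
```

Running this after each ADDREPLICA call in a test loop would show whether
the placement problem can be reproduced.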





Re: SolrCloud increase replication factor

2016-05-23 Thread Hendrik Haddorp
What I find odd is that creating a collection with a replication factor
greater than 1 does not seem to end up with replicas on the same node.
However, when one wants to add replicas later on, one needs to do the whole
placement manually to avoid single points of failure.




Re: SolrCloud increase replication factor

2016-05-23 Thread Jeff Wartes

https://github.com/whitepages/solrcloud_manager was designed to make common
kinds of cluster operations easier.
It hasn’t been tested with 6.0 though, so if you try it, please let me know
how it goes.





Re: SolrCloud increase replication factor

2016-05-23 Thread Tom Evans


With ADDREPLICA, you can specify the node to create the replica on. If
you are using a script to add/remove replicas, you can simply
incorporate the logic you desire into your script. You can also use
CLUSTERSTATUS to get a list of nodes/collections/shards etc. in order
to inform the logic in the script. This is the approach we took: we
have a fabric script to add/remove extra nodes to/from the cluster, and it
works well.

The alternative is to put the logic into Solr itself, using placement
rules (evaluated against node attributes supplied by what Solr calls a
"snitch") to define where replicas are created. The rules are specified
at collection creation time, or you can use MODIFYCOLLECTION to set them
after the fact. See this wiki page for details:

https://cwiki.apache.org/confluence/display/solr/Rule-based+Replica+Placement

Cheers

Tom
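
The script-based approach described above could look roughly like the sketch
below. This is not the fabric script mentioned in the message. It is an
illustration that takes a dict shaped like the "cluster" section of a
CLUSTERSTATUS response and builds one ADDREPLICA URL per shard, each targeting
a live node that holds no replica of that shard yet. The "least-loaded node
first" tie-break, the host names, and the collection name are all assumptions.

```python
from urllib.parse import urlencode


def plan_add_replicas(base_url, collection, cluster):
    """Return ADDREPLICA URLs, one per shard, each on a node new to that shard."""
    live = sorted(cluster["live_nodes"])
    shards = cluster["collections"][collection]["shards"]

    # Count how many replicas of this collection each live node already hosts.
    load = {node: 0 for node in live}
    for info in shards.values():
        for replica in info["replicas"].values():
            if replica["node_name"] in load:
                load[replica["node_name"]] += 1

    urls = []
    for shard in sorted(shards):
        used = {r["node_name"] for r in shards[shard]["replicas"].values()}
        candidates = [n for n in live if n not in used]
        if not candidates:
            continue  # every live node already holds this shard
        node = min(candidates, key=lambda n: (load[n], n))
        load[node] += 1
        params = {
            "action": "ADDREPLICA",
            "collection": collection,
            "shard": shard,
            "node": node,
        }
        urls.append(base_url.rstrip("/") + "/admin/collections?" + urlencode(params))
    return urls


# Invented two-node cluster with one replica per shard.
cluster = {
    "live_nodes": ["host1:8983_solr", "host2:8983_solr"],
    "collections": {"mycollection": {"shards": {
        "shard1": {"replicas": {"core_node1": {"node_name": "host1:8983_solr"}}},
        "shard2": {"replicas": {"core_node2": {"node_name": "host2:8983_solr"}}},
    }}},
}
for u in plan_add_replicas("http://localhost:8983/solr", "mycollection", cluster):
    print(u)
```

The function only plans the calls; issuing them and re-reading CLUSTERSTATUS
afterwards to confirm the placement is left to the surrounding script.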


SolrCloud increase replication factor

2016-05-23 Thread Hendrik Haddorp
Hi,

I have a SolrCloud 6.0 setup and created my collection with a
replication factor of 1. Now I want to increase the replication factor
but would like the replicas for the same shard to be on different nodes,
so that my collection does not fail when one node fails. I tried two
approaches so far:

1) When I use the Collections API with the MODIFYCOLLECTION action [1], I
can set the replication factor, but that did not result in the creation
of additional replicas. The Solr Admin UI showed that my replication
factor had changed, but otherwise nothing happened. Reloading the
collection also made no difference.

2) Using the ADDREPLICA action [2] from the Collections API, I have to
add the replicas to each shard individually, which is a bit more
complicated but otherwise worked. During testing, however, this at
least once resulted in the replica being created on the same node. My
collection was split into 4 shards, and for 2 of them all replicas ended up
on the same node.

So is the only option to create the replicas manually and also pick the
nodes manually, or is the observed behavior wrong?

regards,
Hendrik

[1]
https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-modifycoll
[2]
https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api_addreplica