SolrCloud does not come with any autoscaling functionality. If you want such a 
thing, you’ll need to write it yourself.

https://github.com/whitepages/solrcloud_manager might be a useful head start, 
though, particularly the “fill” and “cleancollection” commands. I don’t do 
*auto* scaling, but I do use it for all my cluster management, which 
certainly involves moving collections/shards around among nodes, adding 
capacity, and removing capacity.
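
If you do end up writing it yourself, the shard-picking part that ADDREPLICA 
leaves to you could look roughly like the sketch below. It is only an 
illustration under assumptions: the function names are hypothetical, the input 
is assumed to be the parsed JSON of a Collections API CLUSTERSTATUS call 
(cluster -> collections -> shards -> replicas, each replica with a "state"), 
and the base URL and node name in the usage note are placeholders. It does 
nothing about the race conditions Paul mentions, so a real script would need a 
single coordinator or some kind of lock.

```python
from urllib.parse import urlencode


def shard_needing_replica(cluster_status, collection):
    """Return the shard with the fewest active replicas (ties broken by name).

    cluster_status is assumed to be the parsed JSON body of a CLUSTERSTATUS
    request against the Collections API.
    """
    shards = cluster_status["cluster"]["collections"][collection]["shards"]

    def active_count(shard_name):
        replicas = shards[shard_name]["replicas"].values()
        return sum(1 for r in replicas if r.get("state") == "active")

    # Sorting first makes the choice deterministic when several shards tie.
    return min(sorted(shards), key=active_count)


def addreplica_url(base_url, collection, shard, node=None):
    """Build an ADDREPLICA request URL; base_url is e.g. http://host:8983/solr."""
    params = {"action": "ADDREPLICA", "collection": collection, "shard": shard}
    if node:
        # Optional: pin the new replica to a specific (newly added) node.
        params["node"] = node
    params["wt"] = "json"
    return f"{base_url}/admin/collections?{urlencode(params)}"
```

Usage would be something like: fetch CLUSTERSTATUS, call 
shard_needing_replica(status, "products"), then issue a GET against 
addreplica_url("http://localhost:8983/solr", "products", shard, node=new_node) 
once per replica you want to add, re-reading the cluster state in between.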


On 2/14/16, 11:17 AM, "McCallick, Paul" <paul.e.mccall...@nordstrom.com> wrote:

>These are excellent questions and give me a good sense of why you suggest 
>using the collections api.
>
>In our case we have 8 shards of product data with an even distribution of data 
>per shard, no hot spots. We have very different load at different points in 
>the year (Cyber Monday), and we tend to have very little traffic at night. I'm 
>thinking of two use cases:
>
>1) we are seeing increased latency due to load and want to add 8 more replicas 
>to handle the query volume.  Once the volume subsides, we'd remove the nodes. 
>
>2) we lose a node due to some unexpected failure (EC2 tends to do this). We 
>want auto scaling to detect the failure and add a node to replace the failed 
>one. 
>
>In both cases the CoreAdmin API makes it easy. It adds nodes to the shards evenly. 
>Otherwise we have to write a fairly involved script that is subject to race 
>conditions to determine which shard to add nodes to. 
>
>Let me know if I'm making dangerous or uninformed assumptions, as I'm new to 
>Solr. 
>
>Thanks,
>Paul
>
>> On Feb 14, 2016, at 10:35 AM, Susheel Kumar <susheel2...@gmail.com> wrote:
>> 
>> Hi Paul,
>> 
>> 
>> For auto-scaling, it depends on how you plan to design it and what/how
>> you want to scale. Which scenario do you think makes the CoreAdmin API
>> easy to use for a sharded SolrCloud environment?
>> 
>> Isn't it the case that in a sharded environment (assume 3 shards A, B & C),
>> if shard B has higher load, you want to add a replica for shard B to
>> distribute the load, or if a particular shard's replica goes down, you
>> want to add another replica back for that shard, in which case ADDREPLICA
>> requires a shard name?
>> 
>> Can you describe your scenario / provide more detail?
>> 
>> Thanks,
>> Susheel
>> 
>> 
>> 
>> On Sun, Feb 14, 2016 at 11:51 AM, McCallick, Paul <
>> paul.e.mccall...@nordstrom.com> wrote:
>> 
>>> Hi all,
>>> 
>>> 
>>> This doesn’t really answer the following question:
>>> 
>>> What is the suggested way to add a new node to a collection via the
>>> apis?  I  am specifically thinking of autoscale scenarios where a node has
>>> gone down or more nodes are needed to handle load.
>>> 
>>> 
>>> The coreadmin api makes this easy.  The collections api (ADDREPLICA)
>>> makes this very difficult.
>>> 
>>> 
>>>> On 2/14/16, 8:19 AM, "Susheel Kumar" <susheel2...@gmail.com> wrote:
>>>> 
>>>> Hi Paul,
>>>> 
>>>> Shawn is recommending that you use the Collections API
>>>> https://cwiki.apache.org/confluence/display/solr/Collections+API rather
>>>> than the CoreAdmin API
>>>> https://cwiki.apache.org/confluence/display/solr/CoreAdmin+API
>>>> for SolrCloud.
>>>> 
>>>> Hope that clarifies things. You mentioned ADDREPLICA, which is part of
>>>> the Collections API, so you are on the right track.
>>>> 
>>>> Thanks,
>>>> Susheel
>>>> 
>>>> 
>>>> 
>>>> On Sun, Feb 14, 2016 at 10:51 AM, McCallick, Paul <
>>>> paul.e.mccall...@nordstrom.com> wrote:
>>>> 
>>>>> Then what is the suggested way to add a new node to a collection via the
>>>>> apis?  I am specifically thinking of autoscale scenarios where a node
>>>>> has gone down or more nodes are needed to handle load.
>>>>> 
>>>>> Note that the ADDREPLICA endpoint requires a shard name, which puts the
>>>>> onus of how to scale out on the user. This can be challenging in an
>>>>> autoscale scenario.
>>>>> 
>>>>> Thanks,
>>>>> Paul
>>>>> 
>>>>>> On Feb 14, 2016, at 12:25 AM, Shawn Heisey <apa...@elyograg.org> wrote:
>>>>>> 
>>>>>>> On 2/13/2016 6:01 PM, McCallick, Paul wrote:
>>>>>>> - When creating a new collection, SolrCloud will use all available
>>>>>>> nodes for the collection, adding cores to each.  This assumes that you
>>>>>>> do not specify a replicationFactor.
>>>>>> 
>>>>>> The number of nodes that will be used is numShards multiplied by
>>>>>> replicationFactor.  The default value for replicationFactor is 1.  If
>>>>>> you do not specify numShards, there is no default -- the CREATE call
>>>>>> will fail.  The value of maxShardsPerNode can also affect the overall
>>>>>> result.
>>>>>> 
>>>>>>> - When adding new nodes to the cluster AFTER the collection is
>>>>>>> created, one must use the core admin api to add the node to the
>>>>>>> collection.
>>>>>> 
>>>>>> Using the CoreAdmin API is strongly discouraged when running
>>>>>> SolrCloud.  It works, but it is an expert API when in cloud mode, and
>>>>>> can cause serious problems if not used correctly.  Instead, use the
>>>>>> Collections API.  It can handle all normal maintenance needs.
>>>>>> 
>>>>>>> I would really like to see the second case behave more like the
>>>>>>> first.  If I add a node to the cluster, it is automatically used as a
>>>>>>> replica for existing clusters without my having to do so.  This would
>>>>>>> really simplify things.
>>>>>> 
>>>>>> I've added a FAQ entry to address why this is a bad idea.
>>>>>> https://wiki.apache.org/solr/FAQ#Why_doesn.27t_SolrCloud_automatically_create_replicas_when_I_add_nodes.3F
>>>>>> 
>>>>>> Thanks,
>>>>>> Shawn
>>> 
