SolrCloud does not come with any autoscaling functionality. If you want such a thing, you’ll need to write it yourself.
https://github.com/whitepages/solrcloud_manager might be a useful head start though, particularly the “fill” and “cleancollection” commands. I don’t do *auto* scaling, but I do use this for all my cluster management, which certainly involves moving collections/shards around among nodes, adding capacity, and removing capacity.

On 2/14/16, 11:17 AM, "McCallick, Paul" <paul.e.mccall...@nordstrom.com> wrote:

>These are excellent questions and give me a good sense of why you suggest
>using the collections api.
>
>In our case we have 8 shards of product data with an even distribution of data
>per shard, no hot spots. We have very different load at different points in
>the year (cyber monday), and we tend to have very little traffic at night. I'm
>thinking of two use cases:
>
>1) we are seeing increased latency due to load and want to add 8 more replicas
>to handle the query volume. Once the volume subsides, we'd remove the nodes.
>
>2) we lose a node due to some unexpected failure (ec2 tends to do this). We
>want auto scaling to detect the failure and add a node to replace the failed
>one.
>
>In both cases the core api makes it easy. It adds nodes to the shards evenly.
>Otherwise we have to write a fairly involved script that is subject to race
>conditions to determine which shard to add nodes to.
>
>Let me know if I'm making dangerous or uninformed assumptions, as I'm new to
>solr.
>
>Thanks,
>Paul
>
>> On Feb 14, 2016, at 10:35 AM, Susheel Kumar <susheel2...@gmail.com> wrote:
>>
>> Hi Paul,
>>
>>
>> For auto-scaling, it depends on how you plan to design it and what/how
>> you want to scale. Which scenario do you think makes the coreadmin API easy
>> to use for a sharded SolrCloud environment?
>>
>> Isn't it the case that in a sharded environment (assume 3 shards A, B & C),
>> if shard B has higher load, you want to add a replica for shard B to
>> distribute the load, or if a particular shard replica goes down, you
>> want to add another replica back for that shard, in which case ADDREPLICA
>> requires a shard name?
>>
>> Can you describe your scenario / provide more detail?
>>
>> Thanks,
>> Susheel
>>
>>
>>
>> On Sun, Feb 14, 2016 at 11:51 AM, McCallick, Paul <
>> paul.e.mccall...@nordstrom.com> wrote:
>>
>>> Hi all,
>>>
>>>
>>> This doesn’t really answer the following question:
>>>
>>> What is the suggested way to add a new node to a collection via the
>>> apis? I am specifically thinking of autoscale scenarios where a node has
>>> gone down or more nodes are needed to handle load.
>>>
>>>
>>> The coreadmin api makes this easy. The collections api (ADDREPLICA)
>>> makes this very difficult.
>>>
>>>
>>>> On 2/14/16, 8:19 AM, "Susheel Kumar" <susheel2...@gmail.com> wrote:
>>>>
>>>> Hi Paul,
>>>>
>>>> Shawn is referring to using the Collections API
>>>> https://cwiki.apache.org/confluence/display/solr/Collections+API rather
>>>> than the CoreAdmin API
>>>> https://cwiki.apache.org/confluence/display/solr/CoreAdmin+API
>>>> for SolrCloud.
>>>>
>>>> Hope that clarifies. You mentioned ADDREPLICA, which is part of the
>>>> Collections API, so you are on the right track.
>>>>
>>>> Thanks,
>>>> Susheel
>>>>
>>>>
>>>>
>>>> On Sun, Feb 14, 2016 at 10:51 AM, McCallick, Paul <
>>>> paul.e.mccall...@nordstrom.com> wrote:
>>>>
>>>>> Then what is the suggested way to add a new node to a collection via the
>>>>> apis? I am specifically thinking of autoscale scenarios where a node has
>>>>> gone down or more nodes are needed to handle load.
>>>>>
>>>>> Note that the ADDREPLICA endpoint requires a shard name, which puts the
>>>>> onus of how to scale out on the user. This can be challenging in an
>>>>> autoscale scenario.
>>>>>
>>>>> Thanks,
>>>>> Paul
>>>>>
>>>>>> On Feb 14, 2016, at 12:25 AM, Shawn Heisey <apa...@elyograg.org> wrote:
>>>>>>
>>>>>>> On 2/13/2016 6:01 PM, McCallick, Paul wrote:
>>>>>>> - When creating a new collection, SolrCloud will use all available
>>>>>>> nodes for the collection, adding cores to each. This assumes that you
>>>>>>> do not specify a replicationFactor.
>>>>>>
>>>>>> The number of nodes that will be used is numShards multiplied by
>>>>>> replicationFactor. The default value for replicationFactor is 1. If
>>>>>> you do not specify numShards, there is no default -- the CREATE call
>>>>>> will fail. The value of maxShardsPerNode can also affect the overall
>>>>>> result.
>>>>>>
>>>>>>> - When adding new nodes to the cluster AFTER the collection is created,
>>>>>>> one must use the core admin api to add the node to the collection.
>>>>>>
>>>>>> Using the CoreAdmin API is strongly discouraged when running SolrCloud.
>>>>>> It works, but it is an expert API when in cloud mode, and can cause
>>>>>> serious problems if not used correctly. Instead, use the Collections
>>>>>> API. It can handle all normal maintenance needs.
>>>>>>
>>>>>>> I would really like to see the second case behave more like the first.
>>>>>>> If I add a node to the cluster, it is automatically used as a replica
>>>>>>> for existing clusters without my having to do so. This would really
>>>>>>> simplify things.
>>>>>>
>>>>>> I've added a FAQ entry to address why this is a bad idea.
>>>>>> https://wiki.apache.org/solr/FAQ#Why_doesn.27t_SolrCloud_automatically_create_replicas_when_I_add_nodes.3F
>>>>>>
>>>>>> Thanks,
>>>>>> Shawn
>>>
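
For anyone landing on this thread later: since ADDREPLICA needs a shard name, a do-it-yourself scaler has to pick the shard. Below is a minimal sketch of one way to do that, not a tested tool. It assumes the dict passed in has the shape of the JSON returned by the Collections API CLUSTERSTATUS action (cluster -> collections -> <name> -> shards -> <shard> -> replicas, each replica carrying a "state"). The function names pick_shard and addreplica_url are made up for illustration, and fetching the status and issuing the request are left out.

```python
from urllib.parse import urlencode


def pick_shard(cluster_status, collection):
    """Return the name of the shard with the fewest active replicas.

    cluster_status is assumed to look like a parsed CLUSTERSTATUS response.
    """
    shards = cluster_status["cluster"]["collections"][collection]["shards"]

    def active_count(name):
        replicas = shards[name].get("replicas", {})
        return sum(1 for r in replicas.values() if r.get("state") == "active")

    # Iterate over sorted names so ties go to the alphabetically first shard.
    return min(sorted(shards), key=active_count)


def addreplica_url(base_url, collection, shard, node=None):
    """Build a Collections API ADDREPLICA URL; node is optional."""
    params = {"action": "ADDREPLICA", "collection": collection, "shard": shard}
    if node:
        params["node"] = node
    return f"{base_url}/admin/collections?{urlencode(params)}"
```

Calling pick_shard once per new node and re-reading the cluster status in between spreads replicas evenly across shards. It does nothing about the race conditions Paul mentions: two scalers running at the same time could still choose the same shard, so some form of locking or a single coordinator would be needed on top.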