Re: Adding nodes
SolrCloud does not come with any autoscaling functionality. If you want such a thing, you’ll need to write it yourself. https://github.com/whitepages/solrcloud_manager might be a useful head start though, particularly the “fill” and “cleancollection” commands. I don’t do *auto* scaling, but I do use this for all my cluster management, which certainly involves moving collections/shards around among nodes, adding capacity, and removing capacity.
Re: Adding nodes
Hi Paul,

Thanks for the detail, but I am still not able to understand how the Core API would make it easier for you to create replicas. I understand that using the Core API you can add more cores, but would that also populate the data so that it can serve queries / act like a replica?

Second, as Shawn mentioned in the link above, adding replicas for auto-scaling or in near real time is not a good idea, since it puts more load on the system and causes delays. The exception is if you have a copy of the indexes (assuming the index is static) and can create more cores dynamically; in that case the Core API may work for you.

Thanks,
Susheel
Re: Adding nodes
These are excellent questions and give me a good sense of why you suggest using the collections api.

In our case we have 8 shards of product data with an even distribution of data per shard, no hot spots. We have very different load at different points in the year (Cyber Monday), and we tend to have very little traffic at night. I'm thinking of two use cases:

1) We are seeing increased latency due to load and want to add 8 more replicas to handle the query volume. Once the volume subsides, we'd remove the nodes.

2) We lose a node due to some unexpected failure (EC2 tends to do this). We want auto scaling to detect the failure and add a node to replace the failed one.

In both cases the core api makes it easy. It adds nodes to the shards evenly. Otherwise we have to write a fairly involved script that is subject to race conditions to determine which shard to add nodes to.

Let me know if I'm making dangerous or uninformed assumptions, as I'm new to solr.

Thanks,
Paul
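The "fairly involved script" Paul mentions mostly comes down to reading the cluster state and picking the shard that is short on replicas. A minimal sketch of that selection logic, under the assumption of a simplified cluster-state layout (the real Collections API CLUSTERSTATUS response nests this under collection names and carries more properties):

```python
def pick_shard_to_grow(cluster_state):
    """Return the shard name with the fewest active replicas.

    `cluster_state` maps shard name -> {replica name: state}; this is a
    simplified stand-in for CLUSTERSTATUS output, not the real schema.
    """
    def active_count(shard):
        return sum(1 for state in cluster_state[shard].values()
                   if state == "active")
    # Ties resolve to the first shard in iteration order. If two
    # autoscaling workers run this concurrently against a stale snapshot,
    # both may pick the same shard -- one of the race conditions Paul
    # alludes to.
    return min(cluster_state, key=active_count)

state = {
    "shard1": {"core_node1": "active", "core_node2": "active"},
    "shard2": {"core_node3": "active"},  # lost a replica
    "shard3": {"core_node4": "active", "core_node5": "down"},
}
print(pick_shard_to_grow(state))  # -> shard2 (first of the tied shards)
```

The output of this selection would then feed the `shard` parameter of an ADDREPLICA call.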
Re: Adding nodes
Hi Paul,

For auto-scaling, it depends on how you are thinking to design it and what/how you want to scale. Which scenario do you think makes the coreadmin API easy to use in a sharded SolrCloud environment?

If, in a sharded environment (assume 3 shards A, B & C), shard B has a higher load, then you want to add a replica for shard B to distribute the load; or if a particular shard replica goes down, then you want to add another replica back for that shard. In either case ADDREPLICA requires a shard name.

Can you describe your scenario / provide more detail?

Thanks,
Susheel
Re: Adding nodes
Hi all,

This doesn’t really answer the following question:

What is the suggested way to add a new node to a collection via the apis? I am specifically thinking of autoscale scenarios where a node has gone down or more nodes are needed to handle load.

The coreadmin api makes this easy. The collections api (ADDREPLICA) makes this very difficult.
Re: Adding nodes
Then what is the suggested way to add a new node to a collection via the apis? I am specifically thinking of autoscale scenarios where a node has gone down or more nodes are needed to handle load.

Note that the ADDREPLICA endpoint requires a shard name, which puts the onus of how to scale out on the user. This can be challenging in an autoscale scenario.

Thanks,
Paul
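Concretely, ADDREPLICA is an HTTP call against the Collections API, and `shard` is the mandatory parameter Paul is pointing at. A sketch of building that request URL (the host and collection names here are made up for illustration; `node` is optional and, if omitted, Solr picks a node itself):

```python
from urllib.parse import urlencode

def addreplica_url(base_url, collection, shard, node=None):
    """Build a Collections API ADDREPLICA request URL.

    `shard` is required -- this is the parameter that pushes the
    "which shard?" decision onto the caller.
    """
    params = {"action": "ADDREPLICA", "collection": collection, "shard": shard}
    if node is not None:
        params["node"] = node  # e.g. a "host:port_solr" node name
    return "%s/admin/collections?%s" % (base_url.rstrip("/"), urlencode(params))

# Hypothetical host and collection names:
url = addreplica_url("http://solr1:8983/solr", "products", "shard2")
print(url)
# http://solr1:8983/solr/admin/collections?action=ADDREPLICA&collection=products&shard=shard2
```

An autoscaler would issue one such request per shard that needs an extra replica, which is exactly the bookkeeping the CoreAdmin route appears to hide.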
Re: Adding nodes
Hi Paul,

Shawn is referring to using the Collections API (https://cwiki.apache.org/confluence/display/solr/Collections+API) rather than the Core Admin API (https://cwiki.apache.org/confluence/display/solr/CoreAdmin+API) for SolrCloud.

Hope that clarifies. You mentioned ADDREPLICA, which is a Collections API call, so you are on the right track.

Thanks,
Susheel
Re: Adding nodes
On 2/13/2016 6:01 PM, McCallick, Paul wrote:
> - When creating a new collection, SOLRCloud will use all available nodes for
> the collection, adding cores to each. This assumes that you do not specify a
> replicationFactor.

The number of nodes that will be used is numShards multiplied by
replicationFactor. The default value for replicationFactor is 1. If
you do not specify numShards, there is no default -- the CREATE call
will fail. The value of maxShardsPerNode can also affect the overall
result.

> - When adding new nodes to the cluster AFTER the collection is created, one
> must use the core admin api to add the node to the collection.

Using the CoreAdmin API is strongly discouraged when running SolrCloud.
It works, but it is an expert API when in cloud mode, and can cause
serious problems if not used correctly. Instead, use the Collections
API. It can handle all normal maintenance needs.

> I would really like to see the second case behave more like the first. If I
> add a node to the cluster, it is automatically used as a replica for
> existing collections without my having to do so. This would really simplify
> things.

I've added a FAQ entry to address why this is a bad idea.

https://wiki.apache.org/solr/FAQ#Why_doesn.27t_SolrCloud_automatically_create_replicas_when_I_add_nodes.3F

Thanks,
Shawn
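Shawn's sizing arithmetic can be written down directly. A small sketch of the CREATE-time math (simplified: real placement also depends on which nodes are live and where replicas land):

```python
import math

def cores_needed(num_shards, replication_factor=1):
    """Total cores a CREATE call will try to place:
    numShards x replicationFactor (replicationFactor defaults to 1)."""
    return num_shards * replication_factor

def min_nodes(num_shards, replication_factor=1, max_shards_per_node=1):
    """Fewest nodes that can hold all cores, given maxShardsPerNode
    (which limits how many cores of this collection land on one node)."""
    return math.ceil(cores_needed(num_shards, replication_factor)
                     / max_shards_per_node)

# Paul's case: 8 shards, 2 replicas each, at most 2 cores per node
# -> 16 cores spread across at least 8 nodes
print(cores_needed(8, 2), min_nodes(8, 2, max_shards_per_node=2))
```

If fewer nodes are available than `min_nodes` reports, the CREATE call fails unless maxShardsPerNode is raised.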
Adding nodes
I’d like to verify the following:

- When creating a new collection, SOLRCloud will use all available nodes for the collection, adding cores to each. This assumes that you do not specify a replicationFactor.

- When adding new nodes to the cluster AFTER the collection is created, one must use the core admin api to add the node to the collection.

I would really like to see the second case behave more like the first. If I add a node to the cluster, it is automatically used as a replica for existing collections without my having to do so. This would really simplify things.

Paul McCallick
Sr Manager Information Technology
eCommerce Foundation
[RE-BALANCE of Collection] Re-balancing of collection after adding nodes to the cluster
Hi,

I found the email addresses from a slide-share at http://www.slideshare.net/thelabdude/tjp-solr-webinar. It's very useful.

We are developing SOLR search using CDH4 Cloudera and embedded SOLR 4.4.0-search-1.1.0. We created a collection when the cluster had 2 slave nodes. Then two extra nodes were added. On those extra nodes the SOLR service runs, but the ZooKeeper service does not; the ZooKeeper service runs only on the earlier nodes.

When the cluster had 2 nodes, the indexing tool ran successfully. But after adding the two nodes, when the indexing tool runs again it throws an error: *no active slice servicing hashcode*. The error suggests that re-balancing of the collection didn't happen after adding the extra SOLR nodes, so the tool tries to shard/distribute the indexing information onto the extra node(s), which are not aware of that collection, and fails. The number of shards is 2, and the composite routing policy is used.

My question is: is it possible to re-balance the collection information after adding new SOLR nodes? In your slide share it's written that re-balancing is available in SOLR-5025; what's SOLR-5025?

Thanks
Regards
Debasis
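For intuition, the *no active slice servicing hashcode* error means a document's routing hash fell outside the hash ranges covered by the collection's active slices. A toy illustration of that lookup (the ranges and hash space here are made up for clarity; Solr actually hashes ids with MurmurHash3 over the full 32-bit range):

```python
def find_slice(slices, doc_hash):
    """Return the slice whose [lo, hi] hash range covers doc_hash,
    or None -- the situation Solr reports as
    'no active slice servicing hashcode'.

    `slices` maps slice name -> (lo, hi) inclusive hash range."""
    for name, (lo, hi) in slices.items():
        if lo <= doc_hash <= hi:
            return name
    return None

# Two slices covering only 0..199 of a toy 0..255 hash space: any
# document hashing into 200..255 has no home, and indexing it fails.
slices = {"shard1": (0, 99), "shard2": (100, 199)}
print(find_slice(slices, 57))    # shard1
print(find_slice(slices, 230))   # None -> the "no active slice" error
```

In a healthy collection the slice ranges partition the whole hash space, so the lookup never returns None; a gap usually points at broken or stale cluster state rather than at missing nodes.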
Re: Replication after re-adding nodes to cluster (sleeping replicas)
The whole point of SolrCloud is to automatically take care of all the ugly details of synching etc. You should be able to add a node and, assuming it has been assigned to a shard, do nothing. The node will start up, synch with the leader, get registered and start handling queries without you having to do anything.

If you shut the node down, SolrCloud will figure that out and stop sending requests to it. If you then bring the node back up, SolrCloud will figure out how to synch it with the leader and just make it happen. When it's synched, it'll start serving requests.

Watch the Solr admin page and you'll see the status change as these operations happen (you'll have to refresh the screen). And finally, watch the Solr log on the new node; that'll give you a good sense of what the steps are.

Best,
Erick
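Erick's "watch the status change" advice can also be automated: poll the cluster state and wait until every replica of the re-added node reports active. A sketch of the state check, assuming a simplified nested-dict shape modeled on a CLUSTERSTATUS response (the real response nests more levels and properties):

```python
def unsynced_replicas(collection_state):
    """List replicas that are not yet 'active' (still recovering or down).

    `collection_state` is a simplified stand-in for one collection's
    entry in a cluster-status response: shard -> replica -> properties.
    """
    lagging = []
    for shard, replicas in collection_state.items():
        for name, props in replicas.items():
            if props.get("state") != "active":
                lagging.append((shard, name, props.get("state")))
    return lagging

state = {
    "shard1": {"core_node1": {"state": "active"},
               "core_node2": {"state": "recovering"}},  # just re-added
    "shard2": {"core_node3": {"state": "active"}},
}
print(unsynced_replicas(state))  # [('shard1', 'core_node2', 'recovering')]
```

Once this returns an empty list, the sleeping replica has caught up with its leader and is serving queries again.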
Replication after re-adding nodes to cluster (sleeping replicas)
I have a SolrCloud cluster holding 4 collections, each with 3 shards and replication factor = 2. They all live on 2 machines, and I am currently using this setup for testing. However, I would like to connect this test setup to our live application, just for benchmarking and evaluating whether it can handle the big qpm number.

I am also planning to set up a new machine and add new nodes manually, one more replica for each shard on the new machine, in case the first two have problems handling the big qpm. But what I would like to do is, after I set up the new nodes, shut down the new machine and only put it back in the cluster if it's needed.

Thus, getting to the title of this mail: after re-adding the 3rd machine to the cluster, will the replicas be automatically synced with the leader, or do I need to manually trigger this somehow? Is there a better idea for having these sleeping replicas? I bet lots of people have faced this problem, so a best practice must be out there.

Thanks,
Michael