Re: Where to specify numShards when startup up a cloud setup
Hi zzT,

Putting numShards in core.properties also works.

I struggled a little while figuring out this configuration approach. I knew I was not alone! ;-)

On 2 April 2014 18:06, zzT zis@gmail.com wrote:

> It seems that I've figured out a configuration approach to this issue.
>
> I'm having the exact same issue, and the only viable solutions I've found on the net so far are:
>
> 1) Pass -DnumShards=x when starting up the Solr server
> 2) Use the Collections API, as indicated by Shawn.
>
> What I've noticed though - after making the call to /collections to create a node solr.xml - is that a new core entry is added inside solr.xml with the attribute numShards.
>
> So, right now I'm configuring solr.xml with the numShards attribute inside my core nodes. This way I don't have to worry about the annoying stuff you've already mentioned, e.g. waiting for Solr to start up.
>
> Of course the same logic applies here: the numShards param is meaningful only the first time. Even if you change it at a later point, the number of shards stays the same.
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Where-to-specify-numShards-when-startup-up-a-cloud-setup-tp4078473p4128566.html
> Sent from the Solr - User mailing list archive at Nabble.com.

--
All the best

Liu Bo
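For anyone scripting the core.properties approach mentioned above, here is a minimal sketch of generating such a file. Only the numShards key comes from this thread; the `name` and `collection` keys, their values, and the helper `render_core_properties` are illustrative assumptions, so check them against the core.properties documentation for your Solr version.

```python
def render_core_properties(props):
    """Render a mapping as Java-style .properties text, one key=value per line."""
    return "".join(f"{key}={value}\n" for key, value in props.items())


# Hypothetical values for illustration; only numShards is taken from the thread.
core_props = {
    "name": "mycollection_shard1_replica1",  # assumed core name
    "collection": "mycollection",            # assumed collection name
    "numShards": 2,
}

core_properties_text = render_core_properties(core_props)
print(core_properties_text)
```

The resulting text would be written to the core's core.properties file before Solr starts for the first time. As noted in the thread, numShards only matters the first time; changing it later does not change the number of shards.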
RE: Where to specify numShards when startup up a cloud setup
It seems that I've figured out a configuration approach to this issue.

I'm having the exact same issue, and the only viable solutions I've found on the net so far are:

1) Pass -DnumShards=x when starting up the Solr server
2) Use the Collections API, as indicated by Shawn.

What I've noticed though - after making the call to /collections to create a node solr.xml - is that a new core entry is added inside solr.xml with the attribute numShards.

So, right now I'm configuring solr.xml with the numShards attribute inside my core nodes. This way I don't have to worry about the annoying stuff you've already mentioned, e.g. waiting for Solr to start up.

Of course the same logic applies here: the numShards param is meaningful only the first time. Even if you change it at a later point, the number of shards stays the same.
RE: Where to specify numShards when startup up a cloud setup
Yes, thanks Shawn. I know I can use the Collections HTTP API to set the number of shards, but the problem is that it is not easily scriptable, so the entire cluster cannot simply be set up in an automated fashion - the script(s) will need to wait until the SOLR nodes are up and running before using the Collections API.

What I want to know is: is there some configuration-based way to set numShards (such as in solr.xml, or by sending some data to the zookeeper API)? I am guessing the answer is still no.

Thanks.

From: Shawn Heisey [s...@elyograg.org]
Sent: Tuesday, July 16, 2013 6:35 PM
To: solr-user@lucene.apache.org
Subject: Re: Where to specify numShards when startup up a cloud setup

On 7/16/2013 3:36 PM, Robert Stewart wrote:

> I want to script the creation of N solr cloud instances (on ec2). But it's not clear to me where I would specify the numShards setting. From the documentation, I see you can specify it on the first node you start up, OR alternatively, use the collections API to create a new collection - but in that case you first need at least one running SOLR instance. I want to push all solr instances with similar configuration onto N instances and just run them with some number of shards pre-set somehow.
>
> Where can I put the numShards configuration setting?
>
> What I want to do:
>
> 1) push solr configuration to zookeeper ensemble using zkCli command-line tool.
> 2) create N instances of SOLR running on Ec2, pointing to the same zookeeper
> 3) start all SOLR instances which will become a cloud setup with M shards (where M < N), and N-M replicas.

A minimal redundant SolrCloud cluster consists of two larger machines that run Solr and zookeeper, plus a third smaller machine that runs just zookeeper. This is just the minimum requirement; you can use additional and more powerful servers.

Here is the general way that you should set up a brand new SolrCloud. If anyone spots a problem with this, please don't hesitate to mention it:

1) Set up three hosts running standalone zookeeper, configured as a fully redundant ensemble. This is outside the scope of Solr documentation; please consult the zookeeper site: http://zookeeper.apache.org

2) Construct a zkHost parameter for your ZK ensemble. An example is below, using the default zookeeper port of 2181. You'd need to use the proper port numbers, names, etc. The /chroot part is optional, but highly recommended. Use a name that has meaning for your SolrCloud cluster rather than chroot:

-DzkHost=server1:2181,server2:2181,server3:2181/chroot

By using the /chroot syntax, you can run more than one SolrCloud cluster on your zookeeper ensemble. Just use a different value for each cluster.

3) Start Solr with the same zkHost parameter on every Solr host, referring to the three zookeeper hosts already set up. You can use the same hosts for Solr as you did for zookeeper.

4) Use the zkcli script in example/cloud-scripts to upload a configuration set to zookeeper using the upconfig command. If you aren't using the Solr example or a custom install based on the example, then you'll need to examine the script to figure out how to run the java command manually and have it find the solr and zookeeper jars.

5) Use the Collections API to create a collection, referencing the uploaded config set and including additional parameters like numShards. If you have four Solr hosts, the following API call would work perfectly:

http://server:port/solr/admin/collections?action=CREATE&name=mycollection&numShards=2&replicationFactor=2&collection.configName=mycfg

Thanks,
Shawn
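On the scripting concern raised in this reply (having to wait until the Solr nodes are up before calling the Collections API), the wait itself can be automated with a small polling loop. The following is a minimal sketch, not something proposed in the thread: the function names, the timeout values, and the choice of the CoreAdmin STATUS request as a readiness probe are all assumptions, and the base URL in the usage comment is a placeholder.

```python
import time
import urllib.error
import urllib.request


def solr_is_up(base_url):
    """Return True if the node answers a CoreAdmin STATUS request with HTTP 200.

    Using /admin/cores?action=STATUS as the probe is an assumption of this sketch.
    """
    try:
        with urllib.request.urlopen(base_url + "/admin/cores?action=STATUS", timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(probe, timeout_s=120.0, interval_s=2.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Call probe() repeatedly until it returns True or timeout_s has elapsed."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Example usage (hypothetical host and port):
# ready = wait_until_ready(lambda: solr_is_up("http://localhost:8983/solr"))
```

A provisioning script could run this check against each node and only then issue the Collections API call, which keeps the whole cluster setup automated.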
Where to specify numShards when startup up a cloud setup
I want to script the creation of N solr cloud instances (on ec2). But it's not clear to me where I would specify the numShards setting. From the documentation, I see you can specify it on the first node you start up, OR alternatively, use the collections API to create a new collection - but in that case you first need at least one running SOLR instance. I want to push all solr instances with similar configuration onto N instances and just run them with some number of shards pre-set somehow.

Where can I put the numShards configuration setting?

What I want to do:

1) push solr configuration to zookeeper ensemble using zkCli command-line tool.
2) create N instances of SOLR running on Ec2, pointing to the same zookeeper
3) start all SOLR instances which will become a cloud setup with M shards (where M < N), and N-M replicas.

Currently everything starts up with 1 shard and N replicas. I already have one single collection pre-configured.
Re: Where to specify numShards when startup up a cloud setup
What does the solr.xml look like on the nodes?

On Tue, Jul 16, 2013 at 2:36 PM, Robert Stewart robert_stew...@epam.com wrote:

> I want to script the creation of N solr cloud instances (on ec2). But it's not clear to me where I would specify the numShards setting. From the documentation, I see you can specify it on the first node you start up, OR alternatively, use the collections API to create a new collection - but in that case you first need at least one running SOLR instance. I want to push all solr instances with similar configuration onto N instances and just run them with some number of shards pre-set somehow.
>
> Where can I put the numShards configuration setting?
>
> What I want to do:
>
> 1) push solr configuration to zookeeper ensemble using zkCli command-line tool.
> 2) create N instances of SOLR running on Ec2, pointing to the same zookeeper
> 3) start all SOLR instances which will become a cloud setup with M shards (where M < N), and N-M replicas.
>
> Currently everything starts up with 1 shard and N replicas. I already have one single collection pre-configured.
Re: Where to specify numShards when startup up a cloud setup
On 7/16/2013 3:36 PM, Robert Stewart wrote:

> I want to script the creation of N solr cloud instances (on ec2). But it's not clear to me where I would specify the numShards setting. From the documentation, I see you can specify it on the first node you start up, OR alternatively, use the collections API to create a new collection - but in that case you first need at least one running SOLR instance. I want to push all solr instances with similar configuration onto N instances and just run them with some number of shards pre-set somehow.
>
> Where can I put the numShards configuration setting?
>
> What I want to do:
>
> 1) push solr configuration to zookeeper ensemble using zkCli command-line tool.
> 2) create N instances of SOLR running on Ec2, pointing to the same zookeeper
> 3) start all SOLR instances which will become a cloud setup with M shards (where M < N), and N-M replicas.

A minimal redundant SolrCloud cluster consists of two larger machines that run Solr and zookeeper, plus a third smaller machine that runs just zookeeper. This is just the minimum requirement; you can use additional and more powerful servers.

Here is the general way that you should set up a brand new SolrCloud. If anyone spots a problem with this, please don't hesitate to mention it:

1) Set up three hosts running standalone zookeeper, configured as a fully redundant ensemble. This is outside the scope of Solr documentation; please consult the zookeeper site: http://zookeeper.apache.org

2) Construct a zkHost parameter for your ZK ensemble. An example is below, using the default zookeeper port of 2181. You'd need to use the proper port numbers, names, etc. The /chroot part is optional, but highly recommended. Use a name that has meaning for your SolrCloud cluster rather than chroot:

-DzkHost=server1:2181,server2:2181,server3:2181/chroot

By using the /chroot syntax, you can run more than one SolrCloud cluster on your zookeeper ensemble. Just use a different value for each cluster.

3) Start Solr with the same zkHost parameter on every Solr host, referring to the three zookeeper hosts already set up. You can use the same hosts for Solr as you did for zookeeper.

4) Use the zkcli script in example/cloud-scripts to upload a configuration set to zookeeper using the upconfig command. If you aren't using the Solr example or a custom install based on the example, then you'll need to examine the script to figure out how to run the java command manually and have it find the solr and zookeeper jars.

5) Use the Collections API to create a collection, referencing the uploaded config set and including additional parameters like numShards. If you have four Solr hosts, the following API call would work perfectly:

http://server:port/solr/admin/collections?action=CREATE&name=mycollection&numShards=2&replicationFactor=2&collection.configName=mycfg

Thanks,
Shawn
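Steps 2 and 5 above lend themselves to scripting. The sketch below builds the zkHost value and the Collections API CREATE URL from their parts. The helper names `build_zk_host` and `build_create_url` are hypothetical, and the base URL `http://localhost:8983/solr` is an assumed stand-in for the `server:port` placeholder in the message; the parameter names and example values are the ones from the message.

```python
from urllib.parse import urlencode


def build_zk_host(servers, port=2181, chroot=""):
    """Join zookeeper hosts into a zkHost string, with an optional /chroot suffix."""
    return ",".join(f"{server}:{port}" for server in servers) + chroot


def build_create_url(base_url, name, num_shards, replication_factor, config_name):
    """Build a Collections API CREATE URL; parameters are joined with '&'."""
    params = [
        ("action", "CREATE"),
        ("name", name),
        ("numShards", num_shards),
        ("replicationFactor", replication_factor),
        ("collection.configName", config_name),
    ]
    return f"{base_url}/admin/collections?{urlencode(params)}"


zk_host = build_zk_host(["server1", "server2", "server3"], chroot="/chroot")
create_url = build_create_url("http://localhost:8983/solr", "mycollection", 2, 2, "mycfg")

print("-DzkHost=" + zk_host)
print(create_url)
```

With numShards=2 and replicationFactor=2 the collection needs four cores in total, which matches the four-host example in step 5.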