On 7/16/2013 3:36 PM, Robert Stewart wrote:
I want to script the creation of N solr cloud instances (on ec2).
But it's not clear to me where I would specify the numShards setting.
From documentation, I see you can specify on the "first node" you start up, OR
alternatively, use the "collections" API to create a new collection - but in that case
you need first at least one running SOLR instance. I want to push all solr instances with similar
configuration onto N instances and just run them with some number of shards pre-set somehow. Where
can I put the numShards configuration setting?
What I want to do:
1) push solr configuration to zookeeper ensemble using zkCli command-line tool.
2) create N instances of SOLR running on Ec2, pointing to the same zookeeper
3) start all SOLR instances which will become a cloud setup with M shards (where
M<N), and N-M replicas.
A minimal redundant SolrCloud cluster consists of two larger machines
that run Solr and zookeeper, plus a third smaller machine that runs just
zookeeper. This is just the minimum requirement; you can use additional
and more powerful servers.
Here is the general way that you should set up a brand new SolrCloud. If anyone
spots a problem with this, please don't hesitate to mention it:
1) Set up three hosts running standalone zookeeper, configured as a
fully redundant ensemble. This is outside the scope of Solr
documentation; please consult the zookeeper site:
http://zookeeper.apache.org
2) Construct a zkHost parameter for your ZK ensemble. An example is
below using the default zookeeper port of 2181. You'd need to use the
proper port numbers, names, etc. The /chroot part is optional, but
highly recommended. Use a name that has meaning for your SolrCloud
cluster rather than the literal "chroot":
-DzkHost=server1:2181,server2:2181,server3:2181/chroot
By using the /chroot syntax, you can run more than one SolrCloud cluster
on your zookeeper ensemble. Just use a different value for each cluster.
3) Start Solr with the same zkHost parameter on every Solr host,
referring to the three zookeeper hosts already set up. You can use the
same hosts for Solr as you did for zookeeper.
4) Use the zkcli script in example/cloud-scripts to upload a
configuration set to zookeeper using the "upconfig" command. If you
aren't using the Solr example or a custom install based on the example,
then you'll need to examine the script to figure out how to run the java
command manually and have it find the solr and zookeeper jars.
5) Use the Collections API to create a collection, referencing the
uploaded config set and including additional parameters like numShards.
If you have four Solr hosts, the following API call would work perfectly,
putting one core on each host (2 shards times 2 replicas is 4 cores):
http://server:port/solr/admin/collections?action=CREATE&name=mycollection&numShards=2&replicationFactor=2&collection.configName=mycfg
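
If you are scripting this, steps 2 and 5 could be sketched roughly like the
Python below. This is only an illustration, not something that ships with
Solr: the function names build_zk_host and build_create_url are made up
here, and port 8983 in the usage lines is an assumption (substitute whatever
port your Solr actually listens on). It only builds the strings; it does not
contact zookeeper or Solr.

```python
from urllib.parse import urlencode


def build_zk_host(servers, port=2181, chroot=None):
    """Join zookeeper server names into a zkHost string (hypothetical helper)."""
    hosts = ",".join(f"{name}:{port}" for name in servers)
    if chroot:
        # The chroot is appended once, after the last host:port pair.
        hosts += "/" + chroot.strip("/")
    return hosts


def build_create_url(base_url, name, num_shards, replication_factor, config_name):
    """Build a Collections API CREATE URL like the one in step 5 (hypothetical helper)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "collection.configName": config_name,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Same example values as in the steps above; 8983 is an assumed port.
    print("-DzkHost=" + build_zk_host(["server1", "server2", "server3"], chroot="chroot"))
    print(build_create_url("http://server:8983/solr", "mycollection", 2, 2, "mycfg"))
```

The point of generating both strings from one script is that every Solr host
gets exactly the same zkHost value, and numShards only ever appears in the one
CREATE call rather than in any per-host startup setting.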
Thanks,
Shawn