Re: Bootstrapping SolrCloud cluster with multiple collections in different sharding/replication setup
Hi,
I had a helpful discussion on IRC about this. The right thing to do is to start with a clean Solr cluster (no cores) and then create all the required collections with the Collections API.
Ugo
On Thu, Mar 20, 2014 at 7:26 PM, Jeff Wartes jwar...@whitepages.com wrote:
Please note that although the article talks about the ADDREPLICA command, that feature is coming in Solr 4.8, so don't be confused if you can't find it yet. See https://issues.apache.org/jira/browse/SOLR-5130
On 3/20/14, 7:45 AM, Erick Erickson erickerick...@gmail.com wrote:
You might find this useful: http://heliosearch.org/solrcloud-assigning-nodes-machines/
It uses the Collections API to create your collection with zero nodes, then shows how to assign your leaders to specific machines (well, at least how to specify the nodes the leaders will be created on; it doesn't show how to assign, for instance, shard1 to nodeX).
It also shows a way to assign specific replicas on specific nodes to specific shards, although as Mark says this is a transitional technique. I know there's an ADDREPLICA command in the works for the Collections API that should make this easier, but that's not released yet.
Best, Erick
On Thu, Mar 20, 2014 at 7:23 AM, Ugo Matrangolo ugo.matrang...@gmail.com wrote:
Hi,
I would like some advice about the best way to bootstrap from scratch a SolrCloud cluster housing at least two collections with different sharding/replication setups.
Going through the docs and the 'Solr In Action' book, what I have seen so far is that there is a way to bootstrap a SolrCloud cluster with a sharding configuration using -DnumShards=2, but this (afaik) works only for a single collection.
What I need is a way to deploy from scratch a SolrCloud cluster housing (e.g.) two collections, Foo and Bar, where Foo has only one shard and is replicated everywhere while Bar has three shards and, again, is replicated.
I can't find a config file in which to put this sharding plan, and I'm starting to think that the only way to do this is after the deploy, using the Collections API. Is there a recommended way to do this?
Ugo
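The approach described at the top of this message (empty cluster first, then one Collections API call per collection) could be sketched roughly as follows. This only builds the two CREATE request URLs; it does not send them. The base URL, the three-node cluster size, the replication factors, and the config set names foo_conf and bar_conf are assumptions for illustration and do not come from the thread.

```python
from urllib.parse import urlencode

# Assumption: any live node of the (empty) cluster can receive the request.
SOLR_BASE = "http://localhost:8983/solr"


def create_collection_url(name, num_shards, replication_factor,
                          max_shards_per_node, config_name, base=SOLR_BASE):
    """Build a Collections API CREATE URL for one collection."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "maxShardsPerNode": max_shards_per_node,
        # Hypothetical config set name, assumed to be already in ZooKeeper.
        "collection.configName": config_name,
    }
    return f"{base}/admin/collections?{urlencode(params)}"


# Assumed 3-node cluster: Foo has 1 shard with a replica on every node,
# Bar has 3 shards with 2 replicas each (6 cores, so 2 per node).
foo_url = create_collection_url("Foo", 1, 3, 1, "foo_conf")
bar_url = create_collection_url("Bar", 3, 2, 2, "bar_conf")

print(foo_url)
print(bar_url)
```

Because numShards is passed per request here, each collection gets its own sharding plan, which is what the -DnumShards startup property cannot express.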
Bootstrapping SolrCloud cluster with multiple collections in different sharding/replication setup
Hi,
I would like some advice about the best way to bootstrap from scratch a SolrCloud cluster housing at least two collections with different sharding/replication setups.
Going through the docs and the 'Solr In Action' book, what I have seen so far is that there is a way to bootstrap a SolrCloud cluster with a sharding configuration using -DnumShards=2, but this (afaik) works only for a single collection.
What I need is a way to deploy from scratch a SolrCloud cluster housing (e.g.) two collections, Foo and Bar, where Foo has only one shard and is replicated everywhere while Bar has three shards and, again, is replicated.
I can't find a config file in which to put this sharding plan, and I'm starting to think that the only way to do this is after the deploy, using the Collections API. Is there a recommended way to do this?
Ugo
Re: Bootstrapping SolrCloud cluster with multiple collections in different sharding/replication setup
Honestly, the best approach is to start with no collections defined and use the Collections API.
If you want to preconfigure (which has its warts and will likely go away as an option), it's tricky to do it with different numShards values, as that is a global property per node. You would basically set -DnumShards=1 and start your cluster with Foo defined. Then you stop the cluster, define Bar, and start with -DnumShards=3.
The ability to preconfigure and bootstrap like this was a transitional system meant to help people who knew Solr pre-SolrCloud get something up quickly, back before we had a Collections API. The Collections API is much better if you want multiple collections, and it's the future.
-- Mark Miller about.me/markrmiller
Re: Bootstrapping SolrCloud cluster with multiple collections in different sharding/replication setup
You might find this useful: http://heliosearch.org/solrcloud-assigning-nodes-machines/
It uses the Collections API to create your collection with zero nodes, then shows how to assign your leaders to specific machines (well, at least how to specify the nodes the leaders will be created on; it doesn't show how to assign, for instance, shard1 to nodeX).
It also shows a way to assign specific replicas on specific nodes to specific shards, although as Mark says this is a transitional technique. I know there's an ADDREPLICA command in the works for the Collections API that should make this easier, but that's not released yet.
Best, Erick
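As a rough illustration of specifying the nodes a collection is created on, the CREATE action accepts a createNodeSet parameter holding a comma-separated list of node names. The sketch below only builds such a URL and is not the procedure from the linked article; the base URL and the node names host1 to host3 are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: address of any live node in the cluster.
SOLR_BASE = "http://localhost:8983/solr"


def create_on_nodes_url(name, num_shards, nodes, base=SOLR_BASE):
    """Build a CREATE URL that restricts placement to the given nodes."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        # Node names use the host:port_context form shown in live_nodes.
        "createNodeSet": ",".join(nodes),
    }
    return f"{base}/admin/collections?{urlencode(params)}"


# Hypothetical node names, for illustration only.
bar_nodes = ["host1:8983_solr", "host2:8983_solr", "host3:8983_solr"]
bar_on_nodes_url = create_on_nodes_url("Bar", 3, bar_nodes)

print(bar_on_nodes_url)
```

This limits which nodes are used but, as noted above, does not by itself pin a particular shard to a particular node.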
Re: Bootstrapping SolrCloud cluster with multiple collections in different sharding/replication setup
Please note that although the article talks about the ADDREPLICA command, that feature is coming in Solr 4.8, so don't be confused if you can't find it yet. See https://issues.apache.org/jira/browse/SOLR-5130
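For readers on Solr 4.8 or later, an ADDREPLICA request might look like the sketch below, which places one more replica of a given shard on a chosen node. It only builds the URL. The base URL, the shard name shard1, and the node name host2:8983_solr are assumptions for illustration.

```python
from urllib.parse import urlencode

# Assumption: address of any live node in the cluster.
SOLR_BASE = "http://localhost:8983/solr"


def add_replica_url(collection, shard, node, base=SOLR_BASE):
    """Build a Collections API ADDREPLICA URL (available from Solr 4.8)."""
    params = {
        "action": "ADDREPLICA",
        "collection": collection,
        "shard": shard,
        "node": node,  # target node, in host:port_context form
    }
    return f"{base}/admin/collections?{urlencode(params)}"


# Hypothetical shard and node names.
replica_url = add_replica_url("Bar", "shard1", "host2:8983_solr")

print(replica_url)
```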