Re: Bootstrapping SolrCloud cluster with multiple collections in different sharding/replication setup

2014-03-21 Thread Ugo Matrangolo
Hi,

I had a good discussion on IRC about this. The right thing to do is to start
with a clean Solr cluster (no cores) and then create all the collections
with the Collections API.

Ugo






Bootstrapping SolrCloud cluster with multiple collections in different sharding/replication setup

2014-03-20 Thread Ugo Matrangolo
Hi,

I would like some advice about the best way to bootstrap from scratch a
SolrCloud cluster housing at least two collections with different
sharding/replication setups.

Going through the docs and the 'Solr In Action' book, what I have seen so far
is that there is a way to bootstrap a SolrCloud cluster with a sharding
configuration using:

  -DnumShards=2

but this (afaik) works only for a single collection. What I need is a way
to deploy from scratch a SolrCloud cluster housing (e.g.) two collections,
Foo and Bar, where Foo has only one shard and is replicated everywhere while
Bar has three shards and, again, is replicated.

I can't find a config file where to put this sharding plan and I'm starting
to think that the only way to do this is after the deploy using the
Collections API.

Is there a best-practice way to do this?

Ugo


Re: Bootstrapping SolrCloud cluster with multiple collections in different sharding/replication setup

2014-03-20 Thread Mark Miller
Honestly, the best approach is to start with no collections defined and use the
Collections API.

If you want to preconfigure (which has its warts and will likely go away as
an option), it’s tricky to do it with different numShards, as that is a global
property per node.

You would basically set -DnumShards=1 and start your cluster with Foo defined. 
Then you stop the cluster and define Bar and start with -DnumShards=3.

The ability to preconfigure and bootstrap like this was kind of a transitional
system meant to help people who knew Solr before SolrCloud get something up
quickly, back before we had a Collections API.

The Collections API is much better if you want multiple collections, and it’s
the future.
-- 
Mark Miller
about.me/markrmiller
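
For illustration only, here is a minimal Python sketch of what "start with no
collections and use the Collections API" could look like for the Foo/Bar case
from the question. It only builds the CREATE request URLs and does not send
them. The base URL, the node count (3), the config set names (foo_conf,
bar_conf), and Bar's replication factor are assumptions and do not come from
the thread. The request parameters (action, name, numShards,
replicationFactor, maxShardsPerNode, collection.configName) are the
Collections API CREATE parameters as I understand them for Solr 4.x.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards, replication_factor,
                          config_name, max_shards_per_node=1):
    """Build a Collections API CREATE URL (the request is not sent here)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "maxShardsPerNode": max_shards_per_node,
        "collection.configName": config_name,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


SOLR = "http://localhost:8983/solr"  # assumed address of any node in the cluster
NODES = 3                            # assumed cluster size

# Foo: one shard, one copy on every node ("replicated everywhere").
foo_url = create_collection_url(SOLR, "Foo", 1, NODES, "foo_conf")

# Bar: three shards, two copies each (assumed); 6 cores on 3 nodes
# means each node has to host 2 of them.
bar_url = create_collection_url(SOLR, "Bar", 3, 2, "bar_conf",
                                max_shards_per_node=2)

print(foo_url)
print(bar_url)
```

Each URL can then be requested once against a running, empty cluster, for
example with urllib.request.urlopen(foo_url). Because numShards is passed per
request, the two collections do not have to share a single global setting.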



Re: Bootstrapping SolrCloud cluster with multiple collections in different sharding/replication setup

2014-03-20 Thread Erick Erickson
You might find this useful:
http://heliosearch.org/solrcloud-assigning-nodes-machines/


It uses the Collections API to create your collection with zero
nodes, then shows how to assign your leaders to specific
machines (well, it at least specifies the nodes the leaders will
be created on; it doesn't show how to assign, for instance,
shard1 to nodeX).

It also shows a way to assign specific replicas on specific nodes
to specific shards, although as Mark says this is a transitional
technique. I know there's an addreplica command in the works
for the collections API that should make this easier, but that's
not released yet.

Best,
Erick
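
As a rough illustration of specifying the nodes a collection is created on,
the sketch below builds a CREATE request that passes the createNodeSet
parameter, which restricts the new collection to the listed nodes. This is
not a reproduction of the linked article's steps, only an example of that one
parameter. The host names and the base URL are made up; node names are
written in the host:port_context form that SolrCloud shows for live nodes.

```python
from urllib.parse import urlencode


def create_on_nodes_url(base_url, name, num_shards, node_names):
    """Build a CREATE URL that limits the collection to the given nodes."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        # Comma-separated list of node names the cores may be created on.
        "createNodeSet": ",".join(node_names),
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Hypothetical node names for a three-node cluster.
bar_nodes = ["host1:8983_solr", "host2:8983_solr", "host3:8983_solr"]

bar_on_nodes_url = create_on_nodes_url(
    "http://localhost:8983/solr", "Bar", 3, bar_nodes)

print(bar_on_nodes_url)
```

Note that this only controls which nodes are used, not which shard lands on
which node, which matches the limitation described above.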




Re: Bootstrapping SolrCloud cluster with multiple collections in different sharding/replication setup

2014-03-20 Thread Jeff Wartes

Please note that although the article talks about the ADDREPLICA command,
that feature is coming in Solr 4.8, so don't be confused if you can't find
it yet. See https://issues.apache.org/jira/browse/SOLR-5130
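
For readers on a release that includes ADDREPLICA, a minimal sketch of the
request might look like the following. It only builds the URL. The base URL,
the collection and shard names, and the node name are assumptions for
illustration; the parameters shown (collection, shard, and an optional node)
are the ones I understand the command to accept.

```python
from urllib.parse import urlencode


def add_replica_url(base_url, collection, shard, node=None):
    """Build a Collections API ADDREPLICA URL (the request is not sent here)."""
    params = {
        "action": "ADDREPLICA",
        "collection": collection,
        "shard": shard,
    }
    if node is not None:
        # Optional: pin the new replica to one node; otherwise Solr chooses.
        params["node"] = node
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Hypothetical example: add a copy of Bar's shard1 on host2.
replica_url = add_replica_url(
    "http://localhost:8983/solr", "Bar", "shard1", node="host2:8983_solr")

print(replica_url)
```

Unlike createNodeSet at creation time, this names both the shard and the
target node, so it covers the "shard1 to nodeX" case.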


