Re: Bootstrapping SolrCloud cluster with multiple collections in different sharding/replication setup
Hi,

I had a useful chat on IRC about this. The right thing to do is to start with a clean Solr cluster (no cores) and then create all the collections with the Collections API.

Ugo

On Thu, Mar 20, 2014 at 7:26 PM, Jeff Wartes jwar...@whitepages.com wrote:
> Please note that although the article talks about the ADDREPLICA command, that feature is coming in Solr 4.8, so don't be confused if you can't find it yet. See https://issues.apache.org/jira/browse/SOLR-5130
Bootstrapping SolrCloud cluster with multiple collections in different sharding/replication setup
Hi,

I would like some advice about the best way to bootstrap from scratch a SolrCloud cluster housing at least two collections with different sharding/replication setups.

Going through the docs and the 'Solr In Action' book, what I have seen so far is that there is a way to bootstrap a SolrCloud cluster with a sharding configuration using:

    -DnumShards=2

but this (afaik) works only for a single collection.

What I need is a way to deploy from scratch a SolrCloud cluster housing (e.g.) two collections, Foo and Bar, where Foo has only one shard and is replicated everywhere while Bar has three shards and, again, is replicated.

I can't find a config file in which to put this sharding plan, and I'm starting to think that the only way to do this is after the deploy, using the Collections API.

Is there a recommended way to do this?

Ugo
Re: Bootstrapping SolrCloud cluster with multiple collections in different sharding/replication setup
Honestly, the best approach is to start with no collections defined and use the Collections API.

If you want to preconfigure (which has its warts and will likely go away as an option), it's tricky to do it with different numShards, as that is a global property per node. You would basically set -DnumShards=1 and start your cluster with Foo defined. Then you stop the cluster, define Bar and start with -DnumShards=3.

The ability to preconfigure and bootstrap like this was a transitional system meant to help people who knew Solr before SolrCloud get something up quickly, back before we had a Collections API. The Collections API is much better if you want multiple collections, and it's the future.

--
Mark Miller
about.me/markrmiller

On March 20, 2014 at 10:24:18 AM, Ugo Matrangolo (ugo.matrang...@gmail.com) wrote:
> What I need is a way to deploy from scratch a SolrCloud cluster housing (e.g.) two collections Foo and Bar where Foo has only one shard and is replicated everywhere while Bar has three shards and, again, is replicated.
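To make the Collections API route concrete, here is a minimal sketch that builds the two CREATE requests for the Foo and Bar collections from the original question. The host, port, replication factors and config set names (foo_conf, bar_conf) are assumptions for illustration only; the config sets would have to be uploaded to ZooKeeper beforehand, and the snippet only builds the URLs rather than sending them.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards, replication_factor, config_name):
    """Build a Collections API CREATE request URL (nothing is sent)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "collection.configName": config_name,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Assumed values: a node on localhost:8983 and hypothetical config set names.
BASE = "http://localhost:8983/solr"

# Foo: one shard, replicated (factor 3 is an arbitrary illustrative choice).
foo_url = create_collection_url(BASE, "Foo", 1, 3, "foo_conf")
# Bar: three shards, each replicated (factor 2 is likewise illustrative).
bar_url = create_collection_url(BASE, "Bar", 3, 2, "bar_conf")

print(foo_url)
print(bar_url)
```

Each URL can then be requested with any HTTP client once the empty cluster is up. Because every collection carries its own numShards, the per-node -DnumShards limitation described above does not apply.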
Re: Bootstrapping SolrCloud cluster with multiple collections in different sharding/replication setup
You might find this useful:
http://heliosearch.org/solrcloud-assigning-nodes-machines/

It uses the Collections API to create your collection with zero nodes, then shows how to assign your leaders to specific machines (well, at least how to specify the nodes the leaders will be created on; it doesn't show how to assign, for instance, shard1 to nodeX).

It also shows a way to assign specific replicas on specific nodes to specific shards, although as Mark says this is a transitional technique. I know there's an addreplica command in the works for the Collections API that should make this easier, but that's not released yet.

Best,
Erick

On Thu, Mar 20, 2014 at 7:23 AM, Ugo Matrangolo ugo.matrang...@gmail.com wrote:
> I would like some advice about the best way to bootstrap from scratch a SolrCloud cluster housing at least two collections with different sharding/replication setup.
Re: Bootstrapping SolrCloud cluster with multiple collections in different sharding/replication setup
Please note that although the article talks about the ADDREPLICA command, that feature is coming in Solr 4.8, so don't be confused if you can't find it yet. See https://issues.apache.org/jira/browse/SOLR-5130

On 3/20/14, 7:45 AM, Erick Erickson erickerick...@gmail.com wrote:
> You might find this useful:
> http://heliosearch.org/solrcloud-assigning-nodes-machines/
> I know there's an addreplica command in the works for the collections API that should make this easier, but that's not released yet.
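For readers on Solr 4.8 or later, an ADDREPLICA request might look like the sketch below. The collection, shard and node names are hypothetical placeholders; the node value follows the usual host:port_context form that SolrCloud shows under live nodes. Only the URL is built, nothing is sent.

```python
from urllib.parse import urlencode


def add_replica_url(base_url, collection, shard, node=None):
    """Build a Collections API ADDREPLICA request URL (Solr 4.8+)."""
    params = {"action": "ADDREPLICA", "collection": collection, "shard": shard}
    if node is not None:
        # Optional: pin the new replica to a specific node.
        params["node"] = node
    # Keep ':' readable in the node name instead of percent-encoding it.
    return f"{base_url}/admin/collections?{urlencode(params, safe=':')}"


# Hypothetical names: collection Bar, shard1, and a node on host nodeX.
replica_url = add_replica_url(
    "http://localhost:8983/solr", "Bar", "shard1", node="nodeX:8983_solr"
)
print(replica_url)
```

On releases before 4.8 this action does not exist, which is why the linked article falls back to the older core-level technique.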
Sharding and Replication setup
Hi all,

I'm trying to set up Solr in our current environment to provide a highly available, fault-tolerant infrastructure. What I have is:

1. 2 physical servers running 2 Tomcats
2. A load balancer doing round-robin requests to the 2 Tomcats

After reading through different posts (http://lucidworks.lucidimagination.com/display/solr/Scaling+and+Distribution, http://www.slideshare.net/sourcesense/sharded-solr-setup-with-master), I'm thinking of a setup like the one in the attached image:

http://lucene.472066.n3.nabble.com/file/n4001642/Solr_Sharding_and_Replication_HA.png

Basically, it's a combination of sharding and replication.

Server 1 has a 3-core Solr instance: Master 1, Slave 1 (replicated locally from Master 1) and Slave 2 (replicated remotely from Master 2 on Server 2).

Server 2 has a similar setup: Master 2, Slave 2 (replicated locally from Master 2) and Slave 1 (replicated remotely from Master 1 on Server 1).

Indexing requests can go to Master 1 or Master 2, so Slave 1 and Slave 2 become highly available shards (each is present on both servers). Search requests are served by a virtual coordinator (which can hit the Slave 1 core or the Slave 2 core directly with the shards parameter) that combines the results from Slave 1 and Slave 2.

I haven't done the actual implementation yet. I'm posting here to hear any suggestions, recommendations, pitfalls or gotchas on this setup from the experts.

I very much appreciate your attention.

Cheers,
Trung

--
View this message in context: http://lucene.472066.n3.nabble.com/Sharding-and-Replication-setup-tp4001642.html
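The "virtual coordinator" idea above relies on Solr's shards request parameter. A rough sketch of such a query follows; the host names, port and core names (server1, server2, slave1, slave2) are assumptions standing in for the setup in the diagram, and the code only builds the URL.

```python
from urllib.parse import urlencode


def distributed_query_url(coordinator, shards, query="*:*"):
    """Build a select URL that asks one core to merge results from all shards."""
    params = {"q": query, "shards": ",".join(shards)}
    # Leave ':', '/', ',' and '*' unescaped so the URL stays readable.
    return f"http://{coordinator}/select?{urlencode(params, safe=':/,*')}"


# Hypothetical hosts and cores: one slave core per shard.
shards = ["server1:8080/solr/slave1", "server2:8080/solr/slave2"]

# Any core can act as coordinator; here the request lands on slave1 on server1.
search_url = distributed_query_url("server1:8080/solr/slave1", shards)
print(search_url)
```

Since each server holds a copy of both slave cores, the load balancer could send this request to either server, with the shards list pointing at whichever copies are currently healthy.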
Re: Replication setup with SolrCloud/Zk
Hi Yury,

How do you manage to start the instances without any issues? The way I see it, no matter which instance is started first, the slave will complain about not being able to find its respective master because that instance hasn't been started yet ... no?

Thanks,
- Pulkit

2011/5/17 Yury Kats yuryk...@yahoo.com
> Ah, thanks! It does seem to work!
Re: Replication setup with SolrCloud/Zk
Sorry, stupid question. Now I see that the core still starts and the polling process simply logs an error:

    SEVERE: Master at: http://localhost:7574/solr/master2/replication is not available. Index fetch failed. Exception: Connection refused

With this thread's help I was able to write up detailed instructions here:
http://pulkitsinghal.blogspot.com/2011/09/multicore-master-slave-replication-in.html

Thanks,
- Pulkit

On Sat, Sep 10, 2011 at 2:54 PM, Pulkit Singhal pulkitsing...@gmail.com wrote:
> How do you manage to start the instances without any issues? The way I see it, no matter which instance is started first, the slave will complain about not being able to find its respective master because that instance hasn't been started yet ... no?
Re: Replication setup with SolrCloud/Zk
On 9/10/2011 3:54 PM, Pulkit Singhal wrote:
> Hi Yury,
> How do you manage to start the instances without any issues? The way I see it, no matter which instance is started first, the slave will complain about not being able to find its respective master because that instance hasn't been started yet ... no?

Yes, but it's not a big deal. The slave polls periodically, so next time around the master will be up.
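The behaviour described here can be pictured with a small simulation. This is only a toy model of "a failed poll is logged and the next poll tries again"; it is not Solr's actual polling code, and every name in it is made up for illustration.

```python
def poll_until_fetched(fetch_index, max_polls, wait=lambda: None):
    """Call fetch_index once per poll; return the poll number that succeeded, or None."""
    for poll in range(1, max_polls + 1):
        try:
            fetch_index()
            return poll
        except ConnectionError as err:
            # Mirrors the logged error: this poll is skipped, nothing crashes.
            print(f"poll {poll}: master not available ({err}); will retry")
            wait()  # stand-in for sleeping one pollInterval
    return None


def make_master(up_after):
    """Return a fake fetch that refuses connections until the given poll number."""
    calls = {"n": 0}

    def fetch_index():
        calls["n"] += 1
        if calls["n"] < up_after:
            raise ConnectionRefusedError("Connection refused")

    return fetch_index


# The master comes up in time for the third poll.
print(poll_until_fetched(make_master(up_after=3), max_polls=5))
```

With two nodes that are each other's master, whichever starts first simply logs a failed poll or two until its peer is reachable.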
Replication setup with SolrCloud/Zk
Hi,

I have two Solr nodes, each managing two cores: a master core and a slave core. The slaves are set up to replicate from the other node's masters. That is, node1.master -> node2.slave, node2.master -> node1.slave.

The replication is configured in each core's solrconfig.xml, e.g.

Master's solrconfig.xml on both nodes:

    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="master">
        <str name="replicateAfter">commit</str>
        <str name="replicateAfter">startup</str>
      </lst>
    </requestHandler>

node1.Slave's solrconfig.xml:

    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="slave">
        <str name="masterUrl">http://node2:8983/solr/master/replication</str>
        <str name="pollInterval">01:00:00</str>
      </lst>
    </requestHandler>

node2.Slave's solrconfig.xml:

    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="slave">
        <str name="masterUrl">http://node1:8983/solr/master/replication</str>
        <str name="pollInterval">01:00:00</str>
      </lst>
    </requestHandler>

This is all working great with regular Solr. I am now trying to move to SolrCloud/ZK and can't figure out how to keep my replication settings.

SolrCloud/ZK seems to manage one configuration for all cores/nodes in the cluster, yet I need to keep 3 different solrconfig.xml files apart: one for the masters and one for each of the slaves. The rest of the configuration (schema.xml etc.) is identical for all cores and can be shared.

I found a reference to master/slave setup with ZK in the wiki [1]. Has it been implemented or is this a proposal? If it is implemented, it's not quite clear to me how to set up the ReplicationHandler so that 2 different slave cores pull from two different masters.

Any help/idea would be appreciated!

Thanks,
Yury

[1] http://wiki.apache.org/solr/ZooKeeperIntegration#Master.2BAC8-Slave
Re: Replication setup with SolrCloud/Zk
Yury,

perhaps Java params (like the ones used for this sample: http://wiki.apache.org/solr/SolrReplication#enable.2BAC8-disable_master.2BAC8-slave_in_a_node) can help you?

Regards
Stefan

2011/5/17 Yury Kats yuryk...@yahoo.com:
> The SolrCloud/ZK seems to be managing one configuration for all cores/nodes in the cluster, yet I need to keep 3 different solrconfig.xml apart -- one for the masters and one for each of the slaves.
Re: Replication setup with SolrCloud/Zk
On 5/17/2011 10:17 AM, Stefan Matheis wrote:
> Yury,
> perhaps Java params (like the ones used for this sample: http://wiki.apache.org/solr/SolrReplication#enable.2BAC8-disable_master.2BAC8-slave_in_a_node) can help you?

Ah, thanks! It does seem to work!

Cluster's solrconfig.xml (shared between all Solr instances and cores via SolrCloud/ZK):

    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="master">
        <str name="enable">${enable.master:false}</str>
        <str name="replicateAfter">commit</str>
        <str name="replicateAfter">startup</str>
      </lst>
      <lst name="slave">
        <str name="enable">${enable.slave:false}</str>
        <str name="pollInterval">00:01:00</str>
        <str name="masterUrl">http://${masterHost:xyz}/solr/master/replication</str>
      </lst>
    </requestHandler>

Node 1 solr.xml:

    <cores adminPath="/admin/cores" defaultCoreName="master">
      <core name="master" instanceDir="core1" shard="shard1" collection="myconf">
        <property name="enable.master" value="true" />
      </core>
      <core name="slave" instanceDir="core2" shard="shard2" collection="myconf">
        <property name="enable.slave" value="true" />
        <property name="masterHost" value="node2:8983" />
      </core>
    </cores>

Node 2 solr.xml:

    <cores adminPath="/admin/cores" defaultCoreName="master">
      <core name="master" instanceDir="core1" shard="shard2" collection="myconf">
        <property name="enable.master" value="true" />
      </core>
      <core name="slave" instanceDir="core2" shard="shard1" collection="myconf">
        <property name="enable.slave" value="true" />
        <property name="masterHost" value="node1:8983" />
      </core>
    </cores>
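The trick in this configuration is the ${name:default} placeholder syntax: one shared solrconfig.xml, with per-core properties deciding which role is enabled. The sketch below is a simplified model of that substitution, written only to show how the same text resolves differently per core. It is not Solr's implementation, and the resolve helper is a made-up name.

```python
import re

# Matches ${name} or ${name:default}.
_PLACEHOLDER = re.compile(r"\$\{([^}:]+)(?::([^}]*))?\}")


def resolve(text, properties):
    """Replace each placeholder with the core's property, else its default."""

    def _substitute(match):
        name, default = match.group(1), match.group(2)
        if name in properties:
            return properties[name]
        if default is not None:
            return default
        raise KeyError(f"no value or default for property {name!r}")

    return _PLACEHOLDER.sub(_substitute, text)


# Properties of the two cores on node 1, taken from the solr.xml above.
master_core = {"enable.master": "true"}
slave_core = {"enable.slave": "true", "masterHost": "node2:8983"}

url_template = "http://${masterHost:xyz}/solr/master/replication"

print(resolve("${enable.master:false}", master_core))  # master section enabled
print(resolve("${enable.master:false}", slave_core))   # falls back to the default
print(resolve(url_template, slave_core))
```

On the master core, enable.slave falls back to false, so the xyz default in masterUrl is never actually polled.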
Re: replication setup
It is always recommended to paste your actual configuration and startup commands, instead of saying "as described in the wiki".

On Tue, Jan 26, 2010 at 9:52 PM, Matthieu Labour matthieu_lab...@yahoo.com wrote:
> I modified both solrconfig.xml for the master the slave as described on the wiki page
> However, I don't see a replication link and the following url solr url / replication returns a 404 error

--
Noble Paul | Systems Architect | AOL | http://aol.com
Re: replication setup
: Subject: replication setup
: In-Reply-To: 83ec2c9c1001260724t110d6595m5071e0a40e1b1...@mail.gmail.com

http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists

When starting a new discussion on a mailing list, please do not reply to an existing message; instead start a fresh email. Even if you change the subject line of your email, other mail headers still track which thread you replied to, and your question is "hidden" in that thread and gets less attention. It makes following discussions in the mailing list archives particularly difficult.

See Also: http://en.wikipedia.org/wiki/User:DonDiego/Thread_hijacking

-Hoss
replication setup
Hi,

I have set up replication following the wiki. I downloaded the latest apache-solr-1.4 release and exploded it in 2 different directories. I modified both solrconfig.xml files, for the master and the slave, as described on the wiki page.

In both directories, I started Solr from the example directory.

On the master:

    java -Dsolr.solr.home=multicore -Djetty.host=0.0.0.0 -Djetty.port=8983 -DSTOP.PORT=8078 -DSTOP.KEY=stop.now -jar start.jar

and on the slave:

    java -Dsolr.solr.home=multicore -Djetty.host=0.0.0.0 -Djetty.port=8982 -DSTOP.PORT=8077 -DSTOP.KEY=stop.now -jar start.jar

I can see core0 and core1 when I open the Solr URL. However, I don't see a replication link, and the URL "solr url / replication" returns a 404 error.

I must be doing something wrong. I would appreciate any help!

Thanks a lot,
matt
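One thing worth checking, offered as an assumption rather than a diagnosis: with -Dsolr.solr.home=multicore, request handlers live under each core's path, so the replication handler would be at /solr/core0/replication rather than /solr/replication, and it only exists if the solrconfig.xml that was edited is the one inside that core's conf directory. A small sketch of the per-core URLs, using the ports from the commands above and the ReplicationHandler's details command:

```python
from urllib.parse import urlencode


def replication_url(host, port, core, command="details"):
    """Build the per-core ReplicationHandler URL for a multicore Solr home."""
    query = urlencode({"command": command})
    return f"http://{host}:{port}/solr/{core}/replication?{query}"


# Ports come from the startup commands; localhost and core0 are assumed here.
master_details = replication_url("localhost", 8983, "core0")
slave_details = replication_url("localhost", 8982, "core0")

print(master_details)
print(slave_details)
```

If these URLs also return 404, the /replication handler is probably not defined in the config files those cores are actually loading.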