Re: Replication Issue with Repeater Please help
On 8/16/2014 8:11 AM, waqas sarwar wrote:
> Thank you so much. You helped a lot. One more question: can I use only
> one ZooKeeper server to manage 3 Solr servers, or do I have to configure 3
> ZooKeeper servers, one for each? And should the ZooKeeper servers be
> standalone, or is it better to use the same machines as Solr?
>
> Best Regards,
> Waqas

I think Erick basically said the same thing as this, in a slightly different
way:

If you want ZooKeeper to be fault tolerant, you must have at least three
servers running it. One ZooKeeper will work, but if it goes down, SolrCloud
doesn't function properly. Three are needed for full redundancy. If one of
the three goes down, the other two will still function as a quorum.

You can use the same servers for ZooKeeper and Solr. This *can* be a source
of performance problems, but that will usually only be a problem if you put
a major load on your SolrCloud. If you do put them on the same server, I
would recommend putting the ZK database on a separate disk or disks -- the
CPU requirements for ZooKeeper are very small, but it relies on extremely
responsive I/O to/from its database.

As Erick said, we strongly recommend that you don't use the embedded ZK --
this starts up a ZooKeeper server in the same Java process as Solr. If Solr
is stopped or goes down, you also lose ZooKeeper.

Thanks,
Shawn
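The quorum arithmetic above can be checked with a few lines of Python. This is only a sketch: it assumes ZooKeeper's usual majority rule (more than half of the ensemble must be up), and the function names and the solr1..solr3 node names are made up for illustration. The restart check assumes embedded ZK, so that stopping a Solr node also stops the ZooKeeper inside it.

```python
from typing import Iterable, List, Set


def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of an ensemble with this many servers."""
    if ensemble_size < 1:
        raise ValueError("an ensemble needs at least one server")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be down while a majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


def restart_keeps_quorum(ensemble: Set[str], batches: Iterable[Set[str]]) -> bool:
    """Check a restart plan for nodes that run ZK embedded in Solr.

    Each batch is a set of nodes stopped at the same time; a batch is
    assumed to be fully back up before the next one is stopped.
    """
    needed = quorum_size(len(ensemble))
    for batch in batches:
        still_up = ensemble - batch
        if len(still_up) < needed:
            return False
    return True


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(f"{n} server(s): quorum {quorum_size(n)}, "
              f"can lose {tolerated_failures(n)}")

    nodes = {"solr1", "solr2", "solr3"}  # hypothetical node names
    one_at_a_time: List[Set[str]] = [{"solr1"}, {"solr2"}, {"solr3"}]
    # Restarting one node at a time versus stopping two together.
    print(restart_keeps_quorum(nodes, one_at_a_time))
    print(restart_keeps_quorum(nodes, [{"solr1", "solr2"}]))
```

With three servers the quorum is two, so one server can be down. A two-server ensemble tolerates no failures at all, which is why three is the practical minimum for redundancy.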
Re: Replication Issue with Repeater Please help
It Depends (tm).

One ZooKeeper is a single point of failure. It goes away and your SolrCloud
cluster is kinda hosed. OTOH, with only 3 servers, the chance that one of
them is going down is low anyway. How lucky do you feel?

I would be cautious about running your ZK instances embedded,
super-especially if there's only one ZK instance. That couples your ZK
instances with your Solr instances. So if for any reason you want to
stop/start Solr, you will stop/start ZK as well and it's easy to fall below a
quorum.

It's perfectly viable to run them embedded, especially on a very small
cluster. You do have to think a bit more about sequencing Solr nodes
going up/down is all.

Best,
Erick

On Sat, Aug 16, 2014 at 7:11 AM, waqas sarwar wrote:
> Thank you so much. You helped a lot. One more question: can I use only
> one ZooKeeper server to manage 3 Solr servers, or do I have to configure 3
> ZooKeeper servers, one for each? And should the ZooKeeper servers be
> standalone, or is it better to use the same machines as Solr?
>
> Best Regards,
> Waqas
RE: Replication Issue with Repeater Please help
> Date: Thu, 14 Aug 2014 06:51:02 -0600
> From: s...@elyograg.org
> To: solr-user@lucene.apache.org
> Subject: Re: Replication Issue with Repeater Please help
>
> [...]
> You configure three machines as your zookeeper ensemble, using the
> zookeeper download and instructions for a clustered setup:
> [...]
> You can use the same servers for ZK as you do for Solr, but be aware
> that if Solr puts a large I/O load on the disks, you may want the ZK
> database to be on its own disk(s) so that it responds quickly.
> [...]
>
> Thanks,
> Shawn

Thank you so much. You helped a lot. One more question: can I use only
one ZooKeeper server to manage 3 Solr servers, or do I have to configure 3
ZooKeeper servers, one for each? And should the ZooKeeper servers be
standalone, or is it better to use the same machines as Solr?

Best Regards,
Waqas
Re: Replication Issue with Repeater Please help
On 8/14/2014 2:09 AM, waqas sarwar wrote:
> Thanks Shawn. What I got is that circular replication is totally
> impossible and Solr fails in a distributed environment. Then why does the
> Solr documentation say to configure "REPEATER" for a distributed
> architecture, since a "REPEATER" behaves like a master and a slave at the
> same time?
>
> Can I configure SolrCloud on a LAN, or do I have to configure ZooKeeper
> myself? Please provide me any solution for LAN distributed servers. If
> ZooKeeper is the only solution, then please provide me a link on
> configuring it, so that I can avoid going in the wrong direction.

The repeater config is designed to avoid master overload from many
slaves. So instead of configuring ten slaves to replicate from one
master, you configure two slaves to replicate directly from your master,
and then you configure those as repeaters. The other eight slaves are
configured so that four of them replicate from each of the repeaters
instead of the true master, reducing the load.

SolrCloud is the easiest way to build a fully distributed and redundant
solution. It is designed for a LAN. You configure three machines as
your ZooKeeper ensemble, using the ZooKeeper download and instructions
for a clustered setup:

http://zookeeper.apache.org/doc/r3.4.6/zookeeperAdmin.html#sc_zkMulitServerSetup

The way to start Solr in cloud mode is to give it a zkHost system
property. That informs Solr about all of your ZK servers. If you have
another way of setting that property, you can use that instead. I
strongly recommend using a chroot with the zkHost parameter, but that is
not required. Search the ZooKeeper page linked above for "chroot" to
find a link to additional documentation about chroot.

You can use the same servers for ZK as you do for Solr, but be aware
that if Solr puts a large I/O load on the disks, you may want the ZK
database to be on its own disk(s) so that it responds quickly.
Separate servers are even better, but not strictly required unless the
servers are under extreme load.

https://cwiki.apache.org/confluence/display/solr/SolrCloud

You will find a "Getting Started" link on the page above. Note that the
"Getting Started" page talks about a zkRun option, which starts an
embedded ZooKeeper as part of Solr. I strongly recommend that you do
NOT take this route, except for *initial* testing. SolrCloud works much
better if the ZooKeeper ensemble is in its own process, separate from Solr.

Thanks,
Shawn
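As a rough illustration of the zkHost value mentioned above, the sketch below assembles a ZooKeeper connection string with a chroot. The host names and the /solr chroot are placeholders, not values from this thread; 2181 is ZooKeeper's default client port. The point to notice is that the chroot is appended once, after the last host, not after every host.

```python
from typing import List, Tuple


def build_zk_host(servers: List[Tuple[str, int]], chroot: str = "") -> str:
    """Join host:port pairs with commas and append an optional chroot path."""
    if not servers:
        raise ValueError("at least one ZooKeeper server is required")
    if chroot and not chroot.startswith("/"):
        raise ValueError("a chroot must start with '/'")
    hosts = ",".join(f"{host}:{port}" for host, port in servers)
    # The chroot goes once at the very end of the string.
    return hosts + chroot


if __name__ == "__main__":
    # Hypothetical three-machine ensemble on a LAN.
    ensemble = [
        ("zk1.example.lan", 2181),
        ("zk2.example.lan", 2181),
        ("zk3.example.lan", 2181),
    ]
    zk_host = build_zk_host(ensemble, chroot="/solr")
    # Passed to the JVM as a system property, e.g. -DzkHost=<value>.
    print(f"-DzkHost={zk_host}")
```

Every Solr node would be started with the same value, so that all of them see the same ensemble and the same chroot.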
RE: Replication Issue with Repeater Please help
> Date: Wed, 13 Aug 2014 07:19:58 -0600
> From: s...@elyograg.org
> To: solr-user@lucene.apache.org
> Subject: Re: Replication Issue with Repeater Please help
>
> On 8/13/2014 12:49 AM, waqas sarwar wrote:
> > [...]
>
> With master-slave replication, there must be a clear master, from which
> slaves replicate. You can't set up fully circular replication, or the
> master will replicate from the empty slave and your data will be gone.
> This form of replication does not merge data -- it makes the slave index
> identical to the master by copying the actual files on disk for the index.
>
> I think you'll want to use SolrCloud. You have three machines, so you
> have the minimum number for a redundant zookeeper ensemble. SolrCloud
> relies on zookeeper to handle cluster functions. SolrCloud is a true
> cluster -- no replication, no master.
>
> https://cwiki.apache.org/confluence/display/solr/SolrCloud
>
> Thanks,
> Shawn

Thanks Shawn. What I got is that circular replication is totally
impossible and Solr fails in a distributed environment. Then why does the
Solr documentation say to configure "REPEATER" for a distributed
architecture, since a "REPEATER" behaves like a master and a slave at the
same time?

Can I configure SolrCloud on a LAN, or do I have to configure ZooKeeper
myself? Please provide me any solution for LAN distributed servers. If
ZooKeeper is the only solution, then please provide me a link on
configuring it, so that I can avoid going in the wrong direction.

Regards,
Waqas
Re: Replication Issue with Repeater Please help
On 8/13/2014 12:49 AM, waqas sarwar wrote:
> Hi, I'm using Solr. I need a little bit of assistance from you. I am a bit
> stuck with Solr replication. Before discussing the issue, let me write a
> brief description.
>
> Scenario:- I want to set up Solr in a distributed architecture, starting
> with the least number of nodes (suppose 3). How can I replicate the data
> of each node to the 2 others and vice versa?
>
> My Solution:- I set up "REPEATER" on all nodes, so each node is master to
> the others, and configured circular replication.
>
> Issue I'm facing:- All nodes are working fine replicating data from the
> other nodes, but when node1 replicates data from node2, node1 loses its
> own data. I think node1 should at least not lose its own data and should
> merge in the new data.
>
> I think the question is now pretty simple and clear: I want to set up
> Solr in a distributed architecture where each node is a replica of the
> others. How may I achieve it? Is there any other way, apart from Repeater
> and circular replication using repeaters, to replicate the data of each
> node to all the others?
>
> Environment:- LAN, Solr (3.6 to 4.9), Red Hat

With master-slave replication, there must be a clear master, from which
slaves replicate. You can't set up fully circular replication, or the
master will replicate from the empty slave and your data will be gone.
This form of replication does not merge data -- it makes the slave index
identical to the master by copying the actual files on disk for the index.

I think you'll want to use SolrCloud. You have three machines, so you
have the minimum number for a redundant ZooKeeper ensemble. SolrCloud
relies on ZooKeeper to handle cluster functions. SolrCloud is a true
cluster -- no replication, no master.

https://cwiki.apache.org/confluence/display/solr/SolrCloud

Thanks,
Shawn
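To make the "copy, not merge" point concrete, here is a toy model in Python. It is not how Solr implements replication (Solr copies index files on disk); the dictionaries simply stand in for each node's index, and the node and document names are invented for the example.

```python
from typing import Dict

Index = Dict[str, str]


def replicate(master: Index, slave: Index) -> None:
    """Make the slave identical to the master; nothing is merged."""
    slave.clear()
    slave.update(master)


def circular_replication_demo() -> Index:
    """node1 pulls from node2 while both hold different documents."""
    node1: Index = {"doc-a": "indexed on node1"}
    node2: Index = {"doc-b": "indexed on node2"}
    # node1 acts as a slave of node2 for this pull.
    replicate(master=node2, slave=node1)
    return node1


if __name__ == "__main__":
    print(circular_replication_demo())
```

After the pull, node1 holds only doc-b; its own doc-a is gone, which matches the behaviour described in the question.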
Replication Issue with Repeater Please help
Hi, I'm using Solr. I need a little bit of assistance from you. I am a bit
stuck with Solr replication. Before discussing the issue, let me write a
brief description.

Scenario:- I want to set up Solr in a distributed architecture, starting
with the least number of nodes (suppose 3). How can I replicate the data
of each node to the 2 others and vice versa?

My Solution:- I set up "REPEATER" on all nodes, so each node is master to
the others, and configured circular replication.

Issue I'm facing:- All nodes are working fine replicating data from the
other nodes, but when node1 replicates data from node2, node1 loses its
own data. I think node1 should at least not lose its own data and should
merge in the new data.

I think the question is now pretty simple and clear: I want to set up
Solr in a distributed architecture where each node is a replica of the
others. How may I achieve it? Is there any other way, apart from Repeater
and circular replication using repeaters, to replicate the data of each
node to all the others?

Environment:- LAN, Solr (3.6 to 4.9), Red Hat