Re: Order of hosts in zkHost
I believe Arcadius has a point, but I still think the answer is no.

ZooKeeper clients (Solr/SolrJ) connect to a single ZooKeeper server instance at a time and keep that session open to the same server for as long as they can or need to. During this time, all interactions between the client and the ZK ensemble go through that one ZK server instance. (Some operations require that server to talk to the leader, but not all; reads, for example, are served locally.) A client only switches to a different ZooKeeper server instance if the connection is lost for some reason. If all the clients were connected to the same ZK server, the load wouldn't be evenly distributed.

However, according to the ZooKeeper documentation [1] (and I haven't tested this), ZK clients don't choose the servers from the connection string in order:

"To create a client session the application code must provide a connection string containing a comma separated list of host:port pairs, each corresponding to a ZooKeeper server (e.g. "127.0.0.1:4545" or "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002"). The ZooKeeper client library will pick an arbitrary server and try to connect to it."

Tomás

[1] http://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html

On Fri, Sep 4, 2015 at 9:12 AM, Erick Erickson wrote:
> Arcadius:
>
> Note that one of the more recent changes is "per collection states" in ZK. So rather than have one huge clusterstate.json that gets passed out to all collections on any change, the listeners can now listen only to specific collections.
>
> Reduces the "thundering herd" problem.
>
> Best,
> Erick
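To make the difference concrete, here is a rough simulation of the two behaviours discussed above: every client always taking the first entry of zkHost, versus every client picking an arbitrary entry as the quoted documentation describes. It is only a toy model, not ZooKeeper client code. The port 2181 and the helper names (`parse_zk_host`, `pick_first`, `pick_arbitrary`, `simulate`) are assumptions made for the illustration.

```python
import random
from collections import Counter


def parse_zk_host(zk_host):
    """Split a zkHost-style connection string into its host:port entries."""
    return [entry.strip() for entry in zk_host.split(",") if entry.strip()]


def pick_first(servers, rng):
    """Hypothetical in-order behaviour: always take the first listed server."""
    return servers[0]


def pick_arbitrary(servers, rng):
    """Behaviour described in the quoted docs: pick an arbitrary server."""
    return rng.choice(servers)


def simulate(zk_host, n_clients, pick, seed=0):
    """Count how many of n_clients end up connected to each server."""
    rng = random.Random(seed)
    servers = parse_zk_host(zk_host)
    return Counter(pick(servers, rng) for _ in range(n_clients))


# Port 2181 is an assumption; the original post only says "port".
zk_host = "zk1:2181,zk2:2181,zk3:2181"

in_order = simulate(zk_host, 10, pick_first)
arbitrary = simulate(zk_host, 10, pick_arbitrary)

print("in order :", dict(in_order))
print("arbitrary:", dict(arbitrary))
```

In the in-order model all 10 sessions land on zk1 regardless of what the clients do afterwards. In the arbitrary model the sessions are spread by chance, so reordering the string by hand should add little, provided the client really does behave as documented.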
Re: Order of hosts in zkHost
Arcadius:

Note that one of the more recent changes is "per collection states" in ZK. So rather than have one huge clusterstate.json that gets passed out to all collections on any change, the listeners can now listen only to specific collections.

Reduces the "thundering herd" problem.

Best,
Erick

On Fri, Sep 4, 2015 at 12:39 AM, Arcadius Ahouansou wrote:
> Hello Shawn.
>
> This question was raised because IMHO, apart from leader election, there are other load-generating activities, such as all 10 SolrJ clients plus the SolrCloud nodes listening for changes on clusterstate.json/state.json and downloading the whole file whenever there is a change. All of that would happen on zk1 only if we did not shuffle. That's the theory.
>
> I could test this and see.
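As a rough illustration of why per-collection state reduces the "thundering herd", the toy model below counts how many clients would be woken by a single change under each layout. It is a sketch only: `WatchRegistry` is a made-up stand-in for ZooKeeper watches (it ignores that real watches are one-shot), and the collection names, client names, and the split of clients across collections are assumptions.

```python
from collections import defaultdict


class WatchRegistry:
    """Toy stand-in for ZooKeeper watches: znode path -> watching clients."""

    def __init__(self):
        self._watchers = defaultdict(set)

    def watch(self, path, client):
        self._watchers[path].add(client)

    def notify(self, path):
        """Return the clients that a change to `path` would wake up."""
        return sorted(self._watchers[path])


# Hypothetical setup: clients 0-1 use collA, clients 2-9 use collB.
clients = {"client%d" % i: ("collA" if i < 2 else "collB") for i in range(10)}

# Shared layout: everyone watches the single clusterstate.json.
shared = WatchRegistry()
for client in clients:
    shared.watch("/clusterstate.json", client)

# Per-collection layout: each client watches only its collection's state.json.
per_collection = WatchRegistry()
for client, collection in clients.items():
    per_collection.watch("/collections/%s/state.json" % collection, client)

# A change that only affects collA:
woken_shared = shared.notify("/clusterstate.json")
woken_per_collection = per_collection.notify("/collections/collA/state.json")

print(len(woken_shared), "clients woken with a shared clusterstate.json")
print(len(woken_per_collection), "clients woken with per-collection state.json")
```

Under these assumptions the shared file wakes all 10 clients for a change to collA, while the per-collection layout wakes only the 2 clients that use it.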
Re: Order of hosts in zkHost
Hello Shawn.

This question was raised because IMHO, apart from leader election, there are other load-generating activities, such as all 10 SolrJ clients plus the SolrCloud nodes listening for changes on clusterstate.json/state.json and downloading the whole file whenever there is a change. All of that would happen on zk1 only if we did not shuffle. That's the theory.

I could test this and see.

On Sep 4, 2015 6:27 AM, "Shawn Heisey" wrote:
> I don't think this is going to make much difference. Here's why, assuming that my understanding of how it all works is correct:
>
> One of the things zookeeper does is manage elections. It helps figure out which member of a cluster is the leader. I think Zookeeper uses this concept internally, too. One of the hosts in the ensemble will be elected to be the leader, which accepts all input and replicates it to the other members of the cluster. All of the clients will be talking to the leader first, no matter what order the hosts are listed.
>
> If my understanding of how this works is flawed, then what I just said is probably wrong.
>
> Thanks,
> Shawn
Re: Order of hosts in zkHost
On 9/3/2015 9:47 PM, Arcadius Ahouansou wrote:
> Let's say we have 10 SolrJ clients all configured with
> zkhost=zk1:port,zk2:port,zk3:port
>
> For each of the 10 SolrJ clients, would it make a difference in term of load on zk1 (the server on the list) if we shuffle around the order of the ZK servers in zkHost or is it all the same?
>
> I would have thought that shuffling would lower load on zk1.

I don't think this is going to make much difference. Here's why, assuming that my understanding of how it all works is correct:

One of the things zookeeper does is manage elections. It helps figure out which member of a cluster is the leader. I think Zookeeper uses this concept internally, too. One of the hosts in the ensemble will be elected to be the leader, which accepts all input and replicates it to the other members of the cluster. All of the clients will be talking to the leader first, no matter what order the hosts are listed.

If my understanding of how this works is flawed, then what I just said is probably wrong.

Thanks,
Shawn
Order of hosts in zkHost
Hello.

Let's say we have 10 SolrJ clients all configured with
zkhost=zk1:port,zk2:port,zk3:port

For each of the 10 SolrJ clients, would it make a difference in terms of load on zk1 (the first server in the list) if we shuffle around the order of the ZK servers in zkHost, or is it all the same?

I would have thought that shuffling would lower load on zk1.

Thanks.

--
Arcadius Ahouansou
Menelic Ltd | Information is Power
M: 07908761999
W: www.menelic.com