Great, thanks!
On Thu, Nov 4, 2010 at 10:04 PM, Chang Song <tru64...@me.com> wrote:
>
> Benjamin.
> It looks like ZK clients can handle a list of IPs from DNS query correctly.
> Yes you are right.
>
> I am updating wiki per Patrick's request.
>
> Thanks a lot.
>
> Chang
>
>
>
> On Nov 5, 2010, at 1:10 AM, Benjamin Reed wrote:
>
>> one thing to note: if you are using a DNS load balancer, some load
>> balancers will return the list of resolved addresses in different orders to
>> do the balancing. the zookeeper client will shuffle that list before it is
>> used, so in reality, using a single DNS hostname resolving to all the server
>> addresses will probably work just as well as most DNS-based load balancers.
>>
>> ben
>>
>> On 11/04/2010 08:26 AM, Patrick Hunt wrote:
>>> Hi Chang, thanks for the insights, if you have a few minutes would you
>>> mind updating the FAQ with some of this detail?
>>> http://wiki.apache.org/hadoop/ZooKeeper/FAQ
>>>
>>> Thanks!
>>>
>>> Patrick
>>>
>>> On Thu, Nov 4, 2010 at 6:27 AM, Chang Song<tru64...@me.com> wrote:
>>>> Sorry. I made a mistake on retry timeout in load balancer section of my
>>>> answer.
>>>> The same timeout applies to load balancer case as well (depends on the
>>>> recv timeout)
>>>>
>>>> Thank you
>>>>
>>>> Chang
>>>>
>>>>
>>>> On Nov 4, 2010, at 10:22 PM, Chang Song wrote:
>>>>
>>>>> I would like to add some info on this.
>>>>>
>>>>> This may not be very important, but there are subtle differences.
>>>>>
>>>>> Two cases: 1. server hardware failure or kernel panic
>>>>>            2. zookeeper Java daemon process down
>>>>>
>>>>> In former one, timeout will be based on the timeout argument in
>>>>> zookeeper_init().
>>>>> Partially based on ZK heartbeat algorithm. It recognizes server down in
>>>>> 2/3 of the timeout,
>>>>> then retries at every timeout. For example, if timeout is 9000 msec, it
>>>>> first times out in 6 seconds, and retries every 9 seconds.
>>>>>
>>>>> In latter case (Java process down), since socket connect immediately
>>>>> returns refused connection, it can retry immediately.
>>>>>
>>>>> On top of that,
>>>>>
>>>>> - Hardware load balancer:
>>>>> If an ensemble cluster is serviced with hardware load balancer,
>>>>> zookeeper client will retry every 2 seconds since we only have one IP
>>>>> to try.
>>>>>
>>>>> - DNS RR:
>>>>> Make sure that "nscd" on your linux box is off since it is most likely
>>>>> that DNS cache returns the same IP many times.
>>>>> This is actually worse than above since ZK client will retry the same
>>>>> dead server every 2 seconds for some time.
>>>>>
>>>>>
>>>>> I think it is best not to use load balancer for ZK clients since ZK
>>>>> clients will try next server immediately
>>>>> if previous one fails for some reason (based on timeout above). And this
>>>>> is especially true if your cluster works in
>>>>> pseudo realtime environment where tickTime is set to very low.
>>>>>
>>>>>
>>>>> Chang
>>>>>
>>>>>
>>>>> On Nov 4, 2010, at 9:17 AM, Ted Dunning wrote:
>>>>>
>>>>>> DNS round-robin works as well.
>>>>>>
>>>>>> On Wed, Nov 3, 2010 at 3:45 PM, Benjamin Reed<br...@yahoo-inc.com>
>>>>>> wrote:
>>>>>>
>>>>>>> it would have to be a TCP based load balancer to work with ZooKeeper
>>>>>>> clients, but other than that it should work really well. The clients
>>>>>>> will be doing heart beats so the TCP connections will be long lived.
>>>>>>> The client library does random connection load balancing anyway.
>>>>>>>
>>>>>>> ben
>>>>>>>
>>>>>>> On 11/03/2010 12:19 PM, Luka Stojanovic wrote:
>>>>>>>
>>>>>>>> What would be expected behavior if a three node cluster is put behind
>>>>>>>> a load balancer?
>>>>>>>> It would ease deployment because all clients would be configured
>>>>>>>> to target zookeeper.example.com regardless of actual cluster
>>>>>>>> configuration, but I have impression that client-server connection is
>>>>>>>> stateful and that jumping randomly from server to server could bring
>>>>>>>> strange behavior.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>> --
>>>>>>>> Luka Stojanovic
>>>>>>>> lu...@vast.com
>>>>>>>> Platform Engineering
>>>>>>>>
>>>>>>>
>>>>
>>
>
>
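
For anyone skimming the thread, the timing Chang describes and the address shuffling Ben mentions can be sketched roughly as below. This is a minimal Python illustration, not the ZooKeeper client code. The function names detection_schedule_ms and connection_order are hypothetical, and the "2/3 of the timeout, then one retry per full timeout" rule is taken from the thread as stated, not checked against the client source.

```python
import random


def detection_schedule_ms(session_timeout_ms, retries):
    """Times (ms after a server dies) at which the client reacts.

    Assumption from the thread: the failure is noticed after 2/3 of the
    timeout passed to zookeeper_init(), then one retry per full timeout.
    """
    first = session_timeout_ms * 2 // 3
    return [first + i * session_timeout_ms for i in range(retries + 1)]


def connection_order(addresses, rng=random):
    """Return a shuffled copy of the resolved server addresses.

    Assumption from the thread: the client shuffles the list before use,
    so the order returned by DNS does not decide which server is tried.
    """
    shuffled = list(addresses)
    rng.shuffle(shuffled)
    return shuffled


if __name__ == "__main__":
    # 9000 msec timeout: noticed at 6 s, then a retry every 9 s.
    print(detection_schedule_ms(9000, 2))
    print(connection_order(["10.0.0.1:2181", "10.0.0.2:2181", "10.0.0.3:2181"]))
```

With a 9000 msec timeout the first call gives 6000, 15000 and 24000 msec, which matches the "6 seconds, then every 9 seconds" example in the thread. The addresses are made-up placeholders.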