Re: Running cluster behind load balancer
I would like to add some information on this. It may not be very important, but there are subtle differences between two failure cases:

1. Server hardware failure or kernel panic
2. ZooKeeper Java daemon process down

In the former case, the timeout is based on the timeout argument passed to zookeeper_init(), and partially on the ZK heartbeat algorithm. The client recognizes that the server is down after 2/3 of the timeout, then retries at every timeout. For example, if the timeout is 9000 msec, it first times out after 6 seconds and then retries every 9 seconds.

In the latter case (Java process down), the socket connect immediately returns connection refused, so the client can retry immediately.

On top of that:

- Hardware load balancer: If an ensemble cluster is served through a hardware load balancer, the ZooKeeper client will retry every 2 seconds, since there is only one IP to try.
- DNS RR: Make sure that nscd on your Linux box is off, since the DNS cache will most likely return the same IP many times. This is actually worse than the above, since the ZK client will retry the same dead server every 2 seconds for some time.

I think it is best not to use a load balancer for ZK clients, since ZK clients will try the next server immediately if the previous one fails for some reason (based on the timeout above). This is especially true if your cluster works in a pseudo-realtime environment where tickTime is set very low.

Chang

On Nov 4, 2010, at 9:17 AM, Ted Dunning wrote:
> DNS round-robin works as well.
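The timing arithmetic in Chang's example can be sketched as follows. This is only a model of the behavior described in the post, not the client's actual code; the function name `failure_timeline` is hypothetical, and it assumes that "retries every 9 seconds" means each later attempt is spaced one full timeout after the previous one.

```python
def failure_timeline(session_timeout_ms, retries):
    """Model when a client gives up on a silent server.

    Assumption taken from the post: the first give-up happens after 2/3
    of the session timeout, and each later retry follows one full
    timeout after the previous one. Times are in milliseconds, measured
    from the last response received from the server.
    """
    first = session_timeout_ms * 2 // 3
    return [first + i * session_timeout_ms for i in range(retries + 1)]


# Chang's example: a 9000 msec timeout passed to zookeeper_init().
print(failure_timeline(9000, 2))  # [6000, 15000, 24000]
```

With a 9000 msec timeout the model gives the 6 second first detection from the post, followed by attempts 9 seconds apart.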
Re: Running cluster behind load balancer
Sorry, I made a mistake about the retry timeout in the load balancer section of my answer. The same timeout applies to the load balancer case as well (it depends on the recv timeout).

Thank you,
Chang

On Nov 4, 2010, at 10:22 PM, Chang Song wrote:
> I would like to add some info on this. This may not be very important, but there are subtle differences. [...]
Re: Running cluster behind load balancer
Hi Chang,

Thanks for the insights. If you have a few minutes, would you mind updating the FAQ with some of this detail?
http://wiki.apache.org/hadoop/ZooKeeper/FAQ

Thanks!
Patrick

On Thu, Nov 4, 2010 at 6:27 AM, Chang Song tru64...@me.com wrote:
> Sorry. I made a mistake on retry timeout in load balancer section of my answer. The same timeout applies to load balancer case as well (depends on the recv timeout)
>
> Thank you
> Chang
Re: Running cluster behind load balancer
One thing to note: if you are using a DNS load balancer, some load balancers will return the list of resolved addresses in different orders to do the balancing. The ZooKeeper client will shuffle that list before it is used, so in reality, using a single DNS hostname resolving to all the server addresses will probably work just as well as most DNS-based load balancers.

ben

On 11/04/2010 08:26 AM, Patrick Hunt wrote:
> Hi Chang, thanks for the insights, if you have a few minutes would you mind updating the FAQ with some of this detail?
> http://wiki.apache.org/hadoop/ZooKeeper/FAQ
>
> Thanks!
> Patrick
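A rough sketch of the behavior Ben describes, where one hostname resolves to every server and the list is shuffled before use. This is an illustration under stated assumptions, not the ZooKeeper client's implementation: `resolve_all` and `shuffled_servers` are hypothetical helper names, and the addresses in the usage lines are made up.

```python
import random
import socket


def resolve_all(hostname, port):
    """Return every distinct IP address the resolver reports for hostname."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    # Each entry ends with a sockaddr tuple whose first field is the IP.
    return sorted({info[4][0] for info in infos})


def shuffled_servers(addresses, rng=random):
    """Return a shuffled copy, so the order the DNS server chose is irrelevant."""
    order = list(addresses)
    rng.shuffle(order)
    return order


# Hypothetical addresses standing in for a three-node ensemble.
addresses = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
print(shuffled_servers(addresses))
```

Because every client shuffles independently, the spread across servers does not depend on the order in which DNS returns the records. In a real deployment the list would come from something like `resolve_all("zookeeper.example.com", 2181)`.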
Re: Running cluster behind load balancer
Benjamin,

It looks like ZK clients can handle a list of IPs from a DNS query correctly. Yes, you are right. I am updating the wiki per Patrick's request.

Thanks a lot.
Chang

On Nov 5, 2010, at 1:10 AM, Benjamin Reed wrote:
> One thing to note: if you are using a DNS load balancer, some load balancers will return the list of resolved addresses in different orders to do the balancing. The ZooKeeper client will shuffle that list before it is used, so in reality, using a single DNS hostname resolving to all the server addresses will probably work just as well as most DNS-based load balancers.
>
> ben
Re: Running cluster behind load balancer
Great, thanks!

On Thu, Nov 4, 2010 at 10:04 PM, Chang Song tru64...@me.com wrote:
> Benjamin. It looks like ZK clients can handle a list of IPs from DNS query correctly. Yes you are right. I am updating wiki per Patrick's request. Thanks a lot.
>
> Chang
Running cluster behind load balancer
What would be the expected behavior if a three-node cluster is put behind a load balancer? It would ease deployment, because all clients would be configured to target zookeeper.example.com regardless of the actual cluster configuration, but I have the impression that the client-server connection is stateful and that jumping randomly from server to server could cause strange behavior.

Cheers,
--
Luka Stojanovic
lu...@vast.com
Platform Engineering
Re: Running cluster behind load balancer
DNS round-robin works as well.

On Wed, Nov 3, 2010 at 3:45 PM, Benjamin Reed br...@yahoo-inc.com wrote:
> it would have to be a TCP based load balancer to work with ZooKeeper clients, but other than that it should work really well. The clients will be doing heart beats so the TCP connections will be long lived. The client library does random connection load balancing anyway.
>
> ben
>
> On 11/03/2010 12:19 PM, Luka Stojanovic wrote:
> > What would be expected behavior if a three node cluster is put behind a load balancer? It would ease deployment because all clients would be configured to target zookeeper.example.com regardless of actual cluster configuration, but I have impression that client-server connection is stateful and that jumping randomly from server to server could bring strange behavior.
> >
> > Cheers,
> > --
> > Luka Stojanovic
> > lu...@vast.com
> > Platform Engineering
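For comparison with the load balancer setup, here is a minimal sketch of the alternative discussed in this thread: give the client every server in a comma-separated host:port list and let it move on to the next entry when one refuses the connection. The helper names, the hostnames, and the injected `connect` callable are all assumptions made for illustration; 2181 is ZooKeeper's usual client port. The sketch does not handle IPv6 literals and is not the real client's logic.

```python
def parse_connect_string(connect_string, default_port=2181):
    """Split 'host:port,host:port' into a list of (host, port) tuples."""
    servers = []
    for entry in connect_string.split(","):
        host, _, port = entry.strip().partition(":")
        servers.append((host, int(port) if port else default_port))
    return servers


def connect_first_available(servers, connect):
    """Try each server in order and return the first successful connection.

    `connect` is any callable taking (host, port) that raises OSError
    (for example ConnectionRefusedError) when the server is unreachable.
    """
    for host, port in servers:
        try:
            return (host, port), connect(host, port)
        except OSError:
            continue  # a refused connection fails fast, so move on at once
    raise ConnectionError("no server in the list accepted a connection")


# Hypothetical hostnames for a three-node ensemble.
servers = parse_connect_string(
    "zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181"
)
print(servers)
```

A real `connect` could be something like `lambda h, p: socket.create_connection((h, p), timeout=2)`. Behind a single load balancer address the list has only one entry, so there is no next server to fall through to.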