Re: the error

2010-03-31 Thread Ted Dunning
As I pointed out in my response, you should distinguish hard and soft failures. If one machine fails, even catastrophically, you can provide a new machine to replace it, thus converting a hard failure into a soft one. The conclusion is the same: three machines are vastly better than one or two. O

Re: the error

2010-03-31 Thread Ted Dunning
Suppose a machine has probability of soft failure p_1 and of catastrophic failure p_2 << p_1. Assume that two machines have independent failure modes. Probability of soft failure of a one-machine cluster = p_1, two-machine cluster = probability of soft failure of 1 or 2 machines + probability of one machine ha
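
[Editor's illustration, not part of the original message, which is cut off above.] The comparison of one-, two- and three-machine clusters can be sketched numerically. This is a minimal sketch under stated assumptions: a strict-majority quorum, machines that fail independently, and a single per-machine failure probability p rather than the separate p_1 and p_2 used in the message. The function name quorum_loss_probability and the value p = 0.01 are hypothetical choices made for illustration.

```python
from math import comb


def quorum_loss_probability(n: int, p: float) -> float:
    """Probability that an n-machine cluster cannot form a strict majority.

    Assumption: each machine is down independently with probability p.
    """
    quorum = n // 2 + 1          # machines needed for a strict majority
    tolerated = n - quorum       # failures the cluster can survive
    # Sum the binomial probabilities of every outcome with too many failures.
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(tolerated + 1, n + 1)
    )


if __name__ == "__main__":
    p = 0.01  # illustrative per-machine failure probability (assumption)
    for n in (1, 2, 3):
        print(n, quorum_loss_probability(n, p))
```

Under these assumptions a two-machine cluster loses its quorum when either machine is down, so it comes out roughly twice as likely to be unavailable as a single machine, while a three-machine cluster tolerates one failure and comes out far more reliable than both.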

Re: the error

2010-03-31 Thread Henry Robinson
Using two machines running ZK will actually decrease your reliability >> compared to using a single machine. Consider using one machine or three. >> > > ? > > Not meaning to pull the thread off-topic, but I don't understand why this > should be the case. Can you elaborate? > > With majority-based

Re: the error

2010-03-31 Thread David Rosenstrauch
On 03/31/2010 02:10 PM, Ted Dunning wrote: To add to Patrick's comments, I hope you mean that you are connecting to ZK from a cluster of two machines rather than having only two machines that form a ZK cluster. Using two machines running ZK will actually decrease your reliability compared to usi

Re: the error

2010-03-31 Thread Ted Dunning
To add to Patrick's comments, I hope you mean that you are connecting to ZK from a cluster of two machines rather than having only two machines that form a ZK cluster. Using two machines running ZK will actually decrease your reliability compared to using a single machine. Consider using one mach

Re: the error

2010-03-31 Thread Patrick Hunt
Hi Li, when you say 17 threads reading a znode, do you mean that you have 17 threads, each creating a session and using that session to read a znode? If so, it's probably due to this: http://hadoop.apache.org/zookeeper/docs/current/zookeeperAdmin.html#sc_advancedConfiguration see the parameter "