Here is the scenario I was concerned about. Consider three nodes in the system, A, B and C, placed in different racks. Say the disk on A fries today. The blocks that were stored on A are not going to be re-replicated (this is my understanding, but I could be wrong in this assumption) to some other node or to the new disk with which you would bring A back. A month later the disk on B could fry, and another month later the disk on C could fry. This way you could slowly start losing data in the absence of a replica synchronization algorithm like the one in S3. This would never happen in S3, because a replica synchronization algorithm is always running to guarantee that there are always 3 replicas in the system: if a disk fries, the data is re-replicated. Of course, there is no way to protect oneself from 3 machines that store replicas losing their disks at the same time.
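To make the scenario concrete, here is a minimal toy sketch in Python. It is not HDFS or S3 code; the function name `surviving_replicas`, the spare nodes D, E and F, and the one-failure-at-a-time model are all assumptions made purely for illustration.

```python
def surviving_replicas(holders, failures, spare_nodes, re_replicate):
    """Return the set of nodes still holding the block after the failures.

    Toy model (an assumption, not real HDFS behaviour): disks fail one at a
    time, and re-replication, if enabled, finishes before the next failure.
    """
    holders = set(holders)
    spares = list(spare_nodes)
    for node in failures:
        holders.discard(node)
        # With re-replication, copy the block to a spare node, provided
        # at least one live replica remains to copy from.
        if re_replicate and holders and spares:
            holders.add(spares.pop(0))
    return holders


failures = ["A", "B", "C"]  # one disk dies per month
spares = ["D", "E", "F"]    # hypothetical healthy nodes

print(sorted(surviving_replicas("ABC", failures, spares, re_replicate=False)))
print(sorted(surviving_replicas("ABC", failures, spares, re_replicate=True)))
```

Without re-replication the block is gone after the third failure; with it, the block ends up on three fresh nodes even though every original disk has died.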
So I was wondering whether a replica synchronization algorithm is already in place, or whether it is a feature planned for the future.

A

On 7/17/07, Ted Dunning <[EMAIL PROTECTED]> wrote:
Assuming that you have many more disks than 3, the chance that 3 simultaneous disk failures hit just the right 3 disks is much lower than the chance of losing any 3 disks. This is enhanced by the ability of Hadoop to allocate files in different racks, since one of the few mechanisms that coordinates failures is losing an entire rack.

For example, if you have 20 disks, then the chance of losing a particular three disks, given that you are losing 3 disks, is about one chance in a thousand (assuming independent error location), and it should be impossible if the failures are rack-aligned, since the replicas sit in different racks.

Remember, you can always increase the number of replicas if you like.

On 7/17/07 12:55 AM, "Phantom" <[EMAIL PROTECTED]> wrote:

> Is replica management built into HDFS ? What I mean is if I set replication
> factor to 3 and if I lose 3 disks is that data lost forever ? I mean all 3
> disks dying at the same time I know is a far fetched scenario but if they
> die over a certain period of time does HDFS re-replicate the data to ensure
> that there are always 3 copies in the system ?
>
> Thanks
> A
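Ted's "about one chance in a thousand" figure for 20 disks can be checked with a short calculation. The sketch below assumes, as he does, that the 3 failed disks are equally likely to be any 3 of the 20; the helper name `chance_of_specific_set` is made up for illustration.

```python
from math import comb


def chance_of_specific_set(total_disks, replicas=3):
    """Probability that a given set of `replicas` failed disks is exactly
    the set holding one block, assuming every set is equally likely."""
    return 1 / comb(total_disks, replicas)


# Number of distinct ways to pick 3 disks out of 20.
print(comb(20, 3))                 # 1140
# Chance that the 3 failed disks are one particular triple.
print(chance_of_specific_set(20))  # roughly 0.00088
```

There are 1140 possible triples, so any one particular triple comes up with probability 1/1140, which matches the rough figure in the message.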
