This is more of a "theoretical problem," really. Yahoo and others claim they lost far more data due to human error than to any HDFS problems (including Namenode failures).
You can prevent data loss by having the namenode write its metadata to another machine (via NFS, DRBD, or a SAN if you have one). You'll still have an outage while switching over to a different machine, but at least you won't lose any data. Facebook has a partial solution (Avatarnode), and the HDFS folks are working on a solution which, like Avatarnode, mainly involves keeping a hot copy of the Namenode so that failover is "instantaneous" (1 or 2 minutes at most).

----- Original Message -----
From: Mark <[email protected]>
To: [email protected]
Cc:
Sent: Saturday, October 29, 2011 11:46 AM
Subject: Dealing with single point of failure

How does one deal with the fact that HBase has a single point of failure, namely the namenode? What steps can be taken to eliminate and/or minimize the impact of a namenode failure? In a situation where reliability is of utmost importance, should one choose an alternative technology, i.e. Cassandra?

Thanks
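
For reference, the metadata redundancy suggested in the reply above is usually set up in hdfs-site.xml. This is only a minimal sketch: it assumes a Hadoop 0.20/1.x-era cluster, where the property is named dfs.name.dir (later releases rename it dfs.namenode.name.dir), and both directory paths are hypothetical placeholders. The namenode writes its image and edit log to every directory in the comma-separated list, so putting one of them on an NFS mount keeps a copy on another machine.

```xml
<!-- hdfs-site.xml on the namenode (assumed Hadoop 0.20/1.x property name) -->
<property>
  <name>dfs.name.dir</name>
  <!-- Hypothetical paths: a local disk plus an NFS mount exported by another machine -->
  <value>/data/1/dfs/nn,/mnt/nfs/dfs/nn</value>
</property>
```

If the namenode machine is lost, a replacement can be started from the copy on the NFS mount. As noted above, this protects the metadata but does not avoid the outage during the switch.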
