Hi. I wonder what the Hadoop community uses to make the NameNode resilient to failures.
I mean, what high-availability measures are taken to keep HDFS available even if the NameNode fails? So far I have read about one possible solution using DRBD and another using CARP. Both have the downside of keeping a passive machine set aside to take over the NameNode's IP. Perhaps there is a way to keep only a passive NameNode service on another machine (one that also does other tasks), which takes over the IP only when the main NameNode has failed? That would, of course, last only until a human operator restores the main node to service.

Regards.
