Eric,

I shut it down at night because the slave server is in my bedroom. I use a replication factor of 1 because that is what my CDH install set up, so I accepted it. I will bump it up to 3.
But the most important advice that you give is "put it into safe mode" - and that is what I am going to do all the time that I am not working on it, because it is purely my development cluster. I might even shut the daemons down completely.

Thank you,
Mark

On Thu, Mar 3, 2011 at 5:55 PM, Eric Sammer <[email protected]> wrote:

> On Thu, Mar 3, 2011 at 6:44 PM, Mark Kerzner <[email protected]> wrote:
>
>> Hi,
>>
>> in my small development cluster I have a master/slave node and a slave
>> node, and I shut down the slave node at night. I often see that my HDFS is
>> corrupted, and I have to reformat the name node and to delete the data
>> directory.
>>
>
> Why do you shut down the slave at night? HDFS should only be corrupted if
> you're missing all copies of a block. With a replication factor of 3
> (default) you should have 100% of the data on both nodes (if you only have 2
> nodes). If you've dialed it down to 1, simply starting the slave back up
> should "un-corrupt" HDFS. You definitely don't want to be doing this to HDFS
> regularly (dropping nodes from the cluster and re-adding them) unless you're
> trying to test HDFS' failure semantics.
>
>> It finally dawns on me that with such small cluster I better shut the
>> daemons down, for otherwise they are trying too hard to compensate for the
>> missing node and eventually it goes bad. Is my understanding correct?
>>
>
> It doesn't "eventually go bad." If the NN sees a DN disappear it may start
> re-replicating data to another node. In such a small cluster, maybe there's
> nowhere else to get the blocks from, but I bet you dialed the replication
> factor down to 1 (or have code that writes files with a rep factor of 1 like
> teragen / terasort).
>
> In short, if you're going to shut down nodes like this put the NN into safe
> mode so it doesn't freak out (which will also make the cluster unusable
> during that time) but there's definitely no need to be reformatting HDFS.
> Just re-introduce the DN you shut down to the cluster.
>
>>
>> Thank you,
>> Mark
>>
>
> --
> Eric Sammer
> twitter: esammer
> data: www.cloudera.com
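
To make Eric's point about block copies concrete, here is a minimal Python sketch of which blocks become unavailable when a node is switched off. It is a toy model, not HDFS code: the function names `place_blocks` and `missing_blocks` are made up for illustration, and the round-robin placement is an assumption (real HDFS placement is different). The one property it borrows from HDFS is that a datanode holds at most one copy of a given block, so on two nodes a replication factor of 3 still yields only two copies.

```python
def place_blocks(num_blocks, nodes, replication):
    """Toy placement: put each block on distinct nodes, round-robin.

    Assumption for illustration only; real HDFS chooses targets differently.
    The number of copies is capped at the number of nodes, because one
    datanode never stores two replicas of the same block.
    """
    copies = min(replication, len(nodes))
    placement = {}
    for block in range(num_blocks):
        placement[block] = {
            nodes[(block + i) % len(nodes)] for i in range(copies)
        }
    return placement


def missing_blocks(placement, live_nodes):
    """Return the blocks that have no copy on any live node."""
    live = set(live_nodes)
    return sorted(
        block for block, holders in placement.items() if not (holders & live)
    )


nodes = ["master", "slave"]

# Replication factor 1: every block lives on exactly one node.
rep1 = place_blocks(6, nodes, replication=1)
print(missing_blocks(rep1, ["master"]))           # slave is off for the night
print(missing_blocks(rep1, ["master", "slave"]))  # slave is back

# Replication factor 3 on two nodes: each node holds every block.
rep3 = place_blocks(6, nodes, replication=3)
print(missing_blocks(rep3, ["master"]))
```

In this model, the blocks reported missing with a replication factor of 1 reappear as soon as the slave is listed as live again, which is the "un-corrupt" behaviour described above. With two copies per block, nothing goes missing while the slave is off.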
