Hi Mayuran,

Yes, you need to run a secondary namenode.

The secondary namenode is *not* a backup mechanism. It is an important part of the HDFS metadata system, and is responsible for periodically checkpointing the filesystem namespace into a single file. Without the secondary namenode running, the edit log of the NN will grow without bound (unless you are periodically restarting your namenode, which also causes a checkpoint).

Note that you do not need to run the secondary namenode (2NN) on a separate machine *if* you have enough RAM for two entire copies of your filesystem namespace. For small clusters you should be fine to run the two daemons on one machine.

Hope that helps,
-Todd

On Mon, Sep 28, 2009 at 10:25 AM, Mayuran Yogarajah <[email protected]> wrote:

> We've got the namenode image being written to a second machine via
> NFS so we have that backed up. That said, do we still need a secondary
> namenode, or is it OK to have the cluster going without one?
>
> thanks
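
For anyone reading this thread later, here is a toy sketch of what "checkpointing" means in the reply above: folding the accumulated edit log into a fresh namespace image so the log can start over empty. This is not Hadoop code. The names `apply_edit` and `checkpoint`, the dict-based namespace, and the two operations are all made up for illustration, and the real on-disk image and edit log formats are far richer.

```python
# Toy model of HDFS-style checkpointing; NOT Hadoop code.
# The "image" is a snapshot of the namespace; the "edit log" holds the
# changes made since that snapshot was taken. All names are hypothetical.

def apply_edit(namespace, edit):
    """Apply one logged operation to a namespace dict (path -> metadata)."""
    op, path = edit
    if op == "create":
        namespace[path] = {}
    elif op == "delete":
        namespace.pop(path, None)
    else:
        raise ValueError(f"unknown op: {op}")


def checkpoint(image, edit_log):
    """Merge the edit log into a new image; return it with an empty log."""
    new_image = dict(image)  # a second full copy of the namespace
    for edit in edit_log:
        apply_edit(new_image, edit)
    return new_image, []


image = {"/": {}}
edit_log = [("create", "/a"), ("create", "/b"), ("delete", "/a")]

image, edit_log = checkpoint(image, edit_log)
print(sorted(image))  # ['/', '/b']
print(len(edit_log))  # 0
```

If `checkpoint` is never called, `edit_log` only ever grows, which is the unbounded edit log described above. The merge also works on its own full copy of the namespace, which lines up with the point about needing RAM for two copies when both daemons share a machine.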
