A good way to implement failover is to make the Namenode log transactions to more than one directory, typically a local directory and an NFS-mounted directory. The Namenode writes transactions to both directories synchronously.
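To illustrate what "synchronously" means here, the following is a minimal sketch in Python, not the Namenode's actual code. The function name `log_transaction`, the file name `edits`, and the record format are all hypothetical. The point is only that a transaction counts as committed after it has been flushed to every configured directory.

```python
import os

EDITS_FILE = "edits"  # hypothetical file name, chosen for illustration


def log_transaction(record: bytes, directories: list[str]) -> None:
    """Append one record to the edit log in every directory.

    Returns only after the record has been fsync'ed in all of them,
    so each directory holds every committed transaction.
    """
    for directory in directories:
        os.makedirs(directory, exist_ok=True)
        path = os.path.join(directory, EDITS_FILE)
        with open(path, "ab") as log:
            log.write(record)
            log.flush()             # push Python's buffer to the OS
            os.fsync(log.fileno())  # ask the OS to push it to storage
```

Called with one local path and one NFS-mounted path, the function leaves the two copies identical after each call, which is why the NFS copy can stand in for the local one when the machine is lost.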
If the Namenode machine dies, copy the fsimage and fsedits from the NFS server and you will have recovered *all* committed transactions.

The SecondaryNamenode pulls the fsimage and fsedits once every configured period, typically ranging from a few minutes to an hour. If you use the image from the SecondaryNamenode, you might lose the last few minutes of transactions.

Thanks,
dhruba

On 7/20/07 9:53 AM, "Doug Cutting" <[EMAIL PROTECTED]> wrote:

>> So far I learned that the secondary namenode keeps refreshing
>> periodically its backup copies of fsimage and editlog files, and if the
>> primary namenode disappears, it's the responsibility of the cluster
>> admin to notice this, shut down the cluster, switch the configs across
>> the cluster to point to the secondary namenode, start a primary namenode
>> on the secondary namenode's host, and restart the rest of the daemons.
>
> If you use DNS to switch the namenode from the primary to the secondary,
> then no configuration changes or other daemon restarts are required. I
> think that is the best practice.