I have dug into this more; it turns out the problem is unrelated to NFS or Solaris. The issue is that if there is a metadata change while the secondary is rebuilding the fsimage, the rebuilt image is rejected. On our production cluster there is almost never a moment when a file is not being created or altered, so the secondary can never make a fresh fsimage for the cluster.
I have checked this with several Hadoop variants and with vanilla distributions, with the namenode, secondary, and a datanode all running on the same machine.

On Tue, Oct 27, 2009 at 8:03 PM, Jason Venner <[email protected]> wrote:

> The namenode would never accept the rebuilt fsimage from the secondary, so
> the edit logs grew without bounds.
>
> On Tue, Oct 27, 2009 at 10:51 AM, Stas Oskin <[email protected]> wrote:
>
>> Hi.
>>
>> You mean, you couldn't recover the NameNode from checkpoints because of
>> timestamps?
>>
>> Regards.
>>
>> On Tue, Oct 27, 2009 at 4:49 PM, Jason Venner <[email protected]> wrote:
>>
>>> We have been having some trouble with the secondary on a cluster that
>>> has one edit log partition on an NFS server, with the namenode rejecting
>>> the merged images due to timestamp mismatches.
>>>
>>> On Mon, Oct 26, 2009 at 10:14 AM, Stas Oskin <[email protected]> wrote:
>>>
>>>> Hi.
>>>>
>>>> Thanks for the advice; it seems that the initial approach of having a
>>>> single SecNameNode writing to exports is the way to go.
>>>>
>>>> By the way, I asked this already, but wanted to clarify:
>>>>
>>>> * Is it possible to set how often the SecNameNode checkpoints the data
>>>> (what is the setting, by the way)?
>>>>
>>>> * Is it possible to let the NameNode write to exports as well, together
>>>> with the local disk? That would ensure the latest possible metadata in
>>>> case of a disk crash (compared to periodic checkpointing), but it is
>>>> going to slow down operations due to network reads/writes.
>>>>
>>>> Thanks again.
>>>>
>>>> On Thu, Oct 22, 2009 at 10:03 PM, Patrick Angeles
>>>> <[email protected]> wrote:
>>>>
>>>>> From what I understand, it's rather tricky to set up multiple
>>>>> secondary namenodes. In either case, running multiple 2ndary NNs
>>>>> doesn't get you much.
>>>>> See this thread:
>>>>> http://www.mail-archive.com/[email protected]/msg06280.html
>>>>>
>>>>> On Wed, Oct 21, 2009 at 10:44 AM, Stas Oskin <[email protected]> wrote:
>>>>>
>>>>>> To clarify, it's either letting a single SecNameNode write to
>>>>>> multiple NFS exports, or actually having multiple SecNameNodes.
>>>>>>
>>>>>> Thanks again.
>>>>>>
>>>>>> On Wed, Oct 21, 2009 at 4:43 PM, Stas Oskin <[email protected]> wrote:
>>>>>>
>>>>>>> Hi.
>>>>>>>
>>>>>>> I want to keep checkpoint data on several separate machines for
>>>>>>> backup, and am deliberating between exporting these machines' disks
>>>>>>> via NFS, or actually running Secondary Name Nodes there.
>>>>>>>
>>>>>>> Can anyone advise what would be better in my case?
>>>>>>>
>>>>>>> Regards.

--
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals
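P.S. For anyone following along, here is a minimal configuration sketch of the two settings asked about earlier in the thread, assuming a Hadoop 0.20-era setup; the directory paths are made-up examples, and property names may differ in other versions:

```xml
<!-- Sketch only: property names as of Hadoop 0.20; paths are hypothetical. -->

<!-- Checkpoint interval: how often (in seconds) the secondary namenode
     merges the edit log into a new fsimage. Default is 3600 (one hour). -->
<property>
  <name>fs.checkpoint.period</name>
  <value>3600</value>
</property>

<!-- The namenode writes its image and edit log to every directory in this
     comma-separated list, so including an NFS mount keeps an up-to-date
     off-machine copy of the metadata (at the cost of slower edits). -->
<property>
  <name>dfs.name.dir</name>
  <value>/local/hadoop/name,/mnt/nfs/hadoop/name</value>
</property>
```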
