In my test case, the checkpoints take a small number of seconds or less.

On Thu, Dec 24, 2009 at 10:34 AM, Todd Lipcon <[email protected]> wrote:
> How long does the checkpoint take? It seems possible to me that if the 2NN
> checkpoint takes longer than the interval, it's possible that multiple
> checkpoints will overlap and might trigger this. (this is conjecture, so
> definitely worth testing)
>
> -Todd
>
> On Wed, Dec 23, 2009 at 6:38 PM, Jason Venner <[email protected]> wrote:
>
> > I agree, it seems very wrong; that is why I need a block of time to really
> > verify the behavior.
> >
> > My test case is the following, and the same test fails in 18.3, 19.0, and
> > 19.1:
> >
> > Set up a single-node cluster: 1 namenode, 1 datanode, 1 secondary, all on
> > the same machine.
> > Set the checkpoint interval to 2 minutes (120 sec).
> >
> > Make a few files, wait, and verify that a checkpoint can happen.
> >
> > Recursively start copying a deep tree into HDFS, and watch the checkpoint
> > fail with a timestamp error.
> >
> > The code explicitly uses edits.new for the checkpoint verification
> > timestamp.
> >
> > The window is the time from the take of the edit log to the return of the
> > fsimage.
> >
> > On Wed, Dec 23, 2009 at 5:52 PM, Brian Bockelman <[email protected]> wrote:
> >
> > > Hey Jason,
> > >
> > > This analysis seems fairly unlikely - are you claiming that no edits can
> > > be merged if files are being created? Isn't this what edits.new is for?
> > >
> > > We roll the edits log successfully during periods of high transfer, when
> > > a new file is being created every 1 second or so.
> > >
> > > We have had issues with unmergeable edits before - there might be some
> > > race conditions in this area.
> > >
> > > Brian
> > >
> > > On Dec 23, 2009, at 7:07 PM, Jason Venner wrote:
> > >
> > > > I have no current solution.
> > > > When I can block out a few days, I am going to instrument the code a
> > > > bit more to verify my understanding.
> > > > I believe the issue is that the time stamp is being checked against
> > > > the active edit log (the new one created when the checkpoint started)
> > > > rather than the time stamp of the rolled (old) edit log.
> > > > As long as no transactions have hit, the time stamps are the same.
> > > >
> > > > On Wed, Dec 23, 2009 at 11:23 AM, Stas Oskin <[email protected]> wrote:
> > > >
> > > > > Hi.
> > > > >
> > > > > What was your solution to this, then?
> > > > >
> > > > > Regards.
> > > > >
> > > > > On Sat, Dec 5, 2009 at 7:43 AM, Jason Venner <[email protected]> wrote:
> > > > >
> > > > > > I have dug into this more; it turns out the problem is unrelated
> > > > > > to nfs or solaris.
> > > > > > The issue is that if there is a metadata change while the secondary
> > > > > > is rebuilding the fsimage, the rebuilt image is rejected.
> > > > > > On our production cluster, there is almost never a moment when
> > > > > > there is not a file being created or altered, and as such the
> > > > > > secondary can never make a fresh fsimage for the cluster.
> > > > > >
> > > > > > I have checked this with several hadoop variants and with vanilla
> > > > > > distributions, with the namenode, secondary, and a datanode all
> > > > > > running on the same machine.
> > > > > >
> > > > > > On Tue, Oct 27, 2009 at 8:03 PM, Jason Venner <[email protected]> wrote:
> > > > > >
> > > > > > > The namenode would never accept the rebuilt fsimage from the
> > > > > > > secondary, so the edit logs grew without bound.
> > > > > > >
> > > > > > > On Tue, Oct 27, 2009 at 10:51 AM, Stas Oskin <[email protected]> wrote:
> > > > > > >
> > > > > > > > Hi.
> > > > > > > >
> > > > > > > > You mean, you couldn't recover the NameNode from checkpoints
> > > > > > > > because of timestamps?
> > > > > > > >
> > > > > > > > Regards.
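[Editor's note: the race hypothesized in the thread above can be made concrete with a toy model. This is plain Python, not Hadoop code; every class and method name here is invented for illustration. The namenode stamps the log when it rolls it, the secondary merges the rolled log, and, per the hypothesis, validation compares the merged image's timestamp against the *active* edits.new, which any mid-checkpoint transaction bumps.]

```python
# Toy model of the hypothesized timestamp race (NOT Hadoop source code).
class ToyNameNode:
    def __init__(self):
        self.clock = 0
        self.rolled_ts = None   # timestamp of the rolled (old) edit log
        self.active_ts = None   # timestamp of the active edits.new

    def roll_edit_log(self):
        # Checkpoint starts: freeze the old log, open edits.new.
        self.clock += 1
        self.rolled_ts = self.clock
        self.active_ts = self.clock

    def apply_transaction(self):
        # A metadata change lands while the secondary is still merging.
        self.clock += 1
        self.active_ts = self.clock

    def accept_checkpoint(self, image_ts):
        # Hypothesized (buggy) check: compare against the ACTIVE log,
        # not the rolled one the secondary actually merged.
        return image_ts == self.active_ts

nn = ToyNameNode()
nn.roll_edit_log()
image_ts = nn.rolled_ts                 # secondary merges the rolled log
assert nn.accept_checkpoint(image_ts)   # quiet cluster: accepted

nn.roll_edit_log()
image_ts = nn.rolled_ts
nn.apply_transaction()                  # a file is created mid-checkpoint
assert not nn.accept_checkpoint(image_ts)  # busy cluster: rejected
```

If the check instead compared against `rolled_ts`, the mid-checkpoint transaction would be harmless, which matches Jason's "as long as no transactions have hit, the time stamps are the same."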
> > > > > > > > On Tue, Oct 27, 2009 at 4:49 PM, Jason Venner <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > We have been having some trouble with the secondary on a
> > > > > > > > > cluster that has one edit log partition on an nfs server,
> > > > > > > > > with the namenode rejecting the merged images due to
> > > > > > > > > timestamp mismatches.
> > > > > > > > >
> > > > > > > > > On Mon, Oct 26, 2009 at 10:14 AM, Stas Oskin <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > Hi.
> > > > > > > > > >
> > > > > > > > > > Thanks for the advice; it seems that the initial approach
> > > > > > > > > > of having a single SecNameNode writing to exports is the
> > > > > > > > > > way to go.
> > > > > > > > > >
> > > > > > > > > > By the way, I asked this already, but wanted to clarify:
> > > > > > > > > >
> > > > > > > > > > * Is it possible to set how often SecNameNode checkpoints
> > > > > > > > > > the data (what is the setting, by the way)?
> > > > > > > > > >
> > > > > > > > > > * Is it possible to let NameNode write to exports as well,
> > > > > > > > > > together with the local disk? This ensures the latest
> > > > > > > > > > possible metadata in case of a disk crash (compared to
> > > > > > > > > > periodic check-pointing), but it's going to slow down
> > > > > > > > > > operations due to network reads/writes.
> > > > > > > > > >
> > > > > > > > > > Thanks again.
> > > > > > > > > >
> > > > > > > > > > On Thu, Oct 22, 2009 at 10:03 PM, Patrick Angeles <[email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > > > From what I understand, it's rather tricky to set up
> > > > > > > > > > > multiple secondary namenodes. In either case, running
> > > > > > > > > > > multiple 2ndary NNs doesn't get you much.
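[Editor's note: on the two configuration questions asked above, for the 0.18/0.19-era releases discussed in this thread: the secondary's checkpoint cadence is controlled by fs.checkpoint.period (seconds) and fs.checkpoint.size (edit-log bytes), and the namenode can write its image and edits to a local disk plus an NFS export by giving dfs.name.dir a comma-separated list. A sketch of a hadoop-site.xml fragment; the paths are illustrative, not prescriptive.]

```xml
<!-- Checkpoint every 2 minutes (the interval used in Jason's test). -->
<property>
  <name>fs.checkpoint.period</name>
  <value>120</value>
</property>

<!-- Or checkpoint early once the edit log reaches this many bytes. -->
<property>
  <name>fs.checkpoint.size</name>
  <value>67108864</value>
</property>

<!-- Namenode metadata on local disk AND an NFS export (example paths). -->
<property>
  <name>dfs.name.dir</name>
  <value>/local/hadoop/name,/mnt/nfs/hadoop/name</value>
</property>
```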
> > > > > > > > > > > See this thread:
> > > > > > > > > > > http://www.mail-archive.com/[email protected]/msg06280.html
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Oct 21, 2009 at 10:44 AM, Stas Oskin <[email protected]> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > To clarify, it's either let a single SecNameNode write
> > > > > > > > > > > > to multiple NFS exports, or actually have multiple
> > > > > > > > > > > > SecNameNodes.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks again.
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Oct 21, 2009 at 4:43 PM, Stas Oskin <[email protected]> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I want to keep checkpoint data on several separate
> > > > > > > > > > > > > machines for backup, and am deliberating between
> > > > > > > > > > > > > exporting these machines' disks via NFS, or actually
> > > > > > > > > > > > > running Secondary Name Nodes there.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Can anyone advise what would be better in my case?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Regards.
--
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals
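[Editor's note: a back-of-the-envelope check on why the thread's hypothesis would mean a busy cluster "never" checkpoints. If a checkpoint succeeds only when no metadata change lands in the vulnerability window W (from the roll of the edit log to the return of the fsimage), and we treat metadata changes as a Poisson process with rate r ops/sec, a modeling assumption rather than anything measured on these clusters, then P(clean window) = exp(-r * W).]

```python
import math

def p_checkpoint_succeeds(ops_per_sec, window_sec):
    # Probability that zero Poisson events with rate ops_per_sec
    # fall inside a window of window_sec seconds.
    return math.exp(-ops_per_sec * window_sec)

# Brian's busy cluster: roughly one new file per second. Even a short
# 5-second window makes a successful checkpoint vanishingly unlikely.
print(p_checkpoint_succeeds(1.0, 5.0))    # ~0.0067

# A quiet test cluster: one metadata op per 10 minutes, same window.
print(p_checkpoint_succeeds(1 / 600, 5.0))  # ~0.99
```

This matches the reported behavior: Jason's idle single-node test checkpoints fine, while the production cluster and the recursive-copy test essentially always reject.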
