You can also turn down a server's logging level at runtime via bin/hadoop daemonlog: use -getlevel <host:port> <logger-name> to read the current level and -setlevel <host:port> <logger-name> <level> to change it, where <host:port> is the HTTP address of the daemon in question.
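For example, a minimal sketch (the host name is a placeholder, the port is the datanode web port from the list below, and 'root' follows the logger name used in the commands further down):

    # Read the current level of the 'root' logger on one datanode, then
    # lower it to WARN. The change takes effect immediately and only lasts
    # until that daemon is restarted.
    bin/hadoop daemonlog -getlevel datanode1:50075 root
    bin/hadoop daemonlog -setlevel datanode1:50075 root WARN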
The daemon HTTP ports are:

For the namenode: 50070
For the jobtracker: 50030
For the tasktracker: 50060
For the secondary namenode: 50090
For the datanode: 50075

Someone wiser than I with log4j may have a better suggestion for the logger name to pick other than root; perhaps org.apache.hadoop. I believe this, run from the master node, would set the log level to WARN for all the datanodes and tasktrackers:

    for a in `cat conf/slaves`; do
      bin/hadoop daemonlog -setlevel $a:50075 root WARN
      bin/hadoop daemonlog -setlevel $a:50060 root WARN
    done

and of course for the master node's jobtracker and namenode:

    bin/hadoop daemonlog -setlevel localhost:50030 root WARN
    bin/hadoop daemonlog -setlevel localhost:50070 root WARN

On Sat, Apr 25, 2009 at 10:10 PM, Rakhi Khatwani <rakhi.khatw...@gmail.com> wrote:
> Thanks Aaron.
>
> On Sun, Apr 26, 2009 at 10:37 AM, Aaron Kimball <aa...@cloudera.com> wrote:
>
> > If your logs were being written to the root partition (/dev/sda1), that's going to fill up fast. This partition is always <= 10 GB on EC2 and much of that space is consumed by the OS install. You should redirect your logs to some place under /mnt (/dev/sdb1); that's 160 GB.
> >
> > - Aaron
> >
> > On Sun, Apr 26, 2009 at 3:21 AM, Rakhi Khatwani <rakhi.khatw...@gmail.com> wrote:
> >
> > > Hi,
> > > I have faced somewhat a similar issue... I have a couple of map reduce jobs running on EC2... after a week or so, I get a "no space on device" exception while performing any Linux command... so I end up shutting down Hadoop and HBase, clearing the logs and then restarting them.
> > >
> > > Is there a cleaner way to do it?
> > >
> > > thanks
> > > Raakhi
> > >
> > > On Fri, Apr 24, 2009 at 11:59 PM, Todd Lipcon <t...@cloudera.com> wrote:
> > >
> > > > On Fri, Apr 24, 2009 at 11:18 AM, Marc Limotte <mlimo...@feeva.com> wrote:
> > > >
> > > > > Actually, I'm concerned about performance of map/reduce jobs for a long-running cluster, i.e. it seems to get slower the longer it's running. After a restart of HDFS, the jobs seem to run faster. Not concerned about the start-up time of HDFS.
> > > >
> > > > Hi Marc,
> > > >
> > > > Does it sound like this JIRA describes your problem?
> > > >
> > > > https://issues.apache.org/jira/browse/HADOOP-4766
> > > >
> > > > If so, restarting just the JT should help with the symptoms. (I say symptoms because this is clearly a problem! Hadoop should be stable and performant for months without a cluster restart!)
> > > >
> > > > -Todd
> > > >
> > > > > Of course, as you suggest, this could be poor configuration of the cluster on my part; but I'd still like to hear best practices around doing a scheduled restart.
> > > > >
> > > > > Marc
> > > > >
> > > > > -----Original Message-----
> > > > > From: Allen Wittenauer [mailto:a...@yahoo-inc.com]
> > > > > Sent: Friday, April 24, 2009 10:17 AM
> > > > > To: core-user@hadoop.apache.org
> > > > > Subject: Re: Advice on restarting HDFS in a cron
> > > > >
> > > > > On 4/24/09 9:31 AM, "Marc Limotte" <mlimo...@feeva.com> wrote:
> > > > > > I've heard that HDFS starts to slow down after it's been running for a long time. And I believe I've experienced this.
> > > > >
> > > > > We did an upgrade (== complete restart) of a 2000 node instance in ~20 minutes on Wednesday.
> > > > > I wouldn't really consider that 'slow', but YMMV.
> > > > >
> > > > > I suspect people aren't running the secondary name node and therefore have a massively large edits file. The name node appears slow on restart because it has to apply the edits to the fsimage rather than having the secondary keep it up to date.
> > > > >
> > > > > -----Original Message-----
> > > > > From: Marc Limotte
> > > > >
> > > > > Hi.
> > > > >
> > > > > I've heard that HDFS starts to slow down after it's been running for a long time. And I believe I've experienced this. So, I was thinking to set up a cron job to execute every week to shut down HDFS and start it up again.
> > > > >
> > > > > In concept, it would be something like:
> > > > >
> > > > > 0 0 * * 0 $HADOOP_HOME/bin/stop-dfs.sh; $HADOOP_HOME/bin/start-dfs.sh
> > > > >
> > > > > But I'm wondering if there is a safer way to do this. In particular:
> > > > >
> > > > > * What if a map/reduce job is running when this cron hits? Is there a way to suspend jobs while the HDFS restart happens?
> > > > > * Should I also restart the mapred daemons?
> > > > > * Should I wait some time after "stop-dfs.sh" for things to settle down, before executing "start-dfs.sh"? Or maybe I should run a command to verify that it is stopped before I run the start?
> > > > >
> > > > > Thanks for any help.
> > > > > Marc

--
Alpha Chapters of my book on Hadoop are available: http://www.apress.com/book/view/9781430219422
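A minimal sketch of the kind of safer weekly restart Marc asks about above (not taken from the thread): it assumes the 0.19/0.20-era bin/ scripts, that running job IDs printed by bin/hadoop job -list start with "job_", and a hypothetical HADOOP_HOME path; the sleep durations are guesses to tune for your cluster.

    #!/bin/sh
    # Hypothetical weekly restart outline; adjust the path and timings.
    HADOOP_HOME=/usr/local/hadoop    # assumption: point at your install
    cd "$HADOOP_HOME" || exit 1

    # 1. Wait until no map/reduce jobs are running (job IDs begin with "job_").
    while [ "$(bin/hadoop job -list 2>/dev/null | grep -c '^job_')" -gt 0 ]; do
      sleep 60
    done

    # 2. Stop MapReduce first, then HDFS, and give the daemons a moment to exit.
    bin/stop-mapred.sh
    bin/stop-dfs.sh
    sleep 30

    # 3. Bring HDFS back, wait for the namenode to leave safe mode,
    #    then restart MapReduce.
    bin/start-dfs.sh
    bin/hadoop dfsadmin -safemode wait
    bin/start-mapred.sh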