Re: Balancer exiting immediately despite having work to do.

2012-01-04 Thread James Warren
Hi Landy -

Attachments are stripped from e-mails sent to the mailing list.  Could you
publish your logs on pastebin and forward the url?

cheers,
-James

On Wed, Jan 4, 2012 at 10:03 AM, Bible, Landy landy-bi...@utulsa.edu wrote:

 Hi all,

 I’m running Hadoop 0.20.2.  The balancer has suddenly stopped working.
 I’m attempting to balance the cluster with a threshold of 1, using the
 following command:

 ./hadoop balancer -threshold 1

 This has been working fine, but suddenly it isn’t.  It skips through 5
 iterations without actually doing any work:

 Time Stamp               Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved
 Jan 4, 2012 11:47:56 AM  0           0 KB                 1.87 GB             6.68 GB
 Jan 4, 2012 11:47:56 AM  1           0 KB                 1.87 GB             6.68 GB
 Jan 4, 2012 11:47:56 AM  2           0 KB                 1.87 GB             6.68 GB
 Jan 4, 2012 11:47:57 AM  3           0 KB                 1.87 GB             6.68 GB
 Jan 4, 2012 11:47:57 AM  4           0 KB                 1.87 GB             6.68 GB

 No block has been moved for 5 iterations. Exiting...

 Balancing took 524.0 milliseconds

 I’ve attached the full log, but I can’t see any errors indicating why it
 is failing.  Any ideas?  I’d really like to get balancing working again.
 My use case isn’t the norm, and it is important that the cluster stay as
 close to completely balanced as possible.
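
 A quick way to double-check whether any datanode really is more than 1%
 away from the cluster average (which is what the balancer compares
 against) is dfsadmin; a sketch, assuming the 0.20 report output format:

 bin/hadoop dfsadmin -report | grep -E 'Name:|DFS Used%'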

 --

 Landy Bible

 Simulation and Computer Specialist

 School of Nursing – Collins College of Business

 The University of Tulsa



Re: Map Task Capacity Not Changing

2011-12-15 Thread James Warren
(moving to mapreduce-user@, bcc'ing common-user@)

Hi Joey -

You'll want to change the value on all of your servers running tasktrackers
and then restart each tasktracker to reread the configuration.
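
For reference, a sketch of the per-node change (the property name is the
one from Joey's snippet; paths assume a tarball-style install, so adjust to
your layout). In conf/mapred-site.xml on every machine that runs a
tasktracker:

<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>8</value>
</property>

then restart that tasktracker so it rereads the file:

$HADOOP_HOME/bin/hadoop-daemon.sh stop tasktracker
$HADOOP_HOME/bin/hadoop-daemon.sh start tasktracker

The jobtracker web UI (port 50030 by default) should then report a map task
capacity of 8 times the number of tasktrackers.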

cheers,
-James

On Thu, Dec 15, 2011 at 3:30 PM, Joey Krabacher jkrabac...@gmail.com wrote:

 I have looked up how to increase this value on the web and have tried all
 suggestions to no avail.

 Any help would be great.

 Here is some background:

 Version: 0.20.2, r911707
 Compiled: Fri Feb 19 08:07:34 UTC 2010 by chrisdo

 Nodes: 5
 Current Map Task Capacity : 10  --- this is what I want to increase.

 What I have tried :

 Adding
   <property>
     <name>mapred.tasktracker.map.tasks.maximum</name>
     <value>8</value>
     <final>true</final>
   </property>
 to mapred-site.xml on NameNode.  I also added this to one of the
 datanodes for the hell of it and that didn't work either.

 Thanks.



Re: Regarding pointers for LZO compression in Hive and Hadoop

2011-12-14 Thread James Warren
Hi Abhishek -

(Redirecting to user@hive, bcc'ing common-user)

I found this blog to be particularly useful when incorporating Hive and LZO:

http://www.mrbalky.com/2011/02/24/hive-tables-partitions-and-lzo-compression/

And if you're having issues setting up LZO with Hadoop in general, check out

https://github.com/toddlipcon/hadoop-lzo
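
The short version of what those links walk through, as a sketch (the codec
class names come from the hadoop-lzo project above; the jar path, file and
table names below are just placeholders): install the hadoop-lzo jar and
native libraries on every node, then register the codecs in core-site.xml:

<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
</property>
<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>

Then compress the data with lzop, load it, and index it so it stays
splittable:

lzop big_table_file.txt
hadoop fs -put big_table_file.txt.lzo /user/hive/warehouse/mytable/
hadoop jar /path/to/hadoop-lzo.jar com.hadoop.compression.lzo.LzoIndexer /user/hive/warehouse/mytable/

On the Hive side the table is declared with INPUTFORMAT
"com.hadoop.mapred.DeprecatedLzoTextInputFormat"; the first link above
covers the partitioned-table details.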

cheers,
-James



On Wed, Dec 14, 2011 at 11:32 AM, Abhishek Pratap Singh manu.i...@gmail.com
 wrote:

 Hi,

 I'm looking for some useful docs on enabling LZO on a hadoop cluster. I
 tried a few of the blogs, but somehow it's not working.
 Here is my requirement.

 I have Hadoop 0.20.2 and Hive 0.6. I have some tables with 1.5 TB of
 data; I want to compress them using LZO and enable LZO in Hive as well as
 in Hadoop.
 Let me know if you have any useful docs or pointers for the same.


 Regards,
 Abhishek



Re: HDFS permission denied

2011-04-25 Thread James Warren
Hi Wei -

In general, settings changes aren't applied until the hadoop daemons are
restarted.  Sounds like someone enabled permissions previously, but they
didn't take hold until you rebooted your cluster.

cheers,
-James

On Mon, Apr 25, 2011 at 1:19 AM, Peng, Wei wei.p...@xerox.com wrote:

 I forgot to mention that the hadoop was running fine before.
 However, after it crashed last week, the restarted hadoop cluster has
 such permission issues.
 So that means the settings are still the same as before.
 Then what would be the cause?

 Wei

 -Original Message-
 From: James Seigel [mailto:ja...@tynt.com]
 Sent: Sunday, April 24, 2011 5:36 AM
 To: common-user@hadoop.apache.org
 Subject: Re: HDFS permission denied

 Check where the hadoop tmp setting is pointing to.

 James

 Sent from my mobile. Please excuse the typos.

 On 2011-04-24, at 12:41 AM, Peng, Wei wei.p...@xerox.com wrote:

  Hi,
 
 
 
  I need help very badly.
 
 
 
  I got an HDFS permission error by starting to run hadoop job
 
  org.apache.hadoop.security.AccessControlException: Permission denied:
 
  user=wp, access=WRITE, inode=:hadoop:supergroup:rwxr-xr-x
 
 
 
  I have the right permission to read and write files to my own hadoop
  user directory.
 
  It works fine when I use hadoop fs -put. The job input and output are
  all from my own hadoop user directory.
 
 
 
  It seems that when a job starts running, some data needs to be written
  into some directory, and I don't have permission for that directory.
  It is strange that the inode does not show which directory it is.
 
 
 
  Why does hadoop write something to a directory with my name secretly?
  Do I need to be added to a particular user group?
 
 
 
  Many Thanks..
 
 
 
  Vivian
 
 
 
 
 



Re: HDFS permission denied

2011-04-25 Thread James Warren
At this point you should follow Mathias' advice - go to the logs and
determine which path has the permission issue.  It's better to change the
settings for that path rather than disabling permissions (i.e. making
everything 777) randomly.
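
Once the log names the path, the fix is usually narrow. For the common case
where a shared scratch directory on HDFS isn't writable by everyone, a
sketch (run as the HDFS superuser; /tmp is only an example, substitute
whatever path the log reports, and use mode 1777 if your HDFS version
supports the sticky bit):

hadoop fs -ls /
hadoop fs -ls /tmp
hadoop fs -chmod 777 /tmp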

-jw

On Mon, Apr 25, 2011 at 10:04 AM, Peng, Wei wei.p...@xerox.com wrote:

 James,

 Thanks for your replies.
 In this case, how can I set up the permissions correctly in order to run
 a hadoop job?
 Do I need to set the hadoop tmp directory (which is in the local directory
 instead of an hdfs directory, right?) to be 777?
 Since the person who maintained the hadoop cluster has left, I have no
 idea what happened. =(

 Wei

 -Original Message-
 From: jameswarr...@gmail.com [mailto:jameswarr...@gmail.com] On Behalf
 Of James Warren
 Sent: Monday, April 25, 2011 9:56 AM
 To: common-user@hadoop.apache.org
 Subject: Re: HDFS permission denied

 Hi Wei -

 In general, settings changes aren't applied until the hadoop daemons are
 restarted.  Sounds like someone enabled permissions previously, but they
 didn't take hold until you rebooted your cluster.

 cheers,
 -James

 On Mon, Apr 25, 2011 at 1:19 AM, Peng, Wei wei.p...@xerox.com wrote:

  I forgot to mention that the hadoop was running fine before.
  However, after it crashed last week, the restarted hadoop cluster has
  such permission issues.
  So that means the settings are still the same as before.
  Then what would be the cause?
 
  Wei
 
  -Original Message-
  From: James Seigel [mailto:ja...@tynt.com]
  Sent: Sunday, April 24, 2011 5:36 AM
  To: common-user@hadoop.apache.org
  Subject: Re: HDFS permission denied
 
  Check where the hadoop tmp setting is pointing to.
 
  James
 
  Sent from my mobile. Please excuse the typos.
 
  On 2011-04-24, at 12:41 AM, Peng, Wei wei.p...@xerox.com wrote:
 
   Hi,
  
  
  
   I need help very badly.
  
  
  
   I got an HDFS permission error by starting to run hadoop job
  
   org.apache.hadoop.security.AccessControlException: Permission
 denied:
  
   user=wp, access=WRITE, inode=:hadoop:supergroup:rwxr-xr-x
  
  
  
   I have the right permission to read and write files to my own hadoop
   user directory.
  
   It works fine when I use hadoop fs -put. The job input and output
 are
   all from my own hadoop user directory.
  
  
  
   It seems that when a job starts running, some data needs to be written
   into some directory, and I don't have permission for that directory.
   It is strange that the inode does not show which directory it is.
  
  
  
   Why does hadoop write something to a directory with my name secretly?
   Do I need to be added to a particular user group?
  
  
  
   Many Thanks..
  
  
  
   Vivian
  
  
  
  
  
 



Re: urgent, error: java.io.IOException: Cannot create directory

2010-12-08 Thread james warren
Hi Richard -

First thing that comes to mind is a permissions issue.  Can you verify that
your directories along the desired namenode path are writable by the
appropriate user(s)?
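
A sketch of that check, using the hadoop.tmp.dir value from the config
below (note it still contains the placeholder /your/path/to/... from the
tutorial, which is worth replacing with a real local path; the hadoop
user and group here are assumptions):

ls -ld /your/path/to/hadoop/tmp/dir
sudo mkdir -p /your/path/to/hadoop/tmp/dir
sudo chown -R hadoop:hadoop /your/path/to/hadoop/tmp/dir
bin/hadoop namenode -format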

HTH,
-James

On Wed, Dec 8, 2010 at 1:37 PM, Richard Zhang richardtec...@gmail.com wrote:

 Hi Guys:
 I am just installing hadoop 0.21.0 on a single-node cluster.
 I encountered the following error when I ran bin/hadoop namenode -format:

 10/12/08 16:27:22 ERROR namenode.NameNode: java.io.IOException: Cannot create directory
 /your/path/to/hadoop/tmp/dir/hadoop-hadoop/dfs/name/current
         at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.clearDirectory(Storage.java:312)
         at org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:1425)
         at org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:1444)
         at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1242)
         at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1348)
         at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1368)


 Below is my core-site.xml

 <configuration>
 <!-- In: conf/core-site.xml -->
 <property>
   <name>hadoop.tmp.dir</name>
   <value>/your/path/to/hadoop/tmp/dir/hadoop-${user.name}</value>
   <description>A base for other temporary directories.</description>
 </property>

 <property>
   <name>fs.default.name</name>
   <value>hdfs://localhost:54310</value>
   <description>The name of the default file system.  A URI whose
   scheme and authority determine the FileSystem implementation.  The
   uri's scheme determines the config property (fs.SCHEME.impl) naming
   the FileSystem implementation class.  The uri's authority is used to
   determine the host, port, etc. for a filesystem.</description>
 </property>
 </configuration>


 Below is my hdfs-site.xml
 <?xml version="1.0"?>
 <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

 <!-- Put site-specific property overrides in this file. -->

 <configuration>
 <!-- In: conf/hdfs-site.xml -->
 <property>
   <name>dfs.replication</name>
   <value>1</value>
   <description>Default block replication.
   The actual number of replications can be specified when the file is created.
   The default is used if replication is not specified in create time.
   </description>
 </property>

 </configuration>


 below is my mapred-site.xml:
 <?xml version="1.0"?>
 <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

 <!-- Put site-specific property overrides in this file. -->

 <configuration>

 <!-- In: conf/mapred-site.xml -->
 <property>
   <name>mapred.job.tracker</name>
   <value>localhost:54311</value>
   <description>The host and port that the MapReduce job tracker runs
   at.  If "local", then jobs are run in-process as a single map
   and reduce task.
   </description>
 </property>

 </configuration>


 Thanks.
 Richard



Re: Multiple masters in hadoop

2010-09-29 Thread james warren
Actually the /hadoop/conf/masters file is for configuring
secondarynamenode(s).  Check
http://www.cloudera.com/blog/2009/02/multi-host-secondarynamenode-configuration/
for details.
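
In sketch form (hostnames here are placeholders; the dfs.http.address
setting is the key point of that post): conf/masters on the machine where
you run bin/start-dfs.sh lists one host per line, and each listed host gets
a SecondaryNameNode started on it, e.g.

snn-host1
snn-host2

Each of those hosts then needs hdfs-site.xml pointed back at the namenode's
HTTP port so checkpointing can fetch the image and edit log:

<property>
  <name>dfs.http.address</name>
  <value>namenode-host:50070</value>
</property>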

cheers,
-jw

On Wed, Sep 29, 2010 at 1:36 PM, Shi Yu sh...@uchicago.edu wrote:

 The master that appears in the masters and slaves files is a machine name or ip
 address.  If you have a single cluster and you specify multiple names in
 those files, it will cause errors because of connection failures.

 Shi


 On 2010-9-29 15:28, Bhushan Mahale wrote:

 Hi,

 The master file in hadoop/conf is called masters.
 Wondering if I can configure multiple masters for a single cluster. If
 yes, how can I use them?

 Thanks,
 Bhushan







 --
 Postdoctoral Scholar
 Institute for Genomics and Systems Biology
 Department of Medicine, the University of Chicago
 Knapp Center for Biomedical Discovery
 900 E. 57th St. Room 10148
 Chicago, IL 60637, US
 Tel: 773-702-6799




lengthy delay after the last reduce completes

2010-05-07 Thread james warren
I was just wondering what goes on under the covers once the last reduce task
ends.  The following is from a very simple map reduce I run throughout the
day.  Typically the run time is about a minute from start to end, but for
this particular run there was a delay of over 5 minutes after the last
reduce task ended.

Any thoughts?

Thanks,
-James Warren



2010-05-07 01:11:10,302 [main] INFO  org.apache.hadoop.mapred.JobClient  -
Running job: job_201005041742_0879
2010-05-07 01:11:11,305 [main] INFO  org.apache.hadoop.mapred.JobClient  -
 map 0% reduce 0%
2010-05-07 01:11:49,410 [main] INFO  org.apache.hadoop.mapred.JobClient  -
 map 4% reduce 0%
2010-05-07 01:11:55,427 [main] INFO  org.apache.hadoop.mapred.JobClient  -
 map 8% reduce 0%
2010-05-07 01:12:04,454 [main] INFO  org.apache.hadoop.mapred.JobClient  -
 map 17% reduce 0%
2010-05-07 01:12:07,462 [main] INFO  org.apache.hadoop.mapred.JobClient  -
 map 17% reduce 2%
2010-05-07 01:12:10,471 [main] INFO  org.apache.hadoop.mapred.JobClient  -
 map 26% reduce 2%
2010-05-07 01:12:16,487 [main] INFO  org.apache.hadoop.mapred.JobClient  -
 map 43% reduce 5%
2010-05-07 01:12:19,497 [main] INFO  org.apache.hadoop.mapred.JobClient  -
 map 100% reduce 5%
2010-05-07 01:12:22,505 [main] INFO  org.apache.hadoop.mapred.JobClient  -
 map 100% reduce 14%
2010-05-07 01:12:31,530 [main] INFO  org.apache.hadoop.mapred.JobClient  -
 map 100% reduce 100%
2010-05-07 01:18:06,367 [main] INFO  org.apache.hadoop.mapred.JobClient  -
Job complete: job_201005041742_0879


fair scheduler preemptions timeout difficulties

2009-12-02 Thread james warren
Greetings, Hadoop Fans:

I'm attempting to use the timeout feature of the Fair Scheduler (using
Cloudera's most recently released distribution 0.20.1+152-1), but without
success.  I'm using the following configs:

/etc/hadoop/conf/mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hadoop-master:8021</value>
  </property>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>9</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>3</value>
  </property>
  <property>
    <name>mapred.jobtracker.taskScheduler</name>
    <value>org.apache.hadoop.mapred.FairScheduler</value>
  </property>
  <property>
    <name>mapred.fairscheduler.allocation.file</name>
    <value>/etc/hadoop/conf/pools.xml</value>
  </property>
  <property>
    <name>mapred.fairscheduler.assignmultiple</name>
    <value>true</value>
  </property>
  <property>
    <name>mapred.fairscheduler.poolnameproperty</name>
    <value>pool.name</value>
  </property>
  <property>
    <name>pool.name</name>
    <value>default</value>
  </property>

</configuration>

and /etc/hadoop/conf/pools.xml

<?xml version="1.0"?>
<allocations>
  <pool name="realtime">
    <minMaps>4</minMaps>
    <minReduces>1</minReduces>
    <minSharePreemptionTimeout>180</minSharePreemptionTimeout>
    <weight>2.0</weight>
  </pool>
  <pool name="default">
    <minMaps>2</minMaps>
    <minReduces>2</minReduces>
    <maxRunningJobs>1</maxRunningJobs>
  </pool>
</allocations>

but a job in the realtime pool fails to interrupt a job running in the
default queue (waited for > 15 minutes).  Is there something wrong with my
configs?  Or is there anything in the logs that would be useful for
debugging?  (I've only found a "successfully configured fairscheduler"
comment in the jobtracker log upon starting up the daemon.)

Help would be extremely appreciated!

Thanks,
-James Warren


Re: fair scheduler preemptions timeout difficulties

2009-12-02 Thread james warren
Todd from Cloudera solved this for me on their company's forum.

What you're missing is the mapred.fairscheduler.preemption property in
mapred-site.xml - without this on, the preemption settings in the
allocations file are ignored... to turn it on, set that property's value to
'true'

Thanks, Todd!
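
For anyone finding this in the archives, the missing piece in sketch form,
added to mapred-site.xml alongside the settings quoted below and followed
by a jobtracker restart so the scheduler rereads its configuration:

<property>
  <name>mapred.fairscheduler.preemption</name>
  <value>true</value>
</property>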

On Wed, Dec 2, 2009 at 4:26 PM, james warren ja...@rockyou.com wrote:

 Greetings, Hadoop Fans:

 I'm attempting to use the timeout feature of the Fair Scheduler (using
 Cloudera's most recently released distribution 0.20.1+152-1), but without
 success.  I'm using the following configs:

 /etc/hadoop/conf/mapred-site.xml

 <?xml version="1.0"?>
 <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

 <configuration>
   <property>
     <name>mapred.job.tracker</name>
     <value>hadoop-master:8021</value>
   </property>
   <property>
     <name>mapred.tasktracker.map.tasks.maximum</name>
     <value>9</value>
   </property>
   <property>
     <name>mapred.tasktracker.reduce.tasks.maximum</name>
     <value>3</value>
   </property>
   <property>
     <name>mapred.jobtracker.taskScheduler</name>
     <value>org.apache.hadoop.mapred.FairScheduler</value>
   </property>
   <property>
     <name>mapred.fairscheduler.allocation.file</name>
     <value>/etc/hadoop/conf/pools.xml</value>
   </property>
   <property>
     <name>mapred.fairscheduler.assignmultiple</name>
     <value>true</value>
   </property>
   <property>
     <name>mapred.fairscheduler.poolnameproperty</name>
     <value>pool.name</value>
   </property>
   <property>
     <name>pool.name</name>
     <value>default</value>
   </property>

 </configuration>

 and /etc/hadoop/conf/pools.xml

 <?xml version="1.0"?>
 <allocations>
   <pool name="realtime">
     <minMaps>4</minMaps>
     <minReduces>1</minReduces>
     <minSharePreemptionTimeout>180</minSharePreemptionTimeout>
     <weight>2.0</weight>
   </pool>
   <pool name="default">
     <minMaps>2</minMaps>
     <minReduces>2</minReduces>
     <maxRunningJobs>1</maxRunningJobs>
   </pool>
 </allocations>

 but a job in the realtime pool fails to interrupt a job running in the
 default queue (waited for > 15 minutes).  Is there something wrong with my
 configs?  Or is there anything in the logs that would be useful for
 debugging?  (I've only found a "successfully configured fairscheduler"
 comment in the jobtracker log upon starting up the daemon.)

 Help would be extremely appreciated!

 Thanks,
 -James Warren




detecting stalled daemons?

2009-10-08 Thread james warren
Quick question for the hadoop / linux masters out there:

I recently observed a stalled tasktracker daemon on our production cluster,
and was wondering if there were common tests to detect failures so that
administration tools (e.g. monit) can automatically restart the daemon.  The
particular observed symptoms were:

   - the node was dropped by the jobtracker
   - information in /proc listed the tasktracker process as sleeping, not
   zombie
   - the web interface (port 50060) was unresponsive, though telnet did
   connect
   - no error information in the hadoop logs -- they simply were no longer
   being updated

I certainly cannot be the first person to encounter this - anyone have a
neat and tidy solution they could share?
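
For what it's worth, a minimal sketch of the kind of probe that could be
wrapped in monit or cron (the URL, timeout and restart commands are
assumptions; 50060 is the default tasktracker HTTP port):

#!/bin/sh
# A stalled tasktracker often still accepts TCP connects but never answers,
# so test for an actual HTTP response rather than an open port.
if ! curl --silent --max-time 10 http://localhost:50060/tasktracker.jsp > /dev/null; then
    echo "tasktracker unresponsive, restarting" | logger -t tt-check
    $HADOOP_HOME/bin/hadoop-daemon.sh stop tasktracker
    $HADOOP_HOME/bin/hadoop-daemon.sh start tasktracker
fi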

(And yes, we will eventually go down the nagios / ganglia / cloudera
desktop path, but we're waiting until we're running CDH2.)

Many thanks,
-James Warren