Re: Balancer exiting immediately despite having work to do.
Hi Landy -

Attachments are stripped from e-mails sent to the mailing list. Could you publish your logs on pastebin and forward the URL?

cheers,
-James

On Wed, Jan 4, 2012 at 10:03 AM, Bible, Landy <landy-bi...@utulsa.edu> wrote:

> Hi all,
>
> I'm running Hadoop 0.20.2. The balancer has suddenly stopped working. I'm
> attempting to balance the cluster with a threshold of 1, using the
> following command:
>
>     ./hadoop balancer -threshold 1
>
> This has been working fine, but suddenly it isn't. It skips through 5
> iterations without actually doing any work:
>
>     Time Stamp               Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved
>     Jan 4, 2012 11:47:56 AM  0           0 KB                 1.87 GB             6.68 GB
>     Jan 4, 2012 11:47:56 AM  1           0 KB                 1.87 GB             6.68 GB
>     Jan 4, 2012 11:47:56 AM  2           0 KB                 1.87 GB             6.68 GB
>     Jan 4, 2012 11:47:57 AM  3           0 KB                 1.87 GB             6.68 GB
>     Jan 4, 2012 11:47:57 AM  4           0 KB                 1.87 GB             6.68 GB
>     No block has been moved for 5 iterations. Exiting...
>     Balancing took 524.0 milliseconds
>
> I've attached the full log, but I can't see any errors indicating why it
> is failing. Any ideas? I'd really like to get balancing working again. My
> use case isn't the norm, and it is important that the cluster stay as
> close to completely balanced as possible.
>
> --
> Landy Bible
> Simulation and Computer Specialist
> School of Nursing - Collins College of Business
> The University of Tulsa
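For context on the threshold semantics involved here: the balancer treats a datanode as balanced when its utilization is within the threshold (in percentage points) of the cluster-wide average, so a threshold of 1 is very tight. A rough sketch of that test, with made-up numbers (none of these figures come from Landy's cluster):

```python
# Simplified model of the balancer's per-node balance test.
# Numbers below are hypothetical, purely for illustration.

def is_balanced(node_used, node_capacity, cluster_used, cluster_capacity, threshold_pct):
    """A node is balanced when its utilization is within threshold_pct
    percentage points of the average cluster utilization."""
    node_util = 100.0 * node_used / node_capacity
    cluster_util = 100.0 * cluster_used / cluster_capacity
    return abs(node_util - cluster_util) <= threshold_pct

# With threshold 1, a node at 62% utilization in a 60%-full cluster
# is still out of balance; a node at 60.5% is within tolerance.
print(is_balanced(62, 100, 600, 1000, threshold_pct=1))    # False
print(is_balanced(60.5, 100, 600, 1000, threshold_pct=1))  # True
```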
Re: Map Task Capacity Not Changing
(moving to mapreduce-user@, bcc'ing common-user@)

Hi Joey -

You'll want to change the value on all of your servers running tasktrackers, and then restart each tasktracker to reread the configuration.

cheers,
-James

On Thu, Dec 15, 2011 at 3:30 PM, Joey Krabacher <jkrabac...@gmail.com> wrote:

> I have looked up how to increase this value on the web and have tried all
> suggestions to no avail. Any help would be great.
>
> Here is some background:
>
>     Version: 0.20.2, r911707
>     Compiled: Fri Feb 19 08:07:34 UTC 2010 by chrisdo
>     Nodes: 5
>     Current Map Task Capacity: 10   <-- this is what I want to increase
>
> What I have tried: adding
>
>     <property>
>       <name>mapred.tasktracker.map.tasks.maximum</name>
>       <value>8</value>
>       <final>true</final>
>     </property>
>
> to mapred-site.xml on the NameNode. I also added this to one of the
> datanodes for the hell of it, and that didn't work either.
>
> Thanks.
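For what it's worth, the "Current Map Task Capacity" figure is just the per-tasktracker maximum summed across tasktrackers: Hadoop 0.20's default of 2 map slots per node times Joey's 5 nodes gives the observed 10. The arithmetic, as a quick sketch:

```python
# Cluster map-task capacity = number of tasktrackers x per-node maximum
# (mapred.tasktracker.map.tasks.maximum, default 2 in Hadoop 0.20).

def cluster_map_capacity(num_tasktrackers, max_per_node=2):
    return num_tasktrackers * max_per_node

print(cluster_map_capacity(5))                  # 10 -- the observed capacity
print(cluster_map_capacity(5, max_per_node=8))  # 40 -- once all 5 tasktrackers
                                                #       are reconfigured and restarted
```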
Re: Regarding pointers for LZO compression in Hive and Hadoop
Hi Abhishek -

(Redirecting to user@hive, bcc'ing common-user)

I found this blog to be particularly useful when incorporating Hive and LZO:
http://www.mrbalky.com/2011/02/24/hive-tables-partitions-and-lzo-compression/

And if you're having issues setting up LZO with Hadoop in general, check out
https://github.com/toddlipcon/hadoop-lzo

cheers,
-James

On Wed, Dec 14, 2011 at 11:32 AM, Abhishek Pratap Singh <manu.i...@gmail.com> wrote:

> Hi,
>
> I'm looking for some useful docs on enabling LZO on a hadoop cluster. I
> tried a few of the blogs, but somehow it's not working. Here is my
> requirement: I have Hadoop 0.20.2 and Hive 0.6, and some tables with
> 1.5 TB of data. I want to compress them using LZO and enable LZO in Hive
> as well as in Hadoop. Let me know if you have any useful docs or pointers
> for the same.
>
> Regards,
> Abhishek
Re: HDFS permission denied
Hi Wei -

In general, settings changes aren't applied until the hadoop daemons are restarted. It sounds like someone enabled permissions previously, but they didn't take hold until you rebooted your cluster.

cheers,
-James

On Mon, Apr 25, 2011 at 1:19 AM, Peng, Wei <wei.p...@xerox.com> wrote:

> I forgot to mention that the hadoop cluster was running fine before.
> However, after it crashed last week, the restarted cluster has these
> permission issues. So the settings are the same as before - then what
> would be the cause?
>
> Wei
>
> -----Original Message-----
> From: James Seigel [mailto:ja...@tynt.com]
> Sent: Sunday, April 24, 2011 5:36 AM
> To: common-user@hadoop.apache.org
> Subject: Re: HDFS permission denied
>
> Check where the hadoop tmp setting is pointing to.
>
> James
>
> Sent from my mobile. Please excuse the typos.
>
> On 2011-04-24, at 12:41 AM, Peng, Wei <wei.p...@xerox.com> wrote:
>
>> Hi,
>>
>> I need help very badly. I got an HDFS permission error when starting to
>> run a hadoop job:
>>
>>     org.apache.hadoop.security.AccessControlException: Permission denied:
>>     user=wp, access=WRITE, inode=:hadoop:supergroup:rwxr-xr-x
>>
>> I have the right permissions to read and write files to my own hadoop
>> user directory. It works fine when I use hadoop fs -put, and the job
>> input and output are all in my own hadoop user directory. It seems that
>> when a job starts running, some data needs to be written into some
>> directory, and I don't have permission to that directory. It is strange
>> that the inode does not show which directory it is. Why does hadoop
>> secretly write something to a directory on my behalf? Do I need to be
>> set to a particular user group?
>>
>> Many thanks,
>> Vivian
Re: HDFS permission denied
At this point you should follow Mathias' advice - go to the logs and determine which path has the permission issue. It's better to change the settings for that path than to disable permissions (i.e. make everything 777) randomly.

-jw

On Mon, Apr 25, 2011 at 10:04 AM, Peng, Wei <wei.p...@xerox.com> wrote:

> James,
>
> Thanks for your replies. In this case, how can I set up the permissions
> correctly in order to run a hadoop job? Do I need to set the hadoop tmp
> directory (which is on the local filesystem rather than in HDFS, right?)
> to be 777? Since the person who maintained the hadoop cluster has left,
> I have no idea what happened. =(
>
> Wei
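The rwxr-xr-x mode in the error message already explains the denial: user wp is neither the owner (hadoop) nor in the group (supergroup), so the "other" permission class applies, and that class has no write bit. A simplified model of the check, for illustration only (not the actual HDFS implementation):

```python
# Simplified POSIX-style permission check, mirroring how HDFS decides
# WRITE access for the inode in the error above. Illustrative only.

def can_write(mode, user, groups, owner, group):
    """mode is a 9-character string such as 'rwxr-xr-x'."""
    if user == owner:
        bits = mode[0:3]      # owner class
    elif group in groups:
        bits = mode[3:6]      # group class
    else:
        bits = mode[6:9]      # other class
    return 'w' in bits

# user=wp hits the "other" class of rwxr-xr-x, which has no write bit:
print(can_write('rwxr-xr-x', 'wp', ['wp'], 'hadoop', 'supergroup'))              # False
print(can_write('rwxr-xr-x', 'hadoop', ['supergroup'], 'hadoop', 'supergroup'))  # True
```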
Re: urgent, error: java.io.IOException: Cannot create directory
Hi Richard -

First thing that comes to mind is a permissions issue. Can you verify that the directories along the desired namenode path are writable by the appropriate user(s)?

HTH,
-James

On Wed, Dec 8, 2010 at 1:37 PM, Richard Zhang <richardtec...@gmail.com> wrote:

> Hi Guys:
>
> I am just installing hadoop 0.21.0 on a single-node cluster. I encounter
> the following error when I run bin/hadoop namenode -format:
>
>     10/12/08 16:27:22 ERROR namenode.NameNode: java.io.IOException: Cannot
>     create directory /your/path/to/hadoop/tmp/dir/hadoop-hadoop/dfs/name/current
>         at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.clearDirectory(Storage.java:312)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:1425)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:1444)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1242)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1348)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1368)
>
> Below is my core-site.xml configuration:
>
>     <!-- In: conf/core-site.xml -->
>     <property>
>       <name>hadoop.tmp.dir</name>
>       <value>/your/path/to/hadoop/tmp/dir/hadoop-${user.name}</value>
>       <description>A base for other temporary directories.</description>
>     </property>
>     <property>
>       <name>fs.default.name</name>
>       <value>hdfs://localhost:54310</value>
>       <description>The name of the default file system. A URI whose scheme
>       and authority determine the FileSystem implementation. The uri's
>       scheme determines the config property (fs.SCHEME.impl) naming the
>       FileSystem implementation class. The uri's authority is used to
>       determine the host, port, etc. for a filesystem.</description>
>     </property>
>
> Below is my hdfs-site.xml:
>
>     <?xml version="1.0"?>
>     <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>     <!-- Put site-specific property overrides in this file. -->
>     <configuration>
>       <!-- In: conf/hdfs-site.xml -->
>       <property>
>         <name>dfs.replication</name>
>         <value>1</value>
>         <description>Default block replication. The actual number of
>         replications can be specified when the file is created. The
>         default is used if replication is not specified at create time.
>         </description>
>       </property>
>     </configuration>
>
> Below is my mapred-site.xml:
>
>     <?xml version="1.0"?>
>     <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>     <!-- Put site-specific property overrides in this file. -->
>     <configuration>
>       <!-- In: conf/mapred-site.xml -->
>       <property>
>         <name>mapred.job.tracker</name>
>         <value>localhost:54311</value>
>         <description>The host and port that the MapReduce job tracker runs
>         at. If "local", then jobs are run in-process as a single map and
>         reduce task.</description>
>       </property>
>     </configuration>
>
> Thanks,
> Richard
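When chasing this kind of failure, it helps to locate the deepest ancestor of the target directory that actually exists and check whether it is writable by the daemon's user. A small illustrative sketch (the path below is a stand-in, not Richard's actual configuration):

```python
# Walk up a target path to the first ancestor that exists, then report
# whether the current user can write there. Path is a stand-in example.
import os

def first_existing_ancestor(path):
    """Return the deepest existing ancestor of path (path itself if it exists)."""
    while path and not os.path.exists(path):
        path = os.path.dirname(path)
    return path

target = '/tmp/hadoop-demo/dfs/name/current'   # hypothetical failing directory
ancestor = first_existing_ancestor(target)
print(ancestor, os.access(ancestor, os.W_OK))
```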
Re: Multiple masters in hadoop
Actually the /hadoop/conf/masters file is for configuring secondarynamenode(s). Check
http://www.cloudera.com/blog/2009/02/multi-host-secondarynamenode-configuration/
for details.

cheers,
-jw

On Wed, Sep 29, 2010 at 1:36 PM, Shi Yu <sh...@uchicago.edu> wrote:

> The master appearing in the masters and slaves files is a machine name or
> IP address. If you have a single cluster and you specify multiple names in
> those files, it will cause errors because of connection failures.
>
> Shi
>
> On 2010-9-29 15:28, Bhushan Mahale wrote:
>
>> Hi,
>>
>> The master file in hadoop/conf is called "masters". I'm wondering
>> whether I can configure multiple masters for a single cluster. If yes,
>> how can I use them?
>>
>> Thanks,
>> Bhushan
>
> --
> Postdoctoral Scholar
> Institute for Genomics and Systems Biology
> Department of Medicine, the University of Chicago
> Knapp Center for Biomedical Discovery
> 900 E. 57th St. Room 10148
> Chicago, IL 60637, US
> Tel: 773-702-6799
lengthy delay after the last reduce completes
I was just wondering what goes on under the covers once the last reduce task ends. The following is from a very simple map reduce job I run throughout the day. Typically the run time is about a minute from start to finish, but for this particular run there was a delay of over 5 minutes after the last reduce task ended. Any thoughts?

Thanks,
-James Warren

    2010-05-07 01:11:10,302 [main] INFO org.apache.hadoop.mapred.JobClient - Running job: job_201005041742_0879
    2010-05-07 01:11:11,305 [main] INFO org.apache.hadoop.mapred.JobClient - map 0% reduce 0%
    2010-05-07 01:11:49,410 [main] INFO org.apache.hadoop.mapred.JobClient - map 4% reduce 0%
    2010-05-07 01:11:55,427 [main] INFO org.apache.hadoop.mapred.JobClient - map 8% reduce 0%
    2010-05-07 01:12:04,454 [main] INFO org.apache.hadoop.mapred.JobClient - map 17% reduce 0%
    2010-05-07 01:12:07,462 [main] INFO org.apache.hadoop.mapred.JobClient - map 17% reduce 2%
    2010-05-07 01:12:10,471 [main] INFO org.apache.hadoop.mapred.JobClient - map 26% reduce 2%
    2010-05-07 01:12:16,487 [main] INFO org.apache.hadoop.mapred.JobClient - map 43% reduce 5%
    2010-05-07 01:12:19,497 [main] INFO org.apache.hadoop.mapred.JobClient - map 100% reduce 5%
    2010-05-07 01:12:22,505 [main] INFO org.apache.hadoop.mapred.JobClient - map 100% reduce 14%
    2010-05-07 01:12:31,530 [main] INFO org.apache.hadoop.mapred.JobClient - map 100% reduce 100%
    2010-05-07 01:18:06,367 [main] INFO org.apache.hadoop.mapred.JobClient - Job complete: job_201005041742_0879
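For reference, the gap can be read straight off the two final log timestamps: reduce hit 100% at 01:12:31 and the job completed at 01:18:06, a delay of 5 minutes 35 seconds.

```python
# The post-reduce gap, computed from the two log timestamps above.
from datetime import datetime

fmt = '%Y-%m-%d %H:%M:%S'
reduce_done = datetime.strptime('2010-05-07 01:12:31', fmt)
job_done = datetime.strptime('2010-05-07 01:18:06', fmt)
delay = job_done - reduce_done
print(delay)  # 0:05:35
```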
fair scheduler preemptions timeout difficulties
Greetings, Hadoop Fans:

I'm attempting to use the timeout feature of the Fair Scheduler (using Cloudera's most recently released distribution, 0.20.1+152-1), but without success. I'm using the following configs:

/etc/hadoop/conf/mapred-site.xml:

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
      <property>
        <name>mapred.job.tracker</name>
        <value>hadoop-master:8021</value>
      </property>
      <property>
        <name>mapred.tasktracker.map.tasks.maximum</name>
        <value>9</value>
      </property>
      <property>
        <name>mapred.tasktracker.reduce.tasks.maximum</name>
        <value>3</value>
      </property>
      <property>
        <name>mapred.jobtracker.taskScheduler</name>
        <value>org.apache.hadoop.mapred.FairScheduler</value>
      </property>
      <property>
        <name>mapred.fairscheduler.allocation.file</name>
        <value>/etc/hadoop/conf/pools.xml</value>
      </property>
      <property>
        <name>mapred.fairscheduler.assignmultiple</name>
        <value>true</value>
      </property>
      <property>
        <name>mapred.fairscheduler.poolnameproperty</name>
        <value>pool.name</value>
      </property>
      <property>
        <name>pool.name</name>
        <value>default</value>
      </property>
    </configuration>

and /etc/hadoop/conf/pools.xml:

    <?xml version="1.0"?>
    <allocations>
      <pool name="realtime">
        <minMaps>4</minMaps>
        <minReduces>1</minReduces>
        <minSharePreemptionTimeout>180</minSharePreemptionTimeout>
        <weight>2.0</weight>
      </pool>
      <pool name="default">
        <minMaps>2</minMaps>
        <minReduces>2</minReduces>
        <maxRunningJobs>1</maxRunningJobs>
      </pool>
    </allocations>

but a job in the realtime pool fails to interrupt a job running in the default pool (I waited for 15 minutes). Is there something wrong with my configs? Or is there anything in the logs that would be useful for debugging? (I've only found a "successfully configured fairscheduler" comment in the jobtracker log upon starting up the daemon.)

Help would be extremely appreciated!

Thanks,
-James Warren
Re: fair scheduler preemptions timeout difficulties
Todd from Cloudera solved this for me on their company's forum: what I was missing is the mapred.fairscheduler.preemption property in mapred-site.xml - without it, the preemption settings in the allocations file are ignored. To turn preemption on, set that property's value to "true".

Thanks, Todd!

On Wed, Dec 2, 2009 at 4:26 PM, james warren <ja...@rockyou.com> wrote:

> Greetings, Hadoop Fans:
>
> I'm attempting to use the timeout feature of the Fair Scheduler (using
> Cloudera's most recently released distribution, 0.20.1+152-1), but
> without success. [...]
>
> A job in the realtime pool fails to interrupt a job running in the
> default pool (I waited for 15 minutes). Is there something wrong with my
> configs? Or is there anything in the logs that would be useful for
> debugging? (I've only found a "successfully configured fairscheduler"
> comment in the jobtracker log upon starting up the daemon.)
>
> Thanks,
> -James Warren
detecting stalled daemons?
Quick question for the hadoop / linux masters out there:

I recently observed a stalled tasktracker daemon on our production cluster, and was wondering if there are common tests to detect such failures so that administration tools (e.g. monit) can automatically restart the daemon. The particular observed symptoms were:

- the node was dropped by the jobtracker
- information in /proc listed the tasktracker process as sleeping, not zombie
- the web interface (port 50060) was unresponsive, though telnet did connect
- no error information in the hadoop logs - they simply were no longer being updated

I certainly cannot be the first person to encounter this - anyone have a neat and tidy solution they could share? (And yes, we will eventually go down the nagios / ganglia / cloudera desktop path, but we're waiting until we're running CDH2.)

Many thanks,
-James Warren
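A minimal sketch of the kind of liveness probe a tool like monit could run against those symptoms: require an actual HTTP response from the tasktracker's status port within a deadline, so a daemon that still accepts TCP connections but never answers is flagged as stalled. This is a generic HTTP check under stated assumptions, not a Hadoop API; the host name in the comment is hypothetical.

```python
# Liveness probe: a stalled tasktracker may still accept TCP connections
# on port 50060 but never serve its status page, so demand a real HTTP
# response within a deadline rather than a bare connect.
import urllib.request
import urllib.error

def http_alive(host, port, timeout=5):
    """Return True only if an HTTP 200 arrives within `timeout` seconds."""
    try:
        with urllib.request.urlopen(f'http://{host}:{port}/', timeout=timeout) as r:
            return r.status == 200
    except OSError:           # covers URLError, connection refused, timeouts
        return False

# e.g. restart the daemon when http_alive('tasktracker-01', 50060) is False
```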