When you get the "job failed" error, that just means the Hadoop job has failed; it doesn't tell you why. For the actual reason you'll need to look in the hadoop.log file (usually in Nutch's logs/ directory), where you'll find much more information on what went wrong. As for your space problem, it's been a while since I set up my config files so I can't recall much off the top of my head, but my hadoop-site.xml file has these properties in it:
dfs.data.dir
mapred.local.dir
mapred.system.dir
mapred.temp.dir

Maybe one of them is the culprit (there's a rough sketch of how they might look at the bottom of this mail).

On Nov 22, 2007 10:55 PM, Susam Pal <[EMAIL PROTECTED]> wrote:
> It is not required in 'conf/nutch-site.xml'. I made a mistake in
> writing 'conf/nutch-site.xml' in my second post in this thread. It is
> required only in 'conf/hadoop-site.xml'. It seems you are pointing it
> to a partition which doesn't have enough space. My temporary directory
> lies in a partition with about 20 GB free space and I never face a
> problem.
>
> Regards,
> Susam Pal
>
> On Nov 23, 2007 4:32 AM, Josh Attenberg <[EMAIL PROTECTED]> wrote:
> > I have added
> > <property>
> >   <name>hadoop.tmp.dir</name>
> >   <value>/opt/tmp</value>
> >   <description>Base for Nutch Temporary Directories</description>
> > </property>
> > (with /opt/tmp changed to an appropriate directory) in both the
> > nutch-site.xml and hadoop-site.xml files. I still get out of space errors
> > right away when trying to crawl. There must be some other configuration
> > property that I am forgetting. Can anyone tell me what this is?
> > cheers,
> > Josh
> >
> > On Nov 20, 2007 11:58 PM, Josh Attenberg <[EMAIL PROTECTED]> wrote:
> > > I did as you say, and moved the files to a new directory on a big drive,
> > > but now have some additional errors. Are there any other pointers I need
> > > to update?
> > >
> > > On Nov 20, 2007 11:33 PM, Susam Pal <[EMAIL PROTECTED]> wrote:
> > > > Is /tmp present in a partition that doesn't have enough space? Does it
> > > > have enough space left when this error occurs? Nutch often needs GBs
> > > > of space for /tmp. If there isn't enough space on the partition having
> > > > /tmp, then you can add the following property in
> > > > 'conf/hadoop-site.xml' to make it use a different directory for
> > > > writing the temporary files.
> > > >
> > > > <property>
> > > >   <name>hadoop.tmp.dir</name>
> > > >   <value>/opt/tmp</value>
> > > >   <description>Base for Nutch Temporary Directories</description>
> > > > </property>
> > > >
> > > > Regards,
> > > > Susam Pal
> > > >
> > > > On Nov 21, 2007 8:54 AM, Josh Attenberg <[EMAIL PROTECTED]> wrote:
> > > > > I had this error when fetching with nutch 0.8.1. There is ~450GB left
> > > > > on the disk where the crawl db and segments folder are. Are there any
> > > > > other settings I need to make? I know there isn't much space in my home
> > > > > directory, if it was trying to write there, but there is at least 500M.
> > > > > What are the possible culprits/fixes?
> > > > >
> > > > > Exception in thread "main" org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
> > > > >         at org.apache.hadoop.fs.LocalFileSystem$LocalFSFileOutputStream.write(LocalFileSystem.java:150)
> > > > >         at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:112)
> > > > >         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> > > > >         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
> > > > >         at java.io.DataOutputStream.flush(DataOutputStream.java:106)
> > > > >         at java.io.FilterOutputStream.close(FilterOutputStream.java:140)
> > > > >         at org.apache.hadoop.fs.FSDataOutputStream$Summer.close(FSDataOutputStream.java:96)
> > > > >         at java.io.FilterOutputStream.close(FilterOutputStream.java:143)
> > > > >         at java.io.FilterOutputStream.close(FilterOutputStream.java:143)
> > > > >         at java.io.FilterOutputStream.close(FilterOutputStream.java:143)
> > > > >         at org.apache.hadoop.fs.FileUtil.copyContent(FileUtil.java:154)
> > > > >         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:74)
> > > > >         at org.apache.hadoop.fs.LocalFileSystem.copyFromLocalFile(LocalFileSystem.java:311)
> > > > >         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:254)
> > > > >         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:327)
> > > > >         at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:443)
> > > > >         at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:477)
> > > > > Caused by: java.io.IOException: No space left on device
> > > > >         at java.io.FileOutputStream.writeBytes(Native Method)
> > > > >         at java.io.FileOutputStream.write(FileOutputStream.java:260)
> > > > >         at org.apache.hadoop.fs.LocalFileSystem$LocalFSFileOutputStream.write(LocalFileSystem.java:148)
> > > > >         ... 16 more
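As promised above, here's a rough sketch of how those four properties might look in conf/hadoop-site.xml if you point them all at a partition with plenty of free space. The /data/hadoop paths are just placeholders I made up for illustration; substitute whatever large partition you actually have:

<property>
  <name>dfs.data.dir</name>
  <value>/data/hadoop/dfs/data</value>
  <description>Where DFS stores its data blocks on the local filesystem.</description>
</property>

<property>
  <name>mapred.local.dir</name>
  <value>/data/hadoop/mapred/local</value>
  <description>Local scratch space for intermediate map/reduce files.</description>
</property>

<property>
  <name>mapred.system.dir</name>
  <value>/data/hadoop/mapred/system</value>
  <description>Directory where map/reduce keeps its system/control files.</description>
</property>

<property>
  <name>mapred.temp.dir</name>
  <value>/data/hadoop/mapred/temp</value>
  <description>Shared temporary directory for map/reduce.</description>
</property>

If I remember correctly, all four of these default to subdirectories under hadoop.tmp.dir, so setting hadoop.tmp.dir as Susam suggested should normally cover them; spelling them out individually just makes it explicit which partition each one lands on.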
