It appears Nutch is still looking for
/tmp/hadoop/mapred/system/submit_xiq66r/job.jar
I moved things around when I probably shouldn't have; I have copied this
folder (/tmp/hadoop/) elsewhere, so there must still be a variable to set. I
get this error during inject:
Injector: Converting injected urls to crawl db entries.
Exception in thread "main" java.io.FileNotFoundException: /tmp/hadoop/mapred/system/submit_xiq66r/job.jar (No such file or directory)
        at java.io.FileOutputStream.open(Native Method)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:179)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:131)
        at org.apache.hadoop.fs.LocalFileSystem$LocalFSFileOutputStream.<init>(LocalFileSystem.java:133)
        at org.apache.hadoop.fs.LocalFileSystem.createRaw(LocalFileSystem.java:172)
        at org.apache.hadoop.fs.LocalFileSystem.createRaw(LocalFileSystem.java:180)
        at org.apache.hadoop.fs.FSDataOutputStream$Summer.<init>(FSDataOutputStream.java:56)
        at org.apache.hadoop.fs.FSDataOutputStream$Summer.<init>(FSDataOutputStream.java:45)
        at org.apache.hadoop.fs.FSDataOutputStream.<init>(FSDataOutputStream.java:146)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:270)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:177)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:74)
        at org.apache.hadoop.fs.LocalFileSystem.copyFromLocalFile(LocalFileSystem.java:311)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:254)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:327)
        at org.apache.nutch.crawl.Injector.inject(Injector.java:138)
        at org.apache.nutch.crawl.Injector.main(Injector.java:164)
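As a quick sanity check (a sketch only; /opt/tmp is just an example value for hadoop.tmp.dir, not a path from my setup), the directory that hadoop.tmp.dir points to should exist, be writable by the user running Nutch, and sit on a partition with enough free space before re-running the crawl:

```shell
# Example only: substitute whatever hadoop.tmp.dir is set to in conf.
HADOOP_TMP=/opt/tmp

# The directory must exist and be writable by the user running Nutch.
mkdir -p "$HADOOP_TMP"
touch "$HADOOP_TMP/.write_test" && rm "$HADOOP_TMP/.write_test"

# Check free space on that partition; Nutch can need several GB here.
df -h "$HADOOP_TMP"
```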
On Nov 21, 2007 12:09 AM, Susam Pal <[EMAIL PROTECTED]> wrote:
> I haven't asked you to move the files to a new location. I don't know
> if moving the files works. My solution was for a fresh crawl. If the
> partition which contains /tmp doesn't have enough space, you can point
> Nutch to a different temporary directory by adding this property to
> your 'conf/nutch-site.xml' and doing a new crawl.
>
> <property>
> <name>hadoop.tmp.dir</name>
> <value>/opt/tmp</value>
> <description>Base for Nutch Temporary Directories</description>
> </property>
>
> Please note that /opt/tmp is only an example. Change it to whatever is
> required on your system. Please post the relevant portions of the
> error logs too when an error occurs.
>
> Regards,
> Susam Pal
>
> On Nov 21, 2007 10:28 AM, Josh Attenberg <[EMAIL PROTECTED]> wrote:
> > I did as you said and moved the files to a new directory on a big
> > drive, but now I have some additional errors. Are there any other
> > pointers I need to update?
> >
> >
> > On Nov 20, 2007 11:33 PM, Susam Pal <[EMAIL PROTECTED]> wrote:
> >
> > > Is /tmp on a partition that doesn't have enough space? Does it
> > > have enough space left when this error occurs? Nutch often needs GBs
> > > of space for /tmp. If there isn't enough space on the partition
> > > holding /tmp, then you can add the following property in
> > > 'conf/hadoop-site.xml' to make it use a different directory for
> > > writing the temporary files.
> > >
> > > <property>
> > > <name>hadoop.tmp.dir</name>
> > > <value>/opt/tmp</value>
> > > <description>Base for Nutch Temporary Directories</description>
> > > </property>
> > >
> > > Regards,
> > > Susam Pal
> > >
> > > On Nov 21, 2007 8:54 AM, Josh Attenberg <[EMAIL PROTECTED]> wrote:
> > > > I had this error when fetching with Nutch 0.8.1. There is ~450 GB left
> > > > on the disk holding the crawl db and segments folders. Are there any
> > > > other settings I need to make? I know there isn't much space in my home
> > > > directory, if it was trying to write there, but there is at least
> > > > 500 MB. What are the possible culprits/fixes?
> > > >
> > > >
> > > > Exception in thread "main" org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
> > > >         at org.apache.hadoop.fs.LocalFileSystem$LocalFSFileOutputStream.write(LocalFileSystem.java:150)
> > > >         at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:112)
> > > >         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> > > >         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
> > > >         at java.io.DataOutputStream.flush(DataOutputStream.java:106)
> > > >         at java.io.FilterOutputStream.close(FilterOutputStream.java:140)
> > > >         at org.apache.hadoop.fs.FSDataOutputStream$Summer.close(FSDataOutputStream.java:96)
> > > >         at java.io.FilterOutputStream.close(FilterOutputStream.java:143)
> > > >         at java.io.FilterOutputStream.close(FilterOutputStream.java:143)
> > > >         at java.io.FilterOutputStream.close(FilterOutputStream.java:143)
> > > >         at org.apache.hadoop.fs.FileUtil.copyContent(FileUtil.java:154)
> > > >         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:74)
> > > >         at org.apache.hadoop.fs.LocalFileSystem.copyFromLocalFile(LocalFileSystem.java:311)
> > > >         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:254)
> > > >         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:327)
> > > >         at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:443)
> > > >         at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:477)
> > > > Caused by: java.io.IOException: No space left on device
> > > >         at java.io.FileOutputStream.writeBytes(Native Method)
> > > >         at java.io.FileOutputStream.write(FileOutputStream.java:260)
> > > >         at org.apache.hadoop.fs.LocalFileSystem$LocalFSFileOutputStream.write(LocalFileSystem.java:148)
> > > >         ... 16 more
> > > >
> > >
> >
>