Hello, it's me again :) The error is back.
Could the cause be that I run this script while Nutch is crawling?

#!/bin/bash
NUTCH_PATH=/home/developer/crawler/apache-nutch-1.4-bin/runtime/local/bin/nutch
export JAVA_HOME=/usr/
rm -rf stats
$NUTCH_PATH domainstats crawl/crawldb/current stats host
$NUTCH_PATH readdb crawl/crawldb/ -stats
$NUTCH_PATH readseg -list -dir crawl/crawldb/segments

Maybe this script removed some essential files from the tmp directory?

Thanks.

On Mon, Feb 11, 2013 at 8:26 PM, Eyeris Rodriguez Rueda <[email protected]> wrote:

> the conversation is about the consumption of nutch crawl process in /tmp
> folder.
> see Thu, 07 Feb, 14:12
>
> http://mail-archives.apache.org/mod_mbox/nutch-user/201302.mbox/%[email protected]%3E
>
>
> ----- Original message -----
> From: "Alexei Korolev" <[email protected]>
> To: [email protected]
> Sent: Monday, February 11, 2013 10:50:44
> Subject: Re: DiskChecker$DiskErrorException
>
> Hi,
>
> Thank you for your input. DU shows:
>
> root@Ubuntu-1110-oneiric-64-minimal:~# du -hs /tmp
> 5.1M /tmp
>
> About thread. Could you give me more specified link, because right now it's
> pointing to archive of Feb, 2013.
>
> Thanks.
>
> On Mon, Feb 11, 2013 at 7:13 PM, Eyeris Rodriguez Rueda <[email protected]> wrote:
>
> > Hi alexei.
> > Make sure about markus suggestion, i had a same problem with /tmp folder
> > space while nutch is crawling. This folder is cleaned when you reboot the
> > system, but nutch check the available space and it can throw exceptions.
> > verify the space with
> > du -hs /tmp/
> > also check this thread
> > http://mail-archives.apache.org/mod_mbox/nutch-user/201302.mbox/browser
> >
> >
> > ----- Original message -----
> > From: "Alexei Korolev" <[email protected]>
> > To: [email protected]
> > Sent: Monday, February 11, 2013 3:40:06
> > Subject: Re: DiskChecker$DiskErrorException
> >
> > Hi,
> >
> > Yes
> >
> > Filesystem      1K-blocks       Used  Available Use% Mounted on
> > /dev/md2       1065281580  592273404  419321144  59% /
> > udev              8177228          8    8177220   1% /dev
> > tmpfs             3274592        328    3274264   1% /run
> > none                 5120          0       5120   0% /run/lock
> > none              8186476          0    8186476   0% /run/shm
> > /dev/md3       1808084492   15283960 1701678392   1% /home
> > /dev/md1           507684      38099     443374   8% /boot
> >
> > On Mon, Feb 11, 2013 at 12:33 PM, Markus Jelsma
> > <[email protected]> wrote:
> >
> > > Hi- Also enough space in your /tmp directory?
> > >
> > > Cheers
> > >
> > > -----Original message-----
> > > > From: Alexei Korolev <[email protected]>
> > > > Sent: Mon 11-Feb-2013 09:27
> > > > To: [email protected]
> > > > Subject: DiskChecker$DiskErrorException
> > > >
> > > > Hello,
> > > >
> > > > Already twice I got this error:
> > > >
> > > > 2013-02-08 15:26:11,674 WARN mapred.LocalJobRunner - job_local_0001
> > > > org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
> > > > taskTracker/jobcache/job_local_0001/attempt_local_0001_m_000000_0/output/spill0.out
> > > > in any of the configured local directories
> > > >     at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:389)
> > > >     at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
> > > >     at org.apache.hadoop.mapred.MapOutputFile.getSpillFile(MapOutputFile.java:94)
> > > >     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1443)
> > > >     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1154)
> > > >     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:359)
> > > >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > > >     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> > > > 2013-02-08 15:26:12,515 ERROR fetcher.Fetcher - Fetcher:
> > > > java.io.IOException: Job failed!
> > > >     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
> > > >     at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:1204)
> > > >     at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:1240)
> > > >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > > >     at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:1213)
> > > >
> > > > I've checked in google, but no luck. I run nutch 1.4 locally and have a
> > > > plenty of free space on disk.
> > > > I would much appreciate for some help.
> > > >
> > > > Thanks.
> > > >
> > > > --
> > > > Alexei A. Korolev
> > > >
> >
> > --
> > Alexei A. Korolev
>
> --
> Alexei A. Korolev

--
Alexei A. Korolev
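
A note on the /tmp check discussed in this thread: `du -hs /tmp` reports how much data currently sits in /tmp, while `df` reports how much space is still available on the filesystem that holds it. The sketch below checks the latter. It assumes the crawl's temporary files live under /tmp, which the thread suggests but does not confirm for this install. TMP_DIR, MIN_FREE_KB, free_kb and has_space are names made up for this illustration, not Nutch or Hadoop settings, and the 1 GiB threshold is arbitrary.

```shell
#!/bin/bash
# Sketch only: TMP_DIR and MIN_FREE_KB are illustrative, not Nutch/Hadoop options.
TMP_DIR="${TMP_DIR:-/tmp}"
MIN_FREE_KB="${MIN_FREE_KB:-1048576}"   # 1 GiB, an arbitrary threshold

# Print the free space, in 1K blocks, of the filesystem that holds directory $1.
# -P forces the POSIX one-line-per-filesystem layout; -k forces 1K blocks.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if the filesystem holding directory $1 has at least $2 KB available.
has_space() {
    [ "$(free_kb "$1")" -ge "$2" ]
}

if has_space "$TMP_DIR" "$MIN_FREE_KB"; then
    echo "OK: $(free_kb "$TMP_DIR") KB free for $TMP_DIR"
else
    echo "WARNING: less than $MIN_FREE_KB KB free for $TMP_DIR" >&2
fi
```

Running it before and during a crawl, for example with `TMP_DIR=/tmp MIN_FREE_KB=5242880 ./check_tmp.sh` (a hypothetical file name), would show whether available space actually drops while the fetcher is running.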

