Hi Alexei,

in principle, you cannot run more than one Hadoop job concurrently in
local mode; to do so, the jobs have to use disjoint hadoop.tmp.dir
properties. There have been a few posts on this list about this topic.
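A minimal sketch of what I mean (the conf-dir names and paths below are
placeholders, not taken from your setup): give every concurrently
running job its own copy of the conf directory, differing only in
hadoop.tmp.dir, and point bin/nutch at it via NUTCH_CONF_DIR, which the
1.x launcher script honors. Keeping the tmp directories outside /tmp
also protects the spill files from tmpwatch-style cleanup during a long
crawl.

#!/bin/bash
# Each conf dir is a copy of runtime/local/conf whose nutch-site.xml
# sets a distinct hadoop.tmp.dir, e.g. for conf-crawl:
#
#   <property>
#     <name>hadoop.tmp.dir</name>
#     <value>/home/developer/hadoop-tmp/crawl</value>
#   </property>
#
# (conf-stats would use /home/developer/hadoop-tmp/stats instead.)

NUTCH=/home/developer/crawler/apache-nutch-1.4-bin/runtime/local/bin/nutch

# Crawl and read statistics at the same time, each local-mode job
# spilling into its own hadoop.tmp.dir (mapred.local.dir defaults to
# ${hadoop.tmp.dir}/mapred/local, so the spill files cannot collide):
NUTCH_CONF_DIR=/home/developer/crawler/conf-crawl $NUTCH crawl urls -dir crawl -depth 3 &
NUTCH_CONF_DIR=/home/developer/crawler/conf-stats $NUTCH readdb crawl/crawldb/ -stats &
wait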
I'm not 100% sure whether the commands in your script are the reason,
because they should only read data and not write anything.

Sebastian

On 03/04/2013 09:48 AM, Alexei Korolev wrote:
> Hello,
>
> It's me again :) The error is back.
>
> Could the reason be that I run this script while Nutch is crawling?
>
> #!/bin/bash
>
> NUTCH_PATH=/home/developer/crawler/apache-nutch-1.4-bin/runtime/local/bin/nutch
>
> export JAVA_HOME=/usr/
>
> rm -rf stats
>
> $NUTCH_PATH domainstats crawl/crawldb/current stats host
> $NUTCH_PATH readdb crawl/crawldb/ -stats
> $NUTCH_PATH readseg -list -dir crawl/crawldb/segments
>
> Maybe this script removed some essential files from the tmp directory?
>
> Thanks.
>
>
> On Mon, Feb 11, 2013 at 8:26 PM, Eyeris Rodriguez Rueda <[email protected]> wrote:
>
>> The conversation is about the /tmp space consumed by the Nutch crawl
>> process.
>> See Thu, 07 Feb, 14:12:
>>
>> http://mail-archives.apache.org/mod_mbox/nutch-user/201302.mbox/%[email protected]%3E
>>
>>
>>
>>
>> ----- Original message -----
>> From: "Alexei Korolev" <[email protected]>
>> To: [email protected]
>> Sent: Monday, February 11, 2013 10:50:44
>> Subject: Re: DiskChecker$DiskErrorException
>>
>> Hi,
>>
>> Thank you for your input. du shows:
>>
>> root@Ubuntu-1110-oneiric-64-minimal:~# du -hs /tmp
>> 5.1M    /tmp
>>
>> About the thread: could you give me a more specific link? Right now it
>> points to the whole archive for February 2013.
>>
>> Thanks.
>>
>> On Mon, Feb 11, 2013 at 7:13 PM, Eyeris Rodriguez Rueda <[email protected]> wrote:
>>
>>> Hi Alexei.
>>> Make sure about Markus' suggestion; I had the same problem with /tmp
>>> space while Nutch was crawling. This folder is cleaned when you reboot
>>> the system, but Nutch checks the available space and can throw
>>> exceptions.
>>> Verify the space with
>>> du -hs /tmp/
>>> Also check this thread:
>>> http://mail-archives.apache.org/mod_mbox/nutch-user/201302.mbox/browser
>>>
>>>
>>>
>>>
>>>
>>> ----- Original message -----
>>> From: "Alexei Korolev" <[email protected]>
>>> To: [email protected]
>>> Sent: Monday, February 11, 2013 3:40:06
>>> Subject: Re: DiskChecker$DiskErrorException
>>>
>>> Hi,
>>>
>>> Yes
>>>
>>> Filesystem      1K-blocks      Used  Available Use% Mounted on
>>> /dev/md2       1065281580 592273404  419321144  59% /
>>> udev              8177228         8    8177220   1% /dev
>>> tmpfs             3274592       328    3274264   1% /run
>>> none                 5120         0       5120   0% /run/lock
>>> none              8186476         0    8186476   0% /run/shm
>>> /dev/md3       1808084492  15283960 1701678392   1% /home
>>> /dev/md1           507684     38099     443374   8% /boot
>>>
>>> On Mon, Feb 11, 2013 at 12:33 PM, Markus Jelsma <[email protected]> wrote:
>>>
>>>> Hi - Also enough space in your /tmp directory?
>>>>
>>>> Cheers
>>>>
>>>>
>>>>
>>>> -----Original message-----
>>>>> From: Alexei Korolev <[email protected]>
>>>>> Sent: Mon 11-Feb-2013 09:27
>>>>> To: [email protected]
>>>>> Subject: DiskChecker$DiskErrorException
>>>>>
>>>>> Hello,
>>>>>
>>>>> I have already gotten this error twice:
>>>>>
>>>>> 2013-02-08 15:26:11,674 WARN  mapred.LocalJobRunner - job_local_0001
>>>>> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
>>>>> taskTracker/jobcache/job_local_0001/attempt_local_0001_m_000000_0/output/spill0.out
>>>>> in any of the configured local directories
>>>>>         at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:389)
>>>>>         at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
>>>>>         at org.apache.hadoop.mapred.MapOutputFile.getSpillFile(MapOutputFile.java:94)
>>>>>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1443)
>>>>>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1154)
>>>>>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:359)
>>>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>>>>>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
>>>>> 2013-02-08 15:26:12,515 ERROR fetcher.Fetcher - Fetcher:
>>>>> java.io.IOException: Job failed!
>>>>>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
>>>>>         at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:1204)
>>>>>         at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:1240)
>>>>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>>>         at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:1213)
>>>>>
>>>>> I've searched Google, but no luck. I run Nutch 1.4 locally and have
>>>>> plenty of free space on disk.
>>>>> I would much appreciate some help.
>>>>>
>>>>> Thanks.
>>>>>
>>>>>
>>>>> --
>>>>> Alexei A. Korolev
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Alexei A. Korolev
>>>
>>
>>
>>
>> --
>> Alexei A. Korolev
>>
>
>

