Hi Alexei,

in general, in local mode you cannot run more than one Hadoop job
concurrently unless you use disjoint hadoop.tmp.dir properties.
There have been a few posts on this list about this topic.
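
If the stats script really has to run while a crawl is going on, a minimal
sketch of the idea is below. All paths are made up, and I am assuming that
bin/nutch in 1.4 honors the NUTCH_CONF_DIR environment variable; please
double check both against your setup.

#!/bin/bash
# Sketch: give the read-only stats commands a conf directory of their own,
# with a disjoint hadoop.tmp.dir, so their local job files cannot collide
# with those of the running crawl.

NUTCH_HOME=/home/developer/crawler/apache-nutch-1.4-bin/runtime/local
STATS_CONF=$HOME/nutch-stats-conf        # assumed location for the extra conf

mkdir -p "$STATS_CONF"
cp "$NUTCH_HOME"/conf/* "$STATS_CONF"/

# Overwrite nutch-site.xml only in the copy; the crawl keeps using the
# original conf directory and therefore the default hadoop.tmp.dir.
cat > "$STATS_CONF/nutch-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hadoop-nutch-stats</value>
  </property>
</configuration>
EOF

export NUTCH_CONF_DIR="$STATS_CONF"      # assumption: bin/nutch picks this up
"$NUTCH_HOME"/bin/nutch readdb crawl/crawldb/ -stats

Note that the copy loses whatever you had in your own nutch-site.xml (for
example http.agent.name); for read-only commands like readdb that should not
matter, but verify it before relying on it.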

I'm not 100% sure whether the commands in your script are the reason,
because they should only read data and not write anything.
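
To see whether the two runs actually step on each other, you can watch the
local job directories while both are running. The path below assumes the
Hadoop defaults (hadoop.tmp.dir=/tmp/hadoop-${user.name} and
mapred.local.dir=${hadoop.tmp.dir}/mapred/local); adjust it if you have
overridden them.

# List the per-job scratch directories of local-mode jobs; the spill file
# from your stack trace lives under one of these jobcache directories.
ls -l /tmp/hadoop-$USER/mapred/local/taskTracker/jobcache/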

Sebastian

On 03/04/2013 09:48 AM, Alexei Korolev wrote:
> Hello,
> 
> It's me again :) Error is back.
> 
> Could the reason be that I run this script while Nutch is crawling?
> 
> #!/bin/bash
> 
> NUTCH_PATH=/home/developer/crawler/apache-nutch-1.4-bin/runtime/local/bin/nutch
> 
> export JAVA_HOME=/usr/
> 
> rm -rf stats
> 
> $NUTCH_PATH domainstats crawl/crawldb/current stats host
> $NUTCH_PATH readdb crawl/crawldb/ -stats
> $NUTCH_PATH readseg -list -dir crawl/crawldb/segments
> 
> Maybe this script removed some essential files from the tmp directory?
> 
> Thanks.
> 
> 
> On Mon, Feb 11, 2013 at 8:26 PM, Eyeris Rodriguez Rueda <[email protected]> wrote:
> 
>> The conversation is about the consumption of /tmp space by the Nutch crawl
>> process.
>> See Thu, 07 Feb, 14:12
>>
>> http://mail-archives.apache.org/mod_mbox/nutch-user/201302.mbox/%[email protected]%3E
>>
>>
>>
>>
>> ----- Original message -----
>> From: "Alexei Korolev" <[email protected]>
>> To: [email protected]
>> Sent: Monday, February 11, 2013 10:50:44
>> Subject: Re: DiskChecker$DiskErrorException
>>
>> Hi,
>>
>> Thank you for your input. du shows:
>>
>> root@Ubuntu-1110-oneiric-64-minimal:~# du -hs /tmp
>> 5.1M    /tmp
>>
>> About the thread: could you give me a more specific link? Right now it's
>> pointing to the archive for February 2013.
>>
>> Thanks.
>>
>> On Mon, Feb 11, 2013 at 7:13 PM, Eyeris Rodriguez Rueda <[email protected]> wrote:
>>
>>> Hi Alexei,
>>> Follow up on Markus's suggestion; I had the same problem with /tmp folder
>>> space while Nutch was crawling. This folder is cleaned when you reboot the
>>> system, but Nutch checks the available space and can throw exceptions.
>>> Verify the space with
>>> du -hs /tmp/
>>> Also check this thread:
>>> http://mail-archives.apache.org/mod_mbox/nutch-user/201302.mbox/browser
>>>
>>>
>>>
>>>
>>>
>>> ----- Original message -----
>>> From: "Alexei Korolev" <[email protected]>
>>> To: [email protected]
>>> Sent: Monday, February 11, 2013 3:40:06
>>> Subject: Re: DiskChecker$DiskErrorException
>>>
>>> Hi,
>>>
>>> Yes
>>>
>>> Filesystem           1K-blocks      Used Available Use% Mounted on
>>> /dev/md2             1065281580 592273404 419321144  59% /
>>> udev                   8177228         8   8177220   1% /dev
>>> tmpfs                  3274592       328   3274264   1% /run
>>> none                      5120         0      5120   0% /run/lock
>>> none                   8186476         0   8186476   0% /run/shm
>>> /dev/md3             1808084492  15283960 1701678392   1% /home
>>> /dev/md1                507684     38099    443374   8% /boot
>>>
>>> On Mon, Feb 11, 2013 at 12:33 PM, Markus Jelsma
>>> <[email protected]> wrote:
>>>
>>>> Hi- Also enough space in your /tmp directory?
>>>>
>>>> Cheers
>>>>
>>>>
>>>>
>>>> -----Original message-----
>>>>> From: Alexei Korolev <[email protected]>
>>>>> Sent: Mon 11-Feb-2013 09:27
>>>>> To: [email protected]
>>>>> Subject: DiskChecker$DiskErrorException
>>>>>
>>>>> Hello,
>>>>>
>>>>> I have already gotten this error twice:
>>>>>
>>>>> 2013-02-08 15:26:11,674 WARN  mapred.LocalJobRunner - job_local_0001
>>>>> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
>>>>> taskTracker/jobcache/job_local_0001/attempt_local_0001_m_000000_0/output/spill0.out
>>>>> in any of the configured local directories
>>>>>         at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:389)
>>>>>         at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
>>>>>         at org.apache.hadoop.mapred.MapOutputFile.getSpillFile(MapOutputFile.java:94)
>>>>>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1443)
>>>>>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1154)
>>>>>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:359)
>>>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>>>>>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
>>>>> 2013-02-08 15:26:12,515 ERROR fetcher.Fetcher - Fetcher:
>>>>> java.io.IOException: Job failed!
>>>>>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
>>>>>         at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:1204)
>>>>>         at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:1240)
>>>>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>>>         at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:1213)
>>>>>
>>>>> I've checked on Google, but no luck. I run Nutch 1.4 locally and have
>>>>> plenty of free space on disk.
>>>>> I would much appreciate some help.
>>>>>
>>>>> Thanks.
>>>>>
>>>>>
>>>>> --
>>>>> Alexei A. Korolev
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Alexei A. Korolev
>>>
>>
>>
>>
>> --
>> Alexei A. Korolev
>>
> 
> 
> 
