Here is the output, with fetcher parsing enabled.

Command output:
crawl started in: cmrolg-even/crawl
rootUrlDir = /projects/events/search/nutch-1.0/cmrolg-even/urls
threads = 10
depth = 5
Injector: starting
Injector: crawlDb: cmrolg-even/crawl/crawldb
Injector: urlDir: /projects/events/search/nutch-1.0/cmrolg-even/urls
Injector: Converting injected urls to crawl db entries.
Injector: Merging injected urls into crawl db.
Injector: done
Generator: Selecting best-scoring urls due for fetch.
Generator: starting
Generator: segment: cmrolg-even/crawl/segments/20100420175131
Generator: filtering: true
Generator: jobtracker is 'local', generating exactly one partition.
Generator: Partitioning selected urls by host, for politeness.
Generator: done.
Fetcher: starting
Fetcher: segment: cmrolg-even/crawl/segments/20100420175131
Fetcher: threads: 10
QueueFeeder finished: total 1 records.
fetching http:// [...]
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=0
-activeThreads=0, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=0,
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
        at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:969)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:122)
Elapsed time: 16

(So yes, 16 seconds total.)

Log output:

2010-04-20 17:51:36,994 INFO fetcher.Fetcher - fetching http:// [...]
2010-04-20 17:51:37,006 INFO http.Http - http.proxy.host = null
2010-04-20 17:51:37,007 INFO http.Http - http.proxy.port = 8080
2010-04-20 17:51:37,007 INFO http.Http - http.timeout = 10000
2010-04-20 17:51:37,007 INFO http.Http - http.content.limit = -1
2010-04-20 17:51:37,007 INFO http.Http - http.agent = Nutch/Nutch (webmaster@ [...] )
2010-04-20 17:51:37,007 INFO http.Http - protocol.plugin.check.blocking = false
2010-04-20 17:51:37,007 INFO http.Http - protocol.plugin.check.robots = false
2010-04-20 17:51:37,025 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=1
2010-04-20 17:51:37,027 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=1
2010-04-20 17:51:37,028 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=1
2010-04-20 17:51:37,030 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=1
2010-04-20 17:51:37,031 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=1
2010-04-20 17:51:37,031 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=1
2010-04-20 17:51:37,032 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=1
2010-04-20 17:51:37,032 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=1
2010-04-20 17:51:37,032 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=1
2010-04-20 17:51:37,296 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=0
2010-04-20 17:51:38,035 INFO fetcher.Fetcher - -activeThreads=0, spinWaiting=0, fetchQueues.totalSize=0
2010-04-20 17:51:38,036 INFO fetcher.Fetcher - -activeThreads=0,
2010-04-20 17:51:38,038 WARN mapred.LocalJobRunner - job_local_0005
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_local_0005/attempt_local_0005_m_000000_0/output/spill0.out
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:335)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
        at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:930)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:842)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)

If I turn OFF fetcher parsing, it successfully queues and crawls, and then
dies with the same error message but with a different job ID. The same
configuration on Windows/Cygwin completes successfully and generates a
94 MB crawl directory. I could share my config as well (nutch-site and
crawl-urlfilter), but since it works on another system, I presume that I'm
configured correctly. Thanks for taking a look at this.
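For context, "fetcher parsing" is controlled by the fetcher.parse property,
which Nutch reads from overrides in conf/nutch-site.xml. A minimal sketch of
that toggle (fetcher.parse is a standard Nutch 1.0 property; the surrounding
file is the usual override skeleton):

    <?xml version="1.0"?>
    <configuration>
      <!-- fetcher.parse: if true, pages are parsed inside the fetch job;
           if false, fetching and parsing run as separate steps -->
      <property>
        <name>fetcher.parse</name>
        <value>false</value>
      </property>
    </configuration>

With the value false, the parse runs as its own map-reduce job, which matches
the observation above that the failure reappears under a different job ID.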
From: Joshua J Pavel/Raleigh/i...@ibmus
To: nutch-user@lucene.apache.org
Date: 04/20/2010 01:41 PM
Subject: Re: Hadoop Disk Error

Yes - how much free space does it need? We ran 0.9 using /tmp, and that has
~1 GB. After I first saw this error, I moved it to another filesystem where
I have 2 GB free (maybe not "gigs and gigs", but more than I think I need to
complete a small test crawl?).

From: Julien Nioche <lists.digitalpeb...@gmail.com>
To: nutch-user@lucene.apache.org
Date: 04/20/2010 12:36 PM
Subject: Re: Hadoop Disk Error

Hi Joshua,

The error message you got definitely indicates that you are running out of
space. Have you changed the value of hadoop.tmp.dir in the config file?

J.

--
DigitalPebble Ltd
http://www.digitalpebble.com

On 20 April 2010 14:00, Joshua J Pavel <jpa...@us.ibm.com> wrote:

> I am - I changed the location to a filesystem with lots of free space and
> watched disk utilization during a crawl. It'll be a relatively small crawl,
> and I have gigs and gigs free.
>
> From: <arkadi.kosmy...@csiro.au>
> To: <nutch-user@lucene.apache.org>
> Date: 04/19/2010 05:53 PM
> Subject: RE: Hadoop Disk Error
>
> Are you sure that you have enough space in the temporary directory used by
> Hadoop?
>
> From: Joshua J Pavel [mailto:jpa...@us.ibm.com]
> Sent: Tuesday, 20 April 2010 6:42 AM
> To: nutch-user@lucene.apache.org
> Subject: Re: Hadoop Disk Error
>
> Some more information, if anyone can help:
>
> If I turn fetcher.parse to "false", then it successfully fetches and crawls
> the site,
> and then bombs out with a larger ID for the job:
>
> 2010-04-19 20:34:48,342 WARN mapred.LocalJobRunner - job_local_0010
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any
> valid local directory for
> taskTracker/jobcache/job_local_0010/attempt_local_0010_m_000000_0/output/spill0.out
>         at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:335)
>         at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
>         at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:930)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:842)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)
>
> So it's gotta be a problem with the parsing? The pages should all be
> UTF-8, and I know there are multiple languages involved. I tried setting
> parser.character.encoding.default to match, but it made no difference. I'd
> appreciate any ideas.
>
> From: Joshua J Pavel/Raleigh/i...@ibmus
> To: nutch-user@lucene.apache.org
> Date: 04/16/2010 03:05 PM
> Subject: Re: Hadoop Disk Error
>
> fwiw, the error does seem to be valid: from the taskTracker/jobcache
> directory, I only have something for jobs 1-4.
>
> ls -la
> total 0
> drwxr-xr-x 6 root system 256 Apr 16 19:01 .
> drwxr-xr-x 3 root system 256 Apr 16 19:01 ..
> drwxr-xr-x 4 root system 256 Apr 16 19:01 job_local_0001
> drwxr-xr-x 4 root system 256 Apr 16 19:01 job_local_0002
> drwxr-xr-x 4 root system 256 Apr 16 19:01 job_local_0003
> drwxr-xr-x 4 root system 256 Apr 16 19:01 job_local_0004
>
> From: Joshua J Pavel/Raleigh/i...@ibmus
> To: nutch-user@lucene.apache.org
> Date: 04/16/2010 09:00 AM
> Subject: Hadoop Disk Error
>
> We're just now moving from a Nutch 0.9 installation to 1.0, so I'm not
> entirely new to this. However, I can't even get past the first fetch now,
> due to a Hadoop error.
>
> Looking in the mailing list archives, this error is normally caused by
> either permissions or a full disk. I overrode the use of /tmp by setting
> hadoop.tmp.dir to a place with plenty of space, and I'm running the crawl
> as root, yet I'm still getting the error below.
>
> Any thoughts?
>
> Running on AIX with plenty of disk and RAM.
> 2010-04-16 12:49:51,972 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=0
> 2010-04-16 12:49:52,267 INFO fetcher.Fetcher - -activeThreads=0, spinWaiting=0, fetchQueues.totalSize=0
> 2010-04-16 12:49:52,268 INFO fetcher.Fetcher - -activeThreads=0,
> 2010-04-16 12:49:52,270 WARN mapred.LocalJobRunner - job_local_0005
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any
> valid local directory for
> taskTracker/jobcache/job_local_0005/attempt_local_0005_m_000000_0/output/spill0.out
>         at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:335)
>         at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
>         at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:930)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:842)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)
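A note for anyone who lands here with the same DiskErrorException: the spill
file in the trace is allocated under mapred.local.dir, which by default lives
under hadoop.tmp.dir (${hadoop.tmp.dir}/mapred/local in Hadoop of this
vintage), so the usual first checks are free space and writability of
whatever hadoop.tmp.dir points at. A rough sketch of those checks, with a
placeholder path:

    # placeholder: substitute the actual hadoop.tmp.dir value
    TMP=/path/to/hadoop-tmp

    df -k "$TMP"                       # free space on that filesystem (df -k works on AIX)
    ls -ld "$TMP" "$TMP/mapred/local"  # ownership and permissions
    # quick write test as the user that runs the crawl
    touch "$TMP/.write_test" && rm "$TMP/.write_test" && echo writable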