I'm looking for advice on where to locate search.dir. I saw posts
stating to put it on the OS file system; some speak of locating it under
hdfs...
So far I have only used the OS FS, which means you have to copy the
whole crawl directory out of hdfs... quite long to transfer and
resource-consuming.
Any advice?
I have tested this now with the current trunk of Nutch.
Revision: 886112
The dump of the crawl db shows:
http://www.wachauclimbing.net/home/impressum-disclaimer/comment-page-1/
Version: 7
Status: 2 (db_fetched)
Fetch time: Wed Dec 02 12:48:22 CET 2009
Modified time: Thu Jan 01 01:00:00 CET 1970
Hi,
I have this error when crawling:
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid
local directory for
taskTracker/jobcache/job_local_0001/attempt_local_0001_m_00_0/output/spill0.out
at
disk full?
2009/12/2 BELLINI ADAM mbel...@msn.com
BELLINI ADAM wrote:
Most likely you ran out of tmp disk space.
--
Thanks! Fixing how I was merging the indexes took care of the warning.
Jesse
int GetRandomNumber()
{
return 4; // Chosen by fair roll of dice
// Guaranteed to be random
} // xkcd.com
On Tue, Dec 1, 2009 at 4:49 AM, Andrzej Bialecki a...@getopt.org wrote:
Jesse Hires wrote:
Hi,
I fixed it by changing the tmp directory:

<property>
  <name>hadoop.tmp.dir</name>
  <value>/my/large/disk/space/hadoop-${user.name}</value>
</property>
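After pointing hadoop.tmp.dir at a larger disk, a quick sanity check on the usable space there can confirm the fix; a minimal stdlib-only sketch (the default path argument is just an example):

```java
import java.io.File;

public class DiskSpaceCheck {
    public static void main(String[] args) {
        // Point this at the directory backing hadoop.tmp.dir;
        // "/tmp" here is only an example default.
        File dir = new File(args.length > 0 ? args[0] : "/tmp");
        long freeGb = dir.getUsableSpace() / (1024L * 1024L * 1024L);
        System.out.println(dir + ": " + freeGb + " GB usable");
        // Hadoop's spill files (spill0.out etc.) land under hadoop.tmp.dir,
        // so a near-zero value here usually explains the DiskErrorException.
    }
}
```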
Thanks, how can I close this post?
Date: Wed, 2 Dec 2009 15:51:46 +0100
From: a...@getopt.org
To: nutch-user@lucene.apache.org
I think I have seen this before;
check the permissions in the /tmp directory.
Well, at least for me that was the issue, for some reason.
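To check the permissions angle without guessing, a small probe that actually tries to create a file in the temp directory is more reliable than reading `canWrite()` alone (this sketch uses `java.io.tmpdir`, which is typically /tmp on Linux):

```java
import java.io.File;
import java.io.IOException;

public class TmpPermissionCheck {
    public static void main(String[] args) throws IOException {
        // java.io.tmpdir is usually /tmp on Linux
        File tmp = new File(System.getProperty("java.io.tmpdir"));
        System.out.println(tmp + " canWrite: " + tmp.canWrite());
        // Creating and deleting a scratch file is the decisive test;
        // canWrite() can be misleading on some filesystems.
        File probe = File.createTempFile("nutch-probe", ".tmp", tmp);
        System.out.println("created " + probe + ", deleting");
        probe.delete();
    }
}
```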
On Wed, 2009-12-02 at 14:40 +, BELLINI ADAM wrote:
Observing my fetch cycles' performance, it looks like there is always a
rather long tail.
I saw it on 10k, 150k, and 450k fetch runs.
Of course you can cut off the tail with patch 770 made by Julien
(thx); I did some dry tests and it looks to be working, so I'm going to
move it to production.
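For anyone else trying the tail cut-off: as far as I can tell, the patch exposes the time limit as a fetcher property you can set in nutch-site.xml. The property name and the value below are from my setup and may differ in your trunk, so check nutch-default.xml:

```
<property>
  <name>fetcher.timelimit.mins</name>
  <value>60</value>
</property>
```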
Yet, what seems to