advice for search.dir location

2009-12-02 Thread MilleBii
I'm looking for advice on where to locate the search.dir. I saw a post saying to put it on the OS file system; others speak of locating it under HDFS... So far I have only used the OS FS, which means you have to copy the whole crawl directory out of HDFS... That takes quite long to transfer and is resource-consuming. Any

Re: crawl dates with fetch interval 0

2009-12-02 Thread reinhard schwab
I have now tested this with the current trunk of Nutch (revision 886112). The dump of the crawl db shows: http://www.wachauclimbing.net/home/impressum-disclaimer/comment-page-1/ Version: 7 Status: 2 (db_fetched) Fetch time: Wed Dec 02 12:48:22 CET 2009 Modified time: Thu Jan 01 01:00:00 CET 1970

org.apache.hadoop.util.DiskChecker$DiskErrorExceptio

2009-12-02 Thread BELLINI ADAM
Hi, I get this error when crawling: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_local_0001/attempt_local_0001_m_00_0/output/spill0.out at

Re: org.apache.hadoop.util.DiskChecker$DiskErrorExceptio

2009-12-02 Thread Julien Nioche
Disk full? On 2009/12/2, BELLINI ADAM mbel...@msn.com wrote: hi, i have this error when crawling org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_local_0001/attempt_local_0001_m_00_0/output/spill0.out at

Re: org.apache.hadoop.util.DiskChecker$DiskErrorExceptio

2009-12-02 Thread Andrzej Bialecki
BELLINI ADAM wrote: hi, i have this error when crawling org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_local_0001/attempt_local_0001_m_00_0/output/spill0.out Reply: Most likely you ran out of tmp disk space.

Re: odd warnings

2009-12-02 Thread Jesse Hires
Thanks! Fixing how I was merging the indexes took care of the warning. Jesse int GetRandomNumber() { return 4; // Chosen by fair roll of dice // Guaranteed to be random } // xkcd.com On Tue, Dec 1, 2009 at 4:49 AM, Andrzej Bialecki a...@getopt.org wrote: Jesse Hires wrote:

RE: org.apache.hadoop.util.DiskChecker$DiskErrorExceptio

2009-12-02 Thread BELLINI ADAM
Hi, I fixed it by changing the tmp directory: <property> <name>hadoop.tmp.dir</name> <value>/my/large/disk/space/hadoop-${user.name}</value> </property> Thanks. How can I close this post? Date: Wed, 2 Dec 2009 15:51:46 +0100 From: a...@getopt.org To: nutch-user@lucene.apache.org

Re: org.apache.hadoop.util.DiskChecker$DiskErrorExceptio

2009-12-02 Thread Fadzi Ushewokunze
I think I have seen this before; check the permissions on the /tmp directory. At least for me that was the issue, for some reason. On Wed, 2009-12-02 at 14:40 +, BELLINI ADAM wrote: hi, i have this error when crawling org.apache.hadoop.util.DiskChecker$DiskErrorException: Could
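
The replies in this thread point at two possible causes for the DiskErrorException: too little free space in the tmp directory, or missing permissions on it. As a rough illustration only, the following Python sketch checks a directory for both. The function name `check_local_dir` and the 1 GiB threshold are assumptions made for this example, not part of Hadoop or Nutch, and `tempfile.gettempdir()` merely stands in for whatever directory `hadoop.tmp.dir` points to.

```python
import os
import shutil
import tempfile


def check_local_dir(path, min_free_bytes=1 << 30):
    """Return a list of problems with `path` as a scratch directory.

    Hypothetical helper: min_free_bytes (default 1 GiB) is an arbitrary
    threshold chosen for illustration, not a Hadoop setting.
    """
    problems = []
    if not os.path.isdir(path):
        problems.append("not a directory: " + path)
        return problems
    # The current user needs write and search permission to create files here.
    if not os.access(path, os.W_OK | os.X_OK):
        problems.append("not writable by current user: " + path)
    # Free space on the file system that holds `path`.
    free = shutil.disk_usage(path).free
    if free < min_free_bytes:
        problems.append("only %d bytes free on %s" % (free, path))
    return problems


if __name__ == "__main__":
    # Stand-in for the directory configured as hadoop.tmp.dir.
    for problem in check_local_dir(tempfile.gettempdir()):
        print(problem)
```

An empty result only means that neither of these two checks failed; it does not rule out other causes of the exception.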

How does generate work ?

2009-12-02 Thread MilleBii
Observing the performance of my fetch cycles, it looks like there is always a rather long tail. I saw it on 10k, 150k, and 450k fetch runs. Of course you can cut off the tail with the patch 770 made by Julien (thx); I did some dry tests and it looks like it works, so I'm going to move it to production. Yet, what seems to