Yes, I'm using both relative paths and cygwin under Windows, so /d: is not introduced by me, but by either nutch or hadoop.
Regarding the cygwin path you are right... that is actually where I lost quite some time. OK, I will try absolute paths and let you know.

-MilleBii-

2009/6/24 Andrzej Bialecki <a...@getopt.org>

> MilleBii wrote:
>
>> HEEEELLLLLPPP !!!
>>
>> Stuck for 3 days on not able to start any nutch job.
>>
>> hdfs works fine, ie I can put & look at files.
>> When i start nutch crawl, I get the following error
>>
>> Job initialization failed:
>> java.lang.IllegalArgumentException: Pathname
>>
>> /d:/Bii/nutch/logs/history/user/_logs/history/localhost_1245788245191_job_200906232217_0001_pc-xxxx%5Cxxxx_inject+urls
>>
>> It is looking for the file at a wrong location ???? Indeed in my case the
>> correct location is /d:/Bii/nutch/logs/history, so why is
>> *"history/user/_logs"* added and how can I fix that ?
>>
>> 2009/6/21 MilleBii <mille...@gmail.com>
>>
>> Looks like I just needed to transfer from the local filesystem to hdfs:
>>> Is it safe to transfer a crawl directory (and subs) from the local file
>>> system to hdfs and start crawling again ?
>>>
>>> 1. hadoop fs -put crawl crawl
>>> 2. nutch generate crawl/crawldb crawl/segments -topN 500 (where now it
>>> should use the hdfs)
>>>
>>> -MilleBii-
>>>
>>> 2009/6/21 MilleBii <mille...@gmail.com>
>>>
>>> I have newly installed hadoop in a distributed single node
>>> configuration.
>>>
>>>> When I run nutch commands it is looking for files my user home
>>>> directory
>>>> and not at the nutch directory ?
>>>> How can I change this ?
>>>>
>>>
> I suspect your hadoop-site.xml uses relative path somewhere, and not an
> absolute path (with leading slash). Also, /d: looks suspiciously like a
> Windows pathname, in which case you should either use a full URI
> (file:///d:/) or just the disk name d:/ without the leading slash. Please
> also note that if you are running this on Windows under cygwin then in your
> config files you MUST NOT use the cygwin paths (like /cygdrive/d/...)
> because Java can't see them.
>
>
> --
> Best regards,
> Andrzej Bialecki     <><
>  ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
>
>

--
-MilleBii-
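
To make the path advice quoted above concrete, here is a minimal Java sketch that sorts a config value into the forms mentioned in the thread (full file URI, d:/ drive path, /d:/ with a leading slash, cygwin /cygdrive/ path, relative path) and rewrites the drive-letter forms as a file:///d:/ style URI. The names ConfigPathCheck, Kind, classify and toFileUri are made up for illustration and are not part of Nutch or Hadoop. The sketch only looks at string prefixes, assumes forward slashes, and says nothing about how Hadoop itself resolves these values. The sample /cygdrive/d/Bii/nutch is an invented value, not one taken from the thread.

```java
final class ConfigPathCheck {

    /** Rough categories for a path string taken from a config file. */
    enum Kind { FILE_URI, DRIVE_PATH, SLASH_DRIVE_PATH, CYGWIN_PATH, OTHER_ABSOLUTE, RELATIVE }

    /** Classifies a forward-slash path string by its prefix only. */
    static Kind classify(String p) {
        if (p.startsWith("file:///")) return Kind.FILE_URI;
        if (p.matches("/cygdrive/[A-Za-z](/.*)?")) return Kind.CYGWIN_PATH;
        if (p.matches("[A-Za-z]:/.*")) return Kind.DRIVE_PATH;
        if (p.matches("/[A-Za-z]:/.*")) return Kind.SLASH_DRIVE_PATH;
        if (p.startsWith("/")) return Kind.OTHER_ABSOLUTE;
        return Kind.RELATIVE;
    }

    /** Rewrites any drive-letter form as a file:///d:/... URI string. */
    static String toFileUri(String p) {
        switch (classify(p)) {
            case FILE_URI:
                return p;                       // already a full URI
            case DRIVE_PATH:
                return "file:///" + p;          // d:/x  -> file:///d:/x
            case SLASH_DRIVE_PATH:
                return "file://" + p;           // /d:/x -> file:///d:/x
            case CYGWIN_PATH: {
                // /cygdrive/d/x -> file:///d:/x
                int prefix = "/cygdrive/".length();
                char drive = p.charAt(prefix);
                String rest = p.substring(prefix + 1);
                return "file:///" + drive + ":" + (rest.isEmpty() ? "/" : rest);
            }
            default:
                // Relative or drive-less paths need a base directory we do not know.
                throw new IllegalArgumentException("no drive letter in: " + p);
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "/d:/Bii/nutch/logs/history",   // form seen in the error message
            "d:/Bii/nutch/logs/history",    // drive name without leading slash
            "/cygdrive/d/Bii/nutch",        // hypothetical cygwin path
            "crawl/crawldb"                 // relative path
        };
        for (String s : samples) {
            System.out.println(s + " -> " + classify(s));
        }
    }
}
```

Running a check like this over the values in hadoop-site.xml would flag any relative or /cygdrive/ entries before a job is started; whether the URI form or the plain d:/ form works better for a given property is something to confirm against your own setup.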