Re: Hadoop Disk Error

2010-04-27 Thread Andrzej Bialecki
On 2010-04-26 22:31, Joshua J Pavel wrote: Sending this out to close the thread in case anyone else experiences this problem: Nutch 1.0 is not AIX-friendly (0.9 is). I'm not 100% sure which command is responsible, but by modifying my path so that /opt/freeware/bin has precedence, I no longer get the

Issues in recrawling

2010-04-27 Thread arpit khurdiya
Hi, I'm new to the world of Nutch. I am trying to crawl local file systems on a LAN using Nutch 1.0 and then search them using Solr. Documents are rarely modified, but the recrawl frequency is 1 day because documents are frequently added and deleted. I have a few queries regarding recrawling. 1. What

nutch crawl issue

2010-04-27 Thread matthew a. grisius
Using Nutch nightly build nutch-2010-04-27_04-00-28: I am trying to run bin/nutch crawl on a single HTML file generated by javadoc, and no links are followed. I verified this with bin/nutch readdb and bin/nutch readlinkdb, and also with luke-1.0.1. Only the single base seed doc specified is processed. I