On 2010-04-26 22:31, Joshua J Pavel wrote:
Sending this out to close the thread, in case anyone else experiences this
problem: Nutch 1.0 is not AIX-friendly (0.9 is).
I'm not 100% sure which command it may be, but by modifying my path so
that /opt/freeware/bin has precedence, I no longer get the
Hi,
I'm new to the world of Nutch. I am trying to crawl local file
systems on a LAN using Nutch 1.0 and then search them using Solr.
Documents are rarely modified, but the recrawl frequency is 1 day because
documents are frequently added and deleted. I have a few queries
regarding recrawling.
1. What
Using Nutch nightly build nutch-2010-04-27_04-00-28:
I am trying to run bin/nutch crawl on a single HTML file generated by javadoc,
but no links are followed. I verified this with bin/nutch readdb and
bin/nutch readlinkdb, and also with luke-1.0.1. Only the single base
seed doc specified is processed.
I