On Wed, Jun 28, 2006 at 07:59:47PM -0700, Brian Hill wrote:
> [ ... ]
> Is there any way to specify a different crawl-urlfilter.txt file for
> each crawl? When I index SiteA, I have a handful of URL masks that I
> want to have available to it. When I index SiteB, I have a different set
> of URL masks that I want available there. Am I going to need two
> completely separate Nutch installations?

I copy the entire conf dir, one copy per site, and set NUTCH_CONF_DIR
to point at the copy before running nutch. One installation is enough.
I believe this is documented somewhere :-)
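
As a rough sketch of that approach (not Matt's actual setup): the helper below copies the stock conf dir once per site. The function name make_site_conf, the default NUTCH_HOME of /opt/nutch, and the destination layout are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: one private conf copy per site.
# NUTCH_HOME and its /opt/nutch default are assumptions, adjust to your install.
NUTCH_HOME="${NUTCH_HOME:-/opt/nutch}"

# make_site_conf SITE DEST_ROOT  (hypothetical helper)
# Copies $NUTCH_HOME/conf to DEST_ROOT/SITE and prints the new path.
make_site_conf() {
    site="$1"
    dest_root="$2"
    # Refuse to reuse an existing dir: cp -R would nest conf/ inside it.
    if [ -e "$dest_root/$site" ]; then
        echo "error: $dest_root/$site already exists" >&2
        return 1
    fi
    mkdir -p "$dest_root" || return 1
    cp -R "$NUTCH_HOME/conf" "$dest_root/$site" || return 1
    echo "$dest_root/$site"
}
```

Usage would be something like `NUTCH_CONF_DIR=$(make_site_conf siteA "$HOME/nutch-conf")`, then editing crawl-urlfilter.txt inside that copy, exporting NUTCH_CONF_DIR, and running nutch for SiteA; repeat with a second copy for SiteB.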

In fact I have an entire script front-end to nutch that insists
NUTCH_CONF_DIR is set, because I don't want nutch to assume anything.

The nutch command's behaviour should be independent of your pwd
or where the 'real' nutch command is.
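
A minimal sketch of such a front-end, assuming nothing about the real script: it refuses to run unless NUTCH_CONF_DIR is set and points at a directory, and it calls nutch by an explicit path instead of relying on the pwd. The names check_nutch_env, run_nutch and NUTCH_BIN (with its /opt/nutch/bin/nutch default) are hypothetical.

```shell
#!/bin/sh
# Sketch only: insist on an explicit conf dir before calling nutch.

# check_nutch_env (hypothetical helper)
# Succeeds only if NUTCH_CONF_DIR is set and is an existing directory.
check_nutch_env() {
    if [ -z "${NUTCH_CONF_DIR:-}" ]; then
        echo "error: NUTCH_CONF_DIR is not set" >&2
        return 1
    fi
    if [ ! -d "$NUTCH_CONF_DIR" ]; then
        echo "error: NUTCH_CONF_DIR is not a directory: $NUTCH_CONF_DIR" >&2
        return 1
    fi
    return 0
}

# run_nutch ARGS...  (hypothetical helper)
# NUTCH_BIN is an assumed variable naming the real nutch command.
run_nutch() {
    check_nutch_env || return 1
    # Export so the child process sees the conf dir.
    export NUTCH_CONF_DIR
    "${NUTCH_BIN:-/opt/nutch/bin/nutch}" "$@"
}
```

Every call then goes through run_nutch with whatever arguments you would normally give nutch, and a forgotten NUTCH_CONF_DIR fails loudly instead of silently picking up the default conf.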

The other slight irritation is that nutch dumps working files
into a subdir of your current dir, so that has to be writable.
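
One way to live with that, sketched under assumptions: run the command from a dedicated writable directory inside a subshell, so the caller's pwd is left alone. The helper name in_workdir and the directory you pass it are hypothetical.

```shell
#!/bin/sh
# Sketch only: run a command from a known writable working dir.

# in_workdir DIR CMD ARGS...  (hypothetical helper)
# Creates DIR if needed, checks it is writable, then runs CMD there.
in_workdir() {
    workdir="$1"
    shift
    mkdir -p "$workdir" || return 1
    if [ ! -w "$workdir" ]; then
        echo "error: $workdir is not writable" >&2
        return 1
    fi
    # The subshell keeps the cd from leaking into the calling shell.
    ( cd "$workdir" && "$@" )
}
```

For example, `in_workdir /var/tmp/nutch-siteA nutch ...` would keep the working files for SiteA under that directory, whatever your pwd was when you started.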


Matt

