What I would really like is a way to pass the location of the config files (e.g. nutch-default.xml, regex-urlfilter.txt, etc.) as an argument to the nutch script, so that I can keep multiple sets of config files (one for each site I wish to crawl).
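Create a file mynutch.sh that takes the config directory as its first argument and passes everything else through to bin/nutch:

    #!/bin/sh
    # First argument: the config directory for this site.
    NUTCH_CONF_DIR=$1
    export NUTCH_CONF_DIR
    # Drop the first argument so "$@" holds only the nutch arguments.
    shift
    echo "using NUTCH_CONF_DIR = $NUTCH_CONF_DIR"
    echo bin/nutch "$@"
    # Quote "$@" so arguments containing spaces are passed through intact.
    bin/nutch "$@"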
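If you would rather not keep a separate script, the same idea can be sketched as a shell function with a couple of sanity checks. This is only a sketch: the function name nutch_with_conf and the NUTCH_BIN override are made up for illustration (they are not part of Nutch), and it assumes you run it from the Nutch install directory so that bin/nutch resolves.

```shell
# Hypothetical helper: run nutch against a per-site config directory.
# The body is wrapped in ( ... ) so it runs in a subshell and the exported
# NUTCH_CONF_DIR does not leak into the calling shell.
nutch_with_conf() (
    if [ "$#" -lt 1 ]; then
        echo "usage: nutch_with_conf CONF_DIR [nutch args...]" >&2
        exit 2
    fi
    if [ ! -d "$1" ]; then
        echo "conf dir not found: $1" >&2
        exit 1
    fi
    NUTCH_CONF_DIR=$1
    export NUTCH_CONF_DIR
    # Drop the conf dir; the remaining arguments go to nutch untouched.
    shift
    echo "using NUTCH_CONF_DIR = $NUTCH_CONF_DIR"
    # NUTCH_BIN is an assumed override for illustration; defaults to bin/nutch.
    "${NUTCH_BIN:-bin/nutch}" "$@"
)
```

You would then call it with the config directory first, followed by whatever you normally pass to nutch, for example with one directory per site such as conf-site-a and conf-site-b (both names hypothetical).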
