Julien wrote:
Hello,
just do:
export NUTCH_CONF_DIR=/_your_conf_path/
Julien
Nearly all of the classes used for crawling (Injector, Generator, Fetcher,
Indexer, etc.) extend the org.apache.hadoop.util.ToolBase class, which
ensures that each of them can take some optional generic command-line arguments.
Below is the javadoc of that class:
This is a base class to support generic command options.
* Generic command options allow a user to specify a namenode,
* a job tracker etc. Generic options supported are
* -conf <configuration file> specify an application configuration file
* -D <property=value> use value for given property
* -fs <local|namenode:port> specify a namenode
* -jt <local|jobtracker:port> specify a job tracker
*
* The general command line syntax is
* bin/hadoop command [genericOptions] [commandOptions]
*
* For every tool that inherits from ToolBase, generic options are
* handled by ToolBase while command options are passed to the tool.
* Generic options handling is implemented using Commons CLI.
*
* Tools that inherit from ToolBase in Hadoop are
* DFSShell, DFSck, JobClient, and CopyFiles.
*
* Examples using generic options are
* bin/hadoop dfs -fs darwin:8020 -ls /data
* list /data directory in dfs with namenode darwin:8020
* bin/hadoop dfs -D fs.default.name=darwin:8020 -ls /data
* list /data directory in dfs with namenode darwin:8020
* bin/hadoop dfs -conf hadoop-site.xml -ls /data
* list /data directory in dfs with conf specified in hadoop-site.xml
* bin/hadoop job -D mapred.job.tracker=darwin:50020 -submit job.xml
* submit a job to job tracker darwin:50020
* bin/hadoop job -jt darwin:50020 -submit job.xml
* submit a job to job tracker darwin:50020
* bin/hadoop job -jt local -submit job.xml
* submit a job to local runner
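Since the Nutch crawl classes inherit from ToolBase, the same generic options
should apply to the bin/nutch commands as well. A sketch of what that would
look like for the injector (the crawldb/urls paths, config file name, and
host:port below are placeholders, not values from this thread):

```shell
# Hypothetical examples, assuming the generic-options-first syntax
# from the javadoc above: command [genericOptions] [commandOptions].

# Point the injector at an alternate configuration file:
bin/nutch inject -conf conf/nutch-site.xml crawl/crawldb urls

# Or override a single property on the command line:
bin/nutch inject -D fs.default.name=darwin:8020 crawl/crawldb urls
```

Setting NUTCH_CONF_DIR as Julien suggests is the simpler route when you want
a whole alternate configuration directory rather than a one-off override.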