Andrzej Bialecki wrote:
> Example: what happens now if you try to run more than one fetcher at the same time, where the fetcher parameters differ (or the set of activated plugins differs)? You can't - the local tasks on each tasktracker will use whatever local config is there.
That's true when mapred.job.tracker=local, but when things are distributed the config can vary, since each task is spawned in a separate JVM with a separate classpath. Only properties set in nutch-site.xml on each node can never be overridden. For example, so long as plugin.includes is not specified in nutch-site.xml on each node, each task can override plugin.includes to use different plugins.
Also note that plugin implementations can be submitted in a jar file with the job, and plugin.folders can be overridden in the job to find the new plugins. So a job jar might include a folder named "my.plugins", set plugin.folders to "my.plugins, plugins", and then alter plugin.includes to activate job-specific plugins.
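For instance, a rough sketch of such a job setup (treat the exact JobConf/JobClient signatures as approximate, and "my-custom-plugin" is a made-up plugin name):

  import org.apache.nutch.mapred.JobClient;
  import org.apache.nutch.mapred.JobConf;
  import org.apache.nutch.util.NutchConf;

  public class CustomPluginJob {
    public static void main(String[] args) throws Exception {
      // Assumes neither plugin.folders nor plugin.includes is pinned
      // in nutch-site.xml on the tasktracker nodes.
      JobConf job = new JobConf(NutchConf.get());

      // "my.plugins" is a folder inside the submitted job jar;
      // "plugins" is the standard folder in each node's Nutch install.
      job.set("plugin.folders", "my.plugins,plugins");

      // Activate the job-specific plugin (hypothetical name) alongside
      // the usual protocol and parse plugins.
      job.set("plugin.includes",
          "protocol-http|parse-(text|html)|my-custom-plugin");

      // ... set input/output dirs, mapper, reducer, etc. ...
      JobClient.runJob(job);
    }
  }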
> What happens if you change the config on a node that submits the job? The changes won't be propagated to the tasktracker nodes, because tasktrackers use local configuration (through a singleton NutchConf.get()), instead of supplying a serialized/deserialized instance of the config from the originating node... etc.
Again, I'm not sure this is a problem. Properties that tasks should be able to override should not be specified in nutch-site.xml, but rather in mapred-default.xml: per-job settings override mapred-default.xml, while nutch-site.xml always wins. Lots of job-specific properties are currently passed this way.
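To illustrate the precedence (fetcher.threads.fetch is just a convenient example property, and the resource list in the comments is how I understand NutchConf assembles its config, give or take details):

  import org.apache.nutch.mapred.JobConf;
  import org.apache.nutch.util.NutchConf;

  public class PrecedenceDemo {
    public static void main(String[] args) {
      // NutchConf reads nutch-default.xml, mapred-default.xml and
      // nutch-site.xml from the classpath, with nutch-site.xml last.
      NutchConf conf = NutchConf.get();
      System.out.println("default: " + conf.get("fetcher.threads.fetch"));

      // Setting a property on the JobConf puts it in the job file that
      // travels with the job.  On each tasktracker the local
      // nutch-site.xml is still applied last, so a value pinned there
      // would win over this one.
      JobConf job = new JobConf(conf);
      job.set("fetcher.threads.fetch", "50");
      System.out.println("per-job: " + job.get("fetcher.threads.fetch"));
    }
  }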
Another use case for eliminating the static uses of NutchConf is to simplify the construction of a configuration GUI. It would be nice to have a web-based interface that permits one to configure parameters and then run the system. Such an interface should be able to run multiple Nutch instances in a single JVM. For example, a single Nutch-based "search appliance" daemon should be able to crawl and search both your intranet and your public websites, each configured separately.
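A rough sketch of what that would allow once the singleton is gone (a NutchBean constructor taking a NutchConf doesn't exist today, that's the proposed change; searcher.dir is the existing property naming the index location):

  import org.apache.nutch.searcher.NutchBean;
  import org.apache.nutch.util.NutchConf;

  public class SearchAppliance {
    public static void main(String[] args) throws Exception {
      // Two independently configured instances in one JVM, which is
      // not possible while components reach for NutchConf.get().
      NutchConf intranetConf = new NutchConf();
      intranetConf.set("searcher.dir", "/data/nutch/intranet");

      NutchConf publicConf = new NutchConf();
      publicConf.set("searcher.dir", "/data/nutch/public");

      // Hypothetical constructors: today NutchBean configures itself
      // from the singleton instead of an explicit config.
      NutchBean intranet = new NutchBean(intranetConf);
      NutchBean pub = new NutchBean(publicConf);

      // ... hand each bean to its own web app / crawl scheduler ...
    }
  }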
Doug
