Yes, you do have to make a config file for your plugin to be seen by Nutch.
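
As a rough illustration of what such a config file looks like, here is a minimal plugin.xml sketch for an HtmlParseFilter plugin, modeled on the descriptors that ship with Nutch 1.x. The plugin id parse-myfilter, the jar name, and the class com.example.MyHtmlParseFilter are hypothetical placeholders. Only the extension point org.apache.nutch.parse.HtmlParseFilter and the nutch-extensionpoints import come from Nutch itself. Compare it against a shipped descriptor (for example the one for parse-js) before relying on it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical plugin id; by convention it matches the plugin's directory name. -->
<plugin id="parse-myfilter"
        name="My HTML Parse Filter"
        version="1.0.0"
        provider-name="example.com">

   <runtime>
      <!-- Hypothetical jar name; the jar sits next to plugin.xml in the plugin directory. -->
      <library name="parse-myfilter.jar">
         <export name="*"/>
      </library>
   </runtime>

   <requires>
      <import plugin="nutch-extensionpoints"/>
   </requires>

   <!-- Registers the filter at the HtmlParseFilter extension point. -->
   <extension id="com.example.MyHtmlParseFilter"
              name="My HTML Parse Filter"
              point="org.apache.nutch.parse.HtmlParseFilter">
      <!-- Hypothetical class that implements HtmlParseFilter. -->
      <implementation id="MyHtmlParseFilter"
                      class="com.example.MyHtmlParseFilter"/>
   </extension>
</plugin>
```

With a descriptor like this, the plugin would only be activated if its id also matches the plugin.includes regular expression, for example by appending |parse-myfilter to the value. The same goes for parse-js: a value that contains only parse-(text|html) does not match it, so JSParseFilter is not loaded.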

If you built Nutch from source, you should have the directory build/plugins. That's where the compiled plugins are. The names of the directories under there are the names that go in 'plugin.includes'.

Take a look at the existing plugin.xml files; you should be able to figure it out by example.

The standard way to package the plugin code is to put it in a jar in the corresponding plugin directory. This ensures that it won't get loaded if it's not used. (This is optional: if you KNOW that it's going to be used every time, you can put your code anywhere on the classpath.)

Note that I'm using 1.1 - I can't guarantee that this information is still current.

-MB

On Feb 1, 2011, at 9:49 PM, .: Abhishek :. wrote:

> Hi all,
>
> I am writing a custom HtmlParseFilter by implementing the
> HtmlParseFilter interface. I am using the ParserChecker to test the filter.
>
> I could see by some Syso's in the HtmlParseFilters class that by default
> only org.apache.nutch.parse.js.JSParseFilter is being used. If I would like
> to use my custom filter, should I be adding some configuration anywhere?
>
> A point to note is that when I add the following lines to
> nutch-site.xml,
>
> <property>
>   <name>plugin.includes</name>
>   <value>nutch-extensionpoints|protocol-http|urlfilter-regex|parse-(text|html)|index-basic|query-(basic|site|url)</value>
>   <description>Regular expression naming plugin id names to
>   include. Any plugin not matching this expression is excluded.
>   In any case you need at least include the
>   nutch-extensionpoints plugin. By default Nutch includes
>   crawling just HTML and plain text via HTTP,
>   and basic indexing and search plugins.
>   </description>
> </property>
>
> I don't even see JSParseFilter being applied. The package that has my
> custom filter does not have any special plugin configuration XML files. Do I
> have to add some, or configure it elsewhere? I am using Nutch 1.2.
>
> I see my knowledge of Nutch growing considerably, thanks to all of you.
>
> Cheers,
> Abi

