Jeff,

> Hi, in Nutch 1.0 I was able to replace the parse-html plugin with my own
> html parser to parse html files, through modifying the mime types in
> parse-plugins.xml.
>
> I have been trying to do the same things in Nutch 1.1, but my own html
> parser is not picked up when crawling, leading to no parser exceptions.
>
You should be able to override Tika for a given mime-type provided that you declare the association between your plugin and the mime-type in parse-plugins.xml.

Have you checked that your plugin is listed in plugin.includes? Can you see it listed in the log?

J.

-- 
DigitalPebble Ltd
Open Source Solutions for Text Engineering
http://www.digitalpebble.com
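For reference, here is a rough sketch of the two settings mentioned above. The plugin id `parse-myhtml` and the extension id `org.example.parse.MyHtmlParser` are made-up placeholders, not names from your setup, so substitute whatever your own plugin.xml declares.

In conf/parse-plugins.xml, the mime-type would be mapped to your plugin, and the alias should point at the extension id of your parser implementation:

```xml
<parse-plugins>
  <!-- Hypothetical: route text/html to the custom plugin instead of Tika -->
  <mimeType name="text/html">
    <plugin id="parse-myhtml" />
  </mimeType>

  <aliases>
    <!-- Hypothetical ids; extension-id is assumed to match the id used in the plugin's plugin.xml -->
    <alias name="parse-myhtml"
           extension-id="org.example.parse.MyHtmlParser" />
  </aliases>
</parse-plugins>
```

In conf/nutch-site.xml, the plugin id also has to match the plugin.includes regular expression. The value below is abbreviated for illustration; keep the rest of your existing value and only add the custom plugin id:

```xml
<property>
  <name>plugin.includes</name>
  <!-- Illustrative value only: parse-myhtml is the hypothetical custom plugin -->
  <value>protocol-http|urlfilter-regex|parse-(myhtml|tika)|index-basic|scoring-opic</value>
</property>
```

If the plugin id does not match that expression, the plugin is never loaded, and the mapping in parse-plugins.xml has nothing to resolve to.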

