Isn't tika responsible for XML parsing? Because I got this:

parse.ParserFactory - ParserFactory: Plugin: org.apache.nutch.parse.feed.FeedParser mapped to contentType application/rss+xml via parse-plugins.xml, but not enabled via plugin.includes in nutch-default.xml.

Should I just include xml?

The plugin "feed" contains org.apache.nutch.parse.feed.FeedParser, so the
message is about that plugin not being activated in plugin.includes.

But both plugins ("feed" and "parse-tika") should be able to parse RSS feeds.
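
If you prefer to activate the "feed" plugin, a minimal sketch of the override could look like the following. The value shown is only an assumption based on a typical default: copy your own current plugin.includes value from nutch-default.xml and add "feed" to it. The property goes inside the <configuration> element of conf/nutch-site.xml.

```xml
<!-- conf/nutch-site.xml: overrides the setting from nutch-default.xml -->
<property>
  <name>plugin.includes</name>
  <!-- Assumed example value; start from your own setting and add |feed -->
  <value>protocol-http|urlfilter-regex|parse-(html|tika)|feed|index-(basic|anchor)|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>
```

The value is a regular expression matched against plugin ids, which is why "feed" is added as one more alternative rather than as a separate property.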

Have a look at:
 http://lucene.472066.n3.nabble.com/RSS-parser-td3719558.html
and also:
 https://issues.apache.org/jira/browse/NUTCH-1053
 https://issues.apache.org/jira/browse/NUTCH-887

Sebastian
