2010-10-12 19:54:19,976 WARN parse.ParserFactory - ParserFactory:Plugin: org.apache.nutch.parse.html.HtmlParser mapped to contentType application/xhtml+xml via parse-plugins.xml, but its plugin.xml file does not claim to support contentType: application/xhtml+xml

2010-10-12 19:54:19,991 WARN parse.ParseUtil - Unable to successfully parse content http://www.lucidimagination.com/ of type application/xhtml+xml

2010-10-12 19:54:19,991 WARN fetcher.Fetcher - Error parsing: http://www.lucidimagination.com/: failed(2,200): org.apache.nutch.parse.ParseException: Unable to successfully parse content

I am trying to crawl http://www.lucidimagination.com/ with Nutch 1.2. I tried both the Tika and HTML parsers (the log above is from the HTML parser), but neither works.

Any suggestions?
