Hi Cam,

It means that the parse-plugins.xml file maps the content type 
application/xhtml+xml to the HTML parser, but the parser's own 
plugin.xml file does not declare application/xhtml+xml as one of 
its supported MIME types.
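
Roughly, the two declarations that have to agree look like the sketch 
below. This assumes a stock Nutch 1.x layout; the plugin id parse-html 
and the contentType value shown are assumptions, so check them against 
the files in your own installation.

```xml
<!-- conf/parse-plugins.xml: maps a content type to a parser plugin -->
<mimeType name="application/xhtml+xml">
  <plugin id="parse-html" />
</mimeType>

<!-- plugins/parse-html/plugin.xml: what the parser itself claims to support.
     Listing application/xhtml+xml here (assumed value) makes the two files agree. -->
<implementation id="org.apache.nutch.parse.html.HtmlParser"
                class="org.apache.nutch.parse.html.HtmlParser">
  <parameter name="contentType" value="text/html|application/xhtml+xml"/>
</implementation>
```

If the contentType parameter in plugin.xml lacks the type that 
parse-plugins.xml maps to the plugin, the warning is logged.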

Cheers,
Chris

On Jul 18, 2011, at 4:04 PM, Cam Bazz wrote:

> What does the following log mean:
> 
> 2011-07-19 01:00:07,034 WARN  parse.ParserFactory -
> ParserFactory:Plugin: org.apache.nutch.parse.html.HtmlParser mapped to
> contentType application/xhtml+xml via parse-plugins.xml, but its
> plugin.xml file does not claim to support contentType:
> application/xhtml+xml
> 
> 
> Does that mean that my html parser is not getting part of the crawled data?
> 
> best.


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: [email protected]
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
