Hi Cam,

It means that the parse-plugins.xml file maps the content type application/xhtml+xml to the HTML parser, but the parser's own plugin.xml file does not declare application/xhtml+xml as a MIME type it supports.
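To make the mismatch concrete, here is a rough Python sketch of that kind of consistency check. It is only an illustration: the two XML snippets are trimmed-down, hypothetical stand-ins for parse-plugins.xml and the plugin's plugin.xml, the function names are made up, and treating the contentType value as a "|"-separated list is an assumption, not necessarily how ParserFactory does the comparison.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-in for conf/parse-plugins.xml.
PARSE_PLUGINS_XML = """
<parse-plugins>
  <mimeType name="text/html">
    <plugin id="parse-html"/>
  </mimeType>
  <mimeType name="application/xhtml+xml">
    <plugin id="parse-html"/>
  </mimeType>
</parse-plugins>
"""

# Hypothetical, trimmed-down stand-in for the parser's plugin.xml.
# Note that it only claims text/html.
PLUGIN_XML = """
<plugin id="parse-html">
  <extension point="org.apache.nutch.parse.Parser">
    <implementation id="org.apache.nutch.parse.html.HtmlParser">
      <parameter name="contentType" value="text/html"/>
    </implementation>
  </extension>
</plugin>
"""


def mapped_types(parse_plugins_xml, plugin_id):
    """MIME types that the mapping file routes to the given plugin id."""
    root = ET.fromstring(parse_plugins_xml)
    return [
        mime.get("name")
        for mime in root.iter("mimeType")
        if any(p.get("id") == plugin_id for p in mime.iter("plugin"))
    ]


def claimed_types(plugin_xml):
    """MIME types the plugin itself claims via its contentType parameter.

    Assumption: several types may be listed in one value, separated by '|'.
    """
    root = ET.fromstring(plugin_xml)
    claimed = set()
    for param in root.iter("parameter"):
        if param.get("name") == "contentType":
            claimed.update(param.get("value", "").split("|"))
    return claimed


def unclaimed_mappings(parse_plugins_xml, plugin_xml):
    """Types mapped to the plugin that its plugin.xml does not claim."""
    plugin_id = ET.fromstring(plugin_xml).get("id")
    claimed = claimed_types(plugin_xml)
    return [t for t in mapped_types(parse_plugins_xml, plugin_id)
            if t not in claimed]


if __name__ == "__main__":
    for mime_type in unclaimed_mappings(PARSE_PLUGINS_XML, PLUGIN_XML):
        print("mapped via parse-plugins.xml but not claimed in plugin.xml:",
              mime_type)
```

With these sample snippets the only type reported is application/xhtml+xml, which is the same situation the warning describes.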
Cheers,
Chris

On Jul 18, 2011, at 4:04 PM, Cam Bazz wrote:

> What does the following log mean:
>
> 2011-07-19 01:00:07,034 WARN parse.ParserFactory -
> ParserFactory:Plugin: org.apache.nutch.parse.html.HtmlParser mapped to
> contentType application/xhtml+xml via parse-plugins.xml, but its
> plugin.xml file does not claim to support contentType:
> application/xhtml+xml
>
>
> Does that mean that my html parser is not getting part of the crawled data?
>
> best.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

