Yes, this is correct, but we still don't test for either of the two.

On Wed, Feb 15, 2012 at 10:59 AM, Julien Nioche <[email protected]> wrote:
> The mimetype is not the same thing as the encoding. As Ken pointed out
> this is done at the individual parser level
>
> On 14 February 2012 23:51, Markus Jelsma <[email protected]> wrote:
>
>> Hi,
>>
>> This was indeed an issue until today. The detected type is in the crawl
>> datum metadata.
>>
>> https://issues.apache.org/jira/browse/NUTCH-1259
>>
>> > Hi,
>> >
>> > I can't see anywhere within our parser plugins where we detect encoding
>> > of documents. I've also begun looking through the o.a.n.p package but
>> > again I can't see anything.
>> >
>> > Can anyone provide some detail on this please?
>> >
>> > Thank you
>> >
>> > Lewis
>
> --
> Open Source Solutions for Text Engineering
>
> http://digitalpebble.blogspot.com/
> http://www.digitalpebble.com
> http://twitter.com/digitalpebble

--
*Lewis*

