Sorry, I keep forgetting. I'm using the Nutch 2.x branch as of last week. However, there hasn't been a change to the filter in a month or so.
It was parsed correctly as far as I can tell. I'm seeing the same content in Solr as what I see in the browser.

On Mon, Aug 13, 2012 at 9:19 AM, Markus Jelsma <[email protected]> wrote:

> Strange, no content type, that should not happen. Anyway, you can open an
> issue in Jira for this.
> Please mention your Nutch version.
>
> I cannot replicate it with trunk.
>
> Also, is it being parsed at all?
>
> -----Original message-----
> > From:Bai Shen <[email protected]>
> > Sent: Mon 13-Aug-2012 15:12
> > To: [email protected]
> > Subject: MoreIndexingFilter plugin failing with NPE
> >
> > MoreIndexingFilter is failing with an NPE when trying to index
> > http://spiderbites.nytimes.com/
> >
> > The contentType comes back as null. There is a check for this in order to
> > determine which MIME command to run.
> >
> > However, when you check to see if the content type needs to be split into
> > sub parts, there is no check and it throws an NPE.
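
For the Jira issue, the missing guard described in the original report might look roughly like the sketch below. This is not the actual MoreIndexingFilter code: the class name ContentTypeParts and the method splitContentType are hypothetical, and the only assumption is that the filter splits a "type/subtype" string into its parts without first checking for null.

```java
public class ContentTypeParts {

    // Hypothetical helper, not part of Nutch: splits a content type such as
    // "text/html; charset=UTF-8" into its primary type and subtype.
    // Returns an empty array instead of throwing when the value is missing.
    static String[] splitContentType(String contentType) {
        if (contentType == null) {
            return new String[0];
        }
        // Drop any parameters after the first ';' (for example the charset).
        String mime = contentType.split(";", 2)[0].trim();
        if (mime.isEmpty()) {
            return new String[0];
        }
        return mime.split("/");
    }

    public static void main(String[] args) {
        System.out.println(splitContentType(null).length);
        System.out.println(String.join(",", splitContentType("text/html; charset=UTF-8")));
    }
}
```

With a guard like this, a page that comes back without a content type would simply skip the sub-part fields rather than abort indexing.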

