Manish - you're in luck. Nutch 1.12 was released and has Boilerpipe support. Check: https://issues.apache.org/jira/browse/NUTCH-961
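A minimal sketch of what enabling it might look like in conf/nutch-site.xml. This assumes the property names tika.extractor and tika.extractor.boilerpipe.algorithm as listed in nutch-default.xml, and that HTML is parsed by parse-tika rather than parse-html (via plugin.includes and parse-plugins.xml). Please verify both against the nutch-default.xml shipped with your version:

```xml
<!-- Sketch only: check property names and values against your nutch-default.xml -->
<property>
  <name>tika.extractor</name>
  <!-- assumed values: none (default) or boilerpipe -->
  <value>boilerpipe</value>
</property>
<property>
  <name>tika.extractor.boilerpipe.algorithm</name>
  <!-- assumed values: DefaultExtractor, ArticleExtractor or CanolaExtractor -->
  <value>ArticleExtractor</value>
</property>
```

Boilerpipe works heuristically on text blocks, so it is worth checking a few parsed pages to confirm that the header and footer are really dropped for your site.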
Markus

-----Original message-----
> From: Manish Verma <[email protected]>
> Sent: Tuesday 28th June 2016 23:46
> To: [email protected]
> Subject: Remove Header from content
>
> Hi,
>
> I don’t want to index the header and footer of content. I know we can make
> changes in HtmlParser.java, but I don’t want to change Nutch core code. Is
> there any other way (plugin) to eliminate the header div from content?
>
> Thanks MV

