Hey Kiran, drop me a line before you start; I will give it a try tomorrow (I hope).
--Roland

On 04.03.2013 14:13, kiran (JIRA) wrote:
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592179#comment-13592179 ]

kiran commented on NUTCH-961:
-----------------------------

No Roland, not yet. I just switched to using the 1.x series, but I will try porting this to 2.x this week.

Expose Tika's boilerpipe support
--------------------------------

Key: NUTCH-961
URL: https://issues.apache.org/jira/browse/NUTCH-961
Project: Nutch
Issue Type: New Feature
Components: parser
Reporter: Markus Jelsma
Assignee: Markus Jelsma
Fix For: 1.7
Attachments: BoilerpipeExtractorRepository.java, NUTCH-961-1.3-3.patch, NUTCH-961-1.3-tikaparser1.patch, NUTCH-961-1.3-tikaparser.patch, NUTCH-961-1.4-dombuilder-1.patch, NUTCH-961-1.5-1.patch, NUTCH-961v2.patch

Tika 0.8 comes with the Boilerpipe content handler which can be used to extract boilerplate content from HTML pages. We should see how we can expose Boilerplate in the Nutch configuration.

