[ https://issues.apache.org/jira/browse/TIKA-673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047817#comment-13047817 ]

Matt Parker commented on TIKA-673:
----------------------------------

http://code.google.com/p/boilerpipe/

> BoilerPipe Integration
> ----------------------
>
>                 Key: TIKA-673
>                 URL: https://issues.apache.org/jira/browse/TIKA-673
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>            Reporter: Matt Parker
>
> Found a library that might be worth considering for integration into your 
> package. It provides one of the best open source text extraction algorithms 
> to find the main text within an HTML page.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira