[ 
https://issues.apache.org/jira/browse/TIKA-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14218582#comment-14218582
 ] 

Ben McCann commented on TIKA-1484:
----------------------------------

Yes, it turns out I can exclude Boilerpipe. I wasn't sure at first because 
I didn't know how Tika was using it. I had to check out and read the source 
code to determine whether excluding it was safe.

I don't have any candidates for replacing it.

Maybe it could be moved to a separate boilerpipe-parser project?
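
For anyone who wants to confirm the clash before excluding anything, here is a 
rough sketch that lists every classpath location providing a given class. It 
is not taken from Tika or Boilerpipe. The default class name 
org.cyberneko.html.HTMLElements is only my assumption about one of the 
bundled classes, so pass the real name as an argument; ClasspathDuplicates 
and locate are made-up names for illustration.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

public class ClasspathDuplicates {

    /** Returns every location visible to the loader that provides the given class. */
    static List<URL> locate(String className, ClassLoader loader) throws IOException {
        // Class files are looked up as resources: a.b.C -> a/b/C.class
        String resource = className.replace('.', '/') + ".class";
        return Collections.list(loader.getResources(resource));
    }

    public static void main(String[] args) throws IOException {
        // Assumed default; the issue does not name the two bundled classes.
        String name = args.length > 0 ? args[0] : "org.cyberneko.html.HTMLElements";
        List<URL> hits = locate(name, ClasspathDuplicates.class.getClassLoader());
        for (URL hit : hits) {
            System.out.println(hit);
        }
        if (hits.size() > 1) {
            // More than one jar ships the class; which one wins depends on classpath order.
            System.out.println("Duplicate definitions found for " + name);
        }
    }
}
```

Run it with the same classpath as the application. If more than one URL is 
printed, the class is supplied by several jars, and the one that is loaded 
depends on classpath ordering.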

> Boilerpipe dependency is evil
> -----------------------------
>
>                 Key: TIKA-1484
>                 URL: https://issues.apache.org/jira/browse/TIKA-1484
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.6
>            Reporter: Ben McCann
>
> The Boilerpipe project bundles inside it two classes from org.cyberneko.html. 
> We're already using NekoHTML in our project. Depending on which library shows 
> up on our classpath, certain parts of our project will either work or not. I'd 
> really love it if Boilerpipe could be fixed or replaced with some other 
> library that is a better citizen.
> I see I'm not the first person to run into this as another Tika user has 
> filed a bug on the Boilerpipe project: 
> https://code.google.com/p/boilerpipe/issues/detail?id=62



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
