Hi all,

I asked this on the Tika user list, but I want to bring it up here as well:

The parse-tika plugin is appealing because it can use Boilerpipe. However,
unlike parse-html, it doesn't parse <script> tags as outlinks. Does anyone
know of a good reason parse-tika *shouldn't* parse <script src="."> tags as
outlinks? If not, I'll propose adding this functionality to Tika's
LinkContentHandler.
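
For concreteness, the kind of handling in question might look like the rough sketch below. It uses only the JDK's SAX API on well-formed XHTML; the class name ScriptSrcCollector and its extract helper are hypothetical and are not part of Tika or Nutch, and this is not how LinkContentHandler is actually written.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler for illustration only; NOT Tika's LinkContentHandler.
public class ScriptSrcCollector extends DefaultHandler {
    private final List<String> scriptLinks = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // The parser created below is not namespace-aware, so the tag name arrives in qName.
        if ("script".equalsIgnoreCase(qName)) {
            String src = atts.getValue("src");
            // Inline scripts have no src attribute and are skipped.
            if (src != null && !src.isEmpty()) {
                scriptLinks.add(src);
            }
        }
    }

    public List<String> getScriptLinks() {
        return scriptLinks;
    }

    // Convenience helper (an assumption of this sketch): parse an XHTML string
    // and return the src value of every <script src="..."> element found.
    public static List<String> extract(String xhtml) throws Exception {
        ScriptSrcCollector handler = new ScriptSrcCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xhtml)), handler);
        return handler.getScriptLinks();
    }

    public static void main(String[] args) throws Exception {
        String page = "<html><head><script src=\"js/app.js\"></script>"
                + "<script>var inline = 1;</script></head><body/></html>";
        System.out.println(extract(page));
    }
}
```

Only the external script is collected; a real implementation would also need to resolve the src value against the page's base URL before emitting it as an outlink.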

Thanks,

Joe
