Hi all,
I asked this on the Tika user list, but I want to bring it up here as well.

The parse-tika plugin is appealing because it can use Boilerpipe. However, unlike parse-html, it doesn't extract outlinks from <script> tags. Does anyone know of a good reason why parse-tika *shouldn't* treat <script src="."> tags as outlinks? If not, I'll propose adding this functionality to Tika's LinkContentHandler.

Thanks,
Joe

