Hello - this is a good question and I probably agree. I've just read your conversation with Ken on the Tika list and the associated Jira issue, and will get back to you tomorrow. The bottom line: if I missed script links in the patch, it is my mistake and we should correct it.
M.

-----Original message-----
> From: Joseph Naegele <[email protected]>
> Sent: Tuesday 5th April 2016 21:45
> To: [email protected]
> Subject: collect script tags using parse-tika
>
> Hi all,
>
> I asked this on the Tika user list, but I want to bring it up here as well:
>
> The parse-tika plugin is appealing because it offers the ability to use
> Boilerpipe, however it doesn't parse <script> tags as outlinks like
> parse-html does. Does anyone know of a good reason parse-tika *shouldn't*
> parse <script src="."> tags as outlinks? If not, I'll propose adding this
> functionality to Tika's LinkContentHandler.
>
> Thanks,
> Joe

