Howie Wang wrote:
I think you have to hack the parse-html plugin. Look at the getOutlinks method in DOMContentUtils.java. You'll probably have to look for targets that start with "javascript:" and do some string replacing.

The latest SVN version already has a JavaScript link extractor (JSParseFilter in the parse-js plugin). It currently extracts JS snippets from HTML event attributes (onload, onclick, onmouseover, etc.) and, of course, from <script> elements. The only thing missing for your case is a clause that handles values starting with "javascript:" in any other attribute.
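
To illustrate the kind of clause I mean, here is a rough standalone sketch. It is not the actual JSParseFilter code: the class name, the method name and the "looks like a URL" test are all made up for this example. It checks whether an attribute value starts with "javascript:" and, if so, pulls out quoted string literals that look like links.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not part of Nutch: all names here are illustrative.
public class JavascriptHrefSketch {
    private static final String PREFIX = "javascript:";

    // Matches single- or double-quoted string literals (no escape handling).
    private static final Pattern STRING_LITERAL =
            Pattern.compile("'([^']*)'|\"([^\"]*)\"");

    // Returns quoted strings from a "javascript:" attribute value that
    // look like links; returns an empty list for any other value.
    public static List<String> extractCandidateLinks(String attrValue) {
        List<String> links = new ArrayList<>();
        if (attrValue == null) {
            return links;
        }
        String value = attrValue.trim();
        // Case-insensitive prefix check.
        if (!value.regionMatches(true, 0, PREFIX, 0, PREFIX.length())) {
            return links;
        }
        Matcher m = STRING_LITERAL.matcher(value.substring(PREFIX.length()));
        while (m.find()) {
            String literal = m.group(1) != null ? m.group(1) : m.group(2);
            // Assumed heuristic: keep absolute URLs and root-relative paths only.
            if (literal.startsWith("http://")
                    || literal.startsWith("https://")
                    || literal.startsWith("/")) {
                links.add(literal);
            }
        }
        return links;
    }

    public static void main(String[] args) {
        String href = "javascript:window.open('http://example.com/page.html', 'popup')";
        System.out.println(extractCandidateLinks(href));
    }
}
```

A real implementation would also need to resolve relative paths against the page's base URL and cope with escaped quotes, which this sketch ignores.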

I can make this change. Watch the commit messages so that you know when to sync your source.

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
