[ https://issues.apache.org/jira/browse/NUTCH-426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrzej Bialecki closed NUTCH-426.
-----------------------------------

       Resolution: Fixed
    Fix Version/s: 0.9.0
         Assignee: Andrzej Bialecki

Fixed in rev. 493085. Thank you!

> parse-js skips parsing if found URL fails java.net.URL parse
> ------------------------------------------------------------
>
>                 Key: NUTCH-426
>                 URL: https://issues.apache.org/jira/browse/NUTCH-426
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 0.9.0
>            Reporter: [EMAIL PROTECTED]
>         Assigned To: Andrzej Bialecki
>            Priority: Minor
>             Fix For: 0.9.0
>
>         Attachments: nutch426.patch
>
>
> The parse-js plugin in getJSLinks tries a regex looking for likely URLs
> against a string of javascript. Any matches that do not begin with 'www' are
> given to java.net.URL with the base URL to test 'URLness'. Often this test will
> fail. Currently, when it fails, nutch skips processing any more of the
> javascript snippet and logs an error.
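
For readers without the patch at hand, the behaviour described in the report can be sketched roughly as follows. This is not the code from nutch426.patch or from the parse-js plugin: the class name JsLinkSketch, the method extractLinks, the simplified regex, and the "http://" prefix for 'www' matches are all assumptions made for illustration. The only point it shows is that a candidate rejected by java.net.URL is skipped on its own, so the rest of the javascript snippet is still scanned.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; names and regex are hypothetical, not Nutch's.
class JsLinkSketch {
    // Assumed, simplified pattern: any quoted token that contains no whitespace.
    private static final Pattern CANDIDATE =
            Pattern.compile("[\"']([^\"'\\s]+)[\"']");

    static List<String> extractLinks(String js, String base) {
        URL baseUrl;
        try {
            baseUrl = new URL(base);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("bad base URL: " + base, e);
        }
        List<String> links = new ArrayList<>();
        Matcher m = CANDIDATE.matcher(js);
        while (m.find()) {
            String candidate = m.group(1);
            try {
                // Assumption: 'www' matches are accepted with an http:// prefix;
                // everything else is resolved against the base URL.
                URL url = candidate.startsWith("www.")
                        ? new URL("http://" + candidate)
                        : new URL(baseUrl, candidate);
                links.add(url.toString());
            } catch (MalformedURLException e) {
                // Skip only this candidate and keep scanning the snippet,
                // instead of abandoning the remaining matches.
                continue;
            }
        }
        return links;
    }

    public static void main(String[] args) {
        String js = "var a = 'javascript:void(0)'; "
                + "var b = \"/about.html\"; var c = 'www.example.org/x';";
        System.out.println(extractLinks(js, "http://example.com/index.html"));
    }
}
```

In the sample input, the first candidate has a protocol java.net.URL does not know, so it is dropped, while the two candidates after it are still returned. If the catch block returned or rethrew instead, those later links would be lost, which matches the behaviour reported above.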