lfoppiano opened a new pull request, #923: URL: https://github.com/apache/nutch/pull/923
This PR fixes https://issues.apache.org/jira/browse/NUTCH-3180. It hardens Nutch's IDNA2008 hostname conversion so that unchecked ICU exceptions (e.g., from UTS46/Punycode) are converted into MalformedURLException, allowing callers like BasicURLNormalizer/URLCleaner to reject bad URLs instead of crashing a task.

Changes:
- Wrap ICU IDNA.nameToASCII/nameToUnicode in URLUtil.convertIDNA2008 with a try/catch that translates unchecked ICU/UTS46/Punycode exceptions into MalformedURLException (with the cause attached).
- Add regression tests intended to cover exceptions thrown by the underlying library.
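The wrapping pattern described in this PR can be sketched roughly as follows. This is not the code from the patch: to stay runnable without ICU4J on the classpath, it substitutes the JDK's java.net.IDN (IDNA2003) for ICU's IDNA class, and the class name IdnaGuardSketch and the method toAsciiHost are hypothetical names chosen for illustration. The idea is the same: catch the unchecked exception raised during hostname conversion and rethrow it as a checked MalformedURLException with the original cause attached.

```java
import java.net.IDN;
import java.net.MalformedURLException;
import java.util.Arrays;

// Illustrative sketch only; names are hypothetical and not taken from Nutch.
final class IdnaGuardSketch {

    // Converts a hostname to its ASCII form. java.net.IDN stands in for
    // ICU's IDNA here (assumption made so the example needs only the JDK).
    static String toAsciiHost(String host) throws MalformedURLException {
        try {
            return IDN.toASCII(host);
        } catch (RuntimeException e) {
            // Translate the unchecked failure into the checked exception
            // that URL-handling callers already expect, keeping the cause.
            MalformedURLException mue =
                new MalformedURLException("Invalid hostname: " + e.getMessage());
            mue.initCause(e);
            throw mue;
        }
    }

    // Builds a hostname whose first label has 64 characters, one more than
    // the 63-character limit for a DNS label.
    static String hostWithOverlongLabel() {
        char[] label = new char[64];
        Arrays.fill(label, 'a');
        return new String(label) + ".example";
    }

    public static void main(String[] args) {
        try {
            // prints "xn--bcher-kva.example"
            System.out.println(toAsciiHost("b\u00fccher.example"));
            toAsciiHost(hostWithOverlongLabel());
            System.out.println("unexpectedly accepted");
        } catch (MalformedURLException e) {
            // The caller can now reject the URL instead of failing the task.
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A caller such as a URL normalizer can then treat the MalformedURLException like any other malformed-URL case and skip the record, which is the behavior the PR description aims for.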

