GitHub user justinleet commented on the issue:

https://github.com/apache/incubator-metron/pull/402

@cestella Do you know of an example URL that isn't a URL in Java, e.g. with the square brackets issue? I didn't see one in that thread, but I could have just missed it. Are we sure commons validator wouldn't have the same issue? I'm fine with validating them and then parsing if we're stuck dealing with Java implementation quirks.

Specifically on the square brackets issue (and this is way out of my wheelhouse, so forgive my potential ignorance), it sounds like these characters, along with some others, qualify as "unwise" under RFC 2396 (https://www.ietf.org/rfc/rfc2396.txt) and should be escaped, which is why Java is strict about them. Could we appropriately escape the offending characters and end up with a correct URI? If commons has something to validate a URL, is there something to do that escaping for us?

If that works, do we potentially want to be doing operations on URIs instead of URLs, e.g. `URI_TO_PATH`, `URI_TO_HOST`? Obviously, this raises the question of what we do with the existing URL* functions and how the two sets would coexist.

In terms of test cases, I'd want to see something with the unwise character issue (but I don't have an example of that myself). I can help you try to dig one up if you want. In addition, I'd like to see something with a fragment defined (e.g. `http://java.sun.com/index.html#chapter1`) to make sure that the functions work appropriately (mostly just tests for `TO_PATH`).
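To make the escaping and fragment questions concrete, here is a minimal sketch using only the JDK's `java.net.URI`. It is not Metron code and does not use commons validator; the class name `UriStrictnessSketch`, the helper names, and the `example.com` address are all made up for illustration. It assumes the problem case is square brackets in the path component, which may not match the case from the other thread.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch class, not part of Metron.
class UriStrictnessSketch {

    // True if the single-string java.net.URI constructor accepts the input.
    static boolean parsesAsUri(String s) {
        try {
            new URI(s);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // Builds a URI from separate components. Unlike the single-string
    // constructor, the multi-argument constructors percent-encode
    // characters that are illegal in the given component.
    static URI fromComponents(String scheme, String authority, String path) {
        try {
            return new URI(scheme, authority, path, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed example: unescaped square brackets in the path.
        System.out.println(parsesAsUri("http://example.com/a[1]/b"));
        // The same address with the brackets percent-encoded by hand.
        System.out.println(parsesAsUri("http://example.com/a%5B1%5D/b"));

        // Letting the JDK do the escaping for the path component.
        URI escaped = fromComponents("http", "example.com", "/a[1]/b");
        System.out.println(escaped.getRawPath()); // encoded form
        System.out.println(escaped.getPath());    // decoded form

        // Fragment case from the comment above: the fragment is reported
        // separately and is not part of the path.
        URI withFragment = URI.create("http://java.sun.com/index.html#chapter1");
        System.out.println(withFragment.getPath());
        System.out.println(withFragment.getFragment());
    }
}
```

If this holds up, the escaping question might reduce to whether we can reliably split a raw string into components before handing it to `URI`, which is the hard part this sketch skips.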