[ https://issues.apache.org/jira/browse/DROIDS-83?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839019#action_12839019 ]
Bertil Chapuis commented on DROIDS-83:
--------------------------------------

It seems that the other constructors provided by the URI class encode the string automatically. What about using a regex in such a case?

    Pattern pattern = Pattern.compile("(.*)://(.*)(/.*)\\?(.*)#(.*)");
    Matcher matcher = pattern.matcher("http://www.test.com/test asdf?blabliblablo=bloublu#bbédu");
    matcher.find();
    String scheme = matcher.group(1);
    String host = matcher.group(2);
    String path = matcher.group(3);
    String query = matcher.group(4);
    String fragment = matcher.group(5);
    URI uri = new URI(scheme, host, path, query, fragment);

Best regards.

> LinkExtractor doesn't handle spaces in URI
> ------------------------------------------
>
>                 Key: DROIDS-83
>                 URL: https://issues.apache.org/jira/browse/DROIDS-83
>             Project: Droids
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.01
>            Reporter: Richard Frovarp
>         Attachments: link-whitespace-fix.patch
>
>
> Links with spaces aren't properly handled by the LinkExtractor. java.net.URI
> expects valid URIs, and spaces aren't allowed. Therefore, before resolving
> links, the URIs need to be cleaned up to what browsers can handle. This at
> least includes handling spaces.
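
The regex idea from the comment could be sketched roughly as follows. This is only an illustration, not the attached patch and not Droids code: the class name LenientUri, the method parse and the pattern are hypothetical. The pattern is loosely modelled on the component regex in RFC 3986, Appendix B, so that the query and the fragment become optional. It relies on the documented behaviour of the five-argument java.net.URI constructor, which quotes illegal characters such as spaces.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Droids.
final class LenientUri {

    // scheme :// authority path [?query] [#fragment]
    // Assumption: only absolute, hierarchical links are handled here.
    private static final Pattern PARTS = Pattern.compile(
            "^([^:/?#]+)://([^/?#]*)([^?#]*)(?:\\?([^#]*))?(?:#(.*))?$");

    static URI parse(String raw) throws URISyntaxException {
        Matcher m = PARTS.matcher(raw.trim());
        if (!m.matches()) {
            throw new URISyntaxException(raw, "not an absolute hierarchical URI");
        }
        // Groups 4 and 5 are null when the link has no query or fragment;
        // the constructor accepts null for those components.
        return new URI(m.group(1), m.group(2), m.group(3), m.group(4), m.group(5));
    }

    public static void main(String[] args) throws URISyntaxException {
        // The space in the path is quoted as %20 by the constructor.
        System.out.println(parse("http://www.test.com/test asdf?blabliblablo=bloublu#frag"));
    }
}
```

One caveat with this approach: the multi-argument URI constructors always quote the percent character, so a link that is already partly encoded (for example one containing %20) would end up double-encoded. Relative links are not matched by this pattern and would need separate handling.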