[ https://issues.apache.org/jira/browse/NUTCH-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Nagel resolved NUTCH-2145.
------------------------------------
       Resolution: Fixed
    Fix Version/s: 1.15

> parse/index checker fail to fetch valid percent-encoded URLs
> ------------------------------------------------------------
>
>                 Key: NUTCH-2145
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2145
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 2.3, 1.11
>            Reporter: Sebastian Nagel
>            Priority: Major
>             Fix For: 1.15
>
>
> Parsechecker and indexchecker fail to fetch valid URLs containing
> percent-encoded characters. The percent-encoding is broken by escaping %
> again:
> {noformat}
> % bin/nutch parsechecker 'https://de.wikipedia.org/wiki/%C3%84sop'
> fetching: https://de.wikipedia.org/wiki/%25C3%2584sop
> Fetch failed with protocol status: gone(11), lastModified=0:
> https://de.wikipedia.org/wiki/%25C3%2584sop
> {noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
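
For illustration only, the double escaping reported above can be reproduced outside Nutch. The sketch below is in Python, not Nutch's Java code, and does not show the fix that went into 1.15. The function names escape_naively and escape_preserving are hypothetical. It assumes the bug amounts to percent-encoding a URL that is already percent-encoded, so that each % turns into %25.

```python
from urllib.parse import quote, unquote

# URL taken from the report; it is already percent-encoded (%C3%84 is "Ä" in UTF-8).
URL = "https://de.wikipedia.org/wiki/%C3%84sop"


def escape_naively(url: str) -> str:
    """Hypothetical helper: escape without checking for existing escapes.

    '%' is not in the safe set, so every existing escape sequence
    is escaped a second time ('%' becomes '%25').
    """
    return quote(url, safe=":/")


def escape_preserving(url: str) -> str:
    """Hypothetical helper: decode first, then encode exactly once.

    Caveat: this is only safe when the URL contains no escaped reserved
    characters (such as %2F), because decoding would change their meaning.
    """
    return quote(unquote(url), safe=":/")


if __name__ == "__main__":
    print(escape_naively(URL))     # same broken form as in the log above
    print(escape_preserving(URL))  # unchanged, still fetchable
```

The first call yields https://de.wikipedia.org/wiki/%25C3%2584sop, the URL shown in the failing fetch. The second call returns the input unchanged, which is the behaviour one would expect from a checker given a valid percent-encoded URL.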