Markus Jelsma created NUTCH-1711:
------------------------------------

             Summary: Normalizer does not encode exclamation mark
                 Key: NUTCH-1711
                 URL: https://issues.apache.org/jira/browse/NUTCH-1711
             Project: Nutch
          Issue Type: Bug
    Affects Versions: 1.7
            Reporter: Markus Jelsma
            Assignee: Markus Jelsma
             Fix For: 1.8

{code}
$ bin/nutch org.apache.nutch.net.URLNormalizerChecker
Checking combination of all URLNormalizers available
http://nutch.apache.org/bla!
http://nutch.apache.org/bla!
{code}

I had not noticed until now that many URL encoders do not encode the exclamation mark. SolrCloud uses this character as the delimiter in composite IDs, so an ID that ends with an exclamation mark causes an error.

Any thoughts on this?



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
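
To illustrate the behaviour the report asks for, here is a minimal sketch in Python rather than Nutch's own Java. The helper name `encode_exclamation` is hypothetical, and the choice to rewrite only the path and query is an assumption. This is not the fix applied to Nutch. It only shows the expected input and output for the URL from the checker run above.

```python
from urllib.parse import urlsplit, urlunsplit


def encode_exclamation(url: str) -> str:
    """Percent-encode '!' in the path and query of a URL (hypothetical helper)."""
    parts = urlsplit(url)
    # '!' is a legal sub-delimiter in URLs, so most encoders leave it alone;
    # replace it explicitly with its percent-encoded form.
    path = parts.path.replace("!", "%21")
    query = parts.query.replace("!", "%21")
    # Scheme, host and fragment are passed through unchanged.
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


print(encode_exclamation("http://nutch.apache.org/bla!"))
```

With a rule like this in the normalizer chain, the URL would presumably no longer end in the character that SolrCloud treats as the composite ID delimiter.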