Hi, I think this thread should be useful: http://lucene.472066.n3.nabble.com/Parsed-content-in-form-of-special-characters-td4047239.html
Thanks & Regards
Rajani Maski

On Sun, Apr 7, 2013 at 4:56 AM, Jun Zhou <[email protected]> wrote:
> Hi all,
>
> I'm using nutch 1.6 to crawl a web site which have lots of special
> characters in the url, like "?,=@" etc. For each character, I can add a
> regex in the regex-normalize.xml to change it into percent encoding.
>
> My question is, is there an easier way to do this? Like a url-encode method
> to encode all the special characters rather than add regex one by one?
>
> Thanks!
>
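
For illustration only, here is a minimal sketch of the "encode everything in one call" idea from the question, as opposed to one regex per character. It is written in Python with the standard library's urllib.parse.quote and is not Nutch code. The helper name percent_encode is hypothetical, and hooking something equivalent into Nutch (for example through a custom URL normalizer) is an assumption that this sketch does not cover.

```python
from urllib.parse import quote


def percent_encode(text, safe=""):
    """Percent-encode every character except letters, digits and "_.-~".

    `safe` lists extra characters to leave untouched (none by default),
    so "?", ",", "=" and "@" are all encoded in a single call.
    """
    return quote(text, safe=safe)


if __name__ == "__main__":
    # The characters mentioned in the question, encoded in one pass.
    print(percent_encode("?,=@"))
    # Keep "/" literal when encoding a path-like string.
    print(percent_encode("/a b/c@d", safe="/"))
```

Note that encoding a whole URL this way would also encode the "?" and "=" that separate the query parameters, so in practice it is probably safer to apply it only to individual path segments or parameter values.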

