Hi guys, I have been watching this thread intently and I am very happy to see that there is some progress :0)
Radim,

Can I ask that you open a JIRA issue and submit a patch? This way we can not only track it, but it will also give the community a chance to test and validate the patch prior to integration into the source.

Thanks
Lewis

On Fri, Oct 7, 2011 at 5:49 PM, Ramanathapuram, Rajesh <[email protected]> wrote:

> Hi Radim,
>
> Thank you so much for this. I am not familiar with the commit process to the
> core.
> Is there someone who can help us get this committed and help resolve this
> issue?
>
> Thanks for all your help.
>
> Rajesh Ramana
>
> -----Original Message-----
> From: Radim Kolar [mailto:[email protected]]
> Sent: Thursday, October 06, 2011 2:18 PM
> To: [email protected]
> Subject: Re: Nutch not crawling URLs with spanish accented characters ( ñ)
>
> - The REGEX normalizer transforms the special characters, but fails to
>   substitute ‘%F1’ or ‘%C3%B1’ for ‘ñ’
> - The fetcher is having trouble interpreting the links with special
>   character ‘ñ’.
>
> I can add this transformation to the basic-url normalizer if somebody is
> willing to commit it.

--
*Lewis*
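
For anyone following along, the substitution discussed in the quoted message (‘ñ’ becoming ‘%C3%B1’ in UTF-8 or ‘%F1’ in Latin-1) can be sketched roughly as below. This is only an illustration in Python using the standard library, not Nutch code and not Radim's patch. The function name `escape_non_ascii` and the example.com URL are made up for the example.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def escape_non_ascii(url: str, encoding: str = "utf-8") -> str:
    """Percent-encode non-ASCII characters in the path and query of a URL.

    Hypothetical helper for illustration only; it is not part of Nutch.
    """
    parts = urlsplit(url)
    # Leave reserved characters and existing %XX escapes untouched,
    # so only characters such as 'ñ' are rewritten.
    safe = "/:@!$&'()*+,;=%"
    path = quote(parts.path, safe=safe, encoding=encoding)
    query = quote(parts.query, safe=safe + "?", encoding=encoding)
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


if __name__ == "__main__":
    url = "http://example.com/espa\u00f1a/index.html"
    print(escape_non_ascii(url))              # UTF-8 form
    print(escape_non_ascii(url, "latin-1"))   # Latin-1 form
```

Which of the two encodings a given server expects is site-dependent, so the choice of UTF-8 as the default here is an assumption.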

