Nutch Encoding on AWS

2012-08-08 Thread Niccolò Becchi
Hi, I have been using Nutch to fetch English sites (UTF-8 and ISO-8859-1). Everything works well in local mode or on a single-node Hadoop cluster installed on my PC. Recently I moved the crawling system to Amazon AWS, and the Fetcher has some encoding problems with special characters, they

Re: Nutch Encoding on AWS

2012-08-08 Thread X3C TECH
Not sure if it matters, but which data center are you using? Maybe the data center region uses different characters if the native language isn't English. On Wed, Aug 8, 2012 at 7:25 AM, Niccolò Becchi niccolo.bec...@gmail.com wrote: Hi, I have been using Nutch for fetching english sites (UTF-8
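
The thread snippets above do not say what caused the problem, so the following is only a diagnostic sketch, not a confirmed fix. One thing that can differ between a local machine and a cloud instance is the JVM's default charset, which on older JVMs follows the operating system locale. The class name CharsetCheck and the helper decode are hypothetical, made up for illustration. They are not part of Nutch. The sketch prints the default charset and shows how the same UTF-8 bytes look when decoded correctly and when decoded as ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only; not part of Nutch.
public class CharsetCheck {

    // Decode raw bytes with an explicitly named charset
    // instead of relying on the platform default.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // Worth comparing between the local machine and the cloud instance.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // "Niccolò" written with a Unicode escape so the source file
        // encoding does not affect the example.
        byte[] utf8 = "Niccol\u00f2".getBytes(StandardCharsets.UTF_8);

        // Correct decoding: the bytes are read as UTF-8.
        System.out.println("As UTF-8:      " + decode(utf8, StandardCharsets.UTF_8));

        // Wrong decoding: each UTF-8 byte is read as one ISO-8859-1 character,
        // so the two-byte sequence for the accented letter becomes two characters.
        System.out.println("As ISO-8859-1: " + decode(utf8, StandardCharsets.ISO_8859_1));
    }
}
```

If the default charset reported on the two environments differs, that is a hint worth following up, for example by checking the instance locale or the file.encoding system property. It does not prove that this is the cause of the Fetcher problem described in the thread.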