Thank you for the additional information, Varun! Would you mind doing something like the following:
    hadoop dfs -text THE_FILE | hexdump -C

and sharing the output? I'm trying to see the actual content of the file rather than any interpreted value.

Jarcec

On Mon, Jul 15, 2013 at 06:52:11PM -0700, varun kumar gullipalli wrote:
> Hi Jarcec,
>
> I am validating the data by running the following command:
>
>     hadoop fs -text <hdfs cluster>
>
> I think there is no issue with the shell (correct me if I am wrong) because I
> am connecting to the MySQL database from the same shell (command line) and can
> view the source data properly.
>
> Initially we observed that the following conf files don't have the UTF-8
> encoding declaration:
>
>     <?xml version="1.0" encoding="UTF-8"?>
>
> sqoop-site.xml
> sqoop-site-template.xml
>
> But no luck after making the changes either.
>
> Thanks,
> Varun
>
>
> ________________________________
> From: Jarek Jarcec Cecho <[email protected]>
> To: [email protected]; varun kumar gullipalli <[email protected]>
> Sent: Monday, July 15, 2013 6:37 PM
> Subject: Re: Sqoop - utf-8 data load issue
>
> Hi Varun,
> we usually don't see any issues with transferring text data in UTF-8. How are
> you validating the imported file? I can imagine that your shell might be
> mangling the encoding.
>
> Jarcec
>
> On Mon, Jul 15, 2013 at 06:27:25PM -0700, varun kumar gullipalli wrote:
> >
> > Hi,
> > I am importing data from MySQL to HDFS using a free-form query import.
> > It works fine, but I am facing an issue when the data is UTF-8. The
> > source (MySQL) db is UTF-8 compatible, but it looks like Sqoop is
> > converting the data during import.
> > Example - The source value - elémeñt is loaded as elémeñt to HDFS.
> > Please provide a solution for this.
> > Thanks in advance!
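[Editor's note: the hexdump diagnostic above can be tried locally first to know what to look for; the value used here is a stand-in, not Varun's actual data. Correctly encoded UTF-8 shows `c3 a9` for é and `c3 b1` for ñ, whereas data that was double-encoded (UTF-8 bytes re-encoded as if they were Latin-1) shows `c3 83 c2 a9` / `c3 83 c2 b1` instead.]

```shell
# Print the raw UTF-8 bytes of a sample value. In a healthy import,
# é appears as "c3 a9" and ñ as "c3 b1" in the hexdump output;
# "c3 83 c2 a9" in its place indicates double-encoding.
printf 'elémeñt' | hexdump -C
```

The same `| hexdump -C` pipeline appended to `hadoop dfs -text THE_FILE` reveals which of the two byte patterns actually landed in HDFS, independent of how the terminal renders it.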
