[
https://issues.apache.org/jira/browse/SQOOP-2639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ranjan Bagchi updated SQOOP-2639:
---------------------------------
Description:
I am able to import UTF-8 (non-latin1) data successfully into HDFS via:
sqoop import --connect jdbc:mysql://host/db --username XX --password YY \
--mysql-delimiters \
--table MYSQL_SRC_TABLE --target-dir ${SQOOP_DIR_PREFIX}/mysql_table \
--direct
However, using
sqoop export --connect jdbc:mysql://host/db --username XX --password YY \
--mysql-delimiters \
--table MYSQL_DEST_TABLE --export-dir ${SQOOP_DIR_PREFIX}/mysql_table \
--direct
cuts off the fields after the first non-latin1 character (e.g. a letter with an
umlaut).
I tried other options like -- --default-character-set=utf8, without success.
I was able to fix the problem with the following change:
Change
https://svn.apache.org/repos/asf/sqoop/trunk/src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java,
line 322 from
this.mysqlCharSet = MySQLUtils.MYSQL_DEFAULT_CHARSET;
to
this.mysqlCharSet = "utf-8";
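For illustration only, here is a minimal standalone sketch of why the charset used to encode the export stream matters. It is not Sqoop code: the class name CharsetDemo and the helper hex are hypothetical, and it assumes that the default charset referred to above is a Latin-1 style encoding. It shows that the same character produces different bytes under ISO-8859-1 and UTF-8, so a server expecting UTF-8 would receive bytes it cannot decode.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Sqoop.
class CharsetDemo {
    /** Returns a space-separated hex dump of text encoded with the given charset. */
    static String hex(String text, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(cs)) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String umlaut = "\u00FC"; // u with umlaut
        String kanji = "\u65E5";  // a character that Latin-1 cannot represent

        // Latin-1 encodes the umlaut as a single byte, which is not a valid
        // UTF-8 sequence on its own.
        System.out.println(hex(umlaut, StandardCharsets.ISO_8859_1)); // FC
        System.out.println(hex(umlaut, StandardCharsets.UTF_8));      // C3 BC

        // Characters outside Latin-1 are replaced with '?' (0x3F) by getBytes.
        System.out.println(hex(kanji, StandardCharsets.ISO_8859_1));  // 3F
        System.out.println(hex(kanji, StandardCharsets.UTF_8));       // E6 97 A5
    }
}
```

Whether this is exactly what happens inside the export mapper is an assumption on my part, but it is consistent with the fields being cut at the first such character.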
Hope this helps
was:
I am able to import utf-8 data (non-latin1) data successfully into HDFS via:
sqoop import --connect jdbc:mysql://host/db --username XX --password YY \
--mysql-delimiters \
--table MYSQL_SRC_TABLE --target-dir ${SQOOP_DIR_PREFIX}/mysql_table
--direct
However, using
sqoop export --connect jdbc:mysql://host/db --username XX --password YY \
--mysql-delimiters \
--table MYSQL_DEST_TABLE --export-dir ${SQOOP_DIR_PREFIX}/mysql_table \
--direct
Cuts off the fields after the first non-latin1 character (eg a letter w/ an
umlaut).
I tried other options like -- --default-character-set=utf8, without success.
I was able to fix the problem with the following change:
Change
https://svn.apache.org/repos/asf/sqoop/trunk/src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java,
line 322 from
`this.mysqlCharSet = MySQLUtils.MYSQL_DEFAULT_CHARSET;`
to
`this.mysqlCharSet = "utf-8"; `
Hope this helps
> Unable to export utf-8 data to MySQL using --direct mode
> --------------------------------------------------------
>
> Key: SQOOP-2639
> URL: https://issues.apache.org/jira/browse/SQOOP-2639
> Project: Sqoop
> Issue Type: Bug
> Components: connectors/mysql
> Affects Versions: 1.4.6
> Reporter: Ranjan Bagchi
>
> I am able to import UTF-8 (non-latin1) data successfully into HDFS via:
> sqoop import --connect jdbc:mysql://host/db --username XX --password YY \
> --mysql-delimiters \
> --table MYSQL_SRC_TABLE --target-dir ${SQOOP_DIR_PREFIX}/mysql_table \
> --direct
> However, using
> sqoop export --connect jdbc:mysql://host/db --username XX --password YY \
> --mysql-delimiters \
> --table MYSQL_DEST_TABLE --export-dir ${SQOOP_DIR_PREFIX}/mysql_table \
> --direct
> cuts off the fields after the first non-latin1 character (e.g. a letter with an
> umlaut).
> I tried other options like -- --default-character-set=utf8, without success.
> I was able to fix the problem with the following change:
> Change
> https://svn.apache.org/repos/asf/sqoop/trunk/src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java,
> line 322 from
> this.mysqlCharSet = MySQLUtils.MYSQL_DEFAULT_CHARSET;
> to
> this.mysqlCharSet = "utf-8";
> Hope this helps