[jira] [Created] (SQOOP-1245) Varchar fields encoding is corrupted during import when snappy used

Sergey (JIRA) Sat, 30 Nov 2013 04:28:22 -0800

Sergey created SQOOP-1245:
-----------------------------

             Summary: Varchar fields encoding is corrupted during import when snappy used
                 Key: SQOOP-1245
                 URL: https://issues.apache.org/jira/browse/SQOOP-1245
             Project: Sqoop
          Issue Type: Bug
    Affects Versions: 1.4.3
         Environment: CDH 4.4. 1.4.3+62
            Reporter: Sergey



Here is a MySQL table DDL:
{code}
CREATE TABLE `item_info` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `shop_id` int(11) unsigned NOT NULL,
  `internal_id` int(10) unsigned DEFAULT NULL,
  `name` varchar(1024) NOT NULL,
  `prefix` varchar(255) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1727331768 DEFAULT CHARSET=utf8
{code}

When "--as-textfile" is used, the import works perfectly.
When "--compression-codec org.apache.hadoop.io.compress.SnappyCodec" is
specified, all varchar fields are corrupted. It looks like they are encoded as
"ISO-8859-1".
So there is no way to import varchar fields containing non-ASCII characters
with compression enabled.
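
To illustrate the symptom only (this is not the Sqoop code path, which is not identified here), the sketch below shows what UTF-8 text looks like once its bytes are read as "ISO-8859-1". The sample value and variable names are made up for the example; the assumption is simply that the column data is valid UTF-8, as the table's DEFAULT CHARSET=utf8 suggests.

```python
# Hypothetical sample value; not taken from the item_info table.
original = "Кофе café"

# Assumption: the varchar content is UTF-8 encoded text.
utf8_bytes = original.encode("utf-8")

# Reading those bytes as ISO-8859-1 turns every byte into its own
# character, so each multi-byte UTF-8 sequence becomes several wrong ones.
corrupted = utf8_bytes.decode("iso-8859-1")

# Pure ASCII survives, because both encodings agree on bytes 0x00-0x7F.
ascii_only = "plain".encode("utf-8").decode("iso-8859-1")

# If nothing else altered the data, the misreading can be reversed.
recovered = corrupted.encode("iso-8859-1").decode("utf-8")

print(repr(corrupted))
print(recovered == original)
```

If the imported values match this pattern (ASCII intact, every non-ASCII character replaced by two or more Latin-1 characters), that would be consistent with the "ISO-8859-1" guess above.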





--
This message was sent by Atlassian JIRA
(v6.1#6144)
