Hi, I have some data in a text file on HDFS and I want to export this data into a MySQL database. But I want Sqoop to use "|" as the record delimiter instead of the default "\n" record delimiter.
So I am specifying the --input-lines-terminated-by "|" option in my Sqoop export command. The export succeeds, but it reports only 1 record exported, and when I check the MySQL target table, I see only one row. It looks like only the record before the first "|" is getting exported.

Sample data on HDFS:

1,Hello|2,How|3,Are|4,You|5,I|6,am|7,fine|

Sqoop export command:

bin/sqoop export --connect 'jdbc:mysql://localhost/mydb' -password pwd --username usr --table mytable --export-dir data --input-lines-terminated-by "|"

Console logs:

12/05/17 03:32:02 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
12/05/17 03:32:02 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
12/05/17 03:32:02 INFO tool.CodeGenTool: Beginning code generation
12/05/17 03:32:02 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `mytable` AS t LIMIT 1
12/05/17 03:32:03 INFO orm.CompilationManager: HADOOP_HOME is /home/tushar/hadoop-0.20.2-cdh3u4
Note: /tmp/sqoop-tushar/compile/fd6d3bfd4c2ed7f2e19a2de418993dfc/mytable.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
12/05/17 03:32:04 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-tushar/compile/fd6d3bfd4c2ed7f2e19a2de418993dfc/mytable.jar
12/05/17 03:32:04 INFO mapreduce.ExportJobBase: Beginning export of mytable
12/05/17 03:32:06 INFO input.FileInputFormat: Total input paths to process : 1
12/05/17 03:32:06 INFO input.FileInputFormat: Total input paths to process : 1
12/05/17 03:32:06 INFO mapred.JobClient: Running job: job_201205110542_0432
12/05/17 03:32:07 INFO mapred.JobClient:  map 0% reduce 0%
12/05/17 03:32:13 INFO mapred.JobClient:  map 100% reduce 0%
12/05/17 03:32:14 INFO mapred.JobClient: Job complete: job_201205110542_0432
12/05/17 03:32:14 INFO mapred.JobClient: Counters: 16
12/05/17 03:32:14 INFO mapred.JobClient:   Job Counters
12/05/17 03:32:14 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=6685
12/05/17 03:32:14 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
12/05/17 03:32:14 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
12/05/17 03:32:14 INFO mapred.JobClient:     Launched map tasks=1
12/05/17 03:32:14 INFO mapred.JobClient:     Data-local map tasks=1
12/05/17 03:32:14 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
12/05/17 03:32:14 INFO mapred.JobClient:   FileSystemCounters
12/05/17 03:32:14 INFO mapred.JobClient:     HDFS_BYTES_READ=166
12/05/17 03:32:14 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=79082
12/05/17 03:32:14 INFO mapred.JobClient:   Map-Reduce Framework
12/05/17 03:32:14 INFO mapred.JobClient:     Map input records=1
12/05/17 03:32:14 INFO mapred.JobClient:     Physical memory (bytes) snapshot=68677632
12/05/17 03:32:14 INFO mapred.JobClient:     Spilled Records=0
12/05/17 03:32:14 INFO mapred.JobClient:     CPU time spent (ms)=1130
12/05/17 03:32:14 INFO mapred.JobClient:     Total committed heap usage (bytes)=39911424
12/05/17 03:32:14 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=392290304
12/05/17 03:32:14 INFO mapred.JobClient:     Map output records=1
12/05/17 03:32:14 INFO mapred.JobClient:     SPLIT_RAW_BYTES=117
12/05/17 03:32:14 INFO mapreduce.ExportJobBase: Transferred 166 bytes in 9.6013 seconds (17.2893 bytes/sec)
12/05/17 03:32:14 INFO mapreduce.ExportJobBase: Exported 1 records.

On the MySQL side:

mysql> select * from mytable;
+------+-------+
| i    | name  |
+------+-------+
|    1 | Hello |
+------+-------+
1 row in set (0.00 sec)

Sqoop version: sqoop-1.4.1-incubating__hadoop-1.0.0
Hadoop version: CDH3u4

Doesn't Sqoop support any record delimiter other than "\n", or am I missing something? Please suggest a solution.

Thanks,
Tushar
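In case it helps, here is the workaround I am considering (my own idea, not something from the Sqoop docs): pre-converting the file to newline-terminated records with tr, so that Sqoop's default "\n" record delimiter applies and I can drop --input-lines-terminated-by entirely:

```shell
# Workaround sketch (hypothetical): translate the "|" record delimiter
# to "\n" before handing the data to Sqoop, so each record ends up on
# its own line. The echo stands in for reading my actual HDFS file.
echo '1,Hello|2,How|3,Are|4,You|5,I|6,am|7,fine|' | tr '|' '\n'
# Each record ("1,Hello", "2,How", ..., "7,fine") is now on its own line.
```

On HDFS I suppose this could be done by streaming the file through tr and writing it back, e.g. hadoop fs -cat data/part-* | tr '|' '\n' | hadoop fs -put - data_nl/part-00000 (the data_nl path is made up for illustration). But I would still prefer Sqoop to honor the "|" delimiter directly.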
