Hi,

Found that Cloudera Sqoop has an open issue which doesn't allow any record delimiter other than '\n': https://issues.cloudera.org/browse/SQOOP-136
Could anyone please confirm that this issue has been moved to/fixed in Apache Sqoop after incubation?

Thanks,
Tushar

On Thu, May 17, 2012 at 5:29 PM, Tushar Sudake <[email protected]> wrote:
> Hi,
>
> I have some data in a text file on HDFS and want to export it into a
> MySQL database, but I want Sqoop to use "|" as the record delimiter
> instead of the default "\n".
>
> So I am specifying the '--input-lines-terminated-by "|"' option in my
> Sqoop export command.
>
> The export succeeds, but the number of records exported is shown as
> only 1, and when I check the MySQL target table, I see only one record.
> It looks like only the one record before the first "|" is getting
> exported.
>
> Here's the sample data on HDFS:
>
> 1,Hello|2,How|3,Are|4,You|5,I|6,am|7,fine|
>
> Sqoop export command:
>
> bin/sqoop export --connect 'jdbc:mysql://localhost/mydb' -password pwd
> --username usr --table mytable --export-dir data
> --input-lines-terminated-by "|"
>
> Console logs:
>
> 12/05/17 03:32:02 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
> 12/05/17 03:32:02 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
> 12/05/17 03:32:02 INFO tool.CodeGenTool: Beginning code generation
> 12/05/17 03:32:02 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `mytable` AS t LIMIT 1
> 12/05/17 03:32:03 INFO orm.CompilationManager: HADOOP_HOME is /home/tushar/hadoop-0.20.2-cdh3u4
> Note: /tmp/sqoop-tushar/compile/fd6d3bfd4c2ed7f2e19a2de418993dfc/mytable.java uses or overrides a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
> 12/05/17 03:32:04 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-tushar/compile/fd6d3bfd4c2ed7f2e19a2de418993dfc/mytable.jar
> 12/05/17 03:32:04 INFO mapreduce.ExportJobBase: Beginning export of mytable
> 12/05/17 03:32:06 INFO input.FileInputFormat: Total input paths to process : 1
> 12/05/17 03:32:06 INFO input.FileInputFormat: Total input paths to process : 1
> 12/05/17 03:32:06 INFO mapred.JobClient: Running job: job_201205110542_0432
> 12/05/17 03:32:07 INFO mapred.JobClient:  map 0% reduce 0%
> 12/05/17 03:32:13 INFO mapred.JobClient:  map 100% reduce 0%
> 12/05/17 03:32:14 INFO mapred.JobClient: Job complete: job_201205110542_0432
> 12/05/17 03:32:14 INFO mapred.JobClient: Counters: 16
> 12/05/17 03:32:14 INFO mapred.JobClient:   Job Counters
> 12/05/17 03:32:14 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=6685
> 12/05/17 03:32:14 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
> 12/05/17 03:32:14 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
> 12/05/17 03:32:14 INFO mapred.JobClient:     Launched map tasks=1
> 12/05/17 03:32:14 INFO mapred.JobClient:     Data-local map tasks=1
> 12/05/17 03:32:14 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
> 12/05/17 03:32:14 INFO mapred.JobClient:   FileSystemCounters
> 12/05/17 03:32:14 INFO mapred.JobClient:     HDFS_BYTES_READ=166
> 12/05/17 03:32:14 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=79082
> 12/05/17 03:32:14 INFO mapred.JobClient:   Map-Reduce Framework
> 12/05/17 03:32:14 INFO mapred.JobClient:     Map input records=1
> 12/05/17 03:32:14 INFO mapred.JobClient:     Physical memory (bytes) snapshot=68677632
> 12/05/17 03:32:14 INFO mapred.JobClient:     Spilled Records=0
> 12/05/17 03:32:14 INFO mapred.JobClient:     CPU time spent (ms)=1130
> 12/05/17 03:32:14 INFO mapred.JobClient:     Total committed heap usage (bytes)=39911424
> 12/05/17 03:32:14 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=392290304
> 12/05/17 03:32:14 INFO mapred.JobClient:     Map output records=1
> 12/05/17 03:32:14 INFO mapred.JobClient:     SPLIT_RAW_BYTES=117
> 12/05/17 03:32:14 INFO mapreduce.ExportJobBase: Transferred 166 bytes in 9.6013 seconds (17.2893 bytes/sec)
> 12/05/17 03:32:14 INFO mapreduce.ExportJobBase: Exported 1 records.
>
> On the MySQL side:
>
> mysql> select * from mytable;
> +------+-------+
> | i    | name  |
> +------+-------+
> |    1 | Hello |
> +------+-------+
> 1 row in set (0.00 sec)
>
> Sqoop version: sqoop-1.4.1-incubating__hadoop-1.0.0
> Hadoop version: CDH3u4
>
> Doesn't Sqoop support any record delimiter other than "\n", or am I
> missing something? Please suggest a solution for this.
>
> Thanks,
> Tushar
>
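For what it's worth, "Map input records=1" in the logs suggests the whole file was read as a single line, since Hadoop's default TextInputFormat only splits input on newlines, regardless of the delimiter Sqoop is told to expect. One possible workaround (just a sketch, assuming the data looks like the sample above and that no field contains "|") is to preprocess the file so records are newline-terminated, then run the export without --input-lines-terminated-by:

```python
# Hypothetical preprocessing workaround: rewrite "|"-terminated records
# as newline-terminated lines so Sqoop's default record delimiter works.
def pipe_to_newlines(text):
    # Split on "|" and drop the empty trailing fragment produced by
    # the final "|" terminator.
    records = [r for r in text.split("|") if r]
    return "\n".join(records) + "\n"

sample = "1,Hello|2,How|3,Are|4,You|5,I|6,am|7,fine|"
print(pipe_to_newlines(sample), end="")
```

After converting the file (e.g. with hadoop fs -cat piped through a script like this and written back to HDFS), each record sits on its own line and the export should see all seven records.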
