[jira] [Updated] (SQOOP-3184) Sqoop1 (doc import + --incremental) does not cover HDFS file formats supported and/or limitations
[ https://issues.apache.org/jira/browse/SQOOP-3184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Kemper updated SQOOP-3184:
---------------------------------
    Description: 
Sqoop1 (doc import + --incremental) does not cover HDFS file formats supported and/or limitations

===
https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_incremental_imports
7.2.10. Incremental Imports
===

  was:
Sqoop1 (doc import + --incremental lastmodified) does not cover the --merge-key option

===
https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_incremental_imports
7.2.10. Incremental Imports

Table 5. Incremental import arguments:
Sqoop supports two types of incremental imports: append and lastmodified. You can use the --incremental argument to specify the type of incremental import to perform.
--check-column (col)
--incremental (mode)
--last-value (value)

An alternate table update strategy supported by Sqoop is called lastmodified mode. You should use this when rows of the source table may be updated, and each such update will set the value of a last-modified column to the current timestamp. Rows where the check column holds a timestamp more recent than the timestamp specified with --last-value are imported.
===

> Sqoop1 (doc import + --incremental) does not cover HDFS file formats supported and/or limitations
> -------------------------------------------------------------------------------------------------
>
>                 Key: SQOOP-3184
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3184
>             Project: Sqoop
>          Issue Type: Bug
>            Reporter: Markus Kemper
>
> Sqoop1 (doc import + --incremental) does not cover HDFS file formats supported and/or limitations
> ===
> https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_incremental_imports
> 7.2.10. Incremental Imports
> ===

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Created] (SQOOP-3184) Sqoop1 (doc import + --incremental) does not cover HDFS file formats supported and/or limitations
Markus Kemper created SQOOP-3184:
------------------------------------

             Summary: Sqoop1 (doc import + --incremental) does not cover HDFS file formats supported and/or limitations
                 Key: SQOOP-3184
                 URL: https://issues.apache.org/jira/browse/SQOOP-3184
             Project: Sqoop
          Issue Type: Bug
            Reporter: Markus Kemper


Sqoop1 (doc import + --incremental lastmodified) does not cover the --merge-key option

===
https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_incremental_imports
7.2.10. Incremental Imports

Table 5. Incremental import arguments:
Sqoop supports two types of incremental imports: append and lastmodified. You can use the --incremental argument to specify the type of incremental import to perform.
--check-column (col)
--incremental (mode)
--last-value (value)

An alternate table update strategy supported by Sqoop is called lastmodified mode. You should use this when rows of the source table may be updated, and each such update will set the value of a last-modified column to the current timestamp. Rows where the check column holds a timestamp more recent than the timestamp specified with --last-value are imported.
===

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Created] (SQOOP-3183) Sqoop1 (doc import + --incremental lastmodified) does not cover the --merge-key option
Markus Kemper created SQOOP-3183:
------------------------------------

             Summary: Sqoop1 (doc import + --incremental lastmodified) does not cover the --merge-key option
                 Key: SQOOP-3183
                 URL: https://issues.apache.org/jira/browse/SQOOP-3183
             Project: Sqoop
          Issue Type: Bug
            Reporter: Markus Kemper


Sqoop1 (doc import + --incremental lastmodified) does not cover the --merge-key option

===
https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_incremental_imports
7.2.10. Incremental Imports

Table 5. Incremental import arguments:
Sqoop supports two types of incremental imports: append and lastmodified. You can use the --incremental argument to specify the type of incremental import to perform.
--check-column (col)
--incremental (mode)
--last-value (value)

An alternate table update strategy supported by Sqoop is called lastmodified mode. You should use this when rows of the source table may be updated, and each such update will set the value of a last-modified column to the current timestamp. Rows where the check column holds a timestamp more recent than the timestamp specified with --last-value are imported.
===

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
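The lastmodified semantics quoted from the user guide above can be sketched as follows. This is an illustrative simulation only, not Sqoop code; the function and variable names are hypothetical. Rows whose check column is more recent than --last-value are selected, then reconciled against previously imported rows on the merge key (which is what --merge-key adds and what the documentation does not yet describe).

```python
from datetime import datetime

def incremental_lastmodified(existing, source, check_col, last_value, merge_key):
    # Select only rows updated since the previous import (--last-value).
    delta = [r for r in source if r[check_col] > last_value]
    # Merge step: delta rows replace existing rows that share the merge key,
    # and rows with new keys are appended.
    merged = {r[merge_key]: r for r in existing}
    for r in delta:
        merged[r[merge_key]] = r
    return sorted(merged.values(), key=lambda r: r[merge_key])

# Data shaped after the test case elsewhere in this thread (SQOOP-3182).
existing = [{"C1": 1, "C3": "NEW ROW 1", "C4": datetime(2017, 5, 6, 6, 59, 2)}]
source = [
    {"C1": 1, "C3": "UPDATE 1", "C4": datetime(2017, 5, 6, 7, 4, 40)},
    {"C1": 2, "C3": "NEW ROW 2", "C4": datetime(2017, 5, 6, 7, 4, 38)},
]
result = incremental_lastmodified(
    existing, source, "C4", datetime(2017, 5, 6, 7, 1, 9), "C1"
)
# Row 1 is replaced by its update; row 2 is appended.
```

Without a merge key, lastmodified mode can only append the delta, leaving both the old and the updated version of row 1 in HDFS; the merge-on-key step is what keeps one row per key.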
[jira] [Updated] (SQOOP-3182) Sqoop1 (import + --incremental + --merge-key + --as-parquetfile) fails with (Can't parse input data: 'PAR1')
[ https://issues.apache.org/jira/browse/SQOOP-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Kemper updated SQOOP-3182:
---------------------------------
    Summary: Sqoop1 (import + --incremental + --merge-key + --as-parquetfile) fails with (Can't parse input data: 'PAR1')  (was: Sqoop1 (import + --incremental + --merge-key + --as-parquet) fails with (Can't parse input data: 'PAR1'))

> Sqoop1 (import + --incremental + --merge-key + --as-parquetfile) fails with (Can't parse input data: 'PAR1')
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: SQOOP-3182
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3182
>             Project: Sqoop
>          Issue Type: Bug
>            Reporter: Markus Kemper
>
> Sqoop1 (import + --incremental + --merge-key + --as-parquet) fails with (Can't parse input data: 'PAR1'). See test case below.
> *Test Case*
> {noformat}
> #
> # STEP 01 - Create Table and Data
> #
> export MYCONN=jdbc:oracle:thin:@oracle.sqoop.com:1521/db11g;
> export MYUSER=sqoop
> export MYPSWD=sqoop
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "drop table t1"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "create table t1 (c1 int, c2 date, c3 varchar(10), c4 timestamp)"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "insert into t1 values (1, sysdate, 'NEW ROW 1', sysdate)"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "select * from t1"
> Output:
> ------------------------------------------------------------------
> | C1 | C2                    | C3        | C4                     |
> ------------------------------------------------------------------
> | 1  | 2017-05-06 06:59:02.0 | NEW ROW 1 | 2017-05-06 06:59:02    |
> ------------------------------------------------------------------
> #
> # STEP 02 - Import Data into HDFS
> #
> hdfs dfs -rm -r /user/root/t1
> sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table T1 --target-dir /user/root/t1 --incremental lastmodified --check-column C4 --merge-key C1 --last-value '2017-01-01 00:00:00.0' --as-parquetfile --map-column-java C2=String,C4=String --num-mappers 1 --verbose
> hdfs dfs -ls /user/root/t1/*.parquet
> parquet-tools cat --json 'hdfs://namenode/user/root/t1/b65c1ca5-c8f0-44c6-8c60-8ee83161347f.parquet'
> Output:
> 17/05/06 07:01:34 INFO mapreduce.ImportJobBase: Transferred 2.627 KB in 23.6174 seconds (113.8988 bytes/sec)
> 17/05/06 07:01:34 INFO mapreduce.ImportJobBase: Retrieved 1 records.
> 17/05/06 07:01:34 INFO tool.ImportTool:  --last-value 2017-05-06 07:01:09.0
> ~
> -rw-r--r--   3 root root       1144 2017-05-06 07:01 /user/root/t1/b65c1ca5-c8f0-44c6-8c60-8ee83161347f.parquet
> ~
> {"C1":"1","C2":"2017-05-06 06:59:02.0","C3":"NEW ROW 1","C4":"2017-05-06 06:59:02"}
> #
> # STEP 03 - Insert New Row and Update Existing Row
> #
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "insert into t1 values (2, sysdate, 'NEW ROW 2', sysdate)"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "update t1 set c3 = 'UPDATE 1', c4 = sysdate where c1 = 1"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "select * from t1 order by c1"
> Output:
> ------------------------------------------------------------------
> | C1 | C2                    | C3        | C4                     |
> ------------------------------------------------------------------
> | 1  | 2017-05-06 06:59:02.0 | UPDATE 1  | 2017-05-06 07:04:40    |
> | 2  | 2017-05-06 07:04:38.0 | NEW ROW 2 | 2017-05-06 07:04:38    |
> ------------------------------------------------------------------
> #
> # STEP 04 - Import Data into HDFS and Merge changes
> #
> sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table T1 --target-dir /user/root/t1 --incremental lastmodified --check-column C4 --merge-key C1 --last-value '2017-05-06 07:01:09.0' --as-parquetfile --map-column-java C2=String,C4=String --num-mappers 1 --verbose
> Output:
> 17/05/06 07:06:43 INFO mapreduce.ImportJobBase: Transferred 2.6611 KB in 27.4934 seconds (99.1148 bytes/sec)
> 17/05/06 07:06:43 INFO mapreduce.ImportJobBase: Retrieved 2 records.
> 17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Restoring classloader: java.net.FactoryURLClassLoader@121fdcee
> 17/05/06 07:06:43 INFO tool.ImportTool: Final destination exists, will run merge job.
> 17/05/06 07:06:43 DEBUG tool.ImportTool: Using temporary folder: 4bc6b65cd0194b81938f4660974ee392_T1
> 17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Checking for existing class: T1
[jira] [Created] (SQOOP-3182) Sqoop1 (import + --incremental + --merge-key + --as-parquet) fails with (Can't parse input data: 'PAR1')
Markus Kemper created SQOOP-3182:
------------------------------------

             Summary: Sqoop1 (import + --incremental + --merge-key + --as-parquet) fails with (Can't parse input data: 'PAR1')
                 Key: SQOOP-3182
                 URL: https://issues.apache.org/jira/browse/SQOOP-3182
             Project: Sqoop
          Issue Type: Bug
            Reporter: Markus Kemper


Sqoop1 (import + --incremental + --merge-key + --as-parquet) fails with (Can't parse input data: 'PAR1'). See test case below.

*Test Case*
{noformat}
#
# STEP 01 - Create Table and Data
#
export MYCONN=jdbc:oracle:thin:@oracle.sqoop.com:1521/db11g;
export MYUSER=sqoop
export MYPSWD=sqoop
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "drop table t1"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "create table t1 (c1 int, c2 date, c3 varchar(10), c4 timestamp)"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "insert into t1 values (1, sysdate, 'NEW ROW 1', sysdate)"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "select * from t1"
Output:
------------------------------------------------------------------
| C1 | C2                    | C3        | C4                     |
------------------------------------------------------------------
| 1  | 2017-05-06 06:59:02.0 | NEW ROW 1 | 2017-05-06 06:59:02    |
------------------------------------------------------------------
#
# STEP 02 - Import Data into HDFS
#
hdfs dfs -rm -r /user/root/t1
sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table T1 --target-dir /user/root/t1 --incremental lastmodified --check-column C4 --merge-key C1 --last-value '2017-01-01 00:00:00.0' --as-parquetfile --map-column-java C2=String,C4=String --num-mappers 1 --verbose
hdfs dfs -ls /user/root/t1/*.parquet
parquet-tools cat --json 'hdfs://namenode/user/root/t1/b65c1ca5-c8f0-44c6-8c60-8ee83161347f.parquet'
Output:
17/05/06 07:01:34 INFO mapreduce.ImportJobBase: Transferred 2.627 KB in 23.6174 seconds (113.8988 bytes/sec)
17/05/06 07:01:34 INFO mapreduce.ImportJobBase: Retrieved 1 records.
17/05/06 07:01:34 INFO tool.ImportTool:  --last-value 2017-05-06 07:01:09.0
~
-rw-r--r--   3 root root       1144 2017-05-06 07:01 /user/root/t1/b65c1ca5-c8f0-44c6-8c60-8ee83161347f.parquet
~
{"C1":"1","C2":"2017-05-06 06:59:02.0","C3":"NEW ROW 1","C4":"2017-05-06 06:59:02"}
#
# STEP 03 - Insert New Row and Update Existing Row
#
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "insert into t1 values (2, sysdate, 'NEW ROW 2', sysdate)"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "update t1 set c3 = 'UPDATE 1', c4 = sysdate where c1 = 1"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "select * from t1 order by c1"
Output:
------------------------------------------------------------------
| C1 | C2                    | C3        | C4                     |
------------------------------------------------------------------
| 1  | 2017-05-06 06:59:02.0 | UPDATE 1  | 2017-05-06 07:04:40    |
| 2  | 2017-05-06 07:04:38.0 | NEW ROW 2 | 2017-05-06 07:04:38    |
------------------------------------------------------------------
#
# STEP 04 - Import Data into HDFS and Merge changes
#
sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table T1 --target-dir /user/root/t1 --incremental lastmodified --check-column C4 --merge-key C1 --last-value '2017-05-06 07:01:09.0' --as-parquetfile --map-column-java C2=String,C4=String --num-mappers 1 --verbose
Output:
17/05/06 07:06:43 INFO mapreduce.ImportJobBase: Transferred 2.6611 KB in 27.4934 seconds (99.1148 bytes/sec)
17/05/06 07:06:43 INFO mapreduce.ImportJobBase: Retrieved 2 records.
17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Restoring classloader: java.net.FactoryURLClassLoader@121fdcee
17/05/06 07:06:43 INFO tool.ImportTool: Final destination exists, will run merge job.
17/05/06 07:06:43 DEBUG tool.ImportTool: Using temporary folder: 4bc6b65cd0194b81938f4660974ee392_T1
17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Checking for existing class: T1
17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Attempting to load jar through URL: jar:file:/tmp/sqoop-root/compile/6ed24910abcbc6ea38a1963bfce9a92d/codegen_T1.jar!/
17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Previous classloader is java.net.FactoryURLClassLoader@121fdcee
17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Testing class in jar: T1
17/05/06 07:06:43 ERROR tool.ImportTool: Import failed: java.io.IOException: Could not load jar /tmp/sqoop-root/compile/6ed24910abcbc6ea38a1963bfce9a92d/codegen_T1.jar into JVM. (Could not find class T1.)
	at
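The error in this issue's title can be illustrated with a small simulation. Every Parquet file begins (and ends) with the 4-byte magic "PAR1", so a merge step that treats the imported files as delimited text records will see that magic as field data and reject it. This is an assumed sketch of the failure mode only; the parser below is hypothetical, not Sqoop's generated record class.

```python
# Every Parquet file starts with the 4-byte magic header "PAR1".
PARQUET_MAGIC = b"PAR1"

def parse_text_record(line, expected_fields):
    # A text-oriented record parser: split on the field delimiter and
    # require the expected column count, as a CSV-style reader would.
    fields = line.split(",")
    if len(fields) != expected_fields:
        raise ValueError("Can't parse input data: %r" % line)
    return fields

# Simulated start of a Parquet file (binary body elided).
parquet_bytes = PARQUET_MAGIC + b"\x15\x00"
first_token = parquet_bytes[:4].decode("ascii")

try:
    # Feeding Parquet bytes to a text parser fails on the very first token.
    parse_text_record(first_token, expected_fields=4)
except ValueError as e:
    print(e)  # Can't parse input data: 'PAR1'
```

This is consistent with the symptom: the import itself succeeds, and the failure appears only once the merge job starts reading the existing Parquet destination as if it were text.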
[jira] [Created] (SQOOP-3181) Sqoop1 (import + --incremental + --merge-key + --as-parquetfile) fails with (Could not find class .)
Markus Kemper created SQOOP-3181:
------------------------------------

             Summary: Sqoop1 (import + --incremental + --merge-key + --as-parquetfile) fails with (Could not find class .)
                 Key: SQOOP-3181
                 URL: https://issues.apache.org/jira/browse/SQOOP-3181
             Project: Sqoop
          Issue Type: Bug
            Reporter: Markus Kemper


Sqoop1 (import + --incremental + --merge-key + --as-parquetfile) fails with (Could not find class .). See test case below.

*Test Case*
{noformat}
#
# STEP 01 - Create Table and Data
#
export MYCONN=jdbc:oracle:thin:@oracle.sqoop.com:1521/db11g;
export MYUSER=sqoop
export MYPSWD=sqoop
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "drop table t1"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "create table t1 (c1 int, c2 date, c3 varchar(10), c4 timestamp)"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "insert into t1 values (1, sysdate, 'NEW ROW 1', sysdate)"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "select * from t1"
Output:
------------------------------------------------------------------
| C1 | C2                    | C3        | C4                     |
------------------------------------------------------------------
| 1  | 2017-05-06 06:59:02.0 | NEW ROW 1 | 2017-05-06 06:59:02    |
------------------------------------------------------------------
#
# STEP 02 - Import Data into HDFS
#
hdfs dfs -rm -r /user/root/t1
sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table T1 --target-dir /user/root/t1 --incremental lastmodified --check-column C4 --merge-key C1 --last-value '2017-01-01 00:00:00.0' --as-parquetfile --map-column-java C2=String,C4=String --num-mappers 1 --verbose
hdfs dfs -ls /user/root/t1/*.parquet
parquet-tools cat --json 'hdfs://namenode/user/root/t1/b65c1ca5-c8f0-44c6-8c60-8ee83161347f.parquet'
Output:
17/05/06 07:01:34 INFO mapreduce.ImportJobBase: Transferred 2.627 KB in 23.6174 seconds (113.8988 bytes/sec)
17/05/06 07:01:34 INFO mapreduce.ImportJobBase: Retrieved 1 records.
17/05/06 07:01:34 INFO tool.ImportTool:  --last-value 2017-05-06 07:01:09.0
~
-rw-r--r--   3 root root       1144 2017-05-06 07:01 /user/root/t1/b65c1ca5-c8f0-44c6-8c60-8ee83161347f.parquet
~
{"C1":"1","C2":"2017-05-06 06:59:02.0","C3":"NEW ROW 1","C4":"2017-05-06 06:59:02"}
#
# STEP 03 - Insert New Row and Update Existing Row
#
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "insert into t1 values (2, sysdate, 'NEW ROW 2', sysdate)"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "update t1 set c3 = 'UPDATE 1', c4 = sysdate where c1 = 1"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "select * from t1 order by c1"
Output:
------------------------------------------------------------------
| C1 | C2                    | C3        | C4                     |
------------------------------------------------------------------
| 1  | 2017-05-06 06:59:02.0 | UPDATE 1  | 2017-05-06 07:04:40    |
| 2  | 2017-05-06 07:04:38.0 | NEW ROW 2 | 2017-05-06 07:04:38    |
------------------------------------------------------------------
#
# STEP 04 - Import Data into HDFS and Merge changes
#
sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table T1 --target-dir /user/root/t1 --incremental lastmodified --check-column C4 --merge-key C1 --last-value '2017-05-06 07:01:09.0' --as-parquetfile --map-column-java C2=String,C4=String --num-mappers 1 --verbose
Output:
17/05/06 07:06:43 INFO mapreduce.ImportJobBase: Transferred 2.6611 KB in 27.4934 seconds (99.1148 bytes/sec)
17/05/06 07:06:43 INFO mapreduce.ImportJobBase: Retrieved 2 records.
17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Restoring classloader: java.net.FactoryURLClassLoader@121fdcee
17/05/06 07:06:43 INFO tool.ImportTool: Final destination exists, will run merge job.
17/05/06 07:06:43 DEBUG tool.ImportTool: Using temporary folder: 4bc6b65cd0194b81938f4660974ee392_T1
17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Checking for existing class: T1
17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Attempting to load jar through URL: jar:file:/tmp/sqoop-root/compile/6ed24910abcbc6ea38a1963bfce9a92d/codegen_T1.jar!/
17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Previous classloader is java.net.FactoryURLClassLoader@121fdcee
17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Testing class in jar: T1
17/05/06 07:06:43 ERROR tool.ImportTool: Import failed: java.io.IOException: Could not load jar /tmp/sqoop-root/compile/6ed24910abcbc6ea38a1963bfce9a92d/codegen_T1.jar into JVM. (Could not find class T1.)
	at
Re: Review Request 58466: SQOOP-3158 - Columns added to Mysql after initial sqoop import, export back to table with same schema fails
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58466/
-----------------------------------------------------------

(Updated May 6, 2017, 11:29 a.m.)


Review request for Sqoop, Attila Szabo and Szabolcs Vasas.


Changes
-------

Updated the test case to use the existing check for NULL values after export.


Bugs: SQOOP-3158
    https://issues.apache.org/jira/browse/SQOOP-3158


Repository: sqoop-trunk


Description
-------

Until yesterday I had a table in MySQL with 2 columns, id and name:

1,Raj
2,Jack

I imported this data into HDFS yesterday as a file. Today we added a new column called salary to the table in MySQL. The table now looks like this:

1,Raj
2,Jack
3,Jill,2000
4,Nick,3000

I then ran an incremental import on this table as a file.

Part-m-0 file contains:
1,Raj
2,Jack

Part-m-1 file contains:
3,Jill,2000
4,Nick,3000

Next I created a new table in MySQL with the same schema as the original table, with columns id, name and salary. Sqoop export then fails with the error below:

java.lang.RuntimeException: Can't parse input data: 'Raj'
	at SQOOP_3158.__loadFromFields(SQOOP_3158.java:316)
	at SQOOP_3158.parse(SQOOP_3158.java:254)
	at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:89)
	at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
	at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.util.NoSuchElementException
	at java.util.ArrayList$Itr.next(ArrayList.java:854)
	at SQOOP_3158.__loadFromFields(SQOOP_3158.java:311)
	... 12 more


Diffs (updated)
-----

  src/java/org/apache/sqoop/orm/ClassWriter.java eaa9123
  src/test/com/cloudera/sqoop/TestExport.java b2edc53

Diff: https://reviews.apache.org/r/58466/diff/3/

Changes: https://reviews.apache.org/r/58466/diff/2-3/


Testing
-------

There is no existing test class covering this path, and I am not sure of the best way to add a test case for it. If you have any suggestions, please let me know. I have done manual testing to replicate the issue and confirmed that the patch fixes it. I have also tried different data types, all working. However, if a column in MySQL is defined as NOT NULL, the export will still fail with an error; this is expected.


Thanks,

Eric Lin
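The stack trace above shows the mechanism: records imported before the schema change have fewer fields than the new table, so the generated record class's iterator-style field loader runs out of elements (java.util.NoSuchElementException, surfaced as "Can't parse input data"). The sketch below simulates that mechanism and a lenient alternative that pads missing trailing columns with NULL. It is an assumed illustration of the direction of the fix; the actual patch changes Sqoop's generated code via ClassWriter.java, and these Python function names are hypothetical.

```python
# Target schema after the new column was added in MySQL.
COLUMNS = ["id", "name", "salary"]

def load_from_fields_strict(record):
    # Mirrors the failing behavior: consume one field per column from an
    # iterator, so a short record (e.g. "1,Raj") exhausts the iterator.
    it = iter(record.split(","))
    row = {}
    for col in COLUMNS:
        row[col] = next(it)  # raises StopIteration on short records
    return row

def load_from_fields_lenient(record):
    # Lenient variant: pad missing trailing columns with None, which an
    # export layer would write as SQL NULL.
    fields = record.split(",")
    fields += [None] * (len(COLUMNS) - len(fields))
    return dict(zip(COLUMNS, fields))

# Old-format record parses cleanly in the lenient variant.
row = load_from_fields_lenient("1,Raj")
# New-format records are unaffected.
full_row = load_from_fields_lenient("3,Jill,2000")
```

Note the constraint called out in the Testing section: padding with NULL can only work if the new column is nullable; a NOT NULL column in MySQL will still reject the exported row.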
[jira] [Updated] (SQOOP-3158) Columns added to Mysql after initial sqoop import, export back to table with same schema fails
[ https://issues.apache.org/jira/browse/SQOOP-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Lin updated SQOOP-3158:
----------------------------
    Attachment: SQOOP-3158.2.patch

> Columns added to Mysql after initial sqoop import, export back to table with same schema fails
> ----------------------------------------------------------------------------------------------
>
>                 Key: SQOOP-3158
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3158
>             Project: Sqoop
>          Issue Type: Improvement
>    Affects Versions: 1.4.6
>            Reporter: viru reddy
>            Assignee: Eric Lin
>              Labels: newbie
>         Attachments: SQOOP-3158.2.patch, SQOOP-3158.patch
>
> Until yesterday I had a table in MySQL with 2 columns, id and name:
> 1,Raj
> 2,Jack
> I imported this data into HDFS yesterday as a file. Today we added a new column called salary to the table in MySQL. The table now looks like this:
> 1,Raj
> 2,Jack
> 3,Jill,2000
> 4,Nick,3000
> I then ran an incremental import on this table as a file.
> Part-m-0 file contains:
> 1,Raj
> 2,Jack
> Part-m-1 file contains:
> 3,Jill,2000
> 4,Nick,3000
> Next I created a new table in MySQL with the same schema as the original table, with columns id, name and salary.
> When I do a sqoop export, only the last 2 rows get inserted into the new table in MySQL, and the export fails.
> How can I get all the rows inserted into the table?

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Updated] (SQOOP-3158) Columns added to Mysql after initial sqoop import, export back to table with same schema fails
[ https://issues.apache.org/jira/browse/SQOOP-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Lin updated SQOOP-3158:
----------------------------
    Attachment: (was: SQOOP-3158.2.patch)

> Columns added to Mysql after initial sqoop import, export back to table with same schema fails
> ----------------------------------------------------------------------------------------------
>
>                 Key: SQOOP-3158
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3158
>             Project: Sqoop
>          Issue Type: Improvement
>    Affects Versions: 1.4.6
>            Reporter: viru reddy
>            Assignee: Eric Lin
>              Labels: newbie
>         Attachments: SQOOP-3158.patch
>
> Until yesterday I had a table in MySQL with 2 columns, id and name:
> 1,Raj
> 2,Jack
> I imported this data into HDFS yesterday as a file. Today we added a new column called salary to the table in MySQL. The table now looks like this:
> 1,Raj
> 2,Jack
> 3,Jill,2000
> 4,Nick,3000
> I then ran an incremental import on this table as a file.
> Part-m-0 file contains:
> 1,Raj
> 2,Jack
> Part-m-1 file contains:
> 3,Jill,2000
> 4,Nick,3000
> Next I created a new table in MySQL with the same schema as the original table, with columns id, name and salary.
> When I do a sqoop export, only the last 2 rows get inserted into the new table in MySQL, and the export fails.
> How can I get all the rows inserted into the table?

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
Re: Review Request 58466: SQOOP-3158 - Columns added to Mysql after initial sqoop import, export back to table with same schema fails
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58466/
-----------------------------------------------------------

(Updated May 6, 2017, 11:01 a.m.)


Review request for Sqoop, Attila Szabo and Szabolcs Vasas.


Changes
-------

Added a test case to verify that the export fails before the change and passes after it.


Bugs: SQOOP-3158
    https://issues.apache.org/jira/browse/SQOOP-3158


Repository: sqoop-trunk


Description
-------

Until yesterday I had a table in MySQL with 2 columns, id and name:

1,Raj
2,Jack

I imported this data into HDFS yesterday as a file. Today we added a new column called salary to the table in MySQL. The table now looks like this:

1,Raj
2,Jack
3,Jill,2000
4,Nick,3000

I then ran an incremental import on this table as a file.

Part-m-0 file contains:
1,Raj
2,Jack

Part-m-1 file contains:
3,Jill,2000
4,Nick,3000

Next I created a new table in MySQL with the same schema as the original table, with columns id, name and salary. Sqoop export then fails with the error below:

java.lang.RuntimeException: Can't parse input data: 'Raj'
	at SQOOP_3158.__loadFromFields(SQOOP_3158.java:316)
	at SQOOP_3158.parse(SQOOP_3158.java:254)
	at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:89)
	at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
	at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.util.NoSuchElementException
	at java.util.ArrayList$Itr.next(ArrayList.java:854)
	at SQOOP_3158.__loadFromFields(SQOOP_3158.java:311)
	... 12 more


Diffs (updated)
-----

  src/java/org/apache/sqoop/orm/ClassWriter.java eaa9123
  src/test/com/cloudera/sqoop/TestExport.java b2edc53

Diff: https://reviews.apache.org/r/58466/diff/2/

Changes: https://reviews.apache.org/r/58466/diff/1-2/


Testing
-------

There is no existing test class covering this path, and I am not sure of the best way to add a test case for it. If you have any suggestions, please let me know. I have done manual testing to replicate the issue and confirmed that the patch fixes it. I have also tried different data types, all working. However, if a column in MySQL is defined as NOT NULL, the export will still fail with an error; this is expected.


Thanks,

Eric Lin