[jira] [Updated] (SQOOP-3184) Sqoop1 (doc import + --incremental) does not cover HDFS file formats supported and/or limitations

2017-05-06 Thread Markus Kemper (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Kemper updated SQOOP-3184:
-
Description: 
Sqoop1 (doc import + --incremental) does not cover HDFS file formats supported 
and/or limitations

===
https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_incremental_imports
7.2.10. Incremental Imports
===

  was:
Sqoop1 (doc import + --incremental lastmodified) does not cover the --merge-key 
option

===
https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_incremental_imports
7.2.10. Incremental Imports
Table 5. Incremental import arguments:

Sqoop supports two types of incremental imports: append and lastmodified. You 
can use the --incremental argument to specify the type of incremental import to 
perform.
--check-column (col)
--incremental (mode)
--last-value (value)

An alternate table update strategy supported by Sqoop is called lastmodified 
mode. You should use this when rows of the source table may be updated, and 
each such update will set the value of a last-modified column to the current 
timestamp. Rows where the check column holds a timestamp more recent than the 
timestamp specified with --last-value are imported.
===


> Sqoop1 (doc import + --incremental) does not cover HDFS file formats 
> supported and/or limitations
> -
>
> Key: SQOOP-3184
> URL: https://issues.apache.org/jira/browse/SQOOP-3184
> Project: Sqoop
>  Issue Type: Bug
>Reporter: Markus Kemper
>
> Sqoop1 (doc import + --incremental) does not cover HDFS file formats 
> supported and/or limitations
> ===
> https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_incremental_imports
> 7.2.10. Incremental Imports
> ===
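
For context, an illustrative sketch (not taken from the ticket) of the kind of
combinations the incremental-import section could enumerate; which file formats
are supported with each incremental mode, and with what limitations, is exactly
what this issue asks to have documented. Connection settings and table/column
names follow the test cases elsewhere in this thread:

{noformat}
# append mode, default text files
sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD \
  --table T1 --target-dir /user/root/t1 \
  --incremental append --check-column C1 --last-value 0 --as-textfile

# lastmodified mode with a Parquet target (see SQOOP-3181/SQOOP-3182 for
# failures observed with this combination)
sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD \
  --table T1 --target-dir /user/root/t1 \
  --incremental lastmodified --check-column C4 --merge-key C1 \
  --last-value '2017-01-01 00:00:00.0' --as-parquetfile
{noformat}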



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (SQOOP-3184) Sqoop1 (doc import + --incremental) does not cover HDFS file formats supported and/or limitations

2017-05-06 Thread Markus Kemper (JIRA)
Markus Kemper created SQOOP-3184:


 Summary: Sqoop1 (doc import + --incremental) does not cover HDFS 
file formats supported and/or limitations
 Key: SQOOP-3184
 URL: https://issues.apache.org/jira/browse/SQOOP-3184
 Project: Sqoop
  Issue Type: Bug
Reporter: Markus Kemper


Sqoop1 (doc import + --incremental lastmodified) does not cover the --merge-key 
option

===
https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_incremental_imports
7.2.10. Incremental Imports
Table 5. Incremental import arguments:

Sqoop supports two types of incremental imports: append and lastmodified. You 
can use the --incremental argument to specify the type of incremental import to 
perform.
--check-column (col)
--incremental (mode)
--last-value (value)

An alternate table update strategy supported by Sqoop is called lastmodified 
mode. You should use this when rows of the source table may be updated, and 
each such update will set the value of a last-modified column to the current 
timestamp. Rows where the check column holds a timestamp more recent than the 
timestamp specified with --last-value are imported.
===



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (SQOOP-3183) Sqoop1 (doc import + --incremental lastmodified) does not cover the --merge-key option

2017-05-06 Thread Markus Kemper (JIRA)
Markus Kemper created SQOOP-3183:


 Summary: Sqoop1 (doc import + --incremental lastmodified) does not 
cover the --merge-key option
 Key: SQOOP-3183
 URL: https://issues.apache.org/jira/browse/SQOOP-3183
 Project: Sqoop
  Issue Type: Bug
Reporter: Markus Kemper


Sqoop1 (doc import + --incremental lastmodified) does not cover the --merge-key 
option

===
https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_incremental_imports
7.2.10. Incremental Imports
Table 5. Incremental import arguments:

Sqoop supports two types of incremental imports: append and lastmodified. You 
can use the --incremental argument to specify the type of incremental import to 
perform.
--check-column (col)
--incremental (mode)
--last-value (value)

An alternate table update strategy supported by Sqoop is called lastmodified 
mode. You should use this when rows of the source table may be updated, and 
each such update will set the value of a last-modified column to the current 
timestamp. Rows where the check column holds a timestamp more recent than the 
timestamp specified with --last-value are imported.
===
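
For reference, a minimal sketch of the option combination this ticket asks to
have documented (a lastmodified incremental import driven by --merge-key),
using the same illustrative connection variables and table as the test cases
elsewhere in this thread:

{noformat}
# lastmodified mode re-imports rows whose check column is newer than
# --last-value; --merge-key names the column the merge job uses to reconcile
# re-imported rows with data already present in the target directory
sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD \
  --table T1 --target-dir /user/root/t1 \
  --incremental lastmodified --check-column C4 --merge-key C1 \
  --last-value '2017-01-01 00:00:00.0' --num-mappers 1
{noformat}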



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SQOOP-3182) Sqoop1 (import + --incremental + --merge-key + --as-parquetfile) fails with (Can't parse input data: 'PAR1')

2017-05-06 Thread Markus Kemper (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Kemper updated SQOOP-3182:
-
Summary: Sqoop1 (import + --incremental + --merge-key + --as-parquetfile) 
fails with (Can't parse input data: 'PAR1')  (was: Sqoop1 (import + 
--incremental + --merge-key + --as-parquet) fails with (Can't parse input data: 
'PAR1'))

> Sqoop1 (import + --incremental + --merge-key + --as-parquetfile) fails with 
> (Can't parse input data: 'PAR1')
> 
>
> Key: SQOOP-3182
> URL: https://issues.apache.org/jira/browse/SQOOP-3182
> Project: Sqoop
>  Issue Type: Bug
>Reporter: Markus Kemper
>
> Sqoop1 (import + --incremental + --merge-key + --as-parquetfile) fails with 
> (Can't parse input data: 'PAR1').  See test case below.
> *Test Case*
> {noformat}
> #
> # STEP 01 - Create Table and Data
> #
> export MYCONN=jdbc:oracle:thin:@oracle.sqoop.com:1521/db11g;
> export MYUSER=sqoop
> export MYPSWD=sqoop
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "drop table t1"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "create table t1 (c1 int, c2 date, c3 varchar(10), c4 timestamp)"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "insert into t1 values (1, sysdate, 'NEW ROW 1', sysdate)"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "select * from t1"
> Output:
> ----------------------------------------------------------------
> | C1 | C2                    | C3        | C4                  |
> ----------------------------------------------------------------
> | 1  | 2017-05-06 06:59:02.0 | NEW ROW 1 | 2017-05-06 06:59:02 |
> ----------------------------------------------------------------
> #
> # STEP 02 - Import Data into HDFS 
> #
> hdfs dfs -rm -r /user/root/t1
> sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table 
> T1 --target-dir /user/root/t1 --incremental lastmodified --check-column C4 
> --merge-key C1 --last-value '2017-01-01 00:00:00.0' --as-parquetfile 
> --map-column-java C2=String,C4=String --num-mappers 1 --verbose 
> hdfs dfs -ls /user/root/t1/*.parquet
> parquet-tools cat --json 
> 'hdfs://namenode/user/root/t1/b65c1ca5-c8f0-44c6-8c60-8ee83161347f.parquet'
> Output:
> 17/05/06 07:01:34 INFO mapreduce.ImportJobBase: Transferred 2.627 KB in 
> 23.6174 seconds (113.8988 bytes/sec)
> 17/05/06 07:01:34 INFO mapreduce.ImportJobBase: Retrieved 1 records.
> 17/05/06 07:01:34 INFO tool.ImportTool:   --last-value 2017-05-06 07:01:09.0
> ~
> -rw-r--r--   3 root root   1144 2017-05-06 07:01 
> /user/root/t1/b65c1ca5-c8f0-44c6-8c60-8ee83161347f.parquet
> ~
> {"C1":"1","C2":"2017-05-06 06:59:02.0","C3":"NEW ROW 1","C4":"2017-05-06 
> 06:59:02"}
> #
> # STEP 03 - Insert New Row and Update Existing Row
> #
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "insert into t1 values (2, sysdate, 'NEW ROW 2', sysdate)"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "update t1 set c3 = 'UPDATE 1', c4 = sysdate where c1 = 1"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "select * from t1 order by c1"
> Output:
> ----------------------------------------------------------------
> | C1 | C2                    | C3        | C4                  |
> ----------------------------------------------------------------
> | 1  | 2017-05-06 06:59:02.0 | UPDATE 1  | 2017-05-06 07:04:40 |
> | 2  | 2017-05-06 07:04:38.0 | NEW ROW 2 | 2017-05-06 07:04:38 |
> ----------------------------------------------------------------
> #
> # STEP 04 - Import Data into HDFS and Merge changes 
> #
> sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table 
> T1 --target-dir /user/root/t1 --incremental lastmodified --check-column C4 
> --merge-key C1 --last-value '2017-05-06 07:01:09.0' --as-parquetfile 
> --map-column-java C2=String,C4=String --num-mappers 1 --verbose 
> Output:
> 17/05/06 07:06:43 INFO mapreduce.ImportJobBase: Transferred 2.6611 KB in 
> 27.4934 seconds (99.1148 bytes/sec)
> 17/05/06 07:06:43 INFO mapreduce.ImportJobBase: Retrieved 2 records.
> 17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Restoring classloader: 
> java.net.FactoryURLClassLoader@121fdcee
> 17/05/06 07:06:43 INFO tool.ImportTool: Final destination exists, will run 
> merge job.
> 17/05/06 07:06:43 DEBUG tool.ImportTool: Using temporary folder: 
> 4bc6b65cd0194b81938f4660974ee392_T1
> 17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Checking for existing class: T1

[jira] [Created] (SQOOP-3182) Sqoop1 (import + --incremental + --merge-key + --as-parquet) fails with (Can't parse input data: 'PAR1')

2017-05-06 Thread Markus Kemper (JIRA)
Markus Kemper created SQOOP-3182:


 Summary: Sqoop1 (import + --incremental + --merge-key + 
--as-parquet) fails with (Can't parse input data: 'PAR1')
 Key: SQOOP-3182
 URL: https://issues.apache.org/jira/browse/SQOOP-3182
 Project: Sqoop
  Issue Type: Bug
Reporter: Markus Kemper


Sqoop1 (import + --incremental + --merge-key + --as-parquet) fails with (Can't 
parse input data: 'PAR1').  See test case below.

*Test Case*
{noformat}
#
# STEP 01 - Create Table and Data
#

export MYCONN=jdbc:oracle:thin:@oracle.sqoop.com:1521/db11g;
export MYUSER=sqoop
export MYPSWD=sqoop

sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
"drop table t1"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
"create table t1 (c1 int, c2 date, c3 varchar(10), c4 timestamp)"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
"insert into t1 values (1, sysdate, 'NEW ROW 1', sysdate)"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
"select * from t1"

Output:
----------------------------------------------------------------
| C1 | C2                    | C3        | C4                  |
----------------------------------------------------------------
| 1  | 2017-05-06 06:59:02.0 | NEW ROW 1 | 2017-05-06 06:59:02 |
----------------------------------------------------------------

#
# STEP 02 - Import Data into HDFS 
#
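# Note (added for clarity, not part of the original report):
# --incremental lastmodified with --check-column C4 selects rows whose C4 is
# newer than --last-value, --merge-key C1 is the column the follow-up merge
# job reconciles on when the target dir already exists, and --as-parquetfile
# selects the HDFS file format under test.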

hdfs dfs -rm -r /user/root/t1
sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table T1 
--target-dir /user/root/t1 --incremental lastmodified --check-column C4 
--merge-key C1 --last-value '2017-01-01 00:00:00.0' --as-parquetfile 
--map-column-java C2=String,C4=String --num-mappers 1 --verbose 
hdfs dfs -ls /user/root/t1/*.parquet
parquet-tools cat --json 
'hdfs://namenode/user/root/t1/b65c1ca5-c8f0-44c6-8c60-8ee83161347f.parquet'

Output:
17/05/06 07:01:34 INFO mapreduce.ImportJobBase: Transferred 2.627 KB in 23.6174 
seconds (113.8988 bytes/sec)
17/05/06 07:01:34 INFO mapreduce.ImportJobBase: Retrieved 1 records.
17/05/06 07:01:34 INFO tool.ImportTool:   --last-value 2017-05-06 07:01:09.0
~
-rw-r--r--   3 root root   1144 2017-05-06 07:01 
/user/root/t1/b65c1ca5-c8f0-44c6-8c60-8ee83161347f.parquet
~
{"C1":"1","C2":"2017-05-06 06:59:02.0","C3":"NEW ROW 1","C4":"2017-05-06 
06:59:02"}

#
# STEP 03 - Insert New Row and Update Existing Row
#

sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
"insert into t1 values (2, sysdate, 'NEW ROW 2', sysdate)"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
"update t1 set c3 = 'UPDATE 1', c4 = sysdate where c1 = 1"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
"select * from t1 order by c1"

Output:
----------------------------------------------------------------
| C1 | C2                    | C3        | C4                  |
----------------------------------------------------------------
| 1  | 2017-05-06 06:59:02.0 | UPDATE 1  | 2017-05-06 07:04:40 |
| 2  | 2017-05-06 07:04:38.0 | NEW ROW 2 | 2017-05-06 07:04:38 |
----------------------------------------------------------------

#
# STEP 04 - Import Data into HDFS and Merge changes 
#

sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table T1 
--target-dir /user/root/t1 --incremental lastmodified --check-column C4 
--merge-key C1 --last-value '2017-05-06 07:01:09.0' --as-parquetfile 
--map-column-java C2=String,C4=String --num-mappers 1 --verbose 

Output:
17/05/06 07:06:43 INFO mapreduce.ImportJobBase: Transferred 2.6611 KB in 
27.4934 seconds (99.1148 bytes/sec)
17/05/06 07:06:43 INFO mapreduce.ImportJobBase: Retrieved 2 records.
17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Restoring classloader: 
java.net.FactoryURLClassLoader@121fdcee
17/05/06 07:06:43 INFO tool.ImportTool: Final destination exists, will run 
merge job.
17/05/06 07:06:43 DEBUG tool.ImportTool: Using temporary folder: 
4bc6b65cd0194b81938f4660974ee392_T1
17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Checking for existing class: T1
17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Attempting to load jar through 
URL: 
jar:file:/tmp/sqoop-root/compile/6ed24910abcbc6ea38a1963bfce9a92d/codegen_T1.jar!/
17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Previous classloader is 
java.net.FactoryURLClassLoader@121fdcee
17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Testing class in jar: T1
17/05/06 07:06:43 ERROR tool.ImportTool: Import failed: java.io.IOException: 
Could not load jar 
/tmp/sqoop-root/compile/6ed24910abcbc6ea38a1963bfce9a92d/codegen_T1.jar into 
JVM. (Could not find class T1.)
at 

[jira] [Created] (SQOOP-3181) Sqoop1 (import + --incremental + --merge-key + --as-parquetfile) fails with (Could not find class .)

2017-05-06 Thread Markus Kemper (JIRA)
Markus Kemper created SQOOP-3181:


 Summary: Sqoop1 (import + --incremental + --merge-key + 
--as-parquetfile) fails with (Could not find class .)
 Key: SQOOP-3181
 URL: https://issues.apache.org/jira/browse/SQOOP-3181
 Project: Sqoop
  Issue Type: Bug
Reporter: Markus Kemper


Sqoop1 (import + --incremental + --merge-key + --as-parquetfile) fails with 
(Could not find class .).  See test case below.

*Test Case*
{noformat}
#
# STEP 01 - Create Table and Data
#

export MYCONN=jdbc:oracle:thin:@oracle.sqoop.com:1521/db11g;
export MYUSER=sqoop
export MYPSWD=sqoop

sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
"drop table t1"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
"create table t1 (c1 int, c2 date, c3 varchar(10), c4 timestamp)"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
"insert into t1 values (1, sysdate, 'NEW ROW 1', sysdate)"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
"select * from t1"

Output:
----------------------------------------------------------------
| C1 | C2                    | C3        | C4                  |
----------------------------------------------------------------
| 1  | 2017-05-06 06:59:02.0 | NEW ROW 1 | 2017-05-06 06:59:02 |
----------------------------------------------------------------

#
# STEP 02 - Import Data into HDFS 
#

hdfs dfs -rm -r /user/root/t1
sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table T1 
--target-dir /user/root/t1 --incremental lastmodified --check-column C4 
--merge-key C1 --last-value '2017-01-01 00:00:00.0' --as-parquetfile 
--map-column-java C2=String,C4=String --num-mappers 1 --verbose 
hdfs dfs -ls /user/root/t1/*.parquet
parquet-tools cat --json 
'hdfs://namenode/user/root/t1/b65c1ca5-c8f0-44c6-8c60-8ee83161347f.parquet'

Output:
17/05/06 07:01:34 INFO mapreduce.ImportJobBase: Transferred 2.627 KB in 23.6174 
seconds (113.8988 bytes/sec)
17/05/06 07:01:34 INFO mapreduce.ImportJobBase: Retrieved 1 records.
17/05/06 07:01:34 INFO tool.ImportTool:   --last-value 2017-05-06 07:01:09.0
~
-rw-r--r--   3 root root   1144 2017-05-06 07:01 
/user/root/t1/b65c1ca5-c8f0-44c6-8c60-8ee83161347f.parquet
~
{"C1":"1","C2":"2017-05-06 06:59:02.0","C3":"NEW ROW 1","C4":"2017-05-06 
06:59:02"}

#
# STEP 03 - Insert New Row and Update Existing Row
#

sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
"insert into t1 values (2, sysdate, 'NEW ROW 2', sysdate)"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
"update t1 set c3 = 'UPDATE 1', c4 = sysdate where c1 = 1"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
"select * from t1 order by c1"

Output:
----------------------------------------------------------------
| C1 | C2                    | C3        | C4                  |
----------------------------------------------------------------
| 1  | 2017-05-06 06:59:02.0 | UPDATE 1  | 2017-05-06 07:04:40 |
| 2  | 2017-05-06 07:04:38.0 | NEW ROW 2 | 2017-05-06 07:04:38 |
----------------------------------------------------------------

#
# STEP 04 - Import Data into HDFS and Merge changes 
#

sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table T1 
--target-dir /user/root/t1 --incremental lastmodified --check-column C4 
--merge-key C1 --last-value '2017-05-06 07:01:09.0' --as-parquetfile 
--map-column-java C2=String,C4=String --num-mappers 1 --verbose 

Output:
17/05/06 07:06:43 INFO mapreduce.ImportJobBase: Transferred 2.6611 KB in 
27.4934 seconds (99.1148 bytes/sec)
17/05/06 07:06:43 INFO mapreduce.ImportJobBase: Retrieved 2 records.
17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Restoring classloader: 
java.net.FactoryURLClassLoader@121fdcee
17/05/06 07:06:43 INFO tool.ImportTool: Final destination exists, will run 
merge job.
17/05/06 07:06:43 DEBUG tool.ImportTool: Using temporary folder: 
4bc6b65cd0194b81938f4660974ee392_T1
17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Checking for existing class: T1
17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Attempting to load jar through 
URL: 
jar:file:/tmp/sqoop-root/compile/6ed24910abcbc6ea38a1963bfce9a92d/codegen_T1.jar!/
17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Previous classloader is 
java.net.FactoryURLClassLoader@121fdcee
17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Testing class in jar: T1
17/05/06 07:06:43 ERROR tool.ImportTool: Import failed: java.io.IOException: 
Could not load jar 
/tmp/sqoop-root/compile/6ed24910abcbc6ea38a1963bfce9a92d/codegen_T1.jar into 
JVM. (Could not find class T1.)
at 

Re: Review Request 58466: SQOOP-3158 - Columns added to Mysql after initial sqoop import, export back to table with same schema fails

2017-05-06 Thread Eric Lin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58466/
---

(Updated May 6, 2017, 11:29 a.m.)


Review request for Sqoop, Attila Szabo and Szabolcs Vasas.


Changes
---

Updated the test case to reuse the existing NULL-value checks after export.


Bugs: SQOOP-3158
https://issues.apache.org/jira/browse/SQOOP-3158


Repository: sqoop-trunk


Description
---

Until yesterday I had a table in MySQL with 2 columns, id and name:
1,Raj
2,Jack

I imported this data into HDFS yesterday as a file. Today we added a new 
column called salary to the table in MySQL. The table now looks like this:

1,Raj
2,Jack
3,Jill,2000
4,Nick,3000

I then ran an incremental import of this table as a file.

Part-m-0 file contains
1,Raj
2,Jack

Part-m-1 file contains
3,Jill,2000
4,Nick,3000

I then created a new table in MySQL with the same schema as the original 
MySQL table, with columns id, name, and salary.

The Sqoop export fails with the error below:

java.lang.RuntimeException: Can't parse input data: 'Raj'
at SQOOP_3158.__loadFromFields(SQOOP_3158.java:316)
at SQOOP_3158.parse(SQOOP_3158.java:254)
at 
org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:89)
at 
org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at 
org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.util.NoSuchElementException
at java.util.ArrayList$Itr.next(ArrayList.java:854)
at SQOOP_3158.__loadFromFields(SQOOP_3158.java:311)
... 12 more
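
For reference, a minimal sketch of the export that hits this path (connection
string, directory, and table name below are placeholders, not taken from the
report): the export directory mixes 2-field and 3-field records, while the
generated record class expects every record to carry all three fields.

sqoop export --connect jdbc:mysql://mysql.example.com/testdb \
  --username user --password pass \
  --table employees_new \
  --export-dir /user/root/employees \
  --input-fields-terminated-by ','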


Diffs (updated)
-

  src/java/org/apache/sqoop/orm/ClassWriter.java eaa9123 
  src/test/com/cloudera/sqoop/TestExport.java b2edc53 


Diff: https://reviews.apache.org/r/58466/diff/3/

Changes: https://reviews.apache.org/r/58466/diff/2-3/


Testing
---

There is no existing test class covering this code path, and I am not sure of 
the best way to add a test case for it. If you have any suggestions, please 
let me know.

I have done manual testing to replicate the issue and confirmed that the 
patch fixes it. I have also tried different data types, and they all work.

However, if a column in MySQL is defined as NOT NULL, the export will still 
fail with an error; this is expected.


Thanks,

Eric Lin



[jira] [Updated] (SQOOP-3158) Columns added to Mysql after initial sqoop import, export back to table with same schema fails

2017-05-06 Thread Eric Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Lin updated SQOOP-3158:

Attachment: SQOOP-3158.2.patch

> Columns added to Mysql after initial sqoop import, export back to table with 
> same schema fails 
> ---
>
> Key: SQOOP-3158
> URL: https://issues.apache.org/jira/browse/SQOOP-3158
> Project: Sqoop
>  Issue Type: Improvement
>Affects Versions: 1.4.6
>Reporter: viru reddy
>Assignee: Eric Lin
>  Labels: newbie
> Attachments: SQOOP-3158.2.patch, SQOOP-3158.patch
>
>
> Until yesterday I had a table in MySQL with 2 columns, id and name:
> 1,Raj
> 2,Jack
> I imported this data into HDFS yesterday as a file. Today we added a new 
> column called salary to the table in MySQL. The table now looks like this:
> 1,Raj
> 2,Jack
> 3,Jill,2000
> 4,Nick,3000
> I then ran an incremental import of this table as a file.
> Part-m-0 file contains
> 1,Raj
> 2,Jack
> Part-m-1 file contains
> 3,Jill,2000
> 4,Nick,3000
> I then created a new table in MySQL with the same schema as the original 
> MySQL table, with columns id, name, and salary.
> When I do a sqoop export, only the last 2 rows are inserted into the new 
> table in MySQL, and the export then fails.
> How can I get all of the rows inserted into the table?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SQOOP-3158) Columns added to Mysql after initial sqoop import, export back to table with same schema fails

2017-05-06 Thread Eric Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Lin updated SQOOP-3158:

Attachment: (was: SQOOP-3158.2.patch)

> Columns added to Mysql after initial sqoop import, export back to table with 
> same schema fails 
> ---
>
> Key: SQOOP-3158
> URL: https://issues.apache.org/jira/browse/SQOOP-3158
> Project: Sqoop
>  Issue Type: Improvement
>Affects Versions: 1.4.6
>Reporter: viru reddy
>Assignee: Eric Lin
>  Labels: newbie
> Attachments: SQOOP-3158.patch
>
>
> Until yesterday I had a table in MySQL with 2 columns, id and name:
> 1,Raj
> 2,Jack
> I imported this data into HDFS yesterday as a file. Today we added a new 
> column called salary to the table in MySQL. The table now looks like this:
> 1,Raj
> 2,Jack
> 3,Jill,2000
> 4,Nick,3000
> I then ran an incremental import of this table as a file.
> Part-m-0 file contains
> 1,Raj
> 2,Jack
> Part-m-1 file contains
> 3,Jill,2000
> 4,Nick,3000
> I then created a new table in MySQL with the same schema as the original 
> MySQL table, with columns id, name, and salary.
> When I do a sqoop export, only the last 2 rows are inserted into the new 
> table in MySQL, and the export then fails.
> How can I get all of the rows inserted into the table?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Review Request 58466: SQOOP-3158 - Columns added to Mysql after initial sqoop import, export back to table with same schema fails

2017-05-06 Thread Eric Lin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58466/
---

(Updated May 6, 2017, 11:01 a.m.)


Review request for Sqoop, Attila Szabo and Szabolcs Vasas.


Changes
---

Added a test case that fails before the change and passes after it.


Bugs: SQOOP-3158
https://issues.apache.org/jira/browse/SQOOP-3158


Repository: sqoop-trunk


Description
---

Until yesterday I had a table in MySQL with 2 columns, id and name:
1,Raj
2,Jack

I imported this data into HDFS yesterday as a file. Today we added a new 
column called salary to the table in MySQL. The table now looks like this:

1,Raj
2,Jack
3,Jill,2000
4,Nick,3000

I then ran an incremental import of this table as a file.

Part-m-0 file contains
1,Raj
2,Jack

Part-m-1 file contains
3,Jill,2000
4,Nick,3000

I then created a new table in MySQL with the same schema as the original 
MySQL table, with columns id, name, and salary.

The Sqoop export fails with the error below:

java.lang.RuntimeException: Can't parse input data: 'Raj'
at SQOOP_3158.__loadFromFields(SQOOP_3158.java:316)
at SQOOP_3158.parse(SQOOP_3158.java:254)
at 
org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:89)
at 
org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at 
org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.util.NoSuchElementException
at java.util.ArrayList$Itr.next(ArrayList.java:854)
at SQOOP_3158.__loadFromFields(SQOOP_3158.java:311)
... 12 more


Diffs (updated)
-

  src/java/org/apache/sqoop/orm/ClassWriter.java eaa9123 
  src/test/com/cloudera/sqoop/TestExport.java b2edc53 


Diff: https://reviews.apache.org/r/58466/diff/2/

Changes: https://reviews.apache.org/r/58466/diff/1-2/


Testing
---

There is no existing test class covering this code path, and I am not sure of 
the best way to add a test case for it. If you have any suggestions, please 
let me know.

I have done manual testing to replicate the issue and confirmed that the 
patch fixes it. I have also tried different data types, and they all work.

However, if a column in MySQL is defined as NOT NULL, the export will still 
fail with an error; this is expected.


Thanks,

Eric Lin