[ 
https://issues.apache.org/jira/browse/SQOOP-3030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Kemper updated SQOOP-3030:
---------------------------------
    Comment: was deleted

(was: You are welcome [~Tagar]. 

[~maugli] I have not tried to reproduce the issue outside of --hcatalog yet, 
primarily because the use case required export of Parquet files which 
unfortunately --export-dir does not support due to a KiteSDK limitation.

Thanks, [~markuskem...@me.com] )

> Export to Oracle using (--direct + --hcatalog + --columns) with columns 
> having the wrong case sensitivity fails with NPE
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SQOOP-3030
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3030
>             Project: Sqoop
>          Issue Type: Bug
>          Components: connectors/oracle, hive-integration
>            Reporter: Markus Kemper
>
> It appears that if the case of the columns in the Oracle RDBMS are not used 
> with the Sqoop --columns option with (export + --hcatalog + --direct) the 
> Sqoop map task will fail with an NPE where the non-direct path fails correct. 
>  My test case is below, please review and let me know if you have any 
> questions.
> Additionally it would be nice if we could detect this column case mis-match 
> during compile time and not submit the job only to have it fail in YARN.
> TEST CASE:
> STEP 01 : Create Oracle Source/Target Table
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "drop table t1"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "create table t1 (c1 int, c2 varchar(10))"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "insert into t1 values (1, 'some data')"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "select * from t1"
> -------------------------------------
> | C1                   | C2         | 
> -------------------------------------
> | 1                    | some data  | 
> -------------------------------------
> STEP 02 : Import Oracle Table using Sqoop --hcatalog options
> beeline -u jdbc:hive2:// -e "use default; drop table t1_text;"
> sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table 
> T1 --num-mappers 1 --hcatalog-database default --hcatalog-table t1_text 
> --create-hcatalog-table --hcatalog-storage-stanza 'stored as textfile'
> beeline -u jdbc:hive2:// -e "use default; select * from t1_text;"
> Output:
> <SNIP>
> 16/10/18 09:11:31 INFO hcat.SqoopHCatUtilities: Database column names 
> projected : [c1, c2]
> 16/10/18 09:11:31 INFO hcat.SqoopHCatUtilities: Database column name - info 
> map :
>       c1 : [Type : 2,Precision : 38,Scale : 0]
>       c2 : [Type : 12,Precision : 10,Scale : 0]
> 16/10/18 09:11:31 INFO hcat.SqoopHCatUtilities: Creating HCatalog table 
> default.t1_text for import
> 16/10/18 09:11:31 INFO hcat.SqoopHCatUtilities: HCatalog Create table 
> statement: 
> create table `default`.`t1_text` (
>       `c1` decimal(38),
>       `c2` varchar(10))
> stored as textfile
> <SNIP>
> 16/10/18 09:12:14 INFO mapreduce.ImportJobBase: Transferred 12 bytes in 
> 34.4222 seconds (0.3486 bytes/sec)
> 16/10/18 09:12:14 INFO mapreduce.ImportJobBase: Retrieved 1 records.
> ---
> +-------------+-------------+--+
> | t1_text.c1  | t1_text.c2  |
> +-------------+-------------+--+
> | 1           | some data   |
> +-------------+-------------+--+
> STEP 03 : Attempt Export to Oracle using Sqoop --hcatalog and --columns 
> options
> sqoop export --connect $MYCONN --username $MYUSER --password $MYPSWD --table 
> T2 --num-mappers 2 --hcatalog-database default --hcatalog-table t1_text 
> --columns "c1,c2"
> Output: (failure = correct)
> <SNIP>
> 16/10/18 09:13:48 INFO hcat.SqoopHCatUtilities: Database column names 
> projected : [c1, c2]
> 16/10/18 09:13:48 INFO hcat.SqoopHCatUtilities: Database column name - info 
> map :
>       c1 : [Type : 2,Precision : 38,Scale : 0]
>       c2 : [Type : 12,Precision : 10,Scale : 0]
> <SNIP>
> 16/10/18 09:13:48 INFO hive.metastore: Connected to metastore.
> 16/10/18 09:13:48 INFO hcat.SqoopHCatUtilities: HCatalog full table schema 
> fields = [c1, c2]
> 16/10/18 09:13:49 INFO Configuration.deprecation: mapred.output.dir is 
> deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
> 16/10/18 09:13:49 INFO hcat.SqoopHCatUtilities: HCatalog table partitioning 
> key fields = []
> 16/10/18 09:13:49 INFO hcat.SqoopHCatUtilities: HCatalog projected schema 
> fields = [c1, c2]
> <SNIP>
> 16/10/18 09:14:22 INFO mapreduce.ExportJobBase: Exported 0 records.
> 16/10/18 09:14:22 ERROR tool.ExportTool: Error during export: Export job 
> failed!
> ---
> 2016-10-18 09:14:19,418 INFO [main] 
> org.apache.hive.hcatalog.mapreduce.InternalUtil: Initializing 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe with properties 
> {transient_lastDdlTime=1476807097, name=default.t1_text, 
> serialization.null.format=\N, columns=c1,c2, 
> serialization.lib=org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
> serialization.format=1, columns.types=decimal(38,0),varchar(10)}
> 2016-10-18 09:14:19,660 INFO [Thread-12] 
> org.apache.sqoop.mapreduce.AutoProgressMapper: Auto-progress thread is 
> finished. keepGoing=false
> 2016-10-18 09:14:19,952 ERROR [Thread-11] 
> org.apache.sqoop.mapreduce.AsyncSqlOutputFormat: Got exception in update 
> thread: java.sql.SQLSyntaxErrorException: ORA-00904: "c2": invalid identifier
> STEP 03 : Attempt Export to Oracle using Sqoop --hcatalog and --columns 
> options and Oracle --direct
> sqoop export --connect $MYCONN --username $MYUSER --password $MYPSWD --table 
> T2 --num-mappers 2 --hcatalog-database default --hcatalog-table t1_text 
> --columns "c1,c2" --direct
> Output: (failure = not correct (NPE))
> <SNIP>
> 16/10/18 09:17:47 INFO oracle.OraOopManagerFactory: 
> **************************************************
> *** Using Data Connector for Oracle and Hadoop ***
> **************************************************
> 16/10/18 09:17:47 INFO oracle.OraOopManagerFactory: Oracle Database version: 
> Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
> 16/10/18 09:17:47 INFO oracle.OraOopManagerFactory: This Oracle database is 
> not a RAC.
> 16/10/18 09:17:47 INFO Configuration.deprecation: mapred.map.max.attempts is 
> deprecated. Instead, use mapreduce.map.maxattempts
> 16/10/18 09:17:47 INFO tool.CodeGenTool: Beginning code generation
> 16/10/18 09:17:47 INFO manager.SqlManager: Executing SQL statement: SELECT 
> "C1","C2" FROM T2 WHERE 0=1
> <SNIP>
> 16/10/18 09:17:50 INFO mapreduce.ExportJobBase: Configuring HCatalog for 
> export job
> 16/10/18 09:17:50 INFO hcat.SqoopHCatUtilities: Configuring HCatalog specific 
> details for job
> 16/10/18 09:17:50 INFO manager.SqlManager: Executing SQL statement: SELECT 
> "C1","C2" FROM "T2" WHERE 1=0
> 16/10/18 09:17:50 INFO hcat.SqoopHCatUtilities: Database column names 
> projected : [c1, c2]
> 16/10/18 09:17:50 INFO hcat.SqoopHCatUtilities: Database column name - info 
> map :
>       c1 : [Type : 2,Precision : 38,Scale : 0]
>       c2 : [Type : 12,Precision : 10,Scale : 0]
> <SNIP>
> 16/10/18 09:17:51 INFO hive.metastore: Connected to metastore.
> 16/10/18 09:17:52 INFO hcat.SqoopHCatUtilities: HCatalog full table schema 
> fields = [c1, c2]
> <SNIP>
> 16/10/18 09:18:25 INFO mapreduce.ExportJobBase: Exported 0 records.
> 16/10/18 09:18:25 ERROR tool.ExportTool: Error during export: Export job 
> failed!
> ---
> 2016-10-18 09:18:23,561 INFO [main] 
> org.apache.sqoop.manager.oracle.OraOopOutputFormatBase: The number of rows 
> per batch is: 100
> 2016-10-18 09:18:23,561 INFO [main] 
> org.apache.sqoop.manager.oracle.OraOopOutputFormatBase: The number of batches 
> per commit is: 100
> 2016-10-18 09:18:23,721 INFO [main] 
> org.apache.sqoop.manager.oracle.OraOopOutputFormatBase: This record writer is 
> connected to Oracle via the JDBC URL: 
>       "oracle.jdbc.driver.T4CConnection@34133979"
>       to the Oracle instance: "ORCL"
> 2016-10-18 09:18:23,828 INFO [main] 
> org.apache.sqoop.manager.oracle.OraOopOracleQueries: Session Time Zone set to 
> GMT
> 2016-10-18 09:18:23,883 INFO [main] 
> org.apache.sqoop.manager.oracle.OracleConnectionFactory: Initializing Oracle 
> session with SQL :
> begin 
>   dbms_application_info.set_module(module_name => 'Data Connector for Oracle 
> and Hadoop', action_name => 'export 20161018091747PDT'); 
> end;
> 2016-10-18 09:18:23,883 WARN [main] 
> org.apache.sqoop.manager.oracle.OracleConnectionFactory: No Oracle 'session 
> initialization' statements were found to execute.
> Check that your oraoop-site-template.xml and/or oraoop-site.xml files are 
> correctly installed in the ${SQOOP_HOME}/conf directory.
> 2016-10-18 09:18:24,116 INFO [main] 
> org.apache.hive.hcatalog.mapreduce.InternalUtil: Initializing 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe with properties 
> {transient_lastDdlTime=1476807097, name=default.t1_text, 
> serialization.null.format=\N, columns=c1,c2, 
> serialization.lib=org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
> serialization.format=1, columns.types=decimal(38,0),varchar(10)}
> 2016-10-18 09:18:24,345 INFO [Thread-12] 
> org.apache.sqoop.mapreduce.AutoProgressMapper: Auto-progress thread is 
> finished. keepGoing=false
> 2016-10-18 09:18:24,357 INFO [main] 
> org.apache.sqoop.manager.oracle.OraOopOutputFormatBase: Batch-Mode insert 
> statement:
> insert  into "SQOOP"."T2"
> ("C1"
> ,"C2")
> values
> (:C1
> ,:C2)
> 2016-10-18 09:18:24,358 ERROR [main] 
> org.apache.sqoop.manager.oracle.OraOopOutputFormatInsert: The following error 
> occurred during configurePreparedStatement()
> java.lang.NullPointerException
>       at 
> org.apache.sqoop.manager.oracle.OraOopOutputFormatBase$OraOopDBRecordWriterBase.setBindValueAtName(OraOopOutputFormatBase.java:432)
> STEP 04 : Verify Export without --direct and correct column case works
> sqoop export --connect $MYCONN --username $MYUSER --password $MYPSWD --table 
> T1 --num-mappers 2 --hcatalog-database default --hcatalog-table t1_text 
> --columns "C1,C2" 
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "select * from t1"
> Output:
> 16/10/18 09:26:46 INFO mapreduce.ExportJobBase: Transferred 7.79 KB in 
> 30.9303 seconds (257.9025 bytes/sec)
> 16/10/18 09:26:46 INFO mapreduce.ExportJobBase: Exported 1 records.
> ---
> -------------------------------------
> | C1                   | C2         | 
> -------------------------------------
> | 1                    | some data  | 
> | 1                    | some data  | 
> -------------------------------------
> STEP 05 : Verify Export with --direct and correct column case works
> sqoop export --connect $MYCONN --username $MYUSER --password $MYPSWD --table 
> T1 --num-mappers 2 --hcatalog-database default --hcatalog-table t1_text 
> --columns "C1,C2" --direct
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "select * from t1"
> Output:
> <SNIP>
> 16/10/18 09:30:40 INFO oracle.OraOopManagerFactory: 
> **************************************************
> *** Using Data Connector for Oracle and Hadoop ***
> **************************************************
> <SNIP>
> 16/10/18 09:31:16 INFO mapreduce.ExportJobBase: Transferred 7.79 KB in 
> 30.9987 seconds (257.333 bytes/sec)
> 16/10/18 09:31:16 INFO mapreduce.ExportJobBase: Exported 1 records.
> ---
> -------------------------------------
> | C1                   | C2         | 
> -------------------------------------
> | 1                    | some data  | 
> | 1                    | some data  | 
> | 1                    | some data  | 
> -------------------------------------



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to