Hi I have been posting in the Users mailing list initially, but figured that I may have hit a bug. I am not sure if I should be forwarding the findings to the Dev mailing group, doing it anyways!. Please see details below.
Thank You RV From: Ramya Vasudevan Sent: Monday, November 30, 2015 2:49 PM To: '[email protected]' Subject: RE: SQOOP export to Oracle failing Hi The more I tried various options to get the sqoop export to work, the more I felt that this could be a bug. I started digging into the Sqoop internal code to debug and found this logic used in the code: method getTableColumns: It tries to gather column names and column (Oracle & JDBC) data types by Step 1) First retrieving column_name and data_type FROM dba_tab_columns table ( ) And Step 2) Then by using the column_name returned from Step 1 to retrieve the JDBC data types ( ) The details below are from me navigating through the code . - Step 1) succeeds and Oracle returns column names as a, x, y -> (all lower case as expected) - Step 2) Sqoop tries to make the following SQL to retrieve JDBC column metadata "select a,x,y from "TestV500"."Xy" where 0=1;" It fails here with the error "ORA-00904: "Y": invalid identifier". This is because Sqoop is not enclosing the column names in quotes in the select statement as it should for cases where there are lower case column names. The code that is problematic is in the class org.apache.sqoop.manager.oracle.OraOopOracleQueries at Line Number 683. This is incorrect: columnList.append(result.get(idx).getName()); It should be something like this: if(!result.get(idx).getName().equals(result.get(idx).getName().toUpperCase())) columnList.append("\"" + result.get(idx).getName() + "\""); else columnList.append(result.get(idx).getName()); Thank you RV From: Ramya Vasudevan Sent: Wednesday, November 18, 2015 2:56 PM To: '[email protected]' Subject: RE: SQOOP export to Oracle failing Hi I am still stuck in this issue. Any suggestion will be very helpful. Thank you RV From: Ramya Vasudevan Sent: Monday, November 16, 2015 3:41 PM To: [email protected]<mailto:[email protected]> Subject: RE: SQOOP export to Oracle failing Thank You David. I can see a little more information this time. Export with direct option: /usr/bin/sqoop export --connect jdbc:oracle:thin:@adhoc-oracle11:1521:orcl --direct --num-mappers 2 --username test --password password --export-dir "/hdfs_nfs_mount/tmp/oracle/TestV100/Xy.txt" --table "\"\"TestV500\".\"Xy\"\"" --fields-terminated-by "\t" --input-null-string null -m 1 -verbose Output: Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail. Please set $ACCUMULO_HOME to the root of your Accumulo installation. 15/11/16 15:27:31 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.4.1 15/11/16 15:27:31 DEBUG tool.BaseSqoopTool: Enabled debug logging. 15/11/16 15:27:31 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 15/11/16 15:27:31 DEBUG sqoop.ConnFactory: Loaded manager factory: org.apache.sqoop.manager.oracle.OraOopManagerFactory 15/11/16 15:27:31 DEBUG sqoop.ConnFactory: Loaded manager factory: com.cloudera.sqoop.manager.DefaultManagerFactory 15/11/16 15:27:31 DEBUG sqoop.ConnFactory: Trying ManagerFactory: org.apache.sqoop.manager.oracle.OraOopManagerFactory 15/11/16 15:27:31 DEBUG oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop can be called by Sqoop! 15/11/16 15:27:31 INFO manager.SqlManager: Using default fetchSize of 1000 15/11/16 15:27:32 INFO oracle.OraOopOracleQueries: Current schema is: TEST 15/11/16 15:27:32 DEBUG oracle.OraOopUtilities: The Oracle table context has been derived from: oracleConnectionUserName = test tableStr = "TestV500"."Xy" as: owner : TestV500 table : Xy 15/11/16 15:27:32 INFO oracle.OraOopManagerFactory: ************************************************** *** Using Data Connector for Oracle and Hadoop *** ************************************************** 15/11/16 15:27:32 DEBUG oracle.OraOopUtilities: Updated mapred.child.java.opts from "null" to "-Djava.security.egd=file:///dev/urandom" 15/11/16 15:27:32 INFO oracle.OraOopManagerFactory: Oracle Database version: Oracle Database 11g Release 11.1.0.6.0 - Production 15/11/16 15:27:32 INFO oracle.OraOopManagerFactory: This Oracle database is not a RAC. 15/11/16 15:27:32 DEBUG oracle.OraOopManagerFactory: Setting mapper url oraoop.mapper.jdbc.url.0 = jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=adhoc-oracle11)(PORT=1521)))(CONNECT_DATA=(SERVER=DEDICATED)(SID=orcl))) 15/11/16 15:27:32 DEBUG oracle.OraOopManagerFactory: Setting mapper url oraoop.mapper.jdbc.url.1 = jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=adhoc-oracle11)(PORT=1521)))(CONNECT_DATA=(SERVER=DEDICATED)(SID=orcl))) 15/11/16 15:27:32 INFO Configuration.deprecation: mapred.map.max.attempts is deprecated. Instead, use mapreduce.map.maxattempts 15/11/16 15:27:32 DEBUG sqoop.ConnFactory: Instantiated ConnManager org.apache.sqoop.manager.oracle.OraOopConnManager@7db63b8f<mailto:org.apache.sqoop.manager.oracle.OraOopConnManager@7db63b8f> 15/11/16 15:27:32 INFO tool.CodeGenTool: Beginning code generation 15/11/16 15:27:32 DEBUG oracle.OraOopOracleQueries: getTableColumns() : sql = SELECT column_name, data_type FROM dba_tab_columns WHERE owner = ? and table_name = ? and (DATA_TYPE IN ('BINARY_DOUBLE','BINARY_FLOAT','CHAR','DATE','FLOAT','NCHAR','NUMBER','NVARCHAR2','ROWID','URITYPE','VARCHAR2') OR DATA_TYPE LIKE 'INTERVAL YEAR(%) TO MONTH' OR DATA_TYPE LIKE 'INTERVAL DAY(%) TO SECOND(%)' OR DATA_TYPE LIKE 'TIMESTAMP(%)' OR DATA_TYPE LIKE 'TIMESTAMP(%) WITH TIME ZONE' OR DATA_TYPE LIKE 'TIMESTAMP(%) WITH LOCAL TIME ZONE') ORDER BY column_id 15/11/16 15:27:32 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: java.sql.SQLSyntaxErrorException: ORA-00904: "Y": invalid identifier java.lang.RuntimeException: java.sql.SQLSyntaxErrorException: ORA-00904: "Y": invalid identifier at org.apache.sqoop.manager.oracle.OraOopConnManager.getColumnNamesInOracleTable(OraOopConnManager.java:133) at org.apache.sqoop.manager.oracle.OraOopConnManager.getSelectedColumnNamesInOracleTable(OraOopConnManager.java:144) at org.apache.sqoop.manager.oracle.OraOopConnManager.getColTypesQuery(OraOopConnManager.java:216) at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:226) at org.apache.sqoop.manager.oracle.OraOopConnManager.getColumnTypes(OraOopConnManager.java:504) at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:295) at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1833) at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1645) at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:96) at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64) at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100) at org.apache.sqoop.Sqoop.run(Sqoop.java:143) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227) at org.apache.sqoop.Sqoop.main(Sqoop.java:236) Caused by: java.sql.SQLSyntaxErrorException: ORA-00904: "Y": invalid identifier at oracle.jdbc.driver.SQLStateMapping.newSQLException(SQLStateMapping.java:91) at oracle.jdbc.driver.DatabaseError.newSQLException(DatabaseError.java:112) at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:173) at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:455) at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:413) at oracle.jdbc.driver.T4C8Oall.receive(T4C8Oall.java:1030) at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:183) at oracle.jdbc.driver.T4CStatement.executeForDescribe(T4CStatement.java:774) at oracle.jdbc.driver.T4CStatement.executeMaybeDescribe(T4CStatement.java:849) at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1186) at oracle.jdbc.driver.OracleStatement.executeQuery(OracleStatement.java:1377) at oracle.jdbc.driver.OracleStatementWrapper.executeQuery(OracleStatementWrapper.java:386) at org.apache.sqoop.manager.oracle.OraOopOracleQueries.getTableColumns(OraOopOracleQueries.java:689) at org.apache.sqoop.manager.oracle.OraOopOracleQueries.getTableColumns(OraOopOracleQueries.java:767) at org.apache.sqoop.manager.oracle.OraOopOracleQueries.getTableColumnNames(OraOopOracleQueries.java:624) at org.apache.sqoop.manager.oracle.OraOopConnManager.getColumnNamesInOracleTable(OraOopConnManager.java:124) ... 16 more It looks like it is resolving the column "y" as Y and failing saying column Y doesn't exist: This is the table DDL: CREATE TABLE "TestV500"."Xy" ( "a" NVARCHAR2(255) DEFAULT NULL, "x" NUMBER(10,0) DEFAULT NULL, "y" NUMBER(10,0) DEFAULT NULL ) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT) TABLESPACE "USERS" ; Any idea on how I can get past this error? Thank You RV From: David Robson [mailto:[email protected]] Sent: Monday, November 16, 2015 2:49 PM To: [email protected]<mailto:[email protected]> Subject: RE: SQOOP export to Oracle failing Hi Ramya, Have you tried the "-direct" option? I have tested mixed case table and schema names with this before and it should work. We even added a section to the documentation on quoting them which you can read here: http://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_quote_oracle_owners_and_tables The only thing you will need to do is use more mappers - the direct connector only works with 2 or more mappers. So add "-direct" to your command and increase the number of mappers to at least 2. David From: Ramya Vasudevan [mailto:[email protected]] Sent: Tuesday, 17 November 2015 5:20 AM To: [email protected]<mailto:[email protected]> Subject: SQOOP export to Oracle failing Hi I am facing an issue with export to Oracle: Scenario: sqoop export from hdfs to oracle where the schema and objects in oracle has to be mixed case. Issue: I created a schema (TestV500) and table (Xy) in oracle 11g using sqoop export commands: sqoop eval --connect jdbc:oracle:thin:@adhoc-oracle11:1521:orcl -username test -password password -query "CREATE USER \"TestV500\" identified by \"password\"" sqoop eval --connect jdbc:oracle:thin:@adhoc-oracle11:1521:orcl -username test -password password -query "GRANT CONNECT TO \"TestV500\"" sqoop eval --connect jdbc:oracle:thin:@adhoc-oracle11:1521:orcl -username test -password password -query "ALTER USER \"TestV500\" QUOTA UNLIMITED ON USERS" sqoop eval --connect jdbc:oracle:thin:@adhoc-oracle11:1521:orcl -username test -password password -query "CREATE TABLE \"TestV500\".\"Xy\"( \"a\" NVARCHAR2(255) DEFAULT NULL, \"x\" NUMBER(10,0) DEFAULT NULL, \"y\" NUMBER(10,0) DEFAULT NULL )" Installations used: Oracle Version: 11.1.0.6.0 Sqoop Version: 1.4.5-cdh5.4.1 Ojdbc version: ojdbc6.jar I am now trying to load the table using a tab delimited file: /usr/bin/sudo -u hdfs sqoop export --connect jdbc:oracle:thin:@oracle11:1521:orcl -username test -password password "/hdfs_nfs_mount/tmp/oracle/TestV500/Xy.txt" --table "\"\"TestV500\".\"Xy\"\"" --fields-terminated-by "\t" --input-null-string null -m 1 I get this error : 15/11/12 09:33:03 INFO tool.CodeGenTool: Beginning code generation 15/11/12 09:33:03 DEBUG manager.OracleManager: Using column names query: SELECT t.* FROM "TestV500"."Xy" t WHERE 1=0 15/11/12 09:33:03 DEBUG manager.SqlManager: Execute getColumnInfoRawQuery : SELECT t.* FROM "TestV500"."Xy" t WHERE 1=0 15/11/12 09:33:03 DEBUG manager.OracleManager: Creating a new connection for jdbc:oracle:thin:@adhoc-oracle11:1521:orcl, using username: test 15/11/12 09:33:03 DEBUG manager.OracleManager: No connection paramenters specified. Using regular API for making connection. 15/11/12 09:33:03 INFO manager.OracleManager: Time zone has been set to GMT 15/11/12 09:33:03 DEBUG manager.SqlManager: Using fetchSize for next query: 1000 15/11/12 09:33:03 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM "TestV500"."Xy" t WHERE 1=0 15/11/12 09:33:04 DEBUG manager.SqlManager: Found column a of type [12, 255, 0] 15/11/12 09:33:04 DEBUG manager.SqlManager: Found column x of type [2, 10, 0] 15/11/12 09:33:04 DEBUG manager.SqlManager: Found column y of type [2, 10, 0] 15/11/12 09:33:04 DEBUG manager.OracleManager$ConnCache: Caching released connection for jdbc:oracle:thin:@adhoc-oracle11:1521:orcl/test 15/11/12 09:33:04 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.IllegalArgumentException: There is no column found in the target table "TestV500"."Xy". Please ensure that your table name is correct. java.lang.IllegalArgumentException: There is no column found in the target table "TestV500"."Xy". Please ensure that your table name is correct. at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1658) at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:96) at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64) at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100) at org.apache.sqoop.Sqoop.run(Sqoop.java:143) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227) at org.apache.sqoop.Sqoop.main(Sqoop.java:236) This is the input file Xy.txt: a 0 0 a 1 4 a 2 2 Observations: 1. Schema and table was created successfully in oracle. 2. If I connect to the database directly as 'test' user and run this command, it works fine (it shows zero rows as nothing is loaded yet) SELECT t.* FROM "TestV500"."Xy" t WHERE 1=0; 3. If I create a schema called testv500 with a table called xy (without mixed case), the exact same Sqoop export command, the input file loads the data in the table. Unless I am missing something here, I wonder if I hit a bug not being able to handle mixed case. Any help will be great! Thank You, RV
