[jira] [Issue Comment Edited] (SQOOP-481) Sqoop import with --hive-import using wrong column names in --columns throws a NPE

Cheolsoo Park (JIRA) Wed, 02 May 2012 11:05:12 -0700

    [ 
https://issues.apache.org/jira/browse/SQOOP-481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266776#comment-13266776
 ]


Cheolsoo Park edited comment on SQOOP-481 at 5/2/12 6:04 PM:
-------------------------------------------------------------

Hi Arvind and Jarece, thank you for your suggestions!

Indeed, failing fast is a good thing to do, and we should always prevent maps 
from having null values. In fact, fail-fast logic is already in place. For 
example, we call valueOf() when putting a new value into a map, and valueOf 
does not return a null.

{code:title=protected Map<String, Integer> getColumnTypesForRawQuery(String stmt)}
colTypes.put(colName, Integer.valueOf(typeId));
{code}

But the problem remains because a map can still lack a specific key, and a 
missing key cannot be detected until a look-up happens. I guess that the real 
problem that I am raising in this jira is *not auto-boxing but auto-unboxing*. 
For auto-unboxing, the get() call is the earliest point at which we can fail. 
So I believe that adding a null check after get() is the best we can do.

{code}
Integer sqlType = columnTypes.get(col);
if (sqlType == null) {
   throw new IOException("Column " + col + " does not exist in table " + 
tableName);
}
String javaType = toJavaType(col, sqlType);
{code}
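
To make the difference concrete, here is a minimal self-contained sketch, not 
the actual Sqoop code. The class ColumnLookupSketch and the helper 
toTypeName(int) are hypothetical stand-ins for TableDefWriter and 
toHiveType(int colType), and the map is filled by hand rather than from a 
database.

```java
import java.io.IOException;
import java.sql.Types;
import java.util.HashMap;
import java.util.Map;

class ColumnLookupSketch {

  // Hypothetical stand-in for a method that takes a primitive int,
  // such as toHiveType(int colType). The mapping itself is illustrative.
  static String toTypeName(int sqlType) {
    return sqlType == Types.INTEGER ? "INT" : "STRING";
  }

  // No null check: get() returns null for a missing key, and auto-unboxing
  // that null into the int parameter throws a NullPointerException.
  static String lookupUnchecked(Map<String, Integer> columnTypes, String col) {
    Integer sqlType = columnTypes.get(col);
    return toTypeName(sqlType);
  }

  // With the null check: fail right after get() with a descriptive message.
  static String lookupChecked(Map<String, Integer> columnTypes,
      String tableName, String col) throws IOException {
    Integer sqlType = columnTypes.get(col);
    if (sqlType == null) {
      throw new IOException("Column " + col + " does not exist in table "
          + tableName);
    }
    return toTypeName(sqlType);
  }

  public static void main(String[] args) {
    // The table has a column "I"; valueOf() never stores a null value.
    Map<String, Integer> columnTypes = new HashMap<String, Integer>();
    columnTypes.put("I", Integer.valueOf(Types.INTEGER));

    try {
      lookupUnchecked(columnTypes, "i");  // wrong case, so the key is missing
    } catch (NullPointerException e) {
      System.out.println("unchecked: NullPointerException");
    }

    try {
      lookupChecked(columnTypes, "foo", "i");
    } catch (IOException e) {
      System.out.println("checked: " + e.getMessage());
    }
  }
}
```

Both look-ups fail at about the same point, but only the checked one names 
the column and table that caused the failure.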

Please correct me if I have misunderstood your suggestions.
                
> Sqoop import with --hive-import using wrong column names in --columns throws 
> a NPE
> ----------------------------------------------------------------------------------
>
>                 Key: SQOOP-481
>                 URL: https://issues.apache.org/jira/browse/SQOOP-481
>             Project: Sqoop
>          Issue Type: Bug
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>
> To reproduce the error, 
> 1) Create a table "foo" with a column named "I" on Oracle DB
> 2) Run sqoop import --connect jdbc:oracle:thin:@//localhost/xe --username 
> **** --password **** --verbose --table foo --split-by i --columns i 
> --hive-import
> This generates the following call stack:
> {code}
> 12/05/01 16:12:00 ERROR sqoop.Sqoop: Got exception running Sqoop: 
> java.lang.NullPointerException
> java.lang.NullPointerException
>       at com.cloudera.sqoop.hive.TableDefWriter.getCreateTableStmt(TableDefWriter.java:162)
>       at com.cloudera.sqoop.hive.HiveImport.importTable(HiveImport.java:195)
>       at com.cloudera.sqoop.tool.ImportTool.importTable(ImportTool.java:394)
>       at com.cloudera.sqoop.tool.ImportTool.run(ImportTool.java:455)
>       at com.cloudera.sqoop.Sqoop.run(Sqoop.java:146)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>       at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:182)
>       at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:221)
>       at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:230)
>       at com.cloudera.sqoop.Sqoop.main(Sqoop.java:239)
> {code}
> The reason is simple. In the following lines of code:
> {code}
> Integer colType = columnTypes.get(col);
> ...
> String hiveColType = connManager.toHiveType(colType);
> {code}
> colType is null because column "i" does not exist in the table "foo"; only 
> "I" exists. toHiveType(int colType) then tries to auto-unbox the null into a 
> primitive int, resulting in an NPE.
> It would be better to provide a more informative message rather than an 
> unexplained NPE.

