[
https://issues.apache.org/jira/browse/SQOOP-481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266776#comment-13266776
]
Cheolsoo Park edited comment on SQOOP-481 at 5/2/12 6:04 PM:
-------------------------------------------------------------
Hi Arvind and Jarece, thank you for your suggestions!
Indeed, failing fast is a good thing to do, and we should always prevent maps
from having null values. In fact, fail-fast logic is already in place. For
example, we call valueOf() when putting a new value into a map, and valueOf
does not return a null.
{code:title=protected Map<String, Integer> getColumnTypesForRawQuery(String
stmt)}
colTypes.put(colName, Integer.valueOf(typeId));
{code}
But the problem remains because a map may still lack a specific key, which
cannot be detected until a look-up happens. I guess that the real problem
that I am raising in this jira is *not auto-boxing but auto-unboxing*. For
auto-unboxing, the earliest point at which we can fail is right after the
get() call returns. So I believe that adding a null check after get() is the
best we can do.
{code}
Integer sqlType = columnTypes.get(col);
if (sqlType == null) {
throw new IOException("Column " + col + " does not exist in table " +
tableName);
}
String javaType = toJavaType(col, sqlType);
{code}
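To make the difference concrete, here is a minimal, self-contained sketch. It is not Sqoop code: the class ColumnTypeLookup and the helpers describeType() and requireSqlType() are hypothetical names, with describeType(int) standing in for any method that takes a primitive int. It assumes only that the map was populated with a key of different case ("I") than the one looked up ("i").

```java
import java.io.IOException;
import java.sql.Types;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Sqoop.
class ColumnTypeLookup {

    // Stand-in for a method that takes a primitive int, so passing an
    // Integer forces auto-unboxing at the call site.
    static String describeType(int sqlType) {
        return sqlType == Types.INTEGER ? "INT" : "STRING";
    }

    // Looks up the SQL type and fails with a descriptive message when the
    // key is missing, instead of letting a null reach an unboxing site.
    static int requireSqlType(Map<String, Integer> columnTypes, String col,
                              String tableName) throws IOException {
        Integer sqlType = columnTypes.get(col);
        if (sqlType == null) {
            throw new IOException("Column " + col
                + " does not exist in table " + tableName);
        }
        return sqlType;
    }

    public static void main(String[] args) {
        Map<String, Integer> columnTypes = new HashMap<>();
        // valueOf() never returns null, so the map holds no null values.
        columnTypes.put("I", Integer.valueOf(Types.INTEGER));

        try {
            // Key "i" is absent: get() returns null, and unboxing it throws.
            describeType(columnTypes.get("i"));
        } catch (NullPointerException e) {
            System.out.println("Unguarded lookup: NullPointerException");
        }

        try {
            describeType(requireSqlType(columnTypes, "i", "foo"));
        } catch (IOException e) {
            System.out.println("Guarded lookup: " + e.getMessage());
        }
    }
}
```

The unguarded call fails with a bare NullPointerException even though no null was ever put into the map, while the guarded call reports which column and table were involved.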
Please correct me if I have misunderstood your suggestions.
> Sqoop import with --hive-import using wrong column names in --columns throws
> a NPE
> ----------------------------------------------------------------------------------
>
> Key: SQOOP-481
> URL: https://issues.apache.org/jira/browse/SQOOP-481
> Project: Sqoop
> Issue Type: Bug
> Reporter: Cheolsoo Park
> Assignee: Cheolsoo Park
>
> To reproduce the error,
> 1) Create a table "foo" with a column name "I" on Oracle DB
> 2) Run sqoop import --connect jdbc:oracle:thin:@//localhost/xe --username
> **** --password **** --verbose --table foo --split-by i --columns i
> --hive-import
> This generates the following call stack:
> {code}
> 12/05/01 16:12:00 ERROR sqoop.Sqoop: Got exception running Sqoop:
> java.lang.NullPointerException
> java.lang.NullPointerException
> at
> com.cloudera.sqoop.hive.TableDefWriter.getCreateTableStmt(TableDefWriter.java:162)
> at com.cloudera.sqoop.hive.HiveImport.importTable(HiveImport.java:195)
> at com.cloudera.sqoop.tool.ImportTool.importTable(ImportTool.java:394)
> at com.cloudera.sqoop.tool.ImportTool.run(ImportTool.java:455)
> at com.cloudera.sqoop.Sqoop.run(Sqoop.java:146)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:182)
> at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:221)
> at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:230)
> at com.cloudera.sqoop.Sqoop.main(Sqoop.java:239)
> {code}
> The reason is simple. In the following lines of code:
> {code}
> Integer colType = columnTypes.get(col);
> ...
> String hiveColType = connManager.toHiveType(colType);
> {code}
> colType is null because column "i" does not exist in the table "foo", but "I"
> does. Now toHiveType(int colType) tries to auto-unbox a null into a
> primitive int, resulting in an NPE.
> It would be better to provide a more informative message rather than an
> opaque NPE.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira