Prasanth Jayachandran created HIVE-12712:
--------------------------------------------

             Summary: HiveInputFormat may fail to set column names to read in some cases
                 Key: HIVE-12712
                 URL: https://issues.apache.org/jira/browse/HIVE-12712
             Project: Hive
          Issue Type: Bug
    Affects Versions: 2.0.0, 2.1.0
            Reporter: Prasanth Jayachandran
            Assignee: Prasanth Jayachandran


The primary issue is that when the plan is generated, the pathToAliases map is 
populated with directory paths mapped to table aliases, and pathToAliases.put() 
uses path.toString() as the map key. During probing, however, 
path.toUri().toString() is used. This can cause probe misses when the path 
contains spaces: path.toUri() escapes the spaces in the path, whereas 
path.toString() does not. As a result, HiveInputFormat can take a different 
code path that fails to set the list of columns to read from the source table. 
This was causing an unexpected NPE in OrcInputFormat after the HIVE-11705 
refactoring, which removed the null check for column names. The resulting 
exception is:

{code}
Caused by: java.lang.RuntimeException: ORC split generation failed with exception: java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1288)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1354)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:367)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:457)
        at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:152)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:246)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:240)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:240)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:227)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        ... 3 more
Caused by: java.util.concurrent.ExecutionException: java.lang.NullPointerException
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1282)
        ... 15 more
Caused by: java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.extractNeededColNames(OrcInputFormat.java:422)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.extractNeededColNames(OrcInputFormat.java:417)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.access$2000(OrcInputFormat.java:134)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:1072)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:919)
        ... 4 more
{code}
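
The key mismatch described in this report can be reproduced in isolation. The following is a minimal sketch, not Hive code: it uses only java.net.URI as a stand-in for Hadoop's Path, on the assumption that path.toString() yields the unescaped form while path.toUri().toString() yields the percent-escaped form. The class name PathKeyMismatch, the helpers unescapedKey and escapedKey, and the sample directory and alias are all hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PathKeyMismatch {

    // Stand-in for path.toString() (assumption): rebuild the string from the
    // decoded URI components, so spaces stay unescaped.
    static String unescapedKey(URI uri) {
        StringBuilder sb = new StringBuilder();
        if (uri.getScheme() != null) {
            sb.append(uri.getScheme()).append(':');
        }
        if (uri.getAuthority() != null) {
            sb.append("//").append(uri.getAuthority());
        }
        if (uri.getPath() != null) {
            sb.append(uri.getPath()); // getPath() returns the decoded path
        }
        return sb.toString();
    }

    // Stand-in for path.toUri().toString() (assumption): the URI's own string
    // form, in which a space is percent-escaped as %20.
    static String escapedKey(URI uri) {
        return uri.toString();
    }

    public static void main(String[] args) throws URISyntaxException {
        // The multi-argument URI constructor quotes illegal characters such as spaces.
        URI dir = new URI("hdfs", "nn:8020", "/warehouse/my table", null, null);

        // Populate the map the way the report says the plan does: unescaped key.
        Map<String, List<String>> pathToAliases = new HashMap<>();
        pathToAliases.put(unescapedKey(dir), Collections.singletonList("t1"));

        // Probe with the escaped form: the lookup misses.
        System.out.println("probe with escaped key:   " + pathToAliases.get(escapedKey(dir)));

        // Probe with the same form used at insertion: the lookup hits.
        System.out.println("probe with unescaped key: " + pathToAliases.get(unescapedKey(dir)));
    }
}
```

For a directory without spaces the two forms coincide, which would explain why the problem only shows up "in some cases". Whichever form is chosen, using the same one for both put() and the probe avoids the miss.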



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
