Stamatis Zampetakis created HIVE-26350:
------------------------------------------

             Summary: IndexOutOfBoundsException when generating splits for external JDBC table with partition columns
                 Key: HIVE-26350
                 URL: https://issues.apache.org/jira/browse/HIVE-26350
             Project: Hive
          Issue Type: Bug
          Components: CBO, JDBC storage handler
            Reporter: Stamatis Zampetakis


Create the following table in some JDBC database (e.g., Postgres).

{code:sql}
CREATE TABLE country
(
    id   int,
    name varchar(20)
);
{code}

Create the following tables in Hive, ensuring that the external JDBC table has 
the {{hive.sql.partitionColumn}} table property set.

{code:sql}
CREATE TABLE city (id int);

CREATE EXTERNAL TABLE country
(
    id int,
    name varchar(20)
)
STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
TBLPROPERTIES (
    "hive.sql.database.type" = "POSTGRES",
    "hive.sql.jdbc.driver" = "org.postgresql.Driver",
    "hive.sql.jdbc.url" = "jdbc:postgresql://localhost:5432/qtestDB",
    "hive.sql.dbcp.username" = "qtestuser",
    "hive.sql.dbcp.password" = "qtestpassword",
    "hive.sql.table" = "country",
    "hive.sql.partitionColumn" = "name",
    "hive.sql.numPartitions" = "2"
);
{code}

The query below fails with an {{IndexOutOfBoundsException}} when split 
generation for the mapper scanning the JDBC table tries to exploit the 
partitioning column.

{code:sql}
select country.id from country cross join city;
{code}
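
As a possible aid for narrowing this down (a sketch only, not verified against 
this setup): prefixing the failing query with {{EXPLAIN}} only compiles it, so 
the plan and the scan of the JDBC table can be inspected without reaching split 
generation. The second statement is a hypothetical variant that also projects 
the partition column {{name}}; whether it avoids the exception is an assumption 
to be checked, not an observed result.

{code:sql}
-- Inspect the plan of the failing query; EXPLAIN does not launch the job.
explain select country.id from country cross join city;

-- Hypothetical variant (assumption): also project the partition column.
select country.id, country.name from country cross join city;
{code}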

The full stack trace is given below.
{noformat}
java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
        at java.util.ArrayList.rangeCheck(ArrayList.java:659) ~[?:1.8.0_261]
        at java.util.ArrayList.get(ArrayList.java:435) ~[?:1.8.0_261]
        at org.apache.hive.storage.jdbc.JdbcInputFormat.getSplits(JdbcInputFormat.java:102) [hive-jdbc-handler-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:564) [hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:858) [hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
        at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:263) [hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:281) [tez-dag-0.10.1.jar:0.10.1]
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:272) [tez-dag-0.10.1.jar:0.10.1]
        at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_261]
        at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_261]
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) [hadoop-common-3.1.0.jar:?]
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:272) [tez-dag-0.10.1.jar:0.10.1]
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:256) [tez-dag-0.10.1.jar:0.10.1]
        at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108) [guava-19.0.jar:?]
        at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41) [guava-19.0.jar:?]
        at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77) [guava-19.0.jar:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_261]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_261]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_261]
{noformat}




--
This message was sent by Atlassian Jira
(v8.20.7#820007)
