[ 
https://issues.apache.org/jira/browse/HIVE-26350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17557529#comment-17557529
 ] 

Stamatis Zampetakis commented on HIVE-26350:
--------------------------------------------

{noformat}
HiveProject(id=[$0])
  HiveJdbcConverter(convention=[JDBC.POSTGRES])
    JdbcProject(id=[$0])
      JdbcHiveTableScan(table=[[default, country]], table:alias=[country])
{noformat}

The problem seems to start from the subplan above and the fact that HiveProject
is executed in the same mapper as the Jdbc operators. It appears that, because
of the projection, the column type list has a single entry (int)

{code:java}
List<TypeInfo> hiveColumnTypesList =
    TypeInfoUtils.getTypeInfosFromTypeString(job.get(serdeConstants.LIST_COLUMN_TYPES));
{code}

and so does not contain the partitioning column, which leads to the
{{IndexOutOfBoundsException}} when we attempt to look up its type.

{code:java}
TypeInfo typeInfo =
    hiveColumnTypesList.get(columnNames.indexOf(partitionColumn));
{code}
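
To make the failure mode concrete, here is a minimal standalone sketch that
does not depend on Hive. It assumes the name list still holds both columns
(id, name) while the type list holds only the projected one (int), which would
be consistent with the "Index: 1, Size: 1" message in the stack trace. The
class and method names (PartitionTypeLookup, uncheckedLookup, checkedLookup)
are hypothetical and are not part of JdbcInputFormat; the bounds check is only
one possible way to guard the lookup, not a proposed fix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration; plain strings stand in for Hive's TypeInfo.
public class PartitionTypeLookup {

    // Same shape as the failing lookup: the index found in the name list
    // is applied to the type list without checking that it is in range.
    public static String uncheckedLookup(List<String> columnNames,
                                         List<String> columnTypes,
                                         String partitionColumn) {
        return columnTypes.get(columnNames.indexOf(partitionColumn));
    }

    // Guarded variant: returns empty when the column is unknown or when
    // the type list is shorter than the name list.
    public static Optional<String> checkedLookup(List<String> columnNames,
                                                 List<String> columnTypes,
                                                 String partitionColumn) {
        int idx = columnNames.indexOf(partitionColumn);
        if (idx < 0 || idx >= columnTypes.size()) {
            return Optional.empty();
        }
        return Optional.of(columnTypes.get(idx));
    }

    public static void main(String[] args) {
        // Assumed state: two column names, but only one projected type.
        List<String> columnNames = Arrays.asList("id", "name");
        List<String> columnTypes = new ArrayList<>(Arrays.asList("int"));

        try {
            uncheckedLookup(columnNames, columnTypes, "name");
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unchecked lookup failed: " + e.getMessage());
        }

        System.out.println("checked lookup found a type: "
            + checkedLookup(columnNames, columnTypes, "name").isPresent());
    }
}
```

With the guarded variant the caller can decide what to do when the type is
missing (for example, skip partitioned split generation or report a clearer
error) instead of hitting the exception inside ArrayList.get.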



> IndexOutOfBoundsException when generating splits for external JDBC table with 
> partition columns
> -----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-26350
>                 URL: https://issues.apache.org/jira/browse/HIVE-26350
>             Project: Hive
>          Issue Type: Bug
>          Components: CBO, JDBC storage handler
>            Reporter: Stamatis Zampetakis
>            Priority: Major
>         Attachments: cbo_plan.txt, explain_plan.txt, 
> jdbc_join_with_partition_table.q
>
>
> Create the following table in some JDBC database (e.g., Postgres).
> {code:sql}
> CREATE TABLE country
> (
>     id   int,
>     name varchar(20)
> );
> {code}
> Create the following tables in Hive ensuring that the external JDBC table has 
> the {{hive.sql.partitionColumn}} table property set.
> {code:sql}
> CREATE TABLE city (id int);
> CREATE EXTERNAL TABLE country
> (
>     id int,
>     name varchar(20)
> )
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES (
>     "hive.sql.database.type" = "POSTGRES",
>     "hive.sql.jdbc.driver" = "org.postgresql.Driver",
>     "hive.sql.jdbc.url" = "jdbc:postgresql://localhost:5432/qtestDB",
>     "hive.sql.dbcp.username" = "qtestuser",
>     "hive.sql.dbcp.password" = "qtestpassword",
>     "hive.sql.table" = "country",
>     "hive.sql.partitionColumn" = "name",
>     "hive.sql.numPartitions" = "2"
> );
> {code}
> The query below fails with IndexOutOfBoundsException when the mapper scanning 
> the JDBC table tries to generate the splits by exploiting the partitioning 
> column.
> {code:sql}
> select country.id from country cross join city;
> {code}
> The full stack trace is given below.
> {noformat}
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>         at java.util.ArrayList.rangeCheck(ArrayList.java:659) ~[?:1.8.0_261]
>         at java.util.ArrayList.get(ArrayList.java:435) ~[?:1.8.0_261]
>         at org.apache.hive.storage.jdbc.JdbcInputFormat.getSplits(JdbcInputFormat.java:102) [hive-jdbc-handler-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:564) [hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:858) [hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:263) [hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
>         at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:281) [tez-dag-0.10.1.jar:0.10.1]
>         at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:272) [tez-dag-0.10.1.jar:0.10.1]
>         at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_261]
>         at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_261]
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) [hadoop-common-3.1.0.jar:?]
>         at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:272) [tez-dag-0.10.1.jar:0.10.1]
>         at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:256) [tez-dag-0.10.1.jar:0.10.1]
>         at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108) [guava-19.0.jar:?]
>         at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41) [guava-19.0.jar:?]
>         at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77) [guava-19.0.jar:?]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_261]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_261]
>         at java.lang.Thread.run(Thread.java:748) [?:1.8.0_261]
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
