[ https://issues.apache.org/jira/browse/HIVE-10083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alain Schröder updated HIVE-10083:
----------------------------------
Affects Version/s: 0.13.0 (was: 0.13.1)
> SMBJoin fails in case one table is uninitialized
> ------------------------------------------------
>
> Key: HIVE-10083
> URL: https://issues.apache.org/jira/browse/HIVE-10083
> Project: Hive
> Issue Type: Bug
> Components: Logical Optimizer
> Affects Versions: 0.13.0
> Environment: MapR Hive 0.13
> Reporter: Alain Schröder
> Priority: Minor
>
> We experience an IndexOutOfBoundsException in an SMBJoin when only one of the
> tables used for the JOIN is uninitialized. Everything works if both are
> uninitialized or both are initialized.
> {code}
> 2015-03-24 09:12:58,967 ERROR [main]: ql.Driver (SessionState.java:printError(545)) - FAILED: IndexOutOfBoundsException Index: 0, Size: 0
> java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
> at java.util.ArrayList.rangeCheck(ArrayList.java:635)
> at java.util.ArrayList.get(ArrayList.java:411)
> at org.apache.hadoop.hive.ql.optimizer.AbstractBucketJoinProc.fillMappingBigTableBucketFileNameToSmallTableBucketFileNames(AbstractBucketJoinProc.java:486)
> at org.apache.hadoop.hive.ql.optimizer.AbstractBucketJoinProc.convertMapJoinToBucketMapJoin(AbstractBucketJoinProc.java:429)
> at org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.convertJoinToBucketMapJoin(AbstractSMBJoinProc.java:540)
> at org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.convertJoinToSMBJoin(AbstractSMBJoinProc.java:549)
> at org.apache.hadoop.hive.ql.optimizer.SortedMergeJoinProc.process(SortedMergeJoinProc.java:51)
> {code}
> Simplest way to reproduce:
> {code}
> SET hive.enforce.sorting=true;
> SET hive.enforce.bucketing=true;
> SET hive.exec.dynamic.partition=true;
> SET mapreduce.reduce.import.limit=-1;
> SET hive.optimize.bucketmapjoin=true;
> SET hive.optimize.bucketmapjoin.sortedmerge=true;
> SET hive.auto.convert.join=true;
> SET hive.auto.convert.sortmerge.join=true;
> SET hive.auto.convert.sortmerge.join.noconditionaltask=true;
> CREATE DATABASE IF NOT EXISTS tmp;
> USE tmp;
> CREATE TABLE `test1` (
> `foo` bigint )
> CLUSTERED BY (
> foo)
> SORTED BY (
> foo ASC)
> INTO 384 BUCKETS
> STORED AS ORC;
> CREATE TABLE `test2`(
> `foo` bigint )
> CLUSTERED BY (
> foo)
> SORTED BY (
> foo ASC)
> INTO 384 BUCKETS
> STORED AS ORC;
> -- Initialize only ONE of the two tables with some data.
> INSERT INTO TABLE test1 SELECT foo FROM table_with_some_content LIMIT 100;
> SELECT t1.foo, t2.foo
> FROM test1 t1 INNER JOIN test2 t2
> ON (t1.foo = t2.foo);
> {code}
> I took a look at the method
> fillMappingBigTableBucketFileNameToSmallTableBucketFileNames in
> AbstractBucketJoinProc.java, and it does not seem to have changed between our
> MapR Hive 0.13 and the current snapshot, so this error should also be present in
> the current version.
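
The following is a minimal sketch of one plausible way the reported exception could arise; it is not the Hive source. It assumes the mapping step iterates over the bucket count declared in the table metadata (384 in the reproduction) while indexing into the list of bucket files actually present, which is empty for a table that has never been written to. The class BucketMappingSketch and the methods naiveMapping and guardedMapping are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; names and logic are assumptions, not Hive code.
final class BucketMappingSketch {

    // Trusts the declared bucket count and indexes into the file list directly.
    // If the table has no bucket files yet, get(s) throws IndexOutOfBoundsException.
    static List<String> naiveMapping(List<String> smallTableFiles,
                                     int declaredSmallBuckets,
                                     int bigBuckets,
                                     int bigBucketIndex) {
        List<String> result = new ArrayList<>();
        for (int s = 0; s < declaredSmallBuckets; s++) {
            if (s % bigBuckets == bigBucketIndex) {
                result.add(smallTableFiles.get(s));
            }
        }
        return result;
    }

    // Same mapping, but returns an empty result when fewer files exist
    // than the metadata declares, instead of indexing past the end.
    static List<String> guardedMapping(List<String> smallTableFiles,
                                       int declaredSmallBuckets,
                                       int bigBuckets,
                                       int bigBucketIndex) {
        if (smallTableFiles.size() < declaredSmallBuckets) {
            return Collections.emptyList();
        }
        return naiveMapping(smallTableFiles, declaredSmallBuckets, bigBuckets, bigBucketIndex);
    }
}
```

Under these assumptions, calling naiveMapping with an empty file list and 384 declared buckets fails on the first lookup, matching the shape of the stack trace above, while guardedMapping returns an empty list. Whether an empty mapping is the right behavior for the optimizer, or whether the SMB conversion should be skipped entirely in that case, is a design question for the actual fix.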