Hi,

The documentation describes a scenario where an SMB join leads to the same error you got. It says that changing the order of the tables in the join solves the problem.

Dudu

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+JoinOptimization#LanguageManualJoinOptimization-SMBJoinacrossTableswithDifferentKeys

SMB Join across Tables with Different Keys

If the tables have differing number of keys, for example Table A has 2 SORT columns and Table B has 1 SORT column, then you might get an index out of bounds exception.

The following query results in an index out of bounds exception because emp_person let us say for example has 1 sort column while emp_pay_history has 2 sort columns.

Error (Hive 0.11):

    SELECT p.*, py.*
    FROM emp_person p INNER JOIN emp_pay_history py
    ON p.empid = py.empid

This works fine.

Working query (Hive 0.11):

    SELECT p.*, py.*
    FROM emp_pay_history py INNER JOIN emp_person p
    ON p.empid = py.empid

From: Banias H [mailto:banias4sp...@gmail.com]
Sent: Tuesday, May 31, 2016 8:09 PM
To: user@hive.apache.org
Subject: How to disable SMB join?

Hi,

Does anybody know if there is a config setting to disable SMB join? One of our Hive queries failed with ArrayIndexOutOfBoundsException when Tez is the execution engine. The error seems to be addressed by https://issues.apache.org/jira/browse/HIVE-13282

We have Hive 1.2 and Tez 0.7 in our cluster, and the workaround suggested in the ticket is to disable SMB join. I searched around and only found the setting to convert to SMB MapJoin. Any help on disabling SMB join altogether would be appreciated.

Thanks.
-B
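
For reference, a minimal sketch of session settings that are usually associated with sort-merge-bucket join conversion in Hive. The property names come from the Hive configuration documentation. That switching them off is enough to avoid the failure described in HIVE-13282 on Hive 1.2 with Tez 0.7 is an assumption and has not been verified here, so treat this as something to try rather than a confirmed fix.

```sql
-- Assumption: these settings control SMB join conversion in this Hive version.
-- Do not let the optimizer automatically convert joins to sort-merge joins.
SET hive.auto.convert.sortmerge.join=false;

-- Do not use the sorted-merge variant of the bucket map join.
SET hive.optimize.bucketmapjoin.sortedmerge=false;

-- Then re-run the failing query in the same session.
```

Setting these per session, instead of in hive-site.xml, keeps the change limited to the affected query while testing.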