Clemens Valiente created HIVE-20962:
---------------------------------------

             Summary: CommonMergeJoinOperator cannot join on complex keys
                 Key: HIVE-20962
                 URL: https://issues.apache.org/jira/browse/HIVE-20962
             Project: Hive
          Issue Type: Bug
          Components: Query Processor
    Affects Versions: 2.3.4
            Reporter: Clemens Valiente


CommonMergeJoinOperator fails to perform joins on complex keys, e.g.

{code:sql}
CREATE TABLE complex_key (
  `key` struct<id:bigint, country:string>,
  value int)
PARTITIONED BY (date int);

SELECT t1.key, t1.value, t2.value
FROM complex_key t1
FULL OUTER JOIN complex_key t2
ON (t1.date=20181121 and t2.date =20181122 AND t1.key=t2.key);
{code}

This causes a ClassCastException:

{code:java}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=1) {"key":{"reducesinkkey0":{"id":1,"country":"DK"}},"value":{"_col0":1489}}
	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:357)
	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:279)
	... 22 more
Caused by: java.lang.ClassCastException: java.util.ArrayList cannot be cast to org.apache.hadoop.io.WritableComparable
	at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.compareKeys(CommonMergeJoinOperator.java:543)
	at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.processKey(CommonMergeJoinOperator.java:516)
	at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:212)
	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:348)
{code}

This happens because the compareKeys() method tries to cast each key to a WritableComparable, but for a struct key the StandardStructObjectInspector, for example, returns the key field as an ArrayList.
https://github.com/apache/hive/blob/66f97da9de65b1c7151ec57bdf9ada937855bd75/ql/src/java/org/apache/hadoop/hive/ql/exec/CommonMergeJoinOperator.java#L590

The proper way to do it would probably be to use the KeyWrapperFactory to convert the keys into something easily comparable?
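
To illustrate the idea of comparing keys without casting every field to a single comparable type, here is a minimal plain-Java sketch. It does not use Hive's KeyWrapperFactory or any ObjectInspector; the class and method names (ComplexKeyCompare, compareValue, compareKeys) are hypothetical and only mirror the shape of the problem: a key is a list of fields, and a struct field is itself a list, so the comparison has to recurse instead of casting. The null ordering is an arbitrary choice for the sketch and is not meant to reflect Hive's join semantics for NULL keys.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the key comparison; not Hive code.
class ComplexKeyCompare {

    // Compares two key values that are either Comparable primitives
    // or struct-like values represented as Lists (as in the stack trace).
    @SuppressWarnings({"unchecked", "rawtypes"})
    static int compareValue(Object a, Object b) {
        if (a == null || b == null) {
            // Assumption for this sketch only: nulls are equal and sort first.
            if (a == b) {
                return 0;
            }
            return a == null ? -1 : 1;
        }
        if (a instanceof List && b instanceof List) {
            List<?> la = (List<?>) a;
            List<?> lb = (List<?>) b;
            int n = Math.min(la.size(), lb.size());
            // Field-by-field comparison, recursing into nested structs.
            for (int i = 0; i < n; i++) {
                int c = compareValue(la.get(i), lb.get(i));
                if (c != 0) {
                    return c;
                }
            }
            return Integer.compare(la.size(), lb.size());
        }
        if (a instanceof Comparable && a.getClass() == b.getClass()) {
            return ((Comparable) a).compareTo(b);
        }
        throw new IllegalArgumentException(
            "Cannot compare " + a.getClass() + " with " + b.getClass());
    }

    // A join key is a list of key fields; any field may itself be a list.
    static int compareKeys(List<Object> k1, List<Object> k2) {
        return compareValue(k1, k2);
    }

    public static void main(String[] args) {
        // One key column holding a struct<id:bigint, country:string>.
        List<Object> left = Arrays.<Object>asList(Arrays.<Object>asList(1L, "DK"));
        List<Object> right = Arrays.<Object>asList(Arrays.<Object>asList(1L, "DK"));
        System.out.println(compareKeys(left, right)); // prints 0
    }
}
```

An actual fix inside Hive would presumably drive the comparison from the ObjectInspectors of the key columns rather than from runtime instanceof checks; the sketch only shows why a recursive comparison avoids the ClassCastException.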