[
https://issues.apache.org/jira/browse/HIVE-16241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Satoshi Iijima updated HIVE-16241:
----------------------------------
Summary: When PTF, explode and join are used together, result is duplicated
(was: When PTF, explode AND join are used together, result is duplicated )
> When PTF, explode and join are used together, result is duplicated
> -------------------------------------------------------------------
>
> Key: HIVE-16241
> URL: https://issues.apache.org/jira/browse/HIVE-16241
> Project: Hive
> Issue Type: Bug
> Components: PTF-Windowing, Query Processor
> Affects Versions: 2.1.0
> Reporter: Satoshi Iijima
>
> Example is the bellow. Each subquery 'key' column is unique. But when they
> are joined on 'key' column, a result is duplicated.
> {code}
> CREATE TABLE test (
> key STRING,
> type STRING,
> value INT
> ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
> LOAD DATA LOCAL INPATH '/tmp/test.gz' OVERWRITE INTO TABLE test;
> SELECT * FROM test;
> A type1 30000
> B type2 20000
> C type2 5000
> SELECT l.*
> FROM (
> SELECT * FROM test LATERAL VIEW explode(ARRAY(key)) e AS dammy
> ) l JOIN (
> SELECT key, rank() OVER (PARTITION BY type ORDER BY value DESC) rank
> FROM test
> ) r ON l.key = r.key
> A type1 30000 A
> A type1 30000 A
> B type2 20000 B
> B type2 20000 B
> C type2 5000 C
> C type2 5000 C
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)