[
https://issues.apache.org/jira/browse/HIVE-16241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Satoshi Iijima updated HIVE-16241:
----------------------------------
Description:
Example is the bellow. Each subquery 'key' column is unique. But when they are
joined on 'key' column, a result is duplicated.
{code}
CREATE TABLE test (
key STRING,
type STRING,
value INT
) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
LOAD DATA LOCAL INPATH '/tmp/test.gz' OVERWRITE INTO TABLE test;
SELECT * FROM test;
A type1 30000
B type2 20000
C type2 5000
SELECT l.*
FROM (
SELECT * FROM test LATERAL VIEW explode(ARRAY(key)) e AS dammy
) l JOIN (
SELECT key, rank() OVER (PARTITION BY type ORDER BY value DESC) rank
FROM test
) r ON l.key = r.key
A type1 30000 A
A type1 30000 A
B type2 20000 B
B type2 20000 B
C type2 5000 C
C type2 5000 C
{code}
was:
Example:
{code}
CREATE TABLE test (
key STRING,
type STRING,
value INT
) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
LOAD DATA LOCAL INPATH '/tmp/test.gz' OVERWRITE INTO TABLE test;
SELECT * FROM test;
A type1 30000
B type2 20000
C type2 5000
SELECT l.*
FROM (
SELECT * FROM test LATERAL VIEW explode(ARRAY(key)) e AS dammy
) l JOIN (
SELECT key, rank() OVER (PARTITION BY type ORDER BY value DESC) rank
FROM test
) r ON l.key = r.key
A type1 30000 A
A type1 30000 A
B type2 20000 B
B type2 20000 B
C type2 5000 C
C type2 5000 C
{code}
> When PTF, explode AND join are used together, result is duplicated
> -------------------------------------------------------------------
>
> Key: HIVE-16241
> URL: https://issues.apache.org/jira/browse/HIVE-16241
> Project: Hive
> Issue Type: Bug
> Components: PTF-Windowing, Query Processor
> Affects Versions: 2.1.0
> Reporter: Satoshi Iijima
>
> Example is the bellow. Each subquery 'key' column is unique. But when they
> are joined on 'key' column, a result is duplicated.
> {code}
> CREATE TABLE test (
> key STRING,
> type STRING,
> value INT
> ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
> LOAD DATA LOCAL INPATH '/tmp/test.gz' OVERWRITE INTO TABLE test;
> SELECT * FROM test;
> A type1 30000
> B type2 20000
> C type2 5000
> SELECT l.*
> FROM (
> SELECT * FROM test LATERAL VIEW explode(ARRAY(key)) e AS dammy
> ) l JOIN (
> SELECT key, rank() OVER (PARTITION BY type ORDER BY value DESC) rank
> FROM test
> ) r ON l.key = r.key
> A type1 30000 A
> A type1 30000 A
> B type2 20000 B
> B type2 20000 B
> C type2 5000 C
> C type2 5000 C
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)