[ 
https://issues.apache.org/jira/browse/HIVE-16241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Satoshi Iijima updated HIVE-16241:
----------------------------------
    Summary: When PTF, explode and join are used together, result is duplicated 
  (was: When PTF, explode AND join are used together, result is duplicated )

> When PTF, explode and join are used together, result is duplicated 
> -------------------------------------------------------------------
>
>                 Key: HIVE-16241
>                 URL: https://issues.apache.org/jira/browse/HIVE-16241
>             Project: Hive
>          Issue Type: Bug
>          Components: PTF-Windowing, Query Processor
>    Affects Versions: 2.1.0
>            Reporter: Satoshi Iijima
>
> Example is the bellow. Each subquery 'key' column is unique. But when they 
> are joined on 'key' column, a result is duplicated.
> {code}
> CREATE TABLE test (
>   key   STRING,
>   type  STRING,
>   value INT
> ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
> LOAD DATA LOCAL INPATH '/tmp/test.gz' OVERWRITE INTO TABLE test;
> SELECT * FROM test;
> A     type1   30000
> B     type2   20000
> C     type2   5000
> SELECT l.*
> FROM (
>   SELECT * FROM test LATERAL VIEW explode(ARRAY(key)) e AS dammy
> ) l JOIN (
>     SELECT key, rank() OVER (PARTITION BY type ORDER BY value DESC) rank 
>     FROM test
> ) r ON l.key = r.key
> A     type1   30000   A
> A     type1   30000   A
> B     type2   20000   B
> B     type2   20000   B
> C     type2   5000    C
> C     type2   5000    C
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to