[
https://issues.apache.org/jira/browse/HIVE-26257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
xiatch updated HIVE-26257:
--------------------------
Description:
Queries like the following
{code:sql}
SELECT
  *
FROM
  (
    SELECT
      C.lv_col,
      '1' AS match_col
    FROM
      (
        SELECT
          '1' AS a
      ) B LATERAL VIEW explode(split('abcd', ';')) C AS lv_col
  ) A
  LEFT JOIN
  (
    SELECT
      '1' AS match_col
    FROM
      (
        SELECT
          'a' AS b
      ) E
      LEFT JOIN (
        SELECT
          'a' AS c
      ) F ON E.b = F.c
  ) D ON A.match_col = D.match_col;
{code}
generate twice as many rows on Spark as on MapReduce (MR).
was:
Queries like these
{{SELECT}}
{{ *}}
{{FROM}}
{{ (}}
{{ SELECT}}
{{ C.lv_col,}}
{{ '1' AS match_col}}
{{ FROM}}
{{ (}}
{{ SELECT}}
{{ '1' AS a}}
{{ ) B LATERAL VIEW explode(split('abcd', ';')) C AS lv_col}}
{{ ) A}}
{{ LEFT JOIN}}
{{ (}}
{{ SELECT}}
{{ '1' AS match_col}}
{{ FROM}}
{{ (}}
{{ SELECT}}
{{ 'a' AS b}}
{{ ) E}}
{{ LEFT JOIN (}}
{{ SELECT}}
{{ 'a' AS c}}
{{ ) F ON E.b = F.c}}
{{ ) D ON A.match_col = D.match_col;}}
generates twice the number of rows in Spark when compared to MR.
> Mapjoin with LateralViewJoin generates wrong plan in Spark
> ----------------------------------------------------------
>
> Key: HIVE-26257
> URL: https://issues.apache.org/jira/browse/HIVE-26257
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.0.0
> Reporter: xiatch
> Assignee: xiatch
> Priority: Major
>
> Queries like the following
>
> {code:sql}
> SELECT
>   *
> FROM
>   (
>     SELECT
>       C.lv_col,
>       '1' AS match_col
>     FROM
>       (
>         SELECT
>           '1' AS a
>       ) B LATERAL VIEW explode(split('abcd', ';')) C AS lv_col
>   ) A
>   LEFT JOIN
>   (
>     SELECT
>       '1' AS match_col
>     FROM
>       (
>         SELECT
>           'a' AS b
>       ) E
>       LEFT JOIN (
>         SELECT
>           'a' AS c
>       ) F ON E.b = F.c
>   ) D ON A.match_col = D.match_col;
> {code}
>
> generate twice as many rows on Spark as on MapReduce (MR).
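>
> One way to compare the two engines is to run the query above once under each value of {{hive.execution.engine}}. This is only a sketch and assumes a session in which both engines are configured. By standard join semantics the query should return a single row, because {{split('abcd', ';')}} yields a one-element array and each side of the outer join has exactly one row.
>
> {code:sql}
> -- Baseline: run on MapReduce and note the row count.
> SET hive.execution.engine=mr;
> -- (run the query from the description here)
>
> -- Same query on Spark; per this report it returns twice as many rows.
> SET hive.execution.engine=spark;
> -- (run the query from the description here)
> {code}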