[ https://issues.apache.org/jira/browse/HIVE-26257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
xiatch updated HIVE-26257:
--------------------------
Description:
Queries like these
# SELECT
# *
# FROM
# (
# SELECT
# C.lv_col,
# '1' AS match_col
# FROM
# (
# SELECT
# '1' AS a
# ) B LATERAL VIEW explode(split('abcd', ';')) C AS lv_col
# ) A
# LEFT JOIN
# (
# SELECT
# '1' AS match_col
# FROM
# (
# SELECT
# 'a' AS b
# ) E
# LEFT JOIN (
# SELECT
# 'a' AS c
# ) F ON E.b = F.c
# ) D ON A.match_col = D.match_col;
generate twice as many rows on Spark as on MR.
was:
Queries like these
SELECT
*
FROM
(
SELECT
C.lv_col,
'1' AS match_col
FROM
(
SELECT
'1' AS a
) B LATERAL VIEW explode(split('abcd', ';')) C AS lv_col
) A
LEFT JOIN
(
SELECT
'1' AS match_col
FROM
(
SELECT
'a' AS b
) E
LEFT JOIN (
SELECT
'a' AS c
) F ON E.b = F.c
) D ON A.match_col = D.match_col;
generate twice as many rows on Spark as on MR.
> Mapjoin with LateralViewJoin generates wrong plan in Spark
> ----------------------------------------------------------
>
> Key: HIVE-26257
> URL: https://issues.apache.org/jira/browse/HIVE-26257
> Project: Hive
> Issue Type: Bug
> Reporter: xiatch
> Assignee: xiatch
> Priority: Major
>
> Queries like these
> # SELECT
> # *
> # FROM
> # (
> # SELECT
> # C.lv_col,
> # '1' AS match_col
> # FROM
> # (
> # SELECT
> # '1' AS a
> # ) B LATERAL VIEW explode(split('abcd', ';')) C AS lv_col
> # ) A
> # LEFT JOIN
> # (
> # SELECT
> # '1' AS match_col
> # FROM
> # (
> # SELECT
> # 'a' AS b
> # ) E
> # LEFT JOIN (
> # SELECT
> # 'a' AS c
> # ) F ON E.b = F.c
> # ) D ON A.match_col = D.match_col;
> generate twice as many rows on Spark as on MR.
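
For reference, the row count this query should produce can be worked out by hand. Here is a minimal sketch that models the query in plain Python, not Hive code. The `left_join` helper and the qualified column names are hypothetical, introduced only for illustration. It assumes standard left outer join semantics and that `split('abcd', ';')` returns a single element because the input contains no `;`. It says nothing about the Spark plan itself; it only shows the expected result size.

```python
import re


def left_join(left, right, on, right_cols):
    """Left outer join over lists of dicts (hypothetical helper).

    `on` is a predicate (left_row, right_row) -> bool. Unmatched left rows
    are kept, with the right-hand columns set to None (SQL NULL).
    """
    out = []
    for l_row in left:
        matches = [r_row for r_row in right if on(l_row, r_row)]
        if matches:
            out.extend({**l_row, **r_row} for r_row in matches)
        else:
            out.append({**l_row, **{col: None for col in right_cols}})
    return out


# B: SELECT '1' AS a
B = [{"B.a": "1"}]

# A: B LATERAL VIEW explode(split('abcd', ';')) C AS lv_col, plus '1' AS match_col
A = [
    {"A.lv_col": value, "A.match_col": "1"}
    for _ in B
    for value in re.split(";", "abcd")  # no ';' in the input, so one element
]

# D: E LEFT JOIN F ON E.b = F.c, projecting '1' AS match_col
E = [{"E.b": "a"}]
F = [{"F.c": "a"}]
EF = left_join(E, F, lambda e, f: e["E.b"] == f["F.c"], ["F.c"])
D = [{"D.match_col": "1"} for _ in EF]

# Outer query: A LEFT JOIN D ON A.match_col = D.match_col
result = left_join(
    A, D, lambda a, d: a["A.match_col"] == d["D.match_col"], ["D.match_col"]
)

print(len(result))
print(result)
```

Under these assumptions the model yields one row (`abcd`, `1`, `1`), so a doubled result would contain two.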