[ https://issues.apache.org/jira/browse/HIVE-26257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

xiatch updated HIVE-26257:
--------------------------
    Description: 
Queries like these

 
{code:sql}
SELECT
  *
FROM
  (
    SELECT
      C.lv_col,
      '1' AS match_col
    FROM
      (
        SELECT
          '1' AS a
      ) B LATERAL VIEW explode(split('abcd', ';')) C AS lv_col
  ) A
  LEFT JOIN
  (
    SELECT
      '1' AS match_col
    FROM
      (
        SELECT
          'a' AS b
      ) E
      LEFT JOIN (
        SELECT
          'a' AS c
      ) F ON E.b = F.c
  ) D ON A.match_col = D.match_col;
{code}
 

generate twice as many rows on Spark as on MR (MapReduce).
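
One way to compare the two engines is to run the same query under each and compare the number of rows returned. The following is only a sketch, assuming a Hive session in which both engines are configured. {{hive.execution.engine}} and {{hive.auto.convert.join}} are standard Hive properties, but turning off map-join conversion as a diagnostic step is an assumption and is not confirmed anywhere in this report. By ordinary SQL semantics, {{split('abcd', ';')}} yields a single element, so one result row would be expected.

{code:sql}
-- Run once with mr, then again with spark, and compare the row counts.
SET hive.execution.engine=mr;
-- SET hive.execution.engine=spark;

-- Assumption (not from the report): disabling automatic map-join conversion
-- may help show whether the map join is involved in the extra rows.
-- SET hive.auto.convert.join=false;

SELECT
  *
FROM
  (
    SELECT
      C.lv_col,
      '1' AS match_col
    FROM
      (
        SELECT
          '1' AS a
      ) B LATERAL VIEW explode(split('abcd', ';')) C AS lv_col
  ) A
  LEFT JOIN
  (
    SELECT
      '1' AS match_col
    FROM
      (
        SELECT
          'a' AS b
      ) E
      LEFT JOIN (
        SELECT
          'a' AS c
      ) F ON E.b = F.c
  ) D ON A.match_col = D.match_col;
{code}

Prefixing the query with {{EXPLAIN}} under each engine shows the generated plan, which is where the title of this issue places the problem.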

  was:
Queries like these

{{SELECT}}
{{  *}}
{{FROM}}
{{  (}}
{{    SELECT}}
{{      C.lv_col,}}
{{      '1' AS match_col}}
{{    FROM}}
{{      (}}
{{        SELECT}}
{{          '1' AS a}}
{{      ) B LATERAL VIEW explode(split('abcd', ';')) C AS lv_col}}
{{  ) A}}
{{  LEFT JOIN}}
{{  (}}
{{    SELECT}}
{{      '1' AS match_col}}
{{    FROM}}
{{      (}}
{{        SELECT}}
{{          'a' AS b}}
{{      ) E}}
{{      LEFT JOIN (}}
{{        SELECT}}
{{          'a' AS c}}
{{      ) F ON E.b = F.c}}
{{  ) D ON A.match_col = D.match_col;}}

generate twice as many rows on Spark as on MR (MapReduce).


> Mapjoin with LateralViewJoin generates wrong plan in Spark
> ----------------------------------------------------------
>
>                 Key: HIVE-26257
>                 URL: https://issues.apache.org/jira/browse/HIVE-26257
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 3.0.0
>            Reporter: xiatch
>            Assignee: xiatch
>            Priority: Major
>
> Queries like these
>  
> {code:sql}
> SELECT
>   *
> FROM
>   (
>     SELECT
>       C.lv_col,
>       '1' AS match_col
>     FROM
>       (
>         SELECT
>           '1' AS a
>       ) B LATERAL VIEW explode(split('abcd', ';')) C AS lv_col
>   ) A
>   LEFT JOIN
>   (
>     SELECT
>       '1' AS match_col
>     FROM
>       (
>         SELECT
>           'a' AS b
>       ) E
>       LEFT JOIN (
>         SELECT
>           'a' AS c
>       ) F ON E.b = F.c
>   ) D ON A.match_col = D.match_col;
> {code}
>  
> generate twice as many rows on Spark as on MR (MapReduce).



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
