[ 
https://issues.apache.org/jira/browse/HIVE-26257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiatch updated HIVE-26257:
--------------------------
    Description: 
Queries like this one:

SELECT
  *
FROM
  (
    SELECT
      C.lv_col,
      '1' AS match_col
    FROM
      (
        SELECT
          '1' AS a
      ) B LATERAL VIEW explode(split('abcd', ';')) C AS lv_col
  ) A
  LEFT JOIN
  (
    SELECT
      '1' AS match_col
    FROM
      (
        SELECT
          'a' AS b
      ) E
      LEFT JOIN (
        SELECT
          'a' AS c
      ) F ON E.b = F.c
  ) D ON A.match_col = D.match_col;

generate twice as many rows on Spark as on MR.

  was:
Queries like this one:

SELECT
  *
FROM
  (
    SELECT
      C.lv_col,
      '1' AS match_col
    FROM
      (
        SELECT
          '1' AS a
      ) B LATERAL VIEW explode(split('abcd', ';')) C AS lv_col
  ) A
  LEFT JOIN
  (
    SELECT
      '1' AS match_col
    FROM
      (
        SELECT
          'a' AS b
      ) E
      LEFT JOIN (
        SELECT
          'a' AS c
      ) F ON E.b = F.c
  ) D ON A.match_col = D.match_col;

generate twice as many rows on Spark as on MR.


> Mapjoin with LateralViewJoin generates wrong plan in Spark
> ----------------------------------------------------------
>
>                 Key: HIVE-26257
>                 URL: https://issues.apache.org/jira/browse/HIVE-26257
>             Project: Hive
>          Issue Type: Bug
>            Reporter: xiatch
>            Assignee: xiatch
>            Priority: Major
>
> Queries like this one:
>
> SELECT
>   *
> FROM
>   (
>     SELECT
>       C.lv_col,
>       '1' AS match_col
>     FROM
>       (
>         SELECT
>           '1' AS a
>       ) B LATERAL VIEW explode(split('abcd', ';')) C AS lv_col
>   ) A
>   LEFT JOIN
>   (
>     SELECT
>       '1' AS match_col
>     FROM
>       (
>         SELECT
>           'a' AS b
>       ) E
>       LEFT JOIN (
>         SELECT
>           'a' AS c
>       ) F ON E.b = F.c
>   ) D ON A.match_col = D.match_col;
>
> generate twice as many rows on Spark as on MR.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
