[ 
https://issues.apache.org/jira/browse/HIVE-29367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Bereznyakov updated HIVE-29367:
------------------------------------------
    Attachment: mapjoin_stats_overflow.q.out.pre-fix.out
                HIVE-29367.patch
                HIVE-29637-mapjoin_stats_overflow.q.out.original.png
        Status: Patch Available  (was: In Progress)

The file [^mapjoin_stats_overflow.q.out.pre-fix.out] shows the original test 
output, including the decision to convert to a mapjoin.

> ConvertJoinMapJoin.computeOnlineDataSizeGeneric() overflow on large tables
> --------------------------------------------------------------------------
>
>                 Key: HIVE-29367
>                 URL: https://issues.apache.org/jira/browse/HIVE-29367
>             Project: Hive
>          Issue Type: Bug
>         Environment: Execute the attached query file 
> [^mapjoin_negative_overflow.q] and see that its output 
> [^mapjoin_negative_overflow.q.out] shows mapjoin conversion for *both* 
> queries, even though t1 is massive in the second query.
>            Reporter: Konstantin Bereznyakov
>            Assignee: Konstantin Bereznyakov
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-29367.patch, 
> HIVE-29637-mapjoin_stats_overflow.q.out.original.png, 
> mapjoin_negative_overflow.q, mapjoin_negative_overflow.q.out, 
> mapjoin_stats_overflow.q.out.pre-fix.out
>
>
> Once the value of onlineDataSize becomes really large, the following line of 
> code:
> https://github.com/apache/hive/blob/2cd59de300e9dae3fe1d6d2538efdbf56f80c763/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L366
> and the two lines after it can overflow past Long.MAX_VALUE. 
> As a result, the computed value can become negative.
> Therefore, the inputSize value assigned here: 
> [https://github.com/apache/hive/blob/2cd59de300e9dae3fe1d6d2538efdbf56f80c763/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1200]
>  becomes negative, and the large table is deemed to fit into 
> memory here: 
> [https://github.com/apache/hive/blob/2cd59de300e9dae3fe1d6d2538efdbf56f80c763/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1213]
> The ultimate outcome is that the query attempts to fit a giant table into 
> RAM and fails with an OOM.
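
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the code in ConvertJoinMapJoin and not the attached patch: the class name OverflowSketch, the helpers naiveAdd, saturatingAdd and fitsInMemory, and the 10 MB threshold are all hypothetical names and values chosen for illustration. It assumes the size is accumulated with plain long addition and then compared against a threshold with a simple "less than or equal" check. Only Math.addExact is a real JDK call.

```java
public class OverflowSketch {

    // Plain long addition: silently wraps around once the sum passes Long.MAX_VALUE.
    static long naiveAdd(long a, long b) {
        return a + b;
    }

    // Saturating addition: clamps to Long.MAX_VALUE (or Long.MIN_VALUE) instead of wrapping.
    // Math.addExact throws ArithmeticException on overflow; overflow is only possible
    // when both operands have the same sign, so the sign of 'a' tells the direction.
    static long saturatingAdd(long a, long b) {
        try {
            return Math.addExact(a, b);
        } catch (ArithmeticException e) {
            return (a > 0) ? Long.MAX_VALUE : Long.MIN_VALUE;
        }
    }

    // Hypothetical stand-in for the "does the table fit into memory" decision.
    static boolean fitsInMemory(long inputSize, long maxSize) {
        return inputSize <= maxSize;
    }

    public static void main(String[] args) {
        long hugeSize = Long.MAX_VALUE - 10; // stands in for a very large onlineDataSize
        long extra = 100;                    // any additional size pushes it past the limit
        long maxSize = 10_000_000L;          // hypothetical mapjoin threshold (10 MB)

        long wrapped = naiveAdd(hugeSize, extra);
        long clamped = saturatingAdd(hugeSize, extra);

        // The wrapped value is negative, so the huge table appears to fit.
        System.out.println("wrapped is negative: " + (wrapped < 0));
        System.out.println("wrapped fits: " + fitsInMemory(wrapped, maxSize));
        // The clamped value stays at Long.MAX_VALUE, so the table is rejected.
        System.out.println("clamped fits: " + fitsInMemory(clamped, maxSize));
    }
}
```

With the wrapped value the threshold check passes even though the real size is enormous, which matches the mapjoin conversion seen in the pre-fix output. Clamping is one possible way to keep the comparison meaningful; whether the actual fix clamps, throws, or takes another approach is not stated here.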



--
This message was sent by Atlassian Jira
(v8.20.10#820010)