Konstantin Bereznyakov created HIVE-29367:
---------------------------------------------

             Summary: ConvertJoinMapJoin.computeOnlineDataSizeGeneric() overflow on large tables
                 Key: HIVE-29367
                 URL: https://issues.apache.org/jira/browse/HIVE-29367
             Project: Hive
          Issue Type: Bug
            Reporter: Konstantin Bereznyakov


Once the value of onlineDataSize grows large enough, the following line of 
code:
https://github.com/apache/hive/blob/2cd59de300e9dae3fe1d6d2538efdbf56f80c763/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L366
and the two lines after it can overflow past Long.MAX_VALUE. 
As a result, the computed value can wrap around and become negative.

Consequently, the inputSize value assigned here: 
[https://github.com/apache/hive/blob/2cd59de300e9dae3fe1d6d2538efdbf56f80c763/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1200]
 becomes negative, and the large table is wrongly deemed to fit into memory 
here: 
[https://github.com/apache/hive/blob/2cd59de300e9dae3fe1d6d2538efdbf56f80c763/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1213]

The end result is that the query attempts to load a very large table into 
memory and fails with an OOM error.
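
The failure mode described above can be reproduced in isolation. The 
following is a minimal sketch, not the actual Hive code: the class 
OverflowSketch, its methods, and all sample values are hypothetical and 
chosen only to show how a wrapped long passes a "fits in memory" check, and 
how saturating arithmetic (one possible mitigation, not necessarily the fix 
chosen for this issue) avoids that.

```java
// Hypothetical illustration only; none of these names come from ConvertJoinMapJoin.
final class OverflowSketch {

    // Plain long arithmetic: silently wraps around on overflow.
    static long naiveEstimate(long dataSize, long perEntryOverhead, long entries) {
        return dataSize + perEntryOverhead * entries;
    }

    // Addition that clamps to Long.MAX_VALUE / Long.MIN_VALUE instead of wrapping.
    static long saturatingAdd(long a, long b) {
        try {
            return Math.addExact(a, b);
        } catch (ArithmeticException e) {
            // Addition overflows only when both operands share a sign.
            return a < 0 ? Long.MIN_VALUE : Long.MAX_VALUE;
        }
    }

    // Multiplication that clamps instead of wrapping.
    static long saturatingMultiply(long a, long b) {
        try {
            return Math.multiplyExact(a, b);
        } catch (ArithmeticException e) {
            // The true product is negative only when the signs differ.
            return ((a < 0) != (b < 0)) ? Long.MIN_VALUE : Long.MAX_VALUE;
        }
    }

    // Same estimate as naiveEstimate, but with saturating operations.
    static long safeEstimate(long dataSize, long perEntryOverhead, long entries) {
        return saturatingAdd(dataSize, saturatingMultiply(perEntryOverhead, entries));
    }

    // Stand-in for a threshold comparison such as the one the report points at.
    static boolean fitsInMemory(long estimatedSize, long threshold) {
        return estimatedSize <= threshold;
    }

    public static void main(String[] args) {
        long threshold = 1L << 30;             // assumed 1 GiB limit
        long dataSize = Long.MAX_VALUE - 10;   // assumed, already huge statistic
        long overhead = 8;                     // assumed per-entry overhead
        long entries = 4;                      // assumed entry count

        long naive = naiveEstimate(dataSize, overhead, entries);
        long safe = safeEstimate(dataSize, overhead, entries);

        System.out.println("naive estimate negative: " + (naive < 0));
        System.out.println("naive fits in memory:    " + fitsInMemory(naive, threshold));
        System.out.println("safe fits in memory:     " + fitsInMemory(safe, threshold));
    }
}
```

With the naive estimate, the sum wraps to a negative number, so the 
threshold comparison succeeds even though the table is enormous. With the 
saturating variant, the estimate stays at Long.MAX_VALUE and the comparison 
fails as intended.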



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
