Mostafa Mokhtar created HIVE-10793:
--------------------------------------

             Summary: Hybrid Hybrid Grace Hash Join : Don't allocate all hash table memory upfront
                 Key: HIVE-10793
                 URL: https://issues.apache.org/jira/browse/HIVE-10793
             Project: Hive
          Issue Type: Bug
          Components: Hive
    Affects Versions: 1.2.0
            Reporter: Mostafa Mokhtar
            Assignee: Mostafa Mokhtar
             Fix For: 1.2.1

HybridHashTableContainer allocates memory based on the estimated table size, so if the actual data size is smaller than the estimate, part of the allocated memory is never used.

The number of partitions is calculated from the estimated data size:
{code}
numPartitions = calcNumPartitions(memoryThreshold, estimatedTableSize, minNumParts, minWbSize, nwayConf);
{code}
writeBufferSize is then set based on the number of partitions:
{code}
writeBufferSize = (int)(estimatedTableSize / numPartitions);
{code}
Each hash partition allocates one WriteBuffer, and no further allocation happens if the estimated data size is correct.

The suggested solution is to reduce writeBufferSize by a factor such that only X% of the memory is preallocated.
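
To make the suggestion concrete, here is a minimal sketch of the arithmetic only. It is not the actual HybridHashTableContainer code or the committed fix. The class WriteBufferSizing, both method names, and the preallocFraction parameter (the "X%" above, expressed as a value in (0, 1]) are hypothetical names introduced for illustration. Using minWbSize as a lower bound on the reduced buffer size is also an assumption.

```java
// Hypothetical sketch of the sizing arithmetic; names are illustrative and
// do not come from the Hive code base.
final class WriteBufferSizing {
    private WriteBufferSizing() {}

    /** Behaviour described in the issue: size each buffer to the full per-partition estimate. */
    static int fullWriteBufferSize(long estimatedTableSize, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        return (int) (estimatedTableSize / numPartitions);
    }

    /**
     * Suggested behaviour: preallocate only a fraction of the per-partition estimate.
     * preallocFraction is the assumed "X%" knob, in (0, 1].
     * minWbSize is used as a floor here, which is an assumption of this sketch.
     */
    static int reducedWriteBufferSize(long estimatedTableSize, int numPartitions,
                                      double preallocFraction, int minWbSize) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        if (!(preallocFraction > 0.0 && preallocFraction <= 1.0)) {
            throw new IllegalArgumentException("preallocFraction must be in (0, 1]");
        }
        long perPartition = estimatedTableSize / numPartitions;
        long reduced = (long) (perPartition * preallocFraction);
        // Never go below the minimum write buffer size, and stay within int range.
        long clamped = Math.max(reduced, (long) minWbSize);
        return (int) Math.min(clamped, (long) Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        long estimatedTableSize = 1L << 30; // 1 GiB estimate (example value)
        int numPartitions = 16;             // example value
        int minWbSize = 512 * 1024;         // example value

        System.out.println("full:    "
            + fullWriteBufferSize(estimatedTableSize, numPartitions));
        System.out.println("reduced: "
            + reducedWriteBufferSize(estimatedTableSize, numPartitions, 0.25, minWbSize));
    }
}
```

With a smaller initial buffer, a partition whose data turns out to match the estimate would need further allocations as it fills, so the factor trades upfront memory against later allocation work.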