Mostafa Mokhtar created HIVE-10793:
--------------------------------------

             Summary: Hybrid Hybrid Grace Hash Join : Don't allocate all hash table memory upfront
                 Key: HIVE-10793
                 URL: https://issues.apache.org/jira/browse/HIVE-10793
             Project: Hive
          Issue Type: Bug
          Components: Hive
    Affects Versions: 1.2.0
            Reporter: Mostafa Mokhtar
            Assignee: Mostafa Mokhtar
             Fix For: 1.2.1


HybridHashTableContainer allocates memory based on the estimated table size, 
which means that if the actual size is smaller than the estimate, part of the 
allocated memory is never used.

The number of partitions is calculated from the estimated data size:
{code}
numPartitions = calcNumPartitions(memoryThreshold, estimatedTableSize,
    minNumParts, minWbSize, nwayConf);
{code}

Then writeBufferSize is set based on the number of partitions:

{code}
writeBufferSize = (int)(estimatedTableSize / numPartitions);
{code}

Each hash partition allocates one WriteBuffer up front, so together the 
partitions preallocate roughly the entire estimated table size. No further 
allocation happens if the estimated data size is correct.
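
To make the cost concrete, here is a rough sketch of the arithmetic. It reuses only the writeBufferSize formula quoted above; the class name, helper names and the sample sizes are hypothetical, and it ignores any rounding or capping the real HybridHashTableContainer may apply.

```java
// Hypothetical illustration only; not Hive code.
final class WriteBufferWasteSketch {

    // Same formula as in the issue: the estimate is split evenly across partitions.
    static int writeBufferSize(long estimatedTableSize, int numPartitions) {
        return (int) (estimatedTableSize / numPartitions);
    }

    // Bytes reserved up front when every partition allocates one write buffer.
    static long preallocatedBytes(long estimatedTableSize, int numPartitions) {
        return (long) writeBufferSize(estimatedTableSize, numPartitions) * numPartitions;
    }

    // Bytes reserved but never filled when the real table is smaller than the estimate.
    static long unusedBytes(long estimatedTableSize, long actualTableSize, int numPartitions) {
        return Math.max(0L, preallocatedBytes(estimatedTableSize, numPartitions) - actualTableSize);
    }

    public static void main(String[] args) {
        long estimated = 1L << 30;   // assumed estimate: 1 GiB
        long actual = 256L << 20;    // assumed actual size: 256 MiB
        int numPartitions = 16;      // assumed partition count

        System.out.println("writeBufferSize   = " + writeBufferSize(estimated, numPartitions));
        System.out.println("preallocatedBytes = " + preallocatedBytes(estimated, numPartitions));
        System.out.println("unusedBytes       = " + unusedBytes(estimated, actual, numPartitions));
    }
}
```

With these assumed numbers, three quarters of the preallocated memory would stay unused.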

The suggested solution is to reduce writeBufferSize by a factor such that only 
X% of the memory is preallocated.
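
One possible shape for that change is sketched below. This is not the actual patch: the method name, the preallocFraction parameter (standing in for the X% above) and the use of minWbSize as a lower bound on the buffer size are all assumptions made for illustration.

```java
// Hypothetical sketch of the proposed scaling; not the committed Hive fix.
final class ScaledWriteBufferSketch {

    /**
     * Returns a write buffer size that preallocates only a fraction of each
     * partition's share of the estimated table size.
     *
     * @param preallocFraction assumed knob in (0, 1], e.g. 0.25 for "X = 25%"
     * @param minWbSize        assumed lower bound on the write buffer size
     */
    static int scaledWriteBufferSize(long estimatedTableSize, int numPartitions,
                                     double preallocFraction, int minWbSize) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        if (preallocFraction <= 0.0 || preallocFraction > 1.0) {
            throw new IllegalArgumentException("preallocFraction must be in (0, 1]");
        }
        long fullShare = estimatedTableSize / numPartitions;   // original formula
        long scaled = (long) (fullShare * preallocFraction);   // keep only X% up front
        // Clamp into [minWbSize, Integer.MAX_VALUE] before narrowing to int.
        return (int) Math.max(minWbSize, Math.min(scaled, Integer.MAX_VALUE));
    }

    public static void main(String[] args) {
        // Assumed inputs: 1 GiB estimate, 16 partitions, 25% preallocated, 512 KiB floor.
        System.out.println(scaledWriteBufferSize(1L << 30, 16, 0.25, 512 * 1024));
    }
}
```

In a scheme like this, partitions whose data outgrows the smaller initial buffer would have to allocate more buffers on demand, trading a few extra allocations for less memory reserved and left idle.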

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
