[ https://issues.apache.org/jira/browse/HIVE-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15254738#comment-15254738 ]
Wei Zheng commented on HIVE-12837:
----------------------------------
[~sershe] Could you please review?
> Better memory estimation/allocation for hybrid grace hash join during hash
> table loading
> ----------------------------------------------------------------------------------------
>
> Key: HIVE-12837
> URL: https://issues.apache.org/jira/browse/HIVE-12837
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 2.1.0
> Reporter: Wei Zheng
> Assignee: Wei Zheng
> Attachments: HIVE-12837.1.patch, HIVE-12837.2.patch,
> HIVE-12837.3.patch, HIVE-12837.4.patch
>
>
> This avoids an edge case in which very little memory is available (less
> than a single write buffer) when we start loading the hash table. Because
> the write buffer is lazily allocated, we can easily run out of memory before
> even checking whether any hash partition should be spilled.
> For example:
> Total memory available: 210 MB
> Size of ref array of BytesBytesMultiHashMap for each hash partition: ~16 MB
> Size of write buffer: 8 MB (lazy allocation)
> Number of hash partitions: 16
> Number of hash partitions created in memory: 13
> Number of hash partitions created on disk: 3
> Available memory left after HybridHashTableContainer initialization:
> 210 MB - 16 MB * 13 = 2 MB
> Now suppose a row is to be loaded into an in-memory hash partition. Loading
> it tries to allocate an 8 MB write buffer, but only 2 MB is left, so we hit
> an OOM.
> The solution is to perform the check for possible spilling earlier, so that
> we can spill partitions when memory is nearly full and avoid the OOM.
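
The arithmetic in the example above, and the idea of checking before the lazy write buffer allocation, can be sketched roughly as follows. This is only an illustration: the class and method names are hypothetical and are not taken from the attached patches or from HybridHashTableContainer, and sizes are in whole MB for simplicity.

```java
// Illustrative sketch only. All names here are hypothetical and do not
// come from the Hive code base or from the HIVE-12837 patches.
final class HybridJoinMemorySketch {

    private HybridJoinMemorySketch() {
    }

    /**
     * Memory (in MB) left once the ref arrays of the in-memory hash
     * partitions have been allocated.
     */
    static long memoryLeftMb(long totalMb, long refArrayMb, int inMemoryPartitions) {
        return totalMb - refArrayMb * inMemoryPartitions;
    }

    /**
     * Early check: returns true when the remaining memory cannot hold one
     * lazily allocated write buffer, i.e. a partition should be spilled
     * before the buffer is requested.
     */
    static boolean shouldSpillBeforeAllocating(long memoryLeftMb, long writeBufferMb) {
        return memoryLeftMb < writeBufferMb;
    }
}
```

With the numbers from the description, memoryLeftMb(210, 16, 13) gives 2, and shouldSpillBeforeAllocating(2, 8) is true, so a partition would be spilled before the 8 MB buffer is requested rather than after the allocation fails.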