[
https://issues.apache.org/jira/browse/HIVE-11587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15002726#comment-15002726
]
Sergey Shelukhin commented on HIVE-11587:
-----------------------------------------
Backported this one. HIVE-10793 has conflict in cherry-pick, can you backport?
> Fix memory estimates for mapjoin hashtable
> ------------------------------------------
>
> Key: HIVE-11587
> URL: https://issues.apache.org/jira/browse/HIVE-11587
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.2.0, 1.2.1
> Reporter: Sergey Shelukhin
> Assignee: Wei Zheng
> Labels: TODOC2.0
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11587.01.patch, HIVE-11587.02.patch,
> HIVE-11587.03.patch, HIVE-11587.04.patch, HIVE-11587.05.patch,
> HIVE-11587.06.patch, HIVE-11587.07.patch, HIVE-11587.08.patch
>
>
> Due to the legacy in in-memory mapjoin and conservative planning, the memory
> estimation code for mapjoin hashtable is currently not very good. It
> allocates the probe erring on the side of more memory, not taking data into
> account because unlike the probe, it's free to resize, so it's better for
> perf to allocate big probe and hope for the best with regard to future data
> size. It is not true for hybrid case.
> There's code to cap the initial allocation based on memory available
> (memUsage argument), but due to some code rot, the memory estimates from
> planning are not even passed to hashtable anymore (there used to be two
> config settings, hashjoin size fraction by itself, or hashjoin size fraction
> for group by case), so it never caps the memory anymore below 1 Gb.
> Initial capacity is estimated from input key count, and in hybrid join cache
> can exceed Java memory due to number of segments.
> There needs to be a review and fix of all this code.
> Suggested improvements:
> 1) Make sure "initialCapacity" argument from Hybrid case is correct given the
> number of segments. See how it's calculated from keys for regular case; it
> needs to be adjusted accordingly for hybrid case if not done already.
> 1.5) Note that, knowing the number of rows, the maximum capacity one will
> ever need for probe size (in longs) is row count (assuming key per row, i.e.
> maximum possible number of keys) divided by load factor, plus some very small
> number to round up. That is for flat case. For hybrid case it may be more
> complex due to skew, but that is still a good upper bound for the total probe
> capacity of all segments.
> 2) Rename memUsage to maxProbeSize, or something, make sure it's passed
> correctly based on estimates that take into account both probe and data size,
> esp. in hybrid case.
> 3) Make sure that memory estimation for hybrid case also doesn't come up with
> numbers that are too small, like 1-byte hashtable. I am not very familiar
> with that code but it has happened in the past.
> Other issues we have seen:
> 4) Cap single write buffer size to 8-16Mb. The whole point of WBs is that you
> should not allocate large array in advance. Even if some estimate passes
> 500Mb or 40Mb or whatever, it doesn't make sense to allocate that.
> 5) For hybrid, don't pre-allocate WBs - only allocate on write.
> 6) Change everywhere rounding up to power of two is used to rounding down, at
> least for hybrid case (?)
> I wanted to put all of these items in single JIRA so we could keep track of
> fixing all of them.
> I think there are JIRAs for some of these already, feel free to link them to
> this one.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)