[
https://issues.apache.org/jira/browse/HIVE-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900397#action_12900397
]
Ning Zhang commented on HIVE-1567:
----------------------------------
The hive.mapjoin.maxsize parameter is there not for speed but to limit memory
consumption. We saw OOM exceptions quite often before this parameter was
introduced. Rather than increasing it blindly, a better approach may be to
estimate how many rows can fit into memory based on the row size and the
available memory, and adjust this parameter automatically.
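The automatic adjustment suggested above could be sketched roughly as follows. This is not Hive code: the class name, the memory fraction, and the per-row overhead constant are all hypothetical assumptions for illustration; a real implementation would sample rows from the small table to measure the average row size.

```java
// Hypothetical sketch (not Hive's implementation): estimate how many
// small-table rows fit in the map-join hash table, given free heap and
// an average serialized row size.
public class MapJoinSizeEstimator {
    // Fraction of free heap the hash table may consume (assumed knob).
    static final double MEMORY_FRACTION = 0.5;
    // Rough per-row overhead for hash-table entries and object headers
    // (assumed constant).
    static final long PER_ROW_OVERHEAD_BYTES = 64;

    static long estimateMaxRows(long availableBytes, long avgRowSizeBytes) {
        long budget = (long) (availableBytes * MEMORY_FRACTION);
        return budget / (avgRowSizeBytes + PER_ROW_OVERHEAD_BYTES);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Heap still available = max heap minus what is already in use.
        long available = rt.maxMemory() - (rt.totalMemory() - rt.freeMemory());
        long avgRowSize = 200; // bytes; would come from sampling in practice
        System.out.println("Estimated max rows: "
            + estimateMaxRows(available, avgRowSize));
    }
}
```

With such an estimate, the effective maxsize would scale with the task's heap rather than being a fixed cluster-wide constant.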
> increase hive.mapjoin.maxsize to 10 million
> -------------------------------------------
>
> Key: HIVE-1567
> URL: https://issues.apache.org/jira/browse/HIVE-1567
> Project: Hadoop Hive
> Issue Type: Improvement
> Reporter: He Yongqiang
>
> I saw that on a very wide table, Hive can process 1 million rows in less
> than one minute (selecting all columns).
> Setting hive.mapjoin.maxsize to 100k is kind of too restrictive. Let's
> increase this to 10 million.