[
https://issues.apache.org/jira/browse/HIVE-28853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Seonggon Namgung updated HIVE-28853:
------------------------------------
Description: To prevent excessive MapJoin HashTable usage,
SharedWorkOptimizer checks the total InMemoryDataSize, including both merged
and unmerged MapJoins that belong to the same vertex, before proceeding to the
merge step. However, the current implementation skips this check when merging
only a pair of TS or a pair of TS-FIL. This omission can lead to performance
degradation due to loading large HashTables. (was: To prevent excessive
MapJoin HashTable usage, SharedWorkOptimizer checks the total InMemoryDataSize,
including both merged and unmerged MapJoins that belong to the same vertex,
before proceeding to the merge step. However, the current implementation skips this
check when merging only a pair of TS or a pair of TS-FIL. This omission can
lead to performance degradation due to loading large HashTables and may cause
MapJoinMemoryExhaustionError.)
> SharedWorkOptimizer does not consider MapJoin operators' InMemoryDataSize in
> certain code paths.
> ------------------------------------------------------------------------------------------------
>
> Key: HIVE-28853
> URL: https://issues.apache.org/jira/browse/HIVE-28853
> Project: Hive
> Issue Type: Bug
> Reporter: Seonggon Namgung
> Assignee: Seonggon Namgung
> Priority: Major
>
> To prevent excessive MapJoin HashTable usage, SharedWorkOptimizer checks the
> total InMemoryDataSize, including both merged and unmerged MapJoins that
> belong to the same vertex, before proceeding to the merge step. However, the
> current implementation skips this check when merging only a pair of TS or a
> pair of TS-FIL. This omission can lead to performance degradation due to
> loading large HashTables.
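
The check described above can be pictured with a minimal sketch. This is not
the SharedWorkOptimizer code: the MapJoin class, the fitsInMemory method, and
the maxDataSize budget are hypothetical names chosen for illustration, and the
sizes are made-up numbers. It only shows why the total must cover both the
merged and the unmerged MapJoins that end up in the same vertex.

```java
import java.util.Arrays;
import java.util.List;

class MapJoinMergeCheck {
    // Hypothetical stand-in for a MapJoin operator; only its size estimate matters here.
    static final class MapJoin {
        final String name;
        final long inMemoryDataSize;

        MapJoin(String name, long inMemoryDataSize) {
            this.name = name;
            this.inMemoryDataSize = inMemoryDataSize;
        }
    }

    // Returns true when the combined in-memory size of every MapJoin that would
    // share the vertex after the merge (merged and unmerged) fits in the budget.
    static boolean fitsInMemory(List<MapJoin> merged, List<MapJoin> unmerged, long maxDataSize) {
        long total = 0;
        for (MapJoin mj : merged) {
            total += mj.inMemoryDataSize;
        }
        for (MapJoin mj : unmerged) {
            total += mj.inMemoryDataSize;
        }
        return total <= maxDataSize;
    }

    public static void main(String[] args) {
        List<MapJoin> merged = Arrays.asList(new MapJoin("mj1", 40), new MapJoin("mj2", 30));
        List<MapJoin> unmerged = Arrays.asList(new MapJoin("mj3", 50));
        long maxDataSize = 100; // assumed budget, arbitrary units

        // Counting only the merged pair hides the third hash table in the vertex.
        System.out.println(fitsInMemory(merged, Arrays.<MapJoin>asList(), maxDataSize));
        // Counting every MapJoin in the vertex exposes the overflow.
        System.out.println(fitsInMemory(merged, unmerged, maxDataSize));
    }
}
```

Under these assumed numbers the merged pair alone stays within the budget,
while the full vertex does not. A code path that skips the check, as reported
for merges of a TS pair or a TS-FIL pair, would let such a merge through.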
--
This message was sent by Atlassian Jira
(v8.20.10#820010)