Sungwoo Park created HIVE-27375:
-----------------------------------
Summary: SharedWorkOptimizer assigns a common cache key to MapJoin
operators that should not share MapJoin tables
Key: HIVE-27375
URL: https://issues.apache.org/jira/browse/HIVE-27375
Project: Hive
Issue Type: Bug
Reporter: Sungwoo Park
When hive.optimize.shared.work.mapjoin.cache.reuse is set to true,
SharedWorkOptimizer sometimes assigns a common cache key to MapJoin operators
that should not share MapJoin tables. This bug occurs only for MapJoin
operators with 3 or more parent operators.
Example:
MAPJOIN[575] (RS_83, GBY_66, RS_85)
MAPJOIN[585] (RS_212, RS_213, GBY_210)
In this example, both MAPJOIN[575] and MAPJOIN[585] have three parent
operators. The current implementation assigns a common cache key to
MAPJOIN[575] and MAPJOIN[585] because RS_83 and RS_212 are equivalent.
However, MAPJOIN[575] uses GBY_66 for its big table whereas MAPJOIN[585] uses
GBY_210 for its big table. As a result, the MapJoin table loaded by one
operator cannot be used by the other.
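
The mismatch can be illustrated with a small standalone sketch. This is not
Hive code: the MapJoin class, the EQUIV map, and the naiveKey/strictKey
methods are all hypothetical names made up for illustration, and the sketch
assumes (as a simplification of the behaviour described above) that the
faulty key is derived from a single parent only. The stricter key is one
possible way to tell the two operators apart, not necessarily the fix that
SharedWorkOptimizer should adopt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class MapJoinCacheKeySketch {
    // Hypothetical model of a MapJoin operator: ordered parents plus the
    // index of the parent that feeds the big table.
    static final class MapJoin {
        final String name;
        final List<String> parents;
        final int bigTablePos;

        MapJoin(String name, List<String> parents, int bigTablePos) {
            this.name = name;
            this.parents = parents;
            this.bigTablePos = bigTablePos;
        }
    }

    // Assumption taken from the example: RS_83 and RS_212 are equivalent.
    // Nothing is assumed about the other parents.
    static final Map<String, String> EQUIV = Map.of("RS_83", "A", "RS_212", "A");

    static String equivClass(String op) {
        return EQUIV.getOrDefault(op, op);
    }

    // Naive key: looks at the first parent only, so two operators collide
    // as soon as their first parents are equivalent.
    static String naiveKey(MapJoin mj) {
        return equivClass(mj.parents.get(0));
    }

    // Stricter key: records the big-table position and every small-table
    // parent together with its position.
    static String strictKey(MapJoin mj) {
        List<String> parts = new ArrayList<>();
        parts.add("big=" + mj.bigTablePos);
        for (int i = 0; i < mj.parents.size(); i++) {
            if (i != mj.bigTablePos) {
                parts.add(i + ":" + equivClass(mj.parents.get(i)));
            }
        }
        return String.join("|", parts);
    }

    public static void main(String[] args) {
        MapJoin mj575 = new MapJoin("MAPJOIN[575]", List.of("RS_83", "GBY_66", "RS_85"), 1);
        MapJoin mj585 = new MapJoin("MAPJOIN[585]", List.of("RS_212", "RS_213", "GBY_210"), 2);

        // Naive keys collide even though the big tables differ.
        System.out.println(naiveKey(mj575).equals(naiveKey(mj585)));   // true
        // Stricter keys differ, so the MapJoin tables are not shared.
        System.out.println(strictKey(mj575).equals(strictKey(mj585))); // false
    }
}
```

With the stricter key, two MapJoin operators share a cached table only when
they agree on the big-table position and on every small-table input, which
is the condition the example above violates.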