Stamatis Zampetakis created HIVE-28816:
------------------------------------------
Summary: hive.optimize.cte.materialize.threshold behavior is
inconsistent with its description
Key: HIVE-28816
URL: https://issues.apache.org/jira/browse/HIVE-28816
Project: Hive
Issue Type: Bug
Affects Versions: 4.0.1
Reporter: Stamatis Zampetakis
Assignee: Stamatis Zampetakis
The description of the {{hive.optimize.cte.materialize.threshold}} property is
shown below.
{code:java}
If the number of references to a CTE clause exceeds this threshold, Hive will
materialize it before executing the main query block.
{code}
However, in some cases Hive will materialize a CTE even when the number of
references to the CTE clause is exactly *equal* to the threshold.
The relevant snippet from
[SemanticAnalyzer.java|https://github.com/apache/hive/blob/38dc6b61a695a799c447d6f90503742dcc22e7bb/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L2274]
is shown below.
{code:java}
if (threshold >= 0 && cte.reference >= threshold) {
  cte.materialize = !HiveConf.getBoolVar(conf,
      ConfVars.HIVE_CTE_MATERIALIZE_FULL_AGGREGATE_ONLY)
      || cte.qbExpr.getQB().getParseInfo().isFullyAggregate();
}
{code}
In most cases, "exceeds a threshold" denotes strict inequality (i.e., greater
than the threshold), whereas the code uses {{>=}}. The description of the
property is therefore not consistent with the behavior of Hive.
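The difference can be illustrated with a minimal, standalone sketch. The class and method names below ({{CteThresholdSketch}}, {{reachesThresholdCurrent}}, {{exceedsThresholdDocumented}}) are hypothetical and are not part of Hive. The sketch models only the outer threshold guard and ignores the {{HIVE_CTE_MATERIALIZE_FULL_AGGREGATE_ONLY}} check inside it.

```java
// Hypothetical illustration only; none of these names exist in Hive.
class CteThresholdSketch {

    // Mirrors the guard in the quoted snippet: references >= threshold.
    static boolean reachesThresholdCurrent(int threshold, int references) {
        return threshold >= 0 && references >= threshold;
    }

    // What the property description suggests: references > threshold.
    static boolean exceedsThresholdDocumented(int threshold, int references) {
        return threshold >= 0 && references > threshold;
    }

    public static void main(String[] args) {
        int threshold = 2; // assumed value, chosen only for the example
        for (int refs = 1; refs <= 3; refs++) {
            System.out.printf("refs=%d current=%b documented=%b%n",
                    refs,
                    reachesThresholdCurrent(threshold, refs),
                    exceedsThresholdDocumented(threshold, refs));
        }
    }
}
```

With a threshold of 2, the two predicates disagree only when the CTE is referenced exactly twice: the current guard passes, while the strict reading of the description does not.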