Stamatis Zampetakis created HIVE-28816:
------------------------------------------

             Summary: hive.optimize.cte.materialize.threshold behavior is 
inconsistent with its description
                 Key: HIVE-28816
                 URL: https://issues.apache.org/jira/browse/HIVE-28816
             Project: Hive
          Issue Type: Bug
    Affects Versions: 4.0.1
            Reporter: Stamatis Zampetakis
            Assignee: Stamatis Zampetakis


The description of the {{hive.optimize.cte.materialize.threshold}} property is 
shown below.

{code:java}
If the number of references to a CTE clause exceeds this threshold, Hive will 
materialize it before executing the main query block.
{code}

However, in some cases Hive will materialize a CTE even when the number of 
references to the CTE clause is exactly *equal* to the threshold.

The relevant snippet from 
[SemanticAnalyzer.java|https://github.com/apache/hive/blob/38dc6b61a695a799c447d6f90503742dcc22e7bb/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L2274]
 is shown below.

{code:java}
        if (threshold >= 0 && cte.reference >= threshold) {
          cte.materialize = !HiveConf.getBoolVar(conf, ConfVars.HIVE_CTE_MATERIALIZE_FULL_AGGREGATE_ONLY)
              || cte.qbExpr.getQB().getParseInfo().isFullyAggregate();
        }
{code}

The phrase "exceeds a threshold" normally implies strict inequality 
(i.e., greater than the threshold), whereas the code compares with {{>=}}, so 
the description of the property is inconsistent with the behavior of Hive.
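
To make the boundary case concrete, here is a minimal standalone sketch that 
contrasts the two comparisons. It is not Hive code: the class and method names 
({{CteThresholdSketch}}, {{triggersInclusive}}, {{triggersStrict}}) are 
hypothetical, and the additional check on 
{{HIVE_CTE_MATERIALIZE_FULL_AGGREGATE_ONLY}} from the snippet above is 
deliberately left out so that only the threshold comparison is shown.

```java
public class CteThresholdSketch {

    // Mirrors the comparison in the quoted snippet: the threshold check
    // passes when the reference count is greater than OR EQUAL TO the threshold.
    // (Hypothetical helper, not part of Hive.)
    static boolean triggersInclusive(int references, int threshold) {
        return threshold >= 0 && references >= threshold;
    }

    // What the property description suggests ("exceeds"): the check passes
    // only when the reference count is strictly greater than the threshold.
    // (Hypothetical helper, not part of Hive.)
    static boolean triggersStrict(int references, int threshold) {
        return threshold >= 0 && references > threshold;
    }

    public static void main(String[] args) {
        int threshold = 2; // assumed example value
        for (int refs = 1; refs <= 3; refs++) {
            System.out.printf("refs=%d inclusive=%b strict=%b%n",
                    refs,
                    triggersInclusive(refs, threshold),
                    triggersStrict(refs, threshold));
        }
    }
}
```

With a threshold of 2, the two variants disagree only for {{refs=2}}: the 
inclusive check passes while the strict one does not. That is exactly the case 
where the current behavior and the description diverge.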



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
