Xingbo Jiang created SPARK-57646:
------------------------------------
Summary: Bound alias-projection candidate enumeration to prevent
planner hangs
Key: SPARK-57646
URL: https://issues.apache.org/jira/browse/SPARK-57646
Project: Spark
Issue Type: Improvement
Components: Optimizer
Affects Versions: 4.1.2
Reporter: Xingbo Jiang
Assignee: Xingbo Jiang
`AliasAwareOutputExpression.projectExpression` enumerates all alias rewrites of
an expression via `multiTransformDown`, which amounts to a cartesian product of
expressions and aliases. Although
`spark.sql.optimizer.expressionProjectionCandidateLimit` limits the number of
distinct output candidates to 100 by default, the number of distinct candidates
can stay below that limit while the enumeration visits far more rewrites than
that, most of them duplicates. As a result, the planner can get stuck in this
function for multiple hours.
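
The blow-up described above can be illustrated with a small toy model. This is only a sketch and not Spark code: the expression is assumed to be a commutative sum of leaf attributes, the canonical form is assumed to be a sorted tuple, and the names `enumerate_candidates`, `project`, `distinct_limit`, and `enumeration_limit` are hypothetical. It shows how a limit on distinct candidates alone never triggers when most rewrites are duplicates, and how an additional cap on visited rewrites bounds the work.

```python
from itertools import islice, product


def enumerate_candidates(leaves, aliases):
    """Lazily yield every alias rewrite of a commutative sum of `leaves`.

    Each leaf is replaced by one of its aliases; the result is canonicalized
    by sorting, so different rewrites can collapse to the same candidate.
    """
    choices = [aliases.get(leaf, [leaf]) for leaf in leaves]
    for combo in product(*choices):  # cartesian product of per-leaf choices
        yield tuple(sorted(combo))


def project(leaves, aliases, distinct_limit=100, enumeration_limit=None):
    """Collect distinct candidates; return them with the number of rewrites visited.

    `distinct_limit` mimics a cap on distinct outputs only.
    `enumeration_limit` (hypothetical) also caps how many rewrites are visited.
    """
    candidates = enumerate_candidates(leaves, aliases)
    if enumeration_limit is not None:
        candidates = islice(candidates, enumeration_limit)
    seen = set()
    visited = 0
    for candidate in candidates:
        visited += 1
        seen.add(candidate)
        if len(seen) >= distinct_limit:
            break
    return seen, visited


# 16 occurrences of attribute "a", which has two aliases: 2**16 rewrites,
# but only 17 distinct canonical sums, so the distinct limit of 100 never fires.
aliases = {"a": ["x", "y"]}
leaves = ["a"] * 16

distinct, visited = project(leaves, aliases)
bounded, bounded_visited = project(leaves, aliases, enumeration_limit=1000)
print(len(distinct), visited)
print(len(bounded), bounded_visited)
```

In this model, capping the enumeration trades completeness for a bounded planning cost: the capped run may miss some distinct candidates, which is acceptable only if the caller can fall back to a conservative answer.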