Xingbo Jiang created SPARK-57646:
------------------------------------

             Summary: Bound alias-projection candidate enumeration to prevent 
planner hangs
                 Key: SPARK-57646
                 URL: https://issues.apache.org/jira/browse/SPARK-57646
             Project: Spark
          Issue Type: Improvement
          Components: Optimizer
    Affects Versions: 4.1.2
            Reporter: Xingbo Jiang
            Assignee: Xingbo Jiang


`AliasAwareOutputExpression.projectExpression` enumerates all alias rewrites 
of an expression via `multiTransformDown`, which generates the cartesian 
product of the alias alternatives across the expression. Although 
`spark.sql.optimizer.expressionProjectionCandidateLimit` caps the number of 
distinct output candidates at 100 by default, the number of distinct 
candidates can stay below that limit while the number of enumerated rewrites 
is far larger, with most of them duplicates. In that case the limit never 
stops the enumeration, and the planner gets stuck in this function for 
multiple hours.
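
To illustrate the failure mode, here is a toy Python model. It is not Spark 
code. The names `enumerate_rewrites`, `project`, `distinct_limit` and 
`enumeration_limit` are hypothetical, and the duplicates are forced by 
listing the same alias target twice, which may not be how they arise in the 
real planner. It only shows that a cap on distinct results does not bound the 
work when most enumerated rewrites are duplicates, and that a second cap on 
the enumeration itself does.

```python
from itertools import islice, product


def enumerate_rewrites(leaves, aliases):
    """Lazily yield every rewrite: the cartesian product of each leaf's alternatives."""
    return product(*(aliases.get(leaf, [leaf]) for leaf in leaves))


def project(leaves, aliases, distinct_limit, enumeration_limit=None):
    """Return (distinct candidates in first-seen order, number of rewrites enumerated).

    distinct_limit stands in for the existing cap on distinct candidates.
    enumeration_limit is a hypothetical extra cap on how many rewrites are visited.
    """
    candidates = enumerate_rewrites(leaves, aliases)
    if enumeration_limit is not None:
        candidates = islice(candidates, enumeration_limit)

    seen = set()
    distinct = []
    enumerated = 0
    for candidate in candidates:
        enumerated += 1
        if candidate not in seen:
            seen.add(candidate)
            distinct.append(candidate)
            if len(distinct) >= distinct_limit:
                break
    return distinct, enumerated


# Six leaves, each with three alternatives of which only two are distinct:
# 3**6 = 729 rewrites are enumerated, but only 2**6 = 64 are distinct,
# so a distinct limit of 100 is never reached.
leaves = ["a"] * 6
aliases = {"a": ["x", "y", "x"]}

unbounded, unbounded_visits = project(leaves, aliases, distinct_limit=100)
bounded, bounded_visits = project(
    leaves, aliases, distinct_limit=100, enumeration_limit=200
)
```

In this toy model the enumerated count grows as 3^n while the distinct count 
grows as 2^n, so the gap widens quickly with more leaves. The trade-off of 
bounding the enumeration is that some distinct candidates may be missed.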



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

