[
https://issues.apache.org/jira/browse/SPARK-25650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16674954#comment-16674954
]
Marco Gaido commented on SPARK-25650:
-------------------------------------
[~maryannxue] since all the subtasks are completed, shall we close this?
> Make analyzer rules used in once-policy idempotent
> --------------------------------------------------
>
> Key: SPARK-25650
> URL: https://issues.apache.org/jira/browse/SPARK-25650
> Project: Spark
> Issue Type: Task
> Components: SQL
> Affects Versions: 2.3.2
> Reporter: Maryann Xue
> Priority: Major
>
> Rules like {{HandleNullInputsForUDF}}
> (https://issues.apache.org/jira/browse/SPARK-24891) do not stabilize (they
> can keep applying new changes to a plan indefinitely) and can cause problems
> such as SQL cache mismatches.
> Ideally, all rules, whether in a once-policy batch or a fixed-point-policy
> batch, should stabilize after the number of runs specified. Once-policy should
> be considered a performance improvement: an assumption that the rule can
> stabilize after just one run, rather than an assumption that the rule won't be
> applied more than once. Those once-policy rules should be able to run fine
> under a fixed-point policy as well.
> Currently we already have a check for fixed-point policy that throws an
> exception if the maximum number of runs is reached and the plan is still
> changing. Here, in this PR, a similar check is added for once-policy; it
> throws an exception if the plan changes between the first run and the second
> run of a once-policy rule.
> To reproduce this issue, go to [https://github.com/apache/spark/pull/22060],
> apply the changes and remove the specific rule from the whitelist
> https://github.com/apache/spark/pull/22060/files#diff-f70523b948b7af21abddfa3ab7e1d7d6R71.
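
The once-policy check described in the quoted issue can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Spark's actual RuleExecutor code: the plan is modelled as a tuple of expression strings, and every name here (run_once, wrap_null_check, wrap_null_check_once, NonIdempotentRuleError, the whitelist parameter) is hypothetical and chosen for illustration.

```python
class NonIdempotentRuleError(Exception):
    """Raised when a once-policy rule changes the plan on its second run."""


def wrap_null_check(plan):
    # Hypothetical non-idempotent rule: wraps every expression again on each
    # run, so the plan never stabilizes.
    return tuple(f"if_not_null({expr})" for expr in plan)


def wrap_null_check_once(plan):
    # Hypothetical idempotent variant: leaves already-wrapped expressions alone.
    return tuple(
        expr if expr.startswith("if_not_null(") else f"if_not_null({expr})"
        for expr in plan
    )


def run_once(rule, plan, whitelist=frozenset()):
    """Apply a once-policy rule, then verify that a second run is a no-op.

    Rules named in `whitelist` are exempt from the check (an assumption that
    mirrors the whitelist mentioned in the issue description).
    """
    first = rule(plan)
    if rule.__name__ not in whitelist and rule(first) != first:
        raise NonIdempotentRuleError(
            f"Once-policy rule {rule.__name__} changed the plan on a second run"
        )
    return first


if __name__ == "__main__":
    plan = ("udf(a)",)
    print(run_once(wrap_null_check_once, plan))
    try:
        run_once(wrap_null_check, plan)
    except NonIdempotentRuleError as err:
        print(err)
```

In this sketch the check costs one extra application of each rule; exempting a rule through the whitelist skips the check, which is why removing a rule from the whitelist makes a non-idempotent rule fail.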