[
https://issues.apache.org/jira/browse/SPARK-24865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wenchen Fan resolved SPARK-24865.
---------------------------------
Resolution: Fixed
Fix Version/s: 2.4.0
Issue resolved by pull request 21822
[https://github.com/apache/spark/pull/21822]
> Remove AnalysisBarrier
> ----------------------
>
> Key: SPARK-24865
> URL: https://issues.apache.org/jira/browse/SPARK-24865
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.3.0, 2.3.1
> Reporter: Reynold Xin
> Assignee: Reynold Xin
> Priority: Major
> Fix For: 2.4.0
>
>
> AnalysisBarrier was introduced in SPARK-20392 to improve analysis speed
> (don't re-analyze nodes that have already been analyzed).
> Before AnalysisBarrier, we already had some infrastructure in place, with
> analysis specific functions (resolveOperators and resolveExpressions). These
> functions do not recursively traverse down subplans that are already analyzed
> (tracked with a mutable boolean flag, _analyzed). The issue with the old
> system was that developers started using transformDown, which does a top-down
> traversal of the plan tree, because there was no top-down resolution function,
> and as a result analyzer performance became pretty bad.
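
To make the difference concrete, here is a minimal sketch on a toy plan tree. It is not Spark code: `Node`, `transform_down`, `resolve_operators`, and `count_visits` are hypothetical stand-ins written for illustration, and the `analyzed` field only imitates the `_analyzed` flag described above. It assumes a three-node plan whose lower two nodes are already analyzed.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """Toy plan node; `analyzed` imitates the mutable _analyzed flag."""
    name: str
    children: List["Node"] = field(default_factory=list)
    analyzed: bool = False


def transform_down(node: Node, rule: Callable[[Node], Node]) -> Node:
    """Top-down traversal that ignores the analyzed flag entirely."""
    new = rule(node)
    new.children = [transform_down(c, rule) for c in new.children]
    return new


def resolve_operators(node: Node, rule: Callable[[Node], Node]) -> Node:
    """Bottom-up traversal that does not descend into analyzed subplans."""
    if node.analyzed:
        return node
    node.children = [resolve_operators(c, rule) for c in node.children]
    return rule(node)


def make_plan() -> Node:
    # Project -> Filter -> Relation; the Filter subtree is already analyzed.
    relation = Node("Relation", analyzed=True)
    filt = Node("Filter", [relation], analyzed=True)
    return Node("Project", [filt])


def count_visits(traversal) -> int:
    """Run an identity rule through `traversal` and count rule applications."""
    visited = []

    def rule(n: Node) -> Node:
        visited.append(n.name)
        return n

    traversal(make_plan(), rule)
    return len(visited)


visits_transform_down = count_visits(transform_down)
visits_resolve_operators = count_visits(resolve_operators)
print(visits_transform_down, visits_resolve_operators)  # prints "3 1"
```

In this toy setup the flag-aware traversal applies the rule once, while the plain top-down traversal applies it to every node, including the two that were already analyzed.
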
> In order to fix the issue in SPARK-20392, AnalysisBarrier was introduced as a
> special node and for this special node, transform/transformUp/transformDown
> don't traverse down. However, the introduction of this special node caused
> far more trouble than it solved. This implicit node breaks assumptions and
> code in a few places, and it's hard to know when an analysis barrier would
> exist and when it wouldn't. Even a simple search for AnalysisBarrier in PR
> discussions demonstrates that it is a source of bugs and additional complexity.
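
A small toy model may help show both halves of this: the barrier stops traversal, but it also changes the shape of the tree that other code sees. Everything below is hypothetical (the `Node` class, the `BARRIER` marker, and the `is_project_over_filter` check are invented for illustration and are not Spark APIs); it only assumes that some downstream code pattern-matches on parent/child shape.

```python
from dataclasses import dataclass, field
from typing import Callable, List

BARRIER = "AnalysisBarrier"  # toy stand-in for the special wrapper node


@dataclass
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)


def transform_down(node: Node, rule: Callable[[Node], Node]) -> Node:
    """Top-down traversal that stops at a barrier node."""
    new = rule(node)
    if new.name == BARRIER:
        return new  # the barrier hides its (already analyzed) child
    new.children = [transform_down(c, rule) for c in new.children]
    return new


def is_project_over_filter(plan: Node) -> bool:
    """Hypothetical downstream check that matches on plan shape."""
    return (
        plan.name == "Project"
        and len(plan.children) == 1
        and plan.children[0].name == "Filter"
    )


analyzed = Node("Filter", [Node("Relation")])
plain = Node("Project", [analyzed])
wrapped = Node("Project", [Node(BARRIER, [analyzed])])

visited = []


def record(n: Node) -> Node:
    visited.append(n.name)
    return n


transform_down(wrapped, record)
print(visited)                          # ['Project', 'AnalysisBarrier']
print(is_project_over_filter(plain))    # True
print(is_project_over_filter(wrapped))  # False
```

The traversal saving is real, but the shape check silently stops matching once the wrapper is present, which is the kind of broken assumption the description refers to.
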
> Instead, I think a much simpler fix to the original issue is to introduce
> resolveOperatorsDown, and change all places that call transformDown in the
> analyzer to use that. We can also ban accidental uses of the various
> transform* methods by using a linter (which can only lint specific packages),
> or, in test mode, by inspecting the stack trace and failing explicitly if a
> transform* method is called in the analyzer.
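
The proposal above can be sketched on a toy tree as follows. This is an illustration under stated assumptions, not the change made in the pull request: `Node`, `resolve_operators_down`, `in_analyzer`, and `TEST_MODE` are hypothetical names, and for brevity the guard uses a thread-local depth counter rather than the stack-trace inspection mentioned in the description.

```python
import threading
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Callable, List

TEST_MODE = True  # hypothetical switch; the guard is only active in tests
_state = threading.local()


@dataclass
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)
    analyzed: bool = False


@contextmanager
def in_analyzer():
    """Mark the dynamic extent of an analyzer rule (hypothetical helper)."""
    depth = getattr(_state, "depth", 0)
    _state.depth = depth + 1
    try:
        yield
    finally:
        _state.depth = depth


def transform_down(node: Node, rule: Callable[[Node], Node]) -> Node:
    """Plain top-down traversal; fails loudly if used inside the analyzer."""
    if TEST_MODE and getattr(_state, "depth", 0) > 0:
        raise RuntimeError(
            "transform_down called in the analyzer; use resolve_operators_down"
        )
    new = rule(node)
    new.children = [transform_down(c, rule) for c in new.children]
    return new


def resolve_operators_down(node: Node, rule: Callable[[Node], Node]) -> Node:
    """Top-down traversal that skips subplans already marked analyzed."""
    if node.analyzed:
        return node
    new = rule(node)
    new.children = [resolve_operators_down(c, rule) for c in new.children]
    return new


plan = Node("Project", [Node("Filter", [Node("Relation")], analyzed=True)])
seen = []


def record(n: Node) -> Node:
    seen.append(n.name)
    return n


with in_analyzer():
    resolve_operators_down(plan, record)
    try:
        transform_down(plan, record)
        guard_fired = False
    except RuntimeError:
        guard_fired = True

print(seen, guard_fired)  # ['Project'] True
```

No wrapper node is needed here: the skip logic lives in the traversal function, and the guard turns an accidental `transform_down` call inside an analyzer rule into an explicit test failure.
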