cloud-fan commented on a change in pull request #25052: [SPARK-28250][SQL] QueryPlan#references should exclude producedAttributes
URL: https://github.com/apache/spark/pull/25052#discussion_r300444702
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala
##########
@@ -59,12 +51,21 @@ abstract class QueryPlan[PlanType <: QueryPlan[PlanType]] extends TreeNode[PlanT
*/
def producedAttributes: AttributeSet = AttributeSet.empty
+ /**
+ * All Attributes that appear in expressions from this operator. Note that this set does not
+ * include attributes that are implicitly referenced by being passed through to the output tuple.
+ */
+ @transient
+ lazy val references: AttributeSet = {
+ AttributeSet.fromAttributeSets(expressions.map(_.references)) -- producedAttributes
+ }
+
/**
* Attributes that are referenced by expressions but not provided by this node's children.
* Subclasses should override this method if they produce attributes internally as it is used by
Review comment:
Overriding this method seems like cheating. We rely on this method to tell
whether the plan is valid, so overriding it means we skip that validation.
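
To make the concern concrete, here is a minimal sketch, not Spark's actual code. It models attributes as plain strings, and `Node`, `exprRefs`, `produced`, and `inputSet` are hypothetical stand-ins. `references` mirrors the diff above (expression references minus produced attributes). The `missingInput` formula is an assumption derived from the doc comment "referenced by expressions but not provided by this node's children".

```scala
// Simplified stand-ins, NOT Spark's real classes: attributes are plain strings.
object ReferencesSketch {
  type Attr = String

  // Hypothetical miniature of a query plan node.
  final case class Node(
      name: String,
      exprRefs: Seq[Set[Attr]],        // attributes referenced by each expression
      produced: Set[Attr],             // attributes the node creates internally
      children: Seq[Node] = Nil,
      output: Set[Attr] = Set.empty) {

    // Mirrors the diff: union of expression references minus produced attributes.
    lazy val references: Set[Attr] = exprRefs.flatten.toSet -- produced

    // Attributes supplied by the children's output.
    def inputSet: Set[Attr] = children.flatMap(_.output).toSet

    // Assumed validation: referenced attributes that no child provides.
    // An empty result means the node looks valid.
    def missingInput: Set[Attr] = references -- inputSet
  }

  def main(args: Array[String]): Unit = {
    val scan = Node("scan", Nil, Set.empty, output = Set("a", "b"))

    // A node that references "a" and its own produced attribute "g".
    val gen = Node("generate", Seq(Set("a", "g")), produced = Set("g"),
      children = Seq(scan), output = Set("a", "b", "g"))

    // A node that references "z", which no child provides.
    val bad = Node("project", Seq(Set("a", "z")), Set.empty, children = Seq(scan))

    println(gen.references)
    println(gen.missingInput)
    println(bad.missingInput)
  }
}
```

In this model, the produced attribute `g` is already excluded from `references`, so the validation passes without any subclass overriding `missingInput`, while a genuinely unresolved attribute such as `z` is still reported.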