cloud-fan commented on a change in pull request #25052: [SPARK-28250][SQL] QueryPlan#references should exclude producedAttributes
URL: https://github.com/apache/spark/pull/25052#discussion_r300444702
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala
 ##########
 @@ -59,12 +51,21 @@ abstract class QueryPlan[PlanType <: QueryPlan[PlanType]] extends TreeNode[PlanT
    */
   def producedAttributes: AttributeSet = AttributeSet.empty
 
+  /**
+   * All Attributes that appear in expressions from this operator.  Note that this set does not
+   * include attributes that are implicitly referenced by being passed through to the output tuple.
+   */
+  @transient
+  lazy val references: AttributeSet = {
+    AttributeSet.fromAttributeSets(expressions.map(_.references)) -- producedAttributes
+  }
+
   /**
    * Attributes that are referenced by expressions but not provided by this node's children.
    * Subclasses should override this method if they produce attributes internally as it is used by
 
 Review comment:
   Overriding this method seems like cheating. We rely on this method to tell 
whether the plan is valid, so overriding it means we skip that validation.
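
   To illustrate the concern, here is a minimal sketch, not the actual Catalyst code. It models attribute sets as plain `Set[String]`, and `Node`, `exprRefs`, `produced`, `input`, and `isValid` are hypothetical names introduced only for this example. It assumes the method under discussion is the one that reports referenced attributes not provided by the children. If `references` already excludes produced attributes, as in the diff above, that check can stay generic and no subclass needs to override it.

```scala
// Simplified model: attributes are plain strings, not Catalyst Attribute objects.
object ReferencesSketch {

  // Hypothetical stand-in for a plan node; this is not the real QueryPlan API.
  final case class Node(
      exprRefs: Seq[Set[String]], // attributes referenced by each expression
      produced: Set[String],      // attributes the node generates internally
      input: Set[String]) {       // attributes supplied by the node's children

    // Mirrors the diff: union of all expression references, minus produced attributes.
    lazy val references: Set[String] = exprRefs.flatten.toSet -- produced

    // Referenced attributes that no child provides.
    def missingInput: Set[String] = references -- input

    // Assumed validity rule for this sketch: nothing referenced is missing.
    def isValid: Boolean = missingInput.isEmpty
  }

  def main(args: Array[String]): Unit = {
    // A node that references its own generated attribute "gen" plus child attribute "a".
    val generating = Node(Seq(Set("a", "gen")), produced = Set("gen"), input = Set("a"))
    println(s"references = ${generating.references}, valid = ${generating.isValid}")

    // A node that references "b", which neither it nor its children provide.
    val broken = Node(Seq(Set("a", "b")), produced = Set.empty, input = Set("a"))
    println(s"missingInput = ${broken.missingInput}, valid = ${broken.isValid}")
  }
}
```

   In this sketch the generating node passes the check without any override, while the genuinely broken node is still caught. An override that returned an empty set would hide the second case as well.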
