bersprockets commented on a change in pull request #23392: [SPARK-26450][SQL] 
Avoid rebuilding map of schema for every column in projection
URL: https://github.com/apache/spark/pull/23392#discussion_r244423562
 
 

 ##########
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala
 ##########
 @@ -316,8 +316,10 @@ object GenerateUnsafeProjection extends 
CodeGenerator[Seq[Expression], UnsafePro
   protected def canonicalize(in: Seq[Expression]): Seq[Expression] =
     in.map(ExpressionCanonicalizer.execute)
 
-  protected def bind(in: Seq[Expression], inputSchema: Seq[Attribute]): 
Seq[Expression] =
-    in.map(BindReferences.bindReference(_, inputSchema))
+  protected def bind(in: Seq[Expression], inputSchema: Seq[Attribute]): 
Seq[Expression] = {
+    lazy val inputSchemaAttrSeq: AttributeSeq = inputSchema
 
 Review comment:
  Yes, that is the reason. For example, the query <code>df.count</code>, where 
df is a dataframe from a CSV datasource, calls GenerateUnsafeProjection.bind 
with an empty list of expressions.
  
  However, the map inside the AttributeSeq object is not built until someone 
accesses exprIdToOrdinal, so making the val lazy may be overkill.
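  To illustrate the pattern being discussed, here is a minimal standalone sketch. The `Attr`/`AttrSeq`/`Demo` names are illustrative stand-ins, not Spark's actual `AttributeSeq`/`BindReferences` classes: a single lookup structure is built per bind call and shared across all expressions, and the `lazy val` means the map is only ever constructed if an ordinal lookup actually happens, so binding an empty expression list pays nothing.

```scala
// Illustrative stand-ins for Spark's AttributeSeq / BindReferences
// (hypothetical names, not the real catalyst classes).
final case class Attr(exprId: Long, name: String)

final class AttrSeq(attrs: Seq[Attr]) {
  // Built at most once, and only on first access -- so binding an
  // empty list of expressions never pays the map-build cost.
  lazy val exprIdToOrdinal: Map[Long, Int] =
    attrs.map(_.exprId).zipWithIndex.toMap

  def indexOf(exprId: Long): Int =
    exprIdToOrdinal.getOrElse(exprId, -1)
}

object Demo {
  // One AttrSeq shared by every binding, instead of re-deriving the
  // schema lookup once per expression.
  def bindAll(exprIds: Seq[Long], schema: Seq[Attr]): Seq[Int] = {
    val attrSeq = new AttrSeq(schema)
    exprIds.map(attrSeq.indexOf)
  }
}
```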

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
