Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/20831#discussion_r176006590
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala
---
@@ -169,7 +174,10 @@ case class InMemoryTableScanExec(
override def outputOrdering: Seq[SortOrder] =
relation.child.outputOrdering.map(updateAttribute(_).asInstanceOf[SortOrder])
- private def statsFor(a: Attribute) =
relation.partitionStatistics.forAttribute(a)
+ // When we make canonicalized plan, we can't find a normalized attribute in this map.
+ // We return a `ColumnStatisticsSchema` for normalized attribute in this case.
--- End diff --
I don't think it is worth removing `@transient` from `relation` and
`InMemoryRelation.partitionStatistics` just for this, so I left it as is.
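
For readers following along, the fallback described in the diff comment can be sketched roughly as below. This is only an illustration under assumptions: `Attr`, `StatsSchema`, and `StatsLookupSketch` are hypothetical stand-ins, not Spark's `Attribute` or `ColumnStatisticsSchema`, and the sketch assumes that a normalized attribute simply no longer equals any key in the map.

```scala
// Hypothetical stand-ins; these are not Spark's Attribute or ColumnStatisticsSchema.
final case class Attr(name: String, exprId: Long)
final case class StatsSchema(attr: Attr)

object StatsLookupSketch {
  // Statistics schemas keyed by the original (non-normalized) attributes.
  val known: Map[Attr, StatsSchema] =
    Seq(Attr("id", 42L), Attr("value", 43L)).map(a => a -> StatsSchema(a)).toMap

  // Return the stored schema when the attribute is a key in the map;
  // otherwise build a fresh schema for the attribute that was passed in.
  def statsFor(a: Attr): StatsSchema = known.getOrElse(a, StatsSchema(a))

  def main(args: Array[String]): Unit = {
    // Assumption: normalization changes the identifiers, so the lookup misses.
    val normalized = Attr("none", 0L)
    println(known.contains(normalized))
    println(statsFor(normalized))
  }
}
```

With a fallback of this kind, the lookup never fails for a normalized attribute, so the map itself does not need to change.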
---