cloud-fan commented on a change in pull request #24623: [SPARK-27739][SQL] df.persist should save stats from optimized plan
URL: https://github.com/apache/spark/pull/24623#discussion_r313208622
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala
##########
@@ -146,17 +146,17 @@ object InMemoryRelation {
       storageLevel: StorageLevel,
       child: SparkPlan,
       tableName: Option[String],
-      logicalPlan: LogicalPlan): InMemoryRelation = {
+      optimizedPlan: LogicalPlan): InMemoryRelation = {
     val cacheBuilder = CachedRDDBuilder(useCompression, batchSize, storageLevel, child, tableName)
-    val relation = new InMemoryRelation(child.output, cacheBuilder, logicalPlan.outputOrdering)
-    relation.statsOfPlanToCache = logicalPlan.stats
+    val relation = new InMemoryRelation(child.output, cacheBuilder, optimizedPlan.outputOrdering)
+    relation.statsOfPlanToCache = optimizedPlan.stats
     relation
   }
-  def apply(cacheBuilder: CachedRDDBuilder, logicalPlan: LogicalPlan): InMemoryRelation = {
+  def apply(cacheBuilder: CachedRDDBuilder, optimizedPlan: LogicalPlan): InMemoryRelation = {
     val relation = new InMemoryRelation(
-      cacheBuilder.cachedPlan.output, cacheBuilder, logicalPlan.outputOrdering)
-    relation.statsOfPlanToCache = logicalPlan.stats
+      cacheBuilder.cachedPlan.output, cacheBuilder, optimizedPlan.outputOrdering)
+    relation.statsOfPlanToCache = optimizedPlan.stats
Review comment:
Ah, that's a good point, and I agree with it. @jzhuge, can we write a test using
file source partition pruning? I'm not comfortable merging a fix without tests.
The change itself LGTM.