sunchao commented on a change in pull request #34642:
URL: https://github.com/apache/spark/pull/34642#discussion_r766214720



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala
##########
@@ -71,6 +71,11 @@ abstract class SparkPlan extends QueryPlan[SparkPlan] with Logging with Serializ
 
   val id: Int = SparkPlan.newPlanId()
 
+  /**
+   * Return true if this stage of the plan supports row-based execution.

Review comment:
       Maybe add some explanation of why we need both this and 
`supportsColumnar`? It's a bit confusing when reading this code.
   
   Also, I'm wondering whether something like `prefersColumnar` would be 
better, so that we have:
   - `supportsColumnar`: this plan can produce columnar output, alongside the 
default row-based output which every plan supports.
   - `prefersColumnar`: this plan prefers to output columnar batches even 
when they are not explicitly requested (e.g., `outputsColumnar` is false).
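
   To make the suggested split concrete, here is a minimal sketch. It assumes 
a simplified stand-in for `SparkPlan`: `PlanSketch`, `choosesColumnar`, and 
`columnarRequested` are hypothetical names used only for illustration, and 
this is not how Spark actually decides between row and columnar execution.

```scala
// Hypothetical, simplified stand-in for SparkPlan; not Spark's actual API.
trait PlanSketch {
  // The plan can produce columnar batches in addition to the default rows.
  def supportsColumnar: Boolean = false

  // Proposed flag (assumption): the plan would rather emit columnar batches
  // even when the consumer did not explicitly ask for them.
  def prefersColumnar: Boolean = false
}

object PlanSketch {
  // Columnar output is chosen only when the plan supports it, and either the
  // consumer requested it or the plan itself prefers it.
  def choosesColumnar(plan: PlanSketch, columnarRequested: Boolean): Boolean =
    plan.supportsColumnar && (columnarRequested || plan.prefersColumnar)
}
```

   With this shape, a plan that only supports columnar output stays row-based 
unless asked, while a plan that also prefers it switches on its own.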
   

##########
File path: sql/core/benchmarks/InMemoryColumnarBenchmark-results.txt
##########
@@ -0,0 +1,12 @@
+================================================================================================

Review comment:
       nit: ideally we should generate the results using the GitHub workflow.

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala
##########
@@ -256,7 +256,8 @@ case class CachedRDDBuilder(
   }
 
   private def buildBuffers(): RDD[CachedBatch] = {
-    val cb = if (cachedPlan.supportsColumnar) {
+    val cb = if (cachedPlan.supportsColumnar &&
+        serializer.supportsColumnarInput(cachedPlan.output)) {

Review comment:
       Hmm, why is this necessary? Shouldn't `cachedPlan.supportsColumnar` 
already cover this? For instance, in `InMemoryTableScanExec`.
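
   For reference, the two-part condition in the diff can be sketched as below. 
This is only an illustration under stated assumptions: `PlanStub`, 
`SerializerStub`, and `inputPath` are hypothetical stand-ins, the schema is 
reduced to column names, and whether the second check is actually redundant in 
Spark is exactly the open question.

```scala
// Hypothetical, simplified stand-ins; not Spark's actual classes.
final case class PlanStub(supportsColumnar: Boolean, output: Seq[String])

trait SerializerStub {
  // Plays the role of `supportsColumnarInput` in the diff: can this
  // serializer consume columnar batches for the given schema?
  def supportsColumnarInput(schema: Seq[String]): Boolean
}

object BuildBuffersSketch {
  // Take the columnar path only when the plan can produce columnar batches
  // AND the serializer can accept them; otherwise fall back to rows.
  def inputPath(plan: PlanStub, serializer: SerializerStub): String =
    if (plan.supportsColumnar && serializer.supportsColumnarInput(plan.output)) {
      "columnar"
    } else {
      "row"
    }
}
```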




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


