viirya commented on a change in pull request #34642:
URL: https://github.com/apache/spark/pull/34642#discussion_r766277525
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala
##########
@@ -256,7 +256,8 @@ case class CachedRDDBuilder(
}
private def buildBuffers(): RDD[CachedBatch] = {
- val cb = if (cachedPlan.supportsColumnar) {
+ val cb = if (cachedPlan.supportsColumnar &&
+ serializer.supportsColumnarInput(cachedPlan.output)) {
Review comment:
This is actually a bug. `cachedPlan.supportsColumnar` only indicates that the
cached plan can produce columnar output; whether this cached RDD builder can
consume such input depends on its serializer.
One test failed due to the proposed change. As I recall, it happens when one
`InMemoryRelation` sits under another `InMemoryRelation`.
Previously we always added an additional `ColumnarToRow` transition between the two
`InMemoryRelation`s, so we didn't hit this.
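To illustrate the condition, here is a minimal standalone sketch of the guard. It is not the actual Spark code: `Attr`, `Plan`, `Serializer`, and `BuildPath` are hypothetical stand-ins, and only the names `supportsColumnar`, `supportsColumnarInput`, and `output` are taken from the diff above.

```scala
// Simplified stand-ins for illustration only; these are NOT the real Spark classes.
final case class Attr(name: String, dataType: String)

// Models the cached plan: it may be able to PRODUCE columnar batches.
trait Plan {
  def output: Seq[Attr]
  def supportsColumnar: Boolean
}

// Models the cache serializer: it may or may not be able to CONSUME columnar batches.
trait Serializer {
  def supportsColumnarInput(schema: Seq[Attr]): Boolean
}

object BuildPath {
  sealed trait Path
  case object Columnar extends Path
  case object RowBased extends Path

  // Take the columnar path only when both sides agree: the plan can emit
  // columnar data AND the serializer accepts columnar input for that schema.
  def choose(plan: Plan, serializer: Serializer): Path =
    if (plan.supportsColumnar && serializer.supportsColumnarInput(plan.output)) Columnar
    else RowBased
}
```

With a plan that reports `supportsColumnar = true` and a serializer that returns `false` from `supportsColumnarInput`, `choose` falls back to `RowBased`. That is the case the old check, which looked only at the plan, would have sent down the columnar path.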
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]