Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/16677#discussion_r198328639
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala
---
@@ -193,6 +193,16 @@ case object SinglePartition extends Partitioning {
}
}
+/**
+ * Represents a partitioning where rows are only serialized/deserialized locally. The number
+ * of partitions are not changed and also the distribution of rows. This is mainly used to
+ * obtain some statistics of map tasks such as number of outputs.
+ */
+case class LocalPartitioning(orgPartition: Partitioning, numPartitions: Int) extends Partitioning {
--- End diff --
As you can see, this causes a test failure, because
`child.outputPartitioning` can be `UnknownPartitioning(0)`, which reports
zero partitions. To avoid keeping a separate `numPartitions` field, I made
`childRDD` a field of `LocalPartitioning` instead of `orgPartition`.
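
A minimal sketch of why the partition count is better read from the child RDD than from its reported partitioning. The types below are simplified stand-ins written for illustration, not the real Spark classes: `FakeRDD`, `LocalPartitioningByOrg`, `LocalPartitioningByRDD`, and `Demo` are hypothetical names, and the value 4 is an arbitrary example.

```scala
// Simplified stand-ins for the Spark types; NOT the real Spark API.
trait Partitioning { def numPartitions: Int }
case class UnknownPartitioning(numPartitions: Int) extends Partitioning

// Hypothetical stand-in for RDD[InternalRow]; only the partition count matters here.
final case class FakeRDD(getNumPartitions: Int)

// Variant 1: take the count from the child's reported partitioning.
case class LocalPartitioningByOrg(orgPartition: Partitioning) extends Partitioning {
  val numPartitions: Int = orgPartition.numPartitions
}

// Variant 2: take the count from the child RDD itself.
case class LocalPartitioningByRDD(childRDD: FakeRDD) extends Partitioning {
  val numPartitions: Int = childRDD.getNumPartitions
}

object Demo {
  // A child with 4 physical partitions (arbitrary example value) ...
  val childRDD: FakeRDD = FakeRDD(getNumPartitions = 4)
  // ... whose plan nevertheless reports UnknownPartitioning(0).
  val reported: Partitioning = UnknownPartitioning(0)

  val byOrg = LocalPartitioningByOrg(reported) // inherits the misleading 0
  val byRDD = LocalPartitioningByRDD(childRDD) // sees the actual count
}
```

In this sketch the first variant ends up with a partition count of 0 even though the child has partitions, while the second variant needs no extra field because the RDD already carries the count.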