GitHub user sujithjay commented on a diff in the pull request:
https://github.com/apache/spark/pull/20002#discussion_r158267365
--- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala ---
@@ -67,6 +71,16 @@ object Partitioner {
}
}
}
+
+  /**
+   * Returns true if the number of partitions of the RDD is either greater
+   * than the max number of upstream partitions, or is less than it but
+   * within a single order of magnitude; otherwise, returns false.
+   */
+  private def isEligiblePartitioner(hasMaxPartitioner: RDD[_], rdds: Seq[RDD[_]]): Boolean = {
+    val maxPartitions = rdds.map(_.partitions.length).max
+    log10(maxPartitions).floor - log10(hasMaxPartitioner.getNumPartitions).floor < 1
--- End diff --
Hi @mridulm, I suppose I was trying to ensure a strict order-of-magnitude
check, but I agree it leads to a discontinuity. I will change this and the
corresponding test cases.
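
To make the discontinuity concrete, here is a minimal standalone sketch
(not code from the PR; plain Int arguments stand in for the RDD partition
counts, and ratioBased is one hypothetical smoothing, not necessarily the
change that was ultimately made):

    import scala.math.log10

    object EligibilityCheck {
      // The floor-based check from the diff above, with plain Ints
      // standing in for the partition counts.
      def floorBased(maxPartitions: Int, numPartitions: Int): Boolean =
        log10(maxPartitions).floor - log10(numPartitions).floor < 1

      // A hypothetical continuous alternative: eligible whenever
      // maxPartitions is less than ten times numPartitions.
      def ratioBased(maxPartitions: Int, numPartitions: Int): Boolean =
        log10(maxPartitions) - log10(numPartitions) < 1

      def main(args: Array[String]): Unit = {
        // 99 vs. 100 partitions are nearly equal, yet the floor-based
        // check rejects them because they straddle a power of ten:
        // floor(log10(100)) - floor(log10(99)) = 2 - 1 = 1, not < 1.
        println(floorBased(100, 99)) // false
        println(ratioBased(100, 99)) // true

        // 10 vs. 99 is almost a 10x gap, yet the floor-based check
        // accepts it: floor(log10(99)) - floor(log10(10)) = 1 - 1 = 0.
        println(floorBased(99, 10))  // true
        println(ratioBased(99, 10))  // true (ratio 9.9 is still under 10)
      }
    }

The floor-based form flips exactly at powers of ten, so the outcome depends
on where the two counts fall relative to a power of ten rather than on how
far apart they actually are; a ratio-based form depends only on their ratio.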