GitHub user sujithjay commented on a diff in the pull request:
https://github.com/apache/spark/pull/20002#discussion_r158267365
--- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala ---
@@ -67,6 +71,16 @@ object Partitioner {
}
}
}
+
+  /**
+   * Returns true if the number of partitions of the RDD is either greater
+   * than the max number of upstream partitions, or is less than it but
+   * within a single order of magnitude; otherwise, returns false.
+   */
+  private def isEligiblePartitioner(hasMaxPartitioner: RDD[_], rdds: Seq[RDD[_]]): Boolean = {
+    val maxPartitions = rdds.map(_.partitions.length).max
+    log10(maxPartitions).floor - log10(hasMaxPartitioner.getNumPartitions).floor < 1
--- End diff --
Hi @mridulm, I suppose I was trying to ensure a strict order-of-magnitude
check, but I agree it leads to a discontinuity. I will change this and the
corresponding test cases.
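
To make the discontinuity concrete, here is a minimal standalone sketch
(not code from the PR; plain Int arguments stand in for the RDD partition
counts, and ratioBased is one hypothetical smoothing, not necessarily the
change that was ultimately made):

    import scala.math.log10

    object EligibilityCheck {
      // The floor-based check from the diff above, with plain Ints
      // standing in for the partition counts.
      def floorBased(maxPartitions: Int, numPartitions: Int): Boolean =
        log10(maxPartitions).floor - log10(numPartitions).floor < 1

      // A hypothetical continuous alternative: eligible whenever
      // maxPartitions is less than ten times numPartitions.
      def ratioBased(maxPartitions: Int, numPartitions: Int): Boolean =
        log10(maxPartitions) - log10(numPartitions) < 1

      def main(args: Array[String]): Unit = {
        // 99 vs. 100 partitions are nearly equal, yet the floor-based
        // check rejects them because they straddle a power of ten:
        // floor(log10(100)) - floor(log10(99)) = 2 - 1 = 1, not < 1.
        println(floorBased(100, 99)) // false
        println(ratioBased(100, 99)) // true

        // 10 vs. 99 is almost a 10x gap, yet the floor-based check
        // accepts it: floor(log10(99)) - floor(log10(10)) = 1 - 1 = 0.
        println(floorBased(99, 10))  // true
        println(ratioBased(99, 10))  // true (ratio 9.9 is still under 10)
      }
    }

The floor-based form flips exactly at powers of ten, so the outcome depends
on where the two counts fall relative to a power of ten rather than on how
far apart they actually are; a ratio-based form depends only on their ratio.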