Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21527#discussion_r196810539
  
    --- Diff: 
core/src/test/scala/org/apache/spark/scheduler/MapStatusSuite.scala ---
    @@ -188,4 +188,37 @@ class MapStatusSuite extends SparkFunSuite {
           assert(count === 3000)
         }
       }
    +
    +  test("YSPARK-500 MapStatus has 2000 hardcoded") {
    +    val conf = new SparkConf()
    +      .setMaster("local")
    +      .setAppName("YSPARK-500")
    +    val sizes = Array.fill[Long](500)(150L)
    +    // Test default value
    +    withSpark(new SparkContext(conf)) { sc =>
    +      val status = MapStatus(null, sizes)
    +      assert(status.isInstanceOf[CompressedMapStatus])
    +    }
    +    // Test Non-positive values
    +    for (s <- -1 to 0) {
    +      assertThrows[IllegalArgumentException] {
    +        conf.set(config.SHUFFLE_MIN_NUM_PARTS_TO_HIGHLY_COMPRESS, s)
    +        withSpark(new SparkContext(conf)) { sc =>
    +          val status = MapStatus(null, sizes)
    +        }
    +      }
    +    }
    +    // Test positive values
    +    for(s <- 1 to 3000) {
    --- End diff --
    
    this is overkill, especially since you're creating a SparkContext for 
each case.  (Look at how long this test takes compared to the others: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91850/testReport/org.apache.spark.scheduler/MapStatusSuite/)
    
    You really don't need a SparkContext for these tests; I'd just set a 
`SparkEnv` -- use mockito to create a mock that returns the conf, and be 
sure to unset it at the end of the test in a `finally`.  That alone should keep 
the runtime small, but in any case just a few values are probably 
enough:
    
    ```scala
    Seq(1, 100, 499, 500, 501).foreach { s => ...
    ```
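    
    And for the body of each case, roughly something like this -- just an 
untested sketch, assuming `SparkEnv.set` is reachable from this test package 
and that `config.SHUFFLE_MIN_NUM_PARTS_TO_HIGHLY_COMPRESS` is the entry this 
patch adds:
    
    ```scala
    import org.mockito.Mockito.{doReturn, mock}
    
    // threshold of 1000 with 500 map outputs should give a plain
    // CompressedMapStatus rather than a HighlyCompressedMapStatus
    val conf = new SparkConf()
      .set(config.SHUFFLE_MIN_NUM_PARTS_TO_HIGHLY_COMPRESS, 1000)
    // mock a SparkEnv that just hands back the conf, so no SparkContext is needed
    val env = mock(classOf[SparkEnv])
    doReturn(conf).when(env).conf
    SparkEnv.set(env)
    try {
      val sizes = Array.fill[Long](500)(150L)
      val status = MapStatus(null, sizes)
      assert(status.isInstanceOf[CompressedMapStatus])
    } finally {
      // unset so later tests don't see the mocked env
      SparkEnv.set(null)
    }
    ```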

