[GitHub] spark pull request #22638: [SPARK-25610][SQL][TEST] Improve execution time o...

mgaido91 Fri, 05 Oct 2018 02:55:46 -0700

Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22638#discussion_r222952924
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetCacheSuite.scala ---
    @@ -127,16 +127,16 @@ class DatasetCacheSuite extends QueryTest with SharedSQLContext with TimeLimits
       }
     
       test("cache UDF result correctly") {
    -    val expensiveUDF = udf({x: Int => Thread.sleep(5000); x})
    -    val df = spark.range(0, 10).toDF("a").withColumn("b", expensiveUDF($"a"))
    +    val expensiveUDF = udf({x: Int => Thread.sleep(2000); x})
    --- End diff --
    
    Well, I do think this will pass 100% of the time; my concern was that, in 
case of a regression, we might fail to detect it. But yes, you're right: with 
the repartition to 1 the UDF calls run one after another, which I hadn't 
considered. Without it they might have run in parallel. So this seems enough.
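
To make the timing argument concrete, here is a minimal plain-Scala sketch (no Spark involved) of why serialized calls take at least the sum of their sleeps, while concurrent calls may overlap. The names `SleepTiming`, `slowIdentity`, and `timeMs` are hypothetical, the sleep is scaled down from the 2000 ms in the diff, and the two-element input is an assumption, not the row count used in the test.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SleepTiming {
  // Hypothetical stand-in for the expensive UDF: sleeps, then returns its input.
  def slowIdentity(sleepMs: Long)(x: Int): Int = { Thread.sleep(sleepMs); x }

  // Runs `body` and returns the elapsed wall-clock time in milliseconds.
  def timeMs(body: => Unit): Long = {
    val start = System.nanoTime()
    body
    (System.nanoTime() - start) / 1000000L
  }

  def main(args: Array[String]): Unit = {
    val sleepMs = 200L      // scaled down from the 2000 ms in the diff
    val inputs = Seq(0, 1)  // assumed input size, for illustration only

    // Analogue of a single partition: the calls run one after another,
    // so the elapsed time is roughly inputs.size * sleepMs.
    val sequential = timeMs {
      inputs.map(x => slowIdentity(sleepMs)(x))
      ()
    }

    // Analogue of several partitions: the calls may overlap, so the
    // elapsed time can drop towards a single sleepMs.
    val parallel = timeMs {
      val futures = inputs.map(x => Future(slowIdentity(sleepMs)(x)))
      Await.result(Future.sequence(futures), 10.seconds)
      ()
    }

    println(s"sequential: $sequential ms, parallel: $parallel ms")
  }
}
```

With serialized calls, a time limit set below the summed sleep would catch a regression that re-evaluates the UDF instead of reading the cached result. If the calls overlap, the same limit could be met even while recomputing.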



