Github user WeichenXu123 commented on the issue:

    https://github.com/apache/spark/pull/19229
  
    That doesn't look like the reason; the issue may be somewhere else. Let me run the test later. Thanks!
    But there are some small issues in the test:
    Don't include the data generation time in the measurement, as this code does:
    ```
        val start = System.nanoTime()
        val df2 = genData()
        model.transform(df2).count
        val end = System.nanoTime()
    ```
    Also add caching at the end of genData:
    ```
    def genData() = {
      ....
      val df = spark.createDataFrame...
      df.cache()
      df.count() // run an action so the cache is actually materialized
      df
    }
    ```
    We'd also better add warm-up code before recording the running time.
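
    The three points above could be combined roughly as follows. This is only a sketch: `TimingSketch` and `timeMs` are hypothetical names, not part of the PR or of Spark, and the array sum is a stand-in workload so the snippet runs without a Spark session.

    ```scala
    object TimingSketch {
      // Hypothetical helper (not from the PR): evaluates `body` `warmups` times
      // untimed, then returns the average wall-clock milliseconds over `runs`
      // timed evaluations.
      def timeMs[A](warmups: Int, runs: Int)(body: => A): Double = {
        require(runs > 0, "runs must be positive")
        // Warm-up: lets JIT compilation and class loading happen before timing.
        (0 until warmups).foreach(_ => body)
        val start = System.nanoTime()
        (0 until runs).foreach(_ => body)
        val end = System.nanoTime()
        (end - start) / 1e6 / runs
      }

      def main(args: Array[String]): Unit = {
        // Input is prepared outside the timed region, like a cached genData() result.
        val data = Array.tabulate(1000000)(_.toLong)
        // Stand-in workload; in the benchmark this would be model.transform(df2).count
        val ms = timeMs(warmups = 3, runs = 5) { data.sum }
        println(f"avg: $ms%.3f ms")
      }
    }
    ```

    In the actual test, `val df2 = genData()` would be called before `timeMs`, and only `model.transform(df2).count` would go inside the timed block.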

