Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19229
That doesn't look like the reason; the issue may be somewhere else. Let me run the test later.
Thanks!
But there are some small issues in the test:
Don't include the data generation time in the measurement. Here genData() is called between the two timestamps:
```
val start = System.nanoTime()
val df2 = genData()
model.transform(df2).count
val end = System.nanoTime()
```
Also cache the DataFrame at the end of genData:
```
def genData() = {
  ....
  val df = spark.createDataFrame...
  df.cache()  // cache() is lazy; it only marks the DataFrame for caching
  df.count()  // run an action so the cache is actually populated
  df
}
```
We should also add warm-up code before recording the running time.
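
A minimal sketch of how the points above could fit together, in plain Scala so it runs without Spark. The helper name `timeWithWarmup`, its parameters, and the array-based stand-in workload are hypothetical and only for illustration; they are not part of the PR's test code.

```scala
object BenchmarkSketch {
  /** Runs `body` `warmupRuns` times without timing it, then `timedRuns`
    * times with timing. Returns the elapsed milliseconds of each timed run.
    * (Hypothetical helper, not taken from the PR.)
    */
  def timeWithWarmup(warmupRuns: Int, timedRuns: Int)(body: => Unit): Seq[Double] = {
    // Warm-up: give the JIT and any caches a chance to settle before measuring.
    (0 until warmupRuns).foreach(_ => body)
    (0 until timedRuns).map { _ =>
      val start = System.nanoTime()
      body
      val end = System.nanoTime()
      (end - start) / 1e6 // nanoseconds to milliseconds
    }
  }

  def main(args: Array[String]): Unit = {
    // Stand-in for genData(): the input is built once, outside the timed region.
    val data = Array.tabulate(1000000)(_.toDouble)
    // Stand-in for model.transform(df2).count: only this work is measured.
    val timings = timeWithWarmup(warmupRuns = 3, timedRuns = 5) {
      data.map(math.sqrt).sum
    }
    println(f"mean: ${timings.sum / timings.size}%.3f ms")
  }
}
```

In the Spark test, the equivalent would presumably be to call `genData()` (with the cached DataFrame) before taking the first timestamp and to pass `model.transform(df2).count` as the timed body.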