GitHub user vanzin commented on the issue:
https://github.com/apache/spark/pull/22444
> so any server restart results in hours of downtime, just from scanning.
Well, that's why 2.3 supports caching things on disk. Also, 2.4 has
SPARK-6951 which should make this a lot faster even without disk caching.
@jianjianjiao have you tried out 2.4?