GitHub user jianjianjiao commented on the issue: https://github.com/apache/spark/pull/22444

@vanzin Thank you very much for your suggestions. Loading event logs is now much faster: loading 17K event logs, some of them larger than 10G, went from more than 2.5 hours to 19 minutes.

1. Enabled SHS V2 to cache things on disk. We are using Windows, where there is a small "posix.permissions not supported in windows" issue, so I created a new PR here: https://github.com/apache/spark/pull/22520. Could you please take a look? This change doesn't speed up loading very much, but it improves other parts.
2. Tried 2.4, and also tried applying SPARK-6951 to 2.3. This is the critical part for improving the speed.

I will close this PR, as it is no longer needed. Thanks again.