Github user jianjianjiao commented on the issue:
https://github.com/apache/spark/pull/22444
@vanzin Thanks a lot for your suggestions. Loading event logs is now much
faster: from more than 2.5 hours down to 19 minutes for 17K event logs,
some of which are larger than 10G.
1. Enabled SHS V2 to cache things on disk. We are using Windows, where there
is a small "posix.permissions not supported in windows" issue. I created a new
PR for it here: https://github.com/apache/spark/pull/22520 , could you please
take a look? This change doesn't speed up loading very much, but it improves
other parts.
2. Tried 2.4, and also tried applying SPARK-6951 to 2.3. This is the
critical part for improving the speed.
I will close this PR, as it is no longer needed. Thanks again.