Github user zhouyejoe commented on the issue: https://github.com/apache/spark/pull/19170

@vanzin Yes, I agree with you that the latest listener will not write this data to the logs. But here is the story. We deployed SHS (Spark History Server) with LevelDB in our clusters months ago, before you started merging patches into trunk. We used your development branch directly to build a binary for the History Server only. Our cluster runs multiple versions of Spark, including Spark 1.6.x and Spark 2.1. We then started stress testing this SHS for our internal use cases, which require SHS to analyze each application's logs and create DBs. Maybe we are using SHS too aggressively, but GC is one of the major issues we hit. We also reproduced the issue using the original SHS without LevelDB. So we created this ticket to solve the problem, and the fix has run fine for several months. Without this patch, our SHS with LevelDB would never be stable and could not serve our users.

I don't think we are the only company running multiple versions of Spark in production; as far as I know, Netflix is another example. For large-scale clusters where a single SHS instance processes thousands of Spark application logs, this patch would definitely help.