Github user zhouyejoe commented on the issue:

    https://github.com/apache/spark/pull/19170
  
    @vanzin Yes, I agree that the latest listener will not write this data 
into the logs. But here is the background. We deployed SHS (Spark History 
Server) with LevelDB in our clusters months ago, before you started merging 
patches into trunk. We built the History Server binary directly from your 
development branch. Our cluster runs multiple versions of Spark, including 
Spark 1.6.x and Spark 2.1. We then started pressure testing this SHS for our 
internal use cases, which require SHS to analyze each application's logs and 
create DBs. Maybe we are using SHS too aggressively, but the GC issue is one 
of the major issues we encountered. We also reproduced this issue using the 
original SHS without LevelDB. So we created this ticket to solve the problem, 
and the fix has run fine for several months. Without this patch, our SHS with 
LevelDB would never reach a stable state and could not serve our users. I 
think we are not the only company that runs multiple versions of Spark in 
production; as far as I know, Netflix is another example. For large-scale 
clusters where a single SHS instance processes thousands of Spark application 
logs, this patch would definitely help.
