Hi zeppelin-users,

Because Zeppelin relies on a long-running SparkContext, keeping that context stable is important for availability. In my experience, running a SparkContext for several days leads to a few issues, including:

1. Event logging stops working because of an HDFS lease issue, similar to this: https://mail-archives.apache.org/mod_mbox/spark-user/201507.mbox/%3ccae6kwsp_c00gksmnx0obu5aouxphdjs-syqywt-jfi3psvc...@mail.gmail.com%3E
2. The Spark UI gets slower because of the large number of historical jobs.
3. Cached data disappears mysteriously.

They may not be Zeppelin issues, but I would like to hear about the problems you have run into and how you deal with maintaining a long-running SparkContext. I know that we can clean up periodically by restarting the Spark interpreter, but I am wondering whether there are better ways.

Thanks!
Zhong