The doc piques my interest, but I'd have to see some code before deciding whether it's the best direction. Hopefully we'll have the new JIRA up by end of week so you can file an issue there, but feel free to open a WIP PR earlier if you're ready.
Alex Bozarth
Software Engineer
Spark Technology Center
E-mail: ajboz...@us.ibm.com
GitHub: github.com/ajbozarth
505 Howard Street
San Francisco, CA 94105
United States

From: Nan Zhu <zhunanmcg...@gmail.com>
To: dev@livy.incubator.apache.org
Date: 08/14/2017 02:35 PM
Subject: resolve the scalability problem caused by app monitoring in livy with an actor-based design

Hi, all

In HDInsight, we (Microsoft) use Livy as the Spark job submission service. We keep seeing customers run into problems when they submit many concurrent applications to the system, or when they recover Livy from a state with many concurrent applications.

By looking at the code and the customers' exception stacks, we narrowed the problem down to the application monitoring module, where a new thread is created for each application.

To resolve the issue, we propose an actor-based design for the application monitoring module and share it here (as the new JIRA does not seem to be working yet):

https://docs.google.com/document/d/1yDl5_3wPuzyGyFmSOzxRp6P-nbTQTdDFXl2XQhXDiwA/edit?usp=sharing

We would be glad to hear feedback from the community and improve the design before we start implementing it!

Best,
Nan