runzhiwang commented on a change in pull request #242: [LIVY-336] Livy should not spawn one thread per job to track the job on Yarn URL: https://github.com/apache/incubator-livy/pull/242#discussion_r346141789
########## File path: server/src/main/scala/org/apache/livy/utils/SparkYarnApp.scala ##########

```diff
@@ -56,17 +67,27 @@ object SparkYarnApp extends Logging {
   private def getYarnTagToAppIdTimeout(livyConf: LivyConf): FiniteDuration =
     livyConf.getTimeAsMs(LivyConf.YARN_APP_LOOKUP_TIMEOUT) milliseconds

-  private def getYarnPollInterval(livyConf: LivyConf): FiniteDuration =
-    livyConf.getTimeAsMs(LivyConf.YARN_POLL_INTERVAL) milliseconds
+  private def getYarnTagToAppIdMaxFailedTimes(livyConf: LivyConf): Int =
+    livyConf.getTimeAsMs(LivyConf.YARN_APP_LOOKUP_TIMEOUT).toInt / yarnAppLookUpInterval.toInt

   private val appType = Set("SPARK").asJava

   private val leakedAppTags = new java.util.concurrent.ConcurrentHashMap[String, Long]()

+  private[utils] val appMap = new TrieMap[SparkYarnApp, String]()
```

Review comment:

For example, suppose there is only one app, appA, the future thread pool has 8 threads, and `yarnAppMonitorTimeout` is configured to be larger than `yarnPollInterval`. In the first `while (true)` iteration, thread1 monitors appA and takes longer than `yarnPollInterval`; once `yarnPollInterval` elapses, the second `while (true)` iteration starts and thread2 begins to monitor appA. Thread1 and thread2 are then monitoring appA concurrently. I could configure `yarnAppMonitorTimeout < yarnPollInterval` to avoid this, since thread1's monitoring would end before thread2's begins, but that is not safe in case someone changes the config.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services
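The overlap described above can be avoided structurally rather than by tuning the two timeouts. A minimal, hypothetical sketch (not the code in this PR; class and method names are invented for illustration) of one common guard: each poll tick atomically claims an app in a `ConcurrentHashMap` via `putIfAbsent` before monitoring it, so a second tick that fires while the first monitor call is still running simply skips the app instead of spawning a duplicate monitor.

```java
import java.util.concurrent.ConcurrentHashMap;

public class MonitorGuardDemo {
    // Apps currently being monitored; putIfAbsent makes the claim atomic,
    // so at most one poll tick owns a given app at any time.
    private final ConcurrentHashMap<String, Boolean> monitoring = new ConcurrentHashMap<>();

    /** Atomically claim appId for monitoring; returns false if another tick already owns it. */
    boolean tryClaim(String appId) {
        return monitoring.putIfAbsent(appId, Boolean.TRUE) == null;
    }

    /** Release the claim once the (possibly slow) monitor call returns. */
    void release(String appId) {
        monitoring.remove(appId);
    }

    public static void main(String[] args) {
        MonitorGuardDemo guard = new MonitorGuardDemo();
        // First poll tick claims appA and starts a slow monitor call.
        System.out.println("tick1 claims appA: " + guard.tryClaim("appA"));
        // Second tick fires before the first finished: the claim fails,
        // so no duplicate monitor is started, regardless of the timeout config.
        System.out.println("tick2 claims appA: " + guard.tryClaim("appA"));
        // After the first monitor call completes, the app can be claimed again.
        guard.release("appA");
        System.out.println("tick3 claims appA: " + guard.tryClaim("appA"));
    }
}
```

With this guard the correctness no longer depends on `yarnAppMonitorTimeout` being smaller than `yarnPollInterval`: even a misconfigured or slow monitor call cannot be shadowed by a second concurrent one for the same app.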