Tom Chamnongvongse created LIVY-1003:
----------------------------------------
Summary: Interactive session - Setting large value of rsc.server.connect.timeout blocks other tasks
Key: LIVY-1003
URL: https://issues.apache.org/jira/browse/LIVY-1003
Project: Livy
Issue Type: Bug
Components: RSC
Affects Versions: 0.8.0
Reporter: Tom Chamnongvongse

Problem:

Livy is configured to deploy interactive sessions on YARN with `livy.rsc.server.connect.timeout` set to a high value. The timeout is increased so that a Livy session can stay in the YARN `ACCEPTED` state for longer; otherwise the Livy server kills the YARN app once the default timeout of 90 seconds expires.

Until the app reaches the YARN `RUNNING` state, it occupies a thread in Scala's global execution context - https://github.com/apache/incubator-livy/blob/v0.8.0-incubating/server/src/main/scala/org/apache/livy/server/interactive/InteractiveSession.scala#L474. When too many sessions are stuck in the `ACCEPTED` state, other tasks that use the global execution context are queued behind them.

How to reproduce:

1. Set `livy.rsc.server.connect.timeout` to something high, like 1 hour.
2. Create enough interactive Livy sessions in YARN so that they are queued in the `ACCEPTED` state. The number of sessions stuck in the `ACCEPTED` state should be equal to the global execution context [thread pool size|https://docs.scala-lang.org/overviews/core/futures.html#the-global-execution-context] (Runtime.availableProcessors).
3. Try to delete a session using DELETE /sessions/{sessionId}. The request should hang until one of the sessions is no longer stuck in the `ACCEPTED` state.
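
The starvation described under "Problem" can be illustrated outside Livy with a small sketch. This is only an analogy, not Livy's code: it uses a Python `ThreadPoolExecutor` as a stand-in for Scala's global execution context, and the names `run_demo`, `wait_for_driver`, `delete_session`, `pool_size`, and `probe_timeout` are hypothetical. Each simulated session occupies one worker until its app leaves `ACCEPTED`, so an unrelated short task (the simulated DELETE) cannot start while the pool is saturated.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def run_demo(pool_size=4, probe_timeout=0.2):
    """Return (delete_was_blocked, delete_result) for a saturated fixed-size pool.

    pool_size stands in for Runtime.availableProcessors (assumption for the demo).
    """
    # Set once the simulated YARN apps leave the ACCEPTED state.
    app_running = threading.Event()
    # Released by each waiter as soon as it occupies a worker thread.
    worker_busy = threading.Semaphore(0)

    def wait_for_driver():
        # Simulates a session holding a pool thread while its app is ACCEPTED.
        worker_busy.release()
        app_running.wait()
        return "connected"

    def delete_session():
        # Simulates an unrelated short task that shares the same pool.
        return "deleted"

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        waiters = [pool.submit(wait_for_driver) for _ in range(pool_size)]
        # Wait until every worker thread is held by a waiting session.
        for _ in range(pool_size):
            worker_busy.acquire()

        delete = pool.submit(delete_session)
        try:
            delete.result(timeout=probe_timeout)
            blocked = False
        except FutureTimeout:
            # No free worker: the delete task is still sitting in the queue.
            blocked = True

        # Simulated apps move to RUNNING, which frees the workers.
        app_running.set()
        result = delete.result(timeout=5)
        for waiter in waiters:
            waiter.result(timeout=5)

    return blocked, result


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the delete task only completes after `app_running` is set, which mirrors step 3 of the reproduction: the DELETE request hangs until at least one session stops waiting.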