cekicbaris commented on pull request #167:
URL: https://github.com/apache/incubator-livy/pull/167#issuecomment-698866352


   > @cekicbaris, it would be nice to see the Spark Driver logs during this 
failure; I believe it might be related to Livy <-> Spark Driver communication. 
It might also be that the networking is not very stable in your env; I'm not 
sure whether Livy does retries.
   @jahstreet 
   Could you give some advice on how to check the networking? It is running on 
AWS EKS with Kubernetes 1.15, in the `livy` namespace. The `livy` service 
account has enough permissions.
   
   One more thing: if the session times out, the interactive session is 
deleted from Livy, but the driver and executor pods keep running. If I instead 
delete the session with a DELETE request to the REST API, the pods are deleted 
as well.
   Here is the log of a timed-out session.
   ```
   2020-09-23T07:19:29.853571515Z 2020-09-23 07:19:29 INFO  InteractiveSessionManager:39 - Deleting InteractiveSession 2 because it was inactive for more than 3600000.0 ms.
   2020-09-23T07:19:29.853690768Z 2020-09-23 07:19:29 INFO  InteractiveSession:39 - Stopping InteractiveSession 2...
   2020-09-23T07:19:29.994715243Z 2020-09-23 07:19:29 WARN  RpcDispatcher:191 - [ClientProtocol] Closing RPC channel with 1 outstanding RPCs.
   2020-09-23T07:19:30.013570228Z 2020-09-23 07:19:30 INFO  InteractiveSession:39 - Stopped InteractiveSession 2.
   2020-09-23T07:19:39.119654462Z 2020-09-23 07:19:39 ERROR SparkKubernetesApp:56 - Unknown Kubernetes state unknown for app with tag livy-session-2-dfceC0pO.
   ```
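
The manual cleanup described above corresponds to a DELETE on Livy's `/sessions/{sessionId}` REST endpoint. A minimal sketch in Python, assuming a hypothetical server address (`http://livy.example:8998`) and no authentication; the helper names are illustrative and not part of Livy:

```python
import urllib.request


def build_delete_session_request(livy_url: str, session_id: int) -> urllib.request.Request:
    """Build (but do not send) a DELETE request for one Livy session."""
    # Strip a trailing slash so the joined URL has no double slash.
    url = f"{livy_url.rstrip('/')}/sessions/{session_id}"
    return urllib.request.Request(url, method="DELETE")


def delete_session(livy_url: str, session_id: int, timeout: float = 10.0) -> int:
    """Send the DELETE request and return the HTTP status code."""
    request = build_delete_session_request(livy_url, session_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical address; replace with the real Livy endpoint.
    request = build_delete_session_request("http://livy.example:8998", 2)
    print(request.get_method(), request.full_url)
```

Building the request separately from sending it makes it possible to inspect the method and URL without a running Livy server.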


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

