risyomei opened a new issue, #4457:
URL: https://github.com/apache/kyuubi/issues/4457

   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   
   
   ### Search before asking
   
   - [X] I have searched in the 
[issues](https://github.com/apache/kyuubi/issues?q=is%3Aissue) and found no 
similar issues.
   
   
   ### Describe the bug
   
   When a statement is running on Spark in YARN cluster mode, restarting 
the NodeManager while the statement is executing causes beeline to hang.
   
   It is similar to issue [#647 Kill yarn app when executing statement cause 
beeline hang](https://github.com/apache/kyuubi/issues/647), but differs in 
that:
   - Restarting the NodeManager that runs the Spark Driver (SparkSQLEngine) 
does NOT throw any exception.
   
   As a result, the `KyuubiSyncThriftClient` hangs and waits forever.
   
https://github.com/apache/kyuubi/blob/43309b86f1997b028e8fde5cb4e6449d818f4f73/kyuubi-server/src/main/scala/org/apache/kyuubi/operation/ExecuteStatement.scala#L104
   
   Possible Workaround:
   This issue can be mitigated by setting 
"kyuubi.session.engine.request.timeout".
   With it set, the `KyuubiSyncThriftClient` throws a 
`java.net.SocketTimeoutException: Read timed out`, and the exception is then 
handled properly by the fix for #647.
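
   To illustrate the mechanism behind this workaround, here is a minimal 
sketch in plain Java. It is not Kyuubi code: the class and method names are 
hypothetical, and it uses a raw `java.net.Socket` rather than the Thrift 
client. It assumes the setting ultimately bounds a blocking socket read, which 
is consistent with the exception quoted above. A peer that accepts the TCP 
connection but never answers stands in for the unresponsive engine.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class, not part of Kyuubi.
class ReadTimeoutDemo {

    /**
     * Connects to a local peer that never sends any data and attempts a
     * blocking read bounded by timeoutMillis.
     * Returns true if the read was abandoned with a SocketTimeoutException.
     */
    static boolean readTimesOut(int timeoutMillis) {
        // The server socket is never accept()ed or written to; the OS still
        // completes the TCP handshake, so the client sees a silent peer.
        try (ServerSocket silentPeer =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client =
                 new Socket(silentPeer.getInetAddress(), silentPeer.getLocalPort())) {
            // A timeout of 0 would mean "block forever", which is the hang.
            client.setSoTimeout(timeoutMillis);
            try {
                client.getInputStream().read();
                return false; // the peer answered or closed the connection
            } catch (SocketTimeoutException e) {
                return true;  // the read gave up instead of hanging
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

   Without a bounded read timeout the same `read()` call never returns, which 
matches the hang described above.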
   
   However, "kyuubi.session.engine.request.timeout" was removed by 
https://github.com/apache/kyuubi/pull/2948, so this workaround is no longer 
possible after 1.6.0.
   
   Possible Solutions:
   1. Let Alive Probe deregister the Engine.
   2. Let the client handle the Engine deregister event.
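
   As a rough illustration of the idea behind these options, the sketch below 
shows a waiter that stops waiting once a liveness probe reports the engine as 
gone. Everything in it is an assumption made for illustration: 
`ProbeGuardedWait`, `awaitResult`, and the `engineAlive` supplier are 
hypothetical names, and the pending statement is modelled as a plain 
`CompletableFuture` rather than Kyuubi's actual operation or probe classes.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical sketch, not part of Kyuubi.
class ProbeGuardedWait {

    /**
     * Waits for a statement result, but fails the wait as soon as the
     * liveness probe (engineAlive) reports that the engine is gone.
     */
    static String awaitResult(CompletableFuture<String> pendingResult,
                              BooleanSupplier engineAlive,
                              long probeIntervalMillis) {
        ScheduledExecutorService prober =
            Executors.newSingleThreadScheduledExecutor();
        try {
            // Periodically probe the engine; on failure, unblock the waiter.
            prober.scheduleAtFixedRate(() -> {
                if (!engineAlive.getAsBoolean()) {
                    pendingResult.completeExceptionally(
                        new IllegalStateException("engine is no longer alive"));
                }
            }, probeIntervalMillis, probeIntervalMillis, TimeUnit.MILLISECONDS);
            return pendingResult.get();
        } catch (ExecutionException e) {
            return "FAILED: " + e.getCause().getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "FAILED: interrupted";
        } finally {
            prober.shutdownNow();
        }
    }
}
```

   The point of the design is that the waiting side never depends on the dead 
engine to send an error: the probe itself turns a silent disappearance into an 
explicit failure.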
   
   What do you think?
   
   ### Affects Version(s)
   
   master, 1.6.1-incubating
   
   ### Kyuubi Server Log Output
   
   _No response_
   
   ### Kyuubi Engine Log Output
   
   _No response_
   
   ### Kyuubi Server Configurations
   
   _No response_
   
   ### Kyuubi Engine Configurations
   
   _No response_
   
   ### Additional context
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes. I would be willing to submit a PR with guidance from the Kyuubi 
community to fix.
   - [ ] No. I cannot submit a PR at this time.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

