Mathew created ZEPPELIN-3455:
--------------------------------

             Summary: Zeppelin crashes if Spark interpreter is restarted in a 
hanging state (And no hang timeout)
                 Key: ZEPPELIN-3455
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3455
             Project: Zeppelin
          Issue Type: Bug
          Components: Interpreters
    Affects Versions: 0.7.3
            Reporter: Mathew
             Fix For: 0.8.0


*Issue*:

If a user has not kinit'd their keytab, and attempts to run a %spark paragraph, 
it will hang indefinitely, and if they then try to restart the Spark 
interpreter, the whole of Zeppelin Crashes.

 

*Enviroment:*
 * Zeppelin 0.7.3
 * Spark 2.2
 * Yarn-Client (Kerberized) 
 * User Impersonation Enabled (Per user, in isolated process) 
 * Shiro authenticating users through AD
 * ZEPPELIN_IMPERSONATE_SPARK_PROXY_USER=false
 * ZEPPELIN_IMPERSONATE_CMD='sudo -H -u ${ZEPPELIN_IMPERSONATE_USER} bash -c '

While this might seem like a crazy setup, it is extremely common in enterprise, 
as users have differing permissions in the Hadoop environment.

(I am aware that zeppelin can proxy users if it has its own keytab, but many 
Zeppelin users cannot do that for now.)

 

*Things to Fix:*
 * Firstly, there is seemingly no timeout for failing to initialize the Spark 
interpreter. (Meaning it hangs forever)
 * Secondly, while it is in the hanging state, restarting the Spark interpreter 
will crash Zeppelin for everyone. (Sometimes it will come back after a 20+ Min) 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to