Divya Goel created ZEPPELIN-4973:
------------------------------------

             Summary: Zeppelin spark jobs are getting hung and return with different errors each time.
                 Key: ZEPPELIN-4973
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-4973
             Project: Zeppelin
          Issue Type: Bug
          Components: Interpreters, spark
    Affects Versions: 0.8.2
         Environment: Hi,
I've been encountering this issue for a month. Every time, I have to restart the Spark interpreter, which at first makes things more vulnerable; then, after logging in again and rerunning the job, I'm back to the job hanging. Could you please assist me? I have to use Zeppelin for my data visualization, which is at stake because of this issue.

With much regards,
Divya
            Reporter: Divya Goel
             Fix For: 0.8.2
         Attachments: zeppelin_error.PNG, zeppelin_sparkjob.PNG


Hi,

I have a kerberized cluster, and my Kerberos ticket is renewed each day, providing me with a valid key. When I run a Spark job from my Zeppelin IDE, it first gets stuck for 2.5-3 hours, and after that I get the error below.

GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
    at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
    at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
    at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
    at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
    at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:413)
    at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:594)
    at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:396)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:761)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:757)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at

I've enabled user impersonation in Zeppelin, which is why the Zeppelin keytab and principal are being submitted to the Spark interpreter through the properties
zeppelin.spark.keytab and zeppelin.spark.principal.

Strangely, although the error above is the persistent one, sometimes out of the blue I get the error below instead, after 2.5 to 3 hours:

java.lang.NullPointerException
    at org.apache.thrift.transport.TSocket.open(TSocket.java:170)
    at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:51)
    at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:37)
    at org.apache.commons.pool2.BasePooledObjectFactory.makeObject(BasePooledObjectFactory.java:60)
    at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:861)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)