Supreeth Sharma created ZEPPELIN-3114:
-----------------------------------------
Summary: Notebooks and interpreters are not getting saved in
zeppelin after >1d stress testing
Key: ZEPPELIN-3114
URL: https://issues.apache.org/jira/browse/ZEPPELIN-3114
Project: Zeppelin
Issue Type: Bug
Components: zeppelin-server
Affects Versions: 0.7.3
Reporter: Supreeth Sharma
Scenario:
36-hour test
14-node secured, encrypted cluster (CentOS 7 based)
Simulated load of around 13 users running a set of 19 notebooks periodically on
a defined schedule
After 24 hours, Zeppelin stopped functioning.
Issue 1:
Not able to create a new notebook or update an existing one.
Issue 2:
Not able to modify interpreter settings. The save action never completes in the UI.
Issue 3:
Not able to run paragraphs.
The following error appears in the Zeppelin logs:
{code}
WARN [2017-12-19 13:18:48,128] ({qtp1076835071-86681} Client.java[run]:715) -
Exception encountered while connecting to the server :
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException:
No valid credentials provided (Mechanism level: Failed to find any Kerberos
tgt)]
INFO [2017-12-19 13:18:48,128] ({qtp1076835071-86681}
RetryInvocationHandler.java[log]:280) - java.io.IOException: Failed on local
exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate
failed [Caused by GSSException: No valid credentials provided (Mechanism level:
Failed to find any Kerberos tgt)]; Host Details : local host is:
"ctr-e136-1513029738776-12293-01-000004.hwx.site/172.27.22.148"; destination
host is: "ctr-e136-1513029738776-12293-01-000004.hwx.site":8020; , while
invoking ClientNamenodeProtocolTranslatorPB.create over
ctr-e136-1513029738776-12293-01-000004.hwx.site/172.27.22.148:8020 after 12
failover attempts. Trying to failover after sleeping for 15905ms.
{code}
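The excerpt above shows the failure at a single point in time. To see when such entries first appear relative to the 24-hour mark, the log could be scanned for them. The following is only a minimal sketch under stated assumptions: the class name `KerberosFailureScan` and the method `findFailures` are hypothetical, and the entry-header pattern is inferred from the excerpt (`LEVEL [yyyy-MM-dd HH:mm:ss,SSS]`) rather than taken from the Zeppelin logging configuration.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: lists timestamps of log entries that report a missing Kerberos TGT. */
public class KerberosFailureScan {
    // Assumed entry header, modelled on the excerpt: "WARN [2017-12-19 13:18:48,128] ..."
    private static final Pattern ENTRY_START = Pattern.compile(
        "^\\s*(?:TRACE|DEBUG|INFO|WARN|ERROR|FATAL)\\s+"
        + "\\[(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3})\\]");
    private static final DateTimeFormatter TS =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");
    // Message text copied from the excerpt.
    private static final String MARKER = "Failed to find any Kerberos tgt";

    /** Returns the timestamp of every entry whose text contains MARKER, in input order. */
    public static List<LocalDateTime> findFailures(List<String> lines) {
        List<LocalDateTime> hits = new ArrayList<>();
        LocalDateTime current = null;           // timestamp of the entry being collected
        StringBuilder entry = new StringBuilder();
        for (String line : lines) {
            Matcher m = ENTRY_START.matcher(line);
            if (m.find()) {
                // A new header closes the previous entry.
                record(current, entry, hits);
                current = LocalDateTime.parse(m.group(1), TS);
                entry.setLength(0);
            }
            // Join continuation lines with a space so wrapped messages still match.
            entry.append(line.trim()).append(' ');
        }
        record(current, entry, hits);           // flush the last entry
        return hits;
    }

    private static void record(LocalDateTime ts, StringBuilder entry, List<LocalDateTime> hits) {
        if (ts != null && entry.toString().contains(MARKER)) {
            hits.add(ts);
        }
    }
}
```

Passing the lines of the Zeppelin server log to `findFailures` (for example via `java.nio.file.Files.readAllLines`) would give the time of each affected entry. The earliest value could then be compared with the time the test started. This only locates the entries and does not by itself identify the cause.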