[GitHub] zeppelin issue #2886: ZEPPELIN-3356: Zeppelin FileSystemStorage reloginFromK...

2018-04-12 Thread zjffdu
Github user zjffdu commented on the issue:

https://github.com/apache/zeppelin/pull/2886
  
@gss2002 I just found that the root cause is 
UserGroupInformation.loginUserFromKeytab being called multiple times, and I 
created PR #2924 to fix it. I have verified the fix; could you help verify it 
as well if you have time?
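
As a rough illustration of guarding against repeated logins, here is a minimal sketch in plain Java. It is not taken from PR #2924 and may differ from what that PR does; `OnceOnlyLogin` and `LoginAction` are hypothetical names, and the login action stands in for the real keytab login call.

```java
// Hypothetical illustration: none of these names come from Zeppelin or Hadoop.
final class OnceOnlyLogin {
    /** Stand-in for the real keytab login call (an assumption of this sketch). */
    interface LoginAction {
        void login();
    }

    private final LoginAction action;
    private boolean loggedIn = false;

    OnceOnlyLogin(LoginAction action) {
        this.action = action;
    }

    /** Runs the login action on the first call only; returns true if it ran. */
    synchronized boolean ensureLoggedIn() {
        if (loggedIn) {
            return false; // already logged in, do not contact the KDC again
        }
        action.login();
        loggedIn = true;
        return true;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        OnceOnlyLogin guard = new OnceOnlyLogin(() -> calls[0]++);
        guard.ensureLoggedIn();
        guard.ensureLoggedIn();
        System.out.println("login attempts: " + calls[0]);
    }
}
```

The `synchronized` method keeps the guard safe if several threads reach it at once; a failed login throws before the flag is set, so a later call can retry.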


---


[GitHub] zeppelin issue #2886: ZEPPELIN-3356: Zeppelin FileSystemStorage reloginFromK...

2018-03-29 Thread gss2002
Github user gss2002 commented on the issue:

https://github.com/apache/zeppelin/pull/2886
  
@prabhjyotsingh @zjffdu I made changes to check whether security is enabled 
and whether the login was done via a keytab, and then I relogin with the 
checkTGT method instead of relogging in every time, which causes excess load on 
the KDC.
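
A minimal sketch of that decision flow, written against a hypothetical `LoginState` interface so it runs without Hadoop on the classpath. In Zeppelin the three answers would presumably come from Hadoop's UserGroupInformation; the interface and method names here are assumptions made for illustration only.

```java
// Hypothetical illustration of the "check first, then relogin" flow.
final class ReloginPolicy {
    /** Assumed view of the login state; not a Hadoop or Zeppelin type. */
    interface LoginState {
        boolean securityEnabled();
        boolean loggedInFromKeytab();
        /** Expected to relogin only when the TGT is close to expiry. */
        void checkTgtAndRelogin();
    }

    /** Returns true if a TGT check (and possible relogin) was attempted. */
    static boolean maybeRelogin(LoginState state) {
        if (!state.securityEnabled() || !state.loggedInFromKeytab()) {
            return false; // nothing to renew, so the KDC is never contacted
        }
        state.checkTgtAndRelogin();
        return true;
    }
}
```

Keeping the two cheap checks in front means an unsecured cluster, or a login that did not use a keytab, never triggers a relogin attempt.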


---


[GitHub] zeppelin issue #2886: ZEPPELIN-3356: Zeppelin FileSystemStorage reloginFromK...

2018-03-29 Thread gss2002
Github user gss2002 commented on the issue:

https://github.com/apache/zeppelin/pull/2886
  
@zjffdu I am going to cut a new, improved fix based on the original feedback. 
But yes, you will have to adjust the KDC to test this, as Java does not use 
ticket_lifetime or renew_lifetime from krb5.conf; per the links below, this 
was not fixed until Java 9.


https://stackoverflow.com/questions/38555244/how-do-you-set-the-kerberos-ticket-lifetime-from-java
https://bugs.openjdk.java.net/browse/JDK-8044500


---


[GitHub] zeppelin issue #2886: ZEPPELIN-3356: Zeppelin FileSystemStorage reloginFromK...

2018-03-28 Thread zjffdu
Github user zjffdu commented on the issue:

https://github.com/apache/zeppelin/pull/2886
  
@gss2002 Do you mean I have to change /var/kerberos/krb5kdc/kdc.conf to 
reproduce this issue?


---


[GitHub] zeppelin issue #2886: ZEPPELIN-3356: Zeppelin FileSystemStorage reloginFromK...

2018-03-28 Thread gss2002
Github user gss2002 commented on the issue:

https://github.com/apache/zeppelin/pull/2886
  

https://stackoverflow.com/questions/38555244/how-do-you-set-the-kerberos-ticket-lifetime-from-java
https://bugs.openjdk.java.net/browse/JDK-8044500


---


[GitHub] zeppelin issue #2886: ZEPPELIN-3356: Zeppelin FileSystemStorage reloginFromK...

2018-03-28 Thread gss2002
Github user gss2002 commented on the issue:

https://github.com/apache/zeppelin/pull/2886
  
@zjffdu you cannot just update krb5.conf; those settings are only 
recommendations on the client side. The KDC controls the maximum renewable 
lifetime: with MIT Krb5 via max_renewable_life in 
/var/kerberos/krb5kdc/kdc.conf, and with Active Directory via settings in the 
Windows registry. My co-worker and I tested this today, and the ticket is still 
renewable because the KDC controls the max time, and it looks as if Java takes 
its info from the KDC. Using the CLI kinit/klist and hadoop fs, the ticket is 
expired. But from the looks of it, when logging in with a keytab via UGI, which 
Zeppelin does for the HDFS calls, it takes the settings from the KDC.

See below:
JDK - KRB5 DEBUG OUTPUT from Zeppelin JVM:
 
Native config name: /etc/krb5.conf
Loaded from native config
>>> KdcAccessibility: reset
>>> KdcAccessibility: reset
>>> KeyTabInputStream, readName(): UNIT.HDP.EXAMPLE.COM
>>> KeyTabInputStream, readName(): zeppelin-unit
>>> KeyTab: load() entry length: 88; type: 18
>>> KeyTabInputStream, readName(): UNIT.HDP.EXAMPLE.COM
>>> KeyTabInputStream, readName(): zeppelin-unit
>>> KeyTab: load() entry length: 72; type: 17
>>> KeyTabInputStream, readName(): UNIT.HDP.EXAMPLE.COM
>>> KeyTabInputStream, readName(): zeppelin-unit
>>> KeyTab: load() entry length: 72; type: 23
Looking for keys for: zeppelin-u...@unit.hdp.example.com
Added key: 23version: 2
Added key: 17version: 2
Added key: 18version: 2
Looking for keys for: zeppelin-u...@unit.hdp.example.com
Added key: 23version: 2
Added key: 17version: 2
Added key: 18version: 2
Using builtin default etypes for default_tkt_enctypes
default etypes for default_tkt_enctypes: 18 17 16 23.
>>> KrbAsReq creating message
>>> KrbKdcReq send: kdc=ha21d51kd.unit.hdp.example.com TCP:88, timeout=3, number of retries =3, #bytes=174
>>> KDCCommunication: kdc=ha21d51kd.unit.hdp.example.com TCP:88, timeout=3,Attempt =1, #bytes=174
>>>DEBUG: TCPClient reading 769 bytes
>>> KrbKdcReq send: #bytes read=769
>>> KdcAccessibility: remove ha21d51kd.unit.hdp.example.com
Looking for keys for: zeppelin-u...@unit.hdp.example.com
Added key: 23version: 2
Added key: 17version: 2
Added key: 18version: 2
>>> EType: sun.security.krb5.internal.crypto.Aes256CtsHmacSha1EType
>>> KrbAsRep cons in KrbAsReq.getReply zeppelin-unit
Found ticket for zeppelin-u...@unit.hdp.example.com to go to krbtgt/unit.hdp.example@unit.hdp.example.com expiring on Wed Mar 28 23:28:46 EDT 2018
Entered Krb5Context.initSecContext with state=STATE_NEW
Found ticket for zeppelin-u...@unit.hdp.example.com to go to krbtgt/unit.hdp.example@unit.hdp.example.com expiring on Wed Mar 28 23:28:46 EDT 2018
Service ticket not found in the subject
>>> Credentials acquireServiceCreds: same realm
Using builtin default etypes for default_tgs_enctypes
default etypes for default_tgs_enctypes: 18 17 16 23.
 
 
The following shows that Zeppelin was started after /etc/krb5.conf was 
modified to set ticket_lifetime/renew_lifetime to 2m/5m:
 
[root@ha21d55en zeppelin]# ps guaxww | grep -i zeppelin
zeppelin  89982  2.4  3.6 6872888 601888 ?  Sl   13:28   0:30 /usr/jdk64/jdk1.8.0_102/bin/java -Dsun.security.krb5.debug=true -Dhdp.version=2.5.3.18-5 -Dspark.executor.memory=512m -Dspark.yarn.queue=default -Dfile.encoding=UTF-8 -Xms1024m -Xmx1024m -XX:MaxPermSize=512m -Dlog4j.configuration=file:///usr/local/zeppelin/current/conf/log4j.properties -Dzeppelin.log.file=/var/log/zeppelin/zeppelin-zeppelin-ha21d55en.unit.hdp.example.com.log -cp ::/usr/local/zeppelin/current/lib/interpreter/*:/usr/local/zeppelin/current/lib/*:/usr/local/zeppelin/current/*::/usr/local/zeppelin/current/conf:/etc/hadoop/conf org.apache.zeppelin.server.ZeppelinServer
zeppelin  90439  0.0  0.0 113124  1524 ?  S    13:30   0:00 /bin/bash /usr/local/zeppelin/current/bin/interpreter.sh -d /usr/local/zeppelin/current/interpreter/livy -c 10.70.57.5 -p 41478 -r : -l /usr/local/zeppelin/current/local-repo/livy1 -g livy1
zeppelin  90454  0.0  0.0 113120   836 ?  S    13:30   0:00 /bin/bash /usr/local/zeppelin/current/bin/interpreter.sh -d /usr/local/zeppelin/current/interpreter/livy -c 10.70.57.5 -p 41478 -r : -l /usr/local/zeppelin/current/local-repo/livy1 -g livy1
zeppelin  90455  0.3  1.3 5198944 214228 ?  Sl   13:30   0:04 /usr/jdk64/jdk1.8.0_102/bin/java -Dfile.encoding=UTF-8 -Dlog4j.configuration=file:///usr/local/zeppelin/current/conf/log4j.properties -Dzeppelin.log.file=/var/log/zeppelin/zeppelin-interpreter-livy1-zeppelin-ha21d55en.unit.hdp.example.com.log -Xms1024m -Xmx1024m -XX:MaxPermSize=512m -cp :/usr/local/zeppelin/current/interpreter/livy/*:/usr/local/zeppelin/current/lib/interpreter/*: org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer 10.70.57.5 41478


---


[GitHub] zeppelin issue #2886: ZEPPELIN-3356: Zeppelin FileSystemStorage reloginFromK...

2018-03-27 Thread zjffdu
Github user zjffdu commented on the issue:

https://github.com/apache/zeppelin/pull/2886
  
@gss2002 I could not reproduce the issue. I changed krb5.conf as 
follows:
```
  renew_lifetime = 7min
  ticket_lifetime = 3min
```

And after one day, Zeppelin still works properly.


---


[GitHub] zeppelin issue #2886: ZEPPELIN-3356: Zeppelin FileSystemStorage reloginFromK...

2018-03-23 Thread gss2002
Github user gss2002 commented on the issue:

https://github.com/apache/zeppelin/pull/2886
  
@prabhjyotsingh I just read that same Stack Overflow post. Part of me says to 
use checkTGTAndReloginFromKeytab to be lighter on the KDC. Thoughts? I will dig 
a bit deeper in the morning, but the auto-renewal thread that exists in UGI 
cannot go beyond the max renewal lifetime.
@felixcheung I think you are right: calling 
UserGroupInformation.getCurrentUser().checkTGTAndReloginFromKeytab() would work 
too.

private void reloginFromKeytab(boolean checkTGT) throws IOException {
  if (!shouldRelogin() || !isFromKeytab()) {
    return;
  }
  HadoopLoginContext login = getLogin();
  if (login == null) {
    throw new KerberosAuthException(MUST_FIRST_LOGIN_FROM_KEYTAB);
  }
  if (checkTGT) {
    KerberosTicket tgt = getTGT();
    if (tgt != null && !shouldRenewImmediatelyForTests &&
        Time.now() < getRefreshTime(tgt)) {
      return;
    }
  }
  relogin(login);
}
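
The checkTGT branch above returns early while the current time is still before getRefreshTime(tgt). A minimal sketch of that time check in plain Java follows, assuming the refresh point sits at 80% of the ticket's lifetime (which is what I believe UGI uses, but treat the fraction as an assumption). `TgtRefreshWindow` and its methods are hypothetical names, not Hadoop APIs.

```java
// Hypothetical illustration of a "relogin only near expiry" time check.
final class TgtRefreshWindow {
    // Assumed fraction of the ticket lifetime after which a relogin is due.
    static final double RENEW_WINDOW = 0.80;

    /** Time (epoch millis) from which a relogin should be attempted. */
    static long refreshTimeMillis(long startMillis, long endMillis) {
        return startMillis + (long) ((endMillis - startMillis) * RENEW_WINDOW);
    }

    /** True once "now" has reached the refresh time of the ticket. */
    static boolean shouldRelogin(long nowMillis, long startMillis, long endMillis) {
        return nowMillis >= refreshTimeMillis(startMillis, endMillis);
    }

    public static void main(String[] args) {
        long start = 0L;
        long end = 10 * 60 * 1000L; // a ten-minute ticket, for illustration
        System.out.println("refresh at ms: " + refreshTimeMillis(start, end));
        System.out.println("relogin at 5 min? " + shouldRelogin(5 * 60 * 1000L, start, end));
        System.out.println("relogin at 9 min? " + shouldRelogin(9 * 60 * 1000L, start, end));
    }
}
```

With this kind of check, callers can invoke the relogin path on every HDFS operation and still reach the KDC only when the ticket is close to expiring.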


---


[GitHub] zeppelin issue #2886: ZEPPELIN-3356: Zeppelin FileSystemStorage reloginFromK...

2018-03-22 Thread zjffdu
Github user zjffdu commented on the issue:

https://github.com/apache/zeppelin/pull/2886
  
Thanks @gss2002 will review it soon


---


[GitHub] zeppelin issue #2886: ZEPPELIN-3356: Zeppelin FileSystemStorage reloginFromK...

2018-03-22 Thread gss2002
Github user gss2002 commented on the issue:

https://github.com/apache/zeppelin/pull/2886
  
@prabhjyotsingh  @zjffdu can you help review if you feel this is a valid 
fix?
Thanks again


---


[GitHub] zeppelin issue #2886: ZEPPELIN-3356: Zeppelin FileSystemStorage reloginFromK...

2018-03-21 Thread gss2002
Github user gss2002 commented on the issue:

https://github.com/apache/zeppelin/pull/2886
  
@zjffdu here is a patch that I think will fix this issue. I will know in 7 
days whether the issue comes back, but it has plagued our 4 different 
environments running Zeppelin over the last few days, ever since the max 
timeout was reached. Let me know your thoughts on this patch. Also, the CI 
failures look to be unrelated.


---