[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...

2018-05-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21178


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...

2018-05-02 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/21178#discussion_r185432085
  
--- Diff: 
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIService.scala
 ---
@@ -52,8 +52,22 @@ private[hive] class SparkSQLCLIService(hiveServer: 
HiveServer2, sqlContext: SQLC
 
 if (UserGroupInformation.isSecurityEnabled) {
   try {
-HiveAuthFactory.loginFromKeytab(hiveConf)
-sparkServiceUGI = Utils.getUGI()
+val principal = 
hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_PRINCIPAL)
+val keyTabFile = 
hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_KEYTAB)
+if (principal.isEmpty || keyTabFile.isEmpty) {
+  throw new IOException(
+"HiveServer2 Kerberos principal or keytab is not correctly 
configured")
+}
+
+val originalUgi = UserGroupInformation.getCurrentUser
--- End diff --

So shall we change to `Utils.getUGI`? Basically it is the same. But as 
@mridulm mentioned, someone may set this "HADOOP_USER_NAME", though doAs is not 
worked in STS.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...

2018-05-02 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/21178#discussion_r185427751
  
--- Diff: 
sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java
 ---
@@ -92,7 +95,26 @@ public String getAuthName() {
   public static final String HS2_PROXY_USER = "hive.server2.proxy.user";
   public static final String HS2_CLIENT_TOKEN = "hiveserver2ClientToken";
 
-  public HiveAuthFactory(HiveConf conf) throws TTransportException {
+  private static Field keytabFile = null;
+  private static Method getKeytab = null;
+  static {
+Class clz = UserGroupInformation.class;
+try {
+  keytabFile = clz.getDeclaredField("keytabFile");
+  keytabFile.setAccessible(true);
+} catch (NoSuchFieldException nfe) {
+  keytabFile = null;
--- End diff --

Ok, I just remember Steve commented on some JIRAs about this thing, but I 
cannot remember where it is. Anyway I will log some logs here.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...

2018-05-02 Thread mridulm
Github user mridulm commented on a diff in the pull request:

https://github.com/apache/spark/pull/21178#discussion_r185423983
  
--- Diff: 
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIService.scala
 ---
@@ -52,8 +52,22 @@ private[hive] class SparkSQLCLIService(hiveServer: 
HiveServer2, sqlContext: SQLC
 
 if (UserGroupInformation.isSecurityEnabled) {
   try {
-HiveAuthFactory.loginFromKeytab(hiveConf)
-sparkServiceUGI = Utils.getUGI()
+val principal = 
hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_PRINCIPAL)
+val keyTabFile = 
hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_KEYTAB)
+if (principal.isEmpty || keyTabFile.isEmpty) {
+  throw new IOException(
+"HiveServer2 Kerberos principal or keytab is not correctly 
configured")
+}
+
+val originalUgi = UserGroupInformation.getCurrentUser
--- End diff --

Ah ok, Utils.getUGI handles proxying in HS2 iirc - this is, currently, not 
relevant in our case imo.
You are right, we could have simply used `Utils.getUGI` instead of 
`UserGroupInformation.getCurrentUser` (it would have been more appropriate as 
well).
Something to keep in mind in case users are relying on HADOOP_USER_NAME for 
STS (highly doubt it, but you never know !)


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...

2018-05-02 Thread mridulm
Github user mridulm commented on a diff in the pull request:

https://github.com/apache/spark/pull/21178#discussion_r185423072
  
--- Diff: 
sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java
 ---
@@ -92,7 +95,26 @@ public String getAuthName() {
   public static final String HS2_PROXY_USER = "hive.server2.proxy.user";
   public static final String HS2_CLIENT_TOKEN = "hiveserver2ClientToken";
 
-  public HiveAuthFactory(HiveConf conf) throws TTransportException {
+  private static Field keytabFile = null;
+  private static Method getKeytab = null;
+  static {
+Class clz = UserGroupInformation.class;
+try {
+  keytabFile = clz.getDeclaredField("keytabFile");
+  keytabFile.setAccessible(true);
+} catch (NoSuchFieldException nfe) {
+  keytabFile = null;
--- End diff --

I am not sure what you mean by that statement.
```
$ $ find sql/hive-thriftserver/ -type f | grep scala$ | xargs grep -i 
'log.*"' | wc -l
  42
```

Log messages help us debug issues without needing to modify code; and this 
is an excellent case of where logging at appropriate level will help us unearth 
issues in production.
I am not sure what the concern here is.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...

2018-05-02 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21178#discussion_r185413718
  
--- Diff: 
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIService.scala
 ---
@@ -52,8 +52,22 @@ private[hive] class SparkSQLCLIService(hiveServer: 
HiveServer2, sqlContext: SQLC
 
 if (UserGroupInformation.isSecurityEnabled) {
   try {
-HiveAuthFactory.loginFromKeytab(hiveConf)
-sparkServiceUGI = Utils.getUGI()
+val principal = 
hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_PRINCIPAL)
+val keyTabFile = 
hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_KEYTAB)
+if (principal.isEmpty || keyTabFile.isEmpty) {
+  throw new IOException(
+"HiveServer2 Kerberos principal or keytab is not correctly 
configured")
+}
+
+val originalUgi = UserGroupInformation.getCurrentUser
--- End diff --

@mridulm thanks for your answer. Actually my question was a bit different, 
but @jerryshao answered me: which is the difference between 
`UserGroupInformation.getCurrentUser` and `Utils.getUGI()`? Since 
`Utils.getUGI()` does the same unless `HADOOP_USER_NAME` is set, my feeling was 
that using either one or the other function is the same here, so I just 
wondered why using one on one place and the other few lines after and not 
always the same one: I just wanted to make sure that they are actually the 
same. Thanks for you kind answers.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...

2018-05-02 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/21178#discussion_r185407534
  
--- Diff: 
sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java
 ---
@@ -92,7 +95,26 @@ public String getAuthName() {
   public static final String HS2_PROXY_USER = "hive.server2.proxy.user";
   public static final String HS2_CLIENT_TOKEN = "hiveserver2ClientToken";
 
-  public HiveAuthFactory(HiveConf conf) throws TTransportException {
+  private static Field keytabFile = null;
+  private static Method getKeytab = null;
+  static {
+Class clz = UserGroupInformation.class;
+try {
+  keytabFile = clz.getDeclaredField("keytabFile");
+  keytabFile.setAccessible(true);
+} catch (NoSuchFieldException nfe) {
+  keytabFile = null;
--- End diff --

In the current STS code, we avoid using any log method. You can check the 
code, all the log method which used before (in Spark1) is removed now. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...

2018-05-02 Thread mridulm
Github user mridulm commented on a diff in the pull request:

https://github.com/apache/spark/pull/21178#discussion_r185404453
  
--- Diff: 
sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java
 ---
@@ -92,7 +95,26 @@ public String getAuthName() {
   public static final String HS2_PROXY_USER = "hive.server2.proxy.user";
   public static final String HS2_CLIENT_TOKEN = "hiveserver2ClientToken";
 
-  public HiveAuthFactory(HiveConf conf) throws TTransportException {
+  private static Field keytabFile = null;
+  private static Method getKeytab = null;
+  static {
+Class clz = UserGroupInformation.class;
+try {
+  keytabFile = clz.getDeclaredField("keytabFile");
+  keytabFile.setAccessible(true);
+} catch (NoSuchFieldException nfe) {
+  keytabFile = null;
--- End diff --

nit: We should include a debug message (not INFO) indicating the exception.
Same for getKeytab below.

This will help debug in case the signatures of these private methods 
changes later, and we are not sure what is happening.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...

2018-05-02 Thread mridulm
Github user mridulm commented on a diff in the pull request:

https://github.com/apache/spark/pull/21178#discussion_r185406504
  
--- Diff: 
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIService.scala
 ---
@@ -52,8 +52,22 @@ private[hive] class SparkSQLCLIService(hiveServer: 
HiveServer2, sqlContext: SQLC
 
 if (UserGroupInformation.isSecurityEnabled) {
   try {
-HiveAuthFactory.loginFromKeytab(hiveConf)
-sparkServiceUGI = Utils.getUGI()
+val principal = 
hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_PRINCIPAL)
+val keyTabFile = 
hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_KEYTAB)
+if (principal.isEmpty || keyTabFile.isEmpty) {
+  throw new IOException(
+"HiveServer2 Kerberos principal or keytab is not correctly 
configured")
+}
+
+val originalUgi = UserGroupInformation.getCurrentUser
--- End diff --

Assuming I understood your question @mgaido91, we use Utils.getUGI only 
when we do an explicit login; else fallback on originalUgi (the current user).

This is because after a login, the instance of ugi changes from underneath 
us in UserGroupInformation class.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...

2018-04-28 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/21178#discussion_r184844613
  
--- Diff: 
sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java
 ---
@@ -362,4 +371,34 @@ public static void verifyProxyAccess(String realUser, 
String proxyUser, String i
 }
   }
 
+  public static boolean needUgiLogin(UserGroupInformation ugi, String 
principal, String keytab) {
+return null == ugi || !ugi.hasKerberosCredentials() || 
!ugi.getUserName().equals(principal) ||
+  !keytab.equals(getKeytabFromUgi());
+  }
+
+  private static String getKeytabFromUgi() {
+Class clz = UserGroupInformation.class;
+try {
+  synchronized (clz) {
+Field field = clz.getDeclaredField("keytabFile");
+field.setAccessible(true);
+return (String) field.get(null);
+  }
+} catch (NoSuchFieldException e) {
+  try {
+synchronized (clz) {
+  // In Hadoop 3 we don't have "keytabFile" field, instead we 
should use private method
+  // getKeytab().
+  Method method = clz.getDeclaredMethod("getKeytab");
+  method.setAccessible(true);
+  return (String) 
method.invoke(UserGroupInformation.getCurrentUser());
--- End diff --

Sure, I will change it.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...

2018-04-27 Thread mridulm
Github user mridulm commented on a diff in the pull request:

https://github.com/apache/spark/pull/21178#discussion_r184841250
  
--- Diff: 
sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java
 ---
@@ -362,4 +371,34 @@ public static void verifyProxyAccess(String realUser, 
String proxyUser, String i
 }
   }
 
+  public static boolean needUgiLogin(UserGroupInformation ugi, String 
principal, String keytab) {
+return null == ugi || !ugi.hasKerberosCredentials() || 
!ugi.getUserName().equals(principal) ||
+  !keytab.equals(getKeytabFromUgi());
+  }
+
+  private static String getKeytabFromUgi() {
+Class clz = UserGroupInformation.class;
+try {
+  synchronized (clz) {
+Field field = clz.getDeclaredField("keytabFile");
+field.setAccessible(true);
+return (String) field.get(null);
+  }
+} catch (NoSuchFieldException e) {
+  try {
+synchronized (clz) {
+  // In Hadoop 3 we don't have "keytabFile" field, instead we 
should use private method
+  // getKeytab().
+  Method method = clz.getDeclaredMethod("getKeytab");
+  method.setAccessible(true);
+  return (String) 
method.invoke(UserGroupInformation.getCurrentUser());
--- End diff --

It is called twice right now; but as a util method which can be used in 
other place, let us not intentionally introduce known inefficiencies.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...

2018-04-27 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/21178#discussion_r184833705
  
--- Diff: 
sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java
 ---
@@ -362,4 +371,34 @@ public static void verifyProxyAccess(String realUser, 
String proxyUser, String i
 }
   }
 
+  public static boolean needUgiLogin(UserGroupInformation ugi, String 
principal, String keytab) {
+return null == ugi || !ugi.hasKerberosCredentials() || 
!ugi.getUserName().equals(principal) ||
+  !keytab.equals(getKeytabFromUgi());
+  }
+
+  private static String getKeytabFromUgi() {
+Class clz = UserGroupInformation.class;
+try {
+  synchronized (clz) {
+Field field = clz.getDeclaredField("keytabFile");
+field.setAccessible(true);
+return (String) field.get(null);
+  }
+} catch (NoSuchFieldException e) {
+  try {
+synchronized (clz) {
+  // In Hadoop 3 we don't have "keytabFile" field, instead we 
should use private method
+  // getKeytab().
+  Method method = clz.getDeclaredMethod("getKeytab");
+  method.setAccessible(true);
+  return (String) 
method.invoke(UserGroupInformation.getCurrentUser());
--- End diff --

This will only be called twice in the initialization stage, so there should 
not be large overhead.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...

2018-04-27 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/21178#discussion_r184833443
  
--- Diff: 
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIService.scala
 ---
@@ -52,8 +52,22 @@ private[hive] class SparkSQLCLIService(hiveServer: 
HiveServer2, sqlContext: SQLC
 
 if (UserGroupInformation.isSecurityEnabled) {
   try {
-HiveAuthFactory.loginFromKeytab(hiveConf)
-sparkServiceUGI = Utils.getUGI()
+val principal = 
hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_PRINCIPAL)
+val keyTabFile = 
hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_KEYTAB)
+if (principal.isEmpty || keyTabFile.isEmpty) {
+  throw new IOException(
+"HiveServer2 Kerberos principal or keytab is not correctly 
configured")
+}
+
+val originalUgi = UserGroupInformation.getCurrentUser
--- End diff --

I don't think there's any particular reason, we just copy what HS2 did 
before.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...

2018-04-27 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/21178#discussion_r184833381
  
--- Diff: 
sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java
 ---
@@ -18,14 +18,11 @@
 package org.apache.hive.service.auth;
 
 import java.io.IOException;
+import java.lang.reflect.Field;
+import java.lang.reflect.Method;
 import java.net.InetSocketAddress;
 import java.net.UnknownHostException;
-import java.util.ArrayList;
-import java.util.Arrays;
-import java.util.HashMap;
-import java.util.List;
-import java.util.Locale;
-import java.util.Map;
+import java.util.*;
--- End diff --

This is automatically done by my intellij idea, will revert back.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...

2018-04-27 Thread mridulm
Github user mridulm commented on a diff in the pull request:

https://github.com/apache/spark/pull/21178#discussion_r184784052
  
--- Diff: 
sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java
 ---
@@ -362,4 +371,34 @@ public static void verifyProxyAccess(String realUser, 
String proxyUser, String i
 }
   }
 
+  public static boolean needUgiLogin(UserGroupInformation ugi, String 
principal, String keytab) {
+return null == ugi || !ugi.hasKerberosCredentials() || 
!ugi.getUserName().equals(principal) ||
+  !keytab.equals(getKeytabFromUgi());
+  }
+
+  private static String getKeytabFromUgi() {
+Class clz = UserGroupInformation.class;
+try {
+  synchronized (clz) {
+Field field = clz.getDeclaredField("keytabFile");
+field.setAccessible(true);
+return (String) field.get(null);
+  }
+} catch (NoSuchFieldException e) {
+  try {
+synchronized (clz) {
+  // In Hadoop 3 we don't have "keytabFile" field, instead we 
should use private method
+  // getKeytab().
+  Method method = clz.getDeclaredMethod("getKeytab");
+  method.setAccessible(true);
+  return (String) 
method.invoke(UserGroupInformation.getCurrentUser());
--- End diff --

You dont want to reflect on class to get to field/method each time method 
is invoked.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...

2018-04-27 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21178#discussion_r184774268
  
--- Diff: 
sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java
 ---
@@ -18,14 +18,11 @@
 package org.apache.hive.service.auth;
 
 import java.io.IOException;
+import java.lang.reflect.Field;
+import java.lang.reflect.Method;
 import java.net.InetSocketAddress;
 import java.net.UnknownHostException;
-import java.util.ArrayList;
-import java.util.Arrays;
-import java.util.HashMap;
-import java.util.List;
-import java.util.Locale;
-import java.util.Map;
+import java.util.*;
--- End diff --

nit: I have always seen that we are trying to avoid to have imports of `.*`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...

2018-04-27 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21178#discussion_r184774034
  
--- Diff: 
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIService.scala
 ---
@@ -52,8 +52,22 @@ private[hive] class SparkSQLCLIService(hiveServer: 
HiveServer2, sqlContext: SQLC
 
 if (UserGroupInformation.isSecurityEnabled) {
   try {
-HiveAuthFactory.loginFromKeytab(hiveConf)
-sparkServiceUGI = Utils.getUGI()
+val principal = 
hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_PRINCIPAL)
+val keyTabFile = 
hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_KEYTAB)
+if (principal.isEmpty || keyTabFile.isEmpty) {
+  throw new IOException(
+"HiveServer2 Kerberos principal or keytab is not correctly 
configured")
+}
+
+val originalUgi = UserGroupInformation.getCurrentUser
--- End diff --

just curious, why here we are using `UserGroupInformation.getCurrentUser` 
and a few lines after `Utils.getUGI()`? I see that they basically do the same, 
but I am  interested if there are reasons for doing this.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...

2018-04-27 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/21178#discussion_r184672313
  
--- Diff: 
sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java
 ---
@@ -362,4 +371,34 @@ public static void verifyProxyAccess(String realUser, 
String proxyUser, String i
 }
   }
 
+  public static boolean needUgiLogin(UserGroupInformation ugi, String 
principal, String keytab) {
+return null == ugi || !ugi.hasKerberosCredentials() || 
!ugi.getUserName().equals(principal) ||
+  !keytab.equals(getKeytabFromUgi());
+  }
+
+  private static String getKeytabFromUgi() {
+Class clz = UserGroupInformation.class;
+try {
+  synchronized (clz) {
+Field field = clz.getDeclaredField("keytabFile");
+field.setAccessible(true);
+return (String) field.get(null);
+  }
+} catch (NoSuchFieldException e) {
+  try {
+synchronized (clz) {
+  // In Hadoop 3 we don't have "keytabFile" field, instead we 
should use private method
+  // getKeytab().
+  Method method = clz.getDeclaredMethod("getKeytab");
+  method.setAccessible(true);
+  return (String) 
method.invoke(UserGroupInformation.getCurrentUser());
--- End diff --

What is the purpose of moving both field and method out of this method? I'm 
not sure is there any difference.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...

2018-04-27 Thread mridulm
Github user mridulm commented on a diff in the pull request:

https://github.com/apache/spark/pull/21178#discussion_r184625530
  
--- Diff: 
sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java
 ---
@@ -362,4 +371,34 @@ public static void verifyProxyAccess(String realUser, 
String proxyUser, String i
 }
   }
 
+  public static boolean needUgiLogin(UserGroupInformation ugi, String 
principal, String keytab) {
+return null == ugi || !ugi.hasKerberosCredentials() || 
!ugi.getUserName().equals(principal) ||
+  !keytab.equals(getKeytabFromUgi());
--- End diff --

`Object.equals` for keytab check ?
I dont think we invoke this when `keytab` is null currently, but would be 
good to future proof this.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...

2018-04-27 Thread mridulm
Github user mridulm commented on a diff in the pull request:

https://github.com/apache/spark/pull/21178#discussion_r184626752
  
--- Diff: 
sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java
 ---
@@ -362,4 +371,34 @@ public static void verifyProxyAccess(String realUser, 
String proxyUser, String i
 }
   }
 
+  public static boolean needUgiLogin(UserGroupInformation ugi, String 
principal, String keytab) {
+return null == ugi || !ugi.hasKerberosCredentials() || 
!ugi.getUserName().equals(principal) ||
+  !keytab.equals(getKeytabFromUgi());
+  }
+
+  private static String getKeytabFromUgi() {
+Class clz = UserGroupInformation.class;
+try {
+  synchronized (clz) {
+Field field = clz.getDeclaredField("keytabFile");
+field.setAccessible(true);
+return (String) field.get(null);
+  }
+} catch (NoSuchFieldException e) {
+  try {
+synchronized (clz) {
+  // In Hadoop 3 we don't have "keytabFile" field, instead we 
should use private method
+  // getKeytab().
+  Method method = clz.getDeclaredMethod("getKeytab");
+  method.setAccessible(true);
+  return (String) 
method.invoke(UserGroupInformation.getCurrentUser());
+}
+  } catch (Throwable t) {
+return null;
+  }
+} catch (Throwable t) {
+  return null;
+}
--- End diff --

Let us catch specific exceptions for method/field invocation; not throwable.
The code I had initially written was poor in this regard btw, and was 
catching throwable - was bad form.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...

2018-04-27 Thread mridulm
Github user mridulm commented on a diff in the pull request:

https://github.com/apache/spark/pull/21178#discussion_r184625003
  
--- Diff: 
sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java
 ---
@@ -107,9 +109,16 @@ public HiveAuthFactory(HiveConf conf) throws 
TTransportException {
 authTypeStr = AuthTypes.NONE.getAuthName();
   }
   if (authTypeStr.equalsIgnoreCase(AuthTypes.KERBEROS.getAuthName())) {
-saslServer = ShimLoader.getHadoopThriftAuthBridge()
-  .createServer(conf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_KEYTAB),
-
conf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_PRINCIPAL));
+String principal = 
conf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_PRINCIPAL);
+String keytab = conf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_KEYTAB);
+if (needUgiLogin(UserGroupInformation.getCurrentUser(),
+  SecurityUtil.getServerPrincipal(principal, "0.0.0.0"), keytab)) {
--- End diff --

This will always return `false` : though the additional check will not hurt.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...

2018-04-27 Thread mridulm
Github user mridulm commented on a diff in the pull request:

https://github.com/apache/spark/pull/21178#discussion_r184626560
  
--- Diff: 
sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java
 ---
@@ -362,4 +371,34 @@ public static void verifyProxyAccess(String realUser, 
String proxyUser, String i
 }
   }
 
+  public static boolean needUgiLogin(UserGroupInformation ugi, String 
principal, String keytab) {
+return null == ugi || !ugi.hasKerberosCredentials() || 
!ugi.getUserName().equals(principal) ||
+  !keytab.equals(getKeytabFromUgi());
+  }
+
+  private static String getKeytabFromUgi() {
+Class clz = UserGroupInformation.class;
+try {
+  synchronized (clz) {
+Field field = clz.getDeclaredField("keytabFile");
+field.setAccessible(true);
+return (String) field.get(null);
+  }
+} catch (NoSuchFieldException e) {
+  try {
+synchronized (clz) {
+  // In Hadoop 3 we don't have "keytabFile" field, instead we 
should use private method
+  // getKeytab().
+  Method method = clz.getDeclaredMethod("getKeytab");
+  method.setAccessible(true);
+  return (String) 
method.invoke(UserGroupInformation.getCurrentUser());
--- End diff --

I was not aware of this change, thanks for fixing this !
Let us move both field and method out of getKeytabFromUgi.
```
synchronized(UserGroupInformation.class) {
  if (null != keyTabField) { 
return keyTabField.get(null); 
  } else {
return (String) method.invoke(UserGroupInformation.getCurrentUser()) 
  }
}
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...

2018-04-27 Thread mridulm
Github user mridulm commented on a diff in the pull request:

https://github.com/apache/spark/pull/21178#discussion_r184625977
  
--- Diff: 
sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java
 ---
@@ -362,4 +371,34 @@ public static void verifyProxyAccess(String realUser, 
String proxyUser, String i
 }
   }
 
+  public static boolean needUgiLogin(UserGroupInformation ugi, String 
principal, String keytab) {
+return null == ugi || !ugi.hasKerberosCredentials() || 
!ugi.getUserName().equals(principal) ||
+  !keytab.equals(getKeytabFromUgi());
+  }
+
+  private static String getKeytabFromUgi() {
+Class clz = UserGroupInformation.class;
+try {
+  synchronized (clz) {
+Field field = clz.getDeclaredField("keytabFile");
--- End diff --

The field lookup + setAccessible can be pulled out of the method and into a 
static variable.
The field.get() itself should be within the sync block.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21178: [SPARK-24110][Thrift-Server] Avoid UGI.loginUserF...

2018-04-27 Thread jerryshao
GitHub user jerryshao opened a pull request:

https://github.com/apache/spark/pull/21178

[SPARK-24110][Thrift-Server] Avoid UGI.loginUserFromKeytab in STS

## What changes were proposed in this pull request?

Spark ThriftServer will call UGI.loginUserFromKeytab twice in 
initialization. This is unnecessary and will cause various potential problems, 
like Hadoop IPC failure after 7 days, or RM failover issue and so on.

So here we need to remove all the unnecessary login logics and make sure 
UGI in the context never be created again.

Note this is actually a HS2 issue, If later on we upgrade supported Hive 
version, the issue may already be fixed in Hive side.

## How was this patch tested?

Local verification in secure cluster.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jerryshao/apache-spark SPARK-24110

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21178.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21178


commit fd48235e5a835f4a701c4ce303fbef1d74b2f095
Author: jerryshao 
Date:   2018-04-27T07:21:52Z

Avoid UGI.loginUserFromKeytab in STS




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org