prasanthj commented on a change in pull request #1418:
URL: https://github.com/apache/hive/pull/1418#discussion_r477463702



##########
File path: 
llap-common/src/java/org/apache/hadoop/hive/llap/security/ConfBasedJwtSharedSecretProvider.java
##########
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.llap.security;
+
+import com.google.common.base.Preconditions;
+import io.jsonwebtoken.security.Keys;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hive.conf.HiveConf;
+
+import java.security.Key;
+
+/**
+ * Default implementation of {@link JwtSecretProvider}.
+ * It uses the same encryption and decryption secret which can be used to sign 
and verify JWT.
+ */
+public class ConfBasedJwtSharedSecretProvider implements JwtSecretProvider {
+
+  private Key jwtEncryptionKey;
+
+  @Override public Key getEncryptionSecret() {
+    return jwtEncryptionKey;
+  }
+
+  @Override public Key getDecryptionSecret() {
+    return jwtEncryptionKey;
+  }
+
+  @Override public void init(final Configuration conf) {
+    final String sharedSecret = HiveConf.getVar(conf, 
HiveConf.ConfVars.LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET);

Review comment:
       Use conf.getPassword(). This should be fetched from jceks file. 

##########
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
##########
@@ -4880,6 +4880,22 @@ private static void 
populateLlapDaemonVarsSet(Set<String> llapDaemonVarsSetLocal
     
LLAP_EXTERNAL_CLIENT_USE_HYBRID_CALENDAR("hive.llap.external.client.use.hybrid.calendar",
         false,
         "Whether to use hybrid calendar for parsing of data/timestamps."),
+
+    // confs for llap-external-client cloud deployment
+    
LLAP_EXTERNAL_CLIENT_CLOUD_RPC_PORT("hive.llap.external.client.cloud.rpc.port", 
30004,
+        "The LLAP daemon RPC port for external clients when llap is running in 
cloud environment."),
+    
LLAP_EXTERNAL_CLIENT_CLOUD_OUTPUT_SERVICE_PORT("hive.llap.external.client.cloud.output.service.port",
 30005,
+                "LLAP output service port when llap is running in cloud 
environment"),
+    LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET_PROVIDER(
+        "hive.llap.external.client.cloud.jwt.shared.secret.provider",
+        
"org.apache.hadoop.hive.llap.security.ConfBasedJwtSharedSecretProvider",
+        "Shared secret provider to be used to sign JWT"),
+    
LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET("hive.llap.external.client.cloud.jwt.shared.secret",
+        "Let me give you this secret and you will get the access!",

Review comment:
       may be keep the default value empty. The system deploying this would 
have to randomly generate and store it in jceks file. empty default value and 
fail early instead of falling back to default. 

##########
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
##########
@@ -4880,6 +4880,22 @@ private static void 
populateLlapDaemonVarsSet(Set<String> llapDaemonVarsSetLocal
     
LLAP_EXTERNAL_CLIENT_USE_HYBRID_CALENDAR("hive.llap.external.client.use.hybrid.calendar",
         false,
         "Whether to use hybrid calendar for parsing of data/timestamps."),
+
+    // confs for llap-external-client cloud deployment
+    
LLAP_EXTERNAL_CLIENT_CLOUD_RPC_PORT("hive.llap.external.client.cloud.rpc.port", 
30004,
+        "The LLAP daemon RPC port for external clients when llap is running in 
cloud environment."),
+    
LLAP_EXTERNAL_CLIENT_CLOUD_OUTPUT_SERVICE_PORT("hive.llap.external.client.cloud.output.service.port",
 30005,
+                "LLAP output service port when llap is running in cloud 
environment"),
+    LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET_PROVIDER(
+        "hive.llap.external.client.cloud.jwt.shared.secret.provider",
+        
"org.apache.hadoop.hive.llap.security.ConfBasedJwtSharedSecretProvider",

Review comment:
       I would also provide environment variable based secret provider. Conf 
based ones have the potential of plain text secret if jceks is not used. Also 
will be good to have a config to enable the ports explicitly. If the config is 
false, then none of these ports, jwt secret etc. should be initialized. 

##########
File path: 
llap-common/src/java/org/apache/hadoop/hive/llap/security/JwtSecretProvider.java
##########
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.llap.security;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hive.conf.HiveConf;
+
+import java.security.Key;
+
+/**
+ * JwtSecretProvider
+ *
+ * - provides encryption and decryption secrets for generating and parsing 
JWTs.
+ *
+ * - Hive internally uses method initAndGet() which initializes providers 
based on the value of config
+ *   {@link 
HiveConf.ConfVars#LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET_PROVIDER}.
+ *   It expects implementations to provide default constructor and {@link 
#init(Configuration)} method.
+ */
+public interface JwtSecretProvider {
+
+  /**
+   * returns secret for signing JWT.
+   */
+  Key getEncryptionSecret();
+
+  /**
+   * returns secret for parsing JWT.
+   */
+  Key getDecryptionSecret();
+
+  /**
+   * Initializes the provider.
+   * @param conf configuration
+   */
+  void init(Configuration conf);
+
+  /**
+   *  Hive internally uses this method to obtain instance of {@link 
JwtSecretProvider}
+   *
+   * @param conf configuration
+   * @return implementation of {@link 
HiveConf.ConfVars#LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET_PROVIDER}
+   */
+  static JwtSecretProvider initAndGet(Configuration conf) {
+    final String providerClass =
+        HiveConf.getVar(conf, 
HiveConf.ConfVars.LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET_PROVIDER);

Review comment:
       same here. conf.getPassword

##########
File path: 
llap-client/src/java/org/apache/hadoop/hive/llap/registry/LlapServiceInstance.java
##########
@@ -47,6 +49,24 @@
    */
   public int getOutputFormatPort();
 
+  /**
+   * External host, usually needed in cloud envs where we cannot access 
internal host from outside
+   *
+   * @return
+   */
+  String getExternalHost();

Review comment:
       nit: getExternalHostname()

##########
File path: llap-common/src/java/org/apache/hadoop/hive/llap/LlapUtil.java
##########
@@ -401,4 +401,24 @@ public static Credentials credentialsFromByteArray(byte[] 
binaryCredentials)
     credentials.readTokenStorageStream(dib);
     return credentials;
   }
+
+  /**
+   * @return returns the value of IS_CLOUD_DEPLOYMENT from either environment 
variable or system properties
+   */
+  public static boolean isCloudDeployment() {
+    return "true".equalsIgnoreCase(System.getenv("IS_CLOUD_DEPLOYMENT"))

Review comment:
       alternative you can use is to default the external ports to 0. If port 
number is set to 0 then it is not cloud deployment. 

##########
File path: 
llap-ext-client/src/java/org/apache/hadoop/hive/llap/LlapBaseInputFormat.java
##########
@@ -157,12 +153,12 @@ public LlapBaseInputFormat() {
     HiveConf.setVar(job, HiveConf.ConfVars.LLAP_ZK_REGISTRY_USER, 
llapSplit.getLlapUser());
     SubmitWorkInfo submitWorkInfo = 
SubmitWorkInfo.fromBytes(llapSplit.getPlanBytes());
 
-    LlapServiceInstance serviceInstance = getServiceInstance(job, llapSplit);
-    String host = serviceInstance.getHost();
-    int llapSubmitPort = serviceInstance.getRpcPort();
+    final LlapDaemonInfo llapDaemonInfo = llapSplit.getLlapDaemonInfos()[0];

Review comment:
       can getLlapDaemonInfos() return empty array? 

##########
File path: 
llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/ContainerRunnerImpl.java
##########
@@ -342,6 +352,33 @@ public SubmitWorkResponseProto 
submitWork(SubmitWorkRequestProto request) throws
         .build();
   }
 
+  // if request is coming from llap external client, verify the JWT
+  // as of now, JWT contains applicationId

Review comment:
       which applicationId is it? I assume llap application ID? can you explain 
a little bit more about how/who injects the llap application ID into jwt and 
how llap validate it against its own app ID. 

##########
File path: 
llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/LlapDaemon.java
##########
@@ -474,6 +478,10 @@ public void serviceStart() throws Exception {
       getConfig().setInt(ConfVars.LLAP_DAEMON_WEB_PORT.varname, 
webServices.getPort());
     }
     getConfig().setInt(ConfVars.LLAP_DAEMON_OUTPUT_SERVICE_PORT.varname, 
LlapOutputFormatService.get().getPort());
+    if (LlapUtil.isCloudDeployment()) {
+      getConfig().setInt(ConfVars.LLAP_EXTERNAL_CLIENT_CLOUD_RPC_PORT.varname,

Review comment:
       i think somewhere here we have to do a validation.. if the port is 
non-zero and if the jwt shared secret cannot be obtained then fail the daemon.. 

##########
File path: 
llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/ContainerRunnerImpl.java
##########
@@ -342,6 +352,33 @@ public SubmitWorkResponseProto 
submitWork(SubmitWorkRequestProto request) throws
         .build();
   }
 
+  // if request is coming from llap external client, verify the JWT
+  // as of now, JWT contains applicationId
+  private void verifyJwtForExternalClient(SubmitWorkRequestProto request, 
String applicationIdString,
+      String fragmentIdString) {
+    LOG.info("Checking if request[{}] is from llap external client in a cloud 
based deployment", applicationIdString);
+    if (request.getIsExternalClientRequest() && LlapUtil.isCloudDeployment()) {
+      LOG.info("Llap external client request - {}, verifying JWT", 
applicationIdString);
+      Preconditions.checkState(request.hasJwt(), "JWT not found in request, 
fragmentId: " + fragmentIdString);
+
+      JwtHelper jwtHelper = new JwtHelper(getConfig());
+      Jws<Claims> claimsJws;
+      try {
+        claimsJws = jwtHelper.parseClaims(request.getJwt());
+      } catch (JwtException e) {
+        LOG.error("Cannot verify JWT provided with the request, fragmentId: 
{}, {}", fragmentIdString, e);
+        throw e;
+      }
+
+      String appIdInJwt = (String) 
claimsJws.getBody().get(JwtHelper.LLAP_EXT_CLIENT_APP_ID);
+      // this should never happen ideally.
+      Preconditions.checkState(appIdInJwt.equals(applicationIdString),

Review comment:
       .equals(selfAppIdString) where selfAppIdString is llap daemon's app id.. 
just for better clarity.. 

##########
File path: 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java
##########
@@ -559,12 +566,22 @@ private SplitResult getSplits(JobConf job, TezWork work, 
Schema schema, Applicat
         // 4. Make location hints.
         SplitLocationInfo[] locations = makeLocationHints(hints.get(i));
 
+        // 5. populate info about llap daemons(to help client submit request 
and read data)
+        LlapDaemonInfo[] llapDaemonInfos = populateLlapDaemonInfos(job, 
locations);
+
+        // 6. Generate JWT for external clients if it's a cloud deployment
+        String jwt = "";
+        if (LlapUtil.isCloudDeployment()) {
+          JwtHelper jwtHelper = new JwtHelper(SessionState.getSessionConf());
+          jwt = jwtHelper.buildJwtForLlap(applicationId);

Review comment:
       leave a comment about which application's id we are injecting here and 
how the flow of jwt validation works.. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to