Copilot commented on code in PR #769:
URL: https://github.com/apache/ranger/pull/769#discussion_r2624539107


##########
dev-support/ranger-docker/scripts/hive/ranger-hive-setup.sh:
##########
@@ -139,32 +139,66 @@ cp ${HADOOP_HOME}/etc/hadoop/yarn-site.xml 
${HIVE_HOME}/conf/
 cp ${TEZ_HOME}/conf/tez-site.xml ${HIVE_HOME}/conf/
 
 # Upload Tez libraries to HDFS
-su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /apps/tez" hdfs
-
-# Recreate Tez tarball if it doesn't exist (it gets removed during Docker 
build)
-if [ ! -f "/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz" ]; then
-    echo "Recreating Tez tarball for HDFS upload..."
-    cd /opt
-    tar czf apache-tez-${TEZ_VERSION}-bin.tar.gz apache-tez-${TEZ_VERSION}-bin/
+if [ "${KERBEROS_ENABLED}" == "true" ]; then
+    echo "Kerberos enabled - authenticating as hive user..."
+    su -c "kinit -kt /etc/keytabs/hive.keytab hive/\`hostname 
-f\`@EXAMPLE.COM" hive
+    su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /apps/tez" hive
+
+    # Recreate Tez tarball if it doesn't exist
+    if [ ! -f "/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz" ]; then
+        echo "Recreating Tez tarball for HDFS upload..."
+        cd /opt
+        tar czf apache-tez-${TEZ_VERSION}-bin.tar.gz 
apache-tez-${TEZ_VERSION}-bin/
+    fi
+
+    su -c "${HADOOP_HOME}/bin/hdfs dfs -put -f 
/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz /apps/tez/" hive
+    su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod -R 755 /apps/tez" hive
+    su -c "kdestroy" hive
+else
+    # Non-Kerberos mode - use hdfs user
+    su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /apps/tez" hdfs
+
+    # Recreate Tez tarball if it doesn't exist (it gets removed during Docker 
build)
+    if [ ! -f "/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz" ]; then
+        echo "Recreating Tez tarball for HDFS upload..."
+        cd /opt
+        tar czf apache-tez-${TEZ_VERSION}-bin.tar.gz 
apache-tez-${TEZ_VERSION}-bin/
+    fi
+
+    su -c "${HADOOP_HOME}/bin/hdfs dfs -put -f 
/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz /apps/tez/" hdfs
+    su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod -R 755 /apps/tez" hdfs
 fi
 
-su -c "${HADOOP_HOME}/bin/hdfs dfs -put 
/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz /apps/tez/" hdfs
-su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod -R 755 /apps/tez" hdfs
-
 # Create HDFS user directory for hive
-su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /user/hive" hdfs
-su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod -R 777 /user/hive" hdfs
-
-# Create HDFS /tmp/hive directory for Tez staging
-su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /tmp/hive" hdfs
-su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod -R 777 /tmp/hive" hdfs
-
-# Fix /tmp directory permissions for Ranger (critical for INSERT operations)
-su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod 777 /tmp" hdfs
-
-# Create /user/root directory for YARN job execution
-su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /user/root" hdfs
-su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod 777 /user/root" hdfs
+if [ "${KERBEROS_ENABLED}" == "true" ]; then
+    su -c "kinit -kt /etc/keytabs/hive.keytab hive/\`hostname 
-f\`@EXAMPLE.COM" hive

Review Comment:
   The kinit command is being executed again here, but the Kerberos ticket 
should still be valid from the previous kinit at line 144. Re-authenticating 
multiple times within a short script increases complexity and risk of 
credential mismanagement. Consider consolidating all HDFS operations requiring 
hive user credentials into a single authenticated session with one 
kinit/kdestroy pair.



##########
dev-support/ranger-docker/scripts/hive/ranger-hive-setup.sh:
##########
@@ -139,32 +139,66 @@ cp ${HADOOP_HOME}/etc/hadoop/yarn-site.xml 
${HIVE_HOME}/conf/
 cp ${TEZ_HOME}/conf/tez-site.xml ${HIVE_HOME}/conf/
 
 # Upload Tez libraries to HDFS
-su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /apps/tez" hdfs
-
-# Recreate Tez tarball if it doesn't exist (it gets removed during Docker 
build)
-if [ ! -f "/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz" ]; then
-    echo "Recreating Tez tarball for HDFS upload..."
-    cd /opt
-    tar czf apache-tez-${TEZ_VERSION}-bin.tar.gz apache-tez-${TEZ_VERSION}-bin/
+if [ "${KERBEROS_ENABLED}" == "true" ]; then
+    echo "Kerberos enabled - authenticating as hive user..."
+    su -c "kinit -kt /etc/keytabs/hive.keytab hive/\`hostname 
-f\`@EXAMPLE.COM" hive
+    su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /apps/tez" hive
+
+    # Recreate Tez tarball if it doesn't exist
+    if [ ! -f "/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz" ]; then
+        echo "Recreating Tez tarball for HDFS upload..."
+        cd /opt
+        tar czf apache-tez-${TEZ_VERSION}-bin.tar.gz 
apache-tez-${TEZ_VERSION}-bin/
+    fi
+
+    su -c "${HADOOP_HOME}/bin/hdfs dfs -put -f 
/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz /apps/tez/" hive
+    su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod -R 755 /apps/tez" hive
+    su -c "kdestroy" hive
+else
+    # Non-Kerberos mode - use hdfs user
+    su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /apps/tez" hdfs
+
+    # Recreate Tez tarball if it doesn't exist (it gets removed during Docker 
build)
+    if [ ! -f "/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz" ]; then
+        echo "Recreating Tez tarball for HDFS upload..."
+        cd /opt
+        tar czf apache-tez-${TEZ_VERSION}-bin.tar.gz 
apache-tez-${TEZ_VERSION}-bin/
+    fi
+
+    su -c "${HADOOP_HOME}/bin/hdfs dfs -put -f 
/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz /apps/tez/" hdfs
+    su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod -R 755 /apps/tez" hdfs
 fi

Review Comment:
   There is significant code duplication between the Kerberos-enabled and 
non-Kerberos branches. The tarball recreation logic (lines 148-152 and 162-166) 
is identical. Consider extracting this common logic before the conditional 
block to improve maintainability.



##########
dev-support/ranger-docker/scripts/admin/create-ranger-services.py:
##########
@@ -21,6 +21,10 @@ def service_not_exists(service):
                                   'policy.download.auth.users': 'hdfs',
                                   'tag.download.auth.users': 'hdfs',
                                   'userstore.download.auth.users': 'hdfs',
+                                  'default-policy.1.name': 'hive-tez-path',
+                                  'default-policy.1.resource.path': '/*,/tmp',

Review Comment:
   The resource path '/*,/tmp' is overly permissive as it grants the hive user 
read, write, and execute access to all paths in HDFS. This violates the 
principle of least privilege. Consider specifying only the necessary paths such 
as '/apps/tez,/tmp/hive,/user/hive' to restrict access to what the hive user 
actually needs for Tez operations.



##########
dev-support/ranger-docker/scripts/hive/ranger-hive-setup.sh:
##########
@@ -139,32 +139,66 @@ cp ${HADOOP_HOME}/etc/hadoop/yarn-site.xml 
${HIVE_HOME}/conf/
 cp ${TEZ_HOME}/conf/tez-site.xml ${HIVE_HOME}/conf/
 
 # Upload Tez libraries to HDFS
-su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /apps/tez" hdfs
-
-# Recreate Tez tarball if it doesn't exist (it gets removed during Docker 
build)
-if [ ! -f "/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz" ]; then
-    echo "Recreating Tez tarball for HDFS upload..."
-    cd /opt
-    tar czf apache-tez-${TEZ_VERSION}-bin.tar.gz apache-tez-${TEZ_VERSION}-bin/
+if [ "${KERBEROS_ENABLED}" == "true" ]; then
+    echo "Kerberos enabled - authenticating as hive user..."
+    su -c "kinit -kt /etc/keytabs/hive.keytab hive/\`hostname 
-f\`@EXAMPLE.COM" hive

Review Comment:
   The kinit command should be checked for success before proceeding with HDFS 
operations. If authentication fails, subsequent HDFS commands will fail with 
confusing errors. Consider adding error checking after kinit, for example: 'su 
-c "kinit -kt /etc/keytabs/hive.keytab hive/\`hostname -f\`@EXAMPLE.COM" hive 
|| { echo "Kerberos authentication failed"; exit 1; }'



##########
dev-support/ranger-docker/scripts/hive/ranger-hive-setup.sh:
##########
@@ -139,32 +139,66 @@ cp ${HADOOP_HOME}/etc/hadoop/yarn-site.xml 
${HIVE_HOME}/conf/
 cp ${TEZ_HOME}/conf/tez-site.xml ${HIVE_HOME}/conf/
 
 # Upload Tez libraries to HDFS
-su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /apps/tez" hdfs
-
-# Recreate Tez tarball if it doesn't exist (it gets removed during Docker 
build)
-if [ ! -f "/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz" ]; then
-    echo "Recreating Tez tarball for HDFS upload..."
-    cd /opt
-    tar czf apache-tez-${TEZ_VERSION}-bin.tar.gz apache-tez-${TEZ_VERSION}-bin/
+if [ "${KERBEROS_ENABLED}" == "true" ]; then
+    echo "Kerberos enabled - authenticating as hive user..."
+    su -c "kinit -kt /etc/keytabs/hive.keytab hive/\`hostname 
-f\`@EXAMPLE.COM" hive
+    su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /apps/tez" hive
+
+    # Recreate Tez tarball if it doesn't exist
+    if [ ! -f "/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz" ]; then
+        echo "Recreating Tez tarball for HDFS upload..."
+        cd /opt
+        tar czf apache-tez-${TEZ_VERSION}-bin.tar.gz 
apache-tez-${TEZ_VERSION}-bin/
+    fi
+
+    su -c "${HADOOP_HOME}/bin/hdfs dfs -put -f 
/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz /apps/tez/" hive
+    su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod -R 755 /apps/tez" hive
+    su -c "kdestroy" hive
+else
+    # Non-Kerberos mode - use hdfs user
+    su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /apps/tez" hdfs
+
+    # Recreate Tez tarball if it doesn't exist (it gets removed during Docker 
build)
+    if [ ! -f "/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz" ]; then
+        echo "Recreating Tez tarball for HDFS upload..."
+        cd /opt
+        tar czf apache-tez-${TEZ_VERSION}-bin.tar.gz 
apache-tez-${TEZ_VERSION}-bin/
+    fi
+
+    su -c "${HADOOP_HOME}/bin/hdfs dfs -put -f 
/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz /apps/tez/" hdfs
+    su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod -R 755 /apps/tez" hdfs
 fi
 
-su -c "${HADOOP_HOME}/bin/hdfs dfs -put 
/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz /apps/tez/" hdfs
-su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod -R 755 /apps/tez" hdfs
-
 # Create HDFS user directory for hive
-su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /user/hive" hdfs
-su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod -R 777 /user/hive" hdfs
-
-# Create HDFS /tmp/hive directory for Tez staging
-su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /tmp/hive" hdfs
-su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod -R 777 /tmp/hive" hdfs
-
-# Fix /tmp directory permissions for Ranger (critical for INSERT operations)
-su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod 777 /tmp" hdfs
-
-# Create /user/root directory for YARN job execution
-su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /user/root" hdfs
-su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod 777 /user/root" hdfs
+if [ "${KERBEROS_ENABLED}" == "true" ]; then
+    su -c "kinit -kt /etc/keytabs/hive.keytab hive/\`hostname 
-f\`@EXAMPLE.COM" hive

Review Comment:
   Similar to the previous kinit at line 144, this authentication command 
should be checked for success before proceeding with HDFS operations. Consider 
adding error handling to fail fast if authentication fails.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to