Copilot commented on code in PR #769:
URL: https://github.com/apache/ranger/pull/769#discussion_r2624539107
##########
dev-support/ranger-docker/scripts/hive/ranger-hive-setup.sh:
##########
@@ -139,32 +139,66 @@ cp ${HADOOP_HOME}/etc/hadoop/yarn-site.xml
${HIVE_HOME}/conf/
cp ${TEZ_HOME}/conf/tez-site.xml ${HIVE_HOME}/conf/
# Upload Tez libraries to HDFS
-su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /apps/tez" hdfs
-
-# Recreate Tez tarball if it doesn't exist (it gets removed during Docker
build)
-if [ ! -f "/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz" ]; then
- echo "Recreating Tez tarball for HDFS upload..."
- cd /opt
- tar czf apache-tez-${TEZ_VERSION}-bin.tar.gz apache-tez-${TEZ_VERSION}-bin/
+if [ "${KERBEROS_ENABLED}" == "true" ]; then
+ echo "Kerberos enabled - authenticating as hive user..."
+ su -c "kinit -kt /etc/keytabs/hive.keytab hive/\`hostname
-f\`@EXAMPLE.COM" hive
+ su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /apps/tez" hive
+
+ # Recreate Tez tarball if it doesn't exist
+ if [ ! -f "/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz" ]; then
+ echo "Recreating Tez tarball for HDFS upload..."
+ cd /opt
+ tar czf apache-tez-${TEZ_VERSION}-bin.tar.gz
apache-tez-${TEZ_VERSION}-bin/
+ fi
+
+ su -c "${HADOOP_HOME}/bin/hdfs dfs -put -f
/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz /apps/tez/" hive
+ su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod -R 755 /apps/tez" hive
+ su -c "kdestroy" hive
+else
+ # Non-Kerberos mode - use hdfs user
+ su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /apps/tez" hdfs
+
+ # Recreate Tez tarball if it doesn't exist (it gets removed during Docker
build)
+ if [ ! -f "/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz" ]; then
+ echo "Recreating Tez tarball for HDFS upload..."
+ cd /opt
+ tar czf apache-tez-${TEZ_VERSION}-bin.tar.gz
apache-tez-${TEZ_VERSION}-bin/
+ fi
+
+ su -c "${HADOOP_HOME}/bin/hdfs dfs -put -f
/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz /apps/tez/" hdfs
+ su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod -R 755 /apps/tez" hdfs
fi
-su -c "${HADOOP_HOME}/bin/hdfs dfs -put
/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz /apps/tez/" hdfs
-su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod -R 755 /apps/tez" hdfs
-
# Create HDFS user directory for hive
-su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /user/hive" hdfs
-su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod -R 777 /user/hive" hdfs
-
-# Create HDFS /tmp/hive directory for Tez staging
-su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /tmp/hive" hdfs
-su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod -R 777 /tmp/hive" hdfs
-
-# Fix /tmp directory permissions for Ranger (critical for INSERT operations)
-su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod 777 /tmp" hdfs
-
-# Create /user/root directory for YARN job execution
-su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /user/root" hdfs
-su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod 777 /user/root" hdfs
+if [ "${KERBEROS_ENABLED}" == "true" ]; then
+ su -c "kinit -kt /etc/keytabs/hive.keytab hive/\`hostname
-f\`@EXAMPLE.COM" hive
Review Comment:
The kinit command is being executed again here, but the Kerberos ticket
should still be valid from the previous kinit at line 144. Re-authenticating
multiple times within a short script increases complexity and risk of
credential mismanagement. Consider consolidating all HDFS operations requiring
hive user credentials into a single authenticated session with one
kinit/kdestroy pair.
##########
dev-support/ranger-docker/scripts/hive/ranger-hive-setup.sh:
##########
@@ -139,32 +139,66 @@ cp ${HADOOP_HOME}/etc/hadoop/yarn-site.xml
${HIVE_HOME}/conf/
cp ${TEZ_HOME}/conf/tez-site.xml ${HIVE_HOME}/conf/
# Upload Tez libraries to HDFS
-su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /apps/tez" hdfs
-
-# Recreate Tez tarball if it doesn't exist (it gets removed during Docker
build)
-if [ ! -f "/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz" ]; then
- echo "Recreating Tez tarball for HDFS upload..."
- cd /opt
- tar czf apache-tez-${TEZ_VERSION}-bin.tar.gz apache-tez-${TEZ_VERSION}-bin/
+if [ "${KERBEROS_ENABLED}" == "true" ]; then
+ echo "Kerberos enabled - authenticating as hive user..."
+ su -c "kinit -kt /etc/keytabs/hive.keytab hive/\`hostname
-f\`@EXAMPLE.COM" hive
+ su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /apps/tez" hive
+
+ # Recreate Tez tarball if it doesn't exist
+ if [ ! -f "/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz" ]; then
+ echo "Recreating Tez tarball for HDFS upload..."
+ cd /opt
+ tar czf apache-tez-${TEZ_VERSION}-bin.tar.gz
apache-tez-${TEZ_VERSION}-bin/
+ fi
+
+ su -c "${HADOOP_HOME}/bin/hdfs dfs -put -f
/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz /apps/tez/" hive
+ su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod -R 755 /apps/tez" hive
+ su -c "kdestroy" hive
+else
+ # Non-Kerberos mode - use hdfs user
+ su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /apps/tez" hdfs
+
+ # Recreate Tez tarball if it doesn't exist (it gets removed during Docker
build)
+ if [ ! -f "/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz" ]; then
+ echo "Recreating Tez tarball for HDFS upload..."
+ cd /opt
+ tar czf apache-tez-${TEZ_VERSION}-bin.tar.gz
apache-tez-${TEZ_VERSION}-bin/
+ fi
+
+ su -c "${HADOOP_HOME}/bin/hdfs dfs -put -f
/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz /apps/tez/" hdfs
+ su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod -R 755 /apps/tez" hdfs
fi
Review Comment:
There is significant code duplication between the Kerberos-enabled and
non-Kerberos branches. The tarball recreation logic (lines 148-152 and 162-166)
is identical. Consider extracting this common logic before the conditional
block to improve maintainability.
##########
dev-support/ranger-docker/scripts/admin/create-ranger-services.py:
##########
@@ -21,6 +21,10 @@ def service_not_exists(service):
'policy.download.auth.users': 'hdfs',
'tag.download.auth.users': 'hdfs',
'userstore.download.auth.users': 'hdfs',
+ 'default-policy.1.name': 'hive-tez-path',
+ 'default-policy.1.resource.path': '/*,/tmp',
Review Comment:
The resource path '/*,/tmp' is overly permissive as it grants the hive user
read, write, and execute access to all paths in HDFS. This violates the
principle of least privilege. Consider specifying only the necessary paths such
as '/apps/tez,/tmp/hive,/user/hive' to restrict access to what the hive user
actually needs for Tez operations.
##########
dev-support/ranger-docker/scripts/hive/ranger-hive-setup.sh:
##########
@@ -139,32 +139,66 @@ cp ${HADOOP_HOME}/etc/hadoop/yarn-site.xml
${HIVE_HOME}/conf/
cp ${TEZ_HOME}/conf/tez-site.xml ${HIVE_HOME}/conf/
# Upload Tez libraries to HDFS
-su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /apps/tez" hdfs
-
-# Recreate Tez tarball if it doesn't exist (it gets removed during Docker
build)
-if [ ! -f "/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz" ]; then
- echo "Recreating Tez tarball for HDFS upload..."
- cd /opt
- tar czf apache-tez-${TEZ_VERSION}-bin.tar.gz apache-tez-${TEZ_VERSION}-bin/
+if [ "${KERBEROS_ENABLED}" == "true" ]; then
+ echo "Kerberos enabled - authenticating as hive user..."
+ su -c "kinit -kt /etc/keytabs/hive.keytab hive/\`hostname
-f\`@EXAMPLE.COM" hive
Review Comment:
The kinit command should be checked for success before proceeding with HDFS
operations. If authentication fails, subsequent HDFS commands will fail with
confusing errors. Consider adding error checking after kinit, for example: 'su
-c "kinit -kt /etc/keytabs/hive.keytab hive/\`hostname -f\`@EXAMPLE.COM" hive
|| { echo "Kerberos authentication failed"; exit 1; }'
##########
dev-support/ranger-docker/scripts/hive/ranger-hive-setup.sh:
##########
@@ -139,32 +139,66 @@ cp ${HADOOP_HOME}/etc/hadoop/yarn-site.xml
${HIVE_HOME}/conf/
cp ${TEZ_HOME}/conf/tez-site.xml ${HIVE_HOME}/conf/
# Upload Tez libraries to HDFS
-su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /apps/tez" hdfs
-
-# Recreate Tez tarball if it doesn't exist (it gets removed during Docker
build)
-if [ ! -f "/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz" ]; then
- echo "Recreating Tez tarball for HDFS upload..."
- cd /opt
- tar czf apache-tez-${TEZ_VERSION}-bin.tar.gz apache-tez-${TEZ_VERSION}-bin/
+if [ "${KERBEROS_ENABLED}" == "true" ]; then
+ echo "Kerberos enabled - authenticating as hive user..."
+ su -c "kinit -kt /etc/keytabs/hive.keytab hive/\`hostname
-f\`@EXAMPLE.COM" hive
+ su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /apps/tez" hive
+
+ # Recreate Tez tarball if it doesn't exist
+ if [ ! -f "/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz" ]; then
+ echo "Recreating Tez tarball for HDFS upload..."
+ cd /opt
+ tar czf apache-tez-${TEZ_VERSION}-bin.tar.gz
apache-tez-${TEZ_VERSION}-bin/
+ fi
+
+ su -c "${HADOOP_HOME}/bin/hdfs dfs -put -f
/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz /apps/tez/" hive
+ su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod -R 755 /apps/tez" hive
+ su -c "kdestroy" hive
+else
+ # Non-Kerberos mode - use hdfs user
+ su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /apps/tez" hdfs
+
+ # Recreate Tez tarball if it doesn't exist (it gets removed during Docker
build)
+ if [ ! -f "/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz" ]; then
+ echo "Recreating Tez tarball for HDFS upload..."
+ cd /opt
+ tar czf apache-tez-${TEZ_VERSION}-bin.tar.gz
apache-tez-${TEZ_VERSION}-bin/
+ fi
+
+ su -c "${HADOOP_HOME}/bin/hdfs dfs -put -f
/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz /apps/tez/" hdfs
+ su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod -R 755 /apps/tez" hdfs
fi
-su -c "${HADOOP_HOME}/bin/hdfs dfs -put
/opt/apache-tez-${TEZ_VERSION}-bin.tar.gz /apps/tez/" hdfs
-su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod -R 755 /apps/tez" hdfs
-
# Create HDFS user directory for hive
-su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /user/hive" hdfs
-su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod -R 777 /user/hive" hdfs
-
-# Create HDFS /tmp/hive directory for Tez staging
-su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /tmp/hive" hdfs
-su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod -R 777 /tmp/hive" hdfs
-
-# Fix /tmp directory permissions for Ranger (critical for INSERT operations)
-su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod 777 /tmp" hdfs
-
-# Create /user/root directory for YARN job execution
-su -c "${HADOOP_HOME}/bin/hdfs dfs -mkdir -p /user/root" hdfs
-su -c "${HADOOP_HOME}/bin/hdfs dfs -chmod 777 /user/root" hdfs
+if [ "${KERBEROS_ENABLED}" == "true" ]; then
+ su -c "kinit -kt /etc/keytabs/hive.keytab hive/\`hostname
-f\`@EXAMPLE.COM" hive
Review Comment:
Similar to the previous kinit at line 144, this authentication command
should be checked for success before proceeding with HDFS operations. Consider
adding error handling to fail fast if authentication fails.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]