steveloughran commented on code in PR #8100:
URL: https://github.com/apache/hadoop/pull/8100#discussion_r2607437440


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/TaskLevelSecurityEnforcement.md:
##########
@@ -0,0 +1,92 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+MR Task-Level Security Enforcement
+==================
+
+<!-- MACRO{toc|fromDepth=0|toDepth=2} -->
+
+Overview
+-------
+The goal of this feature is to provide a configurable mechanism to control which users are allowed to execute specific MapReduce jobs.
+This feature aims to prevent unauthorized or potentially harmful mapper/reducer implementations from running within the Hadoop cluster.
+
+In the standard Hadoop MapReduce execution flow:
+1) A MapReduce job is submitted by a user.
+2) The job is registered with the Resource Manager (RM).
+3) The RM assigns the job to a Node Manager (NM), where the Application Master (AM) for the job is launched.
+4) The AM requests additional containers from the cluster, to be able to start tasks.
+5) The NM launches those containers, and the containers execute the mapper/reducer tasks defined by the job.
+
+This feature introduces a security filtering mechanism inside the Application Master.
+Before mapper or reducer tasks are launched, the AM will verify that the user-submitted MapReduce code complies with a cluster-defined security policy.
+This ensures that only approved classes or packages can be executed inside the containers.
+The goal is to protect the cluster from unwanted or unsafe task implementations, such as custom code that may introduce performance, stability, or security risks.
+
+Upon receiving job metadata, the Application Master will:
+1) Check the feature is enabled.
+2) Check the user who submitted the job is allowed to bypass the security check.
+3) Compare classes in job config against the denied task list.
+4) If job is not authorised an exception will be thrown and AM will fail.
+
+Configurations
+-------
+
+#### Enables MapReduce Task-Level Security Enforcement
+When enabled, the Application Master performs validation of user-submitted mapper, reducer, and other task-related classes before launching containers.
+This mechanism protects the cluster from running disallowed or unsafe task implementations as defined by administrator-controlled policies.
+- Property name: mapreduce.security.enabled
+- Property type: boolean
+- Default: false (security disabled)
+
+
+#### MapReduce Task-Level Security Enforcement: Property Domain
+Defines the set of MapReduce configuration keys that represent user-supplied class names involved in task execution (e.g., mapper, reducer, partitioner).
+The Application Master examines the values of these properties and checks whether any referenced class is listed in denied tasks.
+Administrators may override this list to expand or restrict the validation domain.
+- Property name: mapreduce.security.property-domain
+- Property type: list of configuration keys
+- Default:
+  - mapreduce.job.combine.class
+  - mapreduce.job.combiner.group.comparator.class
+  - mapreduce.job.end-notification.custom-notifier-class
+  - mapreduce.job.inputformat.class
+  - mapreduce.job.map.class
+  - mapreduce.job.map.output.collector.class
+  - mapreduce.job.output.group.comparator.class
+  - mapreduce.job.output.key.class
+  - mapreduce.job.output.key.comparator.class
+  - mapreduce.job.output.value.class
+  - mapreduce.job.outputformat.class
+  - mapreduce.job.partitioner.class
+  - mapreduce.job.reduce.class
+  - mapreduce.map.output.key.class
+  - mapreduce.map.output.value.class

Review Comment:
   add mapreduce.outputcommitter.factory.class to this list



##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TaskLevelSecurityEnforcer.java:
##########
@@ -0,0 +1,103 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.mapreduce.v2.app.security.authorize;
+
+import java.util.Arrays;
+import java.util.List;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapreduce.MRConfig;
+import org.apache.hadoop.mapreduce.MRJobConfig;
+
+/**
+ * Enforces task-level security rules for MapReduce jobs.
+ *
+ * <p>This security enforcement mechanism validates whether the user who submitted
+ * a job is allowed to execute the mapper/reducer/task classes defined in the job
+ * configuration. The check is performed inside the Application Master before
+ * task containers are launched.</p>
+ * <p>If the user is not on the allowed list and any job property within the configured
+ * security property domain references a denied class/prefix, a
+ * {@link TaskLevelSecurityException} is thrown and the job is rejected.</p>
+ * <p>This prevents unauthorized or unsafe custom code from running inside
+ * cluster containers.</p>
+ */
+public final class TaskLevelSecurityEnforcer {
+  private static final Logger LOG = LoggerFactory.getLogger(TaskLevelSecurityEnforcer.class);
+
+  /**
+   * Default constructor.
+   */
+  private TaskLevelSecurityEnforcer() {
+  }
+
+  /**
+   * Validates a MapReduce job's configuration against the cluster's task-level
+   * security policy.
+   *
+   * <p>The method performs the following steps:</p>
+   * <ol>
+   *   <li>Check whether task-level security is enabled.</li>
+   *   <li>Allow the job immediately if the user is on the configured allowed-users list.</li>
+   *   <li>Retrieve the security property domain (list of job configuration keys to inspect).</li>
+   *   <li>Retrieve the list of denied task class prefixes.</li>
+   *   <li>For each domain property, check whether its value begins with any denied prefix.</li>
+   *   <li>If a match is found, reject the job by throwing {@link TaskLevelSecurityException}.</li>
+   * </ol>
+   *
+   * @param conf the job configuration to validate
+   * @throws TaskLevelSecurityException if the user is not authorized to use one of the task classes
+   */
+  public static void validate(JobConf conf) throws TaskLevelSecurityException {
+    if (!conf.getBoolean(MRConfig.SECURITY_ENABLED, MRConfig.DEFAULT_SECURITY_ENABLED)) {
+      LOG.debug("The {} is disabled",  MRConfig.SECURITY_ENABLED);
+      return;
+    }
+
+    String currentUser = conf.get(MRJobConfig.USER_NAME);

Review Comment:
   Also use getTrimmed() here and elsewhere, so that any surrounding whitespace is stripped from the value.
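
A minimal sketch of why the trimmed lookup matters for the allowed-users check, using only the JDK. The `TrimmedLookupSketch` class and its `isAllowed` helper are hypothetical stand-ins, not code from the patch; in the patch the trimming would come from `Configuration.getTrimmed()`, which is deliberately not called here so that the example runs without Hadoop on the classpath.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the allow-list check; not part of the patch. */
public class TrimmedLookupSketch {

  /** Returns true if the trimmed user name is on the allow list. */
  public static boolean isAllowed(String rawUser, List<String> allowedUsers) {
    if (rawUser == null) {
      return false;
    }
    // Trim first, mimicking what a getTrimmed()-style lookup would return.
    return allowedUsers.contains(rawUser.trim());
  }

  public static void main(String[] args) {
    List<String> allowed = Arrays.asList("hue", "hive");
    String rawUser = " hive ";  // value as it might appear in a hand-edited config

    // The untrimmed string does not equal "hive", so a plain contains() misses it.
    System.out.println("untrimmed match: " + allowed.contains(rawUser));
    System.out.println("trimmed match:   " + isAllowed(rawUser, allowed));
  }
}
```

The allowed-users list in the patch is already read with `getTrimmedStrings()`, so trimming the user name as well keeps both sides of the comparison consistent.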



##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRConfig.java:
##########
@@ -133,5 +133,82 @@ public interface MRConfig {
   boolean DEFAULT_MASTER_WEBAPP_UI_ACTIONS_ENABLED = true;
   String MULTIPLE_OUTPUTS_CLOSE_THREAD_COUNT = "mapreduce.multiple-outputs-close-threads";
   int DEFAULT_MULTIPLE_OUTPUTS_CLOSE_THREAD_COUNT = 10;
+
+  /**
+   * Enables MapReduce Task-Level Security Enforcement.
+   *
+   * When enabled, the Application Master performs validation of user-submitted
+   * mapper, reducer, and other task-related classes before launching containers.
+   * This mechanism protects the cluster from running disallowed or unsafe task
+   * implementations as defined by administrator-controlled policies.
+   *
+   * Property type: boolean
+   * Default: false (security disabled)
+   */
+  String SECURITY_ENABLED = "mapreduce.security.enabled";
+  boolean DEFAULT_SECURITY_ENABLED = false;
+
+  /**
+   * MapReduce Task-Level Security Enforcement: Property Domain
+   *
+   * Defines the set of MapReduce configuration keys that represent user-supplied
+   * class names involved in task execution (e.g., mapper, reducer, partitioner).
+   * The Application Master examines the values of these properties and checks
+   * whether any referenced class is listed in {@link #SECURITY_DENIED_TASKS}.
+   * Administrators may override this list to expand or restrict the validation
+   * domain.
+   *
+   * Property type: list of configuration keys
+   * Default: all known task-level class properties (see list below)
+   */
+  String SECURITY_PROPERTY_DOMAIN = "mapreduce.security.property-domain";
+  String[] DEFAULT_SECURITY_PROPERTY_DOMAIN = {
+      "mapreduce.job.combine.class",
+      "mapreduce.job.combiner.group.comparator.class",
+      "mapreduce.job.end-notification.custom-notifier-class",
+      "mapreduce.job.inputformat.class",
+      "mapreduce.job.map.class",
+      "mapreduce.job.map.output.collector.class",
+      "mapreduce.job.output.group.comparator.class",
+      "mapreduce.job.output.key.class",
+      "mapreduce.job.output.key.comparator.class",
+      "mapreduce.job.output.value.class",
+      "mapreduce.job.outputformat.class",
+      "mapreduce.job.partitioner.class",
+      "mapreduce.job.reduce.class",
+      "mapreduce.map.output.key.class",
+      "mapreduce.map.output.value.class"
+  };
+
+  /**
+   * MapReduce Task-Level Security Enforcement: Denied Tasks
+   *
+   * Specifies the list of disallowed task implementation classes or packages.
+   * If a user submits a job whose mapper, reducer, or other task-related classes
+   * match any entry in this blacklist, the job is rejected.
+   *
+   * Property type: list of class name or package patterns
+   * Default: empty (no restrictions)
+   * Example: org.apache.hadoop.streaming,org.apache.hadoop.examples.QuasiMonteCarlo
+   */
+  String SECURITY_DENIED_TASKS = "mapreduce.security.denied-tasks";
+  String[] DEFAULT_SECURITY_DENIED_TASKS = {};
+
+  /**
+   * MapReduce Task-Level Security Enforcement: Allowed Users
+   *
+   * Specifies users who may bypass the blacklist defined in
+   * {@link #SECURITY_DENIED_TASKS}.
+   * This whitelist is intended for trusted or system-level workflows that may
+   * legitimately require the use of restricted task implementations.
+   * If the submitting user is listed here, blacklist enforcement is skipped,
+   * although standard Hadoop authentication and ACL checks still apply.
+   *
+   * Property type: list of usernames
+   * Default: empty (no bypass users)
+   * Example: hue,hive
+   */
+  String SECURITY_ALLOWED_USERS = "mapreduce.security.allowed-users";

Review Comment:
   same comment on <p> and the {@value} entry



##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRConfig.java:
##########
@@ -133,5 +133,82 @@ public interface MRConfig {
   boolean DEFAULT_MASTER_WEBAPP_UI_ACTIONS_ENABLED = true;
   String MULTIPLE_OUTPUTS_CLOSE_THREAD_COUNT = "mapreduce.multiple-outputs-close-threads";
   int DEFAULT_MULTIPLE_OUTPUTS_CLOSE_THREAD_COUNT = 10;
+
+  /**
+   * Enables MapReduce Task-Level Security Enforcement.
+   *
+   * When enabled, the Application Master performs validation of user-submitted
+   * mapper, reducer, and other task-related classes before launching containers.
+   * This mechanism protects the cluster from running disallowed or unsafe task
+   * implementations as defined by administrator-controlled policies.
+   *
+   * Property type: boolean
+   * Default: false (security disabled)

Review Comment:
   add a line
   ```
   value: {@value}
   ```
   so that the javadocs fill in the value
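
A minimal sketch of what the two javadoc suggestions in this review (`<p>` paragraph breaks and a `{@value}` line) could look like together. `ValueTagSketch` and its `sketch.*` key are hypothetical names chosen for illustration; the layout only loosely mirrors the constants in the patch.

```java
/** Hypothetical constants holder used to illustrate the javadoc layout. */
public final class ValueTagSketch {

  private ValueTagSketch() {
  }

  /**
   * Enables the (hypothetical) enforcement feature.
   *
   * <p>When enabled, task classes are validated before containers launch.
   *
   * <p>Property type: boolean
   *
   * <p>Value: {@value}
   */
  public static final String SKETCH_ENABLED = "sketch.security.enabled";

  /**
   * Default for {@link #SKETCH_ENABLED}.
   *
   * <p>Value: {@value}
   */
  public static final boolean DEFAULT_SKETCH_ENABLED = false;

  public static void main(String[] args) {
    // The constants themselves are unchanged; only the generated javadoc differs.
    System.out.println(SKETCH_ENABLED + " = " + DEFAULT_SKETCH_ENABLED);
  }
}
```

Note that `{@value}` is only expanded for compile-time constant fields, so it would apply to the `String` keys and the `boolean` default but not to the `String[]` defaults.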
   



##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRConfig.java:
##########
@@ -133,5 +133,82 @@ public interface MRConfig {
   boolean DEFAULT_MASTER_WEBAPP_UI_ACTIONS_ENABLED = true;
   String MULTIPLE_OUTPUTS_CLOSE_THREAD_COUNT = "mapreduce.multiple-outputs-close-threads";
   int DEFAULT_MULTIPLE_OUTPUTS_CLOSE_THREAD_COUNT = 10;
+
+  /**
+   * Enables MapReduce Task-Level Security Enforcement.
+   *

Review Comment:
   nit: add <p> to put in newlines; needed for real javadocs, including those IDEs can pop up



##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TaskLevelSecurityEnforcer.java:
##########
@@ -0,0 +1,103 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.mapreduce.v2.app.security.authorize;
+
+import java.util.Arrays;
+import java.util.List;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapreduce.MRConfig;
+import org.apache.hadoop.mapreduce.MRJobConfig;
+
+/**
+ * Enforces task-level security rules for MapReduce jobs.
+ *
+ * <p>This security enforcement mechanism validates whether the user who submitted
+ * a job is allowed to execute the mapper/reducer/task classes defined in the job
+ * configuration. The check is performed inside the Application Master before
+ * task containers are launched.</p>
+ * <p>If the user is not on the allowed list and any job property within the configured
+ * security property domain references a denied class/prefix, a
+ * {@link TaskLevelSecurityException} is thrown and the job is rejected.</p>
+ * <p>This prevents unauthorized or unsafe custom code from running inside
+ * cluster containers.</p>
+ */
+public final class TaskLevelSecurityEnforcer {
+  private static final Logger LOG = LoggerFactory.getLogger(TaskLevelSecurityEnforcer.class);
+
+  /**
+   * Default constructor.
+   */
+  private TaskLevelSecurityEnforcer() {
+  }
+
+  /**
+   * Validates a MapReduce job's configuration against the cluster's task-level
+   * security policy.
+   *
+   * <p>The method performs the following steps:</p>
+   * <ol>
+   *   <li>Check whether task-level security is enabled.</li>
+   *   <li>Allow the job immediately if the user is on the configured allowed-users list.</li>
+   *   <li>Retrieve the security property domain (list of job configuration keys to inspect).</li>
+   *   <li>Retrieve the list of denied task class prefixes.</li>
+   *   <li>For each domain property, check whether its value begins with any denied prefix.</li>
+   *   <li>If a match is found, reject the job by throwing {@link TaskLevelSecurityException}.</li>
+   * </ol>
+   *
+   * @param conf the job configuration to validate
+   * @throws TaskLevelSecurityException if the user is not authorized to use one of the task classes
+   */
+  public static void validate(JobConf conf) throws TaskLevelSecurityException {
+    if (!conf.getBoolean(MRConfig.SECURITY_ENABLED, MRConfig.DEFAULT_SECURITY_ENABLED)) {
+      LOG.debug("The {} is disabled",  MRConfig.SECURITY_ENABLED);
+      return;
+    }
+
+    String currentUser = conf.get(MRJobConfig.USER_NAME);
+    List<String> allowedUsers = Arrays.asList(conf.getTrimmedStrings(
+        MRConfig.SECURITY_ALLOWED_USERS,
+        MRConfig.DEFAULT_SECURITY_ALLOWED_USERS
+    ));
+    if (allowedUsers.contains(currentUser)) {
+      LOG.debug("The {} is allowed to execute every task", currentUser);
+      return;
+    }
+
+    String[] propertyDomain = conf.getTrimmedStrings(
+        MRConfig.SECURITY_PROPERTY_DOMAIN,
+        MRConfig.DEFAULT_SECURITY_PROPERTY_DOMAIN
+    );
+    String[] deniedTasks = conf.getTrimmedStrings(
+        MRConfig.SECURITY_DENIED_TASKS,
+        MRConfig.DEFAULT_SECURITY_DENIED_TASKS
+    );
+    for (String property : propertyDomain) {
+      String propertyValue = conf.get(property, "");

Review Comment:
   getTrimmed
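
A minimal sketch of the prefix check with a trimmed property value, using only the JDK. `DeniedPrefixSketch` and `isDenied` are hypothetical names; the loop follows the javadoc above ("check whether its value begins with any denied prefix") and is an assumption about the unquoted remainder of the method, not a copy of it.

```java
/** Hypothetical stand-in for the denied-prefix check; not part of the patch. */
public class DeniedPrefixSketch {

  /** Returns true if the trimmed value starts with any non-empty denied prefix. */
  public static boolean isDenied(String rawValue, String[] deniedPrefixes) {
    // Trim first, mimicking what a getTrimmed()-style lookup would return.
    String value = rawValue == null ? "" : rawValue.trim();
    for (String prefix : deniedPrefixes) {
      if (!prefix.isEmpty() && value.startsWith(prefix)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    String[] denied = {"org.apache.hadoop.streaming"};
    String padded = "  org.apache.hadoop.streaming.PipeMapper";

    // Without trimming, the leading spaces defeat startsWith().
    System.out.println("untrimmed denied: " + padded.startsWith(denied[0]));
    System.out.println("trimmed denied:   " + isDenied(padded, denied));
  }
}
```

If the code that later loads the class trims the value while the check does not, a padded class name could slip past the deny list, which is the case the trimmed lookup guards against.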



##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRConfig.java:
##########
@@ -133,5 +133,82 @@ public interface MRConfig {
   boolean DEFAULT_MASTER_WEBAPP_UI_ACTIONS_ENABLED = true;
   String MULTIPLE_OUTPUTS_CLOSE_THREAD_COUNT = "mapreduce.multiple-outputs-close-threads";
   int DEFAULT_MULTIPLE_OUTPUTS_CLOSE_THREAD_COUNT = 10;
+
+  /**
+   * Enables MapReduce Task-Level Security Enforcement.
+   *
+   * When enabled, the Application Master performs validation of user-submitted
+   * mapper, reducer, and other task-related classes before launching containers.
+   * This mechanism protects the cluster from running disallowed or unsafe task
+   * implementations as defined by administrator-controlled policies.
+   *
+   * Property type: boolean
+   * Default: false (security disabled)
+   */
+  String SECURITY_ENABLED = "mapreduce.security.enabled";
+  boolean DEFAULT_SECURITY_ENABLED = false;
+
+  /**
+   * MapReduce Task-Level Security Enforcement: Property Domain
+   *
+   * Defines the set of MapReduce configuration keys that represent user-supplied
+   * class names involved in task execution (e.g., mapper, reducer, partitioner).
+   * The Application Master examines the values of these properties and checks
+   * whether any referenced class is listed in {@link #SECURITY_DENIED_TASKS}.
+   * Administrators may override this list to expand or restrict the validation
+   * domain.
+   *
+   * Property type: list of configuration keys
+   * Default: all known task-level class properties (see list below)
+   */
+  String SECURITY_PROPERTY_DOMAIN = "mapreduce.security.property-domain";
+  String[] DEFAULT_SECURITY_PROPERTY_DOMAIN = {
+      "mapreduce.job.combine.class",
+      "mapreduce.job.combiner.group.comparator.class",
+      "mapreduce.job.end-notification.custom-notifier-class",
+      "mapreduce.job.inputformat.class",
+      "mapreduce.job.map.class",
+      "mapreduce.job.map.output.collector.class",
+      "mapreduce.job.output.group.comparator.class",
+      "mapreduce.job.output.key.class",
+      "mapreduce.job.output.key.comparator.class",
+      "mapreduce.job.output.value.class",
+      "mapreduce.job.outputformat.class",
+      "mapreduce.job.partitioner.class",
+      "mapreduce.job.reduce.class",
+      "mapreduce.map.output.key.class",
+      "mapreduce.map.output.value.class"
+  };
+
+  /**
+   * MapReduce Task-Level Security Enforcement: Denied Tasks
+   *

Review Comment:
   same comment on <p> and the {@value} entry



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

