[jira] [Commented] (HADOOP-19925) Create a SECURITY.md file to define the security model for the AI tools

ASF GitHub Bot (Jira) Thu, 25 Jun 2026 09:57:11 -0700


    [ 
https://issues.apache.org/jira/browse/HADOOP-19925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18091674#comment-18091674
 ]


ASF GitHub Bot commented on HADOOP-19925:
-----------------------------------------

steveloughran commented on code in PR #8562:
URL: https://github.com/apache/hadoop/pull/8562#discussion_r3476162154


##########
SECURITY.md:
##########
@@ -0,0 +1,566 @@
+SPDX-License-Identifier: Apache-2.0
+
+# Apache Hadoop Security Model
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
+NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and
+"OPTIONAL" in this document are to be interpreted as described in
+RFC 2119.
+
+This document defines the security model of Apache Hadoop: the deployments it 
is
+designed to protect, the boundaries it defends, and — equally importantly — the
+things which are *not* vulnerabilities. It exists for human reporters and for
+anyone using automated or AI-assisted tooling to look for security issues.
+
+**TL;DR: Hadoop's security model defends a Kerberos-secured cluster running on 
a
+trusted operating system, behind a network perimeter, with a valid site
+configuration. Findings which only apply outside that model are bugs, not
+vulnerabilities.**
+
+## Before Filing a Report (Including AI-Assisted Reports)
+
+The deployment Hadoop's security model defends is a **Kerberos-secured 
cluster**.
+Many findings that look like vulnerabilities in other contexts are not
+vulnerabilities here, because the surrounding deployment is trusted by design.
+
+You *MUST NOT* file a security report for:
+
+- Issues that require the operator to edit their own Hadoop site configuration,
+  place malicious files on their own classpath, or pass malicious arguments to
+  their own command invocation.
+- **Job submission running user-supplied code.** Submitting work to YARN or
+  MapReduce executes the submitter's code as the submitter's identity. That is
+  the product, not a vulnerability. See the threat model below.
+- **Denial of service at scale.** A large Hadoop cluster exists to execute jobs
+  at scale; such a cluster can itself be used to mount distributed attacks, and
+  authenticated users can exhaust resources. Resource exhaustion and 
performance
+  degradation from legitimate authenticated use are out of scope.
+- Issues that require the attacker to already hold cluster or remote-store
+  credentials, a valid Kerberos principal, or local disk access.
+- Anything against the **default insecure (non-Kerberos) mode** — it is 
insecure
+  by design (see the deployment model below).
+- **Transitive CVEs** in dependencies Hadoop builds or ships against. See
+  [Third Party Modules](#third-party-modules).
+- Raw **scanner output** (Snyk, Dependabot, Trivy, Zizmor, etc.) without a
+  reproducer against the current `trunk` branch.
+- Theoretical findings ("an attacker who could X might then Y") without a
+  reproduction.
+
+
+A valid report includes:
+
+- The Hadoop version, and ideally the git SHA it was reproduced against.
+- The exact steps, configuration, and commands used to reproduce it.
+- The observed in-scope failure, and what was expected instead.
+- Where a CVE/CVSS score is claimed, the reasoning behind that score.
+
+### For Partly/Fully AI-Generated Reports
+
+AI-assisted reports are accepted **only** if the submitter has verified the
+finding by hand against current source and includes a runnable reproducer.
+
+In addition, the submitter of an AI-generated report is
+
+1. REQUIRED to understand what Hadoop is, to understand the claimed 
vulnerability,
+and to be able to explain it in their own words — including justifying any 
claimed CVE or CVSS
+scores. If the submitter is unable to do this, then any credit for a resulting
+CVE will be assigned to the AI tool alone, and not to the submitter.
+
+2. MUST declare the AI tool used, and provide the prompt.
+   The prompt is a key part of AI tool reports, and we need to be able to 
track/replicate these.

Review Comment:
   how about "be willing to provide the session log"?





> Create a SECURITY.md file to define the security model for the AI tools
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-19925
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19925
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: security
>    Affects Versions: 3.6.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>              Labels: pull-request-available
>
> Write a SECURITY.md file to scope AI generated security reports to sensible 
> deployments, and also for humans. Base off best work of other projects.
> - explain deployments and their security boundaries (dev, kerberos, isolated 
> cloud)
> - only accept security issues against kerberos
> - anything which doesn't lead to privilege escalation is a bug
> - anything which hurts perf is just a bug
> - we expect site config to be valid. If that can be manipulated, game over.
> - job submission is remote code execution so no, you don't get a CVE for that
> I will include dev and CI as targets of attacks and that we do care here.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (HADOOP-19925) Create a SECURITY.md file to define the security model for the AI tools

Reply via email to