[jira] [Commented] (HADOOP-19925) Create a SECURITY.md file to define the security model for the AI tools

ASF GitHub Bot (Jira) Wed, 24 Jun 2026 04:02:25 -0700


    [ 
https://issues.apache.org/jira/browse/HADOOP-19925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18091169#comment-18091169
 ]


ASF GitHub Bot commented on HADOOP-19925:
-----------------------------------------

steveloughran commented on code in PR #8562:
URL: https://github.com/apache/hadoop/pull/8562#discussion_r3466543147


##########
SECURITY.md:
##########
@@ -0,0 +1,491 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+# Apache Hadoop Security Model
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
+NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and
+"OPTIONAL" in this document are to be interpreted as described in
+RFC 2119.
+
+This document defines the security model of Apache Hadoop: the deployments it 
is
+designed to protect, the boundaries it defends, and — equally importantly — the
+things which are *not* vulnerabilities. It exists for human reporters and for
+anyone using automated or AI-assisted tooling to look for security issues.
+
+**TL;DR: Hadoop's security model defends a Kerberos-secured cluster running on 
a
+trusted operating system, behind a network perimeter, with a valid site
+configuration. Findings which only apply outside that model are bugs, not
+vulnerabilities.**
+
+## Before Filing a Report (Including AI-Assisted Reports)
+
+The deployment Hadoop's security model defends is a **Kerberos-secured 
cluster**.
+Many findings that look like vulnerabilities in other contexts are not
+vulnerabilities here, because the surrounding deployment is trusted by design.
+
+You *MUST NOT* file a security report for:
+
+- Issues that require the operator to edit their own Hadoop site configuration,
+  place malicious files on their own classpath, or pass malicious arguments to
+  their own command invocation.
+- **Job submission running user-supplied code.** Submitting work to YARN or
+  MapReduce executes the submitter's code as the submitter's identity. That is
+  the product, not a vulnerability. See the threat model below.
+- **Denial of service at scale.** A large Hadoop cluster exists to execute jobs
+  at scale; such a cluster can itself be used to mount distributed attacks, and
+  authenticated users can exhaust resources. Resource exhaustion and 
performance
+  degradation from legitimate authenticated use are out of scope.
+- Issues that require the attacker to already hold cluster or remote-store
+  credentials, a valid Kerberos principal, or local disk access.
+- Anything against the **default insecure (non-Kerberos) mode** — it is 
insecure
+  by design (see the deployment model below).
+- **Transitive CVEs** in dependencies Hadoop builds or ships against. See
+  [Third Party Modules](#third-party-modules).
+- Raw **scanner output** (Snyk, Dependabot, Trivy, Zizmor, etc.) without a
+  reproducer against the current `trunk` branch.
+- Theoretical findings ("an attacker who could X might then Y") without a
+  reproduction.
+
+
+A valid report includes:
+
+- The Hadoop version, and ideally the git SHA it was reproduced against.
+- The exact steps, configuration, and commands used to reproduce it.
+- The observed in-scope failure, and what was expected instead.
+- Where a CVE/CVSS score is claimed, the reasoning behind that score.
+
+### For Partly/Fully AI-Generated Reports
+
+AI-assisted reports are accepted **only** if the submitter has verified the
+finding by hand against current source and includes a runnable reproducer.
+
+In addition, the submitter of an AI-generated report is 
+
+1. REQUIRED to understand what Hadoop is, to understand the claimed 
vulnerability,
+and to be able to explain it in their own words — including justifying any 
claimed CVE or CVSS
+scores. If the submitter is unable to do this, then any credit for a resulting
+CVE will be assigned to the AI tool alone, and not to the submitter.
+
+2. MUST declare the AI tool used, and provide the prompt.
+   The prompt is a key part of AI tool reports, and we need to be able to 
track/replicate these.
+
+*Unverified LLM-generated reports waste maintainer time and will be closed
+without further response.*
+
+
+## Reporting a Vulnerability
+
+Report security vulnerabilities in Apache Hadoop privately to
+**[email protected]**. Do **not** open a public JIRA issue, GitHub
+issue, or pull request for an unfixed vulnerability.
+
+See the Apache Software Foundation's
+[guidelines for reporting security issues](https://www.apache.org/security/) 
for
+the responsible-disclosure process that applies to all ASF projects.
+
+## Third Party Modules
+
+### Reporting a Known CVE in a Hadoop Dependency
+
+Do not report the existence of a published CVE in a Hadoop dependency
+to the security list. These are published and do not need to be treated as
+confidential.
+
+These are considered improvements in the project, and are managed in
+the project's [issue tracker](https://issues.apache.org/jira/issues/).
+1. Search for any existing issue covering the dependency upgrade.
+2. If it exists, read it, its discussion, the PRs etc, and see what versions
+   it has been merged to.
+3. If it hasn't been merged, look at why and get involved: major work is 
likely to be
+   needed.
+4. If there isn't an issue, create one and start work on the PR!

Review Comment:
   yeah, vex is something to think about





> Create a SECURITY.md file to define the security model for the AI tools
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-19925
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19925
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: security
>    Affects Versions: 3.6.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>              Labels: pull-request-available
>
> Write a SECURITY.md file to scope AI generated security reports to sensible 
> deployments, and also for humans. Base off best work of other projects.
> - explain deployments and their security boundaries (dev, kerberos, isolated 
> cloud)
> - only accept security issues against kerberos
> - anything which doesn't lead to privilege escalation is a bug
> - anything which hurts perf is just a bug
> - we expect site config to be valid. If that can be manipulated, game over.
> - job submission is remote code execution so no, you don't get a CVE for that
> I will include dev and CI as targets of attacks and that we do care here.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (HADOOP-19925) Create a SECURITY.md file to define the security model for the AI tools

Reply via email to