[
https://issues.apache.org/jira/browse/HADOOP-19925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18091169#comment-18091169
]
ASF GitHub Bot commented on HADOOP-19925:
-----------------------------------------
steveloughran commented on code in PR #8562:
URL: https://github.com/apache/hadoop/pull/8562#discussion_r3466543147
##########
SECURITY.md:
##########
@@ -0,0 +1,491 @@
+<!---
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License. See accompanying LICENSE file.
+-->
+
+# Apache Hadoop Security Model
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
+NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
+"OPTIONAL" in this document are to be interpreted as described in
+RFC 2119.
+
+This document defines the security model of Apache Hadoop: the deployments it is
+designed to protect, the boundaries it defends, and — equally importantly — the
+things which are *not* vulnerabilities. It exists for human reporters and for
+anyone using automated or AI-assisted tooling to look for security issues.
+
+**TL;DR: Hadoop's security model defends a Kerberos-secured cluster running on a
+trusted operating system, behind a network perimeter, with a valid site
+configuration. Findings which only apply outside that model are bugs, not
+vulnerabilities.**
+
+## Before Filing a Report (Including AI-Assisted Reports)
+
+The deployment Hadoop's security model defends is a **Kerberos-secured cluster**.
+Many findings that look like vulnerabilities in other contexts are not
+vulnerabilities here, because the surrounding deployment is trusted by design.
+
+You *MUST NOT* file a security report for:
+
+- Issues that require the operator to edit their own Hadoop site configuration,
+ place malicious files on their own classpath, or pass malicious arguments to
+ their own command invocation.
+- **Job submission running user-supplied code.** Submitting work to YARN or
+ MapReduce executes the submitter's code as the submitter's identity. That is
+ the product, not a vulnerability. See the threat model below.
+- **Denial of service at scale.** A large Hadoop cluster exists to execute jobs
+ at scale; such a cluster can itself be used to mount distributed attacks, and
+ authenticated users can exhaust resources. Resource exhaustion and
+ performance degradation from legitimate authenticated use are out of scope.
+- Issues that require the attacker to already hold cluster or remote-store
+ credentials, a valid Kerberos principal, or local disk access.
+- Anything against the **default insecure (non-Kerberos) mode** — it is
+ insecure by design (see the deployment model below).
+- **Transitive CVEs** in dependencies Hadoop builds or ships against. See
+ [Third Party Modules](#third-party-modules).
+- Raw **scanner output** (Snyk, Dependabot, Trivy, Zizmor, etc.) without a
+ reproducer against the current `trunk` branch.
+- Theoretical findings ("an attacker who could X might then Y") without a
+ reproduction.
+
+
+A valid report includes:
+
+- The Hadoop version, and ideally the git SHA it was reproduced against.
+- The exact steps, configuration, and commands used to reproduce it.
+- The observed in-scope failure, and what was expected instead.
+- Where a CVE/CVSS score is claimed, the reasoning behind that score.
+
+### For Partly/Fully AI-Generated Reports
+
+AI-assisted reports are accepted **only** if the submitter has verified the
+finding by hand against current source and includes a runnable reproducer.
+
+In addition, the submitter of an AI-generated report is
+
+1. REQUIRED to understand what Hadoop is, to understand the claimed vulnerability,
+and to be able to explain it in their own words — including justifying any
+claimed CVE or CVSS scores. If the submitter is unable to do this, then any credit for a resulting
+CVE will be assigned to the AI tool alone, and not to the submitter.
+
+2. MUST declare the AI tool used, and provide the prompt.
+ The prompt is a key part of AI tool reports, and we need to be able to track/replicate these.
+
+*Unverified LLM-generated reports waste maintainer time and will be closed
+without further response.*
+
+
+## Reporting a Vulnerability
+
+Report security vulnerabilities in Apache Hadoop privately to
+**[email protected]**. Do **not** open a public JIRA issue, GitHub
+issue, or pull request for an unfixed vulnerability.
+
+See the Apache Software Foundation's
+[guidelines for reporting security issues](https://www.apache.org/security/)
+for the responsible-disclosure process that applies to all ASF projects.
+
+## Third Party Modules
+
+### Reporting a Known CVE in a Hadoop Dependency
+
+Do not report the existence of a published CVE in a Hadoop dependency
+to the security list. These are published and do not need to be treated as
+confidential.
+
+These are considered improvements in the project, and are managed in
+the project's [issue tracker](https://issues.apache.org/jira/issues/).
+1. Search for any existing issue covering the dependency upgrade.
+2. If it exists, read it, its discussion, the PRs etc, and see what versions
+ it has been merged to.
+3. If it hasn't been merged, look at why and get involved: major work is
+ likely to be needed.
+4. If there isn't an issue, create one and start work on the PR!
Review Comment:
Yeah, VEX is something to think about.
> Create a SECURITY.md file to define the security model for the AI tools
> -----------------------------------------------------------------------
>
> Key: HADOOP-19925
> URL: https://issues.apache.org/jira/browse/HADOOP-19925
> Project: Hadoop Common
> Issue Type: Improvement
> Components: security
> Affects Versions: 3.6.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Major
> Labels: pull-request-available
>
> Write a SECURITY.md file to scope AI generated security reports to sensible
> deployments, and also for humans. Base off best work of other projects.
> - explain deployments and their security boundaries (dev, kerberos, isolated
> cloud)
> - only accept security issues against kerberos
> - anything which doesn't lead to privilege escalation is a bug
> - anything which hurts perf is just a bug
> - we expect site config to be valid. If that can be manipulated, game over.
> - job submission is remote code execution so no, you don't get a CVE for that
> I will include dev and CI as targets of attacks and that we do care here.