ppkarwasz commented on code in PR #8562:
URL: https://github.com/apache/hadoop/pull/8562#discussion_r3464708548


##########
SECURITY.md:
##########
@@ -0,0 +1,491 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+# Apache Hadoop Security Model
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
+NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and
+"OPTIONAL" in this document are to be interpreted as described in
+RFC 2119.
+
+This document defines the security model of Apache Hadoop: the deployments it is
+designed to protect, the boundaries it defends, and — equally importantly — the
+things which are *not* vulnerabilities. It exists for human reporters and for
+anyone using automated or AI-assisted tooling to look for security issues.
+
+**TL;DR: Hadoop's security model defends a Kerberos-secured cluster running on a
+trusted operating system, behind a network perimeter, with a valid site
+configuration. Findings which only apply outside that model are bugs, not
+vulnerabilities.**
+
+## Before Filing a Report (Including AI-Assisted Reports)
+
+The deployment Hadoop's security model defends is a **Kerberos-secured cluster**.
+Many findings that look like vulnerabilities in other contexts are not
+vulnerabilities here, because the surrounding deployment is trusted by design.
+
+You *MUST NOT* file a security report for:
+
+- Issues that require the operator to edit their own Hadoop site configuration,
+  place malicious files on their own classpath, or pass malicious arguments to
+  their own command invocation.
+- **Job submission running user-supplied code.** Submitting work to YARN or
+  MapReduce executes the submitter's code as the submitter's identity. That is
+  the product, not a vulnerability. See the threat model below.
+- **Denial of service at scale.** A large Hadoop cluster exists to execute jobs
+  at scale; such a cluster can itself be used to mount distributed attacks, and
+  authenticated users can exhaust resources. Resource exhaustion and performance
+  degradation from legitimate authenticated use are out of scope.
+- Issues that require the attacker to already hold cluster or remote-store
+  credentials, a valid Kerberos principal, or local disk access.
+- Anything against the **default insecure (non-Kerberos) mode** — it is insecure
+  by design (see the deployment model below).
+- **Transitive CVEs** in dependencies Hadoop builds or ships against. See
+  [Third Party Modules](#third-party-modules).
+- Raw **scanner output** (Snyk, Dependabot, Trivy, Zizmor, etc.) without a
+  reproducer against the current `trunk` branch.
+- Theoretical findings ("an attacker who could X might then Y") without a
+  reproduction.
+
+
+A valid report includes:
+
+- The Hadoop version, and ideally the git SHA it was reproduced against.
+- The exact steps, configuration, and commands used to reproduce it.
+- The observed in-scope failure, and what was expected instead.
+- Where a CVE/CVSS score is claimed, the reasoning behind that score.
+
+### For Partly/Fully AI-Generated Reports
+
+AI-assisted reports are accepted **only** if the submitter has verified the
+finding by hand against current source and includes a runnable reproducer.
+
+In addition, the submitter of an AI-generated report is 
+
+1. REQUIRED to understand what Hadoop is, to understand the claimed vulnerability,
+and to be able to explain it in their own words — including justifying any claimed CVE or CVSS
+scores. If the submitter is unable to do this, then any credit for a resulting
+CVE will be assigned to the AI tool alone, and not to the submitter.
+
+2. MUST declare the AI tool used, and provide the prompt.
+   The prompt is a key part of AI tool reports, and we need to be able to track/replicate these.
+
+*Unverified LLM-generated reports waste maintainer time and will be closed
+without further response.*
+
+
+## Reporting a Vulnerability
+
+Report security vulnerabilities in Apache Hadoop privately to
+**[email protected]**. Do **not** open a public JIRA issue, GitHub
+issue, or pull request for an unfixed vulnerability.

Review Comment:
   I think you should add:
   
   > Do **not** cc any other `@hadoop.apache.org` address: these are public mailing lists.
   
   We have seen a surge of security reports sent “by mistake” to a public mailing list.



##########
AGENTS.md:
##########
@@ -0,0 +1,30 @@
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.

Review Comment:
   I would use a `SPDX-License-Identifier: Apache-2.0` line here to minimize the tokens ingested by AI agents and to prevent their context from being “polluted” with the license terms instead of what they should do.
   



##########
SECURITY.md:
##########
@@ -0,0 +1,491 @@
+
+See the Apache Software Foundation's
+[guidelines for reporting security issues](https://www.apache.org/security/) for
+the responsible-disclosure process that applies to all ASF projects.
+
+## Third Party Modules
+
+### Reporting a Known CVE in a Hadoop Dependency
+
+Do not report the existence of a published CVE in a Hadoop dependency
+to the security list. These are published and do not need to be treated as
+confidential.
+
+These are considered improvements in the project, and are managed in
+the project's [issue tracker](https://issues.apache.org/jira/issues/).
+1. Search for any existing issue covering the dependency upgrade.
+2. If it exists, read it, its discussion, the PRs etc, and see what versions
+   it has been merged to.
+3. If it hasn't been merged, look at why and get involved: major work is likely to be
+   needed.
+4. If there isn't an issue, create one and start work on the PR!
+
+Tip: an easy way to check for the version of a library to ship in the trunk
+release of hadoop is the [LICENSE-binary](./LICENSE-binary) file.
+
+Please do not send an email listing the CVEs an automated scan
+tool reported and requesting updates, timelines etc.
+Open source development is a community process, and addressing this is done
+in the [developer mailing lists](https://hadoop.apache.org/mailing_lists.html).
+Join the community to help get your needs addressed.
+
+### Providing Advance Warning of a Critical CVE in a Hadoop Dependency
+
+If a team providing a library which Hadoop bundles has a critical CVE which
+a forthcoming fix will correct, they are encouraged to notify the hadoop security
+list so we can co-ordinate releases.
+
+We treat all such reports as confidential.
+
+### Reporting a Newly-Discovered Vulnerability in a Third-Party Module
+
+Security bugs in third-party modules (the JVM, the Kerberos infrastructure, cloud
+SDKs, connectors, or any other dependency) should be reported to their respective
+maintainers, through their own security-reporting mechanisms — after verifying
+the issue is in scope of *their* threat model and reproduces against *their*
+current release.
+
+
+## Supported Versions
+
+Security fixes are made only to the most recent Apache Hadoop release line(s).
+Older release lines are end-of-life and do not receive security updates; the
+remedy for a vulnerability in an old line is to upgrade. Refer to the
+[Apache Hadoop release and download policy](https://hadoop.apache.org/releases.html)
+for which lines are currently maintained. A report MUST be reproducible against a
+maintained release or the current `trunk` branch.
+
+## The Hadoop Threat Model
+
+In the Hadoop threat model there are **trusted elements**. Vulnerabilities that
+require the compromise of these trusted elements are outside the scope of the
+model:
+
+- **The underlying operating system is trusted.** Hadoop relies on OS process
+  isolation, file permissions, and (where required) OS-level disk encryption.
+  An attack that first requires the OS to be compromised or misconfigured is out
+  of scope.
+- **Valid site configuration is trusted.** We expect `core-site.xml`,
+  `hdfs-site.xml`, `yarn-site.xml` and the rest of the site configuration to be
+  valid and to be writable only by trusted administrators. If an attacker can
+  manipulate the site configuration, the game is already over — that is out of
+  scope.
+
+Within that model, the boundary Hadoop **defends** is **privilege escalation
+across an authenticated boundary within a Kerberos-secured cluster**.
+Examples of in-scope issues are:
+
+- A user acting as another user, as a service, or as a superuser without the
+  authorization to do so.
+- Bypassing service-level authorization / ACLs
+  (see [Service Level Authorization](hadoop-common-project/hadoop-common/src/site/markdown/ServiceLevelAuth.md)).
+- Forging, leaking, or improperly reusing delegation tokens.
+- Defeating the constraints on proxy/superuser impersonation
+  (see [Proxy user - Superusers Acting On Behalf Of Other Users](hadoop-common-project/hadoop-common/src/site/markdown/Superusers.md)).
+
+Further properties of the model:
+
+- **Hadoop clusters are never web-facing.** They are deployed behind a network
+  perimeter; network rules are expected to prevent access by untrusted
+  principals. A report which assumes a cluster is directly exposed to the public
+  internet is not in scope.
+- **Wire encryption is optional and controlled by site configuration.** Network
+  traffic between Hadoop components may or may not be encrypted, depending on the
+  deployment's configuration. The absence of wire encryption when it has not been
+  enabled is not a vulnerability.
+
+Relevant operational security documentation:
+
+- [Hadoop in Secure Mode](hadoop-common-project/hadoop-common/src/site/markdown/SecureMode.md)
+- [Service Level Authorization](hadoop-common-project/hadoop-common/src/site/markdown/ServiceLevelAuth.md)
+- [Authentication for Hadoop HTTP web-consoles](hadoop-common-project/hadoop-common/src/site/markdown/HttpAuthentication.md)
+- [Proxy user - Superusers](hadoop-common-project/hadoop-common/src/site/markdown/Superusers.md)
+- [Credential Provider API](hadoop-common-project/hadoop-common/src/site/markdown/CredentialProviderAPI.md)
+- [YARN Application Security](hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnApplicationSecurity.md)
+- [Transparent Encryption in HDFS](hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/TransparentEncryption.md)
+
+## Deployment Threat Model
+
+Hadoop is deployed in a number of ways, with different security boundaries.
+
+### Standalone (Insecure) Mode
+
+In its standalone configuration Hadoop performs no real authentication. Anyone with
+network access to the cluster has full access to its data and can submit work.
+
+This mode is *intended* to run only on a trusted, network-isolated host or
+network. It is insecure **by design**. "The unsecured cluster has no security" is
+not a vulnerability, and arbitrary data access against a non-Kerberos cluster is
+inherent in security being disabled.
+
+It should only be used for standalone development/test environments, with firewalls preventing remote access.
+It can then be used to test Hadoop and applications running on top of it.
+
+
+### Secure (Kerberos) Clusters
+
+This is the deployment the security model defends, as described in
+[The Hadoop threat model](#the-hadoop-threat-model) above: Kerberos
+authentication, service-level authorization, delegation tokens, and constrained
+proxy/superuser impersonation.
+
+It is the expected deployment of production physical clusters.
+1. A trusted kerberos system is used to authenticate principals.
+2. Services have been issued with credentials in files, which are secured on the physical hosts.
+3. Users of the cluster all authenticate with the kerberos system for their access
+4. Access to the cluster may be via a proxy mechanism.
+5. The HDFS filesystem uses kerberos to authenticate hdfs nodes and services themselves, other cluster services (YARN, Apache ZooKeeper etc) and callers.
+6. HDFS block tokens are issued by the HDFS Name Node to grant data access to authenticated principals;
+   the possessor of a token may access a block of data on a data node with the permissions in that token,
+   without the need to supply any further authentication information.
+
+Hadoop services issue _delegation tokens_: an authenticated principal obtains a token directly from a service such as HDFS, Apache HBase, Apache Hive, Apache Knox and more.
+YARN distributes these tokens to an application's containers and renews them on the application's behalf, so tasks can authenticate to those services without holding Kerberos credentials themselves.
+These tokens have an independent life from the kerberos credentials
+* They have a limited lifespan of a number of hours.
+* They can be cancelled: the issuing service MUST then reject requests using them as authentication.
+* They can be renewed: before their lifespan expires the renewer requests the issuing service to extend their lifespan.
+
+The details of these tokens or how issuing, cancellation and renewal are managed are not covered in this document.
+Hadoop and applications MUST safely marshall and store these tokens; if they are published in any form then permissions are being leaked.
+
+### Transient Cloud Deployments
+
+Hadoop is frequently deployed as a transient cluster in a cloud environment:
+
+- Cloud credentials are supplied to the deployment by the hosting infrastructure
+  — for example AWS IAM roles attached to the VMs/containers, or equivalent
+  mechanisms on other clouds. **These supplied credentials, and the access they
+  grant, are the trust boundary.** Using credentials provided to the VM or
+  container the code runs in is not a vulnerability.
+- The cluster is **transient** and typically single-tenant: it is created for a
+  workload and destroyed afterwards.
+- **Network rules prevent access by untrusted principals.** As with on-premises
+  clusters, the deployment is not web-facing; the network perimeter is part of
+  the model.
+
+Hadoop clusters MUST NOT be deployed in cloud without network rules to isolate them from the public internet.
+
+
+## Data at Rest and Temporary Files
+
+- **Persisting data encrypted requires HDFS encryption** (see
+  [Transparent Encryption in HDFS](hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/TransparentEncryption.md)).
+  Where encryption has been configured, a failure of the code to actually
+  encrypt the persisted data **is a vulnerability** and should be reported.
+- **Temporary data** is written to local-filesystem temporary directories. The
+  requirement is that the operating system secures and, where required, encrypts
+  these directories — this is part of the trusted-OS assumption. Within that:
+  - Code that creates temporary files and directories **MUST create them and set
+    their permissions atomically.** Creating a file or directory with permissive
+    defaults and then narrowing the permissions in a later step leaves a window
+    in which another local principal can act on it.
+  - A failure to create files/directories and set their permissions atomically
+    **is an issue** and should be reported.
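
A minimal Java sketch of the atomic-creation rule quoted above; it is an illustration, not code from the PR. The class name `SecureTemp` and its two method names are hypothetical, and the sketch assumes a POSIX file system (on other file systems `java.nio.file` rejects the permission attribute with `UnsupportedOperationException`). The permissions are handed to the creation call itself, so the file or directory never exists with wider access.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileAttribute;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical helper, for illustration only; not part of Hadoop.
final class SecureTemp {
  private SecureTemp() {}

  /** Creates a temporary directory that is owner-only (rwx------) from the moment it exists. */
  static Path createPrivateTempDir(Path parent, String prefix) throws IOException {
    FileAttribute<Set<PosixFilePermission>> ownerOnly =
        PosixFilePermissions.asFileAttribute(PosixFilePermissions.fromString("rwx------"));
    // Permissions are applied as part of creation, not in a later chmod-style step.
    return Files.createTempDirectory(parent, prefix, ownerOnly);
  }

  /** Creates a temporary file that is owner-only (rw-------) from the moment it exists. */
  static Path createPrivateTempFile(Path dir, String prefix, String suffix) throws IOException {
    FileAttribute<Set<PosixFilePermission>> ownerOnly =
        PosixFilePermissions.asFileAttribute(PosixFilePermissions.fromString("rw-------"));
    return Files.createTempFile(dir, prefix, suffix, ownerOnly);
  }
}
```

The pattern the rule forbids is the two-step form, for example `new File(path).mkdir()` followed later by `setReadable`/`setWritable` calls to narrow access.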
+
+## Secrets and Logging
+
+Leaking secrets into logs is [CWE-532: Insertion of Sensitive Information into
+Log File](https://cwe.mitre.org/data/definitions/532.html). The following rules
+apply to Hadoop code:
+
+- Secrets *SHOULD NOT* be logged.
+- **Persistent secrets, long-lived credentials, and encryption secrets (keys,
+  key material, passwords) *MUST NOT* be logged at any level.**
+- **Transient secrets** (for example short-lived tokens) *MUST NOT* be logged at
+  `INFO`, `WARN`, or `ERROR` level, and *SHOULD NOT* be logged at `DEBUG` or
+  `TRACE` level.
+
+Transient secrets are called out specifically because secrets sometimes surface in HTTP/web
+request logs (URLs, headers, query parameters) and are visible
+when third-party components including JDK classes are configured to log at TRACE.
+Preventing logging of these is best-effort.
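
One way to keep URL-borne secrets out of a project's own log statements is to strip the risky URI components before logging. The following is a sketch under stated assumptions, not code from the PR: `LogRedaction` and `withoutSecrets` are hypothetical names, and dropping the user-info, query, and fragment entirely is an illustrative policy choice rather than a rule from this document.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, for illustration only; not part of Hadoop.
final class LogRedaction {
  private LogRedaction() {}

  /**
   * Returns a copy of the URI that is safer to log: the user-info, query,
   * and fragment are dropped, since those are the parts where tokens and
   * passwords tend to appear. Only hierarchical URIs are handled.
   */
  static URI withoutSecrets(URI uri) {
    if (uri.isOpaque()) {
      throw new IllegalArgumentException("only hierarchical URIs are supported");
    }
    try {
      return new URI(uri.getScheme(), null, uri.getHost(), uri.getPort(),
          uri.getPath(), null, null);
    } catch (URISyntaxException e) {
      // The original URI is deliberately left out of the message.
      throw new IllegalArgumentException("cannot rebuild URI for logging", e);
    }
  }
}
```

A caller would then log `LogRedaction.withoutSecrets(requestUri)` rather than the request URI itself. This does nothing for logging performed by third-party components, which is why the text above describes prevention as best-effort.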
+
+
+## Development and CI Threat Model

Review Comment:
   The reporting rules should be different for these issues. Reporters should e-mail [email protected] and only Cc [email protected].
   
   If the threat is credible, INFRA will disable the affected workflow and notify the project.



##########
SECURITY.md:
##########
@@ -0,0 +1,491 @@
+## Third Party Modules
+
+### Reporting a Known CVE in a Hadoop Dependency
+
+Do not report the existence of a published CVE in a Hadoop dependency
+to the security list. These are published and do not need to be treated as
+confidential.
+
+These are considered improvements in the project, and are managed in
+the project's [issue tracker](https://issues.apache.org/jira/issues/).
+1. Search for any existing issue covering the dependency upgrade.
+2. If it exists, read it, its discussion, the PRs etc, and see what versions
+   it has been merged to.
+3. If it hasn't been merged, look at why and get involved: major work is likely to be
+   needed.
+4. If there isn't an issue, create one and start work on the PR!

Review Comment:
   Concerning CVEs in dependencies, please also review apache/security-site#63, which will update the Security Team page: https://security.apache.org/report-dependency/
   
   The two descriptions should be aligned. What we ask of users is not to bother projects by sending:
   
   - Information that a vulnerability is present: everybody knows this,
   - PRs that just upgrade a dependency: Dependabot can do this.
   
   What we ask users to do is:
   
   - Check the **exploitability** of the vulnerability and, depending on the result, either:
     - Publicly tell the project the vulnerability is **not** exploitable: you should provide guidance on how to do that,
     - Privately tell the project the vulnerability is **exploitable** and describe the impact the vulnerability has on Hadoop (“PoC or GTFO”).
   
   Since Hadoop is known for shading dependencies, it might be useful to specify that only two kinds of dependencies matter:
   
   - Those shipped in a Hadoop binary distribution,
   - Those shaded in a Hadoop library.
   
   The rest is irrelevant: Hadoop will not make a new release for a vulnerability in the transitive hull of its Maven dependencies, if that vulnerability is not shipped by any Hadoop artifact and does not require any changes in Hadoop code.


