Claudenw commented on code in PR #677:
URL: https://github.com/apache/creadur-rat/pull/677#discussion_r3485434346


##########
AGENTS.md:
##########


Review Comment:
   There is no license header on this file.



##########
THREAT_MODEL.md:
##########
@@ -0,0 +1,302 @@
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0

Review Comment:
   Can we change this to `https:`?



##########
THREAT_MODEL.md:
##########
@@ -0,0 +1,247 @@
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+# Apache Creadur (RAT) — Threat Model
+
+## §1 Header
+
+- **Project:** Apache Creadur — primarily **RAT (Release Audit Tool)**
+  (`apache/creadur-rat`), with sibling tools **Whisker**
+  (`apache/creadur-whisker`, license-documentation generator) and **Tentacles**
+  (`apache/creadur-tentacles`, release-bundle analyzer). This model is written
+  in `creadur-rat` and covers the Creadur dev-tool family; Whisker/Tentacles
+  share RAT's trust profile (§2).
+- **Written against:** `main`/`master` @ HEAD (2026-06).
+- **Author:** ASF Security team, via the threat-model-producer rubric (Scovetta
+  rubric) at the Creadur PMC's request (path 3).
+- **Status:** DRAFT — under maintainer review (2026-06-10). Not yet ratified.
+- **Reporting cross-reference:** §8-violating findings via the ASF security
+  process ([`SECURITY.md`](SECURITY.md)); §3/§9 findings closed citing this doc.
+- **Provenance legend:** *(documented)* / *(maintainer)* / *(inferred)* — each
+  *(inferred)* has a §14 open question.
+- **Draft confidence:** ~14 documented / 0 maintainer / 16 inferred.
+
+**What it is.** RAT is a **build-time / CLI license-auditing tool**: it walks a
+source tree, matches files against configurable license/header definitions, and
+reports unapproved or unknown licenses. It runs as a **CLI**, an **Ant task**,
+or a **Maven plugin** — always **in the developer's or CI's own process**,
+never as a network service. Whisker generates license documentation; Tentacles
+inspects staged release bundles. None is a server.
+

Review Comment:
   Rat can change the source that it is scanning, but it uses the text from a controlled input file.



##########
SECURITY.md:
##########


Review Comment:
   Have we determined what goes in here yet?



##########
SECURITY.md:
##########


Review Comment:
   No license header on this file.



##########
THREAT_MODEL.md:
##########
@@ -0,0 +1,288 @@
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+# Apache Creadur (RAT) — Threat Model
+
+## §1 Header
+
+- **Project:** Apache Creadur — primarily **RAT (Release Audit Tool)**
+  (`apache/creadur-rat`), with sibling tools **Whisker**
+  (`apache/creadur-whisker`, license-documentation generator) and **Tentacles**
+  (`apache/creadur-tentacles`, release-bundle analyzer). This model is written
+  in `creadur-rat` and covers the Creadur dev-tool family; Whisker/Tentacles
+  share RAT's trust profile (§2).
+- **Written against:** `main`/`master` @ HEAD (2026-06).
+- **Author:** ASF Security team, via the threat-model-producer rubric (Scovetta
+  rubric) at the Creadur PMC's request (path 3).
+- **Status:** DRAFT — under maintainer review (2026-06-10). Not yet ratified.
+- **Reporting cross-reference:** §8-violating findings via the ASF security
+  process ([`SECURITY.md`](SECURITY.md)); §3/§9 findings closed citing this doc.
+- **Provenance legend:** *(documented)* / *(maintainer)* / *(inferred)* — each
+  *(inferred)* has a §14 open question.
+- **Draft confidence:** ~14 documented / 5 maintainer / 11 inferred (maintainer
+  answers folded in from PR #677 review, 2026-06).
+
+**What it is.** RAT is a **build-time / CLI license-auditing tool**: it walks a
+source tree, matches files against configurable license/header definitions, and
+reports unapproved or unknown licenses. It runs as a **CLI**, an **Ant task**,
+or a **Maven plugin** — always **in the developer's or CI's own process**,
+never as a network service. Whisker generates license documentation; Tentacles
+inspects staged release bundles. None is a server.
+
+## §2 Scope and intended use
+
+Intended use: a project maintainer or CI job runs RAT over a codebase to verify
+license compliance before a release or on each change. The two inputs are the
+**tree being audited** (files, including archives RAT descends into) and the
+**RAT configuration** (XML/text license + matcher definitions).
+
+Caller trust level: the developer/CI invoking RAT is trusted. The **inputs are
+normally trusted too** (your own source, your own config) — but RAT is
+sometimes pointed at **untrusted input**: a CI job auditing an untrusted
+contribution/PR, or auditing a downloaded third-party artifact. That is the
+case the model cares about. *(inferred — Q1.)*
+
+**Component families.**
+
+| Family | Entry point | Untrusted-input exposure | In model? |
+| --- | --- | --- | --- |
+| File walking + license matching | `Reporter`, walkers | scanned file **content/paths** | **Yes** |
+| **XML configuration reader** | `XMLConfigurationReader` | the **config** (if attacker-supplied) | **Yes** (XXE surface) |
+| **Archive walker** | `ArchiveWalker` | archives in the tree (zip/jar/tar) | **Yes** (decompression-bomb surface) |
+| CLI / Ant task / Maven plugin | wrappers | invocation args (trusted caller) | wrappers — trusted |
+| **License-header insertion (write mode)** | `--addLicense` / editors | **modifies files in the audited tree** (operator-invoked) | trusted-input (§3) |
+| Whisker / Tentacles | their CLIs | same dev-tool profile | sibling — §2 note |
+
+**Note (PMC, review).** The CLI, Ant task, and Maven plugin front-ends are
+generated from a common option core, so any security-relevant behaviour (or
+gap) in that core transfers automatically to all three UIs — a finding in one
+front-end's handling generally applies to all of them. *(maintainer.)*
+
+## §3 Out of scope (explicit non-goals)
+
+- **RAT as a security scanner.** RAT checks *license* compliance; it is **not**
+  a vulnerability scanner or a security gate. "RAT didn't catch X security
+  issue" is not in scope. *(documented — purpose.)*
+- **Audit *correctness* as a security property.** A missed/false license match
+  is a correctness bug, not a vulnerability (unless it crosses a resource bound,
+  §8). *(inferred.)*
+- **The build/CI environment** RAT runs in, and the trust of the source tree
+  when RAT is deliberately run on your own (trusted) code — the dominant,
+  intended case. Findings whose only impact requires running RAT on input you
+  already trust are `OUT-OF-MODEL: trusted-input`.
+- **Test resources** (the deliberately-odd license fixtures under
+  `*/src/test/resources/`) — those are test data, not a target.
+- **RAT's header-insertion / file-modification mode** (`--addLicense` and the
+  editors) — RAT can *write* license headers into the audited files, mutating
+  the tree. This is explicitly operator-invoked against the operator's own
+  (trusted) sources; a run that modifies files the operator already controls is
+  `OUT-OF-MODEL: trusted-input`. (Raised by the PMC in review — write mode is
+  noted here so the boundary is explicit rather than silent.) *(maintainer.)*
+- **Custom matchers / matcher extensions**
+  (<https://creadur.apache.org/rat/license_def.html#Matchers>) — RAT lets the
+  operator define custom matcher classes in its configuration, and a custom
+  matcher sees the full text of every file selected for scanning. Because the
+  matcher set is operator-defined configuration under the control of whoever
+  runs RAT (not attacker-supplied), a custom matcher reading scanned text is
+  `OUT-OF-MODEL: trusted-input` — the same posture as any operator-supplied
+  extension code (cf. the write mode above). (Raised by the PMC in review.)
+  *(maintainer — Claudenw.)*
+
+## §4 Trust boundaries and data flow
+
+The boundary is **the input RAT is pointed at** — files and configuration.
+RAT's security questions only arise when that input is **untrusted**:
+
+```
+caller invokes RAT (CLI/Ant/Maven) on a directory + a config
+   │ trusted invocation
+   ▼
+read configuration (XMLConfigurationReader) ── XXE surface if config is untrusted

Review Comment:
   Why does this still mention the XXE surface?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
