Copilot commented on code in PR #769:
URL: https://github.com/apache/unomi/pull/769#discussion_r3388666853


##########
THREAT_MODEL.md:
##########
@@ -0,0 +1,190 @@
+<!--
+SPDX-License-Identifier: Apache-2.0
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+    https://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# Apache Unomi — Threat Model (v0 draft)
+
+## §1 Header
+
+- **Project:** Apache Unomi (`apache/unomi`), `master` / 2.x line, against 
which this draft was written. This model covers the **apache/unomi** server; 
`unomi-tracker` (browser tracking client) and `unomi-site` (website) are in the 
engagement scope but are treated here as satellites (see §2/§3).

Review Comment:
   The header claims this threat model was written against the `master` / 2.x 
line, but the repository `master` branch is currently on a 3.x development line 
(e.g., `pom.xml` is `3.1.0-SNAPSHOT`). This is likely to become 
stale/misleading for readers; consider referencing just the branch name (or a 
concrete tag) rather than a major line here.
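
One way to confirm which line a branch is on before naming it in the header is to read the version out of the root `pom.xml`. The sketch below assumes a standard Maven POM with a top-level `<version>` element (falling back to `<parent><version>`). The sample POM, its `org.example` coordinates, and the helper names are illustrative only, not taken from the repository.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Standard Maven POM namespace; elements must be looked up with it.
MAVEN_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def project_version(pom_xml: str) -> Optional[str]:
    """Return the project's declared version, or the parent's if it is inherited."""
    root = ET.fromstring(pom_xml)
    node = root.find("m:version", MAVEN_NS)
    if node is None:
        node = root.find("m:parent/m:version", MAVEN_NS)
    if node is None or not node.text:
        return None
    return node.text.strip()


def major_line(version: str) -> str:
    """Map a version such as '3.1.0-SNAPSHOT' to its major line, '3.x'."""
    return version.split(".", 1)[0] + ".x"


# Hypothetical POM used only to exercise the helpers above.
SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.example</groupId>
  <artifactId>example-root</artifactId>
  <version>3.1.0-SNAPSHOT</version>
</project>"""

if __name__ == "__main__":
    version = project_version(SAMPLE_POM)
    print(version, major_line(version))
```

Run against the real `pom.xml` contents, a result other than `2.x` would indicate that the header's major-line label no longer matches the branch.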



##########
THREAT_MODEL.md:
##########
@@ -0,0 +1,190 @@
+<!--
+SPDX-License-Identifier: Apache-2.0
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+    https://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# Apache Unomi — Threat Model (v0 draft)
+
+## §1 Header
+
+- **Project:** Apache Unomi (`apache/unomi`), `master` / 2.x line, against 
which this draft was written. This model covers the **apache/unomi** server; 
`unomi-tracker` (browser tracking client) and `unomi-site` (website) are in the 
engagement scope but are treated here as satellites (see §2/§3).
+- **Date:** 2026-06-02. **Status:** draft — for Apache Unomi PMC review. 
**Author:** ASF Security team (drafted via the Scovetta threat-model rubric), 
for PMC ratification.
+- **Version binding:** versioned with the project; a report against version 
*N* is triaged against the model as it stood at *N*.
+- **Reporting cross-reference:** §8-property violations → report privately per 
ASF process (`[email protected]` → `[email protected]`); §3/§9 
findings are closed citing this document.
+- **Provenance legend:** *(documented)* = Unomi's own docs/repo/CVE 
advisories; *(maintainer)* = confirmed by an Unomi PMC member through this 
process; *(inferred)* = reasoned from architecture/history, not yet confirmed — 
each has a matching §14 open question.
+- **Draft confidence:** ~16 documented / 0 maintainer / ~30 inferred.
+- **What Unomi is:** Apache Unomi is a Java reference implementation of the 
OASIS Context Server (CXS) spec — a Customer Data Platform. It collects 
behavioural events about visitors (typically from a browser via the 
`unomi-tracker` JavaScript over a public **context** endpoint), builds and 
stores profiles + segments, evaluates rules/conditions, and exposes data via 
REST and GraphQL APIs. It persists to Elasticsearch/OpenSearch. *(documented — 
README, manual)*
+
+## §2 Scope and intended use
+
+- **Primary use:** an operator-deployed **context server** that ingests 
visitor events over the network and serves profile/segmentation data to web 
properties and back-office tools. *(documented — manual)*
+- **Caller roles** (network service — the role splits):
+  - **public web client** — a browser running `unomi-tracker`, hitting the 
**public context endpoint** (`/context.json`-style) **unauthenticated**, from 
the open internet. The highest-value untrusted surface. *(inferred — confirm 
the public-endpoint exposure model)*
+  - **integrator / API client** — calls the REST / GraphQL APIs, 
authenticated; may author conditions, rules, segments, scopes. **Trusted to its 
credential's authority.** *(inferred)*
+  - **operator/admin** — controls config, the Karaf container, plugins, and 
the Elasticsearch/OpenSearch backend. **Trusted.** *(inferred)*
+  - **cluster peer** — another Unomi node. *(inferred)*
+
+**Component-family table:**
+
+| Family | Entry point | Touches outside process | In model? |
+| --- | --- | --- | --- |
+| Public context ingestion | `/context.json` / event collector (`wab`, `rest`) 
| network (public listen) | **In — primary boundary** *(inferred)* |
+| Rule / condition / segment engine + scripting | `services`, `scripting` 
(MVEL/OGNL expression eval) | evaluates expressions | **In — historically the 
RCE surface (§11)** *(documented: CVEs)* |
+| JSON-Schema event validation | schema validation of incoming events | — | 
**In — the input-validation defense** *(documented: manual `jsonSchema`)* |
+| Admin REST + GraphQL APIs | `rest`, `graphql` | network (authenticated) | 
**In** *(documented: modules)* |
+| Persistence | `persistence-elasticsearch` / `persistence-opensearch` | 
network → ES/OS backend | **In (Unomi's use of it); the backend's own security 
is operator's** *(inferred)* |
+| Plugins / extensions / connectors | `plugins`, `extensions`, `connectors` | 
varies | **In core ones; third-party/`samples` out** *(inferred)* |

Review Comment:
   This row lists a `connectors` top-level module/directory, but this 
repository doesn’t have a `connectors/` directory; connectors appear to live 
under `extensions/` (e.g., `extensions/salesforce-connector`). Using the 
correct repo structure here will make the model easier to map to code during 
reviews.
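
A quick way to catch this kind of mismatch is to check each module name cited in the table against the checkout's top-level directories. The sketch below is a minimal illustration. The function names are hypothetical, and the sample layout is a throwaway tree built only from the paths mentioned in this comment, not a copy of the real repository.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def missing_top_level_dirs(repo_root: Path, names: Iterable[str]) -> List[str]:
    """Return the names that are not directories directly under repo_root."""
    return [name for name in names if not (repo_root / name).is_dir()]


def build_sample_layout(root: Path) -> None:
    """Create a throwaway tree mimicking the layout described in this comment."""
    (root / "plugins").mkdir()
    (root / "extensions" / "salesforce-connector").mkdir(parents=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        build_sample_layout(root)
        # Names as cited in the component-family table row under review.
        print(missing_top_level_dirs(root, ["plugins", "extensions", "connectors"]))
```

Pointing `repo_root` at an actual checkout and passing the module names from the table would list any entries that do not correspond to a top-level directory.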



##########
THREAT_MODEL.md:
##########
@@ -0,0 +1,190 @@
+<!--
+SPDX-License-Identifier: Apache-2.0
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+    https://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# Apache Unomi — Threat Model (v0 draft)
+
+## §1 Header
+
+- **Project:** Apache Unomi (`apache/unomi`), `master` / 2.x line, against 
which this draft was written. This model covers the **apache/unomi** server; 
`unomi-tracker` (browser tracking client) and `unomi-site` (website) are in the 
engagement scope but are treated here as satellites (see §2/§3).
+- **Date:** 2026-06-02. **Status:** draft — for Apache Unomi PMC review. 
**Author:** ASF Security team (drafted via the Scovetta threat-model rubric), 
for PMC ratification.
+- **Version binding:** versioned with the project; a report against version 
*N* is triaged against the model as it stood at *N*.
+- **Reporting cross-reference:** §8-property violations → report privately per 
ASF process (`[email protected]` → `[email protected]`); §3/§9 
findings are closed citing this document.
+- **Provenance legend:** *(documented)* = Unomi's own docs/repo/CVE 
advisories; *(maintainer)* = confirmed by an Unomi PMC member through this 
process; *(inferred)* = reasoned from architecture/history, not yet confirmed — 
each has a matching §14 open question.
+- **Draft confidence:** ~16 documented / 0 maintainer / ~30 inferred.
+- **What Unomi is:** Apache Unomi is a Java reference implementation of the 
OASIS Context Server (CXS) spec — a Customer Data Platform. It collects 
behavioural events about visitors (typically from a browser via the 
`unomi-tracker` JavaScript over a public **context** endpoint), builds and 
stores profiles + segments, evaluates rules/conditions, and exposes data via 
REST and GraphQL APIs. It persists to Elasticsearch/OpenSearch. *(documented — 
README, manual)*
+
+## §2 Scope and intended use
+
+- **Primary use:** an operator-deployed **context server** that ingests 
visitor events over the network and serves profile/segmentation data to web 
properties and back-office tools. *(documented — manual)*
+- **Caller roles** (network service — the role splits):
+  - **public web client** — a browser running `unomi-tracker`, hitting the 
**public context endpoint** (`/context.json`-style) **unauthenticated**, from 
the open internet. The highest-value untrusted surface. *(inferred — confirm 
the public-endpoint exposure model)*
+  - **integrator / API client** — calls the REST / GraphQL APIs, 
authenticated; may author conditions, rules, segments, scopes. **Trusted to its 
credential's authority.** *(inferred)*
+  - **operator/admin** — controls config, the Karaf container, plugins, and 
the Elasticsearch/OpenSearch backend. **Trusted.** *(inferred)*
+  - **cluster peer** — another Unomi node. *(inferred)*
+
+**Component-family table:**
+
+| Family | Entry point | Touches outside process | In model? |
+| --- | --- | --- | --- |
+| Public context ingestion | `/context.json` / event collector (`wab`, `rest`) 
| network (public listen) | **In — primary boundary** *(inferred)* |
+| Rule / condition / segment engine + scripting | `services`, `scripting` 
(MVEL/OGNL expression eval) | evaluates expressions | **In — historically the 
RCE surface (§11)** *(documented: CVEs)* |
+| JSON-Schema event validation | schema validation of incoming events | — | 
**In — the input-validation defense** *(documented: manual `jsonSchema`)* |
+| Admin REST + GraphQL APIs | `rest`, `graphql` | network (authenticated) | 
**In** *(documented: modules)* |
+| Persistence | `persistence-elasticsearch` / `persistence-opensearch` | 
network → ES/OS backend | **In (Unomi's use of it); the backend's own security 
is operator's** *(inferred)* |
+| Plugins / extensions / connectors | `plugins`, `extensions`, `connectors` | 
varies | **In core ones; third-party/`samples` out** *(inferred)* |
+| `unomi-tracker` (JS client) | browser | — | **Satellite — discoverability 
pointer; client-side, lower trust surface** *(inferred)* |
+| `unomi-site`, `samples`, `itests` | website / demos / tests | — | **Out** 
*(see §3)* |
+
+## §3 Out of scope (explicit non-goals)
+
+- **Attackers who already control the host, the Karaf container, the config, 
the plugins, or the Elasticsearch/OpenSearch backend.** Operator-trusted. 
*(inferred)*
+- **`unomi-site`, `samples/`, `itests/`** — website + demo + test code, not 
production trust surface. *(inferred)*
+- **Confidentiality of profile data at rest / in the search backend** when the 
operator has not secured Elasticsearch/OpenSearch and the network — that is 
deployment hardening, not an Unomi code property, unless Unomi claims 
otherwise. *(inferred)*
+- **Arbitrary expression evaluation by a *trusted admin*** who authors a 
malicious condition/rule — an authenticated privileged user defining 
server-side logic is the intended (if powerful) feature, not an attack on 
Unomi. The boundary is whether *public/untrusted* input can reach expression 
evaluation (see §8/§11). *(inferred — confirm)*
+
+## §4 Trust boundaries and data flow
+
+- **Primary boundary: the public context endpoint.** Event payloads arriving 
unauthenticated from browsers are **untrusted**. They flow → JSON-Schema 
validation → event/condition processing → profile update → persistence. The 
schema-validation step is the gate. *(inferred; schema validation documented)*
+- **Secondary boundary: the authenticated REST/GraphQL admin surface**, where 
conditions/rules/scopes are defined — trusted to the credential. *(inferred)*
+- **The historical break (load-bearing):** the public surface must **not** 
allow attacker-controlled input to reach OGNL/MVEL expression evaluation that 
can instantiate/call arbitrary Java — that was CVE-2020-11975 / CVE-2020-13942 
/ CVE-2021-31164, fixed by constraining the public surface. The model treats a 
regression of this kind as `VALID`/critical. *(documented — CVE advisories)*
+- **Reachability precondition:** a finding in `scripting`/condition-evaluation 
is **in-model** only if reachable from **public/unauthenticated** input (or 
from a lower privilege than the operation requires). Expression power available 
only to a trusted authenticated author is `OUT-OF-MODEL: trusted-input`. A 
finding on the ES/OS backend is in-model only if reachable through Unomi's API, 
not by directly attacking an exposed backend. *(inferred)*
+
+## §5 Assumptions about the environment
+
+- **Runtime:** JVM; runs in an Apache Karaf / OSGi container. *(documented — 
kar/package/manual)*
+- **Backend:** Elasticsearch or OpenSearch, assumed deployed on a trusted 
network and secured by the operator. *(inferred)*
+- **The public endpoint is internet-facing by design** (browsers post events 
directly); the admin APIs are assumed *not* public. Confirm. *(inferred)*
+- **Negative side-effects inventory** (predominantly inferred — wave-1/2 
target): Unomi listens on HTTP; reads config from the Karaf container; talks to 
the search backend; loads OSGi plugins; evaluates conditions/expressions; the 
scripting engine executes expression logic authored through the (trusted) admin 
path. *(inferred)*
+
+## §5a Build-time and configuration variants
+
+Security-relevant configuration knobs *(all inferred — confirm names/defaults 
against `configuration.adoc`):*
+
+- **Public-endpoint protection / third-party server allow-list + secured 
events** — the mechanism that distinguishes events a public client may send 
from those that require a trusted key. Default posture? *(inferred — Unomi has 
a "protected events" / third-party-server key concept)*
+- **JSON-Schema validation** of incoming events — on by default? 
Reject-unknown by default? *(inferred; feature documented)*
+- **Expression/scripting allow-list** (post-CVE) — what restricts which 
classes/methods conditions may reference, and is it on by default? *(inferred; 
the CVE fixes introduced restrictions)*
+- **Authentication on the admin REST/GraphQL APIs** — default credentials? 
bound to localhost vs all interfaces by default? *(inferred — the 
insecure-default question; wave 1)*
+
+**Insecure-default check:** if any of (public-endpoint protection, schema 
validation, scripting allow-list, admin auth) ships *off* or with a default 
credential, a report against that default is `VALID` unless the PMC designates 
it a documented must-configure (`OUT-OF-MODEL: non-default-build`). This is a 
wave-1 ruling (§14).
+
+## §6 Assumptions about inputs
+
+Per-surface trust table *(all inferred unless noted):*
+
+| Surface | Input | Attacker-controllable? | Caller/operator must enforce |
+| --- | --- | --- | --- |
+| Public context endpoint | event JSON, profile/session refs, scope | **yes 
(unauthenticated, public)** | JSON-schema validation on; public-event 
allow-list; no expression reach |
+| REST / GraphQL admin | conditions, rules, segments, queries | **yes, within 
the authenticated credential's authority** | authn + authz; restrict who may 
author expressions |
+| Condition / rule definitions | MVEL/OGNL expressions | **public: must be no; 
admin: yes-but-trusted** | keep expression authoring on the trusted side |
+| Persistence queries | derived from the above | indirectly | backend 
hardening; query/scope isolation |
+| Plugins / connectors config | operator-supplied | no — operator-trusted | 
vet third-party plugins |

Review Comment:
   Similar to the component table above, this line calls out “connectors” 
separately, but in this repo connectors are implemented as extensions (e.g., 
`extensions/salesforce-connector`). Aligning terminology here with the actual 
directory layout will reduce ambiguity for readers.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
