mbaedke commented on code in PR #2923: URL: https://github.com/apache/jackrabbit-oak/pull/2923#discussion_r3341474347
########## draft-THREAT-MODEL.md: ########## @@ -0,0 +1,1060 @@ +<!-- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. +--> + +# Apache Jackrabbit Oak Security Threat Model (draft) + +**Why a separate Jackrabbit-Oak model (not a single Jackrabbit-PMC umbrella).** +The Jackrabbit PMC owns three functionally distinct codebases that share a +common JCR API contract but have completely different security architectures: +the original `jackrabbit` (jackrabbit-core, JR2-era), `jackrabbit-oak` (the +modern, scalable successor with a different storage model and a redesigned +security stack), and `jackrabbit-filevault` (a packaging / serialisation +tool whose entire reason for existing is to move repository content across a +trust boundary as a zip file). An umbrella model would have to disclaim each +of the per-repo nuances in turn — every "the project trusts X" statement +would carry "...for Oak, but jackrabbit-core uses a different mechanism, and +filevault doesn't have callers in this sense". Three smaller models cite +each other for the JCR contract and stand on their own for everything else. +The triage utility of a closed-set §13 disposition table requires that +each project's set actually be closed. 
+ +## §1 Header + +- **Project:** Apache Jackrabbit Oak (`apache/jackrabbit-oak`) *(documented: + `AGENTS.md`, `.asf.yaml`)*. Oak is the modern JCR repository + implementation; the original `jackrabbit` (jackrabbit-core) is modelled + separately, and `jackrabbit-filevault` (packaging) is also modelled + separately. +- **Commit / version binding:** drafted against the default branch + (`trunk`) *(documented: `.asf.yaml` — `protected_branches: trunk`)*. A + vulnerability report against Oak version *N* should be triaged against + the model as it stood at *N* (release tag), not against `trunk`. +- **Date:** 2026-05-30. +- **Authors:** ASF Security team draft, awaiting Jackrabbit PMC review. +- **Status:** draft — under maintainer review. +- **Reporting cross-reference:** findings that may violate a §8 property + should be reported per the ASF Security Team disclosure channel + (`[email protected]`) and the Jackrabbit project's security mailing + list, before public disclosure *(documented: Oak's `security/reports.md` + — "The Apache Security team requests that researchers report + undisclosed vulnerabilities to the security mailing list before public + disclosure")*. Findings that fall under §3, §9, or §11a will be + closed by Oak triagers citing this document. +- **Provenance legend:** + *(documented)* — drawn from in-repo docs or website docs with citation; + *(maintainer)* — confirmed by an Oak maintainer in response to this + draft; *(inferred)* — synthesised from code structure or domain + knowledge, awaiting PMC ratification (every *(inferred)* tag has a + matching §14 question). +- **Draft confidence:** 35 documented / 0 maintainer / 28 inferred. + +**About the project.** Apache Jackrabbit Oak is the actively-developed +scalable hierarchical content repository that succeeded the original +Apache Jackrabbit (jackrabbit-core). 
It implements the JCR 2.0 +specification (JSR 283) and is the storage engine that ships with Adobe +Experience Manager and several other CMS / DAM products. The +implementation is split across ~47 Maven modules under +`oak-*`; it is intended to be **embedded** by a host application (CMS, +asset manager, integration framework, …) — not deployed as a standalone +server — and exposes a JCR `Repository` API and an Oak `ContentRepository` +API. Security is structured around a pluggable `SecurityProvider` that +binds an `AuthenticationConfiguration`, an `AuthorizationConfiguration` +(possibly composite), a `UserConfiguration`, a `PrincipalConfiguration`, +and a `PrivilegeConfiguration` *(documented: `security/introduction.md`, +`security/overview.md`)*. + +## §2 Scope and intended use + +### Intended use + +- **In-process JCR 2.0 (JSR 283) repository** embedded by a host + application (CMS, DAM, integration framework). The Oak repository + exposes `javax.jcr.Repository` and `org.apache.jackrabbit.oak.api.ContentRepository` + to in-process callers; there is no built-in network listener and no + per-end-user authn/authz outside what the host plugs into the + `LoginContextProvider` and `AuthorizationConfiguration` chain + *(documented: `security/introduction.md`, + `security/authentication.md`)*. +- The repository supports multiple NodeStore backends — DocumentNodeStore + (Mongo, RDB), SegmentNodeStore (TarMK), Composite, with Blob storage + in BlobStore (S3, Azure, FileBlobStore, …) *(documented: `AGENTS.md` — + "Persistence: Multiple NodeStore backends (Document, Segment/TarMK, + Composite, AWS S3, Azure)")*. +- Indexing is via Lucene (oak-lucene) and Elasticsearch (oak-search-elastic) + *(documented: `AGENTS.md`)*. + +### Deployment shape + +Oak is **not** a standalone daemon and is **not** a network service in its +own right. It is an in-process library. 
Network exposure (HTTP, WebDAV, +custom protocol) is **always** an artefact of the host application; Oak +ships no listener of its own *(inferred — §14 Q1)*. The threat model +is therefore that of a library, not a service — but a library whose +contract specifically promises authentication and authorisation +properties to its host, which makes it a more security-load-bearing +library than (say) zlib. + +### Caller roles + +Following §2 of the output-structure rubric (in-process-library split): + +| Role | Trust level | Notes | +| --- | --- | --- | +| **Host application code** | trusted | Holds the `Repository`/`ContentRepository` handle; chooses the `SecurityProvider` and `LoginContextProvider`; configures the NodeStore + BlobStore; may bypass authorisation by obtaining a system-level session via `loginAdministrative` / `loginService` *(documented: `security/permission/default.md` — admin/system principals bypass permission evaluation)*. The host decides whether end-user credentials reach Oak at all. | +| **JCR session principal (end user)** | untrusted but authenticated | Identifies through `Repository.login(Credentials)` or `ContentRepository.login(Credentials, workspaceName)`; subjected to the configured `PermissionProvider` chain on every read/write. The principal is authenticated by Oak's `LoginContext` *(documented: `security/authentication.md`)*. | +| **System / admin principal** | trusted | A session obtained via `loginAdministrative` / `loginService` (host-driven) carries `SystemPrincipal` or `AdminPrincipal` and bypasses permission evaluation *(documented: `security/permission/default.md` — "Three principal categories automatically receive full repository access: SystemPrincipal, AdminPrincipal, and Principals matching configured administrative names")*. 
| +| **External identity provider** | trusted control plane | The host configures one or more `ExternalIdentityProvider`s for LDAP / SAML / OAuth; Oak's `ExternalLoginModule` accepts whatever identity these IDPs assert *(documented: `security/authentication/externalloginmodule.md` — "The mechanism implicitly trusts that the configured IDP accurately authenticates identities")*. | +| **Pre-authenticated caller** | trusted (operator-asserted) | When `PreAuthenticatedLogin` is in use, Oak performs no credential verification at all; the host is asserting "this user has already been authenticated upstream" *(documented: `security/authentication/preauthentication.md` — "Oak delegates all authentication responsibility to the caller")*. | +| **NodeStore backend** | trusted | Mongo / Tar / Segment / RDB / Composite storage is assumed honest and assumed to enforce its own at-rest protections *(inferred — §14 Q2)*. | +| **BlobStore backend** | trusted | S3 / Azure / FileBlobStore is assumed honest *(inferred — §14 Q2)*. | + +### Component-family table + +| Family | Representative entry | Touches outside the process? | In-model? 
| +| --- | --- | --- | --- | +| `oak-api`, `oak-core`, `oak-core-spi` — content tree, MVCC, commit hooks | `ContentRepository.login` | no (only through a NodeStore) | **yes** | +| `oak-jcr` — JCR 2.0 binding *(documented: `AGENTS.md`)* | `Repository.login` | no | **yes** | +| `oak-security-spi`, default `AuthorizationConfiguration`, default `PermissionProvider` | `SecurityProvider` | no | **yes** (high security weight; 100% test coverage mandate per `AGENTS.md`) | +| `oak-authorization-cug` — Closed User Groups *(documented: `security/authorization/cug.md`)* | composite `AuthorizationConfiguration` | no | **yes** (read-only authorisation only — see §9.10) | +| `oak-authorization-principalbased` — principal-based authz *(documented: `AGENTS.md`)* | composite `AuthorizationConfiguration` | no | **yes** | +| `oak-auth-external` — IDP framework *(documented: `security/authentication/externalloginmodule.md`)* | `ExternalIdentityProvider` SPI | depends on IDP impl | **yes** for the wrapper; IDP impl is per-host | +| `oak-auth-ldap` — LDAP IDP *(documented: `AGENTS.md`)* | `LdapIdentityProvider` | **yes — LDAP/AD** | **yes** | +| Persistence — `oak-store-document` (Mongo / RDB), `oak-segment-tar`, `oak-store-composite`, `oak-store-spi` | `DocumentNodeStore`, `SegmentNodeStore` | **yes — DB / FS** | **yes** for in-Oak code; backend itself is trusted (§3) | +| BlobStore — `oak-blob`, `oak-blob-cloud`, `oak-blob-cloud-azure`, `oak-blob-plugins` | `S3DataStore`, `AzureDataStore` | **yes — S3 / Azure / FS** | **yes** for in-Oak code; cloud APIs trusted (§3) | +| Search — `oak-lucene`, `oak-search`, `oak-search-elastic` | `IndexEditor`, query parsers | sometimes (Elasticsearch over HTTP) | **yes** | +| `oak-run` — operator CLI / tooling | `oak-run.jar` | OS / FS / network depending on subcommand | **see §3** (in-model only for the command-driven contract; out-of-model for "operator runs it as the wrong user") | +| `oak-pojosr`, `oak-standalone` — repository launchers | embedded 
repository | filesystem | **yes** for code; deployment is operator's | +| `oak-upgrade` — JR2 → Oak migration | offline migration job | filesystem | **yes** for code; the migration source is a trusted JR2 repository | +| `oak-it`, `oak-it-osgi`, `oak-bench-*`, `oak-jcr-tests`, `oak-test-bundle`, `oak-exercise` | integration tests, benchmarks, training | varies | **out of model** — unsupported components *(§3)* | +| `oak-examples`, `oak-doc-railroad-macro`, `oak-doc` | examples and docs | none | **out of model** *(§3)* | +| Archived MicroKernel modules (`oak-mk-*`) | n/a | n/a | **out of model** — explicitly archived *(documented: README — "MicroKernel-related modules have been archived")* | + +A finding is in-model only if it lands in a row marked **yes**. See §4 +for per-component reachability tests. + +## §3 Out of scope (explicit non-goals) + +Reports requiring any of these will be closed with the cited disposition: + +1. **Host application correctness.** Oak is embedded. If the host hands + out an admin session to an unauthenticated HTTP request, exposes + `loginAdministrative` over JMX, or routes user-supplied SQL2 queries + directly into the session without filtering, the harm is the host's + *(documented: `security/permission/default.md` — admin/system + principals bypass permission evaluation)*. → `OUT-OF-MODEL: + adversary-not-in-scope`. +2. **NodeStore / BlobStore / IDP correctness.** Mongo, RDB, TarMK on + disk, S3, Azure, LDAP, SAML — Oak trusts the responses these systems + give. A backend returning forged bytes, an LDAP server asserting a + spoofed group membership, an S3 bucket allowing unauthorised reads — + none are Oak vulnerabilities *(inferred — §14 Q2)*. → + `OUT-OF-MODEL: trusted-input`. +3. **Storage-level authorisation.** HDFS / S3 / filesystem ACLs on the + underlying NodeStore / BlobStore are the operator's responsibility. A + tar-store file readable by `other` is not an Oak bug *(inferred — + §14 Q3)*. 
→ `OUT-OF-MODEL: adversary-not-in-scope`. +4. **Pre-authentication misuse.** The `PreAuthenticatedLogin` mechanism + is an *explicit* bypass: Oak does no credential verification at all + and trusts that an upstream layer has *(documented: + `security/authentication/preauthentication.md`)*. A report that the + pre-auth code path "trusts the caller" is a documented design + choice. → `BY-DESIGN: property-disclaimed` (§9). +5. **Custom `SecurityProvider` / `LoginModule` replacements.** Oak + ships a default but documents that custom implementations are "only + recommended for experts having in-depth understanding of Oak + internals and which understand the security risk associated with + custom replacements" *(documented: `security/introduction.md`)*. A + report that requires a custom SPI implementation that voids a + guarantee is the host's choice. → `OUT-OF-MODEL: non-default-build`. +6. **`oak-run` invoked by the operator.** `oak-run` is an offline + administrative CLI; running it requires direct filesystem and + credential access. "Operator runs `oak-run console` and dumps the + repository" is not a vulnerability *(inferred — §14 Q4)*. → + `OUT-OF-MODEL: adversary-not-in-scope`. +7. **Code that ships but is not part of the supported product:** + `oak-it/`, `oak-it-osgi/`, `oak-bench-*/`, `oak-jcr-tests/`, + `oak-test-bundle/`, `oak-exercise/`, `oak-examples/`, + `oak-doc-railroad-macro/`, archived `oak-mk-*` modules + *(documented: README, AGENTS.md)*. → `OUT-OF-MODEL: + unsupported-component`. +8. **Original `jackrabbit` / `jackrabbit-core` code.** Oak migrated away + from the JR2 codebase; jackrabbit-core has a separate threat model. + The `oak-upgrade` module imports from jackrabbit-core as a one-shot + migration source; a bug in the JR2-side code is jackrabbit-core's + threat-model problem. → `OUT-OF-MODEL: unsupported-component` (with + cross-reference). +9. **`jackrabbit-filevault` package import** — filevault has its own + threat model. 
A vulnerable filevault install hook is filevault's + problem; Oak's role is to honour the JCR session privileges that + filevault uses *(inferred — §14 Q5)*. → `OUT-OF-MODEL: + unsupported-component` (with cross-reference). +10. **Build / release / SDLC hygiene.** Action pinning, signing, + reproducible builds, branch protection — out of model per the SKILL. +11. **Side channels** (cache timing, branch prediction, co-tenant on the + same JVM). *(inferred — §14 Q6)* → `OUT-OF-MODEL: + adversary-not-in-scope`. + +## §4 Trust boundaries and data flow + +Oak's trust boundary is **the JCR `Session` / Oak `ContentSession` API +surface**. Once a `Session` has been obtained, every read/write goes +through the configured `PermissionProvider` chain — *unless* the +principal is `SystemPrincipal`, `AdminPrincipal`, or matches a +configured administrative name, in which case permission evaluation is +bypassed *(documented: `security/permission/default.md` — "Three +principal categories automatically receive full repository access ... +Administrator sessions bypass permission evaluation entirely")*. 
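The bypass rule quoted above (SystemPrincipal, AdminPrincipal, or a configured administrative name) can be sketched as a small, self-contained model. This is only an illustration of the rule as the draft states it, not Oak's implementation: the `*Stub` classes, `NamedPrincipal`, the method name, and the `ops-admins` name are all hypothetical; only `java.security.Principal` is a real API.

```java
import java.security.Principal;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Toy model of the documented bypass rule. Not Oak's implementation. */
public class AdminBypassSketch {

    /** Hypothetical stand-in for Oak's SystemPrincipal. */
    static final class SystemPrincipalStub implements Principal {
        @Override public String getName() { return "system"; }
    }

    /** Hypothetical stand-in for a principal carrying Oak's AdminPrincipal marker. */
    static final class AdminPrincipalStub implements Principal {
        @Override public String getName() { return "admin"; }
    }

    /** Ordinary user or group principal, identified by name only. */
    static final class NamedPrincipal implements Principal {
        private final String name;
        NamedPrincipal(String name) { this.name = name; }
        @Override public String getName() { return name; }
    }

    /**
     * Returns true if any principal in the set falls into one of the three
     * categories the draft says skip permission evaluation.
     */
    static boolean bypassesPermissionEvaluation(Set<? extends Principal> principals,
                                                Set<String> administrativeNames) {
        for (Principal p : principals) {
            if (p instanceof SystemPrincipalStub
                    || p instanceof AdminPrincipalStub
                    || administrativeNames.contains(p.getName())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // "ops-admins" is a made-up configured administrative name.
        Set<String> adminNames = Collections.singleton("ops-admins");

        Set<Principal> alice = new HashSet<Principal>(
                Arrays.asList(new NamedPrincipal("alice")));
        Set<Principal> carol = new HashSet<Principal>(
                Arrays.asList(new NamedPrincipal("carol"), new NamedPrincipal("ops-admins")));

        System.out.println("alice bypasses: " + bypassesPermissionEvaluation(alice, adminNames));
        System.out.println("carol bypasses: " + bypassesPermissionEvaluation(carol, adminNames));
    }
}
```

The third branch is the one that matters for triage: adding a name to the administrative list turns every session holding a principal of that name into a bypassing session, with no change to any ACL.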
+ +There are six trust transitions a finding must land in to be in-model: + +| # | Transition | Who authenticates | Who authorises | +| --- | --- | --- | --- | +| B1 | Host → `Repository.login(Credentials)` / `ContentRepository.login(Credentials, ws)` | `LoginContext` chain (`LoginModule` impls in the configured JAAS appname; default is `LoginModuleImpl` + optional `TokenLoginModule` + optional `ExternalLoginModule`) *(documented: `security/authentication.md`)* | n/a at this transition | +| B2 | Host → `loginAdministrative` / `loginService` | trusted: the host *is* the system *(documented: `security/permission/default.md`)* | n/a — these sessions bypass authorisation | +| B3 | JCR session principal → tree read | `Subject` already established at B1 | `PermissionProvider` (default + optional CUG + optional principalbased, composed) *(documented: `security/authorization.md`, `security/permission.md`)* | +| B4 | JCR session principal → tree write | same as B3 | `PermissionValidator` commit hook *(documented: `security/permission/evaluation.md`)* | +| B5 | `ExternalLoginModule` → external IDP (LDAP, SAML, …) | the IDP authenticates the user *(documented: `security/authentication/externalloginmodule.md`)* | per-IDP; Oak honours whatever group membership the IDP asserts on sync | +| B6 | Oak → NodeStore / BlobStore (Mongo, Tar, RDB, S3, Azure) | backend's own auth (operator-configured) | backend's own ACLs | + +### Reachability preconditions per family + +A finding is in-model only if it meets the family's reachability test: + +- **`oak-api`, `oak-core`, `oak-jcr`**: reachable from a session + established at B1 with the principal carrying *less than* + `SystemPrincipal` / `AdminPrincipal`. A finding that requires an + already-admin session collapses to "the host gave away an admin + session", which is §3 item 1. 
+- **`oak-security-spi`, default `AuthorizationConfiguration`, + `PermissionProvider`**: in-model for any reachable failure mode that + results in an effective grant the configured ACLs do not license, or + an effective deny that they do. Hidden-item handling is in-model: + "system principals gain access except for hidden items that are not + exposed on the Oak API" *(documented: + `security/permission/default.md`)*. +- **`oak-authorization-cug`**: in-model only for *read* access + restriction; CUG "solely evaluates and enforces read access to + regular nodes and properties" *(documented: + `security/authorization/cug.md`)*. CUG is disabled by default. +- **`oak-authorization-principalbased`**: in-model when configured; + composes via `CompositeAuthorizationConfiguration` *(documented: + `security/authorization.md`)*. +- **`oak-auth-external`, `oak-auth-ldap`**: in-model for the bridge + code that *uses* the IDP — credential handling, session sync, + group-membership materialisation into Oak's user-management tree. + Out-of-model for the IDP's own correctness (§3 item 2). +- **NodeStore / BlobStore**: in-model for in-Oak code paths (commit + hook, MVCC, secondary indexes); out-of-model for the backend's + external behaviour (§3 item 2 / item 3). +- **`oak-lucene` / `oak-search-elastic`**: in-model for query + evaluation against the *visible* tree under the caller's + permissions. Index *leakage* — e.g. a search result that surfaces a + node path that the caller has no read permission for — is in-model + (the search must respect §3 §4 permission scope). The + `QueryEngine` is documented to filter by permission as part of + result delivery *(inferred — §14 Q7)*. +- **`oak-run`**: in-model only for in-process logic; "operator runs + it with a bad keystore on the file system" is out (§3 item 6). + +## §5 Assumptions about the environment + +- **JVM / runtime.** Java 11+ at HEAD; the build requires Maven 3.x + *(documented: `README.md`)*. 
JVM is conformant; the security manager + is *not* required (modern Java has deprecated it; Oak is not designed + to defend against in-JVM attackers). +- **Concurrency.** `Repository` is `Send`-equivalent (thread-safe in + the JCR sense); `Session` is **not** thread-safe and is documented + to be per-thread *(inferred — §14 Q8)*. Oak relies on MVCC for + isolation between concurrent sessions. +- **MVCC.** Oak's content tree is an MVCC layer over a NodeStore; each + session sees a consistent revision *(documented: `AGENTS.md` — + "MVCC transactional implementation")*. The revision is captured at + session login; later writes by other sessions are not visible until + refresh. +- **Memory.** JVM-managed; Oak holds in-memory caches (node cache, + permission cache, principal cache). Cache sizing is operator- + configurable; pathological cache patterns (e.g. millions of distinct + principals) can cause memory pressure but are not security failures + *(inferred — §14 Q9)*. +- **Time.** `System.currentTimeMillis` is used for token expiry, lock + timeouts, and TTL caches. Clock skew within the same JVM is + irrelevant; clock skew between Oak and an external IDP (LDAP, SAML) + is the operator's responsibility. +- **Filesystem.** SegmentNodeStore writes tar files under + `repository.home`; DocumentNodeStore stores nothing on disk by + default. FileBlobStore writes blob files. Permissions on these + directories are the operator's responsibility *(inferred — + §14 Q3)*. +- **Network.** Oak itself opens no listening sockets. Outbound + connections are to: Mongo / RDB (DocumentNodeStore), S3 / Azure + (BlobStore), LDAP (oak-auth-ldap), Elasticsearch + (oak-search-elastic). All endpoints are operator-configured trusted + *(inferred — §14 Q10)*. 
+- **What Oak does NOT do to its host** *(predominantly negative + claims, awaiting maintainer ratification — §14 Q11)*: + - opens **no** listening sockets; + - installs **no** process-wide signal handlers; + - does **not** spawn child processes from the core; + - reads a documented set of system properties (e.g. + `oak.*`) but does not consume arbitrary `LD_*` / + `JAVA_TOOL_OPTIONS` for security-sensitive behaviour; + - writes log entries through SLF4J (host-configured backend); + - persists nothing outside the configured NodeStore + BlobStore + home. + +## §5a Build-time and configuration variants + +Oak ships as a single feature-rich library, but the **`SecurityProvider` +composition** materially changes the security envelope. The +maintainer-confirmed list is to be ratified per §14 Q12; the +security-relevant subset: + +| Knob / configuration | Default | Maintainer stance | Effect | +| --- | --- | --- | --- | +| `SecurityProvider` impl | `SecurityProviderImpl` (default) | dev/test-or-prod depending on the rest of the host config *(inferred — §14 Q12)* | every security property in §8 is *conditioned* on the default impl | +| `AuthenticationConfiguration` | `AuthenticationConfigurationImpl` (`LoginModuleImpl` + `GuestLoginModule` + `TokenLoginModule`) *(documented: `security/authentication.md`)* | configurable per JAAS appname | which `LoginModule`s are in the chain decides what credentials Oak accepts | +| `AuthorizationConfiguration` | `AuthorizationConfigurationImpl` (default ACL) | required for any meaningful authz | absent → everything is allowed for whoever logs in | +| `CompositeAuthorizationConfiguration` | not composed by default | composed in for CUG and principal-based authz | when composed, the *strictest* result across configurations applies | +| `oak-authorization-cug` | not enabled | optional; required to be composed in *(documented: `security/authorization/cug.md`)* | enables read-restriction-by-group on configured supported paths | +| 
`oak-authorization-principalbased` | not enabled | optional *(documented: `AGENTS.md`)* | adds principal-keyed AC evaluation | +| `oak-auth-external` + `ExternalIdentityProvider` impl | none registered | optional | enables IDP-asserted identities into the JAAS chain | +| `oak-auth-ldap` | not enabled | optional | concrete LDAP `ExternalIdentityProvider` | +| `PreAuthenticatedLogin` | not in chain by default | dev/integration | when configured, Oak skips credential verification *(documented: `security/authentication/preauthentication.md`)* — §3 item 4 | +| `TokenConfiguration` | enabled by default in default chain | configurable token expiry / refresh | misconfigured TTL extends a stolen token's usefulness | +| `UserConfiguration` password hashing | PBKDF2 (current default per `PasswordUtil`) *(documented: `security/user.md` — "PasswordUtil supports Password-Based Key Derivation Function 2 (PBKDF2)")* | maintainer ruling required per §14 Q13: are pre-PBKDF2 hash schemes still acceptable on legacy users, or must operators force a rehash? 
| which hash scheme guards stored passwords | +| Administrative principal whitelist | `admin` user + `SystemPrincipal` *(documented: `security/permission/default.md`)* | adding more principals here disables authz for them | "Principals matching configured administrative names" bypass permission evaluation | +| `SecureNodeBuilder` wrapping | applied to non-admin sessions *(documented: `security/permission/evaluation.md`)* | required for permission enforcement on writes | admin sessions skip `SecureNodeBuilder` entirely | +| `CompositeNodeStore` / Mount-point isolation | not by default; configurable | required for multi-mount deployments | CUG explicitly cannot be used with non-default mounts intersecting supported paths *(documented: `security/authorization/cug.md`)* | +| `Whiteboard` / `FT_OAK-<issue>` feature toggles | varies per toggle | per-toggle ruling *(documented: `AGENTS.md` — "Non-trivial changes use toggles named `FT_OAK-<issue>`")* | a security-touching toggle in the on or off position can void a §8 property | +| Index hidden-properties suppression | hidden items not exposed on Oak API *(documented: `security/permission/default.md` — "system principals gain access except for hidden items that are not exposed on the Oak API")* | structural | hidden permission-store nodes (`/jcr:system/rep:permissionStore`) are immutable through standard APIs *(documented)* | + +**The insecure-default case.** Two defaults are load-bearing for triage: + +- The `admin` user *exists by default* and has full repository access. + Is "the operator failed to disable / rotate the `admin` user" a + `VALID` Oak report or an `OUT-OF-MODEL` operator misconfiguration? + *(maintainer ruling required — §14 Q14)*. The model assumes the + latter: the operator must rotate `admin` per §10. +- Pre-PBKDF2 hashed passwords on legacy users — see §14 Q13. 
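The §5a table states that, when several authorization configurations are composed, the strictest result applies. A minimal sketch of that composition rule follows, assuming a drastically simplified read check. `ReadCheck`, `composeStrictest`, the paths, and the user names are hypothetical and do not correspond to Oak's `PermissionProvider` or `CompositeAuthorizationConfiguration` APIs.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of "strictest result wins" composition. Not Oak's composite implementation. */
public class StrictestCompositionSketch {

    /** Hypothetical, simplified read check; the real permission API is much richer. */
    interface ReadCheck {
        boolean canRead(String principalName, String path);
    }

    /** Grants only if every composed check grants (logical AND over all checks). */
    static ReadCheck composeStrictest(final List<ReadCheck> checks) {
        return (principalName, path) -> {
            for (ReadCheck check : checks) {
                if (!check.canRead(principalName, path)) {
                    return false; // a single denial is enough
                }
            }
            return true;
        };
    }

    /** Default-ACL model: anyone who is logged in may read below /content/. */
    static final ReadCheck DEFAULT_ACL =
            (principalName, path) -> path.startsWith("/content/");

    /** CUG model: below /content/restricted/ only "bob" may read; other paths are unaffected. */
    static final ReadCheck CUG =
            (principalName, path) ->
                    !path.startsWith("/content/restricted/") || "bob".equals(principalName);

    static final ReadCheck COMPOSITE = composeStrictest(Arrays.asList(DEFAULT_ACL, CUG));

    public static void main(String[] args) {
        System.out.println(COMPOSITE.canRead("alice", "/content/page"));
        System.out.println(COMPOSITE.canRead("alice", "/content/restricted/report"));
        System.out.println(COMPOSITE.canRead("bob", "/content/restricted/report"));
    }
}
```

In this model the default ACL alone would let `alice` read the restricted subtree; composing in the CUG-style check removes that grant without touching the ACL, which is the effect the table describes.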
+ +## §6 Assumptions about inputs + +### Per-entry-point trust table (in-process API) + +| Entry point | Parameter | Attacker-controllable in the model? | Caller must enforce | +| --- | --- | --- | --- | +| `Repository.login(Credentials)` | `Credentials` | **yes** (this is *the* untrusted input) | configured `LoginModule`s do the work; host need not pre-validate | +| `Session.getNode(path)`, `getProperty(path)`, … | `path` | **yes** via the authenticated user, but reads filtered through `PermissionProvider` *(documented: `security/permission/evaluation.md`)* | nothing | +| `Session.save()` | accumulated transient changes | yes via the authenticated user, validated by `PermissionValidator` commit hook *(documented: `security/permission/evaluation.md`)* | nothing | +| `QueryManager.createQuery(stmt, lang)` | SQL2 / XPath text | **yes** (free-form query language) | Oak parses + plans + filters results by permissions | +| `AccessControlManager.setPolicy(path, policy)` | path + ACE list | only by callers with `MODIFY_ACCESS_CONTROL` privilege *(documented: `security/authorization.md`)* | Oak gates by privilege | +| `UserManager.createUser(id, password)` | user id, password | only by callers with the right privileges | password is hashed per `PasswordUtil` | +| `UserManager.createSystemUser(id, intermediatePath)` | id | only by callers with the right privileges | system users have no password and are intended for service identities | +| `loginAdministrative()` / `loginService(subject, ws)` | n/a | **none** — host code only; trusted | host must not expose these to user code | +| `PreAuthenticatedLogin` marker | n/a | **none** — host code only | host must not let user-supplied bytes reach the pre-auth code path | +| `ExternalIdentityProvider.authenticate(creds)` | credentials forwarded from JAAS | **yes** | the IDP impl is the trust anchor *(documented: `security/authentication/externalloginmodule.md`)* | +| `oak-run` subcommands | argv | **none** — operator-controlled | 
OS-level filesystem perms on the configured `repository.home` | +| Index documents (Lucene / Elasticsearch) | content of indexed properties | yes (whoever wrote the property) | query results filter by `PermissionProvider`; index *content* is treated as data | +| NodeStore byte stream | Tar blocks, Mongo BSON, RDB rows | **no** — trusted control plane | backend must not be hostile (§3 item 2) | +

Review Comment: I think it's necessary to explicitly list XML imports and Workspace methods as entry points.
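To make the suggestion concrete, here is a rough sketch of candidate rows in the same shape as the §6 table (entry point, parameter, attacker-controllable). The method names are from the JCR 2.0 `Session` and `Workspace` interfaces; the `EntryPoint` class, the parameter descriptions, and the "attacker-controllable" classification are assumptions for discussion, not a ruling from the draft or the maintainers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Candidate entry points for the section 6 table. Illustrative only. */
public class EntryPointCandidatesSketch {

    /** Hypothetical row type mirroring the table columns. */
    static final class EntryPoint {
        final String method;
        final String parameter;
        final boolean attackerControllable; // assumed classification, to be confirmed

        EntryPoint(String method, String parameter, boolean attackerControllable) {
            this.method = method;
            this.parameter = parameter;
            this.attackerControllable = attackerControllable;
        }
    }

    static final List<EntryPoint> CANDIDATES = Arrays.asList(
            // XML import: document bytes or SAX events supplied by the session user
            new EntryPoint("Session.importXML", "XML stream, parent path, uuidBehavior", true),
            new EntryPoint("Session.getImportContentHandler", "SAX events, parent path", true),
            new EntryPoint("Workspace.importXML", "XML stream, parent path, uuidBehavior", true),
            new EntryPoint("Workspace.getImportContentHandler", "SAX events, parent path", true),
            // Workspace operations: no transient space, applied without Session.save()
            new EntryPoint("Workspace.copy", "source and destination paths", true),
            new EntryPoint("Workspace.clone", "source workspace, paths, removeExisting", true),
            new EntryPoint("Workspace.move", "source and destination paths", true));

    /** Names of the candidates currently classified as attacker-controllable. */
    static List<String> attackerControllableMethods() {
        List<String> names = new ArrayList<String>();
        for (EntryPoint e : CANDIDATES) {
            if (e.attackerControllable) {
                names.add(e.method);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : attackerControllableMethods()) {
            System.out.println(name);
        }
    }
}
```

The two groups differ in one respect that may deserve its own table note: the `Workspace` methods take effect directly rather than through `Session.save()`, so the draft's "accumulated transient changes" row does not obviously cover them.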
