Re: [PR] docs(security): explicit security model, role/capability matrix, AGENTS.md linkage [superset]

via GitHub Thu, 28 May 2026 15:26:41 -0700


villebro commented on code in PR #40503:
URL: https://github.com/apache/superset/pull/40503#discussion_r3320973926



##########
.github/SECURITY.md:
##########
@@ -33,31 +33,77 @@ We kindly ask you to include the following information in 
your report to assist
 - Expected vs. Actual Behavior: A clear description of the intended system 
behavior versus the observed vulnerability.
 - Detailed Reproduction Steps: Clear, manual steps to reproduce the 
vulnerability.
 
-**Vulnerability Definition**
+## Security Model

Review Comment:
   For better discoverability, should this file be moved to the repository 
root, as Spark does with 
https://github.com/apache/spark/blob/master/SECURITY.md?



##########
AGENTS.md:
##########
@@ -52,6 +52,35 @@ Common pre-commit failures:
 - **External API exposure** - Use UUIDs in public APIs instead of internal 
integer IDs
 - **Existing models** - Add UUID fields alongside integer IDs for gradual 
migration
 
+## Security and Threat Model
+
+Before evaluating any code path for security issues, read 
[`.github/SECURITY.md`](.github/SECURITY.md). It is the canonical, 
authoritative source for Apache Superset's security model and is referenced by 
both human reporters and automated scanners.
+
+In short, the test for whether a finding is in scope is one question:
+
+> *Does it let a principal perform an action the role and capability matrix in 
`.github/SECURITY.md` does not entitle them to?*
+
+If yes, it is in scope. If no, it is not.
+
+The three trust boundaries are:
+
+1. **The Admin role** is a fully trusted operational principal. Anything an 
Admin can do through documented configuration, API, or UI is an intended 
capability, not a vulnerability.
+2. **The operator** owns deployment-time decisions (secrets, network exposure, 
feature-flag selection, connector and codec choices, notification destinations, 
third-party plugins). Misconfiguration at this layer is a deployment defect, 
not a Superset vulnerability.
+3. **The codebase** is responsible for enforcing the role and capability 
matrix across its product surface. Failures of that enforcement, anywhere, are 
in scope regardless of which endpoint or component contains them.
+
+The canonical authorization pattern in this codebase is `@has_access_api` 
(Flask-AppBuilder) at the route level plus 
`security_manager.raise_for_access(...)` at the object level, with DAO 
`base_filters` where listing is involved. Code following both gates is not a 
finding **on authorization grounds** by itself; code that omits the per-object 
gate on a route that returns or mutates a specific object is. Code following 
both gates can still contain injection, SSRF, XSS, or other classes of finding 
unrelated to authorization, which are evaluated separately.
+
+The full role and capability matrix, in-scope and out-of-scope class lists, 
and CVE aggregation rules are in [`.github/SECURITY.md`](.github/SECURITY.md). 
Defer to that document for any specifics.

Review Comment:
   Related to my comment on the *Out of Scope* list in `.github/SECURITY.md`, 
I would also strengthen the trust boundaries section to cover the same ground:
   
   > The security model assumes that operator-controlled 
infrastructure—including the metadata database, cache backends, message 
brokers, secret stores, and deployment environment—remains within the 
operator's trust boundary. Vulnerabilities must demonstrate a security boundary 
violation by an attacker who does not already control those systems.
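
The two-gate authorization pattern described in the AGENTS.md hunk above can be sketched as follows. This is a minimal, self-contained illustration, not Superset code: `route_gate` and `object_gate` are hypothetical stand-ins for `@has_access_api` and `security_manager.raise_for_access(...)`, and the user and chart dictionaries are invented for the example.

```python
import functools


class AccessDenied(Exception):
    """Raised when either gate rejects the request."""


def route_gate(permission):
    """Hypothetical stand-in for the route-level check (@has_access_api)."""
    def decorator(view):
        @functools.wraps(view)
        def wrapper(user, *args, **kwargs):
            if permission not in user["permissions"]:
                raise AccessDenied(f"missing permission: {permission}")
            return view(user, *args, **kwargs)
        return wrapper
    return decorator


def object_gate(user, chart):
    """Hypothetical stand-in for security_manager.raise_for_access(...)."""
    if chart["dataset"] not in user["datasets"]:
        raise AccessDenied(f"no access to dataset: {chart['dataset']}")


# Invented example data: two charts backed by different datasets.
CHARTS = {
    1: {"id": 1, "dataset": "sales"},
    2: {"id": 2, "dataset": "payroll"},
}


@route_gate("can_read_chart")
def get_chart(user, chart_id):
    """Both gates: the route-level check, then the per-object check."""
    chart = CHARTS[chart_id]
    object_gate(user, chart)
    return chart


@route_gate("can_read_chart")
def get_chart_missing_object_gate(user, chart_id):
    """Route-level check only: the shape the paragraph calls a finding."""
    return CHARTS[chart_id]
```

With a user granted `can_read_chart` and only the `sales` dataset, `get_chart` returns chart 1 and rejects chart 2, while `get_chart_missing_object_gate` returns chart 2 anyway. That second case is the omitted per-object gate.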



##########
.github/SECURITY.md:
##########
@@ -33,31 +33,77 @@ We kindly ask you to include the following information in 
your report to assist
 - Expected vs. Actual Behavior: A clear description of the intended system 
behavior versus the observed vulnerability.
 - Detailed Reproduction Steps: Clear, manual steps to reproduce the 
vulnerability.
 
-**Vulnerability Definition**
+## Security Model
 
-Apache Superset considers a security vulnerability to be a demonstrable issue 
that has meaningful impact on confidentiality, integrity, or availability 
beyond the intended security model. Low-impact boundary variations or technical 
edge cases in existing access controls may be classified as hardening 
improvements rather than vulnerabilities, even if exploitable.
+This section defines what Apache Superset considers a security issue and what 
it does not. It is the canonical reference for reporters, the Apache Superset 
Security Team, and any automated tool (LLM-based scanner, static analyzer, 
dependency tool) that needs to constrain its hypotheses to behaviors that 
genuinely violate the project's security policy.
 
-**Out of Scope Vulnerabilities**
+The model is intentionally written in terms of principals, trust boundaries, 
and capability surface rather than in terms of specific files, functions, or 
libraries. New code paths inherit the model automatically.
 
-To prioritize engineering efforts on genuine architectural risks, the 
following scenarios are explicitly out of scope and will not be issued a CVE:
-- **Attacks requiring Admin privileges**: (e.g., CSS injection, template 
manipulation, dashboard ownership overrides, or modifying global system 
settings). Per the CVE vulnerability definition in CNA Operational Rules 4.1, a 
qualifying vulnerability must allow violation of a security policy. The Admin 
role is a fully trusted operational boundary defined by Apache Superset's 
security policy; actions within this boundary do not violate that policy and 
are therefore considered intended capabilities 'by design,' not vulnerabilities.
-- **Brute Force and Rate Limiting**: Reports targeting a lack of resource 
exhaustion protections, generic rate-limiting, or volumetric Denial of Service 
(DoS) attempts.
-- **Theoretical attack vectors**: Issues without a demonstrable, reproducible 
exploit path.
-- **Non-Exploitable Findings**: Missing security headers, generic banner 
disclosures, or descriptive error messages that do not lead to a direct, 
documented exploit.
-- **User enumeration**: API responses, timing differences, or error messages 
that reveal whether user accounts, IDs, dashboards, or datasets exist.
-- **Information disclosure (low impact)**: Software version disclosure, 
generic error messages, stack traces without sensitive data exposure, or system 
configuration details that don't enable further exploitation.
-- **Resource exhaustion requiring authentication**: Denial of Service attacks 
that require valid user credentials and don't bypass rate limiting or resource 
controls.
-- **Missing security headers**: Without demonstration of a concrete exploit 
scenario that leverages the missing header.
+### Trust Boundaries
 
-**Outcome of Reports**
+Apache Superset's threat model assumes three trust boundaries.
+
+1. *The Admin role* is a fully trusted operational principal. Anything an 
Admin can do through the documented user interface, REST API, or configuration 
system is an intended capability, not a vulnerability, even if individually 
powerful or destructive. The Admin role is, by policy, equivalent to 
operating-system-level trust over the Apache Superset application. This is 
unavoidable rather than aspirational: an Admin can, for example, register new 
database connections of arbitrary type, execute arbitrary SQL through SQL Lab, 
render Jinja templates that resolve to SQL or rendered HTML, and override 
application configuration. Granting Admin is functionally equivalent to 
granting shell access on the host, which is the reasoning behind treating it as 
a trust boundary in the sense of MITRE CNA Operational Rules 4.1.
+
+2. *The operator* is whoever deploys, configures, and runs Apache Superset. 
Behaviors that depend on deployment-time decisions are the operator's 
responsibility, not Apache Superset's. This includes the values of secrets, the 
network reachability of the application and its data sources, the choice of 
database connectors and cache backends, the selection of feature flags, the 
destinations of notifications, and the trust placed in third-party plugins. 
Defaults that fail closed are the responsibility of the Apache Superset 
codebase. Defaults that fail open must be accompanied by a documented hardening 
requirement; applying that hardening is the operator's responsibility, while 
shipping an undocumented or unflagged fail-open default is a codebase issue.
+
+3. *The Apache Superset codebase* is responsible for enforcing the role and 
capability matrix below across its product surface. A failure to enforce, 
anywhere in that surface, is in scope. The codebase's commitments are limited 
to the role and capability matrix and to controls Apache Superset's own 
documentation (this file and the linked Security documentation) explicitly 
positions as security boundaries; configurable hardening that operators can 
layer on top is treated separately under *Vulnerability Scope* below.
+
+### Roles and Capabilities
+
+Apache Superset ships with the following first-class principals. Detailed 
permission definitions live in the [Security 
documentation](https://superset.apache.org/docs/security).
+
+| Principal | Read data | Write objects | Execute SQL | Manage databases | Manage users, roles, RLS |
+|---|---|---|---|---|---|
+| Public (anonymous) | none by default | no | no | no | no |
+| Gamma | only granted datasets | own charts and dashboards on granted datasets | no by default (requires the `sql_lab` role) | no | no |
+| Alpha | all data sources | own charts, dashboards, and datasets | no by default (requires the `sql_lab` role) | data upload to existing databases only | no |
+| Admin | all | all | yes | yes | yes |
+| Embedded guest token | data sources reachable through the embedded dashboards the token authorizes | no | no | no | no |
+
+The `sql_lab` role is *additive*: it grants the SQL Lab permission set on top 
of the base role above, and is the only path by which Gamma or Alpha gain SQL 
execution capability. Database access is still scoped per the base role's 
grants. Admin includes SQL Lab access by default.
+
+Deployments may grant or revoke individual view-menu permissions, which shifts 
the boundary for that deployment but does not redefine the model. Any custom 
role created by an operator inherits the same principle: its capabilities are 
whatever the operator has explicitly granted it. The Public principal follows 
the same rule: operators may grant the Public role read access to specific 
datasets or dashboards (typically for anonymous reporting use cases), which 
shifts the boundary for that deployment without redefining the model.
+
+### Vulnerability Scope
+
+The test for whether a finding is in scope is a single question:
 
-Reports that are deemed out-of-scope for a CVE but represent valid security 
best practices or hardening opportunities may be converted into public GitHub 
issues. This allows the community to contribute to the general hardening of the 
platform even when a specific vulnerability threshold is not met.
+> *Does this finding let a principal perform an action the role and capability 
matrix above does not entitle them to?*
 
-Note that Apache Superset is not responsible for any third-party dependencies 
that may
-have security issues. Any vulnerabilities found in third-party dependencies 
should be
-reported to the maintainers of those projects. Results from security scans of 
Apache
-Superset dependencies found on its official Docker image can be remediated at 
release time
-by extending the image itself.
+If yes, it is in scope. If no, it is out of scope. The lists below apply that 
test to the classes Apache Superset most commonly receives reports about; they 
are illustrative, not exhaustive.
+
+*In Scope*
+
+- A user, embedded guest, or anonymous visitor reads, modifies, or deletes 
data outside their granted set. Includes object-level access bypass on charts, 
dashboards, datasets, saved queries, tags, annotations, and similar per-object 
endpoints, and row-level-security rule bypass.
+- A user supplies input that the codebase should sanitize or parameterize but 
does not, causing arbitrary SQL, template code, or scripts to execute. Includes 
injection through Jinja templates, SQL-construction paths, and any field the 
codebase passes to a query engine or template engine.
+- A user bypasses authentication, fixates or reuses another user's session, or 
reaches an authenticated endpoint without logging in.
+- An embedded guest token authorizes actions outside the dashboard it was 
issued for, or can be forged, replayed, or escalated to a higher principal.
+- Apache Superset, acting on behalf of an unprivileged user, fetches an 
outbound URL the user controls in a feature where Apache Superset itself, not 
the operator, controls the outbound destination set (server-side request 
forgery).
+- An Apache Superset default fails open without an accompanying documented 
hardening requirement. The codebase is responsible for shipping fail-closed 
defaults or for documenting the hardening required when a default fails open; 
failures of that responsibility are in scope (see *Trust Boundaries*).
+- A user bypasses a control Apache Superset documents specifically as a 
security boundary. This includes row-level security, the access checks tied to 
the role and capability matrix above, and any feature whose documentation 
positions it as security-relevant. The codebase commits to enforcing those 
controls; bypasses are in scope regardless of which principal triggers them.
+- A user causes a script to execute in another user's browser through a field 
the codebase renders to that other user (cross-site scripting), or causes 
cross-origin leakage of authenticated session state or data.
+- A user reaches a route, page, or API endpoint that requires a role they do 
not have.
+
+*Out of Scope*

Review Comment:
   Can we add an explicit call-out that a compromised metastore, cache, or 
similar backend is NOT in scope? Something like:
   
   > Compromise, modification, or malicious control of trusted backend 
infrastructure is out of scope. Apache Superset assumes the integrity of its 
metastore, cache backends (for example Redis or Memcached), message brokers, 
secret stores, and other operator-managed infrastructure. Findings that require 
an attacker to read from, write to, or otherwise tamper with these 
systems—including injecting malicious state, serialized objects, cache entries, 
task metadata, configuration, or database records—are considered 
post-compromise scenarios and do not constitute vulnerabilities in Apache 
Superset itself. A finding remains in scope only if an unprivileged user can 
cause such modification through a vulnerability in Apache Superset.
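
To make the proposal concrete, the scope test and the suggested infrastructure carve-out could be combined roughly as below. This is only a sketch under stated assumptions: the capability names, the `Finding` type, and the `TRUSTED_INFRA` labels are hypothetical, and the matrix is a simplification of the table in the diff (partial grants, such as Alpha's data upload to existing databases, are collapsed to "not granted").

```python
from dataclasses import dataclass

# Default capabilities per principal, simplified from the role and capability
# matrix in the diff. Capability names are hypothetical labels.
DEFAULT_MATRIX = {
    "Public": set(),
    "Gamma": set(),
    "Alpha": set(),
    "Admin": {"execute_sql", "manage_databases", "manage_users_roles_rls"},
    "Embedded guest token": set(),
}

# The additive `sql_lab` role is the only path to SQL execution for
# Gamma and Alpha.
SQL_LAB_GRANTS = {"execute_sql"}

# Operator-managed systems whose integrity the proposed text assumes.
TRUSTED_INFRA = {"metastore", "cache", "message_broker", "secret_store"}


@dataclass(frozen=True)
class Finding:
    principal: str                             # who performs the action
    action: str                                # capability being exercised
    extra_roles: frozenset = frozenset()       # e.g. {"sql_lab"}
    requires_control_of: frozenset = frozenset()  # systems already controlled


def in_scope(finding):
    """Apply the single-question test, with the infrastructure carve-out."""
    # Post-compromise scenario: the attacker already controls trusted infra.
    if finding.requires_control_of & TRUSTED_INFRA:
        return False
    entitled = set(DEFAULT_MATRIX[finding.principal])
    if "sql_lab" in finding.extra_roles:
        entitled |= SQL_LAB_GRANTS
    # In scope only if the action exceeds what the matrix entitles.
    return finding.action not in entitled
```

For example, `in_scope(Finding("Gamma", "execute_sql"))` is true, while the same finding with `requires_control_of=frozenset({"cache"})` is not, which is the distinction the proposed paragraph draws.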



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

