[GH] Adding security documentation (tooling-trusted-releases)

via GitHub Thu, 22 Jan 2026 09:05:37 -0800


sbp commented on code in PR #575:
URL: 
https://github.com/apache/tooling-trusted-releases/pull/575#discussion_r2717655278



##########
atr/docs/input-validation.md:
##########
@@ -0,0 +1,303 @@
+# 3.13. Input validation
+
+**Up**: `3.` [Developer guide](developer-guide)
+
+**Prev**: `3.12.` [Authorization security](security-authorization)
+
+**Next**: (none)
+
+**Sections**:
+
+* [Overview](#overview)
+* [Defense in depth](#defense-in-depth)
+* [Form validation with Pydantic](#form-validation-with-pydantic)
+* [CSRF protection](#csrf-protection)
+* [Validation rules by input type](#validation-rules-by-input-type)
+* [Data integrity validation](#data-integrity-validation)
+* [Output encoding](#output-encoding)
+* [File upload security](#file-upload-security)
+* [Injection prevention](#injection-prevention)
+* [Implementation references](#implementation-references)
+
+## Overview
+
+Input validation is critical for ATR's security posture. As a system that 
handles cryptographic signatures and release artifacts, ATR must ensure that 
all user input is properly validated before processing. This page documents the 
validation strategies and patterns used throughout the codebase.
+
+## Defense in depth
+
+ATR employs multiple layers of validation:
+
+1. **Transport layer**: HTTPS required, enforced by httpd
+2. **Request layer**: Size limits enforced by httpd (`MAX_CONTENT_LENGTH`)
+3. **Form layer**: Pydantic models validate structure and types
+4. **Application layer**: Business logic validation in route handlers
+5. **Database layer**: SQLAlchemy ORM with parameterized queries, plus 
constraints
+6. **Output layer**: Jinja2 auto-escaping for HTML output
+
+Each layer provides independent protection, so a failure in one layer does not 
compromise the system.
+
+## Form validation with Pydantic
+
+All form inputs in ATR are validated through 
[Pydantic](https://docs.pydantic.dev/latest/) models defined in 
[`form.py`](/ref/atr/form.py). The base class for forms is 
[`Form`](/ref/atr/form.py:Form), which extends Pydantic's `BaseModel`.
+
+### Defining form fields
+
+Form fields are defined using Python type annotations and the 
[`label`](/ref/atr/form.py:label) function:
+
+```python
+class ExampleForm(Form):
+    name: str = label("Project name", "Enter the project name")
+    count: int = label("Count", widget=Widget.NUMBER)
+    email: EmailStr = label("Contact email", widget=Widget.EMAIL)
+```
+
+The `label` function accepts a description (shown to users), optional 
documentation, and an optional widget hint for rendering.
+
+### Validation process
+
+When a form is submitted, ATR:
+
+1. Extracts form data from the request via 
[`quart_request`](/ref/atr/form.py:quart_request)
+2. Passes the data to the Pydantic model for validation
+3. If validation fails, collects errors via 
[`flash_error_data`](/ref/atr/form.py:flash_error_data)
+4. Displays errors to the user with 
[`flash_error_summary`](/ref/atr/form.py:flash_error_summary)
+5. If validation succeeds, proceeds with the validated data
+
+Pydantic provides built-in validators for common types (strings, integers, 
emails, URLs) and supports custom validators via decorators.
+
+### Custom validators
+
+For complex validation logic, use Pydantic's `@model_validator` decorator:
+
+```python
+import re
+
+from pydantic import model_validator
+
+class ReleaseForm(Form):
+    version: str = label("Version")
+
+    @model_validator(mode="after")
+    def validate_version_format(self):
+        if not re.match(r"^\d+\.\d+\.\d+", self.version):
+            raise ValueError("Version must start with X.Y.Z")
+        return self
+```
+
+## CSRF protection
+
+All forms that modify state must include a CSRF token. The token is generated 
by [`csrf_input`](/ref/atr/form.py:csrf_input) and validated automatically by 
Quart-WTF:

Review Comment:
   I don't think we have any GET forms (or very, very few if so), so this is 
essentially all forms for ATR. We should probably mention that developers don't 
have to include this field manually if they're using the `form` module 
renderer, because that will add it for them.
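
A minimal sketch of that idea, using only the standard library. It is not ATR's `form` module: `hidden_csrf_field` and `render_form` are hypothetical stand-ins. The point it shows is that a renderer can emit the hidden token field itself, so callers supply only their own fields.

```python
import html
import secrets


def hidden_csrf_field(token: str) -> str:
    """Return a hidden input carrying the CSRF token (hypothetical helper)."""
    value = html.escape(token, quote=True)
    return f'<input type="hidden" name="csrf_token" value="{value}">'


def render_form(action: str, fields_html: str, token: str) -> str:
    """Hypothetical renderer: always emits the CSRF field for a POST form."""
    safe_action = html.escape(action, quote=True)
    return (
        f'<form method="post" action="{safe_action}">'
        f"{hidden_csrf_field(token)}{fields_html}</form>"
    )


# The caller passes only its own fields; the token field is added by the renderer.
token = secrets.token_urlsafe(32)
page = render_form("/example", '<input name="version">', token)
```

With a renderer shaped like this, a developer cannot forget the token, which is why the docs could say the field only needs adding by hand when a form is written out manually.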



##########
atr/docs/security-authentication.md:
##########
@@ -0,0 +1,177 @@
+# 3.11. Authentication security
+
+**Up**: `3.` [Developer guide](developer-guide)
+
+**Prev**: `3.10.` [How to contribute](how-to-contribute)
+
+**Next**: `3.12.` [Authorization security](security-authorization)
+
+**Sections**:
+
+* [Overview](#overview)
+* [Transport security](#transport-security)
+* [Web authentication](#web-authentication)
+* [API authentication](#api-authentication)
+* [Token lifecycle](#token-lifecycle)
+* [Security properties](#security-properties)
+* [Limitations and future work](#limitations-and-future-work)
+* [Implementation references](#implementation-references)
+
+## Overview
+
+ATR uses two authentication mechanisms depending on the access method:
+
+* **Web sessions** via ASF OAuth for browser-based users accessing the web 
interface
+* **JWT tokens** derived from Personal Access Tokens (PATs) for programmatic 
API access
+
+Both mechanisms require HTTPS. Authentication verifies the identity of users, 
while authorization (covered in [Authorization 
security](security-authorization)) determines what actions they can perform.
+
+## Transport security
+
+All ATR routes, on both the website and the API, require HTTPS using TLS 1.2 
or newer. This is enforced at the httpd layer in front of the application. 
Requests over plain HTTP are redirected to HTTPS.
+
+Tokens and credentials must never appear in URLs, as URLs may be logged or 
cached. They must only be transmitted in request headers or POST bodies over 
HTTPS.
+
+## Web authentication
+
+### ASF OAuth integration
+
+Browser users authenticate through [ASF 
OAuth](https://oauth.apache.org/api.html). The authentication flow works as 
follows:
+
+1. User clicks "Sign in" on the ATR website
+2. ATR redirects the user to the ASF OAuth service
+3. User authenticates with their ASF credentials
+4. ASF OAuth redirects the user back to ATR with session information
+5. ATR creates a server-side session linked to the user's ASF UID
+
+The session is managed by 
[ASFQuart](https://github.com/apache/infrastructure-asfquart), which handles 
the OAuth handshake and session cookie management.
+
+### Session management
+
+Sessions are stored server-side. The browser receives only a session cookie 
that references the server-side session data. Session cookies are configured 
with security attributes:
+
+* `HttpOnly` - prevents JavaScript access to the cookie
+* `Secure` - cookie is only sent over HTTPS
+* `SameSite=Lax` - provides CSRF protection for most requests
+
+Session data includes the user's ASF UID and is used to authorize requests. 
The session expires after a period of inactivity or when the user logs out.
+
+### Session caching
+
+Authorization data fetched from LDAP (committee memberships, project 
participation) is cached in [`principal.Cache`](/ref/atr/principal.py:Cache) 
for performance. The cache has a TTL of 300 seconds, defined by 
`cache_for_at_most_seconds`. After the TTL expires, the next request will 
refresh the cache from LDAP.
+
+## API authentication
+
+API access uses a two-token system: Personal Access Tokens (PATs) for 
long-term credentials and JSON Web Tokens (JWTs) for short-term API access.
+
+### Personal Access Tokens (PATs)
+
+Committers can obtain PATs from the `/tokens` page on the ATR website. PATs 
have the following properties:
+
+* **Validity**: 180 days from creation
+* **Storage**: ATR stores only bcrypt hashes, never the plaintext PAT
+* **Revocation**: Users can revoke their own PATs at any time; admins can 
revoke any PAT
+* **Purpose**: PATs are used solely to obtain JWTs; they cannot be used 
directly for API access
+
+Only authenticated committers (signed in via ASF OAuth) can create PATs. Each 
user can have multiple active PATs.
+
+### JSON Web Tokens (JWTs)
+
+To access protected API endpoints, users must first obtain a JWT by exchanging 
their PAT. This is done by POSTing to `/api/jwt`:
+
+```text
+POST /api/jwt
+Content-Type: application/json
+
+{"asfuid": "username", "pat": "pat_token_value"}
+```
+
+On success, the response contains a JWT:
+
+```json
+{"asfuid": "username", "jwt": "jwt_token_value"}
+```
+
+JWTs have the following properties:
+
+* **Algorithm**: HS256 (HMAC-SHA256)
+* **Validity**: 90 minutes from creation
+* **Claims**: `sub` (ASF UID), `iat` (issued at), `exp` (expiration), `jti` 
(unique token ID)
+* **Storage**: JWTs are stateless; ATR does not store issued JWTs
+
+The JWT is used in the `Authorization` header as a bearer token:
+
+```text
+Authorization: Bearer jwt_token_value
+```
+
+### Token handling
+
+The [`jwtoken`](/ref/atr/jwtoken.py) module handles JWT creation and 
verification. Protected API endpoints use the `@jwtoken.require` decorator, 
which extracts the JWT from the `Authorization` header, verifies its signature 
and expiration, and makes the user's ASF UID available to the handler.
+
+## Token lifecycle
+
+The relationship between authentication methods and tokens:
+
+```text
+ASF OAuth (web login)
+    │
+    ├──▶ Web Session ──▶ Web Interface Access
+    │
+    └──▶ PAT Creation ──▶ PAT (180 days)
+                              │
+                              └──▶ JWT Exchange ──▶ JWT (90 min)
+                                                       │
+                                                       └──▶ API Access
+```
+
+For web users, authentication happens once via ASF OAuth, and the session 
persists until logout or expiration. For API users, the flow is: obtain a PAT 
once (via the web interface), then exchange it for JWTs as needed (JWTs expire 
quickly, so this exchange happens frequently in long-running scripts).
+
+## Security properties
+
+### Web sessions
+
+* Server-side storage prevents client-side tampering
+* Session cookies are protected against XSS (`HttpOnly`) and transmission 
interception (`Secure`)
+* `SameSite` attribute provides baseline CSRF protection (ATR also uses CSRF 
tokens in forms)
+
+### Personal Access Tokens
+
+* Stored as bcrypt hashes with appropriate cost factor
+* Can be revoked immediately by the user
+* Limited purpose (only for JWT issuance) reduces impact of compromise
+* Long validity (180 days) balanced by easy revocation
+
+### JSON Web Tokens
+
+* Short validity (90 minutes) limits exposure window
+* Signed with a server secret initialized at startup
+* Stateless design means no database lookup required for verification
+* Server restart invalidates all outstanding JWTs (secret is regenerated)
+
+### Credential protection
+
+Tokens must be protected by the user at all times:
+
+* Never include tokens in URLs
+* Never log tokens
+* Never commit tokens to source control
+* Report compromised tokens to ASF security immediately
+
+## Limitations and future work
+
+The current authentication system has some limitations:
+
+* **No token scopes**: PATs and JWTs grant full access for the user; 
fine-grained scopes are not yet implemented.
+* **No individual JWT revocation**: JWTs cannot be revoked individually. In an 
emergency, restarting the server regenerates the signing secret, which 
invalidates all JWTs.
+* **No refresh tokens**: When a JWT expires, users must exchange their PAT for 
a new JWT. There is no refresh token mechanism.
+* **Limited auditing**: Comprehensive logging and auditing of token operations 
is planned but not fully implemented.
+* **Unused JWT fields**: Some standard JWT fields like `iss` (issuer) are not 
currently used.
+* **No rate limiting**: PAT and JWT issuance is not currently rate limited.
+
+These limitations are tracked for future improvement.

Review Comment:
   Do we have issues for all of these? Could we link to the issues?
   



##########
atr/docs/input-validation.md:
##########
@@ -0,0 +1,303 @@
+# 3.13. Input validation
+
+**Up**: `3.` [Developer guide](developer-guide)
+
+**Prev**: `3.12.` [Authorization security](security-authorization)
+
+**Next**: (none)
+
+**Sections**:
+
+* [Overview](#overview)
+* [Defense in depth](#defense-in-depth)
+* [Form validation with Pydantic](#form-validation-with-pydantic)
+* [CSRF protection](#csrf-protection)
+* [Validation rules by input type](#validation-rules-by-input-type)
+* [Data integrity validation](#data-integrity-validation)
+* [Output encoding](#output-encoding)
+* [File upload security](#file-upload-security)
+* [Injection prevention](#injection-prevention)
+* [Implementation references](#implementation-references)
+
+## Overview
+
+Input validation is critical for ATR's security posture. As a system that 
handles cryptographic signatures and release artifacts, ATR must ensure that 
all user input is properly validated before processing. This page documents the 
validation strategies and patterns used throughout the codebase.
+
+## Defense in depth
+
+ATR employs multiple layers of validation:
+
+1. **Transport layer**: HTTPS required, enforced by httpd
+2. **Request layer**: Size limits enforced by httpd (`MAX_CONTENT_LENGTH`)
+3. **Form layer**: Pydantic models validate structure and types
+4. **Application layer**: Business logic validation in route handlers
+5. **Database layer**: SQLAlchemy ORM with parameterized queries, plus 
constraints
+6. **Output layer**: Jinja2 auto-escaping for HTML output
+
+Each layer provides independent protection, so a failure in one layer does not 
compromise the system.
+
+## Form validation with Pydantic
+
+All form inputs in ATR are validated through 
[Pydantic](https://docs.pydantic.dev/latest/) models defined in 
[`form.py`](/ref/atr/form.py). The base class for forms is 
[`Form`](/ref/atr/form.py:Form), which extends Pydantic's `BaseModel`.
+
+### Defining form fields
+
+Form fields are defined using Python type annotations and the 
[`label`](/ref/atr/form.py:label) function:
+
+```python
+class ExampleForm(Form):
+    name: str = label("Project name", "Enter the project name")
+    count: int = label("Count", widget=Widget.NUMBER)
+    email: EmailStr = label("Contact email", widget=Widget.EMAIL)
+```
+
+The `label` function accepts a description (shown to users), optional 
documentation, and an optional widget hint for rendering.
+
+### Validation process
+
+When a form is submitted, ATR:
+
+1. Extracts form data from the request via 
[`quart_request`](/ref/atr/form.py:quart_request)
+2. Passes the data to the Pydantic model for validation
+3. If validation fails, collects errors via 
[`flash_error_data`](/ref/atr/form.py:flash_error_data)
+4. Displays errors to the user with 
[`flash_error_summary`](/ref/atr/form.py:flash_error_summary)
+5. If validation succeeds, proceeds with the validated data
+
+Pydantic provides built-in validators for common types (strings, integers, 
emails, URLs) and supports custom validators via decorators.
+
+### Custom validators
+
+For complex validation logic, use Pydantic's `@model_validator` decorator:
+
+```python
+import re
+
+from pydantic import model_validator
+
+class ReleaseForm(Form):
+    version: str = label("Version")
+
+    @model_validator(mode="after")
+    def validate_version_format(self):
+        if not re.match(r"^\d+\.\d+\.\d+", self.version):
+            raise ValueError("Version must start with X.Y.Z")
+        return self
+```
+
+## CSRF protection
+
+All forms that modify state must include a CSRF token. The token is generated 
by [`csrf_input`](/ref/atr/form.py:csrf_input) and validated automatically by 
Quart-WTF:
+
+```python
+def csrf_input() -> htm.VoidElement:
+    csrf_token = utils.generate_csrf()
+    return htpy.input(type="hidden", name="csrf_token", value=csrf_token)
+```
+
+In templates, include the CSRF token in every form:
+
+```html
+<form method="post">
+    {{ csrf_input() }}
+    <!-- other form fields -->
+</form>
+```
+
+The CSRF token is tied to the user's session and validated on form submission. 
Requests without a valid CSRF token are rejected.
+
+## Validation rules by input type
+
+### ASF User IDs
+
+User IDs are validated against a strict pattern in 
[`principal.py`](/ref/atr/principal.py):
+
+```python
+if not re.match(r"^[-_a-z0-9]+$", user):
+    raise CommitterError("Invalid characters in User ID")
+```
+
+Only lowercase alphanumeric characters, hyphens, and underscores are permitted.
+
+### Email addresses
+
+Email validation uses Pydantic's `EmailStr` type, which checks address syntax 
via the `email-validator` package rather than accepting the full RFC 5322 
grammar:
+
+```python
+from pydantic import EmailStr
+
+class ContactForm(Form):
+    email: EmailStr = label("Email address")
+```
+
+### URLs
+
+URL validation uses Pydantic's `HttpUrl` type:
+
+```python
+from pydantic import HttpUrl
+
+class LinkForm(Form):
+    website: HttpUrl = label("Website URL")
+```
+
+### Version strings
+
+Version strings are validated according to project-specific patterns. The 
general pattern allows semantic versioning with optional suffixes:
+
+```python
+VERSION_PATTERN = re.compile(r"^[0-9]+\.[0-9]+.*$")
+```
+
+### Committee and project names
+
+Committee and project names are validated against the set of known committees 
and projects from LDAP and the ASF project database. Unknown names are rejected.
+
+### File names
+
+File names in uploads are sanitized to prevent path traversal:
+
+* Directory separators (`/`, `\`) are rejected or stripped
+* Null bytes are rejected
+* Only expected extensions are permitted per upload type
+
+## Data integrity validation
+
+Beyond input validation, ATR performs data integrity validation on database 
records using [`validate.py`](/ref/atr/validate.py). This catches 
inconsistencies that may have been introduced by bugs, migrations, or manual 
database edits.
+
+### Committee validation
+
+The [`committee`](/ref/atr/validate.py:committee) function checks:
+
+* `child_committees` must be empty (not used)
+* `full_name` must be set, trimmed, and not prefixed with "Apache "
+
+### Project validation
+
+The [`project`](/ref/atr/validate.py:project) function checks:
+
+* `category` must use comma-separated labels without colons
+* `committee_name` must be set (project must be linked to a committee)
+* `created` timestamp must be in the past
+* `full_name` must be set and start with "Apache "
+* `programming_languages` must use comma-separated labels without colons
+* `release_policy_id` must be None (not used)
+
+### Release validation
+
+The [`release`](/ref/atr/validate.py:release) function checks:
+
+* `created` timestamp must be in the past
+* `name` must match the expected pattern for project and version
+* Release directory must exist on disk and contain files
+* `package_managers` must be empty (not used)
+* `released` timestamp must be in the past or None
+* `sboms` must be empty (not used)
+* Vote logic must be consistent (cannot have `vote_resolved` without 
`vote_started`)
+* `votes` must be empty (not used)
+
+### Running validation
+
+Data integrity validation can be run via the admin interface or 
programmatically:
+
+```python
+async for divergence in validate.everything(data):
+    print(f"{divergence.source}: {divergence.divergence}")
+```
+
+## Output encoding
+
+ATR uses [Jinja2](https://jinja.palletsprojects.com/) for templating with 
auto-escaping enabled by default. All variables rendered in templates are 
automatically HTML-escaped:
+
+```html
+<!-- This is safe; user_input is escaped -->
+<p>Hello, {{ user_input }}</p>
+```
+
+When HTML output is intentionally generated (e.g., via htpy), it must be 
explicitly marked safe using `markupsafe.Markup`:
+
+```python
+import markupsafe
+safe_html = markupsafe.Markup("<strong>Bold</strong>")
+```
+
+Never mark user-controlled data as safe without proper sanitization.
+
+## File upload security
+
+File uploads are handled with several security measures:
+
+### Size limits
+
+Maximum upload size is enforced at the httpd layer via `MAX_CONTENT_LENGTH`. 
This prevents denial-of-service attacks via large uploads.
+
+### Extension validation
+
+Each upload type has an allowlist of permitted file extensions. Files with 
unexpected extensions are rejected.
+
+### Storage location
+
+Uploaded files are stored outside the web root in configured directories 
(e.g., `state/unfinished/`). They are not directly accessible via HTTP.
+
+### File handling
+
+Files are processed via 
[`quart.datastructures.FileStorage`](/ref/atr/form.py:quart_request) and 
validated before being written to disk. Empty files (where the browser sends a 
file input with no selection) are filtered out.
+
+## Injection prevention
+
+### SQL injection
+
+ATR uses SQLAlchemy ORM exclusively for database access. All queries use 
parameterized statements:
+
+```python
+# Safe: parameterized query
+result = await session.exec(
+    select(Project).where(Project.name == project_name)
+)
+```
+
+Direct SQL string concatenation is never used.
+
+### Cross-site scripting (XSS)
+
+XSS is prevented through:
+
+* Jinja2 auto-escaping (enabled by default)
+* `markupsafe.Markup` for trusted HTML only
+* Content Security Policy headers (configured in httpd)
+
+### Path traversal
+
+Path traversal is prevented by:
+
+* Using `pathlib.Path` for all file operations
+* Validating that paths remain within expected directories
+* Rejecting file names containing path separators
+
+```python
+import pathlib
+
+base = pathlib.Path("/allowed/directory")
+user_path = base / user_filename
+# Verify the resolved path is still under base
+if not user_path.resolve().is_relative_to(base.resolve()):
+    raise ValueError("Path traversal detected")
+```
+
+### Command injection
+
+ATR minimizes shell command execution. Where external commands are necessary 
(e.g., GPG operations), arguments are passed as lists, never as shell strings:

Review Comment:
   Probably best to just say we safeguard against command injection as much as 
possible.
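
For reference, the list-versus-shell-string distinction the quoted text describes can be sketched as follows. This is an illustration only, not ATR's code: `run_tool` is a hypothetical helper, and `sys.executable` stands in for an external tool such as GPG so that the example runs anywhere.

```python
import subprocess
import sys


def run_tool(args: list[str]) -> str:
    """Run an external command without a shell; each list item is one argv entry."""
    completed = subprocess.run(args, capture_output=True, text=True, check=True)
    return completed.stdout


# A hostile "file name" that would start a second command if it were
# interpolated into a shell string.
hostile = "release.tar.gz; echo INJECTED"

# Passed as a list element, the whole string reaches the child process as a
# single argument and is never parsed by a shell.
output = run_tool([sys.executable, "-c", "import sys; print(sys.argv[1])", hostile])
```

This covers only shell metacharacter injection. An argument that begins with `-` can still be read as an option by the tool being called, which is one reason a softer claim such as "as much as possible" may be the safer wording.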



##########
atr/docs/security-authorization.md:
##########
@@ -0,0 +1,217 @@
+# 3.12. Authorization security
+
+**Up**: `3.` [Developer guide](developer-guide)
+
+**Prev**: `3.11.` [Authentication security](security-authentication)
+
+**Next**: `3.13.` [Input validation](input-validation)
+
+**Sections**:
+
+* [Overview](#overview)
+* [Roles and principals](#roles-and-principals)
+* [LDAP integration](#ldap-integration)
+* [Access control for releases](#access-control-for-releases)
+* [Access control for projects](#access-control-for-projects)
+* [Access control for tokens](#access-control-for-tokens)
+* [Implementation patterns](#implementation-patterns)
+* [Caching behavior](#caching-behavior)
+* [Implementation references](#implementation-references)
+
+## Overview
+
+ATR uses role-based access control (RBAC) where roles are derived from ASF 
LDAP group memberships. Authentication (covered in [Authentication 
security](security-authentication)) establishes *who* a user is; authorization 
determines *what* they can do.
+
+The authorization model is committee-centric: most permissions are granted 
based on a user's relationship to a committee (PMC membership) or project 
(committer status).
+
+## Roles and principals
+
+ATR recognizes the following roles, derived from ASF LDAP:
+
+* **Public**: Unauthenticated users. Can view public information about 
releases and projects.
+
+* **Committer**: Any authenticated ASF committer. Can create Personal Access 
Tokens and view their own committees and projects. Determined by existence in 
LDAP `ou=people,dc=apache,dc=org`.
+
+* **Project Participant**: A committer who is a member of a specific project 
(has commit access). Can start releases, upload artifacts, and cast votes for 
that project. Determined by the `member` attribute in the project's LDAP group.

Review Comment:
   Oh, this line even says so at the end. So I guess we can just remove "(has 
commit access)".



##########
atr/docs/input-validation.md:
##########
@@ -0,0 +1,303 @@
+# 3.13. Input validation
+
+**Up**: `3.` [Developer guide](developer-guide)
+
+**Prev**: `3.12.` [Authorization security](security-authorization)
+
+**Next**: (none)
+
+**Sections**:
+
+* [Overview](#overview)
+* [Defense in depth](#defense-in-depth)
+* [Form validation with Pydantic](#form-validation-with-pydantic)
+* [CSRF protection](#csrf-protection)
+* [Validation rules by input type](#validation-rules-by-input-type)
+* [Data integrity validation](#data-integrity-validation)
+* [Output encoding](#output-encoding)
+* [File upload security](#file-upload-security)
+* [Injection prevention](#injection-prevention)
+* [Implementation references](#implementation-references)
+
+## Overview
+
+Input validation is critical for ATR's security posture. As a system that 
handles cryptographic signatures and release artifacts, ATR must ensure that 
all user input is properly validated before processing. This page documents the 
validation strategies and patterns used throughout the codebase.
+
+## Defense in depth
+
+ATR employs multiple layers of validation:
+
+1. **Transport layer**: HTTPS required, enforced by httpd
+2. **Request layer**: Size limits enforced by httpd (`MAX_CONTENT_LENGTH`)
+3. **Form layer**: Pydantic models validate structure and types
+4. **Application layer**: Business logic validation in route handlers
+5. **Database layer**: SQLAlchemy ORM with parameterized queries, plus 
constraints
+6. **Output layer**: Jinja2 auto-escaping for HTML output
+
+Each layer provides independent protection, so a failure in one layer does not 
compromise the system.
+
+## Form validation with Pydantic
+
+All form inputs in ATR are validated through 
[Pydantic](https://docs.pydantic.dev/latest/) models defined in 
[`form.py`](/ref/atr/form.py). The base class for forms is 
[`Form`](/ref/atr/form.py:Form), which extends Pydantic's `BaseModel`.
+
+### Defining form fields
+
+Form fields are defined using Python type annotations and the 
[`label`](/ref/atr/form.py:label) function:
+
+```python
+class ExampleForm(Form):
+    name: str = label("Project name", "Enter the project name")
+    count: int = label("Count", widget=Widget.NUMBER)
+    email: EmailStr = label("Contact email", widget=Widget.EMAIL)
+```
+
+The `label` function accepts a description (shown to users), optional 
documentation, and an optional widget hint for rendering.
+
+### Validation process
+
+When a form is submitted, ATR:
+
+1. Extracts form data from the request via 
[`quart_request`](/ref/atr/form.py:quart_request)
+2. Passes the data to the Pydantic model for validation
+3. If validation fails, collects errors via 
[`flash_error_data`](/ref/atr/form.py:flash_error_data)
+4. Displays errors to the user with 
[`flash_error_summary`](/ref/atr/form.py:flash_error_summary)
+5. If validation succeeds, proceeds with the validated data
+
+Pydantic provides built-in validators for common types (strings, integers, 
emails, URLs) and supports custom validators via decorators.
+
+### Custom validators
+
+For complex validation logic, use Pydantic's `@model_validator` decorator:
+
+```python
+import re
+
+from pydantic import model_validator
+
+class ReleaseForm(Form):
+    version: str = label("Version")
+
+    @model_validator(mode="after")
+    def validate_version_format(self):
+        if not re.match(r"^\d+\.\d+\.\d+", self.version):
+            raise ValueError("Version must start with X.Y.Z")
+        return self
+```
+
+## CSRF protection
+
+All forms that modify state must include a CSRF token. The token is generated 
by [`csrf_input`](/ref/atr/form.py:csrf_input) and validated automatically by 
Quart-WTF:

Review Comment:
   Also I think this should just say that all POST forms must include a CSRF 
token. Most POST forms modify state, but not all: the vote tabulation form, for 
example, is POST because it performs significant computation, not because it 
modifies state.
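
To make the distinction concrete, here is a rough sketch of a check keyed on the HTTP method rather than on whether the handler modifies state. It is a standalone illustration under assumed names (`SAFE_METHODS`, `csrf_ok`), not a description of how Quart-WTF implements validation.

```python
import hmac
from typing import Optional

# Methods assumed here to be exempt from the token check.
SAFE_METHODS = frozenset({"GET", "HEAD", "OPTIONS"})


def csrf_ok(method: str, submitted: Optional[str], expected: str) -> bool:
    """Require a matching token for every non-safe method (hypothetical check)."""
    if method.upper() in SAFE_METHODS:
        return True
    if submitted is None:
        return False
    # Constant-time comparison avoids leaking how much of the token matched.
    return hmac.compare_digest(submitted.encode(), expected.encode())
```

Under this rule a computation-only POST form, such as the vote tabulation form, is checked exactly like a state-changing one, because the decision never looks at what the handler does.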



##########
atr/docs/security-authentication.md:
##########
@@ -0,0 +1,177 @@
+# 3.11. Authentication security
+
+**Up**: `3.` [Developer guide](developer-guide)
+
+**Prev**: `3.10.` [How to contribute](how-to-contribute)
+
+**Next**: `3.12.` [Authorization security](security-authorization)
+
+**Sections**:
+
+* [Overview](#overview)
+* [Transport security](#transport-security)
+* [Web authentication](#web-authentication)
+* [API authentication](#api-authentication)
+* [Token lifecycle](#token-lifecycle)
+* [Security properties](#security-properties)
+* [Limitations and future work](#limitations-and-future-work)
+* [Implementation references](#implementation-references)
+
+## Overview
+
+ATR uses two authentication mechanisms depending on the access method:
+
+* **Web sessions** via ASF OAuth for browser-based users accessing the web 
interface
+* **JWT tokens** derived from Personal Access Tokens (PATs) for programmatic 
API access
+
+Both mechanisms require HTTPS. Authentication verifies the identity of users, 
while authorization (covered in [Authorization 
security](security-authorization)) determines what actions they can perform.
+
+## Transport security
+
+All ATR routes, on both the website and the API, require HTTPS using TLS 1.2 
or newer. This is enforced at the httpd layer in front of the application. 
Requests over plain HTTP are redirected to HTTPS.
+
+Tokens and credentials must never appear in URLs, as URLs may be logged or 
cached. They must only be transmitted in request headers or POST bodies over 
HTTPS.
+
+## Web authentication
+
+### ASF OAuth integration
+
+Browser users authenticate through [ASF 
OAuth](https://oauth.apache.org/api.html). The authentication flow works as 
follows:
+
+1. User clicks "Sign in" on the ATR website
+2. ATR redirects the user to the ASF OAuth service
+3. User authenticates with their ASF credentials
+4. ASF OAuth redirects the user back to ATR with session information
+5. ATR creates a server-side session linked to the user's ASF UID
+
+The session is managed by 
[ASFQuart](https://github.com/apache/infrastructure-asfquart), which handles 
the OAuth handshake and session cookie management.
+
+### Session management
+
+Sessions are stored server-side. The browser receives only a session cookie 
that references the server-side session data. Session cookies are configured 
with security attributes:
+
+* `HttpOnly` - prevents JavaScript access to the cookie
+* `Secure` - cookie is only sent over HTTPS
+* `SameSite=Lax` - provides CSRF protection for most requests
+
+Session data includes the user's ASF UID and is used to authorize requests. 
The session expires after a period of inactivity or when the user logs out.
+
+### Session caching
+
+Authorization data fetched from LDAP (committee memberships, project 
participation) is cached in [`principal.Cache`](/ref/atr/principal.py:Cache) 
for performance. The cache has a TTL of 300 seconds, defined by 
`cache_for_at_most_seconds`. After the TTL expires, the next request will 
refresh the cache from LDAP.
+
+## API authentication
+
+API access uses a two-token system: Personal Access Tokens (PATs) for 
long-term credentials and JSON Web Tokens (JWTs) for short-term API access.
+
+### Personal Access Tokens (PATs)
+
+Committers can obtain PATs from the `/tokens` page on the ATR website. PATs 
have the following properties:
+
+* **Validity**: 180 days from creation
+* **Storage**: ATR stores only bcrypt hashes, never the plaintext PAT
+* **Revocation**: Users can revoke their own PATs at any time; admins can 
revoke any PAT
+* **Purpose**: PATs are used solely to obtain JWTs; they cannot be used 
directly for API access
+
+Only authenticated committers (signed in via ASF OAuth) can create PATs. Each 
user can have multiple active PATs.
+
+### JSON Web Tokens (JWTs)
+
+To access protected API endpoints, users must first obtain a JWT by exchanging 
their PAT. This is done by POSTing to `/api/jwt`:
+
+```text
+POST /api/jwt
+Content-Type: application/json
+
+{"asfuid": "username", "pat": "pat_token_value"}
+```
+
+On success, the response contains a JWT:
+
+```json
+{"asfuid": "username", "jwt": "jwt_token_value"}
+```
+
+JWTs have the following properties:
+
+* **Algorithm**: HS256 (HMAC-SHA256)
+* **Validity**: 90 minutes from creation
+* **Claims**: `sub` (ASF UID), `iat` (issued at), `exp` (expiration), `jti` 
(unique token ID)
+* **Storage**: JWTs are stateless; ATR does not store issued JWTs
+
+The JWT is used in the `Authorization` header as a bearer token:
+
+```text
+Authorization: Bearer jwt_token_value
+```
+
+### Token handling
+
+The [`jwtoken`](/ref/atr/jwtoken.py) module handles JWT creation and 
verification. Protected API endpoints use the `@jwtoken.require` decorator, 
which extracts the JWT from the `Authorization` header, verifies its signature 
and expiration, and makes the user's ASF UID available to the handler.
+
+## Token lifecycle
+
+The relationship between authentication methods and tokens:
+
+```text
+ASF OAuth (web login)
+    │
+    ├──▶ Web Session ──▶ Web Interface Access
+    │
+    └──▶ PAT Creation ──▶ PAT (180 days)
+                              │
+                              └──▶ JWT Exchange ──▶ JWT (90 min)
+                                                       │
+                                                       └──▶ API Access
+```
+
+For web users, authentication happens once via ASF OAuth, and the session 
persists until logout or expiration. For API users, the flow is: obtain a PAT 
once (via the web interface), then exchange it for JWTs as needed (JWTs expire 
quickly, so this exchange happens frequently in long-running scripts).
+
+## Security properties
+
+### Web sessions
+
+* Server-side storage prevents client-side tampering
+* Session cookies are protected against XSS (`HttpOnly`) and transmission 
interception (`Secure`)
+* `SameSite` attribute provides baseline CSRF protection (ATR also uses CSRF 
tokens in forms)
+
+### Personal Access Tokens
+
+* Stored as bcrypt hashes with appropriate cost factor
+* Can be revoked immediately by the user
+* Limited purpose (only for JWT issuance) reduces impact of compromise
+* Long validity (180 days) balanced by easy revocation
+
+### JSON Web Tokens
+
+* Short validity (90 minutes) limits exposure window
+* Signed with a server secret initialized at startup
+* Stateless design means no database lookup required for verification
+* Server restart invalidates all outstanding JWTs (secret is regenerated)

Review Comment:
   Don't think this is true. We have to manually delete the value during 
restart for it to be rotated.



##########
atr/docs/security-authentication.md:
##########
@@ -0,0 +1,177 @@
+# 3.11. Authentication security
+
+**Up**: `3.` [Developer guide](developer-guide)
+
+**Prev**: `3.10.` [How to contribute](how-to-contribute)
+
+**Next**: `3.12.` [Authorization security](security-authorization)
+
+**Sections**:
+
+* [Overview](#overview)
+* [Transport security](#transport-security)
+* [Web authentication](#web-authentication)
+* [API authentication](#api-authentication)
+* [Token lifecycle](#token-lifecycle)
+* [Security properties](#security-properties)
+* [Limitations and future work](#limitations-and-future-work)
+* [Implementation references](#implementation-references)
+
+## Overview
+
+ATR uses two authentication mechanisms depending on the access method:
+
+* **Web sessions** via ASF OAuth for browser-based users accessing the web 
interface
+* **JWT tokens** derived from Personal Access Tokens (PATs) for programmatic 
API access
+
+Both mechanisms require HTTPS. Authentication verifies the identity of users, 
while authorization (covered in [Authorization 
security](security-authorization)) determines what actions they can perform.
+
+## Transport security
+
+All ATR routes, on both the website and the API, require HTTPS using TLS 1.2 
or newer. This is enforced at the httpd layer in front of the application. 
Requests over plain HTTP are redirected to HTTPS.
+
+Tokens and credentials must never appear in URLs, as URLs may be logged or 
cached. They must only be transmitted in request headers or POST bodies over 
HTTPS.
+
+## Web authentication
+
+### ASF OAuth integration
+
+Browser users authenticate through [ASF 
OAuth](https://oauth.apache.org/api.html). The authentication flow works as 
follows:
+
+1. User clicks "Sign in" on the ATR website
+2. ATR redirects the user to the ASF OAuth service
+3. User authenticates with their ASF credentials
+4. ASF OAuth redirects the user back to ATR with session information
+5. ATR creates a server-side session linked to the user's ASF UID
+
+The session is managed by 
[ASFQuart](https://github.com/apache/infrastructure-asfquart), which handles 
the OAuth handshake and session cookie management.
+
+### Session management
+
+Sessions are stored server-side. The browser receives only a session cookie 
that references the server-side session data. Session cookies are configured 
with security attributes:
+
+* `HttpOnly` - prevents JavaScript access to the cookie
+* `Secure` - cookie is only sent over HTTPS
+* `SameSite=Lax` - provides CSRF protection for most requests

Review Comment:
   We use `SameSite=Strict`. That is the ASFQuart default.
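
For reference, a sketch of what a cookie carrying these three attributes looks like, built with the Python standard library only. This is not how ASFQuart sets its cookie; the cookie name `session` and the helper `session_cookie_header` are assumptions made for the example.

```python
from http.cookies import SimpleCookie


def session_cookie_header(session_id: str) -> str:
    """Build the value of a Set-Cookie header for an opaque session ID.

    Illustrative only: the cookie name "session" is an assumption.
    """
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["httponly"] = True      # not readable from JavaScript
    cookie["session"]["secure"] = True        # only sent over HTTPS
    cookie["session"]["samesite"] = "Strict"  # never sent on cross-site requests
    return cookie["session"].OutputString()
```

Unlike `Lax`, `Strict` also withholds the cookie on top-level cross-site navigations, so the bullet's wording "for most requests" would need adjusting along with the attribute value.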



##########
SECURITY.md:
##########
@@ -0,0 +1,44 @@
+# Security Policy
+
+## Reporting Security Issues

Review Comment:
   Could you [convert all headings to sentence 
case](https://release-test.apache.org/docs/code-conventions#documentation-and-interfaces) please?



##########
atr/docs/security-authentication.md:
##########
@@ -0,0 +1,177 @@
+# 3.11. Authentication security
+
+**Up**: `3.` [Developer guide](developer-guide)
+
+**Prev**: `3.10.` [How to contribute](how-to-contribute)
+
+**Next**: `3.12.` [Authorization security](security-authorization)
+
+**Sections**:
+
+* [Overview](#overview)
+* [Transport security](#transport-security)
+* [Web authentication](#web-authentication)
+* [API authentication](#api-authentication)
+* [Token lifecycle](#token-lifecycle)
+* [Security properties](#security-properties)
+* [Limitations and future work](#limitations-and-future-work)
+* [Implementation references](#implementation-references)
+
+## Overview
+
+ATR uses two authentication mechanisms depending on the access method:
+
+* **Web sessions** via ASF OAuth for browser-based users accessing the web 
interface
+* **JWT tokens** derived from Personal Access Tokens (PATs) for programmatic 
API access
+
+Both mechanisms require HTTPS. Authentication verifies the identity of users, 
while authorization (covered in [Authorization 
security](security-authorization)) determines what actions they can perform.
+
+## Transport security
+
+All ATR routes, on both the website and the API, require HTTPS using TLS 1.2 
or newer. This is enforced at the httpd layer in front of the application. 
Requests over plain HTTP are redirected to HTTPS.
+
+Tokens and credentials must never appear in URLs, as URLs may be logged or 
cached. They must only be transmitted in request headers or POST bodies over 
HTTPS.
+
+## Web authentication
+
+### ASF OAuth integration
+
+Browser users authenticate through [ASF 
OAuth](https://oauth.apache.org/api.html). The authentication flow works as 
follows:
+
+1. User clicks "Sign in" on the ATR website
+2. ATR redirects the user to the ASF OAuth service
+3. User authenticates with their ASF credentials
+4. ASF OAuth redirects the user back to ATR with session information
+5. ATR creates a server-side session linked to the user's ASF UID
+
+The session is managed by 
[ASFQuart](https://github.com/apache/infrastructure-asfquart), which handles 
the OAuth handshake and session cookie management.
+
+### Session management
+
+Sessions are stored server-side. The browser receives only a session cookie 
that references the server-side session data. Session cookies are configured 
with security attributes:
+
+* `HttpOnly` - prevents JavaScript access to the cookie
+* `Secure` - cookie is only sent over HTTPS
+* `SameSite=Lax` - provides CSRF protection for most requests
+
+Session data includes the user's ASF UID and is used to authorize requests. 
The session expires after a period of inactivity or when the user logs out.
+
+### Session caching
+
+Authorization data fetched from LDAP (committee memberships, project 
participation) is cached in [`principal.Cache`](/ref/atr/principal.py:Cache) 
for performance. The cache has a TTL of 300 seconds, defined by 
`cache_for_at_most_seconds`. After the TTL expires, the next request will 
refresh the cache from LDAP.
+
+## API authentication
+
+API access uses a two-token system: Personal Access Tokens (PATs) for 
long-term credentials and JSON Web Tokens (JWTs) for short-term API access.
+
+### Personal Access Tokens (PATs)
+
+Committers can obtain PATs from the `/tokens` page on the ATR website. PATs 
have the following properties:
+
+* **Validity**: 180 days from creation
+* **Storage**: ATR stores only bcrypt hashes, never the plaintext PAT

Review Comment:
   No, we use `SHA3-256`. The PAT isn't a password, it's a high entropy secret, 
so we don't need to use a KDF. If we did use a KDF, we wouldn't use bcrypt.
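
A minimal sketch of the idea, assuming the PAT is a randomly generated high-entropy string. The function names are hypothetical and are not taken from the ATR codebase; only `hashlib.sha3_256`, `secrets.token_urlsafe`, and `hmac.compare_digest` are real standard-library calls.

```python
import hashlib
import hmac
import secrets


def generate_pat() -> str:
    """Create a new token with 256 bits of entropy (hypothetical helper)."""
    return secrets.token_urlsafe(32)


def hash_pat(pat: str) -> str:
    """Return the SHA3-256 hex digest that would be stored instead of the PAT."""
    return hashlib.sha3_256(pat.encode("utf-8")).hexdigest()


def pat_matches(candidate: str, stored_hash: str) -> bool:
    """Check a presented PAT against a stored digest in constant time."""
    return hmac.compare_digest(hash_pat(candidate), stored_hash)
```

A slow KDF exists to make guessing low-entropy passwords expensive. A 256-bit random token cannot realistically be guessed, so a single fast hash is enough to keep the plaintext out of the database.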



##########
atr/docs/input-validation.md:
##########
@@ -0,0 +1,303 @@
+# 3.13. Input validation
+
+**Up**: `3.` [Developer guide](developer-guide)
+
+**Prev**: `3.12.` [Authorization security](security-authorization)
+
+**Next**: (none)
+
+**Sections**:
+
+* [Overview](#overview)
+* [Defense in depth](#defense-in-depth)
+* [Form validation with Pydantic](#form-validation-with-pydantic)
+* [CSRF protection](#csrf-protection)
+* [Validation rules by input type](#validation-rules-by-input-type)
+* [Data integrity validation](#data-integrity-validation)
+* [Output encoding](#output-encoding)
+* [File upload security](#file-upload-security)
+* [Injection prevention](#injection-prevention)
+* [Implementation references](#implementation-references)
+
+## Overview
+
+Input validation is critical for ATR's security posture. As a system that 
handles cryptographic signatures and release artifacts, ATR must ensure that 
all user input is properly validated before processing. This page documents the 
validation strategies and patterns used throughout the codebase.
+
+## Defense in depth
+
+ATR employs multiple layers of validation:
+
+1. **Transport layer**: HTTPS required, enforced by httpd
+2. **Request layer**: Size limits enforced by httpd (`MAX_CONTENT_LENGTH`)
+3. **Form layer**: Pydantic models validate structure and types
+4. **Application layer**: Business logic validation in route handlers
+5. **Database layer**: SQLAlchemy ORM with parameterized queries, plus 
constraints
+6. **Output layer**: Jinja2 auto-escaping for HTML output

Review Comment:
   Could add Markdown checking to this list.



##########
atr/docs/input-validation.md:
##########
@@ -0,0 +1,303 @@
+# 3.13. Input validation
+
+**Up**: `3.` [Developer guide](developer-guide)
+
+**Prev**: `3.12.` [Authorization security](security-authorization)
+
+**Next**: (none)
+
+**Sections**:
+
+* [Overview](#overview)
+* [Defense in depth](#defense-in-depth)
+* [Form validation with Pydantic](#form-validation-with-pydantic)
+* [CSRF protection](#csrf-protection)
+* [Validation rules by input type](#validation-rules-by-input-type)
+* [Data integrity validation](#data-integrity-validation)
+* [Output encoding](#output-encoding)
+* [File upload security](#file-upload-security)
+* [Injection prevention](#injection-prevention)
+* [Implementation references](#implementation-references)
+
+## Overview
+
+Input validation is critical for ATR's security posture. As a system that 
handles cryptographic signatures and release artifacts, ATR must ensure that 
all user input is properly validated before processing. This page documents the 
validation strategies and patterns used throughout the codebase.
+
+## Defense in depth
+
+ATR employs multiple layers of validation:
+
+1. **Transport layer**: HTTPS required, enforced by httpd
+2. **Request layer**: Size limits enforced by httpd (`MAX_CONTENT_LENGTH`)
+3. **Form layer**: Pydantic models validate structure and types
+4. **Application layer**: Business logic validation in route handlers
+5. **Database layer**: SQLAlchemy ORM with parameterized queries, plus 
constraints
+6. **Output layer**: Jinja2 auto-escaping for HTML output
+
+Each layer provides independent protection, so a failure in one layer does not 
compromise the system.
+
+## Form validation with Pydantic
+
+All form inputs in ATR are validated through 
[Pydantic](https://docs.pydantic.dev/latest/) models defined in 
[`form.py`](/ref/atr/form.py). The base class for forms is 
[`Form`](/ref/atr/form.py:Form), which extends Pydantic's `BaseModel`.
+
+### Defining form fields
+
+Form fields are defined using Python type annotations and the 
[`label`](/ref/atr/form.py:label) function:
+
+```python
+class ExampleForm(Form):
+    name: str = label("Project name", "Enter the project name")
+    count: int = label("Count", widget=Widget.NUMBER)
+    email: EmailStr = label("Contact email", widget=Widget.EMAIL)
+```
+
+The `label` function accepts a description (shown to users), optional 
documentation, and an optional widget hint for rendering.
+
+### Validation process
+
+When a form is submitted, ATR:
+
+1. Extracts form data from the request via 
[`quart_request`](/ref/atr/form.py:quart_request)
+2. Passes the data to the Pydantic model for validation
+3. If validation fails, collects errors via 
[`flash_error_data`](/ref/atr/form.py:flash_error_data)
+4. Displays errors to the user with 
[`flash_error_summary`](/ref/atr/form.py:flash_error_summary)
+5. If validation succeeds, proceeds with the validated data
+
+Pydantic provides built-in validators for common types (strings, integers, 
emails, URLs) and supports custom validators via decorators.
+
+### Custom validators
+
+For complex validation logic, use Pydantic's `@model_validator` decorator:
+
+```python
+from pydantic import model_validator
+
+class ReleaseForm(Form):
+    version: str = label("Version")
+
+    @model_validator(mode="after")
+    def validate_version_format(self):
+        if not re.match(r"^\d+\.\d+\.\d+", self.version):
+            raise ValueError("Version must start with X.Y.Z")
+        return self
+```
+
+## CSRF protection
+
+All forms that modify state must include a CSRF token. The token is generated 
by [`csrf_input`](/ref/atr/form.py:csrf_input) and validated automatically by 
Quart-WTF:
+
+```python
+def csrf_input() -> htm.VoidElement:
+    csrf_token = utils.generate_csrf()
+    return htpy.input(type="hidden", name="csrf_token", value=csrf_token)
+```
+
+In templates, include the CSRF token in every form:
+
+```html
+<form method="post">
+    {{ csrf_input() }}
+    <!-- other form fields -->
+</form>
+```
+
+The CSRF token is tied to the user's session and validated on form submission. 
Requests without a valid CSRF token are rejected.
+
+## Validation rules by input type
+
+### ASF User IDs
+
+User IDs are validated against a strict pattern in 
[`principal.py`](/ref/atr/principal.py):
+
+```python
+if not re.match(r"^[-_a-z0-9]+$", user):
+    raise CommitterError("Invalid characters in User ID")
+```
+
+Only lowercase alphanumeric characters, hyphens, and underscores are permitted.
+
+### Email addresses
+
+Email validation uses Pydantic's `EmailStr` type, which implements RFC 5322 
validation:
+
+```python
+from pydantic import EmailStr
+
+class ContactForm(Form):
+    email: EmailStr = label("Email address")
+```
+
+### URLs
+
+URL validation uses Pydantic's `HttpUrl` type:
+
+```python
+from pydantic import HttpUrl
+
+class LinkForm(Form):
+    website: HttpUrl = label("Website URL")
+```
+
+### Version strings
+
+Version strings are validated according to project-specific patterns. The 
general pattern allows semantic versioning with optional suffixes:
+
+```python
+VERSION_PATTERN = re.compile(r"^[0-9]+\.[0-9]+.*$")
+```
+
+### Committee and project names
+
+Committee and project names are validated against the set of known committees 
and projects from LDAP and the ASF project database. Unknown names are rejected.
+
+### File names
+
+File names in uploads are sanitized to prevent path traversal:
+
+* Directory separators (`/`, `\`) are rejected or stripped
+* Null bytes are rejected
+* Only expected extensions are permitted per upload type

Review Comment:
   We also disallow `..` as path components.
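
A sketch of a component-level check, assuming uploads may carry relative paths with `/` between components. The function `is_safe_relative_path` and its exact rules are illustrative assumptions, not the ATR implementation.

```python
def is_safe_relative_path(name: str) -> bool:
    """Reject names that could escape the upload directory.

    Hypothetical rules for this example: no null bytes, no backslashes,
    no absolute paths, and no empty or ".." path components.
    """
    if not name or "\x00" in name or "\\" in name:
        return False
    if name.startswith("/"):
        return False
    # Compare whole components, so "notes..txt" is fine but "a/../b" is not.
    return all(part not in ("", "..") for part in name.split("/"))
```

Checking whole components rather than searching for the substring `..` avoids rejecting legitimate file names that merely contain two consecutive dots.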



##########
atr/docs/security-authentication.md:
##########
@@ -0,0 +1,177 @@
+# 3.11. Authentication security
+
+**Up**: `3.` [Developer guide](developer-guide)
+
+**Prev**: `3.10.` [How to contribute](how-to-contribute)
+
+**Next**: `3.12.` [Authorization security](security-authorization)
+
+**Sections**:
+
+* [Overview](#overview)
+* [Transport security](#transport-security)
+* [Web authentication](#web-authentication)
+* [API authentication](#api-authentication)
+* [Token lifecycle](#token-lifecycle)
+* [Security properties](#security-properties)
+* [Limitations and future work](#limitations-and-future-work)
+* [Implementation references](#implementation-references)
+
+## Overview
+
+ATR uses two authentication mechanisms depending on the access method:
+
+* **Web sessions** via ASF OAuth for browser-based users accessing the web 
interface
+* **JWT tokens** derived from Personal Access Tokens (PATs) for programmatic 
API access
+
+Both mechanisms require HTTPS. Authentication verifies the identity of users, 
while authorization (covered in [Authorization 
security](security-authorization)) determines what actions they can perform.
+
+## Transport security
+
+All ATR routes, on both the website and the API, require HTTPS using TLS 1.2 
or newer. This is enforced at the httpd layer in front of the application. 
Requests over plain HTTP are redirected to HTTPS.
+
+Tokens and credentials must never appear in URLs, as URLs may be logged or 
cached. They must only be transmitted in request headers or POST bodies over 
HTTPS.
+
+## Web authentication
+
+### ASF OAuth integration
+
+Browser users authenticate through [ASF 
OAuth](https://oauth.apache.org/api.html). The authentication flow works as 
follows:
+
+1. User clicks "Sign in" on the ATR website
+2. ATR redirects the user to the ASF OAuth service
+3. User authenticates with their ASF credentials
+4. ASF OAuth redirects the user back to ATR with session information
+5. ATR creates a server-side session linked to the user's ASF UID
+
+The session is managed by 
[ASFQuart](https://github.com/apache/infrastructure-asfquart), which handles 
the OAuth handshake and session cookie management.
+
+### Session management
+
+Sessions are stored server-side. The browser receives only a session cookie 
that references the server-side session data. Session cookies are configured 
with security attributes:
+
+* `HttpOnly` - prevents JavaScript access to the cookie
+* `Secure` - cookie is only sent over HTTPS
+* `SameSite=Lax` - provides CSRF protection for most requests
+
+Session data includes the user's ASF UID and is used to authorize requests. 
The session expires after a period of inactivity or when the user logs out.
+
+### Session caching
+
+Authorization data fetched from LDAP (committee memberships, project 
participation) is cached in [`principal.Cache`](/ref/atr/principal.py:Cache) 
for performance. The cache has a TTL of 300 seconds, defined by 
`cache_for_at_most_seconds`. After the TTL expires, the next request will 
refresh the cache from LDAP.
+
+## API authentication
+
+API access uses a two-token system: Personal Access Tokens (PATs) for 
long-term credentials and JSON Web Tokens (JWTs) for short-term API access.
+
+### Personal Access Tokens (PATs)
+
+Committers can obtain PATs from the `/tokens` page on the ATR website. PATs 
have the following properties:
+
+* **Validity**: 180 days from creation
+* **Storage**: ATR stores only bcrypt hashes, never the plaintext PAT
+* **Revocation**: Users can revoke their own PATs at any time; admins can 
revoke any PAT
+* **Purpose**: PATs are used solely to obtain JWTs; they cannot be used 
directly for API access
+
+Only authenticated committers (signed in via ASF OAuth) can create PATs. Each 
user can have multiple active PATs.
+
+### JSON Web Tokens (JWTs)
+
+To access protected API endpoints, users must first obtain a JWT by exchanging 
their PAT. This is done by POSTing to `/api/jwt`:
+
+```text
+POST /api/jwt
+Content-Type: application/json
+
+{"asfuid": "username", "pat": "pat_token_value"}
+```
+
+On success, the response contains a JWT:
+
+```json
+{"asfuid": "username", "jwt": "jwt_token_value"}
+```
+
+JWTs have the following properties:
+
+* **Algorithm**: HS256 (HMAC-SHA256)
+* **Validity**: 90 minutes from creation
+* **Claims**: `sub` (ASF UID), `iat` (issued at), `exp` (expiration), `jti` 
(unique token ID)
+* **Storage**: JWTs are stateless; ATR does not store issued JWTs
+
+The JWT is used in the `Authorization` header as a bearer token:
+
+```text
+Authorization: Bearer jwt_token_value
+```
+
+### Token handling
+
+The [`jwtoken`](/ref/atr/jwtoken.py) module handles JWT creation and 
verification. Protected API endpoints use the `@jwtoken.require` decorator, 
which extracts the JWT from the `Authorization` header, verifies its signature 
and expiration, and makes the user's ASF UID available to the handler.
+
+## Token lifecycle
+
+The relationship between authentication methods and tokens:
+
+```text
+ASF OAuth (web login)
+    │
+    ├──▶ Web Session ──▶ Web Interface Access
+    │
+    └──▶ PAT Creation ──▶ PAT (180 days)
+                              │
+                              └──▶ JWT Exchange ──▶ JWT (90 min)
+                                                       │
+                                                       └──▶ API Access
+```
+
+For web users, authentication happens once via ASF OAuth, and the session 
persists until logout or expiration. For API users, the flow is: obtain a PAT 
once (via the web interface), then exchange it for JWTs as needed (JWTs expire 
quickly, so this exchange happens frequently in long-running scripts).
+
+## Security properties
+
+### Web sessions
+
+* Server-side storage prevents client-side tampering
+* Session cookies are protected against XSS (`HttpOnly`) and transmission 
interception (`Secure`)
+* `SameSite` attribute provides baseline CSRF protection (ATR also uses CSRF 
tokens in forms)
+
+### Personal Access Tokens
+
+* Stored as bcrypt hashes with appropriate cost factor

Review Comment:
   SHA3-256 rather than bcrypt.



##########
atr/docs/input-validation.md:
##########
@@ -0,0 +1,303 @@
+# 3.13. Input validation
+
+**Up**: `3.` [Developer guide](developer-guide)
+
+**Prev**: `3.12.` [Authorization security](security-authorization)
+
+**Next**: (none)
+
+**Sections**:
+
+* [Overview](#overview)
+* [Defense in depth](#defense-in-depth)
+* [Form validation with Pydantic](#form-validation-with-pydantic)
+* [CSRF protection](#csrf-protection)
+* [Validation rules by input type](#validation-rules-by-input-type)
+* [Data integrity validation](#data-integrity-validation)
+* [Output encoding](#output-encoding)
+* [File upload security](#file-upload-security)
+* [Injection prevention](#injection-prevention)
+* [Implementation references](#implementation-references)
+
+## Overview
+
+Input validation is critical for ATR's security posture. As a system that 
handles cryptographic signatures and release artifacts, ATR must ensure that 
all user input is properly validated before processing. This page documents the 
validation strategies and patterns used throughout the codebase.
+
+## Defense in depth
+
+ATR employs multiple layers of validation:
+
+1. **Transport layer**: HTTPS required, enforced by httpd
+2. **Request layer**: Size limits enforced by httpd (`MAX_CONTENT_LENGTH`)
+3. **Form layer**: Pydantic models validate structure and types
+4. **Application layer**: Business logic validation in route handlers
+5. **Database layer**: SQLAlchemy ORM with parameterized queries, plus 
constraints
+6. **Output layer**: Jinja2 auto-escaping for HTML output
+
+Each layer provides independent protection, so a failure in one layer does not 
compromise the system.
+
+## Form validation with Pydantic
+
+All form inputs in ATR are validated through 
[Pydantic](https://docs.pydantic.dev/latest/) models defined in 
[`form.py`](/ref/atr/form.py). The base class for forms is 
[`Form`](/ref/atr/form.py:Form), which extends Pydantic's `BaseModel`.
+
+### Defining form fields
+
+Form fields are defined using Python type annotations and the 
[`label`](/ref/atr/form.py:label) function:
+
+```python
+class ExampleForm(Form):
+    name: str = label("Project name", "Enter the project name")
+    count: int = label("Count", widget=Widget.NUMBER)
+    email: EmailStr = label("Contact email", widget=Widget.EMAIL)
+```
+
+The `label` function accepts a description (shown to users), optional 
documentation, and an optional widget hint for rendering.
+
+### Validation process
+
+When a form is submitted, ATR:
+
+1. Extracts form data from the request via 
[`quart_request`](/ref/atr/form.py:quart_request)
+2. Passes the data to the Pydantic model for validation
+3. If validation fails, collects errors via 
[`flash_error_data`](/ref/atr/form.py:flash_error_data)
+4. Displays errors to the user with 
[`flash_error_summary`](/ref/atr/form.py:flash_error_summary)
+5. If validation succeeds, proceeds with the validated data
+
+Pydantic provides built-in validators for common types (strings, integers, 
emails, URLs) and supports custom validators via decorators.
+
+### Custom validators
+
+For complex validation logic, use Pydantic's `@model_validator` decorator:
+
+```python
+from pydantic import model_validator
+
+class ReleaseForm(Form):
+    version: str = label("Version")
+
+    @model_validator(mode="after")
+    def validate_version_format(self):
+        if not re.match(r"^\d+\.\d+\.\d+", self.version):
+            raise ValueError("Version must start with X.Y.Z")
+        return self
+```
+
+## CSRF protection
+
+All forms that modify state must include a CSRF token. The token is generated 
by [`csrf_input`](/ref/atr/form.py:csrf_input) and validated automatically by 
Quart-WTF:
+
+```python
+def csrf_input() -> htm.VoidElement:
+    csrf_token = utils.generate_csrf()
+    return htpy.input(type="hidden", name="csrf_token", value=csrf_token)
+```
+
+In templates, include the CSRF token in every form:
+
+```html
+<form method="post">
+    {{ csrf_input() }}
+    <!-- other form fields -->
+</form>
+```
+
+The CSRF token is tied to the user's session and validated on form submission. 
Requests without a valid CSRF token are rejected.
+
+## Validation rules by input type
+
+### ASF User IDs
+
+User IDs are validated against a strict pattern in 
[`principal.py`](/ref/atr/principal.py):
+
+```python
+if not re.match(r"^[-_a-z0-9]+$", user):
+    raise CommitterError("Invalid characters in User ID")
+```
+
+Only lowercase alphanumeric characters, hyphens, and underscores are permitted.
+
+### Email addresses
+
+Email validation uses Pydantic's `EmailStr` type, which implements RFC 5322 
validation:
+
+```python
+from pydantic import EmailStr
+
+class ContactForm(Form):
+    email: EmailStr = label("Email address")
+```
+
+### URLs
+
+URL validation uses Pydantic's `HttpUrl` type:
+
+```python
+from pydantic import HttpUrl
+
+class LinkForm(Form):
+    website: HttpUrl = label("Website URL")
+```
+
+### Version strings
+
+Version strings are validated according to project-specific patterns. The 
general pattern allows semantic versioning with optional suffixes:
+
+```python
+VERSION_PATTERN = re.compile(r"^[0-9]+\.[0-9]+.*$")
+```
+
+### Committee and project names
+
+Committee and project names are validated against the set of known committees 
and projects from LDAP and the ASF project database. Unknown names are rejected.
+
+### File names
+
+File names in uploads are sanitized to prevent path traversal:
+
+* Directory separators (`/`, `\`) are rejected or stripped
+* Null bytes are rejected
+* Only expected extensions are permitted per upload type
+
+## Data integrity validation
+
+Beyond input validation, ATR performs data integrity validation on database 
records using [`validate.py`](/ref/atr/validate.py). This catches 
inconsistencies that may have been introduced by bugs, migrations, or manual 
database edits.
+
+### Committee validation
+
+The [`committee`](/ref/atr/validate.py:committee) function checks:
+
+* `child_committees` must be empty (not used)
+* `full_name` must be set, trimmed, and not prefixed with "Apache "
+
+### Project validation
+
+The [`project`](/ref/atr/validate.py:project) function checks:
+
+* `category` must use comma-separated labels without colons
+* `committee_name` must be set (project must be linked to a committee)
+* `created` timestamp must be in the past
+* `full_name` must be set and start with "Apache "
+* `programming_languages` must use comma-separated labels without colons
+* `release_policy_id` must be None (not used)
+
+### Release validation
+
+The [`release`](/ref/atr/validate.py:release) function checks:
+
+* `created` timestamp must be in the past
+* `name` must match the expected pattern for project and version
+* Release directory must exist on disk and contain files
+* `package_managers` must be empty (not used)
+* `released` timestamp must be in the past or None
+* `sboms` must be empty (not used)
+* Vote logic must be consistent (cannot have `vote_resolved` without 
`vote_started`)
+* `votes` must be empty (not used)
+
+### Running validation
+
+Data integrity validation can be run via the admin interface or 
programmatically:
+
+```python
+async for divergence in validate.everything(data):
+    print(f"{divergence.source}: {divergence.divergence}")
+```
+
+## Output encoding
+
+ATR uses [Jinja2](https://jinja.palletsprojects.com/) for templating with 
auto-escaping enabled by default. All variables rendered in templates are 
automatically HTML-escaped:
+
+```html
+<!-- This is safe; user_input is escaped -->
+<p>Hello, {{ user_input }}</p>
+```
+
+When HTML output is intentionally generated (e.g., via htpy), it must be 
explicitly marked safe using `markupsafe.Markup`:
+
+```python
+import markupsafe
+safe_html = markupsafe.Markup("<strong>Bold</strong>")
+```
+
+Never mark user-controlled data as safe without proper sanitization.
+
+## File upload security
+
+File uploads are handled with several security measures:
+
+### Size limits
+
+Maximum upload size is enforced at the httpd layer via `MAX_CONTENT_LENGTH`. 
This prevents denial-of-service attacks via large uploads.
+
+### Extension validation
+
+Each upload type has an allowlist of permitted file extensions. Files with 
unexpected extensions are rejected.
+
+### Storage location
+
+Uploaded files are stored outside the web root in configured directories 
(e.g., `state/unfinished/`). They are not directly accessible via HTTP.

Review Comment:
   We don't really have a web root as ATR isn't a file server.



##########
atr/docs/security-authentication.md:
##########
@@ -0,0 +1,177 @@
+# 3.11. Authentication security
+
+**Up**: `3.` [Developer guide](developer-guide)
+
+**Prev**: `3.10.` [How to contribute](how-to-contribute)
+
+**Next**: `3.12.` [Authorization security](security-authorization)
+
+**Sections**:
+
+* [Overview](#overview)
+* [Transport security](#transport-security)
+* [Web authentication](#web-authentication)
+* [API authentication](#api-authentication)
+* [Token lifecycle](#token-lifecycle)
+* [Security properties](#security-properties)
+* [Limitations and future work](#limitations-and-future-work)
+* [Implementation references](#implementation-references)
+
+## Overview
+
+ATR uses two authentication mechanisms depending on the access method:
+
+* **Web sessions** via ASF OAuth for browser-based users accessing the web 
interface
+* **JWT tokens** derived from Personal Access Tokens (PATs) for programmatic 
API access

Review Comment:
   Technically the PAT is an authentication mechanism too, even if it is only 
used to issue JWTs.



##########
atr/docs/security-authorization.md:
##########
@@ -0,0 +1,217 @@
+# 3.12. Authorization security
+
+**Up**: `3.` [Developer guide](developer-guide)
+
+**Prev**: `3.11.` [Authentication security](security-authentication)
+
+**Next**: `3.13.` [Input validation](input-validation)
+
+**Sections**:
+
+* [Overview](#overview)
+* [Roles and principals](#roles-and-principals)
+* [LDAP integration](#ldap-integration)
+* [Access control for releases](#access-control-for-releases)
+* [Access control for projects](#access-control-for-projects)
+* [Access control for tokens](#access-control-for-tokens)
+* [Implementation patterns](#implementation-patterns)
+* [Caching behavior](#caching-behavior)
+* [Implementation references](#implementation-references)
+
+## Overview
+
+ATR uses role-based access control (RBAC) where roles are derived from ASF 
LDAP group memberships. Authentication (covered in [Authentication 
security](security-authentication)) establishes *who* a user is; authorization 
determines *what* they can do.
+
+The authorization model is committee-centric: most permissions are granted 
based on a user's relationship to a committee (PMC membership) or project 
(committer status).
+
+## Roles and principals
+
+ATR recognizes the following roles, derived from ASF LDAP:
+
+* **Public**: Unauthenticated users. Can view public information about 
releases and projects.
+
+* **Committer**: Any authenticated ASF committer. Can create Personal Access 
Tokens and view their own committees and projects. Determined by existence in 
LDAP `ou=people,dc=apache,dc=org`.
+
+* **Project Participant**: A committer who is a member of a specific project 
(has commit access). Can start releases, upload artifacts, and cast votes for 
that project. Determined by the `member` attribute in the project's LDAP group.
+
+* **PMC Member**: A committer who is on the PMC (Project Management Committee) 
for a specific committee. Has all participant permissions plus can resolve 
votes, finish releases, configure project settings, and manage signing keys. 
Determined by the `owner` attribute in the committee's LDAP group.
+
+* **Chair**: A PMC chair. Currently has the same permissions as PMC Member in 
ATR. Determined by membership in 
`cn=pmc-chairs,ou=groups,ou=services,dc=apache,dc=org`.
+
+* **ASF Member**: An ASF Member. Currently has the same permissions as a 
regular committer in ATR, though this may change. Determined by membership in 
`cn=member,ou=groups,dc=apache,dc=org`.
+
+* **Infrastructure Root**: ASF Infrastructure team with root access. Has 
administrative capabilities. Determined by membership in 
`cn=infrastructure-root,ou=groups,ou=services,dc=apache,dc=org`.
+
+* **Tooling Team**: Members of the ASF Tooling team. Treated as PMC members of 
the "tooling" committee. Determined by membership in 
`cn=tooling,ou=groups,ou=services,dc=apache,dc=org`.
+
+## LDAP integration
+
+Authorization data is fetched from ASF LDAP using the 
[`principal`](/ref/atr/principal.py) module. The key LDAP bases are:
+
+* `ou=people,dc=apache,dc=org` - All committers
+* `ou=project,ou=groups,dc=apache,dc=org` - Project and committee groups
+* `cn=member,ou=groups,dc=apache,dc=org` - ASF Members
+* `cn=pmc-chairs,ou=groups,ou=services,dc=apache,dc=org` - PMC Chairs
+* `cn=infrastructure-root,ou=groups,ou=services,dc=apache,dc=org` - 
Infrastructure root
+* `cn=tooling,ou=groups,ou=services,dc=apache,dc=org` - Tooling team
+
+The [`Committer`](/ref/atr/principal.py:Committer) class fetches a user's full 
authorization profile from LDAP, including their committee memberships (PMC 
membership) and project participations (committer access).
+
+## Access control for releases
+
+Release operations have the following access requirements:
+
+**View release information** (public pages, download links):
+
+* Allowed for: Everyone, including unauthenticated users
+
+**Start a new release**:
+
+* Allowed for: Project participants (committers on the project)
+* Checked via: `is_participant_of(project.committee_name)`
+
+**Upload release artifacts**:
+
+* Allowed for: Project participants
+* Additional constraint: Must be the user who started the release, or a PMC 
member
+
+**Cast a vote on a release**:
+
+* Allowed for: Project participants
+* Constraint: Cannot vote multiple times; can change existing vote
+
+**Resolve a vote (tally votes and determine outcome)**:
+
+* Allowed for: PMC members only
+* Checked via: `is_member_of(project.committee_name)`
+
+**Finish a release (publish to distribution)**:
+
+* Allowed for: PMC members only
+* Constraint: Vote must be resolved with a passing result
+
+**Cancel or delete a release**:
+
+* Allowed for: PMC members, or the user who started the release
+
+## Access control for projects

Review Comment:
   We should probably have one big matrix of this, but I think we should 
automatically generate it from the code; otherwise it is going to get out of 
sync very easily. We could just omit this for now, I think; that would be fine.
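
As a rough illustration of generating such a matrix from code, here is a minimal sketch. It assumes that each access level is modelled as a class whose public methods are the permitted operations. `ParticipantOps` and `MemberOps` are hypothetical stand-ins, not ATR's actual storage interfaces, and the inheritance between them is also an assumption.

```python
import inspect


class ParticipantOps:
    """Hypothetical stand-in for operations open to project participants."""

    def start(self) -> None: ...

    def delete(self) -> None: ...


class MemberOps(ParticipantOps):
    """Hypothetical stand-in for operations open to PMC members."""

    def resolve_vote(self) -> None: ...


def public_methods(cls: type) -> set[str]:
    """Names of public methods on cls, including inherited ones."""
    members = inspect.getmembers(cls, inspect.isfunction)
    return {name for name, _ in members if not name.startswith("_")}


def render_matrix(levels: dict[str, type]) -> str:
    """Render a Markdown table: one row per operation, one column per level."""
    operations = sorted(set().union(*(public_methods(cls) for cls in levels.values())))
    lines = [
        "| Operation | " + " | ".join(levels) + " |",
        "|---|" + "---|" * len(levels),
    ]
    for operation in operations:
        cells = ["yes" if operation in public_methods(cls) else "no" for cls in levels.values()]
        lines.append(f"| {operation} | " + " | ".join(cells) + " |")
    return "\n".join(lines)


matrix = render_matrix({"participant": ParticipantOps, "member": MemberOps})
```

Because the table is derived from the classes themselves, moving a method between levels changes the documentation on the next build, which is the property that keeps it from drifting.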



##########
atr/docs/security-authorization.md:
##########
@@ -0,0 +1,217 @@
+# 3.12. Authorization security
+
+**Up**: `3.` [Developer guide](developer-guide)
+
+**Prev**: `3.11.` [Authentication security](security-authentication)
+
+**Next**: `3.13.` [Input validation](input-validation)
+
+**Sections**:
+
+* [Overview](#overview)
+* [Roles and principals](#roles-and-principals)
+* [LDAP integration](#ldap-integration)
+* [Access control for releases](#access-control-for-releases)
+* [Access control for projects](#access-control-for-projects)
+* [Access control for tokens](#access-control-for-tokens)
+* [Implementation patterns](#implementation-patterns)
+* [Caching behavior](#caching-behavior)
+* [Implementation references](#implementation-references)
+
+## Overview
+
+ATR uses role-based access control (RBAC) where roles are derived from ASF 
LDAP group memberships. Authentication (covered in [Authentication 
security](security-authentication)) establishes *who* a user is; authorization 
determines *what* they can do.
+
+The authorization model is committee-centric: most permissions are granted 
based on a user's relationship to a committee (PMC membership) or project 
(committer status).
+
+## Roles and principals
+
+ATR recognizes the following roles, derived from ASF LDAP:
+
+* **Public**: Unauthenticated users. Can view public information about 
releases and projects.
+
+* **Committer**: Any authenticated ASF committer. Can create Personal Access 
Tokens and view their own committees and projects. Determined by existence in 
LDAP `ou=people,dc=apache,dc=org`.
+
+* **Project Participant**: A committer who is a member of a specific project 
(has commit access). Can start releases, upload artifacts, and cast votes for 
that project. Determined by the `member` attribute in the project's LDAP group.
+
+* **PMC Member**: A committer who is on the PMC (Project Management Committee) 
for a specific committee. Has all participant permissions plus can resolve 
votes, finish releases, configure project settings, and manage signing keys. 
Determined by the `owner` attribute in the committee's LDAP group.
+
+* **Chair**: A PMC chair. Currently has the same permissions as PMC Member in 
ATR. Determined by membership in 
`cn=pmc-chairs,ou=groups,ou=services,dc=apache,dc=org`.
+
+* **ASF Member**: An ASF Member. Currently has the same permissions as a 
regular committer in ATR, though this may change. Determined by membership in 
`cn=member,ou=groups,dc=apache,dc=org`.
+
+* **Infrastructure Root**: ASF Infrastructure team with root access. Has 
administrative capabilities. Determined by membership in 
`cn=infrastructure-root,ou=groups,ou=services,dc=apache,dc=org`.
+
+* **Tooling Team**: Members of the ASF Tooling team. Treated as PMC members of 
the "tooling" committee. Determined by membership in 
`cn=tooling,ou=groups,ou=services,dc=apache,dc=org`.
+
+## LDAP integration
+
+Authorization data is fetched from ASF LDAP using the 
[`principal`](/ref/atr/principal.py) module. The key LDAP bases are:
+
+* `ou=people,dc=apache,dc=org` - All committers
+* `ou=project,ou=groups,dc=apache,dc=org` - Project and committee groups
+* `cn=member,ou=groups,dc=apache,dc=org` - ASF Members
+* `cn=pmc-chairs,ou=groups,ou=services,dc=apache,dc=org` - PMC Chairs
+* `cn=infrastructure-root,ou=groups,ou=services,dc=apache,dc=org` - 
Infrastructure root
+* `cn=tooling,ou=groups,ou=services,dc=apache,dc=org` - Tooling team
+
+The [`Committer`](/ref/atr/principal.py:Committer) class fetches a user's full 
authorization profile from LDAP, including their committee memberships (PMC 
membership) and project participations (committer access).
+
+## Access control for releases
+
+Release operations have the following access requirements:
+
+**View release information** (public pages, download links):
+
+* Allowed for: Everyone, including unauthenticated users
+
+**Start a new release**:
+
+* Allowed for: Project participants (committers on the project)
+* Checked via: `is_participant_of(project.committee_name)`
+
+**Upload release artifacts**:
+
+* Allowed for: Project participants
+* Additional constraint: Must be the user who started the release, or a PMC 
member
+
+**Cast a vote on a release**:
+
+* Allowed for: Project participants
+* Constraint: Cannot vote multiple times; can change existing vote
+
+**Resolve a vote (tally votes and determine outcome)**:
+
+* Allowed for: PMC members only
+* Checked via: `is_member_of(project.committee_name)`
+
+**Finish a release (publish to distribution)**:
+
+* Allowed for: PMC members only
+* Constraint: Vote must be resolved with a passing result
+
+**Cancel or delete a release**:
+
+* Allowed for: PMC members, or the user who started the release

Review Comment:
   I see [`delete` under `CommitteeParticipant`](https://github.com/apache/tooling-trusted-releases/blob/main/atr/storage/writers/release.py#L106), 
not under member. I haven't checked the other interfaces.



##########
atr/docs/input-validation.md:
##########
@@ -0,0 +1,303 @@
+# 3.13. Input validation
+
+**Up**: `3.` [Developer guide](developer-guide)
+
+**Prev**: `3.12.` [Authorization security](security-authorization)
+
+**Next**: (none)
+
+**Sections**:
+
+* [Overview](#overview)
+* [Defense in depth](#defense-in-depth)
+* [Form validation with Pydantic](#form-validation-with-pydantic)
+* [CSRF protection](#csrf-protection)
+* [Validation rules by input type](#validation-rules-by-input-type)
+* [Data integrity validation](#data-integrity-validation)
+* [Output encoding](#output-encoding)
+* [File upload security](#file-upload-security)
+* [Injection prevention](#injection-prevention)
+* [Implementation references](#implementation-references)
+
+## Overview
+
+Input validation is critical for ATR's security posture. As a system that 
handles cryptographic signatures and release artifacts, ATR must ensure that 
all user input is properly validated before processing. This page documents the 
validation strategies and patterns used throughout the codebase.
+
+## Defense in depth
+
+ATR employs multiple layers of validation:
+
+1. **Transport layer**: HTTPS required, enforced by httpd
+2. **Request layer**: Size limits enforced by httpd (`MAX_CONTENT_LENGTH`)
+3. **Form layer**: Pydantic models validate structure and types
+4. **Application layer**: Business logic validation in route handlers
+5. **Database layer**: SQLAlchemy ORM with parameterized queries, plus 
constraints
+6. **Output layer**: Jinja2 auto-escaping for HTML output
+
+Each layer provides independent protection, so a failure in one layer does not 
compromise the system.
+
+## Form validation with Pydantic
+
+All form inputs in ATR are validated through 
[Pydantic](https://docs.pydantic.dev/latest/) models defined in 
[`form.py`](/ref/atr/form.py). The base class for forms is 
[`Form`](/ref/atr/form.py:Form), which extends Pydantic's `BaseModel`.
+
+### Defining form fields
+
+Form fields are defined using Python type annotations and the 
[`label`](/ref/atr/form.py:label) function:
+
+```python
+class ExampleForm(Form):
+    name: str = label("Project name", "Enter the project name")
+    count: int = label("Count", widget=Widget.NUMBER)
+    email: EmailStr = label("Contact email", widget=Widget.EMAIL)
+```
+
+The `label` function accepts a description (shown to users), optional 
documentation, and an optional widget hint for rendering.
+
+### Validation process
+
+When a form is submitted, ATR:
+
+1. Extracts form data from the request via 
[`quart_request`](/ref/atr/form.py:quart_request)
+2. Passes the data to the Pydantic model for validation
+3. If validation fails, collects errors via 
[`flash_error_data`](/ref/atr/form.py:flash_error_data)
+4. Displays errors to the user with 
[`flash_error_summary`](/ref/atr/form.py:flash_error_summary)
+5. If validation succeeds, proceeds with the validated data
+
+Pydantic provides built-in validators for common types (strings, integers, 
emails, URLs) and supports custom validators via decorators.
+
+### Custom validators
+
+For complex validation logic, use Pydantic's `@model_validator` decorator:
+
+```python
+import re
+
+from pydantic import model_validator
+
+class ReleaseForm(Form):
+    version: str = label("Version")
+
+    @model_validator(mode="after")
+    def validate_version_format(self):
+        if not re.match(r"^\d+\.\d+\.\d+", self.version):
+            raise ValueError("Version must start with X.Y.Z")
+        return self
+```
+
+## CSRF protection
+
+All forms that modify state must include a CSRF token. The token is generated 
by [`csrf_input`](/ref/atr/form.py:csrf_input) and validated automatically by 
Quart-WTF:
+
+```python
+def csrf_input() -> htm.VoidElement:
+    csrf_token = utils.generate_csrf()
+    return htpy.input(type="hidden", name="csrf_token", value=csrf_token)
+```
+
+In templates, include the CSRF token in every form:
+
+```html
+<form method="post">
+    {{ csrf_input() }}
+    <!-- other form fields -->
+</form>
+```
+
+The CSRF token is tied to the user's session and validated on form submission. 
Requests without a valid CSRF token are rejected.
+
+## Validation rules by input type
+
+### ASF User IDs
+
+User IDs are validated against a strict pattern in 
[`principal.py`](/ref/atr/principal.py):
+
+```python
+if not re.match(r"^[-_a-z0-9]+$", user):
+    raise CommitterError("Invalid characters in User ID")
+```
+
+Only lowercase alphanumeric characters, hyphens, and underscores are permitted.
+
+### Email addresses
+
+Email validation uses Pydantic's `EmailStr` type, which implements RFC 5322 
validation:
+
+```python
+from pydantic import EmailStr
+
+class ContactForm(Form):
+    email: EmailStr = label("Email address")
+```
+
+### URLs
+
+URL validation uses Pydantic's `HttpUrl` type:
+
+```python
+from pydantic import HttpUrl
+
+class LinkForm(Form):
+    website: HttpUrl = label("Website URL")
+```
+
+### Version strings
+
+Version strings are validated according to project-specific patterns. The 
general pattern allows semantic versioning with optional suffixes:
+
+```python
+VERSION_PATTERN = re.compile(r"^[0-9]+\.[0-9]+.*$")
+```
+
+### Committee and project names
+
+Committee and project names are validated against the set of known committees 
and projects from LDAP and the ASF project database. Unknown names are rejected.
+
+### File names
+
+File names in uploads are sanitized to prevent path traversal:
+
+* Directory separators (`/`, `\`) are rejected or stripped
+* Null bytes are rejected
+* Only expected extensions are permitted per upload type
+
+## Data integrity validation
+
+Beyond input validation, ATR performs data integrity validation on database 
records using [`validate.py`](/ref/atr/validate.py). This catches 
inconsistencies that may have been introduced by bugs, migrations, or manual 
database edits.
+
+### Committee validation
+
+The [`committee`](/ref/atr/validate.py:committee) function checks:
+
+* `child_committees` must be empty (not used)
+* `full_name` must be set, trimmed, and not prefixed with "Apache "
+
+### Project validation
+
+The [`project`](/ref/atr/validate.py:project) function checks:
+
+* `category` must use comma-separated labels without colons
+* `committee_name` must be set (project must be linked to a committee)
+* `created` timestamp must be in the past
+* `full_name` must be set and start with "Apache "
+* `programming_languages` must use comma-separated labels without colons
+* `release_policy_id` must be None (not used)
+
+### Release validation
+
+The [`release`](/ref/atr/validate.py:release) function checks:
+
+* `created` timestamp must be in the past
+* `name` must match the expected pattern for project and version
+* Release directory must exist on disk and contain files
+* `package_managers` must be empty (not used)
+* `released` timestamp must be in the past or None
+* `sboms` must be empty (not used)
+* Vote logic must be consistent (cannot have `vote_resolved` without 
`vote_started`)
+* `votes` must be empty (not used)
+
+### Running validation
+
+Data integrity validation can be run via the admin interface or 
programmatically:
+
+```python
+async for divergence in validate.everything(data):
+    print(f"{divergence.source}: {divergence.divergence}")
+```
+
+## Output encoding
+
+ATR uses [Jinja2](https://jinja.palletsprojects.com/) for templating with 
auto-escaping enabled by default. All variables rendered in templates are 
automatically HTML-escaped:
+
+```html
+<!-- This is safe; user_input is escaped -->
+<p>Hello, {{ user_input }}</p>
+```
+
+When HTML output is intentionally generated (e.g., via htpy), it must be 
explicitly marked safe using `markupsafe.Markup`:
+
+```python
+import markupsafe
+safe_html = markupsafe.Markup("<strong>Bold</strong>")
+```
+
+Never mark user-controlled data as safe without proper sanitization.
+
+## File upload security
+
+File uploads are handled with several security measures:
+
+### Size limits
+
+Maximum upload size is enforced at the httpd layer via `MAX_CONTENT_LENGTH`. 
This prevents denial-of-service attacks via large uploads.
+
+### Extension validation
+
+Each upload type has an allowlist of permitted file extensions. Files with 
unexpected extensions are rejected.
+
+### Storage location
+
+Uploaded files are stored outside the web root in configured directories 
(e.g., `state/unfinished/`). They are not directly accessible via HTTP.
+
+### File handling
+
+Files are processed via 
[`quart.datastructures.FileStorage`](/ref/atr/form.py:quart_request) and 
validated before being written to disk. Empty files (where the browser sends a 
file input with no selection) are filtered out.
+
+## Injection prevention
+
+### SQL injection
+
+ATR uses SQLAlchemy ORM exclusively for database access. All queries use 
parameterized statements:
+
+```python
+# Safe: parameterized query
+result = await session.exec(
+    select(Project).where(Project.name == project_name)
+)
+```
+
+Direct SQL string concatenation is never used.
+
+### Cross-site scripting (XSS)
+
+XSS is prevented through:
+
+* Jinja2 auto-escaping (enabled by default)
+* `markupsafe.Markup` for trusted HTML only
+* Content Security Policy headers (configured in httpd)
+
+### Path traversal
+
+Path traversal is prevented by:
+
+* Using `pathlib.Path` for all file operations
+* Validating that paths remain within expected directories
+* Rejecting file names containing path separators
+
+```python
+import pathlib
+
+base = pathlib.Path("/allowed/directory")
+user_path = base / user_filename
+# Verify the resolved path is still under base
+if not user_path.resolve().is_relative_to(base.resolve()):
+    raise ValueError("Path traversal detected")
+```
+
+### Command injection
+
+ATR minimizes shell command execution. Where external commands are necessary 
(e.g., GPG operations), arguments are passed as lists, never as shell strings:

Review Comment:
   Minimizes? Heh. Hopefully we do more than just minimizing it...
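
To make the stronger guarantee concrete, here is a minimal sketch of the list-argument pattern that the quoted text describes. It assumes nothing about ATR's actual GPG wrapper: `run_tool` is a hypothetical helper, and a child Python process stands in for the external tool. With an argument list and no shell, metacharacters in user input reach the child as literal text.

```python
import subprocess
import sys


def run_tool(argv: list[str]) -> str:
    """Run a command from an argument list, without a shell (hypothetical helper)."""
    completed = subprocess.run(
        argv,
        capture_output=True,
        text=True,
        check=True,
        shell=False,  # the default, stated explicitly: argv is never parsed by a shell
    )
    return completed.stdout


# A hostile "file name" that would be dangerous if interpolated into a shell string.
hostile_name = "release.tar.gz; echo INJECTED"

# Stand-in for an external tool: a child Python process that prints its one argument.
echo_argv = [sys.executable, "-c", "import sys; print(sys.argv[1])", hostile_name]
output = run_tool(echo_argv)
```

The child receives the whole hostile string as a single argument, so the `; echo INJECTED` part is never executed as a second command.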



##########
atr/docs/security-authorization.md:
##########
@@ -0,0 +1,217 @@
+# 3.12. Authorization security
+
+**Up**: `3.` [Developer guide](developer-guide)
+
+**Prev**: `3.11.` [Authentication security](security-authentication)
+
+**Next**: `3.13.` [Input validation](input-validation)
+
+**Sections**:
+
+* [Overview](#overview)
+* [Roles and principals](#roles-and-principals)
+* [LDAP integration](#ldap-integration)
+* [Access control for releases](#access-control-for-releases)
+* [Access control for projects](#access-control-for-projects)
+* [Access control for tokens](#access-control-for-tokens)
+* [Implementation patterns](#implementation-patterns)
+* [Caching behavior](#caching-behavior)
+* [Implementation references](#implementation-references)
+
+## Overview
+
+ATR uses role-based access control (RBAC) where roles are derived from ASF 
LDAP group memberships. Authentication (covered in [Authentication 
security](security-authentication)) establishes *who* a user is; authorization 
determines *what* they can do.
+
+The authorization model is committee-centric: most permissions are granted 
based on a user's relationship to a committee (PMC membership) or project 
(committer status).
+
+## Roles and principals
+
+ATR recognizes the following roles, derived from ASF LDAP:
+
+* **Public**: Unauthenticated users. Can view public information about 
releases and projects.
+
+* **Committer**: Any authenticated ASF committer. Can create Personal Access 
Tokens and view their own committees and projects. Determined by existence in 
LDAP `ou=people,dc=apache,dc=org`.
+
+* **Project Participant**: A committer who is a member of a specific project 
(has commit access). Can start releases, upload artifacts, and cast votes for 
that project. Determined by the `member` attribute in the project's LDAP group.

Review Comment:
   Don't all committers have commit access? I think the membership of a 
specific project is defined by LDAP.
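
A minimal sketch of that distinction, assuming a project's LDAP group is available as a plain mapping of attribute names to lists of DNs. The `uid=...` DN layout, the role labels, and both helper functions are assumptions made for illustration. Only the `member` and `owner` attributes and the people base come from the quoted text.

```python
PEOPLE_BASE = "ou=people,dc=apache,dc=org"


def user_dn(uid: str) -> str:
    """Build a user DN. The uid= layout is an assumption; the text gives only the base."""
    return f"uid={uid},{PEOPLE_BASE}"


def project_roles(uid: str, group_entry: dict[str, list[str]]) -> set[str]:
    """Derive per-project roles from one group entry (hypothetical helper)."""
    dn = user_dn(uid)
    roles: set[str] = set()
    if dn in group_entry.get("member", []):
        roles.add("participant")  # listed under the group's member attribute
    if dn in group_entry.get("owner", []):
        roles.add("pmc_member")  # listed under the group's owner attribute
    return roles


# Example group entry with made-up users.
example_group = {
    "member": [user_dn("alice"), user_dn("bob")],
    "owner": [user_dn("alice")],
}
alice_roles = project_roles("alice", example_group)
bob_roles = project_roles("bob", example_group)
carol_roles = project_roles("carol", example_group)
```

In this sketch, a committer who appears in neither attribute gets no project-level role, regardless of any commit access held elsewhere.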



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

