This is an automated email from the ASF dual-hosted git repository. akm pushed a commit to branch document-cascades-517 in repository https://gitbox.apache.org/repos/asf/tooling-trusted-releases.git
commit 86089205019abe8edc8e610c3f37cd00c56ef2fb Author: Andrew K. Musselman <[email protected]> AuthorDate: Wed Mar 11 15:46:13 2026 -0700 Adding docs for cascading; fixes #517 --- atr/docs/database.md | 154 ++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 152 insertions(+), 2 deletions(-) diff --git a/atr/docs/database.md b/atr/docs/database.md index 47d1459a..2aae5036 100644 --- a/atr/docs/database.md +++ b/atr/docs/database.md @@ -11,6 +11,7 @@ * [Introduction](#introduction) * [Core models](#core-models) * [Other features](#other-features) +* [Cascade deletions](#cascade-deletions) * [Schema changes and migrations](#schema-changes-and-migrations) ## Introduction @@ -63,9 +64,158 @@ Database constraints ensure data integrity. The [`Task`](/ref/atr/models/sql.py: Unique constraints ensure that certain combinations of fields are unique. The `Release` model has a unique constraint on `(project_name, version)` to prevent creating duplicate releases for the same project version. The `Revision` model has two unique constraints: one on `(release_name, seq)` and another on `(release_name, number)`, ensuring that revision numbers are unique within a release. +## Cascade deletions + +ATR uses two kinds of cascade deletions: ORM-level cascades managed by SQLAlchemy, and DB-level cascades enforced by SQLite foreign key constraints. Both are configured in [`models/sql.py`](/ref/atr/models/sql.py). This section documents what happens when a parent record is deleted, which child records are automatically removed, and which deletions are blocked by foreign key constraints. + +SQLite foreign key enforcement is enabled in ATR via `PRAGMA foreign_keys=ON` in [`db/__init__.py`](/ref/atr/db/__init__.py). This means DB-level cascades and foreign key constraints are active. + +### ORM-level vs DB-level cascades + +ORM cascades are configured on SQLAlchemy relationships using `sa_relationship_kwargs={"cascade": "all, delete-orphan"}`. When a parent object is deleted through the ORM (via `session.delete(parent)`), SQLAlchemy automatically deletes all related child objects before issuing the SQL DELETE for the parent. These cascades only apply when using the ORM; they do not apply to raw SQL DELETE statements or bulk `sqlmodel.delete()` queries. + +DB-level cascades are configured on foreign key columns using `ondelete="CASCADE"` or `ondelete="SET NULL"`. These are enforced by SQLite itself, regardless of whether the deletion happens through the ORM or raw SQL. When a parent row is deleted, SQLite automatically deletes (or nullifies) child rows that reference it. + +Some relationships have both ORM and DB cascades configured. This provides defense in depth: the ORM cascade handles the common case of deleting through the ORM, while the DB cascade provides a safety net for bulk SQL deletions, migrations, or manual database operations. + +### Entity relationship overview + +The following diagram shows the main entity relationships and their cascade behaviour. + +```mermaid +erDiagram + Committee ||--o{ Project : "has projects" + Committee ||--o{ Committee : "has children" + Committee }o--o{ PublicSigningKey : "linked via KeyLink" + Project ||--o{ Release : "has releases" + Project ||--o| ReleasePolicy : "ORM cascade + reverse DB cascade" + Release ||--o{ Revision : "ORM cascade only" + Release ||--o{ CheckResult : "ORM + DB cascade" + Release ||--o{ Distribution : "ORM + DB cascade" + Release ||--o{ Quarantined : "DB cascade only" + Release ||--o| ReleasePolicy : "ORM cascade only" + Revision ||--o| Revision : "parent chain" + Task ||--o| WorkflowStatus : "DB SET NULL" + Task }o--o| Project : "references" +``` + +### Release deletion cascades + +Deleting a Release triggers the most cascades in the system. + +```mermaid +flowchart TD + A["Delete Release"] --> B["ORM cascade"] + A --> C["DB cascade"] + + B --> B1["Revisions deleted"] + B --> B2["CheckResults deleted"] + B --> B3["Distributions deleted"] + B --> B4["ReleasePolicy deleted"] + + C --> C1["CheckResults deleted"] + C --> C2["Distributions deleted"] + C --> C3["Quarantined records deleted"] + + style B fill:#e8f4fd + style C fill:#fde8e8 +``` + +When a Release is deleted through the ORM (`session.delete(release)`): + +**Via ORM cascade** (`cascade: "all, delete-orphan"`): + +* All **Revision** records belonging to the release are deleted. Revisions have no DB-level cascade — they are only cleaned up by the ORM or by explicit bulk deletion. +* All **CheckResult** records belonging to the release are deleted. +* All **Distribution** records belonging to the release are deleted. +* The release's **ReleasePolicy** (if any) is deleted. This is a one-to-one relationship with `cascade_delete=True` and `single_parent=True`. + +**Via DB cascade** (`ondelete="CASCADE"` on the foreign key): + +* All **CheckResult** records referencing the release are deleted. This is redundant with the ORM cascade when using ORM deletion, but provides coverage for raw SQL deletions. +* All **Distribution** records referencing the release are deleted. Same redundancy as above. +* All **Quarantined** records referencing the release are deleted. This has a DB-level cascade only (no ORM cascade), because Quarantined records are not part of the Release's ORM relationships with cascade configuration. + +### Project deletion cascades + +When a Project is deleted through the ORM: + +**Via ORM cascade**: The project's **ReleasePolicy** (if any) is deleted. This uses `cascade_delete=True` with `cascade: "all, delete-orphan"` and `single_parent=True`. + +**Note on `ondelete="CASCADE"` on `Project.release_policy_id`:** This foreign key has `ondelete="CASCADE"`, but the direction is from ReleasePolicy to Project — meaning if a ReleasePolicy row were deleted directly in the database, the *Project* referencing it would also be deleted. This is the reverse of the ORM cascade direction (where deleting a Project cascades to its ReleasePolicy). In practice, ReleasePolicy is never deleted independently; it is always deleted as a consequence of it [...] + +Project deletion is explicitly guarded in the application: [`storage/writers/project.py`](/ref/atr/storage/writers/project.py) raises an error if the project has any associated releases. This means in practice, the Release cascades described above are not triggered indirectly through Project deletion. + +### SET NULL behaviour + +One relationship uses `ondelete="SET NULL"` instead of CASCADE: + +* **WorkflowStatus.task_id** → Task.id: When a Task is deleted, the `task_id` column in any referencing WorkflowStatus rows is set to NULL rather than deleting the WorkflowStatus record. This preserves workflow status history even when the associated task is cleaned up. + +### Explicit deletions + +Some related records are deleted explicitly in application code rather than through cascades. + +**When deleting a Release** ([`storage/writers/release.py`](/ref/atr/storage/writers/release.py)): **Task** records matching the release's project name and version are deleted with a bulk SQL DELETE before the Release itself is deleted. Tasks reference `project.name` via a nullable foreign key with no cascade, so they would not be automatically cleaned up. **RevisionCounter** records are deleted only in test mode (when `ALLOW_TESTS` is enabled and the release belongs to the test committe [...] + +**When announcing a Release** ([`storage/writers/announce.py`](/ref/atr/storage/writers/announce.py)): All **Revision** records for the release are deleted with a bulk SQL DELETE during the announce process, as part of cleaning up draft history after a release is published. + +**When deleting a PublicSigningKey** ([`storage/writers/keys.py`](/ref/atr/storage/writers/keys.py)): **KeyLink** records associating the key with committees are explicitly cleared and deleted before the key itself is deleted. KeyLink has no cascade configuration, so this cleanup is mandatory. + +### Blocked deletions + +Several entities cannot be deleted while other records reference them, because the foreign key constraints have no cascade and the referencing column is non-nullable. Attempting to delete these will raise a database integrity error. + +**Committee** cannot be deleted while any **KeyLink** rows reference it (KeyLink.committee_name is a non-nullable primary key component). Although Project.committee_name references Committee, it is nullable, so it does not block deletion at the DB level. + +**Project** cannot be deleted while any **Release** rows reference it (Release.project_name is non-nullable, no cascade) or any **CheckResultIgnore** rows reference it (CheckResultIgnore.project_name is non-nullable, no cascade). The application also explicitly prevents Project deletion when releases exist. + +**Revision** cannot be deleted individually while another **Revision** references it as a parent (Revision.parent_name is nullable but has no `ondelete`, so deletion of a parent revision that is still referenced would violate the FK constraint). In practice, revisions are deleted in bulk per-release which avoids this issue since all revisions in the chain are removed together. + +### Cascade coverage summary + +```mermaid +flowchart LR + subgraph both ["Both ORM and DB cascade"] + CR["Release → CheckResult"] + DI["Release → Distribution"] + end + subgraph orm_only ["ORM cascade only"] + RV["Release → Revision"] + RP["Release → ReleasePolicy"] + PP["Project → ReleasePolicy"] + end + subgraph db_only ["DB cascade only"] + QR["Release → Quarantined"] + end + subgraph reverse ["Reverse DB cascade"] + RPP["ReleasePolicy → Project"] + end + + style both fill:#d4edda + style orm_only fill:#fff3cd + style db_only fill:#f8d7da + style reverse fill:#e2d5f1 +``` + +| Parent | Child | ORM cascade | DB cascade | Explicit deletion | +| ------ | ----- | ----------- | ---------- | ----------------- | +| Release | Revision | `all, delete-orphan` | — | Bulk delete during announce | +| Release | CheckResult | `all, delete-orphan` | `CASCADE` | — | +| Release | Distribution | `all, delete-orphan` | `CASCADE` | — | +| Release | Quarantined | — | `CASCADE` | — | +| Release | ReleasePolicy | `all, delete-orphan` | — | — | +| Project | ReleasePolicy | `all, delete-orphan` | — | — | +| ReleasePolicy | Project | — | `CASCADE` ⚠️ | — | +| Task | WorkflowStatus | — | `SET NULL` | — | +| Release | Task | — | — | Bulk delete before release deletion | +| PublicSigningKey | KeyLink | — | — | Explicit clear before key deletion | + +The ⚠️ on **ReleasePolicy → Project** indicates a reverse cascade: if a ReleasePolicy were deleted directly at the DB level, the referencing Project would also be deleted. This is a consequence of `ondelete="CASCADE"` on `Project.release_policy_id`. + ## Schema changes and migrations -We often have to make changes to the database model in ATR, whether that be to add a whole new model or just to rename or change some existing properties. No matter the change, this involves creating a database migration. We use Alembic to perform migrations, and this allows migrations to be _bidirectional_: we can downgrade as well as upgrade. This can be very helpful when, for example, a migration didn't apply properly or is no longer needed due to having found a different solution. +We often have to make changes to the database model in ATR, whether that be to add a whole new model or just to rename or change some existing properties. No matter the change, this involves creating a database migration. We use Alembic to perform migrations, and this allows migrations to be *bidirectional*: we can downgrade as well as upgrade. This can be very helpful when, for example, a migration didn't apply properly or is no longer needed due to having found a different solution. To change the database, do not edit the SQLite directly. Instead, change the model file in [`atr/models/sql.py`](/ref/atr/models/sql.py). If you're running ATR locally, you should see from its logs that the server is now broken due to having a mismatching database. That's fine! This is the point where you now create the migration. To do so, run: @@ -75,6 +225,6 @@ uv run --frozen alembic revision -m "Description of changes" --autogenerate Obviously, change `"Description of changes"` to an actual description of the changes that you made. Keep it short, around 50-60 characters. Then when you restart the server you should find that the migration is automatically applied. You should be careful, however, before restarting the server. Not all migrations apply successfully when autogenerated. Always review the automatically produced migrations in `migrations/versions` first, and ensure that they are correct before proceeding. On [...] -It can be helpful to make a backup of the entire SQLite database before performing the migration, especially if the migration is particularly complex. This can help if, for example, the downgrade is broken, otherwise you may find yourself in a state from which there is no easy recovery. _Always_ ensure that migrations are working locally before pushing them to GitHub, because we apply changes from GitHub directly to our deployed containers. Note that sometimes the deployed containers con [...] +It can be helpful to make a backup of the entire SQLite database before performing the migration, especially if the migration is particularly complex. This can help if, for example, the downgrade is broken, otherwise you may find yourself in a state from which there is no easy recovery. *Always* ensure that migrations are working locally before pushing them to GitHub, because we apply changes from GitHub directly to our deployed containers. Note that sometimes the deployed containers con [...] If testing a migration in a PR, be sure to stop the server and run `uv run --frozen alembic downgrade -1` before switching back to any branch not containing the migration. --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
