asf-tooling commented on issue #299:
URL:
https://github.com/apache/tooling-trusted-releases/issues/299#issuecomment-4404163322
<!-- gofannon-issue-triage-bot v2 -->
**Automated triage** — analyzed at `main@837830e8`
**Type:** `discussion` • **Classification:** `no_action` •
**Confidence:** `high`
**Application domain(s):** `data_storage`
### Summary
Issue #299 tracks a recurring class of bugs where SQLAlchemy ORM objects
become detached from their session (or have un-eagerly-loaded relationships)
and are reused in contexts that assume a live session. The team has discussed
multiple potential solutions—one session per request, read-only models,
continuation passing style (CPS), banning sql model types from writer methods,
and advanced linting—but has not converged on an approach. @sbp suggested
trying CPS in #628 first (2026-02-09) and most recently (2026-04-23) noted the
need to also verify that live session data isn't returned from continuation
functions. This remains an active architectural discussion with no agreed-upon
implementation.
### Where this lives in the code today
#### `atr/db/interaction.py` — `all_releases` (lines 107-118)
_currently does this_
Demonstrates the anti-pattern: ORM objects are fetched in a session scope,
then manually patched with relationships after the session closes to avoid
lazy-load failures on detached objects.
```python
async def all_releases(project: sql.Project) -> list[sql.Release]:
"""Get all releases for the project, sorted by version."""
query = sqlmodel.select(sql.Release).where(sql.Release.project_key ==
project.key)
results = []
async with db.session() as data:
for result in (await data.execute(query)).all():
release = result[0]
results.append(release)
for release in results:
release.project = project
```
#### `atr/db/interaction.py` — `releases_by_phase` (lines 480-501)
_currently does this_
Another instance of the anti-pattern: session closes, then detached objects
are manually patched. The comment 'Don't need to eager load and lose it when
the session closes' explicitly acknowledges the problem.
```python
async def releases_by_phase(project: sql.Project, phase: sql.ReleasePhase)
-> list[sql.Release]:
"""Get the releases for the project by phase."""
query = (
sqlmodel.select(sql.Release)
.where(
sql.Release.project_key == project.key,
sql.Release.phase == phase,
)
.order_by(sql.validate_instrumented_attribute(sql.Release.created).desc())
)
results = []
async with db.session() as data:
for result in (await data.execute(query)).all():
release = result[0]
results.append(release)
for release in results:
# Don't need to eager load and lose it when the session closes
release.project = project
return results
```
#### `atr/storage/__init__.py` — `write` (lines 455-463)
_needs modification_
The write() context manager yields a Write object holding a live session.
Objects obtained via this session become detached when the context exits, yet
callers may inadvertently hold references to them. The TODO comment indicates
awareness of the problem.
```python
@contextlib.asynccontextmanager
async def write(asf_uid: principal.UID = principal.ArgumentNone) ->
AsyncGenerator[Write]:
if asf_uid is principal.ArgumentNone:
authorisation = await principal.Authorisation()
else:
authorisation = await principal.Authorisation(asf_uid)
async with db.session() as data:
# TODO: Replace data with a DatabaseWriter instance
yield Write(authorisation, data)
```
#### `atr/storage/__init__.py` — `Write` (lines 251-260)
_needs modification_
Write holds a db.Session reference that it passes to all WriteAs* classes.
Objects loaded through these writers may leak out of the session's lifetime.
```python
class Write:
# Read and Write have authenticator methods which return access outcomes
# TODO: Still need to send some runtime credentials guarantee to the
WriteAs* classes
def __init__(self, authorisation: principal.Authorisation, data:
db.Session):
self.__authorisation: Final[principal.Authorisation] = authorisation
self.__data: Final[db.Session] = data
@property
def authorisation(self) -> principal.Authorisation:
return self.__authorisation
```
#### `atr/db/__init__.py` — `Session` (lines 145-152)
_needs modification_
The custom Session class extends AsyncSession but has no mechanism to track
or prevent use of detached objects after session closure. This is the extension
point where session lifecycle enforcement would be added.
```python
class Session(sqlalchemy.ext.asyncio.AsyncSession):
def __init__(self, *args: Any, **kwargs: Any) -> None:
explicit_value_passed_by_sessionmaker = kwargs.pop("log_queries",
None)
super().__init__(*args, **kwargs)
self.log_queries: bool = global_log_query
if explicit_value_passed_by_sessionmaker is not None:
self.log_queries = explicit_value_passed_by_sessionmaker
```
#### `atr/db/interaction.py` — `release_ready_for_vote` (lines 436-451)
_currently does this_
The function @andrewmusselman identified as triggering a MissingGreenlet
when release.release_policy was accessed without eager loading. Demonstrates
failure mode C from the analysis: async caller can't run sync lazy-load.
```python
async def release_ready_for_vote(
session: web.Committer,
project_key: safe.ProjectKey,
version_key: safe.VersionKey,
revision: safe.RevisionNumber,
data: db.Session,
allowed_vote_modes: frozenset[sql.VoteMode],
) -> tuple[sql.Release, sql.Committee] | str:
release = await session.release(
project_key,
version_key,
data=data,
with_project=True,
with_committee=True,
with_project_release_policy=True,
)
```
### Proposed approach
The team has not converged on a solution. Multiple approaches have been
discussed:
1. **Continuation Passing Style (CPS)** — @sbp's preferred comprehensive
solution, to be prototyped in #628 first. This would prevent detached objects
from escaping session scope by construction.
2. **One session per request** — Simple but may not prevent all reuse
patterns.
3. **Banning sql model types from writer method signatures** — Would require
extracting needed data before passing to writers.
4. **Advanced linting** — AST/type-driven static analysis to catch detached
object access.
@sbp's latest comment (2026-04-23) adds the constraint that live session
data must not be returned from continuation functions. The next concrete step
appears to be the CPS experiment in #628. Until that experiment yields results,
no codebase-wide change should be attempted. In the interim, the team should
continue ensuring eager-loading flags (like `_release_policy`,
`_project_release_policy`) are set correctly in individual query calls to
prevent the MissingGreenlet variant.
### Open questions
- What was the outcome of the CPS experiment in #628? Has it been tried yet?
- Would a runtime warning (e.g., overriding __getattr__ on detached objects
to log warnings) be acceptable as a development-time safety net while the
architectural solution is being designed?
- Are there additional instances of the releases_by_phase/all_releases
pattern (manual relationship assignment after session close) that should be
catalogued?
- Does the per-connection pragma question from @dave2wave (2026-03-18) refer
to SQLite pragmas like journal_mode, or SQLAlchemy session configuration like
expire_on_commit?
_The agent reviewed this issue and is not proposing patches in this run.
Review the existing-code citations and open questions above before deciding
next steps._
### Files examined
- `atr/db/__init__.py`
- `atr/storage/writers/cache.py`
- `atr/db/interaction.py`
- `atr/storage/writers/mail.py`
- `atr/storage/__init__.py`
---
*Draft from a triage agent. A human reviewer should validate before merging
any change. The agent did not run tests or verify diffs apply.*
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]