asf-tooling commented on issue #299:
URL:
https://github.com/apache/tooling-trusted-releases/issues/299#issuecomment-4403692335
<!-- gofannon-issue-triage-bot v2 -->
**Automated triage** — analyzed at `main@837830e8`
**Type:** `discussion` • **Classification:** `no_action` •
**Confidence:** `high`
**Application domain(s):** `data_storage`
### Summary
Issue #299 tracks a systemic architectural problem: SQLAlchemy ORM objects
that leak out of session context managers become detached (or miss eager-loaded
relationships), causing three failure modes identified by @andrewmusselman
(silent write loss, DetachedInstanceError, MissingGreenlet). @sbp favors
continuation passing style (CPS) as the comprehensive solution but considers it
cumbersome; the team agreed to prototype it first in #628. @sbp's most recent
comment (2026-04-23) adds that continuation functions must also not return live
session data. No single implementation approach has been committed to yet.
### Where this lives in the code today
#### `atr/db/interaction.py` — `releases_by_phase` (lines 480-501)
_currently does this_
Classic example of the problem: ORM objects are collected inside a session
context, returned after it closes as detached objects. The manual
`release.project = project` assignment is a partial workaround.
```python
async def releases_by_phase(project: sql.Project, phase: sql.ReleasePhase)
-> list[sql.Release]:
"""Get the releases for the project by phase."""
query = (
sqlmodel.select(sql.Release)
.where(
sql.Release.project_key == project.key,
sql.Release.phase == phase,
)
.order_by(sql.validate_instrumented_attribute(sql.Release.created).desc())
)
results = []
async with db.session() as data:
for result in (await data.execute(query)).all():
release = result[0]
results.append(release)
for release in results:
# Don't need to eager load and lose it when the session closes
release.project = project
return results
```
#### `atr/db/interaction.py` — `all_releases` (lines 107-118)
_currently does this_
Same pattern: opens its own session, extracts ORM objects, returns them
detached. Manual relationship assignment is the workaround.
```python
async def all_releases(project: sql.Project) -> list[sql.Release]:
"""Get all releases for the project, sorted by version."""
query = sqlmodel.select(sql.Release).where(sql.Release.project_key ==
project.key)
results = []
async with db.session() as data:
for result in (await data.execute(query)).all():
release = result[0]
results.append(release)
for release in results:
release.project = project
```
#### `atr/storage/__init__.py` — `write` (lines 455-463)
_extension point_
The write() context manager is where session lifetime is managed for write
operations. Any ORM objects returned by writers to callers will become detached
when this exits.
```python
@contextlib.asynccontextmanager
async def write(asf_uid: principal.UID = principal.ArgumentNone) ->
AsyncGenerator[Write]:
if asf_uid is principal.ArgumentNone:
authorisation = await principal.Authorisation()
else:
authorisation = await principal.Authorisation(asf_uid)
async with db.session() as data:
# TODO: Replace data with a DatabaseWriter instance
yield Write(authorisation, data)
```
#### `atr/storage/__init__.py` — `read` (lines 455-463)
_extension point_
The read() context manager similarly bounds session lifetime. The TODO
comment shows the team already recognizes the need to restructure this layer.
```python
@contextlib.asynccontextmanager
async def read(asf_uid: principal.UID = principal.ArgumentNone) ->
AsyncGenerator[Read]:
if asf_uid is principal.ArgumentNone:
authorisation = await principal.Authorisation()
else:
authorisation = await principal.Authorisation(asf_uid)
async with db.session() as data:
# TODO: Replace data with a DatabaseReader instance
yield Read(authorisation, data)
```
#### `atr/db/__init__.py` — `Query` (lines 87-114)
_needs modification_
Query.get(), .demand(), .all() return live ORM instances. A CPS-based
solution would replace these with continuation-style methods (e.g., `async def
with_result(self, fn: Callable[[T], R]) -> R`) that keep the session open
during processing.
```python
class Query[T]:
def __init__(self, session: Session, query:
expression.SelectOfScalar[T]):
self.query = query
self.session = session
def order_by(self, *args: Any, **kwargs: Any) -> Query[T]:
self.query = self.query.order_by(*args, **kwargs)
return self
...
async def get(self, log_query: bool = False) -> T | None:
self.log_query("get", log_query)
result = await self.session.execute(self.query)
return result.unique().scalar_one_or_none()
async def demand(self, error: Exception, log_query: bool = False) -> T:
self.log_query("demand", log_query)
result = await self.session.execute(self.query)
item = result.unique().scalar_one_or_none()
if item is None:
raise error
return item
async def all(self, log_query: bool = False) -> Sequence[T]:
self.log_query("all", log_query)
result = await self.session.execute(self.query)
return result.scalars().all()
```
#### `atr/db/interaction.py` — `release_ready_for_vote` (lines 436-451)
_currently does this_
The function @andrewmusselman cited: it requests eager-loading of
project_release_policy but NOT release_policy directly. This is the Mode C
failure (MissingGreenlet) when release.release_policy is accessed later without
eager loading.
```python
async def release_ready_for_vote(
session: web.Committer,
project_key: safe.ProjectKey,
version_key: safe.VersionKey,
revision: safe.RevisionNumber,
data: db.Session,
allowed_vote_modes: frozenset[sql.VoteMode],
) -> tuple[sql.Release, sql.Committee] | str:
release = await session.release(
project_key,
version_key,
data=data,
with_project=True,
with_committee=True,
with_project_release_policy=True,
)
```
### Proposed approach
The team has not yet committed to a single implementation strategy. @sbp
identified CPS as the most complete solution and proposed testing it in #628
first. The discussion has evolved to identify three distinct failure modes
(detached-writes, detached-reads, greenlet-lazy) that a comprehensive solution
must address. @sbp's latest comment (2026-04-23) adds the requirement that
continuation functions must not leak live session data in their return values.
Possible incremental steps that are consistent with the discussion: (1) Add
a lint rule or runtime assertion that flags ORM model instances returned from
writer/reader methods. (2) Prototype CPS in #628 and evaluate ergonomics. (3)
Consider making the Query class support a `with_result(fn)` pattern that keeps
the session open while the callback executes. No diff is proposed because this
is still an open design discussion without a committed approach.
### Open questions
- Has the CPS prototype in #628 been started, and what were the ergonomic
findings?
- Would a runtime check (e.g., verifying session attachment on returned
objects) be acceptable as a stopgap?
- Should the `ensure_session` pattern (passing caller_data) be expanded as
an interim mitigation to reduce new session creation?
- Are there specific hot paths where Mode C (MissingGreenlet due to missing
eager-loads) recurs frequently enough to warrant adding a compile-time or
test-time check for unloaded relationships?
- What is the status of @dave2wave's question about per-connection pragmas
helping?
_The agent reviewed this issue and is not proposing patches in this run.
Review the existing-code citations and open questions above before deciding
next steps._
### Files examined
- `atr/db/__init__.py`
- `atr/storage/writers/cache.py`
- `atr/db/interaction.py`
- `atr/storage/writers/mail.py`
- `atr/storage/__init__.py`
---
*Draft from a triage agent. A human reviewer should validate before merging
any change. The agent did not run tests or verify diffs apply.*
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]