asf-tooling commented on issue #299:
URL: 
https://github.com/apache/tooling-trusted-releases/issues/299#issuecomment-4404163322

   <!-- gofannon-issue-triage-bot v2 -->
   
   **Automated triage** — analyzed at `main@837830e8`
   
   **Type:** `discussion`  •  **Classification:** `no_action`  •  
**Confidence:** `high`
   **Application domain(s):** `data_storage`
   
   ### Summary
   Issue #299 tracks a recurring class of bugs where SQLAlchemy ORM objects 
become detached from their session (or have un-eagerly-loaded relationships) 
and are reused in contexts that assume a live session. The team has discussed 
multiple potential solutions—one session per request, read-only models, 
continuation passing style (CPS), banning sql model types from writer methods, 
and advanced linting—but has not converged on an approach. @sbp suggested 
trying CPS in #628 first (2026-02-09) and most recently (2026-04-23) noted the 
need to also verify that live session data isn't returned from continuation 
functions. This remains an active architectural discussion with no agreed-upon 
implementation.
   
   ### Where this lives in the code today
   
   #### `atr/db/interaction.py` — `all_releases` (lines 107-118)
   _currently does this_
   Demonstrates the anti-pattern: ORM objects are fetched in a session scope, 
then manually patched with relationships after the session closes to avoid 
lazy-load failures on detached objects.
   
   ```python
   async def all_releases(project: sql.Project) -> list[sql.Release]:
       """Get all releases for the project, sorted by version."""
       query = sqlmodel.select(sql.Release).where(sql.Release.project_key == 
project.key)
   
       results = []
       async with db.session() as data:
           for result in (await data.execute(query)).all():
               release = result[0]
               results.append(release)
   
       for release in results:
           release.project = project
   ```
   
   #### `atr/db/interaction.py` — `releases_by_phase` (lines 480-501)
   _currently does this_
   Another instance of the anti-pattern: session closes, then detached objects 
are manually patched. The comment 'Don't need to eager load and lose it when 
the session closes' explicitly acknowledges the problem.
   
   ```python
   async def releases_by_phase(project: sql.Project, phase: sql.ReleasePhase) 
-> list[sql.Release]:
       """Get the releases for the project by phase."""
   
       query = (
           sqlmodel.select(sql.Release)
           .where(
               sql.Release.project_key == project.key,
               sql.Release.phase == phase,
           )
           
.order_by(sql.validate_instrumented_attribute(sql.Release.created).desc())
       )
   
       results = []
       async with db.session() as data:
           for result in (await data.execute(query)).all():
               release = result[0]
               results.append(release)
   
       for release in results:
           # Don't need to eager load and lose it when the session closes
           release.project = project
       return results
   ```
   
   #### `atr/storage/__init__.py` — `write` (lines 455-463)
   _needs modification_
   The write() context manager yields a Write object holding a live session. 
Objects obtained via this session become detached when the context exits, yet 
callers may inadvertently hold references to them. The TODO comment indicates 
awareness of the problem.
   
   ```python
   @contextlib.asynccontextmanager
   async def write(asf_uid: principal.UID = principal.ArgumentNone) -> 
AsyncGenerator[Write]:
       if asf_uid is principal.ArgumentNone:
           authorisation = await principal.Authorisation()
       else:
           authorisation = await principal.Authorisation(asf_uid)
       async with db.session() as data:
           # TODO: Replace data with a DatabaseWriter instance
           yield Write(authorisation, data)
   ```
   
   #### `atr/storage/__init__.py` — `Write` (lines 251-260)
   _needs modification_
   Write holds a db.Session reference that it passes to all WriteAs* classes. 
Objects loaded through these writers may leak out of the session's lifetime.
   
   ```python
   class Write:
       # Read and Write have authenticator methods which return access outcomes
       # TODO: Still need to send some runtime credentials guarantee to the 
WriteAs* classes
       def __init__(self, authorisation: principal.Authorisation, data: 
db.Session):
           self.__authorisation: Final[principal.Authorisation] = authorisation
           self.__data: Final[db.Session] = data
   
       @property
       def authorisation(self) -> principal.Authorisation:
           return self.__authorisation
   ```
   
   #### `atr/db/__init__.py` — `Session` (lines 145-152)
   _needs modification_
   The custom Session class extends AsyncSession but has no mechanism to track 
or prevent use of detached objects after session closure. This is the extension 
point where session lifecycle enforcement would be added.
   
   ```python
   class Session(sqlalchemy.ext.asyncio.AsyncSession):
       def __init__(self, *args: Any, **kwargs: Any) -> None:
           explicit_value_passed_by_sessionmaker = kwargs.pop("log_queries", 
None)
           super().__init__(*args, **kwargs)
   
           self.log_queries: bool = global_log_query
           if explicit_value_passed_by_sessionmaker is not None:
               self.log_queries = explicit_value_passed_by_sessionmaker
   ```
   
   #### `atr/db/interaction.py` — `release_ready_for_vote` (lines 436-451)
   _currently does this_
   The function @andrewmusselman identified as triggering a MissingGreenlet 
when release.release_policy was accessed without eager loading. Demonstrates 
failure mode C from the analysis: async caller can't run sync lazy-load.
   
   ```python
   async def release_ready_for_vote(
       session: web.Committer,
       project_key: safe.ProjectKey,
       version_key: safe.VersionKey,
       revision: safe.RevisionNumber,
       data: db.Session,
       allowed_vote_modes: frozenset[sql.VoteMode],
   ) -> tuple[sql.Release, sql.Committee] | str:
       release = await session.release(
           project_key,
           version_key,
           data=data,
           with_project=True,
           with_committee=True,
           with_project_release_policy=True,
       )
   ```
   
   ### Proposed approach
   The team has not converged on a solution. Multiple approaches have been 
discussed:
   
   1. **Continuation Passing Style (CPS)** — @sbp's preferred comprehensive 
solution, to be prototyped in #628 first. This would prevent detached objects 
from escaping session scope by construction.
   2. **One session per request** — Simple but may not prevent all reuse 
patterns.
   3. **Banning sql model types from writer method signatures** — Would require 
extracting needed data before passing to writers.
   4. **Advanced linting** — AST/type-driven static analysis to catch detached 
object access.
   
   @sbp's latest comment (2026-04-23) adds the constraint that live session 
data must not be returned from continuation functions. The next concrete step 
appears to be the CPS experiment in #628. Until that experiment yields results, 
no codebase-wide change should be attempted. In the interim, the team should 
continue ensuring eager-loading flags (like `_release_policy`, 
`_project_release_policy`) are set correctly in individual query calls to 
prevent the MissingGreenlet variant.
   
   ### Open questions
   - What was the outcome of the CPS experiment in #628? Has it been tried yet?
   - Would a runtime warning (e.g., overriding __getattr__ on detached objects 
to log warnings) be acceptable as a development-time safety net while the 
architectural solution is being designed?
   - Are there additional instances of the releases_by_phase/all_releases 
pattern (manual relationship assignment after session close) that should be 
catalogued?
   - Does the per-connection pragma question from @dave2wave (2026-03-18) refer 
to SQLite pragmas like journal_mode, or SQLAlchemy session configuration like 
expire_on_commit?
   
   _The agent reviewed this issue and is not proposing patches in this run. 
Review the existing-code citations and open questions above before deciding 
next steps._
   
   ### Files examined
   - `atr/db/__init__.py`
   - `atr/storage/writers/cache.py`
   - `atr/db/interaction.py`
   - `atr/storage/writers/mail.py`
   - `atr/storage/__init__.py`
   
   ---
   *Draft from a triage agent. A human reviewer should validate before merging 
any change. The agent did not run tests or verify diffs apply.*


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to