andrewmusselman commented on issue #751:
URL: 
https://github.com/apache/tooling-trusted-releases/issues/751#issuecomment-4356989911

   @sbp @dave2wave @alitheg here is what my robot says to do about it, RFC
   
   ## Review of the investigation draft
   
   The draft is in good shape and the headline conclusion (skip Maven, no-op 
for PyPI / Hex / Artifact Hub, promotion is a marginal win) is the right call. 
Below are some corrections to the document, a narrower schema proposal than the 
one in the draft, and concrete patches for the changes that are worth landing 
alongside it.
   
   ---
   
   ## Document corrections
   
   ### Maven Central staging URL
   
   The doc abbreviates this as `repository.apache.org`. The actual configured 
value in `atr/models/sql.py` is:
   
   ```
   
https://repository.apache.org:4443/repository/maven-staging/{owner_namespace}/{package}/maven-metadata.xml
   ```
   
   The port (`4443`) and `/repository/maven-staging/` path matter — RAO listens 
there for the staging repo. Either quote it in full or drop the parenthetical 
so the doc isn't subtly wrong.
   
   ### Docker Hub staging
   
   The "Yes (TODO in code)" entry is technically accurate, but the open 
question isn't "what staging URL?" — the commented `template_staging_url` in 
`sql.py` is identical to the production URL. The actual TODO is whether a 
separate staging-tag namespace is meaningful in our model at all: same repo 
with a `-staging` tag suffix, or a different repo entirely. That decision 
changes whether retag-promotion is even applicable, so it's worth surfacing in 
the doc.
   
   ### GitHub Releases
   
   The promotion mechanism (PATCH `prerelease: false`) is right, but the 
staging and production templates have *different shapes*: staging POSTs to 
`…/releases` (because creating a release is the staging act), while production 
GETs `…/releases/tags/v{version}`. That asymmetry is a small implementation 
gotcha that the rest of the platforms don't have — worth a sentence.
   
   ### npm staging
   
   The doc says staging is "dist-tags." Accurate, but it undersells a real 
difference from Test PyPI: npm has no separate staging *registry* — staging is 
just "publish under a non-`latest` dist-tag on the same registry." A bare `npm 
install package` won't pull the staged version, but the version is publicly 
listed and immutable once published. That's a meaningfully different trust 
model from Test PyPI and deserves to be called out next to the table.
   
   ### Erlang (Hex)
   
   The doc says "Not configured." Correct, but more importantly: 
`DistributionPlatform` in `sql.py` has no `HEX` member at all. Hex isn't just 
unconfigured for staging — it isn't modeled as a platform yet. That's the 
actual prerequisite for any further Erlang work, not promotion design.
   
   ---
   
   ## Schema proposal — narrower than the draft
   
   The draft proposes adding `supports_promotion`, `promotion_mechanism: str`, 
`promoted: bool`, `promoted_at`, and `staging_revision_key`. Most of these 
don't earn their keep:
   
   - **`supports_promotion: bool`** is derivable from `promotion_mechanism is 
not None`. Don't store both.
   - **`promotion_mechanism: str`** should be a typed enum 
(`PromotionMechanism.RETAG | DIST_TAG | PRERELEASE_FLAG`), not a free-form 
string. Stringly-typed schema columns always rot.
   - **`promoted: bool` and `promoted_at`** duplicate state. The existing 
`staging` boolean on `Distribution` already encodes this — when it flips false 
on a row that was previously `True`, that *is* the promotion. The 
`__upgrade_staging_to_final` method in `atr/storage/writers/distributions.py` 
already handles this transition. Don't add a parallel boolean.
   - **`staging_revision_key`** is the one that genuinely matters and is 
genuinely missing. The original issue calls this out: *"we have to check that 
the files being promoted are the same as those that were voted on."* Without 
recording which `Revision` was staged, ATR can't enforce that. It should land 
regardless of whether remote promotion is ever built, because it's also useful 
for the existing local re-upload path.
   
   So the actual schema delta is one nullable column.
   
   ---
   
   ## Patches
   
   ### Patch 1 — Add `staging_revision_key` to `Distribution`
   
   In `atr/models/sql.py`, inside `class Distribution(sqlmodel.SQLModel, 
table=True)` (around line 1185):
   
   ```diff
        staging: bool = sqlmodel.Field(default=False)
        pending: bool = sqlmodel.Field(default=False)
        retries: int = sqlmodel.Field(default=0)
        upload_date: datetime.datetime | None = sqlmodel.Field(default=None)
        api_url: str | None = sqlmodel.Field(default=None)
        web_url: str | None = sqlmodel.Field(default=None)
        created_by: str | None = sqlmodel.Field(default=None)
   +    # Records which Revision was staged. Set when staging=True so that the
   +    # finish-phase code can verify the artifacts being promoted match those
   +    # that were voted on. See issue #751.
   +    staging_revision_key: str | None = sqlmodel.Field(
   +        default=None, foreign_key="revision.key"
   +    )
   ```
   
   Then, at the `Distribution.record(...)` call site in 
`atr/storage/writers/distributions.py` (around line 143), accept and persist 
this value:
   
   ```diff
        async def record(
            self,
            release_key: models.safe.ReleaseKey,
            platform: models.sql.DistributionPlatform,
            owner_namespace: models.safe.Alphanumeric | None,
            package: models.safe.Alphanumeric,
            version: models.safe.VersionKey,
            staging: bool,
            pending: bool,
            upload_date: datetime.datetime | None,
            api_url: str | None = None,
            web_url: str | None = None,
   +        staging_revision_key: str | None = None,
        ) -> tuple[models.sql.Distribution, bool]:
            namespace = str(owner_namespace) if owner_namespace else ""
            existing = await self.__data.distribution(
                str(release_key), platform, namespace, str(package), 
str(version)
            ).get()
            dist = models.sql.Distribution(
                platform=platform,
                release_key=str(release_key),
                owner_namespace=namespace,
                package=str(package),
                version=str(version),
                staging=staging,
                pending=pending,
                retries=0,
                upload_date=upload_date,
                api_url=api_url,
                web_url=web_url,
                created_by=self.__asf_uid,
   +            staging_revision_key=staging_revision_key if staging else None,
            )
   ```
   
   The finish-phase code path (whichever caller drives 
`__upgrade_staging_to_final`) should then assert that 
`existing.staging_revision_key` equals the current `Revision.key` before 
promoting. That's a small follow-up rather than a schema concern.
   
   This will need an Alembic migration. The column is nullable, so existing 
rows are unaffected.
   
   ### Patch 2 — Stop hard-coding the staging-supported set
   
   In `atr/shared/distribution.py`, the `_template_url` function (around line 
439):
   
   ```diff
    def _template_url(
        dd: distribution.Data,
        staging: bool | None = None,
    ) -> str:
        if staging is False:
            return dd.platform.value.template_url
   
   -    supported = {
   -        sql.DistributionPlatform.ARTIFACT_HUB,
   -        sql.DistributionPlatform.PYPI,
   -        sql.DistributionPlatform.MAVEN,
   -    }
   -    if dd.platform not in supported:
   -        raise RuntimeError("Staging is currently supported only for 
ArtifactHub, PyPI and Maven Central.")
   -
        template_url = dd.platform.value.template_staging_url
        if template_url is None:
   -        raise RuntimeError("This platform does not provide a staging API 
endpoint.")
   +        raise RuntimeError(
   +            f"Platform {dd.platform.value.name} does not provide a staging 
API endpoint."
   +        )
   
        return template_url
   ```
   
   Rationale: the whitelist duplicates knowledge that's already in the enum — 
every platform with a `template_staging_url` already declares it. The whitelist 
is a footgun: when someone adds a new staging URL (for example, enabling GitHub 
Releases), they have to remember to update this set or staging silently breaks 
for the new platform. The enum should be the single source of truth, and the 
error message becomes more informative for free.
   
   ### Patch 3 — Decide `GITHUB`: enable or delete
   
   In `atr/models/sql.py`, around line 117, the commented-out enum member:
   
   ```python
       # GITHUB = DistributionPlatformValue(
       #     name="GitHub",
       #     gh_slug="github",
       #     
template_url="https://api.github.com/repos/{owner_namespace}/{package}/releases/tags/v{version}";,
       #     # Combine with {"prerelease": true}
       #     
template_staging_url="https://api.github.com/repos/{owner_namespace}/{package}/releases";,
       #     requires_owner_namespace=True,
       # )
   ```
   
   There are also two corresponding commented `# case 
models.sql.DistributionPlatform.GITHUB:` references in 
`atr/shared/distribution.py` (around lines 201 and 255).
   
   This block has been commented out long enough that it's noise. Either enable 
it (GitHub Releases is the cheapest promotion target after npm — one PATCH 
call, and the upload-date / web-url extractors already have placeholder cases 
waiting for it) or delete it. Leaving it commented while the investigation 
recommends enabling it is the worst of both worlds.
   
   ---
   
   ## Out-of-scope items to file separately
   
   The thread is currently mixing investigation, decision, and implementation. 
To close this issue cleanly:
   
   1. **npm dist-tag promotion** — the most valuable promotion to actually 
build, per the doc's own priority list. File as a child issue. The 
implementation surface is small (one HTTP call) but it depends on npm 
distribution support being mature, which it isn't yet.
   2. **GitHub Releases promotion** — depends on enabling the `GITHUB` platform 
first. File as a follow-up to whichever direction Patch 3 lands.
   3. **Docker Hub retag promotion** — feasible per the doc, but requires 
manifest-digest tracking (a third schema concern beyond 
`staging_revision_key`). Not worth scoping until Docker Hub distribution is 
actually being used in anger.
   4. **GH workflow integration** — there's an earlier comment in the thread 
asking to "add in information about GH workflows for staging and publishing to 
these models." The draft has a `gh_slug` column for display but doesn't engage 
with what ATR should *do* with the slug beyond label things. Either scope this 
out explicitly ("workflow integration out of scope for this issue") or open a 
separate design issue.
   
   ---
   
   ## Recommended close conditions
   
   1. Merge the investigation document with the corrections above.
   2. Land **Patch 1** (`staging_revision_key`) and **Patch 2** (whitelist 
removal). Both are independently useful regardless of any promotion work.
   3. Make a yes/no call on **Patch 3** (`GITHUB`).
   4. **Don't** build the generic `supports_promotion` / `promotion_mechanism` 
plumbing. Add it when there's a second platform that actually needs it. YAGNI.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to