asf-tooling commented on issue #480:
URL:
https://github.com/apache/tooling-trusted-releases/issues/480#issuecomment-4410203466
<!-- gofannon-issue-triage-bot v2 -->
**Automated triage** — analyzed at `main@2da7807a`
**Type:** `new_feature` • **Classification:** `actionable` •
**Confidence:** `medium`
**Application domain(s):** `distribution_tracking`, `web_api_infrastructure`
### Summary
This issue requests automated PyPI distribution support via GitHub Actions
(GHA) and ATR. The codebase already supports PyPI for *manual* distribution
recording (models, upload date parsing, web URL extraction all exist), but
automated distribution via GHA workflows is only enabled for Maven Central.
@dave2wave noted twine as the publishing tool and that RC versions need
separate handling. @potiuk provided detailed requirements around the apache
PyPI organization, RC vs. final package variants (RC packages embed version
suffixes and different dependency constraints), and the need for two
distribution variants in dev. The ATR-side changes are to enable PyPI in the
automation paths; the actual GHA workflow files would live in
`apache/tooling-actions`.
### Where this lives in the code today
#### `atr/shared/distribution.py` — `DistributionAutomateForm` (lines 130-144)
_needs modification_
The `enum_filter_include` argument currently allows only Maven Central for
automated distribution; PyPI needs to be added.
```python
class DistributionAutomateForm(form.Form):
    platform: form.Enum[DistributionPlatform] = form.label(
        "Platform", widget=form.Widget.SELECT, enum_filter_include=[DistributionPlatform.MAVEN.value]
    )
    owner_namespace: safe.OptionalAlphanumeric = form.label(
        "Owner or Namespace",
        "Who owns or names the package (Maven groupId, npm @scope, Docker namespace, "
        "GitHub owner, ArtifactHub repo). Leave blank if not used.",
    )
    package: safe.Alphanumeric = form.label("Package")
    version: safe.VersionKey = form.label("Version")
    details: form.Bool = form.label(
        "Include details",
        "Include the details of the distribution in the response",
    )
```
#### `atr/shared/distribution.py` — `DistributionPlatform` (lines 78-102)
_currently does this_
PyPI is already defined as a DistributionPlatform enum variant with SQL
conversion support.
```python
class DistributionPlatform(enum.Enum):
    """Wrapper enum for distribution platforms."""

    ARTIFACT_HUB = "Artifact Hub"
    DOCKER_HUB = "Docker Hub"
    MAVEN = "Maven Central"
    NPM = "npm"
    NPM_SCOPED = "npm (scoped)"
    PYPI = "PyPI"

    def to_sql(self) -> sql.DistributionPlatform:
        """Convert to SQL enum."""
        match self:
            case DistributionPlatform.ARTIFACT_HUB:
                return sql.DistributionPlatform.ARTIFACT_HUB
            case DistributionPlatform.DOCKER_HUB:
                return sql.DistributionPlatform.DOCKER_HUB
            case DistributionPlatform.MAVEN:
                return sql.DistributionPlatform.MAVEN
            case DistributionPlatform.NPM:
                return sql.DistributionPlatform.NPM
            case DistributionPlatform.NPM_SCOPED:
                return sql.DistributionPlatform.NPM_SCOPED
            case DistributionPlatform.PYPI:
                return sql.DistributionPlatform.PYPI
```
#### `atr/shared/distribution.py` — `_template_url` (lines 439-452)
_currently does this_
PyPI is already in the staging-supported set for template URL resolution,
which suggests (but does not confirm) that the SQL model already carries a
staging URL for it.
```python
def _template_url(
    dd: distribution.Data,
    staging: bool | None = None,
) -> str:
    if staging is False:
        return dd.platform.value.template_url

    supported = {
        sql.DistributionPlatform.ARTIFACT_HUB,
        sql.DistributionPlatform.PYPI,
        sql.DistributionPlatform.MAVEN,
    }
    if dd.platform not in supported:
        raise RuntimeError("Staging is currently supported only for ArtifactHub, PyPI and Maven Central.")
```
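As a rough illustration of the production/staging split that `_template_url` resolves, here is a minimal sketch. The two template constants and the `pypi_api_url` helper are hypothetical names for this example only; the real templates are attributes of `sql.DistributionPlatform.PYPI` and were not inspected here. The sketch assumes the PyPI JSON API layout served by both pypi.org and test.pypi.org.

```python
from urllib.parse import quote

# Hypothetical templates for illustration; the real values live on the
# sql.DistributionPlatform.PYPI enum member and may differ.
PYPI_TEMPLATE = "https://pypi.org/pypi/{package}/{version}/json"
PYPI_STAGING_TEMPLATE = "https://test.pypi.org/pypi/{package}/{version}/json"


def pypi_api_url(package: str, version: str, staging: bool = False) -> str:
    """Pick the staging or production template and fill in the placeholders."""
    template = PYPI_STAGING_TEMPLATE if staging else PYPI_TEMPLATE
    # Percent-encode both path components so unexpected characters cannot
    # change the shape of the URL.
    return template.format(
        package=quote(package, safe=""),
        version=quote(version, safe=""),
    )


print(pypi_api_url("apache-airflow", "3.0.0rc1", staging=True))
```

If the existing staging template already points at test.pypi.org, no configuration change would be needed on the ATR side for URL resolution.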
#### `atr/tasks/gha.py` — `trigger_workflow` (lines 116-127)
_currently does this_
The GHA trigger uses `sql_platform.value.gh_slug` to build the workflow
filename. For PyPI this would resolve to `distribute-pypi.yml` or
`distribute-pypi-stg.yml`, assuming `gh_slug` is `pypi`.
```python
@checks.with_model(args.DistributionWorkflow)
async def trigger_workflow(
    task_args: args.DistributionWorkflow, *, task_id: int | None = None
) -> results.Results | None:
    unique_id = f"atr-dist-{task_args.name}-{uuid.uuid4()}"
    project = safe.ProjectKey(task_args.project_key)
    safe.VersionKey(task_args.version_key)
    try:
        sql_platform = sql.DistributionPlatform[task_args.platform]
    except KeyError:
        _fail(f"Invalid platform: {task_args.platform}")
    workflow = f"distribute-{sql_platform.value.gh_slug}{'-stg' if task_args.staging else ''}.yml"
```
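The filename derivation above can be checked in isolation with a small sketch. `workflow_filename` is a hypothetical helper that mirrors the f-string in `trigger_workflow`, and the slug `"pypi"` is an assumption until the real `gh_slug` value is confirmed.

```python
def workflow_filename(gh_slug: str, staging: bool) -> str:
    """Mirror the f-string in trigger_workflow that names the workflow file."""
    suffix = "-stg" if staging else ""
    return f"distribute-{gh_slug}{suffix}.yml"


# "pypi" is an assumed slug, not a value read from atr/models/sql.py.
print(workflow_filename("pypi", staging=False))
print(workflow_filename("pypi", staging=True))
```

Whatever the real slug turns out to be, the two workflow files in `apache/tooling-actions` would need to match these names exactly for the dispatch to find them.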
#### `atr/models/distribution.py` — `PyPIResponse` (lines 75-87)
_currently does this_
PyPI API response models already exist for parsing PyPI JSON API responses.
```python
class PyPIUrl(schema.Subset):
    upload_time_iso_8601: str | None = None
    url: str | None = None


class PyPIInfo(schema.Subset):
    release_url: str | None = None
    project_url: str | None = None


class PyPIResponse(schema.Subset):
    urls: list[PyPIUrl] = pydantic.Field(default_factory=list)
    info: PyPIInfo = pydantic.Field(default_factory=PyPIInfo)
```
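To show which fields these models pick out, the following sketch reads the same keys from a PyPI-style JSON payload using only the standard library. It does not use `schema.Subset` or reproduce ATR's actual upload-date parsing; `earliest_upload` and the sample payload are made up for illustration.

```python
from __future__ import annotations

import json
from datetime import datetime, timezone


def earliest_upload(payload: str) -> datetime | None:
    """Return the earliest upload time among the files listed under "urls"."""
    data = json.loads(payload)
    times = []
    for entry in data.get("urls", []):
        stamp = entry.get("upload_time_iso_8601")
        if stamp:
            # Normalise a trailing "Z" so fromisoformat accepts it on older Pythons.
            times.append(datetime.fromisoformat(stamp.replace("Z", "+00:00")))
    return min(times, default=None)


# Invented sample data shaped like a PyPI JSON API response.
sample = json.dumps(
    {
        "info": {"release_url": None, "project_url": None},
        "urls": [
            {"upload_time_iso_8601": "2024-05-01T12:30:00.000000Z", "url": None},
            {"upload_time_iso_8601": "2024-05-01T10:00:00.000000Z", "url": None},
        ],
    }
)
print(earliest_upload(sample))
```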
### Where new code would go
- `apache/tooling-actions` (external repo): new files
GHA workflow files `distribute-pypi.yml` and `distribute-pypi-stg.yml` need to
be created in the `apache/tooling-actions` repository to perform the actual
upload to PyPI. Whether they authenticate via trusted publishing or API tokens
is still open (see Open questions).
### Proposed approach
The ATR-side implementation requires two main changes: (1) adding PyPI to
the automated platform lists in `atr/post/distribution.py`, and (2) updating
the `DistributionAutomateForm` in `atr/shared/distribution.py` to include PyPI
in the allowed platform filter. These changes let the existing GHA workflow
trigger mechanism (in `atr/tasks/gha.py`) dispatch PyPI distribution
workflows.
The more complex aspect raised by @potiuk is the handling of RC vs. final
package variants, where RC packages have different version strings and
dependency constraints. This may require additional model changes. The current
staging boolean (`staging=True/False`) distinguishes staging from production,
but @potiuk's requirements suggest that both an RC variant and a final variant
need to coexist in dev/staging. Supporting that may require a new field or tag
on distributions to distinguish RC-published packages from
final-to-be-published packages. However, the simpler first step of enabling
PyPI automation can proceed without fully solving the RC variant problem, since
@dave2wave noted they would 'only go as far as finding versions with an rc1
added to the version path'. The actual workflow files (twine-based, with the
authentication method still to be decided) need to be implemented in
the external `apache/tooling-actions` repository.
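To make the RC-suffix idea concrete, here is a minimal sketch of separating a release-candidate marker from a version string. `split_rc` is a hypothetical helper, not part of ATR, and it only recognises `rc` spellings such as `2.9.0rc1` or `1.0.0-RC2`; how ATR should actually model the two variants remains an open question.

```python
from __future__ import annotations

import re

# Matches a trailing "rcN", optionally preceded by ".", "-" or "_".
_RC_PATTERN = re.compile(r"^(?P<base>.+?)[-_.]?rc[-_.]?(?P<num>\d+)$", re.IGNORECASE)


def split_rc(version: str) -> tuple[str, int | None]:
    """Split a version into (base version, RC number or None)."""
    match = _RC_PATTERN.match(version)
    if match is None:
        return version, None
    return match.group("base"), int(match.group("num"))


print(split_rc("2.9.0rc1"))
print(split_rc("2.9.0"))
```

A check like this would only tell an RC-published package apart from a final one by its version string; it says nothing about the differing dependency constraints @potiuk described.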
### Suggested patches
#### `atr/post/distribution.py`
Add PyPI to both automated and staging automated platform lists
````diff
--- a/atr/post/distribution.py
+++ b/atr/post/distribution.py
@@ -34,10 +34,12 @@ import atr.web as web
 _AUTOMATED_PLATFORMS: Final[tuple[shared.distribution.DistributionPlatform, ...]] = (
     shared.distribution.DistributionPlatform.MAVEN,
+    shared.distribution.DistributionPlatform.PYPI,
 )
 _AUTOMATED_PLATFORMS_STAGE: Final[tuple[shared.distribution.DistributionPlatform, ...]] = (
     shared.distribution.DistributionPlatform.MAVEN,
+    shared.distribution.DistributionPlatform.PYPI,
 )
````
#### `atr/shared/distribution.py`
Include PyPI in the automated distribution form's platform filter so users
can select it
````diff
--- a/atr/shared/distribution.py
+++ b/atr/shared/distribution.py
@@ -99,7 +99,10 @@ class DeleteForm(form.Form):
 class DistributionAutomateForm(form.Form):
     platform: form.Enum[DistributionPlatform] = form.label(
-        "Platform", widget=form.Widget.SELECT, enum_filter_include=[DistributionPlatform.MAVEN.value]
+        "Platform", widget=form.Widget.SELECT, enum_filter_include=[
+            DistributionPlatform.MAVEN.value,
+            DistributionPlatform.PYPI.value,
+        ]
     )
     owner_namespace: safe.OptionalAlphanumeric = form.label(
         "Owner or Namespace",
````
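The effect of the wider filter can be pictured with a small stand-alone sketch. It assumes `enum_filter_include` acts as an allow-list over enum values; ATR's `form.Enum` implementation was not inspected, and `Platform` and `selectable` are hypothetical names used only here.

```python
from __future__ import annotations

import enum


class Platform(enum.Enum):
    """Cut-down stand-in for DistributionPlatform."""

    MAVEN = "Maven Central"
    NPM = "npm"
    PYPI = "PyPI"


def selectable(platform_enum: type[enum.Enum], include: list[str]) -> list[enum.Enum]:
    """Keep only the members whose value appears in the allow-list."""
    allowed = set(include)
    return [member for member in platform_enum if member.value in allowed]


before = selectable(Platform, [Platform.MAVEN.value])
after = selectable(Platform, [Platform.MAVEN.value, Platform.PYPI.value])
print([member.name for member in before])
print([member.name for member in after])
```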
### Open questions
- What is the `gh_slug` value for `sql.DistributionPlatform.PYPI` in
`atr/models/sql.py`? The GHA workflow trigger needs this to construct the
filename `distribute-{gh_slug}.yml`.
- Does `sql.DistributionPlatform.PYPI.value.template_staging_url` already
point to test.pypi.org? If not, it needs to be configured.
- How should RC vs. final package variants be modeled? @potiuk described
needing two variants (RC-published and final-to-be-published) that differ in
version strings and dependencies. Does the existing `staging` boolean suffice,
or is a new field needed?
- The actual GHA workflow files (distribute-pypi.yml,
distribute-pypi-stg.yml) need to be created in apache/tooling-actions. What's
the status of that work?
- Should the workflow authenticate with PyPI trusted publishing (OIDC-based,
no long-lived API tokens needed) or with traditional API tokens passed to
twine? @dave2wave mentioned twine, but trusted publishing is the more modern
approach.
- Does PyPI's 'apache' organization (https://pypi.org/org/apache/) already
have trusted publisher configurations set up for any projects?
### Files examined
- `atr/get/distribution.py`
- `atr/models/distribution.py`
- `atr/post/distribution.py`
- `atr/shared/distribution.py`
- `atr/storage/writers/distributions.py`
- `atr/tasks/distribution.py`
- `atr/tasks/gha.py`
- `atr/models/api.py`
---
*Draft from a triage agent. A human reviewer should validate before merging
any change. The agent did not run tests or verify diffs apply.*
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]