asf-tooling commented on issue #433:
URL:
https://github.com/apache/tooling-trusted-releases/issues/433#issuecomment-4410329898
<!-- gofannon-issue-triage-bot v2 -->
**Automated triage** — analyzed at `main@2da7807a`
**Type:** `new_feature` • **Classification:** `actionable` •
**Confidence:** `medium`
**Application domain(s):** `distribution_tracking`, `shared_infrastructure`
### Summary
Issue requests documenting how to provide a Maven groupId for distribution
recording, and providing clear error messages. The discussion evolved
significantly: @dave2wave (2026-03-06) concluded that Maven coordinates should
be provided as a project-level setting carried to the release policy,
eventually settable through `.asf.yaml`. Currently the groupId is entered as
the generic 'Owner or Namespace' field with minimal guidance. Immediate
improvements include better form help text and error messages; the larger
feature (project-level Maven coordinate settings) requires schema changes.
### Where this lives in the code today
#### `atr/shared/distribution.py` — `DistributionAutomateForm` (lines 130-156)
_needs modification_
The help text for `owner_namespace` is generic: it lists the Maven groupId as
one of five possible meanings and does not say what a groupId looks like or
how to find it.
```python
class DistributionAutomateForm(form.Form):
    platform: form.Enum[DistributionPlatform] = form.label(
        "Platform", widget=form.Widget.SELECT, enum_filter_include=[DistributionPlatform.MAVEN.value]
    )
    owner_namespace: safe.OptionalAlphanumeric = form.label(
        "Owner or Namespace",
        "Who owns or names the package (Maven groupId, npm @scope, Docker namespace, "
        "GitHub owner, ArtifactHub repo). Leave blank if not used.",
    )
    package: safe.Alphanumeric = form.label("Package")
    version: safe.VersionKey = form.label("Version")
    details: form.Bool = form.label(
        "Include details",
        "Include the details of the distribution in the response",
    )

    @pydantic.model_validator(mode="after")
    def validate_owner_namespace(self) -> DistributionAutomateForm:
        sql_platform = self.platform.to_sql()  # type: ignore[attr-defined]
        default_owner_namespace = sql_platform.value.default_owner_namespace
        if default_owner_namespace and (not self.owner_namespace):
            self.owner_namespace = default_owner_namespace
        util.validate_distribution_owner_namespace(sql_platform, self.owner_namespace)
        return self
```
#### `atr/shared/distribution.py` — `DistributionRecordForm` (lines 159-183)
_needs modification_
Same generic help text as `DistributionAutomateForm`; it needs Maven-specific
guidance as well.
```python
class DistributionRecordForm(form.Form):
    platform: form.Enum[DistributionPlatform] = form.label("Platform", widget=form.Widget.SELECT)
    owner_namespace: safe.OptionalAlphanumeric = form.label(
        "Owner or Namespace",
        "Who owns or names the package (Maven groupId, npm @scope, Docker namespace, "
        "GitHub owner, ArtifactHub repo). Leave blank if not used.",
    )
    package: safe.Alphanumeric = form.label("Package")
    version: safe.VersionKey = form.label("Version")
    details: form.Bool = form.label(
        "Include details",
        "Include the details of the distribution in the response",
    )

    @pydantic.model_validator(mode="after")
    def validate_owner_namespace(self) -> DistributionRecordForm:
        sql_platform = self.platform.to_sql()  # type: ignore[attr-defined]
        default_owner_namespace = sql_platform.value.default_owner_namespace
        if default_owner_namespace and (not self.owner_namespace):
            self.owner_namespace = default_owner_namespace
        util.validate_distribution_owner_namespace(sql_platform, self.owner_namespace)
        return self
```
#### `atr/shared/distribution.py` — `json_from_maven_xml` (lines 336-398)
_currently does this_
This is where Maven metadata validation happens. None of its error messages
mention the groupId, so they could be improved to help users understand what
went wrong with the value they entered.
```python
async def json_from_maven_xml(api_url: str, version_key: safe.VersionKey) -> outcome.Outcome[basic.JSON]:
    import datetime

    import defusedxml.ElementTree as ElementTree

    version = str(version_key)
    try:
        async with util.create_secure_session(timeout=_TIMEOUT) as session:
            async with session.get(api_url) as response:
                response.raise_for_status()
                xml_text = await response.text()

        # Parse the XML
        root = ElementTree.fromstring(xml_text)

        # Extract versioning info
        group = root.find("groupId")
        artifact = root.find("artifactId")
        versioning = root.find("versioning")
        if versioning is None:
            e = DistributionError("No versioning element found in Maven metadata")
            return outcome.Error(e)

        # Get lastUpdated timestamp (format: yyyyMMddHHmmss)
        last_updated_elem = versioning.find("lastUpdated")
        if (last_updated_elem is None) or (not last_updated_elem.text):
            e = DistributionError("No lastUpdated timestamp found in Maven metadata")
            return outcome.Error(e)

        # Convert lastUpdated string to Unix timestamp in milliseconds
        last_updated_str = last_updated_elem.text
        dt = datetime.datetime.strptime(last_updated_str, "%Y%m%d%H%M%S")
        dt = dt.replace(tzinfo=datetime.UTC)
        timestamp_ms = int(dt.timestamp() * 1000)

        # Verify the version exists
        versions_elem = versioning.find("versions")
        if versions_elem is not None:
            versions = [v.text for v in versions_elem.findall("version") if v.text]
            if version not in versions:
                e = DistributionError(f"Version '{version}' not found in Maven metadata")
                return outcome.Error(e)

        # Convert to dict matching MavenResponse structure
        result_dict = {
            "response": {
                "start": 0,
                "docs": [
                    {
                        "g": group.text if (group is not None) else "",
                        "a": artifact.text if (artifact is not None) else "",
                        "v": version,
                        "timestamp": timestamp_ms,
                    }
                ],
            }
        }
        result = basic.as_json(result_dict)
        return outcome.Result(result)
    except (aiohttp.ClientError, DistributionError) as e:
        return outcome.Error(e)
    except ElementTree.ParseError as e:
        return outcome.Error(RuntimeError(f"Failed to parse Maven XML: {e}"))
```
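As a rough illustration of the kind of messages that could help here, the sketch below checks a metadata document against the groupId and version the user entered and names the mismatch explicitly. It is not ATR code: `MavenMetadataError` and `check_maven_metadata` are hypothetical names, and the sketch uses the standard library parser only to stay self-contained (the real function uses `defusedxml`, which should be kept for untrusted input).

```python
import xml.etree.ElementTree as ET  # sketch only; ATR uses defusedxml for untrusted XML


class MavenMetadataError(Exception):
    """Hypothetical error type for this sketch; ATR uses DistributionError."""


def check_maven_metadata(xml_text: str, group_id: str, version: str) -> None:
    """Raise MavenMetadataError with a user-facing hint if the metadata does not match."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        raise MavenMetadataError(f"Maven metadata is not valid XML: {exc}") from exc

    # Compare the groupId in the metadata with the one the user entered.
    found_group = (root.findtext("groupId") or "").strip()
    if found_group and found_group != group_id:
        raise MavenMetadataError(
            f"Maven metadata is for groupId '{found_group}', not '{group_id}'. "
            "Check the 'Owner or Namespace' field."
        )

    # Report the most recent known versions so the user can spot a typo.
    versions = [v.text.strip() for v in root.findall("versioning/versions/version") if v.text]
    if versions and version not in versions:
        recent = ", ".join(versions[-5:])
        raise MavenMetadataError(
            f"Version '{version}' not found for groupId '{group_id}'. Most recent known versions: {recent}."
        )
```

The point of the sketch is only that each failure names the field the user can correct; where such checks would live in ATR is left open.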
#### `atr/storage/writers/distributions.py` — `CommitteeMember.record_from_data` (lines 216-247)
_needs modification_
The error message raised when the Maven lookup fails is generic. When the
platform is Maven, it should include guidance about the groupId format.
```python
    async def record_from_data(
        self,
        release_key: models.safe.ReleaseKey,
        staging: bool,
        dd: models.distribution.Data,
        allow_retries: bool = False,
    ) -> tuple[models.sql.Distribution, bool, models.distribution.Metadata | None]:
        api_url = distribution.get_api_url(dd, staging)
        if dd.platform == models.sql.DistributionPlatform.MAVEN:
            api_oc = await distribution.json_from_maven_xml(api_url, dd.version)
        else:
            api_oc = await distribution.json_from_distribution_platform(api_url, dd.platform, dd.version)
        match api_oc:
            case outcome.Result(result):
                pass
            case outcome.Error(error):
                log.error(f"Failed to get API response from {api_url}: {error}")
                if allow_retries:
                    dist, added = await self.record(
                        release_key=release_key,
                        platform=dd.platform,
                        owner_namespace=dd.owner_namespace,
                        package=dd.package,
                        version=dd.version,
                        staging=staging,
                        pending=True,
                        upload_date=None,
                        api_url=None,
                        web_url=None,
                    )
                    return dist, added, None
                raise storage.AccessError(f"Failed to get API response from distribution platform: {error}", status=502)
```
#### `atr/get/distribution.py` — `_automate_form_page` (lines 191-212)
_needs modification_
The form page for automation could include Maven-specific documentation/help
text explaining the groupId concept.
```python
async def _automate_form_page(project: safe.ProjectKey, version: safe.VersionKey, staging: bool) -> str:
    """Helper to render the distribution automation form page."""
    await shared.distribution.release_validated(project, version, staging=staging)

    block = htm.Block()
    render.html_nav_phase(block, str(project), str(version), staging=staging)

    title = "Create a staging distribution" if staging else "Create a distribution"
    block.h1[title]

    block.p[
        "Create a distribution of ",
        htm.strong[f"{project}-{version}"],
        " using the form below.",
    ]
    block.p[
        "You can also ",
        htm.a(href=util.as_url(list_get, project_key=str(project), version_key=str(version)))[
            "view the distribution list"
        ],
        ".",
    ]
```
#### `atr/get/distribution.py` — `_record_form_page` (lines 267-288)
_needs modification_
The record form page similarly lacks Maven-specific documentation about the
groupId.
```python
async def _record_form_page(project: safe.ProjectKey, version: safe.VersionKey, staging: bool) -> str:
    """Helper to render the distribution recording form page."""
    await shared.distribution.release_validated(project, version, staging=staging)

    block = htm.Block()
    render.html_nav_phase(block, str(project), str(version), staging=staging)

    title = "Record a manual staging distribution" if staging else "Record a manual distribution"
    block.h1[title]

    block.p[
        "Record a manual distribution of ",
        htm.strong[f"{project}-{version}"],
        " using the form below.",
    ]
    block.p[
        "You can also ",
        htm.a(href=util.as_url(list_get, project_key=str(project), version_key=str(version)))[
            "view the distribution list"
        ],
        ".",
    ]
```
### Where new code would go
- `atr/models/sql.py` — after Project model definition
Per @dave2wave's decision, Maven coordinates should be stored as a
project-level setting. This would require a new field or related table in the
SQL models.
- `atr/get/distribution.py` — after the `block.p` statements in
`_automate_form_page` (around line 192)
Add Maven-specific help text explaining what a groupId is and how to find
it.
### Proposed approach
The issue has two parts: (1) an immediate UX improvement to the form help
text and error messages, explaining what a Maven groupId is and how to find
it, and (2) a larger architectural change, per @dave2wave's decision, to make
Maven coordinates a project-level setting.

For the immediate improvement, we should enhance the form labels, add inline
help text on the distribution form pages, and improve the error messages shown
when Maven lookups fail (e.g., 'The Maven groupId you provided does not appear
to exist. For Apache projects, this is typically org.apache.<project-name>.
Check your project's existing artifacts on search.maven.org.').

The project-level setting would require schema changes to `atr/models/sql.py`
(adding a maven_coordinates field to projects), a project settings UI, and
form pre-population logic. This is a separate body of work that should likely
be tracked as its own implementation task.
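For context on why a mistyped groupId shows up as a failed lookup: in the standard Maven repository layout, the dots in the groupId become path separators, so a wrong groupId points at a path that does not exist. The sketch below shows that mapping. The function name and the `repo1.maven.org` base URL are assumptions made for illustration; ATR's own `distribution.get_api_url` may build its URL differently.

```python
from urllib.parse import quote

# Assumed base URL for the sketch; ATR's get_api_url may use a different host.
MAVEN_CENTRAL_BASE = "https://repo1.maven.org/maven2"


def maven_metadata_url(group_id: str, artifact_id: str, base: str = MAVEN_CENTRAL_BASE) -> str:
    """Build the maven-metadata.xml URL for a groupId and artifactId."""
    # Each dot-separated groupId segment becomes one path segment.
    group_path = "/".join(quote(part, safe="") for part in group_id.split("."))
    return f"{base.rstrip('/')}/{group_path}/{quote(artifact_id, safe='')}/maven-metadata.xml"


print(maven_metadata_url("org.apache.maven", "maven-core"))
# https://repo1.maven.org/maven2/org/apache/maven/maven-core/maven-metadata.xml
```

An improved error message could include the URL that was tried, which lets the user see at a glance which path segment is wrong.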
### Suggested patches
#### `atr/shared/distribution.py`
Improve form help text for the owner_namespace field to give Maven-specific
guidance
````diff
--- a/atr/shared/distribution.py
+++ b/atr/shared/distribution.py
@@ -100,9 +100,11 @@ class DistributionAutomateForm(form.Form):
     platform: form.Enum[DistributionPlatform] = form.label(
         "Platform", widget=form.Widget.SELECT, enum_filter_include=[DistributionPlatform.MAVEN.value]
     )
     owner_namespace: safe.OptionalAlphanumeric = form.label(
         "Owner or Namespace",
-        "Who owns or names the package (Maven groupId, npm @scope, Docker namespace, "
-        "GitHub owner, ArtifactHub repo). Leave blank if not used.",
+        "For Maven Central, this is the groupId (e.g. org.apache.maven). "
+        "For most Apache projects it is org.apache.<project-name>. "
+        "You can find your groupId by searching for your artifact on search.maven.org. "
+        "For other platforms: npm @scope, Docker namespace, GitHub owner, ArtifactHub repo. "
+        "Leave blank if not used.",
     )
     package: safe.Alphanumeric = form.label("Package")
     version: safe.VersionKey = form.label("Version")
@@ -124,9 +126,11 @@ class DistributionRecordForm(form.Form):
     platform: form.Enum[DistributionPlatform] = form.label("Platform", widget=form.Widget.SELECT)
     owner_namespace: safe.OptionalAlphanumeric = form.label(
         "Owner or Namespace",
-        "Who owns or names the package (Maven groupId, npm @scope, Docker namespace, "
-        "GitHub owner, ArtifactHub repo). Leave blank if not used.",
+        "For Maven Central, this is the groupId (e.g. org.apache.maven). "
+        "For most Apache projects it is org.apache.<project-name>. "
+        "You can find your groupId by searching for your artifact on search.maven.org. "
+        "For other platforms: npm @scope, Docker namespace, GitHub owner, ArtifactHub repo. "
+        "Leave blank if not used.",
     )
     package: safe.Alphanumeric = form.label("Package")
     version: safe.VersionKey = form.label("Version")
````
#### `atr/storage/writers/distributions.py`
Improve error messages when Maven API lookup fails to give Maven-specific
guidance about the groupId
````diff
--- a/atr/storage/writers/distributions.py
+++ b/atr/storage/writers/distributions.py
@@ -211,7 +211,14 @@ class CommitteeMember(CommitteeParticipant):
             case outcome.Error(error):
                 log.error(f"Failed to get API response from {api_url}: {error}")
                 if allow_retries:
                     dist, added = await self.record(
                         release_key=release_key,
                         platform=dd.platform,
                         owner_namespace=dd.owner_namespace,
                         package=dd.package,
                         version=dd.version,
                         staging=staging,
                         pending=True,
                         upload_date=None,
                         api_url=None,
                         web_url=None,
                     )
                     return dist, added, None
-                raise storage.AccessError(f"Failed to get API response from distribution platform: {error}", status=502)
+                if dd.platform == models.sql.DistributionPlatform.MAVEN:
+                    raise storage.AccessError(
+                        "Failed to find the artifact on Maven Central. "
+                        "Please verify the groupId (Owner or Namespace) is correct. "
+                        "For most Apache projects this is 'org.apache.<project-name>'. "
+                        "You can verify by searching on https://search.maven.org. "
+                        f"Original error: {error}",
+                        status=502,
+                    )
+                raise storage.AccessError(f"Failed to get API response from distribution platform: {error}", status=502)
````
#### `atr/get/distribution.py`
Add Maven-specific help text on the distribution form pages
````diff
--- a/atr/get/distribution.py
+++ b/atr/get/distribution.py
@@ -189,6 +189,16 @@ async def _automate_form_page(project: safe.ProjectKey, version: safe.VersionKey
         ".",
     ]
+    block.append(
+        htm.div(".alert.alert-info")[
+            htm.strong["Maven Central tip: "],
+            "The 'Owner or Namespace' field is the Maven groupId. ",
+            "For most Apache projects, this is ",
+            htm.code["org.apache.<project-name>"],
+            ". You can verify your groupId by searching on ",
+            htm.a(href="https://search.maven.org")["search.maven.org"],
+            ".",
+        ]
+    )
+
     # Determine the action based on staging
     action = (
         util.as_url(post.distribution.stage_automate_selected, project_key=str(project), version_key=str(version))
````
### Open questions
- The larger feature (Maven coordinates as project-level settings stored in
SQL and eventually configurable via .asf.yaml) needs a design discussion - what
does the schema look like? A new column on the Project table, or a separate
table supporting multiple groupIds per project?
- Should validation check the provided groupId against a known list of
Apache groupIds (per @dave2wave's mapping), or just validate format and
reachability?
- How will this interact with Nexus3 distribution workflows that @dave2wave
mentioned discussing with @alitheg?
- I could not verify the exact location of
`util.validate_distribution_owner_namespace` - this function likely also needs
improved error messages for Maven but I haven't seen its source.
### Files examined
- `atr/models/distribution.py`
- `atr/get/distribution.py`
- `atr/shared/distribution.py`
- `atr/post/distribution.py`
- `atr/storage/writers/distributions.py`
- `atr/tasks/distribution.py`
---
*Draft from a triage agent. A human reviewer should validate before merging
any change. The agent did not run tests or verify diffs apply.*
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]