asf-tooling commented on issue #433:
URL: 
https://github.com/apache/tooling-trusted-releases/issues/433#issuecomment-4410329898

   <!-- gofannon-issue-triage-bot v2 -->
   
   **Automated triage** — analyzed at `main@2da7807a`
   
   **Type:** `new_feature`  •  **Classification:** `actionable`  •  
**Confidence:** `medium`
   **Application domain(s):** `distribution_tracking`, `shared_infrastructure`
   
   ### Summary
   The issue asks for documentation on how to provide a Maven groupId when 
recording a distribution, and for clearer error messages. The discussion 
evolved significantly: @dave2wave (2026-03-06) concluded that Maven coordinates 
should be a project-level setting carried to the release policy, and 
eventually settable through `.asf.yaml`. Currently the groupId is entered in 
the generic 'Owner or Namespace' field with minimal guidance. Immediate 
improvements include better form help text and error messages; the larger 
feature (project-level Maven coordinate settings) requires schema changes.
   
   ### Where this lives in the code today
   
   #### `atr/shared/distribution.py` — `DistributionAutomateForm` (lines 
130-156)
   _needs modification_
   The form label for owner_namespace is generic and doesn't provide 
Maven-specific guidance about what a groupId looks like or how to find it.
   
   ```python
   class DistributionAutomateForm(form.Form):
       platform: form.Enum[DistributionPlatform] = form.label(
           "Platform", widget=form.Widget.SELECT, 
enum_filter_include=[DistributionPlatform.MAVEN.value]
       )
       owner_namespace: safe.OptionalAlphanumeric = form.label(
           "Owner or Namespace",
           "Who owns or names the package (Maven groupId, npm @scope, Docker 
namespace, "
           "GitHub owner, ArtifactHub repo). Leave blank if not used.",
       )
       package: safe.Alphanumeric = form.label("Package")
       version: safe.VersionKey = form.label("Version")
       details: form.Bool = form.label(
           "Include details",
           "Include the details of the distribution in the response",
       )
   
       @pydantic.model_validator(mode="after")
       def validate_owner_namespace(self) -> DistributionAutomateForm:
           sql_platform = self.platform.to_sql()  # type: ignore[attr-defined]
           default_owner_namespace = sql_platform.value.default_owner_namespace
   
           if default_owner_namespace and (not self.owner_namespace):
               self.owner_namespace = default_owner_namespace
   
           util.validate_distribution_owner_namespace(sql_platform, 
self.owner_namespace)
   
           return self
   ```
   
   #### `atr/shared/distribution.py` — `DistributionRecordForm` (lines 159-183)
   _needs modification_
   Same generic label issue as DistributionAutomateForm - needs Maven-specific 
help text.
   
   ```python
   class DistributionRecordForm(form.Form):
       platform: form.Enum[DistributionPlatform] = form.label("Platform", 
widget=form.Widget.SELECT)
       owner_namespace: safe.OptionalAlphanumeric = form.label(
           "Owner or Namespace",
           "Who owns or names the package (Maven groupId, npm @scope, Docker 
namespace, "
           "GitHub owner, ArtifactHub repo). Leave blank if not used.",
       )
       package: safe.Alphanumeric = form.label("Package")
       version: safe.VersionKey = form.label("Version")
       details: form.Bool = form.label(
           "Include details",
           "Include the details of the distribution in the response",
       )
   
       @pydantic.model_validator(mode="after")
       def validate_owner_namespace(self) -> DistributionRecordForm:
           sql_platform = self.platform.to_sql()  # type: ignore[attr-defined]
           default_owner_namespace = sql_platform.value.default_owner_namespace
   
           if default_owner_namespace and (not self.owner_namespace):
               self.owner_namespace = default_owner_namespace
   
           util.validate_distribution_owner_namespace(sql_platform, 
self.owner_namespace)
   
           return self
   ```
   
   #### `atr/shared/distribution.py` — `json_from_maven_xml` (lines 336-398)
   _currently does this_
   This is where Maven metadata validation happens - error messages here could 
be improved to help users understand what went wrong with their groupId.
   
   ```python
   async def json_from_maven_xml(api_url: str, version_key: safe.VersionKey) -> 
outcome.Outcome[basic.JSON]:
       import datetime
   
       import defusedxml.ElementTree as ElementTree
   
       version = str(version_key)
       try:
           async with util.create_secure_session(timeout=_TIMEOUT) as session:
               async with session.get(api_url) as response:
                   response.raise_for_status()
                   xml_text = await response.text()
   
           # Parse the XML
           root = ElementTree.fromstring(xml_text)
   
           # Extract versioning info
           group = root.find("groupId")
           artifact = root.find("artifactId")
           versioning = root.find("versioning")
           if versioning is None:
               e = DistributionError("No versioning element found in Maven 
metadata")
               return outcome.Error(e)
   
           # Get lastUpdated timestamp (format: yyyyMMddHHmmss)
           last_updated_elem = versioning.find("lastUpdated")
           if (last_updated_elem is None) or (not last_updated_elem.text):
               e = DistributionError("No lastUpdated timestamp found in Maven 
metadata")
               return outcome.Error(e)
   
           # Convert lastUpdated string to Unix timestamp in milliseconds
           last_updated_str = last_updated_elem.text
           dt = datetime.datetime.strptime(last_updated_str, "%Y%m%d%H%M%S")
           dt = dt.replace(tzinfo=datetime.UTC)
           timestamp_ms = int(dt.timestamp() * 1000)
   
           # Verify the version exists
           versions_elem = versioning.find("versions")
           if versions_elem is not None:
               versions = [v.text for v in versions_elem.findall("version") if 
v.text]
               if version not in versions:
                   e = DistributionError(f"Version '{version}' not found in 
Maven metadata")
                   return outcome.Error(e)
   
           # Convert to dict matching MavenResponse structure
           result_dict = {
               "response": {
                   "start": 0,
                   "docs": [
                       {
                           "g": group.text if (group is not None) else "",
                           "a": artifact.text if (artifact is not None) else "",
                           "v": version,
                           "timestamp": timestamp_ms,
                       }
                   ],
               }
           }
           result = basic.as_json(result_dict)
           return outcome.Result(result)
       except (aiohttp.ClientError, DistributionError) as e:
           return outcome.Error(e)
       except ElementTree.ParseError as e:
           return outcome.Error(RuntimeError(f"Failed to parse Maven XML: {e}"))
   ```
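
As a rough illustration of what more helpful errors could look like here, the sketch below parses a `maven-metadata.xml` document and names the coordinates in every failure. It is not ATR code: `check_maven_metadata` and `SAMPLE_XML` are hypothetical, it raises `ValueError` rather than returning an `outcome.Error`, and it uses the standard-library parser only to stay self-contained (the excerpt above uses `defusedxml`, which is the safer choice for untrusted input).

```python
import datetime
import xml.etree.ElementTree as ElementTree  # stdlib stand-in; the excerpt above uses defusedxml

# Hypothetical sample document in the usual maven-metadata.xml shape.
SAMPLE_XML = """<metadata>
  <groupId>org.apache.example</groupId>
  <artifactId>example-core</artifactId>
  <versioning>
    <versions>
      <version>1.0.0</version>
      <version>1.1.0</version>
    </versions>
    <lastUpdated>20240102030405</lastUpdated>
  </versioning>
</metadata>"""


def check_maven_metadata(xml_text: str, version: str) -> dict:
    """Hypothetical helper: parse maven-metadata.xml, naming the coordinates in errors."""
    root = ElementTree.fromstring(xml_text)
    group = root.findtext("groupId", default="")
    artifact = root.findtext("artifactId", default="")
    coords = f"{group}:{artifact}"

    versioning = root.find("versioning")
    if versioning is None:
        raise ValueError(f"No versioning element found in Maven metadata for {coords}")

    last_updated = versioning.findtext("lastUpdated")
    if not last_updated:
        raise ValueError(f"No lastUpdated timestamp found in Maven metadata for {coords}")

    # Only verify the version when a <versions> list is present, as in the excerpt.
    versions_elem = versioning.find("versions")
    if versions_elem is not None:
        versions = [v.text for v in versions_elem.findall("version") if v.text]
        if version not in versions:
            known = ", ".join(versions) or "none"
            raise ValueError(f"Version '{version}' not found for {coords}; published versions: {known}")

    # lastUpdated has the form yyyyMMddHHmmss and is treated as UTC.
    dt = datetime.datetime.strptime(last_updated, "%Y%m%d%H%M%S")
    dt = dt.replace(tzinfo=datetime.timezone.utc)
    return {"g": group, "a": artifact, "v": version, "timestamp": int(dt.timestamp() * 1000)}
```

Calling `check_maven_metadata(SAMPLE_XML, "9.9.9")` raises an error that lists the coordinates and the published versions, which gives the user something concrete to compare against what they typed.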
   
   #### `atr/storage/writers/distributions.py` — 
`CommitteeMember.record_from_data` (lines 216-247)
   _needs modification_
   The error message when Maven lookup fails is generic - it should provide 
guidance about the groupId format when the platform is Maven.
   
   ```python
       async def record_from_data(
           self,
           release_key: models.safe.ReleaseKey,
           staging: bool,
           dd: models.distribution.Data,
           allow_retries: bool = False,
       ) -> tuple[models.sql.Distribution, bool, models.distribution.Metadata | 
None]:
           api_url = distribution.get_api_url(dd, staging)
           if dd.platform == models.sql.DistributionPlatform.MAVEN:
               api_oc = await distribution.json_from_maven_xml(api_url, 
dd.version)
           else:
               api_oc = await 
distribution.json_from_distribution_platform(api_url, dd.platform, dd.version)
           match api_oc:
               case outcome.Result(result):
                   pass
               case outcome.Error(error):
                   log.error(f"Failed to get API response from {api_url}: 
{error}")
                   if allow_retries:
                       dist, added = await self.record(
                           release_key=release_key,
                           platform=dd.platform,
                           owner_namespace=dd.owner_namespace,
                           package=dd.package,
                           version=dd.version,
                           staging=staging,
                           pending=True,
                           upload_date=None,
                           api_url=None,
                           web_url=None,
                       )
                       return dist, added, None
                   raise storage.AccessError(f"Failed to get API response from 
distribution platform: {error}", status=502)
   ```
   
   #### `atr/get/distribution.py` — `_automate_form_page` (lines 191-212)
   _needs modification_
   The form page for automation could include Maven-specific documentation/help 
text explaining the groupId concept.
   
   ```python
   async def _automate_form_page(project: safe.ProjectKey, version: 
safe.VersionKey, staging: bool) -> str:
       """Helper to render the distribution automation form page."""
       await shared.distribution.release_validated(project, version, 
staging=staging)
   
       block = htm.Block()
       render.html_nav_phase(block, str(project), str(version), staging=staging)
   
       title = "Create a staging distribution" if staging else "Create a 
distribution"
       block.h1[title]
   
       block.p[
           "Create a distribution of ",
           htm.strong[f"{project}-{version}"],
           " using the form below.",
       ]
       block.p[
           "You can also ",
           htm.a(href=util.as_url(list_get, project_key=str(project), 
version_key=str(version)))[
               "view the distribution list"
           ],
           ".",
       ]
   ```
   
   #### `atr/get/distribution.py` — `_record_form_page` (lines 267-288)
   _needs modification_
   The record form page similarly lacks Maven-specific documentation about the 
groupId.
   
   ```python
   async def _record_form_page(project: safe.ProjectKey, version: 
safe.VersionKey, staging: bool) -> str:
       """Helper to render the distribution recording form page."""
       await shared.distribution.release_validated(project, version, 
staging=staging)
   
       block = htm.Block()
       render.html_nav_phase(block, str(project), str(version), staging=staging)
   
       title = "Record a manual staging distribution" if staging else "Record a 
manual distribution"
       block.h1[title]
   
       block.p[
           "Record a manual distribution of ",
           htm.strong[f"{project}-{version}"],
           " using the form below.",
       ]
       block.p[
           "You can also ",
           htm.a(href=util.as_url(list_get, project_key=str(project), 
version_key=str(version)))[
               "view the distribution list"
           ],
           ".",
       ]
   ```
   
   ### Where new code would go
   - `atr/models/sql.py` — after Project model definition
     Per @dave2wave's decision, Maven coordinates should be stored as a 
project-level setting. This would require a new field or related table in the 
SQL models.
   - `atr/get/distribution.py` — after _automate_form_page block.p statements 
(around line 192)
     Add Maven-specific help text explaining what a groupId is and how to find 
it.
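
To make the storage option concrete, here is a minimal sketch of a separate table that allows several groupIds per project. Everything in it is an assumption for illustration: the table and column names are invented, and it uses raw `sqlite3` to stay self-contained instead of the model classes in `atr/models/sql.py`, whose definitions were not examined.

```python
import sqlite3
from typing import List


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a hypothetical project table plus a one-to-many groupId table."""
    conn.executescript(
        """
        CREATE TABLE project (name TEXT PRIMARY KEY);
        CREATE TABLE project_maven_group (
            project_name TEXT NOT NULL REFERENCES project(name),
            group_id     TEXT NOT NULL,
            PRIMARY KEY (project_name, group_id)
        );
        """
    )


def add_group_id(conn: sqlite3.Connection, project_name: str, group_id: str) -> None:
    """Register a groupId for a project; duplicates are ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO project_maven_group (project_name, group_id) VALUES (?, ?)",
        (project_name, group_id),
    )


def group_ids_for(conn: sqlite3.Connection, project_name: str) -> List[str]:
    """Return the groupIds registered for a project, sorted alphabetically."""
    rows = conn.execute(
        "SELECT group_id FROM project_maven_group WHERE project_name = ? ORDER BY group_id",
        (project_name,),
    )
    return [row[0] for row in rows]
```

A single nullable column on the project table would be simpler, but it could not represent a project that publishes under more than one groupId.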
   
   ### Proposed approach
   The issue has two parts: (1) an immediate UX improvement to the form help 
text and error messages, explaining what a Maven groupId is and how to find 
it; and (2) a larger architectural change, per @dave2wave's decision, to make 
Maven coordinates a project-level setting.
   
   For the immediate improvement, we should enhance the form labels, add 
inline help text on the distribution form pages, and improve the error 
messages shown when Maven lookups fail (e.g., 'The Maven groupId you provided 
does not appear to exist. For Apache projects, this is typically 
org.apache.<project-name>. Check your project's existing artifacts on 
search.maven.org.').
   
   The project-level setting would require schema changes to 
`atr/models/sql.py` (adding a maven_coordinates field to projects), a project 
settings UI, and form pre-population logic. This is a separate body of work 
that should be tracked as its own implementation task.
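
One way to keep the Maven-specific wording out of the storage writer is a small message builder, sketched below. Both function names are hypothetical, and the URL helper assumes the standard Maven repository layout on Maven Central (dots in the groupId become path separators). ATR's real URL comes from `distribution.get_api_url`, which may differ, especially for staging.

```python
def maven_metadata_url(group_id: str, artifact_id: str, base: str = "https://repo1.maven.org/maven2") -> str:
    """Build a maven-metadata.xml URL; dots in the groupId become path separators."""
    return f"{base}/{group_id.replace('.', '/')}/{artifact_id}/maven-metadata.xml"


def lookup_failure_message(platform: str, owner_namespace: str, package: str, error: Exception) -> str:
    """Return a platform-aware message for a failed distribution lookup (hypothetical helper)."""
    generic = f"Failed to get API response from distribution platform: {error}"
    if platform.lower() != "maven":
        return generic
    checked = maven_metadata_url(owner_namespace, package)
    return (
        f"Could not find '{package}' under the Maven groupId '{owner_namespace}'. "
        "For most Apache projects the groupId is org.apache.<project-name>. "
        f"Checked: {checked}. Original error: {error}"
    )
```

Including the checked URL in the message lets the user open it directly and see whether the groupId or the artifactId is the part that is wrong.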
   
   ### Suggested patches
   
   #### `atr/shared/distribution.py`
   Improve form help text for the owner_namespace field to give Maven-specific 
guidance
   
   ````diff
   --- a/atr/shared/distribution.py
   +++ b/atr/shared/distribution.py
   @@ -100,9 +100,11 @@ class DistributionAutomateForm(form.Form):
        platform: form.Enum[DistributionPlatform] = form.label(
            "Platform", widget=form.Widget.SELECT, 
enum_filter_include=[DistributionPlatform.MAVEN.value]
        )
        owner_namespace: safe.OptionalAlphanumeric = form.label(
            "Owner or Namespace",
   -        "Who owns or names the package (Maven groupId, npm @scope, Docker 
namespace, "
   -        "GitHub owner, ArtifactHub repo). Leave blank if not used.",
   +        "For Maven Central, this is the groupId (e.g. org.apache.maven). "
   +        "For most Apache projects it is org.apache.<project-name>. "
   +        "You can find your groupId by searching for your artifact on 
search.maven.org. "
   +        "For other platforms: npm @scope, Docker namespace, GitHub owner, 
ArtifactHub repo. "
   +        "Leave blank if not used.",
        )
        package: safe.Alphanumeric = form.label("Package")
        version: safe.VersionKey = form.label("Version")
   @@ -124,9 +126,11 @@ class DistributionRecordForm(form.Form):
        platform: form.Enum[DistributionPlatform] = form.label("Platform", 
widget=form.Widget.SELECT)
        owner_namespace: safe.OptionalAlphanumeric = form.label(
            "Owner or Namespace",
   -        "Who owns or names the package (Maven groupId, npm @scope, Docker 
namespace, "
   -        "GitHub owner, ArtifactHub repo). Leave blank if not used.",
   +        "For Maven Central, this is the groupId (e.g. org.apache.maven). "
   +        "For most Apache projects it is org.apache.<project-name>. "
   +        "You can find your groupId by searching for your artifact on 
search.maven.org. "
   +        "For other platforms: npm @scope, Docker namespace, GitHub owner, 
ArtifactHub repo. "
   +        "Leave blank if not used.",
        )
        package: safe.Alphanumeric = form.label("Package")
        version: safe.VersionKey = form.label("Version")
   ````
   
   #### `atr/storage/writers/distributions.py`
   Improve error messages when Maven API lookup fails to give Maven-specific 
guidance about the groupId
   
   ````diff
   --- a/atr/storage/writers/distributions.py
   +++ b/atr/storage/writers/distributions.py
   @@ -211,7 +211,14 @@ class CommitteeMember(CommitteeParticipant):
                case outcome.Error(error):
                    log.error(f"Failed to get API response from {api_url}: 
{error}")
                    if allow_retries:
                        dist, added = await self.record(
                            release_key=release_key,
                            platform=dd.platform,
                            owner_namespace=dd.owner_namespace,
                            package=dd.package,
                            version=dd.version,
                            staging=staging,
                            pending=True,
                            upload_date=None,
                            api_url=None,
                            web_url=None,
                        )
                        return dist, added, None
   -                raise storage.AccessError(f"Failed to get API response from 
distribution platform: {error}", status=502)
   +                if dd.platform == models.sql.DistributionPlatform.MAVEN:
   +                    raise storage.AccessError(
   +                        "Failed to find the artifact on Maven Central. "
   +                        "Please verify the groupId (Owner or Namespace) is correct. "
   +                        "For most Apache projects this is 'org.apache.<project-name>'. "
   +                        "You can verify by searching on https://search.maven.org. "
   +                        f"Original error: {error}",
   +                        status=502,
   +                    )
   +                raise storage.AccessError(f"Failed to get API response from 
distribution platform: {error}", status=502)
   ````
   
   #### `atr/get/distribution.py`
   Add Maven-specific help text on the distribution form pages
   
   ````diff
   --- a/atr/get/distribution.py
   +++ b/atr/get/distribution.py
   @@ -189,6 +189,16 @@ async def _automate_form_page(project: safe.ProjectKey, 
version: safe.VersionKey
            ".",
        ]
    
   +    block.append(
   +        htm.div(".alert.alert-info")[
   +            htm.strong["Maven Central tip: "],
   +            "The 'Owner or Namespace' field is the Maven groupId. ",
   +            "For most Apache projects, this is ",
   +            htm.code["org.apache.<project-name>"],
   +            ". You can verify your groupId by searching on ",
   +            htm.a(href="https://search.maven.org")["search.maven.org"],
   +            ".",
   +        ]
   +    )
   +
        # Determine the action based on staging
        action = (
            util.as_url(post.distribution.stage_automate_selected, 
project_key=str(project), version_key=str(version))
   ````
   
   ### Open questions
   - The larger feature (Maven coordinates as project-level settings stored in 
SQL and eventually configurable via .asf.yaml) needs a design discussion - what 
does the schema look like? A new column on the Project table, or a separate 
table supporting multiple groupIds per project?
   - Should validation check the provided groupId against a known list of 
Apache groupIds (per @dave2wave's mapping), or just validate format and 
reachability?
   - How will this interact with Nexus3 distribution workflows that @dave2wave 
mentioned discussing with @alitheg?
   - I could not verify the exact location of 
`util.validate_distribution_owner_namespace` - this function likely also needs 
improved error messages for Maven but I haven't seen its source.
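
For the format-only side of the validation question, a check could look like the sketch below. The pattern is an assumption chosen to be conservative (dot-separated segments of letters, digits, underscores and hyphens), and `group_id_problem` is a hypothetical name. It does not reproduce whatever `util.validate_distribution_owner_namespace` currently does.

```python
import re
from typing import Optional

# Assumed pattern: one or more dot-separated segments of [A-Za-z0-9_-].
_SEGMENT = r"[A-Za-z0-9_-]+"
_GROUP_ID_RE = re.compile(rf"{_SEGMENT}(?:\.{_SEGMENT})*")


def group_id_problem(group_id: str) -> Optional[str]:
    """Return a human-readable problem with the groupId, or None if the format looks fine."""
    if not group_id:
        return "A Maven groupId is required, for example org.apache.<project-name>."
    if not _GROUP_ID_RE.fullmatch(group_id):
        return (
            f"'{group_id}' does not look like a Maven groupId. "
            "Use dot-separated names such as org.apache.<project-name>."
        )
    return None
```

A format check like this only catches typos early. Whether the groupId actually exists still has to be confirmed by the metadata lookup or by a known list of Apache groupIds.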
   
   ### Files examined
   - `atr/models/distribution.py`
   - `atr/get/distribution.py`
   - `atr/shared/distribution.py`
   - `atr/post/distribution.py`
   - `atr/storage/writers/distributions.py`
   - `atr/tasks/distribution.py`
   
   ---
   *Draft from a triage agent. A human reviewer should validate before merging 
any change. The agent did not run tests or verify diffs apply.*


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

