asf-tooling commented on issue #913:
URL:
https://github.com/apache/tooling-trusted-releases/issues/913#issuecomment-4410053450
<!-- gofannon-issue-triage-bot v2 -->
**Automated triage** — analyzed at `main@2da7807a`
**Type:** `new_feature` • **Classification:** `actionable` •
**Confidence:** `medium`
**Application domain(s):** `project_committee_management`
### Summary
Issue #913 requests adding project reference metadata fields (homepage,
repository, download page, bug database, mailing lists, lifecycle page,
standards) to ATR's project model. @gstein noted this looks like DOAP data and
asked about storage; @dave2wave confirmed and pointed to issue #139 for
context. The external data source (`ProjectStatus` in
`atr/datasources/apache.py`) already parses most of these fields from
projects.apache.org, but `_update_projects()` currently discards them. The
implementation requires new DB columns on `sql.Project`, persisting fetched
data, and adding UI/forms for viewing/editing these fields.
### Where this lives in the code today
#### `atr/datasources/apache.py` — `ProjectStatus` (lines 192-209)
_currently does this_
Already parses homepage, repository, bug_database, download_page,
mailing_list, shortdesc, and implements (standards) from the external DOAP data
source — but these are not persisted to the database.
```python
class ProjectStatus(schema.Strict):
    category: list[str] = schema.factory(list)
    created: str | None = None
    description: str | None = None
    programming_language: list[str] = schema.Field(alias="programming-language", default_factory=list)
    doap: str | None = None
    homepage: str
    name: str
    pmc: str | None
    shortdesc: str | None = None
    repository: list[str | dict] = schema.factory(list)
    release: list[Release] = schema.factory(list)
    bug_database: str | None = schema.alias_opt("bug-database")
    download_page: str | None = schema.alias_opt("download-page")
    license: str | None = None
    mailing_list: str | None = schema.alias_opt("mailing-list")
    maintainer: list[MaintainerInfo] = schema.factory(list)
    implements: list[ImplementsInfo] = schema.factory(list)
```
#### `atr/storage/writers/project.py` — `CommitteeMember` (lines 77-85)
_extension point_
Pattern for metadata management (category/language add/remove) should be
followed for the new reference metadata fields.
```python
class CommitteeMember(CommitteeParticipant):
    ...
    async def category_add(self, project_key: safe.ProjectKey, new_category: str) -> bool:
        ...
    async def category_remove(self, project_key: safe.ProjectKey, action_value: str) -> bool:
        ...
    async def language_add(self, project_key: safe.ProjectKey, new_language: str) -> bool:
        ...
    async def language_remove(self, project_key: safe.ProjectKey, action_value: str) -> bool:
        ...
```
#### `atr/shared/projects.py` — `AddCategoryForm` (lines 248-251)
_extension point_
Existing form pattern for metadata editing that would be followed for a new
ReferenceMetadataForm.
```python
class AddCategoryForm(form.Form):
    variant: ADD_CATEGORY = form.value(ADD_CATEGORY)
    project_key: safe.ProjectKey = form.label("Project name", widget=form.Widget.HIDDEN)
    category_to_add: str = form.label("New category name")
```
### Where new code would go
- `atr/models/sql.py`, after the existing `Project` fields (category,
  description, programming_languages):
  new nullable string columns on `sql.Project`: homepage, short_description,
  repository_urls, bug_database, download_page, mailing_list, lifecycle_page,
  standards.
- `atr/shared/projects.py`, after the `DeleteProjectForm` class:
  a new form class `ReferenceMetadataForm` for editing the reference metadata
  fields.
- `atr/get/projects.py`, after the `_render_description_card` function:
  a new `_render_reference_metadata_card` function to display homepage,
  repository, bug database, and the other reference fields.
### Proposed approach
The implementation has two phases:
1. Auto-population from existing DOAP data. Add new columns to `sql.Project`
   (homepage, short_description, repository_urls, bug_database, download_page,
   mailing_list, lifecycle_page, standards) and update `_update_projects()` to
   persist the corresponding fields from `ProjectStatus`.
2. Manual editing UI. Add a new form in `shared/projects.py` (e.g.,
   `ReferenceMetadataForm`), a POST handler in `post/projects.py`, a writer
   method on `CommitteeMember` in `storage/writers/project.py`, and a view/edit
   card in `get/projects.py`.

`lifecycle_page` has no counterpart in the upstream DOAP data, so it would only
be editable manually. `standards` has no field of that name upstream either,
but the `implements` list from DOAP could seed it. Repository URLs are a list
in the upstream data, so they need to be flattened into a single string, much
as `programming_languages` is. The patch below joins them with newlines rather
than ", ", which avoids ambiguity because URLs can themselves contain commas.
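
The two phases interact: if the sync assigns every field unconditionally, a
later sync could overwrite values that a committee member edited by hand. A
minimal sketch of one possible policy, assuming manual values should win and
upstream data only fills gaps. `fill_missing` and the dict-based records are
hypothetical and not part of ATR; the field names are the proposed column
names.

```python
from __future__ import annotations

from collections.abc import Mapping

# Proposed column names from this triage note; none exist on sql.Project yet.
REFERENCE_FIELDS = (
    "homepage",
    "short_description",
    "repository_urls",
    "bug_database",
    "download_page",
    "mailing_list",
    "lifecycle_page",
    "standards",
)


def fill_missing(
    current: Mapping[str, str | None],
    upstream: Mapping[str, str | None],
) -> dict[str, str | None]:
    """Return a copy of `current` with empty reference fields filled from `upstream`."""
    merged = dict(current)
    for field in REFERENCE_FIELDS:
        # Keep any non-empty stored value; otherwise take the upstream one,
        # normalising empty strings to None.
        if not merged.get(field):
            merged[field] = upstream.get(field) or None
    return merged
```

The opposite policy, where upstream always wins, is simpler, but under it a
manual edit to any field that DOAP also provides would not survive the next
sync. Which policy is wanted is a product decision rather than something this
sketch settles.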
### Suggested patches
#### `atr/datasources/apache.py`
Persist reference metadata fields from ProjectStatus into the Project model
during sync
````diff
--- a/atr/datasources/apache.py
+++ b/atr/datasources/apache.py
@@ -413,6 +413,16 @@ async def _update_projects(data: db.Session, projects: ProjectsData) -> tuple[in
         project_model.name = str(project_status.name)
         project_model.category = ", ".join(project_status.category) or None
         project_model.description = project_status.description
         project_model.programming_languages = ", ".join(project_status.programming_language) or None
+        project_model.homepage = project_status.homepage or None
+        project_model.short_description = project_status.shortdesc or None
+        project_model.bug_database = project_status.bug_database or None
+        project_model.download_page = project_status.download_page or None
+        project_model.mailing_list = project_status.mailing_list or None
+        # Repository can be a list of strings or dicts; extract URLs
+        repo_urls = [
+            r if isinstance(r, str) else r.get("location", "") for r in project_status.repository
+        ]
+        project_model.repository_urls = "\n".join(url for url in repo_urls if url) or None
+        # Standards from implements
+        standards_urls = [impl.url for impl in project_status.implements if impl.url]
+        project_model.standards = "\n".join(standards_urls) or None
     return added_count, updated_count
````
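
The repository handling in this patch could be pulled out into a small helper
so it can be tested without a database. A sketch, assuming dict entries carry
the URL under a `location` key (an unverified guess, see the open questions).
`extract_repository_urls` and `split_repository_urls` are hypothetical names,
not existing ATR functions.

```python
from __future__ import annotations


def extract_repository_urls(entries: list[str | dict], key: str = "location") -> str | None:
    """Flatten mixed str/dict repository entries into a newline-separated string."""
    urls: list[str] = []
    for entry in entries:
        # ASSUMPTION: dict entries hold the URL under `key`; verify against real data.
        value = entry if isinstance(entry, str) else entry.get(key, "")
        if isinstance(value, str) and value.strip():
            urls.append(value.strip())
    return "\n".join(urls) or None


def split_repository_urls(stored: str | None) -> list[str]:
    """Inverse of extract_repository_urls, for rendering the stored column."""
    return [line.strip() for line in (stored or "").splitlines() if line.strip()]
```

Keeping the dict key as a parameter means only the default needs changing once
the real shape of the upstream `repository` entries has been confirmed.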
#### `atr/get/projects.py`
Add a reference metadata card to the project view page
````diff
--- a/atr/get/projects.py
+++ b/atr/get/projects.py
@@ -170,6 +170,7 @@ async def view(
     page.append(_render_project_label_card(project))
     page.append(_render_pmc_card(project))
     page.append(_render_description_card(project))
+    page.append(_render_reference_metadata_card(project))
     if project.status == sql.ProjectStatus.ACTIVE:
         if can_edit:
@@ -300,6 +301,40 @@ def _render_description_card(project: sql.Project) -> htm.Element:
     return card.collect()
+def _render_reference_metadata_card(project: sql.Project) -> htm.Element:
+    card = htm.Block(htm.div, classes=".card.mb-4")
+    card.div(".card-header.bg-light")[htm.h3(".mb-2")["Reference metadata"]]
+
+    rows: list[htm.Element] = []
+    # TODO: confirm these attribute names match sql.Project columns once added
+    fields = [
+        ("Homepage", getattr(project, "homepage", None)),
+        ("Short description", getattr(project, "short_description", None)),
+        ("Download page", getattr(project, "download_page", None)),
+        ("Bug database", getattr(project, "bug_database", None)),
+        ("Mailing list", getattr(project, "mailing_list", None)),
+        ("Lifecycle page", getattr(project, "lifecycle_page", None)),
+    ]
+    for label, value in fields:
+        if value:
+            content: htm.Element
+            if value.startswith("http"):
+                content = htm.a(href=value)[value]
+            else:
+                content = htm.span[value]
+            rows.append(htm.tr[htm.th(".border-0.w-25")[label], htm.td(".text-break.border-0")[content]])
+        else:
+            rows.append(htm.tr[htm.th(".border-0.w-25")[label], htm.td(".text-muted.border-0")["Not set"]])
+
+    # Repository URLs (multi-value)
+    repo_urls = (getattr(project, "repository_urls", None) or "").split("\n")
+    repo_links = [htm.div[htm.a(href=url)[url]] for url in repo_urls if url.strip()]
+    rows.append(htm.tr[htm.th(".border-0.w-25")["Repository"], htm.td(".text-break.border-0")[repo_links or "Not set"]])
+
+    card.div(".card-body")[htm.table(".table.mb-0")[htm.tbody[*rows]]]
+    return card.collect()
+
+
````
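
The card above decides whether to render a value as a link with
`value.startswith("http")`, which would also accept strings such as
`httpfoo://...`, and it links repository URLs without any check. A stricter
test is sketched below using only the standard library. `is_web_link` is a
hypothetical helper name, and whether ATR already has an equivalent has not
been checked.

```python
from urllib.parse import urlsplit


def is_web_link(value: str) -> bool:
    """Return True only for absolute http(s) URLs with a host component."""
    try:
        parts = urlsplit(value.strip())
    except ValueError:
        # urlsplit raises ValueError for some malformed inputs (e.g. bad IPv6 brackets).
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

With a check like this, values such as a bare mailing-list address or a
`javascript:` URI would fall through to the plain-text branch instead of
becoming an `href`.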
### Open questions
- I don't have access to `atr/models/sql.py` to confirm the exact `Project`
model fields. New columns (homepage, short_description, repository_urls,
bug_database, download_page, mailing_list, lifecycle_page, standards) need to
be added there — likely as `str | None` with `default=None`.
- The `repository` field in ProjectStatus is `list[str | dict]` — the dict
format needs investigation to determine the correct key for extracting URLs (I
guessed 'location' based on DOAP conventions).
- Issue #139 is referenced but not available — it may contain additional
context on where/how this metadata should be stored (e.g., whether it should
also be exposed via the JSON API).
- Should these fields be editable by committee members, or should they only
be auto-populated from DOAP data? The issue implies PMC members should be able
to set lifecycle_page (which isn't in DOAP).
- Database migration strategy — the codebase uses SQLite with SQLModel; need
to confirm how schema migrations are handled (alembic, manual, or auto-create).
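
On the migration question, if it turns out that schema changes are applied by
hand, adding nullable columns to SQLite can be done idempotently with the
standard library. The sketch below assumes the table is named `project`, which
has not been verified. If the codebase uses a migration tool such as Alembic,
the equivalent change belongs in a migration script instead.

```python
import sqlite3

# Proposed column names from this triage note.
NEW_COLUMNS = (
    "homepage",
    "short_description",
    "repository_urls",
    "bug_database",
    "download_page",
    "mailing_list",
    "lifecycle_page",
    "standards",
)


def add_missing_columns(conn: sqlite3.Connection, table: str = "project") -> list[str]:
    """Add any proposed columns the table lacks; return the names that were added."""
    # `table` and the column names are interpolated into SQL, so they must be
    # trusted constants, never user input.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    added: list[str] = []
    for column in NEW_COLUMNS:
        if column not in existing:
            # Columns added this way are nullable and default to NULL.
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} TEXT")
            added.append(column)
    conn.commit()
    return added
```

Running the function twice is safe: the second call finds every column present
and returns an empty list.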
### Files examined
- `atr/datasources/apache.py`
- `atr/get/projects.py`
- `atr/post/projects.py`
- `atr/storage/writers/project.py`
- `atr/shared/projects.py`
---
*Draft from a triage agent. A human reviewer should validate before merging
any change. The agent did not run tests or verify diffs apply.*
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]