asf-tooling commented on issue #462:
URL:
https://github.com/apache/tooling-trusted-releases/issues/462#issuecomment-4410283305
<!-- gofannon-issue-triage-bot v2 -->
**Automated triage** — analyzed at `main@2da7807a`
**Type:** `new_feature` • **Classification:** `actionable` •
**Confidence:** `medium`
**Application domain(s):** `project_committee_management`
### Summary
The issue reports that while the `ProjectStatus` data source model in
`atr/datasources/apache.py` correctly parses `chair` and `charter` fields from
upstream APIs (projects.apache.org), the `_update_projects` function never
persists these fields to the `sql.Project` database model. @sebbASF also noted
the lack of charter support, and @dave2wave acknowledged that 'the project
schema needs to be cleaned up along with a refactoring of the update.' The fix
requires adding `chair` and `charter` columns to the SQL model and updating the
sync logic.
### Where this lives in the code today
#### `atr/datasources/apache.py` — `ProjectStatus` (lines 192-221)
_currently does this_
The data source model already parses `chair` and `charter` from upstream
JSON, but this data is never stored in the database.
```python
class ProjectStatus(schema.Strict):
    category: list[str] = schema.factory(list)
    created: str | None = None
    description: str | None = None
    programming_language: list[str] = schema.Field(alias="programming-language", default_factory=list)
    doap: str | None = None
    homepage: str
    name: str
    pmc: str | None
    shortdesc: str | None = None
    repository: list[str | dict] = schema.factory(list)
    release: list[Release] = schema.factory(list)
    bug_database: str | None = schema.alias_opt("bug-database")
    download_page: str | None = schema.alias_opt("download-page")
    license: str | None = None
    mailing_list: str | None = schema.alias_opt("mailing-list")
    maintainer: list[MaintainerInfo] = schema.factory(list)
    implements: list[ImplementsInfo] = schema.factory(list)
    same_as: str | None = schema.alias_opt("sameAs")
    developer: list[MaintainerInfo] = schema.factory(list)
    modified: str | None = None
    chair: ChairInfo | None = None
    charter: str | None = None
    vendor: str | None = None
    helper: list[HelperInfo] = schema.factory(list)
    member: list[MaintainerInfo] = schema.factory(list)
    shortname: str | None = None
    wiki: str | None = None
    account: AccountInfo | None = None
    platform: str | None = None
```
#### `atr/datasources/apache.py` — `_update_projects` (lines 453-467)
_needs modification_
This function transfers data from the upstream ProjectStatus model to the
sql.Project model, but omits `chair` and `charter` fields.
```python
async def _update_projects(data: db.Session, projects: ProjectsData) -> tuple[int, int]:
    added_count = 0
    updated_count = 0
    # Add projects and associate them with the right PMC
    for project_key, project_status in projects.items():
        ...
        # Pass the project name through the validator
        safe.ProjectKey(project_model.key)
        project_model.name = str(project_status.name)
        project_model.category = ", ".join(project_status.category) or None
        project_model.description = project_status.description
        project_model.programming_languages = ", ".join(project_status.programming_language) or None
    return added_count, updated_count
```
#### `atr/datasources/apache.py` — `ChairInfo` (lines 156-157)
_currently does this_
The ChairInfo model parses the nested Person structure from the upstream
DOAP data, containing name/homepage/mbox of the chair.
```python
class ChairInfo(schema.Strict):
    person: PersonInfo | None = schema.alias_opt("Person")
```
#### `atr/storage/writers/project.py` — `CommitteeMember.create` (lines 141-154)
_currently does this_
Project creation does not set chair or charter fields, which suggests
(but does not confirm) that the SQL model lacks these columns.
```python
async def create(self, committee_key: safe.CommitteeKey, display_name: str, label: str) -> None:
    ...
    project = sql.Project(
        key=label,
        name=display_name,
        status=sql.ProjectStatus.ACTIVE,
        super_project_key=super_project.key if super_project else None,
        description=super_project.description if super_project else None,
        category=super_project.category if super_project else None,
        programming_languages=super_project.programming_languages if super_project else None,
        committee_key=str(committee_key),
        created=datetime.datetime.now(datetime.UTC),
        created_by=self.__asf_uid,
    )
```
### Where new code would go
- `atr/models/sql.py` — in the Project class definition
New `chair` (str | None) and `charter` (str | None) columns need to be
added to the sql.Project model.
### Proposed approach
This issue requires a two-part change: (1) Add `chair` and `charter` fields
to the `sql.Project` model in `atr/models/sql.py`, and (2) update the
`_update_projects` function in `atr/datasources/apache.py` to persist these
values from the upstream data source. The chair info from `ProjectStatus` is a
nested `ChairInfo` object containing a `PersonInfo` with name/homepage/mbox —
the simplest approach is to store the chair's name as a string on the Project
model. The charter is already a plain string. A database migration will also be
needed.
As @dave2wave noted, this should be part of a broader project schema cleanup
and update refactoring. The UI in `atr/get/projects.py` could also be updated
to display chair and charter information, but the core fix is in the data
persistence layer.
### Suggested patches
#### `atr/datasources/apache.py`
Persist chair name and charter from upstream ProjectStatus to the
sql.Project model during sync
````diff
--- a/atr/datasources/apache.py
+++ b/atr/datasources/apache.py
@@ -405,6 +405,12 @@
         project_model.category = ", ".join(project_status.category) or None
         project_model.description = project_status.description
         project_model.programming_languages = ", ".join(project_status.programming_language) or None
+        # Persist chair name from upstream DOAP data
+        if project_status.chair and project_status.chair.person and project_status.chair.person.name:
+            project_model.chair = project_status.chair.person.name
+        else:
+            project_model.chair = None
+        project_model.charter = project_status.charter
     return added_count, updated_count
````
#### `atr/models/sql.py`
Add chair and charter columns to the Project model (exact location within
the class TBD since file not fully provided)
````diff
--- a/atr/models/sql.py
+++ b/atr/models/sql.py
@@ -0,0 +0,0 @@ class Project(...):
+    # TODO: confirm exact placement within Project class fields
+    chair: str | None = sqlmodel.Field(default=None)
+    charter: str | None = sqlmodel.Field(default=None)
````
### Open questions
- The exact structure of `sql.Project` in `atr/models/sql.py` is not
available — need to confirm the model definition and where to place new fields.
- Should chair be stored as a plain name string, or should it reference a
user ID (ASF UID) for richer integration?
- A database migration (Alembic or manual ALTER TABLE) will be needed — what
migration strategy does this project use?
- Should the chair field on Project duplicate what's already available via
the Committee model (which has chair info from Whimsy's committee-info.json),
or is this specifically about the DOAP project-level chair?
- @dave2wave mentioned a broader schema cleanup and update refactoring —
should this be deferred until that work is scoped?
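
If the answer to the migration question turns out to be a manual `ALTER TABLE`, an idempotent version could look roughly like the sketch below. It assumes a SQLite database and a table named `project`. Neither is confirmed by the files examined, and `add_column_if_missing` is a hypothetical helper. With Alembic, the equivalent change would live in a migration script instead.

```python
import sqlite3


def add_column_if_missing(conn: sqlite3.Connection, table: str, column: str, decl: str) -> bool:
    """Add a column unless it already exists; return True if it was added."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    # Identifiers cannot be bound as parameters, so only pass trusted names here.
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
    return True


# Demonstration against an in-memory database with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE project ("key" TEXT PRIMARY KEY, name TEXT)')
for col in ("chair", "charter"):
    add_column_if_missing(conn, "project", col, "TEXT")
```

Running the helper a second time is a no-op, which keeps the change safe to apply on databases that already have the columns.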
### Files examined
- `atr/datasources/apache.py`
- `atr/get/projects.py`
- `atr/post/projects.py`
- `atr/storage/writers/project.py`
- `atr/shared/projects.py`
### Related issues
This issue appears related to: #468, #465.
_All address data model issues with committee and project metadata schema_
---
*Draft from a triage agent. A human reviewer should validate before merging
any change. The agent did not run tests or verify diffs apply.*
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]