asf-tooling commented on issue #462:
URL: 
https://github.com/apache/tooling-trusted-releases/issues/462#issuecomment-4410283305

   <!-- gofannon-issue-triage-bot v2 -->
   
   **Automated triage** — analyzed at `main@2da7807a`
   
   **Type:** `new_feature`  •  **Classification:** `actionable`  •  **Confidence:** `medium`
   **Application domain(s):** `project_committee_management`
   
   ### Summary
   The issue reports that while the `ProjectStatus` data source model in 
`atr/datasources/apache.py` correctly parses `chair` and `charter` fields from 
upstream APIs (projects.apache.org), the `_update_projects` function never 
persists these fields to the `sql.Project` database model. @sebbASF also noted 
the lack of charter support, and @dave2wave acknowledged that 'the project 
schema needs to be cleaned up along with a refactoring of the update.' The fix 
requires adding `chair` and `charter` columns to the SQL model and updating the 
sync logic.
   
   ### Where this lives in the code today
   
   #### `atr/datasources/apache.py` — `ProjectStatus` (lines 192-221)
   _currently does this_
   The data source model already parses `chair` and `charter` from upstream 
JSON, but this data is never stored in the database.
   
   ```python
   class ProjectStatus(schema.Strict):
       category: list[str] = schema.factory(list)
       created: str | None = None
       description: str | None = None
       programming_language: list[str] = schema.Field(alias="programming-language", default_factory=list)
       doap: str | None = None
       homepage: str
       name: str
       pmc: str | None
       shortdesc: str | None = None
       repository: list[str | dict] = schema.factory(list)
       release: list[Release] = schema.factory(list)
       bug_database: str | None = schema.alias_opt("bug-database")
       download_page: str | None = schema.alias_opt("download-page")
       license: str | None = None
       mailing_list: str | None = schema.alias_opt("mailing-list")
       maintainer: list[MaintainerInfo] = schema.factory(list)
       implements: list[ImplementsInfo] = schema.factory(list)
       same_as: str | None = schema.alias_opt("sameAs")
       developer: list[MaintainerInfo] = schema.factory(list)
       modified: str | None = None
       chair: ChairInfo | None = None
       charter: str | None = None
       vendor: str | None = None
       helper: list[HelperInfo] = schema.factory(list)
       member: list[MaintainerInfo] = schema.factory(list)
       shortname: str | None = None
       wiki: str | None = None
       account: AccountInfo | None = None
       platform: str | None = None
   ```
   
   #### `atr/datasources/apache.py` — `_update_projects` (lines 453-467)
   _needs modification_
   This function transfers data from the upstream ProjectStatus model to the 
sql.Project model, but omits `chair` and `charter` fields.
   
   ```python
   async def _update_projects(data: db.Session, projects: ProjectsData) -> tuple[int, int]:
       added_count = 0
       updated_count = 0
   
       # Add projects and associate them with the right PMC
       for project_key, project_status in projects.items():
           ...
           # Pass the project name through the validator
           safe.ProjectKey(project_model.key)
           project_model.name = str(project_status.name)
           project_model.category = ", ".join(project_status.category) or None
           project_model.description = project_status.description
           project_model.programming_languages = ", ".join(project_status.programming_language) or None
   
       return added_count, updated_count
   ```
   
   #### `atr/datasources/apache.py` — `ChairInfo` (lines 156-157)
   _currently does this_
   The ChairInfo model parses the nested Person structure from the upstream 
DOAP data, containing name/homepage/mbox of the chair.
   
   ```python
   class ChairInfo(schema.Strict):
       person: PersonInfo | None = schema.alias_opt("Person")
   ```
   
   #### `atr/storage/writers/project.py` — `CommitteeMember.create` (lines 141-154)
   _currently does this_
   Project creation does not set chair or charter fields, which suggests that 
the SQL model currently lacks these columns.
   
   ```python
       async def create(self, committee_key: safe.CommitteeKey, display_name: str, label: str) -> None:
           ...
           project = sql.Project(
               key=label,
               name=display_name,
               status=sql.ProjectStatus.ACTIVE,
               super_project_key=super_project.key if super_project else None,
               description=super_project.description if super_project else None,
               category=super_project.category if super_project else None,
               programming_languages=super_project.programming_languages if super_project else None,
               committee_key=str(committee_key),
               created=datetime.datetime.now(datetime.UTC),
               created_by=self.__asf_uid,
           )
   ```
   
   ### Where new code would go
   - `atr/models/sql.py` — in the Project class definition
     New `chair` (str | None) and `charter` (str | None) columns need to be 
added to the sql.Project model.
   
   ### Proposed approach
   This issue requires a two-part change: (1) Add `chair` and `charter` fields 
to the `sql.Project` model in `atr/models/sql.py`, and (2) update the 
`_update_projects` function in `atr/datasources/apache.py` to persist these 
values from the upstream data source. The chair info from `ProjectStatus` is a 
nested `ChairInfo` object containing a `PersonInfo` with name/homepage/mbox — 
the simplest approach is to store the chair's name as a string on the Project 
model. The charter is already a plain string. A database migration will also be 
needed.
   
   As @dave2wave noted, this should be part of a broader project schema cleanup 
and update refactoring. The UI in `atr/get/projects.py` could also be updated 
to display chair and charter information, but the core fix is in the data 
persistence layer.
   
   ### Suggested patches
   
   #### `atr/datasources/apache.py`
   Persist chair name and charter from upstream ProjectStatus to the 
sql.Project model during sync
   
   ````diff
   --- a/atr/datasources/apache.py
   +++ b/atr/datasources/apache.py
   @@ -405,6 +405,12 @@
            project_model.category = ", ".join(project_status.category) or None
            project_model.description = project_status.description
            project_model.programming_languages = ", ".join(project_status.programming_language) or None
   +        # Persist chair name from upstream DOAP data
   +        if project_status.chair and project_status.chair.person and project_status.chair.person.name:
   +            project_model.chair = project_status.chair.person.name
   +        else:
   +            project_model.chair = None
   +        project_model.charter = project_status.charter
    
        return added_count, updated_count
   ````
   
   #### `atr/models/sql.py`
   Add chair and charter columns to the Project model (exact location within 
the class TBD since file not fully provided)
   
   ````diff
   --- a/atr/models/sql.py
   +++ b/atr/models/sql.py
   @@ -0,0 +0,0 @@ class Project(...):
   +    # TODO: confirm exact placement within Project class fields
   +    chair: str | None = sqlmodel.Field(default=None)
   +    charter: str | None = sqlmodel.Field(default=None)
   ````
   
   ### Open questions
   - The exact structure of `sql.Project` in `atr/models/sql.py` is not 
available — need to confirm the model definition and where to place new fields.
   - Should chair be stored as a plain name string, or should it reference a 
user ID (ASF UID) for richer integration?
   - A database migration (Alembic or manual ALTER TABLE) will be needed — what 
migration strategy does this project use?
   - Should the chair field on Project duplicate what's already available via 
the Committee model (which has chair info from Whimsy's committee-info.json), 
or is this specifically about the DOAP project-level chair?
   - @dave2wave mentioned a broader schema cleanup and update refactoring — 
should this be deferred until that work is scoped?
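
For the migration question above, the "manual ALTER TABLE" option could look roughly like the following sketch. It assumes a SQLite backend and a table named `project`, neither of which is confirmed by the files examined, and the helper `add_column_if_missing` is hypothetical. If the project uses Alembic, a generated revision would replace this entirely.

```python
import sqlite3


def add_column_if_missing(
    conn: sqlite3.Connection, table: str, column: str, ddl_type: str
) -> bool:
    """Add a nullable column unless it already exists; return True if added.

    Identifiers are interpolated directly, so callers must pass trusted
    constants only, never user input.
    """
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl_type}")
    return True


# Demonstration against an in-memory database with an assumed minimal schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE project (key TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO project (key, name) VALUES ('example', 'Example')")

added = [add_column_if_missing(conn, "project", c, "TEXT") for c in ("chair", "charter")]
rerun = [add_column_if_missing(conn, "project", c, "TEXT") for c in ("chair", "charter")]
row = conn.execute("SELECT key, chair, charter FROM project").fetchone()
```

Because both columns are nullable with no default, existing rows read back as `NULL` until the next sync populates them, and running the helper a second time is a no-op.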
   
   ### Files examined
   - `atr/datasources/apache.py`
   - `atr/get/projects.py`
   - `atr/post/projects.py`
   - `atr/storage/writers/project.py`
   - `atr/shared/projects.py`
   
   ### Related issues
   This issue appears related to: #468, #465.
   
   _All address data model issues with committee and project metadata schema_
   
   ---
   *Draft from a triage agent. A human reviewer should validate before merging 
any change. The agent did not run tests or verify diffs apply.*


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

