asf-tooling commented on issue #1052:
URL: 
https://github.com/apache/tooling-trusted-releases/issues/1052#issuecomment-4409879045

   <!-- gofannon-issue-triage-bot v2 -->
   
   **Automated triage** — analyzed at `main@2da7807a`
   
   **Type:** `discussion`  •  **Classification:** `no_action`  •  
**Confidence:** `high`
   **Application domain(s):** `authentication_authorization`
   
   ### Summary
   The issue requests adding explicit OAuth scope parameters to the 
authorization URL. However, the code responsible (`OAUTH_URL_INIT` in 
`src/asfquart/generics.py`) lives in the external `asfquart` package, not in 
this repository. The triage notes confirm this: 'low, also let asfquart know 
about this'. This repo already filters session data to only model-defined 
fields before storage via `_prepare_session_data` in `atr/sessions.py`. The 
actionable work for THIS repo is limited to documentation of the accepted risk; 
the actual OAuth scope fix belongs upstream in asfquart.
   
   ### Where this lives in the code today
   
   #### `atr/sessions.py` — `_prepare_session_data` (lines 112-133)
   _currently does this_
   This function already performs data minimization by filtering incoming OAuth 
data to only fields defined in the UserSession model, dropping 
unknown/unnecessary fields before storage.
   
   ```python
   def _prepare_session_data(session_data: dict[str, Any]) -> dict[str, Any]:
       # Field names match the raw OAuth dict from asfquart
       # They do not match ClientSession attribute names
       # ClientSession remaps on construction
       # Raw "pmcs" becomes ClientSession.committees
       # Raw "roleaccount" becomes ClientSession.isRole
       # Since awrite() passes the raw dict, we map from the raw key names here
       data = dict(session_data)
       if "pmcs" in data:
           data["committees"] = data.pop("pmcs")
       if "isMember" in data:
           data["is_member"] = data.pop("isMember")
       if "isChair" in data:
           data["is_chair"] = data.pop("isChair")
       # if "isRoot" in data:
       #     data["is_root"] = data.pop("isRoot")
       if "roleaccount" in data:
           data["is_role"] = data.pop("roleaccount")
       data.pop("metadata", None)
       data.pop("cts", None)
       data.pop("uts", None)
       return {k: v for k, v in data.items() if k in _SESSION_DATA_FIELDS}
   ```
   
   #### `atr/sessions.py` — `Store.create` (lines 136-143)
   _currently does this_
   Session creation calls _prepare_session_data which filters OAuth response 
data before storing, demonstrating existing data minimization at the storage 
layer.
   
   ```python
   class Store:
       async def create(self, hsid: str, session_data: dict[str, Any]) -> None:
           prepared = _prepare_session_data(session_data)
           now = time.time()
           user_session = sql.UserSession(sid_hash=hsid, cts=now, uts=now, 
**prepared)
           async with db.session() as data:
               data.add(user_session)
               await data.commit()
   ```
   
   ### Proposed approach
   The primary fix (adding a `scope` parameter to the OAuth authorization URL) 
must be implemented in the upstream `asfquart` package, not in this repository. 
The triage notes explicitly state 'low, also let asfquart know about this', 
confirming the team recognizes this is an upstream concern.
   
   This repository already implements a key mitigation: `_prepare_session_data` 
in `atr/sessions.py` filters incoming OAuth data to only fields defined in the 
`UserSession` SQLModel, dropping any extra data the AS might return. This means 
even without scope restriction at the authorization request level, the session 
storage layer only persists the minimum required fields.
   
   The remaining action item for this repo would be to document this as an 
accepted risk (if `oauth.apache.org` doesn't support client-side scope 
requests) or to coordinate with `oauth.apache.org` maintainers. No code change 
in this repository is warranted at this time.
   
   ### Open questions
   - Does oauth.apache.org support client-side scope parameters in 
authorization requests?
   - Has an issue been filed against the asfquart repository for adding scope 
to OAUTH_URL_INIT?
   - What fields does oauth.apache.org currently return by default vs what ATR 
actually uses (uid, dn, fullname, email, pmcs, projects, isMember, isChair, 
mfa, roleaccount)?
   
   _The agent reviewed this issue and is not proposing patches in this run. 
Review the existing-code citations and open questions above before deciding 
next steps._
   
   ### Files examined
   - `atr/sessions.py`
   - `atr/ldap.py`
   - `atr/jwtoken.py`
   - `atr/principal.py`
   - `atr/get/user.py`
   - `atr/post/user.py`
   - `atr/shared/user.py`
   - `atr/storage/readers/user.py`
   
   ---
   *Draft from a triage agent. A human reviewer should validate before merging 
any change. The agent did not run tests or verify diffs apply.*


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to