asf-tooling commented on issue #1052:
URL:
https://github.com/apache/tooling-trusted-releases/issues/1052#issuecomment-4409879045
<!-- gofannon-issue-triage-bot v2 -->
**Automated triage** — analyzed at `main@2da7807a`
**Type:** `discussion` • **Classification:** `no_action` •
**Confidence:** `high`
**Application domain(s):** `authentication_authorization`
### Summary
The issue requests adding explicit OAuth scope parameters to the
authorization URL. However, the code responsible (`OAUTH_URL_INIT` in
`src/asfquart/generics.py`) lives in the external `asfquart` package, not in
this repository. The triage notes confirm this: 'low, also let asfquart know
about this'. This repo already filters session data to only model-defined
fields before storage via `_prepare_session_data` in `atr/sessions.py`. The
actionable work for THIS repo is limited to documentation of the accepted risk;
the actual OAuth scope fix belongs upstream in asfquart.
### Where this lives in the code today
#### `atr/sessions.py` — `_prepare_session_data` (lines 112-133)
_currently does this_
This function already performs data minimization by filtering incoming OAuth
data to only fields defined in the UserSession model, dropping
unknown/unnecessary fields before storage.
```python
def _prepare_session_data(session_data: dict[str, Any]) -> dict[str, Any]:
# Field names match the raw OAuth dict from asfquart
# They do not match ClientSession attribute names
# ClientSession remaps on construction
# Raw "pmcs" becomes ClientSession.committees
# Raw "roleaccount" becomes ClientSession.isRole
# Since awrite() passes the raw dict, we map from the raw key names here
data = dict(session_data)
if "pmcs" in data:
data["committees"] = data.pop("pmcs")
if "isMember" in data:
data["is_member"] = data.pop("isMember")
if "isChair" in data:
data["is_chair"] = data.pop("isChair")
# if "isRoot" in data:
# data["is_root"] = data.pop("isRoot")
if "roleaccount" in data:
data["is_role"] = data.pop("roleaccount")
data.pop("metadata", None)
data.pop("cts", None)
data.pop("uts", None)
return {k: v for k, v in data.items() if k in _SESSION_DATA_FIELDS}
```
#### `atr/sessions.py` — `Store.create` (lines 136-143)
_currently does this_
Session creation calls _prepare_session_data which filters OAuth response
data before storing, demonstrating existing data minimization at the storage
layer.
```python
class Store:
async def create(self, hsid: str, session_data: dict[str, Any]) -> None:
prepared = _prepare_session_data(session_data)
now = time.time()
user_session = sql.UserSession(sid_hash=hsid, cts=now, uts=now,
**prepared)
async with db.session() as data:
data.add(user_session)
await data.commit()
```
### Proposed approach
The primary fix (adding a `scope` parameter to the OAuth authorization URL)
must be implemented in the upstream `asfquart` package, not in this repository.
The triage notes explicitly state 'low, also let asfquart know about this',
confirming the team recognizes this is an upstream concern.
This repository already implements a key mitigation: `_prepare_session_data`
in `atr/sessions.py` filters incoming OAuth data to only fields defined in the
`UserSession` SQLModel, dropping any extra data the AS might return. This means
even without scope restriction at the authorization request level, the session
storage layer only persists the minimum required fields.
The remaining action item for this repo would be to document this as an
accepted risk (if `oauth.apache.org` doesn't support client-side scope
requests) or to coordinate with `oauth.apache.org` maintainers. No code change
in this repository is warranted at this time.
### Open questions
- Does oauth.apache.org support client-side scope parameters in
authorization requests?
- Has an issue been filed against the asfquart repository for adding scope
to OAUTH_URL_INIT?
- What fields does oauth.apache.org currently return by default vs what ATR
actually uses (uid, dn, fullname, email, pmcs, projects, isMember, isChair,
mfa, roleaccount)?
_The agent reviewed this issue and is not proposing patches in this run.
Review the existing-code citations and open questions above before deciding
next steps._
### Files examined
- `atr/sessions.py`
- `atr/ldap.py`
- `atr/jwtoken.py`
- `atr/principal.py`
- `atr/get/user.py`
- `atr/post/user.py`
- `atr/shared/user.py`
- `atr/storage/readers/user.py`
---
*Draft from a triage agent. A human reviewer should validate before merging
any change. The agent did not run tests or verify diffs apply.*
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]