asf-tooling commented on issue #233:
URL:
https://github.com/apache/tooling-trusted-releases/issues/233#issuecomment-4407589945
<!-- gofannon-issue-triage-bot v2 -->
**Automated triage** — analyzed at `main@751c2146`
**Type:** `new_feature` • **Classification:** `actionable` •
**Confidence:** `medium`
**Application domain(s):** `authentication_authorization`, `admin_operations`
### Summary
This issue requests using a standard prefix for ATR secret tokens (PATs) and
registering the prefix with third-party scanners like GitHub. The team has
developed a 'noisy secrets' specification (published at
tooling.apache.org/noisy-secrets.html) with a reference implementation in
`atr/noisy.py`. The PAT creation code already uses `noisy.create()` (visible in
`atr/post/tokens.py`). The remaining work is: (1) registering the prefix
pattern with GitHub's secret scanning partner program and other scanners, (2)
upstreaming `atr/noisy.py` to `asfpy` per @gstein's request, and (3)
potentially integrating with any other secret token types ATR issues.
### Where this lives in the code today
#### `atr/post/tokens.py` — `_add_token` (lines 69-84)
_currently does this_
PAT creation already uses noisy.create() with a domain prefix, implementing
the 'prefix for all secret tokens' part of the issue.
```python
import atr.noisy as noisy
import atr.sessions as sessions
import atr.shared as shared
import atr.storage as storage
import atr.web as web
_EXPIRY_DAYS: Final[int] = 180
_PAT_NOISY_SECRET_DOMAIN: Final[bytes] = b"tooling.apache.org"
...
async def _add_token(session: web.Committer, add_form:
shared.tokens.AddTokenForm) -> web.WerkzeugResponse:
plaintext = noisy.create(_PAT_NOISY_SECRET_DOMAIN).decode("ascii")
token_hash = hashlib.sha3_256(plaintext.encode()).hexdigest()
created = datetime.datetime.now(datetime.UTC)
expires = created + datetime.timedelta(days=_EXPIRY_DAYS)
```
#### `atr/storage/writers/tokens.py` — `FoundationCommitter.add_token`
(lines 61-75)
_currently does this_
Storage layer for token creation - receives the already-hashed noisy secret;
no changes needed here.
```python
async def add_token(
self, token_hash: str, created: datetime.datetime, expires:
datetime.datetime, label: str | None
) -> types.PersonalAccessTokenSafe:
if not label:
raise ValueError("Label is required")
pat = sql.PersonalAccessToken(
asfuid=self.__asf_uid,
token_hash=token_hash,
created=created,
expires=expires,
label=label,
)
self.__data.add(pat)
await self.__data.commit()
log.auth_event("pat_issuance", self.__asf_uid,
pat_hash=pat.token_hash)
```
#### `atr/storage/writers/tokens.py` — `FoundationCommitter.issue_jwt`
(lines 113-128)
_currently does this_
JWT issuance from PAT - accepts the noisy secret plaintext and hashes it for
lookup. Works with the new format since hashing is format-agnostic.
```python
async def issue_jwt(self, pat_text: str) -> str:
pat_hash = hashlib.sha3_256(pat_text.encode()).hexdigest()
pat = await self.__data.query_one_or_none(
sqlmodel.select(sql.PersonalAccessToken).where(
sql.PersonalAccessToken.asfuid == self.__asf_uid,
sql.PersonalAccessToken.token_hash == pat_hash,
)
)
if (pat is None) or (pat.expires <
datetime.datetime.now(datetime.UTC)):
log.warning(
"Authentication failed",
extra={
"reason": "invalid_or_expired_pat",
},
)
raise storage.AccessError("Authentication failed", status=401)
```
#### `atr/jwtoken.py` — `verify` (lines 125-138)
_currently does this_
JWT verification - JWTs themselves don't need the noisy prefix (they're
short-lived and derived from PATs), but this shows the authentication chain
from PAT to JWT.
```python
async def verify(token: str) -> dict[str, Any]:
jwt_secret_key = _signing_key()
claims_unsafe = jwt.decode(token, options={"verify_signature": False},
algorithms=[_ALGORITHM])
asf_uid = claims_unsafe.get("sub")
log.set_asf_uid(asf_uid)
claims = jwt.decode(
token,
jwt_secret_key,
algorithms=[_ALGORITHM],
issuer=_ATR_JWT_ISSUER,
audience=_ATR_JWT_AUDIENCE,
leeway=_ATR_JWT_LEEWAY_SECONDS,
options={"require": ["sub", "iss", "aud", "iat", "nbf", "exp",
"jti"]},
)
```
### Where new code would go
- `atr/noisy.py` — existing file (reference implementation)
The noisy secrets reference implementation already exists here per @sbp's
comment. It will eventually be upstreamed to asfpy per @gstein's request.
### Proposed approach
The core implementation within ATR is already complete - PATs are generated
using `noisy.create()` with the domain `b"tooling.apache.org"`. The remaining
work is primarily external to this codebase:
1. **Register with GitHub Secret Scanning Partner Program**: Submit the
noisy secret prefix/pattern to GitHub so their scanner can detect leaked ATR
PATs in public repositories. This requires filing a request with GitHub's
secret scanning team with the regex pattern matching the noisy secret format.
2. **Upstream to asfpy**: Per @gstein's request (confirmed by @sbp), move
the `atr/noisy.py` implementation into the shared `asfpy` module so other ASF
tools can use the same token format. Once upstreamed, ATR would depend on
`asfpy.noisy` instead of `atr.noisy`.
3. **Incorporate specification feedback**: @sbp's last comment notes that
feedback from `#security-discuss` needs to be incorporated before the standard
is finalized.
Since the in-repo implementation is already functional and the remaining
work is coordination/external registration, no immediate code diffs are needed
within this repository.
### Open questions
- What is the exact regex pattern produced by atr/noisy.py that should be
registered with GitHub's secret scanning partner program?
- Has the feedback from #security-discuss been incorporated into the
specification yet?
- Are there other secret token types in ATR beyond PATs that need the noisy
prefix (e.g., any temporary secrets in SSH workflows)?
- What is the timeline for upstreaming to asfpy, and will ATR then switch
its import from atr.noisy to asfpy.noisy?
- Should the IssueForm in atr/shared/tokens.py validate that the submitted
PAT matches the noisy secret format (as a client-side hint that old-format PATs
are no longer accepted)?
### Files examined
- `atr/post/tokens.py`
- `atr/storage/writers/tokens.py`
- `atr/get/tokens.py`
- `atr/jwtoken.py`
- `tests/unit/test_tokens_safe.py`
- `atr/shared/tokens.py`
- `atr/storage/readers/tokens.py`
- `atr/ssh.py`
---
*Draft from a triage agent. A human reviewer should validate before merging
any change. The agent did not run tests or verify diffs apply.*
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]