asf-tooling commented on issue #233:
URL:
https://github.com/apache/tooling-trusted-releases/issues/233#issuecomment-4410422768
<!-- gofannon-issue-triage-bot v2 -->
**Automated triage** — analyzed at `main@2da7807a`
**Type:** `new_feature` • **Classification:** `actionable` •
**Confidence:** `high`
**Application domain(s):** `authentication_authorization`,
`shared_infrastructure`
### Summary
This issue tracks using prefixed (Noisy Secrets) tokens in ATR and informing
third-party scanners. The specification is complete (published at
tooling.apache.org/noisy-secrets.html), the reference implementation exists in
`atr/noisy.py`, and ATR already generates PATs using noisy secrets
(`atr/post/tokens.py` line 62). Per @sbp's latest comment (Apr 22), the next
step is to more fully integrate these values within ATR — likely adding format
verification on PAT acceptance and registering the `secret_` prefix pattern
with GitHub's secret scanning partner program. @gstein also requested
upstreaming to `asfpy`.
### Where this lives in the code today
#### `atr/noisy.py` — `create` (lines 292-294)
_currently does this_
The top-level API for creating noisy secrets, already used by the PAT
creation flow.
```python
def create(fqdn: bytes | None = None) -> NoisySecret:
noisy_secret = construct_noisy_secret(FQDN(fqdn) if (fqdn is not None)
else None)
return NoisySecret(noisy_secret)
```
#### `atr/noisy.py` — `verify` (lines 297-300)
_extension point_
Verification function exists but is not yet called during PAT acceptance/JWT
issuance, meaning ATR doesn't validate incoming PAT format.
```python
def verify(bytes_value: bytes) -> bool:
if not is_candidate(bytes_value):
return False
return is_noisy_secret_tag(bytes_value)
```
#### `atr/post/tokens.py` — `_add_token` (lines 69-73)
_currently does this_
PAT creation already uses noisy.create() with the tooling.apache.org domain
to generate prefixed tokens.
```python
async def _add_token(session: web.Committer, add_form:
shared.tokens.AddTokenForm) -> web.WerkzeugResponse:
plaintext = noisy.create(_PAT_NOISY_SECRET_DOMAIN).decode("ascii")
token_hash = hashlib.sha3_256(plaintext.encode()).hexdigest()
created = datetime.datetime.now(datetime.UTC)
expires = created + datetime.timedelta(days=_EXPIRY_DAYS)
```
#### `atr/storage/writers/tokens.py` — `FoundationCommitter.issue_jwt`
(lines 113-128)
_needs modification_
JWT issuance accepts any PAT string without validating its format as a noisy
secret. Adding verification here could catch corrupted tokens early and log
format mismatches.
```python
async def issue_jwt(self, pat_text: str) -> str:
pat_hash = hashlib.sha3_256(pat_text.encode()).hexdigest()
pat = await self.__data.query_one_or_none(
sqlmodel.select(sql.PersonalAccessToken).where(
sql.PersonalAccessToken.asfuid == self.__asf_uid,
sql.PersonalAccessToken.token_hash == pat_hash,
)
)
if (pat is None) or (pat.expires <
datetime.datetime.now(datetime.UTC)):
log.warning(
"Authentication failed",
extra={
"reason": "invalid_or_expired_pat",
},
)
raise storage.AccessError("Authentication failed", status=401)
```
### Where new code would go
- `atr/storage/writers/tokens.py` — after symbol
FoundationCommitter.issue_jwt
Add optional noisy secret format validation when a PAT is presented for
JWT exchange, logging a warning if the format doesn't match (for migration
compatibility with pre-noisy-secret PATs).
### Proposed approach
The core Noisy Secrets implementation is complete and already integrated
into PAT generation. The remaining in-code work is to add format verification
when PATs are presented for JWT exchange, which helps detect corrupted or
fabricated token strings early. Because pre-existing PATs were not generated as
noisy secrets, strict rejection would break backwards compatibility — so
verification should log a warning rather than reject outright, at least until
all legacy PATs have expired (180-day TTL). The second half of this issue —
registering the `secret_` + namespace pattern with GitHub's secret scanning
partner program — is an external/administrative task that requires submitting a
regex pattern to GitHub. Additionally, @gstein requested upstreaming the
`atr/noisy.py` implementation to the `asfpy` package, which is a separate
contribution outside this repository.
### Suggested patches
#### `atr/storage/writers/tokens.py`
Add format validation logging when a PAT is presented for JWT issuance,
helping track migration from legacy tokens to noisy secrets
````diff
--- a/atr/storage/writers/tokens.py
+++ b/atr/storage/writers/tokens.py
@@ -27,6 +27,7 @@ import sqlmodel
import atr.db as db
import atr.jwtoken as jwtoken
import atr.ldap as ldap
+import atr.noisy as noisy
import atr.log as log
import atr.mail as mail
import atr.models.sql as sql
@@ -97,6 +98,10 @@ class FoundationCommitter(GeneralPublic):
async def issue_jwt(self, pat_text: str) -> str:
pat_hash = hashlib.sha3_256(pat_text.encode()).hexdigest()
+ if not noisy.verify(pat_text.encode("ascii")):
+ # Legacy PATs created before noisy secrets will not pass
verification.
+ # Log for monitoring migration progress; do not reject yet.
+ log.warning("PAT presented for JWT exchange is not a valid
noisy secret", extra={"asfuid": self.__asf_uid})
pat = await self.__data.query_one_or_none(
sqlmodel.select(sql.PersonalAccessToken).where(
sql.PersonalAccessToken.asfuid == self.__asf_uid,
````
### Open questions
- When should legacy (non-noisy-secret) PATs be hard-rejected? After all
existing 180-day PATs expire, or with an explicit migration notice?
- Has the `secret_` prefix regex been submitted to GitHub's secret scanning
partner program yet? This is the 'inform selected third party scanners' half of
the issue.
- Should the noisy.py module be upstreamed to asfpy before or after full ATR
integration, per @gstein's request?
- Are there other secret token types in ATR (beyond PATs) that should also
use noisy secrets?
### Files examined
- `atr/noisy.py`
- `atr/post/tokens.py`
- `atr/storage/writers/tokens.py`
- `atr/get/tokens.py`
- `atr/docs/authentication-security.md`
- `atr/shared/tokens.py`
- `atr/jwtoken.py`
- `atr/config.py`
---
*Draft from a triage agent. A human reviewer should validate before merging
any change. The agent did not run tests or verify diffs apply.*
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]