asf-tooling commented on issue #233:
URL: 
https://github.com/apache/tooling-trusted-releases/issues/233#issuecomment-4407589945

   <!-- gofannon-issue-triage-bot v2 -->
   
   **Automated triage** — analyzed at `main@751c2146`
   
   **Type:** `new_feature`  •  **Classification:** `actionable`  •  
**Confidence:** `medium`
   **Application domain(s):** `authentication_authorization`, `admin_operations`
   
   ### Summary
   This issue requests using a standard prefix for ATR secret tokens (PATs) and 
registering the prefix with third-party scanners like GitHub. The team has 
developed a 'noisy secrets' specification (published at 
tooling.apache.org/noisy-secrets.html) with a reference implementation in 
`atr/noisy.py`. The PAT creation code already uses `noisy.create()` (visible in 
`atr/post/tokens.py`). The remaining work is: (1) registering the prefix 
pattern with GitHub's secret scanning partner program and other scanners, (2) 
upstreaming `atr/noisy.py` to `asfpy` per @gstein's request, and (3) 
potentially integrating with any other secret token types ATR issues.
   
   ### Where this lives in the code today
   
   #### `atr/post/tokens.py` — `_add_token` (lines 69-84)
   _currently does this_
   PAT creation already uses noisy.create() with a domain prefix, implementing 
the 'prefix for all secret tokens' part of the issue.
   
   ```python
   import atr.noisy as noisy
   import atr.sessions as sessions
   import atr.shared as shared
   import atr.storage as storage
   import atr.web as web
   
   _EXPIRY_DAYS: Final[int] = 180
   _PAT_NOISY_SECRET_DOMAIN: Final[bytes] = b"tooling.apache.org"
   
   ...
   
   async def _add_token(session: web.Committer, add_form: 
shared.tokens.AddTokenForm) -> web.WerkzeugResponse:
       plaintext = noisy.create(_PAT_NOISY_SECRET_DOMAIN).decode("ascii")
       token_hash = hashlib.sha3_256(plaintext.encode()).hexdigest()
       created = datetime.datetime.now(datetime.UTC)
       expires = created + datetime.timedelta(days=_EXPIRY_DAYS)
   ```
   
   #### `atr/storage/writers/tokens.py` — `FoundationCommitter.add_token` 
(lines 61-75)
   _currently does this_
   Storage layer for token creation - receives the already-hashed noisy secret; 
no changes needed here.
   
   ```python
       async def add_token(
           self, token_hash: str, created: datetime.datetime, expires: 
datetime.datetime, label: str | None
       ) -> types.PersonalAccessTokenSafe:
           if not label:
               raise ValueError("Label is required")
           pat = sql.PersonalAccessToken(
               asfuid=self.__asf_uid,
               token_hash=token_hash,
               created=created,
               expires=expires,
               label=label,
           )
           self.__data.add(pat)
           await self.__data.commit()
           log.auth_event("pat_issuance", self.__asf_uid, 
pat_hash=pat.token_hash)
   ```
   
   #### `atr/storage/writers/tokens.py` — `FoundationCommitter.issue_jwt` 
(lines 113-128)
   _currently does this_
   JWT issuance from PAT - accepts the noisy secret plaintext and hashes it for 
lookup. Works with the new format since hashing is format-agnostic.
   
   ```python
       async def issue_jwt(self, pat_text: str) -> str:
           pat_hash = hashlib.sha3_256(pat_text.encode()).hexdigest()
           pat = await self.__data.query_one_or_none(
               sqlmodel.select(sql.PersonalAccessToken).where(
                   sql.PersonalAccessToken.asfuid == self.__asf_uid,
                   sql.PersonalAccessToken.token_hash == pat_hash,
               )
           )
           if (pat is None) or (pat.expires < 
datetime.datetime.now(datetime.UTC)):
               log.warning(
                   "Authentication failed",
                   extra={
                       "reason": "invalid_or_expired_pat",
                   },
               )
               raise storage.AccessError("Authentication failed", status=401)
   ```
   
   #### `atr/jwtoken.py` — `verify` (lines 125-138)
   _currently does this_
   JWT verification - JWTs themselves don't need the noisy prefix (they're 
short-lived and derived from PATs), but this shows the authentication chain 
from PAT to JWT.
   
   ```python
   async def verify(token: str) -> dict[str, Any]:
       jwt_secret_key = _signing_key()
       claims_unsafe = jwt.decode(token, options={"verify_signature": False}, 
algorithms=[_ALGORITHM])
       asf_uid = claims_unsafe.get("sub")
       log.set_asf_uid(asf_uid)
       claims = jwt.decode(
           token,
           jwt_secret_key,
           algorithms=[_ALGORITHM],
           issuer=_ATR_JWT_ISSUER,
           audience=_ATR_JWT_AUDIENCE,
           leeway=_ATR_JWT_LEEWAY_SECONDS,
           options={"require": ["sub", "iss", "aud", "iat", "nbf", "exp", 
"jti"]},
       )
   ```
   
   ### Where new code would go
   - `atr/noisy.py` — existing file (reference implementation)
     The noisy secrets reference implementation already exists here per @sbp's 
comment. It will eventually be upstreamed to asfpy per @gstein's request.
   
   ### Proposed approach
   The core implementation within ATR is already complete - PATs are generated 
using `noisy.create()` with the domain `b"tooling.apache.org"`. The remaining 
work is primarily external to this codebase:
   
   1. **Register with GitHub Secret Scanning Partner Program**: Submit the 
noisy secret prefix/pattern to GitHub so their scanner can detect leaked ATR 
PATs in public repositories. This requires filing a request with GitHub's 
secret scanning team with the regex pattern matching the noisy secret format.
   
   2. **Upstream to asfpy**: Per @gstein's request (confirmed by @sbp), move 
the `atr/noisy.py` implementation into the shared `asfpy` module so other ASF 
tools can use the same token format. Once upstreamed, ATR would depend on 
`asfpy.noisy` instead of `atr.noisy`.
   
   3. **Incorporate specification feedback**: @sbp's last comment notes that 
feedback from `#security-discuss` needs to be incorporated before the standard 
is finalized.
   
   Since the in-repo implementation is already functional and the remaining 
work is coordination/external registration, no immediate code diffs are needed 
within this repository.
   
   ### Open questions
   - What is the exact regex pattern produced by atr/noisy.py that should be 
registered with GitHub's secret scanning partner program?
   - Has the feedback from #security-discuss been incorporated into the 
specification yet?
   - Are there other secret token types in ATR beyond PATs that need the noisy 
prefix (e.g., any temporary secrets in SSH workflows)?
   - What is the timeline for upstreaming to asfpy, and will ATR then switch 
its import from atr.noisy to asfpy.noisy?
   - Should the IssueForm in atr/shared/tokens.py validate that the submitted 
PAT matches the noisy secret format (as a client-side hint that old-format PATs 
are no longer accepted)?
   
   ### Files examined
   - `atr/post/tokens.py`
   - `atr/storage/writers/tokens.py`
   - `atr/get/tokens.py`
   - `atr/jwtoken.py`
   - `tests/unit/test_tokens_safe.py`
   - `atr/shared/tokens.py`
   - `atr/storage/readers/tokens.py`
   - `atr/ssh.py`
   
   ---
   *Draft from a triage agent. A human reviewer should validate before merging 
any change. The agent did not run tests or verify diffs apply.*


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to