asf-tooling commented on issue #233:
URL: 
https://github.com/apache/tooling-trusted-releases/issues/233#issuecomment-4410422768

   <!-- gofannon-issue-triage-bot v2 -->
   
   **Automated triage** — analyzed at `main@2da7807a`
   
   **Type:** `new_feature`  •  **Classification:** `actionable`  •  
**Confidence:** `high`
   **Application domain(s):** `authentication_authorization`, 
`shared_infrastructure`
   
   ### Summary
   This issue tracks using prefixed (Noisy Secrets) tokens in ATR and informing 
third-party scanners. The specification is complete (published at 
tooling.apache.org/noisy-secrets.html), the reference implementation exists in 
`atr/noisy.py`, and ATR already generates PATs using noisy secrets 
(`atr/post/tokens.py` line 62). Per @sbp's latest comment (Apr 22), the next 
step is to more fully integrate these values within ATR — likely adding format 
verification on PAT acceptance and registering the `secret_` prefix pattern 
with GitHub's secret scanning partner program. @gstein also requested 
upstreaming to `asfpy`.
   
   ### Where this lives in the code today
   
   #### `atr/noisy.py` — `create` (lines 292-294)
   _currently does this_
   The top-level API for creating noisy secrets, already used by the PAT 
creation flow.
   
   ```python
   def create(fqdn: bytes | None = None) -> NoisySecret:
       noisy_secret = construct_noisy_secret(FQDN(fqdn) if (fqdn is not None) 
else None)
       return NoisySecret(noisy_secret)
   ```
   
   #### `atr/noisy.py` — `verify` (lines 297-300)
   _extension point_
   Verification function exists but is not yet called during PAT acceptance/JWT 
issuance, meaning ATR doesn't validate incoming PAT format.
   
   ```python
   def verify(bytes_value: bytes) -> bool:
       if not is_candidate(bytes_value):
           return False
       return is_noisy_secret_tag(bytes_value)
   ```
   
   #### `atr/post/tokens.py` — `_add_token` (lines 69-73)
   _currently does this_
   PAT creation already uses noisy.create() with the tooling.apache.org domain 
to generate prefixed tokens.
   
   ```python
   async def _add_token(session: web.Committer, add_form: 
shared.tokens.AddTokenForm) -> web.WerkzeugResponse:
       plaintext = noisy.create(_PAT_NOISY_SECRET_DOMAIN).decode("ascii")
       token_hash = hashlib.sha3_256(plaintext.encode()).hexdigest()
       created = datetime.datetime.now(datetime.UTC)
       expires = created + datetime.timedelta(days=_EXPIRY_DAYS)
   ```
   
   #### `atr/storage/writers/tokens.py` — `FoundationCommitter.issue_jwt` 
(lines 113-128)
   _needs modification_
   JWT issuance accepts any PAT string without validating its format as a noisy 
secret. Adding verification here could catch corrupted tokens early and log 
format mismatches.
   
   ```python
       async def issue_jwt(self, pat_text: str) -> str:
           pat_hash = hashlib.sha3_256(pat_text.encode()).hexdigest()
           pat = await self.__data.query_one_or_none(
               sqlmodel.select(sql.PersonalAccessToken).where(
                   sql.PersonalAccessToken.asfuid == self.__asf_uid,
                   sql.PersonalAccessToken.token_hash == pat_hash,
               )
           )
           if (pat is None) or (pat.expires < 
datetime.datetime.now(datetime.UTC)):
               log.warning(
                   "Authentication failed",
                   extra={
                       "reason": "invalid_or_expired_pat",
                   },
               )
               raise storage.AccessError("Authentication failed", status=401)
   ```
   
   ### Where new code would go
   - `atr/storage/writers/tokens.py` — after symbol 
FoundationCommitter.issue_jwt
     Add optional noisy secret format validation when a PAT is presented for 
JWT exchange, logging a warning if the format doesn't match (for migration 
compatibility with pre-noisy-secret PATs).
   
   ### Proposed approach
   The core Noisy Secrets implementation is complete and already integrated 
into PAT generation. The remaining in-code work is to add format verification 
when PATs are presented for JWT exchange, which helps detect corrupted or 
fabricated token strings early. Because pre-existing PATs were not generated as 
noisy secrets, strict rejection would break backwards compatibility — so 
verification should log a warning rather than reject outright, at least until 
all legacy PATs have expired (180-day TTL). The second half of this issue — 
registering the `secret_` + namespace pattern with GitHub's secret scanning 
partner program — is an external/administrative task that requires submitting a 
regex pattern to GitHub. Additionally, @gstein requested upstreaming the 
`atr/noisy.py` implementation to the `asfpy` package, which is a separate 
contribution outside this repository.
   
   ### Suggested patches
   
   #### `atr/storage/writers/tokens.py`
   Add format validation logging when a PAT is presented for JWT issuance, 
helping track migration from legacy tokens to noisy secrets
   
   ````diff
   --- a/atr/storage/writers/tokens.py
   +++ b/atr/storage/writers/tokens.py
   @@ -27,6 +27,7 @@ import sqlmodel
    import atr.db as db
    import atr.jwtoken as jwtoken
    import atr.ldap as ldap
   +import atr.noisy as noisy
    import atr.log as log
    import atr.mail as mail
    import atr.models.sql as sql
   @@ -97,6 +98,10 @@ class FoundationCommitter(GeneralPublic):
    
        async def issue_jwt(self, pat_text: str) -> str:
            pat_hash = hashlib.sha3_256(pat_text.encode()).hexdigest()
   +        if not noisy.verify(pat_text.encode("ascii")):
   +            # Legacy PATs created before noisy secrets will not pass 
verification.
   +            # Log for monitoring migration progress; do not reject yet.
   +            log.warning("PAT presented for JWT exchange is not a valid 
noisy secret", extra={"asfuid": self.__asf_uid})
            pat = await self.__data.query_one_or_none(
                sqlmodel.select(sql.PersonalAccessToken).where(
                    sql.PersonalAccessToken.asfuid == self.__asf_uid,
   ````
   
   ### Open questions
   - When should legacy (non-noisy-secret) PATs be hard-rejected? After all 
existing 180-day PATs expire, or with an explicit migration notice?
   - Has the `secret_` prefix regex been submitted to GitHub's secret scanning 
partner program yet? This is the 'inform selected third party scanners' half of 
the issue.
   - Should the noisy.py module be upstreamed to asfpy before or after full ATR 
integration, per @gstein's request?
   - Are there other secret token types in ATR (beyond PATs) that should also 
use noisy secrets?
   
   ### Files examined
   - `atr/noisy.py`
   - `atr/post/tokens.py`
   - `atr/storage/writers/tokens.py`
   - `atr/get/tokens.py`
   - `atr/docs/authentication-security.md`
   - `atr/shared/tokens.py`
   - `atr/jwtoken.py`
   - `atr/config.py`
   
   ---
   *Draft from a triage agent. A human reviewer should validate before merging 
any change. The agent did not run tests or verify diffs apply.*


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to