andrewmusselman opened a new issue, #32:
URL: https://github.com/apache/tooling-gofannon/issues/32

   ### Summary
   Authenticated users (e.g., `asf:akm`) appear as `null` in 
`api_request_start` / `api_request_end` log records for endpoints that don't go 
through `Depends(get_current_user)` — most notably `/log/client`. Same user 
appears correctly on auth-gated endpoints. The middleware reads 
`request.state.user`, which is only populated when the auth dependency runs. 
Public endpoints don't trigger that dependency, so the middleware sees nothing.
    
   ### Details
   **Symptom:**
   Same authenticated user (`asf:akm`) shows three different values in the log 
for the same browser session:
   ```json
   {"eventType": "api_request_end",   "userId": "asf:akm",   "path": 
"/agents/.../deploy"}
   {"eventType": "navigation",        "userId": "anonymous", "metadata": 
{"userId": "asf:akm"}}
   {"eventType": "api_request_start", "userId": null,        "path": 
"/log/client"}
   ```
    
   **Root cause (three contributing patterns):**
    
   1. **Two different defaults for missing uid.**
      - `services/observability_service.py:265, 289`: `getattr(request.state, 
'user', {}).get('uid')` → `None`
      - `routes.py:226` (`/log/client` handler): `getattr(request.state, 
'user', {}).get('uid', 'anonymous')` → `"anonymous"`
      Same condition, different sentinel values.
   2. **`request.state.user` only set by `Depends(get_current_user)`.** Public 
endpoints skip auth dependency, so the middleware sees empty state regardless 
of whether the session cookie is valid.
   3. **`chat_service.py:75, 146` explicitly passes `user_id=None`** with the 
comment "Chat service doesn't track user costs". Conflates cost-attribution 
policy with observability-attribution.
   **Impact:**
   - Audit logs unusable for tracing user activity across endpoints.
   - Cross-user correlation queries return inconsistent results.
   - Genuine anonymous users indistinguishable from authenticated users on 
public endpoints.
   ### Remediation
   **Opportunistic session resolution in the middleware:**
    
   ```python
   async def _resolve_user_opportunistically(request) -> Optional[str]:
       """Try to resolve user_id from session cookie without failing."""
       sid = request.cookies.get("gofannon_sid")
       if not sid:
           return None
       try:
           svc = get_session_service(get_database_service(settings))
           session = await svc.get_by_id(sid)
           if session:
               request.state.user = {"uid": session.user_uid, "auth_mode": 
"session"}
               return session.user_uid
       except Exception:
           pass
       return None
    
   # In the middleware __call__, before logging:
   user_id = getattr(request.state, 'user', {}).get('uid')
   if user_id is None:
       user_id = await _resolve_user_opportunistically(request)
   ```
    
   **For the chat-service-specific null:** decouple cost-attribution from 
observability-attribution. Thread the real `user_id` into chat LLM events but 
skip the `user_service.add_usage` call instead of blanking the user_id.
    
   ### Acceptance Criteria
   - [ ] Fixed: Middleware resolves session cookie opportunistically for public 
endpoints
   - [ ] Fixed: `chat_service.py` threads real `user_id` into LLM events 
(without billing)
   - [ ] Fixed: Consistent sentinel — `"anonymous"` everywhere or `null` 
everywhere, not both
   - [ ] Test added: authenticated user appears in `/log/client` event with 
correct `userId`
   - [ ] Test added: genuinely unauthenticated request appears as `"anonymous"` 
(or `null`, whichever is chosen)
   - [ ] Test added: chat LLM events have non-null userId for authenticated 
users
   - [ ] Note for migration: any audit query filtering `userId == "anonymous"` 
will need updating
   ### References
   - File: 
`webapp/packages/api/user-service/services/observability_service.py:265,289`
   - File: `webapp/packages/api/user-service/routes.py:226` (`/log/client`)
   - File: `webapp/packages/api/user-service/services/chat_service.py:75,146`
   - Tracker: FIXES.md item #12
   ### Priority
   **Low** - Observability hygiene; not user-impacting. ~40 LOC in 
observability service + ~10 LOC across two chat_service lines.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to