andrewmusselman opened a new issue, #25:
URL: https://github.com/apache/tooling-gofannon/issues/25

   ### Summary
   Long-running agents sometimes need to be interrupted. The earlier "Cancel" 
button only aborted the client-side fetch; the server-side task kept running 
until completion. Today the only way to actually stop a run is to restart the 
API container, which also kills every other in-flight run.
    
   ### Details
   **Problem:**
   - No surgical kill exists for a specific run.
   - The server-side `asyncio.Task` keeps running after the browser disconnects.
   - `task.cancel()` is too aggressive — raises `CancelledError` mid-await 
inside LLM HTTP calls, leaving httpx connections in unclean states.
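
   To illustrate the last point, here is a minimal sketch of what 
`task.cancel()` does to a coroutine suspended in an await. `fake_llm_call` is a 
hypothetical stand-in for an LLM HTTP call, not project code, and the sleep 
stands in for waiting on the HTTP response.

```python
import asyncio


async def fake_llm_call(log):
    # Hypothetical stand-in for an LLM HTTP call.
    log.append("request sent")
    try:
        await asyncio.sleep(10)  # stands in for awaiting the HTTP response
        log.append("response received")
    except asyncio.CancelledError:
        # task.cancel() lands here, in the middle of the await.
        log.append("cancelled mid-await")
        raise


async def demo_hard_cancel():
    log = []
    task = asyncio.create_task(fake_llm_call(log))
    await asyncio.sleep(0)  # yield once so the task reaches its await
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return log


if __name__ == "__main__":
    print(asyncio.run(demo_hard_cancel()))
```

   The response is never received: the coroutine is interrupted wherever it 
happens to be suspended, which is the behavior the proposal below avoids.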
   ### Proposed solution
   **Cooperative cancellation, layered:**
    
   1. **CancelToken via contextvar** (same pattern as Trace). The agent runtime 
checks `should_stop()` between operations.
   2. **Enforcement at structural boundaries.** Wrap `tools.call_llm`, 
`tools.data_store.*`, and `gofannon_client.call` with an entry check — if 
stopping, raise `AgentStopped` immediately without executing. In-flight LLM 
calls finish naturally; only the next attempt to do anything observable raises.
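
   The two layers above could be sketched roughly as follows. This is an 
illustration under assumptions, not the project's implementation: the shape of 
`CancelToken`, the `stop_checked` decorator, and the `call_llm` stand-in are 
all hypothetical; only the names `CancelToken`, `should_stop()`, and 
`AgentStopped` come from this issue.

```python
import asyncio
import contextvars
import functools


class AgentStopped(Exception):
    """Raised at a tool boundary once the current run was asked to stop."""


class CancelToken:
    """Hypothetical token shape: a flag that a stop request flips once."""

    def __init__(self):
        self._stopping = False

    def request_stop(self):
        self._stopping = True

    def should_stop(self):
        return self._stopping


# Default None means "no run in progress", so the check is a no-op.
_current_token = contextvars.ContextVar("cancel_token", default=None)


def should_stop():
    token = _current_token.get()
    return token is not None and token.should_stop()


def stop_checked(fn):
    """Entry check: refuse to start the wrapped tool if a stop is pending."""

    @functools.wraps(fn)
    async def wrapper(*args, **kwargs):
        if should_stop():
            raise AgentStopped(fn.__name__)
        return await fn(*args, **kwargs)

    return wrapper


@stop_checked
async def call_llm(prompt):
    # Hypothetical stand-in for tools.call_llm.
    await asyncio.sleep(0.01)
    return f"echo: {prompt}"


async def run_agent(token, log):
    _current_token.set(token)
    try:
        log.append(await call_llm("first"))
        log.append(await call_llm("second"))
        return "completed"
    except AgentStopped:
        return "stopped"
    finally:
        log.append("cleanup")  # e.g. closing an HTTP client


async def demo():
    token, log = CancelToken(), []
    run = asyncio.create_task(run_agent(token, log))
    await asyncio.sleep(0)  # the first call is now in flight
    token.request_stop()
    return await run, log


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

   In this sketch the stop request arrives while the first call is in flight. 
That call still returns its result, the second call raises at its entry check, 
and the `finally` block runs before the run reports `stopped`.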
   **UI:**
   - Stop button next to Run; disabled when no run in flight.
   - While stopping (after the click, before the run halts): the button is 
disabled and shows "Stopping… (after current LLM call completes)".
   - The run's outcome gets a third status, `stopped`, shown with a neutral 
chip in the Progress Log (gray with a stop icon, not red).
   **Stop semantics for chained agents:**
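   When agent X is stopping and X has called Y, Y stops too. Stop means the 
whole tree. Because the token lives in a contextvar, code that X calls within 
the same execution context sees the same token, so Y needs no extra plumbing.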
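
   A small sketch of why the contextvar covers chained agents, assuming Y runs 
in the same process and event loop as X (if the call crosses a process or HTTP 
boundary, the token would have to be forwarded some other way). The agents, 
`check_stop`, and the use of `threading.Event` as the flag are hypothetical.

```python
import asyncio
import contextvars
import threading


class AgentStopped(Exception):
    pass


_stop_flag = contextvars.ContextVar("stop_flag", default=None)


def check_stop(where):
    flag = _stop_flag.get()
    if flag is not None and flag.is_set():
        raise AgentStopped(where)


async def agent_y(events):
    # Y never receives the flag as an argument; it reads the contextvar.
    check_stop("Y step 1")
    events.append("Y step 1")
    await asyncio.sleep(0.01)  # stands in for an in-flight call
    check_stop("Y step 2")
    events.append("Y step 2")


async def agent_x(flag, events):
    _stop_flag.set(flag)
    events.append("X start")
    # create_task copies X's current context, so Y sees the same flag object.
    await asyncio.create_task(agent_y(events))
    events.append("X done")


async def demo_chain():
    flag, events = threading.Event(), []
    run = asyncio.create_task(agent_x(flag, events))
    while "Y step 1" not in events:
        await asyncio.sleep(0)  # wait until Y is underway
    flag.set()  # stop request addressed to X's run
    try:
        await run
        status = "completed"
    except AgentStopped:
        status = "stopped"
    return status, events


if __name__ == "__main__":
    print(asyncio.run(demo_chain()))
```

   The stop is raised inside Y at its next check and propagates up through X, 
so neither "Y step 2" nor "X done" is recorded.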
    
   ### Acceptance Criteria
   - [ ] Fixed: `CancelToken` contextvar threaded through agent execution
   - [ ] Fixed: `tools.call_llm`, `tools.data_store.*`, `gofannon_client.call` 
check `should_stop()` on entry
   - [ ] Fixed: `POST /runs/{run_id}/stop` sets cancel token, responds 202
   - [ ] Fixed: UI Stop button next to Run, disabled appropriately
   - [ ] Fixed: "Stopping…" state shown after click until run actually halts
   - [ ] Fixed: `RunRecord.status = "stopped"` distinguishable from error
   - [ ] Test added: Stop during LLM call halts at next tool boundary (not 
mid-await)
   - [ ] Test added: Chained sub-agent stops when parent receives stop
   - [ ] Test added: Cleanup code (e.g., `http_client.aclose()` in `finally`) 
runs before exit
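
   For the stop endpoint and the `stopped` status, a framework-free sketch of 
the handler logic may help. Everything here is an assumption for illustration: 
the in-memory `RUNS` registry, the `RunRecord` fields, the 404 for an unknown 
run, and the status names `completed` and `error`. The issue only specifies 
the route, the 202 response, and the `stopped` status.

```python
import threading
from dataclasses import dataclass, field


class AgentStopped(Exception):
    pass


@dataclass
class RunRecord:
    # Hypothetical fields; the real RunRecord may look different.
    run_id: str
    status: str = "running"
    stop_requested: threading.Event = field(default_factory=threading.Event)


RUNS = {}  # hypothetical registry: run_id -> RunRecord


def handle_stop(run_id):
    """Logic behind POST /runs/{run_id}/stop; returns (status_code, body)."""
    record = RUNS.get(run_id)
    if record is None:
        return 404, {"detail": "unknown run"}  # assumed behavior
    record.stop_requested.set()  # idempotent; the run halts later
    # 202 Accepted: the request was taken, but the run has not halted yet.
    return 202, {"run_id": run_id, "status": "stopping"}


def final_status(exc):
    """Map how a run ended to a status, keeping stops apart from errors."""
    if exc is None:
        return "completed"
    if isinstance(exc, AgentStopped):
        return "stopped"
    return "error"
```

   Returning 202 rather than 200 matches the cooperative design: the handler 
only sets the token, and the run halts at its next tool boundary.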
   ### References
   - File: `webapp/packages/api/user-service/services/agent_trace.py` (Trace 
contextvar pattern)
   - File: 
`webapp/packages/api/user-service/dependencies.py:_execute_agent_code`
   - File: `webapp/packages/webui/src/pages/AgentCreationFlow/RunsScreen.jsx`
   - Tracker: FIXES.md item #6
   ### Priority
   **Medium** - Depends on ISSUE-003 (run registry) for the cancel token to 
live somewhere addressable by run_id.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

