weiqingy opened a new issue, #549:
URL: https://github.com/apache/flink-agents/issues/549

   ### Search before asking
   
   - [x] I searched in the 
[issues](https://github.com/apache/flink-agents/issues) and found nothing 
similar.
   
   ### Description
   
     ## Summary
   
   Currently, MCP tools and prompts are discovered once — at plan construction 
time (Java MCP servers) or during operator `open()` (Python MCP servers) — and 
frozen for the lifetime of the job. Users want tools and prompts to update 
without restarting the Flink job when MCP servers add, modify, or remove 
capabilities.
   
     ## Motivation
   
     - MCP servers are external services that evolve independently of the Flink 
job
     - Adding a new tool to an MCP server currently requires a full job restart 
to pick it up
     - In long-running agent jobs, tool schemas may change and the agent should 
adapt
     - This is a frequently requested feature from the community (see 
Discussion #516)
   
     ## Proposed Approach
   
   Add a refresh mechanism to `ResourceCache` that re-queries MCP servers, 
either periodically or on demand, and updates the cached tools/prompts.
   
     Key design questions:
     - **Trigger mechanism:** Periodic (configurable interval) vs event-driven 
(external signal) vs both
     - **Scope of refresh:** Full re-discovery (re-list all tools/prompts) vs 
incremental
     - **Handling removed tools:** What should happen if a tool is removed 
mid-conversation? Should the call fail gracefully or raise an error?
     - **Schema changes:** If a tool's input schema changes, how should 
in-flight requests be handled?
     - **Java vs Python MCP servers:** Java MCP servers currently discover 
tools at plan construction via reflection. Python MCP servers discover tools in 
`PythonResourceBridge.discoverPythonMCPResources()`. Both paths need refresh 
support.
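
   To make the design questions above concrete, here is a minimal sketch of one 
possible shape: full re-discovery with an atomic snapshot swap, triggered either 
on demand or on a fixed interval. Everything in it is an assumption for 
illustration. `RefreshableToolCache`, its methods, and the `Supplier` standing 
in for MCP discovery are hypothetical names and are not the existing 
`ResourceCache` API; tool schemas are reduced to plain strings.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical sketch only; not the actual flink-agents ResourceCache. */
final class RefreshableToolCache implements AutoCloseable {
    // Immutable snapshot: tool name -> input schema (a plain string in this sketch).
    private final AtomicReference<Map<String, String>> snapshot;
    // Stand-in for "re-list all tools/prompts from the MCP server".
    private final Supplier<Map<String, String>> discovery;
    // Daemon thread so a forgotten close() does not keep the JVM alive.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "tool-cache-refresh");
                t.setDaemon(true);
                return t;
            });

    RefreshableToolCache(Supplier<Map<String, String>> discovery) {
        this.discovery = discovery;
        // Initial discovery, equivalent to today's one-time load.
        this.snapshot = new AtomicReference<>(Map.copyOf(discovery.get()));
    }

    /** On-demand trigger: full re-discovery followed by a single atomic swap. */
    boolean refresh() {
        try {
            snapshot.set(Map.copyOf(discovery.get()));
            return true;
        } catch (RuntimeException e) {
            // Discovery failed: keep serving the previous snapshot.
            return false;
        }
    }

    /** Periodic trigger: re-runs refresh() with a fixed delay between runs. */
    void startPeriodicRefresh(long interval, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(this::refresh, interval, interval, unit);
    }

    /** Lets an in-flight request pin one consistent view of tools and schemas. */
    Map<String, String> currentSnapshot() {
        return snapshot.get();
    }

    /** Empty result means the tool is not in the latest snapshot (e.g. removed). */
    Optional<String> lookup(String toolName) {
        return Optional.ofNullable(snapshot.get().get(toolName));
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

   In this sketch a request that calls `currentSnapshot()` once keeps seeing the 
schema it started with, while new requests pick up the refreshed view, and a 
removed tool surfaces as an empty `Optional` rather than an exception. Whether 
those are the right answers to the questions above is still open, and inside a 
Flink operator the periodic trigger may be better driven by the operator's own 
timer facilities than by a separate thread.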
   
     ## Key Files
   
     - `plan/src/main/java/.../plan/ResourceCache.java` — owns the cached 
resources, would gain refresh/invalidation methods
     - `plan/src/main/java/.../plan/PythonResourceBridge.java` — Python MCP 
discovery, would be called on refresh
     - `runtime/src/main/java/.../operator/ActionExecutionOperator.java` — owns 
the `ResourceCache` lifecycle, would trigger refreshes
     - `integrations/mcp/` — Java MCP server integration
   
     ## Dependencies
   
   **Blocked by [#547](https://github.com/apache/flink-agents/issues/547)** 
(Refactor AgentPlan & ResourceCache). The refactoring extracts `ResourceCache` 
as a standalone runtime object owned by the operator, which is the prerequisite 
for being able to refresh cached resources without mutating the immutable 
`AgentPlan`. The PR for that refactoring is currently under review; 
implementation of this feature can begin once it lands.
   
     ## References
   
   - Originally proposed in [Discussion 
#516](https://github.com/apache/flink-agents/discussions/516) under "Loading 
MCP Tools & Prompts in runtime - in case of updates on the MCP server side"
   
   ### Are you willing to submit a PR?
   
   - [x] I'm willing to submit a PR!

