weiqingy opened a new issue, #549: URL: https://github.com/apache/flink-agents/issues/549
### Search before asking

- [x] I searched in the [issues](https://github.com/apache/flink-agents/issues) and found nothing similar.

### Description

## Summary

Currently, MCP tools and prompts are discovered once, either at plan construction time (Java MCP servers) or during operator `open()` (Python MCP servers), and are then frozen for the lifetime of the job. Users want tools and prompts to update without restarting the Flink job when MCP servers add, modify, or remove capabilities.

## Motivation

- MCP servers are external services that evolve independently of the Flink job
- Adding a new tool to an MCP server currently requires a full job restart to pick it up
- In long-running agent jobs, tool schemas may change and the agent should adapt
- This is a frequently requested feature from the community (see Discussion #516)

## Proposed Approach

Add a refresh mechanism to `ResourceCache` that periodically or on-demand re-queries MCP servers and updates the cached tools/prompts.

Key design questions:

- **Trigger mechanism:** Periodic (configurable interval) vs event-driven (external signal) vs both
- **Scope of refresh:** Full re-discovery (re-list all tools/prompts) vs incremental
- **Handling removed tools:** What happens if a tool is removed mid-conversation? Fail gracefully or error?
- **Schema changes:** If a tool's input schema changes, how should in-flight requests be handled?
- **Java vs Python MCP servers:** Java MCP servers currently discover tools at plan construction via reflection. Python MCP servers discover tools in `PythonResourceBridge.discoverPythonMCPResources()`. Both paths need refresh support.

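To make the design questions above more concrete, here is a minimal sketch of one possible combination: a periodic trigger plus an on-demand trigger, full re-discovery, and an atomic snapshot swap. Everything in it is an assumption for illustration, not the flink-agents API: `RefreshableToolCache`, its methods, and the `Supplier` that stands in for an MCP tool-listing call are hypothetical names, and tools are reduced to a name-to-schema map.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical sketch; not the actual ResourceCache in flink-agents. */
final class RefreshableToolCache implements AutoCloseable {

    // Stands in for a full "list tools" round trip to an MCP server (assumption).
    // Keys are tool names, values are input schemas as JSON strings.
    private final Supplier<Map<String, String>> discovery;

    // Readers always see one complete, immutable snapshot.
    private final AtomicReference<Map<String, String>> snapshot =
            new AtomicReference<>(Map.of());

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "mcp-refresh");
                t.setDaemon(true); // do not keep the JVM alive for refreshes
                return t;
            });

    RefreshableToolCache(Supplier<Map<String, String>> discovery) {
        this.discovery = discovery;
        refresh(); // initial discovery; an empty snapshot remains if it fails
    }

    /** On-demand trigger: full re-discovery followed by an atomic swap. */
    boolean refresh() {
        try {
            snapshot.set(Map.copyOf(discovery.get()));
            return true;
        } catch (RuntimeException e) {
            // Server unreachable or bad response: keep the last good snapshot.
            return false;
        }
    }

    /** Periodic trigger with a configurable interval. */
    void startPeriodicRefresh(long interval, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(this::refresh, interval, interval, unit);
    }

    /** Empty result means the tool is unknown or was removed by a refresh. */
    Optional<String> lookupSchema(String toolName) {
        return Optional.ofNullable(snapshot.get().get(toolName));
    }

    /** A caller can pin one snapshot for the duration of an in-flight request. */
    Map<String, String> currentSnapshot() {
        return snapshot.get();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

In this sketch a failed refresh leaves the previous snapshot in place, and a removed tool surfaces as an empty `Optional` that the caller can turn into a graceful failure. An in-flight request that holds the map returned by `currentSnapshot()` keeps seeing the schemas it started with, which is one possible answer to the schema-change question, not necessarily the one the project will choose.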
## Key Files

- `plan/src/main/java/.../plan/ResourceCache.java`: owns the cached resources, would gain refresh/invalidation methods
- `plan/src/main/java/.../plan/PythonResourceBridge.java`: Python MCP discovery, would be called on refresh
- `runtime/src/main/java/.../operator/ActionExecutionOperator.java`: owns the `ResourceCache` lifecycle, would trigger refreshes
- `integrations/mcp/`: Java MCP server integration

## Dependencies

**Blocked by [#547](https://github.com/apache/flink-agents/issues/547)** (Refactor AgentPlan & ResourceCache). The refactoring extracts `ResourceCache` as a standalone runtime object owned by the operator, which is the prerequisite for being able to refresh cached resources without mutating the immutable `AgentPlan`. That PR is currently under review; implementation of this feature can begin once it lands.

## References

- Originally proposed in [Discussion #516](https://github.com/apache/flink-agents/discussions/516) under "Loading MCP Tools & Prompts in runtime - in case of updates on the MCP server side"

### Are you willing to submit a PR?

- [x] I'm willing to submit a PR!
