kaxil opened a new pull request, #62904: URL: https://github.com/apache/airflow/pull/62904
(Part of https://github.com/orgs/apache/projects/586)

Adds MCP (Model Context Protocol) server support to the Common AI provider (AIP-99 Phase 5). Users can now connect AI agents to MCP servers — the open protocol that lets LLMs interact with external tools through a standardized interface.

Two new components:

- **`MCPToolset`** — resolves MCP server config from an Airflow connection and delegates to PydanticAI's MCP server classes. Stores URLs, auth tokens, and commands in Airflow connections/secret backends instead of hardcoding them in DAG code.
- **`MCPHook`** — dedicated `mcp` connection type with UI fields for transport (HTTP/SSE/stdio), command, args, and auth token.

## Design decisions

**Three tiers of toolset usage** — the docs and examples make clear that:

1. `MCPToolset` (recommended) — Airflow connection management, secret backends, connection UI
2. Direct PydanticAI MCP servers (`MCPServerStreamableHTTP`, `MCPServerStdio`) — for prototyping or full control
3. Any `AbstractToolset` — `AgentOperator` accepts any PydanticAI-compatible toolset, no lock-in

**Thin delegation, not reimplementation** — `MCPToolset` wraps PydanticAI's MCP servers and delegates `get_tools()`, `call_tool()`, and `__aenter__`/`__aexit__`. The lifecycle delegation keeps the MCP connection open across tool calls in a multi-turn agent conversation instead of reconnecting on every call.

**Auth via Bearer header** — the connection's password field is passed as `Authorization: Bearer <token>` to HTTP/SSE servers. Stdio transport doesn't use auth (the server runs as a local subprocess).

**`args` coercion** — if a user enters the `args` extra field as a bare string instead of a JSON array, it is treated as a single-element list rather than being split character by character.
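The thin-delegation pattern can be sketched as follows. The `StubMCPServer` and `DelegatingToolset` classes below are illustrative stand-ins, not PydanticAI's actual API; the point is that forwarding `__aenter__`/`__aexit__` to the inner server means one connection serves every tool call inside the context:

```python
import asyncio


class StubMCPServer:
    """Stand-in for a PydanticAI MCP server class (hypothetical)."""

    def __init__(self):
        self.connects = 0  # counts how many times we "open" the connection

    async def __aenter__(self):
        self.connects += 1
        return self

    async def __aexit__(self, *exc):
        return False

    async def call_tool(self, name, args):
        return f"{name}: ok"


class DelegatingToolset:
    """Thin wrapper in the spirit of MCPToolset: forward lifecycle and
    tool calls to the wrapped server instead of reimplementing MCP."""

    def __init__(self, server):
        self._server = server

    async def __aenter__(self):
        await self._server.__aenter__()
        return self

    async def __aexit__(self, *exc):
        return await self._server.__aexit__(*exc)

    async def call_tool(self, name, args):
        return await self._server.call_tool(name, args)


async def main():
    server = StubMCPServer()
    async with DelegatingToolset(server) as toolset:
        # Two tool calls in one multi-turn conversation reuse the
        # single open connection — no reconnect per call.
        await toolset.call_tool("list_files", {})
        await toolset.call_tool("run_code", {})
    return server.connects
```

Without the lifecycle delegation, each `call_tool` would have to open and tear down its own session, which is exactly the reconnect-per-call behavior the PR avoids.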
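A minimal sketch of the connection-resolution behavior described above (Bearer auth and `args` coercion). The function name `resolve_mcp_config` and the plain-dict connection are hypothetical simplifications; the real `MCPToolset` works against an Airflow `Connection` object and may differ in detail:

```python
import json


def resolve_mcp_config(conn: dict) -> dict:
    """Hypothetical resolver mirroring the behavior described in the PR."""
    extra = json.loads(conn.get("extra") or "{}")
    transport = extra.get("transport", "http")
    config = {"transport": transport}

    if transport == "stdio":
        raw_args = extra.get("args", [])
        # Coerce a bare string into a one-element list; naive list("abc")
        # would wrongly split the string into characters.
        config["command"] = extra.get("command")
        config["args"] = [raw_args] if isinstance(raw_args, str) else list(raw_args)
    else:
        # HTTP/SSE: the connection's password becomes a Bearer token.
        config["url"] = conn.get("host")
        if conn.get("password"):
            config["headers"] = {"Authorization": f"Bearer {conn['password']}"}

    return config
```

For example, an HTTP connection with a password yields an `Authorization` header, while a stdio connection whose `args` extra is the bare string `"mcp-run-python"` yields `["mcp-run-python"]`.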
## Usage

```python
from airflow.providers.common.ai.operators.agent import AgentOperator
from airflow.providers.common.ai.toolsets.mcp import MCPToolset

AgentOperator(
    task_id="mcp_agent",
    prompt="What tools are available?",
    llm_conn_id="pydantic_ai_default",
    toolsets=[
        MCPToolset(mcp_conn_id="my_mcp_server"),
        MCPToolset(mcp_conn_id="code_runner", tool_prefix="code"),
    ],
)
```

Connection config (HTTP):

```json
{"conn_type": "mcp", "host": "http://localhost:3001/mcp"}
```

Connection config (stdio):

```json
{"conn_type": "mcp", "extra": "{\"transport\": \"stdio\", \"command\": \"uvx\", \"args\": [\"mcp-run-python\"]}"}
```

## What's not included

- **No MCP resource/sampling/elicitation** — just tool exposure. Can add later.
- **No MCP server management** — Airflow doesn't start/stop MCP servers. HTTP servers run externally; stdio servers are spawned by PydanticAI as subprocesses.

Requires the `mcp` optional extra: `pip install "apache-airflow-providers-common-ai[mcp]"`

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
