This is an automated email from the ASF dual-hosted git repository.
xtsong pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/flink-agents.git
The following commit(s) were added to refs/heads/main by this push:
new 48c62b2 [docs] Add docs for chat models and prompts (#245)
48c62b2 is described below
commit 48c62b2a75bd9dae6dd033c93c55cb154410747e
Author: Alan Z. <[email protected]>
AuthorDate: Sat Oct 4 05:14:55 2025 -0700
[docs] Add docs for chat models and prompts (#245)
---
docs/content/docs/development/chat_models.md | 438 +++++++++++++++++++++
docs/content/docs/development/chat_with_llm.md | 57 ---
docs/content/docs/development/embedding_models.md | 2 +-
.../docs/development/integrate_with_flink.md | 2 +-
docs/content/docs/development/prompts.md | 279 +++++++++++++
docs/content/docs/development/react_agent.md | 4 +-
docs/content/docs/development/tool_use.md | 2 +-
docs/content/docs/development/vector_stores.md | 2 +-
docs/content/docs/development/workflow_agent.md | 4 +-
9 files changed, 725 insertions(+), 65 deletions(-)
diff --git a/docs/content/docs/development/chat_models.md
b/docs/content/docs/development/chat_models.md
new file mode 100644
index 0000000..741eadd
--- /dev/null
+++ b/docs/content/docs/development/chat_models.md
@@ -0,0 +1,438 @@
+---
+title: Chat Models
+weight: 3
+type: docs
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Chat Models
+
+## Overview
+
+Chat models enable agents to communicate with Large Language Models (LLMs) for
natural language understanding, reasoning, and generation. In Flink Agents,
chat models act as the "brain" of your agents, processing input messages and
generating intelligent responses based on context, prompts, and available tools.
+
+## Getting Started
+
+To use chat models in your agents, you need to define both a connection and a
setup using decorators, then interact with the model through events.
+
+### Resource Decorators
+
+Flink Agents provides decorators to simplify chat model setup within agents:
+
+#### @chat_model_connection
+
+The `@chat_model_connection` decorator marks a method that creates a chat
model connection. This is typically defined once and shared across multiple
chat model setups.
+
+#### @chat_model_setup
+
+The `@chat_model_setup` decorator marks a method that creates a chat model
setup. This references a connection and adds chat-specific configuration like
prompts and tools.
+
+### Chat Events
+
+Chat models communicate through built-in events:
+
+- **ChatRequestEvent**: Sent by actions to request a chat completion from the
LLM
+- **ChatResponseEvent**: Received by actions containing the LLM's response
+
+### Usage Example
+
+Here's how to define and use chat models in a workflow agent:
+
+```python
+class MyAgent(Agent):
+
+ @chat_model_connection
+ @staticmethod
+ def ollama_connection() -> ResourceDescriptor:
+ return ResourceDescriptor(
+ clazz=OllamaChatModelConnection,
+ base_url="http://localhost:11434",
+ request_timeout=30.0
+ )
+
+ @chat_model_setup
+ @staticmethod
+ def ollama_chat_model() -> ResourceDescriptor:
+ return ResourceDescriptor(
+ clazz=OllamaChatModelSetup,
+ connection="ollama_connection",
+ model="qwen3:8b",
+ temperature=0.7
+ )
+
+ @action(InputEvent)
+ @staticmethod
+ def process_input(event: InputEvent, ctx: RunnerContext) -> None:
+ # Create a chat request with user message
+ user_message = ChatMessage(
+ role=MessageRole.USER,
+ content=f"input: {event.input}"
+ )
+ ctx.send_event(
+            ChatRequestEvent(model="ollama_chat_model", messages=[user_message])
+ )
+
+ @action(ChatResponseEvent)
+ @staticmethod
+ def process_response(event: ChatResponseEvent, ctx: RunnerContext) -> None:
+ response_content = event.response.content
+ # Handle the LLM's response
+ # Process the response as needed for your use case
+```
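+
+A minimal way to complete the `process_response` action is to forward the model's
+reply downstream. The sketch below assumes an `OutputEvent` (or an equivalent
+output mechanism in your setup) is used to emit the agent's result:
+
+```python
+    @action(ChatResponseEvent)
+    @staticmethod
+    def process_response(event: ChatResponseEvent, ctx: RunnerContext) -> None:
+        # Read the LLM's reply from the response event
+        response_content = event.response.content
+        # Forward the reply as the agent's output (OutputEvent is assumed here)
+        ctx.send_event(OutputEvent(output=response_content))
+```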
+
+## Built-in Providers
+
+### Anthropic
+
+Anthropic provides cloud-based chat models featuring the Claude family, known
for its strong reasoning, coding, and safety capabilities.
+
+#### Prerequisites
+
+1. Get an API key from [Anthropic Console](https://console.anthropic.com/)
+
+#### AnthropicChatModelConnection Parameters
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `api_key` | str | Required | Anthropic API key for authentication |
+| `max_retries` | int | `3` | Maximum number of API retry attempts |
+| `timeout` | float | `60.0` | API request timeout in seconds |
+
+#### AnthropicChatModelSetup Parameters
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `connection` | str | Required | Reference to connection method name |
+| `model` | str | `"claude-sonnet-4-20250514"` | Name of the chat model to use
|
+| `prompt` | Prompt \| str | None | Prompt template or reference to prompt
resource |
+| `tools` | List[str] | None | List of tool names available to the model |
+| `max_tokens` | int | `1024` | Maximum number of tokens to generate |
+| `temperature` | float | `0.1` | Sampling temperature (0.0 to 1.0) |
+| `additional_kwargs` | dict | `{}` | Additional Anthropic API parameters |
+
+#### Usage Example
+
+```python
+class MyAgent(Agent):
+
+ @chat_model_connection
+ @staticmethod
+ def anthropic_connection() -> ResourceDescriptor:
+ return ResourceDescriptor(
+ clazz=AnthropicChatModelConnection,
+ api_key="your-api-key-here", # Or set ANTHROPIC_API_KEY env var
+ max_retries=3,
+ timeout=60.0
+ )
+
+ @chat_model_setup
+ @staticmethod
+ def anthropic_chat_model() -> ResourceDescriptor:
+ return ResourceDescriptor(
+ clazz=AnthropicChatModelSetup,
+ connection="anthropic_connection",
+ model="claude-sonnet-4-20250514",
+ max_tokens=2048,
+ temperature=0.7
+ )
+
+ ...
+```
+
+#### Available Models
+
+Visit the [Anthropic Models
documentation](https://docs.anthropic.com/en/docs/about-claude/models) for the
complete and up-to-date list of available chat models.
+
+Some popular options include:
+- **Claude Sonnet 4.5** (claude-sonnet-4-5-20250929)
+- **Claude Sonnet 4** (claude-sonnet-4-20250514)
+- **Claude Sonnet 3.7** (claude-3-7-sonnet-20250219)
+- **Claude Opus 4.1** (claude-opus-4-1-20250805)
+
+{{< hint warning >}}
+Model availability and specifications may change. Always check the official
Anthropic documentation for the latest information before implementing in
production.
+{{< /hint >}}
+
+### Ollama
+
+Ollama provides local chat models that run on your machine, offering privacy,
control, and no API costs.
+
+#### Prerequisites
+
+1. Install Ollama from [https://ollama.com/](https://ollama.com/)
+2. Start the Ollama server: `ollama serve`
+3. Download a chat model: `ollama pull qwen3:8b`
+
+#### OllamaChatModelConnection Parameters
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `base_url` | str | `"http://localhost:11434"` | Ollama server URL |
+| `request_timeout` | float | `30.0` | HTTP request timeout in seconds |
+
+#### OllamaChatModelSetup Parameters
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `connection` | str | Required | Reference to connection method name |
+| `model` | str | Required | Name of the chat model to use |
+| `prompt` | Prompt \| str | None | Prompt template or reference to prompt
resource |
+| `tools` | List[str] | None | List of tool names available to the model |
+| `temperature` | float | `0.75` | Sampling temperature (0.0 to 1.0) |
+| `num_ctx` | int | `2048` | Maximum number of context tokens |
+| `keep_alive` | str \| float | `"5m"` | How long to keep model loaded in
memory |
+| `extract_reasoning` | bool | `True` | Extract reasoning content from
response |
+| `additional_kwargs` | dict | `{}` | Additional Ollama API parameters |
+
+#### Usage Example
+
+```python
+class MyAgent(Agent):
+
+ @chat_model_connection
+ @staticmethod
+ def ollama_connection() -> ResourceDescriptor:
+ return ResourceDescriptor(
+ clazz=OllamaChatModelConnection,
+ base_url="http://localhost:11434",
+ request_timeout=120.0
+ )
+
+ @chat_model_setup
+ @staticmethod
+ def my_chat_model() -> ResourceDescriptor:
+ return ResourceDescriptor(
+ clazz=OllamaChatModelSetup,
+ connection="ollama_connection",
+ model="qwen3:8b",
+ temperature=0.7,
+ num_ctx=4096,
+ keep_alive="10m",
+ extract_reasoning=True
+ )
+
+ ...
+```
+
+#### Available Models
+
+Visit the [Ollama Models Library](https://ollama.com/library) for the complete
and up-to-date list of available chat models.
+
+Some popular options include:
+- **qwen3** series (qwen3:8b, qwen3:14b, qwen3:32b)
+- **llama3** series (llama3:8b, llama3:70b)
+- **deepseek** series (deepseek-r1, deepseek-v3.1)
+- **gpt-oss**
+
+{{< hint warning >}}
+Model availability and specifications may change. Always check the official
Ollama documentation for the latest information before implementing in
production.
+{{< /hint >}}
+
+### OpenAI
+
+OpenAI provides cloud-based chat models with state-of-the-art performance for
a wide range of natural language tasks.
+
+#### Prerequisites
+
+1. Get an API key from [OpenAI Platform](https://platform.openai.com/)
+2. Set the API key as an environment variable: `export
OPENAI_API_KEY=your-api-key`
+
+#### OpenAIChatModelConnection Parameters
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `api_key` | str | `$OPENAI_API_KEY` | OpenAI API key for authentication |
+| `api_base_url` | str | `"https://api.openai.com/v1"` | Base URL for OpenAI
API |
+| `max_retries` | int | `3` | Maximum number of API retry attempts |
+| `timeout` | float | `60.0` | API request timeout in seconds |
+| `default_headers` | dict | None | Default headers for API requests |
+| `reuse_client` | bool | `True` | Whether to reuse the OpenAI client between
requests |
+
+#### OpenAIChatModelSetup Parameters
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `connection` | str | Required | Reference to connection method name |
+| `model` | str | `"gpt-3.5-turbo"` | Name of the chat model to use |
+| `prompt` | Prompt \| str | None | Prompt template or reference to prompt
resource |
+| `tools` | List[str] | None | List of tool names available to the model |
+| `temperature` | float | `0.1` | Sampling temperature (0.0 to 2.0) |
+| `max_tokens` | int | None | Maximum number of tokens to generate |
+| `logprobs` | bool | None | Whether to return log probabilities per token |
+| `top_logprobs` | int | `0` | Number of top token log probabilities to return
(0-20) |
+| `strict` | bool | `False` | Enable strict mode for tool calling and schemas |
+| `reasoning_effort` | str | None | Reasoning effort level for reasoning
models ("low", "medium", "high") |
+| `additional_kwargs` | dict | `{}` | Additional OpenAI API parameters |
+
+#### Usage Example
+
+```python
+class MyAgent(Agent):
+
+ @chat_model_connection
+ @staticmethod
+ def openai_connection() -> ResourceDescriptor:
+ return ResourceDescriptor(
+ clazz=OpenAIChatModelConnection,
+ api_key="your-api-key-here", # Or set OPENAI_API_KEY env var
+ api_base_url="https://api.openai.com/v1",
+ max_retries=3,
+ timeout=60.0
+ )
+
+ @chat_model_setup
+ @staticmethod
+ def openai_chat_model() -> ResourceDescriptor:
+ return ResourceDescriptor(
+ clazz=OpenAIChatModelSetup,
+ connection="openai_connection",
+ model="gpt-4",
+ temperature=0.7,
+ max_tokens=1000
+ )
+
+ ...
+```
+
+#### Available Models
+
+Visit the [OpenAI Models
documentation](https://platform.openai.com/docs/models) for the complete and
up-to-date list of available chat models.
+
+Some popular options include:
+- **GPT-5** series (GPT-5, GPT-5 mini, GPT-5 nano)
+- **GPT-4.1**
+- **gpt-oss** series (gpt-oss-120b, gpt-oss-20b)
+
+{{< hint warning >}}
+Model availability and specifications may change. Always check the official
OpenAI documentation for the latest information before implementing in
production.
+{{< /hint >}}
+
+### Tongyi (DashScope)
+
+Tongyi provides cloud-based chat models from Alibaba Cloud, offering powerful
Chinese and English language capabilities.
+
+#### Prerequisites
+
+1. Get an API key from [Alibaba Cloud DashScope](https://dashscope.aliyun.com/)
+
+#### TongyiChatModelConnection Parameters
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `api_key` | str | `$DASHSCOPE_API_KEY` | DashScope API key for
authentication |
+| `request_timeout` | float | `60.0` | HTTP request timeout in seconds |
+
+#### TongyiChatModelSetup Parameters
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `connection` | str | Required | Reference to connection method name |
+| `model` | str | `"qwen-plus"` | Name of the chat model to use |
+| `prompt` | Prompt \| str | None | Prompt template or reference to prompt
resource |
+| `tools` | List[str] | None | List of tool names available to the model |
+| `temperature` | float | `0.7` | Sampling temperature (0.0 to 2.0) |
+| `extract_reasoning` | bool | `False` | Extract reasoning content from
response |
+| `additional_kwargs` | dict | `{}` | Additional DashScope API parameters |
+
+#### Usage Example
+
+```python
+class MyAgent(Agent):
+
+ @chat_model_connection
+ @staticmethod
+ def tongyi_connection() -> ResourceDescriptor:
+ return ResourceDescriptor(
+ clazz=TongyiChatModelConnection,
+ api_key="your-api-key-here", # Or set DASHSCOPE_API_KEY env var
+ request_timeout=60.0
+ )
+
+ @chat_model_setup
+ @staticmethod
+ def tongyi_chat_model() -> ResourceDescriptor:
+ return ResourceDescriptor(
+ clazz=TongyiChatModelSetup,
+ connection="tongyi_connection",
+ model="qwen-plus",
+ temperature=0.7,
+ extract_reasoning=True
+ )
+
+ ...
+```
+
+#### Available Models
+
+Visit the [DashScope Models
documentation](https://help.aliyun.com/zh/dashscope/developer-reference/model-introduction)
for the complete and up-to-date list of available chat models.
+
+Some popular options include:
+- **qwen-plus**
+- **qwen-max**
+- **qwen-turbo**
+- **qwen-long**
+
+{{< hint warning >}}
+Model availability and specifications may change. Always check the official
DashScope documentation for the latest information before implementing in
production.
+{{< /hint >}}
+
+## Custom Providers
+
+{{< hint warning >}}
+The custom provider APIs are experimental and unstable, subject to
incompatible changes in future releases.
+{{< /hint >}}
+
+If you want to use chat models not offered by the built-in providers, you can
extend the base chat classes and implement your own! The chat model system is
built around two main abstract classes:
+
+### BaseChatModelConnection
+
+Handles the connection to chat model services and provides the core chat
functionality.
+
+```python
+class MyChatModelConnection(BaseChatModelConnection):
+
+ def chat(
+ self,
+ messages: Sequence[ChatMessage],
+ tools: List[Tool] | None = None,
+ **kwargs: Any,
+ ) -> ChatMessage:
+ # Core method: send messages to LLM and return response
+ # - messages: Input message sequence
+ # - tools: Optional list of tools available to the model
+ # - kwargs: Additional parameters from model_kwargs
+ # - Returns: ChatMessage with the model's response
+ pass
+```
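+
+For illustration, here is how the `chat` method might be fleshed out against a
+simple HTTP service. This is only a sketch under assumptions: the endpoint,
+payload shape, and `base_url` field are hypothetical, and the exact
+`MessageRole` serialization (and the `ASSISTANT` role name) should be verified
+against your Flink Agents version:
+
+```python
+import requests  # any HTTP client would do; used here for illustration only
+
+
+class MyChatModelConnection(BaseChatModelConnection):
+    base_url: str = "http://localhost:8080"  # hypothetical configuration field
+
+    def chat(
+        self,
+        messages: Sequence[ChatMessage],
+        tools: List[Tool] | None = None,
+        **kwargs: Any,
+    ) -> ChatMessage:
+        # Serialize the conversation into the shape the (hypothetical) service expects
+        payload = {
+            "messages": [
+                # MessageRole is assumed to be an Enum; adjust serialization if needed
+                {"role": m.role.value, "content": m.content} for m in messages
+            ],
+            **kwargs,  # model, temperature, etc. forwarded from model_kwargs
+        }
+        # Tool descriptions could also be added to the payload; omitted in this sketch
+        response = requests.post(f"{self.base_url}/chat", json=payload, timeout=60)
+        response.raise_for_status()
+        # Wrap the reply in a ChatMessage so the framework can route it onward
+        return ChatMessage(
+            role=MessageRole.ASSISTANT,  # assumed role name; verify in your version
+            content=response.json()["content"],
+        )
+```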
+
+### BaseChatModelSetup
+
+The setup class acts as a high-level configuration interface that defines
which connection to use and how to configure the chat model.
+
+```python
+class MyChatModelSetup(BaseChatModelSetup):
+ # Add your custom configuration fields here
+
+ @property
+ def model_kwargs(self) -> Dict[str, Any]:
+ # Return model-specific configuration passed to chat()
+ # This dictionary is passed as **kwargs to the chat() method
+        return {"model": self.model, "temperature": 0.7}  # plus any other model-specific kwargs
+```
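+
+Once both classes are defined, they can be wired into an agent exactly like the
+built-in providers. A minimal sketch, assuming hypothetical `base_url` and
+`model` fields on the custom classes (replace them with whatever configuration
+your implementation actually defines):
+
+```python
+class MyAgent(Agent):
+
+    @chat_model_connection
+    @staticmethod
+    def my_connection() -> ResourceDescriptor:
+        # Point the descriptor at the custom connection class
+        return ResourceDescriptor(
+            clazz=MyChatModelConnection,
+            base_url="http://localhost:8080",  # hypothetical configuration field
+        )
+
+    @chat_model_setup
+    @staticmethod
+    def my_chat_model() -> ResourceDescriptor:
+        # Reference the connection by its method name, as with built-in providers
+        return ResourceDescriptor(
+            clazz=MyChatModelSetup,
+            connection="my_connection",
+            model="my-model",  # hypothetical configuration field
+        )
+
+    ...
+```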
diff --git a/docs/content/docs/development/chat_with_llm.md
b/docs/content/docs/development/chat_with_llm.md
deleted file mode 100644
index 78b80f1..0000000
--- a/docs/content/docs/development/chat_with_llm.md
+++ /dev/null
@@ -1,57 +0,0 @@
----
-title: Chat with LLM
-weight: 3
-type: docs
----
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements. See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership. The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License. You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied. See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-## ChatModel
-
-{{< hint warning >}}
-**TODO**: What is ChatModel. How to define and ChatModelConnection, and
ChatModelSetup. How to reuse ChatModelConnection in ChatModelSetup.
-
-**TODO**: How to send ChatRequestEvent and handle ChatResponseEvent.
-{{< /hint >}}
-
-{{< hint warning >}}
-**TODO**: List of all built-in Model and configuration, Ollama, Tongyi, etc.
-{{< /hint >}}
-
-### Ollama
-
-### Tongyi
-
-## Prompt
-
-{{< hint warning >}}
-**TODO**: What is Prompt. What are the differences between Local Prompt and
MCP Prompt.
-{{< /hint >}}
-
-### Local Prompt
-
-{{< hint warning >}}
-**TODO**: How to define and use a Local Prompt.
-{{< /hint >}}
-
-### MCP Prompt
-
-{{< hint warning >}}
-**TODO**: Link to MCP Prompt documentation.
-{{< /hint >}}
diff --git a/docs/content/docs/development/embedding_models.md
b/docs/content/docs/development/embedding_models.md
index 046d7a1..d24912c 100644
--- a/docs/content/docs/development/embedding_models.md
+++ b/docs/content/docs/development/embedding_models.md
@@ -1,6 +1,6 @@
---
title: Embedding Models
-weight: 4
+weight: 5
type: docs
---
<!--
diff --git a/docs/content/docs/development/integrate_with_flink.md
b/docs/content/docs/development/integrate_with_flink.md
index 676eb9d..bad997e 100644
--- a/docs/content/docs/development/integrate_with_flink.md
+++ b/docs/content/docs/development/integrate_with_flink.md
@@ -1,6 +1,6 @@
---
title: Integrate with Flink
-weight: 7
+weight: 8
type: docs
---
<!--
diff --git a/docs/content/docs/development/prompts.md
b/docs/content/docs/development/prompts.md
new file mode 100644
index 0000000..c28f9ff
--- /dev/null
+++ b/docs/content/docs/development/prompts.md
@@ -0,0 +1,279 @@
+---
+title: Prompts
+weight: 4
+type: docs
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Prompts
+
+## Overview
+
+Prompts are templates that define how your agents communicate with Large
Language Models (LLMs). They provide structured instructions, context, and
formatting guidelines that shape the LLM's responses. In Flink Agents, prompts
are first-class resources that can be defined, reused, and referenced across
agents and chat models.
+
+## Prompt Types
+
+Flink Agents supports two types of prompts:
+
+### Local Prompt
+
+Local prompts are templates defined directly in your code. They support
variable substitution using `{variable_name}` syntax and can be created from
either text strings or message sequences.
+
+### MCP Prompt
+
+MCP (Model Context Protocol) prompts are managed by external MCP servers. They
enable dynamic prompt retrieval, centralized prompt management, and integration
with external prompt repositories.
+
+## Local Prompt
+
+### Creating from Text
+
+The simplest way to create a prompt is from a text string using
`Prompt.from_text()`:
+
+```python
+product_suggestion_prompt_str = """
+Based on the rating distribution and user dissatisfaction reasons, generate
three actionable suggestions for product improvement.
+
+Input format:
+{{
+ "id": "1",
+ "score_histogram": ["10%", "20%", "10%", "15%", "45%"],
+ "unsatisfied_reasons": ["reason1", "reason2", "reason3"]
+}}
+
+Ensure that your response can be parsed by Python's json module; use the
following format as an example:
+{{
+ "suggestion_list": [
+ "suggestion1",
+ "suggestion2",
+ "suggestion3"
+ ]
+}}
+
+input:
+{input}
+"""
+
+product_suggestion_prompt = Prompt.from_text(product_suggestion_prompt_str)
+```
+
+**Key points:**
+- Use `{variable_name}` for template variables that will be substituted at
runtime
+- Escape literal braces by doubling them: `{{` and `}}`
+
+### Creating from Messages
+
+For more control, create prompts from a sequence of `ChatMessage` objects
using `Prompt.from_messages()`:
+
+```python
+review_analysis_prompt = Prompt.from_messages(
+ messages=[
+ ChatMessage(
+ role=MessageRole.SYSTEM,
+ content="""
+ Analyze the user review and product information to determine a
+ satisfaction score (1-5) and potential reasons for dissatisfaction.
+
+ Example input format:
+ {{
+ "id": "12345",
+ "review": "The headphones broke after one week of use."
+ }}
+
+ Ensure your response can be parsed by Python JSON:
+ {{
+ "id": "12345",
+ "score": 1,
+ "reasons": ["poor quality"]
+ }}
+ """,
+ ),
+ ChatMessage(
+ role=MessageRole.USER,
+ content="""
+ "input":
+ {input}
+ """,
+ ),
+ ],
+)
+```
+
+**Key points:**
+- Define multiple messages with different roles (SYSTEM, USER)
+- Each message can have its own template variables
+
+### Using Prompts in Agents
+
+Register a prompt as an agent resource using the `@prompt` decorator:
+
+```python
+class ReviewAnalysisAgent(Agent):
+
+ @prompt
+ @staticmethod
+ def review_analysis_prompt() -> Prompt:
+ """Prompt for review analysis."""
+ return Prompt.from_messages(
+ messages=[
+ ChatMessage(
+ role=MessageRole.SYSTEM,
+ content="""
+ Analyze the user review and product information to determine a
+ satisfaction score (1-5) and potential reasons for dissatisfaction.
+
+ Example input format:
+ {{
+ "id": "12345",
+ "review": "The headphones broke after one week of use."
+ }}
+
+ Ensure your response can be parsed by Python JSON:
+ {{
+ "id": "12345",
+ "score": 1,
+ "reasons": ["poor quality"]
+ }}
+ """,
+ ),
+ ChatMessage(
+ role=MessageRole.USER,
+ content="""
+ "input":
+ {input}
+ """,
+ ),
+ ],
+ )
+
+ @chat_model_setup
+ @staticmethod
+ def review_analysis_model() -> ResourceDescriptor:
+        """ChatModel focused on review analysis."""
+ return ResourceDescriptor(
+ clazz=OllamaChatModelSetup,
+ connection="ollama_server",
+ model="qwen3:8b",
+ prompt="review_analysis_prompt",
+ extract_reasoning=True,
+ )
+
+ @action(InputEvent)
+ @staticmethod
+ def process_input(event: InputEvent, ctx: RunnerContext) -> None:
+ """Process input event and send chat request for review analysis."""
+ input: ProductReview = event.input
+ ctx.short_term_memory.set("id", input.id)
+
+ content = f"""
+ "id": {input.id},
+ "review": {input.review}
+ """
+ msg = ChatMessage(role=MessageRole.USER, extra_args={"input": content})
+        ctx.send_event(ChatRequestEvent(model="review_analysis_model", messages=[msg]))
+```
+
+Prompts use `{variable_name}` syntax for template variables. Variables are
filled from `ChatMessage.extra_args`. The prompt is automatically applied when
the chat model is invoked.
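+
+A matching `ChatResponseEvent` action can then parse the structured reply. A
+minimal sketch, assuming `ctx.short_term_memory` exposes a `get` counterpart to
+the `set` call above and that an `OutputEvent` (or equivalent) carries the
+agent's final result:
+
+```python
+    @action(ChatResponseEvent)
+    @staticmethod
+    def process_response(event: ChatResponseEvent, ctx: RunnerContext) -> None:
+        """Parse the JSON reply produced by the review analysis prompt."""
+        import json
+
+        result = json.loads(event.response.content)
+        # Re-attach the id stored by process_input (assumes a matching get method)
+        result["id"] = ctx.short_term_memory.get("id")
+        ctx.send_event(OutputEvent(output=result))
+```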
+
+## MCP Prompt
+
+{{< hint info >}}
+MCP (Model Context Protocol) is a standardized protocol for integrating AI
applications with external data sources and tools. MCP prompts allow dynamic
prompt retrieval from MCP servers.
+{{< /hint >}}
+
+MCP prompts are managed by external MCP servers and automatically discovered
when you define an MCP server connection in your agent.
+
+### Define MCP Server with Prompts
+
+Create an MCP server that exposes prompts using the `FastMCP` library:
+
+```python
+# mcp_server.py
+from fastmcp import FastMCP  # the FastMCP library mentioned above
+
+mcp = FastMCP("ReviewServer")
+
[email protected]()
+def review_analysis_prompt(product_id: str, review: str) -> str:
+ """Prompt for analyzing product reviews."""
+ return f"""
+ Analyze the following product review and provide a satisfaction score
(1-5).
+
+ Product ID: {product_id}
+ Review: {review}
+
+ Output format: {{"score": 1-5, "reasons": ["reason1", "reason2"]}}
+ """
+
+mcp.run("streamable-http")
+```
+
+**Key points:**
+- Use `@mcp.prompt()` decorator to define prompts
+- Prompt function parameters become template variables
+- The function name becomes the prompt identifier
+
+### Use MCP Prompts in Agent
+
+Connect to the MCP server and use its prompts in your agent:
+
+```python
+class ReviewAnalysisAgent(Agent):
+
+ @mcp_server
+ @staticmethod
+ def review_mcp_server() -> MCPServer:
+ """Connect to MCP server."""
+ return MCPServer(endpoint="http://127.0.0.1:8000/mcp")
+
+ @chat_model_connection
+ @staticmethod
+ def ollama_server() -> ResourceDescriptor:
+ """Ollama connection."""
+ return ResourceDescriptor(clazz=OllamaChatModelConnection)
+
+ @chat_model_setup
+ @staticmethod
+ def review_model() -> ResourceDescriptor:
+ return ResourceDescriptor(
+ clazz=OllamaChatModelSetup,
+ connection="ollama_server",
+ model="qwen3:8b",
+ prompt="review_analysis_prompt", # Reference MCP prompt by name
+ )
+
+ @action(InputEvent)
+ @staticmethod
+ def process_input(event: InputEvent, ctx: RunnerContext) -> None:
+ input_data = event.input
+
+ # Provide prompt variables via extra_args
+ msg = ChatMessage(
+ role=MessageRole.USER,
+ extra_args={
+ "product_id": input_data.product_id,
+ "review": input_data.review
+ }
+ )
+ ctx.send_event(ChatRequestEvent(model="review_model", messages=[msg]))
+```
+
+**Key points:**
+- Use `@mcp_server` decorator to define MCP server connection
+- Reference MCP prompts by their function name (e.g.,
`"review_analysis_prompt"`)
+- Provide prompt parameters using `ChatMessage.extra_args`
+- All prompts and tools from the MCP server are automatically registered
\ No newline at end of file
diff --git a/docs/content/docs/development/react_agent.md
b/docs/content/docs/development/react_agent.md
index 7a2773a..64d965b 100644
--- a/docs/content/docs/development/react_agent.md
+++ b/docs/content/docs/development/react_agent.md
@@ -39,7 +39,7 @@ my_react_agent = ReActAgent(
### Chat Model
User should specify the chat model used in the ReAct Agent.
-We use `ResourceDescriptor` to describe the chat model, includes chat model
type and chat model arguments. See [Chat Model]({{< ref
"docs/development/chat_with_llm#chatmodel" >}}) for more details.
+We use `ResourceDescriptor` to describe the chat model, which includes the chat
model type and chat model arguments. See [Chat Model]({{< ref
"docs/development/chat_models" >}}) for more details.
```python
chat_model_descriptor = ResourceDescriptor(
clazz=OllamaChatModelSetup,
@@ -94,7 +94,7 @@ ChatMessage(
)
```
-See [Prompt]({{< ref "docs/development/chat_with_llm#prompt" >}}) for more
details.
+See [Prompt]({{< ref "docs/development/prompts" >}}) for more details.
### Output Schema
User can set output schema to configure the ReAct Agent output type. If output
schema is set, the ReAct Agent will deserialize the llm response to expected
type.
diff --git a/docs/content/docs/development/tool_use.md
b/docs/content/docs/development/tool_use.md
index 9afa2b5..7e0e8a8 100644
--- a/docs/content/docs/development/tool_use.md
+++ b/docs/content/docs/development/tool_use.md
@@ -1,6 +1,6 @@
---
title: Tool Use
-weight: 6
+weight: 7
type: docs
---
<!--
diff --git a/docs/content/docs/development/vector_stores.md
b/docs/content/docs/development/vector_stores.md
index 793fdec..950a251 100644
--- a/docs/content/docs/development/vector_stores.md
+++ b/docs/content/docs/development/vector_stores.md
@@ -1,6 +1,6 @@
---
title: Vector Stores
-weight: 5
+weight: 6
type: docs
---
<!--
diff --git a/docs/content/docs/development/workflow_agent.md
b/docs/content/docs/development/workflow_agent.md
index 254588a..b77e177 100644
--- a/docs/content/docs/development/workflow_agent.md
+++ b/docs/content/docs/development/workflow_agent.md
@@ -160,8 +160,8 @@ Then, user can define actions listen to or send `MyEvent`.
## Built-in Events and Actions
There are several built-in `Event` and `Action` in Flink-Agents:
-* See [chat with llm]({{< ref "docs/development/chat_with_llm" >}}) for how to
chat with a LLM leveraging built-in action and events.
-* See [tool use]({{< ref "docs/development/tool_use" >}}) for how to
programmatically use a tool leveraging built-in action and events.
+* See [Chat Models]({{< ref "docs/development/chat_models" >}}) for how to
chat with an LLM leveraging built-in actions and events.
+* See [Tool Use]({{< ref "docs/development/tool_use" >}}) for how to
programmatically use a tool leveraging built-in actions and events.
## Memory