This is an automated email from the ASF dual-hosted git repository.
xtsong pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/flink-agents.git
The following commit(s) were added to refs/heads/main by this push:
new 48c62b2 [docs] Add docs for chat models and prompts (#245)
48c62b2 is described below
commit 48c62b2a75bd9dae6dd033c93c55cb154410747e
Author: Alan Z. <[email protected]>
AuthorDate: Sat Oct 4 05:14:55 2025 -0700
[docs] Add docs for chat models and prompts (#245)
---
docs/content/docs/development/chat_models.md | 438 +++++++++++++++++++++
docs/content/docs/development/chat_with_llm.md | 57 ---
docs/content/docs/development/embedding_models.md | 2 +-
.../docs/development/integrate_with_flink.md | 2 +-
docs/content/docs/development/prompts.md | 279 +++++++++++++
docs/content/docs/development/react_agent.md | 4 +-
docs/content/docs/development/tool_use.md | 2 +-
docs/content/docs/development/vector_stores.md | 2 +-
docs/content/docs/development/workflow_agent.md | 4 +-
9 files changed, 725 insertions(+), 65 deletions(-)
diff --git a/docs/content/docs/development/chat_models.md
b/docs/content/docs/development/chat_models.md
new file mode 100644
index 0000000..741eadd
--- /dev/null
+++ b/docs/content/docs/development/chat_models.md
@@ -0,0 +1,438 @@
+---
+title: Chat Models
+weight: 3
+type: docs
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Chat Models
+
+## Overview
+
+Chat models enable agents to communicate with Large Language Models (LLMs) for
natural language understanding, reasoning, and generation. In Flink Agents,
chat models act as the "brain" of your agents, processing input messages and
generating intelligent responses based on context, prompts, and available tools.
+
+## Getting Started
+
+To use chat models in your agents, you need to define both a connection and a
setup using decorators, then interact with the model through events.
+
+### Resource Decorators
+
+Flink Agents provides decorators to simplify chat model setup within agents:
+
+#### @chat_model_connection
+
+The `@chat_model_connection` decorator marks a method that creates a chat
model connection. This is typically defined once and shared across multiple
chat model setups.
+
+#### @chat_model_setup
+
+The `@chat_model_setup` decorator marks a method that creates a chat model
setup. This references a connection and adds chat-specific configuration like
prompts and tools.
+
+### Chat Events
+
+Chat models communicate through built-in events:
+
+- **ChatRequestEvent**: Sent by actions to request a chat completion from the
LLM
+- **ChatResponseEvent**: Received by actions containing the LLM's response
+
+### Usage Example
+
+Here's how to define and use chat models in a workflow agent:
+
+```python
+class MyAgent(Agent):
+
+ @chat_model_connection
+ @staticmethod
+ def ollama_connection() -> ResourceDescriptor:
+ return ResourceDescriptor(
+ clazz=OllamaChatModelConnection,
+ base_url="http://localhost:11434",
+ request_timeout=30.0
+ )
+
+ @chat_model_setup
+ @staticmethod
+ def ollama_chat_model() -> ResourceDescriptor:
+ return ResourceDescriptor(
+ clazz=OllamaChatModelSetup,
+ connection="ollama_connection",
+ model="qwen3:8b",
+ temperature=0.7
+ )
+
+ @action(InputEvent)
+ @staticmethod
+ def process_input(event: InputEvent, ctx: RunnerContext) -> None:
+ # Create a chat request with user message
+ user_message = ChatMessage(
+ role=MessageRole.USER,
+ content=f"input: {event.input}"
+ )
+ ctx.send_event(
+            ChatRequestEvent(model="ollama_chat_model", messages=[user_message])
+ )
+
+ @action(ChatResponseEvent)
+ @staticmethod
+ def process_response(event: ChatResponseEvent, ctx: RunnerContext) -> None:
+ response_content = event.response.content
+ # Handle the LLM's response
+ # Process the response as needed for your use case
+```
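+
+A minimal way to complete the `process_response` action is to forward the model's
+reply downstream. The sketch below assumes an `OutputEvent` (or an equivalent
+output mechanism in your setup) is used to emit the agent's result:
+
+```python
+    @action(ChatResponseEvent)
+    @staticmethod
+    def process_response(event: ChatResponseEvent, ctx: RunnerContext) -> None:
+        # Read the LLM's reply from the response event
+        response_content = event.response.content
+        # Forward the reply as the agent's output (OutputEvent is assumed here)
+        ctx.send_event(OutputEvent(output=response_content))
+```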
+
+## Built-in Providers
+
+### Anthropic
+
+Anthropic provides cloud-based chat models featuring the Claude family, known
for its strong reasoning, coding, and safety capabilities.
+
+#### Prerequisites
+
+1. Get an API key from [Anthropic Console](https://console.anthropic.com/)
+
+#### AnthropicChatModelConnection Parameters
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `api_key` | str | Required | Anthropic API key for authentication |
+| `max_retries` | int | `3` | Maximum number of API retry attempts |
+| `timeout` | float | `60.0` | API request timeout in seconds |
+
+#### AnthropicChatModelSetup Parameters
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `connection` | str | Required | Reference to connection method name |
+| `model` | str | `"claude-sonnet-4-20250514"` | Name of the chat model to use
|
+| `prompt` | Prompt \| str | None | Prompt template or reference to prompt
resource |
+| `tools` | List[str] | None | List of tool names available to the model |
+| `max_tokens` | int | `1024` | Maximum number of tokens to generate |
+| `temperature` | float | `0.1` | Sampling temperature (0.0 to 1.0) |
+| `additional_kwargs` | dict | `{}` | Additional Anthropic API parameters |
+
+#### Usage Example
+
+```python
+class MyAgent(Agent):
+
+ @chat_model_connection
+ @staticmethod
+ def anthropic_connection() -> ResourceDescriptor:
+ return ResourceDescriptor(
+ clazz=AnthropicChatModelConnection,
+ api_key="your-api-key-here", # Or set ANTHROPIC_API_KEY env var
+ max_retries=3,
+ timeout=60.0
+ )
+
+ @chat_model_setup
+ @staticmethod
+ def anthropic_chat_model() -> ResourceDescriptor:
+ return ResourceDescriptor(
+ clazz=AnthropicChatModelSetup,
+ connection="anthropic_connection",
+ model="claude-sonnet-4-20250514",
+ max_tokens=2048,
+ temperature=0.7
+ )
+
+ ...
+```
+
+#### Available Models
+
+Visit the [Anthropic Models
documentation](https://docs.anthropic.com/en/docs/about-claude/models) for the
complete and up-to-date list of available chat models.
+
+Some popular options include:
+- **Claude Sonnet 4.5** (claude-sonnet-4-5-20250929)
+- **Claude Sonnet 4** (claude-sonnet-4-20250514)
+- **Claude Sonnet 3.7** (claude-3-7-sonnet-20250219)
+- **Claude Opus 4.1** (claude-opus-4-1-20250805)
+
+{{< hint warning >}}
+Model availability and specifications may change. Always check the official
Anthropic documentation for the latest information before implementing in
production.
+{{< /hint >}}
+
+### Ollama
+
+Ollama provides local chat models that run on your machine, offering privacy,
control, and no API costs.
+
+#### Prerequisites
+
+1. Install Ollama from [https://ollama.com/](https://ollama.com/)
+2. Start the Ollama server: `ollama serve`
+3. Download a chat model: `ollama pull qwen3:8b`
+
+#### OllamaChatModelConnection Parameters
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `base_url` | str | `"http://localhost:11434"` | Ollama server URL |
+| `request_timeout` | float | `30.0` | HTTP request timeout in seconds |
+
+#### OllamaChatModelSetup Parameters
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `connection` | str | Required | Reference to connection method name |
+| `model` | str | Required | Name of the chat model to use |
+| `prompt` | Prompt \| str | None | Prompt template or reference to prompt
resource |
+| `tools` | List[str] | None | List of tool names available to the model |
+| `temperature` | float | `0.75` | Sampling temperature (0.0 to 1.0) |
+| `num_ctx` | int | `2048` | Maximum number of context tokens |
+| `keep_alive` | str \| float | `"5m"` | How long to keep model loaded in
memory |
+| `extract_reasoning` | bool | `True` | Extract reasoning content from
response |
+| `additional_kwargs` | dict | `{}` | Additional Ollama API parameters |
+
+#### Usage Example
+
+```python
+class MyAgent(Agent):
+
+ @chat_model_connection
+ @staticmethod
+ def ollama_connection() -> ResourceDescriptor:
+ return ResourceDescriptor(
+ clazz=OllamaChatModelConnection,
+ base_url="http://localhost:11434",
+ request_timeout=120.0
+ )
+
+ @chat_model_setup
+ @staticmethod
+ def my_chat_model() -> ResourceDescriptor:
+ return ResourceDescriptor(
+ clazz=OllamaChatModelSetup,
+ connection="ollama_connection",
+ model="qwen3:8b",
+ temperature=0.7,
+ num_ctx=4096,
+ keep_alive="10m",
+ extract_reasoning=True
+ )
+
+ ...
+```
+
+#### Available Models
+
+Visit the [Ollama Models Library](https://ollama.com/library) for the complete
and up-to-date list of available chat models.
+
+Some popular options include:
+- **qwen3** series (qwen3:8b, qwen3:14b, qwen3:32b)
+- **llama3** series (llama3:8b, llama3:70b)
+- **deepseek** series (deepseek-r1, deepseek-v3.1)
+- **gpt-oss**
+
+{{< hint warning >}}
+Model availability and specifications may change. Always check the official
Ollama documentation for the latest information before implementing in
production.
+{{< /hint >}}
+
+### OpenAI
+
+OpenAI provides cloud-based chat models with state-of-the-art performance for
a wide range of natural language tasks.
+
+#### Prerequisites
+
+1. Get an API key from [OpenAI Platform](https://platform.openai.com/)
+2. Set the API key as an environment variable: `export
OPENAI_API_KEY=your-api-key`
+
+#### OpenAIChatModelConnection Parameters
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `api_key` | str | `$OPENAI_API_KEY` | OpenAI API key for authentication |
+| `api_base_url` | str | `"https://api.openai.com/v1"` | Base URL for OpenAI
API |
+| `max_retries` | int | `3` | Maximum number of API retry attempts |
+| `timeout` | float | `60.0` | API request timeout in seconds |
+| `default_headers` | dict | None | Default headers for API requests |
+| `reuse_client` | bool | `True` | Whether to reuse the OpenAI client between
requests |
+
+#### OpenAIChatModelSetup Parameters
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `connection` | str | Required | Reference to connection method name |
+| `model` | str | `"gpt-3.5-turbo"` | Name of the chat model to use |
+| `prompt` | Prompt \| str | None | Prompt template or reference to prompt
resource |
+| `tools` | List[str] | None | List of tool names available to the model |
+| `temperature` | float | `0.1` | Sampling temperature (0.0 to 2.0) |
+| `max_tokens` | int | None | Maximum number of tokens to generate |
+| `logprobs` | bool | None | Whether to return log probabilities per token |
+| `top_logprobs` | int | `0` | Number of top token log probabilities to return
(0-20) |
+| `strict` | bool | `False` | Enable strict mode for tool calling and schemas |
+| `reasoning_effort` | str | None | Reasoning effort level for reasoning
models ("low", "medium", "high") |
+| `additional_kwargs` | dict | `{}` | Additional OpenAI API parameters |
+
+#### Usage Example
+
+```python
+class MyAgent(Agent):
+
+ @chat_model_connection
+ @staticmethod
+ def openai_connection() -> ResourceDescriptor:
+ return ResourceDescriptor(
+ clazz=OpenAIChatModelConnection,
+ api_key="your-api-key-here", # Or set OPENAI_API_KEY env var
+ api_base_url="https://api.openai.com/v1",
+ max_retries=3,
+ timeout=60.0
+ )
+
+ @chat_model_setup
+ @staticmethod
+ def openai_chat_model() -> ResourceDescriptor:
+ return ResourceDescriptor(
+ clazz=OpenAIChatModelSetup,
+ connection="openai_connection",
+ model="gpt-4",
+ temperature=0.7,
+ max_tokens=1000
+ )
+
+ ...
+```
+
+#### Available Models
+
+Visit the [OpenAI Models
documentation](https://platform.openai.com/docs/models) for the complete and
up-to-date list of available chat models.
+
+Some popular options include:
+- **GPT-5** series (GPT-5, GPT-5 mini, GPT-5 nano)
+- **GPT-4.1**
+- **gpt-oss** series (gpt-oss-120b, gpt-oss-20b)
+
+{{< hint warning >}}
+Model availability and specifications may change. Always check the official
OpenAI documentation for the latest information before implementing in
production.
+{{< /hint >}}
+
+### Tongyi (DashScope)
+
+Tongyi provides cloud-based chat models from Alibaba Cloud, offering powerful
Chinese and English language capabilities.
+
+#### Prerequisites
+
+1. Get an API key from [Alibaba Cloud DashScope](https://dashscope.aliyun.com/)
+
+#### TongyiChatModelConnection Parameters
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `api_key` | str | `$DASHSCOPE_API_KEY` | DashScope API key for
authentication |
+| `request_timeout` | float | `60.0` | HTTP request timeout in seconds |
+
+#### TongyiChatModelSetup Parameters
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `connection` | str | Required | Reference to connection method name |
+| `model` | str | `"qwen-plus"` | Name of the chat model to use |
+| `prompt` | Prompt \| str | None | Prompt template or reference to prompt
resource |
+| `tools` | List[str] | None | List of tool names available to the model |
+| `temperature` | float | `0.7` | Sampling temperature (0.0 to 2.0) |
+| `extract_reasoning` | bool | `False` | Extract reasoning content from
response |
+| `additional_kwargs` | dict | `{}` | Additional DashScope API parameters |
+
+#### Usage Example
+
+```python
+class MyAgent(Agent):
+
+ @chat_model_connection
+ @staticmethod
+ def tongyi_connection() -> ResourceDescriptor:
+ return ResourceDescriptor(
+ clazz=TongyiChatModelConnection,
+ api_key="your-api-key-here", # Or set DASHSCOPE_API_KEY env var
+ request_timeout=60.0
+ )
+
+ @chat_model_setup
+ @staticmethod
+ def tongyi_chat_model() -> ResourceDescriptor:
+ return ResourceDescriptor(
+ clazz=TongyiChatModelSetup,
+ connection="tongyi_connection",
+ model="qwen-plus",
+ temperature=0.7,
+ extract_reasoning=True
+ )
+
+ ...
+```
+
+#### Available Models
+
+Visit the [DashScope Models
documentation](https://help.aliyun.com/zh/dashscope/developer-reference/model-introduction)
for the complete and up-to-date list of available chat models.
+
+Some popular options include:
+- **qwen-plus**
+- **qwen-max**
+- **qwen-turbo**
+- **qwen-long**
+
+{{< hint warning >}}
+Model availability and specifications may change. Always check the official
DashScope documentation for the latest information before implementing in
production.
+{{< /hint >}}
+
+## Custom Providers
+
+{{< hint warning >}}
+The custom provider APIs are experimental and unstable, subject to
incompatible changes in future releases.
+{{< /hint >}}
+
+If you want to use chat models not offered by the built-in providers, you can
extend the base chat classes and implement your own! The chat model system is
built around two main abstract classes:
+
+### BaseChatModelConnection
+
+Handles the connection to chat model services and provides the core chat
functionality.
+
+```python
+class MyChatModelConnection(BaseChatModelConnection):
+
+ def chat(
+ self,
+ messages: Sequence[ChatMessage],
+ tools: List[Tool] | None = None,
+ **kwargs: Any,
+ ) -> ChatMessage:
+ # Core method: send messages to LLM and return response
+ # - messages: Input message sequence
+ # - tools: Optional list of tools available to the model
+ # - kwargs: Additional parameters from model_kwargs
+ # - Returns: ChatMessage with the model's response
+ pass
+```
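+
+For illustration, here is how the `chat` method might be fleshed out against a
+simple HTTP service. This is only a sketch under assumptions: the endpoint,
+payload shape, and `base_url` field are hypothetical, and the exact
+`MessageRole` serialization (and the `ASSISTANT` role name) should be verified
+against your Flink Agents version:
+
+```python
+import requests  # any HTTP client would do; used here for illustration only
+
+
+class MyChatModelConnection(BaseChatModelConnection):
+    base_url: str = "http://localhost:8080"  # hypothetical configuration field
+
+    def chat(
+        self,
+        messages: Sequence[ChatMessage],
+        tools: List[Tool] | None = None,
+        **kwargs: Any,
+    ) -> ChatMessage:
+        # Serialize the conversation into the shape the (hypothetical) service expects
+        payload = {
+            "messages": [
+                # MessageRole is assumed to be an Enum; adjust serialization if needed
+                {"role": m.role.value, "content": m.content} for m in messages
+            ],
+            **kwargs,  # model, temperature, etc. forwarded from model_kwargs
+        }
+        # Tool descriptions could also be added to the payload; omitted in this sketch
+        response = requests.post(f"{self.base_url}/chat", json=payload, timeout=60)
+        response.raise_for_status()
+        # Wrap the reply in a ChatMessage so the framework can route it onward
+        return ChatMessage(
+            role=MessageRole.ASSISTANT,  # assumed role name; verify in your version
+            content=response.json()["content"],
+        )
+```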
+
+### BaseChatModelSetup
+
+The setup class acts as a high-level configuration interface that defines
which connection to use and how to configure the chat model.
+
+```python
+class MyChatModelSetup(BaseChatModelSetup):
+ # Add your custom configuration fields here
+
+ @property
+ def model_kwargs(self) -> Dict[str, Any]:
+ # Return model-specific configuration passed to chat()
+ # This dictionary is passed as **kwargs to the chat() method
+        return {"model": self.model, "temperature": 0.7}  # plus any other model-specific kwargs
+```
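+
+Once both classes are defined, they can be wired into an agent exactly like the
+built-in providers. A minimal sketch, assuming hypothetical `base_url` and
+`model` fields on the custom classes (replace them with whatever configuration
+your implementation actually defines):
+
+```python
+class MyAgent(Agent):
+
+    @chat_model_connection
+    @staticmethod
+    def my_connection() -> ResourceDescriptor:
+        # Point the descriptor at the custom connection class
+        return ResourceDescriptor(
+            clazz=MyChatModelConnection,
+            base_url="http://localhost:8080",  # hypothetical configuration field
+        )
+
+    @chat_model_setup
+    @staticmethod
+    def my_chat_model() -> ResourceDescriptor:
+        # Reference the connection by its method name, as with built-in providers
+        return ResourceDescriptor(
+            clazz=MyChatModelSetup,
+            connection="my_connection",
+            model="my-model",  # hypothetical configuration field
+        )
+
+    ...
+```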
diff --git a/docs/content/docs/development/chat_with_llm.md
b/docs/content/docs/development/chat_with_llm.md
deleted file mode 100644
index 78b80f1..0000000
--- a/docs/content/docs/development/chat_with_llm.md
+++ /dev/null
@@ -1,57 +0,0 @@
----
-title: Chat with LLM
-weight: 3
-type: docs
----
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements. See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership. The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License. You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied. See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-## ChatModel
-
-{{< hint warning >}}
-**TODO**: What is ChatModel. How to define and ChatModelConnection, and
ChatModelSetup. How to reuse ChatModelConnection in ChatModelSetup.
-
-**TODO**: How to send ChatRequestEvent and handle ChatResponseEvent.
-{{< /hint >}}
-
-{{< hint warning >}}
-**TODO**: List of all built-in Model and configuration, Ollama, Tongyi, etc.
-{{< /hint >}}
-
-### Ollama
-
-### Tongyi
-
-## Prompt
-
-{{< hint warning >}}
-**TODO**: What is Prompt. What are the differences between Local Prompt and
MCP Prompt.
-{{< /hint >}}
-
-### Local Prompt
-
-{{< hint warning >}}
-**TODO**: How to define and use a Local Prompt.
-{{< /hint >}}
-
-### MCP Prompt
-
-{{< hint warning >}}
-**TODO**: Link to MCP Prompt documentation.
-{{< /hint >}}
diff --git a/docs/content/docs/development/embedding_models.md
b/docs/content/docs/development/embedding_models.md
index 046d7a1..d24912c 100644
--- a/docs/content/docs/development/embedding_models.md
+++ b/docs/content/docs/development/embedding_models.md
@@ -1,6 +1,6 @@
---
title: Embedding Models
-weight: 4
+weight: 5
type: docs
---
<!--
diff --git a/docs/content/docs/development/integrate_with_flink.md
b/docs/content/docs/development/integrate_with_flink.md
index 676eb9d..bad997e 100644
--- a/docs/content/docs/development/integrate_with_flink.md
+++ b/docs/content/docs/development/integrate_with_flink.md
@@ -1,6 +1,6 @@
---
title: Integrate with Flink
-weight: 7
+weight: 8
type: docs
---
<!--
diff --git a/docs/content/docs/development/prompts.md
b/docs/content/docs/development/prompts.md
new file mode 100644
index 0000000..c28f9ff
--- /dev/null
+++ b/docs/content/docs/development/prompts.md
@@ -0,0 +1,279 @@
+---
+title: Prompts
+weight: 4
+type: docs
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Prompts
+
+## Overview
+
+Prompts are templates that define how your agents communicate with Large
Language Models (LLMs). They provide structured instructions, context, and
formatting guidelines that shape the LLM's responses. In Flink Agents, prompts
are first-class resources that can be defined, reused, and referenced across
agents and chat models.
+
+## Prompt Types
+
+Flink Agents supports two types of prompts:
+
+### Local Prompt
+
+Local prompts are templates defined directly in your code. They support
variable substitution using `{variable_name}` syntax and can be created from
either text strings or message sequences.
+
+### MCP Prompt
+
+MCP (Model Context Protocol) prompts are managed by external MCP servers. They
enable dynamic prompt retrieval, centralized prompt management, and integration
with external prompt repositories.
+
+## Local Prompt
+
+### Creating from Text
+
+The simplest way to create a prompt is from a text string using
`Prompt.from_text()`:
+
+```python
+product_suggestion_prompt_str = """
+Based on the rating distribution and user dissatisfaction reasons, generate
three actionable suggestions for product improvement.
+
+Input format:
+{{
+ "id": "1",
+ "score_histogram": ["10%", "20%", "10%", "15%", "45%"],
+ "unsatisfied_reasons": ["reason1", "reason2", "reason3"]
+}}
+
+Ensure that your response can be parsed by Python's json module; use the
following format as an example:
+{{
+ "suggestion_list": [
+ "suggestion1",
+ "suggestion2",
+ "suggestion3"
+ ]
+}}
+
+input:
+{input}
+"""
+
+product_suggestion_prompt = Prompt.from_text(product_suggestion_prompt_str)
+```
+
+**Key points:**
+- Use `{variable_name}` for template variables that will be substituted at
runtime
+- Escape literal braces by doubling them: `{{` and `}}`
+
+### Creating from Messages
+
+For more control, create prompts from a sequence of `ChatMessage` objects
using `Prompt.from_messages()`:
+
+```python
+review_analysis_prompt = Prompt.from_messages(
+ messages=[
+ ChatMessage(
+ role=MessageRole.SYSTEM,
+ content="""
+ Analyze the user review and product information to determine a
+ satisfaction score (1-5) and potential reasons for dissatisfaction.
+
+ Example input format:
+ {{
+ "id": "12345",
+ "review": "The headphones broke after one week of use."
+ }}
+
+ Ensure your response can be parsed by Python JSON:
+ {{
+ "id": "12345",
+ "score": 1,
+ "reasons": ["poor quality"]
+ }}
+ """,
+ ),
+ ChatMessage(
+ role=MessageRole.USER,
+ content="""
+ "input":
+ {input}
+ """,
+ ),
+ ],
+)
+```
+
+**Key points:**
+- Define multiple messages with different roles (SYSTEM, USER)
+- Each message can have its own template variables
+
+### Using Prompts in Agents
+
+Register a prompt as an agent resource using the `@prompt` decorator:
+
+```python
+class ReviewAnalysisAgent(Agent):
+
+ @prompt
+ @staticmethod
+ def review_analysis_prompt() -> Prompt:
+ """Prompt for review analysis."""
+ return Prompt.from_messages(
+ messages=[
+ ChatMessage(
+ role=MessageRole.SYSTEM,
+ content="""
+ Analyze the user review and product information to determine a
+ satisfaction score (1-5) and potential reasons for dissatisfaction.
+
+ Example input format:
+ {{
+ "id": "12345",
+ "review": "The headphones broke after one week of use."
+ }}
+
+ Ensure your response can be parsed by Python JSON:
+ {{
+ "id": "12345",
+ "score": 1,
+ "reasons": ["poor quality"]
+ }}
+ """,
+ ),
+ ChatMessage(
+ role=MessageRole.USER,
+ content="""
+ "input":
+ {input}
+ """,
+ ),
+ ],
+ )
+
+ @chat_model_setup
+ @staticmethod
+ def review_analysis_model() -> ResourceDescriptor:
+        """ChatModel focused on review analysis."""
+ return ResourceDescriptor(
+ clazz=OllamaChatModelSetup,
+ connection="ollama_server",
+ model="qwen3:8b",
+ prompt="review_analysis_prompt",
+ extract_reasoning=True,
+ )
+
+ @action(InputEvent)
+ @staticmethod
+ def process_input(event: InputEvent, ctx: RunnerContext) -> None:
+ """Process input event and send chat request for review analysis."""
+ input: ProductReview = event.input
+ ctx.short_term_memory.set("id", input.id)
+
+ content = f"""
+ "id": {input.id},
+ "review": {input.review}
+ """
+ msg = ChatMessage(role=MessageRole.USER, extra_args={"input": content})
+        ctx.send_event(ChatRequestEvent(model="review_analysis_model", messages=[msg]))
+```
+
+Prompts use `{variable_name}` syntax for template variables. Variables are
filled from `ChatMessage.extra_args`. The prompt is automatically applied when
the chat model is invoked.
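+
+A matching `ChatResponseEvent` action can then parse the structured reply. A
+minimal sketch, assuming `ctx.short_term_memory` exposes a `get` counterpart to
+the `set` call above and that an `OutputEvent` (or equivalent) carries the
+agent's final result:
+
+```python
+    @action(ChatResponseEvent)
+    @staticmethod
+    def process_response(event: ChatResponseEvent, ctx: RunnerContext) -> None:
+        """Parse the JSON reply produced by the review analysis prompt."""
+        import json
+
+        result = json.loads(event.response.content)
+        # Re-attach the id stored by process_input (assumes a matching get method)
+        result["id"] = ctx.short_term_memory.get("id")
+        ctx.send_event(OutputEvent(output=result))
+```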
+
+## MCP Prompt
+
+{{< hint info >}}
+MCP (Model Context Protocol) is a standardized protocol for integrating AI
applications with external data sources and tools. MCP prompts allow dynamic
prompt retrieval from MCP servers.
+{{< /hint >}}
+
+MCP prompts are managed by external MCP servers and automatically discovered
when you define an MCP server connection in your agent.
+
+### Define MCP Server with Prompts
+
+Create an MCP server that exposes prompts using the `FastMCP` library:
+
+```python
+# mcp_server.py
+from fastmcp import FastMCP  # the FastMCP library mentioned above
+
+mcp = FastMCP("ReviewServer")
+
[email protected]()
+def review_analysis_prompt(product_id: str, review: str) -> str:
+ """Prompt for analyzing product reviews."""
+ return f"""
+ Analyze the following product review and provide a satisfaction score
(1-5).
+
+ Product ID: {product_id}
+ Review: {review}
+
+ Output format: {{"score": 1-5, "reasons": ["reason1", "reason2"]}}
+ """
+
+mcp.run("streamable-http")
+```
+
+**Key points:**
+- Use `@mcp.prompt()` decorator to define prompts
+- Prompt function parameters become template variables
+- The function name becomes the prompt identifier
+
+### Use MCP Prompts in Agent
+
+Connect to the MCP server and use its prompts in your agent:
+
+```python
+class ReviewAnalysisAgent(Agent):
+
+ @mcp_server
+ @staticmethod
+ def review_mcp_server() -> MCPServer:
+ """Connect to MCP server."""
+ return MCPServer(endpoint="http://127.0.0.1:8000/mcp")
+
+ @chat_model_connection
+ @staticmethod
+ def ollama_server() -> ResourceDescriptor:
+ """Ollama connection."""
+ return ResourceDescriptor(clazz=OllamaChatModelConnection)
+
+ @chat_model_setup
+ @staticmethod
+ def review_model() -> ResourceDescriptor:
+ return ResourceDescriptor(
+ clazz=OllamaChatModelSetup,
+ connection="ollama_server",
+ model="qwen3:8b",
+ prompt="review_analysis_prompt", # Reference MCP prompt by name
+ )
+
+ @action(InputEvent)
+ @staticmethod
+ def process_input(event: InputEvent, ctx: RunnerContext) -> None:
+ input_data = event.input
+
+ # Provide prompt variables via extra_args
+ msg = ChatMessage(
+ role=MessageRole.USER,
+ extra_args={
+ "product_id": input_data.product_id,
+ "review": input_data.review
+ }
+ )
+ ctx.send_event(ChatRequestEvent(model="review_model", messages=[msg]))
+```
+
+**Key points:**
+- Use `@mcp_server` decorator to define MCP server connection
+- Reference MCP prompts by their function name (e.g.,
`"review_analysis_prompt"`)
+- Provide prompt parameters using `ChatMessage.extra_args`
+- All prompts and tools from the MCP server are automatically registered
\ No newline at end of file
diff --git a/docs/content/docs/development/react_agent.md
b/docs/content/docs/development/react_agent.md
index 7a2773a..64d965b 100644
--- a/docs/content/docs/development/react_agent.md
+++ b/docs/content/docs/development/react_agent.md
@@ -39,7 +39,7 @@ my_react_agent = ReActAgent(
### Chat Model
User should specify the chat model used in the ReAct Agent.
-We use `ResourceDescriptor` to describe the chat model, includes chat model
type and chat model arguments. See [Chat Model]({{< ref
"docs/development/chat_with_llm#chatmodel" >}}) for more details.
+We use `ResourceDescriptor` to describe the chat model, which includes the chat
model type and chat model arguments. See [Chat Model]({{< ref
"docs/development/chat_models" >}}) for more details.
```python
chat_model_descriptor = ResourceDescriptor(
clazz=OllamaChatModelSetup,
@@ -94,7 +94,7 @@ ChatMessage(
)
```
-See [Prompt]({{< ref "docs/development/chat_with_llm#prompt" >}}) for more
details.
+See [Prompt]({{< ref "docs/development/prompts" >}}) for more details.
### Output Schema
User can set output schema to configure the ReAct Agent output type. If output
schema is set, the ReAct Agent will deserialize the llm response to expected
type.
diff --git a/docs/content/docs/development/tool_use.md
b/docs/content/docs/development/tool_use.md
index 9afa2b5..7e0e8a8 100644
--- a/docs/content/docs/development/tool_use.md
+++ b/docs/content/docs/development/tool_use.md
@@ -1,6 +1,6 @@
---
title: Tool Use
-weight: 6
+weight: 7
type: docs
---
<!--
diff --git a/docs/content/docs/development/vector_stores.md
b/docs/content/docs/development/vector_stores.md
index 793fdec..950a251 100644
--- a/docs/content/docs/development/vector_stores.md
+++ b/docs/content/docs/development/vector_stores.md
@@ -1,6 +1,6 @@
---
title: Vector Stores
-weight: 5
+weight: 6
type: docs
---
<!--
diff --git a/docs/content/docs/development/workflow_agent.md
b/docs/content/docs/development/workflow_agent.md
index 254588a..b77e177 100644
--- a/docs/content/docs/development/workflow_agent.md
+++ b/docs/content/docs/development/workflow_agent.md
@@ -160,8 +160,8 @@ Then, user can define actions listen to or send `MyEvent`.
## Built-in Events and Actions
There are several built-in `Event` and `Action` in Flink-Agents:
-* See [chat with llm]({{< ref "docs/development/chat_with_llm" >}}) for how to
chat with a LLM leveraging built-in action and events.
-* See [tool use]({{< ref "docs/development/tool_use" >}}) for how to
programmatically use a tool leveraging built-in action and events.
+* See [Chat Models]({{< ref "docs/development/chat_models" >}}) for how to
chat with an LLM leveraging built-in actions and events.
+* See [Tool Use]({{< ref "docs/development/tool_use" >}}) for how to
programmatically use a tool leveraging built-in actions and events.
## Memory