This is an automated email from the ASF dual-hosted git repository.
wenming pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/apisix-website.git
The following commit(s) were added to refs/heads/master by this push:
new 5a49c069e7a blog: add apisix-ai-gateway-features (#1862)
5a49c069e7a is described below
commit 5a49c069e7adba6a397d1fe0b7e4bd8480a9380c
Author: Yilia Lin <[email protected]>
AuthorDate: Mon Feb 24 15:10:33 2025 +0800
blog: add apisix-ai-gateway-features (#1862)
* blog: add apisix-ai-gateway-features
* add en version
* Update apisix-ai-gateway-features.md
* Update apisix-ai-gateway-features.md
* fix typo
* update metadata
* update metadata
* fix lint in why-reinvent-api-gateways
---
.../blog/2025/02/21/why-reinvent-api-gateways.md | 2 +-
.../blog/2025/02/24/apisix-ai-gateway-features.md | 137 +++++++++++++++++++++
.../blog/2025/02/24/apisix-ai-gateway-features.md | 135 ++++++++++++++++++++
3 files changed, 273 insertions(+), 1 deletion(-)
diff --git a/blog/en/blog/2025/02/21/why-reinvent-api-gateways.md
b/blog/en/blog/2025/02/21/why-reinvent-api-gateways.md
index 1d3ea8f60d6..5c05eb10422 100644
--- a/blog/en/blog/2025/02/21/why-reinvent-api-gateways.md
+++ b/blog/en/blog/2025/02/21/why-reinvent-api-gateways.md
@@ -72,7 +72,7 @@ If your API gateway requirements include:
✅ Handling thousands of routes without latency spikes
-✅ Ultra-low latency (under 5ms) for API processing
+✅ Ultra-low latency (under 5ms) for API processing
✅ Hot plugin reloading without restarting the gateway
diff --git a/blog/en/blog/2025/02/24/apisix-ai-gateway-features.md
b/blog/en/blog/2025/02/24/apisix-ai-gateway-features.md
new file mode 100644
index 00000000000..8baa1ce3744
--- /dev/null
+++ b/blog/en/blog/2025/02/24/apisix-ai-gateway-features.md
@@ -0,0 +1,137 @@
+---
+title: "Comprehensive Overview of APISIX AI Gateway Features"
+keywords:
+ - APISIX
+ - AI Gateway
+ - LLM Proxy
+ - API Gateway for AI
+ - Token Rate Limiting
+ - AI Security
+ - AI Traffic Management
+ - Open-Source API Gateway
+ - Multi-LLM Load Balancing
+ - AI API Protection
+ - AI Request Throttling
+description: "Explore the robust features of the APISIX AI Gateway, including
LLM proxy, intelligent traffic scheduling, token rate limiting, and security
protection. Achieve multi-LLM load balancing, API rate control, and content
moderation through open-source plugins to optimize the performance, security,
and cost control of AI applications."
+tags: [Ecosystem]
+---
+
+> This article provides an in-depth look at the AI gateway features of the current and upcoming versions of APISIX. As a multifunctional API and AI gateway, Apache APISIX offers efficient and secure LLM API calls for AI applications.
+
+<!--truncate-->
+
+## Introduction: The Rise of AI Agents and the Evolution of AI Gateway
+
+In recent years, AI agents such as AutoGPT, Chatbots, and AI Assistants have seen rapid development. These applications rely heavily on API calls to large language models (LLMs), which brings challenges around high concurrency, cost control, and security.
+
+Traditional API gateways primarily serve Web APIs and microservices and are not optimized for the unique needs of AI applications. This has led to the emergence of the AI gateway as a distinct concept. An AI gateway needs to provide enhanced capabilities in the following areas:
+
+- **Multi-LLM Proxy**: Support for multiple LLM providers to avoid vendor
lock-in.
+- **Token Rate Limiting**: Prevent API abuse and optimize cost management.
+- **Security Protection**: Including prompt filtering and content moderation
to ensure compliance of AI applications.
+- **Smart Traffic Management**: Dynamically adjust LLM weights based on cost,
latency, and stability.
+
+Apache APISIX is not only an API gateway but also an AI gateway through its
plugins, helping AI applications call LLM APIs more efficiently and securely.
+
+## LLM Proxy: Efficient Management of Multiple LLM Backends
+
+AI applications typically do not rely on a single LLM provider but need to
dynamically select the best model based on requirements. For example:
+
+- Using OpenAI GPT-4 for general text generation and Claude for legal document
processing.
+- Switching between Mistral and Gemini to optimize cost and throughput.
+
+**Apache APISIX's LLM Proxy offers the following capabilities:**
+
+✅ Support for Multiple LLM Providers: Including OpenAI, DeepSeek, Claude,
Mistral, Gemini, etc., to avoid vendor lock-in.
+
+✅ LLM Weight and Priority Management: Adjust traffic distribution based on
business needs.
+
+✅ Multi-LLM Load Balancing: Dynamically adjust LLM weights based on latency,
cost, and stability.
+
+✅ Retry and Fallback Mechanisms: Ensure business continuity if an LLM API
fails.
+
+✅ Load Balancing Across Different Providers of the Same LLM:
+
+For example:
+
+- Privately deployed DeepSeek.
+- Official DeepSeek API.
+- DeepSeek API from Volcano Engine.
+
+Users can flexibly allocate traffic weights among different DeepSeek providers
based on latency, stability, and price to achieve the best calling strategy.
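+
+As a rough illustration, the snippet below registers such a weighted, multi-provider setup through the APISIX Admin API using Python. This is a minimal sketch: the Admin API address, the admin key placeholder, and the plugin and field names (modeled on APISIX's multi-LLM proxy plugins such as `ai-proxy-multi`) are assumptions to be checked against the plugin reference for your APISIX version.
+
+```python
+# Sketch: weighted load balancing across three DeepSeek providers via the
+# APISIX Admin API. Plugin name and config fields are illustrative assumptions.
+import requests
+
+ADMIN_API = "http://127.0.0.1:9180/apisix/admin"  # default Admin API address (assumed)
+ADMIN_KEY = "<your-admin-key>"                    # placeholder
+
+route = {
+    "uri": "/v1/chat/completions",
+    "plugins": {
+        "ai-proxy-multi": {  # assumed multi-LLM proxy plugin name
+            "instances": [
+                {"name": "deepseek-self-hosted", "provider": "openai-compatible",
+                 "weight": 50, "options": {"model": "deepseek-chat"},
+                 "override": {"endpoint": "http://deepseek.internal:8000/v1/chat/completions"}},
+                {"name": "deepseek-official", "provider": "deepseek",
+                 "weight": 30, "options": {"model": "deepseek-chat"},
+                 "auth": {"header": {"Authorization": "Bearer <DEEPSEEK_KEY>"}}},
+                {"name": "deepseek-volcano", "provider": "openai-compatible",
+                 "weight": 20, "options": {"model": "deepseek-chat"},
+                 "auth": {"header": {"Authorization": "Bearer <VOLCANO_KEY>"}}},
+            ]
+        }
+    },
+}
+
+resp = requests.put(f"{ADMIN_API}/routes/1", json=route,
+                    headers={"X-API-KEY": ADMIN_KEY}, timeout=10)
+resp.raise_for_status()
+```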
+
+These capabilities enable AI applications to adapt flexibly to different LLMs,
improve reliability, and reduce API calling costs.
+
+## AI Security Protection: Ensuring Safe and Compliant Use of AI
+
+AI APIs may expose sensitive data, return misleading information, and be misused. Therefore, an AI gateway needs to provide security at multiple levels.
+
+**The AI security capabilities provided by Apache APISIX include:**
+
+✅ **AI RAG (Retrieval-Augmented Generation)**: Supports enterprise-owned
knowledge bases to reduce LLM hallucinations and improve output reliability.
+
+✅ **Prompt Guard**: Automatically intercepts sensitive, illegal, and
inappropriate prompts to prevent malicious use by users.
+
+✅ **Prompt Decorator**: Automatically adds content before and after user input
to enhance the quality of LLM-generated content.
+
+✅ **Prompt Template**: Makes it easier for users to reuse standardized prompts
and improve interaction experience.
+
+✅ **Response Filtering & Moderation**: Intercepts sensitive or non-compliant
AI-generated content.
+
+✅ **Logging & Auditing**: Provides complete API request logs for compliance
audits.
+
+These security measures ensure that AI applications meet enterprise-level
security requirements and avoid compliance risks due to misleading AI content.
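+
+For a sense of how these guards are wired in, here is a minimal sketch of the prompt-protection configuration such a setup might attach to a route, written as a plain Python dict. The plugin names follow APISIX's AI plugin family (`ai-prompt-guard`, `ai-prompt-decorator`), but the exact field names are assumptions; verify them against the plugin documentation for your APISIX version.
+
+```python
+# Sketch: prompt protection attached to an LLM route.
+# Field names are illustrative assumptions, not an authoritative schema.
+prompt_protection = {
+    "ai-prompt-guard": {
+        # reject requests whose prompt matches any deny pattern
+        "deny_patterns": [
+            r"(?i)ignore previous instructions",
+            r"(?i)credit card number",
+        ],
+    },
+    "ai-prompt-decorator": {
+        # content silently added before and after the user's input
+        "prepend": [{"role": "system",
+                     "content": "You are a support assistant. Answer only from the approved knowledge base."}],
+        "append": [{"role": "user",
+                    "content": "Cite the document ID for every claim."}],
+    },
+}
+
+# This dict would sit under the "plugins" key of a route object sent to the
+# Admin API, alongside the LLM proxy configuration shown earlier.
+```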
+
+## Token Observability and Management: Preventing High Bills Due to API Abuse
+
+Calling LLM APIs consumes tokens, and API abuse can lead to significant costs.
Apache APISIX provides fine-grained token monitoring and management mechanisms.
+
+**The token management capabilities of Apache APISIX include:**
+
+✅ Token Rate Limiting by Route/Service/Consumer/Consumer Group/Custom Dimension
+
+✅ Support for Multiple Rate Limiting Modes:
+
+- Single-machine vs. cluster rate limiting to accommodate different scales of
AI API services.
+- Fixed time window vs. sliding time window to flexibly control API rates.
+
+✅ Different Rate Limiting Policies for Different LLMs: Prevent cost overruns.
+
+Through Apache APISIX, enterprises can achieve fine-grained management of
token resources and prevent high bills due to API abuse.
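+
+To make the fixed-window vs. sliding-window distinction concrete, the sketch below implements a tiny in-memory sliding-window token budget of the kind a gateway enforces per consumer. It is a conceptual illustration only, not APISIX's implementation; in APISIX the equivalent behavior is configured declaratively through its rate limiting plugins.
+
+```python
+# Conceptual sketch: per-consumer sliding-window token budget.
+# Not APISIX code -- it only illustrates the policy the gateway enforces.
+import time
+from collections import defaultdict, deque
+
+WINDOW_SECONDS = 60
+TOKEN_BUDGET = 10_000  # max tokens per consumer within the sliding window
+
+_usage = defaultdict(deque)  # consumer -> deque of (timestamp, tokens)
+
+def allow_request(consumer: str, tokens_requested: int) -> bool:
+    """Return True if the consumer still has token budget in the last window."""
+    now = time.monotonic()
+    window = _usage[consumer]
+    # drop usage records that have fallen out of the sliding window
+    while window and now - window[0][0] > WINDOW_SECONDS:
+        window.popleft()
+    used = sum(tokens for _, tokens in window)
+    if used + tokens_requested > TOKEN_BUDGET:
+        return False  # the gateway would reject this, e.g. with HTTP 429
+    window.append((now, tokens_requested))
+    return True
+
+# Example: a consumer burning through its budget in one window
+for _ in range(5):
+    print(allow_request("team-a", 3_000))  # True, True, True, False, False
+```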
+
+## Smart Routing: Dynamic Traffic Management for AI APIs
+
+During AI API calls, different tasks may require different LLMs. For example:
+
+- Code generation requests → sent to GPT-4 or DeepSeek.
+- Long-form summarization tasks → sent to Claude.
+- General conversations → sent to GPT-3.5 or Gemini.
+
+**The smart routing capabilities of Apache APISIX include:**
+
+✅ Context-Aware Routing Based on Request Content:
+
+- Select the optimal LLM based on prompt type.
+- Allocate different models (GPT-4 Turbo vs. GPT-3.5) based on user level
(paid vs. free users).
+
+✅ Response Caching: Reduce redundant API calls and improve response speed.
+
+These capabilities help AI APIs run more efficiently, reduce API latency, and
increase throughput.
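+
+As a sketch of what such routing can look like in practice, the example below defines two routes that send paid-tier traffic to a stronger model while everyone else falls back to a cheaper one, using APISIX route `vars` matching on a request header. The `X-User-Tier` header and the `ai-proxy` fields are assumptions for illustration; adapt them to your own tiering scheme and to the plugin schema of your APISIX version.
+
+```python
+# Sketch: tier-based routing -- paid users get GPT-4 Turbo, others GPT-3.5.
+# The X-User-Tier header and the ai-proxy fields are illustrative assumptions.
+import requests
+
+ADMIN_API = "http://127.0.0.1:9180/apisix/admin"  # default Admin API address (assumed)
+HEADERS = {"X-API-KEY": "<your-admin-key>"}       # placeholder
+
+routes = {
+    # matched first: paid users, identified by a header set by the auth layer
+    "chat-paid-tier": {
+        "uri": "/v1/chat/completions",
+        "priority": 10,
+        "vars": [["http_x_user_tier", "==", "paid"]],
+        "plugins": {"ai-proxy": {"provider": "openai",
+                                 "options": {"model": "gpt-4-turbo"}}},
+    },
+    # default route for everyone else
+    "chat-free-tier": {
+        "uri": "/v1/chat/completions",
+        "priority": 1,
+        "plugins": {"ai-proxy": {"provider": "openai",
+                                 "options": {"model": "gpt-3.5-turbo"}}},
+    },
+}
+
+for route_id, conf in routes.items():
+    resp = requests.put(f"{ADMIN_API}/routes/{route_id}", json=conf,
+                        headers=HEADERS, timeout=10)
+    resp.raise_for_status()
+```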
+
+## Conclusion
+
+With the rapid development of AI technology, API gateways also need to evolve
to meet the unique needs of AI applications. Apache APISIX, with its LLM Proxy,
token rate limiting, security protection, and smart routing features, has
become the best choice for an AI gateway.
+
+**The core advantages of Apache APISIX compared to traditional API gateways
are:**
+
+🚀 Support for Multiple LLM Providers: Avoid vendor lock-in.
+
+⚡️ Smart Traffic Scheduling: Dynamic load balancing to improve API reliability.
+
+🔒 Built-in Security Capabilities: Including prompt protection and content
moderation to ensure secure and compliant AI APIs.
+
+💰 Token Rate Limiting: Prevent high bills due to API abuse.
+
+📊 High-performance Architecture: Meet the high concurrency needs of AI
applications.
+
+If you are building AI-related applications and want to have both a powerful
API gateway and AI gateway, give Apache APISIX a try!
diff --git a/blog/zh/blog/2025/02/24/apisix-ai-gateway-features.md
b/blog/zh/blog/2025/02/24/apisix-ai-gateway-features.md
new file mode 100644
index 00000000000..8805836a333
--- /dev/null
+++ b/blog/zh/blog/2025/02/24/apisix-ai-gateway-features.md
@@ -0,0 +1,135 @@
+---
+title: "APISIX 的 AI Gateway 功能一览:LLM 代理、Token 限流、安全防护"
+keywords:
+ - APISIX
+ - AI Gateway
+ - LLM Proxy
+ - API Gateway for AI
+ - Token Rate Limiting
+ - AI Security
+ - AI Traffic Management
+ - Open-Source API Gateway
+ - Multi-LLM Load Balancing
+ - AI API Protection
+ - AI Request Throttling
+description: Explore the powerful features of the APISIX AI Gateway, including LLM proxy, intelligent traffic scheduling, token rate limiting, and security protection. Achieve multi-LLM load balancing, API rate control, and content moderation through open-source plugins to optimize the performance, security, and cost control of AI applications.
+tags: [Ecosystem]
+---
+
+> This article provides a detailed look at the AI gateway features in the current and upcoming versions of APISIX. As a multifunctional API and AI gateway, Apache APISIX offers efficient and secure LLM API calls for AI applications.
+
+<!--truncate-->
+
+## Introduction: The Rise of AI Agents and the Evolution of the AI Gateway
+
+In recent years, AI agents have developed rapidly, with applications such as AutoGPT, Chatbots, and AI Assistants emerging constantly. They rely on API calls to large language models (LLMs), which brings challenges around high concurrency, cost control, and security.
+
+Traditional API gateways mainly serve Web APIs and microservices and are not optimized for the specific needs of AI applications, which gave rise to the concept of the AI gateway. An AI gateway needs to provide enhanced capabilities in the following areas:
+
+- **Multi-LLM Proxy**: Support multiple LLM providers to avoid vendor lock-in
+- **Token Rate Limiting**: Prevent API abuse and optimize cost management
+- **Security Protection**: Including prompt filtering and content moderation to ensure compliance of AI applications
+- **Smart Traffic Management**: Dynamically adjust LLM weights based on cost, latency, and stability
+
+Apache APISIX is not only an API gateway; through its plugins it also serves as an AI gateway, helping AI applications call LLM APIs more efficiently and securely.
+
+## LLM Proxy: Efficient Management of Multiple LLM Backends
+
+AI applications usually do not rely on a single LLM provider, but need to dynamically select the best model based on requirements. For example:
+
+- Use OpenAI GPT-4 for general text generation and Claude for legal document processing
+- Switch between Mistral and Gemini to optimize cost and throughput
+
+**Apache APISIX's LLM Proxy provides the following capabilities:**
+
+✅ Support for multiple LLM providers (OpenAI, DeepSeek, Claude, Mistral, Gemini, etc.) to avoid vendor lock-in
+
+✅ LLM weight and priority management, adjusting traffic distribution based on business needs
+
+✅ Intelligent load balancing that dynamically adjusts LLM weights based on latency, cost, and stability
+
+✅ Retry and fallback mechanisms to keep the business running when an LLM API fails
+
+✅ Load balancing across different providers of the same LLM, for example:
+
+- Privately deployed DeepSeek
+- Official DeepSeek API
+- DeepSeek API from Volcano Engine
+
+Users can flexibly allocate traffic weights among different DeepSeek providers based on latency, stability, and price to achieve the best calling strategy.
+
+These capabilities allow AI applications to adapt flexibly to different LLMs, improve reliability, and reduce API calling costs.
+
+## AI Security Protection: Ensuring Safe and Compliant Use of AI
+
+AI APIs may involve sensitive data and misleading information, and may even be abused. Therefore, an AI gateway needs to provide security guarantees at multiple levels.
+
+**The AI security capabilities provided by Apache APISIX include:**
+
+✅ AI RAG (Retrieval-Augmented Generation): supports enterprise-owned knowledge bases to reduce LLM hallucinations and improve output reliability
+
+✅ Prompt Guard: automatically intercepts sensitive, illegal, and inappropriate prompts to prevent malicious use
+
+✅ Prompt Decorator: automatically adds content before and after user input to improve the quality of LLM output
+
+✅ Prompt Template: makes it easier for users to reuse standardized prompts and improves the interaction experience
+
+✅ Response moderation: intercepts sensitive or non-compliant AI-generated content
+
+✅ Logging and auditing: provides complete API request logs for compliance audits
+
+These security measures ensure that AI applications meet enterprise-level security requirements and avoid compliance risks caused by misleading AI content.
+
+## Token Observability and Management: Preventing High Bills Caused by API Abuse
+
+Calling LLM APIs consumes tokens, and API abuse can lead to enormous costs. Apache APISIX provides fine-grained token monitoring and management mechanisms.
+
+**The token management capabilities of Apache APISIX include:**
+
+✅ Token rate limiting by Route/Service/Consumer/Consumer Group/custom dimensions
+
+✅ Support for multiple rate limiting modes:
+
+- Single-machine vs. cluster rate limiting, to fit AI API services of different scales
+- Fixed time window vs. sliding time window, to control API rates flexibly
+
+✅ Different rate limiting policies for different LLMs, to keep costs under control
+
+With Apache APISIX, enterprises can achieve fine-grained management of token resources and prevent high bills caused by API abuse.
+
+## Smart Routing: Dynamic Traffic Management for AI APIs
+
+During AI API calls, different tasks may require different LLMs. For example:
+
+- Code generation requests → sent to GPT-4 or DeepSeek
+- Long-form summarization tasks → sent to Claude
+- General conversations → sent to GPT-3.5 or Gemini
+
+**The smart routing capabilities of Apache APISIX include:**
+
+✅ Context-Aware Routing based on request content
+
+- Select the optimal LLM based on the prompt type
+- Allocate different models (GPT-4 Turbo vs. GPT-3.5) based on user level (paid vs. free users)
+
+✅ Response Caching, reducing redundant API calls and improving response speed
+
+These capabilities help AI APIs run more efficiently, reduce API latency, and increase throughput.
+
+## Conclusion
+
+With the rapid development of AI technology, API gateways also need to keep evolving to meet the specific needs of AI applications. With features such as LLM proxy, token rate limiting, security protection, and smart routing, Apache APISIX has become the best choice for an AI gateway.
+
+**The core advantages of Apache APISIX over traditional API gateways:**
+
+🚀 Support for multiple LLM providers, avoiding vendor lock-in
+
+⚡️ Smart traffic scheduling and dynamic load balancing to improve API reliability
+
+🔒 Built-in security capabilities, including prompt protection and content moderation, to keep AI APIs secure and compliant
+
+💰 Token rate limiting to prevent high bills caused by API abuse
+
+📊 High-performance architecture to meet the high-concurrency needs of AI applications
+
+If you are building AI-related applications and want a powerful API gateway and AI gateway in one, give Apache APISIX a try! 🎯