Copilot commented on code in PR #12941:
URL: https://github.com/apache/apisix/pull/12941#discussion_r2802885516


##########
apisix/plugins/ai-rag.lua:
##########
@@ -14,65 +14,113 @@
 -- See the License for the specific language governing permissions and
 -- limitations under the License.
 --
-local next    = next
-local require = require
-local ngx_req = ngx.req
+local next     = next
+local require  = require
+local ngx_req  = ngx.req
+local table    = table
+local ipairs   = ipairs
+local pcall    = pcall
+local tostring = tostring
 
-local http     = require("resty.http")
 local core     = require("apisix.core")
 
-local azure_openai_embeddings = 
require("apisix.plugins.ai-rag.embeddings.azure_openai").schema
-local azure_ai_search_schema = 
require("apisix.plugins.ai-rag.vector-search.azure_ai_search").schema
+local openai_base_embeddings_schema = 
require("apisix.plugins.ai-rag.embeddings.openai-base").schema
+local azure_ai_search_schema = 
require("apisix.plugins.ai-rag.vector-search.azure-ai-search").schema
+local cohere_rerank_schema = 
require("apisix.plugins.ai-rag.rerank.cohere").schema
 
 local HTTP_INTERNAL_SERVER_ERROR = ngx.HTTP_INTERNAL_SERVER_ERROR
 local HTTP_BAD_REQUEST = ngx.HTTP_BAD_REQUEST
 
+local embeddings_drivers = {}
+local vector_search_drivers = {}
+local rerank_drivers = {}
+
+local plugin_name = "ai-rag"
+
+local input_strategy_enum = {
+    last = "last",
+    all = "all"
+}
+
 local schema = {
     type = "object",
     properties = {
-        type = "object",
         embeddings_provider = {
             type = "object",
-            properties = {
-                azure_openai = azure_openai_embeddings
+            oneOf = {
+                {
+                    properties = {
+                        openai = openai_base_embeddings_schema
+                    },
+                    required = { "openai" },
+                    additionalProperties = false
+                },
+                {
+                    properties = {
+                        ["azure-openai"] = openai_base_embeddings_schema
+                    },
+                    required = { "azure-openai" },
+                    additionalProperties = false
+                },
+                {
+                    properties = {
+                        ["openai-compatible"] = openai_base_embeddings_schema
+                    },
+                    required = { "openai-compatible" },
+                    additionalProperties = false
+                }
             },
-            -- ensure only one provider can be configured while implementing 
support for
-            -- other providers
-            required = { "azure_openai" },
-            maxProperties = 1,
+            description = "Configuration for the embeddings provider."
         },
         vector_search_provider = {
             type = "object",
-            properties = {
-                azure_ai_search = azure_ai_search_schema
+            oneOf = {
+                {
+                    properties = {
+                        ["azure-ai-search"] = azure_ai_search_schema
+                    },
+                    required = { "azure-ai-search" },
+                    additionalProperties = false
+                }
             },
-            -- ensure only one provider can be configured while implementing 
support for
-            -- other providers
-            required = { "azure_ai_search" },
-            maxProperties = 1
+            description = "Configuration for the vector search provider."
         },
-    },
-    required = { "embeddings_provider", "vector_search_provider" }
-}
-
-local request_schema = {
-    type = "object",
-    properties = {
-        ai_rag = {
+        rerank_provider = {
+            type = "object",
+            oneOf = {
+                {
+                    properties = {
+                        cohere = cohere_rerank_schema
+                    },
+                    required = { "cohere" },
+                    additionalProperties = false
+                }
+            },
+            description = "Configuration for the rerank provider."
+        },
+        rag_config = {
             type = "object",
             properties = {
-                vector_search = {},
-                embeddings = {},
+                input_strategy = {
+                    type = "string",
+                    enum = { input_strategy_enum.last, input_strategy_enum.all 
},
+                    default = input_strategy_enum.last,
+                    description = "Strategy for extracting input text from 
messages."
+                            .. "'last' uses the last user message"

Review Comment:
   The description for `input_strategy` is missing spaces between sentences. It 
should have spaces after the periods to improve readability.
   ```suggestion
                       description = "Strategy for extracting input text from 
messages. "
                               .. "'last' uses the last user message. "
   ```



##########
docs/zh/latest/plugins/ai-rag.md:
##########
@@ -176,60 +162,29 @@ curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
       }
     }
   }
-}'
+}''
 ```
 
-向路由发送 POST 请求,在请求体中包含向量字段名称、嵌入模型维度和输入提示:
+向路由发送 POST 请求:
 
 ```shell
 curl "http://127.0.0.1:9080/rag" -X POST \
   -H "Content-Type: application/json" \
-  -d '{
-    "ai_rag":{
-      "vector_search":{
-        "fields":"contentVector"
-      },
-      "embeddings":{
-        "input":"Which Azure services are good for DevOps?",
-        "dimensions":1024
-      }
-    }
-  }'
+  -d ''{
+    "messages": [
+        {
+            "role": "user",
+            "content": "Which Azure services are good for DevOps?"
+        }
+    ]
+  }''

Review Comment:
   The shell command syntax is incorrect. Line 165 has `}''` (two single 
quotes) and line 180 also has `}''`. In shell, when using `-d` with curl, the 
closing should be `}'` (single quote). The double quotes are causing malformed 
JSON. Compare with the English version (line 165 in English doc) which 
correctly uses `}'`.



##########
docs/en/latest/plugins/ai-rag.md:
##########
@@ -37,69 +37,40 @@ description: The ai-rag Plugin enhances LLM outputs with 
Retrieval-Augmented Gen
 
 The `ai-rag` Plugin provides Retrieval-Augmented Generation (RAG) capabilities 
with LLMs. It facilitates the efficient retrieval of relevant documents or 
information from external data sources, which are used to enhance the LLM 
responses, thereby improving the accuracy and contextual relevance of the 
generated outputs.
 
-The Plugin supports using [Azure 
OpenAI](https://azure.microsoft.com/en-us/products/ai-services/openai-service) 
and [Azure AI 
Search](https://azure.microsoft.com/en-us/products/ai-services/ai-search) 
services for generating embeddings and performing vector search.
-
-**_As of now only [Azure 
OpenAI](https://azure.microsoft.com/en-us/products/ai-services/openai-service) 
and [Azure AI 
Search](https://azure.microsoft.com/en-us/products/ai-services/ai-search) 
services are supported for generating embeddings and performing vector search 
respectively. PRs for introducing support for other service providers are 
welcomed._**
+The Plugin supports using 
[OpenAI](https://platform.openai.com/docs/api-reference/embeddings) or [Azure 
OpenAI](https://learn.microsoft.com/en-us/azure/search/vector-search-how-to-generate-embeddings?tabs=rest-api)
 services for generating embeddings, [Azure AI 
Search](https://azure.microsoft.com/en-us/products/ai-services/ai-search) 
services for performing vector search, and optionally [Cohere 
Rerank](https://docs.cohere.com/docs/rerank-overview) services for reranking 
the retrieval results.
 
 ## Attributes
 
-| Name                                      |   Required   |   Type   |   
Description                                                                     
                                                        |
-| ----------------------------------------------- | ------------ | -------- | 
-----------------------------------------------------------------------------------------------------------------------------------------
 |
-| embeddings_provider                             | True          | object   | 
Configurations of the embedding models provider.                                
                                                           |
-| embeddings_provider.azure_openai                | True          | object   | 
Configurations of [Azure 
OpenAI](https://azure.microsoft.com/en-us/products/ai-services/openai-service) 
as the embedding models provider. |
-| embeddings_provider.azure_openai.endpoint       | True          | string   | 
Azure OpenAI embedding model endpoint.                                          
                                        |
-| embeddings_provider.azure_openai.api_key        | True          | string   | 
Azure OpenAI API key.                                                           
                                                         |
-| vector_search_provider                          | True          | object   | 
Configuration for the vector search provider.                                   
                                                           |
-| vector_search_provider.azure_ai_search          | True          | object   | 
Configuration for Azure AI Search.                                              
                                                           |
-| vector_search_provider.azure_ai_search.endpoint | True          | string   | 
Azure AI Search endpoint.                                                       
                                                           |
-| vector_search_provider.azure_ai_search.api_key  | True          | string   | 
Azure AI Search API key.                                                        
                                                          |
-
-## Request Body Format
-
-The following fields must be present in the request body.
-
-|   Field              |   Type   |    Description                             
                                                                                
      |
-| -------------------- | -------- | 
-------------------------------------------------------------------------------------------------------------------------------
 |
-| ai_rag               | object   | Request body RAG specifications.           
                                                                   |
-| ai_rag.embeddings    | object   | Request parameters required to generate 
embeddings. Contents will depend on the API specification of the configured 
provider.   |
-| ai_rag.vector_search | object   | Request parameters required to perform 
vector search. Contents will depend on the API specification of the configured 
provider. |
-
-- Parameters of `ai_rag.embeddings`
-
-  - Azure OpenAI
-
-  |   Name          |   Required   |   Type   |   Description                  
                                                                                
            |
-  | --------------- | ------------ | -------- | 
--------------------------------------------------------------------------------------------------------------------------
 |
-  | input           | True          | string   | Input text used to compute 
embeddings, encoded as a string.                                                
                |
-  | user            | False           | string   | A unique identifier 
representing your end-user, which can help in monitoring and detecting abuse.   
                       |
-  | encoding_format | False           | string   | The format to return the 
embeddings in. Can be either `float` or `base64`. Defaults to `float`.          
                  |
-  | dimensions      | False           | integer  | The number of dimensions 
the resulting output embeddings should have. Only supported in text-embedding-3 
and later models. |
-
-For other parameters please refer to the [Azure OpenAI embeddings 
documentation](https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#embeddings).
-
-- Parameters of `ai_rag.vector_search`
-
-  - Azure AI Search
-
-  |   Field   |   Required   |   Type   |   Description                |
-  | --------- | ------------ | -------- | ---------------------------- |
-  | fields    | True          | String   | Fields for the vector search. |
-
-  For other parameters please refer the [Azure AI Search 
documentation](https://learn.microsoft.com/en-us/rest/api/searchservice/documents/search-post).
-
-Example request body:
-
-```json
-{
-  "ai_rag": {
-    "vector_search": { "fields": "contentVector" },
-    "embeddings": {
-      "input": "which service is good for devops",
-      "dimensions": 1024
-    }
-  }
-}
-```
+| Name                                      |   Required   |   Type   | Valid 
Values | Description                                                            
                                                                 |
+| ----------------------------------------------- | ------------ | -------- | 
--- | 
-----------------------------------------------------------------------------------------------------------------------------------------
 |
+| embeddings_provider                             | True         | object   | 
openai, azure-openai, openai-compatible | Configurations of the embedding 
models provider. Must and can only specify one. Currently supports `openai`, 
`azure-openai`, `openai-compatible`.                                            
                                             |
+| vector_search_provider                          | True         | object   | 
azure-ai-search | Configuration for the vector search provider.                 
                                                                             |
+| vector_search_provider.azure-ai-search          | True         | object   |  
| Configuration for Azure AI Search.                                            
                                                             |
+| vector_search_provider.azure-ai-search.endpoint | True         | string   |  
| Azure AI Search endpoint.                                                     
                                                             |
+| vector_search_provider.azure-ai-search.api_key  | True         | string   |  
| Azure AI Search API key.                                                      
                                                            |
+| vector_search_provider.azure-ai-search.fields   | True         | string   |  
| Target fields for vector search.                                              
                                             |
+| vector_search_provider.azure-ai-search.select   | True         | string   |  
| Fields to select in the response.                                             
                               |
+| vector_search_provider.azure-ai-search.exhaustive| False       | boolean  |  
| Whether to perform an exhaustive search. Defaults to `true`.                  
                                                                     |
+| vector_search_provider.azure-ai-search.k        | False        | integer  | 
>0 | Number of nearest neighbors to return. Defaults to 5.                      
                                                                        |
+| rerank_provider                                 | False        | object   | 
cohere | Configuration for the rerank provider.                                 
                                                               |
+| rerank_provider.cohere                          | False        | object   |  
| Configuration for Cohere Rerank.                                              
                                                              |
+| rerank_provider.cohere.endpoint                 | False        | string   |  
| Cohere Rerank API endpoint. Defaults to `https://api.cohere.ai/v1/rerank`.    
                                                           |
+| rerank_provider.cohere.api_key                  | True         | string   |  
| Cohere API key.                                                               
                                                     |
+| rerank_provider.cohere.model                    | False        | string   |  
| Rerank model name. Defaults to `Cohere-rerank-v4.0-fast`.                     
                                                               |

Review Comment:
   The documentation states that `rerank_provider.cohere.model` defaults to 
`Cohere-rerank-v4.0-fast`, but this default is not present in the schema 
definition (apisix/plugins/ai-rag/rerank/cohere.lua lines 36-39). The schema 
does not specify a default value for the `model` property, so either the 
documentation should be updated to reflect this, or a default should be added 
to the schema.



##########
docs/zh/latest/plugins/ai-rag.md:
##########
@@ -151,15 +126,26 @@ curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
   "plugins": {
     "ai-rag": {
       "embeddings_provider": {
-        "azure_openai": {
+        "azure-openai": {
           "endpoint": "'"$AZ_EMBEDDINGS_ENDPOINT"'",
           "api_key": "'"$AZ_OPENAI_API_KEY"'"
         }
       },
       "vector_search_provider": {
-        "azure_ai_search": {
+        "azure-ai-search": {
           "endpoint": "'"$AZ_AI_SEARCH_ENDPOINT"'",
-          "api_key": "'"$AZ_AI_SEARCH_KEY"'"
+          "api_key": "'"$AZ_AI_SEARCH_KEY"'",
+          "fields": "contentVector",
+          "select": "content",
+          "k": 10
+        }
+      },
+      "rerank_provider": {
+        "cohere": {
+            "endpoint":"'"$COHERE_DOMAIN"'",
+            "api_key": "'"$COHERE_API_KEY"'",
+            "model": ""'"COHERE_MODEL"'",

Review Comment:
   There's a syntax error in the shell command. Line 147 has `"model": 
""'"COHERE_MODEL"'",` which has an extra double quote before the variable 
substitution and is missing the `$` prefix for the environment variable. It 
should be `"model": "'"$COHERE_MODEL"'",` to match the pattern used for other 
variables in the same file (see lines 130, 131, 136, 137, 145, 146).
   ```suggestion
               "model": "'"$COHERE_MODEL"'",
   ```



##########
apisix/plugins/ai-rag/rerank/cohere.lua:
##########
@@ -0,0 +1,117 @@
+--
+-- Licensed to the Apache Software Foundation (ASF) under one or more
+-- contributor license agreements.  See the NOTICE file distributed with
+-- this work for additional information regarding copyright ownership.
+-- The ASF licenses this file to You under the Apache License, Version 2.0
+-- (the "License"); you may not use this file except in compliance with
+-- the License.  You may obtain a copy of the License at
+--
+--     http://www.apache.org/licenses/LICENSE-2.0
+--
+-- Unless required by applicable law or agreed to in writing, software
+-- distributed under the License is distributed on an "AS IS" BASIS,
+-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+-- See the License for the specific language governing permissions and
+-- limitations under the License.
+--
+local core = require("apisix.core")
+local http = require("resty.http")
+local type = type
+local ipairs = ipairs
+
+local _M = {}
+
+_M.schema = {
+    type = "object",
+    properties = {
+        endpoint = {
+            type = "string",
+            default = "https://api.cohere.ai/v2/rerank",
+            description = "The endpoint for the Cohere Rerank API."
+        },
+        api_key = {
+            type = "string",
+            description = "The API key for authentication."
+        },
+        model = {
+            type = "string",
+            description = "The model to use for reranking."
+        },
+        top_n = {
+            type = "integer",
+            minimum = 1,
+            default = 3,
+            description = "The number of top results to return."
+        }
+    },
+    required = { "api_key", "model" }

Review Comment:
   The schema marks `model` as required (line 47), but it does not define a 
default for it (lines 36-39), while the documentation claims `model` defaults 
to `Cohere-rerank-v4.0-fast`. Either add that default to the schema and drop 
`model` from the `required` array, or update the documentation to state that 
`model` is required and has no default.



##########
apisix/plugins/ai-rag.lua:
##########
@@ -82,67 +130,141 @@ function _M.check_schema(conf)
 end
 
 
+local function get_input_text(messages, strategy)
+    if not messages or #messages == 0 then
+        return nil
+    end
+
+    if strategy == input_strategy_enum.last then
+        for i = #messages, 1, -1 do
+            if messages[i].role == "user" then
+                return messages[i].content
+            end
+        end
+    elseif strategy == input_strategy_enum.all then
+        local contents = {}
+        for _, msg in ipairs(messages) do
+            if msg.role == "user" then
+                core.table.insert(contents, msg.content)
+            end
+        end
+        if #contents > 0 then
+            return table.concat(contents, "\n")
+        end
+    end
+    return nil
+end
+
+
+local function load_driver(category, name, cache)
+    local driver = cache[name]
+    if driver then
+        return driver
+    end
+
+    local pkg_path = "apisix.plugins.ai-rag." .. category .. "." .. name
+    local ok, mod = pcall(require, pkg_path)
+    if not ok then
+        return nil, "failed to load module " .. pkg_path .. ", err: " .. 
tostring(mod)
+    end
+
+    cache[name] = mod
+    return mod
+end
+
+
+local function inject_context_into_messages(messages, docs)
+    if not docs or #docs == 0 then
+        return
+    end
+
+    local context_str = core.table.concat(docs, "\n\n")
+    local augment = {
+        role = "user",
+        content = "Context:\n" .. context_str
+    }
+    if #messages > 0 then
+        -- Insert context before the last message (which is typically the 
user's latest query)
+        -- to ensure the LLM considers the context relevant to the immediate 
question.
+        core.table.insert(messages, #messages, augment)

Review Comment:
   The `inject_context_into_messages` function uses 
`core.table.insert(messages, #messages, augment)` to insert the context before 
the last message. However, `table.insert(t, pos, value)` inserts at position 
`pos`, shifting existing elements. This means the context will be inserted at 
position `#messages`, which pushes the last message to position `#messages + 
1`. This is correct behavior, but it might be clearer to add a comment 
explaining that the context is being inserted as the second-to-last message.



##########
apisix/plugins/ai-rag.lua:
##########
@@ -82,67 +130,141 @@ function _M.check_schema(conf)
 end
 
 
+local function get_input_text(messages, strategy)
+    if not messages or #messages == 0 then
+        return nil
+    end
+
+    if strategy == input_strategy_enum.last then
+        for i = #messages, 1, -1 do
+            if messages[i].role == "user" then
+                return messages[i].content
+            end
+        end
+    elseif strategy == input_strategy_enum.all then
+        local contents = {}
+        for _, msg in ipairs(messages) do
+            if msg.role == "user" then
+                core.table.insert(contents, msg.content)
+            end
+        end
+        if #contents > 0 then
+            return table.concat(contents, "\n")
+        end
+    end
+    return nil
+end
+
+
+local function load_driver(category, name, cache)
+    local driver = cache[name]
+    if driver then
+        return driver
+    end
+
+    local pkg_path = "apisix.plugins.ai-rag." .. category .. "." .. name
+    local ok, mod = pcall(require, pkg_path)
+    if not ok then
+        return nil, "failed to load module " .. pkg_path .. ", err: " .. 
tostring(mod)
+    end
+
+    cache[name] = mod
+    return mod
+end
+
+
+local function inject_context_into_messages(messages, docs)
+    if not docs or #docs == 0 then
+        return
+    end
+
+    local context_str = core.table.concat(docs, "\n\n")
+    local augment = {
+        role = "user",
+        content = "Context:\n" .. context_str
+    }
+    if #messages > 0 then
+        -- Insert context before the last message (which is typically the 
user's latest query)
+        -- to ensure the LLM considers the context relevant to the immediate 
question.
+        core.table.insert(messages, #messages, augment)
+    else
+        core.table.insert_tail(messages, augment)
+    end
+end
+
+
 function _M.access(conf, ctx)
-    local httpc = http.new()
     local body_tab, err = core.request.get_json_request_body_table()
     if not body_tab then
         return HTTP_BAD_REQUEST, err
     end
-    if not body_tab["ai_rag"] then
-        core.log.error("request body must have \"ai-rag\" field")
-        return HTTP_BAD_REQUEST
-    end
-
-    local embeddings_provider = next(conf.embeddings_provider)
-    local embeddings_provider_conf = 
conf.embeddings_provider[embeddings_provider]
-    local embeddings_driver = require("apisix.plugins.ai-rag.embeddings." .. 
embeddings_provider)
 
-    local vector_search_provider = next(conf.vector_search_provider)
-    local vector_search_provider_conf = 
conf.vector_search_provider[vector_search_provider]
-    local vector_search_driver = 
require("apisix.plugins.ai-rag.vector-search." ..
-                                        vector_search_provider)
+    -- 1. Extract Input
+    local rag_conf = conf.rag_config or {}
+    local input_strategy = rag_conf.input_strategy or input_strategy_enum.last
+    local input_text = get_input_text(body_tab.messages, input_strategy)
 
-    local vs_req_schema = vector_search_driver.request_schema
-    local emb_req_schema = embeddings_driver.request_schema
+    if not input_text then
+        core.log.warn("no user input found for embedding")
+        return

Review Comment:
   When no user input is found, the function returns early (line 209) without 
an error status. This silently allows the request to proceed without RAG 
augmentation, which could be unexpected behavior. Consider returning an error 
status (e.g., HTTP_BAD_REQUEST) to clearly indicate that the request cannot be 
processed without a user message.
   ```suggestion
           return HTTP_BAD_REQUEST, "no user input found for embedding"
   ```



##########
docs/zh/latest/plugins/ai-rag.md:
##########
@@ -37,69 +37,40 @@ description: ai-rag 插件通过检索增强生成(RAG)增强 LLM 输出,
 
 `ai-rag` 插件为 LLM 提供检索增强生成(Retrieval-Augmented 
Generation,RAG)功能。它促进从外部数据源高效检索相关文档或信息,这些信息用于增强 LLM 响应,从而提高生成输出的准确性和上下文相关性。
 
-该插件支持使用 [Azure 
OpenAI](https://azure.microsoft.com/en-us/products/ai-services/openai-service) 
和 [Azure AI 
Search](https://azure.microsoft.com/en-us/products/ai-services/ai-search) 
服务来生成嵌入和执行向量搜索。
-
-**_目前仅支持 [Azure 
OpenAI](https://azure.microsoft.com/en-us/products/ai-services/openai-service) 
和 [Azure AI 
Search](https://azure.microsoft.com/en-us/products/ai-services/ai-search) 
服务来生成嵌入和执行向量搜索。欢迎提交 PR 以引入对其他服务提供商的支持。_**
+该插件支持使用 [OpenAI](https://platform.openai.com/docs/api-reference/embeddings) 或 
[Azure 
OpenAI](https://learn.microsoft.com/en-us/azure/search/vector-search-how-to-generate-embeddings?tabs=rest-api)
 服务生成嵌入,使用 [Azure AI 
Search](https://azure.microsoft.com/en-us/products/ai-services/ai-search) 
服务执行向量搜索,以及可选的 [Cohere Rerank](https://docs.cohere.com/docs/rerank-overview) 
服务对检索结果进行重排序。
 
 ## 属性
 
-| 名称                                      |   必选项   |   类型   |   描述            
                                                                                
                                 |
-| ----------------------------------------------- | ------------ | -------- | 
-----------------------------------------------------------------------------------------------------------------------------------------
 |
-| embeddings_provider                             | 是          | object   | 
嵌入模型提供商的配置。                                                                     
                      |
-| embeddings_provider.azure_openai                | 是          | object   | 
[Azure 
OpenAI](https://azure.microsoft.com/en-us/products/ai-services/openai-service) 
作为嵌入模型提供商的配置。 |
-| embeddings_provider.azure_openai.endpoint       | 是          | string   | 
Azure OpenAI 嵌入模型端点。                                                            
                      |
-| embeddings_provider.azure_openai.api_key        | 是          | string   | 
Azure OpenAI API 密钥。                                                            
                                                        |
-| vector_search_provider                          | 是          | object   | 
向量搜索提供商的配置。                                                                     
                         |
-| vector_search_provider.azure_ai_search          | 是          | object   | 
Azure AI Search 的配置。                                                            
                                             |
-| vector_search_provider.azure_ai_search.endpoint | 是          | string   | 
Azure AI Search 端点。                                                             
                                                     |
-| vector_search_provider.azure_ai_search.api_key  | 是          | string   | 
Azure AI Search API 密钥。                                                         
                                                         |
-
-## 请求体格式
-
-请求体中必须包含以下字段。
-
-|   字段              |   类型   |    描述                                           
                                                                        |
-| -------------------- | -------- | 
-------------------------------------------------------------------------------------------------------------------------------
 |
-| ai_rag               | object   | 请求体 RAG 规范。                                
                                              |
-| ai_rag.embeddings    | object   | 生成嵌入所需的请求参数。内容将取决于配置的提供商的 API 规范。   |
-| ai_rag.vector_search | object   | 执行向量搜索所需的请求参数。内容将取决于配置的提供商的 API 规范。 |
-
-- `ai_rag.embeddings` 的参数
-
-  - Azure OpenAI
-
-  |   名称          |   必选项   |   类型   |   描述                                    
                                                                          |
-  | --------------- | ------------ | -------- | 
--------------------------------------------------------------------------------------------------------------------------
 |
-  | input           | 是          | string   | 用于计算嵌入的输入文本,编码为字符串。              
                                                  |
-  | user            | 否           | string   | 代表您的最终用户的唯一标识符,可以帮助监控和检测滥用。     
                     |
-  | encoding_format | 否           | string   | 返回嵌入的格式。可以是 `float` 或 
`base64`。默认为 `float`。                            |
-  | dimensions      | 否           | integer  | 结果输出嵌入应具有的维数。仅在 
text-embedding-3 及更高版本的模型中支持。 |
-
-有关其他参数,请参阅 [Azure OpenAI 
嵌入文档](https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#embeddings)。
-
-- `ai_rag.vector_search` 的参数
-
-  - Azure AI Search
-
-  |   字段   |   必选项   |   类型   |   描述                |
-  | --------- | ------------ | -------- | ---------------------------- |
-  | fields    | 是          | String   | 向量搜索的字段。 |
-
-  有关其他参数,请参阅 [Azure AI Search 
文档](https://learn.microsoft.com/en-us/rest/api/searchservice/documents/search-post)。
-
-示例请求体:
-
-```json
-{
-  "ai_rag": {
-    "vector_search": { "fields": "contentVector" },
-    "embeddings": {
-      "input": "which service is good for devops",
-      "dimensions": 1024
-    }
-  }
-}
-```
+| 名称                                      |   必选项   |   类型   | 有效值 |  描述       
                                                                                
                                      |
+| ----------------------------------------------- | ------------ | -------- | 
--- | 
-----------------------------------------------------------------------------------------------------------------------------------------
 |
+| embeddings_provider                             | 是          | object   | 
openai, azure-openai, openai-compatible | 嵌入模型提供商的配置。必须且只能指定一种,当前支持 `openai`, 
`azure-openai`, `openai-compatible`                                             
                                            |
+| vector_search_provider                          | 是          | object   | 
azure-ai-search | 向量搜索提供商的配置。                                                   
                                           |
+| vector_search_provider.azure-ai-search          | 是          | object   |  | 
Azure AI Search 的配置。                                                            
                                             |
+| vector_search_provider.azure-ai-search.endpoint | 是          | string   |  | 
Azure AI Search 端点。                                                             
                                                     |
+| vector_search_provider.azure-ai-search.api_key  | 是          | string   |  | 
Azure AI Search API 密钥。                                                         
                                                         |
+| vector_search_provider.azure-ai-search.fields   | 是          | string   |  | 
向量搜索的目标字段。                                                                      
                     |
+| vector_search_provider.azure-ai-search.select   | 是          | string   |  | 
响应中选择返回的字段。                                                                     
       |
+| vector_search_provider.azure-ai-search.exhaustive| 否         | boolean  |  | 
是否进行详尽搜索。默认为 `true`。                                                            
                           |
+| vector_search_provider.azure-ai-search.k        | 否          | integer  | >0 
| 返回的最近邻数量。默认为 5。                                                               
                               |
+| rerank_provider                                 | 否          | object   | 
cohere | 重排序提供商的配置。                                                             
                                   |
+| rerank_provider.cohere                          | 否          | object   |  | 
Cohere Rerank 的配置。                                                              
                                              |
+| rerank_provider.cohere.endpoint                 | 否          | string   |  | 
Cohere Rerank API 端点。默认为 `https://api.cohere.ai/v1/rerank`。                     
                                          |
+| rerank_provider.cohere.api_key                  | 是          | string   |  | 
Cohere API 密钥。                                                                  
                                                  |
+| rerank_provider.cohere.model                    | 否          | string   |  | 
重排序模型名称。默认为 `Cohere-rerank-v4.0-fast`。                                          
                                          |
+| rerank_provider.cohere.top_n                    | 否          | integer  |  | 
重排序后保留的文档数量。默认为 3。                                                              
                                  |
+| rag_config                                      | 否          | object   |  | 
RAG 流程的通用配置。                                                                    
                             |
+| rag_config.input_strategy                       | 否          | string   |  | 
提取用户输入文本的策略。可选值:`last`(仅最后一条消息),`all`(所有用户消息拼接)。默认为 `last`。                     
                |
+
+### embeddings_provider 属性
+
+当前支持`openai`,`azure`,`openai-compatible`,所有子字段均位于 
`embeddings_provider.<provider>` 对象下(例如 `embeddings_provider.openai.api_key`)。

Review Comment:
   In the Chinese documentation, there's a typo in line 66. It says "azure" but 
should be "azure-openai" to match the actual provider names (openai, 
azure-openai, openai-compatible).
   ```suggestion
   当前支持`openai`,`azure-openai`,`openai-compatible`,所有子字段均位于 
`embeddings_provider.<provider>` 对象下(例如 `embeddings_provider.openai.api_key`)。
   ```



##########
t/plugin/ai-rag.t:
##########
@@ -383,10 +499,19 @@ passed
 
 
 
-=== TEST 12: send request with embedding input missing
+=== TEST 10: Verify Context Injection (With Rerank)
+--- log_level: debug
 --- request
 POST /echo
-{"ai_rag":{"vector_search":{"fields":"something"},"embeddings":{"input":"which 
service is good for devops"}}}
---- error_code: 200
+{
+    "messages": [
+        {
+            "role": "user",
+            "content": "What is Apache APISIX?"
+        }
+    ]
+}
+--- error_log
+Number of documents retrieved: 1
 --- response_body eval
-qr/\{"messages":\[\{"content":"passed","role":"user"\}\]\}|\{"messages":\[\{"role":"user","content":"passed"\}\]\}/
+qr/Apache APISIX is a dynamic, real-time, high-performance API Gateway.*What 
is Apache APISIX/

Review Comment:
   The test suite does not include any tests for the `input_strategy` 
configuration option, specifically the "all" strategy which concatenates all 
user messages. This is a new feature that should have test coverage to ensure 
it works correctly, especially for edge cases like multiple user messages or 
mixed message types.
   ```suggestion
   qr/Apache APISIX is a dynamic, real-time, high-performance API Gateway.*What 
is Apache APISIX/
   
   
   
   === TEST 11: Happy Path (With Rerank, input_strategy all)
   --- config
       location /t_input_all {
           content_by_lua_block {
               local t = require("lib.test_admin").test
               local code, body = t('/apisix/admin/routes/2',
                    ngx.HTTP_PUT,
                    [[{
                       "uri": "/echo",
                       "plugins": {
                           "ai-rag": {
                               "embeddings_provider": {
                                   "openai": {
                                       "endpoint": 
"http://127.0.0.1:3623/embeddings";,
                                       "api_key": "correct-key"
                                   }
                               },
                               "vector_search_provider": {
                                   "azure-ai-search": {
                                       "endpoint": 
"http://127.0.0.1:3623/indexes/rag-apisix/docs/search";,
                                       "api_key": "correct-key",
                                       "fields": "text_vector",
                                       "select": "chunk",
                                       "k": 10
                                   }
                               },
                               "rerank_provider": {
                                   "cohere": {
                                       "endpoint": 
"http://127.0.0.1:3623/rerank";,
                                       "api_key": "correct-key",
                                       "model": "Cohere-rerank-v4.0-fast",
                                       "top_n": 1
                                   }
                               },
                               "input_strategy": "all"
                           }
                       },
                       "upstream": {
                           "type": "roundrobin",
                           "nodes": {
                               "127.0.0.1:1980": 1
                           },
                           "scheme": "http",
                           "pass_host": "node"
                       }
                   }]]
               )
   
               if code >= 300 then
                   ngx.status = code
               end
               ngx.say(body)
           }
       }
   --- response_body
   passed
   
   
   
   === TEST 12: Verify Context Injection (With Rerank, input_strategy all)
   --- log_level: debug
   --- request
   POST /echo
   {
       "messages": [
           {
               "role": "user",
               "content": "What is Apache APISIX?"
           },
           {
               "role": "user",
               "content": "Explain it briefly."
           }
       ]
   }
   --- error_log
   Number of documents retrieved: 1
   --- response_body eval
   qr/Apache APISIX is a dynamic, real-time, high-performance API Gateway.*What 
is Apache APISIX\?.*Explain it briefly\./
   ```



##########
docs/en/latest/plugins/ai-rag.md:
##########
@@ -37,69 +37,40 @@ description: The ai-rag Plugin enhances LLM outputs with 
Retrieval-Augmented Gen
 
 The `ai-rag` Plugin provides Retrieval-Augmented Generation (RAG) capabilities 
with LLMs. It facilitates the efficient retrieval of relevant documents or 
information from external data sources, which are used to enhance the LLM 
responses, thereby improving the accuracy and contextual relevance of the 
generated outputs.
 
-The Plugin supports using [Azure 
OpenAI](https://azure.microsoft.com/en-us/products/ai-services/openai-service) 
and [Azure AI 
Search](https://azure.microsoft.com/en-us/products/ai-services/ai-search) 
services for generating embeddings and performing vector search.
-
-**_As of now only [Azure 
OpenAI](https://azure.microsoft.com/en-us/products/ai-services/openai-service) 
and [Azure AI 
Search](https://azure.microsoft.com/en-us/products/ai-services/ai-search) 
services are supported for generating embeddings and performing vector search 
respectively. PRs for introducing support for other service providers are 
welcomed._**
+The Plugin supports using 
[OpenAI](https://platform.openai.com/docs/api-reference/embeddings) or [Azure 
OpenAI](https://learn.microsoft.com/en-us/azure/search/vector-search-how-to-generate-embeddings?tabs=rest-api)
 services for generating embeddings, [Azure AI 
Search](https://azure.microsoft.com/en-us/products/ai-services/ai-search) 
services for performing vector search, and optionally [Cohere 
Rerank](https://docs.cohere.com/docs/rerank-overview) services for reranking 
the retrieval results.
 
 ## Attributes
 
-| Name                                      |   Required   |   Type   |   
Description                                                                     
                                                        |
-| ----------------------------------------------- | ------------ | -------- | 
-----------------------------------------------------------------------------------------------------------------------------------------
 |
-| embeddings_provider                             | True          | object   | 
Configurations of the embedding models provider.                                
                                                           |
-| embeddings_provider.azure_openai                | True          | object   | 
Configurations of [Azure 
OpenAI](https://azure.microsoft.com/en-us/products/ai-services/openai-service) 
as the embedding models provider. |
-| embeddings_provider.azure_openai.endpoint       | True          | string   | 
Azure OpenAI embedding model endpoint.                                          
                                        |
-| embeddings_provider.azure_openai.api_key        | True          | string   | 
Azure OpenAI API key.                                                           
                                                         |
-| vector_search_provider                          | True          | object   | 
Configuration for the vector search provider.                                   
                                                           |
-| vector_search_provider.azure_ai_search          | True          | object   | 
Configuration for Azure AI Search.                                              
                                                           |
-| vector_search_provider.azure_ai_search.endpoint | True          | string   | 
Azure AI Search endpoint.                                                       
                                                           |
-| vector_search_provider.azure_ai_search.api_key  | True          | string   | 
Azure AI Search API key.                                                        
                                                          |
-
-## Request Body Format
-
-The following fields must be present in the request body.
-
-|   Field              |   Type   |    Description                             
                                                                                
      |
-| -------------------- | -------- | 
-------------------------------------------------------------------------------------------------------------------------------
 |
-| ai_rag               | object   | Request body RAG specifications.           
                                                                   |
-| ai_rag.embeddings    | object   | Request parameters required to generate 
embeddings. Contents will depend on the API specification of the configured 
provider.   |
-| ai_rag.vector_search | object   | Request parameters required to perform 
vector search. Contents will depend on the API specification of the configured 
provider. |
-
-- Parameters of `ai_rag.embeddings`
-
-  - Azure OpenAI
-
-  |   Name          |   Required   |   Type   |   Description                  
                                                                                
            |
-  | --------------- | ------------ | -------- | 
--------------------------------------------------------------------------------------------------------------------------
 |
-  | input           | True          | string   | Input text used to compute 
embeddings, encoded as a string.                                                
                |
-  | user            | False           | string   | A unique identifier 
representing your end-user, which can help in monitoring and detecting abuse.   
                       |
-  | encoding_format | False           | string   | The format to return the 
embeddings in. Can be either `float` or `base64`. Defaults to `float`.          
                  |
-  | dimensions      | False           | integer  | The number of dimensions 
the resulting output embeddings should have. Only supported in text-embedding-3 
and later models. |
-
-For other parameters please refer to the [Azure OpenAI embeddings 
documentation](https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#embeddings).
-
-- Parameters of `ai_rag.vector_search`
-
-  - Azure AI Search
-
-  |   Field   |   Required   |   Type   |   Description                |
-  | --------- | ------------ | -------- | ---------------------------- |
-  | fields    | True          | String   | Fields for the vector search. |
-
-  For other parameters please refer the [Azure AI Search 
documentation](https://learn.microsoft.com/en-us/rest/api/searchservice/documents/search-post).
-
-Example request body:
-
-```json
-{
-  "ai_rag": {
-    "vector_search": { "fields": "contentVector" },
-    "embeddings": {
-      "input": "which service is good for devops",
-      "dimensions": 1024
-    }
-  }
-}
-```
+| Name                                      |   Required   |   Type   | Valid 
Values | Description                                                            
                                                                 |
+| ----------------------------------------------- | ------------ | -------- | 
--- | 
-----------------------------------------------------------------------------------------------------------------------------------------
 |
+| embeddings_provider                             | True         | object   | 
openai, azure-openai, openai-compatible | Configurations of the embedding 
models provider. Must and can only specify one. Currently supports `openai`, 
`azure-openai`, `openai-compatible`.                                            
                                             |
+| vector_search_provider                          | True         | object   | 
azure-ai-search | Configuration for the vector search provider.                 
                                                                             |
+| vector_search_provider.azure-ai-search          | True         | object   |  
| Configuration for Azure AI Search.                                            
                                                             |
+| vector_search_provider.azure-ai-search.endpoint | True         | string   |  
| Azure AI Search endpoint.                                                     
                                                             |
+| vector_search_provider.azure-ai-search.api_key  | True         | string   |  
| Azure AI Search API key.                                                      
                                                            |
+| vector_search_provider.azure-ai-search.fields   | True         | string   |  
| Target fields for vector search.                                              
                                             |
+| vector_search_provider.azure-ai-search.select   | True         | string   |  
| Fields to select in the response.                                             
                               |
+| vector_search_provider.azure-ai-search.exhaustive| False       | boolean  |  
| Whether to perform an exhaustive search. Defaults to `true`.                  
                                                                     |
+| vector_search_provider.azure-ai-search.k        | False        | integer  | 
>0 | Number of nearest neighbors to return. Defaults to 5.                      
                                                                        |
+| rerank_provider                                 | False        | object   | 
cohere | Configuration for the rerank provider.                                 
                                                               |
+| rerank_provider.cohere                          | False        | object   |  
| Configuration for Cohere Rerank.                                              
                                                              |
+| rerank_provider.cohere.endpoint                 | False        | string   |  
| Cohere Rerank API endpoint. Defaults to `https://api.cohere.ai/v1/rerank`.    
                                                           |
+| rerank_provider.cohere.api_key                  | True         | string   |  
| Cohere API key.                                                               
                                                     |
+| rerank_provider.cohere.model                    | False        | string   |  
| Rerank model name. Defaults to `Cohere-rerank-v4.0-fast`.                     
                                                               |
+| rerank_provider.cohere.top_n                    | False        | integer  |  
| Number of top results to keep after reranking. Defaults to 3.                 
                                                                               |
+| rag_config                                      | False        | object   |  
| General configuration for the RAG process.                                    
                                                             |
+| rag_config.input_strategy                       | False        | string   |  
| Strategy for extracting input text from messages. Values: `last` (last user 
message), `all` (concatenate all user messages). Defaults to `last`.            
                         |
+
+### embeddings_provider attributes
+
+Currently supports `openai`, `azure-openai`, `openai-compatible`. All 
sub-fields are located under the `embeddings_provider.<provider>` object (e.g., 
`embeddings_provider.openai.api_key`).
+
+| Name        | Required | Type    | Description                               
                                  |
+|-------------|--------|---------|----------------------------------------------------------------------|
+| `endpoint`  | True     | string  | API service endpoint.<br>? OpenAI: 
`https://api.openai.com/v1`<br>? Azure: 
`https://<your-resource>.openai.azure.com/` |

Review Comment:
   The bullet point formatting in line 70 uses incorrect symbols. The 
documentation uses `?` instead of `•` for bullet points. This should use `•` to 
match standard markdown formatting and maintain consistency.
   ```suggestion
   | `endpoint`  | True     | string  | API service endpoint.<br>• OpenAI: 
`https://api.openai.com/v1`<br>• Azure: 
`https://<your-resource>.openai.azure.com/` |
   ```



##########
apisix/plugins/ai-rag.lua:
##########
@@ -82,67 +130,141 @@ function _M.check_schema(conf)
 end
 
 
+local function get_input_text(messages, strategy)
+    if not messages or #messages == 0 then
+        return nil
+    end
+
+    if strategy == input_strategy_enum.last then
+        for i = #messages, 1, -1 do
+            if messages[i].role == "user" then
+                return messages[i].content
+            end
+        end
+    elseif strategy == input_strategy_enum.all then
+        local contents = {}
+        for _, msg in ipairs(messages) do
+            if msg.role == "user" then
+                core.table.insert(contents, msg.content)
+            end
+        end
+        if #contents > 0 then
+            return table.concat(contents, "\n")
+        end
+    end
+    return nil
+end
+
+
+local function load_driver(category, name, cache)
+    local driver = cache[name]
+    if driver then
+        return driver
+    end
+
+    local pkg_path = "apisix.plugins.ai-rag." .. category .. "." .. name
+    local ok, mod = pcall(require, pkg_path)
+    if not ok then
+        return nil, "failed to load module " .. pkg_path .. ", err: " .. 
tostring(mod)
+    end
+
+    cache[name] = mod
+    return mod
+end
+
+
+local function inject_context_into_messages(messages, docs)
+    if not docs or #docs == 0 then
+        return
+    end
+
+    local context_str = core.table.concat(docs, "\n\n")
+    local augment = {
+        role = "user",
+        content = "Context:\n" .. context_str
+    }
+    if #messages > 0 then
+        -- Insert context before the last message (which is typically the 
user's latest query)
+        -- to ensure the LLM considers the context relevant to the immediate 
question.
+        core.table.insert(messages, #messages, augment)
+    else
+        core.table.insert_tail(messages, augment)
+    end
+end
+
+
 function _M.access(conf, ctx)
-    local httpc = http.new()
     local body_tab, err = core.request.get_json_request_body_table()
     if not body_tab then
         return HTTP_BAD_REQUEST, err
     end
-    if not body_tab["ai_rag"] then
-        core.log.error("request body must have \"ai-rag\" field")
-        return HTTP_BAD_REQUEST
-    end
-
-    local embeddings_provider = next(conf.embeddings_provider)
-    local embeddings_provider_conf = 
conf.embeddings_provider[embeddings_provider]
-    local embeddings_driver = require("apisix.plugins.ai-rag.embeddings." .. 
embeddings_provider)
 
-    local vector_search_provider = next(conf.vector_search_provider)
-    local vector_search_provider_conf = 
conf.vector_search_provider[vector_search_provider]
-    local vector_search_driver = 
require("apisix.plugins.ai-rag.vector-search." ..
-                                        vector_search_provider)
+    -- 1. Extract Input
+    local rag_conf = conf.rag_config or {}
+    local input_strategy = rag_conf.input_strategy or input_strategy_enum.last
+    local input_text = get_input_text(body_tab.messages, input_strategy)
 
-    local vs_req_schema = vector_search_driver.request_schema
-    local emb_req_schema = embeddings_driver.request_schema
+    if not input_text then
+        core.log.warn("no user input found for embedding")
+        return
+    end
 
-    request_schema.properties.ai_rag.properties.vector_search = vs_req_schema
-    request_schema.properties.ai_rag.properties.embeddings = emb_req_schema
+    -- 2. Load Drivers
+    local embeddings_provider_name = next(conf.embeddings_provider)
+    local embeddings_conf = conf.embeddings_provider[embeddings_provider_name]
+    local embeddings_driver, err = load_driver("embeddings", 
embeddings_provider_name,
+            embeddings_drivers)
+    if not embeddings_driver then
+        core.log.error("failed to load embeddings driver: ", err)
+        return HTTP_INTERNAL_SERVER_ERROR, "failed to load embeddings driver"
+    end
 
-    local ok, err = core.schema.check(request_schema, body_tab)
-    if not ok then
-        core.log.error("request body fails schema check: ", err)
-        return HTTP_BAD_REQUEST
+    local vector_search_provider_name = next(conf.vector_search_provider)
+    local vector_search_conf = 
conf.vector_search_provider[vector_search_provider_name]
+    local vector_search_driver, err = load_driver("vector-search", 
vector_search_provider_name,
+            vector_search_drivers)
+    if not vector_search_driver then
+        core.log.error("failed to load vector search driver: ", err)
+        return HTTP_INTERNAL_SERVER_ERROR, "failed to load vector search 
driver"
     end
 
-    local embeddings, status, err = 
embeddings_driver.get_embeddings(embeddings_provider_conf,
-                                                        
body_tab["ai_rag"].embeddings, httpc)
+    -- 3. Get Embeddings
+    local embeddings, status, err = 
embeddings_driver.get_embeddings(embeddings_conf, input_text)
     if not embeddings then
         core.log.error("could not get embeddings: ", err)
         return status, err
     end
 
-    local search_body = body_tab["ai_rag"].vector_search
-    search_body.embeddings = embeddings
-    local res, status, err = 
vector_search_driver.search(vector_search_provider_conf,
-                                                        search_body, httpc)
-    if not res then
+    -- 4. Vector Search
+    local docs, status, err = vector_search_driver.search(vector_search_conf, 
embeddings)
+    if not docs then
         core.log.error("could not get vector_search result: ", err)
         return status, err
     end
 
-    -- remove ai_rag from request body because their purpose is served
-    -- also, these values will cause failure when proxying requests to LLM.
-    body_tab["ai_rag"] = nil
+    -- 5. Rerank
+    if conf.rerank_provider then
+        local rerank_provider_name = next(conf.rerank_provider)
+        local rerank_conf = conf.rerank_provider[rerank_provider_name]
+        local rerank_driver, err = load_driver("rerank", rerank_provider_name, 
rerank_drivers)
+
+        if not rerank_driver then
+            core.log.error("failed to load rerank driver: ", err)
+            return HTTP_INTERNAL_SERVER_ERROR, "failed to load rerank driver"
+        end
 
-    if not body_tab.messages then
-        body_tab.messages = {}
+        local reranked_docs, err = rerank_driver.rerank(rerank_conf, docs, 
input_text)
+        if reranked_docs then
+            docs = reranked_docs
+        else
+            core.log.error("rerank failed: ", err)
+            return HTTP_INTERNAL_SERVER_ERROR, "rerank failed"

Review Comment:
   The error handling for rerank is misleading. The code expects 
`rerank_driver.rerank()` to return two values (docs, err) on line 256, but the 
cohere.lua implementation only returns a single value (docs). This means `err` 
will always be nil, and the error handling on lines 260-261 is dead code that 
can never execute. The rerank gracefully falls back to original docs on failure 
(as intended), but this should be made explicit. Either: 1) Have rerank return 
(docs, nil) on success and (docs, err_msg) on fallback, or 2) Remove the dead 
error handling code and add a comment explaining the fallback behavior.
   ```suggestion
           -- rerank drivers are expected to handle failures internally and
           -- fall back to returning the original docs on error.
           local reranked_docs = rerank_driver.rerank(rerank_conf, docs, 
input_text)
           if reranked_docs then
               docs = reranked_docs
   ```



##########
apisix/plugins/ai-rag/rerank/cohere.lua:
##########
@@ -0,0 +1,117 @@
+--
+-- Licensed to the Apache Software Foundation (ASF) under one or more
+-- contributor license agreements.  See the NOTICE file distributed with
+-- this work for additional information regarding copyright ownership.
+-- The ASF licenses this file to You under the Apache License, Version 2.0
+-- (the "License"); you may not use this file except in compliance with
+-- the License.  You may obtain a copy of the License at
+--
+--     http://www.apache.org/licenses/LICENSE-2.0
+--
+-- Unless required by applicable law or agreed to in writing, software
+-- distributed under the License is distributed on an "AS IS" BASIS,
+-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+-- See the License for the specific language governing permissions and
+-- limitations under the License.
+--
+local core = require("apisix.core")
+local http = require("resty.http")
+local type = type
+local ipairs = ipairs
+
+local _M = {}
+
+_M.schema = {
+    type = "object",
+    properties = {
+        endpoint = {
+            type = "string",
+            default = "https://api.cohere.ai/v2/rerank";,
+            description = "The endpoint for the Cohere Rerank API."
+        },
+        api_key = {
+            type = "string",
+            description = "The API key for authentication."
+        },
+        model = {
+            type = "string",
+            description = "The model to use for reranking."
+        },
+        top_n = {
+            type = "integer",
+            minimum = 1,
+            default = 3,
+            description = "The number of top results to return."
+        }
+    },
+    required = { "api_key", "model" }
+}
+
+function _M.rerank(conf, docs, query)
+    if not docs or #docs == 0 then
+        return docs
+    end
+
+    local top_n = conf.top_n or 3
+    if #docs <= top_n then
+        return docs
+    end
+
+    -- Construct documents for Cohere Rerank API
+    local documents = {}
+    for _, doc in ipairs(docs) do
+        local doc_content = doc
+        if type(doc) == "table" then
+            doc_content = doc.content or core.json.encode(doc)
+        end
+        core.table.insert(documents, doc_content)
+    end
+
+    local body = {
+        model = conf.model,
+        query = query,
+        top_n = top_n,
+        documents = documents
+    }
+
+    local body_str, err = core.json.encode(body)
+    if not body_str then
+        core.log.error("failed to encode rerank body: ", err)
+        return docs -- fallback
+    end
+
+    local httpc = http.new()
+    local res, err = httpc:request_uri(conf.endpoint, {
+        method = "POST",
+        headers = {
+            ["Content-Type"] = "application/json",
+            ["Authorization"] = "Bearer " .. conf.api_key
+        },
+        body = body_str
+    })
+
+    if not res or res.status ~= 200 then
+        core.log.error("rerank failed: ", err or (res and res.status))
+        return docs -- fallback
+    end
+
+    local res_body = core.json.decode(res.body)
+    if not res_body or not res_body.results then
+        return docs
+    end
+
+    local new_docs = {}
+    for _, result in ipairs(res_body.results) do
+        -- The vector search API returns 0-based indices; Lua tables are 
1-based.

Review Comment:
   The comment on line 105 states "The vector search API returns 0-based 
indices" but this is actually referring to the Cohere Rerank API, not the 
vector search API. The comment should be clarified to say "The Cohere Rerank 
API returns 0-based indices" for accuracy.
   ```suggestion
           -- The Cohere Rerank API returns 0-based indices; Lua tables are 
1-based.
   ```



##########
apisix/plugins/ai-rag/rerank/cohere.lua:
##########
@@ -0,0 +1,117 @@
+--
+-- Licensed to the Apache Software Foundation (ASF) under one or more
+-- contributor license agreements.  See the NOTICE file distributed with
+-- this work for additional information regarding copyright ownership.
+-- The ASF licenses this file to You under the Apache License, Version 2.0
+-- (the "License"); you may not use this file except in compliance with
+-- the License.  You may obtain a copy of the License at
+--
+--     http://www.apache.org/licenses/LICENSE-2.0
+--
+-- Unless required by applicable law or agreed to in writing, software
+-- distributed under the License is distributed on an "AS IS" BASIS,
+-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+-- See the License for the specific language governing permissions and
+-- limitations under the License.
+--
+local core = require("apisix.core")
+local http = require("resty.http")
+local type = type
+local ipairs = ipairs
+
+local _M = {}
+
+_M.schema = {
+    type = "object",
+    properties = {
+        endpoint = {
+            type = "string",
+            default = "https://api.cohere.ai/v2/rerank";,

Review Comment:
   The default endpoint for Cohere Rerank is set to 
`https://api.cohere.ai/v2/rerank` (line 29), but in the documentation examples 
(docs/en/latest/plugins/ai-rag.md line 99 and docs/zh/latest/plugins/ai-rag.md 
line 99), the default is stated as `https://api.cohere.ai/v1/rerank`. There's 
an API version mismatch between the code (v2) and documentation (v1). This 
should be consistent.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to