This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git


The following commit(s) were added to refs/heads/master by this push:
     new e6eb4f8748b [doc](ES Catalog) Update ES Catalog docs (#943)
e6eb4f8748b is described below

commit e6eb4f8748ba941c9175391aa4b2678a1fc7cf1e
Author: qiye <[email protected]>
AuthorDate: Mon Aug 5 09:44:15 2024 +0800

    [doc](ES Catalog) Update ES Catalog docs (#943)
    
    1. Add json type mapping https://github.com/apache/doris/pull/37101
    2. Add session instruction `enable_es_parallel_scroll` and `batch_size`
    https://github.com/apache/doris/pull/37180
---
 docs/lakehouse/database/es.md                            | 15 +++++++++++++--
 .../current/lakehouse/database/es.md                     | 16 ++++++++++++++--
 .../version-2.0/lakehouse/database/es.md                 | 15 +++++++++++++--
 .../version-2.1/lakehouse/database/es.md                 | 15 +++++++++++++--
 .../version-3.0/lakehouse/database/es.md                 | 15 +++++++++++++--
 versioned_docs/version-2.0/lakehouse/database/es.md      | 15 +++++++++++++--
 versioned_docs/version-2.1/lakehouse/database/es.md      | 15 +++++++++++++--
 versioned_docs/version-3.0/lakehouse/database/es.md      | 15 +++++++++++++--
 8 files changed, 105 insertions(+), 16 deletions(-)

diff --git a/docs/lakehouse/database/es.md b/docs/lakehouse/database/es.md
index 2ca16db5b42..251568c86e5 100644
--- a/docs/lakehouse/database/es.md
+++ b/docs/lakehouse/database/es.md
@@ -88,8 +88,8 @@ After switching to the ES Catalog, you will be in the `dafault_db`  so you don't
 | ip               | string      |                                                                         |
 | constant_keyword | string      |                                                                         |
 | wildcard         | string      |                                                                         |
-| nested           | string      |                                                                         |
-| object           | string      |                                                                         |
+| nested           | json        |                                                                         |
+| object           | json        |                                                                         |
 | other            | unsupported |                                                                         |
 
 ### Array Type
@@ -448,6 +448,17 @@ Note:
 1. The `_id`  field only supports `=` and `in` filtering.
 2. The`_id`  field must be of  `varchar`  type.
 
+### Getting globally ordered query results
+ES query results sorted by scores are useful in scenarios such as relevance sorting, prioritizing important content, etc. Doris querying ES pulls data according to the distribution of shards in the ES index in order to take full advantage of the MPP architecture.  
+In order to get globally ordered sorting results, you need to do a single point query on ES. This can be controlled by the session variable `enable_es_parallel_scroll` (default true).  
+When `enable_es_parallel_scroll=false` is set, Doris will send a `scroll` query without `shard_preference` and `sort` information to the ES cluster to get globally ordered results.  
+**Note:** Use with caution when the query result set is large.
+
+
+### Modify the batch size for scroll requests.
+
+The `batch` of a `scroll` request is 4064 by default, and can be changed with the session variable `batch_size`.
+
 ## FAQ
 
 1. Are X-Pack authenticated ES clusters supported?
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/database/es.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/database/es.md
index 966a7cf0291..a4c2f0464ad 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/database/es.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/database/es.md
@@ -91,8 +91,8 @@ CREATE CATALOG es PROPERTIES (
 | ip               | string      |                                                            |
 | constant_keyword | string      |                                                            |
 | wildcard         | string      |                                                            |
-| nested           | string      |                                                            |
-| object           | string      |                                                            |
+| nested           | json        |                                                            |
+| object           | json        |                                                            |
 | other            | unsupported |                                                            |
 
 
@@ -449,6 +449,18 @@ PROPERTIES (
 
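 2. The `_id` field must be of `varchar` type
 
+
+### Getting globally ordered query results
+
+Sorting ES query results by score is very useful in scenarios such as relevance ranking and showing important content first. To take full advantage of the MPP architecture, Doris pulls data from ES according to the distribution of the shards of the ES index.  
+To get globally ordered results, you need to run a single-point query against ES. This is controlled by the session variable `enable_es_parallel_scroll` (default true).  
+When `enable_es_parallel_scroll=false` is set, Doris sends the ES cluster a `scroll` query without `shard_preference` and `sort` information, and thereby obtains globally ordered results.  
+**Note:** Use with caution when the query result set is large.
+
+### Modifying the batch size of scroll requests
+
+The `batch` of a `scroll` request is 4064 by default. It can be changed with the session variable `batch_size`.
+
 ## FAQ
 
 1. Are X-Pack authenticated ES clusters supported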
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/lakehouse/database/es.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/lakehouse/database/es.md
index 966a7cf0291..7817e9ffbce 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/lakehouse/database/es.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/lakehouse/database/es.md
@@ -91,8 +91,8 @@ CREATE CATALOG es PROPERTIES (
 | ip               | string      |                                                            |
 | constant_keyword | string      |                                                            |
 | wildcard         | string      |                                                            |
-| nested           | string      |                                                            |
-| object           | string      |                                                            |
+| nested           | json        |                                                            |
+| object           | json        |                                                            |
 | other            | unsupported |                                                            |
 
 
@@ -449,6 +449,17 @@ PROPERTIES (
 
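 2. The `_id` field must be of `varchar` type
 
+### Getting globally ordered query results
+
+Sorting ES query results by score is very useful in scenarios such as relevance ranking and showing important content first. To take full advantage of the MPP architecture, Doris pulls data from ES according to the distribution of the shards of the ES index.  
+To get globally ordered results, you need to run a single-point query against ES. This is controlled by the session variable `enable_es_parallel_scroll` (default true).  
+When `enable_es_parallel_scroll=false` is set, Doris sends the ES cluster a `scroll` query without `shard_preference` and `sort` information, and thereby obtains globally ordered results.  
+**Note:** Use with caution when the query result set is large.
+
+### Modifying the batch size of scroll requests
+
+The `batch` of a `scroll` request is 4064 by default. It can be changed with the session variable `batch_size`.
+
 ## FAQ
 
 1. Are X-Pack authenticated ES clusters supported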
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/database/es.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/database/es.md
index 966a7cf0291..7817e9ffbce 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/database/es.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/database/es.md
@@ -91,8 +91,8 @@ CREATE CATALOG es PROPERTIES (
 | ip               | string      |                                                            |
 | constant_keyword | string      |                                                            |
 | wildcard         | string      |                                                            |
-| nested           | string      |                                                            |
-| object           | string      |                                                            |
+| nested           | json        |                                                            |
+| object           | json        |                                                            |
 | other            | unsupported |                                                            |
 
 
@@ -449,6 +449,17 @@ PROPERTIES (
 
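 2. The `_id` field must be of `varchar` type
 
+### Getting globally ordered query results
+
+Sorting ES query results by score is very useful in scenarios such as relevance ranking and showing important content first. To take full advantage of the MPP architecture, Doris pulls data from ES according to the distribution of the shards of the ES index.  
+To get globally ordered results, you need to run a single-point query against ES. This is controlled by the session variable `enable_es_parallel_scroll` (default true).  
+When `enable_es_parallel_scroll=false` is set, Doris sends the ES cluster a `scroll` query without `shard_preference` and `sort` information, and thereby obtains globally ordered results.  
+**Note:** Use with caution when the query result set is large.
+
+### Modifying the batch size of scroll requests
+
+The `batch` of a `scroll` request is 4064 by default. It can be changed with the session variable `batch_size`.
+
 ## FAQ
 
 1. Are X-Pack authenticated ES clusters supported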
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/lakehouse/database/es.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/lakehouse/database/es.md
index 966a7cf0291..7817e9ffbce 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/lakehouse/database/es.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/lakehouse/database/es.md
@@ -91,8 +91,8 @@ CREATE CATALOG es PROPERTIES (
 | ip               | string      |                                                            |
 | constant_keyword | string      |                                                            |
 | wildcard         | string      |                                                            |
-| nested           | string      |                                                            |
-| object           | string      |                                                            |
+| nested           | json        |                                                            |
+| object           | json        |                                                            |
 | other            | unsupported |                                                            |
 
 
@@ -449,6 +449,17 @@ PROPERTIES (
 
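 2. The `_id` field must be of `varchar` type
 
+### Getting globally ordered query results
+
+Sorting ES query results by score is very useful in scenarios such as relevance ranking and showing important content first. To take full advantage of the MPP architecture, Doris pulls data from ES according to the distribution of the shards of the ES index.  
+To get globally ordered results, you need to run a single-point query against ES. This is controlled by the session variable `enable_es_parallel_scroll` (default true).  
+When `enable_es_parallel_scroll=false` is set, Doris sends the ES cluster a `scroll` query without `shard_preference` and `sort` information, and thereby obtains globally ordered results.  
+**Note:** Use with caution when the query result set is large.
+
+### Modifying the batch size of scroll requests
+
+The `batch` of a `scroll` request is 4064 by default. It can be changed with the session variable `batch_size`.
+
 ## FAQ
 
 1. Are X-Pack authenticated ES clusters supported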
diff --git a/versioned_docs/version-2.0/lakehouse/database/es.md b/versioned_docs/version-2.0/lakehouse/database/es.md
index 0f2d249a62a..b6f58ee0cae 100644
--- a/versioned_docs/version-2.0/lakehouse/database/es.md
+++ b/versioned_docs/version-2.0/lakehouse/database/es.md
@@ -88,8 +88,8 @@ After switching to the ES Catalog, you will be in the `dafault_db`  so you don't
 | ip               | string      |                                                                         |
 | constant_keyword | string      |                                                                         |
 | wildcard         | string      |                                                                         |
-| nested           | string      |                                                                         |
-| object           | string      |                                                                         |
+| nested           | json        |                                                                         |
+| object           | json        |                                                                         |
 | other            | unsupported |                                                                         |
 
 <version since="dev">
@@ -452,6 +452,17 @@ Note:
 1. The `_id`  field only supports `=` and `in` filtering.
 2. The`_id`  field must be of  `varchar`  type.
 
+### Getting globally ordered query results
+ES query results sorted by scores are useful in scenarios such as relevance sorting, prioritizing important content, etc. Doris querying ES pulls data according to the distribution of shards in the ES index in order to take full advantage of the MPP architecture.  
+In order to get globally ordered sorting results, you need to do a single point query on ES. This can be controlled by the session variable `enable_es_parallel_scroll` (default true).  
+When `enable_es_parallel_scroll=false` is set, Doris will send a `scroll` query without `shard_preference` and `sort` information to the ES cluster to get globally ordered results.  
+**Note:** Use with caution when the query result set is large.
+
+
+### Modify the batch size for scroll requests.
+
+The `batch` of a `scroll` request is 4064 by default, and can be changed with the session variable `batch_size`.
+
 ## FAQ
 
 1. Are X-Pack authenticated ES clusters supported?
diff --git a/versioned_docs/version-2.1/lakehouse/database/es.md b/versioned_docs/version-2.1/lakehouse/database/es.md
index 0f2d249a62a..b6f58ee0cae 100644
--- a/versioned_docs/version-2.1/lakehouse/database/es.md
+++ b/versioned_docs/version-2.1/lakehouse/database/es.md
@@ -88,8 +88,8 @@ After switching to the ES Catalog, you will be in the `dafault_db`  so you don't
 | ip               | string      |                                                                         |
 | constant_keyword | string      |                                                                         |
 | wildcard         | string      |                                                                         |
-| nested           | string      |                                                                         |
-| object           | string      |                                                                         |
+| nested           | json        |                                                                         |
+| object           | json        |                                                                         |
 | other            | unsupported |                                                                         |
 
 <version since="dev">
@@ -452,6 +452,17 @@ Note:
 1. The `_id`  field only supports `=` and `in` filtering.
 2. The`_id`  field must be of  `varchar`  type.
 
+### Getting globally ordered query results
+ES query results sorted by scores are useful in scenarios such as relevance sorting, prioritizing important content, etc. Doris querying ES pulls data according to the distribution of shards in the ES index in order to take full advantage of the MPP architecture.  
+In order to get globally ordered sorting results, you need to do a single point query on ES. This can be controlled by the session variable `enable_es_parallel_scroll` (default true).  
+When `enable_es_parallel_scroll=false` is set, Doris will send a `scroll` query without `shard_preference` and `sort` information to the ES cluster to get globally ordered results.  
+**Note:** Use with caution when the query result set is large.
+
+
+### Modify the batch size for scroll requests.
+
+The `batch` of a `scroll` request is 4064 by default, and can be changed with the session variable `batch_size`.
+
 ## FAQ
 
 1. Are X-Pack authenticated ES clusters supported?
diff --git a/versioned_docs/version-3.0/lakehouse/database/es.md b/versioned_docs/version-3.0/lakehouse/database/es.md
index 2ca16db5b42..251568c86e5 100644
--- a/versioned_docs/version-3.0/lakehouse/database/es.md
+++ b/versioned_docs/version-3.0/lakehouse/database/es.md
@@ -88,8 +88,8 @@ After switching to the ES Catalog, you will be in the `dafault_db`  so you don't
 | ip               | string      |                                                                         |
 | constant_keyword | string      |                                                                         |
 | wildcard         | string      |                                                                         |
-| nested           | string      |                                                                         |
-| object           | string      |                                                                         |
+| nested           | json        |                                                                         |
+| object           | json        |                                                                         |
 | other            | unsupported |                                                                         |
 
 ### Array Type
@@ -448,6 +448,17 @@ Note:
 1. The `_id`  field only supports `=` and `in` filtering.
 2. The`_id`  field must be of  `varchar`  type.
 
+### Getting globally ordered query results
+ES query results sorted by scores are useful in scenarios such as relevance sorting, prioritizing important content, etc. Doris querying ES pulls data according to the distribution of shards in the ES index in order to take full advantage of the MPP architecture.  
+In order to get globally ordered sorting results, you need to do a single point query on ES. This can be controlled by the session variable `enable_es_parallel_scroll` (default true).  
+When `enable_es_parallel_scroll=false` is set, Doris will send a `scroll` query without `shard_preference` and `sort` information to the ES cluster to get globally ordered results.  
+**Note:** Use with caution when the query result set is large.
+
+
+### Modify the batch size for scroll requests.
+
+The `batch` of a `scroll` request is 4064 by default, and can be changed with the session variable `batch_size`.
+
 ## FAQ
 
 1. Are X-Pack authenticated ES clusters supported?
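
Reviewer's illustration (not part of the commit): the two doc sections added above describe why shard-parallel pulls lose global score order and how `batch_size` bounds each `scroll` request. The sketch below is a toy model of those two ideas only. It does not call Doris or Elasticsearch, and the shard contents and the helper names `parallel_pull`, `single_point_pull` and `scroll_round_trips` are hypothetical, chosen for illustration.

```python
import math

# Hypothetical data: three shards, each holding (doc_id, score) pairs.
SHARDS = [
    [("a", 0.2), ("b", 0.9)],
    [("c", 0.5), ("d", 0.7)],
    [("e", 0.1), ("f", 0.8)],
]


def parallel_pull(shards):
    """Toy model of a shard-parallel pull: each shard is read independently
    and the rows are concatenated, so no global score order is established."""
    rows = []
    for shard in shards:
        rows.extend(shard)
    return rows


def single_point_pull(shards):
    """Toy model of a single-point query: one request sees every document
    and returns them by descending score."""
    all_rows = [row for shard in shards for row in shard]
    return sorted(all_rows, key=lambda row: row[1], reverse=True)


def is_sorted_by_score_desc(rows):
    """True if every row's score is >= the score of the row after it."""
    return all(rows[i][1] >= rows[i + 1][1] for i in range(len(rows) - 1))


def scroll_round_trips(total_rows, batch_size):
    """Number of batches needed to drain total_rows at batch_size rows each."""
    return math.ceil(total_rows / batch_size)


if __name__ == "__main__":
    print(is_sorted_by_score_desc(parallel_pull(SHARDS)))
    print(is_sorted_by_score_desc(single_point_pull(SHARDS)))
    print(scroll_round_trips(10_000, 4064))
```

In this model the single-point path trades parallelism for ordering, which matches the commit's caution about large result sets, and a smaller batch size means more round trips for the same number of rows.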

