This is an automated email from the ASF dual-hosted git repository.
kxiao pushed a commit to branch branch-2.0
in repository https://gitbox.apache.org/repos/asf/doris.git
The following commit(s) were added to refs/heads/branch-2.0 by this push:
new f4bb33390b2 [opt](ES catalog)Add more description for limitations of docvalue_scan (#29420) (#29459)
f4bb33390b2 is described below
commit f4bb33390b2e87acbcebf6efa4e3cbd6cfbb81cd
Author: qiye <[email protected]>
AuthorDate: Thu Jan 4 20:50:51 2024 +0800
[opt](ES catalog)Add more description for limitations of docvalue_scan (#29420) (#29459)
---
docs/en/docs/lakehouse/multi-catalog/es.md | 2 ++
docs/zh-CN/docs/lakehouse/multi-catalog/es.md | 1 +
2 files changed, 3 insertions(+)
diff --git a/docs/en/docs/lakehouse/multi-catalog/es.md b/docs/en/docs/lakehouse/multi-catalog/es.md
index c6293f931f9..03f926a3f51 100644
--- a/docs/en/docs/lakehouse/multi-catalog/es.md
+++ b/docs/en/docs/lakehouse/multi-catalog/es.md
@@ -196,6 +196,8 @@ By default, Doris On ES obtains all target columns from `_source`, which is in r
1. Columnar storage is not available for `text` fields in ES. Thus, if you need to obtain fields containing `text` values, you will need to obtain them from `_source`.
2. When obtaining large numbers of fields (`>= 25`), the performances of `docvalue` and `_source` are basically equivalent.
+3. For `keyword` fields, values longer than the limit set by the [`ignore_above`](https://www.elastic.co/guide/en/elasticsearch/reference/current/keyword.html#keyword-params) parameter are ignored, so the result may be empty. In this case, you need to turn off `enable_docvalue_scan` and get the result from `_source`.
+
### Sniff Keyword Fields
diff --git a/docs/zh-CN/docs/lakehouse/multi-catalog/es.md b/docs/zh-CN/docs/lakehouse/multi-catalog/es.md
index 64a12726f5c..aabfe3dffd5 100644
--- a/docs/zh-CN/docs/lakehouse/multi-catalog/es.md
+++ b/docs/zh-CN/docs/lakehouse/multi-catalog/es.md
@@ -191,6 +191,7 @@ ES Catalog 支持过滤条件的下推: 过滤条件下推给ES,这样只有
1. `text`类型的字段在ES中是没有列式存储,因此如果要获取的字段值有`text`类型字段会自动降级为从`_source`中获取
2. 在获取的字段数量过多的情况下(`>= 25`),从`docvalue`中获取字段值的性能会和从`_source`中获取字段值基本一样
+3. `keyword`类型字段由于[`ignore_above`](https://www.elastic.co/guide/en/elasticsearch/reference/current/keyword.html#keyword-params)参数的限制,对于超过该限制的长文本字段会忽略,所以可能会出现结果为空的情况。此时需要关闭`enable_docvalue_scan`,从`_source`中获取结果。
### 探测keyword类型字段
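
The limitation added in item 3 can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Doris or Elasticsearch code: `index_keyword`, `read_field`, and the limit of 256 are hypothetical names and values chosen for illustration. It models the behavior the new text describes: a `keyword` value longer than `ignore_above` has no doc value, while `_source` still holds the original string.

```python
from typing import Optional

# Assumed limit for illustration only; the real value comes from the ES field mapping.
IGNORE_ABOVE = 256


def index_keyword(value: str, ignore_above: int = IGNORE_ABOVE) -> dict:
    """Toy model of one ES document with a single keyword field (not an ES API)."""
    doc = {"_source": value}  # _source always keeps the original value
    # Values longer than ignore_above are ignored, so no doc value is written.
    doc["docvalue"] = value if len(value) <= ignore_above else None
    return doc


def read_field(doc: dict, enable_docvalue_scan: bool) -> Optional[str]:
    """Read the field from doc values or from _source, depending on the setting."""
    return doc["docvalue"] if enable_docvalue_scan else doc["_source"]


short_doc = index_keyword("doris")
long_doc = index_keyword("x" * 300)

print(read_field(short_doc, enable_docvalue_scan=True))       # doris
print(read_field(long_doc, enable_docvalue_scan=True))        # None
print(len(read_field(long_doc, enable_docvalue_scan=False)))  # 300
```

In this model the long value comes back empty only when the read goes through doc values, which is why the added text recommends turning off `enable_docvalue_scan` for such fields.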