This is an automated email from the ASF dual-hosted git repository.

kxiao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
     new 1f769291b5 [doc](invert index) add invert index char_filter doc (#24205)
1f769291b5 is described below

commit 1f769291b54a5150341f2e0978b25b37871ff760
Author: zzzxl <[email protected]>
AuthorDate: Wed Sep 13 10:02:45 2023 +0800

    [doc](invert index) add invert index char_filter doc (#24205)
---
 docs/en/docs/data-table/index/inverted-index.md    | 7 +++++++
 docs/zh-CN/docs/data-table/index/inverted-index.md | 7 +++++++
 2 files changed, 14 insertions(+)

diff --git a/docs/en/docs/data-table/index/inverted-index.md b/docs/en/docs/data-table/index/inverted-index.md
index 1e17ca011b..f86d47c8bb 100644
--- a/docs/en/docs/data-table/index/inverted-index.md
+++ b/docs/en/docs/data-table/index/inverted-index.md
@@ -84,6 +84,11 @@ The features for inverted index is as follows:
      - "true" indicates that support is needed, but needs more storage for index.
      - "false" indicates that support is not needed, and less storage for index. MATCH_ALL can be used for matching multi words without order.
       - default mode is "false".
+    - char_filter: pre-processes the string before word segmentation
+      - char_filter_type: specifies which char_filter to use (currently only char_replace is supported)
+        - char_replace: replaces each char in the pattern with a char in the replacement
+          - char_filter_pattern: the array of characters to be replaced
+          - char_filter_replacement: the array of replacement characters; optional, defaults to a single space character
   - COMMENT is optional
 
 ```sql
@@ -94,6 +99,8 @@ CREATE TABLE table_name
  INDEX idx_name2(column_name2) USING INVERTED [PROPERTIES("parser" = "english|chinese|unicode")] [COMMENT 'your comment']
  INDEX idx_name3(column_name3) USING INVERTED [PROPERTIES("parser" = "chinese", "parser_mode" = "fine_grained|coarse_grained")] [COMMENT 'your comment']
  INDEX idx_name4(column_name4) USING INVERTED [PROPERTIES("parser" = "english|chinese|unicode", "support_phrase" = "true|false")] [COMMENT 'your comment']
+  INDEX idx_name5(column_name4) USING INVERTED [PROPERTIES("char_filter_type" = "char_replace", "char_filter_pattern" = "._", "char_filter_replacement" = " ")] [COMMENT 'your comment']
+  INDEX idx_name5(column_name4) USING INVERTED [PROPERTIES("char_filter_type" = "char_replace", "char_filter_pattern" = "._")] [COMMENT 'your comment']
 )
 table_properties;
 ```
diff --git a/docs/zh-CN/docs/data-table/index/inverted-index.md b/docs/zh-CN/docs/data-table/index/inverted-index.md
index ce85973752..ad4c9a011d 100644
--- a/docs/zh-CN/docs/data-table/index/inverted-index.md
+++ b/docs/zh-CN/docs/data-table/index/inverted-index.md
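@@ -82,6 +82,11 @@ A brief introduction to the features of the Doris inverted index:
      - true means supported, but the index needs more storage space
      - false means not supported, which saves storage space; MATCH_ALL can be used to query multiple keywords
      - the default is false
+    - char_filter: pre-processes the string before word segmentation
+      - char_filter_type: specifies which char_filter to use (currently only char_replace is supported)
+        - char_replace: replaces each char in the pattern with a char in the replacement
+          - char_filter_pattern: the array of characters to be replaced
+          - char_filter_replacement: the array of replacement characters; optional, defaults to a single space character
  - COMMENT is optional and specifies a comment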
 
 ```sql
@@ -92,6 +97,8 @@ CREATE TABLE table_name
  INDEX idx_name2(column_name2) USING INVERTED [PROPERTIES("parser" = "english|unicode|chinese")] [COMMENT 'your comment']
  INDEX idx_name3(column_name3) USING INVERTED [PROPERTIES("parser" = "chinese", "parser_mode" = "fine_grained|coarse_grained")] [COMMENT 'your comment']
  INDEX idx_name4(column_name4) USING INVERTED [PROPERTIES("parser" = "english|unicode|chinese", "support_phrase" = "true|false")] [COMMENT 'your comment']
+  INDEX idx_name5(column_name4) USING INVERTED [PROPERTIES("char_filter_type" = "char_replace", "char_filter_pattern" = "._", "char_filter_replacement" = " ")] [COMMENT 'your comment']
+  INDEX idx_name5(column_name4) USING INVERTED [PROPERTIES("char_filter_type" = "char_replace", "char_filter_pattern" = "._")] [COMMENT 'your comment']
 )
 table_properties;
 ```
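
Note on char_replace: the doc text above describes the filter only in words, so here is a minimal Python sketch of the behavior it appears to describe. It is not the Doris implementation. The function name `char_replace` is hypothetical. The sketch also assumes that a single replacement character (the first character of `replacement`) is substituted for every character listed in `pattern`, which this commit does not confirm.

```python
def char_replace(text: str, pattern: str, replacement: str = " ") -> str:
    """Illustrative model of a char_replace filter (not Doris code).

    Every character of `text` that appears in `pattern` is replaced.
    Assumption: only the first character of `replacement` is used, and it
    defaults to a space, matching the documented default.
    """
    table = {ord(ch): replacement[0] for ch in pattern}
    return text.translate(table)


# With "char_filter_pattern" = "._" and the default replacement, dots and
# underscores become spaces before the text reaches the tokenizer.
filtered = char_replace("user_name.first", "._")
print(filtered)          # user name first
print(filtered.split())  # ['user', 'name', 'first']
```

Under these assumptions, a value such as `user_name.first` would be split into three separate terms by a whitespace-based parser instead of being kept as one token.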

