Re: [PR] Fix All Regexp functions Documention [doris-website]
yiguolei merged PR #2568: URL: https://github.com/apache/doris-website/pull/2568 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Fix All Regexp functions Documention [doris-website]
zclllyybb commented on code in PR #2568:
URL: https://github.com/apache/doris-website/pull/2568#discussion_r2185483645
##
i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/string-functions/regexp-count.md:
##
@@ -49,10 +51,11 @@ REGEXP_COUNT(, )
## 返回值
- 返回正则表达式 “pattern” 在字符串 “str” 中的匹配字符数量,返回类型为 “int”。若没有字符匹配,则返回 0。
-- 1. 如果'str' 或者 'parttern' 为NULL ,或者他们都为NULL,返回NULL;
-- 2. 如果 'pattern' 不符合正则表达式规则,则是错误的用法,抛出error;
-### 测试字符串区匹配包含转义字符的表达式返回结果
+- 如果'str' 或者 'parttern' 为NULL ,或者他们都为NULL,返回NULL;
+- 如果 'pattern' 不符合正则表达式规则,则是错误的用法,抛出error;
Review Comment:
The examples below do not cover this case.
##
i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/string-functions/regexp-count.md:
##
@@ -216,4 +227,20 @@ SELECT id, regexp_count(text_data, 'e') as count_e FROM test_table_for_regexp_co
|9 | 0 |
| 10 | 1 |
+--+-+
+
+```
+
+表情包匹配
Review Comment:
This wording is imprecise: these are emoji characters. Emoji can be composed from other emoji, so you should test whether a component emoji can match a combined emoji.
##
i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md:
##
@@ -5,21 +5,52 @@
}
---
+
+
## 描述
-对字符串 str 进行正则匹配,抽取符合 pattern 的所有子模式匹配部分。需要 pattern 完全匹配 str 中的某部分,这样才能返回 pattern 部分中需匹配部分的字符串数组。如果没有匹配或者 pattern 没有子模式,返回空字符串。
+REGEXP_EXTRACT_ALL函数用于对给定字符串str执行正则表达式匹配,并提取与指定pattern的第一个子模式匹配的所有部分。为了使函数返回表示模式匹配部分的字符串数组,该模式必须与输入字符串str的一部分完全匹配。如果没有匹配项,或模式不包含任何子模式,则返回空字符串。
Review Comment:
There must be a space between Chinese and English text everywhere.
Review Comment (same line):
Also, the statement “提取与指定pattern的第一个子模式匹配的所有部分” ("extracts all parts that match the first sub-pattern of the specified pattern") is wrong.
It should be “所有与指定 pattern 匹配的文本串当中的与第一个子模式匹配的部分” ("from every substring that matches the specified pattern, the part that matches the first sub-pattern").
The two give different results: your original wording does not require the whole pattern to produce a valid match.
##
i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/string-functions/regexp-count.md:
##
@@ -25,11 +25,13 @@ under the License.
## 描述
这是一个用于统计字符串中匹配给定正则表达式模式的字符数量的函数。输入包括用户提供的字符串和正则表达式模式。返回值为匹配字符的总数量;如果未找到匹配项,则返回 0。
-1. 'str' 参数为 “string” 类型,是用户希望通过正则表达式进行匹配的字符串。
+需要注意的是,在处理字符集匹配时,应使用 Utf-8 标准字符类。这确保函数能够正确识别和处理来自不同语言的各种字符。
Review Comment:
You should link to the RE2 documentation that shows which character classes we support.
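Two of the points in the review comments above (the REGEXP_EXTRACT_ALL wording and the emoji composition) can be checked quickly outside Doris. The following is only a sketch under stated assumptions: it uses Python's `re` module as a stand-in for the RE2 engine, replaces the POSIX class `[[:lower:]]` with `[a-z]` because Python's `re` does not support POSIX classes, and the helper names and the sample string `'AbCdE xyz'` are made up for illustration. It does not show what Doris itself returns.

```python
import re
from typing import List

# Assumption: Python's re stands in for Doris/RE2; [a-z] replaces [[:lower:]].
TEXT = "AbCdE xyz"
PATTERN = r"([a-z]+)C([a-z]+)"


def group1_of_full_matches(pattern: str, text: str) -> List[str]:
    """Group 1 from every place where the WHOLE pattern matches (reviewer's wording)."""
    return [m.group(1) for m in re.finditer(pattern, text)]


def matches_of_subpattern_alone(subpattern: str, text: str) -> List[str]:
    """Every match of the sub-pattern by itself, ignoring the rest of the pattern."""
    return re.findall(subpattern, text)


strict = group1_of_full_matches(PATTERN, TEXT)
loose = matches_of_subpattern_alone(r"[a-z]+", TEXT)

# Emoji composition: the family emoji is man + ZWJ + woman + ZWJ + girl,
# so the single "man" code point occurs inside the combined sequence.
MAN = "\U0001F468"
FAMILY = "\U0001F468\u200D\U0001F469\u200D\U0001F467"
component_hits = re.findall(MAN, FAMILY)

print(strict)  # ['b']
print(loose)   # ['b', 'd', 'xyz']
print(len(FAMILY), len(component_hits))  # 5 1
```

The two lists differ because only `bCd` satisfies the whole pattern, while the sub-pattern alone also matches `d` and `xyz`. The last line shows why a component-versus-combined emoji test is worth adding: the combined emoji is five code points and contains the component emoji as one of them.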
Re: [PR] Fix All Regexp functions Documention [doris-website]
yiguolei commented on code in PR #2568:
URL: https://github.com/apache/doris-website/pull/2568#discussion_r2184307065
##
docs/sql-manual/sql-functions/scalar-functions/string-functions/regexp-replace-one.md:
##
@@ -40,16 +42,19 @@ REGEXP_REPLACE_ONE(, , )
| Parameter | Description |
| -- | -- |
-| `` | The column need to do regular matching.|
-| `` | Target pattern.|
-| `` | The string to replace the matched pattern.|
+| `` | This parameter is of type string. It represents the string on which the regular expression matching will be performed. This is the target string that you want to modify.|
Review Comment:
What is the result when each of these three parameters is NULL?
Re: [PR] Fix All Regexp functions Documention [doris-website]
yiguolei commented on code in PR #2568:
URL: https://github.com/apache/doris-website/pull/2568#discussion_r2184306289
##
docs/sql-manual/sql-functions/scalar-functions/string-functions/regexp-replace.md:
##
@@ -40,35 +39,147 @@ REGEXP_REPLACE(, , )
| Parameter | Description |
| -- | -- |
-| `` | The column need to do regular matching.|
-| `` | Target pattern.|
-| `` | The string to replace the matched pattern.|
+| `` | This parameter is of Varchar type. It represents the string on which the regular expression matching will be performed. It can be a literal string or a column from a table containing string values.|
+| `` | This parameter is of Varchar type. It is the regular expression pattern used to match the string. The pattern can include various regular expression metacharacters and constructs to define complex matching rules.|
+| `` | This parameter is of Varchar type. It is the string that will replace the parts of the that match the . If you want to reference captured groups in the pattern, you can use backreferences like \1, \2, etc., where \1 refers to the first captured group, \2 refers to the second captured group, and so on.|
## Return Value
-Result after doing replacement. It is `Varchar` type.
+The function returns the result string after the replacement operation. The return value is of Varchar type. If no part of the matches the , the original will be returned.
Review Comment:
What is the result when each of these three parameters is NULL?
Re: [PR] Fix All Regexp functions Documention [doris-website]
yiguolei commented on code in PR #2568:
URL: https://github.com/apache/doris-website/pull/2568#discussion_r2184304236
##
docs/sql-manual/sql-functions/scalar-functions/string-functions/regexp.md:
##
@@ -40,25 +38,24 @@ REGEXP(, )
| Parameter | Description |
| -- | -- |
-| `` | The column need to do regular matching.|
-| `` | Target pattern.|
+| `` | String type. Represents the string to be matched against the regular expression, which can be a column in a table or a literal string.|
+| `` | String type. The regular expression pattern used to match against the string . Regular expressions provide a powerful way to define complex search patterns, including character classes, quantifiers, and anchors.|
## Return Value
-A `BOOLEAN` value indicating whether the match was successful
+The REGEXP function returns a BOOLEAN value. If the string matches the regular expression pattern , the function returns true (represented as 1 in SQL); if not, it returns false (represented as 0 in SQL).
Review Comment:
When does it return NULL?
Re: [PR] Fix All Regexp functions Documention [doris-website]
yiguolei commented on code in PR #2568:
URL: https://github.com/apache/doris-website/pull/2568#discussion_r2184299159
##
docs/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md:
##
@@ -38,20 +41,22 @@ REGEXP_EXTRACT_OR_NULL(, , )
| Parameter | Description |
| -- | -- |
-| `` | String, a text string that needs to be matched with regular
expressions. |
-| `` | String, target pattern. |
-| `` | Integer, the index of the expression group to extract, counting
starts from 1. |
+| `` | A string parameter. It represents the text string in which the
regular expression matching will be performed. This string can contain any
combination of characters, and the function will search within it for
substrings that match the . |
+| `` |A string parameter. It is the target regular expression
pattern. This pattern can include various regular expression metacharacters and
character classes, which precisely define the rules for the substring to be
matched |
+| `` | An integer parameter. It indicates the index of the expression
group to be extracted. The indexing starts from 1. If is set to 0, the
entire first matching substring will be returned. If is a negative number
or exceeds the number of expression groups in the pattern, the function will
return NULL. |
## Return Value
Return a string type, with the result being the part that matches ``.
-- If the input `` is 0, return the entire first matching substring.
-- If the input `` is invalid (negative or exceeds the number of
expression groups), return NULL.
-- If the regular expression fails to match, return NULL.
+ If the input `` is 0, return the entire first matching substring.
Review Comment:
What happens if str and pattern are NULL?
##
docs/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md:
##
@@ -100,6 +110,8 @@ SELECT REGEXP_EXTRACT_OR_NULL('AbCdE',
'([[:lower:]]+)C([[:upper:]]+)', 1);
+-+
```
+Chinese character matching.The pattern (\p{Han}+)(.+) first matches one or
more Chinese characters and then any remaining characters. The group with index
2 represents the non - Chinese part of the string after the Chinese characters.
Review Comment:
Add a case where position exceeds the string length.
Add a case where position is negative.
Add a case where str or pattern is NULL.
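As a rough aid for writing those cases, the rules quoted in the parameter table above (index 0 returns the entire first match; a negative index, or one larger than the number of groups, returns NULL) can be emulated in a few lines. This is only a sketch under stated assumptions: it uses Python's `re` instead of Doris, `None` stands in for SQL NULL, `[a-z]` replaces `[[:lower:]]`, it treats "no match" as NULL, and `extract_or_none` is a hypothetical helper name. NULL `str` or `pattern` inputs are deliberately left out, because this thread does not state what Doris returns for them.

```python
import re
from typing import Optional

# Assumption: [a-z] replaces the POSIX class [[:lower:]] used in the Doris examples.
PATTERN = r"([a-z]+)C([a-z]+)"


def extract_or_none(text: str, pattern: str, pos: int) -> Optional[str]:
    """Emulate the documented index rules; None plays the role of SQL NULL."""
    compiled = re.compile(pattern)
    # Negative index, or index beyond the number of capture groups: NULL.
    if pos < 0 or pos > compiled.groups:
        return None
    match = compiled.search(text)
    # Assumed here: no match at all also yields NULL.
    if match is None:
        return None
    # pos == 0 is the entire first match; pos >= 1 is that capture group.
    return match.group(pos)


for pos in (0, 1, 2, 3, -1):
    print(pos, extract_or_none("AbCdE", PATTERN, pos))
```

The loop covers the whole match, both groups, an out-of-range index, and a negative index, which maps onto the cases requested in the comment above once they are rewritten as Doris SQL.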
Re: [PR] Fix All Regexp functions Documention [doris-website]
yiguolei commented on code in PR #2568:
URL: https://github.com/apache/doris-website/pull/2568#discussion_r2184296247
##
docs/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md:
##
@@ -40,41 +40,151 @@ REGEXP_EXTRACT_ALL(, )
| Parameter | Description |
| -- | -- |
-| `` | The column need to do regular matching.|
-| `` | Target pattern.|
+| `` | This parameter is of type String. It represents the input string on which the regular expression matching will be performed. This can be a literal string value or a reference to a column in a table that contains string data.|
+| `` | This parameter is also of type String. It specifies the regular expression pattern that will be used to match against the input string . The pattern can include various regular expression constructs such as character classes, quantifiers, and sub - patterns.|
## Return value
-Value after extraction. It is `String` type.
+The function returns an array of strings that represent the parts of the input string that match the first sub - pattern of the specified regular expression. The return type is an array of String values. If no matches are found, or if the pattern has no sub - patterns, an empty array is returned.
Review Comment:
What happens when the input is NULL?
Re: [PR] Fix All Regexp functions Documention [doris-website]
yiguolei commented on code in PR #2568:
URL: https://github.com/apache/doris-website/pull/2568#discussion_r2179348863
##
i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md:
##
@@ -28,25 +61,69 @@ mysql> SELECT regexp_extract_all('AbCdE',
'([[:lower:]]+)C([[:lower:]]+)');
+--+
| ['b']|
+--+
+```
+### 示例 2:字符串中的多个匹配项
Review Comment:
This does not need to be a heading; plain text is enough. See Snowflake for reference:
https://docs.snowflake.com/en/sql-reference/functions/regexp_count
##
versioned_docs/version-1.2/sql-manual/sql-functions/string-functions/regexp/regexp-replace.md:
##
@@ -21,35 +21,93 @@ REGEXP_REPLACE(, , )
| Parameter | Description |
| -- | -- |
-| `` | The column need to do regular matching.|
-| `` | Target pattern.|
-| `` | The string to replace the matched pattern.|
+| `` | This parameter is of Varchar type. It represents the string on
which the regular expression matching will be performed. It can be a literal
string or a column from a table containing string values.|
+| `` | This parameter is of Varchar type. It is the regular
expression pattern used to match the string. The pattern can include various
regular expression metacharacters and constructs to define complex matching
rules.|
+| `` | This parameter is of Varchar type. It is the string that will
replace the parts of the that match the . If you want to
reference captured groups in the pattern, you can use backreferences like \1,
\2, etc., where \1 refers to the first captured group, \2 refers to the second
captured group, and so on.|
## Return Value
-Result after doing replacement. It is `Varchar` type.
+The function returns the result string after the replacement operation. The
return value is of Varchar type. If no part of the matches the ,
the original will be returned.
## Example
+### Test Basic replacement example
+Explain: In this example, all spaces in the string 'a b c' are replaced with
hyphens.
+
```sql
mysql> SELECT regexp_replace('a b c', ' ', '-');
+---+
| regexp_replace('a b c', ' ', '-') |
+---+
| a-b-c |
+---+
+```
+### Test Using captured groups
Review Comment:
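Do not write a heading; just merge this heading line and the Explain line into a single line. See Oracle:
https://docs.oracle.com/en/database/oracle/oracle-database/23/sqlrf/REGEXP_SUBSTR.html
It does not add headings like this either; plain text is the usual choice.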
##
i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract.md:
##
@@ -40,17 +64,71 @@ mysql> SELECT regexp_extract('AbCdE',
'([[:lower:]]+)C([[:lower:]]+)', 1);
+-+
| b |
+-+
+```
+### 示例 2:提取第二个匹配部分
Review Comment:
For cases like this, merge the two lines into one.
Also, do not add words such as “解释” ("Explanation").
##
docs/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md:
##
@@ -21,41 +21,88 @@ REGEXP_EXTRACT_ALL(, )
| Parameter | Description |
| -- | -- |
-| `` | The column need to do regular matching.|
-| `` | Target pattern.|
+| `` | This parameter is of type String. It represents the input string
on which the regular expression matching will be performed. This can be a
literal string value or a reference to a column in a table that contains string
data.|
+| `` | This parameter is also of type String. It specifies the
regular expression pattern that will be used to match against the input string
. The pattern can include various regular expression constructs such as
character classes, quantifiers, and sub - patterns.|
## Return value
-Value after extraction. It is `String` type.
+The function returns an array of strings that represent the parts of the input
string that match the first sub - pattern of the specified regular expression.
The return type is an array of String values. If no matches are found, or if
the pattern has no sub - patterns, an empty array is returned.
## Example
+### Example 1: Basic matching of lowercase letters around 'C'.
+Explain: In this example, the pattern ([[:lower:]]+)C([[:lower:]]+) matches
the part of the string where one or more lowercase letters are followed by 'C'
and then one or more lowercase letters. The first sub - pattern ([[:lower:]]+)
before 'C' matches 'b', so the result is ['b'].
+
```sql
mysql> SELECT regexp_extract_all('AbCdE', '([[:lower:]]+)C([[:lower:]]+)');
+--+
| regexp_extract_all('AbCdE', '([[:lower:]]+)C([[:lower:]]+)') |
+--+
Re: [PR] Fix All Regexp functions Documention [doris-website]
HappenLee commented on code in PR #2568:
URL: https://github.com/apache/doris-website/pull/2568#discussion_r2178946918
##
i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md:
##
@@ -5,21 +5,54 @@
}
---
+
+
+
+
## 描述
-对字符串 str 进行正则匹配,抽取符合 pattern 的所有子模式匹配部分。需要 pattern 完全匹配 str 中的某部分,这样才能返回 pattern 部分中需匹配部分的字符串数组。如果没有匹配或者 pattern 没有子模式,返回空字符串。
Review Comment:
Add a case that returns an empty string.
Re: [PR] Fix All Regexp functions Documention [doris-website]
HappenLee commented on code in PR #2568:
URL: https://github.com/apache/doris-website/pull/2568#discussion_r2178927519
##
i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md:
##
@@ -1,15 +1,19 @@
---
{
"title": "REGEXP_EXTRACT_OR_NULL",
-"language": "zh-CN"
Review Comment:
zh-CN not en
Re: [PR] Fix All Regexp functions Documention [doris-website]
HappenLee commented on code in PR #2568:
URL: https://github.com/apache/doris-website/pull/2568#discussion_r2177004873
##
copy.sh:
##
@@ -0,0 +1,58 @@
+
+cp /home/jh/doris-website/versioned_docs/version-3.0/sql-manual/sql-functions/scalar-functions/string-functions/regexp-replace.md /home/jh/doris-website/versioned_docs/version-2.0/sql-manual/sql-functions/string-functions/regexp/regexp-replace.md
+cp /home/jh/doris-website/versioned_docs/version-3.0/sql-manual/sql-functions/scalar-functions/string-functions/regexp-replace-one.md /home/jh/doris-website/versioned_docs/version-2.0/sql-manual/sql-functions/string-functions/regexp/regexp-replace-one.md
Review Comment:
Do not push the .sh file.
