This is an automated email from the ASF dual-hosted git repository.
zclll pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
new b1fcccdba34 [Enhancement](doc) add default behaviors and pattern modifiers for regexp functions (#3410)
b1fcccdba34 is described below
commit b1fcccdba34093fa87581057cf9eccd3049005d3
Author: linrrarity <[email protected]>
AuthorDate: Thu Feb 26 11:14:18 2026 +0800
[Enhancement](doc) add default behaviors and pattern modifiers for regexp functions (#3410)
## Versions
- [x] dev
- [x] 4.x
- [x] 3.x
- [ ] 2.1
## Languages
- [x] Chinese
- [x] English
---
.../string-functions/regexp-extract-all.md | 74 ++++++++++++++++++
.../string-functions/regexp-extract-or-null.md | 90 ++++++++++++++++++++++
.../string-functions/regexp-extract.md | 90 ++++++++++++++++++++++
.../scalar-functions/string-functions/regexp.md | 85 ++++++++++++++++++++
.../string-functions/regexp-extract-all.md | 74 ++++++++++++++++++
.../string-functions/regexp-extract-or-null.md | 90 ++++++++++++++++++++++
.../string-functions/regexp-extract.md | 90 ++++++++++++++++++++++
.../scalar-functions/string-functions/regexp.md | 86 +++++++++++++++++++++
.../string-functions/regexp-extract-all.md | 72 +++++++++++++++++
.../string-functions/regexp-extract-or-null.md | 88 +++++++++++++++++++++
.../string-functions/regexp-extract.md | 88 +++++++++++++++++++++
.../scalar-functions/string-functions/regexp.md | 84 ++++++++++++++++++++
.../string-functions/regexp-extract-all.md | 72 +++++++++++++++++
.../string-functions/regexp-extract-or-null.md | 88 +++++++++++++++++++++
.../string-functions/regexp-extract.md | 88 +++++++++++++++++++++
.../scalar-functions/string-functions/regexp.md | 84 ++++++++++++++++++++
.../string-functions/regexp-extract-all.md | 72 +++++++++++++++++
.../string-functions/regexp-extract-or-null.md | 88 +++++++++++++++++++++
.../string-functions/regexp-extract.md | 88 +++++++++++++++++++++
.../scalar-functions/string-functions/regexp.md | 84 ++++++++++++++++++++
.../string-functions/regexp-extract-all.md | 72 +++++++++++++++++
.../string-functions/regexp-extract-or-null.md | 88 +++++++++++++++++++++
.../string-functions/regexp-extract.md | 88 +++++++++++++++++++++
.../scalar-functions/string-functions/regexp.md | 84 ++++++++++++++++++++
24 files changed, 2007 insertions(+)
diff --git a/docs/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md b/docs/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md
index 7f5e9e0b9c3..f383cfa0966 100644
--- a/docs/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md
+++ b/docs/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md
@@ -37,6 +37,33 @@ REGEXP_EXTRACT_ALL(<str>, <pattern>)
The function returns an array of strings that represent the parts of the input
string that match the first sub - pattern of the specified regular expression.
The return type is an array of String values. If no matches are found, or if
the pattern has no sub - patterns, an empty array is returned.
+**Default Behavior**:
+
+| Default Setting                      | Behavior                                                                                  |
+| ------------------------------------ | ----------------------------------------------------------------------------------------- |
+| `.` matches newline                  | `.` can match `\n` (newline) by default.                                                  |
+| Case-sensitive                       | Matching is case-sensitive.                                                               |
+| `^`/`$` match full string boundaries | `^` matches only the start of the string, `$` matches only the end, not line starts/ends. |
+| Greedy quantifiers                   | `*`, `+`, etc. match as much as possible by default.                                      |
+| UTF-8                                | Strings are processed as UTF-8.                                                           |
+
+**Pattern Modifiers**:
+
+You can override the default behavior by prefixing the `pattern` with `(?flags)`. Multiple modifiers can be combined, e.g., `(?im)`; a `-` prefix disables the corresponding option, e.g., `(?-s)`.
+
+Pattern modifiers only take effect when using the default regex engine. If `enable_extended_regex=true` is set and the pattern uses zero-width assertions (e.g., `(?<=...)`, `(?=...)`), the query is handled by the Boost.Regex engine, and modifiers may not behave as expected. It is recommended not to combine pattern modifiers with zero-width assertions.
+
+| Flag    | Meaning                                                                      |
+| ------- | ---------------------------------------------------------------------------- |
+| `(?i)`  | Case-insensitive matching                                                    |
+| `(?-i)` | Case-sensitive (default)                                                     |
+| `(?s)`  | `.` matches newline (enabled by default)                                     |
+| `(?-s)` | `.` does **not** match newline                                               |
+| `(?m)`  | Multiline mode: `^` matches start of each line, `$` matches end of each line |
+| `(?-m)` | Single-line mode: `^`/`$` match full string boundaries (default)             |
+| `(?U)`  | Non-greedy quantifiers: `*`, `+`, etc. match as little as possible           |
+| `(?-U)` | Greedy quantifiers (default): `*`, `+`, etc. match as much as possible       |
+
## Example
Basic matching of lowercase letters around 'C'.In this example, the pattern
([[:lower:]]+)C([[:lower:]]+) matches the part of the string where one or more
lowercase letters are followed by 'C' and then one or more lowercase letters.
The first sub - pattern ([[:lower:]]+) before 'C' matches 'b', so the result is
['b'].
@@ -206,4 +233,51 @@ SELECT REGEXP_EXTRACT_ALL('ID:AA-1,ID:BB-2,ID:CC-3', '(?<=ID:)([A-Z]{2}-\\d)');
+-------------------------------------------------------------------------+
| ['AA-1','BB-2','CC-3'] |
+-------------------------------------------------------------------------+
+```
+
+Pattern Modifiers
+
+Case-insensitive matching: `(?i)` makes the match ignore case
+
+```sql
+SELECT REGEXP_EXTRACT_ALL('Hello hello HELLO', '(hello)') AS case_sensitive,
+ REGEXP_EXTRACT_ALL('Hello hello HELLO', '(?i)(hello)') AS case_insensitive;
+```
+
+```text
++----------------+---------------------------+
+| case_sensitive | case_insensitive |
++----------------+---------------------------+
+| ['hello'] | ['Hello','hello','HELLO'] |
++----------------+---------------------------+
+```
+
+Multiline mode: `(?m)` makes `^` and `$` match start/end of each line
+
+```sql
+SELECT REGEXP_EXTRACT_ALL('foo\nbar\nbaz', '^([a-z]+)') AS single_line,
+ REGEXP_EXTRACT_ALL('foo\nbar\nbaz', '(?m)^([a-z]+)') AS multi_line;
+```
+
+```text
++-------------+---------------------+
+| single_line | multi_line |
++-------------+---------------------+
+| ['foo'] | ['foo','bar','baz'] |
++-------------+---------------------+
+```
+
+Greedy vs non-greedy: `(?U)` makes quantifiers match as little as possible
+
+```sql
+SELECT REGEXP_EXTRACT_ALL('aXbXcXd', '(a.*X)') AS greedy,
+ REGEXP_EXTRACT_ALL('aXbXcXd', '(?U)(a.*X)') AS non_greedy;
+```
+
+```text
++------------+------------+
+| greedy     | non_greedy |
++------------+------------+
+| ['aXbXcX'] | ['aX']     |
++------------+------------+
```
\ No newline at end of file
diff --git a/docs/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md b/docs/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md
index 5caa80bc5d1..4dcbf801767 100644
--- a/docs/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md
+++ b/docs/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md
@@ -47,6 +47,33 @@ Return a string type, with the result being the part that matches `<pattern>`.
If the `<pos>` < 0,return NULL;
If the `pos` > the length of `<str>`,return NULL;
+**Default Behavior**:
+
+| Default Setting                      | Behavior                                                                                  |
+| ------------------------------------ | ----------------------------------------------------------------------------------------- |
+| `.` matches newline                  | `.` can match `\n` (newline) by default.                                                  |
+| Case-sensitive                       | Matching is case-sensitive.                                                               |
+| `^`/`$` match full string boundaries | `^` matches only the start of the string, `$` matches only the end, not line starts/ends. |
+| Greedy quantifiers                   | `*`, `+`, etc. match as much as possible by default.                                      |
+| UTF-8                                | Strings are processed as UTF-8.                                                           |
+
+**Pattern Modifiers**:
+
+You can override the default behavior by prefixing the `pattern` with `(?flags)`. Multiple modifiers can be combined, e.g., `(?im)`; a `-` prefix disables the corresponding option, e.g., `(?-s)`.
+
+Pattern modifiers only take effect when using the default regex engine. If `enable_extended_regex=true` is set and the pattern uses zero-width assertions (e.g., `(?<=...)`, `(?=...)`), the query is handled by the Boost.Regex engine, and modifiers may not behave as expected. It is recommended not to combine pattern modifiers with zero-width assertions.
+
+| Flag    | Meaning                                                                      |
+| ------- | ---------------------------------------------------------------------------- |
+| `(?i)`  | Case-insensitive matching                                                    |
+| `(?-i)` | Case-sensitive (default)                                                     |
+| `(?s)`  | `.` matches newline (enabled by default)                                     |
+| `(?-s)` | `.` does **not** match newline                                               |
+| `(?m)`  | Multiline mode: `^` matches start of each line, `$` matches end of each line |
+| `(?-m)` | Single-line mode: `^`/`$` match full string boundaries (default)             |
+| `(?U)`  | Non-greedy quantifiers: `*`, `+`, etc. match as little as possible           |
+| `(?-U)` | Greedy quantifiers (default): `*`, `+`, etc. match as much as possible       |
+
## Example
Extracting a specific group from a match. Explanation: The regular expression
([[:lower:]]+)C([[:lower:]]+) looks for sequences of one or more lowercase
letters separated by 'C'. The group with index 1 corresponds to the first
sequence of lowercase letters, so 'b' is returned.
@@ -249,4 +276,67 @@ SELECT regexp_extract_or_null('foo123bar', '(?<=foo)(\\d+)(?=bar)', 1);
+-----------------------------------------------------------------+
| 123 |
+-----------------------------------------------------------------+
+```
+
+Pattern Modifiers
+
+Case-insensitive matching: `(?i)` makes the match ignore case
+
+```sql
+SELECT REGEXP_EXTRACT_OR_NULL('Hello World', '(hello)', 1) AS case_sensitive,
+ REGEXP_EXTRACT_OR_NULL('Hello World', '(?i)(hello)', 1) AS case_insensitive;
+```
+
+```text
++----------------+------------------+
+| case_sensitive | case_insensitive |
++----------------+------------------+
+| NULL | Hello |
++----------------+------------------+
+```
+
+`.` matches newline by default; use `(?-s)` to prevent `.` from matching newline
+
+```sql
+SELECT REGEXP_EXTRACT_OR_NULL('foo\nbar', '^(.+)$', 1) AS dot_match_nl,
+ REGEXP_EXTRACT_OR_NULL('foo\nbar', '(?-s)^(.+)$', 1) AS dot_not_match_nl;
+```
+
+```text
++--------------+------------------+
+| dot_match_nl | dot_not_match_nl |
++--------------+------------------+
+| foo
+bar | NULL |
++--------------+------------------+
+```
+
+Multiline mode: `(?m)` makes `^` and `$` match start/end of each line
+
+```sql
+SELECT REGEXP_EXTRACT_OR_NULL('foo\nbar', '^(bar)', 1) AS single_line,
+ REGEXP_EXTRACT_OR_NULL('foo\nbar', '(?m)^(bar)', 1) AS multi_line;
+```
+
+```text
++-------------+------------+
+| single_line | multi_line |
++-------------+------------+
+| NULL | bar |
++-------------+------------+
+```
+
+Greedy vs non-greedy: `(?U)` makes quantifiers match as little as possible
+
+```sql
+SELECT REGEXP_EXTRACT_OR_NULL('aXbXc', '(a.*X)', 1) AS greedy,
+ REGEXP_EXTRACT_OR_NULL('aXbXc', '(?U)(a.*X)', 1) AS non_greedy;
+```
+
+```text
++--------+------------+
+| greedy | non_greedy |
++--------+------------+
+| aXbX | aX |
++--------+------------+
```
\ No newline at end of file
diff --git a/docs/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract.md b/docs/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract.md
index d7f0055b9e6..80bfd5b66ee 100644
--- a/docs/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract.md
+++ b/docs/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract.md
@@ -40,6 +40,33 @@ REGEXP_EXTRACT(<str>, <pattern>, <pos>)
The matching part of the pattern. It is of Varchar type. If no match is found,
an empty string will be returned.
+**Default Behavior**:
+
+| Default Setting                      | Behavior                                                                                  |
+| ------------------------------------ | ----------------------------------------------------------------------------------------- |
+| `.` matches newline                  | `.` can match `\n` (newline) by default.                                                  |
+| Case-sensitive                       | Matching is case-sensitive.                                                               |
+| `^`/`$` match full string boundaries | `^` matches only the start of the string, `$` matches only the end, not line starts/ends. |
+| Greedy quantifiers                   | `*`, `+`, etc. match as much as possible by default.                                      |
+| UTF-8                                | Strings are processed as UTF-8.                                                           |
+
+**Pattern Modifiers**:
+
+You can override the default behavior by prefixing the `pattern` with `(?flags)`. Multiple modifiers can be combined, e.g., `(?im)`; a `-` prefix disables the corresponding option, e.g., `(?-s)`.
+
+Pattern modifiers only take effect when using the default regex engine. If `enable_extended_regex=true` is set and the pattern uses zero-width assertions (e.g., `(?<=...)`, `(?=...)`), the query is handled by the Boost.Regex engine, and modifiers may not behave as expected. It is recommended not to combine pattern modifiers with zero-width assertions.
+
+| Flag    | Meaning                                                                      |
+| ------- | ---------------------------------------------------------------------------- |
+| `(?i)`  | Case-insensitive matching                                                    |
+| `(?-i)` | Case-sensitive (default)                                                     |
+| `(?s)`  | `.` matches newline (enabled by default)                                     |
+| `(?-s)` | `.` does **not** match newline                                               |
+| `(?m)`  | Multiline mode: `^` matches start of each line, `$` matches end of each line |
+| `(?-m)` | Single-line mode: `^`/`$` match full string boundaries (default)             |
+| `(?U)`  | Non-greedy quantifiers: `*`, `+`, etc. match as little as possible           |
+| `(?-U)` | Greedy quantifiers (default): `*`, `+`, etc. match as much as possible       |
+
## Example
Extract the first matching part.In this example, the regular expression
([[:lower:]]+)C([[:lower:]]+) matches the part of the string where one or more
lowercase letters are followed by 'C' and then one or more lowercase letters.
The first capturing group ([[:lower:]]+) before 'C' matches 'b', so the result
is 'b'.
@@ -202,4 +229,67 @@ SELECT regexp_extract('foo123bar456baz', '(?<=foo)(\\d+)(?=bar)', 1);
+---------------------------------------------------------------+
| 123 |
+---------------------------------------------------------------+
+```
+
+Pattern Modifiers
+
+Case-insensitive matching: `(?i)` makes the match ignore case
+
+```sql
+SELECT REGEXP_EXTRACT('Hello World', '(hello)', 1) AS case_sensitive,
+ REGEXP_EXTRACT('Hello World', '(?i)(hello)', 1) AS case_insensitive;
+```
+
+```text
++----------------+------------------+
+| case_sensitive | case_insensitive |
++----------------+------------------+
+| | Hello |
++----------------+------------------+
+```
+
+`.` matches newline by default; use `(?-s)` to prevent `.` from matching newline
+
+```sql
+SELECT REGEXP_EXTRACT('foo\nbar', '^(.+)$', 1) AS dot_match_nl,
+ REGEXP_EXTRACT('foo\nbar', '(?-s)^(.+)$', 1) AS dot_not_match_nl;
+```
+
+```text
++--------------+------------------+
+| dot_match_nl | dot_not_match_nl |
++--------------+------------------+
+| foo
+bar | |
++--------------+------------------+
+```
+
+Multiline mode: `(?m)` makes `^` and `$` match start/end of each line
+
+```sql
+SELECT REGEXP_EXTRACT('foo\nbar', '^(bar)', 1) AS single_line,
+ REGEXP_EXTRACT('foo\nbar', '(?m)^(bar)', 1) AS multi_line;
+```
+
+```text
++-------------+------------+
+| single_line | multi_line |
++-------------+------------+
+| | bar |
++-------------+------------+
+```
+
+Greedy vs non-greedy: `(?U)` makes quantifiers match as little as possible
+
+```sql
+SELECT REGEXP_EXTRACT('aXbXc', '(a.*X)', 1) AS greedy,
+ REGEXP_EXTRACT('aXbXc', '(?U)(a.*X)', 1) AS non_greedy;
+```
+
+```text
++--------+------------+
+| greedy | non_greedy |
++--------+------------+
+| aXbX | aX |
++--------+------------+
```
\ No newline at end of file
diff --git a/docs/sql-manual/sql-functions/scalar-functions/string-functions/regexp.md b/docs/sql-manual/sql-functions/scalar-functions/string-functions/regexp.md
index be3dc66c905..63a2d32d0b3 100644
--- a/docs/sql-manual/sql-functions/scalar-functions/string-functions/regexp.md
+++ b/docs/sql-manual/sql-functions/scalar-functions/string-functions/regexp.md
@@ -39,6 +39,33 @@ REGEXP(<str>, <pattern>)
The REGEXP function returns a BOOLEAN value. If the string <str> matches the
regular expression pattern <pattern>, the function returns true (represented as
1 in SQL); if not, it returns false (represented as 0 in SQL).
+**Default Behavior**:
+
+| Default Setting                      | Behavior                                                                                  |
+| ------------------------------------ | ----------------------------------------------------------------------------------------- |
+| `.` matches newline                  | `.` can match `\n` (newline) by default.                                                  |
+| Case-sensitive                       | Matching is case-sensitive.                                                               |
+| `^`/`$` match full string boundaries | `^` matches only the start of the string, `$` matches only the end, not line starts/ends. |
+| Greedy quantifiers                   | `*`, `+`, etc. match as much as possible by default.                                      |
+| UTF-8                                | Strings are processed as UTF-8.                                                           |
+
+**Pattern Modifiers**:
+
+You can override the default behavior by prefixing the `pattern` with `(?flags)`. Multiple modifiers can be combined, e.g., `(?im)`; a `-` prefix disables the corresponding option, e.g., `(?-s)`.
+
+Pattern modifiers only take effect when using the default regex engine. If `enable_extended_regex=true` is set and the pattern uses zero-width assertions (e.g., `(?<=...)`, `(?=...)`), the query is handled by the Boost.Regex engine, and modifiers may not behave as expected. It is recommended not to combine pattern modifiers with zero-width assertions.
+
+| Flag    | Meaning                                                                      |
+| ------- | ---------------------------------------------------------------------------- |
+| `(?i)`  | Case-insensitive matching                                                    |
+| `(?-i)` | Case-sensitive (default)                                                     |
+| `(?s)`  | `.` matches newline (enabled by default)                                     |
+| `(?-s)` | `.` does **not** match newline                                               |
+| `(?m)`  | Multiline mode: `^` matches start of each line, `$` matches end of each line |
+| `(?-m)` | Single-line mode: `^`/`$` match full string boundaries (default)             |
+| `(?U)`  | Non-greedy quantifiers: `*`, `+`, etc. match as little as possible           |
+| `(?-U)` | Greedy quantifiers (default): `*`, `+`, etc. match as much as possible       |
+
## Examples
```sql
@@ -215,4 +242,62 @@ SELECT REGEXP('Apache/Doris', '([a-zA-Z_+-]+(?:\/[a-zA-Z_0-9+-]+)*)(?=s|$)');
+-----------------------------------------------------------------------+
| 1 |
+-----------------------------------------------------------------------+
+```
+
+Pattern Modifiers
+
+Case-insensitive matching: `(?i)` makes the match ignore case
+
+```sql
+SELECT REGEXP('Hello World', 'hello') AS case_sensitive, REGEXP('Hello World', '(?i)hello') AS case_insensitive;
+```
+
+```text
++----------------+------------------+
+| case_sensitive | case_insensitive |
++----------------+------------------+
+| 0 | 1 |
++----------------+------------------+
+```
+
+`.` matches newline by default; use `(?-s)` to prevent `.` from matching newline
+
+```sql
+SELECT REGEXP('foo\nbar', '^.+$') AS dot_match_nl, REGEXP('foo\nbar', '(?-s)^.+$') AS dot_not_match_nl;
+```
+
+```text
++--------------+------------------+
+| dot_match_nl | dot_not_match_nl |
++--------------+------------------+
+| 1 | 0 |
++--------------+------------------+
+```
+
+Multiline mode: `(?m)` makes `^` and `$` match start/end of each line
+
+```sql
+SELECT REGEXP('foo\nbar', '^bar') AS single_line, REGEXP('foo\nbar', '(?m)^bar') AS multi_line;
+```
+
+```text
++-------------+------------+
+| single_line | multi_line |
++-------------+------------+
+| 0 | 1 |
++-------------+------------+
+```
+
+Greedy vs non-greedy: `(?U)` makes quantifiers match as little as possible
+
+```sql
+SELECT REGEXP_EXTRACT('aXbXc', '(a.*X)', 1) AS greedy, REGEXP_EXTRACT('aXbXc', '(?U)(a.*X)', 1) AS non_greedy;
+```
+
+```text
++--------+------------+
+| greedy | non_greedy |
++--------+------------+
+| aXbX | aX |
++--------+------------+
```
\ No newline at end of file
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md
index b988716b862..1c679966a7d 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md
@@ -57,6 +57,33 @@ REGEXP_EXTRACT_ALL(<str>, <pattern>)
函数返回表示输入字符串中与指定正则表达式的第一个子模式匹配部分的字符串数组。返回类型为 String
值数组。如果未找到匹配项,或模式没有子模式,则返回空数组。
+**默认行为**:
+
+| 默认配置 | 行为说明 |
+| -------------------------- | ----------------------------------------------------------------- |
+| `.` 匹配换行符 | `.` 默认可以匹配 `\n`(换行符)。 |
+| 大小写敏感 | 匹配时区分大小写。 |
+| `^`/`$` 匹配整个字符串边界 | `^` 仅匹配字符串开头,`$` 仅匹配字符串结尾,而非每行的行首/行尾。 |
+| 量词贪婪 | `*`、`+` 等量词默认尽可能多地匹配。 |
+| UTF-8 | 字符串按 UTF-8 处理。 |
+
+**模式修饰符**:
+
+可通过在 `pattern` 前缀写入 `(?flags)` 来覆盖默认行为。多个修饰符可组合,如 `(?im)`;`-` 前缀表示关闭对应选项,如 `(?-s)`。
+
+模式修饰符仅在使用默认正则引擎时生效。若启用了 `enable_extended_regex=true` 同时使用零宽断言(如 `(?<=...)`、`(?=...)`),查询将由 Boost.Regex 引擎处理,此时修饰符行为可能与预期不符,建议不要混合使用。
+
+| 标志 | 含义 |
+| ------- | -------------------------------------------- |
+| `(?i)` | 大小写不敏感匹配 |
+| `(?-i)` | 大小写敏感(默认) |
+| `(?s)` | `.` 匹配换行符(默认已开启) |
+| `(?-s)` | `.` 不匹配换行符 |
+| `(?m)` | 多行模式:`^` 匹配每行行首,`$` 匹配每行行尾 |
+| `(?-m)` | 单行模式:`^`/`$` 匹配整个字符串首尾(默认) |
+| `(?U)` | 量词非贪婪:`*`、`+` 等尽可能少地匹配 |
+| `(?-U)` | 量词贪婪(默认):`*`、`+` 等尽可能多地匹配 |
+
## 例子
围绕 'C' 的小写字母基本匹配,在这个示例中,模式([[:lower:]]+)C([[:lower:]]+)匹配字符串中一个或多个小写字母后跟 'C'
再跟一个或多个小写字母的部分。'C' 之前的第一个子模式([[:lower:]]+)匹配 'b',因此结果为['b']。
@@ -225,4 +252,51 @@ SELECT REGEXP_EXTRACT_ALL('ID:AA-1,ID:BB-2,ID:CC-3', '(?<=ID:)([A-Z]{2}-\\d)');
+-------------------------------------------------------------------------+
| ['AA-1','BB-2','CC-3'] |
+-------------------------------------------------------------------------+
+```
+
+模式修饰符
+
+大小写不敏感:`(?i)` 使匹配忽略大小写
+
+```sql
+SELECT REGEXP_EXTRACT_ALL('Hello hello HELLO', '(hello)') AS case_sensitive,
+ REGEXP_EXTRACT_ALL('Hello hello HELLO', '(?i)(hello)') AS case_insensitive;
+```
+
+```text
++----------------+---------------------------+
+| case_sensitive | case_insensitive |
++----------------+---------------------------+
+| ['hello'] | ['Hello','hello','HELLO'] |
++----------------+---------------------------+
+```
+
+多行模式:`(?m)` 使 `^` 和 `$` 匹配每行行首/行尾
+
+```sql
+SELECT REGEXP_EXTRACT_ALL('foo\nbar\nbaz', '^([a-z]+)') AS single_line,
+ REGEXP_EXTRACT_ALL('foo\nbar\nbaz', '(?m)^([a-z]+)') AS multi_line;
+```
+
+```text
++-------------+---------------------+
+| single_line | multi_line |
++-------------+---------------------+
+| ['foo'] | ['foo','bar','baz'] |
++-------------+---------------------+
+```
+
+贪婪与非贪婪:`(?U)` 使量词尽可能少地匹配
+
+```sql
+SELECT REGEXP_EXTRACT_ALL('aXbXcXd', '(a.*X)') AS greedy,
+ REGEXP_EXTRACT_ALL('aXbXcXd', '(?U)(a.*X)') AS non_greedy;
+```
+
+```text
++------------+------------+
+| greedy     | non_greedy |
++------------+------------+
+| ['aXbXcX'] | ['aX']     |
++------------+------------+
```
\ No newline at end of file
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md
index a0b50a34d07..c767957b759 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md
@@ -49,6 +49,33 @@ REGEXP_EXTRACT_OR_NULL(<str>, <pattern>, <pos>)
如果 `<pos>` < 0,则返回NULL;
如果 `pos` > 参数字符串`<str>`的长度,返回 NULL;
+**默认行为**:
+
+| 默认配置 | 行为说明 |
+| -------------------------- | ----------------------------------------------------------------- |
+| `.` 匹配换行符 | `.` 默认可以匹配 `\n`(换行符)。 |
+| 大小写敏感 | 匹配时区分大小写。 |
+| `^`/`$` 匹配整个字符串边界 | `^` 仅匹配字符串开头,`$` 仅匹配字符串结尾,而非每行的行首/行尾。 |
+| 量词贪婪 | `*`、`+` 等量词默认尽可能多地匹配。 |
+| UTF-8 | 字符串按 UTF-8 处理。 |
+
+**模式修饰符**:
+
+可通过在 `pattern` 前缀写入 `(?flags)` 来覆盖默认行为。多个修饰符可组合,如 `(?im)`;`-` 前缀表示关闭对应选项,如 `(?-s)`。
+
+模式修饰符仅在使用默认正则引擎时生效。若启用了 `enable_extended_regex=true` 同时使用零宽断言(如 `(?<=...)`、`(?=...)`),查询将由 Boost.Regex 引擎处理,此时修饰符行为可能与预期不符,建议不要混合使用。
+
+| 标志 | 含义 |
+| ------- | -------------------------------------------- |
+| `(?i)` | 大小写不敏感匹配 |
+| `(?-i)` | 大小写敏感(默认) |
+| `(?s)` | `.` 匹配换行符(默认已开启) |
+| `(?-s)` | `.` 不匹配换行符 |
+| `(?m)` | 多行模式:`^` 匹配每行行首,`$` 匹配每行行尾 |
+| `(?-m)` | 单行模式:`^`/`$` 匹配整个字符串首尾(默认) |
+| `(?U)` | 量词非贪婪:`*`、`+` 等尽可能少地匹配 |
+| `(?-U)` | 量词贪婪(默认):`*`、`+` 等尽可能多地匹配 |
+
## 例子
从匹配中提取特定组,正则表达式([[:lower:]]+)C([[:lower:]]+)查找由 'C' 分隔的一个或多个小写字母序列。索引为 1
的组对应第一个小写字母序列,因此返回 'b'。
@@ -251,4 +278,67 @@ SELECT regexp_extract_or_null('foo123bar', '(?<=foo)(\\d+)(?=bar)', 1);
+-----------------------------------------------------------------+
| 123 |
+-----------------------------------------------------------------+
+```
+
+模式修饰符
+
+大小写不敏感:`(?i)` 使匹配忽略大小写
+
+```sql
+SELECT REGEXP_EXTRACT_OR_NULL('Hello World', '(hello)', 1) AS case_sensitive,
+ REGEXP_EXTRACT_OR_NULL('Hello World', '(?i)(hello)', 1) AS case_insensitive;
+```
+
+```text
++----------------+------------------+
+| case_sensitive | case_insensitive |
++----------------+------------------+
+| NULL | Hello |
++----------------+------------------+
+```
+
+`.` 默认匹配换行符;使用 `(?-s)` 后 `.` 不匹配换行符
+
+```sql
+SELECT REGEXP_EXTRACT_OR_NULL('foo\nbar', '^(.+)$', 1) AS dot_match_nl,
+ REGEXP_EXTRACT_OR_NULL('foo\nbar', '(?-s)^(.+)$', 1) AS dot_not_match_nl;
+```
+
+```text
++--------------+------------------+
+| dot_match_nl | dot_not_match_nl |
++--------------+------------------+
+| foo
+bar | NULL |
++--------------+------------------+
+```
+
+多行模式:`(?m)` 使 `^` 和 `$` 匹配每行行首/行尾
+
+```sql
+SELECT REGEXP_EXTRACT_OR_NULL('foo\nbar', '^(bar)', 1) AS single_line,
+ REGEXP_EXTRACT_OR_NULL('foo\nbar', '(?m)^(bar)', 1) AS multi_line;
+```
+
+```text
++-------------+------------+
+| single_line | multi_line |
++-------------+------------+
+| NULL | bar |
++-------------+------------+
+```
+
+贪婪与非贪婪:`(?U)` 使量词尽可能少地匹配
+
+```sql
+SELECT REGEXP_EXTRACT_OR_NULL('aXbXc', '(a.*X)', 1) AS greedy,
+ REGEXP_EXTRACT_OR_NULL('aXbXc', '(?U)(a.*X)', 1) AS non_greedy;
+```
+
+```text
++--------+------------+
+| greedy | non_greedy |
++--------+------------+
+| aXbX | aX |
++--------+------------+
```
\ No newline at end of file
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract.md
index a48fc84b943..ca5697919fe 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract.md
@@ -61,6 +61,33 @@ REGEXP_EXTRACT(<str>, <pattern>, <pos>)
模式的匹配部分,类型为 Varchar。若未找到匹配项,将返回空字符串
+**默认行为**:
+
+| 默认配置 | 行为说明 |
+| -------------------------- | ----------------------------------------------------------------- |
+| `.` 匹配换行符 | `.` 默认可以匹配 `\n`(换行符)。 |
+| 大小写敏感 | 匹配时区分大小写。 |
+| `^`/`$` 匹配整个字符串边界 | `^` 仅匹配字符串开头,`$` 仅匹配字符串结尾,而非每行的行首/行尾。 |
+| 量词贪婪 | `*`、`+` 等量词默认尽可能多地匹配。 |
+| UTF-8 | 字符串按 UTF-8 处理。 |
+
+**模式修饰符**:
+
+可通过在 `pattern` 前缀写入 `(?flags)` 来覆盖默认行为。多个修饰符可组合,如 `(?im)`;`-` 前缀表示关闭对应选项,如 `(?-s)`。
+
+模式修饰符仅在使用默认正则引擎时生效。若启用了 `enable_extended_regex=true` 同时使用零宽断言(如 `(?<=...)`、`(?=...)`),查询将由 Boost.Regex 引擎处理,此时修饰符行为可能与预期不符,建议不要混合使用。
+
+| 标志 | 含义 |
+| ------- | -------------------------------------------- |
+| `(?i)` | 大小写不敏感匹配 |
+| `(?-i)` | 大小写敏感(默认) |
+| `(?s)` | `.` 匹配换行符(默认已开启) |
+| `(?-s)` | `.` 不匹配换行符 |
+| `(?m)` | 多行模式:`^` 匹配每行行首,`$` 匹配每行行尾 |
+| `(?-m)` | 单行模式:`^`/`$` 匹配整个字符串首尾(默认) |
+| `(?U)` | 量词非贪婪:`*`、`+` 等尽可能少地匹配 |
+| `(?-U)` | 量词贪婪(默认):`*`、`+` 等尽可能多地匹配 |
+
## 示例
提取第一个匹配部分,在此示例中,正则表达式([[:lower:]]+)C([[:lower:]]+)匹配字符串中一个或多个小写字母后跟 'C'
再跟一个或多个小写字母的部分。'C' 之前的第一个捕获组([[:lower:]]+)匹配 'b',因此结果为 'b'
@@ -225,4 +252,67 @@ SELECT regexp_extract('foo123bar456baz', '(?<=foo)(\\d+)(?=bar)', 1);
+---------------------------------------------------------------+
| 123 |
+---------------------------------------------------------------+
+```
+
+模式修饰符
+
+大小写不敏感:`(?i)` 使匹配忽略大小写
+
+```sql
+SELECT REGEXP_EXTRACT('Hello World', '(hello)', 1) AS case_sensitive,
+ REGEXP_EXTRACT('Hello World', '(?i)(hello)', 1) AS case_insensitive;
+```
+
+```text
++----------------+------------------+
+| case_sensitive | case_insensitive |
++----------------+------------------+
+| | Hello |
++----------------+------------------+
+```
+
+`.` 默认匹配换行符;使用 `(?-s)` 后 `.` 不匹配换行符
+
+```sql
+SELECT REGEXP_EXTRACT('foo\nbar', '^(.+)$', 1) AS dot_match_nl,
+ REGEXP_EXTRACT('foo\nbar', '(?-s)^(.+)$', 1) AS dot_not_match_nl;
+```
+
+```text
++--------------+------------------+
+| dot_match_nl | dot_not_match_nl |
++--------------+------------------+
+| foo
+bar | |
++--------------+------------------+
+```
+
+多行模式:`(?m)` 使 `^` 和 `$` 匹配每行行首/行尾
+
+```sql
+SELECT REGEXP_EXTRACT('foo\nbar', '^(bar)', 1) AS single_line,
+ REGEXP_EXTRACT('foo\nbar', '(?m)^(bar)', 1) AS multi_line;
+```
+
+```text
++-------------+------------+
+| single_line | multi_line |
++-------------+------------+
+| | bar |
++-------------+------------+
+```
+
+贪婪与非贪婪:`(?U)` 使量词尽可能少地匹配
+
+```sql
+SELECT REGEXP_EXTRACT('aXbXc', '(a.*X)', 1) AS greedy,
+ REGEXP_EXTRACT('aXbXc', '(?U)(a.*X)', 1) AS non_greedy;
+```
+
+```text
++--------+------------+
+| greedy | non_greedy |
++--------+------------+
+| aXbX | aX |
++--------+------------+
```
\ No newline at end of file
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/string-functions/regexp.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/string-functions/regexp.md
index 664128f95c4..ae78b1899bb 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/string-functions/regexp.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/string-functions/regexp.md
@@ -39,6 +39,32 @@ REGEXP(<str>, <pattern>)
## 返回值
REGEXP 函数返回布尔值(BOOLEAN)。如果字符串 <str> 匹配正则表达式模式 <pattern>,函数返回 true(在 SQL 中表示为
1);如果不匹配,返回 false(在 SQL 中表示为 0)。
+**默认行为**:
+
+| 默认配置 | 行为说明 |
+| -------------------------- | ----------------------------------------------------------------- |
+| `.` 匹配换行符 | `.` 默认可以匹配 `\n`(换行符)。 |
+| 大小写敏感 | 匹配时区分大小写。 |
+| `^`/`$` 匹配整个字符串边界 | `^` 仅匹配字符串开头,`$` 仅匹配字符串结尾,而非每行的行首/行尾。 |
+| 量词贪婪 | `*`、`+` 等量词默认尽可能多地匹配。 |
+| UTF-8 | 字符串按 UTF-8 处理。 |
+
+**模式修饰符**:
+
+可通过在 `pattern` 前缀写入 `(?flags)` 来覆盖默认行为。多个修饰符可组合,如 `(?im)`;`-` 前缀表示关闭对应选项,如 `(?-s)`。
+
+模式修饰符仅在使用默认正则引擎时生效。若启用了 `enable_extended_regex=true` 同时使用零宽断言(如 `(?<=...)`、`(?=...)`),查询将由 Boost.Regex 引擎处理,此时修饰符行为可能与预期不符,建议不要混合使用。
+
+| 标志 | 含义 |
+| ------- | -------------------------------------------- |
+| `(?i)` | 大小写不敏感匹配 |
+| `(?-i)` | 大小写敏感(默认) |
+| `(?s)` | `.` 匹配换行符(默认已开启) |
+| `(?-s)` | `.` 不匹配换行符 |
+| `(?m)` | 多行模式:`^` 匹配每行行首,`$` 匹配每行行尾 |
+| `(?-m)` | 单行模式:`^`/`$` 匹配整个字符串首尾(默认) |
+| `(?U)` | 量词非贪婪:`*`、`+` 等尽可能少地匹配 |
+| `(?-U)` | 量词贪婪(默认):`*`、`+` 等尽可能多地匹配 |
## 例子
@@ -216,3 +242,63 @@ SELECT regexp('foobar', '(?<=foo)bar');
| 1 |
+---------------------------------+
```
+
+模式修饰符
+
+大小写不敏感匹配:`(?i)` 使匹配忽略大小写
+
+```sql
+SELECT REGEXP('Hello World', 'hello') AS case_sensitive, REGEXP('Hello World', '(?i)hello') AS case_insensitive;
+```
+
+```text
++----------------+------------------+
+| case_sensitive | case_insensitive |
++----------------+------------------+
+| 0 | 1 |
++----------------+------------------+
+```
+
+`.` 默认匹配换行符;使用 `(?-s)` 后 `.` 不匹配换行符
+
+```sql
+SELECT REGEXP('foo\nbar', '^.+$') AS dot_match_nl, REGEXP('foo\nbar', '(?-s)^.+$') AS dot_not_match_nl;
+```
+
+```text
++--------------+------------------+
+| dot_match_nl | dot_not_match_nl |
++--------------+------------------+
+| 1 | 0 |
++--------------+------------------+
+```
+
+多行模式:`(?m)` 使 `^` 和 `$` 匹配每行行首/行尾
+
+```sql
+SELECT REGEXP('foo\nbar', '^bar') AS single_line, REGEXP('foo\nbar', '(?m)^bar') AS multi_line;
+```
+
+```text
++-------------+------------+
+| single_line | multi_line |
++-------------+------------+
+| 0 | 1 |
++-------------+------------+
+```
+
+贪婪与非贪婪:`(?U)` 使量词尽可能少地匹配
+
+```sql
+SELECT REGEXP_EXTRACT('aXbXc', '(a.*X)', 1) AS greedy,
+ REGEXP_EXTRACT('aXbXc', '(?U)(a.*X)', 1) AS non_greedy;
+```
+
+```text
++--------+------------+
+| greedy | non_greedy |
++--------+------------+
+| aXbX | aX |
++--------+------------+
+```
+```
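
The four modifiers added in this hunk can be sanity-checked outside Doris with a small Go sketch. It assumes the default engine follows RE2-style syntax, which Go's standard `regexp` package also implements; the hunk itself only names Boost.Regex as the alternative engine. Go leaves the `s` flag off by default, so the hypothetical helper `dorisLike` prepends `(?s)` to imitate the documented default. Nothing here calls Doris, and the helper names are illustrative only.

```go
package main

import (
	"fmt"
	"regexp"
)

// dorisLike is a hypothetical helper for this sketch only. It prepends (?s)
// so that `.` matches newline, imitating the default documented in the hunk;
// Go's regexp package leaves that flag off unless asked.
func dorisLike(pattern string) *regexp.Regexp {
	return regexp.MustCompile("(?s)" + pattern)
}

// matches reports whether pattern finds a match anywhere in text.
func matches(text, pattern string) bool {
	return dorisLike(pattern).MatchString(text)
}

// firstGroup returns capture group 1 of the first match, or "" if there is none.
func firstGroup(text, pattern string) string {
	m := dorisLike(pattern).FindStringSubmatch(text)
	if len(m) < 2 {
		return ""
	}
	return m[1]
}

func main() {
	// (?i): case-insensitive matching.
	fmt.Println(matches("Hello World", "hello"), matches("Hello World", "(?i)hello")) // false true
	// (?-s): `.` stops matching newline.
	fmt.Println(matches("foo\nbar", "^.+$"), matches("foo\nbar", "(?-s)^.+$")) // true false
	// (?m): ^ and $ also match at line boundaries.
	fmt.Println(matches("foo\nbar", "^bar"), matches("foo\nbar", "(?m)^bar")) // false true
	// (?U): quantifiers become non-greedy.
	fmt.Println(firstGroup("aXbXc", "(a.*X)"), firstGroup("aXbXc", "(?U)(a.*X)")) // aXbX aX
}
```

The results line up with the result tables in the hunk, which is consistent with the RE2 assumption but does not prove it for every pattern.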
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md
index 60b967ff5b7..a9b302024c9 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md
@@ -52,6 +52,31 @@ REGEXP_EXTRACT_ALL(<str>, <pattern>)
函数返回表示输入字符串中与指定正则表达式的第一个子模式匹配部分的字符串数组。返回类型为 String 值数组。如果未找到匹配项,或模式没有子模式,则返回空数组。
+**默认行为**:
+
+| 默认配置 | 行为说明 |
+| -------- | -------- |
+| `.` 匹配换行符 | `.` 默认可以匹配 `\n`(换行符)。 |
+| 大小写敏感 | 匹配时区分大小写。 |
+| `^`/`$` 匹配整个字符串边界 | `^` 仅匹配字符串开头,`$` 仅匹配字符串结尾,而非每行的行首/行尾。 |
+| 量词贪婪 | `*`、`+` 等量词默认尽可能多地匹配。 |
+| UTF-8 | 字符串按 UTF-8 处理。 |
+
+**模式修饰符**:
+
+可通过在 `pattern` 前缀写入 `(?flags)` 来覆盖默认行为。多个修饰符可组合,如 `(?im)`;`-` 前缀表示关闭对应选项,如 `(?-s)`。
+
+| 标志 | 含义 |
+| ---- | ---- |
+| `(?i)` | 大小写不敏感匹配 |
+| `(?-i)` | 大小写敏感(默认) |
+| `(?s)` | `.` 匹配换行符(默认已开启) |
+| `(?-s)` | `.` 不匹配换行符 |
+| `(?m)` | 多行模式:`^` 匹配每行行首,`$` 匹配每行行尾 |
+| `(?-m)` | 单行模式:`^`/`$` 匹配整个字符串首尾(默认) |
+| `(?U)` | 量词非贪婪:`*`、`+` 等尽可能少地匹配 |
+| `(?-U)` | 量词贪婪(默认):`*`、`+` 等尽可能多地匹配 |
+
## 例子
围绕 'C' 的小写字母基本匹配,在这个示例中,模式([[:lower:]]+)C([[:lower:]]+)匹配字符串中一个或多个小写字母后跟 'C' 再跟一个或多个小写字母的部分。'C' 之前的第一个子模式([[:lower:]]+)匹配 'b',因此结果为['b']。
@@ -202,4 +227,51 @@ SELECT regexp_extract_all('hello (world) 123', '([[:alpha:]+');
```text
ERROR 1105 (HY000): errCode = 2, detailMessage = (10.16.10.2)[INVALID_ARGUMENT]Could not compile regexp pattern: ([[:alpha:]+
Error: missing ]: [[:alpha:]+
+```
+
+模式修饰符
+
+大小写不敏感:`(?i)` 使匹配忽略大小写
+
+```sql
+SELECT REGEXP_EXTRACT_ALL('Hello hello HELLO', '(hello)') AS case_sensitive,
+ REGEXP_EXTRACT_ALL('Hello hello HELLO', '(?i)(hello)') AS case_insensitive;
+```
+
+```text
++----------------+---------------------------+
+| case_sensitive | case_insensitive |
++----------------+---------------------------+
+| ['hello'] | ['Hello','hello','HELLO'] |
++----------------+---------------------------+
+```
+
+多行模式:`(?m)` 使 `^` 和 `$` 匹配每行行首/行尾
+
+```sql
+SELECT REGEXP_EXTRACT_ALL('foo\nbar\nbaz', '^([a-z]+)') AS single_line,
+ REGEXP_EXTRACT_ALL('foo\nbar\nbaz', '(?m)^([a-z]+)') AS multi_line;
+```
+
+```text
++-------------+---------------------+
+| single_line | multi_line |
++-------------+---------------------+
+| ['foo'] | ['foo','bar','baz'] |
++-------------+---------------------+
+```
+
+贪婪与非贪婪:`(?U)` 使量词尽可能少地匹配
+
+```sql
+SELECT REGEXP_EXTRACT_ALL('aXbXcXd', '(a.*X)') AS greedy,
+ REGEXP_EXTRACT_ALL('aXbXcXd', '(?U)(a.*X)') AS non_greedy;
+```
+
+```text
++----------+------------+
+| greedy | non_greedy |
++----------+------------+
+| ['aXbXcX'] | ['aX'] |
++----------+------------+
```
\ No newline at end of file
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md
index fa1007b4f9f..63de0563752 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md
@@ -44,6 +44,31 @@ REGEXP_EXTRACT_OR_NULL(<str>, <pattern>, <pos>)
如果 `<pos>` < 0,则返回NULL;
如果 `pos` > 参数字符串`<str>`的长度,返回 NULL;
+**默认行为**:
+
+| 默认配置 | 行为说明 |
+| -------- | -------- |
+| `.` 匹配换行符 | `.` 默认可以匹配 `\n`(换行符)。 |
+| 大小写敏感 | 匹配时区分大小写。 |
+| `^`/`$` 匹配整个字符串边界 | `^` 仅匹配字符串开头,`$` 仅匹配字符串结尾,而非每行的行首/行尾。 |
+| 量词贪婪 | `*`、`+` 等量词默认尽可能多地匹配。 |
+| UTF-8 | 字符串按 UTF-8 处理。 |
+
+**模式修饰符**:
+
+可通过在 `pattern` 前缀写入 `(?flags)` 来覆盖默认行为。多个修饰符可组合,如 `(?im)`;`-` 前缀表示关闭对应选项,如 `(?-s)`。
+
+| 标志 | 含义 |
+| ---- | ---- |
+| `(?i)` | 大小写不敏感匹配 |
+| `(?-i)` | 大小写敏感(默认) |
+| `(?s)` | `.` 匹配换行符(默认已开启) |
+| `(?-s)` | `.` 不匹配换行符 |
+| `(?m)` | 多行模式:`^` 匹配每行行首,`$` 匹配每行行尾 |
+| `(?-m)` | 单行模式:`^`/`$` 匹配整个字符串首尾(默认) |
+| `(?U)` | 量词非贪婪:`*`、`+` 等尽可能少地匹配 |
+| `(?-U)` | 量词贪婪(默认):`*`、`+` 等尽可能多地匹配 |
+
## 例子
从匹配中提取特定组,正则表达式([[:lower:]]+)C([[:lower:]]+)查找由 'C' 分隔的一个或多个小写字母序列。索引为 1 的组对应第一个小写字母序列,因此返回 'b'。
@@ -228,4 +253,67 @@ mysql> SELECT REGEXP_EXTRACT_OR_NULL('123AbCdExCx', '([[:lower:]]+)C([[]ower:]]+
```text
ERROR 1105 (HY000): errCode = 2, detailMessage = (10.16.10.2)[INVALID_ARGUMENT]Could not compile regexp pattern: ([[:lower:]]+)C([[:lower:]+)
Error: missing ]: [[:lower:]+)
+```
+
+模式修饰符
+
+大小写不敏感:`(?i)` 使匹配忽略大小写
+
+```sql
+SELECT REGEXP_EXTRACT_OR_NULL('Hello World', '(hello)', 1) AS case_sensitive,
+ REGEXP_EXTRACT_OR_NULL('Hello World', '(?i)(hello)', 1) AS case_insensitive;
+```
+
+```text
++----------------+------------------+
+| case_sensitive | case_insensitive |
++----------------+------------------+
+| NULL | Hello |
++----------------+------------------+
+```
+
+`.` 默认匹配换行符;使用 `(?-s)` 后 `.` 不匹配换行符
+
+```sql
+SELECT REGEXP_EXTRACT_OR_NULL('foo\nbar', '^(.+)$', 1) AS dot_match_nl,
+ REGEXP_EXTRACT_OR_NULL('foo\nbar', '(?-s)^(.+)$', 1) AS dot_not_match_nl;
+```
+
+```text
++--------------+------------------+
+| dot_match_nl | dot_not_match_nl |
++--------------+------------------+
+| foo
+bar | NULL |
++--------------+------------------+
+```
+
+多行模式:`(?m)` 使 `^` 和 `$` 匹配每行行首/行尾
+
+```sql
+SELECT REGEXP_EXTRACT_OR_NULL('foo\nbar', '^(bar)', 1) AS single_line,
+ REGEXP_EXTRACT_OR_NULL('foo\nbar', '(?m)^(bar)', 1) AS multi_line;
+```
+
+```text
++-------------+------------+
+| single_line | multi_line |
++-------------+------------+
+| NULL | bar |
++-------------+------------+
+```
+
+贪婪与非贪婪:`(?U)` 使量词尽可能少地匹配
+
+```sql
+SELECT REGEXP_EXTRACT_OR_NULL('aXbXc', '(a.*X)', 1) AS greedy,
+ REGEXP_EXTRACT_OR_NULL('aXbXc', '(?U)(a.*X)', 1) AS non_greedy;
+```
+
+```text
++--------+------------+
+| greedy | non_greedy |
++--------+------------+
+| aXbX | aX |
++--------+------------+
```
\ No newline at end of file
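
The examples in this file differ from the REGEXP_EXTRACT ones only in the no-match case: NULL here, an empty string there. A minimal Go sketch of that distinction follows, using a `(string, bool)` pair to stand in for a nullable result. The function names are hypothetical, RE2-style syntax is assumed, and treating an out-of-range group index as "no result" is a simplification of this sketch rather than the documented rule.

```go
package main

import (
	"fmt"
	"regexp"
)

// extractOrNull returns capture group pos of the first match and ok=true.
// ok=false stands in for SQL NULL. Hypothetical helper, not the Doris code.
func extractOrNull(text, pattern string, pos int) (string, bool) {
	// (?s) imitates the documented default that `.` matches newline.
	re := regexp.MustCompile("(?s)" + pattern)
	m := re.FindStringSubmatch(text)
	if m == nil || pos < 0 || pos >= len(m) {
		return "", false
	}
	return m[pos], true
}

// extract collapses the no-match case to the empty string instead.
func extract(text, pattern string, pos int) string {
	s, _ := extractOrNull(text, pattern, pos)
	return s
}

func main() {
	v, ok := extractOrNull("foo\nbar", "^(bar)", 1)
	fmt.Printf("%q %v\n", v, ok) // "" false
	v, ok = extractOrNull("foo\nbar", "(?m)^(bar)", 1)
	fmt.Printf("%q %v\n", v, ok) // "bar" true
	fmt.Printf("%q\n", extract("Hello World", "(hello)", 1)) // ""
}
```

Keeping the boolean separate lets a caller tell "matched an empty group" apart from "did not match at all", which an empty string alone cannot express.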
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract.md
index 02f758981b3..3d33a63fa79 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract.md
@@ -56,6 +56,31 @@ REGEXP_EXTRACT(<str>, <pattern>, <pos>)
模式的匹配部分,类型为 Varchar。若未找到匹配项,将返回空字符串
+**默认行为**:
+
+| 默认配置 | 行为说明 |
+| -------- | -------- |
+| `.` 匹配换行符 | `.` 默认可以匹配 `\n`(换行符)。 |
+| 大小写敏感 | 匹配时区分大小写。 |
+| `^`/`$` 匹配整个字符串边界 | `^` 仅匹配字符串开头,`$` 仅匹配字符串结尾,而非每行的行首/行尾。 |
+| 量词贪婪 | `*`、`+` 等量词默认尽可能多地匹配。 |
+| UTF-8 | 字符串按 UTF-8 处理。 |
+
+**模式修饰符**:
+
+可通过在 `pattern` 前缀写入 `(?flags)` 来覆盖默认行为。多个修饰符可组合,如 `(?im)`;`-` 前缀表示关闭对应选项,如 `(?-s)`。
+
+| 标志 | 含义 |
+| ---- | ---- |
+| `(?i)` | 大小写不敏感匹配 |
+| `(?-i)` | 大小写敏感(默认) |
+| `(?s)` | `.` 匹配换行符(默认已开启) |
+| `(?-s)` | `.` 不匹配换行符 |
+| `(?m)` | 多行模式:`^` 匹配每行行首,`$` 匹配每行行尾 |
+| `(?-m)` | 单行模式:`^`/`$` 匹配整个字符串首尾(默认) |
+| `(?U)` | 量词非贪婪:`*`、`+` 等尽可能少地匹配 |
+| `(?-U)` | 量词贪婪(默认):`*`、`+` 等尽可能多地匹配 |
+
## 示例
提取第一个匹配部分,在此示例中,正则表达式([[:lower:]]+)C([[:lower:]]+)匹配字符串中一个或多个小写字母后跟 'C' 再跟一个或多个小写字母的部分。'C' 之前的第一个捕获组([[:lower:]]+)匹配 'b',因此结果为 'b'
@@ -202,4 +227,67 @@ SELECT regexp_extract('AbCdE', '([[:digit:]]+', 1);
```text
ERROR 1105 (HY000): errCode = 2, detailMessage = (10.16.10.2)[INVALID_ARGUMENT]Could not compile regexp pattern: ([[:digit:]]+
Error: missing ): ([[:digit:]]+
+```
+
+模式修饰符
+
+大小写不敏感:`(?i)` 使匹配忽略大小写
+
+```sql
+SELECT REGEXP_EXTRACT('Hello World', '(hello)', 1) AS case_sensitive,
+ REGEXP_EXTRACT('Hello World', '(?i)(hello)', 1) AS case_insensitive;
+```
+
+```text
++----------------+------------------+
+| case_sensitive | case_insensitive |
++----------------+------------------+
+| | Hello |
++----------------+------------------+
+```
+
+`.` 默认匹配换行符;使用 `(?-s)` 后 `.` 不匹配换行符
+
+```sql
+SELECT REGEXP_EXTRACT('foo\nbar', '^(.+)$', 1) AS dot_match_nl,
+ REGEXP_EXTRACT('foo\nbar', '(?-s)^(.+)$', 1) AS dot_not_match_nl;
+```
+
+```text
++--------------+------------------+
+| dot_match_nl | dot_not_match_nl |
++--------------+------------------+
+| foo
+bar | |
++--------------+------------------+
+```
+
+多行模式:`(?m)` 使 `^` 和 `$` 匹配每行行首/行尾
+
+```sql
+SELECT REGEXP_EXTRACT('foo\nbar', '^(bar)', 1) AS single_line,
+ REGEXP_EXTRACT('foo\nbar', '(?m)^(bar)', 1) AS multi_line;
+```
+
+```text
++-------------+------------+
+| single_line | multi_line |
++-------------+------------+
+| | bar |
++-------------+------------+
+```
+
+贪婪与非贪婪:`(?U)` 使量词尽可能少地匹配
+
+```sql
+SELECT REGEXP_EXTRACT('aXbXc', '(a.*X)', 1) AS greedy,
+ REGEXP_EXTRACT('aXbXc', '(?U)(a.*X)', 1) AS non_greedy;
+```
+
+```text
++--------+------------+
+| greedy | non_greedy |
++--------+------------+
+| aXbX | aX |
++--------+------------+
```
\ No newline at end of file
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp.md
index b85b09b36c8..f05381cf4e9 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp.md
@@ -32,6 +32,31 @@ REGEXP(<str>, <pattern>)
REGEXP 函数返回布尔值(BOOLEAN)。如果字符串 <str> 匹配正则表达式模式 <pattern>,函数返回 true(在 SQL 中表示为 1);如果不匹配,返回 false(在 SQL 中表示为 0)。
+**默认行为**:
+
+| 默认配置 | 行为说明 |
+| -------- | -------- |
+| `.` 匹配换行符 | `.` 默认可以匹配 `\n`(换行符)。 |
+| 大小写敏感 | 匹配时区分大小写。 |
+| `^`/`$` 匹配整个字符串边界 | `^` 仅匹配字符串开头,`$` 仅匹配字符串结尾,而非每行的行首/行尾。 |
+| 量词贪婪 | `*`、`+` 等量词默认尽可能多地匹配。 |
+| UTF-8 | 字符串按 UTF-8 处理。 |
+
+**模式修饰符**:
+
+可通过在 `pattern` 前缀写入 `(?flags)` 来覆盖默认行为。多个修饰符可组合,如 `(?im)`;`-` 前缀表示关闭对应选项,如 `(?-s)`。
+
+| 标志 | 含义 |
+| ---- | ---- |
+| `(?i)` | 大小写不敏感匹配 |
+| `(?-i)` | 大小写敏感(默认) |
+| `(?s)` | `.` 匹配换行符(默认已开启) |
+| `(?-s)` | `.` 不匹配换行符 |
+| `(?m)` | 多行模式:`^` 匹配每行行首,`$` 匹配每行行尾 |
+| `(?-m)` | 单行模式:`^`/`$` 匹配整个字符串首尾(默认) |
+| `(?U)` | 量词非贪婪:`*`、`+` 等尽可能少地匹配 |
+| `(?-U)` | 量词贪婪(默认):`*`、`+` 等尽可能多地匹配 |
+
## 例子
```sql
@@ -193,3 +218,62 @@ SELECT REGEXP('Hello, World!', '([a-z');
ERROR 1105 (HY000): errCode = 2, detailMessage = (10.16.10.2)[INTERNAL_ERROR]Invalid regex expression: ([a-z
```
+模式修饰符
+
+大小写不敏感匹配:`(?i)` 使匹配忽略大小写
+
+```sql
+SELECT REGEXP('Hello World', 'hello') AS case_sensitive, REGEXP('Hello World', '(?i)hello') AS case_insensitive;
+```
+
+```text
++----------------+------------------+
+| case_sensitive | case_insensitive |
++----------------+------------------+
+| 0 | 1 |
++----------------+------------------+
+```
+
+`.` 默认匹配换行符;使用 `(?-s)` 后 `.` 不匹配换行符
+
+```sql
+SELECT REGEXP('foo\nbar', '^.+$') AS dot_match_nl, REGEXP('foo\nbar', '(?-s)^.+$') AS dot_not_match_nl;
+```
+
+```text
++--------------+------------------+
+| dot_match_nl | dot_not_match_nl |
++--------------+------------------+
+| 1 | 0 |
++--------------+------------------+
+```
+
+多行模式:`(?m)` 使 `^` 和 `$` 匹配每行行首/行尾
+
+```sql
+SELECT REGEXP('foo\nbar', '^bar') AS single_line, REGEXP('foo\nbar', '(?m)^bar') AS multi_line;
+```
+
+```text
++-------------+------------+
+| single_line | multi_line |
++-------------+------------+
+| 0 | 1 |
++-------------+------------+
+```
+
+贪婪与非贪婪:`(?U)` 使量词尽可能少地匹配
+
+```sql
+SELECT REGEXP_EXTRACT('aXbXc', '(a.*X)', 1) AS greedy,
+ REGEXP_EXTRACT('aXbXc', '(?U)(a.*X)', 1) AS non_greedy;
+```
+
+```text
++--------+------------+
+| greedy | non_greedy |
++--------+------------+
+| aXbX | aX |
++--------+------------+
+```
+
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md
index 60b967ff5b7..a9b302024c9 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md
@@ -52,6 +52,31 @@ REGEXP_EXTRACT_ALL(<str>, <pattern>)
函数返回表示输入字符串中与指定正则表达式的第一个子模式匹配部分的字符串数组。返回类型为 String 值数组。如果未找到匹配项,或模式没有子模式,则返回空数组。
+**默认行为**:
+
+| 默认配置 | 行为说明 |
+| -------- | -------- |
+| `.` 匹配换行符 | `.` 默认可以匹配 `\n`(换行符)。 |
+| 大小写敏感 | 匹配时区分大小写。 |
+| `^`/`$` 匹配整个字符串边界 | `^` 仅匹配字符串开头,`$` 仅匹配字符串结尾,而非每行的行首/行尾。 |
+| 量词贪婪 | `*`、`+` 等量词默认尽可能多地匹配。 |
+| UTF-8 | 字符串按 UTF-8 处理。 |
+
+**模式修饰符**:
+
+可通过在 `pattern` 前缀写入 `(?flags)` 来覆盖默认行为。多个修饰符可组合,如 `(?im)`;`-` 前缀表示关闭对应选项,如 `(?-s)`。
+
+| 标志 | 含义 |
+| ---- | ---- |
+| `(?i)` | 大小写不敏感匹配 |
+| `(?-i)` | 大小写敏感(默认) |
+| `(?s)` | `.` 匹配换行符(默认已开启) |
+| `(?-s)` | `.` 不匹配换行符 |
+| `(?m)` | 多行模式:`^` 匹配每行行首,`$` 匹配每行行尾 |
+| `(?-m)` | 单行模式:`^`/`$` 匹配整个字符串首尾(默认) |
+| `(?U)` | 量词非贪婪:`*`、`+` 等尽可能少地匹配 |
+| `(?-U)` | 量词贪婪(默认):`*`、`+` 等尽可能多地匹配 |
+
## 例子
围绕 'C' 的小写字母基本匹配,在这个示例中,模式([[:lower:]]+)C([[:lower:]]+)匹配字符串中一个或多个小写字母后跟 'C' 再跟一个或多个小写字母的部分。'C' 之前的第一个子模式([[:lower:]]+)匹配 'b',因此结果为['b']。
@@ -202,4 +227,51 @@ SELECT regexp_extract_all('hello (world) 123', '([[:alpha:]+');
```text
ERROR 1105 (HY000): errCode = 2, detailMessage = (10.16.10.2)[INVALID_ARGUMENT]Could not compile regexp pattern: ([[:alpha:]+
Error: missing ]: [[:alpha:]+
+```
+
+模式修饰符
+
+大小写不敏感:`(?i)` 使匹配忽略大小写
+
+```sql
+SELECT REGEXP_EXTRACT_ALL('Hello hello HELLO', '(hello)') AS case_sensitive,
+ REGEXP_EXTRACT_ALL('Hello hello HELLO', '(?i)(hello)') AS case_insensitive;
+```
+
+```text
++----------------+---------------------------+
+| case_sensitive | case_insensitive |
++----------------+---------------------------+
+| ['hello'] | ['Hello','hello','HELLO'] |
++----------------+---------------------------+
+```
+
+多行模式:`(?m)` 使 `^` 和 `$` 匹配每行行首/行尾
+
+```sql
+SELECT REGEXP_EXTRACT_ALL('foo\nbar\nbaz', '^([a-z]+)') AS single_line,
+ REGEXP_EXTRACT_ALL('foo\nbar\nbaz', '(?m)^([a-z]+)') AS multi_line;
+```
+
+```text
++-------------+---------------------+
+| single_line | multi_line |
++-------------+---------------------+
+| ['foo'] | ['foo','bar','baz'] |
++-------------+---------------------+
+```
+
+贪婪与非贪婪:`(?U)` 使量词尽可能少地匹配
+
+```sql
+SELECT REGEXP_EXTRACT_ALL('aXbXcXd', '(a.*X)') AS greedy,
+ REGEXP_EXTRACT_ALL('aXbXcXd', '(?U)(a.*X)') AS non_greedy;
+```
+
+```text
++----------+------------+
+| greedy | non_greedy |
++----------+------------+
+| ['aXbXcX'] | ['aX'] |
++----------+------------+
```
\ No newline at end of file
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md
index fa1007b4f9f..63de0563752 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md
@@ -44,6 +44,31 @@ REGEXP_EXTRACT_OR_NULL(<str>, <pattern>, <pos>)
如果 `<pos>` < 0,则返回NULL;
如果 `pos` > 参数字符串`<str>`的长度,返回 NULL;
+**默认行为**:
+
+| 默认配置 | 行为说明 |
+| -------- | -------- |
+| `.` 匹配换行符 | `.` 默认可以匹配 `\n`(换行符)。 |
+| 大小写敏感 | 匹配时区分大小写。 |
+| `^`/`$` 匹配整个字符串边界 | `^` 仅匹配字符串开头,`$` 仅匹配字符串结尾,而非每行的行首/行尾。 |
+| 量词贪婪 | `*`、`+` 等量词默认尽可能多地匹配。 |
+| UTF-8 | 字符串按 UTF-8 处理。 |
+
+**模式修饰符**:
+
+可通过在 `pattern` 前缀写入 `(?flags)` 来覆盖默认行为。多个修饰符可组合,如 `(?im)`;`-` 前缀表示关闭对应选项,如 `(?-s)`。
+
+| 标志 | 含义 |
+| ---- | ---- |
+| `(?i)` | 大小写不敏感匹配 |
+| `(?-i)` | 大小写敏感(默认) |
+| `(?s)` | `.` 匹配换行符(默认已开启) |
+| `(?-s)` | `.` 不匹配换行符 |
+| `(?m)` | 多行模式:`^` 匹配每行行首,`$` 匹配每行行尾 |
+| `(?-m)` | 单行模式:`^`/`$` 匹配整个字符串首尾(默认) |
+| `(?U)` | 量词非贪婪:`*`、`+` 等尽可能少地匹配 |
+| `(?-U)` | 量词贪婪(默认):`*`、`+` 等尽可能多地匹配 |
+
## 例子
从匹配中提取特定组,正则表达式([[:lower:]]+)C([[:lower:]]+)查找由 'C' 分隔的一个或多个小写字母序列。索引为 1 的组对应第一个小写字母序列,因此返回 'b'。
@@ -228,4 +253,67 @@ mysql> SELECT REGEXP_EXTRACT_OR_NULL('123AbCdExCx', '([[:lower:]]+)C([[]ower:]]+
```text
ERROR 1105 (HY000): errCode = 2, detailMessage = (10.16.10.2)[INVALID_ARGUMENT]Could not compile regexp pattern: ([[:lower:]]+)C([[:lower:]+)
Error: missing ]: [[:lower:]+)
+```
+
+模式修饰符
+
+大小写不敏感:`(?i)` 使匹配忽略大小写
+
+```sql
+SELECT REGEXP_EXTRACT_OR_NULL('Hello World', '(hello)', 1) AS case_sensitive,
+ REGEXP_EXTRACT_OR_NULL('Hello World', '(?i)(hello)', 1) AS case_insensitive;
+```
+
+```text
++----------------+------------------+
+| case_sensitive | case_insensitive |
++----------------+------------------+
+| NULL | Hello |
++----------------+------------------+
+```
+
+`.` 默认匹配换行符;使用 `(?-s)` 后 `.` 不匹配换行符
+
+```sql
+SELECT REGEXP_EXTRACT_OR_NULL('foo\nbar', '^(.+)$', 1) AS dot_match_nl,
+ REGEXP_EXTRACT_OR_NULL('foo\nbar', '(?-s)^(.+)$', 1) AS dot_not_match_nl;
+```
+
+```text
++--------------+------------------+
+| dot_match_nl | dot_not_match_nl |
++--------------+------------------+
+| foo
+bar | NULL |
++--------------+------------------+
+```
+
+多行模式:`(?m)` 使 `^` 和 `$` 匹配每行行首/行尾
+
+```sql
+SELECT REGEXP_EXTRACT_OR_NULL('foo\nbar', '^(bar)', 1) AS single_line,
+ REGEXP_EXTRACT_OR_NULL('foo\nbar', '(?m)^(bar)', 1) AS multi_line;
+```
+
+```text
++-------------+------------+
+| single_line | multi_line |
++-------------+------------+
+| NULL | bar |
++-------------+------------+
+```
+
+贪婪与非贪婪:`(?U)` 使量词尽可能少地匹配
+
+```sql
+SELECT REGEXP_EXTRACT_OR_NULL('aXbXc', '(a.*X)', 1) AS greedy,
+ REGEXP_EXTRACT_OR_NULL('aXbXc', '(?U)(a.*X)', 1) AS non_greedy;
+```
+
+```text
++--------+------------+
+| greedy | non_greedy |
++--------+------------+
+| aXbX | aX |
++--------+------------+
```
\ No newline at end of file
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract.md
index 02f758981b3..3d33a63fa79 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract.md
@@ -56,6 +56,31 @@ REGEXP_EXTRACT(<str>, <pattern>, <pos>)
模式的匹配部分,类型为 Varchar。若未找到匹配项,将返回空字符串
+**默认行为**:
+
+| 默认配置 | 行为说明 |
+| -------- | -------- |
+| `.` 匹配换行符 | `.` 默认可以匹配 `\n`(换行符)。 |
+| 大小写敏感 | 匹配时区分大小写。 |
+| `^`/`$` 匹配整个字符串边界 | `^` 仅匹配字符串开头,`$` 仅匹配字符串结尾,而非每行的行首/行尾。 |
+| 量词贪婪 | `*`、`+` 等量词默认尽可能多地匹配。 |
+| UTF-8 | 字符串按 UTF-8 处理。 |
+
+**模式修饰符**:
+
+可通过在 `pattern` 前缀写入 `(?flags)` 来覆盖默认行为。多个修饰符可组合,如 `(?im)`;`-` 前缀表示关闭对应选项,如 `(?-s)`。
+
+| 标志 | 含义 |
+| ---- | ---- |
+| `(?i)` | 大小写不敏感匹配 |
+| `(?-i)` | 大小写敏感(默认) |
+| `(?s)` | `.` 匹配换行符(默认已开启) |
+| `(?-s)` | `.` 不匹配换行符 |
+| `(?m)` | 多行模式:`^` 匹配每行行首,`$` 匹配每行行尾 |
+| `(?-m)` | 单行模式:`^`/`$` 匹配整个字符串首尾(默认) |
+| `(?U)` | 量词非贪婪:`*`、`+` 等尽可能少地匹配 |
+| `(?-U)` | 量词贪婪(默认):`*`、`+` 等尽可能多地匹配 |
+
## 示例
提取第一个匹配部分,在此示例中,正则表达式([[:lower:]]+)C([[:lower:]]+)匹配字符串中一个或多个小写字母后跟 'C' 再跟一个或多个小写字母的部分。'C' 之前的第一个捕获组([[:lower:]]+)匹配 'b',因此结果为 'b'
@@ -202,4 +227,67 @@ SELECT regexp_extract('AbCdE', '([[:digit:]]+', 1);
```text
ERROR 1105 (HY000): errCode = 2, detailMessage = (10.16.10.2)[INVALID_ARGUMENT]Could not compile regexp pattern: ([[:digit:]]+
Error: missing ): ([[:digit:]]+
+```
+
+模式修饰符
+
+大小写不敏感:`(?i)` 使匹配忽略大小写
+
+```sql
+SELECT REGEXP_EXTRACT('Hello World', '(hello)', 1) AS case_sensitive,
+ REGEXP_EXTRACT('Hello World', '(?i)(hello)', 1) AS case_insensitive;
+```
+
+```text
++----------------+------------------+
+| case_sensitive | case_insensitive |
++----------------+------------------+
+| | Hello |
++----------------+------------------+
+```
+
+`.` 默认匹配换行符;使用 `(?-s)` 后 `.` 不匹配换行符
+
+```sql
+SELECT REGEXP_EXTRACT('foo\nbar', '^(.+)$', 1) AS dot_match_nl,
+ REGEXP_EXTRACT('foo\nbar', '(?-s)^(.+)$', 1) AS dot_not_match_nl;
+```
+
+```text
++--------------+------------------+
+| dot_match_nl | dot_not_match_nl |
++--------------+------------------+
+| foo
+bar | |
++--------------+------------------+
+```
+
+多行模式:`(?m)` 使 `^` 和 `$` 匹配每行行首/行尾
+
+```sql
+SELECT REGEXP_EXTRACT('foo\nbar', '^(bar)', 1) AS single_line,
+ REGEXP_EXTRACT('foo\nbar', '(?m)^(bar)', 1) AS multi_line;
+```
+
+```text
++-------------+------------+
+| single_line | multi_line |
++-------------+------------+
+| | bar |
++-------------+------------+
+```
+
+贪婪与非贪婪:`(?U)` 使量词尽可能少地匹配
+
+```sql
+SELECT REGEXP_EXTRACT('aXbXc', '(a.*X)', 1) AS greedy,
+ REGEXP_EXTRACT('aXbXc', '(?U)(a.*X)', 1) AS non_greedy;
+```
+
+```text
++--------+------------+
+| greedy | non_greedy |
++--------+------------+
+| aXbX | aX |
++--------+------------+
```
\ No newline at end of file
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp.md
index b85b09b36c8..f05381cf4e9 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp.md
@@ -32,6 +32,31 @@ REGEXP(<str>, <pattern>)
REGEXP 函数返回布尔值(BOOLEAN)。如果字符串 <str> 匹配正则表达式模式 <pattern>,函数返回 true(在 SQL 中表示为 1);如果不匹配,返回 false(在 SQL 中表示为 0)。
+**默认行为**:
+
+| 默认配置 | 行为说明 |
+| -------- | -------- |
+| `.` 匹配换行符 | `.` 默认可以匹配 `\n`(换行符)。 |
+| 大小写敏感 | 匹配时区分大小写。 |
+| `^`/`$` 匹配整个字符串边界 | `^` 仅匹配字符串开头,`$` 仅匹配字符串结尾,而非每行的行首/行尾。 |
+| 量词贪婪 | `*`、`+` 等量词默认尽可能多地匹配。 |
+| UTF-8 | 字符串按 UTF-8 处理。 |
+
+**模式修饰符**:
+
+可通过在 `pattern` 前缀写入 `(?flags)` 来覆盖默认行为。多个修饰符可组合,如 `(?im)`;`-` 前缀表示关闭对应选项,如 `(?-s)`。
+
+| 标志 | 含义 |
+| ---- | ---- |
+| `(?i)` | 大小写不敏感匹配 |
+| `(?-i)` | 大小写敏感(默认) |
+| `(?s)` | `.` 匹配换行符(默认已开启) |
+| `(?-s)` | `.` 不匹配换行符 |
+| `(?m)` | 多行模式:`^` 匹配每行行首,`$` 匹配每行行尾 |
+| `(?-m)` | 单行模式:`^`/`$` 匹配整个字符串首尾(默认) |
+| `(?U)` | 量词非贪婪:`*`、`+` 等尽可能少地匹配 |
+| `(?-U)` | 量词贪婪(默认):`*`、`+` 等尽可能多地匹配 |
+
## 例子
```sql
@@ -193,3 +218,62 @@ SELECT REGEXP('Hello, World!', '([a-z');
ERROR 1105 (HY000): errCode = 2, detailMessage = (10.16.10.2)[INTERNAL_ERROR]Invalid regex expression: ([a-z
```
+模式修饰符
+
+大小写不敏感匹配:`(?i)` 使匹配忽略大小写
+
+```sql
+SELECT REGEXP('Hello World', 'hello') AS case_sensitive, REGEXP('Hello World', '(?i)hello') AS case_insensitive;
+```
+
+```text
++----------------+------------------+
+| case_sensitive | case_insensitive |
++----------------+------------------+
+| 0 | 1 |
++----------------+------------------+
+```
+
+`.` 默认匹配换行符;使用 `(?-s)` 后 `.` 不匹配换行符
+
+```sql
+SELECT REGEXP('foo\nbar', '^.+$') AS dot_match_nl, REGEXP('foo\nbar', '(?-s)^.+$') AS dot_not_match_nl;
+```
+
+```text
++--------------+------------------+
+| dot_match_nl | dot_not_match_nl |
++--------------+------------------+
+| 1 | 0 |
++--------------+------------------+
+```
+
+多行模式:`(?m)` 使 `^` 和 `$` 匹配每行行首/行尾
+
+```sql
+SELECT REGEXP('foo\nbar', '^bar') AS single_line, REGEXP('foo\nbar', '(?m)^bar') AS multi_line;
+```
+
+```text
++-------------+------------+
+| single_line | multi_line |
++-------------+------------+
+| 0 | 1 |
++-------------+------------+
+```
+
+贪婪与非贪婪:`(?U)` 使量词尽可能少地匹配
+
+```sql
+SELECT REGEXP_EXTRACT('aXbXc', '(a.*X)', 1) AS greedy,
+ REGEXP_EXTRACT('aXbXc', '(?U)(a.*X)', 1) AS non_greedy;
+```
+
+```text
++--------+------------+
+| greedy | non_greedy |
++--------+------------+
+| aXbX | aX |
++--------+------------+
+```
+
diff --git a/versioned_docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md b/versioned_docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md
index 204e3756e05..b14d3f0dee0 100644
--- a/versioned_docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md
+++ b/versioned_docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md
@@ -33,6 +33,31 @@ REGEXP_EXTRACT_ALL(<str>, <pattern>)
The function returns an array of strings that represent the parts of the input string that match the first sub - pattern of the specified regular expression. The return type is an array of String values. If no matches are found, or if the pattern has no sub - patterns, an empty array is returned.
+**Default Behavior**:
+
+| Default Setting | Behavior |
+| ------------------------------------ | ----------------------------------------------------------------------------------------- |
+| `.` matches newline | `.` can match `\n` (newline) by default. |
+| Case-sensitive | Matching is case-sensitive. |
+| `^`/`$` match full string boundaries | `^` matches only the start of the string, `$` matches only the end, not line starts/ends. |
+| Greedy quantifiers | `*`, `+`, etc. match as much as possible by default. |
+| UTF-8 | Strings are processed as UTF-8. |
+
+**Pattern Modifiers**:
+
+You can override the default behavior by prefixing the `pattern` with `(?flags)`. Multiple modifiers can be combined, e.g., `(?im)`; a `-` prefix disables the corresponding option, e.g., `(?-s)`.
+
+| Flag | Meaning |
+| ------- | ---------------------------------------------------------------------------- |
+| `(?i)` | Case-insensitive matching |
+| `(?-i)` | Case-sensitive (default) |
+| `(?s)` | `.` matches newline (enabled by default) |
+| `(?-s)` | `.` does **not** match newline |
+| `(?m)` | Multiline mode: `^` matches start of each line, `$` matches end of each line |
+| `(?-m)` | Single-line mode: `^`/`$` match full string boundaries (default) |
+| `(?U)` | Non-greedy quantifiers: `*`, `+`, etc. match as little as possible |
+| `(?-U)` | Greedy quantifiers (default): `*`, `+`, etc. match as much as possible |
+
## Example
Basic matching of lowercase letters around 'C'.In this example, the pattern ([[:lower:]]+)C([[:lower:]]+) matches the part of the string where one or more lowercase letters are followed by 'C' and then one or more lowercase letters. The first sub - pattern ([[:lower:]]+) before 'C' matches 'b', so the result is ['b'].
@@ -184,4 +209,51 @@ SELECT regexp_extract_all('hello (world) 123', '([[:alpha:]+');
```text
ERROR 1105 (HY000): errCode = 2, detailMessage = (10.16.10.2)[INVALID_ARGUMENT]Could not compile regexp pattern: ([[:alpha:]+
Error: missing ]: [[:alpha:]+
+```
+
+Pattern Modifiers
+
+Case-insensitive: `(?i)` makes the match ignore case
+
+```sql
+SELECT REGEXP_EXTRACT_ALL('Hello hello HELLO', '(hello)') AS case_sensitive,
+ REGEXP_EXTRACT_ALL('Hello hello HELLO', '(?i)(hello)') AS case_insensitive;
+```
+
+```text
++----------------+---------------------------+
+| case_sensitive | case_insensitive |
++----------------+---------------------------+
+| ['hello'] | ['Hello','hello','HELLO'] |
++----------------+---------------------------+
+```
+
+Multiline mode: `(?m)` makes `^` and `$` match start/end of each line
+
+```sql
+SELECT REGEXP_EXTRACT_ALL('foo\nbar\nbaz', '^([a-z]+)') AS single_line,
+ REGEXP_EXTRACT_ALL('foo\nbar\nbaz', '(?m)^([a-z]+)') AS multi_line;
+```
+
+```text
++-------------+---------------------+
+| single_line | multi_line |
++-------------+---------------------+
+| ['foo'] | ['foo','bar','baz'] |
++-------------+---------------------+
+```
+
+Greedy vs non-greedy: `(?U)` makes quantifiers match as little as possible
+
+```sql
+SELECT REGEXP_EXTRACT_ALL('aXbXcXd', '(a.*X)') AS greedy,
+ REGEXP_EXTRACT_ALL('aXbXcXd', '(?U)(a.*X)') AS non_greedy;
+```
+
+```text
++----------+------------+
+| greedy | non_greedy |
++----------+------------+
+| ['aXbXcX'] | ['aX'] |
++----------+------------+
```
\ No newline at end of file
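
The added text says modifiers can be combined, e.g. `(?im)`, but none of the examples in this file exercises a combination. A short Go sketch of what combining buys, again assuming RE2-style flag syntax (which Go's `regexp` shares); `matchWith` is a hypothetical helper and the input string is made up for illustration:

```go
package main

import (
	"fmt"
	"regexp"
)

// matchWith compiles flags+pattern and reports whether text contains a match.
// Hypothetical helper for this sketch; it does not call Doris.
func matchWith(flags, pattern, text string) bool {
	return regexp.MustCompile(flags + pattern).MatchString(text)
}

func main() {
	text := "foo\nBAR"
	// Case-insensitive only: ^ still anchors at the start of the string.
	fmt.Println(matchWith("(?i)", "^bar$", text)) // false
	// Multiline only: the second line is found, but "BAR" != "bar".
	fmt.Println(matchWith("(?m)", "^bar$", text)) // false
	// Both flags together: the second line matches regardless of case.
	fmt.Println(matchWith("(?im)", "^bar$", text)) // true
}
```

Neither flag alone is enough for this input, which is the case where a combined group such as `(?im)` is needed.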
diff --git a/versioned_docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md b/versioned_docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md
index 56ed2750f19..a439a0d3057 100644
--- a/versioned_docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md
+++ b/versioned_docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md
@@ -43,6 +43,31 @@ Return a string type, with the result being the part that matches `<pattern>`.
If the `<pos>` < 0,return NULL;
If the `pos` > the length of `<str>`,return NULL;
+**Default Behavior**:
+
+| Default Setting | Behavior |
+| ------------------------------------ | ----------------------------------------------------------------------------------------- |
+| `.` matches newline | `.` can match `\n` (newline) by default. |
+| Case-sensitive | Matching is case-sensitive. |
+| `^`/`$` match full string boundaries | `^` matches only the start of the string, `$` matches only the end, not line starts/ends. |
+| Greedy quantifiers | `*`, `+`, etc. match as much as possible by default. |
+| UTF-8 | Strings are processed as UTF-8. |
+
+**Pattern Modifiers**:
+
+You can override the default behavior by prefixing the `pattern` with `(?flags)`. Multiple modifiers can be combined, e.g., `(?im)`; a `-` prefix disables the corresponding option, e.g., `(?-s)`.
+
+| Flag | Meaning |
+| ------- | ---------------------------------------------------------------------------- |
+| `(?i)` | Case-insensitive matching |
+| `(?-i)` | Case-sensitive (default) |
+| `(?s)` | `.` matches newline (enabled by default) |
+| `(?-s)` | `.` does **not** match newline |
+| `(?m)` | Multiline mode: `^` matches start of each line, `$` matches end of each line |
+| `(?-m)` | Single-line mode: `^`/`$` match full string boundaries (default) |
+| `(?U)` | Non-greedy quantifiers: `*`, `+`, etc. match as little as possible |
+| `(?-U)` | Greedy quantifiers (default): `*`, `+`, etc. match as much as possible |
+
## Example
Extracting a specific group from a match. Explanation: The regular expression ([[:lower:]]+)C([[:lower:]]+) looks for sequences of one or more lowercase letters separated by 'C'. The group with index 1 corresponds to the first sequence of lowercase letters, so 'b' is returned.
@@ -227,4 +252,67 @@ mysql> SELECT REGEXP_EXTRACT_OR_NULL('123AbCdExCx', '([[:lower:]]+)C([[]ower:]]+
```text
ERROR 1105 (HY000): errCode = 2, detailMessage = (10.16.10.2)[INVALID_ARGUMENT]Could not compile regexp pattern: ([[:lower:]]+)C([[:lower:]+)
Error: missing ]: [[:lower:]+)
+```
+
+Pattern Modifiers
+
+Case-insensitive: `(?i)` makes the match ignore case
+
+```sql
+SELECT REGEXP_EXTRACT_OR_NULL('Hello World', '(hello)', 1) AS case_sensitive,
+ REGEXP_EXTRACT_OR_NULL('Hello World', '(?i)(hello)', 1) AS case_insensitive;
+```
+
+```text
++----------------+------------------+
+| case_sensitive | case_insensitive |
++----------------+------------------+
+| NULL           | Hello            |
++----------------+------------------+
+```
+
+`.` matches newline by default; with `(?-s)`, `.` does not match newline
+
+```sql
+SELECT REGEXP_EXTRACT_OR_NULL('foo\nbar', '^(.+)$', 1) AS dot_match_nl,
+ REGEXP_EXTRACT_OR_NULL('foo\nbar', '(?-s)^(.+)$', 1) AS dot_not_match_nl;
+```
+
+```text
++--------------+------------------+
+| dot_match_nl | dot_not_match_nl |
++--------------+------------------+
+| foo
+bar | NULL |
++--------------+------------------+
+```
+
+Multiline mode: `(?m)` makes `^` and `$` match start/end of each line
+
+```sql
+SELECT REGEXP_EXTRACT_OR_NULL('foo\nbar', '^(bar)', 1) AS single_line,
+ REGEXP_EXTRACT_OR_NULL('foo\nbar', '(?m)^(bar)', 1) AS multi_line;
+```
+
+```text
++-------------+------------+
+| single_line | multi_line |
++-------------+------------+
+| NULL        | bar        |
++-------------+------------+
+```
+
+Greedy vs non-greedy: `(?U)` makes quantifiers match as little as possible
+
+```sql
+SELECT REGEXP_EXTRACT_OR_NULL('aXbXc', '(a.*X)', 1) AS greedy,
+ REGEXP_EXTRACT_OR_NULL('aXbXc', '(?U)(a.*X)', 1) AS non_greedy;
+```
+
+```text
++--------+------------+
+| greedy | non_greedy |
++--------+------------+
+| aXbX   | aX         |
++--------+------------+
```
\ No newline at end of file
diff --git a/versioned_docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract.md b/versioned_docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract.md
index bc438f31ee3..88ba1eeeaf6 100644
--- a/versioned_docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract.md
+++ b/versioned_docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract.md
@@ -36,6 +36,31 @@ REGEXP_EXTRACT(<str>, <pattern>, <pos>)
The matching part of the pattern. It is of Varchar type. If no match is found,
an empty string will be returned.
+**Default Behavior**:
+
+| Default Setting | Behavior |
+| ------------------------------------ | ----------------------------------------------------------------------------------------- |
+| `.` matches newline | `.` can match `\n` (newline) by default. |
+| Case-sensitive | Matching is case-sensitive. |
+| `^`/`$` match full string boundaries | `^` matches only the start of the string, `$` matches only the end, not line starts/ends. |
+| Greedy quantifiers | `*`, `+`, etc. match as much as possible by default. |
+| UTF-8 | Strings are processed as UTF-8. |
+
+**Pattern Modifiers**:
+
+You can override the default behavior by prefixing the `pattern` with `(?flags)`. Multiple modifiers can be combined, e.g., `(?im)`; a `-` prefix disables the corresponding option, e.g., `(?-s)`.
+
+| Flag | Meaning |
+| ------- | ---------------------------------------------------------------------------- |
+| `(?i)` | Case-insensitive matching |
+| `(?-i)` | Case-sensitive (default) |
+| `(?s)` | `.` matches newline (enabled by default) |
+| `(?-s)` | `.` does **not** match newline |
+| `(?m)` | Multiline mode: `^` matches start of each line, `$` matches end of each line |
+| `(?-m)` | Single-line mode: `^`/`$` match full string boundaries (default) |
+| `(?U)` | Non-greedy quantifiers: `*`, `+`, etc. match as little as possible |
+| `(?-U)` | Greedy quantifiers (default): `*`, `+`, etc. match as much as possible |
+
## Example
Extract the first matching part.In this example, the regular expression
([[:lower:]]+)C([[:lower:]]+) matches the part of the string where one or more
lowercase letters are followed by 'C' and then one or more lowercase letters.
The first capturing group ([[:lower:]]+) before 'C' matches 'b', so the result
is 'b'.
@@ -180,4 +205,67 @@ SELECT regexp_extract('AbCdE', '([[:digit:]]+', 1);
```text
ERROR 1105 (HY000): errCode = 2, detailMessage =
(10.16.10.2)[INVALID_ARGUMENT]Could not compile regexp pattern: ([[:digit:]]+
Error: missing ): ([[:digit:]]+
+```
+
+Pattern Modifiers
+
+Case-insensitive: `(?i)` makes the match ignore case
+
+```sql
+SELECT REGEXP_EXTRACT('Hello World', '(hello)', 1) AS case_sensitive,
+ REGEXP_EXTRACT('Hello World', '(?i)(hello)', 1) AS case_insensitive;
+```
+
+```text
++----------------+------------------+
+| case_sensitive | case_insensitive |
++----------------+------------------+
+|                | Hello            |
++----------------+------------------+
+```
+
+`.` matches newline by default; with `(?-s)`, `.` does not match newline
+
+```sql
+SELECT REGEXP_EXTRACT('foo\nbar', '^(.+)$', 1) AS dot_match_nl,
+ REGEXP_EXTRACT('foo\nbar', '(?-s)^(.+)$', 1) AS dot_not_match_nl;
+```
+
+```text
++--------------+------------------+
+| dot_match_nl | dot_not_match_nl |
++--------------+------------------+
+| foo
+bar | |
++--------------+------------------+
+```
+
+Multiline mode: `(?m)` makes `^` and `$` match start/end of each line
+
+```sql
+SELECT REGEXP_EXTRACT('foo\nbar', '^(bar)', 1) AS single_line,
+ REGEXP_EXTRACT('foo\nbar', '(?m)^(bar)', 1) AS multi_line;
+```
+
+```text
++-------------+------------+
+| single_line | multi_line |
++-------------+------------+
+|             | bar        |
++-------------+------------+
+```
+
+Greedy vs non-greedy: `(?U)` makes quantifiers match as little as possible
+
+```sql
+SELECT REGEXP_EXTRACT('aXbXc', '(a.*X)', 1) AS greedy,
+ REGEXP_EXTRACT('aXbXc', '(?U)(a.*X)', 1) AS non_greedy;
+```
+
+```text
++--------+------------+
+| greedy | non_greedy |
++--------+------------+
+| aXbX   | aX         |
++--------+------------+
```
\ No newline at end of file
diff --git a/versioned_docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp.md b/versioned_docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp.md
index 31e3bed803c..23ad8741155 100644
--- a/versioned_docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp.md
+++ b/versioned_docs/version-3.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp.md
@@ -32,6 +32,31 @@ REGEXP(<str>, <pattern>)
The REGEXP function returns a BOOLEAN value. If the string <str> matches the
regular expression pattern <pattern>, the function returns true (represented as
1 in SQL); if not, it returns false (represented as 0 in SQL).
+**Default Behavior**:
+
+| Default Setting | Behavior |
+| ------------------------------------ | ----------------------------------------------------------------------------------------- |
+| `.` matches newline | `.` can match `\n` (newline) by default. |
+| Case-sensitive | Matching is case-sensitive. |
+| `^`/`$` match full string boundaries | `^` matches only the start of the string, `$` matches only the end, not line starts/ends. |
+| Greedy quantifiers | `*`, `+`, etc. match as much as possible by default. |
+| UTF-8 | Strings are processed as UTF-8. |
+
+**Pattern Modifiers**:
+
+You can override the default behavior by prefixing the `pattern` with `(?flags)`. Multiple modifiers can be combined, e.g., `(?im)`; a `-` prefix disables the corresponding option, e.g., `(?-s)`.
+
+| Flag | Meaning |
+| ------- | ---------------------------------------------------------------------------- |
+| `(?i)` | Case-insensitive matching |
+| `(?-i)` | Case-sensitive (default) |
+| `(?s)` | `.` matches newline (enabled by default) |
+| `(?-s)` | `.` does **not** match newline |
+| `(?m)` | Multiline mode: `^` matches start of each line, `$` matches end of each line |
+| `(?-m)` | Single-line mode: `^`/`$` match full string boundaries (default) |
+| `(?U)` | Non-greedy quantifiers: `*`, `+`, etc. match as little as possible |
+| `(?-U)` | Greedy quantifiers (default): `*`, `+`, etc. match as much as possible |
+
## Examples
```sql
@@ -192,4 +217,63 @@ SELECT REGEXP('Hello, World!', '([a-z');
```text
ERROR 1105 (HY000): errCode = 2, detailMessage =
(10.16.10.2)[INTERNAL_ERROR]Invalid regex expression: ([a-z
+```
+
+Pattern Modifiers
+
+Case-insensitive matching: `(?i)` makes the match ignore case
+
+```sql
+SELECT REGEXP('Hello World', 'hello') AS case_sensitive, REGEXP('Hello World', '(?i)hello') AS case_insensitive;
+```
+
+```text
++----------------+------------------+
+| case_sensitive | case_insensitive |
++----------------+------------------+
+|              0 |                1 |
++----------------+------------------+
+```
+
+`.` matches newline by default; with `(?-s)`, `.` does not match newline
+
+```sql
+SELECT REGEXP('foo\nbar', '^.+$') AS dot_match_nl, REGEXP('foo\nbar', '(?-s)^.+$') AS dot_not_match_nl;
+```
+
+```text
++--------------+------------------+
+| dot_match_nl | dot_not_match_nl |
++--------------+------------------+
+|            1 |                0 |
++--------------+------------------+
+```
+
+Multiline mode: `(?m)` makes `^` and `$` match start/end of each line
+
+```sql
+SELECT REGEXP('foo\nbar', '^bar') AS single_line, REGEXP('foo\nbar', '(?m)^bar') AS multi_line;
+```
+
+```text
++-------------+------------+
+| single_line | multi_line |
++-------------+------------+
+|           0 |          1 |
++-------------+------------+
+```
+
+Greedy vs non-greedy: `(?U)` makes quantifiers match as little as possible
+
+```sql
+SELECT REGEXP_EXTRACT('aXbXc', '(a.*X)', 1) AS greedy,
+ REGEXP_EXTRACT('aXbXc', '(?U)(a.*X)', 1) AS non_greedy;
+```
+
+```text
++--------+------------+
+| greedy | non_greedy |
++--------+------------+
+| aXbX   | aX         |
++--------+------------+
```
\ No newline at end of file
diff --git a/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md b/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md
index 204e3756e05..b14d3f0dee0 100644
--- a/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md
+++ b/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-all.md
@@ -33,6 +33,31 @@ REGEXP_EXTRACT_ALL(<str>, <pattern>)
The function returns an array of strings that represent the parts of the input
string that match the first sub - pattern of the specified regular expression.
The return type is an array of String values. If no matches are found, or if
the pattern has no sub - patterns, an empty array is returned.
+**Default Behavior**:
+
+| Default Setting | Behavior |
+| ------------------------------------ | ----------------------------------------------------------------------------------------- |
+| `.` matches newline | `.` can match `\n` (newline) by default. |
+| Case-sensitive | Matching is case-sensitive. |
+| `^`/`$` match full string boundaries | `^` matches only the start of the string, `$` matches only the end, not line starts/ends. |
+| Greedy quantifiers | `*`, `+`, etc. match as much as possible by default. |
+| UTF-8 | Strings are processed as UTF-8. |
+
+**Pattern Modifiers**:
+
+You can override the default behavior by prefixing the `pattern` with `(?flags)`. Multiple modifiers can be combined, e.g., `(?im)`; a `-` prefix disables the corresponding option, e.g., `(?-s)`.
+
+| Flag | Meaning |
+| ------- | ---------------------------------------------------------------------------- |
+| `(?i)` | Case-insensitive matching |
+| `(?-i)` | Case-sensitive (default) |
+| `(?s)` | `.` matches newline (enabled by default) |
+| `(?-s)` | `.` does **not** match newline |
+| `(?m)` | Multiline mode: `^` matches start of each line, `$` matches end of each line |
+| `(?-m)` | Single-line mode: `^`/`$` match full string boundaries (default) |
+| `(?U)` | Non-greedy quantifiers: `*`, `+`, etc. match as little as possible |
+| `(?-U)` | Greedy quantifiers (default): `*`, `+`, etc. match as much as possible |
+
## Example
Basic matching of lowercase letters around 'C'.In this example, the pattern
([[:lower:]]+)C([[:lower:]]+) matches the part of the string where one or more
lowercase letters are followed by 'C' and then one or more lowercase letters.
The first sub - pattern ([[:lower:]]+) before 'C' matches 'b', so the result is
['b'].
@@ -184,4 +209,51 @@ SELECT regexp_extract_all('hello (world) 123', '([[:alpha:]+');
```text
ERROR 1105 (HY000): errCode = 2, detailMessage =
(10.16.10.2)[INVALID_ARGUMENT]Could not compile regexp pattern: ([[:alpha:]+
Error: missing ]: [[:alpha:]+
+```
+
+Pattern Modifiers
+
+Case-insensitive: `(?i)` makes the match ignore case
+
+```sql
+SELECT REGEXP_EXTRACT_ALL('Hello hello HELLO', '(hello)') AS case_sensitive,
+ REGEXP_EXTRACT_ALL('Hello hello HELLO', '(?i)(hello)') AS case_insensitive;
+```
+
+```text
++----------------+---------------------------+
+| case_sensitive | case_insensitive          |
++----------------+---------------------------+
+| ['hello']      | ['Hello','hello','HELLO'] |
++----------------+---------------------------+
+```
+
+Multiline mode: `(?m)` makes `^` and `$` match start/end of each line
+
+```sql
+SELECT REGEXP_EXTRACT_ALL('foo\nbar\nbaz', '^([a-z]+)') AS single_line,
+ REGEXP_EXTRACT_ALL('foo\nbar\nbaz', '(?m)^([a-z]+)') AS multi_line;
+```
+
+```text
++-------------+---------------------+
+| single_line | multi_line          |
++-------------+---------------------+
+| ['foo']     | ['foo','bar','baz'] |
++-------------+---------------------+
+```
+
+Greedy vs non-greedy: `(?U)` makes quantifiers match as little as possible
+
+```sql
+SELECT REGEXP_EXTRACT_ALL('aXbXcXd', '(a.*X)') AS greedy,
+ REGEXP_EXTRACT_ALL('aXbXcXd', '(?U)(a.*X)') AS non_greedy;
+```
+
+```text
++------------+------------+
+| greedy     | non_greedy |
++------------+------------+
+| ['aXbXcX'] | ['aX']     |
++------------+------------+
```
\ No newline at end of file
diff --git a/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md b/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md
index 56ed2750f19..a439a0d3057 100644
--- a/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md
+++ b/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract-or-null.md
@@ -43,6 +43,31 @@ Return a string type, with the result being the part that matches `<pattern>`.
If the `<pos>` < 0,return NULL;
If the `pos` > the length of `<str>`,return NULL;
+**Default Behavior**:
+
+| Default Setting | Behavior |
+| ------------------------------------ | ----------------------------------------------------------------------------------------- |
+| `.` matches newline | `.` can match `\n` (newline) by default. |
+| Case-sensitive | Matching is case-sensitive. |
+| `^`/`$` match full string boundaries | `^` matches only the start of the string, `$` matches only the end, not line starts/ends. |
+| Greedy quantifiers | `*`, `+`, etc. match as much as possible by default. |
+| UTF-8 | Strings are processed as UTF-8. |
+
+**Pattern Modifiers**:
+
+You can override the default behavior by prefixing the `pattern` with `(?flags)`. Multiple modifiers can be combined, e.g., `(?im)`; a `-` prefix disables the corresponding option, e.g., `(?-s)`.
+
+| Flag | Meaning |
+| ------- | ---------------------------------------------------------------------------- |
+| `(?i)` | Case-insensitive matching |
+| `(?-i)` | Case-sensitive (default) |
+| `(?s)` | `.` matches newline (enabled by default) |
+| `(?-s)` | `.` does **not** match newline |
+| `(?m)` | Multiline mode: `^` matches start of each line, `$` matches end of each line |
+| `(?-m)` | Single-line mode: `^`/`$` match full string boundaries (default) |
+| `(?U)` | Non-greedy quantifiers: `*`, `+`, etc. match as little as possible |
+| `(?-U)` | Greedy quantifiers (default): `*`, `+`, etc. match as much as possible |
+
## Example
Extracting a specific group from a match. Explanation: The regular expression
([[:lower:]]+)C([[:lower:]]+) looks for sequences of one or more lowercase
letters separated by 'C'. The group with index 1 corresponds to the first
sequence of lowercase letters, so 'b' is returned.
@@ -227,4 +252,67 @@ mysql> SELECT REGEXP_EXTRACT_OR_NULL('123AbCdExCx', '([[:lower:]]+)C([[]ower:]]+
```text
ERROR 1105 (HY000): errCode = 2, detailMessage =
(10.16.10.2)[INVALID_ARGUMENT]Could not compile regexp pattern:
([[:lower:]]+)C([[:lower:]+)
Error: missing ]: [[:lower:]+)
+```
+
+Pattern Modifiers
+
+Case-insensitive: `(?i)` makes the match ignore case
+
+```sql
+SELECT REGEXP_EXTRACT_OR_NULL('Hello World', '(hello)', 1) AS case_sensitive,
+ REGEXP_EXTRACT_OR_NULL('Hello World', '(?i)(hello)', 1) AS case_insensitive;
+```
+
+```text
++----------------+------------------+
+| case_sensitive | case_insensitive |
++----------------+------------------+
+| NULL           | Hello            |
++----------------+------------------+
+```
+
+`.` matches newline by default; with `(?-s)`, `.` does not match newline
+
+```sql
+SELECT REGEXP_EXTRACT_OR_NULL('foo\nbar', '^(.+)$', 1) AS dot_match_nl,
+ REGEXP_EXTRACT_OR_NULL('foo\nbar', '(?-s)^(.+)$', 1) AS dot_not_match_nl;
+```
+
+```text
++--------------+------------------+
+| dot_match_nl | dot_not_match_nl |
++--------------+------------------+
+| foo
+bar | NULL |
++--------------+------------------+
+```
+
+Multiline mode: `(?m)` makes `^` and `$` match start/end of each line
+
+```sql
+SELECT REGEXP_EXTRACT_OR_NULL('foo\nbar', '^(bar)', 1) AS single_line,
+ REGEXP_EXTRACT_OR_NULL('foo\nbar', '(?m)^(bar)', 1) AS multi_line;
+```
+
+```text
++-------------+------------+
+| single_line | multi_line |
++-------------+------------+
+| NULL        | bar        |
++-------------+------------+
+```
+
+Greedy vs non-greedy: `(?U)` makes quantifiers match as little as possible
+
+```sql
+SELECT REGEXP_EXTRACT_OR_NULL('aXbXc', '(a.*X)', 1) AS greedy,
+ REGEXP_EXTRACT_OR_NULL('aXbXc', '(?U)(a.*X)', 1) AS non_greedy;
+```
+
+```text
++--------+------------+
+| greedy | non_greedy |
++--------+------------+
+| aXbX   | aX         |
++--------+------------+
```
\ No newline at end of file
diff --git a/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract.md b/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract.md
index bc438f31ee3..88ba1eeeaf6 100644
--- a/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract.md
+++ b/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp-extract.md
@@ -36,6 +36,31 @@ REGEXP_EXTRACT(<str>, <pattern>, <pos>)
The matching part of the pattern. It is of Varchar type. If no match is found,
an empty string will be returned.
+**Default Behavior**:
+
+| Default Setting | Behavior |
+| ------------------------------------ | ----------------------------------------------------------------------------------------- |
+| `.` matches newline | `.` can match `\n` (newline) by default. |
+| Case-sensitive | Matching is case-sensitive. |
+| `^`/`$` match full string boundaries | `^` matches only the start of the string, `$` matches only the end, not line starts/ends. |
+| Greedy quantifiers | `*`, `+`, etc. match as much as possible by default. |
+| UTF-8 | Strings are processed as UTF-8. |
+
+**Pattern Modifiers**:
+
+You can override the default behavior by prefixing the `pattern` with `(?flags)`. Multiple modifiers can be combined, e.g., `(?im)`; a `-` prefix disables the corresponding option, e.g., `(?-s)`.
+
+| Flag | Meaning |
+| ------- | ---------------------------------------------------------------------------- |
+| `(?i)` | Case-insensitive matching |
+| `(?-i)` | Case-sensitive (default) |
+| `(?s)` | `.` matches newline (enabled by default) |
+| `(?-s)` | `.` does **not** match newline |
+| `(?m)` | Multiline mode: `^` matches start of each line, `$` matches end of each line |
+| `(?-m)` | Single-line mode: `^`/`$` match full string boundaries (default) |
+| `(?U)` | Non-greedy quantifiers: `*`, `+`, etc. match as little as possible |
+| `(?-U)` | Greedy quantifiers (default): `*`, `+`, etc. match as much as possible |
+
## Example
Extract the first matching part.In this example, the regular expression
([[:lower:]]+)C([[:lower:]]+) matches the part of the string where one or more
lowercase letters are followed by 'C' and then one or more lowercase letters.
The first capturing group ([[:lower:]]+) before 'C' matches 'b', so the result
is 'b'.
@@ -180,4 +205,67 @@ SELECT regexp_extract('AbCdE', '([[:digit:]]+', 1);
```text
ERROR 1105 (HY000): errCode = 2, detailMessage =
(10.16.10.2)[INVALID_ARGUMENT]Could not compile regexp pattern: ([[:digit:]]+
Error: missing ): ([[:digit:]]+
+```
+
+Pattern Modifiers
+
+Case-insensitive: `(?i)` makes the match ignore case
+
+```sql
+SELECT REGEXP_EXTRACT('Hello World', '(hello)', 1) AS case_sensitive,
+ REGEXP_EXTRACT('Hello World', '(?i)(hello)', 1) AS case_insensitive;
+```
+
+```text
++----------------+------------------+
+| case_sensitive | case_insensitive |
++----------------+------------------+
+|                | Hello            |
++----------------+------------------+
+```
+
+`.` matches newline by default; with `(?-s)`, `.` does not match newline
+
+```sql
+SELECT REGEXP_EXTRACT('foo\nbar', '^(.+)$', 1) AS dot_match_nl,
+ REGEXP_EXTRACT('foo\nbar', '(?-s)^(.+)$', 1) AS dot_not_match_nl;
+```
+
+```text
++--------------+------------------+
+| dot_match_nl | dot_not_match_nl |
++--------------+------------------+
+| foo
+bar | |
++--------------+------------------+
+```
+
+Multiline mode: `(?m)` makes `^` and `$` match start/end of each line
+
+```sql
+SELECT REGEXP_EXTRACT('foo\nbar', '^(bar)', 1) AS single_line,
+ REGEXP_EXTRACT('foo\nbar', '(?m)^(bar)', 1) AS multi_line;
+```
+
+```text
++-------------+------------+
+| single_line | multi_line |
++-------------+------------+
+|             | bar        |
++-------------+------------+
+```
+
+Greedy vs non-greedy: `(?U)` makes quantifiers match as little as possible
+
+```sql
+SELECT REGEXP_EXTRACT('aXbXc', '(a.*X)', 1) AS greedy,
+ REGEXP_EXTRACT('aXbXc', '(?U)(a.*X)', 1) AS non_greedy;
+```
+
+```text
++--------+------------+
+| greedy | non_greedy |
++--------+------------+
+| aXbX   | aX         |
++--------+------------+
```
\ No newline at end of file
diff --git a/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp.md b/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp.md
index 31e3bed803c..23ad8741155 100644
--- a/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp.md
+++ b/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/string-functions/regexp.md
@@ -32,6 +32,31 @@ REGEXP(<str>, <pattern>)
The REGEXP function returns a BOOLEAN value. If the string <str> matches the
regular expression pattern <pattern>, the function returns true (represented as
1 in SQL); if not, it returns false (represented as 0 in SQL).
+**Default Behavior**:
+
+| Default Setting | Behavior |
+| ------------------------------------ | ----------------------------------------------------------------------------------------- |
+| `.` matches newline | `.` can match `\n` (newline) by default. |
+| Case-sensitive | Matching is case-sensitive. |
+| `^`/`$` match full string boundaries | `^` matches only the start of the string, `$` matches only the end, not line starts/ends. |
+| Greedy quantifiers | `*`, `+`, etc. match as much as possible by default. |
+| UTF-8 | Strings are processed as UTF-8. |
+
+**Pattern Modifiers**:
+
+You can override the default behavior by prefixing the `pattern` with `(?flags)`. Multiple modifiers can be combined, e.g., `(?im)`; a `-` prefix disables the corresponding option, e.g., `(?-s)`.
+
+| Flag | Meaning |
+| ------- | ---------------------------------------------------------------------------- |
+| `(?i)` | Case-insensitive matching |
+| `(?-i)` | Case-sensitive (default) |
+| `(?s)` | `.` matches newline (enabled by default) |
+| `(?-s)` | `.` does **not** match newline |
+| `(?m)` | Multiline mode: `^` matches start of each line, `$` matches end of each line |
+| `(?-m)` | Single-line mode: `^`/`$` match full string boundaries (default) |
+| `(?U)` | Non-greedy quantifiers: `*`, `+`, etc. match as little as possible |
+| `(?-U)` | Greedy quantifiers (default): `*`, `+`, etc. match as much as possible |
+
## Examples
```sql
@@ -192,4 +217,63 @@ SELECT REGEXP('Hello, World!', '([a-z');
```text
ERROR 1105 (HY000): errCode = 2, detailMessage =
(10.16.10.2)[INTERNAL_ERROR]Invalid regex expression: ([a-z
+```
+
+Pattern Modifiers
+
+Case-insensitive matching: `(?i)` makes the match ignore case
+
+```sql
+SELECT REGEXP('Hello World', 'hello') AS case_sensitive, REGEXP('Hello World', '(?i)hello') AS case_insensitive;
+```
+
+```text
++----------------+------------------+
+| case_sensitive | case_insensitive |
++----------------+------------------+
+|              0 |                1 |
++----------------+------------------+
+```
+
+`.` matches newline by default; with `(?-s)`, `.` does not match newline
+
+```sql
+SELECT REGEXP('foo\nbar', '^.+$') AS dot_match_nl, REGEXP('foo\nbar', '(?-s)^.+$') AS dot_not_match_nl;
+```
+
+```text
++--------------+------------------+
+| dot_match_nl | dot_not_match_nl |
++--------------+------------------+
+|            1 |                0 |
++--------------+------------------+
+```
+
+Multiline mode: `(?m)` makes `^` and `$` match start/end of each line
+
+```sql
+SELECT REGEXP('foo\nbar', '^bar') AS single_line, REGEXP('foo\nbar', '(?m)^bar') AS multi_line;
+```
+
+```text
++-------------+------------+
+| single_line | multi_line |
++-------------+------------+
+|           0 |          1 |
++-------------+------------+
+```
+
+Greedy vs non-greedy: `(?U)` makes quantifiers match as little as possible
+
+```sql
+SELECT REGEXP_EXTRACT('aXbXc', '(a.*X)', 1) AS greedy,
+ REGEXP_EXTRACT('aXbXc', '(?U)(a.*X)', 1) AS non_greedy;
+```
+
+```text
++--------+------------+
+| greedy | non_greedy |
++--------+------------+
+| aXbX   | aX         |
++--------+------------+
```
\ No newline at end of file