This is an automated email from the ASF dual-hosted git repository.
yiguolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
new 6bd8c88c4ee [doc](ip) fix IP predicate return type to BOOLEAN & add
IPV6 mixed-storage best practice (#3911)
6bd8c88c4ee is described below
commit 6bd8c88c4eef16939a94afe680b38f99596d12d4
Author: Chenyang Sun <[email protected]>
AuthorDate: Fri Jun 5 16:17:30 2026 +0800
[doc](ip) fix IP predicate return type to BOOLEAN & add IPV6 mixed-storage
best practice (#3911)
1. **Fix return type of IP predicate functions to `BOOLEAN`.**
2. **Add a best-practice section to the `IPV6` data type doc.**
## Versions
- [x] dev
- [x] 4.x
- [ ] 3.x
- [ ] 2.1 or older (not covered by version/language sync gate)
## Languages
- [x] Chinese
- [x] English
- [x] Japanese candidate translation needed
## Docs Checklist
- [ ] Checked by AI
- [ ] Test Cases Built
- [ ] Updated required version and language counterparts, or explained
why not
- [ ] If only one language changed, confirmed whether source/translation
counterparts need sync
---------
Co-authored-by: Claude Opus 4.8 (1M context) <[email protected]>
---
.../basic-element/sql-data-types/ip/IPV6.md | 146 +++++++++++++++++++++
.../ip-functions/is-ip-address-in-range.md | 2 +-
.../ip-functions/is-ipv4-compat.md | 2 +-
.../ip-functions/is-ipv4-mapped.md | 2 +-
.../ip-functions/is-ipv4-string.md | 2 +-
.../ip-functions/is-ipv6-string.md | 2 +-
.../basic-element/sql-data-types/ip/IPV6.md | 146 +++++++++++++++++++++
.../ip-functions/is-ip-address-in-range.md | 2 +-
.../ip-functions/is-ipv4-compat.md | 2 +-
.../ip-functions/is-ipv4-mapped.md | 2 +-
.../ip-functions/is-ipv4-string.md | 2 +-
.../ip-functions/is-ipv6-string.md | 2 +-
.../basic-element/sql-data-types/ip/IPV6.md | 146 +++++++++++++++++++++
.../ip-functions/is-ip-address-in-range.md | 2 +-
.../ip-functions/is-ipv4-compat.md | 2 +-
.../ip-functions/is-ipv4-mapped.md | 2 +-
.../ip-functions/is-ipv4-string.md | 2 +-
.../ip-functions/is-ipv6-string.md | 2 +-
.../basic-element/sql-data-types/ip/IPV6.md | 146 +++++++++++++++++++++
.../ip-functions/is-ip-address-in-range.md | 2 +-
.../ip-functions/is-ipv4-compat.md | 2 +-
.../ip-functions/is-ipv4-mapped.md | 2 +-
.../ip-functions/is-ipv4-string.md | 2 +-
.../ip-functions/is-ipv6-string.md | 2 +-
.../basic-element/sql-data-types/ip/IPV6.md | 146 +++++++++++++++++++++
.../ip-functions/is-ip-address-in-range.md | 2 +-
.../ip-functions/is-ipv4-compat.md | 2 +-
.../ip-functions/is-ipv4-mapped.md | 2 +-
.../ip-functions/is-ipv4-string.md | 2 +-
.../ip-functions/is-ipv6-string.md | 2 +-
.../basic-element/sql-data-types/ip/IPV6.md | 146 +++++++++++++++++++++
.../ip-functions/is-ip-address-in-range.md | 2 +-
.../ip-functions/is-ipv4-compat.md | 2 +-
.../ip-functions/is-ipv4-mapped.md | 2 +-
.../ip-functions/is-ipv4-string.md | 2 +-
.../ip-functions/is-ipv6-string.md | 2 +-
36 files changed, 906 insertions(+), 30 deletions(-)
diff --git a/docs/sql-manual/basic-element/sql-data-types/ip/IPV6.md
b/docs/sql-manual/basic-element/sql-data-types/ip/IPV6.md
index 681023f4bfd..44ce6563fd3 100644
--- a/docs/sql-manual/basic-element/sql-data-types/ip/IPV6.md
+++ b/docs/sql-manual/basic-element/sql-data-types/ip/IPV6.md
@@ -53,6 +53,152 @@ mysql> select * from ipv6_test order by id;
+------+-----------------------------------------+
```
+## Best Practice: store mixed IPv4/IPv6 in a single IPV6 column with an
inverted index
+
+A single `IPV6` column can store both IPv4 and IPv6 addresses: write IPv4 in
its IPv4-mapped IPv6 form (`::ffff:x.y.z.w`) — for example `192.168.1.1` is
stored as `::ffff:192.168.1.1` — while native IPv6 (such as
`2001:16a0:2:200a::2`) is written as-is. This lets a mixed-IP field be managed
in one column, and an inverted index on that column accelerates IP filtering,
supporting equality (`=`), `IN`, and CIDR functions such as
`is_ip_address_in_range()`.
+
+### Schema
+
+```sql
+CREATE TABLE access_log (
+ `id` BIGINT NOT NULL,
+ `request_time` DATETIME NOT NULL,
+ `client_ip` IPV6 NULL COMMENT 'Stored uniformly as IPv6; IPv4 stored in
::ffff: form',
+ INDEX idx_client_ip (`client_ip`) USING INVERTED
+) ENGINE=OLAP
+DUPLICATE KEY(`id`, `request_time`)
+DISTRIBUTED BY HASH(`id`) BUCKETS 4
+PROPERTIES (
+ "replication_allocation" = "tag.location.default: 1"
+);
+```
+
+### Normalizing on load
+
+Regardless of the load method, the key is to use the same expression in the
column mapping to convert the raw IP string into IPv6 (below, `ip_str` is the
source field and `client_ip` is the table's `IPV6` column):
+
+```sql
+client_ip = COALESCE(
+ ipv4_to_ipv6(to_ipv4_or_null(ip_str)),
+ to_ipv6_or_null(ip_str)
+)
+```
+
+- A plain IPv4 string is first parsed to `IPV4` by `to_ipv4_or_null()`, then
converted to `::ffff:x.y.z.w` by `ipv4_to_ipv6()`;
+- If it is not a valid IPv4, `COALESCE` falls back to `to_ipv6_or_null()` and
parses it as native IPv6 text;
+- If both fail, `NULL` is written. To reject invalid data instead, replace
`_or_null` with the strict `to_ipv4` / `to_ipv6` and handle it via the load
error threshold.
+
+#### Routine Load
+
+```sql
+CREATE ROUTINE LOAD example_db.access_log_job ON access_log
+COLUMNS TERMINATED BY ",",
+COLUMNS(
+ id, request_time, ip_str,
+ client_ip = COALESCE(
+ ipv4_to_ipv6(to_ipv4_or_null(ip_str)),
+ to_ipv6_or_null(ip_str)
+ )
+)
+PROPERTIES ("desired_concurrent_number" = "3", "max_error_number" = "1000")
+FROM KAFKA ("kafka_broker_list" = "broker:9092", "kafka_topic" = "ip_logs");
+```
+
+#### Stream Load
+
+Apply the same conversion in the `columns` header:
+
+```shell
+curl --location-trusted -u user:passwd \
+ -H "format: csv" \
+ -H "column_separator: ," \
+ -H "columns: id, request_time, ip_str,
client_ip=COALESCE(ipv4_to_ipv6(to_ipv4_or_null(ip_str)),
to_ipv6_or_null(ip_str))" \
+ -T data.csv \
+ "http://<fe_host>:8030/api/example_db/access_log/_stream_load"
+```
+
+#### INSERT INTO (migrating from an old VARCHAR table)
+
+Convert the raw IP string to IPv6 in the `SELECT` (`ip_str` is the VARCHAR
column of the old table):
+
+```sql
+INSERT INTO access_log (id, request_time, client_ip)
+SELECT
+ id, request_time,
+ COALESCE(
+ ipv4_to_ipv6(to_ipv4_or_null(ip_str)),
+ to_ipv6_or_null(ip_str)
+ )
+FROM access_log_varchar;
+```
+
+### Rewriting queries (important)
+
+Once everything is stored as mapped IPv6, **IPv4 query conditions must also be
rewritten into mapped IPv6 form**; otherwise a type / address-family mismatch
causes the index to be skipped:
+
+- **IPv6 CIDR**: write it directly.
+
+ ```sql
+ is_ip_address_in_range(client_ip, '2001:4860:4801::/48')
+ ```
+
+- **IPv4 CIDR → mapped IPv6 CIDR**: add the `::ffff:` prefix to the address
and add **+96** to the prefix length (the first 96 bits of a mapped address are
a fixed prefix).
+
+ | IPv4 CIDR | mapped IPv6 CIDR |
+ | ------------------ | --------------------------- |
+ | `10.42.0.0/16` | `::ffff:10.42.0.0/112` |
+ | `192.178.4.0/24` | `::ffff:192.178.4.0/120` |
+ | `3.219.120.76/32` | `::ffff:3.219.120.76/128` |
+
+ ```sql
+ is_ip_address_in_range(client_ip, '::ffff:10.42.0.0/112')
+ ```
+
+- **Exact IPv4 match**: compare against the mapped value too.
+
+ ```sql
+ client_ip = ipv4_to_ipv6(to_ipv4('3.219.120.76'))
+ -- or
+ client_ip = to_ipv6('::ffff:3.219.120.76')
+ ```
+
+- **Allowlist (keep only rows in the range)**: use
`is_ip_address_in_range(...)` directly as the filter.
+
+ ```sql
+ WHERE is_ip_address_in_range(client_ip, '::ffff:10.42.0.0/112')
+ ```
+
+- **Blocklist (exclude rows in the range)**: use `NOT
is_ip_address_in_range(...)`, and avoid the `is_ip_address_in_range(...) = 0`
form (which is harder to push down to the index).
+
+ ```sql
+ WHERE NOT is_ip_address_in_range(client_ip, '::ffff:10.42.0.0/112')
+ ```
+
+Full example (a stat query that excludes several ranges). Because it contains
CIDR range conditions, lower the threshold first to ensure the index is used:
+
+```sql
+SET inverted_index_skip_threshold = 0;
+
+SELECT client_ip, COUNT(*) AS cnt
+FROM access_log
+WHERE request_time BETWEEN '2026-06-01 00:00:00' AND '2026-06-02 00:00:00'
+ AND client_ip NOT IN (to_ipv6('::ffff:3.219.120.76')) -- exact
IPv4 exclusion -> mapped
+ AND NOT is_ip_address_in_range(client_ip, '::ffff:10.42.0.0/112') -- IPv4
/16 -> /112
+ AND NOT is_ip_address_in_range(client_ip, '2001:4860:4801::/48') -- IPv6
range
+GROUP BY client_ip;
+```
+
+### Conditions for hitting the index
+
+- The column must be a native `IPV4` / `IPV6` type with a `USING INVERTED`
inverted index built on it;
+- The CIDR argument of `is_ip_address_in_range()` must be a **constant
string**;
+- An `IPV6` column only handles IPv6 CIDRs (IPv4 ranges must first be
converted to `::ffff:` mapped form), and an `IPV4` column only handles IPv4
CIDRs; a type / CIDR-family mismatch skips the index and falls back to
expression filtering;
+- The session variable `inverted_index_skip_threshold` defaults to `50`: when
more than 50% of the rows in a segment match the index, the index is skipped
and data is read directly (bypass).
+
+:::caution CIDR range queries need a lower `inverted_index_skip_threshold`
+For a CIDR range query, `is_ip_address_in_range()` is internally split into
two BKD queries, `>= range start` and `<= range end`. **Each side alone often
matches more than 50% of the rows** (even when the final intersection is
narrow), triggering bypass and skipping the index. Such queries therefore need
`SET inverted_index_skip_threshold = 0;` (or a smaller value) in the session to
hit the index reliably. Equality / `IN` are high-selectivity point lookups and
are unaffected.
+:::
+
### keywords
IPV6
diff --git
a/docs/sql-manual/sql-functions/scalar-functions/ip-functions/is-ip-address-in-range.md
b/docs/sql-manual/sql-functions/scalar-functions/ip-functions/is-ip-address-in-range.md
index 9f03b560237..e588e84d966 100644
---
a/docs/sql-manual/sql-functions/scalar-functions/ip-functions/is-ip-address-in-range.md
+++
b/docs/sql-manual/sql-functions/scalar-functions/ip-functions/is-ip-address-in-range.md
@@ -19,7 +19,7 @@ IS_IP_ADDRESS_IN_RANGE(<ip_address>, <cidr_range>)
- `<cidr_range>`: CIDR network range (string format, such as "192.168.1.0/24")
### Return Value
-Return Type: TINYINT
+Return Type: BOOLEAN
Return Value Meaning:
- Returns 1: indicates the IP address is within the specified CIDR range
diff --git
a/docs/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-compat.md
b/docs/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-compat.md
index 3e5526c0883..57c49dd1488 100644
---
a/docs/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-compat.md
+++
b/docs/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-compat.md
@@ -18,7 +18,7 @@ IS_IPV4_COMPAT(<ipv6_address>)
- `<ipv6_address>`: Binary representation of IPv6 address (VARCHAR type, 16
bytes)
### Return Value
-Return Type: TINYINT
+Return Type: BOOLEAN
Return Value Meaning: 1 indicates it is an IPv4-compatible address, 0
indicates it is not an IPv4-compatible address
diff --git
a/docs/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-mapped.md
b/docs/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-mapped.md
index 7d307a6cf08..76fd1988838 100644
---
a/docs/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-mapped.md
+++
b/docs/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-mapped.md
@@ -18,7 +18,7 @@ IS_IPV4_MAPPED(<ipv6_address>)
- `<ipv6_address>`: Binary representation of IPv6 address (VARCHAR type, 16
bytes)
### Return Value
-Return Type: TINYINT
+Return Type: BOOLEAN
Return Value Meaning: 1 indicates it is an IPv4-mapped address, 0 indicates it
is not an IPv4-mapped address
diff --git
a/docs/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-string.md
b/docs/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-string.md
index 0ecc48b7088..67cdce74f02 100644
---
a/docs/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-string.md
+++
b/docs/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-string.md
@@ -21,7 +21,7 @@ IS_IPV4_STRING(<ipv4_str>)
- `<ipv4_str>`: String to check
### Return Value
-Return Type: TINYINT
+Return Type: BOOLEAN
Return Value Meaning:
- Returns 1: indicates the input is a valid IPv4 address format
diff --git
a/docs/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv6-string.md
b/docs/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv6-string.md
index 2576f6b93e8..e917941b348 100644
---
a/docs/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv6-string.md
+++
b/docs/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv6-string.md
@@ -21,7 +21,7 @@ IS_IPV6_STRING(<ipv6_str>)
- `<ipv6_str>`: String to check
### Return Value
-Return Type: TINYINT
+Return Type: BOOLEAN
Return Value Meaning:
- Returns 1: indicates the input is a valid IPv6 address format
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/basic-element/sql-data-types/ip/IPV6.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/basic-element/sql-data-types/ip/IPV6.md
index 3ce0a00f8d8..7fce814c3b6 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/basic-element/sql-data-types/ip/IPV6.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/basic-element/sql-data-types/ip/IPV6.md
@@ -53,6 +53,152 @@ mysql> select * from ipv6_test order by id;
+------+-----------------------------------------+
```
+## 最佳实践:用 IPV6 统一存储混合 IPv4/IPv6 并走倒排索引
+
+一个 `IPV6` 列可以同时存储 IPv4 和 IPv6 两种地址:把 IPv4 规范化成 IPv4-mapped
IPv6(`::ffff:x.y.z.w`)后写入,例如 `192.168.1.1` 存成 `::ffff:192.168.1.1`,原生 IPv6(如
`2001:16a0:2:200a::2`)照常写入。这样混合 IP 就能用同一列统一管理,并在该列上建倒排索引来过滤
IP,支持等值(`=`)、`IN`、以及 `is_ip_address_in_range()` 等 CIDR 函数走索引加速。
+
+### 建模
+
+```sql
+CREATE TABLE access_log (
+ `id` BIGINT NOT NULL,
+ `request_time` DATETIME NOT NULL,
+ `client_ip` IPV6 NULL COMMENT '统一存储为 IPv6,IPv4 以 ::ffff: 形式存储',
+ INDEX idx_client_ip (`client_ip`) USING INVERTED
+) ENGINE=OLAP
+DUPLICATE KEY(`id`, `request_time`)
+DISTRIBUTED BY HASH(`id`) BUCKETS 4
+PROPERTIES (
+ "replication_allocation" = "tag.location.default: 1"
+);
+```
+
+### 导入时归一化
+
+无论用哪种导入方式,核心都是在列映射里用同一个表达式把原始 IP 字符串转成 IPv6(下面 `ip_str` 为源字段,`client_ip` 为表的
`IPV6` 列):
+
+```sql
+client_ip = COALESCE(
+ ipv4_to_ipv6(to_ipv4_or_null(ip_str)),
+ to_ipv6_or_null(ip_str)
+)
+```
+
+- 普通 IPv4 字符串先用 `to_ipv4_or_null()` 解析为 `IPV4`,再用 `ipv4_to_ipv6()` 转成
`::ffff:x.y.z.w`;
+- 不是合法 IPv4 时 `COALESCE` 落到 `to_ipv6_or_null()`,按原生 IPv6 文本解析;
+- 两者都失败则写入 `NULL`。若希望非法数据直接报错,可把 `_or_null` 换成严格的 `to_ipv4` /
`to_ipv6`,并配合导入的错误阈值处理。
+
+#### Routine Load
+
+```sql
+CREATE ROUTINE LOAD example_db.access_log_job ON access_log
+COLUMNS TERMINATED BY ",",
+COLUMNS(
+ id, request_time, ip_str,
+ client_ip = COALESCE(
+ ipv4_to_ipv6(to_ipv4_or_null(ip_str)),
+ to_ipv6_or_null(ip_str)
+ )
+)
+PROPERTIES ("desired_concurrent_number" = "3", "max_error_number" = "1000")
+FROM KAFKA ("kafka_broker_list" = "broker:9092", "kafka_topic" = "ip_logs");
+```
+
+#### Stream Load
+
+在 `columns` 头里做同样的转换:
+
+```shell
+curl --location-trusted -u user:passwd \
+ -H "format: csv" \
+ -H "column_separator: ," \
+ -H "columns: id, request_time, ip_str,
client_ip=COALESCE(ipv4_to_ipv6(to_ipv4_or_null(ip_str)),
to_ipv6_or_null(ip_str))" \
+ -T data.csv \
+ "http://<fe_host>:8030/api/example_db/access_log/_stream_load"
+```
+
+#### INSERT INTO(从旧 VARCHAR 表迁移)
+
+在 `SELECT` 里把原始 IP 字符串统一转成 IPv6(`ip_str` 为旧表的 VARCHAR 列):
+
+```sql
+INSERT INTO access_log (id, request_time, client_ip)
+SELECT
+ id, request_time,
+ COALESCE(
+ ipv4_to_ipv6(to_ipv4_or_null(ip_str)),
+ to_ipv6_or_null(ip_str)
+ )
+FROM access_log_varchar;
+```
+
+### 查询改写(关键)
+
+存储统一成 mapped IPv6 后,**IPv4 的查询条件也必须改写成 mapped IPv6 形式**,否则类型/网段族不匹配会导致跳过索引:
+
+- **IPv6 CIDR**:直接写。
+
+ ```sql
+ is_ip_address_in_range(client_ip, '2001:4860:4801::/48')
+ ```
+
+- **IPv4 CIDR → mapped IPv6 CIDR**:地址加 `::ffff:` 前缀,前缀长度 **+96**(mapped 地址前 96
位是固定前缀)。
+
+ | IPv4 CIDR | mapped IPv6 CIDR |
+ | ------------------ | --------------------------- |
+ | `10.42.0.0/16` | `::ffff:10.42.0.0/112` |
+ | `192.178.4.0/24` | `::ffff:192.178.4.0/120` |
+ | `3.219.120.76/32` | `::ffff:3.219.120.76/128` |
+
+ ```sql
+ is_ip_address_in_range(client_ip, '::ffff:10.42.0.0/112')
+ ```
+
+- **IPv4 精确匹配**:同样比较 mapped 值。
+
+ ```sql
+ client_ip = ipv4_to_ipv6(to_ipv4('3.219.120.76'))
+ -- 或
+ client_ip = to_ipv6('::ffff:3.219.120.76')
+ ```
+
+- **白名单(只保留命中网段的行)**:直接用 `is_ip_address_in_range(...)` 作为过滤条件。
+
+ ```sql
+ WHERE is_ip_address_in_range(client_ip, '::ffff:10.42.0.0/112')
+ ```
+
+- **黑名单(排除命中网段的行)**:用 `NOT is_ip_address_in_range(...)`,避免
`is_ip_address_in_range(...) = 0` 这种写法(后者更难走索引下推)。
+
+ ```sql
+ WHERE NOT is_ip_address_in_range(client_ip, '::ffff:10.42.0.0/112')
+ ```
+
+完整示例(一段“排除若干网段”的统计)。因为包含 CIDR 范围条件,运行前先调低阈值确保命中索引:
+
+```sql
+SET inverted_index_skip_threshold = 0;
+
+SELECT client_ip, COUNT(*) AS cnt
+FROM access_log
+WHERE request_time BETWEEN '2026-06-01 00:00:00' AND '2026-06-02 00:00:00'
+ AND client_ip NOT IN (to_ipv6('::ffff:3.219.120.76')) -- IPv4
精确排除 → mapped
+ AND NOT is_ip_address_in_range(client_ip, '::ffff:10.42.0.0/112') -- IPv4
/16 → /112
+ AND NOT is_ip_address_in_range(client_ip, '2001:4860:4801::/48') -- IPv6
网段
+GROUP BY client_ip;
+```
+
+### 走索引的前提
+
+- 列必须是原生 `IPV4` / `IPV6` 类型,且已建 `USING INVERTED` 倒排索引;
+- `is_ip_address_in_range()` 的 CIDR 参数必须是**常量字符串**;
+- `IPV6` 列只处理 IPv6 CIDR(IPv4 网段需先转成 `::ffff:` mapped 形式),`IPV4` 列只处理 IPv4
CIDR,类型与 CIDR 族不匹配会跳过索引、回退到表达式过滤;
+- 会话变量 `inverted_index_skip_threshold` 默认 `50`:当一个 segment 中超过 50%
的行命中索引时会跳过索引、直接读取(bypass)。
+
+:::caution CIDR 范围查询需要调低 `inverted_index_skip_threshold`
+对 CIDR 范围查询,`is_ip_address_in_range()` 内部会拆成 `>= 网段起始` 和 `<= 网段结束` 两次 BKD
查询,**单边各自命中的行数往往都超过 50%**(即便最终交集很窄),从而触发 bypass、跳过索引。因此这类查询需要在会话里设 `SET
inverted_index_skip_threshold = 0;`(或较小值)才能稳定命中索引。等值 / `IN` 是高选择性点查,不受此影响。
+:::
+
### keywords
IPV6
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ip-address-in-range.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ip-address-in-range.md
index e5368af7726..523ec29a8f7 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ip-address-in-range.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ip-address-in-range.md
@@ -21,7 +21,7 @@ IS_IP_ADDRESS_IN_RANGE(<ip_address>, <cidr_range>)
- `<cidr_range>`:CIDR 网段范围(字符串格式,如 "192.168.1.0/24")
### 返回值
-返回类型:TINYINT
+返回类型:BOOLEAN
返回值含义:
- 返回 1:表示 IP 地址在指定的 CIDR 范围内
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-compat.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-compat.md
index 49532a2a08c..aef5fb5c3af 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-compat.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-compat.md
@@ -20,7 +20,7 @@ IS_IPV4_COMPAT(<ipv6_address>)
- `<ipv6_address>`:IPv6 地址的二进制表示(VARCHAR 类型,16 字节)
### 返回值
-返回类型:TINYINT
+返回类型:BOOLEAN
返回值含义:1 表示是 IPv4 兼容地址,0 表示不是 IPv4 兼容地址
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-mapped.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-mapped.md
index 3851d49694a..1af7828c4f0 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-mapped.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-mapped.md
@@ -20,7 +20,7 @@ IS_IPV4_MAPPED(<ipv6_address>)
- `<ipv6_address>`:IPv6 地址的二进制表示(VARCHAR 类型,16 字节)
### 返回值
-返回类型:TINYINT
+返回类型:BOOLEAN
返回值含义:1 表示是 IPv4 映射地址,0 表示不是 IPv4 映射地址
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-string.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-string.md
index 9dc9e15e5df..0d3471f62b5 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-string.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-string.md
@@ -23,7 +23,7 @@ IS_IPV4_STRING(<ipv4_str>)
- `<ipv4_str>`:要检查的字符串
### 返回值
-返回类型:TINYINT
+返回类型:BOOLEAN
返回值含义:
- 返回 1:表示输入是有效的 IPv4 地址格式
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv6-string.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv6-string.md
index 02a0771e70b..7f3107dd567 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv6-string.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv6-string.md
@@ -23,7 +23,7 @@ IS_IPV6_STRING(<ipv6_str>)
- `<ipv6_str>`:要检查的字符串
### 返回值
-返回类型:TINYINT
+返回类型:BOOLEAN
返回值含义:
- 返回 1:表示输入是有效的 IPv6 地址格式
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/basic-element/sql-data-types/ip/IPV6.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/basic-element/sql-data-types/ip/IPV6.md
index 3ce0a00f8d8..7fce814c3b6 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/basic-element/sql-data-types/ip/IPV6.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/basic-element/sql-data-types/ip/IPV6.md
@@ -53,6 +53,152 @@ mysql> select * from ipv6_test order by id;
+------+-----------------------------------------+
```
+## 最佳实践:用 IPV6 统一存储混合 IPv4/IPv6 并走倒排索引
+
+一个 `IPV6` 列可以同时存储 IPv4 和 IPv6 两种地址:把 IPv4 规范化成 IPv4-mapped
IPv6(`::ffff:x.y.z.w`)后写入,例如 `192.168.1.1` 存成 `::ffff:192.168.1.1`,原生 IPv6(如
`2001:16a0:2:200a::2`)照常写入。这样混合 IP 就能用同一列统一管理,并在该列上建倒排索引来过滤
IP,支持等值(`=`)、`IN`、以及 `is_ip_address_in_range()` 等 CIDR 函数走索引加速。
+
+### 建模
+
+```sql
+CREATE TABLE access_log (
+ `id` BIGINT NOT NULL,
+ `request_time` DATETIME NOT NULL,
+ `client_ip` IPV6 NULL COMMENT '统一存储为 IPv6,IPv4 以 ::ffff: 形式存储',
+ INDEX idx_client_ip (`client_ip`) USING INVERTED
+) ENGINE=OLAP
+DUPLICATE KEY(`id`, `request_time`)
+DISTRIBUTED BY HASH(`id`) BUCKETS 4
+PROPERTIES (
+ "replication_allocation" = "tag.location.default: 1"
+);
+```
+
+### 导入时归一化
+
+无论用哪种导入方式,核心都是在列映射里用同一个表达式把原始 IP 字符串转成 IPv6(下面 `ip_str` 为源字段,`client_ip` 为表的
`IPV6` 列):
+
+```sql
+client_ip = COALESCE(
+ ipv4_to_ipv6(to_ipv4_or_null(ip_str)),
+ to_ipv6_or_null(ip_str)
+)
+```
+
+- 普通 IPv4 字符串先用 `to_ipv4_or_null()` 解析为 `IPV4`,再用 `ipv4_to_ipv6()` 转成
`::ffff:x.y.z.w`;
+- 不是合法 IPv4 时 `COALESCE` 落到 `to_ipv6_or_null()`,按原生 IPv6 文本解析;
+- 两者都失败则写入 `NULL`。若希望非法数据直接报错,可把 `_or_null` 换成严格的 `to_ipv4` /
`to_ipv6`,并配合导入的错误阈值处理。
+
+#### Routine Load
+
+```sql
+CREATE ROUTINE LOAD example_db.access_log_job ON access_log
+COLUMNS TERMINATED BY ",",
+COLUMNS(
+ id, request_time, ip_str,
+ client_ip = COALESCE(
+ ipv4_to_ipv6(to_ipv4_or_null(ip_str)),
+ to_ipv6_or_null(ip_str)
+ )
+)
+PROPERTIES ("desired_concurrent_number" = "3", "max_error_number" = "1000")
+FROM KAFKA ("kafka_broker_list" = "broker:9092", "kafka_topic" = "ip_logs");
+```
+
+#### Stream Load
+
+在 `columns` 头里做同样的转换:
+
+```shell
+curl --location-trusted -u user:passwd \
+ -H "format: csv" \
+ -H "column_separator: ," \
+ -H "columns: id, request_time, ip_str,
client_ip=COALESCE(ipv4_to_ipv6(to_ipv4_or_null(ip_str)),
to_ipv6_or_null(ip_str))" \
+ -T data.csv \
+ "http://<fe_host>:8030/api/example_db/access_log/_stream_load"
+```
+
+#### INSERT INTO(从旧 VARCHAR 表迁移)
+
+在 `SELECT` 里把原始 IP 字符串统一转成 IPv6(`ip_str` 为旧表的 VARCHAR 列):
+
+```sql
+INSERT INTO access_log (id, request_time, client_ip)
+SELECT
+ id, request_time,
+ COALESCE(
+ ipv4_to_ipv6(to_ipv4_or_null(ip_str)),
+ to_ipv6_or_null(ip_str)
+ )
+FROM access_log_varchar;
+```
+
+### 查询改写(关键)
+
+存储统一成 mapped IPv6 后,**IPv4 的查询条件也必须改写成 mapped IPv6 形式**,否则类型/网段族不匹配会导致跳过索引:
+
+- **IPv6 CIDR**:直接写。
+
+ ```sql
+ is_ip_address_in_range(client_ip, '2001:4860:4801::/48')
+ ```
+
+- **IPv4 CIDR → mapped IPv6 CIDR**:地址加 `::ffff:` 前缀,前缀长度 **+96**(mapped 地址前 96
位是固定前缀)。
+
+ | IPv4 CIDR | mapped IPv6 CIDR |
+ | ------------------ | --------------------------- |
+ | `10.42.0.0/16` | `::ffff:10.42.0.0/112` |
+ | `192.178.4.0/24` | `::ffff:192.178.4.0/120` |
+ | `3.219.120.76/32` | `::ffff:3.219.120.76/128` |
+
+ ```sql
+ is_ip_address_in_range(client_ip, '::ffff:10.42.0.0/112')
+ ```
+
+- **IPv4 精确匹配**:同样比较 mapped 值。
+
+ ```sql
+ client_ip = ipv4_to_ipv6(to_ipv4('3.219.120.76'))
+ -- 或
+ client_ip = to_ipv6('::ffff:3.219.120.76')
+ ```
+
+- **白名单(只保留命中网段的行)**:直接用 `is_ip_address_in_range(...)` 作为过滤条件。
+
+ ```sql
+ WHERE is_ip_address_in_range(client_ip, '::ffff:10.42.0.0/112')
+ ```
+
+- **黑名单(排除命中网段的行)**:用 `NOT is_ip_address_in_range(...)`,避免
`is_ip_address_in_range(...) = 0` 这种写法(后者更难走索引下推)。
+
+ ```sql
+ WHERE NOT is_ip_address_in_range(client_ip, '::ffff:10.42.0.0/112')
+ ```
+
+完整示例(一段“排除若干网段”的统计)。因为包含 CIDR 范围条件,运行前先调低阈值确保命中索引:
+
+```sql
+SET inverted_index_skip_threshold = 0;
+
+SELECT client_ip, COUNT(*) AS cnt
+FROM access_log
+WHERE request_time BETWEEN '2026-06-01 00:00:00' AND '2026-06-02 00:00:00'
+ AND client_ip NOT IN (to_ipv6('::ffff:3.219.120.76')) -- IPv4
精确排除 → mapped
+ AND NOT is_ip_address_in_range(client_ip, '::ffff:10.42.0.0/112') -- IPv4
/16 → /112
+ AND NOT is_ip_address_in_range(client_ip, '2001:4860:4801::/48') -- IPv6
网段
+GROUP BY client_ip;
+```
+
+### 走索引的前提
+
+- 列必须是原生 `IPV4` / `IPV6` 类型,且已建 `USING INVERTED` 倒排索引;
+- `is_ip_address_in_range()` 的 CIDR 参数必须是**常量字符串**;
+- `IPV6` 列只处理 IPv6 CIDR(IPv4 网段需先转成 `::ffff:` mapped 形式),`IPV4` 列只处理 IPv4
CIDR,类型与 CIDR 族不匹配会跳过索引、回退到表达式过滤;
+- 会话变量 `inverted_index_skip_threshold` 默认 `50`:当一个 segment 中超过 50%
的行命中索引时会跳过索引、直接读取(bypass)。
+
+:::caution CIDR 范围查询需要调低 `inverted_index_skip_threshold`
+对 CIDR 范围查询,`is_ip_address_in_range()` 内部会拆成 `>= 网段起始` 和 `<= 网段结束` 两次 BKD
查询,**单边各自命中的行数往往都超过 50%**(即便最终交集很窄),从而触发 bypass、跳过索引。因此这类查询需要在会话里设 `SET
inverted_index_skip_threshold = 0;`(或较小值)才能稳定命中索引。等值 / `IN` 是高选择性点查,不受此影响。
+:::
+
### keywords
IPV6
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ip-address-in-range.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ip-address-in-range.md
index e5368af7726..523ec29a8f7 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ip-address-in-range.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ip-address-in-range.md
@@ -21,7 +21,7 @@ IS_IP_ADDRESS_IN_RANGE(<ip_address>, <cidr_range>)
- `<cidr_range>`:CIDR 网段范围(字符串格式,如 "192.168.1.0/24")
### 返回值
-返回类型:TINYINT
+返回类型:BOOLEAN
返回值含义:
- 返回 1:表示 IP 地址在指定的 CIDR 范围内
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-compat.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-compat.md
index 49532a2a08c..aef5fb5c3af 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-compat.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-compat.md
@@ -20,7 +20,7 @@ IS_IPV4_COMPAT(<ipv6_address>)
- `<ipv6_address>`:IPv6 地址的二进制表示(VARCHAR 类型,16 字节)
### 返回值
-返回类型:TINYINT
+返回类型:BOOLEAN
返回值含义:1 表示是 IPv4 兼容地址,0 表示不是 IPv4 兼容地址
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-mapped.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-mapped.md
index 3851d49694a..1af7828c4f0 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-mapped.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-mapped.md
@@ -20,7 +20,7 @@ IS_IPV4_MAPPED(<ipv6_address>)
- `<ipv6_address>`:IPv6 地址的二进制表示(VARCHAR 类型,16 字节)
### 返回值
-返回类型:TINYINT
+返回类型:BOOLEAN
返回值含义:1 表示是 IPv4 映射地址,0 表示不是 IPv4 映射地址
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-string.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-string.md
index 9dc9e15e5df..0d3471f62b5 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-string.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-string.md
@@ -23,7 +23,7 @@ IS_IPV4_STRING(<ipv4_str>)
- `<ipv4_str>`:要检查的字符串
### 返回值
-返回类型:TINYINT
+返回类型:BOOLEAN
返回值含义:
- 返回 1:表示输入是有效的 IPv4 地址格式
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv6-string.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv6-string.md
index 02a0771e70b..7f3107dd567 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv6-string.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv6-string.md
@@ -23,7 +23,7 @@ IS_IPV6_STRING(<ipv6_str>)
- `<ipv6_str>`:要检查的字符串
### 返回值
-返回类型:TINYINT
+返回类型:BOOLEAN
返回值含义:
- 返回 1:表示输入是有效的 IPv6 地址格式
diff --git
a/ja-source/docusaurus-plugin-content-docs/current/sql-manual/basic-element/sql-data-types/ip/IPV6.md
b/ja-source/docusaurus-plugin-content-docs/current/sql-manual/basic-element/sql-data-types/ip/IPV6.md
index 4c1d57e8a4e..f396485e701 100644
---
a/ja-source/docusaurus-plugin-content-docs/current/sql-manual/basic-element/sql-data-types/ip/IPV6.md
+++
b/ja-source/docusaurus-plugin-content-docs/current/sql-manual/basic-element/sql-data-types/ip/IPV6.md
@@ -49,6 +49,152 @@ mysql> select * from ipv6_test order by id;
| 4 | NULL |
+------+-----------------------------------------+
```
+## ベストプラクティス:IPv4/IPv6 混在を 1 つの IPV6 列に統一格納し転置インデックスを利用する
+
+1 つの `IPV6` 列に IPv4 と IPv6 の両方のアドレスを格納できます。IPv4 は IPv4-mapped
IPv6(`::ffff:x.y.z.w`)形式に正規化して書き込み(例:`192.168.1.1` は `::ffff:192.168.1.1`
として格納)、ネイティブ IPv6(例:`2001:16a0:2:200a::2`)はそのまま書き込みます。これにより混在 IP を 1
列で統一管理でき、その列に転置インデックスを張ることで IP フィルタリングを高速化し、等価(`=`)、`IN`、および
`is_ip_address_in_range()` などの CIDR 関数がインデックスを利用できます。
+
+### スキーマ
+
+```sql
+CREATE TABLE access_log (
+ `id` BIGINT NOT NULL,
+ `request_time` DATETIME NOT NULL,
+ `client_ip` IPV6 NULL COMMENT 'IPv6 として統一格納。IPv4 は ::ffff: 形式で格納',
+ INDEX idx_client_ip (`client_ip`) USING INVERTED
+) ENGINE=OLAP
+DUPLICATE KEY(`id`, `request_time`)
+DISTRIBUTED BY HASH(`id`) BUCKETS 4
+PROPERTIES (
+ "replication_allocation" = "tag.location.default: 1"
+);
+```
+
+### ロード時の正規化
+
+どのロード方式でも、要点は列マッピングで同じ式を使って生の IP 文字列を IPv6 に変換することです(以下、`ip_str`
がソースフィールド、`client_ip` がテーブルの `IPV6` 列):
+
+```sql
+client_ip = COALESCE(
+ ipv4_to_ipv6(to_ipv4_or_null(ip_str)),
+ to_ipv6_or_null(ip_str)
+)
+```
+
+- 通常の IPv4 文字列はまず `to_ipv4_or_null()` で `IPV4` に解析し、`ipv4_to_ipv6()` で
`::ffff:x.y.z.w` に変換します;
+- 有効な IPv4 でない場合、`COALESCE` は `to_ipv6_or_null()` にフォールバックし、ネイティブ IPv6
テキストとして解析します;
+- 両方失敗した場合は `NULL` が書き込まれます。不正なデータをエラーにしたい場合は、`_or_null` を厳密な `to_ipv4` /
`to_ipv6` に置き換え、ロードのエラーしきい値で処理します。
+
+#### Routine Load
+
+```sql
+CREATE ROUTINE LOAD example_db.access_log_job ON access_log
+COLUMNS TERMINATED BY ",",
+COLUMNS(
+ id, request_time, ip_str,
+ client_ip = COALESCE(
+ ipv4_to_ipv6(to_ipv4_or_null(ip_str)),
+ to_ipv6_or_null(ip_str)
+ )
+)
+PROPERTIES ("desired_concurrent_number" = "3", "max_error_number" = "1000")
+FROM KAFKA ("kafka_broker_list" = "broker:9092", "kafka_topic" = "ip_logs");
+```
+
+#### Stream Load
+
+`columns` ヘッダーで同じ変換を行います:
+
+```shell
+curl --location-trusted -u user:passwd \
+ -H "format: csv" \
+ -H "column_separator: ," \
+ -H "columns: id, request_time, ip_str,
client_ip=COALESCE(ipv4_to_ipv6(to_ipv4_or_null(ip_str)),
to_ipv6_or_null(ip_str))" \
+ -T data.csv \
+ "http://<fe_host>:8030/api/example_db/access_log/_stream_load"
+```
+
+#### INSERT INTO(旧 VARCHAR テーブルからの移行)
+
+`SELECT` で生の IP 文字列を IPv6 に変換します(`ip_str` は旧テーブルの VARCHAR 列):
+
+```sql
+INSERT INTO access_log (id, request_time, client_ip)
+SELECT
+ id, request_time,
+ COALESCE(
+ ipv4_to_ipv6(to_ipv4_or_null(ip_str)),
+ to_ipv6_or_null(ip_str)
+ )
+FROM access_log_varchar;
+```
+
+### クエリの書き換え(重要)
+
+すべてが mapped IPv6 として格納された後は、**IPv4 のクエリ条件も mapped IPv6
形式に書き換える必要があります**。そうしないと型/アドレスファミリの不一致でインデックスがスキップされます:
+
+- **IPv6 CIDR**:そのまま記述します。
+
+ ```sql
+ is_ip_address_in_range(client_ip, '2001:4860:4801::/48')
+ ```
+
+- **IPv4 CIDR → mapped IPv6 CIDR**:アドレスに `::ffff:` プレフィックスを付け、プレフィックス長に
**+96** します(mapped アドレスの先頭 96 ビットは固定プレフィックス)。
+
+ | IPv4 CIDR | mapped IPv6 CIDR |
+ | ------------------ | --------------------------- |
+ | `10.42.0.0/16` | `::ffff:10.42.0.0/112` |
+ | `192.178.4.0/24` | `::ffff:192.178.4.0/120` |
+ | `3.219.120.76/32` | `::ffff:3.219.120.76/128` |
+
+ ```sql
+ is_ip_address_in_range(client_ip, '::ffff:10.42.0.0/112')
+ ```
+
+- **IPv4 の完全一致**:mapped 値と比較します。
+
+ ```sql
+ client_ip = ipv4_to_ipv6(to_ipv4('3.219.120.76'))
+ -- または
+ client_ip = to_ipv6('::ffff:3.219.120.76')
+ ```
+
+- **許可リスト(範囲に該当する行のみ残す)**:`is_ip_address_in_range(...)` をそのままフィルタとして使用します。
+
+ ```sql
+ WHERE is_ip_address_in_range(client_ip, '::ffff:10.42.0.0/112')
+ ```
+
+- **拒否リスト(範囲に該当する行を除外する)**:`NOT is_ip_address_in_range(...)`
を使用し、`is_ip_address_in_range(...) = 0` という書き方は避けます(インデックスへのプッシュダウンが難しくなります)。
+
+ ```sql
+ WHERE NOT is_ip_address_in_range(client_ip, '::ffff:10.42.0.0/112')
+ ```
+
+完全な例(いくつかの範囲を除外する集計)。CIDR 範囲条件を含むため、実行前にしきい値を下げてインデックスの利用を確実にします:
+
+```sql
+SET inverted_index_skip_threshold = 0;
+
+SELECT client_ip, COUNT(*) AS cnt
+FROM access_log
+WHERE request_time BETWEEN '2026-06-01 00:00:00' AND '2026-06-02 00:00:00'
+ AND client_ip NOT IN (to_ipv6('::ffff:3.219.120.76')) -- IPv4
完全一致の除外 -> mapped
+ AND NOT is_ip_address_in_range(client_ip, '::ffff:10.42.0.0/112') -- IPv4
/16 -> /112
+ AND NOT is_ip_address_in_range(client_ip, '2001:4860:4801::/48') -- IPv6
範囲
+GROUP BY client_ip;
+```
+
+### インデックスを利用する前提条件
+
+- 列はネイティブ `IPV4` / `IPV6` 型で、`USING INVERTED` の転置インデックスが張られている必要があります;
+- `is_ip_address_in_range()` の CIDR 引数は**定数文字列**である必要があります;
+- `IPV6` 列は IPv6 CIDR のみを処理し(IPv4 範囲は先に `::ffff:` mapped
形式に変換する必要があります)、`IPV4` 列は IPv4 CIDR のみを処理します。型と CIDR
ファミリが一致しない場合はインデックスをスキップして式フィルタにフォールバックします;
+- セッション変数 `inverted_index_skip_threshold` のデフォルトは `50`:セグメント内で 50%
を超える行がインデックスにヒットすると、インデックスをスキップして直接読み取ります(bypass)。
+
+:::caution CIDR 範囲クエリは `inverted_index_skip_threshold` を下げる必要があります
+CIDR 範囲クエリでは、`is_ip_address_in_range()` は内部で `>= 範囲開始` と `<= 範囲終了` の 2 回の BKD
クエリに分割されます。**片側それぞれのヒット行数が 50% を超えることが多く**(最終的な積集合が狭くても)、bypass
がトリガーされインデックスがスキップされます。そのため、このようなクエリではセッションで `SET inverted_index_skip_threshold
= 0;`(またはより小さい値)を設定して初めてインデックスを安定して利用できます。等価 / `IN`
は高選択性のポイントルックアップであり、この影響を受けません。
+:::
+
### keywords
IPV6
diff --git
a/ja-source/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ip-address-in-range.md
b/ja-source/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ip-address-in-range.md
index 91204c482d9..1d7b06b54a9 100644
---
a/ja-source/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ip-address-in-range.md
+++
b/ja-source/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ip-address-in-range.md
@@ -20,7 +20,7 @@ IS_IP_ADDRESS_IN_RANGE(<ip_address>, <cidr_range>)
- `<cidr_range>`: CIDRネットワーク範囲("192.168.1.0/24"などの文字列形式)
### 戻り値
-戻り値の型: TINYINT
+戻り値の型: BOOLEAN
戻り値の意味:
- 1を返す: IPアドレスが指定されたCIDR範囲内にあることを示す
diff --git
a/ja-source/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-compat.md
b/ja-source/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-compat.md
index dacb6d56141..086c1f4a42d 100644
---
a/ja-source/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-compat.md
+++
b/ja-source/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-compat.md
@@ -19,7 +19,7 @@ IS_IPV4_COMPAT(<ipv6_address>)
- `<ipv6_address>`: IPv6アドレスのバイナリ表現(VARCHAR型、16バイト)
### 戻り値
-戻り値の型: TINYINT
+戻り値の型: BOOLEAN
戻り値の意味: 1はIPv4互換アドレスであることを示し、0はIPv4互換アドレスではないことを示す
diff --git
a/ja-source/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-mapped.md
b/ja-source/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-mapped.md
index 3a258f72cd4..adfefcb89fe 100644
---
a/ja-source/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-mapped.md
+++
b/ja-source/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-mapped.md
@@ -19,7 +19,7 @@ IS_IPV4_MAPPED(<ipv6_address>)
- `<ipv6_address>`: IPv6アドレスのバイナリ表現(VARCHAR型、16バイト)
### 戻り値
-戻り値の型: TINYINT
+戻り値の型: BOOLEAN
戻り値の意味: 1はIPv4マップアドレスであることを示し、0はIPv4マップアドレスではないことを示す
diff --git
a/ja-source/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-string.md
b/ja-source/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-string.md
index ea51632b720..aeb4b050167 100644
---
a/ja-source/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-string.md
+++
b/ja-source/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-string.md
@@ -22,7 +22,7 @@ IS_IPV4_STRING(<ipv4_str>)
- `<ipv4_str>`: チェックする文字列
### 戻り値
-戻り値の型: TINYINT
+戻り値の型: BOOLEAN
戻り値の意味:
- 1を返す: 入力が有効なIPv4アドレス形式であることを示す
diff --git
a/ja-source/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv6-string.md
b/ja-source/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv6-string.md
index 0efef049df7..05bb6343b30 100644
---
a/ja-source/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv6-string.md
+++
b/ja-source/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv6-string.md
@@ -22,7 +22,7 @@ IS_IPV6_STRING(<ipv6_str>)
- `<ipv6_str>`: チェックする文字列
### 戻り値
-戻り値の型: TINYINT
+戻り値の型: BOOLEAN
戻り値の意味:
- 1を返す: 入力が有効なIPv6アドレス形式であることを示す
diff --git
a/ja-source/docusaurus-plugin-content-docs/version-4.x/sql-manual/basic-element/sql-data-types/ip/IPV6.md
b/ja-source/docusaurus-plugin-content-docs/version-4.x/sql-manual/basic-element/sql-data-types/ip/IPV6.md
index 0fa714fdf22..86dd2e4627d 100644
---
a/ja-source/docusaurus-plugin-content-docs/version-4.x/sql-manual/basic-element/sql-data-types/ip/IPV6.md
+++
b/ja-source/docusaurus-plugin-content-docs/version-4.x/sql-manual/basic-element/sql-data-types/ip/IPV6.md
@@ -49,6 +49,152 @@ mysql> select * from ipv6_test order by id;
| 4 | NULL |
+------+-----------------------------------------+
```
+## ベストプラクティス:IPv4/IPv6 混在を 1 つの IPV6 列に統一格納し転置インデックスを利用する
+
+1 つの `IPV6` 列に IPv4 と IPv6 の両方のアドレスを格納できます。IPv4 は IPv4-mapped
IPv6(`::ffff:x.y.z.w`)形式に正規化して書き込み(例:`192.168.1.1` は `::ffff:192.168.1.1`
として格納)、ネイティブ IPv6(例:`2001:16a0:2:200a::2`)はそのまま書き込みます。これにより混在 IP を 1
列で統一管理でき、その列に転置インデックスを張ることで IP フィルタリングを高速化し、等価(`=`)、`IN`、および
`is_ip_address_in_range()` などの CIDR 関数がインデックスを利用できます。
+
+### スキーマ
+
+```sql
+CREATE TABLE access_log (
+ `id` BIGINT NOT NULL,
+ `request_time` DATETIME NOT NULL,
+ `client_ip` IPV6 NULL COMMENT 'IPv6 として統一格納。IPv4 は ::ffff: 形式で格納',
+ INDEX idx_client_ip (`client_ip`) USING INVERTED
+) ENGINE=OLAP
+DUPLICATE KEY(`id`, `request_time`)
+DISTRIBUTED BY HASH(`id`) BUCKETS 4
+PROPERTIES (
+ "replication_allocation" = "tag.location.default: 1"
+);
+```
+
+### ロード時の正規化
+
+どのロード方式でも、要点は列マッピングで同じ式を使って生の IP 文字列を IPv6 に変換することです(以下、`ip_str`
がソースフィールド、`client_ip` がテーブルの `IPV6` 列):
+
+```sql
+client_ip = COALESCE(
+ ipv4_to_ipv6(to_ipv4_or_null(ip_str)),
+ to_ipv6_or_null(ip_str)
+)
+```
+
+- 通常の IPv4 文字列はまず `to_ipv4_or_null()` で `IPV4` に解析し、`ipv4_to_ipv6()` で
`::ffff:x.y.z.w` に変換します;
+- 有効な IPv4 でない場合、`COALESCE` は `to_ipv6_or_null()` にフォールバックし、ネイティブ IPv6
テキストとして解析します;
+- 両方失敗した場合は `NULL` が書き込まれます。不正なデータをエラーにしたい場合は、`_or_null` を厳密な `to_ipv4` /
`to_ipv6` に置き換え、ロードのエラーしきい値で処理します。
+
+#### Routine Load
+
+```sql
+CREATE ROUTINE LOAD example_db.access_log_job ON access_log
+COLUMNS TERMINATED BY ",",
+COLUMNS(
+ id, request_time, ip_str,
+ client_ip = COALESCE(
+ ipv4_to_ipv6(to_ipv4_or_null(ip_str)),
+ to_ipv6_or_null(ip_str)
+ )
+)
+PROPERTIES ("desired_concurrent_number" = "3", "max_error_number" = "1000")
+FROM KAFKA ("kafka_broker_list" = "broker:9092", "kafka_topic" = "ip_logs");
+```
+
+#### Stream Load
+
+`columns` ヘッダーで同じ変換を行います:
+
+```shell
+curl --location-trusted -u user:passwd \
+ -H "format: csv" \
+ -H "column_separator: ," \
+ -H "columns: id, request_time, ip_str,
client_ip=COALESCE(ipv4_to_ipv6(to_ipv4_or_null(ip_str)),
to_ipv6_or_null(ip_str))" \
+ -T data.csv \
+ "http://<fe_host>:8030/api/example_db/access_log/_stream_load"
+```
+
+#### INSERT INTO(旧 VARCHAR テーブルからの移行)
+
+`SELECT` で生の IP 文字列を IPv6 に変換します(`ip_str` は旧テーブルの VARCHAR 列):
+
+```sql
+INSERT INTO access_log (id, request_time, client_ip)
+SELECT
+ id, request_time,
+ COALESCE(
+ ipv4_to_ipv6(to_ipv4_or_null(ip_str)),
+ to_ipv6_or_null(ip_str)
+ )
+FROM access_log_varchar;
+```
+
+### クエリの書き換え(重要)
+
+すべてが mapped IPv6 として格納された後は、**IPv4 のクエリ条件も mapped IPv6
形式に書き換える必要があります**。そうしないと型/アドレスファミリの不一致でインデックスがスキップされます:
+
+- **IPv6 CIDR**:そのまま記述します。
+
+ ```sql
+ is_ip_address_in_range(client_ip, '2001:4860:4801::/48')
+ ```
+
+- **IPv4 CIDR → mapped IPv6 CIDR**:アドレスに `::ffff:` プレフィックスを付け、プレフィックス長に
**+96** します(mapped アドレスの先頭 96 ビットは固定プレフィックス)。
+
+ | IPv4 CIDR | mapped IPv6 CIDR |
+ | ------------------ | --------------------------- |
+ | `10.42.0.0/16` | `::ffff:10.42.0.0/112` |
+ | `192.178.4.0/24` | `::ffff:192.178.4.0/120` |
+ | `3.219.120.76/32` | `::ffff:3.219.120.76/128` |
+
+ ```sql
+ is_ip_address_in_range(client_ip, '::ffff:10.42.0.0/112')
+ ```
+
+- **IPv4 の完全一致**:mapped 値と比較します。
+
+ ```sql
+ client_ip = ipv4_to_ipv6(to_ipv4('3.219.120.76'))
+ -- または
+ client_ip = to_ipv6('::ffff:3.219.120.76')
+ ```
+
+- **許可リスト(範囲に該当する行のみ残す)**:`is_ip_address_in_range(...)` をそのままフィルタとして使用します。
+
+ ```sql
+ WHERE is_ip_address_in_range(client_ip, '::ffff:10.42.0.0/112')
+ ```
+
+- **拒否リスト(範囲に該当する行を除外する)**:`NOT is_ip_address_in_range(...)`
を使用し、`is_ip_address_in_range(...) = 0` という書き方は避けます(インデックスへのプッシュダウンが難しくなります)。
+
+ ```sql
+ WHERE NOT is_ip_address_in_range(client_ip, '::ffff:10.42.0.0/112')
+ ```
+
+完全な例(いくつかの範囲を除外する集計)。CIDR 範囲条件を含むため、実行前にしきい値を下げてインデックスの利用を確実にします:
+
+```sql
+SET inverted_index_skip_threshold = 0;
+
+SELECT client_ip, COUNT(*) AS cnt
+FROM access_log
+WHERE request_time BETWEEN '2026-06-01 00:00:00' AND '2026-06-02 00:00:00'
+ AND client_ip NOT IN (to_ipv6('::ffff:3.219.120.76')) -- IPv4
完全一致の除外 -> mapped
+ AND NOT is_ip_address_in_range(client_ip, '::ffff:10.42.0.0/112') -- IPv4
/16 -> /112
+ AND NOT is_ip_address_in_range(client_ip, '2001:4860:4801::/48') -- IPv6
範囲
+GROUP BY client_ip;
+```
+
+### インデックスを利用する前提条件
+
+- 列はネイティブ `IPV4` / `IPV6` 型で、`USING INVERTED` の転置インデックスが張られている必要があります;
+- `is_ip_address_in_range()` の CIDR 引数は**定数文字列**である必要があります;
+- `IPV6` 列は IPv6 CIDR のみを処理し(IPv4 範囲は先に `::ffff:` mapped
形式に変換する必要があります)、`IPV4` 列は IPv4 CIDR のみを処理します。型と CIDR
ファミリが一致しない場合はインデックスをスキップして式フィルタにフォールバックします;
+- セッション変数 `inverted_index_skip_threshold` のデフォルトは `50`:セグメント内で 50%
を超える行がインデックスにヒットすると、インデックスをスキップして直接読み取ります(bypass)。
+
+:::caution CIDR 範囲クエリは `inverted_index_skip_threshold` を下げる必要があります
+CIDR 範囲クエリでは、`is_ip_address_in_range()` は内部で `>= 範囲開始` と `<= 範囲終了` の 2 回の BKD
クエリに分割されます。**片側それぞれのヒット行数が 50% を超えることが多く**(最終的な積集合が狭くても)、bypass
がトリガーされインデックスがスキップされます。そのため、このようなクエリではセッションで `SET inverted_index_skip_threshold
= 0;`(またはより小さい値)を設定して初めてインデックスを安定して利用できます。等価 / `IN`
は高選択性のポイントルックアップであり、この影響を受けません。
+:::
+
### keywords
IPV6
diff --git
a/ja-source/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ip-address-in-range.md
b/ja-source/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ip-address-in-range.md
index 9e3b99b12cc..f726667bfc7 100644
---
a/ja-source/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ip-address-in-range.md
+++
b/ja-source/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ip-address-in-range.md
@@ -20,7 +20,7 @@ IS_IP_ADDRESS_IN_RANGE(<ip_address>, <cidr_range>)
- `<cidr_range>`: CIDRネットワーク範囲(文字列形式、例:"192.168.1.0/24")
### 戻り値
-戻り値の型:TINYINT
+戻り値の型:BOOLEAN
戻り値の意味:
- 1を返す:IPアドレスが指定されたCIDR範囲内にあることを示す
diff --git
a/ja-source/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-compat.md
b/ja-source/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-compat.md
index 05d7f475402..ba87c0edbe8 100644
---
a/ja-source/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-compat.md
+++
b/ja-source/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-compat.md
@@ -19,7 +19,7 @@ IS_IPV4_COMPAT(<ipv6_address>)
- `<ipv6_address>`: IPv6アドレスのバイナリ表現(VARCHAR型、16バイト)
### 戻り値
-戻り値の型: TINYINT
+戻り値の型: BOOLEAN
戻り値の意味: 1はIPv4互換アドレスであることを示し、0はIPv4互換アドレスでないことを示します
diff --git
a/ja-source/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-mapped.md
b/ja-source/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-mapped.md
index 923231ac3a3..fd99870adb2 100644
---
a/ja-source/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-mapped.md
+++
b/ja-source/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-mapped.md
@@ -19,7 +19,7 @@ IS_IPV4_MAPPED(<ipv6_address>)
- `<ipv6_address>`: IPv6アドレスのバイナリ表現(VARCHARタイプ、16バイト)
### 戻り値
-戻り値の型: TINYINT
+戻り値の型: BOOLEAN
戻り値の意味: 1はIPv4マップアドレスであることを示し、0はIPv4マップアドレスではないことを示す
diff --git
a/ja-source/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-string.md
b/ja-source/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-string.md
index 3323c27aa5e..0a51b721160 100644
---
a/ja-source/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-string.md
+++
b/ja-source/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-string.md
@@ -22,7 +22,7 @@ IS_IPV4_STRING(<ipv4_str>)
- `<ipv4_str>`: チェックする文字列
### 戻り値
-戻り値の型: TINYINT
+戻り値の型: BOOLEAN
戻り値の意味:
- 1を返す: 入力が有効なIPv4アドレス形式であることを示す
diff --git
a/ja-source/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv6-string.md
b/ja-source/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv6-string.md
index e1d0bfb6e05..b87def0f724 100644
---
a/ja-source/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv6-string.md
+++
b/ja-source/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv6-string.md
@@ -22,7 +22,7 @@ IS_IPV6_STRING(<ipv6_str>)
- `<ipv6_str>`: チェック対象の文字列
### 戻り値
-戻り値の型: TINYINT
+戻り値の型: BOOLEAN
戻り値の意味:
- 1を返す: 入力が有効なIPv6アドレス形式であることを示す
diff --git
a/versioned_docs/version-4.x/sql-manual/basic-element/sql-data-types/ip/IPV6.md
b/versioned_docs/version-4.x/sql-manual/basic-element/sql-data-types/ip/IPV6.md
index 681023f4bfd..44ce6563fd3 100644
---
a/versioned_docs/version-4.x/sql-manual/basic-element/sql-data-types/ip/IPV6.md
+++
b/versioned_docs/version-4.x/sql-manual/basic-element/sql-data-types/ip/IPV6.md
@@ -53,6 +53,152 @@ mysql> select * from ipv6_test order by id;
+------+-----------------------------------------+
```
+## Best Practice: store mixed IPv4/IPv6 in a single IPV6 column with an
inverted index
+
+A single `IPV6` column can store both IPv4 and IPv6 addresses: write IPv4 in
its IPv4-mapped IPv6 form (`::ffff:x.y.z.w`) — for example `192.168.1.1` is
stored as `::ffff:192.168.1.1` — while native IPv6 (such as
`2001:16a0:2:200a::2`) is written as-is. This lets a mixed-IP field be managed
in one column, and an inverted index on that column accelerates IP filtering,
supporting equality (`=`), `IN`, and CIDR functions such as
`is_ip_address_in_range()`.
+
+### Schema
+
+```sql
+CREATE TABLE access_log (
+ `id` BIGINT NOT NULL,
+ `request_time` DATETIME NOT NULL,
+ `client_ip` IPV6 NULL COMMENT 'Stored uniformly as IPv6; IPv4 stored in
::ffff: form',
+ INDEX idx_client_ip (`client_ip`) USING INVERTED
+) ENGINE=OLAP
+DUPLICATE KEY(`id`, `request_time`)
+DISTRIBUTED BY HASH(`id`) BUCKETS 4
+PROPERTIES (
+ "replication_allocation" = "tag.location.default: 1"
+);
+```
+
+### Normalizing on load
+
+Regardless of the load method, the key is to use the same expression in the
column mapping to convert the raw IP string into IPv6 (below, `ip_str` is the
source field and `client_ip` is the table's `IPV6` column):
+
+```sql
+client_ip = COALESCE(
+ ipv4_to_ipv6(to_ipv4_or_null(ip_str)),
+ to_ipv6_or_null(ip_str)
+)
+```
+
+- A plain IPv4 string is first parsed to `IPV4` by `to_ipv4_or_null()`, then
converted to `::ffff:x.y.z.w` by `ipv4_to_ipv6()`;
+- If it is not a valid IPv4, `COALESCE` falls back to `to_ipv6_or_null()` and
parses it as native IPv6 text;
+- If both fail, `NULL` is written. To reject invalid data instead, replace
`_or_null` with the strict `to_ipv4` / `to_ipv6` and handle it via the load
error threshold.
+
+#### Routine Load
+
+```sql
+CREATE ROUTINE LOAD example_db.access_log_job ON access_log
+COLUMNS TERMINATED BY ",",
+COLUMNS(
+ id, request_time, ip_str,
+ client_ip = COALESCE(
+ ipv4_to_ipv6(to_ipv4_or_null(ip_str)),
+ to_ipv6_or_null(ip_str)
+ )
+)
+PROPERTIES ("desired_concurrent_number" = "3", "max_error_number" = "1000")
+FROM KAFKA ("kafka_broker_list" = "broker:9092", "kafka_topic" = "ip_logs");
+```
+
+#### Stream Load
+
+Apply the same conversion in the `columns` header:
+
+```shell
+curl --location-trusted -u user:passwd \
+ -H "format: csv" \
+ -H "column_separator: ," \
+ -H "columns: id, request_time, ip_str,
client_ip=COALESCE(ipv4_to_ipv6(to_ipv4_or_null(ip_str)),
to_ipv6_or_null(ip_str))" \
+ -T data.csv \
+ "http://<fe_host>:8030/api/example_db/access_log/_stream_load"
+```
+
+#### INSERT INTO (migrating from an old VARCHAR table)
+
+Convert the raw IP string to IPv6 in the `SELECT` (`ip_str` is the VARCHAR
column of the old table):
+
+```sql
+INSERT INTO access_log (id, request_time, client_ip)
+SELECT
+ id, request_time,
+ COALESCE(
+ ipv4_to_ipv6(to_ipv4_or_null(ip_str)),
+ to_ipv6_or_null(ip_str)
+ )
+FROM access_log_varchar;
+```
+
+### Rewriting queries (important)
+
+Once everything is stored as mapped IPv6, **IPv4 query conditions must also be
rewritten into mapped IPv6 form**; otherwise a type / address-family mismatch
causes the index to be skipped:
+
+- **IPv6 CIDR**: write it directly.
+
+ ```sql
+ is_ip_address_in_range(client_ip, '2001:4860:4801::/48')
+ ```
+
+- **IPv4 CIDR → mapped IPv6 CIDR**: add the `::ffff:` prefix to the address
and add **+96** to the prefix length (the first 96 bits of a mapped address are
a fixed prefix).
+
+ | IPv4 CIDR | mapped IPv6 CIDR |
+ | ------------------ | --------------------------- |
+ | `10.42.0.0/16` | `::ffff:10.42.0.0/112` |
+ | `192.178.4.0/24` | `::ffff:192.178.4.0/120` |
+ | `3.219.120.76/32` | `::ffff:3.219.120.76/128` |
+
+ ```sql
+ is_ip_address_in_range(client_ip, '::ffff:10.42.0.0/112')
+ ```
+
+- **Exact IPv4 match**: compare against the mapped value too.
+
+ ```sql
+ client_ip = ipv4_to_ipv6(to_ipv4('3.219.120.76'))
+ -- or
+ client_ip = to_ipv6('::ffff:3.219.120.76')
+ ```
+
+- **Allowlist (keep only rows in the range)**: use
`is_ip_address_in_range(...)` directly as the filter.
+
+ ```sql
+ WHERE is_ip_address_in_range(client_ip, '::ffff:10.42.0.0/112')
+ ```
+
+- **Blocklist (exclude rows in the range)**: use `NOT
is_ip_address_in_range(...)`, and avoid the `is_ip_address_in_range(...) = 0`
form (which is harder to push down to the index).
+
+ ```sql
+ WHERE NOT is_ip_address_in_range(client_ip, '::ffff:10.42.0.0/112')
+ ```
+
+Full example (a stat query that excludes several ranges). Because it contains
CIDR range conditions, lower the threshold first to ensure the index is used:
+
+```sql
+SET inverted_index_skip_threshold = 0;
+
+SELECT client_ip, COUNT(*) AS cnt
+FROM access_log
+WHERE request_time BETWEEN '2026-06-01 00:00:00' AND '2026-06-02 00:00:00'
+ AND client_ip NOT IN (to_ipv6('::ffff:3.219.120.76')) -- exact
IPv4 exclusion -> mapped
+ AND NOT is_ip_address_in_range(client_ip, '::ffff:10.42.0.0/112') -- IPv4
/16 -> /112
+ AND NOT is_ip_address_in_range(client_ip, '2001:4860:4801::/48') -- IPv6
range
+GROUP BY client_ip;
+```
+
+### Conditions for hitting the index
+
+- The column must be a native `IPV4` / `IPV6` type with a `USING INVERTED`
inverted index built on it;
+- The CIDR argument of `is_ip_address_in_range()` must be a **constant
string**;
+- An `IPV6` column only handles IPv6 CIDRs (IPv4 ranges must first be
converted to `::ffff:` mapped form), and an `IPV4` column only handles IPv4
CIDRs; a type / CIDR-family mismatch skips the index and falls back to
expression filtering;
+- The session variable `inverted_index_skip_threshold` defaults to `50`: when
more than 50% of the rows in a segment match the index, the index is skipped
and data is read directly (bypass).
+
+:::caution CIDR range queries need a lower `inverted_index_skip_threshold`
+For a CIDR range query, `is_ip_address_in_range()` is internally split into
two BKD queries, `>= range start` and `<= range end`. **Each side alone often
matches more than 50% of the rows** (even when the final intersection is
narrow), triggering bypass and skipping the index. Such queries therefore need
`SET inverted_index_skip_threshold = 0;` (or a smaller value) in the session to
hit the index reliably. Equality / `IN` are high-selectivity point lookups and
are unaffected.
+:::
+
### keywords
IPV6
diff --git
a/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ip-address-in-range.md
b/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ip-address-in-range.md
index 9f03b560237..e588e84d966 100644
---
a/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ip-address-in-range.md
+++
b/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ip-address-in-range.md
@@ -19,7 +19,7 @@ IS_IP_ADDRESS_IN_RANGE(<ip_address>, <cidr_range>)
- `<cidr_range>`: CIDR network range (string format, such as "192.168.1.0/24")
### Return Value
-Return Type: TINYINT
+Return Type: BOOLEAN
Return Value Meaning:
- Returns 1: indicates the IP address is within the specified CIDR range
diff --git
a/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-compat.md
b/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-compat.md
index 3e5526c0883..57c49dd1488 100644
---
a/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-compat.md
+++
b/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-compat.md
@@ -18,7 +18,7 @@ IS_IPV4_COMPAT(<ipv6_address>)
- `<ipv6_address>`: Binary representation of IPv6 address (VARCHAR type, 16
bytes)
### Return Value
-Return Type: TINYINT
+Return Type: BOOLEAN
Return Value Meaning: 1 indicates it is an IPv4-compatible address, 0
indicates it is not an IPv4-compatible address
diff --git
a/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-mapped.md
b/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-mapped.md
index 7d307a6cf08..76fd1988838 100644
---
a/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-mapped.md
+++
b/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-mapped.md
@@ -18,7 +18,7 @@ IS_IPV4_MAPPED(<ipv6_address>)
- `<ipv6_address>`: Binary representation of IPv6 address (VARCHAR type, 16
bytes)
### Return Value
-Return Type: TINYINT
+Return Type: BOOLEAN
Return Value Meaning: 1 indicates it is an IPv4-mapped address, 0 indicates it
is not an IPv4-mapped address
diff --git
a/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-string.md
b/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-string.md
index 0ecc48b7088..67cdce74f02 100644
---
a/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-string.md
+++
b/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv4-string.md
@@ -21,7 +21,7 @@ IS_IPV4_STRING(<ipv4_str>)
- `<ipv4_str>`: String to check
### Return Value
-Return Type: TINYINT
+Return Type: BOOLEAN
Return Value Meaning:
- Returns 1: indicates the input is a valid IPv4 address format
diff --git
a/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv6-string.md
b/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv6-string.md
index 2576f6b93e8..e917941b348 100644
---
a/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv6-string.md
+++
b/versioned_docs/version-4.x/sql-manual/sql-functions/scalar-functions/ip-functions/is-ipv6-string.md
@@ -21,7 +21,7 @@ IS_IPV6_STRING(<ipv6_str>)
- `<ipv6_str>`: String to check
### Return Value
-Return Type: TINYINT
+Return Type: BOOLEAN
Return Value Meaning:
- Returns 1: indicates the input is a valid IPv6 address format
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]