This is an automated email from the ASF dual-hosted git repository.
yiguolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
new fdaf557ed32 product quantization (#2998)
fdaf557ed32 is described below
commit fdaf557ed32f1e1cde1f250782f0d96b580be5d3
Author: ivin <[email protected]>
AuthorDate: Fri Oct 24 14:18:53 2025 +0800
product quantization (#2998)
## Versions
- [x] dev
- [ ] 3.x
- [ ] 2.1
- [ ] 2.0
## Languages
- [x] Chinese
- [x] English
## Docs Checklist
- [ ] Checked by AI
- [ ] Test Cases Built
---
docs/ai/vector-search.md | 33 +++++++++++++++++++--
.../current/ai/vector-search.md | 33 +++++++++++++++++++--
.../images/ann-index-quantization-build-time.jpg | Bin 0 -> 131775 bytes
static/images/ann-sq-build-time.png | Bin 50748 -> 0 bytes
4 files changed, 60 insertions(+), 6 deletions(-)
diff --git a/docs/ai/vector-search.md b/docs/ai/vector-search.md
index 50be502b274..54f640a16b5 100644
--- a/docs/ai/vector-search.md
+++ b/docs/ai/vector-search.md
@@ -79,7 +79,9 @@ PROPERTIES (
| `dim` | Yes | Positive integer (> 0) | (none) | Vector dimension. All imported vectors must match or an error is raised. |
| `max_degree` | No | Positive integer | `32` | HNSW M (max neighbors per node). Affects index memory and search performance. |
| `ef_construction` | No | Positive integer | `40` | HNSW efConstruction (candidate queue size during build). Larger gives better quality but slower build. |
-| `quantizer` | No | `flat`, `sq8`, `sq4` | `flat` | Vector encoding/quantization: `flat` = raw; `sq8`/`sq4` = symmetric quantization (8/4 bit) to reduce memory. |
+| `quantizer` | No | `flat`, `sq8`, `sq4`, `pq` | `flat` | Vector encoding/quantization: `flat` = raw; `sq8`/`sq4` = scalar quantization (8/4 bit); `pq` = product quantization to reduce memory. |
+| `pq_m` | Required when `quantizer=pq` | Positive integer | (none) | Number of sub-vectors the original vector is split into (the vector dimension `dim` must be divisible by `pq_m`). |
+| `pq_nbits` | Required when `quantizer=pq` | Positive integer | (none) | Number of bits used to encode each sub-vector; in Faiss, `pq_nbits` is generally required to be no greater than 24. |
Import via S3 TVF:
@@ -283,7 +285,7 @@ Beyond build-time parameters for HNSW, you can pass search-time parameters via s
With FLAT encoding, an HNSW index (raw vectors plus graph structure) may consume large amounts of memory. HNSW must be fully resident in memory to function, so memory can become a bottleneck at large scale.
-Vector quantization compresses float32 storage to reduce memory. Doris currently supports two scalar quantization schemes: INT8 and INT4 (SQ8 / SQ4). Example using SQ8:
+Scalar quantization (SQ) compresses float32 storage to reduce memory. Product quantization (PQ) reduces memory overhead by splitting high-dimensional vectors into sub-vectors and quantizing each sub-vector independently. For scalar quantization, Doris currently supports two schemes: INT8 and INT4 (SQ8 / SQ4). Example using SQ8:
```sql
CREATE TABLE sift_1M (
@@ -314,7 +316,32 @@ On 768-D Cohere-MEDIUM-1M and Cohere-LARGE-10M datasets, SQ8 reduces index size
Quantization introduces extra build-time overhead because each distance computation must decode quantized values. For 128-D vectors, build time increases with row count; SQ vs. FLAT can be up to ~10× slower to build.
-
+Doris also supports product quantization. Note that when using PQ, two additional parameters must be provided (see the storage sketch after this list):
+
+- `pq_m`: the number of sub-vectors the original high-dimensional vector is split into (the vector dimension `dim` must be divisible by `pq_m`).
+- `pq_nbits`: the number of bits used to quantize each sub-vector; it determines the size of each subspace codebook (k = 2^pq_nbits). In Faiss, `pq_nbits` is generally required to be no greater than 24.
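+
+As a back-of-the-envelope sketch of the standard PQ storage model (a rough illustration; the actual index also stores the per-subspace codebooks and the HNSW graph):
+
+```text
+bits per encoded vector       = pq_m * pq_nbits
+codebook entries per subspace = 2 ^ pq_nbits
+
+With the values used in the example below (dim = 128, pq_m = 2, pq_nbits = 2):
+  each vector is split into 2 sub-vectors of 64 dimensions,
+  each subspace codebook holds 2^2 = 4 centroids,
+  and each encoded vector takes 2 * 2 = 4 bits,
+  versus 128 * 4 = 512 bytes for raw float32 storage.
+```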
+
+```sql
+CREATE TABLE sift_1M (
+ id int NOT NULL,
+ embedding array<float> NOT NULL COMMENT "",
+ INDEX ann_index (embedding) USING ANN PROPERTIES(
+ "index_type"="hnsw",
+ "metric_type"="l2_distance",
+ "dim"="128",
+ "quantizer"="pq", -- Specify using PQ for quantization
+ "pq_m"="2", -- Required when using PQ, indicates splitting
high-dimensional vector into pq_m low-dimensional sub-vectors
+ "pq_nbits"="2" -- Required when using PQ, indicates the number of
bits for each subspace codebook
+ )
+) ENGINE=OLAP
+DUPLICATE KEY(id) COMMENT "OLAP"
+DISTRIBUTED BY HASH(id) BUCKETS 1
+PROPERTIES (
+ "replication_num" = "1"
+);
+```
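+
+As a minimal sketch, a top-K search against this table could then be issued as below, assuming the `ORDER BY distance ... LIMIT` pattern used by the query examples earlier on this page; the vector literal is truncated for readability, and a real query must supply all 128 values:
+
+```sql
+-- Hypothetical top-10 nearest-neighbor query against the PQ-indexed table.
+-- Distance function name assumed here; see the search examples earlier in this
+-- document for the exact form used with the ANN index.
+SELECT id,
+       l2_distance(embedding, [0.12, 0.48, 0.91]) AS dist
+FROM sift_1M
+ORDER BY dist
+LIMIT 10;
+```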
+
+
## Performance Tuning
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/ai/vector-search.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/ai/vector-search.md
index 0b4d8a7cc07..3e4bd22c15a 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/ai/vector-search.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/ai/vector-search.md
@@ -71,7 +71,9 @@ PROPERTIES (
| `dim` | Yes | Positive integer (> 0) | (none) | Vector dimension; all subsequently imported vectors must match it, otherwise an error is raised. |
| `max_degree` | No | Positive integer | `32` | Maximum number of neighbors per node (M) in the HNSW graph; affects index memory and search performance. |
| `ef_construction` | No | Positive integer | `40` | Candidate queue size (efConstruction) during the HNSW build phase; larger values give better graph quality but slower builds. |
-| `quantizer` | No | `flat`, `sq8`, `sq4` | `flat` | Vector encoding/quantization: `flat` = raw storage; `sq8`/`sq4` = symmetric quantization (8/4 bit) to reduce memory. |
+| `quantizer` | No | `flat`, `sq8`, `sq4`, `pq` | `flat` | Vector encoding/quantization: `flat` = raw storage; `sq8`/`sq4` = scalar quantization (8/4 bit); `pq` = product quantization. |
+| `pq_m` | Required when `quantizer=pq` | Positive integer | (none) | Number of sub-vectors the original high-dimensional vector is split into (the vector dimension `dim` must be divisible by `pq_m`). |
+| `pq_nbits` | Required when `quantizer=pq` | Positive integer | (none) | Number of bits used to quantize each sub-vector; it determines the size of each subspace codebook (k = 2^pq_nbits). In Faiss, `pq_nbits` is generally required to be no greater than 24. |
Import data via S3 TVF:
```sql
@@ -259,7 +261,7 @@ LIMIT 2;
- hnsw_bounded_queue: whether to use a bounded priority queue to optimize HNSW search performance. Defaults to true.
## Vector Quantization
With FLAT encoding, an HNSW index (raw vectors plus graph structure) may consume large amounts of memory. HNSW must be fully resident in memory to work, so it can easily become a bottleneck on very large datasets.
-Vector quantization reduces memory overhead by compressing FLOAT32. Doris currently supports two scalar quantization schemes: INT8 and INT4 (SQ8 / SQ4). Example using SQ8:
+Scalar quantization (SQ) reduces memory overhead by compressing FLOAT32. Product quantization (PQ) reduces memory overhead by splitting high-dimensional vectors into sub-vectors and quantizing each sub-vector independently. Doris currently supports two scalar quantization schemes: INT8 and INT4 (SQ8 / SQ4). Example using SQ8:
```sql
CREATE TABLE sift_1M (
@@ -290,7 +292,32 @@ PROPERTIES (
Quantization introduces extra build-time overhead because the build phase performs many distance computations, each of which must decode quantized values. For 128-D vectors, build time grows with row count; SQ can introduce roughly 10× the build cost compared with FLAT.
-
+Similarly, Doris also supports product quantization. Note that when using PQ, additional parameters must be provided:
+
+- `pq_m`: the number of sub-vectors the original high-dimensional vector is split into (the vector dimension `dim` must be divisible by `pq_m`).
+- `pq_nbits`: the number of bits used to quantize each sub-vector; it determines the size of each subspace codebook (k = 2^pq_nbits). In Faiss, `pq_nbits` is generally required to be no greater than 24.
+
+```sql
+CREATE TABLE sift_1M (
+ id int NOT NULL,
+ embedding array<float> NOT NULL COMMENT "",
+ INDEX ann_index (embedding) USING ANN PROPERTIES(
+ "index_type"="hnsw",
+ "metric_type"="l2_distance",
+ "dim"="128",
+ "quantizer"="pq", -- 指定使用 PQ 进行量化
+ "pq_m"="2", -- 使用PQ时需要指定, 表示将高维向量分割成 pq_m 个低维子向量
+ "pq_nbits"="2" -- 使用PQ时需要指定, 表示每个子空间码本的比特数
+ )
+) ENGINE=OLAP
+DUPLICATE KEY(id) COMMENT "OLAP"
+DISTRIBUTED BY HASH(id) BUCKETS 1
+PROPERTIES (
+ "replication_num" = "1"
+);
+```
+
+
## Performance Tuning
diff --git a/static/images/ann-index-quantization-build-time.jpg b/static/images/ann-index-quantization-build-time.jpg
new file mode 100755
index 00000000000..e23835c8a3f
Binary files /dev/null and b/static/images/ann-index-quantization-build-time.jpg differ
diff --git a/static/images/ann-sq-build-time.png b/static/images/ann-sq-build-time.png
deleted file mode 100644
index 7588709acf2..00000000000
Binary files a/static/images/ann-sq-build-time.png and /dev/null differ
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]