This is an automated email from the ASF dual-hosted git repository.
yiguolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
new fdaf557ed32 product quantization (#2998)
fdaf557ed32 is described below
commit fdaf557ed32f1e1cde1f250782f0d96b580be5d3
Author: ivin <[email protected]>
AuthorDate: Fri Oct 24 14:18:53 2025 +0800
product quantization (#2998)
## Versions
- [x] dev
- [ ] 3.x
- [ ] 2.1
- [ ] 2.0
## Languages
- [x] Chinese
- [x] English
## Docs Checklist
- [ ] Checked by AI
- [ ] Test Cases Built
---
docs/ai/vector-search.md | 33 +++++++++++++++++++--
.../current/ai/vector-search.md | 33 +++++++++++++++++++--
.../images/ann-index-quantization-build-time.jpg | Bin 0 -> 131775 bytes
static/images/ann-sq-build-time.png | Bin 50748 -> 0 bytes
4 files changed, 60 insertions(+), 6 deletions(-)
diff --git a/docs/ai/vector-search.md b/docs/ai/vector-search.md
index 50be502b274..54f640a16b5 100644
--- a/docs/ai/vector-search.md
+++ b/docs/ai/vector-search.md
@@ -79,7 +79,9 @@ PROPERTIES (
| `dim` | Yes | Positive integer (> 0) | (none) | Vector dimension. All imported vectors must match or an error is raised. |
| `max_degree` | No | Positive integer | `32` | HNSW M (max neighbors per node). Affects index memory and search performance. |
| `ef_construction` | No | Positive integer | `40` | HNSW efConstruction (candidate queue size during build). Larger gives better quality but slower build. |
-| `quantizer` | No | `flat`, `sq8`, `sq4` | `flat` | Vector encoding/quantization: `flat` = raw; `sq8`/`sq4` = symmetric quantization (8/4 bit) to reduce memory. |
+| `quantizer` | No | `flat`, `sq8`, `sq4`, `pq` | `flat` | Vector encoding/quantization: `flat` = raw; `sq8`/`sq4` = scalar quantization (8/4 bit); `pq` = product quantization to reduce memory. |
+| `pq_m` | Required when `quantizer=pq` | Positive integer | (none) | Number of sub-vectors the original vector is split into (the vector dimension `dim` must be divisible by `pq_m`). |
+| `pq_nbits` | Required when `quantizer=pq` | Positive integer | (none) | Number of bits used to encode each sub-vector; in Faiss, `pq_nbits` is generally required to be no greater than 24. |
Import via S3 TVF:
@@ -283,7 +285,7 @@ Beyond build-time parameters for HNSW, you can pass search-time parameters via s
With FLAT encoding, an HNSW index (raw vectors plus graph structure) may consume large amounts of memory. HNSW must be fully resident in memory to function, so memory can become a bottleneck at large scale.
-Vector quantization compresses float32 storage to reduce memory. Doris currently supports two scalar quantization schemes: INT8 and INT4 (SQ8 / SQ4). Example using SQ8:
+Scalar quantization (SQ) compresses float32 storage to reduce memory. Product quantization (PQ) reduces memory overhead by splitting high-dimensional vectors into sub-vectors and quantizing each sub-vector independently. For scalar quantization, Doris currently supports two schemes: INT8 and INT4 (SQ8 / SQ4). Example using SQ8:
```sql
CREATE TABLE sift_1M (
@@ -314,7 +316,32 @@ On 768-D Cohere-MEDIUM-1M and Cohere-LARGE-10M datasets, SQ8 reduces index size
Quantization introduces extra build-time overhead because each distance computation must decode quantized values. For 128-D vectors, build time increases with row count; SQ vs. FLAT can be up to ~10× slower to build.
-
+Doris also supports product quantization. Note that when using PQ, two additional parameters must be provided (see the storage sketch after this list):
+
+- `pq_m`: the number of sub-vectors the original high-dimensional vector is split into (the vector dimension `dim` must be divisible by `pq_m`).
+- `pq_nbits`: the number of bits used to quantize each sub-vector; it determines the size of each subspace codebook (k = 2^pq_nbits). In Faiss, `pq_nbits` is generally required to be no greater than 24.
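+
+As a back-of-the-envelope sketch of the standard PQ storage model (a rough illustration; the actual index also stores the per-subspace codebooks and the HNSW graph):
+
+```text
+bits per encoded vector       = pq_m * pq_nbits
+codebook entries per subspace = 2 ^ pq_nbits
+
+With the values used in the example below (dim = 128, pq_m = 2, pq_nbits = 2):
+  each vector is split into 2 sub-vectors of 64 dimensions,
+  each subspace codebook holds 2^2 = 4 centroids,
+  and each encoded vector takes 2 * 2 = 4 bits,
+  versus 128 * 4 = 512 bytes for raw float32 storage.
+```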
+
+```sql
+CREATE TABLE sift_1M (
+ id int NOT NULL,
+ embedding array<float> NOT NULL COMMENT "",
+ INDEX ann_index (embedding) USING ANN PROPERTIES(
+ "index_type"="hnsw",
+ "metric_type"="l2_distance",
+ "dim"="128",
+ "quantizer"="pq", -- Specify using PQ for quantization
+ "pq_m"="2", -- Required when using PQ, indicates splitting
high-dimensional vector into pq_m low-dimensional sub-vectors
+ "pq_nbits"="2" -- Required when using PQ, indicates the number of
bits for each subspace codebook
+ )
+) ENGINE=OLAP
+DUPLICATE KEY(id) COMMENT "OLAP"
+DISTRIBUTED BY HASH(id) BUCKETS 1
+PROPERTIES (
+ "replication_num" = "1"
+);
+```
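+
+As a minimal sketch, a top-K search against this table could then be issued as below, assuming the `ORDER BY distance ... LIMIT` pattern used by the query examples earlier on this page; the vector literal is truncated for readability, and a real query must supply all 128 values:
+
+```sql
+-- Hypothetical top-10 nearest-neighbor query against the PQ-indexed table.
+-- Distance function name assumed here; see the search examples earlier in this
+-- document for the exact form used with the ANN index.
+SELECT id,
+       l2_distance(embedding, [0.12, 0.48, 0.91]) AS dist
+FROM sift_1M
+ORDER BY dist
+LIMIT 10;
+```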
+
+
## Performance Tuning
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/ai/vector-search.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/ai/vector-search.md
index 0b4d8a7cc07..3e4bd22c15a 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/ai/vector-search.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/ai/vector-search.md
@@ -71,7 +71,9 @@ PROPERTIES (
| `dim` | Yes | Positive integer (> 0) | (none) | Vector dimension; all subsequently imported vectors must match it, otherwise an error is raised. |
| `max_degree` | No | Positive integer | `32` | Maximum number of neighbors per node (M) in the HNSW graph; affects index memory and search performance. |
| `ef_construction` | No | Positive integer | `40` | Candidate queue size (efConstruction) during the HNSW build phase; larger values give better graph quality but slower builds. |
-| `quantizer` | No | `flat`, `sq8`, `sq4` | `flat` | Vector encoding/quantization: `flat` = raw storage; `sq8`/`sq4` = symmetric quantization (8/4 bit) to reduce memory. |
+| `quantizer` | No | `flat`, `sq8`, `sq4`, `pq` | `flat` | Vector encoding/quantization: `flat` = raw storage; `sq8`/`sq4` = scalar quantization (8/4 bit); `pq` = product quantization. |
+| `pq_m` | Required when `quantizer=pq` | Positive integer | (none) | Number of sub-vectors the original high-dimensional vector is split into (the vector dimension `dim` must be divisible by `pq_m`). |
+| `pq_nbits` | Required when `quantizer=pq` | Positive integer | (none) | Number of bits used to quantize each sub-vector; it determines the size of each subspace codebook (k = 2^pq_nbits). In Faiss, `pq_nbits` is generally required to be no greater than 24. |
Import data via S3 TVF:
```sql
@@ -259,7 +261,7 @@ LIMIT 2;
- hnsw_bounded_queue: whether to use a bounded priority queue to optimize HNSW search performance. Defaults to true.
## Vector Quantization
With FLAT encoding, an HNSW index (raw vectors plus graph structure) may consume large amounts of memory. HNSW must be fully resident in memory to work, so it can easily become a bottleneck on very large datasets.
-Vector quantization reduces memory overhead by compressing FLOAT32. Doris currently supports two scalar quantization schemes: INT8 and INT4 (SQ8 / SQ4). Example using SQ8:
+Scalar quantization (SQ) reduces memory overhead by compressing FLOAT32. Product quantization (PQ) reduces memory overhead by splitting high-dimensional vectors into sub-vectors and quantizing each sub-vector independently. Doris currently supports two scalar quantization schemes: INT8 and INT4 (SQ8 / SQ4). Example using SQ8:
```sql
CREATE TABLE sift_1M (
@@ -290,7 +292,32 @@ PROPERTIES (
Quantization introduces extra build-time overhead because the build phase performs many distance computations, each of which must decode quantized values. For 128-D vectors, build time grows with row count; SQ can introduce roughly 10× the build cost compared with FLAT.
-
+Similarly, Doris also supports product quantization. Note that when using PQ, additional parameters must be provided:
+
+- `pq_m`: the number of sub-vectors the original high-dimensional vector is split into (the vector dimension `dim` must be divisible by `pq_m`).
+- `pq_nbits`: the number of bits used to quantize each sub-vector; it determines the size of each subspace codebook (k = 2^pq_nbits). In Faiss, `pq_nbits` is generally required to be no greater than 24.
+
+```sql
+CREATE TABLE sift_1M (
+ id int NOT NULL,
+ embedding array<float> NOT NULL COMMENT "",
+ INDEX ann_index (embedding) USING ANN PROPERTIES(
+ "index_type"="hnsw",
+ "metric_type"="l2_distance",
+ "dim"="128",
+ "quantizer"="pq", -- 指定使用 PQ 进行量化
+ "pq_m"="2", -- 使用PQ时需要指定, 表示将高维向量分割成 pq_m 个低维子向量
+ "pq_nbits"="2" -- 使用PQ时需要指定, 表示每个子空间码本的比特数
+ )
+) ENGINE=OLAP
+DUPLICATE KEY(id) COMMENT "OLAP"
+DISTRIBUTED BY HASH(id) BUCKETS 1
+PROPERTIES (
+ "replication_num" = "1"
+);
+```
+
+
## Performance Tuning
diff --git a/static/images/ann-index-quantization-build-time.jpg b/static/images/ann-index-quantization-build-time.jpg
new file mode 100755
index 00000000000..e23835c8a3f
Binary files /dev/null and b/static/images/ann-index-quantization-build-time.jpg differ
diff --git a/static/images/ann-sq-build-time.png b/static/images/ann-sq-build-time.png
deleted file mode 100644
index 7588709acf2..00000000000
Binary files a/static/images/ann-sq-build-time.png and /dev/null differ
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]