This is an automated email from the ASF dual-hosted git repository.
luzhijing pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
new 0b297793a037 [doc]update en docs of data index (#478)
0b297793a037 is described below
commit 0b297793a037ad09012088e4f73f9612236e6336
Author: Luzhijing <[email protected]>
AuthorDate: Thu Mar 28 19:42:41 2024 +0800
[doc]update en docs of data index (#478)
---
.../version-2.0/table-design/index/bitmap-index.md | 4 +-
.../version-2.0/table-design/index/bloomfilter.md | 58 +++++++++----------
.../table-design/index/index-overview.md | 66 +++-------------------
.../table-design/index/inverted-index.md | 29 +++++-----
.../table-design/index/ngram-bloomfilter-index.md | 33 ++++++-----
.../version-2.0/table-design/index/prefix-index.md | 21 +++++--
6 files changed, 86 insertions(+), 125 deletions(-)
diff --git a/versioned_docs/version-2.0/table-design/index/bitmap-index.md
b/versioned_docs/version-2.0/table-design/index/bitmap-index.md
index 97385c429d2d..1a15bd575c35 100644
--- a/versioned_docs/version-2.0/table-design/index/bitmap-index.md
+++ b/versioned_docs/version-2.0/table-design/index/bitmap-index.md
@@ -24,13 +24,11 @@ specific language governing permissions and limitations
under the License.
-->
-# Bitmap index
-
Bitmap Index is an index represented by bitmaps, where a bitmap is created for
each key value in a column. Compared to other indexes, it occupies very little
storage space and is very fast to create and use. However, it has a
disadvantage of having a large lock granularity for modification operations,
making it unsuitable for frequent updates.

-### Applicable scenarios
+## Applicable scenarios
- Columns with high value repetition, recommended to be between 100 and
100,000, such as occupation, city, etc. If the repetition is too high, there is
no significant advantage compared to other types of indexes; if the repetition
is too low, space efficiency and performance will be greatly reduced.
- Specific types of queries such as `count`, `or`, `and` logical operations
that only require bitwise operations. For example: querying with multiple
conditions combined, `select count(*) from table where city = 'Nanjing' and job
= 'Doctor' and phonetype = 'iphone' and gender = 'Male'.` If bitmap indexes are
established on each query condition column, the database can perform efficient
bit operations, accurately locate the required data, reduce disk IO, and the
smaller the filtered resul [...]
diff --git a/versioned_docs/version-2.0/table-design/index/bloomfilter.md
b/versioned_docs/version-2.0/table-design/index/bloomfilter.md
index 83368c79fd4f..0ec7781b34ae 100644
--- a/versioned_docs/version-2.0/table-design/index/bloomfilter.md
+++ b/versioned_docs/version-2.0/table-design/index/bloomfilter.md
@@ -24,7 +24,7 @@ specific language governing permissions and limitations
under the License.
-->
-# BloomFilter Index
+
BloomFilter is a fast search algorithm for multi-hash function mapping
proposed by Bloom in 1970. Usually used in some occasions where it is necessary
to quickly determine whether an element belongs to a set, but is not strictly
required to be 100% correct, BloomFilter has the following characteristics:
@@ -36,29 +36,21 @@ Bloom filter is actually composed of an extremely long
binary bit array and a se
Figure below shows an example of Bloom Filter with m=18, k=3 (m is the size of
the Bit array, and k is the number of Hash functions). The three elements of x,
y, and z in the set are hashed into the bit array through three different hash
functions. When querying the element w, after calculating by the Hash function,
because one bit is 0, w is not in the set.
-
-
-So how to judge whether the plot and the elements are in the set? Similarly,
all the offset positions of this element are obtained after hash function
calculation. If these positions are all 1, then it is judged that this element
is in this set, if one is not 1, then it is judged that this element is not in
this set. It's that simple!
-
-## Doris BloomFilter Index and Usage Scenarios
-
-When we use HBase, we know that the Hbase data block index provides an
effective method to find the data block of the HFile that should be read when
accessing a specific row. But its utility is limited. The default size of the
HFile data block is 64KB, and this size cannot be adjusted too much.
-
-If you are looking for a short row, only building an index on the starting row
key of the entire data block cannot give you fine-grained index information.
For example, if your row occupies 100 bytes of storage space, a 64KB data block
contains (64 * 1024)/100 = 655.53 = ~700 rows, and you can only put the
starting row on the index bit. The row you are looking for may fall in the row
interval on a particular data block, but it is not necessarily stored in that
data block. There are many [...]
+Similarly, if an element undergoes a hash function calculation to obtain all
its offset positions, and if all these positions are set to 1, then it is
determined that this element is in the set.
-So HBase provides Bloom filters that allow you to do a reverse test on the
data stored in each data block. When a row is requested, first check the Bloom
filter to see if the row is not in this data block. The Bloom filter must
either confirm that the line is not in the answer, or answer that it doesn't
know. This is why we call it a reverse test. Bloom filters can also be applied
to the cells in the row. Use the same reverse test first when accessing a
column of identifiers.
+
-Bloom filters are not without cost. Storing this additional index level takes
up additional space. Bloom filters grow as their index object data grows, so
row-level bloom filters take up less space than column identifier-level bloom
filters. When space is not an issue, they can help you squeeze the performance
potential of the system.
+Bloom Filter is a data structure that allows for a reverse test on the data
stored in each data block. When a specific row is requested, the Bloom Filter
first checks if that row is not in the data block. The Bloom Filter either
confidently states that the row is not present or it gives an uncertain answer.
This uncertainty is why it's called a reverse test. However, Bloom Filters are
not without cost. Storing this additional index level consumes extra space, and
Bloom Filters grow as th [...]
-The BloomFilter index of Doris is specified when the table is built, or
completed by the ALTER operation of the table. Bloom Filter is essentially a
bitmap structure, used to quickly determine whether a given value is in a set.
This judgment will produce a small probability of misjudgment. That is, if it
returns false, it must not be in this set. And if the range is true, it may be
in this set.
+In Doris, BloomFilter indexes can be specified during table creation or
through ALTER table operations. Essentially, a Bloom Filter is a bitmap
structure used to quickly determine if a given value is in a set. This
determination can result in a small probability of false positives. That is, if
it returns false, the value is definitely not in the set. But if it returns
true, the value might be in the set.
-The BloomFilter index is also created with Block as the granularity. In each
Block, the value of the specified column is used as a set to generate a
BloomFilter index entry, which is used to quickly filter the data that does not
meet the conditions in the query.
+BloomFilter indexes in Doris are created at the Block level. Within each
Block, the values of specified columns are treated as a set to generate a
BloomFilter index entry. This entry is used during queries to quickly filter
out data that doesn't meet the conditions.
-Let's take a look at how Doris creates BloomFilter indexes through examples.
+Now let's see how Doris creates a BloomFilter index through an example.
-### Create BloomFilter Index
+### Create BloomFilter index
-The Doris BloomFilter index is created by adding
"bloom_filter_columns"="k1,k2,k3" to the PROPERTIES of the table building
statement, this attribute, k1,k2,k3 is the Key column name of the BloomFilter
index you want to create, for example, we Create a BloomFilter index for the
saler_id and category_id in the table.
+The Doris BloomFilter index is created by adding
`"bloom_filter_columns"="k1,k2,k3"` to the PROPERTIES of the table building
statement, this attribute, k1,k2,k3 is the Key column name of the BloomFilter
index you want to create, for example, we Create a BloomFilter index for the
saler_id and category_id in the table.
```sql
CREATE TABLE IF NOT EXISTS sale_detail_bloom (
@@ -91,7 +83,7 @@ PROPERTIES (
);
```
-### View BloomFilter Index
+### View BloomFilter index
Check that the BloomFilter index we built on the table is to use:
@@ -99,32 +91,36 @@ Check that the BloomFilter index we built on the table is
to use:
SHOW CREATE TABLE <table_name>;
```
-### Delete BloomFilter Index
+### Delete BloomFilter index
-Deleting the index is to remove the index column from the bloom_filter_columns
attribute:
+Deleting the index is to remove the index column from the
`bloom_filter_columns attribute`:
```sql
ALTER TABLE <db.table_name> SET ("bloom_filter_columns" = "");
```
-### Modify BloomFilter Index
+### Modify BloomFilter index
-Modifying the index is to modify the bloom_filter_columns attribute of the
table:
+Modifying the index is to modify the `bloom_filter_columns` attribute of the
table:
```sql
ALTER TABLE <db.table_name> SET ("bloom_filter_columns" = "k1,k3");
```
-### **Doris BloomFilter Usage Scenarios**
+### **BloomFilter usage scenarios**
+
+User can create a BloomFilter index for a column when the following conditions
are met:
+
+- BloomFilter is suitable for non-prefix filtering.
+
+- The query will be filtered according to the high frequency of the column,
and most of the query conditions are `in` and `= `filtering.
+
+- BloomFilter is suitable for high cardinality columns. Such as UserID.
Because if it is created on a low-cardinality column, such as a `gender`
column, each Block will almost contain all values, causing the BloomFilter
index to lose its meaning.
-You can consider establishing a Bloom Filter index for a column when the
following conditions are met:
+### **BloomFilter use precautions**
-1. First, BloomFilter is suitable for non-prefix filtering.
-2. The query will be filtered according to the high frequency of the column,
and most of the query conditions are in and = filtering.
-3. BloomFilter is suitable for high cardinality columns. Such as UserID.
Because if it is created on a low-cardinality column, such as a "gender"
column, each Block will almost contain all values, causing the BloomFilter
index to lose its meaning.
+- It does not support the creation of BloomFilter indexes for
`Tinyint`,`Float`, and `Double` columns.
-### **Doris BloomFilter Use Precautions**
+- The BloomFilter index only has an acceleration effect on `in` and `=`
filtering queries.
-1. It does not support the creation of Bloom Filter indexes for Tinyint,
Float, and Double columns.
-2. The Bloom Filter index only has an acceleration effect on in and =
filtering queries.
-3. If you want to check whether a query hits the Bloom Filter index, you can
check the profile information of the query.
+- If you want to check whether a query hits the BloomFilter index, you can
check the profile information of the query.
diff --git a/versioned_docs/version-2.0/table-design/index/index-overview.md
b/versioned_docs/version-2.0/table-design/index/index-overview.md
index 0e56e11307a3..4b467431ebc2 100644
--- a/versioned_docs/version-2.0/table-design/index/index-overview.md
+++ b/versioned_docs/version-2.0/table-design/index/index-overview.md
@@ -24,64 +24,16 @@ specific language governing permissions and limitations
under the License.
-->
-# Index Overview
+Indexes are used to help quickly filter or search for data. Currently, Doris
supports two types of indexes: built-in smart indexes and user-created
secondary indexes.
-Indexes are used to help quickly filter or find data.
+### Built-in smart indexes
-Doris currently supports two main types of indexes:
-1. built-in smart indexes, including prefix indexes and ZoneMap indexes.
-2. User-created secondary indexes, including the [inverted
index](./inverted-index.md), [bloomfilter index](./bloomfilter.md) and [ngram
bloomfilter index](./ngram-bloomfilter-index.md).
+- Sorted keys and prefix index: Apache Doris stores data in an ordered manner
based on sorted keys. It creates a prefix index for every 1024 rows of data.
The key in the index is the value of the sorted column in the first row of the
current 1024-row group. If a query involves a sorted column, the system will
find the first row of the relevant 1024-row group and start scanning from there.
+- ZoneMap index: The ZoneMap index is automatically maintained index
information for each column in the column-based storage format, including
Min/Max values, etc. This type of index is transparent to users. These are
segment and page-level indexes, the maximum and minimum values for each column
within a page are recorded, and the maximum and minimum values for each column
within a segment are also recorded. Therefore, in equivalence and range
queries, this Min/Max index can be used to [...]
-The ZoneMap index is the index information automatically maintained for each
column in the column storage format, including Min/Max, the number of Null
values, and so on. This index is transparent to the user.
+### User-created secondary indexes
-## Prefix Index
-
-Unlike traditional database designs, Doris does not support creating indexes
on arbitrary columns. an OLAP database with an MPP architecture like Doris is
typically designed to handle large amounts of data by increasing concurrency.
-
-Essentially, Doris data is stored in a data structure similar to an SSTable
(Sorted String Table). This structure is an ordered data structure that can be
stored sorted by specified columns. On this data structure, it will be very
efficient to perform lookups with sorted columns as a condition.
-
-In the Aggregate, Unique and Duplicate data models. The underlying data
storage is sorted and stored according to the columns specified in the
respective table building statements, AGGREGATE KEY, UNIQUE KEY and DUPLICATE
KEY.
-
-The prefix index, which is based on sorting, is an indexing method to query
data quickly based on a given prefix column.
-
-## Examples
-
-We use the first 36 bytes of a row of data as the prefix index of this row of
data. Prefix indexes are simply truncated when a VARCHAR type is encountered.
We give an example:
-
-1. The prefix index of the following table structure is user_id(8 Bytes) +
age(4 Bytes) + message(prefix 20 Bytes).
-
- | ColumnName | Type |
- | -------------- | ------------ |
- | user_id | BIGINT |
- | age | INT |
- | message | VARCHAR(100) |
- | max_dwell_time | DATETIME |
- | min_dwell_time | DATETIME |
-
-2. The prefix index of the following table structure is user_name(20 Bytes).
Even if it does not reach 36 bytes, because VARCHAR is encountered, it is
directly truncated and will not continue further.
-
- | ColumnName | Type |
- | -------------- | ------------ |
- | user_name | VARCHAR(20) |
- | age | INT |
- | message | VARCHAR(100) |
- | max_dwell_time | DATETIME |
- | min_dwell_time | DATETIME |
-
-When our query condition is the prefix of the prefix index, the query speed
can be greatly accelerated. For example, in the first example, we execute the
following query:
-
-```sql
-SELECT * FROM table WHERE user_id=1829239 and age=20;
-```
-
-This query will be much more efficient than the following query:
-
-```sql
-SELECT * FROM table WHERE age=20;
-```
-
-Therefore, when building a table, choosing the correct column order can
greatly improve query efficiency.
-
-## Adjust Prefix Index by ROLLUP
-
-Because the column order has been specified when the table is created, there
is only one prefix index for a table. This may not be efficient for queries
that use other columns that cannot hit the prefix index as conditions.
Therefore, we can artificially adjust the column order by creating a ROLLUP.
For details, please refer to [ROLLUP](../hit-the-rollup.md).
+- Bloom Filter Index
+- N-Gram Bloom Filter Index
+- Bitmap Index
+- Inverted Index
\ No newline at end of file
diff --git a/versioned_docs/version-2.0/table-design/index/inverted-index.md
b/versioned_docs/version-2.0/table-design/index/inverted-index.md
index 9f3aa8a5ba4e..f6a121181039 100644
--- a/versioned_docs/version-2.0/table-design/index/inverted-index.md
+++ b/versioned_docs/version-2.0/table-design/index/inverted-index.md
@@ -24,21 +24,17 @@ specific language governing permissions and limitations
under the License.
-->
-# Inverted Index
-<version since="2.0.0">
-
-</version>
-From version 2.0.0, Doris implemented inverted index to support fulltext
search on text field, normal eq and range filter on text, numeric, datetime
field. This doc introduce inverted index usage, including create, drop and
query.
+From version 2.0, Doris implemented inverted index to support fulltext search
on text field, normal eq and range filter on text, numeric, datetime field.
This doc introduce inverted index usage, including create, drop and query.
## Glossary
-- [inverted index](https://en.wikipedia.org/wiki/Inverted_index) is a index
techlogy used in information retirval commonly. It split text into word terms
and construct a term to doc index. This index is called inverted index and can
be used to find the docs where a specific term appears.
+[inverted index](https://en.wikipedia.org/wiki/Inverted_index) is a index
techlogy used in information retirval commonly. It split text into word terms
and construct a term to doc index. This index is called inverted index and can
be used to find the docs where a specific term appears.
-## Basic Principles
+## Basic principles
In the inverted index of Doris, a row in a table corresponds to a doc in
CLucene, a column corresponds to a field in doc. So using inverted index, doris
can get the rows that meet the filter of SQL WHERE clause, and then get the
rows quickly without reading other unrelated rows.
@@ -122,7 +118,8 @@ Inverted indexes have different limitations in different
data models:
- add an inverted index to existed table
-**Before version 2.0-beta:**
+**Before version 2.0.0:**
+
```sql
-- syntax 1
CREATE INDEX idx_name ON table_name(column_name) USING INVERTED
[PROPERTIES("parser" = "english|chinese|unicode")] [COMMENT 'your comment'];
@@ -130,7 +127,7 @@ CREATE INDEX idx_name ON table_name(column_name) USING
INVERTED [PROPERTIES("par
ALTER TABLE table_name ADD INDEX idx_name(column_name) USING INVERTED
[PROPERTIES("parser" = "english|chinese|unicode")] [COMMENT 'your comment'];
```
-**After version 2.0-beta (including 2.0-beta):**
+**After version 2.0.0 (including 2.0.0):**
The above 'create/add index' operation only generates inverted index for
incremental data. The syntax of BUILD INDEX is added to add inverted index to
stock data:
```sql
@@ -273,7 +270,7 @@ PROPERTIES ("replication_num" = "1");
```
-### Load Data
+### Load data
- load data by stream load
@@ -402,7 +399,7 @@ mysql> SELECT count() FROM hackernews_1m WHERE comment
MATCH_ANY 'OLAP OLTP';
```
-#### Normal Equal, Range Query
+#### Normal equal, range query
- range query on DateTime column
```sql
@@ -422,7 +419,8 @@ mysql> SELECT count() FROM hackernews_1m WHERE timestamp >
'2007-08-23 04:17:00'
mysql> CREATE INDEX idx_timestamp ON hackernews_1m(timestamp) USING INVERTED;
Query OK, 0 rows affected (0.03 sec)
```
-**After 2.0-beta (including 2.0-beta), you need to execute `BUILD INDEX` to
add inverted index to the stock data:**
+**After version 2.0.0 (including 2.0.0), you need to execute `BUILD INDEX` to
add inverted index to the stock data:**
+
```sql
mysql> BUILD INDEX idx_timestamp ON hackernews_1m;
Query OK, 0 rows affected (0.01 sec)
@@ -439,7 +437,8 @@ mysql> SHOW ALTER TABLE COLUMN;
1 row in set (0.00 sec)
```
-**After 2.0-beta (including 2.0-beta), you can view the progress of stock data
creating index by `SHOW BUILD INDEX`:**
+**After version 2.0.0 (including 2.0.0), you can view the progress of stock
data creating index by `SHOW BUILD INDEX`:**
+
```sql
-- If the table has no partitions, the PartitionName defaults to TableName
mysql> SHOW BUILD INDEX;
@@ -477,7 +476,7 @@ mysql> SELECT count() FROM hackernews_1m WHERE parent =
11189;
mysql> ALTER TABLE hackernews_1m ADD INDEX idx_parent(parent) USING INVERTED;
Query OK, 0 rows affected (0.01 sec)
--- After 2.0-beta (including 2.0-beta), you need to execute `BUILD INDEX` to
add inverted index to the stock data:
+-- After version 2.0.0 (including 2.0.0), you need to execute `BUILD INDEX` to
add inverted index to the stock data:
mysql> BUILD INDEX idx_parent ON hackernews_1m;
Query OK, 0 rows affected (0.01 sec)
@@ -520,7 +519,7 @@ mysql> SELECT count() FROM hackernews_1m WHERE author =
'faster';
mysql> ALTER TABLE hackernews_1m ADD INDEX idx_author(author) USING INVERTED;
Query OK, 0 rows affected (0.01 sec)
--- After 2.0-beta (including 2.0-beta), you need to execute `BUILD INDEX` to
add inverted index to the stock data:
+-- After version 2.0.0 (including 2.0.0), you need to execute `BUILD INDEX` to
add inverted index to the stock data:
mysql> BUILD INDEX idx_author ON hackernews_1m;
Query OK, 0 rows affected (0.01 sec)
diff --git
a/versioned_docs/version-2.0/table-design/index/ngram-bloomfilter-index.md
b/versioned_docs/version-2.0/table-design/index/ngram-bloomfilter-index.md
index 4242431f7d88..15e04c75b90a 100644
--- a/versioned_docs/version-2.0/table-design/index/ngram-bloomfilter-index.md
+++ b/versioned_docs/version-2.0/table-design/index/ngram-bloomfilter-index.md
@@ -1,6 +1,6 @@
---
{
- "title": "NGram BloomFilter Index",
+ "title": "N-Gram BloomFilter Index",
"language": "en"
}
---
@@ -24,14 +24,17 @@ specific language governing permissions and limitations
under the License.
-->
-# NGram BloomFilter Index
-<version since="2.0.0">
-</version>
-In order to improve the like query performance, the NGram BloomFilter index
was implemented.
+In order to improve the like query performance, the N-Gram BloomFilter index
was implemented.
-## Create Column with NGram BloomFilter Index
+:::tip
+
+N-Gram BloomFilter indexing is supported starting from the Doris 2.0 version.
+
+:::
+
+### Create N-Gram BloomFilter index
During create table:
@@ -47,34 +50,34 @@ DISTRIBUTED BY HASH(`siteid`) BUCKETS 10
PROPERTIES (
"replication_num" = "1"
);
-
--- PROPERTIES("gram_size"="3", "bf_size"="1024"),indicate the number of gram
and bytes of bloom filter respectively.
--- the gram size set to same as the like query pattern string length. and the
suitable bytes of bloom filter can be get by test, more larger more better, 256
maybe is a good start.
--- Usually, if the data's cardinality is small, you can increase the bytes of
bloom filter to improve the efficiency.
```
-## Show NGram BloomFilter Index
+- PROPERTIES("gram_size"="3", "bf_size"="1024"),indicate the number of gram
and bytes of bloom filter respectively.
+- The number of grams is related to the actual query scenarios and is usually
set to the length of most query strings. The byte size of the Bloom Filter can
be determined through testing, and generally, a larger size leads to better
filtering effects. You can start with 256 bytes for validation testing to see
the results. However, a larger byte size will also increase the storage and
memory cost of the index.
+- If the data cardinality is relatively high, the byte size does not need to
be set too large. If the cardinality is not very high, the filtering effect can
be improved by increasing the byte size.
+
+### Show N-Gram BloomFilter index
```sql
show index from example_db.table3;
```
-## Drop NGram BloomFilter Index
+### Drop N-Gram BloomFilter index
```sql
alter table example_db.table3 drop index idx_ngrambf;
```
-## Add NGram BloomFilter Index
+### Modify N-Gram BloomFilter index
-Add NGram BloomFilter Index for old column:
+Add N-Gram BloomFilter Index for old column:
```sql
alter table example_db.table3 add index idx_ngrambf(username) using NGRAM_BF
PROPERTIES("gram_size"="3", "bf_size"="512")comment 'username ngram_bf index'
```
-## **Some Notes about Doris NGram BloomFilter**
+### **Some Notes about Doris NGram BloomFilter**
1. NGram BloomFilter only support CHAR/VARCHAR/String column.
2. NGram BloomFilter index and BloomFilter index should be exclusive on same
column
diff --git a/versioned_docs/version-2.0/table-design/index/prefix-index.md
b/versioned_docs/version-2.0/table-design/index/prefix-index.md
index 2eb2b943538b..e0b1157404da 100644
--- a/versioned_docs/version-2.0/table-design/index/prefix-index.md
+++ b/versioned_docs/version-2.0/table-design/index/prefix-index.md
@@ -24,7 +24,6 @@ specific language governing permissions and limitations
under the License.
-->
-## Sort key and prefix index
Doris stores data in a data structure similar to SSTable (Sorted String
Table), which is an ordered data structure that can sort and store data
according to specified columns. Performing queries based on sorted columns in
this data structure is highly efficient.
@@ -33,18 +32,32 @@ In the three data models of Aggregate, Unique, and
Duplicate, the underlying dat
Based on the Sort Keys, Prefix Indexes are introduced. A Prefix Index is a
sparse index. In the table, a logical data block is formed according to the
corresponding number of rows. Each logical data block stores an index entry in
the Prefix Index table. The length of the index entry does not exceed 36 bytes,
and its content is the prefix composed of the sorted columns of the first row
of data in the data block. When searching the Prefix Index table, it can help
determine the starting row [...]
:::tip
-The first 36 bytes of a row of data in a data block serve as the prefix index
for that row. When encountering a VARCHAR type, the prefix index will be
truncated directly. If the first column is of the VARCHAR type, truncation will
occur even if the length does not reach 36 bytes.
+
+The first 36 bytes of a row of data in a data block serve as the prefix index
for that row. When encountering a `VARCHAR` type, the prefix index will be
truncated directly. If the first column is of the `VARCHAR` type, truncation
will occur even if the length does not reach 36 bytes.
+
:::
## Example
- If the sort keys of the table are as follows: 5 columns, then the prefix
index would be: user_id (8 Bytes), age (4 Bytes), message (prefix 20 Bytes).
-ColumnNameTypeuser_idBIGINTageINTmessageVARCHAR(100)max_dwell_timeDATETIMEmin_dwell_timeDATETIME
+| ColumnName | Type |
+| -------------- | ------------ |
+| user_id | BIGINT |
+| age | INT |
+| message | VARCHAR(100) |
+| max_dwell_time | DATETIME |
+| min_dwell_time | DATETIME |
- If the sort keys of the table consist of 5 columns and the first column is
`user_name` of the VARCHAR type, then the prefix index would be `user_name`
(truncated to 20 Bytes). Even though the total size of the prefix index has not
reached 36 bytes, truncation occurs because it encounters a VARCHAR column, and
no further columns are included.
-ColumnNameTypeuser_nameVARCHAR(20)ageINTmessageVARCHAR(100)max_dwell_timeDATETIMEmin_dwell_timeDATETIME
+| ColumnName | Type |
+| -------------- | ------------ |
+| user_name | VARCHAR(20) |
+| age | INT |
+| message | VARCHAR(100) |
+| max_dwell_time | DATETIME |
+| min_dwell_time | DATETIME |
- When our query conditions match the prefix index, it can greatly accelerate
the query speed. For example, in the first case, executing the following query:
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]