HappenLee opened a new issue #3438:
URL: https://github.com/apache/incubator-doris/issues/3438
#### Motivation
At present, the underlying storage in Doris is column storage, but before a
query can be executed the data has to go through a row/column conversion and be
handed to the query layer. Such an implementation may cause performance problems:
* 1. Row/column conversion loss.
* 2. Better CPU performance cannot be achieved without vectorized execution.
So we want to transform the query layer of Doris to vectorized execution so
that data is not only stored by columns but also processed by vectors (parts of
columns), which allows achieving high CPU efficiency. This can benefit query
performance.
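As a rough illustration of the difference in access pattern (not Doris code), the sketch below computes `max(C_PHONE) ... group by C_MKTSEGMENT` once row by row and once over slices of two columns. The sample values, the function names and the `batch_size` parameter are all assumptions made for this example, and plain Python cannot show the CPU-level gain; it only shows how the data is consumed.

```python
from typing import Dict, List, Tuple

# Hypothetical sample data: (C_MKTSEGMENT, C_PHONE) pairs.
ROWS: List[Tuple[str, str]] = [
    ("AUTOMOBILE", "25-989-741-2988"),
    ("BUILDING", "13-761-547-5974"),
    ("AUTOMOBILE", "27-147-574-9335"),
    ("BUILDING", "11-719-748-3364"),
]


def max_by_group_rows(rows: List[Tuple[str, str]]) -> Dict[str, str]:
    """Row-at-a-time: each row tuple is unpacked and handled on its own."""
    result: Dict[str, str] = {}
    for segment, phone in rows:
        if segment not in result or phone > result[segment]:
            result[segment] = phone
    return result


def max_by_group_columns(segments: List[str], phones: List[str],
                         batch_size: int = 2) -> Dict[str, str]:
    """Column-at-a-time: each step consumes a slice (a "vector") of both columns."""
    result: Dict[str, str] = {}
    for start in range(0, len(segments), batch_size):
        seg_vec = segments[start:start + batch_size]
        phone_vec = phones[start:start + batch_size]
        for segment, phone in zip(seg_vec, phone_vec):
            if segment not in result or phone > result[segment]:
                result[segment] = phone
    return result


# The same data laid out as two columns instead of a list of rows.
SEGMENTS = [segment for segment, _ in ROWS]
PHONES = [phone for _, phone in ROWS]
```

Both functions return the same mapping; only the layout of the input differs.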
Here I implemented a simple POC to verify whether there is a performance
improvement.
###### Test environment:
* **Data set**
```Star Schema Benchmark```
* **Data generation**
```
git clone git@github.com:vadimtk/ssb-dbgen.git
cd ssb-dbgen
make
```
Download the **SSBM** code from GitHub and compile it. After the compilation
succeeds, execute the following command to generate 30 million rows of customer data:
```
./dbgen -s 1000 -T c
```
* **Build Table and Data Import**
Use the following statement to create a test table, and import the data
**customer.tbl** into Doris. The data size is about 3.2GB.
```
CREATE TABLE `customer` (
`C_CUSTKEY` int(11) NULL COMMENT "",
`C_NAME` varchar(255) NOT NULL COMMENT "",
`C_ADDRESS` varchar(255) NOT NULL COMMENT "",
`C_CITY` varchar(255) NOT NULL COMMENT "",
`C_NATION` varchar(255) NOT NULL COMMENT "",
`C_REGION` varchar(255) NOT NULL COMMENT "",
`C_PHONE` varchar(255) NOT NULL COMMENT "",
`C_MKTSEGMENT` varchar(255) NOT NULL COMMENT ""
) ENGINE=OLAP
DUPLICATE KEY(`C_CUSTKEY`, `C_NAME`, `C_ADDRESS`)
COMMENT "OLAP"
DISTRIBUTED BY HASH(`C_CUSTKEY`) BUCKETS 10
PROPERTIES (
"storage_type" = "COLUMN"
);
```
* **Environment**
```
GNU/Linux CentOS 6.3 (Final) build 2.6.32_1-19-0-0
Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
2 physical CPU package(s)
24 physical CPU core(s)
48 logical CPU(s)
Identifier: Intel64 Family 6 Model 79 Stepping 1
ProcessorID: F1 06 04 00 FF FB EB BF
Context Switches/Interrupts: 12174692729137 / 297015608902
Memory: 119.5 GiB/125.9 GiB
```
A single FE and a single BE run on the same server.
###### Test:
* Modify the logic of Doris' query layer to support vectorized aggregation
over columnar data during aggregation calculations. Record the time spent
converting rows to columns:

Calculate the time lost to the row-to-column conversion
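A minimal sketch of how such a conversion could be timed, assuming a batch of rows is a sequence of equally sized tuples. `rows_to_columns` and `timed_rows_to_columns` are hypothetical helper names for this illustration, not functions from Doris.

```python
import time
from typing import List, Sequence, Tuple


def rows_to_columns(rows: Sequence[Tuple]) -> List[list]:
    """Transpose a batch of equally sized row tuples into one list per column."""
    if not rows:
        return []
    return [list(column) for column in zip(*rows)]


def timed_rows_to_columns(rows: Sequence[Tuple]) -> Tuple[List[list], float]:
    """Return the converted columns and the wall-clock seconds the conversion took."""
    start = time.perf_counter()
    columns = rows_to_columns(rows)
    elapsed = time.perf_counter() - start
    return columns, elapsed
```

Summing `elapsed` over all batches of a query gives a total conversion time that can be reported separately from the total query time.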

* **Results**
```select max(C_PHONE) from customer group by C_MKTSEGMENT;```
Statistic | Origin | Convert to column | Origin (multi-thread) | Convert to column (multi-thread)
:-:|:-:|:-:|:-:|:-:
Time | 4.19 Sec | 4.57 Sec - 2.17 Sec (Convert Time) | 0.67 Sec | 0.69 Sec
Context-Switches | 31,737 | 32,468 | 40,463 | 30,699
Migrations | 506 | 662 | 4,920 | 3,265
Instructions | 48,890,013,173 | 47,963,367,976 | 49,111,783,565 | 48,113,904,685
IPC | 1.57 | 1.42 | 1.40 | 1.37
Branches | 9,201,175,036 | 9,124,545,231 | 9,248,803,634 | 9,154,186,301
Branches-Miss % | 0.90% | 1.02% | 0.91% | 1.02%
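The single-thread timings can be worked through as follows. This assumes that "4.57 Sec - 2.17 Sec (Convert Time)" means 2.17 s of the 4.57 s total were spent on the conversion; the variable names are introduced here for illustration only.

```python
origin_s = 4.19          # Origin, single thread
columnar_total_s = 4.57  # Convert to column, including conversion
convert_s = 2.17         # Reported conversion time

# Time left for the aggregation itself once conversion is excluded.
columnar_compute_s = round(columnar_total_s - convert_s, 2)  # 2.4
# Fraction of the columnar run spent on conversion.
convert_share = round(convert_s / columnar_total_s, 2)       # 0.47
# Ratio of the original time to the conversion-free columnar time.
speedup = round(origin_s / columnar_compute_s, 2)            # 1.75
```

Under that reading, roughly half of the columnar run is conversion overhead, which is the cost that vectorizing the query layer end to end is meant to remove.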
#### Implementation
Doris currently has a corresponding ```VectorizedRowBatch``` implementation,
so we can optimize the exec nodes gradually, one at a time.
1. Start from ```olap_scan_node```: run queries with vectorization enabled and
observe whether the expected performance improvement appears.
2. ```exec_node``` needs to implement a method that converts
a ```VectorizedRowBatch``` to a ```RowBatch```, retaining
compatibility with the original execution logic.
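The compatibility idea in step 2 can be sketched as below. `ColumnBatch`, `to_rows` and `legacy_count_rows` are hypothetical stand-ins written for this example; they are not the actual ```VectorizedRowBatch``` or ```RowBatch``` interfaces, which live in the Doris C++ code.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Tuple


@dataclass
class ColumnBatch:
    """Hypothetical stand-in for a vectorized batch: one list per column."""
    columns: Dict[str, List[object]]

    def num_rows(self) -> int:
        # All columns are assumed to have the same length.
        return len(next(iter(self.columns.values()))) if self.columns else 0

    def to_rows(self) -> Iterator[Tuple[object, ...]]:
        """Yield row tuples in column order, for nodes that still expect rows."""
        return zip(*self.columns.values())


def legacy_count_rows(rows: Iterable[Tuple[object, ...]]) -> int:
    """Stand-in for an existing row-based node that has not been migrated yet."""
    return sum(1 for _ in rows)


batch = ColumnBatch({"C_CUSTKEY": [1, 2], "C_MKTSEGMENT": ["A", "B"]})
row_count = legacy_count_rows(batch.to_rows())
```

With an adapter of this kind, a vectorized scan node can feed unmigrated row-based nodes, so nodes can be converted one at a time.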