wangbo opened a new issue #6281:
URL: https://github.com/apache/incubator-doris/issues/6281
Recently I ran some performance tests on the storage layer and found there
is still room for optimization.
First, I tried removing the bitshuffle encoding from Doris, because I found
that when a page is read from or written to disk, it is
compressed/decompressed by ```PageIO```.
When the same page goes through
```BitshufflePageBuilder/BitShufflePageDecoder```, it is
compressed/decompressed a second time.
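To illustrate why a second compression pass tends to add cost without saving much space, here is a minimal sketch. It is not Doris code: `zlib` from the Python standard library stands in for the codecs used by the page builder and by ```PageIO```, and the integer column is synthetic.

```python
import random
import struct
import zlib

random.seed(0)
# Synthetic int32 column (hypothetical data, not the SSB table).
values = [random.randint(0, 999) for _ in range(10_000)]
page = struct.pack(f"<{len(values)}i", *values)

once = zlib.compress(page)   # stand-in for compression inside the page builder
twice = zlib.compress(once)  # stand-in for the second pass done at the I/O layer

print(f"raw={len(page)} once={len(once)} twice={len(twice)}")

# Reading the page back requires two decompression passes as well.
assert zlib.decompress(zlib.decompress(twice)) == page
```

Already-compressed bytes are close to random, so the second pass should change the size very little while still costing CPU time on both the write and the read path.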
**Test Environment**
1 FE, 3 BE
code version: Apache Doris 0.13
test data: SSB data
sql:
```
SELECT sum(lo_revenue) , year(lo_orderdate) AS year, p_brand FROM
lineorder_flat WHERE p_category = 'MFGR#12' AND s_region = 'AMERICA' GROUP BY
year, p_brand ORDER BY year, p_brand;
```
**Code modification**
Remove ```bitshuffle compress``` from ```BitShufflePageDecoder::_decode```
and ```BitshufflePageBuilder::_finish```.
**Test Result 1 : query performance**
```BitShufflePageDecodeTime``` is the time spent in
```BitShufflePageDecoder::_decode```.
I ran the query several times and picked the fastest run.
No disk reads happen during these runs.
before:
```
- RawRowsRead: 208.64M
- BitShufflePageDecodeTime: 4s825ms
- BlockLoadTime: 20s201ms
- RawRowsRead: 205.78M
- BlockLoadTime: 21s272ms
- BitShufflePageDecodeTime: 5s051ms
- RawRowsRead: 185.62M
- BlockLoadTime: 17s510ms
- BitShufflePageDecodeTime: 4s156ms
```
after:
```
- RawRowsRead: 211.40M
- BitShufflePageDecodeTime: 1s116ms
- BlockLoadTime: 17s114ms
- RawRowsRead: 200.01M
- BlockLoadTime: 17s008ms
- BitShufflePageDecodeTime: 1s047ms
- RawRowsRead: 188.62M
- BlockLoadTime: 14s571ms
- BitShufflePageDecodeTime: 975.978ms
```
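We can see that both ```BlockLoadTime``` and ```BitShufflePageDecodeTime```
have improved: ```BitShufflePageDecodeTime``` drops from about 4.2-5.1 s to
about 1 s, and ```BlockLoadTime``` drops from 17.5-21.3 s to 14.6-17.1 s.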
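The size of the improvement can be worked out from the profile counters above. The sketch below sums the three entries per counter, which seems reasonable because total ```RawRowsRead``` is about 600M in both runs. The helper names `to_seconds` and `reduction` are made up for this example, and the parser only handles the `s`/`ms` forms that appear in this issue.

```python
import re

def to_seconds(text):
    """Convert a duration such as '4s825ms' or '975.978ms' to seconds."""
    scale = {"s": 1.0, "ms": 0.001}
    parts = re.findall(r"(\d+(?:\.\d+)?)(ms|s)", text)
    return sum(float(number) * scale[unit] for number, unit in parts)

# Values copied from the before/after profiles in this issue.
before = {
    "BitShufflePageDecodeTime": ["4s825ms", "5s051ms", "4s156ms"],
    "BlockLoadTime": ["20s201ms", "21s272ms", "17s510ms"],
}
after = {
    "BitShufflePageDecodeTime": ["1s116ms", "1s047ms", "975.978ms"],
    "BlockLoadTime": ["17s114ms", "17s008ms", "14s571ms"],
}

def reduction(counter):
    """Fractional drop in the summed time for one counter."""
    old = sum(to_seconds(t) for t in before[counter])
    new = sum(to_seconds(t) for t in after[counter])
    return 1 - new / old

for counter in before:
    print(f"{counter}: {reduction(counter):.1%} less time")
# BitShufflePageDecodeTime: 77.6% less time
# BlockLoadTime: 17.4% less time
```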
**Test Result 2 : Storage**
before:
```
| lineorder_flat | 58.866 GB | 336 |
```
after:
```
| lineorder_flat | 71.685 GB | 336 |
```
About a 22% increase in storage (58.866 GB to 71.685 GB). This shows that
the current compression algorithm is still effective.
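A storage increase of this kind is what one would expect if the bit transposition step helps the downstream compressor. The sketch below is a simplified, pure-Python illustration under stated assumptions: it transposes bits over the whole array rather than in blocks as the real bitshuffle library does, uses `zlib` as a stand-in codec, and runs on synthetic values in the range 1..50 rather than SSB data. The functions `bitshuffle` and `bitunshuffle` are written for this example only.

```python
import random
import struct
import zlib

def bitshuffle(values, width=32):
    """Bit-transpose: emit bit 0 of every value, then bit 1, and so on."""
    out = bytearray()
    for bit in range(width):
        for start in range(0, len(values), 8):
            byte = 0
            for k, v in enumerate(values[start:start + 8]):
                byte |= ((v >> bit) & 1) << k
            out.append(byte)
    return bytes(out)

def bitunshuffle(data, count, width=32):
    """Inverse of bitshuffle for `count` non-negative values."""
    plane_len = (count + 7) // 8
    values = [0] * count
    for bit in range(width):
        plane = data[bit * plane_len:(bit + 1) * plane_len]
        for i in range(count):
            if (plane[i >> 3] >> (i & 7)) & 1:
                values[i] |= 1 << bit
    return values

random.seed(0)
# Small integers stored as 32-bit values: most bits are always zero.
values = [random.randint(1, 50) for _ in range(4096)]
plain = struct.pack(f"<{len(values)}I", *values)
shuffled = bitshuffle(values)

plain_size = len(zlib.compress(plain))
shuffled_size = len(zlib.compress(shuffled))
print(f"raw={len(plain)} plain+zlib={plain_size} bitshuffle+zlib={shuffled_size}")
```

After the transpose, the always-zero high bits form long runs that compress to almost nothing, so the shuffled page should come out smaller than the plain one. The price is the extra transpose work on every decode, which is the cost measured by ```BitShufflePageDecodeTime``` above.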
**Todo**
I think the choice of compression algorithm is quite important here.
Two things to try in the future:
1. Find a compression algorithm for Doris that better balances
performance and space usage.
2. Treat ```Make bitshuffle encode optional``` as an experimental feature.
Because this was just a simple test, whether there are other effects still
needs more verification.
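For the first item, a comparison would need to record both compression ratio and decode time for each candidate. The following is only a rough sketch of such a harness: the codecs are the ones shipped in the Python standard library, used as stand-ins rather than candidates for Doris, the page is synthetic, and `measure` is a name invented for this example.

```python
import bz2
import lzma
import random
import struct
import time
import zlib

random.seed(0)
# Synthetic page of small 32-bit integers (hypothetical, not SSB data).
values = [random.randint(1, 50) for _ in range(50_000)]
page = struct.pack(f"<{len(values)}I", *values)

CODECS = {
    "zlib-1": (lambda d: zlib.compress(d, 1), zlib.decompress),
    "zlib-9": (lambda d: zlib.compress(d, 9), zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}

def measure(name, compress, decompress, data, repeats=5):
    """Return compression ratio and average decode time for one codec."""
    packed = compress(data)
    assert decompress(packed) == data  # round-trip check
    start = time.perf_counter()
    for _ in range(repeats):
        decompress(packed)
    decode_ms = (time.perf_counter() - start) / repeats * 1000
    return {"codec": name, "ratio": len(data) / len(packed), "decode_ms": decode_ms}

results = [measure(name, c, d, page) for name, (c, d) in CODECS.items()]
for row in results:
    print(f"{row['codec']:8s} ratio={row['ratio']:.2f} decode={row['decode_ms']:.2f} ms")
```

Timings depend on the machine and the data, so a real evaluation would use actual column pages and the codecs available in the BE.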