wangbo opened a new issue #7771: URL: https://github.com/apache/incubator-doris/issues/7771
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.

### Description

After the SegmentIterator vectorization PR was merged, some TODO items remain. This issue addresses some of the remaining performance problems.

### Solution

Test SQL

```
SELECT sum(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue
FROM lineorder_flat
WHERE LO_ORDERDATE >= 19930101 and LO_ORDERDATE <= 19931231
  AND LO_DISCOUNT BETWEEN 1 AND 3
  AND LO_QUANTITY < 25;
```

Initial performance test:

```
code version: SegmentIterator row version
- BlockLoadTime: 3s687ms
- VectorPredEvalTime: 778.640ms
- BlockSeekCount: 5.36M

code version: SegmentIterator vectorization
- BlockLoadTime: 4s140ms
- VectorPredEvalTime: 256.926ms
- BlockSeekCount: 5.36M
```

Analysis

1. After ```SegIter``` is vectorized, overall performance drops: BlockLoadTime rises from 3s687ms to 4s140ms.
2. Predicate evaluation does get faster (VectorPredEvalTime falls from 778.640ms to 256.926ms), but its share of the total time is small, so the overall impact is limited.
3. ```BlockSeekCount``` is too high (5.36M) and can be reduced.

Optimization 1: remove the timer ```BlockSeekTime```

- BlockLoadTime: 3s512ms

Optimization 2 (based on opt1): batch insert into the column vector in ```BitShufflePageDecoder.next_batch```

- BlockLoadTime: 3s105ms

Optimization 3 (based on opt1, opt2): eliminate lazy materialization

- BlockLoadTime: 2s641ms
- BlockSeekCount: 175.02K

```BlockSeekCount``` drops sharply, from 5.36M to 175.02K.

Optimization 4 (based on opt1, opt2, opt3): set doris_scanner_thread_pool_thread_num = 1

- BlockLoadTime: 1s665ms

BlockLoadTime improves further, but the SQL as a whole may take longer.
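To illustrate the idea behind Optimization 2, here is a minimal sketch of a row-at-a-time append versus a batch append. It is not the Doris implementation: `DecodedPage`, `append_row_by_row`, and `append_batch` are hypothetical names, and a plain `std::vector<std::int32_t>` stands in for the column vector that ```BitShufflePageDecoder.next_batch``` fills.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a decoded page: a contiguous run of
// fixed-width values, as would be available after decompression.
struct DecodedPage {
    std::vector<std::int32_t> values;
};

// Row-at-a-time path: one append call per value.
// Returns the number of values actually appended.
std::size_t append_row_by_row(const DecodedPage& page, std::size_t start,
                              std::size_t n, std::vector<std::int32_t>& column) {
    assert(start <= page.values.size());
    // Clamp the request to what is left in the page.
    const std::size_t count = std::min(n, page.values.size() - start);
    for (std::size_t i = 0; i < count; ++i) {
        column.push_back(page.values[start + i]);
    }
    return count;
}

// Batch path: grow the destination once, then copy the whole range
// with a single memcpy. Returns the number of values appended.
std::size_t append_batch(const DecodedPage& page, std::size_t start,
                         std::size_t n, std::vector<std::int32_t>& column) {
    assert(start <= page.values.size());
    const std::size_t count = std::min(n, page.values.size() - start);
    if (count == 0) {
        return 0;  // nothing to copy; avoids memcpy on an empty range
    }
    const std::size_t old_size = column.size();
    column.resize(old_size + count);
    std::memcpy(column.data() + old_size, page.values.data() + start,
                count * sizeof(std::int32_t));
    return count;
}
```

Both functions leave the same contents in `column`. The batch path resizes the destination once and issues a single copy, which removes the per-value call overhead that this kind of optimization presumably targets.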
I also checked whether the original (row) version shows the same sensitivity to doris_scanner_thread_pool_thread_num.

Original version test: set doris_scanner_thread_pool_thread_num = 1 vs. the default value

```
set doris_scanner_thread_pool_thread_num = default value
- BlockLoadTime: 3s571ms

set doris_scanner_thread_pool_thread_num = 1
- BlockLoadTime: 2s232ms
```

The original version shows the same behavior. This may be related to memory allocation under multithreading and needs further investigation.

I will submit a PR for opt1, opt2, and opt3.

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
