HappenLee opened a new issue #6088:
URL: https://github.com/apache/incubator-doris/issues/6088


   ## Motivation
   When we run the SSB benchmark against Doris, the CPU perf profile shows the following:
   
![image](https://user-images.githubusercontent.com/10553413/123204245-429db680-d4ea-11eb-884f-3555891712c9.png)
   
   A large share of CPU time is spent in page decoding for `PlainPage` and `DictPage`.
   
   Looking at the details, we find that much of this cost comes from memory allocation, and in particular from the calls to `BitUtil::RoundUpToPowerOf2`.
   
![image](https://user-images.githubusercontent.com/10553413/123204430-afb14c00-d4ea-11eb-9f9e-7bf26cec2c7f.png)
   
   ## Implementation
   
   An obvious fix is to use SIMD to speed up the function `BitUtil::RoundUpToPowerOf2`.
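
The idea can be illustrated with a minimal sketch. It is not the actual patch: it assumes `RoundUpToPowerOf2(value, factor)` rounds `value` up to a multiple of a power-of-two `factor`, and the names `round_up_to_power_of_2` and `round_up_batch` are hypothetical. The batch version processes four 32-bit sizes per SSE2 instruction and falls back to the scalar loop for the tail, or for the whole batch when SSE2 is unavailable.

```cpp
#include <cstddef>
#include <cstdint>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 integer intrinsics
#endif

// Hypothetical scalar helper (assumed semantics): round `value` up to the
// next multiple of `factor`. `factor` must be a power of two.
inline int32_t round_up_to_power_of_2(int32_t value, int32_t factor) {
    return (value + (factor - 1)) & ~(factor - 1);
}

// Hypothetical batch helper: apply the same rounding to `n` values.
inline void round_up_batch(const int32_t* in, int32_t* out, size_t n,
                           int32_t factor) {
    size_t i = 0;
#if defined(__SSE2__)
    // Broadcast the two constants once, then handle four values per step.
    const __m128i add = _mm_set1_epi32(factor - 1);
    const __m128i mask = _mm_set1_epi32(~(factor - 1));
    for (; i + 4 <= n; i += 4) {
        __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(in + i));
        v = _mm_and_si128(_mm_add_epi32(v, add), mask);
        _mm_storeu_si128(reinterpret_cast<__m128i*>(out + i), v);
    }
#endif
    // Scalar tail (and full fallback when SSE2 is not available).
    for (; i < n; ++i) {
        out[i] = round_up_to_power_of_2(in[i], factor);
    }
}
```

For example, calling `round_up_batch` with `factor = 8` on the sizes `{1, 9, 17, 100}` yields `{8, 16, 24, 104}`. Computing all rounded sizes for a batch up front in this way is what makes the per-value work vectorizable.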
   
   After using SSE to speed up the function, perf shows the following CPU cost:
   
   
![image](https://user-images.githubusercontent.com/10553413/123204655-21899580-d4eb-11eb-8295-01b378ec0d85.png)
   
   
   | Page type | Not vectorized | Vectorized |
   | ---- | ---- | ---- |
   | DictPage | 23.42% | 14.82% |
   | PlainPage | 23.38% | 11.93% |
   
   
   ## More Tests in SSB
   
   
![image](https://user-images.githubusercontent.com/10553413/123205063-d0c66c80-d4eb-11eb-91a1-1f9b4a4825dd.png)
   
   Queries q4, q5, q6, q8, q9, and q11 improve by about 20%.
    
   




