baibaichen commented on pull request #29695:
URL: https://github.com/apache/spark/pull/29695#issuecomment-792516852


   Thanks @huaxingao 
   
   We ran some tests on aggregate push down in a real production environment
last month. Here are the results:
   
   1. Dataset: 550M records
   2. Cluster: 4 ClickHouse nodes
   
   Metric | 1 User | 10 Users | 20 Users | 60 Users
   -- | -- | -- | -- | --
   QPS | 2.76 | 6.1 | 4.43 | 4.45
   90% (sec) | **0.4** | 2.1 | 7 | 17
   Slowest (sec) | 0.45 | 3.3 | 12 | 27
   
   We did not run the same tests without aggregate push down, because it is
10x slower than with push down.
   
   However, the current PR has some limitations:
   1. It does not support COUNT.
   2. It does not support AVG when there are multiple shards.
   3. It is not clear how to extend the implementation to support more
aggregation cases, for example, sum(if()).
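
   To illustrate limitation 2: one likely reason AVG is hard to push down across shards (an assumption on my side, not something confirmed in the PR) is that per-shard averages cannot simply be averaged again. Each shard would need to return a partial (sum, count) pair that is merged afterwards. A minimal sketch in plain Python, where `PartialAvg`, `shard_partial`, and `merge_avg` are hypothetical names and not part of Spark or ClickHouse:

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class PartialAvg:
    """Hypothetical partial aggregate that a single shard would return."""
    total: float
    count: int


def shard_partial(values: List[float]) -> PartialAvg:
    # What a shard computes locally when AVG is rewritten as SUM and COUNT.
    return PartialAvg(total=sum(values), count=len(values))


def merge_avg(partials: Iterable[PartialAvg]) -> Optional[float]:
    # Final merge step: add up the sums and the counts, then divide once.
    partials = list(partials)
    total = sum(p.total for p in partials)
    count = sum(p.count for p in partials)
    return total / count if count else None


# Two shards holding different numbers of rows.
shards = [[1.0, 2.0, 3.0], [10.0]]

# Merging (sum, count) pairs gives the true average over all four rows.
correct = merge_avg(shard_partial(s) for s in shards)

# Averaging the per-shard averages weights both shards equally, which is wrong
# whenever the shards hold different numbers of rows.
naive = sum(sum(s) / len(s) for s in shards) / len(shards)

print(correct, naive)
```

   The two results differ because the second shard holds a single row but gets half of the weight in the naive version. This is also why limitations 1 and 2 look related: merging AVG correctly needs a pushed-down COUNT.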
   
   Thanks
   Chang 
   




