Lordworms opened a new issue, #9547:
URL: https://github.com/apache/arrow-datafusion/issues/9547

   ### Is your feature request related to a problem or challenge?
   
   I was doing a course project on efficiency comparison, using VTune on the 
TPC-H benchmark to compare DataFusion and DuckDB. The results indicate that 
there might be some efficiency issues. I also noticed that DataFusion's 
effective CPU time is much higher than DuckDB's, yet its TPC-H runtime is 
slower (it seems we are not really running in parallel, and I suspect part of 
the problem comes from Tokio).
   This is DuckDB's result:
   
![9a087441e7ecb79d01fc382d14f47ffe](https://github.com/apache/arrow-datafusion/assets/48054792/f6a51e89-84d8-4506-be2e-b996fa2dccc2)
   This is DataFusion's result:
   
![1980a8f73e043fff172c3763114110e3](https://github.com/apache/arrow-datafusion/assets/48054792/ad13f2c3-cdb5-4f97-8a82-0e8cf52a219e)
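
   As a rough way to tell "not enough parallelism" apart from "more total work", the CPU time can be divided by the wall-clock time to get the average number of busy cores. The sketch below is only an illustration: the function names are hypothetical and the numbers are made up, not read from the VTune screenshots.

```rust
/// Average number of busy cores: total CPU time divided by wall-clock time.
fn average_busy_cores(cpu_seconds: f64, wall_seconds: f64) -> f64 {
    cpu_seconds / wall_seconds
}

/// Fraction of the available cores that were kept busy on average.
fn core_utilization(cpu_seconds: f64, wall_seconds: f64, cores: usize) -> f64 {
    average_busy_cores(cpu_seconds, wall_seconds) / cores as f64
}

fn main() {
    // Hypothetical numbers, NOT taken from the VTune results in this issue.
    let (cpu_seconds, wall_seconds, cores) = (24.0, 4.0, 8);

    println!(
        "average busy cores: {:.1}",
        average_busy_cores(cpu_seconds, wall_seconds)
    );
    println!(
        "core utilization: {:.0}%",
        core_utilization(cpu_seconds, wall_seconds, cores) * 100.0
    );
}
```

   If the average number of busy cores is already close to the core count, the engine is probably doing more total work rather than failing to run in parallel; if it is far below, scheduling would be the more likely suspect.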
   
   Also, the flame graphs show that DataFusion has a much deeper stack.
   DuckDB:
   
![1def3e3446638dbd5fd305db09421227](https://github.com/apache/arrow-datafusion/assets/48054792/4150454f-4e29-41c0-9747-13f813ca5cf8)
   
   DataFusion:
   
![26f898b9cbf0352ea76383ae1faf7d88](https://github.com/apache/arrow-datafusion/assets/48054792/dfc62945-a4b4-47b5-94d1-cf24b049a748)
   
   This has made me somewhat distrustful of Tokio.
   
   I suspected that the slower performance was due to incomplete use of SIMD 
instructions, so I collected some statistics on SIMD instructions using PIN 
(the result may not be that precise, but I expected the number of SIMD 
instructions generated to be comparable). The results are shown below:
   
   | SIMD instruction | DataFusion count  | DuckDB count  |
   | ---------------- | ----------------: | ------------: |
   | ADDSD            |                 34|             25|
   | CMPSD_XMM        |                  1|              6|
   | COMISD           |                  -|             44|
   | DIVSD            |                 14|             32|
   | MAXSD            |                  1|              1|
   | MULSD            |                 21|             52|
   | PACKUSWB         |                  5|              7|
   | PADDB            |                 30|             12|
   | PADDD            |                100|             33|
   | PADDQ            |                291|            200|
   | PADDW            |                  8|              5|
   | PCMPEQB          |                548|            544|
   | PCMPEQD          |                 58|             38|
   | PCMPGTB          |                  -|              1|
   | PCMPGTD          |                 44|             14|
   | PCMPGTW          |                  -|              6|
   | PMINUB           |                  8|             20|
   | PMOVMSKB         |               1169|            278|
   | PMULHUW          |                  1|              2|
   | PMULLW           |                  1|              2|
   | PMULUDQ          |                  -|              4|
   | PSHUFD           |                646|             88|
   | PSLLD            |                  6|              2|
   | PSLLDQ           |                 72|            217|
   | PSLLQ            |                213|             16|
   | PSLLW            |                 30|              2|
   | PSRAD            |                  8|              -|
   | PSRLD            |                  3|             40|
   | PSRLDQ           |                 39|            179|
   | PSRLQ            |                 11|              7|
   | PSUBB            |                 84|            243|
   | PSUBD            |                  4|              3|
   | PSUBQ            |                 12|              4|
   | PSUBUSB          |                  -|              6|
   | PSUBW            |                  -|              6|
   | PUNPCKHBW        |                 41|              7|
   | PUNPCKHDQ        |                 45|             66|
   | PUNPCKHQDQ       |                102|             14|
   | PUNPCKHWD        |                 42|             50|
   | PUNPCKLBW        |                211|             19|
   | PUNPCKLDQ        |                 94|            338|
   | PUNPCKLQDQ       |                353|           2713|
   | PUNPCKLWD        |                 73|             80|
   | ROUNDSD          |                  1|              -|
   | SHUFPD           |                  4|             20|
   | SHUFPS           |                  -|             28|
   | SQRTSD           |                  -|              2|
   | SUBSD            |                 10|             19|
   | UCOMISD          |                 16|             39|
   | VPCMPB           |                 56|             86|
   | VPCMPUB          |                206|             19|
   | VPMINUB          |                  2|             15|
   | **Total**        |               4851|           5293|
   
   It turns out that DataFusion may use fewer SIMD instructions than DuckDB 
(which might be a rustc issue).
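
   One thing worth ruling out on the rustc side: by default rustc targets a baseline x86-64 CPU (SSE2 only) unless the build passes something like `-C target-cpu=native`. I have not verified which flags the benchmark build used, so that part is an assumption. A minimal sketch (the helper names are hypothetical) that compares what the compiler was allowed to use with what the CPU supports:

```rust
/// Features the compiler was allowed to use when building this binary.
fn compile_time_features() -> Vec<&'static str> {
    let mut enabled = Vec::new();
    if cfg!(target_feature = "sse2") {
        enabled.push("sse2");
    }
    if cfg!(target_feature = "sse4.2") {
        enabled.push("sse4.2");
    }
    if cfg!(target_feature = "avx2") {
        enabled.push("avx2");
    }
    enabled
}

/// Features the CPU running this binary reports at runtime (x86 only).
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn runtime_features() -> Vec<&'static str> {
    let mut present = Vec::new();
    if std::is_x86_feature_detected!("sse2") {
        present.push("sse2");
    }
    if std::is_x86_feature_detected!("sse4.2") {
        present.push("sse4.2");
    }
    if std::is_x86_feature_detected!("avx2") {
        present.push("avx2");
    }
    present
}

/// On other architectures this sketch reports nothing.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn runtime_features() -> Vec<&'static str> {
    Vec::new()
}

fn main() {
    println!("enabled at compile time: {:?}", compile_time_features());
    println!("supported by this CPU:   {:?}", runtime_features());
}
```

   Running it once with a plain `cargo run --release` and once with `RUSTFLAGS="-C target-cpu=native" cargo run --release` should show whether features such as AVX2 are available on the machine but not enabled for the build.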
   
   
   ### Describe the solution you'd like
   
   I plan to look into this the week after next, but I have no clues yet.
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   _No response_

