parthchandra commented on PR #1034:
URL: https://github.com/apache/datafusion-comet/pull/1034#issuecomment-2435919148

   Initial performance numbers for this implementation are not looking good. 
There are two areas where it is slower than Spark:
   1. No WholeStageCodegen: iterating over rows one at a time adds extra cost 
to the implementation.
   2. We incur the additional cost of creating UnsafeRows in native code, which 
is more costly than the calls Spark makes to extract values out of an Arrow vector.
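
To make the second point concrete, here is a minimal Python sketch of the per-row work involved in materializing rows from columns. It is not Comet's native code. It assumes a simplified UnsafeRow-style layout (a null bitset padded to 8-byte words, followed by one 8-byte slot per fixed-width field), 32-bit integer columns only, and little-endian byte order. The function names `unsafe_row_bytes` and `columnar_to_rows` are hypothetical.

```python
import struct


def unsafe_row_bytes(values):
    """Encode one row of nullable 32-bit ints in a simplified UnsafeRow-style layout."""
    num_fields = len(values)
    # Null bitset: one bit per field, padded up to a multiple of 8 bytes.
    bitset_bytes = ((num_fields + 63) // 64) * 8
    # Each fixed-width field occupies a full 8-byte slot after the bitset.
    buf = bytearray(bitset_bytes + 8 * num_fields)
    for i, value in enumerate(values):
        if value is None:
            buf[i // 8] |= 1 << (i % 8)  # mark field i as null
        else:
            # Assumption: little-endian; the upper 4 bytes of the slot stay zero.
            struct.pack_into("<i", buf, bitset_bytes + 8 * i, value)
    return bytes(buf)


def columnar_to_rows(columns):
    """Build one encoded row per index by reading the same position in every column."""
    num_rows = len(columns[0])
    return [unsafe_row_bytes([col[r] for col in columns]) for r in range(num_rows)]


rows = columnar_to_rows([[1, None, 3]])
print(len(rows), len(rows[0]))
```

Even for a single integer column, every row needs its own buffer with a bitset word and a value slot, whereas reading a value directly from a column vector is a single indexed access.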
   Here's the initial benchmark run for just integer types:
   ```
   Running benchmark: ColumnarToRowExec
     Running case: Spark Columnar To Row - integer
     Stopped after 34 iterations, 2029 ms
     Running case: Comet Columnar To Row - integer
     Stopped after 24 iterations, 2081 ms
   
   OpenJDK 64-Bit Server VM 11.0.19+7-LTS on Mac OS X 14.6
   Apple M3 Max
   ColumnarToRowExec:                        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   ------------------------------------------------------------------------------------------------------------------------
   Spark Columnar To Row - integer                      40             60          14        262.1           3.8       1.0X
   Comet Columnar To Row - integer                      53             87          32        198.2           5.0       0.8X
   ```
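
As a quick sanity check, the Relative and Per Row columns can be recomputed from the reported rates. The sketch below uses only the figures printed in the benchmark output above; the variable names are illustrative.

```python
# Rates in millions of rows per second, as reported in the benchmark output.
spark_rate = 262.1
comet_rate = 198.2

# Relative throughput of Comet with Spark as the 1.0X baseline.
relative = comet_rate / spark_rate

# 1e9 ns per second divided by (rate * 1e6 rows per second) gives ns per row.
spark_ns_per_row = 1_000 / spark_rate
comet_ns_per_row = 1_000 / comet_rate

# Extra time per row that Comet spends relative to Spark.
extra_time = comet_ns_per_row / spark_ns_per_row - 1

print(f"relative throughput: {relative:.2f}x")
print(f"per row: Spark {spark_ns_per_row:.1f} ns, Comet {comet_ns_per_row:.1f} ns")
print(f"extra time per row: {extra_time:.0%}")
```

The relative throughput comes out to about 0.76, which the table rounds to 0.8X, so Comet spends roughly a third more time per row in this run.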


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

