andygrove commented on PR #671:
URL: https://github.com/apache/datafusion-comet/pull/671#issuecomment-2234221966

   > Just looking at this one case, with decimal fields and only scan enabled, we are much slower. This is consistent with something I saw when working on the parallel reader.
   > From a profiling run I saw that a potential bottleneck was [BosonVector.getDecimal](https://github.com/apache/datafusion-comet/blob/7ac2fb9f6672fbb29c2e5de7e62b457efc0bfebf/common/src/main/java/org/apache/comet/vector/CometVector.java#L74) which has an expensive creation of a BigInteger followed by an expensive creation of a BigDecimal.
   > However, this path would be hit only for precision > 18 or if `spark.comet.use.decimal128` was set to `true` (it is `false` by default).
   > Also, I'm not sure if there is a way to eliminate this though.
   
   I was also trying to understand why this result was slower. I have created https://github.com/apache/datafusion-comet/issues/679 based on your comment so that we can use that issue to explore possible optimizations.
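
   To make the two decimal paths mentioned in the quote concrete, here is a rough standalone sketch. It is not the actual `CometVector` code: the class and method names are hypothetical, and the 16-byte little-endian layout is an assumption about how a 128-bit decimal value is stored. It only illustrates why the wide path allocates a `BigInteger` and then a `BigDecimal`, while a value that fits in a `long` can skip the `BigInteger`.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical illustration only; names do not come from Comet.
public class DecimalPaths {
    // Narrow path: the unscaled value fits in a long (precision <= 18),
    // so no BigInteger is needed.
    static BigDecimal fromLong(long unscaled, int scale) {
        return BigDecimal.valueOf(unscaled, scale);
    }

    // Wide path: the unscaled value arrives as two's-complement bytes.
    // Assumption: the buffer is little-endian, so it is reversed first
    // because BigInteger(byte[]) expects big-endian input.
    static BigDecimal fromBytes(byte[] littleEndian, int scale) {
        byte[] bigEndian = new byte[littleEndian.length];
        for (int i = 0; i < littleEndian.length; i++) {
            bigEndian[i] = littleEndian[littleEndian.length - 1 - i];
        }
        // Two allocations on top of the temporary array:
        // a BigInteger, then a BigDecimal wrapping it.
        return new BigDecimal(new BigInteger(bigEndian), scale);
    }

    public static void main(String[] args) {
        byte[] raw = new byte[16];
        raw[0] = 0x39; // 12345 == 0x3039, low byte first
        raw[1] = 0x30;
        System.out.println(fromLong(12345L, 2));
        System.out.println(fromBytes(raw, 2));
    }
}
```

   Both calls in `main` produce the same decimal value; the difference is only in how many intermediate objects are created per row, which is where the profiling result above points.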


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

