Re: [I] Planning to publish Roadmap? [arrow-datafusion-comet]

via GitHub Thu, 15 Feb 2024 03:29:51 -0800


milenkovicm commented on issue #19:
URL: 
https://github.com/apache/arrow-datafusion-comet/issues/19#issuecomment-1945907961


   I'd like to put forward a suggestion. In my experience, a lot of production 
Spark workloads use some kind of UDF, and a good chunk of those UDFs are very 
simple and can be expressed as SQL expressions. 
   
   I assume that when a UDF is used, Comet falls back to classic Spark 
execution, which might not be optimal (I might be wrong, and I apologise if I 
am; I have not done my homework and checked the Comet code in depth). My 
suggestion is to consider adding functionality like 
https://nvidia.github.io/spark-rapids/docs/additional-functionality/udf-to-catalyst-expressions.html
 which could speed up UDFs in the Comet case as well.
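   
   To illustrate the general idea only, here is a toy sketch in plain Python 
that rewrites a very simple one-line lambda into an equivalent SQL expression 
string and refuses anything it does not recognise. It is not how the linked 
feature or Comet is implemented; the names `udf_to_sql` and `_emit` are 
hypothetical, and the supported subset (column names, numeric literals, 
`+ - *`, single comparisons, conditional expressions) is an assumption chosen 
to keep the example short.

```python
import ast

# Hypothetical operator tables: Python AST node type -> SQL operator text.
_BIN_OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*"}
_CMP_OPS = {
    ast.Gt: ">", ast.GtE: ">=", ast.Lt: "<", ast.LtE: "<=",
    ast.Eq: "=", ast.NotEq: "<>",
}


def _emit(node: ast.AST) -> str:
    """Render one supported AST node as SQL text, or raise ValueError."""
    if isinstance(node, ast.Name):
        # A lambda parameter is treated as a column name.
        return node.id
    if (isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)):
        return repr(node.value)
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        op = _BIN_OPS[type(node.op)]
        return f"({_emit(node.left)} {op} {_emit(node.right)})"
    if (isinstance(node, ast.Compare)
            and len(node.ops) == 1
            and type(node.ops[0]) in _CMP_OPS):
        op = _CMP_OPS[type(node.ops[0])]
        return f"({_emit(node.left)} {op} {_emit(node.comparators[0])})"
    if isinstance(node, ast.IfExp):
        return (f"CASE WHEN {_emit(node.test)} "
                f"THEN {_emit(node.body)} ELSE {_emit(node.orelse)} END")
    # Anything else (method calls, loops, closures, ...) is not translated.
    raise ValueError(f"unsupported construct: {type(node).__name__}")


def udf_to_sql(source: str) -> str:
    """Translate the source of a one-line lambda into a SQL expression string."""
    tree = ast.parse(source, mode="eval").body
    if not isinstance(tree, ast.Lambda):
        raise ValueError("expected a lambda expression")
    return _emit(tree.body)


if __name__ == "__main__":
    for src in ("lambda x: x * 2 + 1",
                "lambda a, b: a if a > b else b",
                "lambda s: s.upper()"):
        try:
            print(src, "->", udf_to_sql(src))
        except ValueError as err:
            # A real engine would keep the original UDF and fall back here.
            print(src, "-> not translated:", err)
```

   In this sketch the first two lambdas become `((x * 2) + 1)` and 
`CASE WHEN (a > b) THEN a ELSE b END`, while the third raises `ValueError` and 
would be left as an ordinary UDF. A real translation would also have to 
preserve null handling and type semantics, which this toy ignores.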
   
   I believe there is nothing GPU-specific in that code and it could be 
reused; I am just not sure what the best approach would be. 
   
   Maybe @andygrove would be able to help.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
