Re: [I] [EPIC] Optimize performance for slow expressions [datafusion-comet]
coderfender commented on issue #2986: URL: https://github.com/apache/datafusion-comet/issues/2986#issuecomment-3708558700 https://github.com/apache/datafusion/pull/19547 (linking issues to make sure we do not repeat ourselves) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
coderfender commented on issue #2986: URL: https://github.com/apache/datafusion-comet/issues/2986#issuecomment-3708540289 I don't think so. Comet, being a Spark accelerator, needs to benchmark against Spark's performance, while DataFusion doesn't.
Brijesh-Thakkar commented on issue #2986: URL: https://github.com/apache/datafusion-comet/issues/2986#issuecomment-3708080954 @coderfender How do I run benchmarks if I am opening a PR in the DataFusion repo? Will this work there as well?
coderfender commented on issue #2986: URL: https://github.com/apache/datafusion-comet/issues/2986#issuecomment-3707862395

@raushanprabhakar1, you can run a local benchmark using a command like the one below:

```
SPARK_GENERATE_BENCHMARK_FILES=1 make benchmark-org.apache.spark.sql.benchmark.CometCastBenchmark
```
raushanprabhakar1 commented on issue #2986: URL: https://github.com/apache/datafusion-comet/issues/2986#issuecomment-3706849908 Hi @andygrove, is there any documentation I can refer to for running the benchmark on my local machine? I am working on the expression optimization, and the blocker I am facing is that I cannot verify my optimization results locally.
Brijesh-Thakkar commented on issue #2986: URL: https://github.com/apache/datafusion-comet/issues/2986#issuecomment-3704224646 @andygrove I have opened a PR in DataFusion to optimize the `bit_length` function. PR: apache/datafusion#19598
Brijesh-Thakkar commented on issue #2986: URL: https://github.com/apache/datafusion-comet/issues/2986#issuecomment-3702234847 @andygrove I’ve opened an upstream PR in DataFusion to optimize the `octet_length` function by avoiding the generic Arrow length kernel and using a specialized implementation for string arrays. PR: https://github.com/apache/datafusion/pull/19581
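
For readers unfamiliar with why a specialized path for string arrays can be cheap, here is a minimal sketch. It assumes the standard Arrow variable-length string layout (one contiguous UTF-8 byte buffer plus `n + 1` offsets) and models it with plain Python lists. It is not the code from the PR above; the function names `build_string_array`, `octet_length`, and `bit_length` are hypothetical helpers written only for this illustration.

```python
from typing import List, Optional, Sequence, Tuple


def build_string_array(
    values: Sequence[Optional[str]],
) -> Tuple[bytes, List[int], List[bool]]:
    """Pack strings into an Arrow-style layout (simplified model):
    one UTF-8 byte buffer, n + 1 offsets, and a validity list
    where False marks a null slot."""
    data = bytearray()
    offsets = [0]
    validity = []
    for v in values:
        if v is not None:
            data.extend(v.encode("utf-8"))
        offsets.append(len(data))
        validity.append(v is not None)
    return bytes(data), offsets, validity


def octet_length(
    offsets: Sequence[int], validity: Sequence[bool]
) -> List[Optional[int]]:
    """Byte length of each string, taken from adjacent offsets only.
    The byte buffer itself is never read."""
    return [
        (offsets[i + 1] - offsets[i]) if validity[i] else None
        for i in range(len(validity))
    ]


def bit_length(
    offsets: Sequence[int], validity: Sequence[bool]
) -> List[Optional[int]]:
    """Bit length is eight times the byte length; nulls stay null."""
    return [None if n is None else n * 8 for n in octet_length(offsets, validity)]


if __name__ == "__main__":
    _, offsets, validity = build_string_array(["a", "héllo", None, ""])
    print(octet_length(offsets, validity))  # [1, 6, None, 0]
    print(bit_length(offsets, validity))    # [8, 48, None, 0]
```

In this model the per-row work is a single subtraction, which is the kind of saving a type-specific implementation can offer over a generic kernel that dispatches on the array type.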
coderfender commented on issue #2986: URL: https://github.com/apache/datafusion-comet/issues/2986#issuecomment-3697345449 @andygrove It seems that some benchmarks (CometCastBenchmark) are failing with ANSI mode enabled. I raised a PR to fix this: https://github.com/apache/datafusion-comet/pull/3014
Brijesh-Thakkar commented on issue #2986: URL: https://github.com/apache/datafusion-comet/issues/2986#issuecomment-3696888584

> > [@andygrove](https://github.com/andygrove) OK, thanks for guiding me. I will work on this issue, but I think [@getChan](https://github.com/getChan) is already working on it, as he has linked a PR. [@getChan](https://github.com/getChan), please confirm.
>
> [@Brijesh-Thakkar](https://github.com/Brijesh-Thakkar) this issue is an EPIC listing dozens of expressions that need optimizing. A separate issue should be created for each expression.

OK, understood. I will try to optimize expressions.
andygrove commented on issue #2986: URL: https://github.com/apache/datafusion-comet/issues/2986#issuecomment-3696883693

> [@andygrove](https://github.com/andygrove) OK, thanks for guiding me. I will work on this issue, but I think [@getChan](https://github.com/getChan) is already working on it, as he has linked a PR. [@getChan](https://github.com/getChan), please confirm.

@Brijesh-Thakkar this issue is an EPIC listing dozens of expressions that need optimizing. A separate issue should be created for each expression.
Brijesh-Thakkar commented on issue #2986: URL: https://github.com/apache/datafusion-comet/issues/2986#issuecomment-3696876418 @andygrove OK, thanks for guiding me. I will work on this issue, but I think @getChan is already working on it, as he has linked a PR. @getChan, please confirm.
andygrove commented on issue #2986: URL: https://github.com/apache/datafusion-comet/issues/2986#issuecomment-3696836621

> [@coderfender](https://github.com/coderfender) Can I work on this issue? We tried to speed up string operations in another PR, right? There we found that for "trim" the issue could be solved by improvements in DataFusion, not in Comet. Is that right, [@coderfender](https://github.com/coderfender)?

@Brijesh-Thakkar feel free to work on items from this epic. In some cases it makes sense to make the improvements in DataFusion, and in other cases we need to make them in Comet directly.
Brijesh-Thakkar commented on issue #2986: URL: https://github.com/apache/datafusion-comet/issues/2986#issuecomment-3694648275 @coderfender Can I work on this issue? We tried to speed up string operations in another PR, right? There we found that for "trim" the issue could be solved by improvements in DataFusion, not in Comet. Is that right, @coderfender?
coderfender commented on issue #2986: URL: https://github.com/apache/datafusion-comet/issues/2986#issuecomment-3690630006 @andygrove thank you for filing this
