Blizzara opened a new issue, #630: URL: https://github.com/apache/datafusion-comet/issues/630
### What is the problem the feature request solves?

Hey, we're looking to use DataFusion to replace some Spark workflows, but in a somewhat different way than Comet: we convert Spark logical plans into Substrait and then into DataFusion. Our goal is still similar to Comet's: to get equivalent-but-faster execution compared to Spark.

We've noticed some differences between the Spark and DataFusion native expressions, which you guys have already addressed in Comet by adding custom expressions that match Spark's more closely, and so we'd like to be able to reuse the expressions from Comet where possible. While that works today in theory, in practice it's a bit difficult given dependency issues between Comet's version of DataFusion and the version we use (just a later commit on datafusion main at the moment, but still), and pulling in all of Comet for just the expressions is a bit unnecessary.

So my question is: would you be open to separating the expressions into e.g. a `datafusion-comet-exprs` crate, which could have a reduced set of dependencies and so be easier to reuse downstream?

I'd be happy to write a PR if you'd find it acceptable. I might also need to change some functions to be exported from that crate to reuse them, as Comet operates mainly on the physical expr level but for our use it's easier to have the expressions as ScalarUDFs, if that's okay.

I have already tested through a fork that reusing the Comet expressions is possible, and it does give better compatibility with Spark :)

### Describe the potential solution

- Rename `core` into `datafusion-comet/core`
- Move `core/src/execution/datafusion/expressions` into `datafusion-comet/expr`
- (probably some more changes to fix cross-dependencies, maybe introduce `datafusion-comet/common` to share stuff if needed)

Open for other suggestions as well!

### Additional context

Related: https://github.com/apache/datafusion/issues/11201

--
This is an automated message from the Apache Git Service.
