davidlghellin commented on PR #20928: URL: https://github.com/apache/datafusion/pull/20928#issuecomment-4677687914
> > While auditing this file I noticed there's room for a `simplify()` impl covering the literal cases: > > > > * `concat_ws(NULL_literal, …)` → `NULL` > > * `concat_ws(sep_literal)` (only separator) → `''` > > * `concat_ws(sep, single_utf8_literal)` → that literal > > > > It would let the planner short-circuit at plan time instead of dispatching per batch. Would you prefer I add it to this PR, or open a follow-up? > > im not sure we have too much appetite for implementing `simplify` for `datafusion-spark` functions, since use cases like comet wouldnt benefit as they mainly rely on the physical execution Fair point for Comet — but Comet isn't the only consumer profile. Engines that go through DataFusion's logical planning do benefit: at Sail (lakehq/sail, a Spark-compatible engine built on DataFusion) we already implement `simplify()` on several Spark-compatible functions (`abs` idempotence, `ceil`/`floor`, `to_binary`, `now()` folding to the query-start timestamp), and `simplify_expressions` fires them at plan time. For `concat_ws` specifically, the wins are cases constant folding can't reach because not all args are literals: `concat_ws(NULL, col1, col2) → NULL` and `concat_ws(sep_col, 'x') → 'x'` (separator is irrelevant with a single argument). The all-literal cases are already handled by `ConstEvaluator`, so the impl can stay small. That said, no problem keeping it out of this PR. https://github.com/lakehq/sail/pull/1844 https://github.com/lakehq/sail/pull/1769 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
