cht42 opened a new issue, #19843: URL: https://github.com/apache/datafusion/issues/19843
### Is your feature request related to a problem or challenge?

The [SparkFunctionPlanner](https://github.com/apache/datafusion/blob/5edda9b309cd355029d7e395a0b31230c1269bde/datafusion/spark/src/planner.rs#L23) was introduced recently, but there is currently no convenient way to register both the Spark UDFs and the Spark expression planner together.

Additionally, combining the default DataFusion features with Spark features is awkward because:

1. Expression planners must be registered **before** calling `with_default_features().build()` to take precedence (planners are tried in order, first match wins)
2. UDFs must be registered **after** the state is built (if using the `register_all` helper)

Here's the current code required in sqllogictests to properly register Spark features:

```rust
let runtime = Arc::new(RuntimeEnv::default());
let mut state_builder = SessionStateBuilder::new()
    .with_config(config)
    .with_runtime_env(runtime);

// Phase 1: Register planner BEFORE build (so it takes precedence)
if is_spark_path(relative_path) {
    state_builder = state_builder.with_expr_planners(vec![Arc::new(
        datafusion_spark::planner::SparkFunctionPlanner,
    )]);
}

let mut state = state_builder.with_default_features().build();

// Phase 2: Register UDFs AFTER build
if is_spark_path(relative_path) {
    info!("Registering Spark functions");
    datafusion_spark::register_all(&mut state)
        .expect("Can not register Spark functions");
}
```

### Describe the solution you'd like

Provide a `with_spark_features()` method on `SessionStateBuilder` that registers both the Spark expression planner and UDFs in one call, ensuring proper precedence.
```rust
let state = SessionStateBuilder::new()
    .with_config(config)
    .with_runtime_env(runtime)
    .with_default_features()
    .with_spark_features() // Registers planner (with precedence) + UDFs
    .build();
```

### Describe alternatives you've considered

- A standalone function `datafusion_spark::register_all_features(&mut SessionState)` that handles both planner and UDF registration post-build (though this may not solve the planner precedence issue cleanly)
- Expose planner priority control: allow specifying priority when registering planners, rather than relying on insertion order

### Additional context

_No response_
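
To illustrate the precedence problem described above, here is a minimal toy model of "planners are tried in order, first match wins". It does not use DataFusion's real types: `Builder`, `DefaultPlanner`, `SparkPlanner`, the function name `f`, and the behavior of this `with_spark_features` (inserting at the front of the list) are all assumptions made for the sketch, not the actual or proposed implementation.

```rust
// Toy model only: every type and method here is hypothetical and merely
// mimics the "first match wins" ordering described in the issue.

/// A planner either produces a plan for a function call or declines it.
trait ExprPlanner {
    fn plan(&self, func: &str) -> Option<String>;
}

/// Stand-in for the default planners: accepts every function.
struct DefaultPlanner;

/// Stand-in for a Spark planner: only overrides the function `f`.
struct SparkPlanner;

impl ExprPlanner for DefaultPlanner {
    fn plan(&self, func: &str) -> Option<String> {
        Some(format!("default::{func}"))
    }
}

impl ExprPlanner for SparkPlanner {
    fn plan(&self, func: &str) -> Option<String> {
        (func == "f").then(|| format!("spark::{func}"))
    }
}

#[derive(Default)]
struct Builder {
    planners: Vec<Box<dyn ExprPlanner>>,
}

impl Builder {
    /// Appends a planner after everything registered so far.
    fn with_expr_planner(mut self, planner: Box<dyn ExprPlanner>) -> Self {
        self.planners.push(planner);
        self
    }

    /// Appends the default planner after any previously registered planners.
    fn with_default_features(mut self) -> Self {
        self.planners.push(Box::new(DefaultPlanner));
        self
    }

    /// Hypothetical one-call helper: inserts the Spark planner at the front,
    /// so it takes precedence regardless of where it appears in the chain.
    fn with_spark_features(mut self) -> Self {
        self.planners.insert(0, Box::new(SparkPlanner));
        self
    }

    /// Tries planners in order and returns the first match.
    fn plan(&self, func: &str) -> Option<String> {
        self.planners.iter().find_map(|p| p.plan(func))
    }
}
```

In this model, `Builder::default().with_default_features().with_expr_planner(Box::new(SparkPlanner))` resolves `f` through the default planner because the Spark planner was appended too late, whereas the hypothetical `with_spark_features()` resolves `f` through the Spark planner even when called after `with_default_features()`. Functions the Spark planner declines still fall through to the default planner.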
