cht42 opened a new issue, #19843:
URL: https://github.com/apache/datafusion/issues/19843

   ### Is your feature request related to a problem or challenge?
   
   The 
[SparkFunctionPlanner](https://github.com/apache/datafusion/blob/5edda9b309cd355029d7e395a0b31230c1269bde/datafusion/spark/src/planner.rs#L23)
 was introduced recently, but there is currently no convenient way to register 
both the Spark UDFs and the Spark expression planner together.
   
   Additionally, combining the default DataFusion features with Spark features 
is awkward because:
   1. Expression planners must be registered **before** calling 
`with_default_features().build()` to take precedence (planners are tried in 
order, first match wins)
   2. UDFs must be registered **after** the state is built (if using the 
`register_all` helper)
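
To make point 1 concrete, here is a minimal, standard-library-only sketch of a "first match wins" planner chain. `ToyPlanner`, `SparkLike`, `DefaultLike`, `plan_with`, and the function name `shadowed_fn` are hypothetical stand-ins, not DataFusion's `ExprPlanner` API. The sketch only shows that whichever planner comes first in the list decides the result.

```rust
/// Toy stand-in for an expression planner (hypothetical, not DataFusion's
/// `ExprPlanner`): returns `Some(plan)` when it handles the function name,
/// or `None` to defer to the next planner in the list.
trait ToyPlanner {
    fn plan(&self, func: &str) -> Option<String>;
}

struct SparkLike;
struct DefaultLike;

impl ToyPlanner for SparkLike {
    fn plan(&self, func: &str) -> Option<String> {
        // Only claims one illustrative function name.
        if func == "shadowed_fn" {
            Some(format!("spark::{}", func))
        } else {
            None
        }
    }
}

impl ToyPlanner for DefaultLike {
    fn plan(&self, func: &str) -> Option<String> {
        // Claims every function, so it shadows anything registered after it.
        Some(format!("default::{}", func))
    }
}

/// Try each planner in order and return the first plan produced.
fn plan_with(planners: &[Box<dyn ToyPlanner>], func: &str) -> Option<String> {
    planners.iter().find_map(|p| p.plan(func))
}
```

With the list `[SparkLike, DefaultLike]`, `plan_with(.., "shadowed_fn")` produces the Spark-flavoured plan. With the order reversed, the default planner claims the call and the Spark planner is never consulted.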
   
   Here's the current code required in sqllogictests to properly register Spark 
features:
   
   ```rust
   let runtime = Arc::new(RuntimeEnv::default());
   
   let mut state_builder = SessionStateBuilder::new()
       .with_config(config)
       .with_runtime_env(runtime);
   
   // Phase 1: Register planner BEFORE build (so it takes precedence)
   if is_spark_path(relative_path) {
       state_builder = state_builder.with_expr_planners(vec![Arc::new(
           datafusion_spark::planner::SparkFunctionPlanner,
       )]);
   }
   
   let mut state = state_builder.with_default_features().build();
   
   // Phase 2: Register UDFs AFTER build
   if is_spark_path(relative_path) {
       info!("Registering Spark functions");
       datafusion_spark::register_all(&mut state)
           .expect("Can not register Spark functions");
   }
   ```
   
   ### Describe the solution you'd like
   
   Provide a `with_spark_features()` method on `SessionStateBuilder` that registers 
both the Spark expression planner and the Spark UDFs in one call, ensuring the 
Spark planner takes precedence over the default planners.
   
   ```rust
   let state = SessionStateBuilder::new()
       .with_config(config)
       .with_runtime_env(runtime)
       .with_default_features()
       .with_spark_features()  // Registers planner (with precedence) + UDFs
       .build();
   ```
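
One way such a method could guarantee precedence regardless of call order is to prepend the Spark planner rather than append it. The sketch below models that idea with a toy builder. `ToyStateBuilder` and the string labels are hypothetical and stand in for `SessionStateBuilder`, the planners, and the UDF sets; this is not the real DataFusion implementation.

```rust
/// Toy model of a state builder (hypothetical; not `SessionStateBuilder`).
/// Planners are consulted front to back; UDF sets are reduced to labels.
#[derive(Default)]
struct ToyStateBuilder {
    planners: Vec<&'static str>,
    udfs: Vec<&'static str>,
}

impl ToyStateBuilder {
    /// Appends the default planner and the default UDF set.
    fn with_default_features(mut self) -> Self {
        self.planners.push("default_planner");
        self.udfs.push("default_udfs");
        self
    }

    /// Prepends the Spark planner so it is consulted first no matter
    /// whether the defaults were added before or after this call,
    /// then adds the Spark UDF set.
    fn with_spark_features(mut self) -> Self {
        self.planners.insert(0, "spark_planner");
        self.udfs.push("spark_udfs");
        self
    }

    /// Returns the final planner order and the registered UDF sets.
    fn build(self) -> (Vec<&'static str>, Vec<&'static str>) {
        (self.planners, self.udfs)
    }
}
```

In this model, calling `with_default_features()` and `with_spark_features()` in either order leaves `spark_planner` at the front of the planner list, which is the property the proposed API would need.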
   
   ### Describe alternatives you've considered
   
   - A standalone function `datafusion_spark::register_all_features(&mut SessionState)` 
that handles both planner and UDF registration post-build (though 
this may not solve the planner precedence issue cleanly)
   
   - Expose planner priority control: allow specifying a priority when 
registering planners, rather than relying on insertion order
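
As a rough illustration of the priority-control alternative, the sketch below keeps planners sorted by an explicit priority instead of by insertion order. `PrioritizedPlanners`, its methods, and the numeric priorities are assumptions made for this example only. DataFusion does not expose such an API today as far as this issue describes.

```rust
use std::cmp::Reverse;

/// Toy registry (hypothetical) that orders planners by explicit priority.
struct PrioritizedPlanners {
    entries: Vec<(i32, &'static str)>,
}

impl PrioritizedPlanners {
    fn new() -> Self {
        Self { entries: Vec::new() }
    }

    /// Registers a planner. Higher priority is consulted first; the sort is
    /// stable, so planners with equal priority keep their insertion order.
    fn register(&mut self, priority: i32, name: &'static str) {
        self.entries.push((priority, name));
        self.entries.sort_by_key(|&(p, _)| Reverse(p));
    }

    /// Returns planner names in the order they would be tried.
    fn order(&self) -> Vec<&'static str> {
        self.entries.iter().map(|&(_, name)| name).collect()
    }
}
```

With this design, a Spark planner registered after the defaults but with a higher priority would still be tried first, so callers would no longer need to care whether registration happens before or after the build step.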
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

